MODULAR SEARCH SPACE FOR AUTOMATED DESIGN OF NEURAL ARCHITECTURE

Research in recent years has shown that automated machine learning and neural architecture search are the inevitable future of image recognition tasks. A crucial aspect of any automated search is the predefined search space. As many studies have demonstrated, modularization may simplify the underlying search space by fostering the reuse of successful blocks. The presented research therefore investigates the use of modularization in automated machine learning. In this paper, we propose and examine a modularized search space for neural architecture search that is substantially restricted to seeded building blocks. To make the search space viable, we represented all modules of the space as multisectoral networks, so that each architecture within the search space could be unequivocally described by a vector. In our case, a module was a predetermined number of parameterized layers together with information about their relationships. We applied the proposed modular search space to a genetic algorithm and evaluated it on the CIFAR-10 and CIFAR-100 datasets using modules from the NAS-Bench-201 benchmark. To address the complexity of the search space, we randomly sampled twenty-five modules and included them in the database. Overall, our approach retrieved competitive architectures in 8 GPU hours on average. The final model achieved validation accuracies of 89.1% and 73.2% on the CIFAR-10 and CIFAR-100 datasets, respectively. The learning process required slightly fewer GPU hours than other approaches, and the resulting network contained fewer parameters, indicating a lightweight model. Such an outcome may indicate the considerable potential of sophisticated ranking approaches. The conducted experiments also revealed that a straightforward and transparent search space could address the challenging task of neural architecture search. Further research should be undertaken to explore how a predefined knowledge base of modules could benefit the modular search space.
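The idea of describing each architecture by a vector and searching over it with a genetic algorithm can be sketched as follows. This is a minimal illustration, not the authors' implementation: the number of module slots (`VECTOR_LENGTH`), the operators, and the placeholder `toy_fitness` are all assumptions; only the database size of twenty-five modules comes from the abstract.

```python
import random

NUM_MODULES = 25   # size of the sampled module database, as in the abstract
VECTOR_LENGTH = 4  # assumed number of module slots per architecture


def random_architecture(rng):
    """An architecture is a vector of module indices, one per slot."""
    return [rng.randrange(NUM_MODULES) for _ in range(VECTOR_LENGTH)]


def crossover(parent_a, parent_b, rng):
    """Single-point crossover between two architecture vectors."""
    point = rng.randrange(1, VECTOR_LENGTH)
    return parent_a[:point] + parent_b[point:]


def mutate(vector, rng, rate=0.2):
    """Replace each slot with a random module with probability `rate`."""
    return [rng.randrange(NUM_MODULES) if rng.random() < rate else gene
            for gene in vector]


def toy_fitness(vector):
    """Placeholder score; a real search would train and validate the network."""
    return -sum((gene - 12) ** 2 for gene in vector)


def evolve(generations=20, population_size=10, seed=0):
    """Keep the better half each generation and refill it with mutated children."""
    rng = random.Random(seed)
    population = [random_architecture(rng) for _ in range(population_size)]
    for _ in range(generations):
        population.sort(key=toy_fitness, reverse=True)
        survivors = population[: population_size // 2]
        children = []
        while len(survivors) + len(children) < population_size:
            a, b = rng.sample(survivors, 2)
            children.append(mutate(crossover(a, b, rng), rng))
        population = survivors + children
    return max(population, key=toy_fitness)


best = evolve()
```

In a real setting the fitness call dominates the cost, which is why the size of the search space (here 25 choices per slot) matters so much for the GPU-hour budget.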


Automated Machine Learning on Graphs: A Survey

Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2021/637 ◽

2021 ◽

Author(s):

Ziwei Zhang ◽

Xin Wang ◽

Wenwu Zhu

Keyword(s):

Machine Learning ◽

Learning Algorithm ◽

Research Community ◽

Future Research ◽

Research Directions ◽

Vast Number ◽

Neural Architecture ◽

Future Research Directions ◽

Automated Machine Learning ◽

Optimal Machine

Machine learning on graphs has been extensively studied in both academia and industry. However, as the literature on graph learning booms with a vast number of emerging methods and techniques, it becomes increasingly difficult to manually design the optimal machine learning algorithm for different graph-related tasks. To solve this critical challenge, automated machine learning (AutoML) on graphs, which combines the strengths of graph machine learning and AutoML, is gaining attention from the research community. Therefore, we comprehensively survey AutoML on graphs in this paper, primarily focusing on hyper-parameter optimization (HPO) and neural architecture search (NAS) for graph machine learning. We further give an overview of libraries related to automated graph machine learning and discuss in depth AutoGL, the first dedicated open-source library for AutoML on graphs. Finally, we share our insights on future research directions for automated graph machine learning. To the best of our knowledge, this paper is the first systematic and comprehensive review of automated machine learning on graphs.


TUNING OUT HATE SPEECH ON REDDIT: AUTOMATING MODERATION AND DETECTING TOXICITY IN THE MANOSPHERE

AoIR Selected Papers of Internet Research ◽

10.5210/spir.v2020i0.11352 ◽

2020 ◽

Author(s):

Verity Trott ◽

Jennifer Beckett ◽

Venessa Paech

Keyword(s):

Machine Learning ◽

Social Media ◽

Hate Speech ◽

The Past ◽

Community Platforms ◽

Social Media Platforms ◽

Automated Machine Learning ◽

Tuning Out

Over the past two years, social media platforms have been struggling to moderate at scale. At the same time, they have come under fire for failing to mitigate the risks of perceived ‘toxic’ content or behaviour on their platforms. In an effort to better cope with content moderation and to combat hate speech, ‘dangerous organisations’ and other bad actors present on platforms, discussion has turned to the role that automated machine-learning (ML) tools might play. This paper contributes to thinking about the role and suitability of ML for content moderation on community platforms such as Reddit and Facebook. In particular, it looks at how ML tools operate (or fail to operate) effectively at the intersection between online sentiment within communities and social and platform expectations of acceptable discourse. Through an examination of the r/MGTOW subreddit, we problematise current understandings of the notion of ‘toxicity’ as applied to cultural or social sub-communities online and explain how this interacts with Google’s Perspective tool.


Automated machine learning based on radiomics features predicts H3 K27M mutation in midline gliomas of the brain

Neuro-Oncology ◽

10.1093/neuonc/noz184 ◽

2019 ◽

Cited By ~ 3

Author(s):

Xiaorui Su ◽

Ni Chen ◽

Huaiqiang Sun ◽

Yanhui Liu ◽

Xibiao Yang ◽

...

Keyword(s):

Machine Learning ◽

Prediction Models ◽

Area Under The Curve ◽

Average Precision ◽

Final Model ◽

Mutation Status ◽

K27m Mutation ◽

Discriminatory Accuracy ◽

Fluid Attenuated Inversion Recovery ◽

Automated Machine Learning

Background: Conventional MRI cannot be used to identify H3 K27M mutation status. This study aimed to investigate the feasibility of predicting H3 K27M mutation status by applying an automated machine learning (autoML) approach to the MR radiomics features of patients with midline gliomas. Methods: This single-institution retrospective study included 100 patients with midline gliomas, including 40 patients with H3 K27M mutations and 60 wild-type patients. Radiomics features were extracted from fluid-attenuated inversion recovery images. Prior to autoML analysis, the dataset was randomly stratified into separate 75% training and 25% testing cohorts. The Tree-based Pipeline Optimization Tool (TPOT) was applied to optimize the machine learning pipeline and select important radiomics features. We compared the performance of 10 independent TPOT-generated models based on training and testing cohorts using the area under the curve (AUC) and average precision to obtain the final model. An independent cohort of 22 patients was used to validate the best model. Results: Ten prediction models were generated by TPOT, and the accuracy obtained with the best pipeline ranged from 0.788 to 0.867 for the training cohort and from 0.60 to 0.84 for the testing cohort. After comparison, the AUC value and average precision of the final model were 0.903 and 0.911 in the testing cohort, respectively. In the validation set, the AUC was 0.85, and the average precision was 0.855 for the best model. Conclusions: The autoML classifier using radiomics features of conventional MR images provides high discriminatory accuracy in predicting the H3 K27M mutation status of midline glioma.
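To make the cohort arithmetic concrete, the stratified 75%/25% split can be sketched in plain Python. The abstract does not say how the split was implemented, so the function below (`stratified_split`, a hypothetical name) is only one plausible way to do it; the class counts of 40 and 60 are taken from the abstract.

```python
import random


def stratified_split(labels, train_fraction=0.75, seed=0):
    """Return (train_idx, test_idx), splitting each class separately."""
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        cut = round(len(indices) * train_fraction)
        train_idx.extend(indices[:cut])
        test_idx.extend(indices[cut:])
    return sorted(train_idx), sorted(test_idx)


# Cohort from the abstract: 40 H3 K27M-mutant (1) and 60 wild-type (0) patients.
labels = [1] * 40 + [0] * 60
train_idx, test_idx = stratified_split(labels)
train_mutants = sum(labels[i] for i in train_idx)
test_mutants = sum(labels[i] for i in test_idx)
```

Under these assumptions the training cohort holds 75 patients (30 mutant, 45 wild-type) and the testing cohort 25 (10 mutant, 15 wild-type), so both keep the original 40:60 class ratio.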


Illuminating Elite Patches of Chemical Space

10.26434/chemrxiv.12608228.v1 ◽

2020 ◽

Author(s):

Jonas Verhellen ◽

Jeriek Van den Abeele

Keyword(s):

Machine Learning ◽

Genetic Algorithm ◽

State Of The Art ◽

Chemical Space ◽

Search Space ◽

Robot Design ◽

Soft Robot ◽

Learning Approaches ◽

High Performing ◽

The Past

In the past few years, there has been considerable activity in both academic and industrial research to develop innovative machine learning approaches to locate novel, high-performing molecules in chemical space. Here we describe a new and fundamentally different type of approach that provides a holistic overview of how high-performing molecules are distributed throughout a search space. Based on an open-source, graph-based implementation [Jensen, Chem. Sci., 2019, 12, 3567-3572] of a traditional genetic algorithm for molecular optimisation, and influenced by state-of-the-art concepts from soft robot design [Mouret et al., IEEE Trans. Evolut. Comput., 2016, 22, 623-630], we provide an algorithm that (i) produces a large diversity of high-performing, yet qualitatively different molecules, (ii) illuminates the distribution of optimal solutions, and (iii) improves search efficiency compared to both machine learning and traditional genetic algorithm approaches.
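One common way to obtain this kind of "illumination" is to keep an archive holding the best solution found in each region of a descriptor space, as quality-diversity methods such as MAP-Elites do. The abstract does not spell out the authors' procedure, so the sketch below is only a toy under that assumption: bit strings stand in for molecules, and `fitness`, `niche` and `mutate` are hypothetical placeholders.

```python
import random


def fitness(candidate):
    """Toy objective: number of 1-bits (stands in for a molecular score)."""
    return sum(candidate)


def niche(candidate):
    """Toy descriptor: 1-bits in the first half, used as the archive cell."""
    return sum(candidate[: len(candidate) // 2])


def mutate(candidate, rng):
    """Flip one randomly chosen bit."""
    position = rng.randrange(len(candidate))
    child = list(candidate)
    child[position] = 1 - child[position]
    return child


def illuminate(length=8, iterations=2000, seed=0):
    """Fill an archive mapping each niche to the best candidate seen in it."""
    rng = random.Random(seed)
    start = [0] * length
    archive = {niche(start): start}
    for _ in range(iterations):
        parent = rng.choice(list(archive.values()))
        child = mutate(parent, rng)
        cell = niche(child)
        if cell not in archive or fitness(child) > fitness(archive[cell]):
            archive[cell] = child
    return archive


archive = illuminate()
```

The result is a map from descriptor cell to elite rather than a single optimum, which is what allows the distribution of good solutions across the space to be inspected.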


Illuminating Elite Patches of Chemical Space

10.26434/chemrxiv.12608228 ◽

2020 ◽

Author(s):

Jonas Verhellen ◽

Jeriek Van den Abeele

Keyword(s):

Machine Learning ◽

Genetic Algorithm ◽

State Of The Art ◽

Chemical Space ◽

Search Space ◽

Robot Design ◽

Soft Robot ◽

Learning Approaches ◽

High Performing ◽

The Past


Automated Machine Learning for High-Throughput Image-Based Plant Phenotyping

10.1101/2020.12.03.410746 ◽

2020 ◽

Author(s):

Joshua C.O. Koh ◽

German Spangenberg ◽

Surya Kant

Keyword(s):

Neural Network ◽

Machine Learning ◽

Image Classification ◽

Transfer Learning ◽

High Performance ◽

Plant Phenotyping ◽

Network Architectures ◽

Neural Architecture ◽

Automated Machine Learning ◽

Minimal Effort

Automated machine learning (AutoML) has been heralded as the next wave in artificial intelligence with its promise to deliver high-performance end-to-end machine learning pipelines with minimal effort from the user. AutoML with neural architecture search, which searches for the best neural network architectures in deep learning, has delivered state-of-the-art performance in computer vision tasks such as image classification and object detection. Using wheat lodging assessment with UAV imagery as an example, we compared the performance of an open-source AutoML framework, AutoKeras, in image classification and regression tasks to transfer learning using modern convolutional neural network (CNN) architectures pretrained on the ImageNet dataset. For image classification, transfer learning with Xception and DenseNet-201 achieved the best classification accuracy of 93.2%, whereas AutoKeras had 92.4% accuracy. For image regression, transfer learning with DenseNet-201 had the best performance (R²=0.8303, RMSE=9.55, MAE=7.03, MAPE=12.54%), followed closely by AutoKeras (R²=0.8273, RMSE=10.65, MAE=8.24, MAPE=13.87%). Interestingly, in both tasks, AutoKeras generated compact CNN models with up to 40-fold faster inference times compared to the pretrained CNNs. The merits and drawbacks of AutoML compared to transfer learning for image-based plant phenotyping are discussed.
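For readers unfamiliar with the four regression metrics quoted above, the sketch below computes them from their standard textbook definitions. It assumes the study used these usual definitions, which the abstract does not state, and the input values are made up for illustration rather than taken from the paper.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R^2, RMSE, MAE and MAPE (in percent) for paired observations."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mean_true = sum(y_true) / n
    ss_res = sum(e ** 2 for e in errors)                # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    return {
        "r2": 1 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(e) for e in errors) / n,
        "mape": 100 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n,
    }


# Made-up scores purely for illustration; not data from the study.
metrics = regression_metrics([10.0, 20.0, 40.0], [12.0, 18.0, 40.0])
```

Note that MAPE divides by the true value, so it is undefined when any observation is zero; RMSE penalises large errors more heavily than MAE.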


AutoShrink: A Topology-Aware NAS for Discovering Efficient Neural Architecture

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i04.6163 ◽

2020 ◽

Vol 34 (04) ◽

pp. 6829-6836

Author(s):

Tunhou Zhang ◽

Hsin-Pai Cheng ◽

Zhenwen Li ◽

Feng Yan ◽

Chengyu Huang ◽

...

Keyword(s):

Cell Structure ◽

Search Time ◽

Building Blocks ◽

Search Space ◽

Directed Acyclic Graphs ◽

Neural Architecture ◽

Cell Structures ◽

Acyclic Graphs ◽

Search Approach ◽

Network Patterns

Resource availability is an important constraint when deploying Deep Neural Networks (DNNs) on mobile and edge devices. Existing works commonly adopt the cell-based search approach, which limits the flexibility of network patterns in learned cell structures. Moreover, due to the topology-agnostic nature of existing works, including both cell-based and node-based approaches, the search process is time-consuming and the performance of the found architecture may be sub-optimal. To address these problems, we propose AutoShrink, a topology-aware Neural Architecture Search (NAS) for searching efficient building blocks of neural architectures. Our method is node-based and thus can learn flexible network patterns in cell structures within a topological search space. Directed Acyclic Graphs (DAGs) are used to abstract DNN architectures and progressively optimize the cell structure through edge shrinking. As the search space intrinsically shrinks while the edges are progressively removed, AutoShrink explores a more flexible search space with even less search time. We evaluate AutoShrink on image classification and language tasks by crafting ShrinkCNN and ShrinkRNN models. ShrinkCNN is able to achieve up to 48% parameter reduction and save 34% Multiply-Accumulates (MACs) on ImageNet-1K with accuracy comparable to state-of-the-art (SOTA) models. Specifically, both ShrinkCNN and ShrinkRNN are crafted within 1.5 GPU hours, which is 7.2× and 6.7× faster than the crafting time of SOTA CNN and RNN models, respectively.
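The general notion of starting from a densely connected DAG and progressively removing edges can be illustrated with a small greedy sketch. This is not AutoShrink's actual criterion, which the abstract does not describe: the cost function, the stopping rule (`keep_edges`) and the validity check are all hypothetical choices made for the example.

```python
from itertools import combinations


def complete_dag(num_nodes):
    """All forward edges (i, j) with i < j; node order guarantees acyclicity."""
    return set(combinations(range(num_nodes), 2))


def has_inputs(edges, num_nodes):
    """Every node except the input (0) must keep at least one incoming edge."""
    targets = {j for _, j in edges}
    return all(node in targets for node in range(1, num_nodes))


def shrink(num_nodes, edge_cost, keep_edges):
    """Greedily drop the costliest removable edge until `keep_edges` remain."""
    edges = complete_dag(num_nodes)
    while len(edges) > keep_edges:
        removable = [e for e in edges if has_inputs(edges - {e}, num_nodes)]
        if not removable:
            break
        # Ties on cost are broken by the edge tuple so the result is deterministic.
        edges.remove(max(removable, key=lambda e: (edge_cost(e), e)))
    return edges


# Hypothetical cost: longer skip connections are treated as more expensive.
cell = shrink(num_nodes=4, edge_cost=lambda e: e[1] - e[0], keep_edges=3)
```

With this toy cost the four-node cell shrinks from six edges to the simple chain 0→1→2→3. Each removal also reduces the number of candidate edges left to consider, which mirrors the abstract's point that the search space contracts as shrinking proceeds.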


Techniques for Automated Machine Learning

ACM SIGKDD Explorations Newsletter ◽

10.1145/3447556.3447567 ◽

2021 ◽

Vol 22 (2) ◽

pp. 35-50

Author(s):

Yi-Wei Chen ◽

Qingquan Song ◽

Xia Hu

Keyword(s):

Machine Learning ◽

Optimization Problem ◽

Search Space ◽

Task Type ◽

Iterative Solver ◽

Extensive Experience ◽

Domain Experts ◽

Problem Description ◽

Automated Machine Learning ◽

Optimal Machine

Automated machine learning (AutoML) aims to find optimal machine learning solutions automatically given a problem description, its task type, and datasets. It could relieve data scientists of the burden of the multifarious manual tuning process and give domain experts access to off-the-shelf machine learning solutions without requiring extensive experience. In this paper, we portray AutoML as a bi-level optimization problem, where one problem is nested within another to search for the optimum in the search space, and review the current developments of AutoML in terms of three categories: automated feature engineering (AutoFE), automated model and hyperparameter tuning (AutoMHT), and automated deep learning (AutoDL). State-of-the-art techniques in the three categories are presented. The iterative solver is proposed to generalize AutoML techniques. We summarize popular AutoML frameworks and conclude with current open challenges of AutoML.
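A generic way to write such a bi-level problem is shown below. The notation is an assumption made for illustration and may differ from the paper's: \lambda denotes a configuration drawn from the search space \Lambda (features, model, hyperparameters or architecture), w denotes the model parameters, and the two losses are measured on validation and training data respectively.

```latex
\begin{aligned}
\lambda^{*} &= \arg\min_{\lambda \in \Lambda} \;
  \mathcal{L}_{\mathrm{val}}\bigl(w^{*}(\lambda), \lambda\bigr) \\
\text{s.t.}\quad w^{*}(\lambda) &= \arg\min_{w} \;
  \mathcal{L}_{\mathrm{train}}(w, \lambda)
\end{aligned}
```

The inner problem fits the model for a fixed configuration, and the outer problem chooses the configuration whose fitted model performs best on held-out data.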


Efficient Prediction of Structural and Electronic Properties of Hybrid 2D Materials Using DFT and Machine Learning

10.26434/chemrxiv.6254756.v1 ◽

2018 ◽

Author(s):

Sherif Tawfik ◽

Olexandr Isayev ◽

Catherine Stampfl ◽

Joseph Shapter ◽

David Winkler ◽

...

Keyword(s):

Machine Learning ◽

Band Gap ◽

Density Functional ◽

2D Materials ◽

Van Der Waals ◽

Building Blocks ◽

Machine Learning Techniques ◽

Interlayer Distance ◽

Computational Screening ◽

Wide Range

Materials constructed from different van der Waals two-dimensional (2D) heterostructures offer a wide range of benefits, but these systems have been little studied because of their experimental and computational complexity, and because of the very large number of possible combinations of 2D building blocks. The simulation of the interface between two different 2D materials is computationally challenging due to the lattice mismatch problem, which sometimes necessitates the creation of very large simulation cells for performing density-functional theory (DFT) calculations. Here we use a combination of DFT, linear regression and machine learning techniques to rapidly determine the interlayer distance between two different 2D heterostructures that are stacked in a bilayer heterostructure, as well as the band gap of the bilayer. Our work provides an excellent proof of concept by quickly and accurately predicting a structural property (the interlayer distance) and an electronic property (the band gap) for a large number of hybrid 2D materials. This work paves the way for rapid computational screening of the vast parameter space of van der Waals heterostructures to identify new hybrid materials with useful and interesting properties.


Recent Progress in Machine Learning-based Prediction of Peptide Activity for Drug Discovery

Current Topics in Medicinal Chemistry ◽

10.2174/1568026619666190122151634 ◽

2019 ◽

Vol 19 (1) ◽

pp. 4-16 ◽

Cited By ~ 6

Author(s):

Qihui Wu ◽

Hanzhong Ke ◽

Dongli Li ◽

Qi Wang ◽

Jiansong Fang ◽

...

Keyword(s):

Machine Learning ◽

Drug Discovery ◽

Large Scale ◽

Recent Progress ◽

High Specificity ◽

Learning Approaches ◽

Anticancer Peptides ◽

The Past ◽

Traditional Approaches ◽

Large Scale Screening

Over the past decades, peptides as therapeutic candidates have received increasing attention in drug discovery, especially antimicrobial peptides (AMPs), anticancer peptides (ACPs) and anti-inflammatory peptides (AIPs). Peptides are considered able to regulate various complex diseases that were previously untouchable. In recent years, the critical problem of antimicrobial resistance has driven the pharmaceutical industry to look for new therapeutic agents. Compared to small organic drugs, peptide-based therapy exhibits high specificity and minimal toxicity. Thus, peptides are widely used in the design and discovery of new potent drugs. Currently, large-scale screening of peptide activity with traditional approaches is costly, time-consuming and labor-intensive. Hence, in silico methods, mainly machine learning approaches, have been introduced to predict peptide activity because of their accuracy and effectiveness. In this review, we document the recent progress in machine learning-based prediction of peptide activity, which will be of great benefit to the discovery of potentially active AMPs, ACPs and AIPs.
