Techniques for Automated Machine Learning

2021 ◽  
Vol 22 (2) ◽  
pp. 35-50
Author(s):  
Yi-Wei Chen ◽  
Qingquan Song ◽  
Xia Hu

Automated machine learning (AutoML) aims to find optimal machine learning solutions automatically, given a problem description, its task type, and datasets. It can relieve data scientists of the burden of the multifarious manual tuning process and give domain experts access to off-the-shelf machine learning solutions without requiring extensive experience. In this paper, we portray AutoML as a bi-level optimization problem, where one problem is nested within another to search for the optimum in the search space, and review the current developments of AutoML in terms of three categories: automated feature engineering (AutoFE), automated model and hyperparameter tuning (AutoMHT), and automated deep learning (AutoDL). State-of-the-art techniques in the three categories are presented. An iterative solver is proposed to generalize AutoML techniques. We summarize popular AutoML frameworks and conclude with current open challenges of AutoML.
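
For concreteness, the bi-level view can be written as follows; this is a generic sketch in assumed notation (λ for the outer search variables such as pipeline choices or hyperparameters, w for the inner model weights), not a formula taken from the paper:

```latex
\min_{\lambda \in \Lambda} \; \mathcal{L}_{\mathrm{val}}\bigl(w^{*}(\lambda), \lambda\bigr)
\quad \text{s.t.} \quad
w^{*}(\lambda) \in \operatorname*{arg\,min}_{w} \; \mathcal{L}_{\mathrm{train}}(w, \lambda)
```

The outer problem searches the AutoML design space, while the inner problem fits model parameters under each candidate design.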

2021 ◽  
Vol 70 ◽  
pp. 409-472
Author(s):  
Marc-André Zöller ◽  
Marco F. Huber

Machine learning (ML) has become a vital part of many aspects of our daily life. However, building well-performing machine learning applications requires highly specialized data scientists and domain experts. Automated machine learning (AutoML) aims to reduce the demand for data scientists by enabling domain experts to build machine learning applications automatically, without extensive knowledge of statistics and machine learning. This paper combines a survey of current AutoML methods with a benchmark of popular AutoML frameworks on real data sets. Driven by the frameworks selected for evaluation, we summarize and review important AutoML techniques and methods concerning every step in building an ML pipeline. The selected AutoML frameworks are evaluated on 137 data sets from established AutoML benchmark suites.
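
As a point of reference for how frameworks of this kind are driven, here is a minimal sketch using auto-sklearn, one of the commonly benchmarked frameworks; the dataset and time budgets are illustrative and do not reflect the paper's benchmark protocol:

```python
# Minimal sketch of driving an AutoML framework (auto-sklearn shown here).
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
import autosklearn.classification

# Small illustrative dataset; the paper's benchmark spans 137 data sets.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# The framework searches preprocessing steps, models, and hyperparameters
# within the given time budget and returns an ensemble of pipelines.
automl = autosklearn.classification.AutoSklearnClassifier(
    time_left_for_this_task=300,  # total search budget (seconds)
    per_run_time_limit=30,        # budget per candidate pipeline (seconds)
)
automl.fit(X_train, y_train)
print("test accuracy:", accuracy_score(y_test, automl.predict(X_test)))
```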


Author(s):  
Ziwei Zhang ◽  
Xin Wang ◽  
Wenwu Zhu

Machine learning on graphs has been extensively studied in both academia and industry. However, as the literature on graph learning booms with a vast number of emerging methods and techniques, it becomes increasingly difficult to manually design the optimal machine learning algorithm for different graph-related tasks. To address this critical challenge, automated machine learning (AutoML) on graphs, which combines the strengths of graph machine learning and AutoML, is gaining attention from the research community. We therefore comprehensively survey AutoML on graphs in this paper, focusing primarily on hyper-parameter optimization (HPO) and neural architecture search (NAS) for graph machine learning. We further overview libraries related to automated graph machine learning and discuss in depth AutoGL, the first dedicated open-source library for AutoML on graphs. Finally, we share our insights on future research directions for automated graph machine learning. To the best of our knowledge, this paper is the first systematic and comprehensive review of automated machine learning on graphs.
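
To illustrate the HPO side of this survey, below is a minimal, library-agnostic random-search sketch over an assumed GNN hyper-parameter space; the parameter names and the toy scoring function are illustrative assumptions and do not reflect AutoGL's actual API:

```python
import random

# Assumed GNN hyper-parameter space (names are illustrative).
SEARCH_SPACE = {
    "hidden_dim": [16, 32, 64, 128],
    "num_layers": [1, 2, 3],
    "learning_rate": [1e-3, 5e-3, 1e-2],
    "dropout": [0.0, 0.3, 0.5],
}

def evaluate(config):
    # Stand-in for training a GNN with `config` and reporting validation
    # accuracy; a deterministic toy proxy keeps the sketch runnable.
    return (config["hidden_dim"] / 128.0) * (1.0 - config["dropout"]) \
        - 0.05 * config["num_layers"]

def random_search(num_trials=20, seed=0):
    rng = random.Random(seed)
    best_config, best_score = None, float("-inf")
    for _ in range(num_trials):
        # Sample one configuration uniformly at random and score it.
        config = {name: rng.choice(values) for name, values in SEARCH_SPACE.items()}
        score = evaluate(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score

print(random_search())
```

NAS methods covered by the survey replace the flat configuration dictionary with a structured architecture encoding, but follow the same propose-and-evaluate loop.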


2022 ◽  
Vol 54 (8) ◽  
pp. 1-36
Author(s):  
Shubhra Kanti Karmaker (“Santu”) ◽  
Md. Mahadi Hassan ◽  
Micah J. Smith ◽  
Lei Xu ◽  
Chengxiang Zhai ◽  
...  

As big data becomes ubiquitous across domains, and more and more stakeholders aspire to make the most of their data, demand for machine learning tools has spurred researchers to explore the possibilities of automated machine learning (AutoML). AutoML tools aim to make machine learning accessible for non-machine learning experts (domain experts), to improve the efficiency of machine learning, and to accelerate machine learning research. But although automation and efficiency are among AutoML’s main selling points, the process still requires human involvement at a number of vital steps, including understanding the attributes of domain-specific data, defining prediction problems, creating a suitable training dataset, and selecting a promising machine learning technique. These steps often require a prolonged back-and-forth that makes this process inefficient for domain experts and data scientists alike and keeps so-called AutoML systems from being truly automatic. In this review article, we introduce a new classification system for AutoML systems, using a seven-tiered schematic to distinguish these systems based on their level of autonomy. We begin by describing what an end-to-end machine learning pipeline actually looks like, and which subtasks of the machine learning pipeline have been automated so far. We highlight those subtasks that are still done manually—generally by a data scientist—and explain how this limits domain experts’ access to machine learning. Next, we introduce our novel level-based taxonomy for AutoML systems and define each level according to the scope of automation support provided. Finally, we lay out a roadmap for the future, pinpointing the research required to further automate the end-to-end machine learning pipeline and discussing important challenges that stand in the way of this ambitious goal.
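
One way to read the abstract's central observation is as bookkeeping over pipeline subtasks and who performs them. The sketch below encodes the subtasks named above; it is NOT the authors' seven-tier taxonomy, only an illustration of the kind of schematic such a taxonomy formalizes:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Subtask:
    name: str
    automated_today: bool  # handled by typical AutoML tools vs. a human

# Subtasks named in the abstract; the flags reflect its claim that these
# steps still require human involvement, while tuning is widely automated.
PIPELINE = [
    Subtask("understand domain-specific data", automated_today=False),
    Subtask("define the prediction problem", automated_today=False),
    Subtask("create a suitable training dataset", automated_today=False),
    Subtask("select a promising ML technique", automated_today=False),
    Subtask("tune models and hyperparameters", automated_today=True),
]

still_manual = [t.name for t in PIPELINE if not t.automated_today]
print("Still manual:", "; ".join(still_manual))
```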


2020 ◽  
Vol 1 (1) ◽  
pp. 37-44
Author(s):  
P.M. Radiuk

The past years of research have shown that automated machine learning and neural architecture search are an inevitable future for image recognition tasks. A crucial aspect of any automated search is the predefined search space, and, as many studies have demonstrated, the modularization technique may simplify the underlying search space by fostering the reuse of successful blocks. In this regard, the presented research investigates the use of modularization in automated machine learning. In this paper, we propose and examine a modularized search space for neural architecture search based on substantially limiting the search to seeded building blocks. To make the search space viable, we represented all modules of the space as multisectoral networks, so that each architecture within the search space could be unequivocally described by a vector. In our case, a module was a predetermined number of parameterized layers together with information about their relationships. We applied the proposed modular search space to a genetic algorithm and evaluated it on the CIFAR-10 and CIFAR-100 datasets using modules from the NAS-Bench-201 benchmark. To address the complexity of the search space, we randomly sampled twenty-five modules and included them in the module database. Overall, our approach retrieved competitive architectures in an average of 8 GPU hours. The final model achieved validation accuracies of 89.1% and 73.2% on the CIFAR-10 and CIFAR-100 datasets, respectively. The learning process required slightly fewer GPU hours than other approaches, and the resulting network contained fewer parameters, indicating a lightweight model. Such an outcome may indicate the considerable potential of sophisticated ranking approaches. The conducted experiments also revealed that a straightforward and transparent search space can address the challenging task of neural architecture search. Further research should explore how a predefined knowledge base of modules could benefit the modular search space.
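
The search procedure described here, a genetic algorithm over fixed-length vectors indexing a database of modules, can be sketched as follows; the genome length, population sizes, and the fitness stub are illustrative assumptions, not the paper's exact setup:

```python
import random

NUM_MODULES = 25        # modules sampled into the database (as in the abstract)
GENOME_LENGTH = 6       # module slots per architecture (assumed)
POPULATION_SIZE = 20
GENERATIONS = 10
MUTATION_RATE = 0.1

rng = random.Random(0)

def fitness(genome):
    # Stand-in for training the decoded architecture and returning its
    # validation accuracy; a toy deterministic proxy keeps the sketch runnable.
    return sum(genome) / (NUM_MODULES * GENOME_LENGTH)

def crossover(a, b):
    # Single-point crossover between two parent vectors.
    point = rng.randrange(1, GENOME_LENGTH)
    return a[:point] + b[point:]

def mutate(genome):
    # Re-sample each module index with probability MUTATION_RATE.
    return [rng.randrange(NUM_MODULES) if rng.random() < MUTATION_RATE else g
            for g in genome]

population = [[rng.randrange(NUM_MODULES) for _ in range(GENOME_LENGTH)]
              for _ in range(POPULATION_SIZE)]
for _ in range(GENERATIONS):
    population.sort(key=fitness, reverse=True)
    parents = population[: POPULATION_SIZE // 2]  # truncation selection
    children = [mutate(crossover(rng.choice(parents), rng.choice(parents)))
                for _ in range(POPULATION_SIZE - len(parents))]
    population = parents + children

print("best architecture vector:", max(population, key=fitness))
```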


Author(s):  
Silvia Cristina Nunes das Dores ◽  
Carlos Soares ◽  
Duncan Ruiz

2021 ◽  
Vol 0 (0) ◽  
Author(s):  
Idris Kharroubi ◽  
Thomas Lim ◽  
Xavier Warin

We study the approximation of backward stochastic differential equations (BSDEs for short) with a constraint on the gains process. We first discretize the constraint by applying a so-called facelift operator at the times of a grid. We show that this discretely constrained BSDE converges to the continuously constrained one as the mesh of the grid goes to zero. We then focus on the approximation of the discretely constrained BSDE, for which we adopt a machine learning approach. We show that the facelift can be approximated by an optimization problem over a class of neural networks under constraints on the neural network and its derivative. We then derive an algorithm that converges to the discretely constrained BSDE as the number of neurons goes to infinity. We conclude with numerical experiments.
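
In standard notation, a BSDE with a constraint on the gains process Z takes the following generic form; this is a sketch of the usual setting (terminal condition ξ, driver f, Brownian motion W, constraint set C), not the paper's exact formulation:

```latex
Y_t = \xi + \int_t^T f(s, Y_s, Z_s)\,\mathrm{d}s - \int_t^T Z_s\,\mathrm{d}W_s + K_T - K_t,
\qquad Z_t \in \mathcal{C},
```

where K is a nondecreasing process enforcing the constraint on Z. Typically the minimal such solution is the object of interest, and it is this solution that a facelift-based discretization approximates.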


2021 ◽  
Vol 52 (2) ◽  
pp. S3
Author(s):  
Grace Tsui ◽  
Derek S. Tsang ◽  
Chris McIntosh ◽  
Thomas G. Purdie ◽  
Glenn Bauman ◽  
...  
