Automated Machine Learning: a case study of genomic “image-based” prediction in maize hybrids

Abstract: Machine learning methods such as multilayer perceptrons (MLP) and convolutional neural networks (CNN) have emerged as promising methods for genomic prediction (GP). We assess the performance of MLP and CNN on regression and classification tasks in a case study with maize hybrids. The genomic information was provided to the MLP as a relationship matrix and to the CNN as “genomic images”. In the regression task, the machine learning models were compared with GBLUP. In the classification task, MLP and CNN were compared with each other; here the traits (plant height and grain yield) were discretized to create balanced (moderate selection intensity) and unbalanced (extreme selection intensity) datasets for further evaluation. An automatic hyperparameter search was performed for MLP and CNN, and the best models were reported. For both task types, several metrics were calculated under a validation scheme to assess the effect of the prediction method and other variables. Overall, MLP and CNN were competitive with GBLUP but improved on it only slightly when using only the additive genomic layer. This is expected, since the average effect of allele substitution is mostly linear. Nevertheless, the methodology’s potential for GP is unprecedented because we can create “multispectral genome images” that include other effects and layers of data, such as dominance, epistasis, g × e, transcriptome, and so on, capturing linear and non-linear effects and boosting prediction accuracies. Hence, we bring new insights on automated machine learning for genomic prediction and its implications for plant breeding.
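The discretization step described in the abstract can be illustrated with a minimal sketch. The abstract does not state the cut-offs used, so the fractions below (top 50% for the balanced case, top 10% for the unbalanced case), the function name `discretize_trait`, and the synthetic plant heights are all assumptions made for illustration only.

```python
import random


def discretize_trait(values, top_fraction):
    """Label the top `top_fraction` of individuals as 1 and the rest as 0."""
    n_selected = round(len(values) * top_fraction)
    # Indices of individuals ranked from highest to lowest trait value.
    ranked = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    selected = set(ranked[:n_selected])
    return [1 if i in selected else 0 for i in range(len(values))]


random.seed(0)
# Synthetic plant heights (cm); purely illustrative, not data from the study.
plant_height = [random.gauss(200.0, 15.0) for _ in range(1000)]

# Assumed cut-offs: the abstract does not give the actual selection intensities.
balanced = discretize_trait(plant_height, top_fraction=0.5)    # moderate intensity
unbalanced = discretize_trait(plant_height, top_fraction=0.1)  # extreme intensity

print(sum(balanced), sum(unbalanced))  # number of "selected" hybrids in each dataset
```

With a 50% cut-off the two classes are the same size, whereas a 10% cut-off leaves the positive class nine times smaller than the negative one, which is why classification metrics beyond plain accuracy matter in the unbalanced case.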


General-Purpose Automated Machine Learning for Transportation: A Case Study of Auto-sklearn for Traffic Forecasting

Information Processing and Management of Uncertainty in Knowledge-Based Systems - Communications in Computer and Information Science ◽

10.1007/978-3-030-50143-3_57 ◽

2020 ◽

pp. 728-744

Author(s):

Juan S. Angarita-Zapata ◽

Antonio D. Masegosa ◽

Isaac Triguero

Keyword(s):

Machine Learning ◽

General Purpose ◽

Traffic Forecasting ◽

Automated Machine Learning


A case study of fully automated machine learning petrophysical interpretation using unstructured data

10.3997/2214-4609.202075023 ◽

2020 ◽

Author(s):

Saufi Karim ◽

Patrick Lucañas ◽

Ain Nadrah Sazali ◽

Nina Marie Hernandez ◽

Francois Baillard

Keyword(s):

Machine Learning ◽

Unstructured Data ◽

Automated Machine Learning


Leveraging Automated Machine Learning for the Analysis of Global Public Health Data: A Case Study in Malaria

International Journal of Public Health ◽

10.3389/ijph.2021.614296 ◽

2021 ◽

Vol 66 ◽

Author(s):

Elisabetta Manduchi ◽

Jason H. Moore

Keyword(s):

Public Health ◽

Machine Learning ◽

Health Data ◽

Global Public Health ◽

Public Health Data ◽

Automated Machine Learning


Automated Machine Learning based on Genetic Programming: a case study on a real house pricing dataset

2019 1st International Conference on Artificial Intelligence and Data Sciences (AiDAS) ◽

10.1109/aidas47888.2019.8970916 ◽

2019 ◽

Author(s):

Suraya Masrom ◽

Thuraiya Mohd ◽

Nur Syafiqah Jamil ◽

Abdullah Sani Abd. Rahman ◽

Norhayati Baharun

Keyword(s):

Machine Learning ◽

Genetic Programming ◽

Automated Machine Learning


Prediction of Structural Type for City-Scale Seismic Damage Simulation Based on Machine Learning

Applied Sciences ◽

10.3390/app10051795 ◽

2020 ◽

Vol 10 (5) ◽

pp. 1795 ◽

Cited By ~ 3

Author(s):

Zhen Xu ◽

Yuan Wu ◽

Ming-zhu Qi ◽

Ming Zheng ◽

Chen Xiong ◽

...

Keyword(s):

Machine Learning ◽

Prediction Method ◽

Structural Type ◽

Seismic Damage ◽

Training Data ◽

Structural Types ◽

Simulation Based ◽

Damage Simulation ◽

The City

City-scale seismic damage simulations require the structural types of a city's buildings as input data. To this end, a machine learning (ML) method for predicting the structural types of buildings is proposed herein. Specifically, using training data from 230,683 buildings in Tangshan city, China, a supervised ML solution based on a decision forest model was designed for the prediction. The scale sensitivity and regional applicability of the designed solution are discussed, and the results show that the supervised ML solution maintains high accuracy at different scales; however, it is only suitable for cities similar to the sample city. To achieve wide applicability across cities, a semi-supervised ML solution was designed based on sampling investigation and self-training procedures. The downtowns of the Daxing and Tongzhou districts in Beijing were selected as a case study for the designed semi-supervised ML solution. The overall prediction accuracies of structural types for the Daxing and Tongzhou downtowns reach 94.8% and 99.5%, respectively, which is acceptable for seismic damage simulations. Based on the predicted results, the distributions of seismic damage in the Daxing and Tongzhou downtowns were produced. This study provides a smart and efficient method for obtaining structural types for a city-scale seismic damage simulation.
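The self-training idea mentioned in the abstract can be sketched as follows. This is not the authors' implementation: a nearest-centroid classifier stands in for their decision forest, and the confidence measure, the `threshold` value, the class names "A" and "B", and the two-dimensional feature points are all hypothetical choices made to keep the example short.

```python
import math


def fit_centroids(points, labels):
    """Return the mean feature vector of each class."""
    sums, counts = {}, {}
    for p, y in zip(points, labels):
        acc = sums.setdefault(y, [0.0] * len(p))
        for j, v in enumerate(p):
            acc[j] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def predict_with_confidence(centroids, point):
    """Return (label, confidence); confidence is a distance margin in [0, 1]."""
    dists = sorted((math.dist(c, point), y) for y, c in centroids.items())
    (d1, y1), (d2, _) = dists[0], dists[1]  # requires at least two classes
    confidence = 1.0 - d1 / d2 if d2 > 0 else 0.0
    return y1, confidence


def self_train(labeled_x, labeled_y, unlabeled_x, threshold=0.5, max_rounds=10):
    """Iteratively pseudo-label confident unlabeled points and refit the model."""
    x, y = list(labeled_x), list(labeled_y)
    pool = list(unlabeled_x)
    for _ in range(max_rounds):
        centroids = fit_centroids(x, y)
        confident, remaining = [], []
        for p in pool:
            label, conf = predict_with_confidence(centroids, p)
            (confident if conf >= threshold else remaining).append((p, label))
        if not confident:
            break  # nothing new to learn from; stop early
        for p, label in confident:
            x.append(p)
            y.append(label)
        pool = [p for p, _ in remaining]
    return fit_centroids(x, y), pool


# Hypothetical data: two surveyed buildings (labeled) and five unsurveyed ones.
surveyed_x = [(1.0, 1.0), (9.0, 9.0)]
surveyed_y = ["A", "B"]
unsurveyed_x = [(2.0, 1.0), (1.0, 2.0), (8.0, 9.0), (9.0, 8.0), (5.0, 5.0)]

centroids, still_unlabeled = self_train(surveyed_x, surveyed_y, unsurveyed_x)
print(centroids)
print(still_unlabeled)
```

In this toy run the four points near a surveyed building are pseudo-labeled in the first round, while the ambiguous midpoint never reaches the confidence threshold and stays unlabeled; such leftover cases are the kind that a further sampling investigation would need to resolve.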


Short-Term River Flood Forecasting Using Composite Models and Automated Machine Learning: The Case Study of Lena River

Water ◽

10.3390/w13243482 ◽

2021 ◽

Vol 13 (24) ◽

pp. 3482

Author(s):

Mikhail Sarafanov ◽

Yulia Borisova ◽

Mikhail Maslyaev ◽

Ilia Revin ◽

Gleb Maximov ◽

...

Keyword(s):

Machine Learning ◽

High Water ◽

Flood Forecasting ◽

Experimental Comparison ◽

Efficiency Coefficient ◽

Short Term ◽

Lena River ◽

River Flood ◽

Automated Machine Learning

The paper presents a hybrid approach for short-term river flood forecasting. It is based on multi-modal data fusion from different sources (weather stations, water height sensors, remote sensing data). To improve forecasting efficiency, machine learning methods and the Snowmelt-Runoff physical model are combined in a composite modeling pipeline using automated machine learning techniques. The novelty of the study lies in applying automated machine learning to identify the individual blocks of a composite pipeline without involving an expert. This makes it possible to adapt the approach to various river basins and different types of floods. The Lena River basin was used as a case study since its modeling during spring high water is complicated by the high probability of ice-jam flooding events. Experimental comparison with existing methods confirms that the proposed approach reduces the error at each analyzed level gauging station. The Nash–Sutcliffe model efficiency coefficient for the ten stations chosen for comparison is 0.80, whereas the other approaches, based on statistical and physical models, could not surpass the threshold of 0.74. Validation for a high-water period also confirms that a composite pipeline designed using automated machine learning is much more efficient than stand-alone models.
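For readers unfamiliar with the metric quoted above, the Nash–Sutcliffe efficiency compares the squared forecast errors with the variance of the observations around their mean: a value of 1 means a perfect forecast, and 0 means the forecast is no better than always predicting the observed mean. The sketch below uses the standard definition; the function name and the water-level numbers are made up for illustration and are not taken from the paper.

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2)."""
    if len(observed) != len(simulated):
        raise ValueError("observed and simulated must have the same length")
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance


# Illustrative water levels only; not data from the Lena River study.
observed = [2.0, 4.0, 6.0, 8.0]
simulated = [2.5, 3.5, 6.5, 7.5]

nse = nash_sutcliffe(observed, simulated)
print(round(nse, 2))  # 0.95
```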
