Automated Machine Learning for High-Throughput Image-Based Plant Phenotyping

Automated machine learning (AutoML) has been heralded as the next wave in artificial intelligence with its promise to deliver high-performance end-to-end machine learning pipelines with minimal effort from the user. However, despite AutoML showing great promise for computer vision tasks, to the best of our knowledge, no study has used AutoML for image-based plant phenotyping. To address this gap in knowledge, we examined the application of AutoML for image-based plant phenotyping using wheat lodging assessment with unmanned aerial vehicle (UAV) imagery as an example. The performance of an open-source AutoML framework, AutoKeras, in image classification and regression tasks was compared to transfer learning using modern convolutional neural network (CNN) architectures. For image classification, which classified plot images as lodged or non-lodged, transfer learning with Xception and DenseNet-201 achieved the best classification accuracy of 93.2%, whereas AutoKeras had a 92.4% accuracy. For image regression, which predicted lodging scores from plot images, transfer learning with DenseNet-201 had the best performance (R2 = 0.8303, root mean-squared error (RMSE) = 9.55, mean absolute error (MAE) = 7.03, mean absolute percentage error (MAPE) = 12.54%), followed closely by AutoKeras (R2 = 0.8273, RMSE = 10.65, MAE = 8.24, MAPE = 13.87%). In both tasks, AutoKeras models had up to 40-fold faster inference times compared to the pretrained CNNs. AutoML has significant potential to enhance plant phenotyping capabilities applicable in crop breeding and precision agriculture.
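The regression metrics reported in this abstract (R2, RMSE, MAE, MAPE) follow standard definitions, which the minimal sketch below illustrates. It assumes plain Python lists of observed and predicted lodging scores; the function name `regression_metrics` and the sample values are hypothetical and do not come from the study.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return R2, RMSE, MAE and MAPE (%) for paired observations and predictions.

    Assumes no observed value is zero (MAPE is undefined there) and that the
    observed values are not all identical (R2 is undefined there).
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    mean_true = sum(y_true) / n
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)                 # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(e) for e in errors) / n,
        "mape": 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n,
    }

# Hypothetical lodging scores (percent), for illustration only.
observed = [10.0, 20.0, 40.0, 50.0]
predicted = [12.0, 18.0, 44.0, 46.0]
metrics = regression_metrics(observed, predicted)
```

Because MAPE divides by the observed value, plots with an observed score of zero would need to be excluded or handled separately before this metric is computed.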

Automated Machine Learning for High-Throughput Image-Based Plant Phenotyping

10.1101/2020.12.03.410746 ◽

2020 ◽

Author(s):

Joshua C.O. Koh ◽

German Spangenberg ◽

Surya Kant

Keyword(s):

Neural Network ◽

Machine Learning ◽

Image Classification ◽

Transfer Learning ◽

High Performance ◽

Plant Phenotyping ◽

Network Architectures ◽

Neural Architecture ◽

Automated Machine Learning ◽

Minimal Effort

Automated machine learning (AutoML) has been heralded as the next wave in artificial intelligence with its promise to deliver high-performance end-to-end machine learning pipelines with minimal effort from the user. AutoML with neural architecture search, which searches for the best neural network architectures in deep learning, has delivered state-of-the-art performance in computer vision tasks such as image classification and object detection. Using wheat lodging assessment with UAV imagery as an example, we compared the performance of an open-source AutoML framework, AutoKeras, in image classification and regression tasks to transfer learning using modern convolutional neural network (CNN) architectures pretrained on the ImageNet dataset. For image classification, transfer learning with Xception and DenseNet-201 achieved the best classification accuracy of 93.2%, whereas AutoKeras had 92.4% accuracy. For image regression, transfer learning with DenseNet-201 had the best performance (R2=0.8303, RMSE=9.55, MAE=7.03, MAPE=12.54%), followed closely by AutoKeras (R2=0.8273, RMSE=10.65, MAE=8.24, MAPE=13.87%). Interestingly, in both tasks, AutoKeras generated compact CNN models with up to 40-fold faster inference times compared to the pretrained CNNs. The merits and drawbacks of AutoML compared to transfer learning for image-based plant phenotyping are discussed.

Prediction of tensile strength of polymer carbon nanotube composites using practical machine learning method

Journal of Composite Materials ◽

10.1177/0021998320953540 ◽

2020 ◽

pp. 002199832095354 ◽

Cited By ~ 5

Author(s):

Tien-Thinh Le

Keyword(s):

Machine Learning ◽

Mechanical Properties ◽

Tensile Strength ◽

Carbon Nanotube ◽

Polymer Matrix ◽

Mean Squared Error ◽

Gaussian Process Regression ◽

Weight Fraction ◽

Percentage Error ◽

Input Variables

This paper develops a practical Machine Learning (ML)-based model for predicting the tensile strength of polymer carbon nanotube (CNT) composites. To this end, a database composed of 11 input variables was compiled from the available literature. The input variables for predicting the tensile strength of nanocomposites were selected to cover the following main aspects: (i) type of polymer matrix, (ii) mechanical properties of the polymer matrix, (iii) physical characteristics of the CNTs, (iv) mechanical properties of the CNTs and (v) incorporation parameters such as CNT weight fraction, CNT surface modification method and processing method. As the prediction problem is high-dimensional (with 11 dimensions), the Gaussian Process Regression (GPR) model was selected and optimized by means of a parametric study. The correlation coefficient (R), Willmott’s index of agreement (IA), slope of regression, Mean Absolute Percentage Error (MAPE), Root Mean Squared Error (RMSE) and Mean Absolute Error (MAE) were employed as error measurement criteria when training the GPR model. The GPR model exhibited good performance for both the training and testing parts (RMSE = 5.982 and 5.327 MPa, MAE = 3.447 and 3.539 MPa, respectively). In addition, uncertainty analysis was applied to estimate the prediction confidence intervals. Finally, the prediction capability of the GPR model over different ranges of input variable values was investigated and discussed. For practical application, a Graphical User Interface (GUI) was developed in Matlab for predicting the tensile strength of nanocomposites.
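Of the criteria listed in this abstract, Willmott's index of agreement (IA) is the least commonly implemented, so a minimal sketch may help. It assumes the original 1981 form of the index, IA = 1 - Σ(P - O)² / Σ(|P - Ō| + |O - Ō|)²; the abstract does not state which variant the author used, and the function name and sample values below are hypothetical.

```python
def index_of_agreement(observed, predicted):
    """Willmott's index of agreement (original 1981 form, assumed here).

    Returns 1.0 for a perfect match; values closer to 0 indicate poorer agreement.
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    # Numerator: sum of squared prediction errors.
    squared_error = sum((p - o) ** 2 for o, p in zip(observed, predicted))
    # Denominator: the "potential error" around the observed mean.
    potential_error = sum(
        (abs(p - mean_obs) + abs(o - mean_obs)) ** 2
        for o, p in zip(observed, predicted)
    )
    return 1.0 - squared_error / potential_error

# Hypothetical tensile strengths in MPa, for illustration only.
measured = [10.0, 20.0, 30.0]
estimated = [12.0, 18.0, 33.0]
ia = index_of_agreement(measured, estimated)
```

Unlike the correlation coefficient, this index penalizes systematic offsets between predictions and observations, which is one reason it is often reported alongside R.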

mAML: an automated machine learning pipeline with a microbiome repository for human disease classification

10.1101/2020.02.11.943316 ◽

2020 ◽

Author(s):

Fenglong Yang ◽

Quan Zou

Keyword(s):

Machine Learning ◽

Human Disease ◽

High Performance ◽

Model Building ◽

Disease Classification ◽

Benchmark Datasets ◽

Automated Machine Learning ◽

Classification Tasks ◽

Interpretable Models ◽

Multi Class Classification

Abstract Due to the concerted efforts to utilize microbial features to improve disease prediction capabilities, automated machine learning (AutoML) systems designed to remove the tedium of manually performing ML tasks are in great demand. Here we developed mAML, an ML model-building pipeline, which can automatically and rapidly generate optimized and interpretable models for personalized microbial classification tasks in a reproducible way. The pipeline is deployed on a web-based platform, and the server is user-friendly, flexible, and designed to be scalable according to specific requirements. This pipeline exhibits high performance for 13 benchmark datasets including both binary and multi-class classification tasks. In addition, to facilitate the application of mAML and expand the human disease-related microbiome learning repository, we developed the GMrepo ML repository (GMrepo Microbiome Learning repository) from the GMrepo database. The repository involves 120 microbial classification tasks for 85 human-disease phenotypes referring to 12,429 metagenomic samples and 38,643 amplicon samples. The mAML pipeline and the GMrepo ML repository are expected to be important resources for research in microbiology and algorithm development. Database URL: http://39.100.246.211:8050/Home

Sugarcane Yield Mapping Using High-Resolution Imagery Data and Machine Learning Technique

Remote Sensing ◽

10.3390/rs13020232 ◽

2021 ◽

Vol 13 (2) ◽

pp. 232

Author(s):

Tatiana Fernanda Canata ◽

Marcelo Chan Fu Wei ◽

Leonardo Felipe Maldaner ◽

José Paulo Molin

Keyword(s):

Machine Learning ◽

Precision Agriculture ◽

Mean Squared Error ◽

Machine Learning Technique ◽

Yield Mapping ◽

Essential Information ◽

Spectral Bands ◽

Learning Technique ◽

Sugarcane Yield ◽

Imagery Data

Yield maps provide essential information to guide precision agriculture (PA) practices. Yet, on-board yield monitoring for sugarcane can be challenging. At the same time, orbital images have been widely used for indirect yield estimation for many crops such as wheat, corn, and rice, but not for sugarcane. The objective of this study is therefore to explore the potential of multi-temporal imagery data as an alternative for sugarcane yield mapping. The study was based on developing predictive sugarcane yield models integrating time-series orbital imaging and a machine learning technique. A commercial sugarcane site was selected, and Sentinel-2 images were acquired from the beginning of ratoon sprouting until harvesting over two consecutive cropping seasons. The predictive yield models RF (Random Forest) and MLR (Multiple Linear Regression) were developed using orbital images and yield maps generated by a commercial sensor system during harvesting. The original yield data were filtered and interpolated to the same spatial resolution as the orbital images. The entire dataset was divided into training and testing datasets. Spectral bands, especially the near-infrared at the tillering crop stage, contributed more to predicting sugarcane yield than derived spectral vegetation indices. The Root Mean Squared Error (RMSE) obtained for the RF regression based on multiple spectral bands was 4.63 Mg ha−1 with an R2 of 0.70 for the testing dataset. Overall, the RF regression performed better than the MLR in predicting sugarcane yield.

Virtual to Real-World Transfer Learning: A Systematic Review

Electronics ◽

10.3390/electronics10121491 ◽

2021 ◽

Vol 10 (12) ◽

pp. 1491

Author(s):

Mahesh Ranaweera ◽

Qusay H. Mahmoud

Keyword(s):

Machine Learning ◽

Systematic Review ◽

Transfer Learning ◽

Real World ◽

High Performance ◽

Research Area ◽

Training Data ◽

Machine Learning Techniques ◽

Current Status ◽

The Real

Machine learning has become an important research area in many domains and real-world applications. The prevailing assumption in traditional machine learning techniques, that training and testing data should be of the same domain, is a challenge. In the real world, gathering enough training data to create high-performance learning models is not easy. Sometimes data are unavailable, very expensive, or dangerous to collect. In such scenarios, machine learning does not live up to its potential. Transfer learning has recently gained much acclaim in the research community, as it can create high-performance learners through virtual environments or by using data gathered from other domains. This systematic review (a) defines transfer learning; (b) discusses the recent research conducted; (c) presents the current status of transfer learning; and finally, (d) discusses how transfer learning can bridge the gap between the virtual and the real.

Machine Learning Modelling of the Relationship between Weather and Paddy Yield in Sri Lanka

Journal of Mathematics ◽

10.1155/2021/9941899 ◽

2021 ◽

Vol 2021 ◽

pp. 1-14

Author(s):

Piyal Ekanayake ◽

Windhya Rankothge ◽

Rukmal Weliwatta ◽

Jeevani W. Jayasinghe

Keyword(s):

Machine Learning ◽

Sri Lanka ◽

Relative Humidity ◽

Mean Squared Error ◽

Absolute Error ◽

Maximum Temperature ◽

Percentage Error ◽

Pairwise Correlation ◽

Geographical Regions ◽

Paddy Yield

This paper presents the development of crop-weather models for the paddy yield in Sri Lanka based on nine weather indices, namely, rainfall, relative humidity (minimum and maximum), temperature (minimum and maximum), wind speed (morning and evening), evaporation, and sunshine hours. The statistics of seven geographical regions, which contribute about two-thirds of the country’s total paddy production, were used for this study. The significance of the weather indices for the paddy yield was explored by employing Random Forest (RF), and the variable importance of each index was determined. Pearson’s correlation and Spearman’s correlation were used to determine whether each correlation is positive or negative. Further, the pairwise correlation among the weather indices was examined. The results indicate that the minimum relative humidity and the maximum temperature during the paddy cultivation period are the most influential weather indices. Moreover, RF was used to develop a paddy yield prediction model and four more techniques, namely, Power Regression (PR), Multiple Linear Regression (MLR) with stepwise selection, forward (step-up) selection, and backward (step-down) elimination, were used to benchmark the performance of the machine learning technique. Their performances were compared in terms of the Root Mean Squared Error (RMSE), Correlation Coefficient (R), Mean Absolute Error (MAE), and the Mean Absolute Percentage Error (MAPE). According to the results, RF is a reliable and accurate model for the prediction of paddy yield in Sri Lanka, demonstrating a very high R of 0.99 and the lowest MAPE of 1.4%.
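The abstract uses both Pearson's and Spearman's correlation to read off the direction of each weather-yield relationship. The minimal sketch below shows how the two differ: Spearman's coefficient is Pearson's coefficient applied to ranks, with tied values given their average rank. The function names and the temperature and yield values are hypothetical; the paper does not describe its implementation.

```python
import math

def pearson(x, y):
    """Pearson's linear correlation coefficient for two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman's rank correlation: Pearson's coefficient computed on ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical maximum temperatures (deg C) and paddy yields (t/ha).
max_temp = [30.0, 31.0, 32.0, 33.0, 34.0]
paddy_yield = [5.0, 4.8, 4.1, 4.0, 2.0]
r_pearson = pearson(max_temp, paddy_yield)
r_spearman = spearman(max_temp, paddy_yield)
```

In this made-up example the yield falls monotonically but not linearly with temperature, so Spearman's coefficient reaches -1 while Pearson's stays above -1; both agree on the negative direction.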

Automated Machine Learning Overview

Research Papers Faculty of Materials Science and Technology Slovak University of Technology ◽

10.2478/rput-2019-0033 ◽

2019 ◽

Vol 27 (45) ◽

pp. 107-112

Author(s):

Roman Budjač ◽

Marcel Nikmon ◽

Peter Schreiber ◽

Barbora Zahradníková ◽

Dagmar Janáčová

Keyword(s):

Machine Learning ◽

Image Classification ◽

Learning Model ◽

Machine Learning Method ◽

Learning Method ◽

Learning Tasks ◽

Machine Learning Model ◽

Automated Machine Learning ◽

Expertise Level

Abstract This paper explores in more depth the new field of automated machine learning, which shows promising results in specific machine learning tasks, e.g. image classification. The article summarizes the most successful approaches now available in the A.I. community. The automated machine learning method is described only briefly here, but the concept of automated task solving seems very promising, since it can significantly reduce the level of expertise required of a person developing a machine learning model. We used Auto-Keras to find the best architecture on several datasets, demonstrated several automated machine learning features, and discussed the issue in more depth.

Generation of Synthetic Photoelectric Log using Machine Learning Approach

10.2118/208201-ms ◽

2021 ◽

Author(s):

Mohammad Rasheed Khan ◽

Zeeshan Tariq ◽

Mohamed Mahmoud

Keyword(s):

Machine Learning ◽

Mean Squared Error ◽

Coefficient Of Determination ◽

Percentage Error ◽

Well Log ◽

Log Data ◽

Hydrocarbon Reservoir ◽

Analysis Scheme ◽

Data Points ◽

Error Metric

Abstract Photoelectric factor (PEF) is one of the functional parameters of a hydrocarbon reservoir that can provide invaluable data for reservoir characterization. Well logs are critical to formation evaluation processes; however, they are not always readily available due to unfeasible logging conditions. In addition, with the call for efficiency in the hydrocarbon E&P business, it has become imperative to optimize logging programs to acquire maximum data with minimal cost impact. As a result, the present study proposes an improved strategy for generating a synthetic log by establishing a quantitative relationship between conventional well log data, rock mineralogical content and PEF. A total of 230 data points were used to implement the machine learning (ML) methodology, which begins with a statistical analysis scheme. The input logs used for architecture establishment include the density and sonic logs. Moreover, rock mineralogical content (carbonate, quartz, clay), which is strongly correlated with the PEF, has been incorporated for model development. At the next stage of this study, the architecture of an artificial neural network (ANN) was developed and optimized to predict the PEF from conventional well log data. A subset of data points was used for ML model construction and another unseen set was employed to assess the model performance. The synthetic PEF log generated using the developed ANN correlation is compared with the actual well log data available and demonstrates an average absolute percentage error of less than 0.38. In addition, a comprehensive error metric analysis is presented, which shows a coefficient of determination of more than 0.99 and a root mean squared error of only 0.003. The numerical analysis of the error metrics points towards the robustness of the ANN model and its capability to link mineralogical content with the PEF.

mAML: an automated machine learning pipeline with a microbiome repository for human disease classification

Database ◽

10.1093/database/baaa050 ◽

2020 ◽

Vol 2020 ◽

Author(s):

Fenglong Yang ◽

Quan Zou

Keyword(s):

Machine Learning ◽

Human Disease ◽

High Performance ◽

Model Building ◽

Disease Classification ◽

Benchmark Datasets ◽

Automated Machine Learning ◽

Classification Tasks ◽

Interpretable Models ◽

Multi Class Classification

Abstract Due to the concerted efforts to utilize microbial features to improve disease prediction capabilities, automated machine learning (AutoML) systems aiming to remove the tedium of manually performing ML tasks are in great demand. Here we developed mAML, an ML model-building pipeline, which can automatically and rapidly generate optimized and interpretable models for personalized microbiome-based classification tasks in a reproducible way. The pipeline is deployed on a web-based platform, while the server is user-friendly and flexible and has been designed to be scalable according to specific requirements. This pipeline exhibits high performance for 13 benchmark datasets including both binary and multi-class classification tasks. In addition, to facilitate the application of mAML and expand the human disease-related microbiome learning repository, we developed the GMrepo ML repository (GMrepo Microbiome Learning repository) from the GMrepo database. The repository involves 120 microbiome-based classification tasks for 85 human-disease phenotypes referring to 12 429 metagenomic samples and 38 643 amplicon samples. The mAML pipeline and the GMrepo ML repository are expected to be important resources for research in microbiology and algorithm development. Database URL: http://lab.malab.cn/soft/mAML

Prediction of Time Series Sales Transaction Data Using LSTM Regression (PREDIKSI DATA TRANSAKSI PENJUALAN TIME SERIES MENGGUNAKAN REGRESI LSTM)

Jurnal Nasional Pendidikan Teknik Informatika (JANAPATI) ◽

10.23887/janapati.v9i1.19140 ◽

2020 ◽

Vol 9 (1) ◽

pp. 1

Author(s):

Marie Luthfi Ashari ◽

Mujiono Sadikin

Keyword(s):

Machine Learning ◽

Time Series ◽

Short Term Memory ◽

Mean Squared Error ◽

Percentage Error ◽

Short Term ◽

Term Memory ◽

Root Mean Squared Error ◽

Squared Error ◽

Long Short Term Memory

To win competition in the market, pharmaceutical companies must produce high-quality medicinal products. Producing high-quality products requires good and efficient production planning, and one basis for production planning is sales prediction. PT. Metiska Farma has applied a prediction method in its production process, but the resulting predictions are inaccurate, so market demand is not met optimally. To reduce this inaccuracy, the research presented in this paper tests prediction using a Machine Learning technique, namely Long Short Term Memory (LSTM) regression. The proposed technique was tested on the sales dataset of product “X” from PT. Metiska Farma, with Root Mean Squared Error (RMSE) and Mean Absolute Percentage Error (MAPE) as performance parameters. The results of this study are the average error evaluation values from modeling the training data and the testing data. The results show that LSTM regression produces sales predictions with an RMSE of 286.465.424 for the training data and 187.013.430 for the testing data. The MAPE values are 787% and 309% for the training data and the testing data, respectively.
