Total Nitrogen Estimation in Agricultural Soils via Aerial Multispectral Imaging and LIBS

Abstract Measuring soil health indicators is an important and challenging task that affects farmers’ decisions on the timing, placement, and quantity of fertilizers applied on farms. Most existing methods to measure soil health indicators (SHIs) are in-lab wet chemistry or spectroscopy-based methods, which require significant human input and effort and are time-consuming, costly, and low-throughput. To address this challenge, we develop an artificial intelligence (AI)-driven, near real-time, unmanned aerial vehicle (UAV)-based multispectral sensing (UMS) solution to estimate the total nitrogen (TN) of the soil, an important macro-nutrient and SHI that directly affects crop health. Accurate prediction of soil TN can significantly increase crop yield through informed decision making on the timing of seed planting and on fertilizer quantity and timing. We train two machine learning models, a multi-layer perceptron and a support vector machine, to predict soil nitrogen using a suite of data classes: multispectral characteristics of the soil and crops in the red, near-infrared, and green spectral bands; computed vegetation indices; and environmental variables, including air temperature and relative humidity. To generate the ground-truth training data for the machine learning models, we measure the total nitrogen of soil samples (collected from a farm) using laser-induced breakdown spectroscopy (LIBS).
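A widely used vegetation index built from the red and near-infrared bands is the normalized difference vegetation index, NDVI = (NIR - R) / (NIR + R). The sketch below illustrates that formula only; the reflectance values are hypothetical, and the zero-signal convention is an assumption made here, not something taken from the paper.

```python
def ndvi(nir: float, red: float) -> float:
    """Normalized difference vegetation index from NIR and red reflectance."""
    total = nir + red
    if total == 0:
        # Assumed convention for pixels with no signal in either band.
        return 0.0
    return (nir - red) / total


# Hypothetical reflectance values, for illustration only.
dense_canopy = ndvi(nir=0.50, red=0.10)
bare_soil = ndvi(nir=0.30, red=0.25)
print(f"dense canopy NDVI: {dense_canopy:.3f}")
print(f"bare soil NDVI:    {bare_soil:.3f}")
```

Healthy vegetation reflects strongly in the near-infrared and absorbs red light, so the canopy value comes out well above the bare-soil value.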


Coalbed methane content prediction using deep belief network

Interpretation ◽

10.1190/int-2019-0126.1 ◽

2020 ◽

Vol 8 (2) ◽

pp. T309-T321

Author(s):

Fan Peng ◽

Suping Peng ◽

Wenfeng Du ◽

Hongshuan Liu

Keyword(s):

Machine Learning ◽

Coal Seam ◽

Coalbed Methane ◽

Deep Belief Network ◽

Training Data ◽

Fine Tuning ◽

Support Vector ◽

Learning Models ◽

Belief Network ◽

Machine Learning Models

Accurate measurement of coalbed methane (CBM) content is the foundation for CBM resource exploration and development. Machine-learning techniques can help address CBM content prediction tasks. However, because actual measurement data are scarce and the model structures are shallow, traditional machine-learning models produce results with some error. We have developed a deep belief network (DBN)-based model that takes continuous real values as input and uses the rectified linear unit as its activation function. We first calculated a variety of seismic attributes of the target coal seam to highlight its features, then preprocessed the original attribute features, and finally evaluated the performance of the DBN model using the preprocessed features. We used 23,374 training samples: 23,240 for pretraining and 134 for fine-tuning. To demonstrate the advantages of the DBN model, we compared it with two typical machine-learning models, the multilayer perceptron model and the support vector regression model. These two models were trained on the same labeled training data. The results indicated that the DBN model has the least error, which means that it is more accurate than the other two models when used to predict CBM content.


Comparative Analysis of Machine Learning Models for Day-Ahead Photovoltaic Power Production Forecasting

Energies ◽

10.3390/en14041081 ◽

2021 ◽

Vol 14 (4) ◽

pp. 1081

Author(s):

Spyros Theocharides ◽

Marios Theristis ◽

George Makrides ◽

Marios Kynigos ◽

Chrysovalantis Spanias ◽

...

Keyword(s):

Machine Learning ◽

Supervised Learning ◽

Regression Tree ◽

Training Data ◽

Support Vector ◽

Learning Models ◽

Bayesian Neural Network ◽

Production Forecasting ◽

Main Challenge ◽

Machine Learning Models

A main challenge for integrating intermittent photovoltaic (PV) power generation remains the accuracy of day-ahead forecasts and the establishment of robustly performing methods. The purpose of this work is to address these technological challenges by evaluating the day-ahead PV production forecasting performance of different machine learning models under different supervised learning regimes and minimal input features. Specifically, the day-ahead forecasting capability of Bayesian neural network (BNN), support vector regression (SVR), and regression tree (RT) models was investigated by employing the same dataset for training and performance verification, thus enabling a valid comparison. The training regime analysis demonstrated that the performance of the investigated models was strongly dependent on the timeframe of the train set, the training data sequence, and the application of irradiance condition filters. Furthermore, accurate results were obtained utilizing only the measured power output and other calculated parameters for training. Consequently, useful information is provided for establishing a robust day-ahead forecasting methodology that utilizes calculated input parameters and an optimal supervised learning approach. Finally, the obtained results demonstrated that the optimally constructed BNN outperformed all other machine learning models, achieving forecasting errors lower than 5%.


Detecting Depression Through Gait Data: Examining the Contribution of Gait Features in Recognizing Depression

Frontiers in Psychiatry ◽

10.3389/fpsyt.2021.661213 ◽

2021 ◽

Vol 12 ◽

Author(s):

Yameng Wang ◽

Jingying Wang ◽

Xiaoqian Liu ◽

Tingshao Zhu

Keyword(s):

Machine Learning ◽

Mental Disorders ◽

Frequency Domain ◽

Common Mental Disorders ◽

Microsoft Kinect ◽

Training Data ◽

Support Vector ◽

Learning Models ◽

Gait Features ◽

Machine Learning Models

While depression is one of the most common mental disorders, affecting more than 300 million people across the world, it is often left undiagnosed. This paper investigated the association between depression and gait characteristics with the aim of assisting in diagnosing depression. Our dataset consisted of 121 healthy people and 126 patients with depression who were diagnosed by psychiatrists according to the Diagnostic and Statistical Manual of Mental Disorders. Spatiotemporal, time-domain, and frequency-domain features were extracted from the walking data of the 247 participants recorded by Microsoft Kinect (Version 2). Multiple logistic regression was used to analyze the variance explained by spatiotemporal (12.55%), time-domain (58.36%), and frequency-domain (60.71%) features in recognizing depression, based on Nagelkerke's R² measure. The contributions of the different types of features were further explored by building machine learning models with the support vector machine algorithm. Every combination of the three types of gait features was used as training data for a machine learning model. The results showed that the model trained using only time- and frequency-domain features matched the best performance of the model trained using all the features (sensitivity = 0.94, specificity = 0.91, and AUC = 0.93). These results indicated that depression could be effectively recognized through gait analysis. This approach is a step toward developing low-cost, non-intrusive solutions for real-time depression recognition.
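For readers unfamiliar with the reported metrics, sensitivity is the fraction of true cases that are detected and specificity is the fraction of non-cases that are correctly cleared. The sketch below computes both from confusion-matrix counts; the counts are hypothetical and are not taken from the study.

```python
def sensitivity_specificity(tp: int, fn: int, tn: int, fp: int) -> tuple[float, float]:
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    return sensitivity, specificity


# Hypothetical counts, for illustration only.
sens, spec = sensitivity_specificity(tp=47, fn=3, tn=45, fp=5)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```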


Survey of Public Assay Data: Opportunities and Challenges to Understanding Antimicrobial Resistance

10.1101/2019.12.13.874909 ◽

2019 ◽

Author(s):

Akshay Agarwal ◽

Gowri Nayar ◽

James Kaufman

Keyword(s):

Machine Learning ◽

Antimicrobial Resistance ◽

Pathogen Detection ◽

Ground Truth ◽

Training Data ◽

Learning Models ◽

Learning Methods ◽

Ground Truth Data ◽

Specificity And Sensitivity ◽

Machine Learning Models

Abstract Computational learning methods allow researchers to make predictions, draw inferences, and automate generation of mathematical models. These models are crucial to solving real-world problems, such as antimicrobial resistance, pathogen detection, and protein evolution. Machine learning methods depend upon ground truth data to achieve specificity and sensitivity. Since the data is limited in this case, as we will show during the course of this paper, and as the size of available data increases super-linearly, it is of paramount importance to understand the distribution of ground truth data, the analyses it is suited to, and where it may have limitations that bias downstream learning methods. In this paper, we focus on training data required to model antimicrobial resistance (AR). We report an analysis of bacterial biochemical assay data associated with whole genome sequencing (WGS) from the National Center for Biotechnology Information (NCBI), and discuss important implications when making use of assay data, utilizing genetic features as training data for machine learning models. Complete discussion of machine learning model implementation is outside the scope of this paper and the subject of a later publication. The antimicrobial assay data was obtained from NCBI BioSample, which contains descriptive information about the physical biological specimen from which experimental data is obtained and the results of those experiments themselves.[1] Assay data includes minimum inhibitory concentrations (MIC) of antibiotics, links to associated microbial WGS data, and treatment of a particular microorganism with antibiotics. We observe that there is minimal microbial data available for many antibiotics and for targeted taxonomic groups. The antibiotics with the highest number of assays have fewer than 1500 measurements each. Corresponding bias in available assays makes machine learning problematic for some important microbes and for building more advanced models that can work across microbial genera. In this study we focus, therefore, on the antibiotic with the most assay data (tetracycline) and the corresponding genus with the most available sequence (Acinetobacter, with 14000 measurements across 49 antibiotic compounds). Using this data for training and testing, we observed contradictions in the distribution of assay outcomes and report methods to identify and resolve such conflicts. Per antibiotic, we find that up to 30% of measurements can be (resolvable) conflicts. As more data becomes available, automated training data curation will be an important part of creating useful machine learning models to predict antibiotic resistance. CCS Concepts: Applied computing → Computational biology; Computational genomics; Bioinformatics.
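One simple way to flag conflicting assay outcomes is to group measurements by isolate and antibiotic and report any group with more than one distinct outcome. The sketch below illustrates that idea only; it is not the method reported in the paper, and the record layout, isolate names, and phenotype labels are hypothetical.

```python
from collections import defaultdict


def find_conflicts(records):
    """Map (isolate, antibiotic) to its outcomes when they disagree."""
    outcomes = defaultdict(set)
    for isolate, antibiotic, phenotype in records:
        outcomes[(isolate, antibiotic)].add(phenotype)
    # Keep only the pairs that were reported with more than one outcome.
    return {key: sorted(vals) for key, vals in outcomes.items() if len(vals) > 1}


# Hypothetical records: (isolate, antibiotic, reported phenotype).
records = [
    ("isolate_A", "tetracycline", "resistant"),
    ("isolate_A", "tetracycline", "susceptible"),
    ("isolate_B", "tetracycline", "resistant"),
    ("isolate_B", "tetracycline", "resistant"),
]
conflicts = find_conflicts(records)
print(conflicts)
```

How a flagged conflict should be resolved depends on the assay metadata, so the sketch stops at detection.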


Early Warning System for Online STEM Learning—A Slimmer Approach Using Recurrent Neural Networks

Sustainability ◽

10.3390/su132212461 ◽

2021 ◽

Vol 13 (22) ◽

pp. 12461

Author(s):

Chih-Chang Yu ◽

Yufeng (Leon) Wu

Keyword(s):

Neural Network ◽

Machine Learning ◽

Neural Networks ◽

At Risk ◽

At Risk Students ◽

Training Data ◽

Support Vector ◽

Learning Models ◽

Conventional Machine ◽

Machine Learning Models

While the use of deep neural networks is popular for predicting students’ learning outcomes, convolutional neural network (CNN)-based methods are used more often. Such methods require numerous features, training data, or multiple models to achieve week-by-week predictions. However, many current learning management systems (LMSs) operated by colleges cannot provide adequate information. To make the system more feasible, this article proposes a recurrent neural network (RNN)-based framework to identify at-risk students who might fail the course using only a few common learning features. RNN-based methods can be more effective than CNN-based methods in identifying at-risk students due to their ability to memorize time-series features. The data used in this study were collected from an online course that teaches artificial intelligence (AI) at a university in northern Taiwan. Common features, such as the number of logins, number of posts and number of homework assignments submitted, are considered to train the model. This study compares the prediction results of the RNN model with the following conventional machine learning models: logistic regression, support vector machines, decision trees and random forests. This work also compares the performance of the RNN model with two neural network-based models: the multi-layer perceptron (MLP) and a CNN-based model. The experimental results demonstrate that the RNN model used in this study is better than conventional machine learning models and the MLP in terms of F-score, while achieving similar performance to the CNN-based model with fewer parameters. Our study shows that the designed RNN model can identify at-risk students once one-third of the semester has passed. Some future directions are also discussed.


Total nitrogen estimation in agricultural soils via aerial multispectral imaging and LIBS

Scientific Reports ◽

10.1038/s41598-021-90624-6 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Md Abir Hossen ◽

Prasoon K Diwakar ◽

Shankarachary Ragi

Keyword(s):

Machine Learning ◽

Total Nitrogen ◽

Multispectral Imaging ◽

Ground Truth ◽

Soil Samples ◽

Optimization Techniques ◽

Support Vector ◽

Growth Stages ◽

Calibration Model ◽

Ground Truth Data

Abstract Measuring soil health indicators (SHIs), particularly soil total nitrogen (TN), is an important and challenging task that affects farmers’ decisions on the timing, placement, and quantity of fertilizers applied on farms. Most existing methods to measure SHIs are in-lab wet chemistry or spectroscopy-based methods, which require significant human input and effort and are time-consuming, costly, and low-throughput. To address this challenge, we develop an artificial intelligence (AI)-driven, near real-time, unmanned aerial vehicle (UAV)-based multispectral sensing solution (UMS) to estimate soil TN in an agricultural farm. TN is an important macro-nutrient and SHI that directly affects crop health. Accurate prediction of soil TN can significantly increase crop yield through informed decision making on the timing of seed planting and on fertilizer quantity and timing. The ground-truth data required to train the AI approaches is generated via laser-induced breakdown spectroscopy (LIBS), which can be readily used to characterize soil samples, providing rapid chemical analysis of the samples and their constituents (e.g., nitrogen, potassium, phosphorus, calcium). Although LIBS was previously applied for soil nutrient detection, there is no existing study on the integration of LIBS with UAV multispectral imaging and AI. We train two machine learning (ML) models, multi-layer perceptron regression and support vector regression, to predict soil nitrogen using a suite of data classes: multispectral characteristics of the soil and crops in the red (R), near-infrared, and green (G) spectral bands; computed vegetation indices (NDVI); and environmental variables, including air temperature and relative humidity (RH). To generate the ground-truth training data for the machine learning models, we determine the N spectrum of the soil samples (collected from a farm) using LIBS and develop a calibration model using the correlation between the actual TN of the soil samples and the maximum intensity of the N spectrum. In addition, we extract features from the multispectral images captured while the UAV follows an autonomous flight plan at different growth stages of the crops. The ML models’ performance is tested on a fixed configuration space for the hyper-parameters using various hyper-parameter optimization techniques at three different wavelengths of the N spectrum.
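A calibration model of this kind can be pictured as a fitted curve that maps peak N-line intensity to TN. The sketch below assumes a straight-line relationship fitted by ordinary least squares; both the linear form and the intensity and TN values are assumptions for illustration, not details confirmed by the abstract.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical calibration set: peak N-line intensity (arbitrary units)
# paired with lab-measured TN (percent by mass).
peak_intensity = [100.0, 200.0, 300.0, 400.0]
total_nitrogen = [0.10, 0.20, 0.30, 0.40]

slope, intercept = fit_line(peak_intensity, total_nitrogen)

# Estimate TN for a new sample from its measured peak intensity.
predicted_tn = slope * 250.0 + intercept
print(f"predicted TN: {predicted_tn:.3f} %")
```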


Machine Learning Based Predictions of Dissolved Oxygen in a Small Coastal Embayment

Journal of Marine Science and Engineering ◽

10.3390/jmse8121007 ◽

2020 ◽

Vol 8 (12) ◽

pp. 1007

Author(s):

Manuel Valera ◽

Ryan K. Walter ◽

Barbara A. Bailey ◽

Jose E. Castillo

Keyword(s):

Machine Learning ◽

Dissolved Oxygen ◽

Numerical Models ◽

High Accuracy ◽

Training Data ◽

Machine Learning Techniques ◽

Support Vector ◽

Learning Models ◽

Coastal Embayment ◽

Machine Learning Models

Coastal dissolved oxygen (DO) concentrations have a profound impact on nearshore ecosystems and, in recent years, there has been an increased prevalence of low DO hypoxic events that negatively impact nearshore organisms. Even with advanced numerical models, accurate prediction of coastal DO variability is challenging and computationally expensive. Here, we apply machine learning techniques to reconstruct and predict nearshore DO concentrations in a small coastal embayment using a comprehensive set of nearshore and offshore measurements and easily measured input (training) parameters. We show that both random forest regression (RFR) and support vector regression (SVR) models reproduce both the offshore DO and nearshore DO with extremely high accuracy. In general, RFR consistently performed slightly better than SVR, the latter of which was more difficult to tune and took longer to train. Although each of the nearshore datasets was able to accurately predict DO values using training data from the same site, the model only had moderate success when using training data from one site to predict DO at another site, which was likely due to the complexities in the underlying dynamics across the sites. We also show that high accuracy can be achieved with relatively little training data, highlighting a potential application for correcting time series with missing DO data due to quality control or sensor issues. This work establishes the ability of machine learning models to accurately reproduce DO concentrations in both offshore and nearshore coastal waters, with important implications for the ability to detect and indirectly measure coastal hypoxic events in near real-time. Future work should explore the ability of machine learning models to accurately forecast hypoxic events.


Monitoring the Foliar Nutrients Status of Mango Using Spectroscopy-Based Spectral Indices and PLSR-Combined Machine Learning Models

Remote Sensing ◽

10.3390/rs13040641 ◽

2021 ◽

Vol 13 (4) ◽

pp. 641

Author(s):

Gopal Ramdas Mahajan ◽

Bappa Das ◽

Dayesh Murgaokar ◽

Ittai Herrmann ◽

Katja Berger ◽

...

Keyword(s):

Machine Learning ◽

Remote Sensing ◽

Partial Least Square ◽

Least Square ◽

Partial Least Square Regression ◽

Support Vector ◽

Spectral Indices ◽

Learning Models ◽

Leaf Nutrients ◽

Machine Learning Models

Conventional methods of plant nutrient estimation for nutrient management need a huge number of leaf or tissue samples and extensive chemical analysis, which is time-consuming and expensive. Remote sensing is a viable tool to estimate the plant’s nutritional status to determine the appropriate amounts of fertilizer inputs. The aim of the study was to use remote sensing to characterize the foliar nutrient status of mango through the development of spectral indices, multivariate analysis, chemometrics, and machine learning modeling of the spectral data. A spectral database within the 350–1050 nm wavelength range of the leaf samples and leaf nutrients were analyzed for the development of spectral indices and multivariate models. The normalized difference and ratio spectral indices and the multivariate models, namely partial least square regression (PLSR), principal component regression, and support vector regression (SVR), were ineffective in predicting any of the leaf nutrients. An approach using PLSR-combined machine learning models was found to be the best for predicting most of the nutrients. Based on the independent validation performance and summed ranks, the best performing models were cubist (R² ≥ 0.91, ratio of performance to deviation (RPD) ≥ 3.3, and ratio of performance to interquartile distance (RPIQ) ≥ 3.71) for nitrogen, phosphorus, potassium, and zinc; SVR (R² ≥ 0.88, RPD ≥ 2.73, RPIQ ≥ 3.31) for calcium, iron, copper, and boron; and elastic net (R² ≥ 0.95, RPD ≥ 4.47, RPIQ ≥ 6.11) for magnesium and sulfur. The results of the study revealed the potential of using hyperspectral remote sensing data for non-destructive estimation of mango leaf macro- and micro-nutrients. The developed approach is suggested for use within operational retrieval workflows for precision management of mango orchard nutrients.
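RPD and RPIQ both scale the spread of the observed values by the prediction error: RPD divides the standard deviation of the observations by the RMSE, and RPIQ divides their interquartile range by the RMSE. The sketch below shows one way to compute them. The data are hypothetical, and the choice of the sample standard deviation and of the default quartile method in Python's statistics module are assumptions; the study may have used other conventions.

```python
import math
import statistics


def rmse(observed, predicted):
    """Root mean squared error between paired observations and predictions."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def rpd(observed, predicted):
    """Ratio of performance to deviation: SD(observed) / RMSE."""
    return statistics.stdev(observed) / rmse(observed, predicted)


def rpiq(observed, predicted):
    """Ratio of performance to interquartile distance: IQR(observed) / RMSE."""
    q1, _, q3 = statistics.quantiles(observed, n=4)
    return (q3 - q1) / rmse(observed, predicted)


# Hypothetical nutrient concentrations, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.5, 1.5, 3.5, 3.5, 5.5]

print(f"RMSE = {rmse(observed, predicted):.3f}")
print(f"RPD  = {rpd(observed, predicted):.3f}")
print(f"RPIQ = {rpiq(observed, predicted):.3f}")
```

Higher values of either ratio mean the prediction error is small relative to the natural spread of the measured nutrient.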


Machine learning models to identify low adherence to influenza vaccination among Korean adults with cardiovascular disease

BMC Cardiovascular Disorders ◽

10.1186/s12872-021-01925-7 ◽

2021 ◽

Vol 21 (1) ◽

Author(s):

Moojung Kim ◽

Young Jae Kim ◽

Sung Jin Park ◽

Kwang Gi Kim ◽

Pyung Chun Oh ◽

...

Keyword(s):

Machine Learning ◽

Cardiovascular Disease ◽

Influenza Vaccination ◽

Machine Learning Techniques ◽

Gradient Boosting ◽

Support Vector ◽

Age Group ◽

Learning Models ◽

Extreme Gradient Boosting ◽

Machine Learning Models

Abstract Background: Annual influenza vaccination is an important public health measure to prevent influenza infections and is strongly recommended for cardiovascular disease (CVD) patients, especially in the current coronavirus disease 2019 (COVID-19) pandemic. The aim of this study is to develop a machine learning model to identify Korean adult CVD patients with low adherence to influenza vaccination. Methods: Adults with CVD (n = 815) from a nationally representative dataset of the Fifth Korea National Health and Nutrition Examination Survey (KNHANES V) were analyzed. Among these adults, 500 (61.4%) had answered "yes" to whether they had received seasonal influenza vaccinations in the past 12 months. The classification process was performed using the logistic regression (LR), random forest (RF), support vector machine (SVM), and extreme gradient boosting (XGB) machine learning techniques. Because the Ministry of Health and Welfare in Korea offers free influenza immunization for the elderly, separate models were developed for the < 65 and ≥ 65 age groups. Results: The accuracy of machine learning models using 16 variables as predictors of low influenza vaccination adherence was compared; for the ≥ 65 age group, XGB (84.7%) and RF (84.7%) have the best accuracies, followed by LR (82.7%) and SVM (77.6%). For the < 65 age group, SVM has the best accuracy (68.4%), followed by RF (64.9%), LR (63.2%), and XGB (61.4%). Conclusions: The machine learning models show comparable performance in classifying adult CVD patients with low adherence to influenza vaccination.


414 Deep Neural Networks: A Survey Tool for Obstructive Sleep Apnea Prediction

SLEEP ◽

10.1093/sleep/zsab072.413 ◽

2021 ◽

Vol 44 (Supplement_2) ◽

pp. A164-A164

Author(s):

Pahnwat Taweesedt ◽

JungYoon Kim ◽

Jaehyun Park ◽

Jangwoon Park ◽

Munish Sharma ◽

...

Keyword(s):

Machine Learning ◽

Neural Networks ◽

Obstructive Sleep Apnea ◽

Sleep Apnea ◽

Deep Neural Networks ◽

Support Vector ◽

Learning Models ◽

Obstructive Sleep ◽

Screening Questionnaires ◽

Machine Learning Models

Abstract Introduction: Obstructive sleep apnea (OSA) is a common sleep-related breathing disorder affecting an estimated one billion people. Full-night polysomnography is considered the gold standard for OSA diagnosis. However, it is time-consuming, expensive, and not readily available in many parts of the world. Many screening questionnaires and scores have been proposed for OSA prediction with high sensitivity and low specificity. The present study is intended to develop models with various machine learning techniques to predict the severity of OSA by incorporating features from multiple questionnaires. Methods: Subjects who underwent full-night polysomnography in Torr sleep center, Texas and completed 5 OSA screening questionnaires/scores were included. OSA was diagnosed by using Apnea-Hypopnea Index ≥ 5. We trained five different machine learning models: Deep Neural Networks with scaled principal component analysis (DNN-PCA), Random Forest (RF), Adaptive Boosting classifier (ABC), K-Nearest Neighbors classifier (KNC), and Support Vector Machine Classifier (SVMC). A training:testing subject ratio of 65:35 was used. All features, including demographic data, body measurement, and snoring and sleepiness history, were obtained from 5 OSA screening questionnaires/scores (STOP-BANG questionnaire, Berlin questionnaire, NoSAS score, NAMES score, and No-Apnea score). Performance metrics were used to compare the machine learning models. Results: Of 180 subjects, 51.5% were male, with a mean (SD) age of 53.6 (15.1). One hundred and nineteen subjects were diagnosed with OSA. The Area Under the Receiver Operating Characteristic Curve (AUROC) values of DNN-PCA, RF, ABC, KNC, SVMC, STOP-BANG questionnaire, Berlin questionnaire, NoSAS score, NAMES score, and No-Apnea score were 0.85, 0.68, 0.52, 0.74, 0.75, 0.61, 0.63, 0.61, 0.58, and 0.58, respectively. DNN-PCA showed the highest AUROC, with sensitivity of 0.79, specificity of 0.67, positive predictive value of 0.93, F1 score of 0.86, and accuracy of 0.77. Conclusion: Our results showed that DNN-PCA outperforms OSA screening questionnaires, scores, and other machine learning models.
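The F1 score is the harmonic mean of the positive predictive value (precision) and the sensitivity (recall). As a quick consistency check, the sketch below recomputes it from the rounded values quoted in the abstract. It gives roughly 0.85, close to the reported 0.86; the small gap is plausibly due to rounding of the inputs.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Rounded values quoted in the abstract for DNN-PCA.
f1 = f1_score(precision=0.93, recall=0.79)
print(f"F1 from rounded inputs: {f1:.3f}")
```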
