Development of Machine Learning Models to Predict Compressed Sward Height in Walloon Pastures Based on Sentinel-1, Sentinel-2 and Meteorological Data Using Multiple Data Transformations

2021 ◽  
Vol 13 (3) ◽  
pp. 408
Author(s):  
Charles Nickmilder ◽  
Anthony Tedde ◽  
Isabelle Dufrasne ◽  
Françoise Lessire ◽  
Bernard Tychon ◽  
...  

Accurate information about the available standing biomass on pastures is critical for the adequate management of grazing and its promotion to farmers. In this paper, machine learning models are developed to predict available biomass expressed as compressed sward height (CSH) from readily accessible meteorological, optical (Sentinel-2) and radar satellite data (Sentinel-1). This study assumed that combining heterogeneous data sources, data transformations and machine learning methods would improve the robustness and the accuracy of the developed models. A total of 72,795 records of CSH with a spatial positioning, collected in 2018 and 2019, were used and aggregated according to a pixel-like pattern. The resulting dataset was split into a training one with 11,625 pixellated records and an independent validation one with 4952 pixellated records. The models were trained with a 19-fold cross-validation. A wide range of performances was observed (with mean root mean square error (RMSE) of cross-validation ranging from 22.84 mm of CSH to infinite-like values), and the four best-performing models were a cubist, a glmnet, a neural network and a random forest. These models had an RMSE of independent validation lower than 20 mm of CSH at the pixel-level. To simulate the behavior of the model in a decision support system, performances at the paddock level were also studied. These were computed according to two scenarios: either the predictions were made at a sub-parcel level and then aggregated, or the data were aggregated at the parcel level and the predictions were made for these aggregated data. The results obtained in this study were more accurate than those found in the literature concerning pasture budgeting and grassland biomass evaluation. The training of the 124 models resulting from the described framework was part of the realization of a decision support system to help farmers in their daily decision making.
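The two paddock-level scenarios described in this abstract can be illustrated with a minimal Python sketch. The `toy_model` function and the feature values are hypothetical stand-ins, not the models or data of the study. The sketch only shows that, for a non-linear model, predicting per pixel and then averaging need not give the same value as averaging the inputs and then predicting.

```python
from statistics import mean


def toy_model(feature: float) -> float:
    """Hypothetical non-linear stand-in for a trained CSH model (output in mm)."""
    return 30.0 + 0.5 * feature ** 2


def predict_then_aggregate(pixel_features):
    """Scenario 1: predict at sub-parcel (pixel) level, then average the predictions."""
    return mean(toy_model(x) for x in pixel_features)


def aggregate_then_predict(pixel_features):
    """Scenario 2: average the inputs at parcel level, then predict once."""
    return toy_model(mean(pixel_features))


# Hypothetical feature values for the pixels of one paddock.
paddock = [2.0, 4.0, 6.0]
scenario_1 = predict_then_aggregate(paddock)
scenario_2 = aggregate_then_predict(paddock)
print(scenario_1, scenario_2)
```

With a purely linear model the two scenarios would coincide, so any gap between them reflects the non-linearity of the model and the spread of the pixel values.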

2022 ◽  
Vol 193 ◽  
pp. 106688
Author(s):  
Christoforos-Nikitas Kasimatis ◽  
Evangelos Psomakelis ◽  
Nikolaos Katsenios ◽  
Giannis Katsenios ◽  
Marilena Papatheodorou ◽  
...  

2022 ◽  
Vol 14 (1) ◽  
pp. 229
Author(s):  
Jiarui Shi ◽  
Qian Shen ◽  
Yue Yao ◽  
Junsheng Li ◽  
Fu Chen ◽  
...  

Chlorophyll-a concentration in water bodies is one of the most important environmental evaluation indicators for monitoring the water environment. Small water bodies include headwater streams, springs, ditches, flushes, small lakes, and ponds, which represent important freshwater resources. However, the relatively narrow and fragmented nature of small water bodies makes it difficult to monitor chlorophyll-a via medium-resolution remote sensing. In the present study, we first fused Gaofen-6 (a new Chinese satellite) images to obtain 2 m resolution images with 8 bands, which, like Sentinel-2, proved to be a good data source for chlorophyll-a monitoring in small water bodies. Further, we compared five semi-empirical and four machine learning models to estimate chlorophyll-a concentrations via simulated reflectance using the fused Gaofen-6 and Sentinel-2 spectral response functions. The results showed that the extreme gradient boosting tree model (one of the machine learning models) was the most accurate. The mean relative error (MRE) was 9.03%, and the root-mean-square error (RMSE) was 4.5 mg/m³ for the Sentinel-2 sensor, while for the fused Gaofen-6 image, MRE was 6.73%, and RMSE was 3.26 mg/m³. Thus, both fused Gaofen-6 and Sentinel-2 could estimate the chlorophyll-a concentrations in small water bodies. The fused Gaofen-6 exhibited a higher spatial resolution, while Sentinel-2 exhibited a higher temporal resolution.
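The two error measures quoted in this abstract can be sketched as follows. The abstract does not give the formulas, so the commonly used definitions are assumed here (MRE as the mean of absolute errors relative to the observed value, in percent). The sample concentrations are invented for illustration only.

```python
import math


def mean_relative_error(observed, predicted):
    """Mean of |predicted - observed| / observed, expressed in percent (assumed definition)."""
    ratios = [abs(p - o) / o for o, p in zip(observed, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


def rmse(observed, predicted):
    """Root-mean-square error, in the same unit as the inputs."""
    squared = [(p - o) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical chlorophyll-a concentrations in mg/m^3.
observed = [10.0, 20.0, 40.0]
predicted = [11.0, 18.0, 44.0]

mre_value = mean_relative_error(observed, predicted)
rmse_value = rmse(observed, predicted)
print(mre_value, rmse_value)
```

MRE is scale-free, whereas RMSE carries the unit of the target and weights large errors more heavily, which is why the two are usually reported together.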


2020 ◽  
Author(s):  
Tahmina Nasrin Poly ◽  
Md.Mohaimenul Islam ◽  
Muhammad Solihuddin Muhtar ◽  
Hsuan-Chia Yang ◽  
Phung Anh (Alex) Nguyen ◽  
...  

BACKGROUND Computerized physician order entry (CPOE) systems are incorporated into clinical decision support systems (CDSSs) to reduce medication errors and improve patient safety. Automatic alerts generated from CDSSs can directly assist physicians in making useful clinical decisions and can help shape prescribing behavior. Multiple studies reported that approximately 90%-96% of alerts are overridden by physicians, which raises questions about the effectiveness of CDSSs. There is intense interest in developing sophisticated methods to combat alert fatigue, but there is no consensus on the optimal approaches so far. OBJECTIVE Our objective was to develop machine learning prediction models to predict physicians’ responses in order to reduce alert fatigue from disease medication–related CDSSs. METHODS We collected data from a disease medication–related CDSS from a university teaching hospital in Taiwan. We considered prescriptions that triggered alerts in the CDSS between August 2018 and May 2019. Machine learning models, such as artificial neural network (ANN), random forest (RF), naïve Bayes (NB), gradient boosting (GB), and support vector machine (SVM), were used to develop prediction models. The data were randomly split into training (80%) and testing (20%) datasets. RESULTS A total of 6453 prescriptions were used in our model. The ANN machine learning prediction model demonstrated excellent discrimination (area under the receiver operating characteristic curve [AUROC] 0.94; accuracy 0.85), whereas the RF, NB, GB, and SVM models had AUROCs of 0.93, 0.91, 0.91, and 0.80, respectively. The sensitivity and specificity of the ANN model were 0.87 and 0.83, respectively. CONCLUSIONS In this study, ANN showed substantially better performance in predicting individual physician responses to an alert from a disease medication–related CDSS, as compared to the other models. 
To our knowledge, this is the first study to use machine learning models to predict physician responses to alerts; furthermore, it can help to develop sophisticated CDSSs in real-world clinical settings.
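As a reminder of what the AUROC values above measure: AUROC can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. A minimal pairwise sketch in Python follows; the labels and scores are hypothetical and unrelated to the study data.

```python
def auroc(labels, scores):
    """Pairwise AUROC: share of (positive, negative) pairs ranked correctly; ties count as half."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical labels (1 = positive class) and model scores.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]

auc_value = auroc(labels, scores)
print(auc_value)
```

The pairwise form is quadratic in the number of cases and is meant for clarity only; rank-based implementations give the same value more efficiently.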


2018 ◽  
Vol 124 (5) ◽  
pp. 1284-1293 ◽  
Author(s):  
Alexander H. K. Montoye ◽  
Bradford S. Westgate ◽  
Morgan R. Fonley ◽  
Karin A. Pfeiffer

Wrist-worn accelerometers are gaining popularity for measurement of physical activity. However, few methods for predicting physical activity intensity from wrist-worn accelerometer data have been tested on data not used to create the methods (out-of-sample data). This study utilized two previously collected data sets [Ball State University (BSU) and Michigan State University (MSU)] in which participants wore a GENEActiv accelerometer on the left wrist while performing sedentary, lifestyle, ambulatory, and exercise activities in simulated free-living settings. Activity intensity was determined via direct observation. Four machine learning models (plus 2 combination methods) and six feature sets were used to predict activity intensity (30-s intervals) with the accelerometer data. Leave-one-out cross-validation and out-of-sample testing were performed to evaluate accuracy in activity intensity prediction, and classification accuracies were used to determine differences among feature sets and machine learning models. In out-of-sample testing, the random forest model (77.3–78.5%) had higher accuracy than other machine learning models (70.9–76.4%) and accuracy similar to combination methods (77.0–77.9%). Feature sets utilizing frequency-domain features had improved accuracy over other feature sets in leave-one-out cross-validation (92.6–92.8% vs. 87.8–91.9% in MSU data set; 79.3–80.2% vs. 76.7–78.4% in BSU data set) but similar or worse accuracy in out-of-sample testing (74.0–77.4% vs. 74.1–79.1% in MSU data set; 76.1–77.0% vs. 75.5–77.3% in BSU data set). All machine learning models outperformed the euclidean norm minus one/GGIR method in out-of-sample testing (69.5–78.5% vs. 53.6–70.6%). From these results, we recommend out-of-sample testing to confirm generalizability of machine learning models. 
Additionally, random forest models and feature sets with only time-domain features provided the best accuracy for activity intensity prediction from a wrist-worn accelerometer. NEW & NOTEWORTHY This study includes in-sample and out-of-sample cross-validation of an alternate method for deriving meaningful physical activity outcomes from accelerometer data collected with a wrist-worn accelerometer. This method uses machine learning to directly predict activity intensity. By so doing, this study provides a classification model that may avoid high errors present with energy expenditure prediction while still allowing researchers to assess adherence to physical activity guidelines.
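The difference between leave-one-out cross-validation and out-of-sample testing can be sketched in Python. Everything below is a toy assumption: a one-feature nearest-neighbour classifier and two invented data sets stand in for the study's models and the BSU and MSU data.

```python
def nearest_neighbour_predict(train, x):
    """Return the label of the training point whose feature is closest to x."""
    return min(train, key=lambda pair: abs(pair[0] - x))[1]


def leave_one_out_accuracy(data):
    """Hold out each record in turn, train on the rest, and score the held-out record."""
    hits = 0
    for i, (x, y) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        hits += nearest_neighbour_predict(rest, x) == y
    return hits / len(data)


def out_of_sample_accuracy(train, test):
    """Train on one data set and score on a separately collected one."""
    hits = sum(nearest_neighbour_predict(train, x) == y for x, y in test)
    return hits / len(test)


# Hypothetical (feature, intensity label) records from two separate collections.
dataset_a = [(0.1, "sedentary"), (0.2, "sedentary"), (0.8, "moderate"), (0.9, "moderate")]
dataset_b = [(0.3, "sedentary"), (0.4, "moderate"), (0.7, "moderate"), (0.12, "sedentary")]

in_sample = leave_one_out_accuracy(dataset_a)
out_of_sample = out_of_sample_accuracy(dataset_a, dataset_b)
print(in_sample, out_of_sample)
```

In this toy case the model is perfect under leave-one-out but misclassifies one record from the second collection, which mirrors the kind of gap that out-of-sample testing is meant to reveal.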


10.2196/19489 ◽  
2020 ◽  
Vol 8 (11) ◽  
pp. e19489
Author(s):  
Tahmina Nasrin Poly ◽  
Md.Mohaimenul Islam ◽  
Muhammad Solihuddin Muhtar ◽  
Hsuan-Chia Yang ◽  
Phung Anh (Alex) Nguyen ◽  
...  



2021 ◽  
Author(s):  
Michael Tarasiou

This paper presents DeepSatData, a pipeline for automatically generating satellite imagery datasets for training machine learning models. We also discuss design considerations with emphasis on dense classification tasks, e.g. semantic segmentation. The implementation presented makes use of freely available Sentinel-2 data, which allows the generation of the large-scale datasets required for training deep neural networks (DNN). We discuss issues faced from the point of view of DNN training and evaluation, such as checking the quality of ground truth data, and comment on the scalability of the approach.


2020 ◽  
pp. 98-105
Author(s):  
Darshan Jagannath Pangarkar ◽  
Rajesh Sharma ◽  
Amita Sharma ◽  
Madhu Sharma

Prediction of crop yield can help traders, agri-businesses and government agencies plan their activities accordingly. It can help government agencies manage situations such as over- or under-production. Traditionally, statistical and crop simulation methods are used for this purpose. Machine learning models can be of great help. The aim of the present study is to assess the predictive ability of various machine learning models for cluster bean (Cyamopsis tetragonoloba L. Taub.) yield prediction. Various machine learning models were applied and tested on panel data of 19 years, i.e. from 1999-2000 to 2017-18, for the Bikaner district of Rajasthan. Various data mining steps were performed before building a model. K-Nearest Neighbors (K-NN), Support Vector Regression (SVR) with various kernels, and random forest regression were applied. Cross-validation was also performed to assess out-of-sample validity. The best-fitted model was chosen based on cross-validation scores and R² values. Besides the coefficient of determination (R²), root mean squared error (RMSE), mean absolute error (MAE), and root relative squared error (RRSE) were calculated for the testing set. Support vector regression with a linear kernel had the lowest RMSE (23.19), RRSE (0.14) and MAE (19.27) values, followed by random forest regression and second-degree polynomial support vector regression with gamma = auto. The ranking by R² differed slightly, placing support vector regression with a linear kernel first (98.31%), followed by second-degree polynomial support vector regression with gamma = auto (89.83%) and second-degree polynomial support vector regression with gamma = scale (88.83%). On two-fold cross-validation, support vector regression with a linear kernel had the highest cross-validation score, explaining 71% (+/-0.03), followed by second-degree polynomial support vector regression with gamma = auto and random forest regression. K-NN and support vector regression with a radial basis function kernel had negative cross-validation scores. Support vector regression with a linear kernel was found to be the best-fitted model for predicting yield, as it had higher sample validity (98.31%) and global validity (71%).
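Root relative squared error is less widely known than RMSE or MAE, so a short Python sketch may help. The abstract does not state the formulas used. The sketch assumes the usual definitions, in which RRSE compares the model's squared error with that of always predicting the observed mean, and R² is taken as 1 minus the squared RRSE. The yield values are invented.

```python
import math


def rrse(observed, predicted):
    """Root relative squared error: model error relative to a predict-the-mean baseline."""
    mean_observed = sum(observed) / len(observed)
    model_error = sum((p - o) ** 2 for o, p in zip(observed, predicted))
    baseline_error = sum((o - mean_observed) ** 2 for o in observed)
    return math.sqrt(model_error / baseline_error)


def r_squared(observed, predicted):
    """Coefficient of determination under the assumed definition 1 - RRSE^2."""
    return 1.0 - rrse(observed, predicted) ** 2


# Hypothetical yields (arbitrary units).
observed = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 390.0]

rrse_value = rrse(observed, predicted)
r2_value = r_squared(observed, predicted)
print(rrse_value, r2_value)
```

Under that assumption, an RRSE of 0.14 corresponds to an R² of about 0.98, which is in line with the 98.31% reported for the linear-kernel model; the small difference may come from rounding or from a different R² definition.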


Author(s):  
Jinhui Jeanne Huang ◽  
Hongwei Guo ◽  
Bowen Chen ◽  
Xiaolong Guo ◽  
Vijay P. Singh

Water quality retrieval for small urban waterbodies by remote sensing used to be difficult due to the coarse spatial resolution of remote sensing imagery. The recently launched Sentinel-2 produces imagery with a spatial resolution of 10 m, which provides an opportunity to solve the problem of retrieving water quality for small waterbodies. Additionally, many water management issues require fine-resolution imagery, e.g. illegal discharge to an urban waterbody. Since illegal discharges are an important issue for urban water management, chemical oxygen demand (COD), total phosphorus (TP), and total nitrogen (TN) were chosen as the target parameters for water quality retrieval in this study. COD, TP and TN, however, are non-optically active parameters. Few past studies have retrieved these parameters compared with optically active parameters such as chlorophyll-a. This study compared three machine learning models, namely Random Forest (RF), Support Vector Regression (SVR), and Neural Networks (NN), to investigate the opportunity to retrieve the above non-optically active parameters. Results showed that the R² values of TP, TN, and COD by NN, RF and SVR were 0.94, 0.88, and 0.86, respectively. The performances of water quality retrieval for these non-optically active parameters were significantly improved by the optimized machine learning models. These models hence addressed the problem of using remote sensing data to retrieve these non-optically active water quality parameters and provided a new monitoring strategy for small waterbodies. Water quality mapping obtained from Sentinel-2 imagery provided full spatial coverage of the water quality characterization for the entire water surface. Compared with collecting and testing water samples, it greatly reduced labor, reagent, and waste treatment costs. It may also help identify illegal discharges to urban waterbodies. The method developed in this research provides a new practical and efficient water quality monitoring strategy for managing water with consideration of environmental sustainability.


2020 ◽  
Vol 2020 ◽  
pp. 1-9
Author(s):  
Yinghua Zhao ◽  
Lianying Yang ◽  
Changqing Sun ◽  
Yang Li ◽  
Yangzhige He ◽  
...  

Acute appendicitis is one of the most common acute abdomens, but confident preoperative diagnosis is still a challenge. To profile noninvasive urinary biomarkers that could discriminate acute appendicitis from other acute abdomens, we carried out mass spectrometric experiments on urine samples from patients with different acute abdomens and evaluated the diagnostic potential of urinary proteins with various machine-learning models. First, outlier protein pools of acute appendicitis and controls were constructed using the discovery dataset (32 acute appendicitis and 41 control acute abdomens) against a reference set of 495 normal urine samples. Ten outlier proteins were then selected by a feature selection algorithm and were used to construct machine-learning models with naïve Bayes, support vector machine, and random forest algorithms. The models were assessed in the discovery dataset by leave-one-out cross-validation and were verified in the validation dataset (16 acute appendicitis and 45 control acute abdomens). Among the three models, the random forest model achieved the best performance: the accuracy was 84.9% in the leave-one-out cross-validation of the discovery dataset and 83.6% (sensitivity: 81.2%, specificity: 84.4%) in the validation dataset. In conclusion, we developed a 10-protein diagnostic panel based on the random forest model that was able to distinguish acute appendicitis from confusable acute abdomens with high specificity, which indicates the clinical application potential of noninvasive urinary markers in disease diagnosis.
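The validation figures in this abstract can be cross-checked with a short Python sketch. The abstract reports only group sizes (16 and 45) and percentages. The counts of correctly classified cases used below (13 of 16 and 38 of 45) are back-calculated as the whole numbers consistent with those percentages and are an assumption, not figures stated by the authors.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity and accuracy from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }


# Assumed counts: 13 of 16 appendicitis cases and 38 of 45 controls classified correctly.
metrics = diagnostic_metrics(tp=13, fn=3, tn=38, fp=7)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.2f}%")
```

With these counts the sketch gives 81.25% sensitivity, 84.44% specificity and 83.61% accuracy, which agrees with the reported 81.2%, 84.4% and 83.6% up to rounding.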

