A Cross-Validation Strategy for the Identification of Electromechanical Oscillations in Real Ambient Data

Author(s):  
Juliana Luiza Pereira ◽  
Rodolfo Bialecki Leandro ◽  
Ildemar Cassana Decker ◽  
Guido Rossetto Moraes

Author(s):  
Haitham Issa ◽  
Sali Issa ◽  
Wahab Shah

This paper presents a new gender and age classification system based on electroencephalography (EEG) brain signals. First, the Continuous Wavelet Transform (CWT) is used to obtain time-frequency information from a single EEG electrode for eight distinct emotional states, instead of the usual neutral or relaxed states. Then, sequential steps are applied to extract an improved grayscale image feature. For system evaluation, a three-fold cross-validation strategy is applied to construct four different classifiers. The experimental tests show that the proposed feature, combined with a Convolutional Neural Network (CNN) classifier, improves the performance of both gender and age classification, achieving average accuracies of 96.3% and 89% for gender and age classification, respectively. Moreover, the results demonstrate in practice that human gender and age can be predicted across different emotional states.


2019 ◽  
Author(s):  
Daniel Runcie ◽  
Hao Cheng

ABSTRACT Incorporating measurements on correlated traits into genomic prediction models can increase prediction accuracy and selection gain. However, multi-trait genomic prediction models are complex and prone to overfitting, which may result in a loss of prediction accuracy relative to single-trait genomic prediction. Cross-validation is considered the gold standard method for selecting and tuning models for genomic prediction in both plant and animal breeding. When used appropriately, cross-validation gives an accurate estimate of the prediction accuracy of a genomic prediction model, and can effectively choose among disparate models based on their expected performance in real data. However, we show that a naive cross-validation strategy applied to the multi-trait prediction problem can be severely biased and lead to sub-optimal choices between single and multi-trait models when secondary traits are used to aid in the prediction of focal traits and these secondary traits are measured on the individuals to be tested. We use simulations to demonstrate the extent of the problem and propose three partial solutions: 1) a parametric solution from selection index theory, 2) a semi-parametric method for correcting the cross-validation estimates of prediction accuracy, and 3) a fully non-parametric method which we call CV2*: validating model predictions against focal trait measurements from genetically related individuals. The current excitement over high-throughput phenotyping suggests that more comprehensive phenotype measurements will be useful for accelerating breeding programs. Using an appropriate cross-validation strategy should more reliably determine if and when combining information across multiple traits is useful.


Plant Disease ◽  
2012 ◽  
Vol 96 (6) ◽  
pp. 889-896 ◽  
Author(s):  
S. Landschoot ◽  
W. Waegeman ◽  
K. Audenaert ◽  
J. Vandepitte ◽  
G. Haesaert ◽  
...  

Despite great efforts to forecast plant diseases, many of the existing systems often fall short in providing farmers with accurate predictions. One of the main problems arises from the existence of year and location effects, so that more advanced procedures are required for evaluating existing systems in an unbiased manner. This paper illustrates the case of Fusarium head blight of winter wheat in Belgium. We present a new cross-validation strategy that enables the evaluation of the predictive performance of a forecasting system for years and locations that are different from the years and locations on which the forecast was developed. Four different cross-validation strategies and five regression techniques are used. The results demonstrated that traditional evaluation strategies are too optimistic in their predictions, whereas the cross-year cross-location validation strategy yielded more realistic outcomes. Using this procedure, the mean squared error increased and the coefficient of determination decreased in predicting disease severity and deoxynivalenol content, suggesting that existing evaluation strategies may generate a substantial optimistic bias. The strongest discrepancies between the cross-validation strategies were observed for multiple linear regression models.
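As an illustration of the idea, a cross-year cross-location split can be sketched in a few lines of Python. This is only one plausible reading of the strategy, not the authors' procedure: the function name, the toy years and the location labels are hypothetical, and the abstract does not say how rows sharing only the year or only the location with the test fold are handled (here they are simply left out of training).

```python
from typing import Iterator, List, Sequence, Tuple


def cross_year_cross_location_splits(
    years: Sequence[int], locations: Sequence[str]
) -> Iterator[Tuple[Tuple[int, str], List[int], List[int]]]:
    """Yield ((year, location), train_idx, test_idx) for each observed pair.

    Test rows come from a single (year, location) pair. Training rows
    share neither that year nor that location (assumption of this sketch).
    """
    rows = list(zip(years, locations))
    for year, loc in sorted(set(rows)):
        test = [i for i, (y, l) in enumerate(rows) if y == year and l == loc]
        train = [i for i, (y, l) in enumerate(rows) if y != year and l != loc]
        if train:  # skip pairs that would leave nothing to train on
            yield (year, loc), train, test


# Hypothetical toy data: one observation per row.
years = [2008, 2008, 2009, 2009, 2010, 2010]
locations = ["A", "B", "A", "B", "A", "C"]
splits = list(cross_year_cross_location_splits(years, locations))

for (year, loc), train, test in splits:
    print(year, loc, "train:", train, "test:", test)
```

Because every training row differs from the test fold in both year and location, the error estimate reflects performance on conditions the model has not seen, which is the situation the abstract argues traditional strategies overlook.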


2020 ◽  
Vol 98 (Supplement_4) ◽  
pp. 383-383
Author(s):  
Leonardo Augusto Coelho Ribeiro ◽  
Tiago Bresolin ◽  
Guilherme J M Rosa ◽  
Daniel Rume Casagrande ◽  
Marina De Arruda Camargo Danes ◽  
...  

Abstract Wearable sensors have been adopted as an alternative for real-time monitoring of cattle feeding behavior in grazing systems. However, even when machine learning (ML) techniques are used, confounding effects such as the choice of cross-validation strategy may inflate the apparent prediction quality. Our objective was to evaluate the effect of different cross-validation strategies on the prediction of grazing activities in cattle using wearable sensor data and ML algorithms. Six Nellore bulls (345 ± 21 kg) had their behavior visually classified as grazing or not-grazing for a period of 15 days. Generalized Linear Model (GLM), Random Forest (RF), and Artificial Neural Network (ANN) were employed to predict behavior (grazing or not-grazing) using 3-axis accelerometer data. For each analytical method, three cross-validation strategies were evaluated: holdout, leave-one-animal-out (LOAO), and leave-one-day-out (LODO). Algorithms were trained using similar dataset sizes (holdout: n = 57,862; LOAO: n = 56,786; LODO: n = 56,672). Regardless of the cross-validation strategy, GLM achieved the worst prediction accuracy (53%) compared to the ML techniques (65% for both RF and ANN). ANN performed slightly better than RF for LOAO (73%) and LODO (64%) cross-validation strategies. The holdout yielded the highest accuracy values for all three ML approaches (GLM: 59%, RF: 76%, and ANN: 74%), followed by LODO (58%) and LOAO (55%). In conclusion, the GLM approach was not adequate to predict grazing behavior, regardless of the cross-validation strategy. The greater prediction accuracy observed for holdout cross-validation may simply indicate a lack of data independence and the presence of carry-over effects from animals and grazing management. Our results suggest that generalizing predictive models to unknown animals or grazing management (those not used for training) may result in poor prediction quality. The results highlight the need to use biological knowledge to define the validation strategy that is closest to the real-life situation.
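The difference between LOAO and LODO comes down to which label defines the held-out group. The sketch below shows this with a small pure-Python helper; the helper name, the animal identifiers and the day numbers are hypothetical and are not taken from the study.

```python
from typing import Dict, Hashable, List, Sequence, Tuple


def leave_one_group_out(
    groups: Sequence[Hashable],
) -> Dict[Hashable, Tuple[List[int], List[int]]]:
    """Map each group label to (train_idx, test_idx) with that group held out."""
    splits = {}
    for held_out in sorted(set(groups)):
        test = [i for i, g in enumerate(groups) if g == held_out]
        train = [i for i, g in enumerate(groups) if g != held_out]
        splits[held_out] = (train, test)
    return splits


# Hypothetical toy records: one accelerometer window per row.
animal = ["bull1", "bull1", "bull2", "bull2", "bull3", "bull3"]
day = [1, 2, 1, 2, 1, 2]

loao = leave_one_group_out(animal)  # leave-one-animal-out
lodo = leave_one_group_out(day)     # leave-one-day-out

print(loao["bull2"])
print(lodo[1])
```

A random holdout, by contrast, can place windows from the same animal and day on both sides of the split, which is one way the lack of independence described in the abstract could arise.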


2017 ◽  
Vol 4 (1) ◽  
pp. 100047 ◽  
Author(s):  
Emre Guney

Following the recent availability of high-throughput data for drug discovery, computational methods, especially machine learning based approaches, have gained remarkable attention. A number of studies use chemical, target and side effect similarity between drugs to build knowledge-based models that predict drug indications and drug-drug interactions. In light of previous work demonstrating the perils of cross-validation using paired data, in this study we employ a disjoint cross-validation approach for similarity-based drug-drug interaction (DDI) prediction and investigate the prediction accuracy of the classifier under various settings. Our results show that the measured prediction accuracy of drug similarity-based classifiers operating on paired data, such as pharmacokinetic interactions between drugs, depends on the cross-validation strategy used to evaluate it.
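One way to picture a disjoint split on paired data is to assign drugs, rather than pairs, to folds, so that no drug in a test pair also appears in a training pair. The sketch below follows that assumption; the abstract does not give the exact procedure, and the function name, the drug labels and the choice to drop pairs that straddle the held-out fold are all hypothetical.

```python
import random
from itertools import combinations
from typing import List, Sequence, Tuple


def disjoint_pair_folds(
    pairs: Sequence[Tuple[str, str]], n_folds: int, seed: int = 0
) -> List[Tuple[List[int], List[int]]]:
    """Return (train_idx, test_idx) per fold so that no drug is shared.

    Drugs are shuffled and dealt into folds. For fold k, test pairs have
    both drugs in fold k and training pairs have neither drug in fold k.
    Pairs with exactly one drug in fold k are discarded (assumption).
    """
    drugs = sorted({d for pair in pairs for d in pair})
    random.Random(seed).shuffle(drugs)
    fold_of = {d: i % n_folds for i, d in enumerate(drugs)}
    folds = []
    for k in range(n_folds):
        test = [i for i, (a, b) in enumerate(pairs)
                if fold_of[a] == k and fold_of[b] == k]
        train = [i for i, (a, b) in enumerate(pairs)
                 if fold_of[a] != k and fold_of[b] != k]
        folds.append((train, test))
    return folds


# Hypothetical toy data: every pair among six drugs.
pairs = list(combinations(["d1", "d2", "d3", "d4", "d5", "d6"], 2))
folds = disjoint_pair_folds(pairs, n_folds=2)

for train, test in folds:
    print(len(train), "training pairs,", len(test), "test pairs")
```

With ordinary pair-level folds, a drug seen in training can reappear in a test pair, so a similarity-based classifier may be scored partly on drugs it already knows.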


2018 ◽  
Author(s):  
Renato Moraes Silva ◽  
Ana Carolina Lorena ◽  
Tiago A. Almeida

In this paper, we present a new public dataset of real, labeled images of meteors and non-meteors that we recently used in a machine learning competition. We also present a comprehensive performance evaluation of several established machine learning methods and compare the results with a stacking approach, one of the winning solutions of the competition. We compared the performance obtained by the methods under traditional repeated five-fold cross-validation with the performance obtained using the training and test partitions from the competition. A careful analysis of the results indicates that, in general, the stacking-based approach performed best compared to the baselines. Moreover, we found evidence that the validation strategy used by the platform that hosted the competition can lead to results that do not hold up in a cross-validation setup, which is the recommended approach in real-world scenarios.
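For readers unfamiliar with the term, repeated five-fold cross-validation reshuffles the data and repeats the five-fold split several times, then averages the scores. The generator below is a minimal sketch of the index bookkeeping only; the function name, the sample count, the number of repetitions and the seed are hypothetical and do not come from the paper.

```python
import random
from typing import Iterator, List, Tuple


def repeated_kfold(
    n_samples: int, n_folds: int = 5, n_repeats: int = 3, seed: int = 0
) -> Iterator[Tuple[int, int, List[int], List[int]]]:
    """Yield (repeat, fold, train_idx, test_idx) for repeated k-fold CV."""
    rng = random.Random(seed)
    for rep in range(n_repeats):
        order = list(range(n_samples))
        rng.shuffle(order)  # a fresh shuffle for every repetition
        for k in range(n_folds):
            test = sorted(order[k::n_folds])  # every n_folds-th shuffled index
            held_out = set(test)
            train = [i for i in range(n_samples) if i not in held_out]
            yield rep, k, train, test


splits = list(repeated_kfold(n_samples=20))
print(len(splits), "train/test splits in total")
```

A single fixed train/test partition, such as the one used by a competition platform, corresponds to just one of these splits, which is why its result can differ from the averaged estimate.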


2014 ◽  
Vol 6 (1) ◽  
Author(s):  
Howard Burkom ◽  
Yevgeniy Elbert ◽  
Liane Ramac-Thomas ◽  
Christopher Cuellar

To manage an increasingly complex data environment, a fusion module based on Bayesian networks (BN) was developed for the Department of Defense (DoD) Electronic Surveillance System for the Early Notification of Community-Based Epidemics (ESSENCE). Subsequent efforts have produced a full fusion-enabled version of ESSENCE for beta testing and further upgrades. The current presentation describes advances to formalize the network training, calibrate the component alerting algorithms and decision nodes together, and implement a validation strategy. A cross-validation strategy produced consistent threshold combinations yielding 88% sensitivity from reported events, a 10-15% improvement over the original demonstration module.

