The data complexity index to construct an efficient cross-validation method

2010 ◽  
Vol 50 (1) ◽  
pp. 93-102 ◽  
Author(s):  
Der-Chiang Li ◽  
Yao-Hwei Fang ◽  
Y.M. Frank Fang

2008 ◽  
Vol 52 (No. 9) ◽  
pp. 284-291 ◽  
Author(s):  
K. Šustová ◽  
J. Růžičková ◽  
J. Kuchtík

Our study examines the possibility of determining true protein and casein in cow’s, ewe’s and goat’s milk and in ewe’s colostrum by FT NIR spectroscopy. Milk samples were analysed by FT NIR in the reflectance mode with a transflectance cuvette. The correlation coefficients of calibration were as follows: cow’s protein 0.943; cow’s casein 0.964; ewe’s protein 0.997; ewe’s casein 0.977; goat’s protein 0.989; goat’s casein 0.890; ewe’s colostrum protein 0.983. The calibration was tested on the same set of samples by the cross-validation method. The correlation coefficients of validation were as follows: cow’s protein 0.923; cow’s casein 0.910; ewe’s protein 0.994; ewe’s casein 0.963; goat’s protein 0.972; goat’s casein 0.814; ewe’s colostrum protein 0.871. The NIRS results were compared with reference data and no significant differences between them were found (P = 0.05). The results of this study indicate that FT NIR spectroscopy can be used for rapid analysis of protein and casein in cow’s, ewe’s and goat’s milk and in ewe’s colostrum.
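The distinction between a calibration correlation and a cross-validation correlation can be illustrated with a small sketch. It is not the authors' procedure: the abstract does not name the regression model, so a one-variable least-squares line and leave-one-out cross-validation are assumed here, and the `signal` and `reference` values are synthetic stand-ins for a spectral feature and a reference protein measurement.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length lists."""
    ma, mb = statistics.fmean(a), statistics.fmean(b)
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    den = (sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b)) ** 0.5
    return num / den


# Synthetic data (assumption): one spectral feature and a noisy reference value.
rng = random.Random(0)
signal = [rng.uniform(0.2, 1.0) for _ in range(30)]
reference = [3.0 + 2.5 * s + rng.gauss(0, 0.1) for s in signal]

# Calibration: fit on all samples and correlate fitted values with the reference.
slope, intercept = fit_line(signal, reference)
fitted = [slope * s + intercept for s in signal]
r_cal = pearson_r(reference, fitted)

# Leave-one-out cross-validation: each sample is predicted by a model
# fitted on the remaining samples only.
loo_pred = []
for i in range(len(signal)):
    xs = signal[:i] + signal[i + 1:]
    ys = reference[:i] + reference[i + 1:]
    s, b = fit_line(xs, ys)
    loo_pred.append(s * signal[i] + b)
r_val = pearson_r(reference, loo_pred)

print(f"calibration r = {r_cal:.3f}, cross-validation r = {r_val:.3f}")
```

Because every cross-validated prediction comes from a model that never saw the sample being predicted, the validation coefficient is usually the more conservative of the two, which matches the pattern of values reported in the abstract.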


2004 ◽  
Vol 14 (03) ◽  
pp. 1037-1051 ◽  
Author(s):  
S. A. BILLINGS ◽  
K. L. LEE

A new NARMA-based smoothing algorithm is introduced for chaotic and nonchaotic time series. The new algorithm employs a cross-validation method to determine the smoother structure, requires very little user interaction, and can be combined with wavelet thresholding to further enhance the noise reduction. Numerical examples are included to illustrate the application of the new algorithm.


2020 ◽  
Vol 43 (1) ◽  
pp. 103-125
Author(s):  
Yi Zhong ◽  
Jianghua He ◽  
Prabhakar Chalise

With the advent of high-throughput technologies, high-dimensional datasets are increasingly available. This has not only opened up new insight into biological systems but also posed analytical challenges. One important problem is the selection of an informative feature subset and the prediction of future outcomes. It is crucial that models are not overfitted and give accurate results with new data. In addition, reliable identification of informative features with high predictive power (feature selection) is of interest in clinical settings. We propose a two-step framework for feature selection and classification model construction, which utilizes a nested and repeated cross-validation method. We evaluated our approach using both simulated data and two publicly available gene expression datasets. The proposed method showed comparatively better predictive accuracy for new cases than the standard cross-validation method.
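The general idea of nested and repeated cross-validation can be sketched as follows. This is a minimal illustration under stated assumptions, not the framework proposed in the paper: the feature ranking (absolute difference of class means), the nearest-centroid classifier, the grid of feature counts, the fold counts, and the synthetic data are all hypothetical choices made for the example. The inner loop picks the number of features using training data only, and the outer loop scores that choice on samples that played no part in the selection.

```python
import random
import statistics


def subset(seq, idx):
    return [seq[i] for i in idx]


def make_folds(y, k, rng):
    """Stratified folds: shuffle each class and deal its indices round-robin."""
    folds, pos = [[] for _ in range(k)], 0
    for c in sorted(set(y)):
        idx = [i for i, lab in enumerate(y) if lab == c]
        rng.shuffle(idx)
        for i in idx:
            folds[pos % k].append(i)
            pos += 1
    return folds


def rank_features(X, y):
    """Rank features by absolute difference of class means, largest first."""
    def gap(j):
        m0 = statistics.fmean(x[j] for x, c in zip(X, y) if c == 0)
        m1 = statistics.fmean(x[j] for x, c in zip(X, y) if c == 1)
        return abs(m1 - m0)
    return sorted(range(len(X[0])), key=gap, reverse=True)


def fit_predict(X_tr, y_tr, X_te, n_keep):
    """Select n_keep features on training data, then use nearest centroid."""
    feats = rank_features(X_tr, y_tr)[:n_keep]
    cent = {}
    for c in (0, 1):
        rows = [x for x, lab in zip(X_tr, y_tr) if lab == c]
        cent[c] = [statistics.fmean(r[j] for r in rows) for j in feats]
    preds = []
    for x in X_te:
        dist = {c: sum((x[j] - m) ** 2 for j, m in zip(feats, cent[c])) for c in (0, 1)}
        preds.append(min(dist, key=dist.get))
    return preds


def accuracy(y_true, y_pred):
    return sum(a == b for a, b in zip(y_true, y_pred)) / len(y_true)


def cv_score(X, y, folds, n_keep):
    """Mean held-out accuracy over the given folds for one value of n_keep."""
    accs = []
    for held in folds:
        held_set = set(held)
        rest = [i for i in range(len(y)) if i not in held_set]
        preds = fit_predict(subset(X, rest), subset(y, rest), subset(X, held), n_keep)
        accs.append(accuracy(subset(y, held), preds))
    return statistics.fmean(accs)


def nested_cv(X, y, grid, outer_k, inner_k, rng):
    scores = []
    for test in make_folds(y, outer_k, rng):
        test_set = set(test)
        train = [i for i in range(len(y)) if i not in test_set]
        X_tr, y_tr = subset(X, train), subset(y, train)
        # Inner loop: tune n_keep without touching the outer test fold.
        inner = make_folds(y_tr, inner_k, rng)
        best = max(grid, key=lambda k: cv_score(X_tr, y_tr, inner, k))
        # Outer loop: refit on all training data and score on the test fold.
        preds = fit_predict(X_tr, y_tr, subset(X, test), best)
        scores.append(accuracy(subset(y, test), preds))
    return statistics.fmean(scores)


def repeated_nested_cv(X, y, grid, repeats=3, outer_k=5, inner_k=4, seed=0):
    """Repeat nested CV with fresh random splits; one mean accuracy per repeat."""
    rng = random.Random(seed)
    return [nested_cv(X, y, grid, outer_k, inner_k, rng) for _ in range(repeats)]


def make_data(n_per_class=20, n_feat=20, n_informative=2, shift=2.0, seed=1):
    """Synthetic two-class data (assumption): only the first features carry signal."""
    rng = random.Random(seed)
    X, y = [], []
    for c in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(0, 1) for _ in range(n_feat)]
            for j in range(n_informative):
                row[j] += shift * c
            X.append(row)
            y.append(c)
    return X, y


X, y = make_data()
scores = repeated_nested_cv(X, y, grid=[1, 2, 5, 10])
print("accuracy per repeat:", [round(s, 3) for s in scores])
```

Feature selection is redone inside every training split, so the held-out samples never influence which features are kept. Repeating the whole procedure with different random splits gives a sense of how much the accuracy estimate depends on the partition.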

