Classifying Oryza sativa accessions into Indica and Japonica using logistic regression model with phenotypic data

PeerJ ◽  
2019 ◽  
Vol 7 ◽  
pp. e7259
Author(s):  
Bongsong Kim

In Oryza sativa, indica and japonica are pivotal subpopulations, and other subpopulations such as aus and aromatic are considered to be derived from indica or japonica. For this reason, Oryza sativa accessions are frequently viewed from the indica/japonica perspective. This study introduces a computational method for indica/japonica classification that applies phenotypic variables to the logistic regression model (LRM). The population used in this study included 413 Oryza sativa accessions, of which 280 were indica or japonica. Out of 24 phenotypic variables, a set of seven was identified that together allowed the LRM to separate indica from japonica with full accuracy. The resulting parameters were used to define the customized LRM. Given the 280 indica/japonica accessions, the classification accuracy of the customized LRM with the set of seven phenotypic variables was estimated by 100 iterations of ten-fold cross-validation. The resulting classification accuracy was 100%. This suggests that the LRM can be an effective tool for indica/japonica classification from phenotypic variables in Oryza sativa.
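The validation scheme described in this abstract (repeated ten-fold cross-validation of a logistic regression classifier) can be sketched as follows. This is a minimal illustration under stated assumptions, not the author's implementation: the data are synthetic, and the two toy features, the gradient-descent fitting routine, and every function name are hypothetical. The demo uses 3 repeats to stay fast; the abstract reports 100.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.5, epochs=100):
    """Fit weights (bias first) by batch gradient descent on the log-loss."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            p = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))
            err = p - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w


def predict(w, x):
    # Class 1 when the predicted probability is at least 0.5.
    z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))
    return 1 if sigmoid(z) >= 0.5 else 0


def repeated_kfold_accuracy(X, y, k=10, repeats=100, seed=0):
    """Mean accuracy over `repeats` independent k-fold cross-validations."""
    rng = random.Random(seed)
    idx = list(range(len(X)))
    scores = []
    for _ in range(repeats):
        rng.shuffle(idx)
        folds = [idx[i::k] for i in range(k)]  # k disjoint held-out sets
        correct = 0
        for fold in folds:
            held_out = set(fold)
            train = [i for i in idx if i not in held_out]
            w = fit_logistic([X[i] for i in train], [y[i] for i in train])
            correct += sum(predict(w, X[i]) == y[i] for i in fold)
        scores.append(correct / len(X))
    return sum(scores) / len(scores)


def make_toy_data(n=60, seed=1):
    # Synthetic stand-in for phenotypic data: two well-separated classes.
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        center = 2.0 if label else -2.0
        X.append([rng.gauss(center, 0.5), rng.gauss(center, 0.5)])
        y.append(label)
    return X, y


X, y = make_toy_data()
print(repeated_kfold_accuracy(X, y, k=10, repeats=3))
```

Each repeat reshuffles the accessions before splitting, so the reported figure is an average over different fold assignments rather than the result of a single split.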

1998 ◽  
Vol 28 (9) ◽  
pp. 1398-1404 ◽  
Author(s):  
L A Venier ◽  
A A Hopkin ◽  
D W McKenney ◽  
Y. Wang

We used historical distribution data of Scleroderris disease (caused by the fungus Gremmeniella abietina var. abietina (Lagerb.) Morelet) in Ontario to model its probability of occurrence as a function of climate factors. A logistic regression model of the probability of occurrence as a function of the mean temperature of the coldest quarter and the precipitation of the coldest quarter was a very good fit. The concordance (index of classification accuracy) of the model was 84%. We subsampled the data repeatedly, generated new parameter estimates, and tested the predictions against data not included in the model. Classification accuracy was similar for each subsample model; therefore, we concluded that the final model is stable. Gridded estimates of the climate variables were used to spatially extend the two-variable logistic regression model and produce a probability of occurrence map for Scleroderris disease across Ontario. The predicted map of probability of occurrence fits well with the map of the observed locations of the disease. These results lend credence to previous work that suggests that distribution of Scleroderris disease is strongly influenced by climate. The classification results also suggest that this model is a useful tool for assessing the risk of Scleroderris disease throughout Ontario.
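As a rough illustration of the concordance measure mentioned in this abstract, the sketch below counts, over all pairs of one occurrence site and one non-occurrence site, how often the occurrence site received the higher predicted probability. The pairwise definition, the handling of ties, and the toy numbers are assumptions for the example; the abstract does not state how the authors computed their 84% figure.

```python
def concordance(probs, outcomes):
    """Return (fraction concordant, c-statistic with ties counted as half)."""
    pos = [p for p, o in zip(probs, outcomes) if o == 1]
    neg = [p for p, o in zip(probs, outcomes) if o == 0]
    if not pos or not neg:
        raise ValueError("need at least one occurrence and one non-occurrence")
    pairs = len(pos) * len(neg)
    # A pair is concordant when the occurrence site has the higher probability.
    concordant = sum(1 for a in pos for b in neg if a > b)
    tied = sum(1 for a in pos for b in neg if a == b)
    return concordant / pairs, (concordant + 0.5 * tied) / pairs


# Hypothetical predicted probabilities and observed presence (1) / absence (0).
probs = [0.9, 0.8, 0.4, 0.4, 0.3, 0.1]
observed = [1, 1, 1, 0, 0, 0]
frac_concordant, c_stat = concordance(probs, observed)
print(frac_concordant, c_stat)
```

With these toy values, 8 of the 9 pairs are concordant and 1 is tied.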


2018 ◽  
Author(s):  
Bongsong Kim

This article introduces how to implement the logistic regression model (LRM) with phenotypic variables for classifying Asian rice (Oryza sativa L.) cultivars into two pivotal subpopulations, indica and japonica. This study took advantage of publicly available data attached to a previous paper. The classification accuracy was assessed using the area under the curve (AUC) of a receiver operating characteristic (ROC) curve. Given 24 phenotypic variables for 280 indica/japonica accessions, LRMs were fitted with all possible combinations of up to six phenotypic variables; the highest AUC was 0.9977, obtained with six variables: panicle number per plant, seed number per panicle, florets per panicle, panicle fertility, straighthead susceptibility and blast resistance. Overall, the more variables were included, the higher the resulting AUC. The ultimate purpose of this study is to demonstrate the indica/japonica prediction ability of the LRM when applied to unclassified Asian rice cultivars. To estimate the indica/japonica prediction accuracy, ten-fold cross-validation was conducted 100 times with the 280 indica/japonica accessions using the LRM with the parameters that yielded the highest AUC. The resulting prediction accuracy was 0.9779. This suggests that the LRM promises to be a highly effective indica/japonica prediction tool using phenotypic variables in Asian cultivated rice.
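To give a sense of the size of the search described in this abstract, the sketch below counts the candidate models, assuming that "all possible combinations of up to six" means every subset of one to six of the 24 variables. That reading, and the placeholder trait names, are assumptions for the example and are not taken from the paper.

```python
from itertools import combinations, islice
from math import comb

N_VARIABLES = 24     # phenotypic variables reported in the abstract
MAX_SUBSET_SIZE = 6  # largest subset size reported in the abstract

# Number of candidate variable subsets of each size, and in total.
counts = {k: comb(N_VARIABLES, k) for k in range(1, MAX_SUBSET_SIZE + 1)}
total_models = sum(counts.values())


def candidate_subsets(names, max_size):
    """Lazily yield every subset of 1..max_size variable names."""
    for k in range(1, max_size + 1):
        yield from combinations(names, k)


# Hypothetical placeholder names; the real trait names are in the source data.
names = [f"trait_{i:02d}" for i in range(1, N_VARIABLES + 1)]
first_three = list(islice(candidate_subsets(names, MAX_SUBSET_SIZE), 3))

print(counts)
print(total_models)  # 190050 under the stated assumption
print(first_three)
```

Under this reading, each of the 190,050 subsets would correspond to one fitted LRM and one AUC, which is why a lazy generator is used instead of materializing the full list.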


2020 ◽  
Vol 30 (Supplement_5) ◽  
Author(s):  
J Matos ◽  
C Matias Dias ◽  
A Félix

Background: Studies on the impact of multimorbidity on absence from work indicate that the number and type of chronic diseases may increase absenteeism and that the risk of absence from work is higher in people with two or more chronic diseases. This study analyzed the association between multimorbidity and greater frequency and duration of work absence in the Portuguese population between the ages of 25 and 65 during 2015. Methods: This is an epidemiological, observational, cross-sectional study with an analytical component, using data from the 1st National Health Examination Survey. Univariate, bivariate and multivariate analyses of the study variables were performed, and a multivariate logistic regression model was constructed. Results: The prevalence of absenteeism was 55.1%. Education was associated with absence from work (p = 0.0157), as was professional activity (p = 0.0086). No association with absence from work could be verified for the presence of chronic diseases (p = 0.9358) or the presence of multimorbidity (p = 0.4309). The prevalence of multimorbidity was 31.8%. Age (p < 0.0001), education (p < 0.001) and yield (p = 0.0009) were associated with multimorbidity. The number of days of absence from work did not increase with the number of chronic diseases. In the optimized logistic regression model, the only variables associated with absence from work were age (p = 0.0391) and education (p = 0.0089). Conclusions: The scientific evidence generated will contribute to the current discussion on the need for the health and social security system to develop policies for patients with multimorbidity. Key messages: The prevalence of absenteeism and multimorbidity in Portugal was 55.1% and 31.8%, respectively. In the optimized model, age and education were associated with absence from work.


2021 ◽  
Vol 11 (14) ◽  
pp. 6594
Author(s):  
Yu-Chia Hsu

The interdisciplinary nature of sports and the presence of various systemic and non-systemic factors introduce challenges in predicting sports match outcomes using a single disciplinary approach. In contrast to previous studies that use sports performance metrics and statistical models, this study is the first to apply a deep learning approach from financial time series modeling to predict sports match outcomes. The proposed approach has two main components: a convolutional neural network (CNN) classifier for implicit pattern recognition and a logistic regression model for match outcome judgment. First, the raw data used in the prediction are derived from the betting market odds and actual scores of each game, which are transformed into sports candlesticks. Second, the CNN is used to classify the candlestick time series on a graphical basis. To this end, the original 1D time series are encoded into 2D matrix images using the Gramian angular field and are then fed into the CNN classifier. In this way, the winning probability of each matchup team can be derived based on historically implied behavioral patterns. Third, to further consider the differences between strong and weak teams, the CNN classifier adjusts the probability of winning the match by using the logistic regression model and then makes a final judgment regarding the match outcome. We empirically test this approach using data from 18,944 National Football League games spanning 32 years and find that using the individual historical data of each team in the CNN classifier for pattern recognition is better than using the data of all teams. The CNN in conjunction with the logistic regression judgment model outperforms the CNN in conjunction with SVM, Naïve Bayes, Adaboost, J48, and random forest, and its accuracy surpasses that of betting market prediction.
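The encoding step mentioned in this abstract, turning a 1D time series into a 2D matrix with a Gramian angular field, can be sketched as follows. This is an illustrative assumption rather than the author's code: it uses the summation variant (entries cos(phi_i + phi_j)) with min-max rescaling to [-1, 1], and the abstract does not say which variant or rescaling was used. The function name and sample series are hypothetical.

```python
import math


def gramian_angular_summation_field(series):
    """Encode a 1D series as a 2D matrix with entries cos(phi_i + phi_j)."""
    lo, hi = min(series), max(series)
    if hi == lo:
        raise ValueError("series must contain at least two distinct values")
    # Rescale to [-1, 1] so that each value has a well-defined arccos.
    scaled = [(2 * x - hi - lo) / (hi - lo) for x in series]
    # Clamp to guard against tiny floating-point overshoot before arccos.
    phi = [math.acos(max(-1.0, min(1.0, s))) for s in scaled]
    return [[math.cos(a + b) for b in phi] for a in phi]


# Hypothetical short series standing in for one candlestick feature.
image = gramian_angular_summation_field([0.0, 1.0, 2.0])
for row in image:
    print([round(v, 3) for v in row])
```

The resulting matrix is symmetric, and a series of length n yields an n by n image, which is the form a 2D CNN classifier expects as input.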

