Classifying Oryza sativa accessions into Indica and Japonica using logistic regression model with phenotypic data

PeerJ ◽

10.7717/peerj.7259 ◽

2019 ◽

Vol 7 ◽

pp. e7259

Author(s):

Bongsong Kim

Keyword(s):

Logistic Regression ◽

Oryza Sativa ◽

Regression Model ◽

Classification Accuracy ◽

Logistic Regression Model ◽

Computational Method ◽

Phenotypic Data ◽

Separation Power

In Oryza sativa, indica and japonica are pivotal subpopulations, and other subpopulations such as aus and aromatic are considered to be derived from indica or japonica. In this regard, Oryza sativa accessions are frequently viewed from the indica/japonica perspective. This study introduces a computational method for indica/japonica classification by applying phenotypic variables to the logistic regression model (LRM). The population used in this study included 413 Oryza sativa accessions, of which 280 accessions were indica or japonica. Out of 24 phenotypic variables, a set of seven was identified that, taken together, allowed the LRM to separate indica from japonica with full accuracy. The resulting parameters were used to define the customized LRM. Given the 280 indica/japonica accessions, the classification accuracy of the customized LRM with the set of seven phenotypic variables was estimated by 100 iterations of ten-fold cross-validation, which yielded a classification accuracy of 100%. This suggests that the LRM can be an effective tool for indica/japonica classification with phenotypic variables in Oryza sativa.
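The evaluation procedure described in this abstract, repeated ten-fold cross-validation of a logistic regression classifier, can be sketched roughly as follows. This is a minimal illustration and not the author's implementation: the gradient-descent fitter, the function names, and the two-cluster toy data standing in for phenotypic measurements are all assumptions made for the example.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.1, epochs=200):
    """Fit weights (intercept first) by batch gradient descent on the log-loss."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = sigmoid(w[0] + sum(a * b for a, b in zip(w[1:], xi))) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad)]
    return w


def predict(w, x):
    # Class 1 when the fitted probability reaches 0.5.
    return int(sigmoid(w[0] + sum(a * b for a, b in zip(w[1:], x))) >= 0.5)


def repeated_kfold_accuracy(X, y, k=10, repeats=100, seed=0):
    """Mean accuracy over `repeats` independent shuffles of k-fold cross-validation."""
    rng = random.Random(seed)
    n = len(X)
    accuracies = []
    for _ in range(repeats):
        idx = list(range(n))
        rng.shuffle(idx)
        correct = 0
        for f in range(k):
            test_idx = idx[f::k]  # every k-th shuffled index forms one fold
            held_out = set(test_idx)
            train_idx = [i for i in idx if i not in held_out]
            w = fit_logistic([X[i] for i in train_idx], [y[i] for i in train_idx])
            correct += sum(predict(w, X[i]) == y[i] for i in test_idx)
        accuracies.append(correct / n)
    return sum(accuracies) / repeats


def make_toy_data(n_per_class=30, seed=1):
    # Hypothetical stand-in for phenotypic data: two Gaussian clusters in 2-D.
    rng = random.Random(seed)
    X, y = [], []
    for label, centre in ((0, -2.0), (1, 2.0)):
        for _ in range(n_per_class):
            X.append([rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)])
            y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    # Three repeats keep the demo quick; the abstract reports 100.
    print(repeated_kfold_accuracy(X, y, k=10, repeats=3))
```

Each repeat reshuffles the accessions before splitting, so the averaged accuracy is less sensitive to any single partition into folds.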

A spatial, climate-determined risk rating for Scleroderris disease of pines in Ontario

Canadian Journal of Forest Research ◽

10.1139/x98-126 ◽

1998 ◽

Vol 28 (9) ◽

pp. 1398-1404 ◽

Cited By ~ 13

Author(s):

L A Venier ◽

A A Hopkin ◽

D W McKenney ◽

Y. Wang

Keyword(s):

Logistic Regression ◽

Regression Model ◽

Classification Accuracy ◽

Logistic Regression Model ◽

Distribution Data ◽

Parameter Estimates ◽

Final Model ◽

Probability Of Occurrence ◽

Model Classification ◽

Historical Distribution

We used historical distribution data of Scleroderris disease (caused by the fungus Gremmeniella abietina var. abietina (Lagerb.) Morelet) in Ontario to model its probability of occurrence as a function of climate factors. A logistic regression model of the probability of occurrence as a function of the mean temperature of the coldest quarter and the precipitation of the coldest quarter was a very good fit. The concordance (index of classification accuracy) of the model was 84%. We subsampled the data repeatedly, generated new parameter estimates, and tested the predictions against data not included in the model. Classification accuracy was similar for each subsample model; therefore, we concluded that the final model is stable. Gridded estimates of the climate variables were used to spatially extend the two-variable logistic regression model and produce a probability of occurrence map for Scleroderris disease across Ontario. The predicted map of probability of occurrence fits well with the map of the observed locations of the disease. These results lend credence to previous work that suggests that distribution of Scleroderris disease is strongly influenced by climate. The classification results also suggest that this model is a useful tool for assessing the risk of Scleroderris disease throughout Ontario.
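A two-predictor logistic model and a concordance index of the kind mentioned in this abstract can be sketched as below. The coefficient names and any values passed to them are hypothetical placeholders, not the fitted estimates from the paper, and counting tied pairs as one half follows the usual c-statistic convention, which may differ from the authors' exact calculation.

```python
import math


def occurrence_probability(temp_c, precip_mm, b0, b_temp, b_precip):
    """Two-predictor logistic model: p = 1 / (1 + exp(-(b0 + b1*T + b2*P)))."""
    z = b0 + b_temp * temp_c + b_precip * precip_mm
    return 1.0 / (1.0 + math.exp(-z))


def concordance(probs, observed):
    """Share of (presence, absence) pairs where the presence site scores higher.

    Ties count as one half, as in the c statistic.
    """
    presence = [p for p, o in zip(probs, observed) if o == 1]
    absence = [p for p, o in zip(probs, observed) if o == 0]
    if not presence or not absence:
        raise ValueError("need at least one presence and one absence record")
    score = 0.0
    for p in presence:
        for q in absence:
            if p > q:
                score += 1.0
            elif p == q:
                score += 0.5
    return score / (len(presence) * len(absence))
```

Evaluating `occurrence_probability` at every cell of a gridded climate surface would give a probability-of-occurrence map of the sort the abstract describes.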

Classifying Asian Rice Cultivars (Oryza sativa L.) into Indica and Japonica Using Logistic Regression Model with Publicly Available Phenotypic Data

10.1101/470351 ◽

2018 ◽

Author(s):

Bongsong Kim

Keyword(s):

Logistic Regression ◽

Oryza Sativa ◽

Regression Model ◽

Prediction Accuracy ◽

Logistic Regression Model ◽

Rice Cultivars ◽

Prediction Ability ◽

Plant Seed ◽

Asian Rice ◽

Panicle Number

Abstract: This article introduces how to implement the logistic regression model (LRM) with phenotypic variables for classifying Asian rice (Oryza sativa L.) cultivars into two pivotal subpopulations, indica and japonica. This study took advantage of publicly available data attached to a previous paper. The classification accuracy was assessed using the area under the curve (AUC) of the receiver operating characteristic (ROC) curve. Given 24 phenotypic variables for 280 indica/japonica accessions, LRMs were fitted with every possible combination of up to six phenotypic variables; the highest AUC was 0.9977, obtained with six variables: panicle number per plant, seed number per panicle, florets per panicle, panicle fertility, straighthead susceptibility and blast resistance. Overall, the more variables a model included, the higher the resulting AUC. The ultimate purpose of this study is to demonstrate the indica/japonica prediction ability of the LRM when applied to unclassified Asian rice cultivars. To estimate the indica/japonica prediction accuracy, ten-fold cross-validation was conducted 100 times with the 280 indica/japonica accessions using the LRM with the parameters that yielded the highest AUC. The resulting prediction accuracy was 0.9779. This suggests that the LRM promises to be a highly effective indica/japonica prediction tool using phenotypic variables in Asian cultivated rice.
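The exhaustive search over variable combinations mentioned in this abstract can be sketched as follows. If "up to six" means every subset of one to six of the 24 variables, that is 190,050 candidate models. The `score_subset` callback is a hypothetical placeholder for fitting an LRM on the chosen variables and returning its AUC; the abstract does not say how the search was implemented.

```python
from itertools import combinations
from math import comb


def count_candidate_models(n_variables, max_size):
    """Number of non-empty variable subsets with at most `max_size` members."""
    return sum(comb(n_variables, k) for k in range(1, max_size + 1))


def best_subset(variables, score_subset, max_size):
    """Return (best_score, best_subset) over all subsets of size 1..max_size.

    `score_subset` is a caller-supplied function (hypothetical here) that fits
    a model on the given variables and returns a score such as the AUC.
    """
    best = None
    for size in range(1, max_size + 1):
        for subset in combinations(variables, size):
            score = score_subset(subset)
            # Strict comparison keeps the first subset found on ties.
            if best is None or score > best[0]:
                best = (score, subset)
    return best
```

Because the score is computed on the same accessions used for fitting, larger subsets will tend to score at least as well as smaller ones, which is consistent with the trend the abstract reports and is one reason a separate cross-validated accuracy is also given.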

Independent predictors for functional outcome after drainage of chronic subdural hematoma identified using a logistic regression model

Journal of Neurosurgical Sciences ◽

10.23736/s0390-5616.17.04056-5 ◽

2020 ◽

Vol 64 (2) ◽

Cited By ~ 2

Author(s):

Sotirios Katsigiannis ◽

Christina Hamisch ◽

Boris Krischek ◽

Marco Timmer ◽

Anastasios Mpotsaris ◽

...

Keyword(s):

Logistic Regression ◽

Regression Model ◽

Functional Outcome ◽

Subdural Hematoma ◽

Logistic Regression Model ◽

Chronic Subdural Hematoma

Survey on turnover intention of scientific and technological workers based on the binary logistic regression model—a case study of XPCC

Information Management and Management Engineering ◽

10.2495/imme140591 ◽

2014 ◽

Author(s):

Zhui Liu ◽

Honglu Gou ◽

Lingying Kong

Keyword(s):

Logistic Regression ◽

Regression Model ◽

Turnover Intention ◽

Logistic Regression Model ◽

Binary Logistic Regression ◽

Binary Logistic Regression Model

The Effects of Major Stakeholders on SMME's Performance Turnaround -- Empirical Analysis Based on the Ordinal Logistic Regression Model

SSRN Electronic Journal ◽

10.2139/ssrn.2282214 ◽

2013 ◽

Author(s):

Li Qi

Keyword(s):

Logistic Regression ◽

Regression Model ◽

Empirical Analysis ◽

Logistic Regression Model ◽

Ordinal Logistic Regression ◽

Ordinal Logistic Regression Model

Logistic Regression Model for Business Failures Prediction of Technology Industry in Thailand

SSRN Electronic Journal ◽

10.2139/ssrn.2932026 ◽

2012 ◽

Author(s):

Sittichai Puagwatana

Keyword(s):

Logistic Regression ◽

Regression Model ◽

Logistic Regression Model ◽

Technology Industry ◽

Business Failures

Logistic Regression Model of Relationship between Breast Cancer Pathology Diagnosis with Metastasis

Journal of Physics Conference Series ◽

10.1088/1742-6596/1752/1/012026 ◽

2021 ◽

Vol 1752 (1) ◽

pp. 012026

Author(s):

M N Bustan ◽

B Poerwanto

Keyword(s):

Breast Cancer ◽

Logistic Regression ◽

Regression Model ◽

Logistic Regression Model ◽

Cancer Pathology ◽

Breast Cancer Pathology

Work absence and multimorbidity in Portugal: results from the 1st National Health Examination Survey

European Journal of Public Health ◽

10.1093/eurpub/ckaa166.1390 ◽

2020 ◽

Vol 30 (Supplement_5) ◽

Author(s):

J Matos ◽

C Matias Dias ◽

A Félix

Keyword(s):

Logistic Regression ◽

Regression Model ◽

Chronic Diseases ◽

National Health ◽

Logistic Regression Model ◽

Health Examination ◽

Work Absence ◽

Health Examination Survey ◽

Absence From Work ◽

The Impact

Abstract. Background: Studies on the impact of multimorbidity on absence from work indicate that the number and type of chronic diseases may increase absenteeism and that the risk of absence from work is higher in people with two or more chronic diseases. This study analyzed the association between multimorbidity and greater frequency and duration of work absence in the Portuguese population between the ages of 25 and 65 during 2015. Methods: This is an epidemiological, observational, cross-sectional study with an analytical component, using data from the 1st National Health Examination Survey. Univariate, bivariate and multivariate analyses of the study variables were performed, and a multivariate logistic regression model was constructed. Results: The prevalence of absenteeism was 55.1%. Education showed an association with absence from work (p = 0.0157), as did professional activity (p = 0.0086). No association with absence from work could be verified for the presence of chronic diseases (p = 0.9358) or the presence of multimorbidity (p = 0.4309). The prevalence of multimorbidity was 31.8%. Age (p < 0.0001), education (p < 0.001) and yield (p = 0.0009) were associated with multimorbidity. The number of days of absence from work did not increase with the number of chronic diseases. In the optimized logistic regression model, the only variables associated with work absence were age (p = 0.0391) and education (p = 0.0089). Conclusions: The scientific evidence generated will contribute to the current discussion on the need for the health and social security system to develop policies for patients with multimorbidity. Key messages: The prevalence of absenteeism and multimorbidity in Portugal was 55.1% and 31.8%, respectively. In the optimized model, age and education were associated with work absence.

Logistic regression model with TreeNet and association rules analysis: applications with medical datasets

Communications in Statistics - Simulation and Computation ◽

10.1080/03610918.2021.1912764 ◽

2021 ◽

pp. 1-25

Author(s):

Pannapa Changpetch

Keyword(s):

Logistic Regression ◽

Regression Model ◽

Association Rules ◽

Logistic Regression Model ◽

Association Rules Analysis

Using Convolutional Neural Network and Candlestick Representation to Predict Sports Match Outcomes

Applied Sciences ◽

10.3390/app11146594 ◽

2021 ◽

Vol 11 (14) ◽

pp. 6594

Author(s):

Yu-Chia Hsu

Keyword(s):

Neural Network ◽

Time Series ◽

Pattern Recognition ◽

Logistic Regression ◽

Regression Model ◽

Convolutional Neural Network ◽

Logistic Regression Model ◽

National Football League ◽

Performance Metrics ◽

Betting Market

The interdisciplinary nature of sports and the presence of various systemic and non-systemic factors introduce challenges in predicting sports match outcomes using a single disciplinary approach. In contrast to previous studies that use sports performance metrics and statistical models, this study is the first to apply a deep learning approach in financial time series modeling to predict sports match outcomes. The proposed approach has two main components: a convolutional neural network (CNN) classifier for implicit pattern recognition and a logistic regression model for match outcome judgment. First, the raw data used in the prediction are derived from the betting market odds and actual scores of each game, which are transformed into sports candlesticks. Second, CNN is used to classify the candlesticks time series on a graphical basis. To this end, the original 1D time series are encoded into 2D matrix images using Gramian angular field and are then fed into the CNN classifier. In this way, the winning probability of each matchup team can be derived based on historically implied behavioral patterns. Third, to further consider the differences between strong and weak teams, the CNN classifier adjusts the probability of winning the match by using the logistic regression model and then makes a final judgment regarding the match outcome. We empirically test this approach using 18,944 National Football League game data spanning 32 years and find that using the individual historical data of each team in the CNN classifier for pattern recognition is better than using the data of all teams. The CNN in conjunction with the logistic regression judgment model outperforms the CNN in conjunction with SVM, Naïve Bayes, Adaboost, J48, and random forest, and its accuracy surpasses that of betting market prediction.
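The Gramian angular field step, which turns a 1D series into a 2D matrix image, can be sketched as below. The abstract does not say whether the summation or the difference variant was used, so both are shown; the function name and the min-max rescaling to [-1, 1] are assumptions of this example rather than details taken from the paper.

```python
import math


def gramian_angular_field(series, method="summation"):
    """Encode a 1-D series as a 2-D Gramian angular field matrix.

    Values are rescaled to [-1, 1], mapped to angles phi = arccos(x), and
    combined pairwise as cos(phi_i + phi_j) or sin(phi_i - phi_j).
    """
    lo, hi = min(series), max(series)
    if hi == lo:
        raise ValueError("series must contain at least two distinct values")
    # Rescale to [-1, 1] so every value is a valid cosine.
    scaled = [2.0 * (x - lo) / (hi - lo) - 1.0 for x in series]
    # Clamp against floating-point drift before taking the arccosine.
    phi = [math.acos(max(-1.0, min(1.0, s))) for s in scaled]
    if method == "summation":
        return [[math.cos(a + b) for b in phi] for a in phi]
    if method == "difference":
        return [[math.sin(a - b) for b in phi] for a in phi]
    raise ValueError("method must be 'summation' or 'difference'")
```

A series of length n yields an n-by-n matrix, which is what allows an image classifier such as a CNN to be applied to time series data.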
