Prediction of ADHD diagnosis using brief, low-cost, clinical measures: a competitive model evaluation

Proper diagnosis of ADHD is costly, requiring in-depth evaluation via interview, multi-informant and observational assessment, and scrutiny of possible other conditions. The increasing availability of data may allow the development of machine-learning algorithms capable of accurate diagnostic predictions using low-cost measures. We report on the performance of multiple classification methods used to predict a clinician-consensus ADHD diagnosis. Classification methods ranged from fairly simple (e.g., logistic regression) to more complex (e.g., random forest), and also included a multi-stage Bayesian approach. All methods were evaluated in two large (N>1000), independent cohorts. The multi-stage Bayesian classifier provides an intuitive approach that is consistent with clinical workflows, and is able to predict ADHD diagnosis with high accuracy (>86%)—though not significantly better than other commonly used classifiers, including logistic regression. Results suggest that data from parent and teacher surveys is sufficient for high-confidence classifications in the vast majority of cases using relatively straightforward methods.

Drug Target Group Prediction with Multiple Drug Networks

Combinatorial Chemistry & High Throughput Screening ◽

10.2174/1386207322666190702103927 ◽

2020 ◽

Vol 23 (4) ◽

pp. 274-284 ◽

Cited By ~ 12

Author(s):

Jingang Che ◽

Lei Chen ◽

Zi-Han Guo ◽

Shuaiqun Wang ◽

Aorigele

Keyword(s):

Drug Target ◽

Low Cost ◽

Machine Learning Algorithms ◽

Classification Model ◽

Support Vector ◽

Multiple Drug ◽

Property A ◽

Multiple Networks ◽

Proposed Model ◽

The One

Background: Identification of drug-target interactions is essential in drug discovery. It helps to predict unexpected therapeutic or adverse side effects of drugs. To date, several computational methods have been proposed to predict drug-target interactions because they are fast and low-cost compared with traditional wet experiments. Methods: In this study, we investigated this problem in a different way. According to KEGG, drugs were classified into several groups based on their target proteins. A multi-label classification model was presented to assign drugs to the correct target groups. To make full use of the known drug properties, five networks were constructed, each of which represented drug associations in one property. A powerful network embedding method, Mashup, was adopted to extract drug features from the above-mentioned networks, based on which several machine learning algorithms, including the RAndom k-labELsets (RAKEL) algorithm, the Label Powerset (LP) algorithm and Support Vector Machine (SVM), were used to build the classification model. Results and Conclusion: Tenfold cross-validation yielded an accuracy of 0.839, an exact match of 0.816 and a Hamming loss of 0.037, indicating good performance of the model. The contribution of each network was also analyzed. Furthermore, the model with multiple networks was found to be superior to both the model with a single network and the classic model, indicating the superiority of the proposed model.
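The abstract above reports exact match and Hamming loss for a multi-label classifier. As a minimal sketch of how these two metrics are commonly defined (not the authors' code), the following uses made-up target-group names and predictions; `hamming_loss`, `exact_match`, `labels`, `y_true` and `y_pred` are all hypothetical names introduced for illustration.

```python
from typing import List, Set


def hamming_loss(y_true: List[Set[str]], y_pred: List[Set[str]], labels: Set[str]) -> float:
    """Fraction of (sample, label) pairs on which prediction and truth disagree."""
    # The symmetric difference counts labels that are wrongly added or wrongly missed.
    wrong = sum(len(t ^ p) for t, p in zip(y_true, y_pred))
    return wrong / (len(y_true) * len(labels))


def exact_match(y_true: List[Set[str]], y_pred: List[Set[str]]) -> float:
    """Fraction of samples whose predicted label set equals the true label set."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy data (assumed, not from the paper): three drugs, four possible target groups.
labels = {"gpcr", "enzyme", "ion_channel", "nuclear_receptor"}
y_true = [{"gpcr"}, {"enzyme", "ion_channel"}, {"nuclear_receptor"}]
y_pred = [{"gpcr"}, {"enzyme"}, {"nuclear_receptor", "gpcr"}]

print(hamming_loss(y_true, y_pred, labels))  # 2 wrong out of 12 label slots
print(exact_match(y_true, y_pred))           # 1 of 3 drugs fully correct
```

Lower Hamming loss is better, while higher exact match is better; exact match is the stricter of the two because a single wrong label makes the whole sample count as an error.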

Electrodeposited Ni-Fe-P-FeMnO3/Fe multi-stage nanostructured electrocatalyst with superior catalytic performance for water splitting

Journal of Materials Chemistry A ◽

10.1039/d1ta04518k ◽

2021 ◽

Author(s):

Xiao Tan ◽

Xin Liu ◽

Yingying Si ◽

Zunhang Lv ◽

Zihan Li ◽

...

Keyword(s):

Water Splitting ◽

Alkaline Solution ◽

Low Cost ◽

Catalytic Performance ◽

Multi Stage

It is very important to design and prepare low-cost and efficient electrocatalysts for water splitting in alkaline solution. In this work, Ni-Fe-P and Ni-Fe-P-FeMnO3 electrocatalysts are developed using facile electrodeposition...

Google Play Content Scraping and Knowledge Engineering using Natural Language Processing Techniques with the Analysis of User Reviews

Journal of Intelligent Systems ◽

10.1515/jisys-2019-0197 ◽

2020 ◽

Vol 30 (1) ◽

pp. 192-208 ◽

Cited By ~ 1

Author(s):

Hamza Aldabbas ◽

Abdullah Bajahzar ◽

Meshrif Alruily ◽

Ali Adil Qureshi ◽

Rana M. Amir Latif ◽

...

Keyword(s):

Logistic Regression ◽

Language Processing ◽

Mobile Application ◽

Knowledge Engineering ◽

Machine Learning Algorithms ◽

Application Development ◽

User Reviews ◽

N Gram ◽

Logistic Regression Algorithm ◽

Google Play

Abstract: Maintaining a competitive edge in the mobile application market requires evaluating what users need from a quality app. User feedback on these applications plays an essential role in the mobile application development industry. The rapid growth of web technology has given people an opportunity to interact, write reviews, rate applications, and share their feedback. In this paper we scraped 506,259 user reviews and application ratings from the Google Play Store across 14 different categories. The results were measured using several common machine learning algorithms, namely Logistic Regression, Random Forest Classifier, and Multinomial Naïve Bayes. Accuracy, precision, recall, and F1 score were used to evaluate Bigram, Trigram, and N-gram features, and the statistical results of these algorithms were compared. Each algorithm was analyzed in turn, and its results were evaluated. It is concluded that logistic regression is the best algorithm for review analysis of Google Play Store applications. The results were checked, and the accuracy of the logistic regression algorithm was assessed for classifying reviews into three classes, i.e., positive, negative, and neutral.
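The abstract above evaluates bigram, trigram, and general n-gram features extracted from review text. A minimal sketch of n-gram extraction is shown below; the `ngrams` helper, the whitespace tokenization, and the sample review are assumptions for illustration and are not taken from the paper.

```python
from typing import List, Tuple


def ngrams(tokens: List[str], n: int) -> List[Tuple[str, ...]]:
    """Return all contiguous runs of n tokens, in order of appearance."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Hypothetical review text; a real pipeline would also strip punctuation and stop words.
review = "Great app but crashes on startup"
tokens = review.lower().split()

bigrams = ngrams(tokens, 2)
trigrams = ngrams(tokens, 3)

print(bigrams)
print(trigrams)
```

Counts of such n-grams (for example via `collections.Counter`) are a common input representation for classifiers like logistic regression or naive Bayes.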

Digital Smoke Taint Detection in Pinot Grigio Wines Using an E-Nose and Machine Learning Algorithms Following Treatment with Activated Carbon and a Cleaving Enzyme

Fermentation ◽

10.3390/fermentation7030119 ◽

2021 ◽

Vol 7 (3) ◽

pp. 119

Author(s):

Vasiliki Summerson ◽

Claudia Gonzalez Viejo ◽

Damir D. Torrico ◽

Alexis Pang ◽

Sigfredo Fuentes

Keyword(s):

Machine Learning ◽

Activated Carbon ◽

Low Cost ◽

Aroma Compounds ◽

Machine Learning Algorithms ◽

Enzyme Treatment ◽

Mean Values ◽

Volatile Aroma ◽

Carbon Treatment ◽

Volatile Aroma Compounds

The incidence and intensity of bushfires are increasing due to climate change, resulting in a greater risk of smoke taint development in wine. In this study, smoke-tainted and non-smoke-tainted wines were subjected to treatments using activated carbon with/without the addition of a cleaving enzyme treatment to hydrolyze glycoconjugates. Chemical measurements and volatile aroma compounds were assessed for each treatment, with the two smoke taint amelioration treatments exhibiting lower mean values for volatile aroma compounds with positive ‘fruit’ aromas. Furthermore, a low-cost electronic nose (e-nose) was used to assess the wines. A machine learning model based on artificial neural networks (ANN) was developed using the e-nose outputs from the unsmoked control wine, unsmoked wine with activated carbon treatment, unsmoked wine with a cleaving enzyme plus activated carbon treatment, and smoke-tainted control wine samples as inputs to classify the wines according to the smoke taint amelioration treatment. The model displayed a high overall accuracy of 98% in classifying the e-nose readings, illustrating that it may be a rapid, cost-effective tool for winemakers to assess the effectiveness of smoke taint amelioration treatment by activated carbon with/without the use of a cleaving enzyme. Furthermore, the use of a cleaving enzyme coupled with activated carbon was found to be effective in ameliorating smoke taint in wine and may help delay the resurgence of smoke aromas in wine following the aging and hydrolysis of glycoconjugates.

GENE SELECTION USING LOGISTIC REGRESSIONS BASED ON AIC, BIC AND MDL CRITERIA

New Mathematics and Natural Computation ◽

10.1142/s179300570500007x ◽

2005 ◽

Vol 01 (01) ◽

pp. 129-145 ◽

Cited By ~ 15

Author(s):

XIAOBO ZHOU ◽

XIAODONG WANG ◽

EDWARD R. DOUGHERTY

Keyword(s):

Logistic Regression ◽

Regression Model ◽

Logistic Regression Model ◽

Gene Selection ◽

Information Criterion ◽

Cancer Classification ◽

Data Sets ◽

Classification Methods ◽

Gene Expressions ◽

Experimental Conditions

In microarray-based cancer classification, gene selection is an important issue owing to the large number of variables (gene expressions) and the small number of experimental conditions. Many gene-selection and classification methods have been proposed; however most of these treat gene selection and classification separately, and not under the same model. We propose a Bayesian approach to gene selection using the logistic regression model. The Akaike information criterion (AIC), the Bayesian information criterion (BIC) and the minimum description length (MDL) principle are used in constructing the posterior distribution of the chosen genes. The same logistic regression model is then used for cancer classification. Fast implementation issues for these methods are discussed. The proposed methods are tested on several data sets including those arising from hereditary breast cancer, small round blue-cell tumors, lymphoma, and acute leukemia. The experimental results indicate that the proposed methods show high classification accuracies on these data sets. Some robustness and sensitivity properties of the proposed methods are also discussed. Finally, mixing logistic-regression based gene selection with other classification methods and mixing logistic-regression-based classification with other gene-selection methods are considered.
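The abstract above relies on the Akaike and Bayesian information criteria. As a sketch of their standard textbook definitions only (the paper's posterior construction is not reproduced here), the following computes both from a model's maximized log-likelihood; the function names and the numeric inputs are hypothetical.

```python
import math


def aic(log_likelihood: float, k: int) -> float:
    """Akaike information criterion: 2k - 2 ln(L), with k fitted parameters."""
    return 2 * k - 2 * log_likelihood


def bic(log_likelihood: float, k: int, n: int) -> float:
    """Bayesian information criterion: k ln(n) - 2 ln(L), with n observations."""
    return k * math.log(n) - 2 * log_likelihood


# Made-up values: a logistic model with 3 parameters fitted to 50 samples.
ll, k, n = -20.0, 3, 50
print(aic(ll, k))     # 2*3 + 40
print(bic(ll, k, n))  # 3*ln(50) + 40
```

For both criteria, lower values indicate a better trade-off between fit and model size; BIC penalizes extra parameters more heavily than AIC once n exceeds about 7, since ln(n) is then greater than 2.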

Machine Learning Models for Finger Bend Evaluation using Implemented Low cost Flex Sensor

International Journal for Research in Applied Science and Engineering Technology ◽

10.22214/ijraset.2021.35742 ◽

2021 ◽

Vol 9 (VI) ◽

pp. 3605-3611

Author(s):

Pratyush Kaware

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Low Cost ◽

Learning Algorithms ◽

Cost Effective ◽

Machine Learning Algorithms ◽

Support Vector ◽

Learning Models ◽

Machine Learning Models

In this paper, a cost-effective sensor has been implemented to read finger bend signals by attaching the sensor to a finger, so as to classify them based on the degree of bend as well as the joint about which the finger was being bent. This was done by testing various machine learning algorithms to find the most accurate and consistent classifier. Finally, we found that Support Vector Machine was the algorithm best suited to classify our data; using it, we were able to predict the live state of a finger, i.e., the degree of bend and the joints involved. The live voltage values from the sensor were transmitted using a NodeMCU micro-controller, converted to digital, and uploaded to a database for analysis.

Implementation of Machine Learning Algorithms for Prediction of Fluidelastic Instability in Tube Arrays

Journal of Pressure Vessel Technology ◽

10.1115/1.4049876 ◽

2021 ◽

Vol 143 (2) ◽

Author(s):

Joaquin E. Moran ◽

Yasser Selima

Keyword(s):

Machine Learning ◽

Logistic Regression ◽

Learning Algorithm ◽

Machine Learning Algorithms ◽

Two Phase ◽

Factors Affecting ◽

Logistic Regression Models ◽

Number Of Factors ◽

Tube Arrays ◽

Fluidelastic Instability

Abstract Fluidelastic instability (FEI) in tube arrays has been studied extensively experimentally and theoretically for the last 50 years, due to its potential to cause significant damage in short periods. Incidents similar to those observed at San Onofre Nuclear Generating Station indicate that the problem is not yet fully understood, probably due to the large number of factors affecting the phenomenon. In this study, a new approach for the analysis and interpretation of FEI data using machine learning (ML) algorithms is explored. FEI data for both single and two-phase flows have been collected from the literature and utilized for training a machine learning algorithm in order to either provide estimates of the reduced velocity (single and two-phase) or indicate if the bundle is stable or unstable under certain conditions (two-phase). The analysis included the use of logistic regression as a classification algorithm for two-phase flow problems to determine if specific conditions produce a stable or unstable response. The results of this study provide some insight into the capability and potential of logistic regression models to analyze FEI if appropriate quantities of experimental data are available.

Predicting hospitalization following psychiatric crisis care using machine learning

BMC Medical Informatics and Decision Making ◽

10.1186/s12911-020-01361-1 ◽

2020 ◽

Vol 20 (1) ◽

Author(s):

Matthijs Blankers ◽

Louk F. M. van der Post ◽

Jack J. M. Dekker

Keyword(s):

Machine Learning ◽

Logistic Regression ◽

Prediction Models ◽

Learning Algorithms ◽

Nearest Neighbors ◽

Machine Learning Algorithms ◽

Gradient Boosting ◽

Ensemble Model ◽

K Nearest Neighbors ◽

Crisis Care

Abstract Background: Accurate prediction models for whether patients on the verge of a psychiatric crisis need hospitalization are lacking, and machine learning methods may help improve the accuracy of psychiatric hospitalization prediction models. In this paper we evaluate the accuracy of ten machine learning algorithms, including the generalized linear model (GLM/logistic regression), to predict psychiatric hospitalization in the first 12 months after a psychiatric crisis care contact. We also evaluate an ensemble model to optimize the accuracy and we explore individual predictors of hospitalization. Methods: Data from 2084 patients included in the longitudinal Amsterdam Study of Acute Psychiatry with at least one reported psychiatric crisis care contact were included. The target variable for the prediction models was whether the patient was hospitalized in the 12 months following inclusion. The predictive power of 39 variables related to patients’ socio-demographics, clinical characteristics and previous mental health care contacts was evaluated. The accuracy and area under the receiver operating characteristic curve (AUC) of the machine learning algorithms were compared and we also estimated the relative importance of each predictor variable. The best and least performing algorithms were compared with GLM/logistic regression using net reclassification improvement analysis and the five best performing algorithms were combined in an ensemble model using stacking. Results: All models performed above chance level. We found Gradient Boosting to be the best performing algorithm (AUC = 0.774) and K-Nearest Neighbors to be the least performing (AUC = 0.702). The performance of GLM/logistic regression (AUC = 0.76) was slightly above average among the tested algorithms. In a Net Reclassification Improvement analysis, Gradient Boosting outperformed GLM/logistic regression by 2.9% and K-Nearest Neighbors by 11.3%. GLM/logistic regression outperformed K-Nearest Neighbors by 8.7%. Nine of the top-10 most important predictor variables were related to previous mental health care use. Conclusions: Gradient Boosting led to the highest predictive accuracy and AUC while GLM/logistic regression performed average among the tested algorithms. Although statistically significant, the magnitude of the differences between the machine learning algorithms was in most cases modest. The results show that a predictive accuracy similar to the best performing model can be achieved when combining multiple algorithms in an ensemble model.
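The abstract above compares algorithms by their area under the ROC curve (AUC). As an illustrative sketch of what that number measures, the following computes AUC as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, counting ties as one half. The `roc_auc` function and the toy labels and scores are assumptions for illustration, not the study's data or code.

```python
from typing import Sequence


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC via pairwise comparison of positive and negative scores (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example: 1 = hospitalized, 0 = not hospitalized; scores are predicted risks.
y = [1, 0, 1, 0, 1]
risk = [0.9, 0.4, 0.6, 0.7, 0.8]
print(roc_auc(y, risk))  # 5 of 6 positive/negative pairs are ranked correctly
```

The pairwise loop is quadratic in the number of cases, which is fine for a sketch; rank-based formulas give the same value more efficiently on large datasets. A value of 0.5 corresponds to chance-level ranking.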

Design and Optimization of Multi-Stage Centrifugal Compressors With Uncertainty Quantification of Off Design Performance

Volume 2C: Turbomachinery ◽

10.1115/gt2017-63770 ◽

2017 ◽

Author(s):

A. Romei ◽

R. Maffulli ◽

C. Garcia Sanchez ◽

S. Lavagnoli

Keyword(s):

Uncertainty Quantification ◽

Oil And Gas ◽

Low Cost ◽

Design Procedure ◽

Design Tool ◽

Test Case ◽

Centrifugal Compressors ◽

Design Performance ◽

Leading Role ◽

Multi Stage

Multi-stage centrifugal compressors play a leading role in oil and gas process applications. Green operation and market competitiveness require the use of low-cost, reliable compression units with high efficiencies and wide operating range. A methodology is presented for the design optimization of multi-stage centrifugal compressors with prediction of the compressor map and estimation of the uncertainty limits. A one-dimensional (1D) design tool has been developed that automatically generates a multi-stage radial compressor satisfying the target machine requirements based on a few input parameters. The compressor performance map is then assessed using the method proposed by Casey-Robinson [1], and the approach developed by Al-Busaidi-Pilidis [2]. The off-design performance method relies on empirical correlations calibrated on the performance maps of many single-stage centrifugal compressors. An uncertainty quantification study on the predicted performance maps was conducted using the Monte Carlo method (MCM) and generalized Polynomial Chaos Expansion (gPCE). Finally, the design procedure has been coupled to an in-house optimizer based on evolutionary algorithms. The complete design procedure has been applied to a multi-stage industrial compressor test case. A multi-objective optimization of a multi-stage industrial compressor has been performed targeting maximum compressor efficiency and flow range. The results of the optimization show the existence of optimum compressor architectures and how the Pareto fronts evolve depending on the number of stages and shafts.
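The abstract above describes a two-objective optimization (efficiency and flow range) summarized by Pareto fronts. As a minimal sketch of what a Pareto front is, the following filters a list of candidate designs down to the non-dominated ones when both objectives are maximized. The `pareto_front` helper and the candidate values are made up for illustration; the paper's evolutionary optimizer is not reproduced.

```python
from typing import List, Tuple

Design = Tuple[float, float]  # (efficiency, flow_range), both to be maximized


def pareto_front(points: List[Design]) -> List[Design]:
    """Keep points that no other point matches or beats on both objectives."""
    front = []
    for i, p in enumerate(points):
        dominated = any(
            q[0] >= p[0] and q[1] >= p[1] and (q[0] > p[0] or q[1] > p[1])
            for j, q in enumerate(points)
            if j != i
        )
        if not dominated:
            front.append(p)
    return front


# Hypothetical candidate designs, not taken from the paper.
candidates = [(0.82, 0.30), (0.85, 0.25), (0.80, 0.35), (0.81, 0.28), (0.84, 0.20)]
front = pareto_front(candidates)
print(front)
```

Here the design (0.81, 0.28) is dropped because (0.82, 0.30) is better on both objectives, and (0.84, 0.20) is dropped because (0.85, 0.25) is; the remaining three designs each represent a different trade-off between efficiency and flow range.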

Predicting Hospitalization following Psychiatric Crisis Care using Machine Learning

10.21203/rs.2.12338/v1 ◽

2019 ◽

Author(s):

Matthijs Blankers ◽

Louk F. M. van der Post ◽

Jack J. M. Dekker

Keyword(s):

Machine Learning ◽

Logistic Regression ◽

Learning Algorithms ◽

Nearest Neighbors ◽

Machine Learning Algorithms ◽

Predictor Variables ◽

Gradient Boosting ◽

K Nearest Neighbors ◽

Psychiatric Crisis ◽

Crisis Care

Abstract Background: It is difficult to accurately predict whether a patient on the verge of a potential psychiatric crisis will need to be hospitalized. Machine learning may help to improve the accuracy of psychiatric hospitalization prediction models. In this paper we evaluate and compare the accuracy of ten machine learning algorithms, including the commonly used generalized linear model (GLM/logistic regression), to predict psychiatric hospitalization in the first 12 months after a psychiatric crisis care contact, and explore the most important predictor variables of hospitalization. Methods: Data from 2,084 patients with at least one reported psychiatric crisis care contact included in the longitudinal Amsterdam Study of Acute Psychiatry were used. The accuracy and area under the receiver operating characteristic curve (AUC) of the machine learning algorithms were compared. We also estimated the relative importance of each predictor variable. The best and least performing algorithms were compared with GLM/logistic regression using net reclassification improvement analysis. The target variable for the prediction models was whether or not the patient was hospitalized in the 12 months following inclusion in the study. The 39 predictor variables were related to patients’ socio-demographics, clinical characteristics and previous mental health care contacts. Results: We found Gradient Boosting to perform the best (AUC=0.774) and K-Nearest Neighbors to perform the worst (AUC=0.702). The performance of GLM/logistic regression (AUC=0.76) was above average among the tested algorithms. Gradient Boosting outperformed GLM/logistic regression and K-Nearest Neighbors, and GLM outperformed K-Nearest Neighbors in a Net Reclassification Improvement analysis, although the differences between Gradient Boosting and GLM/logistic regression were small. Nine of the top-10 most important predictor variables were related to previous mental health care use. Conclusions: Gradient Boosting led to the highest predictive accuracy and AUC while GLM/logistic regression performed average among the tested algorithms. Although statistically significant, the magnitude of the differences between the machine learning algorithms was modest. Future studies may consider combining multiple algorithms in an ensemble model for optimal performance and to mitigate the risk of choosing suboptimally performing algorithms.
