Use of Machine Learning to Investigate The Quantitative Checklist For Autism in Toddlers (Q-CHAT) Towards Early Autism Screening

2020 ◽  
Author(s):  
Gennaro Tartarisco ◽  
Giovanni Cicceri ◽  
Davide Di Pietro ◽  
Stefania Aiello ◽  
Elisa Leonardi ◽  
...  

Abstract Background: In the past two decades, several screening instruments have been developed to detect toddlers who may be autistic, both in clinical and unselected samples. Among others, the Quantitative CHecklist for Autism in Toddlers (Q-CHAT) is a quantitative and normally distributed measure of autistic traits which demonstrated good psychometric properties in different settings and cultures. Recently, machine learning (ML) has been applied to behavioural science to improve classification performance of autism screening and diagnostic tools, but mainly in children, adolescents and adults. Methods: In this study, we used ML to investigate the accuracy and reliability of the Q-CHAT in discriminating young autistic children from those without. Three different ML algorithms (Random Forest, Naive Bayes and Support Vector Machine) were applied to investigate the complete set of Q-CHAT items and the best predictive items. Results: Our results showed that the three selected models outperformed the classical statistical methods of predictive validity and, among the three ML classifiers, the Support Vector Machine was the most effective, being able to classify autism with 95% accuracy. Furthermore, using the Support Vector Machine-Recursive Feature Elimination approach we were able to select a subset of 14 items ensuring an accuracy of 93%, while an accuracy of 83% was obtained from the best 3 discriminating items in common between ours and the previously reported Q-CHAT-10. Limitations: Further data collection is needed. Conclusions: This evidence confirms the high performance and cross-cultural validity of the Q-CHAT and supports the application of ML to create shorter and faster versions of the instrument maintaining high classification accuracy, to be used as a quick, easy and high-performance tool in primary care settings.

Diagnostics ◽  
2021 ◽  
Vol 11 (3) ◽  
pp. 574
Author(s):  
Gennaro Tartarisco ◽  
Giovanni Cicceri ◽  
Davide Di Pietro ◽  
Elisa Leonardi ◽  
Stefania Aiello ◽  
...  

In the past two decades, several screening instruments were developed to detect toddlers who may be autistic both in clinical and unselected samples. Among others, the Quantitative CHecklist for Autism in Toddlers (Q-CHAT) is a quantitative and normally distributed measure of autistic traits that demonstrates good psychometric properties in different settings and cultures. Recently, machine learning (ML) has been applied to behavioral science to improve the classification performance of autism screening and diagnostic tools, but mainly in children, adolescents, and adults. In this study, we used ML to investigate the accuracy and reliability of the Q-CHAT in discriminating young autistic children from those without. Five different ML algorithms (random forest (RF), naïve Bayes (NB), support vector machine (SVM), logistic regression (LR), and K-nearest neighbors (KNN)) were applied to investigate the complete set of Q-CHAT items. Our results showed that ML achieved an overall accuracy of 90%, and the SVM was the most effective, being able to classify autism with 95% accuracy. Furthermore, using the SVM–recursive feature elimination (RFE) approach, we selected a subset of 14 items ensuring 91% accuracy, while 83% accuracy was obtained from the 3 best discriminating items in common to ours and the previously reported Q-CHAT-10. This evidence confirms the high performance and cross-cultural validity of the Q-CHAT, and supports the application of ML to create shorter and faster versions of the instrument, maintaining high classification accuracy, to be used as a quick, easy, and high-performance tool in primary-care settings.
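The SVM-recursive feature elimination idea mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the toy data, the helper names `train_linear_svm` and `rfe`, and the training hyperparameters are all assumptions, and a plain subgradient-descent linear SVM stands in for whatever SVM implementation the study used.

```python
def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Fit a bias-free linear SVM by full-batch subgradient descent on the
    L2-regularised hinge loss. Hyperparameters are illustrative only."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(epochs):
        grad = [lam * wj for wj in w]
        for xi, yi in zip(X, y):
            margin = yi * sum(wj * xj for wj, xj in zip(w, xi))
            if margin < 1.0:  # only margin violators contribute
                for j in range(d):
                    grad[j] -= yi * xi[j] / n
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w


def rfe(X, y, n_keep):
    """Recursive feature elimination: retrain, drop the feature with the
    smallest squared weight, and repeat until n_keep features remain."""
    active = list(range(len(X[0])))
    while len(active) > n_keep:
        Xa = [[row[j] for j in active] for row in X]
        w = train_linear_svm(Xa, y)
        worst = min(range(len(active)), key=lambda k: w[k] ** 2)
        active.pop(worst)
    return active


# Hypothetical toy data: items 0 and 1 track the label, items 2 and 3 do not.
y = [1, -1, 1, -1, 1, -1, 1, -1]
noise_a = [1, 1, -1, -1, 1, 1, -1, -1]
noise_b = [1, 1, 1, 1, -1, -1, -1, -1]
X = [[2.0 * yi, 1.0 * yi, float(a), float(b)]
     for yi, a, b in zip(y, noise_a, noise_b)]

selected = rfe(X, y, n_keep=2)
print(selected)
```

In this toy setting the two uninformative items are eliminated first. A real questionnaire analysis would also need cross-validation around the elimination loop, which is omitted here for brevity.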


Author(s):  
Seyma Kiziltas Koc ◽  
Mustafa Yeniad

Technologies used in the healthcare industry are changing rapidly as technology evolves to improve people's lifestyles. For instance, different technological devices are used for the diagnosis and treatment of diseases, and with developing technology it has been shown that diseases can be diagnosed by computer systems. Machine learning algorithms are frequently used tools because of their high performance in health as well as in many other fields. The aim of this study is to investigate different machine learning classification algorithms that can be used in the diagnosis of diabetes and to make comparative analyses according to the metrics in the literature. Seven classification algorithms from the literature were used in the study: Logistic Regression, K-Nearest Neighbor, Multilayer Perceptron, Random Forest, Decision Trees, Support Vector Machine and Naive Bayes. The classification performance of the algorithms was compared on the basis of accuracy, sensitivity, precision, and F1-score. The results showed that the support vector machine algorithm had the highest accuracy, at 78.65%.
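The four comparison metrics named in this abstract all follow from a binary confusion matrix. The sketch below uses the standard definitions; the function name `classification_metrics` and the counts are hypothetical and are not taken from the study.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, sensitivity (recall), precision and F1-score from
    the four cells of a binary confusion matrix.
    Assumes every denominator is non-zero."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    sensitivity = tp / (tp + fn)
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "precision": precision,
        "f1": f1,
    }


# Hypothetical counts for illustration only.
m = classification_metrics(tp=40, fp=10, fn=20, tn=30)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

Reporting all four together matters because accuracy alone can look acceptable on an imbalanced dataset while sensitivity is poor.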


2012 ◽  
Vol 2012 ◽  
pp. 1-12 ◽  
Author(s):  
Chen-An Tsai ◽  
Chien-Hsun Huang ◽  
Ching-Wei Chang ◽  
Chun-Houh Chen

The development of DNA microarrays enables researchers to screen thousands of genes simultaneously and helps determine high- and low-expression level genes in normal and disease tissues. Selecting relevant genes for cancer classification is an important issue. Most gene selection methods use univariate ranking criteria and arbitrarily choose a threshold to select genes. However, the parameter setting may not be compatible with the selected classification algorithms. In this paper, we propose a new gene selection method (SVM-t) based on the use of t-statistics embedded in a support vector machine. We compared its performance to two similar SVM-based methods: SVM recursive feature elimination (SVMRFE) and recursive support vector machine (RSVM). The three methods were compared based on extensive simulation experiments and analyses of two published microarray datasets. In the simulation experiments, we found that the proposed method is more robust in selecting informative genes than SVMRFE and RSVM and is capable of attaining good classification performance when the variations of informative and noninformative genes are different. In the analysis of two microarray datasets, the proposed method yields better performance in identifying fewer genes with good prediction accuracy, compared to SVMRFE and RSVM.
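For context, the univariate ranking that this abstract contrasts with SVM-t can be sketched as follows. This is not the SVM-t method itself, whose details the abstract does not give; it only shows a per-gene two-sample t-statistic (Welch's form is an assumption here), with hypothetical gene names and expression values.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Two-sample t-statistic with unequal variances (Welch's form)."""
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))


def rank_genes(tumor, normal):
    """Rank genes by |t|, largest first.
    tumor / normal: dict mapping gene name -> list of expression values."""
    scores = {g: abs(welch_t(tumor[g], normal[g])) for g in tumor}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical expression values for two genes.
tumor = {"gene_A": [5.0, 5.2, 4.8], "gene_B": [2.0, 3.0, 1.0]}
normal = {"gene_A": [1.0, 1.2, 0.8], "gene_B": [2.1, 2.9, 1.2]}

ranking = rank_genes(tumor, normal)
print(ranking)
```

Any cut-off applied to such a ranking is a free parameter, which is the arbitrariness the abstract points out.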


2020 ◽  
Vol 8 (6) ◽  
pp. 2862-2867

E-commerce is a website or mobile application platform that helps people buy products. Before purchasing a product, customers decide whether to buy it by reading reviews from previous buyers. The problem is that there are many reviews, so reading them all takes a long time. This research uses sentiment analysis to classify the review data. Sentiment analysis, or opinion mining, is a machine learning approach to classifying and analysing texts or documents about human sentiments, emotions, and opinions. In this research, sentiment analysis was used to classify product reviews from e-commerce websites into positive or negative classes. The results could be processed further and used to summarize customers' opinions about a certain product without reading every single review. The goal of this research is to optimize classification performance by using a feature selection technique. Term Frequency-Inverse Document Frequency (TF-IDF) feature extraction, Backward Elimination feature selection, and five different classifiers (Naïve Bayes, Support Vector Machine, K-Nearest Neighbour, Decision Tree, Random Forest) were used in analysing the sentiment of the reviews. The dataset used is in the Indonesian language and is classified into two classes (positive and negative). The best accuracy is achieved by using TF-IDF, Backward Elimination and Support Vector Machine (SVM), with a score of 85.97%, an increase of 7.91% compared to the process without feature selection. Based on the results, Backward Elimination feature selection succeeded in improving performance for all classifiers used in this research.
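The TF-IDF weighting named in this abstract can be sketched in a few lines. The abstract does not say which TF-IDF variant was used, so the formula here (term frequency normalised by document length, multiplied by log(N / df)) is an assumption, and the tokenised reviews are made up for illustration.

```python
from math import log


def tf_idf(docs):
    """docs: list of token lists. Returns one {term: weight} dict per
    document, with tf = count / document length and idf = log(N / df)."""
    n_docs = len(docs)
    df = {}
    for tokens in docs:
        for term in set(tokens):
            df[term] = df.get(term, 0) + 1
    weights = []
    for tokens in docs:
        w = {}
        for term in set(tokens):
            tf = tokens.count(term) / len(tokens)
            w[term] = tf * log(n_docs / df[term])
        weights.append(w)
    return weights


# Hypothetical tokenised reviews (Indonesian: barang = item, bagus = good,
# cepat = fast, rusak = broken).
reviews = [
    ["barang", "bagus", "cepat"],
    ["barang", "rusak"],
    ["barang", "bagus"],
]

weights = tf_idf(reviews)
print(weights[1])
```

A term that occurs in every review, such as "barang" here, receives zero weight under this variant, while a term confined to one review is weighted most heavily.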


2021 ◽  
Vol 79 (4) ◽  
pp. 1691-1700
Author(s):  
Fan Zhang ◽  
Melissa Petersen ◽  
Leigh Johnson ◽  
James Hall ◽  
Sid E. O’Bryant

Background: There is a need for more reliable diagnostic tools for the early detection of Alzheimer’s disease (AD). This can be a challenge due to a number of factors and logistics, making machine learning a viable option. Objective: In this paper, we present a Support Vector Machine Leave-One-Out Recursive Feature Elimination and Cross Validation (SVM-RFE-LOO) algorithm for use in the early detection of AD and show how the SVM-RFE-LOO method can be used for both classification and prediction of AD. Methods: Data were analyzed on n = 300 participants (n = 150 AD; n = 150 cognitively normal controls). Serum samples were assayed via a multi-plex biomarker assay platform using electrochemiluminescence (ECL). Results: The SVM-RFE-LOO method reduced the number of features in the model from 21 to 16 biomarkers and achieved an area under the curve (AUC) of 0.980 with a sensitivity of 94.0% and a specificity of 93.3%. When the classification and prediction performance of SVM-RFE-LOO was compared to that of SVM and SVM-RFE, we found similar performance across the models; however, the SVM-RFE-LOO method utilized fewer markers. Conclusion: We found that 1) the SVM-RFE-LOO is suitable for analyzing noisy high-throughput proteomic data, 2) it outperforms SVM-RFE in the robustness to noise and in the ability to recover informative features, and 3) it can improve the prediction performance. Our recursive feature elimination model can serve as a general model for biomarker discovery in other diseases.
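The leave-one-out part of this approach can be sketched on its own. The example below is an illustration under stated assumptions, not the SVM-RFE-LOO algorithm: a nearest-centroid classifier stands in for the SVM, and the two-marker measurements and labels are invented.

```python
def nearest_centroid_predict(train_X, train_y, x):
    """Predict the label whose class centroid is closest to x."""
    centroids = {}
    for label in set(train_y):
        rows = [r for r, l in zip(train_X, train_y) if l == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]

    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(c, x))

    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def leave_one_out_accuracy(X, y):
    """Hold out each sample once, train on the rest, and score the
    prediction for the held-out sample."""
    correct = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if nearest_centroid_predict(train_X, train_y, X[i]) == y[i]:
            correct += 1
    return correct / len(X)


# Hypothetical two-marker measurements for six participants.
X = [[0.1, 0.2], [0.0, 0.3], [0.2, 0.1],
     [1.0, 0.9], [0.9, 1.1], [1.1, 1.0]]
y = ["control", "control", "control", "AD", "AD", "AD"]

loo_acc = leave_one_out_accuracy(X, y)
print(loo_acc)
```

Leave-one-out uses every sample for testing exactly once, which suits small cohorts but costs one model fit per sample.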


2021 ◽  
Vol 50 (3) ◽  
pp. 753-768
Author(s):  
NANYONGA AZIIDA ◽  
SORAYYA MALEK ◽  
FIRDAUS AZIZ ◽  
KHAIRUL SHAFIQ IBRAHIM ◽  
SAZZLI KASIM

Hybrid combinations of feature selection, classification and visualisation using machine learning (ML) methods have the potential for enhanced understanding and 30-day mortality prediction of patients with cardiovascular disease using population-specific data. Identifying a feature selection method with a classifier algorithm that produces high performance in mortality studies is essential and has not been reported before. Feature selection methods such as Boruta, Random Forest (RF), Elastic Net (EN), Recursive Feature Elimination (RFE), learning vector quantization (LVQ), Genetic Algorithm (GA), Cluster Dendrogram (CD), Support Vector Machine (SVM) and Logistic Regression (LR) were combined with RF, SVM, LR, and EN classifiers for 30-day mortality prediction. ML models were constructed using 302 patients and 54 input variables from the Malaysian National Cardiovascular Disease Database. Validation of the best ML model was performed against Thrombolysis in Myocardial Infarction (TIMI) using an additional dataset of 102 patients. The Self-Organising Feature Map (SOM) was used to visualise mortality-related factors post-ACS. The performance of ML models using the area under the curve (AUC) ranged from 0.48 to 0.80. The best-performing model (AUC = 0.80) was a hybrid combination of the RF variable importance method, the sequential backward selection and the RF classifier using five predictors (age, triglyceride, creatinine, troponin, and total cholesterol). Comparison with TIMI using an additional dataset resulted in the best ML model outperforming the TIMI score (AUC = 0.75 vs. AUC = 0.60). The findings of this study will provide a basis for developing an online ML-based population-specific risk scoring calculator.
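Since this abstract compares models by AUC, a short sketch of how that quantity can be computed may help. It uses the pairwise-ranking interpretation of AUC (the probability that a randomly chosen positive case scores higher than a randomly chosen negative one, ties counting half). The function name and the labels and scores are hypothetical.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positive and
    negative scores. labels: 1 = event, 0 = no event."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical risk scores for four patients.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]

print(auc(labels, scores))
```

An AUC of 0.5 corresponds to chance-level ranking, which puts the lower end of the reported 0.48 to 0.80 range in perspective.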


Atmosphere ◽  
2021 ◽  
Vol 12 (7) ◽  
pp. 846
Author(s):  
Ilseok Noh ◽  
Hae-Won Doh ◽  
Soo-Ock Kim ◽  
Su-Hyun Kim ◽  
Seoleun Shin ◽  
...  

Spring frosts damage crops that have weakened freezing resistance after germination. We developed a machine learning (ML)-based frost-classification model and optimized it for orchard farming environments. First, logistic regression, decision tree, random forest, and support vector machine models were trained using balanced Korea Meteorological Administration (KMA) Automated Synoptic Observing System (ASOS) frost observation data for March from the last 10 years (2008–2017). Random forest and support vector machine models showed good classification performance and were selected as the main techniques, which were optimized for orchard fields based on initial frost occurrence times. The training period was then extended to March–April for 20 years (2000–2019). Finally, the model was applied to the KMA ASOS frost observation data from March to April 2020, which were not used in the previous steps, and RGB data were extracted by digital cameras installed in an orchard in Gyeonggi-do. The developed model successfully classified 117 of 139 frost observation cases from the domestic ASOS data and 35 of 37 orchard camera observations. The assumption of the initial frost occurrence time for training helped the most in improving the frost-classification model. These results clearly indicate that the frost-classification model using ML has applicable accuracy in orchard farming.
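The two counts reported in this abstract translate directly into proportions. The arithmetic below uses only the numbers given above; the helper name `hit_rate` is an illustrative choice.

```python
def hit_rate(correct, total):
    """Fraction of observation cases that were classified correctly."""
    return correct / total


asos = hit_rate(117, 139)    # domestic ASOS frost observations
orchard = hit_rate(35, 37)   # orchard camera observations

print(f"ASOS: {asos:.1%}, orchard: {orchard:.1%}")
```

This works out to roughly 84% for the ASOS cases and roughly 95% for the orchard camera observations.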


2018 ◽  
Vol 2018 ◽  
pp. 1-16 ◽  
Author(s):  
Nhat-Duc Hoang ◽  
Quoc-Lam Nguyen

Periodic surveys of asphalt pavement condition are crucial in road maintenance. This work carries out a comparative study on the performance of machine learning approaches used for automatic pavement crack recognition. Six machine learning approaches, Naïve Bayesian Classifier (NBC), Classification Tree (CT), Backpropagation Artificial Neural Network (BPANN), Radial Basis Function Neural Network (RBFNN), Support Vector Machine (SVM), and Least Squares Support Vector Machine (LSSVM), have been employed. Additionally, Median Filter (MF), Steerable Filter (SF), and Projective Integral (PI) have been used to extract useful features from pavement images. In the feature extraction phase, performance comparison shows that the input pattern including the diagonal PIs enhances the classification performance significantly by creating more informative features. A simple moving average method is also employed to reduce the size of the feature set, with positive effects on the model classification performance. Experimental results show that LSSVM achieved the highest classification accuracy rate. Therefore, this machine learning algorithm, used with the feature extraction process proposed in this study, can be a very promising tool to assist transportation agencies in the task of pavement condition survey.
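The projective-integral and moving-average steps mentioned in this abstract can be sketched as follows. The abstract does not define either step precisely, so this assumes that a projective integral is the sum of pixel values along each row or column and that the moving average is taken over non-overlapping windows. The tiny binary image is made up, and the diagonal projections used in the paper are omitted.

```python
def projection_integrals(image):
    """Sum pixel values along rows (horizontal PI) and columns (vertical PI)."""
    horizontal = [sum(row) for row in image]
    vertical = [sum(col) for col in zip(*image)]
    return horizontal, vertical


def block_average(values, window):
    """Average over non-overlapping windows to shorten a feature vector."""
    out = []
    for i in range(0, len(values), window):
        chunk = values[i:i + window]
        out.append(sum(chunk) / len(chunk))
    return out


# Hypothetical 4x4 binary image with a vertical crack in column 1.
image = [
    [0, 1, 0, 0],
    [0, 1, 0, 0],
    [0, 1, 0, 0],
    [0, 1, 0, 0],
]

horizontal, vertical = projection_integrals(image)
reduced = block_average(vertical, window=2)
print(horizontal, vertical, reduced)
```

A vertical crack shows up as a spike in the vertical projection and a flat horizontal projection, which is the kind of contrast a classifier can exploit.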


2021 ◽  
Author(s):  
Boshra Shams ◽  
Ziqian Wang ◽  
Timo Roine ◽  
Baran Aydogan ◽  
Peter Vajkoczy ◽  
...  

Abstract: Along tract statistics enables white matter characterization using various diffusion MRI (dMRI) metrics. Here, we applied a machine learning (ML) method to assess the clinical utility of dMRI metrics along corticospinal tracts (CST), investigating whether motor glioma patients can be classified with respect to their motor status. The ML-based analysis included developing models based on support vector machine (SVM) using histogram-based measures of dMRI-based tract profiles (e.g., mean, standard deviation, kurtosis and skewness), following a recursive feature elimination (RFE) method based on SVM (SVM-RFE). Our model achieved high performance (74% sensitivity, 75% specificity, 74% overall accuracy and 77% AUC). Incorporating the patients’ demographics and clinical features such as age, tumor WHO grade, tumor location, gender and resting motor threshold (RMT) into our designed models demonstrated that these features were not as effective as microstructural measures. The results revealed that ADC, FA and RD contributed more than other features to the model.
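The histogram-based measures listed in this abstract can be computed from a tract profile as in the sketch below. It assumes population (biased) moments and non-excess kurtosis, which may differ from the conventions used in the study, and the profile values are hypothetical.

```python
from math import sqrt


def profile_features(profile):
    """Mean, standard deviation, skewness and kurtosis of a tract profile,
    using population moments (kurtosis is not excess kurtosis)."""
    n = len(profile)
    mean = sum(profile) / n
    m2 = sum((v - mean) ** 2 for v in profile) / n
    m3 = sum((v - mean) ** 3 for v in profile) / n
    m4 = sum((v - mean) ** 4 for v in profile) / n
    return {
        "mean": mean,
        "std": sqrt(m2),
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2,
    }


# Hypothetical FA values sampled at five points along a tract.
fa_profile = [0.2, 0.3, 0.4, 0.5, 0.6]

features = profile_features(fa_profile)
print(features)
```

Each dMRI metric would contribute one such feature vector per tract, and those vectors would then be passed to the classifier and feature-elimination steps.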

