Dysgraphia detection through machine learning

Abstract Dysgraphia, a disorder affecting the written expression of symbols and words, negatively impacts the academic results of pupils as well as their overall well-being. The use of automated procedures can make dysgraphia testing available to larger populations, thereby facilitating early intervention for those who need it. In this paper, we employed a machine learning approach to identify handwriting deteriorated by dysgraphia. To achieve this goal, we collected a new handwriting dataset consisting of several handwriting tasks and extracted a broad range of features to capture different aspects of handwriting. These were fed to a machine learning algorithm to predict whether handwriting is affected by dysgraphia. We compared several machine learning algorithms and discovered that the best results were achieved by the adaptive boosting (AdaBoost) algorithm. The results show that machine learning can be used to detect dysgraphia with almost 80% accuracy, even when dealing with a heterogeneous set of subjects differing in age, sex and handedness.


Predictors of tooth loss: A machine learning approach

PLoS ONE ◽

10.1371/journal.pone.0252873 ◽

2021 ◽

Vol 16 (6) ◽

pp. e0252873

Author(s):

Hawazin W. Elani ◽

André F. M. Batista ◽

W. Murray Thomson ◽

Ichiro Kawachi ◽

Alexandre D. P. Chiavegatto Filho

Keyword(s):

Machine Learning ◽

Tooth Loss ◽

Learning Algorithm ◽

Machine Learning Algorithms ◽

Future Application ◽

Learning Approach ◽

Machine Learning Algorithm ◽

Predictive Values ◽

Extreme Gradient Boosting ◽

Machine Learning Approach

Introduction Little is understood about the socioeconomic predictors of tooth loss, a condition that can negatively impact an individual's quality of life. The goal of this study is to develop machine-learning algorithms to predict complete and incremental tooth loss among adults and to compare the predictive performance of these models. Methods We used data from the National Health and Nutrition Examination Survey from 2011 to 2014. We developed multiple machine-learning algorithms and assessed their predictive performance by examining the area under the receiver operating characteristic curve (AUC), accuracy, sensitivity, specificity, and positive and negative predictive values. Results The extreme gradient boosting trees presented the highest performance in the prediction of edentulism (AUC = 88.7%; 95%CI: 87.1, 90.2), the absence of a functional dentition (AUC = 88.3%; 95%CI: 87.3, 89.3) and missing any tooth (AUC = 83.2%; 95%CI: 82.0, 84.4). Although, as expected, age and routine dental care emerged as strong predictors of tooth loss, the machine learning approach identified additional predictors, including socioeconomic conditions. Indeed, models incorporating socioeconomic characteristics were better at predicting tooth loss than those relying on clinical dental indicators alone. Conclusions Future application of machine-learning algorithms, with longitudinal cohorts, to identify individuals at risk for tooth loss could help clinicians prioritize interventions directed toward the prevention of tooth loss.
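The evaluation metrics named in the Methods above can be illustrated with a minimal sketch. It is not the authors' code: the function names and the toy counts are assumptions chosen for illustration. The metrics are computed from confusion-matrix counts, and AUC is computed as the fraction of positive/negative pairs that the scores rank correctly (ties count as half).

```python
def classification_metrics(tp, fp, tn, fn):
    """Return common binary-classification metrics from confusion-matrix counts."""
    def ratio(num, den):
        # Guard against empty denominators (e.g. no positive predictions).
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "sensitivity": ratio(tp, tp + fn),   # true positive rate
        "specificity": ratio(tn, tn + fp),   # true negative rate
        "ppv": ratio(tp, tp + fp),           # positive predictive value
        "npv": ratio(tn, tn + fn),           # negative predictive value
    }


def auc(scores, labels):
    """AUC as the share of (positive, negative) pairs ranked correctly."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        return float("nan")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical counts, for illustration only.
metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
toy_auc = auc([0.9, 0.8, 0.4, 0.3], [1, 0, 1, 0])
```

The pairwise AUC is quadratic in the number of samples, which is fine for a sketch but would normally be replaced by a rank-based computation on survey-sized data.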


A Machine Learning Approach to Study Glycosidase Activities from Bifidobacterium

Microorganisms ◽

10.3390/microorganisms9051034 ◽

2021 ◽

Vol 9 (5) ◽

pp. 1034

Author(s):

Carlos Sabater ◽

Lorena Ruiz ◽

Abelardo Margolles

Keyword(s):

Machine Learning ◽

Supervised Classification ◽

Machine Learning Algorithms ◽

Learning Approach ◽

Human Milk Oligosaccharides ◽

Future Studies ◽

High Fiber ◽

Machine Learning Approach ◽

Prebiotic Oligosaccharides

This study aimed to recover metagenome-assembled genomes (MAGs) from human fecal samples to characterize the glycosidase profiles of Bifidobacterium species exposed to different prebiotic oligosaccharides (galacto-oligosaccharides, fructo-oligosaccharides and human milk oligosaccharides, HMOs) as well as high-fiber diets. A total of 1806 MAGs were recovered from 487 infant and adult metagenomes. Unsupervised and supervised classification of the glycosidases encoded in MAGs using machine-learning algorithms established characteristic hydrolytic profiles for B. adolescentis, B. bifidum, B. breve, B. longum and B. pseudocatenulatum, yielding classification rates above 90%. Glycosidase families GH5_44, GH32, and GH110 were characteristic of B. bifidum. The presence or absence of GH1, GH2, GH5 and GH20 was characteristic of B. adolescentis, B. breve and B. pseudocatenulatum, while families GH1 and GH30 were relevant in MAGs from B. longum. These characteristic profiles allowed bifidobacteria to be discriminated regardless of prebiotic exposure. Correlation analysis of glycosidase activities suggests strong associations between glycosidase families comprising HMO-degrading enzymes, which are often found in MAGs from the same species. The mathematical models proposed here may contribute to a better understanding of the carbohydrate metabolism of some common bifidobacteria species and could be extrapolated to other microorganisms of interest in future studies.


Using Supervised Machine Learning Algorithms for Automated Lithology Prediction from Wireline Log Data

10.2118/208559-ms ◽

2021 ◽

Author(s):

Marian Popescu ◽

Rebecca Head ◽

Tim Ferriday ◽

Kate Evans ◽

Jose Montero ◽

...

Keyword(s):

Machine Learning ◽

Learning Algorithm ◽

Machine Learning Algorithms ◽

Supervised Machine Learning ◽

Training Dataset ◽

Depth Interval ◽

Log Data ◽

Machine Learning Approach ◽

Lithology Prediction ◽

Logging While Drilling

Abstract This paper presents advancements in machine learning and cloud deployment that enable rapid and accurate automated lithology interpretation. A supervised machine learning technique is described that enables rapid, consistent, and accurate lithology prediction alongside quantitative uncertainty from large wireline or logging-while-drilling (LWD) datasets. To leverage supervised machine learning, a team of geoscientists and petrophysicists made detailed lithology interpretations of wells to generate a comprehensive training dataset. Lithology interpretations were based on deterministic cross-plotting that combined various raw logs. This training dataset was used to develop a model and test a machine learning pipeline. The pipeline was applied to a dataset previously unseen by the algorithm, to predict lithology. A quality checking process was performed by a petrophysicist to validate new predictions delivered by the pipeline against human interpretations. Confidence in the interpretations was assessed in two ways. First, the prior probability was calculated as a measure of confidence that the input data are recognized by the model. Second, the posterior probability was calculated, which quantifies the likelihood that a specified depth interval comprises a given lithology. The supervised machine learning algorithm ensured that the wells were interpreted consistently by removing interpreter biases and inconsistencies. The scalability of cloud computing enabled a large log dataset to be interpreted rapidly; >100 wells were interpreted consistently in five minutes, yielding >70% lithological match to the human petrophysical interpretation. Supervised machine learning methods have strong potential for classifying lithology from log data because: 1) they can automatically define complex, non-parametric, multi-variate relationships across several input logs; and 2) they allow classifications to be quantified confidently.
Furthermore, this approach captured the knowledge and nuances of an interpreter's decisions by training the algorithm using human-interpreted labels. In the hydrocarbon industry, the quantity of generated data is predicted to increase by >300% between 2018 and 2023 (IDC, Worldwide Global DataSphere Forecast, 2019–2023). Additionally, the industry holds vast legacy data. This supervised machine learning approach can unlock the potential of some of these datasets by providing consistent lithology interpretations rapidly, allowing resources to be used more effectively.


Employing a Machine Learning Approach to Detect Combined Internet of Things Attacks against Two Objective Functions Using a Novel Dataset

Security and Communication Networks ◽

10.1155/2020/2804291 ◽

2020 ◽

Vol 2020 ◽

pp. 1-17

Author(s):

John Foley ◽

Naghmeh Moradpoor ◽

Henry Ochenyi

Keyword(s):

Machine Learning ◽

Low Power ◽

Vulnerability Analysis ◽

Machine Learning Algorithms ◽

Learning Approach ◽

Lossy Networks ◽

Global Networks ◽

Machine Learning Approach ◽

Iot Devices ◽

Network Metrics

One of the important features of the routing protocol for low-power and lossy networks (RPL) is its objective function (OF). The OF influences an IoT network in terms of routing strategies and network topology. Detecting a combination of attacks against OFs is a cutting-edge technology that will become a necessity as next-generation low-power wireless networks continue to be exploited as they grow rapidly. However, the current literature lacks studies on the vulnerability analysis of OFs, particularly in terms of combined attacks. Furthermore, machine learning is a promising solution for the global networks of IoT devices in terms of analysing their ever-growing generated data and predicting cyberattacks against such devices. Therefore, in this paper, we study the vulnerability analysis of two popular OFs of RPL to detect combined attacks against them using machine learning algorithms through different simulated scenarios. For this, we created a novel IoT dataset based on power and network metrics, which is deployed as part of an RPL IDS/IPS solution to enhance information security. Based on the captured results, our machine learning approach is successful in detecting combined attacks against two popular OFs of RPL based on the power and network metrics, with the MLP and RF algorithms being the most successful classifiers for single and ensemble models, respectively.


An Entropy-Based Machine Learning Algorithm for Combining Macroeconomic Forecasts

Entropy ◽

10.3390/e21101015 ◽

2019 ◽

Vol 21 (10) ◽

pp. 1015 ◽

Cited By ~ 3

Author(s):

Carles Bretó ◽

Priscila Espinosa ◽

Penélope Hernández ◽

Jose M. Pavía

Keyword(s):

Machine Learning ◽

Gross Domestic Product ◽

Maximum Entropy ◽

Simulation Study ◽

Learning Algorithm ◽

Predictive Ability ◽

Learning Approach ◽

Machine Learning Algorithm ◽

Machine Learning Approach ◽

Maximum Entropy Inference

This paper applies a Machine Learning approach with the aim of providing a single aggregated prediction from a set of individual predictions. Departing from the well-known maximum-entropy inference methodology, a new factor capturing the distance between the true and the estimated aggregated predictions presents a new problem. Algorithms such as ridge, lasso or elastic net help in finding a new methodology to tackle this issue. We carry out a simulation study to evaluate the performance of such a procedure and apply it in order to forecast and measure predictive ability using a dataset of predictions on Spanish gross domestic product.


Reconstructive derivational analogy: A machine learning approach to automating redesign

Artificial intelligence for engineering design analysis and manufacturing ◽

10.1017/s0890060400001359 ◽

1996 ◽

Vol 10 (2) ◽

pp. 115-126 ◽

Cited By ~ 3

Author(s):

B.D. Britt ◽

T. Glagowski

Keyword(s):

Machine Learning ◽

Learning Algorithm ◽

Design Tool ◽

Design Reuse ◽

Learning Approach ◽

Machine Learning Algorithm ◽

Machine Learning Approach ◽

Design Plan ◽

Primary Focus ◽

Different Parts

Abstract This paper describes current research toward automating the redesign process. In redesign, a working design is altered to meet new problem specifications. This process is complicated by interactions between different parts of the design, and many researchers have addressed these issues. An overview is given of a large design tool under development, the Circuit Designer's Apprentice. This tool integrates various techniques for reengineering existing circuits so that they meet new circuit requirements. The primary focus of the paper is one particular technique being used to reengineer circuits when they cannot be transformed to meet the new problem requirements. In these cases, a design plan is automatically generated for the circuit, and then replayed to solve all or part of the new problem. This technique is based upon the derivational analogy approach to design reuse. Derivational Analogy is a machine learning algorithm in which a design plan is saved at the time of design so that it can be replayed on a new design problem. Because design plans were not saved for the circuits available to the Circuit Designer's Apprentice, an algorithm was developed that automatically reconstructs a design plan for any circuit. This algorithm, Reconstructive Derivational Analogy, is described in detail, including a quantitative analysis of the implementation of this algorithm.


An Analytical Model for Prediction of Heart Disease using Machine Learning Classifiers

10.36227/techrxiv.14867175 ◽

2021 ◽

Author(s):

Diti Roy ◽

Md. Ashiq Mahmood ◽

Tamal Joyti Roy

Keyword(s):

Machine Learning ◽

Heart Disease ◽

Random Forest ◽

Learning Algorithm ◽

Modern Technology ◽

Learning Approach ◽

Data Sets ◽

Machine Learning Classifiers ◽

Machine Learning Approach ◽

Day By Day

Heart disease is the most dominant disease, claiming a large number of lives every year. A 2016 WHO report stated that at least 17 million people die of heart disease every year. This number is steadily increasing, and WHO estimated that the death toll will reach 75 million by 2030. Despite modern technology and health care systems, predicting heart disease remains difficult. Because machine learning algorithms are a valuable tool for making predictions from available data sets, we used a machine learning approach to predict heart disease. We collected data from the UCI repository. In our study, we used the Random Forest, ZeroR, Voted Perceptron, and K* (KStar) classifiers. The best result was obtained with the Random Forest classifier, with an accuracy of 97.69.
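Of the classifiers listed above, ZeroR is the simplest: it ignores every feature and always predicts the majority class seen in training, so it is usually read as a baseline that the other classifiers should beat. A minimal sketch of that idea follows. The function names and toy labels are hypothetical, and this is not the authors' setup or the Weka implementation.

```python
from collections import Counter


def zero_r_fit(train_labels):
    """Return the majority class of the training labels (the whole 'model')."""
    return Counter(train_labels).most_common(1)[0][0]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Hypothetical toy labels: 1 = disease present, 0 = absent.
train_labels = [1, 1, 1, 0]
test_labels = [1, 0, 1, 1]

majority = zero_r_fit(train_labels)
baseline_accuracy = accuracy(test_labels, [majority] * len(test_labels))
```

On imbalanced data such a baseline can already score high accuracy, which is why an accuracy figure is easier to interpret when it is reported next to the baseline.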


Sentiment Analysis Using Hybrid Approach

International Journal for Research in Applied Science and Engineering Technology ◽

10.22214/ijraset.2021.39202 ◽

2021 ◽

Vol 9 (12) ◽

pp. 282-285

Author(s):

Ganesh K. Shinde

Keyword(s):

Machine Learning ◽

Sentiment Analysis ◽

Learning Algorithm ◽

Hybrid Approach ◽

Online Review ◽

Second Step ◽

Learning Approach ◽

Machine Learning Classifiers ◽

Machine Learning Approach ◽

Mining Methods

Abstract: The most important part of information gathering is to focus on how people think. Many opinion resources, such as online review sites and personal blogs, are available. In this paper, we focus on Twitter, which allows users to express their opinions on a variety of entities. We performed sentiment analysis on tweets using text mining methods, namely lexicon-based and machine learning approaches. We performed sentiment analysis in two steps: first, by searching for polarity words in the pool of words predefined in a lexicon dictionary, and second, by training the machine learning algorithm using the polarities assigned in the first step. Keywords: Sentiment analysis, Social Media, Twitter, Lexicon Dictionary, Machine Learning Classifiers, SVM.
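The two-step hybrid procedure described above can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: the lexicon and tweets are hypothetical toy data, and a simple word-count classifier stands in for the SVM named in the keywords.

```python
from collections import Counter

# Hypothetical toy lexicon; a real lexicon dictionary would be far larger.
LEXICON = {"good": 1, "great": 1, "love": 1, "bad": -1, "awful": -1, "hate": -1}


def lexicon_label(text):
    """Step 1: label a text by summing the polarity of its lexicon words."""
    score = sum(LEXICON.get(word, 0) for word in text.lower().split())
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def train_word_counts(texts, labels):
    """Step 2: learn per-label word counts from the lexicon-labelled texts."""
    counts = {}
    for text, label in zip(texts, labels):
        counts.setdefault(label, Counter()).update(text.lower().split())
    return counts


def predict(counts, text):
    """Pick the label whose training texts share the most words with `text`."""
    words = text.lower().split()
    scores = {label: sum(c[w] for w in words) for label, c in counts.items()}
    return max(scores, key=scores.get)


tweets = [
    "love this phone great battery",
    "awful service hate the delays",
    "great camera",
    "bad screen",
]
labels = [lexicon_label(t) for t in tweets]
model = train_word_counts(tweets, labels)
```

The point of the second step is that the trained model can score words that are absent from the lexicon: here `predict(model, "battery and camera")` relies only on words that co-occurred with lexicon words in the training tweets.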


Phenotyping Cardiogenic Shock

Journal of the American Heart Association ◽

10.1161/jaha.120.020085 ◽

2021 ◽

Author(s):

Elric Zweck ◽

Katherine L. Thayer ◽

Ole K. L. Helgestad ◽

Manreet Kanwar ◽

Mohyee Ayouty ◽

...

Keyword(s):

Machine Learning ◽

Hospital Mortality ◽

Cardiogenic Shock ◽

Treatment Strategies ◽

Machine Learning Algorithms ◽

Learning Approach ◽

Tailored Treatment ◽

Machine Learning Approach ◽

Clinical Profiles ◽

Patient Enrollment

Background Cardiogenic shock (CS) is a heterogeneous syndrome with varied presentations and outcomes. We used a machine learning approach to test the hypothesis that patients with CS have distinct phenotypes at presentation, which are associated with unique clinical profiles and in‐hospital mortality. Methods and Results We analyzed data from 1959 patients with CS from 2 international cohorts: CSWG (Cardiogenic Shock Working Group Registry) (myocardial infarction [CSWG‐MI; n=410] and acute‐on‐chronic heart failure [CSWG‐HF; n=480]) and the DRR (Danish Retroshock MI Registry) (n=1069). Clusters of patients with CS were identified in CSWG‐MI using the consensus k‐means algorithm and subsequently validated in CSWG‐HF and DRR. Patients in each phenotype were further categorized by their Society of Cardiovascular Angiography and Interventions staging. The machine learning algorithms revealed 3 distinct clusters in CS: "non‐congested (I)," "cardiorenal (II)," and "cardiometabolic (III)" shock. Among the 3 cohorts (CSWG‐MI versus DRR versus CSWG‐HF), in‐hospital mortality was 21% versus 28% versus 10%, 45% versus 40% versus 32%, and 55% versus 56% versus 52% for clusters I, II, and III, respectively. The "cardiometabolic shock" cluster had the highest risk of developing stage D or E shock as well as in‐hospital mortality among the phenotypes, regardless of cause. Despite baseline differences, each cluster showed reproducible demographic, metabolic, and hemodynamic profiles across the 3 cohorts. Conclusions Using machine learning, we identified and validated 3 distinct CS phenotypes, with specific and reproducible associations with mortality. These phenotypes may allow for targeted patient enrollment in clinical trials and foster development of tailored treatment strategies in subsets of patients with CS.


A machine learning approach to predict ethnicity using personal name and census location in Canada

PLoS ONE ◽

10.1371/journal.pone.0241239 ◽

2020 ◽

Vol 15 (11) ◽

pp. e0241239

Author(s):

Kai On Wong ◽

Osmar R. Zaïane ◽

Faith G. Davis ◽

Yutaka Yasui

Keyword(s):

Machine Learning ◽

First Nations ◽

Predictive Value ◽

Large Scale ◽

Performance Metrics ◽

Characteristic Curve ◽

Machine Learning Algorithms ◽

Support Vector ◽

Learning Approach ◽

Machine Learning Approach

Background Canada is an ethnically diverse country, yet the lack of ethnicity information in many large databases impedes effective population research and interventions. Automated ethnicity classification using machine learning has shown potential to address this data gap, but its performance in Canada is largely unknown. This study developed a large-scale machine learning framework to predict ethnicity using a novel set of name and census location features. Methods Using the 1901 census, multiclass and binary classification machine learning pipelines were developed. The 13 ethnic categories examined were Aboriginal (First Nations, Métis, Inuit, and all-combined), Chinese, English, French, Irish, Italian, Japanese, Russian, Scottish, and others. Machine learning algorithms included regularized logistic regression, C-support vector, and naïve Bayes classifiers. Name features consisted of the entire name string, substrings, double-metaphones, and various name-entity patterns, while location features consisted of the entire location string and substrings of province, district, and subdistrict. Predictive performance metrics included sensitivity, specificity, positive predictive value, negative predictive value, F1, area under the receiver operating characteristic curve, and accuracy. Results The census had 4,812,958 unique individuals. For multiclass classification, the highest performance achieved was 76% F1 and 91% accuracy. For binary classifications for Chinese, French, Italian, Japanese, Russian, and others, the F1 ranged from 68% to 95% (median 87%). The lower performance for English, Irish, and Scottish (F1 ranged from 63% to 67%) was likely due to their shared cultural and linguistic heritage. Adding census location features to the name-based models strongly improved the prediction in Aboriginal classification (F1 increased from 50% to 84%).
Conclusions The automated machine learning approach using only name and census location features can predict the ethnicity of Canadians with varying performance by specific ethnic categories.
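Two ingredients mentioned in the Methods and Results above, substring features of a name and the F1 score, can be sketched briefly. This is a hedged illustration only: the padding markers, the n-gram range, and the function names are assumptions, not the feature definitions used in the study.

```python
def char_ngrams(text, n_min=2, n_max=3):
    """Return character substrings (n-grams) of a name, with boundary markers.

    '^' and '$' are hypothetical start/end markers so that prefixes and
    suffixes become distinct features.
    """
    padded = f"^{text.lower()}$"
    return [
        padded[i:i + n]
        for n in range(n_min, n_max + 1)
        for i in range(len(padded) - n + 1)
    ]


def f1_score(tp, fp, fn):
    """F1 as the harmonic mean of precision and recall, from raw counts."""
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else float("nan")


features = char_ngrams("Li")
toy_f1 = f1_score(tp=8, fp=2, fn=2)  # hypothetical counts
```

Substring features of this kind let a linear classifier pick up on name fragments that are frequent in one group, which is one plausible reason names with shared linguistic roots are harder to separate.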
