Schoolchildren’ Depression and Anxiety Prediction Using Machine Learning Algorithms (Preprint)

2021 ◽  
Author(s):  
Radwan Qasrawi ◽  
Stephanny Vicuna Polo ◽  
Diala Abu Al-Halawah ◽  
Sameh Hallaq ◽  
Ziad Abdeen

BACKGROUND: Depression and anxiety symptoms in early childhood have a major effect on children's mental health and cognitive development. The effect of mental health problems on cognitive development has attracted researchers' attention over the last two decades. OBJECTIVE: In this paper, we seek to use machine learning techniques to predict the risk factors associated with schoolchildren's depression and anxiety. METHODS: The study data covered 5685 students in grades 5-9, aged 10-17 years, studying at public and refugee schools in the West Bank. The data were collected using the health behaviors school children questionnaire in the 2012-2013 academic year and analyzed using machine learning to predict the risk factors associated with student mental health symptoms. Five machine learning techniques (Random Forest, Neural Network, Decision Tree, Support Vector Machine, and Naïve Bayes) were used for the prediction. RESULTS: The Random Forest model had the highest accuracy (72.6% for depression and 68.5% for anxiety) and thus performed best in classifying and predicting students' depression and anxiety. School violence and bullying, home violence, academic performance, and family income were the most important factors affecting the depression and anxiety scales. CONCLUSIONS: Overall, machine learning proved to be an efficient tool for identifying and predicting the factors that influence student depression and anxiety. Deploying machine learning within school information systems might facilitate the development of health prevention and intervention programs that enhance students’ mental health and cognitive development.

RSC Advances ◽  
2014 ◽  
Vol 4 (106) ◽  
pp. 61624-61630 ◽  
Author(s):  
N. S. Hari Narayana Moorthy ◽  
Silvia A. Martins ◽  
Sergio F. Sousa ◽  
Maria J. Ramos ◽  
Pedro A. Fernandes

Classification models to predict the solvation free energies of organic molecules were developed using decision tree, random forest and support vector machine approaches and with MACCS fingerprints, MOE and PaDEL descriptors.


2021 ◽  
Vol 12 ◽  
Author(s):  
Santu Rana ◽  
Wei Luo ◽  
Truyen Tran ◽  
Svetha Venkatesh ◽  
Paul Talman ◽  
...  

Aim: To use available electronic administrative records to identify data reliability, predict discharge destination, and identify risk factors associated with specific outcomes following hospital admission with stroke, compared to stroke specific clinical factors, using machine learning techniques. Method: The study included 2,531 patients having at least one admission with a confirmed diagnosis of stroke, collected from a regional hospital in Australia within 2009–2013. Using machine learning (penalized regression with Lasso) techniques, patients having their index admission between June 2009 and July 2012 were used to derive predictive models, and patients having their index admission between July 2012 and June 2013 were used for validation. Three different stroke types [intracerebral hemorrhage (ICH), ischemic stroke, transient ischemic attack (TIA)] were considered and five different comparison outcome settings were considered. Our electronic administrative record based predictive model was compared with a predictive model composed of “baseline” clinical features, more specific for stroke, such as age, gender, smoking habits, co-morbidities (high cholesterol, hypertension, atrial fibrillation, and ischemic heart disease), types of imaging done (CT scan, MRI, etc.), and occurrence of in-hospital pneumonia. Risk factors associated with likelihood of negative outcomes were identified. Results: The data was highly reliable at predicting discharge to rehabilitation and all other outcomes vs. death for ICH (AUC 0.85 and 0.825, respectively), all discharge outcomes except home vs. rehabilitation for ischemic stroke, and discharge home vs. others and home vs. rehabilitation for TIA (AUC 0.948 and 0.873, respectively). Electronic health record data appeared to provide improved prediction of outcomes over stroke specific clinical factors from the machine learning models. Common risk factors associated with a negative impact on expected outcomes appeared clinically intuitive, and included older age groups, prior ventilatory support, urinary incontinence, need for imaging, and need for allied health input. Conclusion: Electronic administrative records from this cohort produced reliable outcome prediction and identified clinically appropriate factors negatively impacting most outcome variables following hospital admission with stroke. This presents a means of future identification of modifiable factors associated with patient discharge destination. This may potentially aid in patient selection for certain interventions and aid in better patient and clinician education regarding expected discharge outcomes.
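The derivation/validation design and the AUC figures in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' code: the record fields (`admitted`, `outcome`, `score`), the toy values, and the exact cutoff of 1 July 2012 are assumptions made for illustration, and the scores stand in for the output of whatever model was fitted on the derivation set.

```python
from datetime import date

def temporal_split(records, cutoff):
    """Split records into (derivation, validation) by index admission date."""
    derivation = [r for r in records if r["admitted"] < cutoff]
    validation = [r for r in records if r["admitted"] >= cutoff]
    return derivation, validation

def auc(labels, scores):
    """AUC as the chance that a random positive case scores above a
    random negative case; tied scores count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical toy records; "score" is an assumed model output, not real data.
records = [
    {"admitted": date(2010, 3, 1), "outcome": 1, "score": 0.9},
    {"admitted": date(2011, 8, 15), "outcome": 0, "score": 0.2},
    {"admitted": date(2012, 9, 1), "outcome": 1, "score": 0.8},
    {"admitted": date(2012, 11, 20), "outcome": 0, "score": 0.3},
    {"admitted": date(2013, 2, 5), "outcome": 1, "score": 0.4},
    {"admitted": date(2013, 5, 30), "outcome": 0, "score": 0.6},
]

# Assumed cutoff: the abstract gives July 2012 as the boundary month only.
derivation, validation = temporal_split(records, date(2012, 7, 1))
validation_auc = auc(
    [r["outcome"] for r in validation],
    [r["score"] for r in validation],
)
```

Splitting by admission date rather than at random keeps later patients out of model fitting, so the validation AUC reflects performance on admissions the model has not seen.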


Analysis of credit scoring is an effective credit risk assessment technique, which is one of the major research fields in the banking sector. Machine learning has a variety of applications in the banking sector and it has been widely used for data analysis. Modern techniques such as machine learning have provided a self-regulating process to analyze the data using classification techniques. The classification method is a supervised learning process in which the computer learns from the input data provided and makes use of this information to classify the new dataset. This research paper presents a comparison of various machine learning techniques used to evaluate credit risk. Different machine learning algorithms are trained on the dataset to decide whether a credit transaction should be accepted or rejected. The techniques are implemented on the German credit dataset taken from the UCI repository, which has 1000 instances and 21 attributes, depending on which the transactions are either accepted or rejected. This paper compares algorithms such as Support Vector Network, Neural Network, Logistic Regression, Naive Bayes, Random Forest, and Classification and Regression Trees (CART), and the results obtained show that the Random Forest algorithm was able to predict credit risk with higher accuracy.


2021 ◽  
Author(s):  
Nisha Agnihotri

Bipolar disorder, a complex brain disorder, has affected many millions of people around the world. It is identified by oscillations in the patient’s mood, which swings between two states, depression and mania, as a result of different psychological and physical features. A set of psycholinguistic features such as behavioral changes, mood swings and mental illness are observed to provide feedback on health and wellness. The study is an objective measure of identifying the stress level of the human brain that could considerably reduce the harmful effects associated with it. In the paper, we present a study on predicting the symptoms and behavior of a commonly known mental health illness, bipolar disorder, using machine learning techniques. To that end, data extracted from articles and research papers were studied and analyzed using statistical analysis tools and machine learning (ML) techniques. Data is visualized to extract and communicate meaningful information from complex datasets on predicting and optimizing various day-to-day analyses. The study also covers research papers in which machine learning algorithms and classifiers such as Decision Trees, Random Forest, Support Vector Machine, Naïve Bayes, Logistic Regression and K-Nearest Neighbor are studied and analyzed for identifying the mental state in a target group. The purpose of the paper is mainly to explore the challenges, adequacy and limitations in detecting mental health conditions using machine learning techniques.


2021 ◽  
Author(s):  
Rakesh Kumar Saroj ◽  
Pawan Kumar Yadav ◽  
Rajneesh Singh ◽  
Obvious Nchimunya Chilyabanyama

Background: The death rate of under-five children in India has declined over the last few decades, but a few of the larger states still perform poorly. This is a matter of serious concern for child health as well as social development. Machine learning techniques now play a crucial role in smart health care systems by capturing hidden factors and patterns in outcomes. In this paper, we used machine learning techniques to predict the important factors of under-five mortality. This study aims to explore the value of machine learning techniques for predicting under-five mortality and to find the important factors that cause it. The data were taken from the National Family Health Survey-IV of Uttar Pradesh. We used four machine learning techniques (decision tree, support vector machine, random forest, and logistic regression) to predict under-five mortality factors and compared the accuracy of each model. We also used information gain to rank the variables that matter most for accurate predictions in under-five mortality data. Result: Random Forest (RF) predicted the child mortality factors with the highest accuracy, 97.5%. The number of living children, births in the last five years, educational level, birth order, total children ever born, currently breastfeeding, and size of child at birth were identified as essential factors for under-five mortality. Conclusion: The study focuses on machine learning techniques to predict and identify important factors for under-five mortality. The random forest model provides an excellent predictive result for estimating the risk factors of under-five mortality. Based on these results, policymakers can make policies and plans to reduce under-five mortality.
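Ranking variables by information gain, as mentioned in the abstract above, can be sketched in a few lines of Python. This is an illustrative version for categorical features only; the feature names (`feature_a`, `feature_b`), the `outcome` column, and the four toy rows are made up and are not survey data.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def information_gain(rows, feature, target):
    """Reduction in target entropy after splitting rows on one categorical feature."""
    base = entropy([r[target] for r in rows])
    groups = {}
    for r in rows:
        groups.setdefault(r[feature], []).append(r[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder

# Hypothetical toy rows: feature_a separates the outcome perfectly,
# feature_b carries no information about it.
rows = [
    {"feature_a": "yes", "feature_b": "1", "outcome": 0},
    {"feature_a": "yes", "feature_b": "2", "outcome": 0},
    {"feature_a": "no", "feature_b": "1", "outcome": 1},
    {"feature_a": "no", "feature_b": "2", "outcome": 1},
]

gain_a = information_gain(rows, "feature_a", "outcome")
gain_b = information_gain(rows, "feature_b", "outcome")

# Rank features from most to least informative.
ranking = sorted(
    ["feature_a", "feature_b"],
    key=lambda f: information_gain(rows, f, "outcome"),
    reverse=True,
)
```

A feature with high information gain splits the data into groups that are each dominated by one outcome, which is why it ranks as more useful for prediction.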


Symmetry ◽  
2021 ◽  
Vol 13 (3) ◽  
pp. 403
Author(s):  
Muhammad Waleed ◽  
Tai-Won Um ◽  
Tariq Kamal ◽  
Syed Muhammad Usman

In this paper, we apply multi-class supervised machine learning techniques to classify agricultural farm machinery. The classification of farm machinery is important when performing the automatic authentication of field activity in a remote setup. In the absence of a sound machine recognition system, there is every possibility of fraudulent activity taking place. To address this need, we classify the machinery using five machine learning techniques: K-Nearest Neighbor (KNN), Support Vector Machine (SVM), Decision Tree (DT), Random Forest (RF) and Gradient Boosting (GB). To train the models, we use the vibration and tilt of the machinery, recorded using accelerometer and gyroscope sensors, respectively. The machinery included the leveler, rotavator and cultivator. The preliminary analysis of the collected data revealed that the farm machinery (when in operation) showed big variations in vibration and tilt, but similar means. Additionally, vibration-based and tilt-based classifications of farm machinery each show good accuracy when used alone (with vibration showing slightly better numbers than tilt). However, the accuracies improve further when tilt and vibration are used together. Furthermore, all five machine learning algorithms used for classification have an accuracy of more than 82%, but random forest was the best performing. Gradient boosting and random forest show slight over-fitting (about 9%), but both algorithms produce high testing accuracy. In terms of execution time, the decision tree takes the least time to train, while gradient boosting takes the most.


2021 ◽  
Vol 11 (5) ◽  
pp. 343
Author(s):  
Fabiana Tezza ◽  
Giulia Lorenzoni ◽  
Danila Azzolina ◽  
Sofia Barbar ◽  
Lucia Anna Carmela Leone ◽  
...  

The present work aims to identify the predictors of COVID-19 in-hospital mortality by testing a set of Machine Learning Techniques (MLTs) and comparing their ability to predict the outcome of interest. The model with the best performance will be used to identify in-hospital mortality predictors and to build an in-hospital mortality prediction tool. The study involved patients with COVID-19, confirmed by PCR test, admitted to the “Ospedali Riuniti Padova Sud” COVID-19 referral center in the Veneto region, Italy. The algorithms considered were the Recursive Partition Tree (RPART), the Support Vector Machine (SVM), the Gradient Boosting Machine (GBM), and Random Forest. The resampled performances were reported for each MLT, considering the sensitivity, specificity, and the Receiver Operating Characteristic (ROC) curve measures. The study enrolled 341 patients. The median age was 74 years, and the male gender was the most prevalent. The Random Forest algorithm outperformed the other MLTs in predicting in-hospital mortality, with a ROC of 0.84 (95% C.I. 0.78–0.9). Age, together with vital signs (oxygen saturation and the quick SOFA) and lab parameters (creatinine, AST, lymphocytes, platelets, and hemoglobin), were found to be the strongest predictors of in-hospital mortality. The present work provides insights for the prediction of in-hospital mortality of COVID-19 patients using a machine-learning algorithm.
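Sensitivity and specificity, two of the measures reported in the abstract above, follow directly from the confusion matrix. The Python sketch below shows the arithmetic only; the labels and predictions are invented toy values, with 1 assumed to mean in-hospital death and 0 survival.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels coded 1/0."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    sensitivity = tp / (tp + fn)  # share of actual positives that were caught
    specificity = tn / (tn + fp)  # share of actual negatives that were cleared
    return sensitivity, specificity

# Hypothetical toy labels and predictions, not study data.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 1, 0]
sens, spec = sensitivity_specificity(y_true, y_pred)
```

Reporting both values matters when one outcome is much rarer than the other, because a model can reach high overall accuracy while missing most of the rare cases.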


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Tom Elliot ◽  
Robert Morse ◽  
Duane Smythe ◽  
Ashley Norris

It is 50 years since Sieveking et al. published their pioneering research in Nature on the geochemical analysis of artefacts from Neolithic flint mines in southern Britain. In the decades since, geochemical techniques to source stone artefacts have flourished globally, with a renaissance in recent years from new instrumentation, data analysis, and machine learning techniques. Despite the interest in these latter approaches, there has been variation in the quality with which these methods have been applied. Using the case study of flint artefacts and geological samples from England, we present a robust and objective evaluation of three popular techniques, Random Forest, K-Nearest-Neighbour, and Support Vector Machines, and present a pipeline for their appropriate use. When evaluated correctly, the results establish high model classification performance, with Random Forest leading with an average accuracy of 85% (measured through F1 Scores), and with Support Vector Machines following closely. The methodology developed in this paper demonstrates the potential to significantly improve on previous approaches, particularly in removing bias, and providing greater means of evaluation than previously utilised.
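The abstract above reports performance through F1 scores but does not say how they were averaged across classes. One common choice, assumed here purely for illustration, is the macro average: compute F1 per class and take the unweighted mean. The source labels (`site_a`, `site_b`, `site_c`) and predictions below are made up.

```python
def f1_for_class(y_true, y_pred, label):
    """F1 score for one class, treating it as positive and all others as negative."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    if tp == 0:
        return 0.0  # no correct predictions for this class
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over the classes present in y_true."""
    labels = sorted(set(y_true))
    return sum(f1_for_class(y_true, y_pred, lab) for lab in labels) / len(labels)

# Hypothetical geological source labels and model predictions.
y_true = ["site_a", "site_a", "site_b", "site_b", "site_c", "site_c"]
y_pred = ["site_a", "site_b", "site_b", "site_b", "site_c", "site_a"]
score = macro_f1(y_true, y_pred)
```

Macro averaging gives each source equal weight regardless of how many samples it has, which can differ noticeably from plain accuracy when classes are imbalanced.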


2021 ◽  
Vol 2021 ◽  
pp. 1-13
Author(s):  
Ali Soleymani ◽  
Fatemeh Arabgol

In today’s security landscape, advanced threats are becoming increasingly difficult to detect as the pattern of attacks expands. Classical approaches that rely heavily on static matching, such as blacklisting or regular expression patterns, may lack the flexibility or certainty needed to detect malicious data within system data. This is where machine learning techniques can show their value and provide new insights and higher detection rates. This research investigated the behavior of botnets that use domain-flux techniques to hide command and control channels. It also describes the machine learning algorithm and text mining used to analyze the network DNS protocol and identify botnets. For this purpose, extracted and labeled domain name datasets containing healthy and infected DGA botnet data were used. Data preprocessing techniques based on a text-mining approach were applied to explore domain name strings with n-gram analysis and PCA. The model's performance is improved by extracting statistical features through principal component analysis. The performance of the proposed model has been evaluated using different machine learning classifiers such as decision tree, support vector machine, random forest, and logistic regression. Experimental results show that the random forest algorithm can be used effectively in botnet detection and has the best botnet detection accuracy.
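The n-gram step described in the abstract above can be pictured as sliding a fixed-size window over each domain name and counting the resulting character sequences. The Python sketch below is only an illustration: it assumes character bigrams over the first label of the domain (a simplification), uses made-up domain names, and leaves out the PCA and classification stages.

```python
from collections import Counter

def char_ngrams(domain, n=2):
    """Character n-grams of the first label of a domain name (lowercased)."""
    label = domain.lower().split(".")[0]  # simplification: ignore later labels and the TLD
    return [label[i:i + n] for i in range(len(label) - n + 1)]

def ngram_counts(domains, n=2):
    """Aggregate n-gram frequencies over a collection of domain names."""
    counts = Counter()
    for d in domains:
        counts.update(char_ngrams(d, n))
    return counts

# Hypothetical domain names: one dictionary-like, one random-looking.
bigrams = char_ngrams("example.com")
counts = ngram_counts(["example.com", "xkqzvexa.net"])
```

Algorithmically generated names tend to contain character combinations that are rare in human-chosen names, so frequency vectors built this way give a classifier something to separate the two groups on.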

