The Role of Machine Learning in Diagnosing Bipolar Disorder: Scoping Review (Preprint)

BACKGROUND Bipolar disorder (BD) is the tenth common cause of frailty in young individuals and has triggered morbidity and mortality worldwide. BD patients have 9–17 years lower lifetime as compared to the normal population. It is a predominant mental disorder but misdiagnosed as depressive disorder that leads to difficulties in the treatment of affected patients. 60% of patients with bipolar disorder are looking for the treatment of depression. However, machine learning provides advanced skills and techniques for the better diagnosis of bipolar disorder. OBJECTIVE This review aims to explore the machine learning algorithms for the detection and diagnosis of bipolar disorder and its subtypes. METHODS The study protocol adapts PRISMA extension guidelines. It explores three databases, which were Google scholar, ScienceDirect, and PubMed. To enhance the search, we performed backward screening of all the references of the included studies. Based on the predefined selection criteria, two levels of screening were carried out: the title and abstract review and the full review of the articles that met the inclusion criteria. Data extraction was performed independently by all investigators. To synthesize the extracted data, a narrative synthesis approach was followed. RESULTS 573 potential articles were retrieved from three databases. After pre-processing and screening, only 33 articles were identified, which met our inclusion criteria. The most commonly used data belonged to the clinical category (n=22, 66.66%). We identified 8 machine learning models used in the selected studies, Support-vector machines (n=9, 27%), Artificial neural network (n=4, 12.12%) , Linear regression (n=3, 0.9%) , Gaussian process model (n=2, 0.6%), Ensemble model (n=2, 0.6%) , Natural language processing (n=1, 0.3%), Probabilistic Methods (n=1, 0.3%), and Logistic regression (n=1, 0.35%). The most common data utilized was magnetic resonance imaging (MRI) for classifying bipolar patients compared to other groups (n=11, 34%) while the least common utilized data was microarray expression dataset and genomic data. The maximum ratio of accuracy was 98% while the minimum accuracy range was 64%. CONCLUSIONS This scoping review provides an overview of recent studies based on machine learning models used to diagnose bipolar disorder patients regardless of their demographics or if they were assessed compared to patients with psychiatric diagnoses. Further research can be conducted for clinical decision support in the health industry. CLINICALTRIAL Null

Download Full-text

Machine Learning Models for Finger Bend Evaluation using Implemented Low cost Flex Sensor

International Journal for Research in Applied Science and Engineering Technology ◽

10.22214/ijraset.2021.35742 ◽

2021 ◽

Vol 9 (VI) ◽

pp. 3605-3611

Author(s):

Pratyush Kaware

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Low Cost ◽

Learning Algorithms ◽

Cost Effective ◽

Machine Learning Algorithms ◽

Support Vector ◽

Learning Models ◽

Machine Learning Models

In this paper a cost-effective sensor has been implemented to read finger bend signals, by attaching the sensor to a finger, so as to classify them based on the degree of bent as well as the joint about which the finger was being bent. This was done by testing with various machine learning algorithms to get the most accurate and consistent classifier. Finally, we found that Support Vector Machine was the best algorithm suited to classify our data, using we were able predict live state of a finger, i.e., the degree of bent and the joints involved. The live voltage values from the sensor were transmitted using a NodeMCU micro-controller which were converted to digital and uploaded on a database for analysis.

Download Full-text

Validation of Machine Learning Models for Structural Dam Behaviour Interpretation and Prediction

Water ◽

10.3390/w13192717 ◽

2021 ◽

Vol 13 (19) ◽

pp. 2717

Author(s):

Juan Mata ◽

Fernando Salazar ◽

José Barateiro ◽

António Antunes

Keyword(s):

Machine Learning ◽

Machine Learning Algorithms ◽

Arch Dam ◽

Boosted Regression Trees ◽

Structural Safety ◽

Support Vector ◽

Learning Models ◽

Structural Behaviour ◽

Safety Control ◽

Machine Learning Models

The main aim of structural safety control is the multiple assessments of the expected dam behaviour based on models and the measurements and parameters that characterise the dam’s response and condition. In recent years, there is an increase in the use of data-based models for the analysis and interpretation of the structural behaviour of dams. Multiple Linear Regression is the conventional, widely used approach in dam engineering, although interesting results have been published based on machine learning algorithms such as artificial neural networks, support vector machines, random forest, and boosted regression trees. However, these models need to be carefully developed and properly assessed before their application in practice. This is even more relevant when an increase in users of machine learning models is expected. For this reason, this paper presents extensive work regarding the verification and validation of data-based models for the analysis and interpretation of observed dam’s behaviour. This is presented by means of the development of several machine learning models to interpret horizontal displacements in an arch dam in operation. Several validation techniques are applied, including historical data validation, sensitivity analysis, and predictive validation. The results are discussed and conclusions are drawn regarding the practical application of data-based models.

Download Full-text

A Comparative Analysis of Machine Learning Algorithms to Predict Alzheimer’s Disease

Journal of Healthcare Engineering ◽

10.1155/2021/9917919 ◽

2021 ◽

Vol 2021 ◽

pp. 1-12

Author(s):

Morshedul Bari Antor ◽

A. H. M. Shafayet Jamil ◽

Maliha Mamtaz ◽

Mohammad Monirujjaman Khan ◽

Sultan Aljahdali ◽

...

Keyword(s):

Machine Learning ◽

Alzheimer’S Disease ◽

Alzheimer's Disease ◽

Support Vector Machine ◽

Machine Learning Algorithms ◽

Learning System ◽

Fine Tuning ◽

Support Vector ◽

Learning Models ◽

Machine Learning Models

Alzheimer’s disease has been one of the major concerns recently. Around 45 million people are suffering from this disease. Alzheimer’s is a degenerative brain disease with an unspecified cause and pathogenesis which primarily affects older people. The main cause of Alzheimer’s disease is Dementia, which progressively damages the brain cells. People lost their thinking ability, reading ability, and many more from this disease. A machine learning system can reduce this problem by predicting the disease. The main aim is to recognize Dementia among various patients. This paper represents the result and analysis regarding detecting Dementia from various machine learning models. The Open Access Series of Imaging Studies (OASIS) dataset has been used for the development of the system. The dataset is small, but it has some significant values. The dataset has been analyzed and applied in several machine learning models. Support vector machine, logistic regression, decision tree, and random forest have been used for prediction. First, the system has been run without fine-tuning and then with fine-tuning. Comparing the results, it is found that the support vector machine provides the best results among the models. It has the best accuracy in detecting Dementia among numerous patients. The system is simple and can easily help people by detecting Dementia among them.

Download Full-text

Detection of Online Fake News Using Blending Ensemble Learning

Scientific Programming ◽

10.1155/2021/3434458 ◽

2021 ◽

Vol 2021 ◽

pp. 1-10

Author(s):

Arvin Hansrajh ◽

Timothy T. Adeliyi ◽

Jeanette Wing

Keyword(s):

Machine Learning ◽

Performance Metrics ◽

Machine Learning Algorithms ◽

Stochastic Gradient Descent ◽

Support Vector ◽

Fake News ◽

Learning Models ◽

Linear Discriminant ◽

Proposed Model ◽

Machine Learning Models

The exponential growth in fake news and its inherent threat to democracy, public trust, and justice has escalated the necessity for fake news detection and mitigation. Detecting fake news is a complex challenge as it is intentionally written to mislead and hoodwink. Humans are not good at identifying fake news. The detection of fake news by humans is reported to be at a rate of 54% and an additional 4% is reported in the literature as being speculative. The significance of fighting fake news is exemplified during the present pandemic. Consequently, social networks are ramping up the usage of detection tools and educating the public in recognising fake news. In the literature, it was observed that several machine learning algorithms have been applied to the detection of fake news with limited and mixed success. However, several advanced machine learning models are not being applied, although recent studies are demonstrating the efﬁcacy of the ensemble machine learning approach; hence, the purpose of this study is to assist in the automated detection of fake news. An ensemble approach is adopted to help resolve the identified gap. This study proposed a blended machine learning ensemble model developed from logistic regression, support vector machine, linear discriminant analysis, stochastic gradient descent, and ridge regression, which is then used on a publicly available dataset to predict if a news report is true or not. The proposed model will be appraised with the popular classical machine learning models, while performance metrics such as AUC, ROC, recall, accuracy, precision, and f1-score will be used to measure the performance of the proposed model. Results presented showed that the proposed model outperformed other popular classical machine learning models.

Download Full-text

MACHINE LEARNING FOR SEISMIC-INDUCED DAMAGE ESTIMATION OF STEEL TANKS

Proceedings of International Structural Engineering and Construction ◽

10.14455/isec.2021.8(1).str-53 ◽

2021 ◽

Vol 8 (1) ◽

Author(s):

Laura Micheli ◽

Han Nguyen ◽

Mahmoud Faytarouni ◽

Lin-Ching Chang

Keyword(s):

Machine Learning ◽

Seismic Vulnerability ◽

Machine Learning Algorithms ◽

Support Vector ◽

Storage Tanks ◽

Learning Models ◽

Geometric Properties ◽

Liquid Storage ◽

Steel Tanks ◽

Machine Learning Models

Aboveground steel storage tanks are large vessels employed to store various liquids, including water, food, fertilizers, oil, and other hazardous chemicals. The damage and collapse of storage tanks can generate long-lasting consequences on built environment and communities. The seismic behavior of storage tanks is often evaluated employing fragility functions obtained simulating the tank under a variety of ground motion records. However, the computational demand of high-fidelity simulation models makes risk assessment a burdensome task. The use of data-driven surrogate models could represent a suitable solution to evaluate the seismic vulnerability of steel tanks rapidly. In this context, this paper presents an open dataset composed of 204 aboveground cylindrical steel liquid storage tanks with different geometric properties. The dataset was assembled based on past earthquake reconnaissance reports. In the dataset, different types of damage experienced by the steel tanks are divided into four classes, ranging from no damage to complete failure. Eight different machine learning algorithms are trained to predict the damage class of a steel tank as a function of its geometric properties and seismic excitation parameters. Stratified 5-fold cross-validation is used to split the dataset into training and testing subsets and to assess the prediction capability of the machine learning models. Results showed that the Support Vector Machine algorithm yielded the most accurate predictions, followed by Random Forest, XGBoost, and LightBoost. Overall, the paper demonstrated the feasibility of using machine learning models to predict the damage level of steel liquid storage tanks subjected to seismic hazard.

Download Full-text

Machine Learning Models for Spring Discharge Forecasting

Geofluids ◽

10.1155/2018/8328167 ◽

2018 ◽

Vol 2018 ◽

pp. 1-13 ◽

Cited By ~ 10

Author(s):

Francesco Granata ◽

Michele Saroli ◽

Giovanni de Marinis ◽

Rudy Gargano

Keyword(s):

Machine Learning ◽

Central Italy ◽

Regression Tree ◽

Machine Learning Algorithms ◽

Support Vector ◽

Spring Discharge ◽

Learning Models ◽

Promising Alternative ◽

Cross Correlation Analysis ◽

Machine Learning Models

Nowadays, drought phenomena increasingly affect large areas of the globe; therefore, the need for a careful and rational management of water resources is becoming more pressing. Considering that most of the world’s unfrozen freshwater reserves are stored in aquifers, the capability of prediction of spring discharges is a crucial issue. An approach based on water balance is often extremely complicated or ineffective. A promising alternative is represented by data-driven approaches. Recently, many hydraulic engineering problems have been addressed by means of advanced models derived from artificial intelligence studies. Three different machine learning algorithms were used for spring discharge forecasting in this comparative study: M5P regression tree, random forest, and support vector regression. The spring of Rasiglia Alzabove, Umbria, Central Italy, was selected as a case study. The machine learning models have proven to be able to provide very encouraging results. M5P provides good short-term predictions of monthly average flow rates (e.g., in predicting average discharge of the spring after 1 month, R2=0.991, RAE=14.97%, if a 4-month input is considered), while RF is able to provide accurate medium-term forecasts (e.g., in forecasting average discharge of the spring after 3 months, R2=0.964, RAE=43.12%, if a 4-month input is considered). As the time of forecasting advances, the models generally provide less accurate predictions. Moreover, the effectiveness of the models significantly depends on the duration of the period considered for input data. This duration should be close to the aquifer response time, approximately estimated by cross-correlation analysis.

Download Full-text

Performance of Statistical and Machine Learning-Based Methods for Predicting Biogeographical Patterns of Fungal Productivity in Forest Ecosystems

10.21203/rs.3.rs-122045/v1 ◽

2020 ◽

Author(s):

Albert Morera ◽

Juan Martínez de Aragón ◽

José Antonio Bonet ◽

Jingjing Liang ◽

Sergio de-Miguel

Keyword(s):

Machine Learning ◽

Random Forest ◽

Machine Learning Algorithms ◽

Gradient Boosting ◽

Support Vector ◽

Learning Approaches ◽

Learning Models ◽

Extreme Gradient Boosting ◽

Machine Learning Models ◽

Modelling Approaches

Abstract BackgroundThe prediction of biogeographical patterns from a large number of driving factors with complex interactions, correlations and non-linear dependences require advanced analytical methods and modelling tools. This study compares different statistical and machine learning models for predicting fungal productivity biogeographical patterns as a case study for the thorough assessment of the performance of alternative modelling approaches to provide accurate and ecologically-consistent predictions.MethodsWe evaluated and compared the performance of two statistical modelling techniques, namely, generalized linear mixed models and geographically weighted regression, and four machine learning models, namely, random forest, extreme gradient boosting, support vector machine and deep learning to predict fungal productivity. We used a systematic methodology based on substitution, random, spatial and climatic blocking combined with principal component analysis, together with an evaluation of the ecological consistency of spatially-explicit model predictions.ResultsFungal productivity predictions were sensitive to the modelling approach and complexity. Moreover, the importance assigned to different predictors varied between machine learning modelling approaches. Decision tree-based models increased prediction accuracy by ~7% compared to other machine learning approaches and by more than 25% compared to statistical ones, and resulted in higher ecological consistence at the landscape level.ConclusionsWhereas a large number of predictors are often used in machine learning algorithms, in this study we show that proper variable selection is crucial to create robust models for extrapolation in biophysically differentiated areas. When dealing with spatial-temporal data in the analysis of biogeographical patterns, climatic blocking is postulated as a highly informative technique to be used in cross-validation to assess the prediction error over larger scales. Random forest was the best approach for prediction both in sampling-like environments as well as in extrapolation beyond the spatial and climatic range of the modelling data.

Download Full-text

Short-Term Electricity Generation Forecasting Using Machine Learning Algorithms: A Case Study of the Benin Electricity Community (C.E.B)

TH Wildau Engineering and Natural Sciences Proceedings ◽

10.52825/thwildauensp.v1i.25 ◽

2021 ◽

Vol 1 ◽

Author(s):

Agbassou Guenoupkati ◽

Adekunlé Akim Salami ◽

Mawugno Koffi Kodjo ◽

Kossi Napo

Keyword(s):

Machine Learning ◽

Time Series ◽

Linear Regression ◽

Performance Metrics ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Support Vector ◽

Learning Models ◽

Short Term ◽

Machine Learning Models

Time series forecasting in the energy sector is important to power utilities for decision making to ensure the sustainability and quality of electricity supply, and the stability of the power grid. Unfortunately, the presence of certain exogenous factors such as weather conditions, electricity price complicate the task using linear regression models that are becoming unsuitable. The search for a robust predictor would be an invaluable asset for electricity companies. To overcome this difficulty, Artificial Intelligence differs from these prediction methods through the Machine Learning algorithms which have been performing over the last decades in predicting time series on several levels. This work proposes the deployment of three univariate Machine Learning models: Support Vector Regression, Multi-Layer Perceptron, and the Long Short-Term Memory Recurrent Neural Network to predict the electricity production of Benin Electricity Community. In order to validate the performance of these different methods, against the Autoregressive Integrated Mobile Average and Multiple Regression model, performance metrics were used. Overall, the results show that the Machine Learning models outperform the linear regression methods. Consequently, Machine Learning methods offer a perspective for short-term electric power generation forecasting of Benin Electricity Community sources.

Download Full-text

Estimating salt content of vegetated soil at different depths with Sentinel-2 data

PeerJ ◽

10.7717/peerj.10585 ◽

2020 ◽

Vol 8 ◽

pp. e10585

Author(s):

Yinwen Chen ◽

Yuanlin Qiu ◽

Zhitao Zhang ◽

Junrui Zhang ◽

Ce Chen ◽

...

Keyword(s):

Machine Learning ◽

Salt Content ◽

European Space Agency ◽

Machine Learning Algorithms ◽

Support Vector ◽

Great Decrease ◽

Learning Models ◽

Different Depths ◽

Machine Learning Models ◽

Sentinel 2

The accurate and timely monitoring of the soil salt content (SSC) at different depths is the prerequisite for the solution to salinization in the arid and semiarid areas. Sentinel-2 has demonstrated significant superiority in SSC inversion for its higher temporal, spatial and spectral resolution, but previous research on SSC inversion with Sentinel-2 mainly focused on the unvegetated surface soil. Based on Sentinel-2 data, this study aimed to build four machine learning models at five depths (0∼20 cm, 20∼40 cm, 40∼60 cm, 0∼40 cm, and 0∼60 cm) in the vegetated area, and evaluate the sensitivity of Sentinel-2 to SSC at different depths and the inversion capability of the models. Firstly, 117 soil samples were collected from Jiefangzha Irrigation Area (JIA) in Hetao Irrigation District (HID), Inner Mongolia, China during August, 2019. Then a set of independent variables (IVs, including 12 bands and 32 spectral indices) were obtained based on the Sentinel-2 data (released by the European Space Agency), and the full subset selection was used to select the optimal combination of IVs at five depths. Finally, four machine learning algorithms, back propagation neural network (BPNN), support vector machine (SVM), extreme learning machine (ELM) and random forest (RF), were used to build inversion models at each depth. The model performance was assessed using adjusted coefficient of determination (R2adj), root mean square error (RMSE) and mean absolute error (MAE). The results indicated that 20∼40 cm was the optimal depth for SSC inversion. All the models at this depth demonstrated a good fitting (R2adj≈ 0.6) and a good control of the inversion errors (RMSE < 0.16%, MAE < 0.12%). At the depths of 40∼60 cm and 0∼20 cm the inversion performance showed a slight and a great decrease respectively. The sensitivity of Sentinel-2 to SSC at different depths was as follows: 20∼40 cm > 40∼60 cm > 0∼40 cm > 0∼60 cm > 0∼20 cm. All four machine learning models demonstrated good inversion performance (R2adj > 0.46). RF was the best model with high fitting and inversion accuracy. Its R2adj at five depths were between 0.5 to 0.68. The SSC inversion capabilities of all the four models were as follows: RF model > ELM model > SVM model > BPNN model. This study can provide a reference for soil salinization monitoring in large vegetated area.

Download Full-text

Hyperparameter Tuning for Machine Learning Algorithms Used for Arabic Sentiment Analysis

Informatics ◽

10.3390/informatics8040079 ◽

2021 ◽

Vol 8 (4) ◽

pp. 79

Author(s):

Enas Elgeldawi ◽

Awny Sayed ◽

Ahmed R. Galal ◽

Alaa M. Zaki

Keyword(s):

Machine Learning ◽

Sentiment Analysis ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Bayesian Optimization ◽

Support Vector ◽

Learning Models ◽

Before And After ◽

Tuning Process ◽

Machine Learning Models

Machine learning models are used today to solve problems within a broad span of disciplines. If the proper hyperparameter tuning of a machine learning classifier is performed, significantly higher accuracy can be obtained. In this paper, a comprehensive comparative analysis of various hyperparameter tuning techniques is performed; these are Grid Search, Random Search, Bayesian Optimization, Particle Swarm Optimization (PSO), and Genetic Algorithm (GA). They are used to optimize the accuracy of six machine learning algorithms, namely, Logistic Regression (LR), Ridge Classifier (RC), Support Vector Machine Classifier (SVC), Decision Tree (DT), Random Forest (RF), and Naive Bayes (NB) classifiers. To test the performance of each hyperparameter tuning technique, the machine learning models are used to solve an Arabic sentiment classification problem. Sentiment analysis is the process of detecting whether a text carries a positive, negative, or neutral sentiment. However, extracting such sentiment from a complex derivational morphology language such as Arabic has been always very challenging. The performance of all classifiers is tested using our constructed dataset both before and after the hyperparameter tuning process. A detailed analysis is described, along with the strengths and limitations of each hyperparameter tuning technique. The results show that the highest accuracy was given by SVC both before and after the hyperparameter tuning process, with a score of 95.6208 obtained when using Bayesian Optimization.

Download Full-text