Predicting Post-Stroke Somatosensory Function from Resting-State Functional Connectivity: A Feasibility Study

2021 ◽  
Vol 11 (11) ◽  
pp. 1388
Author(s):  
Xiaoyun Liang ◽  
Chia-Lin Koh ◽  
Chun-Hung Yeh ◽  
Peter Goodin ◽  
Gemma Lamp ◽  
...  

Accumulating evidence shows that brain functional deficits may be impacted by damage to remote brain regions. Recent advances in neuroimaging suggest that stroke impairment can be better predicted from disruption to brain networks than from lesion locations or volumes alone. Our aim was to explore the feasibility of predicting post-stroke somatosensory function from brain functional connectivity through the application of machine learning techniques. Somatosensory impairment was measured using the Tactile Discrimination Test. Functional connectivity was employed to model global brain function. Behavioral measures and MRI were collected at the same timepoint. Two machine learning models (linear regression and support vector regression) were chosen to predict somatosensory impairment from disrupted networks. With two engineered feature pools (i.e., low-order and high-order functional connectivity, or low-order functional connectivity only), four predictive models were built and evaluated in the present study. Forty-three chronic stroke survivors participated in this study. Results showed that the regression model employing both low-order and high-order functional connectivity predicted outcomes with a correlation coefficient of r = 0.54 (p = 0.0002). A machine learning predictive approach, involving high- and low-order modelling, is feasible for the prediction of residual somatosensory function in stroke patients using functional brain networks.
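A minimal sketch of this kind of pipeline, with synthetic stand-in data (43 subjects as in the study, but random connectivity features and an invented behavioural score): support vector regression under leave-one-out cross-validation, scored by the Pearson correlation between observed and predicted values.

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.model_selection import LeaveOneOut
from scipy.stats import pearsonr

rng = np.random.default_rng(0)
n_subjects, n_features = 43, 20                 # 43 survivors, as in the study
X = rng.normal(size=(n_subjects, n_features))   # stand-in connectivity features
y = 0.8 * X[:, 0] + rng.normal(scale=0.5, size=n_subjects)  # synthetic score

# Leave-one-out cross-validation: predict each subject from all the others
preds = np.empty(n_subjects)
for train, test in LeaveOneOut().split(X):
    model = SVR(kernel="linear", C=1.0)
    model.fit(X[train], y[train])
    preds[test] = model.predict(X[test])

r, p = pearsonr(y, preds)
print(f"predicted-vs-observed r = {r:.2f}")
```

The study's reported r = 0.54 is the analogous correlation between predicted and measured Tactile Discrimination Test scores.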

2015 ◽  
Author(s):  
Ming Zhong ◽  
Jared Schuetter ◽  
Srikanta Mishra ◽  
Randy F. LaFollette

Abstract Data mining for production optimization in unconventional reservoirs brings together data from multiple sources with varying levels of aggregation, detail, and quality. Tens of variables are typically included in the data sets to be analyzed. There are many statistical and machine learning techniques that can be used to analyze data and summarize the results. These methods were developed to work extremely well in certain scenarios but can be terrible choices in others. The analyst may or may not be trained and experienced in using those methods. The question for both the analyst and the consumer of data mining analyses is, “What difference does the method make in the final interpreted result of an analysis?” The objective of this study was to compare and review the relative utility of several univariate and multivariate statistical and machine learning methods in predicting the production quality of Permian Basin Wolfcamp Shale wells. The data set for the study was restricted to wells completed in and producing from the Wolfcamp. Data categories used in the study included the well location and assorted metrics capturing various aspects of the well architecture, well completion, stimulation, and production. All of this information was publicly available. Data variables were scrutinized and corrected for inconsistent units and were sanity-checked for out-of-bounds and other “bad data” problems. After the quality control effort was completed, the test data set was distributed among the statistical team for application of an agreed-upon set of statistical and machine learning methods. Methods included standard univariate and multivariate linear regression as well as advanced machine learning techniques such as Support Vector Machine, Random Forests, and Boosted Regression Trees. The strengths, limitations, implementation, and study results of each of the methods tested are discussed and compared to those of the other methods.
Consistent with other data mining studies, univariate linear methods are shown to be much less robust than multivariate non-linear methods, which tend to produce more reliable results. The practical importance is that when tens to hundreds of millions of dollars are at stake in the development of shale reservoirs, operators should have the confidence that their decisions are statistically sound. The work presented here shows that methods do matter, and useful insights can be derived regarding complex geosystem behavior by geoscientists, engineers, and statisticians working together.
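The headline finding, that a multivariate non-linear model can outperform a univariate linear one when the response depends on interacting variables, can be illustrated with a hedged sketch on synthetic well data. The variable names and the production relationship below are invented for illustration only.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

rng = np.random.default_rng(1)
n = 500
lateral_length = rng.uniform(1000, 3000, n)   # hypothetical well attributes
proppant = rng.uniform(100, 1000, n)
# production depends non-linearly on an interaction of the two variables
production = 0.001 * lateral_length * np.sqrt(proppant) + rng.normal(0, 5, n)

X = np.column_stack([lateral_length, proppant])
X_tr, X_te, y_tr, y_te = train_test_split(X, production, random_state=0)

uni = LinearRegression().fit(X_tr[:, [0]], y_tr)      # univariate linear model
rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_tr, y_tr)

r2_uni = r2_score(y_te, uni.predict(X_te[:, [0]]))
r2_rf = r2_score(y_te, rf.predict(X_te))
print(f"univariate linear R2 = {r2_uni:.2f}, random forest R2 = {r2_rf:.2f}")
```

Because the interaction term is invisible to a single-predictor linear fit, the random forest captures substantially more of the variance on held-out wells.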


2020 ◽  
Vol 12 (2) ◽  
pp. 84-99
Author(s):  
Li-Pang Chen

In this paper, we investigate analysis and prediction of time-dependent data, focusing on four stocks selected from the Yahoo Finance historical database. To build models and predict future stock prices, we consider three machine learning techniques: Long Short-Term Memory (LSTM), Convolutional Neural Networks (CNN), and Support Vector Regression (SVR). By treating close price, open price, daily low, daily high, adjusted close price, and volume of trades as predictors, we show that prediction accuracy is improved.
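A minimal sketch of the SVR variant of this setup, on a synthetic random-walk price series with the same six predictor types (the LSTM and CNN models are omitted, and all numbers are placeholders, not results from the paper):

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(2)
close = 100 + np.cumsum(rng.normal(0, 1, 300))   # synthetic random-walk close
open_ = close + rng.normal(0, 0.5, 300)
low, high = close - 1.0, close + 1.0
volume = rng.uniform(1e5, 1e6, 300)

# features on day t predict the close on day t + 1
X = np.column_stack([open_, high, low, close, volume])[:-1]
y = close[1:]

split = 250                                      # chronological train/test split
model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=100.0))
model.fit(X[:split], y[:split])
pred = model.predict(X[split:])
mae = np.mean(np.abs(pred - y[split:]))
print(f"test MAE: {mae:.2f}")
```

Note the chronological split: shuffling time-series rows into train and test sets would leak future information into training.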


Author(s):  
Anantvir Singh Romana

Accurate diagnostic detection of disease in a patient is critical and may alter the subsequent treatment and increase the chances of survival. Machine learning techniques have been instrumental in disease detection and are currently used in various classification problems due to their accurate prediction performance. Different techniques may provide different accuracies, and it is therefore imperative to use the most suitable method to obtain the best results. This research provides a comparative analysis of Support Vector Machine, Naïve Bayes, J48 Decision Tree, and neural network classifiers on breast cancer and diabetes datasets.
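As a rough illustration of such a comparison, the sketch below cross-validates four scikit-learn classifiers on the library's bundled breast-cancer dataset. This is not the study's own data, and scikit-learn's CART decision tree stands in for Weka's J48.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier
from sklearn.neural_network import MLPClassifier

X, y = load_breast_cancer(return_X_y=True)
classifiers = {
    "SVM": make_pipeline(StandardScaler(), SVC()),
    "Naive Bayes": GaussianNB(),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Neural Network": make_pipeline(StandardScaler(),
                                    MLPClassifier(max_iter=1000, random_state=0)),
}
# mean 5-fold cross-validated accuracy for each classifier
scores = {name: cross_val_score(clf, X, y, cv=5).mean()
          for name, clf in classifiers.items()}
for name, acc in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {acc:.3f}")
```

Cross-validation, rather than a single split, keeps the comparison between methods from hinging on one lucky partition of the data.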


2020 ◽  
Author(s):  
Azhagiya Singam Ettayapuram Ramaprasad ◽  
Phum Tachachartvanich ◽  
Denis Fourches ◽  
Anatoly Soshilov ◽  
Jennifer C.Y. Hsieh ◽  
...  

Perfluoroalkyl and Polyfluoroalkyl Substances (PFASs) pose a substantial threat as endocrine disruptors, and thus early identification of those that may interact with steroid hormone receptors, such as the androgen receptor (AR), is critical. In this study we screened 5,206 PFASs from the CompTox database against the different binding sites on the AR using both molecular docking and machine learning techniques. We developed support vector machine models trained on Tox21 data to classify PFASs as active or inactive against the AR, using different chemical fingerprints as features. The best model, based on MACCS fingerprints (MACCSFP), achieved an accuracy of 95.01% and a Matthews correlation coefficient (MCC) of 0.76. The combination of docking-based screening and machine learning models identified 29 PFASs that have strong potential for activity against the AR and should be considered priority chemicals for biological toxicity testing.
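A hedged sketch of the classification step: random bit-vectors stand in for the 166-bit MACCS fingerprints, and the activity label is synthetic (the study's actual models were trained on Tox21 data).

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import matthews_corrcoef, accuracy_score

rng = np.random.default_rng(3)
n_mols, n_bits = 400, 166                     # MACCS keys have 166 bits
X = rng.integers(0, 2, size=(n_mols, n_bits)).astype(float)
y = (X[:, :10].sum(axis=1) > 5).astype(int)   # synthetic "active" label

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0, stratify=y)
clf = SVC(kernel="linear").fit(X_tr, y_tr)
pred = clf.predict(X_te)
acc = accuracy_score(y_te, pred)
mcc = matthews_corrcoef(y_te, pred)
print(f"accuracy: {acc:.3f}, MCC: {mcc:.3f}")
```

Reporting MCC alongside accuracy matters here because active and inactive classes are typically imbalanced, and accuracy alone can look deceptively high.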


2020 ◽  
Author(s):  
Nalika Ulapane ◽  
Karthick Thiyagarajan ◽  
Sarath Kodagoda

Classification has become a vital task in modern machine learning and Artificial Intelligence applications, including smart sensing. Numerous machine learning techniques are available to perform classification. Similarly, numerous practices, such as feature selection (i.e., selection of a subset of descriptor variables that optimally describe the output), are available to improve classifier performance. In this paper, we consider the case of a given supervised learning classification task that has to be performed making use of continuous-valued features. It is assumed that an optimal subset of features has already been selected. Therefore, no further feature reduction, or feature addition, is to be carried out. Then, we attempt to improve the classification performance by passing the given feature set through a transformation that produces a new feature set which we have named the “Binary Spectrum”. Via a case study example done on some Pulsed Eddy Current sensor data captured from an infrastructure monitoring task, we demonstrate how the classification accuracy of a Support Vector Machine (SVM) classifier increases through the use of this Binary Spectrum feature, indicating the feature transformation’s potential for broader usage.
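The paper's exact Binary Spectrum construction is not reproduced here. As an illustrative stand-in, the sketch below expands each continuous feature into a vector of binary indicators against several quantile thresholds and feeds the expanded set to an SVM; the function name, threshold scheme, and data are all assumptions, not the authors' method.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

def binary_spectrum(X, n_levels=8):
    """Binarize each continuous feature against n_levels quantile thresholds."""
    cols = []
    for j in range(X.shape[1]):
        qs = np.quantile(X[:, j], np.linspace(0.1, 0.9, n_levels))
        cols.append((X[:, j][:, None] > qs[None, :]).astype(float))
    return np.hstack(cols)                     # shape: (n_samples, n_features * n_levels)

X, y = make_classification(n_samples=300, n_features=6, random_state=0)
base = cross_val_score(SVC(), X, y, cv=5).mean()
expanded = cross_val_score(SVC(), binary_spectrum(X), y, cv=5).mean()
print(f"raw features: {base:.3f}, binarized features: {expanded:.3f}")
```

Whether the transformed features help depends on the data; the paper demonstrates the gain on Pulsed Eddy Current sensor measurements.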


2020 ◽  
Vol 21 ◽  
Author(s):  
Sukanya Panja ◽  
Sarra Rahem ◽  
Cassandra J. Chu ◽  
Antonina Mitrofanova

Background: In recent years, the availability of high throughput technologies, establishment of large molecular patient data repositories, and advancement in computing power and storage have allowed elucidation of complex mechanisms implicated in therapeutic response in cancer patients. The breadth and depth of such data, alongside experimental noise and missing values, requires a sophisticated human-machine interaction that would allow effective learning from complex data and accurate forecasting of future outcomes, ideally embedded in the core of machine learning design. Objective: In this review, we will discuss machine learning techniques utilized for modeling of treatment response in cancer, including Random Forests, support vector machines, neural networks, and linear and logistic regression. We will overview their mathematical foundations and discuss their limitations and alternative approaches all in light of their application to therapeutic response modeling in cancer. Conclusion: We hypothesize that the increase in the number of patient profiles and potential temporal monitoring of patient data will define even more complex techniques, such as deep learning and causal analysis, as central players in therapeutic response modeling.


Author(s):  
Amandeep Kaur ◽  
Sushma Jain ◽  
Shivani Goel ◽  
Gaurav Dhiman

Context: Code smells are symptoms that something may be wrong in a software system and can cause complications in maintaining software quality. Many code smells are described in the literature, and their identification is far from trivial. Thus, several techniques have been proposed to automate code smell detection in order to improve software quality. Objective: This paper presents an up-to-date review of simple and hybrid machine-learning-based code smell detection techniques and tools. Methods: We collected all the relevant research published in this field up to 2020. We extracted the data from those articles and classified them into two major categories. In addition, we compared the selected studies on several aspects, such as code smells, machine learning techniques, datasets, programming languages used by the datasets, dataset size, evaluation approach, and statistical testing. Results: The majority of empirical studies have proposed machine-learning-based code smell detection tools. Support vector machine and decision tree algorithms are frequently used by researchers. A major proportion of the research is conducted on open source software (OSS) such as Xerces, GanttProject, and ArgoUML. Furthermore, researchers have paid more attention to the Feature Envy and Long Method code smells. Conclusion: We identified several areas of open research, such as the need for code smell detection techniques using hybrid approaches and for validation employing industrial datasets.


2019 ◽  
Vol 23 (1) ◽  
pp. 12-21 ◽  
Author(s):  
Shikha N. Khera ◽  
Divya

The information technology (IT) industry in India has been facing a systemic issue of high attrition in the past few years, resulting in monetary and knowledge-based losses to companies. The aim of this research is to develop a model to predict employee attrition and give organizations the opportunity to address issues and improve retention. A predictive model was developed based on a supervised machine learning algorithm, the support vector machine (SVM). Archival employee data (consisting of 22 input features) were collected from the Human Resource databases of three IT companies in India, including each employee's employment status (response variable) at the time of collection. Accuracy results from the confusion matrix showed that the SVM model has an accuracy of 85 per cent. Results also show that the model performs better at predicting who will leave the firm than at predicting who will stay.
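A minimal sketch of this workflow: an SVM trained on 22 features and evaluated with a confusion matrix, whose rows expose exactly the class-wise asymmetry the study reports. The data here are synthetic with an imbalanced stay/leave split, not the study's HR records.

```python
from sklearn.datasets import make_classification
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix, accuracy_score
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

# synthetic stand-in for 22 HR features; class 1 = "leaves the firm"
X, y = make_classification(n_samples=1000, n_features=22, weights=[0.7, 0.3],
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0, stratify=y)
clf = make_pipeline(StandardScaler(), SVC()).fit(X_tr, y_tr)
pred = clf.predict(X_te)
cm = confusion_matrix(y_te, pred)   # rows: true stay/leave, cols: predicted
acc = accuracy_score(y_te, pred)
print(cm)
print(f"accuracy: {acc:.2f}")
```

Comparing the two rows of the confusion matrix, rather than the single accuracy number, is what reveals which class the model predicts well.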


2021 ◽  
Vol 7 (1) ◽  
Author(s):  
Helder Sebastião ◽  
Pedro Godinho

Abstract This study examines the predictability of three major cryptocurrencies—bitcoin, ethereum, and litecoin—and the profitability of trading strategies devised upon machine learning techniques (e.g., linear models, random forests, and support vector machines). The models are validated in a period characterized by unprecedented turmoil and tested in a period of bear markets, allowing the assessment of whether the predictions are good even when the market direction changes between the validation and test periods. The classification and regression methods use attributes from trading and network activity for the period from August 15, 2015 to March 03, 2019, with the test sample beginning on April 13, 2018. For the test period, five out of 18 individual models have success rates of less than 50%. The trading strategies are built on model assembling. The ensemble assuming that five models produce identical signals (Ensemble 5) achieves the best performance for ethereum and litecoin, with annualized Sharpe ratios of 80.17% and 91.35% and annualized returns (after proportional round-trip trading costs of 0.5%) of 9.62% and 5.73%, respectively. These positive results support the claim that machine learning provides robust techniques for exploring the predictability of cryptocurrencies and for devising profitable trading strategies in these markets, even under adverse market conditions.
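The "Ensemble 5" idea, open a position only when five individual models emit the same signal and otherwise stay out of the market, can be sketched as follows. The signals here are random placeholders, not outputs of the paper's models.

```python
import numpy as np

rng = np.random.default_rng(4)
n_days, n_models = 100, 5
# +1 = long, -1 = short; one row of model signals per trading day
signals = rng.choice([-1, 1], size=(n_days, n_models))

def ensemble_signal(day_signals):
    """Return the common signal if all models agree, else 0 (no trade)."""
    first = day_signals[0]
    return first if np.all(day_signals == first) else 0

positions = np.array([ensemble_signal(s) for s in signals])
print("trading days:", int(np.sum(positions != 0)), "of", n_days)
```

Requiring unanimity trades far less often than any single model, which is how such an ensemble trades fewer false signals at the cost of sitting out ambiguous days.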


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Tomoaki Mameno ◽  
Masahiro Wada ◽  
Kazunori Nozaki ◽  
Toshihito Takahashi ◽  
Yoshitaka Tsujioka ◽  
...  

Abstract The purpose of this retrospective cohort study was to create a model for predicting the onset of peri-implantitis using machine learning methods and to clarify interactions between risk indicators. This study evaluated 254 implants, 127 with and 127 without peri-implantitis, from among 1408 implants with at least 4 years in function. Demographic data and parameters known to be risk factors for the development of peri-implantitis were analyzed with three models: logistic regression, support vector machines, and random forests (RF). Of the three, RF had the highest performance in predicting the onset of peri-implantitis (AUC: 0.71, accuracy: 0.70, precision: 0.72, recall: 0.66, and f1-score: 0.69). The factor that had the most influence on prediction was implant functional time, followed by oral hygiene. In addition, PCR of more than 50% to 60%, smoking more than 3 cigarettes/day, KMW less than 2 mm, and the presence of less than two occlusal supports tended to be associated with an increased risk of peri-implantitis. Moreover, these risk indicators were not independent and had complex effects on each other. The results of this study suggest that peri-implantitis onset was predicted in 70% of cases by RF, which allows consideration of nonlinear relational data with complex interactions.
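The evaluation described above, a random forest scored with AUC, accuracy, precision, recall, and F1, can be sketched as below. The balanced synthetic dataset mimics the 127-vs-127 case-control design only in shape; features and labels are random stand-ins.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import (roc_auc_score, accuracy_score, precision_score,
                             recall_score, f1_score)

# balanced synthetic case-control data (254 implants, as in the study)
X, y = make_classification(n_samples=254, n_features=10, weights=[0.5, 0.5],
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0, stratify=y)
rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
pred = rf.predict(X_te)
proba = rf.predict_proba(X_te)[:, 1]      # AUC needs probabilities, not labels
auc = roc_auc_score(y_te, proba)
f1 = f1_score(y_te, pred)
print(f"AUC: {auc:.2f}, accuracy: {accuracy_score(y_te, pred):.2f}, "
      f"precision: {precision_score(y_te, pred):.2f}, "
      f"recall: {recall_score(y_te, pred):.2f}, f1: {f1:.2f}")
```

The study's ranking of risk indicators corresponds to inspecting `rf.feature_importances_` on the fitted forest.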

