A machine learning-based diagnostic model associated with knee osteoarthritis severity

Abstract Knee osteoarthritis (KOA) is characterized by pain and decreased gait function. We aimed to find KOA-related gait features based on patient-reported outcome measures (PROMs) and to develop regression models using machine learning algorithms to estimate KOA severity. The study included 375 volunteers with variable KOA grades. The severity of KOA was determined using the Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC). WOMAC scores were used to classify disease severity into three groups. A total of 1087 features were extracted from the gait data. An ANOVA and Student’s t-test were performed, and only features that were significant were selected for inclusion in the machine learning algorithm. Three WOMAC subscales (physical function, pain, and stiffness) were further divided into three classes. An ANOVA was performed to determine which selected features were significantly related to the subscales. Both linear regression models and a random forest regression were used to estimate patients’ WOMAC scores. Forty-three features were selected based on the ANOVA and Student’s t-test results. The following numbers of features were selected from each joint: 12 from the hip, 1 from the pelvis, 17 from the knee, 9 from the ankle, 1 from the foot, and 3 from spatiotemporal parameters. Significance levels of < 0.0001 and < 0.00003 were set for the ANOVA and t-test, respectively. The physical function, pain, and stiffness subscales were related to 41, 10, and 16 features, respectively. Linear regression models showed a correlation of 0.723 and the machine learning algorithm showed a correlation of 0.741. The severity of KOA was predicted by gait analysis features, which were incorporated to develop an objective estimation model for KOA severity. The identified features may serve as a tool to guide rehabilitation and progress assessments. In addition, the estimation model presented here suggests an approach for clinical application of gait analysis data for KOA evaluation.
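
The feature-screening step described in this abstract can be illustrated with a minimal sketch. It computes a one-way ANOVA F statistic per gait feature across three severity groups and keeps features above a cutoff. The feature names, the toy values, and the F cutoff are all assumptions made for illustration; the study itself screened on p-value thresholds and used 1087 real gait features.

```python
from statistics import mean

def anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of numeric groups."""
    k = len(groups)                      # number of severity groups
    n = sum(len(g) for g in groups)      # total number of observations
    grand_mean = mean(x for g in groups for x in g)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical toy data: each feature maps to three groups of values
# (mild, moderate, severe). Names and numbers are illustrative only.
features = {
    "knee_flexion_range": [[60, 62, 61], [52, 54, 53], [44, 46, 45]],
    "cadence": [[100, 110, 105], [104, 108, 103], [102, 109, 106]],
}

# Assumed cutoff standing in for the p-value threshold used in the study.
F_THRESHOLD = 10.0

f_scores = {name: anova_f(groups) for name, groups in features.items()}
selected = [name for name, f in f_scores.items() if f > F_THRESHOLD]
print(f_scores)
print("selected:", selected)
```

A feature whose group means differ strongly relative to its within-group spread gets a large F value and survives the screen; converting F to a p-value would need an F-distribution routine, which is outside the standard library and is omitted here.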

Study on Influence of Range of Data in Concrete Compressive Strength with Respect to the Accuracy of Machine Learning with Linear Regression

Applied Sciences ◽

10.3390/app11093866 ◽

2021 ◽

Vol 11 (9) ◽

pp. 3866

Author(s):

Jun-Ryeol Park ◽

Hye-Jin Lee ◽

Keun-Hyeok Yang ◽

Jung-Keun Kook ◽

Sanghee Kim

Keyword(s):

Machine Learning ◽

Compressive Strength ◽

Linear Regression ◽

Coarse Aggregate ◽

Learning Algorithm ◽

Linear Regression Analysis ◽

Aggregate Size ◽

Training Dataset ◽

Machine Learning Algorithm ◽

Concrete Compressive Strength

This study aims to predict the compressive strength of concrete using a machine-learning algorithm with linear regression analysis and to evaluate its accuracy. The open-source software library TensorFlow was used to develop the machine-learning algorithm. In the machine-learning algorithm, a total of seven variables were set: water, cement, fly ash, blast furnace slag, sand, coarse aggregate, and coarse aggregate size. A total of 4297 concrete mixtures with measured compressive strengths were employed to train and test the machine-learning algorithm. Of these, 70% were used for training, and 30% were utilized for verification. For verification, the research was conducted by classifying the mixtures into three cases: the case where the machine-learning algorithm was trained using all the data (Case-1), the case where the machine-learning algorithm was trained while maintaining the same number of training data for each strength range (Case-2), and the case where the machine-learning algorithm was trained after making a subcase of each strength range (Case-3). The results indicated that the error percentages of Case-1 and Case-2 did not differ significantly. The error percentage of Case-3 was far smaller than those of Case-1 and Case-2. Therefore, it was concluded that the range of the training dataset of the concrete compressive strength is as important as the amount of training data for accurately predicting the concrete compressive strength using the machine-learning algorithm.
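
The contrast between Case-1 and Case-3 can be sketched as follows. This is a simplified illustration, not the study's TensorFlow model: it assumes a single input (a water-to-cement ratio), a made-up strength curve, and illustrative strength bins, whereas the study used seven mix variables and 4297 measured mixtures.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def mape(model, samples):
    """Mean absolute percentage error of a fitted line on (x, y) samples."""
    slope, intercept = model
    return mean(abs(slope * x + intercept - y) / y for x, y in samples) * 100

def strength_range(y):
    """Assumed strength bins in MPa; the cut points are illustrative."""
    return "low" if y < 40 else "mid" if y < 70 else "high"

# Synthetic mixtures: x is a water-to-cement ratio, y a made-up strength curve.
data = [(0.30 + 0.005 * i, 10 / (0.30 + 0.005 * i) ** 2) for i in range(80)]
train = [s for i, s in enumerate(data) if i % 10 < 7]   # 70% for training
test = [s for i, s in enumerate(data) if i % 10 >= 7]   # 30% for verification

# Case-1: one model trained on all training data.
global_model = fit_line(*zip(*train))
case1_error = mape(global_model, test)

# Case-3: one model per strength range, evaluated on that range only.
range_errors = []
for name in ("low", "mid", "high"):
    sub_train = [s for s in train if strength_range(s[1]) == name]
    sub_test = [s for s in test if strength_range(s[1]) == name]
    range_errors.append((len(sub_test), mape(fit_line(*zip(*sub_train)), sub_test)))
case3_error = sum(n * e for n, e in range_errors) / len(test)

print(f"Case-1 error: {case1_error:.1f}%  Case-3 error: {case3_error:.1f}%")
```

Because a single straight line has to span the whole strength range in Case-1, it misses the curvature that the per-range lines of Case-3 can follow locally. Note that this sketch assigns verification samples to a range by their measured strength, which is an assumption about how the subcases were evaluated.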

Application of multi-linear regression models and machine learning techniques for online voltage stability margin estimation

2010 IREP Symposium Bulk Power System Dynamics and Control - VIII (IREP) ◽

10.1109/irep.2010.5563288 ◽

2010 ◽

Cited By ~ 3

Author(s):

Bruno Leonardi ◽

Venkataramana Ajjarapu ◽

Miodrag Djukanovic ◽

Pei Zhang

Keyword(s):

Machine Learning ◽

Linear Regression ◽

Regression Models ◽

Voltage Stability ◽

Stability Margin ◽

Machine Learning Techniques ◽

Linear Regression Models ◽

Voltage Stability Margin ◽

Learning Techniques ◽

Multi Linear Regression

Application of Machine Learning Techniques to Predict Mechanical Properties for Polyamide 2200 (PA12) in Additive Manufacturing

10.20944/preprints201903.0051.v1 ◽

2019 ◽

Author(s):

Ivanna Baturynska

Keyword(s):

Machine Learning ◽

Mechanical Properties ◽

Additive Manufacturing ◽

Linear Regression ◽

Prediction Accuracy ◽

Regression Models ◽

Tensile Modulus ◽

Machine Learning Techniques ◽

Linear Regression Models ◽

Elongation At Break

Additive manufacturing (AM) is an attractive technology for the manufacturing industry due to flexibility in design and functionality, but inconsistency in quality is one of the major limitations that prevents the use of this technology for production of end-use parts. Prediction of mechanical properties can be one of the possible ways to improve the repeatability of the results. The part placement, part orientation, and STL model properties (number of mesh triangles, surface, and volume) are used to predict tensile modulus, nominal stress, and elongation at break for polyamide 2200 (also known as PA12). An EOS P395 polymer powder bed fusion system was used to fabricate 217 specimens in two identical builds (434 specimens in total). Prediction is performed for XYZ, XZY, ZYX, and Angle orientations separately, and all orientations together. The different non-linear models based on machine learning methods have higher prediction accuracy compared with linear regression models. Linear regression models have prediction accuracy higher than 80% only for Tensile Modulus and Elongation at break in Angle orientation. Since orientation-based modeling has low prediction accuracy due to a small number of data points and lack of information about material properties, these models need to be improved in the future based on additional experimental work.

Application of Machine Learning Techniques to Predict the Mechanical Properties of Polyamide 2200 (PA12) in Additive Manufacturing

Applied Sciences ◽

10.3390/app9061060 ◽

2019 ◽

Vol 9 (6) ◽

pp. 1060

Author(s):

Ivanna Baturynska

Keyword(s):

Machine Learning ◽

Mechanical Properties ◽

Additive Manufacturing ◽

Linear Regression ◽

Prediction Accuracy ◽

Regression Models ◽

Tensile Modulus ◽

Machine Learning Techniques ◽

Linear Regression Models ◽

Elongation At Break

Additive manufacturing (AM) is an attractive technology for the manufacturing industry due to flexibility in its design and functionality, but inconsistency in quality is one of the major limitations preventing the use of this technology for the production of end-use parts. The prediction of mechanical properties can be one of the possible ways to improve the repeatability of results. The part placement, part orientation, and STL model properties (number of mesh triangles, surface, and volume) are used to predict tensile modulus, nominal stress, and elongation at break for polyamide 2200 (also known as PA12). An EOS P395 polymer powder bed fusion system was used to fabricate 217 specimens in two identical builds (434 specimens in total). Prediction is performed for XYZ, XZY, ZYX, and Angle orientations separately, and all orientations together. The different non-linear models based on machine learning methods have higher prediction accuracy compared with linear regression models. Linear regression models only have prediction accuracy higher than 80% for Tensile Modulus and Elongation at break in Angle orientation. Since orientation-based modeling has low prediction accuracy due to a small number of data points and lack of information about the material properties, these models need to be improved in the future based on additional experimental work.

Oxidation States of Binary Oxides from Data Analytics of the Electronic Structure

10.26434/chemrxiv.7234658.v1 ◽

2018 ◽

Author(s):

Sergei Posysaev ◽

Olga Miroshnichenko ◽

Matti Alatalo ◽

Duy Le ◽

Talat S. Rahman

Keyword(s):

Machine Learning ◽

Electronic Structure ◽

Linear Regression ◽

Data Analytics ◽

Learning Algorithm ◽

Mixed Valence ◽

Oxidation States ◽

Machine Learning Algorithm ◽

Binary Oxides ◽

Mixed Valence Compounds

A connection between the oxidation state (OS) and Bader charge has been missing so far. To our knowledge, all previous work tried to connect OS with Bader charges using only a few compounds. The aim of this work was to find a dependency between OS and Bader charge, using a large number of compounds from an open database. We show that a correlation indeed exists between OSs and Bader charges using the simplest machine learning algorithm, linear regression. The applicability of determining OS by Bader charges in mixed-valence compounds and surfaces is considered.

Predictive performance of linear regression models in estimation of Artemia salina abundance using field and remote sensing data

Monitoring systems of environment ◽

10.33075/2220-5861-2021-2-88-95 ◽

2021 ◽

pp. 88-95

Author(s):

D. Krivoguz ◽

R. Borovskaya

Keyword(s):

Machine Learning ◽

Linear Regression ◽

Spatial Data ◽

Regression Models ◽

Remote Sensing Data ◽

Artemia Salina ◽

Machine Learning Algorithms ◽

Linear Regression Models ◽

Light Spectrum ◽

Fourth Degree

This research aimed to explore the application of linear regression models, as part of machine learning methods, to the visual representation of the spatial patterns of Artemia salina distribution in the Southern Sivash. Development of such models allows for estimation of A. salina biomass in water bodies with high accuracy. To investigate maximum absorption levels in different parts of the light spectrum, spectral signatures at all the monitoring stations were compared with the satellite data, and the absorption spectra for astaxanthin and hemoglobin were analyzed with a spectrophotometer. As a result, the Sentinel-2 satellite looks very promising as a key spatial data provider that can be of major help in increasing the frequency of A. salina monitoring in the Southern Sivash. The linear regression models fitted by third- and fourth-degree polynomials have shown satisfactory results, suitable for their subsequent use in fisheries. On the other hand, it should be noted that these models are slightly prone to overfitting, which can to some extent distort subsequent forecasts made on new data. In turn, linear regression models fitted by a first-degree polynomial show less accurate results, but their advantages include the lack of a tendency to overfit. It is also worth noting that small-sized datasets within the scope of this investigation do not appear to be problematic, and simple machine learning algorithms can provide good accuracy results, which are suitable for further application in this field.
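
The trade-off between polynomial degree and overfitting mentioned above can be examined with a small sketch. It fits first-, third-, and fourth-degree polynomials by least squares and reports the squared error on fitted and held-out points. The data are invented for illustration (a normalized spectral index against a biomass value), and the solver is a plain normal-equations implementation chosen to keep the example dependency-free.

```python
def polyfit(xs, ys, degree):
    """Least-squares polynomial coefficients (lowest power first)."""
    m = degree + 1
    # Augmented normal-equation matrix [A^T A | A^T y].
    aug = [[sum(x ** (i + j) for x in xs) for j in range(m)]
           + [sum(y * x ** i for x, y in zip(xs, ys))] for i in range(m)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(m):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][m] / aug[i][i] for i in range(m)]

def sse(coeffs, xs, ys):
    """Sum of squared errors of the polynomial on the given points."""
    return sum((sum(c * x ** i for i, c in enumerate(coeffs)) - y) ** 2
               for x, y in zip(xs, ys))

# Hypothetical toy data: a linear trend plus fixed pseudo-noise.
xs = [i / 10 for i in range(11)]
noise = [0.3, -0.2, 0.1, -0.4, 0.2, 0.0, -0.3, 0.4, -0.1, 0.2, -0.2]
ys = [2 + 3 * x + e for x, e in zip(xs, noise)]

train_x, train_y = xs[::2], ys[::2]    # 6 points used for fitting
test_x, test_y = xs[1::2], ys[1::2]    # 5 points held out

results = {}
for degree in (1, 3, 4):
    coeffs = polyfit(train_x, train_y, degree)
    results[degree] = (sse(coeffs, train_x, train_y), sse(coeffs, test_x, test_y))
    print(degree, results[degree])
```

The error on the fitted points can only shrink as the degree grows, so it is the held-out error that indicates whether the extra flexibility helps or merely tracks noise.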

STATISTICAL TESTING TECHNIQUE FOR COMPARISON MACHINE LEARNING MODELS PERFORMANCE

Vestnik komp iuternykh i informatsionnykh tekhnologii ◽

10.14489/vkit.2019.12.pp.010-017 ◽

2019 ◽

pp. 10-17

Author(s):

Yu. S. Fedorenko

Keyword(s):

Machine Learning ◽

T Test ◽

Statistical Testing ◽

Independent Random Variables ◽

Learning Models ◽

Testing Technique ◽

Test Set ◽

Student’s T ◽

Machine Learning Models ◽

Student’s T Test

The statistical testing technique is considered to compare the metrics values of machine learning models on a test set. Since the values of metrics depend not only on the models, but also on the data, it may turn out that different models are the best on different test sets. For this reason, the traditional approach to comparing the values of metrics on a test set is often not enough. Sometimes a statistical comparison of the results obtained on the basis of cross-validation is used, but in this case it is impossible to guarantee the independence of the obtained measurements, which does not allow the use of the Student's t-test. There are criteria that do not require independent measurements, but they have less power. For additive metrics, a technique is proposed in this paper in which a test sample is divided into N parts, on each of which the values of the metrics are calculated. Since the value on each part is obtained as the sum of independent random variables, according to the central limit theorem, the obtained metrics values on each of the N parts are realizations of a normally distributed random variable. To estimate the required sample size, it is proposed to use normality tests and build quantile-quantile plots. You can then use a modification of the Student's t-test to conduct a statistical test comparing the mean values of the metrics. A simplified approach is also considered, in which confidence intervals are built for the base model. A model whose metric values do not fall into this interval works differently from the base model. This approach reduces the amount of computation needed; however, an experimental analysis of the binary cross-entropy metric for CTR (Click-Through Rate) prediction models showed that it is rougher than the first one.
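
The split-into-N-parts idea can be sketched as below. The abstract does not specify the modification of the Student's t-test it uses, so this example substitutes an ordinary paired t statistic as a stand-in; the labels, the two models' predicted probabilities, and the choice of N = 5 are likewise assumptions for illustration.

```python
import math
from statistics import mean, stdev

def bce(y_true, p_pred):
    """Mean binary cross-entropy over a list of samples."""
    return mean(-(y * math.log(p) + (1 - y) * math.log(1 - p))
                for y, p in zip(y_true, p_pred))

def per_part_metric(y_true, p_pred, n_parts):
    """Split the test set into n_parts contiguous parts and score each one."""
    size = len(y_true) // n_parts
    return [bce(y_true[i * size:(i + 1) * size], p_pred[i * size:(i + 1) * size])
            for i in range(n_parts)]

# Hypothetical test set: alternating labels and two models' click probabilities.
labels = [i % 2 for i in range(50)]
model_a = [0.8 if y else 0.2 for y in labels]
model_b = [(0.6 + 0.02 * (i % 7)) if y else 1 - (0.6 + 0.02 * (i % 7))
           for i, y in enumerate(labels)]

N_PARTS = 5
loss_a = per_part_metric(labels, model_a, N_PARTS)
loss_b = per_part_metric(labels, model_b, N_PARTS)

# Paired t statistic on the per-part differences (stand-in for the paper's test).
diffs = [b - a for a, b in zip(loss_a, loss_b)]
t_stat = mean(diffs) / (stdev(diffs) / math.sqrt(N_PARTS))

# Two-sided 5% critical value of Student's t with N_PARTS - 1 = 4 degrees of freedom.
T_CRITICAL = 2.776
print(f"t = {t_stat:.2f}, differs at 5% level: {abs(t_stat) > T_CRITICAL}")
```

In practice the number of parts and the part size would be chosen after checking the per-part values for normality, as the abstract suggests.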

Motion Detection and Prediction Using Machine Learning Algorithm

Issue 4 - Journal of Science and Technology ◽

10.46243/jst.2020.v5.i5.pp220-226 ◽

2020 ◽

pp. 220-226

Keyword(s):

Artificial Intelligence ◽

Machine Learning ◽

Logistic Regression ◽

Linear Regression ◽

21st Century ◽

Motion Detection ◽

Data Analytics ◽

Learning Algorithm ◽

Machine Learning Algorithm ◽

Processing Data

Machine learning is a branch of Artificial Intelligence that is gaining importance in the 21st century. With increasing processing speeds and the miniaturization of sensors, the applications of Artificial Intelligence and cognitive technologies are growing rapidly. An array of ultrasonic sensors (HCSR-04) is placed facing different directions, collecting data for a particular interval during a particular day. The acquired sensor values are subjected to pre-processing, data analytics, and visualization. The prepared data are then split into training and test sets. Prediction models are designed using logistic regression and linear regression, and their accuracy, F1 score, and precision are compared.
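
For reference, the three evaluation measures named in this abstract can be computed from a confusion matrix as in the sketch below. The label vectors are hypothetical (1 standing for "motion detected"), and no claim is made about the values the authors obtained.

```python
def classification_metrics(y_true, y_pred):
    """Return accuracy, precision, recall, and F1 for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1

# Hypothetical ground truth and model output for eight sensor readings.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]

accuracy, precision, recall, f1 = classification_metrics(y_true, y_pred)
print(accuracy, precision, recall, f1)
```

The function assumes at least one predicted positive and one actual positive; otherwise precision or recall would divide by zero and need an explicit convention.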

Oxidation States of Binary Oxides from Data Analytics of the Electronic Structure

10.26434/chemrxiv.7234658 ◽

2018 ◽

Author(s):

Sergei Posysaev ◽

Olga Miroshnichenko ◽

Matti Alatalo ◽

Duy Le ◽

Talat S. Rahman

Keyword(s):

Machine Learning ◽

Electronic Structure ◽

Linear Regression ◽

Data Analytics ◽

Learning Algorithm ◽

Mixed Valence ◽

Oxidation States ◽

Machine Learning Algorithm ◽

Binary Oxides ◽

Mixed Valence Compounds

Development of a machine-learning model to assess terminal ileum Endoscopic healing in pediatric Crohn's disease from Magnetic Resonance Enterography data

10.1101/2021.08.29.21262424 ◽

2021 ◽

Author(s):

Itai Guez ◽

Gili Focht ◽

Mary-Louise C. Greer ◽

Ruth Cytter-Kuint ◽

Li-tal Pratt ◽

...

Keyword(s):

Machine Learning ◽

Magnetic Resonance ◽

Linear Regression ◽

Regression Models ◽

Magnetic Resonance Enterography ◽

Linear Regression Models ◽

Learning Models ◽

Machine Learning Model ◽

Relevant Variables ◽

Machine Learning Models

Background and Aims: Endoscopic healing (EH) is a major treatment goal for Crohn's disease (CD). However, terminal ileum (TI) intubation failure is common, especially in children. We evaluated the added value of machine-learning models in imputing a TI Simple Endoscopic Score for CD (SES-CD) from Magnetic Resonance Enterography (MRE) data of pediatric CD patients. Methods: This is a sub-study of the prospective ImageKids study. We developed machine-learning and baseline linear-regression models to predict TI SES-CD score from the Magnetic Resonance Index of Activity (MaRIA) and the Pediatric Inflammatory Crohn's MRE Index (PICMI) variables. We assessed the accuracy of TI SES-CD predictions for intubated patients with a stratified 2-fold validation experimental setup, repeated 50 times. We determined clinical impact by imputing TI SES-CD in patients with ileal intubation failure during ileocolonoscopy. Results: A total of 223 children were included (mean age 14.1 ± 2.5 years), of whom 132 had all relevant variables (107 with TI intubation and 25 with TI intubation failure). The combination of a machine-learning model with the PICMI variables achieved the lowest SES-CD prediction error compared to a baseline MaRIA-based linear regression model for the intubated patients (N=107, 11.7 (10.5-12.5) vs. 12.1 (11.4-12.9), p<0.05). The PICMI-based models suggested a higher rate of patients with TI disease among the non-intubated patients compared to a baseline MaRIA-based linear regression model (N=25, up to 25/25 (100%) vs. 23/25 (92%)). Conclusions: Machine-learning models with clinically relevant variables as input are more accurate than linear-regression models in predicting TI SES-CD and EH when using the same MRE-based variables.
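
The validation scheme named in the Methods (stratified 2-fold, repeated 50 times) can be sketched as a generator of index splits. The binary stratification label, the toy cohort of 20 patients, and the seed are assumptions; the abstract does not state which variable was used for stratification.

```python
import random

def repeated_stratified_2fold(labels, repeats=50, seed=0):
    """Yield (train_indices, test_indices) pairs, two per repeat.

    Each class is shuffled and cut in half, so both folds keep
    roughly the same class proportions as the full cohort.
    """
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    for _ in range(repeats):
        halves = ([], [])
        for indices in by_class.values():
            shuffled = indices[:]
            rng.shuffle(shuffled)
            mid = len(shuffled) // 2
            halves[0].extend(shuffled[:mid])
            halves[1].extend(shuffled[mid:])
        # Each half serves once as the training set and once as the test set.
        yield sorted(halves[0]), sorted(halves[1])
        yield sorted(halves[1]), sorted(halves[0])

# Hypothetical cohort: 12 patients labelled 0 and 8 labelled 1.
labels = [0] * 12 + [1] * 8
splits = list(repeated_stratified_2fold(labels))
print(len(splits), "train/test splits")
```

Averaging a model's prediction error over all generated splits gives a less split-dependent estimate than a single 2-fold run, which is the usual motivation for repeating the procedure.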
