Region-specific Brain Age Prediction Models for Children and Adolescents Derived by Machine Learning

Author(s):  
Hyerin Jeong ◽  
Ukeob Park ◽  
Seung Wan Kang

Abstract: EEG biomarkers can reveal significant and actionable differences in brain development between typically developing children and those with developmental disorders. Frontal slow-frequency EEG is one common differentiator between normal and abnormal brain function. The present study sought to establish machine learning models to predict brain age in children and adolescents. Four brain regions were studied: left anterior, right anterior, left posterior, and right posterior, based on the different functions characteristic of each region. Importantly, differences were also considered in the construction of the models. All models yielded promising R² values of 0.80 or higher for the prediction of brain age. Our technique employed a tree-based feature selection algorithm, allowing selection of a minimum number of features while preserving predictive power. These prediction models can be used to quantify deviations between estimated brain age and chronological age, and so serve as valuable tools in efforts to assess and intervene early in several profound developmental disorders.
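The tree-based feature selection step described above can be sketched as follows. This is a generic illustration using scikit-learn on synthetic data, not the study's actual EEG features or pipeline: a forest ranks features by impurity-based importance, a thresholded selector keeps a reduced subset, and a regressor is refit on that subset.

```python
# Sketch: tree-based feature selection for a brain-age-style regressor.
# Synthetic data stands in for EEG-derived features (illustrative only).
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import SelectFromModel
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

# 50 candidate features, only 8 carry signal
X, y = make_regression(n_samples=300, n_features=50, n_informative=8,
                       noise=5.0, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Rank features by impurity-based importance; keep those above the median
selector = SelectFromModel(
    RandomForestRegressor(n_estimators=200, random_state=0),
    threshold="median")
selector.fit(X_tr, y_tr)
X_tr_sel, X_te_sel = selector.transform(X_tr), selector.transform(X_te)

# Refit on the reduced feature set and check predictive power is preserved
model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_tr_sel, y_tr)
r2 = r2_score(y_te, model.predict(X_te_sel))
```

Halving the feature count this way typically costs little accuracy when the discarded features are uninformative, which is the property the abstract's "minimum number of features" claim relies on.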

2018 ◽  
Author(s):  
Paul Herent ◽  
Simon Jegou ◽  
Gilles Wainrib ◽  
Thomas Clozel

Objectives: To define a clinically usable preprocessing pipeline for MRI data, predict brain age using various machine learning and deep learning algorithms, and define caveats against common machine learning traps. Data and Methods: We used 1597 open-access T1-weighted MRIs from 24 hospitals. Preprocessing consisted of N4 bias field correction, registration to MNI152 space, white- and grey-stripe intensity normalization, skull stripping, and brain tissue segmentation. Brain age was predicted with data input of growing complexity (histograms, grey matter from segmented MRI, raw data) and models of growing complexity (linear models, nonlinear models such as gradient boosting over decision trees, and 2D and 3D convolutional neural networks). Work on interpretability consisted of (i) basic data visualization, such as correlation maps between age and voxel values, (ii) weight maps of simpler models, and (iii) heatmaps from the CNN model obtained with the occlusion method. Results: Processing time appeared feasible in a radiological workflow: 5 min for one 3D T1 MRI. We found a significant correlation between age and grey matter volume (r = -0.74). Our best model, a fine-tuned convolutional neural network (CNN) pretrained on ImageNet, obtained a mean absolute error of 3.60 years. We carefully analyzed and interpreted the center effect. Our interpretability work on simpler models revealed heterogeneity of prediction across brain regions known to be involved in ageing (grey matter, ventricles). The occlusion method applied to the CNN showed the importance of the insula and deep grey matter (thalami, caudate nuclei) in predictions. Conclusions: Brain age predicted by deep learning could become a standardized metric usable in daily neuroradiological reports. An explainable algorithm gives more confidence and acceptability for use in practice.
More clinical studies using this new quantitative biomarker in neurological diseases will show how to use it best.
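The occlusion method mentioned in the interpretability results can be sketched in a few lines: slide a masking patch over the image, re-run the prediction, and record how much the output shifts. The `predict` function below is a toy stand-in for a trained CNN (it simply weights the image centre more heavily), so the heatmap is illustrative only.

```python
# Minimal occlusion-sensitivity sketch for a 2D image slice.
import numpy as np

def predict(img):
    # Toy stand-in for a trained age-prediction CNN:
    # a weighted sum that emphasizes the image centre.
    h, w = img.shape
    yy, xx = np.mgrid[0:h, 0:w]
    weight = np.exp(-((yy - h / 2) ** 2 + (xx - w / 2) ** 2) / (2 * (h / 4) ** 2))
    return float((img * weight).sum())

def occlusion_map(img, patch=8, stride=8):
    """Grey out one patch at a time; record |change| in the prediction."""
    base = predict(img)
    h, w = img.shape
    heat = np.zeros((h // stride, w // stride))
    for i in range(0, h - patch + 1, stride):
        for j in range(0, w - patch + 1, stride):
            occluded = img.copy()
            occluded[i:i + patch, j:j + patch] = 0.0
            heat[i // stride, j // stride] = abs(base - predict(occluded))
    return heat

img = np.random.default_rng(0).random((64, 64))
heat = occlusion_map(img)
```

Regions whose occlusion changes the prediction most (here, the centre by construction) are the ones the model relies on; in the study this is how the insula and deep grey matter were highlighted.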


2021 ◽  
Author(s):  
Ravi Arkalgud ◽  
Andrew McDonald ◽  
Ross Brackenridge ◽  
...  

Automation is becoming an integral part of our daily lives as technology and techniques rapidly develop, and many automation workflows are now routinely applied within the geoscience domain. The success of any automated modelling workflow fundamentally hinges on the appropriate choice of parameters and the speed of processing, and the entire process demands that the data fed into any machine learning model is of good quality. Decades of advances in well logging technology have enabled the collection of vast amounts of data across wells and fields. This poses a major issue in automating petrophysical workflows: it must be ensured that the data being fed in is appropriate and fit for purpose. The selection of features (logging curves) and parameters for machine learning algorithms has therefore become a topic at the forefront of related research. Inappropriate feature selections can lead to erroneous results and reduced precision, and have proved computationally expensive. Experienced Eye (EE) is a novel methodology, derived from Domain Transfer Analysis (DTA), which seeks to identify the optimum input curves for modelling. During the EE solution process, relationships between the input variables and target variables are developed based on characteristics and attributes of the inputs rather than statistical averages. The relationships so developed can then be ranked and selected for the modelling process. This paper focuses on three distinct petrophysical data scenarios where inputs are ranked prior to modelling: prediction of continuous permeability from discrete core measurements, prediction of porosity from multiple logging measurements, and prediction of key geomechanical properties. Each input curve is ranked against a target feature.
For each case study, the best-ranked features were carried forward to the modelling stage, and the results were validated alongside conventional interpretation methods. Ranked features were also compared between different machine learning algorithms: DTA, Neural Networks and Multiple Linear Regression. Results are compared with the available data for the various case studies. The new feature selection has been shown to improve the accuracy and precision of prediction results from multiple modelling algorithms.
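The "rank each input curve against a target" step can be illustrated generically. The EE/DTA scoring itself is proprietary to the paper; as a stand-in, this sketch ranks synthetic logging curves by absolute Pearson correlation with a core-derived target. Curve names (`GR`, `RHOB`, `NPHI`) and the data are purely illustrative.

```python
# Illustrative feature-ranking step: score each input curve against the
# target, then carry the best-ranked curves forward to modelling.
import numpy as np

rng = np.random.default_rng(1)
n = 500
target = rng.normal(size=n)                        # e.g. core permeability proxy

curves = {
    "GR":   target * 0.8 + rng.normal(scale=0.5, size=n),  # strongly related
    "RHOB": target * 0.3 + rng.normal(scale=1.0, size=n),  # weakly related
    "NPHI": rng.normal(size=n),                            # unrelated curve
}

# Stand-in ranking metric: absolute Pearson correlation with the target
scores = {name: abs(np.corrcoef(vals, target)[0, 1])
          for name, vals in curves.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
```

Only the top of `ranked` would be passed to the modelling stage, which is the mechanism by which poor inputs (here `NPHI`) are screened out before they can degrade precision.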


Author(s):  
M. VALKEMA ◽  
H. LINGSMA ◽  
P. LAMBIN ◽  
J. VAN LANSCHOT

Biostatistics versus machine learning: from traditional prediction models to automated medical analysis

Machine learning is increasingly applied to medical data to develop clinical prediction models. This paper discusses the application of machine learning in comparison with traditional biostatistical methods. Biostatistics is well suited for structured datasets. The selection of variables for a biostatistical prediction model is primarily knowledge-driven. A similar approach is possible with machine learning, but in addition, machine learning allows for analysis of unstructured datasets, such as those derived from medical imaging and written text in patient records. In contrast to biostatistics, the selection of variables with machine learning is mainly data-driven. Complex machine learning models are able to detect nonlinear patterns and interactions in data; however, this requires large datasets to prevent overfitting. For both machine learning and biostatistics, external validation of a developed model in a comparable setting is required to evaluate a model's reproducibility. Machine learning models are not easily implemented in clinical practice, since they are perceived as black boxes (i.e., non-intuitive). For this purpose, research initiatives are ongoing within the field of explainable artificial intelligence. Finally, the application of machine learning for automated imaging analysis and the development of clinical decision support systems is discussed.
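The claim that complex models can detect interactions a linear model misses is easy to demonstrate. In this synthetic sketch the outcome depends on a pure interaction between two variables (XOR-like), so logistic regression performs near chance while a tree ensemble recovers it; the data and setup are illustrative, not from the paper.

```python
# Sketch: linear model vs. tree ensemble on interaction-only data.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 2))
# Outcome depends only on the interaction of the two predictors (XOR)
y = ((X[:, 0] > 0) ^ (X[:, 1] > 0)).astype(int)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

lr_acc = LogisticRegression().fit(X_tr, y_tr).score(X_te, y_te)      # ~chance
rf_acc = RandomForestClassifier(random_state=0).fit(X_tr, y_tr).score(X_te, y_te)
```

The flip side noted in the text also holds: the forest's extra flexibility is exactly what overfits when the dataset is small, which is why large samples and external validation matter more for such models.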


2020 ◽  
Vol 12 (4) ◽  
pp. 1525 ◽  
Author(s):  
Feifei Yang ◽  
David W. Wanik ◽  
Diego Cerrai ◽  
Md Abul Ehsan Bhuiyan ◽  
Emmanouil N. Anagnostou

A growing number of electricity utilities use machine learning-based outage prediction models (OPMs) to predict the impact of storms on their networks for sustainable management. The accuracy of OPM predictions is sensitive to sample size and event severity representativeness in the training dataset, the extent of which has not yet been quantified. This study devised a randomized, out-of-sample validation experiment to quantify the sensitivity of an OPM's prediction uncertainty to training sample size and event severity representativeness. The study showed random error decreasing by more than 100% for sample sizes ranging from 10 to 80 extratropical events, and by 32% for sample sizes from 10 to 40 thunderstorms. This study quantified the minimum sample size for the OPM to attain acceptable prediction performance. The results demonstrated that conditioning the training of the OPM on a subset of events representative of the predicted event's severity reduced the underestimation bias exhibited in high-impact events and the overestimation bias in low-impact ones. We used cross entropy (CE) to quantify the relatedness of weather variable distributions between the training dataset and the forecasted event.
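The cross-entropy check at the end can be sketched as follows: histogram a weather variable for the forecasted event and for the training set over a common range, then compute CE of the event distribution against the training distribution. Lower CE means the training data is more representative of the event. The variable choice and distributions below are illustrative assumptions, not the study's data.

```python
# Sketch: cross entropy between a training-set distribution and a
# forecasted event's distribution for one weather variable.
import numpy as np

def cross_entropy(train_sample, event_sample, bins=20):
    lo = min(train_sample.min(), event_sample.min())
    hi = max(train_sample.max(), event_sample.max())
    # p: event distribution, q: training distribution, on shared bins
    p, edges = np.histogram(event_sample, bins=bins, range=(lo, hi))
    q, _ = np.histogram(train_sample, bins=edges)
    p = p / p.sum()
    q = (q + 1e-9) / (q.sum() + bins * 1e-9)   # smooth to avoid log(0)
    return float(-(p * np.log(q)).sum())

rng = np.random.default_rng(0)
event   = rng.normal(10, 2, size=500)    # e.g. wind gusts of the forecast event
similar = rng.normal(10, 2, size=5000)   # representative training events
shifted = rng.normal(4, 2, size=5000)    # unrepresentative (milder) events

ce_similar = cross_entropy(similar, event)
ce_shifted = cross_entropy(shifted, event)
```

Selecting training events that minimize this CE is one concrete way to implement the "condition training on severity-representative events" finding.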


2019 ◽  
Author(s):  
Oskar Flygare ◽  
Jesper Enander ◽  
Erik Andersson ◽  
Brjánn Ljótsson ◽  
Volen Z Ivanov ◽  
...  

**Background:** Previous attempts to identify predictors of treatment outcomes in body dysmorphic disorder (BDD) have yielded inconsistent findings. One way to increase precision and clinical utility could be to use machine learning methods, which can incorporate multiple non-linear associations in prediction models. **Methods:** This study used a random forests machine learning approach to test whether it is possible to reliably predict remission from BDD in a sample of 88 individuals who had received internet-delivered cognitive behavioral therapy for BDD. The random forest models were compared to traditional logistic regression analyses. **Results:** Random forests correctly identified 78% of participants as remitters or non-remitters at post-treatment. The accuracy of prediction was lower at subsequent follow-ups (68%, 66% and 61% correctly classified at 3-, 12- and 24-month follow-ups, respectively). Depressive symptoms, treatment credibility, working alliance, and initial severity of BDD were among the most important predictors at the beginning of treatment. By contrast, the logistic regression models did not identify consistent and strong predictors of remission from BDD. **Conclusions:** The results provide initial support for the clinical utility of machine learning approaches in the prediction of outcomes of patients with BDD. **Trial registration:** ClinicalTrials.gov ID: NCT02010619.
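A random-forest workflow of the kind described (classify remitters, then inspect variable importance) can be sketched on synthetic data. The predictor names echo those reported as important, but the data, effect sizes, and resulting numbers here are fabricated stand-ins, not the trial's data.

```python
# Sketch: random-forest remission classifier with variable importances.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 88                                   # sample size matching the study
X = pd.DataFrame({
    "depressive_symptoms":   rng.normal(size=n),
    "treatment_credibility": rng.normal(size=n),
    "baseline_severity":     rng.normal(size=n),
    "age":                   rng.normal(size=n),
})
# Synthetic remission outcome driven by two of the predictors
logit = 1.2 * X["treatment_credibility"] - 1.0 * X["depressive_symptoms"]
y = (logit + rng.normal(scale=0.8, size=n) > 0).astype(int)

clf = RandomForestClassifier(n_estimators=500, random_state=0)
acc = cross_val_score(clf, X, y, cv=5).mean()      # cross-validated accuracy
importances = pd.Series(clf.fit(X, y).feature_importances_,
                        index=X.columns).sort_values(ascending=False)
```

The importance ranking is what gives this approach its clinical interpretability relative to a logistic regression that found no consistent predictors.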


2020 ◽  
Author(s):  
Sina Faizollahzadeh Ardabili ◽  
Amir Mosavi ◽  
Pedram Ghamisi ◽  
Filip Ferdinand ◽  
Annamaria R. Varkonyi-Koczy ◽  
...  

Several outbreak prediction models for COVID-19 are being used by officials around the world to make informed decisions and enforce relevant control measures. Among the standard models for COVID-19 global pandemic prediction, simple epidemiological and statistical models have received more attention from authorities, and they are popular in the media. Due to a high level of uncertainty and lack of essential data, standard models have shown low accuracy for long-term prediction. Although the literature includes several attempts to address this issue, the essential generalization and robustness abilities of existing models need to be improved. This paper presents a comparative analysis of machine learning and soft computing models to predict the COVID-19 outbreak as an alternative to SIR and SEIR models. Among a wide range of machine learning models investigated, two models showed promising results: the multi-layered perceptron (MLP) and the adaptive network-based fuzzy inference system (ANFIS). Based on the results reported here, and due to the highly complex nature of the COVID-19 outbreak and variation in its behavior from nation to nation, this study suggests machine learning as an effective tool to model the outbreak. This paper provides an initial benchmarking to demonstrate the potential of machine learning for future research. The paper further suggests that real novelty in outbreak prediction can be realized through integrating machine learning and SEIR models.
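The MLP approach can be sketched as a growth-curve regressor: instead of fitting a fixed-form SIR/SEIR model, a small network learns the case trajectory directly. The logistic "case" curve below is synthetic and the architecture is an arbitrary illustration, not the paper's tuned configuration.

```python
# Sketch: MLP fit to an outbreak growth curve (synthetic logistic data).
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

days = np.arange(100).reshape(-1, 1)
# Synthetic cumulative-case curve: logistic growth saturating at 10,000
cases = 1e4 / (1 + np.exp(-0.15 * (days.ravel() - 50)))
cases_scaled = cases / cases.max()               # scale target to [0, 1]

model = make_pipeline(
    StandardScaler(),
    MLPRegressor(hidden_layer_sizes=(32, 32), solver="lbfgs",
                 max_iter=5000, random_state=0))
model.fit(days, cases_scaled)
fit_error = np.abs(model.predict(days) - cases_scaled).mean()
```

The flexibility that lets the MLP fit such curves closely is also why the paper cautions about generalization: extrapolating beyond the observed days is far less reliable than the in-sample fit suggests, motivating the proposed hybrid with SEIR structure.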


2020 ◽  
pp. 73-75
Author(s):  
B.M. Bazrov ◽  
T.M. Gaynutdinov

The selection of technological bases is considered prior to choosing the type of billet and developing the route of the technological process. A technique is proposed for selecting the minimum number of sets of technological bases according to the criterion of equal cost price of manufacturing the part, following the principle of unity and combination of bases at this stage. Keywords: part, surface, coordinating size, accuracy, design and technological base, labor input, cost price.


2019 ◽  
Vol 21 (9) ◽  
pp. 662-669 ◽  
Author(s):  
Junnan Zhao ◽  
Lu Zhu ◽  
Weineng Zhou ◽  
Lingfeng Yin ◽  
Yuchen Wang ◽  
...  

Background: Thrombin is the central protease of the vertebrate blood coagulation cascade and is closely related to cardiovascular diseases. The inhibitory constant Ki is the most significant property of thrombin inhibitors. Method: This study was carried out to predict Ki values of thrombin inhibitors based on a large dataset using machine learning methods. Taking advantage of its ability to find non-intuitive regularities in high-dimensional datasets, machine learning can be used to build effective predictive models. A total of 6554 descriptors for each compound were collected, and an efficient descriptor selection method was used to find the appropriate descriptors. Four different methods, multiple linear regression (MLR), K Nearest Neighbors (KNN), Gradient Boosting Regression Tree (GBRT) and Support Vector Machine (SVM), were implemented to build prediction models with these selected descriptors. Results: The SVM model was the best among these methods, with R² = 0.84, MSE = 0.55 for the training set and R² = 0.83, MSE = 0.56 for the test set. Several validation methods, such as the y-randomization test and applicability domain evaluation, were adopted to assess the robustness and generalization ability of the model. The final model shows excellent stability and predictive ability and can be employed for rapid estimation of the inhibitory constant, which is helpful for designing novel thrombin inhibitors.
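The winning SVM regression step can be sketched generically: standardize the selected descriptors and fit an RBF-kernel support vector regressor. The data below are synthetic stand-ins for molecular descriptors and Ki values, and the hyperparameters are illustrative, not those tuned in the study.

```python
# Sketch: RBF-kernel SVM regression on (synthetic) selected descriptors.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.svm import SVR
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

# Stand-in for descriptors already reduced by the selection step
X, y = make_regression(n_samples=400, n_features=10, n_informative=5,
                       noise=10.0, random_state=0)
y = (y - y.mean()) / y.std()            # standardize the pKi-style target
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Scaling inside the pipeline avoids leaking test-set statistics
svr = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0, epsilon=0.1))
svr.fit(X_tr, y_tr)
r2 = r2_score(y_te, svr.predict(X_te))
```

In practice `C`, `epsilon`, and `gamma` would be tuned by cross-validation, and robustness confirmed with checks like the y-randomization test the abstract mentions.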

