Interpretable machine learning for high-dimensional trajectories of aging health

2022 ◽  
Vol 18 (1) ◽  
pp. e1009746
Author(s):  
Spencer Farrell ◽  
Arnold Mitnitski ◽  
Kenneth Rockwood ◽  
Andrew D. Rutenberg

We have built a computational model for individual aging trajectories of health and survival, which contains physical, functional, and biological variables, and is conditioned on demographic, lifestyle, and medical background information. We combine techniques of modern machine learning with an interpretable interaction network, where health variables are coupled by explicit pair-wise interactions within a stochastic dynamical system. Our dynamic joint interpretable network (DJIN) model is scalable to large longitudinal data sets, is predictive of individual high-dimensional health trajectories and survival from baseline health states, and infers an interpretable network of directed interactions between the health variables. The network identifies plausible physiological connections between health variables as well as clusters of strongly connected health variables. We use English Longitudinal Study of Aging (ELSA) data to train our model and show that it performs better than multiple dedicated linear models for health outcomes and survival. We compare our model with flexible lower-dimensional latent-space models to explore the dimensionality required to accurately model aging health outcomes. Our DJIN model can be used to generate synthetic individuals that age realistically, to impute missing data, and to simulate future aging outcomes given arbitrary initial health states.
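The core of the DJIN idea — health variables coupled by explicit pairwise directed interactions inside a stochastic dynamical system — can be sketched in a few lines. This is a minimal illustrative simulation, not the trained model: the interaction matrix `W`, drift `mu`, noise scale `sigma`, and the Euler–Maruyama integration scheme are all stand-in assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

n_vars = 5                                  # number of health variables
W = rng.normal(0, 0.1, (n_vars, n_vars))    # directed pairwise interactions (illustrative)
np.fill_diagonal(W, -0.5)                   # self-damping keeps trajectories bounded
mu = rng.normal(0, 0.05, n_vars)            # per-variable baseline drift
sigma = 0.02                                # diffusion (noise) strength
dt = 0.1                                    # integration time step

def step(x, rng):
    """One Euler-Maruyama update: dx = (W x + mu) dt + sigma dW."""
    drift = W @ x + mu
    noise = sigma * np.sqrt(dt) * rng.normal(size=x.shape)
    return x + drift * dt + noise

x = rng.normal(0, 1, n_vars)                # baseline health state
trajectory = [x.copy()]
for _ in range(100):                        # simulate 100 steps of "aging"
    x = step(x, rng)
    trajectory.append(x.copy())
trajectory = np.asarray(trajectory)         # shape: (101, n_vars)
```

Reading off which entries of `W` are large (and in which direction) is what makes this style of model interpretable: each entry is an explicit directed influence of one health variable on another.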

2021 ◽  
Vol 5 (Supplement_1) ◽  
pp. 676-676
Author(s):  
Spencer Farrell ◽  
Arnold Mitnitski ◽  
Kenneth Rockwood ◽  
Andrew Rutenberg

Abstract We have built a computational model of individual aging trajectories of health and survival, which contains physical, functional, and biological variables, and is conditioned on demographic, lifestyle, and medical background information. We combine techniques of modern machine learning with an interpretable network approach, where health variables are coupled by an explicit interaction network within a stochastic dynamical system. Our model is scalable to large longitudinal data sets, is predictive of individual high-dimensional health trajectories and survival from baseline health states, and infers an interpretable network of directed interactions between the health variables. The network identifies plausible physiological connections between health variables and clusters of strongly connected health variables. We use English Longitudinal Study of Aging (ELSA) data to train our model and show that it performs better than traditional linear models for health outcomes and survival. Our model can also be used to generate synthetic individuals that age realistically, to impute missing data, and to simulate future aging outcomes given an arbitrary initial health state.


2020 ◽  
Vol 4 (Supplement_1) ◽  
pp. 923-923
Author(s):  
Spencer Farrell ◽  
Arnold Mitnitski ◽  
Kenneth Rockwood ◽  
Andrew Rutenberg

Abstract We have built a computational model of individual aging trajectories of health and survival, containing physical, functional, and biological variables, conditioned on demographic, lifestyle, and medical background information. We combine techniques of modern machine learning with a network approach, where the health variables are coupled by an interaction network within a stochastic dynamical system. The resulting model is scalable to large longitudinal data sets, is predictive of individual high-dimensional health trajectories and survival, and infers an interpretable network of interactions between the health variables. The interaction network lets us identify which interactions between variables the model uses, demonstrating that realistic physiological connections are inferred. We use English Longitudinal Study of Aging (ELSA) data to train our model and show that it performs better than standard linear models for health outcomes and survival, while also revealing the relevant interactions. Our model can be used to generate synthetic individuals that age realistically from input data at baseline, and to probe future aging outcomes given an arbitrary initial health state.


2020 ◽  
Vol 10 (5) ◽  
pp. 1797 ◽  
Author(s):  
Mera Kartika Delimayanti ◽  
Bedy Purnama ◽  
Ngoc Giang Nguyen ◽  
Mohammad Reza Faisal ◽  
Kunti Robiatul Mahmudah ◽  
...  

Manual classification of sleep stage is a time-consuming but necessary step in the diagnosis and treatment of sleep disorders, and its automation has been an area of active study. Previous works have applied low-dimensional fast Fourier transform (FFT) features together with a variety of machine learning algorithms. In this paper, we demonstrate the utilization of features extracted from EEG signals via FFT to improve the performance of automated sleep stage classification through machine learning methods. Unlike previous works using FFT, we incorporated thousands of FFT features in order to classify the sleep stages into 2–6 classes. Using the expanded version of the Sleep-EDF dataset with 61 recordings, our method outperformed other state-of-the-art methods. This result indicates that high-dimensional FFT features in combination with simple feature selection are effective for the improvement of automated sleep stage classification.
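The feature-extraction step described above can be sketched as follows. This is a hedged illustration, not the authors' pipeline: the sampling rate, the 30-second epoch length (standard in sleep scoring), the stand-in random signals, and the variance-based selection rule are all assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(1)

fs = 100                        # Hz; assumed EEG sampling rate
epoch = 30 * fs                 # 30-second epochs -> 3000 samples each
n_epochs = 20
signals = rng.normal(size=(n_epochs, epoch))   # stand-in EEG epochs

# Thousands of FFT features per epoch: the magnitude spectrum of each epoch.
features = np.abs(np.fft.rfft(signals, axis=1))  # shape (n_epochs, epoch//2 + 1)

# A simple feature-selection step: keep the k highest-variance frequency bins.
k = 200
order = np.argsort(features.var(axis=0))[::-1][:k]
selected = features[:, order]                    # shape (n_epochs, k)
```

Each 3000-sample epoch yields 1501 FFT magnitude features, so even a modest recording produces the kind of high-dimensional feature matrix the abstract refers to; the selected subset would then feed a downstream classifier.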


2021 ◽  
pp. 1-36
Author(s):  
Henry Prakken ◽  
Rosa Ratsma

This paper proposes a formal top-level model of explaining the outputs of machine-learning-based decision-making applications and evaluates it experimentally with three data sets. The model draws on AI & law research on argumentation with cases, which models how lawyers draw analogies to past cases and discuss their relevant similarities and differences in terms of relevant factors and dimensions in the problem domain. A case-based approach is natural since the input data of machine-learning applications can be seen as cases. While the approach is motivated by legal decision making, it also applies to other kinds of decision making, such as commercial decisions about loan applications or employee hiring, as long as the outcome is binary and the input conforms to this paper’s factor- or dimension format. The model is top-level in that it can be extended with more refined accounts of similarities and differences between cases. It is shown to overcome several limitations of similar argumentation-based explanation models, which only have binary features and do not represent the tendency of features towards particular outcomes. The results of the experimental evaluation studies indicate that the model may be feasible in practice, but that further development and experimentation is needed to confirm its usefulness as an explanation model. Main challenges here are selecting from a large number of possible explanations, reducing the number of features in the explanations and adding more meaningful information to them. It also remains to be investigated how suitable our approach is for explaining non-linear models.
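The factor-based comparison at the heart of such case-based explanations can be sketched minimally: each case is a set of factors, each factor has a tendency toward one of two outcomes, and citing a precedent means listing its relevant similarities and differences with the focus case. The factor names, the loan-approval framing, and the three-way breakdown below are invented for illustration and are not taken from the paper's datasets or its formal model.

```python
# Which binary outcome a factor favours (illustrative loan-approval domain).
PRO, CON = "pro", "con"

tendency = {
    "stable_income": PRO, "collateral": PRO,
    "missed_payments": CON, "high_debt_ratio": CON,
}

def explain(focus, precedent, outcome):
    """Relevant similarities and differences of a cited precedent.

    `focus` and `precedent` are sets of factors; `outcome` is the outcome
    the precedent is cited to support.
    """
    shared = focus & precedent
    return {
        # shared factors that favour the cited outcome strengthen the analogy
        "similarities": {f for f in shared if tendency[f] == outcome},
        # supporting factors the precedent had but the focus case lacks
        "precedent_only": {f for f in precedent - focus if tendency[f] == outcome},
        # factors of the focus case that cut against the cited outcome
        "focus_only_against": {f for f in focus - precedent if tendency[f] != outcome},
    }

focus = {"stable_income", "high_debt_ratio"}
precedent = {"stable_income", "collateral"}
expl = explain(focus, precedent, PRO)
```

The paper's model goes further (dimensions with values, not just binary factors, and outcome tendencies per dimension), but this shows the basic similarity/difference bookkeeping that an argumentation-based explanation is built from.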


2021 ◽  
Vol 7 (1) ◽  
Author(s):  
Helder Sebastião ◽  
Pedro Godinho

Abstract This study examines the predictability of three major cryptocurrencies—bitcoin, ethereum, and litecoin—and the profitability of trading strategies devised upon machine learning techniques (e.g., linear models, random forests, and support vector machines). The models are validated in a period characterized by unprecedented turmoil and tested in a period of bear markets, allowing the assessment of whether the predictions are good even when the market direction changes between the validation and test periods. The classification and regression methods use attributes from trading and network activity for the period from August 15, 2015 to March 03, 2019, with the test sample beginning on April 13, 2018. For the test period, five out of 18 individual models have success rates of less than 50%. The trading strategies are built on model assembling. The ensemble assuming that five models produce identical signals (Ensemble 5) achieves the best performance for ethereum and litecoin, with annualized Sharpe ratios of 80.17% and 91.35% and annualized returns (after proportional round-trip trading costs of 0.5%) of 9.62% and 5.73%, respectively. These positive results support the claim that machine learning provides robust techniques for exploring the predictability of cryptocurrencies and for devising profitable trading strategies in these markets, even under adverse market conditions.
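The "Ensemble 5" rule described above — trade only when at least five individual models emit the same signal — reduces to a small voting function. This is a sketch of the agreement logic under stated assumptions (signals encoded as +1 long, -1 short, 0 neutral); the example signal lists are made up and are not the paper's model outputs.

```python
def ensemble_signal(signals, quorum=5):
    """Return +1 (long) or -1 (short) when at least `quorum` models agree
    on that direction; otherwise return 0 (stay out of the market)."""
    longs = sum(1 for s in signals if s == 1)
    shorts = sum(1 for s in signals if s == -1)
    if longs >= quorum:
        return 1
    if shorts >= quorum:
        return -1
    return 0

# Usage: five of six models agree on long -> trade long;
# a split panel -> no position.
ensemble_signal([1, 1, 1, 1, 1, -1])
ensemble_signal([1, 1, -1, -1, 0, 0])
```

Requiring a quorum trades off fewer signals against higher-conviction ones, which is consistent with the abstract's finding that the strict-agreement ensemble performed best after round-trip trading costs.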


2021 ◽  
Vol 11 (13) ◽  
pp. 6030
Author(s):  
Daljeet Singh ◽  
Antonella B. Francavilla ◽  
Simona Mancini ◽  
Claudio Guarnaccia

A vehicular road traffic noise prediction methodology based on machine learning techniques has been presented. The road traffic parameters that have been considered are traffic volume, percentage of heavy vehicles, honking occurrences and the equivalent continuous sound pressure level (Leq). A method to include the honking effect in the traffic noise prediction has been illustrated. The techniques that have been used for the prediction of traffic noise are decision trees, random forests, generalized linear models and artificial neural networks. The results obtained by using these methods have been compared on the basis of mean square error, correlation coefficient, coefficient of determination and accuracy. It has been observed that honking is an important parameter and contributes to the overall traffic noise, especially in congested Indian road traffic conditions. The effects of honking noise on human health cannot be ignored, and honking should be included as a parameter in future traffic noise prediction models.
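The prediction setup above — Leq regressed on traffic volume, percentage of heavy vehicles, and honking occurrences — can be sketched with a plain least-squares fit. The paper compares decision trees, random forests, generalized linear models and neural networks; ordinary least squares stands in here only to keep the example dependency-free, and all numbers (coefficients, ranges, noise level) are synthetic assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

n = 200
volume = rng.uniform(100, 2000, n)     # vehicles per hour (synthetic)
heavy_pct = rng.uniform(0, 30, n)      # % heavy vehicles (synthetic)
honks = rng.poisson(20, n)             # honking occurrences per interval (synthetic)

# Synthetic Leq in dB(A): log-of-volume term plus heavy-vehicle and honking
# contributions, with measurement noise. Coefficients are illustrative only.
leq = (40 + 8 * np.log10(volume) + 0.1 * heavy_pct + 0.05 * honks
       + rng.normal(0, 0.5, n))

# Design matrix with intercept; fit by ordinary least squares.
X = np.column_stack([np.ones(n), np.log10(volume), heavy_pct, honks])
coef, *_ = np.linalg.lstsq(X, leq, rcond=None)
pred = X @ coef

mse = np.mean((leq - pred) ** 2)       # mean square error
r = np.corrcoef(leq, pred)[0, 1]       # correlation coefficient
```

Comparing fits with and without the `honks` column is one simple way to quantify the abstract's point that honking carries predictive information beyond volume and fleet composition.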


2021 ◽  
Vol 11 (2) ◽  
pp. 472
Author(s):  
Hyeongmin Cho ◽  
Sangkyun Lee

Machine learning has been proven to be effective in various application areas, such as object and speech recognition on mobile systems. Since a critical key to machine learning success is the availability of large training data, many datasets are being disclosed and published online. From a data consumer or manager point of view, measuring data quality is an important first step in the learning process. We need to determine which datasets to use, update, and maintain. However, not many practical ways to measure data quality are available today, especially when it comes to large-scale high-dimensional data, such as images and videos. This paper proposes two data quality measures that can compute class separability and in-class variability, the two important aspects of data quality, for a given dataset. Classical data quality measures tend to focus only on class separability; however, we suggest that in-class variability is another important data quality factor. We provide efficient algorithms to compute our quality measures based on random projections and bootstrapping with statistical benefits on large-scale high-dimensional data. In experiments, we show that our measures are compatible with classical measures on small-scale data and can be computed much more efficiently on large-scale high-dimensional datasets.
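The two quality aspects named above — class separability and in-class variability — can be illustrated in a deliberately simplified form, with a random projection standing in for the paper's scalability trick on high-dimensional data. The estimators below (between-class distance over mean within-class spread) are illustrative stand-ins, not the paper's exact measures, and the bootstrapping step is omitted.

```python
import numpy as np

rng = np.random.default_rng(3)

# Two synthetic classes in a 512-dimensional space; class 1 has a shifted mean.
d_high, d_low = 512, 16
X0 = rng.normal(0.0, 1.0, (300, d_high))
X1 = rng.normal(1.0, 1.0, (300, d_high))

# Random projection: cheap dimensionality reduction before measuring quality,
# which is what makes this kind of measure tractable on large-scale data.
P = rng.normal(size=(d_high, d_low)) / np.sqrt(d_low)
Z0, Z1 = X0 @ P, X1 @ P

def in_class_variability(Z):
    """Mean distance of points from their class centroid."""
    return np.mean(np.linalg.norm(Z - Z.mean(axis=0), axis=1))

def separability(Za, Zb):
    """Between-class centroid distance relative to mean within-class spread."""
    between = np.linalg.norm(Za.mean(axis=0) - Zb.mean(axis=0))
    within = 0.5 * (in_class_variability(Za) + in_class_variability(Zb))
    return between / within
```

Two halves of the same class should score far lower on separability than the two genuinely different classes, while in-class variability captures how spread out each class is on its own — the second quality factor the abstract argues classical measures neglect.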

