Deep learning for clustering of multivariate clinical patient trajectories with missing values

GigaScience ◽  
2019 ◽  
Vol 8 (11) ◽  
Author(s):  
Johann de Jong ◽  
Mohammad Asif Emon ◽  
Ping Wu ◽  
Reagon Karki ◽  
Meemansa Sood ◽  
...  

Abstract Background Precision medicine requires a stratification of patients by disease presentation that is sufficiently informative to allow for selecting treatments on a per-patient basis. For many diseases, such as neurological disorders, this stratification problem translates into a complex problem of clustering multivariate and relatively short time series because (i) these diseases are multifactorial and not well described by single clinical outcome variables and (ii) disease progression needs to be monitored over time. Additionally, clinical data are often hindered by the presence of many missing values, further complicating any clustering attempt. Findings The problem of clustering multivariate short time series with many missing values is generally not well addressed in the literature. In this work, we propose a deep learning–based method to address this issue, variational deep embedding with recurrence (VaDER). VaDER relies on a Gaussian mixture variational autoencoder framework, which is further extended to (i) model multivariate time series and (ii) directly deal with missing values. We validated VaDER by accurately recovering clusters from simulated and benchmark data with known ground truth clustering, while varying the degree of missingness. We then used VaDER to successfully stratify patients with Alzheimer disease and patients with Parkinson disease into subgroups characterized by clinically divergent disease progression profiles. Additional analyses demonstrated that these clinical differences reflected known underlying aspects of Alzheimer disease and Parkinson disease. Conclusions We believe our results show that VaDER can be of great value for future efforts in patient stratification, and multivariate time-series clustering in general.
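The abstract does not say how VaDER deals with missing values internally. As an illustration only, one common way for an autoencoder to handle missingness directly is to compute the reconstruction loss over observed entries alone. The sketch below assumes missing measurements are encoded as NaN and that a trajectory is a nested list shaped [time][variable]; the function name `masked_mse` and the toy data are hypothetical and not taken from the paper.

```python
import math

def masked_mse(observed, reconstructed):
    """Mean squared error over observed entries only.

    Both arguments are trajectories shaped [time][variable].
    Missing measurements in `observed` are encoded as NaN
    (an assumption of this sketch, not of the paper).
    """
    total, count = 0.0, 0
    for obs_row, rec_row in zip(observed, reconstructed):
        for obs, rec in zip(obs_row, rec_row):
            if math.isnan(obs):
                continue  # missing entry: contributes nothing to the loss
            total += (obs - rec) ** 2
            count += 1
    return total / count if count else 0.0

nan = float("nan")
# Hypothetical patient trajectory: 3 visits, 2 clinical variables.
patient = [[1.0, nan], [2.0, 4.0], [nan, 5.0]]
reconstruction = [[1.5, 9.9], [2.0, 3.0], [7.7, 5.0]]

loss = masked_mse(patient, reconstruction)
print(loss)  # 0.3125: only the four observed entries are scored
```

Whatever the model reconstructs at a missing position has no effect on the loss, so no imputation step is needed before training in this simplified setting.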

Author(s):  
Hiroko Kato Solvang ◽  
Benjamin Planque

Abstract We propose a trend estimation and classification (TREC) approach to estimating dominant common trends among multivariate time series observations. Our method is based on two statistical procedures: trend modelling, and discriminant analysis for classifying similar trends into common trend classes. We use simulations to evaluate the proposed approach and compare it with a relevant dynamic factor analysis in the time domain, which was recently proposed to estimate common trends in fisheries time series. We apply the TREC approach to the multivariate short time series datasets investigated by the ICES integrated assessment working groups for the Norwegian Sea and the Barents Sea. The proposed approach is robust for application to short time series, and it directly identifies and classifies the dominant trends underlying observations. Based on the classified trend classes, we suggest that communication among stakeholders such as marine managers, industry representatives, non-governmental organizations, and governmental agencies can be enhanced by finding the common tendency between a biological community in a marine ecosystem and the environmental factors, as well as by the icons produced by generalizing common trend patterns.


2019 ◽  
Vol 16 (159) ◽  
pp. 20190629 ◽  
Author(s):  
Els Weinans ◽  
J. Jelle Lever ◽  
Sebastian Bathiany ◽  
Rick Quax ◽  
Jordi Bascompte ◽  
...  

The dynamics of complex systems, such as ecosystems, financial markets and the human brain, emerge from the interactions of numerous components. We often lack the knowledge to build reliable models for the behaviour of such network systems. This makes it difficult to predict potential instabilities. We show that one could use the natural fluctuations in multivariate time series to reveal network regions with particularly slow dynamics. The multidimensional slowness points to the direction of minimal resilience, in the sense that simultaneous perturbations on this set of nodes will take longest to recover. We compare an autocorrelation-based method with a variance-based method for different time-series lengths, data resolution and different noise regimes. We show that the autocorrelation-based method is less robust for short time series or time series with a low resolution but more robust for varying noise levels. This novel approach may help to identify unstable regions of multivariate systems or to distinguish safe from unsafe perturbations.
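The two families of indicators compared above can be illustrated on single variables. The sketch below computes the variance and the lag-1 autocorrelation of a series; it is a per-variable simplification made for illustration and is not the multivariate procedure of the paper, which looks for a direction across many nodes. The function names and the two toy series are hypothetical.

```python
def variance(series):
    """Population variance of a series."""
    mean = sum(series) / len(series)
    return sum((v - mean) ** 2 for v in series) / len(series)

def lag1_autocorrelation(series):
    """Correlation between the series and itself shifted by one step."""
    mean = sum(series) / len(series)
    num = sum((a - mean) * (b - mean) for a, b in zip(series, series[1:]))
    den = sum((v - mean) ** 2 for v in series)
    return num / den if den else 0.0

# Toy data: one slowly drifting variable, one rapidly alternating variable.
slow = [0, 1, 2, 3, 4, 5, 6, 7]
fast = [0, 1, 0, 1, 0, 1, 0, 1]

slow_ac = lag1_autocorrelation(slow)   # 0.625
fast_ac = lag1_autocorrelation(fast)   # -0.875
slow_var = variance(slow)              # 5.25
fast_var = variance(fast)              # 0.25
```

In this toy case both indicators rank the drifting variable as the slower one. The abstract's comparison concerns how the two kinds of indicator behave differently as series length, resolution and noise change.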


Author(s):  
Tie Liang ◽  
Qingyu Zhang ◽  
Xiaoguang Liu ◽  
Bin Dong ◽  
Xiuling Liu ◽  
...  

Abstract Background The key challenge to constructing functional corticomuscular coupling (FCMC) is to accurately identify the direction and strength of the information flow between scalp electroencephalography (EEG) and surface electromyography (SEMG). Traditional transfer entropy (TE) and time-delayed mutual information (TDMI) methods have difficulty in identifying the information interaction for short time series as they tend to rely on long and stable data, so we propose a time-delayed maximal information coefficient (TDMIC) method. With this method, we aim to investigate the directional specificity of bidirectional total and nonlinear information flow on FCMC, and to explore the neural mechanisms underlying motor dysfunction in stroke patients. Methods We introduced a time-delayed parameter in the maximal information coefficient to capture the direction of information interaction between two time series. We employed linear and nonlinear system models based on short data to verify the validity of our algorithm. We then used the TDMIC method to study the characteristics of total and nonlinear information flow in FCMC during a dorsiflexion task for healthy controls and stroke patients. Results The simulation results showed that the TDMIC method can better detect the direction of information interaction compared with TE and TDMI methods. For healthy controls, the beta band (14–30 Hz) had higher information flow in FCMC than the gamma band (31–45 Hz). Furthermore, the beta-band total and nonlinear information flow in the descending direction (EEG to EMG) was significantly higher than that in the ascending direction (EMG to EEG), whereas in the gamma band the ascending direction had significantly higher information flow than the descending direction. Additionally, we found that the strong bidirectional information flow mainly acted on Cz, C3, CP3, P3 and CPz. Compared to controls, both the beta- and gamma-band bidirectional total and nonlinear information flows of the stroke group were significantly weaker. There was no significant difference in the direction of beta- and gamma-band information flow in the stroke group. Conclusions The proposed method could effectively identify the information interaction between short time series. According to our experiment, the beta band mainly passes downward motor control information while the gamma band features upward sensory feedback information delivery. Our observations demonstrate that the central and contralateral sensorimotor cortex play a major role in lower limb motor control. The study further demonstrates that brain damage caused by stroke disrupts the bidirectional information interaction between cortex and effector muscles in the sensorimotor system, leading to motor dysfunction.
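The idea of adding a time delay to a dependence measure in order to read off a direction can be sketched as follows. This is not the TDMIC algorithm: the maximal information coefficient is replaced here by the absolute Pearson correlation, which captures only linear dependence, so the example stays self-contained. The names `pearson` and `delayed_dependence`, the `measure` parameter, and the synthetic signals are all assumptions of the sketch.

```python
import random

def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5 if var_a and var_b else 0.0

def delayed_dependence(source, target, max_lag, measure=pearson):
    """Score dependence between source[t] and target[t + lag] for each lag."""
    return {
        lag: abs(measure(source[:-lag], target[lag:]))
        for lag in range(1, max_lag + 1)
    }

random.seed(0)
x = [random.gauss(0.0, 1.0) for _ in range(200)]
y = [0.0] * 3 + x[:-3]  # y copies x with a 3-sample delay

forward = delayed_dependence(x, y, max_lag=5)   # candidate flow x -> y
backward = delayed_dependence(y, x, max_lag=5)  # candidate flow y -> x
best_lag = max(forward, key=forward.get)        # lag with the strongest x -> y score
```

Because `y` is a delayed copy of `x`, the forward scan peaks at the true delay while the backward scan stays near zero, and that asymmetry is what indicates direction. A nonlinear measure such as MIC could be passed through the `measure` argument in place of the correlation.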


Author(s):  
Hossein Ebrahimidinaki ◽  
Shervin Shirmohammadi ◽  
Emil Janulewicz ◽  
David Cote

2021 ◽  
Vol 13 (3) ◽  
pp. 67
Author(s):  
Eric Hitimana ◽  
Gaurav Bajpai ◽  
Richard Musabe ◽  
Louis Sibomana ◽  
Jayavel Kayalvizhi

Many countries face challenges in implementing fire-incident prevention measures in buildings. The most critical issues are the localization, identification, and detection of room occupants. The Internet of Things (IoT), combined with machine learning, has been shown to make buildings smarter by providing real-time data acquisition through sensors and actuators for prediction mechanisms. This paper proposes an IoT framework that captures indoor environmental parameters as multivariate time-series occupancy data. A Long Short-Term Memory (LSTM) deep learning algorithm is used to infer the presence of people. An experiment was conducted in an office room using multivariate time series as predictors in a regression forecasting problem. The results demonstrate that the developed system can obtain, process, and store environmental information. The collected information was fed to the LSTM algorithm, which was compared with other machine learning algorithms: Support Vector Machine, Naïve Bayes Network, and Multilayer Perceptron Feed-Forward Network. The outcomes, based on the parametric calibrations, demonstrate that LSTM performs better in the context of the proposed application.
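Recurrent models such as LSTMs typically consume input shaped (samples, timesteps, features). As a hedged illustration of how multivariate sensor readings might be prepared as predictors for a regression forecast, the sketch below slices a series into fixed-length windows, each paired with the target value at the following step. The windowing scheme, the function name `make_windows`, the two sensor channels, and the values are assumptions; the abstract does not describe the authors' preprocessing.

```python
def make_windows(readings, target, window):
    """Build (X, y) pairs for one-step-ahead forecasting.

    readings: list shaped [time][feature]
    target:   list shaped [time]
    X[i] holds `window` consecutive rows of readings;
    y[i] is the target at the step right after that window.
    """
    X, y = [], []
    for start in range(len(readings) - window):
        X.append(readings[start:start + window])
        y.append(target[start + window])
    return X, y

# Hypothetical sensor log: [temperature in deg C, CO2 in ppm] per time step.
readings = [
    [21.0, 410.0],
    [21.2, 430.0],
    [21.5, 480.0],
    [21.9, 560.0],
    [22.0, 600.0],
    [21.8, 590.0],
]
occupancy = [0, 1, 2, 3, 3, 2]  # hypothetical occupant counts

X, y = make_windows(readings, occupancy, window=3)
# X has 3 samples, each with 3 timesteps of 2 features; y == [3, 3, 2]
```

The nested lists can be converted to an array of shape (3, 3, 2) and passed to an LSTM layer in the deep learning framework of choice.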

