Random forests for time-dependent processes

2020 ◽  
Vol 24 ◽  
pp. 801-826
Author(s):  
Benjamin Goehry

Random forests were introduced by Breiman in 2001. We study theoretical aspects of both Breiman’s original random forests and a simplified version, the centred random forests. Under the independent and identically distributed (i.i.d.) hypothesis, Scornet, Biau and Vert proved the consistency of Breiman’s random forest, while Biau studied the simplified version and obtained a rate of convergence in the sparse case. However, the i.i.d. hypothesis is generally not satisfied, for example when dealing with time series. We extend the previous results to the case where observations are weakly dependent, more precisely when the sequences are stationary β-mixing.
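The abstract above does not define the centred random forest, so the following is only a minimal sketch under a stated assumption: each tree splits a uniformly chosen coordinate at the midpoint of the current cell of [0, 1]^d, independently of the data, down to a fixed depth. The names `CentredTree`, `centred_forest_predict` and `ar1_lag_dataset` are hypothetical, and the Gaussian AR(1) series is used only as a familiar example of a stationary, weakly dependent sequence, not as the paper's setting.

```python
import math
import random


class CentredTree:
    """One centred tree on [0, 1]^dim: the splits never look at the data."""

    def __init__(self, dim, depth, rng):
        self.dim, self.depth, self.rng = dim, depth, rng
        self.split_coord = {}  # node path -> split coordinate, drawn lazily
        self.leaf_stats = {}   # leaf path -> (sum of responses, count)

    def _leaf(self, x):
        """Return the path (tuple of left/right moves) of the leaf containing x."""
        lower, upper = [0.0] * self.dim, [1.0] * self.dim
        path = ()
        for _ in range(self.depth):
            if path not in self.split_coord:
                # Assumption: coordinate chosen uniformly at random per node.
                self.split_coord[path] = self.rng.randrange(self.dim)
            j = self.split_coord[path]
            mid = 0.5 * (lower[j] + upper[j])  # split at the cell midpoint
            go_right = x[j] >= mid
            if go_right:
                lower[j] = mid
            else:
                upper[j] = mid
            path += (go_right,)
        return path

    def fit(self, X, y):
        for xi, yi in zip(X, y):
            leaf = self._leaf(xi)
            total, count = self.leaf_stats.get(leaf, (0.0, 0))
            self.leaf_stats[leaf] = (total + yi, count + 1)
        return self

    def predict(self, x):
        total, count = self.leaf_stats.get(self._leaf(x), (0.0, 0))
        return total / count if count else 0.0  # convention: 0 in an empty cell


def centred_forest_predict(X, y, x, n_trees=50, depth=4, seed=0):
    """Average the predictions of n_trees independently randomised trees."""
    rng = random.Random(seed)
    trees = [CentredTree(len(x), depth, rng).fit(X, y) for _ in range(n_trees)]
    return sum(tree.predict(x) for tree in trees) / n_trees


def ar1_lag_dataset(n, phi=0.5, burn_in=100, seed=1):
    """Illustrative dependent data: predict Z[t+1] from (Z[t], Z[t-1])."""
    rng = random.Random(seed)
    z, series = 0.0, []
    for t in range(burn_in + n + 2):
        z = phi * z + rng.gauss(0.0, 1.0)
        if t >= burn_in:  # drop the burn-in so the series is close to stationary
            series.append(z)
    squash = lambda v: 1.0 / (1.0 + math.exp(-v))  # map features into (0, 1)
    X = [[squash(series[t]), squash(series[t - 1])] for t in range(1, n + 1)]
    y = [series[t + 1] for t in range(1, n + 1)]
    return X, y
```

For instance, `X, y = ar1_lag_dataset(500)` followed by `centred_forest_predict(X, y, X[-1])` gives a one-step-ahead estimate from lagged values. Because the splits do not depend on the responses, the estimator is a plain local average, which is what makes this simplified version easier to analyse than Breiman's original procedure.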

Sensors ◽  
2021 ◽  
Vol 21 (10) ◽  
pp. 3553
Author(s):  
Jeremy Watts ◽  
Anahita Khojandi ◽  
Rama Vasudevan ◽  
Fatta B. Nahab ◽  
Ritesh A. Ramdhani

Parkinson’s disease medication treatment planning is generally based on subjective data obtained through clinical, physician-patient interactions. The Personal KinetiGraph™ (PKG) and similar wearable sensors have shown promise in enabling objective, continuous remote health monitoring for Parkinson’s patients. In this proof-of-concept study, we propose to use objective sensor data from the PKG and apply machine learning to cluster patients based on levodopa regimens and response. The resulting clusters are then used to enhance treatment planning by providing improved initial treatment estimates to supplement a physician’s initial assessment. We apply k-means clustering to a dataset of within-subject Parkinson’s medication changes—clinically assessed by the MDS-Unified Parkinson’s Disease Rating Scale-III (MDS-UPDRS-III) and the PKG sensor for movement staging. A random forest classification model was then used to predict patients’ cluster allocation based on their respective demographic information, MDS-UPDRS-III scores, and PKG time-series data. Clinically relevant clusters were partitioned by levodopa dose, medication administration frequency, and total levodopa equivalent daily dose—with the PKG providing similar symptomatic assessments to physician MDS-UPDRS-III scores. A random forest classifier trained on demographic information, MDS-UPDRS-III scores, and PKG time-series data was able to accurately classify subjects of the two most demographically similar clusters with an accuracy of 86.9%, an F1 score of 90.7%, and an AUC of 0.871. A model that relied solely on demographic information and PKG time-series data provided the next best performance with an accuracy of 83.8%, an F1 score of 88.5%, and an AUC of 0.831, hence further enabling fully remote assessments. 
These computational methods demonstrate the feasibility of using sensor-based data to cluster patients based on their medication responses with further potential to assist with medication recommendations.
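The first stage of the approach described above, k-means clustering, can be sketched as follows. This is a plain-Python version of Lloyd's algorithm written for illustration; the abstract does not say which software or settings the authors used, and the function name `kmeans`, the feature layout and the numbers in the usage note are all hypothetical. The resulting labels are what a downstream classifier, such as a random forest, would be trained to predict.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((u - v) ** 2 for u, v in zip(a, b))


def kmeans(points, k, n_iter=100, seed=0):
    """Lloyd's algorithm: alternate assignment and centroid update steps."""
    rng = random.Random(seed)
    # Initialise the centroids with k distinct rows picked at random.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties out
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids
```

A call such as `labels, centroids = kmeans(rows, k=3)` would take one row per patient, for example made-up values like `[dose_mg, doses_per_day]`. Features measured on very different scales should normally be standardised first, since k-means relies on Euclidean distance.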


2019 ◽  
Vol 11 (7) ◽  
pp. 866 ◽  
Author(s):  
Imke Hans ◽  
Martin Burgdorf ◽  
Stefan A. Buehler

Understanding the causes of inter-satellite biases in climate data records from observations of the Earth is crucial for constructing a consistent time series of the essential climate variables. In this article, we analyse the strong scan- and time-dependent biases observed for the microwave humidity sounders on board the NOAA-16 and NOAA-19 satellites. We find compelling evidence that radio frequency interference (RFI) is the cause of the biases. We also devise a correction scheme for the raw count signals for the instruments to mitigate the effect of RFI. Our results show that the RFI-corrected, recalibrated data exhibit distinctly reduced biases and provide consistent time series.


Sensors ◽  
2021 ◽  
Vol 21 (21) ◽  
pp. 7417
Author(s):  
Alex J. Hope ◽  
Utkarsh Vashisth ◽  
Matthew J. Parker ◽  
Andreas B. Ralston ◽  
Joshua M. Roper ◽  
...  

Concussion injuries remain a significant public health challenge. A significant unmet clinical need remains for tools that allow related physiological impairments and longer-term health risks to be identified earlier, better quantified, and more easily monitored over time. We address this challenge by combining a head-mounted wearable inertial motion unit (IMU)-based physiological vibration acceleration (“phybrata”) sensor and several candidate machine learning (ML) models. The performance of this solution is assessed for both binary classification of concussion patients and multiclass predictions of specific concussion-related neurophysiological impairments. Results are compared with previously reported approaches to ML-based concussion diagnostics. Using phybrata data from a previously reported concussion study population, four different machine learning models (Support Vector Machine, Random Forest Classifier, Extreme Gradient Boost, and Convolutional Neural Network) are first investigated for binary classification of the test population as healthy vs. concussion (Use Case 1). Results are compared for two different data preprocessing pipelines, Time-Series Averaging (TSA) and Non-Time-Series Feature Extraction (NTS). Next, the three best-performing NTS models are compared in terms of their multiclass prediction performance for specific concussion-related impairments: vestibular, neurological, both (Use Case 2). For Use Case 1, the NTS model approach outperformed the TSA approach, with the two best algorithms achieving an F1 score of 0.94. For Use Case 2, the NTS Random Forest model achieved the best performance in the testing set, with an F1 score of 0.90, and identified a wider range of relevant phybrata signal features that contributed to impairment classification compared with manual feature inspection and statistical data analysis. 
The overall classification performance achieved in the present work exceeds previously reported approaches to ML-based concussion diagnostics using other data sources and ML models. This study also demonstrates the first combination of a wearable IMU-based sensor and ML model that enables both binary classification of concussion patients and multiclass predictions of specific concussion-related neurophysiological impairments.


2017 ◽  
Author(s):  
Carlos J Corrada Bravo ◽  
Rafael Álvarez Berríos ◽  
T. Mitchell Aide

We developed a web-based, cloud-hosted system that allows users to archive, listen to, visualize, and annotate recordings. The system also provides tools to convert these annotations into datasets that can be used to train a computer to detect the presence or absence of a species. The algorithm used by the system was selected after comparing the accuracy and efficiency of three variants of a template-based classification. The algorithm computes a similarity vector by comparing a template of a species call with time increments across the spectrogram. Statistical features are extracted from this vector and used as input for a Random Forest classifier that predicts presence or absence of the species in the recording. The fastest algorithm variant had the highest average accuracy and specificity; therefore, it was implemented in the ARBIMON web-based system.
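The template-matching step described above can be illustrated with a small sketch. The abstract does not state which similarity measure or which statistical features the system uses, so Pearson correlation and the summary statistics below are assumptions, and the names `similarity_vector` and `summary_features` are hypothetical. The spectrogram and template are taken to be lists of rows (frequency bins) with the same number of rows.

```python
import math
import statistics


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if one is constant)."""
    mean_a, mean_b = statistics.mean(a), statistics.mean(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def similarity_vector(spectrogram, template, step=1):
    """Slide the template along the time axis and score every position."""
    width = len(template[0])
    n_time = len(spectrogram[0])
    flat_template = [v for row in template for v in row]
    sims = []
    for start in range(0, n_time - width + 1, step):
        # Flatten the window row by row so it lines up with the template.
        window = [v for row in spectrogram for v in row[start:start + width]]
        sims.append(pearson(window, flat_template))
    return sims


def summary_features(sims):
    """Assumed feature set: simple statistics of the similarity vector."""
    return {
        "max": max(sims),
        "mean": statistics.mean(sims),
        "std": statistics.pstdev(sims),
        "argmax": sims.index(max(sims)),
    }
```

One feature dictionary per recording, paired with a presence or absence label, would then form a training row for a classifier such as a random forest. The `step` parameter trades accuracy for speed by skipping time increments, which is one plausible way the variants mentioned above could differ.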


2021 ◽  
Vol 5 (CHI PLAY) ◽  
pp. 1-29
Author(s):  
Alessandro Canossa ◽  
Dmitry Salimov ◽  
Ahmad Azadvar ◽  
Casper Harteveld ◽  
Georgios Yannakakis

Is it possible to detect toxicity in games just by observing in-game behavior? If so, what are the behavioral factors that will help machine learning to discover the unknown relationship between gameplay and toxic behavior? In this initial study, we examine whether it is possible to predict toxicity in the MOBA game For Honor by observing in-game behavior for players that have been labeled as toxic (i.e. players that have been sanctioned by Ubisoft community managers). We test our hypothesis of detecting toxicity through gameplay with a dataset of almost 1,800 sanctioned players, comparing these sanctioned players with unsanctioned players. Sanctioned players are defined by their toxic action type (offensive behavior vs. unfair advantage) and degree of severity (warned vs. banned). Our findings, based on supervised learning with random forests, suggest that it is not only possible to behaviorally distinguish sanctioned from unsanctioned players based on selected features of gameplay; it is also possible to predict both the sanction severity (warned vs. banned) and the sanction type (offensive behavior vs. unfair advantage). In particular, all random forest models predict toxicity, its severity, and type, with an accuracy of at least 82%, on average, on unseen players. This research shows that observing in-game behavior can support the work of community managers in moderating and possibly containing the burden of toxic behavior.

