Machine Learning in Classification Time Series with Fractal Properties

Concussion injuries remain a significant public health challenge. A significant unmet clinical need remains for tools that allow related physiological impairments and longer-term health risks to be identified earlier, better quantified, and more easily monitored over time. We address this challenge by combining a head-mounted wearable inertial motion unit (IMU)-based physiological vibration acceleration (“phybrata”) sensor and several candidate machine learning (ML) models. The performance of this solution is assessed for both binary classification of concussion patients and multiclass predictions of specific concussion-related neurophysiological impairments. Results are compared with previously reported approaches to ML-based concussion diagnostics. Using phybrata data from a previously reported concussion study population, four different machine learning models (Support Vector Machine, Random Forest Classifier, Extreme Gradient Boost, and Convolutional Neural Network) are first investigated for binary classification of the test population as healthy vs. concussion (Use Case 1). Results are compared for two different data preprocessing pipelines, Time-Series Averaging (TSA) and Non-Time-Series Feature Extraction (NTS). Next, the three best-performing NTS models are compared in terms of their multiclass prediction performance for specific concussion-related impairments: vestibular, neurological, both (Use Case 2). For Use Case 1, the NTS model approach outperformed the TSA approach, with the two best algorithms achieving an F1 score of 0.94. For Use Case 2, the NTS Random Forest model achieved the best performance in the testing set, with an F1 score of 0.90, and identified a wider range of relevant phybrata signal features that contributed to impairment classification compared with manual feature inspection and statistical data analysis. 
The overall classification performance achieved in the present work exceeds previously reported approaches to ML-based concussion diagnostics using other data sources and ML models. This study also demonstrates the first combination of a wearable IMU-based sensor and ML model that enables both binary classification of concussion patients and multiclass predictions of specific concussion-related neurophysiological impairments.
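The F1 scores quoted above can be made concrete with a minimal sketch. It computes F1 for one class treated as positive (as in the binary Use Case 1) and a macro average over classes (one possible choice for the multiclass Use Case 2). The function names, the example labels, and the use of macro averaging are assumptions for illustration; the abstract does not state which averaging was used.

```python
def f1_for_class(y_true, y_pred, positive):
    """F1 score with `positive` treated as the positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def f1_macro(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores (assumed averaging)."""
    labels = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_class(y_true, y_pred, c) for c in labels) / len(labels)


# Hypothetical labels, not data from the study.
y_true = ["concussion", "concussion", "concussion", "healthy", "healthy", "healthy"]
y_pred = ["concussion", "concussion", "healthy", "healthy", "healthy", "concussion"]
binary_f1 = f1_for_class(y_true, y_pred, positive="concussion")
macro_f1 = f1_macro(y_true, y_pred)
```

Macro averaging weights every impairment class equally, which matters when classes such as "both" are rarer than the others.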

Machine-learning and statistical methods for DDoS attack detection and defense system in software defined networks

10.32920/ryerson.14657556 ◽ 2021

Author(s): Merlin James Rukshan Dennis

Keyword(s): Machine Learning ◽ Random Forest ◽ Statistical Approach ◽ Denial Of Service ◽ Attack Detection ◽ Learning Approach ◽ DDoS Attack ◽ Machine Learning Approach ◽ DDoS Detection ◽ DDoS Attack Detection

Distributed Denial of Service (DDoS) attacks are a serious threat on today’s Internet. As traffic across the Internet increases day by day, it is a challenge to distinguish between legitimate and malicious traffic. This thesis proposes two different approaches to building an efficient DDoS attack detection system in the Software Defined Networking (SDN) environment. SDN is a recent networking approach built around a centralized, programmable controller. The central control and the programming capability of the controller are used in this thesis to implement the detection and mitigation mechanisms. Two approaches, a statistical approach and a machine-learning approach, are proposed for DDoS detection. The statistical approach implements entropy computation and flow statistics analysis. It uses the mean and standard deviation of destination entropy, new flow arrival rate, packets per flow and flow duration to compute various thresholds. These thresholds are then used to distinguish normal traffic from attack traffic. The machine-learning approach uses a Random Forest classifier to detect the DDoS attack. We fine-tune the Random Forest algorithm to make it more accurate in DDoS detection. In particular, we introduce weighted voting in place of standard majority voting to improve accuracy. Our results show that the proposed machine-learning approach outperforms the statistical approach. Furthermore, it also outperforms other machine-learning approaches found in the literature.
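The statistical approach described above can be sketched as follows: compute the Shannon entropy of destination addresses per traffic window, then derive a threshold from the mean and standard deviation of values seen under normal traffic. This is a sketch under stated assumptions, not the thesis's implementation: the function names, the multiplier `k`, the one-sided "mean minus k standard deviations" rule, and the sample windows are all hypothetical.

```python
import math
from collections import Counter
from statistics import mean, stdev


def destination_entropy(dst_addrs):
    """Shannon entropy (in bits) of the destination addresses in one window."""
    counts = Counter(dst_addrs)
    total = len(dst_addrs)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def lower_threshold(baseline_values, k=3.0):
    """mean - k * std over baseline windows; k is an assumed tuning knob."""
    return mean(baseline_values) - k * stdev(baseline_values)


def is_attack_window(dst_addrs, entropy_threshold):
    # Traffic concentrated on a few destinations has low entropy, so a
    # window whose entropy falls below the threshold is flagged.
    return destination_entropy(dst_addrs) < entropy_threshold


# Hypothetical baseline windows of normal traffic.
baseline_windows = [list("abcd"), list("abcc"), list("abcd"), list("aabc")]
baseline = [destination_entropy(w) for w in baseline_windows]
threshold = lower_threshold(baseline, k=2.0)

flood_window = ["victim"] * 7 + ["a"]  # most packets hit one destination
flagged = is_attack_window(flood_window, threshold)
```

The same mean-and-deviation pattern would apply to the other statistics named in the abstract (new flow arrival rate, packets per flow, flow duration), with the direction of the comparison chosen per statistic.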

Comparative analysis of machine learning classification of time series with fractal properties

2019 IEEE 8th International Conference on Advanced Optoelectronics and Lasers (CAOL) ◽ 10.1109/caol46282.2019.9019416 ◽ 2019

Author(s): Tamara Radivilova ◽ Lyudmyla Kirichenko ◽ Bulakh Vitalii

Keyword(s): Machine Learning ◽ Time Series ◽ Comparative Analysis ◽ Machine Learning Classification ◽ Fractal Properties

A random forest method for constructing long-term time series of nighttime light in Central Asia

Remote Sensing Applications: Society and Environment ◽ 10.1016/j.rsase.2021.100687 ◽ 2021 ◽ pp. 100687

Author(s): Hui Chen ◽ Yina Qiao ◽ Hailong Liu

Keyword(s): Time Series ◽ Random Forest ◽ Central Asia ◽ Random Forest Method ◽ Nighttime Light

The differential diagnosis of IgG4-related disease based on machine learning

10.21203/rs.3.rs-968651/v1 ◽ 2021

Author(s): Motohisa Yamamoto ◽ Masanori Nojima ◽ Ryuta Kamekura ◽ Akiko Kuribara-Souta ◽ Masaaki Uehara ◽ ...

Keyword(s): Machine Learning ◽ Differential Diagnosis ◽ Random Forest ◽ Blood Test ◽ Patient Characteristics ◽ Validation Sample ◽ Related Disease ◽ IgG4 Related Disease ◽ Random Forest Method ◽ Serum IgG4

Introduction: To eliminate the disparity and maldistribution of physicians and medical specialty services, the development of diagnostic support for rare diseases using artificial intelligence is being promoted. Immunoglobulin G4 (IgG4)-related disease (IgG4-RD) is a rare disorder that often requires special knowledge and experience to diagnose. In this study, we investigated the possibility of differential diagnosis of IgG4-RD based on basic patient characteristics and blood test findings using machine learning. Methods: Six hundred and two patients with IgG4-RD and 212 patients with non-IgG4-RD conditions that needed to be differentiated, all of whom visited the participating institutions, were included in the study. Ten percent of the subjects were randomly set aside as a validation sample. Among the remaining cases, 80% were used as training samples and the remaining 20% as test samples. Finally, validation was performed on the validation sample. The analysis was performed using a decision tree and a random forest model. Furthermore, a comparison was made between conditions with and without the serum IgG4 concentration. Accuracy was evaluated using the area under the receiver-operating characteristic (AUROC) curve. Results: In diagnosing IgG4-RD, the AUROC values of the decision tree and the random forest method were 0.905 and 0.970, respectively, when serum IgG4 levels were included in the analysis. Excluding serum IgG4 levels, the AUROC value of the random forest method was 0.919. Conclusion: Based on machine learning in a multicenter collaboration, with or without serum IgG4 data, basic patient characteristics and blood test findings alone were sufficient to differentiate IgG4-RD from non-IgG4-RD.
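The sampling scheme in the Methods can be illustrated with a short sketch: hold out 10% of subjects as a validation sample, then split the remainder 80/20 into training and test samples. The function name, the fixed seed, the rounding rule, and the absence of stratification by diagnosis are assumptions; the abstract does not specify them, so the exact counts in the study may differ.

```python
import random


def three_way_split(items, holdout_frac=0.10, train_frac=0.80, seed=0):
    """Return (train, test, validation) after a seeded random shuffle."""
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    # Set aside the validation sample first.
    n_holdout = round(len(shuffled) * holdout_frac)
    validation = shuffled[:n_holdout]
    remaining = shuffled[n_holdout:]
    # Split what is left into training and test samples.
    n_train = round(len(remaining) * train_frac)
    return remaining[:n_train], remaining[n_train:], validation


# 602 IgG4-RD patients plus 212 non-IgG4-RD patients, represented by index.
patient_ids = range(602 + 212)
train, test, validation = three_way_split(patient_ids)
```

Under the rounding used here, the 814 subjects divide into 586 training, 147 test, and 81 validation cases.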

Detecting DDoS Attack

Applications of Artificial Intelligence for Smart Technology - Advances in Computational Intelligence and Robotics ◽ 10.4018/978-1-7998-3335-2.ch004 ◽ 2021 ◽ pp. 55-66

Author(s): Megala G. ◽ S. Prabu ◽ Liyanapathirana B. C.

Keyword(s): Machine Learning ◽ Random Forest ◽ Learning Algorithm ◽ Denial Of Service ◽ Economic Loss ◽ High Accuracy ◽ Denial Of Service Attack ◽ DDoS Attack ◽ Internet Users ◽ Service Attack

A major network security problem faced by many internet users is the DDoS (distributed denial of service) attack. This attack makes a service inaccessible by exhausting the network and its resources, with high repudiation and economic loss. It denies network services to potential users. To detect DDoS attacks in the network accurately, a random forest classifier, which is a machine-learning-based classifier, is used. The experimental results are compared with those of a naïve Bayes classifier and a KNN classifier, showing that random forest produces high-accuracy classification results. As an application of machine learning, DDoS attack detection is modeled with a supervised learning algorithm to produce the best outcome, with high accuracy of the trained algorithm on a network dataset.

Using machine learning to guide targeted and locally-tailored empiric antibiotic prescribing in a children's hospital in Cambodia

Wellcome Open Research ◽ 10.12688/wellcomeopenres.14847.1 ◽ 2018 ◽ Vol 3 ◽ pp. 131 ◽ Cited By ~ 15

Author(s): Mathupanee Oonsivilai ◽ Yin Mo ◽ Nantasit Luangasanatip ◽ Yoel Lubell ◽ Thyl Miliya ◽ ...

Keyword(s): Machine Learning ◽ Random Forest ◽ Antimicrobial Resistance ◽ Blood Culture ◽ Learning Algorithms ◽ Empiric Therapy ◽ Patient Data ◽ Machine Learning Algorithms ◽ Random Forest Method ◽ Empiric Antibiotic

Background: Early and appropriate empiric antibiotic treatment of patients suspected of having sepsis is associated with reduced mortality. The increasing prevalence of antimicrobial resistance reduces the efficacy of empiric therapy guidelines derived from population data. This problem is particularly severe for children in developing country settings. We hypothesized that by applying machine learning approaches to readily collected patient data, it would be possible to obtain individualized predictions for targeted empiric antibiotic choices. Methods and Findings: We analysed blood culture data collected from a 100-bed children's hospital in North-West Cambodia between February 2013 and January 2016. Clinical, demographic and living condition information was captured with 35 independent variables. Using these variables, we applied a suite of machine learning algorithms to predict Gram stains and whether bacterial pathogens could be treated with common empiric antibiotic regimens: i) ampicillin and gentamicin; ii) ceftriaxone; iii) none of the above. 243 patients with bloodstream infections were available for analysis. We found that the random forest method had the best predictive performance overall as assessed by the area under the receiver operating characteristic curve (AUC). The random forest method gave an AUC of 0.80 (95% CI 0.66-0.94) for predicting susceptibility to ceftriaxone, 0.74 (0.59-0.89) for susceptibility to ampicillin and gentamicin, 0.85 (0.70-1.00) for susceptibility to neither, and 0.71 (0.57-0.86) for Gram stain result. The most important variables for predicting susceptibility were time from admission to blood culture, patient age, hospital versus community-acquired infection, and age-adjusted weight score.
Conclusions: Applying machine learning algorithms to patient data that are readily available even in resource-limited hospital settings can provide highly informative predictions on antibiotic susceptibilities to guide appropriate empiric antibiotic therapy. When used as a decision support tool, such approaches have the potential to improve targeting of empiric therapy, patient outcomes and reduce the burden of antimicrobial resistance.
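For readers unfamiliar with the AUC values reported above, the following is a minimal sketch of how an AUC can be computed from predicted scores: it equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The function name and the example scores are hypothetical, and this pairwise formulation is one standard way to compute AUC, not necessarily the procedure the authors used; confidence intervals are not covered.

```python
def roc_auc(labels, scores):
    """AUC via pairwise comparison of positive and negative scores."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    # A positive that outscores a negative counts 1, a tie counts 0.5.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical example: 1 = susceptible, 0 = not susceptible.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
auc = roc_auc(labels, scores)
```

The pairwise loop is quadratic in the number of cases, which is acceptable for a cohort of a few hundred patients; a rank-based formulation would scale better.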

Large-Scale Crop Mapping Based on Machine Learning and Parallel Computation with Grids

Remote Sensing ◽ 10.3390/rs11121500 ◽ 2019 ◽ Vol 11 (12) ◽ pp. 1500 ◽ Cited By ~ 15

Author(s): Ning Yang ◽ Diyou Liu ◽ Quanlong Feng ◽ Quan Xiong ◽ Lin Zhang ◽ ...

Keyword(s): Machine Learning ◽ Remote Sensing ◽ Time Series ◽ Parallel Computing ◽ Random Forest ◽ Large Scale ◽ Remote Sensing Data ◽ Remote Sensing Images ◽ Medium Resolution ◽ Crop Mapping

Large-scale crop mapping provides important information in agricultural applications. However, it is a challenging task due to the inconsistent availability of remote sensing data caused by the irregular time series and limited coverage of the images, together with the low spatial resolution of the classification results. In this study, we propose an efficient grid-based method to address the inconsistent availability of high-medium resolution images for large-scale crop classification. First, we block the remote sensing data into grids to solve the problem of temporal inconsistency. Then, a parallel computing technique is introduced to improve calculation efficiency at the grid scale. Experiments were designed to evaluate the applicability of this method to different high-medium spatial resolution remote sensing images and different machine learning algorithms, and to compare the results with the widely used nonparallel method. The computational experiments showed that the proposed method successfully identified large-scale crop distribution using common high-medium resolution remote sensing images (GF-1 WFV and Sentinel-2) and common machine learning classifiers (random forest and support vector machine). Finally, we mapped the croplands of Heilongjiang Province for 2015, 2016, and 2017 using a random forest classifier with the spectral features of time-series GF-1 WFV images, the enhanced vegetation index (EVI), and the normalized difference water index (NDWI). Accuracy was assessed using a confusion matrix. The classification accuracy reached 88%, 82%, and 85% in 2015, 2016, and 2017, respectively. In addition, with the help of parallel computing, the calculation speed improved at least seven-fold.
This indicates that using the grid framework to block the data for classification is feasible for crop mapping in large areas and has great application potential in the future.
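The grid idea can be sketched in a few lines: cut the image into tiles, classify each tile independently with a pool of workers, stitch the labels back together, and score the result from a confusion matrix. Everything here is an illustrative assumption: the function names, the tile size, the thresholding stand-in for a real classifier such as a random forest, and the sample confusion matrix. A thread pool is used only to keep the sketch self-contained; CPU-bound classification in Python would normally use processes or a cluster to see a real speed-up.

```python
from concurrent.futures import ThreadPoolExecutor


def make_grid(height, width, tile):
    """Bounds (row0, row1, col0, col1) of tiles covering a height x width image."""
    return [(r, min(r + tile, height), c, min(c + tile, width))
            for r in range(0, height, tile)
            for c in range(0, width, tile)]


def classify_tile(image, bounds, threshold=0.5):
    """Stand-in per-tile classifier: label 1 where the pixel value exceeds threshold."""
    r0, r1, c0, c1 = bounds
    labels = [[1 if image[r][c] > threshold else 0 for c in range(c0, c1)]
              for r in range(r0, r1)]
    return bounds, labels


def classify_by_grid(image, tile, workers=4):
    """Classify each tile in a worker pool and stitch the label map together."""
    height, width = len(image), len(image[0])
    result = [[0] * width for _ in range(height)]

    def work(bounds):
        return classify_tile(image, bounds)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        for (r0, _r1, c0, c1), labels in pool.map(work, make_grid(height, width, tile)):
            for i, row in enumerate(labels):
                result[r0 + i][c0:c1] = row
    return result


def overall_accuracy(confusion):
    """Share of samples on the diagonal of a square confusion matrix."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total
```

Because tiles share no state, each one can be processed as soon as its own imagery is available, which is what makes the grid layout convenient for irregular time series.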
