Machine Learning Classification of Head Impact Sensor Data

Sensor Data ◽

Environmental Sensors ◽

Head Acceleration ◽

Environmental Sensor ◽

Validation Set

Abstract: A shortcoming of using environmental sensors for the surveillance of potentially concussive events is substantial uncertainty regarding whether the event was caused by head acceleration (“head impacts”) or sensor motion (with no head acceleration). The goal of the present study is to develop a machine learning model to classify environmental sensor data obtained in the field and evaluate the performance of the model against the performance of the proprietary classification algorithm used by the environmental sensor. Data were collected from Soldiers attending sparring sessions conducted under a U.S. Army Combatives School course. Data from one sparring session were used to train a decision tree classification algorithm to identify good and bad signals. Data from the remaining sparring sessions were kept as an external validation set. The performance of the proprietary algorithm used by the sensor was also compared to the trained algorithm performance. The trained decision tree was able to correctly classify 95% of events for internal cross-validation and 88% of events for the external validation set. Comparatively, the proprietary algorithm was only able to correctly classify 61% of the events. In general, the trained algorithm was better able to predict when a signal was good or bad compared to the proprietary algorithm. The present study shows it is possible to train a decision tree algorithm using environmental sensor data collected in the field.
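The workflow this abstract describes, training on one session and checking the result on sessions held back as an external validation set, can be illustrated with a rough Python sketch. It fits a one-split decision tree (a "stump"), which is far simpler than the tree in the study. The two features (peak acceleration in g, pulse duration in ms), all data values, and the function names are hypothetical and are not taken from the paper.

```python
def train_stump(samples, labels):
    """Pick the single feature/threshold split with the best training accuracy."""
    best = None  # (accuracy, feature_index, threshold, label_if_above)
    for f in range(len(samples[0])):
        for threshold in sorted({s[f] for s in samples}):
            for label_if_above in (0, 1):
                preds = [label_if_above if s[f] >= threshold else 1 - label_if_above
                         for s in samples]
                acc = sum(p == y for p, y in zip(preds, labels)) / len(labels)
                if best is None or acc > best[0]:
                    best = (acc, f, threshold, label_if_above)
    return best[1:]


def predict(stump, sample):
    f, threshold, label_if_above = stump
    return label_if_above if sample[f] >= threshold else 1 - label_if_above


def accuracy(stump, samples, labels):
    return sum(predict(stump, s) == y for s, y in zip(samples, labels)) / len(labels)


# Hypothetical events: (peak acceleration in g, pulse duration in ms).
# Label 1 = head impact ("good" signal), 0 = sensor motion only ("bad" signal).
train_X = [(12.0, 8.0), (15.0, 10.0), (30.0, 9.0), (45.0, 12.0),
           (60.0, 11.0), (22.0, 2.0), (50.0, 1.5), (18.0, 3.0)]
train_y = [1, 1, 1, 1, 1, 0, 0, 0]

# Events from other sessions, never seen during training.
external_X = [(40.0, 9.5), (25.0, 2.5), (55.0, 7.0), (10.0, 1.0)]
external_y = [1, 0, 1, 0]

stump = train_stump(train_X, train_y)
train_acc = accuracy(stump, train_X, train_y)
external_acc = accuracy(stump, external_X, external_y)
print(stump, train_acc, external_acc)
```

On this toy data the stump separates the training events perfectly but misses one external event, which mirrors in miniature why the abstract reports internal and external accuracy separately.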

Automatic Identification of Upper Extremity Rehabilitation Exercise Type and Dose Using Body-Worn Sensors and Machine Learning: A Pilot Study

Digital Biomarkers ◽

10.1159/000516619 ◽

2021 ◽

pp. 158-166

Author(s):

Noah Balestra ◽

Gaurav Sharma ◽

Linda M. Riek ◽

Ania Busza

Keyword(s):

Machine Learning ◽

Upper Extremity ◽

Sensor Data ◽

Inpatient Setting ◽

Accelerometer Data ◽

Data Set ◽

Exercise Type ◽

Exercise Dose ◽

Rehabilitation Exercises

Background: Prior studies suggest that participation in rehabilitation exercises improves motor function poststroke; however, studies on optimal exercise dose and timing have been limited by the technical challenge of quantifying exercise activities over multiple days. Objectives: The objectives of this study were to assess the feasibility of using body-worn sensors to track rehabilitation exercises in the inpatient setting and investigate which recording parameters and data analysis strategies are sufficient for accurately identifying and counting exercise repetitions. Methods: MC10 BioStampRC® sensors were used to measure accelerometer and gyroscope data from upper extremities of healthy controls (n = 13) and individuals with upper extremity weakness due to recent stroke (n = 13) while the subjects performed 3 preselected arm exercises. Sensor data were then labeled by exercise type and this labeled data set was used to train a machine learning classification algorithm for identifying exercise type. The machine learning algorithm and a peak-finding algorithm were used to count exercise repetitions in non-labeled data sets. Results: We achieved a repetition counting accuracy of 95.6% overall, and 95.0% in patients with upper extremity weakness due to stroke when using both accelerometer and gyroscope data. Accuracy was decreased when using fewer sensors or using accelerometer data alone. Conclusions: Our exploratory study suggests that body-worn sensor systems are technically feasible, well tolerated in subjects with recent stroke, and may ultimately be useful for developing a system to measure total exercise “dose” in poststroke patients during clinical rehabilitation or clinical trials.
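The peak-finding step mentioned in the Methods can be sketched as follows. This is only an illustration of the general idea (count local maxima that are high enough and far enough apart), not the authors' algorithm. The synthetic sine trace, the thresholds, and the function name `count_repetitions` are assumptions made for the example.

```python
import math


def count_repetitions(signal, min_height, min_distance):
    """Count local maxima at least `min_height` high and `min_distance` samples apart."""
    count = 0
    last_peak = -min_distance  # lets the first qualifying peak be accepted
    for i in range(1, len(signal) - 1):
        is_peak = signal[i - 1] < signal[i] >= signal[i + 1]
        if is_peak and signal[i] >= min_height and i - last_peak >= min_distance:
            count += 1
            last_peak = i
    return count


# Synthetic single-axis trace: one repetition every 2 s, sampled at 50 Hz for 10 s.
sample_rate = 50
signal = [math.sin(2 * math.pi * 0.5 * (t / sample_rate))
          for t in range(10 * sample_rate)]

reps = count_repetitions(signal, min_height=0.5, min_distance=sample_rate)
print(reps)
```

In practice the height and spacing thresholds would need tuning per exercise type, which is presumably one reason the study first classifies the exercise and then counts repetitions.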

IJITEE (International Journal of Information Technology and Electrical Engineering) ◽

Applying Machine Learning for Improving Performance Classification on Driving Behavior

10.22146/ijitee.56919 ◽

2021 ◽

Vol 4 (1) ◽

pp. 8

Author(s):

Ahmad Iwan Fadli ◽

Selo Sulistyo ◽

Sigit Wibowo

Keyword(s):

Machine Learning ◽

Traffic Accident ◽

Large Scale ◽

Detection System ◽

Difficult Problem ◽

Sensor Data ◽

Driving Safety ◽

Support Vector ◽

Classification Methods ◽

Traffic accidents are a very difficult problem to handle on a large scale in a country. Indonesia is one of the most populous developing countries, and its people rely on vehicles as their main means of daily transportation. It also has the largest number of car users in Southeast Asia, so driving safety needs to be considered. Using a machine learning classification method to determine whether a driver is driving safely can help reduce the risk of accidents. We created a detection system that classifies driving as safe or unsafe using trip sensor data, which include gyroscope, acceleration, and GPS readings. The classification methods used in this study are Random Forest (RF), Support Vector Machine (SVM), and Multilayer Perceptron (MLP), with data preprocessing improved through feature extraction and oversampling. With the proposed preprocessing stages, RF performs best, reaching 98% accuracy, 98% precision, and 97% sensitivity, compared to SVM or MLP.
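The abstract does not say which oversampling method was used. As a hedged illustration, the sketch below shows the simplest variant, random oversampling, which duplicates minority-class samples until the classes are balanced. The trip features, labels, and the function name `random_oversample` are hypothetical.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen minority-class samples until all classes are the same size."""
    rng = random.Random(seed)
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    target = max(len(group) for group in by_class.values())

    out_samples, out_labels = [], []
    for label, group in by_class.items():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for sample in group + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels


# Hypothetical per-trip features, e.g. (mean gyroscope magnitude, mean acceleration).
trips = [[0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.2, 0.4], [0.1, 0.1], [2.5, 3.0]]
labels = ["safe"] * 5 + ["unsafe"]

balanced_trips, balanced_labels = random_oversample(trips, labels)
print(Counter(balanced_labels))
```

Oversampling is normally applied to the training split only, so that duplicated samples do not leak into the data used for evaluation.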

Predictive Analysis of Genetic Disease Haemophilia-A based on Machine Learning Classification Algorithm

IJARCCE ◽

10.17148/ijarcce.2021.101210 ◽

2021 ◽

Vol 10 (12) ◽

Author(s):

Dillip Narayan Sahu ◽

Vijay Pal Singh

Keyword(s):

Machine Learning ◽

Genetic Disease ◽

Predictive Analysis ◽

Haemophilia A ◽

Discovery of Highly Polymorphic Organic Materials: A New Machine Learning Approach

10.26434/chemrxiv.9524219 ◽

2019 ◽

Author(s):

Zied Hosni ◽

Annalisa Riccardi ◽

Stephanie Yerdelen ◽

Alan R. G. Martin ◽

Deborah Bowering ◽

...

Keyword(s):

Machine Learning ◽

Structure Prediction ◽

External Validation ◽

New Drugs ◽

Training Dataset ◽

Validation Dataset ◽

Novel Approach ◽

Physical Form ◽

Machine Learning Approach

Polymorphism is the capacity of a molecule to adopt different conformations or molecular packing arrangements in the solid state. This is a key property to control during pharmaceutical manufacturing because it can impact a range of properties including stability and solubility. In this study, a novel approach based on machine learning classification methods is used to predict the likelihood of an organic compound crystallising in multiple forms. A training dataset of drug-like molecules was curated from the Cambridge Structural Database (CSD) and filtered according to entries in the DrugBank database. The number of separate forms in the CSD for each molecule was recorded. A metaclassifier was trained using this dataset to predict the expected number of crystalline forms from the compound descriptors. This approach was used to estimate the number of crystallographic forms for an external validation dataset. These results suggest this novel methodology can be used to predict the extent of polymorphism of new drugs or not-yet experimentally screened molecules. This promising method complements expensive ab initio methods for crystal structure prediction and, as an integral part of experimental physical form screening, may identify systems with unexplored potential.

A Machine Learning-Based Prediction Platform for P-Glycoprotein Modulators and Its Validation by Molecular Docking

Cells ◽

10.3390/cells8101286 ◽

2019 ◽

Vol 8 (10) ◽

pp. 1286 ◽

Cited By ~ 1

Author(s):

Onat Kadioglu ◽

Thomas Efferth

Keyword(s):

Machine Learning ◽

Molecular Docking ◽

Learning Strategies ◽

High Performance ◽

External Validation ◽

Major Drawback ◽

Chemotherapy Drugs ◽

P Glycoprotein ◽

Validation Set ◽

Leave One Out

P-glycoprotein (P-gp) is an important determinant of multidrug resistance (MDR) because its overexpression is associated with increased efflux of various established chemotherapy drugs in many clinically resistant and refractory tumors. This leads to insufficient therapeutic targeting of tumor populations, representing a major drawback of cancer chemotherapy. Therefore, P-gp is a target for pharmacological inhibitors to overcome MDR. In the present study, we utilized machine learning strategies to establish a model for P-gp modulators to predict whether a given compound would behave as a substrate or an inhibitor of P-gp. Random forest feature selection algorithm-based leave-one-out random sampling was used. Testing the model with an external validation set revealed high performance scores. A P-gp modulator list of compounds from the ChEMBL database was used to test the performance, and predictions from both substrate and inhibitor classes were selected for the last step of validation with molecular docking. Predicted substrates revealed docking poses similar to that of doxorubicin, and predicted inhibitors revealed docking poses similar to that of the known P-gp inhibitor elacridar, implying the validity of the predictions. We conclude that the machine-learning approach introduced in this investigation may serve as a tool for the rapid detection of P-gp substrates and inhibitors in large chemical libraries.
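The leave-one-out idea mentioned above can be sketched in a few lines: each compound is held out once, the model is fitted on the rest, and the held-out compound is predicted. To keep the example dependency-free, a nearest-centroid classifier stands in for the random forest used in the study. The two descriptors per compound, their values, and the function names are hypothetical.

```python
def nearest_centroid_predict(train_X, train_y, x):
    """Predict the class whose mean feature vector is closest to x."""
    sums, counts = {}, {}
    for features, label in zip(train_X, train_y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1

    def squared_distance(label):
        return sum((s / counts[label] - v) ** 2 for s, v in zip(sums[label], x))

    return min(sums, key=squared_distance)


def leave_one_out_accuracy(X, y):
    """Hold out each sample once, train on the rest, and score the held-out prediction."""
    correct = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        correct += nearest_centroid_predict(train_X, train_y, X[i]) == y[i]
    return correct / len(X)


# Hypothetical two-descriptor compounds; the last one sits between the classes.
X = [[3.0, 1.0], [3.2, 1.2], [2.8, 0.9], [6.0, 4.0], [6.2, 4.1], [4.4, 2.6]]
y = ["substrate", "substrate", "substrate", "inhibitor", "inhibitor", "inhibitor"]

loo_acc = leave_one_out_accuracy(X, y)
print(loo_acc)
```

Here five of the six held-out compounds are predicted correctly; the borderline compound is misclassified once it is removed from its own class centroid, which is the kind of optimism leave-one-out is meant to expose.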

A supervised machine learning classification algorithm for research articles

Proceedings of the 28th Annual ACM Symposium on Applied Computing - SAC '13 ◽

10.1145/2480362.2480388 ◽

2013 ◽

Cited By ~ 2

Author(s):

Leonidas Akritidis ◽

Panayiotis Bozanis

Keyword(s):

Machine Learning ◽

Supervised Machine Learning ◽

Research Articles ◽

Comparison Decision Tree and Logistic Regression Machine Learning Classification Algorithms to determine Covid-19

SinkrOn ◽

10.33395/sinkron.v7i1.11243 ◽

2022 ◽

Vol 7 (1) ◽

pp. 59-65

Author(s):

Artika Arista

Keyword(s):

Machine Learning ◽

Logistic Regression ◽

Decision Tree ◽

Cross Validation ◽

Performance Testing ◽

Signs And Symptoms ◽

Classification Algorithms ◽

Wide Range ◽

Testing Performance

Many people today are unsure whether they have COVID-19. Frequent fever, dry cough, and sore throat are all signs and symptoms of COVID-19. A person with signs or symptoms of coronavirus disease 2019 (COVID-19) should see a doctor or go to a clinic as soon as possible. Because COVID-19 can cause a wide range of symptoms, it is vital to learn and understand the fundamental differences. The experiments were carried out using two machine learning classification algorithms, Decision Tree (DT) and Logistic Regression (LR). Both were implemented and analyzed in Python using Jupyter Notebook 6.4.5. On the COVID-19 symptoms dataset, the DT model obtained the best average cross-validation and testing performance, with an accuracy of 98.0% for both. The LR model came second, with an accuracy of 96.0% for cross-validation and 97.0% for testing. Consequently, DT outperforms LR on the COVID-19 symptoms dataset in both cross-validation and testing.

Predictive Analysis of Coronary Heart Disease (CHD) based on Machine Learning Classification Algorithm

IJARCCE ◽

10.17148/ijarcce.2021.101202 ◽

2021 ◽

Vol 10 (12) ◽

Author(s):

Dillip Narayan Sahu ◽

Vijay Pal Singh

Keyword(s):

Machine Learning ◽

Coronary Heart Disease ◽

Heart Disease ◽

Predictive Analysis ◽

Fake Job Detection and Analysis Using Machine Learning and Deep Learning Algorithms

Revista Gestão Inovação e Tecnologias ◽

10.47059/revistageintec.v11i2.1701 ◽

2021 ◽

Vol 11 (2) ◽

pp. 642-650

Author(s):

C.S. Anita ◽

P. Nagarajan ◽

G. Aditya Sairam ◽

P. Ganesh ◽

G. Deepakkumar

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Data Cleaning ◽

Personal Information ◽

Learning Algorithms ◽

Paper Machine ◽

Accuracy And Precision ◽

Processing Step

With the pandemic, there has been a strong rise in the number of jobs posted online on various job portals. However, some of these postings are fake and lead to the theft of personal and other vital information. Such fake jobs can be detected and separated from a pool of fake and real job posts using deep learning and machine learning classification algorithms. In this paper, machine learning and deep learning algorithms are used to detect fake jobs and differentiate them from real ones. Data analysis and data cleaning steps are also proposed so that the applied classification algorithm is highly precise and accurate. Data cleaning is a very important step in a machine learning project because it determines the accuracy of both the machine learning and the deep learning algorithms; this paper therefore places great emphasis on data cleaning and pre-processing. Applied to cleaned and pre-processed data, the algorithms can classify and detect fake jobs with high accuracy and precision. Deep learning neural networks are further used to achieve higher accuracy. Finally, all the classification models are compared with each other to find the algorithm with the highest accuracy and precision.
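The abstract stresses data cleaning but does not list the steps. The sketch below shows what a minimal cleaning pass for job-post text might look like: decode HTML entities, drop tags, lowercase, remove punctuation, and collapse whitespace. These particular steps and the function name `clean_posting` are assumptions for illustration, not the paper's pipeline.

```python
import html
import re


def clean_posting(text):
    """Normalize a raw job post into lowercase alphanumeric tokens separated by single spaces."""
    text = html.unescape(text)                 # "&amp;" -> "&"
    text = re.sub(r"<[^>]+>", " ", text)       # drop HTML tags
    text = text.lower()
    text = re.sub(r"[^a-z0-9\s]", " ", text)   # drop punctuation and symbols
    return re.sub(r"\s+", " ", text).strip()   # collapse runs of whitespace


raw = "<p>Earn $5,000/week!!  Work from HOME &amp; no experience needed</p>"
cleaned = clean_posting(raw)
print(cleaned)
```

A real pipeline would likely go further (for example, handling missing fields or removing duplicate posts), but even this pass shows how much noise a scraped posting can carry before any classifier sees it.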

COMPARISON OF MACHINE LEARNING CLASSIFICATION ALGORITHM ON HOTEL REVIEW SENTIMENT ANALYSIS (CASE STUDY: LUMINOR HOTEL PECENONGAN)

Jurnal Pilar Nusa Mandiri ◽

10.33480/pilar.v16i1.1131 ◽

2020 ◽

Vol 16 (1) ◽

pp. 59-64

Author(s):

Jaja Miharja ◽

Jordy Lasmana Putra ◽

Nur Hadianto

Keyword(s):

Machine Learning ◽

Nearest Neighbor ◽

K Nearest Neighbor ◽

Business Decisions ◽

Business People ◽

Auc Value ◽

The Right

Sentiment analysis of hotel reviews is a useful benchmark or reference for making hotel business decisions. However, the review information must first be processed by an algorithm. The purpose of this study is to compare machine learning classification algorithms to find the one that gives the most accurate analysis of hotel reviews. The algorithms compared are k-NN (k-Nearest Neighbor) and NB (Naive Bayes). The resulting accuracy is 60.50% for k-NN, with an AUC value of 0.632, and 85.25% for NB, with an AUC value of 0.658. These results indicate that NB is the right algorithm to help business people make accurate decisions when analyzing hotel reviews.
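Accuracy and AUC measure different things, which may help in reading the figures above: accuracy scores decisions at one threshold, while AUC scores how well positives are ranked above negatives. The sketch below computes both on a small synthetic set of review scores. The scores and labels are invented for illustration and are not the study's data.

```python
def auc(scores, labels):
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def accuracy(scores, labels, threshold=0.5):
    """Fraction of samples classified correctly when scores >= threshold mean positive."""
    return sum((s >= threshold) == bool(y) for s, y in zip(scores, labels)) / len(labels)


# Synthetic classifier scores for 8 reviews; 1 = positive review, 0 = negative review.
labels = [1, 1, 1, 1, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.6, 0.55, 0.65, 0.3]

acc_value = accuracy(scores, labels)
auc_value = auc(scores, labels)
print(acc_value, auc_value)
```

On this imbalanced toy set accuracy is noticeably higher than AUC, because most reviews are positive and one negative review is ranked above several positives.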