A Statistical Design of Experiments Approach to Machine Learning Model Selection in Engineering Applications

Abstract: An important but insufficiently addressed issue for machine learning in engineering applications is the task of model selection for new problems. Existing approaches to model selection generally focus on optimizing the learning algorithm and its associated hyperparameters. However, in real-world engineering applications, parameters that are external to the learning algorithm, such as feature engineering, can also have a significant impact on the performance of the model. These external parameters do not fit into most existing approaches to model selection and are therefore often studied ad hoc or not at all. In this article, we develop a statistical design of experiments (DOE) approach to model selection based on the Taguchi method. The key idea is to use orthogonal arrays to plan a set of build-and-test experiments that study the external parameters in combination with the learning algorithm. The use of orthogonal arrays maximizes the information learned from each experiment and therefore enables the experimental space to be explored far more efficiently than with grid or random search methods. We demonstrate the application of the statistical DOE approach to a real-world model selection problem: predicting service request escalation. Statistical DOE significantly reduced the number of experiments needed to fully explore the external parameters for this problem and successfully optimized the model with respect to the objective function of minimizing total cost, in addition to standard evaluation metrics such as accuracy, F-measure, and G-mean.
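
To make the orthogonal-array idea concrete, the following is a minimal sketch, not the authors' implementation. It builds the standard two-level L8 array (8 runs, up to 7 factors), checks that every pair of columns is balanced, and maps four factors onto its first four columns. The factor names and levels are hypothetical and chosen only for illustration; the abstract does not list the parameters that were actually studied.

```python
from collections import Counter
from itertools import combinations, product


def l8_array():
    """Build the standard L8 orthogonal array: 8 runs, 7 two-level columns."""
    rows = []
    for a, b, c in product((0, 1), repeat=3):
        # Three basis columns plus all of their XOR combinations.
        rows.append((a, b, a ^ b, c, a ^ c, b ^ c, a ^ b ^ c))
    return rows


def is_orthogonal(rows):
    """True if every pair of columns contains each level pair equally often."""
    n_cols = len(rows[0])
    for i, j in combinations(range(n_cols), 2):
        counts = Counter((r[i], r[j]) for r in rows)
        if len(counts) != 4 or len(set(counts.values())) != 1:
            return False
    return True


# Hypothetical external parameters and algorithm choice (assumed names).
factors = {
    "feature_set": ("raw", "engineered"),
    "scaling": ("none", "standardize"),
    "class_balancing": ("off", "on"),
    "algorithm": ("logistic_regression", "random_forest"),
}

array = l8_array()
# Assign each factor to one column; the remaining columns stay unused.
plan = [
    {name: levels[row[col]] for col, (name, levels) in enumerate(factors.items())}
    for row in array
]

grid_size = 2 ** len(factors)
print(f"orthogonal: {is_orthogonal(array)}")
print(f"planned runs: {len(plan)} (a full grid would need {grid_size})")
for run in plan:
    print(run)
```

With all seven columns in use, the same 8 runs would cover seven two-level factors, against 128 runs for a full grid. The saving rests on the usual Taguchi assumption that main effects dominate, because interactions between factors are confounded with other columns of the array.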

Deep Learning Classification of Canine Behavior Using a Single Collar-Mounted Accelerometer: Real-World Validation

Animals ◽

10.3390/ani11061549 ◽

2021 ◽

Vol 11 (6) ◽

pp. 1549

Author(s):

Robert D. Chambers ◽

Nathanael C. Yoder ◽

Aletha B. Carson ◽

Christian Junge ◽

David E. Allen ◽

...

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Real World ◽

Learning Algorithm ◽

Drinking Behavior ◽

True Positive Rate ◽

Training Dataset ◽

Activity Levels ◽

Accelerometer Data ◽

Activity Monitors

Collar-mounted canine activity monitors can use accelerometer data to estimate dog activity levels, step counts, and distance traveled. With recent advances in machine learning and embedded computing, much more nuanced and accurate behavior classification has become possible, giving these affordable consumer devices the potential to improve the efficiency and effectiveness of pet healthcare. Here, we describe a novel deep learning algorithm that classifies dog behavior at sub-second resolution using commercial pet activity monitors. We built machine learning training databases from more than 5000 videos of more than 2500 dogs and ran the algorithms in production on more than 11 million days of device data. We then surveyed project participants representing 10,550 dogs, which provided 163,110 event responses to validate real-world detection of eating and drinking behavior. The resultant algorithm displayed a sensitivity and specificity for detecting drinking behavior (0.949 and 0.999, respectively) and eating behavior (0.988, 0.983). We also demonstrated detection of licking (0.772, 0.990), petting (0.305, 0.991), rubbing (0.729, 0.996), scratching (0.870, 0.997), and sniffing (0.610, 0.968). We show that the devices’ position on the collar had no measurable impact on performance. In production, users reported a true positive rate of 95.3% for eating (among 1514 users), and of 94.9% for drinking (among 1491 users). The study demonstrates the accurate detection of important health-related canine behaviors using a collar-mounted accelerometer. We trained and validated our algorithms on a large and realistic training dataset, and we assessed and confirmed accuracy in production via user validation.
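
As a reminder of how the two reported metrics are defined, here is a minimal sketch that computes sensitivity and specificity from confusion-matrix counts. The counts are hypothetical and are not taken from the study.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # share of actual positives that were detected
    specificity = tn / (tn + fp)  # share of actual negatives that were rejected
    return sensitivity, specificity


# Hypothetical counts for one behavior, e.g. labelled video windows.
tp, fn, tn, fp = 90, 10, 990, 10
sens, spec = sensitivity_specificity(tp, fn, tn, fp)
print(f"sensitivity={sens:.3f}, specificity={spec:.3f}")
```

Sensitivity depends only on the windows in which the behavior really occurred, and specificity only on those in which it did not, which is why the abstract reports both values for every behavior.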

A study on supervised machine learning algorithm to improvise intrusion detection systems for mobile ad hoc networks

Cluster Computing ◽

10.1007/s10586-018-2686-x ◽

2018 ◽

Vol 22 (S2) ◽

pp. 4065-4074 ◽

Cited By ~ 4

Author(s):

S. Vimala ◽

V. Khanaa ◽

C. Nalini

Keyword(s):

Machine Learning ◽

Mobile Ad Hoc Networks ◽

Ad Hoc ◽

Learning Algorithm ◽

Supervised Machine Learning ◽

Intrusion Detection Systems ◽

Machine Learning Algorithm ◽

Detection Systems ◽

Mobile Ad Hoc ◽

Hoc Networks

Empirical Comparison of Various Discretization Procedures

International Journal of Pattern Recognition and Artificial Intelligence ◽

10.1142/s0218001498000567 ◽

1998 ◽

Vol 12 (07) ◽

pp. 1017-1032 ◽

Cited By ~ 10

Author(s):

Petr Berka ◽

Ivan Bruha

Keyword(s):

Machine Learning ◽

Real World ◽

Learning Algorithm ◽

Machine Learning Algorithms ◽

The Other ◽

Machine Learning Algorithm ◽

Empirical Comparison ◽

Numerical Attributes ◽

Real World Problems ◽

Discretization Procedure

Genuinely symbolic machine learning (ML) algorithms can process only symbolic, categorical data. However, real-world problems, e.g. in medicine or finance, involve both symbolic and numerical attributes. Discretizing (categorizing) numerical attributes is therefore an important issue in ML. Quite a few discretization procedures exist in the ML field. This paper describes two newer algorithms for categorization (discretization) of numerical attributes. The first is implemented in KEX (Knowledge EXplorer) as its preprocessing procedure. Its idea is to discretize the numerical attributes so that the resulting categorization corresponds to the KEX knowledge acquisition algorithm. Since the categorization for KEX is done "off-line" before the KEX machine learning algorithm is run, it can also be used as a preprocessing step for other machine learning algorithms. The other discretization procedure is implemented in CN4, a large extension of the well-known CN2 machine learning algorithm. The range of numerical attributes is divided into intervals that may form a complex generated by the algorithm as part of the class description. Experimental results compare the performance of KEX and CN4 on some well-known ML databases. To make the comparison more informative, we also used the discretization procedure of the MLC++ library. Other ML algorithms, such as ID3 and C4.5, were also run in our experiments. The results are then compared and discussed.
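
For readers unfamiliar with discretization as a preprocessing step, the sketch below shows the simplest generic baseline, equal-width binning. It is not the KEX or CN4 procedure described in the abstract, both of which choose intervals using class information; the attribute values and category names are hypothetical.

```python
import bisect


def equal_width_edges(values, n_bins):
    """Return the n_bins - 1 interior cut points that split [min, max] evenly."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("attribute is constant; nothing to discretize")
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(1, n_bins)]


def discretize(value, edges, names):
    """Map a numerical value to the name of the interval it falls into."""
    return names[bisect.bisect_right(edges, value)]


# Hypothetical numerical attribute (e.g. patient age) and category names.
ages = [23, 35, 47, 51, 62, 78]
names = ("low", "medium", "high")

edges = equal_width_edges(ages, n_bins=len(names))
categories = [discretize(a, edges, names) for a in ages]
print(edges)
print(categories)
```

Because the cut points are computed once, "off-line", the resulting categorical column can be handed to any symbolic learner, which is the same property the abstract notes for the KEX preprocessing step.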

An Efficient Perpetual Learning Algorithm

International Journal of Artificial Intelligence Tools ◽

10.1142/s0218213019500222 ◽

2019 ◽

Vol 28 (07) ◽

pp. 1950022 ◽

Cited By ~ 1

Author(s):

Haiou Qin ◽

Du Zhang ◽

Xibin Sun ◽

Jiahua Tang ◽

Jun Peng

Keyword(s):

Machine Learning ◽

Real World ◽

Efficient Algorithm ◽

Learning Algorithm ◽

Small Data ◽

Computing Systems ◽

Agent Systems ◽

Multiple Tasks ◽

Improved Performance ◽

Over Time

One of the emerging research opportunities in machine learning is to develop computing systems that learn many tasks continuously and improve the performance of learned tasks incrementally over time. In the real world, learners have to adapt to labeled and unlabeled samples from various tasks, which arrive randomly. In this paper, we propose an efficient algorithm, the Efficient Perpetual Learning Algorithm (EPLA), which is suitable for learning multiple tasks in both offline and online settings. The algorithm, an extension of ELLA, is part of what we call perpetual learning: it can learn new tasks, or refine knowledge of learned tasks for improved performance, from newly arrived labeled samples in an incremental fashion. EPLA has several salient features. Learning episodes are triggered by either extrinsic or intrinsic stimuli. Agent systems based on the proposed algorithm can be engaged in an open-ended, alternating sequence of learning episodes and working episodes. Unlabeled samples can be used to self-train the learner in small-data settings. Compared with ELLA, EPLA shows almost equivalent performance without memorizing any previously learned labeled samples.

Statistical Design of Experiments with Engineering Applications

10.1201/b16326 ◽

2005 ◽

Cited By ~ 26

Author(s):

Kamel Rekab ◽

Muzaffar Shaikh

Keyword(s):

Design Of Experiments ◽

Statistical Design ◽

Statistical Design Of Experiments ◽

Engineering Applications

Statistical Design of Experiments With Engineering Applications

Journal of the American Statistical Association ◽

10.1198/jasa.2006.s76 ◽

2006 ◽

Vol 101 (473) ◽

pp. 396-397

Author(s):

Christine M Anderson-Cook

Keyword(s):

Design Of Experiments ◽

Statistical Design ◽

Statistical Design Of Experiments ◽

Engineering Applications

Study design: Development of an advanced machine learning algorithm for the early diagnosis of Gaucher disease using real-world data

Molecular Genetics and Metabolism ◽

10.1016/j.ymgme.2020.12.218 ◽

2021 ◽

Vol 132 (2) ◽

pp. S91

Author(s):

Shoshana Revel-Vilk ◽

Gabriel Chodick ◽

Varda Shalev ◽

Noga Gadir

Keyword(s):

Machine Learning ◽

Early Diagnosis ◽

Study Design ◽

Real World ◽

Gaucher Disease ◽

Learning Algorithm ◽

Machine Learning Algorithm ◽

Design Development ◽

Real World Data ◽

World Data

Machine Learning Prediction of Stroke Mechanism in Embolic Strokes of Undetermined Source

Stroke ◽

10.1161/strokeaha.120.029305 ◽

2020 ◽

Vol 51 (9) ◽

Cited By ~ 1

Author(s):

Hooman Kamel ◽

Babak B. Navi ◽

Neal S. Parikh ◽

Alexander E. Merkler ◽

Peter M. Okin ◽

...

Keyword(s):

Machine Learning ◽

Atrial Fibrillation ◽

Cross Validation ◽

Learning Algorithm ◽

Random Search ◽

Model Performance ◽

Area Under The Curve ◽

Independent Set ◽

Predicted Probability ◽

Using Data

Background and Purpose: One-fifth of ischemic strokes are embolic strokes of undetermined source (ESUS). Their theoretical causes can be classified as cardioembolic versus noncardioembolic. This distinction has important implications, but the categories’ proportions are unknown. Methods: Using data from the Cornell Acute Stroke Academic Registry, we trained a machine-learning algorithm to distinguish cardioembolic versus noncardioembolic strokes, then applied the algorithm to ESUS cases to determine the predicted proportion with an occult cardioembolic source. A panel of neurologists adjudicated stroke etiologies using standard criteria. We trained a machine learning classifier using data on demographics, comorbidities, vitals, laboratory results, and echocardiograms. An ensemble predictive method including L1 regularization, gradient-boosted decision tree ensemble (XGBoost), random forests, and multivariate adaptive splines was used. Random search and cross-validation were used to tune hyperparameters. Model performance was assessed using cross-validation among cases of known etiology. We applied the final algorithm to an independent set of ESUS cases to determine the predicted mechanism (cardioembolic or not). To assess our classifier’s validity, we correlated the predicted probability of a cardioembolic source with the eventual post-ESUS diagnosis of atrial fibrillation. Results: Among 1083 strokes with known etiologies, our classifier distinguished cardioembolic versus noncardioembolic cases with excellent accuracy (area under the curve, 0.85). Applied to 580 ESUS cases, the classifier predicted that 44% (95% credibility interval, 39%–49%) resulted from cardiac embolism. Individual ESUS patients’ predicted likelihood of cardiac embolism was associated with eventual atrial fibrillation detection (OR per 10% increase, 1.27 [95% CI, 1.03–1.57]; c-statistic, 0.68 [95% CI, 0.58–0.78]). ESUS patients with high predicted probability of cardiac embolism were older and had more coronary and peripheral vascular disease, lower ejection fractions, larger left atria, lower blood pressures, and higher creatinine levels. Conclusions: A machine learning estimator that distinguished known cardioembolic versus noncardioembolic strokes indirectly estimated that 44% of ESUS cases were cardioembolic.
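
The tuning step named in the Methods, random search combined with cross-validation, can be sketched as follows. This is an illustration under stated assumptions and not the study's pipeline: it swaps the ensemble for a toy one-dimensional k-nearest-neighbour classifier, uses synthetic data, and tunes a single hyperparameter (`n_neighbors`). All names in the block are hypothetical.

```python
import random


def k_fold_indices(n, k, rng):
    """Shuffle indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]


def knn_predict(train, x, n_neighbors):
    """Majority vote among the n_neighbors closest 1-D training points."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:n_neighbors]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > len(nearest) else 0


def cv_accuracy(data, folds, n_neighbors):
    """Mean held-out accuracy across folds for one hyperparameter value."""
    scores = []
    for fold in folds:
        held_out = set(fold)
        train = [p for i, p in enumerate(data) if i not in held_out]
        test = [data[i] for i in fold]
        correct = sum(knn_predict(train, x, n_neighbors) == y for x, y in test)
        scores.append(correct / len(test))
    return sum(scores) / len(scores)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
# Synthetic data (assumption): class 1 tends to have larger x than class 0.
data = [(rng.gauss(0.0, 1.0), 0) for _ in range(60)]
data += [(rng.gauss(2.0, 1.0), 1) for _ in range(60)]
folds = k_fold_indices(len(data), k=5, rng=rng)

# Random search: sample candidate values instead of enumerating a full grid.
# Odd values avoid tied votes between the two classes.
candidates = [rng.choice(range(1, 30, 2)) for _ in range(8)]
results = {k: cv_accuracy(data, folds, k) for k in candidates}
best_k = max(results, key=results.get)
print(f"best n_neighbors={best_k}, cv accuracy={results[best_k]:.3f}")
```

Every candidate is scored on the same folds, so differences in the cross-validated score reflect the hyperparameter rather than the split. A final estimate of performance would still need data not used during tuning, which is the role the independent ESUS set plays in the abstract.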

Effect of a sepsis prediction algorithm on patient mortality, length of stay and readmission: a prospective multicentre clinical outcomes evaluation of real-world patient data from US hospitals

BMJ Health & Care Informatics ◽

10.1136/bmjhci-2019-100109 ◽

2020 ◽

Vol 27 (1) ◽

pp. e100109 ◽

Cited By ~ 1

Author(s):

Hoyt Burdick ◽

Eduardo Pino ◽

Denise Gabel-Comeau ◽

Andrea McCoy ◽

Carol Gu ◽

...

Keyword(s):

Machine Learning ◽

Severe Sepsis ◽

Length Of Stay ◽

Hospital Mortality ◽

Clinical Outcomes ◽

Real World ◽

Learning Algorithm ◽

Hospital Length ◽

Hospital Length Of Stay ◽

Machine Learning Algorithm

Background: Severe sepsis and septic shock are among the leading causes of death in the USA. While early prediction of severe sepsis can reduce adverse patient outcomes, sepsis remains one of the most expensive conditions to diagnose and treat. Objective: The purpose of this study was to evaluate the effect of a machine learning algorithm for severe sepsis prediction on in-hospital mortality, hospital length of stay and 30-day readmission. Design: Prospective clinical outcomes evaluation. Setting: Evaluation was performed on a multiyear, multicentre clinical data set of real-world data containing 75 147 patient encounters from nine hospitals across the continental USA, ranging from community hospitals to large academic medical centres. Participants: Analyses were performed for 17 758 adult patients who met two or more systemic inflammatory response syndrome criteria at any point during their stay (‘sepsis-related’ patients). Interventions: Machine learning algorithm for severe sepsis prediction. Outcome measures: In-hospital mortality, length of stay and 30-day readmission rates. Results: Hospitals saw an average 39.5% reduction of in-hospital mortality, a 32.3% reduction in hospital length of stay and a 22.7% reduction in 30-day readmission rate for sepsis-related patient stays when using the machine learning algorithm in clinical outcomes analysis. Conclusions: Reductions of in-hospital mortality, hospital length of stay and 30-day readmissions were observed in real-world clinical use of the machine learning-based algorithm. The predictive algorithm may be successfully used to improve sepsis-related outcomes in live clinical settings. Trial registration number: NCT03960203

Machine Learning-Based Cooperative Spectrum Sensing in Dynamic Segmentation Enabled Cognitive Radio Vehicular Network

Energies ◽

10.3390/en14041169 ◽

2021 ◽

Vol 14 (4) ◽

pp. 1169

Author(s):

Mohammad Asif Hossain ◽

Rafidah Md Noor ◽

Kok-Lim Alvin Yau ◽

Saaidal Razalli Azzuhri ◽

Muhammad Reza Z’aba ◽

...

Keyword(s):

Machine Learning ◽

Cognitive Radio ◽

Spectrum Sensing ◽

Ad Hoc ◽

Cooperative Spectrum Sensing ◽

Learning Algorithm ◽

Congestion Management ◽

Secondary Users ◽

Dynamic Segmentation ◽

Services Integration

A vehicular ad hoc network (VANET) is a solution for road safety, congestion management, and infotainment services. Integration of cognitive radio (CR), known as CR-VANET, is needed to solve the spectrum scarcity problems of VANET. Several research efforts have addressed the concerns of CR-VANET. However, more reliable, robust, and faster spectrum sensing is still a challenge. A novel segment-based CR-VANET (Seg-CR-VANET) architecture is therefore proposed in this paper. Roads are divided equally into segments, which are then sub-segmented based on the probability value. Individual vehicles, or secondary users, produce local sensing results by choosing an optimal spectrum sensing (SS) technique using a hybrid machine learning algorithm that includes fuzzy and naïve Bayes algorithms. We use dynamic threshold values for the sensing techniques. In the proposed cooperative SS, the segment spectrum agent (SSA) makes the global decision using the tri-agent reinforcement learning (TA-RL) algorithm. The proposed algorithm learns three environments (network, signal, and vehicle) to determine the activities of primary (licensed) users. The simulation results indicate that, compared to current works, the proposed Seg-CR-VANET produces better results in spectrum sensing.
