A Novel Performance Measure for Machine Learning Classification

2021 ◽  
Vol 13 (1) ◽  
pp. 11-19
Author(s):  
Mingxing Gong

Machine learning models have been widely used in numerous classification problems, and performance measures play a critical role in machine learning model development, selection, and evaluation. This paper provides a comprehensive overview of performance measures in machine learning classification. In addition, we propose a framework to construct a novel evaluation metric based on the voting results of three performance measures, each of which has strengths and limitations. The new metric can be proven better than accuracy in terms of consistency and discriminancy.
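The voting idea can be illustrated with a minimal sketch. The abstract does not say which three measures are combined or how the votes are tallied, so the choice of accuracy, F1, and the Matthews correlation coefficient (MCC), the simple majority rule, and the function names below are all assumptions made for illustration, not the paper's method.

```python
from math import sqrt

def confusion(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn

def accuracy(y_true, y_pred):
    tp, fp, fn, tn = confusion(y_true, y_pred)
    return (tp + tn) / (tp + fp + fn + tn)

def f1(y_true, y_pred):
    tp, fp, fn, _ = confusion(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def mcc(y_true, y_pred):
    tp, fp, fn, tn = confusion(y_true, y_pred)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Assumed trio of measures; the source does not name them.
MEASURES = (accuracy, f1, mcc)

def vote(y_true, pred_a, pred_b):
    """Majority vote over the measures: returns "A", "B", or "tie"."""
    votes_a = votes_b = 0
    for measure in MEASURES:
        score_a = measure(y_true, pred_a)
        score_b = measure(y_true, pred_b)
        if score_a > score_b:
            votes_a += 1
        elif score_b > score_a:
            votes_b += 1
    if votes_a > votes_b:
        return "A"
    if votes_b > votes_a:
        return "B"
    return "tie"

# Illustrative labels: model A predicts the majority class only.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
pred_a = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
pred_b = [1, 1, 0, 1, 1, 0, 0, 0, 0, 0]
print(vote(y_true, pred_a, pred_b))  # prints: B
```

In this made-up example both models have the same accuracy (0.7), so accuracy alone cannot separate them, while the vote prefers the model that actually detects positive cases. That is the kind of added discriminancy the abstract refers to.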

Stroke ◽  
2012 ◽  
Vol 43 (suppl_1) ◽  
Author(s):  
Crismely A Perdomo ◽  
Vepuka E Kauari ◽  
Elizabeth Suarez ◽  
Olajide Williams ◽  
Joshua Stillman ◽  
...  

Background and Purpose: The literature demonstrates that evidence-based, standardized stroke care can improve patient outcomes; however, electronic medical record (EMR) systems may also affect outcomes by ensuring utilization of and compliance with established stroke performance measures, facilitating and improving documentation, and standardizing the approach to care. In 2008, documentation in patients’ medical records was done in a combination of paper and a template-free EMR. Originally, the EMR was used for order entry; it then transitioned to full electronic documentation in 2009. At that time we implemented our stroke templates and performance measures based on regulatory standards. We hypothesized that the stroke template implementation would help us achieve performance measure criteria above state benchmarks as set out by the New York State Department of Health (NYS DOH). Methods: Implementation was phased in [over 18 months], initially using a template that included only neurological assessment and free text fields for stroke measures. By July 2010, existing templates were modified and additional stroke templates were implemented to meet new regulatory requirements and meaningful use criteria. A retrospective data review was conducted to compare performance between 2008 (one year prior to EMR/template implementation) and 2010. In Quarter 1 of 2011, the EMR was also implemented in the Emergency Department (ED). Data were reviewed for compliance with stroke measures. Results: Documentation compliance improved substantially between 2008 and Quarter 1 of 2011. Compliance for these measures has been maintained at ≥ 85% since November 2010 and at ≥ 90% in Quarter 1 of 2011. Conclusions: The EMR implementation of stroke templates and performance measures can produce substantial improvement in performance measure compliance. Future steps will include automated documentation alerts to retrieve information and real-time discovery of missing documentation for concurrent quality review and improvement.


Author(s):  
Maktuba Mohid ◽  
Julian Francis Miller ◽  
Simon L. Harding ◽  
Gunnar Tufte ◽  
Odd Rune Lykkebø ◽  
...  

2011 ◽  
Vol 84 (4) ◽  
pp. 493-506
Author(s):  
Irene S. Yurovska ◽  
Michael D. Morris ◽  
Theo Al

Racing tires and motorcycle tires represent distinct segments of the tire market. For instance, while the average life of car and truck tires is 50 000 miles, the average life of race tires is 100 miles. Because tires play a critical role in a race, technical demands to assure safety and performance are growing. Similarly, tires have a large influence on the safety, handling/grip, and performance of the rapidly growing world fleet of motorcycles, because only two wheels are in contact with the ground. Thus, the common feature of both market segments is that the typical tire compromise of wear, rolling resistance, and traction is strongly weighted toward traction. Most of the recent efforts of rubber scientists have been directed toward lowering the rolling resistance of tread compounds, which has left a certain void in the science of compounding for racing and motorcycle treads. In particular, the industrial assortment of polymers and fillers used for motorcycle treads commonly differs from that used for car or truck treads, but it is not known how the filler properties affect the hysteresis–stiffness compromise. The objective of this study is to evaluate the effects of carbon black characteristics on the important properties of a typical racing and motorcycle tire tread compound. More than 50 individual carbon blacks were mixed in an SBR formulation. The acquired data were statistically analyzed, and a linear multiple regression model was developed to relate rubber properties (responses), such as static modulus, complex dynamic modulus, hysteresis, and viscosity, to the key carbon black characteristics (variables) of surface area, structure, aggregate size distribution, and surface activity. Prediction profiles created from the model demonstrate rubber performance limits for the range of carbon blacks tested, and indicate the niches that provide required combinations of the rubber properties.
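A linear multiple regression of this kind can be sketched as an ordinary least-squares fit. The abstract gives no data or coefficients, so everything numeric below is synthetic and purely illustrative: the two predictors stand in for surface area and structure, the response stands in for one rubber property, and the helper names `solve` and `fit_linear` are hypothetical.

```python
def solve(a, b):
    """Solve the square system a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system; predictors may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_linear(rows, y):
    """Least-squares coefficients [intercept, b1, b2, ...] via the normal equations."""
    X = [[1.0] + list(r) for r in rows]  # prepend the intercept column
    p = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(p)] for i in range(p)]
    xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(p)]
    return solve(xtx, xty)

# Synthetic predictors (made-up values): (surface area, structure) per carbon black.
rows = [(80.0, 100.0), (120.0, 110.0), (100.0, 70.0), (140.0, 125.0), (60.0, 90.0)]
# Synthetic response generated from known coefficients so the fit can be checked.
true_coef = [1.0, 0.02, 0.05]
y = [true_coef[0] + true_coef[1] * sa + true_coef[2] * st for sa, st in rows]

coef = fit_linear(rows, y)
print([round(c, 6) for c in coef])
```

The study fits several responses against four carbon black characteristics; the same fit would simply be repeated per response with more predictor columns. The normal equations are used here only to keep the sketch dependency-free; a QR-based solver is usually preferred for numerical stability.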


2022 ◽  
pp. 24-56
Author(s):  
Rajab Ssemwogerere ◽  
Wamwoyo Faruk ◽  
Nambobi Mutwalibi

Classification is a data mining technique used to estimate the group membership of items on the basis of a common feature. This technique is useful for future planning and for discovering new knowledge about a specific dataset. An in-depth study of the previous literature implementing data mining techniques in the design of recommender systems was performed. This chapter provides a broad study of how recommender systems are designed using various data mining classification techniques of machine learning, and examines their methodological decisions in four aspects: recommendation approaches, data mining techniques, recommendation types, and performance measures. This study focuses on selected classification methods and can help both researchers and students in computer science and machine learning strengthen their knowledge of machine learning and data mining.


2018 ◽  
Vol 22 (1) ◽  
pp. 31-41 ◽  
Author(s):  
Nopadol Rompho

Purpose: The purpose of this study is to investigate the uses of performance measures in startup firms, including the perceived importance and performance of those measures. Design/methodology/approach: The survey method is used in this study. Data are collected from founders/chief executive officers/managers of 110 startups in Thailand. Correlation analysis and analysis of variance are used as the analysis tools. Findings: The results show that there is a positive relationship between the perceived importance and the performance of each metric. However, no significant differences are found in the importance and performance of each metric among the various stages of startups. Research limitations/implications: Because there are so few startups compared to large corporations, the sample size of this study is relatively small, which is a limitation for some statistical tests. Practical implications: Startups should measure and monitor the correct metrics for their particular stage, instead of trying to perform well in all areas, which will lead them to lose focus and possibly even fail. Results obtained from this study will aid startups in properly monitoring and managing their performance. Originality/value: Unlike those of large corporations, the performance measures used by startups vary and depend on a startup’s stage and type. Because there are far fewer startups than large corporations, there are a limited number of studies in this area. This research is among the first studies to investigate the uses of performance measures for this new type of organization.


1980 ◽  
Vol 5 (4) ◽  
pp. 267-274
Author(s):  
Mirza S. Saiyadain

Several reasons have been offered for the depressed values of coefficients of correlation between performance evaluation scores and test scores for tests that otherwise seem to have high validity. Most of these studies have concerned themselves with only the first-year performance measure. This study was undertaken to broaden the validity design by including performance measures of three subsequent years. Data on the test and performance scores of a sample of executives were analysed. The results indicate that though test scores may not show a significant relationship with the first-year performance appraisal score, they show a positive and significant relationship with subsequent performance appraisal scores. The results are explained in terms of changed performance evaluation.


Author(s):  
Qi Wang ◽  
Xia Zhao ◽  
Jincai Huang ◽  
Yanghe Feng ◽  
Zhong Liu ◽  
...  

The concept of ‘big data’ has been widely discussed, and its value has been illuminated throughout a variety of domains. To quickly mine potential value and cope with the ever-increasing volume of information, machine learning is playing an increasingly important role and faces more challenges than ever. Because few studies exist regarding how to modify machine learning techniques to accommodate big data environments, we provide a comprehensive overview of the history of the evolution of big data, the foundations of machine learning, and the bottlenecks and trends of machine learning in the big data era. More specifically, based on learning principles, we discuss regularization to enhance generalization. The challenges of quality in big data are reduced to the curse of dimensionality, class imbalances, concept drift and label noise, and the underlying reasons and mainstream methodologies to address these challenges are introduced. Learning model development has been driven by domain specifics, dataset complexities, and the presence or absence of human involvement. In this paper, we propose a robust learning paradigm by aggregating the aforementioned factors. Over the next few decades, we believe that these perspectives will lead to novel ideas and encourage more studies aimed at incorporating knowledge and establishing data-driven learning systems that involve both data quality considerations and human interactions.


2021 ◽  
Author(s):  
Coralie Joucla ◽  
Damien Gabriel ◽  
Emmanuel Haffen ◽  
Juan-Pablo Ortega

Research in machine-learning classification of electroencephalography (EEG) data offers important perspectives for the diagnosis and prognosis of a wide variety of neurological and psychiatric conditions, but the clinical adoption of such systems remains low. We propose here that many of the difficulties in translating EEG machine-learning research to the clinic result from consistent inaccuracies in technical reporting, which severely impair the interpretability of the often-high claims of performance. Taking as an example a major class of machine-learning algorithms used in EEG research, the support-vector machine (SVM), we highlight three important aspects of model development (normalization, hyperparameter optimization and cross-validation) and show that, while these three aspects can make or break the performance of the system, they are left entirely undocumented in a shockingly vast majority of the research literature. Providing a more systematic description of these aspects of model development constitutes three simple steps to improve the interpretability of EEG-SVM research and, ultimately, its clinical adoption.


2020 ◽  
Vol 27 (12) ◽  
pp. 1885-1893
Author(s):  
York Jiao ◽  
Anshuman Sharma ◽  
Arbi Ben Abdallah ◽  
Thomas M Maddox ◽  
Thomas Kannampallil

Objective: Accurate estimations of surgical case durations can lead to the cost-effective utilization of operating rooms. We developed a novel machine learning approach, using both structured and unstructured features as input, to predict a continuous probability distribution of surgical case durations. Materials and Methods: The data set consisted of 53 783 surgical cases performed over 4 years at a tertiary-care pediatric hospital. Features extracted included categorical (American Society of Anesthesiologists [ASA] Physical Status, inpatient status, day of week), continuous (scheduled surgery duration, patient age), and unstructured text (procedure name, surgical diagnosis) variables. A mixture density network (MDN) was trained and compared to multiple tree-based methods and a Bayesian statistical method. A continuous ranked probability score (CRPS), a generalized extension of mean absolute error, was the primary performance measure. Pinball loss (PL) was calculated to assess accuracy at specific quantiles. Performance measures were additionally evaluated on common and rare surgical procedures. Permutation feature importance was measured for the best performing model. Results: MDN had the best performance, with a CRPS of 18.1 minutes, compared to tree-based methods (19.5–22.1 minutes) and the Bayesian method (21.2 minutes). MDN had the best PL at all quantiles, and the best CRPS and PL for both common and rare procedures. Scheduled duration and procedure name were the most important features in the MDN. Conclusions: Using natural language processing of surgical descriptors, we demonstrated the use of ML approaches to predict the continuous probability distribution of surgical case durations. The more discerning forecast of the ML-based MDN approach affords opportunities for guiding intelligent schedule design and day-of-surgery operational decisions.
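The two measures named above can be sketched in a few lines. The paper evaluates a continuous predicted distribution. As a stand-in, the sketch below estimates CRPS from samples drawn from a forecast using the standard identity CRPS = E|X - y| - 0.5 E|X - X'|, and the function names and example numbers are assumptions for illustration only.

```python
def crps_from_samples(samples, observed):
    """Sample-based CRPS estimate: E|X - y| - 0.5 * E|X - X'|."""
    n = len(samples)
    term_obs = sum(abs(x - observed) for x in samples) / n
    term_spread = sum(abs(a - b) for a in samples for b in samples) / (n * n)
    return term_obs - 0.5 * term_spread

def pinball_loss(observed, predicted_quantile, q):
    """Pinball (quantile) loss for a predicted q-quantile, with 0 < q < 1."""
    diff = observed - predicted_quantile
    return max(q * diff, (q - 1.0) * diff)

# Made-up durations in minutes, for illustration only.
# A one-sample (point) forecast: CRPS collapses to the absolute error.
print(crps_from_samples([10.0], 12.0))        # prints: 2.0
# A two-sample forecast spread around the observed value.
print(crps_from_samples([8.0, 12.0], 10.0))   # prints: 1.0
# Under-predicting the 0.9 quantile is penalized more than over-predicting it.
print(pinball_loss(12.0, 10.0, 0.9) > pinball_loss(8.0, 10.0, 0.9))  # prints: True
```

The first call shows why CRPS is described as a generalization of mean absolute error: for a forecast concentrated on a single value the spread term vanishes and only the absolute error remains. Like the reported scores, the result is in the units of the target (minutes here).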

