Predicting the Visibility of the First Crescent

2018 ◽  
Vol 3 (2) ◽  
pp. 10
Author(s):  
Tafseer Ahmed

This study presents an application of machine learning to predict whether the first crescent of the lunar month will be visible to the naked eye on a given date. The study presents a dataset of successful and unsuccessful attempts to sight the first crescent at the start of the lunar month. Previously, this problem was solved by analytically deriving equations for the visibility parameter(s) and manually fixing threshold values. Instead, we applied supervised machine learning to the independent variables of the problem, and the system learned the classification criteria from the data. The system achieves a precision of 0.88 and a recall of 0.87, and hence handles false positives and false negatives equally well.
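
The shift from hand-tuned thresholds to a learned criterion can be illustrated with a short sketch. The feature names below (moon altitude, arc of light, lag time) are common crescent-visibility parameters used here as hypothetical stand-ins, and the classifier choice is an assumption; the paper's actual variables and dataset are not reproduced.

```python
# Minimal sketch: learn a visibility criterion from labeled sighting attempts
# instead of manually fixing thresholds. Features and labels are synthetic
# stand-ins, not the paper's data.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import precision_score, recall_score

rng = np.random.default_rng(0)
n = 500
# Hypothetical independent variables: altitude (deg), arc of light (deg), lag time (min)
X = rng.uniform([0, 5, 10], [20, 25, 80], size=(n, 3))
# Synthetic labels standing in for recorded sighting outcomes
y = (0.4 * X[:, 0] + 0.5 * X[:, 1] + 0.05 * X[:, 2] > 16).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
clf = RandomForestClassifier(random_state=0).fit(X_train, y_train)
pred = clf.predict(X_test)
print("precision:", precision_score(y_test, pred))
print("recall:", recall_score(y_test, pred))
```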

2019 ◽  
Vol 7 ◽  
pp. 29
Author(s):  
Emma M.A.L. Beauxis-Aussalet ◽  
Joost Van Doorn ◽  
Lynda Hardman

Classifiers are applied in many domains where classification errors have significant implications. However, end-users may not always understand the errors and their impact, as error visualizations are typically designed for experts and for improving classifiers. We discuss the specific needs of classifiers' end-users, and a simplified visualization designed to address them. We evaluate this design with users from three levels of expertise, and compare it with ROC curves and confusion matrices. We identify key difficulties with understanding the classification errors, and how visualizations addressed or aggravated them. The main issues concerned confusion of the actual and predicted classes (e.g., mistaking false positives for false negatives). The machine learning terminology, the complexity of ROC curves, and the symmetry of confusion matrices aggravated these confusions. The end-user-oriented visualization reduced the difficulties by using several visual features to clarify the actual and predicted classes, and by using more tangible metrics and representations. Our results contribute to supporting end-users' understanding of classification errors, and to informed decisions when choosing or tuning classifiers.
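
Since the core difficulty reported is telling false positives apart from false negatives, a minimal example of how those cells fall out of a confusion matrix may help; the labels below are invented.

```python
# Illustrative only: how false positives and false negatives fall out of a
# confusion matrix, the distinction end-users most often confused.
from sklearn.metrics import confusion_matrix

y_actual    = [1, 0, 1, 1, 0, 0, 1, 0]  # ground truth
y_predicted = [1, 0, 0, 1, 1, 0, 1, 0]  # classifier output

# sklearn convention: rows = actual class, columns = predicted class
tn, fp, fn, tp = confusion_matrix(y_actual, y_predicted).ravel()
print(f"false positives (actual 0, predicted 1): {fp}")  # 1
print(f"false negatives (actual 1, predicted 0): {fn}")  # 1
```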


2020 ◽  
Vol 1 (2) ◽  
pp. 1-4
Author(s):  
Priyam Guha ◽  
Abhishek Mukherjee ◽  
Abhishek Verma

This research paper deals with using supervised machine learning algorithms to detect the authenticity of banknotes. In this research we were successful in achieving very high accuracy (on the order of 99%) by applying some data preprocessing tricks and then running the processed data through supervised learning algorithms such as SVM, Decision Trees, Logistic Regression, and KNN. We then proceed to analyze the misclassified points. We examine the confusion matrix to find out which algorithms had more false positives and which had more false negatives.
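
A sketch of the comparison described above. It assumes the UCI Banknote Authentication dataset (wavelet-transform features of banknote images), which matches the task described, and uses feature scaling as a stand-in for the unspecified "preprocessing tricks".

```python
# Compare the four named classifiers on the UCI banknote dataset and inspect
# each model's false positives and false negatives via the confusion matrix.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score, confusion_matrix

# Fetched from the UCI repository (network access required).
url = ("https://archive.ics.uci.edu/ml/machine-learning-databases/"
       "00267/data_banknote_authentication.txt")
cols = ["variance", "skewness", "kurtosis", "entropy", "authentic"]
df = pd.read_csv(url, header=None, names=cols)

X_train, X_test, y_train, y_test = train_test_split(
    df[cols[:-1]], df["authentic"], test_size=0.3, random_state=0)

# Scaling is one plausible preprocessing step; SVM and KNN are sensitive to it.
scaler = StandardScaler().fit(X_train)
X_train_s, X_test_s = scaler.transform(X_train), scaler.transform(X_test)

models = {
    "SVM": SVC(),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "KNN": KNeighborsClassifier(),
}
for name, model in models.items():
    model.fit(X_train_s, y_train)
    pred = model.predict(X_test_s)
    tn, fp, fn, tp = confusion_matrix(y_test, pred).ravel()
    print(f"{name}: accuracy={accuracy_score(y_test, pred):.3f}  FP={fp}  FN={fn}")
```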


2017 ◽  
Author(s):  
Jianfeng Yang ◽  
Xiaofan Ding ◽  
Weidong Zhu

Abstract With the advance of next-generation sequencing technologies, non-invasive prenatal testing (NIPT) has been developed and employed in fetal aneuploidy screening for 13-/18-/21-trisomies through detecting cell-free fetal DNA (cffDNA) in maternal blood. Although the Z test is widely used in NIPT nowadays, there is still a need to improve its accuracy, both to reduce (a) false negatives and false positives and (b) the ratio of unclassified data, so as to lessen the potential harm to patients caused by these inaccuracies as well as the cost of the retests they induce.

Employing multiple Z tests with a machine-learning algorithm can provide better predictions on NIPT data. Combining the multiple Z values with indexes of clinical signs and quality control, features were collected from the known samples and scaled for model training with support vector machine (SVM) discrimination. The trained model was applied to predict the unknown samples and showed significant improvement: in 4752 qualified NIPT samples, our method reached 100% accuracy on all three chromosomes, including 151 samples that were grouped as unclassified by the one-Z-value-based method. Moreover, four false positives and four false negatives were corrected by using this machine-learning model.

To our knowledge, this is the first study to employ a support vector machine in NIPT data analysis. It is expected to replace the current one-Z-value-based NIPT analysis in clinical use.
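
A minimal sketch of the multiple-Z-value idea: rather than thresholding a single Z score, per-chromosome Z values plus a quality-control index become features for an SVM. The data below are synthetic and the feature layout is an assumption; the paper's features also include clinical-sign indexes not shown here.

```python
# Feed multiple Z scores (and a QC index) into an SVM instead of applying a
# single-Z threshold. All values here are synthetic stand-ins.
import numpy as np
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(1)
# Columns: Z(chr13), Z(chr18), Z(chr21), a QC index such as cffDNA fraction
X = rng.normal(0, 1, size=(400, 4))
X[:40, 2] += 6            # synthetic trisomy-21-like samples with elevated Z
y = np.zeros(400, dtype=int)
y[:40] = 1                # 1 = aneuploidy-positive label

model = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
model.fit(X, y)
print(model.predict([[0.2, -0.5, 7.1, 0.3]]))  # high chr21 Z -> predicted positive
```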


10.2196/19200 ◽  
2020 ◽  
Vol 8 (6) ◽  
pp. e19200
Author(s):  
Nicole Lowres ◽  
Andrew Duckworth ◽  
Julie Redfern ◽  
Aravinda Thiagalingam ◽  
Clara K Chow

Background: SMS text messaging programs are increasingly being used for secondary prevention, and have been shown to be effective in a number of health conditions, including cardiovascular disease. SMS text messaging programs have the potential to increase the reach of an intervention, at a reduced cost, to larger numbers of people who may not access traditional programs. However, patients regularly reply to the SMS text messages, leading to additional staffing requirements to monitor and moderate the patients' SMS text messaging replies. This additional staff requirement directly impacts the cost-effectiveness and scalability of SMS text messaging interventions.

Objective: This study aimed to test the feasibility and accuracy of developing a machine learning (ML) program to triage SMS text messaging replies (ie, identify which SMS text messaging replies require a health professional review).

Methods: SMS text messaging replies received from 2 clinical trials were manually coded (1) into "Is staff review required?" (binary response of yes/no) and then (2) into 12 general categories. Five ML models (Naïve Bayes, OneVsRest, Random Forest Decision Trees, Gradient Boosted Trees, and Multilayer Perceptron) and an ensemble model were tested. For each model run, data were randomly allocated into a training set (2183/3118, 70.01%) and a test set (935/3118, 29.98%). Accuracy for the yes/no classification was calculated using the area under the receiver operating characteristic curve (AUC), false positives, and false negatives. Accuracy for classification into the 12 categories was compared using multiclass classification evaluators.

Results: A manual review of 3118 SMS text messaging replies showed that 22.00% (686/3118) required staff review. For determining the need for staff review, the Multilayer Perceptron model had the highest accuracy (AUC 0.86; 4.85% false negatives; 4.63% false positives); with the addition of heuristics (specified keywords), fewer false negatives were identified (3.19%), with a small increase in false positives (7.66%) and an AUC of 0.79. Application of this model would result in 26.7% of SMS text messaging replies requiring review (true + false positives). The ensemble model produced the lowest false negatives (1.43%) at the expense of higher false positives (16.19%). OneVsRest was the most accurate (72.3%) for the 12-category classification.

Conclusions: The ML program has high sensitivity for identifying the SMS text messaging replies requiring staff input; however, future research is required to validate the models against larger data sets. Incorporation of an ML program to review SMS text messaging replies could significantly reduce staff workload, as staff would not have to review all incoming SMS text messages. This could lead to substantial improvements in the cost-effectiveness, scalability, and capacity of SMS text messaging–based interventions.
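
A sketch of the binary triage step under stated assumptions: TF-IDF features feeding a multilayer perceptron, a 70/30 split, and AUC plus raw false-positive/false-negative counts. The paper does not specify its text features, and the replies below are invented (the trial data are not public).

```python
# Triage sketch: classify SMS replies into "staff review required" yes/no.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score, confusion_matrix

replies = ["Thanks so much", "I have chest pain when walking",
           "Please stop these messages", "Great program!",
           "Can I change my appointment?", "ok"] * 50
needs_review = [0, 1, 1, 0, 1, 0] * 50  # 1 = staff review required

X_train, X_test, y_train, y_test = train_test_split(
    replies, needs_review, test_size=0.3, random_state=0)

vec = TfidfVectorizer()
clf = MLPClassifier(hidden_layer_sizes=(50,), max_iter=500, random_state=0)
clf.fit(vec.fit_transform(X_train), y_train)

scores = clf.predict_proba(vec.transform(X_test))[:, 1]
tn, fp, fn, tp = confusion_matrix(y_test, scores > 0.5).ravel()
print(f"AUC={roc_auc_score(y_test, scores):.2f}  FP={fp}  FN={fn}")
```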



2019 ◽  
Vol 40 (Supplement_1) ◽  
Author(s):  
N Lowres ◽  
A Duckworth ◽  
C K Chow ◽  
A Thiagalingam ◽  
J Redfern

Abstract

Background: Cardiovascular SMS text programs are effective alternate secondary prevention programs for cardiac risk factor reduction and can be delivered as one-way or two-way communication. However, people text back regularly, leading to staffing costs to monitor replies. If the need for staff review could be reduced by 60–70%, the costs and scalability of text programs would substantially improve.

Purpose: To develop and assess the accuracy of a machine-learning (ML) program to "triage" texts and identify those requiring review or action.

Methods: We manually reviewed and classified all replies received from two "TEXT ME" cardiovascular secondary prevention programs. Simultaneously, an ML model was developed to classify texts and determine those needing a reply (Figure 1). The ML models compared included "Naïve Bayes", "random forest decision trees", and "gradient boosted trees", along with "convolutional neural network" and "recurrent neural network" classification approaches. Natural language processing was also evaluated; however, it presented challenges due to non-standard English grammar, frequent use of non-standard abbreviations, and spelling errors in the text content. The ML program was trained with 70% of the data set, and its accuracy was tested on the remaining 30%.

Results: Manual review of 3118 text replies revealed that only one text was considered urgent, and only 21% required review or action; categorisation was not straightforward because texts often contained more than one sentiment (Table 1). The ML program correctly classified 84% of texts into the designated 12 categories. The sensitivity for correctly identifying the need for health professional review was 94% (6.4% false negatives; 3.6% false positives); with the addition of "heuristics" (e.g., searching for specified keywords, question marks, etc.), sensitivity increased to 97% (2.9% false negatives; 7.3% false positives). Therefore, health professionals would only have to review 27% (true + false positives) of all text replies.

Table 1. SMS manual categorisation (n=3118)

  REVIEW REQUIRED
    Health question/concern   13%
    Admin request             4.5%
    Request to STOP           3%
    Ceased smoking            0.8%
    SMS not delivered         0.4%
    Urgent/distress           0.03%

  NO REVIEW REQUIRED
    General statement         33%
    Statement of thanks       23%
    Reporting good health     11%
    Blank message             6%
    Unrelated/accidental      4%
    Emoticon only             2.4%

Figure 1. Development process (flow diagram not reproduced here)

Conclusions: The ML program has high sensitivity for identifying text replies requiring health professional input, and a low false negative rate indicates that few messages needing a response would be missed. Thus, introduction of the program could significantly reduce the workload of health professionals, leading to substantial improvements in the scalability and capacity of text-based programs. The future implications for this technology are vast, including utilisation in other interactive mHealth interfaces and cardiovascular health "apps".

Acknowledgement/Funding: National Heart Foundation Vanguard Grant; National Health and Medical Research Council Project Grant
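
The "heuristics" layer lends itself to a short sketch: after the classifier votes "no review needed", specified keywords or a question mark can still force a message into the review queue, trading extra false positives for fewer false negatives. The keyword list below is illustrative, not the study's.

```python
# Combine an ML prediction with keyword/punctuation heuristics so that
# borderline messages err toward review (fewer false negatives).
REVIEW_KEYWORDS = {"pain", "chest", "dizzy", "hospital", "stop", "help"}

def needs_review(text: str, model_says_review: bool) -> bool:
    """Return True if a health professional should see this message."""
    if model_says_review:
        return True
    if "?" in text:                      # questions usually need an answer
        return True
    lowered = text.lower()
    return any(word in lowered for word in REVIEW_KEYWORDS)

print(needs_review("feeling dizzy today", model_says_review=False))  # True
print(needs_review("thanks!", model_says_review=False))              # False
```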


2020 ◽  
Vol 14 (2) ◽  
pp. 140-159
Author(s):  
Anthony-Paul Cooper ◽  
Emmanuel Awuni Kolog ◽  
Erkki Sutinen

This article builds on previous research exploring the content of church-related tweets. It does so by examining whether the qualitative thematic coding of such tweets can, in part, be automated by the use of machine learning. It compares three supervised machine learning algorithms to understand how useful each is at a classification task, based on a dataset of human-coded church-related tweets. The study finds that one such algorithm, Naïve Bayes, performs better than the other algorithms considered, returning precision, recall, and F-measure values that each exceed an acceptable threshold of 70%. This has far-reaching consequences at a time when the high volume of social media data, in this case Twitter data, means that the resource intensity of manual coding approaches can act as a barrier to understanding how the online community interacts with, and talks about, church. The findings presented in this article offer a way forward for scholars of digital theology to better understand the content of online church discourse.
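
A sketch of the kind of pipeline the study describes, under stated assumptions: a Naïve Bayes classifier over bag-of-words counts, scored against the 70% threshold on precision, recall, and F-measure. The tweets and theme codes below are invented.

```python
# Automate thematic coding with Naive Bayes, then check the metrics against
# the 70% acceptability threshold named in the study.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.model_selection import train_test_split
from sklearn.metrics import precision_recall_fscore_support

tweets = ["Loved the sermon this morning", "Church picnic next Sunday!",
          "Traffic is terrible near the church", "Volunteers needed for choir"] * 40
themes = ["worship", "events", "other", "events"] * 40  # human-coded labels

X_train, X_test, y_train, y_test = train_test_split(
    tweets, themes, test_size=0.3, random_state=0)

vec = CountVectorizer()
clf = MultinomialNB().fit(vec.fit_transform(X_train), y_train)
pred = clf.predict(vec.transform(X_test))

p, r, f, _ = precision_recall_fscore_support(y_test, pred, average="macro")
print(f"precision={p:.2f} recall={r:.2f} F-measure={f:.2f}  (threshold: 0.70)")
```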


2020 ◽  
Vol 2020 (14) ◽  
pp. 378-1-378-7
Author(s):  
Tyler Nuanes ◽  
Matt Elsey ◽  
Radek Grzeszczuk ◽  
John Paul Shen

We present a high-quality sky segmentation model for depth refinement and investigate residual architecture performance to inform optimal shrinking of the network. We describe a model that runs in near real-time on a mobile device, present a new, high-quality dataset, and detail a unique weighting to trade off false positives and false negatives in binary classifiers. We show how the optimizations improve bokeh rendering by correcting stereo depth mispredictions in sky regions. We detail techniques used to preserve edges, reject false positives, and ensure generalization to the diversity of sky scenes. Finally, we present a compact model and compare the performance of four popular residual architectures (ShuffleNet, MobileNetV2, ResNet-101, and ResNet-34-like) at constant computational cost.
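
One standard way to implement the false-positive/false-negative trade-off described for binary classifiers is to weight the positive class in the loss. The sketch below uses PyTorch's stock pos_weight mechanism, which is not necessarily the paper's exact weighting scheme.

```python
# Weight the positive (sky) class in a binary segmentation loss to bias the
# model toward fewer false negatives or fewer false positives.
import torch
import torch.nn as nn

logits = torch.randn(4, 1, 64, 64)                     # raw per-pixel sky scores
target = torch.randint(0, 2, (4, 1, 64, 64)).float()   # 1 = sky pixel

# pos_weight > 1 penalizes false negatives (missed sky) more heavily;
# pos_weight < 1 penalizes false positives (hallucinated sky) more heavily.
loss_fn = nn.BCEWithLogitsLoss(pos_weight=torch.tensor(2.0))
loss = loss_fn(logits, target)
print(loss.item())
```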

