Human-in-the-loop applied machine learning

Author(s):  
Carla E. Brodley
2020 ◽  
Author(s):  
Mikołaj Morzy ◽  
Bartłomiej Balcerzak ◽  
Adam Wierzbicki

BACKGROUND: With the rapidly accelerating dissemination of false medical information on the Web, establishing the credibility of online sources of medical information has become a pressing necessity. The sheer number of websites offering questionable medical information, presented as reliable and actionable advice with possibly harmful effects, imposes an additional requirement on potential solutions: they have to scale to the size of the problem. Machine learning is one such solution which, when properly deployed, can be an effective tool in fighting medical disinformation on the Web.

OBJECTIVE: We present a comprehensive framework for designing and curating machine learning training datasets for the credibility assessment of online medical information. We show how the annotation process should be constructed and which pitfalls should be avoided. Our main objective is to provide researchers from the medical and computer science communities with guidelines on how to construct datasets for machine learning models across the various fronts of the fight against medical misinformation.

METHODS: The key component of our approach is the active annotation process. We begin by outlining the annotation protocol for the curation of a high-quality training dataset, which can then be augmented and rapidly extended by applying the human-in-the-loop paradigm to machine learning training. To circumvent the cold-start problem of insufficient gold-standard annotations, we propose a pre-processing pipeline consisting of representation learning, clustering, and re-ranking of sentences, which accelerates training and optimizes the human resources involved in annotation.

RESULTS: We collect over 10 000 annotations of sentences related to selected subjects (psychiatry, cholesterol, autism, antibiotics, vaccines, steroids, birth methods, food allergy testing) for less than $7 000, employing 9 highly qualified annotators (certified medical professionals), and we release this dataset to the general public. We develop an active annotation framework for more efficient annotation of non-credible medical statements. The results of the qualitative analysis support our claims about the efficacy of the presented method.

CONCLUSIONS: A very diverse set of incentives is driving the widespread dissemination of medical disinformation on the Web. An effective strategy for countering this spread is to use machine learning to automatically establish the credibility of online medical information. This, however, requires thoughtful design of the training pipeline. In this paper we present a comprehensive framework of active annotation. In addition, we publish a large curated dataset of medical statements labelled as credible, non-credible, or neutral.
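
The cold-start pipeline described above (representation learning, clustering, re-ranking) could look roughly like the following sketch. TF-IDF vectors and k-means are assumptions standing in for the components the abstract does not specify, and `rank_for_annotation` is a hypothetical helper, not the authors' code.

```python
# Minimal sketch of the cold-start pre-processing pipeline: embed sentences,
# cluster them, and re-rank each cluster so that the sentences closest to a
# cluster centre reach the annotators first.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans
from sklearn.metrics.pairwise import cosine_distances

def rank_for_annotation(sentences, n_clusters=10, per_cluster=3):
    vectors = TfidfVectorizer(max_features=5000).fit_transform(sentences)
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(vectors)
    to_annotate = []
    for cluster_id in range(n_clusters):
        members = [i for i, label in enumerate(km.labels_) if label == cluster_id]
        if not members:
            continue
        centre = km.cluster_centers_[cluster_id].reshape(1, -1)
        # distance of every member sentence to its cluster centroid
        dists = cosine_distances(vectors[members], centre).ravel()
        # most central sentences first: labelling them covers the corpus
        # broadly for a fixed annotation budget
        order = dists.argsort()[:per_cluster]
        to_annotate.extend(sentences[members[i]] for i in order)
    return to_annotate
```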


Author(s):  
Mansoureh Maadi ◽  
Hadi Akbarzadeh Khorshidi ◽  
Uwe Aickelin

Objective: To provide a review of human–Artificial Intelligence (AI) interaction in Machine Learning (ML) applications, informing how best to combine human domain expertise with the computational power of ML methods. The review focuses on the medical field, as the medical ML application literature highlights a particular need for medical experts to collaborate with ML approaches. Methods: A scoping literature review is performed on Scopus and Google Scholar using the terms “human in the loop”, “human in the loop machine learning”, and “interactive machine learning”. Peer-reviewed papers published from 2015 to 2020 are included in our review. Results: We design four questions to investigate and describe human–AI interaction in ML applications. These questions are “Why should humans be in the loop?”, “Where does human–AI interaction occur in the ML processes?”, “Who are the humans in the loop?”, and “How do humans interact with ML in Human-In-the-Loop ML (HILML)?”. To answer the first question, we describe three main reasons for the importance of human involvement in ML applications. To address the second question, human–AI interaction is investigated in three main algorithmic stages: 1. data producing and pre-processing; 2. ML modelling; and 3. ML evaluation and refinement. To answer the third question, we describe the importance of the expertise level of the humans involved in human–AI interaction. To address the fourth question, human interaction with ML in HILML is grouped into three categories. We conclude the paper by discussing open opportunities for future research in HILML.
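
As a rough illustration of where those interaction points sit in the ML process, the sketch below marks the three algorithmic stages from the review inside a single uncertainty-sampling round; `ask_expert` and `hil_round` are hypothetical placeholders, not an interface or algorithm taken from the reviewed papers.

```python
# Hypothetical single round of human-in-the-loop training, with comments
# marking the three algorithmic stages discussed in the review.
import numpy as np
from sklearn.linear_model import LogisticRegression

def ask_expert(item):
    # placeholder: in practice this is a labelling interface for the domain expert
    return int(input(f"Label for {item!r} (0/1): "))

def hil_round(X_labeled, y_labeled, X_pool, pool_items, budget=5):
    # Stage 2 - ML modelling: fit on everything the expert has labelled so far
    model = LogisticRegression(max_iter=1000).fit(X_labeled, y_labeled)
    # Stage 3 - evaluation and refinement: find the instances the model is least sure of
    uncertainty = 1.0 - model.predict_proba(X_pool).max(axis=1)
    queries = np.argsort(uncertainty)[-budget:]
    # Stage 1 - data producing and pre-processing: the expert supplies new labels
    new_labels = [ask_expert(pool_items[i]) for i in queries]
    return queries, new_labels
```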


2021 ◽  
Vol 11 (10) ◽  
pp. 4443
Author(s):  
Rokas Štrimaitis ◽  
Pavel Stefanovič ◽  
Simona Ramanauskaitė ◽  
Asta Slotkienė

Financial analysis is not limited to enterprise performance analysis; it is worth analyzing as wide an area as possible to obtain a full impression of a specific enterprise. News website content is a data source that expresses the public’s opinion on enterprise operations, status, and so on, so it is worth analyzing news portal article texts. Sentiment analysis of English texts, including financial texts, is well established and accurate, whereas work on the more complex Lithuanian language has mostly concentrated on comment texts and does not achieve high accuracy. Therefore, in this paper, a supervised machine learning model was implemented to perform sentiment analysis on financial news gathered from Lithuanian-language websites. The analysis was made using three classification algorithms commonly used in the field of sentiment analysis. Hyperparameter optimization using grid search was performed to discover the best parameters of each classifier. All experimental investigations were made using newly collected datasets from four Lithuanian news websites. The results of the applied machine learning algorithms show that the highest accuracy was obtained using a non-balanced dataset with the multinomial Naive Bayes algorithm (71.1%). The other algorithms’ accuracies were slightly lower: long short-term memory (71%) and support vector machine (70.4%).
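
A bare-bones version of the grid-search set-up for the Naive Bayes classifier might look like the sketch below; the placeholder sentences, TF-IDF features, and parameter grid are illustrative assumptions, not the configuration reported in the paper.

```python
# Illustrative grid search for a multinomial Naive Bayes sentiment classifier.
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.model_selection import GridSearchCV

texts = [
    "banko pelnas augo",             # placeholder positive sentence
    "imones pajamos augo sparciai",  # placeholder positive sentence
    "akcijos smarkiai krito",        # placeholder negative sentence
    "imones pelnas krito",           # placeholder negative sentence
]
labels = ["positive", "positive", "negative", "negative"]

pipeline = Pipeline([
    ("tfidf", TfidfVectorizer(ngram_range=(1, 2))),
    ("nb", MultinomialNB()),
])
grid = GridSearchCV(
    pipeline,
    param_grid={"nb__alpha": [0.1, 0.5, 1.0]},
    cv=2,                 # tiny only because the placeholder data is tiny
    scoring="accuracy",
)
grid.fit(texts, labels)
print(grid.best_params_, grid.best_score_)
```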


2021 ◽  
Vol 10 (1) ◽  
Author(s):  
Jinchao Liu ◽  
Di Zhang ◽  
Dianqiang Yu ◽  
Mengxin Ren ◽  
Jingjun Xu

Ellipsometry is a powerful method for determining both the optical constants and the thickness of thin films. For decades, solving ill-posed inverse ellipsometric problems has required substantial human-expert intervention, making it essentially a human-in-the-loop trial-and-error process that is not only tedious and time-consuming but also limits the applicability of ellipsometry. Here, we demonstrate a machine learning based approach for solving ellipsometric problems in an unambiguous and fully automatic manner while showing superior performance. The proposed approach is experimentally validated on a broad range of films covering metals, semiconductors, and dielectrics. This method is compatible with existing ellipsometers and paves the way for realizing the automatic, rapid, high-throughput optical characterization of films.
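
The underlying idea, learning the inverse mapping from measured spectra back to film parameters from simulated examples, could be sketched as follows; the toy forward model and network are purely illustrative assumptions, not the optics or architecture used in the paper.

```python
# Toy illustration of learning an inverse ellipsometric mapping:
# regress film parameters directly from (simulated) spectra.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
wavelengths = np.linspace(300, 800, 50)          # nm

def toy_forward_model(thickness, n_index):
    # stand-in for the true optical response of a single-layer film
    phase = 2 * np.pi * n_index * thickness / wavelengths
    return np.concatenate([np.cos(phase), np.sin(phase)])

# simulate training pairs: spectrum -> (thickness, refractive index)
params = np.column_stack([rng.uniform(10, 500, 2000),    # thickness, nm
                          rng.uniform(1.3, 2.5, 2000)])  # refractive index
spectra = np.array([toy_forward_model(t, n) for t, n in params])

model = MLPRegressor(hidden_layer_sizes=(128, 128), max_iter=500, random_state=0)
model.fit(spectra, params)
print(model.predict(spectra[:1]))   # estimated (thickness, n) for one spectrum
```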


Science ◽  
2018 ◽  
Vol 362 (6416) ◽  
pp. eaat8603 ◽  
Author(s):  
Kangway V. Chuang ◽  
Michael J. Keiser

Ahneman et al. (Reports, 13 April 2018) applied machine learning models to predict C–N cross-coupling reaction yields. The models use atomic, electronic, and vibrational descriptors as input features. However, the experimental design is insufficient to distinguish models trained on chemical features from those trained solely on random-valued features in retrospective and prospective test scenarios, thus failing classical controls in machine learning.
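
One way to run the control this comment refers to is to train the same model on the real descriptors and on random-valued features of identical shape, then compare test performance; the sketch below assumes a random-forest regressor and R² scoring as placeholders rather than the exact models from the original study.

```python
# Sketch of a random-feature control: if a model scores about as well on
# meaningless random features as on the chemical descriptors, the descriptors
# are probably not what the model is learning from.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

def random_feature_control(X_chemical, y_yield, seed=0):
    rng = np.random.default_rng(seed)
    X_random = rng.normal(size=X_chemical.shape)   # same shape, no chemical meaning
    model = RandomForestRegressor(n_estimators=200, random_state=seed)
    real_scores = cross_val_score(model, X_chemical, y_yield, cv=5, scoring="r2")
    ctrl_scores = cross_val_score(model, X_random, y_yield, cv=5, scoring="r2")
    return real_scores.mean(), ctrl_scores.mean()
```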


2018 ◽  
Author(s):  
Stephen James Bertram ◽  
Yilun Fan ◽  
David Raffelt ◽  
Pawel Michalak

2021 ◽  
Vol 5 (12) ◽  
pp. 73
Author(s):  
Daniel Kerrigan ◽  
Jessica Hullman ◽  
Enrico Bertini

Eliciting knowledge from domain experts can play an important role throughout the machine learning process, from correctly specifying the task to evaluating model results. However, knowledge elicitation is also fraught with challenges. In this work, we consider why and how machine learning researchers elicit knowledge from experts in the model development process. We develop a taxonomy to characterize elicitation approaches according to the elicitation goal, elicitation target, elicitation process, and use of elicited knowledge. We analyze the elicitation trends observed in 28 papers with this taxonomy and identify opportunities for adding rigor to these elicitation approaches. We suggest future directions for research in elicitation for machine learning by highlighting avenues for further exploration and drawing on what we can learn from elicitation research in other fields.
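
For readers coding papers against such a taxonomy, each elicitation instance might be recorded along the four dimensions named above; only the dimension names come from the abstract, and the example field values below are hypothetical.

```python
# Hypothetical record for coding one elicitation approach against the
# four taxonomy dimensions named in the abstract.
from dataclasses import dataclass

@dataclass
class ElicitationRecord:
    goal: str      # why knowledge is elicited (e.g. task specification)
    target: str    # what is elicited (e.g. feature relevance, labels)
    process: str   # how the expert is queried (e.g. structured interview)
    use: str       # how the elicited knowledge enters the ML pipeline

example = ElicitationRecord(
    goal="model evaluation",
    target="plausibility of model outputs",
    process="structured interview",
    use="qualitative refinement of the model",
)
```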

