A Human-AI Loop Approach for Joint Keyword Discovery and Expectation Estimation in Micropost Event Detection

Microblogging platforms such as Twitter are increasingly being used in event detection. Existing approaches mainly use machine learning models and rely on event-related keywords to collect the data for model training. These approaches make strong assumptions on the distribution of the relevant microposts containing the keyword – referred to as the expectation of the distribution – and use it as a posterior regularization parameter during model training. Such approaches are, however, limited as they fail to reliably estimate the informativeness of a keyword and its expectation for model training. This paper introduces a Human-AI loop approach to jointly discover informative keywords for model training while estimating their expectation. Our approach iteratively leverages the crowd to estimate both keyword-specific expectation and the disagreement between the crowd and the model in order to discover new keywords that are most beneficial for model training. These keywords and their expectation not only improve the resulting performance but also make the model training process more transparent. We empirically demonstrate the merits of our approach, both in terms of accuracy and interpretability, on multiple real-world datasets and show that our approach improves the state of the art by 24.3%.

Counterfactual Fairness: Unidentification, Bound and Algorithm

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2019/199 ◽

2019 ◽

Cited By ~ 2

Author(s):

Yongkai Wu ◽

Lu Zhang ◽

Xintao Wu

Keyword(s):

Machine Learning ◽

Observational Data ◽

Real World ◽

Causal Model ◽

Learning Models ◽

Inherent Limitation ◽

Demographic Group ◽

Real World Datasets ◽

The Individual ◽

Machine Learning Models

Fairness-aware learning studies the problem of building machine learning models that are subject to fairness requirements. Counterfactual fairness is a notion of fairness derived from Pearl's causal model, which considers a model fair if, for a particular individual or group, its prediction in the real world is the same as in the counterfactual world where the individual(s) had belonged to a different demographic group. However, an inherent limitation of counterfactual fairness is that it cannot be uniquely quantified from the observational data in certain situations, due to the unidentifiability of the counterfactual quantity. In this paper, we address this limitation by mathematically bounding the unidentifiable counterfactual quantity, and develop a theoretically sound algorithm for constructing counterfactually fair classifiers. We evaluate our method in experiments on both synthetic and real-world datasets and compare it with existing methods. The results validate our theory and show the effectiveness of our method.
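As a rough illustration of the idea in this abstract, the fairness condition and the role of bounds can be written as follows. The notation is assumed for this sketch and is not taken from the paper: S is the sensitive attribute with values s^+ and s^-, \hat{Y} is the classifier's prediction, o is an observed profile, \tau is a tolerance, and lb, ub are lower and upper bounds on the counterfactual quantity.

```latex
% Counterfactual effect of switching the sensitive attribute (assumed notation)
\Delta(o) \;=\; P\big(\hat{Y}_{S \leftarrow s^{+}} = 1 \mid O = o\big)
          \;-\; P\big(\hat{Y}_{S \leftarrow s^{-}} = 1 \mid O = o\big),
\qquad \text{fair (up to } \tau \text{) if } |\Delta(o)| \le \tau .

% When \Delta(o) is unidentifiable but can be bounded:
lb(o) \le \Delta(o) \le ub(o)
\;\Longrightarrow\;
|\Delta(o)| \le \max\{\,|lb(o)|,\ |ub(o)|\,\}
```

Setting \tau = 0 recovers the strict requirement that the two predictions coincide. When only bounds are available, requiring \max\{|lb(o)|, |ub(o)|\} \le \tau is sufficient for the tolerance to hold, although it may be conservative.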

A Survey on Data-driven Network Intrusion Detection

ACM Computing Surveys ◽

10.1145/3472753 ◽

2022 ◽

Vol 54 (9) ◽

pp. 1-36

Author(s):

Dylan Chou ◽

Meng Jiang

Keyword(s):

Machine Learning ◽

Intrusion Detection ◽

Real World ◽

Data Driven ◽

Network Intrusion Detection ◽

Large Network ◽

Learning Models ◽

Simulated Environments ◽

Network Intrusion ◽

Machine Learning Models

In data-driven network intrusion detection (NID), attack classes tend to be minorities compared to normal traffic. Many datasets are collected in simulated environments rather than real-world networks. These challenges undermine the performance of intrusion detection models, because the models are fitted to unrepresentative “sandbox” datasets. This survey presents a taxonomy with eight main challenges and explores common datasets from 1999 to 2020. Trends in the challenges over the past decade are analyzed, and future directions are proposed on expanding NID into cloud-based environments, devising scalable models for large network data, and creating labeled datasets collected in real-world networks.

Using Machine Learning Methods Incorporating Individual Reader Annotations to Classify Paediatric Chest Radiographs in Epidemiological Studies

Wellcome Open Research ◽

10.12688/wellcomeopenres.17164.1 ◽

2021 ◽

Vol 6 ◽

pp. 309

Author(s):

Paul Mwaniki ◽

Timothy Kamanu ◽

Samuel Akech ◽

M. J. C Eijkemans

Keyword(s):

Machine Learning ◽

Epidemiological Studies ◽

Chest Radiographs ◽

World Health ◽

Data Sets ◽

Learning Models ◽

Middle Income ◽

Training Models ◽

Model Training ◽

Machine Learning Models

Introduction: Epidemiological studies that involve interpretation of chest radiographs (CXRs) suffer from inter-reader and intra-reader variability. Inter-reader and intra-reader variability hinder comparison of results from different studies or centres, which negatively affects efforts to track the burden of chest diseases or evaluate the efficacy of interventions such as vaccines. This study explores machine learning models that could standardize interpretation of CXRs across studies and the utility of incorporating individual reader annotations when training models using CXR data sets annotated by multiple readers. Methods: Convolutional neural networks were used to classify CXRs from seven low- to middle-income countries into five categories according to the World Health Organization's standardized methodology for interpreting paediatric CXRs. We compared models trained to predict the final/aggregate classification with models trained to predict how each reader would classify an image, with the predictions for all readers then aggregated using an unweighted mean. Results: Incorporating individual readers' annotations during model training improved classification accuracy by 3.4% (multi-class accuracy 61% vs 59%). Model accuracy was higher for children above 12 months of age (68% vs 58%). The accuracy of the models in different countries ranged between 45% and 71%. Conclusions: Machine learning models can annotate CXRs in epidemiological studies, reducing inter-reader and intra-reader variability. In addition, incorporating individual reader annotations can improve the performance of machine learning models trained using CXRs annotated by multiple readers.
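The aggregation step described in the Methods, averaging per-reader predictions with equal weights, can be sketched as below. This is a minimal illustration rather than the authors' implementation: the function names and the three-class toy probabilities are hypothetical, and the per-reader probability vectors are assumed to come from some upstream model.

```python
from typing import List


def aggregate_reader_predictions(per_reader_probs: List[List[float]]) -> List[float]:
    """Return the unweighted mean of class-probability vectors, one per reader."""
    if not per_reader_probs:
        raise ValueError("need at least one reader prediction")
    n_classes = len(per_reader_probs[0])
    if any(len(probs) != n_classes for probs in per_reader_probs):
        raise ValueError("all readers must predict over the same classes")
    n_readers = len(per_reader_probs)
    # Every reader contributes equally: sum each class column, divide by reader count.
    return [
        sum(probs[c] for probs in per_reader_probs) / n_readers
        for c in range(n_classes)
    ]


def predicted_class(mean_probs: List[float]) -> int:
    """Index of the class with the highest aggregated probability."""
    return max(range(len(mean_probs)), key=mean_probs.__getitem__)


# Hypothetical predicted probabilities for one image from three simulated readers.
reader_probs = [
    [0.6, 0.3, 0.1],
    [0.2, 0.7, 0.1],
    [0.1, 0.8, 0.1],
]
mean_probs = aggregate_reader_predictions(reader_probs)
label = predicted_class(mean_probs)
```

Because the mean is unweighted, a reader who disagrees with the others shifts the aggregate only in proportion to one vote, which is the behaviour the abstract describes.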

Chapter 15. Human-Centered Concept Explanations for Neural Networks

10.3233/faia210362 ◽

2021 ◽

Author(s):

Chih-Kuan Yeh ◽

Been Kim ◽

Pradeep Ravikumar

Keyword(s):

Machine Learning ◽

Neural Networks ◽

Case Studies ◽

Real World ◽

Deep Neural Networks ◽

Learning Models ◽

Real World Applications ◽

The Right ◽

Concept Activation ◽

Machine Learning Models

Understanding complex machine learning models such as deep neural networks with explanations is crucial in various applications. Many explanations stem from the model perspective, and may not necessarily effectively communicate why the model is making its predictions at the right level of abstraction. For example, providing importance weights to individual pixels in an image can only express which parts of that particular image are important to the model, but humans may prefer an explanation which explains the prediction by concept-based thinking. In this work, we review the emerging area of concept-based explanations. We start by introducing concept explanations including the class of Concept Activation Vectors (CAV) which characterize concepts using vectors in appropriate spaces of neural activations, and discuss different properties of useful concepts, and approaches to measure the usefulness of concept vectors. We then discuss approaches to automatically extract concepts, and approaches to address some of their caveats. Finally, we discuss some case studies that showcase the utility of such concept-based explanations in synthetic settings and real-world applications.

Research Methods to Study and Empower Crowd Workers

10.1093/oso/9780198860679.003.0009 ◽

2021 ◽

pp. 164-184

Author(s):

Saiph Savage ◽

Carlos Toxtli ◽

Eber Betanzos-Torres

Keyword(s):

Artificial Intelligence ◽

Machine Learning ◽

Research Methods ◽

Real World ◽

Intelligent Systems ◽

Quantitative Information ◽

Learning Models ◽

Professional Goals ◽

Career Growth ◽

Machine Learning Models

The artificial intelligence (AI) industry has created new jobs that are essential to the real-world deployment of intelligent systems. Part of the job focuses on labelling data for machine learning models or having workers complete tasks that AI alone cannot do. These workers are usually known as ‘crowd workers’: they are part of a large distributed crowd that is jointly (but separately) working on the tasks. They are often invisible to end-users, which leads to workers often being paid below minimum wage and having limited career growth. In this chapter, we draw upon the field of human–computer interaction to provide research methods for studying and empowering crowd workers. We present our Computational Worker Leagues which enable workers to work towards their desired professional goals and also supply quantitative information about crowdsourcing markets. This chapter demonstrates the benefits of this approach and highlights important factors to consider when researching the experiences of crowd workers.

Effectiveness, Explainability and Reliability of Machine Meta-Learning Methods for Predicting Mortality in Patients with COVID-19: Results of the Brazilian COVID-19 Registry

10.1101/2021.11.01.21265527 ◽

2021 ◽

Author(s):

Bruno Barbosa Miranda de Paiva ◽

Polianna Delfino Pereira ◽

Claudio Moises Valiense de Andrade ◽

Virginia Mara Reis Gomes ◽

Maria Clara Pontello Barbosa Lima ◽

...

Keyword(s):

Machine Learning ◽

Prediction Models ◽

State Of The Art ◽

Laboratory Data ◽

Machine Learning Algorithms ◽

Training Data ◽

Learning Models ◽

Learning Methods ◽

Meta Learning ◽

Machine Learning Models

Objective: To provide a thorough comparative study of state-of-the-art machine learning methods and statistical methods for determining in-hospital mortality in COVID-19 patients using data available upon hospital admission; to study the reliability of the predictions of the most effective methods by correlating the probability of the outcome and the accuracy of the methods; and to investigate how explainable the predictions produced by the most effective methods are. Materials and Methods: De-identified data were obtained from COVID-19-positive patients in 36 participating hospitals from March 1 to September 30, 2020. Demographic, comorbidity, clinical presentation, and laboratory data were used as training data to develop COVID-19 mortality prediction models. Multiple machine learning and traditional statistical models were trained on this prediction task using a folded cross-validation procedure, from which we assessed performance and interpretability metrics. Results: Stacking of machine learning models improved over the previous state-of-the-art results by more than 26% in predicting the class of interest (death), achieving an AUROC of 87.1% and a macro-F1 of 73.9%. We also show that some machine learning models can be very interpretable and reliable, yielding more accurate predictions while providing a good explanation of why. Conclusion: The best results were obtained using the meta-learning ensemble model Stacking. State-of-the-art explainability techniques such as SHAP values can be used to draw useful insights into the patterns learned by machine learning algorithms. Machine learning models can be more explainable than traditional statistical models while also yielding highly reliable predictions. Key words: COVID-19; prognosis; prediction model; machine learning
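For readers unfamiliar with the macro-averaged F1 score reported in the Results, the sketch below shows the standard definition: F1 is computed separately for each class and the per-class scores are averaged with equal weight, so a rare class such as death counts as much as the majority class. Whether the authors computed it exactly this way is an assumption, and the labels below are toy values, not data from the study.

```python
from typing import Hashable, Sequence


def macro_f1(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Macro-averaged F1: the unweighted mean of per-class F1 scores."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy labels (1 = death, 0 = survival), purely illustrative.
y_true = [1, 1, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 1]
score = macro_f1(y_true, y_pred)
```

In this toy case the per-class F1 scores are 0.75 for class 0 and 0.5 for class 1, so the macro average is 0.625, even though plain accuracy is 4/6.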

The State of the Art in Enhancing Trust in Machine Learning Models with the Use of Visualizations

Computer Graphics Forum ◽

10.1111/cgf.14034 ◽

2020 ◽

Vol 39 (3) ◽

pp. 713-756 ◽

Cited By ~ 1

Author(s):

A. Chatzimparmpas ◽

R. M. Martins ◽

I. Jusufi ◽

K. Kucher ◽

F. Rossi ◽

...

Keyword(s):

Machine Learning ◽

State Of The Art ◽

The State ◽

Learning Models ◽

Machine Learning Models

A Systematised State-of-the-Art Review of Machine Learning Models to Aid Clinical Decision-Making in Epilepsy

SSRN Electronic Journal ◽

10.2139/ssrn.3541140 ◽

2020 ◽

Author(s):

Edward Jonathan Han-Burgess ◽

Richard J. Stevens

Keyword(s):

Machine Learning ◽

Decision Making ◽

Clinical Decision Making ◽

State Of The Art ◽

Clinical Decision ◽

Learning Models ◽

Machine Learning Models

Evaluation of statistical and machine learning models for time series prediction: Identifying the state-of-the-art and the best conditions for the use of each model

Information Sciences ◽

10.1016/j.ins.2019.01.076 ◽

2019 ◽

Vol 484 ◽

pp. 302-337 ◽

Cited By ~ 17

Author(s):

Antonio Rafael Sabino Parmezan ◽

Vinicius M.A. Souza ◽

Gustavo E.A.P.A. Batista

Keyword(s):

Machine Learning ◽

Time Series ◽

State Of The Art ◽

Time Series Prediction ◽

The State ◽

Learning Models ◽

Machine Learning Models

Long-Term Impacts of Fair Machine Learning

Ergonomics in Design: The Quarterly of Human Factors Applications ◽

10.1177/1064804619884160 ◽

2019 ◽

Vol 28 (3) ◽

pp. 7-11

Author(s):

Xueru Zhang ◽

Mohammad Mahdi Khalili ◽

Mingyan Liu

Keyword(s):

Machine Learning ◽

Real World ◽

Human Beings ◽

Learning Models ◽

Real World Data ◽

World Data ◽

Fairness Concerns ◽

Fairness Constraints ◽

Machine Learning Models

Machine learning models developed from real-world data can inherit potential, preexisting bias in the dataset. When these models are used to inform decisions involving human beings, fairness concerns inevitably arise. Imposing certain fairness constraints in the training of models can be effective only if appropriate criteria are applied. However, a fairness criterion can be defined/assessed only when the interaction between the decisions and the underlying population is well understood. We introduce two feedback models describing how people react when receiving machine-aided decisions and illustrate that some commonly used fairness criteria can end with undesirable consequences while reinforcing discrimination.
