Algorithmic reparation

Machine learning algorithms pervade contemporary society. They are integral to social institutions, inform processes of governance, and animate the mundane technologies of daily life. Consistently, the outcomes of machine learning reflect, reproduce, and amplify structural inequalities. The field of fair machine learning has emerged in response, developing mathematical techniques that increase fairness based on anti-classification, classification parity, and calibration standards. In practice, these computational correctives invariably fall short, operating from an algorithmic idealism that does not, and cannot, address systemic, Intersectional stratifications. Taking present fair machine learning methods as our point of departure, we suggest instead the notion and practice of algorithmic reparation. Rooted in theories of Intersectionality, reparative algorithms name, unmask, and undo allocative and representational harms as they materialize in sociotechnical form. We propose algorithmic reparation as a foundation for building, evaluating, adjusting, and when necessary, omitting and eradicating machine learning systems.
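To make one of the fairness standards named in this abstract concrete, the following is a minimal sketch of a classification-parity check that compares false positive rates across groups. The function name, the group labels "a" and "b", and the toy data are hypothetical illustrations and are not taken from the paper.

```python
from collections import defaultdict

def false_positive_rates(y_true, y_pred, groups):
    """Return the false positive rate for each group.

    FPR = false positives / all actual negatives in that group.
    """
    fp = defaultdict(int)   # predicted 1 while the true label is 0
    neg = defaultdict(int)  # true label is 0
    for t, p, g in zip(y_true, y_pred, groups):
        if t == 0:
            neg[g] += 1
            if p == 1:
                fp[g] += 1
    return {g: fp[g] / neg[g] for g in neg}

# Toy data (hypothetical): true labels, predictions, group membership.
y_true = [0, 0, 0, 0, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 0, 0, 1, 1, 1, 0, 0, 1]
groups = ["a", "a", "a", "a", "a", "b", "b", "b", "b", "b"]

rates = false_positive_rates(y_true, y_pred, groups)
parity_gap = max(rates.values()) - min(rates.values())
```

A gap near zero would satisfy this particular parity standard. As the abstract argues, meeting such a metric does not by itself address the structural inequalities behind the data.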

Systematic literature review of machine learning methods used in the analysis of real-world data for patient-provider decision making

BMC Medical Informatics and Decision Making ◽

10.1186/s12911-021-01403-2 ◽

2021 ◽

Vol 21 (1) ◽

Author(s):

Alan Brnabic ◽

Lisa M. Hess

Keyword(s):

Machine Learning ◽

Decision Making ◽

Literature Review ◽

Systematic Literature Review ◽

Real World ◽

Learning Algorithms ◽

External Validation ◽

Machine Learning Algorithms ◽

Learning Methods ◽

Machine Learning Methods

Abstract Background: Machine learning is a broad term encompassing a number of methods that allow the investigator to learn from the data. These methods may permit large real-world databases to be more rapidly translated to applications to inform patient-provider decision making. Methods: This systematic literature review was conducted to identify published observational research that employed machine learning to inform decision making at the patient-provider level. The search strategy was implemented and studies meeting eligibility criteria were evaluated by two independent reviewers. Relevant data related to study design, statistical methods, and strengths and limitations were identified; study quality was assessed using a modified version of the Luo checklist. Results: A total of 34 publications from January 2014 to September 2020 were identified and evaluated for this review. Diverse methods, statistical packages, and approaches were used across the identified studies. The most common methods included decision tree and random forest approaches. Most studies applied internal validation but only two conducted external validation. Most studies utilized one algorithm, and only eight studies applied multiple machine learning algorithms to the data. Seven items on the Luo checklist failed to be met by more than 50% of published studies. Conclusions: A wide variety of approaches, algorithms, statistical software, and validation strategies were employed in the application of machine learning methods to inform patient-provider decision making. There is a need to ensure that multiple machine learning approaches are used, that the model selection strategy is clearly defined, and that both internal and external validation are performed, so that decisions for patient care are made with the highest quality evidence. Future work should routinely employ ensemble methods incorporating multiple machine learning algorithms.

Machine-learning based prediction of Cushing’s syndrome in dogs attending UK primary-care veterinary practice

Scientific Reports ◽

10.1038/s41598-021-88440-z ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Imogen Schofield ◽

David C. Brodbelt ◽

Noel Kennedy ◽

Stijn J. M. Niessen ◽

David B. Church ◽

...

Keyword(s):

Machine Learning ◽

Clinical Decision Making ◽

Predictive Performance ◽

Clinical Decision ◽

Cushing's Syndrome ◽

Machine Learning Algorithms ◽

Learning Methods ◽

Machine Learning Methods ◽

Clinical Records

Abstract Cushing’s syndrome is an endocrine disease in dogs that negatively affects the quality of life of affected animals. Cushing’s syndrome can be a challenging diagnosis to confirm; therefore, new methods to aid diagnosis are warranted. Four machine-learning algorithms were applied to predict a future diagnosis of Cushing’s syndrome, using structured clinical data from the VetCompass programme in the UK. Dogs suspected of having Cushing’s syndrome were included in the analysis and classified based on their final reported diagnosis within their clinical records. Demographic and clinical features available at the point of first suspicion by the attending veterinarian were included within the models. The machine-learning methods were able to classify the recorded Cushing’s syndrome diagnoses with good predictive performance. The LASSO penalised regression model indicated the best overall performance when applied to the test set, with an AUROC = 0.85 (95% CI 0.80–0.89), sensitivity = 0.71, specificity = 0.82, PPV = 0.75 and NPV = 0.78. The findings of our study indicate that machine-learning methods could predict the future diagnosis of a practicing veterinarian. New approaches using these methods could support clinical decision-making and contribute to improved diagnosis of Cushing’s syndrome in dogs.
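The sensitivity, specificity, PPV, and NPV figures reported in this abstract are all derived from the four cells of a confusion matrix. The sketch below shows the standard definitions; the function name and the counts are hypothetical and are not the study's data.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Compute standard diagnostic metrics from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # share of true cases detected
        "specificity": tn / (tn + fp),  # share of non-cases correctly ruled out
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }

# Hypothetical counts, chosen only for illustration.
metrics = diagnostic_metrics(tp=30, fp=10, tn=50, fn=10)
```

Unlike sensitivity and specificity, PPV and NPV depend on how common the disease is in the sample, so they may not transfer to populations with a different prevalence.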

Acoustic feature-based sentiment analysis of call center data

10.32469/10355/66751 ◽

2017 ◽

Author(s):

Zeshan Peng

Keyword(s):

Machine Learning ◽

Neural Networks ◽

Emotion Recognition ◽

Sentiment Analysis ◽

Call Center ◽

Machine Learning Algorithms ◽

Language Recognition ◽

Acoustic Features ◽

Learning Methods ◽

Machine Learning Methods

With the advancement of machine learning methods, audio sentiment analysis has become an active research area in recent years. For example, business organizations are interested in persuasion tactics from vocal cues and acoustic measures in speech. A typical approach is to find a set of acoustic features from audio data that can indicate or predict a customer's attitude, opinion, or emotional state. For audio signals, acoustic features have been widely used in many machine learning applications, such as music classification, language recognition, and emotion recognition. For emotion recognition, previous work shows that pitch and speech rate are important features. This thesis focuses on determining sentiment from call center audio records, each containing a conversation between a sales representative and a customer. The sentiment of an audio record is considered positive if the conversation ended with an appointment being made, and negative otherwise. In this project, a data processing and machine learning pipeline for this problem has been developed. It consists of three major steps: 1) an audio record is split into segments by speaker turns; 2) acoustic features are extracted from each segment; and 3) classification models are trained on the acoustic features to predict sentiment. Different sets of features have been used, and different machine learning methods, including classical machine learning algorithms and deep neural networks, have been implemented in the pipeline. In our deep neural network method, the feature vectors of audio segments are stacked in temporal order into a feature matrix, which is fed into deep convolutional neural networks as input. Experimental results based on real data show that acoustic features, such as Mel frequency cepstral coefficients, timbre, and Chroma features, are good indicators of sentiment. Temporal information in an audio record can be captured by deep convolutional neural networks for improved prediction accuracy.
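The stacking step described in this abstract, where per-segment feature vectors are arranged in temporal order into a feature matrix, can be sketched as follows. This is only an illustration under stated assumptions: the (start time, feature vector) representation of a segment and the function name are hypothetical, and the thesis may organize its data differently.

```python
def build_feature_matrix(segments):
    """Stack per-segment feature vectors in temporal order.

    `segments` is a list of (start_time_seconds, feature_vector) pairs
    (an assumed representation). Returns a list of rows, one per
    segment, with the earliest segment first.
    """
    ordered = sorted(segments, key=lambda seg: seg[0])
    lengths = {len(vec) for _, vec in ordered}
    if len(lengths) > 1:
        raise ValueError("all feature vectors must have the same length")
    return [list(vec) for _, vec in ordered]

# Hypothetical segments, deliberately out of order.
segments = [
    (12.5, [0.3, 0.1, 0.7]),
    (0.0, [0.9, 0.2, 0.4]),
    (5.2, [0.5, 0.6, 0.8]),
]
matrix = build_feature_matrix(segments)
```

Keeping rows in temporal order is what lets a convolutional network see patterns that span neighbouring speaker turns.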

MODIS-FIRMS and ground-truthing based wildfire likelihood mapping of Sikkim Himalaya using machine learning algorithms.

10.21203/rs.3.rs-750123/v1 ◽

2021 ◽

Author(s):

Polash Banerjee

Keyword(s):

Machine Learning ◽

Machine Learning Algorithms ◽

Tree Cover ◽

Anthropogenic Factors ◽

Gradient Boosting ◽

Support Vector ◽

Learning Methods ◽

Sikkim Himalaya ◽

Environmental Features ◽

Machine Learning Methods

Abstract Wildfires of limited extent and intensity can be a boon for the forest ecosystem. However, the wildfires of 2019 in Australia and Brazil are sad reminders of their heavy ecological and economic costs. Understanding the role of environmental factors in the likelihood of wildfires in a spatial context would be instrumental in mitigating them. In this study, 14 environmental features encompassing meteorological, topographical, ecological, in situ and anthropogenic factors have been considered for preparing the wildfire likelihood map of Sikkim Himalaya. A comparative study of the efficiency of machine learning methods, namely Generalized Linear Model (GLM), Support Vector Machine (SVM), Random Forest (RF) and Gradient Boosting Model (GBM), has been performed to identify the best performing algorithm for wildfire prediction. The study indicates that all the machine learning methods are good at predicting wildfires. However, RF performed best, followed by GBM. Also, environmental features such as average temperature, average wind speed, proximity to roadways and tree cover percentage are the most important determinants of wildfires in Sikkim Himalaya. This study can be considered a decision support tool for preparedness, efficient resource allocation and sensitization of people towards mitigation of wildfires in Sikkim.

Landslide susceptibility mapping using machine learning for Wenchuan County, Sichuan province, China

E3S Web of Conferences ◽

10.1051/e3sconf/202019803023 ◽

2020 ◽

Vol 198 ◽

pp. 03023

Author(s):

Xin Yang ◽

Rui Liu ◽

Luyao Li ◽

Mei Yang ◽

Yuantao Yang

Keyword(s):

Machine Learning ◽

Landslide Susceptibility ◽

Susceptibility Mapping ◽

Machine Learning Algorithms ◽

Landslide Susceptibility Mapping ◽

Support Vector ◽

ROC Curve Analysis ◽

Learning Methods ◽

Machine Learning Methods ◽

Boosted Decision Tree

Landslide susceptibility mapping is a method used to assess the probability and spatial distribution of landslide occurrences. Machine learning methods have been widely used in landslide susceptibility mapping in recent years. In this paper, six popular machine learning algorithms, namely logistic regression, multi-layer perceptron, random forests, support vector machine, AdaBoost, and gradient boosted decision tree, were used to construct landslide susceptibility models with a total of 1365 landslide points and 14 predisposing factors. Subsequently, landslide susceptibility maps (LSMs) were generated by the trained models. The LSMs show that the main landslide zone is concentrated in the southeastern area of Wenchuan County. ROC curve analysis shows that all models fitted the training datasets and achieved satisfactory results on the validation datasets. The results of this paper indicate that machine learning methods are a feasible way to build robust landslide susceptibility models.
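ROC curve analysis, as used in this abstract to compare models, is often summarized by the area under the curve (AUC). A minimal sketch of that computation follows, using the pairwise-ranking formulation; the function name, labels, and scores are hypothetical and do not come from the paper.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive is scored
    higher than a randomly chosen negative (ties count as one half).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical data: 1 = landslide point, 0 = non-landslide point.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
auc = roc_auc(labels, scores)
```

The double loop is quadratic in the number of points, which is fine for a small illustration; a rank-based version would be preferable for large validation sets.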

Distributing Epistemic Functions and Tasks - A Framework for Augmenting Human Analytic Power With Machine Learning in Science Education Research

10.31219/osf.io/sg9jk ◽

2022 ◽

Author(s):

Marcus Kubsch ◽

Christina Krist ◽

Joshua Rosenberg

Keyword(s):

Machine Learning ◽

Science Education ◽

Education Research ◽

Educational Research ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Science Education Research ◽

Computational Power ◽

Machine Learning Methods ◽

Applications Of Machine Learning

Machine learning has become commonplace in educational research and science education research, especially to support assessment efforts. Such applications of machine learning have shown their promise in replicating and scaling human-driven codes of students’ work. Despite this promise, we and other scholars argue that machine learning has not achieved its transformational potential. We argue that this is because our field is currently lacking frameworks for supporting creative, principled, and critical endeavors to use machine learning in science education research. To offer considerations for science education researchers’ use of ML, we present a framework, Distributing Epistemic Functions and Tasks (DEFT), that highlights the functions and tasks that pertain to generating knowledge that can be carried out by either trained researchers or machine learning algorithms. Such considerations are critical decisions that should occur alongside those about, for instance, the type of data or algorithm used. We apply this framework to two cases, one that exemplifies the cutting-edge use of machine learning in science education research and another that offers a wholly different means of using machine learning and human-driven inquiry together. We conclude with strategies for researchers to adopt machine learning and call for the field to rethink how we prepare science education researchers in an era of great advances in computational power and access to machine learning methods.

Advanced Machine Learning Methods for Prediction of Fracture Closure Pressure

10.2118/200782-ms ◽

2021 ◽

Author(s):

Mohamed Ibrahim Mohamed ◽

Dinesh Mehta ◽

Erdal Ozkan

Keyword(s):

Machine Learning ◽

Integrated Approach ◽

Machine Learning Algorithms ◽

Supervised Machine Learning ◽

Learning Methods ◽

Pressure Derivative ◽

Fracture Geometry ◽

Machine Learning Methods ◽

Personal Bias ◽

Fracture Closure

Abstract Determining the closure pressure is crucial for optimal hydraulic fracturing design and successful execution of the fracturing treatment. Historically, diagnostic tests run before the main fracturing treatment have advanced significantly, yielding more information about the pattern of fracture propagation and fluid performance to optimize the designs. The goal is to inject a small volume of fracturing fluid to break down the formation and create a small fracture geometry; once pumping is stopped, the pressure decline is analyzed to observe the fracture closure. Many analytical methods, such as G-Function, square root of time, etc., have been developed to determine the fracture closure pressure. In some cases the fracture closure pressure is difficult to determine, and personal bias and field experience make it challenging to interpret the changes in the pressure derivative slope and identify fracture closure. These conditions include: (1) high-permeability reservoirs, where fracture closure occurs very quickly because of rapid fluid leak-off; (2) extremely low-permeability reservoirs, which require a long shut-in time for the fluid to leak off before the fracture closure pressure can be determined; and (3) non-ideal fluid leak-off behavior under complex conditions. The objective of this study is to apply machine learning methods to implement a predesigned algorithm to execute the required tasks and predict the fracture closure pressure while minimizing the shortcomings in determining the closure pressure for non-ideal or subjective conditions. This paper demonstrates training different supervised machine learning algorithms to help predict fracture closure pressure. The workflow involves using the datasets to train and optimize the models, which are subsequently used to predict the closure pressure of testing data. The output results are then compared with actual results from more than 120 DFIT data points. We further propose an integrated approach to feature selection and dataset processing and study the effects of data processing on the success of the model prediction. The results of this study reduce subjectivity and the reliance on the experience of the personnel interpreting the data. We speculate that linear regression and MLP neural network algorithms can yield high scores in the prediction of fracture closure pressure.

Machine Learning Approaches for Sentiment Analysis

Data Mining and Analysis in the Engineering Field - Advances in Data Mining and Database Management ◽

10.4018/978-1-4666-6086-1.ch011 ◽

2014 ◽

pp. 193-208 ◽

Cited By ~ 9

Author(s):

Basant Agarwal ◽

Namita Mittal

Keyword(s):

Machine Learning ◽

Sentiment Analysis ◽

Opinion Mining ◽

Machine Learning Algorithms ◽

Sentiment Classification ◽

Learning Approaches ◽

Learning Methods ◽

Machine Learning Methods ◽

Knowledge Based ◽

Semantic Orientation

Opinion mining, or sentiment analysis, is the study of people's opinions or sentiments, expressed in text, towards entities such as products and services. It has always been important to know what other people think. With the rapid growth in the availability and popularity of online review sites, blogs, forums, and social networking sites, the need to analyse and understand these reviews has arisen. The main approaches to sentiment analysis can be categorized into semantic orientation-based approaches, knowledge-based approaches, and machine-learning algorithms. This chapter surveys the machine learning approaches applied to sentiment analysis-based applications. The main emphasis of this chapter is on the research involved in applying machine learning methods, mostly for sentiment classification at document level. Machine learning-based approaches work in the following phases, which are discussed in detail in this chapter for sentiment classification: (1) feature extraction, (2) feature weighting schemes, (3) feature selection, and (4) machine-learning methods. This chapter also discusses the standard free benchmark datasets and evaluation methods for sentiment analysis. The authors conclude the chapter with a comparative study of some state-of-the-art methods for sentiment analysis and some possible future research directions in opinion mining and sentiment analysis.
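Phases (1) and (2) listed in this abstract, feature extraction and feature weighting, can be illustrated with a bag-of-words representation weighted by TF-IDF. This is a minimal sketch of one common weighting scheme, not necessarily the one the chapter covers; whitespace tokenization, the function name, and the sample reviews are assumptions made for illustration.

```python
import math
from collections import Counter

def tfidf(documents):
    """Return one {term: weight} dict per document.

    Uses raw term frequency and idf = log(N / document_frequency);
    many other weighting variants exist.
    """
    # Phase 1: feature extraction (naive whitespace tokenization).
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))  # count each term once per document
    # Phase 2: feature weighting.
    weighted = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weighted.append({
            term: count * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weighted

# Hypothetical review snippets.
reviews = ["great phone great battery", "poor battery", "great screen"]
weights = tfidf(reviews)
```

Terms that appear in few documents, such as "poor" here, receive a higher weight than terms spread across many documents, which is why such weights are often used as input to feature selection and classification.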

Machine Learning Approaches for Sentiment Analysis

Big Data ◽

10.4018/978-1-4666-9840-6.ch088 ◽

2016 ◽

pp. 1917-1933

Author(s):

Basant Agarwal ◽

Namita Mittal

Keyword(s):

Machine Learning ◽

Sentiment Analysis ◽

Opinion Mining ◽

Machine Learning Algorithms ◽

Sentiment Classification ◽

Learning Approaches ◽

Learning Methods ◽

Machine Learning Methods ◽

Knowledge Based ◽

Semantic Orientation

Predicting rice blast disease: machine learning versus process-based models

BMC Bioinformatics ◽

10.1186/s12859-019-3065-1 ◽

2019 ◽

Vol 20 (1) ◽

Cited By ~ 1

Author(s):

David F. Nettleton ◽

Dimitrios Katsantonis ◽

Argyris Kalaitzidis ◽

Natasa Sarafijanovic-Djukic ◽

Pau Puigdollers ◽

...

Keyword(s):

Machine Learning ◽

Rice Blast ◽

Machine Learning Algorithms ◽

Rice Blast Disease ◽

Blast Disease ◽

Data Driven ◽

Learning Methods ◽

Machine Learning Methods ◽

Plant Disease Management ◽

Process Based Models

Abstract Background: In this study, we compared four models for predicting rice blast disease: two operational process-based models (Yoshino and Water Accounting Rice Model (WARM)) and two approaches based on machine learning algorithms (M5Rules and Recurrent Neural Networks (RNN)), the former inducing a rule-based model and the latter building a neural network. In situ telemetry is important to obtain quality in-field data for predictive models and this was a key aspect of the RICE-GUARD project on which this study is based. According to the authors, this is the first time process-based and machine learning modelling approaches for supporting plant disease management have been compared. Results: The results clearly showed that the models succeeded in providing a warning of rice blast onset and presence, thus representing suitable solutions for preventive remedial actions targeting the mitigation of yield losses and the reduction of fungicide use. All methods gave significant “signals” during the “early warning” period, with a similar level of performance. M5Rules and WARM gave the maximum average normalized scores of 0.80 and 0.77, respectively, whereas Yoshino gave the best score for one site (Kalochori 2015). The best average values of r, r² and %MAE (Mean Absolute Error) for the machine learning models were 0.70, 0.50 and 0.75, respectively, and for the process-based models the corresponding values were 0.59, 0.40 and 0.82. Thus the ML models are competitive with the process-based models. This result has relevant implications for the operational use of the models, since most of the available studies are limited to the analysis of the relationship between the model outputs and the incidence of rice blast. The results also showed that machine learning methods approximated the performance of two process-based models used for years in operational contexts. Conclusions: Process-based and data-driven models can be used to provide early warnings to anticipate rice blast and detect its presence, thus supporting fungicide applications. Data-driven models derived from machine learning methods are a viable alternative to process-based approaches and, in cases when training datasets are available, offer potentially greater adaptability to new contexts.
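The comparison in this abstract rests on the correlation coefficient r, its square, and a mean absolute error. The sketch below computes Pearson's r, r squared, and the plain MAE between observed and predicted values. The abstract does not state how MAE is normalized into %MAE, so that step is left out; the function names and the sample values are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def mean_absolute_error(observed, predicted):
    """Average absolute difference between observed and predicted values."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)

# Hypothetical disease-severity values, not data from the study.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 1.5, 3.5, 3.5]

r = pearson_r(observed, predicted)
r_squared = r ** 2
mae = mean_absolute_error(observed, predicted)
```

Reporting both kinds of measure is useful because a model can track the trend well (high r) while still being offset from the observed values (high MAE).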
