Machine Learning and Irresponsible Inference: Morally Assessing the Training Data for Image Recognition Systems

Diagnosing Gender Bias in Image Recognition Systems

Socius: Sociological Research for a Dynamic World ◽

10.1177/2378023120967171 ◽

2020 ◽

Vol 6 ◽

pp. 237802312096717

Author(s):

Carsten Schwemmer ◽

Carly Knight ◽

Emily D. Bello-Pardo ◽

Stan Oklobdzija ◽

Martijn Schoonvelde ◽

...

Keyword(s):

Machine Learning ◽

Gender Bias ◽

Image Recognition ◽

Gender Stereotypes ◽

Expert Knowledge ◽

Physical Appearance ◽

Past Research ◽

Learning Systems ◽

Gender Biases ◽

Recognition Systems

Image recognition systems offer the promise to learn from images at scale without requiring expert knowledge. However, past research suggests that machine learning systems often produce biased output. In this article, we evaluate potential gender biases of commercial image recognition platforms using photographs of U.S. members of Congress and a large number of Twitter images posted by these politicians. Our crowdsourced validation shows that commercial image recognition systems can produce labels that are correct and biased at the same time as they selectively report a subset of many possible true labels. We find that images of women received three times more annotations related to physical appearance. Moreover, women in images are recognized at substantially lower rates in comparison with men. We discuss how encoded biases such as these affect the visibility of women, reinforce harmful gender stereotypes, and limit the validity of the insights that can be gathered from such data.


Implementing Convolutional Neural Networks for Simple Image Classification

International Journal of Engineering and Advanced Technology - Regular Issue ◽

10.35940/ijeat.b3279.129219 ◽

2019 ◽

Vol 9 (2) ◽

pp. 3616-3619

Keyword(s):

Neural Network ◽

Machine Learning ◽

Neural Networks ◽

Convolutional Neural Network ◽

Image Recognition ◽

High Volume ◽

Classification Systems ◽

Learning Tools ◽

Huge Amount ◽

Recognition Systems

In recent years, huge amounts of image data have been created and accumulated at extraordinary rates. The high volume and velocity of this data pose the problem of finding practical and effective ways to classify it for analysis. Existing classification systems cannot meet the demand or overcome the difficulties of accurately classifying such data. In this paper, we build a Convolutional Neural Network (CNN), one of the most powerful and popular machine learning tools used in image recognition systems, to classify images from CIFAR-10, one of the most widely used image datasets. This paper also gives a thorough overview of how our CNN architecture works, including its parameters and difficulties.
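The abstract above does not specify the network's layers, so the following is only a minimal sketch of the basic operation a CNN layer performs: sliding a small kernel over an image to produce a feature map. The function name `conv2d_valid`, the checkerboard input, and the averaging kernel are illustrative assumptions, not the authors' architecture; only the 32x32 image size is taken from CIFAR-10. As in most deep learning libraries, the operation shown is technically a cross-correlation (the kernel is not flipped).

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` with stride 1 and no padding.

    Both arguments are lists of rows of numbers. Returns the feature map
    as a list of rows.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Weighted sum of the image patch under the kernel.
            acc = 0.0
            for di in range(kh):
                for dj in range(kw):
                    acc += image[i + di][j + dj] * kernel[di][dj]
            row.append(acc)
        output.append(row)
    return output


# Hypothetical single-channel 32x32 input (CIFAR-10 images are 32x32 pixels).
image = [[float((r + c) % 2) for c in range(32)] for r in range(32)]
# Hypothetical 3x3 averaging kernel; a trained CNN would learn these weights.
kernel = [[1.0 / 9.0] * 3 for _ in range(3)]

feature_map = conv2d_valid(image, kernel)
print(len(feature_map), len(feature_map[0]))  # 30 30
```

Without padding, a 3x3 kernel shrinks each spatial dimension by two, which is why the 32x32 input yields a 30x30 feature map.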


The Effect of Different Flaw Data to Machine Learning Powered Ultrasonic Inspection

Journal of Nondestructive Evaluation ◽

10.1007/s10921-021-00757-x ◽

2021 ◽

Vol 40 (1) ◽

Author(s):

Tuomas Koskinen ◽

Iikka Virkkunen ◽

Oskar Siljama ◽

Oskari Jessen-Juhler

Keyword(s):

Neural Network ◽

Machine Learning ◽

Convolutional Neural Network ◽

Image Recognition ◽

Phased Array ◽

Ultrasonic Inspection ◽

Deep Convolutional Neural Network ◽

Flaw Size ◽

Training Data ◽

Link Type

Previous research (Li et al., Understanding the disharmony between dropout and batch normalization by variance shift. CoRR abs/1801.05134 (2018). http://arxiv.org/abs/1801.05134, arXiv:1801.05134) has shown the plausibility of using a modern deep convolutional neural network to detect flaws from phased-array ultrasonic data. This brings the repeatability and effectiveness of automated systems to complex ultrasonic signal evaluation, previously done exclusively by human inspectors. The major breakthrough was to use virtual flaws to generate ample flaw data for the teaching of the algorithm. This enabled the use of raw ultrasonic scan data for detection and to leverage some of the approaches used in machine learning for image recognition. Unlike traditional image recognition, training data for ultrasonic inspection is scarce. While virtual flaws allow us to broaden the data considerably, original flaws with proper flaw-size distribution are still required. This is of course the same for training human inspectors. The training of human inspectors is usually done with easily manufacturable flaws such as side-drilled holes and EDM notches. While the difference between these easily manufactured artificial flaws and real flaws is obvious, human inspectors still manage to train with them and perform well in real inspection scenarios. In the present work, we use a modern, deep convolutional neural network to detect flaws from phased-array ultrasonic data and compare the results achieved from different training data obtained from various artificial flaws. The model demonstrated good generalization capability toward flaw sizes larger than the original training data, and the minimum flaw size in the data set affects the $a_{90/95}$ value. This work also demonstrates how different artificial flaws (solidification cracks, EDM notches, and simple simulated flaws) generalize differently.


Beautiful Fractals as a Crystal Ball for Financial Markets? - Investment Decision Support System Based on Image Recognition Using Artificial Intelligence

The Journal of Prediction Markets ◽

10.5750/jpm.v14i2.1804 ◽

2020 ◽

Vol 14 (2) ◽

pp. 27-44

Author(s):

Benjamin M. Abdel-Karim

Keyword(s):

Machine Learning ◽

Financial Markets ◽

Image Recognition ◽

Fractal Geometry ◽

Stock Price ◽

Simulated Data ◽

Real Data ◽

Point Of View ◽

Training Data ◽

Theoretical Point

The work by Mandelbrot develops a basic understanding of fractals and draws on the artwork of Jackson Pollock to reveal the beauty of fractal geometry. The pattern of recurring structures is also reflected in share prices. Mandelbrot himself speaks of the fractal heart of the financial markets. Previous research has shown the potential of image recognition. This paper presents the possibility of using the structure recognition capability of modern machine learning methods to make forecasts based on fractal course information. We generate training data from real and simulated data. These data are represented as images to train a special artificial neural network. Subsequently, real data are presented to the network for use in prediction. The results show that forecasting time series from stock price illustrations delivers promising results compared to a benchmark. This paper makes two essential contributions to research. From a theoretical point of view, fractal geometry shows that it can serve as a means of legitimation for technical analysis. From a practical point of view, highly developed methods from the field of machine learning are able to recognize patterns in data through appropriate data transformation, and models such as random walk have an informational content that can be used to train machine learning models.


Diagnosing Gender Bias in Image Recognition Systems

10.31235/osf.io/as25q ◽

2018 ◽

Author(s):

Carsten Schwemmer ◽

Carly Knight ◽

Emily Bello-Pardo ◽

Stan Oklobdzija ◽

Martijn Schoonvelde ◽

...

Keyword(s):

Machine Learning ◽

Gender Bias ◽

Image Recognition ◽

Gender Stereotypes ◽

Expert Knowledge ◽

Physical Appearance ◽

Past Research ◽

Learning Systems ◽

Gender Biases ◽

Recognition Systems

Image recognition systems offer the promise to learn from images at scale without requiring expert knowledge. However, past research suggests that machine learning systems often produce biased output. In this article, we evaluate potential gender biases of commercial image recognition platforms using photographs of U.S. members of Congress and a large number of Twitter images posted by these politicians. Our crowdsourced validation shows that commercial image recognition systems can produce labels that are correct and biased at the same time as they selectively report a subset of many possible true labels. We find that images of women received three times more annotations related to physical appearance. Moreover, women in images are recognized at substantially lower rates in comparison with men. We discuss how encoded biases such as these affect the visibility of women, reinforce harmful gender stereotypes, and limit the validity of the insights that can be gathered from such data.


Efficient mobilenet architecture as image recognition on mobile and embedded devices

Indonesian Journal of Electrical Engineering and Computer Science ◽

10.11591/ijeecs.v16.i1.pp389-394 ◽

2019 ◽

Vol 16 (1) ◽

pp. 389-394

Author(s):

Barlian Khasoggi ◽

Ermatita Ermatita ◽

Samsuryadi Samsuryadi

Keyword(s):

Machine Learning ◽

Computer Vision ◽

Image Recognition ◽

Training Model ◽

Human Interaction ◽

Training Data ◽

Embedded Devices ◽

Moderate Amount ◽

Computing Power ◽

Efficiency And Effectiveness

Modern image recognition models have millions of parameters and require large amounts of training data as well as high, energy-hungry computing power, which makes them inefficient in everyday use. Machine learning has changed the computing paradigm, from complex calculations that require high computational power to environmentally friendly technologies that can efficiently meet daily needs. To get the best trained model, many studies use large datasets. However, the complexity of large datasets requires large devices and high computing power. Large computational resources therefore lack the flexibility demanded by human interaction, which prioritizes the efficiency and effectiveness of computer vision. This study uses the Convolutional Neural Network (CNN) method with the MobileNet architecture for image recognition on resource-limited mobile and embedded devices with ARM-based CPUs, and it works with a moderate amount of training data (thousands of labeled images). As a result, the MobileNet v1 architecture on the ms8pro device can classify the caltech101 dataset with an accuracy rate of 92.4% and a power draw of 2.1 Watt. Given this level of accuracy and resource efficiency, it is expected that the MobileNet architecture can change the machine learning paradigm so that it has a high degree of flexibility toward human interaction that prioritizes the efficiency and effectiveness of computer vision.


Scalable Approach to High Coverages on Oxides via Iterative Training of a Machine-Learning Algorithm

10.26434/chemrxiv.10288514.v1 ◽

2019 ◽

Author(s):

Andrew Medford ◽

Shengchun Yang ◽

Fuzhu Liu

Keyword(s):

Machine Learning ◽

Chemical Potential ◽

Learning Algorithm ◽

Absolute Error ◽

Low Energy ◽

Training Data ◽

High Coverage ◽

Metal Compounds ◽

Adsorption Energies ◽

The Stability

Understanding the interaction of multiple types of adsorbate molecules on solid surfaces is crucial to establishing the stability of catalysts under various chemical environments. Computational studies on the high coverage and mixed coverages of reaction intermediates are still challenging, especially for transition-metal compounds. In this work, we present a framework to predict differential adsorption energies and identify low-energy structures under high- and mixed-adsorbate coverages on oxide materials. The approach uses Gaussian process machine-learning models with quantified uncertainty in conjunction with an iterative training algorithm to actively identify the training set. The framework is demonstrated for the mixed adsorption of CHx, NHx and OHx species on the oxygen vacancy and pristine rutile TiO2(110) surface sites. The results indicate that the proposed algorithm is highly efficient at identifying the most valuable training data, and is able to predict differential adsorption energies with a mean absolute error of ~0.3 eV based on <25% of the total DFT data. The algorithm is also used to identify 76% of the low-energy structures based on <30% of the total DFT data, enabling construction of surface phase diagrams that account for high and mixed coverage as a function of the chemical potential of C, H, O, and N. Furthermore, the computational scaling indicates the algorithm scales nearly linearly (N^1.12) as the number of adsorbates increases. This framework can be directly extended to metals, metal oxides, and other materials, providing a practical route toward the investigation of the behavior of catalysts under high-coverage conditions.
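To make the "nearly linear" scaling claim concrete, the short sketch below evaluates how a cost proportional to N raised to the power 1.12 grows when the number of adsorbates doubles. The exponent is taken from the abstract; the function name `relative_cost` and the baseline of 4 adsorbates are illustrative assumptions, and the absolute cost units are arbitrary.

```python
def relative_cost(n_adsorbates, exponent=1.12):
    """Cost in arbitrary units, assuming cost is proportional to N**exponent."""
    return n_adsorbates ** exponent


base = 4  # hypothetical number of adsorbates
ratio = relative_cost(2 * base) / relative_cost(base)

# Under power-law scaling the ratio depends only on the exponent, not on `base`.
print(f"Doubling the adsorbates multiplies the cost by about {ratio:.2f}")
```

For comparison, perfectly linear scaling would give a factor of exactly 2 and quadratic scaling a factor of 4, so a factor of roughly 2.17 is close to the linear case.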


Optimization of Diabetes Training DATA using Machine Learning Algorithms

International Journal of Computer Sciences and Engineering ◽

10.26438/ijcse/v6i2.283286 ◽

2018 ◽

Vol 6 (2) ◽

pp. 283-286

Author(s):

M. Samba Siva Rao ◽

M. Yaswanth ◽

K. Raghavendra Swamy ◽

...

Keyword(s):

Machine Learning ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Training Data


Comparative Analysis of Machine Learning Techniques Using Predictive Modeling

Recent Advances in Computer Science and Communications ◽

10.2174/2666255813999200904164539 ◽

2020 ◽

Vol 13 ◽

Author(s):

Ritu Khandelwal ◽

Hemlata Goyal ◽

Rajveer Singh Shekhawat

Keyword(s):

Machine Learning ◽

Comparative Analysis ◽

Data Science ◽

Training Data ◽

Machine Learning Techniques ◽

Future Trends ◽

Data Set ◽

Learning Stage ◽

Learning Techniques ◽

Different Types

Introduction: Machine learning is an intelligent technology that works as a bridge between businesses and data science. With the involvement of data science, the business goal is to gain valuable insights from the available data. Bollywood, a multi-million dollar industry, makes up a large part of Indian cinema. This paper attempts to predict whether an upcoming Bollywood movie will be a Blockbuster, Superhit, Hit, Average, or Flop, using machine learning techniques (classification and prediction). The first step in building a classifier or prediction model is the learning stage, in which a training data set is used to train the model with some technique or algorithm; the rules generated in this stage then form a model that can predict future trends in different types of organizations. Methods: Classification and prediction techniques, namely Support Vector Machine (SVM), Random Forest, Decision Tree, Naïve Bayes, Logistic Regression, Adaboost, and KNN, are applied to find efficient and effective results. All of these can be applied through GUI-based workflows organized into categories such as Data, Visualize, Model, and Evaluate. Result: The learning stage uses the training data set to train the model, and the generated rules form a model that predicts future trends in different types of organizations. Conclusion: This paper focuses on a comparative analysis based on different parameters, such as accuracy and the confusion matrix, to identify the best possible model for predicting movie success. Using advertisement propaganda, production houses can plan the best time to release a movie according to its predicted success rate to gain higher benefits.
Discussion: Data mining is the process of discovering patterns and relationships in large data sets in order to solve business problems and predict forthcoming trends. This prediction can help production houses with advertisement propaganda and cost planning, and by securing these factors they can make the movie more profitable.
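The abstract names accuracy and the confusion matrix as comparison criteria but gives no numbers, so the sketch below only illustrates how those two quantities could be computed for the five success classes. The label lists are made-up toy data, and the helper names `confusion_matrix` and `accuracy` are illustrative assumptions rather than the authors' tooling.

```python
from collections import Counter

CLASSES = ["Blockbuster", "Superhit", "Hit", "Average", "Flop"]


def confusion_matrix(actual, predicted, classes):
    """Rows are actual classes, columns are predicted classes."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in classes] for a in classes]


def accuracy(actual, predicted):
    """Fraction of predictions that match the actual labels."""
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)


# Hypothetical toy labels, for illustration only.
actual = ["Hit", "Flop", "Average", "Hit", "Blockbuster", "Flop"]
predicted = ["Hit", "Flop", "Hit", "Hit", "Superhit", "Flop"]

matrix = confusion_matrix(actual, predicted, CLASSES)
for name, row in zip(CLASSES, matrix):
    print(f"{name:<12}", row)
print(f"accuracy = {accuracy(actual, predicted):.3f}")
```

Running the same two functions on each candidate model's predictions gives a like-for-like basis for comparing classifiers, and the off-diagonal cells show which success classes a model tends to confuse.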


Multilayer Soil Moisture Mapping at a Regional Scale from Multisource Data via a Machine Learning Method

Remote Sensing ◽

10.3390/rs11030284 ◽

2019 ◽

Vol 11 (3) ◽

pp. 284 ◽

Cited By ~ 1

Author(s):

Linglin Zeng ◽

Shun Hu ◽

Daxiang Xiang ◽

Xiang Zhang ◽

Deren Li ◽

...

Keyword(s):

Machine Learning ◽

Soil Moisture ◽

Regional Scale ◽

Remotely Sensed ◽

Temporal Variations ◽

Training Data ◽

Estimation Accuracy ◽

Learning Approaches ◽

Remotely Sensed Data ◽

Deep Soil

Soil moisture mapping at a regional scale is commonplace since these data are required in many applications, such as hydrological and agricultural analyses. The use of remotely sensed data for the estimation of deep soil moisture at a regional scale has received far less emphasis. The objective of this study was to map the 500-m, 8-day average and daily soil moisture at different soil depths in Oklahoma from remotely sensed and ground-measured data using the random forest (RF) method, which is one of the machine-learning approaches. In order to investigate the estimation accuracy of the RF method at both a spatial and a temporal scale, two independent soil moisture estimation experiments were conducted using data from 2010 to 2014: a year-to-year experiment (with a root mean square error (RMSE) ranging from 0.038 to 0.050 m³/m³) and a station-to-station experiment (with an RMSE ranging from 0.044 to 0.057 m³/m³). Then, the data requirements, importance factors, and spatial and temporal variations in estimation accuracy were discussed based on the results using the training data selected by iterated random sampling. The highly accurate estimations of both the surface and the deep soil moisture for the study area reveal the potential of RF methods when mapping soil moisture at a regional scale, especially when considering the high heterogeneity of land-cover types and topography in the study area.
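Since the accuracy figures above are reported as RMSE in volumetric units, a minimal sketch of that metric may help in reading them. The observed and estimated values below are made-up illustrative numbers, not data from the study, and the function name `rmse` is an assumption.

```python
import math


def rmse(observed, estimated):
    """Root mean square error between two equal-length sequences."""
    if len(observed) != len(estimated) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - e) ** 2 for o, e in zip(observed, estimated)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical volumetric soil moisture values in m^3/m^3, for illustration only.
observed = [0.21, 0.25, 0.30, 0.18]
estimated = [0.24, 0.22, 0.34, 0.20]

error = rmse(observed, estimated)
print(f"RMSE = {error:.3f} m^3/m^3")
```

Because the errors are squared before averaging, a few large misses raise the RMSE more than many small ones, which is worth keeping in mind when comparing the year-to-year and station-to-station ranges.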
