Challenges of automated machine learning on causal impact analytics for policy evaluation

In recent years, several machine-learning-based techniques have emerged for effective crop protection. For instance, deep neural networks have been used to identify different types of weeds under a range of real-world conditions. However, these techniques usually require experts to work iteratively, and at length, on developing the most suitable machine learning system. To support this task and save resources, researchers have begun studying a technique called Automated Machine Learning (AutoML). In this work, a complete open-source AutoML system was evaluated on two datasets covering the weed identification problem: (i) the Early Crop Weeds dataset and (ii) the Plant Seedlings dataset. Different configurations were compared, such as the use of plant segmentation, the use of classifier ensembles instead of Softmax, and training with noisy data. The results showed promising F1 scores of 93.8% and 90.74%, depending on the dataset used. These scores are in line with other related work in AutoML, but they remain well below those of machine-learning-based systems manually fine-tuned by human experts. From these results, it can be concluded that finding a balance between manual expert work and Automated Machine Learning is a promising way to increase efficiency in plant protection.
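The abstract reports F1 scores but does not say how they are averaged across classes. As a minimal sketch, assuming macro averaging (the unweighted mean of per-class F1), the metric can be computed as follows. The function name `macro_f1` and the labels are hypothetical and are not taken from either dataset.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (macro averaging is an assumption)."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # class p was predicted but was not the true class
            fn[t] += 1  # true class t was missed
    scores = []
    for c in labels:
        # F1 = 2TP / (2TP + FP + FN), defined as 0 when the class never appears
        denom = 2 * tp[c] + fp[c] + fn[c]
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical labels for illustration only.
y_true = ["weed", "weed", "crop", "crop", "crop", "weed"]
y_pred = ["weed", "crop", "crop", "crop", "weed", "weed"]
score = macro_f1(y_true, y_pred)
```

Macro averaging gives every class the same influence regardless of how many samples it has, which matters when some weed species are rare.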

PAIRS AutoGeo: an Automated Machine Learning Framework for Massive Geospatial Data

2020 IEEE International Conference on Big Data (Big Data), 2020

DOI: 10.1109/bigdata50022.2020.9378036

Author(s): Wang Zhou, Levente J. Klein, Siyuan Lu

Keyword(s): Machine Learning, Geospatial Data, Learning Framework, Automated Machine Learning

Novel Meta-Features for Automated Machine Learning Model Selection in Anomaly Detection

IEEE Access, 2021, pp. 1-1

DOI: 10.1109/access.2021.3090936

Author(s): Milos Kotlar, Marija Punt, Zaharije Radivojevic, Milos Cvetanovic, Veljko Milutinovic

Keyword(s): Machine Learning, Model Selection, Anomaly Detection, Learning Model, Machine Learning Model, Automated Machine Learning

Risky Driver Recognition with Class Imbalance Data and Automated Machine Learning Framework

International Journal of Environmental Research and Public Health, 2021, Vol 18 (14), pp. 7534

DOI: 10.3390/ijerph18147534

Author(s): Ke Wang, Qingwen Xue, Jian John Lu

Keyword(s): Machine Learning, High Risk, Loss Function, Class Imbalance, Support Vector, Trajectory Data, Recognition Model, Learning Framework, Sampling Cost, Automated Machine Learning

Identifying high-risk drivers before an accident happens is necessary for traffic accident control and prevention. Because driving data are class-imbalanced, high-risk samples, as the minority class, are usually handled poorly by standard classification algorithms. Instead of applying preset sampling or cost-sensitive learning, this paper proposes a novel automated machine learning framework that simultaneously and automatically searches for the optimal sampling method, cost-sensitive loss function, and probability calibration method to handle the class-imbalance problem in the recognition of risky drivers. The hyperparameters that control the sampling ratio and class weight, along with other hyperparameters, are tuned by Bayesian optimization. To demonstrate the performance of the proposed framework, we establish a risky driver recognition model as a case study, using video-extracted vehicle trajectory data of 2427 private cars on a German highway. Based on rear-end collision risk evaluation, only 4.29% of all drivers are labeled as risky. The inputs of the recognition model are the discrete Fourier transform coefficients of the target vehicle’s longitudinal speed, its lateral speed, and the gap between the target vehicle and its preceding vehicle. Across 12 sampling methods, 2 cost-sensitive loss functions, and 2 probability calibration methods, the result of automated machine learning is consistent with manual search but is much more computationally efficient. We find that the combination of Support Vector Machine-based Synthetic Minority Oversampling TEchnique (SVMSMOTE) sampling, a cost-sensitive cross-entropy loss function, and isotonic regression can significantly improve recognition ability and reduce the error of the predicted probability.
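The abstract states that the model inputs are discrete Fourier transform coefficients of trajectory signals, but it does not say how many coefficients are kept or how they are encoded. The following is a minimal sketch of that kind of feature extraction, assuming the magnitudes of the first few coefficients, normalized by signal length, are used. The function name `dft_features`, the parameter `n_coeffs`, and the speed samples are hypothetical.

```python
import cmath
import math


def dft_features(signal, n_coeffs):
    """Magnitudes of the first n_coeffs DFT coefficients of a real signal.

    Using magnitudes and dividing by the signal length are assumptions made
    for this sketch; the source does not specify the encoding.
    """
    n = len(signal)
    feats = []
    for k in range(n_coeffs):
        # Direct evaluation of X_k = sum_t x_t * exp(-2*pi*i*k*t/n)
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(signal)
        )
        feats.append(abs(coeff) / n)
    return feats


# Hypothetical longitudinal speed samples (m/s): a 30 m/s mean plus one
# slow oscillation of amplitude 2 m/s over the observation window.
n = 8
speed = [30.0 + 2.0 * math.cos(2 * math.pi * t / n) for t in range(n)]
features = dft_features(speed, n_coeffs=3)
```

With this normalization, the first feature is the mean speed and the second is half the amplitude of the slowest oscillation, so a short feature vector summarizes both the driver's typical speed and how strongly it fluctuates.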

Estimating Causal Effects When the Treatment Affects All Subjects Simultaneously: An Application

Big Data and Cognitive Computing, 2021, Vol 5 (2), pp. 22

DOI: 10.3390/bdcc5020022

Author(s): Chiara Binelli

Keyword(s): Machine Learning, Carbon Dioxide, Global Warming, Carbon Dioxide Emissions, Control Group, Sources Of Information, Average Global Temperature, Causal Impact, The Impact, Robust Evidence

Several important questions cannot be answered with the standard toolkit of causal inference because all subjects are treated for a given period, so there is no control group. One example of this type of question is the impact of carbon dioxide emissions on global warming. In this paper, we address this question using a machine learning method that allows causal impacts to be estimated in settings where a randomized experiment is not feasible. We discuss the conditions under which this method can identify a causal impact, and we find that carbon dioxide emissions are responsible for an increase in average global temperature of about 0.3 degrees Celsius between 1961 and 2011. We offer two main contributions. First, we provide one additional application of machine learning to answering causal questions of policy relevance. Second, by applying a methodology that relies on few directly testable assumptions and is easy to replicate, we provide robust evidence of the man-made nature of global warming, which could reduce incentives to turn to biased sources of information that fuel climate change skepticism.
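The abstract does not name the method it uses. One common way to estimate an effect when no control group exists is to fit a model on the pre-treatment period, predict what the outcome would have been afterwards, and compare that counterfactual with what was observed. The following is a minimal sketch of that general idea, not the paper's method. It assumes a single covariate that the treatment does not affect and a simple linear relationship; all names and data are hypothetical and synthetic.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b * x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def estimated_effect(x_pre, y_pre, x_post, y_post):
    """Mean gap between observed and predicted post-treatment outcomes."""
    a, b = fit_line(x_pre, y_pre)                  # learn the pre-treatment relationship
    counterfactual = [a + b * x for x in x_post]   # predicted outcome without treatment
    gaps = [y - c for y, c in zip(y_post, counterfactual)]
    return sum(gaps) / len(gaps)


# Synthetic data: the outcome follows 2 + 0.5 * x before treatment, and the
# treatment shifts it by a known amount afterwards.
true_effect = 1.5
x_pre = [1.0, 2.0, 3.0, 4.0, 5.0]
y_pre = [2.0 + 0.5 * x for x in x_pre]
x_post = [6.0, 7.0, 8.0]
y_post = [2.0 + 0.5 * x + true_effect for x in x_post]

effect = estimated_effect(x_pre, y_pre, x_post, y_post)
```

The estimate is only as credible as the assumptions behind it: the covariate must not itself respond to the treatment, and the pre-treatment relationship must continue to hold in the post-treatment period.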

Adaptation Strategies for Automated Machine Learning on Evolving Data

IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021, pp. 1-1

DOI: 10.1109/tpami.2021.3062900

Author(s): Bilge Celik, Joaquin Vanschoren

Keyword(s): Machine Learning, Adaptation Strategies, Automated Machine Learning, Evolving Data
