Integrating hierarchical statistical models and machine-learning algorithms for ground-truthing drone images of the vegetation: taxonomy, abundance and population ecological models

2021 ◽  
Vol 13 (6) ◽  
pp. 1161
Author(s):  
Christian Damgaard

In order to fit population ecological models, e.g., plant competition models, to new drone-aided image data, we need to develop statistical models that take into account the new type of measurement uncertainty introduced by machine-learning algorithms and quantify its importance for statistical inferences and ecological predictions. Here, it is proposed to quantify the uncertainty and bias of image-predicted plant taxonomy and abundance in a hierarchical statistical model that is linked to ground-truth data obtained by the pin-point method. It is critical that the error rate in the species-identification process is minimized when the image data are fitted to the population ecological models, and several avenues for reaching this objective are discussed. The outlined method for statistically modelling known sources of uncertainty when applying machine-learning algorithms may be relevant for other applied scientific disciplines.
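The linkage described in the abstract can be sketched in code. The following is a minimal illustration, not the paper's actual model: a hierarchical Bayesian model (here in Python with PyMC, on synthetic data) in which pin-point counts anchor a latent cover variable while additive and multiplicative bias terms calibrate the image-based predictions. All variable names, priors, and the simulated data are assumptions for illustration only.

```python
# Minimal sketch (assumed, not the paper's model) of linking image-based
# cover predictions to pin-point ground truth in a hierarchical Bayesian model.
import numpy as np
import pymc as pm

rng = np.random.default_rng(1)
n_plots, n_pins = 40, 25
true_cover = rng.beta(2, 5, size=n_plots)        # latent plant cover per plot
pin_hits = rng.binomial(n_pins, true_cover)      # pin-point ground truth
image_cover = np.clip(0.1 + 0.8 * true_cover     # biased, noisy image predictions
                      + rng.normal(0, 0.05, n_plots), 0, 1)

with pm.Model() as model:
    cover = pm.Beta("cover", alpha=2, beta=2, shape=n_plots)  # latent true cover
    # Ground-truthing layer: pin-point observations anchor the latent cover
    pm.Binomial("pins", n=n_pins, p=cover, observed=pin_hits)
    # Calibration layer: image predictions carry additive/multiplicative bias
    a = pm.Normal("a", 0, 0.5)           # additive bias of the ML predictions
    b = pm.Normal("b", 1, 0.5)           # multiplicative bias
    sigma = pm.HalfNormal("sigma", 0.2)  # residual image uncertainty
    pm.Normal("image", mu=a + b * cover, sigma=sigma, observed=image_cover)
    idata = pm.sample(1000, tune=1000, chains=2, progressbar=False)

print(idata.posterior["a"].mean(), idata.posterior["b"].mean())
```

Once the bias parameters are estimated, the same posterior can propagate image-prediction uncertainty into downstream ecological inferences, which is the role the hierarchical structure plays in the abstract.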


2021 ◽  
Vol 8 (1) ◽  
pp. 205395172110135
Author(s):  
Florian Jaton

This theoretical paper considers the morality of machine learning algorithms and systems in the light of the biases that ground their correctness. It begins by presenting biases not as a priori negative entities but as contingent external referents—often gathered in benchmarked repositories called ground-truth datasets—that define what needs to be learned and allow for performance measures. I then argue that ground-truth datasets and their concomitant practices—that fundamentally involve establishing biases to enable learning procedures—can be described by their respective morality, here defined as the more or less accounted experience of hesitation when faced with what pragmatist philosopher William James called “genuine options”—that is, choices to be made in the heat of the moment that engage different possible futures. I then stress three constitutive dimensions of this pragmatist morality, as far as ground-truthing practices are concerned: (I) the definition of the problem to be solved (problematization), (II) the identification of the data to be collected and set up (databasing), and (III) the qualification of the targets to be learned (labeling). I finally suggest that this three-dimensional conceptual space can be used to map machine learning algorithmic projects in terms of the morality of their respective and constitutive ground-truthing practices. Such techno-moral graphs may, in turn, serve as equipment for greater governance of machine learning algorithms and systems.


Author(s):  
Paul Gustafson

Abstract. The article by Jiang et al. (Am J Epidemiol) extends quantitative bias analysis from the realm of statistical models to the realm of machine learning algorithms. Given the rooting of statistical models in the spirit of explanation and the rooting of machine learning algorithms in the spirit of prediction, this extension is thought-provoking indeed. Some such thoughts are expounded here.
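For readers unfamiliar with quantitative bias analysis, a minimal sketch of one of its simplest forms, the Rogan-Gladen correction for outcome misclassification, is given below. The sensitivity and specificity values are illustrative assumptions, and the snippet is not drawn from the article under discussion.

```python
# Minimal illustration of simple quantitative bias analysis: back-correcting
# an observed prevalence using assumed classifier sensitivity and specificity.
def corrected_prevalence(p_obs: float, sens: float, spec: float) -> float:
    """Rogan-Gladen correction: p = (p_obs + spec - 1) / (sens + spec - 1)."""
    return (p_obs + spec - 1.0) / (sens + spec - 1.0)

# If a machine-learning classifier labels 30% of records positive and is
# assumed to have sensitivity 0.85 and specificity 0.95, the bias-corrected
# prevalence is (0.30 + 0.95 - 1) / (0.85 + 0.95 - 1) = 0.25 / 0.80:
print(corrected_prevalence(0.30, sens=0.85, spec=0.95))  # ~0.31
```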


2020 ◽  
Vol 34 (12) ◽  
pp. 1078-1087
Author(s):  
Peter S. Lum ◽  
Liqi Shu ◽  
Elaine M. Bochniewicz ◽  
Tan Tran ◽  
Lin-Ching Chang ◽  
...  

Background: Wrist-worn accelerometry provides objective monitoring of upper-extremity functional use, such as reaching tasks, but it also detects nonfunctional movements, leading to ambiguity in monitoring results. Objective: To compare machine learning algorithms with the standard method (counts ratio) for accuracy in detecting functional activity. Methods: Healthy controls and individuals with stroke performed unstructured tasks in a simulated community environment (test duration = 26 ± 8 minutes) while accelerometry and video were synchronously recorded. Human annotators scored each frame of the video as functional or nonfunctional activity, providing ground truth. Several machine learning algorithms were developed to separate functional from nonfunctional activity in the accelerometer data. We also calculated the counts ratio, which uses a thresholding scheme to calculate the duration of activity in the paretic limb normalized by the less-affected limb. Results: The counts ratio was not significantly correlated with ground truth and had large errors (r = 0.48; P = .16; average error = 52.7%) because of high levels of nonfunctional movement in the paretic limb; counts did not increase with increased functional movement. The best-performing intrasubject machine learning algorithm had an accuracy of 92.6% in the paretic limb of stroke patients, and its correlation with ground truth was r = 0.99 (P < .001; average error = 3.9%). The best intersubject model had an accuracy of 74.2% and a correlation of r = 0.81 (P = .005; average error = 5.2%) with ground truth. Conclusions: In our sample, the counts ratio did not accurately reflect functional activity. Machine learning algorithms were more accurate, and future work should focus on the development of a clinical tool.
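A hedged sketch of the counts-ratio baseline may help: it thresholds per-epoch activity counts in each limb and normalizes paretic active time by the less-affected limb. The threshold value, epoch structure, and synthetic data below are assumptions for illustration, not the study's exact pipeline.

```python
# Sketch (assumed details) of the counts-ratio baseline: threshold each
# limb's per-epoch activity counts, then divide paretic active time by
# less-affected active time.
import numpy as np

def counts_ratio(paretic: np.ndarray, nonparetic: np.ndarray,
                 threshold: float = 2.0) -> float:
    """Duration of supra-threshold activity in the paretic limb
    normalized by the less-affected limb (epochs of count values)."""
    active_paretic = np.sum(paretic > threshold)
    active_nonparetic = np.sum(nonparetic > threshold)
    return active_paretic / max(active_nonparetic, 1)

# Illustrative: nonfunctional movement (e.g., arm swing during walking)
# inflates paretic counts, so the ratio can stay high even when actual
# functional use is low, matching the abstract's negative finding.
rng = np.random.default_rng(0)
ratio = counts_ratio(rng.exponential(2.0, 1560), rng.exponential(3.0, 1560))
print(round(ratio, 2))
```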


2019 ◽  
Vol 2 ◽  
pp. 1-8
Author(s):  
Lukas Gokl ◽  
Marvin Mc Cutchan ◽  
Bartosz Mazurkiewicz ◽  
Paolo Fogliaroni ◽  
Ioannis Giannopoulos

Abstract. Location Based Services (LBS) are helpful for people who interact with an unfamiliar environment, but also for those who already possess a certain level of familiarity with it. In order to avoid overwhelming familiar users with unnecessary information, the level of detail offered by the LBS should be adapted to the user's familiarity with the environment: more detail for unfamiliar users, and less information (which would otherwise be superfluous, if not misleading) for users who are more familiar with the current environment. Currently, the information exchange between the service and its users does not take familiarity into account. Within this work, we investigate the potential of machine learning for a binary classification of familiarity with the surrounding environment (i.e., familiar vs. unfamiliar). For this purpose, a 3D virtual environment based on a part of Vienna, Austria, was designed using datasets from the municipal government. During a navigation experiment with 22 participants, we collected ground truth data in order to train four machine learning algorithms. The captured data included the users' motion and orientation as well as their visual interaction with the surrounding buildings during navigation. This work demonstrates the potential of machine learning for predicting the state of familiarity, an enabling step toward LBS that are better tailored to the user.
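As a rough illustration of the classification step, the sketch below trains one possible binary classifier on tabular features of the kind described (motion, orientation, visual interaction with buildings). The feature names, model choice, synthetic data, and the assumed link between gaze behavior and familiarity are all illustrative assumptions, not the study's actual variables or algorithms.

```python
# Minimal sketch of a familiar-vs-unfamiliar classifier on assumed features.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(42)
n_trials = 220
X = np.column_stack([
    rng.normal(1.3, 0.3, n_trials),   # walking speed (m/s), assumed feature
    rng.normal(15, 6, n_trials),      # head-yaw variability (deg), assumed
    rng.uniform(0, 1, n_trials),      # fraction of gaze time on facades, assumed
])
# Synthetic link for illustration only: familiar users spend less time
# inspecting facades (an assumption, not a study finding).
y = (X[:, 2] < 0.4).astype(int)       # ground truth: 1 = familiar

clf = RandomForestClassifier(n_estimators=200, random_state=0)
print(cross_val_score(clf, X, y, cv=5).mean())  # cross-validated accuracy
```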


2020 ◽  
Author(s):  
Octavian Dumitru ◽  
Gottfried Schwarz ◽  
Dongyang Ao ◽  
Gabriel Dax ◽  
Vlad Andrei ◽  
...  

In recent years, machine learning tools and applications have come into broad use. However, when we use these techniques for geophysical analyses, we must be sure that the obtained results are scientifically valid and allow us to derive quantitative outcomes that can be directly compared with other measurements.

Therefore, we set out to identify typical datasets that lend themselves well to geophysical data interpretation. To simplify this very general task, we concentrate in this contribution on multi-dimensional image data acquired by satellites with typical remote sensing instruments for Earth observation, being used for the analysis of:

- Atmospheric phenomena (cloud cover, cloud characteristics, smoke and plumes, strong winds, etc.)
- Land cover and land use (open terrain, agriculture, forestry, settlements, buildings and streets, industrial and transportation facilities, mountains, etc.)
- Sea and ocean surfaces (waves, currents, ships, icebergs, coastlines, etc.)
- Ice and snow on land and water (ice fields, glaciers, etc.)
- Image time series (dynamical phenomena, their occurrence and magnitude, mapping techniques)

Then we analyze important data characteristics for each type of instrument. Most selected images are characterized by their type of imaging instrument (e.g., radar or optical images), their typical signal-to-noise figures, their preferred pixel sizes, their various spectral bands, etc.

As a third step, we select a number of established machine learning algorithms, available tools, software packages, required environments, published experiences, and specific caveats. The comparisons cover traditional "flat" as well as advanced "deep" techniques that have to be compared in detail before making any decision about their usefulness for geophysical applications. They range from simple thresholding to k-means, from multi-scale approaches to convolutional networks (with visible or hidden layers) and auto-encoders, with sub-components from rectified linear units to adversarial networks.

Finally, we summarize our findings in several instrument / machine learning algorithm matrices (e.g., for active or passive instruments). These matrices also contain important features of the input data and their consequences, computational effort, attainable figures-of-merit, and necessary testing and verification steps (positive and negative examples). Typical examples are statistical similarities, characteristic scales, rotation invariance, target groupings, topic bagging and targeting (hashing) capabilities as well as local compression behavior.
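To make the range of techniques concrete, the sketch below applies the simplest clustering method mentioned, k-means, to synthetic multispectral pixels. The band count, image size, and cluster number are illustrative assumptions rather than settings from the contribution.

```python
# Sketch: unsupervised k-means clustering of multispectral pixels, a simple
# baseline for land-cover-like mapping; data here are synthetic stand-ins.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(7)
h, w, bands = 64, 64, 4                      # assumed 4 spectral bands
image = rng.random((h, w, bands)).astype(np.float32)

pixels = image.reshape(-1, bands)            # flatten to (n_pixels, n_bands)
labels = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(pixels)
class_map = labels.reshape(h, w)             # per-pixel cluster map
print(np.bincount(labels))                   # pixels per spectral cluster
```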

