Integrating hierarchical statistical models and machine-learning algorithms for ground-truthing drone images of the vegetation: taxonomy, abundance and population ecological models

2021 ◽  
Vol 13 (6) ◽  
pp. 1161
Author(s):  
Christian Damgaard

In order to fit population ecological models, e.g., plant competition models, to new drone-aided image data, we need to develop statistical models that take into account the new type of measurement uncertainty introduced by machine-learning algorithms and quantify its importance for statistical inferences and ecological predictions. Here, it is proposed to quantify the uncertainty and bias of image-predicted plant taxonomy and abundance in a hierarchical statistical model that is linked to ground-truth data obtained by the pin-point method. It is critical that the error rate in the species-identification process is minimized when the image data are fitted to the population ecological models, and several avenues for reaching this objective are discussed. The outlined method for statistically modelling known sources of uncertainty when applying machine-learning algorithms may be relevant for other applied scientific disciplines.
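The linkage described in the abstract can be sketched in code. The following is a minimal illustration, not the paper's actual model: a hierarchical Bayesian model (here in Python with PyMC, on synthetic data) in which pin-point counts anchor a latent cover variable while additive and multiplicative bias terms calibrate the image-based predictions. All variable names, priors, and the simulated data are assumptions for illustration only.

```python
# Minimal sketch (assumed, not the paper's model) of linking image-based
# cover predictions to pin-point ground truth in a hierarchical Bayesian model.
import numpy as np
import pymc as pm

rng = np.random.default_rng(1)
n_plots, n_pins = 40, 25
true_cover = rng.beta(2, 5, size=n_plots)        # latent plant cover per plot
pin_hits = rng.binomial(n_pins, true_cover)      # pin-point ground truth
image_cover = np.clip(0.1 + 0.8 * true_cover     # biased, noisy image predictions
                      + rng.normal(0, 0.05, n_plots), 0, 1)

with pm.Model() as model:
    cover = pm.Beta("cover", alpha=2, beta=2, shape=n_plots)  # latent true cover
    # Ground-truthing layer: pin-point observations anchor the latent cover
    pm.Binomial("pins", n=n_pins, p=cover, observed=pin_hits)
    # Calibration layer: image predictions carry additive/multiplicative bias
    a = pm.Normal("a", 0, 0.5)           # additive bias of the ML predictions
    b = pm.Normal("b", 1, 0.5)           # multiplicative bias
    sigma = pm.HalfNormal("sigma", 0.2)  # residual image uncertainty
    pm.Normal("image", mu=a + b * cover, sigma=sigma, observed=image_cover)
    idata = pm.sample(1000, tune=1000, chains=2, progressbar=False)

print(idata.posterior["a"].mean(), idata.posterior["b"].mean())
```

Once the bias parameters are estimated, the same posterior can propagate image-prediction uncertainty into downstream ecological inferences, which is the role the hierarchical structure plays in the abstract.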


2021 ◽  
Vol 8 (1) ◽  
pp. 205395172110135
Author(s):  
Florian Jaton

This theoretical paper considers the morality of machine learning algorithms and systems in the light of the biases that ground their correctness. It begins by presenting biases not as a priori negative entities but as contingent external referents—often gathered in benchmarked repositories called ground-truth datasets—that define what needs to be learned and allow for performance measures. I then argue that ground-truth datasets and their concomitant practices—that fundamentally involve establishing biases to enable learning procedures—can be described by their respective morality, here defined as the more or less accounted experience of hesitation when faced with what pragmatist philosopher William James called “genuine options”—that is, choices to be made in the heat of the moment that engage different possible futures. I then stress three constitutive dimensions of this pragmatist morality, as far as ground-truthing practices are concerned: (I) the definition of the problem to be solved (problematization), (II) the identification of the data to be collected and set up (databasing), and (III) the qualification of the targets to be learned (labeling). I finally suggest that this three-dimensional conceptual space can be used to map machine learning algorithmic projects in terms of the morality of their respective and constitutive ground-truthing practices. Such techno-moral graphs may, in turn, serve as equipment for greater governance of machine learning algorithms and systems.


Author(s):  
Paul Gustafson

Abstract. The article by Jiang et al. (Am J Epidemiol) extends quantitative bias analysis from the realm of statistical models to the realm of machine learning algorithms. Given the rooting of statistical models in the spirit of explanation and the rooting of machine learning algorithms in the spirit of prediction, this extension is thought-provoking indeed. Some such thoughts are expounded here.
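For readers unfamiliar with quantitative bias analysis, a minimal sketch of one of its simplest forms, the Rogan-Gladen correction for outcome misclassification, is given below. The sensitivity and specificity values are illustrative assumptions, and the snippet is not drawn from the article under discussion.

```python
# Minimal illustration of simple quantitative bias analysis: back-correcting
# an observed prevalence using assumed classifier sensitivity and specificity.
def corrected_prevalence(p_obs: float, sens: float, spec: float) -> float:
    """Rogan-Gladen correction: p = (p_obs + spec - 1) / (sens + spec - 1)."""
    return (p_obs + spec - 1.0) / (sens + spec - 1.0)

# If a machine-learning classifier labels 30% of records positive and is
# assumed to have sensitivity 0.85 and specificity 0.95, the bias-corrected
# prevalence is (0.30 + 0.95 - 1) / (0.85 + 0.95 - 1) = 0.25 / 0.80:
print(corrected_prevalence(0.30, sens=0.85, spec=0.95))  # ~0.31
```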


2020 ◽  
Vol 34 (12) ◽  
pp. 1078-1087
Author(s):  
Peter S. Lum ◽  
Liqi Shu ◽  
Elaine M. Bochniewicz ◽  
Tan Tran ◽  
Lin-Ching Chang ◽  
...  

Background: Wrist-worn accelerometry provides objective monitoring of upper-extremity functional use, such as reaching tasks, but it also detects nonfunctional movements, leading to ambiguity in monitoring results. Objective: To compare machine learning algorithms with the standard method (counts ratio) for accuracy in detecting functional activity. Methods: Healthy controls and individuals with stroke performed unstructured tasks in a simulated community environment (test duration = 26 ± 8 minutes) while accelerometry and video were synchronously recorded. Human annotators scored each frame of the video as functional or nonfunctional activity, providing ground truth. Several machine learning algorithms were developed to separate functional from nonfunctional activity in the accelerometer data. We also calculated the counts ratio, which uses a thresholding scheme to calculate the duration of activity in the paretic limb normalized by the less-affected limb. Results: The counts ratio was not significantly correlated with ground truth and had large errors (r = 0.48; P = .16; average error = 52.7%) because of high levels of nonfunctional movement in the paretic limb; counts did not increase with increased functional movement. The best-performing intrasubject machine learning algorithm had an accuracy of 92.6% in the paretic limb of stroke patients, and its correlation with ground truth was r = 0.99 (P < .001; average error = 3.9%). The best intersubject model had an accuracy of 74.2% and a correlation of r = 0.81 (P = .005; average error = 5.2%) with ground truth. Conclusions: In our sample, the counts ratio did not accurately reflect functional activity. Machine learning algorithms were more accurate, and future work should focus on the development of a clinical tool.
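A hedged sketch of the counts-ratio baseline may help: it thresholds per-epoch activity counts in each limb and normalizes paretic active time by the less-affected limb. The threshold value, epoch structure, and synthetic data below are assumptions for illustration, not the study's exact pipeline.

```python
# Sketch (assumed details) of the counts-ratio baseline: threshold each
# limb's per-epoch activity counts, then divide paretic active time by
# less-affected active time.
import numpy as np

def counts_ratio(paretic: np.ndarray, nonparetic: np.ndarray,
                 threshold: float = 2.0) -> float:
    """Duration of supra-threshold activity in the paretic limb
    normalized by the less-affected limb (epochs of count values)."""
    active_paretic = np.sum(paretic > threshold)
    active_nonparetic = np.sum(nonparetic > threshold)
    return active_paretic / max(active_nonparetic, 1)

# Illustrative: nonfunctional movement (e.g., arm swing during walking)
# inflates paretic counts, so the ratio can stay high even when actual
# functional use is low, matching the abstract's negative finding.
rng = np.random.default_rng(0)
ratio = counts_ratio(rng.exponential(2.0, 1560), rng.exponential(3.0, 1560))
print(round(ratio, 2))
```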


2019 ◽  
Vol 2 ◽  
pp. 1-8
Author(s):  
Lukas Gokl ◽  
Marvin Mc Cutchan ◽  
Bartosz Mazurkiewicz ◽  
Paolo Fogliaroni ◽  
Ioannis Giannopoulos

Abstract. Location Based Services (LBS) are helpful for people who interact with an unfamiliar environment, but also for those who already possess a certain level of familiarity with it. In order to avoid overwhelming familiar users with unnecessary information, the level of detail offered by the LBS should be adapted to the user's familiarity with the environment: more detail for unfamiliar users, and less information (which would otherwise be superfluous, if not misleading) for users who are more familiar with the current environment. Currently, the information exchange between the service and its users does not take familiarity into account. Within this work, we investigate the potential of machine learning for a binary classification of familiarity with the surrounding environment (i.e., familiar vs. unfamiliar). For this purpose, a 3D virtual environment based on a part of Vienna, Austria, was designed using datasets from the municipal government. During a navigation experiment with 22 participants, we collected ground truth data in order to train four machine learning algorithms. The captured data included the users' motion and orientation as well as their visual interaction with the surrounding buildings during navigation. This work demonstrates the potential of machine learning for predicting the state of familiarity, an enabling step toward LBS that are better tailored to the user.
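As a rough illustration of the classification step, the sketch below trains one possible binary classifier on tabular features of the kind described (motion, orientation, visual interaction with buildings). The feature names, model choice, synthetic data, and the assumed link between gaze behavior and familiarity are all illustrative assumptions, not the study's actual variables or algorithms.

```python
# Minimal sketch of a familiar-vs-unfamiliar classifier on assumed features.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(42)
n_trials = 220
X = np.column_stack([
    rng.normal(1.3, 0.3, n_trials),   # walking speed (m/s), assumed feature
    rng.normal(15, 6, n_trials),      # head-yaw variability (deg), assumed
    rng.uniform(0, 1, n_trials),      # fraction of gaze time on facades, assumed
])
# Synthetic link for illustration only: familiar users spend less time
# inspecting facades (an assumption, not a study finding).
y = (X[:, 2] < 0.4).astype(int)       # ground truth: 1 = familiar

clf = RandomForestClassifier(n_estimators=200, random_state=0)
print(cross_val_score(clf, X, y, cv=5).mean())  # cross-validated accuracy
```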


2020 ◽  
Author(s):  
Octavian Dumitru ◽  
Gottfried Schwarz ◽  
Dongyang Ao ◽  
Gabriel Dax ◽  
Vlad Andrei ◽  
...  

In recent years, machine learning tools and applications have come into broad use. However, when we use these techniques for geophysical analyses, we must be sure that the obtained results are scientifically valid and allow us to derive quantitative outcomes that can be directly compared with other measurements.

Therefore, we set out to identify typical datasets that lend themselves well to geophysical data interpretation. To simplify this very general task, we concentrate in this contribution on multi-dimensional image data acquired by satellites with typical remote sensing instruments for Earth observation, being used for the analysis of:

- Atmospheric phenomena (cloud cover, cloud characteristics, smoke and plumes, strong winds, etc.)
- Land cover and land use (open terrain, agriculture, forestry, settlements, buildings and streets, industrial and transportation facilities, mountains, etc.)
- Sea and ocean surfaces (waves, currents, ships, icebergs, coastlines, etc.)
- Ice and snow on land and water (ice fields, glaciers, etc.)
- Image time series (dynamical phenomena, their occurrence and magnitude, mapping techniques)

Then we analyze important data characteristics for each type of instrument. Most selected images are characterized by their type of imaging instrument (e.g., radar or optical images), their typical signal-to-noise figures, their preferred pixel sizes, their various spectral bands, etc.

As a third step, we select a number of established machine learning algorithms, available tools, software packages, required environments, published experiences, and specific caveats. The comparisons cover traditional "flat" as well as advanced "deep" techniques that have to be compared in detail before making any decision about their usefulness for geophysical applications. They range from simple thresholding to k-means, from multi-scale approaches to convolutional networks (with visible or hidden layers) and auto-encoders, with sub-components from rectified linear units to adversarial networks.

Finally, we summarize our findings in several instrument / machine learning algorithm matrices (e.g., for active or passive instruments). These matrices also contain important features of the input data and their consequences, computational effort, attainable figures-of-merit, and necessary testing and verification steps (positive and negative examples). Typical examples are statistical similarities, characteristic scales, rotation invariance, target groupings, topic bagging and targeting (hashing) capabilities as well as local compression behavior.
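To make the range of techniques concrete, the sketch below applies the simplest clustering method mentioned, k-means, to synthetic multispectral pixels. The band count, image size, and cluster number are illustrative assumptions rather than settings from the contribution.

```python
# Sketch: unsupervised k-means clustering of multispectral pixels, a simple
# baseline for land-cover-like mapping; data here are synthetic stand-ins.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(7)
h, w, bands = 64, 64, 4                      # assumed 4 spectral bands
image = rng.random((h, w, bands)).astype(np.float32)

pixels = image.reshape(-1, bands)            # flatten to (n_pixels, n_bands)
labels = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(pixels)
class_map = labels.reshape(h, w)             # per-pixel cluster map
print(np.bincount(labels))                   # pixels per spectral cluster
```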

