Active learning in automated text classification: a case study exploring bias in predicted model performance metrics

2019 ◽  
Vol 39 (3) ◽  
pp. 269-280 ◽  
Author(s):  
Arun Varghese ◽  
Tao Hong ◽  
Chelsea Hunter ◽  
George Agyeman-Badu ◽  
Michelle Cawley

2020 ◽  
Vol 40 (4) ◽  
pp. 465-479 ◽  
Author(s):  
Arun Varghese ◽  
George Agyeman-Badu ◽  
Michelle Cawley

2021 ◽  
Author(s):  
Troy Smith

The study examines the applicability of Naïve Bayes in predictive classification modelling using a case study of cybercrime victimization data. The goal was a targeted presentation of the benefits of Bayesian analysis in crime research, geared to policymakers. The method is assessed using a Model-Comparison Approach and model performance metrics. The study shows that Naïve Bayes can be useful in predictive classification where the target population is small or difficult to access, as in offender profiling and the analysis of high-crime areas. This is important because it provides a plausible alternative to traditional frequentist methods, one that overcomes their statistical limitations and provides results in a form easily conveyed to policymakers. Further, the conditional probability data produced make future prediction transparent and can foster confidence in predicted outcomes. In particular, a Directed Acyclic Graph can easily be used to represent the Naïve Bayes output, allowing visualization of the relationships between variables.
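To make the "conditional probability data" mentioned in this abstract concrete, the following is a minimal sketch of a categorical Naïve Bayes classifier. It is not the study's model: the feature names, the six toy records, and the Laplace smoothing constant `alpha` are all assumptions made up for illustration.

```python
from collections import Counter, defaultdict


def fit_naive_bayes(rows, labels, alpha=1.0):
    """Estimate class priors and smoothed per-feature conditional probabilities."""
    class_counts = Counter(labels)
    n_features = len(rows[0])
    # value_counts[(feature_index, label)][value] -> number of occurrences
    value_counts = defaultdict(Counter)
    feature_values = [set() for _ in range(n_features)]
    for row, label in zip(rows, labels):
        for j, value in enumerate(row):
            value_counts[(j, label)][value] += 1
            feature_values[j].add(value)
    priors = {c: n / len(labels) for c, n in class_counts.items()}

    def conditional(j, value, label):
        # Laplace-smoothed estimate of P(feature_j = value | label)
        numerator = value_counts[(j, label)][value] + alpha
        denominator = class_counts[label] + alpha * len(feature_values[j])
        return numerator / denominator

    return priors, conditional


def predict_proba(row, priors, conditional):
    """Posterior class probabilities under the naive independence assumption."""
    scores = {}
    for label, prior in priors.items():
        score = prior
        for j, value in enumerate(row):
            score *= conditional(j, value, label)
        scores[label] = score
    total = sum(scores.values())
    return {label: s / total for label, s in scores.items()}


# Hypothetical records: (uses_public_wifi, reuses_passwords) -> outcome
rows = [("yes", "yes"), ("yes", "no"), ("no", "no"),
        ("no", "no"), ("yes", "yes"), ("no", "yes")]
labels = ["victim", "victim", "not_victim",
          "not_victim", "victim", "not_victim"]

priors, conditional = fit_naive_bayes(rows, labels)
posterior = predict_proba(("yes", "yes"), priors, conditional)
print(posterior)
```

Every number that feeds the prediction (the priors and each smoothed conditional probability) can be read off directly, which is the kind of transparency the abstract points to.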


2020 ◽  
Author(s):  
Thaine H. Assumpção ◽  
Ioana Popescu ◽  
Andreja Jonoski ◽  
Dimitri P. Solomatine

The calibration and validation of inundation models have long been influenced by data availability. When only stage hydrographs and high water level marks were available, metrics such as the Root Mean Square Error (RMSE) were selected for goodness-of-fit assessment. When remotely sensed flood extent data started to be obtained, binary performance measures started being used. Although data availability and modelling resolution have advanced in the past decades, the methods behind performance evaluation remain similar. Shape-based metrics used in topology and pattern recognition could enhance not only the raw model performance but also our ability to diagnose achieved results. Therefore, in this study, we discuss how much improvement in calibration can be obtained by employing shape matching metrics. The research is conducted in two experiments: a 2D hydrodynamic benchmarking model and the Po River case study. Different metrics traditionally used in inundation modelling and metrics tailored towards shape matching were employed. Calibration of the Manning coefficient was performed using one metric at a time. Experiments showed that metrics incorporating scale components (e.g. differences in areas and/or distances) provide better calibration. This corroborates the wide use of traditional metrics and indicates the potential of using shape-based metrics, which can augment our ability to diagnose models and improve modelling results.
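The two families of traditional metrics named in this abstract can be sketched as follows. RMSE is as named in the text; the abstract does not say which binary performance measure was used, so the critical success index below is an assumed, commonly used choice, and all water levels and wet/dry cells are invented values.

```python
import math


def rmse(observed, simulated):
    """Root Mean Square Error between paired observed and simulated values."""
    if len(observed) != len(simulated):
        raise ValueError("series must have the same length")
    squared = [(o - s) ** 2 for o, s in zip(observed, simulated)]
    return math.sqrt(sum(squared) / len(squared))


def critical_success_index(observed_wet, simulated_wet):
    """Binary flood-extent fit: hits / (hits + misses + false alarms)."""
    if len(observed_wet) != len(simulated_wet):
        raise ValueError("grids must have the same number of cells")
    pairs = list(zip(observed_wet, simulated_wet))
    hits = sum(1 for o, s in pairs if o and s)
    misses = sum(1 for o, s in pairs if o and not s)
    false_alarms = sum(1 for o, s in pairs if s and not o)
    denominator = hits + misses + false_alarms
    if denominator == 0:
        raise ValueError("no wet cells in either grid")
    return hits / denominator


# Hypothetical water levels (m) at three gauges
level_error = rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])

# Hypothetical flattened wet (1) / dry (0) cells from a flood-extent map
extent_fit = critical_success_index([1, 1, 0, 0, 1], [1, 0, 1, 0, 1])

print(level_error, extent_fit)
```

Neither measure looks at where along the flood boundary the mismatch occurs, which is the gap the shape matching metrics discussed in the abstract are meant to address.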


2018 ◽  
Vol 24 (4) ◽  
pp. 733-754
Author(s):  
Hyeon Woo Lee ◽  
Yoon Mi Cha ◽  
Kibeom Kim

2020 ◽  
Vol 12 (6) ◽  
pp. 2208 ◽  
Author(s):  
Jamie E. Filer ◽  
Justin D. Delorit ◽  
Andrew J. Hoisington ◽  
Steven J. Schuldt

Remote communities such as rural villages, post-disaster housing camps, and military forward operating bases are often located in isolated and hostile areas with limited or no access to established infrastructure grids. Operating these communities with conventional assets requires constant resupply, which yields a significant logistical burden, creates negative environmental impacts, and increases costs. For example, a 2000-member isolated village in northern Canada relying on diesel generators required 8.6 million USD of fuel per year and emitted 8500 tons of carbon dioxide. Remote community planners can mitigate these negative impacts by selecting sustainable technologies that minimize resource consumption and emissions. However, these alternatives often come at a higher procurement cost and mobilization requirement. To assist planners with this challenging task, this paper presents the development of a novel infrastructure sustainability assessment model capable of generating optimal tradeoffs between minimizing environmental impacts and minimizing life-cycle costs over the community’s anticipated lifespan. Model performance was evaluated using a case study of a hypothetical 500-person remote military base with 864 feasible infrastructure portfolios and 48 procedural portfolios. The case study results demonstrated the model’s novel capability to assist planners in identifying optimal combinations of infrastructure alternatives that minimize negative sustainability impacts, leading to remote communities that are more self-sufficient with reduced emissions and costs.
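The idea of "optimal tradeoffs" between life-cycle cost and environmental impact can be illustrated with a simple Pareto filter. This is only a sketch of the general concept: the abstract does not describe the paper's optimization method, and the portfolio names, costs, and impact scores below are hypothetical.

```python
def pareto_front(portfolios):
    """Return the portfolios not dominated on (cost, impact); lower is better for both."""
    front = []
    for name, cost, impact in portfolios:
        dominated = any(
            other_cost <= cost and other_impact <= impact
            and (other_cost < cost or other_impact < impact)
            for _, other_cost, other_impact in portfolios
        )
        if not dominated:
            front.append((name, cost, impact))
    # Sort by cost so the tradeoff curve reads from cheapest to cleanest
    return sorted(front, key=lambda p: p[1])


# Hypothetical portfolios: (name, life-cycle cost, environmental impact score)
portfolios = [
    ("A", 10.0, 9.0),
    ("B", 12.0, 5.0),
    ("C", 15.0, 6.0),  # costlier and dirtier than B, so dominated
    ("D", 20.0, 2.0),
    ("E", 11.0, 9.0),  # same impact as A at higher cost, so dominated
]

front = pareto_front(portfolios)
print([name for name, _, _ in front])
```

A planner would then choose among the surviving portfolios according to how much extra cost is acceptable for each reduction in impact.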


Water ◽  
2020 ◽  
Vol 13 (1) ◽  
pp. 37
Author(s):  
Tomás de Figueiredo ◽  
Ana Caroline Royer ◽  
Felícia Fonseca ◽  
Fabiana Costa de Araújo Schütz ◽  
Zulimar Hernández

The European Space Agency Climate Change Initiative Soil Moisture (ESA CCI SM) product provides soil moisture estimates from radar satellite data with a daily temporal resolution. Despite validation exercises with ground data that have been performed since the product’s launch, SM has not yet been consistently related to soil water storage, which is a key step for its application for prediction purposes. This study aimed to analyse the relationship between soil water storage (S), which was obtained from soil water balance computations with ground meteorological data, and soil moisture, which was obtained from radar data, as affected by soil water storage capacity (Smax). As a case study, a 14-year monthly series of soil water storage, produced via soil water balance computations using ground meteorological data from northeast Portugal and Smax from 25 mm to 150 mm, was matched with the corresponding monthly averaged SM product. Linear (I) and logistic (II) regression models relating S with SM were compared. Model performance (r² in the 0.8–0.9 range) varied non-monotonically with Smax and was highest at an Smax of 50 mm. The logistic model (II) performed better than the linear model (I) in the lower range of Smax. Improvements in model performance obtained by separating the data series into two subsets, representing soil water recharge and depletion phases throughout the year, outlined the hysteresis in the relationship between S and SM.


Author(s):  
Elena Bartolomé ◽  
Paula Benítez

Failure Mode and Effect Analysis (FMEA) is a powerful quality tool, widely used in industry, for the identification of failure modes, their effects and causes. In this work, we investigated the utility of FMEA in the education field to improve active learning processes. In our case study, the FMEA principles were adapted to assess the risk of failures in a Mechanical Engineering course on “Theory of Machines and Mechanisms” conducted through a project-based, collaborative “Study and Research Path (SRP)” methodology. The SRP is an active learning instruction format which is initiated by a generating question that leads to a sequence of derived questions and answers, and combines moments of study and inquiry. By applying the FMEA, the teaching team was able to identify the most critical failures of the process, and implement corrective actions to improve the SRP in the subsequent year. Thus, our work shows that FMEA represents a simple tool of risk assessment which can serve to identify criticality in educational processes, and improve the quality of active learning.
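As an illustration of how FMEA ranks failures, the sketch below uses the conventional Risk Priority Number (severity × occurrence × detection, each scored from 1 to 10). The abstract says the FMEA principles were adapted for the course, so the authors' actual scoring scheme may differ; the failure modes and scores here are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    description: str
    severity: int    # 1 (negligible) to 10 (critical)
    occurrence: int  # 1 (rare) to 10 (almost certain)
    detection: int   # 1 (easily detected) to 10 (hard to detect)

    def rpn(self) -> int:
        """Conventional Risk Priority Number: severity * occurrence * detection."""
        for score in (self.severity, self.occurrence, self.detection):
            if not 1 <= score <= 10:
                raise ValueError("scores must be between 1 and 10")
        return self.severity * self.occurrence * self.detection


def rank_by_risk(modes):
    """Order failure modes from highest to lowest RPN."""
    return sorted(modes, key=lambda m: m.rpn(), reverse=True)


# Hypothetical failure modes for a project-based course
modes = [
    FailureMode("Team stalls on the generating question", 7, 5, 4),
    FailureMode("Uneven workload within a team", 5, 6, 6),
    FailureMode("Derived questions drift off syllabus", 6, 3, 3),
]

ranked = rank_by_risk(modes)
for mode in ranked:
    print(mode.rpn(), mode.description)
```

The highest-ranked entries are the candidates for corrective action in the next iteration of the course.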


2014 ◽  
Vol 48 (1) ◽  
pp. 42-42 ◽  
Author(s):  
Giacomo Berardi
