Statistical Learning with a Nuisance Component (Extended Abstract)

Author(s):  
Dylan J. Foster ◽  
Vasilis Syrgkanis

We provide excess risk guarantees for statistical learning in a setting where the population risk with respect to which we evaluate a target parameter depends on an unknown parameter that must be estimated from data (a "nuisance parameter"). We analyze a two-stage sample-splitting meta-algorithm that takes as input two arbitrary estimation algorithms: one for the target parameter and one for the nuisance parameter. We show that if the population risk satisfies a condition called Neyman orthogonality, the impact of the nuisance estimation error on the excess risk bound achieved by the meta-algorithm is of second order. Our theorem is agnostic to the particular algorithms used for the target and nuisance parameters and only makes an assumption on their individual performance. This enables the use of a plethora of existing results from the statistical learning and machine learning literature to give new guarantees for learning with a nuisance component. Moreover, by focusing on excess risk rather than parameter estimation, we can give guarantees under weaker assumptions than in previous works and accommodate the case where the target parameter belongs to a complex nonparametric class. We characterize conditions on the metric entropy under which oracle rates---rates of the same order as if we knew the nuisance parameter---are achieved. We also analyze the rates achieved by specific estimation algorithms such as variance-penalized empirical risk minimization, neural network estimation, and sparse high-dimensional linear model estimation. We highlight the applicability of our results in four settings of central importance in the literature: 1) heterogeneous treatment effect estimation, 2) offline policy optimization, 3) domain adaptation, and 4) learning with missing data.
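As a rough illustration of the two-stage sample-splitting scheme described in the abstract, the sketch below instantiates it for a partially linear model, where the nuisance is the pair of conditional means E[y|x] and E[t|x] and the target is a single treatment coefficient. The residual-on-residual second stage is one example of a Neyman-orthogonal loss, not the paper's general formulation, and the learners and variable names are illustrative assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression


def orthogonal_two_stage(x, t, y, nuisance_factory, target_factory, seed=0):
    """Two-stage sample splitting: fit nuisances on one half, the target on the other."""
    n = len(y)
    idx = np.random.default_rng(seed).permutation(n)
    fold1, fold2 = idx[: n // 2], idx[n // 2:]

    # Stage 1: estimate the nuisance functions on the first half.
    g_hat = nuisance_factory().fit(x[fold1], y[fold1])   # approximates E[y | x]
    m_hat = nuisance_factory().fit(x[fold1], t[fold1])   # approximates E[t | x]

    # Stage 2: minimise a Neyman-orthogonal loss over the target on the held-out
    # half; here it reduces to regressing residualised y on residualised t.
    y_res = y[fold2] - g_hat.predict(x[fold2])
    t_res = t[fold2] - m_hat.predict(x[fold2])
    target = target_factory().fit(t_res.reshape(-1, 1), y_res)
    return target.coef_[0]


# Synthetic partially linear data: y = 1.5 * t + g(x) + noise.
rng = np.random.default_rng(1)
x = rng.normal(size=(2000, 5))
t = x[:, 0] + rng.normal(size=2000)
y = 1.5 * t + np.sin(x[:, 1]) + rng.normal(size=2000)

theta_hat = orthogonal_two_stage(
    x, t, y,
    nuisance_factory=lambda: RandomForestRegressor(n_estimators=100),
    target_factory=LinearRegression,
)
print(theta_hat)  # close to 1.5
```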

2020 ◽  
Vol 6 (1) ◽  
Author(s):  
Malte Seemann ◽  
Lennart Bargsten ◽  
Alexander Schlaefer

Deep learning methods produce promising results when applied to a wide range of medical imaging tasks, including segmentation of the artery lumen in computed tomography angiography (CTA) data. However, to perform sufficiently well, neural networks have to be trained on large amounts of high-quality annotated data. In the realm of medical imaging, annotations are not only quite scarce but also often not entirely reliable. To tackle both challenges, we developed a two-step approach for generating realistic synthetic CTA data for the purpose of data augmentation. In the first step, moderately realistic images are generated in a purely numerical fashion. In the second step, these images are improved by applying neural domain adaptation. We evaluated the impact of synthetic data on lumen segmentation via convolutional neural networks (CNNs) by comparing the resulting performances. Improvements of up to 5% in terms of the Dice coefficient and 20% for the Hausdorff distance represent a proof of concept that the proposed augmentation procedure can be used to enhance deep learning-based segmentation of the artery lumen in CTA images.
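For context, the two reported metrics can be computed for binary lumen masks roughly as follows. This is a generic sketch, not the authors' evaluation code, and the toy masks are illustrative.

```python
import numpy as np
from scipy.spatial.distance import directed_hausdorff


def dice_coefficient(pred, truth):
    """Overlap score in [0, 1]; 1 means the masks match exactly."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    intersection = np.logical_and(pred, truth).sum()
    return 2.0 * intersection / (pred.sum() + truth.sum())


def hausdorff_distance(pred, truth):
    """Symmetric Hausdorff distance between foreground pixel coordinates."""
    p, t = np.argwhere(pred), np.argwhere(truth)
    return max(directed_hausdorff(p, t)[0], directed_hausdorff(t, p)[0])


# Toy example: two overlapping square "lumen" masks.
a = np.zeros((64, 64), dtype=bool); a[10:30, 10:30] = True
b = np.zeros((64, 64), dtype=bool); b[12:32, 12:32] = True
print(dice_coefficient(a, b), hausdorff_distance(a, b))
```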


2021 ◽  
Author(s):  
Anne-Marie Begin

To estimate the impact of climate change on our society we need to use climate projections based on numerical models. These models make it possible to assess the effects on climate of the increase in greenhouse gases (GHG) as well as natural variability. We know that the global average temperature will increase and that the occurrence, intensity and spatio-temporal distribution of extreme precipitation will change. These extreme weather events cause droughts, floods and other natural disasters that have significant consequences for our lives and environment. Precipitation is a key variable in adapting to climate change.

This study focuses on the ClimEx large ensemble, a set of 50 independent simulations created to study the effect of climate change and natural variability on the water network in Quebec. This dataset consists of simulations produced using the Canadian Regional Climate Model version 5 (CRCM5) at 12-km resolution, driven by simulations from the second-generation Canadian Earth System Model (CanESM2) global model at 310-km resolution.

The aim of the project is to evaluate the performance of the ClimEx ensemble in simulating the daily cycle and representing extreme values. To this end, 30 years of hourly time series for precipitation and 3-hourly time series for temperature are analyzed. The simulations are compared with the values from the CRCM5 simulation driven by the ERA-Interim reanalysis, the ERA5 reanalysis and Environment and Climate Change Canada (ECCC) stations. An evaluation of the sensitivity of different statistics to the number of members is also performed.

The daily cycle of precipitation from ClimEx shows mainly non-significant correlations with the other datasets, and its amplitude is smaller than in the observational data from ECCC stations. For temperature, the correlation is strong and the amplitude of the cycle is similar to observations. ClimEx provides a fairly good representation of the 95th, 97th and 99th quantiles for precipitation. For temperature it represents the distribution of quantiles well, but with a warm bias in southern Quebec. For hourly precipitation maxima, ClimEx shows values 10 times higher than ERA5. For temperature, minimum and maximum values may exceed the ERA5 limits by up to 20°C. For precipitation, the minimum number of members for the estimation of the 95th and 99th quantiles and the mean cycle is between 15 and 50 for an estimation error of less than 5%. For the 95th and 99th quantiles of temperature, the minimum number of members is between 1 and 17, and for the mean cycle 1 to 2 members are necessary to obtain an estimation error of less than 0.5°C.
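A minimal sketch of the member-sensitivity analysis described above, using synthetic data in place of the ClimEx precipitation series: estimate a high quantile from random subsets of members and find how many members keep the relative estimation error below 5%. All values below are placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)
n_members, n_values = 50, 30 * 365        # shortened stand-in for the 30-year hourly series
# Hypothetical precipitation series for 50 members (gamma-distributed stand-in).
ensemble = rng.gamma(shape=0.5, scale=2.0, size=(n_members, n_values))

reference = np.quantile(ensemble, 0.99)   # full-ensemble estimate of the 99th quantile
tolerance = 0.05                          # 5 % relative error

for k in range(1, n_members + 1):
    errors = []
    for _ in range(200):                  # resample random subsets of k members
        subset = ensemble[rng.choice(n_members, size=k, replace=False)]
        errors.append(abs(np.quantile(subset, 0.99) - reference) / reference)
    if np.mean(errors) < tolerance:
        print(f"{k} member(s) keep the mean 99th-quantile error below 5 %")
        break
```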


2021 ◽  
Vol 15 (3) ◽  
pp. 55-62
Author(s):  
V. P. Asovskiy ◽  
A. S. Kuzmenko ◽  
O. V. Khudolenko

The authors considered the use of unmanned aerial vehicles as one of the promising innovative directions for the development of economic and social sectors. They touched upon the prospects for their use in agriculture, especially for pesticide and agrochemical application, where accuracy, quality and timeliness are important. The relevance of multicopter performance assessment was noted. (Research purpose) The authors aim to develop and test a methodology for evaluating multicopter performance indicators for pesticide and agrochemical application in the agricultural industry. (Materials and methods) The authors used scientific and technical information and experimental materials; applied methods of system, statistical and functional-cost analysis, mathematical modeling, and object and process parameter optimization; as well as previously developed methodological approaches to studying the aerial distribution of substances. (Results and discussion) The authors presented a general description and the content of the developed methodology and tools for assessing multicopter performance when applying working solutions, which provide for an estimation error of up to 7 percent. The typical options for field plots and their treatment were specified. The authors analyzed the results of testing the methodology and software for a typical hexacopter with a payload of up to 10 kilograms. They analyzed the impact of working speeds of up to 10 meters per second, application rates of 2-30 liters per hectare, field plot sizes of up to 200 hectares and their characteristics, traffic patterns and other factors on productivity and multicopter treatment cost. (Conclusions) The authors confirmed the efficiency of implementing a complex multi-factor assessment of multicopter performance indicators for working fluid application in agricultural production. They determined the appropriate area of application for multicopters with a payload of up to 10 kilograms: field plots of up to 50-60 hectares with run lengths of up to 800-900 meters, with treatment productivity of up to 10.5 hectares per flight hour, up to 7.5 hectares per working hour, and up to 55 hectares per day. Proposals and recommendations for the provision, organization and implementation of this work were formulated.
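A back-of-the-envelope sketch of how such productivity figures can be assembled from working speed, application rate and payload; the swath width, turn allowance and refill time below are illustrative assumptions, not values from the study.

```python
def flight_productivity_ha_per_hour(speed_m_s, swath_m, turn_allowance=0.15):
    """Area sprayed per flight hour, discounted for turns at the ends of runs."""
    area_m2_per_s = speed_m_s * swath_m * (1.0 - turn_allowance)
    return area_m2_per_s * 3600 / 10_000          # m^2/s -> ha/h


def working_productivity_ha_per_hour(flight_ha_h, tank_l, rate_l_ha, refill_min=3.0):
    """Productivity including the refill pause forced after each tank."""
    ha_per_tank = tank_l / rate_l_ha              # area one tank covers
    cycle_h = ha_per_tank / flight_ha_h + refill_min / 60
    return ha_per_tank / cycle_h


flight = flight_productivity_ha_per_hour(speed_m_s=10, swath_m=4)
working = working_productivity_ha_per_hour(flight, tank_l=10, rate_l_ha=10)
print(f"flight: {flight:.1f} ha/h, working: {working:.1f} ha/h")
```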


2016 ◽  
Vol 4 (2) ◽  
pp. 123
Author(s):  
Novita Anugrah Listiyana ◽  
Dedi Rusdi

This study analyzed the relationship between humans as users of a system and application software as its object, an inseparable relationship. The purpose of this study was to analyze the effect of system quality on perceived system quality; the effect of perceived system quality and information quality on intensity of use and user satisfaction; and the influence of intensity of use and user satisfaction on individual performance impact. This research is an empirical study using a purposive sampling technique for data collection. Data were collected through questionnaires administered to 39 BMT operational employees. The data obtained were then analyzed using path analysis, including hypothesis testing. The path analysis showed that each variable in the four-equation model had a coefficient in the positive direction. This means that improving system quality can improve information quality and, through employees' use of and satisfaction with the system, improve individual performance.
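A minimal sketch of path analysis on the structural chain examined in the study (system quality → perceived quality → intensity of use / satisfaction → individual impact), estimated as a sequence of OLS regressions on standardized variables; the synthetic data and coefficients are illustrative, not the questionnaire data.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

# Synthetic stand-in for the questionnaire data (n = 39 respondents).
rng = np.random.default_rng(42)
n = 39
df = pd.DataFrame({"system_quality": rng.normal(size=n)})
df["perceived_quality"] = 0.6 * df["system_quality"] + rng.normal(scale=0.8, size=n)
df["use_intensity"] = 0.5 * df["perceived_quality"] + rng.normal(scale=0.8, size=n)
df["satisfaction"] = 0.4 * df["perceived_quality"] + rng.normal(scale=0.8, size=n)
df["individual_impact"] = (0.4 * df["use_intensity"] + 0.3 * df["satisfaction"]
                           + rng.normal(scale=0.8, size=n))
df = (df - df.mean()) / df.std()   # standardise so OLS slopes act as path coefficients


def path_coefficients(y, xs):
    """One structural equation of the path model, fitted by OLS."""
    fit = sm.OLS(df[y], sm.add_constant(df[xs])).fit()
    return fit.params[xs]


print(path_coefficients("perceived_quality", ["system_quality"]))
print(path_coefficients("use_intensity", ["perceived_quality"]))
print(path_coefficients("satisfaction", ["perceived_quality"]))
print(path_coefficients("individual_impact", ["use_intensity", "satisfaction"]))
```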


2020 ◽  
Author(s):  
Bethany Growns ◽  
Kristy Martire

Forensic feature-comparison examiners in select disciplines are more accurate than novices when comparing visual evidence samples. This paper examines a key cognitive mechanism that may contribute to this superior visual comparison performance: the ability to learn how often stimuli occur in the environment (distributional statistical learning). We examined the relationship between distributional learning and visual comparison performance, and the impact of training about the diagnosticity of distributional information in visual comparison tasks. We compared performance between novices given no training (uninformed novices; n = 32), accurate training (informed novices; n = 32) or inaccurate training (misinformed novices; n = 32) in Experiment 1; and between forensic examiners (n = 26), informed novices (n = 29) and uninformed novices (n = 27) in Experiment 2. Across both experiments, forensic examiners and novices performed significantly above chance in a visual comparison task where distributional learning was required for high performance. However, informed novices outperformed all participants and only their visual comparison performance was significantly associated with their distributional learning. It is likely that forensic examiners' expertise is domain-specific and doesn't generalise to novel visual comparison tasks. Nevertheless, diagnosticity training could be critical to the relationship between distributional learning and visual comparison performance.


2016 ◽  
Vol 73 (9) ◽  
pp. 2190-2207 ◽  
Author(s):  
Chantel R. Wetzel ◽  
André E. Punt

Ending overfishing and rebuilding fish stocks to levels that provide for optimum sustainable yield is a concern for fisheries management worldwide. In the United States, fisheries managers are legally mandated to end overfishing and to implement rebuilding plans for fish stocks that fall below minimum stock size thresholds. Rebuilding plans should lead to recovery to target stock sizes within 10 years, except in situations where the life history of the stock or environmental conditions dictate otherwise. Federally managed groundfish species along the US West Coast have diverse life histories where some are able to rebuild quickly from overfished status, while others, specifically rockfish (Sebastes spp.), may require decades for rebuilding. A management strategy evaluation which assumed limited estimation error was conducted to evaluate the performance of alternative strategies for rebuilding overfished stocks for these alternative US West Coast life histories. Generally, the results highlight the trade-off between the reduction of catches during rebuilding vs. the length of rebuilding. The most precautionary rebuilding plans requiring the greatest harvest reduction resulted in higher average catches over the entire projection period compared with strategies that required a longer rebuilding period with less of a reduction in rebuilding catch. Attempting to maintain a 50% probability of rebuilding was the poorest performing rebuilding strategy for all life histories, resulting in a large number of changes to the rebuilding plan, increased frequency of failing to meet rebuilding targets, and higher variation in catch. The rebuilding plans that implemented a higher initial rebuilding probability (≥60%) for determining rebuilding fishing mortality and targets generally resulted in fewer changes to the rebuilding plans and rebuilt by the target rebuilding year, particularly for stocks with the longer rebuilding plans (e.g. rockfishes).
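A toy sketch of the kind of rebuilding calculation such a management strategy evaluation rests on: project an overfished stock forward under a candidate fishing mortality F and estimate the probability of reaching the target biomass within 10 years. The surplus-production model and parameter values are illustrative, not the groundfish operating models used in the study.

```python
import numpy as np


def prob_rebuilt(f, b0=0.2, b_target=0.4, k=1.0, r=0.15,
                 years=10, n_sims=2000, sigma=0.3, seed=0):
    """Probability that relative biomass reaches b_target within `years`
    under constant fishing mortality f, in a noisy surplus-production model."""
    rng = np.random.default_rng(seed)
    rebuilt = 0
    for _ in range(n_sims):
        b = b0
        for _ in range(years):
            surplus = r * b * (1 - b / k) * np.exp(rng.normal(0.0, sigma))
            b = max(b + surplus - f * b, 1e-6)
        rebuilt += b >= b_target
    return rebuilt / n_sims


# Trade-off between catch (higher F) and rebuilding probability.
for f in np.arange(0.0, 0.06, 0.01):
    print(f"F = {f:.2f}: P(rebuilt within 10 yr) = {prob_rebuilt(f):.2f}")
```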

