diverse ensemble
Recently Published Documents


Total documents: 29 (five years: 14)
H-index: 6 (five years: 2)

2020 ◽  
Vol 34 (04) ◽  
pp. 4264-4271
Author(s):  
Siddhartha Jain ◽  
Ge Liu ◽  
Jonas Mueller ◽  
David Gifford

The inaccuracy of neural network models on inputs that do not stem from the distribution underlying the training data is problematic and at times unrecognized. Uncertainty estimates of model predictions are often based on the variation in predictions produced by a diverse ensemble of models applied to the same input. Here we describe Maximize Overall Diversity (MOD), an approach to improve ensemble-based uncertainty estimates by encouraging larger overall diversity in ensemble predictions across all possible inputs. We apply MOD to regression tasks including 38 Protein-DNA binding datasets, 9 UCI datasets, and the IMDB-Wiki image dataset. We also explore variants that utilize adversarial training techniques and data density estimation. For out-of-distribution test examples, MOD significantly improves predictive performance and uncertainty calibration without sacrificing performance on test data drawn from the same distribution as the training data. We also find that in Bayesian optimization tasks, the performance of UCB acquisition is improved via MOD uncertainty estimates.
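As a rough illustration of the ensemble-based uncertainty the abstract refers to, the sketch below takes the spread of predictions across ensemble members as the uncertainty and feeds it into a UCB acquisition score. It is not the MOD training procedure itself; the function names, the toy predictions, and the `kappa` trade-off parameter are assumptions made for this example.

```python
import numpy as np


def ensemble_mean_and_std(predictions):
    """Return per-input mean and standard deviation across ensemble members.

    predictions: array-like of shape (n_models, n_inputs).
    """
    predictions = np.asarray(predictions, dtype=float)
    return predictions.mean(axis=0), predictions.std(axis=0)


def ucb_score(mean, std, kappa=1.0):
    """Upper confidence bound: predicted mean plus kappa times the uncertainty."""
    return mean + kappa * std


# Three hypothetical models scored on two candidate inputs.
preds = np.array([[1.0, 2.0],
                  [1.0, 4.0],
                  [1.0, 6.0]])
mean, std = ensemble_mean_and_std(preds)
scores = ucb_score(mean, std, kappa=2.0)
best = int(np.argmax(scores))  # index of the candidate UCB would query next
```

The models agree on the first input, so its uncertainty is zero; they disagree on the second, which raises its UCB score. Better-calibrated disagreement on unfamiliar inputs is what would change which candidate is selected.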


2019 ◽  
Vol 28 (2) ◽  
pp. 293-301
Author(s):  
Joseph T. Ornstein

I develop a procedure for estimating local-area public opinion called stacked regression and poststratification (SRP), a generalization of classical multilevel regression and poststratification (MRP). This procedure employs a diverse ensemble of predictive models—including multilevel regression, LASSO, k-nearest neighbors, random forest, and gradient boosting—to improve the cross-validated fit of the first-stage predictions. In a Monte Carlo simulation, SRP significantly outperforms MRP when there are deep interactions in the data generating process, without requiring the researcher to specify a complex parametric model in advance. In an empirical application, I show that SRP produces superior local public opinion estimates on a broad range of issue areas, particularly when trained on large datasets.
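The two stages named in the abstract can be sketched as follows: first-stage predictions from several models are combined with stacking weights, then the combined cell-level predictions are averaged within each area, weighted by population counts. This is a minimal sketch under stated assumptions: the stacking weights are taken as given (in practice they would be fit by cross-validation), and the function names, cell layout, and numbers are hypothetical.

```python
import numpy as np


def stack_predictions(base_preds, weights):
    """Combine base-model predictions with normalized stacking weights.

    base_preds: array-like of shape (n_models, n_cells).
    weights: one non-negative weight per model (assumed already estimated).
    """
    base_preds = np.asarray(base_preds, dtype=float)
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    return weights @ base_preds


def poststratify(cell_preds, cell_counts, cell_areas):
    """Population-weighted average of cell predictions within each area."""
    cell_preds = np.asarray(cell_preds, dtype=float)
    cell_counts = np.asarray(cell_counts, dtype=float)
    estimates = {}
    for area in sorted(set(cell_areas)):
        mask = np.array([a == area for a in cell_areas])
        total = cell_counts[mask].sum()
        estimates[area] = float((cell_preds[mask] * cell_counts[mask]).sum() / total)
    return estimates


# Two hypothetical base models predicting support in four demographic cells.
base_preds = [[0.2, 0.4, 0.6, 0.8],
              [0.4, 0.4, 0.4, 0.4]]
stacked = stack_predictions(base_preds, weights=[0.5, 0.5])
estimates = poststratify(stacked,
                         cell_counts=[100, 300, 50, 50],
                         cell_areas=["A", "A", "B", "B"])
```

With these toy numbers, area A is dominated by its larger second cell, so its estimate sits closer to that cell's prediction than a simple average would.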


2019 ◽  
Author(s):  
Alberto Fabrizio ◽  
Andrea Grisafi ◽  
Benjamin Meyer ◽  
Michele Ceriotti ◽  
Clemence Corminboeuf

Chemists continuously harvest the power of non-covalent interactions to control phenomena in both the micro- and macroscopic worlds. From the quantum chemical perspective, the strategies essentially rely upon an in-depth understanding of the physical origin of these interactions, the quantification of their magnitude and their visualization in real-space. The total electron density ρ(r) represents the simplest yet most comprehensive piece of information available for fully characterizing bonding patterns and non-covalent interactions. The charge density of a molecule can be computed by solving the Schrödinger equation, but this approach rapidly becomes demanding if the electron density has to be evaluated for thousands of different molecules or for very large chemical systems, such as peptides and proteins. Here we present a transferable and scalable machine-learning model capable of predicting the total electron density directly from the atomic coordinates. The regression model is used to access qualitative and quantitative insights beyond the underlying ρ(r) in a diverse ensemble of sidechain-sidechain dimers extracted from the BioFragment database (BFDb). The transferability of the model to more complex chemical systems is demonstrated by predicting and analyzing the electron density of a collection of 8 polypeptides.


2019 ◽  
Vol 11 (11) ◽  
pp. 1259 ◽  
Author(s):  
Eike Jens Hoffmann ◽  
Yuanyuan Wang ◽  
Martin Werner ◽  
Jian Kang ◽  
Xiao Xiang Zhu

This article addresses the question of mapping building functions jointly using both aerial and street view images via deep learning techniques. One of the central challenges here is determining a data fusion strategy that can cope with heterogeneous image modalities. We demonstrate that geometric combinations of the features of these two types of images, especially in an early stage of the convolutional layers, often lead to a destructive effect due to the spatial misalignment of the features. Therefore, we address this problem through a decision-level fusion of a diverse ensemble of models trained from each image type independently. In this way, the significant differences in appearance of aerial and street view images are taken into account. Compared to the common multi-stream end-to-end fusion approaches proposed in the literature, we are able to increase the precision scores from 68% to 76%. Another challenge is that the sophisticated classification schemes needed for real applications overlap heavily and lack sharp, well-defined boundaries. As a consequence, classification using machine learning becomes significantly harder. In this work, we choose a highly compact classification scheme with four classes (commercial, residential, public, and industrial) because such a classification is highly valuable to urban geography, being correlated with socio-demographic parameters such as population density and income.
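Decision-level fusion, as opposed to merging convolutional features, combines the outputs of independently trained models. The sketch below averages class probabilities from one aerial model and one street view model and picks the most probable of the four classes. The abstract does not state which fusion rule the authors use, so plain averaging is an assumption here, as are the function name and the toy probabilities.

```python
import numpy as np

CLASSES = ["commercial", "residential", "public", "industrial"]


def fuse_decisions(prob_aerial, prob_street):
    """Average per-class probabilities from two models and pick the top class.

    Each argument holds one probability per entry in CLASSES.
    """
    probs = np.stack([prob_aerial, prob_street]).astype(float)
    fused = probs.mean(axis=0)
    return fused, CLASSES[int(fused.argmax())]


# Hypothetical outputs for a single building from the two image types.
fused, label = fuse_decisions([0.5, 0.3, 0.1, 0.1],
                              [0.2, 0.6, 0.1, 0.1])
```

Because each model only ever sees its own image type, the spatial misalignment between aerial and street view features never enters the combination step.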

