Sparse Bayesian modeling of mixed econometric data using data augmentation

2013 ◽  
pp. 173-188
Author(s):  
Helga Wagner ◽  
Regina Tüchler

Author(s):  
Gabriel Ribeiro ◽  
Marcos Yamasaki ◽  
Helon Vicente Hultmann Ayala ◽  
Leandro Coelho ◽  
Viviana Mariani

2021 ◽  
Vol 11 (9) ◽  
pp. 3974
Author(s):  
Laila Bashmal ◽  
Yakoub Bazi ◽  
Mohamad Mahmoud Al Rahhal ◽  
Haikel Alhichri ◽  
Naif Al Ajlan

In this paper, we present an approach to the multi-label classification of remote sensing images based on data-efficient transformers. During the training phase, we generated a second view of each image in the training set using data augmentation. Both the image and its augmented version were then reshaped into a sequence of flattened patches and fed to the transformer encoder. The encoder extracts a compact feature representation from each image with the help of a self-attention mechanism, which can handle the global dependencies between different regions of a high-resolution aerial image. On top of the encoder, we mounted two classifiers: a token classifier and a distiller classifier. During training, we minimized a global loss consisting of two terms, one per classifier. In the test phase, we averaged the outputs of the two classifiers to obtain the final class labels. Experiments on two datasets acquired over the cities of Trento and Civezzano, with a ground resolution of two centimeters, demonstrated the effectiveness of the proposed model.
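The two mechanical steps of this pipeline, reshaping an image into a sequence of flattened patches and averaging the two classifier heads at test time, can be sketched as follows. This is a minimal illustration, not the authors' code: the transformer encoder and both classifier heads are omitted, the image is assumed square and single-channel, and all function names and the logit values are hypothetical.

```python
def to_patch_sequence(image, patch):
    """Split a square 2-D image (list of rows) into a sequence of
    flattened, non-overlapping patch vectors, row-major order."""
    n = len(image)
    seq = []
    for r in range(0, n, patch):
        for c in range(0, n, patch):
            flat = [image[r + i][c + j] for i in range(patch) for j in range(patch)]
            seq.append(flat)
    return seq

def average_heads(token_logits, distill_logits, threshold=0.0):
    """Average the token and distiller classifier logits per label;
    in the multi-label setting, assign every label whose averaged
    logit exceeds the threshold."""
    avg = [(t + d) / 2 for t, d in zip(token_logits, distill_logits)]
    return [int(a > threshold) for a in avg]

# 4x4 toy "image" with pixel value r*4+c, split into four 2x2 patches.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
seq = to_patch_sequence(image, 2)
labels = average_heads([1.2, -0.5, 0.1], [0.8, -1.1, -0.3])
```

With a real aerial image, each flattened patch would additionally be linearly projected and position-encoded before entering the encoder.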


2021 ◽  
Vol 4 (1) ◽  
Author(s):  
Nadin Ulrich ◽  
Kai-Uwe Goss ◽  
Andrea Ebert

Today, more and more data are freely available. Based on these big datasets, deep neural networks (DNNs) are rapidly gaining relevance in computational chemistry. Here, we explore the potential of DNNs to predict chemical properties from chemical structures. As an example, we selected the octanol-water partition coefficient (log P), which plays an essential role in environmental chemistry and toxicology, but also in chemical analysis. The predictive performance of the developed DNN is good, with an RMSE of 0.47 log units on the test dataset and an RMSE of 0.33 on an external dataset from the SAMPL6 challenge. To achieve this performance, we trained the DNN using data augmentation that considers all potential tautomeric forms of the chemicals. We further demonstrate how DNN models can help in curating the log P dataset by identifying potential errors, and we address limitations of the dataset itself.
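The tautomer-based augmentation and the reported error metric can be sketched like this. It is a hedged illustration under stated assumptions: the tautomer enumeration (in practice done with a cheminformatics toolkit) is replaced by a hypothetical lookup table, the structure strings and log P values are made up, and only the bookkeeping is shown.

```python
import math

def augment_with_tautomers(dataset, tautomers):
    """Replicate each (structure, logP) training pair once per known
    tautomeric form, keeping the measured log P as the shared target.
    `tautomers` is a hypothetical map from a parent structure string
    to its enumerated tautomer strings."""
    out = []
    for structure, logp in dataset:
        for form in tautomers.get(structure, [structure]):
            out.append((form, logp))
    return out

def rmse(pred, true):
    """Root-mean-square error between predicted and measured log P."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, true)) / len(pred))

# Toy example: "A" has two tautomeric forms, "B" has none recorded.
aug = augment_with_tautomers([("A", 1.0), ("B", 2.0)],
                             {"A": ["A_keto", "A_enol"]})
err = rmse([1.0, 2.0], [1.0, 4.0])  # sqrt((0 + 4) / 2)
```

Sharing one measured target across all tautomers of a chemical is exactly what lets the network see every form it may encounter at prediction time.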


2021 ◽  
Author(s):  
Xin Sui ◽  
Wanjing Wang ◽  
Jinfeng Zhang

In this work, we trained an ensemble model to predict drug-protein interactions within a sentence based only on its semantics. Our ensemble was built from three separate models: 1) a classification model using a fine-tuned BERT model; 2) a fine-tuned Sentence-BERT model that embeds every sentence into a vector; and 3) another classification model using a fine-tuned T5 model. In all models, we further improved performance using data augmentation. For model 2, we predicted the label of a sentence using k-nearest neighbors on its embedded vector. We also explored two ways to ensemble the three models: a) combining their predictions by majority vote; and b) training another ensemble model, based on the HDBSCAN clustering algorithm, that uses features from all three models to make decisions. Our best model achieved an F1 score of 0.753 on the BioCreative VII Track 1 test dataset.
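Two of the combination steps in this abstract, the k-nearest-neighbor labeling of an embedded sentence (model 2) and the majority vote over the three models, are simple enough to sketch. This is a minimal stand-in, not the authors' implementation: the transformer embeddings are replaced by tiny hand-written vectors and all labels are hypothetical.

```python
from collections import Counter

def majority_vote(preds_a, preds_b, preds_c):
    """Combine the per-sentence labels of three models by majority vote."""
    return [Counter(votes).most_common(1)[0][0]
            for votes in zip(preds_a, preds_b, preds_c)]

def knn_label(query_vec, embedded, labels, k=3):
    """Assign a sentence the majority label among the k nearest
    embedded training sentences (squared Euclidean distance)."""
    order = sorted(range(len(embedded)),
                   key=lambda i: sum((q - e) ** 2
                                     for q, e in zip(query_vec, embedded[i])))
    top = [labels[i] for i in order[:k]]
    return Counter(top).most_common(1)[0][0]

# Toy usage with 2-D "embeddings" and made-up labels.
votes = majority_vote([1, 0, 1], [1, 1, 0], [0, 1, 0])
label = knn_label([0.0, 0.0],
                  [[0.0, 1.0], [0.0, 2.0], [5.0, 5.0], [6.0, 6.0]],
                  ["DPI", "DPI", "none", "none"], k=3)
```

The clustering-based ensemble (option b) would replace `majority_vote` with a learned model over the three models' features, which is beyond this sketch.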


Author(s):  
Tieming Chen ◽  
Zhengqiu Weng ◽  
YunPeng Chen ◽  
Chenqiang Jin ◽  
Mingqi Lv ◽  
...  

2018 ◽  
Vol 8 (12) ◽  
pp. 2512 ◽  
Author(s):  
Ghouthi Boukli Hacene ◽  
Vincent Gripon ◽  
Nicolas Farrugia ◽  
Matthieu Arzel ◽  
Michel Jezequel

Deep learning-based methods have reached state-of-the-art performance, relying on large quantities of available data and computational power. Such methods remain ill-suited to a major open machine-learning problem: learning new classes and examples incrementally over time. Combining the outstanding performance of Deep Neural Networks (DNNs) with the flexibility of incremental learning techniques is a promising avenue of research. In this contribution, we introduce Transfer Incremental Learning using Data Augmentation (TILDA). TILDA is based on pre-trained DNNs as feature extractors, robust selection of feature vectors in subspaces using a nearest-class-mean technique, majority votes, and data augmentation at both the training and prediction stages. Experiments on challenging vision datasets demonstrate the ability of the proposed method to perform low-complexity incremental learning while achieving significantly better accuracy than existing incremental counterparts.
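The nearest-class-mean component at the heart of TILDA can be sketched as an incrementally updated classifier. This is a hedged, minimal version under stated assumptions: the pre-trained DNN feature extractor, subspace splitting, majority voting, and augmentation are all omitted, and the features and labels below are invented for illustration.

```python
class NearestClassMean:
    """Incremental nearest-class-mean classifier over feature vectors.
    Class means are maintained as running sums and counts, so new
    examples and entirely new classes can be added at any time."""

    def __init__(self):
        self.sums = {}    # label -> per-dimension feature sum
        self.counts = {}  # label -> number of examples seen

    def partial_fit(self, features, label):
        """Fold one feature vector into the running mean of its class."""
        if label not in self.sums:
            self.sums[label] = [0.0] * len(features)
            self.counts[label] = 0
        self.sums[label] = [s + f for s, f in zip(self.sums[label], features)]
        self.counts[label] += 1

    def predict(self, features):
        """Return the label whose class mean is closest (squared
        Euclidean distance) to the query vector."""
        def dist(label):
            mean = [s / self.counts[label] for s in self.sums[label]]
            return sum((f - m) ** 2 for f, m in zip(features, mean))
        return min(self.sums, key=dist)

ncm = NearestClassMean()
ncm.partial_fit([0.0, 0.0], "neg")
ncm.partial_fit([2.0, 2.0], "neg")
ncm.partial_fit([8.0, 8.0], "pos")   # a class added after training began
pred = ncm.predict([2.5, 2.5])
```

Because only sums and counts are stored per class, adding a class or an example is O(feature dimension), which is what makes the incremental setting cheap.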

