Definition Extraction from Generic and Mathematical Domains with Deep Ensemble Learning

Natalia Vanetik; Marina Litvak

doi:10.3390/math9192502

Improving Land Cover Classification Using Genetic Programming for Feature Construction

Remote Sensing ◽

10.3390/rs13091623 ◽

2021 ◽

Vol 13 (9) ◽

pp. 1623

Author(s):

João E. Batista ◽

Ana I. R. Cabral ◽

Maria J. P. Vasconcelos ◽

Leonardo Vanneschi ◽

Sara Silva

Keyword(s):

Land Cover ◽

Genetic Programming ◽

Satellite Images ◽

State Of The Art ◽

Binary Classification ◽

Feature Construction ◽

Classification Problems ◽

Construction Methods ◽

Box Models

Genetic programming (GP) is a powerful machine learning (ML) algorithm that can produce readable white-box models. Although successfully used for solving an array of problems in different scientific areas, GP is still not well known in the field of remote sensing. The M3GP algorithm, a variant of the standard GP algorithm, performs feature construction by evolving hyperfeatures from the original ones. In this work, we use the M3GP algorithm on several sets of satellite images over different countries to create hyperfeatures from satellite bands to improve the classification of land cover types. We add the evolved hyperfeatures to the reference datasets and observe a significant improvement of the performance of three state-of-the-art ML algorithms (decision trees, random forests, and XGBoost) on multiclass classifications and no significant effect on the binary classifications. We show that adding the M3GP hyperfeatures to the reference datasets brings better results than adding the well-known spectral indices NDVI, NDWI, and NBR. We also compare the performance of the M3GP hyperfeatures in the binary classification problems with those created by other feature construction methods such as FFX and EFS.
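The abstract above names NDVI, NDWI, and NBR but does not give their formulas. As a rough illustration only, the sketch below uses the commonly cited normalized-difference definitions and appends the three indices to a pixel's band values, mirroring the idea of extending a reference dataset with extra features. The band names, the sample reflectance values, and the choice of the green/NIR form of NDWI are assumptions, not details taken from the paper.

```python
def normalized_difference(a, b):
    """Return (a - b) / (a + b), or 0.0 when the denominator is zero."""
    total = a + b
    return (a - b) / total if total != 0 else 0.0


def spectral_indices(green, red, nir, swir2):
    """Compute three common spectral indices from band reflectances."""
    return {
        # NDVI: near-infrared versus red
        "NDVI": normalized_difference(nir, red),
        # NDWI: green versus near-infrared (assumed variant; an NIR/SWIR form also exists)
        "NDWI": normalized_difference(green, nir),
        # NBR: near-infrared versus shortwave infrared
        "NBR": normalized_difference(nir, swir2),
    }


def add_indices(pixel):
    """Return a copy of a band dictionary extended with the three indices."""
    extended = dict(pixel)
    extended.update(
        spectral_indices(pixel["green"], pixel["red"], pixel["nir"], pixel["swir2"])
    )
    return extended


# Hypothetical reflectance values for a single vegetated pixel.
pixel = {"green": 0.08, "red": 0.05, "nir": 0.45, "swir2": 0.15}
features = add_indices(pixel)
```

The extended dictionary keeps the original bands next to the derived indices, so a downstream classifier would see both.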


Learning transferable deep convolutional neural networks for the classification of bacterial virulence factors

Bioinformatics ◽

10.1093/bioinformatics/btaa230 ◽

2020 ◽

Vol 36 (12) ◽

pp. 3693-3702 ◽

Cited By ~ 2

Author(s):

Dandan Zheng ◽

Guansong Pang ◽

Bo Liu ◽

Lihong Chen ◽

Jian Yang

Keyword(s):

Virulence Factors ◽

State Of The Art ◽

Binary Classification ◽

Network Models ◽

Bacterial Virulence ◽

Supplementary Information ◽

Deep Convolutional Neural Networks ◽

Neural Network Models ◽

Auxiliary Data

Abstract. Motivation: Identification of virulence factors (VFs) is critical to the elucidation of bacterial pathogenesis and prevention of related infectious diseases. Current computational methods for VF prediction focus on binary classification or involve only several class(es) of VFs with sufficient samples. However, thousands of VF classes are present in real-world scenarios, and many of them only have a very limited number of samples available. Results: We first construct a large VF dataset, covering 3446 VF classes with 160 495 sequences, and then propose deep convolutional neural network models for VF classification. We show that (i) for common VF classes with sufficient samples, our models can achieve state-of-the-art performance with an overall accuracy of 0.9831 and an F1-score of 0.9803; (ii) for uncommon VF classes with limited samples, our models can learn transferable features from auxiliary data and achieve good performance with accuracy ranging from 0.9277 to 0.9512 and F1-score ranging from 0.9168 to 0.9446 when combined with different predefined features, outperforming traditional classifiers by 1–13% in accuracy and by 1–16% in F1-score. Availability and implementation: All of our datasets are made publicly available at http://www.mgc.ac.cn/VFNet/, and the source code of our models is publicly available at https://github.com/zhengdd0422/VFNet. Supplementary information: Supplementary data are available at Bioinformatics online.


Improving Land Cover Classification Using Genetic Programming for Feature Construction

10.20944/preprints202010.0168.v2 ◽

2020 ◽

Author(s):

João Batista ◽

Ana Cabral ◽

Maria Vasconcelos ◽

Leonardo Vanneschi ◽

Sara Silva

Keyword(s):

Land Cover ◽

Genetic Programming ◽

Satellite Images ◽

State Of The Art ◽

Binary Classification ◽

Feature Construction ◽

Classification Problems ◽

Construction Methods ◽

Box Models

Genetic Programming (GP) is a powerful Machine Learning (ML) algorithm that can produce readable white-box models. Although successfully used for solving an array of problems in different scientific areas, GP is still not well known in Remote Sensing. The M3GP algorithm, a variant of the standard GP algorithm, performs Feature Construction by evolving hyper-features from the original ones. In this work, we use the M3GP algorithm on several sets of satellite images over different countries to create hyper-features from satellite bands to improve the classification of land cover types. We add the evolved hyper-features to the reference datasets and observe a significant improvement of the performance of three state-of-the-art ML algorithms (Decision Trees, Random Forests and XGBoost) on multiclass classifications and no significant effect on the binary classifications. We show that adding the M3GP hyper-features to the reference datasets brings better results than adding the well-known spectral indices NDVI, NDWI and NBR. We also compare the performance of the M3GP hyper-features in the binary classification problems with those created by other Feature Construction methods like FFX and EFS.


ECG Heartbeat Classification Based on an Improved ResNet-18 Model

Computational and Mathematical Methods in Medicine ◽

10.1155/2021/6649970 ◽

2021 ◽

Vol 2021 ◽

pp. 1-13

Author(s):

Enbiao Jing ◽

Haiyang Zhang ◽

ZhiGang Li ◽

Yazhi Liu ◽

Zhanlin Ji ◽

...

Keyword(s):

State Of The Art ◽

Classification Performance ◽

Classification Models ◽

Heartbeat Classification ◽

Ecg Signals ◽

Residual Structure ◽

Proposed Model ◽

Model Training ◽

Electrocardiogram Ecg

Based on a convolutional neural network (CNN) approach, this article proposes an improved ResNet-18 model for heartbeat classification of electrocardiogram (ECG) signals through appropriate model training and parameter adjustment. Due to the unique residual structure of the model, the utilized CNN layered structure can be deepened in order to achieve better classification performance. The results of applying the proposed model to the MIT-BIH arrhythmia database demonstrate that the model achieves higher accuracy (96.50%) compared to other state-of-the-art classification models, while specifically for the ventricular ectopic heartbeat class, its sensitivity is 93.83% and the precision is 97.44%.
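The abstract above reports sensitivity and precision for one heartbeat class. As a reminder of how such per-class figures are conventionally derived, here is a minimal sketch that computes precision, sensitivity (recall), and F1 from confusion counts. The function name and the example counts are hypothetical and are not taken from the paper's results.

```python
def binary_metrics(tp, fp, fn):
    """Return (precision, sensitivity, f1) for one class from confusion counts.

    tp: true positives, fp: false positives, fn: false negatives.
    Each ratio falls back to 0.0 when its denominator is zero.
    """
    precision = tp / (tp + fp) if tp + fp else 0.0
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + sensitivity
    # F1 is the harmonic mean of precision and sensitivity.
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return precision, sensitivity, f1


# Made-up counts for illustration only.
precision, sensitivity, f1 = binary_metrics(tp=90, fp=10, fn=30)
```

For a multiclass task such as heartbeat classification, the same function would be applied to each class in turn, treating that class as positive and all others as negative.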


Improving Land Cover Classification Using Genetic Programming for Feature Construction

10.20944/preprints202012.0471.v1 ◽

2020 ◽

Author(s):

João E. Batista ◽

Ana I. R. Cabral ◽

Maria J. P. Vasconcelos ◽

Leonardo Vanneschi ◽

Sara Silva

Keyword(s):

Land Cover ◽

Genetic Programming ◽

Satellite Images ◽

State Of The Art ◽

Binary Classification ◽

Feature Construction ◽

Classification Problems ◽

Construction Methods ◽

Box Models

Genetic Programming (GP) is a powerful Machine Learning (ML) algorithm that can produce readable white-box models. Although successfully used for solving an array of problems in different scientific areas, GP is still not well known in Remote Sensing. The M3GP algorithm, a variant of the standard GP algorithm, performs Feature Construction by evolving hyper-features from the original ones. In this work, we use the M3GP algorithm on several sets of satellite images over different countries to create hyper-features from satellite bands to improve the classification of land cover types. We add the evolved hyper-features to the reference datasets and observe a significant improvement of the performance of three state-of-the-art ML algorithms (Decision Trees, Random Forests and XGBoost) on multiclass classifications and no significant effect on the binary classifications. We show that adding the M3GP hyper-features to the reference datasets brings better results than adding the well-known spectral indices NDVI, NDWI and NBR. We also compare the performance of the M3GP hyper-features in the binary classification problems with those created by other Feature Construction methods like FFX and EFS.


Binary classification of Lupus scientific articles applying deep ensemble model on text data

2019 Seventh International Conference on Digital Information Processing and Communications (ICDIPC) ◽

10.1109/icdipc.2019.8723787 ◽

2019 ◽

Author(s):

Maryam Samami ◽

Elham Mousazade soure

Keyword(s):

Binary Classification ◽

Ensemble Model ◽

Text Data


GENERATING SYNTHETIC 3D POINT SEGMENTS FOR IMPROVED CLASSIFICATION OF MOBILE LIDAR POINT CLOUDS

ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences ◽

10.5194/isprs-archives-xliii-b2-2021-139-2021 ◽

2021 ◽

Vol XLIII-B2-2021 ◽

pp. 139-144

Author(s):

S. A. Chitnis ◽

Z. Huang ◽

K. Khoshelham

Keyword(s):

Semantic Information ◽

State Of The Art ◽

Point Clouds ◽

Classification Performance ◽

Training Data ◽

Lidar Data ◽

Real Point ◽

The Road ◽

Training Samples

Abstract. Mobile lidar point clouds are commonly used for 3D mapping of road environments as they provide a rich, highly detailed geometric representation of objects on and around the road. However, raw lidar point clouds lack semantic information about the type of objects, which is necessary for various applications. Existing methods for the classification of objects in mobile lidar data, including state-of-the-art deep learning methods, achieve relatively low accuracies, and a primary reason for this under-performance is the inadequacy of available 3D training samples to sufficiently train deep networks. In this paper, we propose a generative model for creating synthetic 3D point segments that can aid in improving the classification performance of mobile lidar point clouds. We train a 3D Adversarial Autoencoder (3dAAE) to generate synthetic point segments that exhibit a high resemblance to and share similar geometric features with real point segments. We evaluate the performance of a PointNet-like classifier trained with and without the synthetic point segments. The evaluation results support our hypothesis that training a classifier with training data augmented with synthetic samples leads to significant improvement in the classification performance. Specifically, our model achieves an F1 score of 0.94 for vehicles and pedestrians and 1.00 for traffic signs.


How much is a cow like a meow? A novel database of human judgements of audiovisual semantic relatedness

10.31234/osf.io/7h82c ◽

2021 ◽

Author(s):

Kira Wegner-Clemens ◽

George Law Malcolm ◽

Sarah Shomstein

Keyword(s):

Semantic Information ◽

Binary Classification ◽

Semantic Relatedness ◽

Research Community ◽

Audiovisual Stimulus ◽

Text Corpora ◽

Similarity Ratings ◽

Similarity Judgments ◽

Similarity Judgements

Semantic information about objects, events, and scenes influences how humans perceive, interact with, and navigate the world. Most evidence in support of semantic influence on cognition has been garnered from research conducted with an isolated modality (e.g., vision, audition). However, the influence of semantic information has not yet been extensively studied in multisensory environments, potentially because of the difficulty in quantifying semantic relatedness. Past studies have primarily relied on either a simplified binary classification of semantic relatedness based on category or on algorithmic values based on text corpora rather than human perceptual experience and judgement. With the aim of accelerating research into multisensory semantics, we created a constrained audiovisual stimulus set and derived similarity ratings between items within three categories (animals, instruments, household items). A set of 140 participants provided similarity judgements between sounds and images. Participants either heard a sound (e.g., a meow) and judged which of two pictures of objects (e.g., a picture of a dog and a duck) it was more similar to, or saw a picture (e.g., a picture of a duck) and selected which of two sounds it was more similar to (e.g., a bark or a meow). Judgements were then used to calculate similarity values of any given cross-modal pair. The derived and reported similarity judgements reflect a range of semantic similarities across three categories and items, and highlight similarities and differences among similarity judgements between modalities. We make the derived similarity values available in a database format to the research community to be used as a measure of semantic relatedness in cognitive psychology experiments, enabling more robust studies of semantics in audiovisual environments.


A new dataset of dog breed images and a benchmark for fine-grained classification

Computational Visual Media ◽

10.1007/s41095-020-0184-6 ◽

2020 ◽

Vol 6 (4) ◽

pp. 477-487

Author(s):

Ding-Nan Zou ◽

Song-Hai Zhang ◽

Tai-Jiang Mu ◽

Min Zhang

Keyword(s):

Real World ◽

State Of The Art ◽

Whole Body ◽

Classification Models ◽

Neural Models ◽

Fine Grained ◽

Image Dataset ◽

Dog Breed ◽

Bounding Boxes

Abstract. In this paper, we introduce an image dataset for fine-grained classification of dog breeds: the Tsinghua Dogs Dataset. It is currently the largest dataset for fine-grained classification of dogs, including 130 dog breeds and 70,428 real-world images. It has only one dog in each image and provides annotated bounding boxes for the whole body and head. In comparison to previous similar datasets, it contains more breeds and more carefully chosen images for each breed. The diversity within each breed is greater, with between 200 and 7000+ images for each breed. Annotation of the whole body and head makes the dataset suitable not only for improving fine-grained image classification models based on overall features, but also for those locating local informative parts. We show that the dataset provides a tough challenge by benchmarking several state-of-the-art deep neural models. The dataset is available for academic purposes at https://cg.cs.tsinghua.edu.cn/ThuDogs/.


Automatic Detection and Classification of Cough Events Based on Deep Learning

Current Directions in Biomedical Engineering ◽

10.1515/cdbme-2020-3083 ◽

2020 ◽

Vol 6 (3) ◽

pp. 322-325

Author(s):

Seyed Amir Hossein Tabatabaei ◽

Gabriela Augustinov ◽

Volker Gross ◽

Keywan Sohrabi ◽

Patrick Fischer ◽

...

Keyword(s):

Deep Learning ◽

Classification Accuracy ◽

Operating Characteristic ◽

Binary Classification ◽

Classification Model ◽

Learning Approach ◽

Classification Models ◽

Recording Channels ◽

Cough Sound

Abstract. In this paper, a deep learning approach for classification of cough sound segments is presented. The architecture of the network is based on a pre-trained network, and the spectrogram images of three recording channels have been extracted for training the network. The classification accuracy based on three recording channels is 92% for a binary classification model, and the network converges fast. Two classification models based on binary and multi-class problems are proposed. Relevant classification parameters including the Receiver Operating Characteristic (ROC) curve are reported.
