Combining Deep Learning and Argumentative Reasoning for the Analysis of Social Media Textual Content Using Small Data Sets

The use of social media has become a regular habit for many and has changed the way people interact with each other. In this article, we focus on analyzing whether news headlines support tweets and whether reviews are deceptive by analyzing the interaction or the influence that these texts have on the others, thus exploiting contextual information. Concretely, we define a deep learning method for relation–based argument mining to extract argumentative relations of attack and support. We then use this method for determining whether news articles support tweets, a useful task in fact-checking settings, where determining agreement toward a statement is a useful step toward determining its truthfulness. Furthermore, we use our method for extracting bipolar argumentation frameworks from reviews to help detect whether they are deceptive. We show experimentally that our method performs well in both settings. In particular, in the case of deception detection, our method contributes a novel argumentative feature that, when used in combination with other features in standard supervised classifiers, outperforms the latter even on small data sets.

Download Full-text

Toward Orbital-Free Density Functional Theory with Small Data Sets and Deep Learning

Journal of Chemical Theory and Computation ◽

10.1021/acs.jctc.1c00812 ◽

2022 ◽

Author(s):

Kevin Ryczko ◽

Sebastian J. Wetzel ◽

Roger G. Melko ◽

Isaac Tamblyn

Keyword(s):

Density Functional Theory ◽

Deep Learning ◽

Density Functional ◽

Small Data ◽

Data Sets ◽

Functional Theory ◽

Small Data Sets ◽

Orbital Free

Download Full-text

Deep transfer learning for aortic root dilation identification in 3D ultrasound images

Current Directions in Biomedical Engineering ◽

10.1515/cdbme-2018-0018 ◽

2018 ◽

Vol 4 (1) ◽

pp. 71-74 ◽

Cited By ~ 1

Author(s):

Jannis Hagenah ◽

Mattias Heinrich ◽

Floris Ernst

Keyword(s):

Deep Learning ◽

Transfer Learning ◽

Aortic Root ◽

Classification Accuracy ◽

3D Ultrasound ◽

Small Data ◽

Data Sets ◽

Ultrasound Images ◽

Small Data Sets ◽

Learned Features

AbstractPre-operative planning of valve-sparing aortic root reconstruction relies on the automatic discrimination of healthy and pathologically dilated aortic roots. The basis of this classification are features extracted from 3D ultrasound images. In previously published approaches, handcrafted features showed a limited classification accuracy. However, feature learning is insufficient due to the small data sets available for this specific problem. In this work, we propose transfer learning to use deep learning on these small data sets. For this purpose, we used the convolutional layers of the pretrained deep neural network VGG16 as a feature extractor. To simplify the problem, we only took two prominent horizontal slices throgh the aortic root, the coaptation plane and the commissure plane, into account by stitching the features of both images together and training a Random Forest classifier on the resulting feature vectors. We evaluated this method on a data set of 48 images (24 healthy, 24 dilated) using 10-fold cross validation. Using the deep learned features we could reach a classification accuracy of 84 %, which clearly outperformed the handcrafted features (71 % accuracy). Even though the VGG16 network was trained on RGB photos and for different classification tasks, the learned features are still relevant for ultrasound image analysis of aortic root pathology identification. Hence, transfer learning makes deep learning possible even on very small ultrasound data sets.

Download Full-text

Deep learning method for predicting the mechanical properties of aluminum alloys with small data sets

Materials Today Communications ◽

10.1016/j.mtcomm.2021.102570 ◽

2021 ◽

pp. 102570

Author(s):

Zhang Yu ◽

Sang Ye ◽

Yanli Sun ◽

Hucheng Zhao ◽

Xi-Qiao Feng

Keyword(s):

Mechanical Properties ◽

Deep Learning ◽

Aluminum Alloys ◽

Small Data ◽

Data Sets ◽

Learning Method ◽

Small Data Sets

Download Full-text

Investigating the impact of pre-processing techniques and pre-trained word embeddings in detecting Arabic health information on social media

Journal Of Big Data ◽

10.1186/s40537-021-00488-w ◽

2021 ◽

Vol 8 (1) ◽

Author(s):

Yahya Albalawi ◽

Jim Buckley ◽

Nikola S. Nikolov

Keyword(s):

Social Media ◽

Deep Learning ◽

Comprehensive Evaluation ◽

Classification Problem ◽

Data Sets ◽

Word Embeddings ◽

Data Set ◽

Lower Accuracy ◽

Health Related ◽

The Impact

AbstractThis paper presents a comprehensive evaluation of data pre-processing and word embedding techniques in the context of Arabic document classification in the domain of health-related communication on social media. We evaluate 26 text pre-processings applied to Arabic tweets within the process of training a classifier to identify health-related tweets. For this task we use the (traditional) machine learning classifiers KNN, SVM, Multinomial NB and Logistic Regression. Furthermore, we report experimental results with the deep learning architectures BLSTM and CNN for the same text classification problem. Since word embeddings are more typically used as the input layer in deep networks, in the deep learning experiments we evaluate several state-of-the-art pre-trained word embeddings with the same text pre-processing applied. To achieve these goals, we use two data sets: one for both training and testing, and another for testing the generality of our models only. Our results point to the conclusion that only four out of the 26 pre-processings improve the classification accuracy significantly. For the first data set of Arabic tweets, we found that Mazajak CBOW pre-trained word embeddings as the input to a BLSTM deep network led to the most accurate classifier with F1 score of 89.7%. For the second data set, Mazajak Skip-Gram pre-trained word embeddings as the input to BLSTM led to the most accurate model with F1 score of 75.2% and accuracy of 90.7% compared to F1 score of 90.8% achieved by Mazajak CBOW for the same architecture but with lower accuracy of 70.89%. Our results also show that the performance of the best of the traditional classifier we trained is comparable to the deep learning methods on the first dataset, but significantly worse on the second dataset.

Download Full-text

Classification of jujube defects in small data sets based on transfer learning

Neural Computing and Applications ◽

10.1007/s00521-021-05715-2 ◽

2021 ◽

Author(s):

Jianping Ju ◽

Hong Zheng ◽

Xiaohang Xu ◽

Zhongyuan Guo ◽

Zhaohui Zheng ◽

...

Keyword(s):

Transfer Learning ◽

Loss Function ◽

Training Model ◽

Parameter Distribution ◽

Test Accuracy ◽

Small Data ◽

Data Sets ◽

Data Set ◽

Small Data Sets

AbstractAlthough convolutional neural networks have achieved success in the field of image classification, there are still challenges in the field of agricultural product quality sorting such as machine vision-based jujube defects detection. The performance of jujube defect detection mainly depends on the feature extraction and the classifier used. Due to the diversity of the jujube materials and the variability of the testing environment, the traditional method of manually extracting the features often fails to meet the requirements of practical application. In this paper, a jujube sorting model in small data sets based on convolutional neural network and transfer learning is proposed to meet the actual demand of jujube defects detection. Firstly, the original images collected from the actual jujube sorting production line were pre-processed, and the data were augmented to establish a data set of five categories of jujube defects. The original CNN model is then improved by embedding the SE module and using the triplet loss function and the center loss function to replace the softmax loss function. Finally, the depth pre-training model on the ImageNet image data set was used to conduct training on the jujube defects data set, so that the parameters of the pre-training model could fit the parameter distribution of the jujube defects image, and the parameter distribution was transferred to the jujube defects data set to complete the transfer of the model and realize the detection and classification of the jujube defects. The classification results are visualized by heatmap through the analysis of classification accuracy and confusion matrix compared with the comparison models. The experimental results show that the SE-ResNet50-CL model optimizes the fine-grained classification problem of jujube defect recognition, and the test accuracy reaches 94.15%. The model has good stability and high recognition accuracy in complex environments.

Download Full-text

45 A permutation test for validation of genomic estimated breeding values

Journal of Animal Science ◽

10.1093/jas/skaa278.016 ◽

2020 ◽

Vol 98 (Supplement_4) ◽

pp. 8-9

Author(s):

Zahra Karimi ◽

Brian Sullivan ◽

Mohsen Jafarikia

Keyword(s):

Permutation Test ◽

Breeding Value ◽

Small Data ◽

Data Sets ◽

Type I ◽

Future Performance ◽

Small Data Sets ◽

Estimated Breeding Value ◽

Estimated Breeding Values ◽

Top 40

Abstract Previous studies have shown that the accuracy of Genomic Estimated Breeding Value (GEBV) as a predictor of future performance is higher than the traditional Estimated Breeding Value (EBV). The purpose of this study was to estimate the potential advantage of selection on GEBV for litter size (LS) compared to selection on EBV in the Canadian swine dam line breeds. The study included 236 Landrace and 210 Yorkshire gilts born in 2017 which had their first farrowing after 2017. GEBV and EBV for LS were calculated with data that was available at the end of 2017 (GEBV2017 and EBV2017, respectively). De-regressed EBV for LS in July 2019 (dEBV2019) was used as an adjusted phenotype. The average dEBV2019 for the top 40% of sows based on GEBV2017 was compared to the average dEBV2019 for the top 40% of sows based on EBV2017. The standard error of the estimated difference for each breed was estimated by comparing the average dEBV2019 for repeated random samples of two sets of 40% of the gilts. In comparison to the top 40% ranked based on EBV2017, ranking based on GEBV2017 resulted in an extra 0.45 (±0.29) and 0.37 (±0.25) piglets born per litter in Landrace and Yorkshire replacement gilts, respectively. The estimated Type I errors of the GEBV2017 gain over EBV2017 were 6% and 7% in Landrace and Yorkshire, respectively. Considering selection of both replacement boars and replacement gilts using GEBV instead of EBV can translate into increased annual genetic gain of 0.3 extra piglets per litter, which would more than double the rate of gain observed from typical EBV based selection. The permutation test for validation used in this study appears effective with relatively small data sets and could be applied to other traits, other species and other prediction methods.

Download Full-text

A Comparison Study of Mahalanobis-Taguchi System and Neural Network for Multivariate Pattern Recognition

Design Engineering, Parts A and B ◽

10.1115/imece2005-80029 ◽

2005 ◽

Cited By ~ 10

Author(s):

Jungeui Hong ◽

Elizabeth A. Cudney ◽

Genichi Taguchi ◽

Rajesh Jugulum ◽

Kioumars Paryani ◽

...

Keyword(s):

Neural Network ◽

Small Data ◽

Data Sets ◽

Comparison Study ◽

Data Set ◽

Set Size ◽

Breast Cancer Study ◽

Discriminant Ability ◽

Small Data Sets ◽

Multivariate Pattern

The Mahalanobis-Taguchi System is a diagnosis and predictive method for analyzing patterns in multivariate cases. The goal of this study is to compare the ability of the Mahalanobis-Taguchi System and a neural network to discriminate using small data sets. We examine the discriminant ability as a function of data set size using an application area where reliable data is publicly available. The study uses the Wisconsin Breast Cancer study with nine attributes and one class.

Download Full-text