Ani-GIFs: A Benchmark Dataset for Domain Generalization of Action Recognition from GIFs

Deep learning models perform remarkably well on a given task under the assumption that training and test data come from the same distribution. However, this assumption is generally violated in practice, mainly due to differences in data acquisition techniques and the lack of information about the underlying source of new data. Domain generalization targets the ability to generalize to test data from an unseen domain; while this problem is well studied for images, such studies are significantly lacking for spatiotemporal visual content such as videos and GIFs. This is due to (1) the challenging nature of misaligned temporal features and the varying appearance and motion of actors and actions across domains, and (2) spatiotemporal datasets being laborious to collect and annotate for multiple domains. We collect and present the first synthetic video dataset of animated GIFs for domain generalization, Ani-GIFs, which is used to study the domain gap between videos and GIFs, and between animated and real GIFs, for the task of action recognition. We provide a training and testing setting for Ani-GIFs, and extend two domain generalization baseline approaches, based on data augmentation and explainability, to the spatiotemporal domain to catalyze research in this direction.


Levenshtein Augmentation Improves Performance of SMILES Based Deep-Learning Synthesis Prediction

10.26434/chemrxiv.12562121 ◽

2020 ◽

Author(s):

Dean Sumner ◽

Jiazhen He ◽

Amol Thakkar ◽

Ola Engkvist ◽

Esben Jannik Bjerrum

Keyword(s):

Neural Networks ◽

Pattern Recognition ◽

Deep Learning ◽

Recurrent Neural Networks ◽

Data Augmentation ◽

State Of The Art ◽

Sequence Similarity ◽

Learning Models ◽

Underlying Network

SMILES randomization, a form of data augmentation, has previously been shown to increase the performance of deep learning models compared to non-augmented baselines. Here, we propose a novel data augmentation method we call “Levenshtein augmentation”, which considers local SMILES sub-sequence similarity between reactants and their respective products when creating training pairs. The performance of Levenshtein augmentation was tested using two state-of-the-art models: a transformer and a sequence-to-sequence recurrent neural network with attention. Levenshtein augmentation demonstrated increased performance over both non-augmented data and conventional SMILES-randomization-augmented data when used for training of baseline models. Furthermore, Levenshtein augmentation seemingly results in what we define as “attentional gain”, an enhancement in the pattern recognition capabilities of the underlying network to molecular motifs.
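As a rough illustration of the idea, the sketch below scores candidate reactant/product SMILES strings by edit distance and keeps the closest pair as a training example. It is not the authors' implementation: it uses a global Levenshtein distance rather than the local sub-sequence similarity the abstract describes, it assumes the randomized SMILES variants are already available as plain strings, and the names `levenshtein` and `pick_closest_pair` are hypothetical.

```python
from typing import List, Tuple


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance between two strings."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def pick_closest_pair(reactant_variants: List[str],
                      product_variants: List[str]) -> Tuple[str, str, int]:
    """Return the (reactant, product, distance) pair with the smallest distance.

    Assumption: the variants are equivalent SMILES spellings of the same
    molecules, produced elsewhere (for example by SMILES randomization).
    """
    candidates = [(levenshtein(r, p), r, p)
                  for r in reactant_variants
                  for p in product_variants]
    distance, reactant, product = min(candidates, key=lambda t: t[0])
    return reactant, product, distance


# Two spellings of ethanol as reactant, one spelling of acetaldehyde as product.
pair = pick_closest_pair(["CCO", "OCC"], ["CC=O"])
print(pair)
```

Selecting the most similar string pair keeps the unchanged part of the molecule aligned character by character between input and output, which is the intuition behind pairing by sequence similarity.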


A Hybrid Optimized LSTM Models for Human Activity Recognition with IOT Devices

International Journal of Advanced Research in Science, Communication and Technology ◽

10.48175/ijarsct-2326 ◽

2021 ◽

pp. 182-189

Author(s):

S. Arokiaraj ◽

Dr. N. Viswanathan

Keyword(s):

Deep Learning ◽

Internet Of Things ◽

Short Term Memory ◽

The Other ◽

Learning Models ◽

Computational Overhead ◽

Temporal Features ◽

Human Movements ◽

Proposed Model ◽

Iot Devices

With the advent of the Internet of Things (IoT), human activity (HA) recognition has contributed many applications in health care, in terms of diagnosis and clinical processes. These devices must be aware of human movements to provide better aid in clinical applications as well as in the user's daily activity. In addition, with machine and deep learning algorithms, HA recognition systems have significantly improved in recognition accuracy. However, most existing models need improvement in terms of accuracy and computational overhead. In this research paper, we propose a BAT-optimized Long Short-Term Memory (BAT-LSTM) model for effective recognition of human activities using real-time IoT systems. The data are collected by implanting the IoT devices invasively. The proposed BAT-LSTM is then deployed to extract the temporal features, which are then used to classify HA. Nearly 10,0000 data samples were collected and used to evaluate the proposed model. To validate the proposed framework, accuracy, precision, recall, specificity, and F1-score are chosen as metrics, and a comparison is made with other state-of-the-art deep learning models. The findings show that the proposed model outperforms the other learning models and is suitable for HA recognition.
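The five evaluation metrics named in the abstract can all be derived from confusion-matrix counts. The sketch below shows the standard definitions for a single class treated one-vs-rest; the function name `classification_metrics` and the example counts are illustrative assumptions, not values from the paper.

```python
from typing import Dict


def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Compute standard metrics from binary (one-vs-rest) confusion counts."""

    def safe_div(numerator: float, denominator: float) -> float:
        # Return 0.0 instead of raising when a denominator is empty.
        return numerator / denominator if denominator else 0.0

    return {
        "accuracy": safe_div(tp + tn, tp + fp + tn + fn),
        "precision": safe_div(tp, tp + fp),
        "recall": safe_div(tp, tp + fn),        # also called sensitivity
        "specificity": safe_div(tn, tn + fp),
        "f1": safe_div(2 * tp, 2 * tp + fp + fn),
    }


# Hypothetical counts for one activity class.
metrics = classification_metrics(tp=40, fp=10, tn=40, fn=10)
print(metrics)
```

For a multi-class activity recognizer, these per-class values would typically be averaged across classes; the abstract does not state which averaging scheme was used.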


Data Augmentation for Improving Deep Learning Models in Building Inspections or Postdisaster Evaluation

Journal of Performance of Constructed Facilities ◽

10.1061/(asce)cf.1943-5509.0001594 ◽

2021 ◽

Vol 35 (4) ◽

Author(s):

Samuel Leach ◽

Yunhe Xue ◽

Rahul Sridhar ◽

Stephanie Paal ◽

Zhangyang Wang ◽

...

Keyword(s):

Deep Learning ◽

Data Augmentation ◽

Learning Models


Distinct Two-Stream Convolutional Networks for Human Action Recognition in Videos Using Segment-Based Temporal Modeling

Data ◽

10.3390/data5040104 ◽

2020 ◽

Vol 5 (4) ◽

pp. 104

Author(s):

Ashok Sarabu ◽

Ajit Kumar Santra

Keyword(s):

Action Recognition ◽

Data Augmentation ◽

Main Idea ◽

Human Action Recognition ◽

Human Action ◽

Great Success ◽

Temporal Modeling ◽

Convolutional Networks ◽

Temporal Features ◽

Augmentation Techniques

The two-stream convolutional neural network (CNN) has proven a great success in action recognition in videos. The main idea is to train two CNNs to learn spatial and temporal features separately; the two scores are then combined to obtain the final scores. In the literature, we observed that most methods use similar CNNs for the two streams. In this paper, we design a two-stream CNN architecture with different CNNs for the two streams to learn spatial and temporal features. Temporal Segment Networks (TSN) are applied to retrieve long-range temporal features and to differentiate similar types of sub-actions in videos. Data augmentation techniques are employed to prevent over-fitting. Advanced cross-modal pre-training is discussed and introduced into the proposed architecture to enhance the accuracy of action recognition. The proposed two-stream model is evaluated on two challenging action recognition datasets: HMDB-51 and UCF-101. The results show a significant performance increase, and the proposed architecture outperforms the existing methods.
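Two mechanics mentioned in the abstract lend themselves to a small sketch: splitting a video into equal segments and taking one frame from each (in the spirit of TSN), and combining the per-class scores of the spatial and temporal streams. The code below is a simplified illustration under stated assumptions: it picks the centre frame of each segment deterministically, and the fusion weight of 1.5 for the temporal stream is an arbitrary illustrative value, not one reported by the authors. All function names are hypothetical.

```python
from typing import List, Sequence


def segment_frame_indices(num_frames: int, num_segments: int) -> List[int]:
    """Return the centre frame index of each of `num_segments` equal segments."""
    if num_segments <= 0 or num_frames < num_segments:
        raise ValueError("need at least one frame per segment")
    segment_length = num_frames / num_segments
    return [int(segment_length * k + segment_length / 2)
            for k in range(num_segments)]


def fuse_scores(spatial: Sequence[float], temporal: Sequence[float],
                temporal_weight: float = 1.5) -> List[float]:
    """Late fusion: weighted sum of per-class scores from the two streams."""
    if len(spatial) != len(temporal):
        raise ValueError("both streams must score the same classes")
    return [s + temporal_weight * t for s, t in zip(spatial, temporal)]


def predict_class(scores: Sequence[float]) -> int:
    """Index of the highest fused score."""
    return max(range(len(scores)), key=lambda i: scores[i])


# A 90-frame clip sampled with three segments, then a two-class fusion.
print(segment_frame_indices(90, 3))
fused = fuse_scores([0.6, 0.4], [0.2, 0.8])
print(predict_class(fused))
```

In this toy example the spatial stream alone favours class 0, while the fused scores favour class 1, which shows how the temporal stream can change the final decision.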


Deploying Machine and Deep Learning Models for Efficient Data-Augmented Detection of COVID-19 Infections

Viruses ◽

10.3390/v12070769 ◽

2020 ◽

Vol 12 (7) ◽

pp. 769 ◽

Cited By ~ 12

Author(s):

Ahmed Sedik ◽

Abdullah M Iliyasu ◽

Basma Abd El-Rahiem ◽

Mohammed E. Abdel Samea ◽

Asmaa Abdel-Raheem ◽

...

Keyword(s):

Deep Learning ◽

Data Augmentation ◽

Short Term Memory ◽

Machine Learning Techniques ◽

Detection Accuracy ◽

Testing Time ◽

Learning Models ◽

Accurate Identification ◽

Infected People ◽

Corona Virus

This generation faces existential threats because of the global assault of the novel Corona virus 2019 (i.e., COVID-19). With more than thirteen million infected and nearly 600,000 fatalities in 188 countries/regions, COVID-19 is the worst calamity since World War II. These misfortunes are traced to various reasons, including late detection of latent or asymptomatic carriers, migration, and inadequate isolation of infected people. This makes detection, containment, and mitigation global priorities to contain exposure via quarantine, lockdowns, work/stay at home, and social distancing that are focused on “flattening the curve”. While medical and healthcare givers are at the frontline in the battle against COVID-19, it is a crusade for all of humanity. Meanwhile, machine and deep learning models have been revolutionary across numerous domains and applications, and their potency has been exploited to birth numerous state-of-the-art technologies utilised in disease detection, diagnosis, and treatment. Despite this potential, machine and, particularly, deep learning models are data sensitive, because their effectiveness depends on the availability and reliability of data. The unavailability of such data hinders efforts of engineers and computer scientists to fully contribute to the ongoing assault against COVID-19. Faced with a calamity on one side and absence of reliable data on the other, this study presents two data-augmentation models to enhance learnability of the Convolutional Neural Network (CNN) and the Convolutional Long Short-Term Memory (ConvLSTM)-based deep learning models (DADLMs) and, by doing so, boost the accuracy of COVID-19 detection. Experimental results reveal improvement in terms of accuracy of detection, logarithmic loss, and testing time relative to DLMs devoid of such data augmentation. Furthermore, average increases of 4% to 11% in COVID-19 detection accuracy are reported in favour of the proposed data-augmented deep learning models relative to the machine learning techniques. Therefore, the proposed algorithm is effective in performing a rapid and consistent Corona virus diagnosis that is primarily aimed at assisting clinicians in making accurate identification of the virus.


Focused small-scale fisheries as complex systems using deep learning models

Latin American Journal of Aquatic Research ◽

10.3856/vol49-issue2-fulltext-2622 ◽

2021 ◽

Vol 49 (2) ◽

pp. 342-353

Author(s):

Ricardo Cavieses-Núñez ◽

Miguel A. Ojeda-Ruiz ◽

Alfredo Flores-Irigollen ◽

Elvia Marín-Monroy ◽

Mirtha Lbañez-Lucero ◽

...

Keyword(s):

Deep Learning ◽

Environmental Variables ◽

Predictive Power ◽

Small Scale ◽

Learning Models ◽

Small Scale Fisheries ◽

Productive Activities ◽

Lack Of Information ◽

Management Measures ◽

The Face

Small-scale fishing (SSF) is a relevant economic activity worldwide, so sustainable development will be essential to assure its contributions to food security, poverty alleviation, and healthy ecosystems. However, the wide diversity of fisheries, their complexity, and the lack of information limit the ability to propose and evaluate management measures and plans and their effects on communities and other productive activities. The state of Baja California Sur, Mexico, our case study, ranks third in national fisheries production, possesses SSF fleets, and has a wide variety of fisheries that share fishing areas, fishing seasons, and operating units. In this work, treating SSF as a complex system, deep learning models (DLM) were proposed to forecast the catch volumes, evaluate each input variable's importance, and find interactions. Environmental variables and fishery catches were tested in the DLM to estimate their predictive power. Different DLM structures and parameters were used to find the optimal model. The variables with the highest predictive power are the environmental variables, with R = 0.90. Moreover, when they are used in combination with the catches from other areas, a performance of R = 0.95 is obtained. Using only the catches, the model has R = 0.81. This model allows the use of variables that indirectly affect the system and provides a useful tool to assess a complex system's state in the face of disturbances in its variables.


Efficient Pneumonia Detection in Chest Xray Images Using Deep Transfer Learning

Diagnostics ◽

10.3390/diagnostics10060417 ◽

2020 ◽

Vol 10 (6) ◽

pp. 417 ◽

Cited By ~ 5

Author(s):

Mohammad Farukh Hashmi ◽

Satyarth Katiyar ◽

Avinash G Keskar ◽

Neeraj Dhanraj Bokde ◽

Zong Woo Geem

Keyword(s):

Deep Learning ◽

Transfer Learning ◽

Data Augmentation ◽

Medical Center ◽

Training Dataset ◽

Test Accuracy ◽

X Rays ◽

Learning Models ◽

Novel Approach ◽

Unseen Data

Pneumonia causes the death of around 700,000 children every year and affects 7% of the global population. Chest X-rays are primarily used for the diagnosis of this disease. However, even for a trained radiologist, examining chest X-rays is a challenging task, and there is a need to improve diagnostic accuracy. In this work, an efficient model for the detection of pneumonia, trained on digital chest X-ray images, is proposed, which could aid radiologists in their decision-making process. A novel approach based on a weighted classifier is introduced, which combines the weighted predictions from state-of-the-art deep learning models such as ResNet18, Xception, InceptionV3, DenseNet121, and MobileNetV3 in an optimal way. This is a supervised learning approach in which the network predicts the result based on the quality of the dataset used. Transfer learning is used to fine-tune the deep learning models to obtain higher training and validation accuracy. Partial data augmentation techniques are employed to increase the training dataset in a balanced way. The proposed weighted classifier is able to outperform all the individual models. Finally, the model is evaluated not only in terms of test accuracy but also in terms of the AUC score. The final proposed weighted classifier model achieves a test accuracy of 98.43% and an AUC score of 99.76 on the unseen data from the Guangzhou Women and Children’s Medical Center pneumonia dataset. Hence, the proposed model can be used for a quick diagnosis of pneumonia and can aid radiologists in the diagnosis process.
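A weighted classifier of this kind can be pictured as a weighted average of the class probabilities produced by each model. The sketch below is only an illustration: the probabilities and weights are made-up values, the abstract does not say how the optimal weights are found, and the name `weighted_ensemble` is hypothetical.

```python
from typing import List, Sequence


def weighted_ensemble(probabilities: Sequence[Sequence[float]],
                      weights: Sequence[float]) -> List[float]:
    """Weighted average of per-model class probabilities.

    probabilities[m][c] is model m's probability for class c;
    weights[m] is the (non-negative) weight given to model m.
    """
    if len(probabilities) != len(weights):
        raise ValueError("one weight per model is required")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    num_classes = len(probabilities[0])
    return [sum(w * p[c] for w, p in zip(weights, probabilities)) / total
            for c in range(num_classes)]


# Hypothetical outputs of three models for classes [normal, pneumonia].
model_outputs = [[0.70, 0.30], [0.40, 0.60], [0.35, 0.65]]

equal = weighted_ensemble(model_outputs, [1.0, 1.0, 1.0])
tuned = weighted_ensemble(model_outputs, [0.6, 0.2, 0.2])
print(equal, tuned)
```

With equal weights the ensemble leans toward the second class, whereas giving most of the weight to the first model flips the decision, which is why the choice of weights matters for such a classifier.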


Object and Human Action Recognition From Video Using Deep Learning Models

2019 IEEE International Conference on Signals and Systems (ICSigSys) ◽

10.1109/icsigsys.2019.8811081 ◽

2019 ◽

Author(s):

Padmeswari Nandiya Soentanto ◽

Janson Hendryli ◽

Dyah E. Herwindiati

Keyword(s):

Deep Learning ◽

Action Recognition ◽

Human Action Recognition ◽

Human Action ◽

Learning Models


Data Augmentation in Deep Learning-Based Fusion of Depth and Inertial Sensing for Action Recognition

IEEE Sensors Letters ◽

10.1109/lsens.2018.2878572 ◽

2019 ◽

Vol 3 (1) ◽

pp. 1-4 ◽

Cited By ~ 16

Author(s):

Neha Dawar ◽

Sarah Ostadabbas ◽

Nasser Kehtarnavaz

Keyword(s):

Deep Learning ◽

Action Recognition ◽

Data Augmentation ◽

Inertial Sensing


A Language Model for Misogyny Detection in Latin American Spanish Driven by Multisource Feature Extraction and Transformers

Applied Sciences ◽

10.3390/app112110467 ◽

2021 ◽

Vol 11 (21) ◽

pp. 10467

Author(s):

Edwin Aldana-Bobadilla ◽

Alejandro Molina-Villegas ◽

Yuridia Montelongo-Padilla ◽

Ivan Lopez-Arevalo ◽

Oscar S. Sordia

Keyword(s):

Deep Learning ◽

Latin American ◽

Data Augmentation ◽

Physical Violence ◽

Language Model ◽

Learning Models ◽

Exact Figure ◽

Latin American Spanish ◽

American Spanish ◽

Statistical Systems

Creating effective mechanisms to detect misogyny online automatically represents significant scientific and technological challenges. The complexity of recognizing misogyny through computer models lies in the fact that it is a subtle type of violence, it is not always explicitly aggressive, and it can even hide behind seemingly flattering words, jokes, parodies, and other expressions. Currently, it is even difficult to have an exact figure for the rate of misogynistic comments online because, unlike other types of violence, such as physical violence, these events are not registered by any statistical systems. This research contributes to the development of models for the automatic detection of misogynistic texts in Latin American Spanish and contributes to the design of data augmentation methodologies since the amount of data required for deep learning models is considerable.
