Ani-GIFs: A Benchmark Dataset for Domain Generalization of Action Recognition from GIFs

Author(s):  
Shoumik Majumdar ◽  
Shubhangi Jain ◽  
Isidora Chara Tourni ◽  
Arsenii Mustafin ◽  
Diala Lteif ◽  
...  

Deep learning models perform remarkably well on a given task under the assumption that the data always come from the same distribution. However, this assumption is generally violated in practice, mainly due to differences in data acquisition techniques and the lack of information about the underlying source of new data. Domain Generalization targets the ability to generalize to test data of an unseen domain; while this problem is well-studied for images, such studies are significantly lacking for spatiotemporal visual content (videos and GIFs). This is due to (1) the challenging nature of misalignment of temporal features and the varying appearance/motion of actors and actions in different domains, and (2) spatiotemporal datasets being laborious to collect and annotate for multiple domains. We collect and present the first synthetic video dataset of Animated GIFs for domain generalization, Ani-GIFs, which is used to study the domain gap of videos vs. GIFs, and of animated vs. real GIFs, for the task of action recognition. We provide a training and testing setting for Ani-GIFs, and extend two domain generalization baseline approaches, based on data augmentation and explainability, to the spatiotemporal domain to catalyze research in this direction.

2020 ◽  
Author(s):  
Dean Sumner ◽  
Jiazhen He ◽  
Amol Thakkar ◽  
Ola Engkvist ◽  
Esben Jannik Bjerrum

SMILES randomization, a form of data augmentation, has previously been shown to increase the performance of deep learning models compared to non-augmented baselines. Here, we propose a novel data augmentation method we call “Levenshtein augmentation”, which considers local SMILES sub-sequence similarity between reactants and their respective products when creating training pairs. The performance of Levenshtein augmentation was tested using two state-of-the-art models: transformer and sequence-to-sequence based recurrent neural networks with attention. Levenshtein augmentation demonstrated increased performance over both non-augmented data and data augmented by conventional SMILES randomization when used for training of baseline models. Furthermore, Levenshtein augmentation seemingly results in what we define as attentional gain: an enhancement in the pattern recognition capabilities of the underlying network with respect to molecular motifs.
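As a rough illustration of the idea of pairing reactants and products by string similarity, the sketch below computes the Levenshtein (edit) distance between two strings and picks, from several alternative spellings of a reactant, the one closest to the product. The helper `closest_variant` and the example strings are hypothetical; the abstract does not describe the authors' actual pairing procedure, and this sketch compares whole strings rather than local sub-sequences.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (dynamic programming, two rows)."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def closest_variant(candidates, product):
    """Hypothetical helper: return the candidate string nearest to `product`."""
    return min(candidates, key=lambda s: levenshtein(s, product))


# "OCC" and "CCO" are two SMILES spellings of ethanol; "CC(=O)O" is acetic acid.
print(closest_variant(["OCC", "CCO"], "CC(=O)O"))
```

In a real pipeline the candidate strings would come from a SMILES randomization step in a cheminformatics toolkit; that step is left out here to keep the sketch free of dependencies.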


Author(s):  
S. Arokiaraj ◽  
Dr. N. Viswanathan

With the advent of the Internet of Things (IoT), human activity (HA) recognition has found growing application in health care, in both diagnosis and clinical processes. IoT devices must be aware of human movements to provide better aid in clinical applications as well as in users' daily activities. With machine and deep learning algorithms, HA recognition systems have improved significantly in recognition accuracy. However, most existing models still need improvement in accuracy and computational overhead. In this research paper, we propose a BAT-optimized Long Short-Term Memory (BAT-LSTM) network for effective recognition of human activities using real-time IoT systems. The data are collected by implanting the IoT devices invasively. The proposed BAT-LSTM is then deployed to extract temporal features, which are used to classify HA. Nearly 10,0000 data samples were collected and used to evaluate the proposed model. To validate the proposed framework, accuracy, precision, recall, specificity, and F1-score are chosen as metrics, and the model is compared with other state-of-the-art deep learning models. The findings show that the proposed model outperforms the other learning models and is suitable for HA recognition.
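The five validation metrics named above can all be derived from the counts of a confusion matrix. The following minimal sketch assumes a binary (one-vs-rest) setting; the function name and the example counts are illustrative and are not taken from the paper.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute standard metrics from binary confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0        # sensitivity
    specificity = tn / (tn + fp) if (tn + fp) else 0.0   # true-negative rate
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
    }


# Made-up counts for one activity class versus all others.
for name, value in classification_metrics(tp=40, fp=10, tn=45, fn=5).items():
    print(f"{name}: {value:.3f}")
```

For a multi-class activity recognizer, the same function could be applied per class and the results averaged; which averaging scheme the authors used is not stated in the abstract.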


Author(s):  
Samuel Leach ◽  
Yunhe Xue ◽  
Rahul Sridhar ◽  
Stephanie Paal ◽  
Zhangyang Wang ◽  
...  

Data ◽  
2020 ◽  
Vol 5 (4) ◽  
pp. 104
Author(s):  
Ashok Sarabu ◽  
Ajit Kumar Santra

The two-stream convolutional neural network (CNN) has proven highly successful for action recognition in videos. The main idea is to train two CNNs to learn spatial and temporal features separately and then combine the two scores to obtain the final scores. In the literature, we observed that most methods use similar CNNs for the two streams. In this paper, we design a two-stream CNN architecture with different CNNs for the two streams to learn spatial and temporal features. Temporal Segment Networks (TSN) are applied in order to capture long-range temporal features and to differentiate similar types of sub-actions in videos. Data augmentation techniques are employed to prevent over-fitting. Advanced cross-modal pre-training is discussed and introduced into the proposed architecture in order to enhance the accuracy of action recognition. The proposed two-stream model is evaluated on two challenging action recognition datasets: HMDB-51 and UCF-101. The findings show that the proposed architecture yields a significant performance increase and outperforms the existing methods.
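The score-combination step can be sketched as a simple late fusion: each stream's class scores are turned into probabilities and then averaged with a weight. The function names, the `temporal_weight` parameter, and the example scores are assumptions made for illustration; the abstract does not state which fusion rule or weights the authors used.

```python
import math


def softmax(scores):
    """Convert raw class scores into probabilities (numerically stable)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_two_stream(spatial_scores, temporal_scores, temporal_weight=0.5):
    """Weighted average of the two streams' class probabilities."""
    if len(spatial_scores) != len(temporal_scores):
        raise ValueError("both streams must score the same classes")
    p_spatial = softmax(spatial_scores)
    p_temporal = softmax(temporal_scores)
    w = temporal_weight
    return [(1 - w) * ps + w * pt for ps, pt in zip(p_spatial, p_temporal)]


# Made-up scores for three action classes.
fused = fuse_two_stream([2.0, 1.0, 0.1], [0.5, 2.5, 0.3])
predicted_class = max(range(len(fused)), key=fused.__getitem__)
print(fused, predicted_class)
```

Averaging probabilities rather than raw scores keeps the two streams on a comparable scale even when their networks differ, which matters for an architecture that uses different CNNs per stream.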


Viruses ◽  
2020 ◽  
Vol 12 (7) ◽  
pp. 769 ◽  
Author(s):  
Ahmed Sedik ◽  
Abdullah M Iliyasu ◽  
Basma Abd El-Rahiem ◽  
Mohammed E. Abdel Samea ◽  
Asmaa Abdel-Raheem ◽  
...  

This generation faces existential threats because of the global assault of the novel Corona virus 2019 (i.e., COVID-19). With more than thirteen million infected and nearly 600,000 fatalities in 188 countries/regions, COVID-19 is the worst calamity since World War II. These misfortunes are traced to various causes, including late detection of latent or asymptomatic carriers, migration, and inadequate isolation of infected people. This makes detection, containment, and mitigation global priorities, with exposure contained via quarantine, lockdowns, work/stay at home, and social distancing measures that are focused on “flattening the curve”. While medical and healthcare workers are at the frontline in the battle against COVID-19, it is a crusade for all of humanity. Meanwhile, machine and deep learning models have been revolutionary across numerous domains and applications, and their potency has been exploited to produce numerous state-of-the-art technologies utilised in disease detection, diagnosis, and treatment. Despite this potential, machine and, particularly, deep learning models are data sensitive, because their effectiveness depends on the availability and reliability of data. The unavailability of such data hinders the efforts of engineers and computer scientists to fully contribute to the ongoing fight against COVID-19. Faced with a calamity on one side and the absence of reliable data on the other, this study presents two data-augmentation models to enhance the learnability of the Convolutional Neural Network (CNN) and the Convolutional Long Short-Term Memory (ConvLSTM)-based deep learning models (DADLMs) and, by doing so, boost the accuracy of COVID-19 detection. Experimental results reveal improvement in terms of accuracy of detection, logarithmic loss, and testing time relative to DLMs devoid of such data augmentation.
Furthermore, average increases of 4% to 11% in COVID-19 detection accuracy are reported in favour of the proposed data-augmented deep learning models relative to the machine learning techniques. Therefore, the proposed algorithm is effective in performing a rapid and consistent Corona virus diagnosis that is primarily aimed at assisting clinicians in making accurate identification of the virus.


2021 ◽  
Vol 49 (2) ◽  
pp. 342-353
Author(s):  
Ricardo Cavieses-Núñez ◽  
Miguel A. Ojeda-Ruiz ◽  
Alfredo Flores-Irigollen ◽  
Elvia Marín-Monroy ◽  
Mirtha Lbañez-Lucero ◽  
...  

Small-scale fishing (SSF) is a relevant economic activity worldwide, so its sustainable development will be essential to assure its contributions to food security, poverty alleviation, and healthy ecosystems. However, the wide diversity of fisheries, their complexity, and the lack of information limit the ability to propose and evaluate management measures and plans and their effects on communities and other productive activities. The state of Baja California Sur, Mexico, our case study, ranks third in national fisheries production, possesses SSF fleets, and has a wide variety of fisheries that share fishing areas, fishing seasons, and operating units. In this work, treating SSF as a complex system, deep learning models (DLM) were proposed to forecast catch volumes, evaluate the importance of each input variable, and find interactions. Environmental variables and fishery catches were tested in the DLM to estimate their predictive power. Different DLM structures and parameters were used to find the optimal model. The variables with the highest predictive power are the environmental variables, with R = 0.90. Moreover, when they are combined with the catches from other areas, a performance of R = 0.95 is obtained. Using only the catches, the model reaches R = 0.81. This model allows the use of variables that indirectly affect the system and proves to be a useful tool for assessing the state of a complex system in the face of disturbances in its variables.
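The abstract does not define R. Assuming it denotes the Pearson correlation coefficient between observed and forecast catch volumes, it could be computed as in the sketch below; the function name and the sample values are illustrative only.

```python
import math


def pearson_r(observed, predicted):
    """Pearson correlation between two equal-length numeric sequences."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    covariance = sum((o - mean_o) * (p - mean_p)
                     for o, p in zip(observed, predicted))
    spread_o = math.sqrt(sum((o - mean_o) ** 2 for o in observed))
    spread_p = math.sqrt(sum((p - mean_p) ** 2 for p in predicted))
    if spread_o == 0 or spread_p == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance / (spread_o * spread_p)


# Made-up monthly catch volumes and model forecasts.
print(round(pearson_r([10.0, 12.0, 9.0, 15.0], [11.0, 12.5, 9.5, 14.0]), 3))
```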


Diagnostics ◽  
2020 ◽  
Vol 10 (6) ◽  
pp. 417 ◽  
Author(s):  
Mohammad Farukh Hashmi ◽  
Satyarth Katiyar ◽  
Avinash G Keskar ◽  
Neeraj Dhanraj Bokde ◽  
Zong Woo Geem

Pneumonia causes the death of around 700,000 children every year and affects 7% of the global population. Chest X-rays are primarily used for the diagnosis of this disease. However, even for a trained radiologist, it is a challenging task to examine chest X-rays. There is a need to improve the diagnosis accuracy. In this work, an efficient model for the detection of pneumonia trained on digital chest X-ray images is proposed, which could aid the radiologists in their decision making process. A novel approach based on a weighted classifier is introduced, which combines the weighted predictions from the state-of-the-art deep learning models such as ResNet18, Xception, InceptionV3, DenseNet121, and MobileNetV3 in an optimal way. This approach is a supervised learning approach in which the network predicts the result based on the quality of the dataset used. Transfer learning is used to fine-tune the deep learning models to obtain higher training and validation accuracy. Partial data augmentation techniques are employed to increase the training dataset in a balanced way. The proposed weighted classifier is able to outperform all the individual models. Finally, the model is evaluated, not only in terms of test accuracy, but also in the AUC score. The final proposed weighted classifier model is able to achieve a test accuracy of 98.43% and an AUC score of 99.76 on the unseen data from the Guangzhou Women and Children’s Medical Center pneumonia dataset. Hence, the proposed model can be used for a quick diagnosis of pneumonia and can aid the radiologists in the diagnosis process.
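A weighted classifier of this kind can be sketched as a weighted average of the class probabilities produced by each model. The weights, probabilities, and function name below are made up for illustration; the abstract does not say how the authors obtain their optimal weights.

```python
def weighted_ensemble(model_probs, weights):
    """Combine per-model class probabilities with normalized weights."""
    if len(model_probs) != len(weights) or not model_probs:
        raise ValueError("need one weight per model")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n_classes = len(model_probs[0])
    combined = [0.0] * n_classes
    for probs, weight in zip(model_probs, weights):
        if len(probs) != n_classes:
            raise ValueError("all models must score the same classes")
        for k, p in enumerate(probs):
            combined[k] += (weight / total) * p
    return combined


# Three hypothetical models scoring one X-ray as [normal, pneumonia].
probs = [[0.6, 0.4], [0.3, 0.7], [0.2, 0.8]]
combined = weighted_ensemble(probs, weights=[1.0, 1.0, 2.0])
print(combined)
```

In practice the weights would be tuned on a validation set, for example by a search that maximizes validation accuracy or AUC, so that stronger models contribute more to the final prediction.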


2021 ◽  
Vol 11 (21) ◽  
pp. 10467
Author(s):  
Edwin Aldana-Bobadilla ◽  
Alejandro Molina-Villegas ◽  
Yuridia Montelongo-Padilla ◽  
Ivan Lopez-Arevalo ◽  
Oscar S. Sordia

Creating effective mechanisms to detect misogyny online automatically represents a significant scientific and technological challenge. The complexity of recognizing misogyny through computer models lies in the fact that it is a subtle type of violence: it is not always explicitly aggressive, and it can even hide behind seemingly flattering words, jokes, parodies, and other expressions. Currently, it is difficult even to obtain an exact figure for the rate of misogynistic comments online because, unlike other types of violence, such as physical violence, these events are not registered by any statistical system. This research contributes to the development of models for the automatic detection of misogynistic texts in Latin American Spanish and to the design of data augmentation methodologies, since the amount of data required for deep learning models is considerable.

