Posture Detection of Individual Pigs Based on Lightweight Convolution Neural Networks and Efficient Channel-Wise Attention

In this paper, a lightweight channel-wise attention model is proposed for the real-time detection of five representative pig postures: standing, lying on the belly, lying on the side, sitting, and mounting. An optimized compressed block with a symmetrical structure is proposed based on model structure and parameter statistics, and efficient channel attention modules are used as a channel-wise mechanism to improve the model architecture. The results show that the algorithm’s average precision in detecting standing, lying on the belly, lying on the side, sitting, and mounting is 97.7%, 95.2%, 95.7%, 87.5%, and 84.1%, respectively, and the inference speed is around 63 ms per posture image (CPU = i7, RAM = 8G). Compared with state-of-the-art models (ResNet50, Darknet53, CSPDarknet53, MobileNetV3-Large, and MobileNetV3-Small), the proposed model has fewer parameters and lower computational complexity. The posture statistics from continuous 24 h monitoring show that some pigs eat in the early morning and that feeding peaks after new feed is supplied, which gives farmers an indication of the health of the herd.
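The abstract does not give the details of its attention module, so the following is only a minimal sketch of the general efficient channel attention idea: pool each channel to a scalar, run a small 1D convolution across neighbouring channels, squash with a sigmoid, and rescale the channels. The function names `eca_weights` and `apply_eca`, the nested-list tensor layout, and the zero padding at the channel edges are assumptions made for illustration, not the authors' implementation.

```python
import math


def eca_weights(feature_map, kernel):
    """Return one attention weight per channel.

    feature_map: list of channels, each a 2D list (H x W) of floats.
    kernel: odd-length list of 1D convolution weights (hypothetical values;
            in a trained network these would be learned).
    """
    # Global average pooling: one scalar descriptor per channel.
    pooled = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]
    # Zero-pad so the 1D convolution keeps the number of channels unchanged.
    pad = len(kernel) // 2
    padded = [0.0] * pad + pooled + [0.0] * pad
    weights = []
    for i in range(len(pooled)):
        # Local cross-channel interaction over len(kernel) neighbouring channels.
        s = sum(kernel[j] * padded[i + j] for j in range(len(kernel)))
        weights.append(1.0 / (1.0 + math.exp(-s)))  # sigmoid gate
    return weights


def apply_eca(feature_map, kernel):
    """Scale every channel of feature_map by its attention weight."""
    weights = eca_weights(feature_map, kernel)
    return [
        [[value * weights[c] for value in row] for row in channel]
        for c, channel in enumerate(feature_map)
    ]
```

Calling `apply_eca(feature_map, [0.0, 1.0, 0.0])` on a three-channel map gates each channel by the sigmoid of its own mean; a wider kernel mixes in neighbouring channels. The only parameters are the few kernel weights, which is what keeps this kind of module cheap.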


Multi-agent Attentional Activity Recognition

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2019/186 ◽

2019 ◽

Cited By ~ 3

Author(s):

Kaixuan Chen ◽

Lina Yao ◽

Dalin Zhang ◽

Bin Guo ◽

Zhiwen Yu

Keyword(s):

Activity Recognition ◽

State Of The Art ◽

Body Part ◽

Body Parts ◽

Temporal Attention ◽

Attention Model ◽

Proposed Model ◽

Collective Motions ◽

Multi Agent ◽

Real World Datasets

Multi-modality is an important feature of sensor-based activity recognition. In this work, we consider two inherent characteristics of human activities: the spatially and temporally varying salience of features, and the relations between activities and the corresponding body part motions. Based on these, we propose a multi-agent spatial-temporal attention model. The spatial-temporal attention mechanism helps intelligently select informative modalities and their active periods. The multiple agents in the proposed model represent activities with collective motions across body parts by independently selecting modalities associated with single motions. With a joint recognition goal, the agents share gained information and coordinate their selection policies to learn the optimal recognition model. The experimental results on four real-world datasets demonstrate that the proposed model outperforms the state-of-the-art methods.


Convolutional Spatial Attention Model for Reading Comprehension with Multiple-Choice Questions

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.33016276 ◽

2019 ◽

Vol 33 ◽

pp. 6276-6283 ◽

Cited By ~ 6

Author(s):

Zhipeng Chen ◽

Yiming Cui ◽

Wentao Ma ◽

Shijin Wang ◽

Guoping Hu

Keyword(s):

Reading Comprehension ◽

Mutual Information ◽

Spatial Attention ◽

State Of The Art ◽

Multiple Choice ◽

Multiple Choice Questions ◽

Attention Model ◽

Novel Approach ◽

Proposed Model ◽

Machine Reading

Machine Reading Comprehension (MRC) with multiple-choice questions requires the machine to read a given passage and select the correct answer among several candidates. In this paper, we propose a novel approach called the Convolutional Spatial Attention (CSA) model, which can better handle MRC with multiple-choice questions. The proposed model can fully extract the mutual information among the passage, the question, and the candidates to form enriched representations. Furthermore, to merge various attention results, we propose to use a convolutional operation to dynamically summarize the attention values within regions of different sizes. Experimental results show that the proposed model gives substantial improvements over various state-of-the-art systems on both the RACE and SemEval-2018 Task 11 datasets.


End-to-end speech emotion recognition using a novel context-stacking dilated convolution neural network

EURASIP Journal on Audio Speech and Music Processing ◽

10.1186/s13636-021-00208-5 ◽

2021 ◽

Vol 2021 (1) ◽

Author(s):

Duowei Tang ◽

Peter Kuppens ◽

Luc Geurts ◽

Toon van Waterschoot

Keyword(s):

Neural Network ◽

Emotion Recognition ◽

Speech Signal ◽

State Of The Art ◽

Speech Emotion Recognition ◽

Model Parameters ◽

Proposed Model ◽

End To End ◽

The Impact ◽

Temporal Dependencies

Amongst the various characteristics of a speech signal, the expression of emotion is one of the characteristics that exhibits the slowest temporal dynamics. Hence, a performant speech emotion recognition (SER) system requires a predictive model that is capable of learning sufficiently long temporal dependencies in the analysed speech signal. Therefore, in this work, we propose a novel end-to-end neural network architecture based on the concept of dilated causal convolution with context stacking. Firstly, the proposed model consists only of parallelisable layers and is hence suitable for parallel processing, while avoiding the inherent lack of parallelisability occurring with recurrent neural network (RNN) layers. Secondly, the design of a dedicated dilated causal convolution block allows the model to have a receptive field as large as the input sequence length, while maintaining a reasonably low computational cost. Thirdly, by introducing a context stacking structure, the proposed model is capable of exploiting long-term temporal dependencies, hence providing an alternative to the use of RNN layers. We evaluate the proposed model in SER regression and classification tasks and provide a comparison with a state-of-the-art end-to-end SER model. Experimental results indicate that the proposed model requires only 1/3 of the number of model parameters used in the state-of-the-art model, while also significantly improving SER performance. Further experiments are reported to understand the impact of using various types of input representations (i.e. raw audio samples vs log mel-spectrograms) and to illustrate the benefits of an end-to-end approach over the use of hand-crafted audio features. Moreover, we show that the proposed model can efficiently learn intermediate embeddings preserving speech emotion information.
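To make the receptive-field claim concrete, here is a minimal sketch of a generic dilated causal convolution and of how the receptive field of a stack of such layers grows. It is not the paper's block: the function names, the zero treatment of samples before the start of the signal, and the doubling dilation schedule in the usage note are assumptions chosen for illustration.

```python
def receptive_field(kernel_size, dilations):
    """Receptive field (in samples) of stacked dilated convolutions.

    Each layer with dilation d extends the field by (kernel_size - 1) * d.
    """
    return 1 + sum((kernel_size - 1) * d for d in dilations)


def causal_dilated_conv(signal, weights, dilation):
    """Compute y[t] = sum_j weights[j] * x[t - j * dilation].

    The output at time t depends only on the current and past samples
    (causality); samples before t = 0 are treated as zero.
    """
    output = []
    for t in range(len(signal)):
        acc = 0.0
        for j, w in enumerate(weights):
            idx = t - j * dilation
            if idx >= 0:
                acc += w * signal[idx]
        output.append(acc)
    return output
```

With kernel size 2 and dilations 1, 2, 4, 8, `receptive_field(2, [1, 2, 4, 8])` is 1 + 1 + 2 + 4 + 8 = 16 samples, so under this schedule the field doubles with each added layer while the number of weights grows only linearly. That is the general reason a dilated stack can cover a long input at modest cost.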


A Lightweight YOLOv4-Based Forestry Pest Detection Method Using Coordinate Attention and Feature Fusion

Entropy ◽

10.3390/e23121587 ◽

2021 ◽

Vol 23 (12) ◽

pp. 1587

Author(s):

Mingfeng Zha ◽

Wenbin Qian ◽

Wenlong Yi ◽

Jing Hua

Keyword(s):

Detection Method ◽

Feature Fusion ◽

State Of The Art ◽

Detection Methods ◽

Model Parameters ◽

Symmetric Structure ◽

Proposed Model ◽

Pest Detection ◽

Feature Information ◽

Small Targets

Traditional pest detection methods are challenging to use in complex forestry environments due to their low accuracy and speed. To address this issue, this paper proposes the YOLOv4_MF model. The YOLOv4_MF model utilizes MobileNetv2 as the feature extraction block and replaces the traditional convolution with depthwise separable convolution to reduce the model parameters. In addition, the coordinate attention mechanism was embedded in MobileNetv2 to enhance feature information. A symmetric structure consisting of a three-layer spatial pyramid pool is presented, and an improved feature fusion structure was designed to fuse the target information. For the loss function, focal loss was used instead of cross-entropy loss to enhance the network’s learning of small targets. The experimental results showed that the YOLOv4_MF model has 4.24% higher mAP, 4.37% higher precision, and 6.68% higher recall than the YOLOv4 model. The size of the proposed model was reduced to 1/6 of that of YOLOv4. Moreover, the proposed algorithm achieved 38.62% mAP on the COCO dataset, where it was compared with some state-of-the-art algorithms.
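Two of the design choices named in this abstract can be illustrated with a little arithmetic. The sketch below counts the weights of a standard convolution against a depthwise separable one (biases ignored) and compares cross-entropy with focal loss for a single prediction. The function names, the layer sizes in the usage note, and the defaults `gamma=2.0` and `alpha=1.0` are assumptions for illustration; the abstract does not state the values the authors used.

```python
import math


def standard_conv_params(kernel, c_in, c_out):
    """Weights in a standard kernel x kernel convolution (no bias)."""
    return kernel * kernel * c_in * c_out


def depthwise_separable_params(kernel, c_in, c_out):
    """Weights in a depthwise (kernel x kernel per input channel) plus
    pointwise (1 x 1) convolution pair (no bias)."""
    return kernel * kernel * c_in + c_in * c_out


def cross_entropy(p_true):
    """Cross-entropy for one sample, given the probability of the true class."""
    return -math.log(p_true)


def focal_loss(p_true, gamma=2.0, alpha=1.0):
    """Focal loss: cross-entropy scaled by (1 - p_true) ** gamma.

    gamma and alpha are assumed defaults, not values from the abstract.
    """
    return -alpha * (1.0 - p_true) ** gamma * math.log(p_true)
```

For a hypothetical 3 x 3 layer with 32 input and 64 output channels, the standard convolution has 3 * 3 * 32 * 64 = 18432 weights and the separable pair has 3 * 3 * 32 + 32 * 64 = 2336. For the loss, a well-classified sample (`p_true = 0.9`) is scaled by (0.1)^2 = 0.01 while a hard one (`p_true = 0.1`) is scaled by (0.9)^2 = 0.81, so hard examples dominate the gradient. With `gamma = 0` and `alpha = 1` the focal loss reduces to plain cross-entropy.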


Detecting Human-Object Interactions via Functional Generalization

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i07.6616 ◽

2020 ◽

Vol 34 (07) ◽

pp. 10460-10469 ◽

Cited By ~ 9

Author(s):

Ankan Bansal ◽

Sai Saketh Rambhatla ◽

Abhinav Shrivastava ◽

Rama Chellappa

Keyword(s):

Experimental Validation ◽

State Of The Art ◽

Visual Features ◽

Average Precision ◽

Proposed Model ◽

Human Object ◽

Significant Performance ◽

Unseen Objects ◽

Performance Gains ◽

Object Interactions

We present an approach for detecting human-object interactions (HOIs) in images, based on the idea that humans interact with functionally similar objects in a similar manner. The proposed model is simple and efficiently uses the data, visual features of the human, relative spatial orientation of the human and the object, and the knowledge that functionally similar objects take part in similar interactions with humans. We provide extensive experimental validation for our approach and demonstrate state-of-the-art results for HOI detection. On the HICO-Det dataset our method achieves a gain of over 2.5% absolute points in mean average precision (mAP) over state-of-the-art. We also show that our approach leads to significant performance gains for zero-shot HOI detection in the seen object setting. We further demonstrate that using a generic object detector, our model can generalize to interactions involving previously unseen objects.


A lightweight CNN model and its application in intelligent practical teaching evaluation

MATEC Web of Conferences ◽

10.1051/matecconf/202030905016 ◽

2020 ◽

Vol 309 ◽

pp. 05016

Author(s):

Yi He ◽

Tianli Li

Keyword(s):

Teaching Evaluation ◽

Model Parameters ◽

Parameter Control ◽

Model Structure ◽

Control Solution ◽

Test Error ◽

Computing Unit ◽

Parameter Redundancy ◽

Proposed Model ◽

Practical Teaching

In this paper, we propose a lightweight CNN model. First, we standardize the existing CNN model structure based on the minimum computing unit. Second, we apply a parameter control solution to solve the problem of parameter redundancy in the model. Finally, we build a lightweight nonaligned CNN model. The experimental results show that the model parameters can be reduced by more than 50% while the test error remains almost the same. Through deep learning, the proposed model is applied to the practical teaching system to achieve intelligent evaluation of the practical teaching process, while improving the quality and efficiency of teaching.


River Bathymetry Model Based on Floodplain Topography

Water ◽

10.3390/w11061287 ◽

2019 ◽

Vol 11 (6) ◽

pp. 1287

Author(s):

Ludek Bures ◽

Petra Sychova ◽

Petr Maca ◽

Radek Roub ◽

Stepan Marval

Keyword(s):

Hydrodynamic Model ◽

State Of The Art ◽

Model Parameters ◽

Hydrodynamic Modelling ◽

The Czech Republic ◽

Bed Topography ◽

Digital Elevation ◽

Proposed Model ◽

Elevation Model

An appropriate digital elevation model (DEM) is required for purposes of hydrodynamic modelling of floods. Such a DEM describes a river’s bathymetry (bed topography) as well as its surrounding area. Extensive measurements for creating accurate bathymetry are time-consuming and expensive. Mathematical modelling can provide an alternative way of representing river bathymetry. This study explores new possibilities in the mathematical depiction of river bathymetry. A new bathymetric model (Bathy-supp) is proposed, and the model’s ability to represent actual bathymetry is assessed. Three statistical methods for the determination of model parameters were evaluated. The best results were achieved by the random forest (RF) method. A two-dimensional (2D) hydrodynamic model was used to evaluate the influence of the Bathy-supp model on the hydrodynamic modelling results. Also presented is a comparison of the proposed model with another state-of-the-art bathymetric model. The study was carried out on a reach of the Otava River in the Czech Republic. The results show that the proposed model’s ability to represent river bathymetry exceeds that of its current competitor. Use of the bathymetric model may have a significant impact on improving the hydrodynamic model results.


BERT-BU12 Hate Speech Detection using Bidirectional Encoder-Decoder

International Journal of System Dynamics Applications ◽

10.4018/ijsda.20220801oa04 ◽

2022 ◽

Vol 11 (2) ◽

pp. 0-0

Keyword(s):

Text Classification ◽

Question Answering ◽

Hate Speech ◽

State Of The Art ◽

Learning Models ◽

Speech Detection ◽

Attention Networks ◽

Attention Model ◽

Proposed Model ◽

Novel Method

In recent times, transfer learning models have been known to exhibit good results in text classification tasks such as question answering, summarization, and next-word prediction, but these models have not yet been extensively used for the problem of hate speech detection. We anticipate that these networks may give better results in another text classification task, namely hate speech detection. This paper introduces a novel method of hate speech detection based on the concept of attention networks using the BERT attention model. We have conducted exhaustive experiments and evaluation over publicly available datasets using various evaluation metrics (precision, recall and F1 score). We show that our model outperforms all the state-of-the-art methods by almost 4%. We also discuss in detail the technical challenges faced during the implementation of the proposed model.
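The three evaluation metrics named above follow directly from the confusion counts of a binary classifier. The helper below is a minimal sketch using the standard definitions; the function name and the convention of returning 0.0 when a denominator is zero are assumptions, and the counts in the usage note are made up rather than taken from the paper.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) from true-positive, false-positive,
    and false-negative counts.

    precision = tp / (tp + fp)
    recall    = tp / (tp + fn)
    F1        = harmonic mean of precision and recall
    A zero denominator yields 0.0 (an assumed convention).
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2.0 * precision * recall / (precision + recall)
    return precision, recall, f1
```

For example, `precision_recall_f1(8, 2, 2)` describes a detector that flags 10 posts as hateful, 8 of them correctly, and misses 2 hateful posts; precision and recall are both 8/10, and so is their harmonic mean.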


Multi-layer attention for person re-identification

MATEC Web of Conferences ◽

10.1051/matecconf/201927702025 ◽

2019 ◽

Vol 277 ◽

pp. 02025

Author(s):

Yuele Zhang ◽

Jie Guo ◽

Zheng Huang ◽

Weidong Qiu ◽

Hexiaohui Fan

Keyword(s):

Spatial Attention ◽

State Of The Art ◽

Metric Learning ◽

Feature Maps ◽

Factors Affecting ◽

Attention Model ◽

Proposed Model ◽

Learning Functions ◽

Surveillance Analysis ◽

Different Levels

Person re-identification has been a significant application in the field of video surveillance analysis, yet it remains a challenging task to recognize the person of interest across disjoint cameras with different viewpoints. The factors affecting the identification results include variation in background, different illumination conditions, and changes in human body pose. Existing person re-identification methods mainly focus on feature extraction from the whole frame and on metric learning functions. However, most of those algorithms treat different areas without distinction. It is worth emphasizing that different local regions make different contributions to image representation, which exactly conforms to the attention mechanism. In this paper, we introduce a novel attention network which explores spatial attention in a convolutional neural network. Our algorithm learns the visual attention in multi-layer feature maps. The proposed model not only pays attention to the spatial probabilities of local regions, but also takes the features at different levels into consideration. We evaluate this multi-layer spatial attention model on three benchmark person re-identification datasets: Market-1501, CUHK03, and DukeMTMC-reID. The experimental results validate the advances of our adopted network by comparison with state-of-the-art baselines.


In-Plane Flexible Ring Tire Model—Part 1: Modeling and Parameter Identification

Tire Science and Technology ◽

10.2346/tire.18.460303 ◽

2018 ◽

Vol 46 (3) ◽

pp. 174-219 ◽

Cited By ~ 2

Author(s):

Bin Li ◽

Xiaobo Yang ◽

James Yang ◽

Yunqing Zhang ◽

Zeyu Ma

Keyword(s):

Static Load ◽

Model Parameters ◽

Tire Model ◽

Test Results ◽

Lumped Mass ◽

Virtual Test ◽

Proposed Model ◽

Particle Swarm Method ◽

Vehicle Dynamic Simulation ◽

Flexible Ring

The tire model is essential for accurate and efficient vehicle dynamic simulation. In this article, an in-plane flexible ring tire model is proposed, in which the tire is composed of a rigid rim, a number of discretized lumped mass belt points, and numerous massless tread blocks attached to the belt. One set of tire model parameters is identified by fitting the predicted results to ADAMS® FTire virtual test results for one particular cleat test through the particle swarm method using MATLAB®. Based on the identified parameters, the tire model is further validated by comparing the predicted results with FTire for the static load-deflection tests and other cleat tests. Finally, several important aspects regarding the proposed model are discussed.
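Parameter identification by particle swarm can be sketched on a toy problem. The code below is not the article's tire model or its MATLAB® implementation: it is a minimal, generic particle swarm optimizer in Python that recovers the two parameters of a straight line from reference data. The function name `pso`, the inertia and acceleration coefficients, the particle and iteration counts, the fixed seed, and the toy model `y = a * x + b` are all assumptions chosen for illustration.

```python
import random


def pso(objective, bounds, n_particles=40, n_iters=300,
        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimise objective(params) with a basic global-best particle swarm.

    bounds: list of (low, high) pairs used only to initialise positions.
    w, c1, c2: inertia, cognitive, and social coefficients (assumed values).
    Returns (best_params, best_value).
    """
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]              # each particle's best position
    best_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]  # swarm-wide best
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Pull towards the particle's own best and the swarm's best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (g_pos[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = objective(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < g_val:
                    g_val, g_pos = val, pos[i][:]
    return g_pos, g_val


# Toy identification task: reference data generated with a = 2, b = -1.
XS = [0.0, 1.0, 2.0, 3.0, 4.0]
YS = [2.0 * x - 1.0 for x in XS]


def fit_error(params):
    """Sum of squared differences between model prediction and reference."""
    a, b = params
    return sum((a * x + b - y) ** 2 for x, y in zip(XS, YS))


identified, error = pso(fit_error, [(-10.0, 10.0), (-10.0, 10.0)])
```

Here `identified` should land close to the generating values of 2 and -1. In a setting like the one described, the objective would instead measure the mismatch between the tire model's predicted response and the reference cleat-test results, with one dimension per tire parameter.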
