Posture Detection of Individual Pigs Based on Lightweight Convolution Neural Networks and Efficient Channel-Wise Attention

Sensors ◽  
2021 ◽  
Vol 21 (24) ◽  
pp. 8369
Author(s):  
Yizhi Luo ◽  
Zhixiong Zeng ◽  
Huazhong Lu ◽  
Enli Lv

In this paper, a lightweight channel-wise attention model is proposed for the real-time detection of five representative pig postures: standing, lying on the belly, lying on the side, sitting, and mounting. An optimized compressed block with a symmetrical structure is proposed based on model structure and parameter statistics, and efficient channel attention modules are adopted as a channel-wise mechanism to improve the model architecture. The results show that the algorithm’s average precision in detecting standing, lying on the belly, lying on the side, sitting, and mounting is 97.7%, 95.2%, 95.7%, 87.5%, and 84.1%, respectively, and the inference time is around 63 ms per posture image (CPU = i7, RAM = 8G). Compared with state-of-the-art models (ResNet50, Darknet53, CSPDarknet53, MobileNetV3-Large, and MobileNetV3-Small), the proposed model has fewer model parameters and lower computational complexity. The statistical results of the postures (with continuous 24 h monitoring) show that some pigs eat in the early morning and that the peak of feeding appears after new feed is supplied, which gives farmers an indication of the health of the pig herd.
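As a rough illustration of what an efficient channel attention module does, the sketch below follows the commonly described form of the mechanism: global average pooling per channel, a small 1D convolution across neighbouring channels, a sigmoid gate, and channel-wise rescaling. The abstract does not give the module's exact configuration, so the function name `eca`, the zero padding, and the kernel values are assumptions made for illustration only.

```python
import math


def sigmoid(x):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def eca(feature_map, kernel):
    """Channel attention sketch (hypothetical helper, not the authors' code).

    feature_map: list of C channels, each an H x W list of lists of floats.
    kernel: odd-length list of 1D convolution weights shared by all channels.
    Returns (per-channel weights, rescaled feature map).
    """
    n_channels = len(feature_map)
    # 1) Global average pooling: one scalar descriptor per channel.
    pooled = [
        sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
        for ch in feature_map
    ]
    # 2) 1D convolution across neighbouring channels (zero padding assumed).
    pad = len(kernel) // 2
    padded = [0.0] * pad + pooled + [0.0] * pad
    mixed = [
        sum(kernel[j] * padded[i + j] for j in range(len(kernel)))
        for i in range(n_channels)
    ]
    # 3) Sigmoid gate turns each mixed descriptor into a weight in (0, 1).
    weights = [sigmoid(m) for m in mixed]
    # 4) Rescale every value in a channel by that channel's weight.
    scaled = [
        [[w * v for v in row] for row in ch]
        for w, ch in zip(weights, feature_map)
    ]
    return weights, scaled


# Three 2 x 2 channels with different mean activations.
example = [
    [[0.0, 0.0], [0.0, 0.0]],
    [[2.0, 2.0], [2.0, 2.0]],
    [[-1.0, -1.0], [-1.0, -1.0]],
]
channel_weights, rescaled = eca(example, kernel=[0.0, 1.0, 0.0])
```

The only learnable parameters in this form are the few kernel weights, which is why such a module adds very little to the parameter count of a lightweight backbone.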

Author(s):  
Kaixuan Chen ◽  
Lina Yao ◽  
Dalin Zhang ◽  
Bin Guo ◽  
Zhiwen Yu

Multi-modality is an important feature of sensor-based activity recognition. In this work, we consider two inherent characteristics of human activities: the spatially-temporally varying salience of features, and the relations between activities and corresponding body part motions. Based on these, we propose a multi-agent spatial-temporal attention model. The spatial-temporal attention mechanism helps intelligently select informative modalities and their active periods. The multiple agents in the proposed model represent activities with collective motions across body parts by independently selecting modalities associated with single motions. With a joint recognition goal, the agents share gained information and coordinate their selection policies to learn the optimal recognition model. The experimental results on four real-world datasets demonstrate that the proposed model outperforms the state-of-the-art methods.


Author(s):  
Zhipeng Chen ◽  
Yiming Cui ◽  
Wentao Ma ◽  
Shijin Wang ◽  
Guoping Hu

Machine Reading Comprehension (MRC) with multiple-choice questions requires the machine to read a given passage and select the correct answer among several candidates. In this paper, we propose a novel approach called the Convolutional Spatial Attention (CSA) model, which can better handle MRC with multiple-choice questions. The proposed model can fully extract the mutual information among the passage, the question, and the candidates to form enriched representations. Furthermore, to merge various attention results, we propose to use convolutional operations to dynamically summarize the attention values within regions of different sizes. Experimental results show that the proposed model gives substantial improvements over various state-of-the-art systems on both the RACE and SemEval-2018 Task11 datasets.


Author(s):  
Duowei Tang ◽  
Peter Kuppens ◽  
Luc Geurts ◽  
Toon van Waterschoot

Amongst the various characteristics of a speech signal, the expression of emotion is one of the characteristics that exhibits the slowest temporal dynamics. Hence, a performant speech emotion recognition (SER) system requires a predictive model that is capable of learning sufficiently long temporal dependencies in the analysed speech signal. Therefore, in this work, we propose a novel end-to-end neural network architecture based on the concept of dilated causal convolution with context stacking. Firstly, the proposed model consists only of parallelisable layers and is hence suitable for parallel processing, while avoiding the inherent lack of parallelisability occurring with recurrent neural network (RNN) layers. Secondly, the design of a dedicated dilated causal convolution block allows the model to have a receptive field as large as the input sequence length, while maintaining a reasonably low computational cost. Thirdly, by introducing a context stacking structure, the proposed model is capable of exploiting long-term temporal dependencies, hence providing an alternative to the use of RNN layers. We evaluate the proposed model in SER regression and classification tasks and provide a comparison with a state-of-the-art end-to-end SER model. Experimental results indicate that the proposed model requires only 1/3 of the number of model parameters used in the state-of-the-art model, while also significantly improving SER performance. Further experiments are reported to understand the impact of using various types of input representations (i.e. raw audio samples vs log mel-spectrograms) and to illustrate the benefits of an end-to-end approach over the use of hand-crafted audio features. Moreover, we show that the proposed model can efficiently learn intermediate embeddings preserving speech emotion information.
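To make the idea of a dilated causal convolution concrete, here is a minimal sketch in plain Python. It shows a single causal layer and how the receptive field of a stack grows with the dilation factors. The doubling dilation schedule (1, 2, 4, 8), the kernel size of 2, and the function names are assumptions chosen for illustration; the abstract does not state the schedule the authors use.

```python
def causal_dilated_conv1d(x, weights, dilation):
    """One causal dilated 1D convolution layer.

    y[t] = sum_k weights[k] * x[t - k * dilation],
    where samples before t = 0 are treated as zero, so y[t] never
    depends on future inputs.
    """
    y = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(weights):
            idx = t - k * dilation
            if idx >= 0:
                acc += w * x[idx]
        y.append(acc)
    return y


def receptive_field(kernel_size, dilations):
    """Number of input samples that can influence one output sample
    after stacking one layer per entry in `dilations`."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


# Assumed schedule: dilation doubles at every layer.
dilations = [1, 2, 4, 8]
signal = [1.0] + [0.0] * 19  # unit impulse at t = 0

out = signal
for d in dilations:
    out = causal_dilated_conv1d(out, [1.0, 1.0], d)

# With kernel size 2 the receptive field is 1 + (1 + 2 + 4 + 8) = 16 samples,
# so the impulse influences exactly the first 16 outputs.
print(receptive_field(2, dilations))
```

With doubling dilations the receptive field grows exponentially in the number of layers while the parameter count grows only linearly, which is consistent with the abstract's point about covering long sequences at low cost.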


Entropy ◽  
2021 ◽  
Vol 23 (12) ◽  
pp. 1587
Author(s):  
Mingfeng Zha ◽  
Wenbin Qian ◽  
Wenlong Yi ◽  
Jing Hua

Traditional pest detection methods are challenging to use in complex forestry environments due to their low accuracy and speed. To address this issue, this paper proposes the YOLOv4_MF model. The YOLOv4_MF model utilizes MobileNetv2 as the feature extraction block and replaces the traditional convolution with depth-wise separable convolution to reduce the model parameters. In addition, the coordinate attention mechanism was embedded in MobileNetv2 to enhance feature information. A symmetric structure consisting of a three-layer spatial pyramid pool is presented, and an improved feature fusion structure was designed to fuse the target information. For the loss function, focal loss was used instead of cross-entropy loss to enhance the network’s learning of small targets. The experimental results showed that the YOLOv4_MF model has 4.24% higher mAP, 4.37% higher precision, and 6.68% higher recall than the YOLOv4 model. The size of the proposed model was reduced to 1/6 of that of YOLOv4. Moreover, the proposed algorithm achieved 38.62% mAP in a comparison with some state-of-the-art algorithms on the COCO dataset.
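The effect of swapping cross-entropy for focal loss can be sketched for the binary case as follows. The formula is the standard focal loss, FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t); the values gamma = 2 and alpha = 0.25 are commonly quoted defaults and are an assumption here, since the abstract does not report the settings used in YOLOv4_MF.

```python
import math


def cross_entropy(p, y):
    """Binary cross-entropy for one prediction.

    p: predicted probability of the positive class, 0 < p < 1.
    y: ground-truth label, 0 or 1.
    """
    p_t = p if y == 1 else 1.0 - p
    return -math.log(p_t)


def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss for one prediction (gamma and alpha are assumed values).

    The factor (1 - p_t) ** gamma shrinks the loss of well-classified
    examples, so hard examples dominate the gradient.
    """
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


# An easy positive (p = 0.9) versus a hard positive (p = 0.1).
easy_ratio = focal_loss(0.9, 1) / cross_entropy(0.9, 1)
hard_ratio = focal_loss(0.1, 1) / cross_entropy(0.1, 1)
print(easy_ratio, hard_ratio)
```

Relative to cross-entropy, the easy example is down-weighted far more strongly than the hard one, which is the mechanism the abstract relies on to emphasise hard-to-detect targets.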


2020 ◽  
Vol 34 (07) ◽  
pp. 10460-10469 ◽  
Author(s):  
Ankan Bansal ◽  
Sai Saketh Rambhatla ◽  
Abhinav Shrivastava ◽  
Rama Chellappa

We present an approach for detecting human-object interactions (HOIs) in images, based on the idea that humans interact with functionally similar objects in a similar manner. The proposed model is simple and makes efficient use of the data, the visual features of the human, the relative spatial orientation of the human and the object, and the knowledge that functionally similar objects take part in similar interactions with humans. We provide extensive experimental validation for our approach and demonstrate state-of-the-art results for HOI detection. On the HICO-Det dataset, our method achieves a gain of over 2.5% absolute points in mean average precision (mAP) over the state of the art. We also show that our approach leads to significant performance gains for zero-shot HOI detection in the seen object setting. We further demonstrate that, using a generic object detector, our model can generalize to interactions involving previously unseen objects.


2020 ◽  
Vol 309 ◽  
pp. 05016
Author(s):  
Yi He ◽  
Tianli Li

In this paper, we propose a lightweight CNN model. First, we standardize the existing CNN model structure based on the minimum computing unit. Second, we apply a parameter control solution to solve the problem of parameter redundancy in the model. Finally, we build a lightweight nonaligned CNN model. The experimental results show that the model parameters can be reduced by more than 50% while the test error stays almost the same. Through deep learning, the proposed model is applied to a practical teaching system to achieve intelligent evaluation of the practical teaching process, while improving the quality and efficiency of teaching.


Water ◽  
2019 ◽  
Vol 11 (6) ◽  
pp. 1287
Author(s):  
Ludek Bures ◽  
Petra Sychova ◽  
Petr Maca ◽  
Radek Roub ◽  
Stepan Marval

An appropriate digital elevation model (DEM) is required for purposes of hydrodynamic modelling of floods. Such a DEM describes a river’s bathymetry (bed topography) as well as its surrounding area. Extensive measurements for creating accurate bathymetry are time-consuming and expensive. Mathematical modelling can provide an alternative way of representing river bathymetry. This study explores new possibilities in the mathematical depiction of river bathymetry. A new bathymetric model (Bathy-supp) is proposed, and the model’s ability to represent actual bathymetry is assessed. Three statistical methods for the determination of model parameters were evaluated. The best results were achieved by the random forest (RF) method. A two-dimensional (2D) hydrodynamic model was used to evaluate the influence of the Bathy-supp model on the hydrodynamic modelling results. Also presented is a comparison of the proposed model with another state-of-the-art bathymetric model. The study was carried out on a reach of the Otava River in the Czech Republic. The results show that the proposed model’s ability to represent river bathymetry exceeds that of its current competitor. Use of the bathymetric model may have a significant impact on improving the hydrodynamic model results.


2022 ◽  
Vol 11 (2) ◽  
pp. 0-0

In recent times, transfer learning models have been known to exhibit good results in text classification for question answering, summarization, and next-word prediction, but these models have not yet been used extensively for the problem of hate speech detection. We anticipate that these networks may give better results in another text classification task, i.e. hate speech detection. This paper introduces a novel method of hate speech detection based on the concept of attention networks using the BERT attention model. We have conducted exhaustive experiments and evaluation over publicly available datasets using various evaluation metrics (precision, recall and F1 score). We show that our model outperforms all the state-of-the-art methods by almost 4%. We have also discussed in detail the technical challenges faced during the implementation of the proposed model.
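For reference, the three evaluation metrics named in this abstract can be computed for a binary classifier as in the sketch below. The function name, the label convention (1 = hate speech, 0 = not), and the toy labels are illustrative assumptions and are not taken from the paper.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall and F1 score for the `positive` class.

    Returns 0.0 for a metric whose denominator would be zero.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2.0 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy labels (assumed): 2 true positives, 1 false positive, 1 false negative.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
```

Reporting all three matters for hate speech data, where the positive class is usually rare and accuracy alone can look high for a classifier that rarely predicts it.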


2019 ◽  
Vol 277 ◽  
pp. 02025
Author(s):  
Yuele Zhang ◽  
Jie Guo ◽  
Zheng Huang ◽  
Weidong Qiu ◽  
Hexiaohui Fan

Person re-identification has been a significant application in the field of video surveillance analysis, yet it remains a challenging task to recognize the person of interest across disjoint cameras with different viewpoints. The factors affecting the identification results include variation in background, different illumination conditions, and changes in human body pose. Existing person re-identification methods mainly focus on feature extraction from the whole frame and on metric learning functions. However, most of those algorithms treat different areas without distinction. It is worth emphasizing that different local regions make different contributions to image representation, which exactly conforms to the attention mechanism. In this paper, we introduce a novel attention network which explores spatial attention in a convolutional neural network. Our algorithm learns the visual attention in multi-layer feature maps. The proposed model not only pays attention to the spatial probabilities of local regions, but also takes the features at different levels into consideration. We evaluate this multi-layer spatial attention model on three benchmark person re-identification datasets: Market-1501, CUHK03, and DukeMTMC-reID. The experimental results validate the advantages of our network in comparison with state-of-the-art baselines.


2018 ◽  
Vol 46 (3) ◽  
pp. 174-219 ◽  
Author(s):  
Bin Li ◽  
Xiaobo Yang ◽  
James Yang ◽  
Yunqing Zhang ◽  
Zeyu Ma

The tire model is essential for accurate and efficient vehicle dynamic simulation. In this article, an in-plane flexible ring tire model is proposed, in which the tire is composed of a rigid rim, a number of discretized lumped mass belt points, and numerous massless tread blocks attached to the belt. One set of tire model parameters is identified by matching the predicted results to ADAMS® FTire virtual test results for one particular cleat test through the particle swarm method using MATLAB®. Based on the identified parameters, the tire model is further validated by comparing the predicted results with FTire for the static load-deflection tests and other cleat tests. Finally, several important aspects regarding the proposed model are discussed.
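The parameter identification step can be pictured with a small particle swarm sketch. This is not the authors' MATLAB procedure or their tire model: the two-parameter linear model, the reference data, the search bounds, and the swarm settings (inertia `w`, acceleration coefficients `c1` and `c2`) are all hypothetical stand-ins, chosen only to show how a swarm searches for parameters that bring predictions close to reference results.

```python
import random


def pso_minimize(loss, bounds, n_particles=30, n_iters=200,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimise `loss` over a box given by `bounds` with a basic particle swarm.

    bounds: list of (low, high) pairs, one per parameter.
    Returns (best position found, loss at that position).
    """
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]            # each particle's best position
    best_val = [loss(p) for p in pos]
    g = min(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]  # swarm-wide best

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull towards personal best + pull towards global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (g_pos[d] - pos[i][d]))
                lo, hi = bounds[d]
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = loss(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < g_val:
                    g_val, g_pos = val, pos[i][:]
    return g_pos, g_val


# Hypothetical stand-in for "reference test results": a line y = 2x - 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
reference = [2.0 * x - 1.0 for x in xs]


def fit_loss(params):
    """Sum of squared differences between model predictions and the reference."""
    a, b = params
    return sum((a * x + b - r) ** 2 for x, r in zip(xs, reference))


identified, final_loss = pso_minimize(fit_loss, bounds=[(-5.0, 5.0), (-5.0, 5.0)])
```

In a setting like the one in the abstract, `fit_loss` would instead run the tire model on the cleat test and compare its output with the FTire reference, which makes each loss evaluation expensive and is one reason a derivative-free method is a natural fit.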

