Automated Video Behavior Recognition of Pigs Using Two-Stream Convolutional Networks

Sensors, 2020, Vol 20 (4), pp. 1085
Author(s): Kaifeng Zhang, Dan Li, Jiayun Huang, Yifei Chen

Detecting pig behavior helps identify abnormal conditions, such as disease and dangerous movements, in a timely and effective manner, and thus plays an important role in ensuring the health and well-being of pigs. Manual monitoring of pig behavior by staff is time-consuming, subjective, and impractical, so there is an urgent need for methods that identify pig behavior automatically. In recent years, deep learning has gradually been applied to the study of pig behavior recognition. Existing studies judge a pig's behavior only from its posture in a still image frame, without considering the motion information of the behavior, even though optical flow reflects this motion information well. This study therefore took image frames and optical flow extracted from videos as two-stream inputs to fully capture the temporal and spatial characteristics of behavior. Two-stream convolutional network models based on deep learning were proposed for pig behavior recognition, including the inflated 3D convnet (I3D) and temporal segment networks (TSN), the latter with a feature extraction network of either a Residual Network (ResNet) or an Inception architecture (e.g., Inception with Batch Normalization (BN-Inception), InceptionV3, InceptionV4, or InceptionResNetV2). A standard pig video behavior dataset was created comprising 1000 videos of five behaviors (feeding, lying, walking, scratching, and mounting) recorded under natural conditions. The dataset was used to train and test the proposed models, and a series of comparative experiments was conducted. The results showed that the TSN model with a ResNet101 feature extraction network recognized pig feeding, lying, walking, scratching, and mounting behaviors with the highest average accuracy of 98.99%, at an average recognition time of 0.3163 s per video. The TSN model (ResNet101) is thus superior to the other models for the task of pig behavior recognition.
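As a rough sketch of this two-stream design, the PyTorch snippet below routes one RGB frame per segment through one ResNet-101 and a stack of optical-flow fields through another, then averages the segment-consensus class scores. The segment count, flow-stack depth, and late score fusion are illustrative assumptions, not details confirmed by the paper.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet101

class TwoStreamTSN(nn.Module):
    def __init__(self, num_classes=5, num_segments=3, flow_stack=10):
        super().__init__()
        self.num_segments = num_segments
        # Spatial stream: one RGB frame per segment.
        self.rgb_net = resnet101(weights=None)
        self.rgb_net.fc = nn.Linear(self.rgb_net.fc.in_features, num_classes)
        # Temporal stream: stacked optical flow (2 channels per flow frame).
        self.flow_net = resnet101(weights=None)
        self.flow_net.conv1 = nn.Conv2d(2 * flow_stack, 64, kernel_size=7,
                                        stride=2, padding=3, bias=False)
        self.flow_net.fc = nn.Linear(self.flow_net.fc.in_features, num_classes)

    def forward(self, rgb, flow):
        # rgb:  (B, S, 3, H, W) one frame per segment
        # flow: (B, S, 2*flow_stack, H, W) stacked flow per segment
        b, s = rgb.shape[:2]
        rgb_scores = self.rgb_net(rgb.flatten(0, 1)).view(b, s, -1).mean(1)
        flow_scores = self.flow_net(flow.flatten(0, 1)).view(b, s, -1).mean(1)
        # Late fusion: average the two streams' segment-consensus scores.
        return (rgb_scores + flow_scores) / 2

scores = TwoStreamTSN()(torch.randn(2, 3, 3, 224, 224),
                        torch.randn(2, 3, 20, 224, 224))
print(scores.shape)  # torch.Size([2, 5])
```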

2019, Vol 6 (1)
Author(s): Si Zhang, Hanghang Tong, Jiejun Xu, Ross Maciejewski

Abstract Graphs naturally appear in numerous application domains, ranging from social analysis and bioinformatics to computer vision. The unique capability of graphs to capture the structural relations among data allows more insights to be harvested than when data are analyzed in isolation. However, learning problems on graphs are often very challenging, because (1) many types of data, such as images and text, are not originally structured as graphs, and (2) for graph-structured data, the underlying connectivity patterns are often complex and diverse. On the other hand, representation learning has achieved great success in many areas. A potential solution is therefore to learn the representation of graphs in a low-dimensional Euclidean space such that the graph properties are preserved. Although tremendous efforts have been made to address the graph representation learning problem, many approaches still suffer from shallow learning mechanisms. Deep learning models on graphs (e.g., graph neural networks) have recently emerged in machine learning and related areas and have demonstrated superior performance on various problems. In this survey, among the numerous types of graph neural networks, we conduct a comprehensive review specifically of the emerging field of graph convolutional networks, one of the most prominent graph deep learning models. First, we group existing graph convolutional network models into two categories based on the types of convolutions and highlight some of these models in detail. Then, we categorize different graph convolutional networks according to their application areas. Finally, we present several open challenges in this area and discuss potential directions for future research.
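For readers new to the area, the following is a minimal sketch of the widely used spectral-style graph convolution in its symmetrically normalized (Kipf-Welling) form, which many of the surveyed models build upon; the layer sizes and the toy graph are arbitrary choices for illustration:

```python
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, x, adj):
        # x: (N, in_dim) node features, adj: (N, N) binary adjacency.
        a_hat = adj + torch.eye(adj.size(0))          # add self-loops
        deg = a_hat.sum(1)                            # node degrees
        d_inv_sqrt = deg.pow(-0.5)
        # D^{-1/2} (A + I) D^{-1/2}, the symmetric normalization.
        norm_adj = a_hat * d_inv_sqrt.unsqueeze(0) * d_inv_sqrt.unsqueeze(1)
        return torch.relu(norm_adj @ self.linear(x))  # propagate, then transform

x = torch.randn(4, 8)                                 # 4 nodes, 8 features each
adj = torch.tensor([[0, 1, 0, 0], [1, 0, 1, 1],
                    [0, 1, 0, 1], [0, 1, 1, 0]], dtype=torch.float)
print(GCNLayer(8, 16)(x, adj).shape)                  # torch.Size([4, 16])
```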


2020, Vol 2020, pp. 1-11
Author(s): Jie Shen, Mengxi Xu, Xinyu Du, Yunbo Xiong

Video surveillance is an important data source for urban computing and intelligence, but the low resolution of many existing surveillance devices limits its usefulness. Improving the resolution of surveillance video is therefore an important task in this field. In this paper, video resolution is improved by learning-based superresolution reconstruction. Unlike the superresolution reconstruction of static images, video superresolution is characterized by the use of motion information, yet there have been few studies in this area so far. To fully exploit motion information for video superresolution, this paper proposes a reconstruction method based on an efficient subpixel convolutional neural network in which optical flow is introduced into the deep learning network. Fusing the optical-flow features between successive frames compensates for missing inter-frame information and generates high-quality superresolution results. In addition, a subpixel convolution layer is added after the deep convolutional network to further improve the superresolution. Finally, experimental evaluations demonstrate the satisfactory performance of our method compared with previous methods and other deep learning networks; our method is also more efficient.
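A hedged sketch of the core idea, an efficient subpixel (PixelShuffle) network that takes the optical flow to a neighboring frame as extra input channels, might look as follows; the channel counts, depth, and upscaling factor are assumptions rather than the authors' exact configuration:

```python
import torch
import torch.nn as nn

class FlowESPCN(nn.Module):
    def __init__(self, scale=4):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(3 + 2, 64, 5, padding=2), nn.Tanh(),  # RGB + 2 flow channels
            nn.Conv2d(64, 32, 3, padding=1), nn.Tanh(),
            # Subpixel layer: emit scale**2 * 3 channels, then rearrange them
            # into a (scale*H, scale*W) RGB image with PixelShuffle.
            nn.Conv2d(32, 3 * scale ** 2, 3, padding=1),
            nn.PixelShuffle(scale),
        )

    def forward(self, lr_frame, flow):
        return self.body(torch.cat([lr_frame, flow], dim=1))

lr = torch.randn(1, 3, 32, 32)      # low-resolution frame
flow = torch.randn(1, 2, 32, 32)    # dense flow to the previous frame
print(FlowESPCN()(lr, flow).shape)  # torch.Size([1, 3, 128, 128])
```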


2019, Vol 11 (6), pp. 123
Author(s): Huanan Dong, Ming Wen, Zhouwang Yang

Vehicle speed estimation is an important problem in traffic surveillance. Many existing approaches are based on camera calibration, which has two shortcomings. First, calibration methods are sensitive to the environment, so the accuracy of the results is compromised when environmental conditions are not satisfied. Second, calibration-based methods rely on vehicle trajectories acquired through a two-stage detection and tracking process. To overcome these shortcomings, we propose an alternative end-to-end method based on 3-dimensional convolutional networks (3D ConvNets) that estimates average vehicle speed directly from video footage. Our method is characterized by three features. First, we use non-local blocks in the model to better capture spatial-temporal long-range dependencies. Second, we use optical flow as an input, since it encodes the speed and direction of pixel motion in an image. Third, we construct a multi-scale convolutional network that extracts information on various characteristics of vehicles in motion. The proposed method shows promising experimental results on a commonly used dataset, with a mean absolute error (MAE) of 2.71 km/h and a mean squared error (MSE) of 14.62.
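To make the end-to-end setup concrete, here is an illustrative 3D-ConvNet regressor over a clip of optical-flow fields; it omits the paper's non-local blocks and multi-scale branches, and all kernel sizes and input shapes are assumptions:

```python
import torch
import torch.nn as nn

class SpeedRegressor3D(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            # A plain 3D-conv stack; the paper's non-local and multi-scale
            # components are left out for brevity.
            nn.Conv3d(2, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool3d((1, 2, 2)),
            nn.Conv3d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1),          # global space-time pooling
        )
        self.head = nn.Linear(32, 1)          # scalar speed estimate (km/h)

    def forward(self, flow_clip):
        # flow_clip: (B, 2, T, H, W) stacked optical-flow fields
        return self.head(self.features(flow_clip).flatten(1))

clip = torch.randn(2, 2, 16, 112, 112)
print(SpeedRegressor3D()(clip).shape)  # torch.Size([2, 1])
```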


Biomolecules, 2021, Vol 11 (8), pp. 1119
Author(s): Shuang Wang, Mingjian Jiang, Shugang Zhang, Xiaofeng Wang, Qing Yuan, ...

In the process of drug discovery, identifying the interaction between a protein and a novel compound plays an important role. With the development of technology, deep learning methods have shown excellent performance in various settings. However, compound–protein interactions are complicated, and the features extracted by most deep models are not comprehensive, which limits performance to a certain extent. In this paper, we propose a multiscale convolutional network that extracts the local and global features of the protein and the topological features of the compound using different types of convolutional networks. The results show that our model obtains the best performance compared with existing deep learning methods.
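The multiscale idea for the protein branch can be sketched as parallel 1D convolutions with different kernel sizes over a sequence embedding; the vocabulary size, widths, and kernel sizes below are assumptions for illustration only:

```python
import torch
import torch.nn as nn

class MultiScaleProteinEncoder(nn.Module):
    def __init__(self, vocab=26, dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)   # amino-acid embedding
        # One branch per receptive-field size: small kernels capture local
        # motifs, large kernels capture broader (more global) context.
        self.branches = nn.ModuleList(
            nn.Conv1d(dim, dim, k, padding=k // 2) for k in (3, 7, 15)
        )

    def forward(self, seq_ids):
        x = self.embed(seq_ids).transpose(1, 2)          # (B, dim, L)
        feats = [torch.relu(b(x)).max(dim=2).values for b in self.branches]
        return torch.cat(feats, dim=1)                   # fused multiscale feature

seq = torch.randint(0, 26, (2, 200))                     # two protein sequences
print(MultiScaleProteinEncoder()(seq).shape)             # torch.Size([2, 192])
```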


Electronics, 2021, Vol 10 (9), pp. 1014
Author(s): Chengsheng Pan, Jiang Zhu, Zhixiang Kong, Huaifeng Shi, Wensheng Yang

Network traffic forecasting is essential for efficient network management and planning, and accurate long-term forecasting models are essential for proactive control of upcoming congestion events. Due to the complex spatial-temporal dependencies between traffic flows, traditional time series forecasting models are often unable to fully extract the spatial-temporal characteristics of the traffic. To address this issue, we propose a novel dual-channel spatial-temporal graph convolutional network (DC-STGCN) model. The proposed model consists of two temporal components that characterize the daily and weekly correlations of the network traffic. Each component contains a spatial-temporal feature extraction module consisting of a dual-channel graph convolutional network (DCGCN) and a gated recurrent unit (GRU). The DCGCN in turn consists of an adjacency feature extraction module (AGCN) and a correlation feature extraction module (PGCN), which capture the connectivity between nodes and the proximity correlation, respectively. The GRU then extracts the temporal characteristics of the traffic. Experimental results on real-world network datasets show that the DC-STGCN model outperforms existing baselines in prediction accuracy and is capable of making long-term predictions.
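The basic "graph convolution for space, GRU for time" pattern underlying such models can be sketched as follows; the single shared adjacency and the layer sizes are simplifying assumptions and do not reproduce the dual-channel (AGCN/PGCN) design:

```python
import torch
import torch.nn as nn

class GCNGRUForecaster(nn.Module):
    def __init__(self, num_nodes, in_dim=1, hidden=32):
        super().__init__()
        self.spatial = nn.Linear(in_dim, hidden)    # per-node feature transform
        self.gru = nn.GRU(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, 1)             # next-step traffic per node

    def forward(self, x, norm_adj):
        # x: (B, T, N, in_dim), norm_adj: (N, N) normalized adjacency
        h = torch.relu(norm_adj @ self.spatial(x))  # graph conv at each step
        b, t, n, d = h.shape
        h = h.permute(0, 2, 1, 3).reshape(b * n, t, d)  # one sequence per node
        _, last = self.gru(h)                       # temporal dependencies
        return self.out(last.squeeze(0)).view(b, n)

x = torch.randn(4, 12, 10, 1)                       # 12 past steps, 10 nodes
adj = torch.softmax(torch.randn(10, 10), dim=1)     # stand-in normalized adjacency
print(GCNGRUForecaster(10)(x, adj).shape)           # torch.Size([4, 10])
```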


Author(s): A. Sokolova, A. Konushin

In this work, we investigate the problem of recognizing people by their gait. For this task, we implement a deep learning approach that uses optical flow as the main source of motion information, combining neural feature extraction with an additional embedding of the descriptors to improve the representation. To find the best heuristics, we compare several deep neural network architectures and learning and classification strategies. The experiments were conducted on two popular gait recognition datasets, allowing us to investigate their advantages and disadvantages as well as the transferability of the considered methods.
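A minimal version of this pipeline, a CNN over a stack of optical-flow maps followed by an embedding layer whose L2-normalized output supports cosine-similarity matching, might look like this (all sizes are assumptions):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GaitEmbedder(nn.Module):
    def __init__(self, flow_stack=10, embed_dim=128):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(2 * flow_stack, 32, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.fc = nn.Linear(64, embed_dim)   # the additional embedding step

    def forward(self, flow):
        z = self.fc(self.cnn(flow).flatten(1))
        return F.normalize(z, dim=1)         # unit-norm gait descriptor

a, b = torch.randn(1, 20, 64, 64), torch.randn(1, 20, 64, 64)
model = GaitEmbedder()
print(float(model(a) @ model(b).t()))        # cosine similarity of two walks
```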


2019, Vol 11 (18), pp. 2142
Author(s): Lianfa Li

Semantic segmentation is a fundamental means of extracting information from remotely sensed images at the pixel level. Deep learning has enabled considerable improvements in the efficiency and accuracy of semantic segmentation of general images, with typical models ranging from benchmarks such as fully convolutional networks, U-Net, Micro-Net, and dilated residual networks to the more recently developed DeepLab 3+. However, many of these models were originally developed for the segmentation of general or medical images and videos and are not directly applicable to remotely sensed images, and studies of deep learning for semantic segmentation of remotely sensed images remain limited. This paper presents a novel, flexible autoencoder-based deep learning architecture that makes extensive use of residual learning and multiscaling for robust semantic segmentation of remotely sensed land-use images. In this architecture, a deep residual autoencoder is generalized to a fully convolutional network in which residual connections are implemented within and between all encoding and decoding layers. Compared with the concatenated shortcuts in U-Net, these residual connections reduce the number of trainable parameters and improve learning efficiency by enabling extensive backpropagation of errors. In addition, resizing or atrous spatial pyramid pooling (ASPP) can be leveraged to capture multiscale information from the input images and enhance robustness to scale variations. The residual learning and multiscaling strategies improve the trained model's generalizability, as demonstrated in the semantic segmentation of land-use types in two real-world datasets of remotely sensed images. Compared with U-Net, the proposed method improves the Jaccard index (JI), or mean intersection over union (MIoU), by 4-11% in the training phase and by 3-9% in the validation and testing phases. With its flexible deep learning architecture, the proposed approach can be easily applied and transferred to the semantic segmentation of land-use and other surface variables in remotely sensed images.
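The contrast with U-Net's concatenated shortcuts can be illustrated by an additive (residual) encoder-decoder skip, which adds no extra channels; the toy depth and widths below are assumptions:

```python
import torch
import torch.nn as nn

class ResidualSegNet(nn.Module):
    def __init__(self, in_ch=3, num_classes=6, width=32):
        super().__init__()
        self.enc = nn.Sequential(nn.Conv2d(in_ch, width, 3, padding=1), nn.ReLU())
        self.down = nn.Sequential(
            nn.Conv2d(width, width, 3, stride=2, padding=1), nn.ReLU())
        self.up = nn.Sequential(
            nn.ConvTranspose2d(width, width, 2, stride=2), nn.ReLU())
        self.head = nn.Conv2d(width, num_classes, 1)

    def forward(self, x):
        e = self.enc(x)
        d = self.up(self.down(e))
        # Residual (additive) skip: no extra channels, fewer parameters than
        # U-Net's concatenation, and a direct gradient path to the encoder.
        return self.head(d + e)

img = torch.randn(1, 3, 128, 128)
print(ResidualSegNet()(img).shape)  # torch.Size([1, 6, 128, 128])
```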


2021
Author(s): YueHua Feng, Shao-Wu Zhang, Qing-Qing Zhang, Chu-Han Zhang, Jian-Yu Shi

Abstract Although polypharmacy offers higher therapeutic efficacy and less drug resistance in combating complex diseases, drug-drug interactions (DDIs) may trigger unexpected pharmacological effects, such as side effects, adverse reactions, or even serious toxicity. It is therefore crucial for polypharmacy safety to identify DDIs and explore their underlying mechanisms (e.g., DDI types). However, detecting DDIs in assays is still time-consuming and costly because of the need for experimental search over a large space of drug combinations. Machine learning methods have proved to be a promising and efficient approach for preliminary DDI screening. Most shallow learning-based predictive methods focus only on whether one drug interacts with another. Deep learning (DL)-based predictive methods address the more realistic screening task of identifying DDI types, but they only predict the types of known DDIs, ignore the structural relationships between DDI entries, and cannot reveal the dependence between DDI types. We therefore propose a novel end-to-end deep learning-based predictive method (called MTDDI) that predicts DDIs as well as their types, exploring the underlying mechanism of DDIs. MTDDI uses an encoder derived from enhanced deep relational graph convolutional networks to capture the structural relationships between multi-type DDI entries, and adopts a tensor-like decoder to uniformly model both single-fold and multi-fold interactions, thereby reflecting the relations between DDI types. The results show that MTDDI is superior to other state-of-the-art deep learning-based methods. We validated the effectiveness and practical capability of MTDDI for predicting multi-type DDIs involving unknown DDIs, for both the single-fold and multi-fold cases. More importantly, MTDDI can reveal the dependency between DDI types. These observations are beneficial for uncovering the mechanism and regularity of DDIs.
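A loose sketch of the decoder side, scoring every DDI type for a drug pair with a bilinear-diagonal (DistMult-style) form over learned drug embeddings, is shown below; the embedding table stands in for MTDDI's graph encoder, and all dimensions are assumptions:

```python
import torch
import torch.nn as nn

class MultiTypeDDIDecoder(nn.Module):
    def __init__(self, num_drugs=100, num_types=20, dim=64):
        super().__init__()
        self.drug = nn.Embedding(num_drugs, dim)      # stand-in for a GCN encoder
        self.rel = nn.Parameter(torch.randn(num_types, dim))  # one vector per type

    def forward(self, drug_a, drug_b):
        ha, hb = self.drug(drug_a), self.drug(drug_b)  # (B, dim) each
        # Score type r as sum_k ha_k * rel_rk * hb_k (bilinear diagonal form).
        logits = (ha.unsqueeze(1) * self.rel.unsqueeze(0) * hb.unsqueeze(1)).sum(-1)
        return torch.sigmoid(logits)                   # (B, num_types) probabilities

pairs_a = torch.tensor([0, 5])
pairs_b = torch.tensor([3, 7])
print(MultiTypeDDIDecoder()(pairs_a, pairs_b).shape)   # torch.Size([2, 20])
```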


Complexity, 2021, Vol 2021, pp. 1-10
Author(s): Wang Long, Zheng Junfeng, Yu Hong, Ding Meng, Li Jiangyun

Slagging-off (i.e., slag removal) is an important preprocessing operation in steel-making that improves the purity of iron. Current manually operated slag removal schemes are inefficient and labor-intensive. Automatic slagging-off is desirable but challenging because reliable recognition of iron and slag is difficult. This work focuses on an efficient and accurate recognition algorithm for iron and slag, which is conducive to realizing automatic slagging-off. Motivated by the recent success of deep learning techniques in smart manufacturing, we introduce deep learning methods to this field for the first time. The monotonous gray values of industrial images, poor image quality, and the nonrigid shapes of iron and slag challenge existing fully convolutional networks (FCNs). To this end, we propose a novel spatial and feature graph convolutional network (SFGCN) module. The SFGCN module can be easily inserted into FCNs to improve their ability to reason over global contextual information, which helps enhance the segmentation accuracy of small objects and isolated areas. To verify the validity of the SFGCN module, we created an industrial dataset and conducted extensive experiments. The results show that the SFGCN module brings a consistent performance boost to a wide range of FCNs. Moreover, by adopting a lightweight backbone network, our method achieves real-time iron and slag segmentation. In future work, we will dedicate our efforts to weakly supervised learning for quick annotation of big data streams to improve the generalization ability of current models.
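In the spirit of such graph reasoning modules (though not SFGCN's actual design), the sketch below projects an FCN feature map onto a few graph nodes, mixes them, and projects back through a residual connection; the node count and widths are assumptions:

```python
import torch
import torch.nn as nn

class GraphReasoningModule(nn.Module):
    def __init__(self, channels=64, nodes=8):
        super().__init__()
        self.to_nodes = nn.Conv2d(channels, nodes, 1)  # soft pixel-to-node assignment
        self.node_mix = nn.Linear(channels, channels)  # message passing on nodes

    def forward(self, x):
        b, c, h, w = x.shape
        assign = torch.softmax(self.to_nodes(x).flatten(2), dim=2)  # (B, K, HW)
        feats = x.flatten(2)                                        # (B, C, HW)
        nodes = assign @ feats.transpose(1, 2)                      # (B, K, C)
        nodes = torch.relu(self.node_mix(nodes))                    # reason globally
        out = (assign.transpose(1, 2) @ nodes).transpose(1, 2)      # back to pixels
        return x + out.view(b, c, h, w)                             # residual insert

x = torch.randn(1, 64, 32, 32)
print(GraphReasoningModule()(x).shape)  # torch.Size([1, 64, 32, 32])
```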


2021, Vol 7 (1), pp. 2
Author(s): Mateo Gende, Joaquim de Moura, Jorge Novo, Pablo Charlón, Marcos Ortega

The Epiretinal Membrane (ERM) is an ocular disease that appears as a fibro-cellular layer of tissue over the retina, specifically over the Inner Limiting Membrane (ILM). It causes vision blurring and distortion, and its presence can be indicative of other ocular pathologies, such as diabetic macular edema. ERM diagnosis is usually performed by visually inspecting Optical Coherence Tomography (OCT) images, a manual process that is tiresome and prone to subjectivity. In this work, we present a deep learning methodology for the automatic segmentation and visualisation of the ERM in OCT volumes. By employing a Densely Connected Convolutional Network, every pixel in the ILM can be classified as either healthy or pathological, producing a segmentation of the region susceptible to ERM appearance. The methodology also produces an intuitive colour-map representation of ERM presence over a visualisation of the eye fundus created from the OCT volume. In a series of representative experiments conducted to evaluate this methodology, it achieved a Dice score of 0.826±0.112 and a Jaccard index of 0.714±0.155, demonstrating competitive performance compared with other works in the state of the art.
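As an illustration of dense connectivity for per-pixel classification (not the authors' network), the toy block below concatenates all earlier feature maps before each convolution and emits healthy/pathological logits per pixel; the growth rate and depth are assumptions:

```python
import torch
import torch.nn as nn

class TinyDenseBlock(nn.Module):
    def __init__(self, in_ch=1, growth=8, layers=3):
        super().__init__()
        self.convs = nn.ModuleList()
        ch = in_ch
        for _ in range(layers):
            self.convs.append(nn.Conv2d(ch, growth, 3, padding=1))
            ch += growth                      # dense connectivity: concat features
        self.head = nn.Conv2d(ch, 2, 1)       # healthy vs. pathological, per pixel

    def forward(self, x):
        feats = [x]
        for conv in self.convs:
            feats.append(torch.relu(conv(torch.cat(feats, dim=1))))
        return self.head(torch.cat(feats, dim=1))

bscan = torch.randn(1, 1, 64, 128)            # one OCT B-scan
logits = TinyDenseBlock()(bscan)              # (1, 2, 64, 128) per-pixel classes
erm_row = logits.argmax(1)[0, 32]             # e.g., labels along one ILM row
print(erm_row.shape)                          # torch.Size([128])
```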

