Automated Video Behavior Recognition of Pigs Using Two-Stream Convolutional Networks

Sensors, 2020, Vol 20 (4), pp. 1085
Author(s): Kaifeng Zhang, Dan Li, Jiayun Huang, Yifei Chen

Detecting pig behavior helps identify abnormal conditions, such as disease and dangerous movements, in a timely and effective manner, and thus plays an important role in ensuring the health and well-being of pigs. Manual monitoring of pig behavior by staff is time-consuming, subjective, and impractical, so there is an urgent need for methods that identify pig behavior automatically. In recent years, deep learning has gradually been applied to the study of pig behavior recognition. Existing studies judge a pig's behavior only from its posture in a still image frame, without considering the motion information of the behavior, even though optical flow reflects this motion information well. This study therefore took image frames and optical flow extracted from videos as two-stream inputs to fully capture the temporal and spatial characteristics of behavior. Two-stream convolutional network models based on deep learning were proposed for pig behavior recognition, including the inflated 3D convnet (I3D) and temporal segment networks (TSN), the latter with a feature extraction network of either a Residual Network (ResNet) or an Inception architecture (e.g., Inception with Batch Normalization (BN-Inception), InceptionV3, InceptionV4, or InceptionResNetV2). A standard pig video behavior dataset was created comprising 1000 videos of five behaviors (feeding, lying, walking, scratching, and mounting) recorded under natural conditions. The dataset was used to train and test the proposed models, and a series of comparative experiments was conducted. The results showed that the TSN model with a ResNet101 feature extraction network recognized pig feeding, lying, walking, scratching, and mounting behaviors with the highest average accuracy of 98.99%, at an average recognition time of 0.3163 s per video. The TSN model (ResNet101) is thus superior to the other models for the task of pig behavior recognition.
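As a rough sketch of this two-stream design, the PyTorch snippet below routes one RGB frame per segment through one ResNet-101 and a stack of optical-flow fields through another, then averages the segment-consensus class scores. The segment count, flow-stack depth, and late score fusion are illustrative assumptions, not details confirmed by the paper.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet101

class TwoStreamTSN(nn.Module):
    def __init__(self, num_classes=5, num_segments=3, flow_stack=10):
        super().__init__()
        self.num_segments = num_segments
        # Spatial stream: one RGB frame per segment.
        self.rgb_net = resnet101(weights=None)
        self.rgb_net.fc = nn.Linear(self.rgb_net.fc.in_features, num_classes)
        # Temporal stream: stacked optical flow (2 channels per flow frame).
        self.flow_net = resnet101(weights=None)
        self.flow_net.conv1 = nn.Conv2d(2 * flow_stack, 64, kernel_size=7,
                                        stride=2, padding=3, bias=False)
        self.flow_net.fc = nn.Linear(self.flow_net.fc.in_features, num_classes)

    def forward(self, rgb, flow):
        # rgb:  (B, S, 3, H, W) one frame per segment
        # flow: (B, S, 2*flow_stack, H, W) stacked flow per segment
        b, s = rgb.shape[:2]
        rgb_scores = self.rgb_net(rgb.flatten(0, 1)).view(b, s, -1).mean(1)
        flow_scores = self.flow_net(flow.flatten(0, 1)).view(b, s, -1).mean(1)
        # Late fusion: average the two streams' segment-consensus scores.
        return (rgb_scores + flow_scores) / 2

scores = TwoStreamTSN()(torch.randn(2, 3, 3, 224, 224),
                        torch.randn(2, 3, 20, 224, 224))
print(scores.shape)  # torch.Size([2, 5])
```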

2019, Vol 6 (1)
Author(s): Si Zhang, Hanghang Tong, Jiejun Xu, Ross Maciejewski

Abstract Graphs naturally appear in numerous application domains, ranging from social analysis and bioinformatics to computer vision. The unique capability of graphs to capture the structural relations among data allows more insights to be harvested than when data are analyzed in isolation. However, learning problems on graphs are often very challenging, because (1) many types of data, such as images and text, are not originally structured as graphs, and (2) for graph-structured data, the underlying connectivity patterns are often complex and diverse. On the other hand, representation learning has achieved great success in many areas. A potential solution is therefore to learn the representation of graphs in a low-dimensional Euclidean space such that the graph properties are preserved. Although tremendous efforts have been made to address the graph representation learning problem, many approaches still suffer from shallow learning mechanisms. Deep learning models on graphs (e.g., graph neural networks) have recently emerged in machine learning and related areas and have demonstrated superior performance on various problems. In this survey, among the numerous types of graph neural networks, we conduct a comprehensive review specifically of the emerging field of graph convolutional networks, one of the most prominent graph deep learning models. First, we group existing graph convolutional network models into two categories based on the types of convolutions and highlight some of these models in detail. Then, we categorize different graph convolutional networks according to their application areas. Finally, we present several open challenges in this area and discuss potential directions for future research.
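For readers new to the area, the following is a minimal sketch of the widely used spectral-style graph convolution in its symmetrically normalized (Kipf-Welling) form, which many of the surveyed models build upon; the layer sizes and the toy graph are arbitrary choices for illustration:

```python
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, x, adj):
        # x: (N, in_dim) node features, adj: (N, N) binary adjacency.
        a_hat = adj + torch.eye(adj.size(0))          # add self-loops
        deg = a_hat.sum(1)                            # node degrees
        d_inv_sqrt = deg.pow(-0.5)
        # D^{-1/2} (A + I) D^{-1/2}, the symmetric normalization.
        norm_adj = a_hat * d_inv_sqrt.unsqueeze(0) * d_inv_sqrt.unsqueeze(1)
        return torch.relu(norm_adj @ self.linear(x))  # propagate, then transform

x = torch.randn(4, 8)                                 # 4 nodes, 8 features each
adj = torch.tensor([[0, 1, 0, 0], [1, 0, 1, 1],
                    [0, 1, 0, 1], [0, 1, 1, 0]], dtype=torch.float)
print(GCNLayer(8, 16)(x, adj).shape)                  # torch.Size([4, 16])
```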


2020, Vol 2020, pp. 1-11
Author(s): Jie Shen, Mengxi Xu, Xinyu Du, Yunbo Xiong

Video surveillance is an important data source for urban computing and intelligence, but the low resolution of many existing surveillance devices limits its usefulness. Improving the resolution of surveillance video is therefore an important task in this field. In this paper, video resolution is improved by learning-based superresolution reconstruction. Unlike the superresolution reconstruction of static images, video superresolution is characterized by the use of motion information, yet there have been few studies in this area so far. To fully exploit motion information for video superresolution, this paper proposes a reconstruction method based on an efficient subpixel convolutional neural network in which optical flow is introduced into the deep learning network. Fusing the optical-flow features between successive frames compensates for missing inter-frame information and generates high-quality superresolution results. In addition, a subpixel convolution layer is added after the deep convolutional network to further improve the superresolution. Finally, experimental evaluations demonstrate the satisfactory performance of our method compared with previous methods and other deep learning networks; our method is also more efficient.
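A hedged sketch of the core idea, an efficient subpixel (PixelShuffle) network that takes the optical flow to a neighboring frame as extra input channels, might look as follows; the channel counts, depth, and upscaling factor are assumptions rather than the authors' exact configuration:

```python
import torch
import torch.nn as nn

class FlowESPCN(nn.Module):
    def __init__(self, scale=4):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(3 + 2, 64, 5, padding=2), nn.Tanh(),  # RGB + 2 flow channels
            nn.Conv2d(64, 32, 3, padding=1), nn.Tanh(),
            # Subpixel layer: emit scale**2 * 3 channels, then rearrange them
            # into a (scale*H, scale*W) RGB image with PixelShuffle.
            nn.Conv2d(32, 3 * scale ** 2, 3, padding=1),
            nn.PixelShuffle(scale),
        )

    def forward(self, lr_frame, flow):
        return self.body(torch.cat([lr_frame, flow], dim=1))

lr = torch.randn(1, 3, 32, 32)      # low-resolution frame
flow = torch.randn(1, 2, 32, 32)    # dense flow to the previous frame
print(FlowESPCN()(lr, flow).shape)  # torch.Size([1, 3, 128, 128])
```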


2019, Vol 11 (6), pp. 123
Author(s): Huanan Dong, Ming Wen, Zhouwang Yang

Vehicle speed estimation is an important problem in traffic surveillance. Many existing approaches are based on camera calibration, which has two shortcomings. First, calibration methods are sensitive to the environment, so the accuracy of the results is compromised when environmental conditions are not satisfied. Second, calibration-based methods rely on vehicle trajectories acquired through a two-stage detection and tracking process. To overcome these shortcomings, we propose an alternative end-to-end method based on 3-dimensional convolutional networks (3D ConvNets) that estimates average vehicle speed directly from video footage. Our method is characterized by three features. First, we use non-local blocks in the model to better capture spatial-temporal long-range dependencies. Second, we use optical flow as an input, since it encodes the speed and direction of pixel motion in an image. Third, we construct a multi-scale convolutional network that extracts information on various characteristics of vehicles in motion. The proposed method shows promising experimental results on a commonly used dataset, with a mean absolute error (MAE) of 2.71 km/h and a mean squared error (MSE) of 14.62.
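To make the end-to-end setup concrete, here is an illustrative 3D-ConvNet regressor over a clip of optical-flow fields; it omits the paper's non-local blocks and multi-scale branches, and all kernel sizes and input shapes are assumptions:

```python
import torch
import torch.nn as nn

class SpeedRegressor3D(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            # A plain 3D-conv stack; the paper's non-local and multi-scale
            # components are left out for brevity.
            nn.Conv3d(2, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool3d((1, 2, 2)),
            nn.Conv3d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1),          # global space-time pooling
        )
        self.head = nn.Linear(32, 1)          # scalar speed estimate (km/h)

    def forward(self, flow_clip):
        # flow_clip: (B, 2, T, H, W) stacked optical-flow fields
        return self.head(self.features(flow_clip).flatten(1))

clip = torch.randn(2, 2, 16, 112, 112)
print(SpeedRegressor3D()(clip).shape)  # torch.Size([2, 1])
```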


Biomolecules, 2021, Vol 11 (8), pp. 1119
Author(s): Shuang Wang, Mingjian Jiang, Shugang Zhang, Xiaofeng Wang, Qing Yuan, ...

In the process of drug discovery, identifying the interaction between a protein and a novel compound plays an important role. With the development of technology, deep learning methods have shown excellent performance in various settings. However, compound–protein interactions are complicated, and the features extracted by most deep models are not comprehensive, which limits performance to a certain extent. In this paper, we propose a multiscale convolutional network that extracts the local and global features of the protein and the topological features of the compound using different types of convolutional networks. The results show that our model obtains the best performance compared with existing deep learning methods.
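The multiscale idea for the protein branch can be sketched as parallel 1D convolutions with different kernel sizes over a sequence embedding; the vocabulary size, widths, and kernel sizes below are assumptions for illustration only:

```python
import torch
import torch.nn as nn

class MultiScaleProteinEncoder(nn.Module):
    def __init__(self, vocab=26, dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)   # amino-acid embedding
        # One branch per receptive-field size: small kernels capture local
        # motifs, large kernels capture broader (more global) context.
        self.branches = nn.ModuleList(
            nn.Conv1d(dim, dim, k, padding=k // 2) for k in (3, 7, 15)
        )

    def forward(self, seq_ids):
        x = self.embed(seq_ids).transpose(1, 2)          # (B, dim, L)
        feats = [torch.relu(b(x)).max(dim=2).values for b in self.branches]
        return torch.cat(feats, dim=1)                   # fused multiscale feature

seq = torch.randint(0, 26, (2, 200))                     # two protein sequences
print(MultiScaleProteinEncoder()(seq).shape)             # torch.Size([2, 192])
```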


Electronics, 2021, Vol 10 (9), pp. 1014
Author(s): Chengsheng Pan, Jiang Zhu, Zhixiang Kong, Huaifeng Shi, Wensheng Yang

Network traffic forecasting is essential for efficient network management and planning, and accurate long-term forecasting models are essential for proactive control of upcoming congestion events. Due to the complex spatial-temporal dependencies between traffic flows, traditional time series forecasting models are often unable to fully extract the spatial-temporal characteristics of the traffic. To address this issue, we propose a novel dual-channel spatial-temporal graph convolutional network (DC-STGCN) model. The proposed model consists of two temporal components that characterize the daily and weekly correlations of the network traffic. Each component contains a spatial-temporal feature extraction module consisting of a dual-channel graph convolutional network (DCGCN) and a gated recurrent unit (GRU). The DCGCN in turn consists of an adjacency feature extraction module (AGCN) and a correlation feature extraction module (PGCN), which capture the connectivity between nodes and the proximity correlation, respectively. The GRU then extracts the temporal characteristics of the traffic. Experimental results on real-world network datasets show that the DC-STGCN model outperforms existing baselines in prediction accuracy and is capable of making long-term predictions.
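The basic "graph convolution for space, GRU for time" pattern underlying such models can be sketched as follows; the single shared adjacency and the layer sizes are simplifying assumptions and do not reproduce the dual-channel (AGCN/PGCN) design:

```python
import torch
import torch.nn as nn

class GCNGRUForecaster(nn.Module):
    def __init__(self, num_nodes, in_dim=1, hidden=32):
        super().__init__()
        self.spatial = nn.Linear(in_dim, hidden)    # per-node feature transform
        self.gru = nn.GRU(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, 1)             # next-step traffic per node

    def forward(self, x, norm_adj):
        # x: (B, T, N, in_dim), norm_adj: (N, N) normalized adjacency
        h = torch.relu(norm_adj @ self.spatial(x))  # graph conv at each step
        b, t, n, d = h.shape
        h = h.permute(0, 2, 1, 3).reshape(b * n, t, d)  # one sequence per node
        _, last = self.gru(h)                       # temporal dependencies
        return self.out(last.squeeze(0)).view(b, n)

x = torch.randn(4, 12, 10, 1)                       # 12 past steps, 10 nodes
adj = torch.softmax(torch.randn(10, 10), dim=1)     # stand-in normalized adjacency
print(GCNGRUForecaster(10)(x, adj).shape)           # torch.Size([4, 10])
```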


Author(s): A. Sokolova, A. Konushin

In this work, we investigate the problem of recognizing people by their gait. For this task, we implement a deep learning approach that uses optical flow as the main source of motion information, combining neural feature extraction with an additional embedding of the descriptors to improve the representation. To find the best heuristics, we compare several deep neural network architectures and learning and classification strategies. The experiments were conducted on two popular gait recognition datasets, allowing us to investigate their advantages and disadvantages as well as the transferability of the considered methods.
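A minimal version of this pipeline, a CNN over a stack of optical-flow maps followed by an embedding layer whose L2-normalized output supports cosine-similarity matching, might look like this (all sizes are assumptions):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GaitEmbedder(nn.Module):
    def __init__(self, flow_stack=10, embed_dim=128):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(2 * flow_stack, 32, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.fc = nn.Linear(64, embed_dim)   # the additional embedding step

    def forward(self, flow):
        z = self.fc(self.cnn(flow).flatten(1))
        return F.normalize(z, dim=1)         # unit-norm gait descriptor

a, b = torch.randn(1, 20, 64, 64), torch.randn(1, 20, 64, 64)
model = GaitEmbedder()
print(float(model(a) @ model(b).t()))        # cosine similarity of two walks
```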


2019, Vol 11 (18), pp. 2142
Author(s): Lianfa Li

Semantic segmentation is a fundamental means of extracting information from remotely sensed images at the pixel level. Deep learning has enabled considerable improvements in the efficiency and accuracy of semantic segmentation of general images, with typical models ranging from benchmarks such as fully convolutional networks, U-Net, Micro-Net, and dilated residual networks to the more recently developed DeepLab 3+. However, many of these models were originally developed for the segmentation of general or medical images and videos and are not directly applicable to remotely sensed images, and studies of deep learning for semantic segmentation of remotely sensed images remain limited. This paper presents a novel, flexible autoencoder-based deep learning architecture that makes extensive use of residual learning and multiscaling for robust semantic segmentation of remotely sensed land-use images. In this architecture, a deep residual autoencoder is generalized to a fully convolutional network in which residual connections are implemented within and between all encoding and decoding layers. Compared with the concatenated shortcuts in U-Net, these residual connections reduce the number of trainable parameters and improve learning efficiency by enabling extensive backpropagation of errors. In addition, resizing or atrous spatial pyramid pooling (ASPP) can be leveraged to capture multiscale information from the input images and enhance robustness to scale variations. The residual learning and multiscaling strategies improve the trained model's generalizability, as demonstrated in the semantic segmentation of land-use types in two real-world datasets of remotely sensed images. Compared with U-Net, the proposed method improves the Jaccard index (JI), or mean intersection over union (MIoU), by 4-11% in the training phase and by 3-9% in the validation and testing phases. With its flexible deep learning architecture, the proposed approach can be easily applied and transferred to the semantic segmentation of land-use and other surface variables in remotely sensed images.
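The contrast with U-Net's concatenated shortcuts can be illustrated by an additive (residual) encoder-decoder skip, which adds no extra channels; the toy depth and widths below are assumptions:

```python
import torch
import torch.nn as nn

class ResidualSegNet(nn.Module):
    def __init__(self, in_ch=3, num_classes=6, width=32):
        super().__init__()
        self.enc = nn.Sequential(nn.Conv2d(in_ch, width, 3, padding=1), nn.ReLU())
        self.down = nn.Sequential(
            nn.Conv2d(width, width, 3, stride=2, padding=1), nn.ReLU())
        self.up = nn.Sequential(
            nn.ConvTranspose2d(width, width, 2, stride=2), nn.ReLU())
        self.head = nn.Conv2d(width, num_classes, 1)

    def forward(self, x):
        e = self.enc(x)
        d = self.up(self.down(e))
        # Residual (additive) skip: no extra channels, fewer parameters than
        # U-Net's concatenation, and a direct gradient path to the encoder.
        return self.head(d + e)

img = torch.randn(1, 3, 128, 128)
print(ResidualSegNet()(img).shape)  # torch.Size([1, 6, 128, 128])
```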


2021
Author(s): YueHua Feng, Shao-Wu Zhang, Qing-Qing Zhang, Chu-Han Zhang, Jian-Yu Shi

Abstract Although polypharmacy offers higher therapeutic efficacy and less drug resistance in combating complex diseases, drug-drug interactions (DDIs) may trigger unexpected pharmacological effects, such as side effects, adverse reactions, or even serious toxicity. It is therefore crucial for polypharmacy safety to identify DDIs and explore their underlying mechanisms (e.g., DDI types). However, detecting DDIs in assays is still time-consuming and costly because of the need for experimental search over a large space of drug combinations. Machine learning methods have proved to be a promising and efficient approach for preliminary DDI screening. Most shallow learning-based predictive methods focus only on whether one drug interacts with another. Deep learning (DL)-based predictive methods address the more realistic screening task of identifying DDI types, but they only predict the types of known DDIs, ignore the structural relationships between DDI entries, and cannot reveal the dependence between DDI types. We therefore propose a novel end-to-end deep learning-based predictive method (called MTDDI) that predicts DDIs as well as their types, exploring the underlying mechanism of DDIs. MTDDI uses an encoder derived from enhanced deep relational graph convolutional networks to capture the structural relationships between multi-type DDI entries, and adopts a tensor-like decoder to uniformly model both single-fold and multi-fold interactions, thereby reflecting the relations between DDI types. The results show that MTDDI is superior to other state-of-the-art deep learning-based methods. We validated the effectiveness and practical capability of MTDDI for predicting multi-type DDIs involving unknown DDIs, for both the single-fold and multi-fold cases. More importantly, MTDDI can reveal the dependency between DDI types. These observations are beneficial for uncovering the mechanism and regularity of DDIs.
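A loose sketch of the decoder side, scoring every DDI type for a drug pair with a bilinear-diagonal (DistMult-style) form over learned drug embeddings, is shown below; the embedding table stands in for MTDDI's graph encoder, and all dimensions are assumptions:

```python
import torch
import torch.nn as nn

class MultiTypeDDIDecoder(nn.Module):
    def __init__(self, num_drugs=100, num_types=20, dim=64):
        super().__init__()
        self.drug = nn.Embedding(num_drugs, dim)      # stand-in for a GCN encoder
        self.rel = nn.Parameter(torch.randn(num_types, dim))  # one vector per type

    def forward(self, drug_a, drug_b):
        ha, hb = self.drug(drug_a), self.drug(drug_b)  # (B, dim) each
        # Score type r as sum_k ha_k * rel_rk * hb_k (bilinear diagonal form).
        logits = (ha.unsqueeze(1) * self.rel.unsqueeze(0) * hb.unsqueeze(1)).sum(-1)
        return torch.sigmoid(logits)                   # (B, num_types) probabilities

pairs_a = torch.tensor([0, 5])
pairs_b = torch.tensor([3, 7])
print(MultiTypeDDIDecoder()(pairs_a, pairs_b).shape)   # torch.Size([2, 20])
```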


Complexity, 2021, Vol 2021, pp. 1-10
Author(s): Wang Long, Zheng Junfeng, Yu Hong, Ding Meng, Li Jiangyun

Slagging-off (i.e., slag removal) is an important preprocessing operation in steel-making that improves the purity of iron. Current manually operated slag removal schemes are inefficient and labor-intensive. Automatic slagging-off is desirable but challenging because reliable recognition of iron and slag is difficult. This work focuses on an efficient and accurate recognition algorithm for iron and slag, which is conducive to realizing automatic slagging-off. Motivated by the recent success of deep learning techniques in smart manufacturing, we introduce deep learning methods to this field for the first time. The monotonous gray values of industrial images, poor image quality, and the nonrigid shapes of iron and slag challenge existing fully convolutional networks (FCNs). To this end, we propose a novel spatial and feature graph convolutional network (SFGCN) module. The SFGCN module can be easily inserted into FCNs to improve their ability to reason over global contextual information, which helps enhance the segmentation accuracy of small objects and isolated areas. To verify the validity of the SFGCN module, we created an industrial dataset and conducted extensive experiments. The results show that the SFGCN module brings a consistent performance boost to a wide range of FCNs. Moreover, by adopting a lightweight backbone network, our method achieves real-time iron and slag segmentation. In future work, we will dedicate our efforts to weakly supervised learning for quick annotation of big data streams to improve the generalization ability of current models.
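In the spirit of such graph reasoning modules (though not SFGCN's actual design), the sketch below projects an FCN feature map onto a few graph nodes, mixes them, and projects back through a residual connection; the node count and widths are assumptions:

```python
import torch
import torch.nn as nn

class GraphReasoningModule(nn.Module):
    def __init__(self, channels=64, nodes=8):
        super().__init__()
        self.to_nodes = nn.Conv2d(channels, nodes, 1)  # soft pixel-to-node assignment
        self.node_mix = nn.Linear(channels, channels)  # message passing on nodes

    def forward(self, x):
        b, c, h, w = x.shape
        assign = torch.softmax(self.to_nodes(x).flatten(2), dim=2)  # (B, K, HW)
        feats = x.flatten(2)                                        # (B, C, HW)
        nodes = assign @ feats.transpose(1, 2)                      # (B, K, C)
        nodes = torch.relu(self.node_mix(nodes))                    # reason globally
        out = (assign.transpose(1, 2) @ nodes).transpose(1, 2)      # back to pixels
        return x + out.view(b, c, h, w)                             # residual insert

x = torch.randn(1, 64, 32, 32)
print(GraphReasoningModule()(x).shape)  # torch.Size([1, 64, 32, 32])
```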


2021, Vol 7 (1), pp. 2
Author(s): Mateo Gende, Joaquim de Moura, Jorge Novo, Pablo Charlón, Marcos Ortega

The Epiretinal Membrane (ERM) is an ocular disease that appears as a fibro-cellular layer of tissue over the retina, specifically over the Inner Limiting Membrane (ILM). It causes vision blurring and distortion, and its presence can be indicative of other ocular pathologies, such as diabetic macular edema. ERM diagnosis is usually performed by visually inspecting Optical Coherence Tomography (OCT) images, a manual process that is tiresome and prone to subjectivity. In this work, we present a deep learning methodology for the automatic segmentation and visualisation of the ERM in OCT volumes. By employing a Densely Connected Convolutional Network, every pixel in the ILM can be classified as either healthy or pathological, producing a segmentation of the region susceptible to ERM appearance. The methodology also produces an intuitive colour-map representation of ERM presence over a visualisation of the eye fundus created from the OCT volume. In a series of representative experiments conducted to evaluate this methodology, it achieved a Dice score of 0.826±0.112 and a Jaccard index of 0.714±0.155, demonstrating competitive performance compared with other works in the state of the art.
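As an illustration of dense connectivity for per-pixel classification (not the authors' network), the toy block below concatenates all earlier feature maps before each convolution and emits healthy/pathological logits per pixel; the growth rate and depth are assumptions:

```python
import torch
import torch.nn as nn

class TinyDenseBlock(nn.Module):
    def __init__(self, in_ch=1, growth=8, layers=3):
        super().__init__()
        self.convs = nn.ModuleList()
        ch = in_ch
        for _ in range(layers):
            self.convs.append(nn.Conv2d(ch, growth, 3, padding=1))
            ch += growth                      # dense connectivity: concat features
        self.head = nn.Conv2d(ch, 2, 1)       # healthy vs. pathological, per pixel

    def forward(self, x):
        feats = [x]
        for conv in self.convs:
            feats.append(torch.relu(conv(torch.cat(feats, dim=1))))
        return self.head(torch.cat(feats, dim=1))

bscan = torch.randn(1, 1, 64, 128)            # one OCT B-scan
logits = TinyDenseBlock()(bscan)              # (1, 2, 64, 128) per-pixel classes
erm_row = logits.argmax(1)[0, 32]             # e.g., labels along one ILM row
print(erm_row.shape)                          # torch.Size([128])
```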

