Fast human-animal detection from highly cluttered camera-trap images using joint background modeling and deep learning classification

Author(s):  
Hayder Yousif ◽  
Jianhe Yuan ◽  
Roland Kays ◽  
Zhihai He

2019 ◽  
Author(s):  
Hayder Yousif

[ACCESS RESTRICTED TO THE UNIVERSITY OF MISSOURI AT REQUEST OF AUTHOR.] Camera traps are a popular tool to sample animal populations because they are noninvasive, detect a variety of species, and can record many thousands of animal detections per deployment. Cameras are typically set to take bursts of multiple images for each detection, and are deployed in arrays of dozens or hundreds of sites, often resulting in millions of images per study. The task of converting images to animal detection records from such large image collections is daunting, and made worse by situations that generate copious empty pictures from false triggers (e.g., camera malfunction or moving vegetation) or pictures of humans. We offer the first widely available computer vision tool for processing camera trap images. Our results show that the tool is accurate and results in substantial time savings for processing large image datasets, thus improving our ability to monitor wildlife across large scales with camera traps.

In this dissertation, we have developed new image/video processing and computer vision algorithms for efficient and accurate object detection and sequence-level classification from natural-scene camera-trap images. This work addresses the following major tasks:

(1) Human-animal detection. We develop a fast and accurate scheme for human-animal detection from highly cluttered camera-trap images using joint background modeling and deep learning classification. Specifically, we first develop an effective background modeling and subtraction scheme to generate region proposals for the foreground objects (a schematic sketch of this step follows the abstract). We then develop a cross-frame image patch verification to reduce the number of foreground object proposals. Finally, we perform complexity-accuracy analysis of deep convolutional neural networks (DCNNs) to develop a fast deep learning classification scheme that classifies these region proposals into three categories: humans, animals, and background patches. The optimized DCNN is able to maintain a high level of accuracy while reducing the computational complexity by 14 times. Our experimental results demonstrate that the proposed method outperforms existing methods on the camera-trap dataset.

(2) Object segmentation from natural scenes. We first design and train a fast DCNN for animal-human-background object classification, which is used to analyze the input image and generate multi-layer feature maps representing the responses of different image regions to the animal-human-background classifier. From these feature maps, we construct the so-called deep objectness graph for accurate animal-human object segmentation with graph cut. The segmented object regions from each image in the sequence are then verified and fused in the temporal domain using background modeling. Our experimental results demonstrate that our proposed method outperforms existing state-of-the-art methods on the camera-trap dataset with highly cluttered natural scenes.

(3) DCNN-domain background modeling. We replace the background model with a new, more efficient deep-learning-based model. The input frames are segmented into regions through the deep objectness graph, and the region boundaries of the input frames are then multiplied by each other to obtain the moving-region patches. We construct the background representation using the temporal information of the co-located patches. We propose to fuse the subtraction and foreground/background pixel classification of two representations: (a) chromaticity and (b) deep pixel information.

(4) Sequence-level object classification. We propose a new method for sequence-level video recognition with application to animal species recognition from camera-trap images. First, using background modeling and cross-frame patch verification, we develop a scheme to generate candidate object regions, or object proposals, in the spatiotemporal domain. Second, we develop a dynamic programming optimization approach to identify the best temporal subset of object proposals (illustrated schematically below). Third, we aggregate and fuse the features of these selected object proposals for efficient sequence-level animal species classification.
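The proposal-generation step in task (1) can be pictured with a minimal sketch: a stock OpenCV background subtractor stands in for the dissertation's background model, and moving blobs become candidate patches for the DCNN classifier. This illustrates the general technique only, not the authors' implementation; the choice of MOG2 and the minimum blob area are placeholder assumptions.

```python
# Minimal sketch of background-subtraction-based region proposals.
# NOT the dissertation's implementation: MOG2 stands in for its
# background model, and min_area is an arbitrary assumption.
import cv2
import numpy as np

def propose_regions(frames, min_area=500):
    """Yield (frame_index, x, y, w, h) boxes for moving foreground blobs."""
    subtractor = cv2.createBackgroundSubtractorMOG2(detectShadows=False)
    for i, frame in enumerate(frames):
        mask = subtractor.apply(frame)
        # Clean up speckle noise before extracting blobs.
        mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN,
                                np.ones((5, 5), np.uint8))
        contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                       cv2.CHAIN_APPROX_SIMPLE)
        for c in contours:
            if cv2.contourArea(c) >= min_area:
                x, y, w, h = cv2.boundingRect(c)
                # The patch frames[i][y:y+h, x:x+w] would go to the
                # human/animal/background classifier.
                yield i, x, y, w, h
```
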
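Similarly, the dynamic-programming selection in task (4) can be illustrated with a generic Viterbi-style recursion that picks one proposal per frame, trading per-proposal confidence against temporal consistency. The abstract does not specify the actual cost terms, so the score and link inputs below are placeholders.

```python
# Illustrative Viterbi-style dynamic program over object proposals.
# The dissertation's actual cost formulation is not given in the
# abstract; scores and links here are generic placeholders.
import numpy as np

def select_track(scores, links):
    """scores: list of T arrays; scores[t][k] = confidence of proposal k
    in frame t. links: list of T-1 matrices; links[t][j, k] = temporal
    consistency (e.g., IoU) between proposal j in frame t and proposal k
    in frame t+1. Returns one proposal index per frame maximizing the
    total confidence plus consistency."""
    value = np.asarray(scores[0], dtype=float)
    back = []
    for t in range(1, len(scores)):
        # Best predecessor value for each proposal in frame t.
        trans = value[:, None] + np.asarray(links[t - 1], dtype=float)
        back.append(trans.argmax(axis=0))
        value = trans.max(axis=0) + np.asarray(scores[t], dtype=float)
    # Backtrack from the best final proposal.
    path = [int(value.argmax())]
    for ptr in reversed(back):
        path.append(int(ptr[path[-1]]))
    return path[::-1]
```

With a single proposal per frame the recursion degenerates to keeping every frame, which matches the intuition that the dynamic program only matters when several proposals compete within a sequence.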


2020 ◽  
Vol E103.B (12) ◽  
pp. 1394-1402
Author(s):  
Hiroshi SAITO ◽  
Tatsuki OTAKE ◽  
Hayato KATO ◽  
Masayuki TOKUTAKE ◽  
Shogo SEMBA ◽  
...  

2019 ◽  
Author(s):  
Eric Devost ◽  
Sandra Lai ◽  
Nicolas Casajus ◽  
Dominique Berteaux

SUMMARY

Camera traps now represent a reliable, efficient and cost-effective technique to monitor wildlife and collect biological data in the field. However, efficiently extracting information from the massive amount of images generated is often extremely time-consuming and may now represent the most rate-limiting step in camera trap studies.

To help overcome this challenge, we developed FoxMask, a new tool performing the automatic detection of animal presence in short sequences of camera trap images. FoxMask uses background estimation and foreground segmentation algorithms to detect the presence of moving objects (most likely, animals) in images.

We analyzed a sample dataset from camera traps used to monitor activity on arctic fox Vulpes lagopus dens to test the parameter settings and the performance of the algorithm. The shape and color of arctic foxes, and their background at snowmelt and during the summer growing season, were highly variable, thus offering challenging testing conditions. We compared the automated animal detection performed by FoxMask to a manual review of the image series.

The performance analysis indicated that the proportion of images correctly classified by FoxMask as containing an animal or not was very high (>90%). FoxMask is thus highly efficient at reducing the workload by eliminating most false triggers (images without an animal). We provide parameter recommendations to facilitate usage and we present the cases where the algorithm performs less efficiently to stimulate further development.

FoxMask is an easy-to-use tool freely available to ecologists performing camera trap data extraction. By minimizing analytical time, computer-assisted image analysis will allow collection of increased sample sizes and testing of new biological questions.
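The idea FoxMask builds on (estimate a background for a short burst, then segment what moves) can be illustrated with a toy sketch. This is not FoxMask's actual algorithm: the per-pixel median background, the difference threshold, and the minimum foreground size below are all placeholder assumptions.

```python
# Toy illustration of burst-level background estimation and
# foreground segmentation; NOT FoxMask's actual algorithm.
import cv2
import numpy as np

def flag_animal_frames(paths, diff_thresh=30, min_fg_pixels=2000):
    """Return the image paths in a short burst likely to contain an animal."""
    gray = [cv2.imread(p, cv2.IMREAD_GRAYSCALE) for p in paths]
    # Per-pixel median over the burst approximates the static background.
    background = np.median(np.stack(gray), axis=0).astype(np.uint8)
    flagged = []
    for p, g in zip(paths, gray):
        fg = cv2.absdiff(g, background) > diff_thresh
        if fg.sum() >= min_fg_pixels:  # enough moving pixels: likely animal
            flagged.append(p)
    return flagged
```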



Sensors ◽  
2019 ◽  
Vol 19 (24) ◽  
pp. 5436 ◽  
Author(s):  
Jayme Garcia Arnal Barbedo ◽  
Luciano Vieira Koenigkan ◽  
Thiago Teixeira Santos ◽  
Patrícia Menezes Santos

Unmanned aerial vehicles (UAVs) are being increasingly viewed as valuable tools to aid the management of farms. This kind of technology can be particularly useful in the context of extensive cattle farming, as production areas tend to be expansive and animals tend to be more loosely monitored. With the advent of deep learning, and convolutional neural networks (CNNs) in particular, extracting relevant information from aerial images has become more effective. Despite the technological advancements in drone, imaging and machine learning technologies, the application of UAVs for cattle monitoring is far from being thoroughly studied, with many research gaps still remaining. In this context, the objectives of this study were threefold: (1) to determine the highest possible accuracy that could be achieved in the detection of animals of the Canchim breed, which is visually similar to the Nelore breed (Bos taurus indicus); (2) to determine the ideal ground sample distance (GSD) for animal detection; (3) to determine the most accurate CNN architecture for this specific problem. The experiments involved 1853 images containing 8629 samples of animals, and 15 different CNN architectures were tested. A total of 900 models were trained (15 CNN architectures × 3 spatial resolutions × 2 datasets × 10-fold cross validation), allowing for a deep analysis of the several aspects that impact the detection of cattle using aerial images captured using UAVs. Results revealed that many CNN architectures are robust enough to reliably detect animals in aerial images even under far from ideal conditions, indicating the viability of using UAVs for cattle monitoring.
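As a quick sanity check on the scale of the experiment, the 900 trained models follow directly from the evaluation grid. The sketch below enumerates that grid; the architecture names, GSD values, and the train/evaluate call are placeholders, not the study's code.

```python
# Enumerate the evaluation grid described above:
# 15 architectures x 3 GSDs x 2 datasets x 10 folds = 900 models.
# All concrete names and values below are hypothetical placeholders.
from itertools import product

architectures = [f"cnn_{i}" for i in range(15)]  # 15 hypothetical CNNs
gsds_cm = [0.5, 1.0, 2.0]                        # example GSDs, not the paper's
datasets = ["dataset_a", "dataset_b"]
folds = range(10)                                # 10-fold cross validation

runs = list(product(architectures, gsds_cm, datasets, folds))
assert len(runs) == 15 * 3 * 2 * 10 == 900

for arch, gsd, ds, fold in runs:
    pass  # a train_and_evaluate(arch, gsd, ds, fold) call would go here
```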


2016 ◽  
Vol 36 ◽  
pp. 145-151 ◽  
Author(s):  
Jennifer L. Price Tack ◽  
Brian S. West ◽  
Conor P. McGowan ◽  
Stephen S. Ditchkoff ◽  
Stanley J. Reeves ◽  
...  
