Wildlife recognition from camera trap data using computer vision algorithms

2021 ◽ Author(s): Yuanqin Dai
2019 ◽ Vol 10 (4) ◽ pp. 461-470 ◽ Author(s): Stefan Schneider, Graham W. Taylor, Stefan Linquist, Stefan C. Kremer

Author(s): Omiros Pantazis, Gabriel Brostow, Kate Jones, Oisin Mac Aodha

Recent years have ushered in a vast array of low-cost and reliable sensors that are capable of capturing large quantities of audio and visual information from the natural world. In the case of biodiversity monitoring, camera traps (i.e., remote cameras that take images when movement is detected; Kays et al. 2009) have shown themselves to be particularly effective tools for the automated monitoring of the presence and activity of different animal species. However, this ease of deployment comes at a cost, as even a small-scale camera trapping project can result in hundreds of thousands of images that need to be reviewed. Until recently, this review process was an extremely time-consuming endeavor: it required domain experts to manually inspect each image to determine if it contained a species of interest and to identify, where possible, which species was present. Fortunately, in the last five years, advances in machine learning have resulted in a new suite of algorithms that are capable of automatically performing image classification tasks like species classification.

The effectiveness of deep neural networks (Norouzzadeh et al. 2018), coupled with transfer learning (i.e., tuning a model that is pretrained on a larger dataset; Willi et al. 2018), has resulted in high levels of accuracy on camera trap images. However, camera trap images exhibit unique challenges that are typically not present in the standard benchmark datasets used in computer vision. For example, objects of interest are often heavily occluded, the appearance of a scene can change dramatically over time due to changes in weather and lighting, and while the overall number of images can be large, the variation in locations is often limited (Schneider et al. 2020). Combined, these challenges mean that reaching high performance on species classification requires collecting a large amount of annotated data to train the deep models. This again takes a significant amount of time for each project, time that could be better spent addressing the ecological or conservation questions of interest.

Self-supervised learning is a paradigm in machine learning that attempts to forgo the need for manual supervision by instead learning informative representations from images directly, e.g. by transforming an image in two different ways that do not affect the semantics of the depicted object, and learning by imposing similarity between the two transformed views. This is a tantalizing proposition for camera trap data, as it has the potential to drastically reduce the amount of time required to annotate data. The current performance of these methods on standard computer vision benchmarks is encouraging, as it suggests that self-supervised models have begun to reach the accuracy of their fully supervised counterparts for tasks like classifying everyday objects in images (Chen et al. 2020). However, existing self-supervised methods can struggle when applied to tasks that contain highly similar, i.e. fine-grained, object categories such as different species of plants and animals (Van Horn et al. 2021). To this end, we explore the effectiveness of self-supervised learning when applied to camera trap imagery.
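To make the augmentation-based idea above concrete, here is a minimal sketch of SimCLR-style contrastive learning (Chen et al. 2020) in PyTorch: two randomly augmented views of each unlabeled camera trap image are embedded, and the loss pulls the two views of the same image together while pushing apart views of different images. The encoder, augmentations, and hyperparameters are illustrative assumptions, not the setup used in the study.

```python
# Sketch of contrastive self-supervised learning: two semantics-preserving
# augmentations of the same image are pulled together in embedding space
# (SimCLR-style NT-Xent loss; Chen et al. 2020). Encoder, augmentations,
# and hyperparameters are illustrative assumptions.
import torch
import torch.nn.functional as F
import torchvision.transforms as T
from torchvision.models import resnet18

augment = T.Compose([
    T.RandomResizedCrop(224),
    T.RandomHorizontalFlip(),
    T.ColorJitter(0.4, 0.4, 0.4, 0.1),
    T.ToTensor(),
])

encoder = resnet18(num_classes=128)  # final fc layer acts as projection head

def nt_xent(z1, z2, temperature=0.5):
    """NT-Xent loss: each view's positive is the other view of the same image."""
    z = F.normalize(torch.cat([z1, z2]), dim=1)   # (2N, d) unit embeddings
    sim = z @ z.t() / temperature                 # pairwise cosine similarities
    n = z1.size(0)
    mask = torch.eye(2 * n, dtype=torch.bool)
    sim.masked_fill_(mask, float('-inf'))         # exclude self-similarity
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)])
    return F.cross_entropy(sim, targets)

# One training step on a batch of unlabeled camera trap images (PIL images):
# views1 = torch.stack([augment(img) for img in batch])
# views2 = torch.stack([augment(img) for img in batch])
# loss = nt_xent(encoder(views1), encoder(views2))
# loss.backward()
```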
We show that these methods can be used to train image classifiers with a significant reduction in manual supervision. Furthermore, we extend this analysis by showing that, with some careful design considerations, off-the-shelf self-supervised methods can be made to learn even more effective image representations for automated species classification. We show that exploiting cues at training time related to where and when a given image was captured results in further improvements in classification performance. We demonstrate, across several different camera trapping datasets, that it is possible to achieve similar, and sometimes even superior, accuracy to fully supervised transfer learning-based methods using ten times less manual supervision. Finally, we discuss some of the limitations of the outlined approaches and their implications for automated species classification from images.
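As a hedged illustration of how the "where and when" cues might be exploited (an assumption here, not necessarily the authors' exact procedure), one simple approach is to treat images captured by the same camera within a short time window as additional positive pairs for the contrastive loss, since they likely show the same individual or species:

```python
# Hypothetical sketch: mine extra positive pairs from capture metadata.
# Images from the same camera location taken close together in time are
# paired; the 30-minute window and pairing rule are illustrative choices.
from datetime import timedelta

def context_positive_pairs(records, window_minutes=30):
    """records: list of dicts with 'index', 'location', 'timestamp' keys.
    Returns index pairs of images likely depicting the same subject."""
    by_location = {}
    for r in records:
        by_location.setdefault(r['location'], []).append(r)
    pairs = []
    for recs in by_location.values():
        recs.sort(key=lambda r: r['timestamp'])
        for a, b in zip(recs, recs[1:]):  # consecutive captures per camera
            if b['timestamp'] - a['timestamp'] <= timedelta(minutes=window_minutes):
                pairs.append((a['index'], b['index']))
    return pairs
```

The returned index pairs could then be fed to a contrastive objective alongside the augmentation-based positives sketched earlier.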


1985 ◽ Vol 30 (1) ◽ pp. 47-47 ◽ Author(s): Herman Bouma

1983 ◽ Vol 2 (5) ◽ pp. 130 ◽ Author(s): J.A. Losty, P.R. Watkins

Metrologiya ◽ 2020 ◽ pp. 15-37 ◽ Author(s): L. P. Bass, Yu. A. Plastinin, I. Yu. Skryabysheva

The use of technical (computer) vision systems for Earth remote sensing is considered. An overview of the software and hardware used in computer vision systems for processing satellite images is presented. Algorithmic methods for data processing using trained neural networks are described, and examples of the algorithmic processing of satellite images by means of artificial convolutional neural networks are given. Ways of increasing the recognition accuracy of satellite images are identified, and practical applications of convolutional neural networks onboard microsatellites for Earth remote sensing are presented.
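For illustration, a minimal convolutional network of the kind this abstract refers to, classifying multispectral satellite image patches, might look as follows; the architecture, band count, and class count are assumptions, not the authors' model.

```python
# Minimal illustrative CNN for satellite patch classification; layer sizes,
# input bands, and number of classes are assumptions for the sketch.
import torch
import torch.nn as nn

class SatellitePatchCNN(nn.Module):
    def __init__(self, in_bands=4, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_bands, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, num_classes)
        )

    def forward(self, x):  # x: (N, bands, H, W) multispectral patches
        return self.head(self.features(x))

# logits = SatellitePatchCNN()(torch.randn(8, 4, 64, 64))
```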


2018 ◽ Vol 1 (2) ◽ pp. 17-23 ◽ Author(s): Takialddin Al Smadi

This survey outlines the use of computer vision for image and video processing in multidisciplinary applications, both in academia and in industry. The scope of the paper covers the theoretical and practical aspects of image and video processing, in addition to computer vision, from essential research to the evolution of applications. Various subjects of image processing and computer vision are demonstrated, spanning the evolution of mobile augmented reality (MAR) applications, augmented reality under 3D modeling and real-time depth imaging, and video processing algorithms for higher-depth video compression. On the mobile platform side, an automatic computer vision system for citrus fruit has been implemented, where Bayesian classification with Boundary Growing is used to detect text in the video scene. The paper also illustrates the usability of a hand-based interactive method for the portable projector based on augmented reality. © 2018 JASET, International Scholars and Researchers Association

