Spatially Invariant Unsupervised Object Detection with Convolutional Neural Networks

Author(s):  
Eric Crawford ◽  
Joelle Pineau

There are many reasons to expect an ability to reason in terms of objects to be a crucial skill for any generally intelligent agent. Indeed, recent machine learning literature is replete with examples of the benefits of object-like representations: generalization, transfer to new tasks, and interpretability, among others. However, in order to reason in terms of objects, agents need a way of discovering and detecting objects in the visual world - a task which we call unsupervised object detection. This task has received significantly less attention in the literature than its supervised counterpart, especially in the case of large images containing many objects. In the current work, we develop a neural network architecture that effectively addresses this large-image, many-object setting. In particular, we combine ideas from Attend, Infer, Repeat (AIR), which performs unsupervised object detection but does not scale well, with recent developments in supervised object detection. We replace AIR’s core recurrent network with a convolutional (and thus spatially invariant) network, and make use of an object-specification scheme that describes the location of objects with respect to local grid cells rather than the image as a whole. Through a series of experiments, we demonstrate a number of features of our architecture: that, unlike AIR, it is able to discover and detect objects in large, many-object scenes; that it has a significant ability to generalize to images that are larger and contain more objects than images encountered during training; and that it is able to discover and detect objects with enough accuracy to facilitate non-trivial downstream processing.
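The grid-cell object specification mentioned in the abstract can be illustrated with a minimal sketch. The parameterisation below (a presence score, a box centre given as a fraction of the cell size, and a box size in pixels) is an assumption chosen for illustration, in the spirit of grid-based supervised detectors; it is not taken from the paper, and the names `decode_cell_box` and `decode_grid` are hypothetical.

```python
def decode_cell_box(row, col, cell_size, offset_y, offset_x, height, width):
    """Convert a box described relative to one grid cell into image coordinates.

    Assumed (hypothetical) parameterisation:
      offset_y, offset_x: box centre as a fraction of the cell size,
                          measured from the cell's top-left corner.
      height, width:      box size in pixels.
    Returns (top, left, bottom, right) in image pixels.
    """
    center_y = (row + offset_y) * cell_size
    center_x = (col + offset_x) * cell_size
    return (
        center_y - height / 2,
        center_x - width / 2,
        center_y + height / 2,
        center_x + width / 2,
    )


def decode_grid(predictions, cell_size, presence_threshold=0.5):
    """Decode per-cell predictions into a list of image-level boxes.

    predictions maps (row, col) -> (presence, offset_y, offset_x, height, width).
    Cells whose presence score falls below the threshold are skipped.
    """
    boxes = []
    for (row, col), (presence, oy, ox, h, w) in predictions.items():
        if presence >= presence_threshold:
            boxes.append(decode_cell_box(row, col, cell_size, oy, ox, h, w))
    return boxes
```

Because every cell is decoded with the same rule, the same decoder applies unchanged to a larger grid, which is one way to read the abstract's claim about generalizing to larger images.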

2020 ◽  
Vol 226 ◽  
pp. 02020
Author(s):  
Alexey V. Stadnik ◽  
Pavel S. Sazhin ◽  
Slavomir Hnatic

The performance of neural networks is one of the most important topics in the field of computer vision. In this work, we analyze the speed of object detection using the well-known YOLOv3 neural network architecture in different frameworks under different hardware requirements. We obtain results that allow us to formulate preliminary qualitative conclusions about the feasibility of various hardware scenarios for solving tasks in real-time environments.
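Measuring detection speed of this kind usually comes down to timing repeated inference calls. The following is a minimal sketch of such a harness, not the authors' benchmark: `infer` stands in for any detector call, and the warm-up count and the 30 frames-per-second real-time threshold are assumptions.

```python
import time


def measure_fps(infer, frames, warmup=2):
    """Return frames processed per second by the callable `infer`.

    A few warm-up calls are made first so that one-off setup costs
    (lazy initialisation, caches) do not distort the measurement.
    """
    for frame in frames[:warmup]:
        infer(frame)
    start = time.perf_counter()
    for frame in frames:
        infer(frame)
    elapsed = time.perf_counter() - start
    return len(frames) / elapsed if elapsed > 0 else float("inf")


def is_real_time(fps, target_fps=30.0):
    """Check a measured rate against an assumed real-time threshold."""
    return fps >= target_fps
```

Running the same harness with the same frames across frameworks and devices gives directly comparable rates.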


Doklady BGUIR ◽  
2022 ◽  
Vol 19 (8) ◽  
pp. 40-44
Author(s):  
P. A. Vyaznikov ◽  
I. D. Kotilevets

The paper presents the development methods and the results of research on the effectiveness of the seq2seq neural network architecture with the Visual Attention mechanism for solving the im2latex problem. The essence of the task is to create a neural network capable of converting an image containing mathematical expressions into the corresponding expression in the LaTeX markup language. This problem belongs to the Image Captioning type: the neural network scans the image and, based on the extracted features, generates a description in natural language. The proposed solution uses the seq2seq architecture, which contains the Encoder and Decoder mechanisms, as well as Bahdanau Attention. A series of experiments was conducted on training and measuring the effectiveness of several neural network models.
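Bahdanau (additive) attention scores each encoder state against the current decoder state as v · tanh(W_enc h + W_dec s), normalises the scores with a softmax, and returns the weighted sum of encoder states as the context vector. The sketch below is a plain-Python illustration of that computation with hand-supplied weights; the function and argument names are hypothetical and it is not the paper's model.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def additive_attention(encoder_states, decoder_state, W_enc, W_dec, v):
    """Return (attention weights, context vector) for one decoding step."""
    projected_dec = matvec(W_dec, decoder_state)
    scores = []
    for h in encoder_states:
        projected_enc = matvec(W_enc, h)
        hidden = [math.tanh(a + b) for a, b in zip(projected_enc, projected_dec)]
        scores.append(sum(vi * hi for vi, hi in zip(v, hidden)))
    weights = softmax(scores)
    dim = len(encoder_states[0])
    # Context vector: attention-weighted sum of the encoder states.
    context = [
        sum(w * h[i] for w, h in zip(weights, encoder_states)) for i in range(dim)
    ]
    return weights, context
```

In an im2latex setting the encoder states would be image feature vectors and the decoder state would come from the LaTeX token decoder; here both are plain lists of floats.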


2019 ◽  
Vol 35 (2) ◽  
pp. 135-145
Author(s):  
Chi Cuong Nguyen ◽  
Giang Son Tran ◽  
Thi Phuong Nghiem ◽  
Jean-Christophe Burie ◽  
Chi Mai Luong

Real-time smile detection from facial images is useful in many real-world applications, such as automatic photo capturing in mobile phone cameras or interactive distance learning. In this paper, we study different architectures of deep object detection networks for solving the real-time smile detection problem. We then propose a combination of a lightweight convolutional neural network architecture (BKNet) with an efficient object detection framework (RetinaNet). The evaluation on two datasets (GENKI-4K, UCF Selfie) with a mid-range hardware device (GTX TITAN Black) shows that our proposed method improves both the accuracy and the inference time of the original RetinaNet and reaches real-time performance. In comparison with the state-of-the-art object detection framework (YOLO), our method has a higher inference time, but it still reaches real-time performance and obtains higher smile detection accuracy on both datasets.


2020 ◽  
Vol 2020 (10) ◽  
pp. 54-62
Author(s):  
Oleksii VASYLIEV ◽  

The problem of applying neural networks to calculate ratings used in banking when deciding whether to grant loans to borrowers is considered. The task is to determine the rating function of the borrower based on a set of statistical data on the effectiveness of loans provided by the bank. When constructing a regression model to calculate the rating function, it is necessary to know its general form. If the form is known, the task is to calculate the parameters that are included in the expression for the rating function. In contrast to this approach, when neural networks are used there is no need to specify the general form of the rating function. Instead, a certain neural network architecture is chosen and its parameters are calculated on the basis of statistical data. Importantly, the same neural network architecture can be used to process different sets of statistical data. The disadvantages of using neural networks include the need to calculate a large number of parameters. There is also no universal algorithm that would determine the optimal neural network architecture. As an example of the use of neural networks to determine the borrower's rating, a model system is considered, in which the borrower's rating is determined by a known non-analytical rating function. A neural network with two inner layers, which contain three and two neurons, respectively, and have a sigmoid activation function, is used for modeling. It is shown that the use of the neural network allows the borrower's rating function to be restored with quite acceptable accuracy.
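The network described above, with two inner layers of three and two sigmoid neurons, is small enough to sketch in plain Python. The abstract does not state the number of input features or the form of the output layer, so the single sigmoid output and the caller-supplied input size are assumptions, and the random initial weights are untrained placeholders.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def init_layer(n_in, n_out, rng):
    """Create one dense layer: n_out rows of n_in weights, plus zero biases."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def build_network(n_inputs, seed=0):
    """Layers sized n_inputs -> 3 -> 2 -> 1 (the output layer is an assumption)."""
    rng = random.Random(seed)
    sizes = [n_inputs, 3, 2, 1]
    return [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]


def forward(network, features):
    """Forward pass with a sigmoid activation after every layer."""
    activations = features
    for weights, biases in network:
        activations = [
            sigmoid(sum(w * a for w, a in zip(row, activations)) + b)
            for row, b in zip(weights, biases)
        ]
    return activations[0]


def count_parameters(network):
    """Total number of weights and biases to be fitted."""
    return sum(len(w) * len(w[0]) + len(b) for w, b in network)
```

With two hypothetical input features this architecture has 20 parameters (9 + 8 + 3), which gives a concrete sense of the parameter count even for a very small network.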


2021 ◽  
Vol 06 ◽  
Author(s):  
Ayekpam Chandralekha Devi ◽  
G. K. Hamsavi ◽  
Simran Sahota ◽  
Rochak Mittal ◽  
Hrishikesh A. Tavanandi ◽  
...  

Abstract: Algae (both micro and macro) have gained considerable attention in the recent past for their high-value commercial products. They are a source of various biomolecules with commercial applications ranging from nutraceuticals to fuels. Phycobiliproteins are one such class of high-value, low-volume compounds, obtained mainly from micro and macro algae. In order to tap this bioresource, a significant amount of work has been carried out on large-scale production of algal biomass. However, work on the downstream processing of phycobiliproteins (PBPs) from algae is scarce, especially in the case of macroalgae. Cell wall disruption is difficult in both micro and macro algae because of the structure and composition of their cell walls. At the same time, there are several challenges in the purification of phycobiliproteins. This review article focuses on recent developments in the downstream processing of phycobiliproteins (mainly phycocyanins and phycoerythrins) from micro and macroalgae. The current status, recent advancements, and potential technologies under development are summarised, and future directions for this research area are provided.

