scene recognition
Recently Published Documents

TOTAL DOCUMENTS: 622 (five years: 220)
H-INDEX: 35 (five years: 8)

2021 ◽  
Vol 15 ◽  
Author(s):  
Anna C. Geuzebroek ◽  
Karlijn Woutersen ◽  
Albert V. van den Berg

Background: Occipital cortex lesions (OCLs) typically result in visual field defects (VFDs) contralateral to the damage. VFDs are usually mapped with perimetry involving the detection of point targets. This, however, ignores the important role of integrating visual information across locations in many tasks of everyday life. Here, we ask whether standard perimetry can fully characterize the consequences of OCLs. We compare the performance of OCL participants and of healthy observers with simulated VFDs on a rapid scene discrimination task. While the healthy observers only suffer the loss of part of the visual scene, the damage in the OCL participants may further compromise global visual processing.

Methods: VFDs were mapped with Humphrey perimetry, and participants performed two rapid scene discrimination tasks. In healthy participants, the VFDs were simulated with hemi- and quadrant occlusions. Additionally, the GIST model, a computational model of scene recognition, was used to make individual predictions based on the VFDs.

Results: The GIST model predicted the performance of controls with respect to the effects of the local occlusion. Using the individual predictions of the GIST model, we find that the variability between the OCL participants is much larger than the extent of the VFD can account for. The OCL participants can further be categorized as performing worse than, the same as, or better than their VFD would predict.

Conclusions: While in healthy observers the extent of the simulated occlusion accounts for their performance loss, the OCL participants' performance is not fully determined by the extent or shape of their VFD as measured with Humphrey perimetry. While some OCL participants are indeed only limited by the local occlusion of the scene, for others the lesions compromised the visual network in a more global and disruptive way. Yet one participant outperformed the healthy observers, suggesting a possible adaptation to the VFD. Preliminary analysis of neuroimaging data suggests that damage to the lateral geniculate nucleus and corpus callosum might be associated with the larger disruption of rapid scene discrimination. We believe our approach offers a useful behavioral tool for investigating why similar VFDs can produce widely differing limitations in everyday life.
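The study above predicts performance from a scene descriptor computed over a partially occluded field of view. As a rough illustration of that idea, the sketch below computes a simplified GIST-like descriptor (mean gradient energy over a spatial grid; the real GIST model pools Gabor filter energies at several scales and orientations) and simulates a hemifield defect by replacing half the image with its mean luminance. All function names and parameters here are illustrative, not taken from the paper.

```python
import numpy as np

def grid_descriptor(image, grid=4):
    """Simplified GIST-like descriptor: mean gradient energy per grid cell.
    (The real GIST model pools Gabor filter energies; gradients stand in here.)"""
    gy, gx = np.gradient(image.astype(float))
    energy = gx**2 + gy**2
    h, w = energy.shape
    cells = []
    for i in range(grid):
        for j in range(grid):
            cell = energy[i*h//grid:(i+1)*h//grid, j*w//grid:(j+1)*w//grid]
            cells.append(cell.mean())
    return np.array(cells)

def simulate_hemifield_vfd(image, side="left"):
    """Simulate a homonymous hemifield defect by occluding half the image
    with the mean luminance, as in the simulated-VFD condition."""
    occluded = image.astype(float).copy()
    w = occluded.shape[1]
    if side == "left":
        occluded[:, :w // 2] = occluded.mean()
    else:
        occluded[:, w // 2:] = occluded.mean()
    return occluded

rng = np.random.default_rng(0)
scene = rng.random((64, 64))
d_full = grid_descriptor(scene)
d_occl = grid_descriptor(simulate_hemifield_vfd(scene, "left"))
# The occluded hemifield contributes no gradient energy, modeling the
# scene information lost to the VFD.
```

Comparing `d_full` and `d_occl` per cell shows exactly which parts of the descriptor a given defect removes, which is the information the model uses to predict discrimination performance.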


Technologies ◽  
2021 ◽  
Vol 9 (4) ◽  
pp. 100
Author(s):  
Kirill Sviatov ◽  
Nadejda Yarushkina ◽  
Daniil Kanin ◽  
Ivan Rubtcov ◽  
Roman Jitkov ◽  
...  

The article describes a structural and functional model of a self-driving car control system, which generates a wide class of mathematical problems. Currently, control systems for self-driving cars are considered at several levels of abstraction and implementation: mechanics, electronics, perception, scene recognition, control, security, and the integration of all subsystems into a solid whole. Modern research often considers the problems to be solved at each of these levels separately. In this paper, a parameterized model of the integration of individual components into a complex control system for a self-driving car is considered. Such a model simplifies the design and development of self-driving control systems with configurable automation tools, taking into account the specifics of the problem being solved. The parameterized model can be used for CAD design in the field of self-driving car development. A full development cycle of a control system for a self-driving truck was implemented and run in the "Robocross 2021" competition. The software solution was tested on more than 40 launches of a self-driving truck. Parameterization made it possible to speed up the development of the control system, measured in man-hours, by 1.5 times compared with the experience of the authors, who participated in the same competition in 2018 and 2019. The proposed parameterization was used in the development of the individual CAD elements described in this article. Additionally, the implementation of specific modules and functions remains a field for experimental research.
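The abstract's central idea is a parameterized description of subsystems that lets tooling check their integration automatically. A minimal sketch of what such a parameterization might look like is given below; the subsystem names, rates, and topic names are hypothetical, since the authors' actual CAD parameter set is not described here.

```python
from dataclasses import dataclass, field

# Hypothetical parameterization of a layered control system; the actual
# parameter set used by the authors is not given in the abstract.
@dataclass
class SubsystemParams:
    name: str
    rate_hz: float                       # update rate of the subsystem
    inputs: list = field(default_factory=list)
    outputs: list = field(default_factory=list)

@dataclass
class VehicleConfig:
    subsystems: list

    def validate(self):
        """Every subsystem input must be produced by another subsystem or be
        a raw sensor topic -- the kind of integration check a parameterized
        model makes automatable."""
        produced = {t for s in self.subsystems for t in s.outputs}
        return [i for s in self.subsystems for i in s.inputs
                if i not in produced and not i.startswith("sensor/")]

config = VehicleConfig([
    SubsystemParams("perception", 10.0, ["sensor/camera"], ["obstacles"]),
    SubsystemParams("scene_recognition", 5.0, ["sensor/camera"], ["scene"]),
    SubsystemParams("planner", 10.0, ["obstacles", "scene"], ["trajectory"]),
    SubsystemParams("control", 50.0, ["trajectory"], ["actuation"]),
])
unresolved = config.validate()   # empty list: all data dependencies satisfied
```

Because the integration constraints live in data rather than code, swapping a subsystem or retargeting the stack to a new vehicle reduces to editing the configuration, which is consistent with the man-hour savings the authors report.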


2021 ◽  
Vol 13 (24) ◽  
pp. 4999
Author(s):  
Boyong He ◽  
Xianjiang Li ◽  
Bo Huang ◽  
Enhui Gu ◽  
Weijie Guo ◽  
...  

As a data-driven approach, deep learning requires a large amount of annotated data for training to obtain a sufficiently accurate and generalized model, especially in the field of computer vision. However, when compared with generic object recognition datasets, aerial image datasets are more challenging to acquire and more expensive to label. Obtaining a large amount of high-quality aerial image data for object recognition and image understanding is an urgent problem. Existing studies show that synthetic data can effectively reduce the amount of training data required. Therefore, in this paper, we propose the first synthetic aerial image dataset for ship recognition, called UnityShip. This dataset contains over 100,000 synthetic images and 194,054 ship instances, including 79 different ship models in ten categories and six different large virtual scenes with different time periods, weather environments, and altitudes. The annotations include environmental information, instance-level horizontal bounding boxes, oriented bounding boxes, and the type and ID of each ship. This provides the basis for object detection, oriented object detection, fine-grained recognition, and scene recognition. To investigate the applications of UnityShip, the synthetic data were validated for model pre-training and data augmentation using three different object detection algorithms and six existing real-world ship detection datasets. Our experimental results show that for small-sized and medium-sized real-world datasets, the synthetic data achieve an improvement in model pre-training and data augmentation, showing the value and potential of synthetic data in aerial image recognition and understanding tasks.
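The UnityShip annotations include both horizontal and oriented bounding boxes. When such OBB annotations are used with a standard horizontal-box detector, each oriented box must be converted to its enclosing axis-aligned box; the sketch below shows that standard conversion (a generic geometric identity, not code from the paper).

```python
import math

def obb_to_hbb(cx, cy, w, h, angle_deg):
    """Smallest axis-aligned (horizontal) box enclosing an oriented box
    given as center, width, height, and rotation angle."""
    a = math.radians(angle_deg)
    c, s = abs(math.cos(a)), abs(math.sin(a))
    half_w = (w * c + h * s) / 2   # projected half-extent on the x axis
    half_h = (w * s + h * c) / 2   # projected half-extent on the y axis
    return (cx - half_w, cy - half_h, cx + half_w, cy + half_h)

# A 40x20 ship box rotated 90 degrees encloses as a 20x40 axis-aligned box.
box = obb_to_hbb(100, 50, 40, 20, 90)
```

Keeping both annotation forms, as UnityShip does, avoids this lossy conversion for detectors that can consume oriented boxes directly.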


2021 ◽  
Author(s):  
Xiaohui Yuan ◽  
Zhinan Qiao ◽  
Abolfazl Meyarian
Keyword(s):  

2021 ◽  
Vol 2021 ◽  
pp. 1-9
Author(s):  
Sihua Sun

Audio scene recognition is a task that enables devices to understand their environment through digital audio analysis; it belongs to the field of computational auditory scene analysis. At present, the technology is widely used in intelligent wearable devices, robot sensing services, and other application scenarios. To explore the applicability of machine learning to digital audio scene recognition, an audio scene recognition method based on optimized audio processing and a convolutional neural network is proposed. First, unlike the traditional feature extraction method based on mel-frequency cepstral coefficients (MFCCs), the proposed method uses a binaural representation and harmonic-percussive source separation to optimize the original audio and extract the corresponding features, so that the system can exploit the spatial features of the scene and thereby improve recognition accuracy. Then, an audio scene recognition system with a two-layer convolution module is designed and implemented. In terms of network structure, we draw on the VGGNet architecture from image recognition to increase network depth and improve system flexibility. Experimental analysis shows that, compared with traditional machine learning methods, the proposed method greatly improves the recognition accuracy of each scene and generalizes better across different data.
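A common implementation of harmonic-percussive source separation, the preprocessing step named above, is median filtering on the magnitude spectrogram (Fitzgerald's method, also the basis of librosa's `hpss`): harmonic energy forms horizontal ridges (stable over time), percussive energy vertical ones (broadband at one instant). The sketch below shows that general technique on a toy spectrogram; it is not necessarily the authors' exact pipeline.

```python
import numpy as np
from scipy.ndimage import median_filter

def hpss_masks(spec, kernel=17):
    """Median-filtering HPSS: median along time enhances harmonic ridges,
    median along frequency enhances percussive ones; soft masks split the
    spectrogram into the two components."""
    harm = median_filter(spec, size=(1, kernel))   # smooth across time
    perc = median_filter(spec, size=(kernel, 1))   # smooth across frequency
    eps = 1e-10
    mask_h = harm / (harm + perc + eps)
    mask_p = perc / (harm + perc + eps)
    return spec * mask_h, spec * mask_p

# Toy spectrogram (freq x time): one horizontal line (a steady tone) plus
# one vertical line (a click); HPSS routes each to its own component.
spec = np.zeros((64, 64))
spec[20, :] = 1.0    # harmonic: constant frequency over time
spec[:, 40] = 1.0    # percussive: broadband at one instant
h, p = hpss_masks(spec)
```

Feeding the two separated channels (per ear, in the binaural setting) to the CNN gives it features in which tonal background and transient events are already disentangled.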


2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
MVV Prasad Kantipudi ◽  
Sandeep Kumar ◽  
Ashish Kumar Jha

Deep learning is a subfield of artificial intelligence that allows a computer to adopt and learn new rules. Deep learning algorithms can identify images, objects, observations, texts, and other structures. In recent years, scene text recognition has inspired many researchers in the computer vision community, yet it still needs improvement because of the poor performance of existing algorithms. This paper proposes a novel approach to scene text recognition that integrates a bidirectional LSTM (Bi-LSTM) with a deep convolutional neural network (CNN). In the proposed method, the contour of the image is first identified and then fed into the CNN, which generates an ordered sequence of features from the contoured image. The feature sequence is then encoded by the Bi-LSTM, a tool well suited to modeling sequential features. Hence, this paper combines two powerful mechanisms for extracting features from the image, and the contour-based input makes recognition faster, giving this technique an advantage over existing methods. The proposed methodology is evaluated on the MSRA-TD500 dataset, the SVHN dataset, a vehicle number plate dataset, the SVT dataset, and random datasets, achieving accuracies of 95.22%, 92.25%, 96.69%, 94.58%, and 98.12%, respectively. According to quantitative and qualitative analysis, the approach is promising in terms of accuracy and precision rate.
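The core of the architecture above is the bidirectional recurrence over the CNN's feature sequence. The sketch below implements only that part in plain numpy, with random weights and hypothetical dimensions, to show how the forward and backward passes are concatenated per time step; the actual CNN front-end and output decoder are omitted.

```python
import numpy as np

def lstm_pass(xs, Wx, Wh, b, h0, c0):
    """One-directional LSTM over a list of feature vectors."""
    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))
    h, c, out = h0, c0, []
    H = h0.shape[0]
    for x in xs:
        z = Wx @ x + Wh @ h + b                    # all four gates at once
        i, f, g, o = z[:H], z[H:2*H], z[2*H:3*H], z[3*H:]
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
        out.append(h)
    return out

def bilstm(xs, params_f, params_b):
    """Bi-LSTM: run the sequence forward and backward, concatenate per step,
    so each output sees both left and right context."""
    fwd = lstm_pass(xs, *params_f)
    bwd = lstm_pass(xs[::-1], *params_b)[::-1]
    return [np.concatenate([f, b]) for f, b in zip(fwd, bwd)]

rng = np.random.default_rng(1)
D, H, T = 8, 4, 6                                  # feature dim, hidden, steps
def make_params():
    return (rng.normal(size=(4*H, D)) * 0.1,       # input weights (4 gates)
            rng.normal(size=(4*H, H)) * 0.1,       # recurrent weights
            np.zeros(4*H), np.zeros(H), np.zeros(H))
seq = [rng.normal(size=D) for _ in range(T)]       # stands in for CNN features
encoded = bilstm(seq, make_params(), make_params())
```

Each encoded step has dimension 2H, which is why a Bi-LSTM doubles the feature width handed to the recognition head.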


2021 ◽  
Vol 10 (11) ◽  
pp. 740
Author(s):  
Yao Shen ◽  
Yiyi Xu ◽  
Lefeng Liu

The built environment reshapes the various scenes that can be perceived, experienced, and interpreted, known collectively as city images. City images emerge as a complex composite of various imagery elements. Previous studies demonstrated the coincidence between city images produced by experts with prior knowledge and those extracted from high-frequency photo content generated by citizens. The realistic city images hidden behind volunteered geotagged photos, however, are more complex than assumed. The dominant elements are only one side of the city image; more importantly, the interactions between elements are also crucial for understanding how city images are structured in people's minds. This paper focuses on the composition of the city image: the various interactions between imagery elements and areas of a city. These interactions are identified in four aspects: co-presence, hierarchy, heterogeneity, and differentiation, which are quantified and visualized respectively as a correlation network, a dendrogram, spatial clusters, and scattergrams in a framework using scene recognition with volunteered, georeferenced photos. The outputs are interdependent elements, typologies of elements, imagery areas, and group preferences, all of which are essential for urban design processes. In an application to Central Beijing, the significant interdependencies between elements prove complex and do not necessarily involve only the most frequent elements. The main typologies and principal imagery elements differ from the categories predefined in the image recognition model. The imagery areas detected with adaptive thresholds suggest spatially varying spillover effects of named areas, and their typologies can be well annotated by the detected principal imagery elements. Aggregating data from different social media platforms proves necessary for calibrating an unbiased scope of the city image, as any single data source can hardly capture the whole sample. The differentiation between local and non-local users is found to relate to their preferences and activity space. The results provide more comprehensive insights into the complex composition of city images and its effects on placemaking.
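The co-presence aspect above reduces to a correlation network over element presence across photos. A minimal sketch of that computation follows, on a synthetic photo-by-element presence matrix; the element names and probabilities are invented for illustration.

```python
import numpy as np

# Hypothetical photo-by-element presence matrix: each row is one geotagged
# photo, each column a scene element returned by a recognition model.
elements = ["tower", "park", "street", "water"]
rng = np.random.default_rng(2)
photos = rng.random((500, 4)) < np.array([0.3, 0.4, 0.6, 0.2])
photos[:, 3] |= photos[:, 0]        # force "water" to co-occur with "tower"

def copresence_network(presence, threshold=0.2):
    """Pairwise correlation of element presence across photos; keep the
    edges whose correlation exceeds the threshold."""
    corr = np.corrcoef(presence.T.astype(float))
    return [(elements[i], elements[j], round(corr[i, j], 2))
            for i in range(len(elements))
            for j in range(i + 1, len(elements))
            if corr[i, j] > threshold]

edges = copresence_network(photos)
```

As the abstract notes, an element pair can form a strong edge even when neither element is among the most frequent: correlation captures how reliably they appear together, not how often each appears alone.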


2021 ◽  
Author(s):  
Shibo Gong ◽  
Yansong Gong ◽  
Longfei Su ◽  
Jing Yuan ◽  
Fengchi Sun

Processes ◽  
2021 ◽  
Vol 9 (11) ◽  
pp. 1955
Author(s):  
Gang Liu ◽  
Rongxu Zhang ◽  
Yanyan Wang ◽  
Rongjun Man

Applying the scene recognition capabilities of intelligent robots to forklift AGV equipment is of great significance for improving the automation and intelligence of distribution centers. Using cameras to collect image information about the environment can break through the limitations of traditional guide rails and positioning equipment, and benefits path planning and system expansion in the later stages of warehouse construction. Taking the forklift AGVs of a distribution center as the research object, this paper explores scene recognition and path planning for forklift AGV equipment based on a deep convolutional neural network. Based on the characteristics of the warehouse environment, a semantic segmentation network for warehouse scene recognition is established, and a scene recognition method suitable for that environment is proposed, so that the equipment can learn environmental features with deep learning and achieve accurate recognition in a large-scale environment without additional environmental landmarks. This provides an effective convolutional neural network model for scene recognition by forklift AGVs in the warehouse environment. The activation function layer of the model is also studied, using activation functions with better gradient behavior. The results show that the H-Swish activation function outperforms the ReLU function in recognition accuracy and computational cost, making it an economical choice for mobile deployment.
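H-Swish, the activation the abstract favors over ReLU, was introduced with MobileNetV3 as x · ReLU6(x + 3) / 6: a piecewise-linear approximation of swish that avoids the exponential and is therefore cheap on mobile hardware. A minimal numpy sketch of both functions:

```python
import numpy as np

def relu(x):
    """Standard ReLU: zero for all negative inputs."""
    return np.maximum(x, 0.0)

def h_swish(x):
    """Hard swish (MobileNetV3): x * ReLU6(x + 3) / 6. Matches ReLU for
    large |x| but stays smooth and slightly negative near zero, which
    helps gradients flow, at piecewise-linear cost."""
    return x * np.clip(x + 3.0, 0.0, 6.0) / 6.0

x = np.array([-4.0, -1.0, 0.0, 1.0, 4.0])
y_relu, y_hsw = relu(x), h_swish(x)
```

Unlike ReLU, h_swish(-1) is nonzero, so small negative pre-activations still carry gradient; for x ≥ 3 the two functions coincide exactly.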

