Scene classification of ambiguous visual information

Author(s):  
L. Dong ◽  
E. Izquierdo
Electronics ◽  
2021 ◽  
Vol 10 (4) ◽  
pp. 371
Author(s):  
Yerin Lee ◽  
Soyoung Lim ◽  
Il-Youp Kwak

Acoustic scene classification (ASC) categorizes an audio file based on the environment in which it was recorded. The task has long been studied within the Detection and Classification of Acoustic Scenes and Events (DCASE) challenges. This paper presents the Chung-Ang University team's solution to Task 1 of the DCASE 2020 challenge. Task 1 addressed two challenges that ASC faces in real-world applications: audio recorded with different recording devices should be classified reliably, and the model should have low complexity. We proposed two models to overcome these problems. First, a more general classification model was proposed by combining harmonic-percussive source separation (HPSS) and deltas/delta-deltas features with four different models. Second, using the same features, depthwise separable convolution was applied to the convolutional layers to develop a low-complexity model. Moreover, using gradient-weighted class activation mapping (Grad-CAM), we investigated which parts of the features our model attends to when making its predictions. Our proposed systems ranked 9th and 7th in the competition on these two subtasks, respectively.
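The low-complexity benefit of depthwise separable convolution can be made concrete by counting parameters. A minimal sketch, assuming an illustrative 3×3 kernel with 64 input and 128 output channels (not the team's actual layer sizes):

```python
# Parameter counts for a single convolutional layer (bias terms omitted).
# The kernel size and channel counts are illustrative placeholders, not
# taken from the DCASE 2020 submission.

def standard_conv_params(k, c_in, c_out):
    # A standard convolution learns one k x k kernel per
    # (input channel, output channel) pair.
    return k * k * c_in * c_out

def depthwise_separable_params(k, c_in, c_out):
    # Depthwise step: one k x k kernel per input channel.
    # Pointwise step: a 1 x 1 convolution that mixes channels.
    return k * k * c_in + c_in * c_out

k, c_in, c_out = 3, 64, 128
print(standard_conv_params(k, c_in, c_out))        # 73728
print(depthwise_separable_params(k, c_in, c_out))  # 8768
```

For these sizes the separable layer needs roughly 8× fewer parameters, which is the kind of reduction that makes a model fit a low-complexity budget.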


Author(s):  
Wenshuai Chen ◽  
Shuiping Gou ◽  
Xinlin Wang ◽  
Licheng Jiao ◽  
Changzhe Jiao ◽  
...  

2021 ◽  
Author(s):  
Anh Nguyen ◽  
Khoa Pham ◽  
Dat Ngo ◽  
Thanh Ngo ◽  
Lam Pham

This paper provides an analysis of state-of-the-art activation functions with respect to supervised classification with deep neural networks. The activation functions comprise the Rectified Linear Unit (ReLU), Exponential Linear Unit (ELU), Scaled Exponential Linear Unit (SELU), Gaussian Error Linear Unit (GELU), and the Inverse Square Root Linear Unit (ISRLU). For evaluation, experiments are conducted on two deep learning network architectures integrating these activation functions. The first model, based on a Multilayer Perceptron (MLP), is evaluated on the MNIST dataset. Meanwhile, the second model, a VGGish-based architecture, is applied to Acoustic Scene Classification (ASC) Task 1A of the DCASE 2018 challenge, to evaluate whether these activation functions work well across different datasets as well as different network architectures.
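The five activation functions compared above have simple closed forms. A minimal sketch of the scalar versions, using common default values for the α parameters (the paper's own hyperparameters are not restated here):

```python
import math

def relu(x):
    return max(0.0, x)

def elu(x, alpha=1.0):
    return x if x >= 0 else alpha * (math.exp(x) - 1.0)

def selu(x, alpha=1.6732632423543772, scale=1.0507009873554805):
    # Constants from Klambauer et al. (2017), "Self-Normalizing Neural Networks"
    return scale * (x if x >= 0 else alpha * (math.exp(x) - 1.0))

def gelu(x):
    # tanh approximation of GELU (Hendrycks & Gimpel, 2016)
    return 0.5 * x * (1.0 + math.tanh(
        math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)))

def isrlu(x, alpha=1.0):
    # Inverse Square Root Linear Unit: identity for x >= 0,
    # x / sqrt(1 + alpha * x^2) for x < 0
    return x if x >= 0 else x / math.sqrt(1.0 + alpha * x * x)
```

All five are identity-like for positive inputs; they differ in how smoothly they saturate (or shrink) negative inputs, which is what the comparison probes.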


2019 ◽  
Vol 16 (5) ◽  
pp. 558-571
Author(s):  
A. V. Belyakova ◽  
B. V. Saveliev

Introduction. High-quality training of vehicle drivers is possible only with the proper formation of professional skills. The skills a driver needs to control a vehicle safely can be formed, at the initial stage of training, by using simulators. Simulators allow the actions the driver performs to be automated without exposing the student to risk. Therefore, the purpose of the paper is to analyze the application of simulators in driver training.

Materials and methods. The paper presents the basic psychophysiological principles of the learning process that should be taken into account when using simulators for driver training. The authors classify the car simulators used for driver training by their information models. Existing information models of simulators are divided into two groups: those reproducing only visual information, without imitation of vestibular information, and those simulating both visual and vestibular information. The analysis reflects the advantages and disadvantages of each type of information model.

Results. As a result, the authors propose two systematizing features: the viewing angle of the visual information and the simulation of vestibular information.

Discussion and conclusions. The research is useful not only for further scientific development, but also for the selection of simulators and the organization of the educational process in driving schools.


Author(s):  
Adam Csapo ◽  
Barna Resko ◽  
Morten Lind ◽  
Peter Baranyi

The computerized modeling of cognitive visual information has been a research field of great interest in the past several decades. The research field is interesting not only from a biological perspective, but also from an engineering point of view when systems are developed that aim to achieve similar goals as biological cognitive systems. This article introduces a general framework for the extraction and systematic storage of low-level visual features. The applicability of the framework is investigated in both unstructured and highly structured environments. In a first experiment, a linear categorization algorithm originally developed for the classification of text documents is used to classify natural images taken from the Caltech 101 database. In a second experiment, the framework is used to provide an automatically guided vehicle with obstacle detection and auto-positioning functionalities in highly structured environments. Results demonstrate that the model is highly applicable in structured environments, and also shows promising results in certain cases when used in unstructured environments.
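The article's first experiment applies a linear categorization algorithm from text classification to image feature vectors. A minimal sketch of that idea, using a plain perceptron over toy feature vectors (the actual algorithm, features, and Caltech 101 data are not reproduced here):

```python
# Hypothetical illustration: a linear classifier of the kind borrowed
# from text categorization, trained on low-level feature vectors.
# The data below are toy placeholders, not Caltech 101 features.

def train_perceptron(samples, labels, epochs=20, lr=0.1):
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):  # y in {-1, +1}
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * score <= 0:  # misclassified: nudge the hyperplane
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b

def predict(w, b, x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1

# Two linearly separable toy categories
X = [[1.0, 2.0], [2.0, 1.5], [-1.0, -2.0], [-2.0, -1.0]]
y = [1, 1, -1, -1]
w, b = train_perceptron(X, y)
```

The point of the transfer is that such a classifier only sees feature vectors, so it is agnostic to whether those vectors came from word counts or from low-level visual features.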


2020 ◽  
Vol 2020 ◽  
pp. 1-12 ◽  
Author(s):  
Milos Antonijevic ◽  
Miodrag Zivkovic ◽  
Sladjana Arsic ◽  
Aleksandar Jevremovic

Visual short-term memory (VSTM) is defined as the ability to remember a small amount of visual information, such as colors and shapes, during a short period of time. VSTM is a part of short-term memory, which can hold information for up to 30 seconds. In this paper, we present the results of research in which we classified data gathered using an electroencephalogram (EEG) during a VSTM experiment. The experiment was performed with 12 participants who were required to remember as many details as possible from two images, each displayed for 1 minute. The first assessment was done in an isolated environment, while the second assessment was done in front of the other participants, in order to increase the stress on the examinee. The classification of the EEG data was done using four algorithms: Naive Bayes, support vector machine, k-nearest neighbors (KNN), and random forest. The results obtained show that AI-based classification could be successfully used in the proposed way, since we were able to correctly classify the order of the presented images 90.12% of the time and the type of the displayed image 90.51% of the time.
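Of the four classifiers compared, k-nearest neighbors is the simplest to state. A minimal sketch of the KNN step in plain Python; the two-dimensional vectors below are toy stand-ins for the study's EEG-derived features:

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    # Rank training points by Euclidean distance to the query,
    # then take a majority vote among the k closest labels.
    dists = sorted(
        (math.dist(x, query), label) for x, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# Hypothetical feature vectors for trials on two displayed images
train_x = [[0.1, 0.2], [0.2, 0.1], [0.9, 0.8], [0.8, 0.9]]
train_y = ["image_1", "image_1", "image_2", "image_2"]
print(knn_predict(train_x, train_y, [0.15, 0.15]))  # image_1
```

In practice EEG features are far higher-dimensional, and the reported accuracies come from the full pipeline, not from this toy setup.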


2018 ◽  
Vol 2018 (16) ◽  
pp. 1650-1657
Author(s):  
Xu Jiaqing ◽  
Lv Qi ◽  
Liu Hongjun ◽  
He Jie
