Assisting the Visually Impaired in Multi-object Scene Description Using OWA-Based Fusion of CNN Models

2020 ◽  
Vol 45 (12) ◽  
pp. 10511-10527
Author(s):  
Haikel Alhichri ◽  
Yakoub Bazi ◽  
Naif Alajlan

Abstract
Advances in technology can provide considerable support for visually impaired (VI) persons. In particular, computer vision and machine learning can provide solutions for object detection and recognition. In this work, we propose a multi-label image classification solution for assisting a VI person in recognizing the presence of multiple objects in a scene. The solution is based on the fusion of two deep CNN models using the induced ordered weighted averaging (OWA) approach. Namely, in this work, we fuse the outputs of two pre-trained CNN models, VGG16 and SqueezeNet. To use the induced OWA approach, we need to estimate a confidence measure for the outputs of the two CNN base models. To this end, we propose the residual error between the predicted output and the true output as a measure of confidence. We estimate this residual error using another dedicated CNN model that is trained on the residual errors computed from the main CNN models. The OWA technique then uses these estimated residual errors as confidence measures and fuses the decisions of the two main CNN models. When tested on four image datasets of indoor environments from two separate locations, the proposed method improves the detection accuracy compared to both base CNN models. The results are also significantly better than state-of-the-art methods reported in the literature.
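The induced OWA step can be sketched as follows: for each label, the two models' scores are reordered by an inducing variable (here, the estimated residual error, where smaller means more confident) before fixed OWA weights are applied. The weights and shapes below are illustrative assumptions, not the paper's tuned values.

```python
import numpy as np

def induced_owa_fusion(scores_a, scores_b, residual_a, residual_b,
                       weights=(0.7, 0.3)):
    """Fuse two models' per-label scores with induced OWA.

    The inducing variable is the estimated residual error: per label,
    the model with the smaller residual is treated as more confident
    and receives the larger OWA weight.
    """
    scores = np.stack([scores_a, scores_b])        # shape (2, n_labels)
    residuals = np.stack([residual_a, residual_b])
    # Per label, order the two model outputs by ascending residual error.
    order = np.argsort(residuals, axis=0)
    ordered = np.take_along_axis(scores, order, axis=0)
    w = np.asarray(weights).reshape(2, 1)
    return (w * ordered).sum(axis=0)               # fused per-label scores
```

With weights (0.7, 0.3), the more confident model's score dominates each fused label score while the other still contributes.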

2019 ◽  
Vol 9 (21) ◽  
pp. 4656 ◽  
Author(s):  
Haikel Alhichri ◽  
Yakoub Bazi ◽  
Naif Alajlan ◽  
Bilel Bin Jdira

This work presents a deep learning method for scene description. (1) Background: This method is part of a larger system, called BlindSys, that assists the visually impaired in an indoor environment. The method detects the presence of certain objects, regardless of their position in the scene. This problem is also known as image multi-labeling. (2) Methods: Our proposed deep learning solution is based on a lightweight pre-trained CNN called SqueezeNet. We improved the SqueezeNet architecture by resetting the last convolutional layer to free weights, replacing its rectified linear unit (ReLU) activation function with a LeakyReLU, and adding a BatchNormalization layer thereafter. We also replaced the softmax activation functions at the output layer with linear functions. These adjustments make up the main contributions of this work. (3) Results: The proposed solution is tested on four image multi-labeling datasets representing different indoor environments. It has achieved results better than state-of-the-art solutions in terms of both accuracy and processing time. (4) Conclusions: The proposed deep CNN is an effective solution for predicting the presence of objects in a scene and can be successfully used as a module within BlindSys.


Author(s):  
Shrugal Varde ◽  
Dr. M.S. Panse

This paper introduces a novel travel aid for blind users that assists them in detecting the location of doors in corridors and also provides information about the location of stairs. The developed system uses a camera to capture images in front of the user. A feature extraction algorithm extracts key features that distinguish doors and stairs from other structures observed in indoor environments. This information is then conveyed to the user through simple auditory feedback. The mobility aid was validated on 50 visually impaired users, who walked in a controlled test environment. The accuracy of the device in helping the user detect doors and stairs was determined. The results obtained were satisfactory, and the device has the potential for standalone use in indoor navigation.


2017 ◽  
Vol 68 (2) ◽  
pp. 117-124
Author(s):  
Martin Broda ◽  
Vladimír Hajduk ◽  
Dušan Levický

Abstract A novel image steganalytic method for detecting secret messages in static images is introduced in this paper. The method is based on statistical steganalysis (SS), where the statistical vector is composed of 285 statistical features (parameters) extracted from the DCT (Discrete Cosine Transform) domain and 46 features extracted mainly from the DWT (Discrete Wavelet Transform) domain. The classification process was realized by an Ensemble classifier, which helped reduce computational and time complexity. The proposed steganalytic method was verified by detecting popular image steganographic methods. The novel method was also compared with existing steganalytic methods in terms of overall detection accuracy of a secret message.
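As an illustration of DCT-domain statistical features (a toy subset, not the paper's full 285-feature vector), the sketch below computes a few summary statistics over the AC coefficients of 8x8 block DCTs of a grayscale image:

```python
import numpy as np
from scipy.fft import dctn

def block_dct_features(gray, block=8):
    """Toy DCT-domain statistics for statistical steganalysis:
    a handful of moments of the block-DCT AC coefficients."""
    h, w = gray.shape
    gray = gray[: h - h % block, : w - w % block]      # crop to full blocks
    blocks = gray.reshape(h // block, block, -1, block).swapaxes(1, 2)
    coeffs = dctn(blocks, axes=(2, 3), norm="ortho")   # per-block 2-D DCT
    ac = coeffs.reshape(-1, block * block)[:, 1:]      # drop the DC term
    return np.array([ac.mean(), ac.std(), np.abs(ac).mean(),
                     (np.abs(ac) < 1.0).mean()])       # 4 toy statistics
```

A steganalyzer would concatenate many such statistics (and DWT-domain ones) into the feature vector fed to the ensemble classifier.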


Sensors ◽  
2020 ◽  
Vol 20 (21) ◽  
pp. 6238
Author(s):  
Payal Mahida ◽  
Seyed Shahrestani ◽  
Hon Cheung

Wayfinding and navigation can present substantial challenges to visually impaired (VI) people. Some of the significant aspects of these challenges arise from the difficulty of knowing the location of a moving person with enough accuracy. Positioning and localization in indoor environments require unique solutions. Furthermore, positioning is one of the critical aspects of any navigation system that can assist a VI person with their independent movement. The other essential features of a typical indoor navigation system include pathfinding, obstacle avoidance, and capabilities for user interaction. This work focuses on positioning a VI person with enough precision for use in indoor navigation. We aim to achieve this by utilizing only the capabilities of a typical smartphone. More specifically, our proposed approach is based on the use of the accelerometer, gyroscope, and magnetometer of a smartphone. We consider the indoor environment to be divided into microcells, with the vertex of each microcell being assigned two-dimensional local coordinates. A regression-based analysis is used to train a multilayer perceptron neural network to map the inertial sensor measurements to the coordinates of the vertex of the microcell corresponding to the position of the smartphone. To test our proposed solution, we used IPIN2016, a publicly available multivariate dataset that divides the indoor environment into cells tagged with the inertial sensor data of a smartphone, to generate the training and validation sets. Our experiments show that our proposed approach can achieve a remarkable prediction accuracy of more than 94%, with a 0.65 m positioning error.
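The regression step can be sketched with a multilayer perceptron that maps nine inertial readings (accelerometer, gyroscope, and magnetometer, three axes each) to 2-D vertex coordinates. The synthetic data below is a stand-in for the IPIN2016 measurements, which are not reproduced here; the network size is an assumption.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
# Hypothetical stand-in for microcell-tagged inertial data:
# 9 sensor readings per sample, labelled with the 2-D local
# coordinates of the nearest microcell vertex.
X = rng.normal(size=(500, 9))
y = X[:, :2] @ np.array([[1.0, 0.3], [0.2, 1.0]])  # synthetic mapping

mlp = MLPRegressor(hidden_layer_sizes=(64, 32), max_iter=2000,
                   random_state=0)
mlp.fit(X, y)                 # regression onto vertex coordinates
pred = mlp.predict(X[:5])     # estimated (x, y) positions
```

In the real system, the predicted vertex coordinates localize the smartphone to within the microcell resolution, which bounds the positioning error.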


Author(s):  
Annalisa Milella ◽  
Paolo Vanadia ◽  
Grazia Cicirelli ◽  
Arcangelo Distante

In this paper, the use of passive Radio Frequency Identification (RFID) as a support technology for mobile robot navigation and environment mapping is investigated. A novel method for localizing passive RFID tags in a geometric map of the environment using fuzzy logic is, first, described. Then, it is shown how a mobile robot equipped with RF antennas, an RF reader, and a laser range finder can use such a map for localization and path planning. Experimental results from tests performed in our institute suggest that the proposed approach is accurate in mapping RFID tags and can be effectively used for vehicle navigation in indoor environments.


Sensors ◽  
2020 ◽  
Vol 20 (21) ◽  
pp. 6329
Author(s):  
Ruijun Li ◽  
Yongjun Wang ◽  
Pan Tao ◽  
Rongjun Cheng ◽  
Zhenying Cheng ◽  
...  

Laser beam drift greatly influences the accuracy of a four degrees of freedom (4-DOF) measurement system during the detection of machine tool errors, especially for long-distance measurement. A novel method was proposed using bellows to serve as a laser beam shield and air pumps to stabilize the refractive index of air. The inner diameter of the bellows and the control mode of the pumps were optimized through theoretical analysis and simulation. An experimental setup was established to verify the feasibility of the method under the temperature interference condition. The results indicated that the position stability of the laser beam spot can be improved by more than 79% under the action of pumping and inflating. The proposed scheme provides a cost-effective method to reduce the laser beam drift, which can be applied to improve the detection accuracy of a 4-DOF measurement system.


2019 ◽  
Vol 9 (5) ◽  
pp. 878 ◽  
Author(s):  
Seondae Kim ◽  
Eun-Soo Park ◽  
Eun-Seok Ryu

Visual impairments cause very limited and low vision, leading to difficulties in processing information such as obstacles, objects, multimedia contents (e.g., video, photographs, and paintings), and reading in outdoor and indoor environments. Therefore, there are assistive devices and aids for visually impaired (VI) people. In general, such devices provide guidance or some supportive information that can be used along with guide dogs, walking canes, and braille devices. However, these devices have functional limitations; for example, they cannot help in the processing of multimedia contents such as images and videos. Additionally, most of the available braille displays for the VI represent text as a single line with several braille cells. Although these devices are sufficient for reading and understanding text, they have difficulty converting multimedia contents or massive text contents to braille. This paper describes a methodology to effectively convert multimedia contents to braille using a 2D braille display. Furthermore, this research also proposes the transformation of the Digital Accessible Information SYstem (DAISY) and electronic publication (EPUB) formats onto a 2D braille display. It also discusses related research on efficient communication for the VI. Thus, this study proposes an eBook reader application for the DAISY and EPUB formats, which can correctly render and display text, images, audio, and video on a 2D multiarray braille display. This approach is expected to provide better braille service for the VI when implemented and verified in real-time.


Mathematics ◽  
2020 ◽  
Vol 8 (1) ◽  
pp. 93 ◽  
Author(s):  
Zhenrong Deng ◽  
Rui Yang ◽  
Rushi Lan ◽  
Zhenbing Liu ◽  
Xiaonan Luo

Small scale face detection is a very difficult problem. In order to achieve a higher detection accuracy, we propose a novel method, termed SE-IYOLOV3, for small scale face detection in this work. In SE-IYOLOV3, we first improve YOLOV3, in which an anchor box with a higher average intersection ratio is obtained by combining niche technology with the k-means algorithm. An upsampling scale is added to form a face network structure that is suitable for detecting dense small scale faces. The number of prediction boxes is five times that of the YOLOV3 network. To further improve the detection performance, we adopt the SENet structure to enhance the global receptive field of the network. The experimental results on the WIDERFACE dataset show that the IYOLOV3 network embedded with the SENet structure can significantly improve the detection accuracy of dense small scale faces.
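The anchor-box clustering can be illustrated with plain k-means under a 1 - IoU distance, the standard YOLO recipe; the niche-technique refinement described in the abstract is not reproduced here.

```python
import numpy as np

def iou_wh(boxes, anchors):
    """IoU between (w, h) pairs, treating boxes and anchors as
    sharing a common corner (the usual anchor-clustering trick)."""
    inter = (np.minimum(boxes[:, None, 0], anchors[None, :, 0]) *
             np.minimum(boxes[:, None, 1], anchors[None, :, 1]))
    union = (boxes[:, 0] * boxes[:, 1])[:, None] + \
            (anchors[:, 0] * anchors[:, 1])[None, :] - inter
    return inter / union

def kmeans_anchors(boxes, k, iters=100, seed=0):
    """k-means over (w, h) box sizes with a 1 - IoU distance:
    assign each box to its highest-IoU anchor, then re-average."""
    rng = np.random.default_rng(seed)
    anchors = boxes[rng.choice(len(boxes), k, replace=False)]
    for _ in range(iters):
        assign = iou_wh(boxes, anchors).argmax(axis=1)
        new = np.array([boxes[assign == i].mean(axis=0)
                        if np.any(assign == i) else anchors[i]
                        for i in range(k)])
        if np.allclose(new, anchors):
            break
        anchors = new
    return anchors
```

Using IoU rather than Euclidean distance keeps large boxes from dominating the clustering, which matters when most faces are small.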


2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
Yiran Feng ◽  
Xueheng Tao ◽  
Eung-Joo Lee

In view of the current absence of any deep learning algorithm for shellfish identification in real contexts, an improved Faster R-CNN-based detection algorithm is proposed in this paper. It achieves multiobject recognition and localization through a second-order detection network and replaces the original feature extraction module with DenseNet, which can fuse multilevel feature information, increase network depth, and avoid vanishing network gradients. Meanwhile, the proposal merging strategy is improved with Soft-NMS, in which an attenuation function is designed to replace the conventional NMS algorithm, thereby avoiding missed detection of adjacent or overlapping objects and enhancing the network's detection accuracy with multiple objects. By constructing a real-context shellfish dataset and conducting experimental tests on a vision-based seafood sorting robot production line, we were able to detect shellfish in different scenarios, and the detection accuracy was improved by nearly 4% compared to the original detection model. This provides favorable technical support for future quality sorting of seafood using the improved Faster R-CNN-based approach.
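The Soft-NMS idea, decaying the scores of overlapping proposals with an attenuation function instead of discarding them outright, can be sketched as follows. A Gaussian decay is assumed; the paper's exact attenuation function and parameters are not given here.

```python
import numpy as np

def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian Soft-NMS over (x1, y1, x2, y2) boxes: repeatedly keep
    the top-scoring box and attenuate overlapping neighbours' scores."""
    boxes = np.asarray(boxes, dtype=float)
    scores = np.asarray(scores, dtype=float).copy()
    keep = []
    idx = np.arange(len(scores))
    while idx.size:
        top = idx[scores[idx].argmax()]
        keep.append(int(top))
        idx = idx[idx != top]
        ious = iou(boxes[top], boxes[idx])
        scores[idx] *= np.exp(-(ious ** 2) / sigma)  # attenuation function
        idx = idx[scores[idx] > score_thresh]        # drop only near-zero
    return keep

def iou(box, others):
    """IoU of one box against an array of boxes."""
    x1 = np.maximum(box[0], others[:, 0]); y1 = np.maximum(box[1], others[:, 1])
    x2 = np.minimum(box[2], others[:, 2]); y2 = np.minimum(box[3], others[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = lambda b: (b[..., 2] - b[..., 0]) * (b[..., 3] - b[..., 1])
    return inter / (area(box) + area(others) - inter)
```

Unlike hard NMS, a heavily overlapping shellfish proposal survives with a reduced score, so adjacent objects are less likely to be missed.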

