Deep Recurrent Architecture based Scene Description Generator for Visually Impaired

Author(s):  
Aviral Chharia ◽  
Rahul Upadhyay


2019 ◽  
Vol 9 (21) ◽  
pp. 4656 ◽  
Author(s):  
Haikel Alhichri ◽  
Yakoub Bazi ◽  
Naif Alajlan ◽  
Bilel Bin Jdira

This work presents a deep learning method for scene description. (1) Background: The method is part of a larger system, called BlindSys, that assists the visually impaired in indoor environments. It detects the presence of certain objects regardless of their position in the scene, a problem also known as image multi-labeling. (2) Methods: Our proposed deep learning solution is based on a light-weight pre-trained CNN called SqueezeNet. We improved the SqueezeNet architecture by resetting the last convolutional layer to free weights, replacing its rectified linear unit (ReLU) activation with a LeakyReLU, and adding a BatchNormalization layer thereafter. We also replaced the softmax activations at the output layer with linear functions. These adjustments constitute the main contributions of this work. (3) Results: The proposed solution is tested on four image multi-labeling datasets representing different indoor environments and achieves better results than state-of-the-art solutions in terms of both accuracy and processing time. (4) Conclusions: The proposed deep CNN is an effective solution for predicting the presence of objects in a scene and can be used successfully as a module within BlindSys.
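
The architectural changes described above can be illustrated with a short PyTorch sketch. This is not the authors' released code: the use of torchvision's squeezenet1_1, the number of output labels, and the LeakyReLU slope are assumptions made for illustration.

```python
# Illustrative sketch only: NUM_LABELS and the LeakyReLU slope are placeholders.
import torch
import torch.nn as nn
from torchvision import models

NUM_LABELS = 5  # hypothetical number of indoor objects to detect

net = models.squeezenet1_1(weights="IMAGENET1K_V1")  # light-weight pre-trained CNN

# Reset the last convolutional layer to freshly initialised ("free") weights,
# sized for the multi-label output instead of the 1000 ImageNet classes.
net.classifier[1] = nn.Conv2d(512, NUM_LABELS, kernel_size=1)

# Replace that layer's ReLU activation with a LeakyReLU and add a
# BatchNormalization layer thereafter.
net.classifier[2] = nn.Sequential(
    nn.LeakyReLU(negative_slope=0.01),
    nn.BatchNorm2d(NUM_LABELS),
)

# The output activations stay linear (no softmax): the pooled scores are used
# directly as per-object presence predictions for the image multi-labeling task.
net.eval()
with torch.no_grad():
    scores = net(torch.randn(1, 3, 224, 224))  # shape: (1, NUM_LABELS)
```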


2020 ◽  
Vol 45 (12) ◽  
pp. 10511-10527
Author(s):  
Haikel Alhichri ◽  
Yakoub Bazi ◽  
Naif Alajlan

Advances in technology can provide substantial support for visually impaired (VI) persons. In particular, computer vision and machine learning can provide solutions for object detection and recognition. In this work, we propose a multi-label image classification solution for assisting a VI person in recognizing the presence of multiple objects in a scene. The solution is based on the fusion of two deep CNN models using the induced ordered weighted averaging (OWA) approach. Specifically, we fuse the outputs of two pre-trained CNN models, VGG16 and SqueezeNet. The induced OWA approach requires a confidence measure for the outputs of the two CNN base models. To this end, we propose the residual error between the predicted output and the true output as a measure of confidence, and we estimate this residual error using another dedicated CNN model trained on the residual errors computed from the main CNN models. The OWA technique then uses these estimated residual errors as confidence measures and fuses the decisions of the two main CNN models. When tested on four image datasets of indoor environments from two separate locations, the proposed method improves detection accuracy compared to both base CNN models, and its results are also significantly better than state-of-the-art methods reported in the literature.
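
A minimal numerical sketch of the induced OWA fusion step follows. The inducing variable is the estimated residual error (lower error means higher confidence), which determines how the two base-model outputs are ordered before the OWA weights are applied. The NumPy formulation, the weight values, and the example numbers are illustrative assumptions; in the paper the residual errors are predicted by a dedicated CNN rather than supplied directly.

```python
# Illustrative sketch of induced-OWA fusion of two per-label score vectors.
import numpy as np

def induced_owa_fuse(scores_a, scores_b, est_err_a, est_err_b, w=(0.7, 0.3)):
    """Fuse two per-label score vectors.

    scores_a, scores_b   : predicted label scores from the two base CNNs
    est_err_a, est_err_b : estimated residual errors (lower = more confident),
                           acting as the inducing variable
    w                    : OWA weights, applied to the scores ordered from most
                           to least confident (placeholder values)
    """
    scores = np.stack([scores_a, scores_b])     # (2, num_labels)
    errors = np.stack([est_err_a, est_err_b])   # (2, num_labels)
    order = np.argsort(errors, axis=0)          # most confident model first
    ordered = np.take_along_axis(scores, order, axis=0)
    w = np.asarray(w).reshape(2, 1)
    return (w * ordered).sum(axis=0)            # fused per-label scores

# Example: model A is more confident on label 0, model B on label 1.
fused = induced_owa_fuse(
    scores_a=np.array([0.9, 0.2]), scores_b=np.array([0.6, 0.8]),
    est_err_a=np.array([0.05, 0.40]), est_err_b=np.array([0.30, 0.10]),
)
print(fused)  # [0.81 0.62]
```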


2019 ◽  
Vol 9 (23) ◽  
pp. 5062 ◽  
Author(s):  
Yakoub Bazi ◽  
Haikel Alhichri ◽  
Naif Alajlan ◽  
Farid Melgani

In this paper, we present a portable camera-based method for helping visually impaired (VI) people recognize multiple objects in images. The method relies on a novel multi-label convolutional support vector machine (CSVM) network for coarse description of images. The core idea of CSVM is to use a set of linear SVMs as filter banks for feature-map generation. During the training phase, the weights of the SVM filters are obtained using a forward-supervised learning strategy, unlike the backpropagation algorithm used in standard convolutional neural networks (CNNs). To handle multi-label detection, we introduce a multi-branch CSVM architecture in which each branch detects one object in the image. This architecture exploits the correlation between the objects present in the image through an appropriate fusion mechanism applied to the intermediate outputs of the convolution layers of each branch. The high-level reasoning of the network is performed by binary classification SVMs that predict the presence or absence of objects in the image. We report and discuss experiments on two indoor datasets and one outdoor dataset acquired with a portable camera mounted on a lightweight shield worn by the user and connected via a USB cable to a laptop processing unit.
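
The filter-bank idea behind CSVM can be sketched in a few lines: linear SVMs trained on labelled image patches supply the convolutional kernels (a forward-supervised step, with no backpropagation), and their pooled feature maps feed a binary presence/absence SVM. The patch size, filter count, max pooling, and use of scikit-learn's LinearSVC are assumptions for illustration; the paper's multi-branch architecture and fusion mechanism are not reproduced here.

```python
# Illustrative sketch only: patch labelling and the multi-branch fusion of the
# paper are simplified away; sizes and the pooling choice are placeholders.
import numpy as np
from sklearn.svm import LinearSVC

PATCH = 7  # assumed patch / filter size

def train_svm_filters(patches, patch_labels, n_filters=8):
    """Fit one linear SVM per filter on (n, PATCH*PATCH) patch vectors and use
    its weight vector as a convolutional kernel (forward-supervised, no backprop)."""
    filters = []
    for k in range(n_filters):
        svm = LinearSVC(C=1.0).fit(patches, patch_labels[:, k])
        filters.append(svm.coef_.reshape(PATCH, PATCH))
    return np.stack(filters)

def convolve_valid(img, kernel):
    """Plain 'valid' 2-D cross-correlation producing one feature map."""
    H, W = img.shape
    out = np.empty((H - PATCH + 1, W - PATCH + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + PATCH, j:j + PATCH] * kernel)
    return out

def branch_features(img, filters):
    """Max-pooled responses of the SVM filter bank for one branch (one object)."""
    return np.array([convolve_valid(img, f).max() for f in filters])

# High-level reasoning: a binary SVM per object, fit on the pooled branch features.
# presence_svm = LinearSVC().fit(feature_matrix, object_present_labels)
```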


Sensors ◽  
2017 ◽  
Vol 17 (11) ◽  
pp. 2641 ◽  
Author(s):  
Salim Malek ◽  
Farid Melgani ◽  
Mohamed Mekhalfi ◽  
Yakoub Bazi

1979 ◽  
Vol 10 (3) ◽  
pp. 139-144
Author(s):  
Cheri L. Florance ◽  
Judith O’Keefe

A modification of the Paired-Stimuli Parent Program (Florance, 1977) was adapted for the treatment of articulatory errors of visually handicapped children. Blind high school students served as clinical aides. The treatment methodology is discussed, along with the results of administering the program to 32 children, including a two-year follow-up evaluation to measure the permanence of behavior change.


2013 ◽  
Author(s):  
Bavani Ramayah ◽  
Azizah Jaafar ◽  
Noor Faezah Mohd Yatin
