Enhancing perception for the visually impaired with deep learning techniques and low-cost wearable sensors

2020 ◽  
Vol 137 ◽  
pp. 27-36 ◽  
Author(s):  
Zuria Bauer ◽  
Alejandro Dominguez ◽  
Edmanuel Cruz ◽  
Francisco Gomez-Donoso ◽  
Sergio Orts-Escolano ◽  
...  

Cataract is a degenerative condition whose prevalence, according to current estimates, will rise globally. Although various approaches to its diagnosis have been proposed, several problems remain unsolved. This paper aims to characterize the current state of recent research on cataract diagnosis, using a structured framework to conduct the literature review and answer the following research questions: RQ1) What are the existing methods for cataract diagnosis? RQ2) Which features are considered when diagnosing cataracts? RQ3) What classification schemes exist for diagnosing cataracts? RQ4) Which obstacles arise when diagnosing cataracts? Additionally, a cross-analysis of the results was performed. The results showed that new research is required on: (1) the classification of “congenital cataract” and (2) portable solutions, which are necessary to make cataract diagnosis easy and low-cost.


Sensors ◽  
2020 ◽  
Vol 20 (10) ◽  
pp. 2984
Author(s):  
Yue Mu ◽  
Tai-Shen Chen ◽  
Seishi Ninomiya ◽  
Wei Guo

Automatic detection of intact tomatoes on the plant is highly desirable for low-cost, optimal management in tomato farming. Mature tomato detection has been widely studied, but immature tomato detection, especially when fruits are occluded by leaves, is difficult to perform with traditional image analysis, even though it is more important for long-term yield prediction. A tomato detector that generalizes well to real cultivation scenes and is robust to issues such as fruit occlusion and variable lighting conditions is therefore highly desired. In this study, we built a deep learning model that automatically detects intact green tomatoes regardless of occlusion or fruit growth stage. The model uses a Faster Region-based Convolutional Neural Network (Faster R-CNN) with a ResNet-101 backbone, transfer-learned from the Common Objects in Context (COCO) dataset. Detection on the test dataset achieved a high average precision of 87.83% (intersection over union ≥ 0.5) and high tomato-counting accuracy (R² = 0.87). In addition, all detected boxes were merged into one image to compile a tomato location map and estimate fruit size along one row of the greenhouse. Through detection, counting, and location and size estimation, the method shows great potential for ripeness and yield prediction.
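The transfer-learning recipe named here (a COCO-pretrained Faster R-CNN with its classification head replaced for a single tomato class) can be sketched as follows. This is an illustrative PyTorch/torchvision sketch, not the authors' code: torchvision's stock model uses a ResNet-50 FPN backbone rather than the paper's ResNet-101, and `train_loader` is a hypothetical data loader.

```python
# Minimal sketch: fine-tune a COCO-pretrained Faster R-CNN for tomatoes.
# Assumes torchvision >= 0.13; backbone differs from the paper (ResNet-101).
import torch
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

# Load a detector pretrained on COCO, as in the transfer-learning setup.
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")

# Replace the classification head: 2 classes (background + tomato).
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes=2)

# Fine-tune on tomato images; train_loader is a hypothetical DataLoader
# yielding (images, targets) with "boxes" and "labels" per image.
optimizer = torch.optim.SGD(model.parameters(), lr=0.005, momentum=0.9)
model.train()
for images, targets in train_loader:
    losses = model(images, targets)   # dict of detection losses
    loss = sum(losses.values())
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```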


Energies ◽  
2020 ◽  
Vol 13 (22) ◽  
pp. 6104
Author(s):  
Bernardo Calabrese ◽  
Ramiro Velázquez ◽  
Carolina Del-Valle-Soto ◽  
Roberto de Fazio ◽  
Nicola Ivan Giannoccaro ◽  
...  

This paper introduces a novel low-cost, solar-powered wearable assistive technology (AT) device that provides continuous, real-time object recognition to ease object finding for visually impaired (VI) people in daily life. The system consists of three major components: a miniature low-cost camera, a system-on-module (SoM) computing unit, and an ultrasonic sensor. The first is worn on the user’s eyeglasses and acquires real-time video of the nearby space. The second is worn as a belt and runs deep learning-based methods and spatial algorithms that process the video stream from the camera to detect and recognize objects. The third assists in positioning the objects found in the surrounding space. The device gives the user audible feedback in the form of descriptive sentences covering the recognized objects and their positions relative to the user’s gaze. After a power consumption analysis, a wearable solar harvesting system integrated with the AT device was designed and tested to extend its energy autonomy across the different operating modes and scenarios. Experimental results demonstrate accurate and reliable real-time object identification, with an 86% correct recognition rate and a 215 ms average image-processing interval in the high-speed SoM operating mode. The system recognizes the 91 object classes of the Microsoft Common Objects in Context (COCO) dataset plus several custom objects and human faces. In addition, a simple, scalable methodology for assembling image datasets and training Convolutional Neural Networks (CNNs) is introduced to add objects to the system and expand its repertoire. Comprehensive training with 100 images per target object achieves an 89% recognition rate, while fast training with only 12 images still achieves an acceptable 55%.
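To make the feedback loop concrete, the sketch below shows one plausible way detections from the camera and a range reading from the ultrasonic sensor could be fused into a gaze-referenced descriptive sentence. All names and thresholds are hypothetical illustrations, not the paper's implementation.

```python
# Minimal sketch (hypothetical throughout): fuse detector output and an
# ultrasonic distance reading into an audible descriptive sentence.
from dataclasses import dataclass

@dataclass
class Detection:
    label: str        # class name from the COCO-trained detector
    x_center: float   # horizontal box center, normalized to [0, 1]

def describe(detections: list[Detection], distance_m: float) -> str:
    """Build a descriptive sentence referenced to the user's gaze."""
    if not detections:
        return "No objects detected ahead."
    parts = []
    for det in detections:
        # Map the horizontal box position to a gaze-relative direction.
        if det.x_center < 0.33:
            side = "to your left"
        elif det.x_center > 0.66:
            side = "to your right"
        else:
            side = "ahead of you"
        parts.append(f"a {det.label} {side}")
    # The ultrasonic reading gives the distance to the nearest obstacle.
    return ", ".join(parts) + f", about {distance_m:.1f} meters away."

print(describe([Detection("cup", 0.2), Detection("chair", 0.5)], 1.4))
```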


2020 ◽  
Vol 12 (22) ◽  
pp. 3836
Author(s):  
Carlos García Rodríguez ◽  
Jordi Vitrià ◽  
Oscar Mora

In recent years, different deep learning techniques have been applied to segment aerial and satellite images. Nevertheless, state-of-the-art techniques for land cover segmentation do not provide results accurate enough for real applications. This is a problem for institutions and companies that want to replace time-consuming, exhausting human work with AI technology. In this work, we propose a method that combines deep learning with a human-in-the-loop strategy to achieve expert-level results at low cost. One neural network segments the images; in parallel, another network measures the uncertainty of the predicted pixels. Finally, we combine these networks with a human-in-the-loop approach to produce predictions as correct as those made by human photointerpreters. Applying this methodology shows that we can increase the accuracy of land cover segmentation while decreasing human intervention.
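The routing logic can be sketched in a few lines. Note the paper uses a second network to estimate uncertainty; the stand-in below uses predictive entropy of the segmentation softmax instead, purely for illustration, with dummy probabilities in place of real network output.

```python
# Minimal sketch of the human-in-the-loop idea: accept confident pixels
# automatically, route uncertain ones to a human photointerpreter.
# Entropy is a stand-in for the paper's dedicated uncertainty network.
import numpy as np

def entropy_map(probs: np.ndarray) -> np.ndarray:
    """probs: (H, W, C) softmax output; returns per-pixel entropy."""
    return -np.sum(probs * np.log(probs + 1e-12), axis=-1)

def route_to_human(probs: np.ndarray, threshold: float = 0.5):
    """Return automatic labels plus a mask of pixels needing review."""
    pred = probs.argmax(axis=-1)               # automatic label per pixel
    uncertain = entropy_map(probs) > threshold
    return pred, uncertain                     # human corrects `uncertain`

# Example with random probabilities over 4 land-cover classes.
rng = np.random.default_rng(0)
p = rng.dirichlet(np.ones(4), size=(64, 64))
labels, needs_review = route_to_human(p)
print(f"{needs_review.mean():.1%} of pixels routed to the photointerpreter")
```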


2021 ◽  
Vol 10 (1) ◽  
pp. 14
Author(s):  
Manuel Gil-Martín ◽  
Javier López-Iniesta ◽  
Rubén San-Segundo

Human Activity Recognition (HAR) has been widely addressed with deep learning techniques. However, most prior research applied a single generic approach (one signal-processing and deep learning pipeline) to all human activities, including postures and gestures. These activity types have highly diverse motion characteristics that can be captured by wearable sensors placed on the user’s body: repetitive movements such as running or cycling show repetitive patterns over time and generate harmonics in the frequency domain; postures such as sitting or lying are characterized by a fixed position with occasional positional changes; and gestures, or non-repetitive movements, consist of an isolated movement usually performed by a limb. This work proposes a classifier module that performs an initial classification among these movement types, allowing the most appropriate signal-processing and deep learning approach to be applied afterwards for each type. The classifier was evaluated on the PAMAP2 and OPPORTUNITY datasets using a subject-wise cross-validation methodology. These datasets contain recordings from inertial sensors on the hands, arms, chest, hip, and ankles, collected in a non-intrusive way. On PAMAP2, the baseline approach of classifying the 12 activities using 5-s windows in the frequency domain obtained an accuracy of 85.26 ± 0.25%. An initial classifier module, however, could distinguish between repetitive movements and postures using 5-s windows with higher performance. A specific window size, signal format, and deep learning architecture were then used for each movement-type module, yielding a final accuracy of 90.09 ± 0.35% (an absolute improvement of 4.83%).
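A minimal sketch of the 5-s-window, frequency-domain front end described above is shown below. The sampling rate and windowing details are assumptions (PAMAP2 IMUs record at 100 Hz); the point is that repetitive movements produce harmonic peaks while postures concentrate energy near 0 Hz, which the initial classifier module can exploit.

```python
# Minimal sketch: 5-s windows of an inertial axis -> magnitude spectra.
import numpy as np

FS = 100                 # sampling rate in Hz (PAMAP2 inertial sensors)
WIN = 5 * FS             # 5-second windows

def fft_features(signal: np.ndarray) -> np.ndarray:
    """Magnitude spectrum of one 5-s window of one sensor axis."""
    window = signal[:WIN] * np.hanning(WIN)    # taper to reduce leakage
    return np.abs(np.fft.rfft(window))         # one-sided spectrum

def windows(signal: np.ndarray, hop: int = WIN):
    """Slice a long recording into non-overlapping 5-s windows."""
    for start in range(0, len(signal) - WIN + 1, hop):
        yield fft_features(signal[start:start + WIN])

# A running-like signal shows a clear harmonic peak; a posture would not.
t = np.arange(0, 20, 1 / FS)
running = np.sin(2 * np.pi * 2.8 * t)          # ~2.8 Hz stride proxy
spec = next(windows(running))
print("dominant bin (Hz):", np.argmax(spec) / 5)  # bin width = 0.2 Hz
```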


Author(s):  
Anna Ferrari ◽  
Daniela Micucci ◽  
Marco Mobilio ◽  
Paolo Napoletano

Abstract: Human activity recognition (HAR) is a line of research whose goal is to design and develop automatic techniques for recognizing activities of daily living (ADLs) from sensor signals. HAR is an active research field, driven by the ever-increasing need to collect information about ADLs remotely for diagnostic and therapeutic purposes. Traditionally, HAR used environmental or wearable sensors to acquire signals and relied on traditional machine-learning techniques to classify ADLs. In recent years, HAR has been moving towards the use of wearable devices (such as smartphones and fitness trackers, since people use them daily and they include reliable inertial sensors) together with deep learning techniques (given the encouraging results obtained in computer vision). One of the major challenges in HAR is population diversity, which makes it difficult for traditional machine-learning algorithms to generalize. Recently, researchers successfully addressed this problem with techniques that combine personalization with traditional machine learning. To date, however, no effort has been directed at investigating the benefits that personalization can bring to deep learning techniques in the HAR domain. The goal of our research is to verify whether personalization, applied to both traditional and deep learning techniques, can outperform classical approaches (i.e., those without personalization). The experiments were conducted on three datasets that are extensively used in the literature and contain metadata about the subjects. AdaBoost was chosen for traditional machine learning and a convolutional neural network for deep learning, as both have shown good performance. Personalization considers both the physical characteristics of the subjects and the inertial signals they generate. Results suggest that personalization is most effective when applied to traditional machine-learning techniques rather than to deep learning ones. Moreover, deep learning without personalization outperforms every other method evaluated in the paper when the number of training samples is high and the samples are heterogeneous (i.e., they represent a wider spectrum of the population). This suggests that plain deep learning can be more effective, provided a large and heterogeneous dataset is available that intrinsically models the population diversity in the training process.
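One plausible reading of personalization for the traditional pipeline is to weight training samples by how closely their subject's physical characteristics match the target user's. The sketch below illustrates that idea with scikit-learn's AdaBoost; the weighting scheme, feature names, and dummy data are assumptions, not the authors' exact formulation.

```python
# Minimal sketch: subject-similarity sample weights for AdaBoost HAR.
import numpy as np
from sklearn.ensemble import AdaBoostClassifier

def subject_weights(train_meta: np.ndarray, target_meta: np.ndarray) -> np.ndarray:
    """Weight each training sample by the closeness of its subject's
    metadata (e.g., age, height, weight) to the target subject's."""
    dist = np.linalg.norm(train_meta - target_meta, axis=1)
    return np.exp(-dist / (dist.mean() + 1e-12))

# X: windowed inertial features, y: ADL labels, meta: one row of
# physical characteristics per training sample (all dummy/hypothetical).
rng = np.random.default_rng(1)
X = rng.normal(size=(300, 20))
y = rng.integers(0, 4, size=300)
meta = rng.normal(size=(300, 3))
target = np.array([0.2, -0.1, 0.5])    # new user's characteristics

clf = AdaBoostClassifier(n_estimators=100)
clf.fit(X, y, sample_weight=subject_weights(meta, target))
```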


2020 ◽  
Vol 4 (1) ◽  
Author(s):  
Zixuan Zhang ◽  
Tianyiyi He ◽  
Minglu Zhu ◽  
Zhongda Sun ◽  
Qiongfeng Shi ◽  
...  

Abstract: The era of artificial intelligence and the internet of things is rapidly advancing thanks to recent progress in wearable electronics. Gait conveys sensory information in daily life that contains personal information relevant to identification and healthcare. Current wearable electronics for gait analysis are mainly limited by high fabrication cost, operating energy consumption, or inferior analysis methods, which barely involve machine learning or implement non-optimal models that require massive training datasets. Herein, we developed low-cost triboelectric intelligent socks that harvest waste energy from low-frequency body motions to transmit wireless sensory data. Equipped with this self-powered functionality, the socks can also serve as wearable sensors delivering information about the identity, health status, and activity of the user. To further address the issue of ineffective analysis methods, we propose an optimized deep learning model with an end-to-end structure operating on the sock signals for gait analysis; it achieves 93.54% identification accuracy across 13 participants and detects five different human activities with 96.67% accuracy. Toward practical application, we map the physical signals collected through the socks into virtual space to establish a digital human system for sports monitoring, healthcare, identification, and future smart home applications.
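An end-to-end model here means the network consumes raw sock signals directly, with no hand-crafted features. The PyTorch sketch below illustrates that shape of architecture; channel count, window length, and layer sizes are illustrative assumptions, not the paper's exact design.

```python
# Minimal sketch: end-to-end 1D CNN on raw triboelectric sock signals,
# with a user-identification head (13 participants, as in the paper).
import torch
import torch.nn as nn

class SockNet(nn.Module):
    def __init__(self, channels: int = 8, n_ids: int = 13):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(channels, 32, kernel_size=7, padding=3), nn.ReLU(),
            nn.MaxPool1d(2),
            nn.Conv1d(32, 64, kernel_size=5, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),   # end-to-end: no hand-crafted features
        )
        self.head = nn.Linear(64, n_ids)

    def forward(self, x):              # x: (batch, channels, time)
        return self.head(self.features(x).squeeze(-1))

# One forward pass on a dummy window (8 channels, 200 time steps).
model = SockNet()
logits = model(torch.randn(4, 8, 200))
print(logits.shape)                    # torch.Size([4, 13])
```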


Author(s):  
A. V. N. Kameswari

Abstract: When humans see an image, their brain can easily tell what it is about, but this is not easy for a computer. Computer vision researchers have worked on this problem extensively, and it was long considered intractable. With advances in deep learning techniques and the availability of huge datasets and computing power, we can now build models that generate captions for an image. Image caption generation is a popular research area of deep learning that deals with image understanding and producing a language description for that image. Generating well-formed sentences requires both syntactic and semantic understanding of the language. Describing the content of an image in accurately formed sentences is a very challenging task, but it could have great impact, for example by helping visually impaired people better understand the content of images. The biggest challenge is creating a description that captures not only the objects in an image but also how they relate to each other. This paper uses the Flickr_8K dataset and the Flickr8k_text folder, which contains Flickr8k.token, the main file of the dataset, listing each image name and its respective captions separated by newlines (“\n”). A CNN is used to extract features from the image; we use the pre-trained Xception model. An LSTM uses the information from the CNN to help generate a description of the image. The Flickr8k_text folder also contains Flickr_8k.trainImages.txt, a list of the 6000 image names used for training. After the CNN-LSTM model is defined, an image file is passed as a parameter through the command prompt to test the caption generator; the model generates a caption for the image, and its quality is assessed by computing the BLEU score between the generated and reference captions. Keywords: Image Caption Generator, Convolutional Neural Network, Long Short-Term Memory, BLEU score, Flickr_8K
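The two named pieces, Xception as the feature extractor and BLEU for evaluation, can be sketched with standard Keras and NLTK APIs as follows. The image path and example captions are hypothetical; the LSTM decoder itself is omitted.

```python
# Minimal sketch: Xception image features + BLEU evaluation.
import numpy as np
from tensorflow.keras.applications.xception import Xception, preprocess_input
from tensorflow.keras.preprocessing import image
from nltk.translate.bleu_score import sentence_bleu

# Xception without its classifier head yields a 2048-d feature vector
# that conditions the LSTM decoder.
cnn = Xception(include_top=False, pooling="avg")

img = image.load_img("example.jpg", target_size=(299, 299))  # hypothetical path
x = preprocess_input(np.expand_dims(image.img_to_array(img), axis=0))
features = cnn.predict(x)              # shape: (1, 2048)

# BLEU between a generated caption and its reference caption(s).
reference = [["a", "dog", "runs", "on", "the", "beach"]]
candidate = ["a", "dog", "is", "running", "on", "the", "beach"]
print("BLEU:", sentence_bleu(reference, candidate))
```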


2020 ◽  
Author(s):  
Robert Arntfield ◽  
Blake VanBerlo ◽  
Thamer Alaifan ◽  
Nathan Phelps ◽  
Matt White ◽  
...  

Abstract
Objectives: Lung ultrasound (LUS) is a portable, low-cost respiratory imaging tool, but it is challenged by user dependence and a lack of diagnostic specificity. It is unknown whether the advantages of LUS could be paired with deep learning techniques to match or exceed human-level diagnostic specificity among similar-appearing, pathological LUS images.
Design: A convolutional neural network (CNN) was trained on LUS images with B lines of different etiologies. CNN diagnostic performance, validated on a 10% data holdback set, was compared with that of surveyed LUS-competent physicians.
Setting: Two tertiary Canadian hospitals.
Participants: 600 LUS videos (121,381 frames) of B lines from 243 distinct patients with 1) COVID-19, 2) non-COVID acute respiratory distress syndrome (NCOVID), or 3) hydrostatic pulmonary edema (HPE).
Results: On the independent dataset, the trained CNN discriminated between the COVID (AUC 1.0), NCOVID (AUC 0.934), and HPE (AUC 1.0) pathologies. This was significantly better than physician performance (AUCs of 0.697, 0.704, and 0.967 for the COVID, NCOVID, and HPE classes, respectively), p < 0.01.
Conclusions: A deep learning model can distinguish similar-appearing LUS pathologies, including COVID-19, that humans cannot. The performance gap between humans and the model suggests that subvisible biomarkers may exist within ultrasound images, and multi-center research is merited.
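The per-class AUCs reported above correspond to a one-vs-rest evaluation of the three pathologies. A minimal scikit-learn sketch of that computation, using dummy probabilities in place of real CNN outputs, is shown below.

```python
# Minimal sketch: one-vs-rest AUC per class (COVID vs. NCOVID vs. HPE),
# with dummy labels and softmax outputs standing in for the real data.
import numpy as np
from sklearn.metrics import roc_auc_score

classes = ["COVID", "NCOVID", "HPE"]
rng = np.random.default_rng(2)
y_true = rng.integers(0, 3, size=200)          # held-out frame labels
probs = rng.dirichlet(np.ones(3), size=200)    # CNN softmax outputs

for k, name in enumerate(classes):
    auc = roc_auc_score((y_true == k).astype(int), probs[:, k])
    print(f"{name}: AUC = {auc:.3f}")
```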

