How to Design a Relevant Corpus for Sleepiness Detection Through Voice?

2021 ◽  
Vol 3 ◽  
Author(s):  
Vincent P. Martin ◽  
Jean-Luc Rouas ◽  
Jean-Arthur Micoulaud-Franchi ◽  
Pierre Philip ◽  
Jarek Krajewski

This article presents research on the detection of pathologies affecting speech through automatic analysis. Voice processing has indeed been used for evaluating several diseases such as Parkinson's disease, Alzheimer's disease, or depression. While some studies present results that seem sufficient for clinical applications, this is not the case for the detection of sleepiness. Even two international challenges and the recent advent of deep learning techniques have not managed to change this situation. This article explores the hypothesis that the observed average performances of automatic processing stem from the design of the corpora. To this end, we first discuss and refine the concept of sleepiness in relation to the ground-truth labels. Second, we present an in-depth study of four corpora, bringing to light the methodological choices made and the underlying biases they may have induced. Finally, in light of this information, we propose guidelines for the design of new corpora.

2020 ◽  
Vol 10 (1) ◽  
Author(s):  
Sofia B. Dias ◽  
Sofia J. Hadjileontiadou ◽  
José Diniz ◽  
Leontios J. Hadjileontiadis

Abstract: The Coronavirus (Covid-19) pandemic has imposed a complete shut-down of face-to-face teaching on universities and schools, forcing a crash course in online learning plans and technology for students and faculty. In the midst of this unprecedented crisis, video conferencing platforms (e.g., Zoom, WebEx, MS Teams) and learning management systems (LMSs), like Moodle, Blackboard, and Google Classroom, are being adopted and heavily used as online learning environments (OLEs). However, as such media solely provide the platform for e-interaction, effective methods are needed to predict the learner's behavior in OLEs; these should be available as supportive tools to educators and as metacognitive triggers to learners. Here we show, for the first time, that deep learning techniques can be used to handle LMS users' interaction data and form a novel predictive model, namely DeepLMS, that can forecast the quality of interaction (QoI) with the LMS. Using Long Short-Term Memory (LSTM) networks, DeepLMS achieves an average testing Root Mean Square Error (RMSE) < 0.009 and an average correlation coefficient between ground-truth and predicted QoI values of r ≥ 0.97 (p < 0.05), when tested on QoI data from one database collected before and two during the Covid-19 pandemic. DeepLMS's personalized QoI forecasting scaffolds users' online learning engagement and provides educators with an evaluation path, in addition to content-related assessment, enriching the overall view of learners' motivation and participation in the learning process.
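The evaluation metrics reported above can be computed directly. As a minimal sketch (with hypothetical QoI values, since the DeepLMS data is not public), RMSE and the Pearson correlation coefficient between a ground-truth and a predicted QoI series look like this:

```python
import math

def rmse(truth, pred):
    """Root Mean Square Error between ground-truth and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(truth, pred)) / len(truth))

def pearson_r(truth, pred):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(truth)
    mt, mp = sum(truth) / n, sum(pred) / n
    cov = sum((t - mt) * (p - mp) for t, p in zip(truth, pred))
    st = math.sqrt(sum((t - mt) ** 2 for t in truth))
    sp = math.sqrt(sum((p - mp) ** 2 for p in pred))
    return cov / (st * sp)

# Hypothetical QoI series on a [0, 1] scale.
qoi_true = [0.20, 0.35, 0.50, 0.62, 0.71, 0.80]
qoi_pred = [0.21, 0.34, 0.52, 0.60, 0.72, 0.79]

print(round(rmse(qoi_true, qoi_pred), 4))
print(round(pearson_r(qoi_true, qoi_pred), 4))
```

A forecast is "good" in the paper's terms when RMSE stays below 0.009 and r stays at or above 0.97 across test databases.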


2020 ◽  
Author(s):  
Jordan Reece ◽  
Margaret Couvillon ◽  
Christoph Grüter ◽  
Francis Ratnieks ◽  
Constantino Carlos Reyes-Aldasoro

Abstract: This work describes an algorithm for the automatic analysis of the waggle dance of honeybees. The algorithm analyses a video of a beehive with 13,624 frames, acquired at 25 frames/second. It employs the following traditional image processing steps: conversion to grayscale, low-pass filtering, background subtraction, thresholding, tracking, and clustering to detect runs of bees performing waggle dances. The algorithm detected 44,530 waggle events, i.e. one bee waggling in one time frame, which were then clustered into 511 waggle runs. Most of these were concentrated in one section of the hive. The accuracy of the tracking was 90%, and a series of metrics such as intra-dance variation in angle and duration were found to be consistent with the literature. Whilst this algorithm was tested on a single video, the ideas and steps, which are simple compared with machine and deep learning techniques, should be attractive to researchers in this field who are not specialists in more complex techniques.
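The core detection steps named above (low-pass filtering, background subtraction, thresholding) can be sketched on a toy grayscale frame; the grid values and threshold below are illustrative, not taken from the paper:

```python
# Toy frame as a 2D grayscale grid (values 0-255); a bright blob stands in
# for a waggling bee against a dark, static background.
frame = [
    [10, 10,  10,  10, 10],
    [10, 200, 210, 10, 10],
    [10, 205, 220, 10, 10],
    [10, 10,  10,  10, 10],
]
background = [[10] * 5 for _ in range(4)]

def box_blur(img):
    """3x3 mean (low-pass) filter; border pixels keep their original value."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y][x] = sum(img[y + dy][x + dx]
                            for dy in (-1, 0, 1) for dx in (-1, 0, 1)) // 9
    return out

def detect_waggle_pixels(img, bg, thresh=50):
    """Background subtraction followed by thresholding; returns (row, col) hits."""
    blurred = box_blur(img)
    return [(y, x)
            for y in range(len(img)) for x in range(len(img[0]))
            if blurred[y][x] - bg[y][x] > thresh]

pixels = detect_waggle_pixels(frame, background)
print(pixels)
```

In the full pipeline, per-frame hits like these would then be tracked over time and clustered into waggle runs.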


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Young Jin Jeong ◽  
Hyoung Suk Park ◽  
Ji Eun Jeong ◽  
Hyun Jin Yoon ◽  
Kiwan Jeon ◽  
...  

Abstract: Our purpose in this study is to evaluate the clinical feasibility of deep-learning techniques for F-18 florbetaben (FBB) positron emission tomography (PET) image reconstruction using data acquired in a short time. We reconstructed raw FBB PET data of 294 patients acquired over 20 and 2 min into standard-time scanning PET (PET20m) and short-time scanning PET (PET2m) images. We generated a standard-time scanning PET-like image (sPET20m) from a PET2m image using a deep-learning network. We performed qualitative and quantitative analyses to assess whether the sPET20m images were suitable for clinical applications. In our internal validation, sPET20m images showed substantial improvement on all quality metrics compared with the PET2m images. There was a small mean difference between the standardized uptake value ratios of sPET20m and PET20m images. In a Turing test, a physician could not reliably distinguish generated PET images from real ones. Three nuclear medicine physicians could interpret the generated PET images with high accuracy and agreement. We obtained similar quantitative results in temporal and external validations. Thus, we can generate interpretable PET images from low-quality PET images acquired with a short scanning time using deep-learning techniques. Although more clinical validation is needed, we confirmed that short-scanning protocols combined with a deep-learning technique can be used for clinical applications.
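The standardized uptake value ratio (SUVR) comparison mentioned above reduces to a ratio of regional means; a minimal sketch with hypothetical voxel samples (the study's actual regions and values are not public):

```python
def suvr(target_values, reference_values):
    """Standardized uptake value ratio: mean target uptake / mean reference uptake."""
    target_mean = sum(target_values) / len(target_values)
    reference_mean = sum(reference_values) / len(reference_values)
    return target_mean / reference_mean

# Hypothetical voxel samples from a cortical target region and a
# cerebellar reference region, for a PET20m and a generated sPET20m image.
pet20m_target, pet20m_ref = [1.8, 2.0, 2.2], [1.0, 1.1, 0.9]
spet20m_target, spet20m_ref = [1.7, 2.1, 2.2], [1.0, 1.0, 1.0]

diff = suvr(spet20m_target, spet20m_ref) - suvr(pet20m_target, pet20m_ref)
print(round(diff, 3))
```

A small SUVR difference between the generated and real images is the quantitative criterion the study uses to argue clinical interchangeability.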


2020 ◽  
Vol 237 (12) ◽  
pp. 1438-1441
Author(s):  
Soenke Langner ◽  
Ebba Beller ◽  
Felix Streckenbach

Abstract: Medical images play an important role in ophthalmology and radiology. Medical image analysis has greatly benefited from the application of "deep learning" techniques in clinical and experimental radiology. Clinical applications and their relevance for radiological imaging in ophthalmology are presented.


2018 ◽  
Vol 6 (3) ◽  
pp. 93 ◽  
Author(s):  
Michael O’Byrne ◽  
Vikram Pakrashi ◽  
Franck Schoefs ◽  
Bidisha Ghosh

Recent breakthroughs in the computer vision community have led to the emergence of efficient deep learning techniques for end-to-end segmentation of natural scenes. Underwater imaging stands to gain from these advances; however, deep learning methods require large annotated datasets for model training, and these are typically unavailable for underwater imaging applications. This paper proposes the use of photorealistic synthetic imagery for training deep models that can be applied to interpret real-world underwater imagery. To demonstrate this concept, we look at the specific problem of biofouling detection on marine structures. A contemporary deep encoder–decoder network, termed SegNet, is trained using 2500 annotated synthetic images of size 960 × 540 pixels. The images were rendered in a virtual underwater environment under a wide variety of conditions and feature biofouling of various sizes, shapes, and colours. Each rendered image has a corresponding ground-truth per-pixel label map. Once trained on the synthetic imagery, SegNet is applied to segment new real-world images. The initial segmentation is refined using an iterative support vector machine (SVM)-based post-processing algorithm. The proposed approach achieves a mean Intersection over Union (IoU) of 87% and a mean accuracy of 94% when tested on 32 frames extracted from two distinct real-world subsea inspection videos. Inference takes several seconds for a typical image.
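The reported evaluation metrics, mean IoU and mean pixel accuracy, can be sketched for binary label maps as follows; the tiny flattened maps are illustrative only:

```python
def iou_and_accuracy(pred, truth):
    """Per-frame Intersection over Union and pixel accuracy for binary label maps."""
    tp = sum(p == t == 1 for p, t in zip(pred, truth))        # true positives
    fp = sum(p == 1 and t == 0 for p, t in zip(pred, truth))  # false positives
    fn = sum(p == 0 and t == 1 for p, t in zip(pred, truth))  # false negatives
    correct = sum(p == t for p, t in zip(pred, truth))
    iou = tp / (tp + fp + fn)
    return iou, correct / len(pred)

# Hypothetical flattened per-pixel label maps (1 = biofouling, 0 = background).
truth = [1, 1, 1, 0, 0, 0, 1, 0]
pred  = [1, 1, 0, 0, 0, 1, 1, 0]

iou, acc = iou_and_accuracy(pred, truth)
print(iou, acc)
```

The paper's 87% / 94% figures are these two quantities averaged over 32 test frames.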


Author(s):  
Subarna Shakya

Deep learning methods have gained increasing research interest, especially in the field of image denoising. The various types of deep learning techniques used for natural image denoising differ significantly in both process and procedure. To be specific, discriminative learning based on convolutional neural networks (CNNs) may effectively solve the problem of Gaussian noise, while deep learning based optimization models are useful in predicting the true noise level. However, no prior work has attempted to summarize the different deep learning approaches to image denoising in one place. This work proposes a framework built in parallel with a previously trained CNN to enhance training speed and accuracy in denoising Gaussian white noise (GWN). In the proposed architecture, ground-truth maps are created by combining additional input patches with the original pictures. Furthermore, by adapting kernel weights when forecasting probability maps, the loss function may be minimized. The approach is efficient in terms of processing time, with less sparsity, while enlarging the objects present in the images. As in conventional methods, performance measures such as PSNR, MSE, and SSIM are computed and compared with one another.
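Of the performance measures named above, MSE and PSNR are straightforward to compute (SSIM involves local windowed statistics and is omitted here); a minimal sketch on toy pixel values:

```python
import math

def mse(a, b):
    """Mean squared error between two images (flattened lists, 0-255)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def psnr(a, b, peak=255.0):
    """Peak signal-to-noise ratio in dB; higher means closer to the reference."""
    err = mse(a, b)
    return float("inf") if err == 0 else 10 * math.log10(peak ** 2 / err)

# Hypothetical clean reference and denoiser output, flattened.
clean = [100, 120, 130, 140]
denoised = [101, 119, 131, 139]

print(mse(clean, denoised), round(psnr(clean, denoised), 2))
```

A denoiser that raises PSNR (equivalently, lowers MSE) against the clean reference is performing better by these two measures.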


Author(s):  
Bing Pan ◽  
Virinchi Savanapelli ◽  
Abhishek Shukla ◽  
Junjun Yin

Abstract: This short paper summarizes the first research stage of applying deep learning techniques to capture human-wildlife interactions in national parks from crowd-sourced data. The results from object detection, image captioning, and distance calculation are reported. We were able to categorize animal types, summarize visitor behaviors in the pictures, and calculate distances between visitors and animals with varying levels of accuracy. Future development will focus on collecting more training data and conducting field experiments to gather ground truth on animal types and distances to animals. More in-depth manual coding is needed to categorize visitor behaviors into acceptable and unacceptable ones.


2018 ◽  
Vol 27 (01) ◽  
pp. 098-109 ◽  
Author(s):  
Nagarajan Ganapathy ◽  
Ramakrishnan Swaminathan ◽  
Thomas Deserno

Objectives: Deep learning models such as convolutional neural networks (CNNs) have been applied successfully to medical imaging, but biomedical signal analysis has yet to fully benefit from this novel approach. Our survey aims at (i) reviewing deep learning techniques for biosignal analysis in computer-aided diagnosis; and (ii) deriving a taxonomy for organizing the growing number of applications in the field. Methods: A comprehensive literature search was performed using PubMed, Scopus, and ACM. Deep learning models were classified with respect to the (i) origin, (ii) dimension, and (iii) type of the biosignal as input to the deep learning model; (iv) the goal of the application; (v) the size and (vi) type of ground truth data; (vii) the type and (viii) schedule of learning the network; and (ix) the topology of the model. Results: Between January 2010 and December 2017, a total of 71 papers were published on the topic. The majority (n = 36) of papers are on electrocardiography (ECG) signals. Most applications (n = 25) aim at detection of patterns, while only a few (n = 6) aim at prediction of events. Out of 36 ECG-based works, many (n = 17) relate to multi-lead ECG. Other biosignals identified in the survey are electromyography, phonocardiography, photoplethysmography, electrooculography, continuous glucose monitoring, acoustic respiratory signal, blood pressure, and electrodermal activity, while ballistocardiography and seismocardiography have yet to be analyzed using deep learning techniques. In supervised and unsupervised applications, CNNs and restricted Boltzmann machines are the most (n = 34) and least (n = 15) frequently used, respectively. Conclusion: Our key-code classification of relevant papers was used to cluster the approaches published to date and demonstrated a large variability of research with respect to data, application, and network topology.
Future research is expected to focus on the standardization of deep learning architectures and on the optimization of the network parameters to increase performance and robustness. Furthermore, application-driven approaches and updated training data from mobile recordings are needed.
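The key-code classification described in the Methods can be sketched as a simple grouping of papers along one taxonomy axis; the records and field names below are illustrative, not the survey's actual coding:

```python
from collections import Counter

# Hypothetical paper records coded along three of the nine survey axes.
papers = [
    {"signal": "ECG", "goal": "detection", "network": "CNN"},
    {"signal": "ECG", "goal": "prediction", "network": "CNN"},
    {"signal": "EMG", "goal": "detection", "network": "RBM"},
]

# Cluster papers along one axis of the taxonomy, e.g. the input biosignal.
by_signal = Counter(p["signal"] for p in papers)
print(by_signal)
```

Counting along each axis in turn yields exactly the kind of per-category tallies (n = 36 ECG papers, n = 25 detection applications, and so on) that the Results section reports.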


2017 ◽  
Author(s):  
Philippe Poulin ◽  
Marc-Alexandre Côté ◽  
Jean-Christophe Houde ◽  
Laurent Petit ◽  
Peter F. Neher ◽  
...  

Abstract: We show that deep learning techniques can be applied successfully to fiber tractography. Specifically, we use feed-forward and recurrent neural networks to learn the generation process of streamlines directly from diffusion-weighted imaging (DWI) data. Furthermore, we empirically study the behavior of the proposed models on a realistic white matter phantom with known ground truth. We show that their performance is competitive with that of commonly used techniques, even when the models are applied to DWI data unseen at training time. We also show that our models recover high spatial coverage of the ground-truth white matter pathways while better controlling the number of false connections. In fact, our experiments suggest that exploiting past information within a streamline's trajectory during tracking helps predict the following direction.
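The streamline generation process described above amounts to iteratively stepping along model-predicted directions; a minimal sketch in which a stand-in function replaces the trained network (all names and values are illustrative):

```python
def track_streamline(seed, predict_direction, steps=5, step_size=1.0):
    """Grow a streamline by repeatedly following predicted unit directions.

    `predict_direction` stands in for the trained model: it maps the current
    point and the previous direction to the next direction, so past trajectory
    information can influence the prediction (as a recurrent model would).
    """
    points = [seed]
    prev_dir = (0.0, 0.0, 1.0)
    for _ in range(steps):
        d = predict_direction(points[-1], prev_dir)
        x, y, z = points[-1]
        points.append((x + step_size * d[0],
                       y + step_size * d[1],
                       z + step_size * d[2]))
        prev_dir = d
    return points

# A stand-in predictor that always points along +z.
straight = lambda point, prev_dir: (0.0, 0.0, 1.0)
streamline = track_streamline((0.0, 0.0, 0.0), straight, steps=3)
print(streamline)
```

In the paper, the predictor is a neural network conditioned on local DWI data rather than a fixed direction, and tracking stops on anatomical criteria rather than a fixed step count.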


Symmetry ◽  
2020 ◽  
Vol 12 (12) ◽  
pp. 2012
Author(s):  
JongBae Kim

This paper proposes a real-time method for detecting a car driving ahead on a tunnel road. Unlike the general road environment, the tunnel environment is irregular and has significantly lower illumination, limited to tunnel lighting and light reflected from driving vehicles. Environmental restrictions are also severe owing to pollution by vehicle exhaust gas. The proposed method detects vehicles in tunnel images in real time using a model trained in advance with deep learning techniques. To detect the vehicle region in the tunnel environment, brightness smoothing and noise removal processes are carried out. The vehicle region is then learned from training images generated with ground-truth annotations. The YOLO v2 model, which showed the best performance among the compared deep learning algorithms, is applied, and the training parameters are refined through experiments. When applied to various tunnel road environments, the proposed method achieves a vehicle detection rate of approximately 87% and a detection accuracy of approximately 94%.
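A detection rate like the one reported above is typically computed by matching predicted bounding boxes against ground-truth boxes at an IoU threshold; a minimal sketch with illustrative boxes (the paper's exact matching criterion is not specified here):

```python
def box_iou(a, b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def detection_rate(detections, ground_truth, iou_thresh=0.5):
    """Fraction of ground-truth vehicles matched by some detection above the threshold."""
    matched = sum(any(box_iou(d, g) >= iou_thresh for d in detections)
                  for g in ground_truth)
    return matched / len(ground_truth)

# Hypothetical boxes: two ground-truth vehicles, two detections,
# one of which overlaps a ground-truth vehicle well.
gt = [(0, 0, 10, 10), (20, 20, 30, 30)]
det = [(1, 1, 11, 11), (50, 50, 60, 60)]
rate = detection_rate(det, gt)
print(rate)
```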

