Two Stage Continuous Gesture Recognition Based on Deep Learning

Electronics ◽  
2021 ◽  
Vol 10 (5) ◽  
pp. 534
Author(s):  
Huogen Wang

The paper proposes an effective continuous gesture recognition method, which includes two modules: segmentation and recognition. In the segmentation module, the video frames are divided into gesture frames and transitional frames by using the information of hand motion and appearance, and continuous gesture sequences are segmented into isolated sequences. In the recognition module, our method exploits the spatiotemporal information embedded in RGB and depth sequences. For the RGB modality, our method adopts Convolutional Long Short-Term Memory Networks to learn long-term spatiotemporal features from short-term spatiotemporal features obtained from a 3D convolutional neural network. For the depth modality, our method converts a sequence into Dynamic Images and Motion Dynamic Images through weighted rank pooling and feeds them into separate Convolutional Neural Networks. Our method has been evaluated on both the ChaLearn LAP Large-scale Continuous Gesture Dataset and the Montalbano Gesture Dataset and achieved state-of-the-art performance.
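The segmentation step described above can be illustrated with a minimal sketch. It assumes that each frame has already been labelled as a gesture frame (True) or a transitional frame (False); how the paper derives those labels from hand motion and appearance is not reproduced here. The function name `segment_gestures` and the `min_len` parameter are hypothetical, introduced only for this example.

```python
from typing import List, Tuple


def segment_gestures(is_gesture: List[bool], min_len: int = 1) -> List[Tuple[int, int]]:
    """Group consecutive gesture frames into (start, end) pairs, end inclusive.

    is_gesture: hypothetical per-frame labels (True = gesture frame,
                False = transitional frame).
    min_len:    assumed filter that drops runs shorter than this many frames.
    """
    segments: List[Tuple[int, int]] = []
    start = None  # index where the current gesture run began, if any
    for i, flag in enumerate(is_gesture):
        if flag and start is None:
            start = i  # a gesture run begins
        elif not flag and start is not None:
            if i - start >= min_len:
                segments.append((start, i - 1))  # the run ended on the previous frame
            start = None
    # Close a run that extends to the last frame.
    if start is not None and len(is_gesture) - start >= min_len:
        segments.append((start, len(is_gesture) - 1))
    return segments


labels = [False, True, True, True, False, False, True, True, False]
print(segment_gestures(labels))  # [(1, 3), (6, 7)]
```

Each returned pair marks one isolated sequence that could then be passed to a recognition module.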

Atmosphere ◽  
2020 ◽  
Vol 11 (6) ◽  
pp. 569
Author(s):  
Suting Chen ◽  
Song Zhang ◽  
Huantong Geng ◽  
Yaodeng Chen ◽  
Chuang Zhang ◽  
...  

To address the loss of spatiotemporal information and the low forecast accuracy of traditional radar echo nowcasting, this paper proposes an encoding-forecasting model (3DCNN-BCLSTM) that combines 3DCNN with bi-directional convolutional long short-term memory. The model first arranges the input data into 3D tensors with spatiotemporal features, extracts local short-term spatiotemporal features of radar echoes through 3D convolution networks, then uses the bi-directional convolutional LSTM to learn global long-term spatiotemporal feature dependencies, and finally forecasts echo image changes with a forecasting network. This structure can fully capture the spatiotemporal correlation of radar echoes in continuous motion and forecast the short-term movement of radar echoes within a region more accurately. Experiments use radar echo images recorded by the Shenzhen and Hong Kong meteorological stations. The results show that the critical success index (CSI) of the proposed model for eight predicted echoes reaches 0.578 when the echo threshold is 10 dBZ, the false alarm ratio (FAR) is 20% lower than that of the convolutional LSTM network (ConvLSTM), and the mean square error (MSE) is 16% lower than that of the real-time optical flow by variational method (ROVER), outperforming current state-of-the-art radar echo nowcasting methods.
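The CSI, FAR, and MSE scores quoted above follow standard forecast-verification definitions, which the sketch below computes on toy values. The sample numbers are invented for illustration and are not data from the paper, and treating a value at or above the threshold as an "event" is an assumption about the convention used.

```python
from typing import Sequence, Tuple


def contingency(pred: Sequence[float], obs: Sequence[float],
                threshold: float) -> Tuple[int, int, int]:
    """Count hits, misses, and false alarms for events at or above threshold."""
    hits = misses = false_alarms = 0
    for p, o in zip(pred, obs):
        p_event = p >= threshold  # assumed convention: >= counts as an event
        o_event = o >= threshold
        if p_event and o_event:
            hits += 1
        elif o_event:
            misses += 1
        elif p_event:
            false_alarms += 1
    return hits, misses, false_alarms


def csi(hits: int, misses: int, false_alarms: int) -> float:
    """Critical success index: hits / (hits + misses + false alarms)."""
    denom = hits + misses + false_alarms
    return hits / denom if denom else float("nan")


def far(hits: int, false_alarms: int) -> float:
    """False alarm ratio: false alarms / (hits + false alarms)."""
    denom = hits + false_alarms
    return false_alarms / denom if denom else float("nan")


def mse(pred: Sequence[float], obs: Sequence[float]) -> float:
    """Mean square error over paired values."""
    return sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(pred)


# Toy reflectivity values in dBZ (illustrative only).
pred = [12, 5, 30, 8, 15, 0]
obs = [11, 12, 25, 3, 9, 0]
h, m, f = contingency(pred, obs, threshold=10)
print(h, m, f)       # 2 1 1
print(csi(h, m, f))  # 0.5
```

A higher CSI and a lower FAR or MSE indicate a better forecast, which is the direction of the comparisons reported in the abstract.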


2016 ◽  
Vol 39 ◽  
Author(s):  
Mary C. Potter

Rapid serial visual presentation (RSVP) of words or pictured scenes provides evidence for a large-capacity conceptual short-term memory (CSTM) that momentarily provides rich associated material from long-term memory, permitting rapid chunking (Potter 1993; 2009; 2012). In perception of scenes as well as language comprehension, we make use of knowledge that briefly exceeds the supposed limits of working memory.


2020 ◽  
Vol 29 (4) ◽  
pp. 710-727
Author(s):  
Beula M. Magimairaj ◽  
Naveen K. Nagaraj ◽  
Alexander V. Sergeev ◽  
Natalie J. Benafield

Objectives: School-age children with and without parent-reported listening difficulties (LiD) were compared on auditory processing, language, memory, and attention abilities. The objective was to extend what is known so far in the literature about children with LiD by using multiple measures and selective novel measures across the above areas. Design: Twenty-six children who were reported by their parents as having LiD and 26 age-matched typically developing children completed clinical tests of auditory processing and multiple measures of language, attention, and memory. All children had normal-range pure-tone hearing thresholds bilaterally. Group differences were examined. Results: In addition to significantly poorer speech-perception-in-noise scores, children with LiD had reduced speed and accuracy of word retrieval from long-term memory, poorer short-term memory, sentence recall, and inferencing ability. Statistically significant group differences were of moderate effect size; however, standard test scores of children with LiD were not clinically poor. No statistically significant group differences were observed in attention, working memory capacity, vocabulary, or nonverbal IQ. Conclusions: Mild signal-to-noise ratio loss, as reflected by the group mean of children with LiD, supported the children's functional listening problems. In addition, children's relative weakness in select areas of language performance, short-term memory, and long-term memory lexical retrieval speed and accuracy added to previous research on evidence-based areas that need to be evaluated in children with LiD, who almost always have heterogeneous profiles. Importantly, the functional difficulties faced by children with LiD in relation to their test results indicated, to some extent, that commonly used assessments may not be adequately capturing the children's listening challenges. Supplemental Material: https://doi.org/10.23641/asha.12808607


2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Sungmin O. ◽  
Rene Orth

While soil moisture information is essential for a wide range of hydrologic and climate applications, spatially-continuous soil moisture data is only available from satellite observations or model simulations. Here we present a global, long-term dataset of soil moisture derived through machine learning trained with in-situ measurements, SoMo.ml. We train a Long Short-Term Memory (LSTM) model to extrapolate daily soil moisture dynamics in space and in time, based on in-situ data collected from more than 1,000 stations across the globe. SoMo.ml provides multi-layer soil moisture data (0–10 cm, 10–30 cm, and 30–50 cm) at 0.25° spatial and daily temporal resolution over the period 2000–2019. The performance of the resulting dataset is evaluated through cross validation and inter-comparison with existing soil moisture datasets. SoMo.ml performs especially well in terms of temporal dynamics, making it particularly useful for applications requiring time-varying soil moisture, such as anomaly detection and memory analyses. SoMo.ml complements the existing suite of modelled and satellite-based datasets given its distinct derivation, to support large-scale hydrological, meteorological, and ecological analyses.


2021 ◽  
Vol 13 (2) ◽  
pp. 164
Author(s):  
Chuyao Luo ◽  
Xutao Li ◽  
Yongliang Wen ◽  
Yunming Ye ◽  
Xiaofeng Zhang

Precipitation nowcasting is an important task in operational weather forecasting, and radar echo map extrapolation plays a vital role in it. Recently, deep learning techniques such as Convolutional Recurrent Neural Network (ConvRNN) models have been designed to solve the task. These models, albeit performing much better than conventional optical-flow-based approaches, suffer from a common problem of underestimating the high echo value parts. The drawback is fatal to precipitation nowcasting, as these parts often lead to heavy rains that may cause natural disasters. In this paper, we propose a novel interaction dual attention long short-term memory (IDA-LSTM) model to address the drawback. In the method, an interaction framework is developed for the ConvRNN unit to fully exploit the short-term context information by constructing a series of coupled convolutions on the input and hidden states. Moreover, a dual attention mechanism on channels and positions is developed to recall the forgotten information in the long term. Comprehensive experiments have been conducted on CIKM AnalytiCup 2017 data sets, and the results show the effectiveness of the IDA-LSTM in addressing the underestimation drawback. The extrapolation performance of IDA-LSTM is superior to that of the state-of-the-art methods.


2014 ◽  
Vol 26 (7) ◽  
pp. 1377-1389 ◽  
Author(s):  
Bo-Cheng Kuo ◽  
Mark G. Stokes ◽  
Alexandra M. Murray ◽  
Anna Christina Nobre

In the current study, we tested whether representations in visual STM (VSTM) can be biased via top–down attentional modulation of visual activity in retinotopically specific locations. We manipulated attention using retrospective cues presented during the retention interval of a VSTM task. Retrospective cues triggered activity in a large-scale network implicated in attentional control and led to retinotopically specific modulation of activity in early visual areas V1–V4. Importantly, shifts of attention during VSTM maintenance were associated with changes in functional connectivity between pFC and retinotopic regions within V4. Our findings provide new insights into top–down control mechanisms that modulate VSTM representations for flexible and goal-directed maintenance of the most relevant memoranda.


1978 ◽  
Vol 10 (2) ◽  
pp. 141-148
Author(s):  
Mary Anne Herndon

In a model of the functioning of short-term memory, the encoding of information for subsequent storage in long-term memory is simulated. In the encoding process, semantically equivalent paragraphs are detected for recombination into a macro information unit. This recombination process can be used to relieve the limited storage capacity constraint of short-term memory and subsequently increase processing efficiency. The simulation results favorably indicate that cluster analysis can serve as a tool for simulating the encoding function in the detection of semantically similar paragraphs.
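The idea of grouping similar paragraphs into larger units can be sketched as follows. This is not the simulation from the paper: it substitutes word-overlap (Jaccard) similarity for semantic equivalence and uses simple single-linkage grouping with a union-find structure as a stand-in for the cluster analysis. The function names, the 0.5 threshold, and the sample paragraphs are all assumptions made for illustration.

```python
from typing import Dict, List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Word-overlap similarity: |a & b| / |a | b| (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_paragraphs(paragraphs: List[str], threshold: float = 0.5) -> List[List[int]]:
    """Group paragraph indices whose word sets overlap at or above threshold.

    Single-linkage: two paragraphs share a group if a chain of
    sufficiently similar pairs connects them.
    """
    tokens = [set(p.lower().split()) for p in paragraphs]
    n = len(paragraphs)
    parent = list(range(n))  # union-find forest over paragraph indices

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if jaccard(tokens[i], tokens[j]) >= threshold:
                parent[find(j)] = find(i)  # merge the two groups

    groups: Dict[int, List[int]] = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


paragraphs = [
    "the cat sat on the mat",
    "the cat sat on a mat",
    "stock prices fell sharply today",
]
print(cluster_paragraphs(paragraphs))  # [[0, 1], [2]]
```

Each resulting group plays the role of one macro information unit, so three stored items are reduced to two in this toy case.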


2017 ◽  
Vol 14 (1) ◽  
pp. 172988141769231 ◽  
Author(s):  
Ning An ◽  
Shi-Ying Sun ◽  
Xiao-Guang Zhao ◽  
Zeng-Guang Hou

Visual tracking is a challenging computer vision task due to the significant observation changes of the target. By contrast, the tracking task is relatively easy for humans. In this article, we propose a tracker inspired by the memory mechanism described in cognitive psychology. Like human memory, it decomposes the tracking task into a sensory memory register, a short-term memory tracker, and a long-term memory tracker. The sensory memory register captures information with three-dimensional perception; the short-term memory tracker builds the highly plastic observation model via memory rehearsal; the long-term memory tracker builds the highly stable observation model via memory encoding and retrieval. With the cooperative models, the tracker can easily handle various tracking scenarios. In addition, an appearance-shape learning method is proposed to update the two-dimensional appearance model and three-dimensional shape model appropriately. Extensive experimental results on a large-scale benchmark data set demonstrate that the proposed method outperforms the state-of-the-art two-dimensional and three-dimensional trackers in terms of efficiency, accuracy, and robustness.

