Sequential Models
Recently Published Documents


TOTAL DOCUMENTS

84
(FIVE YEARS 38)

H-INDEX

14
(FIVE YEARS 4)

2021 ◽  
Vol 12 (1) ◽  
pp. 327
Author(s):  
Cristina Luna-Jiménez ◽  
Ricardo Kleinlein ◽  
David Griol ◽  
Zoraida Callejas ◽  
Juan M. Montero ◽  
...  

Emotion recognition is attracting the attention of the research community due to its multiple applications in different fields, such as medicine or autonomous driving. In this paper, we propose an automatic emotion recognition system consisting of a speech emotion recognizer (SER) and a facial emotion recognizer (FER). For the SER, we evaluated a pre-trained xlsr-Wav2Vec2.0 transformer using two transfer-learning techniques: embedding extraction and fine-tuning. The best accuracy was achieved when we fine-tuned the whole model with a multilayer perceptron appended on top of it, confirming that training is more robust when it does not start from scratch and the network's prior knowledge is close to the target task. For the facial emotion recognizer, we extracted the Action Units from the videos and compared the performance of static models against sequential models. Results showed that sequential models beat static models by a narrow margin. Error analysis indicated that the visual systems could improve with a detector of high-emotional-load frames, which opens a new line of research into ways of learning from videos. Finally, by combining these two modalities with a late fusion strategy, we achieved 86.70% accuracy on the RAVDESS dataset under a subject-wise 5-CV evaluation, classifying eight emotions. These results demonstrate that both modalities carry relevant information about the user's emotional state and that their combination improves the final system's performance.
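As a rough illustration of the late-fusion step this abstract describes, the sketch below averages the per-class posteriors of the two recognizers; a minimal sketch, assuming a weighted average with weight `alpha` (the paper does not specify its fusion rule, so the weighting and the helper name `late_fusion` are assumptions).

```python
import numpy as np

# Hypothetical late-fusion sketch: combine per-class posteriors from the
# speech (SER) and facial (FER) recognizers over the eight RAVDESS emotions.
# The weighting scheme `alpha` is an assumption, not taken from the paper.
def late_fusion(p_ser: np.ndarray, p_fer: np.ndarray, alpha: float = 0.5) -> int:
    """p_ser, p_fer: arrays of shape (8,) with class probabilities.
    Returns the fused class index."""
    fused = alpha * p_ser + (1.0 - alpha) * p_fer  # weighted average of posteriors
    return int(np.argmax(fused))

# Example: fuse dummy posteriors for one clip
p_speech = np.full(8, 0.125)                 # uninformative speech posterior
p_face = np.eye(8)[3] * 0.5 + 0.0625         # face model favors class 3
print(late_fusion(p_speech, p_face))         # -> 3
```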


2021 ◽  
Author(s):  
Ashutosh Kumar

Abstract A single well in a mature field produces approximately 1.7 million Measurement While Drilling (MWD) data points. We currently rely either on cross-correlation and covariance measurements or on Long Short-Term Memory (LSTM) based deep learning algorithms to diagnose long sequences of extremely noisy data, but an LSTM's context size of about 200 tokens barely covers the entire depth. The proposed work develops an application of Transformer-based deep learning algorithms to diagnose and predict events in complex sequences of well-log data. Sequential models learn geological patterns and petrophysical trends to detect events across the depths of well-log data; however, vanishing gradients, exploding gradients, and the limits of convolutional filters restrict the diagnosis of ultra-deep wells with complex subsurface information, as does the vast number of operations required to relate events between two subsurface points at large separation. Transformer-based Models (TbMs) rely on non-sequential modelling that uses self-attention to relate information from different positions in the well-log sequence, allowing an end-to-end, non-sequential, parallel memory network. We use approximately 21 million data points from 21 wells of Volve, an oil field in the North Sea, for the experiment. Events in time-series well-logs are conventionally modelled with LSTMs, in addition to auto-regression (AR), autoregressive moving average (ARMA), and autoregressive integrated moving average (ARIMA); however, the complex global dependencies needed to detect events in the heterogeneous subsurface are challenging for these sequence models. In the presented work we begin with one meter of depth data from Volve and proceed up to 1000 meters. Initially, the LSTM and ARIMA models were acceptable, but as depth increased beyond a few hundred meters their diagnoses started underperforming and a new methodology was required. TbMs have already outperformed several models in large-sequence modelling for natural language processing tasks, so they are very promising for modelling well-log data with very large depth separation. We scale features and labels according to the maximum and minimum values present in the training dataset and then use a sliding window to obtain training and evaluation data pairs from the well-logs. Additional subsurface features encoded some information in the conventional sequential models, but the results did not compare favourably with the TbMs, which achieved a Root Mean Square Error of 0.27 on a 0-1 scale while diagnosing depths up to 5000 meters. This is the first paper to show a successful application of Transformer-based deep learning models to well-log diagnosis. The presented model uses a self-attention mechanism to learn complex dependencies and non-linear events from well-log data. Moreover, the experimental setting discussed in the paper will act as a generalized framework for data from ultra-deep wells and their extremely heterogeneous subsurface environment.
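The preprocessing this abstract names, min-max scaling against the training split followed by a sliding window over depth, could look like the minimal sketch below; the window length, stride, and feature count are assumptions for illustration, not values from the paper.

```python
import numpy as np

# Minimal sketch of the described preprocessing: min-max scaling using only
# training-set extremes, then a sliding window along the depth axis that
# yields (input window, next value) pairs for training and evaluation.
def min_max_scale(x: np.ndarray, x_train: np.ndarray) -> np.ndarray:
    lo, hi = x_train.min(axis=0), x_train.max(axis=0)
    return (x - lo) / (hi - lo + 1e-12)  # scale to [0, 1] per feature

def sliding_windows(series: np.ndarray, window: int = 64, horizon: int = 1):
    """Yield (past window, target) pairs along the depth axis."""
    for i in range(len(series) - window - horizon + 1):
        yield series[i:i + window], series[i + window + horizon - 1]

logs = np.random.rand(1000, 4)            # placeholder well-log curves
scaled = min_max_scale(logs, logs[:800])  # scale with the training split only
pairs = list(sliding_windows(scaled))     # (window, target) training pairs
```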


2021 ◽  
Vol 3 (4) ◽  
pp. 922-945
Author(s):  
Shaw-Hwa Lo ◽  
Yiqiao Yin

Text classification is a fundamental task in Natural Language Processing. A variety of sequential models can make good predictions, yet there is a lack of connection between language semantics and prediction results. This paper proposes a novel influence score (I-score), a greedy search algorithm called the Backward Dropping Algorithm (BDA), and a novel feature engineering technique called the "dagger technique". First, the paper proposes using the I-score to detect and search for the important language semantics in text documents that are useful for making good predictions in text classification tasks. Next, the Backward Dropping Algorithm is proposed to handle long-term dependencies in the dataset. Moreover, the paper proposes the "dagger technique", which fully preserves the relationship between the explanatory variable and the response variable. The proposed techniques generalize to any feed-forward Artificial Neural Network (ANN) or Convolutional Neural Network (CNN). In a real-world application on the Internet Movie Database (IMDB), the proposed methods improve prediction performance with an 81% error reduction compared to popular peer methods that do not implement the I-score and "dagger technique".
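For readers unfamiliar with the I-score and BDA, here is a minimal sketch assuming the general form of the influence score from the earlier Lo-Zheng influence-measure literature (sum over partition cells of squared cell-mean deviations, weighted by squared cell size); the exact normalization and the greedy loop below are assumptions, not this paper's implementation.

```python
import numpy as np
from itertools import product

# Hedged sketch of an influence score (I-score) over the partition induced
# by a subset of binary features, and a Backward Dropping Algorithm (BDA)
# that greedily removes the feature whose removal raises the score most.
def i_score(X: np.ndarray, y: np.ndarray, cols: list) -> float:
    n, y_bar = len(y), y.mean()
    score = 0.0
    for cell in product([0, 1], repeat=len(cols)):   # one cell per value combo
        mask = np.all(X[:, cols] == cell, axis=1)
        n_j = mask.sum()
        if n_j:
            score += (n_j ** 2) * (y[mask].mean() - y_bar) ** 2
    return score / (n * y.var() + 1e-12)

def backward_dropping(X: np.ndarray, y: np.ndarray, cols: list) -> list:
    """Greedy BDA sketch: drop features while doing so increases the I-score."""
    cols = list(cols)
    while len(cols) > 1:
        current = i_score(X, y, cols)
        trials = [i_score(X, y, cols[:k] + cols[k + 1:]) for k in range(len(cols))]
        best = int(np.argmax(trials))
        if trials[best] <= current:
            break                 # no single drop improves the score
        cols.pop(best)            # dropping this feature raised the score
    return cols

# Toy usage on random binary features
X = np.random.randint(0, 2, size=(500, 6))
y = (X[:, 0] & X[:, 2]).astype(float)      # outcome driven by features 0 and 2
print(backward_dropping(X, y, list(range(6))))
```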


2021 ◽  
Author(s):  
Zaccheus Olaofe

Abstract This paper assesses the performance accuracies of three forecast architectures (Long Short-Term Memory, LSTM; Convolutional Neural Network, Conv2D; and hybrid ConvLSTM2D) for multivariate-input, multi-step wind speed and direction forecasts. These high-level neural-network architectures were set up with Keras sequential models trained to learn historical patterns from the processed weather input datasets. To build the forecast models, sampled time-series weather observations at different station heights were obtained and reshaped for network-layer compatibility, and the Adamax algorithm was used for network optimization. The trained models were evaluated with different input data sequences (normalized and un-normalized), and the forecast results were compared with the actual data and with Conv1D models. Upon optimal network training, the Conv2D model returned MSE, MAE and RMSE values of 0.82, 4.48 and 0.91%, respectively; the LSTM model returned 1.03, 4.75 and 1.01%; and the ConvLSTM2D model returned 2.11, 10.13 and 1.45%. On validation, the Conv2D model returned values of 3.16, 14.73 and 1.77%, respectively; the LSTM-based model 3.21, 14.98 and 1.82%; and the ConvLSTM2D model 3.27, 15.92 and 1.91%. The findings show that better prediction and evaluation could be achieved for all the trained model architectures compared with the untrained models. The predicted results also show that Keras sequential models are useful for replicating time-series historical wind speed and direction, given well-tuned model hyperparameters and a suitable input sequence structure.
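A minimal sketch of one such Keras sequential forecast model follows, matching the abstract's stated ingredients (a Keras Sequential LSTM with the Adamax optimizer, multivariate input window, multi-step speed and direction output); all layer sizes, the window length, and the feature count are assumptions for illustration.

```python
from tensorflow import keras
from tensorflow.keras import layers

# Hedged sketch of an LSTM-based Keras sequential model mapping a multivariate
# weather window to multi-step wind speed/direction forecasts, trained with
# Adamax as the abstract describes. Sizes below are illustrative assumptions.
WINDOW, FEATURES, STEPS_AHEAD = 24, 6, 4

model = keras.Sequential([
    keras.Input(shape=(WINDOW, FEATURES)),    # reshaped weather observations
    layers.LSTM(64),                          # learn historical patterns
    layers.Dense(32, activation="relu"),
    layers.Dense(STEPS_AHEAD * 2),            # speed + direction per step
    layers.Reshape((STEPS_AHEAD, 2)),
])
model.compile(optimizer=keras.optimizers.Adamax(), loss="mse", metrics=["mae"])
model.summary()
# model.fit(x_train, y_train, validation_data=(x_val, y_val), epochs=...) would
# follow, with x shaped (samples, WINDOW, FEATURES) and y (samples, STEPS_AHEAD, 2).
```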


Author(s):  
Corrado Fumagalli

This chapter delves into debates about time and democratic deliberation. Deliberative democrats have developed sequential models but tend to think of time mainly as a background variable. Critics have drawn attention to the inadequacy of deliberation in accelerated society but, in so doing, have conflated arguments about the pace of democratic deliberation with arguments about its durational time. Democratic deliberation may be slow and inconclusive, but one aspect does not necessarily entail the other. It is against this backdrop that the chapter sheds light on a diachronic reading of fallibilism in order to advance a more favorable reading of inconclusiveness.


Author(s):  
Pisit NAKJAI ◽  
Tatpong KATANYUKUL

This article explores the transcription of a video recording of Thai Finger Spelling (TFS), a specific signing mode used in Thai sign language, into a corresponding Thai word. TFS covers 42 Thai letters and 20 vowels using multiple and complex signing schemes, which leads to many technical challenges uncommon in the spelling schemes of other sign languages. Our proposed system, Automatic Thai Finger Spelling Transcription (ATFS), processes a signing video in three stages: ALS marks video frames, making it easy to remove non-signing frames and to group frames associated with the same letter; SR classifies a signing image frame to a sign label (or its equivalent); and SSR transcribes a series of signs into letters. ALS exploits the TFS practice of signing different letters at different locations, while SR and SSR employ well-adopted spatial and sequential models. ATFS achieves an Alphabet Error Rate (AER) of 0.256 (cf. 0.63 for the baseline method). Beyond ATFS itself, our findings disclose a benefit of coupling the image classification and sequence modeling stages by using a feature (penultimate) vector for label representation rather than a definitive label or one-hot coding. Our results also assert the necessity of a smoothing mechanism in ALS and reveal a benefit of our proposed WFS, which can lead to over 15.88% improvement. For TFS transcription, our work emphasizes using signing location to identify different letters, contrary to the common belief in exploiting signing duration, which our data show to be ineffective.
HIGHLIGHTS
- Prototype of Thai finger spelling transcription (transcribing a signing video to letters)
- Use of signing location as a cue for identifying different letters
- Disclosure of a benefit of coupling image classification and sequence modeling in signing transcription
- Examination of various frame-smoothing techniques and their contributions to the overall transcription performance
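The three-stage structure (ALS, then SR, then SSR) can be summarized as the pipeline sketch below; the function bodies are trivial hypothetical placeholders, not the authors' code, and only the staging itself is taken from the abstract.

```python
from typing import List
import numpy as np

# Hedged sketch of the ATFS pipeline: ALS groups signing frames, SR maps each
# group to a feature (penultimate) vector, SSR decodes the sequence into
# letters. All bodies below are placeholder stand-ins for illustration.

def als_mark(frames: List[np.ndarray]) -> List[List[np.ndarray]]:
    """ALS placeholder: group every 10 frames as one letter segment
    (the real stage uses signing location to mark and group frames)."""
    return [frames[i:i + 10] for i in range(0, len(frames), 10)]

def sr_classify(group: List[np.ndarray]) -> np.ndarray:
    """SR placeholder: return a feature vector per group; the paper found
    feature vectors preferable to one-hot labels at this interface."""
    return np.mean([f.mean(axis=(0, 1)) for f in group], axis=0)

def ssr_transcribe(features: List[np.ndarray]) -> str:
    """SSR placeholder: map each feature vector to a Thai letter."""
    return "".join(chr(0x0E01 + int(abs(v.sum())) % 42) for v in features)

def atfs(frames: List[np.ndarray]) -> str:
    return ssr_transcribe([sr_classify(g) for g in als_mark(frames)])

# Example on dummy video frames (height x width x channels)
video = [np.random.rand(64, 64, 3) for _ in range(30)]
print(atfs(video))
```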


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Tong Wu ◽  
Yunlong Wang ◽  
Yue Wang ◽  
Emily Zhao ◽  
Yilian Yuan

Abstract Automatic representation learning of key entities in electronic health record (EHR) data is a critical step for healthcare data mining that turns heterogeneous medical records into structured and actionable information. Here we propose an algorithmic framework for learning continuous low-dimensional embedding vectors of the most common entities in EHR: medical services, doctors, and patients. The framework features a hierarchical structure that encapsulates different node embedding schemes to cater for the unique characteristics of each medical entity. To embed medical services, we employ a biased-random-walk-based node embedding that leverages the irregular time intervals of medical services in EHR to embody their relative importance. To embed doctors and patients, we adhere to the principle "it's what you do that defines you" and derive their embeddings from their interactions with other types of entities, through a graph neural network and a proximity-preserving network embedding, respectively. Using real-world clinical data, we demonstrate the efficacy of the framework over competitive baselines on diagnosis prediction, readmission prediction, and recommending doctors to patients based on their medical conditions. In addition, medical service embeddings pretrained with the framework can substantially improve the performance of sequential models in predicting patients' clinical outcomes. Overall, the framework can serve as a general-purpose representation learning algorithm for EHR data and benefit various downstream tasks in terms of both performance and interpretability.
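The biased-random-walk idea for medical-service embedding could be sketched as below; the inverse-interval weighting rule, the graph shape, and the function name are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

# Hedged sketch of a biased random walk over a medical-service graph, where
# edge weights derive from the irregular time intervals between services
# (here assumed to be 1 / interval, so closely spaced services co-occur more
# often in walks). Walks like these would then feed a skip-gram embedder.
def biased_walk(adj_weights: dict, start: str, length: int, rng=None) -> list:
    """adj_weights: {service: {neighbor: weight}}; returns a node sequence."""
    rng = rng or np.random.default_rng(0)
    walk = [start]
    for _ in range(length - 1):
        nbrs = adj_weights.get(walk[-1])
        if not nbrs:
            break
        names, w = zip(*nbrs.items())
        p = np.asarray(w, dtype=float)
        walk.append(rng.choice(names, p=p / p.sum()))  # interval-biased step
    return walk

# Toy example: three services with interval-derived weights (assumed values)
graph = {"labs": {"imaging": 1 / 2, "surgery": 1 / 30},
         "imaging": {"labs": 1 / 2}, "surgery": {"labs": 1 / 30}}
print(biased_walk(graph, "labs", 5))
```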


2021 ◽  
Vol 1074 (1) ◽  
pp. 012030
Author(s):  
M. Shyamala Devi ◽  
Ankita Sagar ◽  
Karan Thapa ◽  
Mreenmoy Hazarika ◽  
P. Swathi
