Exploring Personalized Neural Conversational Models

Author(s):  
Satwik Kottur ◽  
Xiaoyu Wang ◽  
Vitor Carvalho

Modeling dialog systems is currently one of the most active research problems in Natural Language Processing. Recent advances in Deep Learning have sparked interest in the use of neural networks for modeling language, particularly for personalized conversational agents that can retain contextual information during dialog exchanges. This work carefully explores and compares several recently proposed neural conversation models and carries out a detailed evaluation of the multiple factors that can significantly affect predictive performance, such as pretraining, embedding training, data cleaning, diversity reranking, and the evaluation setting. Based on the tradeoffs of the different models, we propose a new generative dialogue model, conditioned on speakers as well as context history, that outperforms all previous models on both retrieval and generative metrics. Our findings indicate that pretraining speaker embeddings on larger datasets, as well as bootstrapping word and speaker embeddings, can significantly improve performance (by up to 3 points in perplexity), and that promoting diversity using Mutual Information-based techniques has a very strong effect on ranking metrics.
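As a minimal illustration of the Mutual-Information-based reranking idea mentioned above (not the authors' exact formulation), the sketch below reranks candidate responses by log p(T|S) - λ log p(T); the two scoring functions are placeholders standing in for the trained conversational model and a language-model prior.

```python
def mmi_rerank(candidates, log_p_response_given_context, log_p_response, lam=0.5):
    """Rerank candidate responses by log p(T|S) - lambda * log p(T).

    Penalizing the unconditional likelihood log p(T) demotes generic, high-frequency
    replies ("I don't know") and promotes responses that are informative about the context S.
    """
    scored = []
    for response in candidates:
        score = log_p_response_given_context(response) - lam * log_p_response(response)
        scored.append((score, response))
    return [r for _, r in sorted(scored, key=lambda x: x[0], reverse=True)]

# Toy usage with hard-coded log-probabilities standing in for real model scores.
toy_conditional = {"i don't know": -2.0, "the meeting moved to friday": -3.5}
toy_prior = {"i don't know": -1.0, "the meeting moved to friday": -6.0}
print(mmi_rerank(list(toy_conditional), toy_conditional.get, toy_prior.get, lam=0.5))
# The specific, lower-prior response now ranks above the generic one.
```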

PLoS ONE ◽  
2021 ◽  
Vol 16 (6) ◽  
pp. e0252918
Author(s):  
Christopher Ifeanyi Eke ◽  
Azah Anir Norman ◽  
Liyana Shuib

Sarcasm is the main reason behind the faulty classification of tweets. It poses a challenge for natural language processing (NLP) because it hampers the detection of people's actual sentiment. Various feature engineering techniques have been investigated for the automatic detection of sarcasm. However, most related techniques have concentrated only on content-based features of the sarcastic expression, leaving contextual information aside. This leads to a loss of the semantics of words in the sarcastic expression. Another drawback is the sparsity of the training data: due to the word limit of microblogs, the bag-of-words (BoW) feature vector constructed for each sample contains mostly null features. To address these problems, a Multi-feature Fusion Framework with two classification stages is proposed. The first-stage classifier is built on lexical features only, extracted using the BoW technique, and trained with five standard classifiers (SVM, DT, KNN, LR, and RF) to predict the sarcastic tendency. In stage two, the lexical sarcastic-tendency feature is fused with eight other proposed features that model the context to obtain the final prediction. The effectiveness of the developed framework is tested through various experimental analyses of classifier performance. The evaluation shows that the classification models built on the proposed feature fusion achieve a precision of 0.947 with the Random Forest classifier. Finally, the results are compared with those of three baseline approaches, and the comparison demonstrates the significance of the proposed framework.
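The two-stage fusion idea can be outlined as follows; this is an illustrative sketch with toy data and scikit-learn defaults, not the authors' exact configuration, and the toy `context_features` array stands in for the paper's eight contextual features.

```python
# Stage 1: BoW lexical model predicts a sarcastic-tendency score.
# Stage 2: that score is fused with contextual features for the final classifier.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer

tweets = ["great, another monday...", "lovely weather today", "oh sure, that will work"]
labels = np.array([1, 0, 1])                                  # 1 = sarcastic, 0 = not sarcastic
context_features = np.array([[0.2, 3], [0.8, 1], [0.1, 4]])   # stand-in contextual cues

# Stage 1: lexical (BoW) model produces a sarcastic-tendency probability per tweet.
bow = CountVectorizer()
X_lexical = bow.fit_transform(tweets)
stage1 = RandomForestClassifier(n_estimators=50, random_state=0).fit(X_lexical, labels)
tendency = stage1.predict_proba(X_lexical)[:, 1]

# Stage 2: fuse the tendency score with the contextual features and train the final model.
X_fused = np.column_stack([tendency, context_features])
stage2 = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_fused, labels)
print(stage2.predict(X_fused))
```

In practice the stage-one tendency scores for the training set would be produced with held-out predictions (e.g. scikit-learn's `cross_val_predict`) so the fused model is not trained on leaked stage-one outputs.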


Information ◽  
2019 ◽  
Vol 10 (3) ◽  
pp. 82 ◽  
Author(s):  
Momchil Hardalov ◽  
Ivan Koychev ◽  
Preslav Nakov

Recent advances in deep neural networks, language modeling and language generation have introduced new ideas to the field of conversational agents. As a result, deep neural models such as sequence-to-sequence, memory networks, and the Transformer have become key ingredients of state-of-the-art dialog systems. While those models are able to generate meaningful responses even in unseen situations, they require large amounts of training data to build a reliable model. Thus, most real-world systems have used traditional approaches based on information retrieval (IR) and even hand-crafted rules, due to their robustness and effectiveness, especially for narrow-focused conversations. Here, we present a method that adapts a deep neural architecture from the domain of machine reading comprehension to re-rank the suggested answers from different models using the question as a context. We train our model using negative sampling based on question–answer pairs from the Twitter Customer Support Dataset. The experimental results show that our re-ranking framework can improve the performance in terms of word overlap and semantics, both for individual models and for model combinations.
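A hedged sketch of the negative-sampling setup described above: for each question, the true answer is kept as a positive example and randomly drawn answers from other conversations serve as negatives for training a re-ranker. The question-answer pairs below are placeholders, not the Twitter Customer Support Dataset itself.

```python
import random

qa_pairs = [
    ("my order never arrived", "sorry to hear that, please DM your order number"),
    ("how do I reset my password", "use the 'forgot password' link on the login page"),
    ("is the outage fixed", "service was restored at 14:00 UTC"),
]

def build_ranking_examples(pairs, num_negatives=2, seed=0):
    """Return (question, candidate_answer, label) triples: 1 for the true answer, 0 for sampled negatives."""
    rng = random.Random(seed)
    all_answers = [a for _, a in pairs]
    examples = []
    for question, answer in pairs:
        examples.append((question, answer, 1))                       # positive pair
        for negative in rng.sample([a for a in all_answers if a != answer], num_negatives):
            examples.append((question, negative, 0))                 # sampled negative
    return examples

# Each question now has one positive and two negative candidates for the re-ranker to score.
print(build_ranking_examples(qa_pairs)[:3])
```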


2016 ◽  
Vol 26 (01) ◽  
pp. 1650002 ◽  
Author(s):  
David Griol ◽  
José Antonio Iglesias ◽  
Agapito Ledezma ◽  
Araceli Sanchis

This paper proposes a statistical framework to develop user-adapted spoken dialog systems. The proposed framework integrates two main models. The first model is used to predict the user’s intention during the dialog. The second model uses this prediction and the history of the dialog up to the current moment to predict the next system response. This prediction is performed with an ensemble-based classifier trained for each of the tasks considered, so that a better selection of the next system response can be attained by weighting the outputs of these specialized classifiers. The codification of the information and the definition of data structures to store the data supplied by the user throughout the dialog make it manageable to estimate the models from training data in practical domains. We describe our proposal, its application, and a detailed evaluation in a practical spoken dialog system.
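The weighting of specialized classifiers can be sketched as a simple weighted combination of per-task probability outputs; the candidate responses, probabilities, and weights below are illustrative placeholders rather than the paper's actual models.

```python
import numpy as np

def select_next_response(prob_outputs, weights, responses):
    """Combine per-task classifier probabilities with weights and pick the argmax response."""
    combined = np.zeros(len(responses))
    for probs, weight in zip(prob_outputs, weights):
        combined += weight * np.asarray(probs)
    return responses[int(np.argmax(combined))]

responses = ["ask_destination", "confirm_date", "provide_price"]
probs_task_a = [0.2, 0.5, 0.3]   # output of the classifier specialized for task A
probs_task_b = [0.1, 0.3, 0.6]   # output of the classifier specialized for task B
print(select_next_response([probs_task_a, probs_task_b], weights=[0.7, 0.3], responses=responses))
# -> "confirm_date", the response with the highest weighted combined probability
```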


Author(s):  
Jindong Chen ◽  
Yizhou Hu ◽  
Jingping Liu ◽  
Yanghua Xiao ◽  
Haiyun Jiang

Short text classification is one of the important tasks in Natural Language Processing (NLP). Unlike paragraphs or documents, short texts are more ambiguous since they lack sufficient contextual information, which poses a great challenge for classification. In this paper, we retrieve knowledge from an external knowledge source to enhance the semantic representation of short texts. We take conceptual information as a kind of knowledge and incorporate it into deep neural networks. To measure the importance of this knowledge, we introduce attention mechanisms and propose deep Short Text Classification with Knowledge-powered Attention (STCKA). We utilize Concept towards Short Text (CST) attention and Concept towards Concept Set (C-CS) attention to acquire the weight of concepts from two aspects, and we classify a short text with the help of conceptual information. Unlike traditional approaches, our model acts like a human being with an intrinsic ability to make decisions based on observation (i.e., training data for machines) and pays more attention to important knowledge. We also conduct extensive experiments on four public datasets for different tasks. The experimental results and case studies show that our model outperforms the state-of-the-art methods, justifying the effectiveness of knowledge-powered attention.
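A rough numpy sketch of how the two attention signals could weight concept vectors is given below; it uses plain dot-product similarities instead of the paper's learned attention parameters, and all vectors and the mixing coefficient are randomly generated or arbitrary placeholders.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def knowledge_powered_attention(text_vec, concept_vecs, mix=0.5):
    # CST: relevance of each concept to the short-text representation (dot-product similarity).
    cst = softmax(concept_vecs @ text_vec)
    # C-CS: salience of each concept within the whole retrieved concept set (similarity to the set mean).
    ccs = softmax(concept_vecs @ concept_vecs.mean(axis=0))
    alpha = mix * cst + (1 - mix) * ccs          # combined concept weights
    return alpha @ concept_vecs                  # knowledge-enhanced representation of the short text

rng = np.random.default_rng(0)
text_vec = rng.normal(size=8)                    # placeholder encoding of the short text
concept_vecs = rng.normal(size=(3, 8))           # placeholder embeddings of 3 retrieved concepts
print(knowledge_powered_attention(text_vec, concept_vecs).shape)  # (8,)
```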


2021 ◽  
pp. 1-17
Author(s):  
J. Shobana ◽  
M. Murali

Text sentiment analysis is the process of predicting whether a segment of text has opinionated or objective content and analyzing the polarity of the text’s sentiment. Understanding the needs and behavior of the target customer plays a vital role in the success of a business, so sentiment analysis helps the marketer improve product quality and helps the shopper buy the right product. Due to its automatic learning capability, deep learning is a current research focus in natural language processing. The skip-gram architecture is used in the proposed model to better capture the semantic relationships and contextual information of words. The main contribution of this work, however, is an Adaptive Particle Swarm Optimization (APSO)-based LSTM for sentiment analysis. The LSTM is used in the proposed model to capture complex patterns in textual data, and its weight parameters are tuned with the adaptive PSO algorithm. Combining the opposition-based learning (OBL) method with PSO yields the Adaptive Particle Swarm Optimization (APSO) algorithm, which assists the LSTM in selecting optimal weights in fewer iterations. APSO-LSTM’s ability to adjust attributes such as weights and learning rates, combined with good hyperparameter choices, leads to improved accuracy and reduced loss. Extensive experiments on four datasets show that the proposed APSO-LSTM model achieves higher accuracy than classical methods such as traditional LSTM, ANN, and SVM; according to the simulation results, the proposed model outperforms other existing models.
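As an illustration of the opposition-based learning step that distinguishes APSO from plain PSO, the sketch below evaluates each candidate weight vector together with its "opposite" within the search bounds and keeps the fitter half; the quadratic fitness function is a placeholder for the LSTM's validation loss, and the bounds are arbitrary.

```python
import numpy as np

def fitness(position):
    return np.sum(position ** 2)                 # placeholder: lower is better (e.g. validation loss)

def obl_initialize(num_particles, dim, lower, upper, seed=0):
    """Opposition-based initialization: keep the fitter of each particle and its opposite."""
    rng = np.random.default_rng(seed)
    swarm = rng.uniform(lower, upper, size=(num_particles, dim))
    opposite = lower + upper - swarm             # opposition-based counterparts within the bounds
    candidates = np.vstack([swarm, opposite])
    scores = np.array([fitness(p) for p in candidates])
    best_idx = np.argsort(scores)[:num_particles]
    return candidates[best_idx]

swarm = obl_initialize(num_particles=10, dim=5, lower=-1.0, upper=1.0)
print(swarm.shape)   # (10, 5) particles, ready for the usual PSO velocity/position updates
```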


Author(s):  
Jacqueline Peng ◽  
Mengge Zhao ◽  
James Havrilla ◽  
Cong Liu ◽  
Chunhua Weng ◽  
...  

Background: Natural language processing (NLP) tools can facilitate the extraction of biomedical concepts from unstructured free texts, such as research articles or clinical notes. The NLP software tools CLAMP, cTAKES, and MetaMap are among the most widely used tools for extracting biomedical concept entities. However, their performance in extracting disease-specific terminology from the literature has not been compared extensively, especially for complex neuropsychiatric disorders with a diverse set of phenotypic and clinical manifestations.
Methods: We comparatively evaluated these NLP tools using autism spectrum disorder (ASD) as a case study. We collected 827 ASD-related terms based on previous literature as the benchmark list for performance evaluation. Then, we applied CLAMP, cTAKES, and MetaMap to 544 full-text articles and 20,408 abstracts from PubMed to extract ASD-related terms. We evaluated the predictive performance using precision, recall, and F1 score.
Results: We found that CLAMP has the best performance in terms of F1 score, followed by cTAKES and then MetaMap. Our results show that CLAMP has much higher precision than cTAKES and MetaMap, while cTAKES and MetaMap have higher recall than CLAMP.
Conclusion: The analysis protocols used in this study can be applied to other neuropsychiatric or neurodevelopmental disorders that lack well-defined terminology sets to describe their phenotypic presentations.
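The evaluation against the benchmark term list can be sketched as a set comparison; the two small term lists below are invented examples, not the 827-term ASD benchmark.

```python
def precision_recall_f1(extracted, benchmark):
    """Compute precision, recall, and F1 of extracted terms against a benchmark term list."""
    extracted = {t.lower() for t in extracted}
    benchmark = {t.lower() for t in benchmark}
    true_positives = len(extracted & benchmark)
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(benchmark) if benchmark else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return precision, recall, f1

benchmark_terms = ["echolalia", "stereotypy", "hyperlexia", "sensory sensitivity"]
tool_output = ["Echolalia", "stereotypy", "anxiety"]          # terms extracted by one NLP tool
print(precision_recall_f1(tool_output, benchmark_terms))      # (0.667, 0.5, 0.571) approximately
```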


2021 ◽  
Author(s):  
Marciane Mueller ◽  
Rejane Frozza ◽  
Liane Mählmann Kipper ◽  
Ana Carolina Kessler

BACKGROUND: This article presents the modeling and development of a knowledge-based system supported by a virtual conversational agent called Dóris. Using natural language processing resources, Dóris collects clinical data from patients receiving urgent and emergency hospital care.
OBJECTIVE: The main objective is to validate the use of virtual conversational agents to properly and accurately collect the data necessary to apply the evaluation flowcharts used to classify the degree of urgency of patients and determine the priority of medical care.
METHODS: The agent's knowledge base was modeled using the rules provided in the evaluation flowcharts of the Manchester Triage System. It also supports simple, objective, and complete communication through dialogues that assess signs and symptoms according to the criteria established by a standardized, validated, and internationally recognized system.
RESULTS: Thus, in addition to verifying the applicability of Artificial Intelligence techniques in a complex healthcare domain, the work presents a tool that helps not only to improve organizational processes but also to improve human relationships, bringing professionals and patients closer together. The system's knowledge base was modeled on the IBM Watson platform.
CONCLUSIONS: The results of simulations carried out by the human specialist allowed us to verify that a knowledge-based system supported by a virtual conversational agent is feasible for the domain of risk classification and priority determination of medical care for patients in the context of urgent and emergency hospital care.
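A minimal sketch of the kind of rule-based triage step such a knowledge base encodes is shown below; the discriminators and their mapping to Manchester Triage priorities are simplified placeholders, not the actual flowchart contents.

```python
# Simplified, illustrative rules: each set of discriminators maps to a Manchester priority.
RULES = [
    ({"airway_compromise", "shock"}, "Immediate"),
    ({"severe_pain", "acute_dyspnoea"}, "Very Urgent"),
    ({"moderate_pain", "persistent_vomiting"}, "Urgent"),
    ({"recent_mild_pain"}, "Standard"),
]

def classify(reported_signs):
    """Return the highest priority whose discriminators intersect the signs reported in the dialogue."""
    for discriminators, priority in RULES:
        if discriminators & reported_signs:
            return priority
    return "Non-urgent"

print(classify({"acute_dyspnoea", "anxiety"}))   # -> "Very Urgent"
```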


2021 ◽  
Author(s):  
Eva van der Kooij ◽  
Marc Schleiss ◽  
Riccardo Taormina ◽  
Francesco Fioranelli ◽  
Dorien Lugt ◽  
...  

Accurate short-term forecasts, also known as nowcasts, of heavy precipitation are desirable for building early warning systems for extreme weather and its consequences, e.g. urban flooding. In this research, we explore the use of machine learning for the short-term prediction of heavy rainfall showers in the Netherlands.
We assess the performance of a recurrent, convolutional neural network (TrajGRU) with lead times of 0 to 2 hours. The network is trained on a 13-year archive of radar images with 5-min temporal and 1-km spatial resolution from the precipitation radars of the Royal Netherlands Meteorological Institute (KNMI). We aim to train the model to predict the formation and dissipation of dynamic, heavy, localized rain events, a task for which traditional Lagrangian nowcasting methods still come up short.
We report on different ways to optimize predictive performance for heavy rainfall intensities through several experiments. The large dataset available provides many possible configurations for training. To focus on heavy rainfall intensities, we use different subsets of this dataset by varying the conditions for event selection and the ratio of light to heavy precipitation events in the training set, and we change the loss function used to train the model.
To assess the performance of the model, we compare our method to a current state-of-the-art Lagrangian nowcasting system from the pySTEPS library, S-PROG, a deterministic approximation of an ensemble mean forecast. The results of the experiments are used to discuss the pros and cons of machine-learning-based methods for precipitation nowcasting and possible ways to further increase performance.
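One simple form such a heavy-rainfall-focused loss could take is an intensity-weighted mean squared error, sketched below; the thresholds and weights are illustrative placeholders, not the values used in this study.

```python
import numpy as np

def weighted_mse(prediction, observation, thresholds=(0.5, 2.0, 10.0), weights=(1.0, 2.0, 5.0, 10.0)):
    """Mean squared error where each pixel is weighted by its observed rain-rate class (mm/h)."""
    pixel_weights = np.ones_like(observation) * weights[0]
    for threshold, weight in zip(thresholds, weights[1:]):
        pixel_weights = np.where(observation >= threshold, weight, pixel_weights)
    return float(np.mean(pixel_weights * (prediction - observation) ** 2))

obs = np.array([[0.1, 1.5], [4.0, 12.0]])    # observed rain rates in mm/h
pred = np.array([[0.0, 1.0], [2.0, 8.0]])    # model forecast for the same pixels
print(weighted_mse(pred, obs))               # errors on heavy-rain pixels dominate the loss
```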


2021 ◽  
pp. 2-11
Author(s):  
David Aufreiter ◽  
Doris Ehrlinger ◽  
Christian Stadlmann ◽  
Margarethe Uberwimmer ◽  
Anna Biedersberger ◽  
...  

On the servitization journey, manufacturing companies complement their offerings with new industrial and knowledge-based services, which brings challenges of uncertainty and risk. In addition to the required adjustment of internal factors, selling services internationally is a major challenge. This paper presents the initial results of an international research project aimed at assisting advanced manufacturers in making decisions about exporting their service offerings to foreign markets. Within the frame of this project, a tool is developed to support managers in their service export decisions through the automated generation of market information based on Natural Language Processing and Machine Learning. The paper presents a roadmap for progressing towards an Artificial Intelligence-based market information solution. It describes the research process steps of analyzing problem statements of relevant industry partners, selecting target countries and markets, defining parameters for the scope of the tool, classifying different service offerings and their components into categories, and developing an annotation scheme for generating reliable and focused training data for the Artificial Intelligence solution. The paper demonstrates good practices in the essential steps and highlights common pitfalls that researchers and managers working on future research projects supported by Artificial Intelligence should avoid. In the end, the paper aims to support and motivate researchers and managers to discover AI application and research opportunities within the servitization field.

