Mining Profiles and Definitions with Natural Language Processing

Author(s):  
Horacio Saggion

Free text is a main repository of human knowledge, therefore methods and techniques to access this unstructured source of knowledge are of paramount importance. In this chapter we describe natural language processing technology for the development of question answering and text summarization systems. We focus on applications aiming at mining textual resources to extract knowledge for the automatic creation of definitions and person profiles.

2021 ◽  
Author(s):  
García-Robledo Gabriela A ◽  
Reyes-Ortiz José A ◽  
González-Beltrán Beatriz A ◽  
Bravo Maricela

The development of question answering (QA) systems involves methods and techniques from the areas of Information Extraction (EI), Natural Language Processing (NLP), and sometimes speech recognition. A user interface that involves all these tasks requires deep development to improve the interaction between a user and a device. This paper describes a Spanish QA system for an academic domain through a multi-platform user interface. The system uses a voice query to be transformed into text. The semi-structured query is converted into SQWRL language to extract a system of ontologies from an academic domain using patterns. The answer of the ontologies is placed in templates classified according to the type of question. Finally, the answer is transformed into a voice. A method for experimentation is presented focusing on the questions asked in voice and their respective answers by experts from the academic domain in a set of 258 questions, obtaining a 92% accuracy.


Author(s):  
Mario Jojoa Acosta ◽  
Gema Castillo-Sánchez ◽  
Begonya Garcia-Zapirain ◽  
Isabel de la Torre Díez ◽  
Manuel Franco-Martín

The use of artificial intelligence in health care has grown quickly. In this sense, we present our work related to the application of Natural Language Processing techniques, as a tool to analyze the sentiment perception of users who answered two questions from the CSQ-8 questionnaires with raw Spanish free-text. Their responses are related to mindfulness, which is a novel technique used to control stress and anxiety caused by different factors in daily life. As such, we proposed an online course where this method was applied in order to improve the quality of life of health care professionals in COVID 19 pandemic times. We also carried out an evaluation of the satisfaction level of the participants involved, with a view to establishing strategies to improve future experiences. To automatically perform this task, we used Natural Language Processing (NLP) models such as swivel embedding, neural networks, and transfer learning, so as to classify the inputs into the following three categories: negative, neutral, and positive. Due to the limited amount of data available—86 registers for the first and 68 for the second—transfer learning techniques were required. The length of the text had no limit from the user’s standpoint, and our approach attained a maximum accuracy of 93.02% and 90.53%, respectively, based on ground truth labeled by three experts. Finally, we proposed a complementary analysis, using computer graphic text representation based on word frequency, to help researchers identify relevant information about the opinions with an objective approach to sentiment. The main conclusion drawn from this work is that the application of NLP techniques in small amounts of data using transfer learning is able to obtain enough accuracy in sentiment analysis and text classification stages.


2021 ◽  
Vol 28 (1) ◽  
pp. e100262
Author(s):  
Mustafa Khanbhai ◽  
Patrick Anyadi ◽  
Joshua Symons ◽  
Kelsey Flott ◽  
Ara Darzi ◽  
...  

ObjectivesUnstructured free-text patient feedback contains rich information, and analysing these data manually would require a lot of personnel resources which are not available in most healthcare organisations.To undertake a systematic review of the literature on the use of natural language processing (NLP) and machine learning (ML) to process and analyse free-text patient experience data.MethodsDatabases were systematically searched to identify articles published between January 2000 and December 2019 examining NLP to analyse free-text patient feedback. Due to the heterogeneous nature of the studies, a narrative synthesis was deemed most appropriate. Data related to the study purpose, corpus, methodology, performance metrics and indicators of quality were recorded.ResultsNineteen articles were included. The majority (80%) of studies applied language analysis techniques on patient feedback from social media sites (unsolicited) followed by structured surveys (solicited). Supervised learning was frequently used (n=9), followed by unsupervised (n=6) and semisupervised (n=3). Comments extracted from social media were analysed using an unsupervised approach, and free-text comments held within structured surveys were analysed using a supervised approach. Reported performance metrics included the precision, recall and F-measure, with support vector machine and Naïve Bayes being the best performing ML classifiers.ConclusionNLP and ML have emerged as an important tool for processing unstructured free text. Both supervised and unsupervised approaches have their role depending on the data source. With the advancement of data analysis tools, these techniques may be useful to healthcare organisations to generate insight from the volumes of unstructured free-text data.


2020 ◽  
Vol 4 (Supplement_1) ◽  
pp. 183-183
Author(s):  
Javad Razjouyan ◽  
Jennifer Freytag ◽  
Edward Odom ◽  
Lilian Dindo ◽  
Aanand Naik

Abstract Patient Priorities Care (PPC) is a model of care that aligns health care recommendations with priorities of older adults with multiple chronic conditions. Social workers (SW), after online training, document PPC in the patient’s electronic health record (EHR). Our goal is to identify free-text notes with PPC language using a natural language processing (NLP) model and to measure PPC adoption and effect on long term services and support (LTSS) use. Free-text notes from the EHR produced by trained SWs passed through a hybrid NLP model that utilized rule-based and statistical machine learning. NLP accuracy was validated against chart review. Patients who received PPC were propensity matched with patients not receiving PPC (control) on age, gender, BMI, Charlson comorbidity index, facility and SW. The change in LTSS utilization 6-month intervals were compared by groups with univariate analysis. Chart review indicated that 491 notes out of 689 had PPC language and the NLP model reached to precision of 0.85, a recall of 0.90, an F1 of 0.87, and an accuracy of 0.91. Within group analysis shows that intervention group used LTSS 1.8 times more in the 6 months after the encounter compared to 6 months prior. Between group analysis shows that intervention group has significant higher number of LTSS utilization (p=0.012). An automated NLP model can be used to reliably measure the adaptation of PPC by SW. PPC seems to encourage use of LTSS that may delay time to long term care placement.


2021 ◽  
pp. 1063293X2098297
Author(s):  
Ivar Örn Arnarsson ◽  
Otto Frost ◽  
Emil Gustavsson ◽  
Mats Jirstrand ◽  
Johan Malmqvist

Product development companies collect data in form of Engineering Change Requests for logged design issues, tests, and product iterations. These documents are rich in unstructured data (e.g. free text). Previous research affirms that product developers find that current IT systems lack capabilities to accurately retrieve relevant documents with unstructured data. In this research, we demonstrate a method using Natural Language Processing and document clustering algorithms to find structurally or contextually related documents from databases containing Engineering Change Request documents. The aim is to radically decrease the time needed to effectively search for related engineering documents, organize search results, and create labeled clusters from these documents by utilizing Natural Language Processing algorithms. A domain knowledge expert at the case company evaluated the results and confirmed that the algorithms we applied managed to find relevant document clusters given the queries tested.


2015 ◽  
Vol 54 (04) ◽  
pp. 338-345 ◽  
Author(s):  
A. Fong ◽  
R. Ratwani

SummaryObjective: Patient safety event data repositories have the potential to dramatically improve safety if analyzed and leveraged appropriately. These safety event reports often consist of both structured data, such as general event type categories, and unstructured data, such as free text descriptions of the event. Analyzing these data, particularly the rich free text narratives, can be challenging, especially with tens of thousands of reports. To overcome the resource intensive manual review process of the free text descriptions, we demonstrate the effectiveness of using an unsupervised natural language processing approach.Methods: An unsupervised natural language processing technique, called topic modeling, was applied to a large repository of patient safety event data to identify topics, or themes, from the free text descriptions of the data. Entropy measures were used to evaluate and compare these topics to the general event type categories that were originally assigned by the event reporter.Results: Measures of entropy demonstrated that some topics generated from the un-supervised modeling approach aligned with the clinical general event type categories that were originally selected by the individual entering the report. Importantly, several new latent topics emerged that were not originally identified. The new topics provide additional insights into the patient safety event data that would not otherwise easily be detected.Conclusion: The topic modeling approach provides a method to identify topics or themes that may not be immediately apparent and has the potential to allow for automatic reclassification of events that are ambiguously classified by the event reporter.


Sign in / Sign up

Export Citation Format

Share Document