Revealing Opinions for COVID-19 Questions through Context Retriever and Opinion Aggregating Question-Answering (Preprint)

BACKGROUND: COVID-19 has posed severe challenges to global public health because it is highly contagious and can be lethal. Numerous studies are ongoing or have recently been published, but research on COVID-19 remains largely inconclusive.

OBJECTIVE: A potential approach to accelerating COVID-19 research is to borrow information from existing research on other viruses in the same coronavirus family. We develop a natural language processing method for answering factoid questions related to COVID-19 using published articles as knowledge sources.

METHODS: Given a question, a BM25-based context retriever first selects the most relevant passages from the articles. Second, for each selected context passage, an answer is obtained using a pre-trained BERT question-answering model. Third, an opinion aggregator, which combines a biterm topic model (BTM) with k-means clustering, aggregates all answers into several opinions.

RESULTS: We apply the proposed pipeline to extract answers, opinions and the most frequent words for six questions from the COVID-19 Open Research Dataset Challenge (CORD-19). By showing the longitudinal distributions of the opinions, we uncover trends in the opinions and popular words in the publications across the following periods: before 1990, 1990-2000, 2000-2010, 2011-2019, and after 2019. The changes in the opinions and popular words agree with several distinct characteristics and challenges of COVID-19, including a higher risk for older people and people with pre-existing medical conditions, high contagion and rapid transmission, and a more urgent need for screening and testing. The opinions and the popular words also provide additional insights into the COVID-19 related questions.

CONCLUSIONS: Compared with other methods for literature retrieval and answer generation, opinion aggregation in our method leads to more interpretable, robust and comprehensive question-specific literature reviews. The results demonstrate the usefulness of the proposed method in answering COVID-19 related questions with main opinions and in capturing the trends of research on COVID-19 and other relevant coronavirus strains in recent years.
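The retrieval step described in METHODS can be illustrated with a minimal sketch of BM25 ranking. This is not the authors' implementation: the function name `bm25_rank`, the whitespace tokenizer, the parameter defaults `k1=1.5` and `b=0.75`, and the three example passages are all assumptions chosen for illustration. The sketch scores each passage against the question and returns the passages ordered from most to least relevant.

```python
import math
from collections import Counter


def bm25_rank(query, passages, k1=1.5, b=0.75):
    """Return (passage_index, score) pairs, best match first.

    Hypothetical helper: lowercased whitespace tokens, standard BM25 weighting.
    """
    docs = [p.lower().split() for p in passages]
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of passages that contain each term.
    doc_freq = Counter(term for d in docs for term in set(d))

    scores = []
    for i, d in enumerate(docs):
        tf = Counter(d)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            # Smoothed inverse document frequency (always positive).
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            # Term-frequency saturation with passage-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append((i, score))
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Made-up passages, for illustration only.
passages = [
    "coronavirus incubation period is about five days",
    "influenza vaccines are updated every year",
    "the incubation period of the virus varies between patients",
]
ranking = bm25_rank("incubation period coronavirus", passages)
```

In a pipeline like the one described, the top-ranked passages would then be passed one by one to the question-answering model; that step is not shown here.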

Natural language inference for Malayalam language using language agnostic sentence representation

PeerJ Computer Science ◽ 10.7717/peerj-cs.508 ◽ 2021 ◽ Vol 7 ◽ pp. e508

Author(s): Sara Renjit ◽ Sumam Idicula

Keyword(s): Natural Language ◽ Language Processing ◽ Question Answering ◽ Binary Classification ◽ Multiclass Classification ◽ The Other ◽ Indian Language ◽ Information Retrieval Systems ◽ Textual Entailment ◽ Malayalam Language

Natural language inference (NLI) is an essential subtask in many natural language processing (NLP) applications. It is a directional relationship from premise to hypothesis: a pair of texts is entailed if the meaning of one text can be inferred from the other. NLI is also known as textual entailment recognition, and it is used to recognize entailed and contradictory sentences in NLP systems such as question answering, summarization and information retrieval. This paper describes the NLI problem for Malayalam, a low-resource Indian language that is the regional language of Kerala and is spoken by more than 30 million people. The paper presents the Malayalam NLI dataset, named MaNLI, and applies it to NLI in Malayalam using different models, namely Doc2Vec (paragraph vector), fastText, BERT (Bidirectional Encoder Representation from Transformers), and LASER (Language Agnostic Sentence Representation). Our work attempts NLI in two ways, as binary classification and as multiclass classification. For both, LASER outperformed the other techniques. For multiclass classification, NLI using LASER-based sentence embeddings outperformed the other techniques by a significant margin of 12% accuracy. For binary classification, the LASER-based NLI system improved accuracy by 9% over the other techniques.

Document Similarity Measure Based on Topic Model

Applied Mechanics and Materials ◽ 10.4028/www.scientific.net/amm.513-517.1280 ◽ 2014 ◽ Vol 513-517 ◽ pp. 1280-1284

Author(s): Ming He ◽ Zhen Zhen Wang ◽ Yong Ping Du

Keyword(s): Language Processing ◽ Latent Dirichlet Allocation ◽ Question Answering ◽ Topic Model ◽ Real Data ◽ Space Representation ◽ Data Set ◽ Document Similarity ◽ Fuzzy Query ◽ Document Categorization

Document similarity computation is an active research topic in information retrieval (IR) and a key issue for automatic document categorization, clustering analysis, fuzzy query and question answering. Topic modeling is an emerging field in natural language processing (NLP), IR and machine learning (ML). In this paper, we apply a method based on the latent Dirichlet allocation (LDA) topic model to compute the similarity between documents. By mapping a document from its term space representation into a topic space, a distribution over topics is derived and used to compute document similarity. An empirical study using a real data set demonstrates the efficiency of our method.
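Comparing documents in topic space can be sketched as below, as an illustration only. The abstract does not say which similarity measure the authors use, so Jensen-Shannon similarity is an assumption here, as are the function names and the three topic distributions, which are made-up values standing in for the output of a fitted LDA model.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_similarity(p, q):
    """1 minus the Jensen-Shannon divergence (base 2), so the result lies in [0, 1]."""
    # Mixture distribution; m_i > 0 wherever p_i > 0 or q_i > 0.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    jsd = 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)
    return 1.0 - jsd


# Hypothetical document-topic distributions over three topics.
doc_a = [0.70, 0.20, 0.10]
doc_b = [0.60, 0.30, 0.10]
doc_c = [0.05, 0.15, 0.80]

sim_ab = js_similarity(doc_a, doc_b)  # similar topic mix
sim_ac = js_similarity(doc_a, doc_c)  # different dominant topic
```

Because each document is reduced to a short vector of topic proportions, two documents can score as similar even when they share few exact terms, which is the motivation for moving from term space to topic space.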

A Survey of Paraphrasing and Textual Entailment Methods

Journal of Artificial Intelligence Research ◽ 10.1613/jair.2985 ◽ 2010 ◽ Vol 38 ◽ pp. 135-187 ◽ Cited By ~ 123

Author(s): I. Androutsopoulos ◽ P. Malakasiotis

Keyword(s): Natural Language Processing ◽ Natural Language ◽ Machine Translation ◽ Language Processing ◽ Question Answering ◽ Extraction Methods ◽ The Other ◽ Text Generation ◽ Wide Range ◽ Textual Entailment

Paraphrasing methods recognize, generate, or extract phrases, sentences, or longer natural language expressions that convey almost the same information. Textual entailment methods, on the other hand, recognize, generate, or extract pairs of natural language expressions, such that a human who reads (and trusts) the first element of a pair would most likely infer that the other element is also true. Paraphrasing can be seen as bidirectional textual entailment and methods from the two areas are often similar. Both kinds of methods are useful, at least in principle, in a wide range of natural language processing applications, including question answering, summarization, text generation, and machine translation. We summarize key ideas from the two areas by considering in turn recognition, generation, and extraction methods, also pointing to prominent articles and resources.

PENGGUNAAN RINFOCAL SEBAGAI APLIKASI PENGINGAT (REMINDER) KEGIATAN AKADEMIK PADA PERGURUAN TINGGI [Using RinfoCal as a Reminder Application for Academic Activities in Higher Education]

CCIT Journal ◽ 10.33050/ccit.v9i1.393 ◽ 2015 ◽ Vol 9 (1) ◽ pp. 13-26

Author(s): Indri Handayani ◽ Qurotul Aini ◽ Yessy Oktavyanti

Keyword(s): Daily Life ◽ Human Life ◽ The Other ◽ Digital Form ◽ Mind Mapping ◽ Literature Reviews ◽ Schedule Time

Technology is developing rapidly and has a large effect on human life, including education and daily life. One result is the digital calendar, commonly found on portable gadgets such as mobile phones and tablets. Rinfo, an email facility that supports the needs of Raharja College, helps Pribadi Raharja coordinate and communicate about tasks and events. Rinfo has several integrated applications, such as RinfoGroup, RinfoSites, RinfoDocs, RinfoDrive, RinfoH and RinfoCal. RinfoCal is a calendar application that can be used as a schedule reminder and can send reminders not only to one person but to several people. RinfoCal can send a pop-up notification or an email notification. This paper discusses what RinfoCal is, how to use it, its purpose and its benefits. Despite these benefits, there are also shortcomings: many Rinfo users do not benefit from RinfoCal because they treat it as an ordinary calendar. This paper also presents six problems of conventional reminders that RinfoCal addresses (for example, reminding only once at a time or reminding only one person), a mind map to simplify the problem analysis and arrive at the best solution, and eight literature reviews conducted to help analyze the research problems.

Reports of the Workshops Held at the 2019 AAAI Conference on Artificial Intelligence

AI Magazine ◽ 10.1609/aimag.v40i3.4981 ◽ 2019 ◽ Vol 40 (3) ◽ pp. 67-78

Author(s): Guy Barash ◽ Mauricio Castillo-Effen ◽ Niyati Chhaya ◽ Peter Clark ◽ Huáscar Espinoza ◽ ...

Keyword(s): Artificial Intelligence ◽ Language Processing ◽ Cyber Security ◽ Question Answering ◽ Intent Recognition ◽ Affective Content ◽ Learning Plan ◽ Dialog System ◽ Affective Content Analysis ◽ Games And Simulations

The workshop program of the Association for the Advancement of Artificial Intelligence’s 33rd Conference on Artificial Intelligence (AAAI-19) was held in Honolulu, Hawaii, on Sunday and Monday, January 27–28, 2019. There were fifteen workshops in the program: Affective Content Analysis: Modeling Affect-in-Action; Agile Robotics for Industrial Automation Competition; Artificial Intelligence for Cyber Security; Artificial Intelligence Safety; Dialog System Technology Challenge; Engineering Dependable and Secure Machine Learning Systems; Games and Simulations for Artificial Intelligence; Health Intelligence; Knowledge Extraction from Games; Network Interpretability for Deep Learning; Plan, Activity, and Intent Recognition; Reasoning and Learning for Human-Machine Dialogues; Reasoning for Complex Question Answering; Recommender Systems Meet Natural Language Processing; Reinforcement Learning in Games; and Reproducible AI. This report contains brief summaries of all the workshops that were held.

Machine Learning in Medical Emergencies: a Systematic Review and Analysis

Journal of Medical Systems ◽ 10.1007/s10916-021-01762-3 ◽ 2021 ◽ Vol 45 (10)

Author(s): Inés Robles Mendo ◽ Gonçalo Marques ◽ Isabel de la Torre Díez ◽ Miguel López-Coronado ◽ Francisco Martín-Rodríguez

Keyword(s): Machine Learning ◽ Systematic Review ◽ Language Processing ◽ Emergency Services ◽ Clinical Decision ◽ Quality Of Healthcare ◽ The Other ◽ Other Hand ◽ Increasing Demand ◽ Commercial Applications

Despite the increasing demand for artificial intelligence research in medicine, the functionalities of its methods in health emergencies remain unclear. Therefore, the authors conducted this systematic review and global overview study, which aims to identify, analyse, and evaluate the research available on different platforms and its implementations in healthcare emergencies. Two methods were applied to identify and select the scientific studies and the applications. On the one hand, the PRISMA methodology was carried out in Google Scholar, IEEE Xplore, PubMed, ScienceDirect, and Scopus. On the other hand, commercial applications found on the best-known commercial platforms (Android and iOS) were reviewed. A total of 20 studies were included in this review. The most common topics among the included studies were clinical decisions (n = 4, 20%) and medical services or emergency services (n = 4, 20%). Only 2 focused on m-health (10%). In addition, 12 apps were chosen for full testing on different devices. These apps dealt with pre-hospital medical care (n = 3, 25%) or clinical decision support (n = 3, 25%). In total, half of these apps use machine learning based on natural language processing. Machine learning is increasingly applicable to healthcare and offers solutions to improve the efficiency and quality of healthcare. With the emergence of mobile health devices and applications that can use data and assess a patient's real-time health, machine learning is a growing trend in the healthcare industry.

Comparative Question Answering System based on Natural Language Processing and Machine Learning

2021 International Conference on Artificial Intelligence and Smart Systems (ICAIS) ◽ 10.1109/icais50930.2021.9396015 ◽ 2021

Author(s): Rohit Arora ◽ Parth Singh ◽ Hemlata Goyal ◽ Sunita Singhal ◽ Smita Vijayvargiya

Keyword(s): Machine Learning ◽ Natural Language Processing ◽ Natural Language ◽ Language Processing ◽ Question Answering ◽ Question Answering System

Perceiving Residents’ Festival Activities Based on Social Media Data: A Case Study in Beijing, China

ISPRS International Journal of Geo-Information ◽ 10.3390/ijgi10070474 ◽ 2021 ◽ Vol 10 (7) ◽ pp. 474

Author(s): Bingqing Wang ◽ Bin Meng ◽ Juan Wang ◽ Siyu Chen ◽ Jian Liu

Keyword(s): Social Media ◽ Language Processing ◽ Topic Model ◽ Central Area ◽ Classification Model ◽ Social Media Data ◽ Ring Road ◽ Different Types ◽ Spatial Differences ◽ Media Data

Social media data contains information expressed in real time, including text and geographical location. As a new data source for crowd behavior research in the era of big data, it can reflect some aspects of the behavior of residents. In this study, a text classification model based on the BERT and Transformers framework was constructed and used to classify and extract more than 210,000 residents’ festival activities from 1.13 million Sina Weibo (Chinese “Twitter”) records collected in Beijing in 2019. On this basis, word frequency statistics, part-of-speech analysis, topic modeling, sentiment analysis and other methods were used to perceive different types of festival activities and quantitatively analyze the spatial differences between different types of festivals. The results show that traditional culture significantly influences residents’ festivals, reflecting residents’ motivation to participate in festivals and how residents participate in festivals and express their emotions. There are apparent spatial differences among residents in participating in festival activities. The main festival activities are distributed in the central area within the Fifth Ring Road in Beijing. In contrast, the expression of feelings during festivals is mainly distributed outside the Fifth Ring Road. The research integrates natural language processing technology, topic model analysis, spatial statistical analysis, and other technologies. It also broadens the application field of social media data, especially text data, providing a new research paradigm for studying residents’ festival activities and adding residents’ perception of festivals. The research results provide a basis for the design and management of the Chinese festival system.

Evolution of Semantic Similarity—A Survey

ACM Computing Surveys ◽ 10.1145/3440755 ◽ 2021 ◽ Vol 54 (2) ◽ pp. 1-37

Author(s): Dhivya Chandrasekaran ◽ Vijay Mago

Keyword(s): Natural Language ◽ Semantic Similarity ◽ Language Processing ◽ Hybrid Methods ◽ Research Work ◽ Similarity Measures ◽ Text Data ◽ Knowledge Based ◽ Open Research ◽ Research Problems

Estimating the semantic similarity between text data is one of the challenging and open research problems in the field of Natural Language Processing (NLP). The versatility of natural language makes it difficult to define rule-based methods for determining semantic similarity measures. To address this issue, various semantic similarity methods have been proposed over the years. This survey article traces the evolution of such methods beginning from traditional NLP techniques such as kernel-based methods to the most recent research work on transformer-based models, categorizing them based on their underlying principles as knowledge-based, corpus-based, deep neural network–based methods, and hybrid methods. Discussing the strengths and weaknesses of each method, this survey provides a comprehensive view of existing systems in place for new researchers to experiment and develop innovative ideas to address the issue of semantic similarity.

Managing Free Text for Secondary Use of Health Data

Yearbook of Medical Informatics ◽ 10.15265/iy-2014-0037 ◽ 2014 ◽ Vol 23 (01) ◽ pp. 167-169 ◽ Cited By ~ 5

Author(s): N. Griffon ◽ J. Charlet ◽ S. J. Darmoni

Keyword(s): Language Processing ◽ Meaningful Use ◽ Health Data ◽ Structured Data ◽ The Other ◽ Free Text ◽ Reference Resolution ◽ Comprehensive Review ◽ Secondary Use ◽ Corpus Creation

Objective: To summarize the best papers in the field of Knowledge Representation and Management (KRM). Methods: A comprehensive review of the medical informatics literature was performed to select some of the most interesting papers on KRM and natural language processing (NLP) published in 2013. Results: Four articles were selected. One focuses on Electronic Health Record (EHR) interoperability for clinical pathway personalization based on structured data. The other three focus on NLP (corpus creation, de-identification, and co-reference resolution) and highlight the increase in the performance of NLP tools. Conclusion: NLP tools are close to seriously competing with humans in some annotation tasks. Their use could drastically increase the amount of data usable for meaningful use of EHR.
