Analyzing COVID-19 Medical Papers Using Artificial Intelligence: Insights for Researchers and Medical Professionals

2022 ◽  
Vol 6 (1) ◽  
pp. 4
Author(s):  
Dmitry Soshnikov ◽  
Tatiana Petrova ◽  
Vickie Soshnikova ◽  
Andrey Grunin

Since the beginning of the COVID-19 pandemic almost two years ago, more than 700,000 scientific papers have been published on the subject. An individual researcher cannot possibly become acquainted with such a large text corpus, so help from artificial intelligence (AI) is needed. We propose an AI-based tool that helps researchers navigate collections of medical papers in a meaningful way and extract knowledge from scientific COVID-19 papers. The main idea of our approach is to extract as much semi-structured information from the text corpus as possible, using named entity recognition (NER) with the PubMedBERT model and the Text Analytics for Health service, and then to store the data in a NoSQL database for fast further processing and insight generation. Additionally, the context in which each entity is used (neutral or negative) is determined. Applying NLP and text-based emotion detection (TBED) methods to the COVID-19 text corpus allows us to gain insights into important issues of diagnosis and treatment, such as changes in medical treatment over time, joint treatment strategies using several medications, and the connection between signs and symptoms of coronavirus.
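As a rough illustration of how semi-structured entity records could support a question such as "which medications are discussed together", the sketch below counts medication pairs that co-occur in the same paper. It is a minimal sketch under stated assumptions: the record layout, the field names (`entities`, `type`, `context`), the placeholder drug names, and the function `medication_pairs` are all hypothetical and are not taken from the paper, and plain Python dictionaries stand in for documents in a NoSQL store.

```python
from collections import Counter
from itertools import combinations

# Hypothetical, hand-written records standing in for NER output stored as
# documents; none of this is data from the paper.
papers = [
    {"id": "p1", "entities": [
        {"text": "drug_a", "type": "medication", "context": "neutral"},
        {"text": "drug_b", "type": "medication", "context": "neutral"},
    ]},
    {"id": "p2", "entities": [
        {"text": "drug_a", "type": "medication", "context": "neutral"},
        {"text": "drug_b", "type": "medication", "context": "neutral"},
        {"text": "drug_c", "type": "medication", "context": "negative"},
    ]},
    {"id": "p3", "entities": [
        {"text": "drug_b", "type": "medication", "context": "neutral"},
        {"text": "drug_c", "type": "medication", "context": "neutral"},
    ]},
]


def medication_pairs(papers):
    """Count how many papers mention each pair of medications together.

    Mentions whose context is "negative" are skipped, so that a drug that is
    only mentioned in a negative context does not count as part of a pair.
    """
    counts = Counter()
    for paper in papers:
        meds = sorted({
            e["text"]
            for e in paper["entities"]
            if e["type"] == "medication" and e["context"] != "negative"
        })
        # Each unordered pair is counted at most once per paper.
        counts.update(combinations(meds, 2))
    return counts


pair_counts = medication_pairs(papers)
for pair, n in pair_counts.most_common():
    print(pair, n)
```

Sorting the medication names before pairing keeps each pair in a single canonical order, so `("drug_a", "drug_b")` and `("drug_b", "drug_a")` are not counted separately.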

2021 ◽  
Vol 11 (4) ◽  
pp. 267-273
Author(s):  
Wen-Juan Hou ◽  
Bamfa Ceesay

Information extraction (IE) is the process of automatically identifying structured information in unstructured or partially structured text. IE processes can involve several activities, such as named entity recognition, event extraction, relationship discovery, and document classification, with the overall goal of translating text into a more structured form. Information on the changes in the effect of a drug when it is taken in combination with a second drug is known as drug–drug interaction (DDI). DDIs can delay, decrease, or enhance the absorption of drugs and thus decrease or increase their efficacy or cause adverse effects. Recent research has produced several adaptations of recurrent neural networks (RNNs) for extracting information from text. In this study, we highlight significant challenges of using RNNs in biomedical text processing and propose an approach to automatic DDI extraction that aims to overcome some of these challenges. Our results show that the system is competitive with other systems on the task of extracting DDIs.


Information ◽  
2019 ◽  
Vol 10 (5) ◽  
pp. 178
Author(s):  
Denis Maurel ◽  
Enza Morale ◽  
Nicolas Thouvenin ◽  
Patrice Ringot ◽  
Angel Turri

Istex is a database of twenty million full-text scientific papers bought by the French Government for the use of academic libraries. Papers are usually searched for by title, authors, keywords, or possibly the abstract. To enable new types of queries in Istex, we implemented a named entity recognition system that runs on all papers, and we offer users the possibility to run searches on these entities. After presenting the French Istex project, we detail in this paper named entity recognition with CasEN, a cascade of graphs implemented in the Unitex software. CasEN exists in French, but not in English. The first challenge was to build a new cascade in a short time. Its evaluation showed good Precision, even though Recall was not very good. Precision was very important for this project, to ensure that queries did not return unwanted papers. The second challenge was deploying Unitex to parse around twenty million documents. We used a dockerized application. Finally, we also explain how to query the resulting named entities on the Istex website.
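The trade-off between Precision and Recall mentioned above can be made concrete with a small calculation. The following is a minimal sketch, assuming entities are compared as exact `(document, text, type)` tuples; the function name `precision_recall` and the toy annotations are hypothetical and do not reproduce the evaluation from the paper.

```python
def precision_recall(predicted, gold):
    """Return (precision, recall) for two collections of entity tuples.

    precision = correct predictions / all predictions
    recall    = correct predictions / all gold entities
    """
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Toy example (invented): the system finds only two of four gold entities,
# but both of its predictions are correct.
gold = [
    ("doc1", "Marie Curie", "PERSON"),
    ("doc1", "Paris", "LOCATION"),
    ("doc2", "CNRS", "ORGANIZATION"),
    ("doc2", "Nancy", "LOCATION"),
]
predicted = [
    ("doc1", "Marie Curie", "PERSON"),
    ("doc2", "CNRS", "ORGANIZATION"),
]

precision, recall = precision_recall(predicted, gold)
print(precision, recall)
```

In this toy case every returned entity is correct while half of the gold entities are missed, which is the kind of profile that suits a search feature where unwanted results are costlier than missing ones.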


2021 ◽  
Author(s):  
SHANHAO ZHONG ◽  
QINGSONG YU

Abstract. Medical named entity recognition is the first step in processing electronic medical records. It is the basis for converting medical natural language text into structured medical information, and it has high research and application value. In this paper, we propose a model that identifies various types of named entities, such as disease, imaging examination, laboratory examination, operation, drug, and anatomy, in Chinese electronic medical records. We construct a BERT-based model that fuses glyph and lexicon information. Experimental studies have shown that enriching the character-level semantic representation can improve the performance of named entity recognition. To this end, our model takes two main measures: (1) a CNN structure is proposed to capture glyph information; (2) the Soft-Lexicon method is introduced to encode lexicon information. Our model improves on the baseline BERT-BiLSTM-CRF model. The experimental results on the CCKS2019 dataset show an F1 score of 84.64, which is 1.99 points higher than the baseline.
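For readers unfamiliar with the metric, the sketch below shows the standard definition of F1 as the harmonic mean of precision and recall, and backs out the baseline score implied by the two figures quoted in the abstract. The function name `f1_score` is an illustrative choice; the baseline value is derived here by subtraction and is not stated explicitly in the abstract.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Figures quoted in the abstract, in percentage points.
reported_f1 = 84.64
improvement = 1.99

# Implied baseline F1: reported score minus the stated improvement.
baseline_f1 = round(reported_f1 - improvement, 2)
print(baseline_f1)
```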


Author(s):  
Andrianingsih Andrianingsih ◽  
Tri Wahyu Widyaningsih ◽  
Meta Amalya Dewi

Researchers usually search for literature through the homepage of a publication source, based on expertise, research collaboration, and research interests. Today, the COVID-19 pandemic is a trending topic for researchers from various scientific fields. This study classified publications found in sources such as Scopus, Crossref, IEEE Xplore, and Google Scholar by analyzing five topics, namely Artificial Intelligence, Data Mining, Deep Learning, Machine Learning, and the Internet of Things, using named entity recognition to detect and classify named entities in text together with occurrence and link strength methods. The results show that Scopus has the most even distribution, with good occurrence and link strength across the five scientific fields: Artificial Intelligence 33.33%, Machine Learning 15.38%, Deep Learning 23.08%, Data Mining 12.82%, and IoT 15.38%. The second-best source is Google Scholar, followed by IEEE Xplore and Crossref.


2020 ◽  
Vol 34 (05) ◽  
pp. 9225-9232
Author(s):  
Wenya Wang ◽  
Sinno Jialin Pan

Information extraction (IE) aims to produce structured information from an input text, through tasks such as named entity recognition and relation extraction. Various approaches to IE have been proposed, based on feature engineering or deep learning. However, most of them fail to capture the complex relationships inherent in the task itself, which have proven to be especially crucial. For example, the relation between two entities is highly dependent on their entity types. These dependencies can be regarded as complex constraints that can be efficiently expressed as logical rules. To combine such logical reasoning capabilities with the learning capabilities of deep neural networks, we propose to integrate logical knowledge in the form of first-order logic into a deep learning system, which can be trained jointly in an end-to-end manner. The integrated framework is able to enhance neural outputs with knowledge regularization via logic rules and, at the same time, update the weights of the logic rules to comply with the characteristics of the training data. We demonstrate the effectiveness and generalization of the proposed model on multiple IE tasks.
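The idea of using a logic rule as a soft constraint on neural outputs can be illustrated with a deliberately simplified penalty term. The sketch below is not the authors' framework: the example rule, the product-based soft conjunction, and the names `rule_violation` and `regularized_loss` are all assumptions chosen for illustration, and the rule weight is a fixed number here rather than a learned parameter.

```python
# Illustrative rule (an assumption, not taken from the paper):
#   works_for(x, y)  =>  person(x) AND organization(y)
# The inputs are probabilities that a model might assign to each predicate.


def rule_violation(p_relation, p_head_type, p_tail_type):
    """Degree to which predictions violate 'relation => head type AND tail type'.

    The conjunction is softened to a product of probabilities. The rule is
    violated to the extent that the relation is believed more strongly than
    the type assignments it requires.
    """
    consequent = p_head_type * p_tail_type
    return max(0.0, p_relation - consequent)


def regularized_loss(task_loss, violations, rule_weight=1.0):
    """Add a weighted sum of rule violations to the ordinary task loss."""
    return task_loss + rule_weight * sum(violations)


# A confident relation with an unlikely head type is penalized heavily;
# a weakly believed relation is not penalized at all.
print(rule_violation(0.9, 0.1, 0.9))
print(rule_violation(0.2, 0.9, 0.9))
```

In a trainable system the penalty would be computed on differentiable tensors so that gradients flow back into the network; plain floats are used here only to keep the arithmetic visible.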


Data ◽  
2021 ◽  
Vol 6 (7) ◽  
pp. 78
Author(s):  
Dipali Baviskar ◽  
Swati Ahirrao ◽  
Ketan Kotecha

The day-to-day operation of an organization produces a massive volume of unstructured data in the form of invoices, legal contracts, mortgage processing forms, and many other documents. Organizations can use the insights concealed in such unstructured documents for their operational benefit. However, analyzing and extracting insights from such numerous and complex unstructured documents is a tedious task. Hence, research in this area is encouraging the development of novel frameworks and tools that can automate key information extraction from unstructured documents. However, the lack of standard, high-quality, annotated datasets of unstructured documents is a serious obstacle to this goal. This work expedites the researcher’s task by providing a high-quality, highly diverse, multi-layout, and annotated invoice document dataset for extracting key information from unstructured documents. Researchers can use the proposed dataset for layout-independent processing of unstructured invoice documents and to develop an artificial intelligence (AI)-based tool to identify and extract named entities in invoice documents. Our dataset includes 630 invoice document PDFs with four different layouts collected from diverse suppliers. As far as we know, our invoice dataset is the only openly available dataset comprising high-quality, highly diverse, multi-layout, and annotated invoice documents.

