prediction task
Recently Published Documents


TOTAL DOCUMENTS

118
(FIVE YEARS 43)

H-INDEX

11
(FIVE YEARS 3)

2022 ◽  
Vol 70 (2) ◽  
pp. 3969-3984
Author(s):  
Nataliya Shakhovska ◽  
Nataliia Melnykova ◽  
Valentyna Chopiyak ◽  
Michal Gregus ml

2021 ◽  
Vol 2021 ◽  
pp. 1-9
Author(s):  
Khrystyna Shakhovska ◽  
Iryna Dumyn ◽  
Natalia Kryvinska ◽  
Mohan Krishna Kagita

Text generation, and next-word prediction in particular, is convenient for users because it helps them type faster and without errors. A personalized text prediction system is therefore an important research topic for all languages, and especially for Ukrainian, because tool support for the Ukrainian language is limited. LSTM, Markov chains, and their hybrid were chosen for next-word prediction. Their sequential nature (the current output depends on the previous ones) helps them cope with the next-word prediction task. The Markov chains produced adequate results the fastest. The hybrid model also produces adequate results, but it runs slowly. Using the model, a user can generate not only one word but also several words, a sentence, or several sentences, unlike T9.
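
To make the Markov-chain idea concrete, the following is a minimal sketch of a first-order (bigram) next-word predictor. It is not the authors' implementation: the function names, the toy English corpus, and the choice of a first-order chain are all assumptions made for illustration, and the LSTM and hybrid models are not shown.

```python
import random
from collections import Counter, defaultdict


def train_bigram_model(tokens):
    """Count, for every word, which words follow it in the training text."""
    transitions = defaultdict(Counter)
    for current_word, next_word in zip(tokens, tokens[1:]):
        transitions[current_word][next_word] += 1
    return transitions


def predict_next(transitions, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    followers = transitions.get(word)
    if not followers:
        return None
    return followers.most_common(1)[0][0]


def generate(transitions, start_word, length, seed=0):
    """Sample up to `length` further words, one Markov step at a time."""
    rng = random.Random(seed)
    words = [start_word]
    for _ in range(length):
        followers = transitions.get(words[-1])
        if not followers:
            break  # dead end: the last word has no known follower
        candidates = list(followers.keys())
        weights = list(followers.values())
        words.append(rng.choices(candidates, weights=weights, k=1)[0])
    return words


# Toy corpus (hypothetical); a real system would train on user text.
corpus = "the cat sat on the mat and the cat slept".split()
model = train_bigram_model(corpus)
suggestion = predict_next(model, "the")
continuation = generate(model, "the", 5)
```

Calling `predict_next` gives a single suggestion, while `generate` chains several steps, which mirrors the point that the model can produce more than one word at a time.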


2021 ◽  
Author(s):  
Joshua Meier ◽  
Roshan Rao ◽  
Robert Verkuil ◽  
Jason Liu ◽  
Tom Sercu ◽  
...  

Modeling the effect of sequence variation on function is a fundamental problem for understanding and designing proteins. Since evolution encodes information about function into patterns in protein sequences, unsupervised models of variant effects can be learned from sequence data. The approach to date has been to fit a model to a family of related sequences. The conventional setting is limited, since a new model must be trained for each prediction task. We show that using only zero-shot inference, without any supervision from experimental data or additional training, protein language models capture the functional effects of sequence variation, performing at state-of-the-art.


Author(s):  
Masaki Asada ◽  
Nallappan Gunasekaran ◽  
Makoto Miwa ◽  
Yutaka Sasaki

We deal with a heterogeneous pharmaceutical knowledge-graph containing textual information built from several databases. The knowledge graph is a heterogeneous graph that includes a wide variety of concepts and attributes, some of which are provided in the form of textual pieces of information which have not been targeted in the conventional graph completion tasks. To investigate the utility of textual information for knowledge graph completion, we generate embeddings from textual descriptions given to heterogeneous items, such as drugs and proteins, while learning knowledge graph embeddings. We evaluate the obtained graph embeddings on the link prediction task for knowledge graph completion, which can be used for drug discovery and repurposing. We also compare the results with existing methods and discuss the utility of the textual information.


2021 ◽  
Vol 21 (S2) ◽  
Author(s):  
Weidong Bao ◽  
Hongfei Lin ◽  
Yijia Zhang ◽  
Jian Wang ◽  
Shaowu Zhang

Background: Clinical notes record the health status, clinical manifestations and other detailed information of each patient. The International Classification of Diseases (ICD) codes are important labels for electronic health records. Automatically assigning medical codes to clinical notes with a deep learning model can not only improve work efficiency and accelerate the development of medical informatization but also help resolve many issues related to medical insurance. Recently, neural network-based methods have been proposed for automatic medical code assignment. However, clinical notes are usually long documents that contain many complex sentences, and most current methods are not effective at learning representations of latent features from the document text. Methods: In this paper, we propose a hybrid capsule network model. Specifically, we use a bi-directional LSTM (Bi-LSTM) with forward and backward directions to merge the information from both sides of the sequence. The label embedding framework embeds the text and labels together to leverage the label information. We then use a dynamic routing algorithm in the capsule network to extract valuable features for the medical code prediction task. Results: We applied our model to the task of automatic medical code assignment to clinical notes and conducted a series of experiments based on MIMIC-III data. The experimental results show that our method achieves a micro F1-score of 67.5% on the MIMIC-III dataset, which outperforms the other state-of-the-art methods. Conclusions: The proposed model, which employs the dynamic routing algorithm and the label embedding framework, can effectively capture the important features across sentences. Both capsule networks and domain knowledge are helpful for the medical code prediction task.
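
Since the reported result is a micro F1-score, a short sketch of how micro-averaged F1 is typically computed for multi-label code assignment may help. The function name and the label strings below are hypothetical and are not taken from the paper or from MIMIC-III; the sketch pools true positives, false positives and false negatives over all notes before computing precision and recall.

```python
def micro_f1(true_label_sets, predicted_label_sets):
    """Micro-averaged F1: pool the counts over all documents, then score once."""
    tp = fp = fn = 0
    for true_labels, predicted in zip(true_label_sets, predicted_label_sets):
        tp += len(true_labels & predicted)   # codes correctly assigned
        fp += len(predicted - true_labels)   # codes assigned but not in the gold set
        fn += len(true_labels - predicted)   # gold codes that were missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Two hypothetical notes with made-up code labels.
gold = [{"code_a", "code_b"}, {"code_c"}]
predicted = [{"code_a"}, {"code_c", "code_d"}]
score = micro_f1(gold, predicted)
```

Because the counts are pooled, frequent codes dominate the micro average, which is one reason it is often reported alongside a macro average on datasets with a long tail of rare labels.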


PLoS ONE ◽  
2021 ◽  
Vol 16 (6) ◽  
pp. e0253822
Author(s):  
Mingshan Jia ◽  
Bogdan Gabrys ◽  
Katarzyna Musial

The triangle structure, being a fundamental and significant element, underlies many theories and techniques in studying complex networks. The formation of triangles is typically measured by the clustering coefficient, in which the focal node is the centre-node in an open triad. In contrast, the recently proposed closure coefficient measures triangle formation from an end-node perspective and has been proven to be a useful feature in network analysis. Here, we extend it by proposing the directed closure coefficient that measures the formation of directed triangles. By distinguishing the direction of the closing edge in building triangles, we further introduce the source closure coefficient and the target closure coefficient. Then, by categorising particular types of directed triangles (e.g., head-of-path), we propose four closure patterns. Through multiple experiments on 24 directed networks from six domains, we demonstrate that at the network level, the four closure patterns are distinctive features in classifying network types, while at the node level, adding the source and target closure coefficients leads to significant improvement in the link prediction task in most types of directed networks.
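
The contrast between the centre-node and end-node perspectives can be sketched on a small undirected graph. This is an illustration only: it assumes the commonly used undirected definitions (clustering coefficient as triangles over triads centred at the node, closure coefficient as twice the triangles over length-2 paths that start at the node), and it does not reproduce the directed, source, or target variants proposed in the paper. All function names are hypothetical.

```python
def build_adjacency(edges):
    """Build an undirected adjacency map {node: set of neighbours}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def triangles_at(adj, node):
    """Number of triangles that contain `node`."""
    nbrs = adj[node]
    # Each triangle {node, v, w} is counted once from v and once from w.
    return sum(len(adj[v] & nbrs) for v in nbrs) // 2


def clustering_coefficient(adj, node):
    """Centre-node view: fraction of neighbour pairs that are connected."""
    degree = len(adj[node])
    centred_triads = degree * (degree - 1) // 2
    return triangles_at(adj, node) / centred_triads if centred_triads else 0.0


def closure_coefficient(adj, node):
    """End-node view: fraction of length-2 paths starting at `node` that close."""
    # Paths node - v - w with w != node: each neighbour v offers deg(v) - 1 of them.
    headed_paths = sum(len(adj[v]) - 1 for v in adj[node])
    # Each triangle closes two such paths (one via each of the other two nodes).
    return 2 * triangles_at(adj, node) / headed_paths if headed_paths else 0.0


# A triangle a-b-c with one pendant node d attached to c.
graph = build_adjacency([("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")])
```

On this toy graph the two measures disagree: node `c` has a low clustering coefficient because the pendant neighbour adds open triads centred at `c`, yet every length-2 path starting at `c` is closed.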


2021 ◽  
Author(s):  
Christian Requena-Mesa ◽  
Vitus Benson ◽  
Markus Reichstein ◽  
Jakob Runge ◽  
Joachim Denzler

Author(s):  
Nitish Kumar ◽  
Deepak Chaurasiya ◽  
Alok Singh ◽  
Siddhartha Asthana ◽  
Kushagra Agarwal ◽  
...  

Every year, health insurance fraud costs taxpayers billions of dollars and puts patients' health and welfare at risk. Existing solutions to detect fraudulent providers (hospitals, physicians, etc.) aim to find unusual patterns in claim-level features but fail to harness provider-provider and provider-patient interaction information. We propose a novel framework, Med-Dynamic meta learning (MeDML), that extends the capability of traditional fraud detection by learning patterns from 1) patient-provider interaction, using temporal and geo-spatial characteristics, 2) the provider's treatment, using encounter data (e.g. medical codes, mix of attended patients), and 3) referral, using underlying provider-provider relationships based on common patient visits within 30 days. To the best of our knowledge, MeDML is the first framework that can model fraud using a multi-aspect representation of the provider. MeDML also encapsulates the provider's phantom billing index, which identifies excessive and unnecessary services provided to patients, by segmenting frequently co-occurring diagnoses and procedures in non-fraudulent providers' claims. It uses a novel framework to aggregate the learned representations, capturing their task-specific relative importance via an attention mechanism. We test the dynamically generated meta embedding using various downstream models and show that it outperforms all baseline algorithms on the provider fraud prediction task.
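
The abstract does not specify how the attention-based aggregation is computed. As a rough sketch of the general idea only, the following assumes dot-product scores against a query vector followed by a softmax; the function names, the query vector, and the toy representations are hypothetical and should not be read as the MeDML design.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_aggregate(representations, query):
    """Combine several equal-length vectors into one, weighted by attention."""
    # Score each representation by its dot product with the query (assumed scoring).
    scores = [sum(r * q for r, q in zip(rep, query)) for rep in representations]
    weights = softmax(scores)
    dim = len(query)
    combined = [
        sum(w * rep[k] for w, rep in zip(weights, representations))
        for k in range(dim)
    ]
    return combined, weights


# Three hypothetical aspect embeddings (e.g. interaction, treatment, referral).
aspects = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
task_query = [1.0, 0.0]
meta_embedding, attention_weights = attention_aggregate(aspects, task_query)
```

In a trained model the query (or an equivalent scoring network) would be learned per task, so the weights express how much each aspect matters for that task.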


2021 ◽  
Vol 8 ◽  
Author(s):  
Keith Carlson ◽  
Faraz Dadgostari ◽  
Michael A. Livermore ◽  
Daniel N. Rockmore

This paper introduces a novel linked structure-content representation of federal statutory law in the United States and analyzes and quantifies its structure using tools and concepts drawn from network analysis and complexity studies. The organizational component of our representation is based on the explicit hierarchical organization within the United States Code (USC) as well as an embedded cross-reference citation network. We couple this structure with a layer of content-based similarity derived from the application of a “topic model” to the USC. The resulting representation is the first that explicitly models the USC as a “multinetwork” or “multilayered network” incorporating hierarchical structure, cross-references, and content. We report several novel descriptive statistics of this multinetwork. These include the results of this first application of the machine learning technique of topic modeling to the USC as well as multiple measures articulating the relationships between the organizational and content network layers. We find a high degree of assortativity of “titles” (the highest level hierarchy within the USC) with related topics. We also present a link prediction task and show that machine learning techniques are able to recover information about structure from content. Success in this prediction task has a natural interpretation as indicating a form of mutual information. We connect the relational findings between organization and content to a measure of “ease of search” in this large hyperlinked document that has implications for the ways in which the structure of the USC supports (or doesn’t support) broad useful access to the law. The measures developed in this paper have the potential to enable comparative work in the study of statutory networks that ranges across time and geography.

