automated prediction
Recently Published Documents


TOTAL DOCUMENTS

146
(FIVE YEARS 53)

H-INDEX

21
(FIVE YEARS 2)

2021 ◽  
Vol 13 (4) ◽  
pp. 1-11
Author(s):  
Stuti Nayak ◽  
Amrapali Zaveri ◽  
Pedro Hernandez Serrano ◽  
Michel Dumontier

While open biomedical data are abundant, the lack of high-quality metadata makes it challenging for others to find relevant datasets and reuse them for another purpose. In particular, metadata are useful for understanding the nature and provenance of the data. A common approach to improving metadata quality relies on human curation, which is expensive, time-consuming, and prone to error. Towards improving the quality of metadata, we use scientific publications to automatically predict metadata key:value pairs. For prediction, we use a Convolutional Neural Network (CNN) and a Bidirectional Long Short-Term Memory network (BiLSTM). We focus on the NCBI Disease Corpus, which is used to train the CNN and the BiLSTM. We perform two kinds of experiments with these two architectures: (1) we predict disease names by using their unique IDs in the MeSH ontology, and (2) we use the tree structure of the MeSH ontology to move up the hierarchy of these disease terms, which reduces the number of labels. We also apply various multi-label classification techniques in these experiments. We find that in both cases the CNN achieves the best results, with an accuracy of 83% in predicting the superclasses for diseases.
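
The label reduction in experiment (2) can be illustrated with a minimal sketch. It assumes that each disease label is available as a dot-separated MeSH tree number and that "moving up the hierarchy" means truncating that number to a chosen depth. The function names and the example tree numbers are hypothetical, chosen only for illustration; this is not the authors' code.

```python
def ancestor_at_depth(tree_number: str, depth: int) -> str:
    """Return the ancestor of a dot-separated tree number at the given depth.

    Depth 1 keeps only the first segment, depth 2 the first two, and so on.
    A depth larger than the number of segments returns the input unchanged.
    """
    if depth < 1:
        raise ValueError("depth must be at least 1")
    parts = tree_number.split(".")
    return ".".join(parts[:depth])


def collapse_labels(tree_numbers, depth):
    """Map fine-grained labels to their ancestors and drop duplicates."""
    return sorted({ancestor_at_depth(t, depth) for t in tree_numbers})


# Illustrative tree numbers (assumed format, not taken from the paper).
labels = ["C04.557.337", "C04.588.180", "C10.228"]
coarse = collapse_labels(labels, depth=1)
print(coarse)
```

Collapsing at a shallower depth merges sibling diseases into one superclass, which is why the number of distinct labels the classifier must predict goes down.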


2021 ◽  
Author(s):  
Abdulrahman Alasiri ◽  
Konrad J. Karczewski ◽  
Brian Cole ◽  
Bao-Li Loza ◽  
Jason H. Moore ◽  
...  

Motivation: Loss-of-Function (LoF) variants in human genes are important due to their impact on clinical phenotypes and their frequent occurrence in the genomes of healthy individuals. Current approaches predict high-confidence LoF variants without identifying the specific genes or the number of copies they affect. Moreover, there is a lack of methods for detecting knockout genes caused by compound heterozygous (CH) LoF variants. Results: We have developed the Loss-of-Function ToolKit (LoFTK), which allows efficient and automated prediction of LoF variants from both genotyped and sequenced genomes. LoFTK enables the identification of genes that are inactive in one or two copies and provides summary statistics for downstream analyses. LoFTK can identify CH LoF variants, which result in LoF genes with two copies lost. Using data from parents and offspring, we show that 96% of CH LoF genes predicted by LoFTK in the offspring have the respective alleles donated by each parent. Availability and implementation: LoFTK is open-source software and is freely available to non-commercial users at https://github.com/CirculatoryHealth/LoFTK
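
The idea of counting how many copies of a gene are inactivated can be sketched as follows. This is a simplified illustration, not LoFTK's implementation: it assumes phased genotypes in which each variant is given as a pair of 0/1 flags (one per haplotype, 1 meaning the LoF allele is present), and the function name and gene names are hypothetical.

```python
from collections import defaultdict


def lof_copies_per_gene(variants):
    """Count inactivated copies (0, 1 or 2) of each gene.

    variants: iterable of (gene, (hap1, hap2)) tuples, where hap1 and hap2
    are 1 if the LoF allele sits on that haplotype and 0 otherwise.
    A copy counts as lost if at least one LoF allele lies on it.
    """
    hit = defaultdict(lambda: [False, False])
    for gene, (hap1, hap2) in variants:
        hit[gene][0] |= bool(hap1)
        hit[gene][1] |= bool(hap2)
    return {gene: int(a) + int(b) for gene, (a, b) in hit.items()}


# Hypothetical input: GENE_A carries two different heterozygous LoF variants
# on opposite haplotypes (compound heterozygous), GENE_B a single
# heterozygous variant, GENE_C one homozygous variant.
variants = [
    ("GENE_A", (1, 0)),
    ("GENE_A", (0, 1)),
    ("GENE_B", (1, 0)),
    ("GENE_C", (1, 1)),
]
copies = lof_copies_per_gene(variants)
print(copies)
```

Under these assumptions, a gene reaches two lost copies either through a homozygous variant or through two heterozygous variants on different haplotypes; the second case is the compound heterozygous one, and it can only be recognized when phase is known.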


2021 ◽  
Vol 77 (25) ◽  
pp. 3184-3192 ◽  
Author(s):  
Craig G. Rusin ◽  
Sebastian I. Acosta ◽  
Eric L. Vu ◽  
Mubbasheer Ahmed ◽  
Kennith M. Brady ◽  
...  
