annotation accuracy
Recently Published Documents

TOTAL DOCUMENTS: 12 (FIVE YEARS: 2)
H-INDEX: 3 (FIVE YEARS: 1)

2021
Author(s): Guangsheng Pei, Fangfang Yan, Lukas M Simon, Yulin Dai, Peilin Jia, ...

Single-cell RNA sequencing (scRNA-seq) is revolutionizing the study of complex and dynamic cellular mechanisms. However, cell-type annotation remains a major challenge because it largely relies on manual curation, which is cumbersome and error-prone. The growing number of scRNA-seq data sets, together with numerous published genetic studies, motivated us to build a comprehensive human cell type reference atlas. Here, we present deCS (decoding Cell type-Specificity), an automatic cell type annotation method based on a comprehensive collection of human cell type expression profiles or a list of marker genes. We applied deCS to single-cell data sets from various tissue types and systematically evaluated annotation accuracy under various conditions, including reference panels, sequencing depth and feature selection. Our results demonstrate that expanding the references is critical for improving annotation accuracy. Compared with existing state-of-the-art annotation tools, deCS significantly reduced computation time while increasing accuracy. deCS can be integrated into the standard scRNA-seq analytical pipeline to enhance cell type annotation. Finally, we demonstrated the broad utility of deCS for identifying trait-cell type associations in 51 human complex traits, providing deeper insight into the cellular mechanisms of disease pathogenesis.
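As a hedged illustration of reference-based cell type annotation of the kind deCS automates (this is a minimal sketch of the general idea, not deCS's actual scoring algorithm, and the marker profiles are invented): score a query cell's expression vector against each reference cell-type profile by Pearson correlation and assign the best-scoring label.

```python
# Minimal sketch: annotate a cell by correlating its expression vector
# against hypothetical reference cell-type profiles. Illustrative only.
from math import sqrt

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def annotate_cell(query, reference):
    """reference: dict mapping cell-type name -> expression profile."""
    scores = {ct: pearson(query, prof) for ct, prof in reference.items()}
    best = max(scores, key=scores.get)
    return best, scores

reference = {
    "T cell":  [9.0, 1.0, 0.5, 8.0],   # hypothetical marker profiles
    "B cell":  [1.0, 9.0, 0.5, 2.0],
    "NK cell": [8.0, 0.5, 9.0, 1.0],
}
label, scores = annotate_cell([8.5, 1.2, 0.8, 7.5], reference)
```

Expanding the reference (more cell types, more profiles per type) gives the `max` step more candidates to separate, which is consistent with the abstract's finding that larger references improve accuracy.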


2019, Vol 21 (4), pp. 1437-1447
Author(s): Jiajun Hong, Yongchao Luo, Yang Zhang, Junbiao Ying, Weiwei Xue, ...

Abstract
Functional annotation of protein sequences with high accuracy has become one of the most important issues in modern biomedical studies, and computational approaches that significantly accelerate the analysis process and enhance its accuracy are greatly desired. Although a variety of methods have been developed to improve protein annotation accuracy, their ability to control false annotation rates remains either limited or not systematically evaluated. In this study, a protein encoding strategy, together with a deep learning algorithm, was proposed to control the false discovery rate in protein function annotation, and its performance was systematically compared with that of traditional similarity-based and de novo approaches. Based on a comprehensive assessment from multiple perspectives, the proposed strategy and algorithm performed better in both prediction stability and annotation accuracy than other de novo methods. Moreover, an in-depth assessment revealed an improved capacity for controlling the false discovery rate compared with traditional methods. All in all, this study not only provides a comprehensive analysis of the performance of the newly proposed strategy but also provides a tool for researchers in the field of protein function annotation.
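The paper's deep-learning encoder is not reproducible from this abstract; as a standard stand-in for the false-discovery-rate control it targets, the sketch below applies the Benjamini-Hochberg procedure to per-annotation p-values, keeping only annotations under an adaptive threshold (the p-values are invented).

```python
# Sketch of FDR control over candidate function annotations via the
# Benjamini-Hochberg step-up procedure (a standard method, not the
# paper's algorithm).
def benjamini_hochberg(pvals, alpha=0.05):
    """Return sorted indices of annotations accepted at FDR level alpha."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    k = 0  # largest rank whose p-value clears the BH line alpha*rank/m
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= alpha * rank / m:
            k = rank
    return sorted(order[:k])

pvals = [0.001, 0.008, 0.039, 0.041, 0.20, 0.74]
kept = benjamini_hochberg(pvals, alpha=0.05)
```

Note the step-up logic: a p-value above its own line can still be accepted if a later (larger) rank clears its line, which is why the loop scans all ranks before truncating.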


2018
Author(s): Takumi Kondo, Jun Inoue, John Blake

2018, Vol 18 (03), pp. 1850013
Author(s): Tarek Helmy

Advanced digital capturing technologies have led to the explosive growth of images on the Web. To retrieve a desired image from a huge collection, a textual query is handier for representing the user's interest than providing a visually similar image as a query. Semantic annotation of images has been identified as an important step towards more efficient manipulation and retrieval of images. The aim of semantic annotation is to annotate existing images on the Web so that they are more easily interpreted by search programs. To annotate images effectively, extensive image interpretation techniques have been developed to explore the semantic concepts of images. However, due to the complexity and variety of backgrounds, effective image annotation remains a very challenging and open problem. Manual semantic annotation of Web content is neither feasible nor scalable, owing to the huge amount and rate of emerging Web content. In this paper, we survey existing image annotation models and develop a hierarchical classification-based image annotation framework for image categorization, description and annotation. Empirical evaluation of the proposed framework with respect to annotation accuracy shows high precision and recall compared with other annotation models, with significant savings in time and cost. An important feature of the proposed framework is that annotation techniques suitable for a particular image category can be easily integrated, and new ones developed for other image categories.
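The hierarchical shape of such a framework can be sketched as a two-stage pipeline: a top-level classifier picks a coarse image category, then a category-specific annotator produces keywords. Everything below is a hypothetical stand-in (the rule keyed on a dominant-hue feature and the tag lists are invented, not the paper's models).

```python
# Sketch of a hierarchical classification-based annotation pipeline:
# stage 1 picks a category, stage 2 runs a category-specific annotator.
def classify_category(features):
    # Hypothetical stage-1 rule: dominant hue decides the coarse category.
    return "landscape" if features["dominant_hue"] == "green" else "portrait"

# Stage 2: one pluggable annotator per category (trivial placeholders).
CATEGORY_ANNOTATORS = {
    "landscape": lambda f: ["outdoor", "nature"],
    "portrait":  lambda f: ["person", "face"],
}

def annotate(features):
    category = classify_category(features)
    return category, CATEGORY_ANNOTATORS[category](features)

category, tags = annotate({"dominant_hue": "green"})
```

The dictionary of per-category annotators mirrors the framework's stated extensibility: supporting a new image category means registering one new entry, without touching the other stages.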


2018
Author(s): Wasila Dahdul, Prashanti Manda, Hong Cui, James P. Balhoff, T. Alexander Dececchi, ...

Abstract
Natural language descriptions of organismal phenotypes - a principal object of study in biology - are abundant in the biological literature. Expressing these phenotypes as logical statements using formal ontologies would enable large-scale analysis of phenotypic information from diverse systems. However, considerable human effort is required to make the semantics of phenotype descriptions amenable to machine reasoning by (a) recognizing appropriate ontological terms for entities in text and (b) stringing these terms into logical statements. Most existing Natural Language Processing tools stop at entity recognition, leaving a need for tools that can assist with both aspects of the task. The recently described Semantic CharaParser aims to meet this need. We describe the first expert-curated Gold Standard corpus for ontology-based annotation of phenotypes from the systematics literature. We use it to evaluate Semantic CharaParser's annotations and to explore differences in performance between humans and machine. We use four annotation accuracy metrics that can account for both semantically identical and similar matches. We found that machine-human consistency was significantly lower than inter-curator (human-human) consistency. Surprisingly, allowing curators access to external information that was not available to Semantic CharaParser neither significantly increased the similarity of their annotations to the Gold Standard nor had a significant effect on inter-curator consistency. We found that the similarity of machine annotations to the Gold Standard increased after new ontology terms relevant to the input text had been added. Evaluation by the original authors of the character descriptions indicated that the Gold Standard annotations came closer to representing their intended meaning than did either the curator or machine annotations.
These findings point toward ways to better design software that augments human curators, and the Gold Standard corpus will allow training and assessment of new tools to improve phenotype annotation accuracy at scale.
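The study's four metrics are not specified in this abstract; as a hedged illustration of the general idea of crediting semantically similar (not just identical) matches, the sketch below scores an exact term match as 1.0 and a match to a direct ancestor in a toy is-a ontology as 0.5 (both the ontology fragment and the weights are invented).

```python
# Sketch of a similarity-aware annotation accuracy metric: each gold
# term takes the best score of any predicted term, where an exact match
# scores 1.0 and a parent/child match in a toy ontology scores 0.5.
PARENTS = {  # hypothetical is-a fragment: term -> parent
    "dorsal fin": "fin",
    "anal fin": "fin",
    "fin": "anatomical structure",
}

def term_score(predicted, gold):
    if predicted == gold:
        return 1.0
    if PARENTS.get(predicted) == gold or PARENTS.get(gold) == predicted:
        return 0.5
    return 0.0

def annotation_accuracy(predicted_terms, gold_terms):
    """Average best-match score of each gold term against predictions."""
    return sum(
        max((term_score(p, g) for p in predicted_terms), default=0.0)
        for g in gold_terms
    ) / len(gold_terms)

acc = annotation_accuracy(["dorsal fin", "eye"], ["fin", "eye"])
```

Under an exact-match metric the same prediction would score 0.5 instead of 0.75, which is precisely the difference such partial-credit metrics are designed to surface.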


2018
Author(s): Benjamin A. Neely, Debra L. Ellisor, W. Clay Davis

Abstract
Background: The last decade has witnessed dramatic improvements in whole-genome sequencing capabilities coupled with drastically decreased costs, leading to an inundation of high-quality de novo genomes. For this reason, continued development of genome quality metrics is imperative. The current study utilized the recently updated Atlantic bottlenose dolphin (Tursiops truncatus) genome and annotation to evaluate a proteomics-based metric of genome accuracy.
Results: Proteomic analysis of six tissues provided experimental confirmation of 10 402 proteins from 4 711 protein groups, almost one third of the predicted proteins in the genome. There was an increased median molecular weight and number of identified peptides per protein using the current T. truncatus annotation versus the previous annotation. Identification of larger proteins with more identified peptides implies reduced database fragmentation and improved gene annotation accuracy. A metric, NP10, is proposed that attempts to capture this quality improvement. When using the new T. truncatus genome there was a 21 % improvement in NP10. The metric was further demonstrated by using a publicly available proteomic data set to compare human genome annotations from 2004, 2013 and 2016, which showed a 33 % improvement in NP10.
Conclusions: These results demonstrate that new whole-genome sequencing techniques can rapidly generate high-quality de novo genome assemblies, and they emphasize the speed of advancing bioanalytical measurements in a non-model organism. Moreover, proteomics may be a useful metrological tool for benchmarking genome accuracy, though reference proteomic data sets are needed to facilitate this utility in new de novo and existing genomes.
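The abstract does not give NP10's formula, so it is not reproduced here; as a hedged illustration, the sketch below computes the two underlying annotation-quality statistics the abstract cites (median molecular weight and median identified peptides per protein) for two hypothetical annotation versions of the same proteome.

```python
# Sketch: summary statistics underlying the proteomics-based annotation
# quality comparison. Higher medians under the newer annotation suggest
# less fragmented, better-annotated gene models. Data are invented.
from statistics import median

def annotation_summary(proteins):
    """proteins: list of (molecular_weight_kDa, n_identified_peptides)."""
    return {
        "median_mw_kDa": median(mw for mw, _ in proteins),
        "median_peptides": median(n for _, n in proteins),
    }

old = annotation_summary([(20.0, 3), (35.0, 5), (50.0, 4)])
new = annotation_summary([(30.0, 5), (42.0, 8), (61.0, 6)])
```

The intuition: a fragmented gene model splits one true protein into several short predictions, so its peptides are spread over smaller database entries, depressing both medians.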


2015, Vol 15 (3), pp. 533
Author(s): B. Minaoui, M. Oujaoura, M. Fakir, M. Sajieddine

In this paper we study the problem of combining low-level visual features for semantic image annotation. The problem is tackled with two different approaches that combine texture, color and shape features via a Bayesian network classifier. In the first approach, vector concatenation is applied to combine the three low-level visual features: all three descriptors are normalized and merged into a single vector used with a single classifier. In the second approach, the three types of visual features are combined in a parallel scheme via three classifiers, each type of descriptor being used separately with its own classifier. The experimental results show that semantic image annotation accuracy is higher with the second approach.
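The two combination schemes can be sketched as early fusion (concatenate the normalized texture, color and shape vectors into one descriptor for a single classifier) versus late fusion (one classifier per descriptor, combined by majority vote). The classifiers here are placeholders, not the paper's Bayesian networks, and the feature values are invented.

```python
# Sketch of the two feature-combination schemes compared in the paper.
def normalize(v):
    s = sum(v) or 1.0
    return [x / s for x in v]

def early_fusion(texture, color, shape):
    # Approach 1: normalize each descriptor, concatenate into one vector.
    return normalize(texture) + normalize(color) + normalize(shape)

def majority_vote(labels):
    # Approach 2: each per-descriptor classifier emits a label; vote.
    return max(set(labels), key=labels.count)

fused = early_fusion([2.0, 2.0], [1.0, 3.0], [4.0, 0.0])
vote = majority_vote(["beach", "beach", "forest"])
```

Late fusion lets each classifier specialize on one descriptor's geometry instead of forcing one model to weigh heterogeneous dimensions, one plausible reading of why the parallel scheme scored higher.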


2015
Author(s): Stephen Nayfach, Patrick H. Bradley, Stacia K. Wyman, Timothy J. Laurent, Alex Williams, ...

Shotgun metagenomic DNA sequencing is a widely applicable tool for characterizing the functions encoded by microbial communities. Several bioinformatic tools can be used to functionally annotate metagenomes, allowing researchers to draw inferences about the functional potential of the community and to identify putative functional biomarkers. However, little is known about how decisions made during annotation affect the reliability of the results. Here, we use statistical simulations to rigorously assess how to optimize annotation accuracy and speed, given parameters of the input data such as read length and library size. We identify best practices in metagenome annotation and use them to guide the development of the Shotgun Metagenome Annotation Pipeline (ShotMAP). ShotMAP is an analytically flexible, end-to-end annotation pipeline that can be implemented either on a local computer or on a cloud compute cluster. We use ShotMAP to assess how different annotation databases affect the interpretation of seasonal changes in marine metagenome and metatranscriptome functional capacity. We also apply ShotMAP to data obtained from a clinical microbiome investigation of inflammatory bowel disease. This analysis finds that gut microbiota collected from Crohn's disease patients are functionally distinct from those collected from either ulcerative colitis patients or healthy controls, with differential abundance of metabolic pathways related to host-microbiome interactions that may serve as putative biomarkers of disease.
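The flavor of such a simulation can be sketched in a toy model (not ShotMAP's actual procedure, and the gene sequences are invented): fragment known genes into reads of a given length, annotate each read by substring lookup against the gene set, and report the fraction of reads whose source gene is uniquely recovered. Shorter reads match multiple genes more often, lowering accuracy.

```python
# Toy simulation relating read length to annotation accuracy: shorter
# reads are more often ambiguous between genes, so accuracy drops.
def simulate_accuracy(genes, read_len):
    """genes: dict name -> sequence. Returns the fraction of reads
    mapped uniquely and correctly back to their source gene."""
    correct = total = 0
    for name, seq in genes.items():
        for i in range(0, len(seq) - read_len + 1, read_len):
            read = seq[i:i + read_len]
            hits = [g for g, s in genes.items() if read in s]
            total += 1
            correct += hits == [name]  # count only uniquely correct hits
    return correct / total

genes = {"geneA": "ATGGCGTACGTTAGC", "geneB": "ATGTTTCCGGAATTC"}
acc_short = simulate_accuracy(genes, 3)
acc_long = simulate_accuracy(genes, 9)
```

Even this toy model reproduces the qualitative dependence the study quantifies rigorously: annotation parameters must be tuned to the read length of the input data.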

