Mixing genome annotation methods in a comparative analysis inflates the apparent number of lineage-specific genes

2022
Author(s):
Caroline M. Weisman
Andrew M. Murray
Sean R Eddy

Comparisons of genomes of different species are used to identify lineage-specific genes, those genes that appear unique to one species or clade. Lineage-specific genes are often thought to represent genetic novelty that underlies unique adaptations. Identification of these genes depends not only on genome sequences, but also on inferred gene annotations. Comparative analyses typically use available genomes that have been annotated using different methods, increasing the risk that orthologous DNA sequences may be erroneously annotated as a gene in one species but not another, appearing lineage-specific as a result. To evaluate the impact of such 'annotation heterogeneity', we identified four clades of species with sequenced genomes with more than one publicly available gene annotation, allowing us to compare the number of lineage-specific genes inferred when differing annotation methods are used to those resulting when annotation method is uniform across the clade. In these case studies, annotation heterogeneity increases the apparent number of lineage-specific genes by up to 15-fold, suggesting that annotation heterogeneity is a substantial source of potential artifact.

2021
Author(s):
Yu Hamaguchi
Chao Zeng
Michiaki Hamada

Abstract Background: Differential expression (DE) analysis of RNA-seq data typically depends on gene annotations. Different sets of gene annotations are available for the human genome and are continually updated, a process complicated by the development and application of high-throughput sequencing technologies. However, the impact of the complexity of gene annotations on DE analysis remains unclear. Results: Using “mappability”, a metric of the complexity of gene annotation, we compared three distinct human gene annotations, GENCODE, RefSeq, and NONCODE, and evaluated how mappability affected DE analysis. We found that mappability differed significantly among the human gene annotations. We also found that increasing mappability improved the performance of DE analysis, and that the impact of mappability was mainly evident in the quantification step and propagated systematically to the downstream steps of DE analysis. Conclusions: We assessed how the complexity of gene annotations affects DE analysis using mappability. Our findings indicate that the growth and complexity of gene annotations negatively impact the performance of DE analysis, suggesting that an approach that excludes unnecessary gene models from gene annotations improves the performance of DE analysis.


Open Biology
2017
Vol 7 (6)
pp. 160330
Author(s):
Erica Acton
Amy Huei-Yi Lee
Pei Jun Zhao
Stephane Flibotte
Mauricio Neira
...  

The Yeast Knockout (YKO) collection has provided a wealth of functional annotations from genome-wide screens. An unintended consequence is that 76% of gene annotations derive from one genotype. The nutritional auxotrophies in the YKO, in particular, have phenotypic consequences. To address this issue, ‘prototrophic’ versions of the YKO collection have been constructed, either by introducing a plasmid carrying wild-type copies of the auxotrophic markers (Plasmid-Borne, PB prot) or by backcrossing (Backcrossed, BC prot) to a wild-type strain. To systematically assess the impact of the auxotrophies, genome-wide fitness profiles of prototrophic and auxotrophic collections were compared across diverse drug and environmental conditions in 250 experiments. Our quantitative profiles uncovered broad impacts of genotype on phenotype for three deletion collections, and revealed genotypic and strain-construction-specific phenotypes. The PB prot collection exhibited fitness defects associated with plasmid maintenance, while BC prot fitness profiles were compromised due to strain loss from nutrient selection steps during strain construction. The repaired prototrophic versions of the YKO collection did not restore wild-type behaviour nor did they clarify gaps in gene annotation resulting from the auxotrophic background. To remove marker bias and expand the experimental scope of deletion libraries, construction of a bona fide prototrophic collection from a wild-type strain will be required.


2021
Vol 10 (10)
Author(s):
Yasunori Suzuki
Hiroaki Kubota
Tsutomu Kakuda
Shinji Takai
Kenji Sadamasu

ABSTRACT The complete genome sequences of two Staphylococcus argenteus strains, Tokyo13064 and Tokyo13069, isolated from human feces and suspected causative foods during a staphylococcal food poisoning outbreak, consist of 2,750,811-bp and 2,751,556-bp circular chromosomes and 2,543 and 2,548 genome annotation-predicted coding DNA sequences, respectively, with 19 rRNAs, 61 tRNAs, and 1 CRISPR each.


BMC Genomics
2021
Vol 22 (1)
Author(s):
Yu Hamaguchi
Chao Zeng
Michiaki Hamada

Abstract Background Differential expression (DE) analysis of RNA-seq data typically depends on gene annotations. Different sets of gene annotations are available for the human genome and are continually updated, a process complicated by the development and application of high-throughput sequencing technologies. However, the impact of the complexity of gene annotations on DE analysis remains unclear. Results Using “mappability”, a metric of the complexity of gene annotation, we compared three distinct human gene annotations, GENCODE, RefSeq, and NONCODE, and evaluated how mappability affected DE analysis. We found that mappability differed significantly among the human gene annotations. We also found that increasing mappability improved the performance of DE analysis, and that the impact of mappability was mainly evident in the quantification step and propagated systematically to the downstream steps of DE analysis. Conclusions We assessed how the complexity of gene annotations affects DE analysis using mappability. Our findings indicate that the growth and complexity of gene annotations negatively impact the performance of DE analysis, suggesting that an approach that excludes unnecessary gene models from gene annotations improves the performance of DE analysis.


2018
Vol 4 (2)
pp. 200-224
Author(s):  
Ben Brewster

This article presents findings from a series of case studies into the impact of multi-agency anti-slavery partnerships in the UK. The research draws upon empirical evidence from a number of geographic regions as the basis of a comparative analysis involving the full spectrum of statutory and non-statutory organisations that undertake anti-slavery work. The article focuses, in particular, on the role of partnerships in victim identification and support, while simultaneously discussing issues and drawing upon existing discourse associated with policy, legislation and the macro conditions that impose barriers on such efforts.


2021
Author(s):
Hui Jiang
Jing Tian
Jiaxin Yang
Xiang Dong
Zhixiang Zhong
...  

Abstract Background: Polystachya Hook. is a large pantropical orchid genus (c. 240 species) distributed in Africa, southern Asia and the Americas, with its centre of diversity in Africa. Chloroplast (cp) genomes of plants are highly conserved and can provide many informative DNA sites and better resolution for plant phylogenies. However, for Polystachya, the whole cp genome, including its structural features, is as yet unknown, and the phylogenetic placement of the genus within the Orchidaceae is still unclear. Results: In this study, the complete cp genomes of six Polystachya species were assembled based on genome skimming. We subjected them to comparative genomic analyses and reconstructed their phylogenetic relationships. The results showed that the cp genomes had a typical quadripartite structure with conserved genome arrangement and moderate divergence. The cp genomes of the six Polystachya species ranged from 145,484 bp to 149,274 bp in length and had nearly identical GC content of 36.9%-37.0%. Gene annotation revealed 113 unique genes. In addition, 19 genes were duplicated in the inverted repeat (IR) regions, and 17 genes possessed introns. Comparative analysis of the overall sequence identity among the six complete cp genomes confirmed that, for both coding and non-coding regions in Polystachya, SC regions exhibit higher sequence variation than IRs. Furthermore, there were various amplifications in the IR region among the six Polystachya species. Most of the protein-coding genes of these species had a high degree of codon preference. We screened for specific SSRs and found seven relatively highly variable loci. Moreover, 13 genes were found to be under significant positive selection. Phylogenetic analysis suggested that the six Polystachya species formed a monophyletic clade and were more closely related to the tribe Vandeae. Phylogenetic relationships of the family Orchidaceae inferred from the 85 cp genome sequences were robust and generally consistent with previous studies. Conclusions: Our study reported the complete cp genomes of the six Polystachya species and provided detailed structural and comparative analysis results, which can contribute to the development of DNA markers for studies of genetic variability and evolution in Polystachya. In addition, the present results further demonstrate the phylogenetic position of Polystachya.


2018
pp. 32-51
Author(s):
R. Yu. Kochnev
L. I. Polishchuk
A. Yu. Rubin

We present a comparative analysis of the impact of centralized and decentralized corruption on the private sector. Theory and empirical evidence point to a “double jeopardy” of decentralized corruption, which increases the burden of corruption on private firms and weakens the incentives of the bureaucracy to provide public production inputs, such as infrastructure. These outcomes are produced by simultaneous free-riding and tragedy-of-the-commons effects. The empirical part of the paper uses data from the Business Environment and Enterprise Performance project.


Author(s):  
Igor Ponomarenko
Kateryna Volovnenko

The subject of the research is a set of approaches to the statistical analysis of the activities of small business entities in Ukraine, including micro-enterprises. The purpose of this article is to study the features of the functioning of small business entities in Ukraine. Methodology. The research methodology uses system-structural and comparative analysis (to study the change in the number of small enterprises by major components); monographic analysis (when studying methods of statistical analysis of small businesses); and economic analysis (when assessing the impact of small business entities on socio-economic phenomena and processes in Ukraine). The scientific novelty consists in determining the features of the functioning of small businesses in Ukraine in modern conditions. The influence of the main socio-economic and political indicators on the activities of small enterprises in recent periods has been identified. It has been established that small businesses are flexible in developing strategies under significant competition, which makes it possible to respond quickly to changing situations in specific markets. Conclusions. The use of a comprehensive statistical analysis of small businesses functioning in Ukraine will allow government agencies to develop a set of measures to optimize the activities of these enterprises, which ultimately will strengthen their competitiveness and contribute to the growth of the national economic system.

