Fitness Landscape of the Fission Yeast Genome

Abstract Background Non-protein-coding regions of eukaryotic genomes remain poorly understood. Diversity studies, comparative genomics and biochemical outputs of genomic sites can be indicators of functional elements, but none produce fine-scale genome-wide descriptions of all functional elements. Results Towards the generation of a comprehensive description of functional elements in the haploid Schizosaccharomyces pombe genome, we generated transposon mutagenesis libraries to a density of one insertion per 13 nucleotides of the genome. We applied a five-state hidden Markov model (HMM) to characterise insertion-depleted regions at nucleotide-level resolution. HMM-defined functional constraint was consistent with genetic diversity, comparative genomics, gene-expression data and genome annotation. Conclusions We infer that transposon insertions lead to fitness consequences in 90% of the genome, including 80% of the non-protein-coding regions, reflecting the presence of numerous non-coding elements in this compact genome that have functional roles. Display of this data in genome browsers provides fine-scale views of structure-function relationships within specific genes.

Fitness Landscape of the Fission Yeast Genome

Molecular Biology and Evolution ◽

10.1093/molbev/msz113 ◽

2019 ◽

Vol 36 (8) ◽

pp. 1612-1623

Author(s):

Leanne Grech ◽

Daniel C Jeffares ◽

Christoph Y Sadée ◽

María Rodríguez-López ◽

Danny A Bitton ◽

...

Keyword(s):

Molecular Evolution ◽

Comparative Genomics ◽

Fitness Landscape ◽

Gene Knockout ◽

Transposon Mutagenesis ◽

Yeast Genome ◽

Noncoding Regions ◽

Functional Regions ◽

Transposon Insertions ◽

The Relationship

Abstract The relationship between DNA sequence, biochemical function, and molecular evolution is relatively well-described for protein-coding regions of genomes, but far less clear in noncoding regions, particularly in eukaryote genomes. In part, this is because we lack a complete description of the essential noncoding elements in a eukaryote genome. To contribute to this challenge, we used saturating transposon mutagenesis to interrogate the Schizosaccharomyces pombe genome. We generated 31 million transposon insertions, a theoretical coverage of 2.4 insertions per genomic site. We applied a five-state hidden Markov model (HMM) to distinguish insertion-depleted regions from insertion biases. Both raw insertion-density and HMM-defined fitness estimates showed significant quantitative relationships to gene knockout fitness, genetic diversity, divergence, and expected functional regions based on transcription and gene annotations. Through several analyses, we conclude that transposon insertions produced fitness effects in 66–90% of the genome, including substantial portions of the noncoding regions. Based on the HMM, we estimate that 10% of the insertion-depleted sites in the genome showed no signal of conservation between species and were weakly transcribed, demonstrating limitations of comparative genomics and transcriptomics to detect functional units. In this species, 3′- and 5′-untranslated regions were the most prominent insertion-depleted regions that were not represented in measures of constraint from comparative genomics. We conclude that the combination of transposon mutagenesis, evolutionary, and biochemical data can provide new insights into the relationship between genome function and molecular evolution.

Mapping overlapping functional elements embedded within the protein-coding regions of RNA viruses

Nucleic Acids Research ◽

10.1093/nar/gku981 ◽

2014 ◽

Vol 42 (20) ◽

pp. 12425-12439 ◽

Cited By ~ 55

Author(s):

Andrew E. Firth

Keyword(s):

RNA Viruses ◽

Functional Elements ◽

Protein Coding ◽

Coding Regions

PhyloCSF: a comparative genomics method to distinguish protein coding and non-coding regions

Bioinformatics ◽

10.1093/bioinformatics/btr209 ◽

2011 ◽

Vol 27 (13) ◽

pp. i275-i282 ◽

Cited By ~ 535

Author(s):

M. F. Lin ◽

I. Jungreis ◽

M. Kellis

Keyword(s):

Comparative Genomics ◽

Protein Coding ◽

Coding Regions

Comparative Genomics and Functional Elements in the Yeast Genome

Mycological Research ◽

10.1017/s0953756203248498 ◽

2003 ◽

Vol 107 (8) ◽

pp. 899

Keyword(s):

Comparative Genomics ◽

Yeast Genome ◽

Functional Elements

Evolutionarily conserved non-protein-coding regions in the chicken genome harbor functionally important variation

10.1101/2020.03.27.012005 ◽

2020 ◽

Cited By ~ 1

Author(s):

Christian Groß ◽

Chiara Bortoluzzi ◽

Dick de Ridder ◽

Hendrik-Jan Megens ◽

Martien AM Groenen ◽

...

Keyword(s):

Comparative Genomics ◽

Chicken Genome ◽

Population Genomics ◽

Purifying Selection ◽

Disease Genes ◽

Functional Importance ◽

Protein Coding ◽

Frequency Distributions ◽

Functional Studies ◽

Coding Regions

Abstract The availability of genomes for many species has advanced our understanding of the non-protein-coding fraction of the genome. Comparative genomics has proven to be an invaluable approach for the systematic, genome-wide identification of conserved non-protein-coding elements (CNEs). However, for many non-mammalian model species, including chicken, our capability to interpret the functional importance of variants overlapping CNEs has been limited by current genomic annotations, which rely on a single information type (e.g. conservation). We here studied CNEs in chicken using a combination of population genomics and comparative genomics. To investigate the functional importance of variants found in CNEs we develop a ch(icken) Combined Annotation-Dependent Depletion (chCADD), a variant effect prediction tool first introduced for humans and later on for mouse and pig. We show that 73 Mb of the chicken genome has been conserved across more than 280 million years of vertebrate evolution. The vast majority of the conserved elements are in non-protein-coding regions, which display SNP densities and allele frequency distributions characteristic of genomic regions constrained by purifying selection. By annotating SNPs with the chCADD score we are able to pinpoint specific subregions of the CNEs to be of higher functional importance, as supported by the finding that SNPs in these subregions are associated with known disease genes in humans, mice, and rats. Taken together, our findings indicate that CNEs harbor variants of functional significance that should be the object of further investigation along with protein-coding mutations. We therefore anticipate chCADD to be of great use to the scientific community and breeding companies in future functional studies in chicken.

PhyloCSF: a comparative genomics method to distinguish protein-coding and non-coding regions

Nature Precedings ◽

10.1038/npre.2010.4784.1 ◽

2010 ◽

Cited By ~ 1

Author(s):

Michael Lin ◽

Irwin Jungreis ◽

Manolis Kellis

Keyword(s):

Comparative Genomics ◽

Protein Coding ◽

Coding Regions

Evolutionary Analysis of DNA-Protein-Coding Regions Based on a Genetic Code Cube Metric

Current Topics in Medicinal Chemistry ◽

10.2174/1568026613666131204110022 ◽

2014 ◽

Vol 14 (3) ◽

pp. 407-417

Author(s):

Robersy Sanchez

Keyword(s):

Genetic Code ◽

Evolutionary Analysis ◽

Protein Coding ◽

Coding Regions

The open targets post-GWAS analysis pipeline

Bioinformatics ◽

10.1093/bioinformatics/btaa020 ◽

2020 ◽

Vol 36 (9) ◽

pp. 2936-2937 ◽

Cited By ~ 4

Author(s):

Gareth Peat ◽

William Jones ◽

Michael Nuhn ◽

José Carlos Marugán ◽

William Newell ◽

...

Keyword(s):

Drug Targets ◽

Gene Expression Regulation ◽

Association Studies ◽

Genome Wide Association Studies ◽

Protein Coding ◽

Data Resource ◽

Coding Regions ◽

Genome Wide ◽

Causal Genes ◽

Interactive Data

Abstract Motivation Genome-wide association studies (GWAS) are a powerful method to detect even weak associations between variants and phenotypes; however, many of the identified associated variants are in non-coding regions, and presumably influence gene expression regulation. Identifying potential drug targets, i.e. causal protein-coding genes, therefore, requires crossing the genetics results with functional data. Results We present a novel data integration pipeline that analyses GWAS results in the light of experimental epigenetic and cis-regulatory datasets, such as ChIP-Seq, Promoter-Capture Hi-C or eQTL, and presents them in a single report, which can be used for inferring likely causal genes. This pipeline was then fed into an interactive data resource. Availability and implementation The analysis code is available at www.github.com/Ensembl/postgap and the interactive data browser at postgwas.opentargets.io.

Comparative Genomics: Insights on the Pathogenicity and Lifestyle of Rhizoctonia solani

International Journal of Molecular Sciences ◽

10.3390/ijms22042183 ◽

2021 ◽

Vol 22 (4) ◽

pp. 2183

Author(s):

Nurhani Mat Razali ◽

Siti Norvahida Hisham ◽

Ilakiya Sharanee Kumar ◽

Rohit Nandan Shukla ◽

Melvin Lee ◽

...

Keyword(s):

Comparative Genomics ◽

Rhizoctonia Solani ◽

Abiotic Factors ◽

Biotic Factor ◽

Protein Coding ◽

Sustainable Food ◽

Repeat Elements ◽

Gene Sets ◽

Core Genes

Proper management of agricultural disease is important to ensure sustainable food security. Staple food crops like rice, wheat, cereals, and other cash crops hold great export value for countries. Ensuring proper supply is critical; hence any biotic or abiotic factors contributing to the shortfall in yield of these crops should be alleviated. Rhizoctonia solani is a major biotic factor that results in yield losses in many agriculturally important crops. This paper focuses on genome informatics of our Malaysian Draft R. solani AG1-IA, and the comparative genomics (inter- and intra- AG) with four AGs including China AG1-IA (AG1-IA_KB317705.1), AG1-IB, AG3, and AG8. The genomic content of repeat elements, transposable elements (TEs), syntenic genomic blocks, functions of protein-coding genes as well as core orthologous genic information that underlies R. solani’s pathogenicity strategy were investigated. Our analyses show that all studied AGs have low content and varying profiles of TEs. All AGs were dominant for Class I TE, much like other basidiomycete pathogens. All AGs demonstrate dominance in Glycoside Hydrolase protein-coding gene assignments suggesting its importance in infiltration and infection of host. Our profiling also provides a basis for further investigation on lack of correlation observed between number of pathogenicity and enzyme-related genes with host range. Despite being grouped within the same AG with China AG1-IA, our Draft AG1-IA exhibits differences in terms of protein-coding gene proportions and classifications. This implies that strains from similar AG do not necessarily have to retain similar proportions and classification of TE but must have the necessary arsenal to enable successful infiltration and colonization of host. In a larger perspective, all the studied AGs essentially share core genes that are generally involved in adhesion, penetration, and host colonization. However, the different infiltration strategies will depend on the level of host resilience where this is clearly exhibited by the gene sets encoded for the process of infiltration, infection, and protection from host.

Investigation of long non-coding RNAs as regulatory players of grapevine response to powdery and downy mildew infection

BMC Plant Biology ◽

10.1186/s12870-021-03059-6 ◽

2021 ◽

Vol 21 (1) ◽

Author(s):

Garima Bhatia ◽

Santosh K. Upadhyay ◽

Anuradha Upadhyay ◽

Kashmir Singh

Keyword(s):

Downy Mildew ◽

Plasmopara Viticola ◽

Defense Responses ◽

Protein Coding ◽

Functional Roles ◽

Real Time Quantitative PCR ◽

Transcriptional Reprogramming ◽

Sequencing Technologies ◽

Non Coding RNAs ◽

Fungal Phytopathogens

Abstract Background Long non-coding RNAs (lncRNAs) are regulatory transcripts of length > 200 nt. Owing to the rapidly progressing RNA-sequencing technologies, lncRNAs are emerging as considerable nodes in the plant antifungal defense networks. Therefore, we investigated their role in Vitis vinifera (grapevine) in response to obligate biotrophic fungal phytopathogens, Erysiphe necator (powdery mildew, PM) and Plasmopara viticola (downy mildew, DM), which impose a huge agro-economic burden on grape-growers worldwide. Results Using a computational approach based on RNA-seq data, 71 PM- and 83 DM-responsive V. vinifera lncRNAs were identified and comprehensively examined for their putative functional roles in plant defense response. V. vinifera protein coding sequences (CDS) were also profiled based on expression levels, and 1037 PM-responsive and 670 DM-responsive CDS were identified. Next, co-expression analysis-based functional annotation revealed their association with gene ontology (GO) terms for ‘response to stress’, ‘response to biotic stimulus’, ‘immune system process’, etc. Further investigation based on analysis of domains, enzyme classification, pathways enrichment, transcription factors (TFs), interactions with microRNAs (miRNAs), and real-time quantitative PCR of lncRNAs and co-expressing CDS pairs suggested their involvement in modulation of basal and specific defense responses such as: Ca2+-dependent signaling, cell wall reinforcement, reactive oxygen species metabolism, pathogenesis related proteins accumulation, phytohormonal signal transduction, and secondary metabolism. Conclusions Overall, the identified lncRNAs provide insights into the underlying intricacy of grapevine transcriptional reprogramming/post-transcriptional regulation to delay or seize the living cell-dependent pathogen growth. Therefore, in addition to defense-responsive genes such as TFs, the identified lncRNAs can be further examined and leveraged as candidates for biotechnological improvement/breeding to enhance fungal stress resistance in this susceptible fruit crop of economic and nutritional importance.
