AnnoGen: annotating genome-wide pragmatic features

2020 ◽  
Vol 36 (9) ◽  
pp. 2899-2901
Author(s):  
Quanhu Sheng ◽  
Hui Yu ◽  
Olufunmilola Oyebamiji ◽  
Jiandong Wang ◽  
Danqian Chen ◽  
...  

Abstract Motivation: Genome annotation is an important step for all in-depth bioinformatics analysis. It is imperative to augment the quantity and diversity of genome-wide annotation data for the latest reference genome to promote its adoption by ongoing and future impactful studies. Results: We developed a Python toolkit, AnnoGen, which, for the first time, allows the annotation of three pragmatic genomic features for the GRCh38 genome in large base-wise quantities. The three features are chemical binding Energy, sequence information Entropy and Homology Score. The Homology Score is a distinctive feature that captures genome-wide homology through single-base-offset tiling windows of 100 contiguous nucleotide bases. AnnoGen can annotate these pragmatic features for user-specified genomic regions of variable size and can optionally compare two parallel sets of genomic regions. AnnoGen offers simple usage modes and a succinct HTML report of informative statistical tables and plots. Availability and implementation: https://github.com/shengqh/annogen.
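
As an illustration of what a base-wise feature over single-base-offset tiling windows can look like, the sketch below computes the Shannon entropy of base composition for every 100-base window of a sequence. This is not AnnoGen's implementation: the function names `shannon_entropy` and `tiled_entropy` are hypothetical, and the use of plain base-composition entropy is an assumption, since the abstract does not define its Entropy feature.

```python
import math
from collections import Counter


def shannon_entropy(seq):
    """Shannon entropy (in bits) of the base composition of seq."""
    counts = Counter(seq)
    total = len(seq)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def tiled_entropy(sequence, window=100):
    """Entropy of every window of length `window`, each offset by one base.

    Assumption: one value per window start; how a tool such as AnnoGen
    assigns window values to individual bases is not stated in the abstract.
    """
    return [
        shannon_entropy(sequence[i:i + window])
        for i in range(len(sequence) - window + 1)
    ]


if __name__ == "__main__":
    toy = "ACGT" * 30  # 120-base toy sequence, not real genome data
    values = tiled_entropy(toy, window=100)
    print(len(values), values[0])
```

A sequence of length n yields n - 100 + 1 windows, so storage grows roughly one value per base, which is consistent with the large base-wise quantities the abstract mentions.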

2018 ◽  
Author(s):  
Ling-Ling Liu ◽  
Chao Fang ◽  
Jun Meng ◽  
Johann Detilleux ◽  
Wu-Jun Liu ◽  
...  

Abstract High-quality gaits play an important role in many breeding programs, but few genes associated with gait have been identified so far. Here, we present an analysis of genomic selection signatures in 53 individuals from two breeds, using genotype data from the Affymetrix Equine 670K SNP genotyping array. Eleven selection regions in the Yili horse were identified using an FST statistic and XP-EHH calculated in 200-kb windows across the genome. In total, 50 genes were found in the 11 regions, including two candidate genes related to locomotory behavior (CLN6, FZD4). The genomes of the Yili horse and the Russian horse were shaped by natural and artificial selection. Our results suggest that the gait trait of the Yili horse may be related to these two genes. This is the first time that whole-genome array data have been used to study genomic regions affecting gait in the Yili horse breed.
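
A windowed FST scan of the kind described can be sketched as follows. The abstract does not say which FST estimator was used, so this example assumes the basic Wright definition, FST = (H_T - H_S) / H_T, for two equally weighted populations at a biallelic SNP, and averages per-SNP values within non-overlapping 200-kb windows. The function names and the toy allele frequencies are hypothetical.

```python
from collections import defaultdict


def snp_fst(p1, p2):
    """Wright's FST for one biallelic SNP from two population allele frequencies.

    Assumes equal weighting of the two populations. Returns None when the
    SNP is monomorphic overall, because FST is undefined there.
    """
    p_bar = (p1 + p2) / 2
    h_t = 2 * p_bar * (1 - p_bar)  # expected heterozygosity, pooled
    if h_t == 0:
        return None
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2  # mean within-population
    return (h_t - h_s) / h_t


def windowed_fst(snps, window=200_000):
    """Mean per-SNP FST in non-overlapping windows.

    `snps` is an iterable of (position, p1, p2) tuples on one chromosome.
    Returns {window_start: mean_fst}, ordered by window start.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for pos, p1, p2 in snps:
        fst = snp_fst(p1, p2)
        if fst is None:
            continue  # skip uninformative SNPs
        start = (pos // window) * window
        sums[start] += fst
        counts[start] += 1
    return {start: sums[start] / counts[start] for start in sorted(sums)}


if __name__ == "__main__":
    toy_snps = [(1_000, 0.0, 1.0), (150_000, 0.5, 0.5), (250_000, 0.0, 1.0)]
    print(windowed_fst(toy_snps))
```

Windows with unusually high mean FST would then be the candidates for selection regions; real analyses typically use sample-size-corrected estimators and combine the scan with a haplotype statistic such as XP-EHH, as the abstract does.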


2015 ◽  
Author(s):  
John E Pool

North American populations of Drosophila melanogaster are thought to derive from both European and African source populations, but despite their importance for genetic research, patterns of admixture along their genomes are essentially undocumented. Here, I infer geographic ancestry along genomes of the Drosophila Genetic Reference Panel (DGRP) and the D. melanogaster reference genome. Overall, the proportion of African ancestry was estimated to be 20% for the DGRP and 9% for the reference genome. Based on the size of admixture tracts and the approximate timing of admixture, I estimate that the DGRP population underwent roughly 13.9 generations per year. Notably, ancestry levels varied strikingly among genomic regions, with significantly less African introgression on the X chromosome, in regions of high recombination, and at genes involved in specific processes such as circadian rhythm. An important role for natural selection during the admixture process was further supported by a genome-wide signal of ancestry disequilibrium, in that many between-chromosome pairs of loci showed a deficiency of Africa-Europe allele combinations. These results support the hypothesis that admixture between partially genetically isolated Drosophila populations led to natural selection against incompatible genetic variants, and that this process is ongoing. The ancestry blocks inferred here may be relevant for the performance of reference alignment in this species, and may bolster the design and interpretation of many population genetic and association mapping studies.


2021 ◽  
Author(s):  
Savannah J Hoyt ◽  
Jessica M Storer ◽  
Gabrielle A Hartley ◽  
Patrick G.S. Grady ◽  
Ariel Gershman ◽  
...  

Mobile elements and highly repetitive genomic regions are potent sources of lineage-specific genomic innovation and fingerprint individual genomes. Comprehensive analyses of large, composite or arrayed repeat elements and those found in more complex regions of the genome require a complete, linear genome assembly. Here we present the first de novo repeat discovery and annotation of a complete human reference genome, T2T-CHM13v1.0. We identified novel satellite arrays, expanded the catalog of variants and families for known repeats and mobile elements, characterized new classes of complex, composite repeats, and provided comprehensive annotations of retroelement transduction events. Utilizing PRO-seq to detect nascent transcription and nanopore sequencing to delineate CpG methylation profiles, we defined the structure of transcriptionally active retroelements in humans, including for the first time those found in centromeres. Together, these data provide expanded insight into the diversity, distribution and evolution of repetitive regions that have shaped the human genome.


Genes ◽  
2018 ◽  
Vol 9 (10) ◽  
pp. 499
Author(s):  
Tamanna Anwar ◽  
Gourinath Samudrala

Entamoeba histolytica is an invasive, pathogenic parasite causing amoebiasis. Given that proteins involved in transmembrane (TM) transport are crucial for the adherence, invasion, and nutrition of the parasite, we conducted a genome-wide bioinformatics analysis of encoded proteins to functionally classify and characterize all the TM proteins in E. histolytica. In the present study, 692 TM proteins were identified, of which 546 are TM transporters. For the first time, we report a set of 141 uncharacterized proteins predicted as TM transporters. The percentage of TM proteins was found to be lower than in free-living eukaryotes, due to the extracellular nature and functional diversification of the TM proteins. The number of multi-pass proteins is larger than that of single-pass proteins; though both have their own significance in parasitism, multi-pass proteins are more extensively required as these are involved in acquiring nutrition and in ion transport, while single-pass proteins are only required at the time of inciting infection. Overall, this intestinal parasite implements multiple mechanisms for establishing infection, obtaining nutrition, and adapting itself to the new host environment. The classification of the repertoire of TM transporters in the present study offers several hints about potential methods of targeting the parasite for therapeutic benefit.


2015 ◽  
Author(s):  
Laurence Ettwiller ◽  
John Buswell ◽  
Erbay Yigit ◽  
Ira Schildkraut

We have developed Cappable-seq, which specifically captures primary RNA transcripts by enzymatically modifying the 5' triphosphorylated end of RNA with a selectable tag. We first applied Cappable-seq to E. coli, achieving up to 50-fold enrichment of primary transcripts and identifying an unprecedented 16539 transcription start sites (TSS) genome-wide at single-base resolution. We also applied Cappable-seq to a mouse cecum sample and for the first time identified TSS in a microbiome. Furthermore, Cappable-seq universally depletes ribosomal RNA and reduces the complexity of the transcriptome to a single quantifiable tag per TSS, enabling digital profiling of gene expression in any microbiome.


2017 ◽  
Author(s):  
Yue Li ◽  
Alvin Houze Shi ◽  
Ryan Tewhey ◽  
Pardis C. Sabeti ◽  
Jason Ernst ◽  
...  

Massively parallel reporter assays (MPRA) enable unprecedented opportunities to test for regulatory activity of thousands of regulatory sequences. However, MPRA only assay a subset of the genome, thus limiting their applicability for genome-wide functional annotations. To overcome this limitation, we have used existing MPRA datasets to train a machine learning model that uses DNA sequence information, regulatory motif annotations, evolutionary conservation, and epigenomic information to predict genomic regions that show enhancer activity when tested in MPRA. We used the resulting model to generate global predictions of regulatory activity at single-nucleotide resolution across 14 million common variants. We find that genetic variants with stronger predicted regulatory activity show significantly lower minor allele frequency, indicative of evolutionary selection within the human population. They also show higher overlap with eQTL annotations across multiple tissues relative to the background SNPs, indicating that their perturbations in vivo more frequently result in changes in gene expression. In addition, they are more frequently associated with trait-associated SNPs from genome-wide association studies (GWAS), enabling us to prioritize genetic variants that are more likely to be causal based on their predicted regulatory activity. Lastly, we use our model to compare MPRA inferences across cell types and platforms and to prioritize the assays most predictive of MPRA results, including cell-dependent DNase hypersensitivity sites and transcription factors known to be active in the tested cell types. Our results indicate that high-throughput testing of thousands of putative regions, coupled with regulatory predictions across millions of sites, presents a powerful strategy for systematic annotation of genomic regions and genetic variants.


2021 ◽  
Vol 80 (3) ◽  
pp. 1329-1337
Author(s):  
Jure Mur ◽  
Daniel L. McCartney ◽  
Daniel I. Chasman ◽  
Peter M. Visscher ◽  
Graciela Muniz-Terrera ◽  
...  

Background: The genetic variant rs9923231 (VKORC1) is associated with differences in the coagulation of blood and consequently with sensitivity to the drug warfarin. Variation in VKORC1 has been linked in a gene-based test to dementia/Alzheimer’s disease in the parents of participants, with suggestive evidence for an association for rs9923231 (p = 1.8×10⁻⁷), which was included in the genome-wide significant KAT8 locus. Objective: Our study aimed to investigate whether the relationship between rs9923231 and dementia persists only for certain dementia sub-types, and whether those taking warfarin are at greater risk. Methods: We used logistic regression and data from 238,195 participants from UK Biobank to examine the relationship between VKORC1, risk of dementia, and the interplay with warfarin use. Results: Parental history of dementia, APOE variant, atrial fibrillation, diabetes, hypertension, and hypercholesterolemia all had strong associations with vascular dementia (p < 4.6×10⁻⁶). The T-allele in rs9923231 was linked to a lower warfarin dose (β per T-allele = −0.29, p < 2×10⁻¹⁶) and to a higher risk of vascular dementia (OR = 1.17, p = 0.010), but not of other dementia sub-types. However, the risk of vascular dementia was not affected by warfarin use in carriers of the T-allele. Conclusion: Our study reports for the first time an association between rs9923231 and vascular dementia, but further research is warranted to explore potential mechanisms and specify the relationship between rs9923231 and features of vascular dementia.


Author(s):  
Wayne Xu ◽  
James R Tucker ◽  
Wubishet A Bekele ◽  
Frank M You ◽  
Yong-Bi Fu ◽  
...  

Abstract Barley (Hordeum vulgare L.) is one of the most important global crops. The six-row barley cultivar Morex reference genome has been used by the barley research community worldwide. However, this reference genome can have limitations when used for genomic and genetic diversity analysis studies, gene discovery, and marker development when working in two-row germplasm that is more common to Canadian barley. Here we assembled, for the first time, the genome sequence of a Canadian two-row malting barley, cultivar AAC Synergy. We applied deep Illumina paired-end reads, long mate-pair reads, PacBio sequences, 10X chromium linked read libraries, and chromosome conformation capture sequencing (Hi-C) to generate a contiguous assembly. The genome assembled from super-scaffolds had a size of 4.85 Gb, N50 of 2.32 Mb and an estimated 93.9% of complete genes from a plant database (BUSCO, benchmarking universal single-copy orthologous genes). After removal of small scaffolds (< 300 Kb), the assembly was arranged into pseudomolecules of 4.14 Gb in size with seven chromosomes plus unanchored scaffolds. The completeness and annotation of the assembly were assessed by comparing it with the updated version of six-row Morex and recently released two-row Golden Promise genome assemblies.
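
For readers unfamiliar with the N50 statistic quoted above, the following minimal sketch shows the standard definition: the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the total assembly length. The function name `n50` and the toy lengths are illustrative only and are not taken from the AAC Synergy assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths.

    Scaffolds are visited from longest to shortest; N50 is the length of
    the scaffold at which the running total first reaches half of the
    total assembly length.
    """
    if not lengths:
        raise ValueError("n50() requires at least one length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:  # integer comparison avoids float rounding
            return length


if __name__ == "__main__":
    toy_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]  # toy values
    print(n50(toy_lengths))
```

Because N50 depends on the total length, filtering out small scaffolds (as done here for scaffolds below 300 Kb) changes both the assembly size and the N50, so the statistic is only comparable between assemblies filtered in the same way.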


2021 ◽  
Vol 7 (11) ◽  
pp. eabd1239
Author(s):  
Mark Simcoe ◽  
Ana Valdes ◽  
Fan Liu ◽  
Nicholas A. Furlotte ◽  
David M. Evans ◽  
...  

Human eye color is highly heritable, but its genetic architecture is not yet fully understood. We report the results of the largest genome-wide association study for eye color to date, involving up to 192,986 European participants from 10 populations. We identify 124 independent associations arising from 61 discrete genomic regions, including 50 previously unidentified. We find evidence for genes involved in melanin pigmentation, but we also find associations with genes involved in iris morphology and structure. Further analyses in 1636 Asian participants from two populations suggest that iris pigmentation variation in Asians is genetically similar to Europeans, albeit with smaller effect sizes. Our findings collectively explain 53.2% (95% confidence interval, 45.4 to 61.0%) of eye color variation using common single-nucleotide polymorphisms. Overall, our study outcomes demonstrate that the genetic complexity of human eye color considerably exceeds previous knowledge and expectations, highlighting eye color as a genetically highly complex human trait.


Chromosoma ◽  
2021 ◽  
Vol 130 (1) ◽  
pp. 27-40
Author(s):  
Guoqing Liu ◽  
Hongyu Zhao ◽  
Hu Meng ◽  
Yongqiang Xing ◽  
Lu Cai

Abstract We present a deformation energy model for predicting nucleosome positioning, in which a position-dependent structural parameter set derived from crystal structures of nucleosomes was used to calculate the DNA deformation energy. The model is successful in predicting nucleosome occupancy genome-wide in budding yeast, nucleosome free energy, and rotational positioning of nucleosomes. Our model also indicates that the genomic regions underlying the MNase-sensitive nucleosomes in budding yeast have high deformation energy and, consequently, low nucleosome-forming ability, while the MNase-sensitive non-histone particles are characterized by much lower DNA deformation energy and high nucleosome preference. In addition, we also revealed that remodelers, SNF2 and RSC8, are likely to act in chromatin remodeling by binding to broad nucleosome-depleted regions that are intrinsically favorable for nucleosome positioning. Our data support the important role of position-dependent physical properties of DNA in nucleosome positioning.

