mouse reference genome
Recently Published Documents


Total documents: 9 (five years: 4)

H-index: 4 (five years: 1)

2020 ◽  
Vol 49 (D1) ◽  
pp. D916-D923
Author(s):  
Adam Frankish ◽  
Mark Diekhans ◽  
Irwin Jungreis ◽  
Julien Lagarde ◽  
Jane E Loveland ◽  
...  

Abstract The GENCODE project annotates human and mouse genes and transcripts supported by experimental data with high accuracy, providing a foundational resource that supports genome biology and clinical genomics. GENCODE annotation processes make use of primary data and bioinformatic tools and analysis generated both within the consortium and externally to support the creation of transcript structures and the determination of their function. Here, we present improvements to our annotation infrastructure, bioinformatics tools, and analysis, and the advances they support in the annotation of the human and mouse genomes including: the completion of first pass manual annotation for the mouse reference genome; targeted improvements to the annotation of genes associated with SARS-CoV-2 infection; collaborative projects to achieve convergence across reference annotation databases for the annotation of human and mouse protein-coding genes; and the first GENCODE manually supervised automated annotation of lncRNAs. Our annotation is accessible via Ensembl, the UCSC Genome Browser and https://www.gencodegenes.org.


2019 ◽  
Author(s):  
Vishal Kumar Sarsani ◽  
Narayanan Raghupathy ◽  
Ian T. Fiddes ◽  
Joel Armstrong ◽  
Francoise Thibaud-Nissen ◽  
...  

Abstract Isogenic laboratory mouse strains are used to enhance reproducibility as individuals within a strain are essentially genetically identical. For the most widely used isogenic strain, C57BL/6, there is also a wealth of genetic, phenotypic, and genomic data, including one of the highest quality reference genomes (GRCm38.p6). However, laboratory mouse strains are living reagents and hence genetic drift occurs and is an unavoidable source of accumulating genetic variability that can have an impact on reproducibility over time. Nearly 20 years after the first release of the mouse reference genome, individuals from the strain it represents (C57BL/6J) are at least 26 inbreeding generations removed from the individuals used to generate the mouse reference genome. Moreover, C57BL/6J is now maintained through the periodic reintroduction of mice from cryopreserved embryo stocks that are derived from a single breeder pair, aptly named C57BL/6J Adam and Eve. To more accurately represent the genome of today’s C57BL/6J mice, we have generated a de novo assembly of the C57BL/6J Eve genome (B6Eve) using high coverage, long-read sequencing, optical mapping, and short-read data. Using these data, we addressed recurring variants observed in previous mouse studies. We have also identified structural variations that impact coding sequences, closed gaps in the mouse reference assembly, some of which are in genes, and we have identified previously unannotated coding sequences through long-read sequencing of cDNAs. This B6Eve assembly explains discrepant observations that have been associated with GRCm38-based analyses, and has provided data towards a reference genome that is more representative of the C57BL/6J mice that are in use today.


2018 ◽  
Author(s):  
Cristina Sisu ◽  
Paul Muir ◽  
Adam Frankish ◽  
Ian Fiddes ◽  
Mark Diekhans ◽  
...  

Pseudogenes are ideal markers of genome remodeling. In turn, the mouse is an ideal platform for studying them, particularly with the availability of developmental transcriptional data and the sequencing of 18 strains. Here, we present a comprehensive genome-wide annotation of the pseudogenes in the mouse reference genome and associated strains. We compiled this by combining manual curation of over 10,000 pseudogenes with results from automatic annotation pipelines. Also, by comparing the human and mouse, we annotated 165 unitary pseudogenes in mouse, and 303 unitaries in human. We make all our annotation available through mouse.pseudogene.org. The overall mouse pseudogene repertoire (in the reference and strains) is similar to human in terms of overall size, biotype distribution (~80% processed/~20% duplicated) and top family composition (with many GAPDH and ribosomal pseudogenes). However, notable differences arise in the pseudogene age distribution, with multiple retro-transpositional bursts in mouse evolutionary history and only one in human. Furthermore, in each strain about a fifth of the pseudogenes are unique, reflecting strain-specific functions and evolution. Additionally, we find that ~15% of the pseudogenes are transcribed, a fraction similar to that for human, and that pseudogene transcription exhibits greater tissue and strain specificity compared to protein-coding genes. Finally, we show that highly transcribed parent genes tend to give rise to processed pseudogenes.


2018 ◽  
Author(s):  
Jingtao Lilue ◽  
Anthony G. Doran ◽  
Ian T. Fiddes ◽  
Monica Abrudan ◽  
Joel Armstrong ◽  
...  

Abstract The most commonly employed mammalian model organism is the laboratory mouse. A wide variety of genetically diverse inbred mouse strains, representing distinct physiological states, disease susceptibilities, and biological mechanisms, have been developed over the last century. We report full length draft de novo genome assemblies for 16 of the most widely used inbred strains and reveal for the first time extensive strain-specific haplotype variation. We identify and characterise 2,567 regions on the current Genome Reference Consortium mouse reference genome exhibiting the greatest sequence diversity between strains. These regions are enriched for genes involved in defence and immunity, and exhibit enrichment of transposable elements and signatures of recent retrotransposition events. Combinations of alleles and genes unique to an individual strain are commonly observed at these loci, reflecting distinct strain phenotypes. Several immune-related loci, some in previously identified QTLs for disease response, have novel haplotypes not present in the reference that may explain the phenotype. We used these genomes to improve the mouse reference genome resulting in the completion of 10 new gene structures, and 62 new coding loci were added to the reference genome annotation. Notably, this high quality collection of genomes revealed a previously unannotated gene (Efcab3-like) encoding 5,874 amino acids, one of the largest known in the rodent lineage. Interestingly, Efcab3-like−/− mice exhibit severe size anomalies in four regions of the brain, suggesting a mechanism of Efcab3-like regulating brain development.


2016 ◽  
Author(s):  
Andrew P Morgan ◽  
J Matthew Holt ◽  
Rachel C McMullan ◽  
Timothy A Bell ◽  
Amelia M-F Clayshulte ◽  
...  

Abstract Gene duplication and loss are major sources of genetic polymorphism in populations, and are important forces shaping the evolution of genome content and organization. We have reconstructed the origin and history of a 127 kbp segmental duplication, R2d, in the house mouse (Mus musculus). R2d contains a single protein-coding gene, Cwc22. De novo assembly of both the ancestral (R2d1) and the derived (R2d2) copies reveals that they have been subject to non-allelic gene conversion events spanning tens of kilobases. R2d2 is also a hotspot for structural variation: its diploid copy number ranges from zero in the mouse reference genome to more than 80 in wild mice sampled from around the globe. Hemizygosity for high-copy-number alleles of R2d2 is associated in cis with meiotic drive, suppression of meiotic crossovers, and copy-number instability, with a mutation rate in excess of 1 per 100 transmissions in laboratory populations. We identify an additional 57 loci covering 0.8% of the mouse genome with patterns of sequence variation similar to those at R2d1 and R2d2. Our results provide a striking example of allelic diversity generated by duplication and demonstrate the value of de novo assembly in a phylogenetic context for understanding the mutational processes affecting duplicate genes.


2015 ◽  
Vol 26 (7-8) ◽  
pp. 295-304 ◽  
Author(s):  
Y. Zhu ◽  
J. E. Richardson ◽  
P. Hale ◽  
R. M. Baldarelli ◽  
D. J. Reed ◽  
...  

BMC Genomics ◽  
2009 ◽  
Vol 10 (1) ◽  
pp. 606 ◽  
Author(s):  
Clara Amid ◽  
Linda M Rehaume ◽  
Kelly L Brown ◽  
James GR Gilbert ◽  
Gordon Dougan ◽  
...  
