Chromosome-level de novo assembly of the pig-tailed macaque genome using linked-read sequencing and HiC proximity scaffolding

GigaScience ◽  
2020 ◽  
Vol 9 (7) ◽  
Author(s):  
Morteza Roodgar ◽  
Afshin Babveyh ◽  
Lan H Nguyen ◽  
Wenyu Zhou ◽  
Rahul Sinha ◽  
...  

Abstract Background Macaque species share >93% genome homology with humans and develop many disease phenotypes similar to those of humans, making them valuable animal models for the study of human diseases (e.g., HIV and neurodegenerative diseases). However, the quality of genome assembly and annotation for several macaque species lags behind the human genome effort. Results To close this gap and enhance functional genomics approaches, we used a combination of de novo linked-read assembly and scaffolding using proximity ligation assay (HiC) to assemble the pig-tailed macaque (Macaca nemestrina) genome. This combinatorial method yielded large scaffolds at chromosome level with a scaffold N50 of 127.5 Mb; the 23 largest scaffolds covered 90% of the entire genome. This assembly revealed large-scale rearrangements between pig-tailed macaque chromosomes 7, 12, and 13 and human chromosomes 2, 14, and 15. We subsequently annotated the genome using transcriptome and proteomics data from personalized induced pluripotent stem cells derived from the same animal. Reconstruction of the evolutionary tree using whole-genome annotation and orthologous comparisons among 3 macaque species, human, and mouse genomes revealed extensive homology between human and pig-tailed macaques with regard to both pluripotent stem cell genes and innate immune gene pathways. Our results confirm that rhesus and cynomolgus macaques exhibit a closer evolutionary distance to each other than either species exhibits to humans or pig-tailed macaques. Conclusions These findings demonstrate that pig-tailed macaques can serve as an excellent animal model for the study of many human diseases, particularly with regard to pluripotency and innate immune pathways.
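
The abstract above reports a scaffold N50 and the share of the genome held in the largest scaffolds. As a reading aid, the following is a minimal sketch of how those two summary statistics are commonly computed from a list of scaffold lengths. It is not the authors' pipeline; the function names `n50` and `fraction_in_largest` and the toy lengths are hypothetical and chosen only for illustration.

```python
def n50(lengths):
    """Return the N50: the length L such that sequences of length >= L
    together cover at least half of the total assembly length."""
    if not lengths:
        raise ValueError("lengths must be non-empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


def fraction_in_largest(lengths, k):
    """Return the fraction of total length contained in the k longest sequences."""
    ordered = sorted(lengths, reverse=True)
    return sum(ordered[:k]) / sum(ordered)


# Hypothetical scaffold lengths in Mb (not data from the paper).
toy_scaffolds_mb = [150, 120, 100, 80, 30, 10, 5, 5]

print(n50(toy_scaffolds_mb))                     # N50 of the toy assembly
print(fraction_in_largest(toy_scaffolds_mb, 4))  # share held by the 4 longest
```

A higher N50 for a similar total length indicates that more of the assembly sits in long, contiguous scaffolds, which is why the statistic is reported alongside the coverage of the largest scaffolds.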

2019 ◽  
Author(s):  
Morteza Roodgar ◽  
Afshin Babveyh ◽  
Lan Huong ◽  
Wenyu Zhou ◽  
Rahul Sinha ◽  
...  

Abstract Old World monkey species share over 93% genome homology with humans and develop many disease phenotypes similar to those of humans, making them highly valuable animal models for the study of numerous human diseases. However, the quality of genome assembly and annotation for Old World monkeys, including macaque species, lags behind the human genome effort. To close this gap and enhance functional genomics approaches, we employed a combination of de novo linked-read assembly and scaffolding using proximity ligation assay (HiC) to assemble the pig-tailed macaque (Macaca nemestrina) genome. This combinatorial method yielded large scaffolds at chromosome level with a scaffold N50 of 127.5 Mb; the 23 largest scaffolds covered 90% of the entire genome. This assembly revealed large-scale rearrangements between pig-tailed macaque chromosomes 7, 12, and 13 and human chromosomes 2, 14, and 15.


2019 ◽  
Vol 6 (1) ◽  
Author(s):  
Baohua Chen ◽  
Zhixiong Zhou ◽  
Qiaozhen Ke ◽  
Yidi Wu ◽  
Huaqiang Bai ◽  
...  

Abstract Larimichthys crocea is an endemic marine fish in East Asia that belongs to Sciaenidae in Perciformes. L. crocea is now recognized as an “iconic” marine fish species in China because it is not only a popular food fish but also a representative victim of overfishing, and it still provides high-value fish products supported by the modern large-scale mariculture industry. Here, we report a chromosome-level reference genome of L. crocea generated using PacBio single-molecule real-time (SMRT) sequencing and high-throughput chromosome conformation capture (Hi-C) technologies. The genome sequences were assembled into 1,591 contigs with a total length of 723.86 Mb and a contig N50 length of 2.83 Mb. After chromosome-level scaffolding, 24 scaffolds were constructed with a total length of 668.67 Mb (92.48% of the total length). Genome annotation identified 23,657 protein-coding genes and 7,262 ncRNAs. This highly accurate, chromosome-level reference genome of L. crocea provides an essential genome resource to support the development of genome-scale selective breeding and restocking strategies for L. crocea.


2019 ◽  
Author(s):  
Mats E. Pettersson ◽  
Christina M. Rochus ◽  
Fan Han ◽  
Junfeng Chen ◽  
Jason Hill ◽  
...  

Abstract The Atlantic herring is a model species for exploring the genetic basis for ecological adaptation, due to its huge population size and extremely low genetic differentiation at selectively neutral loci. However, such studies have so far been hampered by a highly fragmented genome assembly. Here, we deliver a chromosome-level genome assembly based on a hybrid approach combining a de novo PacBio assembly with Hi-C-supported scaffolding. The assembly comprises 26 autosomes with sizes ranging from 12.4 to 33.1 Mb and a total size, in chromosomes, of 726 Mb. The development of a high-resolution linkage map confirmed the global chromosome organization and the linear order of genomic segments along the chromosomes. A comparison of the herring genome assembly with other high-quality assemblies from bony fishes revealed few interchromosomal but frequent intrachromosomal rearrangements. The improved assembly makes the analysis of previously intractable large-scale structural variation more feasible, allowing, for example, the detection of a 7.8 Mb inversion on chromosome 12 underlying ecological adaptation. This supergene shows strong genetic differentiation between populations from the northern and southern parts of the species distribution. The chromosome-based assembly also markedly improves the interpretation of previously detected signals of selection, allowing us to reveal hundreds of independent loci associated with ecological adaptation in the Atlantic herring.


2018 ◽  
Vol 35 (7) ◽  
pp. 1249-1251 ◽  
Author(s):  
Kai Li ◽  
Marc Vaudel ◽  
Bing Zhang ◽  
Yan Ren ◽  
Bo Wen

Abstract Summary Data visualization plays critical roles in proteomics studies, ranging from quality control of MS/MS data to validation of peptide identification results. Herein, we present PDV, an integrative proteomics data viewer that can be used to visualize a wide range of proteomics data, including database search results, de novo sequencing results, proteogenomics files, MS/MS data in mzML/mzXML format and data from public proteomics repositories. PDV is a lightweight visualization tool that enables intuitive and fast exploration of diverse, large-scale proteomics datasets on standard desktop computers in both graphical user interface and command line modes. Availability and implementation PDV software and the user manual are freely available at http://pdv.zhang-lab.org. The source code is available at https://github.com/wenbostar/PDV and is released under the GPL-3 license. Supplementary information Supplementary data are available at Bioinformatics online.


2016 ◽  
Vol 15 (3) ◽  
pp. 732-742 ◽  
Author(s):  
Arun Devabhaktuni ◽  
Joshua E. Elias

2015 ◽  
Author(s):  
Evan H. Baugh ◽  
Riley Simmons-Edler ◽  
Christian L. Mueller ◽  
Rebecca F. Alford ◽  
Natalia Volfovsky ◽  
...  

Existing methods for interpreting protein variation focus on annotating mutation pathogenicity rather than detailed interpretation of variant deleteriousness and frequently use only sequence-based or structure-based information. We present VIPUR, a computational framework that seamlessly integrates sequence analysis and structural modeling (using the Rosetta protein modeling suite) to identify and interpret deleterious protein variants. To train VIPUR, we collected 9,477 protein variants with known effects on protein function from multiple organisms and curated structural models for each variant from crystal structures and homology models. VIPUR can be applied to mutations in any organism's proteome with improved generalized accuracy (AUROC 0.83) and interpretability (AUPR 0.87) compared to other methods. We demonstrate that VIPUR's predictions of deleteriousness match the biological phenotypes in ClinVar and provide a clear ranking of prediction confidence. We use VIPUR to interpret known mutations associated with inflammation and diabetes, demonstrating the structural diversity of disrupted functional sites and improved interpretation of mutations associated with human diseases. Lastly, we demonstrate VIPUR's ability to highlight candidate genes associated with human diseases by applying VIPUR to de novo variants associated with autism spectrum disorders.
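
The abstract above summarizes classifier performance with AUROC. For readers unfamiliar with the metric, here is a minimal sketch of one standard way to compute it: as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as half. This is not VIPUR's evaluation code; the function name `auroc` and the toy labels and scores are hypothetical.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: iterable of 0/1 (1 = deleterious in this toy setting)
    scores: iterable of predicted scores, higher = more likely positive
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # ties count as half
    return wins / (len(positives) * len(negatives))


# Hypothetical labels and scores (not data from the paper).
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]

print(auroc(toy_labels, toy_scores))
```

The pairwise form is quadratic in the number of examples, which is fine for illustration; a rank-based formulation gives the same value more efficiently on large datasets.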


2016 ◽  
Author(s):  
Alvina G. Lai ◽  
A. Aziz Aboobaker

Abstract Growing demands for aquatic sources of animal proteins have attracted significant investments in aquaculture research in recent years. The crustacean aquaculture industry has undergone substantial growth to accommodate a rising global demand; however, such large-scale production is susceptible to pathogen-mediated destruction. It is clear that a thorough understanding of the crustacean innate immune system is imperative for future research into combating current and future pathogens of the main food crop species. Through a comparative genomics approach utilising extant data from 55 species, we describe the innate immune system of crustaceans from the Malacostraca class. We identify 7407 malacostracan genes from 39 gene families implicated in different aspects of host defence and demonstrate dynamic evolution of innate immunity components within this group. Malacostracans have achieved flexibility in recognising infectious agents through divergent evolution and expansion of pathogen recognition receptor genes. Antiviral RNAi, Toll and JAK-STAT signal transduction pathways have remained conserved within Malacostraca, although the Imd pathway appears to lack several key components. Immune effectors such as the antimicrobial peptides (AMPs) have unique evolutionary profiles, with many malacostracan AMPs not found in other arthropod groups. Lastly, we describe four putative novel immune gene families, characterised by distinct protein domains, potentially representing important evolutionary novelties of the malacostracan immune system.


2020 ◽  
Author(s):  
Salvador Guardiola ◽  
Monica Varese ◽  
Xavier Roig ◽  
Jesús Garcia ◽  
Ernest Giralt

NOTE: This preprint has been retracted by consensus from all authors. See the retraction notice; the original text can be found under "Version 1".

Peptides, together with antibodies, are among the most potent biochemical tools to modulate challenging protein-protein interactions. However, current structure-based methods are largely limited to natural peptides and are not suitable for designing target-specific binders with improved pharmaceutical properties, such as macrocyclic peptides. Here we report a general framework that leverages the computational power of Rosetta for large-scale backbone sampling and energy scoring, followed by side-chain composition, to design heterochiral cyclic peptides that bind to a protein surface of interest. To showcase the applicability of our approach, we identified two peptides (PD-i3 and PD-i6) that target PD-1, a key immune checkpoint, and work as protein ligand decoys. A comprehensive biophysical evaluation confirmed their binding mechanism to PD-1 and their inhibitory effect on the PD-1/PD-L1 interaction. Finally, elucidation of their solution structures by NMR served as validation of our de novo design approach. We anticipate that our results will provide a general framework for designing target-specific drug-like peptides.



