High-quality carnivore genomes from roadkill samples enable species delimitation in aardwolf and bat-eared fox

2020 ◽  
Author(s):  
Rémi Allio ◽  
Marie-Ka Tilak ◽  
Céline Scornavacca ◽  
Nico L. Avenant ◽  
Erwan Corre ◽  
...  

Abstract: In a context of ongoing biodiversity erosion, obtaining genomic resources from wildlife is becoming essential for conservation. The thousands of yearly mammalian roadkill could potentially provide a useful source material for genomic surveys. To illustrate the potential of this underexploited resource, we used roadkill samples to sequence reference genomes and study the genomic diversity of the bat-eared fox (Otocyon megalotis) and the aardwolf (Proteles cristata), for which subspecies have been defined based on similar disjunct distributions in Eastern and Southern Africa. By developing an optimized DNA extraction protocol, we successfully obtained long reads using the Oxford Nanopore Technologies (ONT) MinION device. For the first time in mammals, we obtained two reference genomes with high contiguity and gene completeness by combining ONT long reads with Illumina short reads using hybrid assembly. Based on re-sequencing data from a few other roadkill samples, the comparison of the genetic differentiation between our two pairs of subspecies to that of pairs of well-defined species across Carnivora showed that the two subspecies of aardwolf might warrant species status (P. cristata and P. septentrionalis), whereas the two subspecies of bat-eared fox might not. Moreover, using these data, we conducted demographic analyses that revealed similar trajectories between Eastern and Southern populations of both species, suggesting that their population sizes have been shaped by similar environmental fluctuations. Finally, we obtained a well-resolved genome-scale phylogeny for Carnivora with evidence for incomplete lineage sorting among the three main arctoid lineages. Overall, our cost-effective strategy opens the way for large-scale population genomic studies and phylogenomics of mammalian wildlife using roadkill.

eLife ◽  
2021 ◽  
Vol 10 ◽  
Author(s):  
Rémi Allio ◽  
Marie-Ka Tilak ◽  
Celine Scornavacca ◽  
Nico L Avenant ◽  
Andrew C Kitchener ◽  
...  

In a context of ongoing biodiversity erosion, obtaining genomic resources from wildlife is essential for conservation. The thousands of yearly mammalian roadkill provide a useful source material for genomic surveys. To illustrate the potential of this underexploited resource, we used roadkill samples to study the genomic diversity of the bat-eared fox (Otocyon megalotis) and the aardwolf (Proteles cristatus), both having subspecies with similar disjunct distributions in Eastern and Southern Africa. First, we obtained reference genomes with high contiguity and gene completeness by combining Nanopore long reads and Illumina short reads. Then, we showed that the two subspecies of aardwolf might warrant species status (P. cristatus and P. septentrionalis) by comparing their genome-wide genetic differentiation to pairs of well-defined species across Carnivora with a new Genetic Differentiation index (GDi) based on only a few resequenced individuals. Finally, we obtained a genome-scale Carnivora phylogeny including the new aardwolf species.


2020 ◽  
Author(s):  
Ramon Viñas ◽  
Tiago Azevedo ◽  
Eric R. Gamazon ◽  
Pietro Liò

Abstract: A question of fundamental biological significance is to what extent the expression of a subset of genes can be used to recover the full transcriptome, with important implications for biological discovery and clinical application. To address this challenge, we present GAIN-GTEx, a method for gene expression imputation based on Generative Adversarial Imputation Networks. In order to increase the applicability of our approach, we leverage data from GTEx v8, a reference resource that has generated a comprehensive collection of transcriptomes from a diverse set of human tissues. We compare our model to several standard and state-of-the-art imputation methods and show that GAIN-GTEx is significantly superior in terms of predictive performance and runtime. Furthermore, our results indicate strong generalisation on RNA-Seq data from 3 cancer types across varying levels of missingness. Our work can facilitate a cost-effective integration of large-scale RNA biorepositories into genomic studies of disease, with high applicability across diverse tissue types.
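The idea of scoring imputation "across varying levels of missingness" can be illustrated with a small sketch. It is not GAIN-GTEx and uses no GTEx data: the per-gene mean imputer is a deliberately simple stand-in baseline, and the function names (`mask_entries`, `impute_gene_means`, `rmse`), the toy matrix and the missingness rates are all assumptions made for illustration.

```python
import random


def mask_entries(matrix, missing_rate, seed=0):
    """Hide each entry independently with probability `missing_rate`.

    Returns a masked copy (hidden entries become None) and the list of
    hidden (sample, gene) positions.
    """
    rng = random.Random(seed)
    masked = [row[:] for row in matrix]
    hidden = []
    for i, row in enumerate(matrix):
        for j in range(len(row)):
            if rng.random() < missing_rate:
                masked[i][j] = None
                hidden.append((i, j))
    return masked, hidden


def impute_gene_means(masked):
    """Baseline imputer (an assumption, not the paper's model):
    fill each hidden entry with the mean of the observed values of that gene."""
    n_genes = len(masked[0])
    means = []
    for j in range(n_genes):
        observed = [row[j] for row in masked if row[j] is not None]
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return [[means[j] if v is None else v for j, v in enumerate(row)]
            for row in masked]


def rmse(truth, imputed, hidden):
    """Root-mean-square error computed on the hidden entries only."""
    if not hidden:
        return 0.0
    total = sum((truth[i][j] - imputed[i][j]) ** 2 for i, j in hidden)
    return (total / len(hidden)) ** 0.5


if __name__ == "__main__":
    # Toy expression matrix: rows are samples, columns are genes.
    expression = [[1.0, 2.0, 3.0], [1.5, 2.5, 2.0], [0.5, 3.0, 4.0]]
    for rate in (0.1, 0.3, 0.5):
        masked, hidden = mask_entries(expression, rate, seed=42)
        imputed = impute_gene_means(masked)
        print(rate, len(hidden), rmse(expression, imputed, hidden))
```

Scoring only the hidden entries keeps the metric from being inflated by values the imputer was given; any learned model could be swapped in for `impute_gene_means` under the same protocol.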


Author(s):  
S. Rubinacci ◽  
D.M. Ribeiro ◽  
R. Hofmeister ◽  
O. Delaneau

Abstract: Low-coverage whole genome sequencing followed by imputation has been proposed as a cost-effective genotyping approach for disease and population genetics studies. However, its competitiveness against SNP arrays is undermined as current imputation methods are computationally expensive and unable to leverage large reference panels. Here, we describe a method, GLIMPSE, for phasing and imputation of low-coverage sequencing datasets from modern reference panels. We demonstrate its remarkable performance across different coverages and human populations. It achieves imputation of a full genome for less than $1, outperforming existing methods by orders of magnitude, with an increased accuracy of more than 20% at rare variants. We also show that 1x coverage enables effective association studies and is better suited than dense SNP arrays to assess the impact of rare variations. Overall, this study demonstrates the promising potential of low-coverage imputation and suggests a paradigm shift in the design of future genomic studies.


2019 ◽  
Author(s):  
Joshua I Brian ◽  
Simon K Davy ◽  
Shaun P Wilkinson

Coral reefs rely on their intracellular dinoflagellate symbionts (family Symbiodiniaceae) for nutritional provision in nutrient-poor waters, yet this association is threatened by thermally stressful conditions. Despite this, the evolutionary potential of these symbionts remains poorly characterised. In this study, we tested the potential for divergent Symbiodiniaceae types to sexually reproduce (i.e. hybridise) within Cladocopium, the most ecologically prevalent genus in this family. With sequence data from three organelles (cob gene, mitochondria; psbAncr region, chloroplast; and ITS2 region, nucleus), we utilised the Incongruence Length Difference test, Approximately Unbiased test, tree hybridisation analyses and visual inspection of raw data in stepwise fashion to highlight incongruences between organelles, and thus provide evidence of reticulate evolution. Using this approach, we identified three putative hybrid Cladocopium samples among the 158 analysed, at two of the seven sites sampled. These samples were identified as the common Cladocopium types C40 or C1 with respect to the mitochondria and chloroplasts, but the rarer types C3z, C3u and C1# with respect to their nuclear identity. These five Cladocopium types have previously been confirmed as evolutionarily distinct and were also recovered in non-incongruent samples multiple times, which is strongly suggestive that they sexually reproduced to produce the incongruent samples. A concomitant inspection of Next Generation Sequencing data for these samples suggests that other plausible explanations, such as incomplete lineage sorting, are much less likely. The approach taken in this study allows incongruences between gene regions to be identified with confidence, and brings new light to the evolutionary potential within Symbiodiniaceae.


2016 ◽  
Vol 12 (3) ◽  
pp. e1004812 ◽  
Author(s):  
Andrej Kuritzin ◽  
Tabea Kischka ◽  
Jürgen Schmitz ◽  
Gennady Churakov

Mobile DNA ◽  
2019 ◽  
Vol 10 (1) ◽  
Author(s):  
Jerilyn A. Walker ◽  
Vallmer E. Jordan ◽  
Jessica M. Storer ◽  
Cody J. Steely ◽  
...  

Abstract. Background: Baboons (genus Papio) and geladas (Theropithecus gelada) are now generally recognized as close phylogenetic relatives, though morphologically quite distinct and generally classified in separate genera. Primate-specific Alu retrotransposons are well-established genomic markers for the study of phylogenetic and population genetic relationships. We previously reported a computational reconstruction of Papio phylogeny using large-scale whole genome sequence (WGS) analysis of Alu insertion polymorphisms. Recently, high-coverage WGS was generated for Theropithecus gelada. The objective of this study was to apply the high-throughput "poly-Detect" method to computationally determine the number of Alu insertion polymorphisms shared by T. gelada and Papio, and vice versa, by each individual Papio species and T. gelada. Secondly, we performed locus-specific polymerase chain reaction (PCR) assays on a diverse DNA panel to complement the computational data. Results: We identified 27,700 Alu insertions from T. gelada WGS that were also present among six Papio species, with nearly half (12,956) remaining unfixed among 12 Papio individuals. Similarly, each of the six Papio species had species-indicative Alu insertions that were also present in T. gelada. In general, P. kindae shared more insertion polymorphisms with T. gelada than did any of the other five Papio species. PCR-based genotype data provided additional support for the computational findings. Conclusions: Our discovery that several thousand Alu insertion polymorphisms are shared by T. gelada and Papio baboons suggests a much more permeable reproductive barrier between the two genera than previously suspected. Their intertwined evolution likely involves a long history of admixture, gene flow and incomplete lineage sorting.


BMC Genomics ◽  
2019 ◽  
Vol 20 (S10) ◽  
Author(s):  
Tao Tang ◽  
Yuansheng Liu ◽  
Buzhong Zhang ◽  
Benyue Su ◽  
Jinyan Li

Abstract. Background: The rapid development of Next-Generation Sequencing technologies enables genomes to be sequenced at low cost. The dramatically increasing amount of sequencing data has created a pressing need for efficient compression algorithms. Reference-based compression algorithms have exhibited outstanding performance on compressing single genomes. However, for the more challenging and more useful problem of compressing a large collection of n genomes, straightforward application of these reference-based algorithms suffers from a series of issues such as difficult reference selection and remarkable performance variation. Results: We propose an efficient clustering-based reference selection algorithm for reference-based compression within separate clusters of the n genomes. This method clusters the genomes into subsets of highly similar genomes using MinHash sketch distance, and uses the centroid sequence of each cluster as the reference genome for reference-based compression of the remaining genomes in each cluster. A final reference is then selected from these reference genomes for the compression of the remaining reference genomes. Our method significantly improved the performance of state-of-the-art compression algorithms on large-scale human and rice genome databases containing thousands of genome sequences. The compression ratio gain can reach up to 20-30% in most cases for the datasets from NCBI, the 1000 Human Genomes Project and the 3000 Rice Genomes Project. The best improvement boosts the performance from 351.74 compression folds to 443.51 folds. Conclusions: The compression ratio of reference-based compression on large-scale genome datasets can be improved via reference selection by applying appropriate data preprocessing and clustering methods. Our algorithm provides an efficient way to compress large genome databases.
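As a rough illustration of reference selection by MinHash sketch distance, the sketch below builds bottom-s sketches of k-mer hashes, estimates a Jaccard-based distance between them, and picks the sequence with the smallest total distance to the others as the reference. This is not the authors' implementation: the medoid rule stands in for the "centroid sequence", no clustering step is shown, and the function names, the BLAKE2b hash, and the values of `k` and `size` are assumptions chosen for a toy example.

```python
import hashlib


def kmers(seq, k):
    """Set of all overlapping k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def _hash(kmer):
    """Deterministic 64-bit hash of a k-mer (BLAKE2b is an arbitrary choice)."""
    digest = hashlib.blake2b(kmer.encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def sketch(seq, k=4, size=16):
    """Bottom-s MinHash sketch: the `size` smallest distinct k-mer hashes."""
    return sorted(_hash(km) for km in kmers(seq, k))[:size]


def sketch_distance(a, b, size=16):
    """1 minus the Jaccard estimate computed from two bottom-s sketches."""
    merged = sorted(set(a) | set(b))[:size]
    if not merged:
        return 0.0
    shared = len(set(merged) & set(a) & set(b))
    return 1.0 - shared / len(merged)


def pick_reference(sketches):
    """Index of the medoid: the sketch with the smallest summed distance
    to all others (a stand-in for the centroid of one cluster)."""
    n = len(sketches)
    totals = [sum(sketch_distance(sketches[i], sketches[j]) for j in range(n))
              for i in range(n)]
    return min(range(n), key=totals.__getitem__)


if __name__ == "__main__":
    # Hypothetical toy "genomes"; real inputs would be far longer.
    genomes = ["ACGTACGTTGCAACGT", "ACGTACGTTGCAACGA", "GGGGCCCCGGGGCCCC"]
    sketches = [sketch(g) for g in genomes]
    print("reference index:", pick_reference(sketches))
```

Comparing fixed-size sketches rather than full k-mer sets keeps the cost of each pairwise distance independent of genome length, which is what makes all-against-all comparison feasible for large collections.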


PeerJ ◽  
2019 ◽  
Vol 7 ◽  
pp. e7178 ◽  
Author(s):  
Joshua I. Brian ◽  
Simon K. Davy ◽  
Shaun P. Wilkinson

Coral reefs rely on their intracellular dinoflagellate symbionts (family Symbiodiniaceae) for nutritional provision in nutrient-poor waters, yet this association is threatened by thermally stressful conditions. Despite this, the evolutionary potential of these symbionts remains poorly characterised. In this study, we tested the potential for divergent Symbiodiniaceae types to sexually reproduce (i.e. hybridise) within Cladocopium, the most ecologically prevalent genus in this family. With sequence data from three organelles (cob gene, mitochondrion; psbAncr region, chloroplast; and ITS2 region, nucleus), we utilised the Incongruence Length Difference test, Approximately Unbiased test, tree hybridisation analyses and visual inspection of raw data in stepwise fashion to highlight incongruences between organelles, and thus provide evidence of reticulate evolution. Using this approach, we identified three putative hybrid Cladocopium samples among the 158 analysed, at two of the seven sites sampled. These samples were identified as the common Cladocopium types C40 or C1 with respect to the mitochondria and chloroplasts, but the rarer types C3z, C3u and C1# with respect to their nuclear identity. These five Cladocopium types have previously been confirmed as evolutionarily distinct and were also recovered in non-incongruent samples multiple times, which is strongly suggestive that they sexually reproduced to produce the incongruent samples. A concomitant inspection of next generation sequencing data for these samples suggests that other plausible explanations, such as incomplete lineage sorting or the presence of co-dominance, are much less likely. The approach taken in this study allows incongruences between gene regions to be identified with confidence, and brings new light to the evolutionary potential within Symbiodiniaceae.


2021 ◽  
Vol 11 ◽  
Author(s):  
Jinyuan Chen ◽  
Guili Wu ◽  
Nawal Shrestha ◽  
Shuang Wu ◽  
Wei Guo ◽  
...  

Medicago and its relatives, Trigonella and Melilotus, comprise the most important forage resources globally. Alfalfa, selected from the wild relatives, has been cultivated worldwide as the forage queen. In the Flora of China, 15 Medicago, eight Trigonella, and four Melilotus species are recorded, of which six Medicago and two Trigonella species are introduced. Although several studies have investigated the phylogenetic relationships within the three genera, many species that are naturally distributed in or endemic to China were not included in those studies. Therefore, the taxonomic identity and phylogenetic relationships of these species remain unclear. In this study, we collected samples representing 18 out of 19 Chinese naturally distributed species of these three genera and three introduced Medicago species, and applied an integrative approach combining evidence from population-based morphological clusters and molecular data to investigate species boundaries. A total of 186 individuals selected from 156 populations and 454 individuals from 124 populations were collected for genetic and morphological analyses, respectively. We sequenced three commonly used DNA barcodes (trnH-psbA, trnK-matK, and ITS) and one nuclear marker (GA3ox1) for phylogenetic analyses. We found that 16 out of 21 species could be well delimited based on phylogenetic analyses and morphological clusters. Two Trigonella species may be merged as one species or treated as two subspecies, and Medicago falcata should be treated as a subspecies of the M. sativa complex. We further found that major incongruences between the chloroplast and nuclear trees mainly occurred among the deep diverging lineages, which may result from hybridization, incomplete lineage sorting and/or sampling errors. Further studies involving a finer sampling of species associated with large-scale genomic data should be employed to better understand the species delimitation of these three genera.


2017 ◽  
Author(s):  
Jacob Daniel Washburn

Most plants convert sunlight into chemical energy using a process known as C3 photosynthesis. However, some of the world's most successful plants instead use the C4 photosynthetic pathway which allows them to more efficiently use water, nitrogen, and solar energy. In the past 30 million years, C4 photosynthesis has convergently evolved from C3 over 60 times and new lineages are in the process of evolving even today. Because of this complex evolutionary history, C4 is not "one" uniform photosynthetic type, but a diverse collection of photosynthetic sub-types that are classically grouped according to their use of three different biochemical pathways. The grass tribe Paniceae is especially interesting in this aspect because it contains all three of these biochemical sub-types as well as important food and bioenergy crops. To better understand the evolution of C4 photosynthesis, DNA and RNA sequencing were undertaken for various species from within the Paniceae and used for phylogenetic and comparative genomic studies. Cell type specific RNA expression profiling for the two major C4 cell types was also completed for representative species of each C4 sub-type. Streamlined bioinformatics pipelines for both chloroplast and nuclear phylogenetics were developed for processing the data. These analyses resulted in: 1) The first "genome scale" phylogenetic tree of the grass tribe Paniceae, 2) The clearest evidence to date of the evolutionary relationships between the three classically defined C4 sub-types, 3) The most convincing results to date that the chloroplast and nuclear phylogenies of the Paniceae are incongruent, 4) Evidence that this chloroplast nuclear incongruence is likely due to introgression and/or incomplete lineage sorting, and 5) Strong support for sub-type mixing as well as the existence of a PCK sub-type.

