Sorghum Association Panel Whole-Genome Sequencing Establishes Pivotal Resource for Dissecting Genomic Diversity

Association mapping panels represent foundational resources for understanding the genetic basis of phenotypic diversity and serve to advance plant breeding by exploring genetic variation across diverse accessions with distinct histories of evolutionary divergence and local adaptation. We report the whole-genome sequencing (WGS) of 400 sorghum [Sorghum bicolor (L.) Moench] accessions from the Sorghum Association Panel (SAP) at an average coverage of 38X (range 25X–72X), enabling the development of a high-density genomic marker set of 43,983,694 variants, including SNPs (~38 million), indels (~5 million), and CNVs (~170,000). Among indels, deletions were slightly more common than insertions; among copy number variants, deletions were far more prevalent. This new marker set enabled the identification of several putatively novel genomic associations for plant height and tannin content that were not identified with previous lower-density marker sets. WGS identified and scored variants in 5 kb bins where the available genotyping-by-sequencing (GBS) data captured no variants, and half of all bins in the genome fell into this category. The predictive ability of genomic best linear unbiased predictor (GBLUP) models increased by an average of 30% when WGS markers were used rather than GBS markers. We identified 18 selection peaks across subpopulations that formed due to evolutionary divergence during domestication, and we found six Fst peaks in comparisons between converted lines and breeding lines within the SAP that were distinct from the peaks associated with historic selection. This population has served, and continues to serve, as a significant public resource for sorghum research and demonstrates the value of improving upon existing genomic resources.
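The 5 kb bin comparison between WGS and GBS markers can be illustrated with a minimal sketch. It assumes variant positions are available as plain lists of 1-based coordinates for a single chromosome; the function names and the toy positions are hypothetical and are not taken from the study.

```python
BIN_SIZE = 5_000  # 5 kb bins, as described in the abstract


def occupied_bins(positions, bin_size=BIN_SIZE):
    """Return the set of bin indices holding at least one variant (1-based positions)."""
    return {(pos - 1) // bin_size for pos in positions}


def wgs_only_fraction(wgs_positions, gbs_positions, chrom_length, bin_size=BIN_SIZE):
    """Fraction of all bins on a chromosome with WGS variants but no GBS variants."""
    n_bins = -(-chrom_length // bin_size)  # ceiling division
    wgs_bins = occupied_bins(wgs_positions, bin_size)
    gbs_bins = occupied_bins(gbs_positions, bin_size)
    return len(wgs_bins - gbs_bins) / n_bins


# Toy positions (hypothetical, for illustration only)
wgs = [120, 4_900, 5_001, 12_345, 17_000]
gbs = [4_800, 17_500]
fraction = wgs_only_fraction(wgs, gbs, chrom_length=20_000)
print(fraction)
```

In this toy case the chromosome has four bins, WGS covers all of them, and GBS covers only the first and last, so two of the four bins are scored by WGS alone.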

Darwin’s Fancy Revised: An Updated Understanding of the Genomic Constitution of Pigeon Breeds

Genome Biology and Evolution ◽ 10.1093/gbe/evaa027 ◽ 2020 ◽ Vol 12 (3) ◽ pp. 136-150

Author(s): George Pacheco ◽ Hein van Grouw ◽ Michael D Shapiro ◽ Marcus Thomas P Gilbert ◽ Filipe Garrett Vieira

Keyword(s): Whole Genome Sequencing ◽ Genome Sequencing ◽ Phenotypic Diversity ◽ Genotyping By Sequencing ◽ Columba Livia ◽ Whole Genome Sequencing Data ◽ Evolutionary Relationships ◽ Whole Genome ◽ Sequencing Data ◽ Domestic Species

Abstract Through its long history of artificial selection, the rock pigeon (Columba livia Gmelin 1789) was forged into a large number of domestic breeds. The incredible amount of phenotypic diversity exhibited in these breeds has long held the fascination of scholars, particularly those interested in biological inheritance and evolution. However, exploiting them as a model system is challenging, as unlike with many other domestic species, few reliable records exist about the origins of, and relationships between, each of the breeds. Therefore, in order to broaden our understanding of the complex evolutionary relationships among pigeon breeds, we generated genome-wide data by performing the genotyping-by-sequencing (GBS) method on close to 200 domestic individuals representing over 60 breeds. We analyzed these GBS data alongside previously published whole-genome sequencing data, and this combined analysis allowed us to conduct the most extensive phylogenetic analysis of the group, including two feral pigeons and one outgroup. We improve previous phylogenies, find considerable population structure across the different breeds, and identify unreported interbreed admixture events. Despite the reduced number of loci relative to whole-genome sequencing, we demonstrate that GBS data provide sufficient analytical power to investigate intertwined evolutionary relationships, such as those that are characteristic of animal domestic breeds. Thus, we argue that future studies should consider sequencing methods akin to the GBS approach as an optimal cost-effective approach for addressing complex phylogenies.

Assessing genomic diversity and signatures of selection in Jiaxian Red cattle using whole-genome sequencing data

BMC Genomics ◽ 10.1186/s12864-020-07340-0 ◽ 2021 ◽ Vol 22 (1)

Author(s): Xiaoting Xia ◽ Shunjin Zhang ◽ Huaju Zhang ◽ Zijing Zhang ◽ Ningbo Chen ◽ ...

Keyword(s): Population Structure ◽ Whole Genome Sequencing ◽ Genome Sequencing ◽ Genomic Variation ◽ Genomic Diversity ◽ System Response ◽ Whole Genome ◽ Population Structure Analysis ◽ Native Cattle ◽ Genomic Regions

Abstract

Background: Native cattle breeds are an important source of genetic variation because they might carry alleles that enable them to adapt to local environments and harsh feeding conditions. Jiaxian Red, a Chinese native cattle breed, is reported to have originated from crossbreeding between taurine and indicine cattle; their history as a draft and meat animal dates back at least 30 years. Using whole-genome sequencing (WGS) data of 30 animals from the core breeding farm, we investigated the genetic diversity, population structure and genomic regions under selection of Jiaxian Red cattle. Furthermore, we used 131 published genomes of world-wide cattle to characterize the genomic variation of Jiaxian Red cattle.

Results: The population structure analysis revealed that Jiaxian Red cattle harboured ancestry from East Asian taurine (0.493), Chinese indicine (0.379), European taurine (0.095) and Indian indicine (0.033). Three methods (nucleotide diversity, linkage disequilibrium decay and runs of homozygosity) indicated relatively high genomic diversity in Jiaxian Red cattle. We used θπ, CLR, FST and XP-EHH methods to look for candidate signatures of positive selection in Jiaxian Red cattle. A total of 171 (θπ and CLR) and 17 (FST and XP-EHH) shared genes were identified using the different detection strategies. Functional annotation analysis revealed that these genes are potentially responsible for growth and feed efficiency (CCSER1), meat quality traits (ROCK2, PPP1R12A, CYB5R4, EYA3, PHACTR1), fertility (RFX4, SRD5A2) and immune system response (SLAMF1, CD84 and SLAMF6).

Conclusion: We provide a comprehensive overview of sequence variations in Jiaxian Red cattle genomes. Selection signatures were detected in genomic regions that are possibly related to economically important traits in Jiaxian Red cattle. We observed a high level of genomic diversity and low inbreeding in Jiaxian Red cattle. These results provide a basis for further resource protection and breeding improvement of this breed.
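Nucleotide diversity (θπ), one of the diversity measures named in this abstract, is the average number of pairwise differences per site among sampled sequences. A minimal sketch follows, assuming haplotypes are given as aligned, equal-length strings; the function name and toy haplotypes are hypothetical, and real analyses typically work from VCF files in sliding windows rather than from raw strings.

```python
from itertools import combinations


def nucleotide_diversity(haplotypes):
    """Average pairwise differences per site among aligned, equal-length haplotypes."""
    length = len(haplotypes[0])
    if any(len(h) != length for h in haplotypes):
        raise ValueError("haplotypes must be aligned to the same length")
    pairs = list(combinations(haplotypes, 2))
    total_diffs = sum(
        sum(a != b for a, b in zip(h1, h2))  # mismatches between one pair
        for h1, h2 in pairs
    )
    return total_diffs / (len(pairs) * length)


# Toy alignment (hypothetical): 4 haplotypes, 10 sites
haps = ["ACGTACGTAC", "ACGTACGTAC", "ACGAACGTAC", "ACGAACGTTC"]
pi = nucleotide_diversity(haps)
print(round(pi, 4))
```

Here the six haplotype pairs differ at 7 sites in total, so θπ = 7 / (6 × 10).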

AFLAP: Assembly-Free Linkage Analysis Pipeline using k-mers from whole genome sequencing data

10.1101/2020.09.14.296525 ◽ 2020

Author(s): Kyle Fletcher ◽ Lin Zhang ◽ Juliana Gil ◽ Rongkui Han ◽ Keri Cavanaugh ◽ ...

Keyword(s): Linkage Analysis ◽ Whole Genome Sequencing ◽ Genome Sequencing ◽ Genetic Map ◽ Genotyping By Sequencing ◽ Genetic Maps ◽ Whole Genome ◽ Sequencing Data ◽ Analysis Pipeline ◽ Genome Assemblies

Abstract

Background: Genetic maps are an important resource for validation of genome assemblies, trait discovery, and breeding. Next generation sequencing has enabled production of high-density genetic maps constructed with 10,000s of markers. Most current approaches require a genome assembly to identify markers. Our Assembly Free Linkage Analysis Pipeline (AFLAP) removes this requirement by using uniquely segregating k-mers as markers to rapidly construct a genotype table and perform subsequent linkage analysis. This avoids potential biases including preferential read alignment and variant calling.

Results: The performance of AFLAP was determined in simulations and contrasted to a conventional workflow. We tested AFLAP using 100 F2 individuals of Arabidopsis thaliana, sequenced to low coverage. Genetic maps generated using k-mers contained over 130,000 markers that were concordant with the genomic assembly. The utility of AFLAP was then demonstrated by generating an accurate genetic map using genotyping-by-sequencing data of 235 recombinant inbred lines of Lactuca spp. AFLAP was then applied to 83 F1 individuals of the oomycete Bremia lactucae, sequenced to >5x coverage. The genetic map contained over 90,000 markers ordered in 19 large linkage groups. This genetic map was used to fragment, order, orient, and scaffold the genome, resulting in a much-improved reference assembly.

Conclusions: AFLAP can be used to generate high density linkage maps and improve genome assemblies of any organism when a mapping population is available using whole genome sequencing or genotyping-by-sequencing data. Genetic maps produced for B. lactucae were accurately aligned to the genome and guided significant improvements of the reference assembly.
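The idea of using segregating k-mers as assembly-free markers can be sketched in a few lines. This is a toy illustration only, not AFLAP's implementation: it assumes error-free reads, ignores reverse complements and k-mer count thresholds, and all function names and sequences are hypothetical.

```python
def kmer_set(reads, k):
    """Collect every k-mer occurring in a list of read strings."""
    found = set()
    for read in reads:
        for i in range(len(read) - k + 1):
            found.add(read[i:i + k])
    return found


def segregating_markers(parent_a_reads, parent_b_reads, k):
    """k-mers present in parent A but absent from parent B (candidate markers)."""
    return kmer_set(parent_a_reads, k) - kmer_set(parent_b_reads, k)


def genotype_table(markers, progeny_reads, k):
    """Presence (1) or absence (0) of each marker k-mer in each progeny."""
    table = {}
    for name, reads in progeny_reads.items():
        present = kmer_set(reads, k)
        table[name] = {m: int(m in present) for m in sorted(markers)}
    return table


# Toy reads (hypothetical): the parents differ at a single base
markers = segregating_markers(["ACGTAC"], ["ACGTTC"], k=4)
table = genotype_table(markers, {"p1": ["ACGTAC"], "p2": ["ACGTTC"]}, k=4)
print(sorted(markers))
print(table)
```

A presence/absence table of this kind is the sort of input a linkage-analysis step would consume; a production pipeline would additionally need canonical k-mers and coverage-based filtering to cope with sequencing errors.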

Genomic Surveillance and Phylodynamic Analyses Reveal the Emergence of Novel Mutations and Co-mutation Patterns Within SARS-CoV-2 Variants Prevalent in India

Frontiers in Microbiology ◽ 10.3389/fmicb.2021.703933 ◽ 2021 ◽ Vol 12

Author(s): Nupur Biswas ◽ Priyanka Mallick ◽ Sujay Krishna Maity ◽ Debaleena Bhowmik ◽ Arpita Ghosh Mitra ◽ ...

Keyword(s): Whole Genome Sequencing ◽ Genome Sequencing ◽ Time Course ◽ West Bengal ◽ Genomic Diversity ◽ Viral Population ◽ Whole Genome ◽ Novel Mutations ◽ Frequent Mutations ◽ Time Periods

Identification of the genomic diversity and the phylodynamic profiles of prevalent variants is critical to understand the evolution and spread of SARS-CoV-2 variants. We performed whole-genome sequencing of 54 SARS-CoV-2 variants collected from COVID-19 patients in Kolkata, West Bengal during August–October 2020. Phylogeographic and phylodynamic analyses were performed using these 54 and other sequences from India and abroad that are available in the GISAID database. We estimated the clade dynamics of the Indian variants and compared the clade-specific mutations and the co-mutation patterns across states and union territories of India over the time course. Frequent mutations and co-mutations observed within the major clades across time periods do not show much overlap, indicating the emergence of newer mutations in the viral population prevailing in the country. Furthermore, we explored the possible association of specific mutations and co-mutations with the infection outcomes manifested in Indian patients.

Assessing genomic diversity and selective pressures in Bashan cattle by whole-genome sequencing data

Animal Biotechnology ◽ 10.1080/10495398.2021.1998094 ◽ 2021 ◽ pp. 1-12

Author(s): Luyang Sun ◽ Kaixing Qu ◽ Yangkai Liu ◽ Xiaohui Ma ◽ Ningbo Chen ◽ ...

Keyword(s): Whole Genome Sequencing ◽ Genome Sequencing ◽ Genomic Diversity ◽ Whole Genome Sequencing Data ◽ Whole Genome ◽ Sequencing Data ◽ Selective Pressures

Whole-Genome Sequencing Reveals a Prolonged and Persistent Intrahospital Transmission of Corynebacterium striatum, an Emerging Multidrug-Resistant Pathogen

Journal of Clinical Microbiology ◽ 10.1128/jcm.00683-19 ◽ 2019 ◽ Vol 57 (9) ◽ Cited By ~ 4

Author(s): Xuebing Wang ◽ Haijian Zhou ◽ Dongke Chen ◽ Pengcheng Du ◽ Ruiting Lan ◽ ...

Keyword(s): Whole Genome Sequencing ◽ Genome Sequencing ◽ Resistance Genes ◽ Antimicrobial Agents ◽ Chronically Ill ◽ Multidrug Resistant ◽ Resistance Rate ◽ Genomic Diversity ◽ Whole Genome ◽ Corynebacterium Striatum

ABSTRACT Corynebacterium striatum is an emerging multidrug-resistant (MDR) pathogen that occurs primarily among immunocompromised and chronically ill patients. However, little is known about the genomic diversity of C. striatum, which contributes to its long-term persistence and transmission in hospitals. In this study, a total of 192 C. striatum isolates obtained from 14 September 2017 to 29 March 2018 in a hospital in Beijing, China, were analyzed by antimicrobial susceptibility testing and pulsed-field gel electrophoresis (PFGE). Whole-genome sequencing was conducted on 91 isolates. Nearly all isolates (96.3%, 183/190) were MDR. The highest resistance rate was observed for ciprofloxacin (99.0%, 190/192), followed by cefotaxime (90.6%, 174/192) and erythromycin (89.1%, 171/192). PFGE separated the 192 isolates into 79 pulsotypes, and differences in core genome single-nucleotide polymorphisms (SNPs) partitioned the 91 isolates sequenced into four clades. Isolates of the same pulsotype were identical or nearly identical at the genome level, with some exceptions. Two dominant subclones, clade 3a, and clade 4a, were responsible for the hospital-wide dissemination. Genomic analysis further revealed nine resistance genes mobilized by eight unique cassettes. PFGE and whole-genome sequencing revealed that the C. striatum isolates studied were the result mainly of predominant clones spreading in the hospital. C. striatum isolates in the hospital progressively acquired resistance to antimicrobial agents, demonstrating that isolates of C. striatum may adapt rapidly through the acquisition and accumulation of resistance genes and thus evolve into dominant and persistent clones. These insights will be useful for the prevention of C. striatum infection in hospitals.

Pipeline for the Rapid Development of Cytogenetic Markers Using Genomic Data of Related Species

Genes ◽ 10.3390/genes10020113 ◽ 2019 ◽ Vol 10 (2) ◽ pp. 113 ◽ Cited By ~ 2

Author(s): Pavel Kroupin ◽ Victoria Kuznetsova ◽ Dmitry Romanov ◽ Alina Kocheshkova ◽ Gennady Karlov ◽ ...

Keyword(s): Whole Genome Sequencing ◽ Genome Sequencing ◽ Genome Sequence ◽ Related Species ◽ Evolutionary Divergence ◽ Whole Genome ◽ Target Genome ◽ Target Species ◽ Closely Related Species ◽ Chromosome Markers

Repetitive DNA, including tandem repeats (TRs), is a significant part of most eukaryotic genomes. TRs include rapidly evolving satellite DNA (satDNA) that can be shared by closely related species; their abundance may be associated with evolutionary divergence, and they have been widely used for chromosome karyotyping using fluorescence in situ hybridization (FISH). Recent progress in whole-genome sequencing and bioinformatics tools enables rapid and cost-effective searches for TRs, including satDNA, that can be converted into molecular cytogenetic markers. In the case of closely related taxa, the genome sequence of one species (donor) can be used as a base for the development of chromosome markers for related species or genomes (target). Here, we present a pipeline for rapid and high-throughput screening for new satDNA TRs in whole-genome sequencing data of the donor genome and for the development of chromosome markers based on them that can be applied in the target genome. A distinguishing feature of the pipeline is a preliminary estimation of TR abundance by qPCR, with the TRs found ranked by their copy number in the target genome; this facilitates the selection of the most promising (most abundant) TRs for conversion into cytogenetic markers. Another feature of the pipeline is that FISH probes are prepared by PCR, using primers designed on the aligned TR unit sequences and the genomic DNA of a target species as a template, which enables amplification of the whole pool of monomers present in the chromosomes of the target species. We demonstrate the efficiency of the pipeline with FISH probes developed for A, B, and R subgenome chromosomes of hexaploid triticale (BBAARR), based on a bioinformatics analysis of the whole-genome sequence of the D genome of Aegilops tauschii (DD). Our pipeline can be used to develop chromosome markers in closely related species for comparative cytogenetics in evolutionary and breeding studies.

Assessing genomic diversity and signatures of selection in Original Braunvieh cattle using whole-genome sequencing data

10.1101/703439 ◽ 2019 ◽ Cited By ~ 1

Author(s): Meenu Bhati ◽ Naveen Kumar Kadri ◽ Danang Crysnanto ◽ Hubert Pausch

Keyword(s): Whole Genome Sequencing ◽ Meat Quality ◽ Genome Sequencing ◽ Genomic Diversity ◽ Whole Genome ◽ Runs Of Homozygosity ◽ Cattle Breeds ◽ Signatures Of Selection ◽ Genomic Inbreeding ◽ Genomic Regions

Abstract

Background: Autochthonous cattle breeds represent an important source of genetic variation because they might carry alleles that enable them to adapt to local environment and food conditions. Original Braunvieh (OB) is a local cattle breed of Switzerland used for beef and milk production in alpine areas. Using whole-genome sequencing (WGS) data of 49 key ancestors, we characterize genomic diversity, genomic inbreeding, and signatures of selection in Swiss OB cattle at nucleotide resolution.

Results: We annotated 15,722,811 SNPs and 1,580,878 indels, including 10,738 and 2,763 missense deleterious and high impact variants, respectively, that were discovered in 49 OB key ancestors. Six Mendelian trait-associated variants that were previously detected in breeds other than OB segregated in the sequenced key ancestors, including variants causal for recessive xanthinuria and albinism. The average nucleotide diversity (1.6 × 10⁻³) was higher in OB than in many mainstream European cattle breeds. Accordingly, the average genomic inbreeding quantified using runs of homozygosity (ROH) was relatively low (F_ROH = 0.14) in the 49 OB key ancestor animals. However, genomic inbreeding was higher in more recent generations of OB cattle (F_ROH = 0.16) due to a higher number of long (> 1 Mb) runs of homozygosity. Using two complementary approaches, the composite likelihood ratio test and the integrated haplotype score, we identified 95 and 162 genomic regions encompassing 136 and 157 protein-coding genes, respectively, that showed evidence (P < 0.005) of past and ongoing selection. These selection signals were enriched for quantitative trait loci related to beef traits including meat quality, feed efficiency and body weight, and for pathways related to blood coagulation and nervous and sensory stimulus.

Conclusions: We provide a comprehensive overview of sequence variation in Swiss OB cattle genomes. With WGS data, we observe higher genomic diversity and less inbreeding in OB than in many European mainstream cattle breeds. Footprints of selection were detected in genomic regions that are possibly relevant for meat quality and adaptation to local environmental conditions. Considering that the population size is low and genomic inbreeding increased in the past generations, the implementation and adoption of optimal mating strategies seems warranted to maintain genetic diversity in the Swiss OB cattle population.
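Genomic inbreeding from runs of homozygosity is commonly computed as the fraction of the autosomal genome covered by ROH. A minimal sketch follows, assuming ROH segments have already been called and are given as (start, end) coordinates in base pairs; the function name, the optional length filter, and the toy segments are hypothetical and do not reproduce the study's values.

```python
def f_roh(roh_segments, autosome_length_bp, min_length_bp=0):
    """Fraction of the autosomal genome covered by ROH of at least min_length_bp."""
    covered = sum(
        end - start
        for start, end in roh_segments
        if end - start >= min_length_bp
    )
    return covered / autosome_length_bp


# Toy ROH calls for one animal (hypothetical coordinates, in bp)
segments = [(0, 2_000_000), (5_000_000, 5_500_000), (10_000_000, 13_000_000)]
genome = 50_000_000

all_roh = f_roh(segments, genome)
long_roh = f_roh(segments, genome, min_length_bp=1_000_000)
print(all_roh, long_roh)
```

Restricting the sum to long segments (here, at least 1 Mb) isolates the contribution of long ROH, which the abstract links to the higher inbreeding seen in recent generations.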

Genomic Diversity of the Ostreid Herpesvirus Type 1 Across Time and Location and Among Host Species

Frontiers in Microbiology ◽ 10.3389/fmicb.2021.711377 ◽ 2021 ◽ Vol 12

Author(s): Benjamin Morga ◽ Maude Jacquot ◽ Camille Pelletier ◽ Germain Chevignon ◽ Lionel Dégremont ◽ ...

Keyword(s): Whole Genome Sequencing ◽ Genome Sequencing ◽ Phylogenetic Analyses ◽ Genomic Diversity ◽ Microbial Pathogens ◽ Whole Genome ◽ Site Mutation ◽ Herpesvirus Type ◽ New Variant ◽ Genomic Regions

The mechanisms underlying virus emergence are rarely well understood, making the appearance of outbreaks largely unpredictable. This is particularly true for pathogens with low per-site mutation rates, such as DNA viruses, that do not exhibit a large amount of evolutionary change among genetic sequences sampled at different time points. However, whole-genome sequencing can reveal the accumulation of novel genetic variation between samples, promising to render most, if not all, microbial pathogens measurably evolving and suitable for analytical techniques derived from population genetic theory. Here, we aim to assess the measurability of evolution on epidemiological time scales of the Ostreid herpesvirus 1 (OsHV-1), a double-stranded DNA virus of which a new variant, OsHV-1 μVar, emerged in France in 2008, spreading across Europe and causing dramatic economic and ecological damage. We performed phylogenetic analyses of 21 heterochronous OsHV-1 genomes sampled worldwide. The results show sufficient temporal signal in the viral sequences to proceed with phylogenetic molecular clock analyses, and they indicate that the genetic diversity seen in these OsHV-1 isolates has arisen within the past three decades. OsHV-1 samples from France and New Zealand did not cluster together, suggesting spatial structuring of the viral populations. The genome-wide study of simple and complex polymorphisms shows that specific genomic regions are deleted in several isolates or accumulate a high number of substitutions. These contrasting and non-random patterns of polymorphism suggest that some genomic regions are affected by strong selective pressures. Interestingly, we also found variant genotypes within all infected individuals. Altogether, these results provide baseline evidence that whole-genome sequencing could be used to study population dynamic processes of OsHV-1 and, more broadly, of herpesviruses.
