Population Genomics of Filamentous Plant Pathogens—A Brief Overview of Research Questions, Approaches, and Pitfalls

2020 ◽  
pp. PHYTO-11-20-052
Author(s):  
Sydney Everhart ◽  
Nikita Gambhir ◽  
Remco Stam

With ever-decreasing sequencing costs, research on the population biology of plant pathogens is transitioning from population genetics—using dozens of genetic markers or polymorphism data of several genes—to population genomics—using several hundred to tens of thousands of markers or whole-genome sequence data. The field of population genomics is characterized by rapid theoretical and methodological advances and by numerous steps and pitfalls in its technical and analytical workflow. In this article, we aim to provide a brief overview of topics relevant to the study of population genomics of filamentous plant pathogens and direct readers to more extensive reviews for in-depth understanding. We briefly discuss different types of population genomics-inspired research questions and give insights into the sampling strategies that can be used to answer such questions. We then consider different sequencing strategies, the various options available for data processing, and some of the currently available tools for population genomic data analysis. We conclude by highlighting some of the hurdles along the population genomic workflow, providing cautionary warnings relative to assumptions and technical challenges, and presenting our own future perspectives of the field of population genomics for filamentous plant pathogens.

2020 ◽  
Vol 33 (8) ◽  
pp. 1022-1024
Author(s):  
Giovanni Cafà ◽  
Thaís Regina Boufleur ◽  
Renata Rebellato Linhares de Castro ◽  
Nelson Sidnei Massola ◽  
Riccardo Baroncelli

The genus Stagonosporopsis is classified within the family Didymellaceae and has around 40 associated species. Several of these are important plant pathogens responsible for significant losses in economically important crops worldwide. Stagonosporopsis vannaccii is a newly described species pathogenic to soybean. Here, we present the draft whole-genome sequence, gene prediction, and annotation of S. vannaccii isolate LFN0148 (also known as IMI 507030). To our knowledge, this is the first sequenced genome of this species, and it represents a useful new resource for future fungal comparative genomics studies.


2019 ◽  
Vol 47 (W1) ◽  
pp. W283-W288 ◽  
Author(s):  
Jesús Murga-Moreno ◽  
Marta Coronado-Zamora ◽  
Sergi Hervas ◽  
Sònia Casillas ◽  
Antonio Barbadilla

Abstract The McDonald and Kreitman test (MKT) is one of the most powerful and widely used methods to detect and quantify recurrent natural selection using DNA sequence data. Here we present iMKT (acronym for integrative McDonald and Kreitman test), a novel web-based service performing four distinct MKT types. It allows the detection and estimation of four different selection regimes (adaptive, neutral, strongly deleterious and weakly deleterious) acting on any genomic sequence. iMKT can analyze both users' own population genomic data and pre-loaded Drosophila melanogaster and human sequences of protein-coding genes obtained from the largest population genomic datasets to date. Advanced options in the website allow testing complex hypotheses such as the application example shown here: do genes located in high recombination regions undergo higher rates of adaptation? We aim for iMKT to become a reference tool for the study of evolutionary adaptation in massive population genomics datasets, especially in Drosophila and humans. iMKT is a free resource online at https://imkt.uab.cat.
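As a rough illustration of what a standard MKT computes, the sketch below derives the proportion of adaptive substitutions (alpha) and the neutrality index from four counts: nonsynonymous and synonymous divergence (Dn, Ds) and nonsynonymous and synonymous polymorphism (Pn, Ps). This is only the classic form of the test, not iMKT's implementation, and it does not cover the other MKT variants the service offers. The function names and the example counts are hypothetical.

```python
def mkt_alpha(dn: int, ds: int, pn: int, ps: int) -> float:
    """Proportion of adaptive nonsynonymous substitutions.

    Standard estimator: alpha = 1 - (Ds * Pn) / (Dn * Ps).
    """
    if dn == 0 or ps == 0:
        raise ValueError("Dn and Ps must be non-zero for alpha to be defined")
    return 1.0 - (ds * pn) / (dn * ps)


def neutrality_index(dn: int, ds: int, pn: int, ps: int) -> float:
    """Neutrality index: NI = (Pn / Ps) / (Dn / Ds); NI < 1 suggests adaptation."""
    if dn == 0 or ds == 0 or ps == 0:
        raise ValueError("Dn, Ds and Ps must be non-zero for NI to be defined")
    return (pn / ps) / (dn / ds)


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
example_alpha = mkt_alpha(dn=20, ds=40, pn=10, ps=40)
example_ni = neutrality_index(dn=20, ds=40, pn=10, ps=40)
print(example_alpha, example_ni)
```

With these made-up counts, the ratio of nonsynonymous to synonymous changes is twice as high for divergence as for polymorphism, so half of the nonsynonymous substitutions are attributed to adaptation under the test's assumptions.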


Hereditas ◽  
2020 ◽  
Vol 157 (1) ◽  
Author(s):  
Ziqing Pan ◽  
Shuhua Xu

Abstract East Asia constitutes one-fifth of the global population and exhibits substantial genetic diversity. However, genetic investigations on populations in this region have been largely under-represented compared with European populations. Nonetheless, the last decade has seen considerable efforts and progress in genome-wide genotyping and whole-genome sequencing of the East Asian ethnic groups. Here, we review the recent studies in terms of ancestral origin, population relationship, genetic differentiation, and admixture of major East Asian groups, such as the Chinese, Korean, and Japanese populations. We mainly focus on insights from the whole-genome sequence data and also include the recent progress based on mitochondrial DNA (mtDNA) and Y chromosome data. We further discuss the evolutionary forces driving genetic diversity in East Asian populations, and provide our perspectives for future directions on population genetics studies, particularly on underrepresented indigenous groups in East Asia.


Author(s):  
Gustavo A. Bravo ◽  
C. Jonathan Schmitt ◽  
Scott V. Edwards

The increased capacity of DNA sequencing has significantly advanced our understanding of the phylogeny of birds and the proximate and ultimate mechanisms molding their genomic diversity. In less than a decade, the number of available avian reference genomes has increased to over 500—approximately 5% of bird diversity—placing birds in a privileged position to advance the fields of phylogenomics and comparative, functional, and population genomics. Whole-genome sequence data, as well as indels and rare genomic changes, are further resolving the avian tree of life. The accumulation of bird genomes, increasingly with long-read sequence data, greatly improves the resolution of genomic features such as germline-restricted chromosomes and the W chromosome, and is facilitating the comparative integration of genotypes and phenotypes. Community-based initiatives such as the Bird 10,000 Genomes Project and Vertebrate Genome Project are playing a fundamental role in amplifying and coalescing a vibrant international program in avian comparative genomics. Expected final online publication date for the Annual Review of Ecology, Evolution, and Systematics, Volume 52 is November 2021. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.


2021 ◽  
Author(s):  
Claire Oget-Ebrad ◽  
Naveen Kumar Kadri ◽  
Gabriel Costa Monteiro Moreira ◽  
Latifa Karim ◽  
Wouter Coppieters ◽  
...  

Background: Accurate haplotype reconstruction is required in many applications in quantitative and population genomics. Different phasing methods are available but their accuracy must be evaluated for samples with different properties (population structure, marker density, etc.). We herein took advantage of whole-genome sequence data available for a Holstein cattle pedigree containing 264 individuals, including 98 trios, to evaluate several population-based phasing methods. This data represents a typical example of a livestock population, with low effective population size, high levels of relatedness and long-range linkage disequilibrium. Results: After stringent filtering of our sequence data, we evaluated several population-based phasing programs including one or more versions of AlphaPhase, ShapeIT, Beagle, Eagle and FImpute. To that end we used 98 individuals having both parents sequenced for validation. Their haplotypes reconstructed based on Mendelian segregation rules were considered the gold standard to assess the performance of population-based methods in two scenarios. In the first one, only these 98 individuals were phased, while in the second one, all the 264 sequenced individuals were phased simultaneously, ignoring the pedigree relationships. We assessed phasing accuracy based on switch error counts (SEC) and rates (SER), lengths of correctly phased haplotypes and pairwise SNP phasing accuracies (the probability that a pair of SNPs is correctly phased as a function of their distance). For most evaluated metrics or scenarios, the best software was either ShapeIT4.1 or Beagle5.2, both methods resulting in particularly high phasing accuracies. For instance, ShapeIT4.1 achieved a median SEC of 50 per individual and a mean haplotype block length of 24.1 Mb in the second scenario. 
These statistics are remarkable since the methods were evaluated with a map of 8,400,000 SNPs, and this corresponds to only one switch error every 40,000 phased informative markers. When more relatives were included in the data, FImpute3.0 reconstructed extremely long segments without errors. Conclusions: We report extremely high phasing accuracies in a typical livestock sample of 100 sequenced individuals. ShapeIT4.1 and Beagle5.2 proved to be the most accurate, particularly for phasing long segments. Nevertheless, most tools achieved high accuracy at short distances and would be suitable for applications requiring only local haplotypes.
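The switch error metrics mentioned above can be sketched in a few lines. This is a minimal illustration, assuming each haplotype is given as a string of 0/1 alleles carried by one of the two haplotypes at the individual's heterozygous sites only; it is not the evaluation code used in the study, and the function names and toy haplotypes are hypothetical.

```python
def count_switch_errors(true_hap: str, inferred_hap: str) -> int:
    """Count switch errors between a reference and an inferred haplotype.

    Both strings list the allele (0 or 1) on one haplotype at consecutive
    heterozygous sites. A switch error is counted each time the inferred
    haplotype changes from agreeing with the truth to disagreeing, or back.
    """
    if len(true_hap) != len(inferred_hap):
        raise ValueError("haplotypes must cover the same sites")
    agrees = [t == i for t, i in zip(true_hap, inferred_hap)]
    return sum(1 for prev, cur in zip(agrees, agrees[1:]) if prev != cur)


def switch_error_rate(true_hap: str, inferred_hap: str) -> float:
    """Switch errors divided by the number of adjacent heterozygous site pairs."""
    if len(true_hap) < 2:
        raise ValueError("need at least two heterozygous sites")
    return count_switch_errors(true_hap, inferred_hap) / (len(true_hap) - 1)


# Toy example: the inferred phase flips once after the fourth site.
truth = "0101010"
inferred = "0101101"
print(count_switch_errors(truth, inferred), switch_error_rate(truth, inferred))
```

Note that a single mis-phased site in the middle of an otherwise correct haplotype counts as two switch errors under this definition, because the phase flips and then flips back.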


2018 ◽  
Vol 3 ◽  
pp. 124 ◽  
Author(s):  
Keith A. Jolley ◽  
James E. Bray ◽  
Martin C. J. Maiden

The PubMLST.org website hosts a collection of open-access, curated databases that integrate population sequence data with provenance and phenotype information for over 100 different microbial species and genera. Although the PubMLST website was conceived as part of the development of the first multi-locus sequence typing (MLST) scheme in 1998, the software it uses, the Bacterial Isolate Genome Sequence database (BIGSdb, published in 2010), enables PubMLST to include all levels of sequence data, from single gene sequences up to and including complete, finished genomes. Here we describe developments in the BIGSdb software made from publication to June 2018 and show how the platform realises microbial population genomics for a wide range of applications. The system is based on the gene-by-gene analysis of microbial genomes, with each deposited sequence annotated and curated to identify the genes present and systematically catalogue their variation. Originally intended as a means of characterising isolates with typing schemes, the synthesis of sequences and records of genetic variation with provenance and phenotype data permits highly scalable (whole genome sequence data for tens of thousands of isolates) means of addressing a wide range of functional questions, including: the prediction of antimicrobial resistance; likely cross-reactivity with vaccine antigens; and the functional activities of different variants that lead to key phenotypes. There are no limitations to the number of sequences, genetic loci, allelic variants or schemes (combinations of loci) that can be included, enabling each database to represent an expanding catalogue of the genetic variation of the population in question.
In addition to providing web-accessible analyses and links to third-party analysis and visualisation tools, the BIGSdb software includes a RESTful application programming interface (API) that enables access to all the underlying data for third-party applications and data analysis pipelines.


2020 ◽  
Author(s):  
J. P. Hereward ◽  
X.H. Cai ◽  
A. M. Matias ◽  
G. H. Walter ◽  
C. X. Xu ◽  
...  

Abstract Brown planthoppers (Nilaparvata lugens) are the most serious insect pests of rice, one of the world’s most important staple crops. They reproduce year-round in the tropical parts of their distribution, but cannot overwinter in the temperate areas where they occur, and invade seasonally from elsewhere. Decades of research have not revealed their source unambiguously. We therefore sequenced the genomes of brown planthopper populations from across temperate and tropical parts of their distribution and show that the Indochinese peninsula is the major source of migration into temperate China. The Philippines, once considered a key source, is not significant, with little evidence for their migration into China. We find support for immigration from the west of China contributing to these regional dynamics. The lack of connectivity between the Philippines and mainland China explains the different evolution of imidacloprid resistance in these populations. This study highlights the promise of whole genome sequence data to understand migration when gene flow is high, a situation that has been difficult to resolve using traditional genetic markers.


Author(s):  
Rute da Fonseca ◽  
Paula Campos ◽  
Alba Rey de la Iglesia ◽  
Gustavo Barroso ◽  
Lucie Bergeron ◽  
...  

Whole genome sequence data is an ideal tool for characterizing processes in ecology and evolution. Despite falling sequencing costs, it can be challenging to produce a genome and high-coverage resequencing data for a non-model species. New population genomics data analysis pipelines based on genotype likelihoods allow for a significant reduction in cost by efficiently extracting information from low-coverage sequence data. We demonstrate the robustness of such approaches with a genomic data set consisting of two draft genomes of the European sardine (Sardina pilchardus, Walbaum 1792), and resequencing data (~1.5× depth) for 78 individuals from 12 sampling locations across the 5,000 km of the species’ distribution range (from the Eastern Mediterranean to the archipelagos of Madeira and Azores). Our results clearly show at least three genetic clusters. One includes individuals from Azores and Madeira (two archipelagos in the Atlantic), the second corresponds to Iberia (the center of the sampling distribution), and the third gathers the Mediterranean samples and those from the Canary Islands. This suggests at least two important barriers to gene flow, even though these do not seem complete, with individuals from Iberia showing some degree of admixture. These results, together with the genetic resources generated for this commercially important taxon, provide a baseline for further studies aiming at identifying the nature of these barriers between sardine populations, and information for transnational stock management of this highly exploited species towards sustainable fisheries.
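To illustrate why genotype likelihoods suit low-coverage data, the sketch below computes log-likelihoods for the three diploid genotypes at a biallelic site from a handful of reads. It assumes a simplified, commonly used error model in which each read is drawn from either chromosome with equal probability and a sequencing error turns the true base into any of the other three bases with equal probability. The abstract does not specify the model used in the study, so this should be read as a generic example; the function name and inputs are hypothetical.

```python
import math


def genotype_log_likelihoods(bases, error_rates, ref, alt):
    """Log-likelihoods of the reads for genotypes with 0, 1 and 2 alt alleles.

    bases       -- observed base in each read covering the site
    error_rates -- per-read probability that the base call is wrong
    ref, alt    -- the two alleles assumed to segregate at the site
    """
    log_liks = []
    for alt_copies in (0, 1, 2):
        alt_fraction = alt_copies / 2  # chance a read comes from an alt chromosome
        total = 0.0
        for base, err in zip(bases, error_rates):
            # Probability of seeing this base if the read's chromosome carries ref/alt.
            p_given_ref = 1 - err if base == ref else err / 3
            p_given_alt = 1 - err if base == alt else err / 3
            total += math.log((1 - alt_fraction) * p_given_ref
                              + alt_fraction * p_given_alt)
        log_liks.append(total)
    return log_liks


# One read showing the reference base: all three genotypes remain possible,
# which is the uncertainty that hard genotype calls would discard.
single_read = genotype_log_likelihoods(["A"], [0.01], ref="A", alt="G")
# Two reads, one per allele: the heterozygote becomes the most likely genotype.
two_reads = genotype_log_likelihoods(["A", "G"], [0.01, 0.01], ref="A", alt="G")
print(single_read, two_reads)
```

Downstream analyses that work on these likelihoods, rather than on a single called genotype per individual, can carry the uncertainty of one or two reads per site through to population-level estimates.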


2020 ◽  
Author(s):  
Brian J. Sanderson ◽  
Stephen P. DiFazio ◽  
Quentin C. Cronk ◽  
Tao Ma ◽  
Matthew S. Olson

Abstract Premise of the study: The family Salicaceae has proved taxonomically challenging, especially in the genus Salix, which is speciose and features frequent hybridization and polyploidy. Past efforts to reconstruct the phylogeny with molecular barcodes have failed to resolve the species relationships of many sections of the genus. Methods: We used the wealth of sequence data in the family to design sequence capture probes to target regions of 300-1200 base pairs of exonic regions of 972 genes. Results: We recovered sequence data for nearly all of the targeted genes in three species of Populus and three species of Salix. We present a species tree, discuss concordance among gene trees, as well as some population genomic summary statistics for these loci. Conclusions: Our sequence capture array has extremely high capture efficiency within the genera Populus and Salix, resulting in abundant phylogenetic information. Additionally, these loci show promise for population genomic studies.

