Investigation of the Genetic Diversity of a Rice Core Collection of Japanese Landraces using Whole-Genome Sequencing

Abstract The Rice Core Collection of Japanese Landraces (JRC), consisting of 50 accessions, was developed by the genebank at the National Agriculture and Food Research Organization (NARO) in 2008. As a Japanese landrace core collection, the JRC has been used for many research projects, including screening for different phenotypes and allele mining for target genes. To understand the genetic diversity of Japanese landraces, we performed whole-genome resequencing of these 50 accessions and obtained a total of 2,145,095 single nucleotide polymorphisms (SNPs) and 317,832 insertion–deletions (indels) by mapping against the Oryza sativa ssp. japonica Nipponbare genome. A JRC phylogenetic tree based on 1,394 representative SNPs showed that JRC accessions were divided into two major groups and one small group. We used the multiple genome browser, TASUKE+, to examine the haplotypes of flowering genes and detected new mutations in these genes. Finally, we performed genome-wide association studies (GWAS) for agronomical traits using the JRC and another core collection, the World Rice Core Collection (WRC), comprising 69 accessions also provided by the NARO genebank. In GWAS using the JRC, we observed a strong peak for leaf blade width close to NAL1, a key gene for the regulation of leaf width, and a peak for heading date near HESO1, which is involved in flowering regulation. Both peaks were also detected in GWAS using the combined JRC + WRC. Thus, JRC and JRC + WRC are suitable populations for GWAS of particular traits.
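
The GWAS peaks described in this abstract come from testing each variant for association with a trait. As a rough illustration only, the sketch below scores each marker by the t statistic of a simple linear regression of phenotype on allele count. The 0/1/2 genotype coding, the function names, and the toy data are assumptions made for this example; this is not the authors' pipeline. Published GWAS typically also correct for population structure and kinship (for instance with mixed models), which this sketch does not attempt.

```python
import math


def marker_t_statistic(genotypes, phenotypes):
    """t statistic for the slope of phenotype regressed on allele count.

    genotypes: alternate-allele counts per accession (0, 1 or 2; assumed coding).
    phenotypes: trait values in the same accession order.
    """
    n = len(genotypes)
    mean_g = sum(genotypes) / n
    mean_p = sum(phenotypes) / n
    sgg = sum((g - mean_g) ** 2 for g in genotypes)
    spp = sum((p - mean_p) ** 2 for p in phenotypes)
    sgp = sum((g - mean_g) * (p - mean_p) for g, p in zip(genotypes, phenotypes))
    if sgg == 0 or spp == 0:
        return 0.0  # monomorphic marker or constant trait: no signal
    r = sgp / math.sqrt(sgg * spp)  # Pearson correlation
    if abs(r) >= 1.0:
        return math.copysign(math.inf, r)  # perfect fit
    return r * math.sqrt((n - 2) / (1 - r * r))


def scan(markers, phenotypes):
    """Return (marker_name, t) pairs sorted by |t|, strongest first."""
    results = [(name, marker_t_statistic(g, phenotypes)) for name, g in markers.items()]
    return sorted(results, key=lambda item: abs(item[1]), reverse=True)


# Hypothetical toy data: six accessions, two markers, one trait.
toy_markers = {
    "snp_a": [0, 0, 1, 1, 2, 2],
    "snp_b": [0, 2, 0, 2, 0, 2],
}
toy_trait = [10.1, 9.8, 12.2, 11.9, 14.0, 14.3]
ranked = scan(toy_markers, toy_trait)
```

In the toy data the trait rises with the allele count at `snp_a`, so that marker ranks first. With only 50 to 119 accessions, as in the JRC and JRC + WRC panels, a scan of this kind can be expected to detect only loci of strong effect, which is consistent with the abstract's conclusion that the panels suit particular traits.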

Whole-Genome Sequencing of the NARO World Rice Core Collection (WRC) as the Basis for Diversity and Association Studies

Plant and Cell Physiology ◽

10.1093/pcp/pcaa019 ◽

2020 ◽

Vol 61 (5) ◽

pp. 922-932 ◽

Cited By ~ 3

Author(s):

N Tanaka ◽

M Shenton ◽

Y Kawahara ◽

M Kumagai ◽

H Sakai ◽

...

Keyword(s):

Genetic Diversity ◽

Core Collection ◽

Sequence Data ◽

Association Studies ◽

Crop Improvement ◽

Heading Date ◽

Principal Component ◽

Genome Wide Association Studies ◽

Whole Genome ◽

Population Structure Analysis

Abstract Genebanks provide access to diverse materials for crop improvement. To utilize and evaluate them effectively, core collections, such as the World Rice Core Collection (WRC) in the Genebank at the National Agriculture and Food Research Organization, have been developed. Because the WRC consists of 69 accessions with a high degree of genetic diversity, it has been used for >300 projects. To allow deeper investigation of existing WRC data and to further promote research using Genebank rice accessions, we performed whole-genome resequencing of these 69 accessions, examining their sequence variation by mapping against the Oryza sativa ssp. japonica Nipponbare genome. We obtained a total of 2,805,329 single nucleotide polymorphisms (SNPs) and 357,639 insertion–deletions. Based on the principal component analysis and population structure analysis of these data, the WRC can be classified into three major groups. We applied TASUKE, a multiple genome browser, to visualize the different WRC genome sequences, and classified haplotype groups of genes affecting seed characteristics and heading date. TASUKE thus provides access to WRC genotypes as a tool for reverse genetics. We examined the suitability of the compact WRC population for genome-wide association studies (GWASs). Heading date, affected by a large number of quantitative trait loci (QTLs), was not associated with known genes, but several seed-related phenotypes were associated with known genes. Thus, for QTLs of strong effect, the compact WRC performed well in GWAS. This information enables us to understand genetic diversity in 37,000 rice accessions maintained in the Genebank and to find genes associated with different phenotypes. The sequence data have been deposited in the DNA Data Bank of Japan Sequence Read Archive (DRA) (Supplementary Table S1).

The USDA cucumber (Cucumis sativus L.) collection: genetic diversity, population structure, genome-wide association studies, and core collection development

Horticulture Research ◽

10.1038/s41438-018-0080-8 ◽

2018 ◽

Vol 5 (1) ◽

Cited By ~ 20

Author(s):

Xin Wang ◽

Kan Bao ◽

Umesh K. Reddy ◽

Yang Bai ◽

Sue A. Hammar ◽

...

Keyword(s):

Genetic Diversity ◽

Population Structure ◽

Cucumis Sativus ◽

Core Collection ◽

Association Studies ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Collection Development ◽

Cucumis Sativus L ◽

Genome Wide

The USDA Barley Core Collection: Genetic Diversity, Population Structure, and Potential for Genome-Wide Association Studies

PLoS ONE ◽

10.1371/journal.pone.0094688 ◽

2014 ◽

Vol 9 (4) ◽

pp. e94688 ◽

Cited By ~ 124

Author(s):

María Muñoz-Amatriaín ◽

Alfonso Cuesta-Marcos ◽

Jeffrey B. Endelman ◽

Jordi Comadran ◽

John M. Bonman ◽

...

Keyword(s):

Genetic Diversity ◽

Population Structure ◽

Core Collection ◽

Association Studies ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Genome Wide

Genetic characterization of melon accessions in the U.S. National Plant Germplasm System and construction of a melon core collection

Molecular Horticulture ◽

10.1186/s43897-021-00014-9 ◽

2021 ◽

Vol 1 (1) ◽

Author(s):

Xin Wang ◽

Kaori Ando ◽

Shan Wu ◽

Umesh K. Reddy ◽

Prabin Tamang ◽

...

Keyword(s):

Genetic Diversity ◽

Core Collection ◽

Association Studies ◽

Genotyping By Sequencing ◽

Future Research ◽

Genome Wide Association Studies ◽

The Core ◽

Plant Germplasm ◽

National Plant Germplasm System ◽

The U.S

Abstract Melon (C. melo L.) is an economically important vegetable crop cultivated worldwide. The melon collection in the U.S. National Plant Germplasm System (NPGS) is a valuable resource to conserve natural genetic diversity and provide novel traits for melon breeding. Here we use genotyping-by-sequencing (GBS) technology to characterize 2083 melon accessions in the NPGS collected from major melon production areas as well as regions where primitive melons exist. Population structure and genetic diversity analyses suggested that C. melo ssp. melo was first introduced from the centers of origin, India and Pakistan, to Central and West Asia, and then brought to Europe and the Americas. C. melo ssp. melo from East Asia was likely derived from C. melo ssp. agrestis in India and Pakistan and displayed a distinct genetic background compared to the rest of the ssp. melo accessions from other geographic regions. We developed a core collection of 383 accessions capturing more than 98% of genetic variation in the germplasm, providing a publicly accessible collection for future research and genomics-assisted breeding of melon. Thirty-five morphological characters investigated in the core collection indicated high variability of these characters across accessions in the collection. Genome-wide association studies using the core collection panel identified potentially associated genome regions related to fruit quality and other horticultural traits. This study provides insights into melon origin and domestication, and the constructed core collection and identified genome loci potentially associated with important traits provide valuable resources for future melon research and breeding.

Family-based gene-environment interaction using sequence kernel association test (FGE-SKAT) for complex quantitative traits

Scientific Reports ◽

10.1038/s41598-021-86871-2 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Chao-Yu Guo ◽

Reng-Hong Wang ◽

Hsin-Chou Yang

Keyword(s):

Complex Traits ◽

Association Studies ◽

Association Test ◽

Whole Genome Sequence ◽

Environment Interaction ◽

Genome Wide Association Studies ◽

Whole Genome ◽

Sequence Kernel Association Test ◽

Gene Environment ◽

Family Based

Abstract After the genome-wide association studies (GWAS) era, whole-genome sequencing is widely used to identify associations of complex traits with rare variants. A score-based variance-component test has been proposed to identify common and rare genetic variants associated with complex traits while quickly adjusting for covariates. Such a kernel score statistic allows for familial dependencies and adjusts for random confounding effects. However, the etiology of complex traits may involve the effects of genetic and environmental factors and the complex interactions between genes and the environment. Therefore, in this research, a novel method is proposed to detect gene and gene-environment interactions in a complex family-based association study with various correlated structures. We also developed an R function for the Fast Gene-Environment Sequence Kernel Association Test (FGE-SKAT), which is freely available as supplementary material for easy GWAS implementation to unveil such family-based joint effects. Simulation studies confirmed the validity of the new strategy and its superior statistical power. The FGE-SKAT was applied to the whole genome sequence data provided by Genetic Analysis Workshop 18 (GAW18) and discovered concordant and discordant regions compared to methods that do not consider gene-by-environment interactions.

Integration of genome wide association studies and whole genome sequencing provides novel insights into fat deposition in chicken

Scientific Reports ◽

10.1038/s41598-018-34364-0 ◽

2018 ◽

Vol 8 (1) ◽

Cited By ~ 8

Author(s):

Gabriel Costa Monteiro Moreira ◽

Clarissa Boschiero ◽

Aline Silva Mello Cesar ◽

James M. Reecy ◽

Thaís Fernanda Godoy ◽

...

Keyword(s):

Whole Genome Sequencing ◽

Genome Sequencing ◽

Association Studies ◽

Fat Deposition ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Whole Genome ◽

Genome Wide

Analysis of chromatin organization and gene expression in T cells identifies functional genes for rheumatoid arthritis

10.1101/827923 ◽

2019 ◽

Author(s):

Jing Yang ◽

Amanda McGovern ◽

Paul Martin ◽

Kate Duffus ◽

Xiangyu Ge ◽

...

Keyword(s):

Rheumatoid Arthritis ◽

Gene Expression ◽

T Cells ◽

Complex Disease ◽

Target Genes ◽

Disease Risk ◽

Association Studies ◽

Dna Interaction ◽

Genome Wide Association Studies ◽

Causal Genes

Abstract Genome-wide association studies have identified genetic variation contributing to complex disease risk. However, assigning causal genes and mechanisms has been more challenging because disease-associated variants are often found in distal regulatory regions with cell-type specific behaviours. Here, we collect ATAC-seq, Hi-C, Capture Hi-C and nuclear RNA-seq data in stimulated CD4+ T-cells over 24 hours, to identify functional enhancers regulating gene expression. We characterise changes in DNA interaction and activity dynamics that correlate with changes in gene expression, and find that the strongest correlations are observed within 200 kb of promoters. Using rheumatoid arthritis as an example of T-cell mediated disease, we demonstrate interactions of expression quantitative trait loci with target genes, and confirm assigned genes or show complex interactions for 20% of disease associated loci, including FOXO1, which we confirm using CRISPR/Cas9.

A comprehensive integrated post-GWAS analysis of Type 1 diabetes reveals enhancer-based immune dysregulation

PLoS ONE ◽

10.1371/journal.pone.0257265 ◽

2021 ◽

Vol 16 (9) ◽

pp. e0257265

Author(s):

Seung-Soo Kim ◽

Adam D. Hudgins ◽

Jiping Yang ◽

Yizhou Zhu ◽

Zhidong Tu ◽

...

Keyword(s):

Type 1 Diabetes ◽

Target Genes ◽

Association Studies ◽

Regulatory Elements ◽

Immune Dysregulation ◽

Specific Gene ◽

Genome Wide Association Studies ◽

Gwas Analysis ◽

Regulatory Variants

Type 1 diabetes (T1D) is an organ-specific autoimmune disease, whereby immune cell-mediated killing leads to loss of the insulin-producing β cells in the pancreas. Genome-wide association studies (GWAS) have identified over 200 genetic variants associated with risk for T1D. The majority of the GWAS risk variants reside in the non-coding regions of the genome, suggesting that gene regulatory changes substantially contribute to T1D. However, identification of causal regulatory variants associated with T1D risk and their affected genes is challenging due to incomplete knowledge of non-coding regulatory elements and the cellular states and processes in which they function. Here, we performed a comprehensive integrated post-GWAS analysis of T1D to identify functional regulatory variants in enhancers and their cognate target genes. Starting with 1,817 candidate T1D SNPs defined from the GWAS catalog and LDlink databases, we conducted functional annotation analysis using genomic data from various public databases. These include 1) Roadmap Epigenomics, ENCODE, and RegulomeDB for epigenome data; 2) GTEx for tissue-specific gene expression and expression quantitative trait loci data; and 3) lncRNASNP2 for long non-coding RNA data. Our results indicated a prevalent enhancer-based immune dysregulation in T1D pathogenesis. We identified 26 high-probability causal enhancer SNPs associated with T1D, and 64 predicted target genes. The majority of the target genes play major roles in antigen presentation and immune response and are regulated through complex transcriptional regulatory circuits, including those in HLA (6p21) and non-HLA (16p11.2) loci. These candidate causal enhancer SNPs are supported by strong evidence and warrant functional follow-up studies.

Design and performance of a bovine 200 k SNP chip developed for endangered German Black Pied cattle (DSN)

BMC Genomics ◽

10.1186/s12864-021-08237-2 ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

Guilherme B. Neumann ◽

Paula Korkuć ◽

Danny Arends ◽

Manuel J. Wolf ◽

Katharina May ◽

...

Keyword(s):

Genetic Diversity ◽

Association Studies ◽

Bos Indicus ◽

Diversity Management ◽

Whole Genome Sequencing Data ◽

General Interest ◽

Whole Genome ◽

Sequencing Data ◽

Snp Chip ◽

Selection Of

Abstract Background: German Black Pied cattle (DSN) are an endangered dual-purpose breed which was largely replaced by Holstein cattle due to their lower milk yield. DSN cattle are kept as a genetic reserve with a current herd size of around 2500 animals. The ability to track sequence variants specific to DSN could help to support the conservation of DSN’s genetic diversity and to provide avenues for genetic improvement. Results: Whole-genome sequencing data of 304 DSN cattle were used to design a customized DSN200k SNP chip harboring 182,154 variants (173,569 SNPs and 8585 indels) based on ten selection categories. We included variants of interest to DSN such as DSN unique variants and variants from previous association studies in DSN, but also variants of general interest such as variants with predicted consequences of high, moderate, or low impact on the transcripts and SNPs from the Illumina BovineSNP50 BeadChip. Further, the selection of variants based on haplotype blocks ensured that the whole genome was uniformly covered with an average variant distance of 14.4 kb on autosomes. Using 300 DSN and 162 animals from other cattle breeds including Holstein, endangered local cattle populations, and also a Bos indicus breed, performance of the SNP chip was evaluated. Altogether, 171,978 (94.31%) of the variants were successfully called in at least one of the analyzed breeds. In DSN, the number of successfully called variants was 166,563 (91.44%) while 156,684 (86.02%) were segregating at a minor allele frequency > 1%. The concordance rate between technical replicates was 99.83 ± 0.19%. Conclusion: The DSN200k SNP chip proved useful for DSN and other Bos taurus as well as one Bos indicus breed. It is suitable for genetic diversity management and marker-assisted selection of DSN animals. Moreover, variants that were segregating in other breeds can be used for the design of breed-specific customized SNP chips. This will be of great value in the application of conservation programs for endangered local populations in the future.
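
The chip evaluation in this abstract counts variants that were successfully called and variants "segregating at a minor allele frequency > 1%". The sketch below shows one common way such per-variant quantities are computed. It is an illustration under stated assumptions, not the authors' evaluation code: genotypes are assumed to be coded as alternate-allele counts (0, 1, 2) with `None` for a failed call, and all function names are hypothetical.

```python
def call_rate(genotypes):
    """Fraction of animals with a non-missing genotype call."""
    return sum(g is not None for g in genotypes) / len(genotypes)


def minor_allele_frequency(genotypes):
    """Minor allele frequency over called genotypes; None if nothing was called."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return None
    # Each diploid animal carries two alleles, hence the factor of 2.
    alt_freq = sum(called) / (2 * len(called))
    return min(alt_freq, 1 - alt_freq)


def is_segregating(genotypes, maf_threshold=0.01):
    """True if the variant's MAF exceeds the threshold (1% by default)."""
    maf = minor_allele_frequency(genotypes)
    return maf is not None and maf > maf_threshold


# Hypothetical genotypes for one variant across eight animals.
example = [0, 0, 1, None, 0, 2, 0, 0]
example_call_rate = call_rate(example)        # 7 of 8 animals called
example_maf = minor_allele_frequency(example) # 3 alt alleles among 14
```

Taking the minimum of the alternate and reference frequencies keeps the result independent of which allele happens to be labeled as the alternate one, so the 1% threshold applies the same way to every variant.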

Perspective of the GEMSTONE Consortium on Current and Future Approaches to Functional Validation for Skeletal Genetic Disease Using Cellular, Molecular and Animal-Modeling Techniques

Frontiers in Endocrinology ◽

10.3389/fendo.2021.731217 ◽

2021 ◽

Vol 12 ◽

Author(s):

Martina Rauner ◽

Ines Foessl ◽

Melissa M. Formosa ◽

Erika Kague ◽

Vid Prijatelj ◽

...

Keyword(s):

Resource Sharing ◽

Complex Traits ◽

Cellular Localization ◽

Target Genes ◽

Mission Statement ◽

Association Studies ◽

Repetitive Sequences ◽

Genome Wide Association Studies ◽

Causal Genes

The availability of large human datasets for genome-wide association studies (GWAS) and the advancement of sequencing technologies have boosted the identification of genetic variants in complex and rare diseases in the skeletal field. Yet, interpreting results from human association studies remains a challenge. To bridge the gap between genetic association and causality, a systematic functional investigation is necessary. Multiple unknowns exist for putative causal genes, including cellular localization of the molecular function. Intermediate traits (“endophenotypes”), e.g. molecular quantitative trait loci (molQTLs), are needed to identify the mechanisms underlying associations. Furthermore, index variants often reside in non-coding regions of the genome and are therefore challenging to interpret. Knowledge of non-coding variance (e.g. ncRNAs), repetitive sequences, and regulatory interactions between enhancers and their target genes is central for understanding causal genes in skeletal conditions. Animal models with deep skeletal phenotyping and cell culture models have already facilitated fine mapping of some association signals, elucidated gene mechanisms, and revealed disease-relevant biology. However, to accelerate research towards bridging the current gap between association and causality in skeletal diseases, alternative in vivo platforms need to be used and developed in parallel with the current -omics and traditional in vivo resources. Therefore, we argue that as a field we need to establish resource-sharing standards to collectively address complex research questions. These standards will promote data integration from various -omics technologies and functional dissection of human complex traits. In this mission statement, we review the currently available resources and as a group propose a consensus to facilitate resource sharing using existing and future resources. Such coordination efforts will maximize the acquisition of knowledge from different approaches and thus reduce redundancy and duplication of resources. These measures will help to understand the pathogenesis of osteoporosis and other skeletal diseases towards defining new and more efficient therapeutic targets.
