Improved Estimation of Phenotypic Correlations Using Summary Association Statistics

Estimating the phenotypic correlations between complex traits and diseases from their genome-wide association summary statistics is a useful technique in genetic epidemiology and statistical genetics. Two state-of-the-art strategies, Z-score correlation across null-effect single nucleotide polymorphisms (SNPs) and the LD score regression intercept, have been widely applied to estimate phenotypic correlations. Here, we propose an improved Z-score correlation strategy based on SNPs with low minor allele frequencies (MAFs), and show how this simple strategy can correct the bias generated by the current methods. The low-MAF estimator improves phenotypic correlation estimation and is therefore beneficial for methods and applications that use phenotypic correlations inferred from summary association statistics.


Improved estimation of phenotypic correlations using summary association statistics

10.1101/2020.12.10.419325 ◽

2020 ◽

Author(s):

Xia Shen ◽

Ting Li ◽

Zheng Ning

Keyword(s):

Complex Traits ◽

State Of The Art ◽

Phenotypic Correlation ◽

Summary Statistics ◽

Z Score ◽

Phenotypic Correlations ◽

Simple Strategy ◽

Null Effect ◽

Correlation Estimation ◽

Genome Wide

Estimating the phenotypic correlations between complex traits and diseases from their genome-wide association summary statistics is a useful technique in genetic epidemiology and statistical genetics. Two state-of-the-art strategies, Z-score correlation across null-effect SNPs and the LD score regression intercept, have been widely applied to estimate phenotypic correlations. Here, we propose an improved Z-score correlation strategy based on SNPs with low minor allele frequencies (MAFs), and show how this simple strategy can correct the bias generated by the current methods. Compared with LD score regression (LDSC), the low-MAF estimator improves phenotypic correlation estimation and is therefore beneficial for methods and applications that use phenotypic correlations inferred from summary association statistics.
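The low-MAF Z-score correlation idea can be illustrated with a minimal sketch. The abstract does not specify the MAF cutoff, the input format, or any further filtering, so the names `low_maf_zscore_correlation` and `simulate_null_zscores`, the `(maf, z1, z2)` tuple layout, and the 0.05 threshold are all assumptions made for illustration only. The toy data contain purely null SNPs whose Z-scores for two traits share a correlation `rho`, standing in for the phenotypic correlation in fully overlapping samples.

```python
import math
import random


def pearson(x, y):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def low_maf_zscore_correlation(snps, maf_max=0.05):
    """Correlate the two traits' Z-scores across SNPs with MAF <= maf_max.

    snps: iterable of (maf, z_trait1, z_trait2) tuples (assumed layout).
    maf_max: hypothetical cutoff; the source abstract gives no value.
    Returns (correlation estimate, number of SNPs used).
    """
    kept = [(z1, z2) for maf, z1, z2 in snps if maf <= maf_max]
    if len(kept) < 3:
        raise ValueError("too few low-MAF SNPs to estimate a correlation")
    z1s, z2s = zip(*kept)
    return pearson(z1s, z2s), len(kept)


def simulate_null_zscores(n_snps=20000, rho=0.4, seed=1):
    """Toy null SNPs: Z-score pairs are bivariate normal with correlation rho."""
    rng = random.Random(seed)
    snps = []
    for _ in range(n_snps):
        maf = rng.uniform(0.01, 0.5)
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        snps.append((maf, z1, z2))
    return snps


if __name__ == "__main__":
    estimate, n_used = low_maf_zscore_correlation(simulate_null_zscores())
    print(f"estimate from {n_used} low-MAF SNPs: {estimate:.3f}")
```

Because the simulated SNPs carry no genetic effects, this sketch only shows the mechanics of the estimator; it does not reproduce the bias comparison reported by the authors.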


Genetics of complex traits: prediction of phenotype, identification of causal polymorphisms and genetic architecture

Proceedings of The Royal Society B Biological Sciences ◽

10.1098/rspb.2016.0569 ◽

2016 ◽

Vol 283 (1835) ◽

pp. 20160569 ◽

Cited By ~ 52

Author(s):

M. E. Goddard ◽

K. E. Kemper ◽

I. M. MacLeod ◽

A. J. Chamberlain ◽

B. J. Hayes

Keyword(s):

Complex Traits ◽

Genetic Architecture ◽

Quantitative Traits ◽

Association Studies ◽

Genome Wide Association Studies ◽

Nucleotide Polymorphisms ◽

Crop Breeding ◽

Single Nucleotide ◽

Genome Wide ◽

Phenotype Identification

Complex or quantitative traits are important in medicine, agriculture and evolution, yet, until recently, few of the polymorphisms that cause variation in these traits were known. Genome-wide association studies (GWAS), based on the ability to assay thousands of single nucleotide polymorphisms (SNPs), have revolutionized our understanding of the genetics of complex traits. We advocate the analysis of GWAS data by a statistical method that fits all SNP effects simultaneously, assuming that these effects are drawn from a prior distribution. We illustrate how this method can be used to predict future phenotypes, to map and identify the causal mutations, and to study the genetic architecture of complex traits. The genetic architecture of complex traits is even more complex than previously thought: in almost every trait studied there are thousands of polymorphisms that explain genetic variation. Methods of predicting future phenotypes, collectively known as genomic selection or genomic prediction, have been widely adopted in livestock and crop breeding, leading to increased rates of genetic improvement.


Performing post-genome-wide association study analysis: overview, challenges and recommendations

F1000Research ◽

10.12688/f1000research.53962.1 ◽

2021 ◽

Vol 10 ◽

pp. 1002

Author(s):

Yagoub Adam ◽

Chaimae Samtal ◽

Jean-tristan Brandenburg ◽

Oluwadamilare Falola ◽

Ezekiel Adebiyi

Keyword(s):

Complex Traits ◽

Genome Wide Association Study ◽

Association Studies ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Nucleotide Polymorphisms ◽

Single Nucleotide ◽

Genome Wide ◽

Single Phenotype ◽

Insight Into

Genome-wide association studies (GWAS) provide a wealth of information on statistically significant single-nucleotide polymorphisms (SNPs) associated with various human complex traits and diseases. Through GWAS, scientists have successfully identified associations of hundreds of thousands to millions of SNPs with a single phenotype. Moreover, the association of some SNPs with rare diseases has been intensively tested. However, classic GWAS have not yet provided solid insight into the functional and biological mechanisms underlying phenotypes or diseases. Therefore, several post-GWAS (pGWAS) methods have been recommended. Currently, there is no simple document that provides a quick guide for performing pGWAS analysis. pGWAS is a crucial step toward a better understanding of the biological machinery beyond the SNPs. Here, we provide an overview of performing pGWAS analysis and describe the challenges behind each method. Furthermore, we direct readers to key articles for each pGWAS method and present the overall issues in pGWAS analysis. Finally, we include a custom pGWAS pipeline to guide new users when performing their research.


Fibrate pharmacogenomics: expanding past the genome

Pharmacogenomics ◽

10.2217/pgs-2019-0140 ◽

2020 ◽

Vol 21 (4) ◽

pp. 293-306

Author(s):

John S House ◽

Alison A Motsinger-Reif

Keyword(s):

Complex Traits ◽

Density Lipoprotein ◽

Nucleotide Polymorphisms ◽

Single Nucleotide ◽

Cholesterol Levels ◽

Genome Wide ◽

Lower Blood ◽

Reduce Risk ◽

Response Variation ◽

Technological Platforms

Fibrates are a medication class prescribed for decades as ‘broad-spectrum’ lipid-modifying agents used to lower blood triglyceride levels and raise high-density lipoprotein cholesterol levels. Such lipid changes are associated with a decrease in cardiovascular disease, and fibrates are commonly used to reduce risk of dangerous cardiovascular outcomes. As with most drugs, it is well established that response to fibrate treatment is variable, and this variation is heritable. This has motivated the investigation of pharmacogenomic determinants of response, and multiple studies have discovered a number of genes associated with fibrate response. Similar to other complex traits, the interrogation of single nucleotide polymorphisms using candidate gene or genome-wide approaches has not revealed a substantial portion of response variation. However, recent innovations in technological platforms and advances in statistical methodologies are revolutionizing the use and integration of other ‘omes’ in pharmacogenomics studies. Here, we detail successes, challenges, and recent advances in fibrate pharmacogenomics.


EpiPen: An R Package to Investigate Two-Locus Epistatic Models

Twin Research and Human Genetics ◽

10.1017/thg.2014.25 ◽

2014 ◽

Vol 17 (4) ◽

Cited By ~ 2

Author(s):

Raymond K. Walters ◽

Charles Laurin ◽

Gitta H. Lubke

Keyword(s):

Power Analysis ◽

R Package ◽

Simulation Studies ◽

Nucleotide Polymorphisms ◽

Single Nucleotide ◽

Epistatic Interactions ◽

Model Interpretation ◽

Genome Wide ◽

Using Data ◽

Power Analyses

Epistasis is a growing area of research in genome-wide studies, but the differences between alternative definitions of epistasis remain a source of confusion for many researchers. One problem is that models for epistasis are presented in a number of formats, some of which have difficult-to-interpret parameters. In addition, the relation between the different models is rarely explained. Existing software for testing epistatic interactions between single-nucleotide polymorphisms (SNPs) does not provide the flexibility to compare the available model parameterizations. For that reason we have developed an R package for investigating epistatic and penetrance models, EpiPen, to aid users who wish to easily compare, interpret, and utilize models for two-locus epistatic interactions. EpiPen facilitates research on SNP-SNP interactions by allowing the R user to easily convert between common parametric forms for two-locus interactions, generate data for simulation studies, and perform power analyses for the selected model with a continuous or dichotomous phenotype. The usefulness of the package for model interpretation and power analysis is illustrated using data on rheumatoid arthritis.


Genomic Analyses of Globodera pallida, A Quarantine Agricultural Pathogen in Idaho

Pathogens ◽

10.3390/pathogens10030363 ◽

2021 ◽

Vol 10 (3) ◽

pp. 363

Author(s):

Sulochana K. Wasala ◽

Dana K. Howe ◽

Louise-Marie Dandurand ◽

Inga A. Zasada ◽

Dee R. Denver

Keyword(s):

Genetic Variation ◽

Potato Production ◽

Globodera Pallida ◽

Fixation Index ◽

Parasitic Nematodes ◽

Nucleotide Polymorphisms ◽

Single Nucleotide ◽

Genome Wide ◽

Field Samples ◽

Multiple Introduction

Globodera pallida is among the most significant plant-parasitic nematodes worldwide, causing major damage to potato production. Since it was discovered in Idaho in 2006, eradication efforts have aimed to contain and eradicate G. pallida through phytosanitary action and soil fumigation. In this study, we investigated genome-wide patterns of G. pallida genetic variation across Idaho fields to evaluate whether the infestation resulted from a single or multiple introduction(s) and to investigate potential evolutionary responses since the time of infestation. A total of 53 G. pallida samples (~1,042,000 individuals) were collected and analyzed, representing five different fields in Idaho, a greenhouse population, and a field in Scotland that was used for external comparison. According to genome-wide allele frequency and fixation index (Fst) analyses, most of the genetic variation was shared among the G. pallida populations in Idaho fields pre-fumigation, indicating that the infestation likely resulted from a single introduction. Temporal patterns of genome-wide polymorphisms involving (1) pre-fumigation field samples collected in 2007 and 2014 and (2) pre- and post-fumigation samples revealed nucleotide variants (SNPs, single-nucleotide polymorphisms) with significantly differentiated allele frequencies indicating genetic differentiation. This study provides insights into the genetic origins and adaptive potential of G. pallida invading new environments.
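As a rough illustration of the fixation index mentioned above, the sketch below computes a basic Wright-style Fst for one biallelic site from per-population allele frequencies, as Fst = (Ht - Hs) / Ht. The abstract does not say which Fst estimator the authors used, so the function name `fst_biallelic`, the equal weighting of populations, and the absence of any sample-size correction are assumptions for illustration.

```python
def fst_biallelic(freqs):
    """Basic Fst for one biallelic site, populations weighted equally.

    freqs: allele frequency of the same allele in each population.
    Returns (Ht - Hs) / Ht, where Ht is the expected heterozygosity of the
    pooled frequency and Hs is the mean within-population heterozygosity.
    """
    if len(freqs) < 2:
        raise ValueError("need at least two populations")
    if any(p < 0.0 or p > 1.0 for p in freqs):
        raise ValueError("allele frequencies must lie in [0, 1]")
    p_bar = sum(freqs) / len(freqs)
    h_total = 2.0 * p_bar * (1.0 - p_bar)
    if h_total == 0.0:
        # The site is fixed for the same allele everywhere: no differentiation.
        return 0.0
    h_within = sum(2.0 * p * (1.0 - p) for p in freqs) / len(freqs)
    return (h_total - h_within) / h_total


if __name__ == "__main__":
    # Hypothetical frequencies for three fields, not data from the study.
    print(fst_biallelic([0.42, 0.45, 0.40]))
```

Values near zero correspond to populations that share most of their variation, which is the pattern the authors interpret as evidence of a single introduction.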


Genetic dissection of soybean partial resistance to sclerotinia stem rot through genome wide association study and high throughout single nucleotide polymorphisms

Genomics ◽

10.1016/j.ygeno.2020.10.042 ◽

2021 ◽

Author(s):

Yan Jing ◽

Weili Teng ◽

Lijuan Qiu ◽

Hongkun Zheng ◽

Wenbin Li ◽

...

Keyword(s):

Single Nucleotide Polymorphisms ◽

Partial Resistance ◽

Genome Wide Association Study ◽

Genome Wide Association ◽

Stem Rot ◽

Genetic Dissection ◽

Nucleotide Polymorphisms ◽

Sclerotinia Stem Rot ◽

Single Nucleotide ◽

Genome Wide


Single Nucleotide Polymorphism Discovery and Genetic Differentiation Analysis of Geese Bred in Poland, Using Genotyping-by-Sequencing (GBS)

Genes ◽

10.3390/genes12071074 ◽

2021 ◽

Vol 12 (7) ◽

pp. 1074

Author(s):

Joanna Grzegorczyk ◽

Artur Gurgul ◽

Maria Oczkowicz ◽

Tomasz Szmatoła ◽

Agnieszka Fornal ◽

...

Keyword(s):

Genotyping By Sequencing ◽

Read Depth ◽

Model Organisms ◽

Single Nucleotide Polymorphism Discovery ◽

Nucleotide Polymorphisms ◽

Single Nucleotide ◽

Polymorphism Discovery ◽

Genome Wide ◽

Plumage Development ◽

Edar Gene

Poland is the largest European producer of geese, and goose breeding has become an essential and still growing branch of the poultry industry. The most frequently bred goose is the White Kołuda® breed, constituting 95% of the country’s goose population, whereas geese of regional varieties are bred in smaller conservation flocks. However, goose genetic diversity has been insufficiently explored, mainly because the most commonly used tools are of limited use in non-model organisms. Among the most accurate markers used in population genetics are single nucleotide polymorphisms (SNPs). A highly efficient strategy for genome-wide SNP detection is genotyping-by-sequencing (GBS), which has already been widely applied in many organisms. This study attempts to use GBS in 12 conservation goose breeds and the White Kołuda® breed maintained in Poland. The GBS method allowed for the detection of 3833 common raw SNPs. After filtering for read depth and allele characteristics, the final marker panel used for the differentiation analysis comprised 791 SNPs. These variants were located within 11 different genes, and one of the most diversified variants was associated with the EDAR gene, which is especially interesting because it participates in plumage development, which plays a crucial role in goose breeding.


Genome-Wide Patterns of Homozygosity Reveal the Conservation Status in Five Italian Goat Populations

Animals ◽

10.3390/ani11061510 ◽

2021 ◽

Vol 11 (6) ◽

pp. 1510

Author(s):

Salvatore Mastrangelo ◽

Rosalia Di Gerlando ◽

Maria Teresa Sardina ◽

Anna Maria Sutera ◽

Angelo Moscarelli ◽

...

Keyword(s):

Conservation Status ◽

Phenotypic Traits ◽

Nucleotide Polymorphisms ◽

Single Nucleotide ◽

Local Populations ◽

Genomic Technologies ◽

Fitness Traits ◽

Genome Wide ◽

Breeding Schemes ◽

Genomic Inbreeding

The application of genomic technologies has facilitated the assessment of genomic inbreeding based on single nucleotide polymorphisms (SNPs). In this study, we computed several runs of homozygosity (ROH) parameters to investigate the patterns of homozygosity using the Illumina Goat SNP50 array in five Italian local populations: Argentata dell’Etna (N = 48), Derivata di Siria (N = 32), Girgentana (N = 59), Maltese (N = 16) and Messinese (N = 22). The ROH results showed well-defined differences among the populations. A total of 3687 ROH segments >2 Mb were detected in the whole sample. The Argentata dell’Etna and Messinese were the populations with the lowest mean number of ROH and the lowest inbreeding coefficient values, which reflect admixture and gene flow. In the Girgentana, we identified an ROH pattern related to recent inbreeding that can endanger the viability of the breed due to reduced population size. The genomes of the Derivata di Siria and Maltese breeds showed the presence of long ROH (>16 Mb) that could seriously impact the overall biological fitness of these breeds. Moreover, the results confirmed that ROH parameters are in agreement with the known demography of these populations and highlighted the different selection histories and breeding schemes of these goat populations. In the analysis of ROH islands, we detected genes involved in important traits, such as milk yield, reproduction, and immune response, which are consistent with the phenotypic traits of the studied goat populations. Finally, the results of this study can be used to implement conservation programs for these local populations in order to avoid further loss of genetic diversity and to preserve production and fitness traits. In view of this, the availability of genomic data is a fundamental resource.
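One common way to turn ROH segments into an inbreeding coefficient is F_ROH, the summed length of qualifying ROH divided by the length of the genome covered by markers. The abstract does not state the exact formula or genome length used, so the sketch below is an assumption-laden illustration: the function name `f_roh`, the input layout, and the 2500 Mb genome length in the demo are hypothetical, while the >2 Mb cutoff mirrors the threshold quoted in the text.

```python
def f_roh(segment_lengths_mb, genome_length_mb, min_length_mb=2.0):
    """ROH-based inbreeding coefficient for one individual.

    segment_lengths_mb: lengths (Mb) of that individual's ROH segments.
    genome_length_mb: autosomal length covered by markers (assumed input).
    min_length_mb: only segments strictly longer than this are counted.
    """
    if genome_length_mb <= 0:
        raise ValueError("genome length must be positive")
    total = sum(length for length in segment_lengths_mb if length > min_length_mb)
    return total / genome_length_mb


def count_long_roh(segment_lengths_mb, long_threshold_mb=16.0):
    """Count segments longer than the threshold (the text highlights >16 Mb)."""
    return sum(1 for length in segment_lengths_mb if length > long_threshold_mb)


if __name__ == "__main__":
    # Hypothetical segments and genome length, not values from the study.
    segments = [1.2, 2.5, 4.0, 17.3]
    print(f_roh(segments, genome_length_mb=2500.0))
    print(count_long_roh(segments))
```

Long segments contribute most to F_ROH and are usually read as a sign of recent inbreeding, which is why the two summaries are often reported together.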


Genotyping-by-Sequencing of Gossypium hirsutum Races and Cultivars Uncovers Novel Patterns of Genetic Relationships and Domestication Footprints

Evolutionary Bioinformatics ◽

10.1177/1176934319889948 ◽

2019 ◽

Vol 15 ◽

pp. 117693431988994

Author(s):

Shulin Zhang ◽

Yaling Cai ◽

Jinggong Guo ◽

Kun Li ◽

Renhai Peng ◽

...

Keyword(s):

Gossypium Hirsutum ◽

Genetic Relationships ◽

Phylogenetic Analyses ◽

Genotyping By Sequencing ◽

Nucleotide Polymorphisms ◽

Single Nucleotide ◽

Breeding Programs ◽

Association Analyses ◽

Genome Wide ◽

Candidate Gene Locus

Determining the genetic rearrangement and domestication footprints in Gossypium hirsutum cultivars and primitive race genotypes is essential for effective gene conservation efforts and for the development of advanced molecular markers for marker-assisted breeding. In this study, 94 accessions representing the 7 primitive races of G hirsutum, along with 9 G hirsutum and 12 Gossypium barbadense cultivated accessions, were evaluated. The genotyping-by-sequencing (GBS) approach was employed, and 146 558 single nucleotide polymorphisms (SNPs) were generated. Distinct SNP signatures were identified through the combination of selection scans and association analyses. Phylogenetic analyses were also conducted, and we concluded that the Latifolium, Richmondi, and Marie-Galante race accessions were more genetically related to the G hirsutum cultivars and tended to cluster together. Fifty-four outlier SNP loci were identified by selection-scan analysis, and 3 SNPs were located in genes related to plant responses to stress conditions and were confirmed through further genome-wide marker-phenotype association analysis, indicating a clear selection signature for such traits. These results identified useful candidate gene loci for cotton breeding programs.
