An extended Tajima’s D neutrality test incorporating SNP calling and imputation uncertainties

2015 ◽  
Vol 8 (4) ◽  
pp. 447-456 ◽  
Author(s):  
Qingrun Zhang ◽  
Chris Tyler-Smith ◽  
Quan Long
Author(s):  
Frederick H. Wallace

The Fisher and Seater (1993) methodology is used to test for the long-run neutrality of money in Guatemala, 1950-2001. Real GDP, real per capita GDP, and the money measures, M1 and M2, are integrated of order one [I(1)]. Given these orders of integration, the Fisher-Seater neutrality test can be applied. The evidence suggests that M1 and M2 are neutral with respect to real GDP. Furthermore, the test indicates that M1, but not M2, is neutral with respect to real per capita GDP as well.


Genes ◽  
2020 ◽  
Vol 12 (1) ◽  
pp. 29
Author(s):  
Lilia González-Cerón ◽  
José Cebrián-Carmona ◽  
Concepción M. Mesa-Valle ◽  
Federico García-Maroto ◽  
Frida Santillán-Valenzuela ◽  
...  

Plasmodium vivax Cysteine-Rich Protective Antigen (CyRPA) is a merozoite protein participating in the parasite invasion of human reticulocytes. During natural P. vivax infection, antibody responses against PvCyRPA have been detected. In children, low anti-CyRPA antibody titers correlated with clinical protection, which suggests this protein as a potential vaccine candidate. This work analyzed the genetic and amino acid diversity of pvcyrpa in Mexican and global parasites. Consensus coding sequences of pvcyrpa were obtained from seven isolates. Other sequences were extracted from a repository. Maximum likelihood phylogenetic trees, genetic diversity parameters, linkage disequilibrium (LD), and neutrality tests were analyzed, and the potential amino acid polymorphism participation in B-cell epitopes was investigated. In 22 sequences from Southern Mexico, two synonymous and 21 nonsynonymous mutations defined nine private haplotypes. These parasites had the highest LD-R2 index and the lowest nucleotide diversity compared to isolates from South America or Asia. The nucleotide diversity and Tajima’s D values varied across the coding gene. The exon-1 sequence had greater diversity and Rm values than those of exon-2. Exon-1 had significant positive values for Tajima’s D, β-α values, and for the Z (HA: dN > dS) and MK tests. These patterns were similar for parasites of different origin. The polymorphic amino acid residues at PvCyRPA resembled the conformational B-cell peptides reported in PfCyRPA. Diversity at pvcyrpa exon-1 is caused by mutation and recombination. This seems to be maintained by balancing selection, likely due to selective immune pressure, all of which merit further study.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
O. Ajibola ◽  
M. F. Diop ◽  
A. Ghansah ◽  
L. Amenga-Etego ◽  
L. Golassa ◽  
...  

Genetic diversity of surface-exposed and stage-specific Plasmodium falciparum immunogenic proteins poses a major roadblock to developing an effective malaria vaccine with broad and long-lasting immunity. We conducted a prospective genetic analysis of candidate antigens (msp1, ama1, rh5, eba175, glurp, celtos, csp, lsa3, Pfsea, trap, conserved chrom3, hyp9, hyp10, phistb, surfin8.2, and surfin14.1) for malaria vaccine development on 2375 P. falciparum sequences from 16 African countries. We described signatures of balancing selection inferred from positive values of Tajima’s D for all antigens across all populations except for glurp. This could be a result of immune selection on these antigens, as positive Tajima’s D values mapped to regions with putative immune epitopes. A less diverse phistb antigen was characterised with a transmembrane domain, glycophosphatidyl anchors between the N- and C-terminals, and surface epitopes that could be targets of immune recognition. This study demonstrates the value of population genetic and immunoinformatic analysis for identifying and characterising new putative vaccine candidates towards improving strain-transcending immunity and vaccine efficacy across all endemic populations.
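
For readers unfamiliar with the statistic these abstracts rely on, the following is a minimal sketch of the classical Tajima's D, which compares the mean number of pairwise differences with the number of segregating sites scaled by sample size. It is not the extended test named in the page title, nor the pipeline of any study listed here. The function name `tajimas_d` and the input layout (aligned haploid sequences of equal length, no missing data) are assumptions made for illustration.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Classical Tajima's D for aligned, equal-length haploid sequences.

    Hypothetical helper for illustration: assumes no missing data and
    counts any site with more than one allele as one segregating site.
    Returns None when there are no segregating sites (D is undefined).
    """
    n = len(seqs)
    if n < 4:
        # For n < 4 the variance term is zero, so D cannot be computed.
        raise ValueError("need at least 4 sequences")
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be aligned to the same length")

    # S: number of segregating (polymorphic) sites.
    seg = sum(1 for column in zip(*seqs) if len(set(column)) > 1)
    if seg == 0:
        return None

    # pi: mean number of differences over all pairs of sequences.
    n_pairs = n * (n - 1) / 2
    pi = sum(
        sum(a != b for a, b in zip(x, y)) for x, y in combinations(seqs, 2)
    ) / n_pairs

    # Sample-size constants from Tajima's variance derivation.
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    return (pi - seg / a1) / sqrt(e1 * seg + e2 * seg * (seg - 1))
```

In this sketch, a sample dominated by intermediate-frequency variants gives a positive value and a sample dominated by singletons gives a negative one, which matches how the abstracts read positive D as a possible signature of balancing selection and negative D as consistent with population expansion.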


Author(s):  
Fereshteh Shahoveisi ◽  
Atena Oladzad ◽  
Luis E. del Rio Mendoza ◽  
Seyedali Hosseinirad ◽  
Susan Ruud ◽  
...  

The polyploid nature of canola (Brassica napus) represents a challenge for the accurate identification of single nucleotide polymorphisms (SNPs) and the detection of quantitative trait loci (QTL). In this study, combinations of eight phenotyping scoring systems and six SNP calling and filtering parameters were evaluated for their efficiency in detection of QTL associated with response to Sclerotinia stem rot, caused by Sclerotinia sclerotiorum, in two doubled haploid (DH) canola mapping populations. Most QTL were detected in lesion length, relative areas under the disease progress curve (rAUDPC) for lesion length, and binomial-plant mortality data sets. Binomial data derived from lesion size were less efficient in QTL detection. Inclusion of additional phenotypic sets in the analysis increased the number of significant QTL by 2.3-fold; however, the continuous data sets were more efficient. Between two filtering parameters used to analyze genotyping by sequencing (GBS) data, imputation of missing data increased QTL detection in one population with a high level of missing data but not in the other. Inclusion of segregation-distorted SNPs increased QTL detection but did not impact their R² values significantly. Twelve of the 16 detected QTL were on chromosomes A02 and C01, and the rest were on A07, A09, and C03. Marker A02-7594120, associated with a QTL on chromosome A02, was detected in both populations. Results of this study suggest that the impact of genotypic variant calling and filtering parameters may be population dependent, while deriving additional phenotyping scoring systems such as rAUDPC and binary mortality data sets may improve QTL detection efficiency.


Author(s):  
Russ Jasper ◽  
Tegan Krista McDonald ◽  
Pooja Singh ◽  
Mengmeng Lu ◽  
Clément Rougeux ◽  
...  

The use of NGS datasets has increased dramatically over the last decade; however, there have been few systematic analyses quantifying the accuracy of the commonly used variant caller programs. Here we used a familial design consisting of diploid tissue from a single Pinus contorta parent and the maternally derived haploid tissue from 106 full-sibling offspring, where mismatches could only arise due to mutation or bioinformatic error. Given the rarity of mutation, we used the rate of mismatches between parent and offspring genotype calls to infer the SNP genotyping error rates of FreeBayes, HaplotypeCaller, SAMtools, UnifiedGenotyper, and VarScan. With baseline filtering, HaplotypeCaller and UnifiedGenotyper yielded one to two orders of magnitude larger numbers of SNPs and error rates, whereas FreeBayes, SAMtools, and VarScan yielded lower numbers of SNPs and more modest error rates. To facilitate comparison between variant callers, we standardized each SNP set to the same number of SNPs using additional filtering, where UnifiedGenotyper consistently produced the smallest proportion of genotype errors, followed by HaplotypeCaller, VarScan, SAMtools, and FreeBayes. Additionally, we found that error rates were minimized for SNPs called by more than one variant caller. Finally, we evaluated the performance of various commonly used filtering metrics on SNP calling. Our analysis provides a quantitative assessment of the accuracy of five widely used variant calling programs and offers valuable insights into both the choice of variant caller program and the choice of filtering metrics, especially for researchers using non-model study systems.
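
The parent-offspring logic described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `mismatch_rate`, the dictionary layout, and the rule that a haploid offspring call counts as a mismatch when its allele is absent from the diploid parent's genotype are all hypothetical, and the study's own error definition may differ.

```python
def mismatch_rate(parent_genotypes, offspring_calls):
    """Fraction of haploid offspring calls not explained by the parent.

    parent_genotypes: dict mapping site id -> set of alleles called in
        the diploid parent (assumed layout, for illustration only).
    offspring_calls: list of dicts mapping site id -> haploid allele,
        with None marking a missing call.
    """
    compared = 0
    mismatched = 0
    for calls in offspring_calls:
        for site, allele in calls.items():
            parent_alleles = parent_genotypes.get(site)
            # Skip sites that are missing in either the offspring or the parent.
            if allele is None or parent_alleles is None:
                continue
            compared += 1
            # A maternally derived haploid allele should be one of the
            # parent's alleles; anything else is mutation or calling error.
            if allele not in parent_alleles:
                mismatched += 1
    return mismatched / compared if compared else float("nan")
```

Because mutation is rare, a rate computed this way mostly reflects genotyping error, which is the reasoning the abstract gives for using mismatches as an error proxy.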


2018 ◽  
Author(s):  
Tristan Cumer ◽  
Charles Pouchon ◽  
Frédéric Boyer ◽  
Glenn Yannic ◽  
Delphine Rioux ◽  
...  

Next-generation sequencing technologies have opened a new era of research in genomics. Among these, restriction enzyme-based techniques such as restriction-site associated DNA sequencing (RADseq) or double-digest RAD-sequencing (ddRADseq) are now widely used in many population genomics fields. From DNA sampling to SNP calling, both wet and dry protocols have been discussed in the literature to identify key parameters for an optimal loci reconstruction. The impact of these parameters on downstream analyses and biological results drawn from RADseq or ddRADseq data has, however, not yet been fully explored. In this study, we tackled this issue by investigating the effects of ddRADseq laboratory (i.e. wet protocol) and bioinformatics (i.e. dry protocol) settings on loci reconstruction and inferred biological signal at two evolutionary scales using two systems: a complex of butterfly species (Coenonympha sp.) and populations of Common beech (Fagus sylvatica). Results suggest an impact of wet protocol parameters (DNA quantity, number of PCR cycles during library preparation) on the number of recovered reads and SNPs, the number of unique alleles and individual heterozygosity. We also found that bioinformatic settings (i.e. clustering and minimum coverage thresholds) impact loci reconstruction (e.g. number of loci, mean coverage) and SNP calling (e.g. number of SNPs, heterozygosity). However, we do not detect an impact of parameter settings on three types of analysis performed with ddRADseq data: measure of genetic differentiation, estimation of individual admixture, and demographic inferences. In addition, our work demonstrates the high reproducibility and low rate of genotyping inconsistencies of the ddRADseq protocol. Thus, our study highlights the impact of wet parameters on the ddRADseq protocol, with strong consequences for experimental success and biological conclusions. Dry parameters affect loci reconstruction and descriptive statistics but not biological conclusions for the two studied systems. Overall, this study illustrates, with others, the relevance of ddRADseq for population and evolutionary genomics at the inter- or intraspecific scales.


2020 ◽  
Vol 19 (1) ◽  
Author(s):  
Elikplim A. Amegashie ◽  
Lucas Amenga-Etego ◽  
Courage Adobor ◽  
Peter Ogoti ◽  
Kevin Mbogo ◽  
...  

Background: Extensive genetic diversity in the Plasmodium falciparum circumsporozoite protein (PfCSP) is a major contributing factor to the moderate efficacy of the RTS,S/AS01 vaccine. The transmission intensity and rates of recombination within and between populations influence the extent of its genetic diversity. Understanding the extent and dynamics of PfCSP genetic diversity in different transmission settings will help to interpret the results of current RTS,S efficacy and Phase IV implementation trials conducted within and between populations in malaria-endemic areas such as Ghana. Methods: Pfcsp sequences were retrieved from the Illumina-generated paired-end short-read sequences of 101 and 131 malaria samples from children aged 6–59 months presenting with clinical malaria at health facilities in Cape Coast (in the coastal belt) and Navrongo (Guinea savannah region), respectively, in Ghana. The sequences were mapped onto the 3D7 reference strain genome to yield high-quality genome-wide coding sequence data. Following data filtering and quality checks to remove missing data, 220 sequences were retained and analysed for the allele frequency spectrum, genetic diversity both within the host and between populations and signatures of selection. Population genetics tools were used to determine the extent and dynamics of Pfcsp diversity in P. falciparum from the two geographically distinct locations in Ghana. Results: Pfcsp showed extensive diversity at the two sites, with the higher transmission site, Navrongo, exhibiting higher within-host and population-level diversity. The vaccine strain C-terminal epitope of Pfcsp was found in only 5.9% and 45.7% of the Navrongo and Cape Coast sequences, respectively. Between 1 and 6 amino acid variations were observed in the TH2R and TH3R epitope regions of PfCSP. Tajima’s D was negatively skewed, especially for the population from Cape Coast, given the expected historical population expansion. In contrast, a positive Tajima’s D was observed for the Navrongo P. falciparum population, consistent with balancing selection acting on the immuno-dominant TH2R and TH3R vaccine epitopes. Conclusion: The low frequencies of the Pfcsp vaccine haplotype in the analysed populations indicate a need for additional molecular and immuno-epidemiological studies with broader temporal and geographic sampling in endemic populations targeted for RTS,S application. These results have implications for the efficacy of the vaccine in Ghana and will inform the choice of alleles to be included in future multivalent or chimeric vaccines.

