Effects of single nucleotide polymorphism ascertainment on population structure inferences

Author(s):  
Kotaro Dokan ◽  
Sayu Kawamura ◽  
Kosuke M Teshima

Single nucleotide polymorphism (SNP) data are widely used in research on natural populations. Although useful, SNP genotyping data are known to contain a bias, usually referred to as ascertainment bias, because they are conditioned on previously confirmed variants. This bias is introduced during the genotyping process, through the selection of populations for novel SNP discovery, the number of individuals included in the discovery panel, and the selection of SNP markers. It is widely recognized that ascertainment bias can cause inaccurate inferences in population genetics, and several methods to address it have been proposed. However, especially in natural populations, it is not always possible to apply an ideal ascertainment scheme because natural populations tend to have complex structures and histories. In addition, it has not been fully assessed whether ascertainment bias has the same effect on different types of population structure. Here we examine the effects of bias produced during the selection of populations for SNP discovery and the subsequent SNP marker selection under three demographic models: the island, stepping-stone, and population split models. The results show that site frequency spectra and summary statistics contain biases that depend on the joint effect of population structure and ascertainment scheme. Population structure inferences are also affected by ascertainment bias. Based on these results, we recommend evaluating the validity of the ascertainment strategy before the actual typing process, because the direction and extent of ascertainment bias vary depending on several factors.
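One way the discovery panel size feeds into this bias can be sketched with a short calculation. The example below is only an illustration under strong simplifying assumptions that are not taken from the study: a single randomly mating population, a bi-allelic site, and a discovery panel of independently sampled chromosomes. The function name `discovery_probability` is hypothetical. A site is "discovered" only if both alleles appear in the panel, so rare variants are under-represented when the panel is small.

```python
def discovery_probability(p: float, n_chromosomes: int) -> float:
    """Probability that a bi-allelic site with allele frequency p is
    polymorphic in a panel of n_chromosomes randomly sampled chromosomes.

    Assumption (illustrative only): one panmictic population and
    independent sampling; the structured models in the abstract are
    more complex than this.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("allele frequency must lie in [0, 1]")
    if n_chromosomes < 1:
        raise ValueError("panel must contain at least one chromosome")
    # The site is missed if the panel carries only one of the two alleles.
    return 1.0 - p ** n_chromosomes - (1.0 - p) ** n_chromosomes


if __name__ == "__main__":
    # Compare a rare and a common variant across panel sizes.
    for n in (4, 20, 100):
        rare = discovery_probability(0.01, n)
        common = discovery_probability(0.5, n)
        print(f"n={n:3d}  rare={rare:.4f}  common={common:.4f}")
```

Under these assumptions, common variants are found almost regardless of panel size, while the chance of finding a rare variant grows slowly with the number of chromosomes sampled, which skews the site frequency spectrum of the ascertained markers.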

Agronomy ◽  
2021 ◽  
Vol 11 (3) ◽  
pp. 604
Author(s):  
Subhash Chander ◽  
Ana Luísa Garcia-Oliveira ◽  
Melaku Gedil ◽  
Trushar Shah ◽  
Gbemisola Oluwayemisi Otusanya ◽  
...  

Soybean productivity in sub-Saharan Africa (SSA) is less than half of the global average yield. To close the productivity gap, further improvement in grain yield must be attained by enhancing the genetic potential of new cultivars, which depends on the genetic diversity of the parents. Hence, our aim was to assess the genetic diversity and population structure of elite soybean genotypes, mainly released cultivars and advanced selections in SSA. In this study, a set of 165 lines was genotyped with high-throughput single nucleotide polymorphism (SNP) markers covering the complete genome of soybean. The genetic diversity (0.414) was high considering the bi-allelic nature of SNP markers. The polymorphic information content (PIC) varied from 0.079 to 0.375, with an average of 0.324, and about 49% of the markers had a PIC value above 0.350. Cluster analysis grouped all the genotypes into three major clusters. The model-based STRUCTURE and discriminant analysis of principal components (DAPC) exhibited high consistency in the allocation of lines to subpopulations or groups. Nonetheless, they presented some discrepancy and identified the presence of six and five subpopulations or groups, respectively. Principal coordinate analysis revealed more consistency with the subgroups suggested by DAPC analysis. Our results clearly revealed the broad genetic base of TGx (Tropical Glycine max) lines, from which soybean breeders may select parents for crossing, testing and selection of future cultivars with desirable traits for SSA.
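The remark about the bi-allelic nature of SNP markers can be made concrete with a small calculation. The sketch below uses the commonly cited PIC formula and expected heterozygosity (gene diversity) for a marker with two alleles; it is not confirmed to be the exact computation used in the study, and the function names are hypothetical. For two alleles, gene diversity cannot exceed 0.5 and PIC cannot exceed 0.375, which is the upper end of the PIC range reported above.

```python
def expected_heterozygosity(p: float) -> float:
    """Gene diversity of a bi-allelic marker with allele frequencies p and 1 - p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("allele frequency must lie in [0, 1]")
    q = 1.0 - p
    return 1.0 - (p * p + q * q)


def pic_biallelic(p: float) -> float:
    """Polymorphic information content of a bi-allelic marker.

    Uses PIC = 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2,
    which for two alleles reduces to 1 - (p^2 + q^2) - 2 * p^2 * q^2.
    """
    q = 1.0 - p
    return expected_heterozygosity(p) - 2.0 * p * p * q * q


if __name__ == "__main__":
    # Both measures peak when the two alleles are equally frequent.
    for p in (0.05, 0.2, 0.5):
        print(f"p={p:.2f}  He={expected_heterozygosity(p):.3f}  PIC={pic_biallelic(p):.3f}")
```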


Genome ◽  
2005 ◽  
Vol 48 (1) ◽  
pp. 12-17 ◽  
Author(s):  
L D Chaves ◽  
J A Rowe ◽  
K M Reed

Genome characterization and analysis is an imperative step in identifying and selectively breeding for improved traits of agriculturally important species. Expressed sequence tags (ESTs) represent a transcribed portion of the genome and are an effective way to identify genes within a species. Downstream applications of EST projects include DNA microarray construction and interspecies comparisons. In this study, 694 ESTs were sequenced and analyzed from a library derived from a 24-day-old turkey embryo. The 437 unique sequences identified were divided into 76 assembled contigs and 361 singletons. The majority of significant comparative matches occurred between the turkey sequences and sequences reported from the chicken. Whole genome sequence from the chicken was used to identify potential exon–intron boundaries for selected turkey clones and intron-amplifying primers were developed for sequence analysis and single nucleotide polymorphism (SNP) discovery. Identified SNPs were genotyped for linkage analysis on two turkey reference populations. This study significantly increases the number of EST sequences available for the turkey.

Key words: turkey, cDNA, expressed sequence tag, single nucleotide polymorphism.


2021 ◽  
Vol 19 (1) ◽  
pp. 20-28
Author(s):  
Abush Tesfaye Abebe ◽  
Adesike Oladoyin Kolawole ◽  
Nnanna Unachukwu ◽  
Godfree Chigeza ◽  
Hailu Tefera ◽  
...  

Soybean (Glycine max (L.) Merr.) is an important legume crop of high commercial value that is widely cultivated globally. Thus, the genetic characterization of the existing soybean germplasm will provide useful information for enhanced conservation, improvement and future utilization. This study aimed to assess the extent of genetic diversity of soybean elite breeding lines and varieties developed by the soybean breeding programme of the International Institute of Tropical Agriculture (IITA), Ibadan, Nigeria. The genetic diversity of 65 soybean genotypes was studied using single-nucleotide polymorphism (SNP) markers. The results revealed that 2446 alleles were detected, and the indicators for allelic richness and diversity had good differentiating power in assessing the diversity of the genotypes. The three complementary approaches used in the study grouped the germplasm into three major clusters based on genetic relatedness. The analysis of molecular variance revealed that 71% (P < 0.001) of the variation was due to differences among individual genotypes, while 11% (P < 0.001) was ascribed to differences among the three clusters, and the fixation index (FST) was 0.11 for the SNP loci, signifying moderate genetic differentiation among the genotypes. The identified private alleles indicate that the soybean germplasm contains diverse variability that is yet to be exploited. The SNP markers revealed high diversity in the studied germplasm and were found to be efficient for assessing genetic diversity in the crop. These results provide valuable information that breeding programmes might use for assessing the genetic variability of germplasm of soybean and other legume crops.


Plants ◽  
2021 ◽  
Vol 10 (10) ◽  
pp. 2025
Author(s):  
Shyryn Almerekova ◽  
Yuliya Genievskaya ◽  
Saule Abugalieva ◽  
Kazuhiro Sato ◽  
Yerlan Turuspekov

The genetic relationship and population structure of two-rowed barley accessions from Kazakhstan were assessed using single-nucleotide polymorphism (SNP) markers. Two different approaches were employed in the analysis: (1) the accessions from Kazakhstan were compared with barley samples from six different regions around the world using 1955 polymorphic SNPs, and (2) 94 accessions collected from six breeding programs in Kazakhstan were studied using 5636 polymorphic SNPs from a 9K Illumina Infinium assay. In the first approach, the neighbor-joining tree showed that the majority of the accessions from Kazakhstan were grouped in a separate subcluster with a common ancestral node; there was a sister subcluster that comprised mainly barley samples that originated in Europe. Pearson’s correlation analysis suggested that Kazakh accessions were genetically close to samples from Africa and Europe. In the second approach, the application of the STRUCTURE package to the 5636 polymorphic SNPs suggested that Kazakh barley samples consisted of five subclusters in three major clusters. The principal coordinate analysis plot showed that, among six breeding origins in Kazakhstan, the Krasnovodopad (KV) and Karaganda (KA) samples were the most distant groups. The assessment of the pedigrees in the KV and KA samples showed that the hybridization schemes in these breeding stations heavily used accessions from Ethiopia and Ukraine, respectively. The comparative analysis of the KV and KA samples allowed us to identify 214 SNPs with opposite allele frequencies that were tightly linked to 60 genes/gene blocks associated with plant adaptation traits, such as the heading date and plant height. The identified SNP markers can be efficiently used in studies of barley adaptation and deployed in breeding projects to develop new competitive cultivars.


Life Science ◽  
2019 ◽  
Vol 8 (1) ◽  
pp. 54-64
Author(s):  
Mohamad Ikhsan Nurulloh ◽  
Yustinus Ulung Anggraito ◽  
Hidayat Trimarsanto ◽  
Endah Peniati ◽  
R. Susanti

Plasmodium is the pathogen that causes malaria; it has high genetic diversity and resistance to antimalarial drugs. Information on the population structure of Plasmodium can be derived from molecular markers such as single nucleotide polymorphisms (SNPs). SNP markers are numerous, but not all of them are informative. Existing methods have not been effective in producing informative SNPs, so an effective SNP selection method needs to be developed. The SNP selection method was developed using FST as the main filter, combined with linkage disequilibrium (LD). Population structure was inferred from the SNPs using principal component analysis (PCA), principal coordinate analysis (PCoA), pairwise FST, and neighbor-joining population trees. Criteria for informative SNPs were determined by calculating FST and minor allele frequency (MAF). The statistical method was tested to determine its effectiveness in producing informative SNPs, using simulated genetic data for Plasmodium populations. The results show that the statistical method is effective in producing informative SNPs. The informative SNP criteria are SNPs with MAF 0.2-0.4 and FST 0.1-0.4 and 0.8-1.0.
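The reported thresholds can be turned into a simple filter. The sketch below is a hypothetical illustration, not the authors' method: it assumes a bi-allelic SNP described by one allele frequency per population, computes FST as (HT - HS) / HT with equal population weights, takes MAF from the mean frequency across populations, and assumes the MAF and FST criteria are applied jointly. It omits the LD step entirely, and all function names are invented for this example.

```python
from typing import Sequence


def fst_biallelic(pop_freqs: Sequence[float]) -> float:
    """FST = (HT - HS) / HT for one bi-allelic SNP.

    pop_freqs holds the frequency of one allele in each population;
    populations are weighted equally (an assumption of this sketch).
    """
    if not pop_freqs:
        raise ValueError("at least one population is required")
    p_bar = sum(pop_freqs) / len(pop_freqs)
    h_t = 2.0 * p_bar * (1.0 - p_bar)  # heterozygosity of the pooled population
    if h_t == 0.0:
        return 0.0  # monomorphic site: no differentiation to measure
    # Mean within-population heterozygosity.
    h_s = sum(2.0 * p * (1.0 - p) for p in pop_freqs) / len(pop_freqs)
    return (h_t - h_s) / h_t


def is_informative(pop_freqs: Sequence[float]) -> bool:
    """Apply the ranges quoted in the abstract:
    MAF 0.2-0.4 and FST in 0.1-0.4 or 0.8-1.0 (joint use is assumed)."""
    p_bar = sum(pop_freqs) / len(pop_freqs)
    maf = min(p_bar, 1.0 - p_bar)
    fst = fst_biallelic(pop_freqs)
    maf_ok = 0.2 <= maf <= 0.4
    fst_ok = (0.1 <= fst <= 0.4) or (0.8 <= fst <= 1.0)
    return maf_ok and fst_ok


if __name__ == "__main__":
    # Hypothetical allele frequencies in two populations for three SNPs.
    for freqs in ([0.1, 0.5], [0.3, 0.3], [0.0, 1.0]):
        print(freqs, round(fst_biallelic(freqs), 3), is_informative(freqs))
```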


Euphytica ◽  
2010 ◽  
Vol 175 (1) ◽  
pp. 91-107 ◽  
Author(s):  
Jin-kee Jung ◽  
Soung-Woo Park ◽  
Wing Yee Liu ◽  
Byoung-Cheorl Kang
