Population Structure Assessed Using Microsatellite and SNP Data: An Empirical Comparison in West African Cattle

Animals ◽  
2021 ◽  
Vol 11 (1) ◽  
pp. 151
Author(s):  
Isabel Álvarez ◽  
Iván Fernández ◽  
Amadou Traoré ◽  
Nuria A. Menéndez-Arias ◽  
Félix Goyache

A sample of 185 West African cattle belonging to nine different taurine, sanga, and zebu populations was typed using a set of 33 microsatellites and the Illumina BovineHD BeadChip. The information provided by each type of marker was summarized via clustering methods and principal component analyses (PCA). The aim was to assess differences in performance between the two marker types for the identification of population structure and the projection of genetic variability on geographical maps. In general, both microsatellites and single nucleotide polymorphisms (SNPs) allowed us to differentiate taurine cattle from zebu and sanga cattle, which, in turn, would form a single population. Pearson and Spearman correlation coefficients computed among the admixture coefficients (fitting K = 2) and the eigenvectors corresponding to the first two factors identified using PCA on both microsatellite and SNP data were high and statistically significant (most of them having p < 0.0001). However, SNP data allowed for a better fine-scale identification of population structure within taurine cattle: Lagunaire cattle from Benin were separated from two different N’Dama cattle samples. Furthermore, when clustering analyses assumed the existence of only two parental populations (K = 2), the SNPs could distinguish different genetic backgrounds in Lagunaire and N’Dama cattle. Although the two N’Dama cattle populations had very different breeding histories, the microsatellite set could not separate them. Classic bidimensional dispersion plots constructed using factors identified via PCA gave different shapes for microsatellites and SNPs: plots constructed using microsatellite polymorphism would suggest the existence of weakly differentiated, highly intermingled subpopulations. However, the projection of the identified factors on synthetic maps gave comparable images. This would suggest that results on population structuring must be interpreted with caution. The geographic projection of genetic variation on synthetic maps avoids interpretations that go beyond the results obtained, particularly when previous information on the analyzed populations is scant. Factors influencing the performance of the projection of genetic parameters on geographic maps, together with restrictions that may affect the choice of a given type of marker, are discussed.
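To illustrate the kind of comparison described above, the following is a minimal sketch of Pearson and Spearman correlations between per-animal admixture coefficients and first-factor PCA scores. The values in `admixture_q` and `pc1_scores` are made up for illustration and are not data from the study; the function names are likewise hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def ranks(values):
    """1-based ranks; tied values receive the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical per-animal values (assumption, not study data):
# admixture coefficient for one cluster at K = 2, and score on the first PCA factor.
admixture_q = [0.02, 0.05, 0.10, 0.60, 0.85, 0.95]
pc1_scores = [-1.9, -1.7, -1.2, 0.4, 1.8, 2.6]

print("Pearson: ", round(pearson(admixture_q, pc1_scores), 3))
print("Spearman:", round(spearman(admixture_q, pc1_scores), 3))
```

Significance testing (the p-values quoted in the abstract) is left out of this sketch; a statistics package would normally supply it.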

BMC Genomics ◽  
2020 ◽  
Vol 21 (1) ◽  
Author(s):  
N. Z. Gebrehiwot ◽  
E. M. Strucken ◽  
H. Aliloo ◽  
K. Marshall ◽  
J. P. Gibson

Background: Humpless Bos taurus cattle are among the earliest domestic cattle in Africa, followed by the arrival of humped Bos indicus cattle. The diverse indigenous cattle breeds of Africa are derived from these migrations, with most appearing to be hybrids between Bos taurus and Bos indicus. The present study examines the patterns of admixture, diversity, and relationships among African cattle breeds. Methods: Data for ~40 k SNPs were obtained from previous projects for 4089 animals representing 35 African indigenous, 6 European Bos taurus, 4 Bos indicus, and 5 African crossbred cattle populations. Genetic diversity and population structure were assessed using principal component analyses (PCA), admixture analyses, and Wright’s F statistic. The linkage disequilibrium and effective population size (Ne) were estimated for the pure cattle populations. Results: The first two principal components differentiated Bos indicus from European Bos taurus, and African Bos taurus from other breeds. PCA and admixture analyses showed that, except for recently admixed cattle, all indigenous breeds are either pure African Bos taurus or admixtures of African Bos taurus and Bos indicus. The African zebu breeds had the highest proportions of Bos indicus ancestry, ranging from 70 to 90% or 60 to 75%, depending on the admixture model. Other indigenous breeds that were not 100% African Bos taurus ranged from 42 to 70% or 23 to 61% Bos indicus ancestry. The African Bos taurus populations showed substantial genetic diversity, and other indigenous breeds showed evidence of having more than one African taurine ancestor. Ne estimates based on r² and r²adj showed a decline in Ne from a large population at 2000 generations ago, which is surprising for the indigenous breeds given the expected increase in cattle populations over that period and the lack of structured breeding programs.
Conclusion: African indigenous cattle breeds have a large genetic diversity and are either pure African Bos taurus or admixtures of African Bos taurus and Bos indicus. This provides a rich resource of potentially valuable genetic variation, particularly for adaptation traits, and to support conservation programs. It also provides challenges for the development of genomic assays and tools for use in African populations.
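As a worked illustration of a Wright-style F statistic, the sketch below computes F_ST = (H_T − H_S) / H_T for a single biallelic SNP in two populations. It assumes that F_ST is the statistic of interest (the abstract only says "Wright’s F statistic") and uses the simple heterozygosity-based definition rather than any particular estimator the authors may have used. The function name, parameters, and allele frequencies are hypothetical.

```python
def fst_biallelic(p1, p2, n1=1, n2=1):
    """F_ST for one biallelic locus and two populations.

    p1, p2: frequency of the same allele in each population.
    n1, n2: relative sample sizes used as weights (equal by default).
    Uses F_ST = (H_T - H_S) / H_T with expected heterozygosity 2p(1 - p).
    """
    w1 = n1 / (n1 + n2)
    w2 = 1.0 - w1
    p_bar = w1 * p1 + w2 * p2
    h_total = 2.0 * p_bar * (1.0 - p_bar)                          # H_T: pooled population
    h_within = w1 * 2.0 * p1 * (1.0 - p1) + w2 * 2.0 * p2 * (1.0 - p2)  # H_S: mean within
    if h_total == 0.0:
        return 0.0  # locus is monomorphic overall, so there is no differentiation to measure
    return (h_total - h_within) / h_total


# Hypothetical allele frequencies (assumption, not study data).
print(fst_biallelic(0.5, 0.5))   # identical frequencies
print(fst_biallelic(0.9, 0.1))   # strongly differentiated
print(fst_biallelic(1.0, 0.0))   # fixed for alternative alleles
```

Genome-wide values are usually obtained by combining many loci, and sample-size-corrected estimators are generally preferred in practice; this sketch only shows the underlying ratio.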


2016 ◽  
Vol 59 (3) ◽  
pp. 337-344 ◽  
Author(s):  
Amadou Traoré ◽  
Delphin O. Koudandé ◽  
Iván Fernández ◽  
Albert Soudré ◽  
Isabel Álvarez ◽  
...  

Abstract. A total of 183 adult sires belonging to nine West African cattle breeds sampled in 67 villages of Mali, Burkina Faso and Benin were assessed for 16 body measurements and 18 qualitative traits. Within each type of cattle (zebu, sanga or taurine), the different breeds analysed showed large differences in body measurements. In general, taurine breeds had lower average values than the zebu breeds, while sanga cattle tended to have intermediate values. Principal component analysis identified three factors characterising body measurements. Factor 1 summarised the information provided by those traits characterising the size of the individuals and explained 59.0 % of the variability. Factor 2 tended to gather information characterising body width and explained 8.0 % of the variation. Factor 3 was less representative (6.6 % of the variability) and had no clear interpretation. Qualitative traits did not distinguish among cattle groups or breeds. Two Correspondence Analysis dimensions computed on qualitative traits (explaining 26.2 and 15.5 % of the variability, respectively) did not differentiate between zebu, sanga or taurine cattle breeds. Our results confirm that, in the framework of general appearance, body measurements are the main criteria for differentiating West African cattle breeds. Furthermore, the current research did not identify breeding preferences for qualitative type traits in West African cattle sires. Therefore, homogenisation of the appearance of individuals within cattle breeds is not expected.


2020 ◽  
Author(s):  
Léa Boyrie ◽  
Corentin Moreau ◽  
Florian Frugier ◽  
Christophe Jacquet ◽  
Maxime Bonhomme

Abstract. The quest for genome-wide signatures of selection in populations using SNP data has proven efficient at uncovering genes involved in conserved or adaptive molecular functions, but none of the statistical methods were designed to identify interacting genes as targets of selective processes. Here, we propose a straightforward statistical test aimed at detecting epistatic selection, based on a linkage disequilibrium (LD) measure accounting for population structure and heterogeneous relatedness between individuals. SNP-based (Trv) and window-based (TcorPC1v) statistics fit a Student distribution, so the significance of correlation coefficients can be tested easily and quickly in the frame of Genome-Wide Epistatic Selection Scans (GWESS) using candidate genes as baits. As a proof of concept, use of SNP data from the symbiotic legume plant Medicago truncatula uncovered a previously unknown gene coadaptation between the MtSUNN (Super Numeric Nodule) receptor and the MtCLE02 (CLAVATA3-Like) signalling peptide, and experimental evidence accordingly supported a MtSUNN-dependent negative role of MtCLE02 in symbiotic root nodulation. Using human HGDP-CEPH SNP data, our new statistical test uncovered strong LD between SLC24A5 and EDAR worldwide, which persists after correction for population structure and relatedness in Central South Asian populations. This result suggests adaptive genetic interaction or coselection between skin pigmentation and the ectodysplasin pathway involved in the development of ectodermal organs (hairs, teeth, sweat glands) in some human populations. Applying this approach to genome-wide SNP data will foster the identification of evolutionary coadapted gene networks.
Author summary: Population genomic methods have made it possible to identify many genes associated with adaptive processes in populations with complex histories. However, they are not designed to identify coadaptation between genes through epistatic selection in structured populations. To tackle this problem, we developed a straightforward LD-based statistical test accounting for population structure and heterogeneous relatedness between individuals, using SNP-based (Trv) or window-based (TcorPC1v) statistics. This allows the significance of correlation coefficients between polymorphic loci to be tested easily and quickly in the frame of Genome-Wide Epistatic Selection Scans (GWESS). Following detection of gene coadaptation using SNP data from humans and the model plant Medicago truncatula, we report experimental evidence of genetic interaction between two receptors involved in the regulation of root nodule symbiosis in Medicago truncatula. This test opens new avenues for exploring the evolution of genes as interacting units and thus paves the way to inferring new networks based on evolutionary coadaptation between genes.
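The abstract does not define Trv or TcorPC1v, so the sketch below is only the textbook Student t statistic for a plain Pearson correlation coefficient, t = r·sqrt((n − 2) / (1 − r²)) with n − 2 degrees of freedom. It is not the authors' structure- and relatedness-corrected statistic. It is included to show the general idea of testing a correlation between two loci against a Student distribution; the function name and the example numbers are hypothetical.

```python
from math import sqrt


def correlation_t_statistic(r, n):
    """Textbook t statistic for a Pearson correlation r from n observations.

    Under the null hypothesis of no correlation, t follows a Student
    distribution with n - 2 degrees of freedom. No correction for
    population structure or relatedness is applied here.
    """
    if n <= 2:
        raise ValueError("need more than two observations")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    return r * sqrt((n - 2) / (1.0 - r * r))


# Hypothetical example: genotype correlation r = 0.5 between two SNPs in 27 individuals.
t_value = correlation_t_statistic(0.5, 27)
print(round(t_value, 4), "with", 27 - 2, "degrees of freedom")
```

Turning `t_value` into a p-value requires the Student cumulative distribution function, which a statistics library would normally provide.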


2020 ◽  
Vol 11 ◽  
Author(s):  
Xaviera Alejandra López-Cortés ◽  
Felipe Matamala ◽  
Carlos Maldonado ◽  
Freddy Mora-Poblete ◽  
Carlos Alberto Scapim

Analysis of population genetic variation and structure is a common practice in genome-wide studies, including association mapping, ecology, and evolution studies in several crop species. In this study, two machine learning (ML) clustering methods, K-means (KM) and hierarchical clustering (HC), were combined with non-linear and linear dimensionality reduction techniques, deep autoencoder (DeepAE) and principal component analysis (PCA), respectively, to infer population structure and individual assignment of maize inbred lines, i.e., dent field corn (n = 97) and popcorn (n = 86). The results revealed that the HC method in combination with DeepAE-based data preprocessing (DeepAE-HC) was the most effective method for assigning individuals to clusters (with 96% of correct individual assignments), whereas DeepAE-KM, PCA-HC, and PCA-KM correctly assigned 92, 89, and 81% of the lines, respectively. These findings were consistent with both the Silhouette Coefficient (SC) and Davies–Bouldin validation indexes. Notably, DeepAE-HC also had better accuracy than the Bayesian clustering method implemented in InStruct. The results of this study showed that deep learning (DL)-based dimensional reduction combined with ML clustering methods is a useful tool to determine genetically differentiated groups and to assign individuals to subpopulations in genome-wide studies without having to consider previous genetic assumptions.
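The Silhouette Coefficient mentioned above can be sketched in a few lines. This is a minimal, standard-library illustration of the usual definition (mean over points of (b − a) / max(a, b), where a is the mean distance to the point's own cluster and b the mean distance to the nearest other cluster), using Euclidean distance. The points and labels are made up and stand in for lines projected onto two reduced dimensions; they are not data from the study.

```python
from math import dist


def silhouette_coefficient(points, labels):
    """Mean silhouette width of a labelled set of points (Euclidean distance)."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("at least two clusters are required")

    widths = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            widths.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster
        # (the point's zero distance to itself is in the sum but not the count).
        a = sum(dist(point, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(dist(point, q) for q in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        largest = max(a, b)
        widths.append((b - a) / largest if largest > 0 else 0.0)
    return sum(widths) / len(widths)


# Hypothetical 2-D coordinates (assumption): two tight, well-separated groups.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good_labels = [0, 0, 1, 1]   # matches the visible grouping
bad_labels = [0, 1, 0, 1]    # mixes the two groups

print(silhouette_coefficient(points, good_labels))
print(silhouette_coefficient(points, bad_labels))
```

A value near 1 indicates compact, well-separated clusters, while values near or below 0 indicate overlapping or misassigned points, which is why the index can be used to compare clustering pipelines on the same data.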


2020 ◽  
Vol 14 (8) ◽  
pp. 2782-2798
Author(s):  
Arnaud Stéphane Rayangnéwêndé Tapsoba ◽  
Bernadette Yougbaré ◽  
Fabiola Gnine Traoré ◽  
Félicienne Béré ◽  
Dominique Ouédraogo ◽  
...  

A total of 769 adult females belonging to three taurine and one zebu cattle populations sampled in three provinces of Burkina Faso were assessed for 19 body measurements during two different years (2014 and 2018). The aim of this research was to identify temporal morphological variation in cattle bred in the humid southern zones to obtain empirical evidence supporting a possible introgression of zebu cattle genes into Gourounsi and Lobi taurine cattle breeds. Zebu cattle samples were used as an out-group for both the 2014 and 2018 subsets. Least square means of body measurements classified Burkina Faso taurine cattle into three subgroups according to body size (Gourounsi-Sanguié, GourS; Gourounsi-Nahouri, GourN; and Lobi, from the tallest to the smallest, respectively). Principal Component Analysis suggested that in 2014 taurine populations were structured. The dispersion map constructed using the first two factors showed that the GourS population was well separated from both the Lobi and the GourN, which, in turn, overlapped. However, in 2018 a strong signal of homogenization was identified, with GourN partially overlapping the other two populations. Linear Discriminant Analysis suggested that about 20% of both GourS and GourN individuals were reciprocally misclassified. MANOVA provided clues to this increase. Although in 2014 Lobi cattle were clearly smaller than Gourounsi and both GourS and GourN populations showed clear differences in body traits, in 2018 an increase in size in Lobi cattle and a strong homogenization signal within Gourounsi cattle were observed. Zebu cattle gene flow southwards in Burkina Faso is likely to have caused these changes, suggesting a fast erosion of the taurine cattle genetic background. Keywords: Body traits, quantitative traits, Gourounsi cattle, Lobi, Burkina Faso.


2018 ◽  
Vol 21 (2) ◽  
pp. 125-137
Author(s):  
Jolanta Stasiak ◽  
Marcin Koba ◽  
Marcin Gackowski ◽  
Tomasz Baczek

Aim and Objective: In this study, chemometric methods such as correlation analysis, cluster analysis (CA), principal component analysis (PCA), and factor analysis (FA) were used to reduce the number of chromatographic parameters (logk/logkw) and various (e.g., 0D, 1D, 2D, 3D) structural descriptors for three different groups of drugs (12 analgesic drugs, 11 cardiovascular drugs and 36 “other” compounds) and, in particular, to select the most important of them. Material and Methods: All chemometric analyses were carried out, presented graphically and discussed for each group of drugs. First, the compounds’ structural and chromatographic parameters were correlated. The best correlation coefficients were R = 0.93, R = 0.88, and R = 0.91 for cardiovascular drugs, analgesic drugs, and the 36 “other” compounds, respectively. Next, part of the molecular and HPLC experimental data from each group of drugs was submitted to FA/PCA and CA techniques. Results: In almost all FA or PCA results, the total data variance of the analyzed parameters (experimental and calculated) explained by the first two or three factors was 84.28%, 76.38%, and 69.71% for cardiovascular drugs, analgesic drugs and the 36 “other” compounds, respectively. Compound clusters obtained by the CA method had characteristics similar to those obtained by FA/PCA. In our paper, the statistical classification of these drugs is characterized and discussed in terms of their molecular structure and pharmacological activity. Conclusion: The proposed QSAR strategy with a reduced number of parameters could be a useful starting point for further statistical analysis, as well as a support for designing new drugs and predicting their possible activity.


2021 ◽  
Vol 13 (11) ◽  
pp. 2125
Author(s):  
Bardia Yousefi ◽  
Clemente Ibarra-Castanedo ◽  
Martin Chamberland ◽  
Xavier P. V. Maldague ◽  
Georges Beaudoin

Clustering methods unequivocally show considerable influence on many recent algorithms and play an important role in hyperspectral data analysis. Here, we challenge clustering for mineral identification using two different strategies in the hyperspectral long wave infrared (LWIR, 7.7–11.8 μm). To that end, we compare two algorithms that perform the mineral identification on a single dataset. The first algorithm uses spectral comparison techniques for all the pixel-spectra and creates RGB false color composites (FCC). Then, a color-based clustering is used to group the regions (called FCC-clustering). The second algorithm clusters all the pixel-spectra to directly group the spectra. Then, the first rank of non-negative matrix factorization (NMF) extracts the representative of each cluster and compares the results with the JPL/NASA spectral library. These techniques give the comparison values as features, which are converted into RGB-FCC as the results (called clustering-rank1-NMF). We applied K-means as the clustering approach, which can be replaced by any other similar clustering approach. The results of the clustering-rank1-NMF algorithm indicate significant computational efficiency (more than 20 times faster than the previous approach) and promising performance for mineral identification, with average accuracies of up to 75.8% and 84.8% for the FCC-clustering and clustering-rank1-NMF algorithms (using spectral angle mapper (SAM)), respectively. Furthermore, several spectral comparison techniques are also used for both algorithms, such as adaptive matched subspace detector (AMSD), orthogonal subspace projection (OSP) algorithm, principal component analysis (PCA), local matched filter (PLMF), SAM, and normalized cross correlation (NCC), and most of them show a similar range in accuracy. However, SAM and NCC are preferred due to their computational simplicity. Our algorithms strive to identify eleven different mineral grains (biotite, diopside, epidote, goethite, kyanite, scheelite, smithsonite, tourmaline, pyrope, olivine, and quartz).
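The computational simplicity of the spectral angle mapper (SAM) can be seen in a short sketch: the angle between a pixel spectrum and a reference spectrum is the arccosine of their normalized dot product, and the reference with the smallest angle is taken as the match. The two-entry `library` and its four-band spectra below are invented for illustration; they are not values from the JPL/NASA spectral library, and the function names are hypothetical.

```python
from math import acos, sqrt


def spectral_angle(spectrum, reference):
    """Angle in radians between two spectra treated as vectors (SAM)."""
    dot = sum(s * r for s, r in zip(spectrum, reference))
    norm = sqrt(sum(s * s for s in spectrum)) * sqrt(sum(r * r for r in reference))
    if norm == 0.0:
        raise ValueError("spectra must not be all zeros")
    # Clamp to [-1, 1] to guard against floating-point overshoot before acos.
    cosine = max(-1.0, min(1.0, dot / norm))
    return acos(cosine)


def best_match(spectrum, library):
    """Name of the library entry with the smallest spectral angle."""
    return min(library, key=lambda name: spectral_angle(spectrum, library[name]))


# Hypothetical four-band reference spectra (assumption, not real library values).
library = {
    "quartz": [0.10, 0.80, 0.30, 0.20],
    "olivine": [0.60, 0.20, 0.50, 0.70],
}

# A pixel that is a brighter copy of the "quartz" shape plus a little noise.
pixel = [0.21, 1.58, 0.61, 0.42]
print(best_match(pixel, library))
```

Because the angle depends only on the shape of the spectrum and not on its overall magnitude, a uniformly brighter or darker pixel yields the same angle against a given reference.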


Horticulturae ◽  
2021 ◽  
Vol 7 (7) ◽  
pp. 165
Author(s):  
Allan Waniale ◽  
Rony Swennen ◽  
Settumba B. Mukasa ◽  
Arthur K. Tugume ◽  
Jerome Kubiriba ◽  
...  

Seed set in banana is influenced by weather, yet the key weather attributes and the critical period of influence are unknown. We therefore investigated the influence of weather during floral development to gain a better perspective on increasing seed set. Three East African highland cooking bananas (EAHBs) were pollinated with the pollen-fertile wild banana ‘Calcutta 4’. At full maturity, bunches were harvested and ripened, and seeds were extracted from the fruit pulp. Pearson’s correlation analysis was then conducted between seed set per 100 fruits per bunch and weather attributes at 15-day intervals from 105 days before pollination (DBP) to 120 days after pollination (DAP). Seed set was positively correlated with average temperature (P < 0.05–P < 0.001, r = 0.196–0.487) and negatively correlated with relative humidity (RH) (P < 0.05–P < 0.001, r = −0.158–−0.438) between 75 DBP and the time of pollination. After pollination, average temperature was negatively correlated with seed set in ‘Mshale’ and ‘Nshonowa’ from 45 to 120 DAP (P < 0.05–P < 0.001, r = −0.213–−0.340). Correlation coefficients were highest at 15 DBP for ‘Mshale’ and ‘Nshonowa’, whereas for ‘Enzirabahima’, the highest were at the time of pollination. Maximum temperature at the time of pollination, as revealed by principal component analysis, should be the main focus for increasing seed set.


2021 ◽  
Vol 13 (4) ◽  
pp. 2292
Author(s):  
Aneta Ptak-Chmielewska ◽  
Agnieszka Chłoń-Domińczak

Micro, small and medium enterprises (MSMEs) represent more than 99% of enterprises in Europe. Therefore, knowledge about this sector, including its spatial context, is important for understanding the patterns of economic and social development. The main goal of this article is to analyze the spatial conditions and the situation of MSMEs on a local level using combined sources of information. These include data collected by the Social Insurance Institution and tax registers in Poland, which provide information on the employment, wages, revenues and taxes paid by MSMEs on a local level, as well as contextual statistical information. The data are used for a diagnosis of spatial circumstances and a discussion of conditions influencing the status of the MSME sector in a selected region (voivodeship) in Poland. Taxonomy methods, including factor analysis and clustering methods based on k-means and Kohonen self-organizing maps (SOM), were used for selecting significant information and grouping the local units according to the situation of the MSMEs. Principal component analysis revealed eight factors, and five clusters of local units were distinguished using these factors. These include two clusters with a high share of rural local units and two clusters with a high share of rural-urban and urban local units. Additionally, there was an outlying cluster with only two dominant urban local units. The factors show differences between clusters in the situation of the MSME sector and infrastructure. Different spatial conditions in different regions influence the situation of MSMEs.


Author(s):  
Timothy Jinam ◽  
Yosuke Kawai ◽  
Yoichiro Kamatani ◽  
Shunro Sonoda ◽  
Kanro Makisumi ◽  
...  

Abstract. The “Dual Structure” model of the formation of the modern Japanese population assumes that the indigenous hunter-gatherer population (symbolized as Jomon people) admixed with a rice-farming population (symbolized as Yayoi people) who migrated from the Asian continent after the Yayoi period started. The Jomon component remained high in both Ainu and Okinawa people, who mainly reside in northern and southern Japan, respectively, while the Yayoi component is higher in the mainland Japanese (Yamato people). The model has been well supported by genetic data, but the Yamato population was mostly represented by people from the Tokyo area. We generated new genome-wide SNP data using the Japonica Array for 45 individuals in Izumo City of Shimane Prefecture and for 72 individuals in Makurazaki City of Kagoshima Prefecture in Southern Kyushu, and compared these data with those of other human populations in East Asia, including BioBank Japan data. Using principal component analysis, phylogenetic network, and f4 tests, we found that Izumo, Makurazaki, and Tohoku populations are slightly differentiated from those of the Kanto (including Tokyo), Tokai, and Kinki regions. These results suggest that the substructure within Mainland Japanese may be caused by multiple migration events from the Asian continent following the Jomon period, and we propose a modified version of the “Dual Structure” model called the “Inner-Dual Structure” model.

