Unraveling city-specific signature and identifying sample origin locations for the data from CAMDA MetaSUB challenge

2021 ◽  
Vol 16 (1) ◽  
Author(s):  
Runzhi Zhang ◽  
Alejandro R. Walker ◽  
Susmita Datta

Abstract
Background: The composition of microbial communities can be location-specific, and the differing abundance of taxa at each location can help reveal city-specific signatures and predict where a sample originated. In this study, whole genome shotgun (WGS) metagenomics data from samples across 16 cities around the world, together with samples from another 8 cities, were provided as the main and mystery datasets, respectively, as part of the CAMDA 2019 MetaSUB “Forensic Challenge”. Feature selection, normalization, three machine learning methods, PCoA (Principal Coordinates Analysis) and ANCOM (Analysis of Composition of Microbiomes) were applied to both the main and mystery datasets.
Results: Feature selection, combined with the machine learning methods, revealed that the combination of the common features was effective for predicting the origin of the samples. The three machine learning methods gave average error rates of 11.93% and 30.37% for the main and mystery datasets, respectively. When samples from the main dataset were used to predict the labels of samples from the mystery dataset, nearly 89.98% of the test samples were correctly labeled as “mystery” samples. PCoA showed that nearly 60% of the total variability of the data could be explained by the first two PCoA axes. Although many cities overlapped, some cities were separated in the PCoA. The results of ANCOM, combined with the importance scores from the Random Forest (RF), indicated that the common “family” and “order” features of the main dataset and the common “order” features of the mystery dataset were the most informative for prediction.
Conclusions: The classification results suggested that the composition of the microbiomes was distinctive across the cities and could be used to identify sample origins. This was also supported by the results from ANCOM and the importance scores from the RF. In addition, the accuracy of the prediction could be improved with more samples and greater sequencing depth.
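The PCoA step mentioned in the abstract can be illustrated with a minimal sketch. It assumes NumPy is available and uses Bray-Curtis dissimilarity on a small made-up abundance table. The abstract does not state which distance or software the authors used, so the distance choice, the function names `bray_curtis` and `pcoa`, and the toy data are all assumptions for illustration only.

```python
import numpy as np


def bray_curtis(counts):
    """Pairwise Bray-Curtis dissimilarity between the rows of a non-negative matrix."""
    counts = np.asarray(counts, dtype=float)
    n = counts.shape[0]
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            num = np.abs(counts[i] - counts[j]).sum()
            den = (counts[i] + counts[j]).sum()
            dist[i, j] = dist[j, i] = num / den if den > 0 else 0.0
    return dist


def pcoa(dist):
    """Classical PCoA on a square distance matrix.

    Returns the sample coordinates and the share of variability carried by
    each retained axis. Axes with non-positive eigenvalues (which can occur
    for non-Euclidean distances such as Bray-Curtis) are dropped, and the
    shares are computed over the retained axes only.
    """
    dist = np.asarray(dist, dtype=float)
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    # Double-centre the squared distances to obtain a Gram matrix.
    gram = -0.5 * centering @ (dist ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals > 1e-12
    eigvals, eigvecs = eigvals[keep], eigvecs[:, keep]
    coords = eigvecs * np.sqrt(eigvals)
    explained = eigvals / eigvals.sum()
    return coords, explained


# Hypothetical abundance table: 4 samples (rows) by 3 taxa (columns).
toy_counts = [[10, 0, 5], [8, 1, 6], [0, 12, 3], [1, 10, 4]]
coords, explained = pcoa(bray_curtis(toy_counts))
print("share of variability on the first two axes:", explained[:2].sum())
```

A statement such as "nearly 60% of the total variability is explained by the first two axes" corresponds to the sum of the first two entries of `explained` in this sketch.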

2020 ◽  
Author(s):  
Runzhi Zhang ◽  
Alejandro R. Walker ◽  
Susmita Datta

Abstract
Background: The composition of microbial communities can be location-specific, and the differing abundance of taxa at each location can help reveal city-specific signatures and predict where a sample originated. In this study, whole genome shotgun (WGS) metagenomics data from samples across 16 cities around the world, together with samples from another 8 cities, were provided as the main and mystery datasets, respectively, as part of the CAMDA 2019 MetaSUB “Forensic Challenge”. Feature selection, normalization, three machine learning methods, PCoA (Principal Coordinates Analysis) and ANCOM (Analysis of Composition of Microbiomes) were applied to both the main and mystery datasets.
Results: Feature selection, combined with the machine learning methods, revealed that the combination of the common features was effective for predicting the origin of the samples. The three machine learning methods gave average error rates of 11.6% and 30.0% for the main and mystery datasets, respectively. When samples from the main dataset were used to predict the labels of samples from the mystery dataset, nearly 89.98% of the test samples were correctly labeled as “mystery” samples. PCoA showed that nearly 60% of the total variability of the data could be explained by the first two PCoA axes. Although many cities overlapped, some cities were separated in the PCoA. The results of ANCOM, combined with the importance scores from the Random Forest (RF), indicated that the common “family” and “order” features of the main dataset and the common “order” features of the mystery dataset were the most informative for prediction.
Conclusions: The classification results suggested that the composition of the microbiomes was distinctive across the cities, which was also supported by the results from ANCOM and the importance scores from the RF. The analysis used in this study can help forensic science predict the origin of samples efficiently. The accuracy of the prediction could be improved with more samples and greater sequencing depth.
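The "average error rate of three machine learning methods" reported above can be read as the mean of the per-method misclassification rates. The sketch below shows that arithmetic under this assumption. The method names, city labels and predictions are hypothetical placeholders, not data from the study, and the abstract does not say exactly how the authors averaged.

```python
def error_rate(true_labels, predicted_labels):
    """Fraction of samples whose predicted label differs from the true label."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    if not true_labels:
        raise ValueError("at least one sample is required")
    wrong = sum(t != p for t, p in zip(true_labels, predicted_labels))
    return wrong / len(true_labels)


def average_error_rate(true_labels, predictions_by_method):
    """Mean error rate across methods, plus the per-method rates."""
    rates = {
        name: error_rate(true_labels, preds)
        for name, preds in predictions_by_method.items()
    }
    return sum(rates.values()) / len(rates), rates


# Hypothetical city labels and predictions from three unnamed methods.
true_cities = ["A", "A", "B", "B", "C"]
predictions = {
    "method_a": ["A", "A", "B", "C", "C"],  # one of five wrong
    "method_b": ["A", "B", "B", "C", "C"],  # two of five wrong
    "method_c": ["A", "A", "B", "B", "C"],  # none wrong
}
mean_rate, per_method = average_error_rate(true_cities, predictions)
print(per_method, mean_rate)
```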


2021 ◽  
Vol 5 (Supplement_2) ◽  
pp. 1164-1164
Author(s):  
J Philip Karl ◽  
Nicholes Armstrong ◽  
Patrick Radcliffe ◽  
Holly McClung

Abstract
Objectives: The fecal metabolome provides a functional readout of interactions between host, diet and the gut microbiota that may help identify gut microbiota-derived compounds associated with health outcomes. This study aimed to determine associations between inter-individual variability in gut microbiota composition, diet-induced changes in the fecal metabolome and gastrointestinal symptoms in adults consuming a diet consisting solely of military rations.
Methods: Secondary analysis of a randomized-controlled trial in which 54 healthy adults (32 ± 14 yr, BMI 26 ± 3 kg/m²) were randomly assigned to consume their usual diet (Control) or a provided diet of Meal, Ready-to-Eat military rations (MRE) for 3 wk. Fecal microbiota composition was measured by 16S rRNA sequencing and the fecal metabolome by untargeted UPLC-MS/MS at baseline and post-intervention. Self-reported gastrointestinal symptoms were measured weekly using the Irritable Bowel Severity Scoring System (IBSSS).
Results: Principal coordinates analysis of baseline gut microbiota composition separated MRE participants into two clusters determined primarily by the ratio of Bacteroides to Prevotella (HIGH (n = 17) or LOW (n = 10)). Random Forest classification of changes in the fecal metabolome within Control, HIGH, and LOW produced error rates of 7%, 18% and 100%, respectively, suggesting a more discriminant metabolome response in HIGH than LOW. Between-group differences in 153 metabolites were detected by ANOVA (FDR < 0.20). Among those, 39 identified and 20 unidentified metabolites demonstrated an association with the gut microbiota (HIGH vs. LOW, P < 0.05). Compounds within xenobiotic, peptide/amino acid, and lipid metabolism pathways comprised 29 of the microbiota-associated metabolites. Changes in microbiota-associated metabolites were not correlated with changes in IBSSS scores.
Conclusions: Changes in the fecal metabolome of individuals consuming a short-term military ration diet are associated with inter-individual variability in gut microbiota composition, but changes in microbiota-associated fecal metabolites do not appear to impact gastrointestinal symptoms.
Funding Sources: Military Operational Medicine Research Program. Disclaimer: Authors’ views do not reflect official DoD or Army policy.


Author(s):  
Yutao Huang ◽  
Xin Liu ◽  
Dongdong Cao ◽  
Guang Chen ◽  
Sujuan Li ◽  
...  

Background: Expressed sequence tag-derived simple sequence repeats (EST-SSRs) offer an important approach for investigating plant genetic diversity. Methods: A total of seventy polymorphic common bean EST-SSRs were used to assess genetic diversity among 19 hyacinth bean, 20 pea and 21 soybean accessions. Genetic statistics and principal coordinates analysis (PCoA) were computed with GenAlEx 6.5. Results: The transferability rates of common bean EST-SSRs in hyacinth bean, pea and soybean were 27.1%, 20.0% and 21.4%, and the ratios of polymorphic SSR markers in these legumes were 42.1%, 85.71% and 100.0%, respectively. The hyacinth bean, pea and soybean accessions could be assigned to three distinct clusters, with germplasm types depending largely on geographic distribution. The present results revealed that the common bean EST-SSRs are highly transferable to hyacinth bean, pea and soybean. Moreover, these transferable markers would provide a set of inexpensive and effective tools for future research on molecular breeding, taxonomy and comparative mapping.

