A statistical appraisal of disproportional versus proportional microbial source tracking libraries

2007, Vol 5 (4), pp. 503-509
Author(s): Brian J. Robinson, Kerry J. Ritter, R. D. Ellender

Library-based microbial source tracking (MST) can assist in reducing or eliminating fecal pollution in waters by predicting sources of fecal-associated bacteria. Library-based MST relies on an assembly of genetic or phenotypic “fingerprints” from pollution-indicative bacteria cultivated from known sources to compare with and identify fingerprints of unknown origin. The success of the library-based approach depends on how well each source candidate is represented in the library and which statistical algorithm or matching criterion is used to match unknowns. Because known source libraries are often built based on convenience or cost, some library sources may contain more representation than others. Depending on the statistical algorithm or matching criteria, predictions may become severely biased toward classifying unknowns into the library's dominant source category. We examined prediction bias for four of the most commonly used statistical matching algorithms in library-based MST when applied to disproportionately represented known source libraries: maximum similarity (MS), average similarity (AS), discriminant analysis (DA), and k-nearest neighbor (k-NN). MS was particularly sensitive to disproportionate source representation. AS and DA were more robust. k-NN provided a compromise between correct prediction and sensitivity to disproportional libraries, including increased matching success and stability, and should be considered when matching to disproportionally represented libraries.
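The three similarity-based matching rules named in this abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy library, the band positions, the use of Jaccard similarity, and the function names are hypothetical and are not taken from the study.

```python
from collections import Counter, defaultdict

def similarity(a, b):
    """Jaccard similarity between two fingerprints given as sets of band positions."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def maximum_similarity(unknown, library):
    """MS: take the source of the single most similar library isolate."""
    source, _ = max(library, key=lambda item: similarity(unknown, item[1]))
    return source

def average_similarity(unknown, library):
    """AS: take the source whose isolates are most similar on average."""
    scores = defaultdict(list)
    for source, fingerprint in library:
        scores[source].append(similarity(unknown, fingerprint))
    return max(scores, key=lambda s: sum(scores[s]) / len(scores[s]))

def k_nearest_neighbor(unknown, library, k=3):
    """k-NN: majority vote among the k most similar library isolates."""
    ranked = sorted(library, key=lambda item: similarity(unknown, item[1]), reverse=True)
    votes = Counter(source for source, _ in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical, deliberately disproportionate library: 4 cow isolates, 2 human.
LIBRARY = [
    ("human", {1, 2, 3, 4}),
    ("human", {1, 2, 3, 5}),
    ("cow", {1, 2, 3, 6, 7}),   # an atypical cow isolate that resembles human ones
    ("cow", {6, 7, 8, 9}),
    ("cow", {6, 7, 8, 10}),
    ("cow", {7, 8, 9, 10}),
]
UNKNOWN = {1, 2, 3, 6}

if __name__ == "__main__":
    print("MS  :", maximum_similarity(UNKNOWN, LIBRARY))
    print("AS  :", average_similarity(UNKNOWN, LIBRARY))
    print("k-NN:", k_nearest_neighbor(UNKNOWN, LIBRARY, k=3))
```

In this contrived library a single atypical isolate from the over-represented source decides the MS assignment, while AS and k-NN draw on more than one isolate. This only shows how the rules differ mechanically; it does not reproduce any result from the study.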

2003, Vol 1 (4), pp. 209-223
Author(s): Kerry J. Ritter, Ethan Carruthers, C. Andrew Carson, R. D. Ellender, Valerie J. Harwood, ...

Several commonly used statistical methods for fingerprint identification in microbial source tracking (MST) were examined to assess the effectiveness of pattern-matching algorithms to correctly identify sources. Although numerous statistical methods have been employed for source identification, no widespread consensus exists as to which is most appropriate. A large-scale comparison of several MST methods, using identical fecal sources, presented a unique opportunity to assess the utility of several popular statistical methods. These included discriminant analysis, nearest neighbour analysis, maximum similarity and average similarity, along with several measures of distance or similarity. Threshold criteria for excluding uncertain or poorly matched isolates from final analysis were also examined for their ability to reduce false positives and increase prediction success. Six independent libraries used in the study were constructed from indicator bacteria isolated from fecal materials of humans, seagulls, cows and dogs. Three of these libraries were constructed using the rep-PCR technique and three relied on antibiotic resistance analysis (ARA). Five of the libraries were constructed using Escherichia coli and one using Enterococcus spp. (ARA). Overall, the outcome of this study suggests a high degree of variability across statistical methods. Despite large differences in correct classification rates among the statistical methods, no single statistical approach emerged as superior. Thresholds failed to consistently increase rates of correct classification and improvement was often associated with substantial effective sample size reduction. Recommendations are provided to aid in selecting appropriate analyses for these types of data.
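The trade-off described here, where a threshold excludes poorly matched isolates at the cost of effective sample size, can be sketched as follows. The result tuples and the threshold value are hypothetical illustration data, not figures from the study.

```python
def classification_summary(results, threshold=None):
    """Summarise correct classification after optional threshold exclusion.

    `results` holds (true_source, predicted_source, best_similarity) tuples.
    Isolates whose best similarity falls below `threshold` are excluded.
    """
    kept = [r for r in results if threshold is None or r[2] >= threshold]
    if not kept:
        return {"retained": 0, "rate_correct": None}
    correct = sum(1 for true, predicted, _ in kept if true == predicted)
    return {"retained": len(kept), "rate_correct": correct / len(kept)}

# Hypothetical blind-test results (assumed values for illustration only).
RESULTS = [
    ("human", "human", 0.95),
    ("human", "cow", 0.55),
    ("cow", "cow", 0.90),
    ("dog", "seagull", 0.60),
    ("seagull", "seagull", 0.85),
    ("dog", "dog", 0.70),
]

if __name__ == "__main__":
    print(classification_summary(RESULTS))
    print(classification_summary(RESULTS, threshold=0.8))
```

In this toy data the threshold raises the rate of correct classification while halving the number of retained isolates. The abstract reports that such gains were not consistent across the real libraries, so the sketch shows the bookkeeping only, not an expected outcome.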


2010, Vol 62 (3), pp. 586-593
Author(s): P. Roslev, A. S. Bukh, L. Iversen, H. Sønderbo, N. Iversen

Sources of faecal pollution in coastal recreational waters may be identified by analysing different host-associated microorganisms or molecular markers. However, the microbial targets are often present at low numbers in moderately impacted waters, and often exhibit significant temporal and spatial variability in waters with fluctuating faecal loads. This patchy occurrence can limit successful detection of relevant targets in microbial source tracking studies. In this study, we explored the possibility of using the blue mussel (Mytilus edulis) as a biosampler for accumulation of faecal bacteria relevant for microbial source tracking. Non-contaminated blue mussels were transferred to three coastal recreational waters affected by faecal pollution of unknown origin. Molecular markers associated with animal and human waste were targeted by PCR and compared in seawater and mussel samples. The results demonstrated that transplanted mussels in simple enclosures accumulated and retained elevated levels of molecular markers associated with different types of faecal pollution. The targets included a novel putative human-associated E. coli subgroup B2 VIII clone, and animal- and human-associated markers in enterococci (esp, M19, M66, M90, and M91). Human (sewage) associated markers including esp and M66 were sometimes not detectable in seawater samples despite known wastewater contamination, whereas the markers were detectable in mussels. We suggest that transplanted mussels should be considered as potential biosamplers in studies focusing on identifying sources of faecal pollution in low or moderately impacted recreational waters. Bioaccumulation of molecular markers in mussels for several days may represent the water quality better than traditional grab samples from the water column.


2005, Vol 71 (1), pp. 512-518
Author(s): Wail M. Hassan, Shiao Y. Wang, Rudolph D. Ellender

The goal of the study was to determine which similarity coefficient and statistical method to use to produce the highest rate of correct assignment (RCA) in repetitive extragenic palindromic PCR-based bacterial source tracking. In addition, the use of standards for deciding whether to accept or reject source assignments was investigated. The use of curve-based coefficients Cosine Coefficient and Pearson's Product Moment Correlation yielded higher RCAs than the use of band-based coefficients Jaccard, Dice, Jeffrey's x, and Ochiai. When enterococcal and Escherichia coli isolates from known sources were used in a blind test, the use of maximum similarity produced consistently higher RCAs than the use of average similarity. We also found that the use of a similarity value threshold and/or a quality factor threshold (the ratio of the average fingerprint similarity within a source to the average similarity of this source's isolates to an unknown) to decide whether to accept source assignments of unknowns increases the reliability of source assignments. Applying a similarity value threshold improved the overall RCA (ORCA) by 15 to 27% when enterococcal fingerprints were used and 8 to 29% when E. coli fingerprints were used. Applying the quality factor threshold resulted in a 22 to 32% improvement in the ORCA, depending on the fingerprinting technique used. This increase in reliability was, however, achieved at the expense of decreased numbers of isolates that were assigned a source.
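The distinction between band-based and curve-based coefficients, and the quality factor ratio defined in the abstract, can be sketched with their standard textbook formulas. The input encodings (0/1 band vectors, lists of densitometric intensities) and the function names are assumptions for illustration; the abstract does not specify how the fingerprints were encoded or which cut-off values were applied.

```python
import math
from statistics import mean

def jaccard(a, b):
    """Band-based: shared bands / bands present in either 0/1 fingerprint."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 1.0

def dice(a, b):
    """Band-based: 2 * shared bands / (bands in a + bands in b)."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    total = sum(1 for x in a if x) + sum(1 for y in b if y)
    return 2 * both / total if total else 1.0

def cosine(u, v):
    """Curve-based: cosine of the angle between two intensity curves."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0

def pearson(u, v):
    """Curve-based: Pearson product-moment correlation of two intensity curves."""
    mu, mv = mean(u), mean(v)
    cov = sum((x - mu) * (y - mv) for x, y in zip(u, v))
    su = math.sqrt(sum((x - mu) ** 2 for x in u))
    sv = math.sqrt(sum((y - mv) ** 2 for y in v))
    return cov / (su * sv) if su and sv else 0.0

def quality_factor(source_fingerprints, unknown, sim):
    """Ratio of average within-source similarity to the average similarity
    between the source's isolates and the unknown.

    Needs at least two source fingerprints and a non-zero similarity
    to the unknown; no accept/reject cut-off is applied here.
    """
    pairs = [(a, b) for i, a in enumerate(source_fingerprints)
             for b in source_fingerprints[i + 1:]]
    within = mean(sim(a, b) for a, b in pairs)
    to_unknown = mean(sim(fp, unknown) for fp in source_fingerprints)
    return within / to_unknown

if __name__ == "__main__":
    a, b = [1, 1, 0, 1, 0], [1, 0, 0, 1, 1]   # hypothetical band patterns
    print(jaccard(a, b), dice(a, b))
    print(cosine([1, 2, 3], [2, 4, 6]), pearson([1, 2, 3], [3, 2, 1]))
    print(quality_factor([a, a], b, jaccard))
```

Band-based coefficients see only the presence or absence of bands, whereas curve-based coefficients compare the whole intensity profile, which is one plausible reading of why the two families can rank matches differently.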


2021, Vol 232 (2)
Author(s): Meriane Demoliner, Juliana Schons Gularte, Viviane Girardi, Ana Karolina Antunes Eisen, Fernanda Gil de Souza, ...

Author(s): Jan Lorenz Soliman, Alex Dekhtyar, Jennifer Vanderkellen, Aldrin Montana, Michael Black, ...
