Population informative markers selected using Wright's fixation index and machine learning improves human identification using the skin microbiome
Microbial DNA shed from human skin can be distinctive to its host and thus help individualize donors of forensic biological evidence. Previous studies have used single-locus microbial DNA markers (e.g., 16S rRNA) to profile human hosts by the presence or absence of personal microbiota. However, because the taxonomic composition of the microbiome fluctuates constantly, this approach may not be sufficiently robust for human identification (HID). Multi-marker approaches may be more powerful, and genetic differentiation, rather than taxonomic distinction, may be more individualizing. To this end, the non-dominant hands of 51 individuals were sampled in triplicate (n = 153). The samples were analyzed for markers in the hidSkinPlex, a multiplex panel comprising candidate markers for skin microbiome profiling. Single nucleotide polymorphisms (SNPs) with the highest Wright's fixation index (F_ST) estimates were then selected for predicting donor identity using a support vector machine (SVM) learning model. F_ST estimates genetic differentiation between populations relative to the variation within them. Three SNP selection criteria were employed, each taking the SNPs with the highest-ranking F_ST estimates: 1) common between any two samples regardless of the markers present (termed "overall"); 2) within each marker common between samples (termed "per marker"); and 3) common to all samples used to train the SVM algorithm for HID (termed "selected"). The SNPs chosen by the overall, per marker, and selected methods yielded accuracies of 92.00%, 94.77%, and 88.00%, respectively. These results indicate that F_ST estimates, combined with SVM, can notably improve forensic HID via skin microbiome profiling.

IMPORTANCE There is a need for additional genetic information to help identify the source of biological evidence found at a crime scene. The human skin microbiome is a potentially abundant source of DNA that can enable the identification of a donor of biological evidence. Microbial profiling for human identification would provide an additional source of DNA both to identify individuals and to exclude individuals wrongly associated with biological evidence, thereby improving the utility of forensic DNA profiling in support of criminal investigations.
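As a rough illustration of the F_ST-based SNP ranking described in the abstract, the sketch below scores each SNP and keeps the highest-ranking ones. It is not the authors' pipeline. It assumes the simple heterozygosity-based form F_ST = (H_T - H_S) / H_T for a biallelic SNP, treats each donor as one equally weighted "population", and uses invented allele frequencies. The function names (`fst`, `top_snps`) and the SNP identifiers are hypothetical.

```python
from statistics import mean


def expected_heterozygosity(p):
    """Expected heterozygosity 2p(1 - p) for a biallelic SNP with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def fst(freqs):
    """Heterozygosity-based F_ST for one SNP (assumed estimator, equal weights).

    freqs: frequency of the same allele in each population
           (hypothetically, one value per donor).
    """
    h_t = expected_heterozygosity(mean(freqs))  # total heterozygosity from pooled frequency
    if h_t == 0.0:
        return 0.0  # SNP is monomorphic overall, so there is nothing to differentiate
    h_s = mean([expected_heterozygosity(p) for p in freqs])  # mean within-population value
    return (h_t - h_s) / h_t


def top_snps(freq_table, k):
    """Return the k SNP ids with the highest F_ST, best first."""
    ranked = sorted(freq_table, key=lambda snp: fst(freq_table[snp]), reverse=True)
    return ranked[:k]


# Toy allele frequencies for three hypothetical donors (invented values).
freq_table = {
    "snp_a": [0.0, 1.0, 0.0],  # fixed differences between donors
    "snp_b": [0.5, 0.5, 0.5],  # identical frequencies in every donor
    "snp_c": [0.2, 0.8, 0.5],  # intermediate differentiation
}

selected = top_snps(freq_table, k=2)
print(selected)
```

In a workflow like the one the abstract outlines, the alleles at the selected SNPs would then serve as features for an SVM classifier (for example, scikit-learn's `SVC`) trained to predict donor identity. That step is omitted here.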