A Profile Hidden Markov Model to investigate the distribution and frequency of LanB-encoding lantibiotic modification genes in the human oral and gut microbiome

PeerJ ◽  
2017 ◽  
Vol 5 ◽  
pp. e3254 ◽  
Author(s):  
Calum J. Walsh ◽  
Caitriona M. Guinane ◽  
Paul W. O’Toole ◽  
Paul D. Cotter

Background. The human microbiota plays a key role in health and disease, and bacteriocins, which are small, bacterially produced, antimicrobial peptides, are likely to have an important function in the stability and dynamics of this community. Here we examined the density and distribution of the subclass I lantibiotic modification protein, LanB, in human oral and stool microbiome datasets using a specially constructed profile Hidden Markov Model (HMM).

Methods. The model was validated by correctly identifying known lanB genes in the genomes of known bacteriocin producers more effectively than other methods, while being sensitive enough to differentiate between different subclasses of lantibiotic modification proteins. This approach was compared with two existing methods to screen both genomic and metagenomic datasets obtained from the Human Microbiome Project (HMP).

Results. Of the methods evaluated, the new profile HMM identified the greatest number of putative LanB proteins in the stool and oral metagenome data while BlastP identified the fewest. In addition, the model identified more LanB proteins than a pre-existing Pfam lanthionine dehydratase model. Searching the gastrointestinal tract subset of the HMP reference genome database with the new HMM identified seven putative subclass I lantibiotic producers, including two members of the Coprobacillus genus.

Conclusions. These findings establish custom profile HMMs as a potentially powerful tool in the search for novel bioactive producers with the power to benefit human health, and reinforce the repertoire of apparent bacteriocin-encoding gene clusters that may have been overlooked by culture-dependent mining efforts to date.

2018 ◽  
Vol 13 (5) ◽  
pp. 1081-1095 ◽  
Author(s):  
Zhongliu Zhuo ◽  
Yang Zhang ◽  
Zhi-li Zhang ◽  
Xiaosong Zhang ◽  
Jingzhong Zhang

2003 ◽  
Vol 310 (2) ◽  
pp. 574-579 ◽  
Author(s):  
Norihiro Kikuchi ◽  
Yeon-Dae Kwon ◽  
Masanori Gotoh ◽  
Hisashi Narimatsu

2021 ◽  
Vol 2021 ◽  
pp. 1-9
Author(s):  
Xujie Ren ◽  
Tao Shang ◽  
Yatong Jiang ◽  
Jianwei Liu

In the era of big data, next-generation sequencing produces large amounts of genomic data, and these sequence data will further advance research in biology. However, the growth in data scale often leads to privacy issues. Even if the data are not open, an attacker can still steal private information through a membership inference attack. In this paper, we propose a private profile hidden Markov model (PHMM) with differential identifiability for gene sequence clustering. Adding random noise to the model limits the probability of identifying individuals in the database. The gene sequences can then be clustered without labels according to the output scores of the private PHMM. The variation of the divergence distance in the experimental results shows that the added noise distorts the profile hidden Markov model to a certain extent, and that the maximum divergence distance can reach 15.47 when the amount of data is small. In addition, a cosine similarity comparison of the clustering model before and after adding noise shows that, as the privacy parameter changes, the clustering model is distorted to a lesser or greater degree, which allows it to defend against the membership inference attack.
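The idea of perturbing a profile model and then measuring the distortion can be illustrated with a minimal Python sketch. It is not the authors' method. It assumes a simplified, match-state-only profile over DNA (no insert or delete states), and it uses Laplace noise on the column counts as a stand-in, because the abstract does not specify the noise mechanism or how it is calibrated for differential identifiability. All function names, the toy alignment, and the `scale` parameter are hypothetical.

```python
import math
import random

ALPHABET = "ACGT"

def column_counts(alignment, pseudocount=1.0):
    """Count residues per alignment column (match states only, no gaps)."""
    length = len(alignment[0])
    counts = [{a: pseudocount for a in ALPHABET} for _ in range(length)]
    for seq in alignment:
        for i, ch in enumerate(seq):
            counts[i][ch] += 1.0
    return counts

def normalize(counts):
    """Turn per-column counts into emission probabilities."""
    return [{a: col[a] / sum(col.values()) for a in ALPHABET} for col in counts]

def laplace(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def add_noise(counts, scale, rng, floor=1e-6):
    """Perturb every count with Laplace noise (assumed mechanism), kept positive."""
    return [{a: max(col[a] + laplace(scale, rng), floor) for a in ALPHABET}
            for col in counts]

def log_odds(seq, probs, background=0.25):
    """Score a sequence against the profile relative to a uniform background."""
    return sum(math.log(probs[i][ch] / background) for i, ch in enumerate(seq))

def cosine_similarity(p, q):
    """Cosine similarity between two profiles flattened into vectors."""
    u = [col[a] for col in p for a in ALPHABET]
    v = [col[a] for col in q for a in ALPHABET]
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)

# Toy alignment (hypothetical data, for illustration only).
alignment = ["ACGT", "ACGA", "ACGT", "TCGT"]
rng = random.Random(0)

counts = column_counts(alignment)
clean = normalize(counts)
noisy = normalize(add_noise(counts, scale=2.0, rng=rng))

print("cosine similarity, clean vs noisy:", cosine_similarity(clean, noisy))
print("score of ACGT, clean vs noisy:", log_odds("ACGT", clean), log_odds("ACGT", noisy))
```

In this sketch a larger `scale` pushes the noisy profile further from the clean one, so the cosine similarity drops and the scores used for clustering shift, which mirrors the privacy and utility trade-off the abstract describes.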

