Profile Hidden Markov Model
Recently Published Documents


TOTAL DOCUMENTS: 34 (FIVE YEARS: 7)

H-INDEX: 8 (FIVE YEARS: 1)

2021 ◽ Vol 2021 ◽ pp. 1-9
Author(s): Xujie Ren ◽ Tao Shang ◽ Yatong Jiang ◽ Jianwei Liu

In the era of big data, next-generation sequencing produces a large amount of genomic data. These genetic sequence data will further advance research in biology. However, the growth of data scale often leads to privacy issues. Even if the data are not public, an attacker can still steal private information through a membership inference attack. In this paper, we propose a private profile hidden Markov model (PHMM) with differential identifiability for gene sequence clustering. Adding random noise to the model limits the probability of identifying individuals in the database. The gene sequences can then be clustered without labels according to the output scores of the private PHMM. The variation of the divergence distance in the experimental results shows that adding noise distorts the profile hidden Markov model to a certain extent, and the maximum divergence distance reaches 15.47 when the amount of data is small. In addition, a cosine similarity comparison of the clustering model before and after adding noise shows that, as the privacy parameter changes, the clustering model is distorted to a lesser or greater degree, which helps it defend against the membership inference attack.
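
The noise-then-compare idea in the abstract above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the abstract does not say which noise distribution or calibration the authors use, so Laplace noise with a hypothetical `scale` parameter is swapped in here, and the emission table, the `perturb_distribution` helper, and the `floor` value are invented for the example. The sketch perturbs each match-state emission distribution, renormalizes it, and reports the KL divergence and cosine similarity between the original and noisy tables.

```python
import math
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def perturb_distribution(probs, scale, rng, floor=1e-6):
    # Add noise to each probability, clip to a small positive floor
    # (a hypothetical choice), then renormalize so the row sums to 1.
    noisy = [max(p + laplace_noise(scale, rng), floor) for p in probs]
    total = sum(noisy)
    return [x / total for x in noisy]


def kl_divergence(p, q):
    # KL(p || q) in nats; terms with p_i == 0 contribute nothing.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def cosine_similarity(p, q):
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    return dot / (norm_p * norm_q)


# Hypothetical emission table: three match states over the alphabet A, C, G, T.
emissions = [
    [0.70, 0.10, 0.10, 0.10],
    [0.05, 0.85, 0.05, 0.05],
    [0.25, 0.25, 0.25, 0.25],
]

rng = random.Random(0)
for scale in (0.01, 0.1, 0.5):
    noisy = [perturb_distribution(row, scale, rng) for row in emissions]
    kl = sum(kl_divergence(p, q) for p, q in zip(emissions, noisy))
    flat_orig = [x for row in emissions for x in row]
    flat_noisy = [x for row in noisy for x in row]
    cos = cosine_similarity(flat_orig, flat_noisy)
    print(f"scale={scale}: KL={kl:.4f}, cosine={cos:.4f}")
```

Larger values of `scale` correspond to more noise in this sketch, so one would expect the divergence to grow and the cosine similarity to fall on average, which mirrors the privacy versus distortion trade-off the abstract describes.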


Author(s): Natsuki Iwano ◽ Tatsuo Adachi ◽ Kazuteru Aoki ◽ Yoshikazu Nakamura ◽ Michiaki Hamada

Nucleic acid aptamers are generated by an in vitro molecular evolution method known as systematic evolution of ligands by exponential enrichment (SELEX). The variety of candidates is limited by the actual sequencing data from an experiment. Here, we developed RaptGen, a variational autoencoder for in silico aptamer generation. RaptGen exploits a profile hidden Markov model decoder to represent motif sequences effectively. We showed that RaptGen embedded simulated sequence data into a low-dimensional latent space that depends on motif information. We also performed sequence embedding using two independent SELEX datasets. RaptGen successfully generated aptamers from the latent space even though they were not included in the high-throughput sequencing data. RaptGen could also generate a truncated aptamer with a short learning model. We demonstrated that RaptGen could be applied to activity-guided aptamer generation using Bayesian optimization. We concluded that the generative method of RaptGen and its latent representation are useful for aptamer discovery. Code is available at https://github.com/hmdlab/raptgen.


2021 ◽ Vol 21 (1) ◽ pp. 21-36
Author(s): H. OZCAN ◽ F. KAYA GULAGIZ ◽ M. A. ALTUNCU ◽ S. ILKIN ◽ S. SAHIN

2020 ◽ Vol 24 (4) ◽ pp. 759-778
Author(s): Alireza Abbas Alipour ◽ Ebrahim Ansari

F1000Research ◽ 2019 ◽ Vol 8 ◽ pp. 1834
Author(s): Gerben P. Voshol ◽ Peter J. Punt ◽ Erik Vijgenboom

Insight into the inter- and intra-family relationships of protein families is important, since it can aid understanding of how substrate specificity evolved and help assign putative functions to proteins of unknown function. Studying both inter- and intra-family relationships requires the ability to build phylogenetic trees using the most sensitive sequence similarity search methods (e.g. profile hidden Markov model (pHMM)–pHMM alignments). However, existing solutions require a very long calculation time to obtain the phylogenetic tree. Therefore, a faster protocol is required to make this approach efficient for research. To contribute to this goal, we extended the original Profile Comparer program (PRC) to construct large pHMM phylogenetic trees several orders of magnitude faster than pHMM-tree. As an example, PRC Extended (PRCx) was used to study the phylogeny of over 10,000 sequences of lytic polysaccharide monooxygenase (LPMO) from over seven families. Using the newly developed program, we were able to reveal previously unknown homologs of LPMOs, namely the PFAM Egh16-like family. Moreover, we show that the substrate specificities have evolved independently several times within the LPMO superfamily. Furthermore, the LPMO phylogenetic tree does not seem to follow taxonomy-based classification.
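
To make the step from pairwise profile comparisons to a tree concrete, here is a minimal sketch in Python. It rests on several assumptions: the abstract does not state which tree-building method PRCx uses, so UPGMA (average-linkage clustering) is swapped in purely for illustration, and the family names and distance values below are hypothetical rather than PRC output.

```python
def upgma(names, dist):
    """Build a nested-tuple tree from a symmetric distance matrix (UPGMA)."""
    # Each cluster id maps to (subtree, number of leaves).
    clusters = {i: (name, 1) for i, name in enumerate(names)}
    # Pairwise distances keyed by (smaller id, larger id).
    d = {(i, j): dist[i][j]
         for i in range(len(names)) for j in range(i + 1, len(names))}
    next_id = len(names)
    while len(clusters) > 1:
        a, b = min(d, key=d.get)  # closest pair of clusters
        tree_a, size_a = clusters.pop(a)
        tree_b, size_b = clusters.pop(b)
        # Size-weighted average distance from the merged cluster to the rest.
        new = {}
        for c in clusters:
            d_ac = d[(min(a, c), max(a, c))]
            d_bc = d[(min(b, c), max(b, c))]
            new[(c, next_id)] = (size_a * d_ac + size_b * d_bc) / (size_a + size_b)
        # Drop every distance that involves the two merged clusters.
        d = {k: v for k, v in d.items() if a not in k and b not in k}
        d.update(new)
        clusters[next_id] = ((tree_a, tree_b), size_a + size_b)
        next_id += 1
    return next(iter(clusters.values()))[0]


# Hypothetical pHMM-pHMM distances between four profile families.
families = ["famA", "famB", "famC", "famD"]
distances = [
    [0, 2, 8, 8],
    [2, 0, 8, 8],
    [8, 8, 0, 4],
    [8, 8, 4, 0],
]
print(upgma(families, distances))
```

With these made-up distances, the two closest families are joined first and the two resulting pairs are merged last. Real profile comparison scores would first need to be converted into distances, a step this sketch leaves out.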


2018 ◽ Vol 13 (5) ◽ pp. 1081-1095
Author(s): Zhongliu Zhuo ◽ Yang Zhang ◽ Zhi-li Zhang ◽ Xiaosong Zhang ◽ Jingzhong Zhang
