remote homology detection
Recently Published Documents


TOTAL DOCUMENTS: 108 (FIVE YEARS: 19)
H-INDEX: 24 (FIVE YEARS: 3)

Author(s):  
S. Dinesh

Abstract: Homology detection plays a major role in bioinformatics, and different types of methods are used for it. Here we extract information from protein sequences and then use various algorithms to predict the similarity between protein families. The support vector machine (SVM) is the most commonly used algorithm in homology detection. Classification techniques are not well suited to homology detection because they do not handle high-dimensional datasets well. Reducing the dimensionality is therefore very important, because it makes the similarity between protein families easier to predict. Keywords: Homology detection, Protein, Sequence, Reducing dimensionality, BLAST, SCOP.


2021
Author(s):  
Sajithra Nakshathram,
Ramyachitra Duraisamy,
Manikandan Pandurangan

Abstract:
Background: Protein Remote Homology Detection (PRHD) is used to find homologous proteins that are similar in function and structure but share low sequence identity. The Sequence-Order Frequency Matrix (SOFM) has commonly been used for PRHD. In the SOFM Top-n-gram (SOFM-Top) algorithm, the probability of substrings is calculated based on the highest probability value of substrings. The SOFM-Smith-Waterman (SOFM-SW) algorithm combines the SOFM with local alignment. However, the computational complexity of SOFM-based PRHD is high because it processes all protein sequences in the SOFM.
Objective: The Sequence-Order Frequency Matrix - Sampling and Machine learning with Smith-Waterman (SOFM-SMSW) algorithm is proposed for predicting protein remote homology. SOFM-SMSW uses Proportional Volume Sampling (PVS) to select the optimum target sequences based on the uniform distribution measure.
Method: This work considers only the most important sequences for PRHD by introducing PVS. After the protein sequences are sampled, a feature vector is constructed and labeling is performed based on the concatenation of two protein sequences. A substitution score that represents the structural alignment is then learned using k-Nearest Neighbor (k-NN). Based on the learned substitution score and the alignment score, protein homology is detected using the Smith-Waterman algorithm and a Support Vector Machine (SVM). Selecting the most important sequences improves the accuracy of PRHD, and using structural alignment along with local alignment reduces its computational complexity.
Results: The proposed SOFM-SMSW algorithm was tested on the SCOP database and compared with existing algorithms: SVM Top-N-gram, SVM pairwise, GPkernel, Long Short-Term Memory (LSTM), SOFM Top-N-gram and SOFM-SW.
Conclusion: The experimental results illustrate that the proposed SOFM-SMSW algorithm has better accuracy, precision, recall, ROC and ROC 50 for PRHD than the other existing algorithms.
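
Editorial illustration: the abstract above relies on Smith-Waterman local alignment. The sketch below shows only the textbook form of that algorithm, with a fixed match/mismatch score and a linear gap penalty. It is not the SOFM-SMSW method: the k-NN-learned substitution score, the PVS sampling and the SVM stage are not reproduced, and the function name and all parameter values are assumptions chosen for the example.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score between sequences a and b.

    Assumed scoring: fixed match/mismatch values and a linear gap
    penalty (hypothetical defaults, not taken from the paper).
    """
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] = best score of a local alignment ending at a[i-1] and b[j-1].
    H = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(
                0,                        # start a new local alignment
                H[i - 1][j - 1] + sub,    # align a[i-1] with b[j-1]
                H[i - 1][j] + gap,        # gap in b
                H[i][j - 1] + gap,        # gap in a
            )
            best = max(best, H[i][j])
    return best
```

Flooring each cell at zero is what makes the alignment local: a poorly matching prefix cannot drag down the score of a well-conserved region that follows it.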


Author(s):  
Xiaopeng Jin,
Qing Liao,
Bin Liu

Abstract: Protein remote homology detection is a fundamental and important task for protein structure and function analysis. Several search methods have been proposed to improve the detection of remote homologues and the accuracy of ranking lists. The position-specific scoring matrix (PSSM) profile and the hidden Markov model (HMM) profile can help improve the performance of state-of-the-art search methods. In this paper, we improved the profile-link (PL) information used for constructing PSSM or HMM profiles, and proposed a PL-based search method (PL-search). In PL-search, more robust PLs are constructed through double-link and iterative extending strategies, and an accurate similarity score for sequence pairs is calculated from the two-level Jaccard distance for remote homologues. We tested our method on two widely used benchmark datasets. Our results show that, whether HHblits, JackHMMER or position-specific iterated BLAST (PSI-BLAST) is used, PL-search clearly improves search performance in terms of both ranking quality and the number of detected remote homologues. For ease of use, both a stand-alone tool and a web server for PL-search have been constructed, which can be accessed at http://bliulab.net/PL-search/.
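
Editorial illustration: the abstract above scores sequence pairs with a two-level Jaccard distance, whose exact definition is not given here. As a minimal sketch of the underlying idea only, the plain Jaccard distance between the sets of sequences linked to two proteins could be computed as follows. The function name, the use of hit identifiers as set members and the empty-set convention are assumptions for the example, not the PL-search implementation.

```python
def jaccard_distance(hits_a, hits_b):
    """Plain Jaccard distance between two collections of hit identifiers.

    0.0 means the two hit sets are identical, 1.0 means they share
    nothing. Two empty sets are treated as identical (an assumed
    convention for this sketch).
    """
    a, b = set(hits_a), set(hits_b)
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


# Hypothetical hit lists for two query proteins.
query_1_hits = ["p1", "p2", "p3"]
query_2_hits = ["p2", "p3", "p4"]
distance = jaccard_distance(query_1_hits, query_2_hits)
```

Two proteins with low direct sequence identity can still have a small distance under this measure if they link to many of the same sequences.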

