Performance of Similarity Measures in 2D Fragment-Based Similarity Searching: Comparison of Structural Descriptors and Similarity Coefficients

Current similarity measures for virtual screening are based on the use of molecular fingerprints and the Tanimoto coefficient. This paper describes two ways in which one can increase the effectiveness of similarity-based virtual screening: using similarity coefficients other than the Tanimoto coefficient for the comparison of molecular fingerprints; and using a graph-theoretic similarity measure based on the largest substructure common to a pair of molecules.
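As a rough illustration of what "similarity coefficients other than the Tanimoto coefficient" can mean for fingerprint comparison, the sketch below computes the Tanimoto, Dice, and cosine coefficients for two fingerprints represented as sets of "on" bit positions. The fingerprints and the choice of Dice and cosine as alternatives are assumptions made for the example; the abstract does not say which coefficients were evaluated, and the graph-theoretic (common-substructure) measure is not covered here.

```python
from math import sqrt


def tanimoto(a: set, b: set) -> float:
    """Tanimoto coefficient: c / (|a| + |b| - c), with c = shared on-bits."""
    c = len(a & b)
    denom = len(a) + len(b) - c
    return c / denom if denom else 0.0


def dice(a: set, b: set) -> float:
    """Dice coefficient: 2c / (|a| + |b|)."""
    c = len(a & b)
    denom = len(a) + len(b)
    return 2 * c / denom if denom else 0.0


def cosine(a: set, b: set) -> float:
    """Cosine coefficient: c / sqrt(|a| * |b|)."""
    c = len(a & b)
    denom = sqrt(len(a) * len(b))
    return c / denom if denom else 0.0


# Hypothetical fingerprints: each set holds the indices of bits set to 1.
query = {1, 4, 7, 9}
candidate = {1, 4, 8, 9, 12}

for name, coeff in (("Tanimoto", tanimoto), ("Dice", dice), ("Cosine", cosine)):
    print(f"{name}: {coeff(query, candidate):.3f}")
```

All three coefficients use the same counts (bits set in each fingerprint and bits shared) but normalise them differently, so they can rank the same database differently.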

Multivariate Time Series Similarity Searching

The Scientific World JOURNAL ◽

10.1155/2014/851017 ◽

2014 ◽

Vol 2014 ◽

pp. 1-8 ◽

Cited By ~ 4

Author(s):

Jimin Wang ◽

Yuelong Zhu ◽

Shijin Li ◽

Dingsheng Wan ◽

Pengcheng Zhang

Keyword(s):

Time Series ◽

Multivariate Time Series ◽

Similarity Measures ◽

Combination Method ◽

Similarity Searching ◽

Frobenius Norm ◽

Point Distribution ◽

Single Dimension ◽

Searching Method ◽

Hydrological Fields

Multivariate time series (MTS) datasets are very common in financial, multimedia, and hydrological fields. In this paper, a dimension-combination method is proposed to search for similar sequences in MTS. First, the similarity of each single-dimension series is calculated; then the overall similarity of the MTS is obtained by combining the single-dimension similarities using a weighted Borda voting method. The dimension-combination method can reuse existing similarity searching methods. Several experiments, which used classification accuracy as the measure, were performed on six datasets from the UCI KDD Archive to validate the method. The results show that the approach has advantages in some respects over traditional similarity measures for MTS datasets, such as Euclidean distance (ED), dynamic time warping (DTW), point distribution (PD), the PCA similarity factor (SPCA), and the extended Frobenius norm (Eros). Our experiments also demonstrate that no single measure fits all datasets, and that the proposed measure is one option for similarity searches.
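The dimension-combination idea can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes Euclidean distance as the single-dimension measure, a plain Borda count (the top-ranked of n candidates earns n - 1 points), and hand-picked weights; the function names and data are hypothetical.

```python
import math


def euclidean(x, y):
    """Euclidean distance between two equal-length series."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def borda_rank(query, candidates, weights, dist=euclidean):
    """Rank candidates by a weighted Borda count over per-dimension rankings.

    query:      list of series, one per dimension
    candidates: dict mapping a name to a list of series, one per dimension
    weights:    one weight per dimension (assumed to be supplied by the user)
    """
    names = list(candidates)
    n = len(names)
    scores = {name: 0.0 for name in names}
    for d, w in enumerate(weights):
        # Rank candidates in this dimension: smallest distance first.
        ordered = sorted(names, key=lambda nm: dist(query[d], candidates[nm][d]))
        for pos, nm in enumerate(ordered):
            scores[nm] += w * (n - 1 - pos)  # Borda points, scaled by weight
    ranking = sorted(names, key=lambda nm: scores[nm], reverse=True)
    return ranking, scores


# Hypothetical two-dimensional series.
query = [[1, 2, 3], [10, 10, 10]]
candidates = {
    "A": [[1, 2, 4], [10, 11, 10]],
    "B": [[1, 2, 3], [20, 20, 20]],
    "C": [[5, 5, 5], [10, 10, 10]],
}
ranking, scores = borda_rank(query, candidates, weights=[0.7, 0.3])
print(ranking, scores)
```

Because only ranks are combined, any existing single-dimension measure (DTW, for example) could be passed as `dist` without changing the voting step.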

Similarity searching in ligand-based virtual screening using different fingerprints and different similarity coefficients

International Journal of Intelligent Systems Technologies and Applications ◽

10.1504/ijista.2019.10021692 ◽

2019 ◽

Vol 18 (4) ◽

pp. 405

Author(s):

Hentabli Hamza ◽

Faisal Saeed ◽

Belhadef Hacene ◽

Berrhail Fouaz

Keyword(s):

Virtual Screening ◽

Similarity Searching ◽

Similarity Coefficients

Genetic Algorithm-based Feature Selection Approach for Enhancing the Effectiveness of Similarity Searching in Ligand-based Virtual Screening

Current Bioinformatics ◽

10.2174/1574893614666191119123935 ◽

2020 ◽

Vol 15 (5) ◽

pp. 431-444

Author(s):

Fouaz Berrhail ◽

Hacene Belhadef

Keyword(s):

Genetic Algorithm ◽

Virtual Screening ◽

Similarity Measures ◽

Similarity Searching ◽

Features Selection ◽

Screening Process ◽

Selection Approach ◽

Feature Selection Approach ◽

The Individual ◽

Screening Approaches

Background: In recent years, similarity searching has gained wide popularity as a method for performing Ligand-Based Virtual Screening (LBVS). This screening technique works by comparing the features of the target compound with those of each compound in the database. It is well known that no individual similarity measure gives the best performance for every active compound structure across all types of activity classes. Several techniques and strategies have been proposed in the literature to improve the overall effectiveness of ligand-based virtual screening approaches. Objective: In this work, our main objective is to propose a feature selection approach based on a genetic algorithm (FSGASS) to improve similarity searching in ligand-based virtual screening. Methods: Our contribution identifies the most important and relevant characteristics of chemical compounds and minimizes the number of features used to represent them. This reduces the feature space, eliminates redundancy, shortens training time, and improves the performance of the screening process. Results: The results demonstrate superior performance compared with those obtained with the Tanimoto coefficient, which is considered the most widely used coefficient for quantifying similarity in the domain of LBVS. Conclusion: Our results show that significant improvements can be obtained by using molecular similarity search methods based on feature selection.
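To make the general idea of genetic-algorithm feature selection concrete, here is a toy sketch. It is not FSGASS: the abstract does not give the encoding, fitness function, or operators, so everything below (bit-mask individuals, a fitness equal to mean masked Tanimoto similarity to known actives minus that to inactives, tournament selection, one-point crossover, bit-flip mutation, elitism, and all data) is an assumption chosen for illustration.

```python
import random


def tanimoto_masked(a, b, mask):
    """Tanimoto on binary vectors, using only positions where mask is 1."""
    both = sum(1 for x, y, m in zip(a, b, mask) if m and x and y)
    either = sum(1 for x, y, m in zip(a, b, mask) if m and (x or y))
    return both / either if either else 0.0


def fitness(mask, query, actives, inactives):
    """Toy objective: mean similarity to actives minus mean to inactives."""
    act = sum(tanimoto_masked(query, c, mask) for c in actives) / len(actives)
    ina = sum(tanimoto_masked(query, c, mask) for c in inactives) / len(inactives)
    return act - ina


def select_features(query, actives, inactives, pop_size=20, generations=30,
                    mutation_rate=0.1, seed=0):
    """Return a 0/1 feature mask found by a simple elitist GA."""
    rng = random.Random(seed)
    n = len(query)  # assumes at least two features

    def score(m):
        return fitness(m, query, actives, inactives)

    def pick(pop):
        # Tournament selection between two random individuals.
        a, b = rng.sample(pop, 2)
        return a if score(a) >= score(b) else b

    # Include the all-ones mask so the result is never worse than no selection.
    pop = [[1] * n] + [[rng.randint(0, 1) for _ in range(n)]
                       for _ in range(pop_size - 1)]
    best = max(pop, key=score)
    for _ in range(generations):
        children = [best[:]]  # elitism: carry the best mask forward
        while len(children) < pop_size:
            p1, p2 = pick(pop), pick(pop)
            cut = rng.randint(1, n - 1)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        pop = children
        best = max(pop, key=score)
    return best


# Hypothetical 6-bit fingerprints.
query = [1, 1, 0, 1, 0, 1]
actives = [[1, 1, 0, 0, 1, 1], [1, 1, 1, 1, 0, 0]]
inactives = [[0, 0, 0, 1, 0, 1], [0, 1, 1, 1, 1, 1]]

best_mask = select_features(query, actives, inactives)
print(best_mask, fitness(best_mask, query, actives, inactives))
```

A realistic setup would score masks on held-out activity classes and full-length fingerprints; the toy objective here only shows how a mask changes the similarity computation.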
