Towards effective link prediction: A hybrid similarity model

2020 ◽  
pp. 1-14
Author(s):  
Longjie Li ◽  
Lu Wang ◽  
Hongsheng Luo ◽  
Xiaoyun Chen

Link prediction is an important research direction in complex network analysis and has drawn increasing attention from researchers in various fields. So far, a plethora of structural similarity-based methods have been proposed to solve the link prediction problem. To achieve stable performance on different networks, this paper proposes a hybrid similarity model to conduct link prediction. In the proposed model, the Grey Relation Analysis (GRA) approach is employed to integrate four carefully selected similarity indexes, which are designed according to different structural features. In addition, to adaptively estimate the weight for each index based on the observed network structures, a new weight calculation method is presented that considers the distribution of similarity scores. Because it takes several distinct similarity indexes into account, the proposed method is applicable to many different types of networks. Experimental results show that the proposed method outperforms other prediction methods in terms of accuracy and stability on 10 benchmark networks.

2020 ◽  
Author(s):  
Prasannavenkatesh Durai ◽  
Young-Joon Ko ◽  
Cheol-Ho Pan ◽  
Keunwan Park

Background: Despite continued efforts using chemical similarity methods in virtual screening, currently developed approaches suffer from time-consuming multistep procedures and low success rates. We recently developed a machine learning-based chemical binding similarity model considering common structural features from molecules binding to the same, or evolutionarily related, targets. The chemical binding similarity measures the resemblance of chemical compounds in terms of binding site similarity to better describe functional similarities that arise from target binding. In this study, we show how the chemical binding similarity can be used in virtual screening together with conventional structure-based methods. Results: The chemical binding similarity, receptor-based pharmacophore, chemical structure similarity, and molecular docking methods were evaluated to identify an effective virtual screening procedure for desired target proteins. When we tested the chemical binding similarity method with test sets of 51 kinases, it outperformed the traditional structural similarity-based methods as well as structure-based methods, such as molecular docking and receptor-based pharmacophore modeling, in terms of finding active compounds. We further validated the results by performing virtual screening (using the chemical binding similarity and receptor-based pharmacophore methods) against a completely blind dataset for mitogen-activated protein kinase kinase 1 (MEK1), ephrin type-B receptor 4 (EPHB4) and wee1-like protein kinase (WEE1). The in vitro kinase binding assay confirmed that 6 out of 13 (46.2%) for MEK1 and 2 out of 12 (16.7%) for EPHB4 were newly identified only by the chemical binding similarity model. Conclusions: We report that the virtual screening results could be further improved by combining the chemical binding similarity model with 3D-QSAR and molecular docking models. Not only were new inhibitors identified in this study, but many of the identified molecules also have low structural similarity scores against previously reported inhibitors, which reveals novel scaffolds.


2018 ◽  
Vol 24 (17) ◽  
pp. 1899-1904
Author(s):  
Daniel Fabio Kawano ◽  
Marcelo Rodrigues de Carvalho ◽  
Mauricio Ferreira Marcondes Machado ◽  
Adriana Karaoglanovic Carmona ◽  
Gilberto Ubida Leite Braga ◽  
...  

Background: Fungal secondary metabolites are important sources for the discovery of new pharmaceuticals, as exemplified by penicillin, lovastatin and cyclosporine. Searching for secondary metabolites of the fungi Metarhizium spp., we previously identified tyrosine betaine as a major constituent. Methods: Because of its structural similarity with other inhibitors of neprilysin (NEP), an enzyme explored for the treatment of heart failure, we devised the synthesis of tyrosine betaine and three analogues to be subjected to in vitro NEP inhibition assays and to molecular modeling studies. Results: In spite of binding modes similar to those of other NEP inhibitors, these compounds displayed only moderate inhibitory activities (IC50 ranging from 170.0 to 52.9 µM). However, they possess structural features required to hinder passive blood-brain barrier (BBB) permeation. Conclusions: Tyrosine betaine remains a starting point for the development of NEP inhibitors because of the low probability of BBB permeation and, consequently, of NEP inhibition in the central nervous system, which is associated with an increase in Aβ levels and, accordingly, with a higher risk for the onset of Alzheimer's disease.


Author(s):  
Wei Jia ◽  
Wei Xia ◽  
Yang Zhao ◽  
Hai Min ◽  
Yan-Xiang Chen

Palmprint recognition and palm vein recognition are two emerging biometrics technologies. In the past two decades, many traditional methods have been proposed for palmprint recognition and palm vein recognition and have achieved impressive results. In recent years, in the field of artificial intelligence, deep learning has gradually become the mainstream recognition technology because of its excellent recognition performance. Some researchers have tried to use convolutional neural networks (CNNs) for palmprint recognition and palm vein recognition. However, the architectures of these CNNs have mostly been developed manually by human experts, which is a time-consuming and error-prone process. In order to overcome some shortcomings of manually designed CNNs, neural architecture search (NAS) technology has become an important research direction of deep learning. The significance of NAS is to solve the deep learning model’s parameter adjustment problem, which is a cross-study combining optimization and machine learning. NAS technology represents the future development direction of deep learning. However, up to now, NAS technology has not been well studied for palmprint recognition and palm vein recognition. In this paper, in order to investigate the problem of NAS-based 2D and 3D palmprint recognition and palm vein recognition in depth, we conduct a performance evaluation of twenty representative NAS methods on five 2D palmprint databases, two palm vein databases, and one 3D palmprint database. Experimental results show that some NAS methods can achieve promising recognition results. Remarkably, among the evaluated NAS methods, ProxylessNAS achieves the best recognition performance.


2021 ◽  
Vol 25 (3) ◽  
pp. 711-738
Author(s):  
Phu Pham ◽  
Phuc Do

Link prediction on heterogeneous information networks (HINs) is considered a challenging problem due to the complexity and diversity of node and link types. Currently, challenges remain in meta-path-based link prediction in HINs. Previous works on link prediction in HINs via network embedding mainly focus on exploiting node features rather than the existing relations between nodes in the form of meta-paths. In fact, predicting the existence of new links between non-linked nodes in this way is hardly convincing. Moreover, recent HIN-based embedding models also lack thorough evaluations of the topic similarity between text-based nodes along given meta-paths. To tackle these challenges, in this paper we propose a novel topic-driven, multiple meta-path-based HIN representation learning framework, namely W-MMP2Vec. Our model improves the quality of node representations by combining multiple meta-paths and by calculating the topic similarity weight for each meta-path during network embedding learning in content-based HINs. To validate our approach, we apply the W-MMP2Vec model to several link prediction tasks in both content-based and non-content-based HINs (DBLP, IMDB and BlogCatalog). The experimental results demonstrate the effectiveness of the proposed model, which outperforms recent state-of-the-art HIN representation learning models.


Entropy ◽  
2021 ◽  
Vol 23 (6) ◽  
pp. 731
Author(s):  
Mengxia Liang ◽  
Xiaolong Wang ◽  
Shaocong Wu

Finding the correlation between stocks is an effective method for screening and adjusting investment portfolios. Most studies use a single temporal feature or static nontemporal features to measure the similarity between stocks. However, these features are not sufficient to explore phenomena such as price fluctuations that are similar in shape but unequal in length, which may be caused by multiple temporal features. To study stock price volatility comprehensively, the correlation between stocks should be mined from the point of view of multiple features described as time series, including closing price, etc. In this paper, a time-sensitive composite similarity model designed for multivariate time-series correlation analysis based on dynamic time warping is proposed. First, a stock is chosen as the benchmark, and the multivariate time series are segmented by the peaks and troughs time-series segmentation (PTS) algorithm. Second, similar stocks are screened out by similarity. Finally, the rate of rising or falling together between stock pairs is used to verify the proposed model’s effectiveness. Compared with other models, the composite similarity model brings in multiple temporal features and is generalizable to numerical multivariate time series in different fields. The results show that the proposed model is very promising.
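
The abstract above builds on dynamic time warping (DTW), which can align series that are similar in shape but unequal in length. As a rough illustration only, here is a minimal sketch of the classic DTW distance for two univariate series in Python. It is not the composite similarity model or the PTS segmentation from the paper; the function name `dtw_distance` and the absolute-difference local cost are assumptions chosen for the example.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance between two numeric sequences.

    Illustrative sketch: univariate input and absolute difference as the
    local cost are assumptions, not details taken from the paper.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor cells.
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b stays
                cost[i][j - 1],      # b advances, a stays
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# Two series with the same shape but different lengths align at zero cost.
print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))  # 0.0
```

A multivariate variant would replace the local cost with a distance between feature vectors, for example the Euclidean distance; how the paper combines several temporal features is not specified in the abstract.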


Agriculture ◽  
2021 ◽  
Vol 11 (7) ◽  
pp. 651
Author(s):  
Shengyi Zhao ◽  
Yun Peng ◽  
Jizhan Liu ◽  
Shuo Wu

Crop disease diagnosis is of great significance to crop yield and agricultural production. Deep learning methods have become the main research direction for the diagnosis of crop diseases. This paper proposes a deep convolutional neural network that integrates an attention mechanism and can better adapt to the diagnosis of a variety of tomato leaf diseases. The network structure mainly includes residual blocks and attention extraction modules. The model can accurately extract complex features of various diseases. Extensive comparative experiments show that the proposed model achieves an average identification accuracy of 96.81% on the tomato leaf diseases dataset. The results indicate that the model has significant advantages in terms of network complexity and real-time performance compared with other models. Moreover, in a model comparison experiment on a public grape leaf diseases dataset, the proposed model also achieves better results, with an average identification accuracy of 99.24%. The results confirm that adding the attention module extracts the complex features of a variety of diseases more accurately and with fewer parameters. The proposed model provides a high-performance solution for crop diagnosis in real agricultural environments.


2011 ◽  
Vol 335-336 ◽  
pp. 419-422 ◽  
Author(s):  
Yuan Lian ◽  
Jian Yi Wu ◽  
Da Peng Zhou ◽  
Hong Mei Wang ◽  
Dian Wu Huang ◽  
...  

Alginate fibre has attracted great attention in the area of biological medical materials due to its unique biological properties, but its low tenacity greatly limits its range of application. Therefore, the preparation technology of alginate fibre has become an important research direction in this area in recent years. The purpose of this article is to prepare calcium alginate fibre with good properties by wet spinning. The structure and properties of this fibre are analyzed by scanning electron microscope, infrared spectrometer, thermal gravimetric analyzer and DSC.


2015 ◽  
Vol 198 (4) ◽  
pp. 720-730 ◽  
Author(s):  
Stephanie Swanson ◽  
Thomas R. Ioerger ◽  
Nathan W. Rigel ◽  
Brittany K. Miller ◽  
Miriam Braunstein ◽  
...  

ABSTRACT: While SecA is the ATPase component of the major bacterial secretory (Sec) system, mycobacteria and some Gram-positive pathogens have a second paralog, SecA2. In bacteria with two SecA paralogs, each SecA is functionally distinct, and they cannot compensate for one another. Compared to SecA1, SecA2 exports a distinct and smaller set of substrates, some of which have roles in virulence. In the mycobacterial system, some SecA2-dependent substrates lack a signal peptide, while others contain a signal peptide but possess features in the mature protein that necessitate a role for SecA2 in their export. It is unclear how SecA2 functions in protein export, and one open question is whether SecA2 works with the canonical SecYEG channel to export proteins. In this study, we report the structure of Mycobacterium tuberculosis SecA2 (MtbSecA2), which is the first structure of any SecA2 protein. A high level of structural similarity is observed between SecA2 and SecA1. The major structural difference is the absence of the helical wing domain, which is likely to play a role in how MtbSecA2 recognizes its unique substrates. Importantly, structural features critical to the interaction between SecA1 and SecYEG are preserved in SecA2. Furthermore, suppressor mutations of a dominant-negative secA2 mutant map to the surface of SecA2 and help identify functional regions of SecA2 that may promote interactions with SecYEG or the translocating polypeptide substrate. These results support a model in which the mycobacterial SecA2 works with SecYEG.
IMPORTANCE: SecA2 is a paralog of SecA1, which is the ATPase of the canonical bacterial Sec secretion system. SecA2 has a nonredundant function with SecA1, and SecA2 exports a distinct and smaller set of substrates than SecA1. This work reports the crystal structure of SecA2 of Mycobacterium tuberculosis (the first SecA2 structure reported for any organism). Many of the structural features of SecA1 are conserved in the SecA2 structure, including putative contacts with the SecYEG channel. Several structural differences are also identified that could relate to the unique function and selectivity of SecA2. Suppressor mutations of a secA2 mutant map to the surface of SecA2 and help identify functional regions of SecA2 that may promote interactions with SecYEG.


Author(s):  
Aya Taleb ◽  
Rizik M. H. Al-Sayyed ◽  
Hamed S. Al-Bdour

In this research, a new technique to improve the accuracy of link prediction for most networks is proposed; it is based on the prediction ensemble approach using the voting merging technique. The proposed ensemble, called the Jaccard, Katz, and Random models Wrapper (JKRW), scales up the prediction accuracy and provides better predictions for different sizes of populations, including small, medium, and large data. The proposed model has been tested and evaluated based on the area under curve (AUC) and accuracy (ACC) measures. These measures were also applied to the other models used in this study, which were built on the Jaccard coefficient, Katz, Adamic/Adar, and preferential attachment. Results from applying the evaluation metrics verify the improvement in JKRW's effectiveness and stability in comparison to the other tested models. The results from applying the Wilcoxon signed-rank method (one of the non-parametric paired tests) indicate that JKRW differs significantly from the other models in the different populations at the 0.95 confidence level.
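
Three of the baseline indexes named in the abstract above (Jaccard coefficient, Adamic/Adar, and preferential attachment) have standard neighborhood-based definitions. The sketch below illustrates those textbook definitions in Python on a toy undirected graph. It is not the JKRW ensemble, and the helper names (`build_adjacency`, `jaccard`, `adamic_adar`, `preferential_attachment`) and the toy edge list are hypothetical, introduced only for this example.

```python
import math


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set of neighbors}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def jaccard(adj, x, y):
    """Shared neighbors divided by the size of the combined neighborhood."""
    union = adj[x] | adj[y]
    return len(adj[x] & adj[y]) / len(union) if union else 0.0


def adamic_adar(adj, x, y):
    """Sum of 1 / log(degree) over common neighbors; rare neighbors count more."""
    return sum(
        1.0 / math.log(len(adj[z]))
        for z in adj[x] & adj[y]
        if len(adj[z]) > 1  # guard against log(1) = 0
    )


def preferential_attachment(adj, x, y):
    """Product of the two node degrees."""
    return len(adj[x]) * len(adj[y])


# Hypothetical toy graph: score the non-linked pair ("a", "d").
toy_edges = [("a", "b"), ("a", "c"), ("b", "c"), ("b", "d"), ("c", "d")]
toy_adj = build_adjacency(toy_edges)
print(jaccard(toy_adj, "a", "d"))                  # 1.0
print(preferential_attachment(toy_adj, "a", "d"))  # 4
```

In a link prediction experiment, each non-linked node pair would be scored this way and the highest-scoring pairs proposed as likely links. A voting ensemble such as the one described would then merge the rankings from several indexes, but the abstract does not give the merging details.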


2017 ◽  
Vol 28 (08) ◽  
pp. 1750101 ◽  
Author(s):  
Yabing Yao ◽  
Ruisheng Zhang ◽  
Fan Yang ◽  
Yongna Yuan ◽  
Qingshuang Sun ◽  
...  

In complex networks, the existing link prediction methods primarily focus on the internal structural information derived from single-layer networks. However, the role of interlayer information is hardly recognized in multiplex networks, which provide more diverse structural features than single-layer networks. In fact, the structural properties and functions of one layer can affect those of other layers in multiplex networks. In this paper, the effect of interlayer structural properties on link prediction performance is investigated in multiplex networks. By utilizing the intralayer and interlayer information, we propose a novel “Node Similarity Index” based on “Layer Relevance” (NSILR) of multiplex networks for link prediction. The performance of the NSILR index is validated on each layer of seven multiplex networks in real-world systems. Experimental results show that the NSILR index can significantly improve the prediction performance compared with the traditional methods, which only consider the intralayer information. Furthermore, the more relevant the layers are, the greater the performance improvement.

