similarity function Latest Research Papers

Provable randomized rounding for minimum-similarity diversification

Data Mining and Knowledge Discovery ◽

10.1007/s10618-021-00811-2 ◽

2022 ◽

Author(s):

Bruno Ordozgoiti ◽

Ananth Mahadevan ◽

Antonis Matakos ◽

Aristides Gionis

Keyword(s):

Optimization Problems ◽

Randomized Algorithm ◽

Similarity Function ◽

Randomized Rounding ◽

Cardinality Constraint ◽

Similarity Functions ◽

Penalty Term ◽

Combinatorial Optimization Problems ◽

Benchmark Datasets ◽

Approximation Guarantee

When searching for information in a data collection, we are often interested not only in finding relevant items, but also in assembling a diverse set, so as to explore different concepts that are present in the data. This problem has been researched extensively. However, finding a set of items with minimal pairwise similarities can be computationally challenging, and most existing works striving for quality guarantees assume that item relatedness is measured by a distance function. Given the widespread use of similarity functions in many domains, we believe this to be an important gap in the literature. In this paper we study the problem of finding a diverse set of items, when item relatedness is measured by a similarity function. We formulate the diversification task using a flexible, broadly applicable minimization objective, consisting of the sum of pairwise similarities of the selected items and a relevance penalty term. To find good solutions we adopt a randomized rounding strategy, which is challenging to analyze because of the cardinality constraint present in our formulation. Even though this obstacle can be overcome using dependent rounding, we show that it is possible to obtain provably good solutions using an independent approach, which is faster, simpler to implement and completely parallelizable. Our analysis relies on a novel bound for the ratio of Poisson-Binomial densities, which is of independent interest and has potential implications for other combinatorial-optimization problems. We leverage this result to design an efficient randomized algorithm that provides a lower-order additive approximation guarantee. We validate our method using several benchmark datasets, and show that it consistently outperforms the greedy approaches that are commonly used in the literature.
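As a rough illustration of the idea, the sketch below rounds a fractional solution by sampling every item independently and retrying until exactly k items are selected, then keeps the sample with the lowest objective. This is only one plausible reading of "independent rounding under a cardinality constraint", not the authors' algorithm: the function names, the retry loop, the fractional vector `x`, and the `rounds` parameter are all assumptions made for illustration.

```python
import random

def objective(selected, sim, penalty):
    """Sum of pairwise similarities of the selected items plus their relevance penalties."""
    items = sorted(selected)
    pairwise = sum(sim[i][j] for a, i in enumerate(items) for j in items[a + 1:])
    return pairwise + sum(penalty[i] for i in items)

def independent_rounding(x, k, rng, max_tries=10_000):
    """Include item i independently with probability x[i]; retry until exactly k items are chosen.

    Hypothetical helper: x is assumed to be a fractional solution in [0, 1]^n
    (for example from a relaxation of the objective) whose entries sum to about k.
    """
    for _ in range(max_tries):
        chosen = [i for i, p in enumerate(x) if rng.random() < p]
        if len(chosen) == k:
            return chosen
    raise RuntimeError("no sample of size k found; check that sum(x) is close to k")

def diversify(x, k, sim, penalty, rounds=50, seed=0):
    """Draw several independent roundings and keep the one with the lowest objective."""
    rng = random.Random(seed)
    samples = (independent_rounding(x, k, rng) for _ in range(rounds))
    return min(samples, key=lambda s: objective(s, sim, penalty))
```

Because every coin flip is independent, the samples can be drawn in parallel, which matches the abstract's remark that the independent approach is completely parallelizable.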

Structural and dynamical properties of 13-atom Cu–Co mixed clusters

International Journal of Modern Physics C ◽

10.1142/s0129183122500723 ◽

2021 ◽

Author(s):

Shi-Wei Ren

Keyword(s):

Molecular Dynamics ◽

Pure Copper ◽

Similarity Function ◽

Step Process ◽

Pure Cobalt ◽

Pair Separation ◽

Dynamical Properties ◽

Mixed Clusters ◽

Dynamics Method ◽

Relative Root

In this paper, the geometric structures and melting-like processes of 13-atom pure copper and pure cobalt clusters and their 13-atom mixed clusters are investigated and compared using the molecular dynamics method. The calculations show that the pure copper and cobalt clusters have standard icosahedral structures, while the mixed clusters take on deformed icosahedral structures. Quantitative analysis shows that the deformations are slight. Moreover, an element similarity function is introduced, by which the contribution of cluster composition to the deformation of the mixed clusters is analyzed and discussed. As the temperature increases, migration and recombination of atoms on the cluster surfaces are observed, indicating the onset of the transition from a solid-like to a liquid-like state. By calculating the relative root-mean-squared pair separation fluctuation and monitoring the dynamical structures of the clusters, it is found that the mixed clusters undergo a multi-step process in the transition.
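For orientation, the relative root-mean-squared pair separation fluctuation is commonly defined in the cluster-melting literature as shown below. The abstract does not state the exact form used in this paper, so the expression is given as an assumption; here N is the number of atoms (13 for these clusters), r_ij is the distance between atoms i and j, and the angle brackets denote a time average over the trajectory.

```latex
\delta = \frac{2}{N(N-1)} \sum_{i<j}
         \frac{\sqrt{\langle r_{ij}^{2} \rangle_t - \langle r_{ij} \rangle_t^{2}}}
              {\langle r_{ij} \rangle_t}
```

A sharp rise of this quantity with temperature is typically read as the onset of the solid-like to liquid-like transition.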

Efficient Online Log Parsing with Log Punctuations Signature

Applied Sciences ◽

10.3390/app112411974 ◽

2021 ◽

Vol 11 (24) ◽

pp. 11974

Author(s):

Shijie Zhang ◽

Gang Wu

Keyword(s):

System Reliability ◽

State Of The Art ◽

Data Driven ◽

Log Analysis ◽

Similarity Function ◽

The Core ◽

Small Set ◽

Public Datasets ◽

Runtime Information ◽

Candidate Set

Logs, which record system runtime information, are frequently used to ensure software system reliability. Log parsing is the first and foremost step of typical log analysis, and many data-driven methods have been proposed to automate it. Most existing log parsers work offline, requiring a time-consuming training process and retraining as the system upgrades. Meanwhile, the state-of-the-art online log parsers are tree-based and still have defects in robustness and efficiency. To overcome these limitations, we abandon the tree structure and propose LogPunk, an efficient online log parsing method that takes a hash-like approach. The core of LogPunk is a novel log signature method based on log punctuations and length features. Using the signature, we can quickly find a small set of candidate templates. The most suitable template is then returned by traversing the candidate set with our log similarity function. We evaluated LogPunk on 16 public datasets from LogHub and compared it with five other log parsers. LogPunk achieves the best parsing accuracy of 91.9%. Evaluation results also demonstrate its superiority in terms of robustness and efficiency.
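The pipeline described above (signature, candidate lookup, similarity check) can be sketched roughly as follows. This is not the LogPunk implementation: the signature (token count plus the sequence of punctuation characters), the token-overlap similarity, the `<*>` wildcard, and the 0.5 threshold are all simplifying assumptions chosen for illustration.

```python
import string
from collections import defaultdict

PUNCT = set(string.punctuation)

def signature(line):
    """Hypothetical signature: number of tokens plus the punctuation characters in order."""
    punct = "".join(ch for ch in line if ch in PUNCT)
    return (len(line.split()), punct)

def similarity(tokens, template):
    """Fraction of positions where the token equals the template token or the wildcard."""
    same = sum(1 for a, b in zip(tokens, template) if a == b or b == "<*>")
    return same / max(len(tokens), 1)

class Parser:
    """Online parser: hash each line to a bucket, then scan only that bucket's templates."""

    def __init__(self, threshold=0.5):
        self.buckets = defaultdict(list)  # signature -> list of templates (token lists)
        self.threshold = threshold

    def parse(self, line):
        tokens = line.split()
        candidates = self.buckets[signature(line)]
        best = max(candidates, key=lambda t: similarity(tokens, t), default=None)
        if best is None or similarity(tokens, best) < self.threshold:
            candidates.append(tokens)  # no close template: this line starts a new one
            return tokens
        for i, (a, b) in enumerate(zip(tokens, best)):
            if a != b:
                best[i] = "<*>"  # generalize positions that differ
        return best
```

One limitation of this simplified signature is that variable fields containing different punctuation land in different buckets, so the sketch says nothing about how the actual method handles that case.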

Theoretical Calculation of Cocrystal Components for Explosives: A Similarity Function of Energetic Supramolecules

Crystal Growth & Design ◽

10.1021/acs.cgd.1c00933 ◽

2021 ◽

Author(s):

Xun Han ◽

Min Liu ◽

Zhong Huang ◽

Hui Huang ◽

Xinping Long ◽

...

Keyword(s):

Theoretical Calculation ◽

Similarity Function

On the design of a similarity function for sparse binary data with application on protein function annotation

Knowledge-Based Systems ◽

10.1016/j.knosys.2021.107863 ◽

2021 ◽

pp. 107863

Author(s):

Marcelo B.A. Veras ◽

Bishnu Sarker ◽

Sabeur Aridhi ◽

João P.P. Gomes ◽

José A.F. Macêdo ◽

...

Keyword(s):

Protein Function ◽

Binary Data ◽

Similarity Function ◽

Function Annotation ◽

Protein Function Annotation

Adaptive Similarity Function with Structural Features of Network Embedding for Missing Link Prediction

Complexity ◽

10.1155/2021/1277579 ◽

2021 ◽

Vol 2021 ◽

pp. 1-15

Author(s):

Chuanting Zhang ◽

Ke-Ke Shang ◽

Jingping Qiao

Keyword(s):

Link Prediction ◽

Graph Mining ◽

Data Science ◽

Fundamental Problem ◽

Structural Features ◽

Similarity Function ◽

Network Embedding ◽

Feature Representations ◽

Node Similarity ◽

Edge Features

Link prediction is a fundamental problem of data science, which usually calls for unfolding the mechanisms that govern the micro-dynamics of networks. In this regard, using features obtained from network embedding for predicting links has drawn widespread attention. Although methods based on edge features or node similarity have been proposed to solve the link prediction problem, many technical challenges still exist due to the unique structural properties of networks, especially when the networks are sparse. From the graph mining perspective, we first give empirical evidence of the inconsistency between heuristic and learned edge features. Then, we propose a novel link prediction framework, AdaSim, by introducing an Adaptive Similarity function using features obtained from network embedding based on random walks. The node feature representations are obtained by optimizing a graph-based objective function. Instead of generating edge features using binary operators, we perform link prediction solely leveraging the node features of the network. We define a flexible similarity function with one tunable parameter, which serves as a penalty of the original similarity measure. The optimal value is learned through supervised learning and thus is adaptive to data distribution. To evaluate the performance of our proposed algorithm, we conduct extensive experiments on eleven disparate networks of the real world. Experimental results show that AdaSim achieves better performance than state-of-the-art algorithms and is robust to different sparsities of the networks.

A Local Similarity Function for Katabatic Flows Derived from Field Observations Over Steep‐ and Shallow‐Angled Slopes

Geophysical Research Letters ◽

10.1029/2021gl095479 ◽

2021 ◽

Author(s):

Chaoxun Hang ◽

Holly J. Oldroyd ◽

Marco G. Giometto ◽

Eric R. Pardyjak ◽

Marc B. Parlange

Keyword(s):

Similarity Function ◽

Local Similarity ◽

Field Observations ◽

Katabatic Flows

A Method of Recommending Physical Education Network Course Resources Based on Machine Learning Algorithms

Security and Communication Networks ◽

10.1155/2021/4925605 ◽

2021 ◽

Vol 2021 ◽

pp. 1-9

Author(s):

Fu Wei

Keyword(s):

Machine Learning ◽

Physical Education ◽

Learning Algorithm ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Online Course ◽

Similarity Function ◽

Similarity Threshold ◽

Matching Degree ◽

Education Network

To address the difficulty of selecting physical education online course resources, a method of recommending online course resources based on machine learning algorithms is proposed. The information recommendation model is built from a collaborative filtering algorithm and a resource feedback matrix. Based on the feedback scores that users give to the same data resource in the project set, an interest matching degree is established by comparative analysis. The matching degree is then substituted into the cosine similarity function to calculate the similarity threshold between items. Once the similarity thresholds of all items have been calculated, the project resource that best matches the user is selected according to the threshold, completing the recommendation. The experimental results show that the proposed method performs well in recommendation accuracy and efficiency, and that it can support innovation in the teaching mode of higher physical education network curricula.
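The cosine similarity step can be illustrated with a small item-based sketch. The data layout (one feedback vector per course, one entry per user), the function names, and the threshold value are assumptions for illustration; the abstract does not specify how the matching degree is computed, so raw feedback scores stand in for it here.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length score vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def similar_items(item_vectors, target, threshold):
    """Return the items whose similarity to `target` reaches the threshold, best first."""
    scores = {
        name: cosine(item_vectors[target], vec)
        for name, vec in item_vectors.items()
        if name != target
    }
    return sorted((n for n, s in scores.items() if s >= threshold),
                  key=scores.get, reverse=True)

# Hypothetical feedback matrix: each course maps to the scores given by three users.
courses = {
    "yoga": [5, 4, 0],
    "pilates": [4, 5, 0],
    "boxing": [0, 1, 5],
}
```

With these made-up scores, `similar_items(courses, "yoga", 0.8)` keeps only "pilates", because the same users rated both courses highly, while "boxing" falls well below the threshold.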

Impact of Variability in Cell Cycle Periodicity on Cell Population Dynamics

10.1101/2021.10.13.464184 ◽

2021 ◽

Author(s):

Chance Michael Nowak ◽

Tyler Quarton ◽

Leonidas Bleris

Keyword(s):

Cell Cycle ◽

Population Dynamics ◽

Cell Population ◽

Cell Cycle Progression ◽

Similarity Function ◽

Cycle Progression ◽

Cell Population Dynamics ◽

Model Simulations ◽

Cell Cycle Synchronization ◽

Cervical Cancer Cells

Cell cycle synchronization has been pivotal in the development of our understanding of cell population dynamics. Intriguingly, when cells are released from a synchronized state, they do not maintain synchronized cell division and rapidly become asynchronous. Here, using a combination of experiments and model simulations, we investigate this process of "cell cycle desynchronization" in cervical cancer cells (HeLa) that are arrested at the G1/S boundary. We tracked DNA content over time at regular intervals to monitor cell cycle progression and developed a custom auto-similarity function to quantify the convergence to asynchronicity. In parallel, using experimental data, we developed a single-cell phenomenological model that returns DNA concentration across the cell cycle stages from a desynchronizing cell population. Our simulations revealed that desynchronization is primarily sensitive to cell cycle variability. We tested this prediction by introducing lipopolysaccharide to increase cellular noise, which resulted in greater cell cycle variability with an enhanced rate of desynchronization. Our results show that the desynchronization rate of cell populations can be used as a proxy for the degree of variance in cell cycle periodicity.

A semantic search approach for hyper relational knowledge graphs

10.5753/sbbd_estendido.2021.18171 ◽

2021 ◽

Author(s):

Veronica dos Santos ◽

Sérgio Lifschitz

Keyword(s):

Information Retrieval ◽

Semantic Search ◽

Similarity Function ◽

Search Results ◽

Relational Knowledge ◽

Retrieval Systems ◽

User Query ◽

Information Retrieval Systems ◽

Knowledge Graphs ◽

Search Approach

Information Retrieval Systems usually employ syntactic search techniques to match a set of keywords with the indexed content to retrieve results. However, pure keyword-based matching fails to capture the user's search intention and context, and suffers from natural language ambiguity and vocabulary mismatch. Considering this scenario, the hypothesis raised is that the use of embeddings in a semantic search approach will make search results more meaningful. Embeddings help minimize problems arising from terminology and context mismatch. This work proposes a semantic similarity function to support semantic search based on hyper relational knowledge graphs. This function uses embeddings to find the most similar nodes that satisfy a user query.
