Conjunctive vector representations for set-valued feature descriptions

Author(s):
Michael Carl
2017

Author(s):
Sabrina Jaeger
Simone Fulle
Samo Turk

Inspired by natural language processing techniques, we here introduce Mol2vec, an unsupervised machine learning approach to learn vector representations of molecular substructures. Similarly to Word2vec models, where vectors of closely related words lie in close proximity in the vector space, Mol2vec learns vector representations of molecular substructures that point in similar directions for chemically related substructures. Compounds can finally be encoded as vectors by summing up the vectors of their individual substructures and can, for instance, be fed into supervised machine learning approaches to predict compound properties. The underlying substructure vector embeddings are obtained by training an unsupervised machine learning approach on a so-called corpus of compounds that consists of all available chemical matter. The resulting Mol2vec model is pre-trained once, yields dense vector representations, and overcomes drawbacks of common compound feature representations such as sparseness and bit collisions. The prediction capabilities are demonstrated on several compound property and bioactivity data sets and compared with results obtained for Morgan fingerprints as a reference compound representation. Mol2vec can be easily combined with ProtVec, which employs the same Word2vec concept on protein sequences, resulting in a proteochemometric approach that is alignment-independent and can thus also be easily used for proteins with low sequence similarity.
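
A minimal sketch of the encoding scheme described above, assuming substructure identifiers (e.g. Morgan-algorithm atom environments) have already been extracted for each compound; the toy corpus, identifiers, and training parameters are illustrative, not the published Mol2vec setup.

```python
# Minimal sketch of a Mol2vec-style encoding: train Word2vec on "sentences" of
# substructure identifiers, then encode a compound as the sum of its
# substructure vectors. Identifiers and parameters are illustrative.
import numpy as np
from gensim.models import Word2Vec

# Hypothetical corpus: each compound is a sentence of substructure identifiers.
corpus = [
    ["2246728737", "864662311", "847957139"],
    ["2246728737", "847957139", "98513984"],
]

# Unsupervised skip-gram training, analogous to Word2vec on text.
model = Word2Vec(corpus, vector_size=100, window=10, min_count=1, sg=1)

def compound_vector(substructures, model):
    """Encode a compound by summing the embeddings of its substructures."""
    vecs = [model.wv[s] for s in substructures if s in model.wv]
    return np.sum(vecs, axis=0) if vecs else np.zeros(model.vector_size)

x = compound_vector(corpus[0], model)  # dense feature vector for downstream ML
```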


2021
Vol 15 (3)
pp. 1-19
Author(s):
Wei Wang
Feng Xia
Jian Wu
Zhiguo Gong
Hanghang Tong
...

While scientific collaboration is critical for a scholar, some collaborators can be more significant than others, e.g., lifetime collaborators. It has been shown that lifetime collaborators are more influential on a scholar’s academic performance. However, little research has been done on predicting such special relationships in academic networks. To this end, we propose Scholar2vec, a novel neural network embedding for representing scholar profiles. First, our approach creates scholars’ research interest vectors from textual information, such as demographics, research, and influence. After bridging research interests with a collaboration network, vector representations of scholars can be obtained with graph learning. Meanwhile, since scholars are characterized by various attributes, we propose to incorporate four types of scholar attributes for learning scholar vectors. Finally, the early-stage similarity sequence based on Scholar2vec is used to predict lifetime collaborators with machine learning methods. Extensive experiments on two real-world datasets show that Scholar2vec outperforms state-of-the-art methods in lifetime collaborator prediction. Our work presents a new way to measure the similarity between two scholars via vector representations, bridging network embedding and academic relationship mining.
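
A hedged sketch of the kind of pipeline the abstract outlines: derive research-interest vectors from text, propagate them over the collaboration network, and predict lifetime collaborators from early-stage similarity features. The scholar names, the smoothing step, and the classifier choice are illustrative assumptions, not the authors' exact Scholar2vec method.

```python
# Illustrative pipeline: text-based interest vectors + graph smoothing over the
# collaboration network + similarity features for collaborator prediction.
import numpy as np
import networkx as nx
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Toy scholar profiles (textual research interests) and co-authorship edges.
profiles = {
    "s1": "graph learning network embedding",
    "s2": "network embedding collaboration prediction",
    "s3": "molecular property prediction fingerprints",
    "s4": "graph learning temporal networks",
}
G = nx.Graph([("s1", "s2"), ("s1", "s4"), ("s2", "s3")])

names = list(profiles)
idx = {n: k for k, n in enumerate(names)}
X_text = TfidfVectorizer().fit_transform([profiles[n] for n in names]).toarray()

# One round of graph smoothing: mix each scholar's interest vector with the
# average of their collaborators' vectors.
X_graph = X_text.copy()
for n in names:
    nbrs = list(G.neighbors(n)) if n in G else []
    if nbrs:
        X_graph[idx[n]] = 0.5 * X_text[idx[n]] + 0.5 * X_text[[idx[m] for m in nbrs]].mean(axis=0)

def cos(a, b):
    """Cosine similarity between two scholar vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

# Early-stage similarity features for candidate pairs; labels are synthetic
# placeholders standing in for "became lifetime collaborators".
pairs, y = [("s1", "s2"), ("s1", "s3"), ("s2", "s4"), ("s3", "s4")], [1, 0, 1, 0]
X = np.array([[cos(X_graph[idx[a]], X_graph[idx[b]])] for a, b in pairs])
clf = LogisticRegression().fit(X, y)
```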


2015
Vol 74
pp. 46-56
Author(s):
Alexander Hogenboom
Flavius Frasincar
Franciska de Jong
Uzay Kaymak

1991
Vol 06 (10)
pp. 923-927
Author(s):
S.M. Sergeev

In this paper spectral decompositions of R-matrices for vector representations of exceptional algebras are found.
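
For context, the generic form of such a spectral decomposition (for an R-matrix acting on the tensor square of the vector representation; the paper's explicit formulas for the exceptional algebras are not reproduced here):

```latex
% Generic spectral decomposition of an R-matrix, assuming the standard setting;
% not the paper's explicit exceptional-algebra expressions.
\[
  \check{R}(u) \;=\; \sum_{k} \rho_k(u)\, P_k ,
\]
```

Here the $P_k$ are projectors onto the irreducible components of $V \otimes V$ and the $\rho_k(u)$ are scalar functions of the spectral parameter $u$.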


2021
Vol 27 (1)
pp. 51-56
Author(s):
E. B. Doronina
A. V. Skatkov
...

The article addresses the problem of analyzing the efficiency of maintenance and repair of complex technical equipment and presents a number of problem statements that reflect the task of choosing an optimal service plan within a sequence of operations. Scalar and vector representations of the problem are considered, and a scheme for implementing an approach to evaluating the effectiveness of repair and maintenance plans is proposed.
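
As a purely illustrative sketch of the distinction between the scalar and vector formulations mentioned above: a scalar representation folds the evaluation criteria of a maintenance plan into a single weighted score, while a vector representation keeps the criteria separate and compares plans by dominance. The criteria names and weights below are assumptions, not taken from the article.

```python
# Scalar vs. vector evaluation of maintenance/repair plans (illustrative only).
from dataclasses import dataclass

@dataclass
class PlanEstimate:
    cost: float        # cost of the repair/maintenance plan
    downtime: float    # equipment downtime incurred by the plan
    risk: float        # residual failure risk after the plan

def scalar_score(p: PlanEstimate, w=(0.5, 0.3, 0.2)) -> float:
    """Scalar representation: fold the criteria into one weighted sum (lower is better)."""
    return w[0] * p.cost + w[1] * p.downtime + w[2] * p.risk

def dominates(a: PlanEstimate, b: PlanEstimate) -> bool:
    """Vector representation: a dominates b if it is no worse on every criterion
    and strictly better on at least one."""
    av, bv = (a.cost, a.downtime, a.risk), (b.cost, b.downtime, b.risk)
    return all(x <= y for x, y in zip(av, bv)) and any(x < y for x, y in zip(av, bv))

plans = [PlanEstimate(10.0, 4.0, 0.1), PlanEstimate(8.0, 6.0, 0.2)]
best_scalar = min(plans, key=scalar_score)                       # single "best" plan
frontier = [p for p in plans if not any(dominates(q, p) for q in plans)]  # Pareto set
```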


2021
pp. 1-12
Author(s):
Melesio Crespo-Sanchez
Ivan Lopez-Arevalo
Edwin Aldana-Bobadilla
Alejandro Molina-Villegas

In the last few years, text analysis has grown into a keystone in several domains for solving many real-world problems, such as machine translation, spam detection, and question answering, to mention a few. Many of these tasks can be approached by means of machine learning algorithms. Most of these algorithms take as input a transformation of the text in the form of feature vectors containing an abstraction of the content. Most recent vector representations focus on the semantic component of text; however, we consider that also taking the lexical and syntactic components into account in the abstraction of content could be beneficial for learning tasks. In this work, we propose a content spectral-based text representation applicable to machine learning algorithms for text analysis. This representation integrates the spectra of the lexical, syntactic, and semantic components of a text, producing an abstract image which can be treated by both text and image learning algorithms. These components come from feature vectors of the text. To demonstrate the merits of our proposal, it was tested on text classification and reading complexity score prediction tasks, obtaining promising results.
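
A simplified sketch of the general idea described above: compute lexical, syntactic, and semantic feature vectors for a text and stack them into a 2D, image-like array that either text or image learning algorithms could consume. The toy feature extractors below are stand-ins for the spectral components of the proposal, not the authors' construction.

```python
# Stack lexical, syntactic, and semantic feature vectors into an "abstract image".
import numpy as np

def lexical_features(text: str, dim: int = 16) -> np.ndarray:
    """Toy lexical spectrum: normalized character-frequency histogram."""
    counts = np.zeros(dim)
    for ch in text.lower():
        counts[hash(ch) % dim] += 1
    return counts / max(counts.sum(), 1.0)

def syntactic_features(text: str, dim: int = 16) -> np.ndarray:
    """Toy syntactic spectrum: histogram of token lengths as a crude structural proxy."""
    counts = np.zeros(dim)
    for tok in text.split():
        counts[min(len(tok), dim - 1)] += 1
    return counts / max(counts.sum(), 1.0)

def semantic_features(text: str, dim: int = 16) -> np.ndarray:
    """Toy semantic spectrum: hashed bag-of-words (a real system would use embeddings)."""
    counts = np.zeros(dim)
    for tok in text.lower().split():
        counts[hash(tok) % dim] += 1
    return counts / max(counts.sum(), 1.0)

def content_image(text: str) -> np.ndarray:
    """Stack the three component spectra into a 3 x dim image-like array."""
    return np.stack([lexical_features(text), syntactic_features(text), semantic_features(text)])

img = content_image("Text analysis has grown as a keystone in several domains.")
```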


2021
Vol 10 (1)
Author(s):
Koya Sato
Mizuki Oka
Alain Barrat
Ciro Cattuto

Low-dimensional vector representations of network nodes have proven successful to feed graph data to machine learning algorithms and to improve performance across diverse tasks. Most of the embedding techniques, however, have been developed with the goal of achieving dense, low-dimensional encoding of network structure and patterns. Here, we present a node embedding technique aimed at providing low-dimensional feature vectors that are informative of dynamical processes occurring over temporal networks (rather than of the network structure itself), with the goal of enabling prediction tasks related to the evolution and outcome of these processes. We achieve this by using a lossless modified supra-adjacency representation of temporal networks and building on standard embedding techniques for static graphs based on random walks. We show that the resulting embedding vectors are useful for prediction tasks related to paradigmatic dynamical processes, namely epidemic spreading over empirical temporal networks. In particular, we illustrate the performance of our approach for the prediction of nodes’ epidemic states in single instances of a spreading process. We show how framing this task as a supervised multi-label classification task on the embedding vectors allows us to estimate the temporal evolution of the entire system from a partial sampling of nodes at random times, with potential impact for nowcasting infectious disease dynamics.
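
A simplified sketch of the pipeline described above: build a supra-adjacency graph from timestamped contacts, run random walks on it, and embed the resulting node-time copies with Word2Vec. The edge rules and parameters below are illustrative assumptions, not the authors' exact lossless supra-adjacency construction.

```python
# Supra-adjacency construction + random-walk embedding of a temporal network.
import random
import networkx as nx
from gensim.models import Word2Vec

contacts = [(0, "a", "b"), (0, "b", "c"), (1, "a", "c"), (2, "b", "c")]  # (time, i, j)

# One copy of each node per time step it is active, with contact edges within a
# time step and "memory" edges linking a node's consecutive copies.
G = nx.Graph()
times = {}
for t, i, j in contacts:
    G.add_edge((i, t), (j, t))
    times.setdefault(i, set()).add(t)
    times.setdefault(j, set()).add(t)
for node, ts in times.items():
    ts = sorted(ts)
    for t1, t2 in zip(ts, ts[1:]):
        G.add_edge((node, t1), (node, t2))

def walks(graph, num_walks=20, walk_len=10):
    """Uniform random walks over the supra-adjacency graph."""
    out = []
    for _ in range(num_walks):
        for start in graph.nodes():
            walk = [start]
            while len(walk) < walk_len:
                nbrs = list(graph.neighbors(walk[-1]))
                if not nbrs:
                    break
                walk.append(random.choice(nbrs))
            out.append([f"{n}@{t}" for n, t in walk])
    return out

model = Word2Vec(walks(G), vector_size=32, window=5, min_count=1, sg=1)
vec = model.wv["a@0"]  # embedding of node "a" at time 0, usable as ML features
```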

