Gaussian Embedding of Linked Documents from a Pretrained Semantic Space

Author(s):  
Antoine Gourru ◽  
Julien Velcin ◽  
Julien Jacques

Gaussian Embedding of Linked Documents (GELD) is a new method that embeds linked documents (e.g., citation networks) onto a pretrained semantic space (e.g., a set of word embeddings). We formulate the problem in such a way that we model each document as a Gaussian distribution in the word vector space. We design a generative model that combines both words and links in a consistent way. Leveraging the variance of a document allows us to model the uncertainty related to word and link generation. In most cases, our method outperforms state-of-the-art methods when using our document vectors as features for usual downstream tasks. In particular, GELD achieves better accuracy in classification and link prediction on Cora and Dblp. In addition, we demonstrate qualitatively the convenience of several properties of our method. We provide the implementation of GELD and the evaluation datasets to the community (https://github.com/AntoineGourru/DNEmbedding).
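As a rough illustration of what it means to treat a document as a Gaussian in a word vector space, the sketch below estimates a diagonal Gaussian (a mean and a per-dimension variance) from the vectors of a document's words. It is a plain moment estimate, not GELD's generative model, which also accounts for links. The toy two-dimensional vectors and the helper name `fit_diagonal_gaussian` are assumptions made for this example.

```python
def fit_diagonal_gaussian(vectors):
    """Return (mean, variance) per dimension for a list of equal-length vectors."""
    n = len(vectors)
    dim = len(vectors[0])
    mean = [sum(v[d] for v in vectors) / n for d in range(dim)]
    # Population variance; a larger value means the document's words are
    # more spread out along that dimension of the word vector space.
    var = [sum((v[d] - mean[d]) ** 2 for v in vectors) / n for d in range(dim)]
    return mean, var


# Hypothetical pretrained word vectors (toy values, 2 dimensions).
word_vectors = {
    "graph": (1.0, 0.0),
    "poem": (0.0, 1.0),
}

document = ["graph", "poem"]
mean, var = fit_diagonal_gaussian([word_vectors[w] for w in document])
print(mean)  # [0.5, 0.5]
print(var)   # [0.25, 0.25]
```

A document whose words are tightly clustered gets a small variance, and one that mixes distant topics gets a large one. That is the kind of uncertainty the abstract refers to.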

2017 ◽  
Vol 2 (1) ◽  
pp. 299-316 ◽  
Author(s):  
Cristina Pérez-Benito ◽  
Samuel Morillas ◽  
Cristina Jordán ◽  
J. Alberto Conejero

Improving the efficiency and effectiveness of image denoising and enhancement methods remains a challenge. There exist denoising and enhancement methods that are able to improve the visual quality of images. This is usually achieved by removing noise while sharpening details and improving edge contrast. Smoothing refers to the case of denoising when the noise follows a Gaussian distribution. The two operations, smoothing noise and sharpening, are opposite in nature. Therefore, few approaches address both goals simultaneously. We review these methods and also provide a detailed study of the state-of-the-art methods that address each problem separately in colour images.


Author(s):  
Tatsunori B. Hashimoto ◽  
David Alvarez-Melis ◽  
Tommi S. Jaakkola

Continuous word representations have been remarkably useful across NLP tasks but remain poorly understood. We ground word embeddings in semantic spaces studied in the cognitive-psychometric literature, taking these spaces as the primary objects to recover. To this end, we relate log co-occurrences of words in large corpora to semantic similarity assessments and show that co-occurrences are indeed consistent with a Euclidean semantic space hypothesis. Framing word embedding as metric recovery of a semantic space unifies existing word embedding algorithms, ties them to manifold learning, and demonstrates that existing algorithms are consistent metric recovery methods given co-occurrence counts from random walks. Furthermore, we propose a simple, principled, direct metric recovery algorithm that performs on par with state-of-the-art word embedding and manifold learning methods. Finally, we complement the recent focus on analogies by constructing two new inductive reasoning datasets (series completion and classification) and demonstrate that word embeddings can be used to solve them as well.


Psihologija ◽  
2017 ◽  
Vol 50 (4) ◽  
pp. 503-520 ◽  
Author(s):  
Marco Marelli

Distributional semantics has long been a source of successful models in psycholinguistics, making it possible to obtain semantic estimates for a large number of words quickly and automatically. However, such resources remain scarce or difficult to access for languages other than English. The present paper describes WEISS (Word-Embeddings Italian Semantic Space), a distributional semantic model based on Italian. WEISS includes models of semantic representations trained with state-of-the-art word-embedding methods, which apply neural networks to induce distributed representations of lexical meanings. The resource is evaluated against two test sets, demonstrating that WEISS outperforms a baseline encoding word associations. Moreover, an extensive qualitative analysis of the WEISS output provides examples of the model's potential for capturing several semantic phenomena. Two variants of WEISS are released and made easily accessible on the web through the SNAUT graphical interface.


2016 ◽  
Vol 2016 ◽  
pp. 1-14 ◽  
Author(s):  
M. G. Sobamowo

The determinant of a matrix is a very powerful tool that helps in establishing properties of matrices. Its importance in various engineering and applied science problems has made it a mathematical area of increasing significance. Among existing methods for finding the determinant of a matrix, the basketweave method/Sarrus’ rule has been shown to be the simplest, fastest, and most straightforward accurate method for computing the determinant of 3 × 3 matrices. However, its major limitation is that it does not work for matrices larger than 3 × 3, a fact well established in the literature. Therefore, the state-of-the-art methods for finding the determinants of 4 × 4 and larger matrices are predominantly founded on non-basketweave/non-Sarrus methods. In this work, an extension of the simple, easy, accurate, and straightforward approach to the determinant of larger matrices is presented. The paper develops a new method, with different schemes based on the basketweave method/Sarrus’ rule, for computing the determinant of 4 × 4 matrices. The potency of the new method is revealed in the generalization of the basketweave method/Sarrus’ rule to the computation of the determinant of n × n (n > 3) matrices. The new method is very efficient, very consistent for hand calculations, highly accurate, and faster than other existing methods.
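To make the limitation concrete, the sketch below implements the classical diagonal scheme (products along the wrapped "down" diagonals minus products along the wrapped "up" diagonals) and compares it with cofactor expansion. For a 3 × 3 matrix the scheme is exactly Sarrus' rule; applied unchanged to a 4 × 4 matrix it yields only 8 of the 24 signed terms and gives a wrong answer. This is not the paper's new method, which is not described in the abstract; the function names are chosen for this example.

```python
from math import prod


def diagonal_scheme(m):
    """Wrapped-diagonal products: down diagonals minus up diagonals.

    For a 3x3 matrix this is exactly Sarrus' rule (basketweave method).
    """
    n = len(m)
    down = sum(prod(m[i][(i + k) % n] for i in range(n)) for k in range(n))
    up = sum(prod(m[i][(k - i) % n] for i in range(n)) for k in range(n))
    return down - up


def det_cofactor(m):
    """Determinant by cofactor (Laplace) expansion along the first row."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for col in range(n):
        # Minor: drop the first row and the current column.
        minor = [row[:col] + row[col + 1:] for row in m[1:]]
        total += (-1) ** col * m[0][col] * det_cofactor(minor)
    return total


a3 = [[1, 2, 3], [4, 5, 6], [7, 8, 10]]
print(diagonal_scheme(a3), det_cofactor(a3))  # -3 -3

# 4x4 identity with its first two rows swapped: the true determinant is -1.
a4 = [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
print(diagonal_scheme(a4), det_cofactor(a4))  # 0 -1
```

The 4 × 4 case fails because a determinant of order n has n! signed terms, while the wrapped diagonals supply only 2n of them. The two counts coincide only for n = 3.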


Author(s):  
Ziyu Wan ◽  
Yan Li ◽  
Min Yang ◽  
Junge Zhang

In this paper, we propose a Visual Center Adaptation Method (VCAM) to address the domain shift problem in zero-shot learning. For the seen classes in the training data, VCAM builds an embedding space by learning the mapping from semantic space to some visual centers. For the unseen classes in the test data, the construction of the embedding space is constrained by a symmetric Chamfer-distance term that aims to adapt the distribution of the synthetic visual centers to that of the real cluster centers. As a result, the learned embedding space generalizes well to the unseen classes. Experiments on two widely used datasets demonstrate that our model significantly outperforms state-of-the-art methods.
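For readers unfamiliar with the term, a symmetric Chamfer distance between two point sets can be sketched as follows. The abstract does not give the exact form used in VCAM, so this uses one common definition (sum of squared Euclidean distances to the nearest neighbour, taken in both directions) as an assumption; variants average instead of summing or use unsquared distances. The sample points are made up.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def one_way_chamfer(source, target):
    """Sum, over source points, of the squared distance to the nearest target point."""
    return sum(min(squared_distance(p, q) for q in target) for p in source)


def symmetric_chamfer(set_a, set_b):
    """One-way Chamfer terms added in both directions, so the result is symmetric."""
    return one_way_chamfer(set_a, set_b) + one_way_chamfer(set_b, set_a)


# Hypothetical synthetic visual centers versus real cluster centers (2-D toy data).
synthetic_centers = [(0, 0), (1, 0)]
cluster_centers = [(0, 0), (1, 1)]
print(symmetric_chamfer(synthetic_centers, cluster_centers))  # 2
```

Because each point is matched only to its nearest neighbour in the other set, no correspondence between the two sets is needed. This suits unseen classes, where cluster centers carry no class labels.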


RSC Advances ◽  
2014 ◽  
Vol 4 (102) ◽  
pp. 58127-58136
Author(s):  
Shantanu Kadam ◽  
Kumar Vanka

A computational method based on the concept of “noise”, in conjunction with the representative reaction approach, is proposed to solve the problem of negative numbers. The new method is more accurate and more efficient than other state-of-the-art methods.


Author(s):  
Bidisha Samanta ◽  
Abir De ◽  
Gourhari Jana ◽  
Pratim Kumar Chattaraj ◽  
Niloy Ganguly ◽  
...  

Deep generative models have been praised for their ability to learn smooth latent representations of images, text, and audio, which can then be used to generate new, plausible data. However, current generative models are unable to work with molecular graphs because of their unique characteristics: their underlying structure is not Euclidean or grid-like, they remain isomorphic under permutation of the node labels, and they come with different numbers of nodes and edges. In this paper, we propose NeVAE, a novel variational autoencoder for molecular graphs, whose encoder and decoder are specially designed to account for the above properties by means of several technical innovations. In addition, by using masking, the decoder is able to guarantee a set of valid properties in the generated molecules. Experiments reveal that our model can discover plausible, diverse, and novel molecules more effectively than several state-of-the-art methods. Moreover, by applying Bayesian optimization over the continuous latent representation of molecules that our model finds, we can also find molecules that maximize certain desirable properties more effectively than alternatives.


2019 ◽  
Vol 3 (2) ◽  
pp. 30 ◽  
Author(s):  
Dandan Li ◽  
Douglas Summers-Stay

Word embeddings have been very successful in many natural language processing tasks, but they characterize the meaning of a word or concept by uninterpretable “context signatures”. Such a representation can render results obtained using embeddings difficult to interpret. Neighboring word vectors may have similar meanings, but in what way are they similar? That similarity may represent synonymy, metonymy, or even antonymy. In the cognitive psychology literature, in contrast, concepts are frequently represented by their relations with properties. These properties are produced by test subjects when asked to describe important features of concepts. As such, they form a natural, intuitive feature space. In this work, we present a neural-network-based method for automatically mapping a distributional semantic space onto a human-built property space. We evaluate our method on word embeddings learned with different types of contexts, and report state-of-the-art performance on the widely used McRae semantic feature production norms.


Author(s):  
Wanhua Cao ◽  
Yi Zhang ◽  
Juntao Liu ◽  
Ziyun Rao

Knowledge graph embedding improves the performance of relation extraction and knowledge reasoning by encoding entities and relationships in a low-dimensional semantic space. During training, negative samples are usually constructed by replacing the head or tail entity, and different replacement strategies lead to different prediction accuracy. This paper develops a negative triplet construction framework based on the frequency of the entities associated with each relation. The proposed framework takes into account the numbers of relations and entities in the dataset to set the proportion of relation replacement versus entity replacement, and uses the frequency of the entities associated with each relationship to set reasonable proportions for different relations. To verify its validity, the framework is integrated into state-of-the-art knowledge graph embedding models such as TransE, TransH, DistMult, ComplEx, and Analogy. Both relation prediction and entity prediction criteria are used to evaluate link prediction performance more comprehensively. The experimental results on two commonly used datasets, WN18 and FB15K, show that the proposed method improves entity link prediction and triplet classification accuracy, and especially the accuracy of relational link prediction.
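The usual baseline that the abstract starts from, building a negative triplet by replacing the head or tail entity, might look like the sketch below. It uses a fixed head-replacement probability and uniform entity sampling. The paper's frequency-based proportions and its relation replacement are not specified in the abstract, so they are not reproduced here. The function name `corrupt_triplet`, its parameters, and the toy graph are assumptions made for this example.

```python
import random


def corrupt_triplet(triplet, entities, known_triplets, rng, p_head=0.5, max_tries=100):
    """Return a negative (head, relation, tail) by replacing the head or the tail.

    Candidates equal to the original or already present in the graph are
    rejected, so the result is not a known positive. Returns None if no
    valid candidate is found within max_tries.
    """
    head, relation, tail = triplet
    for _ in range(max_tries):
        replacement = rng.choice(entities)
        if rng.random() < p_head:
            candidate = (replacement, relation, tail)  # corrupt the head
        else:
            candidate = (head, relation, replacement)  # corrupt the tail
        if candidate != triplet and candidate not in known_triplets:
            return candidate
    return None


# Toy knowledge graph (hypothetical entities and relation).
entities = ["paris", "france", "berlin", "germany"]
known = {("paris", "capital_of", "france"), ("berlin", "capital_of", "germany")}

rng = random.Random(0)  # seeded for reproducibility
negative = corrupt_triplet(("paris", "capital_of", "france"), entities, known, rng)
print(negative)
```

A frequency-aware scheme such as the one the abstract describes would replace the constant `p_head` and the uniform `rng.choice` with proportions derived from dataset statistics.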


2019 ◽  
Author(s):  
Jacson Rodrigues Barbosa ◽  
Ricardo Marcondes Marcacini ◽  
Ricardo Britto ◽  
Frederico Soares ◽  
Solange Rezende ◽  
...  

Bug localization (BL) from bug reports is a strategic activity in the software maintenance process. Because BL is costly and tedious, information retrieval-based and machine learning-based BL techniques could aid software engineers. We propose a method for BUg Localization with word embeddings and Network Regularization (BULNER). Preliminary results suggest that BULNER performs better than two state-of-the-art methods.

