Exploiting Latent Semantic Subspaces to Derive Associations for Specific Pharmaceutical Semantics

2020 ◽  
Vol 5 (4) ◽  
pp. 333-345
Author(s):  
Janus Wawrzinek ◽  
José María González Pinto ◽  
Oliver Wiehr ◽  
Wolf-Tilo Balke

State-of-the-art approaches in the field of neural embedding models (NEMs) enable progress in the automatic extraction and prediction of semantic relations between important entities like active substances, diseases, and genes. In particular, the prediction property is making them valuable for important research-related tasks such as hypothesis generation and drug repositioning. A core challenge in the biomedical domain is to have interpretable semantics from NEMs that can distinguish, for instance, between the following two situations: (a) drug x induces disease y and (b) drug x treats disease y. However, NEMs alone cannot distinguish between associations such as treats or induces. Is it possible to develop a model to learn a latent representation from the NEMs capable of such disambiguation? To what extent do we need domain knowledge to succeed in the task? In this paper, we answer both questions and show that our proposed approach not only succeeds in the disambiguation task but also advances current growing research efforts to find real predictions using a sophisticated retrospective analysis. Furthermore, we investigate which type of associations is generally better contextualized and therefore probably has a stronger influence in our disambiguation task. In this context, we present an approach to extract an interpretable latent semantic subspace from the original embedding space in which therapeutic drug–disease associations are more likely.

2017 ◽  
Vol 4 (S) ◽  
pp. 76
Author(s):  
Duc-Hau Le

Computational drug repositioning has proven to be a promising and efficient strategy for discovering new uses for existing drugs. To achieve this goal, a number of computational methods have been proposed, which are based on different data sources for drugs and diseases and on different approaches. Depending on where the discovery of drug-disease relationships comes from, the proposed computational methods can be categorized as either ‘drug-based’ or ‘disease-based’. These methods usually identify new indications for drugs on the assumption that similar drugs can be used for similar diseases. Therefore, similarities between drugs and between diseases are usually used as inputs. In addition, the methods also need known drug-disease associations. It should be noted that these associations are still not well established, because many marketed drugs have been withdrawn, and this could affect the outcome of the methods. In this study, instead of using known drug-disease associations, we rely on known disease-gene and drug-target associations. In addition, we use similarity between drugs, measured by the chemical structures of drug compounds, and similarity between diseases, measured by shared phenotypes. Then, a semi-supervised learning model, Regularized Least Squares (RLS), which can exploit this information effectively, is used to find new uses of drugs. Experimental results demonstrate that our method, RLSDR, outperforms several state-of-the-art methods in terms of area under the ROC curve (AUC). Novel indications for a number of drugs are identified and validated by evidence from different resources.
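
The abstract above does not spell out the RLSDR formulation, so the following is only a minimal sketch of the generic kernel RLS closed form, F = K (K + sigma I)^-1 Y, which is commonly used for this kind of similarity-based scoring. The toy similarity matrix `K`, the label matrix `Y`, the parameter `sigma`, and all function names are illustrative assumptions, not taken from the paper.

```python
def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def solve(A, B):
    """Solve A X = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    # Build the augmented matrix [A | B] with float entries.
    M = [[float(v) for v in A[i]] + [float(v) for v in B[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [row[n:] for row in M]


def rls_scores(K, Y, sigma=1.0):
    """Generic kernel RLS scores: F = K (K + sigma * I)^-1 Y (assumed form)."""
    n = len(K)
    regularized = [[K[i][j] + (sigma if i == j else 0.0) for j in range(n)]
                   for i in range(n)]
    return matmul(K, solve(regularized, Y))


# Hypothetical toy data: three drugs, two candidate associations (columns).
# Drugs 0 and 1 are chemically similar; drug 1 has no known association.
K = [[1.0, 0.9, 0.1],
     [0.9, 1.0, 0.1],
     [0.1, 0.1, 1.0]]
Y = [[1.0, 0.0],
     [0.0, 0.0],
     [0.0, 1.0]]
scores = rls_scores(K, Y)
```

In this toy setting, drug 1 receives a higher score for the first column than for the second, because it inherits evidence from its similar neighbor, drug 0. The regularization weight `sigma` controls how strongly the scores are smoothed.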


2022 ◽  
Vol 12 ◽  
Author(s):  
Chu-Qiao Gao ◽  
Yuan-Ke Zhou ◽  
Xiao-Hong Xin ◽  
Hui Min ◽  
Pu-Feng Du

Drug repositioning provides a promising and efficient strategy to discover potential associations between drugs and diseases. Many systematic computational drug-repositioning methods have been introduced, which are based on various similarities of drugs and diseases. In this work, we propose a new computational model, DDA-SKF (drug–disease associations prediction using similarity kernels fusion), which can predict novel drug indications by utilizing similarity kernel fusion (SKF) and Laplacian regularized least squares (LapRLS) algorithms. DDA-SKF integrates multiple similarities of drugs and diseases. Its prediction performance is better than, or at least comparable to, that of state-of-the-art methods. DDA-SKF can work without sufficient similarity information between drug indications, which allows us to predict new purposes for orphan drugs. The source code and benchmarking datasets are deposited in a GitHub repository (https://github.com/GCQ2119216031/DDA-SKF).


2021 ◽  
Vol 11 (4) ◽  
pp. 1728
Author(s):  
Hua Zhong ◽  
Li Xu

The prediction interval (PI) is an important research topic in reliability analyses and decision support systems. Data size and computation costs are two of the issues that may hamper the construction of PIs. This paper proposes an all-batch (AB) loss function for constructing high-quality PIs. Taking full advantage of the likelihood principle, the proposed loss makes it possible to train PI generation models using the gradient descent (GD) method for both small and large batches of samples. With the structure of dual feedforward neural networks (FNNs), a high-quality PI generation framework is introduced, which can be adapted to a variety of problems including regression analysis. Numerical experiments were conducted on benchmark datasets; the results show that higher-quality PIs were achieved using the proposed scheme. Its reliability and stability were also verified in comparison with various state-of-the-art PI construction methods.
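
The abstract above does not define the AB loss or say how PI quality is measured. As a hedged illustration of what "high-quality" usually refers to, the sketch below computes two widely used indicators: the coverage probability (the share of targets that fall inside their interval) and the mean interval width. The function names and the toy numbers are assumptions made for this example.

```python
def coverage_probability(targets, lower, upper):
    """Fraction of targets that lie inside their [lower, upper] interval."""
    inside = sum(1 for y, lo, hi in zip(targets, lower, upper) if lo <= y <= hi)
    return inside / len(targets)


def mean_interval_width(lower, upper):
    """Average width of the prediction intervals."""
    return sum(hi - lo for lo, hi in zip(lower, upper)) / len(lower)


# Hypothetical toy data: four observed targets and their predicted bounds.
targets = [1.0, 2.0, 3.0, 4.0]
lower = [0.5, 1.5, 3.5, 3.0]
upper = [1.5, 2.5, 4.5, 5.0]

coverage = coverage_probability(targets, lower, upper)
width = mean_interval_width(lower, upper)
```

A good interval generator aims for coverage close to the nominal level while keeping the width small; the two goals pull in opposite directions, which is why both are usually reported together.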


2020 ◽  
pp. 1-21 ◽  
Author(s):  
Clément Dalloux ◽  
Vincent Claveau ◽  
Natalia Grabar ◽  
Lucas Emanuel Silva Oliveira ◽  
Claudia Maria Cabral Moro ◽  
...  

Automatic detection of negated content is often a prerequisite in information extraction systems in various domains. In the biomedical domain especially, this task is important because negation plays an important role. In this work, two main contributions are proposed. First, we work with languages that have been poorly addressed up to now: Brazilian Portuguese and French. We therefore developed new corpora for these two languages, manually annotated with negation cues and their scopes. Second, we propose automatic methods based on supervised machine learning for detecting negation cues and their scopes. The methods prove robust in both languages (Brazilian Portuguese and French) and in cross-domain (general and biomedical language) contexts. The approach is also validated on English data from the state of the art: it yields very good results and outperforms other existing approaches. In addition, the application is accessible and usable online. We expect that these contributions (new annotated corpora, an application accessible online, and cross-domain robustness) will improve the reproducibility of the results and the robustness of NLP applications.


2015 ◽  
Vol 24 (02) ◽  
pp. 1540010 ◽  
Author(s):  
Patrick Arnold ◽  
Erhard Rahm

We introduce a novel approach to extract semantic relations (e.g., is-a and part-of relations) from Wikipedia articles. These relations are used to build up a large and up-to-date thesaurus providing background knowledge for tasks such as determining semantic ontology mappings. Our automatic approach uses a comprehensive set of semantic patterns, finite state machines and NLP techniques to extract millions of relations between concepts. An evaluation for different domains shows the high quality and effectiveness of the proposed approach. We also illustrate the value of the newly found relations for improving existing ontology mappings.


2018 ◽  
Vol 25 (6) ◽  
pp. 726-733
Author(s):  
Maria S. Karyaeva ◽  
Pavel I. Braslavski ◽  
Valery A. Sokolov

The ability to identify semantic relations between words has made the word2vec model widely used in NLP tasks. The idea of word2vec is based on a simple rule: two words are more similar if they occur in similar contexts. Each word can be represented as a vector, so words whose vectors lie close together can be interpreted as similar. This makes it possible to establish semantic relations (synonymy, hypernymy, hyponymy and others) by automatic extraction. Extracting semantic relations by hand is considered a time-consuming and biased task that requires the help of experts. Unfortunately, the word2vec model provides an associative list of words that does not consist of related words only. In this paper, we show some additional criteria that may help to solve this problem. Observations and experiments with well-known characteristics, such as word frequency and position in the associative list, might be useful for improving the extraction of semantic relations for the Russian language using word embeddings. In the experiments, a word2vec model trained on Flibusta is used, with pairs from Wiktionary serving as examples of semantic relationships. Semantically related words are applicable to thesauri, ontologies and intelligent systems for natural language processing.
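
The idea described in the abstract above, ranking neighbors by vector similarity and then pruning the associative list with extra criteria, can be sketched as follows. The two-dimensional vectors, the frequency counts, and the thresholds `top_k` and `min_freq` are hypothetical values chosen for illustration; the paper's actual criteria and settings are not given in the abstract.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def associative_list(word, vectors):
    """All other words, sorted by decreasing cosine similarity to `word`."""
    pairs = [(other, cosine(vectors[word], vec))
             for other, vec in vectors.items() if other != word]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


def filter_candidates(ranked, freq, top_k=2, min_freq=5):
    """Keep words in the first `top_k` positions that are frequent enough."""
    return [w for w, _ in ranked[:top_k] if freq.get(w, 0) >= min_freq]


# Hypothetical toy embeddings and corpus frequencies.
vectors = {
    "cat": [1.0, 0.0],
    "kitten": [0.9, 0.1],
    "dog": [0.7, 0.7],
    "car": [0.0, 1.0],
}
freq = {"kitten": 40, "dog": 50, "car": 2}

ranked = associative_list("cat", vectors)
candidates = filter_candidates(ranked, freq)
```

Here the position cutoff removes the distant neighbor `car`, and the frequency threshold would additionally drop rare words whose vectors tend to be unreliable.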


2003 ◽  
Vol 4 (1) ◽  
pp. 94-97 ◽  
Author(s):  
Udo Hahn

This paper reports a large-scale knowledge conversion and curation experiment. Biomedical domain knowledge from a semantically weak and shallow terminological resource, the UMLS, is transformed into a rigorous description logics format. This way, the broad coverage of the UMLS is combined with inference mechanisms for consistency and cycle checking. They are the key to proper cleansing of the knowledge directly imported from the UMLS, as well as subsequent updating, maintenance and refinement of large knowledge repositories. The emerging biomedical knowledge base currently comprises more than 240 000 conceptual entities and hence constitutes one of the largest formal knowledge repositories ever built.


2021 ◽  
Vol 17 (1) ◽  
pp. 97-122
Author(s):  
Mohamed Hassan Mohamed Ali ◽  
Said Fathalla ◽  
Mohamed Kholief ◽  
Yasser Fouad Hassan

Ontologies, as semantic knowledge representations, play a crucial role in various information systems. The main pitfall of building ontologies manually is that it is effort- and time-consuming. Ontology learning is a key solution. Learning Non-Taxonomic Relationships of Ontologies (LNTRO) is the process of automatic or semi-automatic extraction of all possible relationships between concepts in a specific domain, except the hierarchical relations. Most research has focused on the extraction of concepts and taxonomic relations in the ontology learning process. This article presents the results of a systematic review of the state-of-the-art approaches for LNTRO. Sixteen approaches are described and qualitatively analyzed. The solutions they provide are discussed along with their respective positive and negative aspects. The goal is to give researchers in this area a comprehensive understanding of the drawbacks of the existing work, thereby encouraging further improvement of research in this area. Furthermore, this article proposes a set of recommendations for future research.


Author(s):  
Chia-Hu Chang ◽  
Ja-Ling Wu

With the aid of content-based multimedia analysis, virtual product placement opens up new opportunities for advertisers to monetize existing videos efficiently. A number of significant and challenging issues arise accordingly, such as how to insert a contextually relevant advertising message (what) less intrusively at the right place (where) and the right time (when) with an attractive representation (how) in the videos. In this chapter, domain knowledge in support of delivering and receiving the advertising message is introduced, such as advertising theory, psychology and computational aesthetics. We briefly review the state-of-the-art techniques for assisting virtual product placement in videos. In addition, we present a framework to serve virtual spotlighted advertising (ViSA) for virtual product placement and give an exploratory study of it. Moreover, observations about new trends and possible extensions in the design space of virtual product placement are stated and discussed. We believe this will inspire researchers to develop more interesting and applicable multimedia advertising systems for virtual product placement.


Author(s):  
Mengyun Yang ◽  
Gaoyan Wu ◽  
Qichang Zhao ◽  
Yaohang Li ◽  
Jianxin Wang

With the development of high-throughput technology and the accumulation of biomedical data, prior information on biological entities can be calculated from different aspects. Specifically, drug–drug similarities can be measured from target profiles, drug–drug interactions and side effects. Similarly, different methods and data sources to calculate disease ontology can result in multiple measures of pairwise disease similarities. Therefore, in computational drug repositioning, developing a dynamic method to optimize the fusion process of multiple similarities is a crucial and challenging task. In this study, we propose a multi-similarities bilinear matrix factorization (MSBMF) method to predict promising drug-associated indications for existing and novel drugs. Instead of fusing multiple similarities into a single similarity matrix, we concatenate the similarity matrices of drugs and of diseases, respectively. Applying matrix factorization methods, we decompose the drug–disease association matrix into a drug-feature matrix and a disease-feature matrix. At the same time, using these feature matrices as a basis, we extract effective latent features representing the drug and disease similarity matrices to infer missing drug–disease associations. Moreover, these two factor matrices are constrained to be non-negative to ensure that the completed drug–disease association matrix is biologically interpretable. In addition, we numerically solve the MSBMF model with an efficient alternating direction method of multipliers algorithm. The computational experiment results show that MSBMF obtains higher prediction accuracy than the state-of-the-art drug repositioning methods in cross-validation experiments. Case studies also demonstrate the effectiveness of our proposed method in practical applications. Availability: The data and code of MSBMF are freely available at https://github.com/BioinformaticsCSU/MSBMF.
Corresponding author: Jianxin Wang, School of Computer Science and Engineering, Central South University, Changsha, Hunan 410083, P. R. China. Supplementary Data: Supplementary data are available online at https://academic.oup.com/bib.
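
To make the decomposition step in the MSBMF abstract concrete, the sketch below factorizes a small drug–disease association matrix into two non-negative factor matrices. It is not the MSBMF algorithm: it ignores the similarity matrices and replaces the alternating direction method of multipliers with plain multiplicative updates for non-negative matrix factorization. The toy matrix `V`, the rank, and all function names are assumptions for illustration only.

```python
import random


def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*A)]


def squared_error(V, W, H):
    """Squared Frobenius norm of V - W H."""
    R = matmul(W, H)
    return sum((v - r) ** 2 for rv, rr in zip(V, R) for v, r in zip(rv, rr))


def nmf(V, rank, iters=200, eps=1e-9, seed=0):
    """Non-negative factorization V ~ W H via multiplicative updates."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    # Strictly positive random initialization keeps all updates non-negative.
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[h * a / (d + eps) for h, a, d in zip(hr, nr, dr)]
             for hr, nr, dr in zip(H, num, den)]
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[w * a / (d + eps) for w, a, d in zip(wr, nr, dr)]
             for wr, nr, dr in zip(W, num, den)]
    return W, H


# Hypothetical toy association matrix: rows are drugs, columns are diseases.
V = [[1.0, 1.0, 0.0],
     [1.0, 1.0, 0.0],
     [0.0, 0.0, 1.0],
     [0.0, 1.0, 1.0]]

W, H = nmf(V, rank=2)
completed = matmul(W, H)  # scores for every drug-disease pair
```

Entries of `completed` that correspond to zeros in `V` can be read as scores for unobserved drug–disease pairs; because both factors stay non-negative, the scores are non-negative as well.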

