Establishing Semantic Similarity of the Cluster Documents and Extracting Key Entities in the Problem of the Semantic Analysis of News Texts

Reconstruction of signaling pathways is crucial for understanding cellular mechanisms. A pathway is represented as a path of a signaling cascade involving a series of proteins to perform a particular function. Since a protein pair involved in signaling and response has a strong interaction, putative pathways can be detected from protein–protein interaction (PPI) networks. However, predicting directed pathways from the undirected genome-wide PPI networks has been challenging. We present a novel computational algorithm to efficiently predict signaling pathways from PPI networks given a starting protein and an ending protein. Our approach integrates topological analysis of PPI networks and semantic analysis of PPIs using Gene Ontology data. An advanced semantic similarity measure is used for weighting each interacting protein pair. Our distance-wise algorithm iteratively selects an adjacent protein from a PPI network to build a pathway based on a distance condition. On each iteration, the strength of a hypothetical path passing through a candidate edge is estimated by a local heuristic. We evaluate the performance by comparing the resultant paths to known signaling pathways in yeast. The results show that our approach has higher accuracy and efficiency than previous methods.

A Hybrid Model for Emotion Detection from Text

International Journal of Information Retrieval Research ◽

10.4018/ijirr.2017010103 ◽

2017 ◽

Vol 7 (1) ◽

pp. 32-48 ◽

Cited By ~ 3

Author(s):

Samar Fathy ◽

Nahla El-Haggar ◽

Mohamed H. Haggag

Keyword(s):

Semantic Similarity ◽

Hybrid Model ◽

Semantic Analysis ◽

Main Idea ◽

Object Extraction ◽

Emotion Detection ◽

Suggested Approach ◽

Matching Process ◽

Extraction Algorithm ◽

Input Sentence

Emotions can be judged by a combination of cues such as speech, facial expressions, and actions. Emotions are also articulated in text. This paper presents a new hybrid model for detecting emotion from text, which combines ontology matching with keyword semantic similarity. The text is labelled with one of the six basic Ekman emotion categories. The main idea is to extract an ontology from each input sentence and match it against an ontology base, which is built from simple ontologies and the emotion associated with each. The ontology is extracted from the input sentence using a triplet (subject, predicate, object) extraction algorithm, and the ontology matching process is then applied against the ontology base. The emotion assigned to the input sentence is the emotion of the ontology it matches with the highest matching score. If the extracted ontology does not match any ontology in the ontology base, the keyword semantic similarity approach is used. The suggested approach depends on the meaning of each sentence and on the syntactic and semantic analysis of the context.
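The match-then-fall-back flow described in this abstract can be illustrated with a minimal Python sketch. Everything in it is a hypothetical stand-in: the two-entry `ONTOLOGY_BASE`, the slot-overlap `match_score`, and the exact keyword lookup (used here in place of the paper's keyword semantic similarity step) are assumptions for illustration, not the authors' implementation.

```python
# Hypothetical ontology base: (subject, predicate, object) triple -> emotion.
ONTOLOGY_BASE = [
    (("i", "win", "prize"), "joy"),
    (("dog", "bite", "me"), "fear"),
]

# Hypothetical keyword table standing in for the keyword similarity step.
KEYWORD_EMOTION = {"funeral": "sadness", "gift": "joy"}


def match_score(a, b):
    """Fraction of triple slots (subject, predicate, object) that agree."""
    return sum(x == y for x, y in zip(a, b)) / 3


def detect_emotion(triple, keywords):
    """Return the emotion of the best-matching triple, else fall back to keywords."""
    best_score, best_emotion = max(
        (match_score(triple, known), emotion) for known, emotion in ONTOLOGY_BASE
    )
    if best_score > 0:
        return best_emotion
    # No triple matched at all: use the keyword fallback.
    for word in keywords:
        if word in KEYWORD_EMOTION:
            return KEYWORD_EMOTION[word]
    return None


matched = detect_emotion(("i", "win", "lottery"), [])
fallback = detect_emotion(("she", "attend", "funeral"), ["funeral"])
```

In this toy setup the first call is resolved by triple matching and the second by the keyword fallback; a real system would replace both tables with learned or curated resources.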

Recent change in the productivity and schematicity of the way-construction: A distributional semantic analysis

Corpus Linguistics and Linguistic Theory ◽

10.1515/cllt-2016-0014 ◽

2018 ◽

Vol 14 (1) ◽

pp. 65-97 ◽

Cited By ~ 8

Author(s):

Florent Perek

Keyword(s):

Semantic Similarity ◽

Semantic Analysis ◽

Recent Change ◽

Occurrence Frequency ◽

Semantic Model ◽

Semantic Change ◽

Semantic Domain ◽

The Way

This paper presents a corpus-based study of recent change in the English way-construction, drawing on data from the 1830s to the 2000s. Semantic change in the distribution of the construction is characterized by means of a distributional semantic model, which captures semantic similarity between verbs through their co-occurrence frequency with other words in the corpus. By plotting and comparing the semantic domain of the three senses of the construction at different points in time, it is found that they all have gained in semantic diversity. These findings are interpreted in terms of increases in schematicity, either of the verb slot or the motion component contributed by the construction.
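As a rough illustration of how a distributional semantic model derives verb similarity from co-occurrence frequency, the Python sketch below counts context words around each target verb and compares the resulting count vectors by cosine similarity. The three-sentence corpus, the window size, and the function names are assumptions made for the example; the study's actual corpus and weighting scheme are not reproduced here.

```python
import math
from collections import Counter


def cooccurrence_vectors(sentences, targets, window=4):
    """Count context words within `window` tokens of each target word."""
    vectors = {t: Counter() for t in targets}
    for tokens in sentences:
        for i, tok in enumerate(tokens):
            if tok in vectors:
                lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
                for j in range(lo, hi):
                    if j != i:
                        vectors[tok][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Toy corpus (invented) with three verbs in the way-construction.
corpus = [
    "she pushed her way through the crowd".split(),
    "he shoved his way through the crowd".split(),
    "they joked their way into the meeting".split(),
]
vecs = cooccurrence_vectors(corpus, {"pushed", "shoved", "joked"})
sim_push_shove = cosine(vecs["pushed"], vecs["shoved"])
sim_push_joke = cosine(vecs["pushed"], vecs["joked"])
```

On this toy data, "pushed" and "shoved" share more context words than "pushed" and "joked", so their similarity is higher; with a real corpus the counts would normally be reweighted (for example with PMI) before comparison.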

Idea Generation and Goal Derived Categories

10.31234/osf.io/6w5tg ◽

2019 ◽

Author(s):

Rick Hass

Keyword(s):

Response Time ◽

Semantic Similarity ◽

Semantic Analysis ◽

Idea Generation ◽

Derived Categories ◽

Context Dependency ◽

Slope Difference ◽

Response Time Distributions ◽

Future Work ◽

Potential Confound

Semantic search and retrieval of information play an important role in creative idea generation. This study was designed to examine how semantic and temporal clustering varies when participants are asked to generate ideas about uses for objects compared with generating members of goal-derived categories. Participants generated uses for three objects (brick, hammer, picture frame) and also generated members of the following goal-derived categories: things to take in case of a fire, things to sell at a garage sale, and ways to spend lottery winnings. Response-time analysis and semantic analysis showed that all six prompts generally led to exponential cumulative response-time distributions. However, the proportion of temporally clustered responses, defined using the slope-difference algorithm, was higher for goal-derived category responses than for object uses. Despite that, overall pairwise semantic similarity was higher for object uses than for goal-derived exemplars. The effect of prompt on pairwise semantic similarity is likely the result of the context dependency of exemplars from goal-derived categories. However, the current analysis contains a potential confound: special instructions to give “common and uncommon” responses were provided only for the object-uses prompts. The confound is likely minimal, but future work is necessary to establish the robustness of the results.

An Efficient Approach for Ranking of Semantic Web Documents by Computing Semantic Similarity and Using HCS Clustering

International Journal of Semiotics and Visual Rhetoric ◽

10.4018/ijsvr.2021010104 ◽

2021 ◽

Vol 5 (1) ◽

pp. 45-56

Author(s):

Poonam Chahal ◽

Manjeet Singh

Keyword(s):

Semantic Web ◽

Semantic Similarity ◽

Semantic Analysis ◽

Semantic Content ◽

Graph Representation ◽

Graphical Form ◽

Theoretic Approach ◽

Connected Subgraph ◽

Web Documents ◽

The Web

In today's era, with the huge amount of dynamic information available on the World Wide Web (WWW), it is difficult for users to search for and retrieve relevant information. One technique used in information retrieval is clustering, after which the web documents are ranked to give users information matching their query. In this paper, the semantic similarity score of Semantic Web documents is computed using a semantic-based similarity feature that combines latent semantic analysis (LSA) and latent relational analysis (LRA). LSA and LRA help to determine the relevant concepts and the relationships between them, which in turn correspond to the words and the relationships between those words. The extracted interrelated concepts are represented as a graph, which in turn represents the semantic content of the web document. From this graph representation of each document, the HCS clustering algorithm is used to extract the most connected subgraph, constructing a varying number of clusters according to the information-theoretic approach. The web documents present in the clusters in graphical form are ranked using the text-rank method in combination with the proposed method. The experimental analysis uses the benchmark dataset OpinRank. The performance of the approach on ranking web documents using semantic-based clustering has shown promising results.

Dative by Genitive Replacement in the Greek Language of the Papyri: A Diachronic Account of Case Semantics

Journal of Greek Linguistics ◽

10.1163/15699846-01501001 ◽

2015 ◽

Vol 15 (1) ◽

pp. 91-121 ◽

Cited By ~ 3

Author(s):

Joanne Vera Stolk

Keyword(s):

Semantic Similarity ◽

Semantic Analysis ◽

Semantic Role ◽

Greek Language ◽

Ancient Greek ◽

Verbal Form ◽

Semantic Extension ◽

Dative Case ◽

Genitive Case

Semantic analysis of the prenominal first person singular genitive pronoun (μου) in the Greek of the documentary papyri shows that the pronoun is typically found in the position between a verbal form and an alienable possessum which functions as the patient of the predicate. When the event expressed by the predicate is patient-affecting, the possessor is indirectly also affected. Hence the semantic role of this affected alienable possessor might be interpreted as a benefactive or malefactive in genitive possession constructions. By semantic extension the meaning of the genitive case in this position is extended into goal-oriented roles, such as addressee and recipient, which are commonly denoted by the dative case in Ancient Greek. The semantic similarity of the genitive and dative cases in these constructions might have provided the basis for the merger of the cases in the Greek language.

Semantic summarization of web news

Encyclopedia with Semantic Computing and Robotic Intelligence ◽

10.1142/s2425038416300068 ◽

2017 ◽

Vol 01 (01) ◽

pp. 1630006 ◽

Cited By ~ 1

Author(s):

Flora Amato ◽

Vincenzo Moscato ◽

Antonio Picariello ◽

Giancarlo Sperlí ◽

Antonio D’Acierno ◽

...

Keyword(s):

Semantic Similarity ◽

General Framework ◽

Clustering Algorithm ◽

Semantic Analysis ◽

Relevant Information ◽

Unsupervised Clustering ◽

Web Document ◽

Web News

In this paper, we present a general framework for retrieving relevant information from newspapers that exploits a novel summarization algorithm based on a deep semantic analysis of texts. In particular, we extract from each Web document a set of triples (subject, predicate, object) that are then used to build a summary through an unsupervised clustering algorithm exploiting the notion of semantic similarity. Finally, we leverage the centroids of clusters to determine the most significant summary sentences using some heuristics. Several experiments carried out using the standard DUC methodology and ROUGE software show that the proposed method outperforms several summarizer systems in terms of recall and readability.

Exploring media bias with semantic analysis tools: validation of the Contrast Analysis of Semantic Similarity (CASS)

Behavior Research Methods ◽

10.3758/s13428-010-0026-z ◽

2010 ◽

Vol 43 (1) ◽

pp. 193-200 ◽

Cited By ~ 19

Author(s):

Nicholas S. Holtzman ◽

John Paul Schott ◽

Michael N. Jones ◽

David A. Balota ◽

Tal Yarkoni

Keyword(s):

Semantic Similarity ◽

Semantic Analysis ◽

Media Bias ◽

Analysis Tools ◽

Contrast Analysis

pyMeSHSim: an integrative python package for biomedical named entity recognition, normalization and comparison

10.1101/459172 ◽

2018 ◽

Cited By ~ 2

Author(s):

Zhi-Hui Luo ◽

Meng-Wei Shi ◽

Zhuang Yang ◽

Hong-Yu Zhang ◽

Zhen-Xia Chen

Keyword(s):

Semantic Similarity ◽

Semantic Analysis ◽

Named Entity Recognition ◽

Entity Recognition ◽

Analysis Tool ◽

Medical Subject Headings ◽

Unified Medical Language System ◽

Named Entity ◽

Mesh Terms ◽

Causal Genes

Motivation: An increasing number of disease causal genes have been identified through different methods, while there are still no uniform biomedical named entity (bio-NE) annotations of the disease phenotypes. Furthermore, semantic similarity comparison between two bio-NE annotations, like disease descriptions, has become important for data integration or system genetics analysis. Methods: The package pyMeSHSim realizes bio-NE recognition using MetaMap, which produces Unified Medical Language System (UMLS) concepts in natural language processing. To map the UMLS concepts to MeSH, pyMeSHSim embeds an in-house dataset containing the Medical Subject Headings (MeSH) main headings (MHs), supplementary concept records (SCRs) and the relations between them. Based on the dataset, pyMeSHSim implements four information content (IC) based algorithms and one graph-based algorithm to measure the semantic similarity between two MeSH terms. Results: To evaluate its performance, we used pyMeSHSim to parse OMIM and GWAS phenotypes. The inclusion of SCRs and the curation strategy of non-MeSH-synonymous UMLS concepts used by pyMeSHSim improved its performance in the recognition of OMIM phenotypes. In the curation of GWAS phenotypes, pyMeSHSim and previous manual work recognized the same MeSH terms from 276/461 GWAS phenotypes, and the correlation between their semantic similarity calculated by pyMeSHSim and another semantic analysis tool, meshes, was as high as 0.53-0.97. Conclusion: With the embedded dataset including both MeSH MHs and SCRs, the integrative MeSH tool pyMeSHSim realizes disease recognition, normalization and comparison in biomedical text mining. Availability: The package’s source code and test datasets are available under the GPLv3 license at https://github.com/luozhhub/pyMeSHSim
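The abstract does not name its four IC-based algorithms, so the sketch below shows one well-known measure of that family, Lin similarity, purely as an illustration of how information content can be turned into a term-to-term score. It does not use the pyMeSHSim API; the tiny hierarchy, the annotation counts, and all function names are invented for the example.

```python
import math

# Hypothetical toy hierarchy (child -> parent); not real MeSH data.
PARENT = {
    "Disease": None,
    "Metabolic disease": "Disease",
    "Diabetes type 1": "Metabolic disease",
    "Diabetes type 2": "Metabolic disease",
    "Infection": "Disease",
}

# Hypothetical direct annotation counts per term.
COUNT = {
    "Disease": 0,
    "Metabolic disease": 10,
    "Diabetes type 1": 10,
    "Diabetes type 2": 30,
    "Infection": 50,
}


def ancestors(term):
    """Return the term followed by its ancestors, ordered from term to root."""
    chain = []
    while term is not None:
        chain.append(term)
        term = PARENT[term]
    return chain


def cumulative_count(term):
    """Annotations of the term plus those of all its descendants."""
    return sum(c for t, c in COUNT.items() if term in ancestors(t))


TOTAL = cumulative_count("Disease")


def ic(term):
    """Information content: negative log of the term's relative frequency."""
    return -math.log(cumulative_count(term) / TOTAL)


def lin(a, b):
    """Lin similarity: 2 * IC(lowest common subsumer) / (IC(a) + IC(b))."""
    anc_b = set(ancestors(b))
    lcs = next(t for t in ancestors(a) if t in anc_b)
    denom = ic(a) + ic(b)
    return 2 * ic(lcs) / denom if denom else 1.0


sim_siblings = lin("Diabetes type 1", "Diabetes type 2")
sim_distant = lin("Diabetes type 1", "Infection")
```

Terms that share a specific (high-IC) common ancestor score higher than terms whose only shared ancestor is the root, which carries no information.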

Towards optimize-ESA for text semantic similarity: A case study of biomedical text

International Journal of Electrical and Computer Engineering (IJECE) ◽

10.11591/ijece.v10i3.pp2934-2943 ◽

2020 ◽

Vol 10 (3) ◽

pp. 2934

Author(s):

Khaoula Mrhar ◽

Mounia Abik

Keyword(s):

Semantic Similarity ◽

Language Processing ◽

Semantic Analysis ◽

Semantic Relatedness ◽

High Dimensional ◽

Specific Domain ◽

Large Matrix ◽

Index Matrix ◽

Explicit Semantic Analysis

Explicit Semantic Analysis (ESA) is an approach to measuring the semantic relatedness between terms or documents based on their similarity to documents of a reference corpus, usually Wikipedia. ESA has received tremendous attention in the fields of natural language processing (NLP) and information retrieval. However, ESA relies on a huge Wikipedia index matrix in its interpretation step, multiplying this large matrix by a term vector to produce a high-dimensional vector. Consequently, the ESA process is expensive in both the interpretation and similarity steps, and its efficiency suffers because a lot of time is lost in unnecessary operations. This paper proposes an enhancement to ESA, called optimize-ESA, that reduces the dimension at the interpretation stage by computing the semantic similarity in a specific domain. The experimental results show clearly that our method correlates much better with human judgement than the full-version ESA approach.
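To make the interpretation and similarity steps concrete, here is a minimal Python sketch of ESA-style concept vectors with an optional domain filter. It is not the authors' optimize-ESA: the three-term `INDEX`, the weights, the `CONCEPT_DOMAIN` labels, and the filtering rule are all assumptions chosen only to show how restricting the concept space shrinks the interpretation vector.

```python
import math

# Hypothetical inverted index: term -> {concept: weight}. Weights are invented.
INDEX = {
    "insulin": {"Diabetes": 0.9, "Hormone": 0.7},
    "glucose": {"Diabetes": 0.8, "Hormone": 0.2, "Sports nutrition": 0.3},
    "striker": {"Football": 0.9},
}

# Hypothetical domain label for each concept.
CONCEPT_DOMAIN = {
    "Diabetes": "biomedical",
    "Hormone": "biomedical",
    "Sports nutrition": "sports",
    "Football": "sports",
}


def interpret(text, domain=None):
    """Sum the concept weights of every known term; optionally keep one domain."""
    vec = {}
    for term in text.lower().split():
        for concept, weight in INDEX.get(term, {}).items():
            if domain is None or CONCEPT_DOMAIN[concept] == domain:
                vec[concept] = vec.get(concept, 0.0) + weight
    return vec


def cosine(u, v):
    """Cosine similarity between two sparse concept vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


full = interpret("insulin glucose")
restricted = interpret("insulin glucose", domain="biomedical")
relatedness = cosine(
    interpret("insulin", domain="biomedical"),
    interpret("glucose", domain="biomedical"),
)
```

With a full Wikipedia index the unrestricted vector has one dimension per article, so dropping out-of-domain concepts before the multiplication is where a domain-specific variant would save work.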
