Semantic analysis of Twitter content

2021 ◽  
Author(s):  
Yue Feng

Semantic analysis is the process of shifting the understanding of text from the level of phrases, clauses and sentences to the level of semantic meaning. Two of the most important semantic analysis tasks are (1) semantic relatedness measurement and (2) entity linking. The semantic relatedness measurement task aims to quantitatively identify the relationship between two words or concepts based on the similarity or closeness of their semantic meaning, whereas the entity linking task focuses on linking plain text to structured knowledge resources, e.g. Wikipedia, to provide semantic annotation of texts. A limitation of current semantic analysis approaches is that they are built upon traditional documents that are well structured in formal English, e.g. news articles; however, with the emergence of social networks, enormous volumes of information can be extracted from social network posts, which are short, grammatically incorrect and can contain special characters or newly invented words, e.g. LOL, BRB. Therefore, traditional semantic analysis approaches may not perform well when analysing social network posts. In this thesis, we build semantic analysis techniques specifically for Twitter content. We build a semantic relatedness model to calculate the semantic relatedness between any two words obtained from tweets, and, using the proposed model, we semantically annotate tweets by linking them to Wikipedia entries. Comparisons of our work with state-of-the-art semantic relatedness and entity linking methods show promising results.
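The abstract does not describe the thesis's relatedness model itself; as a rough illustration of the underlying idea only, the sketch below (Python, with a hypothetical two-tweet corpus) scores two words by the cosine similarity of their co-occurrence vectors built from tweet text.

```python
# Minimal sketch, not the thesis's actual model: relatedness of two words
# estimated as the cosine similarity of their co-occurrence vectors,
# built from a tiny, invented collection of tweets.
from collections import Counter
import math

def context_vector(word, tweets, window=2):
    """Count the words co-occurring with `word` within +/-window positions."""
    vec = Counter()
    for tweet in tweets:
        tokens = tweet.lower().split()
        for i, tok in enumerate(tokens):
            if tok == word:
                for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
                    if j != i:
                        vec[tokens[j]] += 1
    return vec

def cosine(u, v):
    shared = set(u) & set(v)
    num = sum(u[t] * v[t] for t in shared)
    den = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(sum(x * x for x in v.values()))
    return num / den if den else 0.0

tweets = ["lol that goal was amazing", "amazing save by the keeper lol"]
print(cosine(context_vector("goal", tweets), context_vector("keeper", tweets)))
```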


Author(s):  
Yue Feng ◽  
Ebrahim Bagheri ◽  
Faezeh Ensan ◽  
Jelena Jovanovic

Semantic relatedness (SR) is a form of measurement that quantitatively identifies the relationship between two words or concepts based on the similarity or closeness of their meaning. In recent years, there have been noteworthy efforts to compute SR between pairs of words or concepts by exploiting various knowledge resources, such as linguistically structured resources (e.g. WordNet) and collaboratively developed knowledge bases (e.g. Wikipedia), among others. The existing approaches rely on different methods for utilizing these knowledge resources, for instance, methods that depend on the path between two words or on a vector representation of the word descriptions. The purpose of this paper is to review and present the state of the art in SR research through a hierarchical framework. The dimensions of the proposed framework cover three main aspects of SR approaches: the resources they rely on, the computational methods applied to those resources for developing a relatedness metric, and the evaluation models used for measuring their effectiveness. We have selected 14 representative SR approaches to be analyzed using our framework. We compare and critically review each of them through the dimensions of our framework, thus identifying the strengths and weaknesses of each approach. In addition, we provide guidelines for researchers and practitioners on how to select the most relevant SR method for their purpose. Finally, based on the comparative analysis of the reviewed relatedness measures, we identify existing challenges and potentially valuable future research directions in this domain.
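As a point of reference for the two method families mentioned above, the snippet below contrasts a path-based score (WordNet via NLTK) with a distributional one (spaCy word vectors). Both tools are assumed to be installed, and neither snippet stands for a specific approach surveyed in the paper.

```python
# Illustrative only: two of the method families the survey covers.
from nltk.corpus import wordnet as wn   # requires nltk.download('wordnet')

# Path-based relatedness: score derived from the path between synsets in WordNet.
car, bus = wn.synset('car.n.01'), wn.synset('bus.n.01')
print(car.path_similarity(bus))         # value in (0, 1], higher = more related

# Vector-based relatedness: cosine similarity of word embeddings.
import spacy
nlp = spacy.load('en_core_web_md')      # assumes the medium English model is installed
print(nlp('car')[0].similarity(nlp('bus')[0]))
```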


Author(s):  
Patrick Chan ◽  
Yoshinori Hijikata ◽  
Toshiya Kuramochi ◽  
Shogo Nishida

Computing the semantic relatedness between two words or phrases is an important problem in fields such as information retrieval and natural language processing. Explicit Semantic Analysis (ESA), a state-of-the-art approach to this problem, uses word frequency to estimate relevance, so the relevance of low-frequency words cannot always be estimated well. To improve the relevance estimates of low-frequency words and concepts, the authors apply regression to word frequency, word location within an article, and text style to calculate relevance. The relevance value is subsequently used to compute semantic relatedness. Empirical evaluation shows that, for low-frequency words, the authors' method achieves better estimates of semantic relatedness than ESA. Furthermore, when all words of the dataset are considered, the combination of the authors' proposed method and the conventional approach outperforms the conventional approach alone.
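For orientation, here is a compact sketch of plain ESA over a toy three-article "Wikipedia": each word is represented by its TF-IDF weights across articles (its concept vector), and relatedness is the cosine of two such vectors. The authors' regression over frequency, position and text style is not reproduced here; it would replace the raw TF-IDF relevance weights.

```python
# Plain ESA sketch over an invented toy corpus; not the authors' regression model.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

articles = [
    "the piano is a keyboard instrument played in concerts",
    "a violin is a string instrument used in orchestras",
    "football is a sport played with a ball on a field",
]
vectorizer = TfidfVectorizer()
tfidf = vectorizer.fit_transform(articles)          # articles x vocabulary
vocab = vectorizer.vocabulary_

def concept_vector(word):
    """Column of the TF-IDF matrix: the word's relevance to each article."""
    return tfidf[:, vocab[word]].T.toarray()

print(cosine_similarity(concept_vector("piano"), concept_vector("violin")))
print(cosine_similarity(concept_vector("piano"), concept_vector("football")))
```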


Author(s):  
Weixin Zeng ◽  
Xiang Zhao ◽  
Jiuyang Tang

List-only entity linking is the task of mapping ambiguous mentions in texts to target entities in a group of entity lists. Unlike the traditional entity linking task, which leverages rich semantic relatedness in knowledge bases to improve linking accuracy, list-only entity linking can only take advantage of co-occurrence information in entity lists. State-of-the-art work utilizes co-occurrence information to enrich entity descriptions, which are then used to calculate the local compatibility between mentions and entities and so determine the results. Nonetheless, entity coherence is also deemed to play an important part in entity linking, yet it is currently neglected. In this work, in addition to local compatibility, we take into account the global coherence among entities. Specifically, we propose to harness co-occurrences in entity lists to mine both explicit and implicit entity relations. The relations are then integrated into an entity graph, on which Personalized PageRank is used to compute entity coherence. The final results are derived by combining local mention-entity similarity and global entity coherence. The experimental studies validate the superiority of our method. Our proposal not only improves the performance of list-only entity linking, but also builds a bridge between list-only entity linking and conventional entity linking solutions.
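The following sketch (using networkx, with toy entities and hypothetical scores) illustrates how the two signals could be combined: candidate-level local compatibility scores are blended with Personalized PageRank values computed over an entity co-occurrence graph. The graph, scores and weighting are invented for the example, not taken from the paper.

```python
# Minimal sketch of blending local compatibility with global coherence.
import networkx as nx

# Toy entity graph: edges derived from co-occurrences in entity lists.
G = nx.Graph()
G.add_edges_from([("Michael_Jordan_(athlete)", "Chicago_Bulls"),
                  ("Michael_Jordan_(scientist)", "UC_Berkeley"),
                  ("Chicago_Bulls", "NBA")])

# Local compatibility of the mention "Jordan" with each candidate (hypothetical scores).
local = {"Michael_Jordan_(athlete)": 0.60, "Michael_Jordan_(scientist)": 0.55}

# Global coherence: Personalized PageRank restarted at entities already in context.
context = {"Chicago_Bulls": 1.0}
ppr = nx.pagerank(G, personalization=context)

alpha = 0.5   # assumed trade-off between local and global evidence
scores = {e: alpha * local[e] + (1 - alpha) * ppr.get(e, 0.0) for e in local}
print(max(scores, key=scores.get))   # -> Michael_Jordan_(athlete)
```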


2012 ◽  
Vol 06 (01) ◽  
pp. 67-91 ◽  
Author(s):  
PIA-RAMONA WOJTINNEK ◽  
STEPHEN PULMAN ◽  
JOHANNA VÖLKER

The construction of suitable and scalable representations of semantic knowledge is a core challenge in Semantic Computing. Manually created resources such as WordNet have been shown to be useful for many AI and NLP tasks, but they are inherently restricted in their coverage and scalability. In addition, they have been challenged by simple distributional models trained on very large corpora, calling into question the advantage of structured knowledge representations. We present a framework for building large-scale semantic networks automatically from plain text and Wikipedia articles using only linguistic analysis tools. Our constructed resources cover up to 2 million concepts and were built in less than 6 days. Using the task of measuring semantic relatedness, we show that we achieve results comparable to the best WordNet-based methods as well as the best distributional methods, while using a corpus several orders of magnitude smaller. In addition, we show that we can outperform both types of methods by combining the results of our two network variants. Initial experiments on noun compound paraphrasing show similar results, underlining the quality as well as the flexibility of our constructed resources.
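As a greatly simplified illustration of a text-derived semantic network (not the paper's linguistically informed construction), the sketch below links nouns that co-occur within a sentence and reads relatedness off the inverse shortest-path length; the sentences and nouns are invented.

```python
# Toy semantic network from bare co-occurrence; illustrative only.
import itertools
import networkx as nx

sentences = [
    ["piano", "keyboard", "instrument"],
    ["violin", "string", "instrument"],
    ["football", "ball", "sport"],
]

G = nx.Graph()
for nouns in sentences:
    G.add_edges_from(itertools.combinations(nouns, 2))

def relatedness(a, b):
    """Inverse shortest-path length between two concepts, 0 if unconnected."""
    try:
        return 1.0 / nx.shortest_path_length(G, a, b)
    except nx.NetworkXNoPath:
        return 0.0

print(relatedness("piano", "violin"))    # connected via "instrument"
print(relatedness("piano", "football"))  # no path -> 0.0
```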


2019 ◽  
Author(s):  
Paolo Soraci

The purpose of this study is to create a new tool, based on the nine IGD criteria, capable of assessing the severity of Internet addiction (IA). These criteria were suggested by the APA in the latest edition of the DSM-5. A convenience sample of more than 300 participants was recruited through different forums and social networks. The construct validity of the IDS9SF test was established through factor analysis and nomological validity. The concurrent validity, criterion validity and reliability of the test were thoroughly investigated using the most common and well-established data analysis techniques, confirming that the test has sufficient psychometric properties to be used in Italy as well. Furthermore, it should be remembered that this preliminary research is valid only at the level of data and statistics, with all the attendant limitations, and cannot be used for an actual clinical evaluation.


This article examines the method of latent semantic analysis (LSA), its advantages and disadvantages, and the possibility of adapting it for use on arrays of unstructured data, which make up most of the information that Internet users deal with. To extract context-dependent word meanings through the statistical processing of large sets of textual data, the LSA method is used, based on operations on numeric word-text matrices, whose rows correspond to words and whose columns correspond to texts. The grouping of words into topics and the representation of text units in the topic space are accomplished by applying one of two matrix decompositions to the data: singular value decomposition or non-negative matrix factorization. LSA studies have shown that the word and text similarities it produces closely match human judgments. Based on the methods described above, the author has developed and proposed a new way of finding semantic links between unstructured data, namely posts on social networks. The method is based on latent semantic and frequency analyses and involves processing the retrieved search results, splitting each remaining text (post) into separate words, taking a window of n words to the left and right of each word, counting the number of occurrences of each term, and working with a pre-created semantic resource (dictionary, ontology, RDF schema, ...). The developed method and algorithm have been tested on six well-known social networks, interaction with which takes place through the APIs of the respective networks. The average score of the author's results exceeded that of the social networks' own search. The results obtained in the course of this dissertation can be used in the development of recommendation, search and other systems related to the retrieval, categorization and filtering of information.
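A minimal sketch of the LSA step described above, assuming scikit-learn and a handful of toy posts: a TF-IDF matrix (here texts × words, the transpose of the word-text layout mentioned in the article) is reduced with a truncated SVD, and posts are then compared in the resulting topic space.

```python
# LSA sketch: factorize a TF-IDF matrix and compare posts in topic space.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.metrics.pairwise import cosine_similarity

posts = [
    "new phone camera is amazing",
    "the camera on this phone takes great photos",
    "traffic on the highway was terrible today",
]

tfidf = TfidfVectorizer().fit_transform(posts)      # texts x words
lsa = TruncatedSVD(n_components=2, random_state=0)  # project into 2 latent topics
topic_space = lsa.fit_transform(tfidf)              # texts x topics

# Posts about the same topic end up close together in the reduced space.
print(cosine_similarity(topic_space[:1], topic_space[1:]))
```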


1998 ◽  
Vol 08 (01) ◽  
pp. 21-66 ◽  
Author(s):  
W. M. P. VAN DER AALST

Workflow management promises a new solution to an age-old problem: controlling, monitoring, optimizing and supporting business processes. What is new about workflow management is the explicit representation of the business process logic, which allows for computerized support. This paper discusses the use of Petri nets in the context of workflow management. Petri nets are an established tool for modeling and analyzing processes. On the one hand, Petri nets can be used as a design language for the specification of complex workflows. On the other hand, Petri net theory provides powerful analysis techniques which can be used to verify the correctness of workflow procedures. This paper introduces workflow management as an application domain for Petri nets, presents state-of-the-art results with respect to the verification of workflows, and highlights some Petri-net-based workflow tools.
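To make the firing rule concrete, here is a minimal, illustrative Petri net in Python: places hold tokens representing the state of a case, and a transition (a task) may fire only when every one of its input places holds a token. The tiny register-then-decide workflow is invented for the example and is not taken from the paper.

```python
# Minimal Petri net sketch: marking, enabledness check, and firing rule.
from collections import Counter

class PetriNet:
    def __init__(self, transitions):
        # transitions: name -> (input places, output places)
        self.transitions = transitions
        self.marking = Counter()          # place -> number of tokens

    def enabled(self, t):
        ins, _ = self.transitions[t]
        return all(self.marking[p] > 0 for p in ins)

    def fire(self, t):
        if not self.enabled(t):
            raise ValueError(f"transition {t!r} is not enabled")
        ins, outs = self.transitions[t]
        for p in ins:
            self.marking[p] -= 1          # consume one token from each input place
        for p in outs:
            self.marking[p] += 1          # produce one token in each output place

net = PetriNet({
    "register": (["start"], ["registered"]),
    "decide":   (["registered"], ["end"]),
})
net.marking["start"] = 1                  # one case enters the workflow
net.fire("register")
net.fire("decide")
print(dict(net.marking))                  # {'start': 0, 'registered': 0, 'end': 1}
```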


Author(s):  
Marcos Sanchez Sanchez ◽  
John Iliff

This paper describes the key elements, from early planning to completion, of a new bridge over the River Barrow which forms part of the New Ross bypass in the south of Ireland. The structure has a total length of 887 m, with a span arrangement of 36-45-95-230-230-95-70-50-36 m. The two central twin spans are the longest of their kind in the world (extradosed, with a full concrete deck). The bridge carries a dual carriageway, with a cable arrangement consisting of a single plane of cables located on the central axis of the deck. The design and construction focused on providing a structure with long-term durability, resilience, and a robust approach to design scenarios using the Eurocodes and state-of-the-art analysis techniques, including extreme events such as fire and ship impact.

