Learning Sentence-to-Hashtags Semantic Mapping for Hashtag Recommendation on Microblogs

2022 ◽  
Vol 16 (2) ◽  
pp. 1-26
Author(s):  
Riccardo Cantini ◽  
Fabrizio Marozzo ◽  
Giovanni Bruno ◽  
Paolo Trunfio

The growing use of microblogging platforms is generating a huge amount of posts that need effective methods to be classified and searched. On Twitter and other social media platforms, users exploit hashtags to facilitate the search, categorization, and spread of posts. Choosing the appropriate hashtags for a post is not always easy, so posts are often published without hashtags or with poorly chosen ones. To deal with this issue, we propose a new model, called HASHET (HAshtag recommendation using Sentence-to-Hashtag Embedding Translation), aimed at suggesting a relevant set of hashtags for a given post. HASHET is based on two independent latent spaces for embedding the text of a post and the hashtags it contains. A mapping process based on a multi-layer perceptron is then used to learn a translation from the semantic features of the text to the latent representation of its hashtags. We evaluated the effectiveness of two language representation models for sentence embedding and tested different search strategies for semantic expansion, finding that the combined use of BERT (Bidirectional Encoder Representations from Transformers) and a global expansion strategy leads to the best recommendation results. HASHET has been evaluated on two real-world case studies related to the 2016 United States presidential election and the COVID-19 pandemic. The results reveal the effectiveness of HASHET in predicting one or more correct hashtags, with an average F-score of up to 0.82 and a recommendation hit rate of up to 0.92. Our approach has been compared with the most relevant techniques in the literature (generative models, unsupervised models, and attention-based supervised models), achieving up to a 15% improvement in F-score for the hashtag recommendation task and 9% for the topic discovery task.
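As a rough illustration of the translation step described above, the sketch below (not the authors' code) embeds post text with a pre-trained BERT-based sentence encoder, trains a small MLP to map that embedding into a separately learned hashtag embedding space, and recommends the hashtags nearest to the mapped point. The model name, vector sizes, and the two toy posts are illustrative assumptions.

```python
# Minimal sketch of the sentence-to-hashtag translation idea (illustrative only).
import numpy as np
from gensim.models import Word2Vec
from sentence_transformers import SentenceTransformer
from sklearn.neural_network import MLPRegressor

posts = [("make america great again", ["#maga", "#trump2016"]),
         ("stay home and wash your hands", ["#covid19", "#stayhome"])]

# 1) Hashtag latent space: word2vec trained on the hashtag sets of the corpus.
tag_model = Word2Vec([tags for _, tags in posts], vector_size=32, min_count=1, epochs=50)

# 2) Sentence latent space: a pre-trained BERT-based sentence encoder (assumed model name).
encoder = SentenceTransformer("all-MiniLM-L6-v2")
X = encoder.encode([text for text, _ in posts])

# 3) Mapping: MLP regression from the sentence embedding to the centroid of the
#    post's hashtag embeddings.
y = np.array([np.mean([tag_model.wv[t] for t in tags], axis=0) for _, tags in posts])
mapper = MLPRegressor(hidden_layer_sizes=(256,), max_iter=2000).fit(X, y)

# 4) Recommendation: map a new post into hashtag space and return the k nearest
#    hashtags (semantic expansion around the predicted point).
def recommend(text, k=3):
    target = mapper.predict(encoder.encode([text]))[0]
    return [t for t, _ in tag_model.wv.similar_by_vector(target, topn=k)]

print(recommend("rally for the presidential election tonight"))
```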

Author(s):  
Iraida Bryxina

We consider the problem of developing the lexical competence of future linguists who study French after English in the higher education system. We also analyze English borrowings in French, consider their adaptation in the receiving language, and reveal the semantic features of foreign words. The lexical competence of a future linguist who studies French in a multilingual setting is defined as the learner's ability to determine the contextual meaning of foreign words, to compare the scope of their meanings in English, French, and Russian, and to explicate the meanings of words while remaining sensitive to differences between the languages. Future linguists have the opportunity to learn the phenomena of the French language, accept them, or correct their mistakes through dialogue and textual analysis of linguistic information. It is revealed that at the semantic stage of the integration of anglicisms into French, their meanings undergo expansion or restriction. It is shown that the knowledge component of students' lexical competence includes ideas about interlanguage lexical correspondences, the word-formation structure of words, and their semantic features. Different strategies are used to study lexical units: searching for information in paper and electronic dictionaries, language supports, and interlanguage contrasting exercises, all of which contribute to improving students' lexical competence in multilingual settings. The use of information and communication technologies makes it possible to develop students' receptive and productive lexical skills, expands their vocabulary, and fosters the awareness of future specialists.


ELT-Lectura ◽  
2015 ◽  
Vol 2 (2) ◽  
Author(s):  
Dahler Dahler

This classroom action research was conducted with 38 participants in the seventh-grade class D of SMPN 29 in the 2011/2012 academic year. It aimed to improve their writing skill by applying the semantic mapping strategy. The researcher collected writing tests, observations, field notes, and interviews as the instruments. The data reveal that the students' writing skill improved after the treatment. The data also indicate that several factors influenced this improvement. The first was the brainstorming process, which made it easy for the students to generate and think about ideas. The second was the categorization process, which made it easy for them to determine the kinds of ideas. The third was the mapping process, which made it easy for them to write a good descriptive paragraph. The last was the teacher's roles, which helped them have an effective class.


Author(s):  
Hiram Calvo ◽  
Kentaro Inui ◽  
Yuji Matsumoto

Learning verb argument preferences has been approached as a verb-argument problem, or at most as a ternary relationship between subject, verb, and object. However, the simultaneous correlation of all arguments in a sentence has not been explored thoroughly for measuring sentence plausibility, because of the increased number of potential combinations and data sparseness. In this work the authors present a review of some common methods for learning argument preferences, beginning with the simplest case of binary co-relations, then comparing with ternary co-relations, and finally considering all arguments. For the latter, the authors use an ensemble of discriminative and generative machine learning models, using co-occurrence features and semantic features in different arrangements. They seek to answer questions about the optimal number of topics required for PLSI and LDA models, as well as the number of co-occurrences that should be required for improving performance. They explore the implications of different ways of projecting co-relations, i.e., into a word space or directly into a co-occurrence feature space. The authors conducted tests using a pseudo-disambiguation task, learning from large corpora extracted from the Internet.
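The pseudo-disambiguation evaluation mentioned at the end can be summarized in a few lines. The sketch below is a minimal illustration, not the authors' setup: each attested (verb, argument) pair is contrasted with a randomly drawn confounder argument, and a plausibility model is credited when it prefers the attested pair. The smoothed co-occurrence scorer and the toy pairs are stand-ins for the compared models (co-occurrence, PLSI, LDA, or the ensemble).

```python
# Minimal sketch of a pseudo-disambiguation evaluation (illustrative only).
import random
from collections import Counter

pairs = [("eat", "apple"), ("eat", "bread"), ("drive", "car"),
         ("drive", "truck"), ("read", "book")]
counts = Counter(pairs)
vocab = sorted({arg for _, arg in pairs})

def plausibility(verb, arg):
    # Simplest baseline: add-epsilon smoothed co-occurrence count of the pair.
    return counts[(verb, arg)] + 1e-3

def pseudo_disambiguation_accuracy(test_pairs, seed=0):
    rng = random.Random(seed)
    hits = 0
    for verb, arg in test_pairs:
        # Confounder: a random argument different from the attested one.
        confounder = rng.choice([a for a in vocab if a != arg])
        hits += plausibility(verb, arg) > plausibility(verb, confounder)
    return hits / len(test_pairs)

print(pseudo_disambiguation_accuracy(pairs))
```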


Author(s):  
Tong Xu ◽  
Peilun Zhou ◽  
Linkang Hu ◽  
Xiangnan He ◽  
Yao Hu ◽  
...  

As a crucial task for video analysis, social relation recognition for characters not only provides a semantically rich description of video content but also supports intelligent applications, e.g., video retrieval and visual question answering. Unfortunately, due to the semantic gap between visual and semantic features, traditional solutions may fail to reveal the accurate relations among characters. At the same time, the development of social media platforms has promoted the emergence of crowdsourced comments, which may enhance the recognition task with semantic and descriptive cues. To that end, in this article, we propose a novel multimodal solution to the character relation recognition task. Specifically, we capture the target character pairs via a search module and then design a multistream architecture for jointly embedding the visual and textual information, in which feature fusion and an attention mechanism are adopted to better integrate the multimodal inputs. Finally, supervised learning is applied to classify character relations. Experiments on real-world data sets validate that our solution outperforms several competitive baselines.
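A minimal PyTorch sketch of the multistream idea, under the assumption of pre-extracted visual and textual features: each modality is embedded by its own stream, an attention layer weights the two streams, and a classifier predicts the relation label from the fused representation. The feature dimensions, number of relation classes, and random inputs are illustrative assumptions, not the authors' architecture.

```python
# Illustrative two-stream fusion model with modality attention (not the authors' code).
import torch
import torch.nn as nn

class MultiStreamRelationNet(nn.Module):
    def __init__(self, vis_dim=2048, txt_dim=768, hid=256, n_relations=8):
        super().__init__()
        self.vis_stream = nn.Sequential(nn.Linear(vis_dim, hid), nn.ReLU())
        self.txt_stream = nn.Sequential(nn.Linear(txt_dim, hid), nn.ReLU())
        self.attn = nn.Linear(hid, 1)                  # per-modality attention score
        self.classifier = nn.Linear(hid, n_relations)  # relation prediction head

    def forward(self, vis_feat, txt_feat):
        streams = torch.stack([self.vis_stream(vis_feat),
                               self.txt_stream(txt_feat)], dim=1)  # (B, 2, hid)
        weights = torch.softmax(self.attn(streams), dim=1)         # (B, 2, 1)
        fused = (weights * streams).sum(dim=1)                     # attention-weighted fusion
        return self.classifier(fused)

model = MultiStreamRelationNet()
logits = model(torch.randn(4, 2048), torch.randn(4, 768))
print(logits.shape)  # torch.Size([4, 8]): one score per relation class
```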


2022 ◽  
Author(s):  
Laurent Caplette ◽  
Nicholas Turk-Browne

Revealing the contents of mental representations is a longstanding goal of cognitive science. However, there is currently no general framework for providing direct access to representations of high-level visual concepts. We asked participants to indicate what they perceived in images synthesized from random visual features in a deep neural network. We then inferred a mapping between the semantic features of their responses and the visual features of the images. This allowed us to reconstruct the mental representation of virtually any common visual concept, both those reported and others extrapolated from the same semantic space. We successfully validated 270 of these reconstructions as containing the target concept in a separate group of participants. The visual-semantic mapping uncovered with our method further generalized to new stimuli, participants, and tasks. Finally, it allowed us to reveal how the representations of individual observers differ from each other and from those of neural networks.
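The core of the method, as described, is a mapping fitted between a semantic feature space (of the reported labels) and a visual feature space (of the synthesized images). The sketch below is a minimal stand-in using ridge regression on random data, with arbitrarily chosen dimensions; it only illustrates how such a mapping can extrapolate the visual representation of a concept from its semantic embedding alone, and is not the authors' pipeline.

```python
# Illustrative visual-semantic mapping via ridge regression (toy data only).
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
n_trials, sem_dim, vis_dim = 500, 300, 4096

semantic = rng.normal(size=(n_trials, sem_dim))     # embeddings of reported labels
visual = semantic @ rng.normal(size=(sem_dim, vis_dim)) \
         + 0.1 * rng.normal(size=(n_trials, vis_dim))  # synthetic image features

mapping = Ridge(alpha=1.0).fit(semantic, visual)     # semantic -> visual mapping

# Extrapolate the visual representation of an unreported concept from its
# semantic embedding alone.
new_concept = rng.normal(size=(1, sem_dim))
reconstruction = mapping.predict(new_concept)
print(reconstruction.shape)  # (1, 4096)
```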


2019 ◽  
Vol 43 (1) ◽  
pp. 89-112 ◽  
Author(s):  
Suliman Aladhadh ◽  
Xiuzhen Zhang ◽  
Mark Sanderson

Purpose: Social media platforms provide a source of information about events. However, this information may not be credible, and the distance between an information source and the event may affect that credibility. The purpose of this paper is therefore to understand the relationship between sources, physical distance from an event, and the resulting impact on credibility in social media. Design/methodology/approach: The authors focus on the impact of location on the distribution of content sources (informativeness and source) for different events, and identify the semantic features of the sources and of content at different credibility levels. Findings: The study found that source location affects the number of sources across different events. Location also affects the proportion of semantic features in social media content. Research limitations/implications: This study illustrated the influence of location on credibility in social media and provided an overview of the relationship between content types, including semantic features, the source, and event locations. The authors will use its findings to build a credibility model in future research. Practical implications: The results provide a new understanding of the reasons behind the overestimation problem that arises when current credibility models are applied to different domains: such models need to be trained on data from the same place as the event, which makes them more stable. Originality/value: This study investigates several events, including crises, politics, and entertainment, with a consistent methodology. It gives new insights into the distribution of sources, credibility, and other information types within and outside the country of an event. It also uses location to find alternative approaches to assessing credibility in social media.


Sensors ◽  
2021 ◽  
Vol 21 (9) ◽  
pp. 3241
Author(s):  
Jingyi Liu ◽  
Caijuan Shi ◽  
Dongjing Tu ◽  
Ze Shi ◽  
Yazhi Liu

Supervised models based on deep learning have made great achievements in image classification after training with a large number of labeled samples. In practice, however, many categories have only a few labeled training samples, and some have none at all. Zero-shot learning has been proposed to greatly reduce the dependence of image classification models on labeled training samples. Nevertheless, learning the similarity of visual features and semantic features with a predefined fixed metric (e.g., Euclidean distance) has limitations, as does the semantic gap that arises in the mapping process. To address these problems, a new zero-shot image classification method based on an end-to-end learnable deep metric is proposed in this paper. First, common-space embedding is adopted to map the visual features and semantic features into a common space. Second, an end-to-end learnable deep metric, namely a relation network, is used to learn the similarity of visual and semantic features. Finally, unseen images are classified according to the similarity score. Extensive experiments carried out on four datasets indicate the effectiveness of the proposed method.
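A minimal PyTorch sketch of the described pipeline, assuming pre-extracted visual features and class-level semantic (attribute) vectors: both are projected into a common space, a small relation network scores each visual-semantic pair, and an image is assigned to the class with the highest relation score. The feature sizes, number of unseen classes, and random inputs are illustrative assumptions, not the authors' configuration.

```python
# Illustrative common-space embedding + relation network for zero-shot classification.
import torch
import torch.nn as nn

class ZeroShotRelationNet(nn.Module):
    def __init__(self, vis_dim=2048, sem_dim=312, common=512):
        super().__init__()
        self.vis_embed = nn.Linear(vis_dim, common)   # common-space embedding (visual)
        self.sem_embed = nn.Linear(sem_dim, common)   # common-space embedding (semantic)
        self.relation = nn.Sequential(                # end-to-end learnable deep metric
            nn.Linear(2 * common, 256), nn.ReLU(),
            nn.Linear(256, 1), nn.Sigmoid())

    def forward(self, vis_feat, class_sem):
        v = self.vis_embed(vis_feat)                  # (B, common)
        s = self.sem_embed(class_sem)                 # (C, common)
        pairs = torch.cat([v.unsqueeze(1).expand(-1, s.size(0), -1),
                           s.unsqueeze(0).expand(v.size(0), -1, -1)], dim=-1)
        return self.relation(pairs).squeeze(-1)       # (B, C) similarity scores

model = ZeroShotRelationNet()
scores = model(torch.randn(4, 2048), torch.randn(50, 312))  # 4 images, 50 unseen classes
print(scores.argmax(dim=1))  # predicted class index per image
```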


The objective of this study is to examine the use of the Masculine Sound Plural (MSP) as a default inflection in Modern Standard Arabic (MSA). Twenty-six fourth-year university students in the department of Arabic language and literature participated in the experiment; they were required to provide the plural inflection for 30 derived noun forms in MSA. The data used in this study consist of agentive derived forms that denote human-action meaning. A descriptive statistics approach (mean and standard deviation) was used to investigate the data. The results show that the MSP inflection was produced at a higher rate of frequency than the other possible form, the irregular broken plural (BP) inflection, which can also be an actual part of the lexicon, schemata, or background knowledge. The findings support accounts based on a combinatorial processing mechanism, in which suffixation is more predictable than any of the BP forms. The results also provide more concrete evidence for the idea that an initial mapping between semantic features and the default inflection primarily motivates the emergence of the default form, and this semantic mapping is expected to add to the properties that make the multiple-defaults scenario more plausible.


Information ◽  
2018 ◽  
Vol 9 (8) ◽  
pp. 203 ◽  
Author(s):  
Lianhong Ding ◽  
Bin Sun ◽  
Peng Shi

A microblog is a new type of social media for publishing, acquiring, and spreading information. Finding the significant topics of a microblog is necessary for popularity tracing and public opinion monitoring. This paper puts forward a method to detect topics in Chinese microblogs. Since traditional methods perform poorly on the short texts of microblog posts, we propose a topic detection method based on a semantic description of the microblog post. The semantic expansion of the post supplies more information and clues for topic detection. First, semantic features are extracted from a microblog post. Second, the semantic features are expanded according to a thesaurus; here, TongYiCi CiLin is used as the lexical resource to find words with the same meaning. To overcome the polysemy problem, several semantic expansion strategies based on part of speech are introduced and compared. Third, an approach to detect topics based on semantic descriptions and an improved incremental clustering algorithm is introduced. A dataset from Sina Weibo is employed to evaluate our method. Experimental results show that our method yields better results for both post clustering and topic detection in Chinese microblogs. We also found that the semantic expansion of nouns is far more effective than that of other parts of speech. The potential mechanism behind this phenomenon is also analyzed and discussed.
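The expansion-then-clustering pipeline can be illustrated with a short sketch (not the authors' implementation): words found in a small synonym lexicon standing in for TongYiCi CiLin are expanded (the paper restricts expansion by part of speech; POS tagging is skipped here for brevity), posts are vectorized, and a single-pass incremental clustering step assigns each post to the most similar existing cluster or starts a new one. The English toy posts, the TF-IDF representation, and the 0.3 similarity threshold are illustrative assumptions.

```python
# Illustrative semantic expansion plus single-pass incremental clustering.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

synonyms = {"flu": ["influenza"], "vote": ["ballot"]}   # stand-in for TongYiCi CiLin

def expand(post):
    # Append lexicon synonyms of each word to the post text.
    words = post.split()
    return " ".join(words + [s for w in words for s in synonyms.get(w, [])])

posts = ["flu cases rising fast", "influenza outbreak in city",
         "people vote early today", "ballot counting continues"]
vectors = TfidfVectorizer().fit_transform(expand(p) for p in posts)

def incremental_cluster(vectors, threshold=0.3):
    clusters = []                       # each cluster holds the indices of its posts
    for i in range(vectors.shape[0]):
        best, best_sim = None, threshold
        for c in clusters:
            centroid = np.asarray(vectors[c].mean(axis=0))
            sim = cosine_similarity(vectors[i], centroid)[0, 0]
            if sim > best_sim:
                best, best_sim = c, sim
        if best is not None:
            best.append(i)              # join the most similar cluster
        else:
            clusters.append([i])        # otherwise start a new cluster
    return clusters

print(incremental_cluster(vectors))
```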


Author(s):  
Varvara Vagiati

This chapter presents the current status of efforts to harmonize MPEG-7 and the SCORM Content Package (including the LOM description metadata that are part of SCORM). In particular, a model for interoperability between these standards is developed. MPEG-7 provides a standardized set of technologies for describing multimedia content, while SCORM is a collection of specifications for developing, organizing, and delivering instructional content. The proposed model concerns the semantic mapping between the different elements of these standards, which were created to satisfy the specific needs of different communities. The approach followed is based on the main principles and procedures for metadata interoperability, such as crosswalking and mapping techniques. Some empirical remarks conclude the discussion of the mapping process.
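A crosswalk of the kind described can be expressed as a simple mapping table applied to a metadata record. The sketch below is purely illustrative: the LOM and MPEG-7 element paths are hypothetical pairs chosen for the example, not the mapping defined in the chapter, and unmapped elements are simply dropped.

```python
# Illustrative crosswalk from LOM element paths to MPEG-7 description paths.
# The element pairs are hypothetical examples, not the chapter's mapping.
LOM_TO_MPEG7 = {
    "general.title": "CreationInformation.Creation.Title",
    "general.description": "CreationInformation.Creation.Abstract",
    "general.language": "CreationInformation.Classification.Language",
}

def crosswalk(lom_record, table=LOM_TO_MPEG7):
    """Translate a flat LOM record into MPEG-7 fields using the mapping table."""
    return {table[path]: value for path, value in lom_record.items() if path in table}

lom = {"general.title": "Introduction to Semantic Mapping",
       "general.language": "en",
       "technical.format": "video/mp4"}   # no mapping defined: dropped
print(crosswalk(lom))
```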

