semantic indexing
Recently Published Documents


Total documents: 525 (five years: 45)

H-index: 35 (five years: 3)

Author(s):  
Ansh Mehta

Abstract: Previous research on emotion recognition of Twitter users has centered on lexicons and basic classifiers over bag-of-words models, despite the recent successes of deep learning in many areas of natural language processing. The study's main question is whether deep learning can improve performance on this task. Because most posts offer scant contextual information, emotion analysis remains difficult. The suggested method captures more emotion semantics than existing models by projecting emoticons and words into emoticon space, which improves the performance of emotion analysis. In a microblog setting, this aids the detection of subjectivity, polarity, and emotion. The study uses hashtags to create three large emotion-labeled data sets that correspond to different emotion classification schemes. It then compares several word- and character-based recurrent and convolutional neural networks against bag-of-words and latent semantic indexing models. Furthermore, it examines the transferability of the final hidden state representations across the different emotion classifications and whether a unified model can predict all of them from a shared representation. Recurrent neural networks, especially character-based ones, are shown to outperform bag-of-words and latent semantic indexing models. The semantics of each token must be considered when classifying the emotion of a tweet, and token semantics stored in a hash map can be looked up easily. Although these models transfer poorly, the newly presented training heuristic produces a unified model whose performance is comparable to that of the three single models. Keywords: Hashtags, Sentiment Analysis, Facial Recognition, Emotions.


Author(s):  
Taher Zaki ◽  
Driss Mammass ◽  
Abdellatif Ennaji ◽  
Stéphane Nicolas

In this paper, we propose a hybrid system for contextual and semantic indexing of Arabic documents that improves on classical models based on n-grams and the Okapi model. This new approach takes into account the semantic vicinity of terms. We compute the similarity between words using a hybridization of n-gram and Okapi statistical measures and a kernel function in order to identify relevant descriptors. Terminological resources such as graphs and semantic dictionaries are integrated into the system to improve the indexing and classification processes.
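
For orientation, the standard Okapi BM25 weighting that n-gram and Okapi based indexing builds on can be sketched as follows. This is not the authors' hybrid measure: the kernel function and the way the n-gram and Okapi scores are combined are not given in the abstract. The names `char_ngrams` and `bm25_scores`, the defaults k1 = 1.2 and b = 0.75, the smoothed IDF variant, and the toy English strings are assumptions made purely for illustration.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Return overlapping character n-grams of text with spaces removed."""
    s = text.replace(" ", "")
    return [s[i:i + n] for i in range(len(s) - n + 1)]


def bm25_scores(query_tokens, docs_tokens, k1=1.2, b=0.75):
    """Score each tokenized document against the query with Okapi BM25.

    k1 and b are common default values (an assumption, not taken from
    the paper). The IDF uses the smoothed, non-negative variant
    log((N - df + 0.5) / (df + 0.5) + 1).
    """
    n_docs = len(docs_tokens)
    avg_len = sum(len(d) for d in docs_tokens) / n_docs
    # Document frequency: in how many documents each token occurs.
    df = Counter()
    for d in docs_tokens:
        df.update(set(d))
    scores = []
    for d in docs_tokens:
        tf = Counter(d)
        score = 0.0
        for t in query_tokens:
            if tf[t] == 0:
                continue  # token absent from this document
            idf = math.log((n_docs - df[t] + 0.5) / (df[t] + 0.5) + 1.0)
            length_norm = k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[t] * (k1 + 1) / (tf[t] + length_norm)
        scores.append(score)
    return scores


# Hypothetical toy corpus, tokenized into character trigrams.
docs = [char_ngrams("semantic indexing"), char_ngrams("image retrieval")]
print(bm25_scores(char_ngrams("indexing"), docs))
```

Character n-grams are used as tokens here because they tolerate inflected word forms without a stemmer, which is one common reason to use them for morphologically rich languages.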


2021 ◽  
Vol 4 (2) ◽  
pp. 64-70
Author(s):  
Agung Hasbi Ardiansyah ◽  
Kurnia Paranita Kartika ◽  
Saiful Nur Budiman

When a finding or report of a suspected election violation is received, election supervisors clarify it and gather sufficient evidence before deciding whether it constitutes a violation. During clarification, the supervisors look for the articles that may have been violated in the finding or report. The large number of articles that can be referenced for each case sometimes slows the supervisors' work, so a tool is needed to speed up the search for articles based on the violation case. In this study, an information retrieval system is used to find the articles of Law Number 10 of 2016 that are relevant to a case based on the case description. The study uses the Latent Semantic Indexing (LSI) method, which applies Singular Value Decomposition (SVD) to reduce dimensionality. The study uses 37 articles and 4 cases, or violation descriptions, as queries. The system receives a query or violation case description as input, then computes and determines the related articles. The method's success in finding relevant results is shown by a recall of 100%, a precision of 70%, and an F-measure of 82%.
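
The LSI retrieval step described above (truncated SVD, then query matching in the reduced space) can be illustrated with a minimal sketch. It assumes NumPy is available; the 4-term, 4-document matrix, the term names, and the function names `lsi_cosine_scores` and `f_measure` are hypothetical and are not taken from the study, which works on 37 legal articles.

```python
import numpy as np


def lsi_cosine_scores(term_doc, query_vec, k):
    """Cosine similarity of a query to every document in a rank-k LSI space.

    term_doc is a (terms x documents) count matrix; query_vec is a
    term-count vector over the same vocabulary.
    """
    # Thin SVD: term_doc = U @ diag(s) @ Vt
    U, s, Vt = np.linalg.svd(term_doc, full_matrices=False)
    U_k = U[:, :k]                      # top-k left singular vectors
    doc_coords = (U_k.T @ term_doc).T   # one k-dim row per document
    q_coords = U_k.T @ query_vec        # query folded into the same space
    denom = np.linalg.norm(doc_coords, axis=1) * np.linalg.norm(q_coords)
    return doc_coords @ q_coords / (denom + 1e-12)


def f_measure(precision, recall):
    """Harmonic mean of precision and recall (F1)."""
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data. Rows: money, bribe, ballot, count.
# Columns: D0 = {money, bribe}, D1 = {money}, D2 = {ballot, count}, D3 = {ballot}.
TERM_DOC = np.array([
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 0],
], dtype=float)
QUERY = np.array([0, 1, 0, 0], dtype=float)  # the single term "bribe"

# In the raw term space, QUERY and D1 share no term (cosine 0). In the
# rank-2 LSI space D1 becomes similar to the query, because "money" and
# "bribe" co-occur in D0.
print(lsi_cosine_scores(TERM_DOC, QUERY, k=2))

# The reported figures are mutually consistent: precision 0.70 and
# recall 1.00 give an F-measure that rounds to 0.82.
print(round(f_measure(0.70, 1.00), 2))
```

The choice of k is the main tuning decision in LSI: too small merges unrelated topics, too large reproduces plain term matching. The abstract does not state the value used in the study.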


2021 ◽  
Vol 12 (4) ◽  
pp. 169-185
Author(s):  
Saida Ishak Boushaki ◽  
Omar Bendjeghaba ◽  
Nadjet Kamel

Clustering is an important unsupervised analysis technique for big data mining. It is applied in several domains, including the biomedical documents of the MEDLINE database. Document clustering based on metaheuristics is an active research area. However, these algorithms tend to get trapped in local optima, require many parameters to be tuned, and need the documents to be indexed by a high-dimensional matrix using the traditional vector space model. To overcome these limitations, this paper proposes a new parameter-free document clustering algorithm (ASOS-LSI). It is based on the recent symbiotic organisms search (SOS) metaheuristic and is enhanced by an acceleration technique. Furthermore, the documents are represented by semantic indexing based on the well-known latent semantic indexing (LSI). Experiments on well-known biomedical document datasets show the significant superiority of ASOS-LSI over five well-known algorithms in terms of compactness, F-measure, purity, misclassified documents, entropy, and runtime.
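
Of the evaluation measures listed above, purity is the simplest to state. The sketch below uses the common textbook definition (the share of documents that belong to the majority class of their cluster); the abstract does not give the paper's exact formula, so this definition, the function name `purity`, and the toy labels are assumptions.

```python
from collections import Counter, defaultdict


def purity(cluster_ids, true_labels):
    """Fraction of items that carry the majority label of their cluster."""
    members = defaultdict(list)
    for cluster, label in zip(cluster_ids, true_labels):
        members[cluster].append(label)
    # For each cluster, count the items with its most frequent label.
    majority_total = sum(
        Counter(labels).most_common(1)[0][1] for labels in members.values()
    )
    return majority_total / len(true_labels)


# Hypothetical example: cluster 0 holds {a, a}, cluster 1 holds {b, a}.
# Majority counts are 2 and 1, so purity is 3 / 4.
print(purity([0, 0, 1, 1], ["a", "a", "b", "a"]))
```

Purity rewards many small clusters (one document per cluster gives a purity of 1.0), which is one reason it is usually reported alongside measures such as entropy and F-measure, as the abstract does.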


2021 ◽  
Vol 11 (3) ◽  
pp. 113-137
Author(s):  
M. Fevzi Esen

Social media content related to COVID-19 has increased remarkably. Users have created large volumes of content on various topics over a short time, interacting with people in real time. This has also transformed social media into an indispensable information source during a crisis. This study aims to explore the information content on COVID-19 disseminated through social media and to discover prominent topics in shares on COVID-19. To this end, we retrieved 17,542 tweets shared in Turkish. A content analysis of the social media shares was carried out, with latent semantic indexing and network analyses performed to detect the relationships and interactions among shares. The most shared topics were found to be yasak [lockdown], tedbir [precaution], karantina [quarantine], and vaka [case], with communication frequently passing through this semantic string and information being exchanged faster within the network. In addition, shares related to hygiene, masks, and distancing occurred less often than shares related to precautions, rules, cases, and lockdowns. Content with social propaganda such as #evdekal [stayathome], #evdehayatvar [lifeathome], and #birliktebaşaracağız [togetherwesucceed] received few likes and retweets and did not appear in a semantic string. This suggests that social propaganda through social media had a limited impact on epidemic management. In conclusion, identifying the prominent issues in social media posts and the characteristics of social media networks will help decision-makers determine appropriate policies for controlling and preventing the pandemic’s spread.


2021 ◽  
pp. 298-307
Author(s):  
P. Anandakrishnan ◽  
Naveen Raj ◽  
Mahesh S. Nair ◽  
Akshay Sreekumar ◽  
Jayasree Narayanan

2021 ◽  
Author(s):  
Adebayo Abayomi-Alli ◽  
Olusola Abayomi-Alli ◽  
Sanjay Misra ◽  
Luis Fernandez-Sanz

Abstract. Background: Social media opinion has become a medium for quickly accessing large, valuable, and rich information on any subject matter within a short period. Twitter, a social microblogging site, generates over 330 million tweets monthly across different countries. Analyzing trending topics on Twitter presents opportunities to extract meaningful insight into different opinions on various issues. Aim: This study aims to gain insights into the trending yahoo-yahoo topic on Twitter using content analysis of selected historical tweets. Methodology: The widgets and workflow engine in the Orange data mining toolbox were employed for all the text mining tasks. 5500 tweets were collected from Twitter using the 'yahoo yahoo' hashtag. The corpus was pre-processed using a pre-trained tweet tokenizer; Valence Aware Dictionary for Sentiment Reasoning (VADER) was used for sentiment and opinion mining; Latent Dirichlet Allocation (LDA) and Latent Semantic Indexing (LSI) were used for topic modeling; and multidimensional scaling (MDS) was used to visualize the modeled topics. Results: "yahoo" appeared in the corpus 9555 times, and 175 unique tweets remained after duplicate removal. Contrary to expectation, Spain had the highest number of participants tweeting on the 'yahoo yahoo' topic within the period. VADER sentiment analysis returned 35.85% negative, 24.53% neutral, 15.09% no-zone, and 24.53% positive tweets. The word yahoo was highly representative of LDA topics 1, 3, 4, and 6 and of LSI topic 1. Conclusion: It can be concluded that emojis are more representative of the sentiments in tweets, and convey them faster, than the textual content. Also, despite popular belief, a significant number of youths regard cybercrime as a detriment to society.


Information ◽  
2021 ◽  
Vol 12 (1) ◽  
pp. 43
Author(s):  
Stefan Wagenpfeil ◽  
Felix Engel ◽  
Paul Mc Kevitt ◽  
Matthias Hemmje

To cope with the growing number of multimedia assets on smartphones and social media, an integrated approach for semantic indexing and retrieval is required. Here, we introduce a generic framework to fuse existing image and video analysis tools and algorithms into a unified semantic annotation, indexing and retrieval model resulting in a multimedia feature vector graph representing various levels of media content, media structures and media features. Utilizing artificial intelligence (AI) and machine learning (ML), these feature representations can provide accurate semantic indexing and retrieval. Here, we provide an overview of the generic multimedia analysis framework (GMAF) and the definition of a multimedia feature vector graph framework (MMFVGF). We also introduce AI4MMRA to detect differences, enhance semantics and refine weights in the feature vector graph. To address particular requirements on smartphones, we introduce an algorithm for fast indexing and retrieval of graph structures. Experiments to prove efficiency, effectiveness and quality of the algorithm are included. All in all, we describe a solution for highly flexible semantic indexing and retrieval that offers unique potential for applications such as social media or local applications on smartphones.

