Provenance Framework for Twitter Data using Zero-Information Loss Graph Database

Mining social network data and building user profiles from unstructured, informal data is a challenging task. The proposed research builds a user profile from Twitter data, which can later be used to provide the user with personalized recommendations. Publicly available tweets are fetched and classified, and the sentiments expressed in them are extracted and normalized. The research uses a domain-specific seed list to classify tweets. Semantic and syntactic analysis is performed on the tweets to minimize information loss during classification. After classification and sentiment analysis, the system builds an interest-based user profile by analyzing the user’s posts on Twitter. The proposed system was tested on a dataset of almost 1 million tweets and classified up to 96% of them accurately.
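
The seed-list classification step described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the seed lists, the tokenizer, and the function names (`tokenize`, `classify`, `build_profile`) are all assumptions, and the semantic and syntactic analysis mentioned in the abstract is not modelled.

```python
import re
from collections import Counter

# Hypothetical seed lists; the abstract does not publish the actual ones.
SEED_LISTS = {
    "sports": {"football", "cricket", "goal", "match"},
    "technology": {"python", "software", "smartphone", "ai"},
}


def tokenize(tweet):
    """Lowercase a tweet, drop URLs and @mentions, and return its word tokens."""
    cleaned = re.sub(r"https?://\S+|@\w+", " ", tweet.lower())
    return re.findall(r"[a-z0-9']+", cleaned)


def classify(tweet, seed_lists=SEED_LISTS):
    """Return the domain whose seed list matches the most tokens, or None."""
    tokens = tokenize(tweet)
    scores = {
        domain: sum(1 for token in tokens if token in seeds)
        for domain, seeds in seed_lists.items()
    }
    best_domain, hits = max(scores.items(), key=lambda item: item[1])
    return best_domain if hits > 0 else None


def build_profile(tweets):
    """Count classified tweets per domain as a crude interest profile."""
    return Counter(d for d in map(classify, tweets) if d is not None)
```

Calling `build_profile` on a user's tweets yields per-domain counts, which stand in here for the interest-based profile; a real system would also weight these counts by the extracted sentiment.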

KETIDAKPADANAN DIKSI TERJEMAHAN ACHMAD SUNARTO DALAM BUKU TERJEMAH TA’LIM MUTA’ALIM [Diction Mismatches in Achmad Sunarto’s Translation in the Book Terjemah Ta’lim Muta’alim]

Hijai - Journal on Arabic Language and Literature ◽

10.15575/hijai.v2i1.6471 ◽

2019 ◽

Vol 2 (1) ◽

pp. 1-17

Author(s):

Muhammad Ibnu Pamungkas ◽

Izzuddin Musthafa ◽

Muhammad Nurhasan

Keyword(s):

Information Loss ◽

Source Text ◽

Knowledge Based

Ta’lim Muta’alim is Syaikh al-Zarnūjī’s work setting out norms, ethics, and rules for gaining knowledge based on Islamic teachings, so that seekers of knowledge can reach their goal of obtaining it. The book was translated into Indonesian by Achmad Sunarto and published by Husaini Publisher in Bandung. After reading the translation in full, the researchers found translation mistakes, especially in word choice (diction). Their analysis grouped the mistakes into four types: (1) translations that result from direct transliteration from the source language (SL) without considering compatibility in the target language (TL); (2) information loss and gain that affects the translation and makes it unsuitable; (3) word choices that do not match the meaning referred to in the source text; and (4) translations that are unacceptable in the TL because they are rendered literally.

Exploring the Use of Machine Learning to Automate the Qualitative Coding of Church-related Tweets

Fieldwork in Religion ◽

10.1558/firn.40610 ◽

2020 ◽

Vol 14 (2) ◽

pp. 140-159

Author(s):

Anthony-Paul Cooper ◽

Emmanuel Awuni Kolog ◽

Erkki Sutinen

Keyword(s):

Machine Learning ◽

Online Community ◽

High Volume ◽

Machine Learning Algorithms ◽

Supervised Machine Learning ◽

Social Media Data ◽

Twitter Data ◽

Resource Intensity ◽

Media Data ◽

Better Than

This article builds on previous research exploring the content of church-related tweets. It does so by examining whether the qualitative thematic coding of such tweets can, in part, be automated using machine learning. It compares three supervised machine learning algorithms to understand how useful each is at a classification task, based on a dataset of human-coded church-related tweets. The study finds that one such algorithm, Naïve-Bayes, performs better than the other algorithms considered, returning Precision, Recall and F-measure values which each exceed an acceptable threshold of 70%. This has far-reaching consequences at a time when the high volume of social media data, in this case Twitter data, means that the resource-intensity of manual coding approaches can act as a barrier to understanding how the online community interacts with, and talks about, church. The findings presented in this article offer a way forward for scholars of digital theology to better understand the content of online church discourse.
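
As a rough illustration of the evaluation this abstract reports, the sketch below computes Precision, Recall and F-measure for one thematic code and checks them against the 70% threshold. The function names and the example labels are hypothetical; the abstract does not say how the metrics were aggregated across codes, so this handles a single code only.

```python
def precision_recall_f1(y_true, y_pred, positive):
    """Compute precision, recall and F-measure for one label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def meets_threshold(scores, threshold=0.70):
    """True only if every metric strictly exceeds the threshold."""
    return all(score > threshold for score in scores)


# Hypothetical human codes versus classifier output for five tweets.
human = ["worship", "worship", "worship", "worship", "other"]
model = ["worship", "worship", "worship", "worship", "worship"]
scores = precision_recall_f1(human, model, positive="worship")
print(meets_threshold(scores))
```

The strict comparison mirrors the abstract's wording that each value must exceed the threshold.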

A Study on the Leading Indicators of Customer Perspective through Twitter Data Analysis

Journal of Korea Service Management Society ◽

10.15706/jksms.2018.19.4.001 ◽

2018 ◽

Vol 19 (4) ◽

pp. 1-23

Author(s):

Chun Mi Pyo

Keyword(s):

Data Analysis ◽

Leading Indicators ◽

Twitter Data ◽

Customer Perspective ◽

Twitter Data Analysis

Transforming Product Catalogue Relational into Graph Database: a Performance Comparison

2020 43rd International Convention on Information, Communication and Electronic Technology (MIPRO) ◽

10.23919/mipro48935.2020.9245152 ◽

2020 ◽

Author(s):

Josip Lorincz ◽

Vlatka Huljic ◽

Dinko Begusic

Keyword(s):

Performance Comparison ◽

Graph Database ◽

A Performance

Controversial Analysis:- Sentimental Analysis of Twitter Data

International Journal of Advanced Research in Computer Science and Software Engineering ◽

10.23956/ijarcsse/v7i4/0124 ◽

2017 ◽

Vol 7 (4) ◽

pp. 96-100

Author(s):

Samarth Jaykar Shetty ◽

Badal Rakesh Thosani ◽

Lenherd Deon Olivera ◽

Supriya Kamoji ◽

...

Keyword(s):

Twitter Data

RAKING: An Efficient K-Maximal Frequent Pattern Mining Algorithm on Uncertain Graph Database

Chinese Journal of Computers ◽

10.3724/sp.j.1016.2010.01387 ◽

2010 ◽

Vol 33 (8) ◽

pp. 1387-1395 ◽

Cited By ~ 4

Author(s):

Meng HAN ◽

Wei ZHANG ◽

Jian-Zhong LI

Keyword(s):

Pattern Mining ◽

Frequent Pattern Mining ◽

Frequent Pattern ◽

Graph Database ◽

Uncertain Graph ◽

Mining Algorithm ◽

Maximal Frequent Pattern

Global Depictions of International Students in a Time of Crisis: A Thematic Analysis of Twitter Data During COVID-19

SSRN Electronic Journal ◽

10.2139/ssrn.3703604 ◽

2020 ◽

Author(s):

Jenna Mittelmeier ◽

Heather Cockayne

Keyword(s):

International Students ◽

Thematic Analysis ◽

Twitter Data

Utilizing Twitter Data Analysis and Deep Learning to Identify Drug Use (Preprint)

10.2196/preprints.14681 ◽

2019 ◽

Author(s):

Joseph Tassone ◽

Peizhi Yan ◽

Mackenzie Simpson ◽

Chetan Mendhe ◽

Vijay Mago ◽

...

Keyword(s):

Social Media ◽

Logistic Regression ◽

Deep Learning ◽

Decision Tree ◽

Semantic Meaning ◽

Predictive Capability ◽

Logistic Regression Models ◽

Twitter Data ◽

Data Points ◽

Positive Classification

BACKGROUND The collection and examination of social media has become a useful mechanism for studying the mental activity and behavior tendencies of users. OBJECTIVE Through the analysis of a collected set of Twitter data, a model will be developed for predicting positively referenced, drug-related tweets. From this, trends and correlations can be determined. METHODS Tweets and attribute data were collected and processed using topic-related keywords, such as drug slang and use-conditions (methods of drug consumption). Potential candidates were preprocessed, resulting in a dataset of 3,696,150 rows. The predictive classification power of multiple methods was compared, including regression, decision trees, and CNN-based classifiers. For the latter, a deep learning approach was implemented to screen and analyze the semantic meaning of the tweets. RESULTS The logistic regression and decision tree models used 12,142 data points for training and 1,041 data points for testing. The logistic regression models showed accuracies of 54.56% and 57.44%, respectively, and an AUC of 0.58. The decision tree was an improvement, with an accuracy of 63.40% and an AUC of 0.68, but all of these values imply low predictive capability with little to no discrimination. Conversely, the two CNN-based classifiers tested showed a substantial improvement. The first was trained with 2,661 manually labeled samples, while the other included synthetically generated tweets, for a total of 12,142 samples. The accuracy scores were 76.35% and 82.31%, with AUCs of 0.90 and 0.91. Using association rule mining in conjunction with the CNN-based classifier showed a high likelihood for keywords such as “smoke”, “cocaine”, and “marijuana” triggering a drug-positive classification. CONCLUSIONS Predictive analysis without a CNN is limited and possibly fruitless. Attribute-based models presented little predictive capability and were not suitable for analyzing this type of data. The semantic meaning of the tweets needed to be utilized, giving the CNN-based classifier an advantage over other solutions. Additionally, commonly mentioned drugs had a level of correspondence with frequently used illicit substances, proving the practical usefulness of this system. Lastly, the synthetically generated set provided increased scores, improving the predictive capability. CLINICALTRIAL None
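
For readers unfamiliar with the two metrics reported in this abstract, the following sketch shows how accuracy and AUC can be computed from labels and classifier scores. It is a generic, pure-Python illustration with made-up numbers, not the authors' evaluation code; the AUC uses the pairwise-ranking definition, which is quadratic in the number of samples and so only suitable for small examples.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(y_true, scores):
    """Probability that a random positive outranks a random negative.

    Labels are 1 (positive) or 0 (negative); tied scores count as half a win.
    """
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Hypothetical labels (1 = drug-positive tweet) and classifier scores.
labels = [1, 1, 0, 0]
model_scores = [0.9, 0.4, 0.5, 0.1]
print(roc_auc(labels, model_scores))
```

An AUC near 0.5 means the scores barely separate the two classes, which is why the abstract reads values of 0.58 and 0.68 as little to no discrimination.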

Sentiment Analysis using Twitter Data

International Journal for Research in Applied Science and Engineering Technology ◽

10.22214/ijraset.2020.5368 ◽

2020 ◽

Vol 8 (5) ◽

pp. 2253-2257

Author(s):

Nikhil Srivastava

Keyword(s):

Sentiment Analysis ◽

Twitter Data