Classification of unlabeled online media

Sakthi Kumar Arul Prakash; Conrad Tucker

doi:10.1038/s41598-021-85608-5

Scientific Reports ◽

10.1038/s41598-021-85608-5 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Sakthi Kumar Arul Prakash ◽

Conrad Tucker

Keyword(s):

Social Media ◽

Real World ◽

Graphical Model ◽

Ground Truth ◽

Classification Problem ◽

Machine Learning Algorithms ◽

Social Media Networks ◽

Online Social Media ◽

Wide Range

Abstract: This work investigates the ability to classify misinformation in online social media networks in a manner that avoids the need for ground truth labels. Rather than approach the classification problem as a task for humans or machine learning algorithms, this work leverages user–user and user–media (i.e., media likes) interactions to infer the type of information (fake vs. authentic) being spread, without needing to know the actual details of the information itself. To study the inception and evolution of user–user and user–media interactions over time, we create an experimental platform that mimics the functionality of real-world social media networks. We develop a graphical model that considers the evolution of this network topology to model the uncertainty (entropy) propagation when fake and authentic media disseminate across the network. The creation of a real-world social media network enables a wide range of hypotheses to be tested pertaining to users, their interactions with other users, and with media content. The discovery that the entropy of user–user and user–media interactions approximates fake and authentic media likes enables us to classify fake media in an unsupervised learning manner.
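The abstract above uses entropy as its measure of uncertainty over interactions. As a rough illustration only, the sketch below computes the Shannon entropy of how likes on each media item are spread across users. It is not the authors' graphical model: the function names, the `(user, media)` log format, and the sample data are all assumptions made for this example.

```python
import math
from collections import Counter


def shannon_entropy(counts):
    """Shannon entropy (in bits) of a distribution given as raw counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    # p * log2(1/p), summed over the non-empty outcomes
    return sum((c / total) * math.log2(total / c) for c in counts if c > 0)


def like_entropy_per_media(likes):
    """Entropy of the user distribution of likes, per media item.

    `likes` is a hypothetical interaction log of (user, media) pairs.
    """
    per_media = {}
    for user, media in likes:
        per_media.setdefault(media, Counter())[user] += 1
    return {m: shannon_entropy(list(c.values())) for m, c in per_media.items()}


# Hypothetical data: m1 is liked once by three different users,
# m2 is liked three times by the same user.
likes = [
    ("u1", "m1"), ("u2", "m1"), ("u3", "m1"),
    ("u1", "m2"), ("u1", "m2"), ("u1", "m2"),
]
entropies = like_entropy_per_media(likes)
```

In this toy log, likes spread evenly over several users give the highest entropy for that number of users, while likes concentrated in a single user give zero. How such values map onto fake versus authentic media is the subject of the paper and is not reproduced here.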

Classification of unlabeled online media

10.21203/rs.3.rs-107002/v1 ◽

2020 ◽

Author(s):

Sakthi Kumar Arul Prakash ◽

Conrad Tucker

Keyword(s):

Real World ◽

Graphical Model ◽

Classification Problem ◽

Machine Learning Algorithms ◽

Online Media ◽

Media Content ◽

Social Media Networks ◽

Online Social Media ◽

Wide Range

Abstract: This work investigates the ability to classify misinformation in online social media networks in a manner that avoids the need for ground truth labels. Rather than approach the classification problem as a task for humans or machine learning algorithms, this work leverages user-user and user-media (i.e., media likes) interactions to infer the type of information (fake vs. authentic) being spread, without needing to know the actual details of the information itself. To study the inception and evolution of user-user and user-media interactions over time, we create an experimental platform that mimics the functionality of real-world social media networks. We develop a graphical model that considers the evolution of this network topology to model the uncertainty (entropy) propagation when fake and authentic media disseminate across the network. The creation of a real-world social media network enables a wide range of hypotheses to be tested pertaining to users, their interactions with other users, and with media content. The discovery that the entropy of user-user and user-media interactions approximates fake and authentic media likes enables us to classify fake media in an unsupervised learning manner.

DERADICALISATION IDEAS IN SOCIAL MEDIA THROUGH THE RELIGIOUS-MULTICULTURAL THERAPY APPROACH TO THE DA'WAH CULTURE

Potret Pemikiran ◽

10.30984/pp.v24i1.1116 ◽

2020 ◽

Vol 24 (1) ◽

pp. 58

Author(s):

Anwar Hafidzi

Keyword(s):

Social Media ◽

Real World ◽

Environmental Conditions ◽

Islamic Law ◽

Research Tool ◽

Therapy Approach ◽

The Real ◽

Multicultural Therapy ◽

Online Social Media ◽

Library Research

This research begins from the understanding that radicalism has become endemic in society, not only in the real world but also across various online social media. The study finds that online radicalism can be countered through prevention as early as possible, using a religious-multicultural counseling approach with those exposed to radical ideas. Among the techniques used, understanding of religious concepts also needs to be promoted so that a moderate understanding (wasathiyah) emerges across different community environments. The method used is library research, with the main sources being the works and journal articles of Abu Rokhmad, an expert on terrorism and radicalism. The finding of this research is that radicalization can be halted through a heart-to-heart approach that foregrounds multicultural culture. This study also shows that radical actors will never stop making radical arguments unless they are able to understand differences of opinion grounded in Islamic law, the social environment, and the family. Keywords: radicalism, deradicalization, multiculturalism, culture, religion, moderate.

RepPer: Perception of Psychiatric Disorders on Twitter in French (Preprint)

10.2196/preprints.18539 ◽

2020 ◽

Author(s):

Sarah Delanys ◽

Farah Benamara ◽

Véronique Moriceau ◽

François Olivier ◽

Josiane Mothe

Keyword(s):

Social Media ◽

Psychiatric Disorders ◽

Digital Technology ◽

Psychotic Disorders ◽

Negative Polarity ◽

Machine Learning Algorithms ◽

Annotation Scheme ◽

Word Use ◽

Wide Range ◽

Initial Dataset

BACKGROUND: With the advent of digital technology, and specifically user-generated content in social media, new ways have emerged for studying the possible stigmatization of people in relation to mental health. Several studies have examined the discourse about psychiatric pathologies on Twitter, mostly considering tweets in English and a limited number of psychiatric disorder terms. This paper proposes the first study to analyze the use of a wide range of psychiatric terms in tweets in French. OBJECTIVE: Our aim is to study how generic, nosographic and therapeutic psychiatric terms are used on Twitter in French. More specifically, our study has three complementary goals: (1) to analyze the types of psychiatric word use, namely medical, misuse, and irrelevant; (2) to analyze the polarity conveyed in the tweets that use these terms (positive/negative/neutral); and (3) to compare the frequency of these terms with those observed in related work (mainly in English). METHODS: Our study was conducted on a corpus of tweets in French posted between 01/01/2016 and 12/31/2018 and collected using dedicated keywords. The corpus was manually annotated by clinical psychiatrists following a multilayer annotation scheme that includes the type of word use and the opinion orientation of the tweet. Two analyses were performed: first, a qualitative analysis to measure the reliability of the manual annotation; then, a quantitative analysis considering mainly term frequency in each layer and exploring the interactions between layers. RESULTS: The first result is a resource in the form of an annotated dataset. The initial dataset is composed of 22,579 tweets in French containing at least one of the selected psychiatric terms. From this set, experts in psychiatry annotated 3,040 randomly selected tweets, which constitute the resource resulting from our work. The second result is the analysis of the annotations; it shows that terms are misused in 45.3% of the tweets and that their associated polarity is negative in 86.2% of the cases. When considering the three types of term use, 59.5% of the tweets are associated with a negative polarity. Misused terms related to psychotic disorders (55.5%) are more frequent than those related to mood disorders (26.5%). CONCLUSIONS: Some psychiatric terms are misused in the corpora we studied, which is consistent with the results reported in related work in other languages. Thanks to the great diversity of terms studied, this work highlights a disparity in the representations and ways of using psychiatric terms. Moreover, our study helps psychiatrists to be aware of how terms are used in new communication media such as social networks, which are widely used. The study is reproducible thanks to the framework and guidelines we produced, so it could be repeated in order to analyze the evolution of term usage. While the newly built dataset is a valuable resource for other analytical studies, it could also serve to train machine learning algorithms to automatically identify stigma in social media.

Understanding Smartwatch Battery Utilization in the Wild

Sensors ◽

10.3390/s20133784 ◽

2020 ◽

Vol 20 (13) ◽

pp. 3784 ◽

Cited By ~ 1

Author(s):

Morteza Homayounfar ◽

Amirhossein Malekijoo ◽

Aku Visuri ◽

Chelsea Dobbins ◽

Ella Peltonen ◽

...

Keyword(s):

Real World ◽

Binary Classification ◽

Classification Problem ◽

Machine Learning Algorithms ◽

Indexing Method ◽

In The Wild ◽

Art Research ◽

Battery Discharge ◽

Changes Over Time ◽

Application Developers

Smartwatch battery limitations are one of the biggest hurdles to their acceptability in the consumer market. To our knowledge, despite promising studies analyzing smartwatch battery data, there has been little research that has analyzed the battery usage of a diverse set of smartwatches in a real-world setting. To address this challenge, this paper utilizes a smartwatch dataset collected from 832 real-world users, including different smartwatch brands and geographic locations. First, we employ clustering to identify common patterns of smartwatch battery utilization; second, we introduce a transparent low-parameter convolutional neural network model, which allows us to identify the latent patterns of smartwatch battery utilization. Our model converts the battery consumption rate into a binary classification problem, i.e., low and high consumption. Our model has 85.3% accuracy in predicting high battery discharge events, outperforming other machine learning algorithms that have been used in state-of-the-art research. In addition, information can be extracted from the learned filters of the model's feature extractor, which is not possible with the other models. Third, we introduce an indexing method that includes a longitudinal study to quantify smartwatch battery quality changes over time. Our novel findings can assist device manufacturers, vendors and application developers, as well as end-users, to improve smartwatch battery utilization.
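The abstract above frames battery consumption rate as a binary problem of low versus high consumption. The sketch below shows one simple way such labels could be derived from discharge rates. The abstract does not say how the cut-off is chosen, so the median default, the function name, and the sample rates are assumptions for illustration only.

```python
from statistics import median


def label_consumption(rates, threshold=None):
    """Label each discharge rate as 1 (high) or 0 (low).

    If no threshold is given, the median of the rates is used; this
    default is an assumption, not the rule used in the paper.
    """
    if threshold is None:
        threshold = median(rates)
    return [1 if r > threshold else 0 for r in rates]


# Hypothetical discharge rates in percent of battery per hour.
rates = [2.0, 3.5, 9.0, 12.5, 4.0]
labels = label_consumption(rates)
```

Labels produced this way could then serve as targets for any binary classifier, including the convolutional model the abstract describes.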

Data-driven inferences of agency-level risk and response communication on COVID-19 through social media-based interactions

Journal of Emergency Management ◽

10.5055/jem.0589 ◽

2021 ◽

Vol 19 (7) ◽

pp. 59-82

Author(s):

Md Ashraf Ahmed, PhD Candidate ◽

Arif Mohaimin Sadri, PhD ◽

M. Hadi Amini, PhD, DEng

Keyword(s):

Public Health ◽

Social Media ◽

Information Dissemination ◽

Topic Model ◽

Face Mask ◽

Community Response ◽

Machine Learning Algorithms ◽

Data Driven ◽

Contact Tracing ◽

Online Social Media

Risk perception and risk-averting behaviors of public agencies during the emergence and spread of COVID-19 can be retrieved through online social media (Twitter), and such interactions can be echoed in other information outlets. This study collected time-sensitive online social media data and used data-driven methods to analyze patterns of health risk communication by public health and emergency agencies during the emergence and spread of the novel coronavirus. The major focus is on understanding how policy-making agencies communicate risk and response information through social media during a pandemic and how they influence community response (ie, timing of lockdown, timing of reopening, etc.) and disease outbreak indicators (ie, number of confirmed cases and number of deaths). Twitter data from six major public organizations (1,000-4,500 tweets per organization) were collected from February 21, 2020 to June 6, 2020. Several machine learning algorithms, including a dynamic topic model and sentiment analysis, are applied over time to identify the topic dynamics over the specific timeline of the pandemic. Organizations emphasized various topics at different points in the timeline, eg, the importance of wearing face masks, home quarantine, understanding the symptoms, social distancing and contact tracing, emerging community transmission, lack of personal protective equipment, COVID-19 testing and medical supplies, the effect of tobacco, pandemic stress management, the increasing hospitalization rate, the upcoming hurricane season, the use of convalescent plasma for COVID-19 treatment, maintaining hygiene, and the role of healthcare podcasts. The findings can help emergency management, policymakers, and public health agencies to identify targeted information dissemination policies for a public with diverse needs, based on how local, federal, and international agencies reacted to COVID-19.

Study of Automotive Brands Popularity in Indonesia Using Twitter Data

Journal of Applied Information, Communication and Technology ◽

10.33555/ejaict.v3i1.91 ◽

2016 ◽

Vol 3 (1) ◽

pp. 23-33

Author(s):

Stevent Efendi ◽

Alva Erwin ◽

Kho I Eng

Keyword(s):

Social Media ◽

Social Network ◽

Sentiment Analysis ◽

Automotive Industry ◽

Real World ◽

The Internet ◽

Brand Preference ◽

Twitter Data ◽

Wide Range ◽

Widespread Phenomenon

Social media has been a widespread phenomenon in recent years. People share many of their thoughts on social media, and the data posted on the internet can be used for study and research. As one of the fastest growing social networks, Twitter is a particularly popular platform to study because it allows researchers to access its data. This research looks at the correlation between Twitter chatter about a brand and the sales of brands in Indonesia. Factors such as sentiment and tweet rate are expected to be able to predict the popularity of a brand. As one of the biggest industries in Indonesia, the automotive industry is an interesting subject to study. A wide range of people buy vehicles, and even gather in communities based on their car or motorcycle brand preference. The Twitter results for sentiment analysis and tweet rate will be compared with real-world sales results published by GAIKINDO and AISI.

Awareness and Use of Social Media

Advances in Library and Information Science - Social Media Strategies for Dynamic Library Service Development ◽

10.4018/978-1-4666-7415-8.ch014 ◽

2015 ◽

pp. 263-278

Author(s):

S. Thanuskodi ◽

A. Alagu

Keyword(s):

Social Media ◽

Social Media Networks ◽

Online Social Media ◽

Online Tools ◽

Level Of Use ◽

Social Media Tools ◽

Use Of Social Media

ABSTRACT: In this chapter, Social Media Networks (SMNs), a subset of ICTs, are defined as online tools and utilities that allow communication of information online as well as participation and collaboration. Additionally, social media tools are websites that interact with users while giving them information. It is this two-way nature of SMNs that is central to this argument, and to the role they played in the Egyptian uprisings. This chapter further defines the four most widely and effectively used SMNs: Facebook, Twitter, YouTube, and blogging. It is observed that only 81.75% of the respondents have their own blog, 73.64% of the respondents read blogs, while 74.32% of respondents add posts to blogs. The study shows the respondents' level of use of specific online social media by gender.

Data retrieval from online social media networks for defining business angels’ profile

Journal of Enterprising Communities People and Places in the Global Economy ◽

10.1108/jec-10-2019-0095 ◽

2019 ◽

Vol 14 (1) ◽

pp. 57-75

Author(s):

Gustavo Morales-Alonso ◽

Guzmán A. Vila ◽

Isaac Lemus-Aguilar ◽

Antonio Hidalgo

Keyword(s):

Social Media ◽

Early Stage ◽

Data Retrieval ◽

Business Angels ◽

Informal Finance ◽

Content Type ◽

Highly Educated ◽

Social Media Networks ◽

Retrieval Technique ◽

Online Social Media

Purpose: Entrepreneurship is the basis of economic development but is somewhat limited by the lack of access to financing sources, especially in the crucial moments of start-up early-stage development. To cross the so-called “valley of death,” start-ups need to access informal finance sources, such as business angels. This study aims at defining the profile of business angels and comparing it with the existing literature. Design/methodology/approach: A novel methodology for sampling the business angel population has been used, which extracts data from online social media networks. This allows a closer look at informal sources of entrepreneurial finance. A total of 500 real business angels, acting worldwide, have been retrieved from the LinkedIn and Crunchbase databases for this study. Findings: Results indicate that younger investors seem to be entering the entrepreneurial informal finance market. They are mainly males between 40 and 50 years of age, with a previous entrepreneurial record, and more highly educated than previously stated. They tend to have studied Business Administration and Economics, although they prefer to invest in the ICT sector. Originality/value: Besides the novel data retrieval technique for analyzing the informal sources of finance, the originality of the work lies in updating the archetype for business angels.

Comparative of Machine Learning Algorithms and Datasets to Classify Natural Coverage in the Cajas National Park (Ecuador) Based on GEOBIA Approach

Proceedings ◽

10.3390/proceedings2019019020 ◽

2019 ◽

Vol 19 (1) ◽

pp. 20

Author(s):

Diego Pacheco Prado ◽

Luis Ángel Ruiz

Keyword(s):

Random Forest ◽

National Park ◽

Classification Problem ◽

Machine Learning Algorithms ◽

High Resolution Data ◽

Natural Land ◽

Geographic Datasets ◽

Land Cover Maps ◽

Selection Of

GEOBIA is an alternative for creating and updating land cover maps. In this work we assessed combinations of geographic datasets of the Cajas National Park (Ecuador) to determine the most appropriate dataset-algorithm combination for classification tasks in the Ecuadorian Andean region. The datasets included high resolution data such as a photogrammetric orthomosaic, a DEM and the derived slope. These data were compared with free Sentinel imagery to classify natural land covers. We evaluated two aspects of the classification problem: the appropriate algorithm and the dataset combination. We evaluated the SMO, C4.5 and Random Forest algorithms for the selection of attributes and classification of objects. In the comparison of classification algorithms, the best kappa results were obtained with SMO (0.8182) and Random Forest (0.8117). In the evaluation of datasets, the kappa values of the photogrammetric orthomosaic and of the combination of Sentinel 1 and 2 were similar using the C4.5 algorithm.
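The kappa values quoted in the abstract above measure agreement between predicted and reference classes beyond what chance would produce. As a hedged illustration, the sketch below computes Cohen's kappa from a confusion matrix. The function name and the sample matrix are assumptions for this example and are unrelated to the study's data.

```python
def cohens_kappa(confusion):
    """Cohen's kappa from a square confusion matrix.

    confusion[i][j] is the number of samples of reference class i
    that were assigned to class j. Assumes a non-empty matrix in which
    chance agreement is below 1 (otherwise kappa is undefined).
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(n_classes)) / total
    # Chance agreement: product of row and column marginals per class.
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion)
        for i in range(n_classes)
    ) / (total * total)
    return (observed - expected) / (1 - expected)


# Hypothetical two-class confusion matrix (rows: reference, columns: predicted).
matrix = [
    [45, 5],
    [10, 40],
]
kappa = cohens_kappa(matrix)
```

For this made-up matrix, observed agreement is 0.85 and chance agreement is 0.5, which gives a kappa of about 0.7.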

Sentiment Analysis on Movie Reviews Using Twitter

Journal of Computational and Theoretical Nanoscience ◽

10.1166/jctn.2020.9326 ◽

2020 ◽

Vol 17 (7) ◽

pp. 2869-2875

Author(s):

Sajay Thomas Samuel ◽

Booma Poolan Marikannan

Keyword(s):

Machine Learning ◽

Social Media ◽

Sentiment Analysis ◽

Learning Algorithm ◽

Instant Messaging ◽

Machine Learning Algorithms ◽

Depth Information ◽

Implementation Phase ◽

Online Social Media ◽

Past Data

Machine learning can help people to perform complex tasks and solve problems, as it uses historical data to learn patterns and make predictions based on past data. This research addresses the problem of movie reviews on social media, specifically Twitter: it gathers tweets about movie reviews and displays a rating based on the sentiment of the tweets. Twitter is an online social media website where people from all walks of life communicate by tweeting short updates within the character limit of 280 characters. Twitter is continuously growing as a business and has become one of the biggest platforms for communication and instant messaging. Due to the large number of users, there are voluminous amounts of data available that can be used for more in-depth information and insights and to obtain sentiments by analysing the tweets. In today’s world, many applications use sentiment analysis in various fields, such as to get insights about a particular brand or product. Doing sentiment analysis in the traditional ways can be time consuming and very complex. The aim of this research is to investigate the domain of sentiment analysis and incorporate a machine learning algorithm to create a system that is able to obtain and display the ratings of a particular movie. The machine learning algorithms used are the Naïve Bayes classifier and SVM. The algorithm with better accuracy will be chosen for the implementation phase.
