Socialized Word Embeddings

Word embeddings have attracted a lot of attention. On social media, each user’s language use can be significantly affected by the user’s friends. In this paper, we propose a socialized word embedding algorithm that considers both a user’s personal characteristics of language use and the user’s social relationships on social media. To incorporate personal characteristics, we represent each user with a user vector. Then, for each user, the word embeddings are trained on that user’s corpus by combining the global word vectors with the local user vector. To incorporate social relationships, we add a regularization term that imposes similarity between two friends. In this way, we can train the global word vectors and user vectors jointly. To demonstrate the effectiveness of the approach, we used the latest large-scale Yelp data to train our vectors and designed several experiments to show how user vectors affect the results.
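
The two ingredients named in the abstract above can be illustrated with a minimal sketch. The abstract does not give the formulas, so the choices here are assumptions: the global word vector and the user vector are combined by element-wise addition, and the regularization term is half the squared Euclidean distance between two friends' user vectors, scaled by a hypothetical weight `lam`. The function names are illustrative only.

```python
from typing import Dict, List, Tuple

Vector = List[float]


def personalized_vector(word_vec: Vector, user_vec: Vector) -> Vector:
    """Combine a global word vector with a user vector.

    Assumption: the combination is an element-wise sum.
    """
    return [w + u for w, u in zip(word_vec, user_vec)]


def social_regularizer(
    user_vecs: Dict[str, Vector],
    friendships: List[Tuple[str, str]],
    lam: float = 1.0,
) -> float:
    """Penalty that grows when friends' user vectors drift apart.

    Assumption: 0.5 * lam * sum of squared Euclidean distances
    over all listed friend pairs.
    """
    total = 0.0
    for a, b in friendships:
        total += sum((x - y) ** 2 for x, y in zip(user_vecs[a], user_vecs[b]))
    return 0.5 * lam * total


if __name__ == "__main__":
    users = {"alice": [0.0, 0.0], "bob": [3.0, 4.0]}
    print(personalized_vector([1.0, 2.0], [0.5, -0.5]))
    print(social_regularizer(users, [("alice", "bob")], lam=2.0))
```

In a joint training loop, a penalty of this kind would be added to the embedding loss so that gradient updates pull friends' user vectors together while the global word vectors stay shared.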

Biased Random Walk based Social Regularization for Word Embeddings

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2018/634 ◽

2018 ◽

Cited By ~ 3

Author(s):

Ziqian Zeng ◽

Xin Liu ◽

Yangqiu Song

Keyword(s):

Social Media ◽

Random Walk ◽

Natural Language ◽

Language Use ◽

Personal Characteristics ◽

Sentiment Classification ◽

Word Embeddings ◽

Biased Random Walk ◽

The Social ◽

The One

Nowadays, people publish a lot of natural language text on social media. Socialized word embeddings (SWE) have been proposed to deal with two phenomena of language use: everyone has his/her own personal characteristics of language use, and socially connected users are likely to use language in similar ways. We observe that the spread of language use is transitive: one user can affect his/her friends, and those friends can in turn affect their friends. However, SWE models this transitivity only implicitly. The social regularization in SWE applies only to one-hop neighbors, so users outside the one-hop social circle are not affected directly. In this work, we adopt random walk methods to generate paths on the social graph and thereby model the transitivity explicitly. Each user on a path is affected by his/her adjacent user(s) on the path. Moreover, under the update mechanism of SWE, the fewer friends a user has, the fewer update opportunities he/she gets. Hence, we propose a biased random walk method to provide these users with more update opportunities. Experiments show that our random walk based social regularizations perform better on sentiment classification.
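
As a rough illustration of the path-generation idea in the abstract above, the sketch below walks a small social graph and favors neighbors with few friends. The abstract does not specify how the bias is defined, so the inverse-degree weighting, the function name `biased_random_walk`, and the toy graph are all assumptions made for illustration.

```python
import random
from typing import Dict, List

Graph = Dict[str, List[str]]


def biased_random_walk(
    graph: Graph, start: str, length: int, rng: random.Random
) -> List[str]:
    """Generate a path of at most `length` users starting from `start`.

    Assumption: the next user is sampled with probability inversely
    proportional to that user's friend count, so users with few
    friends appear on paths (and get updated) more often.
    """
    path = [start]
    current = start
    while len(path) < length:
        neighbors = graph.get(current, [])
        if not neighbors:
            break  # dead end: no friends to move to
        weights = [1.0 / max(len(graph.get(n, [])), 1) for n in neighbors]
        current = rng.choices(neighbors, weights=weights, k=1)[0]
        path.append(current)
    return path


if __name__ == "__main__":
    toy_graph = {
        "a": ["b", "c", "d"],
        "b": ["a", "c"],
        "c": ["a", "b"],
        "d": ["a"],
    }
    print(biased_random_walk(toy_graph, "a", 5, random.Random(0)))
```

Adjacent users on each generated path would then be treated as pairs for the social regularization, which extends its reach beyond one-hop neighbors.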

Towards Robust Word Embeddings for Noisy Texts

Applied Sciences ◽

10.3390/app10196893 ◽

2020 ◽

Vol 10 (19) ◽

pp. 6893

Author(s):

Yerai Doval ◽

Jesús Vilares ◽

Carlos Gómez-Rodríguez

Keyword(s):

Social Media ◽

Word Embedding ◽

Simple Extension ◽

Word Embeddings ◽

Explicit Approach ◽

Wide Range

Research on word embeddings has mainly focused on improving their performance on standard corpora, disregarding the difficulties posed by noisy texts in the form of tweets and other types of non-standard writing from social media. In this work, we propose a simple extension to the skipgram model in which we introduce the concept of bridge-words, which are artificial words added to the model to strengthen the similarity between standard words and their noisy variants. Our new embeddings outperform baseline models on noisy texts on a wide range of evaluation tasks, both intrinsic and extrinsic, while retaining good performance on standard texts. To the best of our knowledge, this is the first explicit approach to dealing with these types of noisy texts at the word embedding level that goes beyond support for out-of-vocabulary words.
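
The bridge-word idea in the abstract above can be sketched as follows. The abstract does not say how bridge-words are derived, so the rule used here (lowercase the token, collapse repeated characters, add a `<b>` prefix) is a hypothetical stand-in, as are the function names. The point is only that a standard word and its noisy variant map to the same artificial token, which then shares their contexts in skipgram training pairs.

```python
import re
from typing import List, Tuple


def bridge_word(token: str) -> str:
    """Map a token to an artificial bridge token.

    Assumption: lowercase and collapse runs of a repeated character,
    so 'cool' and 'Coool' yield the same bridge token.
    """
    collapsed = re.sub(r"(.)\1+", r"\1", token.lower())
    return "<b>" + collapsed


def skipgram_pairs_with_bridges(
    tokens: List[str], window: int = 2
) -> List[Tuple[str, str]]:
    """Emit (target, context) pairs, plus a copy of each pair in which
    the target is replaced by its bridge token."""
    pairs = []
    for i, target in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j == i:
                continue
            pairs.append((target, tokens[j]))
            pairs.append((bridge_word(target), tokens[j]))
    return pairs


if __name__ == "__main__":
    for pair in skipgram_pairs_with_bridges(["so", "coool"], window=1):
        print(pair)
```

Because the bridge token sees the contexts of both spellings, training on these extra pairs would tend to pull the vectors of the standard and noisy forms toward each other.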

Understanding Large-Scale Social Relationship Data by Combining Conceptual Graphs and Domain Ontologies

Discrete Dynamics in Nature and Society ◽

10.1155/2021/2857611 ◽

2021 ◽

Vol 2021 ◽

pp. 1-18

Author(s):

Zhao Huang ◽

Liu Yuan

Keyword(s):

Social Media ◽

Social Relationship ◽

Large Scale ◽

Conceptual Graphs ◽

Formal Representation ◽

Lehigh University ◽

Large Scale Data ◽

Conceptual Graph ◽

Discovery Method ◽

The Relationship

People worldwide communicate online and create a great amount of data on social media. Understanding such large-scale data and uncovering patterns in social relationships have received much attention from academics and practitioners. However, representing and managing large-scale social relationship data in a formal manner remains a challenge. Therefore, this study proposes a social relationship representation model that combines conceptual graphs and domain ontologies. Such a formal representation of a social relationship graph can provide a flexible and adaptive way to carry out social relationship discovery. Using the term-definition capability of ontologies and the graphical structure of the conceptual graph, this paper presents a social relationship description with formal syntax and semantics. The reasoning procedure that operates on this formal representation can exploit both ontology reasoning and graph homomorphism-based reasoning. A social relationship graph constructed from the Lehigh University Benchmark (LUBM) is used to test the efficiency of the relationship discovery method.

Lifelong Domain Word Embedding via Meta-Learning

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2018/627 ◽

2018 ◽

Cited By ~ 1

Author(s):

Hu Xu ◽

Bing Liu ◽

Lei Shu ◽

Philip S. Yu

Keyword(s):

Lifelong Learning ◽

Large Scale ◽

General Purpose ◽

Word Embedding ◽

Experimental Results ◽

Word Embeddings ◽

High Quality ◽

Domain Specific ◽

The Past ◽

Meta Learning

Learning high-quality domain word embeddings is important for achieving good performance in many NLP tasks. General-purpose embeddings trained on large-scale corpora are often sub-optimal for domain-specific applications. However, domain-specific tasks often do not have large in-domain corpora for training high-quality domain embeddings. In this paper, we propose a novel lifelong learning setting for domain embedding. That is, when performing the new domain embedding, the system has seen many past domains, and it tries to expand the new in-domain corpus by exploiting the corpora from the past domains via meta-learning. The proposed meta-learner characterizes the similarities of the contexts of the same word in many domain corpora, which helps retrieve relevant data from the past domains to expand the new domain corpus. Experimental results show that domain embeddings produced from such a process improve the performance of the downstream tasks.

Change of Attitude, Technology and Practice: Identifying the Change for Increased Value Creation with Customer Co-creation

TRANSNATIONAL MARKETING JOURNAL ◽

10.33182/tmj.v5i1.388 ◽

2017 ◽

Vol 5 (1) ◽

pp. 70-82

Author(s):

Soumi Paul ◽

Paola Peretti ◽

Saroj Kumar Datta

Keyword(s):

Social Media ◽

New Product Development ◽

Value Creation ◽

Large Scale ◽

Customer Orientation ◽

Customer Relationships ◽

Future Research ◽

Management Competencies ◽

Business Decisions ◽

Prime Concern

Building customer relationships and customer equity is the prime concern in today’s business decisions. The emergence of the internet, especially social media such as Facebook and Twitter, has changed traditional marketing thought to a great extent. The importance of customer orientation is reflected in the axiom, “The customer is the king”. A good number of organizations are engaging customers in their new product development activities via social media platforms. Co-creation, a new perspective in which customers are active co-creators of the products they buy and use, is currently challenging the traditional paradigm. The concept of co-creation, which draws on the customer’s knowledge, creativity and judgment to generate value, is considered not only an upcoming trend for introducing new products or services but also a way of fitting customers’ needs and increasing value for money. Knowledge and innovation are inseparable. Knowledge management competencies and capacities are essential to any organization that aspires to be distinguished and innovative. The present work attempts to identify the change in the value creation procedure, along with one area of business where co-creation can return significant dividends: extending the brand or brand category through brand extension or line extension. This article, through an in-depth literature review, identifies the changes in every perspective of this paradigm shift and presents a conceptual model of company-customer-brand-based co-creation activity via social media. The main objective is to offer an agenda for future research on this emerging trend and to chart the way from theory to practice. The paper acts as a proposal; it allows organizations to pursue this change on a large scale and obtain early feedback on the idea presented.

Location and Language Use in Social Media

10.3115/v1/w14-2504 ◽

2014 ◽

Author(s):

Ed Chi

Keyword(s):

Social Media ◽

Language Use

First grammar books in the Habsburg Monarchy: individual initiative and regulatory interference by the state (1760s–1770s)

A day in the calendar. Celebrations and memorial days as an instrument of national consolidation in Central, Eastern and South-Eastern Europe from the nineteenth to the twenty-first century - Central-European Studies ◽

10.31168/2619-0877.2019.2.6 ◽

2020 ◽

Vol 2019 (2 (11)) ◽

pp. 137-157

Author(s):

Olga V. Khavanova ◽

Keyword(s):

Eighteenth Century ◽

State Policy ◽

Large Scale ◽

Language Use ◽

Linguistic Diversity ◽

Mother Tongue ◽

German Language ◽

Habsburg Monarchy ◽

Private Initiative ◽

The One

The second half of the eighteenth century in the lands under the sceptre of the House of Austria was a period of development of a language policy addressing the ethno-linguistic diversity of the monarchy’s subjects. On the one hand, the sphere of use of the German language was becoming wider, embracing more and more segments of administration, education, and culture. On the other hand, the authorities were perfectly aware of the fact that communication in the languages and vernaculars of the nationalities living in the Austrian Monarchy was one of the principal instruments of spreading decrees and announcements from the central and local authorities to the less-educated strata of the population. Consequently, a large-scale reform of primary education was launched, aimed at making the whole population literate, regardless of social status, nationality (mother tongue), or confession. In parallel with the centrally coordinated state policy of education and language use, subjects (both language experts and amateur polyglots) joined the process of writing grammar books, which were intended to ease communication between the different nationalities of the Habsburg lands. This article considers some examples of such editions with primary attention given to the correlation between private initiative and governmental policies, mechanisms of verifying the textbooks to be published, their content, and their potential readers. This paper demonstrates that for grammar-book authors, it was very important to be integrated into the patronage networks at the court and in administrative bodies and stresses that the Vienna court controlled the process of selection and financing of grammar books to be published depending on their quality and ability to satisfy the aims and goals of state policy.

Public Awareness: What Climate Change Scientists Should Consider

Sustainability ◽

10.3390/su12208369 ◽

2020 ◽

Vol 12 (20) ◽

pp. 8369

Author(s):

Mohammad Rahimi

Keyword(s):

Climate Change ◽

Social Media ◽

Large Scale ◽

Public Awareness ◽

Connected Network ◽

Multi Scale ◽

Valuable Addition ◽

Scientific Approach ◽

Design Solutions ◽

Mitigate Climate Change

In this Opinion, the importance of public awareness to design solutions to mitigate climate change issues is highlighted. A large-scale acknowledgment of the climate change consequences has great potential to build social momentum. Momentum, in turn, builds motivation and demand, which can be leveraged to develop a multi-scale strategy to tackle the issue. The pursuit of public awareness is a valuable addition to the scientific approach to addressing climate change issues. The Opinion is concluded by providing strategies on how to effectively raise public awareness on climate change-related topics through an integrated, well-connected network of mavens (e.g., scientists) and connectors (e.g., social media influencers).

Investigating the impact of pre-processing techniques and pre-trained word embeddings in detecting Arabic health information on social media

Journal Of Big Data ◽

10.1186/s40537-021-00488-w ◽

2021 ◽

Vol 8 (1) ◽

Author(s):

Yahya Albalawi ◽

Jim Buckley ◽

Nikola S. Nikolov

Keyword(s):

Social Media ◽

Deep Learning ◽

Comprehensive Evaluation ◽

Classification Problem ◽

Data Sets ◽

Word Embeddings ◽

Data Set ◽

Lower Accuracy ◽

Health Related ◽

The Impact

This paper presents a comprehensive evaluation of data pre-processing and word embedding techniques in the context of Arabic document classification in the domain of health-related communication on social media. We evaluate 26 text pre-processings applied to Arabic tweets within the process of training a classifier to identify health-related tweets. For this task we use the (traditional) machine learning classifiers KNN, SVM, Multinomial NB and Logistic Regression. Furthermore, we report experimental results with the deep learning architectures BLSTM and CNN for the same text classification problem. Since word embeddings are more typically used as the input layer in deep networks, in the deep learning experiments we evaluate several state-of-the-art pre-trained word embeddings with the same text pre-processing applied. To achieve these goals, we use two data sets: one for both training and testing, and another only for testing the generality of our models. Our results point to the conclusion that only four out of the 26 pre-processings improve the classification accuracy significantly. For the first data set of Arabic tweets, we found that Mazajak CBOW pre-trained word embeddings as the input to a BLSTM deep network led to the most accurate classifier with F1 score of 89.7%. For the second data set, Mazajak Skip-Gram pre-trained word embeddings as the input to BLSTM led to the most accurate model with F1 score of 75.2% and accuracy of 90.7% compared to F1 score of 90.8% achieved by Mazajak CBOW for the same architecture but with lower accuracy of 70.89%. Our results also show that the performance of the best of the traditional classifiers we trained is comparable to the deep learning methods on the first data set, but significantly worse on the second data set.
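
The abstract above reports both accuracy and F1 score for the same models, and the two metrics can rank classifiers differently. The sketch below shows how each is computed from confusion-matrix counts. The counts in the usage section are invented for illustration and are not taken from the paper.

```python
def accuracy(tp: int, fp: int, fn: int, tn: int) -> float:
    """Fraction of all predictions that are correct."""
    total = tp + fp + fn + tn
    return (tp + tn) / total if total else 0.0


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall for the positive class.

    Equivalent to 2*tp / (2*tp + fp + fn); true negatives play no role.
    """
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


if __name__ == "__main__":
    # Hypothetical counts on an imbalanced test set of 100 tweets,
    # 10 of which are health-related (the positive class).
    model_a = dict(tp=5, fp=0, fn=5, tn=90)
    model_b = dict(tp=9, fp=6, fn=1, tn=84)
    for name, m in (("A", model_a), ("B", model_b)):
        acc = accuracy(m["tp"], m["fp"], m["fn"], m["tn"])
        f1 = f1_score(m["tp"], m["fp"], m["fn"])
        print(name, round(acc, 3), round(f1, 3))
```

With these made-up counts, model A has the higher accuracy while model B has the higher F1, because F1 ignores true negatives and so is less flattered by a large negative class.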

Social Media in and Around a Temporary Large-Scale Refugee Shelter in the Netherlands

Social Media + Society ◽

10.1177/20563051211024961 ◽

2021 ◽

Vol 7 (2) ◽

pp. 205630512110249

Author(s):

Peer Smets ◽

Younes Younes ◽

Marinka Dohmen ◽

Kees Boersma ◽

Lenie Brouwer

Keyword(s):

Social Media ◽

The Netherlands ◽

Crisis Communication ◽

Asylum Seekers ◽

Large Scale ◽

State Policies ◽

Refugee Crisis ◽

Top Down ◽

And Migration

During the 2015 refugee crisis in Europe, temporary refugee shelters arose in the Netherlands to shelter the large influx of asylum seekers. The largest shelter was located in the eastern part of the country. This shelter, where tents housed nearly 3,000 asylum seekers, was managed with a firm top-down approach. However, many residents of the shelter—mainly Syrians and Eritreans—developed horizontal relations with the local receiving society, using social media to establish contact and exchange services and goods. This case study shows how various types of crisis communication played a role and how the different worlds came together. Connectivity is discussed in relation to inclusion, based on resilient (non-)humanitarian approaches that link society with social media. Moreover, we argue that the refugee crisis can be better understood by looking through the lens of connectivity, practices, and migration infrastructure instead of focusing only on state policies.
