Gender Inference for Arabic Language in Social Media

2014 ◽  
Vol 5 (4) ◽  
pp. 1-10
Author(s):  
Abdul Rahman I. Al-Ghadir ◽  
Abdullatif Alabdullatif ◽  
Aqil M. Azmi

The widespread use of social media has attracted a new group of researchers seeking information on the who, what, and where of its users. Some information retrieval researchers are interested in identifying the gender, age group, and educational level of users. The objective of this work is to identify the author's gender from Arabic posts on social media. Most work on gender classification has targeted English-language social media content; work on other languages, such as Arabic, is almost nonexistent. People typically express themselves on social media in colloquial language, so this study is geared towards identifying gender using the Saudi dialect of Arabic. To solve the gender identification problem, the authors introduce a novel method called k-Top Vector (k-TV), which is built from the k top words, selected using word occurrences and stem frequencies. Part of this work required compiling a dataset of Saudi dialect words, for which a well-known, widely used social site was relied on. To test the system, the authors compiled 1200 samples split equally between both genders. They trained Support Vector Machine (SVM) and k-NN classifiers using different numbers of samples for training and testing. SVM performed better, achieving an accuracy of 95% for gender classification.
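The abstract does not spell out how k-TV is constructed, so the following is only a minimal sketch of the general idea of a top-k word vector: keep the k most frequent words in a corpus, represent each post by its counts over those words, and classify with a nearest-neighbour vote. The function names, the toy English posts, and the labels "A" and "B" are all hypothetical; stemming, which the abstract mentions, is omitted, and this is not the authors' implementation.

```python
from collections import Counter


def top_k_vocabulary(documents, k):
    """Return the k most frequent whitespace-separated tokens in the corpus."""
    counts = Counter(token for doc in documents for token in doc.split())
    return [token for token, _ in counts.most_common(k)]


def vectorize(document, vocabulary):
    """Map a document to its occurrence counts over the chosen vocabulary."""
    counts = Counter(document.split())
    return [counts[token] for token in vocabulary]


def knn_predict(train_vectors, train_labels, query, n_neighbors=3):
    """Return the majority label among the nearest training vectors."""
    # Squared Euclidean distance is enough for ranking neighbours.
    distances = sorted(
        (sum((a - b) ** 2 for a, b in zip(vector, query)), label)
        for vector, label in zip(train_vectors, train_labels)
    )
    votes = Counter(label for _, label in distances[:n_neighbors])
    return votes.most_common(1)[0][0]


# Made-up toy posts with hypothetical class labels (assumption, not real data).
posts = [
    "match goal team goal",
    "team match win",
    "recipe cake sugar",
    "cake recipe oven sugar",
]
labels = ["A", "A", "B", "B"]

vocab = top_k_vocabulary(posts, k=4)
train = [vectorize(post, vocab) for post in posts]
prediction = knn_predict(train, labels, vectorize("goal team", vocab), n_neighbors=1)
```

Note that `k` (the vocabulary size) and `n_neighbors` (the k of k-NN) are separate parameters; the abstract uses the letter k for both ideas.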

2017 ◽  
pp. 811-821
Author(s):  
Abdul Rahman I. Al-Ghadir ◽  
Abdullatif Alabdullatif ◽  
Aqil M. Azmi



2019 ◽  
Vol 36 (1) ◽  
pp. 58-72
Author(s):  
Saeed Rouhani ◽  
Ehsan Abedin

Purpose: Crypto-currencies, decentralized electronic currency systems, denote a radical change in financial exchange and the economic environment. Consequently, designers and policy-makers in this area would benefit from understanding what social media users on Twitter think about them. The purpose of this study is to investigate social opinions about different kinds of crypto-currencies and to tune the best-customized classification technique for categorizing tweets by sentiment.
Design/methodology/approach: This paper used a lexicon-based approach to analyze reviews of a wide range of crypto-currencies in Twitter data, measuring positive, negative, or neutral sentiment. The resulting sentiment labels were then used to train a supervised technique that can predict the sentiment of tweets about the main crypto-currencies.
Findings: The findings show that more than 50 per cent of people have positive beliefs about crypto-currencies. Furthermore, this paper confirms that marketers can predict the sentiment of tweets about these crypto-currencies with high accuracy if they use appropriate classification techniques such as the support vector machine (SVM).
Practical implications: Considering the growing interest in crypto-currencies (Bitcoin, Cardano, Ethereum, Litecoin and Ripple), the findings of this paper are of considerable value to enterprises in the financial area seeking the promised benefits of social media analysis. In addition, this paper helps crypto-currency vendors analyze public opinion on social media platforms. In this sense, the paper strengthens our understanding of what happens on social media regarding crypto-currencies.
Originality/value: For managers and decision-makers, this paper suggests that news and campaigns for their crypto-currency on Twitter would positively affect people's perspectives. Firms investing in these crypto-currencies could therefore use social media to amplify their promotional activities. The findings steer market managers to see social media as a predictive tool that can analyze the market through the opinions of Twitter users.
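To make the lexicon-based step concrete, here is a minimal sketch of how a word list can label a tweet as positive, negative, or neutral. The two tiny word sets and the function name are invented for illustration; the abstract does not say which lexicon or scoring rule the authors used.

```python
# Hypothetical miniature lexicons (assumption: real lexicons are far larger).
POSITIVE_WORDS = {"good", "great", "gain", "bullish"}
NEGATIVE_WORDS = {"bad", "scam", "loss", "bearish"}


def lexicon_sentiment(text):
    """Label text by the difference between positive and negative word counts."""
    # Naive tokenization: lowercase and split on whitespace, no punctuation handling.
    tokens = text.lower().split()
    score = sum(token in POSITIVE_WORDS for token in tokens) - sum(
        token in NEGATIVE_WORDS for token in tokens
    )
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


labels = [
    lexicon_sentiment(tweet)
    for tweet in ["Great gain today", "this is a scam", "price unchanged"]
]
```

Labels produced this way could then serve as training targets for a supervised classifier, which is the two-stage pattern the abstract describes.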


Author(s):  
P. Pitchaipandi

This chapter analyses the impact and usage of social media among postgraduate arts students at Alagappa University, Karaikudi, using a survey method. The study found that the majority (69.79%) of respondents are female and that 72.92% of respondents are in the 21 to 23 age group. It is observed that 32.29% of respondents use social media with a preference for YouTube. A plurality (48.96%) of respondents use smartphones/mobiles rather than iPods, desktops, laptops, or other devices. 35.42% of respondents spend between 1 and 5 hours weekly using social media. The study also notes the positive and negative aspects of social media use among postgraduate students of arts disciplines at the university.


2018 ◽  
Vol 197 ◽  
pp. 15006 ◽  
Author(s):  
Rosa Andrie Asmara ◽  
Irtafa Masruri ◽  
Cahya Rahmad ◽  
Indrazno Siradjuddin ◽  
Erfan Rohadi ◽  
...  

Identifying gender from pedestrian video is a key step in studying the demographics of an area. With current video surveillance technology, identifying gender from a distance is possible. This research proposes using computer vision to identify gender from walking gait. The features used to determine gender from walking gait are divided into five body parts: the head, chest, back, waist and buttocks, and legs. Two different methods are used to perform real-time gender gait recognition, namely Gait Energy Image (GEI) and Gait Information Image (GII), while a Support Vector Machine (SVM) is used as the classifier. The experimental results show that gender identification from walking gait reaches 55% accuracy with the GEI method and 60% accuracy with the GII method. From these results, it can be concluded that the GII method with the SVM classifier gives the better accuracy for gender classification.
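As background for the GEI representation named above: a Gait Energy Image is commonly defined as the pixel-wise average of aligned binary silhouette frames over a gait cycle. The sketch below shows only that averaging step on plain nested lists; silhouette extraction, alignment, the five-part body split, and the GII variant are outside its scope, and the 2×2 frames are invented toy data.

```python
def gait_energy_image(silhouettes):
    """Average aligned binary silhouette frames pixel by pixel.

    Each frame is a list of rows, and each row is a list of 0/1 pixel values.
    All frames are assumed to share the same size and alignment.
    """
    if not silhouettes:
        raise ValueError("at least one silhouette frame is required")
    n_frames = len(silhouettes)
    n_rows = len(silhouettes[0])
    n_cols = len(silhouettes[0][0])
    return [
        [sum(frame[r][c] for frame in silhouettes) / n_frames for c in range(n_cols)]
        for r in range(n_rows)
    ]


# Two made-up 2x2 frames standing in for a gait cycle.
frames = [
    [[0, 1], [1, 1]],
    [[0, 1], [0, 1]],
]
gei = gait_energy_image(frames)
```

Pixels that are foreground in every frame end up at 1.0, while pixels that move in and out of the silhouette get intermediate values; a flattened GEI can then be passed to a classifier such as an SVM.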


2019 ◽  
Vol 9 (6) ◽  
pp. 1215-1223 ◽  
Author(s):  
Fiaz Majeed ◽  
Muhammad Waqas Asif ◽  
Muhammad Awais Hassan ◽  
Syed Ali Abbas ◽  
M. Ikramullah Lali

News transmission is rapidly shifting from electronic media to social media. News channels in general, and health news channels in particular, now publish health-related news on social media sites. This news is beneficial for patients, medical professionals, and the general public. A large amount of health-related data is available on social media, and it can be used to extract significant information and make predictions that assist physicians, patients, and healthcare organizations in decision making. However, little research has applied machine learning approaches to health news data. In this paper, we therefore propose a framework for the data collection, modeling, and visualization of health-related patterns. For the analysis, the tweets of 13 news channels were collected from Twitter. The dataset holds approximately 28k tweets under 280 hashtags. A comprehensive set of experiments was performed to extract patterns from the data. A comparative analysis was carried out between the baseline method and four classification algorithms: Naive Bayes (NB), Support Vector Machine (SVM), Logistic Regression (LR), and Decision Tree (J48). The results were evaluated using the standard measures of accuracy, precision, recall, and F-measure. The results of the study are encouraging and better than those of other studies of this kind.
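For reference, the four evaluation measures named above can be computed from true and predicted labels as in the following minimal sketch for the binary case. The function name and the toy label lists are illustrative assumptions; the abstract does not report per-class counts.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F-measure for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (
        2 * precision * recall / (precision + recall) if precision + recall else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


# Toy labels: one true positive, one false negative, two true negatives.
metrics = binary_metrics([1, 1, 0, 0], [1, 0, 0, 0])
```

In this toy case accuracy is 0.75 while recall is only 0.5, which illustrates why accuracy alone can hide weak performance on the positive class.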


2018 ◽  
Vol 25 (10) ◽  
pp. 1274-1283 ◽  
Author(s):  
Abeed Sarker ◽  
Maksim Belousov ◽  
Jasper Friedrichs ◽  
Kai Hakala ◽  
Svetlana Kiritchenko ◽  
...  

Objective: We executed the Social Media Mining for Health (SMM4H) 2017 shared tasks to enable the community-driven development and large-scale evaluation of automatic text processing methods for the classification and normalization of health-related text from social media. An additional objective was to publicly release manually annotated data.
Materials and Methods: We organized 3 independent subtasks: automatic classification of self-reports of 1) adverse drug reactions (ADRs) and 2) medication consumption, from medication-mentioning tweets, and 3) normalization of ADR expressions. Training data consisted of 15 717 annotated tweets for (1), 10 260 for (2), and 6650 ADR phrases and identifiers for (3); and exhibited typical properties of social-media-based health-related texts. Systems were evaluated using 9961, 7513, and 2500 instances for the 3 subtasks, respectively. We evaluated performances of classes of methods and ensembles of system combinations following the shared tasks.
Results: Among 55 system runs, the best system scores for the 3 subtasks were 0.435 (ADR class F1-score) for subtask-1, 0.693 (micro-averaged F1-score over two classes) for subtask-2, and 88.5% (accuracy) for subtask-3. Ensembles of system combinations obtained best scores of 0.476, 0.702, and 88.7%, outperforming individual systems.
Discussion: Among individual systems, support vector machines and convolutional neural networks showed high performance. Performance gains achieved by ensembles of system combinations suggest that such strategies may be suitable for operational systems relying on difficult text classification tasks (eg, subtask-1).
Conclusions: Data imbalance and lack of context remain challenges for natural language processing of social media text. Annotated data from the shared task have been made available as reference standards for future studies (http://dx.doi.org/10.17632/rxwfb3tysd.1).
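The abstract does not state how the system outputs were combined into ensembles. One simple and common option is majority voting over per-system predictions, sketched below purely as an assumption; the function name and the three toy runs are hypothetical.

```python
from collections import Counter


def majority_vote(system_predictions):
    """Combine per-system label lists into one label per instance.

    system_predictions is a list of equal-length label lists, one per system.
    When labels tie, the label that appears first among the systems wins.
    """
    combined = []
    for labels in zip(*system_predictions):
        combined.append(Counter(labels).most_common(1)[0][0])
    return combined


# Three made-up system runs over three instances (1 = ADR mention, 0 = none).
runs = [
    [1, 0, 1],
    [1, 1, 0],
    [0, 0, 1],
]
ensemble = majority_vote(runs)
```

Using an odd number of systems on a two-class task avoids ties altogether, which keeps the combined output independent of system ordering.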


2019 ◽  
Vol 4 (1) ◽  
pp. 47-60
Author(s):  
Ihsanudin

Arabic is closely identified with Islam because the Qur'an and Sunnah are in the Arabic language. However, fashion trends have now entered the era of globalization, and it is no longer unusual for current fashion to mix western and eastern cultures. A dress worn by Agnes Monica on a television show attracted controversy from many parties because it was considered taboo: the clothing was transparent and bore Arabic writing right on her thigh. Comments from netizens filled social media pages, and the MUI also commented on the polemic. Mead's Symbolic Interactionism Theory is well suited as an analytical tool for this case. The purpose of this study is to describe and analyze the phenomenon of Agnes Monica's dress with the Arabic writing "Al-Muttaḥidah" using Mead's theory, including a description of the specifically human social act, action, gesture, significant symbols, meaning, self, and society. This research is library research and uses analytical descriptive methods.
Keywords: Fashion Trends, Arabic Writing, Mead's Symbolic Interactionism.


Author(s):  
Muhammet Sinan Basarslan ◽  
Fatih Kayaalp

Social media has become an important part of our everyday life due to the widespread use of the Internet. Of the social media services, Twitter is among the most used around the world. People share their opinions by writing tweets about numerous subjects, such as politics, sports, and the economy. Millions of tweets per day create a huge dataset, which has drawn the attention of data scientists to these data for sentiment analysis. Sentiment analysis aims to identify users' social media posts about a specific topic and categorize them as positive, negative, or neutral. This study investigates the effect of the type of text representation on the performance of sentiment analysis. Two datasets were used in the experiments. The first consists of user reviews of movies from IMDB, labeled by Kotzias; the second consists of English-language tweets about health topics from 2019, collected using the Twitter API. The Python programming language was used both to implement the classification models, using the Naïve Bayes (NB), Support Vector Machines (SVM), and Artificial Neural Networks (ANN) algorithms, and to categorize the sentiments as positive, negative, or neutral. Feature extraction was performed using the Term Frequency-Inverse Document Frequency (TF-IDF) and Word2Vec (W2V) modeling techniques. The success percentages of the classification algorithms were then compared. According to the experimental results, the Artificial Neural Network had the best accuracy on both datasets.
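To illustrate the TF-IDF representation mentioned above, the following minimal sketch uses the plain textbook variant: term frequency within a document multiplied by log(N / document frequency). Libraries often apply smoothing or normalization on top of this, and the abstract does not say which variant the authors used, so the weighting scheme, function name, and toy reviews here are assumptions.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {token: weight} dict per document using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each token appears at least once.
    doc_freq = Counter(token for tokens in tokenized for token in set(tokens))

    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append(
            {
                token: (count / len(tokens)) * math.log(n_docs / doc_freq[token])
                for token, count in counts.items()
            }
        )
    return weights


# Two made-up reviews; "movie" occurs in both, so its weight drops to zero.
weights = tf_idf(["good movie", "bad movie"])
```

Words shared by every document receive zero weight under this variant, so the representation emphasizes the terms that distinguish one text from another, such as "good" and "bad" here.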


2021 ◽  
Vol 115 ◽  
pp. 01003
Author(s):  
Nikoleta Hutmanová ◽  
Peter Dorčák

The paper focuses on how children's social media usage shapes their interactions with consumer brands. It first describes how and when young children develop brand awareness and which factors are the most important predictors of this development. These findings are then connected with the impact of social media. We elaborate in more depth on how children approach online communication with brands in the social media context. Our assumptions are supported by research conducted on a group of New Zealand children, both boys and girls, aged 11-14 years. This qualitative approach was implemented using in-depth interviews and identifies three key modes of brand interaction behaviour when young consumers use social media. Based on these findings, we assume that there is a connection between the use of social media and children's relationship with consumer brands.


2021 ◽  
Vol 5 (2) ◽  
Author(s):  
Aulia Mustika Ilmiani ◽  
Mukhtar I Miolo ◽  

Social media is often used as a learning tool, including for learning Arabic. This study explores social media-based Arabic learning carried out by lecturers in the Arabic Language Education (PBA) study program at IAIN Palangka Raya. Using descriptive qualitative research methods, the study describes the steps for implementing Arabic language learning through social media: accessing, selecting, understanding, analyzing, verifying, evaluating, and producing. The findings show that social media is used, first, as a publication forum for project-based assignments; second, as a means of digital literacy for obtaining information; and third, as a way for students to optimize social media as a medium for literacy. The social media used for learning Arabic in the PBA study program at IAIN Palangka Raya are as follows: WhatsApp is used as a learning resource for Maharah Istima, Instagram for Maharah Kalam, and Facebook for Maharah Qiraah and Kitabah, while YouTube is mostly used for publishing project-based assignments. Recommended further research includes the effectiveness of social media in improving Arabic learning skills, as well as digital literacy-based Arabic learning using other information technologies.

