Understanding the expression of grievances in the Arabic Twitter-sphere using machine learning

2019 ◽  
Vol 5 (2) ◽  
pp. 108-119
Author(s):  
Yeslam Al-Saggaf ◽  
Amanda Davies

Purpose
The purpose of this paper is to discuss the design, application and findings of a case study in which a machine learning algorithm is applied to identify grievances expressed on Twitter in an Arabian context.

Design/methodology/approach
To understand the characteristics of the Twitter users who expressed the identified grievances, data mining techniques and social network analysis were utilised. The study extracted a total of 23,363 tweets, which were stored as a data set. The machine learning algorithm applied to this data set was followed by a data mining process to explore the characteristics of the Twitter users. The network of users was mapped, and the individual level of interactivity and the network density were calculated.

Findings
The machine learning algorithm revealed 12 themes, all of which were underpinned by the blockade of Qatar by a coalition of Arab countries. The data mining analysis revealed that the tweets fell into three clusters; the main cluster comprised users with large numbers of followers and friends who nevertheless did not mention other users in their tweets. The social network analysis revealed that while a large proportion of users engaged in direct messages with others, the network ties between them were not strong.

Practical implications
Borum (2011) notes that invoking grievances is the first step in the radicalisation process. It is hoped that, by understanding these grievances, the study will shed light on what radical groups could invoke to win the sympathy of aggrieved people.

Originality/value
In combination, the machine learning algorithm offered insights into the grievances expressed within the tweets in an Arabian context, while the data mining and social network analyses revealed the characteristics of the Twitter users, informing the identification and early management of radicalisation.
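
The two network measures reported above, individual interactivity and network density, can be sketched with networkx. This is a minimal illustration, not the study's pipeline; the user names and mention edges are hypothetical.

```python
import networkx as nx

# Hypothetical mention network: an edge (a, b) means user a mentioned user b.
mentions = [("u1", "u2"), ("u1", "u3"), ("u2", "u3"), ("u4", "u1")]

G = nx.DiGraph()
G.add_edges_from(mentions)

# Network density: fraction of possible directed ties that are present.
density = nx.density(G)

# Individual interactivity: out-degree counts how many users each account engages.
interactivity = dict(G.out_degree())
```

A low density despite frequent mentioning, as in the findings above, would show up as many accounts with non-zero out-degree but a small `density` value overall.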

Author(s):  
Sushant Keni ◽  
Priyanka Jadhav ◽  
Mayur Patil ◽  
Prof. Sonal Chaudhari

We evaluate the feasibility of using Facebook data to enhance the effectiveness of a recruitment system, particularly for résumé verification and personality recognition, using social network analysis methods. In industry, an employee's personality matters in the workplace: it contributes to the growth of the company and to better service for clients. Currently, résumé verification relies on trusted third parties who carry out background checks; based on their report, the hiring company decides whether to retain a candidate. This manual process takes a great deal of time, and it generally reveals nothing about a candidate's behaviour in society, for example whether he or she posts inappropriate content on social media; in short, it does not reflect personality. Social media today is a huge platform on which users spend considerable time, on sites such as Facebook and LinkedIn, posting pages, commenting, liking posts, uploading certifications and adding friends. We design a system that verifies the genuineness of a user by scraping data from Facebook, LinkedIn or both. We explore a person's posts and classify them (for example, as technology-related or violence-related), along with the comments they leave and the language they use when handling a query; these are parsed and classified by a machine learning algorithm, a support vector machine (SVM) trained on a previously labelled data set. Finally, we present this information to the hiring company so it can make its own decision based on the result.
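
The post-classification step described above can be sketched with a TF-IDF + linear SVM pipeline in scikit-learn. This is a toy sketch, not the authors' system; the training posts and the two labels are invented for illustration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical labelled posts; a real system would scrape and label
# the candidate's Facebook/LinkedIn activity.
posts = [
    "Deployed a new machine learning model to production",
    "Excited about the latest Python release",
    "That group deserves to be attacked",
    "Violence is the only answer to this",
]
labels = ["technology", "technology", "violence", "violence"]

# TF-IDF features feeding a linear SVM, as the abstract proposes.
clf = make_pipeline(TfidfVectorizer(), LinearSVC())
clf.fit(posts, labels)

prediction = clf.predict(["Learning a new programming language"])[0]
```

In practice the classifier would be trained on a much larger labelled corpus and extended with the additional classes the abstract mentions.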


2021 ◽  
Vol 12 (26) ◽  
pp. 1-13
Author(s):  
Carlos Alberto Arango Pastrana ◽  
Carlos Fernando Osorio Andrade

To reduce the rate of contagion by Covid-19, the Colombian government adopted, among other measures, mandatory isolation. Opinion on the measure is divided because, despite helping to curb the spread of the virus, it generates mental-health and economic problems that are difficult to overcome. The objective of this paper was to analyze the sentiments underlying Twitter comments related to isolation, identifying the topics and words most frequently used in this context. A machine learning algorithm was built to identify sentiments in 72,564 posts, and a social network analysis was applied to establish the most frequent topics in the data sets. The results suggest that the algorithm is highly accurate in classifying sentiments. In addition, as the isolation extended, comments related to the quarantine grew proportionally. Fear was identified as the predominant sentiment throughout the period of confinement in Colombia.
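
A supervised sentiment classifier over Spanish tweets, one plausible reading of the algorithm described above, can be sketched as follows. The labelled tweets and the two sentiment classes are hypothetical; the paper does not specify its model.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical labelled tweets; the study classified 72,564 Spanish posts.
tweets = [
    "tengo miedo de salir de casa",           # "I am afraid to leave home"
    "la cuarentena me tiene muy preocupado",  # "the quarantine has me very worried"
    "feliz de pasar tiempo con mi familia",   # "happy to spend time with my family"
    "agradecido por la salud de los míos",    # "grateful for my family's health"
]
sentiments = ["fear", "fear", "joy", "joy"]

# Bag-of-words features feeding a naive Bayes classifier.
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(tweets, sentiments)

predicted = model.predict(["mucho miedo por la cuarentena"])[0]
```

Aggregating such per-tweet predictions over the confinement period is what would surface fear as the predominant sentiment.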


Author(s):  
Ana Maria Magdalena Saldana-Perez ◽  
Marco Antonio Moreno-Ibarra ◽  
Miguel Jesus Torres-Ruiz

It is worthwhile to exploit user-generated content (UGC) and to use it to infer new data. Volunteered geographic information (VGI) is a concept derived from UGC whose main value lies in its continuously updated data. The present approach exploits VGI by collecting data from a social network and an RSS service; the short texts collected from the social network are written in Spanish. Text mining and information retrieval processes are applied to the data in order to remove special characters and to extract relevant information about traffic events in the study area; the data are then geocoded. The texts are classified by a machine learning algorithm into five classes, each representing a specific traffic event or situation.
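
The cleaning step that precedes classification, removing special characters while keeping accented Spanish letters, can be sketched as below. The five class names and the keyword lookup standing in for the machine learning classifier are hypothetical; the paper does not name its classes or model.

```python
import re

# Five hypothetical traffic-event classes, as in the approach.
CLASSES = ["accident", "roadwork", "protest", "traffic_jam", "road_closure"]

def clean_text(raw: str) -> str:
    """Remove special characters and collapse whitespace, keeping word characters."""
    text = re.sub(r"[^\w\s]", " ", raw)          # \w matches accented letters in Python 3
    return re.sub(r"\s+", " ", text).strip().lower()

# Naive keyword lookup standing in for the trained classifier.
KEYWORDS = {"accidente": "accident", "obras": "roadwork", "marcha": "protest",
            "tráfico": "traffic_jam", "cierre": "road_closure"}

def classify(tweet: str) -> str:
    cleaned = clean_text(tweet)
    for keyword, traffic_class in KEYWORDS.items():
        if keyword in cleaned:
            return traffic_class
    return "traffic_jam"  # hypothetical default class

label = classify("¡¡Accidente en Av. Insurgentes!! #CDMX")
```

The cleaned text would then be geocoded and fed to the real classifier rather than a keyword table.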



2021 ◽  
Vol 1088 (1) ◽  
pp. 012035
Author(s):  
Mulyawan ◽  
Agus Bahtiar ◽  
Githera Dwilestari ◽  
Fadhil Muhammad Basysyar ◽  
Nana Suarna

2020 ◽  
Vol 13 (4) ◽  
pp. 503-534
Author(s):  
Mehmet Ali Köseoğlu ◽  
John Parnell

Purpose
The authors evaluate the evolution of the intellectual structure of strategic management (SM) by employing a document co-citation analysis, through a network analysis of academic citations in articles published in the Strategic Management Journal (SMJ).

Design/methodology/approach
The authors employed co-citation analysis through social network analysis.

Findings
The authors outlined the evolution of the academic foundations of the structure and emphasized several domains. The economic foundation of SM research, with macro and micro perspectives, has generated a solid knowledge stock in the literature. Industrial organization (IO) psychology has been another dominant foundation; its robust development and extension in the literature have focused on cognitive issues in actors' behaviors as a behavioral foundation of SM. Methodological issues in SM research became dominant between 2004 and 2011, but their influence has been inconsistent. The authors conclude by recommending future directions to increase maturity in the SM research domain.

Originality/value
This is the first paper to elucidate the intellectual structure of SM by adopting co-citation analysis through social network analysis.
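
The co-citation counting that underlies such an analysis can be sketched as follows: two works are co-cited when they appear together in the same article's reference list, and the pair counts become edge weights in a network. The reference lists below are invented, not drawn from the SMJ corpus.

```python
from collections import Counter
from itertools import combinations

import networkx as nx

# Hypothetical reference lists of four journal articles.
articles = [
    ["Porter1980", "Barney1991", "Wernerfelt1984"],
    ["Barney1991", "Wernerfelt1984", "Teece1997"],
    ["Porter1980", "Barney1991"],
    ["Teece1997", "Barney1991"],
]

# Count each co-cited pair across all reference lists.
cocitations = Counter()
for refs in articles:
    for pair in combinations(sorted(refs), 2):
        cocitations[pair] += 1

# Weighted co-citation network, ready for social network analysis.
G = nx.Graph()
for (a, b), weight in cocitations.items():
    G.add_edge(a, b, weight=weight)
```

Clustering or centrality measures on `G` would then expose the intellectual domains the abstract describes.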


Large volumes of data are generated and stored in various fields; such collections are called big data. Big data in healthcare comprises enormous clinical data sets of patient records, maintained as Electronic Health Records (EHR). More than 80% of clinical data is in unstructured format and is stored in hundreds of forms. The challenge for data storage and analysis is to handle large data sets efficiently and scalably. The Hadoop MapReduce framework stores and processes any kind of big data rapidly; it is not merely a storage system but also a platform for data processing, and it is scalable and fault-tolerant. Prediction over the data sets is handled by a machine learning algorithm. This work focuses on the Extreme Learning Machine (ELM) algorithm, which can be used in an optimized way to predict disease risk by combining ELM with a Cuckoo Search optimization-based Support Vector Machine (CS-SVM). The proposed work also considers the scalability and accuracy of big data models; the proposed algorithm performs the computation well and achieves good results in both veracity and efficiency.
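
A plain extreme learning machine, the ELM component above before the CS-SVM hybridisation, can be sketched in a few lines of NumPy: random input weights, a sigmoid hidden layer, and output weights solved in closed form by least squares. The data here is a synthetic stand-in, not EHR records.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy binary "disease risk" data; a real study would use EHR features.
X = rng.normal(size=(200, 5))
y = (X[:, 0] + X[:, 1] > 0).astype(float)

# ELM: input weights and biases are random and never trained.
n_hidden = 50
W = rng.normal(size=(5, n_hidden))
b = rng.normal(size=n_hidden)

def hidden(X):
    """Sigmoid hidden-layer activations."""
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))

# Output weights solved in one shot by least squares -- the defining ELM step.
H = hidden(X)
beta, *_ = np.linalg.lstsq(H, y, rcond=None)

preds = (hidden(X) @ beta > 0.5).astype(float)
accuracy = (preds == y).mean()
```

The closed-form solve is what makes ELM fast on large data sets; the paper's contribution layers cuckoo search and an SVM on top of this base.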


In today's world, social media is one of the most important communication tools, helping people interact with each other and share their thoughts, knowledge and other information. Some of the most popular social media platforms are Facebook, Twitter, WhatsApp and WeChat. Since social media has a large impact on people's daily lives, it can also be a source of fake news or misinformation. It is therefore important that any information presented on social media be evaluated for genuineness and originality, in terms of the probability of its correctness and the reliability required to trust the exchanged information. In this work we identify features that help predict whether a given tweet is a rumor or information. Two machine learning algorithms, a decision tree and a support vector machine, are executed using the WEKA tool for the classification.
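
The feature-based classification described above can be sketched with scikit-learn in place of WEKA. The feature set (retweet count, presence of a URL, verified author, exclamation marks) and the labelled examples are hypothetical, not the paper's.

```python
from sklearn.tree import DecisionTreeClassifier

# Hypothetical per-tweet features:
# [retweet_count, has_url, user_verified, exclamation_marks]
# Labels: 1 = rumor, 0 = information.
X = [
    [500, 0, 0, 4],
    [320, 0, 0, 3],
    [10, 1, 1, 0],
    [25, 1, 1, 1],
]
y = [1, 1, 0, 0]

tree = DecisionTreeClassifier(random_state=0)
tree.fit(X, y)

# An unverified, URL-free, heavily punctuated tweet looks rumor-like here.
pred = tree.predict([[400, 0, 0, 5]])[0]
```

An SVM could be dropped in for the decision tree on the same feature matrix, mirroring the paper's two-algorithm comparison.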


2020 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Sandeepkumar Hegde ◽  
Monica R. Mundada

Purpose
According to the World Health Organization, by 2025 the contribution of chronic disease to all deaths is expected to rise by 73%, and chronic disease is considered a global burden of disease at a rate of 60%. These diseases persist for a long duration, are almost incurable and can only be controlled. Cardiovascular disease, chronic kidney disease (CKD) and diabetes mellitus are considered the three major chronic diseases whose risk increases among adults as they get older; CKD is considered the major disease among them. Overall, 10% of the world's population is affected by CKD, and this figure is likely to double by 2030. The paper aims to propose a novel feature selection approach in combination with a machine learning algorithm that can predict chronic disease early with utmost accuracy. Hence, a novel adaptive probabilistic divergence-based feature selection (APDFS) algorithm is proposed in combination with a hyper-parameterized logistic regression model (HLRM) for the early prediction of chronic disease.

Design/methodology/approach
A novel APDFS feature selection algorithm is proposed which explicitly handles the features associated with the class label through relevance and redundancy analysis. The algorithm applies statistical divergence-based information theory to identify the relationship between distant features of the chronic disease data set. The data set required for the experiments was obtained from several medical labs and hospitals in India. The HLRM is used as the machine learning classifier. The predictive ability of the framework is compared with various algorithms and across various chronic disease data sets. The experimental results illustrate that the proposed framework is efficient and achieved competitive results compared to existing work in most cases.

Findings
The performance of the proposed framework is validated using metrics such as recall, precision, F1 measure and ROC. Its predictive performance is analyzed by passing it data sets for various chronic diseases, such as CKD, diabetes and heart disease. The diagnostic ability of the proposed approach is demonstrated by comparing its results with existing algorithms. The experimental figures illustrate that the proposed framework performed exceptionally well in early prediction of CKD, with an accuracy of 91.6%.

Originality/value
The capability of machine learning algorithms depends on feature selection (FS) algorithms to identify the relevant traits in the data set, which affect the predictive result. Feature selection is the process of choosing the relevant features from the data set by removing redundant and irrelevant ones. Although many approaches have already been proposed toward this objective, they are computationally complex because they follow a one-step scheme for selecting features. In this paper, a novel APDFS algorithm is proposed which explicitly handles the features associated with the class label through relevance and redundancy analysis. The proposed algorithm handles feature selection in two separate indices; hence, the computational complexity of the algorithm is reduced to O(nk+1). The algorithm applies statistical divergence-based information theory to identify the relationship between distant features of the chronic disease data set. The data set required for the experiments was obtained from several medical labs and hospitals in Karkala taluk, India. The HLRM is used as the machine learning classifier. The predictive ability of the framework is compared with various algorithms and across various chronic disease data sets. The experimental results illustrate that the proposed framework is efficient and achieved competitive results compared to existing work in most cases.
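
One hedged reading of a "hyper-parameterized logistic regression model" is logistic regression whose hyperparameters are tuned by grid search, scored with the same metrics the abstract reports. The sketch below uses a synthetic data set, not the Karkala clinical data, and is an assumption about the method, not the authors' HLRM.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score, precision_score, recall_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the chronic disease data set.
X, y = make_classification(n_samples=400, n_features=12, n_informative=6,
                           random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Grid search over regularisation strength and penalty type.
grid = GridSearchCV(
    LogisticRegression(solver="liblinear", max_iter=1000),
    param_grid={"C": [0.01, 0.1, 1, 10], "penalty": ["l1", "l2"]},
    scoring="f1",
    cv=5,
)
grid.fit(X_train, y_train)

# Validate with the metrics named in the Findings section.
y_pred = grid.predict(X_test)
metrics = {
    "recall": recall_score(y_test, y_pred),
    "precision": precision_score(y_test, y_pred),
    "f1": f1_score(y_test, y_pred),
}
```

The APDFS feature selection step would run before this, passing only the selected feature columns into the grid-searched classifier.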

