Social Media Content Analysis and Classification Using Data Mining and ML

2021 ◽  
Vol 2 (2) ◽  
pp. 75-84
Author(s):  
Sambhaji D. Rane

Students' natural conversations on social media such as Twitter and WhatsApp are useful for understanding their learning experiences and feelings. Collecting and analyzing data from such media can be a difficult task. However, the large scale of the data requires automatic data analysis techniques to classify Twitter data. The proposed system combines qualitative analysis with large-scale data mining and ML techniques. It focuses on engineering students' Twitter posts, collected from engineering colleges, to understand issues and problems in their learning. The authors first conduct a qualitative analysis using ML studio on tweets collected from engineering colleges using the terms #DStudentsproblems, engineeringProblem, Aluminisuggestions, and ladyEngineer. The collected tweets relate to engineering students' college lives. The proposed system uses a multi-label classification algorithm to classify tweets reflecting students' problems such as soft skill issues, heavy study load, lack of social engagement, and sleep problems.

2018 ◽  
Vol 37 (2) ◽  
pp. 87-102 ◽  
Author(s):  
Li Zhao ◽  
Chao Min

With the advent of modern cognitive computing technologies, fashion informatics researchers contribute to the academic and professional discussion about how a large-scale data set is able to reshape the fashion industry. Data-mining-based social network analysis is a promising area of fashion informatics to investigate relations and information flow among fashion units. By adopting this pragmatic approach, we provide dynamic network visualizations of the case of Paris Fashion Week. Three time periods were researched to monitor the formulation and mobilization of social media users’ discussions of the event. Initial textual data on social media were crawled, converted, calculated, and visualized by Python and Gephi. The most influential nodes (hashtags) that function as junctions and the distinct hashtag communities were identified and represented visually as graphs. The relations between the contextual clusters and the role of junctions in linking these clusters were investigated and interpreted.
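The hashtag-network step described above can be illustrated with a minimal sketch. It assumes that two hashtags are linked whenever they appear in the same post and that a hashtag's weighted degree serves as a rough influence score; the abstract does not state which measures the authors used, and the sample posts and function names below are made up for illustration.

```python
import re
from collections import Counter
from itertools import combinations

# Matches a '#' followed by word characters; a simplification of real hashtag rules.
HASHTAG = re.compile(r"#(\w+)")


def cooccurrence_edges(posts):
    """Count how often each unordered pair of hashtags appears in the same post."""
    edges = Counter()
    for text in posts:
        # Lowercase and deduplicate so '#PFW' and '#pfw' are one node.
        tags = sorted({tag.lower() for tag in HASHTAG.findall(text)})
        edges.update(combinations(tags, 2))
    return edges


def weighted_degree(edges):
    """Sum edge weights per hashtag (assumed here as a rough influence score)."""
    degree = Counter()
    for (a, b), weight in edges.items():
        degree[a] += weight
        degree[b] += weight
    return degree


# Hypothetical posts, not data from the study.
posts = [
    "Front row at #PFW #Chanel",
    "#PFW street style #Paris",
    "#Chanel show in #Paris #PFW",
]

edges = cooccurrence_edges(posts)
degree = weighted_degree(edges)

for (a, b), weight in sorted(edges.items()):
    print(f"{a} -- {b}: {weight}")
print("highest weighted degree:", degree.most_common(1)[0][0])
```

The resulting edge list could be written out as a source/target/weight table for a graph tool such as Gephi, where community detection and layout would then take place.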


2020 ◽  
Author(s):  
Guilherme Carminati ◽  
Roberto Augusto ◽  
Norberto Dallabrida ◽  
Raimundo Teive

This paper tackles the problem of dropout of undergraduate students in a private university by using Educational Data Mining (EDM) techniques. EDM is an emerging area concerned with developing methods for exploring the increasingly large-scale data that come from educational settings and using those methods to better understand students and the settings in which they learn. In this work, EDM is used to identify profiles of students who withdraw from their engineering courses. The dataset is composed of 53 attributes, involving financial and academic aspects of 2,925 engineering students. Preliminary results have identified some attributes that are related to dropout in engineering courses, such as: the semester of the year (students are more prone to drop out in the first half of the year), attendance, grades (in this case the median is more important than the mean value), the number of credits in the previous semester, and the semester in which the student is currently enrolled (students below the 5th semester have a higher tendency to drop out).


Assessment ◽  
2021 ◽  
pp. 107319112110272
Author(s):  
Maartje Boer ◽  
Gonneke W. J. M. Stevens ◽  
Catrin Finkenauer ◽  
Ina M. Koning ◽  
Regina J. J. M. van den Eijnden

Large-scale validation research on instruments measuring problematic social media use (SMU) is scarce. Using a nationally representative sample of 6,626 Dutch adolescents aged 12 to 16 years, the present study examined the psychometric properties of the nine-item Social Media Disorder scale. The structural validity was solid, because one underlying factor was identified, with adequate factor loadings. The internal consistency was good, but the test information was most reliable at moderate to high scores on the scale’s continuum. The factor structure was measurement invariant across different subpopulations. Three subgroups were identified, distinguished by low, medium, and high probabilities of endorsing the criteria. Higher levels of problematic SMU were associated with higher probabilities of mental, school, and sleep problems, confirming adequate criterion validity. Girls, lower educated adolescents, 15-year-olds, and non-Western adolescents were most likely to report problematic SMU. Given its good psychometric properties, the scale is suitable for research on problematic SMU among adolescents.


2020 ◽  
Author(s):  
Mahmoud Arafat

In response to the Coronavirus disease (COVID-19) outbreak and the Transportation Research Board’s (TRB) urgent need for work related to transportation and pandemics, this paper contributes with a sense of urgency and provides a starting point for research on the topic. The main goal of this paper is to support transportation researchers and the TRB community during this COVID-19 pandemic by reviewing the performance of software models used for extracting large-scale data from Twitter streams related to COVID-19. The study extends the previous research efforts in social media data mining by providing a review of contemporary tools, including their computing maturity and their potential usefulness. The paper also includes an open repository for the processed data frames to facilitate the quick development of new transportation research studies. The output of this work is recommended to be used by the TRB community when deciding to further investigate topics related to COVID-19 and social media data mining tools.


Author(s):  
Oanh Tran

This paper presents work on mining informal social media data to provide insights into students’ learning experiences. Analyzing this kind of data is challenging because of the data volume and the complexity and diversity of the languages used on these social sites. In this study, we developed a framework that integrates qualitative analysis with different data mining techniques in order to understand students’ learning experiences. This is the first work focusing on mining Vietnamese forums for students in natural science fields to understand issues and problems in their education. The results indicated that these students usually encounter problems such as heavy study load, sleep problems, negative emotions, English barriers, and career targets. The experimental results are quite promising in classifying students’ posts into predefined categories developed for academic purposes. The framework is expected to help educational managers get necessary information in a timely fashion and then make more informed decisions in supporting their students’ studies.


2019 ◽  
Author(s):  
Valerio Sbragaglia ◽  
Ricardo A. Correia ◽  
Salvatore Coco ◽  
Robert Arlinghaus

Data about recreational fisheries are scarce in many areas of the world. In the absence of monitoring data collected in situ, alternative data sources, such as digital applications and social media platforms, have the potential to produce valuable insights. Yet, the potential of social media for drawing insights about recreational fisheries is still underexplored. In this study, we applied data mining on YouTube videos to better understand recreational fisheries targeting common dentex (Dentex dentex), an iconic species of Mediterranean recreational fisheries. We chose this model species because of ongoing controversies about the relative impact of recreational angling and recreational spearfishing on its conservation status. In Italy alone, from 2010 to 2016 recreational spearfishers posted 1051 videos compared to 692 videos posted by recreational anglers. Only the upload pattern of spearfishing videos followed a seasonal pattern with peaks in July, suggesting seasonality of spearfishing catches of D. dentex – a trend not found for anglers. The average mass of the fish declared in recreational angling videos (6.43 kg) was significantly larger than the one in spearfishing videos (4.50 kg). Videos posted by recreational spearfishers received significantly more likes and comments than those posted by recreational anglers, suggesting that the social engagement among recreational spearfishers was stronger than among anglers. We also found that the mass of the fish positively predicted social engagement in recreational spearfishing videos, but not in videos posted by recreational anglers. This could be caused by the generally smaller odds of catching large D. dentex by spearfishing, possibly explaining why posting videos with particularly large specimens triggered larger social engagement by recreational spearfishers. Our case study demonstrates that data mining on YouTube can be a powerful tool to provide complementary data on controversial and data-poor aspects of recreational fisheries.


Author(s):  
Marc-André Kaufhold ◽  
Christian Reuter

For almost 15 years, social media have been regularly used during emergencies. One of the most recent, and instructive, examples of their widespread use during a large-scale scenario in Europe was the 2013 European floods. Public reporting during the event indicated, and our analysis confirms, that Twitter, Facebook (FB), Google Maps and other services were frequently used by affected citizens and volunteers to coordinate help activities among themselves. We conducted a qualitative analysis of selected emergent volunteer communities in Germany on FB and Twitter, among others, and subsequently conducted interviews with FB group founders and activists. Our aim was to analyze the use of social media during this particular event, especially by digital volunteers. Our study illustrates the relevance of social media for German citizens in cases of disaster, focusing especially on the role of the moderator. Our specific emphasis was the embedding of social media in the organizing work done by these volunteers, covering both the patterns of social media use and the challenges that result. We show that different social media were used in different ways: Twitter was used mainly for status updates, while FB pages were mostly intended to provide an overview. FB groups also coordinated a multitude of activities.


Author(s):  
Rainer Simon ◽  
Drazen Ignjatovic ◽  
Georg Neubauer ◽  
Clemens Gutschi ◽  
Johannes Pan ◽  
...  
