Integrating Big Data Into Evaluation: R Code for Topic Identification and Modeling

2021 ◽  
pp. 109821402110316
Author(s):  
Dakota W. Cintron ◽  
Bianca Montrosse-Moorhead

Despite the rising popularity of big data, there is speculation that evaluators have been slow to adopt these new statistical approaches. Several possible reasons have been offered: ethical concerns, institutional capacity, and evaluator capacity and values. In this method note, we address one of these barriers and aim to build evaluator capacity to integrate big data analytics into their studies. We focus on a specific topic modeling technique, latent Dirichlet allocation (LDA), because qualitative textual data are ubiquitous in evaluation. Given current equity debates, both within evaluation and in the communities in which we practice, we illustrate the use of LDA by analyzing 1,796 tweets that use the hashtag #equity with the R packages topicmodels and ldatuning. A freely available workbook for implementing LDA topic modeling is provided as Supplemental Material Online.
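As a rough illustration of what LDA topic modeling does, the sketch below fits a tiny model with collapsed Gibbs sampling using only the Python standard library. It is not the authors' workbook and does not use topicmodels or ldatuning. The function names (`fit_lda`, `top_terms`), the hyperparameter values, and the four toy documents are all assumptions made up for this example, and the toy documents are not the #equity tweets from the study.

```python
import random


def fit_lda(docs, num_topics, alpha=0.1, beta=0.01, iterations=200, seed=0):
    """Fit LDA by collapsed Gibbs sampling (hypothetical helper).

    docs is a list of token lists. Returns the vocabulary plus the
    document-topic and topic-word count tables.
    """
    rng = random.Random(seed)
    vocab = sorted({word for doc in docs for word in doc})
    word_id = {word: i for i, word in enumerate(vocab)}
    vocab_size = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [[0] * vocab_size for _ in range(num_topics)]
    topic_total = [0] * num_topics
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        doc_assignments = []
        for word in doc:
            k = rng.randrange(num_topics)
            doc_assignments.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[word]] += 1
            topic_total[k] += 1
        assignments.append(doc_assignments)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, word in enumerate(doc):
                v = word_id[word]
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # Unnormalized conditional probability of each topic.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                # Record the newly sampled assignment.
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1

    return vocab, doc_topic, topic_word


def top_terms(vocab, topic_word, n=3):
    """Return the n most frequent terms for each topic."""
    return [
        [vocab[v] for v in sorted(range(len(vocab)), key=lambda v: -row[v])[:n]]
        for row in topic_word
    ]


# Made-up toy corpus, already lowercased and tokenized.
docs = [
    "equity school funding students access".split(),
    "school students funding equity teachers".split(),
    "health equity care access hospital".split(),
    "hospital care health patients access".split(),
]

vocab, doc_topic, topic_word = fit_lda(docs, num_topics=2)
for k, terms in enumerate(top_terms(vocab, topic_word)):
    print(f"topic {k}: {terms}")
```

On a corpus this small the topics are unstable and depend on the seed. Choosing the number of topics, which ldatuning addresses in the R workflow, is not covered by this sketch.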

2019 ◽  
Vol 5 (1) ◽  
pp. 3-28 ◽  
Author(s):  
Ronggui Huang

The Weibo platform is a social space for interaction and expression. This requires scholars to examine communication patterns and the communicated content among Weibo users simultaneously. Based on theories of ‘network and culture’ and relational sociology, this article contends that network fields and the communicated cultural meanings are mutually constituted. A latent Dirichlet allocation (LDA) topic model and social network analysis techniques were used to examine 51,288 Weibo posts published by users concerned with workers, revealing the relationship between community structures and the communities’ focal topics. Specifically, the LDA topic modeling shows that the focal topics regarding labor issues can be categorized into four groups: workers’ culture (art and entertainment) and welfare; predicaments and problems; strikes (rights-defending actions) and labor organizations; and institutions and labor rights. Analysis of interaction patterns among users identified five major online communities which, based on the primary topics communicated within them, were labeled the Labor Homeland Community, the Labor Culture Community, the Labor Rights Protection Community, the Labor Interest Concerned Community, and the Labor Institution Concerned Community. The results also showed two new trends in relation to labor issues: first, workers’ culture and their integration into urban life have garnered increasing online attention with the growth of new-generation workers; and second, the Weibo platform provides an interaction channel for labor researchers and labor non-governmental organizations, and such interaction helps the latter reflect critically on the current conditions or plights of workers from an institutional/structural perspective. The article concludes with a discussion of the significance of using big data analytics to study online culture and social mentality.


2022 ◽  
pp. 758-787
Author(s):  
Chitresh Verma ◽  
Rajiv Pandey

Data visualization represents a data set visually so that people can interpret it in a meaningful way. Statistical visualization calls for tools, algorithms, and techniques that can support and render graphical modeling. This chapter explores the features of R and RStudio in detail. The combination of Hadoop and R for big data analytics and data visualization is demonstrated through code snippets. The integration of R and Hadoop is explained in detail with the help of a utility called the Hadoop streaming jar. Various R packages and their integration with Hadoop operations in the R environment are explained through examples. Data streaming is illustrated using the different readers of the Hadoop streaming package. A case-based statistical project is considered in which the data set is visualized after dual execution using Hadoop MapReduce and an R script.


2019 ◽  
Vol 54 (5) ◽  
pp. 20
Author(s):  
Dheeraj Kumar Pradhan

2020 ◽  
Vol 49 (5) ◽  
pp. 11-17
Author(s):  
Thomas Wrona ◽  
Pauline Reinecke

Big Data & Analytics (BDA) has become a scarcely questioned institution for corporate efficiency and competitive advantage. Too many prominent examples, such as the success of Google or Amazon, seem to confirm the importance of data and algorithms for achieving long-term competitive advantages. Both practitioners and scholars seem to be jumping almost euphorically onto the “data train.” When risks are addressed, they mostly concern ethical questions. What is often overlooked is that the advantages under discussion arise primarily from an operational efficiency perspective. Strategic effects are discussed, if at all, in relation to business model innovations, whose actual degree of innovation remains to be assessed. This article shows that BDA can indeed create competitive advantages, but that it also entails major strategic risks that currently receive little attention.


2019 ◽  
Vol 7 (2) ◽  
pp. 273-277
Author(s):  
Ajay Kumar Bharti ◽  
Neha Verma ◽  
Deepak Kumar Verma

2017 ◽  
Vol 49 (004) ◽  
pp. 825-830
Author(s):  
A. AHMED ◽  
R.U. AMIN ◽  
M. R. ANJUM ◽  
I. ULLAH ◽  
I. S. BAJWA
