Big Data is Too Small: Research Implications of Class Inequality for Online Data Collection

2018 ◽  
Author(s):  
Jen Schradie

With growing interest in data science and online analytics, researchers increasingly use data derived from the Internet. Whether for qualitative or quantitative analysis, online data, including “Big Data,” often exclude marginalized populations, especially the poor and working class, as the digital divide remains a persistent problem. This methodological commentary on the current state of digital data and methods disentangles the hype from the reality of digitally produced data for sociological research. In the process, it offers strategies for addressing the weaknesses of Internet-derived data so that marginalized populations are represented.

The article analyzes the phenomenon of big data and the role it plays in the study of modern society. It traces the evolution of the term and shows that the initial technological connotation of big data, which emphasized volumes too extreme to process with traditional methods and tools, has been significantly transformed by the inclusion of a human component. Big data is now understood as the set of digital traces people leave when they use information technology: surfing the Internet, downloading mobile applications or music, chatting with friends on social networks, using GPS, buying goods in online stores, and so on. Big data is generated on the Internet, but it contains information not only about the Internet itself but also about society and the social processes reflected there. In other words, big data is datafied information about everything and everyone. Big data is thus a new source of information about the world around us and about the development of social processes, which makes it a valuable empirical base for sociological research. However, empirical research based on big data is impossible without solving a number of methodological problems, in particular the “re-profiling” of the online data-processing methods used by Internet platforms so that they can address sociological questions. This necessitates the development of “digital methods,” a new direction in the methodology of sociological analysis that is taking shape as big data spreads. Widespread datafication changes society and redefines human existence in the era of big data, so big data cannot be considered outside the context of its “dark side.” The article concludes that an urgent task today is involving the sociological community in the development of a fair data policy.


2020 ◽  
pp. 239-254
Author(s):  
David W. Dorsey

With the rise of the internet and the related explosion in available data, the field of data science has expanded rapidly, and analytic techniques designed for “big data” contexts have become popular, including techniques for analyzing both structured and unstructured data. This chapter explores the application of these techniques to the development and evaluation of career pathways. For example, data scientists can analyze online job listings and resumes to examine changes in skill requirements and careers over time and to trace job progressions across an enormous number of people. Similarly, analysts can evaluate whether information on career pathways accurately captures realistic job progressions. Within organizations, the increasing amount of data makes it possible to pinpoint the specific skills, behaviors, and attributes that maximize performance in specific roles. The chapter concludes with ideas for future applications of big data to career pathways.


JAMIA Open ◽  
2018 ◽  
Vol 1 (2) ◽  
pp. 136-141 ◽  
Author(s):  
Philip R O Payne ◽  
Elmer V Bernstam ◽  
Justin B Starren

Abstract There are an ever-increasing number of reports and commentaries describing the challenges and opportunities associated with the use of big data and data science (DS) in biomedical education, research, and practice. These publications argue that data-centric approaches to complex biomedical problems yield substantial benefits, including an acceleration in the rate of scientific discovery, improved clinical decision making, and the ability to promote healthy behaviors at a population level. In addition, an aligned and emerging body of literature describes the ethical, legal, and social issues that must be addressed to use big data responsibly in such contexts. At the same time, there is growing recognition that the challenges and opportunities attributed to the expansion of DS often parallel those experienced by the biomedical informatics community. Indeed, many informaticians would consider some of these issues central to the theories and methods of biomedical informatics science and practice. In response, a series of presentations and focus group discussions was conducted during the 2016 American College of Medical Informatics Winter Symposium to define the current state of, and identify future directions for, interaction and collaboration between people who identify as working on big data, DS, and biomedical informatics. We provide a perspective on these discussions and the outcomes of that meeting, and present a set of recommendations generated through a thematic analysis of those outcomes.
Ultimately, this report is intended to: (1) summarize the key issues currently being discussed by the biomedical informatics community as it seeks to better understand how to constructively interact with the emerging biomedical big data and DS fields; and (2) propose a framework and agenda that can serve to advance this type of constructive interaction, with mutual benefit accruing to both fields.


Seminar.net ◽  
2021 ◽  
Vol 17 (2) ◽  
Author(s):  
Dan Verständig

This paper discusses an explorative approach to strengthening critical data literacy using data science methods and a theoretical framing at the intersection of educational science and media theory. The goal is to pave a way from data-driven to data-discursive perspectives on data and datafication in higher education. The paper therefore focuses on a case study: a higher education course project on education and data science, run in 2019 and 2020 and based on problem-based learning. It closes with a discussion of the challenges of strengthening data literacy in higher education, offering insights into data practices and the pitfalls of working with, and reflecting on, digital data.


2021 ◽  
Vol 9 ◽  
Author(s):  
Andrea Rau

Data collected in very large quantities are called big data, and big data has changed the way we think about and answer questions in many different fields, like weather forecasting and biology. With all this information available, we need computers to help us store, process, analyze, and understand it. Data science combines tools from fields like statistics, mathematics, and computer science to find interesting patterns in big data. Data scientists write step-by-step instructions called algorithms to teach computers how to learn from data. To help computers understand these instructions, algorithms must be translated from the original question asked by a data scientist into a programming language—and the results must be translated back, so that humans can understand them. That means that data scientists are data detectives, programmers, and translators all in one!
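The translation step this abstract describes — turning a question into step-by-step instructions a computer can follow, then translating the answer back for humans — can be sketched with a small, hypothetical example (not taken from the article) in Python:

```python
# A data scientist's question: "Which value appears most often
# in this dataset?" Written as an algorithm in a programming
# language, so a computer can answer it.

from collections import Counter

def most_common_value(data):
    """Find the value that occurs most frequently in a dataset."""
    counts = Counter(data)                # step 1: tally every value
    value, _ = counts.most_common(1)[0]   # step 2: pick the top tally
    return value                          # step 3: report the answer back

# Example: a week of (imaginary) daily weather observations
observations = ["sunny", "rain", "sunny", "cloudy", "sunny"]
print(most_common_value(observations))    # prints "sunny"
```

The dataset and question here are invented for illustration; the point is only that each numbered step of the algorithm becomes a line of code the computer can execute, and the final line translates the result back into something a person can read.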


2017 ◽  
Vol 4 (4) ◽  
pp. 21-47 ◽  
Author(s):  
Surabhi Verma

The insights that firms gain from big data analytics (BDA) in real time are used to direct, automate, and optimize decision making so that organizational goals are successfully achieved. Data management (DM) and advanced analytics (AA) tools and techniques are among the key contributors to making BDA possible. This paper investigates the characteristics of big data, data management processes, AA techniques, applications across sectors, and the issues related to their effective implementation and management within the broader context of BDA. A range of recently published literature on the characteristics of big data, DM processes, and AA techniques is reviewed to explore their current state, applications, and the issues and challenges learned from practice. The findings discuss the characteristics of big data and a framework for BDA that combines data management processes with AA techniques. They also discuss the opportunities and applications these technologies offer, and the challenges managers face in using them to gain competitive advantage. The findings are intended to help academics and managers quantify the data available in an organization as big data by understanding its properties, the emerging technologies, and the applications and issues behind BDA implementation.


2019 ◽  
Vol 112 (6) ◽  
pp. 473-476 ◽  
Author(s):  
Gemma F. Mojica ◽  
Christina N. Azmy ◽  
Hollylynne S. Lee

Concord Consortium's Common Online Data Analysis Platform (CODAP), a free Web-based data tool designed for students in grades 6-12 and higher, is continuously updated and developed for diverse projects in data science, science education, and mathematics/statistics education (https://codap.concord.org/). Teachers and students can access CODAP without downloading software or registering for accounts. Although some Web-based tools offer certain features for free and charge for additional ones, CODAP has no hidden costs. Devices need only an Internet connection and an up-to-date Web browser (Chrome is preferred). CODAP is not (yet) optimized for touchscreen devices such as tablets or iPads®.


2017 ◽  
Author(s):  
Michael J Madison

The knowledge commons research framework is applied to a case of commons governance grounded in modern astronomy research. The case, Galaxy Zoo, is a leading example of at least three contemporary phenomena. First, Galaxy Zoo is a global citizen science project in which volunteer non-scientists are recruited to participate in large-scale data analysis via the Internet. Second, it is a highly successful example of peer production, known colloquially as crowdsourcing, in which data are gathered, supplied, and/or analyzed by very large numbers of anonymous and pseudonymous contributors to a centrally coordinated or managed enterprise. Third, it is a highly visible example of data-intensive science, sometimes referred to as e-science or Big Data science, in which researchers develop methods to grapple with the massive volumes of digital data now available through modern sensing and imaging technologies. This chapter synthesizes these three perspectives on Galaxy Zoo via the knowledge commons framework.

