A view from data science

2021 ◽  
Vol 8 (2) ◽  
pp. 205395172110401
Author(s):  
Anna Sapienza ◽  
Sune Lehmann

For better and worse, our world has been transformed by Big Data. To understand digital traces generated by individuals, we need to design multidisciplinary approaches that combine social and data science. Data and social scientists face the challenge of effectively building upon each other’s approaches to overcome the limitations inherent in each side. Here, we offer a “data science perspective” on the challenges that arise when working to establish this interdisciplinary environment. We discuss how we perceive the differences and commonalities of the questions we ask to understand digital behaviors (including how we answer them), and how our methods may complement each other. Finally, we describe what a path toward common ground between these fields looks like when viewed from data science.

2019 ◽  
Vol 9 (15) ◽  
pp. 3065 ◽  
Author(s):  
Dresp-Langley ◽  
Ekseth ◽  
Fesl ◽  
Gohshi ◽  
Kurz ◽  
...  

Detecting quality in large unstructured datasets requires capacities far beyond the limits of human perception and communicability and, as a result, there is an emerging trend towards increasingly complex analytic solutions in data science to cope with this problem. This new trend towards analytic complexity represents a severe challenge for the principle of parsimony (Occam’s razor) in science. This review article combines insight from various domains such as physics, computational science, data engineering, and cognitive science to review the specific properties of big data. Problems for detecting data quality without losing the principle of parsimony are then highlighted on the basis of specific examples. Computational building block approaches for data clustering can help to deal with large unstructured datasets in minimized computation time, and meaning can be extracted rapidly and parsimoniously from large sets of unstructured image or video data through relatively simple unsupervised machine learning algorithms. The review then examines why we still largely lack the expertise to exploit big data wisely, whether to extract relevant information for specific tasks, recognize patterns and generate new information, or simply store and further process large amounts of sensor data, and brings forward examples illustrating why we need subjective views and pragmatic methods to analyze big data contents. The review concludes with how cultural differences between East and West are likely to affect the course of big data analytics, and the development of increasingly autonomous artificial intelligence (AI) aimed at coping with the big data deluge in the near future.


2019 ◽  
Vol 22 (1) ◽  
pp. 297-323 ◽  
Author(s):  
Henry E. Brady

Big data and data science are transforming the world in ways that spawn new concerns for social scientists, such as the impacts of the internet on citizens and the media, the repercussions of smart cities, the possibilities of cyber-warfare and cyber-terrorism, the implications of precision medicine, and the consequences of artificial intelligence and automation. Along with these changes in society, powerful new data science methods support research using administrative, internet, textual, and sensor-audio-video data. Burgeoning data and innovative methods facilitate answering previously hard-to-tackle questions about society by offering new ways to form concepts from data, to do descriptive inference, to make causal inferences, and to generate predictions. They also pose challenges as social scientists must grasp the meaning of concepts and predictions generated by convoluted algorithms, weigh the relative value of prediction versus causal inference, and cope with ethical challenges as their methods, such as algorithms for mobilizing voters or determining bail, are adopted by policy makers.


2020 ◽  
Vol 34 (1) ◽  
pp. 19-42
Author(s):  
David Moats

It is often claimed that the rise of so-called ‘big data’ and computationally advanced methods may exacerbate tensions between disciplines like data science and anthropology. This paper is an attempt to reflect on these possible tensions and their resolution, empirically. It contributes to a growing body of literature which observes interdisciplinary collaborations around new methods and digital infrastructures in practice but argues that many existing arrangements for interdisciplinary collaboration enforce a separation between disciplines in which identities are not really put at risk. In order to disrupt these standard roles and routines we put on a series of workshops in which mainly self-identified qualitative or non-technical researchers were encouraged to use digital tools (scrapers, automated text analysis and data visualisations). The paper focuses on three empirical examples from the workshops in which tensions, both between disciplines and methods, flared up and how they were ultimately managed or settled. In order to characterise both these tensions and negotiating strategies I draw on Woolgar and Stengers’ use of humour and irony to describe how disciplines relate to each other’s truth claims. I conclude that while there is great potential in more open-ended collaborative settings, qualitative social scientists may need to confront some of their own disciplinary baggage in order for better dialogue and more radical mixings between disciplines to occur.


2021 ◽  
Vol 1 (1) ◽  
Author(s):  
Simon Elias Bibri

Sustainable cities are quintessential complex systems: dynamically changing environments developed through a multitude of individual and collective decisions from the bottom up to the top down. As such, they are full of contestations, conflicts, and contingencies that are not easily captured, steered, and predicted, respectively. In short, they are characterized by wicked problems. Therefore, they are increasingly embracing and leveraging what smart cities have to offer in terms of big data technologies and their novel applications in a bid to effectively tackle the complexities they inherently embody and to monitor, evaluate, and improve their performance with respect to sustainability, under what has been termed “data-driven smart sustainable cities.” This paper analyzes and discusses the enabling role and innovative potential of urban computing and intelligence in the strategic, short-term, and joined-up planning of data-driven smart sustainable cities of the future. Further, it devises an innovative framework for urban intelligence and planning functions as an advanced form of decision support. This study expands on prior work done to develop a novel model for data-driven smart sustainable cities of the future. I argue that the fast-flowing torrent of urban data, coupled with its analytical power, is of crucial importance to the effective planning and efficient design of this integrated model of urbanism. This is enabled by the kind of data-driven and model-driven decision support systems associated with urban computing and intelligence. The novelty of the proposed framework lies in its essential technological and scientific components and the way in which these are coordinated and integrated given their clear synergies to enable urban intelligence and planning functions. These utilize, integrate, and harness complexity science, urban complexity theories, sustainability science, urban sustainability theories, urban science, data science, and data-intensive science in order to fashion powerful new forms of simulation models and optimization methods. These in turn generate optimal designs and solutions that improve sustainability, efficiency, resilience, equity, and life quality. This study contributes to understanding and highlighting the value of big data in regard to the planning and design of sustainable cities of the future.


2020 ◽  
Vol 9 (1) ◽  
pp. 45-56
Author(s):  
Akella Subhadra

Data science is associated with new discoveries: the discovery of value from data. It is the practice of deriving insights and developing business strategies by transforming data into useful information. It has been evaluated as a scientific field and a research evolution in disciplines such as statistics, computing science, and intelligence science, and as a practical transformation in domains such as science, engineering, the public sector, business, and lifestyle. The field encompasses the larger areas of artificial intelligence, data analytics, machine learning, pattern recognition, natural language understanding, and big data manipulation. It also tackles related new scientific challenges, ranging from data capture, creation, storage, retrieval, sharing, analysis, optimization, and visualization, to integrative analysis across heterogeneous and interdependent complex resources for better decision-making, collaboration, and, ultimately, value creation. In this paper we cover the epicycles of analysis, formal modeling, the move from data analysis to data science, and data analytics as a keystone of data science. Big data is not a single technology but an amalgamation of old and new technologies that helps companies gain actionable insight. Big data is vital because it manages, stores, and manipulates large amounts of data at the desired speed and time. Big data addresses distinct requirements: the combination of multiple unrelated datasets, the processing of large amounts of unstructured data, and the harvesting of hidden information in a time-sensitive manner. As businesses struggle to keep up with changing market requirements, some companies are finding creative ways to apply big data to their growing business needs and increasingly complex problems. As organizations evolve their processes and see the opportunities that big data can provide, they struggle to move beyond traditional business intelligence activities, such as using data to populate reports and dashboards, and toward data science-driven projects that aim to answer more open-ended and sophisticated questions. Although some organizations are fortunate to have data scientists, most are not, because a growing talent gap makes finding and hiring data scientists in a timely manner difficult. This paper aims to give a close view of data science and big data, including big data concepts such as data storage, data processing, and data analysis. We also provide a brief description of big data analytics and its characteristics, data structures, and the data analytics life cycle, and emphasize critical points on these issues.


2018 ◽  
Vol 4 ◽  
pp. 1
Author(s):  
Kelly Ann Joyce ◽  
Kendall Darfler ◽  
Dalton George ◽  
Jason Ludwig ◽  
Kristene Unsworth

The automation of knowledge via algorithms, code and big data has brought new ethical concerns that computer scientists and engineers are not yet trained to identify or mediate. We present our experience of using original research to develop scenarios to explore how STS scholars can produce materials that facilitate ethics education in computer science, data science, and software engineering. STS scholars are uniquely trained to investigate the societal context of science and technology as well as the meaning STEM researchers attach to their day-to-day work practices. In this project, we use a collaborative, co-constitutive method of doing ethics education that focuses on building an ethical framework based on empirical practices, highlighting two issues in particular: data validity and the relations between data and inequalities. Through data-grounded scenario writing, we demonstrate how STS scholars and other social scientists can apply their expertise to the production of educational materials to spark broad ranging discussions that explore the connections between values, ethics, STEM, politics, and social contexts.


Author(s):  
Shaveta Bhatia

The era of big data presents many opportunities for development in data science, biomedical research, cyber security, and cloud computing. Big data has recently gained popularity. It also raises many challenges and consequences for the security and privacy of big data. Various types of threats and attacks, such as data leakage, unauthorized third-party access, viruses, and vulnerabilities, stand against the security of big data. This paper discusses these security threats and approaches to addressing them in the fields of biomedical research, cyber security, and cloud computing.


2020 ◽  
Author(s):  
Bankole Olatosi ◽  
Jiajia Zhang ◽  
Sharon Weissman ◽  
Zhenlong Li ◽  
Jianjun Hu ◽  
...  

BACKGROUND Coronavirus Disease 2019 (COVID-19), caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), remains a serious global pandemic. Currently, all age groups are at risk of infection, but the elderly and persons with underlying health conditions are at higher risk of severe complications. In the United States (US), the pandemic curve is rapidly changing, with over 6,786,352 cases and 199,024 deaths reported. South Carolina (SC), as of 9/21/2020, reported 138,624 cases and 3,212 deaths across the state. OBJECTIVE The growing availability of COVID-19 data provides a basis for deploying Big Data science to leverage multitudinal and multimodal data sources for incremental learning. Doing this requires the acquisition and collation of multiple data sources at the individual and county level. METHODS The population for the comprehensive database comes from statewide COVID-19 testing surveillance data (March 2020 to present) for all SC COVID-19 patients (N≈140,000). This project will 1) connect multiple partner data sources for prediction and intelligence gathering, and 2) build a REDCap database that links de-identified multitudinal and multimodal data sources useful for machine learning and deep learning algorithms to enable further studies. Additional data will include hospital-based COVID-19 patient registries, Health Sciences South Carolina (HSSC) data, data from the office of Revenue and Fiscal Affairs (RFA), and Area Health Resource Files (AHRF). RESULTS The project was funded as of June 2020 by the National Institutes of Health. CONCLUSIONS The development of such a linked and integrated database will allow for the identification of important predictors of short- and long-term clinical outcomes for SC COVID-19 patients using data science.

