Big Data Revisited: A Rejoinder

2015 ◽  
Vol 30 (1) ◽  
pp. 70-74 ◽  
Author(s):  
Jannis Kallinikos ◽  
Ioanna D Constantiou

We elaborate on key issues of our paper "New games, new rules: big data and the changing context of strategy" as a means of addressing some of the concerns raised by the paper's commentators. We initially deal with the issue of social data and the role it plays in the current data revolution. The massive involvement of lay publics, as instrumented by social media, breaks with the strong expert cultures that have underlain the production and use of data in modern organizations. It also sets the interactive and communicative processes by which social data is produced apart from sensor data and the technological recording of facts. We further discuss the significance of the mechanisms by which big data is produced, as distinct from the attributes of big data that are often discussed in the literature. In the final section of the paper, we qualify the alleged importance of algorithms and claim that the structures of data capture and the architectures in which data generation is embedded are fundamental to the phenomenon of big data.

2016 ◽  
Vol 14 (2) ◽  
pp. 197-210 ◽  
Author(s):  
Tobias Matzner

The article discusses problems of representative views of data and elaborates a concept of the performativity of data. It shows how data used for surveillance contributes to creating suspect subjectivities. In particular, the article focuses on the inductive or explorative processing of data and on the decoupling of data generation and analysis that characterize the current use of data for surveillance. It outlines several challenges this poses to established accounts of surveillance: David Lyon's concept of surveillance as social sorting and Haggerty and Ericson's "surveillant assemblage". These problems are attributed to a representationalist view, which focuses on the veracity of data. This can lead to ignoring problematic consequences of surveillance procedures and the full scope of affected persons. Building on an idea by Rita Raley, an alternative account of data as performative is proposed. Using Judith Butler's concept of "citationality," this account shows how surveillance is entangled with the production of subjects through data in general. Surveillance is reformulated as a particular way in which subjects are produced that is parasitical to other forms of subjectivation.


2020 ◽  
Vol 30 (Supplement_5) ◽  
Author(s):  
◽  

Abstract Countries differ widely in lifestyles, environmental exposures and health(care) systems, providing a large natural experiment to be investigated. Through pan-European comparative studies, the underlying determinants of population health can be explored, yielding rich new insights into the dynamics of population health and care, such as the safety, quality, effectiveness and costs of interventions. Additionally, in the big data era, secondary use of data has become one of the major cornerstones of digital transformation for health systems improvement. Several countries are reviewing governance models and regulatory frameworks for data reuse. Precision medicine and public health intelligence share the same population-based approach; as such, aligning secondary use of data initiatives will increase the cost-efficiency of the data conversion value chain by ensuring that different stakeholders' needs are accounted for from the beginning.

At EU level, the European Commission has been raising awareness of the need to create adequate data ecosystems for innovative use of big data for health, especially ensuring responsible development and deployment of data science and artificial intelligence technologies in the medical and public health sectors. To this end, the Joint Action on Health Information (InfAct) is setting up the Distributed Infrastructure on Population Health (DIPoH). DIPoH provides a framework for international and multi-sectoral collaborations in health information. More specifically, DIPoH facilitates the sharing of research methods, data and results through the participation of countries and already existing research networks. DIPoH's efforts include harmonization and interoperability, strengthening research capacity in Member States (MSs) and providing European and worldwide perspectives to national data. In order to be embedded in the health information landscape, DIPoH aims to interact with existing (inter)national initiatives to identify common interfaces, avoid duplication of work and establish a sustainable long-term health information research infrastructure.

In this workshop, InfAct lays down DIPoH's core elements in coherence with national and European initiatives and actors, i.e. To-Reach, eHAction, the French Health Data Hub and ECHO. Pitch presentations on DIPoH and its national nodes will set the scene. In the format of a round table, possible collaborations with existing initiatives at (inter)national level will be debated with the audience. Synergies will be sought, reflections on community needs will be made and expectations on services will be discussed. The workshop will increase delegates' knowledge of the latest health information infrastructure and initiatives that strive for better public health and health systems in countries. The workshop also serves as a capacity-building activity to promote cooperation between initiatives and actors in the field.

Key messages: DIPoH is an infrastructure aiming to interact with existing (inter)national initiatives to identify common interfaces, avoid duplication and enable a long-term health information research infrastructure. National nodes can improve coordination, communication and cooperation between health information stakeholders in a country, potentially reducing overlap and duplication of research and field-work.


Author(s):  
Anuradha Rajkumar ◽  
Bruce Wallace ◽  
Laura Ault ◽  
Julien Lariviere-Chartier ◽  
Frank Knoefel ◽  
...  

Sensors ◽  
2021 ◽  
Vol 21 (6) ◽  
pp. 2144
Author(s):  
Stefan Reitmann ◽  
Lorenzo Neumann ◽  
Bernhard Jung

Common Machine-Learning (ML) approaches for scene classification require a large amount of training data. However, for classification of depth sensor data, in contrast to image data, relatively few databases are publicly available, and manual generation of semantically labeled 3D point clouds is an even more time-consuming task. To simplify the training data generation process for a wide range of domains, we have developed the BLAINDER add-on package for the open-source 3D modeling software Blender, which enables a largely automated generation of semantically annotated point-cloud data in virtual 3D environments. In this paper, we focus on the classical depth-sensing techniques Light Detection and Ranging (LiDAR) and Sound Navigation and Ranging (Sonar). Within the BLAINDER add-on, different depth sensors can be loaded from presets, customized sensors can be implemented, and different environmental conditions (e.g., influence of rain, dust) can be simulated. The semantically labeled data can be exported to various 2D and 3D formats and are thus optimized for different ML applications and visualizations. In addition, semantically labeled images can be exported using the rendering functionalities of Blender.
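
Downstream of the export step, the labeled point clouds can feed standard ML pipelines. The following sketch shows one way such data might be consumed; the CSV layout (x, y, z, label) and the file name are assumptions made for illustration, not part of the BLAINDER add-on.

```python
# Illustrative sketch: consuming a semantically labeled point-cloud export for ML.
# The CSV layout (x, y, z, label) and the file name are assumed, not BLAINDER's own format.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

data = np.loadtxt("labeled_point_cloud.csv", delimiter=",", skiprows=1)  # hypothetical export
points, labels = data[:, :3], data[:, 3].astype(int)  # per-point coordinates and class ids

# Per-point classification baseline on raw coordinates; a real pipeline would use
# neighbourhood features or a point-cloud network (e.g., PointNet) instead.
X_train, X_test, y_train, y_test = train_test_split(points, labels, test_size=0.2, random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))
```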


2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Ikbal Taleb ◽  
Mohamed Adel Serhani ◽  
Chafik Bouhaddioui ◽  
Rachida Dssouli

Abstract Big Data is an essential research area for governments, institutions, and private agencies to support their analytics decisions. Big Data covers every aspect of data: how it is collected, processed, and analyzed to generate value-added, data-driven insights and decisions. Degradation in data quality may result in unpredictable consequences; in that case, confidence in the data and its source, and their worthiness, is lost. In the Big Data context, data characteristics such as volume, multiple heterogeneous data sources, and fast data generation increase the risk of quality degradation and require efficient mechanisms to check data worthiness. However, ensuring Big Data Quality (BDQ) is a very costly and time-consuming process, since excessive computing resources are required. Maintaining quality throughout the Big Data lifecycle requires quality profiling and verification before any processing decision. A BDQ Management Framework for enhancing pre-processing activities while strengthening data control is proposed. The proposed framework uses a new concept called the Big Data Quality Profile, which captures the quality outline, requirements, attributes, dimensions, scores, and rules. Using the framework's Big Data profiling and sampling components, a fast and efficient data quality estimation is initiated before and after an intermediate pre-processing phase. The exploratory profiling component of the framework plays an initial role in quality profiling; it uses a set of predefined quality metrics to evaluate important data quality dimensions. It generates quality rules by applying various pre-processing activities and their related functions. These rules mainly feed into the Data Quality Profile and result in quality scores for the selected quality attributes. The framework implementation and dataflow management across the various quality management processes are discussed, and some ongoing work on framework evaluation and deployment to support quality evaluation decisions concludes the paper.
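
To make the idea of quality profiling concrete, here is a minimal, generic sketch (not the proposed framework's implementation) that computes two common quality dimensions, completeness and key uniqueness, on a hypothetical sample and returns scores that could populate a quality profile; the column names and data are assumptions.

```python
# Generic data-quality profiling sketch (illustrative; not the BDQ framework's code).
import pandas as pd

def profile_quality(df: pd.DataFrame, key_column: str) -> dict:
    """Compute simple quality scores on a (sampled) dataset."""
    completeness = 1.0 - df.isna().to_numpy().mean()             # share of non-missing cells
    uniqueness = df[key_column].nunique(dropna=True) / len(df)   # distinct keys per record
    return {"completeness": round(completeness, 3), "uniqueness": round(uniqueness, 3)}

# Hypothetical sample drawn before the pre-processing phase.
sample = pd.DataFrame({
    "record_id": [1, 2, 2, 4, None],
    "value": [34, None, 41, 29, 55],
})
print(profile_quality(sample, key_column="record_id"))  # {'completeness': 0.8, 'uniqueness': 0.6}
```

A quality rule could then be derived by comparing such scores against thresholds stored in the quality profile, for example flagging the sample for de-duplication if uniqueness falls below a required score.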


Author(s):  
Pijush Kanti Dutta Pramanik ◽  
Saurabh Pal ◽  
Moutan Mukhopadhyay

Like other fields, the healthcare sector has been greatly impacted by big data. A huge volume of healthcare data and other related data is continually generated from diverse sources. Tapping and analysing these data suitably would open up new avenues and opportunities for healthcare services. In view of that, this paper aims to present a systematic overview of big data and big data analytics applicable to modern-day healthcare. Acknowledging the massive upsurge in healthcare data generation, the paper identifies the various 'V's specific to healthcare big data. Different types of data analytics applicable to healthcare are discussed. Along with presenting the technological backbone of healthcare big data and analytics, the advantages and challenges of healthcare big data are meticulously explained. A brief report on the present and future market for healthcare big data and analytics is also presented. In addition, several applications and use cases are discussed in detail.


Author(s):  
M. Mazhar Rathore ◽  
Anand Paul ◽  
Awais Ahmad ◽  
Gwanggil Jeon

Recently, rapid population growth in urban regions has increased the demand for services and infrastructure. These needs can be met with the use of Internet of Things (IoT) devices, such as sensors, actuators, smartphones and smart systems. This leads towards building the Smart City and, beyond it, next-generation Super City planning. However, as thousands of IoT devices interconnect and communicate with each other over the Internet to establish smart systems, a huge amount of data, termed Big Data, is generated. Integrating IoT services and processing Big Data efficiently is a challenging task when the aim is decision making for the future Super City. Therefore, to meet such requirements, this paper presents an IoT-based system for next-generation Super City planning using Big Data analytics. The authors propose a complete system that includes various types of IoT-based smart systems, such as smart home, vehicular networking, weather and water systems, smart parking, and surveillance objects, for data generation. A four-tier architecture is proposed, i.e., (1) Bottom Tier-1, (2) Intermediate Tier-1, (3) Intermediate Tier-2, and (4) Top Tier, handling data generation and collection, communication, data administration and processing, and data interpretation, respectively. The system implementation model is presented, from the generation and collection of data through to decision making. The proposed system is implemented using the Hadoop ecosystem with MapReduce programming. The throughput and processing time results show that the proposed Super City planning system is more efficient and scalable.
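
The MapReduce pattern used for data processing in the upper tiers can be illustrated with a small, self-contained sketch; the record format and the aggregation (count and average per data source) are assumptions for the example, not the authors' implementation.

```python
# Minimal, self-contained sketch of the MapReduce pattern (illustrative only;
# the paper's system runs on the Hadoop ecosystem).
from collections import defaultdict

# Hypothetical raw records from IoT-based smart systems: (source, reading).
raw_records = [
    ("smart_parking", 12), ("vehicular", 87), ("smart_parking", 9),
    ("weather", 31), ("vehicular", 95), ("weather", 28),
]

# Map phase: emit one (key, value) pair per record.
mapped = [(source, value) for source, value in raw_records]

# Shuffle phase: group values by key (Hadoop performs this between map and reduce).
groups = defaultdict(list)
for key, value in mapped:
    groups[key].append(value)

# Reduce phase: aggregate each group, e.g. count and average per data source.
for key, values in groups.items():
    print(f"{key}: count={len(values)}, avg={sum(values) / len(values):.1f}")
```

On a real deployment, the map and reduce steps would run as separate Hadoop tasks (for example via Hadoop Streaming or a Java MapReduce job), with the framework handling the shuffle, partitioning and scaling across the cluster.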


2017 ◽  
Vol 8 (2) ◽  
pp. 88-105 ◽  
Author(s):  
Gunasekaran Manogaran ◽  
Daphne Lopez

Ambient intelligence is an emerging platform that brings together advances in sensors and sensor networks, pervasive computing, and artificial intelligence to capture real-time climate data. This continuously generates several exabytes of unstructured sensor data, which is therefore often called big climate data. Nowadays, researchers are trying to use big climate data to monitor and predict climate change and possible disease outbreaks. Traditional data processing techniques and tools are not capable of handling such a huge amount of climate data; hence, there is a need to develop an advanced big data architecture for processing real-time climate data. The purpose of this paper is to propose a big-data-based surveillance system that analyzes spatial climate big data and performs continuous monitoring of the correlation between climate change and Dengue. The proposed disease surveillance system has been implemented with the help of Apache Hadoop MapReduce and its supporting tools.
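
The correlation monitoring at the core of such a system can be sketched as follows; the file name, column names and lag range are assumptions for illustration, not the authors' implementation.

```python
# Illustrative sketch: lagged correlation between a climate variable and Dengue cases.
# The file name, column names and lag range are assumptions for this example.
import pandas as pd

df = pd.read_csv("climate_dengue_monthly.csv", parse_dates=["month"])
# Expected (assumed) columns: month, rainfall_mm, dengue_cases

# Dengue incidence often trails rainfall, so check correlations at lags of 0-3 months.
for lag in range(4):
    r = df["rainfall_mm"].corr(df["dengue_cases"].shift(-lag))
    print(f"lag {lag} month(s): Pearson r = {r:.2f}")
```

In a Hadoop-based deployment, the per-region aggregation that produces such monthly series would typically be done in MapReduce jobs, with the correlation step running over the aggregated output.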


2018 ◽  
Vol 2 ◽  
pp. e26539 ◽  
Author(s):  
Paul J. Morris ◽  
James Hanken ◽  
David Lowery ◽  
Bertram Ludäscher ◽  
James Macklin ◽  
...  

As curators of biodiversity data in natural science collections, we are deeply concerned with data quality, but quality is an elusive concept. An effective way to think about data quality is in terms of fitness for use (Veiga 2016). To use data to manage physical collections, the data must be able to accurately answer questions such as what objects are in the collections, where they are and where they are from. Some research aggregates data across collections, which involves the exchange of data using standard vocabularies. Other research uses require accurate georeferences, collecting dates, and current identifications. It is well understood that the costs of data capture and data quality improvement increase with increasing time from the original observation. These factors point towards two engineering principles for software that is intended to maintain or enhance data quality: build small, modular data quality tests that can be easily assembled in suites to assess the fitness for use of data for some particular need; and produce tools that can be applied by users with a wide range of technical skill levels at different points in the data life cycle. In the Kurator project, we have produced code (e.g. Wieczorek et al. 2017, Morris 2016) consisting of small modules that can be incorporated into data management processes as small libraries, each addressing particular data quality tests. These modules can be combined into customizable data quality scripts, which can be run on single computers or scalable architecture, and can be incorporated into other software, run as command-line programs, or run as suites of canned workflows through a web interface. Kurator modules can be integrated into early-stage data capture applications, run to help prepare data for aggregation by matching it to standard vocabularies, run for quality control or quality assurance on data sets, and report on data quality in terms of a fitness-for-use framework (Veiga et al. 2017). One of our goals is to make simple tests usable by anyone, anywhere.
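
The pattern of small, modular tests assembled into suites can be illustrated with a short sketch (a generic illustration, not the Kurator API); the checks use Darwin Core-style field names, and the example records are invented.

```python
# Sketch of small, composable data quality tests (illustrative; not the Kurator API).
from datetime import date
from typing import Callable, Dict, List

Record = Dict[str, str]
Check = Callable[[Record], str]  # returns "" if the record passes, else an issue message

def check_georeference(rec: Record) -> str:
    """Latitude/longitude must parse and fall within valid ranges."""
    try:
        lat = float(rec.get("decimalLatitude", ""))
        lon = float(rec.get("decimalLongitude", ""))
    except ValueError:
        return "georeference missing or not numeric"
    if not (-90 <= lat <= 90 and -180 <= lon <= 180):
        return "georeference out of range"
    return ""

def check_event_date(rec: Record) -> str:
    """Collecting date must be present and ISO-formatted (YYYY-MM-DD)."""
    try:
        date.fromisoformat(rec.get("eventDate", ""))
    except ValueError:
        return "eventDate missing or not ISO 8601"
    return ""

def run_suite(records: List[Record], checks: List[Check]) -> None:
    """Assemble checks into a suite and report fitness-for-use issues per record."""
    for i, rec in enumerate(records):
        issues = [msg for msg in (check(rec) for check in checks) if msg]
        if issues:
            print(f"record {i}: " + "; ".join(issues))

records = [
    {"decimalLatitude": "42.37", "decimalLongitude": "-71.11", "eventDate": "1999-07-04"},
    {"decimalLatitude": "999", "decimalLongitude": "-71.11", "eventDate": "July 4, 1999"},
]
run_suite(records, [check_georeference, check_event_date])
```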

