Data analysis and modeling pipelines for controlled networked social science experiments

PLoS ONE ◽  
2020 ◽  
Vol 15 (11) ◽  
pp. e0242453
Author(s):  
Vanessa Cedeno-Mieles ◽  
Zhihao Hu ◽  
Yihui Ren ◽  
Xinwei Deng ◽  
Noshir Contractor ◽  
...  

There is broad interest in networked social science experiments for understanding human behavior at scale. Significant effort is required to perform data analytics on experimental outputs and for computational modeling of custom experiments. Moreover, experiments and modeling are often performed in a cycle, enabling iterative experimental refinement and data modeling to uncover interesting insights and to generate or refute hypotheses about social behaviors. The current practice for social analysts is to develop tailor-made computer programs and analytical scripts for experiments and modeling, which often leads to inefficiencies and duplication of effort. In this work, we propose a pipeline framework that takes a significant step towards overcoming these challenges. Our contribution is to describe the design and implementation of a software system that automates many of the steps involved in analyzing social science experimental data, building models to capture the behavior of human subjects, and providing data to test hypotheses. The proposed pipeline framework consists of formal models, formal algorithms, and theoretical models as the basis for its design and implementation. We propose a formal data model such that, if an experiment can be described in terms of this model, our pipeline software can be used to analyze its data efficiently. The merits of the proposed pipeline framework are demonstrated through several case studies of networked social science experiments.
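The core idea of the abstract, shared analysis stages that apply to any experiment expressible in a common data model, can be sketched as a chain of composable pipeline stages. This is an illustration, not the paper's actual software; the `Pipeline` class, stage names, and record schema are all assumptions.

```python
from typing import Any, Callable

class Pipeline:
    """Chain of analysis stages applied to experimental data in order.
    Illustrative only: the paper's system is far richer than this."""
    def __init__(self):
        self.stages: list[Callable[[Any], Any]] = []

    def add(self, stage: Callable[[Any], Any]) -> "Pipeline":
        self.stages.append(stage)
        return self  # allow fluent chaining

    def run(self, data: Any) -> Any:
        for stage in self.stages:
            data = stage(data)  # each stage consumes the previous stage's output
        return data

# Example: per-subject scores -> drop missing values -> aggregate group statistic.
raw = [{"subject": 1, "score": 0.9},
       {"subject": 2, "score": None},
       {"subject": 3, "score": 0.7}]

pipeline = (Pipeline()
            .add(lambda rows: [r for r in rows if r["score"] is not None])
            .add(lambda rows: sum(r["score"] for r in rows) / len(rows)))

result = pipeline.run(raw)  # mean score over subjects with data
```

The point of the fluent design is that the same cleaning or aggregation stage can be reused across different experiments, which is the duplication-of-effort problem the abstract describes.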

2020 ◽  
Vol 7 (1) ◽  
Author(s):  
Patrick Obilikwu ◽  
Emeka Ogbuju

Abstract Organizations may be related in terms of similar operational procedures, management, and supervisory agencies coordinating their operations. Supervisory agencies may be governmental or non-governmental, but in all cases they perform oversight functions over the activities of the organizations under their control. Multiple organizations that are related in terms of oversight functions by their supervisory agencies may differ significantly in their geographical locations, aims, and objectives. To harmonize these differences so that comparative analysis is meaningful, data about the operations of multiple organizations under one control or management can be cultivated using a uniform format. In this format, data is easily harvested, and the ease with which it can be used for cross-population analysis, referred to as data comparability, is enhanced. The current practice, whereby organizations under one control maintain their data in independent databases specific to an enterprise application, greatly reduces data comparability and makes cross-population analysis a herculean task. In this paper, the collocation data model is formulated as consisting of big data technologies beyond data mining techniques and is used to reduce the heterogeneity inherent in databases maintained independently across multiple organizations. The collocation data model is thus presented as capable of enhancing data comparability across multiple organizations. The model was used to cultivate the assessment scores of students in several schools over a period and to rank the schools. The model permits data comparability across several geographical scales, among them national, regional, and global, where harvested data form the basis for generating analytics for insight, hindsight, and foresight about organizational problems and strategies.
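The use case in the abstract, pooling assessment scores from several schools into one uniform format and ranking the schools, can be sketched with a minimal example. The record schema `(school, student, score)` and all the data values are illustrative assumptions, not the paper's dataset.

```python
from collections import defaultdict

# Uniform record format shared by all schools (schema is an assumption):
# (school, student, score). Keeping one format is what makes the
# cross-population comparison below trivial.
records = [
    ("School A", "s1", 78), ("School A", "s2", 82),
    ("School B", "s3", 91), ("School B", "s4", 85),
    ("School C", "s5", 60), ("School C", "s6", 74),
]

scores_by_school = defaultdict(list)
for school, _student, score in records:
    scores_by_school[school].append(score)

# Rank schools by mean assessment score, highest first.
ranking = sorted(scores_by_school,
                 key=lambda s: sum(scores_by_school[s]) / len(scores_by_school[s]),
                 reverse=True)
```

With independent per-school databases the same ranking would require a schema-mapping step per school; collocating the data in one format removes that step, which is the comparability gain the abstract claims.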


2020 ◽  
Author(s):  
I Enwereuzo ◽  
Pedro Antunes ◽  
David Johnstone

© 2019 Association for Information Systems. All rights reserved. This paper addresses the challenges of theory testing. Adopting the design science paradigm, we address these challenges by delimiting the line that separates theory building from theory testing and by conceptually characterizing its essential aspects: the relationships between humans, organizations, and technology. This led to the development of a conceptual framework for theory testing. Practically, the methods, processes, instruments, and tools needed to acquire and analyze data for theory testing are considered, leading to the development of a pattern system. Card sorting was used as an evaluation strategy for the conceptual framework and the pattern model.


2017 ◽  
Author(s):  
Zaynab Hammoud ◽  
Frank Kramer

Personalized medicine, i.e. a medicine focused on the individual and proactive in nature, promises improved health care by customizing treatment according to patient needs. The methods to analyze data, model knowledge, and store interpretable results vary widely. A common approach is to use networks for modelling and organizing this information. Network theory has been used for many years in the modelling and analysis of complex systems in fields such as epidemiology, biology, and biomedicine. As the data evolves and becomes more heterogeneous and complex, monoplex networks become an oversimplification of the corresponding systems. This imposes a need to go beyond traditional networks into a richer framework capable of hosting objects and relations of different scales, called a multilayered network. These complex networks have contributed to many contexts and fields, and they are very applicable to the investigation of biological networks. To enrich this investigation, we aim to implement a multilayer framework applicable in various domains, especially the field of pathway modelling. Our idea is to integrate pathways and their related knowledge into a multilayer model, where each layer represents one of their elements. The model offers a feature we call "Selective Inclusion of Knowledge", as well as a collection of related knowledge, like diseases and drugs, into a single graph. The main layers are mapped to the entities of the pathways and the additional knowledge; for instance, a convenient model would have three layers respectively representing proteins, drugs, and diseases. The model imports knowledge from multiple sources such as the Reactome database, PharmGKB, DrugBank, OMIM, and other public sources. The submitted poster will give an overview of the various models of multilayered networks, describe the model we are building, and present the workflow of implementing it in R as well as the future plan.
The workflow consists of multiple R packages, of which we present the first implemented package, mully, which provides the multilayer layout. The data import and integration will be done by another package to be implemented, Multipath.
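The three-layer protein/drug/disease model and the "Selective Inclusion of Knowledge" feature described above can be sketched with a toy multilayer graph. This is a stdlib illustration of the concept, not the API of the mully R package; the class, the node names, and the edges are assumptions.

```python
class MultilayerGraph:
    """Toy multilayer network: named layers of nodes plus undirected edges
    that may connect nodes within or across layers."""
    def __init__(self, layers):
        self.layers = {name: set() for name in layers}  # nodes per layer
        self.edges = set()                              # frozensets {u, v}

    def add_node(self, layer, node):
        self.layers[layer].add(node)

    def add_edge(self, u, v):
        self.edges.add(frozenset((u, v)))

    def subgraph(self, keep_layers):
        """Selective inclusion of knowledge: keep only edges whose endpoints
        all lie in the chosen layers."""
        nodes = set().union(*(self.layers[l] for l in keep_layers))
        return {e for e in self.edges if e <= nodes}

# Three layers as in the abstract's example model (example entities assumed).
g = MultilayerGraph(["protein", "drug", "disease"])
g.add_node("protein", "TP53")
g.add_node("drug", "Cisplatin")
g.add_node("disease", "Cancer")
g.add_edge("TP53", "Cisplatin")  # inter-layer: drug targets protein
g.add_edge("TP53", "Cancer")     # inter-layer: protein implicated in disease

sub = g.subgraph(["protein", "drug"])  # exclude the disease layer
```

Dropping a layer removes all knowledge attached to it while leaving the remaining layers intact, which is the selective-inclusion behavior the abstract describes.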


2013 ◽  
Vol 336-338 ◽  
pp. 1953-1956
Author(s):  
Qi Ming Lou ◽  
Ying Fang Li ◽  
Hong Wei Zhang

The computer room is an important information technology infrastructure for education in colleges; how to balance the load, calculate fees flexibly, improve resource utilization, and better serve teachers and students is an urgent problem. Firstly, the development trends of computer room management systems are discussed in the paper. Secondly, a data model of an open computer room management system is given, designed to balance the load and improve the utilization efficiency of computer rooms. Finally, an intelligent billing algorithm based on the designed data model is presented and implemented as a stored procedure in SQL Server 2005.
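A flexible billing rule of the kind the paper implements as a stored procedure can be sketched as follows. The rates, the per-role discount, and the charge-per-started-half-hour policy are all illustrative assumptions, not the paper's actual algorithm.

```python
import math

# Hourly rates per user role, in arbitrary currency units (assumed values).
RATES = {"student": 1.0, "teacher": 0.5}

def bill(role: str, minutes_used: int) -> float:
    """Charge for each started half hour at the role's hourly rate.
    Illustrative billing policy only."""
    half_hours = math.ceil(minutes_used / 30)   # round up to started half hours
    return half_hours * RATES[role] / 2         # half-hour price = rate / 2

# 95 minutes = 4 started half hours at the student rate.
student_fee = bill("student", 95)
teacher_fee = bill("teacher", 60)
```

In the paper this logic lives in the database as a stored procedure, so every client of the management system charges sessions identically; the sketch above just makes the rounding and rate arithmetic explicit.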


2013 ◽  
Vol 1 (2) ◽  
pp. 148-161 ◽  
Author(s):  
Wendy M. Smith ◽  
Ruth M. Heaton

Teachers who continually engage in cycles of research may be characterized as having a stance of inquiry: They continually reflect on their past teaching, ask themselves questions to problematize their current practices, and collect and analyze data to inform future teaching practices. We guided 154 mathematics teachers, distributed across 6 cohorts, in conducting classroom research projects. Our purposes and expectations as teacher educators have become more clearly defined and articulated based on our reflections on 6 iterations of teacher research. Repeatedly, we have adjusted how we facilitated the design and implementation of the projects to improve the quality of teachers' research. Over time, we have come to understand teacher research as a way of helping teachers develop a stance of inquiry toward mathematical content, students' mathematical understandings, and productive mathematical teaching practices rather than as merely a culminating project for a master's degree.


2017 ◽  
Vol 20 (2) ◽  
pp. 301-328 ◽  
Author(s):  
Gregory M. Randolph ◽  
James P. Fetzner

Abstract While regulators, firms, and the courts must all be able to interpret regulations to best address economic and social issues, regulatory interpretation may vary greatly across parties. After introducing a framework to explain the impact of the complexity of written regulations and the complexity of the regulatory environment on regulatory interpretation, this paper utilizes regulatory examples to explore the challenges associated with regulatory interpretation. Several recent initiatives designed to improve regulatory efficacy are examined to assess potential methods available to reduce challenges associated with regulatory interpretation. When considered with the public policy implementation literature and research on networks in public policy, several implications emerge from the consideration of regulatory interpretation and recent regulatory initiatives. Regulators should pursue strategies to minimize the number of possible interpretations in the design of regulation and seek improved regulatory mechanisms to alleviate regulatory interpretation challenges. Furthermore, theoretical models should acknowledge regulatory interpretation to better assist in the design and implementation of regulation.


2002 ◽  
Vol 44-46 ◽  
pp. 1049-1056 ◽  
Author(s):  
Gully A.P.C. Burns ◽  
Fang Bian ◽  
Wei-Cheng Cheng ◽  
Shyam Kapadia ◽  
Cyrus Shahabi ◽  
...  

2018 ◽  
Vol 612 ◽  
pp. A53 ◽  
Author(s):  
Lorenzo Pino ◽  
David Ehrenreich ◽  
Aurélien Wyttenbach ◽  
Vincent Bourrier ◽  
Valerio Nascimbeni ◽  
...  

Space-borne low- to medium-resolution (ℛ ~ 102–103) and ground-based high-resolution spectrographs (ℛ ~ 105) are commonly used to obtain optical and near infrared transmission spectra of exoplanetary atmospheres. In this wavelength range, space-borne observations detect the broadest spectral features (alkali doublets, molecular bands, scattering, etc.), while high-resolution, ground-based observations probe the sharpest features (cores of the alkali lines, molecular lines). The two techniques differ in several aspects. (1) The line spread function of ground-based observations is ~103 times narrower than for space-borne observations; (2) Space-borne transmission spectra probe up to the base of the thermosphere (P ≳ 10−6 bar), while ground-based observations can reach lower pressures (down to ~10−11 bar) thanks to their high resolution; (3) Space-borne observations directly yield the transit depth of the planet, while ground-based observations can only measure differences in the apparent size of the planet at different wavelengths. These differences make it challenging to combine both techniques. Here, we develop a robust method to compare theoretical models with observations at different resolutions. We introduce πη, a line-by-line 1D radiative transfer code to compute theoretical transmission spectra over a broad wavelength range at very high resolution (ℛ ~ 106, or Δλ ~ 0.01 Å). A hybrid forward modeling/retrieval optimization scheme is devised to deal with the large computational resources required by modeling a broad wavelength range (~0.3–2 μm) at high resolution. We apply our technique to HD 189733b. For this planet, HST observations reveal a flattened spectrum due to scattering by aerosols, while high-resolution ground-based HARPS observations reveal sharp features corresponding to the cores of sodium lines. We reconcile these apparently contrasting results by building models that reproduce both data sets simultaneously, from the troposphere to the thermosphere.
We confirm: (1) the presence of scattering by tropospheric aerosols; (2) that the sodium core feature is of thermospheric origin. When we take into account the presence of aerosols, the large contrast of the core of the sodium lines measured by HARPS indicates a temperature of up to ~10 000 K in the thermosphere, higher than what is reported in the literature. We also show that the precise value of the thermospheric temperature is degenerate with the relative optical depth of sodium, controlled by its abundance, and of the aerosol deck.
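The key resolution-matching step in comparing a very-high-resolution model spectrum with lower-resolution data is convolution with the instrument's line spread function (LSF). The sketch below illustrates that step on synthetic data; the flat continuum, the single sharp line, and the Gaussian LSF width are illustrative assumptions, not HD 189733b values or the πη implementation.

```python
import math

def gaussian_kernel(fwhm_pix: float, half_width: int):
    """Normalized Gaussian LSF sampled on a pixel grid."""
    sigma = fwhm_pix / (2 * math.sqrt(2 * math.log(2)))
    k = [math.exp(-0.5 * (i / sigma) ** 2)
         for i in range(-half_width, half_width + 1)]
    total = sum(k)
    return [v / total for v in k]

def degrade(spectrum, fwhm_pix: float):
    """Convolve a high-resolution spectrum with the LSF to mimic a
    lower-resolution instrument; edges are handled by renormalizing."""
    half = int(3 * fwhm_pix)
    kern = gaussian_kernel(fwhm_pix, half)
    out = []
    for i in range(len(spectrum)):
        num = den = 0.0
        for j, w in enumerate(kern):
            idx = i + j - half
            if 0 <= idx < len(spectrum):
                num += w * spectrum[idx]
                den += w
        out.append(num / den)
    return out

# A sharp absorption line in a flat continuum: after degrading, the line is
# broadened and its core made shallower, exactly why a low-resolution
# instrument sees a muted version of features a high-resolution one resolves.
model = [1.0] * 50
model[25] = 0.2                      # narrow line core
low_res = degrade(model, fwhm_pix=5.0)
```

Comparing models to both data sets then means degrading the same underlying model twice, once per instrument LSF, rather than fitting each data set with a separate model.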


2009 ◽  
Vol 24 (1-2) ◽  
pp. 69-83
Author(s):  
Nicola Hönle ◽  
Matthias Grossmann ◽  
Daniela Nicklas ◽  
Bernhard Mitschang
