Bringing sedimentology and stratigraphy into the StraboSpot data management system

Geosphere ◽  
2021 ◽  
Author(s):  
Casey J. Duncan ◽  
Marjorie A. Chan ◽  
Elizabeth Hajek ◽  
Diane Kamola ◽  
Nicolas M. Roberts ◽  
...  

The StraboSpot data system provides field-based geologists the ability to digitally collect, archive, query, and share data. Recent efforts have expanded this data system with the vocabulary, standards, and workflow utilized by the sedimentary geology community. A standardized vocabulary that honors typical workflows for collecting sedimentologic and stratigraphic field and laboratory data was developed through a series of focused workshops and vetted/refined through subsequent workshops and field trips. This new vocabulary was designed to fit within the underlying structure of StraboSpot and resulted in the expansion of the existing data structure. Although the map-based approach of StraboSpot did not fully conform to the workflow for sedimentary geologists, new functions were developed for the sedimentary community to facilitate descriptions, interpretations, and the plotting of measured sections to document stratigraphic position and relationships between data types. Consequently, a new modality was added to StraboSpot—Strat Mode—which now accommodates sedimentary workflows, enabling users to document stratigraphic positions and relationships, and automates construction of measured stratigraphic sections. Strat Mode facilitates data collection and co-location of multiple data types (e.g., descriptive observations, images, samples, and measurements) in geographic and stratigraphic coordinates across multiple scales, thus preserving spatial and stratigraphic relationships in the data structure. Incorporating these digital technologies will lead to better research communication in sedimentology through a common vocabulary, shared standards, and open data archiving and sharing.
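A minimal sketch of the idea behind Strat Mode's data model, in Python: each observation carries both geographic and stratigraphic coordinates along with co-located data types, and a measured section is assembled by ordering observations by stratigraphic height. The `Spot` class and its field names are hypothetical illustrations, not the actual StraboSpot schema.

```python
from dataclasses import dataclass, field

@dataclass
class Spot:
    """One observation point with geographic and stratigraphic coordinates."""
    name: str
    lat: float
    lon: float
    strat_height_m: float            # position above the base of the measured section
    lithology: str = ""
    samples: list = field(default_factory=list)
    images: list = field(default_factory=list)

def build_measured_section(spots):
    """Order spots by stratigraphic height so intervals can be drawn as a column."""
    ordered = sorted(spots, key=lambda s: s.strat_height_m)
    column = []
    for lower, upper in zip(ordered, ordered[1:]):
        column.append({
            "base_m": lower.strat_height_m,
            "top_m": upper.strat_height_m,
            "thickness_m": upper.strat_height_m - lower.strat_height_m,
            "lithology": lower.lithology,
        })
    return column

if __name__ == "__main__":
    spots = [
        Spot("SS-03", 40.12, -111.56, 12.4, "sandstone", samples=["SS-03-A"]),
        Spot("SS-01", 40.12, -111.56, 0.0, "mudstone"),
        Spot("SS-02", 40.12, -111.56, 5.2, "siltstone", images=["bedform.jpg"]),
    ]
    for interval in build_measured_section(spots):
        print(interval)
```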

2018 ◽  
Vol 10 (1) ◽  
Author(s):  
Wayne Clifford

Objective: Integrate and streamline the collection and analysis of environmental, veterinary, and vector zoonotic data using a One Health approach to data system development.

Introduction: Environmental public health zoonotic disease surveillance includes veterinary, environmental, and vector data. Surveillance systems within each sector may appear disparate, although they are actually complementary and closely allied. Consolidating and integrating data into one application can be challenging, but there are commonalities shared by all. The goal of the One Health Integrated Data System is to standardize data collection, streamline data entry, and integrate these sectors into one application.

Methods: Data assessment: each surveillance function was assessed to evaluate data types and needs. Identify commonalities: common data elements were identified across the surveillance areas. Identify unique data: data unique to specific surveillance efforts were identified. Build data structure: a back-end data structure was developed that reflects the data needs of each surveillance area. Build data entry interfaces: data entry interfaces were developed to meet the needs of each surveillance area. Build data QC: procedures were developed that run several quality control checks on the data. Build data exports: customized data exports were built to allow users to carry out more extensive analysis of the data.

Results: The data integration project resulted in:
● Reduced time spent entering and managing data
● Reduced data entry error rates
● Increased visibility through automated program metrics
● Improved access to data for data users

Conclusions: Integrating data and building a data system that reflects the diversity of environmental, veterinary, and vector surveillance data is achievable using off-the-shelf database tools. The process of integrating the data and building the data structure yields a more intimate understanding of the data, revealing opportunities for improving data quality.
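A minimal sketch of the kind of integration the abstract describes, assuming a relational back end: shared fields live in one core table, sector-specific detail tables hang off it by foreign key, and a quality-control query flags incomplete records. Table and column names are hypothetical, not the production system's schema.

```python
import sqlite3

# Hypothetical schema: shared fields in one core table; sector-specific
# details (veterinary, environmental, vector) referenced by foreign key.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE report (
    report_id   INTEGER PRIMARY KEY,
    report_date TEXT NOT NULL,          -- ISO 8601 date
    county      TEXT NOT NULL,
    sector      TEXT CHECK (sector IN ('veterinary', 'environmental', 'vector'))
);
CREATE TABLE vector_detail (
    report_id   INTEGER REFERENCES report(report_id),
    species     TEXT,
    trap_count  INTEGER
);
""")

conn.execute("INSERT INTO report VALUES (1, '2018-06-01', 'King', 'vector')")
conn.execute("INSERT INTO vector_detail VALUES (1, 'Culex pipiens', 42)")

# A simple quality-control check: core records missing sector-specific detail.
orphans = conn.execute("""
    SELECT r.report_id FROM report r
    LEFT JOIN vector_detail v ON v.report_id = r.report_id
    WHERE r.sector = 'vector' AND v.report_id IS NULL
""").fetchall()
print("vector reports missing detail:", orphans)
```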


Electronics ◽  
2021 ◽  
Vol 10 (5) ◽  
pp. 621
Author(s):  
Giuseppe Psaila ◽  
Paolo Fosci

Internet technology and mobile technology have enabled the production and diffusion of massive data sets concerning almost every aspect of day-to-day life. Remarkable examples are social media and apps for volunteered information production, as well as Open Data portals on which public administrations publish authoritative and (often) geo-referenced data sets. In this context, JSON has become the most popular standard for representing and exchanging possibly geo-referenced data sets over the Internet. Analysts wishing to manage, integrate and cross-analyze such data sets need a framework that allows them to access possibly remote storage systems for JSON data sets, and to retrieve and query data sets by means of a unique query language (independent of the specific storage technology), exploiting possibly remote computational resources (such as cloud servers) while working comfortably on the PC in their office, largely unaware of the real location of the resources. In this paper, we present the current state of the J-CO Framework, a platform-independent and analyst-oriented software framework to manipulate and cross-analyze possibly geo-tagged JSON data sets. The paper presents the general approach behind the J-CO Framework and illustrates the query language by means of a simple yet non-trivial example of geographical cross-analysis. The paper also presents the novel features introduced by the re-engineered version of the execution engine and the most recent components, i.e., the storage service for large single JSON documents and the user interface that allows analysts to comfortably share data sets and computational resources with other analysts, possibly working in different parts of the world. Finally, the paper reports the results of an experimental campaign, which show that the execution engine performs in a more than satisfactory way, proving that the framework can actually be used by analysts to process JSON data sets.
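The J-CO query language itself is not reproduced here; the Python sketch below only illustrates the shape of a geographical cross-analysis over two geo-tagged JSON data sets. Both data sets and all field names are invented for the example.

```python
import json

# Two tiny geo-tagged data sets in GeoJSON-like form (illustrative only).
stations = json.loads("""[
  {"name": "air_quality_01", "lon": 9.66, "lat": 45.69},
  {"name": "air_quality_02", "lon": 9.19, "lat": 45.46}
]""")
district = json.loads("""{"name": "Bergamo", "bbox": [9.60, 45.65, 9.72, 45.73]}""")

def within(bbox, lon, lat):
    """True if the point (lon, lat) falls inside the bounding box."""
    min_lon, min_lat, max_lon, max_lat = bbox
    return min_lon <= lon <= max_lon and min_lat <= lat <= max_lat

# Cross-analysis: keep only the stations inside the district's bounding box.
matches = [s for s in stations if within(district["bbox"], s["lon"], s["lat"])]
print(matches)   # -> [{'name': 'air_quality_01', ...}]
```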


2021 ◽  
Vol 5 (2) ◽  
pp. 80
Author(s):  
Vivi Elvina Panjaitan

ABSTRACT: Problems in the management, storage, and preservation of research data were the rationale for implementing the national scientific repository (RIN) system. To measure its success, the present study evaluated the system, analyzed its problems, and provided recommendations using a descriptive exploratory qualitative method, with interviews as the primary data and supporting secondary data. In terms of effectiveness, the results showed that the RIN system's objectives of providing a nationally integrated, interoperable research data management system and of ensuring long-term archiving and access had been achieved, whereas researchers' awareness of data sharing and the sustainability plans had not. In terms of efficiency, PDDI LIPI had pursued many activities and strategies. In terms of adequacy, the RIN system was able to answer the existing research data problems, while continuity of research data input and the sustainability of research had not been achieved. In terms of equity, the RIN system was intended for all professions that carry out research, although socialization activities and technical guidance were directed mostly at researchers in institutions with research and development bodies and at universities. In terms of responsiveness, the system could not yet be experienced by all target groups because follow-up by the target groups after learning about the RIN system was still minimal. Recommendations are therefore addressed to the target groups, both internal and external to LIPI, and to PDDI LIPI as the implementor. The present study concluded that implementation of the RIN system had not been optimal and still needed improvement.


PeerJ ◽  
2017 ◽  
Vol 5 ◽  
pp. e2880 ◽  
Author(s):  
Reem Al-jawahiri ◽  
Elizabeth Milne

Recently, there has been a move, encouraged by many stakeholders, towards generating big, open data in many areas of research. One area where big, open data are particularly valuable is research relating to complex heterogeneous disorders such as Autism Spectrum Disorder (ASD). The inconsistency of findings and the great heterogeneity of ASD necessitate the use of big and open data to tackle important challenges such as understanding and defining the heterogeneity and potential subtypes of ASD. To this end, a number of initiatives have been established that aim to develop big and/or open data resources for autism research. In order to provide a useful data reference for autism researchers, a systematic search for ASD data resources was conducted using the Scopus database, the Google search engine, and the 'recommended repositories' pages of key journals, and the findings were compiled into a comprehensive list focused on ASD data. The aim of this review is to systematically search for all available ASD data resources providing the following data types: phenotypic, neuroimaging, human brain connectivity matrices, human brain statistical maps, biospecimens, and ASD participant recruitment. A total of 33 resources were found, containing different types of data from varying numbers of participants. Descriptions of the data available from each resource, and links to each resource, are provided. Moreover, key implications are addressed and underrepresented areas of data are identified.


2021 ◽  
Author(s):  
Anita Bandrowski ◽  
Jeffrey S. Grethe ◽  
Anna Pilko ◽  
Tom Gillespie ◽  
Gabi Pine ◽  
...  

Abstract: The NIH Common Fund's Stimulating Peripheral Activity to Relieve Conditions (SPARC) initiative is a large-scale program that seeks to accelerate the development of therapeutic devices that modulate electrical activity in nerves to improve organ function. Integral to the SPARC program are the rich anatomical and functional datasets produced by investigators across the SPARC consortium, which provide key details about organ-specific circuitry, including structural and functional connectivity, mapping of cell types, and molecular profiling. These datasets are provided to the research community through an open data platform, the SPARC Portal. To ensure SPARC datasets are Findable, Accessible, Interoperable and Reusable (FAIR), they are all submitted to the SPARC Portal following a standard scheme established by the SPARC Curation Team, called the SPARC Data Structure (SDS). Inspired by the Brain Imaging Data Structure (BIDS), the SDS has been designed to capture the large variety of data generated by SPARC investigators, who come from all fields of biomedical research. Here we present the rationale and design of the SDS, including a description of the SPARC curation process and the automated tools for complying with the SDS, namely the SDS validator and Software to Organize Data Automatically (SODA) for SPARC. The objective is to provide detailed guidelines for anyone wishing to comply with the SDS. Since the SDS is suitable for any type of biomedical research data, it can be adopted by any group wishing to follow the FAIR data principles for managing their data, even outside of the SPARC consortium. Finally, this manuscript provides a foundational framework that can be used by any organization wishing either to adapt the SDS to the specific needs of their data or to design their own FAIR data sharing scheme from scratch.
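A minimal sketch of what an SDS-style layout check might look like, in Python. The folder and metadata file names below are assumptions loosely based on the description above (a BIDS-inspired layout with primary data folders and dataset-level metadata files); the authoritative rules live in the SDS specification and its validator.

```python
from pathlib import Path

# Assumed top-level layout; the real requirements are defined by the SDS spec.
REQUIRED_DIRS = {"primary"}
OPTIONAL_DIRS = {"source", "derivative", "docs", "code", "protocol"}
REQUIRED_FILES = {"dataset_description.xlsx", "subjects.xlsx", "samples.xlsx"}

def check_dataset(root):
    """Return a list of human-readable problems; empty means the layout looks OK."""
    root_path = Path(root)
    problems = []
    for d in REQUIRED_DIRS:
        if not (root_path / d).is_dir():
            problems.append(f"missing required folder: {d}/")
    for f in REQUIRED_FILES:
        if not (root_path / f).is_file():
            problems.append(f"missing required metadata file: {f}")
    for child in root_path.iterdir():
        if child.is_dir() and child.name not in REQUIRED_DIRS | OPTIONAL_DIRS:
            problems.append(f"unexpected folder: {child.name}/")
    return problems

if __name__ == "__main__":
    for issue in check_dataset("."):
        print(issue)
```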


Author(s):  
Jamin Pelkey

Analyzing visual meaning online and curating digitized images are topics of increasing relevance, but many potential methodologies for doing so remain merely implicit, underthematized, or unexplored. The potential for testing and developing semiotic theory through the exploration of visual data online also requires far more careful attention. In response, this paper provides an integrated, reflexive, Peircean account of two case studies featuring research projects focused on visual data drawn primarily from sources online, relying heavily on Google Image Search as a data collection tool. The first study illustrates the comparative analysis of brand mark logos to test and refine a theory of embodied semiotics involving oppositional relations. The second study illustrates the comparative analysis of images depicting the Tibetan Wheel of Life and Yama the monster of death, in order to test the embodied grounding hypothesis for the semiotic square. Issues of hypothesis formation, research parameters, data collection, database construction, operationalization, coding parameters, open data archiving and related issues are addressed in order to further develop and encourage practices of researching visual semiotics online in the context of Digital Humanities scholarship.

Keywords: Mixed-methods research. Google Image Search. Visual content analysis. Semiotic theory. Semiotic methods. Peircean semiotics.
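As a rough illustration of the database-construction and coding steps mentioned above, the sketch below tallies analytic codes applied to a small set of images collected online. The column names, codes, and URLs are hypothetical placeholders, not the study's actual coding scheme.

```python
import csv, collections, io

# Hypothetical coding sheet for images gathered via an online image search:
# one row per image, with the analytic codes applied to it.
coding_sheet = io.StringIO("""image_url,motif,orientation,code
https://example.org/logo1.png,arrow,upward,marked
https://example.org/logo2.png,arrow,downward,unmarked
https://example.org/logo3.png,circle,none,unmarked
""")

rows = list(csv.DictReader(coding_sheet))

# Tally how often each (motif, code) pairing occurs, the kind of count that
# feeds a comparative analysis of oppositional relations.
tally = collections.Counter((r["motif"], r["code"]) for r in rows)
for (motif, code), n in tally.items():
    print(f"{motif:>6} / {code}: {n}")
```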


Author(s):  
N. Fumai ◽  
C. Collet ◽  
M. Petroni ◽  
K. Roger ◽  
E. Saab ◽  
...  

Abstract: A Patient Data Management System (PDMS) is being developed for use in the Intensive Care Unit (ICU) of the Montreal Children’s Hospital. The PDMS acquires real-time patient data from a network of physiological bedside monitors and facilitates the review and interpretation of these data by presenting them as graphical trends, charts and plots on a color video display. Due to the large amounts of data involved, data storage and data management are important tasks of the PDMS. The data management structure must integrate varied data types and provide database support for different applications, while preserving the real-time acquisition of network data. This paper outlines a new data management structure that is based primarily on the OS/2 Extended Edition relational database. The relational database design is expected to solve the query shortcomings of the previous data management structure, as well as offer support for security and concurrency. The discussion will also highlight future advantages available from a network implementation.
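A minimal sketch of the kind of relational layout and trend query such a PDMS relies on, using SQLite purely for illustration (the paper's system used the OS/2 Extended Edition database); table and column names are illustrative, not the hospital schema.

```python
import sqlite3

# Illustrative relational layout for time-stamped bedside-monitor readings.
db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE patient (patient_id INTEGER PRIMARY KEY, bed TEXT);
CREATE TABLE vital_sign (
    patient_id  INTEGER REFERENCES patient(patient_id),
    recorded_at TEXT,      -- ISO 8601 timestamp from the monitor network
    parameter   TEXT,      -- e.g. 'heart_rate', 'spo2'
    value       REAL
);
""")
db.execute("INSERT INTO patient VALUES (1, 'ICU-04')")
db.executemany("INSERT INTO vital_sign VALUES (1, ?, 'heart_rate', ?)", [
    ("1993-05-01T10:00:00", 128.0),
    ("1993-05-01T10:05:00", 131.0),
    ("1993-05-01T10:10:00", 126.0),
])

# Trend review: hourly averages per parameter for one patient.
for row in db.execute("""
    SELECT substr(recorded_at, 1, 13) AS hour, parameter, AVG(value)
    FROM vital_sign WHERE patient_id = 1
    GROUP BY hour, parameter
"""):
    print(row)
```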


2021 ◽  
Author(s):  
Kerstin Lehnert ◽  
Daven Quinn ◽  
Basil Tikoff ◽  
Douglas Walker ◽  
Sarah Ramdeen ◽  
...  

Management of geochemical data needs to consider the sequence of phases in the lifecycle of these data from field to lab to publication to archive. It also needs to address the large variety of chemical properties measured; the wide range of materials that are analyzed; the different ways in which these materials may be prepared for analysis; the diversity of analytical techniques and instrumentation used to obtain analytical results; and the many ways used to calibrate and correct raw data, normalize them to standard reference materials, and otherwise treat them to obtain meaningful and comparable results. In order to extract knowledge from the data, they are then integrated and compared with other measurements, formatted for visualization, statistical analysis, or model generation, and finally cleaned and organized for publication and deposition in a data repository. Each phase in the geochemical data lifecycle has its specific workflows and metadata that need to be recorded to fully document the provenance of the data so that others can reproduce the results.

An increasing number of software tools are being developed to support the different phases of the geochemical data lifecycle. These include electronic field notebooks, digital lab books, and Jupyter notebooks for data analysis, as well as data submission forms and templates. These tools are mostly disconnected and often require manual transcription or copying and pasting of data and metadata from one tool to another. In an ideal world, these tools would be connected so that field observations gathered in a digital field notebook, such as sample locations and sampling dates, could be seamlessly sent to an IGSN Allocating Agent to obtain a unique sample identifier and a QR code with a single click. The sample metadata would be readily accessible to the lab data management system, which allows researchers to capture information about sample preparation and connects to the instrumentation to capture instrument settings and raw data. The data would then be seamlessly accessed by data reduction software, visualized, and further compared to data from global databases that can be directly accessed. Ultimately, a few clicks would allow the user to format the data for publication and archiving.

Several data systems that support different stages in the lifecycle of samples and sample-based geochemical data have now come together to explore the development of standardized interfaces and APIs and consistent data and metadata schemas to link their systems into an efficient pipeline for geochemical data from the field to the archive. These systems include StraboSpot (www.strabospot.org; data system for digital collection, storage, and sharing of both field and lab data), SESAR (www.geosamples.org; sample registry and allocating agent for IGSN), EarthChem (www.earthchem.org; publisher and repository for geochemical data), Sparrow (sparrow-data.org; data system to organize analytical data and track project- and sample-level metadata), IsoBank (isobank.org; repository for stable isotope data), and MacroStrat (macrostrat.org; collaborative platform for geological data exploration and integration).
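A minimal sketch, in Python, of the hand-off the second paragraph imagines: field metadata is registered to obtain a sample identifier and then merged with lab provenance. The dataclass, function names, and the placeholder IGSN are hypothetical; a real implementation would call the allocating agent's and lab system's actual APIs.

```python
from dataclasses import dataclass, asdict
from typing import Optional

@dataclass
class FieldSample:
    name: str
    latitude: float
    longitude: float
    collected_on: str              # ISO 8601 date from the field notebook
    igsn: Optional[str] = None     # filled in once the sample is registered

def register_sample(sample: FieldSample) -> FieldSample:
    """Placeholder for a call to an IGSN allocating agent (e.g. SESAR).

    A real implementation would submit the sample metadata to the agent's API
    and store the returned identifier; here we only mark the step.
    """
    sample.igsn = "IGSN-PLACEHOLDER"
    return sample

def attach_lab_metadata(sample: FieldSample, prep: str, instrument: str) -> dict:
    """Merge field and lab provenance into one record keyed by the identifier."""
    record = asdict(sample)
    record.update({"preparation": prep, "instrument": instrument})
    return record

if __name__ == "__main__":
    s = register_sample(FieldSample("KL-21-007", 36.1, -112.1, "2021-06-14"))
    print(attach_lab_metadata(s, prep="thin section", instrument="EPMA"))
```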


2014 ◽  
Vol 53 (01) ◽  
pp. 39-46 ◽  
Author(s):  
Z. Guan ◽  
J. Sun ◽  
Z. Wang ◽  
Y. Geng ◽  
W. Xu

Summary

Objectives: In China, deployment of electronic data capture (EDC) and clinical data management systems (CDMS) for clinical research (CR) is in its very early stage, and about 90% of clinical studies collected and submitted clinical data manually. This work aims to build an open metadata schema for Prospective Clinical Research (openPCR) in China based on openEHR archetypes, in order to help Chinese researchers easily create specific data entry templates for registration, study design and clinical data collection.

Methods: The Singapore Framework for Dublin Core Application Profiles (DCAP) is used to develop openPCR, following four steps: defining the core functional requirements and deducing the core metadata items; developing archetype models; defining metadata terms and creating archetype records; and finally developing the implementation syntax.

Results: The core functional requirements are divided into three categories: requirements for research registration, requirements for trial design, and requirements for case report forms (CRF). 74 metadata items are identified and their Chinese authority names are created. The minimum metadata set of openPCR includes 3 documents, 6 sections, 26 top-level data groups, 32 lower data groups and 74 data elements. The top-level container in openPCR is composed of public document, internal document and clinical document archetypes. A hierarchical structure of openPCR is established according to the Data Structure of Electronic Health Record Architecture and Data Standard of China (Chinese EHR Standard). Metadata attributes are grouped into six parts: identification, definition, representation, relation, usage guides, and administration.

Discussion and Conclusion: OpenPCR is an open metadata schema based on research registration standards, standards of the Clinical Data Interchange Standards Consortium (CDISC) and Chinese healthcare-related standards, and is to be made publicly available throughout China. It anticipates future integration of EHR and CR by adopting the data structure and data terms of the Chinese EHR Standard. Archetypes in openPCR are modular models and can be separated, recombined, and reused. The authors recommend that the method used to develop openPCR be referenced by other countries when designing metadata schemas for clinical research. In the next steps, openPCR should be used in a number of CR projects to test its applicability and to continuously improve its coverage. In addition, a metadata schema for research protocols can be developed to structure and standardize protocols, and syntactic interoperability of openPCR with other related standards can be considered.
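A minimal sketch of the document/section/data group/data element nesting described in the Results, expressed as Python dataclasses. The nesting and example names are illustrative only; the real openPCR archetypes are defined with openEHR tooling and carry the six metadata attribute groups listed above.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class DataElement:
    name: str
    data_type: str = "text"

@dataclass
class DataGroup:
    name: str
    elements: List[DataElement] = field(default_factory=list)
    subgroups: List["DataGroup"] = field(default_factory=list)

@dataclass
class Section:
    name: str
    groups: List[DataGroup] = field(default_factory=list)

@dataclass
class Document:
    name: str
    sections: List[Section] = field(default_factory=list)

# Illustrative instance: one registration document with a nested hierarchy.
registration = Document("research registration", [
    Section("study identification", [
        DataGroup("registry entry", [
            DataElement("registration number"),
            DataElement("registration date", "date"),
        ])
    ])
])
print(registration.sections[0].groups[0].elements[0].name)
```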


2019 ◽  
Vol 20 (5) ◽  
pp. 598-616 ◽  
Author(s):  
Ryan Burns ◽  
Grace Wark

Contemporary cities are witnessing momentous shifts in how institutions and individuals produce and circulate data. Despite recent trends claiming that anyone can create and use data, cities remain marked by persistently uneven access and usage of digital technologies. This is the case as well within the emergent phenomenon of the ‘smart city,’ where open data are a key strategy for achieving ‘smartness,’ and increasingly constitute a fundamental dimension of urban life, governance, economic activity, and epistemology. The digital ethnography has extended traditional ethnographic research practices into such digital realms, yet its applicability within open data and smart cities is unclear. The method has tended to overlook the important roles of particular digital artifacts such as the database in structuring and producing knowledge. In this paper, we develop the database ethnography as a rich methodological resource for open data research. This approach centers the database as a key site for the production and materialization of social meaning. The database ethnography draws attention to the ways digital choices and practices—around database design, schema, data models, and so on—leave traces through time. From these traces, we may infer lessons about how phenomena come to be encoded as data and acted upon in urban contexts. Open databases are, in other words, key ways in which knowledges about the smart city are framed, delimited, and represented. More specifically, we argue that open databases limit data types, categorize and classify data to align with technical specifications, reflect the database designer’s episteme, and (re)produce conceptions of the world. We substantiate these claims through a database ethnography of the open data portal for the city of Calgary, in Western Canada.

