Use(less) Data: Discovery, COUNTER, and Music Databases

2018 ◽  
Vol 21 (4) ◽  
pp. 171-184 ◽  
Author(s):  
Angela L. Pratesi
2017 ◽  
Vol 8 (3) ◽  
pp. 1-18 ◽  
Author(s):  
Mohamed Elhadi Rahmani ◽  
Abdelmalek Amine ◽  
Reda Mohamed Hamou

Bio-inspired algorithms implement solutions found in nature to tackle hard problems, the so-called NP problems. A seismic hazard is the probability that an earthquake will occur in a given geographic area, within a given window of time, and with ground motion intensity exceeding a given threshold. Seismic hazard prediction is one of the fields where data mining plays an important role. This paper presents a new bio-inspired algorithm, motivated by the echolocation behavior of bats, for predicting seismic hazard states in coal mines from previously recorded data. It is a distance-calculation-based approach. The results were very satisfactory and encourage us to continue working on this approach. The implementation of the algorithm touches three fields of study: data discovery (so-called data mining), bio-inspired techniques, and seismic hazard prediction.
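
The abstract describes the method only as a distance-calculation-based approach and gives no further detail. As a rough illustration of that general idea, and not of the authors' bat-inspired algorithm, the sketch below labels a new reading with the state of the closest previously recorded reading, using plain Euclidean distance. The function names, the feature tuples, and the "hazardous" / "non-hazardous" labels are all assumptions made for this example.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def predict_state(sample, records):
    """Return the label of the recorded reading closest to `sample`.

    `records` is a list of (features, label) pairs. This is an ordinary
    nearest-neighbour rule, swapped in for illustration; it is NOT the
    bat-inspired algorithm from the paper.
    """
    _, nearest_label = min(records, key=lambda rec: euclidean(sample, rec[0]))
    return nearest_label


# Hypothetical, made-up readings: (feature vector, recorded state).
history = [
    ((0.0, 0.0), "non-hazardous"),
    ((10.0, 10.0), "hazardous"),
]
state = predict_state((1.0, 2.0), history)
```

A nearest-neighbour rule is about the simplest classifier that rests on distance calculation alone, which is why it is used here as a stand-in.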


Author(s):  
Raul Castro Fernandez ◽  
Essam Mansour ◽  
Abdulhakim A. Qahtan ◽  
Ahmed Elmagarmid ◽  
Ihab Ilyas ◽  
...  

2021 ◽  
Author(s):  
Andrii Salnikov ◽  
Balázs Kónya

Distributed e-Infrastructure is a key component of modern BIG Science. Service discovery in e-Science environments, such as the Worldwide LHC Computing Grid (WLCG), is a crucial functionality that relies on a service registry. In this paper we re-formulate the requirements for the service endpoint registry based on our more than 10 years of experience with many systems designed or used within the WLCG e-Infrastructure. To satisfy those requirements, the paper proposes a novel idea: to use the existing, well-established Domain Name System (DNS) infrastructure together with a suitable data model as a service endpoint registry. The presented ARC Hierarchical Endpoints Registry (ARCHERY) system consists of a minimalistic data model representing services and their endpoints within e-Infrastructures, a rendering of the data model embedded into DNS records, and a lightweight software layer for DNS-record management and client-side data discovery. Our approach for the ARCHERY registry required minimal software development and inherits all the benefits of one of the most reliable distributed information discovery sources of the internet, the DNS infrastructure. In particular, deployment, management and operation of ARCHERY rely fully on DNS. Results of ARCHERY deployment use-cases are provided together with a performance analysis.
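
To make the idea of a data model rendered into DNS records more concrete, here is a minimal sketch of how a service endpoint might be written as a TXT-style string and read back by a client. The `url=` / `type=` field layout, the `Endpoint` class, the example names, and the in-memory `zone` dictionary standing in for real DNS lookups are all assumptions made for illustration; the abstract does not specify the actual ARCHERY record format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoint:
    """Hypothetical minimal model: where a service lives and what it is."""
    url: str
    service_type: str


def render_txt(endpoint):
    """Render an endpoint as a space-separated key=value TXT-style string."""
    return f"url={endpoint.url} type={endpoint.service_type}"


def parse_txt(rdata):
    """Parse a string produced by render_txt back into an Endpoint."""
    # Split each token on the first '=' only, so URLs containing '=' survive.
    fields = dict(token.split("=", 1) for token in rdata.split())
    return Endpoint(url=fields["url"], service_type=fields["type"])


def discover(name, zone, wanted_type=None):
    """Client-side discovery against `zone`, a dict mapping a DNS name to
    its TXT strings (a stand-in for a real resolver query)."""
    endpoints = [parse_txt(rdata) for rdata in zone.get(name, [])]
    if wanted_type is not None:
        endpoints = [e for e in endpoints if e.service_type == wanted_type]
    return endpoints


# Hypothetical zone content; names and types are invented for the example.
zone = {
    "registry.example.org": [
        render_txt(Endpoint("https://ce1.example.org/submit", "compute")),
        render_txt(Endpoint("https://se1.example.org/data?v=2", "storage")),
    ]
}
compute_endpoints = discover("registry.example.org", zone, wanted_type="compute")
```

In a real deployment the dictionary lookup would be replaced by a DNS query, so caching and replication would come from the DNS infrastructure rather than from registry-specific software.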


2021 ◽  
Author(s):  
Sarah Bauermeister ◽  
Joshua R Bauermeister ◽  
R Bridgman ◽  
C Felici ◽  
M Newbury ◽  
...  

Research-ready data (data curated to a defined standard) increases scientific opportunity and rigour by integrating the data environment. The development of research platforms has highlighted the value of research-ready data, particularly for multi-cohort analyses. Following user consultation, a standard data model (C-Surv), optimised for data discovery, was developed using data from 12 Dementias Platform UK (DPUK) population and clinical cohort studies. The model uses a four-tier nested structure based on 18 data themes selected according to user behaviour or technology. Standard variable naming conventions are applied to uniquely identify variables within the context of longitudinal studies. The data model was used to develop a harmonised dataset for 11 cohorts. This dataset populated the Cohort Explorer data discovery tool for assessing the feasibility of an analysis prior to making a data access request. It was concluded that developing and applying a standard data model (C-Surv) for research cohort data is feasible and useful.


2018 ◽  
Author(s):  
Ge Peng ◽  
Anna Milan ◽  
Nancy A. Ritchey ◽  
Robert P. Partee ◽  
Sonny Zinn ◽  
...  

Assessing the stewardship maturity of individual datasets is an essential part of ensuring and improving the way datasets are documented, preserved, and disseminated to users. It is a critical step towards meeting U.S. federal regulations, organizational requirements, and user needs. However, it is challenging to do so consistently and quantifiably. The Data Stewardship Maturity Matrix (DSMM), developed jointly by NOAA’s National Centers for Environmental Information (NCEI) and the Cooperative Institute for Climate and Satellites–North Carolina (CICS-NC), provides a uniform framework for consistently rating stewardship maturity of individual datasets in nine key components: preservability, accessibility, usability, production sustainability, data quality assurance, data quality control/monitoring, data quality assessment, transparency/traceability, and data integrity. So far, the DSMM has been applied to over 900 individual datasets that are archived and/or managed by NCEI, in support of NOAA’s OneStop Data Discovery and Access Framework Project. As a part of the OneStop-ready process, tools, implementation guidance, workflows, and best practices have been developed to assist the application of the DSMM and are described in this paper. The DSMM ratings are also consistently captured in the ISO standard-based dataset-level quality metadata and citable quality descriptive information documents, which serve as interoperable quality information to both machine and human end-users. These DSMM implementation and integration workflows and best practices could be adopted by other data management and stewardship projects or adapted for applications of other maturity assessment models.
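
One way to picture a consistent, machine-readable rating across the nine components is a small validated record, sketched below. The component names come from the abstract; the integer 1 to 5 rating scale, the function name, and the dictionary layout are assumptions made for this example and are not taken from the DSMM tooling or the ISO-based metadata the paper describes.

```python
# The nine key components listed in the abstract.
COMPONENTS = (
    "preservability",
    "accessibility",
    "usability",
    "production sustainability",
    "data quality assurance",
    "data quality control/monitoring",
    "data quality assessment",
    "transparency/traceability",
    "data integrity",
)

# Assumed rating scale for illustration only.
MIN_LEVEL, MAX_LEVEL = 1, 5


def validate_ratings(ratings):
    """Check that `ratings` covers exactly the nine components with
    in-range integer levels, and return them in canonical order."""
    missing = [c for c in COMPONENTS if c not in ratings]
    unknown = [c for c in ratings if c not in COMPONENTS]
    if missing or unknown:
        raise ValueError(f"missing={missing}, unknown={unknown}")
    for component, level in ratings.items():
        if not isinstance(level, int) or not MIN_LEVEL <= level <= MAX_LEVEL:
            raise ValueError(f"bad level for {component!r}: {level!r}")
    return {c: ratings[c] for c in COMPONENTS}


# Hypothetical ratings for one dataset (values are made up).
example = validate_ratings({c: 3 for c in COMPONENTS})
```

Rejecting incomplete or out-of-range records at the point of entry is what keeps ratings comparable from one dataset to the next.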

