Smoothing surfaces and attributes

2014 ◽  
Vol 33 (2) ◽  
pp. 128-129 ◽  
Author(s):  
Matt Hall

Welcome to this new column. Every two months, a geoscientist will present a brief exploration of a geophysical topic. The idea is to take a tour bus around a subject and point out some of the sights, perhaps stopping briefly at an exemplary problem or instructive viewpoint. So far it's useful, but maybe not remarkable. The remarkable thing, I hope, is that the tour will be open access. The tutorials will use only open data sets that anyone can download. There will be no proprietary software. I will strongly encourage the use of Octave, R, or Python, all high-level (that is, easy-to-learn) programming languages for scientists, and the important parts of the code will be shared. I've tried to give a flavor of all this in today's tutorial, using Python. If you are new to Python, IPython is a great place to start: visit ipython.org/install.


2017 ◽  
Author(s):  
Federica Rosetta

Within the Open Science discussions, the current call for “reproducibility” comes from the rising awareness that results presented in research papers are not as easily reproducible as expected, and that some reproduction efforts have even contradicted the original results. In this context, transparency and openness are seen as key components to facilitate good scientific practices, as well as scientific discovery. As a result, many funding agencies now require the deposit of research data sets, institutions are improving training on the application of statistical methods, and journals are beginning to mandate a high level of detail on the methods and materials used. How can researchers be supported and encouraged to provide that level of transparency? An important component is the underlying research data, which is currently often only partly available within the article. At Elsevier we have therefore been working on journal data guidelines which clearly explain to researchers when and how they are expected to make their research data available. Simultaneously, we have also developed the corresponding infrastructure to make it as easy as possible for researchers to share their data in a way that is appropriate in their field. To ensure researchers get credit for the work they do on managing and sharing data, all our journals support data citation in line with the FORCE11 data citation principles, a key step toward addressing the lack of credit and incentives that emerged from the Open Data analysis (Open Data - the Researcher Perspective https://www.elsevier.com/about/open-science/research-data/open-data-report ) recently carried out by Elsevier together with CWTS. Finally, the presentation will also touch upon a number of initiatives to ensure the reproducibility of software, protocols and methods.
With STAR methods, for instance, methods are submitted in a Structured, Transparent, Accessible Reporting format; this approach promotes rigor and robustness, and makes reporting easier for the author and replication easier for the reader.



2019 ◽  
Vol 3 ◽  
pp. 1661 ◽  
Author(s):  
Emmanuel Ruhamyankaka ◽  
Brian P. Brunk ◽  
Grant Dorsey ◽  
Omar S. Harb ◽  
Danica A. Helb ◽  
...  

The concept of open data has been gaining traction as a mechanism to increase data use, ensure that data are preserved over time, and accelerate discovery. While epidemiology data sets are increasingly deposited in databases and repositories, barriers to access still remain. ClinEpiDB was constructed as an open-access online resource for clinical and epidemiologic studies by leveraging the extensive web toolkit and infrastructure of the Eukaryotic Pathogen Database Resources (EuPathDB; a collection of databases covering 170+ eukaryotic pathogens, relevant related species, and select hosts) combined with a unified semantic web framework. Here we present an intuitive point-and-click website that allows users to visualize and subset data directly in the ClinEpiDB browser and immediately explore potential associations. Supporting study documentation aids contextualization, and data can be downloaded for advanced analyses. By facilitating access and interrogation of high-quality, large-scale data sets, ClinEpiDB aims to spur collaboration and discovery that improves global health.



2020 ◽  
Vol 1 (1) ◽  
pp. 31-40
Author(s):  
Hina Afzal ◽  
Arisha Kamran ◽  
Asifa Noreen

Because of rapid technological change, today's market requires a high level of interaction between educators and the freshers entering it. Demand for IT-related jobs is higher than in all other fields. In this paper, we discuss a survival analysis of two parallel programming languages, Python and R, in the market. Data sets are growing large and traditional methods cannot handle them adequately; we therefore applied the latest data mining techniques using the Python and R programming languages. It took several months of effort to gather this amount of data and process it with data mining techniques in Python and R, and the results showed that both languages have had the same rate of growth over the past years.



Algorithms ◽  
2018 ◽  
Vol 11 (12) ◽  
pp. 209 ◽  
Author(s):  
Mauro Pelucchi ◽  
Giuseppe Psaila ◽  
Maurizio Toccu

The Hammer prototype is a query engine for corpora of Open Data that provides users with the concept of blind querying. Since data sets published on Open Data portals are heterogeneous, users wishing to find interesting data sets are blind: queries cannot be fully specified, as in the case of databases. Consequently, the query engine is responsible for rewriting and adapting the blind query to the actual data sets, by exploiting lexical and semantic similarity. The effectiveness of this approach was discussed in our previous works. In this paper, we report our experience in developing the query engine. In the very first version of the prototype, we realized that the implementation of the retrieval technique was too slow, even though the corpora contained only a few thousand data sets. We decided to adopt the Map-Reduce paradigm in order to parallelize the query engine and improve performance. We passed through several versions of the query engine, based either on the Hadoop framework or on the Spark framework. Hadoop and Spark are two very popular frameworks for writing and executing parallel algorithms based on the Map-Reduce paradigm. In this paper, we present our study of the impact of adopting the Map-Reduce approach and its two most famous frameworks to parallelize the Hammer query engine; we discuss various implementations of the query engine, obtained either without significantly rewriting the algorithm or by completely rewriting it to exploit the high-level abstractions provided by Spark. The experimental campaign we performed shows the benefits provided by each studied solution, with the perspective of moving toward Big Data in the future. The lessons we learned are collected and synthesized into behavioral guidelines for developers approaching the problem of parallelizing algorithms by means of Map-Reduce frameworks.
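
To make the Map-Reduce idea in this abstract concrete, the following is a minimal sketch in plain Python, not the Hammer implementation. It assumes a hypothetical toy corpus, uses Jaccard overlap between query terms and data set field names as a stand-in for lexical similarity, and all function names (`jaccard`, `map_phase`, `reduce_phase`) are illustrative. In a framework such as Hadoop or Spark, the map calls would run in parallel across workers.

```python
from functools import reduce


def jaccard(a, b):
    """Jaccard overlap between two collections of terms (0.0 if both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def map_phase(query_terms, dataset):
    """Map step: score one data set against the query, independently of the others."""
    name, fields = dataset
    return [(name, jaccard(query_terms, fields))]


def reduce_phase(left, right, k=2):
    """Reduce step: merge two partial result lists and keep the k best scores."""
    merged = sorted(left + right, key=lambda pair: pair[1], reverse=True)
    return merged[:k]


# Hypothetical corpus: (data set name, field names). Not taken from the paper.
corpus = [
    ("air_quality", ["city", "date", "pm10", "no2"]),
    ("bus_stops", ["city", "stop_id", "lat", "lon"]),
    ("weather", ["city", "date", "temperature"]),
]
query = ["city", "date", "pm10"]

# Each map call touches a single data set, so the calls could be distributed.
mapped = [map_phase(query, dataset) for dataset in corpus]
top = reduce(reduce_phase, mapped, [])
print(top)
```

Because each map call is independent and the top-k merge can be applied to partial results in any grouping, the same two functions could in principle be handed to a parallel framework without changing their logic.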



2020 ◽  
Author(s):  
Scott C Edmunds ◽  
Laurie Goodman

Current practices in scientific publishing are unsuitable for rapidly changing fields and for presenting updatable data sets and software tools. In this regard, and as part of the need to push scientific publishing to match the needs of modern research, we would like to announce the upcoming launch of GigaByte, an online open-access, open data journal that aims to be a new way to publish research following the software paradigm: CODE, RELEASE, FORK, UPDATE and REPEAT. Following on the success of GigaScience in promoting data sharing and reproducibility of research, its new sister, GigaByte, will aim to take this even further. With a focus on short articles, using a questionnaire-style review process, and combining that with the custom built publishing infrastructure from River Valley Technologies, we now have a cutting edge, XML-first publishing platform designed specifically to make the entire publication process easier, quicker, more interactive, and better suited to the speed needed to communicate modern research.



2020 ◽  
Vol 3 ◽  
pp. 1661
Author(s):  
Emmanuel Ruhamyankaka ◽  
Brian P. Brunk ◽  
Grant Dorsey ◽  
Omar S. Harb ◽  
Danica A. Helb ◽  
...  

The concept of open data has been gaining traction as a mechanism to increase data use, ensure that data are preserved over time, and accelerate discovery. While epidemiology data sets are increasingly deposited in databases and repositories, barriers to access still remain. ClinEpiDB was constructed as an open-access online resource for clinical and epidemiologic studies by leveraging the extensive web toolkit and infrastructure of the Eukaryotic Pathogen Database Resources (EuPathDB; a collection of databases covering 170+ eukaryotic pathogens, relevant related species, and select hosts) combined with a unified semantic web framework. Here we present an intuitive point-and-click website that allows users to visualize and subset data directly in the ClinEpiDB browser and immediately explore potential associations. Supporting study documentation aids contextualization, and data can be downloaded for advanced analyses. By facilitating access and interrogation of high-quality, large-scale data sets, ClinEpiDB aims to spur collaboration and discovery that improves global health.



2018 ◽  
Vol 1 (1) ◽  
pp. 6-21 ◽  
Author(s):  
I. K. Razumova ◽  
N. N. Litvinova ◽  
M. E. Shvartsman ◽  
A. Yu. Kuznetsov

Introduction. The paper presents survey results on the awareness and practice of Open Access scholarly publishing among Russian academics. Materials and Methods. We employed methods of statistical analysis of survey results. Materials comprise the processed results of a Russian survey conducted in 2018 and the published results of the latest international surveys. The survey comprised 1383 respondents from 182 organizations. We performed comparative studies of the responses from academics and research institutions as well as different research areas. The study compares results obtained in Russia with the recently published results of surveys conducted in the United Kingdom and Europe. Results. Our findings show that 95% of Russian respondents support open access, 94% agree to post their publications in open repositories and 75% have experience in open access publishing. We did not find any difference in the awareness of and attitude towards open access among seven reference groups. Our analysis revealed a difference in the structure of open access publications between authors from universities and those from research institutes. Discussion and Conclusions. Results reveal a high level of awareness of and support for open access, and successful practice of open access publishing, in the Russian scholarly community. The results for Russia closely resemble those for UK academics. Governmental open access policies and programs would foster the practical realization of open access in Russia.



Sensors ◽  
2021 ◽  
Vol 21 (15) ◽  
pp. 5204
Author(s):  
Anastasija Nikiforova

Nowadays, governments launch open government data (OGD) portals that provide data that can be accessed and used by everyone for their own needs. Although the potential economic value of open (government) data is assessed in millions and billions, not all open data are reused. Moreover, the open (government) data initiative as well as users’ intent for open (government) data are changing continuously and today, in line with IoT and smart city trends, real-time data and sensor-generated data are of higher interest to users. These “smarter” open (government) data are also considered to be one of the crucial drivers for the sustainable economy, and might have an impact on information and communication technology (ICT) innovation and become a creativity bridge in developing a new ecosystem in Industry 4.0 and Society 5.0. The paper inspects OGD portals of 60 countries in order to understand the correspondence of their content to the Society 5.0 expectations. The paper reports on the extent to which countries provide these data, focusing on some open (government) data success facilitating factors for both the portal in general and data sets of interest in particular. The presence of “smarter” data, their level of accessibility, availability, currency and timeliness, as well as support for users, are analyzed. A list of the most competitive countries by data category is provided. This makes it possible to understand which OGD portals respond to users’ needs and to the Industry 4.0 and Society 5.0 demands for opening and updating data for further potential reuse, which is essential in the digital data-driven world.



2021 ◽  
Vol 10 (4) ◽  
pp. 251
Author(s):  
Christina Ludwig ◽  
Robert Hecht ◽  
Sven Lautenbach ◽  
Martin Schorcht ◽  
Alexander Zipf

Public urban green spaces are important for the urban quality of life. Still, comprehensive open data sets on urban green spaces are not available for most cities. As open and globally available data sets, the potential of Sentinel-2 satellite imagery and OpenStreetMap (OSM) data for urban green space mapping is high but limited due to their respective uncertainties. Sentinel-2 imagery cannot distinguish public from private green spaces and its spatial resolution of 10 m fails to capture fine-grained urban structures, while in OSM green spaces are not mapped consistently and with the same level of completeness everywhere. To address these limitations, we propose to fuse these data sets under explicit consideration of their uncertainties. The Sentinel-2 derived Normalized Difference Vegetation Index was fused with OSM data using the Dempster–Shafer theory to enhance the detection of small vegetated areas. The distinction between public and private green spaces was achieved using a Bayesian hierarchical model and OSM data. The analysis was performed based on land use parcels derived from OSM data and tested for the city of Dresden, Germany. The overall accuracy of the final map of public urban green spaces was 95% and was mainly influenced by the uncertainty of the public accessibility model.
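
The fusion step named in this abstract rests on Dempster's rule of combination, which can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the two mass functions stand for hypothetical evidence from an NDVI source and an OSM source for a single parcel, the numeric masses are invented for the example, and the names `combine`, `GREEN`, `GREY` and `EITHER` are illustrative.

```python
from itertools import product


def combine(m1, m2):
    """Dempster's rule of combination for two mass functions.

    Each mass function maps a frozenset of hypotheses to a mass, and the
    masses of one function sum to 1. Products of masses whose focal sets do
    not intersect count as conflict and are normalized away.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        overlap = a & b
        if overlap:
            combined[overlap] = combined.get(overlap, 0.0) + mass_a * mass_b
        else:
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; the rule is undefined")
    return {focal: mass / (1.0 - conflict) for focal, mass in combined.items()}


GREEN = frozenset({"green"})
GREY = frozenset({"grey"})
EITHER = GREEN | GREY  # mass assigned here expresses the source's uncertainty

# Hypothetical masses for one parcel (assumed values, not from the paper).
m_ndvi = {GREEN: 0.6, GREY: 0.1, EITHER: 0.3}
m_osm = {GREEN: 0.5, GREY: 0.2, EITHER: 0.3}

fused = combine(m_ndvi, m_osm)
print(fused)
```

Mass placed on the full set `EITHER` is how each source states what it does not know; when two sources agree, the combined belief in `GREEN` ends up higher than either source gave on its own.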


