open source data
Recently Published Documents


TOTAL DOCUMENTS: 342 (last five years: 185)

H-INDEX: 17 (last five years: 4)

2022 ◽  
pp. 146906672110733
Author(s):  
Sean Sebastian Hughes ◽  
Marcus M. K. Hughes ◽  
Rasmus Voersaa Jonsbo ◽  
Carsten Uhd Nielsen ◽  
Frants Roager Lauritsen ◽  
...  

Beer is a complex mix of more than 7700 compounds, around 800 of which are volatile. While GC-MS has been actively employed in the analysis of the volatome of beer, this method is challenged by the complex nature of the sample. Herein, we explored the possibility of using membrane-inlet mass spectrometry (MIMS) coupled to KNIME to characterize local Danish beers. KNIME (the Konstanz Information Miner) is a free, open-source data processing software that comes with several prebuilt nodes which, when organized, form data processing workflows allowing swift analysis of data, with outputs that can be visualized in the desired format. KNIME has been shown to be promising for automating the processing of large datasets and requires very little computing power; in fact, most of the computations can be carried out on a regular PC. Herein, we have utilized a KNIME workflow for visualization of MIMS data to understand the global volatome of beers. Feature identification is not yet possible, but by combining MIMS with a KNIME workflow we were able to distinguish beers from different micro-breweries located in Denmark, laying the foundation for the use of MIMS in future analysis of the beer volatome.
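The MIMS-plus-KNIME approach described above boils down to reducing each beer's mass-spectral fingerprint to a few components and checking whether samples group by brewery. A minimal sketch of that idea in NumPy, with synthetic intensity data standing in for the paper's MIMS measurements:

```python
import numpy as np

# Synthetic stand-in for a MIMS intensity matrix: rows are beer samples,
# columns are m/z channels. Two "breweries" with slightly different volatomes.
rng = np.random.default_rng(0)
brewery_a = rng.normal(loc=1.0, scale=0.1, size=(5, 50))
brewery_b = rng.normal(loc=2.0, scale=0.1, size=(5, 50))
X = np.vstack([brewery_a, brewery_b])

# Centre the data and project onto the first two principal components via SVD,
# the kind of dimensionality reduction a KNIME workflow node might apply.
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
scores = Xc @ Vt[:2].T

# The two groups separate along the first principal component.
separation = abs(scores[:5, 0].mean() - scores[5:, 0].mean())
print(separation > 1.0)
```

In a KNIME workflow the same steps would be chained graphically (file reader, normalizer, PCA, scatter plot); the point here is only the underlying computation.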


2022 ◽  
pp. 30-57
Author(s):  
Richard S. Segall

The purpose of this chapter is to illustrate how artificial intelligence (AI) technologies have been used for COVID-19 detection and analysis. Specifically, the use of neural networks (NN) and machine learning (ML) is described, along with which countries are creating these techniques and how they are being used for COVID-19 diagnosis and detection. Illustrations of multi-layer convolutional neural networks (CNN), recurrent neural networks (RNN), and deep neural networks (DNN) are provided to show how these are used for COVID-19 detection and prediction. A summary of big data analytics for COVID-19 and some available COVID-19 open-source data sets and repositories, with their characteristics for research and analysis, are also provided. An example is also shown of AI and NN applications using real-time COVID-19 data.
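One building block the chapter's CNN illustrations rely on is a 2D convolution followed by a nonlinearity. A minimal NumPy sketch of that step (the image and kernel are toy values, not taken from any model in the chapter):

```python
import numpy as np

# Valid (no-padding) 2D convolution followed by a ReLU: the basic feature
# extraction step in CNNs used for image-based COVID-19 detection.
def conv2d(image, kernel):
    kh, kw = kernel.shape
    out = np.empty((image.shape[0] - kh + 1, image.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.arange(25, dtype=float).reshape(5, 5)   # toy 5x5 "scan"
kernel = np.array([[-1., 0., 1.]] * 3)             # horizontal-gradient detector
features = np.maximum(conv2d(image, kernel), 0)    # ReLU activation
print(features.shape)  # (3, 3)
```

A real detection model stacks many such layers (with learned kernels) and ends in a classifier head; frameworks compute the convolution identically, just vectorized and on GPU.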


2021 ◽  
Vol 26 (2) ◽  
pp. 31-42
Author(s):  
Kabi Raj Paudyal ◽  
Krishna Chandra Devkota ◽  
Binod Prasad Parajuli ◽  
Puja Shakya ◽  
Preshika Baskota

This paper explores openly available geo-spatial and earth observation data to understand landslide risk in data-scarce rural areas of Nepal. It explores the application of open-source data and analytical models to inform future landslide research. The first step of the procedure is a review of global open datasets, literature and case studies relevant to landslide research. The second step is a case study in mountainous municipalities of Nepal, where we tested the identified open-source data and models to produce landslide susceptibility maps. Past studies and experience show that the major potential landslide sites in Nepal are highly concentrated in geologically weak areas such as active fault regions, shear zones, fold axes and unfavourable lithological settings. Triggering factors such as concentrated precipitation, frequent earthquakes and haphazard infrastructure development on marginally stable mountain slopes have posed serious landslide problems, mostly in these geologically weak regions. In this context, openly available geo-spatial datasets can provide baseline information for exploring the landslide hazard scenario in the data-scarce areas of Nepal. This research used the available open-source data to produce a landslide susceptibility map of the Bithadchir Rural Municipality in Bajhang District and the Budiganga Municipality in Bajura District of the Sudurpaschim Province of Nepal. We used qualitative analysis to evaluate the parameters and assess landslide susceptibility; the result was classified into five susceptibility zones: Very High, High, Moderate, Low, and Very Low. Slope and aspect were identified as the major determinants for the assessment. This approach is applicable specifically for preliminary investigation in data-scarce regions using open data sources. Furthermore, the result can be used to plan and prioritize effective disaster risk reduction strategies.
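A susceptibility map of this kind is typically produced by overlaying weighted factor rasters and binning the result into classes. A hedged NumPy sketch, where the factor grids, weights, and class breaks are illustrative rather than the study's actual parameters:

```python
import numpy as np

# Toy factor rasters normalised to [0, 1]; in practice these would come from
# open DEM and geology data for the study area.
rng = np.random.default_rng(1)
slope = rng.uniform(0, 1, (4, 4))
aspect = rng.uniform(0, 1, (4, 4))
geology = rng.uniform(0, 1, (4, 4))

# Weighted overlay; slope and aspect weigh heavily, echoing the paper's
# finding that they were the major determinants. Weights are illustrative.
susceptibility = 0.4 * slope + 0.3 * aspect + 0.3 * geology

# Bin into the five zones used in the study.
labels = ["Very Low", "Low", "Moderate", "High", "Very High"]
zones = np.digitize(susceptibility, bins=[0.2, 0.4, 0.6, 0.8])
print(labels[int(zones.max())])
```

In a GIS environment the same overlay is done with raster calculator tools; the array arithmetic above is the core of it.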


2021 ◽  
Author(s):  
Faisal Khalil ◽  
Prof. Dr. Gordon Pipa

Abstract
This study uses the transformer architecture of artificial neural networks to generate artificial business text for a given topic or theme. The aim of the study is to augment business report writing, and general business writing, with the help of generative pretrained transformer (GPT) networks. The main focus of the study is to provide a practical use case for GPT models with the help of big data. Our model has 355 million parameters and was trained for three months on GPU-enabled devices using 2.3 billion text tokens (now available as open-source data). The text tokens were collected through rigorous preprocessing: subreddits of Fortune 500 companies and industries on the US-based social news aggregation portal "Reddit" were shortlisted, millions of user submissions over five years were parsed to collect their URLs, and the 1.8 million working URLs were scrutinized. Business text was then parsed, cleaned, and converted into word embeddings. The results show that both models, conditional interactive and random sampling, generate text paragraphs that are grammatically accurate and stick to the given topic.
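The two generation modes compared in the study, conditional interactive and random sampling, differ chiefly in how the next token is drawn from the model's output distribution. A toy sketch of that contrast (the vocabulary and logits are invented; the real model conditions on a user-supplied prompt):

```python
import numpy as np

# Toy next-token distribution over a tiny business vocabulary.
vocab = ["revenue", "growth", "quarter", "market"]
logits = np.array([2.0, 1.0, 0.5, 0.1])

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

probs = softmax(logits)

# Deterministic choice: always take the most likely token.
greedy_token = vocab[int(np.argmax(probs))]

# Random sampling: draw a token according to the distribution, so repeated
# generations of the same prompt can differ.
rng = np.random.default_rng(42)
sampled_token = vocab[rng.choice(len(vocab), p=probs)]

print(greedy_token)  # prints "revenue"
```

In a full GPT this step is repeated token by token, with the chosen token appended to the context before the next forward pass.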


2021 ◽  
Author(s):  
Philipp S. Sommer

psyplot (https://psyplot.github.io) is an open-source data visualization framework that integrates rich computational and mathematical software packages (such as xarray and matplotlib) into a flexible framework for visualization. It differs from most visual analytics software in that it focuses on extensibility, in order to flexibly tackle the different types of analysis questions that arise in pioneering research. The design of the framework's high-level API enables simple, standardized usage from the command line, Python scripts or Jupyter notebooks. A modular plugin framework enables flexible development that can potentially go in many different directions. The additional enhancement with a graphical user interface (GUI) makes it the only visualization framework that can be handled from the convenient command line or scripts, as well as via point-and-click handling. It additionally allows building further desktop applications on top of the existing framework.

In this presentation, I will show the main functionalities of psyplot, with a special focus on the visualization of unstructured grids (such as the ICON model by the German Weather Service (DWD)) and the usage of psyplot on the HPC facilities of the DKRZ (mistral, jupyterhub, remote desktop, etc.). My demonstration will cover the basic structure of the psyplot framework and how to use psyplot in Python scripts (and Jupyter notebooks). I will give a quick demo of the psyplot GUI and psy-view, an ncview-like interface built upon psyplot, and talk about different features such as reusing plot configurations and exporting figures.


2021 ◽  
Vol 9 ◽  
Author(s):  
Timothy J. J. Inglis ◽  
Benjamin McFadden ◽  
Anthony Macali

Background: Many parts of the world that succeeded in suppressing epidemic coronavirus spread in 2020 have been caught out by recent changes in the transmission dynamics of SARS-CoV-2. Australia's early success in suppressing COVID-19 resulted in lengthy periods without community transmission. However, a slow vaccine rollout leaves this geographically isolated population vulnerable to leakage of new variants from quarantine, which requires internal travel restrictions, disruptive lockdowns, contact tracing and testing surges.

Methods: To assist long-term sustainment of limited public health resources, we sought a method of continuous, real-time COVID-19 risk monitoring that could be used to alert non-specialists to the level of epidemic risk on a sub-national scale. After an exploratory data assessment, we selected four COVID-19 metrics used by public health in their periodic threat assessments, applied a business continuity matrix and derived a numeric indicator, the COVID-19 Risk Estimate (CRE), to generate a daily spot CRE, a 3-day net rise and a 7-day rolling average. We used open-source data updated daily from all Australian states and territories to monitor the CRE for over a year.

Results: Upper and lower thresholds were established for the CRE 7-day rolling average, corresponding to risk of sustained and potential outbreak propagation, respectively. These CRE thresholds were used in a real-time map of Australian COVID-19 risk estimate distribution by state and territory.

Conclusions: The CRE toolkit we developed complements other COVID-19 risk management techniques and provides an early indication of emerging threats to business continuity.
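The three CRE views described in the Methods (a daily spot value, a 3-day net rise, and a 7-day rolling average compared to thresholds) can be sketched with pandas. The daily CRE values and the threshold below are invented; the real CRE combines four public-health metrics not shown here:

```python
import pandas as pd

# Invented daily CRE values for a single jurisdiction.
daily_cre = pd.Series([1.0, 1.2, 1.5, 2.1, 2.4, 2.2, 2.8, 3.1, 3.0, 3.4])

spot = daily_cre.iloc[-1]                              # today's spot CRE
net_rise_3d = daily_cre.iloc[-1] - daily_cre.iloc[-4]  # 3-day net rise
rolling_7d = daily_cre.rolling(window=7).mean()        # 7-day rolling average

# Compare the smoothed series against an (illustrative) upper threshold to
# flag risk of sustained outbreak propagation.
UPPER_THRESHOLD = 2.5
alert = rolling_7d.iloc[-1] > UPPER_THRESHOLD
print(alert)
```

The rolling average damps day-to-day reporting noise, which is why thresholds are applied to it rather than to the spot value.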


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Georgios Pavlidis

Purpose
This paper aims to critically examine whether it is timely and actionable for the European Union (EU) to adopt a global sanctions regime against corruption, and how such a regime can be designed to maximise its efficiency. This paper argues that developing such a dedicated framework is necessary, feasible and supportive of the international fight against corruption and of efforts to enhance the recovery of corruption proceeds.

Design/methodology/approach
This paper draws on reports, legislation, legal scholarship and other open-source data on global sanctions against corruption and the recovery of corruption proceeds.

Findings
This paper argues in favour of a dedicated global sanctions regime against corruption, which is necessary to mitigate significant risks for the EU internal market.

Originality/value
To the best of the authors’ knowledge, this study is one of the first to examine recent legislative developments, such as the EU Global Human Rights Sanctions Regime and the UK Global Anti-Corruption Sanctions Regulations, and the possible development of a dedicated EU global sanctions regime against corruption with strong asset recovery components.


2021 ◽  
Author(s):  
Chinchu C.

Data analysis is a crucial task in knowledge creation in the social sciences. Free resources for data analysis give researchers greater freedom and make the research process more accessible and democratic. This article lists some free software packages that can perform basic and advanced statistical data analysis tasks. Some packages that can perform other tasks, such as text mining, are also introduced. Ease of use and functionality were the major criteria for selecting these packages.


2021 ◽  
Author(s):  
Elisabetta Vallarino ◽  
Sara Sommariva ◽  
Dario Arnaldi ◽  
Francesco Famà ◽  
Michele Piana ◽  
...  

Abstract
A classic approach to estimating the individual theta-to-alpha transition frequency requires two electroencephalographic (EEG) recordings, one acquired in resting-state condition and one showing an alpha de-synchronisation due, e.g., to task execution. This translates into longer recording sessions that may be cumbersome in studies involving patients. Moreover, incomplete de-synchronisation of the alpha rhythm may compromise the final estimate of the transition frequency. Here we present transfreq, a Python library that allows computation of the transition frequency from resting-state data by clustering the spectral profiles at different EEG channels based on their content in the alpha and theta bands. We first provide an overview of the transfreq core algorithm and of the software architecture. Then we demonstrate its feasibility and robustness across different experimental setups, on a publicly available EEG data set and on in-house recordings. Detailed documentation of transfreq and the code for reproducing the analysis of the paper with the open-source data set are available online at https://elisabettavallarino.github.io/transfreq/.
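The core idea, locating the frequency where the dominant rhythm switches between the theta and alpha bands, can be illustrated on a toy spectrum. This is not the transfreq API, just a sketch of the concept with a synthetic power spectrum that has a theta peak near 5 Hz and an alpha peak at 10 Hz:

```python
import numpy as np

# Synthetic resting-state power spectrum on a 0.1 Hz grid.
freqs = np.linspace(1, 30, 291)
spectrum = (np.exp(-((freqs - 10) ** 2) / 2)          # alpha peak at 10 Hz
            + 0.5 * np.exp(-((freqs - 5) ** 2) / 2))  # theta peak at 5 Hz

theta = (freqs >= 4) & (freqs < 8)    # theta band, 4-8 Hz
alpha = (freqs >= 8) & (freqs <= 13)  # alpha band, 8-13 Hz

# Crude transition estimate: the frequency of the spectral trough between
# the theta and alpha peaks.
theta_peak = freqs[theta][np.argmax(spectrum[theta])]
alpha_peak = freqs[alpha][np.argmax(spectrum[alpha])]
between = (freqs >= theta_peak) & (freqs <= alpha_peak)
transition = freqs[between][np.argmin(spectrum[between])]
print(round(transition, 1))
```

transfreq refines this idea by clustering channels according to their theta/alpha content rather than relying on a single trough in one spectrum.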


2021 ◽  
Vol 4 ◽  
pp. 1-6
Author(s):  
Márton Pál ◽  
Gáspár Albert

Abstract. Geodiversity is the natural range of elements in the physical environment. The relationships, properties, and systems of geoscientific features have an impact not only on the natural world but also on cultural and societal aspects of life. Geodiversity can be considered a quantitative variable that is unevenly distributed across the world. This spatial variability helps to locate areas with a high degree of geodiversity. Such areas can be the basis of further nature protection and geotourism: high geodiversity usually means higher scientific, cultural, and ecological value in an area. We present a GIS-based workflow in which we collect, evaluate, and visualize geoscientific variables to provide information on the geodiversity of the Bakony–Balaton UNESCO Global Geopark in Hungary. By using mainly freely accessible data and an open-source GIS environment, we aim to develop a method that can be applied in many areas of the world. The evaluation is based on the determination of five sub-indices per unit area, which are related to the elements of geodiversity: geology, relief, hydrology, soil, palaeontology, and mineralogy. The geodiversity index is the sum of the sub-indices. The current tourism potential is mainly found in the high-geodiversity regions: the Balaton Uplands, the Tapolca Basin, the Káli Basin, and the Bakony Mountains. The results show that the current geopark infrastructure is in accordance with the geodiversity, but it took several years to reach this state. However, new geoparks are established every year and their infrastructure is yet to be planned. The method we apply helps in this process by using open-source data in the assessment and provides a workflow for areas that have not been evaluated before.
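The index calculation itself is a per-cell sum of sub-indices over the study grid. A toy NumPy sketch, where the grid size, the score range, and the grouping of palaeontology and mineralogy into a single sub-index are assumptions for illustration only:

```python
import numpy as np

# Toy sub-index rasters on a 3x3 study grid, each cell scored 0-4.
rng = np.random.default_rng(7)
sub_indices = {
    name: rng.integers(0, 5, size=(3, 3))
    for name in ["geology", "relief", "hydrology", "soil", "palaeo_mineralogy"]
}

# Geodiversity index = sum of the sub-indices per unit area (grid cell).
geodiversity = sum(sub_indices.values())
print(geodiversity.shape)  # (3, 3)
```

In the GIS workflow each sub-index raster would be derived from its own open dataset (geological map, DEM, hydrological network, etc.) before the overlay sum.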

