generic data Latest Research Papers

Parcellation of whole brain tractogram is a critical step to study brain white matter structures and connectivity patterns. The existing methods based on supervised classification of streamlines into predefined streamline bundle types are not designed to explore sub-bundle structures, and methods with manually designed features are expensive to compute streamline-wise similarities. To resolve these issues, we proposed a novel atlas-free method that learnt a latent space using a deep recurrent autoencoder which efficiently embedded any lengths of streamlines to fixed-size feature vectors, namely, streamline embeddings, and enabled tractogram parcellation via unsupervised clustering in the latent space. The method is evaluated on the ISMRM 2015 tractography challenge dataset, and shows the ability to discriminate major bundles with unsupervised clustering and query streamline based on similarity. The learnt latent representations of streamlines and bundles also open the possibility of quantitatively studying any granularities of sub-bundle structures with generic data mining techniques.

Download Full-text

pyam: Analysis and visualisation of integrated assessment and macro-energy scenarios

Open Research Europe ◽

10.12688/openreseurope.13633.2 ◽

2021 ◽

Vol 1 ◽

pp. 74

Author(s):

Daniel Huppmann ◽

Matthew J. Gidden ◽

Zebedee Nicholls ◽

Jonas Hörsch ◽

Robin Lamboll ◽

...

Keyword(s):

Integrated Assessment ◽

Energy Transition ◽

Scientific Software ◽

Data Formats ◽

Code Base ◽

Generic Data ◽

Domain Expertise ◽

Novice Users ◽

Annual Resolution ◽

Python Package

The open-source Python package pyam provides a suite of features and methods for the analysis, validation and visualization of reference data and scenario results generated by integrated assessment models, macro-energy tools and other frameworks in the domain of energy transition, climate change mitigation and sustainable development. It bridges the gap between scenario processing and visualisation solutions that are "hard-wired" to specific modelling frameworks and generic data analysis or plotting packages. The package aims to facilitate reproducibility and reliability of scenario processing, validation and analysis by providing well-tested and documented methods for working with timeseries data in the context of climate policy and energy systems. It supports various data formats, including sub-annual resolution using continuous time representation and "representative timeslices". The pyam package can be useful for modelers generating scenario results using their own tools as well as researchers and analysts working with existing scenario ensembles such as those supporting the IPCC reports or produced in research projects. It is structured in a way that it can be applied irrespective of a user's domain expertise or level of Python knowledge, supporting experts as well as novice users. The code base is implemented following best practices of collaborative scientific-software development. This manuscript describes the design principles of the package and the types of data which can be handled. The usefulness of pyam is illustrated by highlighting several recent applications.

Download Full-text

An Umbrella Converse for Data Exchange: Applied to Caching, Computing, and Shuffling

Entropy ◽

10.3390/e23080985 ◽

2021 ◽

Vol 23 (8) ◽

pp. 985

Author(s):

Prasad Krishnan ◽

Lakshmi Natarajan ◽

Vadlamani Lalitha

Keyword(s):

Distributed Computing ◽

Data Exchange ◽

Specific Problem ◽

Independence Number ◽

Index Coding ◽

Communication Problems ◽

Generic Data ◽

General Data ◽

Problem Formulations ◽

Coded Caching

The problem of data exchange between multiple nodes with storage and communication capabilities models several current multi-user communication problems like Coded Caching, Data Shuffling, Coded Computing, etc. The goal in such problems is to design communication schemes which accomplish the desired data exchange between the nodes with the optimal (minimum) amount of communication load. In this work, we present a converse to such a general data exchange problem. The expression of the converse depends only on the number of bits to be moved between different subsets of nodes, and does not assume anything further specific about the parameters in the problem. Specific problem formulations, such as those in Coded Caching, Coded Data Shuffling, and Coded Distributed Computing, can be seen as instances of this generic data exchange problem. Applying our generic converse, we can efficiently recover known important converses in these formulations. Further, for a generic coded caching problem with heterogeneous cache sizes at the clients with or without a central server, we obtain a new general converse, which subsumes some existing results. Finally we relate a “centralized” version of our bound to the known generalized independence number bound in index coding and discuss our bound’s tightness in this context.

Download Full-text

pyam: Analysis and visualisation of integrated assessment and macro-energy scenarios

Open Research Europe ◽

10.12688/openreseurope.13633.1 ◽

2021 ◽

Vol 1 ◽

pp. 74

Author(s):

Daniel Huppmann ◽

Matthew J. Gidden ◽

Zebedee Nicholls ◽

Jonas Hörsch ◽

Robin Lamboll ◽

...

Keyword(s):

Integrated Assessment ◽

Energy Transition ◽

Scientific Software ◽

Data Formats ◽

Code Base ◽

Generic Data ◽

Unit Conversion ◽

Annual Resolution ◽

Python Package ◽

Change Mitigation

The open-source Python package pyam provides a suite of features and methods for the analysis, validation and visualization of reference data and scenario results generated by integrated assessment models, macro-energy tools and other frameworks in the domain of energy transition, climate change mitigation and sustainable development. It bridges the gap between scenario processing and visualisation solutions that are "hard-wired" to specific modelling frameworks and generic data analysis or plotting packages. The package aims to facilitate reproducibility and reliability of scenario processing, validation and analysis by providing well-tested and documented methods for timeseries aggregation, downscaling and unit conversion. It supports various data formats, including sub-annual resolution using continuous time representation and "representative timeslices". The code base is implemented following best practices of collaborative scientific-software development. This manuscript describes the design principles of the package and the types of data which can be handled. The usefulness of pyam is illustrated by highlighting several recent applications.

Download Full-text

The ultimate database to (re)set the evolutionary history of primate genital bones

Scientific Reports ◽

10.1038/s41598-021-90787-2 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Federica Spani ◽

Maria Pia Morigi ◽

Matteo Bettuzzi ◽

Massimiliano Scalici ◽

Gabriele Gentile ◽

...

Keyword(s):

Methodological Approach ◽

Ancestral State ◽

Micro Computed Tomography ◽

Binary Matrix ◽

Occurrence Data ◽

Generic Data ◽

Primary Literature ◽

History Of ◽

Meta Analyses ◽

First Time

AbstractScientific literature concerning genital bones in primates consists of both ancient works (dating back to the nineteenth century) and more recent revisions/meta-analyses, which, however, are not always so detailed or exhaustive. Based on a thorough analysis, several conflicting data, inaccurate references, and questionable claims have emerged. We generated a binary matrix of genital bone occurrence data, considering only data at the species level, based on (1) a rigorous literature search protocol, (2) raw data (collected exclusively from primary literature), (3) an updated taxonomy (often tracing back to the species taxonomic history) and (4) new occurrence data from scanned genitals of fresh and museum specimens (using micro-computed tomography-micro-CT). Thanks to this methodological approach, we almost doubled available occurrence data so far, avoiding any arbitrary extension of generic data to conspecific species. This practice, in fact, has been recently responsible for an overestimation of the occurrence data, definitively flattening the interspecific variability. We performed the ancestral state reconstruction analysis of genital bone occurrence and results were mapped onto the most updated phylogeny of primates. As for baculum, we definitively demonstrated its simplesiomorphy for the entire order. As for baubellum, we interpreted all scattered absences as losses, actually proposing (for the first time) a simplesiomorphic state for the clitoral bone as well. The occurrence data obtained, while indirectly confirming the baculum/baubellum homology (i.e., for each baubellum a baculum was invariably present), could also directly demonstrate an intra-specific variability affecting ossa genitalia occurrence. With our results, we established a radically improved and updated database about the occurrence of genital bones in primates, available for further comparative analyses.

Download Full-text

Conceptual modeling of prosopographic databases integrating quality dimensions

Journal of Data Mining & Digital Humanities ◽

10.46298/jdmdh.5078 ◽

2021 ◽

Vol Special Issue on Data Science... ◽

Author(s):

Jacky Akoka ◽

Isabelle Comyn-Wattiau ◽

Stéphane LamassÉ ◽

Cédric Du Mouza

Keyword(s):

Data Model ◽

Large Scale ◽

Social Groups ◽

Conceptual Modeling ◽

Generic Data ◽

International Audience ◽

Quality Dimensions ◽

Generic Data Model ◽

Stored Information

International audience Prosopographic databases, which allow the study of social groups through their bibliography, are used today by a significant number of historians. Computerization has allowed intensive and large-scale exploitation of these databases. The modeling of these proposopographic databases has given rise to several data models. An important problem is to ensure a level of quality of the stored information. In this article , we propose a generic data model allowing to describe most of the existing prosopographic databases and to enrich them by integrating several quality concepts such as uncertainty, reliability, accuracy or completeness.

Download Full-text

Key Learnings During the Development of a Generic Data Collection Tool to Support Assessment of Freedom of Infection in Cattle Herds

Frontiers in Veterinary Science ◽

10.3389/fvets.2021.656336 ◽

2021 ◽

Vol 8 ◽

Author(s):

Annika M. van Roon ◽

Egle Rapaliute ◽

Xhelil Koleci ◽

Violeta Muñoz ◽

Mathilde Mercat ◽

...

Keyword(s):

Data Collection ◽

Infectious Diseases ◽

Disease Control ◽

Data Availability ◽

European Countries ◽

Data Collection Tool ◽

Modeling Framework ◽

Cattle Diseases ◽

Generic Data ◽

Control Programmes

Various European Member States have implemented control or eradication programmes for endemic infectious diseases in cattle. The design of these programmes varies between countries and therefore comparison of the outputs of different control programmes is complex. Although output-based methods to estimate the confidence of freedom resulting from these programmes are under development, as yet there is no practical modeling framework applicable to a variety of infectious diseases. Therefore, a data collection tool was developed to evaluate data availability and quality and to collect actual input data required for such a modeling framework. The aim of the current paper is to present the key learnings from the process of the development of this data collection tool. The data collection tool was developed by experts from two international projects: STOC free (Surveillance Tool for Outcome-based Comparison of FREEdom from infection, www.stocfree.eu) and SOUND control (Standardizing OUtput-based surveillance to control Non-regulated Diseases of cattle in the EU, www.sound-control.eu). Initially a data collection tool was developed for assessment of freedom of bovine viral diarrhea virus in six Western European countries. This tool was then further generalized to enable inclusion of data for other cattle diseases i.e., infectious bovine rhinotracheitis and Johne's disease. Subsequently, the tool was pilot-tested by a Western and Eastern European country, discussed with animal health experts from 32 different European countries and further developed for use throughout Europe. The developed online data collection tool includes a wide range of variables that could reasonably influence confidence of freedom, including those relating to cattle demographics, risk factors for introduction and characteristics of disease control programmes. Our results highlight the fact that data requirements for different cattle diseases can be generalized and easily included in a data collection tool. However, there are large differences in data availability and comparability across European countries, presenting challenges to the development of a standardized data collection tool and modeling framework. These key learnings are important for development of any generic data collection tool for animal disease control purposes. Further, the results can facilitate development of output-based modeling frameworks that aim to calculate confidence of freedom from disease.

Download Full-text

Development of data-to-text (D2T) on generic data using fuzzy sets

International Journal of Advanced Technology and Engineering Exploration ◽

10.19101/ijatee.2020.762134 ◽

2021 ◽

Vol 8 (75) ◽

pp. 382-390

Author(s):

Lala Septem Riza ◽

Muhammad Ridwan ◽

Enjun Junaeti ◽

Khyrina Airin Fariza Abu Samah

Keyword(s):

Fuzzy Sets ◽

Generic Data

Download Full-text

PolyDAT: A Generic Data Schema for Polymer Characterization

Journal of Chemical Information and Modeling ◽

10.1021/acs.jcim.1c00028 ◽

2021 ◽

Author(s):

Tzyy-Shyang Lin ◽

Nathan J. Rebello ◽

Haley K. Beech ◽

Zi Wang ◽

Bassil El-Zaatari ◽

...

Keyword(s):

Polymer Characterization ◽

Data Schema ◽

Generic Data

Download Full-text

generic data
Recently Published Documents

TOTAL DOCUMENTS

H-INDEX

Bayesian prior distribution based on generic data and experts’ opinion: A case study in the O&G industry

Auto-encoded Latent Representations of White Matter Streamlines for Quantitative Distance Analysis

pyam: Analysis and visualisation of integrated assessment and macro-energy scenarios

An Umbrella Converse for Data Exchange: Applied to Caching, Computing, and Shuffling

pyam: Analysis and visualisation of integrated assessment and macro-energy scenarios

The ultimate database to (re)set the evolutionary history of primate genital bones

Conceptual modeling of prosopographic databases integrating quality dimensions

Key Learnings During the Development of a Generic Data Collection Tool to Support Assessment of Freedom of Infection in Cattle Herds

Development of data-to-text (D2T) on generic data using fuzzy sets

PolyDAT: A Generic Data Schema for Polymer Characterization

Export Citation Format

generic dataRecently Published Documents

TOTAL DOCUMENTS

H-INDEX

Bayesian prior distribution based on generic data and experts’ opinion: A case study in the O&G industry

Auto-encoded Latent Representations of White Matter Streamlines for Quantitative Distance Analysis

pyam: Analysis and visualisation of integrated assessment and macro-energy scenarios

An Umbrella Converse for Data Exchange: Applied to Caching, Computing, and Shuffling

pyam: Analysis and visualisation of integrated assessment and macro-energy scenarios

The ultimate database to (re)set the evolutionary history of primate genital bones

Conceptual modeling of prosopographic databases integrating quality dimensions

Key Learnings During the Development of a Generic Data Collection Tool to Support Assessment of Freedom of Infection in Cattle Herds

Development of data-to-text (D2T) on generic data using fuzzy sets

PolyDAT: A Generic Data Schema for Polymer Characterization

generic data
Recently Published Documents