Construction and accessibility of a cross-species phenotype ontology along with gene annotations for biomedical research

F1000Research ◽  
2013 ◽  
Vol 2 ◽  
pp. 30 ◽  
Author(s):  
Sebastian Köhler ◽  
Sandra C Doelken ◽  
Barbara J Ruef ◽  
Sebastian Bauer ◽  
Nicole Washington ◽  
...  

Phenotype analyses, e.g. investigating metabolic processes, tissue formation, or organism behavior, are an important element of most biological and medical research activities. Biomedical researchers are making increased use of ontological standards and methods to capture the results of such analyses, with one focus being the comparison and analysis of phenotype information between species. We have generated a cross-species phenotype ontology for human, mouse and zebrafish that contains zebrafish phenotypes. We also provide up-to-date annotation data connecting human genes to phenotype classes from the generated ontology. We have included the data generation pipeline into our continuous integration system ensuring stable and up-to-date releases. This article describes the data generation process and is intended to help interested researchers access both the phenotype annotation data and the associated cross-species phenotype ontology. The resource described here can be used in sophisticated semantic similarity and gene set enrichment analyses for phenotype data across species. The stable releases of this resource can be obtained from http://purl.obolibrary.org/obo/hp/uberpheno/.

F1000Research ◽  
2014 ◽  
Vol 2 ◽  
pp. 30 ◽  
Author(s):  
Sebastian Köhler ◽  
Sandra C Doelken ◽  
Barbara J Ruef ◽  
Sebastian Bauer ◽  
Nicole Washington ◽  
...  

Phenotype analyses, e.g. investigating metabolic processes, tissue formation, or organism behavior, are an important element of most biological and medical research activities. Biomedical researchers are making increased use of ontological standards and methods to capture the results of such analyses, with one focus being the comparison and analysis of phenotype information between species. We have generated a cross-species phenotype ontology for human, mouse and zebrafish that contains classes from the Human Phenotype Ontology, Mammalian Phenotype Ontology, and generated classes for zebrafish phenotypes. We also provide up-to-date annotation data connecting human genes to phenotype classes from the generated ontology. We have included the data generation pipeline into our continuous integration system ensuring stable and up-to-date releases. This article describes the data generation process and is intended to help interested researchers access both the phenotype annotation data and the associated cross-species phenotype ontology. The resource described here can be used in sophisticated semantic similarity and gene set enrichment analyses for phenotype data across species. The stable releases of this resource can be obtained from http://purl.obolibrary.org/obo/hp/uberpheno/.


2021 ◽  
Vol 15 (6) ◽  
pp. 1-22
Author(s):  
Yashen Wang ◽  
Huanhuan Zhang ◽  
Zhirun Liu ◽  
Qiang Zhou

Many semantic-driven methods have been proposed for guiding natural language generation. While clearly improving the performance of the end-to-end training task, these existing methods still have clear limitations: for example, (i) they only utilize shallow semantic signals (e.g., from topic models) with only a single stochastic hidden layer in their data generation process, which suffers easily from noise (especially when applied to short texts) and lacks interpretability; (ii) they ignore sentence order and document context, as they treat each document as a bag of sentences, and fail to capture the long-distance dependencies and global semantic meaning of a document. To overcome these problems, we propose a novel semantic-driven language modeling framework that simultaneously learns a Hierarchical Language Model and a Recurrent Conceptualization-enhanced Gamma Belief Network. For scalable inference, we develop the auto-encoding Variational Recurrent Inference, allowing efficient end-to-end training while simultaneously capturing global semantics from a text corpus. In particular, this article introduces concept information derived from the high-quality lexical knowledge graph Probase, which lends strong interpretability and anti-noise capability to the proposed model. Moreover, the proposed model captures not only intra-sentence word dependencies, but also temporal transitions between sentences and inter-sentence concept dependence. Experiments conducted on several NLP tasks validate the superiority of the proposed approach, which can effectively infer the meaningful hierarchical concept structure of a document and the hierarchical multi-scale structures of sequences, even compared with the latest state-of-the-art Transformer-based models.


Sensors ◽  
2021 ◽  
Vol 21 (6) ◽  
pp. 2144
Author(s):  
Stefan Reitmann ◽  
Lorenzo Neumann ◽  
Bernhard Jung

Common Machine-Learning (ML) approaches for scene classification require a large amount of training data. However, for classification of depth sensor data, in contrast to image data, relatively few databases are publicly available, and manual generation of semantically labeled 3D point clouds is an even more time-consuming task. To simplify the training data generation process for a wide range of domains, we have developed the BLAINDER add-on package for the open-source 3D modeling software Blender, which enables a largely automated generation of semantically annotated point-cloud data in virtual 3D environments. In this paper, we focus on the classical depth-sensing techniques Light Detection and Ranging (LiDAR) and Sound Navigation and Ranging (Sonar). Within the BLAINDER add-on, different depth sensors can be loaded from presets, customized sensors can be implemented, and different environmental conditions (e.g., influence of rain, dust) can be simulated. The semantically labeled data can be exported to various 2D and 3D formats and are thus optimized for different ML applications and visualizations. In addition, semantically labeled images can be exported using the rendering functionalities of Blender.


2011 ◽  
Vol 12 (1) ◽  
pp. 32 ◽  
Author(s):  
Gary Schindelman ◽  
Jolene S Fernandes ◽  
Carol A Bastiani ◽  
Karen Yook ◽  
Paul W Sternberg

2019 ◽  
Vol 5 (2) ◽  
pp. 76-82
Author(s):  
Cornelius Mellino Sarungu ◽  
Liliana Liliana

Project management practice uses many tools to support the recording and tracking of data generated throughout a project. Project analytics provides deeper insights for use in decision making. To conduct project analytics, one should explore the tools and techniques required. The most common tool is Microsoft Excel. Its simplicity and flexibility allow project managers and project team members to use it for almost any kind of activity. We combine MS Excel with R Studio to bring data analytics into the project management process. While data input is still done in the way the project manager is already familiar with, the analytic engine can extract data from it and create visualizations of the needed parameters in a single output report file. This approach delivers a low-cost project analytics solution for the organization. It can be implemented with relatively low-cost technology, some of which is free, while keeping the data generation process simple. This solution can also be proposed to raise project management process maturity to the next stage, such as CMMI level 4, which promotes project analytics. Index Terms: project management, project analytics, data analytics.


2017 ◽  
Vol 20 (3) ◽  
pp. 446-462 ◽  
Author(s):  
Taran Thune ◽  
Magnus Gulbrandsen

Purpose: The purpose of this paper is to investigate how a combination of diverse sources of knowledge is important for generation of new ideas and address how institutional infrastructures and practices support integration of knowledge across organizations in medicine and life sciences. Design/methodology/approach: The paper investigates new product ideas that emerge from hospital and university employees, and looks at the extent of interaction between clinical and scientific environments in the idea generation process. The paper utilizes data about all new product ideas within life science that were reported in South-Eastern Norway in 2009-2011, as well as information about the individuals and teams that had been involved in disclosing these ideas. Interviews with inventors have also been carried out. Findings: Interaction and integration across scientific and clinical domains are common and important for generating new product ideas. More than half of the disclosed life science ideas in the database come from groups representing multiple institutions with both scientific and clinical units or from individuals with multiple institutional affiliations. The interviews indicate that the infrastructure for cross-domain interaction is well-developed, particularly for research activities, which has a positive effect on invention. Originality/value: The paper uses an original data set of invention disclosures and investigates the hospital-science interface, which is a novel setting for studies of inventive activities.


2020 ◽  
pp. 002234332096215
Author(s):  
Sophia Dawkins

This article examines what scholars can learn about civilian killings from newswire data in situations of non-random missingness. It contributes to this understanding by offering a unique view of the data-generation process in the South Sudanese civil war. Drawing on 40 hours of interviews with 32 human rights advocates, humanitarian workers, and journalists who produce ACLED and UCDP-GED’s source data, the article illustrates how non-random missingness leads to biases of inconsistent magnitude and direction. The article finds that newswire data for contexts like South Sudan suffer from a self-fulfilling narrative bias, where journalists select stories and human rights investigators target incidents that conform to international views of what a conflict is about. This is compounded by the way agencies allocate resources to monitor specific locations and types of violence to fit strategic priorities. These biases have two implications: first, in the most volatile conflicts, point estimates about violence using newswire data may be impossible, and most claims of precision may be false; secondly, body counts reveal little if divorced from circumstance. The article presents a challenge to political methodologists by asking whether social scientists can build better cross-national fatality measures given the biases inherent in the data-generation process.


2019 ◽  
Vol 18 ◽  
pp. 160940691881623 ◽  
Author(s):  
Gillian M. Martin

Qualitative research with children as participants is challenging on many levels—ethical, methodological, and relational. When researching the experience of children with particular bodily vulnerabilities, these issues are further amplified. This article describes a data generating tool designed to address these challenges. It was used within the context of an ethnographic study exploring relational societal processes associated with childhood obesity in Malta. This creative child-centric method uses “me” drawings as elicitation foci during informal conversations in the field where the agentic status of the child was prioritized and their role as active collaborators emphasized. Optimizing ethical symmetry was a key concern, as was emphasis on relational ethics and assent. Using the “Draw(Me) and Tell” activity positioned the child in a realistic position of power by giving them control over the data generation process, and helped address ethical issues related to agency, privacy, and sensitivity. It allowed ethical generation of qualitative data based on the children’s reflexive commentary on their own body shapes, with the aim of exploring their embodied habitus, identity, and selfhood.


2020 ◽  
Vol 19 ◽  
pp. 160940692091367
Author(s):  
Bethan Pell ◽  
Denitza Williams ◽  
Rhiannon Phillips ◽  
Julia Sanders ◽  
Adrian Edwards ◽  
...  

Visual timeline methods have been used as part of face-to-face qualitative interviewing with vulnerable populations to uncover the intricacies of lived experiences, but little is known about whether visual timelines can be effectively used in telephone interviews. In this article, we reflect on the process of using visual timelines in 16 telephone interviews with women as part of the “STarting a family when you have an Autoimmune Rheumatic disease” study (STAR Family Study). The visual timeline method was used to empower women to organize and share their narratives about the sensitive and complex topic of starting a family. We conducted a thematic analysis of the audio-recorded interview data, using researchers’ field notes and reflections to provide context for our understanding of the benefits of using timelines and to understand the process of using visual timelines during telephone interviews. Resource packs were sent to women before study participation; 11 of the 16 women completed a version of the timeline activity. Six themes were identified in the methodological data analysis: (1) use and adaptation of the timeline tool, (2) timeline exchange, (3) framing the interview: emphasizing that women are in control, (4) jumping straight in, (5) taking a lead, and (6) disclosing personal and sensitive experiences. The use of visual timelines facilitated interviewee control and elicited rich narratives of participants’ experiences in telephone interviews. Women created their visual timelines autonomously and retained ownership of their timeline data; these features of the data generation process need to be considered when using visual timelines in telephone rather than face-to-face interviews. Use of visual methods within telephone interviews is feasible, can generate rich data, and should be further explored in a wider range of settings.


2017 ◽  
Author(s):  
Stefan Hunziker ◽  
Stefan Brönnimann ◽  
Juan Marcos Calle ◽  
Isabel Moreno ◽  
Marcos Andrade ◽  
...  

Abstract. Systematic data quality issues may occur at various stages of the data generation process. They may affect large fractions of observational datasets and remain largely undetected with standard data quality control. This study investigates the effects of such undetected data quality issues on the results of climatological analyses. For this purpose, we quality controlled daily observations of manned weather stations from the Central Andean area with a standard and an enhanced approach. The climate variables analysed are minimum and maximum temperature, and precipitation. About 40 % of the observations are inappropriate for the calculation of monthly temperature means and precipitation sums due to data quality issues. These quality problems undetected with the standard quality control method strongly affect climatological analyses, since they reduce the correlation coefficients of station pairs, deteriorate the performance of data homogenization methods, increase the spread of individual station trends, and significantly bias regional temperature trends. Our findings indicate that undetected data quality issues are included in important and frequently used observational datasets, and hence may affect a high number of climatological studies. It is of utmost importance to apply comprehensive and adequate data quality control approaches on manned weather station records in order to avoid biased results and large uncertainties.

