Linking big biomedical datasets to modular analysis with Portable Encapsulated Projects

2020 ◽  
Author(s):  
Nathan C. Sheffield ◽  
Michał Stolarczyk ◽  
Vincent P. Reuter ◽  
André F. Rendeiro

Organizing and annotating biological sample data is critical in data-intensive bioinformatics. Unfortunately, incompatibility is common between metadata format of a data source and that required by a processing tool. There is no broadly accepted standard to organize metadata across biological projects and bioinformatics tools, restricting the portability and reusability of both annotated datasets and analysis software. To address this, we present Portable Encapsulated Projects (PEP), a formal specification for biological sample metadata structure. The PEP specification accommodates typical features of data-intensive bioinformatics projects with many samples, whether from individual experiments, organisms, or single cells. In addition to standardization, the PEP specification provides descriptors and modifiers for different organizational layers of a project, which improve portability among computing environments and facilitate use of different processing tools. PEP includes a schema validator framework, allowing formal definition of required metadata attributes for any type of biomedical data analysis. We have implemented packages for reading PEPs in both Python and R to provide a language-agnostic interface for organizing project metadata. PEP therefore presents an important step toward unifying data annotation and processing tools in data-intensive biological research projects.

GigaScience ◽  
2021 ◽  
Vol 10 (12) ◽  
Author(s):  
Nathan C Sheffield ◽  
Michał Stolarczyk ◽  
Vincent P Reuter ◽  
André F Rendeiro

Abstract Background Organizing and annotating biological sample data is critical in data-intensive bioinformatics. Unfortunately, metadata formats from a data provider are often incompatible with requirements of a processing tool. There is no broadly accepted standard to organize metadata across biological projects and bioinformatics tools, restricting the portability and reusability of both annotated datasets and analysis software. Results To address this, we present the Portable Encapsulated Project (PEP) specification, a formal specification for biological sample metadata structure. The PEP specification accommodates typical features of data-intensive bioinformatics projects with many biological samples. In addition to standardization, the PEP specification provides descriptors and modifiers for project-level and sample-level metadata, which improve portability across both computing environments and data processing tools. PEPs include a schema validator framework, allowing formal definition of required metadata attributes for data analysis broadly. We have implemented packages for reading PEPs in both Python and R to provide a language-agnostic interface for organizing project metadata. Conclusions The PEP specification is an important step toward unifying data annotation and processing tools in data-intensive biological research projects. Links to tools and documentation are available at http://pep.databio.org/.
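The idea of formally defining required metadata attributes can be illustrated with a minimal sketch. It does not use the PEP packages or the PEP schema format: the attribute names, the CSV layout, and the `missing_attributes` helper are all hypothetical, chosen only to show what checking a sample table against a list of required attributes might look like.

```python
import csv
import io

# Hypothetical required attributes; a real schema would define its own.
REQUIRED_ATTRIBUTES = ("sample_name", "protocol", "file")


def missing_attributes(sample_table_csv, required=REQUIRED_ATTRIBUTES):
    """Return {row_number: [missing attribute names]} for a CSV sample table.

    Rows are numbered from 1, not counting the header line. An attribute
    counts as missing if its column is absent or its value is blank.
    """
    problems = {}
    reader = csv.DictReader(io.StringIO(sample_table_csv))
    for row_number, row in enumerate(reader, start=1):
        missing = [name for name in required if not (row.get(name) or "").strip()]
        if missing:
            problems[row_number] = missing
    return problems


# Illustrative sample table (made-up samples and paths).
EXAMPLE_TABLE = """sample_name,protocol,file
frog_1,RNA-seq,data/frog_1.fastq.gz
frog_2,,data/frog_2.fastq.gz
"""

print(missing_attributes(EXAMPLE_TABLE))
```

An empty result means every sample carries all required attributes, so a tool could run this kind of check before starting any processing.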


1997 ◽  
Vol 3 (S2) ◽  
pp. 1081-1082
Author(s):  
I. Angert ◽  
W. Jahn ◽  
K.C. Holmes ◽  
R.R. Schröder

Understanding the contrast formation mechanism in the EM is one of the prerequisites for artefact-free reconstruction of biological structures from images. We found that the normally used correction of contrast formation applied to zero energy loss filtered images corrupted spatial resolution. Therefore the contribution of contrast formed by inelastic electrons was reconsidered, including partial coherence of inelastically scattered electrons and lens aberrations of the microscope. Based on this, a complete description of the zero-loss contrast transfer function (CTF) is now possible. We used tobacco mosaic virus (TMV), a biological sample known at atomic resolution, for definition of optimum CTF-parameters to reconstruct defocus series from an EFTEM LEO 912. CTF theory as known so far describes image contrast in the weak phase approximation as a linear sum of amplitude and phase contrast. The contribution of amplitude contrast (ratio of amplitude to phase contrast A/P) was determined to be between 5% and 7% for unfiltered images and 12-14% for zero-loss filtered images. However, in a filter microscope we remove electrons from the image, so we expect a higher amplitude contrast than in non-filtered images.


1999 ◽  
Vol 17 (2) ◽  
pp. 131-133 ◽  
Author(s):  
Dina Ralt

There have been a variety of Western explanations for the Qi of traditional Chinese medicine, but all have essentially had to compromise between expression of energy, matter and flow. The author suggests that a non-linear, fractal approach, similar to that of Chaos theory, offers a tool to understand Qi; the yin-yang and five phases theories of Chinese philosophy can be regarded as fractals. Qi, as the “net of life”, can also be looked on as an information network with close parallels to the computer-based web of the internet. This article therefore suggests a new Western definition of Qi, proposing that: “The Qi of Chinese medicine is inter-cellular information communicated within the body: information which enables all bodily functions and is a key component in regulation”. Referring to Qi as information offers the chance to integrate Chinese medical philosophy with current biological research on cellular communication.


2015 ◽  
Vol 2015 ◽  
pp. 1-8 ◽  
Author(s):  
Andreas Friedrich ◽  
Erhan Kenar ◽  
Oliver Kohlbacher ◽  
Sven Nahnsen

Big data bioinformatics aims at drawing biological conclusions from huge and complex biological datasets. Added value from the analysis of big data, however, is only possible if the data is accompanied by accurate metadata annotation. Particularly in high-throughput experiments, intelligent approaches are needed to keep track of the experimental design, including the conditions that are studied as well as information that might be interesting for failure analysis or further experiments in the future. In addition to the management of this information, means for an integrated design and interfaces for structured data annotation are urgently needed by researchers. Here, we propose a factor-based experimental design approach that enables scientists to easily create large-scale experiments with the help of a web-based system. We present a novel implementation of a web-based interface allowing the collection of arbitrary metadata. To exchange and edit information we provide a spreadsheet-based, human-readable format. Subsequently, sample sheets with identifiers and metainformation for data generation facilities can be created. Data files created after measurement of the samples can be uploaded to a datastore, where they are automatically linked to the previously created experimental design model.
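A factor-based design can be pictured as expanding a set of experimental factors into one sample record per combination of levels. The sketch below assumes a full-factorial layout with a fixed number of replicates; the factor names, the identifier format, and the `full_factorial` helper are hypothetical and are not taken from the system the authors describe.

```python
from itertools import product


def full_factorial(factors, replicates=1):
    """Yield one sample record per combination of factor levels and replicate.

    `factors` maps a factor name to its list of levels. Sample identifiers
    are assigned sequentially (S001, S002, ...), a made-up convention.
    """
    names = list(factors)
    counter = 0
    for levels in product(*(factors[name] for name in names)):
        for replicate in range(1, replicates + 1):
            counter += 1
            record = {"sample_id": f"S{counter:03d}", "replicate": replicate}
            record.update(zip(names, levels))
            yield record


# Illustrative factors: 2 genotypes x 2 treatments x 2 replicates.
EXAMPLE_FACTORS = {
    "genotype": ["wild_type", "knockout"],
    "treatment": ["control", "drug"],
}

samples = list(full_factorial(EXAMPLE_FACTORS, replicates=2))
for sample in samples:
    print(sample)
```

Each record could then be written out as one row of a sample sheet, with the identifier serving as the key that later links measured data files back to the design.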


2019 ◽  
Vol 7 (4) ◽  
pp. 208-213 ◽  
Author(s):  
Fabian V. Filipp

Abstract Purpose of Review We critically evaluate the future potential of machine learning (ML), deep learning (DL), and artificial intelligence (AI) in precision medicine. The goal of this work is to show progress in ML in digital health, to exemplify future needs and trends, and to identify any essential prerequisites of AI and ML for precision health. Recent Findings High-throughput technologies are delivering growing volumes of biomedical data, such as large-scale genome-wide sequencing assays; libraries of medical images; or drug perturbation screens of healthy, developing, and diseased tissue. Multi-omics data in biomedicine is deep and complex, offering an opportunity for data-driven insights and automated disease classification. Learning from these data will open our understanding and definition of healthy baselines and disease signatures. State-of-the-art applications of deep neural networks include digital image recognition, single-cell clustering, and virtual drug screens, demonstrating the breadth and power of ML in biomedicine. Summary Significantly, AI and systems biology have embraced big data challenges and may enable novel biotechnology-derived therapies to facilitate the implementation of precision medicine approaches.


2016 ◽  
Vol 25 (01) ◽  
pp. 178-183 ◽  
Author(s):  
M. Barros ◽  
F.M. Couto

Summary Introduction: Biomedical research is increasingly becoming a data-intensive science in several areas, where prodigious amounts of data are being generated that have to be stored, integrated, shared and analyzed. In an effort to improve the accessibility of data and knowledge, the Linked Data initiative proposed a well-defined set of recommendations for exposing, sharing and integrating data, information and knowledge, using semantic web technologies. Objective: The main goal of this paper is to identify the current status and future trends of knowledge representation and management in Life and Health Sciences, mostly with regard to linked data technologies. Methods: We selected three prominent linked data studies, namely Bio2RDF, Open PHACTS and EBI RDF platform, and selected 14 studies published after 2014 (inclusive) that cited any of the three studies. We manually analyzed these 14 papers in relation to how they use linked data techniques. Results: The analyses show a tendency to use linked data techniques in Life and Health Sciences, and even if some studies do not follow all of the recommendations, many of them already represent and manage their knowledge using RDF and biomedical ontologies. Conclusion: These insights from RDF and biomedical ontologies are having a strong impact on how knowledge is generated from biomedical data, by making data elements increasingly connected and by providing a better description of their semantics. As health institutes become more data centric, we believe that the adoption of linked data techniques will continue to grow and be an effective solution to knowledge representation and management.


Yeast ◽  
2000 ◽  
Vol 1 (3) ◽  
pp. 211-217 ◽  
Author(s):  
Gerard Brady

Increasingly, mRNA expression patterns established using a variety of molecular technologies such as cDNA microarrays, SAGE and cDNA display are being used to identify potential regulatory genes and as a means of providing valuable insights into the biological status of the starting sample. Until recently, the application of these techniques has been limited to mRNA isolated from millions or, at very best, several thousand cells, thereby restricting the study of small samples and complex tissues. To overcome this limitation, a variety of amplification approaches have been developed which are capable of broadly evaluating mRNA expression patterns in single cells. This review will describe approaches that have been employed to examine global gene expression patterns either in small numbers of cells or, wherever possible, in actual isolated single cells. The first half of the review will summarize the technical aspects of methods developed for single-cell analysis and the latter half of the review will describe the areas of biological research that have benefited from single-cell expression analysis.


1957 ◽  
Vol 61 (563) ◽  
pp. 727-755 ◽  
Author(s):  
E. W. Still

Summary The general requirements for the complete air conditioning of aircraft are discussed in the light of the complete system concept. The author takes into consideration safety, differential pressure, weight saving, power and air supply, passenger comfort, cooling and humidity. Particular systems are then described and there is a section on the test equipment required for the laboratory testing of air conditioning equipment. Cooling systems are taken first and divided into the air cycle system embodying bootstrap, turbine fan, and regenerative applications of cold air units–the vapour cycle system employing a boiling tank and that using proprietary refrigerants; properties of liquid refrigerants are discussed. Regulation of cabin temperature, air flow, humidity, pressure and oxygen is done by control systems and the equipment used is described. Four appendices give (1) suggested detailed requirements for air conditioning equipment and user requirements, (2) sample data and calculations for air conditioning a 100-seater civil transport, (3) some notes on the definition of refrigeration terms and (4) data on pressure losses in aircraft ducts.


2017 ◽  
Vol 7 (3) ◽  
pp. 20160098 ◽  
Author(s):  
Anthony Trewavas

Intelligence is defined for wild plants and its role in fitness identified. Intelligent behaviour exhibited by single cells and systems similarity between the interactome and connectome indicates neural systems are not necessary for intelligent capabilities. Plants sense and respond to many environmental signals that are assessed to competitively optimize acquisition of patchily distributed resources. Situations of choice engender motivational states in goal-directed plant behaviour; consequent intelligent decisions enable efficient gain of energy over expenditure. Comparison of swarm intelligence and plant behaviour indicates the origins of plant intelligence lie in complex communication and is exemplified by cambial control of branch function. Error correction in behaviours indicates both awareness and intention as does the ability to count to five. Volatile organic compounds are used as signals in numerous plant interactions. Being complex in composition and often species and individual specific, they may represent the plant language and account for self and alien recognition between individual plants. Game theory has been used to understand competitive and cooperative interactions between plants and microbes. Some unexpected cooperative behaviour between individuals and potential aliens has emerged. Behaviour profiting from experience, another simple definition of intelligence, requires both learning and memory and is indicated in the priming of herbivory, disease and abiotic stresses.

