biological data analysis Latest Research Papers

Clipper: p-value-free FDR control on high-throughput data from two conditions

Abstract High-throughput biological data analysis commonly involves identifying features, such as genes, genomic regions, and proteins, whose values differ between two conditions, from numerous features measured simultaneously. The most widely used criterion for ensuring the reliability of the analysis is the false discovery rate (FDR), which is primarily controlled based on p-values. However, obtaining valid p-values relies on either reasonable assumptions about the data distribution or large numbers of replicates under both conditions. Clipper is a general statistical framework for FDR control that does not rely on p-values or specific data distributions. Clipper outperforms existing methods for a broad range of applications in high-throughput data analysis.
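For context, the p-value-based FDR control that the abstract contrasts Clipper with is commonly done with the Benjamini-Hochberg step-up procedure. The sketch below shows that conventional procedure only; it is not Clipper's method, and the function name and example p-values are illustrative assumptions.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return the sorted indices of features declared discoveries at FDR level alpha."""
    m = len(p_values)
    if m == 0:
        return []
    # Sort feature indices by ascending p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Step-up rule: find the largest 1-based rank k with p_(k) <= k * alpha / m.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / m:
            cutoff_rank = rank
    # Reject the hypotheses with the cutoff_rank smallest p-values.
    return sorted(order[:cutoff_rank])


if __name__ == "__main__":
    # Made-up p-values for eight features (illustrative only).
    example_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
    print(benjamini_hochberg(example_p, alpha=0.05))
```

The procedure is only as trustworthy as the p-values fed into it, which is the limitation the abstract points to when replicates are few or distributional assumptions are doubtful.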

easyfm: An easy software suite for file manipulation of Next Generation Sequencing data on desktops

10.1101/2021.09.29.462291 ◽

2021 ◽

Author(s):

Hyungtaek Jung ◽

Brendan Jeon ◽

Daniel Ortiz-Barrientos

Keyword(s):

Next Generation Sequencing ◽

High Performance ◽

Biological Data ◽

Next Generation Sequencing Data ◽

Next Generation ◽

Sequencing Data ◽

File Formats ◽

Biological Data Analysis ◽

Next Generation Sequencing Ngs ◽

Generation Sequencing

Storing and manipulating Next Generation Sequencing (NGS) file formats is an essential but difficult task in biological data analysis. The easyfm (easy file manipulation) toolkit (https://github.com/TaekAndBrendan/easyfm) makes manipulating commonly used NGS files more accessible to biologists. It enables them to perform end-to-end reproducible data analyses using a free standalone desktop application (available on Windows, Mac and Linux). Unlike existing tools (e.g. Galaxy), the Graphical User Interface (GUI)-based easyfm does not depend on any high-performance computing (HPC) system and can be operated without an internet connection. This benefit allows easyfm to seamlessly integrate visual and interactive representations of NGS files, supporting a wider scope of bioinformatics applications in the life sciences.

A Primer in Biological Data Analysis and Visualization Using R

10.7312/hart20212 ◽

2021 ◽

Author(s):

Gregg Hartvigsen

Keyword(s):

Data Analysis ◽

Biological Data ◽

Biological Data Analysis

Practical R for biologists: an introduction

10.1079/9781789245349.0000 ◽

2021 ◽

Keyword(s):

Statistical Tests ◽

Statistical Modelling ◽

Biological Data ◽

Early Years ◽

Main Text ◽

Biological Data Analysis ◽

Base Functions ◽

Almost All ◽

Selection Of ◽

R Functions

Abstract R is an open-source statistical environment modelled after the previously widely used commercial programs S and S-Plus; in addition to powerful statistical analysis tools, it provides powerful graphics outputs. Beyond its statistical and graphical capabilities, R is a programming language suitable for medium-sized projects. This book presents a set of studies that collectively represent almost all the R operations that beginners need when analysing their own data, up to perhaps the early years of doing a PhD. Although the chapters are organized around topics such as graphing, classical statistical tests, statistical modelling, mapping and text parsing, the examples have been chosen largely from real scientific studies at the appropriate level, and each example nearly always covers more R functions than are strictly necessary to get a p-value or a graph. R comes with around a thousand base functions, which are automatically installed when R is downloaded. This book covers the use of those most relevant to biological data analysis, modelling and graphics. Throughout each chapter, the functions introduced and used in that chapter are summarized in Tool Boxes. The book also shows the user how to adapt and write their own code and functions. A selection of base functions relevant to graphics that are not necessarily covered in the main text is described in Appendix 1, and additional housekeeping functions in Appendix 2.

Synthetic data generation with probabilistic Bayesian Networks

Mathematical Biosciences and Engineering ◽

10.3934/mbe.2021426 ◽

2021 ◽

Vol 18 (6) ◽

pp. 8603-8621

Author(s):

Grigoriy Gogoshin ◽

Sergio Branciamore ◽

Andrei S. Rodin

Keyword(s):

Synthetic Data ◽

Network Models ◽

Biological Data ◽

Data Generation ◽

Computational Systems Biology ◽

Probabilistic Simulation ◽

Synthetic Data Generation ◽

Biological Data Analysis ◽

Benchmark Datasets ◽

Computational Systems

Abstract Bayesian Network (BN) modeling is a prominent and increasingly popular computational systems biology method. It aims to construct network graphs from large heterogeneous biological datasets that reflect the underlying biological relationships. Currently, a variety of strategies exist for evaluating BN methodology performance, ranging from artificial benchmark datasets and models, to specialized biological benchmark datasets, to simulation studies that generate synthetic data from predefined network models. The last is arguably the most comprehensive approach; however, existing implementations often rely on explicit and implicit assumptions that may be unrealistic in a typical biological data analysis scenario, or are poorly equipped for automated arbitrary model generation. In this study, we develop a purely probabilistic simulation framework that addresses the demands of statistically sound simulation studies in an unbiased fashion. Additionally, we expand on our current understanding of the theoretical notions of causality and dependence / conditional independence in BNs and the Markov Blankets within.
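To make "generating synthetic data from a predefined network model" concrete, here is a minimal sketch of ancestral (forward) sampling from a small discrete Bayesian network. It is a generic textbook technique, not the authors' simulation framework; the three-node network, its probabilities, and all function names are hypothetical.

```python
import random

# Hypothetical binary network: A -> B, and (A, B) -> C.
# Each CPT maps a tuple of parent values to P(node = 1 | parents).
NETWORK = {
    "A": {"parents": [], "cpt": {(): 0.3}},
    "B": {"parents": ["A"], "cpt": {(0,): 0.1, (1,): 0.8}},
    "C": {"parents": ["A", "B"],
          "cpt": {(0, 0): 0.05, (0, 1): 0.5, (1, 0): 0.4, (1, 1): 0.9}},
}


def topological_order(network):
    """Order nodes so that every node appears after all of its parents."""
    order, placed = [], set()
    while len(order) < len(network):
        progressed = False
        for node, spec in network.items():
            if node not in placed and all(p in placed for p in spec["parents"]):
                order.append(node)
                placed.add(node)
                progressed = True
        if not progressed:
            raise ValueError("network contains a cycle")
    return order


def sample_records(network, n, seed=0):
    """Draw n synthetic records by sampling each node given its sampled parents."""
    rng = random.Random(seed)
    order = topological_order(network)
    records = []
    for _ in range(n):
        record = {}
        for node in order:
            spec = network[node]
            parent_values = tuple(record[p] for p in spec["parents"])
            record[node] = 1 if rng.random() < spec["cpt"][parent_values] else 0
        records.append(record)
    return records


if __name__ == "__main__":
    for row in sample_records(NETWORK, n=5, seed=42):
        print(row)
```

Data drawn this way carry a known ground-truth structure, which is what lets a simulation study score how well a BN learning method recovers it.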

SPDE: A Multi-functional Software for Sequence Processing and Data Extraction

10.1101/2020.11.08.373720 ◽

2020 ◽

Cited By ~ 1

Author(s):

Dong Xu ◽

Zhuchou Lu ◽

Kangming Jin ◽

Wenmin Qiu ◽

Guirong Qiao ◽

...

Keyword(s):

Data Extraction ◽

Single Gene ◽

Biological Data ◽

Sequence Processing ◽

Reverse Complement ◽

Biological Data Analysis ◽

Basic Functions ◽

Functional Software ◽

Genome Information ◽

Ncbi Blast

Abstract Efficiently extracting information from biological big data can be a huge challenge for people (especially those who lack programming skills). We developed Sequence Processing and Data Extraction (SPDE) as an integrated tool for sequence processing and data extraction for gene family and omics analyses. Currently, SPDE has seven modules comprising 100 basic functions that range from single gene processing (e.g., translation, reverse complement, and primer design) to genome information extraction. All SPDE functions can be used without the need for programming or command lines. The SPDE interface has enough prompt information to help users run SPDE without barriers. In addition to its own functions, SPDE also incorporates publicly available analysis tools (such as NCBI-blast, HMMER, Primer3 and SAMtools), thereby making SPDE a comprehensive bioinformatics platform for big biological data analysis. Availability SPDE was built using Python and can be run on 32-bit, 64-bit Windows and macOS systems. It is open-source software that can be downloaded from https://github.com/simon19891216/[email protected]

Newt: a comprehensive web-based tool for viewing, constructing and analyzing biological maps

Bioinformatics ◽

10.1093/bioinformatics/btaa850 ◽

2020 ◽

Cited By ~ 1

Author(s):

Hasan Balci ◽

Metin Can Siper ◽

Nasim Saleh ◽

Ilkin Safarli ◽

Ludovic Roy ◽

...

Keyword(s):

Data Analysis ◽

Source Code ◽

Biological Data ◽

Web Based ◽

Cellular Processes ◽

Biological Data Analysis ◽

Accepted Standard ◽

Visualization Techniques

Abstract Motivation: Visualization of cellular processes and pathways is a key recurring requirement for effective biological data analysis. There is a considerable need for sophisticated web-based pathway viewers and editors operating with widely accepted standard formats, using the latest visualization techniques and libraries. Results: We developed a web-based tool named Newt for viewing, constructing and analyzing biological maps in standard formats such as SBGN, SBML and SIF. Availability and implementation: Newt’s source code is publicly available on GitHub and freely distributed under the GNU LGPL. Ample documentation on Newt can be found on http://newteditor.org and on YouTube.

Watchdog 2.0: New developments for reusability, reproducibility, and workflow execution

GigaScience ◽

10.1093/gigascience/giaa068 ◽

2020 ◽

Vol 9 (6) ◽

Cited By ~ 1

Author(s):

Michael Kluge ◽

Marie-Sophie Friedl ◽

Amrei L Menzel ◽

Caroline C Friedel

Keyword(s):

Large Scale ◽

Workflow Management ◽

Biological Data ◽

Management Systems ◽

Workflow Management Systems ◽

Computer Clusters ◽

New Developments ◽

Workflow Execution ◽

Biological Data Analysis ◽

Log File

Abstract Background: Advances in high-throughput methods have brought new challenges for biological data analysis, often requiring many interdependent steps applied to a large number of samples. To address this challenge, workflow management systems, such as Watchdog, have been developed to support scientists in the (semi-)automated execution of large analysis workflows. Implementation: Here, we present Watchdog 2.0, which implements new developments for module creation, reusability, and documentation and for reproducibility of analyses and workflow execution. Developments include a graphical user interface for semi-automatic module creation from software help pages, sharing repositories for modules and workflows, and a standardized module documentation format. The latter allows generation of a customized reference book of public and user-specific modules. Furthermore, extensive logging of workflow execution, module and software versions, and explicit support for package managers and container virtualization now ensures reproducibility of results. A step-by-step analysis protocol generated from the log file may, e.g., serve as a draft of a manuscript methods section. Finally, 2 new execution modes were implemented. One allows resuming workflow execution after interruption or modification without rerunning successfully executed tasks not affected by changes. The second one allows detaching and reattaching to workflow execution on a local computer while tasks continue running on computer clusters. Conclusions: Watchdog 2.0 provides several new developments that we believe to be of benefit for large-scale bioinformatics analysis and that are not completely covered by other competing workflow management systems. The software itself, module and workflow repositories, and comprehensive documentation are freely available at https://www.bio.ifi.lmu.de/watchdog.

NetConfer: a web application for comparative analysis of multiple biological networks

BMC Biology ◽

10.1186/s12915-020-00781-9 ◽

2020 ◽

Vol 18 (1) ◽

Cited By ~ 2

Author(s):

Sunil Nagpal ◽

Krishanu Das Baksi ◽

Bhusan K. Kuntal ◽

Sharmila S. Mande

Keyword(s):

Comparative Analysis ◽

Biological Networks ◽

Web Application ◽

Biological Systems ◽

Batch Process ◽

Biological Data ◽

Data Intensive ◽

Multiple Networks ◽

Biological Data Analysis ◽

Multiple Network

Abstract Background: Most biological experiments are inherently designed to compare changes or transitions of state between conditions of interest. Advances in data-intensive research have in particular elevated the need for resources and tools enabling comparative analysis of biological data. The complexity of biological systems and the interactions of their various components, such as genes, proteins, taxa, and metabolites, have been inferred, represented, and visualized via graph theory-based networks. Comparisons of multiple networks can help in identifying variations across different biological systems, thereby providing additional insights. However, while a number of online and stand-alone tools exist for generating, analyzing, and visualizing individual biological networks, the ability to batch process and comprehensively compare multiple networks is limited. Results: Here, we present a graphical user interface (GUI)-based web application which implements multiple network comparison methodologies and presents them in the form of organized analysis workflows. Dedicated comparative visualization modules are provided to end-users for obtaining easy-to-comprehend, insightful, and meaningful comparisons of various biological networks. We demonstrate the utility and power of our tool using publicly available microbial and gene expression data. Conclusion: The NetConfer tool was developed with the requirements of researchers who work in biological data analysis but have limited programming expertise in mind. It is also expected to be useful for advanced users from biological as well as other domains (working with association networks), who benefit from the provided ready-made workflows, as these allow users to focus directly on the results without worrying about the implementation. While the web version allows this application to be used without installation and dependency requirements, a stand-alone version has also been provided to accommodate the offline requirement of processing large networks.

A Brief Overview on Intelligent Computing-Based Biological Data and Image Analysis

Advances in Computational Intelligence and Robotics - Applications of Advanced Machine Intelligence in Computer Vision and Object Recognition ◽

10.4018/978-1-7998-2736-8.ch003 ◽

2020 ◽

pp. 65-89

Author(s):

Mousomi Roy

Keyword(s):

Real Life ◽

Automated Analysis ◽

Biological Data ◽

Accurate Diagnosis ◽

Automated Systems ◽

Intelligent Computing ◽

Reasonable Time ◽

Biological Data Analysis ◽

Challenging Tasks ◽

Automated Methods

Biological data analysis is one of the most important and challenging tasks in today's world. Automated analysis of these data is necessary for quick and accurate diagnosis. Intelligent computing-based solutions are needed to reduce both human intervention and analysis time. Artificial intelligence-based methods are frequently used to analyze and mine information from biological data, and several machine learning-based tools are available with which powerful and intelligent automated systems can be developed. In general, the volume of this kind of data is very large and demands sophisticated tools that can handle it efficiently and extract useful information within a reasonable time. In this chapter, the authors present a comprehensive study of different computer-aided automated methods and tools for analyzing different types of biological data. Moreover, this chapter gives an insight into various types of biological data and their real-life applications.

biological data analysis
Recently Published Documents

Clipper: p-value-free FDR control on high-throughput data from two conditions

easyfm: An easy software suite for file manipulation of Next Generation Sequencing data on desktops

A Primer in Biological Data Analysis and Visualization Using R

Practical R for biologists: an introduction

Synthetic data generation with probabilistic Bayesian Networks

SPDE: A Multi-functional Software for Sequence Processing and Data Extraction

Newt: a comprehensive web-based tool for viewing, constructing and analyzing biological maps

Watchdog 2.0: New developments for reusability, reproducibility, and workflow execution

NetConfer: a web application for comparative analysis of multiple biological networks

A Brief Overview on Intelligent Computing-Based Biological Data and Image Analysis
