MetaCycle: an integrated R package to evaluate periodicity in large scale data

2016 ◽  
Author(s):  
Gang Wu ◽  
Ron C Anafi ◽  
Michael E Hughes ◽  
Karl Kornacker ◽  
John B Hogenesch

Summary: Detecting periodicity in large-scale data remains a challenge. Different algorithms offer strengths and weaknesses in statistical power, sensitivity to outliers, ease of use, and sampling requirements. While efforts have been made to identify best-of-breed algorithms, relatively little research has gone into integrating these methods in a generalizable method. Here we present MetaCycle, an R package that incorporates ARSER, JTK_CYCLE, and Lomb-Scargle to conveniently evaluate periodicity in time-series data. Availability and implementation: The MetaCycle package is available on the CRAN repository (https://cran.r-project.org/web/packages/MetaCycle/index.html) and GitHub (https://github.com/gangwug/MetaCycle). Contact: [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.

2019 ◽  
Author(s):  
Zachary B. Abrams ◽  
Caitlin E. Coombes ◽  
Suli Li ◽  
Kevin R. Coombes

Abstract Summary: Unsupervised data analysis in many scientific disciplines is based on calculating distances between observations and finding ways to visualize those distances. These kinds of unsupervised analyses help researchers uncover patterns in large-scale data sets. However, researchers can select from a vast number of different distance metrics, each designed to highlight different aspects of different data types. There are also numerous visualization methods with their own strengths and weaknesses. To help researchers perform unsupervised analyses, we developed the Mercator R package. Mercator enables users to see important patterns in their data by generating multiple visualizations using different standard algorithms, making it particularly easy to compare and contrast the results arising from different metrics. By allowing users to select the distance metric that best fits their needs, Mercator helps researchers perform unsupervised analyses that use pattern identification through computation and visual inspection. Availability and Implementation: Mercator is freely available at the Comprehensive R Archive Network (https://cran.r-project.org/web/packages/Mercator/index.html). Contact: [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.


2021 ◽  
Vol 15 ◽  
Author(s):  
Witney Chen ◽  
Lowry Kirkby ◽  
Miro Kotzev ◽  
Patrick Song ◽  
Ro’ee Gilron ◽  
...  

Advances in neuromodulation technologies hold the promise of treating a patient’s unique brain network pathology using personalized stimulation patterns. In service of these goals, neuromodulation clinical trials using sensing-enabled devices are routinely generating large multi-modal datasets. However, the expansion of data acquisition also brings increasing difficulty in storing, managing, and analyzing the associated datasets, which integrate complex neural and wearable time-series data with dynamic assessments of patients’ symptomatic state. Here, we discuss a scalable cloud-based data platform that enables ingestion, aggregation, storage, query, and analysis of multi-modal neurotechnology datasets. This large-scale data infrastructure will accelerate translational neuromodulation research and enable the development and delivery of next-generation deep brain stimulation therapies.


Author(s):  
Zachary B Abrams ◽  
Caitlin E Coombes ◽  
Suli Li ◽  
Kevin R Coombes

Abstract Summary Unsupervised machine learning provides tools for researchers to uncover latent patterns in large-scale data, based on calculated distances between observations. Methods to visualize high-dimensional data based on these distances can elucidate subtypes and interactions within multi-dimensional and high-throughput data. However, researchers can select from a vast number of distance metrics and visualizations, each with their own strengths and weaknesses. The Mercator R package facilitates selection of a biologically meaningful distance from 10 metrics, together appropriate for binary, categorical and continuous data, and visualization with 5 standard and high-dimensional graphics tools. Mercator provides a user-friendly pipeline for informaticians or biologists to perform unsupervised analyses, from exploratory pattern recognition to production of publication-quality graphics. Availability and implementation Mercator is freely available at the Comprehensive R Archive Network (https://cran.r-project.org/web/packages/Mercator/index.html).


2018 ◽  
Author(s):  
Carlos Martínez-Mira ◽  
Ana Conesa ◽  
Sonia Tarazona

Abstract Motivation: As new integrative methodologies are being developed to analyse multi-omic experiments, validation strategies are required for benchmarking. In silico approaches such as simulated data are popular as they are fast and cheap. However, few tools are available for creating synthetic multi-omic data sets. Results: MOSim is a new R package for easily simulating multi-omic experiments consisting of gene expression data, other regulatory omics and the regulatory relationships between them. MOSim supports different experimental designs including time series data. Availability: The package is freely available under the GPL-3 license from the Bitbucket repository (https://bitbucket.org/ConesaLab/mosim/). Contact: [email protected] Supplementary information: Supplementary material is available at bioRxiv online.


2016 ◽  
Vol 32 (21) ◽  
pp. 3351-3353 ◽  
Author(s):  
Gang Wu ◽  
Ron C. Anafi ◽  
Michael E. Hughes ◽  
Karl Kornacker ◽  
John B. Hogenesch

2017 ◽  
Vol 79 (1) ◽  
pp. 28-34
Author(s):  
Will H. Ryan ◽  
Elise S. Gornish ◽  
Lynn Christenson ◽  
Stacey Halpern ◽  
Sandra Henderson ◽  
...  

The value of long-term data (generally >10 years) in ecology is well known. Funding agencies clearly see the value in these data and have supported a limited number of projects to this end. However, individual researchers often see the challenges of long-term data collection as insurmountable. We propose that long-term data collection can be practical as part of any teaching or outreach program, and we provide guidance on how long-term projects can fit into a teaching and research schedule. While our primary audience is college faculty, our message is appropriate for anyone interested in establishing long-term studies. The benefits of adopting these kinds of projects include experience for students, encouraging public interest in science, increased publication potential for researchers, and increased large-scale data availability, leading to a better understanding of ecological phenomena.


2021 ◽  
Author(s):  
Isaac Fink ◽  
Richard J. Abdill ◽  
Ran Blekhman ◽  
Laura Grieneisen

Abstract Summary: A key aspect of microbiome research is analysis of longitudinal dynamics using time series data. A method to visualize both the proportional and absolute change in the abundance of multiple taxa across multiple subjects over time is needed. We developed BiomeHorizon, an open-source R package that visualizes longitudinal compositional microbiome data using horizon plots. Availability and Implementation: BiomeHorizon is available at https://github.com/blekhmanlab/biomehorizon/ and released under the MIT license. A guide with step-by-step instructions for using the package is provided at https://blekhmanlab.github.io/biomehorizon/. The guide also provides code to reproduce all plots in this [email protected], [email protected], [email protected] Supplementary information: None


2019 ◽  
Author(s):  
Simona Vigodner ◽  
Raya Khanin

Abstract: Genetic underpinnings of facial aging are still largely unknown. In this study, we leverage the statistical power of large-scale data from the UK Biobank and perform in silico analysis of genome-wide self-perceived facial aging. Functional analysis reveals significant over-representation of skin pigmentation and immune-related pathways that are correlated with facial aging. For males, hair loss is one of the top categories that is highly significantly over-represented in the genetics data associated with self-reported facial aging. Our analysis confirms that genes coding for the extracellular matrix play important roles in aging. Overall, our results provide evidence that while somewhat biased, large-scale self-reported data on aging can be utilized for extracting useful insights into underlying biology, provide candidate skin aging biomarkers, and advance anti-aging skincare.


2019 ◽  
Vol 36 (8) ◽  
pp. 2572-2574
Author(s):  
Soumitra Pal ◽  
Teresa M Przytycka

Abstract Summary Large-scale data analysis in bioinformatics requires pipelined execution of multiple software tools. Generally each stage in a pipeline takes considerable computing resources, and several workflow management systems (WMS), e.g. Snakemake, Nextflow, Common Workflow Language, Galaxy, etc., have been developed to ensure optimum execution of the stages across two invocations of the pipeline. However, when the pipeline needs to be executed with different settings of parameters, e.g. thresholds, underlying algorithms, etc., these WMS require significant scripting to ensure an optimal execution. We developed JUDI on top of DoIt, a Python-based WMS, to systematically handle parameter settings based on the principles of database management systems. Using a novel modular approach that encapsulates a parameter database in each task and file associated with a pipeline stage, JUDI simplifies plug-and-play of the pipeline stages. For a typical pipeline with n parameters, JUDI reduces the number of lines of scripting required by a factor of O(n). With properly designed parameter databases, JUDI not only enables reproducing research under published values of parameters but also facilitates exploring newer results under novel parameter settings. Availability and implementation https://github.com/ncbi/JUDI Supplementary information Supplementary data are available at Bioinformatics online.
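
The parameter-handling idea described in this abstract can be illustrated with a minimal sketch. It is not JUDI's API: the names `param_db` and `expand`, the parameter values, and the file-name template are all hypothetical. The sketch only shows how a small table of parameter settings can generate one task per combination, so that no nested loop has to be written by hand for each parameter.

```python
from itertools import product


def expand(param_db):
    """Return one dict per combination of parameter values (Cartesian product)."""
    names = list(param_db)
    value_lists = [param_db[name] for name in names]
    return [dict(zip(names, combo)) for combo in product(*value_lists)]


# Hypothetical parameter "database" for a single pipeline stage.
param_db = {
    "threshold": [0.01, 0.05],
    "algorithm": ["A", "B", "C"],
}

tasks = expand(param_db)

# Each row identifies one task and the (hypothetical) file it would produce.
for row in tasks:
    target = "result_{threshold}_{algorithm}.txt".format(**row)
    print(target)
```

Adding a further parameter here means adding one entry to `param_db` and leaving the task-generation code unchanged, which is the kind of scripting saving the abstract refers to.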

