Neuroscience Cloud Analysis as a Service: An Open Source Platform for Scalable, Reproducible Data Analysis

2021 ◽  
Author(s):  
Taiga Abe ◽  
Ian Kinsella ◽  
Shreya Saxena ◽  
E. Kelly Buchanan ◽  
Joao Couto ◽  
...  
Liam Paninski ◽  
John P. Cunningham

Abstract. A major goal of computational neuroscience is to develop powerful analysis tools that operate on large datasets. These methods provide an essential toolset to unlock scientific insights from new experiments. Unfortunately, a major obstacle currently impedes progress: while existing analysis methods are frequently shared as open source software, the infrastructure needed to deploy these methods – at scale, reproducibly, cheaply, and quickly – remains inaccessible to all but a minority of expert users. As a result, many users cannot fully exploit these tools, due to constrained computational resources (limited or costly compute hardware) and/or mismatches in expertise (experimentalists vs. large-scale computing experts). In this work we develop Neuroscience Cloud Analysis As a Service (NeuroCAAS): a fully-managed infrastructure platform, based on modern large-scale computing advances, that makes state-of-the-art data analysis tools accessible to the neuroscience community. We offer NeuroCAAS as an open source service with a drag-and-drop interface, entirely removing the burden of infrastructure expertise, purchasing, maintenance, and deployment. NeuroCAAS is enabled by three key contributions. First, NeuroCAAS cleanly separates tool implementation from usage, allowing cutting-edge methods to be served directly to the end user with no need to read or install any analysis software. Second, NeuroCAAS automatically scales as needed, providing reliable, highly elastic computational resources that are more efficient than personal or lab-supported hardware, without management overhead. Finally, we show that many popular data analysis tools offered through NeuroCAAS outperform typical analysis solutions (in terms of speed and cost) while improving ease of use and maintenance, dispelling the myth that cloud compute is prohibitively expensive and technically inaccessible. By removing barriers to fast, efficient cloud computation, NeuroCAAS can dramatically accelerate both the dissemination and the effective use of cutting-edge analysis tools for neuroscientific discovery.


Solid Earth ◽  
2011 ◽  
Vol 2 (1) ◽  
pp. 53-63 ◽  
Author(s):  
S. Tavani ◽  
P. Arbues ◽  
M. Snidero ◽  
N. Carrera ◽  
J. A. Muñoz

Abstract. In this work we present the Open Plot Project, open-source software for structural data analysis that includes a 3-D environment. The software offers many classical functionalities of structural data analysis tools, such as stereoplots, contouring, tensorial regression, scatterplots, histograms and transect analysis. In addition, efficient filtering tools allow the selection of data according to their attributes, including spatial distribution and orientation. This first alpha release represents a stand-alone toolkit for structural data analysis. The presence of a 3-D environment with digitalising tools allows the integration of structural data with information extracted from georeferenced images to produce structurally validated dip domains. This, coupled with many import/export facilities, allows easy incorporation of structural analyses in workflows for 3-D geological modelling. Accordingly, Open Plot Project is also a candidate structural add-on for 3-D geological modelling software. The software (for both Windows and Linux O.S.), the User Manual, a set of example movies (complementary to the User Manual), and the source code are provided as a Supplement. We intend the publication of the source code to set the foundation for free, public software that, hopefully, the structural geologists' community will use, modify, and implement. The creation of additional public controls/tools is strongly encouraged.


2014 ◽  
Vol 28 (S1) ◽  
Author(s):  
James Bassingthwaighte ◽  
Erik Butterworth ◽  
Bart Jardine ◽  
Gary Raymond ◽  
Maxwell Neal

2021 ◽  
Author(s):  
Sevim Cengiz ◽  
Muhammed Yildirim ◽  
Abdullah Bas ◽  
Esin Ozturk-Isik

Proton magnetic resonance spectroscopic imaging (1H-MRSI) provides noninvasive evaluation of brain metabolism. However, several limitations prevent its wider clinical use, including spectral quality issues, the partial volume effect, and the chemical shift artifact. Additionally, metabolite maps must be created in order to analyze spectral data along with other MRI modalities. This study presents Oryx-MRSI, a MATLAB-based open-source data analysis software for 3D 1H-MRSI. It includes modules for visualization of raw 1H-MRSI data and LCModel outputs, chemical shift correction, tissue fraction calculation, metabolite map production, and registration onto the standard MNI152 brain atlas, and it provides automatic spectral quality control. Oryx-MRSI implements region of interest analysis at brain parcellations defined on the MNI152 brain atlas. All generated metabolite maps are stored in NIfTI format. Oryx-MRSI is publicly available at https://github.com/sevimcengiz/Oryx-MRSI along with six example datasets.


2021 ◽  
Author(s):  
Fabian Kovacs ◽  
Max Thonagel ◽  
Marion Ludwig ◽  
Alexander Albrecht ◽  
Manuel Hegner ◽  
...  

BACKGROUND Big data in healthcare must be exploited to achieve a substantial increase in efficiency and competitiveness. The analysis of patient-related data in particular has huge potential to improve decision-making processes. However, most analytical approaches used today are highly time- and resource-consuming. OBJECTIVE The presented software solution Conquery is an open-source software tool providing advanced but intuitive data analysis without the need for specialized statistical training. Conquery aims to simplify big data analysis for novice database users in the medical sector. METHODS Conquery is a document-oriented distributed time-series database and analysis platform. Its main application is the analysis of per-person medical records by non-technical medical professionals. Complex analyses are realized in the Conquery frontend by dragging tree nodes into the query editor. Queries are evaluated by a bespoke distributed query engine for medical records in a column-oriented fashion. We present a custom compression scheme to facilitate low response times that uses online calculated as well as precomputed metadata and data statistics. RESULTS Conquery allows for easy navigation through the hierarchy and enables complex study cohort construction whilst reducing the demand on time and resources. The UI of Conquery and a query output are exemplified by the construction of a relevant clinical cohort. CONCLUSIONS Conquery is an efficient and intuitive open-source software for performant and secure data analysis and aims at supporting decision-making processes in the healthcare sector.


GigaScience ◽  
2020 ◽  
Vol 9 (10) ◽  
Author(s):  
Katrina L Kalantar ◽  
Tiago Carvalho ◽  
Charles F A de Bourcy ◽  
Boris Dimitrov ◽  
Greg Dingle ◽  
...  

Abstract Background Metagenomic next-generation sequencing (mNGS) has enabled the rapid, unbiased detection and identification of microbes without pathogen-specific reagents, culturing, or a priori knowledge of the microbial landscape. mNGS data analysis requires a series of computationally intensive processing steps to accurately determine the microbial composition of a sample. Existing mNGS data analysis tools typically require bioinformatics expertise and access to local server-class hardware resources. For many research laboratories, this presents an obstacle, especially in resource-limited environments. Findings We present IDseq, an open source cloud-based metagenomics pipeline and service for global pathogen detection and monitoring (https://idseq.net). The IDseq Portal accepts raw mNGS data, performs host and quality filtration steps, then executes an assembly-based alignment pipeline, which results in the assignment of reads and contigs to taxonomic categories. The taxonomic relative abundances are reported and visualized in an easy-to-use web application to facilitate data interpretation and hypothesis generation. Furthermore, IDseq supports environmental background model generation and automatic internal spike-in control recognition, providing statistics that are critical for data interpretation. IDseq was designed with the specific intent of detecting novel pathogens. Here, we benchmark novel virus detection capability using both synthetically evolved viral sequences and real-world samples, including IDseq analysis of a nasopharyngeal swab sample acquired and processed locally in Cambodia from a tourist from Wuhan, China, infected with the recently emergent SARS-CoV-2. Conclusion The IDseq Portal reduces the barrier to entry for mNGS data analysis and enables bench scientists, clinicians, and bioinformaticians to gain insight from mNGS datasets for both known and novel pathogens.


2016 ◽  
Vol 20 (9) ◽  
pp. 3739-3743 ◽  
Author(s):  
Michael N. Fienen ◽  
Mark Bakker

Abstract. In the past decade, difficulties encountered in reproducing the results of a cancer study at Duke University resulted in a scandal and an investigation which concluded that tools used for data management, analysis, and modeling were inappropriate for the documentation of the study, let alone the reproduction of the results. New protocols were developed which require that data analysis and modeling be carried out with scripts that can be used to reproduce the results and are a record of all decisions and interpretations made during an analysis or a modeling effort. In the hydrological sciences, we face similar challenges and need to develop similar standards for transparency and repeatability of results. A promising route is to start making use of open-source languages (such as R and Python) to write scripts and to use collaborative coding environments (such as Git) to share our codes for inspection and use by the hydrological community. An important side-benefit to adopting such protocols is consistency and efficiency among collaborators.


SoftwareX ◽  
2016 ◽  
Vol 5 ◽  
pp. 121-126 ◽  
Author(s):  
Tobias Weber ◽  
Robert Georgii ◽  
Peter Böni
