Using Galaxy to Perform Large‐Scale Interactive Data Analyses—An Update

2021 ◽  
Vol 1 (2) ◽  
Author(s):  
Alexander Ostrovsky ◽  
Jennifer Hillman‐Jackson ◽  
Dave Bouvier ◽  
Dave Clements ◽  
Enis Afgan ◽  
...  
Author(s):  
Jennifer Hillman‐Jackson ◽  
Dave Clements ◽  
Daniel Blankenberg ◽  
James Taylor ◽  
Anton Nekrutenko ◽  
...  

Author(s):  
James Taylor ◽  
Ian Schenck ◽  
Dan Blankenberg ◽  
Anton Nekrutenko

2019 ◽  
Author(s):  
Eduard Klapwijk ◽  
Wouter van den Bos ◽  
Christian K. Tamnes ◽  
Nora Maria Raschle ◽  
Kathryn L. Mills

Many workflows and tools that aim to increase the reproducibility and replicability of research findings have been suggested. In this review, we discuss the opportunities that these efforts offer for the field of developmental cognitive neuroscience, in particular developmental neuroimaging. We focus on issues broadly related to statistical power and to flexibility and transparency in data analyses. Critical considerations relating to statistical power include challenges in recruitment and testing of young populations, how to increase the value of studies with small samples, and the opportunities and challenges related to working with large-scale datasets. Developmental studies involve challenges such as choices about age groupings, lifespan modelling, analyses of longitudinal changes, and data that can be processed and analyzed in a multitude of ways. Flexibility in data acquisition, analyses and description may thereby greatly impact results. We discuss methods for improving transparency in developmental neuroimaging, and how preregistration can improve methodological rigor. While outlining challenges and issues that may arise before, during, and after data collection, we highlight solutions and resources that can help to overcome some of them. Since the number of useful tools and techniques is ever-growing, we highlight the fact that many practices can be implemented stepwise.


2006 ◽  
Vol 63 (5) ◽  
pp. 1377-1389 ◽  
Author(s):  
Tim Li ◽  
Bing Fu

Abstract The structure and evolution characteristics of Rossby wave trains induced by tropical cyclone (TC) energy dispersion are revealed based on the Quick Scatterometer (QuikSCAT) and Tropical Rainfall Measuring Mission (TRMM) Microwave Imager (TMI) data. Among 34 cyclogenesis cases analyzed in the western North Pacific during 2000–01 typhoon seasons, six cases are associated with the Rossby wave energy dispersion of a preexisting TC. The wave trains are oriented in a northwest–southeast direction, with alternating cyclonic and anticyclonic vorticity circulation. A typical wavelength of the wave train is about 2500 km. The TC genesis is observed in the cyclonic circulation region of the wave train, possibly through a scale contraction process. The satellite data analyses reveal that not all TCs have a Rossby wave train in their wakes. The occurrence of the Rossby wave train depends to a certain extent on the TC intensity and the background flow. Whether or not a Rossby wave train can finally lead to cyclogenesis depends on large-scale dynamic and thermodynamic conditions related to both the change of the seasonal mean state and the phase of the tropical intraseasonal oscillation. Stronger low-level convergence and cyclonic vorticity, weaker vertical shear, and greater midtropospheric moisture are among the favorable large-scale conditions. The rebuilding process of a conditional unstable stratification is important in regulating the frequency of TC genesis.


F1000Research ◽  
2016 ◽  
Vol 5 ◽  
pp. 291 ◽  
Author(s):  
Darawan Rinchai ◽  
Sabri Boughorbel ◽  
Scott Presnell ◽  
Charlie Quinn ◽  
Damien Chaussabel

Systems-scale profiling approaches have become widely used in translational research settings. The resulting accumulation of large-scale datasets in public repositories represents a critical opportunity to promote insight and foster knowledge discovery. However, resources that can serve as an interface between biomedical researchers and such vast and heterogeneous dataset collections are needed in order to fulfill this potential. Recently, we have developed an interactive data browsing and visualization web application, the Gene Expression Browser (GXB). This tool can be used to overlay deep molecular phenotyping data with rich contextual information about analytes, samples and studies along with ancillary clinical or immunological profiling data. In this note, we describe a curated compendium of 93 public datasets generated in the context of human monocyte immunological studies, representing a total of 4,516 transcriptome profiles. Datasets were uploaded to an instance of GXB along with study description and sample annotations. Study samples were arranged in different groups. Ranked gene lists were generated based on relevant group comparisons. This resource is publicly available online at http://monocyte.gxbsidra.org/dm3/landing.gsp.


2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Gábor Pozsgai ◽  
Ibtissem Ben Fekih ◽  
Markus V. Kohnen ◽  
Said Amrani ◽  
Sándor Bérces ◽  
...  

Abstract Describing and conserving ecological interactions woven into ecosystems is one of the great challenges of the 21st century. Here, we present a unique dataset compiling the biotic interactions between two ecologically and economically important taxa: ground beetles (Coleoptera: Carabidae) and fungi. The resulting dataset contains the carabid-fungus associations collected from 392 scientific publications, 129 countries, mostly from the Palearctic region, published over a period of 200 years. With an updated taxonomy to match the currently accepted nomenclature, 3,378 unique associations among 5,564 records were identified between 1,776 carabid and 676 fungal taxa. Ectoparasitic Laboulbeniales were the most frequent fungal group associated with carabids, especially with Trechinae. The proportion of entomopathogens was low. Three different formats of the data have been provided along with an interactive data digest platform for analytical purposes. Our database summarizes the current knowledge on biotic interactions between insects and fungi, while offering a valuable resource to test large-scale hypotheses on those interactions.


2018 ◽  
Author(s):  
M. Jason de la Cruz ◽  
Michael W. Martynowycz ◽  
Johan Hattne ◽  
Tamir Gonen

Abstract We developed a procedure for the cryoEM method MicroED using SerialEM. With this approach, SerialEM coordinates stage rotation, microscope operation, and camera functions for automated continuous-rotation MicroED data collection. More than 300 datasets can be collected overnight in this way, facilitating high-throughput MicroED data collection for large-scale data analyses.


Water ◽  
2020 ◽  
Vol 12 (10) ◽  
pp. 2928
Author(s):  
Jeffrey D. Walker ◽  
Benjamin H. Letcher ◽  
Kirk D. Rodgers ◽  
Clint C. Muhlfeld ◽  
Vincent S. D’Angelo

With the rise of large-scale environmental models come new challenges for how we best utilize this information in research, management and decision making. Interactive data visualizations can make large and complex datasets easier to access and explore, which can lead to knowledge discovery, hypothesis formation and improved understanding. Here, we present a web-based interactive data visualization framework, the Interactive Catchment Explorer (ICE), for exploring environmental datasets and model outputs. Using a client-based architecture, the ICE framework provides a highly interactive user experience for discovering spatial patterns, evaluating relationships between variables and identifying specific locations using multivariate criteria. Through a series of case studies, we demonstrate the application of the ICE framework to datasets and models associated with three separate research projects covering different regions in North America. From these case studies, we provide specific examples of the broader impacts that tools like these can have, including fostering discussion and collaboration among stakeholders and playing a central role in the iterative process of data collection, analysis and decision making. Overall, the ICE framework demonstrates the potential benefits and impacts of using web-based interactive data visualization tools to place environmental datasets and model outputs directly into the hands of stakeholders, managers, decision makers and other researchers.


2020 ◽  
Author(s):  
Katharina Höflich ◽  
Martin Claus ◽  
Willi Rath ◽  
Dorian Krause ◽  
Benedikt von St. Vieth ◽  
...  

Demand on high-end high performance computer (HPC) systems by the Earth system science community today encompasses not only the handling of complex simulations but also machine and deep learning as well as interactive data analysis workloads on large volumes of data. This poster addresses the infrastructure needs of large-scale interactive data analysis workloads on supercomputers. It lays out how to enable optimizations of existing infrastructure with respect to accessibility, usability and interactivity and aims at informing decision making about future systems. To enhance accessibility, options for distributed access, e.g. through JupyterHub, will be evaluated. To increase usability, the unification of working environments via the operation and the joint maintenance of containers will be explored. Containers serve as a portable base software setting for data analysis application stacks and allow for long-term usability of individual working environments and repeatability of scientific analysis. Aiming for interactive big-data analysis on HPC will also help the scientific community in utilizing increasingly heterogeneous supercomputers, since the modular data-analysis stack already contains solutions for seamless use of various architectures such as accelerators. However, to enable day-to-day interactive work on supercomputers, the inter-operation of workloads with quick turn-around times and highly variable resource demands needs to be understood and evaluated. To this end, scheduling policies on selected HPC systems are reviewed with respect to existing technical solutions such as job preemption, utilizing the resiliency features of parallel computing toolkits like Dask. Presented are preliminary results focussing on the aspects of usability and interactive use of HPC systems on the basis of typical use cases from the ocean science community.

