High Performance Computing Tools for Cross Correlation of Multi-Dimensional Data Sets Across Instrument Platforms

2016, Vol 22 (S3), pp. 288-289
Author(s): Alex Belianinov, Danka Gobeljic, Vladimir Shvartsman, Erik Endeve, Eric J. Lingerfelt, ...
2020
Author(s): Kary Ocaña, Micaella Coelho, Guilherme Freire, Carla Osthoff

Bayesian phylogenetic algorithms are computationally intensive. BEAST 1.10 inferences use the BEAGLE 3 high-performance library for efficient likelihood computations, a strategy that makes phylogenetic inference and dating feasible for current studies of SARS-CoV-2 transmission. In follow-up simulations on hybrid resources of the Santos Dumont supercomputer, using four phylogenomic data sets, we characterize the scaling behavior of BEAST 1.10. Our results provide insight into species tree and MCMC chain length estimation, identifying the configurations that make the best use of high-performance computing resources. Ongoing work involves analyses of SARS-CoV-2 using BEAST 1.8 on multiple GPUs.
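
The abstract describes characterizing BEAST 1.10's scaling on hybrid CPU/GPU resources. As a minimal sketch of how such a benchmark might be driven, assuming a `beast` binary on PATH, a hypothetical input file name, and BEAGLE resource flags as commonly documented for BEAST 1.10 (an assumption about your local build), one could time identical runs under different configurations:

```python
# Minimal sketch: timing BEAST runs under different BEAGLE resource
# configurations to characterize scaling. The input file name and the
# exact flag spellings are assumptions about a local BEAST/BEAGLE build.
import subprocess
import time

XML = "sarscov2_alignment.xml"  # hypothetical BEAST input file

# Hypothetical resource configurations: CPU-only vs. GPU-backed likelihoods.
configs = {
    "cpu_sse": ["-beagle_CPU", "-beagle_SSE"],
    "gpu_x1":  ["-beagle_GPU"],
    "gpu_x2":  ["-beagle_GPU", "-beagle_instances", "2"],
}

for name, flags in configs.items():
    start = time.perf_counter()
    subprocess.run(["beast", *flags, XML], check=True)  # blocks until the run ends
    elapsed = time.perf_counter() - start
    print(f"{name}: {elapsed:.1f} s wall-clock")
```

Plotting elapsed time against resource configuration, for each of the four phylogenomic data sets, would yield the scaling curves the study refers to.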


Author(s): Phillip L. Manning, Peter L. Falkingham

Dinosaurs readily conjure images of lost worlds and forgotten lives. Our understanding of these iconic, extinct animals now comes from many disciplines, not just the science of palaeontology. In recent years palaeontology has benefited from the application of new and existing techniques from physics, biology, chemistry and engineering, but especially from computational science. This chapter highlights the application of computers in palaeontology as a key area of development in studying fossils. Advances in high performance computing (HPC) have greatly aided the many disciplines and technologies that now feed palaeontological research, especially when dealing with large and complex data sets. We also give examples of how such multidisciplinary research can be used to communicate not only specific discoveries in palaeontology, but also the methods and ideas from interrelated disciplines, to wider audiences. Dinosaurs are a useful vehicle for wider public engagement, helping to communicate complex science in digestible chunks.


Genes, 2019, Vol 10 (12), pp. 996
Author(s): Ashley Cliff, Jonathon Romero, David Kainer, Angelica Walker, Anna Furches, ...

As time progresses and technology improves, biological data sets are continuously increasing in size. New methods and new implementations of existing methods are needed to keep pace with this increase. In this paper, we present a high-performance computing (HPC)-capable implementation of Iterative Random Forest (iRF). This new implementation enables the explainable-AI eQTL analysis of SNP sets with over a million SNPs. Using this implementation, we also present a new method, iRF Leave One Out Prediction (iRF-LOOP), for the creation of Predictive Expression Networks on the order of 40,000 genes or more. We compare the new implementation of iRF with the previous R version and analyze its time to completion on two of the world’s fastest supercomputers, Summit and Titan. We also show iRF-LOOP’s ability to capture biologically significant results when creating Predictive Expression Networks. This new implementation of iRF will enable the analysis of biological data sets at scales that were previously not possible.
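
The leave-one-out idea behind iRF-LOOP lends itself to a compact sketch: for each gene, a forest predicts that gene's expression from all the other genes, and its feature importances become the weights of the edges into that gene. The sketch below uses scikit-learn's plain RandomForestRegressor as a stand-in for iterative Random Forest, with illustrative names and shapes; it is not the authors' HPC implementation.

```python
# Sketch of the iRF-LOOP idea with an ordinary random forest as a stand-in
# for iterative Random Forest. Function and variable names are illustrative.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def predictive_expression_network(X, n_trees=100):
    """X: (samples, genes) expression matrix -> (genes, genes) edge-weight matrix."""
    n_genes = X.shape[1]
    W = np.zeros((n_genes, n_genes))
    for g in range(n_genes):                  # leave gene g out as the target
        mask = np.arange(n_genes) != g
        rf = RandomForestRegressor(n_estimators=n_trees, n_jobs=-1)
        rf.fit(X[:, mask], X[:, g])           # predict g from all other genes
        W[mask, g] = rf.feature_importances_  # importances = edge weights into g
    return W

# Toy usage: 50 samples x 20 genes. The paper's networks cover ~40,000 genes,
# which is why an HPC-capable implementation is needed.
X = np.random.rand(50, 20)
W = predictive_expression_network(X)
```

Each per-gene forest is independent of the others, which is what makes the method embarrassingly parallel and a natural fit for machines like Summit and Titan.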


2015, Vol 8s1, pp. MRI.S23558
Author(s): Anwar S. Shatil, Sohail Younas, Hossein Pourreza, Chase R. Figley

With larger data sets and more sophisticated analyses, it is becoming increasingly common for neuroimaging researchers to push (or exceed) the limits of standalone computer workstations. Nonetheless, although high-performance computing platforms such as clusters, grids and clouds are already in routine use by a handful of neuroimaging researchers to increase their storage and/or computational power, adoption of such resources by the broader neuroimaging community remains relatively uncommon. The goal of the current manuscript is therefore to: 1) inform prospective users about the similarities and differences between computing clusters, grids and clouds; 2) highlight their main advantages; 3) discuss when it may (and may not) be advisable to use them; 4) review some of their potential problems and barriers to access; and finally 5) give a few practical suggestions for how interested new users can start analyzing their neuroimaging data using cloud resources. Although cloud computing aims to hide most of the complexity of infrastructure management from end users, we recognize that this can still be an intimidating area for cognitive neuroscientists, psychologists, neurologists, radiologists, and other neuroimaging researchers lacking a strong computational background. With this in mind, we provide a basic introduction to cloud computing in general (including basic terminology, computer architectures, infrastructure and service models, etc.), a practical overview of the benefits and drawbacks, and a specific focus on how cloud resources can be used for various neuroimaging applications.


2020, Vol 53 (5), pp. 1404-1413
Author(s): Vincent Favre-Nicolin, Gaétan Girard, Steven Leake, Jerome Carnis, Yuriy Chushkin, ...

The open-source PyNX toolkit has been extended to provide tools for coherent X-ray imaging data analysis and simulation. All calculations can be executed on graphics processing units (GPUs) to achieve high-performance computing speeds. The toolkit can be used for coherent diffraction imaging (CDI), ptychography and wavefront propagation, in the far- or near-field regime. Moreover, all imaging operations (propagation, projections, algorithm cycles…) can be implemented in Python as simple mathematical operators, an approach that makes it easy to combine basic algorithms into a tailored chain. Calculations can also be distributed across multiple GPUs, e.g. for large ptychography data sets. Command-line scripts are available for online CDI and ptychography analysis, working either from raw beamline data sets or from the coherent X-ray imaging (CXI) data format.
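
To illustrate the operator style the abstract describes, here is a toy composition pattern in plain Python, showing how imaging steps written as operators might chain with `*` and repeat with `**`. It only imitates the idiom; it is not PyNX's actual API, and the `Scale` step is a made-up stand-in for a real propagation or projection.

```python
# Toy operator-composition pattern: steps compose with * and repeat with **.
class Operator:
    def __mul__(self, other):
        if isinstance(other, Operator):
            return Chain(self, other)   # compose: (A * B)(x) == A(B(x))
        return self.apply(other)        # applying to data: A * x == A(x)

    def __pow__(self, n):
        return Chain(*([self] * n))     # A ** 3 == A * A * A

    def apply(self, data):
        raise NotImplementedError

class Chain(Operator):
    def __init__(self, *ops):
        self.ops = ops

    def apply(self, data):
        for op in reversed(self.ops):   # rightmost operator acts first
            data = op.apply(data)
        return data

class Scale(Operator):
    """Hypothetical stand-in for a real step such as a propagation or projection."""
    def __init__(self, s):
        self.s = s

    def apply(self, data):
        return self.s * data

# Usage: two 0.5x steps after three 2x steps, applied to the value 1.0.
result = (Scale(0.5) ** 2 * Scale(2.0) ** 3) * 1.0
print(result)  # 1.0 * 2 * 2 * 2 * 0.5 * 0.5 = 2.0
```

Writing algorithm cycles this way keeps a tailored reconstruction chain readable as a single algebraic expression, which is the design choice the abstract highlights.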


2015, Vol 17 (3), pp. 368-379
Author(s): Alex Upton, Oswaldo Trelles, José Antonio Cornejo-García, James Richard Perkins


