GenomicDataCommons: a Bioconductor Interface to the NCI Genomic Data Commons

Abstract. The National Cancer Institute (NCI) Genomic Data Commons (Grossman et al. 2016, https://gdc.cancer.gov/) provides the cancer research community with an open and unified repository for sharing and accessing data across numerous cancer studies and projects via a high-performance data transfer and query infrastructure. The Bioconductor project (Huber et al. 2015) is an open source and open development software project built on the R statistical programming environment (R Core Team 2016). A major goal of the Bioconductor project is to facilitate the use, analysis, and comprehension of genomic data. The GenomicDataCommons Bioconductor package provides basic infrastructure for querying, accessing, and mining genomic datasets available from the GDC. We expect that the Bioconductor developer and bioinformatics communities will build on the GenomicDataCommons package to add higher-level functionality and expose cancer genomics data to the many state-of-the-art bioinformatics methods available in Bioconductor. Availability: https://bioconductor.org/packages/GenomicDataCommons and https://github.com/seandavi/GenomicDataCommons.

hts-nim: scripting high-performance genomic analyses

10.1101/261735 ◽

2018 ◽

Author(s):

Brent S. Pedersen ◽

Aaron R. Quinlan

Keyword(s):

High Performance ◽

Genomic Data ◽

Supplementary Information ◽

Supplementary Data ◽

Scripting Languages ◽

Link Type ◽

Custom Software ◽

Genomic Analyses ◽

Biological Insight ◽

Supplementary Material

Abstract. Motivation: Extracting biological insight from genomic data inevitably requires custom software. In many cases, this is accomplished with scripting languages, owing to their accessibility and brevity. Unfortunately, the ease of scripting languages typically comes at a substantial performance cost that is especially acute at the scale of modern genomics datasets. Results: We present hts-nim, a high-performance library written in the Nim programming language that provides a simple, scripting-like syntax without sacrificing performance. Availability: hts-nim is available at https://github.com/brentp/hts-nim and the example tools are at https://github.com/brentp/hts-nim-tools, both under the MIT license. Supplementary information: Supplementary data are available at Bioinformatics online.

TCGAbiolinksGUI: A graphical user interface to analyze GDC cancer molecular and clinical data

10.1101/147496 ◽

2017 ◽

Cited By ~ 3

Author(s):

Tiago C Silva ◽

Antonio Colaprico ◽

Catharina Olsen ◽

Gianluca Bontempi ◽

Michele Ceccarelli ◽

...

Keyword(s):

User Interface ◽

Graphical User Interface ◽

Cancer Genomics ◽

Bioconductor Package ◽

Link Type ◽

Bioconductor Project ◽

Video Tutorials ◽

Data Portal ◽

Advanced Knowledge ◽

Data Commons

Abstract. Background: The GDC (Genomic Data Commons) data portal provides users with data from cancer genomics studies. Recently, we developed the R/Bioconductor TCGAbiolinks package, which allows users to search, download and prepare cancer genomics data for integrative data analysis. Using this package requires advanced knowledge of R, which limits the number of users. Results: To overcome this obstacle and make the package accessible to a wider range of users, we developed TCGAbiolinksGUI, which uses a Shiny graphical user interface (GUI) available through the R/Bioconductor package. Conclusion: The TCGAbiolinksGUI package is freely available within the Bioconductor project at http://bioconductor.org/packages/TCGAbiolinksGUI/. Links to the GitHub repository, a demo version of the tool, a docker image and PDF/video tutorials are available at http://bit.do/TCGAbiolinksDocs.

TCGAbiolinksGUI: A graphical user interface to analyze cancer molecular and clinical data

F1000Research ◽

10.12688/f1000research.14197.1 ◽

2018 ◽

Vol 7 ◽

pp. 439 ◽

Cited By ~ 6

Author(s):

Tiago Chedraoui Silva ◽

Antonio Colaprico ◽

Catharina Olsen ◽

Tathiane M Malta ◽

Gianluca Bontempi ◽

...

Keyword(s):

Data Analysis ◽

User Interface ◽

Graphical User Interface ◽

Cancer Genomics ◽

Genomic Data ◽

Bioconductor Project ◽

Video Tutorials ◽

Data Portal ◽

Advanced Knowledge ◽

Data Commons

The GDC (Genomic Data Commons) data portal provides users with data from cancer genomics studies. Recently, we developed the R/Bioconductor TCGAbiolinks package, which allows users to search, download and prepare cancer genomics data for integrative data analysis. Using this package requires advanced knowledge of R, which limits the number of users. To overcome this obstacle and make the package accessible to a wider range of users, we developed a graphical user interface (GUI) using Shiny, available through the package TCGAbiolinksGUI. The TCGAbiolinksGUI package is freely available within the Bioconductor project at http://bioconductor.org/packages/TCGAbiolinksGUI/. Links to the GitHub repository, a demo version of the tool, a docker image and PDF/video tutorials are available from the TCGAbiolinksGUI site.

Author Correction: The NCI Genomic Data Commons

Nature Genetics ◽

10.1038/s41588-021-00883-2 ◽

2021 ◽

Author(s):

Allison P. Heath ◽

Vincent Ferretti ◽

Stuti Agrawal ◽

Maksim An ◽

James C. Angelakos ◽

...

Keyword(s):

Genomic Data ◽

Data Commons

Artificial Intelligence-Assisted Colonoscopy for Detection of Colon Polyps: a Prospective, Randomized Cohort Study

Journal of Gastrointestinal Surgery ◽

10.1007/s11605-020-04802-4 ◽

2020 ◽

Author(s):

Yuchen Luo ◽

Yi Zhang ◽

Ming Liu ◽

Yihong Lai ◽

Panpan Liu ◽

...

Keyword(s):

Artificial Intelligence ◽

Real Time ◽

High Performance ◽

Detection System ◽

Random Order ◽

Colon Polyps ◽

Clinical Environment ◽

Polyp Detection ◽

Link Type ◽

Polyp Detection Rate

Abstract. Background and aims: Improving the rate of polyp detection is an important measure to prevent colorectal cancer (CRC). Real-time automatic polyp detection systems, through deep learning methods, can learn and perform specific endoscopic tasks previously performed by endoscopists. The purpose of this study was to explore whether a high-performance, real-time automatic polyp detection system could improve the polyp detection rate (PDR) in the actual clinical environment. Methods: The selected patients underwent same-day, back-to-back colonoscopies in a random order, with either traditional colonoscopy or artificial intelligence (AI)-assisted colonoscopy performed first by different experienced endoscopists (> 3000 colonoscopies). The primary outcome was the PDR. The study was registered with clinicaltrials.gov (NCT047126265). Results: In this study, we randomized 150 patients. The AI system significantly increased the PDR (34.0% vs 38.7%, p < 0.001). In addition, AI-assisted colonoscopy increased the detection of polyps smaller than 6 mm (69 vs 91, p < 0.001), but no difference was found with regard to larger lesions. Conclusions: A real-time automatic polyp detection system can increase the PDR, primarily for diminutive polyps. However, a larger sample size is still needed in a follow-up study to further verify this conclusion. Trial Registration: clinicaltrials.gov Identifier: NCT047126265

Compiler-directed scratchpad memory data transfer optimization for multithreaded applications on a heterogeneous many-core architecture

The Journal of Supercomputing ◽

10.1007/s11227-021-03853-x ◽

2021 ◽

Author(s):

Xiaohan Tao ◽

Jianmin Pang ◽

Jinlong Xu ◽

Yu Zhu

Keyword(s):

Energy Consumption ◽

High Performance ◽

Scientific Computing ◽

Data Transfer ◽

Performance Model ◽

Experimental Result ◽

Transfer Model ◽

Scratchpad Memory ◽

On Chip ◽

Many Core

Abstract. The heterogeneous many-core architecture plays an important role in the fields of high-performance computing and scientific computing. It uses accelerator cores with on-chip memories to improve performance and reduce energy consumption. Scratchpad memory (SPM) is a kind of fast on-chip memory with lower energy consumption compared with a hardware cache. However, data transfer between SPM and off-chip memory can be managed only by a programmer or compiler. In this paper, we propose a compiler-directed multithreaded SPM data transfer model (MSDTM) to optimize the process of data transfer in a heterogeneous many-core architecture. We use compile-time analysis to classify data accesses, check dependences and determine the allocation of data transfer operations. We further present the data transfer performance model to derive the optimal granularity of data transfer and select the most profitable data transfer strategy. We implement the proposed MSDTM on the GCC compiler and evaluate it on Sunway TaihuLight with selected test cases from benchmarks and scientific computing applications. The experimental result shows that the proposed MSDTM improves the application execution time by 5.49× and achieves an energy saving of 5.16× on average.

I-DMAC: An Intelligent DMA Controller for Utilization - Aware Video Streaming used in AI Applications

10.54216/jcim.080203 ◽

2021 ◽

pp. 60-70

Author(s):

Piyush Kumar Shukla ◽

Prashant Kumar Shukla

Keyword(s):

Video Processing ◽

High Performance ◽

Data Transfer ◽

Direct Memory Access ◽

Large Data ◽

Video Frame ◽

Microprocessor System ◽

Bulk Data ◽

Xilinx Fpga ◽

Vhdl Code

Interpreting large data streams requires repeated high-performance transfers, which overload microprocessor systems on chip (SoCs). An effective direct memory access (DMA) controller performs bulk data transfers without involving the CPU. The DMA controller (DMAC) solves this by facilitating bulk data transfer and execution. In this work, we created an intelligent DMAC (I-DMAC) for accessing video processing data without using CPUs. The model includes a bus selection module, user control signal, status register, DMA-supported address, and AXI-PCI subsystems for improved video frame analysis. These modules are experimentally verified on a Xilinx FPGA SoC architecture using VHDL code simulation, and the results are compared to the E-DMAC model.

PhasorNet A High Performance Network Communications Architecture for Synchrophasor Data Transfer in Wide Area Monitoring, Protection and Control Applications

2007 iREP Symposium - Bulk Power System Dynamics and Control - VII. Revitalizing Operational Reliability ◽

10.1109/irep.2007.4410566 ◽

2007 ◽

Cited By ~ 2

Author(s):

K A Fahid ◽

Prasanth Gopalakrishnan ◽

Sushil Cherian

Keyword(s):

High Performance ◽

Data Transfer ◽

Wide Area ◽

Control Applications ◽

Wide Area Monitoring ◽

Network Communications ◽

Protection And Control ◽

And Control ◽

Area Monitoring

Genomic Medicine Frontier in Human Solid Tumors: Prospects and Challenges

Journal of Clinical Oncology ◽

10.1200/jco.2012.45.2268 ◽

2013 ◽

Vol 31 (15) ◽

pp. 1874-1884 ◽

Cited By ~ 86

Author(s):

Rodrigo Dienstmann ◽

Jordi Rodon ◽

Jordi Barretina ◽

Josep Tabernero

Keyword(s):

Solid Tumors ◽

Cancer Progression ◽

Cancer Genomics ◽

Genomic Medicine ◽

Genomic Data ◽

Cost Effective ◽

Assessment Model ◽

Historical View ◽

Systemic Treatments ◽

Human Solid Tumors

Recent discoveries of genomic alterations that underlie and promote the malignant phenotype, together with an expanded repertoire of targeted agents, have provided many opportunities to conduct hypothesis-driven clinical trials. The ability to profile each unique cancer for actionable aberrations by using high-throughput technologies in a cost-effective way provides unprecedented opportunities for using matched therapies in a selected patient population. The major challenges are to integrate and make biologic sense of the substantial genomic data derived from multiple platforms. We define two different approaches for the analysis, interpretation, and clinical applicability of genomic data: (1) the genomically stratified model originates from the “one test-one drug” paradigm and is currently being expanded with an upfront multicategorical approach following recent advances in multiplexed genotyping platforms; and (2) the comprehensive assessment model is based on whole-genome, -exome, and -transcriptome data and allows identification of novel drivers and subsequent therapies in the experimental setting. Tumor heterogeneity and evolution of the diverse populations of cancer cells during cancer progression, influenced by the effects of systemic treatments, will need to be addressed in the new scenario of early drug development. Logistical issues related to prescreening strategies and trial allocation, in addition to concerns in the economic and ethical domains, must be taken into consideration. Here we present a historical view of how increased understanding of cancer genomics has been translated to the clinic and discuss the prospects and challenges for further implementation of a personalized treatment strategy for human solid tumors.

BigData Express: Toward Predictable, Schedulable, and High-Performance Data Transfer

10.2172/1460784 ◽

2018 ◽

Author(s):

Wenji Wu

Keyword(s):

High Performance ◽

Data Transfer ◽

Performance Data
