The need for statistical contributions to bioinformatics at scale, with illustration to mass spectrometry

2017 ◽  
Vol 17 (4-5) ◽  
pp. 290-299
Author(s):  
Andrew W Dowsey

In their article, Morris and Baladandayuthapani clearly evidence the influence of statisticians in recent methodological advances throughout the bioinformatics pipeline and advocate for the expansion of this role. The latest acquisition platforms, such as next generation sequencing (genomics/transcriptomics) and hyphenated mass spectrometry (proteomics/metabolomics), output raw datasets in the order of gigabytes; it is not unusual to acquire a terabyte or more of data per study. The increasing computational burden this brings is a further impediment against the use of statistically rigorous methodology in the pre-processing stages of the bioinformatics pipeline. In this discussion I describe the mass spectrometry pipeline and use it as an example to show that beneath this challenge lies a two-fold opportunity: (a) Biological complexity and dynamic range is still well beyond what is captured by current processing methodology; hence, potential biomarkers and mechanistic insights are consistently missed; (b) Statistical science could play a larger role in optimizing the acquisition process itself. Data rates will continue to increase as routine clinical omics analysis moves to large-scale facilities with systematic, standardized protocols. Key inferential gains will be achieved by borrowing strength across the sum total of all analyzed studies, a task best underpinned by appropriate statistical modelling.

2021 ◽  
Author(s):  
Longwei Liu ◽  
Praopim Limsakul ◽  
Xianhui Meng ◽  
Yan Huang ◽  
Reed E. S. Harrison ◽  
...  

Genetically encoded biosensors based on FRET have been widely used to dynamically monitor the activity of protein tyrosine kinases (PTKs) in living cells with high spatiotemporal resolution. However, limitations in the sensitivity, specificity, and dynamic range of FRET biosensors have hindered their broader application. Here, we introduced a systematic platform, FRET-Seq, which integrates high-throughput FRET sorting and next-generation sequencing to identify FRET biosensors with better performance from large-scale libraries directly in mammalian cells.


2018 ◽  
Vol 71 (10) ◽  
pp. 895-899 ◽  
Author(s):  
Georgina L Ryland ◽  
Kate Jones ◽  
Melody Chin ◽  
John Markham ◽  
Elle Aydogan ◽  
...  

Aims: Multiple myeloma is a genomically complex haematological malignancy with many genomic alterations recognised as important in diagnosis, prognosis and therapeutic decision making. Here, we provide a summary of genomic findings identified through routine diagnostic next-generation sequencing at our centre. Methods: A cohort of 86 patients with multiple myeloma underwent diagnostic sequencing using a custom hybridisation-based panel targeting 104 genes. Sequence variants, genome-wide copy number changes and structural rearrangements were detected using an in-house-developed bioinformatics pipeline. Results: At least one mutation was found in 69 (80%) patients. Frequently mutated genes included TP53 (36%), KRAS (22.1%), NRAS (15.1%), FAM46C/DIS3 (8.1%) and TET2/FGFR3 (5.8%), including multiple mutations not previously described in myeloma. Importantly, we observed TP53 mutations in the absence of a 17p deletion in 8% of the cohort, highlighting the need for sequencing-based assessment in addition to cytogenetics to identify these high-risk patients. Multiple novel copy number changes and immunoglobulin heavy chain translocations are also discussed. Conclusions: Our results demonstrate that many clinically relevant genomic findings remain in multiple myeloma which have not yet been identified through large-scale sequencing efforts, and provide important mechanistic insights into plasma cell pathobiology.


2021 ◽  
Vol 72 (1) ◽  
Author(s):  
Dapeng Li ◽  
Emmanuel Gaquerel

The remarkable diversity of specialized metabolites produced by plants has inspired several decades of research and nucleated a long list of theories to guide empirical ecological studies. However, analytical constraints and the lack of untargeted processing workflows have long precluded comprehensive metabolite profiling and, consequently, the collection of the critical currencies to test theory predictions for the ecological functions of plant metabolic diversity. Developments in mass spectrometry (MS) metabolomics have revolutionized the large-scale inventory and annotation of chemicals from biospecimens. Hence, the next generation of MS metabolomics propelled by new bioinformatics developments provides a long-awaited framework to revisit metabolism-centered ecological questions, much like the advances in next-generation sequencing of the last two decades impacted all research horizons in genomics. Here, we review advances in plant (computational) metabolomics to foster hypothesis formulation from complex metabolome data. Additionally, we reflect on how next-generation metabolomics could reinvigorate the testing of long-standing theories on plant metabolic diversity. Expected final online publication date for the Annual Review of Plant Biology, Volume 72 is May 2021. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.


2014 ◽  
Vol 53 (2) ◽  
pp. R93-R101 ◽  
Author(s):  
Petter Vikman ◽  
Joao Fadista ◽  
Nikolay Oskolkov

Previous global RNA analysis was restricted to known transcripts in species with a defined transcriptome. Next-generation sequencing has transformed transcriptomics by making it possible to analyse expressed genes at exon-level resolution from any tissue in any species without any a priori knowledge of which genes are being expressed, their splice patterns or their nucleotide sequence. In addition, RNA sequencing is more sensitive than microarrays and has a larger dynamic range, and it also allows for investigation of imprinting and allele-specific expression. This can be done at a cost that competes with that of a microarray, making RNA sequencing a technique available to most researchers. RNA sequencing has therefore recently become the state of the art for large-scale RNA investigations and has to a large extent replaced microarrays. The only drawback is the large volume of data produced, which together with the complexity of the data can make a researcher spend far more time on analysis than on performing the actual experiment.


2019 ◽  
Author(s):  
Ingo Strenge ◽  
Carsten Engelhard

The article demonstrates the importance of using a suitable approach to compensate for dead-time-related count losses (a measurement artefact) whenever short but potentially strong transient signals are to be analysed using inductively coupled plasma mass spectrometry (ICP-MS). The findings strongly support the theory that inadequate time resolution, and therefore insufficient compensation for these count losses, is one of the main reasons for the size underestimation observed when analysing inorganic nanoparticles using ICP-MS, a topic that is still controversially discussed.
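The link between time resolution and dead-time compensation can be illustrated with a minimal sketch. It assumes the textbook non-paralyzable detector model, n = m / (1 - m·τ), which the abstract does not confirm as the authors' approach; the function names, the 50 ns dead time, and the count values are all hypothetical.

```python
def correct_dead_time(observed_cps: float, dead_time_s: float) -> float:
    """Return the estimated true count rate under the non-paralyzable model.

    Assumed model (not taken from the article): n = m / (1 - m * tau),
    where m is the observed rate in counts/s and tau the dead time in s.
    """
    loss_fraction = observed_cps * dead_time_s
    if not 0.0 <= loss_fraction < 1.0:
        raise ValueError("observed rate is incompatible with the assumed dead time")
    return observed_cps / (1.0 - loss_fraction)


def corrected_counts(counts: float, dwell_time_s: float, dead_time_s: float) -> float:
    """Correct the counts recorded in one time bin of length dwell_time_s."""
    rate = counts / dwell_time_s
    return correct_dead_time(rate, dead_time_s) * dwell_time_s


# Hypothetical example: the same 100 recorded counts from a short burst,
# seen once in a 10 microsecond bin and once averaged over a 1 ms bin.
TAU = 50e-9  # assumed detector dead time (50 ns)
fine = corrected_counts(100, 10e-6, TAU)    # burst resolved in time
coarse = corrected_counts(100, 1e-3, TAU)   # burst smeared over a long bin
```

Under these assumptions the finely resolved bin receives a much larger correction than the coarse one, because the correction depends on the instantaneous rate. This is consistent with the abstract's point that inadequate time resolution leads to insufficient compensation for count losses.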


2020 ◽  
Vol 86 (7) ◽  
pp. 12-19
Author(s):  
I. V. Plyushchenko ◽  
D. G. Shakhmatov ◽  
I. A. Rodin

The rapid development of statistical data processing, computing capabilities, chromatography-mass spectrometry, and omics technologies (technologies based on the achievements of genomics, transcriptomics, proteomics, and metabolomics) in recent decades has not led to the formation of a unified protocol for untargeted profiling. Systematic errors reduce the reproducibility and reliability of the results obtained, and at the same time hinder the consolidation and analysis of data gained in large-scale multi-day experiments. We propose an algorithm for conducting omics profiling to identify potential markers in samples of complex composition and present a case study of urine samples obtained from different clinical groups of patients. Profiling was carried out by liquid chromatography-mass spectrometry. The markers were selected using methods of multivariate analysis, including machine learning and feature selection. The approach was tested on an independent dataset by clustering and projection onto principal components.


2019 ◽  
Vol 25 (31) ◽  
pp. 3350-3357 ◽  
Author(s):  
Pooja Tripathi ◽  
Jyotsna Singh ◽  
Jonathan A. Lal ◽  
Vijay Tripathi

Background: With the advent of high-throughput next-generation sequencing (NGS), biological research in drug discovery has been directed towards the oncology and infectious disease therapeutic areas, with extensive use in biopharmaceutical development and vaccine production. Method: In this review, we address the basic background of NGS technologies and the potential applications of NGS in drug design. Our purpose is also to provide a brief introduction to various next-generation sequencing techniques. Discussions: The high-throughput methods execute Large-scale Unbiased Sequencing (LUS), which comprises Massively Parallel Sequencing (MPS) or NGS technologies. These are related terms that describe a DNA sequencing technology which has revolutionized genomic research. Using NGS, an entire human genome can be sequenced within a single day. Conclusion: Analysis of NGS data unravels important clues in the quest for the treatment of various life-threatening diseases and other scientific problems related to human welfare.


2021 ◽  
Vol 20 (2) ◽  
pp. 1280-1295
Author(s):  
Aleksandr Gaun ◽  
Kaitlyn N. Lewis Hardell ◽  
Niclas Olsson ◽  
Jonathon J. O’Brien ◽  
Sudha Gollapudi ◽  
...  
