Improvements to the Rice Genome Annotation Through Large-Scale Analysis of RNA-Seq and Proteomics Data Sets

2018 ◽  
Vol 18 (1) ◽  
pp. 86-98 ◽  
Author(s):  
Zhe Ren ◽  
Da Qi ◽  
Nina Pugh ◽  
Kai Li ◽  
Bo Wen ◽  
...  
2021 ◽  
Vol 13 (1) ◽  
Author(s):  
Chayaporn Suphavilai ◽  
Shumei Chia ◽  
Ankur Sharma ◽  
Lorna Tu ◽  
Rafael Peres Da Silva ◽  
...  

Abstract: While understanding molecular heterogeneity across patients underpins precision oncology, there is increasing appreciation for taking intra-tumor heterogeneity into account. Based on large-scale analysis of cancer omics datasets, we highlight the importance of intra-tumor transcriptomic heterogeneity (ITTH) for predicting clinical outcomes. Leveraging single-cell RNA-seq (scRNA-seq) with a recommender system (CaDRReS-Sc), we show that heterogeneous gene-expression signatures can predict drug response with high accuracy (80%). Using patient-proximal cell lines, we established the validity of CaDRReS-Sc’s monotherapy (Pearson r>0.6) and combinatorial predictions targeting clone-specific vulnerabilities (>10% improvement). Applying CaDRReS-Sc to rapidly expanding scRNA-seq compendiums can serve as an in silico screen to accelerate drug-repurposing studies. Availability: https://github.com/CSB5/CaDRReS-Sc.


2014 ◽  
Vol 2014 ◽  
pp. 1-7 ◽  
Author(s):  
Jonatan Taminau ◽  
Cosmin Lazar ◽  
Stijn Meganck ◽  
Ann Nowé

An increasing number of microarray gene expression data sets are available through public repositories. Their huge potential for new findings has yet to be unlocked by making them available for large-scale analysis. To do so, it is essential that independent studies designed for similar biological problems can be integrated, so that new insights can be obtained. These insights would remain undiscovered when analyzing the individual data sets, because the small number of biological samples used per experiment is a well-known bottleneck in genomic analysis. Increasing the number of samples increases the statistical power, so that more general and reliable conclusions can be drawn. In this work, two different approaches for conducting large-scale analysis of microarray gene expression data (meta-analysis and data merging) are compared in the context of the identification of cancer-related biomarkers, by analyzing six independent lung cancer studies. Within this study, we investigate the hypothesis that analyzing large cohorts of samples, obtained by merging independent data sets designed to study the same biological problem, results in lower false discovery rates than analyzing the same data sets within a more conservative meta-analysis approach.
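As a rough illustration of the meta-analysis side of this comparison, the sketch below combines per-study p-values for each gene and then controls the false discovery rate across genes. Fisher's method and the Benjamini-Hochberg procedure are assumptions chosen for illustration; the abstract does not state which combination or correction methods the authors used, and the gene names and p-values are toy data.

```python
import math

def fisher_combine(pvalues):
    """Combine independent p-values (each in (0, 1]) with Fisher's method."""
    k = len(pvalues)
    # Fisher's statistic is X = -2 * sum(ln p); we keep X / 2 for convenience.
    half_stat = -sum(math.log(p) for p in pvalues)
    # Survival function of a chi-square with 2k degrees of freedom
    # (closed form, valid because the degrees of freedom are even).
    return math.exp(-half_stat) * sum(
        half_stat ** i / math.factorial(i) for i in range(k)
    )

def benjamini_hochberg(pvalues, alpha=0.05):
    """Return the indices of hypotheses rejected at FDR level alpha."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        # Keep the largest rank whose p-value is under its step-up threshold.
        if pvalues[idx] <= alpha * rank / m:
            cutoff = rank
    return sorted(order[:cutoff])

# Toy input (hypothetical): one p-value per study for each gene.
study_pvalues = {
    "GENE_A": [0.01, 0.04, 0.03],
    "GENE_B": [0.60, 0.40, 0.70],
    "GENE_C": [0.20, 0.01, 0.05],
}

genes = list(study_pvalues)
combined = [fisher_combine(study_pvalues[g]) for g in genes]
significant = [genes[i] for i in benjamini_hochberg(combined, alpha=0.05)]
print(significant)
```

In the data-merging approach, by contrast, the samples of all studies would first be pooled (typically after batch-effect correction) and each gene tested once on the merged cohort, with the same false discovery rate control applied to the resulting p-values.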


2020 ◽  
Author(s):  
Chayaporn Suphavilai ◽  
Shumei Chia ◽  
Ankur Sharma ◽  
Lorna Tu ◽  
Rafael Peres Da Silva ◽  
...  

Summary: While understanding heterogeneity in molecular signatures across patients underpins precision oncology, there is increasing appreciation for taking intra-tumor heterogeneity into account. Single-cell RNA-seq (scRNA-seq) technologies have facilitated investigations into the role of intra-tumor transcriptomic heterogeneity (ITTH) in tumor biology and evolution, but their application to in silico models of drug response has not been explored. Based on large-scale analysis of cancer omics datasets, we highlight the utility of ITTH for predicting clinical outcomes. We then show that heterogeneous gene expression signatures obtained from scRNA-seq data can be accurately analyzed (80%) in a recommender system framework (CaDRReS-Sc) for in silico drug response prediction. Patient-derived cell lines capturing transcriptomic heterogeneity from primary and metastatic tumors were used as in vitro proxies for validating monotherapy predictions (Pearson r>0.6), as well as optimal drug combinations to target different subclonal populations (>10% improvement). Applying CaDRReS-Sc to the increasing number of publicly available tumor scRNA-seq datasets can serve as an in silico screen for further in vitro and in vivo drug repurposing studies.

Highlights: Large-scale analysis to establish the impact of transcriptomic heterogeneity within tumors on clinical outcomes; Calibrated recommender system for drug response prediction based on single-cell RNA-seq data (CaDRReS-Sc); Prediction of drug response in patient-derived cell lines with transcriptomic heterogeneity; In silico identification of drug combinations that work based on clonal vulnerabilities.


2019 ◽  
Vol 18 (5) ◽  
pp. 610-621
Author(s):  
Karin Priem ◽  
Lynn Fendler

This article historicizes “rigor,” “discipline,” and “systematic” as inventions of a certain rational spirit of Enlightenment that was radicalized during the 19th century. These terms acquired temporary value in a transition during the 19th century when a culture of research was established within a modern episteme. Beginning in the 20th century, this development was perceived as problematic, triggering criticism from philosophy and the arts, and even within the sciences. “Discipline,” “rigor,” and “systematic” have changed meanings over time, and recent contributions from digital humanities are promising for a renewed critical debate about rigor in research. Both digital humanities and quantitative research deal with big data sets aimed at providing a large-scale analysis. However, unlike most quantitative research, digital humanities explore uncertainties as their main focus. Attention to the human-machine collaboration has led to more expansive thinking in scientific research. Digital humanities go further by advancing a metaperspective that deals with the material hermeneutics of data accumulation itself.


2018 ◽  
Author(s):  
Zhe Ren ◽  
Da Qi ◽  
Nina Pugh ◽  
Kai Li ◽  
Bo Wen ◽  
...  

Abstract: Rice (Oryza sativa) is one of the most important crops worldwide. The genome has been available for over 10 years and has undergone several rounds of annotation. We created a comprehensive database of transcripts from 29 public RNA sequencing datasets, officially predicted genes from Ensembl Plants, and common contaminants, in which to search for protein-level evidence. We re-analysed nine publicly accessible rice proteomics datasets. In total, we identified 420K peptide spectrum matches from 47K peptides and 8,187 protein groups. Of these, 4,168 peptides were initially classed as putative novel peptides (not matching official genes). Following a strict filtration scheme to rule out other possible explanations, we discovered 1,584 high-confidence novel peptides. The novel peptides were clustered into 692 genomic loci where our results suggest annotation improvements. Of the novel peptides, 80% had an ortholog match in the curated protein sequence set from at least one other plant species. For the peptides clustering in intergenic regions (and thus potentially new genes), 101 loci were identified, of which 43 had a high-confidence hit for a protein domain. Our results can be displayed as tracks on the Ensembl genome or other browsers supporting Track Hubs, to support re-annotation of the rice genome.
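The first classification step described above (flagging peptides that do not match any official gene product) can be sketched as a simple substring check. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names and sequences are hypothetical toy data, and treating leucine and isoleucine as interchangeable is an assumption added here because the two residues have identical mass and cannot be told apart by standard mass spectrometry. The strict filtration scheme that follows in the abstract is not specified and is not modeled.

```python
def normalize(sequence):
    """Upper-case a sequence and map isoleucine to leucine (assumption: I/L are treated as equal)."""
    return sequence.upper().replace("I", "L")

def putative_novel(peptides, annotated_proteins):
    """Return the peptides that occur in none of the annotated protein sequences."""
    reference = [normalize(protein) for protein in annotated_proteins]
    return [
        peptide
        for peptide in peptides
        if not any(normalize(peptide) in protein for protein in reference)
    ]

# Toy data (hypothetical sequences, for illustration only).
annotated = ["MKTAYIAKQR", "GGSLLVDE"]
identified = ["TAYLAK", "QQWERT", "SLLV"]

novel = putative_novel(identified, annotated)
print(novel)
```

A linear scan like this is only practical for small inputs; at the scale reported in the abstract, an indexed search would be the more realistic choice.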

