sequencing data
Recently Published Documents





2022 ◽  
Vol 3 (1) ◽  
pp. 101036
Adelina Rabenius ◽  
Sajitha Chandrakumaran ◽  
Lea Sistonen ◽  
Anniina Vihervaara

2022 ◽  
Vol 11 ◽  
Yingyun Guo ◽  
Yuan Li ◽  
Jiao Li ◽  
Weiping Tao ◽  
Weiguo Dong

Low-grade gliomas (LGG) are heterogeneous, and the current predictive models for LGG are either unsatisfactory or not user-friendly. The objective of this study was to establish a nomogram based on methylation-driven genes, combined with clinicopathological parameters for predicting prognosis in LGG. Differential expression, methylation correlation, and survival analysis were performed in 516 LGG patients using RNA and methylation sequencing data, with accompanying clinicopathological parameters from The Cancer Genome Atlas. LASSO regression was further applied to select optimal prognosis-related genes. The final prognostic nomogram was implemented together with prognostic clinicopathological parameters. The predictive efficiency of the nomogram was internally validated in training and testing groups, and externally validated in the Chinese Glioma Genome Atlas database. Three DNA methylation-driven genes, ARL9, CMYA5, and STEAP3, were identified as independent prognostic factors. Together with IDH1 mutation status, age, and sex, the final prognostic nomogram achieved the highest AUC value of 0.930, and demonstrated stable consistency in both internal and external validations. The prognostic nomogram could predict personal survival probabilities for patients with LGG, and serve as a user-friendly tool for prognostic evaluation, optimizing therapeutic regimes, and managing LGG patients.

2022 ◽  
Masahiro Nakano ◽  
Mineto Ota ◽  
Yusuke Takeshima ◽  
Yukiko Iwasaki ◽  
Hiroaki Hatano ◽  

Systemic lupus erythematosus (SLE) is a complex and heterogeneous autoimmune disease involving multiple immune cells. A major hurdle to the elucidation of SLE pathogenesis is our limited understanding of dysregulated gene expression linked to various clinical statuses with a high cellular resolution. Here, we conducted a large-scale transcriptome study with 6,386 RNA sequencing data covering 27 immune cell types from 159 SLE and 89 healthy donors. We first profiled two distinct cell-type-specific transcriptomic signatures: disease-state and disease-activity signatures, reflecting disease establishment and exacerbation, respectively. We next identified candidate biological processes unique to each signature. This study suggested the clinical value of disease-activity signatures, which were associated with organ involvement and responses to therapeutic agents such as belimumab. However, disease-activity signatures were less enriched around SLE risk variants than disease-state signatures, suggesting that the genetic studies to date may not well capture clinically vital biology in SLE. Together, we identified comprehensive gene signatures of SLE, which will provide essential foundations for future genomic, genetic, and clinical studies.

Marine Drugs ◽  
2022 ◽  
Vol 20 (1) ◽  
pp. 74
Kenneth Sandoval ◽  
Grace P. McCormack

Actinoporins are proteinaceous toxins known for their ability to bind to and create pores in cellular membranes. This quality has generated interest in their potential use as new tools, such as therapeutic immunotoxins. Isolated historically from sea anemones, genes encoding for similar actinoporin-like proteins have since been found in a small number of other animal phyla. Sequencing and de novo assembly of Irish Haliclona transcriptomes indicated that sponges also possess similar genes. An exhaustive analysis of publicly available sequencing data from other sponges showed that this is a potentially widespread feature of the Porifera. While many sponge proteins possess a sequence similarity of 27.70–59.06% to actinoporins, they show consistency in predicted structure. One gene copy from H. indistincta has significant sequence similarity to sea anemone actinoporins and possesses conserved residues associated with the fundamental roles of sphingomyelin recognition, membrane attachment, oligomerization, and pore formation, indicating that it may be an actinoporin. Phylogenetic analyses indicate frequent gene duplication, no distinct clade for sponge-derived proteins, and a stronger signal towards actinoporins than similar proteins from other phyla. Overall, this study provides evidence that a diverse array of Porifera represents a novel source of actinoporin-like proteins which may have biotechnological and pharmaceutical applications.

2022 ◽  
Christopher Baker ◽  
Dhruv Patel ◽  
Benjamin J. Cole ◽  
Lindsey G. Ching ◽  
Oliver Dautermann ◽  

Climate change is affecting rainfall patterns globally, necessitating the improvement of drought tolerance in crops. Sorghum bicolor is a drought-tolerant cereal capable of producing high yields under water scarcity conditions. Functional stay-green sorghum genotypes can maintain green leaf area and efficient grain filling in terminal post-flowering water deprivation, a period of ~10 weeks. To obtain molecular insights into these characteristics, two drought-tolerant genotypes, BTx642 and RTx430, were grown in control and terminal post-flowering drought field plots in the Central Valley of California. Photosynthetic, photoprotective, water dynamics, and biomass traits were quantified and correlated with metabolomic data collected from leaves, stems, and roots at multiple timepoints during drought. Physiological and metabolomic data were then compared to longitudinal RNA sequencing data collected from these two genotypes. The metabolic response to drought highlights the uniqueness of the post-flowering drought acclimation relative to pre-flowering drought. The functional stay-green genotype BTx642 specifically induced photoprotective responses in post-flowering drought, supporting a putative role for photoprotection in the molecular basis of the functional stay-green trait. Specific genes are highlighted that may contribute to post-flowering drought tolerance and that can be targeted in crops to maximize yields under limited water input conditions.

2022 ◽  
Jordan M Eizenga ◽  
Benedict Paten

Modern genomic sequencing data is trending toward longer sequences with higher accuracy. Many analyses using these data will center on alignments, but classical exact alignment algorithms are infeasible for long sequences. The recently proposed WFA algorithm demonstrated how to perform exact alignment for long, similar sequences in O(sN) time and O(s^2) memory, where s is a score that is low for similar sequences (Marco-Sola et al., 2021). However, this algorithm still has infeasible memory requirements for longer sequences. Also, it uses an alternate scoring system that is unfamiliar to many bioinformaticians. We describe variants of WFA that improve its asymptotic memory use from O(s^2) to O(s^(3/2)) and its asymptotic run time from O(sN) to O(s^2 + N). We expect the reduction in memory use to be particularly impactful, as it makes it practical to perform highly multithreaded megabase-scale exact alignments in common compute environments. In addition, we show how to fold WFA's alternate scoring into the broader literature on alignment scores.
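The wavefront idea behind this abstract can be illustrated with a deliberately simplified sketch. It is not the authors' method or any of their variants: it assumes unit-cost edit distance instead of the scoring used in the paper, and the function name `wfa_edit_distance` is hypothetical. For each score s it stores only the furthest-reaching position on each diagonal, so the work grows with the score and not with the full N x N matrix. Because it keeps only the latest wavefront, it returns the score but cannot reconstruct the alignment.

```python
def wfa_edit_distance(a: str, b: str) -> int:
    """Unit-cost edit distance computed wavefront by wavefront (illustrative sketch)."""
    n, m = len(a), len(b)

    def extend(i: int, k: int) -> int:
        # Slide along diagonal k (k = i - j) while characters match; matches are free.
        j = i - k
        while i < n and j < m and a[i] == b[j]:
            i += 1
            j += 1
        return i

    target = n - m                # diagonal that contains the end cell (n, m)
    front = {0: extend(0, 0)}     # furthest offset i reached on each diagonal
    score = 0
    while front.get(target, -1) < n:
        score += 1
        nxt = {}
        for k in range(max(-m, -score), min(n, score) + 1):
            cands = []
            if k in front:
                cands.append(front[k])          # already reachable with a lower score
                cands.append(front[k] + 1)      # mismatch: advance in both sequences
            if k - 1 in front:
                cands.append(front[k - 1] + 1)  # consume one character of a only
            if k + 1 in front:
                cands.append(front[k + 1])      # consume one character of b only
            # Keep only candidates that stay inside the alignment matrix.
            valid = [i for i in cands if 0 <= i <= n and 0 <= i - k <= m]
            if valid:
                nxt[k] = extend(max(valid), k)
        front = nxt
    return score


print(wfa_edit_distance("kitten", "sitting"))
```

Each wavefront holds at most 2s + 1 diagonals, which is why similar sequences (small s) are cheap to align; retaining every wavefront for traceback is what leads to the quadratic memory term the abstract sets out to reduce.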

2022 ◽  
Vol 23 (1) ◽  
Ludwig Mann ◽  
Kathrin M. Seibt ◽  
Beatrice Weber ◽  
Tony Heitkam

Background: Extrachromosomal circular DNAs (eccDNAs) are ring-like DNA structures physically separated from the chromosomes, ranging in size from 100 bp to several megabase pairs. Apart from carrying tandemly repeated DNA, eccDNAs may also harbor extra copies of genes or recently activated transposable elements. As eccDNAs occur in all eukaryotes investigated so far and likely play roles in stress, cancer, and aging, they have been prime targets in recent research, although their investigation has been limited by the scarcity of computational tools. Results: Here, we present the ECCsplorer, a bioinformatics pipeline to detect eccDNAs in any kind of organism or tissue using next-generation sequencing techniques. Following Illumina sequencing of amplified circular DNA (circSeq), the ECCsplorer enables an easy and automated discovery of eccDNA candidates. The data analysis encompasses two major procedures: first, read mapping to the reference genome allows the detection of informative read distributions including high coverage, discordant mapping, and split reads. Second, reference-free comparison of read clusters from amplified eccDNA against control sample data reveals specifically enriched DNA circles. Both software parts can be run separately or jointly, depending on the individual aim or data availability. To illustrate the wide applicability of our approach, we analyzed semi-artificial and published circSeq data from the model organisms Homo sapiens and Arabidopsis thaliana, and generated circSeq reads from the non-model crop plant Beta vulgaris. We clearly identified eccDNA candidates from all datasets, with and without reference genomes. The ECCsplorer pipeline specifically detected mitochondrial mini-circles and retrotransposon activation, showcasing the ECCsplorer’s sensitivity and specificity. Conclusion: The ECCsplorer (available online at is a bioinformatics pipeline to detect eccDNAs in any kind of organism or tissue using next-generation sequencing data. The derived eccDNA targets are valuable for a wide range of downstream investigations, from analysis of cancer-related eccDNAs through organelle genomics to identification of active transposable elements.

2022 ◽  
Sofya Lipnitskaya ◽  
Yang Shen ◽  
Stefan Legewie ◽  
Holger Klein ◽  
Kolja Becker

Background: Recent studies in the area of transcriptomics performed on single-cell and population levels reveal noticeable variability in gene expression measurements provided by different RNA sequencing technologies. Due to the increased noise and complexity of single-cell RNA-Seq (scRNA-Seq) data compared with bulk experiments, there is a substantial number of variably expressed genes and so-called dropouts, challenging the subsequent computational analysis and potentially leading to false positive discoveries. In order to investigate factors affecting technical variability between RNA sequencing experiments of different technologies, we performed a systematic assessment of single-cell and bulk RNA-Seq data that have undergone the same pre-processing and sample preparation procedures. Results: Our analysis indicates that variability between gene expression measurements as well as dropout events are not exclusively caused by biological variability, low expression levels, or random variation. Furthermore, we propose FAVSeq, a machine learning-assisted pipeline for detection of factors contributing to gene expression variability in matched RNA-Seq data provided by two technologies. Based on the analysis of the matched bulk and single-cell dataset, we found the 3'-UTR and transcript lengths to be the most relevant effectors of the observed variation between RNA-Seq experiments, while the same factors together with cellular compartments were shown to be associated with dropouts. Conclusions: Here, we investigated the sources of variation in RNA-Seq profiles of matched single-cell and bulk experiments. In addition, we proposed the FAVSeq pipeline for analyzing multimodal RNA sequencing data, which allowed us to identify factors affecting quantitative differences in gene expression measurements as well as the presence of dropouts. The derived knowledge can be used to improve the interpretation of RNA-Seq data and to identify genes that can be affected by assay-based deviations. Source code is available under the MIT license at
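To make the notion of a dropout concrete, the following minimal sketch computes a per-gene dropout rate from matched data. It is not the FAVSeq implementation: the function name, the input layout, and the working definition (a zero single-cell count for a gene that is detected in the matched bulk sample) are all assumptions made for illustration, and the toy counts are invented.

```python
def dropout_rates(sc_counts, bulk_counts):
    """Fraction of cells with a zero count, per gene detected in bulk.

    sc_counts:   dict mapping gene -> list of per-cell read counts (hypothetical layout)
    bulk_counts: dict mapping gene -> read count in the matched bulk sample
    """
    rates = {}
    for gene, cells in sc_counts.items():
        # Skip genes with no cells or no bulk signal: a zero there is not a dropout.
        if not cells or bulk_counts.get(gene, 0) <= 0:
            continue
        zeros = sum(1 for count in cells if count == 0)
        rates[gene] = zeros / len(cells)
    return rates


# Invented toy data, for illustration only.
single_cell = {"geneA": [0, 0, 3, 5], "geneB": [2, 1, 4, 7], "geneC": [0, 0, 0, 0]}
bulk = {"geneA": 120, "geneB": 300, "geneC": 0}
print(dropout_rates(single_cell, bulk))
```

Per-gene rates like these could then be related to gene features such as 3'-UTR or transcript length, which the abstract reports as associated with dropouts.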

NAR Cancer ◽  
2022 ◽  
Vol 4 (1) ◽  
Eirik Høye ◽  
Bastian Fromm ◽  
Paul H M Böttger ◽  
Diana Domanska ◽  
Annette Torgunrud ◽  

Although microRNAs (miRNAs) contribute to all hallmarks of cancer, miRNA dysregulation in metastasis remains poorly understood. The aim of this work was to reliably identify miRNAs associated with metastatic progression of colorectal cancer (CRC) using novel and previously published next-generation sequencing (NGS) datasets generated from 268 samples of primary (pCRC) and metastatic CRC (mCRC; liver, lung and peritoneal metastases) and tumor adjacent tissues. Differential expression analysis was performed using a meticulous bioinformatics pipeline, including only bona fide miRNAs, and utilizing miRNA-tailored quality control and processing. Five miRNAs were identified as up-regulated at multiple metastatic sites: Mir-210_3p, Mir-191_5p, Mir-8-P1b_3p [mir-141–3p], Mir-1307_5p and Mir-155_5p. Several have previously been implicated in metastasis through involvement in epithelial-to-mesenchymal transition and hypoxia, while other identified miRNAs represent novel findings. The use of a publicly available pipeline facilitates reproducibility and allows new datasets to be added as they become available. The set of miRNAs identified here provides a reliable starting point for further research into the role of miRNAs in metastatic progression.

Cancers ◽  
2022 ◽  
Vol 14 (2) ◽  
pp. 404
Yuri Belotti ◽  
Elaine Hsuen Lim ◽  
Chwee Teck Lim

Ovarian cancer is the eighth leading cause of cancer-related death among women worldwide. The most common form is high-grade serous ovarian carcinoma (HGSOC). No further improvements in the 5-year overall survival have been seen over the last 40 years since the adoption of platinum- and taxane-based chemotherapy. Hence, a better understanding of the mechanisms governing this aggressive phenotype would help identify better therapeutic strategies. Recent research linked onset, progression, and response to treatment with dysregulated components of the tumor microenvironment (TME) in many types of cancer. In this study, using bioinformatic approaches, we identified a 19-gene TME-related HGSOC prognostic genetic panel (PLXNB2, HMCN2, NDNF, NTN1, TGFBI, CHAD, CLEC5A, PLXNA1, CST9, LOXL4, MMP17, PI3, PRSS1, SERPINA10, TLL1, CBLN2, IL26, NRG4, and WNT9A) by assessing the RNA sequencing data of 342 tumors available in the TCGA database. Using machine learning, we found that specific patterns of infiltrating immune cells characterized each risk group. Furthermore, we demonstrated the predictive potential of our risk score across different platforms and its improved prognostic performance compared with other gene panels.
