Analysis of Aneuploidy Spectrum From Whole-Genome Sequencing Provides Rapid Assessment of Clonal Variation Within Established Cancer Cell Lines

2021 ◽  
Vol 20 ◽  
pp. 117693512110492
Author(s):  
Ahmed Ibrahim Samir Khalil ◽  
Anupam Chattopadhyay ◽  
Amartya Sanyal

Background: The revolution in next-generation sequencing (NGS) technology has allowed easy access to and sharing of high-throughput sequencing datasets of cancer cell lines and their integrative analyses. However, long-term passaging and culture conditions introduce high levels of genomic and phenotypic diversity in established cell lines, resulting in strain differences. Thus, clonal variation in cultured cell lines with respect to the reference standard is a major barrier in systems biology data analyses. Therefore, there is a pressing need for a fast and entry-level assessment of clonal variations within cell lines using their high-throughput sequencing data. Results: We developed a Python-based software tool, AStra, for de novo estimation of genome-wide segmental aneuploidy to measure and visually interpret strain-level similarities or differences of cancer cell lines from whole-genome sequencing (WGS). We demonstrated that the aneuploidy spectrum can capture the genetic variations in 27 strains of the MCF7 breast cancer cell line collected from different laboratories. Performance evaluation of AStra using several cancer sequencing datasets revealed that cancer cell lines exhibit distinct aneuploidy spectra that reflect their previously reported karyotypic observations. Similarly, AStra successfully identified large-scale DNA copy number variations (CNVs) artificially introduced in simulated WGS datasets. Conclusions: AStra provides an analytical and visualization platform for rapid and easy comparison between different strains or between cell lines based on their aneuploidy spectra, using only the raw BAM files representing mapped reads. We recommend AStra for rapid first-pass quality assessment of cancer cell lines before integrating scientific datasets that employ deep sequencing. AStra is open-source software and is available at https://github.com/AISKhalil/AStra.

2021 ◽  
Author(s):  
Niantao Deng ◽  
Andre Minoche ◽  
Kate Harvey ◽  
Andrei Goga ◽  
Alex Swarbrick

Abstract Background: Breast cancer cell lines (BCCLs) and patient-derived xenografts (PDX) are the most frequently used models in breast cancer research. Despite their widespread usage, genome sequencing of these models is incomplete, with previous studies only focusing on targeted gene panels, whole exome or shallow whole genome sequencing. Deep whole genome sequencing is the most sensitive and accurate method to detect single nucleotide variants and indels, gene copy number and structural events such as gene fusions. Results: Here we describe deep whole genome sequencing (WGS) of commonly used BCCL and PDX models using the Illumina X10 platform with an average ~60x coverage. We identify novel genomic alterations, including point mutations and genomic rearrangements at base-pair resolution, compared to previously available sequencing data. Through integrative analysis with publicly available functional screening data, we annotate new genomic features likely to be of biological significance. CSMD1, previously identified as a tumor suppressor gene in various cancer types, including head and neck, lung and breast cancers, has been identified with deletion in 50% of our PDX models, suggesting an important role in aggressive breast cancers. Conclusions: Our WGS data provides a comprehensive genome sequencing resource of these models.


2019 ◽  
Vol 9 (1) ◽  
Author(s):  
Kanika Arora ◽  
Minita Shah ◽  
Molly Johnson ◽  
Rashesh Sanghvi ◽  
Jennifer Shelton ◽  
...  

Abstract: To test the performance of a new sequencing platform, develop an updated somatic calling pipeline and establish a reference for future benchmarking experiments, we performed whole-genome sequencing of 3 common cancer cell lines (COLO-829, HCC-1143 and HCC-1187) along with their matched normal cell lines to great sequencing depths (up to 278x coverage) on both Illumina HiSeqX and NovaSeq sequencing instruments. Somatic calling was generally consistent between the two platforms despite minor differences at the read level. We designed and implemented a novel pipeline for the analysis of tumor-normal samples, using multiple variant callers. We show that, coupled with a high-confidence filtering strategy, the use of a combination of tools improves the accuracy of somatic variant calling. We also demonstrate the utility of the dataset by creating an artificial purity ladder to evaluate the somatic pipeline and benchmark methods for estimating purity and ploidy from tumor-normal pairs. The data and results of the pipeline are made accessible to the cancer genomics community.


2021 ◽  
Author(s):  
Niantao Deng ◽  
Andre Minoche ◽  
Kate Harvey ◽  
Meng Li ◽  
Juliane Winkler ◽  
...  

Abstract Background: Breast cancer cell lines (BCCLs) and patient-derived xenografts (PDX) are the most frequently used models in breast cancer research. Despite their widespread usage, genome sequencing of these models is incomplete, with previous studies only focusing on targeted gene panels, whole exome or shallow whole genome sequencing. Deep whole genome sequencing is the most sensitive and accurate method to detect single nucleotide variants and indels, gene copy number and structural events such as gene fusions. Results: Here we describe deep whole genome sequencing (WGS) of commonly used BCCL and PDX models using the Illumina X10 platform with an average ~60x coverage. We identify novel genomic alterations, including point mutations and genomic rearrangements at base-pair resolution, compared to previously available sequencing data. Through integrative analysis with publicly available functional screening data, we annotate new genomic features likely to be of biological significance. CSMD1, previously identified as a tumor suppressor gene in various cancer types, including head and neck, lung and breast cancers, has been identified with deletion in 50% of our PDX models, suggesting an important role in aggressive breast cancers. Conclusions: Our WGS data provides a comprehensive genome sequencing resource of these models.


2016 ◽  
Vol 7 ◽  
Author(s):  
Maël Bessaud ◽  
Serge A. Sadeuh-Mba ◽  
Marie-Line Joffret ◽  
Richter Razafindratsimandresy ◽  
Patsy Polston ◽  
...  

2018 ◽  
Vol 9 ◽  
Author(s):  
Marie-Line Joffret ◽  
Patsy M. Polston ◽  
Richter Razafindratsimandresy ◽  
Maël Bessaud ◽  
Jean-Michel Heraud ◽  
...  

2021 ◽  
Author(s):  
Fatimah Alhamlan ◽  
Dana Bakheet ◽  
Marie Bohol ◽  
Madain Alsanea ◽  
Basma Alahaideb ◽  
...  

Background: The need for active genomic sequencing surveillance to rapidly identify circulating SARS-CoV-2 variants of concern (VOCs) is critical. However, increased global demand has led to a shortage of commercial SARS-CoV-2 sequencing kits, and not every country has the technological capability or the funds for high-throughput sequencing platforms. Therefore, this study aimed to develop and validate a rapid, cost-efficient genome sequencing protocol that uses supplies, equipment, and methodologic expertise available in standard molecular or diagnostic laboratories to identify circulating SARS-CoV-2 variants of concern. Methods: Sets of primers flanking the SARS-CoV-2 spike gene were designed using SARS-CoV-2 genome sequences retrieved from the Global Initiative on Sharing Avian Influenza Data (GISAID) Database and synthesized in-house. Primer specificity and final sequences were verified using online prediction analyses with BLAST. The primers were validated using 282 nasopharyngeal samples collected from patients assessed as positive for SARS-CoV-2 at the diagnostic laboratory of the hospital using a Rotor-Gene PCR cycler with an Altona Diagnostics SARS-CoV-2 kit. The patient samples were subjected to RNA extraction followed by cDNA synthesis, conventional polymerase chain reaction, and Sanger sequencing. Protocol specificity was confirmed by comparing these results with SARS-CoV-2 whole genome sequencing of the same samples. Results: Sanger sequencing using the newly designed primers and next-generation whole genome sequencing of 282 patient samples gave identical variant-of-concern results: 123 samples contained the alpha variant (B.1.1.7); 78, beta (B.1.351); 0, gamma (P.1); and 13, delta (B.1.617.2). The remaining samples were non-VOC, belonged to none of these variants, and had 99.97% identity with the reference genome. Only four samples had poor sequence quality by Sanger sequencing owing to a low viral count (Ct value >38). Therefore, mutation calls were >98% accurate. Conclusions: Sanger sequencing using in-house primers is an alternative approach that can be used in facilities with existing equipment to mitigate limitations in the high-throughput supplies required to identify SARS-CoV-2 variants of concern during the COVID-19 pandemic. This protocol is easily adaptable for detection of emerging variants.
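
The ">98% accurate" figure in the abstract above appears to follow from the quoted sample counts. The short check below is only a sketch of that arithmetic. It assumes, which the abstract does not state explicitly, that the four poor-quality Sanger samples are the only ones counted against accuracy out of the 282 tested; the variable names are illustrative.

```python
# Sample counts quoted in the abstract.
total_samples = 282
poor_quality = 4  # Sanger reads with poor quality (low viral count, Ct > 38)

# Assumption (not stated in the abstract): every sample other than the
# poor-quality ones counts as a correctly called sample.
usable = total_samples - poor_quality
accuracy = usable / total_samples

print(f"{usable}/{total_samples} = {accuracy:.2%}")
```

Under that assumption the ratio is 278/282, which is consistent with the reported threshold of 98%.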


Blood ◽  
2015 ◽  
Vol 126 (4) ◽  
pp. 508-519 ◽  
Author(s):  
Laura Y. McGirt ◽  
Peilin Jia ◽  
Devin A. Baerenwald ◽  
Robert J. Duszynski ◽  
Kimberly B. Dahlman ◽  
...  

Key Points: High-throughput sequencing of MF revealed multiple mutations within epigenetic and cytokine pathways that may drive disease. Pharmacologically targeting the JAK3 pathway in MF results in cell death and may be an effective treatment of this disease.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Sung Yong Park ◽  
Gina Faraci ◽  
Pamela M. Ward ◽  
Jane F. Emerson ◽  
Ha Youn Lee

Abstract: COVID-19 global cases have climbed to more than 33 million, with over a million total deaths, as of September 2020. Real-time massive SARS-CoV-2 whole genome sequencing is key to tracking chains of transmission and estimating the origin of disease outbreaks. Yet no methods have simultaneously achieved high precision, simple workflow, and low cost. We developed a high-precision, cost-efficient SARS-CoV-2 whole genome sequencing platform for COVID-19 genomic surveillance, CorvGenSurv (Coronavirus Genomic Surveillance). CorvGenSurv directly amplified viral RNA from COVID-19 patients’ nasopharyngeal/oropharyngeal (NP/OP) swab specimens and sequenced the SARS-CoV-2 whole genome in three segments by long-read, high-throughput sequencing. Sequencing of the whole genome in three segments significantly reduced sequencing data waste, thereby preventing dropouts in genome coverage. We validated the precision of our pipeline by both control genomic RNA sequencing and Sanger sequencing. We produced near full-length whole genome sequences from individuals who were COVID-19 test positive during April to June 2020 in Los Angeles County, California, USA. These sequences were highly diverse in the G clade with nine novel amino acid mutations including NSP12-M755I and ORF8-V117F. With its readily adaptable design, CorvGenSurv grants wide access to genomic surveillance, permitting immediate public health response to sudden threats.

