Integrated Analysis of Whole Genome and Epigenome Data Using Machine Learning Technology: Toward the Establishment of Precision Oncology

2021 ◽  
Vol 11 ◽  
Author(s):  
Ken Asada ◽  
Syuzo Kaneko ◽  
Ken Takasawa ◽  
Hidenori Machino ◽  
Satoshi Takahashi ◽  
...  

With the completion of the International Human Genome Project, we have entered what is known as the post-genome era, and efforts to apply genomic information to medicine have become more active. In particular, since U.S. President Barack Obama announced the Precision Medicine Initiative in his State of the Union address at the beginning of 2015, “precision medicine,” which aims to divide patients and potential patients into subgroups with respect to disease susceptibility, has become the focus of worldwide attention. The field of oncology is also actively adopting the precision oncology approach, which selects the appropriate treatment based on molecular profiling, such as genomic information. However, current precision oncology is dominated by the targeted-gene panel (TGP) method, which uses next-generation sequencing (NGS) to analyze a limited number of specific cancer-related genes and suggest optimal treatments; as a result, only a limited number of patients benefit from it. To develop precision oncology steadily, it is necessary to integrate and analyze more detailed omics data, such as whole genome data and epigenome data. At the same time, with the advancement of analysis technologies such as NGS, the amount of data obtained by omics analysis has become enormous, and artificial intelligence (AI) technologies, mainly machine learning (ML) technologies, are being actively used to make more efficient and accurate predictions. In this review, we focus on whole genome sequencing (WGS) analysis and epigenome analysis, introduce the latest results of omics analysis using ML technologies for the development of precision oncology, and discuss future prospects.

2021 ◽  
Vol 39 (15_suppl) ◽  
pp. e13588-e13588
Author(s):  
Laura Sachse ◽  
Smriti Dasari ◽  
Marc Ackermann ◽  
Emily Patnaude ◽  
Stephanie OLeary ◽  
...  

e13588 Background: Pre-screening for clinical trials is becoming more challenging as inclusion/exclusion criteria become increasingly complex. Oncology precision medicine provides an exciting opportunity to simplify this process and quickly match patients with trials by leveraging machine learning technology. The Tempus TIME Trial site network matches patients to relevant, open, and recruiting clinical trials, personalized to each patient’s clinical and molecular biology. Methods: Tempus screens patients at sites within the TIME Trial Network to find high-fidelity matches to clinical trials. The patient records include documentation submitted alongside NGS orders as well as electronic medical records (EMR) ingested through EMR Integrations. While Tempus-sequenced patients were automatically matched to trials using a Tempus-built matching application, EMR records were run through a natural language processing (NLP) data abstraction model to identify patients with an actionable gene of interest. Structured data were filtered to patients who lack a deceased date and have an encounter date within a predefined time period. Tempus abstractors manually validated the resulting unstructured records to ensure each patient was matched to a TIME Trial at a site capable of running the trial. For all high-level patient matches, a Tempus Clinical Navigator manually evaluated other clinical criteria to confirm trial matches and communicated with the site about trial options. Results: Patient matching was accelerated by implementing NLP gene and report detection (which isolated 17% of records) and manual screening. As a result, Tempus facilitated efficient screening of over 190,000 patients using proprietary NLP technology, matching 332 patients to 21 unique interventional clinical trials since program launch. Tempus continues to optimize its NLP models to increase high-fidelity trial matching at scale.
Conclusions: The TIME Trial Network is an evolving, dynamic program that efficiently matches patients with clinical trial sites using both EMR and Tempus sequencing data. Here, we show how machine learning technology can be utilized to efficiently identify and recruit patients to clinical trials, thereby personalizing trial enrollment for each patient. [Table: see text]
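
The structured-data filter described in the Methods above (no deceased date, an encounter within a predefined period, an actionable gene flag) can be sketched roughly as follows. This is an illustration only, not Tempus's implementation: the `PatientRecord` fields, the `is_screenable` and `prefilter` helpers, and the 180-day window are all assumptions introduced for the example, since the abstract specifies neither a schema nor the length of the time period.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional


@dataclass
class PatientRecord:
    # Hypothetical record layout; the abstract does not describe a schema.
    patient_id: str
    deceased_date: Optional[date]
    last_encounter: date
    has_actionable_gene: bool  # assumed to be set by an upstream NLP step


def is_screenable(rec: PatientRecord, today: date, window_days: int = 180) -> bool:
    """Keep patients with no deceased date and an encounter inside the window."""
    if rec.deceased_date is not None:
        return False
    earliest = today - timedelta(days=window_days)
    return earliest <= rec.last_encounter <= today


def prefilter(records: List[PatientRecord], today: date,
              window_days: int = 180) -> List[PatientRecord]:
    """Return records that pass the structured filter and carry a gene flag."""
    return [
        r for r in records
        if r.has_actionable_gene and is_screenable(r, today, window_days)
    ]


# Toy records, invented for the example.
records = [
    PatientRecord("p1", None, date(2021, 3, 1), True),
    PatientRecord("p2", date(2021, 1, 5), date(2021, 1, 2), True),  # deceased
    PatientRecord("p3", None, date(2019, 6, 1), True),              # stale encounter
    PatientRecord("p4", None, date(2021, 4, 10), False),            # no gene flag
]
kept = prefilter(records, today=date(2021, 5, 1))
print([r.patient_id for r in kept])  # ['p1']
```

In the workflow the abstract describes, records that survive a filter of this kind still go to manual validation by abstractors and a Clinical Navigator; the sketch covers only the automated pre-filter.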


2022 ◽  
Vol 2 ◽  
Author(s):  
Rasheed Omobolaji Alabi ◽  
Alhadi Almangush ◽  
Mohammed Elmusrati ◽  
Antti A. Mäkitie

Oral squamous cell carcinoma (OSCC) is one of the most prevalent cancers worldwide, and its incidence is on the rise in many populations. The high incidence rate, late diagnosis, and improper treatment planning remain significant concerns. Diagnosis at an early stage is important for better prognosis, treatment, and survival. Despite recent improvements in the understanding of the molecular mechanisms, late diagnosis and the approach toward precision medicine for OSCC patients remain a challenge. To enhance precision medicine, the deep machine learning technique has been touted as a way to improve early detection and consequently reduce cancer-specific mortality and morbidity. This technique has reportedly made significant progress in recent years in extracting and analyzing vital information from medical images. Therefore, it has the potential to assist in the early-stage detection of OSCC. Furthermore, automated image analysis can help pathologists and clinicians make informed decisions regarding cancer patients. This article discusses the technical knowledge and algorithms of deep learning for OSCC. It examines the application of deep learning technology in cancer detection, image classification, segmentation and synthesis, and treatment planning. Finally, we discuss how this technique can assist in precision medicine and the future perspective of deep learning technology in OSCC.


2015 ◽  
Author(s):  
Sarah Guthrie ◽  
Abram Connelly ◽  
Peter Amstutz ◽  
Adam F. Berrey ◽  
Nicolas Cesar ◽  
...  

The scientific and medical community is reaching an era of inexpensive whole genome sequencing, opening the possibility of precision medicine for millions of individuals. Here we present tiling: a flexible representation of whole genome sequences that supports simple and consistent names, annotation, queries, machine learning, and clinical screening. We partitioned the genome into 10,655,006 tiles: overlapping, variable-length sequences that begin and end with unique 24-base tags. We tiled and annotated 680 public whole genome sequences from the 1000 Genomes Project Consortium (1KG) and Harvard Personal Genome Project (PGP) using ClinVar database information. These genomes cover 14.13 billion tile sequences (4.087 trillion high-quality bases and 0.4321 trillion low-quality bases) and 251 phenotypes spanning ICD-9 code ranges 140-289, 320-629, and 680-759. We used these data to build a Global Alliance for Genomics and Health Beacon and graph database. We performed principal component analysis (PCA) on the 680 public whole genomes, and by projecting the tiled genomes onto their first two principal components, we replicated the 1KG principal component separation by population ethnicity codes. Interestingly, we found that the PGP self-reported ethnicities cluster consistently with 1KG ethnicity codes. We built a set of support-vector ABO blood-type classifiers using 75 PGP participants who had both a whole genome sequence and a self-reported blood type. Our classifier predicts A antigen presence to within 1% of the current state of the art for in silico A antigen prediction. Finally, we found six PGP participants with previously undiscovered pathogenic BRCA variants, and using our tiling, gave them simple, consistent names, which can be easily and independently re-derived.
Given the near-future requirements of genomics research and precision medicine, we propose the adoption of tiling and invite all interested individuals and groups to view, rerun, copy, and modify these analyses at https://curover.se/su92l-j7d0g-swtofxa2rct8495
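
The tile definition in the abstract above (overlapping, variable-length sequences that begin and end with a tag) can be illustrated with a toy sketch. It is not the authors' implementation: the `tile_sequence` function, the precomputed `tag_starts` list, and the 4-base tags in the example are assumptions made for readability (the abstract uses unique 24-base tags and does not describe how they are chosen).

```python
from typing import List


def tile_sequence(seq: str, tag_starts: List[int], tag_len: int = 24) -> List[str]:
    """Cut seq into tiles that each begin and end with a tag.

    tag_starts holds the sorted start offsets of non-overlapping tags
    (assumed to be known in advance). Each tile runs from the start of one
    tag through the end of the next, so consecutive tiles share one tag
    and overlap by tag_len bases.
    """
    tiles = []
    for start, next_start in zip(tag_starts, tag_starts[1:]):
        tiles.append(seq[start:next_start + tag_len])
    return tiles


# Toy sequence with 4-base "tags" AAAA, GGGG, CCCC at offsets 0, 7, 15.
seq = "AAAA" + "CGT" + "GGGG" + "TTTA" + "CCCC"
tiles = tile_sequence(seq, tag_starts=[0, 7, 15], tag_len=4)
print(tiles)  # ['AAAACGTGGGG', 'GGGGTTTACCCC']
```

Because the tags stay fixed while the bases between them vary, two genomes can be compared tile by tile at the same position; that property is what the abstract relies on for consistent naming and for building feature vectors for PCA and classification.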


Biomolecules ◽  
2019 ◽  
Vol 10 (1) ◽  
pp. 62 ◽  
Author(s):  
Ryuji Hamamoto ◽  
Masaaki Komatsu ◽  
Ken Takasawa ◽  
Ken Asada ◽  
Syuzo Kaneko

To clarify the mechanisms of diseases such as cancer, studies analyzing genetic mutations have been actively conducted for a long time, and a large number of achievements have already been reported. Indeed, genomic medicine is considered the core discipline of precision medicine, and the clinical application of cutting-edge genomic medicine aimed at improving the prevention, diagnosis, and treatment of a wide range of diseases is currently being promoted. However, although the Human Genome Project was completed in 2003 and large-scale genetic analyses have since been accomplished worldwide with the development of next-generation sequencing (NGS), explaining the mechanism of disease onset using genetic variation alone has been recognized as difficult. Meanwhile, the importance of epigenetics, which describes inheritance by mechanisms other than the genomic DNA sequence, has recently attracted attention, and, in particular, many studies have reported the involvement of epigenetic deregulation in human cancer. So far, because genetic and epigenetic studies tend to be carried out independently, the physiological relationships between genetics and epigenetics in diseases remain almost unknown. Since this situation may hinder the development of precision medicine, an integrated understanding of genetic variation and epigenetic deregulation now appears to be critical. Importantly, the current progress of artificial intelligence (AI) technologies, such as machine learning and deep learning, is remarkable and enables multimodal analyses of big omics data. In this regard, it is important to develop a platform that can conduct multimodal analysis of medical big data using AI, as this may accelerate the realization of precision medicine. In this review, we discuss the importance of genome-wide epigenetic and multiomics analyses using AI in the era of precision medicine.


Cancers ◽  
2021 ◽  
Vol 13 (17) ◽  
pp. 4324
Author(s):  
Karin P. S. Langenberg ◽  
Eleonora J. Looze ◽  
Jan J. Molenaar

Over the last few years, various precision medicine programs have been developed for pediatric patients with high-risk, relapsed, or refractory malignancies, selecting patients for targeted treatment through comprehensive molecular profiling. In this review, we describe characteristics of these initiatives, demonstrating the feasibility and potential of molecular-driven precision medicine. Actionable events are identified in a significant subset of patients, although comparing results is complicated by the lack of a standardized definition of actionable alterations and by the different molecular profiling strategies used. The first biomarker-driven trials for childhood cancer have been initiated, but until now the effect of precision medicine on clinical outcome has only been reported for a small number of patients, demonstrating clinical benefit in some. Future perspectives include the incorporation of novel approaches such as liquid biopsies and immune monitoring, innovative collaborative trial design including combination strategies, and the development of agents specifically targeting aberrations in childhood malignancies.


2020 ◽  
Vol 13 (1) ◽  
Author(s):  
Kris G. Samsom ◽  
Linda J. W. Bosch ◽  
Luuk J. Schipper ◽  
Paul Roepman ◽  
Ewart de Bruijn ◽  
...  

Background: ‘Precision oncology’ can ensure the most suitable treatment at the right time by tailoring treatment to the individual patient and comprehensive tumour characteristics. In current molecular pathology, diagnostic tests which are part of the standard of care (SOC) only cover a limited part of the spectrum of genomic changes, and often are performed in an iterative way. This comes at the expense of valuable patient time and available tissue sample, and interferes with ‘first time right’ treatment decisions. Whole Genome Sequencing (WGS) captures a near complete view of genomic characteristics of a tumour in a single test. Moreover, WGS facilitates faster implementation of new treatment-relevant biomarkers. At present, WGS has mainly been applied in study settings, and its performance in a routine diagnostic setting remains to be evaluated. The WIDE study aims to investigate the feasibility and validity of WGS-based diagnostics in clinical practice. Methods: 1200 consecutive patients in a single comprehensive cancer centre with (suspicion of) a metastasized solid tumour will be enrolled with the intention to analyse tumour tissue with WGS, in parallel to SOC diagnostics. Primary endpoints are (1) feasibility of implementation of WGS-based diagnostics into routine clinical care and (2) clinical validation of WGS by comparing identification of treatment-relevant variants between WGS and SOC molecular diagnostics. Secondary endpoints entail (1) added clinical value in terms of additional treatment options and (2) cost-effectiveness of WGS compared to SOC diagnostics through a Health Technology Assessment (HTA) analysis. Furthermore, the (3) perceived impact of WGS-based diagnostics on clinical decision making will be evaluated through questionnaires. The number of patients included in (experimental) therapies initiated based on SOC or WGS diagnostics will be reported with at least 3 months follow-up. The clinical efficacy is beyond the scope of WIDE.
Key performance indicators will be evaluated after every 200 patients enrolled, and procedures optimized accordingly, to continuously improve the diagnostic performance of WGS in a routine clinical setting. Discussion: WIDE will yield the optimal conditions under which WGS can be implemented in a routine molecular diagnostics setting and establish the position of WGS compared to SOC diagnostics in routine clinical care.
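
The second primary endpoint above, comparing the treatment-relevant variants identified by WGS with those identified by SOC diagnostics, amounts to a per-patient set comparison. The sketch below is one possible way to tabulate it and is not taken from the WIDE protocol: the `compare_calls` function, the variant labels, and the "fraction of SOC findings recovered by WGS" summary are assumptions introduced for illustration.

```python
from typing import Dict, Set


def compare_calls(wgs: Set[str], soc: Set[str]) -> Dict[str, object]:
    """Tabulate overlap between WGS and SOC variant calls for one patient."""
    shared = wgs & soc
    return {
        "shared": sorted(shared),
        "wgs_only": sorted(wgs - soc),   # candidate added value of WGS
        "soc_only": sorted(soc - wgs),   # findings WGS did not reproduce
        # Hypothetical summary metric; None when SOC reported nothing.
        "soc_recovered_by_wgs": len(shared) / len(soc) if soc else None,
    }


# Toy calls for a single patient, invented for the example.
wgs_calls = {"BRAF V600E", "KRAS G12C", "NTRK1 fusion"}
soc_calls = {"BRAF V600E", "EGFR L858R"}
result = compare_calls(wgs_calls, soc_calls)
print(result["shared"])                # ['BRAF V600E']
print(result["wgs_only"])              # ['KRAS G12C', 'NTRK1 fusion']
print(result["soc_only"])              # ['EGFR L858R']
print(result["soc_recovered_by_wgs"])  # 0.5
```

Keeping the `wgs_only` and `soc_only` lists separate mirrors the distinction the abstract draws between clinical validation (does WGS reproduce SOC findings?) and added clinical value (does WGS surface additional options?).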


Author(s):  
Alexander Meisel

Until recently, the clinical management of cancer relied heavily on anatomical and histopathological criteria, with ad hoc guidelines directing the therapeutic choices in specific indications. In recent years, the development and therapeutic implementation of novel anticancer therapies have significantly improved the clinical outcome of cancer patients. Nonetheless, such cutting-edge approaches have revealed the limitations of the one-size-fits-all paradigm. The newly discovered molecular targets can be exploited either as bona fide targets for subsequent drug development, or as tools for precision medicine, in the form of prognostic and/or predictive biomarkers. This article provides an overview of some of the most recent advances in precision medicine in oncology, with a focus on novel tissue-agnostic anticancer therapies. The definition and implementation of biomarkers and companion diagnostics in clinical trials and clinical practice are also discussed, as well as the changing landscape in clinical trial design.


2019 ◽  
Vol 19 (25) ◽  
pp. 2301-2317 ◽  
Author(s):  
Ruirui Liang ◽  
Jiayang Xie ◽  
Chi Zhang ◽  
Mengying Zhang ◽  
Hai Huang ◽  
...  

In recent years, the successful implementation of the Human Genome Project has made clear that, given the complexity and varied forms of cancer, genetic, environmental, and lifestyle factors should be studied together. The increasing availability and growth rate of ‘big data’ derived from various omics opens a new window for the study and therapy of cancer. In this paper, we introduce the application of machine learning methods in handling cancer big data, including the use of artificial neural networks, support vector machines, ensemble learning, and naïve Bayes classifiers.

