Predicting Complete Remission of Acute Myeloid Leukemia: Machine Learning Applied to Gene Expression

2019 ◽  
Vol 18 ◽  
pp. 117693511983554 ◽  
Author(s):  
Ophir Gal ◽  
Noam Auslander ◽  
Yu Fan ◽  
Daoud Meerzaman

Machine learning (ML) is a useful tool for advancing our understanding of the patterns and significance of biomedical data. Given the growing trend in the application of ML techniques in precision medicine, here we present an ML technique which predicts the likelihood of complete remission (CR) in patients diagnosed with acute myeloid leukemia (AML). In this study, we explored the question of whether ML algorithms designed to analyze gene-expression patterns obtained through RNA sequencing (RNA-seq) can be used to accurately predict the likelihood of CR in pediatric AML patients who have received induction therapy. We employed tests of statistical significance to determine which genes were differentially expressed in the samples derived from patients who achieved CR after 2 courses of treatment and the samples taken from patients who did not benefit. We tuned classifier hyperparameters to optimize performance and used multiple methods to guide our feature selection as well as our assessment of algorithm performance. To identify the model which performed best within the context of this study, we plotted receiver operating characteristic (ROC) curves. Using the top 75 genes from the k-nearest neighbors algorithm (K-NN) model (K = 27) yielded the best area-under-the-curve (AUC) score that we obtained: 0.84. When we finally tested the model on the previously unseen test data set, the top 50 genes yielded the best AUC = 0.81. Pathway enrichment analysis for these 50 genes showed that the guanosine diphosphate fucose (GDP-fucose) biosynthesis pathway is the most significant with an adjusted P value = .0092, which may suggest the vital role of N-glycosylation in AML.
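
The workflow summarized in this abstract (rank genes by a significance test, keep the top ones, score samples with K-NN, evaluate by ROC AUC) can be illustrated with a minimal pure-Python sketch. Everything here apart from K = 27 is an assumption made for illustration: the data are synthetic, Welch's t statistic stands in for the unspecified "tests of statistical significance", and the function names are hypothetical. It is not the authors' implementation.

```python
import math
import random

def welch_t(a, b):
    """Welch's t statistic for two lists of expression values."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    va = sum((x - ma) ** 2 for x in a) / (len(a) - 1)
    vb = sum((x - mb) ** 2 for x in b) / (len(b) - 1)
    return (ma - mb) / math.sqrt(va / len(a) + vb / len(b))

def top_genes(X, y, n_top):
    """Rank gene indices by |t| between CR (1) and non-CR (0) samples."""
    scores = []
    for g in range(len(X[0])):
        cr = [row[g] for row, label in zip(X, y) if label == 1]
        no = [row[g] for row, label in zip(X, y) if label == 0]
        scores.append((abs(welch_t(cr, no)), g))
    return [g for _, g in sorted(scores, reverse=True)[:n_top]]

def knn_score(X_train, y_train, x, genes, k):
    """Fraction of the k nearest training samples (Euclidean) labeled CR."""
    dists = sorted(
        (math.dist([r[g] for g in genes], [x[g] for g in genes]), label)
        for r, label in zip(X_train, y_train)
    )
    return sum(label for _, label in dists[:k]) / k

def roc_auc(y_true, scores):
    """AUC as the chance a CR sample outscores a non-CR one (ties count 0.5)."""
    pos = [s for s, t in zip(scores, y_true) if t == 1]
    neg = [s for s, t in zip(scores, y_true) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def make_data(n_samples, n_genes, n_informative, rng):
    """Synthetic expression matrix: the first n_informative genes shift with CR."""
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        X.append([rng.gauss(1.5 * label if g < n_informative else 0.0, 1.0)
                  for g in range(n_genes)])
        y.append(label)
    return X, y

rng = random.Random(0)
X_train, y_train = make_data(120, 200, 10, rng)
X_test, y_test = make_data(60, 200, 10, rng)

# Feature selection uses the training set only, so the test set stays unseen.
genes = top_genes(X_train, y_train, n_top=10)
scores = [knn_score(X_train, y_train, x, genes, k=27) for x in X_test]
auc = roc_auc(y_test, scores)
print(f"held-out AUC on synthetic data: {auc:.2f}")
```

Selecting genes on the training split alone mirrors the abstract's separation between model selection and the "previously unseen" test set; selecting on all samples would inflate the reported AUC.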

2019 ◽  
Vol 3 (8) ◽  
pp. 1330-1346 ◽  
Author(s):  
Sarah Wagner ◽  
Jayakumar Vadakekolathu ◽  
Sarah K. Tasian ◽  
Heidi Altmann ◽  
Martin Bornhäuser ◽  
...  

Abstract Acute myeloid leukemia (AML) is a genetically heterogeneous hematological malignancy with variable responses to chemotherapy. Although recurring cytogenetic abnormalities and gene mutations are important predictors of outcome, 50% to 70% of AMLs harbor normal or risk-indeterminate karyotypes. Therefore, identifying more effective biomarkers predictive of treatment success and failure is essential for informing tailored therapeutic decisions. We applied an artificial neural network (ANN)–based machine learning approach to a publicly available data set for a discovery cohort of 593 adults with nonpromyelocytic AML. ANN analysis identified a parsimonious 3-gene expression signature comprising CALCRL, CD109, and LSP1, which was predictive of event-free survival (EFS) and overall survival (OS). We computed a prognostic index (PI) using normalized gene-expression levels and β-values from subsequently created Cox proportional hazards models, coupled with clinically established prognosticators. Our 3-gene PI separated the adult patients in each European LeukemiaNet cytogenetic risk category into subgroups with different survival probabilities and identified patients with very high–risk features, such as those with a high PI and either FLT3 internal tandem duplication or nonmutated nucleophosmin 1. The PI remained significantly associated with poor EFS and OS after adjusting for established prognosticators, and its ability to stratify survival was validated in 3 independent adult cohorts (n = 905 subjects) and 1 cohort of childhood AML (n = 145 subjects). Further in silico analyses established that AML was the only tumor type among 39 distinct malignancies for which the concomitant upregulation of CALCRL, CD109, and LSP1 predicted survival. Therefore, our ANN-derived 3-gene signature refines the accuracy of patient stratification and has the potential to significantly improve outcome prediction.
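
A prognostic index built from Cox model coefficients is commonly a weighted sum of normalized expression levels. The sketch below assumes that form. The β-values, the expression values, the patient labels, and the median cutoff are all hypothetical, since the abstract reports none of them.

```python
from statistics import median

# Hypothetical Cox coefficients (beta values); the abstract does not report them.
HYPOTHETICAL_BETAS = {"CALCRL": 0.30, "CD109": 0.20, "LSP1": 0.10}

def prognostic_index(expression, betas=HYPOTHETICAL_BETAS):
    """Weighted sum of normalized expression levels: PI = sum(beta_g * x_g)."""
    return sum(beta * expression[gene] for gene, beta in betas.items())

def split_by_median(pi_by_patient):
    """Label each patient 'high' or 'low' PI relative to the cohort median."""
    cutoff = median(pi_by_patient.values())
    return {p: ("high" if pi > cutoff else "low")
            for p, pi in pi_by_patient.items()}

# Made-up normalized expression values for four illustrative patients.
cohort = {
    "patient_A": {"CALCRL": 2.0, "CD109": 1.0, "LSP1": 0.5},
    "patient_B": {"CALCRL": 0.2, "CD109": 0.1, "LSP1": 0.3},
    "patient_C": {"CALCRL": 1.5, "CD109": 2.0, "LSP1": 1.0},
    "patient_D": {"CALCRL": 0.1, "CD109": 0.4, "LSP1": 0.2},
}

pi_values = {p: prognostic_index(x) for p, x in cohort.items()}
groups = split_by_median(pi_values)
for patient, pi in pi_values.items():
    print(patient, round(pi, 2), groups[patient])
```

In a real analysis the resulting groups would then be compared with survival methods such as Kaplan-Meier curves and a log-rank test; those steps are omitted here.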


Blood ◽  
2004 ◽  
Vol 104 (11) ◽  
pp. 2037-2037
Author(s):  
Lars Bullinger ◽  
Claudia Scholl ◽  
Eric Bair ◽  
Konstanze Dohner ◽  
Stefan Frohling ◽  
...  

Abstract Recurrent cytogenetic aberrations have been shown to constitute markers of diagnostic and prognostic value in acute myeloid leukemia (AML). However, even within the well-defined cytogenetic AML subgroup with an inv(16) we see substantial biological and clinical heterogeneity which is not fully reflected by the current classification system. To better characterize this cytogenetic group on the molecular level we profiled gene expression in a series of adult AML patients (n=26) with inv(16) using 42k cDNA microarrays. By unsupervised hierarchical clustering we observed that samples with inv(16) separated primarily into two different subgroups. These showed no significant differences regarding known risk factors like age, WBC, LDH, etc. However, these newly defined inv(16)-subgroups were characterized by distinct clinical behavior. There was a strong trend towards unfavorable outcome with shorter overall survival times in one group (P=0.09, log rank test). Since the primary translocation/inversion events themselves are not sufficient for leukemogenesis, distinct patterns of gene expression found within each of these cytogenetic groups may suggest alternative cooperating mutations and deregulated pathways leading to transformation. Therefore, we performed a supervised analysis to determine the characteristic gene expression patterns underlying the cluster-defined subgroups. This Significance Analysis of Microarrays (SAM) method identified 260 genes significantly differentially expressed between the two newly defined inv(16)-subgroups (false discovery rate = 0.002). High expression levels of JUN, JUNB, JUND, FOS and FOSB characterized the first inv(16) subgroup (having less favorable prognosis). FOS gene family members can dimerize with proteins of the JUN family, forming the transcription factor complex AP-1 which has been implicated in the regulation of cell proliferation, differentiation, and transformation. 
In the second subgroup, the proto-oncogene ETS1 displayed elevated expression, possibly resulting from aberrant MEK/ERK pathway activation, as these cases also showed overexpression of MAP3K1 and MAP3K2. In conclusion, both supervised and unsupervised methods provide numerous insights into the pathogenesis of AML with inv(16), identifying clinically significant patterns of gene expression, as well as candidate target genes involved in leukemogenesis.


Blood ◽  
2005 ◽  
Vol 106 (11) ◽  
pp. 673-673
Author(s):  
Lars Bullinger ◽  
Stephan Kurz ◽  
Konstanze Dohner ◽  
Claudia Scholl ◽  
Stefan Frohling ◽  
...  

Abstract Recurrent cytogenetic aberrations have been shown to constitute markers of diagnostic and prognostic value in acute myeloid leukemia (AML). However, even within well-defined cytogenetic AML subgroups with an inv(16) or a t(8;21) we see substantial biological and clinical heterogeneity which is not fully reflected by the current classification system. Therefore, we profiled gene expression in a large series of adult AML patients with core binding factor (CBF) leukemia [inv(16) n=55, t(8;21) n=38] using a whole genome DNA microarray platform in order to better characterize this disease on the molecular level. By unsupervised hierarchical clustering based on 8556 filtered genes we observed that our CBF leukemia samples separated primarily into three different subgroups. While two of the subgroups were characterized by inv(16) and t(8;21) cases, respectively, the third subgroup contained a mixture of both cytogenetic groups. There was no obvious correlation with known secondary aberrations or molecular markers such as FLT3-ITD, NRAS, and KIT mutations between the cases in the mixed subgroup and the others. However, the newly defined inv(16)/t(8;21)-subgroup (n=35) was characterized by distinct clinical behavior with shorter overall survival times (P=0.029; log rank test) compared to the other two groups. Unsupervised analyses within the inv(16) and t(8;21) cases also revealed corresponding inv(16) and t(8;21) subgroups with a strong trend towards inferior outcome (P=0.11 and P=0.09, respectively; log rank test). Since the primary translocation/inversion events themselves are not sufficient for leukemogenesis, distinct patterns of gene expression found within each of these cytogenetic groups may suggest alternative cooperating mutations and deregulated pathways leading to transformation. Therefore, we performed a supervised analysis to determine the characteristic gene expression patterns underlying the cluster-defined subgroups. 
We identified 528 genes significantly differentially expressed between the newly defined inv(16)/t(8;21)-subgroup and the other CBF cases (significance analysis of microarrays, false discovery rate < 0.001). Potential candidates for cooperating pathways characterizing the mixed inv(16)/t(8;21)-subgroup included, for example, AVO3, a member of the mTOR pathway; oncogene homologs such as LYN and BRAF; and FOXO1A and IL6ST, which have previously been shown to correlate with outcome in AML (Bullinger et al., N Engl J Med 350:1605, 2004). In conclusion, while the observed signatures remain to be validated for their functional relevance, both supervised and unsupervised methods provide numerous insights into the pathogenesis of CBF AML, identifying clinically significant patterns of gene expression, as well as candidate target genes involved in leukemogenesis.


Blood ◽  
2012 ◽  
Vol 120 (21) ◽  
pp. 1380-1380
Author(s):  
Michael A Morgan ◽  
Birgit Markus ◽  
Malou Hermkens ◽  
Frederik Damm ◽  
Katarina Reinhardt ◽  
...  

Abstract 1380. NADH dehydrogenase subunit 4 (ND4) is encoded by mitochondrial DNA and is an integral component of Complex I, one of the core enzymatic complexes critical for mitochondrial oxidative phosphorylation and regulation of the balance between NADH and NAD+. ND4 mutations have recently been described in adult acute myeloid leukemia (AML). In the current study, we investigated the frequency and prognostic impact of ND4 mutations in 289 pediatric leukemia patients (≤ 18 years). Total cellular DNA was isolated from bone marrow or peripheral blood samples at diagnosis (n=289) and at complete remission (n=6) for children treated uniformly within multicenter treatment trials AML-Berlin-Frankfurt-Münster (BFM, n=180) and Dutch Childhood Oncology Group (DCOG, n=109). ND4 mutations were detected by direct sequencing in 13 of 289 (4.5%) pediatric AML patients. Mutations occurred throughout the ND4 sequence, and included missense mutations (n=10), deletions (n=2) and a nonsense mutation. The most commonly detected mutations were S86N (n=2), delA 11,032–11,038 (n=2), and F50L (n=2). All other mutations were detected in single cases. Four (30.8%) ND4 mutations were heteroplasmic (i.e. both wild-type and mutated ND4 were detected) and 9 (69.2%) were homoplasmic (i.e. only mutated ND4 was detected), which is similar to the distribution we previously observed for adult AML patients (37.9% and 62.1%, respectively). Of the 4 heteroplasmic mutations detected in the pediatric AML cohort, 3 are predicted to result in a truncated ND4 protein. The remaining heteroplasmic mutation, which results in an L72P substitution, is predicted to be damaging (PolyPhen2 score = 0.999). Thus all 4 heteroplasmic mutations are expected to interfere with ND4 protein function. In contrast, 3 of the 9 (33.3%) homoplasmic mutations are within transmembrane regions and only 1 (11.1%) is predicted to be damaging (S459Y, PolyPhen2 score = 0.906). 
The 11 predicted transmembrane domains (TMD) of ND4 may be important for mitochondrial proton transport. However, as in adult AML, the presence of ND4 mutations affecting or not affecting a TMD had no impact on pediatric AML patient outcome. Non-tumoral DNA available through samples collected in routine follow-up examinations during complete remission allowed determination of mutation origin (somatic or germline) in 6 cases. Interestingly, the homoplasmic substitutions resulting in F50L, S86N and A131T were each determined to be germline mutations in both adult and pediatric AML samples. The heteroplasmic one base-pair deletion in a stretch of seven adenine residues (11,032–11,038) detected in two pediatric leukemia samples was determined to be somatic in the one case for whom a sample obtained during complete remission was available for analysis. Patient characteristics including age, FAB subtype, WBC count, cytogenetic subgroup, and presence of FLT3-ITD were similar regardless of ND4 mutation status. In accordance with our earlier observations in adult AML, comparison of ND4-mutated with ND4-wild-type patients demonstrated no significant difference in overall survival (OS, P=.67). In the adult study, a survival advantage was observed for patients with somatic heteroplasmic ND4 mutations. No survival advantage was observed for children with heteroplasmic ND4 mutations, possibly due to the limited number of ND4-mutated patients treated in the BFM and DCOG study groups. Gene expression profiles (GEP) for ND4-mutated (n=11) and ND4-wild-type (n=188) pediatric AML patients revealed no significant differences. However, 8 probe sets were found to be differentially regulated when GEP for heteroplasmic ND4-mutated (n=4) and ND4-wild-type (n=187) patients were compared. Two of these probe sets annotated the SETDB2 (CLLD8, KMT1F) gene, which encodes a histone H3 methyltransferase. Quantitative RT-PCR validated the lower SETDB2 expression as a characteristic of ND4-mutated cases (P=.02). 
SETDB2 contributes to several important cellular functions, including heterochromatin formation, chromatin condensation and transcriptional repression. In summary, ND4 mutations were not predictive for outcome in pediatric AML, but were significantly associated with decreased SETDB2 expression, providing a link between mitochondrial gene mutation and epigenetic control of gene expression. Disclosures: No relevant conflicts of interest to declare.


Blood ◽  
2009 ◽  
Vol 114 (22) ◽  
pp. 114-114
Author(s):  
Jatinder K. Lamba ◽  
Kristine Crews ◽  
Stanley Pounds ◽  
Xueyuan Cao ◽  
Varsha Gandhi ◽  
...  

Abstract 114. To identify genes that influence responses to cytarabine (ara-C) treatment, we explored the association of gene expression in leukemic cells at diagnosis with multiple pharmacological and clinical end-points in children with acute myeloid leukemia (AML) treated with ara-C on the St. Jude AML97 clinical trial. We applied a novel statistical procedure, PRojection Onto the Most Interesting Statistical Evidence (PROMISE; PR), to identify genes with expression levels associated with clinical and pharmacological endpoints. To do this, we first defined the following values of the clinical and pharmacological variables as “therapeutically beneficial”: higher leukemic cell ara-C triphosphate levels, lower DNA synthesis values on days 1 and 2 of treatment relative to baseline, decreases in leukocyte counts on day 2 of treatment, improved response and decreased risk of relapse, death, or second malignancy. We considered a gene to show a therapeutically beneficial pattern of association if its expression was positively correlated with ara-CTP levels, negatively correlated with DNA synthesis levels, negatively correlated with decrease in leukocyte counts on day 2, positively correlated with better treatment response, and negatively correlated with the risk of relapse or death. A gene showed a therapeutically detrimental pattern of association if its expression had the opposite correlations with the clinical and pharmacological variables. We performed five-variable (PR5, using early pharmacologically interesting phenotype measures) or seven-variable (PR7, using all of the phenotypes indicated above) PROMISE analyses. PR5 identified 275 beneficial probe sets and 69 detrimental probe sets (p ≤ 0.005). PR7 analysis identified 112 beneficial probe sets and 115 detrimental probe sets (p ≤ 0.005). To confirm these results, we performed a PROMISE analysis for a cohort of patients treated with ara-C and other agents on the AML02 protocol. 
Gene expression in leukemic cells at diagnosis was analyzed for a beneficial or detrimental pattern of association with three phenotypes (PR-3): diagnostic blast ara-C cytotoxicity, minimal residual disease (MRD), and event-free survival (EFS). Eighty-one probe sets identified by PR5 or PR7 analyses in the initial cohort were confirmed in the PR-3 analysis of AML02 data. Genes identified in the present study may serve as predictive markers of response and candidates for future drug development. Disclosures: No relevant conflicts of interest to declare.
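
The idea of scoring a gene by how well its correlations with several endpoints match a predefined "beneficial" sign pattern can be sketched as follows. This is a simplified illustration and not the published PROMISE procedure. It assumes Pearson correlation as the association measure, a plain average of signed correlations as the statistic, and a permutation test for significance. The endpoint names and all values are made up.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def signed_pattern_stat(expr, endpoints, signs):
    """Mean of sign-adjusted correlations; near +1 means a beneficial pattern."""
    total = sum(signs[name] * pearson(expr, values)
                for name, values in endpoints.items())
    return total / len(endpoints)

def permutation_p(expr, endpoints, signs, n_perm=2000, seed=0):
    """Two-sided permutation p-value obtained by shuffling expression values."""
    rng = random.Random(seed)
    observed = signed_pattern_stat(expr, endpoints, signs)
    shuffled = list(expr)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(signed_pattern_stat(shuffled, endpoints, signs)) >= abs(observed):
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)

# Made-up values for eight patients; endpoint names are illustrative only.
expression = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
endpoints = {
    "ara_ctp_level": [1.1, 1.9, 3.2, 3.8, 5.1, 6.3, 6.8, 8.2],
    "dna_synthesis": [8.0, 7.2, 5.9, 5.1, 4.2, 2.8, 2.1, 0.9],
    "relapse_risk": [0.9, 0.8, 0.8, 0.6, 0.5, 0.3, 0.3, 0.1],
}
# Expected direction of a beneficial association for each endpoint.
signs = {"ara_ctp_level": +1, "dna_synthesis": -1, "relapse_risk": -1}

stat, p_value = permutation_p(expression, endpoints, signs)
print(f"signed-pattern statistic = {stat:.3f}, permutation p = {p_value:.4f}")
```

Shuffling only the expression values keeps the endpoints linked to one another within each patient, so the null distribution preserves the correlation among endpoints. A gene with the opposite correlations yields a statistic near -1, which corresponds to the detrimental pattern described above.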


2015 ◽  
Vol 7 ◽  
pp. e2015033 ◽  
Author(s):  
Adel Abd elhaleim Hagag

Abstract Background: Acute myeloid leukemia (AML) accounts for 25%-35% of acute leukemia in children. BAALC (Brain and Acute Leukemia, Cytoplasmic gene) is a recently identified gene on chromosome 8q22.3 that has prognostic significance in AML. The aim of this work was to study the impact of BAALC gene expression on the prognosis of AML in Egyptian children. Patients and methods: This study was conducted on 40 patients with newly diagnosed AML who were subjected to the following: full history taking; clinical examination; laboratory investigations including complete blood count, LDH, bone marrow aspiration, cytochemistry, and immunophenotyping; and assessment of the BAALC gene by real-time PCR in bone marrow aspirate mononuclear cells before the start of chemotherapy. Results: BAALC gene expression was positive in 24 cases (60%) and negative in 16 cases (40%). Patients with positive BAALC gene expression included 10 who achieved complete remission, 8 who died, and 6 who relapsed, while patients with negative expression included 12 who achieved complete remission, 1 who relapsed, and 3 who died. There was a significant association between BAALC gene expression and the FAB classification of AML patients, as positive BAALC expression was predominantly seen in FAB subtypes M1 and M2, whereas negative BAALC gene expression was found more often in M3 and M4 (8 cases with M1, 12 cases with M2, 1 case with M3, and 3 cases with M4 in the BAALC-positive group versus 2 cases with M1, 3 cases with M2, 4 cases with M3, and 7 cases with M4 in the BAALC-negative group, a significant difference regarding FAB subtypes). As regards age, sex, splenomegaly, lymphadenopathy, pallor, purpura, platelet count, WBC count, and percentage of blast cells in BM, the present study showed no significant association with BAALC expression. 
Conclusion: BAALC expression is an important prognostic factor in AML patients and its incorporation into novel risk-adapted therapeutic strategies will improve the currently disappointing cure rate of this group of patients.


Blood ◽  
2006 ◽  
Vol 108 (11) ◽  
pp. 155-155 ◽  
Author(s):  
Lars Bullinger ◽  
Konstanze Dohner ◽  
Raphael Kranz ◽  
Frank G. Rucker ◽  
Stefan Frohling ◽  
...  

Abstract Acute myeloid leukemia (AML) with normal karyotype comprises a large number of molecularly distinct variants. For example, the presence of internal tandem duplications (ITDs) of the FLT3 (fms-related tyrosine kinase 3) gene is associated with poor outcome, whereas mutations of the NPM1 (nucleophosmin) gene are prognostically favorable. However, this effect is mainly attributed to the NPM1-mutated/FLT3 ITD-negative AML cases. While NPM1-mutated cases are characterized by a distinct gene expression pattern, it remains unclear whether NPM1-mutated/FLT3 ITD-negative cases also display a characteristic signature, which might provide additional insights into the molecular basis for the good clinical outcome. Thus, we sought to identify a molecular profile for AML cases with NPM1-mutated/FLT3 ITD-negative normal karyotype disease. Towards this goal, we profiled gene expression of 138 samples of adult AML patients with normal karyotype using DNA microarray technology. All samples analyzed were derived from AML patients entered within the randomized multicenter treatment trial HD-98A of the German-Austrian AML Study Group (AMLSG). Based on supervised data analyses we were able to identify an expression pattern comprising 116 genes that correlated with NPM1-mutated and FLT3 ITD-negative AML cases. In accordance with previous findings in NPM1-mutated cases (Alcalay et al. 2005, Verhaak et al. 2005), the NPM1-mutated/FLT3 ITD-negative pattern was also in part characterized by a prominent HOX gene cluster, which clearly separated the NPM1-wildtype from the NPM1-mutated cases. Similarly, the expression levels of BAALC and MN1 were correlated with the NPM1 mutational status, with NPM1-unmutated cases displaying higher BAALC and MN1 expression in our data set. However, as expected, the newly defined signature also defined an NPM1-mutated group that did not contain many FLT3 ITD-positive samples. 
This group was characterized by several interesting genes including for example TLE1, which encodes a Groucho/TLE family protein. Groucho/TLE family proteins are transcriptional co-repressors, which mediate repression essential in embryonic development and are involved in regulation of Wnt signaling in adult tissue. Moreover, we identified several other genes of potential pathogenic relevance which also have been previously shown to be predictive in normal karyotype AML. Our findings support a distinct molecular mechanism associated with the favorable outcome of NPM1-mutated/FLT3 ITD-negative AML cases. Furthermore, the reported signature might contribute to improved risk stratification and clinical management of AML patients with normal karyotype disease.


Blood ◽  
2008 ◽  
Vol 112 (11) ◽  
pp. 1193-1193
Author(s):  
Lars Bullinger ◽  
Jan Kronke ◽  
Ursula Botzenhardt ◽  
Sabrina Heinrich ◽  
Katja Urlbauer ◽  
...  

Abstract Genome-wide single nucleotide polymorphism (SNP) analyses have revealed uniparental disomy (UPD) to be a common event in cytogenetically normal acute myeloid leukemia (CN-AML) occurring in approximately 20% of cases. Acquired UPD results in copy number neutral loss of heterozygosity (LOH). Comparing matched tumor and germline DNA samples, recurrent acquired UPDs affecting chromosomes 11p and 13q were identified. As DNA microarray-based gene expression profiling (GEP) has recently been shown to powerfully capture the molecular heterogeneity of leukemia, we sought to identify gene expression patterns associated with recurrent UPD in CN-AML. We profiled a set of clinically annotated CN-AML specimens (n=66) entered on a multicenter trial for patients <60 years (AMLSG 07-04) which had been characterized by either 50k or 500k Affymetrix SNP microarrays. All cases were analyzed using Affymetrix microarrays (Human Genome U133 Plus 2.0 Arrays). In this data set we investigated 12 UPDs (affecting chromosomes 1p, 2p, 6p, 11p, 13q and 19q) and applied supervised analyses to define gene-expression patterns associated with UPDs on chromosome 11p and 13q. For the case with an acquired UPD on 19q a gene dosage effect could be demonstrated. Genes located in the 36 Mb large UPD region showed a significantly lower average expression (p<0.001; t-test). Similarly, we observed a gene dosage effect in one of the UPDs observed on chromosome 1 (p=0.0097; t-test), whereas for the other UPDs no significant association between LOH and gene expression levels could be identified. Despite small sample numbers, supervised analyses revealed biologically meaningful gene expression signatures associated with acquired UPD 11p and 13q. In accordance with the association of UPD 13q with FLT3-ITD, the UPD13q gene expression signature was enriched for genes associated with FLT3-ITD. 
The UPD11p expression pattern was characterized by genes found to be down-regulated in CEBPAmut CN-AML cases, such as down-regulation of homeobox genes HOXA9, HOXA10, HOXB2, and MEIS1. Notably, the UPD11p signature was also characterized by the expression of e.g. UGT2B28, P2RX5, PGDS, CAPN1, NDFIP1, and TRIB2, an expression profile that has been shown to be associated with CEBPAmut CN-AML as well as AML cases with epigenetic CEBPA silencing. Thus, our findings represent a starting point to further dissect CN-AML characterized by recurrent UPD, and ongoing analyses will provide additional insights into leukemia biology.


2018 ◽  
Author(s):  
Stefanie Warnat-Herresthal ◽  
Konstantinos Perrakis ◽  
Bernd Taschler ◽  
Matthias Becker ◽  
Lea Seep ◽  
...  

ABSTRACT Acute Myeloid Leukemia (AML) is a severe, mostly fatal hematopoietic malignancy. Despite nearly two decades of promising results using gene expression profiling, international recommendations for diagnosis and differential diagnosis of AML remain based on classical approaches including assessment of morphology, immunophenotyping, cytochemistry, and cytogenetics. Concerns about the translation of whole transcriptome profiling include the robustness of derived predictors when taking into account factors such as study- and site-specific effects and whether achievable levels of accuracy are sufficient for practical use. In the present study, we sought to shed light on these issues via a large-scale analysis using machine learning methods applied to a total of 12,029 samples from 105 different studies. Taking advantage of the breadth of data and the now much improved understanding of high-dimensional modeling, we show that AML can be predicted with high accuracy. High-dimensional approaches - in which multivariate signatures are learned directly from genome-wide data with no prior biological knowledge - are highly effective and robust. We also explore the relationship between predictive signatures, differential expression and known AML-related genes. Taken together, our results support the notion that transcriptome assessment could be used as part of an integrated genomic approach in cancer diagnosis and treatment to be implemented early on for diagnosis and differential diagnosis of AML. One Sentence Summary: Blood gene expression data and machine learning were used to develop robust and accurate classifiers for diagnosis and differential diagnosis of acute myeloid leukemia based on analysis of more than 12,000 samples derived from more than 100 individual studies.


Oncogene ◽  
2005 ◽  
Vol 24 (9) ◽  
pp. 1580-1588 ◽  
Author(s):  
Kai Neben ◽  
Susanne Schnittger ◽  
Benedikt Brors ◽  
Björn Tews ◽  
Felix Kokocinski ◽  
...  
