scholarly journals To assemble or not to resemble—A validated Comparative Metatranscriptomics Workflow (CoMW)

GigaScience ◽  
2019 ◽  
Vol 8 (8) ◽  
Author(s):  
Muhammad Zohaib Anwar ◽  
Anders Lanzen ◽  
Toke Bang-Andreasen ◽  
Carsten Suhr Jacobsen

Abstract Background: Metatranscriptomics has been used widely for investigation and quantification of microbial communities’ activity in response to external stimuli. By assessing the genes expressed, metatranscriptomics provides an understanding of the interactions between different major functional guilds and the environment. Here, we present a de novo assembly-based Comparative Metatranscriptomics Workflow (CoMW) implemented in a modular, reproducible structure. Metatranscriptomics typically uses short sequence reads, which can either be directly aligned to external reference databases (“assembly-free approach”) or first assembled into contigs before alignment (“assembly-based approach”). We also compare CoMW (assembly-based implementation) with an assembly-free alternative workflow, using simulated and real-world metatranscriptomes from Arctic and temperate terrestrial environments. We evaluate their accuracy in precision and recall using generic and specialized hierarchical protein databases. Results: CoMW provided significantly fewer false-positive results, resulting in more precise identification and quantification of functional genes in metatranscriptomes. Using the comprehensive database M5nr, the assembly-based approach identified genes with only 0.6% false-positive results at thresholds ranging from inclusive to stringent compared with the assembly-free approach, which yielded up to 15% false-positive results. Using specialized databases (carbohydrate-active enzyme and nitrogen cycle), the assembly-based approach identified and quantified genes with 3–5 times fewer false-positive results. We also evaluated the impact of both approaches on real-world datasets. Conclusions: We present an open source de novo assembly-based CoMW. Our benchmarking findings support assembling short reads into contigs before alignment to a reference database because this provides higher precision and minimizes false-positive results.
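The precision and recall comparison described in this abstract can be illustrated with a minimal sketch. The gene identifiers and the two prediction sets below are hypothetical toy data, not results from the paper, and `precision_recall` is an illustrative helper rather than part of CoMW.

```python
def precision_recall(predicted, truth):
    """Compute precision and recall for a set of predicted gene annotations.

    predicted: iterable of gene identifiers reported by a workflow
    truth:     iterable of gene identifiers known to be present (e.g. simulated)
    """
    predicted, truth = set(predicted), set(truth)
    true_pos = len(predicted & truth)    # correctly identified genes
    false_pos = len(predicted - truth)   # reported but not actually present
    false_neg = len(truth - predicted)   # present but missed
    precision = true_pos / (true_pos + false_pos) if predicted else 0.0
    recall = true_pos / (true_pos + false_neg) if truth else 0.0
    return precision, recall


# Hypothetical toy data: a simulated community with four expressed genes.
truth = {"geneA", "geneB", "geneC", "geneD"}
assembly_based = {"geneA", "geneB", "geneC"}                             # misses one gene
assembly_free = {"geneA", "geneB", "geneC", "geneD", "geneX", "geneY"}   # two false positives

for name, calls in [("assembly-based", assembly_based), ("assembly-free", assembly_free)]:
    p, r = precision_recall(calls, truth)
    print(f"{name}: precision={p:.2f} recall={r:.2f}")
```

In this toy case the assembly-based set trades some recall for perfect precision, which mirrors the direction of the trade-off the abstract reports, though the magnitudes here are invented.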

2019 ◽  
Author(s):  
Muhammad Zohaib Anwar ◽  
Anders Lanzen ◽  
Toke Bang-Andreasen ◽  
Carsten Suhr Jacobsen

Abstract Background: Metatranscriptomics has been used widely for investigation and quantification of microbial communities’ activity in response to external stimuli. By assessing the genes expressed, metatranscriptomics provides an understanding of the interactions between different major functional guilds and the environment. Here, we present a de novo assembly-based Comparative Metatranscriptomics Workflow (CoMW) implemented in a modular, reproducible structure, significantly improving the annotation and quantification of metatranscriptomes. Metatranscriptomics typically uses short sequence reads, which can either be directly aligned to external reference databases (“assembly-free approach”) or first assembled into contigs before alignment (“assembly-based approach”). We also compare CoMW (assembly-based implementation) with an assembly-free alternative workflow, using simulated and real-world metatranscriptomes from Arctic and temperate terrestrial environments. We evaluate their accuracy in precision and recall using generic and specialized hierarchical protein databases. Results: CoMW provided significantly fewer false positives, resulting in more precise identification and quantification of functional genes in metatranscriptomes. Using the comprehensive database M5nr, the assembly-based approach identified genes with only 0.6% false positives at thresholds ranging from inclusive to stringent, compared with the assembly-free approach, which yielded up to 15% false positives. Using specialized databases (carbohydrate-active enzyme and nitrogen cycle), the assembly-based approach identified and quantified genes with 3-5 times fewer false positives. We also evaluated the impact of both approaches on real-world datasets. Conclusions: We present an open source de novo assembly-based Comparative Metatranscriptomics Workflow (CoMW). Our benchmarking findings support assembling short reads into contigs before alignment to a reference database, since this provides higher precision and minimizes false positives.


Medicine ◽  
2019 ◽  
Vol 98 (40) ◽  
pp. e17451 ◽  
Author(s):  
Mari Carmen Bernal-Soriano ◽  
Lucy A. Parker ◽  
Maite López-Garrigos ◽  
Ildefonso Hernández-Aguado ◽  
Juan P. Caballero-Romeu ◽  
...  

2020 ◽  
Vol 2 (3) ◽  
pp. 153-159
Author(s):  
Dr. V. Suma

Demand for refurbished products in the Indian e-commerce market has increased over the last decade. Despite this demand, very little research has been done in this domain. The conventional statistical models evaluated in existing research often ignore the real-world business environment, market factors, and varying customer behavior of the online market. In this paper, we carry out an extensive analysis of the Indian e-commerce market using a data-mining approach to predict demand for refurbished electronics. The impact of real-world factors on the demand and the variables is also analyzed. Real-world datasets from three randomly chosen e-commerce websites are considered for analysis. Data accumulation, processing, and validation are carried out by means of efficient algorithms. The results of this analysis indicate that highly accurate predictions can be made with the proposed approach despite the impact of varying customer behavior and market factors. The results are represented graphically and can be used for further analysis of the market and the launch of new products.


Author(s):  
Owen R. Baker ◽  
M. Kate Grabowski ◽  
Ronald M. Galiwango ◽  
Aminah Nalumansi ◽  
Jennifer Serwanga ◽  
...  

Background: We assessed the performance of the CoronaCHEK lateral flow assay on samples from Uganda and Baltimore to determine the impact of geographic origin on assay performance. Methods: Plasma samples from SARS-CoV-2 PCR+ individuals (Uganda: 78 samples from 78 individuals; Baltimore: 266 samples from 38 individuals) and from pre-pandemic individuals (Uganda: 1077; Baltimore: 532) were evaluated. Prevalence ratios (PR) were calculated to identify factors associated with a false-positive test. Results: In Ugandan samples, sensitivity after the first positive PCR was 45% (95% CI 24, 68) at 0-7 days; 79% (95% CI 64, 91) at 8-14 days; and 76% (95% CI 50, 93) at >15 days. In samples from Baltimore, sensitivity was 39% (95% CI 30, 49) at 0-7 days; 86% (95% CI 79, 92) at 8-14 days; and 100% (95% CI 89, 100) at 15 days post positive PCR. The specificity of 96.5% (95% CI 95.2, 97.5) in Ugandan samples was significantly lower than the 99.3% (95% CI 98.1, 99.8) in samples from Baltimore (p<0.01). In Ugandan samples, individuals with a false-positive result were more likely to be male (PR 2.04, 95% CI 1.03, 3.69) or to have had a fever more than a month prior to sample acquisition (PR 2.87, 95% CI 1.12, 7.35). Conclusions: Sensitivity of the CoronaCHEK was similar in samples from Uganda and Baltimore. The specificity was significantly lower in Ugandan samples than in Baltimore samples. False-positive results in Ugandan samples appear to correlate with a recent history of a febrile illness, potentially indicative of a cross-reactive immune response in individuals from East Africa.
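The quantities reported in this abstract (sensitivity, specificity, confidence intervals, and prevalence ratios) can be sketched as follows. The abstract does not state which confidence-interval method the authors used; the Wilson score interval here is an assumption chosen for illustration, and all counts in the example are hypothetical rather than taken from the study.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for ~95%).

    Assumption: the source does not name its CI method; Wilson is used
    here only as a common, well-behaved choice.
    """
    p = successes / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return centre - half, centre + half


def prevalence_ratio(exposed_pos, exposed_total, unexposed_pos, unexposed_total):
    """Ratio of outcome prevalence in an exposed group to an unexposed group."""
    return (exposed_pos / exposed_total) / (unexposed_pos / unexposed_total)


# Hypothetical counts, for illustration only.
true_negatives, negatives_tested = 95, 100   # pre-pandemic samples testing negative
specificity = true_negatives / negatives_tested
low, high = wilson_interval(true_negatives, negatives_tested)
print(f"specificity={specificity:.3f} (95% CI {low:.3f}, {high:.3f})")

# False-positive prevalence in a hypothetical exposed vs unexposed group.
print(f"PR={prevalence_ratio(10, 100, 5, 100):.2f}")
```

Sensitivity is computed the same way, using true positives among PCR-positive samples in place of true negatives among pre-pandemic samples.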


Blood ◽  
2020 ◽  
Vol 136 (Supplement 1) ◽  
pp. 49-50
Author(s):  
Gee Youn (Geeny) Kim ◽  
Jamie L. Koprivnikar ◽  
Rebecca Testi ◽  
Tara McCabe ◽  
Grace Perry ◽  
...  

Background Patients with secondary acute myeloid leukemia (sAML) have poor outcomes compared to those with de novo AML. In 2017, liposomal daunorubicin and cytarabine (CPX-351) was FDA approved for the treatment of adults with newly diagnosed AML with myelodysplasia-related change (AML-MRC) or therapy-related AML (t-AML). In its landmark trial, CPX-351 has displayed significant improvement in overall survival (OS) compared to conventional 7+3 in patients 60-75 years of age with sAML. Gaps remain in the literature regarding the clinical use of CPX-351 in context of the FDA approved label. Here we evaluate real-world outcomes with disease response and molecular monitoring in patients treated with CPX-351. Methods Adults who received CPX-351 between September 2017 and December 2019 were identified. The primary endpoint was overall response rate (ORR), defined by complete remission (CR) and CR with incomplete hematologic recovery (CRi) according to the Revised IWG criteria. Additional outcomes of interest included molecular minimal residual disease (MRD) status post induction as measured by next-generation sequencing (NGS), ORR in patients with baseline TP53, and progression-free survival (PFS) in patients with CR/CRi, with and without MRD after induction. Mutations associated with clonal hematopoiesis (TET2, ASXL1, DNMT3A) were excluded from analysis of molecular MRD. Results Fifty-four patients were identified with baseline characteristics as shown in Table 1. Overall, the study population was elderly with the median age of 64 [IQR: 60-68], and 13 patients were younger than 60 years old. Six patients developed AML in the setting of a pre-existing myeloproliferative neoplasm (MPN). The most common indication for treatment with CPX-351 was antecedent MDS (42.6%), followed by de novo AML with MDS karyotype (24.1%), therapy-related AML (13%), and antecedent MPN (11.1%). 
NGS was performed prior to treatment with CPX-351 in all but one patient, and 88.7% had at least one molecular marker that is not identified as one of the mutations associated with clonal hematopoiesis. Most commonly identified molecular markers were TP53 (16/53, 30.2%), RUNX1 (10/53, 18.9%), SRSF2 (8/53, 15.1%), NRAS (7/53, 13.2%), and IDH2 and JAK2 (6/53, 11.3%, each). Most patients were hospitalized until hematologic recovery. However, 5 patients received induction in the outpatient setting, and an additional 6 patients were discharged early before hematologic recovery. Among the patients who were discharged early or underwent outpatient induction, 81.8% (9/11) were admitted for a complication. There were no deaths associated with outpatient induction. Overall, 46 patients (85.2%) experienced febrile neutropenia and 17 patients (31.5%) had bacteremia. Thirty-day and 60-day mortality were 9.3% and 14.8%, respectively. The ORR was 54%, and the response rates observed in patients who were younger vs older than 60 years were similar (41.7% vs. 57.9%, p=0.508). In patients who achieved a remission after induction, 56% (14/25) were MRD positive by NGS. Among those who had TP53 mutation at baseline, 14 were available for response assessment after induction. The ORR in this subgroup was 57% (8/14) and all but 3 (63%) were MRD negative by NGS. Consolidation with allogeneic transplant was performed in 18 patients (33%). Median OS was 10.4 mos. Median OS was similar for patients older or younger than 60 years (p=0.76). For patients achieving a CR/CRi, median OS had not been reached at the time of analysis but was significantly improved compared to those with refractory disease (6.1 mos, p=0.0007). Median OS or PFS did not differ significantly (p=0.68) based on MRD negativity (Figure 1). Conclusion This analysis demonstrates comparable response rates to the landmark trial (54% in our analysis vs. 47.7%). 
Outpatient induction and/or early discharge was safe and feasible in appropriately selected patients. While this analysis is limited by the small sample size, CPX-351 appeared effective in populations that were not included in the published randomized studies, such as patients younger than 60 years and those with antecedent MPN. Remission rates and MRD clearance were high among TP53 mutants. A considerable number of patients who achieved a remission remained MRD positive by NGS, but this did not impact PFS. Future studies should evaluate the impact of molecular MRD and allele frequency to further guide treatment. Disclosures Koprivnikar: Alexion: Speakers Bureau; BMS: Speakers Bureau; Novartis: Speakers Bureau; Amgen: Speakers Bureau. McCloskey: Takeda: Consultancy, Honoraria, Speakers Bureau; Novartis: Speakers Bureau; Abbvie: Speakers Bureau; Amgen: Consultancy, Speakers Bureau; BMS: Consultancy, Honoraria, Speakers Bureau; Jazz: Consultancy, Honoraria, Speakers Bureau.


Blood ◽  
2021 ◽  
Vol 138 (Supplement 1) ◽  
pp. 2310-2310
Author(s):  
Alex Legg ◽  
Pesheya Doubleday ◽  
Adam Reich ◽  
Alexandrina Lambova ◽  
Greg Medalla

Abstract Introduction: CPX-351 (US: Vyxeos ®; Europe: Vyxeos ® Liposomal) is a dual-drug liposomal encapsulation of daunorubicin and cytarabine in a synergistic 1:5 molar ratio. Since November 2018, the National Institute for Health and Care Excellence (NICE) has recommended its use for adults with newly diagnosed, therapy-related AML (t-AML) or AML with myelodysplasia-related changes (AML-MRC) due to either prior myelodysplastic syndrome (MDS)/chronic myelomonocytic leukemia (CMML) or de novo AML with myelodysplasia-related cytogenetic changes. The key aims of this study were to utilize the Cancer Analysis System (CAS) database available through the National Cancer Registration and Analysis Service (NCRAS) to describe the demographics and clinical characteristics of adults with AML in England who have received CPX-351, as well as to estimate overall survival (OS) and survival within stratifications of interest. Methods: The NCRAS systematically collects and curates population-level data about cancer diagnoses, treatments, and outcomes across England. Adults (aged ≥18 years) diagnosed with AML and treated with CPX-351 were included in this study. A diagnosis of t-AML or AML-MRC between January 2013 and March 2020 was determined either directly using International Classification of Diseases for Oncology, Third Edition (ICD-O-3) codes or indirectly using non-specific ICD-O-2, ICD-O-3, or ICD-10 AML codes in combination with either prior systemic anticancer therapy or radiotherapy (t-AML) or a prior diagnosis of MDS or CMML (AML-MRC; other AML-MRC subtypes could not be specifically identified and are included within the de novo AML subgroup). OS was measured from the date of diagnosis; a separate analysis of OS landmarked from the date of hematopoietic cell transplant (HCT) was also performed. Within this preliminary analysis, no OS adjustments have been made to account for any COVID-19-related deaths. 
Results: A total of 172 patients with AML who were treated with CPX-351 were identified: 37 (22%) had t-AML, 57 (33%) had AML-MRC, and 78 (45%) had de novo AML. At diagnosis, the mean (standard deviation) age was 62.8 years (10.1), with 49/172 (28%) patients aged <60 years; 66% of patients were male; 87% were white; and most had an Eastern Cooperative Oncology Group performance status of 0 or 1 (68%). Six (3%) patients had received azacitidine treatment for a prior malignancy. To date, 43/172 (25%) patients had undergone HCT overall, including 43/97 (44%) patients with ≥3 months of follow-up. The cut-off date for OS was December 31, 2020, giving a median (interquartile range) follow-up of 11.2 months (3.6, 16.9). Overall, 91 patients had died, with an estimated median OS (95% confidence interval [CI]) of 16.6 months (11.0, not estimable) and probability of survival (95% CI) at 1 and 2 years of 0.54 (0.47, 0.62) and 0.39 (0.30, 0.50), respectively (Figure 1). Early mortality rates were 7% at 30 days and 15% at 60 days. When OS was landmarked from the date of HCT, median OS was not reached, with a probability of survival (95% CI) at 1 year of 0.74 (0.62, 0.89; Figure 2). When stratified by age, estimated median OS (95% CI) was not reached for patients aged <60 years and 12.8 months (8.9, 17.6) for patients aged ≥60 years. In a treatment patterns analysis that evaluated second-line treatments after CPX-351, 68 patients died without salvage therapy and 64 were alive without receiving subsequent therapy by the end of the study period. The most common salvage treatments were fludarabine, cytarabine, idarubicin, and granulocyte-colony stimulating factor (FLAG-Ida; n = 15), daunorubicin plus cytarabine (DA)-based therapy (n = 6), and azacitidine alone (n = 7). Of the 43 patients who received an HCT, 6 (14%) underwent HCT following salvage therapy.
Conclusions: This is the largest study to date examining the real-world outcomes for patients with AML who were treated with CPX-351. The estimated median OS of 16.6 months is consistent with reported real-world outcomes for CPX-351 in French and Italian studies. Median OS has not been reached in patients aged <60 years or when landmarked from the date of HCT. Once the CAS database has been updated, these analyses will be repeated to increase follow-up and patient numbers and to determine the impact of COVID-19 on OS following CPX-351 treatment. Disclosures Legg: Jazz Pharmaceuticals: Current Employment, Current equity holder in publicly-traded company. Doubleday: IQVIA Inc., which was contracted by Jazz Pharmaceuticals for the conduct of this analysis: Current Employment. Reich: IQVIA Inc., which was contracted by Jazz Pharmaceuticals for the conduct of this analysis: Current Employment. Lambova: IQVIA Inc., which was contracted by Jazz Pharmaceuticals for the conduct of this analysis: Current Employment. Medalla: Jazz Pharmaceuticals: Current Employment, Current equity holder in publicly-traded company.
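Survival probabilities at fixed time points and a median OS that may be "not reached" are the kind of outputs a Kaplan-Meier (product-limit) estimator produces. The abstract does not name the estimator used, so the following is only a sketch under that assumption; the follow-up times and event flags are hypothetical, not registry data.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times:  follow-up time per patient (e.g. months from diagnosis)
    events: 1 if the patient died at that time, 0 if censored
    Returns a list of (time, survival_probability) at each death time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        tied = [e for tt, e in data if tt == t]   # everyone leaving the risk set at t
        deaths = sum(tied)
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= len(tied)
        i += len(tied)
    return curve


def median_survival(curve):
    """First time at which survival drops to 0.5 or below, or None if not reached."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Hypothetical follow-up data: four patients, one censored at month 2.
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
print(curve)
print(median_survival(curve))
```

A return value of `None` from `median_survival` corresponds to the "median OS was not reached" wording used when fewer than half of the patients at risk have died within the available follow-up.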


2021 ◽  
Author(s):  
Jeff Mayfield ◽  
Peter Hesse ◽  
David Ledden

The impact of universal transport media (UTM) and viral transport media (VTM) liquid samples on the performance of the Healgen Scientific Rapid COVID-19 Antigen Test was investigated. Twelve different UTM/VTM liquid samples were added at different dilutions to the extraction buffer, and 2 of 12 generated false-positive results. To understand the cause of these false-positive results, the effects of extraction buffer dilution on sample pH, surfactant concentration, and ionic strength were investigated. The most important factor in UTM/VTM liquid sample dilution of the extraction buffer was ionic strength as measured by conductivity. Dilutions with conductivity below ~17 mS/cm can induce a false-positive result. It was also noted that the ionic strength of UTM/VTMs can vary, and those with low ionic strength can be problematic. To rule out the effect of other common components found in UTMs/VTMs, several materials were mixed with extraction buffer and tested at high concentrations. None was shown to produce false-positive results.


2020 ◽  
Vol 2 (2) ◽  
pp. 101-110
Author(s):  
Dr. Suma V.



2021 ◽  
Author(s):  
Beth Signal ◽  
Tim Kahlke

ABSTRACT ORF prediction in de novo assembled transcriptomes is a critical step for RNA-Seq analysis and transcriptome annotation. However, current approaches do not appropriately account for factors such as strand specificity and incompletely assembled transcripts. Strand-specific RNA-Seq libraries should produce assembled transcripts in the correct orientation, and therefore ORFs should only be annotated on the sense strand. Additionally, start site selection is more complex than generally appreciated, as sequences upstream of the first start codon need to be correctly annotated as 5’ UTR in completely assembled transcripts, or as part of the main ORF in incomplete transcripts. Both of these factors influence the accurate annotation of ORFs and therefore of the transcriptome as a whole. We generated four de novo transcriptome assemblies of well-annotated species as a gold-standard dataset to test the impact that strand specificity and start site selection have on ORF prediction in real data. Our results show that prediction of ORFs on the antisense strand in data from stranded RNA libraries results in false-positive ORFs with no or very low similarity to known proteins. In addition, we found that up to 23% of assembled transcripts had no stop codon upstream and in-frame of the first start codon, instead comprising a sequence of upstream codons. We determined the optimal length cutoff for these upstream sequences to accurately classify such transcripts as either complete (upstream sequence is 5’ UTR) or 5’ incomplete (transcript is incompletely assembled and upstream sequence is part of the ORF). Here, we present Borf, the better ORF finder, specifically designed to minimise false-positive ORF prediction in stranded RNA-Seq data and improve the accuracy of ORF start-site prediction. Borf is written in Python3 and freely available at https://github.com/betsig/borf.
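The two ideas in this abstract, scanning only the sense strand and using the length of the sequence upstream of the first start codon to decide between "complete" and "5' incomplete", can be sketched as below. This is not Borf's implementation or API: the function names, the `upstream_cutoff` default (in codons), and the `min_codons` filter are hypothetical choices made for illustration, and the abstract does not state the optimal cutoff value.

```python
STOPS = {"TAA", "TAG", "TGA"}


def classify_segment(segment, has_upstream_stop, upstream_cutoff):
    """Return (index_of_first_orf_codon, label), or None if no ORF is called."""
    if not segment:
        return None
    first_atg = segment.index("ATG") if "ATG" in segment else None
    if has_upstream_stop:
        # An in-frame stop lies upstream, so the ORF must begin at a start codon.
        return None if first_atg is None else (first_atg, "complete")
    # No in-frame upstream stop: the segment runs from the transcript's 5' end.
    if first_atg is not None and first_atg <= upstream_cutoff:
        return (first_atg, "complete")        # short upstream region treated as 5' UTR
    return (0, "5' incomplete")               # upstream codons treated as part of the ORF


def find_sense_orfs(seq, upstream_cutoff=50, min_codons=2):
    """Scan the three forward frames only (sense strand of a stranded library).

    Returns (start_position, orf_sequence, label) tuples; stop codons are excluded.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        codons = [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]
        seg_start = 0
        for idx in range(len(codons) + 1):
            if idx < len(codons) and codons[idx] not in STOPS:
                continue                      # keep extending the current segment
            segment = codons[seg_start:idx]
            hit = classify_segment(segment, seg_start > 0, upstream_cutoff)
            if hit is not None:
                first, label = hit
                orf_codons = segment[first:]
                if len(orf_codons) >= min_codons:
                    start = frame + 3 * (seg_start + first)
                    orfs.append((start, "".join(orf_codons), label))
            seg_start = idx + 1
    return orfs


# Upstream in-frame stop present: the ORF starts at the first ATG.
print(find_sense_orfs("TAAATGAAACCCTAG"))
# No upstream stop and a long upstream region relative to the cutoff: 5' incomplete.
print(find_sense_orfs("GCTGCTGCTATGAAATAG", upstream_cutoff=2))
```

The sketch deliberately never reverse-complements the input, which is the point the abstract makes about stranded libraries: antisense ORF calls are skipped entirely rather than filtered afterwards.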

