A Model for Genome-First Care: Returning Secondary Genomic Findings to Participants and Their Healthcare Providers in a Large Research Cohort

2017 ◽  
Author(s):  
Marci L. B. Schwartz ◽  
Cara Zayac McCormick ◽  
Amanda L. Lazzeri ◽  
D’Andra M. Lindbuchler ◽  
Miranda L. G. Hallquist ◽  
...  

Abstract Background Research cohorts with linked genomic data exist, or are being developed, at many research centers. Within any such “sequenced cohort” of more than 100 participants, it is likely that there are participants with previously undisclosed risk for life-threatening monogenic diseases that could be identified with targeted analysis of their existing data. Identification of such disease-associated findings is not usually primary to the enrollment research goals. At Geisinger Health System, MyCode® Community Health Initiative (MyCode) participants represent one such large sequenced cohort. Since 2013, MyCode participants in discovery research have been consented for secondary analysis of their existing research genomic sequences to allow delivery of medically actionable findings to them and their healthcare providers. This return of genomic results program was developed to manage an anticipated 3.5% of MyCode participants, out of more than 150,000 total participants, who will receive clinically confirmed genomic variants from an approved gene list. Risk-associated DNA sequences alone, without any clinical parameter, prompt “genome-first” follow-up encounters. Methods This article describes our process for generating clinical grade results from research-based genomic sequencing data, delivering results to patients and their providers, facilitating targeted clinical evaluations of patients and promoting cascade testing of at-risk relatives. We also summarize our early data about the results generated during this process and our ability to contact patients and their providers to disclose the information. Results This process has been used to generate 343 results on 339 patients. 93% of patients with a result have been successfully contacted about their results, as evidenced by direct interaction about their result with the research team or a healthcare provider. 222 healthcare providers have been notified of a result on one or more patients through this result delivery process. Conclusions Here we describe the existing GHS model to deliver genomic data into the electronic medical record and the clinical interactions that are prompted and supported. Elements of this genome-first care model can be applied in other healthcare settings and in national efforts, such as “All of Us”, that wish to establish programs for returning genomic results to research participants.

Algorithms ◽  
2020 ◽  
Vol 13 (6) ◽  
pp. 151
Author(s):  
Bruno Carpentieri

The memory and network traffic consumed by newly sequenced biological data have grown sharply in recent years. Genomic projects such as HapMap and 1000 Genomes have contributed to the very large rise of databases and network traffic related to genomic data and to the development of new efficient technologies. Large-scale sequencing of DNA samples has attracted new attention and produced new research, and the scientific community's interest in genomic data has greatly increased. In a very short time, researchers have developed hardware tools, analysis software, algorithms, private databases, and infrastructures to support research in genomics. In this paper, we analyze different approaches for compressing digital files generated by Next-Generation Sequencing tools containing nucleotide sequences, and we discuss and evaluate the compression performance of generic compression algorithms by comparing them with Quip, a system designed by Jones et al. specifically for genomic file compression. Moreover, we present a simple but effective technique for the compression of DNA sequences in which we only consider the relevant DNA data, and we experimentally evaluate its performance.
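To make the idea of compressing only the nucleotide data concrete, here is a minimal sketch of generic 2-bit packing, which stores four bases per byte. It is an illustration under stated assumptions, not the technique evaluated in the paper and not how Quip works: the function names `pack` and `unpack` are hypothetical, and the input is assumed to contain only the letters A, C, G and T.

```python
# Hypothetical illustration: generic 2-bit packing of nucleotides.
# Assumes the sequence contains only A, C, G, T (no N, no quality scores).
CODES = {"A": 0, "C": 1, "G": 2, "T": 3}
BASES = "ACGT"


def pack(seq):
    """Pack an A/C/G/T string into bytes, four bases per byte."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        chunk = seq[i:i + 4]
        byte = 0
        for base in chunk:
            byte = (byte << 2) | CODES[base]
        # Left-align a short final chunk so decoding reads bases in order.
        byte <<= 2 * (4 - len(chunk))
        out.append(byte)
    return bytes(out)


def unpack(data, length):
    """Recover the first `length` bases from packed bytes."""
    bases = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            bases.append(BASES[(byte >> shift) & 3])
    # The original length must be stored separately to drop padding bases.
    return "".join(bases[:length])
```

Calling `unpack(pack("GATTACA"), 7)` round-trips the sequence using 2 bytes instead of 7. A general-purpose compressor can still be applied to the packed bytes afterwards.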


Genes ◽  
2021 ◽  
Vol 12 (1) ◽  
pp. 120
Author(s):  
Yiyun Sun ◽  
Dandan Xu ◽  
Chundong Zhang ◽  
Yitao Wang ◽  
Lian Zhang ◽  
...  

We previously demonstrated that proline-rich protein 11 (PRR11) and spindle and kinetochore associated 2 (SKA2) constituted a head-to-head gene pair driven by a prototypical bidirectional promoter. This gene pair synergistically promoted the development of non-small cell lung cancer. However, the signaling pathways leading to the ectopic expression of this gene pair remain obscure. In the present study, we first analyzed the lung squamous cell carcinoma (LSCC) relevant RNA sequencing data from The Cancer Genome Atlas (TCGA) database using the correlation analysis of gene expression and gene set enrichment analysis (GSEA), which revealed that the PRR11-SKA2 correlated gene list highly resembled the Hedgehog (Hh) pathway activation-related gene set. Subsequently, GLI1/2 inhibitor GANT-61 or GLI1/2-siRNA inhibited the Hh pathway of LSCC cells, concomitantly decreasing the expression levels of PRR11 and SKA2. Furthermore, the mRNA expression profile of LSCC cells treated with GANT-61 was detected using RNA sequencing, displaying 397 differentially expressed genes (203 upregulated genes and 194 downregulated genes). Among them, one gene set, including BIRC5, NCAPG, CCNB2, and BUB1, was involved in cell division and interacted with both PRR11 and SKA2. These genes were verified as downregulated genes via RT-PCR, and their high expression significantly correlated with shorter overall survival of LSCC patients. Taken together, our results indicate that GLI1/2 mediates the expression of the PRR11-SKA2-centric gene set that serves as an unfavorable prognostic indicator for LSCC patients, opening the way to new combinatorial diagnostic and therapeutic strategies in LSCC.


Author(s):  
Varshita Chirumamilla ◽  
Joseph M. Gerard ◽  
Alison E. Sweeney ◽  
Kristin P. Tully ◽  
Alison M. Stuebe ◽  
...  

Assessing hospital environment conditions is necessary for healthcare providers and patients to coordinate safe care. The aims of this research included: a) identifying patterns in hospital visit feedback transcripts regarding bathroom doors and lights in the hospital room and b) interpreting the results to make recommendations for more enabling clinical environments. The methods used by the research team included organizing transcript data, assigning codes, and conducting an interrater reliability test to assess codebook efficacy. Finally, working with maternal and infant mortality experts, recommendations for the hospital were developed. We identified four possible interventions to address barriers: a) implement low-height, dimmable lighting along the base of the patient room, b) provide personal lights, such as penlights, to staff for nighttime assessments, c) install and improve on existing grab bars in patient room bathrooms and d) replace the standard patient room bathroom door with a different kind of auditory/visual privacy barrier.


2021 ◽  
Vol 14 (1) ◽  
Author(s):  
Francisco Macías ◽  
Raquel Afonso-Lehmann ◽  
Patricia E. Carreira ◽  
M. Carmen Thomas

Abstract Background Trypanosomatid genomes are colonized by active and inactive mobile DNA elements, such as LINE, SINE-like, SIDER and DIRE retrotransposons. These elements all share a 77-nucleotide-long sequence at their 5′ ends, known as Pr77, which activates transcription, thereby generating abundant unspliced and translatable transcripts. However, transcription factors that mediate this process have still not been reported. Methods TATA-binding protein (TBP) and small nuclear RNA-activating protein 50 kDa (SNAP50) recombinant proteins and specific antibodies raised against them were generated. Protein capture assays, electrophoretic mobility-shift assays (EMSA) and EMSA competition assays, carried out using these proteins and nuclear proteins of the parasite together with specific DNA sequences used as probes, allowed detection of direct interaction of these transcription factors with the Pr77 sequence. Results This study identified TBP and SNAP50 as part of the DNA-protein complex formed by the Pr77 promoter sequence and nuclear proteins of Trypanosoma cruzi. TBP establishes direct and specific contact with the Pr77 sequence, where the DPE and DPE downstream regions are docking sites with preferential binding. TBP binds cooperatively (Hill coefficient = 1.67) to Pr77 and to both strands of the Pr77 sequence, while the conformation of this highly structured sequence is not involved in TBP binding. Direct binding of SNAP50 to the Pr77 sequence is weak and may be mediated by protein–protein interactions through other trypanosomatid nuclear proteins. Conclusions Identification of the transcription factors that mediate Pr77 transcription may help to elucidate how these retrotransposons are mobilized within the trypanosomatid genomes and their roles in gene regulation processes in this human parasite.
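The reported Hill coefficient can be read through the standard Hill equation, in which a coefficient above 1 indicates positive cooperativity. The sketch below only illustrates that relationship. The function name `hill_fraction_bound` and the half-saturation parameter `k_half` are assumptions introduced here, and the abstract gives no binding constants, so no concentrations from the study are used.

```python
def hill_fraction_bound(ligand, k_half, n=1.67):
    """Fraction of binding sites occupied under the Hill equation.

    ligand : free protein concentration (arbitrary units, hypothetical)
    k_half : concentration giving half-maximal binding (hypothetical)
    n      : Hill coefficient; 1.67 is the value reported in the abstract
    """
    return ligand ** n / (k_half ** n + ligand ** n)


# Compare cooperative binding (n = 1.67) with non-cooperative binding (n = 1)
# at concentrations expressed as multiples of k_half.
for multiple in (0.5, 1.0, 2.0):
    cooperative = hill_fraction_bound(multiple, 1.0)
    independent = hill_fraction_bound(multiple, 1.0, n=1.0)
    print(f"{multiple:>4} x k_half: n=1.67 -> {cooperative:.3f}, n=1 -> {independent:.3f}")
```

Both curves pass through one half at `k_half`. With n = 1.67 the occupancy is lower below that point and higher above it, which is the steeper response that cooperative binding implies.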


2021 ◽  
Vol 54 (1) ◽  
pp. 1-22
Author(s):  
Rayan Chikhi ◽  
Jan Holub ◽  
Paul Medvedev

The analysis of biological sequencing data has been one of the biggest applications of string algorithms. The approaches used in many such applications are based on the analysis of k-mers, which are short fixed-length strings present in a dataset. While these approaches are rather diverse, storing and querying a k-mer set has emerged as a shared underlying component. A set of k-mers has unique features and applications that, over the past 10 years, have resulted in many specialized approaches for its representation. In this survey, we give a unified presentation and comparison of the data structures that have been proposed to store and query a k-mer set. We hope this survey will serve as a resource for researchers in the field as well as make the area more accessible to researchers outside the field.
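As a baseline for what "storing and querying a k-mer set" means, the following sketch uses a plain Python hash set. It is not one of the specialized data structures the survey compares. The function names are hypothetical, and collapsing each k-mer with its reverse complement (the "canonical" form) is a common convention assumed here rather than something stated in the abstract.

```python
# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(kmer):
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    reverse_complement = kmer.translate(COMPLEMENT)[::-1]
    return min(kmer, reverse_complement)


def build_kmer_set(sequences, k):
    """Collect the canonical k-mers of every sequence into a set."""
    kmers = set()
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            kmers.add(canonical(seq[i:i + k]))
    return kmers


def contains(kmers, kmer):
    """Membership query that ignores strand orientation."""
    return canonical(kmer) in kmers
```

For example, `build_kmer_set(["ACGTAC"], 3)` holds two canonical 3-mers, and `contains` answers true for both `"CGT"` and its reverse complement `"ACG"`. A hash set is simple but memory-hungry, which is the kind of cost that motivates more compact representations.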


BMJ Open ◽  
2021 ◽  
Vol 11 (1) ◽  
pp. e041108
Author(s):  
Isabella Joy de Vere Hunt ◽  
Abigail McNiven ◽  
Amanda Roberts ◽  
Himesh Parmar ◽  
Tess McPherson

Background There is little qualitative research in the UK focussing on adolescents’ experience of their healthcare providers, and inflammatory skin conditions are a common health problem in adolescence. Aim To explore the experiences of adolescents with eczema and psoriasis with healthcare professionals, and to distil the participants’ key messages for their healthcare providers. Design This is a secondary thematic analysis of interviews with adolescents with eczema or psoriasis. Participants There were a total of 41 text transcripts of interviews with young people with eczema or psoriasis who had given permission for secondary analysis; 23 of the participants had eczema, and 18 psoriasis. Participants were living in the UK at time of interview, and aged 15–24 years old. Results We have distilled the following key messages from young people with eczema and psoriasis for healthcare providers: (1) address the emotional impact; (2) give more information, with the subtheme and (3) appreciate patient research. We identified the following eczema-specific themes: (ECZ-4) ‘It’s not taken seriously’; (ECZ-5) offer choice in treatment and (ECZ-6) lack of structure/conflicting advice. Two psoriasis-specific themes were identified: (PSO-4) feeling dehumanised/treat me as a person; and (PSO-5) think about how treatments will affect daily life. Conclusion This qualitative data analysis highlights the need for greater recognition of the emotional impact of skin disease in adolescence, and for more comprehensive provision of information about the conditions. We call for greater sensitivity and flexibility in our approach to adolescents with skin disease, with important implications for healthcare delivery to this group.


2012 ◽  
Vol 58 (10) ◽  
pp. 1467-1475 ◽  
Author(s):  
Kwan-Wood G Lam ◽  
Peiyong Jiang ◽  
Gary J W Liao ◽  
K C Allen Chan ◽  
Tak Y Leung ◽  
...  

Abstract BACKGROUND A genomewide genetic and mutational profile of a fetus was recently determined via deep sequencing of maternal plasma DNA. This technology could have important applications for noninvasive prenatal diagnosis (NIPD) of many monogenic diseases. Relative haplotype dosage (RHDO) analysis, a core step of this procedure, would allow one to elucidate the maternally inherited half of the fetal genome. For clinical applications, the cost and complexity of data analysis might be reduced via targeted application of this approach to selected genomic regions containing disease-causing genes. There is thus a need to explore the feasibility of performing RHDO analysis in a targeted manner. METHODS We performed target enrichment by using solution-phase hybridization followed by massively parallel sequencing of the β-globin gene region in 2 families undergoing prenatal diagnosis for β-thalassemia. We used digital PCR strategies to physically deduce parental haplotypes. Finally, we performed RHDO analysis with target-enriched sequencing data and parental haplotypes to reveal the β-thalassemic status for the fetuses. RESULTS A mean sequencing depth of 206-fold was achieved in the β-globin gene region by targeted sequencing of maternal plasma DNA. RHDO analysis was successful for the sequencing data obtained from the target-enriched samples, including a region in one of the families in which the parents had similar haplotype structures. Data analysis revealed that both fetuses were heterozygous carriers of β-thalassemia. CONCLUSIONS Targeted sequencing of maternal plasma DNA for NIPD of monogenic diseases is feasible.


2021 ◽  
pp. e001700
Author(s):  
Sarah Godby ◽  
R Dierst-Davies ◽  
D Kogut ◽  
L Degiorgi Winslow ◽  
M M Truslow ◽  
...  

Background Electronic cigarette (or e-cigarette) use has grown substantially since its US market introduction in 2007. Although e-cigarettes are marketed as a safer alternative to traditional cigarettes, studies have shown they can also be a gateway to traditional cigarette use. The purpose of this investigation is to identify factors associated with different patterns of tobacco use among active duty military personnel. Methods A secondary analysis was conducted using the 2014 Defense Health Agency Health Related Behaviors survey data. Results are based on 45 986 US military respondents, weighted to 1 251 606. Both univariate and regression analyses were conducted to identify correlates. Results In 2014, approximately 7.8% of respondents reported using e-cigarettes at least once in the past year. Among e-cigarette users, 49% reported exclusive e-cigarette use. Prevalence of exclusive use is highest among white people (58%), Navy (33%), men (83%) and persons with income ≤$45 000 (65%). Regression comparing exclusive cigarette with exclusive e-cigarette users revealed higher odds of being Air Force (OR=2.19; CI 1.18 to 4.06) or Navy (OR=2.25; CI 1.14 to 4.41) personnel, of being male (OR=1.72; CI 1.12 to 2.64), and of not receiving smoking cessation messaging from healthcare providers in the last 12 months (OR=2.88; CI 1.80 to 4.62). When comparing exclusive e-cigarette users with poly-tobacco users, e-cigarette users had higher odds of being Hispanic (OR=2.20; CI 1.02 to 4.78), college educated (OR=4.25; CI 1.22 to 14.84) and not receiving tobacco prevention/cessation messaging (OR=4.80; CI 2.79 to 8.27). Conclusion The results demonstrate that exclusive e-cigarette users in the military have unique characteristics when compared with groups of other/mixed tobacco users. Findings can inform cessation and prevention efforts to improve both the overall health and combat readiness of active duty military personnel.


2020 ◽  
Author(s):  
Maxence Queyrel ◽  
Edi Prifti ◽  
Jean-Daniel Zucker

Abstract Analysis of the human microbiome using metagenomic sequencing data has demonstrated high ability in discriminating various human diseases. Raw metagenomic sequencing data require multiple complex and computationally heavy bioinformatics steps prior to data analysis. Such data contain millions of short sequences read from the fragmented DNA sequences and are stored as fastq files. Conventional processing pipelines consist of multiple steps including quality control, filtering, and alignment of sequences against genomic catalogs (genes, species, taxonomic levels, functional pathways, etc.). These pipelines are complex to use, time consuming and rely on a large number of parameters that often introduce variability and impact the estimation of the microbiome elements. Recent studies have demonstrated that training Deep Neural Networks directly from raw sequencing data is a promising approach to bypass some of the challenges associated with mainstream bioinformatics pipelines. Most of these methods use the concept of word and sentence embeddings that create a meaningful and numerical representation of DNA sequences, while extracting features and reducing the dimensionality of the data. In this paper we present an end-to-end approach that classifies patients into disease groups directly from raw metagenomic reads: metagenome2vec. This approach is composed of four steps: (i) generating a vocabulary of k-mers and learning their numerical embeddings; (ii) learning DNA sequence (read) embeddings; (iii) identifying the genome from which the sequence is most likely to come and (iv) training a multiple instance learning classifier which predicts the phenotype based on the vector representation of the raw data. An attention mechanism is applied in the network so that the model can be interpreted, assigning a weight to the influence of each genome on the prediction. Using two public real-life datasets as well as a simulated one, we demonstrated that this original approach reached very high performance, comparable with the state-of-the-art methods applied directly on processed data through mainstream bioinformatics workflows. These results are encouraging for this proof of concept work. We believe that with further dedication, the DNN models have the potential to surpass mainstream bioinformatics workflows in disease classification tasks.
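The first half of step (i), turning reads into a vocabulary of k-mer "words", can be sketched as follows. This is a hedged illustration rather than the metagenome2vec implementation: the function names, the overlapping stride-1 windows, and the `unknown` placeholder id are all assumptions made here. The embedding learning itself is not shown.

```python
def build_vocabulary(reads, k):
    """Assign an integer id to each distinct k-mer, in order of first appearance."""
    vocab = {}
    for read in reads:
        for i in range(len(read) - k + 1):
            # len(vocab) is evaluated before insertion, giving ids 0, 1, 2, ...
            vocab.setdefault(read[i:i + k], len(vocab))
    return vocab


def tokenize(read, vocab, k, unknown=-1):
    """Convert a read into the list of ids of its overlapping k-mers.

    k-mers absent from the vocabulary map to `unknown` (a hypothetical choice).
    """
    return [vocab.get(read[i:i + k], unknown) for i in range(len(read) - k + 1)]
```

With `vocab = build_vocabulary(["ACGTA", "CGTAC"], 3)`, the read `"GTACG"` tokenizes to `[2, 3, 0]`. Sequences of ids like these are the usual input to a word-embedding model, which would then assign each k-mer a numerical vector.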


2019 ◽  
Author(s):  
Kate Chkhaidze ◽  
Timon Heide ◽  
Benjamin Werner ◽  
Marc J. Williams ◽  
Weini Huang ◽  
...  

Abstract Quantification of the effect of spatial tumour sampling on the patterns of mutations detected in next-generation sequencing data is largely lacking. Here we use a spatial stochastic cellular automaton model of tumour growth that accounts for somatic mutations, selection, drift and spatial constraints, to simulate multi-region sequencing data derived from spatial sampling of a neoplasm. We show that the spatial structure of a solid cancer has a major impact on the detection of clonal selection and genetic drift from bulk sequencing data and single-cell sequencing data. Our results indicate that spatial constraints can introduce significant sampling biases when performing multi-region bulk sampling and that such bias becomes a major confounding factor for the measurement of the evolutionary dynamics of human tumours. We present a statistical inference framework that takes into account the spatial effects of a growing tumour and allows inferring the evolutionary dynamics from patient genomic data. Our analysis shows that measuring cancer evolution using next-generation sequencing while accounting for the numerous confounding factors requires a mechanistic model-based approach that captures the sources of noise in the data. Summary Sequencing the DNA of cancer cells from human tumours has become one of the main tools to study cancer biology. However, sequencing data are complex and often difficult to interpret. In particular, the way in which the tissue is sampled and the data are collected impacts the interpretation of the results significantly. We argue that understanding cancer genomic data requires mathematical models and computer simulations that tell us what we expect the data to look like, with the aim of understanding the impact of confounding factors and biases in the data generation step. In this study, we develop a spatial simulation of tumour growth that also simulates the data generation process, and demonstrate that biases in the sampling step and current technological limitations severely impact the interpretation of the results. We then provide a statistical framework that can be used to overcome these biases and more robustly measure aspects of the biology of tumours from the data.
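A toy version of a spatial growth model with regional sampling can be sketched as below. It is a deliberately simplified assumption-laden illustration and not the authors' cellular automaton: there is no selection or cell death, every division adds exactly one new mutation to the daughter only, and the names `grow_tumour` and `bulk_sample`, the grid size and the seed are all hypothetical.

```python
import random
from collections import Counter


def grow_tumour(n_cells, size=41, seed=0):
    """Grow a 2D tumour on a size x size grid by random division into empty neighbours.

    Returns a dict mapping (x, y) -> frozenset of mutation ids.
    Requires n_cells <= size * size so that growth can finish.
    """
    rng = random.Random(seed)
    centre = (size // 2, size // 2)
    cells = {centre: frozenset([0])}  # founder carries clonal mutation 0
    next_mutation = 1
    while len(cells) < n_cells:
        x, y = rng.choice(list(cells))
        empty = [(x + dx, y + dy)
                 for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))
                 if 0 <= x + dx < size and 0 <= y + dy < size
                 and (x + dx, y + dy) not in cells]
        if not empty:
            continue  # spatial constraint: enclosed cells cannot divide
        daughter = rng.choice(empty)
        # Daughter inherits the parent's mutations plus one new private mutation.
        cells[daughter] = cells[(x, y)] | {next_mutation}
        next_mutation += 1
    return cells


def bulk_sample(cells, x_range, y_range):
    """Mutation frequencies among cells inside a rectangular region (inclusive bounds)."""
    sampled = [muts for (x, y), muts in cells.items()
               if x_range[0] <= x <= x_range[1] and y_range[0] <= y <= y_range[1]]
    if not sampled:
        return {}
    counts = Counter(m for muts in sampled for m in muts)
    return {m: c / len(sampled) for m, c in counts.items()}
```

Comparing `bulk_sample` over the whole grid with a sample restricted to one half shows how the same tumour yields different mutation frequency profiles depending on where it is sampled, while the founder mutation stays at frequency 1.0 everywhere.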

