Whole genome sequencing identifies a novel factor required for secretory granule maturation in Tetrahymena thermophila

2016 ◽  
Author(s):  
Cassandra Kontur ◽  
Santosh Kumar ◽  
Xun Lan ◽  
Jonathan K Pritchard ◽  
Aaron P Turkewitz

Unbiased genetic approaches have a unique ability to identify novel genes associated with specific biological pathways. Thanks to next generation sequencing, forward genetic strategies can be expanded into a wider range of model organisms. The formation of secretory granules, called mucocysts, in the ciliate Tetrahymena thermophila relies in part on ancestral lysosomal sorting machinery but is also likely to involve novel factors. In prior work, multiple strains with defects in mucocyst biogenesis were generated by nitrosoguanidine mutagenesis and characterized using genetic and cell biological approaches, but the genetic lesions themselves were unknown. Here, we show that analyzing one such mutant by whole genome sequencing reveals a novel factor in mucocyst formation. Strain UC620 has both morphological and biochemical defects in mucocyst maturation, a process analogous to dense core granule maturation in animals. Illumina sequencing of a pool of UC620 F2 clones identified a missense mutation in a novel gene called MMA1 (Mucocyst maturation). The defects in UC620 were rescued by expression of a wildtype copy of MMA1, and disruption of MMA1 in an otherwise wildtype strain generated a phenocopy of UC620. The product of MMA1, characterized as a CFP-tagged copy, is a large soluble cytosolic protein. A small fraction of Mma1p-CFP is pelletable, which may reflect association with endosomes. The gene has no identifiable homologs except in other Tetrahymena species, and therefore represents an evolutionarily recent innovation that is required for granule maturation.

2016 ◽  
Vol 6 (8) ◽  
pp. 2505-2516 ◽  
Author(s):  
Cassandra Kontur ◽  
Santosh Kumar ◽  
Xun Lan ◽  
Jonathan K. Pritchard ◽  
Aaron P. Turkewitz

Genomics ◽  
2020 ◽  
Vol 112 (5) ◽  
pp. 2915-2921 ◽  
Author(s):  
Thiago Mafra Batista ◽  
Heron Oliveira Hilario ◽  
Gabriel Antônio Mendes de Brito ◽  
Rennan Garcias Moreira ◽  
Carolina Furtado ◽  
...  

2021 ◽  
Author(s):  
Einar Gabbasov ◽  
Miguel Moreno-Molina ◽  
Iñaki Comas ◽  
Maxwell Libbrecht ◽  
Leonid Chindelevitch

Abstract
The occurrence of multiple strains of a bacterial pathogen such as M. tuberculosis or C. difficile within a single human host, referred to as a mixed infection, has important implications for both healthcare and public health. However, methods for detecting it, and especially for determining the proportion and identities of the underlying strains, from WGS (whole-genome sequencing) data, have been limited.
In this paper we introduce SplitStrains, a novel method for addressing these challenges. Grounded in a rigorous statistical model, SplitStrains not only demonstrates superior performance in proportion estimation to other existing methods on both simulated and real M. tuberculosis data, but also successfully determines the identity of the underlying strains.
We conclude that SplitStrains is a powerful addition to the existing toolkit of analytical methods for data coming from bacterial pathogens, and holds the promise of enabling previously inaccessible conclusions to be drawn in the realm of public health microbiology.
Author summary
When multiple strains of a pathogenic organism are present in a patient, it may be necessary to not only detect this, but also to identify the individual strains. However, this problem has not yet been solved for bacterial pathogens processed via whole-genome sequencing. In this paper, we propose the SplitStrains algorithm for detecting multiple strains in a sample, identifying their proportions, and inferring their sequences, in the case of Mycobacterium tuberculosis. We test it on both simulated and real data, with encouraging results. We believe that our work opens new horizons in public health microbiology by allowing a more precise detection, identification and quantification of multiple infecting strains within a sample.


2019 ◽  
Vol 10 (1) ◽  
pp. 417-430 ◽  
Author(s):  
Elizabeth A. Morton ◽  
Ashley N. Hall ◽  
Elizabeth Kwan ◽  
Calvin Mok ◽  
Konstantin Queitsch ◽  
...  

Individuals within a species can exhibit vast variation in copy number of repetitive DNA elements. This variation may contribute to complex traits such as lifespan and disease, yet it is only infrequently considered in genotype-phenotype associations. Although the possible importance of copy number variation is widely recognized, accurate copy number quantification remains challenging. Here, we assess the technical reproducibility of several major methods for copy number estimation as they apply to the large repetitive ribosomal DNA array (rDNA). rDNA encodes the ribosomal RNAs and exists as a tandem gene array in all eukaryotes. Repeat units of rDNA are kilobases in size, often with several hundred units comprising the array, making rDNA particularly intractable to common quantification techniques. We evaluate pulsed-field gel electrophoresis, droplet digital PCR, and Nextera-based whole genome sequencing as approaches to copy number estimation, comparing techniques across model organisms and spanning wide ranges of copy numbers. Nextera-based whole genome sequencing, though commonly used in recent literature, produced high error. We explore possible causes for this error and provide recommendations for best practices in rDNA copy number estimation. We present a resource of high-confidence rDNA copy number estimates for a set of S. cerevisiae and C. elegans strains for future use. We furthermore explore the possibility for FISH-based copy number estimation, an alternative that could potentially characterize copy number on a cellular level.


2019 ◽  
Author(s):  
Stephenie D. Prokopec ◽  
Aileen Lu ◽  
Sandy Che-Eun S. Lee ◽  
Cindy Q. Yao ◽  
Ren X. Sun ◽  
...  

Abstract
The aryl hydrocarbon receptor (AHR) mediates many of the toxic effects of 2,3,7,8-tetrachlorodibenzo-p-dioxin (TCDD). However, the AHR alone is insufficient to explain the widely different outcomes among organisms. Attempts to identify unknown factor(s) have been confounded by genetic variability of model organisms. Here, we evaluated three transgenic mouse lines, each expressing a different rat AHR isoform (rWT, DEL, and INS), as well as C57BL/6 and DBA/2 mice. We supplement these with whole-genome sequencing and transcriptomic analyses of the corresponding rat models: Long-Evans (L-E) and Han/Wistar (H/W) rats. These integrated multi-species genomic and transcriptomic data were used to identify genes associated with TCDD-response phenotypes.
We identified several genes that show consistent transcriptional changes in both transgenic mice and rats. Hepatic Pxdc1 was significantly repressed by TCDD in C57BL/6, rWT mice, and in L-E rat. Three genes demonstrated different AHRE-1 (full) motif occurrences within their promoter regions: Cxxc5 had fewer occurrences in H/W, as compared with L-E; Sugp1 and Hgfac (in either L-E or H/W respectively). These genes also showed different patterns of mRNA abundance across strains.
The AHR isoform explains much of the transcriptional variability: up to 50% of genes with altered mRNA abundance following TCDD exposure are associated with a single AHR isoform (30% and 10% unique to DEL and rWT respectively following 500 μg/kg TCDD). Genomic and transcriptomic evidence allowed identification of genes potentially involved in phenotypic outcomes: Pxdc1 had differential mRNA abundance by phenotype; Cxxc5 had altered AHR binding sites and differential mRNA abundance.
Author Summary
Environmental contaminants such as dioxins cause many toxic responses, anything from chloracne (common in humans) to death. These toxic responses are mostly regulated by the Ahr, a ligand-activated transcription factor with roles in drug metabolism and immune responses; however, other contributing factors remain unclear. Studies are complicated by the underlying genetic heterogeneity of model organisms. Our team evaluated a number of mouse and rat models, including two strains of mouse, two strains of rat and three transgenic mouse lines which differ only at the Ahr locus, that present widely different sensitivities to the most potent dioxin: 2,3,7,8-tetrachlorodibenzo-p-dioxin (TCDD). We identified a number of changes to gene expression that were associated with different toxic responses. We then contrasted these findings with results from whole-genome sequencing of the H/W and L-E rats and found some key genes, such as Cxxc5 and Mafb, which might contribute to TCDD toxicity. These transcriptomic and genomic datasets will provide a valuable resource for future studies into the mechanisms of dioxin toxicities.


2016 ◽  
Author(s):  
Harold E. Smith ◽  
Amy S. Fabritius ◽  
Aimee Jaramillo-Lambert ◽  
Andy Golden

Abstract
Whole-genome sequencing provides a rapid and powerful method for identifying mutations on a global scale, and has spurred a renewed enthusiasm for classical genetic screens in model organisms. The most commonly characterized category of mutation consists of monogenic, recessive traits, due to their genetic tractability. Therefore, most of the mapping methods for mutation identification by whole-genome sequencing are directed toward alleles that fulfill those criteria (i.e., single-gene, homozygous variants). However, such approaches are not entirely suitable for the characterization of a variety of more challenging mutations, such as dominant and semi-dominant alleles or multigenic traits. Therefore, we have developed strategies for the identification of those classes of mutations, using polymorphism mapping in Caenorhabditis elegans as our model for validation. We also report an alternative approach for mutation identification from traditional recombinant crosses, and a solution to the technical challenge of sequencing sterile or terminally arrested strains where population size is limiting. The methods described herein extend the applicability of whole-genome sequencing to a broader spectrum of mutations, including classes that are difficult to map by traditional means.


2020 ◽  
Vol 9 (2) ◽  
Author(s):  
Thomas A. Randall

In many cases, genes for commonly used genetic markers in model organisms have not been identified; therefore, it is of interest to identify the causative genes. Whole-genome sequencing was used to identify potential causative mutations for a col-4 allele of Neurospora crassa.


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Elisa Pischedda ◽  
Cristina Crava ◽  
Martina Carlassara ◽  
Susanna Zucca ◽  
Leila Gasmi ◽  
...  

Abstract
Background: Several bioinformatics pipelines have been developed to detect sequences from viruses that integrate into the human genome because of the health relevance of these integrations, such as in the persistence of viral infection and/or in generating genotoxic effects, often progressing into cancer. Recent genomics and metagenomics analyses have shown that viruses also integrate into the genome of non-model organisms (i.e., arthropods, fish, plants, vertebrates). However, studies of endogenous viral elements (EVEs) in non-model organisms have rarely gone beyond their characterization from reference genome assemblies. In non-model organisms, we lack a thorough understanding of the widespread occurrence of EVEs and their biological relevance, apart from sporadic cases which nevertheless point to significant roles of EVEs in immunity and regulation of expression. The concomitance of repetitive DNA, duplications and/or assembly fragmentations in a genome sequence and intrasample variability in whole-genome sequencing (WGS) data can cause misalignments when mapping data to a genome assembly. This phenomenon hinders our ability to properly identify integration sites.
Results: To fill this gap, we developed ViR, a pipeline which solves the dispersion of reads due to intrasample variability in sequencing data from both single and pooled DNA samples, thus improving the detection of integration sites. We tested ViR on both in silico and real sequencing data from a non-model organism, the arboviral vector Aedes albopictus. Potential viral integrations predicted by ViR were molecularly validated, supporting the accuracy of ViR results.
Conclusion: ViR will open new avenues to explore the biology of EVEs, especially in non-model organisms. Importantly, while we generated ViR with the identification of EVEs in mind, its application can be extended to detect any lateral transfer event, provided an ad hoc sequence to interrogate.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Tatiana Maroilley ◽  
Xiao Li ◽  
Matthew Oldach ◽  
Francesca Jean ◽  
Susan J. Stasiuk ◽  
...  

Abstract
Genomic rearrangements cause congenital disorders, cancer, and complex diseases in humans. Yet, they are still understudied in rare diseases because their detection is challenging, despite the advent of whole genome sequencing (WGS) technologies. Short-read (srWGS) and long-read WGS approaches are regularly compared, and the latter is commonly recommended in studies focusing on genomic rearrangements. However, srWGS is currently the most economical, accurate, and widely supported technology. In Caenorhabditis elegans (C. elegans), such variants, induced by various mutagenesis processes, have been used for decades to balance large genomic regions by preventing chromosomal crossover events and allowing the maintenance of lethal mutations. Interestingly, those chromosomal rearrangements have rarely been characterized on a molecular level. To evaluate the ability of srWGS to detect various types of complex genomic rearrangements, we sequenced three balancer strains using short-read Illumina technology. As we experimentally validated the breakpoints uncovered by srWGS, we showed that, by combining several types of analyses, srWGS enables the detection of a reciprocal translocation (eT1), a free duplication (sDp3), a large deletion (sC4), and chromoanagenesis events. Thus, applying srWGS to decipher real complex genomic rearrangements in model organisms may help in designing efficient bioinformatics pipelines with systematic detection of complex rearrangements in human genomes.


2017 ◽  
Author(s):  
Dea Garcia-Hermoso ◽  
Alexis Criscuolo ◽  
Soo chan Lee ◽  
Matthieu Legrand ◽  
Marc Chaouat ◽  
...  

Abstract
Mucorales are ubiquitous environmental molds responsible for mucormycosis in diabetic, immunocompromised, and severely burned patients. Small outbreaks of invasive wound mucormycosis (IWM) have already been reported in burn units without extensive microbiological investigations. We faced an outbreak of IWM in our center and investigated the clinical isolates with whole genome sequencing (WGS) analysis.
We analyzed M. circinelloides isolates from patients in our burn unit (BU1) together with non-outbreak isolates from burn unit 2 (BU2, Paris area) and from France over a two-year period (2013-2015). For each isolate, WGS and a de novo genome assembly were performed from read data extracted from the aligned contig sequences of the reference genome (1006PhL).
A total of 21 isolates were sequenced including 14 isolates from six BU1 patients. Phylogenetic classification showed that the clinical isolates clustered in four highly divergent clades. Clade1 contained at least one of the strains from the six epidemiologically-linked BU1 patients. The clinical isolates seemed specific to each patient. Two patients were infected with more than two strains from different clades, suggesting that an environmental reservoir of clonally unrelated isolates was the source of contamination. Only two patients shared one strain in BU1, suggesting direct transmission or contamination with the same environmental source.
WGS coupled with precise epidemiological data and analysis of several isolates per patient revealed in our study a complex situation with both potential cross-transmission and multiple contaminations with a heterogeneous pool of strains from a cryptic environmental reservoir.
Importance
Invasive wound mucormycosis (IWM) is a severe infection due to the environmental molds belonging to the order Mucorales. Severely burned patients are particularly at risk for IWM. Here, we used Whole Genome Sequencing (WGS) analysis to resolve an outbreak of IWM due to Mucor circinelloides that occurred in our hospital (BU1). We sequenced 21 clinical isolates, including 14 from BU1 and 7 unrelated isolates, and compared them to the reference genome (1006PhL). This analysis revealed that the outbreak was mainly due to multiple strains that seemed patient-specific, suggesting that the patients were more likely infected from a pool of diverse strains from the environment rather than from direct transmission between the patients. This study revealed the complexity of a Mucorales outbreak in the settings of IWM in burn patients, which has been highlighted based on whole genome sequencing and careful sampling.

