Targeted sequence capture outperforms RNA-Seq and degenerate-primer PCR cloning for sequencing the largest mammalian multi-gene family

2019 ◽  
Author(s):  
Laurel R. Yohe ◽  
Kalina T. J. Davies ◽  
Nancy B. Simmons ◽  
Karen E. Sears ◽  
Elizabeth R. Dumont ◽  
...  

Abstract Multigene families evolve from single-copy ancestral genes via duplication, and typically encode proteins critical to key biological processes. Molecular analyses of these gene families require high-confidence sequences, but the high sequence similarity of the members can create challenges for both sequencing and downstream analyses. Focusing on the common vampire bat, Desmodus rotundus, we evaluated how different sequencing approaches performed in recovering the largest mammalian protein-coding multigene family: olfactory receptors (OR). Using the common vampire bat genome as a reference, we determined the proportion of putatively protein-coding receptors recovered by: 1) amplicons from degenerate primers sequenced via Sanger technology, 2) RNA-Seq of the main olfactory epithelium, and 3) those genes “captured” with probes designed from transcriptomes of closely-related species. Our initial re-annotation of the high-quality vampire bat genome resulted in >400 intact OR genes, more than double the number based on original estimates. Sanger-sequenced amplicons performed the poorest among the three approaches, detecting <33% of receptors in the genome. In contrast, the transcriptome reliably recovered >50% of the annotated genomic ORs, and targeted sequence capture recovered nearly 75% of annotated genes. Each sequencing approach assembled high-quality sequences, even if it did not recover all putative receptors in the genome. Therefore, variation among assemblies was caused by low coverage of some receptors, rather than high rates of assembly error. Given this variability, we caution against using the counts of number of intact receptors per species to model the birth-death process of multigene families. Instead, our results support the use of orthologous sequences to explore and model the evolutionary processes shaping these genes.

2019 ◽  
Vol 20 (1) ◽  
pp. 140-153 ◽  
Author(s):  
Laurel R. Yohe ◽  
Kalina T. J. Davies ◽  
Nancy B. Simmons ◽  
Karen E. Sears ◽  
Elizabeth R. Dumont ◽  
...  

Nature ◽  
2021 ◽  
Vol 592 (7856) ◽  
pp. 737-746 ◽  
Author(s):  
Arang Rhie ◽  
Shane A. McCarthy ◽  
Olivier Fedrigo ◽  
Joana Damas ◽  
Giulio Formenti ◽  
...  

Abstract High-quality and complete reference genome assemblies are fundamental for the application of genomics to biology, disease, and biodiversity conservation. However, such assemblies are available for only a few non-microbial species [1–4]. To address this issue, the international Genome 10K (G10K) consortium [5,6] has worked over a five-year period to evaluate and develop cost-effective methods for assembling highly accurate and nearly complete reference genomes. Here we present lessons learned from generating assemblies for 16 species that represent six major vertebrate lineages. We confirm that long-read sequencing technologies are essential for maximizing genome quality, and that unresolved complex repeats and haplotype heterozygosity are major sources of assembly error when not handled correctly. Our assemblies correct substantial errors, add missing sequence in some of the best historical reference genomes, and reveal biological discoveries. These include the identification of many false gene duplications, increases in gene sizes, chromosome rearrangements that are specific to lineages, a repeated independent chromosome breakpoint in bat genomes, and a canonical GC-rich pattern in protein-coding genes and their regulatory regions. Adopting these lessons, we have embarked on the Vertebrate Genomes Project (VGP), an international effort to generate high-quality, complete reference genomes for all of the roughly 70,000 extant vertebrate species and to help to enable a new era of discovery across the life sciences.


Author(s):  
Chao Wang ◽  
Ola Wallerman ◽  
Maja-Louise Arendt ◽  
Elisabeth Sundström ◽  
Åsa Karlsson ◽  
...  

Abstract Here we present a new high-quality canine reference genome with gap number reduced 41-fold, from 23,836 to 585. Analysis of existing and novel data, RNA-seq, miRNA-seq and ATAC-seq, revealed that a large proportion of these gaps harboured previously hidden elements, including genes, promoters and miRNAs. Short-read dark regions were detected, and genomic regions completed, including the DLA, TCR and 366 cancer genes. 10x sequencing of 27 dogs uncovered a total of 22.1 million SNPs, indels and larger structural variants (SVs). Of these, 1.4% overlap with protein-coding genes and could provide a source of normal or aberrant phenotypic modifications.


2021 ◽  
Vol 22 (14) ◽  
pp. 7298
Author(s):  
Izabela Rudzińska ◽  
Małgorzata Cieśla ◽  
Tomasz W. Turowski ◽  
Alicja Armatowska ◽  
Ewa Leśniewska ◽  
...  

The coordinated transcription of the genome is a fundamental mechanism in molecular biology. Transcription in eukaryotes is carried out by three main RNA polymerases: Pol I, II, and III. One basic problem is how a decrease in tRNA levels, caused by downregulating Pol III efficiency, influences the expression pattern of protein-coding genes. The purpose of this study was to determine the mRNA levels in the yeast mutant rpc128-1007 and its overdose suppressors, RBS1 and PRT1. The rpc128-1007 mutant prevents assembly of the Pol III complex and functionally mimics similar mutations in human Pol III, which cause hypomyelinating leukodystrophies. We applied RNA-seq followed by hierarchical clustering of the complete transcriptome and functional analysis of genes from the clusters. mRNA upregulation in rpc128-1007 cells was generally stronger than downregulation. The observed induction of mRNA expression was mostly indirect and resulted from the derepression of the general transcription factor Gcn4, differently modulated by suppressor genes. The rpc128-1007 mutation, regardless of the presence of suppressors, also resulted in a weak increase in the expression of ribosome biogenesis genes. mRNA genes that were downregulated by the reduction of Pol III assembly comprise the proteasome complex. In summary, our results identify the regulatory links affected by Pol III assembly that contribute differently to cellular fitness.


2021 ◽  
Author(s):  
Víctor Hugo Mendoza-Sáenz ◽  
Darío Alejandro Navarrete-Gutiérrez ◽  
Guillermo Jiménez-Ferrer ◽  
Cristian Kraker-Castañeda ◽  
Romeo A. Saldaña-Vázquez

2021 ◽  
Vol 22 (5) ◽  
pp. 2683
Author(s):  
Princess D. Rodriguez ◽  
Hana Paculova ◽  
Sophie Kogut ◽  
Jessica Heath ◽  
Hilde Schjerven ◽  
...  

Non-coding RNAs (ncRNAs) comprise a diverse class of non-protein-coding transcripts that regulate critical cellular processes associated with cancer. Advances in RNA sequencing (RNA-Seq) have led to the characterization of non-coding RNA expression across different types of human cancers. Through comprehensive RNA-Seq profiling, a growing number of studies demonstrate that ncRNAs, including long non-coding RNAs (lncRNAs) and microRNAs (miRNAs), play central roles in progenitor B-cell acute lymphoblastic leukemia (B-ALL) pathogenesis. Furthermore, due to their central roles in cellular homeostasis and their potential as biomarkers, the study of ncRNAs continues to provide new insight into the molecular mechanisms of B-ALL. This article reviews the ncRNA signatures reported for all B-ALL subtypes, focusing on technological developments in transcriptome profiling and recently discovered examples of ncRNAs with biologic and therapeutic relevance in B-ALL.


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Étienne Fafard-Couture ◽  
Danny Bergeron ◽  
Sonia Couture ◽  
Sherif Abou-Elela ◽  
Michelle S. Scott

Abstract Background Small nucleolar RNAs (snoRNAs) are mid-size non-coding RNAs required for ribosomal RNA modification, implying a ubiquitous tissue distribution linked to ribosome synthesis. However, increasing numbers of studies identify extra-ribosomal roles of snoRNAs in modulating gene expression, suggesting more complex snoRNA abundance patterns. Therefore, there is a great need for mapping the snoRNome in different human tissues as the blueprint for snoRNA functions. Results We used a low-structure-bias RNA-Seq approach to accurately quantify snoRNAs and compare them to the entire transcriptome in seven healthy human tissues (breast, ovary, prostate, testis, skeletal muscle, liver, and brain). We identify 475 expressed snoRNAs categorized in two abundance classes that differ significantly in their function, conservation level, and correlation with their host gene: 390 snoRNAs are uniformly expressed and 85 are enriched in the brain or reproductive tissues. Most tissue-enriched snoRNAs are embedded in lncRNAs and display strong correlation of abundance with them, whereas uniformly expressed snoRNAs are mostly embedded in protein-coding host genes and are mainly non- or anticorrelated with them. Fifty-nine percent of the non-correlated or anticorrelated protein-coding host gene/snoRNA pairs feature dual-initiation promoters, compared to only 16% of the correlated non-coding host gene/snoRNA pairs. Conclusions Our results demonstrate that snoRNAs are not a single homogeneous group of housekeeping genes but include highly regulated tissue-enriched RNAs. Indeed, our work indicates that the architecture of snoRNA host genes varies to uncouple host and snoRNA expression in order to meet the different snoRNA abundance levels and functional needs of human tissues.


Scanning ◽  
2017 ◽  
Vol 2017 ◽  
pp. 1-7
Author(s):  
Xu Chen ◽  
Tengfei Guo ◽  
Yubin Hou ◽  
Jing Zhang ◽  
Wenjie Meng ◽  
...  

A new scan-head structure for the scanning tunneling microscope (STM) is proposed, featuring high scan precision and rigidity. The core structure consists of a quadrant-type piezoelectric tube scanner (for XY scans) coaxially housed in a piezoelectric tube with single inner and outer electrodes (for Z scans). The two tubes are fixed together at one end (called the common end). A hollow tantalum shaft is coaxially housed in the XY-scan tube, and the two are mutually fixed at both ends. When the XY scanner scans, its free end moves the shaft with it, and the tip, which is coaxially inserted in the shaft at the common end, scans a smaller area provided it protrudes only a short distance from the common end. Decoupling the XY and Z scans reduces image distortion, and the mechanically reduced scan range has the advantage of reducing the impact of background electronic noise on the scanner and enhancing the tip positioning precision. High-quality atomic-resolution images are also shown.


BMC Genomics ◽  
2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Geneviève Bart ◽  
Daniel Fischer ◽  
Anatoliy Samoylenko ◽  
Artem Zhyvolozhnyi ◽  
Pavlo Stehantsev ◽  
...  

Abstract Background Human sweat is a mixture of secretions from three types of glands: eccrine, apocrine, and sebaceous. Eccrine glands open directly on the skin surface and produce high amounts of water-based fluid in response to heat, emotion, and physical activity, whereas the other glands produce oily fluids and waxy sebum. While most body fluids have been shown to contain nucleic acids, both as ribonucleoprotein complexes and associated with extracellular vesicles (EVs), these have not been investigated in sweat. In this study we aimed to explore and characterize the nucleic acids associated with sweat particles. Results We used next generation sequencing (NGS) to characterize DNA and RNA in pooled and individual samples of EV-enriched sweat collected from volunteers performing rigorous exercise. In all sequenced samples, we identified DNA originating from all human chromosomes, but only the mitochondrial chromosome was highly represented with 100% coverage. Most of the DNA mapped to unannotated regions of the human genome with some regions highly represented in all samples. Approximately 5% of the reads were found to map to other genomes, including bacteria (83%), archaea (3%), and viruses (13%); the identified bacterial species were consistent with those commonly colonizing the human upper body and arm skin. Small RNA-seq from EV-enriched pooled sweat RNA resulted in 74% of the trimmed reads mapped to the human genome, with 29% corresponding to unannotated regions. Over 70% of the RNA reads mapping to an annotated region were tRNA, while misc. RNA (18.5%), protein coding RNA (5%) and miRNA (1.85%) were much less represented. RNA-seq from individually processed EV-enriched sweat collections generally resulted in a lower percentage of reads mapping to the human genome (7–45%), with 50–60% of those reads mapping to unannotated regions of the genome and 30–55% being tRNAs, and a lower percentage of reads being rRNA, lincRNA, misc. RNA, and protein coding RNA.
Conclusions Our data demonstrate that sweat, like all other body fluids, contains a wealth of nucleic acids, including DNA and RNA of human and microbial origin, opening a possibility to investigate sweat as a source for biomarkers for specific health parameters.

