Human transposon insertion profiling by sequencing (TIPseq) to map LINE-1 insertions in single cells

Long interspersed element-1 (LINE-1, L1) sequences, which comprise about 17% of the human genome, are the product of one of the most active types of mobile DNAs in modern humans. LINE-1 insertion alleles can cause inherited and de novo genetic diseases, and LINE-1-encoded proteins are highly expressed in some cancers. Genome-wide LINE-1 mapping in single cells could be useful for defining somatic and germline retrotransposition rates, and for enabling studies to characterize tumour heterogeneity, relate insertions to transcriptional and epigenetic effects at the cellular level, or describe cellular phylogenies in development. Our laboratories have reported a genome-wide LINE-1 insertion site mapping method for bulk DNA, named transposon insertion profiling by sequencing (TIPseq). There have been significant barriers to applying LINE-1 mapping to single cells, owing to chimeric artefacts and features of repetitive sequences. Here, we optimize a modified TIPseq protocol and show its utility for LINE-1 mapping in single lymphoblastoid cells. Results from single-cell TIPseq experiments compare well to known LINE-1 insertions found by whole-genome sequencing and TIPseq on bulk DNA. Among the several approaches we tested, whole-genome amplification by multiple displacement amplification followed by restriction enzyme digestion, vectorette ligation and LINE-1-targeted PCR had the best assay performance. This article is part of a discussion meeting issue ‘Crossroads between transposons and gene regulation’.

A genome-wide study of de novo deletions identifies a candidate locus for non-syndromic isolated cleft lip/palate risk

BMC Genetics ◽

10.1186/1471-2156-15-24 ◽

2014 ◽

Vol 15 (1) ◽

pp. 24 ◽

Cited By ~ 18

Author(s):

Samuel G Younkin ◽

Robert B Scharpf ◽

Holger Schwender ◽

Margaret M Parker ◽

Alan F Scott ◽

...

Keyword(s):

Cleft Lip ◽

De Novo ◽

Candidate Locus ◽

Genome Wide ◽

A Genome ◽

Cleft Lip Palate ◽

Genome Wide Study

Genome assembly of the JD17 soybean provides a new reference genome for comparative genomics

10.1101/2021.11.23.469778 ◽

2021 ◽

Author(s):

Xinxin Yi ◽

Jing Liu ◽

Shengcai Chen ◽

Hao Wu ◽

Min Liu ◽

...

Keyword(s):

Nitrogen Fixation ◽

Genome Assembly ◽

Reference Genome ◽

De Novo ◽

Genomic Analysis ◽

Comparative Genomic ◽

High Quality ◽

Genome Wide ◽

A Genome ◽

Cultivated Soybean

Cultivated soybean (Glycine max) is an important source of protein and oil. Many elite cultivars with different traits have been developed for different conditions. Each soybean strain has its own genetic diversity, and the availability of more high-quality soybean genomes can enhance comparative genomic analysis for identifying the genetic underpinnings of its unique traits. In this study, we constructed a high-quality de novo assembly of an elite soybean cultivar, Jidou 17 (JD17), with chromosome-level contiguity and high accuracy. We annotated 52,840 gene models and reconstructed 74,054 high-quality full-length transcripts. We performed a genome-wide comparative analysis based on the reference genome of JD17 with three published soybean genomes (WM82, ZH13 and W05), which identified five large inversions and two large translocations specific to JD17, 20,984–46,912 presence/absence variations (PAVs) spanning 13.1–46.9 Mb in size, and 5–53 large PAV clusters larger than 500 kb. Between JD17 and these three genomes, 1,695,741–3,664,629 SNPs and 446,689–800,489 indels were identified and annotated. Symbiotic nitrogen fixation (SNF) genes were identified and the effects of these variants were further evaluated. It was found that the coding sequences of 9 nitrogen fixation-related genes were greatly affected. The high-quality genome assembly of JD17 can serve as a valuable reference for soybean functional genomics research.

sPepFinder expedites genome-wide identification of small proteins in bacteria

10.1101/2020.05.05.079178 ◽

2020 ◽

Author(s):

Lei Li ◽

Yanjie Chao

Keyword(s):

De Novo ◽

Bacterial Species ◽

Computational Prediction ◽

Ribosome Profiling ◽

Support Vector ◽

Initiation Rate ◽

E Coli ◽

Small Proteins ◽

Genome Wide ◽

A Genome

Abstract Small proteins shorter than 50 amino acids have long been overlooked. A number of small proteins have been identified in several model bacteria using experimental approaches and assigned important functions in diverse cellular processes. The recent development of ribosome profiling technologies has allowed a genome-wide identification of small proteins and small ORFs (smORFs), but our incomplete understanding of small proteins hinders de novo computational prediction of smORFs in non-model bacterial species. Here, we have identified several sequence features for smORFs by a systematic analysis of all the known small proteins in E. coli, among which the translation initiation rate is the strongest determinant. By integrating these features into a support vector machine learning model, we have developed a novel sPepFinder algorithm that can predict conserved smORFs in bacterial genomes with a high accuracy of 92.8%. De novo prediction in E. coli has revealed several novel smORFs with evidence of translation supported by ribosome profiling. Further application of sPepFinder in 549 bacterial species has led to the identification of >100,000 novel smORFs, many of which are conserved at the amino acid and nucleotide levels under purifying selection. Overall, we have established sPepFinder as a valuable tool to identify novel smORFs in both model and non-model bacterial organisms, and provided a large resource of small proteins for functional characterizations.

A porcine brain-wide RNA editing landscape

10.21203/rs.3.rs-110949/v1 ◽

2020 ◽

Author(s):

Jinrong Huang ◽

Lin Lin ◽

Zhanying Dong ◽

Ling Yang ◽

Tianyu Zheng ◽

...

Keyword(s):

Rna Editing ◽

Repetitive Sequences ◽

Brain Regions ◽

Mammalian Brain ◽

Protein Coding ◽

Porcine Brain ◽

Coding Regions ◽

Pig Brain ◽

Genome Wide ◽

A Genome

Abstract Adenosine-to-inosine (A-to-I) RNA editing, catalyzed by ADAR enzymes, is an essential post-transcriptional modification. Although hundreds of thousands of RNA editing sites have been reported in mammals, brain-wide analysis of RNA editing in the mammalian brain remains rare. Here, a genome-wide RNA editing investigation is performed in 119 samples, representing 30 anatomically defined subregions in the pig brain. We identify a total of 682,037 A-to-I RNA editing sites, of which 97% have not been identified before. Within the pig brain, the cerebellum and olfactory bulb are the regions with the most edited transcripts. The editing level of sites residing in protein-coding regions is similar across brain regions, whereas region-distinct editing is observed in repetitive sequences. Highly edited conserved recoding events in pig and human brain are found in neurotransmitter receptors, demonstrating the evolutionary importance of RNA editing in neurotransmission functions. The porcine brain-wide RNA landscape provides a rich resource to better understand the evolutionary importance of post-transcriptional RNA editing.

A whole-genome screen identifies Salmonella enterica serovar Typhi genes involved in fluoroquinolone susceptibility

Journal of Antimicrobial Chemotherapy ◽

10.1093/jac/dkaa204 ◽

2020 ◽

Vol 75 (9) ◽

pp. 2516-2525

Author(s):

A Keith Turner ◽

Sabine E Eckert ◽

Daniel J Turner ◽

Muhammud Yasir ◽

Mark A Webber ◽

...

Keyword(s):

Salmonella Enterica ◽

Insertion Site ◽

Whole Genome ◽

Genome Screen ◽

Polysaccharide Biosynthesis ◽

Salmonella Enterica Serovar Typhi ◽

Transposon Insertion ◽

Associated Functions ◽

The Impact ◽

Insertion Mutations

Abstract Objectives: A whole-genome screen at sub-gene resolution was performed to identify candidate loci that contribute to enhanced or diminished ciprofloxacin susceptibility in Salmonella enterica serovar Typhi. Methods: A pool of over 1 million transposon insertion mutants of an S. Typhi Ty2 derivative was grown in a sub-MIC concentration of ciprofloxacin, or without ciprofloxacin. Transposon-directed insertion site sequencing (TraDIS) identified relative differences between the mutants that grew following the ciprofloxacin treatment compared with the untreated mutant pool, thereby indicating which mutations contribute to gain or loss of ciprofloxacin susceptibility. Results: Approximately 88% of the S. Typhi strain’s 4895 annotated genes were assayed, and at least 116 were identified as contributing to gain or loss of ciprofloxacin susceptibility. Many of the identified genes are known to influence susceptibility to ciprofloxacin, thereby providing method validation. Genes were identified that were not known previously to be involved in susceptibility, and some of these had no previously known phenotype. Susceptibility to ciprofloxacin was enhanced by insertion mutations in genes coding for efflux, other surface-associated functions, DNA repair and expression regulation, including phoP, barA and marA. Insertion mutations that diminished susceptibility were predominantly in genes coding for surface polysaccharide biosynthesis and regulatory genes, including slyA, emrR, envZ and cpxR. Conclusions: A genomics approach has identified novel contributors to gain or loss of ciprofloxacin susceptibility in S. Typhi, expanding our understanding of the impact of fluoroquinolones on bacteria and of mechanisms that may contribute to resistance. The data also demonstrate the power of the TraDIS technology for antibacterial research.
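
The comparison of treated and untreated mutant pools described in this abstract can be illustrated with a minimal sketch. It is not the authors' TraDIS pipeline: the function name, the reads-per-million scaling, the pseudocount and the gene labels (`geneA`, `geneB`) are all assumptions made for illustration. The idea is only that a gene whose insertion mutants are depleted after ciprofloxacin exposure gets a negative log2 ratio (insertions enhance susceptibility), and an enriched gene gets a positive one (insertions diminish susceptibility).

```python
import math


def log2_fold_changes(treated, untreated, pseudocount=1.0):
    """Per-gene log2 ratio of insertion read counts, treated vs untreated.

    Both arguments map a gene label to the number of reads from
    transposon insertions in that gene. Each pool must contain reads.
    """
    treated_total = sum(treated.values())
    untreated_total = sum(untreated.values())
    result = {}
    for gene in sorted(set(treated) | set(untreated)):
        # Scale to reads per million so pools of different depth are comparable.
        t = treated.get(gene, 0) / treated_total * 1e6
        u = untreated.get(gene, 0) / untreated_total * 1e6
        # The pseudocount (an assumed value) avoids division by zero.
        result[gene] = math.log2((t + pseudocount) / (u + pseudocount))
    return result


# Hypothetical read counts, not data from the study.
treated_counts = {"geneA": 10, "geneB": 990}
untreated_counts = {"geneA": 500, "geneB": 500}
changes = log2_fold_changes(treated_counts, untreated_counts)
```

In this toy input, `geneA` mutants are depleted under treatment and `geneB` mutants are enriched; a real analysis would also need replicates and a statistical test before calling a gene.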

Efficient and accurate determination of genome-wide DNA methylation patterns in Arabidopsis thaliana with enzymatic methyl sequencing

Epigenetics & Chromatin ◽

10.1186/s13072-020-00361-9 ◽

2020 ◽

Vol 13 (1) ◽

Cited By ~ 1

Author(s):

Suhua Feng ◽

Zhenhui Zhong ◽

Ming Wang ◽

Steven E. Jacobsen

Keyword(s):

Dna Methylation ◽

Bisulfite Sequencing ◽

Accurate Determination ◽

Gc Content ◽

Epigenetic Mark ◽

Whole Genome ◽

Whole Genome Bisulfite Sequencing ◽

Genome Wide ◽

A Genome ◽

Genome Bisulfite Sequencing

Abstract Background: 5′ methylation of cytosines in DNA molecules is an important epigenetic mark in eukaryotes. Bisulfite sequencing is the gold standard of DNA methylation detection, and whole-genome bisulfite sequencing (WGBS) has been widely used to detect methylation at single-nucleotide resolution on a genome-wide scale. However, sodium bisulfite is known to severely degrade DNA, which, in combination with biases introduced during PCR amplification, leads to unbalanced base representation in the final sequencing libraries. Enzymatic conversion of unmethylated cytosines to uracils can achieve the same end product for sequencing as does bisulfite treatment and does not affect the integrity of the DNA; enzymatic methylation sequencing may, thus, provide advantages over bisulfite sequencing. Results: Using an enzymatic methyl-seq (EM-seq) technique to selectively deaminate unmethylated cytosines to uracils, we generated and sequenced libraries based on different amounts of Arabidopsis input DNA and different numbers of PCR cycles, and compared these data to results from traditional whole-genome bisulfite sequencing. We found that EM-seq libraries were more consistent between replicates and had higher mapping and lower duplication rates, lower background noise, higher average coverage, and higher coverage of total cytosines. Differential methylation region (DMR) analysis showed that WGBS tended to over-estimate methylation levels especially in CHG and CHH contexts, whereas EM-seq detected higher CG methylation levels in certain highly methylated areas. These phenomena can be mostly explained by a correlation of WGBS methylation estimation with GC content and methylated cytosine density. We used EM-seq to compare methylation between leaves and flowers, and found that CHG methylation level is greatly elevated in flowers, especially in pericentromeric regions. Conclusion: We suggest that EM-seq is a more accurate and reliable approach than WGBS to detect methylation. Compared to WGBS, the results of EM-seq are less affected by differences in library preparation conditions or by the skewed base composition in the converted DNA. It may therefore be more desirable to use EM-seq in methylation studies.
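
Because both bisulfite and enzymatic conversion leave methylated cytosines read as C and unmethylated cytosines read as T, a methylation level per sequence context (CG, CHG, CHH) can be computed from read counts. The sketch below shows one common way to do this, a coverage-weighted level per context. It is an illustration under assumptions, not the pipeline used in the study: the function name, the input tuple layout and the toy counts are hypothetical.

```python
from collections import defaultdict


def weighted_methylation(sites):
    """Coverage-weighted methylation level per cytosine context.

    `sites` is an iterable of (context, methylated_reads, unmethylated_reads),
    where methylated reads still show C and unmethylated reads show T
    after conversion. Contexts with no coverage are omitted.
    """
    methylated = defaultdict(int)
    covered = defaultdict(int)
    for context, meth_reads, unmeth_reads in sites:
        methylated[context] += meth_reads
        covered[context] += meth_reads + unmeth_reads
    return {c: methylated[c] / covered[c] for c in covered if covered[c] > 0}


# Hypothetical per-site read counts, not data from the study.
toy_sites = [
    ("CG", 8, 2),
    ("CG", 9, 1),
    ("CHG", 3, 7),
    ("CHH", 1, 9),
]
levels = weighted_methylation(toy_sites)
```

Weighting by coverage means deeply sequenced sites contribute more than shallow ones; averaging per-site fractions instead is an equally valid choice that answers a slightly different question.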

Fine Mapping Using Whole-Genome Sequencing Confirms Anti-Müllerian Hormone as a Major Gene for Sex Determination in Farmed Nile Tilapia (Oreochromis niloticus L.)

G3 Genes|Genome|Genetics ◽

10.1534/g3.119.400297 ◽

2019 ◽

Vol 9 (10) ◽

pp. 3213-3223 ◽

Cited By ~ 8

Author(s):

Giovanna Cáceres ◽

María E. López ◽

María I. Cádiz ◽

Grazyella M. Yoshida ◽

Ana Jedlicki ◽

...

Keyword(s):

Sex Determination ◽

Whole Genome Sequencing ◽

Genome Sequencing ◽

Oreochromis Niloticus ◽

Nile Tilapia ◽

Major Gene ◽

Whole Genome ◽

Important Species ◽

Genome Wide ◽

A Genome

Nile tilapia (Oreochromis niloticus) is one of the most cultivated and economically important species in world aquaculture. Intensive production promotes the use of monosex animals, due to an important dimorphism that favors male growth. Currently, the main mechanism to obtain all-male populations is the use of hormones in feeding during larval and fry phases. Identifying genomic regions associated with sex determination in Nile tilapia is a research topic of great interest. The objective of this study was to identify genomic variants associated with sex determination in three commercial populations of Nile tilapia. Whole-genome sequencing of 326 individuals was performed, and a total of 2.4 million high-quality bi-allelic single nucleotide polymorphisms (SNPs) were identified after quality control. A genome-wide association study (GWAS) was conducted to identify markers associated with the binary sex trait (males = 1; females = 0). A mixed logistic regression GWAS model was fitted and a genome-wide significant signal comprising 36 SNPs, spanning a genomic region of 536 kb in chromosome 23, was identified. Ten of these 36 genetic variants intersect the anti-Müllerian hormone (Amh) gene. Other significant SNPs were located in the neighboring Amh gene region. This gene has been strongly associated with sex determination in several vertebrate species, playing an essential role in the differentiation of male and female reproductive tissue in early stages of development. This finding provides useful information to better understand the genetic mechanisms underlying sex determination in Nile tilapia.

DNA Analysis by Restriction Enzyme (DARE) enables concurrent genomic and epigenomic characterization of single cells

Nucleic Acids Research ◽

10.1093/nar/gkz717 ◽

2019 ◽

Vol 47 (19) ◽

pp. e122-e122

Author(s):

Ramya Viswanathan ◽

Elsie Cheruba ◽

Lih Feng Cheow

Keyword(s):

Dna Methylation ◽

Restriction Enzyme ◽

Copy Number ◽

Dna Analysis ◽

Whole Genome Amplification ◽

Single Cells ◽

Whole Genome ◽

Copy Number Alterations ◽

Genome Wide

Abstract Genome-wide profiling of copy number alterations and DNA methylation in single cells could enable detailed investigation into the genomic and epigenomic heterogeneity of complex cell populations. However, current methods to do this require complex sample processing and cleanup steps, lack consistency, or are biased in their genomic representation. Here, we describe a novel single-tube enzymatic method, DNA Analysis by Restriction Enzyme (DARE), to perform deterministic whole genome amplification while preserving DNA methylation information. This method was evaluated on low amounts of DNA and single cells, and provides accurate copy number aberration calling and representative DNA methylation measurement across the whole genome. Single-cell DARE is an attractive and scalable approach for concurrent genomic and epigenomic characterization of cells in a heterogeneous population.

A Multireference-Based Whole Genome Assembly for the Obligate Ant-Following Antbird, Rhegmatorhina melanosticta (Thamnophilidae)

Diversity ◽

10.3390/d11090144 ◽

2019 ◽

Vol 11 (9) ◽

pp. 144 ◽

Cited By ~ 4

Author(s):

Laís Coelho ◽

Lukas Musher ◽

Joel Cracraft

Keyword(s):

Genome Assembly ◽

High Throughput Sequencing ◽

Population Genomics ◽

De Novo ◽

Structural Difference ◽

Whole Genome ◽

Sequencing Technology ◽

A Genome ◽

Avian Genomes ◽

Chromosome Level

Current generation high-throughput sequencing technology has facilitated the generation of more genomic-scale data than ever before, thus greatly improving our understanding of avian biology across a range of disciplines. Recent developments in linked-read sequencing (Chromium 10×) and reference-based whole-genome assembly offer an exciting prospect of more accessible chromosome-level genome sequencing in the near future. We sequenced and assembled a genome of the Hairy-crested Antbird (Rhegmatorhina melanosticta), which represents the first publicly available genome for any antbird (Thamnophilidae). Our objectives were to (1) assemble scaffolds to chromosome level based on multiple reference genomes, and report on differences relative to other genomes, (2) assess genome completeness and compare content to other related genomes, and (3) assess the suitability of linked-read sequencing technology for future studies in comparative phylogenomics and population genomics. Our R. melanosticta assembly was both highly contiguous (de novo scaffold N50 = 3.3 Mb, reference-based N50 = 53.3 Mb) and relatively complete (contained close to 90% of evolutionarily conserved single-copy avian genes and known tetrapod ultraconserved elements). The high contiguity and completeness of this assembly enabled the genome to be successfully mapped to the chromosome level, which uncovered a consistent structural difference between R. melanosticta and other avian genomes. Our results are consistent with the observation that avian genomes are structurally conserved. Additionally, our results demonstrate the utility of linked-read sequencing for non-model genomics. Finally, we demonstrate the value of our R. melanosticta genome for future researchers by mapping reduced representation sequencing data, and by accurately reconstructing the phylogenetic relationships among a sample of thamnophilid species.
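
The contiguity figures quoted in this abstract are N50 values. As a reminder of what that statistic measures, here is a minimal sketch of the standard definition: the scaffold length L such that scaffolds of length L or longer together cover at least half of the assembly. The function name and the toy lengths are illustrative assumptions, not taken from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first scaffold that takes the running sum to half
        # of the total assembly length defines the N50.
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


# Toy scaffold lengths in bp (hypothetical, not from the assembly).
toy_scaffolds = [80, 70, 50, 40, 30, 20, 10]
toy_n50 = n50(toy_scaffolds)
```

Here the total length is 300 bp, and the two longest scaffolds (80 + 70 = 150) already reach half of it, so the N50 is 70. A higher N50 indicates that more of the assembly sits in long scaffolds, which is why scaffolding against reference genomes raises it so sharply.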

Next generation sequencing allows deeper analysis and understanding of genomes and transcriptomes including aspects to fertility

Reproduction Fertility and Development ◽

10.1071/rd10247 ◽

2011 ◽

Vol 23 (1) ◽

pp. 75 ◽

Cited By ~ 7

Author(s):

Thomas Werner

Keyword(s):

Next Generation Sequencing ◽

Transcriptional Control ◽

Target Genes ◽

De Novo ◽

Alternative Promoters ◽

Next Generation ◽

Sequencing Data ◽

Genome Wide ◽

A Genome ◽

Generation Sequencing

Reproduction and fertility are controlled by specific events naturally linked to oocytes, testes and early embryonal tissues. A significant part of these events involves gene expression, especially transcriptional control and alternative transcription (alternative promoters and alternative splicing). While methods to analyse such events for carefully predetermined target genes are well established, until recently no methodology existed to extend such analyses into a genome-wide de novo discovery process. With the arrival of next generation sequencing (NGS) it has become possible to attempt genome-wide discovery in genomic sequences as well as whole transcriptomes at a single nucleotide level. This not only allows identification of the primary changes (e.g. alternative transcripts) but also helps to elucidate the regulatory context that leads to the induction of transcriptional changes. This review discusses the basics of the new technological and scientific concepts arising from NGS, prominent differences from microarray-based approaches and several aspects of its application to reproduction and fertility research. These concepts will then be illustrated in an application example of NGS data analysis involving postimplantation endometrium tissue from cows.
