Characterization of a COPD-Associated NPNT Functional Splicing Genetic Variant in Human Lung Tissue via Long-Read Sequencing

Author(s):  
Aabida Saferali ◽  
Zhonghui Xu ◽  
Gloria M. Sheynkman ◽  
Craig P. Hersh ◽  
Michael H. Cho ◽  
...  

Abstract: Chronic obstructive pulmonary disease (COPD) is a leading cause of death worldwide. Genome-wide association studies (GWAS) have identified over 80 loci that are associated with COPD and emphysema; however, for most of these loci the causal variant and gene are unknown. Here, we use lung splice quantitative trait loci (sQTL) data from the Genotype-Tissue Expression project (GTEx) and short read sequencing data from the Lung Tissue Research Consortium (LTRC) to characterize a locus in nephronectin (NPNT) associated with COPD case-control status and lung function. We found that the rs34712979 variant is associated with alternative splice junction use in NPNT, specifically for the junction connecting the 2nd and 4th exons (chr4:105898001-105927336) (p=4.02×10⁻³⁸). This association colocalized with GWAS data for COPD and lung spirometry measures with a posterior probability of 94%, indicating that the same causal genetic variants in NPNT underlie the associations with COPD risk, spirometric measures of lung function, and splicing. Investigation of NPNT short read sequencing revealed that rs34712979 creates a cryptic splice acceptor site, which results in the inclusion of a 3-nucleotide exon extension coding for a serine residue near the N-terminus of the protein. Using Oxford Nanopore Technologies (ONT) long read sequencing, we identified 13 NPNT isoforms, 6 of which are predicted to be protein coding. Two of these are full length isoforms that differ only in the 3-nucleotide exon extension, whose occurrence differs by genotype. Overall, our data indicate that rs34712979 modulates COPD risk and lung function by creating a novel splice acceptor, which results in the inclusion of a 3-nucleotide sequence coding for a serine in the nephronectin protein sequence. Our findings implicate NPNT splicing in COPD risk and identify a novel serine insertion in the nephronectin protein that warrants further study.

2019 ◽  
Author(s):  
Bastian Schiffthaler ◽  
Nicolas Delhomme ◽  
Carolina Bernhardsson ◽  
Jerry Jenkins ◽  
Stefan Jansson ◽  
...  

ABSTRACT: The genome assembly of the European aspen Populus tremula proved difficult for a short-read-based strategy due to high genomic variation. As a consequence, the fragmented sequence is impeding studies that benefit from highly contiguous data, particularly genome-wide association studies (GWAS) and comparative genomics. Here we present an updated assembly based on long-read sequences, optical mapping and genetic mapping. This assembly, henceforth referred to as Potra V2, is assembled into 19 contiguous chromosomes, providing a powerful tool for future association studies. The genome sequence and any feature files are available from the PopGenIE resource.


2021 ◽  
Author(s):  
Valentin Waschulin ◽  
Chiara Borsetto ◽  
Robert James ◽  
Kevin K. Newsham ◽  
Stefano Donadio ◽  
...  

Abstract: The growing problem of antibiotic resistance has led to the exploration of uncultured bacteria as potential sources of new antimicrobials. PCR amplicon analyses and short-read sequencing studies of samples from different environments have reported evidence of high biosynthetic gene cluster (BGC) diversity in metagenomes, indicating their potential for producing novel and useful compounds. However, recovering full-length BGC sequences from uncultivated bacteria remains a challenge due to the technological constraints of short-read sequencing, which makes assessment of BGC diversity difficult. Here, long-read sequencing and genome mining were used to recover >1400 mostly full-length BGCs that demonstrate the rich diversity of BGCs from uncultivated lineages present in soil from Mars Oasis, Antarctica. A large number of highly divergent BGCs were found not only in the phyla Acidobacteriota, Verrucomicrobiota and Gemmatimonadota but also in the actinobacterial classes Acidimicrobiia and Thermoleophilia and the gammaproteobacterial order UBA7966. The latter furthermore contained a potential novel family of RiPPs. Our findings underline the biosynthetic potential of underexplored phyla as well as unexplored lineages within seemingly well-studied producer phyla. They also showcase long-read metagenomic sequencing as a promising way to access the untapped genetic reservoir of specialised metabolite gene clusters of the uncultured majority of microbes.


2014 ◽  
Vol 56 (1) ◽  
pp. 115-121 ◽  
Author(s):  
Alicja Piasecka ◽  
Paweł Brzuzan ◽  
Maciej Woźny ◽  
Sławomir Ciesielski ◽  
Dariusz Kaczmarczyk

1990 ◽  
Vol 10 (7) ◽  
pp. 3492-3504 ◽  
Author(s):  
G Rudenko ◽  
S Le Blancq ◽  
J Smith ◽  
M G Lee ◽  
A Rattray ◽  
...  

At least one of the procyclic acidic repetitive protein (PARP or procyclin) loci of Trypanosoma brucei is a small (5- to 6-kilobase) polycistronic transcription unit which is transcribed in an alpha-amanitin-resistant manner. Its single promoter, as mapped by run-on transcription analysis and UV inactivation of transcription, is located immediately upstream of the first alpha-PARP gene. Transcription termination occurs in a region approximately 3 kilobases downstream of the beta-PARP gene. The location of the promoter was confirmed by its ability to direct transcription of the bacterial chloramphenicol acetyltransferase gene in insect-form (procyclic) T. brucei. The putative PARP promoter is located in the region between the 3' splice acceptor site (nucleotide position 0) and nucleotide position -196 upstream of the alpha-PARP genes. Regulatory regions influencing the levels of PARP expression may be located further upstream. We conclude that a single promoter, which is located very close to the 3' splice acceptor site of the alpha-PARP genes, directs the transcription of a small, polycistronic, and alpha-amanitin-resistant transcription unit.


2020 ◽  
Author(s):  
Andrew J. Page ◽  
Nabil-Fareed Alikhan ◽  
Michael Strinden ◽  
Thanh Le Viet ◽  
Timofey Skvortsov

Abstract: Spoligotyping of Mycobacterium tuberculosis provides a subspecies classification of this major human pathogen. Spoligotypes can be predicted from short read genome sequencing data; however, no methods exist for long read sequence data such as from Nanopore or PacBio. We present a novel software package, Galru, which can rapidly detect the spoligotype of a Mycobacterium tuberculosis sample from as little as a single uncorrected long read. It allows for near real-time spoligotyping from long read data as it is being sequenced, giving rapid sample typing. We compare it to the existing state-of-the-art software and find that its results are identical to those obtained from short read sequencing data. Galru is freely available from https://github.com/quadram-institute-bioscience/galru under the GPLv3 open source licence.

