Variant analysis of RNA sequences in severe equine asthma

Author(s):  
Laurence Tessier ◽  
Olivier Côté ◽  
Dorothee Bienzle

Background. Severe equine asthma is a chronic inflammatory disease of the lung in horses similar to low-Th2 late-onset asthma in humans. The disease in horses has complex inheritance including both dominant and recessive patterns that are ill defined. This study aimed to determine the utility of RNA-Seq to call gene variants and identify mutations potentially linked to disease. Methods. RNA-Seq data were generated from endobronchial biopsies collected from 6 asthmatic and 7 non-asthmatic horses before and after challenge (26 samples total). Sequences were aligned to the equine genome with Spliced Transcripts Alignment to Reference software. Read preparation for variant calling was performed with Picard tools and Genome Analysis Toolkit (GATK). Coverage was visualized using Integrative Genomics Viewer software and variants were called and filtered using GATK and Ensembl Variant Effect Predictor (VEP) tools. Novel variant selection by VEP was based on a score of <0.01 predicted with Sorting Intolerant From Tolerant (SIFT) software, missense nature, location within the protein coding sequence and presence in all asthmatic individuals. For selected mutations, the effect of predicted variants on protein function was assessed with Polymorphism Phenotyping (PolyPhen) 2 and Screening for Non-Acceptable Polymorphism (SNAP) 2 software. RNA-Seq predicted variants were confirmed in all horses, and investigated in an additional 4 asthmatic and 7 non-asthmatic individuals with PCR and Sanger sequencing. Gene alignment and 3D protein structures were predicted with Geneious software. Results. Level of expression across the genome was similar in all individuals. RNA-Seq variant calling and filtering identified with highest confidence mutations in PACRG and RTTN. Sanger sequencing confirmed that the PACRG variant was appropriately identified in all 26 samples while the RTTN variant was identified correctly by RNA-Seq in 24 of 26 samples. 
SIFT and PolyPhen2 indicated both mutations would result in loss of function, and SNAP2 indicated that they would be non-neutral. Amino acid substitutions projected no change of hydrophobicity and isoelectric point in PACRG, a change in both for RTTN, and a slight change in 3D structure for both PACRG and RTTN. For PACRG, samples from additional individuals confirmed higher frequency of the heterozygous genotype in asthmatics, while the RTTN homozygous mutant phenotype was more prevalent in the asthmatic compared to non-asthmatic group. Discussion. RNA-Seq was sensitive and specific for calling gene variants in this disease model. Even moderate coverage (<10-20 cpm) yielded correct identification in 92% of samples, suggesting RNA-Seq may be suitable to detect variants in low coverage samples. The impact of amino acid alterations in PACRG and RTTN proteins is unknown at this point, but their role in structure and function of cilia may warrant further investigation.
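The four selection criteria named in the Methods (SIFT score <0.01, missense, within the protein coding sequence, present in all asthmatic individuals) can be read as a simple conjunctive filter. The sketch below is only an illustration of that logic, not the authors' pipeline: the `Variant` record, the `select_candidates` function, the gene labels and the horse IDs are all hypothetical, and real VEP output would first have to be parsed into this shape.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """Hypothetical, already-annotated variant record (not a VEP data type)."""
    gene: str
    consequence: str          # e.g. "missense_variant"
    in_coding_sequence: bool
    sift_score: float
    carriers: frozenset       # IDs of animals in which the variant was called


def select_candidates(variants, asthmatic_ids, sift_cutoff=0.01):
    """Keep variants that satisfy all four criteria listed in the abstract."""
    asthmatic_ids = frozenset(asthmatic_ids)
    return [
        v for v in variants
        if v.sift_score < sift_cutoff               # SIFT score below cutoff
        and v.consequence == "missense_variant"     # missense nature
        and v.in_coding_sequence                    # inside the coding sequence
        and asthmatic_ids <= v.carriers             # present in every asthmatic animal
    ]


# Made-up example data, for illustration only.
asthmatics = {"A1", "A2", "A3"}
variants = [
    Variant("GENE_A", "missense_variant", True, 0.00, frozenset({"A1", "A2", "A3", "C1"})),
    Variant("GENE_B", "missense_variant", True, 0.20, frozenset({"A1", "A2", "A3"})),
    Variant("GENE_C", "synonymous_variant", True, 0.00, frozenset({"A1", "A2", "A3"})),
    Variant("GENE_D", "missense_variant", True, 0.005, frozenset({"A1", "A2"})),
]
selected = [v.gene for v in select_candidates(variants, asthmatics)]
print(selected)
```

In this toy input only `GENE_A` passes: the others fail on SIFT score, consequence type, or absence from one asthmatic animal, respectively.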

2018 ◽  
Author(s):  
Laurence Tessier ◽  
Olivier Côté ◽  
Dorothee Bienzle

Background. Severe equine asthma is a chronic inflammatory disease of the lung in horses similar to low-Th2 late-onset asthma in humans. This study aimed to determine the utility of RNA-Seq to call gene sequence variants, and to identify sequence variants of potential relevance to the pathogenesis of asthma. Methods. RNA-Seq data were generated from endobronchial biopsies collected from 6 asthmatic and 7 non-asthmatic horses before and after challenge (26 samples total). Sequences were aligned to the equine genome with Spliced Transcripts Alignment to Reference software. Read preparation for sequence variant calling was performed with Picard tools and Genome Analysis Toolkit (GATK). Sequence variants were called and filtered using GATK and Ensembl Variant Effect Predictor (VEP) tools, and two RNA-Seq predicted sequence variants were investigated with both PCR and Sanger sequencing. Supplementary analysis of novel sequence variant selection with VEP was based on a score of <0.01 predicted with Sorting Intolerant From Tolerant (SIFT) software, missense nature, location within the protein coding sequence and presence in all asthmatic individuals. For select variants, effect on protein function was assessed with Polymorphism Phenotyping (PolyPhen) 2 and Screening for Non-Acceptable Polymorphism (SNAP) 2 software. Sequences were aligned and 3D protein structures predicted with Geneious software. Difference in allele frequency between the groups was assessed using a Pearson's Chi-squared test with Yates' continuity correction, and difference in genotype frequency was calculated using the Fisher's exact test for count data. Results. RNA-Seq variant calling and filtering correctly identified substitution variants in PACRG and RTTN. Sanger sequencing confirmed that the PACRG substitution was appropriately identified in all 26 samples while the RTTN substitution was identified correctly in 24 of 26 samples. 
These variants of uncertain significance had substitutions that were predicted to result in loss of function and to be non-neutral. Amino acid substitutions projected no change of hydrophobicity and isoelectric point in PACRG, and a change in both for RTTN. For PACRG, no difference in allele frequency between the two groups was detected but a higher proportion of asthmatic horses had the altered RTTN allele compared to non-asthmatic animals. Discussion. RNA-Seq was sensitive and specific for calling gene sequence variants in this disease model. Even moderate coverage (<10-20 cpm) yielded correct identification in 92% of samples, suggesting RNA-Seq may be suitable to detect sequence variants in low coverage samples. The impact of amino acid alterations in PACRG and RTTN proteins, and possible association of the sequence variants with asthma, is of uncertain significance, but their role in ciliary function may be of future interest.
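For readers unfamiliar with the two tests named in the Methods, the following standard-library sketch shows how each is computed on a 2x2 count table. It rests on several assumptions: the function names are invented here, the counts are made up and are not the study's data, and the Fisher function handles only 2x2 tables, whereas a genotype comparison would normally involve three genotype classes.

```python
from math import comb, erfc, sqrt


def yates_chi2(a, b, c, d):
    """Pearson chi-squared with Yates' continuity correction for [[a, b], [c, d]].

    Returns (statistic, p_value). All four marginal totals must be non-zero.
    """
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    # Shortcut form for 2x2 tables; the correction is capped so it cannot overshoot zero.
    diff = max(abs(a * d - b * c) - n / 2, 0.0)
    stat = n * diff ** 2 / (row1 * row2 * col1 * col2)
    # Upper tail of a chi-squared distribution with 1 degree of freedom.
    return stat, erfc(sqrt(stat / 2))


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same margins
    that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    total = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


# Made-up counts: rows = group (asthmatic, non-asthmatic), columns = (altered, reference).
stat, p_chi = yates_chi2(3, 1, 1, 3)
p_fisher = fisher_exact_two_sided(3, 1, 1, 3)
print(f"Yates chi-squared = {stat:.3f}, p = {p_chi:.3f}; Fisher p = {p_fisher:.3f}")
```

With counts this small the two tests give similar, non-significant p-values, which is the usual reason for preferring an exact test when cell counts are low.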


PeerJ ◽  
2018 ◽  
Vol 6 ◽  
pp. e5759
Author(s):  
Laurence Tessier ◽  
Olivier Côté ◽  
Dorothee Bienzle

Background. Severe equine asthma is a chronic inflammatory disease of the lung in horses similar to low-Th2 late-onset asthma in humans. This study aimed to determine the utility of RNA-Seq to call gene sequence variants, and to identify sequence variants of potential relevance to the pathogenesis of asthma. Methods. RNA-Seq data were generated from endobronchial biopsies collected from six asthmatic and seven non-asthmatic horses before and after challenge (26 samples total). Sequences were aligned to the equine genome with Spliced Transcripts Alignment to Reference software. Read preparation for sequence variant calling was performed with Picard tools and Genome Analysis Toolkit (GATK). Sequence variants were called and filtered using GATK and Ensembl Variant Effect Predictor (VEP) tools, and two RNA-Seq predicted sequence variants were investigated with both PCR and Sanger sequencing. Supplementary analysis of novel sequence variant selection with VEP was based on a score of <0.01 predicted with Sorting Intolerant From Tolerant software, missense nature, location within the protein coding sequence and presence in all asthmatic individuals. For select variants, effect on protein function was assessed with Polymorphism Phenotyping 2 and Screening for Non-Acceptable Polymorphism 2 software. Sequences were aligned and 3D protein structures predicted with Geneious software. Difference in allele frequency between the groups was assessed using a Pearson’s Chi-squared test with Yates’ continuity correction, and difference in genotype frequency was calculated using the Fisher’s exact test for count data. Results. RNA-Seq variant calling and filtering correctly identified substitution variants in PACRG and RTTN. Sanger sequencing confirmed that the PACRG substitution was appropriately identified in all 26 samples while the RTTN substitution was identified correctly in 24 of 26 samples. 
These variants of uncertain significance had substitutions that were predicted to result in loss of function and to be non-neutral. Amino acid substitutions projected no change of hydrophobicity and isoelectric point in PACRG, and a change in both for RTTN. For PACRG, no difference in allele frequency between the two groups was detected but a higher proportion of asthmatic horses had the altered RTTN allele compared to non-asthmatic animals. Discussion. RNA-Seq was sensitive and specific for calling gene sequence variants in this disease model. Even moderate coverage (<10–20 counts per million) yielded correct identification in 92% of samples, suggesting RNA-Seq may be suitable to detect sequence variants in low coverage samples. The impact of amino acid alterations in PACRG and RTTN proteins, and possible association of the sequence variants with asthma, is of uncertain significance, but their role in ciliary function may be of future interest.


F1000Research ◽  
2014 ◽  
Vol 3 ◽  
pp. 217 ◽  
Author(s):  
Sandeep Chakraborty ◽  
Basuthkar J. Rao ◽  
Bjarni Asgeirsson ◽  
Ravindra Venkatramani ◽  
Abhaya M. Dandekar

The remarkable diversity in biological systems is rooted in the ability of the twenty naturally occurring amino acids to perform multifarious catalytic functions by creating unique structural scaffolds known as the active site. Finding such structural motifs within the protein structure is a key aspect of many computational methods. The algorithm for obtaining combinations of motifs of a certain length, although polynomial in complexity, runs in non-trivial computer time. Also, the search space expands considerably if stereochemically equivalent residues are allowed to replace an amino acid in the motif. In the present work, we propose a method to precompile all possible motifs comprising a set (n=4 in this case) of predefined amino acid residues from a protein structure that occur within a specified distance (R) of each other (PREMONITION). PREMONITION rolls a sphere of radius R along the protein fold centered at the C atom of each residue, and all possible motifs are extracted within this sphere. The number of residues that can occur within a sphere centered around a residue is bounded by physical constraints, thus setting an upper limit on the processing times. After such a pre-compilation step, the computational time required for querying a protein structure with multiple motifs is considerably reduced. Previously, we had proposed a computational method to estimate the promiscuity of proteins with known active site residues and 3D structure using a database of known active sites in proteins (CSA) by querying each protein with the active site motif of every other residue. The runtime for such a comparison is reduced from days to hours using the PREMONITION methodology.
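The sphere-rolling idea can be sketched in a few lines. This is an illustrative simplification rather than the PREMONITION implementation: the function name, the tuple layout and the coordinates are hypothetical, one representative atom per residue is assumed, and a motif is taken to be the central residue plus n-1 neighbours lying within R of that centre.

```python
from itertools import combinations
from math import dist


def precompile_motifs(residues, allowed, radius, n=4):
    """Enumerate n-residue motifs around each residue of an allowed type.

    residues: list of (residue_id, amino_acid, (x, y, z)) tuples, one
              representative atom per residue (an assumption of this sketch).
    allowed:  set of one-letter amino acid codes that may appear in a motif.
    Returns a set of motifs, each a sorted tuple of residue ids.
    """
    candidates = [r for r in residues if r[1] in allowed]
    motifs = set()
    for center in candidates:
        # "Roll the sphere": collect allowed residues within `radius` of the centre.
        neighbours = [
            r for r in candidates
            if r is not center and dist(r[2], center[2]) <= radius
        ]
        # The neighbour count is physically bounded, so this stays tractable.
        for others in combinations(neighbours, n - 1):
            motifs.add(tuple(sorted(r[0] for r in (center, *others))))
    return motifs


# Made-up coordinates, for illustration only.
residues = [
    (1, "H", (0.0, 0.0, 0.0)),
    (2, "D", (1.0, 0.0, 0.0)),
    (3, "S", (0.0, 1.0, 0.0)),
    (4, "G", (0.0, 0.0, 1.0)),
    (5, "H", (10.0, 0.0, 0.0)),   # allowed type, but too far away
    (6, "A", (0.5, 0.5, 0.0)),    # close, but not an allowed type
]
motifs = precompile_motifs(residues, allowed={"H", "D", "S", "G"}, radius=2.0)
print(motifs)
```

Storing the result as a set of sorted id tuples means a motif found from several centres is kept once, so later queries reduce to set lookups.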


2021 ◽  
pp. annrheumdis-2020-218359
Author(s):  
Xinyi Meng ◽  
Xiaoyuan Hou ◽  
Ping Wang ◽  
Joseph T Glessner ◽  
Hui-Qi Qu ◽  
...  

Objective. Juvenile idiopathic arthritis (JIA) is the most common type of arthritis among children, but few studies have investigated the contribution of rare variants to JIA. In this study, we aimed to identify rare coding variants associated with JIA for the genome-wide landscape. Methods. We established a rare variant calling and filtering pipeline and performed rare coding variant and gene-based association analyses on three RNA-seq datasets composed of 228 JIA patients in the Gene Expression Omnibus against different sets of controls, and further conducted replication in our whole-exome sequencing (WES) data of 56 JIA patients. Then we conducted differential gene expression analysis and assessed the impact of recurrent functional coding variants on gene expression and signalling pathway. Results. By the RNA-seq data, we identified variants in two genes reported in literature as JIA causal variants, as well as an additional 63 recurrent rare coding variants seen only in JIA patients. Among the 44 recurrent rare variants found in polyarticular patients, 10 were replicated by our WES of patients with the same JIA subtype. Several genes with recurrent functional rare coding variants also have common variants associated with autoimmune diseases. We observed immune pathways enriched for the genes with rare coding variants and differentially expressed genes. Conclusion. This study elucidated a novel landscape of recurrent rare coding variants in JIA patients and uncovered significant associations with JIA at the gene pathway level. The convergence of common variants and rare variants for autoimmune diseases is also highlighted in this study.


2020 ◽  
Vol 117 (45) ◽  
pp. 28201-28211
Author(s):  
Sumaiya Iqbal ◽  
Eduardo Pérez-Palma ◽  
Jakob B. Jespersen ◽  
Patrick May ◽  
David Hoksza ◽  
...  

Interpretation of the colossal number of genetic variants identified from sequencing applications is one of the major bottlenecks in clinical genetics, with the inference of the effect of amino acid-substituting missense variations on protein structure and function being especially challenging. Here we characterize the three-dimensional (3D) amino acid positions affected in pathogenic and population variants from 1,330 disease-associated genes using over 14,000 experimentally solved human protein structures. By measuring the statistical burden of variations (i.e., point mutations) from all genes on 40 3D protein features, accounting for the structural, chemical, and functional context of the variations’ positions, we identify features that are generally associated with pathogenic and population missense variants. We then perform the same amino acid-level analysis individually for 24 protein functional classes, which reveals unique characteristics of the positions of the altered amino acids: We observe up to 46% divergence of the class-specific features from the general characteristics obtained by the analysis on all genes, which is consistent with the structural diversity of essential regions across different protein classes. We demonstrate that the function-specific 3D features of the variants match the readouts of mutagenesis experiments for BRCA1 and PTEN, and positively correlate with an independent set of clinically interpreted pathogenic and benign missense variants. Finally, we make our results available through a web server to foster accessibility and downstream research. Our findings represent a crucial step toward translational genetics, from highlighting the impact of mutations on protein structure to rationalizing the variants’ pathogenicity in terms of the perturbed molecular mechanisms.


2017 ◽  
Vol 17 (3) ◽  
pp. 703-715 ◽  
Author(s):  
Katarzyna Piórkowska ◽  
Kacper Żukowski ◽  
Tomasz Szmatoła ◽  
Katarzyna Ropka-Molik ◽  
Mirosław Tyra

Abstract A high meat percentage in the porcine carcass has been achieved as a result of selection, but it has contributed to a deterioration of pork quality. The level of intramuscular fat has significantly declined, the pork has lost its tenderness and drip loss in meat has substantially increased, which has led to a deterioration of meat flavour and its technological suitability. The recovery of good pork quality could be supported by the development of genetic markers enabling faster breeding progress. This study presents a method that uses RNA-seq data to identify new variants in a chromosome region rich in QTLs for pork quality and to select gene candidates for these traits. This work included two pig breeds: the Polish Landrace (PL) and Puławska (PUL), which differ in meat quality and fat content. The transcriptome profile was estimated for semimembranosus and longissimus dorsi muscles. The variant calling analysis included transcripts from both muscles encoded by genes located in a region between microsatellites SW964 and SW906 (43-135.9 Mbp) on SSC15. In total, 439 transcripts were searched, 2,800 gene variants were identified and 6 mutations with a high effect belonging to the frameshift variants were found (ENSSSCG00000015976, ENSSSCG00000027516, WRN and XIRP2). Moreover, several interesting significant missense variants in PDLIM3, PLCD4 and SARAF genes were detected. These genes are recommended as candidates for meat quality; however, they require further investigation in an association study.


2021 ◽  
Author(s):  
Myo Naung ◽  
Elijah Martin ◽  
Jacob Munro ◽  
Somya Mehra ◽  
Andrew J Guy ◽  
...  

Investigation of the diversity of malaria parasite antigens can help prioritize and validate them as vaccine candidates and identify the most common variants for inclusion in vaccine formulations. Studies on Plasmodium falciparum antigen diversity have focused on well-known vaccine candidates while the diversity of several others has never been studied. Here we provide an overview of the diversity and population structure of leading vaccine candidate antigens of P. falciparum using the MalariaGEN Pf3K (version 5.1) resource, comprising more than 2600 genomes from 15 malaria endemic countries. We developed a stringent variant calling pipeline to extract high quality antigen gene sequences from the global dataset and a new R-package named VaxPack to streamline population genetic analyses. In addition, a newly developed algorithm that enables spatial averaging of selection pressure on 3D protein structures was applied to the dataset. We analysed the genes encoding 23 leading and novel candidate malaria vaccine antigens including csp, trap, eba175, ama1, rh5, and CelTOS. We found that current malaria vaccine formulations are based on rare variants and thus may have limited efficacy. High levels of diversity with evidence of balancing selection were detected for most of the erythrocytic and pre-erythrocytic antigens. Measures of natural selection were then mapped to 3D protein structures to predict targets of functional antibodies. For some antigens, geographical variation in the intensity and distribution of these signals on the 3D structure suggests adaptations to different human host or mosquito vector populations. This study provides an essential framework for the diversity of P. falciparum antigens for inclusion in the design of the next generation of malaria vaccines.


2022 ◽  
Vol 23 (2) ◽  
pp. 858
Author(s):  
Sali Anies ◽  
Vincent Jallu ◽  
Julien Diharce ◽  
Tarun J. Narwani ◽  
Alexandre G. de Brevern

Integrin αIIbβ3, a glycoprotein complex expressed at the platelet surface, is involved in platelet aggregation and contributes to primary haemostasis. Several integrin αIIbβ3 polymorphisms prevent this aggregation, causing haemorrhagic syndromes such as Glanzmann thrombasthenia (GT). Access to 3D structure allows understanding the structural effects of polymorphisms related to GT. In a previous analysis using Molecular Dynamics (MD) simulations of αIIb Calf-1 domain structure, it was observed that GT associated with single amino acid variation affects distant loops, but not the mutated position. In this study, experiments are extended to Calf-1, Thigh, and Calf-2 domains. Two loops in Calf-2 are unstructured and therefore are modelled expertly using biophysical restraints. Surprisingly, MD revealed the presence of rigid zones in these loops. Detailed analysis with a structural alphabet, the Protein Blocks (PBs), allowed observing local changes in highly flexible regions. The variant P741R located at the C-terminal of Calf-1 revealed that the presence of Calf-2 did not affect the results obtained with the isolated Calf-1 domain. Simulations for Calf-1 + Calf-2 and Thigh + Calf-1 variant systems are designed to comprehend the impact of five single amino acid variations in these domains. Distant conformational changes are observed, thus highlighting the potential role of allostery in the structural basis of GT.


2020 ◽  
Author(s):  
Sujaya Srinivasan ◽  
Natallia Kalinava ◽  
Rafael Aldana ◽  
Zhipan Li ◽  
Sjoerd van Hagen ◽  
...  

Background. Next generation sequencing is widely used in cancer to profile tumors and detect variants. Most somatic variant callers used in these pipelines identify variants at the lowest possible granularity: single nucleotide variants (SNVs). As a result, multiple adjacent SNVs are called individually instead of as a multi-nucleotide variant (MNV). The problem with this level of granularity is that the amino acid change from the individual SNVs within a codon could be different from the amino acid change based on the MNV that results from combining the SNVs. Most variant annotation tools do not account for this, leading to incorrect conclusions about the downstream effects of the variants. Method. Here, we used Variant Call Files (VCFs) from the TCGA Mutect2 caller, and developed a solution to merge SNVs to MNVs. Our custom script takes the phasing information from the SNV VCFs and, based on a gene model, determines if SNVs are at the same codon and need to be merged into an MNV prior to variant annotation. Results. We analyzed 10,383 VCFs from TCGA and found 12,141 MNVs that were incorrectly annotated. Strikingly, the analysis of seven commonly mutated genes from 178 studies from cBioPortal revealed that MNVs were consistently missed in 20 of these studies, while they were correctly annotated in 15 more recent studies. The best and most common example of MNVs was found at the BRAF V600 locus, where several public datasets reported separate BRAF V600E and BRAF V600M variants, instead of a single merged V600K variant. Conclusion. While some datasets merged MNVs correctly, many public datasets have not been corrected for this problem. As a best practice for variant calling, we recommend that MNVs be accounted for in NGS processing pipelines, thus improving analyses on the impact of somatic variants in cancer genomics.
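The BRAF V600 example can be reproduced with the standard genetic code. The sketch below is not the authors' script: the function names are hypothetical, the reference codon GTG for valine 600 is written in coding orientation, and the two SNVs are assumed to be in phase (on the same haplotype), which the abstract says is established from the VCF phasing information.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def apply_snvs(codon, snvs):
    """Apply (offset_in_codon, alt_base) substitutions to a three-base codon."""
    bases = list(codon)
    for offset, alt in snvs:
        bases[offset] = alt
    return "".join(bases)


def protein_change(ref_codon, position, snvs):
    """Return a change label such as 'V600E' for the given in-phase SNVs."""
    alt_codon = apply_snvs(ref_codon, snvs)
    return f"{CODON_TABLE[ref_codon]}{position}{CODON_TABLE[alt_codon]}"


ref = "GTG"            # codon 600 of BRAF (valine), coding orientation
snv_first = (0, "A")   # G>A at the first codon position
snv_second = (1, "A")  # T>A at the second codon position

# Annotating each SNV on its own gives two different, misleading calls...
separate = [protein_change(ref, 600, [snv]) for snv in (snv_first, snv_second)]
# ...whereas merging the in-phase SNVs into one MNV gives the true change.
merged = protein_change(ref, 600, [snv_first, snv_second])
print(separate, merged)
```

Annotated separately, the two substitutions read as V600M and V600E; merged into a single GTG to AAG change, they encode V600K, matching the discrepancy described in the Results.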

