Expert Curation of the Human and Mouse Olfactory Receptor Gene Repertoires Identifies Conserved Coding Regions Split Across Two Exons

2020
Author(s):
If Habib Ahmed Barnes
Ximena Ibarra-Soria
Stephen Fitzgerald
Jose Manuel Gonzalez
Claire Davidson
...

Abstract Background: Olfactory receptor (OR) genes are the largest multi-gene family in the mammalian genome, with 874 in human and 1483 loci in mouse (including pseudogenes). The expansion of the OR gene repertoire has occurred through numerous duplication events followed by diversification, resulting in a large number of highly similar paralogous genes. These characteristics have made the annotation of the complete OR gene repertoire a complex task. Most OR genes have been predicted in silico and are typically annotated as intronless coding sequences. Results: Here we have developed an expert curation pipeline to analyse and annotate every OR gene in the human and mouse reference genomes. By combining evidence from structural features, evolutionary conservation and experimental data, we have unified the annotation of these gene families, and have systematically determined the protein-coding potential of each locus. We have defined the non-coding regions of many OR genes, enabling us to generate full-length transcript models. We found that 13 human and 41 mouse OR loci have coding sequences that are split across two exons. These split OR genes are conserved across mammals, and are expressed at the same level as protein-coding OR genes with an intronless coding region. Our findings challenge the long-standing and widespread notion that the coding region of a vertebrate OR gene is contained within a single exon. Conclusions: This work provides the most comprehensive curation effort of the human and mouse OR gene repertoires to date. The complete annotation has been integrated into the GENCODE reference gene set, for immediate availability to the research community.

2019
Author(s):
If Barnes
Ximena Ibarra-Soria
Stephen Fitzgerald
Jose Gonzalez
Claire Davidson
...

Abstract Olfactory receptor (OR) genes are the largest multi-gene family in the mammalian genome, with over 850 in human and nearly 1500 genes in mouse. The expansion of the OR gene repertoire has occurred through numerous duplication events followed by diversification, resulting in a large number of highly similar paralogous genes. These characteristics have made the annotation of the complete OR gene repertoire a complex task. Most OR genes have been predicted in silico and are typically annotated as intronless coding sequences. Here we have developed an expert curation pipeline to analyse and annotate every OR gene in the human and mouse reference genomes. By combining evidence from structural features, evolutionary conservation and experimental data, we have unified the annotation of these gene families, and have systematically determined the protein-coding potential of each locus. We have defined the non-coding regions of many OR genes, enabling us to generate full-length transcript models. We found that 13 human and 41 mouse OR loci have coding sequences that are split across two exons. These split OR genes are conserved across mammals, and are expressed at the same level as protein-coding OR genes with an intronless coding region. Our findings challenge the long-standing and widespread notion that the coding region of a vertebrate OR gene is contained within a single exon.


2000
Vol 10 (12)
pp. 1968-1978
Author(s):
Anke Ehlers
Stephan Beck
Simon A. Forbes
John Trowsdale
Armin Volz
...

Clusters of olfactory receptor (OR) genes are found on most human chromosomes. They are one of the largest mammalian multigene families. Here, we report a systematic study of polymorphism of OR genes belonging to the largest fully sequenced OR cluster. The cluster contains 36 OR genes, of which two belong to the vomeronasal 1 (V1-OR) family. The cluster is divided into a major and a minor region at the telomeric end of the HLA complex on chromosome 6. These OR genes could be involved in MHC-related mate preferences. The polymorphism screen was carried out with 13 genes from the HLA-linked OR cluster and three genes from chromosomes 7, 17, and 19 as controls. Ten human cell lines, representing 18 different chromosome 6s, were analyzed. They were from various ethnic origins and exhibited different HLA haplotypes. All OR genes tested, including those not linked to the HLA complex, were polymorphic. These polymorphisms were dispersed along the coding region and resulted in up to seven alleles for a given OR gene. Three polymorphisms resulted either in stop codons (genes hs6M1-4P, hs6M1-17) or in a 16-bp deletion (gene hs6M1-19P), possibly leading to lack of ligand recognition by the respective receptors in the cell line donors. In total, 13 HLA-linked OR haplotypes could be defined. Therefore, allelic variation appears to be a general feature of human OR genes. [The sequence data reported in this paper have been submitted to EMBL under accession nos. AC006137, AC004178, AJ132194, AL022727, AL031983, AL035402, AL035542, Z98744, CAB55431, AL050339, AL035402, AL096770, AL133267, AL121944, Z98745, AL021808, and AL021807.]


BMC Genomics
2020
Vol 21 (1)
Author(s):
If H. A. Barnes
Ximena Ibarra-Soria
Stephen Fitzgerald
Jose M. Gonzalez
Claire Davidson
...

1991
Vol 11 (3)
pp. 1770-1776
Author(s):  
R G Collum ◽  
D F Clayton ◽  
F W Alt

We found that the canary N-myc gene is highly related to mammalian N-myc genes in both the protein-coding region and the long 3' untranslated region. Examined coding regions of the canary c-myc gene were also highly related to their mammalian counterparts, but in contrast to N-myc, the canary and mammalian c-myc genes were quite divergent in their 3' untranslated regions. We readily detected N-myc and c-myc expression in the adult canary brain and found N-myc expression both at sites of proliferating neuronal precursors and in mature neurons.


2016
Vol 4 (6)
Author(s):  
Xuehua Wan ◽  
Shaobin Hou ◽  
Kazukuni Hayashi ◽  
James Anderson ◽  
Stuart P. Donachie

Rheinheimera salexigens KH87 T is an obligately halophilic gammaproteobacterium. The strain’s draft genome sequence, generated by the Roche 454 GS FLX+ platform, comprises two scaffolds of ~3.4 Mbp and ~3 kbp, with 3,030 protein-coding sequences and 58 tRNA coding regions. The G+C content is 42 mol%.


2019
Vol 44 (9)
pp. 705-720
Author(s):  
James E Farber ◽  
Robert P Lane

Abstract Olfactory neuronal function depends on the expression and proper regulation of odorant receptor (OR) genes. Previous studies have identified 54 putative intergenic enhancers within or flanking 40 mouse OR clusters. At least 2 of these putative enhancers have been shown to regulate the expression of a small subset of proximal OR genes. In recognition of the large size of the mouse OR gene family (~1400 OR genes distributed across multiple chromosomal loci), it is likely that there remain many additional not-as-yet discovered OR enhancers. We utilized 23 of the previously identified enhancers as a training set (TS) and designed an algorithm that combines a broad range of epigenetic criteria (histone-3-lysine-4 monomethylation, histone-3-lysine-79 trimethylation, histone-3-lysine-27 acetylation, and DNase hypersensitivity) and genetic criteria (cross-species sequence conservation and transcription-factor binding site enrichment) to more broadly search OR gene clusters for additional candidates. We identified 181 new candidate enhancers located at 58 (of 68) mouse OR loci, including 25 new candidates identified by stringent search criteria whose signal strengths are not significantly different from the 23 previously characterized OR enhancers used as the TS. Additionally, we compared OR enhancer versus generic enhancer features in order to evaluate likelihoods that new enhancer candidates specifically function in OR regulation. We found that features distinguishing OR-specific function are significantly more evident for enhancer candidates located within OR clusters as compared with those in flanking regions.


2019
Vol 109 (6)
pp. 983-992
Author(s):  
Dan Edward V. Villamor ◽  
Kenneth C. Eastwell

Western X (WX) disease, caused by ‘Candidatus Phytoplasma pruni’, is a devastating disease of sweet cherry resulting in the production of small, bitter-flavored fruits that are unmarketable. Escalation of WX disease in Washington State prompted the development of a rapid detection assay based on recombinase polymerase amplification (RPA) to facilitate timely removal and replacement of diseased trees. Here, we report on a reliable RPA assay targeting putative immunodominant protein coding regions that showed comparable sensitivity to polymerase chain reaction (PCR) in detecting ‘Ca. Phytoplasma pruni’ from crude sap of sweet cherry tissues. Apart from the predominant strain of ‘Ca. Phytoplasma pruni’, the RPA assay also detected a novel strain of phytoplasma from several WX-affected trees. Multilocus sequence analyses using the immunodominant protein A (idpA), imp, rpoE, secY, and 16S ribosomal RNA regions from several ‘Ca. Phytoplasma pruni’ isolates from WX-affected trees showed that this novel phytoplasma strain represents a new subgroup within the 16SrIII group. Examination of high-throughput sequencing data from total RNA of WX-affected trees revealed that the imp coding region is highly expressed, and as supported by quantitative reverse transcription PCR data, it showed higher RNA transcript levels than the previously proposed idpA coding region of ‘Ca. Phytoplasma pruni’.

