smORFunction: a tool for predicting functions of small open reading frames and microproteins

2020 ◽  
Vol 21 (1) ◽  
Author(s):  
Xiangwen Ji ◽  
Chunmei Cui ◽  
Qinghua Cui

Abstract Background A small open reading frame (smORF) is an open reading frame with a length of less than 100 codons. Microproteins, translated from smORFs, have been found to participate in a variety of biological processes such as muscle formation and contraction, cell proliferation, and immune activation. Although previous studies have collected and annotated a large number of smORFs, the functions of the vast majority of them are still unknown. It is thus increasingly important to develop computational methods to annotate the functions of these smORFs. Results In this study, we collected 617,462 unique smORFs from three studies. The expression of smORF RNAs was estimated from reannotated microarray probes. Using a speed-optimized correlation algorithm, the functions of smORFs were predicted from their correlated genes with known functional annotations. When applied to 5 known microproteins from the literature, our method successfully predicted their functions. Further validation against the UniProt database showed that at least one function was predicted for 202 out of 270 microproteins. Conclusions We developed a method, smORFunction, to provide function predictions of smORFs/microproteins in up to 265 models generated from 173 datasets, covering 48 tissues/cells and 82 diseases (and normal). The tool is available at https://www.cuilab.cn/smorfunction.
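
The correlation-based idea in this abstract can be illustrated with a minimal Python sketch. It is not the smORFunction implementation: the use of Pearson correlation, the simple vote tally standing in for a functional enrichment test, the function names `pearson` and `predict_functions`, and the toy gene names and annotations are all assumptions made for illustration.

```python
from collections import Counter
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant profile is treated as uncorrelated
    return cov / sqrt(var_x * var_y)


def predict_functions(smorf_profile, gene_profiles, gene_annotations, top_k=2):
    """Rank annotated genes by |r| with the smORF, then tally their annotations.

    The tally is a hypothetical stand-in for a proper enrichment analysis.
    """
    ranked = sorted(
        gene_profiles,
        key=lambda g: abs(pearson(smorf_profile, gene_profiles[g])),
        reverse=True,
    )
    votes = Counter()
    for gene in ranked[:top_k]:
        votes.update(gene_annotations.get(gene, ()))
    return [term for term, _ in votes.most_common()]


# Toy expression profiles across four samples (made-up values).
smorf = [1.0, 2.0, 3.0, 4.0]
genes = {
    "GENE_A": [2.0, 4.1, 6.0, 8.2],
    "GENE_B": [1.1, 2.0, 2.9, 4.2],
    "GENE_C": [5.0, 1.0, 4.0, 2.0],
}
annotations = {
    "GENE_A": ["muscle contraction"],
    "GENE_B": ["muscle contraction", "calcium transport"],
    "GENE_C": ["immune activation"],
}

print(predict_functions(smorf, genes, annotations, top_k=2))
```

In this toy case the two genes whose profiles track the smORF most closely contribute their annotations, so the shared term is ranked first. A real analysis would need a significance test and multiple-testing correction rather than a raw count.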



2019 ◽  
Vol 15 (2) ◽  
pp. 108-116 ◽  
Author(s):  
Alexandra Khitun ◽  
Travis J. Ness ◽  
Sarah A. Slavoff

Increasing evidence suggests that some small open reading frame-encoded polypeptides (SEPs) function in prokaryotic and eukaryotic cellular stress responses.


1999 ◽  
Vol 10 (04) ◽  
pp. 635-643 ◽  
Author(s):  
AGNIESZKA GIERLIK ◽  
PAWEŁ MACKIEWICZ ◽  
MARIA KOWALCZUK ◽  
STANISŁAW CEBRAT ◽  
MIROSŁAW R. DUDEK

Coding sequences of DNA generate Open Reading Frames (ORFs) inside them with much higher frequency than random DNA sequences do, especially in the antisense strand. This is a specific feature of the genetic code. Since coding sequences are selected for their length, the generated ORFs are indirect results of this selection and their length is also influenced by selection. That is why ORFs found in any genome, even ones much longer than those spontaneously generated in random DNA sequences, should be considered as two different sets of ORFs: the first one coding for proteins, the second one generated by the coding ORFs. Even intergenic sequences possess greater capacity for generating ORFs than random DNA sequences of the same nucleotide composition, which suggests that intergenic sequences were generated from coding sequences by recombinational mechanisms.
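
To make the notion of ORFs "generated" on either strand concrete, here is a minimal Python sketch that scans the three forward frames of a sequence for ATG-to-stop ORFs; the antisense strand is scanned by passing in the reverse complement. The ORF definition (first ATG to the next in-frame stop, with the stop codon excluded from the length) and the names `reverse_complement` and `orf_lengths` are assumptions for illustration, not the procedure used in the paper.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the antisense strand, read 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def orf_lengths(seq, min_codons=1):
    """Yield (frame, start, codon_count) for ATG..stop ORFs in the 3 forward frames.

    codon_count includes the start codon and excludes the stop codon.
    """
    seq = seq.upper()
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # open a new ORF at the first in-frame ATG
            elif start is not None and codon in STOP_CODONS:
                n_codons = (i - start) // 3
                if n_codons >= min_codons:
                    yield frame, start, n_codons
                start = None  # close the ORF and keep scanning


sense = "ATGAAATAG"
print(list(orf_lengths(sense)))                      # ORFs on the sense strand
print(list(orf_lengths(reverse_complement(sense))))  # ORFs on the antisense strand
```

Comparing the ORF length distributions obtained this way for real coding sequences and for shuffled sequences of the same nucleotide composition is one way to examine the kind of excess the abstract describes.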


2008 ◽  
Vol 82 (17) ◽  
pp. 8917-8921 ◽  
Author(s):  
Christopher J. McCormick ◽  
Omar Salim ◽  
Paul R. Lambden ◽  
Ian N. Clarke

ABSTRACT A generally accepted view of norovirus replication is that capsid expression requires production of a subgenomic transcript, the presence of capsid often being used as a surrogate marker to indicate the occurrence of viral replication. Using a polymerase II-based baculovirus delivery system, we observed capsid expression following introduction of a full-length genogroup 3 norovirus genome into HepG2 cells. However, capsid expression occurred as a result of a novel translation termination/reinitiation event between the nonstructural-protein and capsid open reading frames, a feature that may be unique to genogroup 3 noroviruses.


1990 ◽  
Vol 10 (1) ◽  
pp. 28-36 ◽  
Author(s):  
C I Brannan ◽  
E C Dees ◽  
R S Ingram ◽  
S M Tilghman

The mouse H19 gene was identified as an abundant hepatic fetal-specific mRNA under the transcriptional control of a trans-acting locus termed raf. The protein this gene encoded was not apparent from an analysis of its nucleotide sequence, since the mRNA contained multiple translation termination signals in all three reading frames. As a means of assessing which of the 35 small open reading frames might be important to the function of the gene, the human H19 gene was cloned and sequenced. Comparison of the two homologs revealed no conserved open reading frame. Cellular fractionation showed that H19 RNA is cytoplasmic but not associated with the translational machinery. Instead, it is located in a particle with a sedimentation coefficient of approximately 28S. Despite the fact that it is transcribed by RNA polymerase II and is spliced and polyadenylated, we suggest that the H19 RNA is not a classical mRNA. Instead, the product of this unusual gene may be an RNA molecule.


2016 ◽  
Vol 61 (3) ◽  
Author(s):  
Costas C. Papagiannitsis ◽  
Leonidas S. Tzouvelekis ◽  
Eva Tzelepi ◽  
Vivi Miriagou

ABSTRACT By searching the Integrall integron and GenBank databases, a novel open reading frame (ORF) of 51 nucleotides (nts) (ORF-17) overlapping the previously described ORF-11 was identified within the attI1 site in virtually all class 1 integrons. Using a set of isogenic plasmid constructs carrying a single gene cassette (bla GES-1) and possessing a canonical translation initiation region, we found that ORF-17 contributes to GES-1 expression.


1987 ◽  
Vol 7 (8) ◽  
pp. 2728-2734 ◽  
Author(s):  
C A Strick ◽  
T D Fox

The yeast nuclear gene PET111 is required specifically for translation of the mitochondrion-coded mRNA for cytochrome c oxidase subunit II. We have determined the nucleotide sequence of a 3-kilobase segment of DNA that carries PET111. The sequence contains a single long open reading frame that predicts a basic protein of 718 amino acids. The PET111 gene product is a mitochondrial protein, since a hybrid protein which includes the amino-terminal 154 amino acids of PET111 fused to beta-galactosidase is specifically associated with mitochondria. PET111 is translated from a 2.9-kilobase mRNA which, interestingly, has an extended 5'-leader sequence containing four short open reading frames upstream of the long open reading frame. These open reading frames exhibit an interesting pattern of overlap with each other and with the PET111 reading frame.



