Proteins encoded by Novel ORFs have increased disorder but can be biochemically regulated and harbour pathogenic mutations
Abstract
Recent evidence suggests that protein or protein-like products can be encoded by previously uncharacterized Open Reading Frames (ORFs), which we define as Novel Open Reading Frames (nORFs)1,2. These nORFs are present in both coding and non-coding regions of the human genome, and the novel proteins they encode increase the size and complexity of the cellular proteome in organisms from bacteria to humans. Whether these protein or protein-like products play any significant biological role remains an open question, but they have raised hopes as targets for anticancer and antimicrobial therapy3,4. To infer whether these novel proteins can perform biological functions, we used computational predictions to systematically investigate whether their amino acid sequences can form ordered or disordered structures. Our results indicate that these novel proteins have significantly higher predicted structural disorder than known proteins, yet we find no correlation between the pathogenicity of mutations and whether they fall in ordered or disordered regions of these novel proteins. These findings suggest that these novel proteins merit more systematic investigation, as they may be important for understanding complex diseases.