Protein-Coding Genes in Euarchontoglires with Pseudogene Homologs in Humans

Life ◽  
2020 ◽  
Vol 10 (9) ◽  
pp. 192
Author(s):  
Lev I. Rubanov ◽  
Oleg A. Zverkov ◽  
Gregory A. Shilovsky ◽  
Alexandr V. Seliverstov ◽  
Vassily A. Lyubetsky

An original bioinformatics technique is developed to identify the protein-coding genes in rodents, lagomorphs and nonhuman primates that are pseudogenized in humans. The method is based on per-gene verification of local synteny, similarity of exon-intron structures and orthology in a set of genomes. It is applicable to any genome set, even one with more than 100 genomes, and is efficiently implemented in fast computer software. Only 50 evolutionarily recent human pseudogenes were predicted. Their functional homologs in model species are often associated with the immune system or digestion and are mainly expressed in the testes. According to current evidence, knockout of most of these genes leads to an abnormal phenotype. Some genes were pseudogenized or lost independently in human and nonhuman hominoids.

Blood ◽  
2015 ◽  
Vol 126 (23) ◽  
pp. 2989-2989 ◽  
Author(s):  
Mehmet K Samur ◽  
Annamaria Gulla ◽  
Alice Cleynen ◽  
Florence Magrangeas ◽  
Stephane Minvielle ◽  
...  

Abstract Long intergenic non-coding RNAs (lincRNAs) are transcripts longer than 200 nucleotides that have a diverse set of regulatory functions but are not translated into protein. lincRNAs are located between protein-coding genes and do not overlap exons of either protein-coding or other non-lincRNA genes. However, the precise role of individual lincRNAs in disease biology remains unclear. Here, we evaluated lincRNA expression and potential biological functions in multiple myeloma (MM). We performed RNA-seq on CD138+ MM cells from 296 newly diagnosed patients and 16 normal bone marrow plasma cell (NBM) samples and analyzed lincRNA expression. Paired-end RNA-seq reads were mapped to the latest human genome, differentially expressed lincRNAs were identified, and for each expressed lincRNA event-free survival (EFS) was examined with a univariate Cox regression model and a support vector machine. Finally, to identify their biological role, we identified protein-coding genes that are strongly correlated (cor > 0.5) with lincRNAs that show significantly altered expression in MM and an impact on EFS. lincRNAs and protein-coding genes with more than 10 reads per million reads in at least 15 normal samples or 62 MM samples (20% of all MM samples) were included in the analysis. We identified 60 differentially expressed lincRNAs (adjusted p value < 0.05), 51 of which had at least a 1.5-fold change. The differentially expressed lincRNAs included lincRNAs in close proximity to Ig-related genes and genome stability-related genes, lincRNAs hosting miRNAs such as mir222 and mir22, and lincRNAs previously reported in other cancers (PVT and TTY15). We evaluated the relation of these lincRNAs to EFS and observed 6 lincRNAs associated with shorter EFS. We developed a multivariate signature model to predict EFS using these 6 lincRNAs. We divided our dataset into training (n=99) and test (n=156) sets and used support vector machine classification to divide samples into 2 groups using the six lincRNAs.
This model was able to predict good and poor survival groups in the training dataset (p val < 0.001) as well as the test dataset (p val = 0.002) (Figure). We examined genome-wide correlation between these six differentially expressed and prognostically significant lincRNAs and expressed protein-coding genes to identify their biological functions in MM. Four of these lincRNAs strongly correlated with 47 to 504 genes (abs(cor) > 0.5), affecting immune system pathways and pathways in cancer, including the Jak-STAT signaling pathway. We also found that these lincRNAs are highly correlated with tumor development genes such as TNFRSF1B, FGR, TP53BP2 and TNF and with the T or B cell-related genes PIK3CD and BCL6. In addition, two of these lincRNAs (LINC00936 and CTB-61M7.2) were highly correlated with their neighboring protein-coding genes ATP2B1 (cor = 0.45) and FCAR (cor = 0.95), respectively, and MIR22HG was the host gene for mir22, which may indicate that lincRNAs use different machinery in MM to regulate protein-coding genes. In summary, we report that lincRNAs are differentially expressed and prognostically significant in myeloma and may function through their impact on the immune system and tumor progression. Our ongoing integrative approach will provide further evidence of their regulatory role in MM with potential therapeutic application. Figure 1. Disclosures Anderson: Acetylon Pharmaceuticals: Equity Ownership; Celgene Corporation: Consultancy; Gilead: Consultancy; Oncocorp: Equity Ownership; Millennium: Consultancy; BMS: Consultancy. Munshi: Onyx: Membership on an entity's Board of Directors or advisory committees; Celgene: Membership on an entity's Board of Directors or advisory committees; Novartis: Membership on an entity's Board of Directors or advisory committees; Millennium: Membership on an entity's Board of Directors or advisory committees.
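
The expression filter and the correlation screen described in this abstract can be illustrated with a minimal Python sketch. The thresholds (more than 10 reads per million, at least 15 normal or 62 MM samples, absolute correlation above 0.5) are taken from the abstract; the function names, the input layout and the choice of the Pearson coefficient are assumptions made for illustration, since the abstract does not say which correlation measure or software was used.

```python
from math import sqrt


def passes_expression_filter(normal_rpm, mm_rpm,
                             min_rpm=10.0, min_normal=15, min_mm=62):
    """Return True if a gene exceeds min_rpm reads per million in at least
    min_normal normal samples or at least min_mm MM samples."""
    n_normal = sum(1 for value in normal_rpm if value > min_rpm)
    n_mm = sum(1 for value in mm_rpm if value > min_rpm)
    return n_normal >= min_normal or n_mm >= min_mm


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (assumed measure)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant profile carries no correlation signal
    return sxy / sqrt(sxx * syy)


def correlated_genes(linc_expr, gene_expr, threshold=0.5):
    """Names of genes whose profile has abs(cor) > threshold with the lincRNA.

    gene_expr is a hypothetical mapping: gene name -> expression values
    listed in the same sample order as linc_expr.
    """
    return sorted(
        name for name, values in gene_expr.items()
        if abs(pearson(linc_expr, values)) > threshold
    )
```

In this sketch a gene would first be screened with `passes_expression_filter` and only then compared against each lincRNA profile with `correlated_genes`; differential expression testing and the survival model are outside its scope.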


2015 ◽  
Author(s):  
Elisha D Roberson

CRISPR/Cas9 is emerging as one of the most widely used methods of genome modification in organisms ranging from bacteria to human cells. However, the efficiency of editing varies tremendously from site to site. A recent report identified a novel motif, called the 3’GG motif, which substantially increased the efficiency of editing at all sites tested. Furthermore, its authors highlighted that previously published gRNAs with high editing efficiency also had this motif. I designed a Python command-line tool, ngg2, to identify 3’GG gRNA sites from indexed FASTA files. As a proof-of-concept, I screened for these motifs in six genomes: Saccharomyces cerevisiae, Caenorhabditis elegans, Drosophila melanogaster, Danio rerio, Mus musculus, and Homo sapiens. I identified more than 24 million single-match 3’GG motifs in these reference genomes. More than 87% of all protein-coding genes in the six reference genomes had at least one overlapping unique 3’GG gRNA site. In particular, more than 96% of mouse and 99% of human protein-coding genes have at least one unique, overlapping 3’GG gRNA. These identified sites can be used as a starting point in gRNA design, and the ngg2 tool provides an important ability to identify high-efficiency editing sites in non-model species.
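
The kind of search described in this abstract can be sketched in a few lines of Python. This is not the ngg2 implementation: it assumes that a 3’GG site is a 20-nt protospacer whose last two bases are GG, followed immediately by an NGG PAM, and it scans a plain sequence string on both strands rather than an indexed FASTA file. The function names are hypothetical, and the genome-wide uniqueness check mentioned in the abstract is not attempted.

```python
import re

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")

# Assumed motif: 18 arbitrary bases + GG (3' end of the 20-nt protospacer),
# then an NGG PAM. The lookahead allows overlapping sites to be reported.
_SITE = re.compile(r"(?=([ACGT]{18}GG)([ACGT]GG))")

_SITE_LENGTH = 23  # 20-nt protospacer + 3-nt PAM


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_3gg_sites(seq):
    """Return (start, strand, protospacer, pam) tuples for both strands.

    start is the 0-based forward-strand coordinate of the leftmost base
    of the 23-nt site, whichever strand the site lies on.
    """
    seq = seq.upper()
    sites = []
    for strand, text in (("+", seq), ("-", reverse_complement(seq))):
        for match in _SITE.finditer(text):
            start = match.start()
            if strand == "-":
                # Map the reverse-strand offset back to forward coordinates.
                start = len(seq) - (match.start() + _SITE_LENGTH)
            sites.append((start, strand, match.group(1), match.group(2)))
    return sites
```

Calling `find_3gg_sites` on each chromosome sequence would give candidate sites only; deciding which of them are single-match would additionally require counting each protospacer across the whole genome.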


2019 ◽  
Vol 07 (02) ◽  
Author(s):  
Saira Bibi ◽  
Muhammad Fiaz Khan ◽  
Aqsa Rehman ◽  
Faisal Nouroz
