gene index
Recently Published Documents


TOTAL DOCUMENTS

55
(FIVE YEARS 6)

H-INDEX

15
(FIVE YEARS 1)

2020 ◽  
Vol 49 (D1) ◽  
pp. D274-D281 ◽  
Author(s):  
Michael Y Galperin ◽  
Yuri I Wolf ◽  
Kira S Makarova ◽  
Roberto Vera Alvarez ◽  
David Landsman ◽  
...  

Abstract The Clusters of Orthologous Genes (COG) database, also referred to as the Clusters of Orthologous Groups of proteins, was created in 1997 and went through several rounds of updates, most recently in 2014. The current update, available at https://www.ncbi.nlm.nih.gov/research/COG, substantially expands the scope of the database to include complete genomes of 1187 bacteria and 122 archaea, typically with a single genome per genus. In addition, the current version of the COGs includes the following new features: (i) the recently deprecated NCBI’s gene index (gi) numbers for the encoded proteins are replaced with stable RefSeq or GenBank/ENA/DDBJ coding sequence (CDS) accession numbers; (ii) COG annotations are updated for >200 newly characterized protein families with corresponding references and PDB links, where available; (iii) lists of COGs grouped by pathways and functional systems are added; (iv) 266 new COGs for proteins involved in CRISPR-Cas immunity, sporulation in Firmicutes and photosynthesis in cyanobacteria are included; and (v) the database is made available as a web page, in addition to FTP. The current release includes 4877 COGs. Future plans include further expansion of the COG collection by adding archaeal COGs (arCOGs), splitting the COGs containing multiple paralogs, and continued refinement of COG annotations.


PLoS ONE ◽  
2020 ◽  
Vol 15 (10) ◽  
pp. e0240986
Author(s):  
Amin M. Cheikhi ◽  
Zariel I. Johnson ◽  
Dana R. Julian ◽  
Sarah Wheeler ◽  
Carol Feghali-Bostwick ◽  
...  

2018 ◽  
Vol 14 (1) ◽  
pp. 33-42 ◽  
Author(s):  
Behzad Hajieghrari ◽  
Naser Farrokhi ◽  
Bahram Goliaei ◽  
Kaveh Kavousi

Background: MicroRNAs (miRNAs) are small, endogenous, non-protein-coding, single-stranded RNAs approximately 18-24 nucleotides in length. High evolutionary sequence conservation of miRNAs among plant species and the availability of powerful computational tools allow identification of new orthologs and paralogs. Methods: New conserved miRNAs in P. patens were found by EST-based homology search approaches. All candidates were screened according to a series of miRNA filtering criteria. The UniGene and DFCI Gene Index (PpspGI) databases and the psRNATarget algorithm were applied to identify target transcripts using P. patens putative conserved miRNA sequences. Results: Nineteen conserved P. patens miRNAs were identified. The sequences were homologous to known reference plant mature miRNAs from 10 miRNA families. They could be folded into the typical miRNA secondary structures. The RepeatMasker algorithm demonstrated that ppt-miR2919e and ppt-miR1533 had simple sequence repeats in their sequences. Target sites (49 genes) were identified for 7 out of 19 miRNAs. GO and KEGG analysis indicated that some of the targets are involved in multiple important biological and metabolic processes. Conclusion: The majority of the miRNAs registered in databases were predicted by computational approaches, while many more remain unknown. Because miRNAs are conserved across plant species, from closely to distantly related ones, homology search-based approaches between plant species could lead to the identification of novel miRNAs in other plant species, providing baseline information for further research on the biological functions and evolution of miRNAs.


Blood ◽  
2018 ◽  
Vol 132 (Supplement 1) ◽  
pp. 1895-1895
Author(s):  
Daniel Penaherrera ◽  
Sheri Skerget ◽  
Austin Christofferson ◽  
Jessica Aldrich ◽  
Sara Nasser ◽  
...  

Abstract Multiple Myeloma (MM) is a genetically heterogeneous disease of plasma cells that generally exhibits chromosomal abnormalities and distinct gene expression signatures. Previous studies have sought to identify gene expression indices using microarray technology to discern genes associated with survival outcomes to predict whether a newly diagnosed patient has an aggressive form of the disease. One such MM-specific index is the UAMS 70 gene index, which is composed of 51 over- and 19 under-expressed genes. This index was developed using Affymetrix U133Plus2.0 microarray data from 532 MM patients at diagnosis by computing log-rank test statistics on gene expression quartiles. Despite consistently achieving a high performance across a variety of MM datasets, issues arise when applying this index to RNAseq data. Here we address those issues, deriving an independent index based on the RNAseq data from the Multiple Myeloma Research Foundation (MMRF) CoMMpass Study (NCT01454297), and benchmark its performance to an implementation of the UAMS 70 gene index. UAMS index scores are computed by taking the difference between the average log2-scale expression of the 51 over- and 19 under-expressed genes. We applied this calculation to RNAseq data analyzed using Sailfish, Salmon v7.2, and HTseq counts collected from 41 Multiple Myeloma Genomics Initiative samples and compared the results to scores from matching GCRMA, MAS5, RMA, and PLIER16 Affymetrix U133Plus2.0 microarray data. Differences in the distribution of index values across data types led to nonconforming classification of high-risk individuals. Additionally, when applied to RNAseq data, several Affymetrix probesets did not uniquely match to gene annotations from Ensembl-v74. This reduced the number of genes upon which our UAMS score was calculated to 61 genes. 
Of the original 51 over-expressed probes, only 44 uniquely mapped genes remained after 7 multi-mapped probes were removed and, similarly, out of the 19 under-expressed genes only 17 were uniquely mapped. Given the complication of probe-gene mismatch and inconsistencies in identifying high-risk individuals when applied to RNAseq data, we developed an independent index using the baseline RNAseq data from the MMRF CoMMpass Study IA13 dataset. From a training set (n=375) of RNAseq data measuring 56430 genes, we performed univariate log-rank tests on expression quartiles associated with disease-related survival while controlling for an FDR of 2.5%, resulting in 23 under- and 332 over-expressed genes. Subsequent multivariate Cox regression analysis and backward stepwise selection culminated in the identification of the CoMMpass RNAseq index, which is based on the ratio of mean expression values of 87 genes (19 under- and 68 over-expressed) predictive of high risk (hazard ratio [HR] = 8.7341, 95% CI = 5.615-13.58, p < 0.001). Validation on the test set (n=251) yielded a HR of 5.612 (95% CI = 3.066-10.27, p < 0.001) as compared to a HR of 4.753 (95% CI = 2.688-8.403, p < 0.001) achieved with the adapted UAMS index. Adjusting for a patient's International Staging System (ISS) stage revises these hazard ratios to 6.236 (95% CI = 3.345-11.627, p < 0.001) and 3.6420 (95% CI = 1.9726-6.724, p < 0.001) for the CoMMpass RNAseq and the adapted UAMS indices, respectively. Furthermore, the distribution of CoMMpass RNAseq index values across the training and test sets shows no observable bias with respect to the three main therapy arms, suggesting it is predictive of high risk independent of treatment. Our newly derived CoMMpass RNAseq index shares one gene in common with the UAMS 61 gene index (CENPW) and recovers two over-expressed genes (FABP5, TAGLN2), which were removed from the UAMS 70 gene index due to probe multimapping. 
When the recovered genes are added back to the UAMS index, the unadjusted and adjusted hazard ratios measured for the test set are 5.173 (CI = 2.926-9.146, p < 0.001) and 4.022 (CI = 2.1840-7.408, p < 0.001), respectively. Of the original 70 genes in the UAMS index, 21 (30%) map to chromosome 1, which frequently exhibits copy number gains in MM. Only 11 of the 87 (13%) genes in our proposed index map to chr1, which indicates that, given its performance, the newly derived list of genes may represent a more diverse index to predict, and provide novel insights into, high risk MM. Altogether, the CoMMpass RNAseq index identifies a high risk signature in 13% of MM patients and outperforms the UAMS index. Disclosures Lonial: Amgen: Research Funding.
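
The UAMS-style score described in the abstract above (the difference between the average log2-scale expression of the over-expressed genes and that of the under-expressed genes) can be illustrated with a minimal sketch. The function name `index_score`, the toy gene names and values, and the `pseudocount` parameter (added here only so that zero expression values have a defined logarithm) are assumptions for illustration; they are not taken from the study or from any published implementation.

```python
import math
from statistics import mean


def index_score(expression, over_genes, under_genes, pseudocount=1.0):
    """Mean log2 expression of over-expressed genes minus that of under-expressed genes.

    `expression` maps gene name -> non-negative expression value.
    `pseudocount` is an assumption of this sketch, used to avoid log2(0).
    """
    def mean_log2(genes):
        return mean(math.log2(expression[g] + pseudocount) for g in genes)

    return mean_log2(over_genes) - mean_log2(under_genes)


# Toy, hypothetical values (not study data).
expression = {"GENE_A": 7.0, "GENE_B": 31.0, "GENE_C": 3.0, "GENE_D": 0.0}
score = index_score(
    expression,
    over_genes=["GENE_A", "GENE_B"],
    under_genes=["GENE_C", "GENE_D"],
)
print(score)  # mean(3, 5) - mean(2, 0) = 3.0
```

Because the score is a difference of means, dropping genes (for example the multi-mapped probes mentioned above) changes the averages and can shift the distribution of scores, which is one plausible reason a fixed high-risk cutoff may not transfer between data types.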


2017 ◽  
Vol 2017 ◽  
pp. 1-5
Author(s):  
Elizabeth S. Sandberg ◽  
Ali S. Calikoglu ◽  
Karen J. Loechner ◽  
Lydia L. Snyder

Deficiency of the short stature homeobox-containing (SHOX) gene is a frequent cause of short stature in children (2–15%). Here, we report 7 siblings with SHOX deficiency due to a point mutation in the SHOX gene. The index case was a 3-year-old male who presented for evaluation of short stature. His past medical history and birth history were unremarkable. Family history was notable for multiple individuals with short stature. Physical exam revealed short stature, with a height standard deviation score (SDS) of −2.98, as well as an arm span 3 cm less than his height. His laboratory workup was noncontributory for common etiologies of short stature. Due to significant familial short stature and shortened arm span, SHOX gene analysis was performed and revealed that the patient is heterozygous for a novel SHOX gene mutation at nucleotide position c.582. This mutation is predicted to cause termination of the SHOX protein at codon 194, effectively causing haploinsufficiency. Six out of nine other siblings were later found to also be heterozygous for the same mutation. Growth hormone was initiated in all seven siblings upon diagnosis and they have demonstrated improved height SDS.
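
For readers unfamiliar with the height SDS quoted above, a standard deviation score is the measured height minus the reference mean for age and sex, divided by the reference standard deviation. The sketch below shows the arithmetic only; the function name `height_sds` and the reference values are hypothetical placeholders, not figures from the case report or from any growth chart.

```python
def height_sds(height_cm, reference_mean_cm, reference_sd_cm):
    """Standard deviation score: (measured - reference mean) / reference SD."""
    if reference_sd_cm <= 0:
        raise ValueError("reference SD must be positive")
    return (height_cm - reference_mean_cm) / reference_sd_cm


# Hypothetical numbers chosen for easy arithmetic (not clinical reference data).
sds = height_sds(height_cm=85.0, reference_mean_cm=96.0, reference_sd_cm=4.0)
print(sds)  # (85 - 96) / 4 = -2.75
```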


2015 ◽  
Vol 60 (1) ◽  
pp. 84-94 ◽  
Author(s):  
Gabriela S. Kinker ◽  
Sueli M. Oba-Shinjo ◽  
Claudia E. Carvalho-Sousa ◽  
Sandra M. Muxel ◽  
Suely K. N. Marie ◽  
...  