gene symbols

2021 ◽  
Author(s):  
James M Heather ◽  
Matthew J Spindler ◽  
Marta Herrero Alonso ◽  
Yifang Ivana Shui ◽  
David G Millar ◽  
...  

The study and manipulation of T cell receptors (TCRs) is central to multiple fields across basic and translational immunology research. Produced by V(D)J recombination, TCRs are often only recorded in the literature and data repositories as a combination of their V and J gene symbols, plus their hypervariable CDR3 amino acid sequence. However, numerous applications require full-length coding nucleotide sequences. Here we present Stitchr, a software tool developed to specifically address this limitation. Given minimal V/J/CDR3 information, Stitchr produces complete coding sequences representing a fully spliced TCR cDNA. Due to its modular design, Stitchr can be used for TCR engineering using either published germline or novel/modified variable and constant region sequences. Sequences produced by Stitchr were validated by synthesizing and transducing TCR sequences into Jurkat cells, recapitulating the expected antigen specificity of the parental TCR. Using a companion script, Thimble, we demonstrate that Stitchr can process a million TCRs in under ten minutes using a standard desktop personal computer. By systemizing the production and modification of TCR sequences, we propose that Stitchr will increase the speed, repeatability, and reproducibility of TCR research. Stitchr is available on GitHub.
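The idea of rebuilding a full receptor chain from minimal V/J/CDR3 information can be illustrated with a toy sketch. This is not Stitchr's actual algorithm or interface: it works on amino acids rather than nucleotides, the function name `stitch_chain` is hypothetical, and the sequences are short illustrative strings rather than curated germline references. It assumes only that the start of the CDR3 overlaps the end of the V region and that the end of the CDR3 overlaps the start of the J region.

```python
def stitch_chain(v_region: str, cdr3: str, j_region: str) -> str:
    """Toy joiner: V up to the CDR3 overlap + CDR3 + J after the CDR3 overlap."""
    # Longest prefix of the CDR3 that also occurs in the V region
    # (the right-most occurrence is used as the junction point).
    for k in range(len(cdr3), 0, -1):
        v_idx = v_region.rfind(cdr3[:k])
        if v_idx != -1:
            break
    else:
        raise ValueError("CDR3 start does not overlap the V region")

    # Longest suffix of the CDR3 that also occurs in the J region.
    for m in range(len(cdr3), 0, -1):
        j_idx = j_region.find(cdr3[-m:])
        if j_idx != -1:
            break
    else:
        raise ValueError("CDR3 end does not overlap the J region")

    return v_region[:v_idx] + cdr3 + j_region[j_idx + m:]


# Illustrative inputs only (assumptions, not reference sequences).
toy_v = "MLLVTSAAYLCASSL"
toy_cdr3 = "CASSPGQGNTEAFF"
toy_j = "NTEAFFGQGTRLTVV"
stitched = stitch_chain(toy_v, toy_cdr3, toy_j)
print(stitched)
```

A real tool would additionally need leader and constant regions, nucleotide-level handling of the non-templated junction, and validated germline references, none of which this sketch attempts.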


Leukemia ◽  
2021 ◽  
Author(s):  
Elspeth A. Bruford ◽  
Cristina R. Antonescu ◽  
Andrew J. Carroll ◽  
Arul Chinnaiyan ◽  
Ian A. Cree ◽  
...  

Gene fusions have been discussed in the scientific literature since they were first detected in cancer cells in the early 1980s. There is currently no standardized way to denote the genes involved in fusions, but in the majority of publications the gene symbols in question are listed separated either by a hyphen (-) or by a forward slash (/). Both types of designation suffer from important shortcomings. HGNC has worked with the scientific community to determine a new, instantly recognizable and unique separator, a double colon (::), to be used in the description of fusion genes, and advocates its usage in all databases and articles describing gene fusions.
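A small sketch shows why a dedicated separator helps when fusion designations are handled programmatically. The helper names `format_fusion` and `parse_fusion` are hypothetical; the only thing taken from the abstract is the double colon itself. The hyphen example relies on the fact that some approved gene symbols (for instance NKX2-1) already contain a hyphen.

```python
from typing import Tuple

SEPARATOR = "::"  # separator advocated by HGNC for fusion genes


def format_fusion(first_gene: str, second_gene: str) -> str:
    """Join two gene symbols into a fusion designation."""
    return f"{first_gene}{SEPARATOR}{second_gene}"


def parse_fusion(designation: str) -> Tuple[str, str]:
    """Split a fusion designation back into its two gene symbols."""
    parts = designation.split(SEPARATOR)
    if len(parts) != 2 or not all(parts):
        raise ValueError(f"not a two-gene fusion designation: {designation!r}")
    return parts[0], parts[1]


fusion = format_fusion("BCR", "ABL1")
print(fusion, parse_fusion(fusion))

# A hyphen cannot be split unambiguously when a symbol itself contains one.
ambiguous = "NKX2-1".split("-")   # yields two fragments from a single symbol
unambiguous = parse_fusion(format_fusion("NKX2-1", "ABL1"))  # pairing is illustrative only
```

Because "::" does not occur inside gene symbols, splitting on it recovers both partners even when one of them is hyphenated.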


2021 ◽  
Vol 108 (10) ◽  
pp. 1813-1816
Author(s):  
Bryony Braschi ◽  
Ruth L. Seal ◽  
Susan Tweedie ◽  
Tamsin E.M. Jones ◽  
Elspeth A. Bruford

2021 ◽  
Author(s):  
Gourab Das ◽  
Pradeep Kumar

This study investigated prospective key genes and pathways associated with the pathogenesis and prognosis of stroke types and subtypes. Human genes with known gene symbols were fetched from the NCBI Gene database (https://www.ncbi.nlm.nih.gov/gene) using genome assembly build 38 patch release 13. PubMed advanced queries were constructed using stroke-related keywords, and associations between each gene symbol and each query were calculated using normalized pointwise mutual information (nPMI). Genes related to stroke risk within types and subtypes were investigated in order to discover genetic markers that predict individuals at risk of developing stroke and its subtypes. A total of 2,785 (9.4%) genes were found to be linked to the risk of stroke. By stroke type, 1,287 (46.2%) and 376 (13.5%) genes were related to ischemic stroke (IS) and hemorrhagic stroke (HS), respectively. On further stratification of IS by TOAST classification, 86 (6.6%) genes were confined to large artery atherosclerosis, while 131 (10.1%) and 130 (10%) genes were related to the risk of the small vessel disease and cardioembolism subtypes of IS, respectively. In addition, a prognostic panel of 9 genes consisting of CYP4A11, ALOX5P, NOTCH, NINJ2, FGB, MTHFR, PDE4D, HDAC9, and ZHFX3 can be treated as a diagnostic marker to predict individuals who are at risk of developing stroke and its subtypes.
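For readers unfamiliar with nPMI, the standard definition divides the pointwise mutual information of two events by the negative log of their joint probability, giving a score between -1 (never co-occur) and 1 (always co-occur), with 0 indicating independence. The sketch below applies that definition to document counts. The function name, the argument names, and the idea of counting documents that mention a gene symbol, match a query, or both are assumptions for illustration; the abstract does not describe the authors' implementation.

```python
import math


def npmi(n_both: int, n_gene: int, n_query: int, n_total: int) -> float:
    """Normalized PMI from counts.

    n_both  : documents mentioning the gene symbol AND matching the query
    n_gene  : documents mentioning the gene symbol
    n_query : documents matching the query
    n_total : all documents considered
    """
    if n_both == 0:
        return -1.0  # limiting value when the two never co-occur
    if n_both >= n_total:
        raise ValueError("nPMI is undefined when every document contains both")
    # PMI = ln( p(x, y) / (p(x) * p(y)) ), written in terms of counts
    pmi = math.log((n_both * n_total) / (n_gene * n_query))
    # Normalize by -ln p(x, y)
    return pmi / -math.log(n_both / n_total)


# Hypothetical counts, not values from the study.
print(round(npmi(n_both=10, n_gene=10, n_query=10, n_total=1000), 3))
print(round(npmi(n_both=10, n_gene=100, n_query=100, n_total=1000), 3))
```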


2021 ◽  
Author(s):  
Zhuo Zhen Chen ◽  
Wei-Cheih Wang ◽  
Lloyd Johnson ◽  
Jaimie Dufresne ◽  
Peter Bowden ◽  
...  

INTRODUCTION: There is an urgent need for a simple and sensitive method to elucidate the human plasma proteome to find markers of disease or therapeutic factors. The human plasma proteome may be obtained from tryptic peptides that result from native digestion, using commonly available, sensitive and robust analytical instruments such as linear quadrupole tandem mass spectrometers. METHODS: The human plasma proteome was elucidated from three independent human EDTA plasma populations analyzed by precipitation with acetonitrile (ACN) for quaternary amine (QA) micro-chromatography prior to native tryptic digestion for nano liquid chromatography, electrospray ionization and tandem mass spectrometry (LC-ESI-MS/MS). The LC-ESI-MS/MS results from authentic plasma and blank injection MS/MS noise controls were parsed into SQL Server along with the fit of the MS/MS spectra from the rigorous X!TANDEM algorithm for analysis with the R statistical system. A total of 13,408 gene symbols from tryptic (TRYP) and/or phospho/tryptic (STYP) peptides showed ≥ 10 peptides with an FDR q ≤ 0.01 from the fit of MS/MS spectra by X!TANDEM and were resolved from the null distribution of background noise with a chi-square value of χ2 ≥ 9 (p ≤ 0.005). RESULTS: Native digestion of human EDTA plasma permitted the identification and quantification of ~13,408 protein gene symbols in plasma that showed low FDR (q ≤ 0.01) from the fit of peptide MS/MS spectra and whose observation frequency was resolved from the null distribution of random MS/MS spectra of source noise from recordings of blank injections. There was good agreement between the orbital ion trap (OIT) and the sensitive linear ion trap (LIT), as well as between the tryptic and phospho/tryptic peptides. A distinct subset of human cellular proteins showed a variety of specific interaction domains that formed a highly interconnected network in the plasma.
DISCUSSION: The agreement between the fit of the peptide MS/MS spectra by the rigorous X!TANDEM algorithm versus random MS/MS spectra controls from blank noise injections demonstrated the reliability of the experimental approach. The highly interconnected network in the plasma confirmed that digestion of plasma under native conditions permitted the identification and quantification of the proteins in a population of human plasma samples. CONCLUSION: It was feasible to identify more than ten thousand proteins from human plasma with high confidence using a simple linear ion trap after precipitation, quaternary amine chromatography, native digestion and nano spray analysis with a linear quadrupole ion trap.


2021 ◽  
Vol 17 (8) ◽  
pp. e1009283
Author(s):  
Tomasz Konopka ◽  
Sandra Ng ◽  
Damian Smedley

Integrating reference datasets (e.g. from high-throughput experiments) with unstructured and manually-assembled information (e.g. notes or comments from individual researchers) has the potential to tailor bioinformatic analyses to specific needs and to lead to new insights. However, developing bespoke analysis pipelines from scratch is time-consuming, and general tools for exploring such heterogeneous data are not available. We argue that by treating all data as text, a knowledge-base can accommodate a range of bioinformatic data types and applications. We show that a database coupled to nearest-neighbor algorithms can address common tasks such as gene-set analysis as well as specific tasks such as ontology translation. We further show that a mathematical transformation motivated by diffusion can be effective for exploration across heterogeneous datasets. Diffusion enables the knowledge-base to begin with a sparse query, impute more features, and find matches that would otherwise remain hidden. This can be used, for example, to map multi-modal queries consisting of gene symbols and phenotypes to descriptions of diseases. Diffusion also enables user-driven learning: when the knowledge-base cannot provide satisfactory search results in the first instance, users can improve the results in real-time by adding domain-specific knowledge. User-driven learning has implications for data management, integration, and curation.
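The "treat all data as text, then search by nearest neighbors" idea can be sketched in a few lines. This is a generic bag-of-words cosine search, not the authors' tool: the tokenization, the function names, and the toy records (with placeholder gene symbols GENE1 to GENE3) are all assumptions, and the diffusion step described above is not implemented here.

```python
import math
from collections import Counter
from typing import Dict, List


def featurize(text: str) -> Counter:
    """Turn free text into token counts (lower-cased, whitespace-split)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def nearest(query: str, records: Dict[str, str], k: int = 1) -> List[str]:
    """Return the ids of the k records most similar to the query."""
    q = featurize(query)
    ranked = sorted(records, key=lambda rid: cosine(q, featurize(records[rid])),
                    reverse=True)
    return ranked[:k]


# Hypothetical knowledge-base entries mixing gene symbols and phenotypes.
records = {
    "disease_A": "GENE1 GENE2 seizure ataxia",
    "disease_B": "GENE3 cardiomyopathy arrhythmia",
}
print(nearest("GENE3 arrhythmia", records))
```

A sparse query such as a single phenotype term only matches records that share that exact token, which is the limitation the diffusion transformation in the abstract is meant to address.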


2021 ◽  
Vol 2021 ◽  
pp. 1-9
Author(s):  
Hao Zhu ◽  
Yuhuan Shi ◽  
Shanshan Jiang ◽  
Xiuxiu Jiao ◽  
Hui Zhu ◽  
...  

Background. Chuankezhi injection (CKZI) is an effective traditional Chinese medicine (TCM) injection used in adjuvant bronchial asthma therapy. In this report, we used a network pharmacology method to reveal the mechanisms of CKZI in the treatment of asthma. Methods. The candidate compounds in CKZI were determined by searching the Traditional Chinese Medicine Systems Pharmacology Database and Analysis Platform (TCMSP) and the China National Knowledge Infrastructure website (CNKI). The targets of candidate compounds were searched in TCMSP, DrugBank 5.0, and SwissTargetPrediction. The disease targets were screened from Online Mendelian Inheritance in Man (OMIM) and GeneCards. The overlapping gene symbols between candidate compounds and disease were filtered via a Venn diagram and were considered potential targets. A protein-protein interaction (PPI) network and a disease-related candidate compound-target-pathway (DC-T-P) network were visualized with Cytoscape 3.6.1. Gene Ontology (GO) function and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses were performed with Metascape to determine the pathways related to asthma. Results. A total of 70 overlapping gene symbols were recognized as potential targets. Cytokines (IL6, TNF, and IL1B) and chemokines (CXCL8 and CCL2) could be recognized as hub genes. Asthma-related candidate compounds were mainly flavonoids, such as quercetin, luteolin, and kaempferol. The cytokine-mediated signaling pathway, cytokine receptor binding, and membrane raft were the most significant biological process (BP), molecular function (MF), and cellular component (CC) of the GO function results, respectively. The relevant pathways of CKZI against asthma mainly include the IL-17, NF-kappa B, HIF-1, calcium, and PI3K-Akt signaling pathways. Conclusion. Our research provides a theoretical basis for further investigating the mechanisms of CKZI in the treatment of asthma.
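The Venn-diagram filtering step amounts to a set intersection of two gene-symbol lists. The sketch below is a minimal illustration; the function name and both input lists are hypothetical (they reuse a few symbols named in the abstract plus placeholders) and do not reproduce the study's data. Upper-casing before comparison is an assumption made to tolerate inconsistent capitalization across databases.

```python
from typing import Iterable, Set


def overlapping_symbols(compound_targets: Iterable[str],
                        disease_targets: Iterable[str]) -> Set[str]:
    """Return gene symbols present in both lists, ignoring case and padding."""
    def normalize(symbol: str) -> str:
        return symbol.strip().upper()

    return ({normalize(s) for s in compound_targets}
            & {normalize(s) for s in disease_targets})


# Illustrative lists only; PLACEHOLDER1/2 are not real gene symbols.
compound_targets = ["IL6", "tnf ", "PLACEHOLDER1"]
disease_targets = ["TNF", "IL6", "CCL2", "PLACEHOLDER2"]
potential_targets = overlapping_symbols(compound_targets, disease_targets)
print(sorted(potential_targets))
```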


2021 ◽  
Author(s):  
Jaimie Dufresne ◽  
Angelique Florentinus-Mefailoski ◽  
Juliet Ajambo ◽  
Ammara Ferwa ◽  
Peter Bowden ◽  
...  

The tryptic peptides from ice cold versus room temperature plasma were identified by C18 liquid chromatography and micro electrospray ionization tandem mass spectrometry (LC–ESI–MS/MS). Samples collected on ice showed low levels of endogenous tryptic peptides compared to the same samples incubated at room temperature. Plasma on ice contained peptides from albumin, complement, and apolipoproteins and others that were observed by the X!TANDEM and SEQUEST algorithms. In contrast to ice cold samples, after incubation at room temperature, greater numbers of tryptic peptides from well characterized plasma proteins, and from cellular proteins were observed. A total of 583,927 precursor ions and MS/MS spectra were correlated to 94,669 best fit peptides that reduced to 22,287 correlations to the best accession within a gene symbol and to 7174 correlations to at least 510 gene symbols with ≥ 5 independent MS/MS correlations (peptide counts) that showed FDR q-values ranging from E−9 (i.e. FDR = 0.000000001) to E−227. A set of 528 gene symbols identified by X!TANDEM and SEQUEST including C4B showed ≥ fivefold variation between ice cold versus room temperature incubation. STRING analysis of the protein gene symbols observed from endogenous peptides in normal plasma revealed an extensive protein-interaction network of cellular factors associated with cell signalling and regulation, the formation of membrane bound organelles, cellular exosomes and exocytosis network proteins. Taken together the results indicated that a pool of cellular proteins, or protein complexes, in plasma are apparently not stable and degrade soon after incubation at room temperature.


2021 ◽  
Author(s):  
Jaimie Dufresne ◽  
Pete Bowden ◽  
Thanusi Thavarajah ◽  
Angelique Florentinus-Mefailoski ◽  
Zhuo Zhen Chen ◽  
...  

Background: It may be possible to discover new diagnostic or therapeutic peptides or proteins from blood plasma using LC–ESI–MS/MS to identify, quantify and compare the statistical distributions of peptides cleaved ex vivo from plasma samples from different clinical populations. Methods: A systematic method for the organic fractionation of plasma peptides was applied to identify and quantify the endogenous tryptic peptides from human plasma from multiple institutions by C18 HPLC followed by nano electrospray ionization and tandem mass spectrometry (LC–ESI–MS/MS) with a linear quadrupole ion trap. The endogenous tryptic peptides, or tryptic phospho peptides (i.e. without exogenous digestion), were extracted in a mixture of organic solvent and water, dried and collected by preparative C18. The tryptic peptides from 6 institutions with 12 different disease and normal EDTA plasma populations, alongside ice cold controls for pre-analytical variation, were characterized by mass spectrometry. Each patient plasma was precipitated in 90% acetonitrile and the endogenous tryptic peptides were extracted by a stepwise gradient of increasing water and then formic acid, resulting in 10 sub-fractions. The fractionated peptides were manually collected over preparative C18 and injected for 1508 LC–ESI–MS/MS experiments analyzed in SQL Server and R. Results: Peptides that were cleaved in human plasma by a tryptic activity ex vivo provided convenient and sensitive access to most human proteins in plasma that show differences in the frequency or intensity of proteins observed across populations that may have clinical significance. Combination of stepwise organic extraction of 200 μL of plasma with nano electrospray resulted in the confident identification and quantification of ~14,000 gene symbols by X!TANDEM, which is the largest number of blood proteins identified to date and shows that the ex vivo proteolysis of most human proteins, including interleukins, can be monitored from blood.
A total of 15,968,550 MS/MS spectra ≥ E4 intensity counts were correlated by the SEQUEST and X!TANDEM algorithms to a federated library of 157,478 protein sequences that were filtered for best charge state (2+ or 3+) and peptide sequence in SQL Server, resulting in 1,916,672 distinct best-fit peptide correlations for analysis with the R statistical system. SEQUEST identified some 140,054 protein accessions, or ~26,000 gene symbols, proteins or loci, with at least 5 independent correlations. The X!TANDEM algorithm made at least 5 best-fit correlations to more than 14,000 protein gene symbols with p-values and FDR-corrected q-values of ~0.001 or less. Log10 peptide intensity values showed a Gaussian distribution from E8 to E4 arbitrary counts by quantile plot, and significant variation in average precursor intensity across the disease and control treatments by ANOVA with means compared by the Tukey–Kramer test. STRING analysis of the top 2000 gene symbols showed a tight association of cellular proteins that were apparently present in the plasma as protein complexes with related cellular components, molecular functions and biological processes. Conclusions: The random and independent sampling of pre-fractionated blood peptides by LC–ESI–MS/MS with SQL Server and R analysis revealed the largest plasma proteome to date and was a practical method to quantify and compare the frequency or log10 intensity of individual proteins cleaved ex vivo across populations of plasma samples from multiple clinical locations to discover treatment-specific variation using classical statistics suitable for clinical science. It was possible to identify and quantify nearly all human proteins from EDTA plasma and compare the results of thousands of LC–ESI–MS/MS experiments from multiple clinical populations using standard database methods in SQL Server and classical statistical strategies in the R data analysis system.
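The "at least 5 independent correlations" criterion is, in essence, a count-and-threshold filter over peptide-to-gene-symbol assignments. The study performed this work in SQL Server and R; the Python sketch below only illustrates the counting logic. The function name, the row layout, and the placeholder symbols and peptides are assumptions, and each row is assumed to already represent one independent best-fit correlation.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def symbols_with_min_correlations(rows: Iterable[Tuple[str, str]],
                                  min_count: int = 5) -> Dict[str, int]:
    """Keep gene symbols supported by at least `min_count` peptide correlations.

    rows: (gene_symbol, peptide_sequence) pairs, one per best-fit correlation.
    """
    counts = Counter(symbol for symbol, _peptide in rows)
    return {symbol: n for symbol, n in counts.items() if n >= min_count}


# Placeholder data: GENE_A has six correlations, GENE_B only two.
rows = [("GENE_A", f"PEPTIDE{i}") for i in range(6)]
rows += [("GENE_B", "PEPTIDEX"), ("GENE_B", "PEPTIDEY")]
kept = symbols_with_min_correlations(rows)
print(kept)
```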

