Discovering and Summarizing Relationships Between Chemicals, Genes, Proteins, and Diseases in PubChem

Author(s):  
Leonid Zaslavsky ◽  
Tiejun Cheng ◽  
Asta Gindulyte ◽  
Siqian He ◽  
Sunghwan Kim ◽  
...  

The literature knowledge panels developed and implemented in PubChem are described. These help to uncover and summarize important relationships between chemicals, genes, proteins, and diseases by analyzing co-occurrences of terms in biomedical literature abstracts. Named entities in PubMed records are matched with chemical names in PubChem, disease names in Medical Subject Headings (MeSH), and gene/protein names in popular gene/protein information resources, and the most closely related entities are identified using statistical analysis and relevance-based sampling. Knowledge panels for the co-occurrence of chemical, disease, and gene/protein entities are included in PubChem Compound, Protein, and Gene pages, summarizing these in a compact form. Statistical methods for removing redundancy and estimating relevance scores are discussed, along with benefits and pitfalls of relying on automated (i.e., not human-curated) methods operating on data from multiple heterogeneous sources.
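As a rough illustration of co-occurrence analysis, the sketch below counts how often pairs of entities appear in the same abstract and ranks them with pointwise mutual information (PMI). PMI is used here only as a stand-in: the abstract does not say which statistic PubChem applies, and the function name `cooccurrence_scores` and its input format (one set of already-matched entity names per record) are assumptions made for this example.

```python
import math
from collections import Counter
from itertools import combinations


def cooccurrence_scores(records):
    """Count entity pairs per record and score each pair with PMI.

    `records` is an iterable of collections of entity names, one per
    abstract (hypothetical input; entity matching is assumed done).
    Returns (pair_counts, pmi_scores), both keyed by sorted name pairs.
    """
    n_docs = 0
    single = Counter()  # number of records mentioning each entity
    pair = Counter()    # number of records mentioning both entities
    for entities in records:
        n_docs += 1
        names = sorted(set(entities))
        single.update(names)
        pair.update(combinations(names, 2))

    scores = {}
    for (a, b), n_ab in pair.items():
        # PMI = log( P(a, b) / (P(a) * P(b)) ), probabilities estimated
        # as record frequencies.
        scores[(a, b)] = math.log(n_ab * n_docs / (single[a] * single[b]))
    return pair, scores
```

A positive score means the pair co-occurs more often than independent mentions would predict. PMI is known to favor rare entities, which is one reason a production system would likely need additional filtering.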

2015 ◽  
Vol 23 (3) ◽  
pp. 617-626 ◽  
Author(s):  
Nophar Geifman ◽  
Sanchita Bhattacharya ◽  
Atul J Butte

Abstract Objective: Cytokines play a central role in both health and disease, modulating immune responses and acting as diagnostic markers and therapeutic targets. This work takes a systems-level approach for integration and examination of immune patterns, such as cytokine gene expression with information from biomedical literature, and applies it in the context of disease, with the objective of identifying potentially useful relationships and areas for future research. Results: We present herein the integration and analysis of immune-related knowledge, namely, information derived from biomedical literature and gene expression arrays. Cytokine-disease associations were captured from over 2.4 million PubMed records, in the form of Medical Subject Headings descriptor co-occurrences, as well as from gene expression arrays. Clustering of cytokine-disease co-occurrences from biomedical literature is shown to reflect current medical knowledge as well as potentially novel relationships between diseases. A correlation analysis of cytokine gene expression in a variety of diseases revealed compelling relationships. Finally, a novel analysis comparing cytokine gene expression in different diseases to parallel associations captured from the biomedical literature was used to examine which associations are interesting for further investigation. Discussion: We demonstrate the usefulness of capturing Medical Subject Headings descriptor co-occurrences from biomedical publications in the generation of valid and potentially useful hypotheses. Furthermore, integrating and comparing descriptor co-occurrences with gene expression data was shown to be useful in detecting new, potentially fruitful, and unaddressed areas of research. Conclusion: Using integrated large-scale data captured from the scientific literature and experimental data, a better understanding of the immune mechanisms underlying disease can be achieved and applied to research.


Children ◽  
2021 ◽  
Vol 8 (2) ◽  
pp. 143
Author(s):  
Julie Sommet ◽  
Enora Le Roux ◽  
Bérengère Koehl ◽  
Zinedine Haouari ◽  
Damir Mohamed ◽  
...  

Background: Many pediatric studies describe the association between biological parameters (BP) and severity of sickle cell disease (SCD) using different methods to collect or to analyze BP. This article assesses the methods used for collection and subsequent statistical analysis of BP, and how these impact prognostic results in cohort studies of children with SCD. Methods: Firstly, we identified the collection and statistical methods used in published SCD cohort studies. Secondly, these methods were applied to our cohort of 375 SCD children, to evaluate the association of BP with cerebral vasculopathy (CV). Results: In 16 cohort studies, BP were collected either once or several times during follow-up. The identified methods in the statistical analysis were: (1) one baseline value per patient; (2) last known value; (3) mean of all values; (4) modelling of all values in a two-stage approach. When these four statistical methods were applied to our cohort, the results and interpretation of the association between BP and CV differed depending on the method used. Conclusion: The BP prognostic value depends on the chosen statistical analysis method. Appropriate statistical analyses of prognostic factors in cohort studies should be considered and should enable valuable and reproducible conclusions.
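The first three summary methods listed in the abstract (baseline value, last known value, mean of all values) can be sketched as below. This is a minimal illustration, not the authors' analysis: the function name `summarize_measurements` and the `(patient_id, time, value)` input format are assumptions, and the fourth method (two-stage modelling of all values) is not shown because the abstract does not describe it in enough detail.

```python
from collections import defaultdict
from statistics import mean


def summarize_measurements(measurements):
    """Reduce repeated measurements to one value per patient, three ways.

    `measurements` is an iterable of (patient_id, time, value) tuples
    (hypothetical format). Returns {patient_id: {"baseline": ...,
    "last": ..., "mean": ...}}.
    """
    by_patient = defaultdict(list)
    for patient_id, time, value in measurements:
        by_patient[patient_id].append((time, value))

    summary = {}
    for patient_id, rows in by_patient.items():
        rows.sort(key=lambda row: row[0])  # order by measurement time
        values = [value for _, value in rows]
        summary[patient_id] = {
            "baseline": values[0],   # method 1: earliest value
            "last": values[-1],      # method 2: last known value
            "mean": mean(values),    # method 3: mean of all values
        }
    return summary
```

For a patient whose parameter drifts over follow-up, the three summaries diverge, which is consistent with the abstract's point that the estimated association depends on the method chosen.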


Database ◽  
2019 ◽  
Vol 2019 ◽  
Author(s):  
Peter Brown ◽  
Aik-Choon Tan ◽  
Mohamed A El-Esawi ◽  
Thomas Liehr ◽  
Oliver Blanck ◽  
...  

Abstract Document recommendation systems for locating relevant literature have mostly relied on methods developed a decade ago. This is largely due to the lack of a large offline gold-standard benchmark of relevant documents that cover a variety of research fields such that newly developed literature search techniques can be compared, improved and translated into practice. To overcome this bottleneck, we have established the RElevant LIterature SearcH consortium consisting of more than 1500 scientists from 84 countries, who have collectively annotated the relevance of over 180 000 PubMed-listed articles with regard to their respective seed (input) article/s. The majority of annotations were contributed by highly experienced, original authors of the seed articles. The collected data cover 76% of all unique PubMed Medical Subject Headings descriptors. No systematic biases were observed across different experience levels, research fields or time spent on annotations. More importantly, annotations of the same document pairs contributed by different scientists were highly concordant. We further show that the three representative baseline methods used to generate recommended articles for evaluation (Okapi Best Matching 25, Term Frequency–Inverse Document Frequency and PubMed Related Articles) had similar overall performances. Additionally, we found that these methods each tend to produce distinct collections of recommended articles, suggesting that a hybrid method may be required to completely capture all relevant articles. The established database server located at https://relishdb.ict.griffith.edu.au is freely available for the downloading of annotation data and the blind testing of new methods. We expect that this benchmark will be useful for stimulating the development of new powerful techniques for title and title/abstract-based search engines for relevant articles in biomedical research.
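For readers unfamiliar with Okapi BM25, one of the baselines named above, the following is a minimal sketch of one common variant of the scoring formula. It is not the consortium's implementation: the function name `bm25_scores`, the pre-tokenized input format, the smoothed IDF form, and the parameter values `k1=1.5` and `b=0.75` are assumptions chosen for illustration.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized document against a tokenized query with BM25.

    `query` is a list of terms and `docs` is a non-empty list of term
    lists (hypothetical format). k1 and b are conventional defaults,
    not values taken from the source.
    """
    n_docs = len(docs)
    avg_len = sum(len(doc) for doc in docs) / n_docs

    # Document frequency: number of documents containing each term.
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(doc))

    scores = []
    for doc in docs:
        term_freq = Counter(doc)
        score = 0.0
        for term in set(query):
            tf = term_freq.get(term, 0)
            if tf == 0:
                continue
            # Smoothed IDF, kept non-negative by the +1 inside the log.
            idf = math.log(
                (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1.0
            )
            # Term-frequency saturation with document-length normalization.
            denom = tf + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf * (k1 + 1) / denom
        scores.append(score)
    return scores
```

With equal term frequency, a shorter document scores higher than a longer one because of the length normalization controlled by `b`.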


2018 ◽  
Vol 34 (3) ◽  
pp. 633-645
Author(s):  
Cornel Samoilă ◽  
Doru Ursuţiu ◽  
Vlad Jinga

Abstract The appearance of MOOCs has, in its first phase, produced more discussion than contributions. Despite pessimistic opinions, including catastrophic ones foreseeing the end of classical education if MOOCs are accepted, the authors consider that, as whenever a field is reformed, an assessment should simply be made instead of criticism or catastrophic predictions. A MOOC will not become better or worse by being discussed and dissected, but it can be tested in action, perfected on the basis of results, or abandoned if it has no prospects. Without testing, no decision is valid. A parallel can be drawn between the appearance of MOOCs and the appearance of the idea of heavier-than-air flying machines. In the case of flight, the first reaction was strong denial (including at the level of the Academies), and only the first independent flight of a heavier-than-air apparatus shifted attitudes from denial to contributions. Practical tests thus settled the battle of ideas. The authors of this article encourage the idea of testing and assessment and have therefore devised and proposed software for quickly assessing whether a MOOC produces changes in knowledge when courses are simply transferred from the ‘face-to-face’ environment to the virtual one. Among the methods of statistical analysis for changes in student behavior, the Keppel method was chosen. It underpins the assessment method of this work and is applied in both its one-variable and three-variable versions. This attempt is intended to pave the way for further rapid assessments of MOOC effects (using other statistical methods). We believe that this is the only approach that can lead either to improvement of the system or to its abandonment.


Radiocarbon ◽  
2013 ◽  
Vol 55 (2) ◽  
pp. 720-730 ◽  
Author(s):  
Christopher Bronk Ramsey ◽  
Sharen Lee

OxCal is a widely used software package for the calibration of radiocarbon dates and the statistical analysis of 14C and other chronological information. The program aims to make statistical methods easily available to researchers and students working in a range of different disciplines. This paper will look at the recent and planned developments of the package. The recent additions to the statistical methods are primarily aimed at providing more robust models, in particular through model averaging for deposition models and through different multiphase models. The paper will look at how these new models have been implemented and explore the implications for researchers who might benefit from their use. In addition, a new approach to the evaluation of marine reservoir offsets will be presented. As the quantity and complexity of chronological data increase, it is also important to have efficient methods for the visualization of such extensive data sets, and methods for the presentation of spatial and geographical data embedded within planned future versions of OxCal will also be discussed.


2012 ◽  
Vol 610-613 ◽  
pp. 1033-1040
Author(s):  
Wei Dai ◽  
Jia Qi Gao ◽  
Bo Wang ◽  
Feng Ouyang

Effects of weather conditions, including temperature, relative humidity, wind speed, and wind direction, on PM2.5 were studied using statistical methods. PM2.5 samples were collected during the summer and the winter in a suburb of Shenzhen. Correlations, hypothesis tests, and the statistical distribution of PM2.5 and meteorological data were then analyzed with IBM SPSS predictive analytics software. Seasonal and daily variations of PM2.5 were found, and these resulted mainly from weather effects.


Author(s):  
Mariana Vanon Moreira ◽  
Aline Batista Brighenti dos Santos ◽  
Júlia Abrahão Lopes ◽  
Cecília Barra de Oliveira Hespanhol

Introduction: Venous thromboembolism (VTE) is one of the leading causes of maternal and fetal morbidity and mortality. During pregnancy, predisposition to this condition rises because of the hypercoagulable state of the mother's blood, since in this period the pregnant woman presents risk factors for all three components of Virchow's triad. Thus there is: a) venous stasis, due to decreased venous tone and obstruction of venous flow by the enlarging uterus; b) hypercoagulability, with increased fibrin generation, decreased fibrinolytic activity, and increased levels of coagulation factors II, VII, VIII, and X, in addition to a progressive fall in protein S levels and acquired resistance to activated protein C; and c) endothelial injury resulting from implantation and from the vascular remodeling of the uterine spiral arteries at delivery and at expulsion of the placenta. Although these phenomena are necessary to control hemorrhage during and after delivery, obstruction of placental vessels may occur, leading to recurrent miscarriage, placental abruption, and maternal arterial hypertension. Objective: To analyze the gestational thromboprophylaxis recommended for women at risk of VTE. Material and Method: In March 2021, a systematic review was carried out in the Medical Literature Analysis and Retrieval System Online (MEDLINE) database, using the descriptors “hematologic pregnancy complications”, “treatment”, “venous thromboembolism”, and their variations, obtained from Medical Subject Headings (MeSH). Randomized controlled clinical trials (RCTs) published in the last five years and in English were included. Results: Twenty-four articles were found, six of which were used to prepare this work. Two selected RCTs involved 4,258 pregnant women predisposed to VTE, who were divided into a control group and an experimental group. Both studies used prophylactic treatment before and after pregnancy and reported the need to submit every pregnant woman to a VTE risk assessment. If the assessment is positive, low-molecular-weight heparin (LMWH) is recommended as the main prophylactic agent of choice. Regarding post-gestational therapy, the use of LMWH and compression stockings for seven days after the birth of the child is proposed, except in women who were already using anticoagulants before pregnancy, who need to take the medication for six weeks. Conclusion: The anatomical changes that occur in a woman's body make pregnant women susceptible to the risk of a thrombotic event during pregnancy. Adequate prophylactic treatment is therefore essential to avoid the obstetric mortality that arises in the absence of adequate management. In this regard, the use of LMWH and compression stockings stands out.

