Large-scale literature mining to assess the relation between anti-cancer drugs and cancer types.

Author(s):  
Chris Bauer ◽  
Ralf Herwig ◽  
Matthias Lienhard ◽  
Paul Prasse ◽  
Tobias Scheffer ◽  
...  

Abstract Background There is a huge body of scientific literature describing the relation between tumor types and anti-cancer drugs. The vast amount of scientific literature makes it impossible for researchers and physicians to extract all relevant information manually. Methods In order to cope with the large amount of literature we apply an automated text mining approach to assess the relations between the 30 most frequent cancer types and 270 anti-cancer drugs. We apply two different approaches, a classical text mining approach based on named entity recognition and an AI-based approach employing word embeddings. The consistency of literature mining results is validated with 3 independent methods: first, using data from FDA approval, second, using experimentally measured IC-50 cell line data and third, using clinical patient survival data. Results We demonstrate that the automated text mining is able to successfully assess the relation between cancer types and anti-cancer drugs. All validation methods show a good correspondence between the results from literature mining and independent confirmatory approaches. The relations between the most frequent cancer types and the drugs employed for their treatment are visualized in a large heatmap. All results are made accessible in an interactive web-based knowledge base at the following link: https://knowledgebase.microdiscovery.de/heatmap. Conclusions Our approach is well suited to assess the relations between compounds and cancer types in an automated manner. Both cancer types and compounds can be grouped into different clusters. Researchers can use the interactive knowledge base to inspect the presented results and follow their own research questions.

2021 ◽  
Vol 19 (1) ◽  
Author(s):  
Chris Bauer ◽  
Ralf Herwig ◽  
Matthias Lienhard ◽  
Paul Prasse ◽  
Tobias Scheffer ◽  
...  

Abstract Background There is a huge body of scientific literature describing the relation between tumor types and anti-cancer drugs. The vast amount of scientific literature makes it impossible for researchers and physicians to extract all relevant information manually. Methods In order to cope with the large amount of literature we applied an automated text mining approach to assess the relations between the 30 most frequent cancer types and 270 anti-cancer drugs. We applied two different approaches, a classical text mining approach based on named entity recognition and an AI-based approach employing word embeddings. The consistency of literature mining results was validated with 3 independent methods: first, using data from FDA approvals, second, using experimentally measured IC-50 cell line data and third, using clinical patient survival data. Results We demonstrated that the automated text mining was able to successfully assess the relation between cancer types and anti-cancer drugs. All validation methods showed a good correspondence between the results from literature mining and independent confirmatory approaches. The relations between the most frequent cancer types and the drugs employed for their treatment were visualized in a large heatmap. All results are accessible in an interactive web-based knowledge base at the following link: https://knowledgebase.microdiscovery.de/heatmap. Conclusions Our approach is able to assess the relations between compounds and cancer types in an automated manner. Both cancer types and compounds could be grouped into different clusters. Researchers can use the interactive knowledge base to inspect the presented results and follow their own research questions, for example the identification of novel indication areas for known drugs.


2021 ◽  
Vol 12 ◽  
Author(s):  
Xiaojie Tan ◽  
Jiahui Fu ◽  
Zhaoxin Yuan ◽  
Lingjuan Zhu ◽  
Leilei Fu

Objectives: Cancer is well known as a collection of diseases characterized by uncontrolled proliferation of cells, caused by mutated genes that are generated by external or internal factors. As the mechanisms of cancer, including cell cycle, proliferation, apoptosis and so on, have been progressively revealed, a series of new anti-cancer drugs acting on each stage have also been developed. It is worth noting that natural products are one of the important sources for the development of anti-cancer drugs. To the best of our knowledge, there is no database summarizing the relationships between natural products, compounds, molecular mechanisms, and cancer types. Materials and methods: Based upon published literature and other sources, we have constructed an anti-cancer natural product database (ACNPD) (http://www.acnpd-fu.com/). The database currently contains 521 compounds, which specifically refer to natural compounds derived from traditional Chinese medicine plants (derivatives are not considered herein). It also includes 1,593 molecular mechanisms/signaling pathways, covering 10 common cancer types, such as breast cancer, lung cancer and cervical cancer. Results: Integrating existing data sources, we have obtained a large amount of information on natural anti-cancer products, including herbal sources, regulatory targets and signaling pathways. ACNPD is a valuable online resource that illustrates the complex pharmacological relationship between natural products and human cancers. Conclusion: In summary, ACNPD is crucial for a better understanding of the relationships between traditional Chinese medicine (TCM) and cancer, which is not only conducive to expanding the influence of TCM, but also helps to find more new anti-cancer drugs in the future.


2017 ◽  
Author(s):  
David Westergaard ◽  
Hans-Henrik Stærfeldt ◽  
Christian Tønsberg ◽  
Lars Juhl Jensen ◽  
Søren Brunak

Abstract Across academia and industry, text mining has become a popular strategy for keeping up with the rapid growth of the scientific literature. Text mining of the scientific literature has mostly been carried out on collections of abstracts, due to their availability. Here we present an analysis of 15 million English scientific full-text articles published during the period 1823–2016. We describe the development in article length and publication sub-topics during these nearly 250 years. We showcase the potential of text mining by extracting published protein–protein, disease–gene, and protein subcellular associations using a named entity recognition system, and quantitatively report on their accuracy using gold standard benchmark data sets. We subsequently compare the findings to corresponding results obtained on 16.5 million abstracts included in MEDLINE and show that text mining of full-text articles consistently outperforms using abstracts only.


2016 ◽  
Vol 19 (4) ◽  
pp. 475
Author(s):  
Anatoly Mayburd ◽  
Ancha Baranova

Purpose: Novel, “outside of the box” approaches are needed for evaluating candidate molecules, especially in oncology. Throughout the years of 2000-2010, the efficiency of drug development fell to barely acceptable levels, and in the second decade of this century, levels have improved only marginally. This dismal condition continues despite unprecedented progress in the development of a variety of high-throughput tools, computational methods, aggregated databases, drug repurposing programs and innovative chemistries. Here we tested a hypothesis that the economic impact of targeting a particular gene product is predictable a priori by employing a combination of transcriptome profiles and quantitative metrics reflecting existing literature. Methods: To extract classification features, the gene expression patterns of a posteriori high-impact and low-impact anti-cancer target sets were compared. To minimize the possible bias of text-mining, the number of manuscripts published prior to the first clinical trial or relevant review paper, as well as its first derivative in this interval, were collected and used as quantitative metrics of public interest. Results: By combining the gene expression and literature mining features, a 4-fold enrichment in high-impact targets was produced, resulting in a favourable ROC curve analysis for the top impact targets. The dataset was enriched by the highest impact anti-cancer targets, while demonstrating drastic differences in economic value between high and low-impact targets. Known anti-cancer products of EGFR, ERBB2, CYP19A1/aromatase, MTOR, PTGS2, tubulin, VEGFA, BRAF, PGR, PDGFRA, SRC, REN, CSF1R, CTLA4 and HSP90AA1 genes received the highest scores for predicted impact, while microsomal steroid sulfatase, anticoagulant protein C, p53, CDKN2A, c-Jun, and TNSFS11 were highlighted as most promising research-stage targets. 
Conclusions: A significant cost reduction may be achieved by a priori impact assessment of targets and ligands before their development or repurposing. Expanding a suite of combinational treatments could also decrease the costs, while achieving a higher impact per developed ligand.


2015 ◽  
Vol 1 ◽  
pp. e37 ◽  
Author(s):  
Bahar Sateli ◽  
René Witte

Motivation. Finding relevant scientific literature is one of the essential tasks researchers are facing on a daily basis. Digital libraries and web information retrieval techniques provide rapid access to a vast amount of scientific literature. However, no further automated support is available that would enable fine-grained access to the knowledge ‘stored’ in these documents. The emerging domain of Semantic Publishing aims at making scientific knowledge accessible to both humans and machines, by adding semantic annotations to content, such as a publication’s contributions, methods, or application domains. However, despite the promises of better knowledge access, the manual annotation of existing research literature is prohibitively expensive for wide-spread adoption. We argue that a novel combination of three distinct methods can significantly advance this vision in a fully-automated way: (i) Natural Language Processing (NLP) for Rhetorical Entity (RE) detection; (ii) Named Entity (NE) recognition based on the Linked Open Data (LOD) cloud; and (iii) automatic knowledge base construction for both NEs and REs using semantic web ontologies that interconnect entities in documents with the machine-readable LOD cloud. Results. We present a complete workflow to transform scientific literature into a semantic knowledge base, based on the W3C standards RDF and RDFS. A text mining pipeline, implemented based on the GATE framework, automatically extracts rhetorical entities of type Claims and Contributions from full-text scientific literature. These REs are further enriched with named entities, represented as URIs to the linked open data cloud, by integrating the DBpedia Spotlight tool into our workflow. Text mining results are stored in a knowledge base through a flexible export process that provides for a dynamic mapping of semantic annotations to LOD vocabularies through rules stored in the knowledge base.
We created a gold standard corpus from computer science conference proceedings and journal articles, where Claim and Contribution sentences are manually annotated with their respective types using LOD URIs. The performance of the RE detection phase is evaluated against this corpus, where it achieves an average F-measure of 0.73. We further demonstrate a number of semantic queries that show how the generated knowledge base can provide support for numerous use cases in managing scientific literature. Availability. All software presented in this paper is available under open source licenses at http://www.semanticsoftware.info/semantic-scientific-literature-peerj-2015-supplements. Development releases of individual components are additionally available on our GitHub page at https://github.com/SemanticSoftwareLab.


1993 ◽  
Vol 55 (1) ◽  
pp. 43-46
Author(s):  
Jun YOSHIDA ◽  
Juichiro NAKAYAMA ◽  
Nobuyuki SHIMIZU ◽  
Shonosuke NAGAE ◽  
Yoshiaki HORI

2019 ◽  
Vol 26 (17) ◽  
pp. 3009-3025 ◽  
Author(s):  
Bin Li ◽  
Ho Lam Chan ◽  
Pingping Chen

Cancer is one of the deadliest diseases in the modern world. The last decade has witnessed dramatic advances in cancer treatment through immunotherapy. One extremely promising means to achieve anti-cancer immunity is to block the immune checkpoint pathways – mechanisms adopted by cancer cells to disguise themselves as regular components of the human body. Many review articles have described a variety of agents that are currently under extensive clinical evaluation. However, while checkpoint blockade is universally effective against a broad spectrum of cancer types and is mostly unrestricted by the mutation status of certain genes, only a minority of patients achieve a complete response. In this review, we summarize the basic principles of immune checkpoint inhibitors in both antibody and small-molecule forms and also discuss potential mechanisms of resistance, which may shed light on further investigation to achieve higher clinical efficacy for these inhibitors.


2019 ◽  
Vol 24 (32) ◽  
pp. 3829-3841 ◽  
Author(s):  
Lakshmanan Loganathan ◽  
Karthikeyan Muthusamy

Worldwide, colorectal cancer ranks third among the most commonly detected cancers and fourth in cancer mortality. Recent progress in molecular modeling studies has led to significant success in drug discovery using structure- and ligand-based methods. This study highlights aspects of anticancer drug design. Structure- and ligand-based drug design are discussed to investigate the molecular and quantum mechanics of anti-cancer drugs. Recent advances in anticancer agent identification driven by structural and molecular insights are presented, and the current state of cancer drug design is discussed. This review provides information on how cancer drugs have been formulated and identified using computational power by the drug discovery community.

