chemical ontologies
Recently Published Documents


TOTAL DOCUMENTS

15
(FIVE YEARS 8)

H-INDEX

2
(FIVE YEARS 1)

2021 ◽  
Vol 13 (1) ◽  
Author(s):  
Janna Hastings ◽  
Martin Glauer ◽  
Adel Memariani ◽  
Fabian Neuhaus ◽  
Till Mossakowski

Abstract Chemical data is increasingly openly available in databases such as PubChem, which contains approximately 110 million compound entries as of February 2021. With the availability of data at such scale, the burden has shifted to organisation, analysis and interpretation. Chemical ontologies provide structured classifications of chemical entities that can be used for navigation and filtering of the large chemical space. ChEBI is a prominent example of a chemical ontology, widely used in life science contexts. However, ChEBI is manually maintained and as such cannot easily scale to the full scope of public chemical data. There is a need for tools that are able to automatically classify chemical data into chemical ontologies, which can be framed as a hierarchical multi-class classification problem. In this paper we evaluate machine learning approaches for this task, comparing different learning frameworks including logistic regression, decision trees and long short-term memory artificial neural networks, and different encoding approaches for the chemical structures, including cheminformatics fingerprints and character-based encoding from chemical line notation representations. We find that classical learning approaches such as logistic regression perform well with sets of relatively specific, disjoint chemical classes, while the neural network is able to handle larger sets of overlapping classes but needs more examples per class to learn from, and is not able to make a class prediction for every molecule. Future work will explore hybrid and ensemble approaches, as well as alternative network architectures including neuro-symbolic approaches.


2020 ◽  
Author(s):  
Janna Hastings ◽  
Martin Glauer ◽  
Adel Memariani ◽  
Fabian Neuhaus ◽  
Till Mossakowski

Abstract Chemical data is increasingly openly available in databases such as PubChem, which contains more than 110 million compound entries as of October 2020. With the availability of data at such scale, the burden has shifted to organisation, analysis and interpretation. Chemical ontologies provide structured classifications of chemical entities that can be used for navigation and filtering of the large chemical space. ChEBI is a prominent example of a chemical ontology, widely used in life science contexts. However, ChEBI is manually maintained and as such cannot easily scale to the full scope of public chemical data. There is a need for tools that are able to automatically classify chemical data into chemical ontologies, which can be framed as a hierarchical multi-class classification problem. In this paper we evaluate machine learning approaches for this task, comparing different learning frameworks including logistic regression, decision trees and long short-term memory artificial neural networks, and different encoding approaches for the chemical structures, including cheminformatics fingerprints and character-based encoding from chemical line notation representations. We find that classical learning approaches such as logistic regression perform well with sets of relatively specific, disjoint chemical classes, while the neural network is able to handle larger sets of overlapping classes but needs more examples per class to learn from, and is not able to make a class prediction for every molecule. Future work will explore hybrid and ensemble approaches, as well as alternative network architectures including neuro-symbolic approaches.


Author(s):  
Anupriya Tripathi ◽  
Yoshiki Vázquez-Baeza ◽  
Julia M. Gauglitz ◽  
Mingxun Wang ◽  
Kai Dührkop ◽  
...  

Abstract Untargeted mass spectrometry is employed to detect small molecules in complex biospecimens, generating data that are difficult to interpret. We developed Qemistree, a data exploration strategy based on hierarchical organization of molecular fingerprints predicted from fragmentation spectra, represented in the context of sample metadata and chemical ontologies. By expressing molecular relationships as a tree, we can apply ecological tools, designed around the relatedness of DNA sequences, to study chemical composition.


2020 ◽  
Author(s):  
Lutz Weber ◽  
Konstantin Kruse ◽  
Timo Böhme ◽  
Claudia Bobach ◽  
Stephen Boyer

2020 ◽  
Vol 47 (4) ◽  
pp. 283-299
Author(s):  
Dipankana Banerjee ◽  
Shiv Shakti Ghosh ◽  
Tarun Kumar Mondal

This paper proposes OnE, a comprehensive set of criteria for evaluating ontologies. The criteria are defined so that together they can evaluate any ontology from all aspects. The use of OnE is demonstrated by evaluating chemical ontologies. For this purpose, an ontology for the domain of agricultural chemicals was also constructed following the human-centric faceted approach for ontology construction (HCFOC) and evaluated using OnE. The results of the evaluation have provided insights into the ontologies. The constructed ontology aims to support information systems that help farmers decide which chemicals to use in agriculture. It is also envisaged that the demonstrated ontology and the OnE criteria will redefine ontology evaluation, make it easier, and have a strong impact on ontology developers.

