EHreact: Extended Hasse Diagrams for the Extraction and Scoring of Enzymatic Reaction Templates

2021 ◽  
Author(s):  
Esther Heid ◽  
Samuel Goldman ◽  
Karthik Sankaranarayanan ◽  
Connor W. Coley ◽  
Christoph Flamm ◽  
...  

Data-driven computer-aided synthesis planning utilizing organic or biocatalyzed reactions from large databases has gained increasing interest in the last decade, sparking the development of numerous tools to extract, apply and score general reaction templates. The generation of reaction rules for enzymatic reactions is especially challenging, since substrate promiscuity varies between enzymes, causing the optimal level of rule specificity and the optimal number of included atoms to differ between enzymes. This complicates automated extraction from databases and has promoted the creation of manually curated reaction rule sets. Here we present EHreact, a purely data-driven open-source software tool to extract and score reaction rules from sets of reactions known to be catalyzed by an enzyme, at appropriate levels of specificity and without expert knowledge. EHreact extracts and groups reaction rules into tree-like structures, Hasse diagrams, based on common substructures in the imaginary transition structures. Each diagram can be used to output a single reaction rule or a set of rules, and to calculate the probability that a new substrate is processed by the given enzyme, by inferring information about the reactive site of the enzyme from the known reactions and their grouping in the template tree. EHreact heuristically predicts the activity of a given enzyme on a new substrate, outperforming current approaches in accuracy and functionality.
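
The grouping of templates by shared substructure can be illustrated with a toy sketch. This is not EHreact's code or API: each template is reduced to a hypothetical set of feature labels, and "is a substructure of" is approximated by the proper-subset relation. A real implementation would compare molecular graphs instead. The function names `hasse_edges` and `most_specific_matches` are assumptions made for this illustration.

```python
def hasse_edges(templates):
    """Return the covering pairs (general, specific) of the subset order.

    An edge (a, b) is kept only if a is a proper subset of b and no other
    template lies strictly between them, which is what a Hasse diagram draws.
    """
    nodes = set(templates)
    edges = set()
    for a in nodes:
        for b in nodes:
            if a < b and not any(a < c < b for c in nodes):
                edges.add((a, b))
    return edges


def most_specific_matches(templates, query):
    """Return matching templates that have no more specific matching template."""
    hits = [t for t in set(templates) if t <= query]
    return [t for t in hits if not any(t < u for u in hits)]


# Hypothetical feature sets standing in for reaction templates.
core = frozenset({"C-O"})
ester = frozenset({"C-O", "C=O"})
aryl = frozenset({"C-O", "aryl"})
amide_ester = frozenset({"C-O", "C=O", "N-H"})
templates = [core, ester, aryl, amide_ester]

edges = hasse_edges(templates)
best = most_specific_matches(templates, frozenset({"C-O", "C=O", "Cl"}))
```

Here the most general template sits at the root and more specific templates sit deeper in the diagram. How EHreact turns a position in the diagram into a score is not stated in the abstract, so the sketch stops at finding the most specific matching template.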


2014 ◽  
Vol 12 (06) ◽  
pp. 1442001 ◽  
Author(s):  
Masaaki Kotera ◽  
Yosuke Nishimura ◽  
Zen-Ichi Nakagawa ◽  
Ai Muto ◽  
Yuki Moriya ◽  
...  

Genomics is faced with the issue of many partially annotated putative enzyme-encoding genes for which activities have not yet been verified, while metabolomics is faced with the issue of many putative enzyme reactions for which full equations have not been verified. Knowledge of enzymes has been collected by IUBMB, and has been made public as the Enzyme List. To date, however, the terminology of the Enzyme List has not been assessed comprehensively by bioinformatics studies. Instead, most bioinformatics studies simply use the identifiers of the enzymes, i.e. the Enzyme Commission (EC) numbers. We investigated the actual usage of terminology throughout the Enzyme List, and demonstrated that the partial characteristics of reactions cannot be retrieved by simply using EC numbers. Thus, we developed a novel ontology, named PIERO, for annotating biochemical transformations as follows. First, the terminology describing enzymatic reactions was retrieved from the Enzyme List, and was grouped into terms related to overall reactions and terms related to biochemical transformations. These terms were then mapped onto the actual transformations taken from enzymatic reaction equations. This ontology was linked to Gene Ontology (GO) and EC numbers, allowing the extraction of common partial reaction characteristics from given sets of orthologous genes and the elucidation of possible enzymes from the given transformations. Further development of the PIERO ontology should enhance the Enzyme List to promote the integration of genomics and metabolomics.


2020 ◽  
Vol 175 ◽  
pp. 05027
Author(s):  
Valery Dimitrov ◽  
Lyudmila Borisova ◽  
Inna Nurutdinova

The paper considers the problems of developing and presenting fuzzy expert data on external factors and adjustable parameters of the harvester header. The object domain “Technological adjustment of the harvester header” was studied. On the basis of data obtained from four experts, a linguistic description of the problem statements was given, linguistic variables were introduced, membership functions were developed, and consistency characteristics were calculated. A base of fuzzy expert knowledge was created for the unit that obtains and updates knowledge in an intelligent decision support system used by an operator in field conditions. To estimate the quality of the fuzzy expert data and determine its suitability for use in an intelligent information system, we used an algorithm that sets quality criteria, provides feedback with experts to update the data, and makes it possible to choose the optimal number of terms of the membership functions. Because the algorithm can take the hierarchy of expert data into account, experts could be ranked according to their qualification; Fishburn numbers were used as weighting factors for this purpose.
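
The use of Fishburn numbers as weighting factors can be sketched as follows. The abstract does not give the formula, so this assumes the common form for a strict ranking of n experts, w_i = 2(n − i + 1) / (n(n + 1)), where rank 1 is the most qualified expert. The function names and the example estimates are hypothetical.

```python
from fractions import Fraction


def fishburn_weights(n):
    """Weights for n strictly ranked experts (rank 1 = most qualified).

    Assumed form: w_i = 2(n - i + 1) / (n(n + 1)); the weights sum to 1.
    """
    return [Fraction(2 * (n - i + 1), n * (n + 1)) for i in range(1, n + 1)]


def weighted_estimate(estimates_by_rank):
    """Combine expert estimates, listed from rank 1 downward, into one value."""
    weights = fishburn_weights(len(estimates_by_rank))
    return sum(float(w) * e for w, e in zip(weights, estimates_by_rank))


# Four experts, as in the abstract; the membership degrees are made up.
weights = fishburn_weights(4)
combined = weighted_estimate([0.8, 0.6, 0.7, 0.4])
```

For four experts the assumed formula gives weights of 2/5, 3/10, 1/5 and 1/10, so the highest-ranked expert contributes four times as much as the lowest-ranked one.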


Filomat ◽  
2015 ◽  
Vol 29 (9) ◽  
pp. 1969-1981
Author(s):  
Emina Milovanovic ◽  
Igor Milovanovic ◽  
Mile Stojcev

This paper describes a mathematical procedure for designing hexagonal systolic arrays that implement fault-tolerant matrix multiplication. Fault tolerance is achieved by introducing redundancy at the algorithm level, by defining three equivalent algorithms with disjoint index spaces. The essence of the proposed method is the mapping of the data dependency graph that corresponds to the matrix multiplication algorithm, by an appropriate epimorphism, into a graph with desired properties. Since there is a 1:1 correspondence between the algorithm and its graph representation, all transformations performed on the graph directly affect the algorithm. The chosen epimorphism depends on the projection direction vector μ = [μ1 μ2 μ3]^T and enables obtaining hexagonal arrays with an optimal number of processing elements (PEs) for the given matrix dimensions, which realize fault-tolerant matrix multiplication in the shortest possible time for that number of PEs. The proposed procedure is formally described by explicit formulas and can be used as a software tool for automatic synthesis of fault-tolerant arrays.
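
The idea of redundancy at the algorithm level can be illustrated with a loose software analogy: compute the product three times with different summation orders and take a majority vote per entry. This is not the paper's construction, which maps a dependency graph onto a hexagonal array; the three orderings, the `vote` function, and the injected fault are assumptions made for this sketch.

```python
from collections import Counter

# Three hypothetical summation orders standing in for equivalent algorithms.
ORDERS = (
    lambda m: range(m),                         # ascending k
    lambda m: range(m - 1, -1, -1),             # descending k
    lambda m: [(k + 1) % m for k in range(m)],  # rotated k
)


def matmul(a, b, k_order):
    """Multiply integer matrices a and b, summing over k in the given order."""
    inner = len(b)
    return [[sum(a[i][k] * b[k][j] for k in k_order(inner))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def vote(results):
    """Take the majority value for every entry across the redundant results."""
    voted = []
    for i in range(len(results[0])):
        row = []
        for j in range(len(results[0][0])):
            value, count = Counter(r[i][j] for r in results).most_common(1)[0]
            if count < 2:
                raise ValueError(f"no majority for entry ({i}, {j})")
            row.append(value)
        voted.append(row)
    return voted


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
copies = [matmul(a, b, order) for order in ORDERS]
copies[1][0][0] += 99  # simulate a single faulty result in one copy
product = vote(copies)
```

A single corrupted entry in one of the three copies is outvoted by the other two, which is the property that triple redundancy is meant to provide.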


Geosciences ◽  
2021 ◽  
Vol 11 (2) ◽  
pp. 99 ◽  
Author(s):  
Yueqi Gu ◽  
Orhun Aydin ◽  
Jacqueline Sosa

Post-earthquake relief zone planning is a multidisciplinary optimization problem that requires delineating zones that seek to minimize the loss of life and property. In this study, we offer an end-to-end workflow to define relief zone suitability and equitable relief service zones for Los Angeles (LA) County. In particular, we address the impact of a tsunami in the study due to LA’s high spatial complexities in terms of clustering of population along the coastline, and a complicated inland fault system. We design data-driven earthquake relief zones with a wide variety of inputs, including geological features, population, and public safety. Data-driven zones were generated by solving the p-median problem with the Teitz–Bart algorithm without any a priori knowledge of optimal relief zones. We define the metrics to determine the optimal number of relief zones as a part of the proposed workflow. Finally, we measure the impacts of a tsunami in LA County by comparing data-driven relief zone maps for a case with a tsunami and a case without a tsunami. Our results show that the impact of the tsunami on the relief zones can extend up to 160 km inland from the study area.
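
The p-median step can be sketched with a minimal version of the Teitz–Bart vertex-substitution heuristic. This is a generic illustration, not the authors' workflow: the distance matrix, the starting medians, and the function names are hypothetical, and the study's actual inputs (geology, population, public safety) are not modeled.

```python
def total_cost(dist, medians):
    """Sum over all sites of the distance to the nearest chosen median."""
    return sum(min(row[m] for m in medians) for row in dist)


def teitz_bart(dist, p, start=None):
    """Vertex substitution: swap candidates in while the total cost drops.

    Returns a locally optimal set of p medians and its cost; the result is
    a heuristic solution and may depend on the starting set.
    """
    medians = list(start) if start is not None else list(range(p))
    best = total_cost(dist, medians)
    improved = True
    while improved:
        improved = False
        for cand in range(len(dist)):
            if cand in medians:
                continue
            # Find the current median whose replacement by cand helps most.
            best_pos, best_cost = None, best
            for pos in range(p):
                trial = medians[:pos] + [cand] + medians[pos + 1:]
                cost = total_cost(dist, trial)
                if cost < best_cost:
                    best_pos, best_cost = pos, cost
            if best_pos is not None:
                medians[best_pos] = cand
                best = best_cost
                improved = True
    return sorted(medians), best


# Six hypothetical sites on a line, forming two clusters.
positions = [0, 1, 2, 10, 11, 12]
dist = [[abs(x - y) for y in positions] for x in positions]
zones, cost = teitz_bart(dist, p=2)
```

On this toy input the heuristic places one median in the middle of each cluster. Running it for several values of p and comparing the resulting costs is one simple way to explore the number of zones; the abstract does not specify which metrics the authors use for that choice.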


2018 ◽  
Vol 119 ◽  
pp. 453-461 ◽  
Author(s):  
Danli Wang ◽  
Lufeng Yan ◽  
Xiaobin Ma ◽  
Wenjun Wang ◽  
Mingming Zou ◽  
...  

2018 ◽  
Vol 23 (6) ◽  
pp. 561-573 ◽  
Author(s):  
Martin Winter ◽  
Tom Bretschneider ◽  
Carola Kleiner ◽  
Robert Ries ◽  
Jörg P. Hehn ◽  
...  

Label-free, mass spectrometric (MS) detection is an emerging technology in the field of drug discovery. Unbiased deciphering of enzymatic reactions is a distinct advantage over conventional label-based readouts, which suffer from compound interference and the intricate generation of tailored signal mediators. Significant advances in matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) MS, as well as in associated liquid handling instrumentation, triggered extensive efforts in the drug discovery community to integrate the comprehensive MS readout into the high-throughput screening (HTS) portfolio. Speed, sensitivity, and accuracy comparable to those of conventional, label-based readouts, combined with the merits of MS-based technologies, such as label-free parallelized measurement of multiple physiological components, emphasize the advantages of MALDI-TOF for HTS approaches. Here we describe the assay development for the identification of protein tyrosine phosphatase 1B (PTP1B) inhibitors. In the context of this valuable drug target, MALDI-TOF was integrated into the HTS environment and cross-compared with the well-established AlphaScreen technology. We demonstrate robust and accurate IC50 determination in close agreement with data generated by AlphaScreen. Additionally, a tailored MALDI-TOF assay was developed to monitor compound-dependent, irreversible modification of the active cysteine of PTP1B. Overall, the presented data demonstrate the promise of integrating MALDI-TOF into drug discovery campaigns.

