EHreact: Extended Hasse Diagrams for the Extraction and Scoring of Enzymatic Reaction Templates

2021; DOI: 10.26434/chemrxiv.14714748.v1

Author(s): Esther Heid, Samuel Goldman, Karthik Sankaranarayanan, Connor W. Coley, Christoph Flamm, ...

Keyword(s): Expert Knowledge, Enzymatic Reaction, Software Tool, Optimal Number, Data Driven, Enzymatic Reactions, Hasse Diagrams, Large Databases, Rule Sets, The Given

Data-driven computer-aided synthesis planning utilizing organic or biocatalyzed reactions from large databases has gained increasing interest in the last decade, sparking the development of numerous tools to extract, apply and score general reaction templates. The generation of reaction rules for enzymatic reactions is especially challenging, since substrate promiscuity varies between enzymes, causing the optimal levels of rule specificity and optimal number of included atoms to differ between enzymes. This complicates an automated extraction from databases and has promoted the creation of manually curated reaction rule sets. Here we present EHreact, a purely data-driven open-source software tool to extract and score reaction rules from sets of reactions known to be catalyzed by an enzyme at appropriate levels of specificity without expert knowledge. EHreact extracts and groups reaction rules into tree-like structures, Hasse diagrams, based on common substructures in the imaginary transition structures. Each diagram can be utilized to output a single or a set of reaction rules, as well as calculate the probability of a new substrate to be processed by the given enzyme by inferring information about the reactive site of the enzyme from the known reactions and their grouping in the template tree. EHreact heuristically predicts the activity of a given enzyme on a new substrate, outperforming current approaches in accuracy and functionality.
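The grouping of reaction rules into a Hasse diagram can be illustrated with a small sketch. It is not EHreact's implementation: here each template is a hypothetical set of feature strings, and plain set inclusion stands in for the substructure matching on imaginary transition structures described in the abstract. The function names and feature labels are assumptions made for illustration only.

```python
def hasse_edges(templates):
    """Return the covering relation of the subset order on `templates`.

    An edge (a, b) means a is a proper subset of b and no other template c
    lies strictly between them, so b is a direct specialization of a.
    """
    nodes = set(templates)
    edges = set()
    for a in nodes:
        for b in nodes:
            if a < b and not any(a < c < b for c in nodes):
                edges.add((a, b))
    return edges


def most_specific_matches(query, templates):
    """Return the matching templates that have no more specific matching template."""
    matched = {t for t in templates if t <= query}
    return {t for t in matched if not any(t < other for other in matched)}


# Hypothetical feature sets: a general rule and three more specific variants.
core = frozenset({"C-OH>>C=O"})
primary = core | {"alpha-CH2"}
aryl = core | {"alpha-aryl"}
specific = primary | {"beta-aryl"}
templates = [core, primary, aryl, specific]

edges = hasse_edges(templates)
query = specific | {"ring"}  # a new substrate with one extra, unseen feature
leaves = most_specific_matches(query, templates)
```

In this toy diagram the general rule `core` is the root, and a query is assigned to the deepest node whose features it contains. How far down the diagram a query matches is the kind of information a scoring scheme could build on.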

PIERO ontology for analysis of biochemical transformations: Effective implementation of reaction information in the IUBMB enzyme list

Journal of Bioinformatics and Computational Biology, 2014, Vol 12 (06), pp. 1442001; DOI: 10.1142/s0219720014420013; Cited By ~ 5

Author(s): Masaaki Kotera, Yosuke Nishimura, Zen-Ichi Nakagawa, Ai Muto, Yuki Moriya, ...

Keyword(s): Enzyme Commission, Enzymatic Reaction, Enzymatic Reactions, Orthologous Genes, Actual Usage, Effective Implementation, Reaction Characteristics, The Given, Encoding Genes, Reaction Equations

Genomics is faced with the issue of many partially annotated putative enzyme-encoding genes for which activities have not yet been verified, while metabolomics is faced with the issue of many putative enzyme reactions for which full equations have not been verified. Knowledge of enzymes has been collected by IUBMB, and has been made public as the Enzyme List. To date, however, the terminology of the Enzyme List has not been assessed comprehensively by bioinformatics studies. Instead, most bioinformatics studies simply use the identifiers of the enzymes, i.e. the Enzyme Commission (EC) numbers. We investigated the actual usage of terminology throughout the Enzyme List, and demonstrated that the partial characteristics of reactions cannot be retrieved by simply using EC numbers. Thus, we developed a novel ontology, named PIERO, for annotating biochemical transformations as follows. First, the terminology describing enzymatic reactions was retrieved from the Enzyme List, and was grouped into terms related to overall reactions and terms related to biochemical transformations. Subsequently, these terms were mapped onto the actual transformations taken from enzymatic reaction equations. This ontology was linked to Gene Ontology (GO) and EC numbers, allowing the extraction of common partial reaction characteristics from given sets of orthologous genes and the elucidation of possible enzymes from the given transformations. Further development of the PIERO ontology should enhance the Enzyme List to promote the integration of genomics and metabolomics.

Development and analysis of fuzzy expert data for technological adjustment of a grain harvester header

E3S Web of Conferences, 2020, Vol 175, pp. 05027; DOI: 10.1051/e3sconf/202017505027

Author(s): Valery Dimitrov, Lyudmila Borisova, Inna Nurutdinova

Keyword(s): Intelligent System, Expert Knowledge, Quality Criteria, Optimal Number, Membership Functions, Object Domain, Estimate Quality, Intelligent Information, The Given, Expert Data

The paper considers the problems of developing and presenting fuzzy expert data on external factors and adjustable parameters of the harvester header. The object domain “Technological adjustment of the harvester header” was studied. On the basis of data obtained from four experts, a linguistic description of the problem statements was given, linguistic variables were introduced, membership functions were developed, and consistency characteristics were calculated. A base of fuzzy expert knowledge was created for the knowledge acquisition and updating unit of an intelligent decision support system used by an operator in field conditions. To estimate the quality of the fuzzy expert data and determine its suitability for use in an intelligent information system, we used an algorithm that sets quality criteria, provides feedback with experts to update the data, and makes it possible to choose the optimal number of terms of the membership functions. Because the algorithm can take the hierarchy of expert data into account, experts could be ranked according to their qualification; Fishburn numbers were used as weighting factors for this purpose.
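The use of Fishburn numbers as weighting factors for ranked experts can be sketched as follows. The abstract does not give the exact formula, so this assumes the common Fishburn rule for a strict ranking of n experts, w_i = 2(n - i + 1) / (n(n + 1)), where rank 1 is the most qualified expert. The function names and the sample estimates are hypothetical.

```python
from fractions import Fraction


def fishburn_weights(n):
    """Weights for n strictly ranked experts (rank 1 first); they sum to 1."""
    return [Fraction(2 * (n - i + 1), n * (n + 1)) for i in range(1, n + 1)]


def weighted_estimate(estimates_by_rank):
    """Combine expert estimates, listed from the highest-ranked expert down."""
    weights = fishburn_weights(len(estimates_by_rank))
    return sum(w * x for w, x in zip(weights, estimates_by_rank))


# Four experts, as in the study; the estimates themselves are made up.
weights = fishburn_weights(4)
combined = weighted_estimate([10, 20, 30, 40])
```

With four experts the weights fall linearly from 2/5 for the highest-ranked expert to 1/10 for the lowest, so the combined value leans toward the more qualified opinions.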

Hexagonal arrays for fault-tolerant matrix multiplication

Filomat, 2015, Vol 29 (9), pp. 1969-1981; DOI: 10.2298/fil1509969m

Author(s): Emina Milovanovic, Igor Milovanovic, Mile Stojcev

Keyword(s): Fault Tolerant, Matrix Multiplication, Software Tool, Optimal Number, Graph Representation, Matrix Multiplication Algorithm, Projection Direction, The Matrix, The Given, Hexagonal Arrays

This paper describes a mathematical procedure for designing hexagonal systolic arrays that implement fault-tolerant matrix multiplication. Fault tolerance is achieved by introducing redundancy at the algorithm level, by defining three equivalent algorithms with disjoint index spaces. The proposed method is based on mapping the data dependency graph that corresponds to the matrix multiplication algorithm, by an appropriate epimorphism, into a graph with the desired properties. Since there is a 1:1 correspondence between the algorithm and its graph representation, all transformations performed on the graph directly affect the algorithm. The chosen epimorphism depends on the projection direction vector μ = [μ1 μ2 μ3]^T and yields hexagonal arrays with the optimal number of processing elements (PEs) for the given matrix dimensions, which perform fault-tolerant matrix multiplication in the shortest possible time for that number of PEs. The proposed procedure is formally described by explicit formulas and can be used as a software tool for automatic synthesis of fault-tolerant arrays.
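The idea of algorithm-level redundancy can be illustrated in software with a minimal sketch. It does not model the paper's systolic arrays or its disjoint index spaces: as an assumption, the three "equivalent algorithms" are simply three loop orderings of the same product, and an element-wise majority vote (also an assumption, not stated in the abstract) masks a single faulty copy. All names are hypothetical.

```python
from collections import Counter


def matmul(a, b, order="ijk"):
    """Multiply a (n x m) by b (m x p), nesting the loops in the given order."""
    n, m, p = len(a), len(b), len(b[0])
    size = {"i": n, "j": p, "k": m}
    c = [[0] * p for _ in range(n)]
    for x in range(size[order[0]]):
        for y in range(size[order[1]]):
            for z in range(size[order[2]]):
                # Map the loop variables back to the roles i, j, k.
                idx = dict(zip(order, (x, y, z)))
                i, j, k = idx["i"], idx["j"], idx["k"]
                c[i][j] += a[i][k] * b[k][j]
    return c


def majority_vote(results):
    """Pick, element by element, the value that most copies agree on."""
    rows, cols = len(results[0]), len(results[0][0])
    return [
        [Counter(r[i][j] for r in results).most_common(1)[0][0] for j in range(cols)]
        for i in range(rows)
    ]


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
copies = [matmul(a, b, order) for order in ("ijk", "jki", "kij")]
copies[1][0][1] += 100  # inject one simulated fault into a single copy
voted = majority_vote(copies)
```

Because the other two copies still agree on every element, the vote recovers the correct product despite the injected error.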

Quantifying the Impact of a Tsunami on Data-Driven Earthquake Relief Zone Planning in Los Angeles County via Multivariate Spatial Optimization

Geosciences, 2021, Vol 11 (2), pp. 99; DOI: 10.3390/geosciences11020099; Cited By ~ 1

Author(s): Yueqi Gu, Orhun Aydin, Jacqueline Sosa

Keyword(s): Los Angeles, A Priori, Safety Data, Spatial Optimization, Optimal Number, Multidisciplinary Optimization, Data Driven, Fault System, Design Data, The Impact

Post-earthquake relief zone planning is a multidisciplinary optimization problem that requires delineating zones that minimize the loss of life and property. In this study, we offer an end-to-end workflow to define relief zone suitability and equitable relief service zones for Los Angeles (LA) County. In particular, we address the impact of a tsunami, given LA’s high spatial complexity in terms of population clustering along the coastline and a complicated inland fault system. We design data-driven earthquake relief zones with a wide variety of inputs, including geological features, population, and public safety. Data-driven zones were generated by solving the p-median problem with the Teitz–Bart algorithm without any a priori knowledge of optimal relief zones. We define the metrics to determine the optimal number of relief zones as a part of the proposed workflow. Finally, we measure the impacts of a tsunami in LA County by comparing data-driven relief zone maps for a case with a tsunami and a case without a tsunami. Our results show that the impact of the tsunami on the relief zones can extend up to 160 km inland from the study area.
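The p-median step can be sketched with a basic vertex-substitution heuristic in the style of Teitz and Bart. This is a minimal illustration, not the study's workflow: the distance matrix, the demand weights, and the six-point example are made up, and the function names are hypothetical. The heuristic returns a local optimum, which need not be the global one.

```python
def total_cost(dist, weights, medians):
    """Weighted sum of distances from each demand point to its nearest median."""
    return sum(w * min(dist[i][m] for m in medians) for i, w in enumerate(weights))


def teitz_bart(dist, weights, p, initial=None):
    """Swap candidates into the median set for as long as the cost decreases."""
    n = len(dist)
    medians = list(initial) if initial is not None else list(range(p))
    best = total_cost(dist, weights, medians)
    improved = True
    while improved:
        improved = False
        for candidate in range(n):
            if candidate in medians:
                continue
            # Find the current median whose replacement by `candidate` helps most.
            best_slot, best_slot_cost = None, best
            for slot in range(len(medians)):
                trial = medians[:slot] + [candidate] + medians[slot + 1:]
                cost = total_cost(dist, weights, trial)
                if cost < best_slot_cost:
                    best_slot, best_slot_cost = slot, cost
            if best_slot is not None:
                medians[best_slot] = candidate
                best = best_slot_cost
                improved = True
    return sorted(medians), best


# Six demand points on a line, forming two clusters; unit demand everywhere.
positions = [0, 1, 2, 10, 11, 12]
dist = [[abs(x - y) for y in positions] for x in positions]
weights = [1] * len(positions)
medians, cost = teitz_bart(dist, weights, p=2)
```

Starting from the first two points, the swaps move one median into each cluster, ending at the middle point of each. Repeating the run for several values of `p` and comparing the resulting costs is one simple way to reason about how many zones to use.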

A Data-Driven Method Using BRB With Data Reliability and Expert Knowledge for Complex Systems Modeling

IEEE Transactions on Systems Man and Cybernetics Systems, 2021, pp. 1-15; DOI: 10.1109/tsmc.2021.3095524

Author(s): Leilei Chang, Chao Fu, Zijian Wu, Weiyong Liu

Keyword(s): Complex Systems, Expert Knowledge, Data Driven, Systems Modeling, Data Reliability

A comparative study of an expert knowledge-based model and two data-driven models for landslide susceptibility mapping

CATENA, 2018, Vol 166, pp. 317-327; DOI: 10.1016/j.catena.2018.04.003; Cited By ~ 25

Author(s): A-Xing Zhu, Yamin Miao, Rongxun Wang, Tongxin Zhu, Yongcui Deng, ...

Keyword(s): Comparative Study, Landslide Susceptibility, Expert Knowledge, Susceptibility Mapping, Data Driven, Landslide Susceptibility Mapping, Knowledge Based

Ultrasound promotes enzymatic reactions by acting on different targets: Enzymes, substrates and enzymatic reaction systems

International Journal of Biological Macromolecules, 2018, Vol 119, pp. 453-461; DOI: 10.1016/j.ijbiomac.2018.07.133; Cited By ~ 26

Author(s): Danli Wang, Lufeng Yan, Xiaobin Ma, Wenjun Wang, Mingming Zou, ...

Keyword(s): Enzymatic Reaction, Enzymatic Reactions, Reaction Systems

Establishing MALDI-TOF as Versatile Drug Discovery Readout to Dissect the PTP1B Enzymatic Reaction

SLAS DISCOVERY Advancing Life Sciences, 2018, Vol 23 (6), pp. 561-573; DOI: 10.1177/2472555218759267; Cited By ~ 7

Author(s): Martin Winter, Tom Bretschneider, Carola Kleiner, Robert Ries, Jörg P. Hehn, ...

Keyword(s): Drug Discovery, High Throughput Screening, Tyrosine Phosphatase, Enzymatic Reaction, Enzymatic Reactions, Label Free, Ionization Time, Maldi Tof, Ptp1b Inhibitors, Promising Perspective

Label-free mass spectrometric (MS) detection is an emerging technology in the field of drug discovery. Unbiased deciphering of enzymatic reactions is a clear advantage over conventional label-based readouts, which suffer from compound interference and the intricate generation of tailored signal mediators. Significant advances in matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) MS, as well as in the associated liquid handling instrumentation, have triggered extensive efforts in the drug discovery community to integrate the comprehensive MS readout into the high-throughput screening (HTS) portfolio. Speed, sensitivity, and accuracy comparable to those of conventional label-based readouts, combined with the merits of MS-based technologies, such as label-free parallelized measurement of multiple physiological components, underline the advantages of MALDI-TOF for HTS approaches. Here we describe the assay development for the identification of protein tyrosine phosphatase 1B (PTP1B) inhibitors. For this valuable drug target, MALDI-TOF was integrated into the HTS environment and cross-compared with the well-established AlphaScreen technology. We demonstrate robust and accurate IC50 determination in close agreement with data generated by AlphaScreen. Additionally, a tailored MALDI-TOF assay was developed to monitor compound-dependent, irreversible modification of the active cysteine of PTP1B. Overall, the presented data demonstrate the promise of integrating MALDI-TOF into drug discovery campaigns.
