input formula Latest Research Papers

Surge - A Fast Open-Source Chemical Graph Generator

Chemical structure generators are used in cheminformatics to produce or enumerate virtual molecules based on a set of boundary conditions. The resulting molecules can then be tested for properties of interest, such as adherence to measured data or suitability as drugs. The starting point can be a potentially fuzzy set of fragments or a molecular formula. In the latter case, the generator produces the set of constitutional isomers of the given input formula. Here we present surge, a novel constitutional isomer generator based on the canonical generation path method. Surge uses the nauty package to compute automorphism groups of graphs. We outline the working principles of surge and present benchmarking results which show that surge is currently the fastest structure generator. Surge is available under a liberal open-source license.

Conflict-Driven Satisfiability for Theory Combination: Lemmas, Modules, and Proofs

Journal of Automated Reasoning ◽ 10.1007/s10817-021-09606-y ◽ 2021

Author(s): Maria Paola Bonacina ◽ Stéphane Graham-Lengrand ◽ Natarajan Shankar

Keyword(s): Theorem Proving ◽ Practical Interest ◽ Interactive Theorem Proving ◽ Transition System ◽ Candidate Model ◽ Inference Systems ◽ Input Formula ◽ Theory Combination

Search-based satisfiability procedures try to build a model of the input formula by simultaneously proposing candidate models and deriving new formulae implied by the input. Conflict-driven procedures perform non-trivial inferences only when resolving conflicts between formulae and assignments representing the candidate model. CDSAT (Conflict-Driven SATisfiability) is a method for conflict-driven reasoning in unions of theories. It combines inference systems for individual theories as theory modules within a solver for the union of the theories. This article augments CDSAT with a more general lemma learning capability and with proof generation. Furthermore, theory modules for several theories of practical interest are shown to fulfill the requirements for completeness and termination of CDSAT. Proof generation is accomplished by a proof-carrying version of the CDSAT transition system that produces proof objects in memory accommodating multiple proof formats. Alternatively, one can apply to CDSAT the LCF approach to proofs from interactive theorem proving, by defining a kernel of reasoning primitives that guarantees the correctness by construction of CDSAT proofs.
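
To make the idea of "proposing candidate models and deriving new formulae implied by the input" concrete, here is a minimal propositional sketch. It is plain DPLL with unit propagation and chronological backtracking, not CDSAT itself: it has no theory modules, no lemma learning, and no proof generation. The clause encoding (signed integers, DIMACS style) and the function names `simplify` and `solve` are assumptions made for this illustration.

```python
from typing import Dict, List, Optional

Clause = List[int]  # assumed encoding: 3 means x3, -3 means "not x3"


def simplify(clauses: List[Clause], lit: int) -> Optional[List[Clause]]:
    """Assume `lit` is true; return the reduced clauses, or None on conflict."""
    reduced_clauses = []
    for clause in clauses:
        if lit in clause:
            continue  # clause is satisfied by the assumption
        reduced = [l for l in clause if l != -lit]
        if not reduced:
            return None  # every literal is false: conflict
        reduced_clauses.append(reduced)
    return reduced_clauses


def solve(clauses: List[Clause],
          model: Optional[Dict[int, bool]] = None) -> Optional[Dict[int, bool]]:
    """Return a (possibly partial) satisfying assignment, or None if unsatisfiable."""
    model = dict(model or {})
    if any(not clause for clause in clauses):
        return None  # an empty clause can never be satisfied
    # Derive implied literals: a one-literal clause forces that literal.
    while True:
        unit = next((c[0] for c in clauses if len(c) == 1), None)
        if unit is None:
            break
        model[abs(unit)] = unit > 0
        clauses = simplify(clauses, unit)
        if clauses is None:
            return None
    if not clauses:
        return model  # all clauses satisfied
    # Propose a candidate value for an unassigned variable, backtrack on failure.
    var = abs(clauses[0][0])
    for lit in (var, -var):
        reduced = simplify(clauses, lit)
        if reduced is not None:
            result = solve(reduced, {**model, var: lit > 0})
            if result is not None:
                return result
    return None
```

Variables that never appear in the returned dictionary are unconstrained. A conflict-driven solver would replace the plain backtracking step with conflict analysis and learned lemmas.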

Recognizing Lexicographically Smallest Words and Computing Successors in Regular Languages

International Journal of Foundations of Computer Science ◽ 10.1142/s0129054121420028 ◽ 2021 ◽ pp. 1-22

Author(s): Lukas Fleischer ◽ Jeffrey Shallit

Keyword(s): Regular Language ◽ Formal Language ◽ Equivalence Classes ◽ Regular Languages ◽ State Complexity ◽ Finite State ◽ Finite State Transducer ◽ Enumeration Problem ◽ Necessary And Sufficient ◽ Input Formula

For a formal language [Formula: see text], the problem of language enumeration asks to compute the length-lexicographically smallest word in [Formula: see text] larger than a given input [Formula: see text] (henceforth called the [Formula: see text]-successor of [Formula: see text]). We investigate this problem for regular languages from a computational complexity and state complexity perspective. We first show that if [Formula: see text] is recognized by a DFA with [Formula: see text] states, then [Formula: see text] states are (in general) necessary and sufficient for an unambiguous finite-state transducer to compute [Formula: see text]-successors. As a byproduct, we obtain that if [Formula: see text] is recognized by a DFA with [Formula: see text] states, then [Formula: see text] states are sufficient for a DFA to recognize the subset [Formula: see text] of [Formula: see text] composed of its lexicographically smallest words. We give a matching lower bound that holds even if [Formula: see text] is represented as an NFA. It has been known that [Formula: see text]-successors can be computed in polynomial time, even if the regular language is given as part of the input (assuming a suitable representation of the language, such as a DFA). In this paper, we refine this result in multiple directions. We show that if the regular language is given as part of the input and encoded as a DFA, the problem is in [Formula: see text]. If the regular language [Formula: see text] is fixed, we prove that the enumeration problem of the language is reducible to deciding membership to the Myhill-Nerode equivalence classes of [Formula: see text] under [Formula: see text]-uniform [Formula: see text] reductions. In particular, this implies that fixed star-free languages can be enumerated in [Formula: see text], arbitrary fixed regular languages can be enumerated in [Formula: see text] and that there exist regular languages for which the problem is [Formula: see text]-complete.
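
The successor problem described above can be illustrated with a direct polynomial-time sketch for a language given as a DFA. This is a textbook-style approach (reachability tables plus greedy completion), not the transducer construction or the complexity-theoretic reductions of the paper. The DFA encoding (a transition dictionary that may be partial) and the function name `successor` are assumptions for this example, and the input word is assumed to use only symbols from the alphabet.

```python
from typing import Dict, Hashable, Iterable, Optional, Set, Tuple


def successor(alphabet: Iterable[str],
              delta: Dict[Tuple[Hashable, str], Hashable],
              start: Hashable,
              accepting: Set[Hashable],
              word: str) -> Optional[str]:
    """Length-lexicographically smallest accepted word greater than `word`."""
    alphabet = sorted(alphabet)
    states = {start} | set(accepting) | {s for s, _ in delta} | set(delta.values())
    n = len(word)
    # If any longer word is accepted, one of length <= n + |states| is accepted.
    max_len = n + len(states)

    # reach[k]: states from which an accepting state is reachable in exactly k steps.
    reach = [set(accepting)]
    for _ in range(max_len):
        prev = reach[-1]
        reach.append({s for s in states
                      if any(delta.get((s, a)) in prev for a in alphabet)})

    def smallest(state: Hashable, k: int) -> str:
        """Lexicographically smallest length-k word accepted from `state`."""
        out = []
        for remaining in range(k - 1, -1, -1):
            for a in alphabet:
                nxt = delta.get((state, a))
                if nxt in reach[remaining]:
                    out.append(a)
                    state = nxt
                    break
        return "".join(out)

    # prefix[i]: state after reading word[:i], or None if the run died.
    prefix = [start]
    for ch in word:
        last = prefix[-1]
        prefix.append(None if last is None else delta.get((last, ch)))

    # Same length: keep the longest possible prefix, then bump one symbol.
    for i in range(n - 1, -1, -1):
        if prefix[i] is None:
            continue
        for a in alphabet:
            if a > word[i]:
                nxt = delta.get((prefix[i], a))
                if nxt in reach[n - i - 1]:
                    return word[:i] + a + smallest(nxt, n - i - 1)

    # Otherwise: the smallest word of the next accepted length.
    for k in range(n + 1, max_len + 1):
        if start in reach[k]:
            return smallest(start, k)
    return None
```

For the language of words over {a, b} with an even number of a's, repeated calls starting from the empty word enumerate the language in length-lexicographic order.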

Indexing k-mers in linear space for quality value compression

Journal of Bioinformatics and Computational Biology ◽ 10.1142/s0219720019400110 ◽ 2019 ◽ Vol 17 (05) ◽ pp. 1940011

Author(s): Yoshihiro Shibuya ◽ Matteo Comin

Keyword(s): Linear Space ◽ Storage Space ◽ Sequencing Data ◽ Snp Calling ◽ Bioinformatics Tools ◽ Quality Value ◽ Multiple Genomes ◽ Input Formula

Many bioinformatics tools heavily rely on k-mer dictionaries to describe the composition of sequences and allow for faster reference-free algorithms or look-ups. Unfortunately, naive k-mer dictionaries are very memory-inefficient, requiring a very large amount of storage space to save each k-mer. This problem is generally worsened by the necessity of an index for fast queries. In this work, we discuss how to build an indexed linear reference containing a set of input k-mers and its application to the compression of quality scores in FASTQ files. Most of the entropy of sequencing data lies in the quality scores, and thus they are difficult to compress. Here, we present an application to improve the compressibility of quality values while preserving the information for SNP calling. We show how a dictionary of significant k-mers, obtained from SNP databases and multiple genomes, can be indexed in linear space and used to improve the compression of quality values. Availability: The software is freely available at https://github.com/yhhshb/yalff.
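
As a small illustration of the "naive k-mer dictionary" that the abstract contrasts with an indexed linear reference, the sketch below stores every k-mer of a sequence as its own dictionary key. It is not the yalff implementation; the function names `kmer_counts` and `naive_storage_chars` are hypothetical and chosen for this example only.

```python
from collections import Counter


def kmer_counts(sequence: str, k: int) -> Counter:
    """Count every length-k substring (k-mer) of `sequence`."""
    if k <= 0:
        raise ValueError("k must be positive")
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def naive_storage_chars(counts: Counter, k: int) -> int:
    """Characters needed when each distinct k-mer is stored separately."""
    return k * len(counts)


counts = kmer_counts("ACGTACG", 3)
# Four distinct 3-mers cost 12 characters as separate keys, while the
# 7-character source string already contains all of them as substrings.
print(counts, naive_storage_chars(counts, 3))
```

Consecutive k-mers overlap in k-1 characters, which is why storing them separately is wasteful compared with a single string that contains each k-mer as a substring.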

Recognizing Geometric Trees as Positively Weighted Straight Skeletons and Reconstructing Their Input

International Journal of Computational Geometry & Applications ◽ 10.1142/s0218195919500080 ◽ 2019 ◽ Vol 29 (03) ◽ pp. 251-267

Author(s): Günther Eder ◽ Martin Held ◽ Peter Palfrader

Keyword(s): Weight Function ◽ Time And Space ◽ Positive Weight ◽ Input Formula

We extend results by Biedl et al. (ISVD’13) on the recognition and reconstruction of straight skeletons: Given a geometric tree [Formula: see text], can we recognize whether [Formula: see text] resembles a weighted straight skeleton [Formula: see text] and, if so, can we reconstruct an appropriate polygonal input [Formula: see text] and an appropriate positive weight function [Formula: see text] such that [Formula: see text]? We show that a solution polygon [Formula: see text] and a weight function [Formula: see text] can be found in [Formula: see text] time and space for a geometric tree [Formula: see text] with [Formula: see text] faces if at most one node of [Formula: see text] has two incident edges that span an angle greater than [Formula: see text]. In addition, we show that [Formula: see text] implicitly encodes enough information such that all other weighted bisectors of any solution [Formula: see text] can be obtained from [Formula: see text] without explicitly computing [Formula: see text].

A 2-approximation algorithm for the contig-based genomic scaffold filling problem

Journal of Bioinformatics and Computational Biology ◽ 10.1142/s0219720018500221 ◽ 2018 ◽ Vol 16 (06) ◽ pp. 1850022 ◽ Cited By ~ 2

Author(s): Haitao Jiang ◽ Letu Qingge ◽ Daming Zhu ◽ Binhai Zhu

Keyword(s): Approximation Algorithm ◽ Reference Genome ◽ Constant Factor ◽ Important Case ◽ Completeness Proof ◽ Np Completeness ◽ Genomic Scaffold ◽ Np Complete ◽ Filling Problem ◽ Input Formula

The genomic scaffold filling problem has attracted a lot of attention recently. The problem is to fill an incomplete sequence (scaffold) [Formula: see text] into [Formula: see text], with respect to a complete reference genome [Formula: see text], such that the number of common/shared adjacencies between [Formula: see text] and [Formula: see text] is maximized. The problem is NP-complete and admits a constant-factor approximation. However, the sequence input [Formula: see text] is not quite practical and does not fit most real datasets (where a scaffold is more often given as a list of contigs). In this paper, we revisit the genomic scaffold filling problem by considering this important case, in which a scaffold [Formula: see text] is given, the missing genes can only be inserted between the contigs, and the objective is to maximize the number of common adjacencies between [Formula: see text] and the filled genome [Formula: see text]. For this problem, we present a simple NP-completeness proof and then a factor-2 approximation algorithm.

The Impact of Entropy and Solution Density on Selected SAT Heuristics

Entropy ◽ 10.3390/e20090713 ◽ 2018 ◽ Vol 20 (9) ◽ pp. 713

Author(s): Dor Cohen ◽ Ofer Strichman

Keyword(s): Medium Size ◽ Solution Density ◽ Back Door ◽ Key Variables ◽ Computationally Expensive ◽ The Impact ◽ Input Formula

We present a new characterization of propositional formulas called entropy, which approximates the freedom we have in assigning the variables. Like several other such measures (e.g., back-door and back-door-key variables), it is computationally expensive to compute. Nevertheless, for small and medium-size satisfiable formulas, it enables us to study the effect of this freedom on the impact of various SAT heuristics, following up on a recent study by C. Oh (Oh, SAT’15, LNCS 9340, 307–323). Oh’s findings were that the expected success of various heuristics depends on whether the input formula is satisfiable or not. With entropy, and also with the measure of solution density, we are able to refine these findings for the case of satisfiable formulas. Specifically, we found empirically that satisfiable formulas with small entropy “behave” similarly to unsatisfiable formulas.
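
The abstract does not define its measures, so the following is only a sketch of one of them under a stated assumption: solution density is taken here to mean the fraction of all variable assignments that satisfy the formula. That reading, the signed-integer clause encoding, and the name `solution_density` are assumptions of this example, not definitions from the paper. The brute-force loop also shows why such measures are expensive: it visits all 2^n assignments.

```python
from itertools import product
from typing import List


def solution_density(clauses: List[List[int]], num_vars: int) -> float:
    """Fraction of the 2**num_vars assignments satisfying a CNF formula.

    Literals are signed integers: 2 means x2 is true, -2 means x2 is false.
    """
    satisfying = 0
    for bits in product([False, True], repeat=num_vars):
        # A clause holds if at least one of its literals agrees with `bits`.
        if all(any(bits[abs(l) - 1] == (l > 0) for l in clause)
               for clause in clauses):
            satisfying += 1
    return satisfying / 2 ** num_vars
```

For example, the single clause (x1 or x2) is satisfied by three of the four assignments over two variables.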

Multitime Scale Study of Bursting Activities in the Pre-Bötzinger Complex

International Journal of Bifurcation and Chaos ◽ 10.1142/s0218127417501723 ◽ 2017 ◽ Vol 27 (11) ◽ pp. 1750172 ◽ Cited By ~ 2

Author(s): Zhuosheng Lü ◽ Cui Zhao ◽ Bizhao Zhang ◽ Lixia Duan

Keyword(s): Cell Model ◽ Excitatory Input ◽ Maximal Conductance ◽ Single Cell Model ◽ Bötzinger Complex ◽ Geometric Decomposition ◽ The Mean ◽ Slow Variables ◽ Slow Timescale ◽ Input Formula

In this paper, we consider a single-cell model of the pre-Bötzinger complex, which is derived by adding an external tonic drive ([Formula: see text]) to the model developed by Park and Rubin. Using fast–slow geometric decomposition and bifurcation analysis, we study the firing activities of the system and try to reveal the mechanisms underlying the bursts related to the mean level of excitatory input ([Formula: see text]) and the maximal conductance associated with the sodium ([Formula: see text]). Since regular bursting requires at least two timescales, we consider the effects of the timescales, especially the slow timescale, on the bursting oscillations. Unlike previous work, we conduct our investigation by choosing different slow variables. We show how [Formula: see text] and [Formula: see text] affect bifurcations of the fast subsystem and how these bifurcations in turn determine the firing activities of the full system with different slow variables.

nanoCoP: Natural Non-clausal Theorem Proving

Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence ◽ 10.24963/ijcai.2017/695 ◽ 2017 ◽ Cited By ~ 1

Author(s): Jens Otten

Keyword(s): Conjunctive Normal Form ◽ Order Logic ◽ Theorem Prover ◽ Original Structure ◽ First Order ◽ Proof Search ◽ Automated Theorem Prover ◽ Speed Up ◽ Input Formula ◽ Original Formula

Most efficient fully automated theorem provers implement proof search calculi that require the input formula to be in a clausal form, i.e. disjunctive or conjunctive normal form. The translation into clausal form introduces a significant overhead to the proof search and modifies the structure of the original formula. Translating a proof in clausal form back into a more readable non-clausal proof of the original formula is not straightforward. This paper presents a non-clausal automated theorem prover for classical first-order logic. It is based on a non-clausal connection calculus and implemented with a few lines of Prolog code. Working entirely on the original structure of the input formula not only speeds up the proof search but also yields shorter non-clausal proofs.

On an algorithm for receiving Sudoku matrices

Discrete Mathematics Algorithms and Applications ◽ 10.1142/s1793830917500380 ◽ 2017 ◽ Vol 09 (03) ◽ pp. 1750038

Author(s): Krasimir Yordzhev

Keyword(s): Theoretical Approach ◽ Efficient Algorithm ◽ To Receive ◽ Input Formula

This work examines the problem of describing an efficient algorithm for obtaining [Formula: see text] Sudoku matrices. For this purpose, we define the concepts of a [Formula: see text] [Formula: see text]-matrix and of disjoint [Formula: see text]-matrices. Using a set-theoretical approach, the paper describes an algorithm for obtaining [Formula: see text]-tuples of [Formula: see text] mutually disjoint [Formula: see text] matrices. We show that, given [Formula: see text] mutually disjoint [Formula: see text] matrices as input, it is not difficult to obtain a Sudoku matrix.
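
The dimensions in this abstract are lost to the formula placeholders, so the sketch below assumes the usual definition: a Sudoku matrix is an n²×n² matrix in which every row, every column, and every n×n block contains each of the numbers 1 to n² exactly once. The checker only validates a candidate matrix; it is not the paper's construction from disjoint matrices, and the name `is_sudoku_matrix` is hypothetical.

```python
from typing import List


def is_sudoku_matrix(m: List[List[int]], n: int) -> bool:
    """Check the row, column, and n-by-n block conditions for an n^2 x n^2 matrix."""
    size = n * n
    target = set(range(1, size + 1))
    if len(m) != size or any(len(row) != size for row in m):
        return False
    rows_ok = all(set(row) == target for row in m)
    cols_ok = all({m[r][c] for r in range(size)} == target for c in range(size))
    blocks_ok = all(
        {m[br * n + i][bc * n + j] for i in range(n) for j in range(n)} == target
        for br in range(n) for bc in range(n)
    )
    return rows_ok and cols_ok and blocks_ok
```

With n = 2 this checks 4×4 matrices, and with n = 3 it checks the familiar 9×9 case.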

input formula
Recently Published Documents

Surge - A Fast Open-Source Chemical Graph Generator

Conflict-Driven Satisfiability for Theory Combination: Lemmas, Modules, and Proofs

Recognizing Lexicographically Smallest Words and Computing Successors in Regular Languages

Indexing k-mers in linear space for quality value compression

Recognizing Geometric Trees as Positively Weighted Straight Skeletons and Reconstructing Their Input

A 2-approximation algorithm for the contig-based genomic scaffold filling problem

The Impact of Entropy and Solution Density on Selected SAT Heuristics

Multitime Scale Study of Bursting Activities in the Pre-Bötzinger Complex

nanoCoP: Natural Non-clausal Theorem Proving

On an algorithm for receiving Sudoku matrices
