On the Complexity of CCG Parsing

2018, Vol. 44 (3), pp. 447-482
Author(s): Marco Kuhlmann, Giorgio Satta, Peter Jonsson

We study the parsing complexity of Combinatory Categorial Grammar (CCG) in the formalism of Vijay-Shanker and Weir (1994). As our main result, we prove that any parsing algorithm for this formalism will, in the worst case, take exponential time when the size of the grammar, and not only the length of the input sentence, is included in the analysis. This sets the formalism of Vijay-Shanker and Weir (1994) apart from weakly equivalent formalisms such as Tree-Adjoining Grammar, for which parsing can be performed in time polynomial in the combined size of grammar and input sentence. Our results contribute to a refined understanding of the class of mildly context-sensitive grammars, and inform the search for new, mildly context-sensitive versions of CCG.
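
For readers unfamiliar with the formalism, the following Python sketch illustrates what CCG categories and the two basic combinatory rules, application and composition, look like. It is a minimal toy, not the Vijay-Shanker and Weir (1994) construction analysed in the article (which additionally allows per-grammar rule restrictions); the lexicon entry and category names are invented for illustration.

```python
# Minimal illustrative sketch of CCG categories and two combinatory rules.
# Toy example only; not the Vijay-Shanker and Weir (1994) formalism analysed
# in the article (which additionally allows per-grammar rule restrictions).
from dataclasses import dataclass
from typing import Union

@dataclass(frozen=True)
class Atom:
    name: str                          # atomic category, e.g. S or NP
    def __str__(self): return self.name

@dataclass(frozen=True)
class Slash:
    result: "Category"                 # X in X/Y or X\Y
    arg: "Category"                    # Y in X/Y or X\Y
    dir: str                           # "/" (forward) or "\\" (backward)
    def __str__(self): return f"({self.result}{self.dir}{self.arg})"

Category = Union[Atom, Slash]

def apply_forward(x: Category, y: Category):
    """Forward application: X/Y  Y  =>  X."""
    if isinstance(x, Slash) and x.dir == "/" and x.arg == y:
        return x.result
    return None

def compose_forward(x: Category, y: Category):
    """Forward composition: X/Y  Y/Z  =>  X/Z."""
    if (isinstance(x, Slash) and x.dir == "/"
            and isinstance(y, Slash) and y.dir == "/" and x.arg == y.result):
        return Slash(x.result, y.arg, "/")
    return None

# Worked example: a transitive verb category (S\NP)/NP applied to an NP object.
S, NP = Atom("S"), Atom("NP")
verb = Slash(Slash(S, NP, "\\"), NP, "/")
print(apply_forward(verb, NP))         # prints (S\NP)
```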

2015, Vol. 41 (2), pp. 215-247
Author(s): Marco Kuhlmann, Alexander Koller, Giorgio Satta

The weak equivalence of Combinatory Categorial Grammar (CCG) and Tree-Adjoining Grammar (TAG) is a central result of the literature on mildly context-sensitive grammar formalisms. However, the categorial formalism for which this equivalence has been established differs significantly from the versions of CCG that are in use today. In particular, it allows combinatory rules to be restricted on a per-grammar basis, whereas modern CCG assumes a universal set of rules, isolating all cross-linguistic variation in the lexicon. In this article we investigate the formal significance of this difference. Our main result is that lexicalized versions of the classical CCG formalism are strictly less powerful than TAG.


2021, Vol. 9, pp. 707-720
Author(s): Lena Katharina Schiffer, Andreas Maletti

Tree-adjoining grammar (TAG) and combinatory categorial grammar (CCG) are two well-established mildly context-sensitive grammar formalisms that are known to have the same expressive power on strings (i.e., they generate the same class of string languages). It is demonstrated that their expressive power on trees also essentially coincides. In fact, CCG without lexicon entries for the empty string and with only first-order rules of degree at most 2 is sufficient for its full expressive power.
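
To give a sense of what "degree" refers to here: in generalized composition, the degree is the number of arguments carried over from the secondary functor. A standard degree-2 forward composition schema, in ordinary CCG notation, is shown below (a textbook instance; the exact rule format used by the authors may differ).

```latex
% Generalized forward composition of degree 2: the secondary functor carries
% two arguments (Z_1, Z_2), both of which are inherited by the output category.
\[
  X/Y \quad (Y/Z_1)/Z_2 \;\Rightarrow\; (X/Z_1)/Z_2
\]
```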


2010, Vol. 4
Author(s): Crystal Nakatsu, Michael White

This article introduces Discourse Combinatory Categorial Grammar (DCCG) and shows how it can be used to generate multi-sentence paraphrases, flexibly incorporating both intra- and intersentential discourse connectives. DCCG employs a simple, practical approach to extending Combinatory Categorial Grammar (CCG) to encompass coverage of discourse-level phenomena, which furthermore makes it possible to generate clauses with multiple connectives and, in contrast to approaches based on Rhetorical Structure Theory, with rhetorical dependencies that do not form a tree. To do so, it borrows from Discourse Lexicalized Tree Adjoining Grammar (D-LTAG) the distinction between structural connectives and anaphoric discourse adverbials. Unlike D-LTAG, however, DCCG treats both sentential and discourse phenomena in the same grammar, rather than employing a separate discourse grammar. A key ingredient of this single-grammar approach is cue threading, a tightly constrained technique for extending the semantic scope of a discourse connective beyond the sentence. As DCCG requires no additions to the CCG formalism, it can be used to generate paraphrases of an entire dialogue turn using the OpenCCG realizer as-is, without the need to revise its architecture. In addition, from an interpretation perspective, a single grammar enables easier management of ambiguity across discourse and sentential levels using standard dynamic programming techniques, whereas D-LTAG has required a potentially complex interaction of sentential and discourse grammars to manage the same ambiguity. As a proof of concept, the article demonstrates how OpenCCG can be used with a DCCG to generate multi-sentence paraphrases that reproduce and extend those in the SPaRKy Restaurant Corpus.


2014, Vol. 2, pp. 405-418
Author(s): Marco Kuhlmann, Giorgio Satta

We present a polynomial-time parsing algorithm for CCG, based on a new decomposition of derivations into small, shareable parts. Our algorithm has the same asymptotic complexity, O(n^6), as a previous algorithm by Vijay-Shanker and Weir (1993), but is easier to understand, implement, and prove correct.
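
For orientation, the Python sketch below shows a naive CKY-style chart recognizer for a toy CCG. It is not the algorithm of the article (it implements neither the derivation decomposition nor the bookkeeping needed to reach the O(n^6) bound for full CCG), and the lexicon is invented; it only illustrates the chart-parsing setting in which such algorithms operate.

```python
# Naive CKY-style chart recognizer for a toy CCG (application + composition only).
# Illustrative only: this is not the algorithm of the article and does not attain
# its O(n^6) bound for full CCG; the lexicon below is invented.
from itertools import product

# Categories: atoms are strings; complex categories are ("/" or "\\", result, argument).
def combine(x, y):
    """All categories derivable from adjacent categories x (left) and y (right)."""
    out = set()
    if isinstance(x, tuple) and x[0] == "/" and x[2] == y:
        out.add(x[1])                                   # forward application: X/Y Y => X
    if isinstance(y, tuple) and y[0] == "\\" and y[2] == x:
        out.add(y[1])                                   # backward application: Y X\Y => X
    if (isinstance(x, tuple) and x[0] == "/"
            and isinstance(y, tuple) and y[0] == "/" and x[2] == y[1]):
        out.add(("/", x[1], y[2]))                      # forward composition: X/Y Y/Z => X/Z
    return out

def recognizes(words, lexicon, goal="S"):
    """CKY recursion over spans; chart[i][j] holds the categories derivable for words[i:j]."""
    n = len(words)
    chart = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, w in enumerate(words):
        chart[i][i + 1] = set(lexicon[w])
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for x, y in product(chart[i][k], chart[k][j]):
                    chart[i][j] |= combine(x, y)
    return goal in chart[0][n]

# Toy lexicon (invented): a transitive verb with category (S\NP)/NP.
lexicon = {"Mary": ["NP"], "John": ["NP"], "saw": [("/", ("\\", "S", "NP"), "NP")]}
print(recognizes(["Mary", "saw", "John"], lexicon))     # True
```

With a finite set of categories this loop is the familiar O(n^3) CKY recursion; what makes full CCG harder is that composition can build categories whose size grows with the sentence, so the number of distinct chart entries is not bounded by the grammar alone.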


2020, Vol. 34 (09), pp. 13700-13703
Author(s): Nikhil Vyas, Ryan Williams

All known SAT-solving paradigms (backtracking, local search, and the polynomial method) only yield a 2^(n(1 - 1/O(k)))-time algorithm for solving k-SAT in the worst case, where the big-O constant is independent of k. For this reason, it has been hypothesized that k-SAT cannot be solved in worst-case 2^(n(1 - f(k)/k)) time for any unbounded f : ℕ → ℕ. This hypothesis has been called the “Super-Strong Exponential Time Hypothesis” (Super-Strong ETH), modeled after the ETH and the Strong ETH. We prove two results concerning the Super-Strong ETH:

1. It has also been hypothesized that k-SAT is hard to solve for randomly chosen instances near the “critical threshold”, where the clause-to-variable ratio is 2^k ln 2 - Θ(1). We give a randomized algorithm which refutes the Super-Strong ETH for the case of random k-SAT and planted k-SAT for any clause-to-variable ratio. In particular, given any random k-SAT instance F with n variables and m clauses, our algorithm decides satisfiability for F in 2^(n(1 - Ω(log k)/k)) time, with high probability (over the choice of the formula and the randomness of the algorithm). It turns out that a well-known algorithm from the literature on SAT algorithms does the job: the PPZ algorithm of Paturi, Pudlak, and Zane (1998).

2. The Unique k-SAT problem is the special case where there is at most one satisfying assignment. It is natural to hypothesize that the worst-case (exponential-time) complexity of Unique k-SAT is substantially less than that of k-SAT. Improving prior reductions, we show that the time complexities of Unique k-SAT and k-SAT are very tightly related: if Unique k-SAT is solvable in 2^(n(1 - f(k)/k)) time for an unbounded f, then k-SAT is solvable in 2^(n(1 - f(k)(1 - ε)/k)) time for every ε > 0. Thus, refuting the Super-Strong ETH in the unique-solution case would refute the Super-Strong ETH in general.
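
Since the abstract points to the PPZ algorithm as the workhorse, here is a commented Python sketch of it, reconstructed from the standard description in Paturi, Pudlak, and Zane (1998). It is illustrative only and does not reproduce the analysis for random or planted instances; the clause encoding (sets of signed integer literals) and the helper names are our own.

```python
# Sketch of the PPZ procedure (Paturi, Pudlak, and Zane 1998) named in the abstract.
# Reconstructed from its standard description for illustration; the clause encoding
# (sets of signed integer literals, DIMACS-style) and helper names are our own.
import random

def ppz_round(clauses, n):
    """One randomized PPZ pass over variables 1..n; returns a satisfying
    assignment (dict var -> bool) or None if this pass fails."""
    assignment = {}
    for v in random.sample(range(1, n + 1), n):     # random variable order
        forced = None
        for clause in clauses:
            others = [lit for lit in clause if abs(lit) != v]
            if len(others) == len(clause):
                continue                             # v does not occur in this clause
            # If every other literal is already false, the clause forces v.
            if all(abs(l) in assignment and assignment[abs(l)] != (l > 0) for l in others):
                forced = any(lit == v for lit in clause)
                break
        # Set v as forced; otherwise flip a fair coin.
        assignment[v] = forced if forced is not None else random.random() < 0.5
    satisfied = all(any(assignment[abs(l)] == (l > 0) for l in c) for c in clauses)
    return assignment if satisfied else None

def ppz_solve(clauses, n, rounds=10_000):
    """Repeat independent PPZ rounds. On a satisfiable k-CNF, a single round
    succeeds with probability at least roughly 2^-(n - n/k) (the classical bound)."""
    for _ in range(rounds):
        a = ppz_round(clauses, n)
        if a is not None:
            return a
    return None

# Tiny 3-CNF example: (x1 v x2 v x3) & (not x1 v x2 v not x3)
print(ppz_solve([{1, 2, 3}, {-1, 2, -3}], n=3))
```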


2007, Vol. 33 (3), pp. 355-396
Author(s): Julia Hockenmaier, Mark Steedman

This article presents an algorithm for translating the Penn Treebank into a corpus of Combinatory Categorial Grammar (CCG) derivations augmented with local and long-range word-word dependencies. The resulting corpus, CCGbank, includes 99.4% of the sentences in the Penn Treebank. It is available from the Linguistic Data Consortium, and has been used to train wide-coverage statistical parsers that obtain state-of-the-art rates of dependency recovery. In order to obtain linguistically adequate CCG analyses, and to eliminate noise and inconsistencies in the original annotation, an extensive analysis of the constructions and annotations in the Penn Treebank was called for, and a substantial number of changes to the Treebank were necessary. We discuss the implications of our findings for the extraction of other linguistically expressive grammars from the Treebank, and for the design of future treebanks.


2007, Vol. 18 (04), pp. 715-725
Author(s): Cédric Bastien, Jurek Czyzowicz, Wojciech Fraczak, Wojciech Rytter

Simple grammar reduction is an important component in the implementation of Concatenation State Machines (a hardware version of stateless push-down automata designed for wire-speed network packet classification). We present a comparison and experimental analysis of the best-known algorithms for grammar reduction. There are two approaches to this problem: one that processes compressed strings without decompression, and another that processes strings explicitly. It turns out that the second approach is more efficient in the considered practical scenario despite having worst-case exponential time complexity (whereas the first is polynomial). The study has been conducted in the context of network packet classification, where simple grammars are used for representing the classification policies.

