Finite State Morphology of the Nguni Language Cluster: Modelling and Implementation Issues

Author(s):  
Laurette Pretorius ◽  
Sonja Bosch
2016 ◽  
Vol 2 (1) ◽  
Author(s):  
Uwe Springmann ◽  
Helmut Schmid ◽  
Dietmar Najock

Abstract: We present the first large-coverage finite-state open-source morphology for Latin (called LatMor) which parses as well as generates vowel quantity information. LatMor is based on the Berlin Latin Lexicon comprising about 70,000 lemmata of classical Latin compiled by the group of Dietmar Najock in their work on concordances of Latin authors (see Rapsch and Najock, 1991) which was recently updated by us. Compared to the well-known Morpheus system of Crane (1991, 1998), which is written in the C programming language, based on 50,000 lemmata of Lewis and Short (1907), not well documented and therefore not easily extended, our new morphology has a larger vocabulary, is about 60 to 1200 times faster and is built in the form of finite-state transducers which can analyze as well as generate wordforms and represent the state-of-the-art implementation method in computational morphology. The current coverage of LatMor is evaluated against Morpheus and other existing systems (some of which are not openly accessible), and is shown to rank first among all systems together with the Pisa LEMLAT morphology (not yet openly accessible). Recall has been analyzed taking the Latin Dependency Treebank as gold data and the remaining defect classes have been identified. LatMor is available under an open source licence to allow its wide usage by all interested parties.


1996 ◽  
Vol 2 (4) ◽  
pp. 331-336 ◽  
Author(s):  
Kimmo Koskenniemi

A source of potential systematic errors in information retrieval is identified and discussed. These errors occur when base form reduction is applied with a (necessarily) finite dictionary. Formal methods for avoiding this error source are presented, along with some practical complexities met in its implementation.


Author(s):  
Dunstan Brown

The purpose of modelling inflectional structure computationally is addressed. It is a good way of checking analyses, and it provides external evidence for their validity. The development of a computational analysis can lead to the discovery of new generalizations about a language’s morphology. Finite state morphology and default inheritance methods are discussed. One question is whether morphological entities such as inflectional classes should be treated in terms of morphological features or whether they should be seen as emerging from the structure of the hierarchy or network. Both inflectional classes and stem classes can be treated as inheritance hierarchies. The issue of the different types of feature involved is raised again when deponency and syncretism are considered. Because it raises these issues, computational modelling allows for subtle distinctions in the treatment of a particular typological phenomenon, as well as providing a better understanding of the basic connections between related phenomena.


1996 ◽  
Vol 2 (4) ◽  
pp. 303-304 ◽  
Author(s):  
Manuel Vilares Ferro ◽  
Jorge Graña Gil ◽  
Pilar Alvariño Alvariño

The full paper describes an environment for the generation of non-deterministic taggers, currently used for the development of a Spanish lexicon. In relation to previous approaches, our system includes the use of verification tools in order to ensure the robustness of the generated taggers. A wide variety of user-defined criteria can be applied for checking the exact properties of the system.

