Bootstrapping a Persian Dependency Treebank
Keyword(s): Data Set
This paper presents an ongoing project whose goal is to create a freely available dependency treebank for Persian. The data is taken from the Bijankhan corpus, which is already annotated for parts of speech, and a syntactic dependency annotation based on the Stanford Typed Dependencies is added through a bootstrapping procedure involving the open-source dependency parser MaltParser. We report preliminary parsing experiments with promising results after training the parser on a manually annotated seed data set of 215 sentences.
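The abstract names a bootstrapping procedure without spelling out its steps. The following is a minimal sketch of how such a loop is commonly organised, not the authors' implementation: a parser is trained on the current treebank, used to pre-annotate a batch of new sentences, and the manually corrected output is added back before retraining. All names here (`bootstrap`, `train`, `correct`, `batch_size`) are hypothetical; `train` merely stands in for a parser trainer such as MaltParser and is not its API, and `correct` stands in for a human annotator.

```python
from typing import Callable, List, Sequence, Tuple

# A token is (form, pos_tag); an arc is (head_index, relation), head 0 = root.
Token = Tuple[str, str]
Arc = Tuple[int, str]
Sentence = List[Token]
Tree = List[Arc]
Annotated = Tuple[Sentence, Tree]
Model = Callable[[Sentence], Tree]


def bootstrap(
    seed: List[Annotated],
    unlabeled: Sequence[Sentence],
    train: Callable[[List[Annotated]], Model],   # hypothetical trainer
    correct: Callable[[Sentence, Tree], Tree],   # hypothetical manual correction
    batch_size: int,
) -> List[Annotated]:
    """Grow a treebank from a seed set by parse, correct, retrain."""
    treebank = list(seed)  # copy so the caller's seed list is not mutated
    for start in range(0, len(unlabeled), batch_size):
        # Retrain on everything annotated so far.
        model = train(treebank)
        for sentence in unlabeled[start:start + batch_size]:
            predicted = model(sentence)
            # The corrected tree, not the raw prediction, enters the treebank.
            treebank.append((sentence, correct(sentence, predicted)))
    return treebank
```

In this sketch the seed would correspond to the 215 manually annotated sentences mentioned above; the batch size and the number of rounds are assumptions, since the abstract does not state them.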
2007, pp. 25-46