Dependency Tree Semantics: Branching Quantification in Underspecification

Author(s):  
Livio Robaldo
Author(s):  
Shumin Shi, Dan Luo, Xing Wu, Congjun Long, Heyan Huang

Dependency parsing is an important task in Natural Language Processing (NLP). However, a mature parser requires a large treebank for training, and such treebanks remain extremely costly to create. Tibetan is an extremely low-resource language for NLP: no Tibetan dependency treebank is publicly available, and existing ones are built by manual annotation. Furthermore, there is little related research on treebank construction. We propose a novel multi-level chunk-based syntactic parsing method that performs constituent-to-dependency treebank conversion for Tibetan under scarce-resource conditions. Our method mines more dependencies from Tibetan sentences, builds a high-quality Tibetan dependency tree corpus, and makes fuller use of the inherent regularities of the language. We train dependency parsing models on the dependency treebank obtained from this preliminary conversion. The model achieves 86.5% accuracy, 96% LAS, and 97.85% UAS, which exceeds the best results of existing conversion methods. The experimental results show that our method is viable in a low-resource setting: it addresses the scarcity of Tibetan dependency treebanks and also avoids needless manual annotation. The method reflects the regularity of strongly knowledge-guided linguistic analysis, which is of great significance for advancing research on Tibetan information processing.
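The LAS and UAS figures quoted above are the standard attachment scores for dependency parsing. As a minimal sketch (not the authors' evaluation code), assuming each token is represented by a hypothetical `(head_index, relation_label)` pair, they can be computed as follows:

```python
def attachment_scores(gold, pred):
    """Return (UAS, LAS) for one or more sentences flattened into token lists.

    gold, pred: lists of (head_index, relation_label) pairs, one per token.
    This token representation is an assumption made for illustration.
    """
    if len(gold) != len(pred) or not gold:
        raise ValueError("gold and pred must be non-empty and of equal length")
    # UAS: fraction of tokens whose predicted head matches the gold head.
    uas_hits = sum(g[0] == p[0] for g, p in zip(gold, pred))
    # LAS: fraction of tokens whose head AND relation label both match.
    las_hits = sum(g == p for g, p in zip(gold, pred))
    n = len(gold)
    return uas_hits / n, las_hits / n


# Toy example with invented annotations (not data from the paper).
gold = [(2, "nsubj"), (0, "root"), (2, "obj"), (3, "amod")]
pred = [(2, "nsubj"), (0, "root"), (2, "iobj"), (2, "amod")]
uas, las = attachment_scores(gold, pred)
```

Because a labeled match requires a correct head, LAS can never exceed UAS, which is consistent with the 96% LAS and 97.85% UAS reported above.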


Author(s):  
Emily Pitler, Sampath Kannan, Mitchell Marcus

Dependency parsing algorithms capable of producing the types of crossing dependencies seen in natural language sentences have traditionally been orders of magnitude slower than algorithms for projective trees. For 95.8–99.8% of dependency parses in various natural language treebanks, whenever an edge is crossed, the edges that cross it all have a common vertex. The optimal dependency tree that satisfies this 1-Endpoint-Crossing property can be found with an O(n^4) parsing algorithm that recursively combines forests over intervals with one exterior point. 1-Endpoint-Crossing trees also have natural connections to linguistics and to another class of graphs that has been studied in NLP.
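The 1-Endpoint-Crossing property stated above can be checked directly from its definition. The following is a minimal sketch of such a check, not the O(n^4) parsing algorithm itself; it assumes edges are given as pairs of integer word positions, and the function names are hypothetical.

```python
def crosses(e, f):
    """True if edges e and f cross: their spans interleave and share no vertex."""
    a, b = sorted(e)
    c, d = sorted(f)
    return a < c < b < d or c < a < d < b


def is_one_endpoint_crossing(edges):
    """True if, for every edge, all edges crossing it share a common vertex."""
    for e in edges:
        crossing = [f for f in edges if crosses(e, f)]
        if not crossing:
            continue  # an uncrossed edge satisfies the property trivially
        # Intersect the endpoint sets of all edges that cross e.
        common = set(crossing[0])
        for f in crossing[1:]:
            common &= set(f)
        if not common:
            return False
    return True


# Both edges crossing (0, 3) end at vertex 4, so the property holds.
ok_edges = [(0, 3), (1, 4), (2, 4)]
# (1, 4) and (2, 5) both cross (0, 3) but share no vertex, so it fails.
bad_edges = [(0, 3), (1, 4), (2, 5)]
```

This brute-force check is quadratic in the number of edges and only verifies a given edge set; it does not search for the optimal tree.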


2019
Author(s):
Alexandre M. R. Cunha, Kele T. Belloze, Gustavo P. Guedes

Textual data sources may assist in the detection of adverse events not predicted for a particular drug. However, given the amount of information available across several sources, it is reasonable to adopt a computational approach to search these sources for adverse events. In this scenario, we created an extension of CoreNLP to process Brazilian Portuguese texts from the pharmacovigilance domain. We trained three natural language models: a part-of-speech tagger, a parser, and a named entity recognizer. Preliminary results indicate success in generating dependency trees for phrases in the pharmacovigilance domain and in identifying pharmacovigilance named entities.


2017
Author(s):
Eric Lewitus, Hélène Morlon

Understanding the relative influence of various abiotic and biotic variables on diversification dynamics is a major goal of macroevolutionary studies. Recently, phylogenetic approaches have been developed that make it possible to estimate the role of various environmental variables on diversification using time-calibrated species trees, paleoenvironmental data, and maximum-likelihood techniques. These approaches have been effectively employed to estimate how speciation and extinction rates vary with key abiotic variables, such as temperature and sea level, and we can anticipate that they will be increasingly used in the future. Here we compile a series of biotic and abiotic paleodatasets that can be used as explanatory variables in these models and use simulations to assess the statistical properties of the approach when applied to these paleodatasets. We demonstrate that environment-dependent models perform well in recovering environment-dependent speciation and extinction parameters, as well as in correctly identifying the simulated environmental model when speciation is environment-dependent. We explore how the strength of the environment-dependency, tree size, missing taxa, and characteristics of the paleoenvironmental curves influence the performance of the models. Finally, using these models, we infer environment-dependent diversification in three empirical phylogenies: temperature-dependence in Cetacea, δ13C-dependence in Ruminantia, and CO2-dependence in Portulacaceae. We illustrate how to evaluate the relative importance of abiotic and biotic variables in these three clades and interpret these results in light of macroevolutionary hypotheses for mammals and plants. Given the important role paleoenvironments are presumed to have played in species evolution, our statistical assessment of how environment-dependent models behave is crucial for their utility in macroevolutionary analysis.

