Learning Verb Classes in an Incremental Model

Author(s):  
Libby Barak ◽  
Afsaneh Fazly ◽  
Suzanne Stevenson
2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Olga Majewska ◽  
Charlotte Collins ◽  
Simon Baker ◽  
Jari Björne ◽  
Susan Windisch Brown ◽  
...  

Abstract. Background: Recent advances in representation learning have enabled large strides in natural language understanding; however, verbal reasoning remains a challenge for state-of-the-art systems. External sources of structured, expert-curated verb-related knowledge have been shown to boost model performance in different Natural Language Processing (NLP) tasks where accurate handling of verb meaning and behaviour is critical. The cost and time required for manual lexicon construction have been a major obstacle to porting the benefits of such resources to NLP in specialised domains, such as biomedicine. To address this issue, we combine a neural classification method with expert annotation to create BioVerbNet. This new resource comprises 693 verbs assigned to 22 top-level and 117 fine-grained semantic-syntactic verb classes. We make this resource available complete with semantic roles and VerbNet-style syntactic frames. Results: We demonstrate the utility of the new resource in boosting model performance in document- and sentence-level classification in biomedicine. We apply an established retrofitting method to harness the verb class membership knowledge from BioVerbNet and transform a pretrained word embedding space by pulling together verbs belonging to the same semantic-syntactic class. The BioVerbNet knowledge-aware embeddings surpass the non-specialised baseline by a significant margin on both tasks. Conclusion: This work introduces the first large, annotated semantic-syntactic classification of biomedical verbs, providing a detailed account of the annotation process, the key differences in verb behaviour between the general and biomedical domains, and the design choices made to accurately capture the meaning and properties of verbs used in biomedical texts. The demonstrated benefits of leveraging BioVerbNet in text classification suggest the resource could help systems better tackle challenging NLP tasks in biomedicine.
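
The retrofitting step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which retrofitting variant was used, so the update rule below is an assumption: it follows one widely used iterative formulation in which each vector is repeatedly averaged between its original position and the current vectors of its class-mates. The function name `retrofit`, the example verbs, and the class label are hypothetical and not taken from BioVerbNet.

```python
def retrofit(embeddings, classes, iterations=10, alpha=1.0):
    """Pull vectors of same-class words together (illustrative sketch).

    embeddings: dict word -> list of floats (the pretrained vectors)
    classes:    dict class label -> list of member words
    alpha:      weight that keeps a word close to its original vector
    """
    # Build the neighbour sets: words sharing a class are neighbours.
    neighbours = {w: set() for w in embeddings}
    for members in classes.values():
        present = [w for w in members if w in embeddings]
        for w in present:
            neighbours[w].update(v for v in present if v != w)

    new = {w: list(vec) for w, vec in embeddings.items()}
    for _ in range(iterations):
        for w, nbrs in neighbours.items():
            if not nbrs:
                continue  # words without class-mates keep their vector
            beta = 1.0 / len(nbrs)  # neighbour weights sum to 1
            dim = len(new[w])
            new[w] = [
                (alpha * embeddings[w][d] + beta * sum(new[v][d] for v in nbrs))
                / (alpha + 1.0)
                for d in range(dim)
            ]
    return new


# Hypothetical toy data: two verbs share a class, one verb is unclassified.
vectors = {"bind": [0.0, 0.0], "attach": [2.0, 0.0], "run": [0.0, 5.0]}
verb_classes = {"hypothetical-class": ["bind", "attach"]}
retrofitted = retrofit(vectors, verb_classes)
```

In this toy setting the two class-mates move towards each other while the unclassified verb is left where it was, which is the "pulling together" effect the abstract describes.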


2020 ◽  
Vol 7 (1) ◽  
Author(s):  
Ari Wibisono ◽  
Petrus Mursanto ◽  
Jihan Adibah ◽  
Wendy D. W. T. Bayu ◽  
May Iffah Rizki ◽  
...  

Abstract. Real-time information mining of a big dataset consisting of time series data is a very challenging task. For this purpose, we propose using the mean distance and the standard deviation to enhance the accuracy of the existing fast incremental model trees with drift detection (FIMT-DD) algorithm. The standard FIMT-DD algorithm uses the Hoeffding bound as its splitting criterion. We propose the further use of the mean distance and standard deviation, which are used to split a tree more accurately than the standard method. We verify our proposed method using the large Traffic Demand Dataset, which consists of 4,000,000 instances; Tennet’s big wind power plant dataset, which consists of 435,268 instances; and a road weather dataset, which consists of 30,000,000 instances. The results show that our proposed FIMT-DD algorithm improves the accuracy compared to the standard method and the Chernoff bound approach. The measured errors demonstrate that our approach results in a lower Mean Absolute Percentage Error (MAPE) in every stage of learning by approximately 2.49% compared with the Chernoff bound method and 19.65% compared with the standard method.
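
As a rough illustration of the Hoeffding-bound splitting criterion that the abstract attributes to standard FIMT-DD, the sketch below computes the textbook bound and applies it to the ratio of the two best split merits. The abstract gives no formulas, so the decision rule, the default `delta`, and the function names are assumptions; the proposed mean-distance and standard-deviation extension is not shown because the abstract does not define it.

```python
import math


def hoeffding_bound(value_range, delta, n):
    """Textbook Hoeffding bound: epsilon = sqrt(R^2 * ln(1/delta) / (2n))."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_merit, second_merit, n, delta=1e-7, value_range=1.0):
    """Split once the best candidate is reliably better than the runner-up.

    Assumed rule: compare the ratio second/best (which lies in [0, 1])
    against 1 - epsilon, where epsilon shrinks as more instances arrive.
    """
    if best_merit <= 0:
        return False
    ratio = second_merit / best_merit
    return ratio < 1.0 - hoeffding_bound(value_range, delta, n)


epsilon = hoeffding_bound(1.0, 1e-7, 1000)    # roughly 0.09
clear_winner = should_split(1.0, 0.5, 1000)   # runner-up is far behind
near_tie = should_split(1.0, 0.95, 1000)      # too close to call yet
```

Because the bound shrinks with the number of observed instances, a near tie that blocks a split early on can be resolved later as more data streams in.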


2016 ◽  
Vol 11 (1) ◽  
pp. 76-93
Author(s):  
Michael Richter ◽  
Roeland van Hout

This paper investigates set-theoretical transitive and intransitive similarity relationships in triplets of verbs that can be deduced from raters’ similarity judgments on the pairs of verbs involved. We collected similarity judgments on pairs made up of 35 German verbs and found that the concept of transitivity adds to the information obtained from collecting pair-wise semantic similarity judgments. The concept of transitive similarity enables more complex relations to be revealed in triplets of verbs. To evaluate the outcomes that we obtained by analyzing transitive similarities we used two previously developed verb classifications of the same set of 35 verbs based on the analysis of large corpora (Richter & van Hout, 2016). We applied a modified form of weak stochastic transitivity (Block & Marschak, 1960; Luce & Suppes, 1965; Tversky, 1969) and found that (1), in contrast to Rips’ claim (2011), similarity relations in raters’ judgments systematically turn out to be transitive, and (2) transitivity discloses lexical and aspectual properties of verbs relevant in distinguishing verb classes.
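
To make the idea of transitive similarity in verb triplets concrete, here is a small sketch of a threshold-based check modelled on the textbook form of weak stochastic transitivity: if a is similar to b and b is similar to c, then a should be similar to c. The abstract does not spell out the modified form the authors applied, so the threshold, the data layout, the function name, and the example verbs and scores are all assumptions for illustration only.

```python
from itertools import permutations


def transitivity_violations(sim, items, threshold=0.5):
    """List triplets (a, b, c) where a~b and b~c hold but a~c does not.

    sim maps frozenset({x, y}) to a similarity score in [0, 1];
    a pair counts as similar when its score is at least the threshold.
    """
    def s(x, y):
        return sim[frozenset((x, y))]

    violations = []
    for a, b, c in permutations(items, 3):
        # a < c skips the mirror image (c, b, a) of the same triplet.
        if a < c and s(a, b) >= threshold and s(b, c) >= threshold \
                and s(a, c) < threshold:
            violations.append((a, b, c))
    return violations


# Hypothetical verbs and scores, not taken from the study.
verbs = ["gehen", "laufen", "rennen"]
scores = {
    frozenset(("gehen", "laufen")): 0.8,
    frozenset(("laufen", "rennen")): 0.9,
    frozenset(("gehen", "rennen")): 0.2,
}
broken = transitivity_violations(scores, verbs)
```

An empty result for a set of verbs means every triplet is transitive under the chosen threshold.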


Author(s):  
Dmitry Ganenkov ◽  
Timur Maisak

The chapter is a survey of the Nakh-Daghestanian family (also known as East Caucasian), one of the indigenous language families spoken in the Caucasus. The family comprises more than 30 languages, some of which are spoken by only a few hundred people and remain unwritten and/or underdescribed. The chapter provides general information about the sociolinguistic status of Nakh-Daghestanian languages and the history of their research as well as their phonology, morphology, syntax, and lexicon. The languages of the family have rich consonant systems and are morphologically ergative, head-final, with rich case systems, complex verbal paradigms, and pervasive gender-number agreement. Alongside the major transitive and intransitive lexical verb classes, verbs of perception and cognition with the dative experiencer subject usually comprise one or more minor valency classes with non-canonically marked subjects. Among valency-increasing derivations, the causative is the most prominent. The most typical subordination strategies are non-finite, making use of participles, converbs, infinitives and verbal nouns.


2018 ◽  
Vol 22 (8) ◽  
pp. 4565-4581 ◽  
Author(s):  
Florian U. Jehn ◽  
Lutz Breuer ◽  
Tobias Houska ◽  
Konrad Bestian ◽  
Philipp Kraft

Abstract. The ambiguous representation of hydrological processes has led to the formulation of the multiple hypotheses approach in hydrological modeling, which requires new ways of model construction. However, most recent studies focus only on the comparison of predefined model structures or building a model step by step. This study tackles the problem the other way around: we start with one complex model structure, which includes all processes deemed to be important for the catchment. Next, we create 13 additional simplified models, where some of the processes from the starting structure are disabled. The performance of those models is evaluated using three objective functions (logarithmic Nash–Sutcliffe; percentage bias, PBIAS; and the ratio between the root mean square error and the standard deviation of the measured data). Through this incremental breakdown, we identify the most important processes and detect the restraining ones. This procedure allows constructing a more streamlined, subsequent 15th model with improved model performance, less uncertainty and higher model efficiency. We benchmark the original Model 1 and the final Model 15 with HBV Light. The final model is not able to outperform HBV Light, but we find that the incremental model breakdown leads to a structure with good model performance, fewer but more relevant processes and fewer model parameters.
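
The three objective functions named in the abstract can be sketched as follows. These are common textbook definitions rather than the authors' code: the small offset `eps` in the logarithmic Nash–Sutcliffe efficiency and the sign convention for PBIAS (positive values meaning the simulation underestimates) are assumptions, and other tools may use the opposite sign.

```python
import math


def log_nse(obs, sim, eps=1e-6):
    """Nash-Sutcliffe efficiency on log-transformed flows (1 is perfect)."""
    log_obs = [math.log(o + eps) for o in obs]
    log_sim = [math.log(s + eps) for s in sim]
    mean_obs = sum(log_obs) / len(log_obs)
    residual = sum((o - s) ** 2 for o, s in zip(log_obs, log_sim))
    variance = sum((o - mean_obs) ** 2 for o in log_obs)
    return 1.0 - residual / variance


def pbias(obs, sim):
    """Percentage bias; 0 is ideal, positive means underestimation here."""
    return 100.0 * sum(o - s for o, s in zip(obs, sim)) / sum(obs)


def rsr(obs, sim):
    """RMSE divided by the standard deviation of the observations."""
    mean_obs = sum(obs) / len(obs)
    residual = sum((o - s) ** 2 for o, s in zip(obs, sim))
    variance = sum((o - mean_obs) ** 2 for o in obs)
    return math.sqrt(residual) / math.sqrt(variance)


observed = [1.0, 2.0, 3.0]      # hypothetical discharge values
simulated = [0.5, 1.0, 1.5]     # hypothetical model output (half of observed)
bias = pbias(observed, simulated)
```

A simulation that always predicts the observed mean gives an RSR of 1, so values well below 1 indicate that a model adds information beyond the mean.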

