Dependency profiles in the large-scale analysis of discourse connectives

2018 ◽  
Vol 0 (0) ◽  
Author(s):  
Veronika Laippala ◽  
Aki-Juhani Kyröläinen ◽  
Jenna Kanerva ◽  
Filip Ginter

Abstract This article presents dependency profiles (DPs) as an empirical method to investigate linguistic elements and their application to the study of 24 discourse connectives in the 3.7-billion token Finnish Internet Parsebank (http://bionlp-www.utu.fi/dep_search/). DPs are based on co-occurrence patterns of the discourse connectives with dependency syntax relations. They follow the assumption of usage-based models, according to which the semantic and functional properties of linguistic expressions arise based on their distributional characteristics. We focus on the typical usage patterns reflected by the DPs and the (dis)similarities among discourse connectives that these patterns reveal. We demonstrate that 1) DPs can be analyzed with clustering to obtain linguistically meaningful groupings among the connectives and that 2) the clustering can be combined with support vector machines to obtain generic and stable linguistic characteristics of the discourse connectives. We show that this data-driven method offers support for previous results and reveals novel tendencies outside the scope of studies on smaller corpora. As the method is based on automatic syntactic analysis following the cross-linguistic universal dependencies, it does not require manual annotation and can be applied to a number of languages and in contrastive studies.

2021 ◽  
Author(s):  
M. Tanveer ◽  
A. Tiwari ◽  
R. Choudhary ◽  
M. A. Ganaie

Author(s):  
Denali Molitor ◽  
Deanna Needell

Abstract In today’s data-driven world, storing, processing and gleaning insights from large-scale data are major challenges. Data compression is often required in order to store large amounts of high-dimensional data, and thus, efficient inference methods for analyzing compressed data are necessary. Building on a recently designed simple framework for classification using binary data, we demonstrate that one can improve classification accuracy of this approach through iterative applications whose output serves as input to the next application. As a side consequence, we show that the original framework can be used as a data preprocessing step to improve the performance of other methods, such as support vector machines. For several simple settings, we showcase the ability to obtain theoretical guarantees for the accuracy of the iterative classification method. The simplicity of the underlying classification framework makes it amenable to theoretical analysis.


2014 ◽  
Vol 989-994 ◽  
pp. 2184-2187
Author(s):  
Jie Lv ◽  
Feng Li Deng ◽  
Zhen Guo Yan

This study focused on estimating the chlorophyll concentration of rice using PROSPECT and support vector machines. The study site is located in the West Lake sewage irrigation area of Changchun, Jilin Province. Reflectance spectra of rice were measured with an ASD3 spectrometer, and chlorophyll contents were recorded with a portable SPAD-502 chlorophyll meter. Support vector machines and the PROSPECT model were used to construct hyperspectral models for predicting chlorophyll content. The results indicate that the hyperspectral prediction model of rice chlorophyll content yields a maximum correlation coefficient of 0.8563 and a minimum RMSE of 9.5106, and that prediction accuracy based on the first-derivative spectrum is higher than that based on the original spectrum. This research provides a theoretical basis for large-scale dynamic prediction of rice chlorophyll content in sewage-irrigated areas.


2016 ◽  
Vol 8 (1) ◽  
Author(s):  
Jonathan Alvarsson ◽  
Samuel Lampa ◽  
Wesley Schaal ◽  
Claes Andersson ◽  
Jarl E. S. Wikberg ◽  
...  

2017 ◽  
Vol 235 ◽  
pp. 199-209 ◽  
Author(s):  
Hakan Cevikalp ◽  
Vojtech Franc
