Full parsing approximation for information extraction via finite-state cascades

2002 ◽  
Vol 8 (2-3) ◽  
pp. 145-165 ◽  
Author(s):  
FABIO CIRAVEGNA ◽  
ALBERTO LAVELLI

This paper proposes a robust approach to parsing suitable for Information Extraction (IE) from texts using finite-state cascades. The approach is characterized by the construction of an approximation of the full parse tree that captures all the information relevant for IE purposes, leaving the other relations underspecified. Sequences of cascades of finite-state rules deterministically analyze the text, building unambiguous structures. Initially basic chunks are analyzed; then clauses are recognized and nested; finally modifier attachment is performed and the global parse tree is built. The parsing approach allows robust, effective and efficient analysis of real world texts. The grammar organization simplifies changes, insertion of new rules and integration of domain-oriented rules. The approach has been tested for Italian, English, and Russian. A parser based on such an approach has been implemented as part of Pinocchio, an environment for developing and running IE applications.
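
The cascade idea in this abstract can be illustrated with a minimal Python sketch. It is not the Pinocchio parser: the single-letter category codes (D, A, N, V), the two rule patterns, and the helper `apply_level` are all assumptions made for this illustration. Each level is one regular expression applied deterministically, left to right, to the output of the previous level.

```python
import re

def apply_level(nodes, pattern, label):
    """Apply one finite-state rule to a sequence of (code, content) nodes.

    Every code is a single character, so the node sequence can be matched
    as a string. Each match is replaced by one new node labelled `label`
    whose content is the list of nodes it covers.
    """
    codes = "".join(code for code, _ in nodes)
    out, pos = [], 0
    for m in re.finditer(pattern, codes):
        out.extend(nodes[pos:m.start()])                  # unmatched nodes pass through
        out.append((label, nodes[m.start():m.end()]))     # matched span becomes one node
        pos = m.end()
    out.extend(nodes[pos:])
    return out

# Hypothetical tagged input: D = determiner, A = adjective, N = noun, V = verb.
tokens = [("D", "the"), ("A", "new"), ("N", "parser"),
          ("V", "reads"), ("D", "the"), ("N", "text")]

# Level 1: basic chunks (noun phrases, code "n").
chunks = apply_level(tokens, r"D?A*N+", "n")
# Level 2: a clause (code "c") built from the chunks of level 1.
clauses = apply_level(chunks, r"nVn", "c")

print([code for code, _ in chunks])    # codes after level 1
print([code for code, _ in clauses])   # codes after level 2
```

Because every level consumes the previous level's output and never backtracks, the analysis stays deterministic, and adding a rule amounts to adding one more `apply_level` call.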

2021 ◽  
pp. 1-27 ◽  
Author(s):  
Brandon de la Cuesta ◽  
Naoki Egami ◽  
Kosuke Imai

Conjoint analysis has become popular among social scientists for measuring multidimensional preferences. When analyzing such experiments, researchers often focus on the average marginal component effect (AMCE), which represents the causal effect of a single profile attribute while averaging over the remaining attributes. What has been overlooked, however, is the fact that the AMCE critically relies upon the distribution of the other attributes used for the averaging. Although most experiments employ the uniform distribution, which equally weights each profile, both the actual distribution of profiles in the real world and the distribution of theoretical interest are often far from uniform. This mismatch can severely compromise the external validity of conjoint analysis. We empirically demonstrate that estimates of the AMCE can be substantially different when averaging over the target profile distribution instead of the uniform distribution. We propose new experimental designs and estimation methods that incorporate substantive knowledge about the profile distribution. We illustrate our methodology through two empirical applications, one using a real-world distribution and the other based on a counterfactual distribution motivated by a theoretical consideration. The proposed methodology is implemented through an open-source software package.
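
The dependence of the AMCE on the averaging distribution can be seen in a small worked example. The sketch below is an illustration only and is not the authors' software package: the function `amce`, the attribute levels, the mean outcomes, and both weight tables are made-up assumptions. It treats the AMCE of one attribute as a weighted average, over the levels of a second attribute, of the difference in mean outcomes.

```python
def amce(mean_outcome, weights, base, treated):
    """Weighted average over the other attribute of the effect of moving
    the attribute of interest from `base` to `treated`.

    mean_outcome maps (level_of_interest, other_level) to a mean response;
    weights maps each other_level to its probability (summing to 1).
    """
    return sum(
        w * (mean_outcome[(treated, b)] - mean_outcome[(base, b)])
        for b, w in weights.items()
    )

# Hypothetical mean responses: the attribute matters only when the other
# attribute takes level "x".
mean_outcome = {
    ("a0", "x"): 0.2, ("a1", "x"): 0.6,
    ("a0", "y"): 0.5, ("a1", "y"): 0.5,
}

uniform = {"x": 0.5, "y": 0.5}   # equal weight on each profile
target = {"x": 0.1, "y": 0.9}    # assumed real-world distribution

amce_uniform = amce(mean_outcome, uniform, "a0", "a1")
amce_target = amce(mean_outcome, target, "a0", "a1")
print(round(amce_uniform, 3), round(amce_target, 3))
```

With these made-up numbers the effect is five times larger under uniform weights than under the target weights, which is the kind of mismatch the abstract warns about.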


Robotics ◽  
2021 ◽  
Vol 10 (2) ◽  
pp. 68
Author(s):  
Lei Shi ◽  
Cosmin Copot ◽  
Steve Vanlanduit

In gaze-based Human-Robot Interaction (HRI), it is important to determine human visual intention for interacting with robots. One typical HRI scenario is that a human selects an object by gaze and a robotic manipulator picks up the object. In this work, we propose an approach, GazeEMD, that can be used to detect whether a human is looking at an object in HRI applications. We use Earth Mover’s Distance (EMD) to measure the similarity between the hypothetical gazes at objects and the actual gazes. Then, the similarity score is used to determine if the human visual intention is on the object. We compare our approach with a fixation-based method and HitScan with a run length in the scenario of selecting daily objects by gaze. Our experimental results indicate that the GazeEMD approach has higher accuracy and is more robust to noise than the other approaches. Hence, users can lessen their cognitive load by using our approach in real-world HRI scenarios.
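
To make the similarity measure concrete, here is a minimal sketch of Earth Mover's Distance between two one-dimensional histograms defined on the same equally spaced bins. It is a simplification for illustration and not the GazeEMD implementation: gaze data are two-dimensional, and the function names, the example histograms, and the threshold value are all assumptions.

```python
def emd_1d(p, q):
    """EMD between two histograms on the same unit-spaced bins.

    Both histograms are normalised to total mass 1. In one dimension the
    distance equals the sum of absolute differences of the cumulative sums.
    """
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    total_p, total_q = sum(p), sum(q)
    carried, work = 0.0, 0.0
    for a, b in zip(p, q):
        carried += a / total_p - b / total_q   # mass still to be moved rightwards
        work += abs(carried)                   # moving it one bin costs |carried|
    return work

def is_looking_at(actual, hypothetical, threshold=0.5):
    """Hypothetical decision rule: small distance means gaze is on the object."""
    return emd_1d(actual, hypothetical) <= threshold

far = emd_1d([1, 0, 0], [0, 0, 1])   # all mass moved across two bins
same = emd_1d([2, 2], [1, 1])        # same shape after normalisation
print(far, same)
```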


2010 ◽  
Vol 72 (1) ◽  
pp. 24-29 ◽  
Author(s):  
Hui-Min Chung ◽  
Kristina Jackson Behan

Authentic assessment exercises are similar to real-world tasks that would be expected by a professional. An authentic assessment in combination with an inquiry-based learning activity enhances students' learning and rehearses them for their future roles, whether as scientists or as informed citizens. Over a period of 2 years, we experimented with two inquiry-based projects; one had traditional scientific inquiry characteristics, and the other used popular culture as the medium of inquiry. We found that activities that incorporated group learning motivated students and sharpened their abilities to apply and communicate their knowledge of science. We also discovered that incorporating popular culture provided “Millennial” students with a refreshing view of science learning and increased their appetites to explore and elaborate on science.


2001 ◽  
Vol 33 (4) ◽  
pp. 657-660
Author(s):  
Frank Tachau

Based on Professor Özbudun's lectures at Bilkent University, this book is at once compact, highly readable, and very insightful. Unlike much current literature on Turkey, the analysis is set in a broad and informed comparative context. Turkey, the author points out, has been left out of comparative political studies, particularly those encompassing the Middle East and southern Europe, in which arguably it could (or should) have been included. Scholarly neglect thus reflects real-world politics: Turkey falls between two worlds, one of which it once largely controlled, the other to which it currently aspires to belong. This lacuna is one of the factors that persuaded Özbudun to publish this volume.


Algorithms ◽  
2021 ◽  
Vol 14 (7) ◽  
pp. 197
Author(s):  
Ali Seman ◽  
Azizian Mohd Sapawi

In the conventional k-means framework, seeding is the first step toward optimization before the objects are clustered. In random seeding, two main issues arise: the clustering results may be less than optimal, and different clustering results may be obtained for every run. In real-world applications, optimal and stable clustering is highly desirable. This report introduces a new clustering algorithm called the zero k-approximate modal haplotype (Zk-AMH) algorithm that uses a simple and novel seeding mechanism known as zero-point multidimensional spaces. The Zk-AMH provides cluster optimality and stability, thereby resolving the aforementioned issues. Notably, the Zk-AMH algorithm yielded identical mean, maximum, and minimum scores over 100 runs, that is, zero standard deviation, which shows its stability. Additionally, when the Zk-AMH algorithm was applied to eight datasets, it achieved the highest mean scores for four datasets, produced an approximately equal score for one dataset, and yielded marginally lower scores for the other three datasets. With its optimality and stability, the Zk-AMH algorithm could be a suitable alternative for developing future clustering tools.
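
The seeding problem described in this abstract can be reproduced with a few lines of Python. The sketch below implements plain Lloyd-style k-means on one-dimensional data, not the Zk-AMH algorithm; the function `kmeans_1d`, the data, and the two seed choices are assumptions picked so that the effect is easy to see. The seeds are fixed by hand instead of drawn at random so that the example is repeatable.

```python
def kmeans_1d(points, seeds, max_iter=100):
    """Plain k-means on 1-D data, started from the given seed centres.

    Returns the final centres and the within-cluster sum of squared errors.
    """
    centers = list(seeds)
    for _ in range(max_iter):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous centre.
        new_centers = [sum(c) / len(c) if c else centers[i]
                       for i, c in enumerate(clusters)]
        if new_centers == centers:   # assignments are stable: converged
            break
        centers = new_centers
    sse = sum((p - centers[i]) ** 2 for i, c in enumerate(clusters) for p in c)
    return centers, sse

points = [0, 1, 10, 11, 20, 21]   # three well-separated pairs

# Two seedings that a random draw could plausibly produce.
centers_bad, sse_bad = kmeans_1d(points, seeds=[0, 1, 10])
centers_good, sse_good = kmeans_1d(points, seeds=[0, 10, 20])
print(centers_bad, sse_bad)
print(centers_good, sse_good)
```

Both runs converge, but the first one stops in a poor local optimum: two seeds land in the same pair, so the remaining two pairs are merged into one cluster. This is why a deterministic seeding rule that gives the same result on every run is attractive.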


2021 ◽  
Vol 178 (1-2) ◽  
pp. 59-76
Author(s):  
Emmanuel Filiot ◽  
Pierre-Alain Reynier

Copyless streaming string transducers (copyless SST) were introduced by R. Alur and P. Černý in 2010 as a one-way deterministic automata model to define transductions of finite strings. Copyless SST extend deterministic finite state automata with a set of variables in which to store intermediate output strings, and those variables can be combined and updated all along the run, in a linear manner, i.e., no variable content can be copied on transitions. It is known that copyless SST capture exactly the class of MSO-definable string-to-string transductions, and are as expressive as deterministic two-way transducers. They enjoy good algorithmic properties. Most notably, their equivalence problem is decidable (in PSpace). On the other hand, HDT0L systems were introduced some time ago, the most prominent result being the decidability of the equivalence problem. In this paper, we propose a semantics of HDT0L systems in terms of transductions, and use it to study the class of deterministic copyful SST. Our contributions are as follows: (i) HDT0L systems and total deterministic copyful SST have the same expressive power; (ii) the equivalence problem for deterministic copyful SST and the equivalence problem for HDT0L systems are inter-reducible, in quadratic time, and as a consequence, equivalence of deterministic SST is decidable; (iii) the functionality of non-deterministic copyful SST is decidable; (iv) determining whether a non-deterministic copyful SST can be transformed into an equivalent non-deterministic copyless SST is decidable in polynomial time.
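
The difference between copyless and copyful updates can be sketched with a toy simulator. This is an illustration under simplifying assumptions and not a construction from the paper: the transducer has a single state, variable names are uppercase while input letters are assumed to be lowercase, and the helpers `run_sst` and `is_copyless` are hypothetical names.

```python
from collections import Counter

def is_copyless(rhs, variables):
    """True if every variable occurs at most once across all right-hand
    sides of one update, i.e. no variable content is copied."""
    uses = Counter(s for symbols in rhs.values() for s in symbols if s in variables)
    return all(count <= 1 for count in uses.values())

def run_sst(update, output, word, variables=("X",)):
    """Run a single-state SST.

    update(ch) returns, for each variable, a list of symbols that mixes
    variable names and output letters; all variables are updated in parallel.
    """
    values = {v: "" for v in variables}
    for ch in word:
        rhs = update(ch)
        # Right-hand sides are evaluated against the old values.
        values = {v: "".join(values.get(s, s) for s in rhs[v]) for v in variables}
    return "".join(values.get(s, s) for s in output)

def reverse_update(ch):
    return {"X": [ch, "X"]}         # X := ch X   (copyless)

def copyful_update(ch):
    return {"X": ["X", "X", ch]}    # X := X X ch (X is used twice)

reversed_word = run_sst(reverse_update, ["X"], "abc")
doubled_word = run_sst(copyful_update, ["X"], "ab")
print(reversed_word, doubled_word)
```

The copyless update reverses its input with output length linear in the input, while the copyful update can make the output grow exponentially, which is one intuition for why the two classes differ in expressive power.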


2010 ◽  
Vol 36 (S1) ◽  
pp. 25-46 ◽  
Author(s):  
WILLIAM BAIN

This article takes up Louise Arbour's claim that the doctrine of the ‘Responsibility to Protect’ is grounded in existing obligations of international law, specifically those pertaining to the prevention and punishment of genocide. In doing so, it argues that the aspirations of the R2P project cannot be sustained by the idea of ‘responsibility’ alone. The article proceeds by arguing that the coherence of R2P depends on an unacknowledged and unarticulated theory of obligation that connects notions of culpability, blame, and accountability with the kind of preventive, punitive, and restorative action that Arbour and others advocate. Two theories of obligation are then offered, one natural, the other conventional, which make this connection explicit. But the ensuing clarity comes at a cost: the naturalist account escapes the ‘real’ world to redeem the intrinsic dignity of all men and women, while the conventionalist account remains firmly tethered to the ‘real’ world in redeeming whatever dignity can be had by way of an agreement. The article concludes by arguing that the advocate of the responsibility to protect can have one or the other, but not both.


2021 ◽  
pp. 298-319
Author(s):  
Lidija Bajuk

Trying to interpret oneself and the other in the world, the traditional Man has established a real world and an otherworld. Specific herbal and animal attributes were ascribed to particular people who allegedly had the power to communicate between worldliness and transcendence. Some human characteristics were also linked with herbal and animal mediators. These attributes were folklorized as miraculous powers. Such supernatural beings from South Slavic traditional conceptions of the world have been largely associated with the pre-Christian deities and their degradations, based on the observed real attributes of the vegetal and animal species. The interdisciplinary comparative treatment of real-unreal motifs in South Slavic folklore through time and space is this article's ethnological, animalistic and anthropological contribution.


2017 ◽  
Author(s):  
Amelia McNamara ◽  
Nicholas J Horton

Data wrangling is a critical foundation of data science, and wrangling of categorical data is an important component of this process. However, categorical data can introduce unique issues in data wrangling, particularly in real-world settings with collaborators and periodically-updated dynamic data. This paper discusses common problems arising from categorical variable transformations in R, demonstrates the use of factors, and suggests approaches to address data wrangling challenges. For each problem, we present at least two strategies for management, one in base R and the other from the ‘tidyverse.’ We consider several motivating examples, suggest defensive coding strategies, and outline principles for data wrangling to help ensure data quality and sound analysis.

