Multiword Modifiers in some Romance languages. Semantic formats and syntactic templates

2016 ◽  
Vol 7 (1) ◽  
pp. 3-34 ◽  
Author(s):  
Valentina Piunno

Abstract: This paper focuses on a specific type of Multiword Expression that is particularly widespread in Italian and in other Romance languages: Multiword Modifiers, i.e. prepositional phrases functioning as modifiers of a noun (Multiword Adjectives) or of a verb (Multiword Adverbs). Drawing on both syntactic and semantic analysis, the paper explores the hypothesis that Multiword Modifiers are formed on the basis of regular syntactic templates, which structure and organize the semantic information associated with words. After a brief presentation of Multiword Lexical Units and the class of Multiword Modifiers, the methodology and the general theoretical framework of the study are explained. The last section is devoted to the analysis of some semantic relations frequently expressed by Multiword Modifiers in Italian, French and Spanish. The investigation aims to demonstrate that all the Romance languages considered make regular use of this kind of analytical resource in adjectival or adverbial function, showing similar patterns and syntactic templates.

Author(s):  
J. Detkova ◽  
V. Novitskiy ◽  
M. Petrova ◽  
V. Selegey ◽  
...  

This paper proposes a new type of representation for word collocations: semantic sketches. The approach was first tested on one of the subcorpora of the General Internet-Corpus of Russian. Semantic sketches extend the idea of word sketches, which are based on grammatical relations between words, by adding semantic information: word meanings and semantic relations between words. The sketches can additionally be annotated with metatextual characteristics. Building such sketches requires semantic markup of the corpora, so we use the partial semantic analysis provided by the Compreno parser. The paper presents examples of the sketches, evaluates the quality of the markup they are based on, and discusses the advantages and disadvantages of the approach.


2017 ◽  
Vol 16 (16) ◽  
pp. 87
Author(s):  
Patricia Fernández Martín

The aim of this work is to examine in greater depth the history of the Spanish constructions {tener/llevar} + participle, focusing on the language of the 16th and 17th centuries and drawing comparisons throughout with other Romance languages, especially Asturian. The starting point is the idea that the problems these constructions pose are essentially due to the dual nature of the participle (adjectival and verbal), which can only be understood as part of a continuum between the pure adjective and the pure verb. We begin by setting out, in the theoretical framework, our concept of participial verbal periphrasis and its application to these constructions in the Spanish of the Golden Age. In the second part, we analyze how these structures work in Classical Spanish, using a corpus made up of three discourse genres (picaresque novels, letters and chronicles of the Indies), each treated in its own subsection, with texts written between 1519 and 1656. The main conclusion is that discourse genres do not affect participle constructions to the same extent that they may affect other grammatical phenomena, such as personal pronouns.


2015 ◽  
pp. 183-200
Author(s):  
Marek Maziarz ◽  
Maciej Piasecki ◽  
Stanisław Szpakowicz ◽  
Joanna Rabiega-Wiśniewska ◽  
Bożena Hojka

Semantic relations between verbs in Polish WordNet 2.0

The noun dominates wordnets. The lexical semantics of verbs is usually under-represented, even if it is essential in any semantic analysis which goes beyond statistical methods. We present our attempt to remedy the imbalance; it begins by designing a sufficiently rich set of wordnet relations for verbs. We discuss and show in detail such a relation set in the largest Polish wordnet. Our design decisions, while as general and language-independent as possible, are mainly informed by our desire to capture the nature and peculiarities of the verb system in Polish.


2013 ◽  
Vol 427-429 ◽  
pp. 1649-1652
Author(s):  
Bo Chen ◽  
Chen Lv ◽  
Dong Hong Ji

Parsing Chinese is a key issue in NLP, and many controversies arise from special Chinese sentence patterns. This paper puts forward a novel model, Feature Structure theory, for the semantic labeling of these patterns. We analyze the difficulties in annotating such sentences and compare Feature Structure with dependency structure. Feature Structure represents more semantic information and more semantic relations. A Feature Graph is a recursive undirected graph that allows nesting and multiple correlations.


Author(s):  
Katrin Erk

Computational semantics performs automatic meaning analysis of natural language. Research in computational semantics designs meaning representations and develops mechanisms for automatically assigning those representations and reasoning over them. Computational semantics is not a single monolithic task but consists of many subtasks, including word sense disambiguation, multi-word expression analysis, semantic role labeling, the construction of sentence semantic structure, coreference resolution, and the automatic induction of semantic information from data. The development of manually constructed resources has been vastly important in driving the field forward. Examples include WordNet, PropBank, FrameNet, VerbNet, and TimeBank. These resources specify the linguistic structures to be targeted in automatic analysis, and they provide high-quality human-generated data that can be used to train machine learning systems. Supervised machine learning based on manually constructed resources is a widely used technique. A second core strand has been the induction of lexical knowledge from text data. For example, words can be represented through the contexts in which they appear (called distributional vectors or embeddings), such that semantically similar words have similar representations. Or semantic relations between words can be inferred from patterns of words that link them. Wide-coverage semantic analysis always needs more data, both lexical knowledge and world knowledge, and automatic induction at least alleviates the problem. Compositionality is a third core theme: the systematic construction of structural meaning representations of larger expressions from the meaning representations of their parts. The representations typically use logics of varying expressivity, which makes them well suited to performing automatic inferences with theorem provers. Manual specification and automatic acquisition of knowledge are closely intertwined. 
Manually created resources are automatically extended or merged. The automatic induction of semantic information is guided and constrained by manually specified information, which is much more reliable. And for restricted domains, the construction of logical representations is learned from data. It is at the intersection of manual specification and machine learning that some of the current larger questions of computational semantics are located. For instance, should we build general-purpose semantic representations, or is lexical knowledge simply too domain-specific, and would we be better off learning task-specific representations every time? When performing inference, is it more beneficial to have the solid ground of a human-generated ontology, or is it better to reason directly with text snippets for more fine-grained and gradual inference? Do we obtain a better and deeper semantic analysis as we use better and deeper manually specified linguistic knowledge, or is the future in powerful learning paradigms that learn to carry out an entire task from natural language input and output alone, without pre-specified linguistic knowledge?
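
As a rough illustration of the distributional idea mentioned above (words represented through the contexts in which they appear, so that similar words receive similar representations), the following minimal Python sketch builds raw co-occurrence count vectors and compares them with cosine similarity. The toy corpus, the window size, and the function names `context_vectors` and `cosine` are assumptions made for this example only; real systems typically use much larger corpora and weighted or learned embeddings.

```python
from collections import Counter
from math import sqrt


def context_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` positions of it."""
    vectors = {}
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            # Neighbours on both sides, excluding the word itself.
            context = tokens[lo:i] + tokens[i + 1:hi]
            vectors.setdefault(word, Counter()).update(context)
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(count * v[key] for key, count in u.items())
    norm_u = sqrt(sum(c * c for c in u.values()))
    norm_v = sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy corpus, invented for illustration only.
corpus = [
    "the cat drinks milk",
    "the dog drinks water",
    "the cat chases the mouse",
    "the dog chases the ball",
    "a theorem needs a proof",
]
vectors = context_vectors(corpus)
sim_cat_dog = cosine(vectors["cat"], vectors["dog"])
sim_cat_theorem = cosine(vectors["cat"], vectors["theorem"])
```

In this toy corpus "cat" and "dog" share most of their contexts and so score higher than "cat" and "theorem", which share none. Raw counts are dominated by frequent function words such as "the", which is one reason practical systems apply weighting schemes or learn dense embeddings instead.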


2005 ◽  
Vol 04 (02) ◽  
pp. 133-138
Author(s):  
D. Manjula ◽  
T. V. Geetha

The traditional Boolean word-based approach to information retrieval (IR) indexes documents by words alone. Irrelevant information is retrieved because semantic information such as word senses and word context is not included. This work considers the importance of representing documents along a further semantic dimension in addition to sense and context information. Incorporating semantic relations as an additional dimension gives better insight into the interpretation of a document. The micro-contexts generated from the documents are also used in indexing. Retrieval performance is measured in terms of precision and recall, and the tabulated results show improved performance.
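
For readers unfamiliar with the two evaluation measures named above, here is a minimal Python sketch of how precision and recall are conventionally computed for a single query. The document identifiers and the function name `precision_recall` are hypothetical and are not taken from the paper.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    precision = relevant documents retrieved / all documents retrieved
    recall    = relevant documents retrieved / all relevant documents
    """
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical document identifiers, for illustration only.
retrieved_docs = ["d1", "d2", "d3", "d4"]
relevant_docs = ["d2", "d4", "d5"]
p, r = precision_recall(retrieved_docs, relevant_docs)
```

Here two of the four retrieved documents are relevant, and two of the three relevant documents were found, so precision is 0.5 and recall is 2/3.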


2004 ◽  
Vol 26 (2) ◽  
pp. 311-328
Author(s):  
Mireille Piot

Summary: In this paper, we present a unified hypothesis about “focusing” conjunctional adverbs and subordinating conjunctions in French. Following Piot (2003), we argue that a similar hypothesis holds for all Romance languages. First, we observe the differences between this case (with “focus”) and the case in which the same conjunctional items are pure conjunctions (coordinating or subordinating, but without “focus”). Then, we identify the common semantic and syntactic properties of the global “focus” operation associated with all these items: parallelism between the sentences and nominal phrases correlated by these conjunctional items, and semantic relations of inclusion or union between nominal phrases, in some respects akin to set-theoretic relations (the addition of specific syntactic-semantic items shares this inclusion or union relation). In particular, the study highlights the role of the nature of the subject in the first conjoined sentence and the syntactic-semantic nature of the verb phrase in the second sentence. Another study (to appear) will present results on how this operation differs according to the particular meaning of each item.


2015 ◽  
Vol 2015 ◽  
pp. 1-9 ◽  
Author(s):  
Lin Guo ◽  
Wanli Zuo ◽  
Tao Peng ◽  
Lin Yue

The diversity of large-scale semistructured data makes the extraction of implicit semantic information extremely difficult. This paper proposes an automatic, unsupervised method of text categorization in which tree-shaped structures are used to represent semantic knowledge and to explore implicit information by mining hidden structures without cumbersome lexical analysis. Mining implicit frequent structures in trees can discover both direct and indirect semantic relations, which greatly improves the accuracy of matching and classifying texts. The experimental results show that the proposed algorithm markedly reduces the time and effort spent on training and classification and outperforms established competitors in correctness and effectiveness.

