Technical terminology: some linguistic properties and an algorithm for identification in text

John S. Justeson; Slava M. Katz

doi:10.1017/s1351324900000048

Natural Language Engineering ◽

10.1017/s1351324900000048 ◽

1995 ◽

Vol 1 (1) ◽

pp. 9-27 ◽

Cited By ~ 258

Author(s):

John S. Justeson ◽

Slava M. Katz

Keyword(s):

Noun Phrase ◽

Noun Phrases ◽

Technical Vocabulary ◽

Technical Terms

This paper identifies some linguistic properties of technical terminology, and uses them to formulate an algorithm for identifying technical terms in running text. The grammatical properties discussed are preferred phrase structures: technical terms consist mostly of noun phrases containing adjectives, nouns, and occasionally prepositions; rarely do terms contain verbs, adverbs, or conjunctions. The discourse properties are patterns of repetition that distinguish noun phrases that are technical terms, especially those multi-word phrases that constitute a substantial majority of all technical vocabulary, from other types of noun phrase. The paper presents a terminology identification algorithm that is motivated by these linguistic properties. An implementation of the algorithm is described; it recovers a high proportion of the technical terms in a text, and a high proportion of the recovered strings are valid technical terms. The algorithm proves to be effective regardless of the domain of the text to which it is applied.
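The two properties named in the abstract above (a restricted phrase structure and repetition) can be illustrated with a minimal sketch. The abstract does not state the exact filter, so everything specific here is an assumption: the part-of-speech pattern (adjectives and nouns, optionally one noun-preposition pair, ending in a noun), the repetition threshold `min_count=2`, the length cap `max_len`, and the tag names in `TAG_CODES` are hypothetical choices for illustration. Input is assumed to be already tokenized and tagged.

```python
import re
from collections import Counter

# Hypothetical mapping from tagger output to one-letter codes:
# A = adjective, N = noun, P = preposition; every other tag becomes "O".
TAG_CODES = {"ADJ": "A", "NOUN": "N", "ADP": "P"}

# Assumed structural filter: a multi-word string of adjectives and nouns,
# optionally containing one noun + preposition pair, and ending in a noun.
CANDIDATE = re.compile(r"(?:[AN]+|[AN]*NP[AN]*)N")


def candidate_terms(tagged_tokens, min_count=2, max_len=5):
    """Return {phrase: count} for repeated phrases that pass the filter.

    tagged_tokens: list of (word, tag) pairs.
    min_count and max_len are illustrative defaults, not values taken
    from the abstract.
    """
    codes = "".join(TAG_CODES.get(tag, "O") for _, tag in tagged_tokens)
    words = [word.lower() for word, _ in tagged_tokens]
    counts = Counter()
    n = len(codes)
    for start in range(n):
        # Only spans of at least two tokens are considered.
        for end in range(start + 2, min(start + max_len, n) + 1):
            if CANDIDATE.fullmatch(codes, start, end):
                counts[" ".join(words[start:end])] += 1
    # Repetition filter: keep phrases seen at least min_count times.
    return {term: c for term, c in counts.items() if c >= min_count}
```

This sketch counts every matching span, so a sub-phrase nested inside a longer candidate is counted as well; a fuller implementation would have to decide how to treat such nested occurrences.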

A Machine Learning Approach to Coreference Resolution of Noun Phrases

Computational Linguistics ◽

10.1162/089120101753342653 ◽

2001 ◽

Vol 27 (4) ◽

pp. 521-544 ◽

Cited By ~ 287

Author(s):

Wee Meng Soon ◽

Hwee Tou Ng ◽

Daniel Chung Yong Lim

Keyword(s):

Machine Learning ◽

Noun Phrase ◽

State Of The Art ◽

Noun Phrases ◽

Learning Approach ◽

Data Sets ◽

Coreference Resolution ◽

Machine Learning Approach

In this paper, we present a learning approach to coreference resolution of noun phrases in unrestricted text. The approach learns from a small, annotated corpus and the task includes resolving not just a certain type of noun phrase (e.g., pronouns) but rather general noun phrases. It also does not restrict the entity types of the noun phrases; that is, coreference is assigned whether they are of “organization,” “person,” or other types. We evaluate our approach on common data sets (namely, the MUC-6 and MUC-7 coreference corpora) and obtain encouraging results, indicating that on the general noun phrase coreference task, the learning approach holds promise and achieves accuracy comparable to that of nonlearning approaches. Our system is the first learning-based system that offers performance comparable to that of state-of-the-art nonlearning systems on these data sets.

Abstractions and exemplars: The measure noun phrase alternation in German

Cognitive Linguistics ◽

10.1515/cog-2017-0050 ◽

2018 ◽

Vol 29 (4) ◽

pp. 729-771

Author(s):

Roland Schäfer

Keyword(s):

Noun Phrase ◽

Cognitive Linguistics ◽

Experimental Validation ◽

Native Speakers ◽

Noun Phrases ◽

Prototype Model ◽

Weak Effect ◽

Specific Measure ◽

Corpus Study ◽

Large Corpus

In this paper, an alternation in German measure noun phrases is examined under a varying-abstraction perspective. In a specific measure NP construction, the embedded kind-denoting noun either agrees in case with the measure noun (eine Tasse guter Kaffee ‘a cup of good coffee’) or it stands in the genitive (eine Tasse guten Kaffees). Each of the two alternants is syntactically similar to a non-alternating construction. I propose a prototype model which assigns a common prototypical meaning to each of the alternants and its corresponding non-alternating construction. Based on this, I argue that lexical, morphosyntactic, and stylistic features help to predict the choice of the alternant. A large corpus study is presented which supports this analysis. However, in addition to the prototype effects, an exemplar effect is also shown to influence the choice, namely the relative frequencies with which lemmas occur in the non-alternating constructions. I argue that allowing both prototype and exemplar effects is more adequate than following radical prototype or exemplar approaches. It is also verified in two experiments that the corpus-derived model corresponds to the behaviour of native speakers. The weak effect size of the experimental validation is discussed in the context of corpus-based cognitive linguistics and the validation of corpus-derived models.

Learning and Retaining Technical Vocabulary with Enhancement Activities in an ESP Course

English Language Teaching ◽

10.5539/elt.v11n10p145 ◽

2018 ◽

Vol 11 (10) ◽

pp. 145

Author(s):

Liju Xu

Keyword(s):

Language Learning ◽

English Language ◽

Vocabulary Knowledge ◽

Successful Implementation ◽

Technical Vocabulary ◽

Post Test ◽

Group A ◽

Group B ◽

Knowledge Scale ◽

Technical Terms

The purpose of this paper is to explore the effectiveness of vocabulary enhancement activities on vocabulary learning in an ESP course. In designing the activities, technical terms in journal entries were chosen so that students could acquire the language necessary for the successful performance of accounting majors’ professional tasks. The desirable difficulty approach and the four strands principle (meaning-focused input, meaning-focused output, language-focused learning, and fluency development) served as guidelines for combining subject matter and English language learning. To test the result of the activities, the Vocabulary Knowledge Scale was employed to measure students’ knowledge of 50 vocabulary items. The subjects of the ESP course under discussion were 200 accounting juniors at Guangdong University of Foreign Studies in China. Half of them, in Group A, read the texts and did matching exercises and translation exercises. The other half, in Group B, read the texts and practiced a journal entry activity targeting accounting concepts and terminology. The results revealed that Group B performed better than Group A on a post-test. After the test, reflections on the vocabulary activities were gathered from the participants of Group B. The feedback further proved that the students did benefit from the enhancement activities on selected technical terms.

The Post-Modification of Noun Phrase: Types and Distributions on Abstracts of Local and International Journals

MOZAIK HUMANIORA ◽

10.20473/mozaik.v20i1.16611 ◽

2020 ◽

Vol 20 (1) ◽

pp. 100

Author(s):

Shanty A.Y.P.S Duwila

Keyword(s):

Noun Phrase ◽

Academic Writing ◽

Noun Phrases ◽

Prepositional Phrase ◽

Efl Teachers ◽

Post Modification

Focusing on single and multiple post-modification as a measure of noun phrase complexity in academic writing, this study adopted Berlage’s (2014) types of single and multiple post-modification of noun phrases to investigate the types and distribution of noun phrases in 15 abstracts from accredited local journals and 15 from international journals indexed by Scopus. Subjects, objects, and complements were coded manually and then the noun phrases were extracted from them. The findings revealed that both groups of writers relied heavily on noun phrases involving a prepositional phrase in single post-modification and on noun phrases involving prepositional phrase(s) and coordination(s) in multiple post-modification. This finding may help EFL teachers and material developers provide information and materials about NP post-modifiers that can be used in academic writing.

Bhadarwahi: A Typological Sketch

Acta Linguistica Asiatica ◽

10.4312/ala.5.1.125-148 ◽

2015 ◽

Vol 5 (1) ◽

pp. 125-148

Author(s):

Amitabh Vikram DWIVEDI

Keyword(s):

Noun Phrase ◽

Distinctive Feature ◽

Word Order ◽

Noun Phrases ◽

Direct Object ◽

Prominent Feature ◽

Jammu And Kashmir ◽

Dominant Language ◽

Indirect Object ◽

Main Verb

This paper is a summary of some phonological and morphosyntactic features of the Bhadarwahi language of the Indo-Aryan family. Bhadarwahi is a lesser-known and less documented language spoken in the Doda district of the Jammu region of Jammu and Kashmir State in India. Typologically, it is a subject-dominant language with an SOV word order (SV if without an object), and its verb agrees with a noun phrase which is not followed by an overt post-position. These noun phrases can move freely in the sentence without changing the meaning of the sentence. The indirect object generally precedes the direct object. As in other Indo-Aryan languages, aspiration is a prominent feature of Bhadarwahi. Nasalization is a distinctive feature, and vowel and consonant contrasts are commonly observed. Infinitive and participle forms are formed by suffixation, while infixation is also found in causative formation. Tense is carried by the auxiliary, and aspect and mood are marked by the main verb.

Re-examining the relationship between the subject agreement morpheme and (in)definiteness in Northern Sotho

Literator ◽

10.4102/lit.v34i1.377 ◽

2013 ◽

Vol 34 (1) ◽

Author(s):

Mampaka L. Mojapelo

Keyword(s):

Noun Phrase ◽

Noun Phrases ◽

Subject Position ◽

Primary Role ◽

Northern Sotho ◽

Class 1 ◽

Subject Noun ◽

The Subject ◽

Subject Agreement

The grammatical position of the subject noun phrase in Northern Sotho is to the left of the predicate. The subject agreement morpheme is a compulsory link between the subject noun phrase and the predicate. Scholars have examined the role of this morpheme from various perspectives. It is also extensively documented that the morpheme has dual functions. Its primary function is to mark agreement between the subject and the predicate. Its secondary function is pronominal, whereby it is co-referenced to some antecedent. This article reexamined the primary role of the subject agreement morpheme in Northern Sotho in relation to the interpretation of a subject noun phrase as definite or indefinite. This was accomplished by (1) revisiting existing works that are directly or indirectly linked to (in)definiteness and subject agreement, (2) analysing texts that may facilitate discussion on the issue, and (3) relating the findings from previous works to current analyses. The first hypothesis in this article was that when some class 9 subject noun phrases, denoting persons, agree with the verb stem by a class 1 agreement morpheme, the noun phrases are interpreted as definite. The second hypothesis was that although the subject position is considered predominantly topical and definite it may not categorically exclude indefinite noun phrases. Therefore some indefinite noun phrases may also agree with predicates by means of this morpheme.

Coordinated Bare Definites

Linguistic Inquiry ◽

10.1162/002438903322247551 ◽

2003 ◽

Vol 34 (3) ◽

pp. 443-469 ◽

Cited By ~ 26

Author(s):

Caroline Heycock ◽

Roberto Zamparelli

Keyword(s):

Recent Work ◽

Noun Phrase ◽

Noun Phrases ◽

Complex Constraints

Recent work on the syntax and semantics of functional projections within the noun phrase has had as one goal an explanation for the crosslinguistic distribution of “bare” (determinerless) noun phrases. This article provides an account for an apparent anomaly: the relatively free occurrence of bare noun phrases under coordination. We argue that this construction involves coordination of projections below the DP level, with the coordinated structure subsequently raising to Spec, DP. Our analysis accounts for the fact that these nominals are endowed with uniqueness conditions, but only in some cases, and for a number of other hitherto undocumented facts, including complex constraints on modification.

Producing written noun phrases in French

Written Language & Literacy ◽

10.1075/wll.18.1.01mag ◽

2015 ◽

Vol 18 (1) ◽

pp. 1-24 ◽

Cited By ~ 4

Author(s):

Severine Maggio ◽

Florence Chenu ◽

Guillemette Bes de Berc ◽

Blandine Pesci ◽

Bernard Lété ◽

...

Keyword(s):

Parallel Processing ◽

Noun Phrase ◽

Time Course ◽

Noun Phrases ◽

Frequency Effect ◽

Word Frequency Effect ◽

Written Production ◽

Bare Nouns ◽

Written Word ◽

Consistency Effects

This research compares the time-course of the written production of bare nouns to that of noun phrases. French adults named pictures of objects either using or not using determiners. Resulting pauses and writing rates were analyzed in relation to word-orthographic frequency, syllabic length, and phoneme-to-grapheme consistency at the end of words. More specifically, we showed that the noun production process begins as soon as the determiner production is initiated (word frequency effect on latencies, length and consistency effects on determiner writing rate) and continues during the course of the noun production. When the determiner was absent, the management of writing was different: the writer slowed the production speed, probably in order to carry out the lexeme processing that s/he could not do in the absence of the determiner production time. These results provided further evidence that some form of parallel processing occurs in written word production and led us to sketch the time-course of the noun spelling in written denomination of a noun phrase.

Some Features of the Old Castilian Alfonsine Translation of ʿAlī Ibn Khalaf’s Treatise on the Lámina Universal

Medieval Encounters ◽

10.1163/15700674-12342244 ◽

2017 ◽

Vol 23 (1-5) ◽

pp. 106-123 ◽

Cited By ~ 1

Author(s):

Emilia Calvo

Keyword(s):

Thirteenth Century ◽

Eleventh Century ◽

Arabic Language ◽

Grammatical Structure ◽

Additional Information ◽

Technical Vocabulary ◽

Scientific Circle ◽

Astronomical Instrument ◽

Technical Terms

The aim of the paper is to present some features of the treatise on the lámina universal, an astronomical instrument devised by ʿAlī ibn Khalaf, an eleventh-century Andalusi mathematician and astronomer who belonged to the scientific circle of Ṣāʿid al-Andalusī. ʿAlī ibn Khalaf was a contemporary of Ibn al-Zarqālluh (al-Zarqālī, Azarquiel), also a mathematician and astronomer working under Ṣāʿid’s patronage, and the inventor of the instrument known as azafea. Both instruments, the lámina universal and the azafea, are universal instruments devised to overcome the limitations of the standard astrolabe. The only text describing ʿAlī ibn Khalaf’s instrument is the thirteenth-century old-Castilian Alfonsine translation, which has not been studied in detail up to now, although some preliminary studies have been published. The present study deals with some linguistic and technical difficulties of the text. In many passages, it seems to follow literally the grammatical structure of the Arabic language while in others, the lack of technical terms forced the translators to resort either to a literal transcription of the original Arabic terminology or, in some cases, to approximate translations that make the text somewhat difficult to follow. The paper provides additional information related mainly to the astronomical parameters and the technical vocabulary used in the translation.

Noun Phrase or Compound Noun? An Investigation of N + A and N + N Boundary Cases in Persian

Theory and Practice in Language Studies ◽

10.17507/tpls.0608.25 ◽

2016 ◽

Vol 6 (8) ◽

pp. 1687

Author(s):

Shahram Peyman ◽

Vali Rezai

Keyword(s):

Noun Phrase ◽

Noun Phrases ◽

Special Focus ◽

Demarcation Criterion ◽

Compound Nouns ◽

Almost All ◽

Demarcation Criteria

The present study seeks to investigate the demarcation between noun phrases (NPs) and compound nouns (CNs) in Persian at the syntax-morphology interface. This objective is accomplished through the examination of the two most complex nominal patterns, viz. N + A and N + N, with special focus on boundary cases, i.e. the intermediate constructs which possess some properties of both NPs and CNs simultaneously and thus demonstrate contradictory reactions to the various NP-CN demarcation criteria. The results indicate that boundary cases ensue from partial syntactic erosion of NPs through pure lexicalization, whereby NPs turn into CNs without center-switching or category change. This study also shows that almost all boundary cases have no potential for syntactic modification of their elements. It is further demonstrated that N + A and N + N boundary cases are endocentric, head-initial constructs with optional or obligatory internal inflection as well as Ezafe. Syntactic modifiability is also introduced as the most efficient NP-CN demarcation criterion in Persian since it is the first property lost in the NP lexicalization process.