Extracting Significant Words in Engineering Texts for Specialised Language Descriptions

The academic discourse of a specialised language is characterised by specialised and technical vocabulary and lexicogrammar. Studies on language description suggest the need to explore and determine the specific characteristics of the academic discourse of each specialised language in order to serve learners' language needs. This study demonstrates an exploration of this discipline specificity by looking at the nouns used in one specialised language, Engineering English. It integrates a multivariate technique, Correspondence Analysis (CA), as a tool to extract significant nouns in a specialised language for further scrutiny of language use. CA allows visual representation of word interrelationships across different genres in a specialised language. To exemplify this, an Engineering English Corpus (E2C) was created. E2C is composed of two sub-corpora (genres): Engineering reference books (RBC) and online journal articles (EJC). The British National Corpus (BNC) was used as the reference corpus. Thirty key-key-nouns were identified from the E2C, and the frequency lists of these words were retrieved from all the corpora to run the CA. The CA maps of the nouns display how these corpora differ from each other, as well as which words distinguish not only E2C from a general corpus (BNC) but also the different genres within E2C. CA thus shows potential as a tool for displaying the words that characterise a specialised corpus against a general corpus and the genres within that specialised corpus. This study suggests that more informed descriptions of a specialised language can be made by identifying the specific and significant vocabulary for any academic discourse investigation.
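As a rough illustration of what CA works with, the sketch below computes the standardized residuals of a word-by-corpus frequency table and the total inertia, which CA then decomposes into the axes of its maps. It is a minimal sketch, not the study's procedure: the function name `standardized_residuals` and all counts are hypothetical, and the decomposition step itself (a singular value decomposition of the residual matrix) is left out.

```python
from math import sqrt

def standardized_residuals(table):
    """Return the standardized residual matrix of a frequency table and its total inertia."""
    total = sum(sum(row) for row in table)
    n_cols = len(table[0])
    # Row and column masses: each word's and each corpus's share of all counts.
    row_mass = [sum(row) / total for row in table]
    col_mass = [sum(row[j] for row in table) / total for j in range(n_cols)]
    residuals = [
        [
            (table[i][j] / total - row_mass[i] * col_mass[j])
            / sqrt(row_mass[i] * col_mass[j])
            for j in range(n_cols)
        ]
        for i in range(len(table))
    ]
    # Total inertia equals the chi-square statistic divided by the grand total.
    inertia = sum(value ** 2 for row in residuals for value in row)
    return residuals, inertia

# Hypothetical noun frequencies (rows) in three corpora (columns: RBC, EJC, BNC).
counts = [
    [120, 80, 10],
    [60, 150, 15],
    [20, 25, 200],
]
residuals, inertia = standardized_residuals(counts)
```

A positive residual marks a word that is more frequent in a corpus than independence would predict, and a negative one marks a word that is less frequent. In this made-up table the third noun is strongly associated with the general corpus.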

2014 ◽  
Vol 12 (4) ◽  
pp. 319-340
Author(s):  
Anu Koskela

This paper explores the lexicographic representation of a type of polysemy that arises when the meaning of one lexical item can either include or contrast with the meaning of another, as in the case of dog/bitch, shoe/boot, finger/thumb and animal/bird. A survey of how such pairs are represented in monolingual English dictionaries showed that dictionaries mostly represent as explicitly polysemous those lexical items whose broader and narrower readings are more distinctive and clearly separable in definitional terms. They commonly only represented the broader readings for terms that are in fact frequently used in the narrower reading, as shown by data from the British National Corpus.  


2016 ◽  
Vol 2 (2) ◽  
pp. 177-204 ◽  
Author(s):  
Gerold Schneider ◽  
Gaëtanelle Gilquin

In research on L2 English, recent corpus-based studies indicate that some non-standard forms are shared by indigenized (ESL) and foreign (EFL) varieties of English, which challenges the idea of a clear dichotomy between innovation and error. We present a data-driven large-scale method to detect innovations, test it on verb + preposition structures (including phrasal verbs) and adjective + preposition structures, and describe similarities and differences between EFL and ESL. We use a dependency-parsed version of the International Corpus of Learner English to automatically extract potential innovations, defined as patterns of overuse compared to the British National Corpus as reference corpus. We measure overuse by means of collocation measures like O/E or T-score, and compare our results with similar results for ESL. In both quantitative and qualitative analyses, we detect similarities between the two varieties (e.g. discuss about) and dissimilarities (e.g. accuse for, only distinctive for EFL). We report more verb/adjective + preposition combinations than previous studies and discuss the roles of analogy and transfer.
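The two collocation measures named above can be sketched in their common textbook form, where the expected frequency assumes the verb and the preposition occur independently of each other. This is a minimal sketch with hypothetical counts and function names. The abstract does not say how the authors computed expected frequencies or how they compared the scores with the reference corpus, so none of this should be read as their implementation.

```python
from math import sqrt

def expected_frequency(freq_word1, freq_word2, corpus_size):
    """Expected co-occurrence count if the two words were independent."""
    return freq_word1 * freq_word2 / corpus_size

def observed_over_expected(observed, freq_word1, freq_word2, corpus_size):
    """O/E ratio: values above 1 mean the pair occurs more often than chance predicts."""
    return observed / expected_frequency(freq_word1, freq_word2, corpus_size)

def t_score(observed, freq_word1, freq_word2, corpus_size):
    """T-score: (O - E) / sqrt(O), which favours frequent, well-attested pairs."""
    expected = expected_frequency(freq_word1, freq_word2, corpus_size)
    return (observed - expected) / sqrt(observed)

# Hypothetical counts for one verb + preposition pair in a 1,000,000-token corpus.
observed = 50        # co-occurrences of the pair
freq_verb = 1_000    # occurrences of the verb
freq_prep = 20_000   # occurrences of the preposition
corpus_size = 1_000_000

oe = observed_over_expected(observed, freq_verb, freq_prep, corpus_size)
t = t_score(observed, freq_verb, freq_prep, corpus_size)
```

Computing the same scores for a learner corpus and a reference corpus, and then comparing them, is one way to operationalise overuse. O/E tends to reward rare pairs, while the T-score rewards frequent ones, which is one reason to report both.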


Author(s):  
Ike Susanti Effendi ◽  
Riska Amalia ◽  
Sakinah Asa Lalita

The study of (near-)synonymous words has been an intriguing topic in recent decades. Scholars have investigated them from diverse perspectives, including but not limited to semantics, grammar, and language teaching. However, few of them examine synonymous verbs. This study scrutinizes ‘announce’, ‘declare’, and ‘state’ using a descriptive qualitative approach, with the British National Corpus as the data source. It also attempts to shed light on the pedagogical implications of corpus linguistics for teaching vocabulary and meaning in use. Sketch Engine is used as the analysis instrument, with collocation and concordance analysis employed to show how word combinations and contexts produce meaning. The findings demonstrate that ‘announce’, ‘declare’, and ‘state’ cannot simply be used interchangeably, since they carry (slightly) different meanings depending on the collocate and the grammatical pattern. The study also corroborates the notion that corpus linguistics plays a significant role in foreign language teaching, since it offers authentic materials and contextual clues for language use.


Languages ◽  
2020 ◽  
Vol 5 (4) ◽  
pp. 62
Author(s):  
Aziz Thabit Saeed ◽  
Saleh Al-Salman

While acknowledging that different conjunctions define different relationships between ideas, this study focuses on the interpretation of four subordinating conjunctions, namely ‘because’, ‘since’, ‘for’ and ‘as’, in causal clauses. Since the meanings of these conjunctions vary according to context, among other things, they often pose problems for language users, leading to misinterpretations. This stems not only from the fine distinctions in usage between them but also from a lack of adequate knowledge of the rules of language use in their metatheoretical framework. This kind of knowledge is crucial in interactive communication, speech acts, pragmatics, logical arguments and multidisciplinary debates. The data for this study were compiled from grammar books, articles, and the British National Corpus (BNC). The data were analyzed not only to identify the discrete features of each conjunction that render it different from its synonymous counterparts, but also to understand the kind of knowledge required to determine the choice. The findings reveal that, in addition to syntactic constraints, the degree of ‘givenness’ or ‘newness’ of the information that the conjunction introduces, the context, the degree of formality of the register, and the lexical density of the utterance containing the conjunction all play a role.


2017 ◽  
Vol 4 (4) ◽  
pp. 46
Author(s):  
Inesa Šeškauskienė

The paper sets out to examine the lemma argument* in English and Lithuanian academic discourse. Supporting the claim that academic discourse is largely metaphorical, the present investigation is driven by the conceptual theory of metaphor and aims to uncover the metaphors manifested in the contexts of argument/s and argumentation. The data has been collected from the academic section of the British National Corpus (BNC) and the Corpus of Academic Lithuanian (CoraLit). The results have demonstrated that English and Lithuanian share a number of metaphors, such as research / argument is an object, research / argument is a building / structure, research / argument is a person, research / argument is verbal communication and some others. However, the image rendered by the argument in both languages seems to be different—English gives preference to the ‘embodied’ argument, whereas Lithuanian is more confined to treating it as an object. The research has also uncovered interesting language-specific realisation of all metaphors.


2018 ◽  
Vol 8 (7) ◽  
pp. 59
Author(s):  
Ibrahim Bashir ◽  
Kamariah Yunus ◽  
Tamer Mohammed Al-Jarrah

This is a corpus-based study of the uses and functions of the modal verbs “will” and “shall” in Nigerian legal discourse. It examines their pragmatic functions as hedges and, specifically, how hedges are used in legal texts to indicate precision and uncertainty. To achieve these objectives, a specialised corpus, named the “Nigerian Law Corpus” (NLC), was constructed from Nigerian court proceedings and law reports; it contains 546,313 word tokens. A reference corpus of law with 2.2 million word tokens, drawn from the British National Corpus (BNC), was retrieved for comparison with NLC. Two concordance tools were used to analyse the data: “AntConc version 3.5”, a semi-automated computer-aided tool, and “Lextutor version 7”, a web-based tool. The frequency distribution showed that the modal verb “will” occurred 493 times in NLC and 7,711 times in BNC Law, while “shall” occurred 401 times in NLC and 1,348 times in BNC Law. Standardised to concordance hits per million, “shall” was more frequent in NLC than in BNC Law (NLC = 734, BNC Law = 589), while “will” was less frequent in NLC (902 instances per million) than in BNC Law (3,369 instances per million). The study also enumerates the different semantic and pragmatic functions of “will” and “shall” in legal discourse, citing examples from both the target corpus (NLC) and the reference corpus (BNC Law). Some of their functions as hedges (conveying the truth value of a proposition) are epistemic meanings: politeness, obligation, precision, duty, intention, and permission.
In a nutshell, the results indicate that “will” and “shall” are used by legal practitioners, especially lawyers in the courtroom, to achieve precision in their arguments and to persuade the court by showing the true value of their commitment to the proposition.
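The per-million figures reported for NLC can be reproduced from the raw counts in the abstract. The sketch below assumes the usual normalisation (hits divided by corpus size, multiplied by one million); the function name `per_million` is hypothetical. The BNC Law figures are not recomputed, because the abstract gives that corpus size only approximately (2.2 million tokens).

```python
def per_million(hits, corpus_tokens):
    """Normalise a raw frequency to occurrences per million tokens."""
    return hits / corpus_tokens * 1_000_000

NLC_TOKENS = 546_313  # corpus size stated in the abstract

shall_nlc = per_million(401, NLC_TOKENS)  # raw count of "shall" in NLC
will_nlc = per_million(493, NLC_TOKENS)   # raw count of "will" in NLC

print(round(shall_nlc), round(will_nlc))  # prints: 734 902
```

Normalising in this way is what makes counts from corpora of different sizes comparable: 401 hits in NLC and 1,348 hits in BNC Law say little until each is scaled by its corpus size.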


Author(s):  
Ute Römer ◽  
Selahattin Yilmaz

Using data from the International Corpus of Learner English (ICLE) and the British National Corpus (BNC), this article examines what Turkish learners of English know about a set of frequent verb-argument constructions (VACs, such as ‘V with n’ as illustrated by ‘I like to go with the flow’) and in what ways their VAC knowledge is influenced by native English usage and by transfer from their first language (L1), Turkish. An ICLE Turkish analysis gave us access to dominant verb-VAC associations in Turkish learners’ English, and provided insights into the productivity and predictability of selected constructions. Comparisons with the BNC and other ICLE subsets (ICLE German and ICLE Spanish) allowed us to determine how strong the usage effect is on Turkish learners’ verb-VAC associations and whether Turkish learners differ in this respect from learners of other typologically different L1s. Potential effects of L1 transfer were explored with the help of a large reference corpus of Turkish, the Turkish National Corpus (TNC).


2021 ◽  
Vol 9 (2) ◽  
pp. 30-50
Author(s):  
Sylvia Dimitrova ◽  
Temenuzhka Seizova-Nankova ◽  

The paper presents a corpus-based analysis of the predicative use of the adjective “ashamed” giving a full description of its complementation patterns with the help of the Valency Theory (VT – Herbst et al., 2004). The findings are based on a reference corpus extracted from the British National Corpus (BNC) by using the SkE software. The analysis reveals the advantages of the approach used for learners at levels B1 and B2 while, on the other hand, it shows the insufficiency of information found in the main English dictionaries (OALD, LDCE, etc.). It also demonstrates how both language learning and teaching, and materials production could be optimized using the corpus-based analysis.


2015 ◽  
Vol 24 (1) ◽  
pp. 3-22 ◽  
Author(s):  
Aletta G Dorst

This article presents a quantitative cross-register comparison of the forms and frequency of linguistic metaphor in fiction based on a 45,000-word annotated corpus containing excerpts from 12 contemporary British-English novels sampled from the British National Corpus. The results for fiction are compared to those for three other registers, namely news texts, academic discourse and conversations. The linguistic manifestations of metaphor in the corpus were identified using the MIPVU procedure (Steen et al., 2010), a revised and extended version of the original Metaphor Identification Procedure, or MIP, as developed by the Pragglejaz Group (2007). Contrary to common expectations, fiction was not the register with the highest number of metaphors, but was situated in between academic discourse and news on the one hand, and conversation on the other. However, it turned out that metaphor signals and direct expressions of metaphor (e.g. simile) were typical of fiction, as has been claimed in the literature (e.g. Goatly, 1997; Lodge, 1977; Sayce, 1953). Based on these quantitative findings, this article will show that fiction does not contain more metaphors than the other registers, but rather, different ones.


2014 ◽  
Vol 1030-1032 ◽  
pp. 2689-2692
Author(s):  
Yong Mei Peng ◽  
Yun Hua Qu

This paper examines the connectives used in spoken English by Chinese English majors and the characteristics of that use. The learner data come from SECCL2.0, the spoken sub-corpus of the "Chinese students Spoken and Written English Corpus (SWECCL2.0)"; the reference corpus is the spoken component of the British National Corpus (BNC/S). The study found both commonalities and differences between native speakers and English majors in their use of spoken connectives. It also found that Chinese English majors overuse and misuse a number of connectives. Based on these findings, the article offers some suggestions for the teaching of spoken English.

