A shallow parser based on closed-class words to capture relations in biomedical text

2003 ◽  
Vol 36 (3) ◽  
pp. 145-158 ◽  
Author(s):  
Gondy Leroy ◽  
Hsinchun Chen ◽  
Jesse D Martinez
Author(s):  
Richard Compton

This chapter examines polysynthetic word formation in Inuit (Eskimo-Aleut), using the presence and variable ordering of a closed class of adverbs within verbal complexes as a diagnostic device to evaluate the adequacy of different accounts of word formation. It is argued that a head movement account of Mirror Principle orders within Inuit words undergenerates with respect to the observed variation in adverb ordering, particularly if a fixed hierarchy of adverbial functional projections is assumed, as in Cinque (1999). Instead, it is shown that an analysis that employs a right-headed structure, XP-sized phasal words, and Ernst’s (2002) semantically based framework of adverb licensing better captures the observed variation.


2021 ◽  
pp. 103699
Author(s):  
Muhammad Ali Ibrahim ◽  
Muhammad Usman Ghani Khan ◽  
Faiza Mehmood ◽  
Muhammad Nabeel Asim ◽  
Waqar Mahmood

Database ◽  
2021 ◽  
Vol 2021 ◽  
Author(s):  
Yifan Shao ◽  
Haoru Li ◽  
Jinghang Gu ◽  
Longhua Qian ◽  
Guodong Zhou

Abstract Extraction of causal relations between biomedical entities in the form of Biological Expression Language (BEL) poses a new challenge to the community of biomedical text mining due to the complexity of BEL statements. We propose a simplified form of BEL statements [Simplified Biological Expression Language (SBEL)] to facilitate BEL extraction and employ BERT (Bidirectional Encoder Representation from Transformers) to improve the performance of causal relation extraction (RE). On the one hand, BEL statement extraction is transformed into the extraction of an intermediate form—SBEL statement, which is then further decomposed into two subtasks: entity RE and entity function detection. On the other hand, we use a powerful pretrained BERT model to both extract entity relations and detect entity functions, aiming to improve the performance of two subtasks. Entity relations and functions are then combined into SBEL statements and finally merged into BEL statements. Experimental results on the BioCreative-V Track 4 corpus demonstrate that our method achieves the state-of-the-art performance in BEL statement extraction with F1 scores of 54.8% in Stage 2 evaluation and of 30.1% in Stage 1 evaluation, respectively. Database URL: https://github.com/grapeff/SBEL_datasets


2016 ◽  
Vol 2 (1) ◽  
Author(s):  
Peter Crosthwaite ◽  
Lavigne L.Y. Choy ◽  
Yeonsuk Bae

Abstract We present an Integrated Contrastive Model of non-numerical quantificational NPs (NNQs, e.g. ‘some people’) produced by L1 English speakers and Mandarin and Korean L2 English learners. Learner corpus data was sourced from the ICNALE (Ishikawa, 2011, 2013) across four L2 proficiency levels. An average of 10% of L2 NNQs were specific to L2 varieties, including noun number mismatches (*‘many child’), omitting obligatory quantifiers after adverbs (*‘almost people’), adding unnecessary particles (*‘all of people’) and non-L1 English-like quantifier/noun agreement (*‘many water’). L2 learners produce significantly fewer ‘open-class’ NNQs (e.g. ‘a number of people’), preferring ‘closed-class’ single lexical quantifiers (following L1-like use). While such production is predictable via L1 transfer, Korean L2 English learners produced significantly more L2-like NNQs at each proficiency level, which was not entirely predictable under a transfer account. We thus consider whether positive transfer of other linguistic forms (i.e. definiteness marking) aids the learnability of other L2 forms (i.e. expression of quantification).


2007 ◽  
Vol 1 (4) ◽  
pp. 389 ◽  
Author(s):  
Lawrence H. Reeve ◽  
Hyoil Han ◽  
Ari D. Brooks

2012 ◽  
Vol 19 (5) ◽  
pp. 800-808 ◽  
Author(s):  
Balaji Polepalli Ramesh ◽  
Rashmi Prasad ◽  
Tim Miller ◽  
Brian Harrington ◽  
Hong Yu

2021 ◽  
pp. 1-18
Author(s):  
Trevor K.M. DAY ◽  
Jed T. ELISON

Abstract A critical question in the study of language development is to understand lexical and syntactic acquisition, which play different roles in speech to the extent that it would be natural to surmise they are acquired differently. As measured through the comprehension and production of closed-class words, syntactic ability emerges at roughly the 400-word mark. However, a significant proportion of the developmental work uses a coarse combination of function and content words on the MacArthur-Bates Communicative Development Inventory (MB-CDI). Using the MB-CDI Wordbank database, we implemented a factor analytic approach to distinguish between lexical and syntactic development from the Words and Sentences (WS) form that involves both function words and the explicit categorizations. Although the Words and Gestures (WG) form did not share the factor structure, common WG/WS elements recapitulate the expected age-related changes. This parsing of the MB-CDI may prove simple, yet fruitful in subsequent investigation.


1986 ◽  
Vol 9 (1) ◽  
pp. 55-82
Author(s):  
Beata Schmid

In this paper, I have shown that Joshi's (1982) framework of code-switching constraints can largely be applied to Swedish-English code-switches. I feel qualified to conclude that Joshi's claims concerning the non-switchability of closed class items and matrix language and embedded languages are held up by the Swedish-English data. The need for corresponding categories proved to be less clear-cut than originally proposed by Woolford (1983) and others. It seems that optimal switching conditions are given if the categories, rules and metarules correspond in the two languages. Apparently, however, it is also possible to switch if the node admissibility conditions for the matrix language only are met, as was shown by code-switched sentences containing RPs. This requires that the speaker has a clear sense of which language is the host and which is embedded. Rules from the embedded language only are not acceptable. This calls for some sort of determination strategy by the parser. I found no evidence for determining Lm at any specific point in the sentence, except at the topmost S. Rather, the judgments by code-switchers that a sentence “comes from” one language seem to coincide with the fact that the resulting sentence is based on the rules from that language. Other than that, the matrix language is determined by the communicative context as a whole. The data involving RPs also seemed to indicate that RPs are not separate categories, but are NPs, introduced by a “de-slashing” rule (Sells 1984). If they were separate categories, this would be evidence for there being no need for category equivalence. In this case, we would have to explicitly state all other cases which require category equivalence (the majority of cases), which is undesirable.

