A French corpus annotated for multiword expressions and named entities

2021 ◽  
Vol 8 (2) ◽  
Author(s):  
Marie Candito ◽  
Mathieu Constant ◽  
Carlos Ramisch ◽  
Agata Savary ◽  
Bruno Guillaume ◽  
...  

We present the enrichment of a French treebank of various genres with a new annotation layer for multiword expressions (MWEs) and named entities (NEs). Our contribution with respect to previous work on NE and MWE annotation is the particular care taken to use formal criteria, organized into decision flowcharts, shedding some light on the interactions between NEs and MWEs. Moreover, in order to cope with the well-known difficulty of drawing a clear-cut frontier between compositional expressions and MWEs, we chose to use sufficient criteria only. As a result, annotated MWEs satisfy a varying number of sufficient criteria, accounting for the scalar nature of the MWE status. In addition to the span of the elements, annotation includes the subcategory of NEs (e.g., person, location) and one matching sufficient criterion for non-verbal MWEs (e.g., lexical substitution). The 3,099 sentences of the treebank were double-annotated and adjudicated, and we paid attention to cross-type consistency and compatibility with the syntactic layer. Overall inter-annotator agreement on non-verbal MWEs and NEs reached 71.1%. The released corpus contains 3,112 annotated NEs and 3,440 MWEs, and is distributed under an open license.

2016 ◽  
Vol 23 (3) ◽  
pp. 441-472 ◽  
Author(s):  
MAI OUDAH ◽  
KHALED SHAALAN

Abstract. Named Entity Recognition (NER) is an essential task for many natural language processing systems, which makes use of various linguistic resources. NER becomes more complicated when the language in use is morphologically rich and structurally complex, such as Arabic. This language has a set of characteristics that makes it particularly challenging to handle. In previous work, we proposed an Arabic NER system that follows the hybrid approach, i.e. integrates both rule-based and machine learning-based NER approaches. Our hybrid NER system is the state of the art in Arabic NER according to its performance on standard evaluation datasets. In this article, we discuss a novel methodology for overcoming the coverage drawback of rule-based NER systems in order to improve their performance and allow for automated rule update. The presented mechanism utilizes the recognition decisions made by the hybrid NER system in order to identify the weaknesses of the rule-based component and derive new linguistic rules aimed at enhancing the rule base, which will help in achieving more reliable and accurate results. We used the ACE 2004 Newswire standard dataset as a resource for extracting and analyzing new linguistic rules for person, location and organization name recognition. We formulate each new rule based on two distinctive feature groups, i.e. gazetteers of each type of named entity and part-of-speech tags, in particular noun and proper noun. Fourteen new patterns are derived, formulated as grammar rules, and evaluated in terms of coverage. The conducted experiments exploit a POS-tagged version of the ACE 2004 NW dataset. The empirical results show that the enhanced rule-based system, i.e. NERA 2.0, improves the coverage of the previously misclassified person, location and organization named entity types by 69.93 per cent, 57.09 per cent and 54.28 per cent, respectively.
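To make the idea of a rule built from gazetteer entries and part-of-speech tags concrete, the following is a minimal sketch and not one of the fourteen patterns from the article. The gazetteer contents, the `NNP` proper-noun tag, the English example tokens, and the function name `find_person_spans` are all assumptions chosen for illustration; the actual NERA 2.0 rules operate on Arabic text and are not reproduced here.

```python
from typing import List, Tuple

# Hypothetical toy gazetteers; a real system would load much larger lists.
HONORIFICS = {"Mr.", "Dr.", "President"}
FIRST_NAMES = {"Khaled", "Mai"}

Token = Tuple[str, str]  # (word, POS tag); "NNP" is assumed to mark proper nouns


def find_person_spans(tokens: List[Token]) -> List[Tuple[int, int]]:
    """Return [start, end) spans for the illustrative rule:
    a gazetteer trigger (honorific or first name) followed by
    one or more proper-noun tokens. The span includes the trigger."""
    spans = []
    i = 0
    while i < len(tokens):
        word, _ = tokens[i]
        if word in HONORIFICS or word in FIRST_NAMES:
            j = i + 1
            # Extend the span over consecutive proper nouns.
            while j < len(tokens) and tokens[j][1] == "NNP":
                j += 1
            if j > i + 1:  # at least one proper noun followed the trigger
                spans.append((i, j))
                i = j
                continue
        i += 1
    return spans


sentence = [("Dr.", "NNP"), ("Khaled", "NNP"), ("Shaalan", "NNP"), ("spoke", "VBD")]
matched = find_person_spans(sentence)
```

In this sketch the gazetteer supplies the lexical trigger and the POS tags bound the extent of the name, which is one plausible way the two feature groups named in the abstract could be combined in a single rule.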


2021 ◽  
Vol 5 (1) ◽  
pp. 27-46
Author(s):  
Alexis Kauffmann ◽  
François-Claude Rey ◽  
Iana Atanassova ◽  
Arnaud Gaudinat ◽  
Peter Greenfield ◽  
...  

We define indirectly named entities as multiword expressions that refer to known named entities by means of periphrasis. While named entity recognition is a classical task in natural language processing, little attention has been paid to indirectly named entities and their treatment. In this paper, we try to address this gap, describing issues related to the detection and understanding of indirectly named entities in texts. We introduce a proof of concept for retrieving both lexicalised and non-lexicalised indirectly named entities in French texts. We also show example cases where this proof of concept is applied, and discuss future perspectives. We have initiated the creation of a first lexicon of 712 indirectly named entity entries that is available for future research.


2017 ◽  
Vol 22 (1) ◽  
pp. 11-16
Author(s):  
Joel Weddington ◽  
Charles N. Brooks ◽  
Mark Melhorn ◽  
Christopher R. Brigham

Abstract. In most cases of shoulder injury at work, causation is not clear-cut and requires detailed, thoughtful, and time-consuming analysis; traditionally, physicians have approached this in a cursory manner, often presenting their findings as an opinion. An established method of causation analysis using six steps is outlined in the American College of Occupational and Environmental Medicine Guidelines and in the AMA Guides to the Evaluation of Disease and Injury Causation, Second Edition, as follows: 1) collect evidence of disease; 2) collect epidemiological data; 3) collect evidence of exposure; 4) collect other relevant factors; 5) evaluate the validity of the evidence; and 6) write a report with evaluation and conclusions. Evaluators also should recognize that thresholds for causation vary by state and are based on specific statutes or case law. Three cases illustrate evidence-based causation analysis using the six steps and show how examiners can form well-founded opinions about whether a given condition is work related, nonoccupational, or some combination of these. An evaluator's causal conclusions should be rational, should be consistent with the facts of the individual case and medical literature, and should cite pertinent references. The opinion should be stated “to a reasonable degree of medical probability,” on a “more-probable-than-not” basis, or using a suitable phrase that meets the legal threshold in the applicable jurisdiction.


2002 ◽  
Vol 41 (06) ◽  
pp. 233-239 ◽  
Author(s):  
C. Hausteiner ◽  
A. Drzezga ◽  
P. Bartenstein ◽  
M. Schwaiger ◽  
H. Förstl ◽  
...  

Summary. Aim: Multiple chemical sensitivity (MCS) is a controversially discussed symptom complex. Patients afflicted by MCS react to very low and generally nontoxic concentrations of environmental chemicals. It has been suggested that MCS leads to neurotoxic damage or neuroimmunological alteration in the brain detectable by positron emission tomography (PET) and single photon emission computed tomography (SPECT). These methods are often applied to MCS patients for diagnosis, although they have never been proven appropriate. Method: We scanned 12 MCS patients with PET, hypothesizing that it would reveal abnormal findings. Results: Mild glucose hypometabolism was present in one patient. In comparison with normal controls, the patient group showed no significant functional brain changes. Conclusion: This first systematic PET study in MCS patients revealed no hint of neurotoxic or neuroimmunological brain changes of functional significance.


1975 ◽  
Vol 34 (01) ◽  
pp. 106-114 ◽  
Author(s):  
I. D Walker ◽  
J. F Davidson ◽  
P Young ◽  
J. A Conkie

Summary. The effect of seven different anabolic steroids (Ethyloestrenol, Methenolone acetate, Norethandrolone, Methylandrostenediol, Oxymetholone, Methandienone, and Stanozolol) on three α-globulin antiprotease inhibitors of thrombin and plasmin was studied in men with ischaemic heart disease. In distinct contrast to the oral contraceptives, five of the six 17-α-alkylated anabolic steroids studied produced increased plasma Antithrombin III levels and five produced decreased levels of plasma α2-macroglobulin. The effect on plasma α1-antitrypsin levels was less clear-cut but three of the steroids examined produced significantly elevated levels. The increased plasma fibrinolytic activity which the 17-α-alkylated anabolic steroids induce is therefore unlikely to be secondary to disseminated intravascular coagulation.


1996 ◽  
Vol 75 (05) ◽  
pp. 778-781 ◽  
Author(s):  
Domenico Prisco ◽  
Sandra Fedi ◽  
Tamara Brunelli ◽  
Anna Paola Cellai ◽  
Mohamed Isse Hagi ◽  
...  

Summary. At least five studies based on more than twenty thousand healthy subjects indicated that fibrinogen is an independent risk factor for cardiovascular events; less clear-cut is the relation between factor VII and risk for arterial thrombotic disorders, which was demonstrated in two of the three studies investigating this association. However, no study has investigated the behaviour of fibrinogen and factor VII in an adolescent population. In a study of Preventive Medicine and Education Program, fibrinogen (clotting method) and factor VIIag (ELISA), in addition to other metabolic parameters, life-style and familial history, were investigated in 451 students (313 females and 138 males, age 15-17 years) from two high schools of Florence. Fibrinogen levels were significantly higher in women than in men, whereas factor VIIag levels did not significantly differ. Both fibrinogen and factor VIIag significantly correlated with total cholesterol (p < 0.05) while only fibrinogen correlated with body mass index (p < 0.01). Factor VIIag was significantly correlated with systolic blood pressure (p < 0.001). This study provides information on coagulation risk factors in a population of adolescents which may be of importance in planning coronary heart disease prevention programs.


2008 ◽  
Vol 56 (S 1) ◽  
Author(s):  
J Müller ◽  
M Brodde ◽  
P Nüsser ◽  
J Müller ◽  
K Graichen ◽  
...  

2020 ◽  
pp. 54-59
Author(s):  
A. A. Yelizarov ◽  
A. A. Skuridin ◽  
E. A. Zakirova

A computer model and the results of a numerical experiment for a sensitive element on a planar mushroom-shaped metamaterial with cells of the “Maltese cross” type are presented. The proposed electrodynamic structure is shown to be applicable for nondestructive testing of geometric and electrophysical parameters of technological media, as well as searching for inhomogeneities in them. Resonant frequency shift and change of the attenuation coefficient value of the structure serve as informative parameters.


Author(s):  
Bhawana Pant ◽  
Sanjay Gaur ◽  
Prabhat Pant

Fine needle aspiration cytology (FNAC) has long been used as a safe and economical tool for fast preoperative diagnosis of parotid tumours. It has certain pitfalls, which sometimes lead to misdiagnosis and may consequently affect treatment of the tumours. In view of the diverse classification of parotid tumours, information from cytology should be combined with radiology as well as clinical diagnosis. Aim: To discuss some cases where there was a discrepancy between the cytological diagnosis and the histopathological result, and to suggest measures to improve the efficacy of FNAC. Material and methods: The study includes 50 patients with parotid tumours who presented to the department of ENT at Government Medical College Haldwani, a tertiary referral centre, from 2009 to 2016. Only adult patients were included, and inflammatory swellings were excluded from the study. All patients were evaluated with contrast-enhanced computerized tomography (CECT) and magnetic resonance imaging (MRI), followed by FNAC. Preoperative diagnosis was based on the findings of the above investigations, and different types of parotid surgeries were done. Final diagnosis was made on histopathological examination. Results: The most common tumour was pleomorphic adenoma (23 cases, 46%), followed by mucoepidermoid carcinoma (12 cases, 24%). In ten cases there was no clear-cut association between the cytological diagnosis and the final histopathological diagnosis. Conclusion: FNAC is a highly sensitive and specific technique for diagnosis of many salivary gland swellings. FNAC can be used preoperatively to avoid unnecessary surgery and biopsy. Details of clinical information and radiologic features may help the pathologist to arrive at the appropriate diagnosis and reduce false interpretation. Pitfalls may also occur with improper FNAC technique, which can be overcome by proper caution.

