Automatic term recognition based on statistics of compound nouns

Terminology ◽  
2000 ◽  
Vol 6 (2) ◽  
pp. 195-210 ◽  
Author(s):  
Hiroshi Nakagawa

The NTCIR1 TMREC group called for participation in the term recognition task, which was part of NTCIR1, held in 1999. As an activity of TMREC, they have provided us with the test collection of the term recognition task. The goal of this task is to automatically recognize and extract terms from a text corpus that consists of 1,870 abstracts gathered from the NACSIS Academic Conference Database. This article describes the term extraction method we have proposed to extract terms consisting of simple and compound nouns, and the experimental evaluation of the proposed method with this NTCIR TMREC test collection. The basic idea for scoring a simple noun N in our term extraction method is to count how many nouns are conjoined with N to make compound nouns. We then extend this score to compound nouns, because most technical terms are compound nouns. Our method has a parameter to tune the degree of preference for either longer or shorter compound nouns. As for term candidates, in addition to noun sequences, we may add variations such as patterns of "A no B", which roughly means "B of A" or "A's B", and/or "A na B", where "A na" is an adjective. Experimental results of our method are promising, namely recall of 0.83, precision of 0.46 and F-value of 0.59 for exactly matched extracted terms when we take into account the 16,000 top-scoring extracted terms.
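The counting idea in this abstract can be illustrated with a minimal Python sketch. The abstract does not give the scoring formula, so everything specific here is an assumption made for illustration: the function names, the "+1" smoothing, the geometric-mean style combination for compound nouns, and the way the hypothetical `alpha` parameter shifts preference between longer and shorter compounds.

```python
from collections import defaultdict
from math import prod


def adjacency_sets(compounds):
    """Collect, for each simple noun, the distinct nouns seen directly
    to its left and to its right inside the given compound nouns."""
    left = defaultdict(set)
    right = defaultdict(set)
    for nouns in compounds:
        for a, b in zip(nouns, nouns[1:]):
            right[a].add(b)  # b follows a
            left[b].add(a)   # a precedes b
    return left, right


def score(nouns, left, right, alpha=1.0):
    """Score a simple or compound noun (a tuple of simple nouns).

    Assumed formula, not taken from the paper: multiply
    (left count + 1) * (right count + 1) over the constituents and
    raise the product to 1 / (2 * L**alpha), where L is the number of
    constituents. alpha = 1 gives a length-normalised geometric mean;
    alpha = 0 removes the normalisation and so favours longer compounds.
    """
    length = len(nouns)
    product = prod(
        (len(left.get(n, ())) + 1) * (len(right.get(n, ())) + 1)
        for n in nouns
    )
    return product ** (1.0 / (2 * length ** alpha))


# Toy candidate list (hypothetical data).
compounds = [
    ("term", "extraction"),
    ("term", "recognition"),
    ("information", "extraction"),
    ("automatic", "term", "recognition"),
]
left, right = adjacency_sets(compounds)
ranked = sorted(compounds, key=lambda c: score(c, left, right), reverse=True)
```

In this toy data "term" has one distinct left neighbour and two distinct right neighbours, so its factor is (1 + 1) * (2 + 1) = 6. Changing `alpha` only matters when candidates of different lengths are compared.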

Author(s):  
Hu Zhang ◽  
Bangze Pan ◽  
Ru Li

Legal judgment elements extraction (LJEE) aims to automatically identify the different judgment features from the fact description in legal documents, which helps to improve the accuracy and interpretability of the judgment results. In real court rulings, judges usually need to scan both the fact descriptions and the law articles repeatedly to find the relevant information, and it is hard to acquire the key judgment features quickly, so legal judgment elements extraction is a crucial and challenging task for legal judgment prediction. However, most existing methods follow the text classification framework, which fails to model the attentive relations between the law articles and the legal judgment elements. To address this issue, we simulate the working process of human judges and propose a legal judgment elements extraction method with a law article-aware mechanism, which captures the complex semantic correlations between the law article and the legal judgment elements. Experimental results show that our proposed method achieves significant improvements over other state-of-the-art baselines on the element recognition task dataset. Compared with the BERT-CNN model, the proposed “All labels Law Articles Embedding Model (ALEM)” improves the accuracy, recall, and F1 value by 0.5, 1.4 and 1.0, respectively.


Terminology ◽  
2000 ◽  
Vol 6 (2) ◽  
pp. 287-311 ◽  
Author(s):  
Jong-Hoon Oh ◽  
Juho Lee ◽  
Kyung-Soon Lee ◽  
Key-Sun Choi

There have been many studies of automatic term recognition (ATR), and they have achieved good results. However, they focus on mono-lingual term extraction methods. Therefore, it is difficult to extract terms from documents in foreign languages. This article describes an automatic method for extracting terms from documents in foreign languages using a machine translation system. In our method, we translate documents in foreign languages into Korean and extract terms from the translated Korean documents. Finally, the terms recognized in the Korean documents are translated into terms in the foreign language. By using our method, one can extract terms for languages that one does not know.


Terminology ◽  
2003 ◽  
Vol 9 (2) ◽  
pp. 201-219 ◽  
Author(s):  
Hiroshi Nakagawa ◽  
Tatsunori Mori

In this paper, we propose a new approach to enhance automatic recognition systems for domain-specific terms. The approach is based on statistics about the relation between a compound noun and its constituent simple nouns. More precisely, we focus on how many nouns adjoin the noun in question to form compound nouns. We propose several scoring methods based on this approach and experimentally evaluate them on the NTCIR1 TMREC test collection. The results are very promising, especially at low and high recall.


2005 ◽  
Vol 2 (4) ◽  
pp. 253-268 ◽  
Author(s):  
Mats Lindgren ◽  
Ilja Belov ◽  
Peter Leisner

This article presents the results of an experimental evaluation of glob-top materials for multi-chip modules (MCM) in harsh environments. Material and process tests have been performed with the aim of finding a material that would fulfill the reliability requirements for use in, e.g., military or automotive applications. Seven polymer materials, i.e. four epoxies, two silicones and one polyurethane material, have been selected and evaluated in the experiments. The most critical material and process parameters for glob-top have been identified and measured. Based on the experimental results, application-based scoring of the studied epoxy materials has been performed. The material evaluation results have been summarized in conclusions about the most suitable glob-top material for use in harsh environments.


Terminology ◽  
2000 ◽  
Vol 6 (2) ◽  
pp. 257-286 ◽  
Author(s):  
Yoshio Fukushige ◽  
Naohiko Noguchi

In this article we describe our approaches to, and results for, the Term Recognition (TMREC) task in the first NTCIR Workshop on Research in Japanese Text Retrieval and Term Recognition, held 30 August-1 September 1999. Our first approach aims to collect words that appear distinctively in documents of the target domain through a statistical method. Our second approach aims to collect terms that have a particular inner structure by applying several diagnostic tests using the collocational information in the corpus. Section 1 describes the outline of the term recognition task. Section 2 briefly describes the two approaches, details of which are described in Sections 3 and 4. In Section 6, we offer a short discussion based on the comparison between the candidates.


2013 ◽  
Vol 427-429 ◽  
pp. 1874-1878 ◽  
Author(s):  
Guo De Wang ◽  
Zhi Sheng Jing ◽  
Guo Wei Qin ◽  
Shan Chao Tu

Wear particle recognition is a key link in the process of ferrography analysis. Different kinds of wear particles vary greatly in texture, so texture is one of the most important features in wear particle recognition. Local Binary Pattern (LBP) is an efficient operator for texture description. The binary sequence of the traditional LBP operator is obtained by comparing the gray value of each pixel in the neighborhood with the gray value of the center pixel of the neighborhood; this comparison is so simple that texture information is lost. In this paper, an improved LBP operator is presented for texture feature extraction, and it is applied to the recognition of severe sliding particles, fatigue spall particles and laminar particles. The experimental results show that our method is an effective feature extraction method and obtains better recognition accuracy than other methods.
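For reference, the traditional LBP operator that this abstract takes as its starting point can be sketched in a few lines of Python. This is the basic 3x3 operator only, not the improved operator the paper proposes (which the abstract does not specify). The neighbour ordering, the `>=` convention, and the function names are assumptions made for this sketch.

```python
# Offsets of the 8 neighbours, clockwise from the top-left pixel.
# The starting point and direction are a convention assumed here.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
           (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(image, row, col):
    """Return the 8-bit LBP code of an interior pixel.

    Each neighbour contributes a 1 bit when its gray value is greater
    than or equal to the gray value of the centre pixel.
    """
    center = image[row][col]
    code = 0
    for bit, (dr, dc) in enumerate(OFFSETS):
        if image[row + dr][col + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """Histogram of LBP codes over all interior pixels of a 2-D
    list of gray values; usable as a simple texture feature vector."""
    hist = [0] * 256
    for row in range(1, len(image) - 1):
        for col in range(1, len(image[0]) - 1):
            hist[lbp_code(image, row, col)] += 1
    return hist
```

Because each neighbour is reduced to a single bit, two neighbourhoods with very different gray-level contrast can receive the same code, which is the information loss the abstract refers to.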


Author(s):  
HONGYUN ZHANG ◽  
DUOQIAN MIAO ◽  
CAIMING ZHONG

Minutiae extraction and pseudo-minutiae deletion for low-quality fingerprint images are difficult but crucial in automatic fingerprint identification systems. Traditional methods based on thinned images or gray-level images are, however, susceptible to noise. Reference 14 indicated that fingerprint minutiae extraction based on principal curves was a feasible way to overcome this drawback, but the extended polygonal line (EPL) principal curves algorithm used in that paper extracted the principal curves ineffectively. As fingerprint data sets are usually large, the original EPL principal curves algorithm is time-consuming. Meanwhile, scattered fingerprint data lead to deviation of the fingerprint skeleton. In this paper, the algorithm is modified, and a fingerprint minutiae extraction and pseudo-minutiae detection method based on principal curves is proposed. Experimental results show that the modified EPL principal curves algorithm outperforms the original EPL algorithm in both efficiency and quality, and that the proposed minutiae extraction method outperforms the methods proposed by Miao under noisy conditions.

