Reducing redundancy in multi-document summarization using lexical semantic similarity

Author(s):  
Iris Hendrickx ◽  
Walter Daelemans ◽  
Erwin Marsi ◽  
Emiel Krahmer
Author(s):  
Martin Maiden

The historical morphology of the verb ‘snow’ in Francoprovençal presents a conundrum, in that it is clearly analogically influenced by the verb ‘rain’, for obvious reasons of lexical semantic similarity, but the locus of that influence is not the ‘root’ (the ostensible bearer of lexical meaning) but desinential inflexion-class members, which are in principle independent of any lexical meaning. Similar morphological changes are also identified for other Gallo-Romance verbs. It seems, in effect, that speakers can identify exponents of the lexical meaning of word-forms in linear sequences larger than the apparent ‘morphemic’ composition of those word-forms, even when such a composition may seem prima facie transparent and obvious. It is argued that these facts are inherently incompatible with ‘constructivist’, morpheme-based, models of morphology, and strongly compatible with what have been called ‘abstractivist’ (‘word-and-paradigm’) approaches, which generally take entire word-forms as the primary units of morphological analysis.


Document summarization is the process of generating a summary of documents gathered from web sources. It reduces the burden on web readers by providing a short summary in place of the entire document contents. In our previous research work this was performed with the Noun Weight based Automated Multi-Document Summarization Method (NW-AMDSM). However, that work does not consider semantic similarity, which may reduce the accuracy of the summarization outcome. The proposed research addresses this by introducing the Semantic Similarity based Automatic Document Summarization Method (SS-ADSM). In this work, multi-document grouping is based on semantic similarity computation, so that documents with similar contents can be grouped more accurately. The semantic similarity computation is performed with the help of a WordNet analyzer. The document grouping is done with a modified fuzzy c-means (FCM) clustering algorithm. Finally, a hybrid neuro-fuzzy genetic algorithm is introduced to perform the automatic summarization. The numerical analysis of the proposed method is conducted in the MATLAB simulation environment and compared with other research methods in terms of various performance metrics. The simulation analysis indicates that the proposed method tends to perform better in terms of the accuracy of the document summarization outcome.
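The grouping step can be illustrated with a minimal sketch of standard fuzzy c-means. This is the textbook algorithm, not the modified variant the abstract refers to, whose details are not given here. The function name `fuzzy_c_means`, its parameters, and the toy two-dimensional vectors standing in for per-document semantic-similarity features are all assumptions made for illustration.

```python
import math
import random


def fuzzy_c_means(points, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Textbook fuzzy c-means (hypothetical helper, not the paper's variant).

    points: list of equal-length numeric vectors (toy document features).
    c:      number of clusters.
    m:      fuzzifier (> 1); larger values give softer memberships.
    Returns (centers, memberships), where memberships[i][j] is the degree
    to which point i belongs to cluster j and each row sums to 1.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Random initial memberships, each row normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(iters):
        # Centers are means of the points weighted by membership ** m.
        centers = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(n)]
            total = sum(w)
            centers.append(
                [sum(w[i] * points[i][d] for i in range(n)) / total
                 for d in range(dim)]
            )

        # Memberships are recomputed from relative distances to the centers.
        new_u = []
        for i in range(n):
            # Clamp distances to avoid division by zero on exact hits.
            d = [max(math.dist(points[i], centers[j]), 1e-12) for j in range(c)]
            row = [
                1.0 / sum((d[j] / d[k]) ** (2.0 / (m - 1.0)) for k in range(c))
                for j in range(c)
            ]
            new_u.append(row)

        # Stop once no membership changes by more than tol.
        shift = max(abs(new_u[i][j] - u[i][j])
                    for i in range(n) for j in range(c))
        u = new_u
        if shift < tol:
            break

    return centers, u


# Toy feature vectors for six hypothetical documents (two obvious groups).
docs = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
centers, memberships = fuzzy_c_means(docs, c=2)
for doc, row in zip(docs, memberships):
    print(doc, [round(v, 3) for v in row])
```

Unlike hard clustering, each document receives a graded membership in every group, so a document that overlaps two topics can contribute to both.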


2020 ◽  
Vol 21 (2) ◽  
pp. 154-181
Author(s):  
Tabassom Azimi ◽  
Zahra-Saddat Qoreishi ◽  
Reza Nilipour ◽  
Morteza Farazi ◽  
...  

Objective: Evidence from brain trauma suggests that the two grammatical categories of noun and verb are processed in different regions of the brain due to differences in the complexity of grammatical and semantic information processing. Studies have shown that verbs belonging to different semantic categories lead to neural activity in different areas of the brain, and that action verb processing is related to the activity of motor and pre-motor areas. Researchers use different tasks to evaluate action verb processing. The most common are action naming and action fluency tasks. Although these tasks are sensitive to deficits in action verb processing, they do not specify the nature of the injury. To understand whether dysfunction in action verb processing is due to difficulty in lexical access or to a specific impairment in semantic processing, it is necessary to design a specific test to evaluate lexical-semantic processing. The Semantic Similarity Judgment (SSJ) test targets lexical-semantic encoding at a deep and controlled processing level. The purpose of the present study was to develop an SSJ test for Persian action verbs and non-action nouns and determine its content validity. Materials & Methods: In this methodological study, 70 Persian concrete action verbs and 80 Persian concrete non-action nouns were first selected. For each word, a semantically related word based on functional, physical, and categorical features and similarity in action was selected according to the opinion of 4 experts (3 speech-language pathologists and one linguist) using a 7-point scale. For semantic similarity rating, only the pairs of words with a high semantic similarity score (5 to 7) were retained and the rest were omitted. Then, for each pair of semantically related words, a semantically unrelated word was selected.
After content validity was determined qualitatively by three experts and inappropriate items were removed, the lexical and psycholinguistic characteristics of the remaining words (207 nouns and 156 verbs) were extracted in order to match the two sets of nouns and verbs. These characteristics included frequency, number of syllables, phonemes, letters, phonological and orthographic neighbors, action association, imageability, familiarity and age of acquisition, and were rated by 18 volunteers (13 speech-language pathologists and linguists and 5 parents selected by a convenience sampling method) on a 7-point scale. The verbs with low action association and the nouns with high action association were removed, and the two sets of words were then matched for the other lexical and psycholinguistic characteristics. Finally, 34 triples of verbs with high action association and 34 triples of nouns with low action association were selected. In both the noun and verb sets, the words were chosen so that making a judgment requires careful consideration of their semantic features. Data analysis was performed using descriptive statistics and the independent t-test.


2014 ◽  
Vol 513-517 ◽  
pp. 1326-1331
Author(s):  
Yu Bing Yang ◽  
Yan Yan Wu ◽  
Zhan Ping Li ◽  
Lin Ge Wang

Recognizing the semantic orientation (tendency) of words is the basis for recognizing the orientation of sentences, which in turn is the foundation for recognizing orientation at the level of the text and chapter structure. Based on HowNet lexical semantic similarity calculation and on current theory of the evaluative orientation of vocabulary, the paper puts forward an improved orientation algorithm. In experimental validation with the same pairs of benchmark words, the accuracy rate improves substantially, to above 90%. The approach has some value in practical applications.
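A common baseline for this kind of similarity-based orientation scoring compares a word's average similarity to positive benchmark words with its average similarity to negative ones. The sketch below shows only that baseline, not the improved algorithm the abstract describes, whose details are not given here. The function names, the toy similarity table, and the benchmark words are hypothetical stand-ins for a real HowNet similarity measure.

```python
# Toy symmetric similarity scores (hypothetical values, not from HowNet).
TOY_SIM = {
    ("excellent", "good"): 0.90,
    ("excellent", "bad"): 0.10,
    ("terrible", "good"): 0.10,
    ("terrible", "bad"): 0.85,
}


def toy_sim(a, b):
    """Stand-in for a HowNet-based similarity function with values in [0, 1]."""
    if a == b:
        return 1.0
    return TOY_SIM.get((a, b), TOY_SIM.get((b, a), 0.0))


def orientation(word, pos_seeds, neg_seeds, sim):
    """Mean similarity to positive benchmarks minus mean similarity to negative ones.

    A result above 0 suggests a positive orientation, below 0 a negative one.
    """
    pos = sum(sim(word, p) for p in pos_seeds) / len(pos_seeds)
    neg = sum(sim(word, q) for q in neg_seeds) / len(neg_seeds)
    return pos - neg


for w in ("excellent", "terrible"):
    score = orientation(w, ["good"], ["bad"], toy_sim)
    print(w, "positive" if score > 0 else "negative")
```

In this formulation the choice of benchmark word pairs drives the result, which is consistent with the abstract reporting its accuracy for a fixed set of benchmark pairs.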

