Exploration of Document Relation Quality with Consideration of Term Representation Basis, Term Weighting and Association Measure

Author(s):  
Nichnan Kittiphattanabawon ◽  
Thanaruk Theeramunkong ◽  
Ekawit Nantajeewarawat
Author(s):  
Stefano Rastelli

Abstract: This article suggests a method to appraise L2 morpheme productivity in longitudinal learner data. Traditionally, morpheme productivity is believed to depend on type frequency and on the proportion between inflected and uninflected lexemes. However, such measures cannot distinguish between rote-learning and rule-learning of target-like forms. In contrast, the association measure ΔP (delta P) can quantify the extent to which a morpheme is contingent upon a limited number of lexemes. Decreasing contingency might parallel learners’ increasing awareness of asymmetrical morpheme-lexeme distribution in the input, and this might be a cue of developing L2 grammatical competence beyond appearances. The paper presents the rationale and procedure for analyzing within-item variance – or the ‘intra-language’ – and illustrates a case study concerning the perfective morpheme in L2 Italian.
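
For readers unfamiliar with ΔP, the following is a minimal sketch of the standard contingency definition, ΔP = P(outcome | cue) - P(outcome | no cue), computed from a 2x2 table of counts. The abstract does not describe the article's actual procedure, so the function name `delta_p`, the cell names `a`, `b`, `c`, `d`, and the example counts are illustrative assumptions only.

```python
def delta_p(a: int, b: int, c: int, d: int) -> float:
    """Return ΔP = P(outcome | cue) - P(outcome | no cue).

    Hypothetical cell layout (an assumption, not taken from the article):
      a: cue present, outcome present   (e.g. lexeme X with the morpheme)
      b: cue present, outcome absent    (lexeme X without the morpheme)
      c: cue absent,  outcome present   (other lexemes with the morpheme)
      d: cue absent,  outcome absent    (other lexemes without the morpheme)
    """
    if a + b == 0 or c + d == 0:
        raise ValueError("each row of the table needs at least one observation")
    return a / (a + b) - c / (c + d)


# Made-up counts: the morpheme appears on 30 of 40 tokens of one lexeme
# and on 20 of 60 tokens of all other lexemes.
print(delta_p(30, 10, 20, 40))
```

Values near 1 indicate that the morpheme is strongly tied to that lexeme, values near 0 indicate no contingency, and negative values indicate that the lexeme disfavors the morpheme.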


2013 ◽  
Vol 278-280 ◽  
pp. 2058-2064
Author(s):  
Cheng Ying Chi ◽  
Hong Li ◽  
Xue Gang Zhan ◽  
Sheng Nan Jiang

Based on an analysis of the structure of web news texts, this paper proposes an improved term weighting measure for hot topic detection and a topic weighting scheme for hot topic ranking. A comparison of experimental results shows that the method is effective and that the resulting ranking of hot topics is closer to reality.


2020 ◽  
Vol 11 (2) ◽  
pp. 107-111
Author(s):  
Christevan Destitus ◽  
Wella Wella ◽  
Suryasari Suryasari

This study aims to classify tweets on Twitter using the Support Vector Machine and Information Gain methods. The classification seeks a hyperplane that separates the negative and positive classes. The system first performs text mining and text preprocessing, which consists of tokenizing, filtering, stemming, and term weighting. Feature selection is then carried out with information gain, which is computed from the entropy associated with each word. Finally, tweets are classified using the selected features, and the output identifies whether or not a tweet is bullying. The study reports that the Support Vector Machine and Information Gain methods give sufficiently good results.
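
The feature-selection step can be illustrated with a minimal sketch of the textbook information gain of a word: the entropy of the class labels minus the expected entropy after splitting the documents by whether the word is present. The abstract does not give the study's exact formula or data, so the helper names `entropy` and `information_gain` and the four toy tweets below are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(docs, labels, term):
    """Entropy of the labels minus the weighted entropy after splitting
    the documents into those that contain `term` and those that do not."""
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    conditional = (len(with_term) / n) * entropy(with_term) \
        + (len(without_term) / n) * entropy(without_term)
    return entropy(labels) - conditional


# Toy data (invented): each tweet is a set of already stemmed tokens.
docs = [{"stupid", "you"}, {"nice", "day"}, {"stupid", "idiot"}, {"good", "morning"}]
labels = ["bully", "ok", "bully", "ok"]

# Rank the vocabulary by information gain, highest first.
vocabulary = sorted(set().union(*docs))
ranked = sorted(vocabulary, key=lambda t: information_gain(docs, labels, t), reverse=True)
print(ranked[0])
```

In a pipeline like the one described, only the highest-ranked words would be kept as features before training the classifier.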


2010 ◽  
Author(s):  
Sirinoot Boonsuk ◽  
Donglai Zhu ◽  
Bin Ma ◽  
Atiwong Suchato ◽  
Proadpran Punyabukkana ◽  
...  

IEEE Access ◽  
2019 ◽  
Vol 7 ◽  
pp. 166578-166592
Author(s):  
Surender Singh Samant ◽  
N. L. Bhanu Murthy ◽  
Aruna Malapati
