Stopping Active Learning Based on Predicted Change of F Measure for Text Classification

Author(s):  
Michael Altschuler ◽  
Michael Bloodgood
Author(s):  
Sarmad Mahar ◽  
Sahar Zafar ◽  
Kamran Nishat

Headnotes are precise explanations and summaries of the legal points in an issued judgment. Law journals hire experienced lawyers to write them, and they help the reader quickly determine the issue discussed in a case. A headnote comprises two parts: the first states the topic discussed in the judgment, and the second summarizes the judgment. In this thesis, we design, develop and evaluate headnote prediction using machine learning, without human involvement. We divide the task into a two-step process. In the first step, we predict the law points used in the judgment with text classification algorithms. In the second step, we generate a summary of the judgment with text summarization techniques. To this end, we created a databank by extracting data from different law sources in Pakistan, and we labelled training data derived from Pakistani law websites. We tested different feature extraction methods on judiciary data to improve our system. Using these feature extraction methods, we also developed a dictionary of terminology for ease of reference and utility. Our approach achieves 65% accuracy using Linear Support Vector Classification with tri-gram features and no stemming. With active learning, our system can continue to improve its accuracy as its users provide more labelled examples.
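To make the "tri-gram, no stemmer" feature setting concrete, here is a minimal sketch of n-gram counting in plain Python. It rests on assumptions the abstract does not confirm: that the n-grams are word-level rather than character-level, and that unigrams and bigrams are included alongside trigrams. The function name `word_ngrams` and the sample sentence are hypothetical.

```python
import re
from collections import Counter


def word_ngrams(text, n_max=3):
    """Count word n-grams for n = 1..n_max (hypothetical helper, not the authors' code)."""
    # Lowercase and keep alphabetic tokens exactly as written: no stemming is applied.
    tokens = re.findall(r"[a-z]+", text.lower())
    features = Counter()
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            features[" ".join(tokens[i:i + n])] += 1
    return features


# Illustrative input only; not taken from the authors' databank.
features = word_ngrams("The court dismissed the appeal")
```

Because no stemmer is applied, inflected forms such as "dismissed" stay distinct features instead of collapsing to a shared stem. Counts like these could then be fed to a linear classifier.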


Author(s):  
Pascal Cuxac ◽  
Jean-Charles Lamirel ◽  
Maha Ghribi

This paper presents an alternative approach to evaluating the quality of unsupervised text classification, based on unsupervised recall, precision and F-measure criteria that exploit the descriptors associated with the classes. The behaviour of classical criteria is compared experimentally with our approach on bibliographic data.


Information ◽  
2018 ◽  
Vol 9 (8) ◽  
pp. 184 ◽  
Author(s):  
Yuliya Rubtsova

The research identifies and substantiates the problem of quality deterioration in the sentiment classification of text collections that are identical in composition and characteristics but staggered over time. It is shown that the quality of sentiment classification can drop by up to 15% in terms of the F-measure over a year and a half. This paper presents three approaches to improving sentiment classification of continuously updated text collections in Russian: using a weighting scheme with linear computational complexity, adding lexicons of emotional vocabulary to the feature space, and using distributed word representations. All methods are compared, and it is shown which method is most applicable in which cases. Experiments comparing the methods on sufficiently representative text collections are described. It is shown that the suggested approaches can reduce the deterioration of sentiment classification results for collections staggered over time.
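For reference, the F-measure behind such a comparison can be sketched as follows. This uses the standard definition from precision and recall. The confusion counts are invented for illustration and are not results from the paper, and the abstract does not say whether its 15% figure is an absolute or a relative drop (the sketch computes an absolute difference).

```python
def f_measure(tp, fp, fn, beta=1.0):
    """Standard F-beta score from true positives, false positives and false negatives."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical counts: the same classifier scored on an early and a later collection.
f_old = f_measure(tp=80, fp=20, fn=20)
f_new = f_measure(tp=65, fp=35, fn=35)
drop = f_old - f_new  # absolute change in F-measure between the two collections
```

With `beta=1.0` this is the harmonic mean of precision and recall, the usual F1 score.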

