Focused Information Retrieval & English Language Instruction: A New Text Complexity Algorithm for Automatic Text Classification

Author(s):  
Trisevgeni Liontou
SCITECH Nepal ◽  
2018 ◽  
Vol 13 (1) ◽  
pp. 64-69
Author(s):  
Dinesh Dangol ◽  
Rupesh Dahi Shrestha ◽  
Arun Timalsina

With the growing trend of publishing news online, automatic text processing is becoming increasingly important. Automatic text classification has been a focus of researchers working in many languages for decades. A large body of research exists on the features of the English language and their use in automated text processing. This research implements key features of the Nepali language for automatic classification of Nepali news. In particular, studying the impact of Nepali-specific features, which differ greatly from those of English, is more challenging because of the higher level of complexity to be resolved. Experiments using a vector space model, an n-gram model, and key-feature-based processing specific to Nepali show promising results compared with a bag-of-words model for the task of automated Nepali news classification.
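The contrast between bag-of-words and n-gram features mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `word_ngrams` and `ngram_features` are hypothetical, whitespace tokenization is an assumption, and none of the Nepali-specific processing is shown.

```python
from collections import Counter


def word_ngrams(text, n):
    """Return the word n-grams of `text`, assuming whitespace tokenization."""
    tokens = text.split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_features(text, max_n=1):
    """Count all word n-grams for n = 1..max_n.

    With max_n=1 this is a plain bag-of-words count; larger values add
    word-order information that bag-of-words discards.
    """
    counts = Counter()
    for n in range(1, max_n + 1):
        counts.update(word_ngrams(text, n))
    return counts


if __name__ == "__main__":
    sample = "team wins final match"  # placeholder text, not data from the study
    print(ngram_features(sample, max_n=1))
    print(ngram_features(sample, max_n=2))
```

Each resulting count dictionary can serve as one document vector in a vector space model, with one dimension per distinct n-gram.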


2021 ◽  
Author(s):  
V. S. Martins ◽  
C. D. Silva

Automatic text classification represents a great improvement in legal workflows, mainly in the migration from physical to electronic lawsuits. A systematic review of studies on text classification in the legal domain from January 2017 to February 2020 was conducted. The search strategy identified 20 studies, which were analyzed and compared. The review addresses the following research questions: what the state-of-the-art language models are, how they are applied to text classification on English and Brazilian Portuguese legal datasets, whether language models trained on Brazilian Portuguese are available, and whether datasets from the Brazilian legal domain are available. It concludes that automatic text classification is being applied in Brazil, although there is a gap in the use of language models compared with studies on English-language datasets; that in-domain pre-training of language models is important for improving results; and that two studies make Brazilian Portuguese language models available and one introduces a dataset from the Brazilian legal domain.


Multifold growth in internet users, driven by the penetration of information and communication technology, has resulted in a huge volume of digital content on the internet. Though most of it is in English, other languages, including Indian languages, are rapidly catching up. With the exponential growth of internet users in India, ordinary users are also posting moderate amounts of data on the web, so e-content in Indian languages is growing in size. The high dimensionality of this e-content is a curse for information retrieval. Hence automatic text classification and structuring of this e-content has become a pressing need. Automatic text classification is the process of assigning a new test document to one or more predefined categories according to its contents. Text classification work for 14 Indian languages is reported in the literature. Marathi is one of the officially recognized languages of the Indian union, yet little work has been done on Marathi text classification. This paper investigates Marathi text classification using popular machine learning methods, namely Naïve Bayes, K-Nearest Neighbor, Support Vector Machine, Centroid Based and Modified KNN (MKNN), on manually extracted newspaper data from the sports domain. Our experimental results show that Naïve Bayes and Centroid Based give the best performance, with 99.166% micro- and macro-averaged F-score, while Modified KNN gives the lowest performance, with 97.16% micro-averaged F-score and 96.997% macro-averaged F-score. The proposed work will be helpful for the proper organization of Marathi text documents and for many applications of Marathi information retrieval.
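The difference between the micro- and macro-averaged F-scores reported above can be shown with a short sketch. It assumes single-label classification and the common definition in which the macro average is the mean of per-class F1 values; the abstract does not state which variant the authors used. The function name `micro_macro_f1` and the labels in the example are hypothetical.

```python
from collections import Counter


def micro_macro_f1(y_true, y_pred):
    """Return (micro_f1, macro_f1) for single-label predictions."""
    labels = sorted(set(y_true) | set(y_pred))
    if not labels:
        return 0.0, 0.0

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    def f1(t, p, n):
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the denominator is 0.
        denom = 2 * t + p + n
        return 2 * t / denom if denom else 0.0

    # Micro: pool the counts over all classes, then compute one F1.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    # Macro: compute F1 per class, then take the unweighted mean.
    macro = sum(f1(tp[c], fp[c], fn[c]) for c in labels) / len(labels)
    return micro, macro


if __name__ == "__main__":
    # Hypothetical labels for illustration only, not data from the paper.
    truth = ["cricket", "cricket", "cricket", "football"]
    preds = ["cricket", "cricket", "football", "football"]
    print(micro_macro_f1(truth, preds))
```

The micro average weights every document equally, so large classes dominate it; the macro average weights every class equally, which is why the two figures can diverge, as they do for Modified KNN above.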


ELT Journal ◽  
2019 ◽  
Vol 74 (1) ◽  
pp. 83-85 ◽  
Author(s):  
Paul J Meighan

Comment is a feature that allows contributors to express a personal, and sometimes controversial, view about a matter of current concern in the profession outside the format of a reviewed academic article. The views expressed are not necessarily those of the Editor or the Publisher. Reactions to Comment features, in the form of letters to the Editor, are especially welcome.


2019 ◽  
Vol 2 (1) ◽  
pp. 43-51
Author(s):  
Welliam Hamer ◽  
Ledy Nur Lely

This article shares information on how the Pictionary game can be used to increase learners' vocabulary mastery in the teaching and learning process. Vocabulary is one of the components of the English language. When learners read, they need to have mastered the vocabulary related to the topic at hand, so vocabulary is an important part of learning English. However, mastering English vocabulary is not easy. English is a foreign language, and learning it is often considered difficult. This problem can be seen in unsatisfactory results when learning English. The learning process commonly used in the classroom simply puts the teacher at the center of learning: the teacher dominates the teaching and does not focus on how the learners can learn effectively. This makes the learners passive and less interested in following the course. In fact, the learners' interest is the most important factor in their study. Interest can be developed if the learning process takes place in a fun, varied, and conducive atmosphere. Many factors can support improvement in study, i.e. teachers, learners, materials, media, methods, and other learning sources. One factor that can help learners learn vocabulary is the use of the Pictionary game. In this study, Pictionary is a classic game of drawing and guessing pictures. The game can also stimulate learners' imagination, as they are asked to draw according to the word given by the teacher. The things needed to play Pictionary are a list or set of cards of vocabulary items; a whiteboard, chalkboard, or smart board; and markers. The game helps learners get involved in classroom activities. A further advantage of using Pictionary is that it provides fun language practice across the various language skills.


2019 ◽  
Vol 7 ◽  
Author(s):  
Tomáš Hlava

In English language instruction in Slovakia, a strong preference for declarative knowledge at the expense of procedural knowledge development has been reported over the last two decades. However, the cognitive aspects of language attainment predict no impact of instructional efforts, since the mental representations of the language to be attained are said to be supported by cognitive systems different from those that associative learning develops. Language variation materializes differences among languages based on differences in digitalizing the experience and thus understanding the world. For Slovak learners, the English present perfect is one such anomaly in categorization. This paper aims to answer what the specific interactions between the past simple and the present perfect are and how the predicted cognitive aspects of language attainment influence the use of different types of knowledge. A proficiency test focusing on declarative knowledge and on language use without context and in context was distributed to 600 Slovak learners of English at the ISCED3a level. In past simple conditions, students proved highly proficient in all 3 types of tasks. In present perfect conditions, declarative knowledge strongly dominated over language use in context. In present perfect conditions, substitutions by past simple were significantly more frequent than substitutions of present perfect by past simple. Cognitive funneling was recognized as a process inhibiting fast proceduralization of the English present perfect, compared with the fast and reliable proceduralization of the past simple.

