A Comparative Study of Deep and Shallow Parsing Approaches to Automated Grammaticality Evaluation

2020 ◽  
Vol 27 (1) ◽  
Author(s):  
MK Aregbesola ◽  
RA Ganiyu ◽  
SO Olabiyisi ◽  
EO Omidiora

Automated grammar evaluation of natural language texts has attracted significant interest in the natural language processing community. It is the examination of natural language text for grammatical accuracy using computer software. The current work is a comparative study of the deep and shallow parsing techniques that have been applied to lexical analysis and grammaticality evaluation of natural language texts. The comparative analysis was based on data gathered from numerous related works. Shallow parsing using induced grammars was examined first, along with its two main subcategories: probabilistic statistical parsers and the connectionist approach using neural networks. Deep parsing using handcrafted grammars was examined next, along with several of its subcategories, including Transformational Grammars, Feature Based Grammars, Lexical Functional Grammar (LFG), Definite Clause Grammar (DCG), Property Grammar (PG), Categorial Grammar (CG), Generalized Phrase Structure Grammar (GPSG), and Head-driven Phrase Structure Grammar (HPSG). Based on facts gathered from the literature on these formalisms, a comparative analysis of the deep and shallow parsing techniques was performed. The analysis showed, among other things, that the shallow parsing approach was usually domain dependent, was influenced by sentence length and lexical frequency, and employed machine learning to induce grammar rules, whereas the deep parsing approach was not domain dependent, was not influenced by sentence length or lexical frequency, and made use of a well-specified set of precise linguistic rules. The deep parsing techniques proved to be more labour intensive, while the induced grammar approaches were usually faster, and their reliability increased with the size, accuracy and coverage of the training data. The shallow parsing approach has gained immense popularity owing to the availability of large corpora for different languages, and has therefore become the most widely accepted and adopted approach in recent times.
Keywords: Grammaticality, Natural language processing, Deep parsing, Shallow parsing, Handcrafted grammar, Precision grammar, Induced grammar, Automated scoring, Computational linguistics, Comparative study.
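
To make the "handcrafted grammar" side of the comparison concrete, the following is a minimal sketch of a grammaticality check driven by explicit rules. It is not taken from the paper: the toy grammar, the lexicon, and the function name `is_grammatical` are all hypothetical, and the CYK recognizer is only one of many ways such rules could be applied.

```python
from itertools import product

# Hypothetical toy grammar in Chomsky normal form (an assumption for
# illustration; it is not one of the formalisms surveyed in the paper).
BINARY_RULES = {
    ("NP", "VP"): {"S"},
    ("Det", "N"): {"NP"},
    ("V", "NP"): {"VP"},
}
LEXICON = {
    "the": {"Det"},
    "a": {"Det"},
    "dog": {"N"},
    "cat": {"N"},
    "sees": {"V"},
    "chases": {"V"},
}


def is_grammatical(sentence: str) -> bool:
    """CYK recognition: True if the toy grammar derives the sentence from S."""
    words = sentence.lower().split()
    n = len(words)
    if n == 0:
        return False
    # table[i][j] holds every category that can span words[i..j] inclusive.
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, word in enumerate(words):
        table[i][i] = set(LEXICON.get(word, set()))
    for span in range(2, n + 1):
        for start in range(n - span + 1):
            end = start + span - 1
            for split in range(start, end):
                pairs = product(table[start][split], table[split + 1][end])
                for left, right in pairs:
                    table[start][end] |= BINARY_RULES.get((left, right), set())
    return "S" in table[0][n - 1]
```

Under this sketch a sentence is accepted only if the hand-written rules license it, whatever its length or the frequency of its words, which matches the property the abstract attributes to deep parsing. The cost is that every rule and lexical entry has to be written by hand.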

Author(s):  
Rashida Ali ◽  
Ibrahim Rampurawala ◽  
Mayuri Wandhe ◽  
Ruchika Shrikhande ◽  
Arpita Bhatkar

The Internet provides a medium for connecting individuals with similar or different interests, creating a hub. Because a large number of people participate on these platforms, a user can receive a high volume of messages from different individuals, many of them unwanted, which creates chaos. These messages sometimes contain true information and sometimes false information, which confuses users and is the first step towards spam messaging. A spam message is an irrelevant, unsolicited message sent by a known or unknown user, and it may create a sense of insecurity among users. In this paper, different machine learning algorithms were trained and tested, using natural language processing (NLP), to classify messages as spam or ham.
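
The abstract does not say which algorithms were compared, so the following is only a sketch of one common baseline for spam/ham classification: multinomial Naive Bayes with add-one smoothing. The training messages, the whitespace tokenizer, and the function names are hypothetical.

```python
import math
from collections import Counter

# Hypothetical training data, invented for illustration only.
TRAINING = [
    ("win a free prize now", "spam"),
    ("free cash offer click now", "spam"),
    ("are we meeting for lunch today", "ham"),
    ("see you at the meeting tomorrow", "ham"),
]


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def train(messages):
    """Count words per label; messages is a list of (text, label) pairs."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    label_counts = Counter()
    for text, label in messages:
        label_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    return word_counts, label_counts, vocab


def classify(text, model):
    """Return the label with the higher log-posterior under Naive Bayes."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    scores = {}
    for label in ("spam", "ham"):
        score = math.log(label_counts[label] / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokenize(text):
            if tok in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][tok] + 1) / denom)
        scores[label] = score
    return max(scores, key=scores.get)


model = train(TRAINING)
```

With this toy data, `classify("free prize offer", model)` favours the spam label because those words occur only in the spam examples. A real study would need a far larger labelled corpus and a held-out test set.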


Author(s):  
Goran Klepac ◽  
Marko Velić

This chapter covers natural language processing techniques and their application in the development of predictive models. Two case studies are presented. The first describes a project in which textual descriptions of various situations in the call center of a telecommunication company were processed in order to predict churn. The second describes sentiment analysis of business news and discusses practical and testing issues in text mining projects. The two case studies take different approaches and are implemented in different tools. The texts processed in these projects are in Croatian, which belongs to the Slavic group of languages and has more complex morphology and grammar rules than English. The chapter concludes with several points on possible future research in this domain.


2021 ◽  
Author(s):  
Masoom Raza ◽  
Aditee Patil ◽  
Mangesh Bedekar ◽  
Rashmi Phalnikar ◽  
Bhavana Tiple

Ontologies are largely responsible for creating a framework or taxonomy for a particular domain that represents shared knowledge, concepts, and how these concepts are related to each other. This paper shows the use of ontologies to compare the syllabus structures of universities. This is done by extracting the syllabus, creating an ontology to represent it, then parsing the ontology and applying natural language processing to remove unwanted information. Once the appropriate ontologies are obtained, a comparative study is made of them. For convenience, the extracted syllabus is restricted to the subject “Software Engineering”. This depicts the collection and management of ontology knowledge and its processing in the right manner to obtain the desired insights.


2013 ◽  
Vol 274 ◽  
pp. 359-362
Author(s):  
Shuang Zhang ◽  
Shi Xiong Zhang

Abstract. Shallow parsing is a relatively new strategy that has emerged in natural language processing in recent years. It does not focus on obtaining a full parse tree; it only requires the recognition of certain simple components of the sentence structure. It separates parsing into two subtasks: the recognition and analysis of chunks, and the analysis of the relationships among chunks. In this essay, some applied techniques of shallow parsing are introduced and a new method is tested experimentally.
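
The two subtasks described above can be sketched as follows. This is an illustration only, not the method proposed in the paper: it assumes the input is already part-of-speech tagged with a small hypothetical tag set (`DET`, `ADJ`, `NOUN`, `VERB`), and the chunk patterns and the function names `chunk` and `relate` are assumptions.

```python
def chunk(tagged):
    """Subtask 1: group (word, tag) pairs into NP and VP chunks.

    Tokens that fit neither pattern become single-word 'O' chunks.
    """
    chunks = []
    i, n = 0, len(tagged)
    while i < n:
        # NP pattern: optional determiner, any adjectives, one or more nouns.
        j = i
        if tagged[j][1] == "DET":
            j += 1
        while j < n and tagged[j][1] == "ADJ":
            j += 1
        k = j
        while k < n and tagged[k][1] == "NOUN":
            k += 1
        if k > j:
            chunks.append(("NP", [word for word, _ in tagged[i:k]]))
            i = k
            continue
        # VP pattern: one or more consecutive verbs.
        j = i
        while j < n and tagged[j][1] == "VERB":
            j += 1
        if j > i:
            chunks.append(("VP", [word for word, _ in tagged[i:j]]))
            i = j
            continue
        chunks.append(("O", [tagged[i][0]]))
        i += 1
    return chunks


def relate(chunks):
    """Subtask 2: link adjacent NP VP NP chunks as (subject, verb, object)."""
    triples = []
    for a, b, c in zip(chunks, chunks[1:], chunks[2:]):
        if (a[0], b[0], c[0]) == ("NP", "VP", "NP"):
            triples.append((" ".join(a[1]), " ".join(b[1]), " ".join(c[1])))
    return triples
```

No full parse tree is built at any point. The chunker finds flat phrases in one left-to-right pass, and the relation step looks only at neighbouring chunks, which is why shallow parsing tends to be fast but captures less structure than a deep parser.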


Author(s):  
Sa Wang ◽  
Hui Xu

In view of the lack of intelligent guidance in the online teaching of English composition, this paper proposes an intelligent support system for English writing based on B/S (browser/server) mode. On the basis of vocabulary, grammar rules and other corpora, the system uses natural language processing technology, combining rule matching with probability statistics, to evaluate and optimize the efficiency of the composition. The empirical results show that the system can effectively improve the teaching direction according to the results of intelligent quantitative analysis.


2015 ◽  
Vol 1 (1) ◽  
pp. 198-205
Author(s):  
Daniela Gîfu ◽  
Marius Cioca

Abstract. The paper presents the importance of analysing isotopes in anonymous readers’ comments as an important part of the deep interpretation of texts. Furthermore, we describe a methodology for classifying anonymous readers’ comments on online articles through the overlapping of isotopes, which complements traditional analytical methods. Automatic recognition of isotopes is an important topic in Natural Language Processing (NLP), especially in semantic disambiguation. The aim of this article is the automatic comparative analysis of the isotopes identified in articles and comments, which reveals an important part of online behavior. Moreover, we present a new tool that classifies online commentators based on existing resources that are open-source or freely available for research purposes. This study is intended to help direct beneficiaries (journalists, business, education, managers, PR specialists), as well as specialists and researchers in the field of natural language processing, linguists, psychologists, etc.


2021 ◽  
Vol 12 ◽  
Author(s):  
Changcheng Wu ◽  
Junyi Li ◽  
Ye Zhang ◽  
Chunmei Lan ◽  
Kaiji Zhou ◽  
...  

Nowadays, most courses on massive open online course (MOOC) platforms are xMOOCs, which are based on the traditional instruction-driven principle. The course lecture is still the key component of the course. Thus, analyzing the lectures of xMOOC instructors would help to evaluate course quality and provide feedback to instructors and researchers. The current study aimed to portray the lecture styles of instructors in MOOCs from the perspective of natural language processing. Specifically, 129 course transcripts were downloaded from two major MOOC platforms. Two semantic analysis tools (Linguistic Inquiry and Word Count, and Coh-Metrix) were used to extract semantic features including self-reference, tone, effect, cognitive words, cohesion, complex words, and sentence length. On the basis of student comments, course video review, and the results of cluster analysis, we found four different lecture styles: “perfect,” “communicative,” “balanced,” and “serious.” Significant differences were found between the lecture styles within different disciplines for note taking, discussion posts, and overall course satisfaction. Future studies could use fine-grained log data to verify the results of our study and explore how to use the results of natural language processing to improve instructors’ lectures in both MOOCs and traditional classes.

