An Interactive Tutoring System for Learning Language Processing and Compiler Design

Machine learning language models are combinations of algorithms and neural networks designed for text processing composed in natural language (Natural Language Processing, NLP). In 2020, the largest language model from the artificial intelligence research company OpenAI, GPT-3, was released, the maximum number of parameters of which reaches 175 billion. The parameterization of the model increased by more than 100 times made it possible to improve the quality of generated texts to a level that is hard to distinguish from human-written texts. It is noteworthy that this model was trained on a training dataset mainly collected from open sources on the Internet, the volume of which is estimated at 570 GB. This article discusses the problem of memorizing critical information, in particular, personal data of individual, at the stage of training large language models (GPT-2/3 and derivatives), and also describes an algorithmic approach to solving this problem, which consists in additional preprocessing training dataset and refinement of the model inference in the context of generating pseudo-personal data and embedding into the results of work on the tasks of summarization, text generation, formation of answers to questions and others from the field of seq2seq.

Download Full-text

Detection of Gene Interactions Based on Syntactic Relations

Journal of Biomedicine and Biotechnology ◽

10.1155/2008/371710 ◽

2008 ◽

Vol 2008 ◽

pp. 1-9 ◽

Cited By ~ 1

Author(s):

Mi-Young Kim

Keyword(s):

Language Processing ◽

Training Dataset ◽

Phase Method ◽

Gene Interactions ◽

Second Phase ◽

Good Precision ◽

Linguistic Information ◽

Three Phase ◽

Syntactic Relations ◽

Learning Language

Interactions between proteins and genes are considered essential in the description of biomolecular phenomena, and networks of interactions are applied in a system's biology approach. Recently, many studies have sought to extract information from biomolecular text using natural language processing technology. Previous studies have asserted that linguistic information is useful for improving the detection of gene interactions. In particular, syntactic relations among linguistic information are good for detecting gene interactions. However, previous systems give a reasonably good precision but poor recall. To improve recall without sacrificing precision, this paper proposes a three-phase method for detecting gene interactions based on syntactic relations. In the first phase, we retrieve syntactic encapsulation categories for each candidate agent and target. In the second phase, we construct a verb list that indicates the nature of the interaction between pairs of genes. In the last phase, we determine direction rules to detect which of two genes is the agent or target. Even without biomolecular knowledge, our method performs reasonably well using a small training dataset. While the first phase contributes to improve recall, the second and third phases contribute to improve precision. In the experimental results using ICML 05 Workshop on Learning Language in Logic (LLL05) data, our proposed method gave an F-measure of 67.2% for the test data, significantly outperforming previous methods. We also describe the contribution of each phase to the performance.

Download Full-text

An interactive tutoring system for engineering mathematics

1996 IEEE International Conference on Multi Media Engineering Education. Conference Proceedings ◽

10.1109/mmee.1996.570301 ◽

2002 ◽

Author(s):

I. Ng

Keyword(s):

Tutoring System ◽

Engineering Mathematics ◽

Interactive Tutoring

Download Full-text

Subsentence Extraction from Text Using Coverage-Based Deep Learning Language Models

Sensors ◽

10.3390/s21082712 ◽

2021 ◽

Vol 21 (8) ◽

pp. 2712

Author(s):

JongYoon Lim ◽

Inkyu Sa ◽

Ho Seok Ahn ◽

Norina Gasteiger ◽

Sanghyub John Lee ◽

...

Keyword(s):

Deep Learning ◽

Language Processing ◽

Auxiliary Information ◽

Language Models ◽

Research Fields ◽

Software Packages ◽

Public Dataset ◽

High Degree ◽

Learning Language ◽

Important Building Block

Sentiment prediction remains a challenging and unresolved task in various research fields, including psychology, neuroscience, and computer science. This stems from its high degree of subjectivity and limited input sources that can effectively capture the actual sentiment. This can be even more challenging with only text-based input. Meanwhile, the rise of deep learning and an unprecedented large volume of data have paved the way for artificial intelligence to perform impressively accurate predictions or even human-level reasoning. Drawing inspiration from this, we propose a coverage-based sentiment and subsentence extraction system that estimates a span of input text and recursively feeds this information back to the networks. The predicted subsentence consists of auxiliary information expressing a sentiment. This is an important building block for enabling vivid and epic sentiment delivery (within the scope of this paper) and for other natural language processing tasks such as text summarisation and Q&A. Our approach outperforms the state-of-the-art approaches by a large margin in subsentence prediction (i.e., Average Jaccard scores from 0.72 to 0.89). For the evaluation, we designed rigorous experiments consisting of 24 ablation studies. Finally, our learned lessons are returned to the community by sharing software packages and a public dataset that can reproduce the results presented in this paper.

Download Full-text

Self - Adaptive and Interactive Tutoring System(SA-ITS) for Mathematics Learners

Journal of Xidian University ◽

10.37896/jxu14.5/189 ◽

2020 ◽

Vol 14 (5) ◽

Keyword(s):

Tutoring System ◽

Interactive Tutoring ◽

Self Adaptive

Download Full-text

Natural language processing in an intelligent writing strategy tutoring system

Behavior Research Methods ◽

10.3758/s13428-012-0258-1 ◽

2012 ◽

Vol 45 (2) ◽

pp. 499-515 ◽

Cited By ~ 56

Author(s):

Danielle S. McNamara ◽

Scott A. Crossley ◽

Rod Roscoe

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Tutoring System ◽

Writing Strategy

Download Full-text

Intelligent Language Tutoring System

International Journal of Information and Communication Technology Education ◽

10.4018/ijicte.2019070105 ◽

2019 ◽

Vol 15 (3) ◽

pp. 60-74 ◽

Cited By ~ 2

Author(s):

Dara Tafazoli ◽

Elena Gómez María ◽

Cristina A. Huertas Abril

Keyword(s):

Second Language Acquisition ◽

Language Learning ◽

Language Processing ◽

Teaching And Learning ◽

Intelligent Tutoring ◽

Intelligent Tutoring System ◽

Language Education ◽

Computer Assisted ◽

Tutoring System ◽

Tutoring Systems

Intelligent computer-assisted language learning (ICALL) is a multidisciplinary area of research that combines natural language processing (NLP), intelligent tutoring system (ITS), second language acquisition (SLA), and foreign language teaching and learning (FLTL). Intelligent tutoring systems (ITS) are able to provide a personalized approach to learning by assuming the role of a real teacher/expert who adapts and steers the learning process according to the specific needs of each learner. This article reviews and discusses the issues surrounding the development and use of ITSs for language learning and teaching. First, the authors look at ICALL history: its evolution from CALL. Second, issues in ICALL research and integration will be discussed. Third, they will explain how artificial intelligence (AI) techniques are being implemented in language education as ITS and intelligent language tutoring systems (ITLS). Finally, the successful integration and development of ITLS will be explained in detail.

Download Full-text

An Intelligent Interactive Tutoring System For An Electric Circuits Course

10.18260/1-2--1262 ◽

2020 ◽

Author(s):

Saroj Biswas ◽

Musoke Sendaula ◽

Sesha Yeruva ◽

Krishana Priya Sannidhi ◽

Ravi Shankar Dwivedula

Keyword(s):

Tutoring System ◽

Electric Circuits ◽

Interactive Tutoring

Download Full-text