Question Type Classification Using a Part-of-Speech Hierarchy

Author(s):  
Richard Khoury
2017 ◽  
Vol 5 ◽  
pp. 163-177 ◽  
Author(s):  
Zhiyang Teng ◽  
Yue Zhang

Sequential LSTMs have been extended to model tree structures, giving competitive results for a number of tasks. Existing methods model constituent trees by bottom-up combinations of constituent nodes, making direct use of input word information only for leaf nodes. This is different from sequential LSTMs, which contain references to input words for each node. In this paper, we propose a method for automatic head-lexicalization for tree-structure LSTMs, propagating head words from leaf nodes to every constituent node. In addition, enabled by head lexicalization, we build a tree LSTM in the top-down direction, which corresponds to bidirectional sequential LSTMs in structure. Experiments show that both extensions give better representations of tree structures. Our final model gives the best results on the Stanford Sentiment Treebank and highly competitive results on the TREC question type classification task.


Terminology ◽  
2004 ◽  
Vol 10 (1) ◽  
pp. 153-168 ◽  
Author(s):  
Noriko Tomuro

Question terminology is a set of terms which appear in keywords, idioms and fixed expressions commonly observed in questions. This paper investigates ways to automatically extract question terminology from a corpus of questions and represent them for the purpose of classifying by question type. Our key interest is to see whether or not semantic features can enhance the representation of strongly lexical nature of question sentences. We compare two feature sets: one with lexical features only, and another with a mixture of lexical and semantic features. For evaluation, we measure the classification accuracy made by two machine learning algorithms, C5.0 and PEBLS, by using a procedure called domain cross-validation, which effectively measures the domain transferability of features.
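The abstract does not spell out how domain cross-validation works. One plausible reading, assumed here, is a leave-one-domain-out loop: train on questions from all domains but one, test on the held-out domain, and repeat for every domain. The sketch below uses a toy first-word classifier and made-up domains and question types; none of these come from the paper.

```python
from collections import Counter, defaultdict


def train_first_word_classifier(examples):
    """Toy lexical classifier (illustrative only): first word -> most frequent type."""
    counts = defaultdict(Counter)
    for question, qtype in examples:
        counts[question.lower().split()[0]][qtype] += 1
    # Fall back to the most frequent type in the training data for unseen first words.
    fallback = Counter(qtype for _, qtype in examples).most_common(1)[0][0]
    table = {word: c.most_common(1)[0][0] for word, c in counts.items()}
    return lambda q: table.get(q.lower().split()[0], fallback)


def domain_cross_validation(data_by_domain, train_fn):
    """Hold out one domain at a time, train on the rest, return accuracy per held-out domain."""
    scores = {}
    for held_out, test_set in data_by_domain.items():
        train_set = [ex for d, exs in data_by_domain.items() if d != held_out for ex in exs]
        predict = train_fn(train_set)
        correct = sum(predict(q) == t for q, t in test_set)
        scores[held_out] = correct / len(test_set)
    return scores


# Hypothetical data: domains, questions and type labels are invented for this sketch.
data = {
    "cooking": [("How do I poach an egg?", "procedure"), ("What is a roux?", "definition")],
    "travel": [("How do I renew a passport?", "procedure"), ("What is a visa?", "definition")],
    "health": [("How do I treat a burn?", "procedure"), ("Why do we sneeze?", "reason")],
}
scores = domain_cross_validation(data, train_first_word_classifier)
```

A feature set that transfers well should score evenly across held-out domains. Here the "health" domain scores lower because the cue word "why" never appears in the other two domains.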


2017 ◽  
Vol 56 (03) ◽  
pp. 209-216 ◽  
Author(s):  
Said Ouatik El Alaoui ◽  
Mourad Sarrouti

Summary
Background and Objective: Biomedical question type classification is one of the important components of an automatic biomedical question answering system. The performance of the latter depends directly on the performance of its biomedical question type classification system, which assigns a category to each question in order to determine the appropriate answer extraction algorithm. This study aims to automatically classify biomedical questions into one of four categories: (1) yes/no, (2) factoid, (3) list, and (4) summary.
Methods: In this paper, we propose a biomedical question type classification method based on machine learning approaches to automatically assign a category to a biomedical question. First, we extract features from biomedical questions using the proposed handcrafted lexico-syntactic patterns. Then, we feed these features to machine-learning algorithms. Finally, the class label is predicted using the trained classifiers.
Results: Experimental evaluations performed on large standard annotated datasets of biomedical questions, provided by the BioASQ challenge, demonstrated that our method exhibits significantly improved performance when compared to four baseline systems. The proposed method achieves a roughly 10-point increase over the best baseline in terms of accuracy. Moreover, the obtained results show that using handcrafted lexico-syntactic patterns as the feature provider for a support vector machine (SVM) leads to the highest accuracy of 89.40%.
Conclusion: The proposed method can automatically classify BioASQ questions into one of the four categories: yes/no, factoid, list, and summary. Furthermore, the results demonstrated that our method produced the best classification performance compared to four baseline systems.
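The abstract does not list the handcrafted lexico-syntactic patterns themselves. As a minimal sketch of the feature-extraction step it describes, the code below matches a few hypothetical surface patterns against a question and emits a binary feature vector that could then be passed to a classifier such as an SVM. The pattern names and regular expressions are assumptions made for illustration, not the paper's patterns.

```python
import re

# Hypothetical patterns for illustration only; the paper's actual patterns are not
# given in the abstract.
PATTERNS = {
    "starts_with_auxiliary": re.compile(r"^(is|are|does|do|can|has|have|was|were)\b", re.I),
    "starts_with_wh_word": re.compile(r"^(what|which|who|where|when)\b", re.I),
    "asks_for_plural_list": re.compile(r"^(list|which are|what are)\b", re.I),
    "asks_for_explanation": re.compile(r"^(how|why|describe|explain)\b", re.I),
}
FEATURE_ORDER = sorted(PATTERNS)


def extract_features(question):
    """Return a dict of binary features, one per pattern."""
    q = question.strip()
    return {name: int(bool(pattern.search(q))) for name, pattern in PATTERNS.items()}


def feature_vector(question):
    """Return the features as a list in a fixed order, ready for a classifier."""
    feats = extract_features(question)
    return [feats[name] for name in FEATURE_ORDER]


yes_no = extract_features("Is imatinib used to treat leukemia?")
listing = extract_features("What are the symptoms of Marfan syndrome?")
```

In this sketch a yes/no question fires only the auxiliary-verb pattern, while a list question fires both the wh-word and the plural-list patterns. A trained classifier would learn how such combinations map to the four categories.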


Author(s):  
June-Jei Kuo ◽  
Kuei-Kuang Lin ◽  
Hsin-Hsi Chen ◽  
Cheng-Hsuan Kao ◽  
Bor-Shen Lin

Author(s):  
Bo Shao ◽  
Yeyun Gong ◽  
Junwei Bao ◽  
Jianshu Ji ◽  
Guihong Cao ◽  
...  

Semantic parsing is a challenging and important task which aims to convert a natural language sentence to a logical form. Existing neural semantic parsing methods mainly use <question, logical form> (Q-L) pairs to train a sequence-to-sequence model. However, the amount of existing Q-L labeled data is limited and hard to obtain. We propose an effective method which substantially utilizes labeling information from other tasks to enhance the training of a semantic parser. We design a multi-task learning model to train question type classification and entity mention detection together with question semantic parsing using a shared encoder. We propose a weakly supervised learning method to enhance our multi-task learning model with paraphrase data, based on the idea that the paraphrased questions should have the same logical form and question type information. Finally, we integrate the weakly supervised multi-task learning method into an encoder-decoder framework. Experiments on a newly constructed dataset and ComplexWebQuestions show that our proposed method outperforms state-of-the-art methods, which demonstrates the effectiveness and robustness of our method.


1969 ◽  
Vol 8 (02) ◽  
pp. 84-90 ◽  
Author(s):  
A. W. Pratt ◽  
M. Pacak

The system for the identification and subsequent transformation of terminal morphemes in medical English is a part of the information system for processing pathology data which was developed at the National Institutes of Health. The recognition and transformation of terminal morphemes is restricted to classes of adjectivals including the -ING and -ED forms, nominals and homographic adjective/noun forms. The adjective-to-noun and noun-to-noun transforms consist basically of a set of substitutions of adjectival and certain nominal suffixes by a set of suffixes which indicate the corresponding nominal form(s). The adjectival/nominal suffix has a polymorphosyntactic transformational function if it has the property of being transformed into more than one nominalizing suffix (e.g., the adjectival suffix -IC can be substituted by a set of nominalizing suffixes -Ø, -A, -E, -Y, -IS, -IA, -ICS): the adjectival suffix has a monomorphosyntactic transformational property if there is only one admissible transform (e.g., -CIC → -X). The morphological segmentation and the subsequent transformations are based on the following principles:
a. The word form is segmented according to the principle of »double consonant cut,« i.e., terminal characters following the last set of double consonants are analyzed and treated as a potential suffix. For practical purposes only such terminal suffixes of a maximum length of four have been analyzed.
b. The principle that the largest segment of a word form common to both adjective and noun or to both noun stems is retained as a word base for transformational operations, and the non-identical segment is considered to be a »suffix.«
The backward right-to-left character search is initiated by the identification of the terminal grapheme of the given word form and is extended to certain admissible sequences of immediately preceding graphemes. The nodes which represent fixed sequences of graphemes are labeled according to their recognition and/or transformation properties. The tree nodes are divided into two groups:
a. productive or activated
b. non-productive or non-activated
The productive (activated) nodes are sequences of sets of graphemes which possess certain properties, such as the indication about part-of-speech class membership, the transformation properties, or both. The non-productive (non-activated) nodes have the function of connectors, i.e., they specify the admissible path to the productive nodes. The computer program for the identification and transformation of the terminal morphemes is open-ended and is already operational. It will be extended to other sub-fields of medicine in the near future.
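The »double consonant cut« can be sketched in a few lines, under stated assumptions. The abstract does not define "double consonants" precisely; this sketch assumes it means two adjacent consonant letters and treats `y` as a vowel. The function name and the vowel set are illustrative and not taken from the original program.

```python
# Assumption: 'y' is treated as a vowel; the original system's letter classes are not
# given in the abstract.
VOWELS = set("aeiouy")


def double_consonant_cut(word, max_suffix_len=4):
    """Split a word after its last pair of adjacent consonants.

    Returns (base, suffix). The suffix is empty when no pair is found, when the pair
    ends the word, or when the candidate suffix is longer than max_suffix_len.
    """
    w = word.lower()
    # Scan right to left for the last pair of adjacent consonant letters.
    for i in range(len(w) - 1, 0, -1):
        if w[i] not in VOWELS and w[i - 1] not in VOWELS:
            base, suffix = w[: i + 1], w[i + 1:]
            if 0 < len(suffix) <= max_suffix_len:
                return base, suffix
            return w, ""
    return w, ""


cut_gastric = double_consonant_cut("gastric")
cut_hepatic = double_consonant_cut("hepatic")
```

Under these assumptions "gastric" is cut after "str", leaving the potential suffix "ic", while "hepatic" has no adjacent consonant pair and is returned unsegmented. The real system would then look the candidate suffix up in its tree of grapheme sequences to decide whether a transformation applies.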


2020 ◽  
pp. 1-12
Author(s):  
Li Dongmei

English text-to-speech conversion is the key content of modern computer technology research. Its difficulty is that there are large errors in the conversion process of text-to-speech feature recognition, and it is difficult to apply the English text-to-speech conversion algorithm to the system. In order to improve the efficiency of the English text-to-speech conversion, based on the machine learning algorithm, after the original voice waveform is labeled with the pitch, this article modifies the rhythm through PSOLA, and uses the C4.5 algorithm to train a decision tree for judging pronunciation of polyphones. In order to evaluate the performance of pronunciation discrimination method based on part-of-speech rules and HMM-based prosody hierarchy prediction in speech synthesis systems, this study constructed a system model. In addition, the waveform stitching method and PSOLA are used to synthesize the sound. For words whose main stress cannot be discriminated by morphological structure, label learning can be done by machine learning methods. Finally, this study evaluates and analyzes the performance of the algorithm through control experiments. The results show that the algorithm proposed in this paper has good performance and has a certain practical effect.
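C4.5 chooses the attribute to split on by gain ratio, that is, information gain divided by the entropy of the split itself. The sketch below computes that criterion on a tiny invented dataset in which the part of speech decides how an ambiguous word is pronounced. The rows, the feature names and the example word are assumptions for illustration and do not come from the article.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total) for n in Counter(values).values())


def gain_ratio(rows, feature, label):
    """Gain ratio of splitting `rows` on `feature` with respect to `label`."""
    groups = {}
    for row in rows:
        groups.setdefault(row[feature], []).append(row[label])
    # Weighted entropy of the label after the split.
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    info_gain = entropy([row[label] for row in rows]) - remainder
    split_info = entropy([row[feature] for row in rows])
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical training rows for the ambiguous word "record".
rows = [
    {"pos": "noun", "sentence_final": True, "pron": "REcord"},
    {"pos": "noun", "sentence_final": False, "pron": "REcord"},
    {"pos": "verb", "sentence_final": True, "pron": "reCORD"},
    {"pos": "verb", "sentence_final": False, "pron": "reCORD"},
]
ratio_pos = gain_ratio(rows, "pos", "pron")
ratio_final = gain_ratio(rows, "sentence_final", "pron")
```

In this toy data the part-of-speech attribute separates the two pronunciations perfectly, so it gets the higher gain ratio and would be chosen as the root split. The sentence-position attribute carries no information about the pronunciation.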


Author(s):  
Yabing Zhang

This article is devoted to the problem of using Russian time-prepositions by foreigners, especially by the Chinese. An analysis of modern literature allows the author to identify the main areas of the work aimed at foreign students’ development of the skills and abilities to correctly build the prepositional combinations and continuously improve the communication skills by means of the Russian language. In this paper, the time-prepositions in the Russian language have been analyzed in detail; some examples of polysemantic use of prepositions, their semantic and stylistic shades alongside with possible errors made by foreign students are presented. The results of the study are intended to help in developing a system of teaching Russian time-prepositions to a foreign language audience, taking into account their native language, on a systemic and functional, communicative and activity-centred basis. The role of Russian time-prepositions in constructing word combinations has been identified; the need for foreign students’ close attention to this secondary part of speech has been specified. It has been stated that prepositions are the most dynamic and open type of secondary language units within the quantitative and qualitative composition of which regular changes take place. The research substantiates the need that students should be aware of the function of time-preposition in speech; they are to get acquainted with the main time-prepositions and their meanings, to distinguish prepositions and other homonymous parts of speech as well as to learn stylistic shades of time-prepositions.
Some recommendations related to the means of mastering time-prepositions have been given: to target speakers to assimilate modern literary norms and, therefore, to teach them how to choose and use them correctly by means of linguistic keys that are intended to fill the word with true meaning, to give it an organic structure, an inherent form and an easy combinability in the texts and oral speech.

