A Survey of Existing Question Answering Techniques for Indian Languages

Searching is one of the most important ways for any user to find information on the internet. ‘Syntactic search’ relies on keyword-based matching, and its accuracy is improved by applying filters such as location, preference, and user history. However, the user's query or question and the best available answer on the internet may have no terms in common, or only a negligible number. In such cases syntactic search cannot give the desired output, and ‘semantic search’ becomes essential. Semantic search is difficult to implement when resources such as WordNet, ontologies, and annotated data are unavailable. This work describes an end-to-end algorithm to improve the accuracy of semantic search. Four classification techniques are used: ANN, Decision Tree, SVM, and Naïve Bayes. The dataset comes from the TDIL project of the Ministry of Electronics and IT, Govt. of India; the repository contains 86 categories of text with more than a million sentences. After impressive results were obtained for Bengali, test runs were carried out for other Indian languages and also gave very good results. This research is useful for automatic question answering systems, semantic similarity analysis, e-governance, and m-governance.
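
The failure mode described above, where a query and its best answer share no terms, can be illustrated with a minimal sketch. It assumes a plain Jaccard term overlap as the keyword-matching score; the function names and the example sentences are hypothetical and do not come from the paper.

```python
import re

def tokens(text: str) -> set:
    """Lowercase the text and return its set of word tokens."""
    return set(re.findall(r"\w+", text.lower()))

def keyword_overlap(query: str, document: str) -> float:
    """Jaccard overlap between query terms and document terms (0.0 to 1.0)."""
    q, d = tokens(query), tokens(document)
    if not q or not d:
        return 0.0
    return len(q & d) / len(q | d)

# Hypothetical example: the relevant answer shares no words with the query,
# while an irrelevant document shares several.
query = "cheap places to stay"
relevant = "budget hotels and affordable lodging"
irrelevant = "places to buy cheap phones"

print(keyword_overlap(query, relevant))    # no shared terms
print(keyword_overlap(query, irrelevant))  # several shared terms
```

Under this score the irrelevant document outranks the relevant one, which is the gap a semantic approach is meant to close.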

A Hybrid Bootstrapping Approach for developing Odiya Named Entity Corpora from Wikipedia

International Journal of Engineering & Technology ◽

10.14419/ijet.v7i4.38.24311 ◽

2018 ◽

Vol 7 (4.38) ◽

pp. 11

Author(s):

Sitanath Biswas ◽

Sujata Dash

Keyword(s):

Language Processing ◽

Question Answering ◽

Promising Result ◽

Named Entity Recognition ◽

Entity Recognition ◽

Training Dataset ◽

Indian Languages ◽

Named Entity ◽

Proper Nouns ◽

Different Types

Named Entity Recognition (NER) is considered a very influential task in natural language processing, relevant to Question Answering systems, Machine Translation (MT), Information Extraction (IE), Information Retrieval (IR), etc. NER identifies and classifies the different types of proper nouns present in a given file, such as location names, person names, numbers, organization names, and times. Although a great deal of progress has been made for various Indian languages, NER is still a major problem for the Odiya language. Odiya is a resource-constrained language, and to date it remains very difficult to find a large and accurate corpus for training and testing. In this paper, we therefore use Wikipedia to develop a large Odiya corpus of annotated named entities that is suitable for use as a training dataset. On evaluation, we obtained a very promising result, with an F-score of 78.89.
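
For readers unfamiliar with the metric, an F-score for NER is commonly computed from precision and recall over predicted entities. The sketch below assumes exact-match scoring on (start, end, type) triples; the abstract does not state which matching scheme the authors used, and the entity data here is invented for illustration.

```python
def f_score(gold: set, predicted: set) -> tuple:
    """Return (precision, recall, F1) for exact-match entity predictions."""
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Hypothetical entities as (start_token, end_token, type) triples.
gold = {(0, 1, "PERSON"), (4, 4, "LOCATION"), (7, 8, "ORGANIZATION"), (10, 10, "TIME")}
predicted = {(0, 1, "PERSON"), (4, 4, "LOCATION"), (7, 7, "ORGANIZATION")}

precision, recall, f1 = f_score(gold, predicted)
print(f"precision={precision:.4f} recall={recall:.4f} f1={f1:.4f}")
```

The third prediction has the right type but the wrong span, so under exact matching it counts as an error.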

Detecting Paraphrases in Marathi Language

10.54646/bijscit.003 ◽

2020 ◽

pp. 7-17

Author(s):

Shruti Srivastava ◽

Sharvari Govilkar

Keyword(s):

Semantic Similarity ◽

Question Answering ◽

Real Life ◽

Indian Languages ◽

Plagiarism Detection ◽

Statistical Similarity ◽

Universal Networking Language ◽

Textual Content ◽

Factual Data ◽

Semantic Significance

Paraphrases are sentences that differ in their textual content or in the arrangement of their words but convey the same meaning. Identifying paraphrases is important in various real-life applications such as Information Retrieval, Plagiarism Detection, Text Summarization, and Question Answering. A large amount of work on paraphrase detection has been done for English and many Indian languages. However, there is no existing system to identify paraphrases in Marathi; this is the first such endeavor for the Marathi language. Because paraphrases are differently structured sentences and Marathi is a semantically strong language, the system is designed to check both the statistical and the semantic similarity of Marathi sentences. The statistical similarity measure does not need any prior knowledge, as it is based only on the factual data of the sentences. The factual data is calculated from the degree of closeness between the word-set, word-order, word-vector, and word-distance. Universal Networking Language (UNL) captures the semantic significance of a sentence without any syntactic point of interest. Hence, the semantic similarity calculated from the UNL graphs generated for two Marathi sentences indicates their semantic equality. The total paraphrase score is calculated by combining the statistical and semantic similarity scores, and it gives the judgement of paraphrase or non-paraphrase for the Marathi sentences.
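
The statistical side of such a system can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' method: the abstract names the four measures but gives no formulas, so the definitions below (Jaccard for word-set, cosine over counts for word-vector, pairwise order agreement for word-order) and the unweighted mean used to combine them are all assumptions. Word-distance and the UNL-based semantic score are omitted because the abstract does not define them. English tokens are used only to keep the example readable.

```python
import math
from collections import Counter

def word_set_similarity(a, b):
    """Jaccard similarity of the two sentences' word sets."""
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def word_vector_similarity(a, b):
    """Cosine similarity of the two sentences' word-count vectors."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[w] * cb[w] for w in ca)
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def word_order_similarity(a, b):
    """Fraction of shared-word pairs that keep the same relative order."""
    shared = [w for w in dict.fromkeys(a) if w in b]
    pairs = [(x, y) for i, x in enumerate(shared) for y in shared[i + 1:]]
    if not pairs:
        return 0.0
    same = sum(1 for x, y in pairs if b.index(x) < b.index(y))
    return same / len(pairs)

def statistical_similarity(a, b):
    """Unweighted mean of the three measures (the weighting is an assumption)."""
    scores = (
        word_set_similarity(a, b),
        word_vector_similarity(a, b),
        word_order_similarity(a, b),
    )
    return sum(scores) / len(scores)

s1 = "the boy reads a book".split()
s2 = "a book the boy reads".split()
print(statistical_similarity(s1, s2))
```

The two example sentences use identical words in a different order, so the word-set and word-vector scores are maximal while the word-order score pulls the combined value down.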

Reports of the Workshops Held at the 2019 AAAI Conference on Artificial Intelligence

AI Magazine ◽

10.1609/aimag.v40i3.4981 ◽

2019 ◽

Vol 40 (3) ◽

pp. 67-78

Author(s):

Guy Barash ◽

Mauricio Castillo-Effen ◽

Niyati Chhaya ◽

Peter Clark ◽

Huáscar Espinoza ◽

...

Keyword(s):

Artificial Intelligence ◽

Language Processing ◽

Cyber Security ◽

Question Answering ◽

Intent Recognition ◽

Affective Content ◽

Learning Plan ◽

Dialog System ◽

Affective Content Analysis ◽

Games And Simulations

The workshop program of the Association for the Advancement of Artificial Intelligence’s 33rd Conference on Artificial Intelligence (AAAI-19) was held in Honolulu, Hawaii, on Sunday and Monday, January 27–28, 2019. There were fifteen workshops in the program: Affective Content Analysis: Modeling Affect-in-Action; Agile Robotics for Industrial Automation Competition; Artificial Intelligence for Cyber Security; Artificial Intelligence Safety; Dialog System Technology Challenge; Engineering Dependable and Secure Machine Learning Systems; Games and Simulations for Artificial Intelligence; Health Intelligence; Knowledge Extraction from Games; Network Interpretability for Deep Learning; Plan, Activity, and Intent Recognition; Reasoning and Learning for Human-Machine Dialogues; Reasoning for Complex Question Answering; Recommender Systems Meet Natural Language Processing; Reinforcement Learning in Games; and Reproducible AI. This report contains brief summaries of all the workshops that were held.

Parts of Speech Tagging for Indian Languages Review and Scope for Punjabi Language

International Journal of Advanced Research in Computer Science and Software Engineering ◽

10.23956/ijarcsse/v7i4/0140 ◽

2017 ◽

Vol 7 (4) ◽

pp. 214-217 ◽

Cited By ~ 1

Author(s):

Ramandeep Kaur ◽

Lakhvir Singh Garcha ◽

Mohita Garag ◽

Satinderpal Singh ◽

...

Keyword(s):

Indian Languages ◽

Parts Of Speech ◽

Speech Tagging

Proceedings of the 2002 Conference on Multilingual Summarization and Question Answering - COLING-02

10.3115/1118845 ◽

2002 ◽

Keyword(s):

Question Answering

Automated question answering in Webclopedia

10.3115/1289189.1289230 ◽

2002 ◽

Cited By ~ 2

Author(s):

Ulf Hermjakob ◽

Eduard Hovy ◽

Chin-Yew Lin

Keyword(s):

Question Answering

Natural language understanding and the perspectives of question answering

10.3115/991813.991871 ◽

1982 ◽

Author(s):

Petr Sgall

Keyword(s):

Natural Language ◽

Question Answering ◽

Natural Language Understanding ◽

Language Understanding

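Pencarian Question-Answer Menggunakan Convolutional Neural Network Pada Topik Agama Berbahasa Indonesia [Question-Answer Search Using a Convolutional Neural Network on Indonesian-Language Religious Topics]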

Jurnal ULTIMATICS ◽

10.31937/ti.v10i1.842 ◽

2018 ◽

Vol 10 (1) ◽

pp. 57-64 ◽

Cited By ~ 1

Author(s):

Rizqa Raaiqa Bintana ◽

Chastine Fatichah ◽

Diana Purwitasari

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Question Answering ◽

Mean Average Precision ◽

Average Precision ◽

Community Based ◽

Network Methods ◽

The Mean ◽

Index Terms ◽

Search Information

Community-based question answering (CQA) was formed to help people search for the information they need through a community. When people cannot obtain the information they need in CQA, they post a new question, which can cause the CQA archive to grow with duplicate questions. It therefore becomes an important problem to find questions in the CQA archive that are semantically similar to a new question. In this study, we use convolutional neural network methods for semantic modeling of sentences to obtain words that represent the content of the documents and the new question. For finding questions semantically similar to a new question (query) in the question-answer document archive, the convolutional neural network method obtained a mean average precision of 0.422, whereas the vector space model, used as a comparison, obtained a mean average precision of 0.282. Index Terms: community-based question answering, convolutional neural network, question retrieval
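
Mean average precision, the metric reported above, can be computed as in the following sketch. It assumes the common definition in which average precision is normalized by the number of relevant items per query; the abstract does not say which variant the authors used, and the question IDs and relevance judgements below are invented for illustration.

```python
def average_precision(ranked_ids, relevant_ids):
    """Average of precision@k over the ranks k where a relevant item appears."""
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(ranked_ids, start=1):
        if item in relevant_ids:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant_ids) if relevant_ids else 0.0

def mean_average_precision(runs):
    """Mean of average precision over (ranked_ids, relevant_ids) pairs."""
    if not runs:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)

# Hypothetical retrieval results for two new questions.
runs = [
    (["q7", "q2", "q9", "q4"], {"q7", "q9"}),  # AP = (1/1 + 2/3) / 2
    (["q5", "q1", "q3"], {"q1"}),              # AP = (1/2) / 1
]
map_score = mean_average_precision(runs)
print(f"MAP = {map_score:.3f}")
```

A higher score means that archived questions judged similar to the new question tend to appear nearer the top of the ranked list.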

An Efficient Semantic Analysis Technique for the Question Answering Systems

Journal of Engineering and Applied Sciences ◽

10.36478/jeasci.2019.8289.8292 ◽

2019 ◽

Vol 14 (22) ◽

pp. 8289-8292

Author(s):

Ibrahim Mahmoud Ibrahim Alturani ◽

Mohd Pouzi Bin Hamzah

Keyword(s):

Question Answering ◽

Semantic Analysis ◽

Analysis Technique ◽

Question Answering Systems
