Natural Language Processing in OTF Computing: Challenges and the Need for Interactive Approaches

Computers ◽  
2019 ◽  
Vol 8 (1) ◽  
pp. 22
Author(s):  
Frederik Bäumer ◽  
Joschka Kersting ◽  
Michaela Geierhos

The vision of On-the-Fly (OTF) Computing is to compose and provide software services ad hoc, based on requirement descriptions in natural language. Since non-technical users write their software requirements themselves and in unrestricted natural language, deficits such as inaccuracy and incompleteness occur. These deficits are usually addressed by natural language processing methods, which face special challenges in OTF Computing because maximum automation is the goal. In this paper, we present current automatic approaches for resolving inaccuracy and incompleteness in natural language requirement descriptions and elaborate on open challenges. In particular, we discuss the necessity of domain-specific resources and show why, despite far-reaching automation, an intelligent and guided integration of end users into the compensation process is required. In this context, we present our idea of a chatbot that integrates users into the compensation process depending on the given circumstances.

Author(s):  
Santosh Kumar Mishra ◽  
Rijul Dhir ◽  
Sriparna Saha ◽  
Pushpak Bhattacharyya

Image captioning is the process of generating a textual description of an image that aims to describe its salient parts. It is an important problem, as it involves both computer vision and natural language processing: computer vision is used for understanding images, and natural language processing is used for language modeling. A large body of work exists on image captioning for the English language. In this article, we develop a model for image captioning in the Hindi language. Hindi is the official language of India and the fourth most spoken language in the world, spoken in India and South Asia. To the best of our knowledge, this is the first attempt to generate image captions in Hindi. A dataset is manually created by translating the well-known MSCOCO dataset from English to Hindi. Finally, different types of attention-based architectures are developed for image captioning in Hindi; these attention mechanisms have never before been used for the Hindi language. The results of the proposed model are compared with several baselines in terms of BLEU scores, and they show that our model performs better than the others. Manual evaluation of the obtained captions in terms of adequacy and fluency also reveals the effectiveness of our proposed approach. Availability of resources: the code for the article is available at https://github.com/santosh1821cs03/Image_Captioning_Hindi_Language ; the dataset will be made available at http://www.iitp.ac.in/∼ai-nlp-ml/resources.html .
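The abstract does not detail its attention-based architectures; as a rough illustration only, one common building block in attention-based captioning is an additive (Bahdanau-style) attention step over image-region features. The sketch below is a generic version of that idea; all array shapes, weight names, and the function name are hypothetical and not taken from the paper:

```python
import numpy as np

def additive_attention(features, hidden, W_f, W_h, v):
    """One additive-attention step: score each image region against the
    decoder's hidden state, softmax the scores, and return the weighted
    sum of region features (the 'context vector')."""
    scores = np.tanh(features @ W_f + hidden @ W_h) @ v   # one score per region
    weights = np.exp(scores - scores.max())               # numerically stable softmax
    weights /= weights.sum()
    context = weights @ features                          # weighted sum of regions
    return context, weights

# Toy example: 49 regions (a 7x7 grid) with 64-dim features.
rng = np.random.default_rng(0)
features = rng.normal(size=(49, 64))
hidden = rng.normal(size=(32,))
W_f = rng.normal(size=(64, 16))
W_h = rng.normal(size=(32, 16))
v = rng.normal(size=(16,))

context, weights = additive_attention(features, hidden, W_f, W_h, v)
```

In a full captioning model this step would run once per generated word, with the context vector fed into the decoder; the sketch shows only the scoring-and-pooling core.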


2017 ◽  
Vol 6 (5) ◽  
pp. 281
Author(s):  
Rishabh Shah ◽  
Siddhant Lahoti ◽  
K. Lavanya

2021 ◽  
Author(s):  
Minoru Yoshida ◽  
Kenji Kita

Both words and numerals are tokens found in almost all documents, but they have different properties. However, relatively little attention has been paid to numerals found in texts, and many systems have treated the numbers found in a document in ad-hoc ways, such as regarding them as mere strings in the same way as words, normalizing them to zeros, or simply ignoring them. The recent growth of natural language processing (NLP) research areas has changed this situation, and more and more attention has been paid to numeracy in documents. In this survey, we provide a quick overview of the history and recent advances of research on mining the relations between numerals and words found in text data.
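The ad-hoc treatments the survey lists (treating numerals as plain strings, normalizing them to zeros, or ignoring them) can be made concrete with a small sketch. The function name and strategy labels below are illustrative, not from the survey:

```python
import re

def normalize_numerals(tokens, strategy="zero"):
    """Apply one of the ad-hoc numeral treatments to a token list:
    'keep'  - treat numerals as ordinary word strings,
    'zero'  - replace every digit with 0 (e.g. 1234.5 -> 0000.0),
    'drop'  - remove numeral tokens entirely."""
    out = []
    for tok in tokens:
        if re.fullmatch(r"\d+(?:\.\d+)?", tok):   # a plain integer or decimal
            if strategy == "zero":
                out.append(re.sub(r"\d", "0", tok))
            elif strategy == "drop":
                continue
            else:                                  # "keep"
                out.append(tok)
        else:
            out.append(tok)
    return out
```

Each strategy loses different information: 'zero' keeps the shape of the number (digit count, decimal point) while discarding its magnitude, which is one reason recent work on numeracy moves beyond these heuristics.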


2018 ◽  
Vol 12 (02) ◽  
pp. 237-260
Author(s):  
Weifeng Xu ◽  
Dianxiang Xu ◽  
Abdulrahman Alatawi ◽  
Omar El Ariss ◽  
Yunkai Liu

The unigram is a fundamental element of the n-gram in natural language processing. However, unigrams collected from a natural language corpus are unsuitable for solving problems in the domain of computer programming languages. In this paper, we analyze the properties of unigrams collected from an ultra-large source code repository. Specifically, we have collected 1.01 billion unigrams from 0.7 million open source projects hosted at GitHub.com. By analyzing these unigrams, we have discovered statistical properties regarding (1) how developers name variables, methods, and classes, and (2) how developers choose abbreviations. We describe a probabilistic model that relies on these properties for solving a well-known problem in source code analysis: how to expand a given abbreviation to its original intended word. Our empirical study shows that using unigrams extracted from a source code repository outperforms using a natural language corpus by 21% when solving this domain-specific problem.
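The paper's probabilistic model is not specified in the abstract. As a minimal sketch of one plausible component, an abbreviation can be scored against candidate expansions by unigram frequency among words that contain the abbreviation as a subsequence; the unigram counts and helper names below are hypothetical:

```python
from collections import Counter

def is_subsequence(abbr, word):
    """True if the characters of abbr appear in word, in order."""
    it = iter(word)
    return all(ch in it for ch in abbr)   # 'ch in it' consumes the iterator

def expand(abbr, unigrams):
    """Pick the most frequent unigram that starts with the abbreviation's
    first letter and contains it as a subsequence."""
    candidates = {w: c for w, c in unigrams.items()
                  if w.startswith(abbr[0]) and is_subsequence(abbr, w)}
    if not candidates:
        return None
    return max(candidates, key=candidates.get)

# Toy unigram counts standing in for counts mined from source code.
unigrams = Counter({"message": 120, "manager": 80, "msg": 5, "method": 60})
```

A real model would combine such frequencies with the paper's naming and abbreviation properties (e.g. how developers truncate or remove vowels), but the frequency-weighted subsequence match illustrates the basic shape of the task.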


Author(s):  
Matthew W. Crocker

Traditional approaches to natural language processing (NLP) can be considered construction-based. That is to say, they employ surface-oriented, language-specific rules, whether in the form of an Augmented Transition Network (ATN), a logic grammar, or some other grammar/parsing formalism. The problems with such approaches have always been apparent: they involve large sets of often ad hoc rules, and their adequacy with respect to the grammar of the language is difficult to ensure.


2021 ◽  
Vol 113 ◽  
pp. 103665
Author(s):  
Timothy L. Chen ◽  
Max Emerling ◽  
Gunvant R. Chaudhari ◽  
Yeshwant R. Chillakuru ◽  
Youngho Seo ◽  
...  

TEKNO ◽  
2019 ◽  
Vol 29 (2) ◽  
pp. 129
Author(s):  
Yohanes Dhimas Firman Syahputra ◽  
Syaad Patmanthara ◽  
Heru Wahyu Herwanto

The result of developing an artificial-intelligence application in the form of a chatbot to help companies educate their customers using a natural language processing (NLP) system was obtained through a system development method. This chatbot, intended to help companies educate customers with NLP, was developed so that a computer can perform certain tasks as a human would, such as a chat robot (chatbot): a system that adopts human knowledge into a computer so that the computer can hold conversations with users. The chatbot's ability to answer questions is determined by the size of its dataset, so the answer data should be enlarged so that it understands more customer questions. Based on the chatbot development trials that were carried out, the score obtained was 88.94%. Based on the feasibility category table, the chatbot developed in this study can be declared "very feasible" for use.


Author(s):  
Arkodeep Biswas ◽  
Ajay Kaushik

The objective of this paper is to build a web application based on a virtual voice and chat assistant. The current study focuses specifically on the development of a voice and text/chat bot. It is being built especially for people who feel depressed, and it encourages them to talk open-mindedly, which in turn pacifies them. As the name of the application suggests, it is an application to pacify people and make them as happy as a cat would be with its mother (the reason a cat purrs). We will be using Dialogflow for the application design, and machine learning, as a part of artificial intelligence, for natural language processing (NLP), one of the easiest ways to use machine-learning libraries. At the back end we will be using a database to store the communication history between the user and the bot. This application will only work on devices with web operating system version 5.0 and above.

