Automated Subsurface Knowledge ASK Thamama Retrieval Engine Driven by Conversational Text Analytics and NLP - Lessons Learned in Managing Large Volume of Documents in Abu Dhabi Assets

2021 ◽  
Author(s):  
Fahed Mubarak Braik ◽  
Abdulla Sulaiman Al Shehhi ◽  
Luigi Saputelli ◽  
Carlos Mata ◽  
Dorzhi Badmaev ◽  
...  

Abstract The purpose of this paper is to communicate the experiences in the development of an innovative concept named "ASK Thamama" as an automated data and information retrieval engine driven by artificial intelligence techniques including text analytics and natural language processing. ASK is an AI enabled conversational search engine used to retrieve information from various internal data repositories using natural language queries. The text processing and conversational engine concept is built upon available open-source software requiring minimum coding of new libraries. A data set with 1000 documents was used to validate key functionalities with an accuracy of 90% of the search queries and able to provide specific answers for 80% of queries framed as questions. The results of this work show encouraging results and demonstrate value that AI-enabled methodologies can provide natural language search by enabling automated workflows for data information retrieval. The developed AI methodology has tremendous potential of integration in an end-to-end workflow of knowledge management by utilizing available document repositories to valuable insights, with little to no human intervention.

Informatics ◽  
2021 ◽  
Vol 17 (4) ◽  
pp. 61-72
Author(s):  
D. I. Kachkou

The article is an essay on the development of technologies for natural language processing, which formed the basis of BERT (Bidirectional Encoder Representations from Transformers), a language model from Google, showing high results on the whole class of problems associated with the understanding of natural language. Two key ideas implemented in BERT are knowledge transfer and attention mechanism. The model is designed to solve two problems on a large unlabeled data set and can reuse the identified language patterns for effective learning for a specific text processing problem. Architecture Transformer is based on the attention mechanism, i.e. it involves evaluation of relationships between input data tokens. In addition, the article notes strengths and weaknesses of BERT and the directions for further model improvement.


2020 ◽  
pp. 3-17
Author(s):  
Peter Nabende

Natural Language Processing for under-resourced languages is now a mainstream research area. However, there are limited studies on Natural Language Processing applications for many indigenous East African languages. As a contribution to covering the current gap of knowledge, this paper focuses on evaluating the application of well-established machine translation methods for one heavily under-resourced indigenous East African language called Lumasaaba. Specifically, we review the most common machine translation methods in the context of Lumasaaba including both rule-based and data-driven methods. Then we apply a state of the art data-driven machine translation method to learn models for automating translation between Lumasaaba and English using a very limited data set of parallel sentences. Automatic evaluation results show that a transformer-based Neural Machine Translation model architecture leads to consistently better BLEU scores than the recurrent neural network-based models. Moreover, the automatically generated translations can be comprehended to a reasonable extent and are usually associated with the source language input.


2017 ◽  
Vol 9 (1) ◽  
pp. 19-24 ◽  
Author(s):  
David Domarco ◽  
Ni Made Satvika Iswari

Technology development has affected many areas of life, especially the entertainment field. One of the fastest growing entertainment industry is anime. Anime has evolved as a trend and a hobby, especially for the population in the regions of Asia. The number of anime fans grow every year and trying to dig up as much information about their favorite anime. Therefore, a chatbot application was developed in this study as anime information retrieval media using regular expression pattern matching method. This application is intended to facilitate the anime fans in searching for information about the anime they like. By using this application, user can gain a convenience and interactive anime data retrieval that can’t be found when searching for information via search engines. Chatbot application has successfully met the standards of information retrieval engine with a very good results, the value of 72% precision and 100% recall showing the harmonic mean of 83.7%. As the application of hedonic, chatbot already influencing Behavioral Intention to Use by 83% and Immersion by 82%. Index Terms—anime, chatbot, information retrieval, Natural Language Processing (NLP), Regular Expression Pattern Matching


2019 ◽  
Vol 53 (2) ◽  
pp. 3-10
Author(s):  
Muthu Kumar Chandrasekaran ◽  
Philipp Mayr

The 4 th joint BIRNDL workshop was held at the 42nd ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2019) in Paris, France. BIRNDL 2019 intended to stimulate IR researchers and digital library professionals to elaborate on new approaches in natural language processing, information retrieval, scientometrics, and recommendation techniques that can advance the state-of-the-art in scholarly document understanding, analysis, and retrieval at scale. The workshop incorporated different paper sessions and the 5 th edition of the CL-SciSumm Shared Task.


2021 ◽  
pp. 108357
Author(s):  
Daniel Perdices ◽  
Javier Ramos ◽  
José L. García-Dorado ◽  
Iván González ◽  
Jorge E. López de Vergara

Sign in / Sign up

Export Citation Format

Share Document