The problem of loss of solutions in the task of searching similar documents: Applying terminology in the construction of a corpus vector model

2021 ◽  
Vol 15 (2) ◽  
pp. 60-74
Author(s):  
Fedor Krasnov ◽  
Irina Smaznevich ◽  
Elena Baskakova

This article considers the problem of finding text documents in a corpus that are similar in meaning. We investigate a problem that arises when developing applied intelligent information systems: the TF-IDF algorithm fails to detect part of the solutions, so some document pairs that are similar according to human assessment receive a low similarity score from the program. We propose a modification of the algorithm that replaces the complete vocabulary with a vocabulary of specific terms. The addition of thesauri when building a corpus vector model based on a ranking function has not been previously investigated; the use of thesauri has so far been studied only to improve topic models. The purpose of this work is to improve the quality of the solution by minimizing the loss of its significant part without adding “false similar” pairs of documents. The improvement comes from using a vocabulary of specific terms, extracted from the text of the analyzed documents, when calculating the TF-IDF values for the corpus vector representation. The experiment was carried out on two corpora of structured normative and technical documents, each united by a common subject: state standards related to information technology and state standards related to the field of railways. The glossary of specific terms was compiled by automatic analysis of the text of the documents under consideration, using rule-based NER methods. The calculation of TF-IDF based on the terminology vocabulary was shown to give more relevant results for the problem under study, which confirmed the hypothesis put forward. The proposed method is less dependent on the shortcomings of the text layer (such as recognition errors) than calculating the documents’ proximity using the complete vocabulary of the corpus.
We determined the factors that can affect the quality of the solution: the way the terminology vocabulary is compiled, the choice of the range of n-grams for the vocabulary, the correctness of the wording of specific terms, and the validity of their inclusion in the glossary of the document. The findings can be used to solve applied problems related to the search for documents that are close in meaning, such as semantic search that takes the subject area into account, corporate search in multi-user mode, detection of hidden plagiarism, identification of contradictions in a collection of documents, and determination of novelty in documents when building a knowledge base.
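
The idea of computing TF-IDF over a restricted terminology vocabulary can be illustrated with a minimal sketch. It is not the authors' implementation: the three toy documents, the `terms` set, the smoothed IDF formula and the function names are all assumptions made for illustration, and the n-gram range (1 to 2) is an arbitrary choice.

```python
import math
from collections import Counter


def ngrams(tokens, n_min=1, n_max=2):
    """Yield all word n-grams of length n_min..n_max as space-joined strings."""
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            yield " ".join(tokens[i:i + n])


def tfidf_vectors(docs, vocabulary=None, n_min=1, n_max=2):
    """Return one sparse TF-IDF vector (a dict) per document.

    If `vocabulary` is given, only n-grams contained in it are counted;
    otherwise the complete vocabulary of the corpus is used.
    """
    counts = []
    for text in docs:
        grams = ngrams(text.lower().split(), n_min, n_max)
        counts.append(Counter(g for g in grams
                              if vocabulary is None or g in vocabulary))
    n_docs = len(docs)
    doc_freq = Counter(g for c in counts for g in c)
    # Smoothed IDF (an assumption), so no n-gram gets a zero weight.
    idf = {g: math.log((1 + n_docs) / (1 + doc_freq[g])) + 1 for g in doc_freq}
    return [{g: tf * idf[g] for g, tf in c.items()} for c in counts]


def cosine(u, v):
    """Cosine similarity of two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * v.get(g, 0.0) for g, w in u.items())
    norm = (math.sqrt(sum(w * w for w in u.values()))
            * math.sqrt(sum(w * w for w in v.values())))
    return dot / norm if norm else 0.0


# Toy corpus and hypothetical term glossary, invented for this sketch.
docs = [
    "the standard defines the track gauge and the rolling stock requirements",
    "requirements for rolling stock and track gauge are given in this document",
    "the standard defines the terms used in this document",
]
terms = {"track gauge", "rolling stock"}

full_vecs = tfidf_vectors(docs)                    # complete vocabulary
term_vecs = tfidf_vectors(docs, vocabulary=terms)  # terminology only

for i, j in [(0, 1), (0, 2)]:
    print(i, j, cosine(full_vecs[i], full_vecs[j]), cosine(term_vecs[i], term_vecs[j]))
```

In this toy setting, documents 0 and 1 share the same terms but little common wording, while documents 0 and 2 share wording but no terms, so the two vocabularies rank the pairs differently. Real corpora would also need proper tokenization and a glossary extracted from the documents themselves.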

2021 ◽  
Vol 7 (2) ◽  
pp. 130
Author(s):  
Haitham M. Alzoubi ◽  
Ramsha Aziz

Purpose: The purpose of this research is to explore the direct relationship between the emotional intelligence of top management and the quality of the strategic decisions they take for their companies. This relationship is further examined through the mediating role of open innovation in the context of intelligent information systems that can affect the way top managers take decisions. The research adopted a survey design: cross-sectional data were collected through questionnaires from top management of the UAE national banks. A final sample of 213 questionnaires completed by managers was obtained and analyzed. As predicted, there was a strong, positive relationship between managers’ emotional intelligence and the quality of their strategic decisions. Open innovation has revolutionized the way top managers of banks take decisions that are later transformed into policies. Decision-makers are required to possess the skill of decision-making by being vigilant of their surroundings. Hence, they have emotional intelligence, and intelligent information systems (IIS) only enhance that trait. IIS is the glorified version of open innovation that further contributes to the decision-making process and the quality of decisions. This research is one of a kind, as no one has explored these dimensions of emotional intelligence in the UAE.


Author(s):  
И.Р. Усамов ◽  
А.А. Албакова ◽  
А.А. Мустиев

The article is devoted to the role of intelligent information systems in the modern world. It analyzes the essence of intelligent systems and the sectors in which they are used, identifies the problems of implementing intelligent information systems, and suggests mechanisms for solving these problems. The main industries where intelligent information systems are used to increase the speed of production and improve the quality of services provided are considered. The three main problems of artificial intelligence, which remain unsolved at the moment and which could cause global chaos in the future, are considered. Mechanisms for solving these three problems are proposed.


1996 ◽  
Vol 05 (01) ◽  
pp. 27-72 ◽  
Author(s):  
R. CLARK ◽  
C. GROSSNER ◽  
T. RADHAKRISHNAN

Resolving disparate viewpoints via planning is an important aspect of the distributed problem solving performed by the agents in Cooperative Intelligent Information Systems. We propose a distributed planning protocol called Consensus. Consensus specifies a methodology by which agents exchange information indicating their preferred actions, integrate these different sets of actions, resolve any conflicts that exist, and choose a joint set of actions. Within the framework of the Consensus protocol, two different heuristics are proposed for resolving conflicts which overcome the computational complexity of an exhaustive search. An implementation of the Consensus protocol is analyzed experimentally to assess its performance in resolving conflicts. The experimental results indicate the trade-off between the cost of planning and the quality of the plan produced with respect to the heuristics used for resolving conflicts.
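
The general flow described above (agents state preferred actions, the sets are integrated, conflicts are resolved, a joint set is chosen) can be sketched as follows. This is only an illustration under stated assumptions: the greedy vote-count heuristic, the function name `choose_joint_actions`, and the example agents and actions are hypothetical and are not the heuristics or the message exchange defined by the Consensus protocol.

```python
from collections import Counter


def choose_joint_actions(preferences, conflicts):
    """Greedy conflict resolution over the agents' preferred actions.

    preferences: dict mapping agent name -> set of preferred actions
    conflicts:   set of frozensets {a, b}; a and b cannot both be chosen
    Returns the joint list of actions in the order they were accepted.
    """
    # Integrate the individual action sets by counting support per action.
    votes = Counter(a for actions in preferences.values() for a in actions)
    joint = []
    # Visit actions from most to least supported (ties broken by name),
    # accepting an action only if it conflicts with nothing chosen so far.
    # This avoids an exhaustive search over all conflict-free subsets.
    for action in sorted(votes, key=lambda a: (-votes[a], a)):
        if all(frozenset((action, chosen)) not in conflicts for chosen in joint):
            joint.append(action)
    return joint


# Hypothetical agents, actions and conflicts, invented for this sketch.
preferences = {
    "agent_a": {"scan", "move_left"},
    "agent_b": {"scan", "move_right"},
    "agent_c": {"move_right", "report"},
}
conflicts = {frozenset(("move_left", "move_right"))}

print(choose_joint_actions(preferences, conflicts))
# prints ['move_right', 'scan', 'report']
```

A greedy pass like this trades plan quality for planning cost: it never revisits an accepted action, so it may miss a better conflict-free set that an exhaustive search would find.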


2019 ◽  
Vol 8 (3) ◽  
pp. 6634-6643 ◽  

Opinion mining and sentiment analysis are valuable for extracting useful subjective information from text documents. Predicting customers’ opinions on Amazon products has several benefits, such as reducing customer churn, agent monitoring, handling multiple customers, tracking overall customer satisfaction, quick escalations, and upselling opportunities. However, finding users’ sentiments in large datasets is a challenging task for researchers because the data are unstructured and contain slang, misspellings and abbreviations. To address this problem, a new system is proposed in this research study. The proposed system comprises four major phases: data collection, pre-processing, keyword extraction, and classification. Initially, the input data were collected from the Amazon customer review dataset. After data collection, pre-processing was carried out to enhance the quality of the collected data. The pre-processing phase comprises three steps: lemmatization, review spam detection, and removal of stop-words and URLs. Then, the topic modelling approach Latent Dirichlet Allocation (LDA), together with modified Possibilistic Fuzzy C-Means (PFCM), was applied to extract the keywords and to help identify the topics concerned. The extracted keywords were classified into three classes (positive, negative and neutral) by applying a machine learning classifier, a Convolutional Neural Network (CNN). The experimental outcome showed that the proposed system improved the accuracy of sentiment analysis by 6-20% compared with the existing systems.
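
One step of the pre-processing phase, the removal of stop-words and URLs, can be sketched as below. The regular expressions, the tiny `STOP_WORDS` set and the function name `preprocess` are assumptions for illustration only; lemmatization and review spam detection are not covered, and the study's actual implementation is not described in the abstract.

```python
import re

# Hypothetical patterns and stop-word list, chosen for this sketch only.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
TOKEN_RE = re.compile(r"[a-z']+")
STOP_WORDS = {"a", "an", "the", "is", "it", "this", "and", "or", "to", "of", "i", "was"}


def preprocess(review):
    """Lowercase a review, strip URLs, tokenize, and drop stop-words."""
    text = URL_RE.sub(" ", review.lower())
    return [tok for tok in TOKEN_RE.findall(text) if tok not in STOP_WORDS]


print(preprocess("This charger is great, see https://example.com/item and it was cheap!"))
# prints ['charger', 'great', 'see', 'cheap']
```

URLs are removed before tokenization so that fragments such as domain names do not leak into the token list passed on to keyword extraction.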


Author(s):  
Wai-Tat Fu ◽  
Jessie Chin ◽  
Q. Vera Liao

Cognitive science is a science of intelligent systems. This chapter proposes that cognitive science can provide useful perspectives for research on technology-mediated human-information interaction (HII) when HII is cast as the emergent behaviour of a coupled intelligent system. It starts with a review of a few foundational concepts related to cognitive computations and how they can be applied to understand the nature of HII. It discusses several important properties of a coupled cognitive system and their implications for the design of information systems. Finally, it covers how levels of abstraction have been useful for cognitive science, and how these levels can inform the design of intelligent information systems that are more compatible with human cognitive computations.


2002 ◽  
Vol 8 (2-3) ◽  
pp. 93-96
Author(s):  
AFZAL BALLIM ◽  
VINCENZO PALLOTTA

The automated analysis of natural language data has become a central issue in the design of intelligent information systems. Processing unconstrained natural language data is still considered an AI-hard task. However, various analysis techniques have been proposed to address specific aspects of natural language. In particular, recent interest has focused on approximate analysis techniques, on the assumption that when perfect analysis is not possible, partial results may still be very useful.


2017 ◽  
Vol 21 (6) ◽  
pp. 1039-1040
Author(s):  
Quan Z. Sheng ◽  
Wei Emma Zhang ◽  
Elhadi Shakshuki
