user queries Latest Research Papers

A CP-ABE Scheme for Multi-user Queries in IoT

2021 International Conference on Big Data Analytics for Cyber-Physical System in Smart City - Lecture Notes on Data Engineering and Communications Technologies ◽

10.1007/978-981-16-7469-3_69 ◽

2022 ◽

pp. 619-626

Author(s):

Fanglin An ◽

Yin Zhang ◽

Yi Yue ◽

Jun Ye

Keyword(s):

User Queries

Improving Clustering Methods By Exploiting Richness Of Text Data

10.26686/wgtn.17019287.v1 ◽

2021 ◽

Author(s):

Abdul Wahid

Keyword(s):

Evolutionary Algorithm ◽

State Of The Art ◽

Ensemble Methods ◽

Text Clustering ◽

Clustering Methods ◽

Clustering Method ◽

Clustering Ensemble ◽

Text Data ◽

Multi Objective ◽

User Queries

<p>Clustering is an unsupervised machine learning technique that discovers clusters (groups) of similar objects in unlabeled data, and it is generally considered an NP-hard problem. Clustering methods are widely used in a variety of disciplines for analyzing different types of data, and a small improvement in a clustering method can cause a ripple effect that advances research in multiple fields. Clustering any type of data is challenging, and there are many open research questions. The problem is exacerbated for text data because of additional challenges such as capturing the semantics of a document, handling the rich features of text data, and dealing with the well-known curse of dimensionality. In this thesis, we investigate the limitations of existing text clustering methods and address them by providing five new text clustering methods: Query Sense Clustering (QSC), Dirichlet Weighted K-means (DWKM), Multi-View Multi-Objective Evolutionary Algorithm (MMOEA), Multi-objective Document Clustering (MDC) and Multi-Objective Multi-View Ensemble Clustering (MOMVEC). These five methods show that using rich features in text clustering can outperform existing state-of-the-art text clustering methods. The first method, QSC, exploits user queries (one of the rich features in text data) to generate better-quality clusters and cluster labels. The second method, DWKM, uses a probability-based weighting scheme to formulate a semantically weighted distance measure that improves the clustering results. The third method, MMOEA, is based on a multi-objective evolutionary algorithm. MMOEA exploits rich features to generate a diverse set of candidate clustering solutions and forms a better clustering solution using a cluster-oriented approach.
The fourth and fifth methods, MDC and MOMVEC, address the limitations of MMOEA. MDC and MOMVEC differ in the implementation of their multi-objective evolutionary approaches. All five methods are compared with existing state-of-the-art methods. The comparisons show that the newly developed text clustering methods outperform existing methods, achieving up to 16% improvement in some comparisons. In general, almost all of the newly developed clustering algorithms showed statistically significant improvements over existing methods. The key ideas of the thesis are that exploiting user queries improves Search Result Clustering (SRC); utilizing rich features in weighting schemes and distance measures improves soft subspace clustering; utilizing multiple views and a multi-objective cluster-oriented method improves clustering ensemble methods; and better evolutionary operators and objective functions improve multi-objective evolutionary clustering ensemble methods. The new text clustering methods introduced in this thesis can be widely applied in domains that involve analysis of text data. The contributions of this thesis, which include five new text clustering methods, will help not only researchers in the data mining field but also a wide range of researchers in other fields.</p>

Improving Clustering Methods By Exploiting Richness Of Text Data

10.26686/wgtn.17019287 ◽

2021 ◽

Author(s):

Abdul Wahid

Keyword(s):

Evolutionary Algorithm ◽

State Of The Art ◽

Ensemble Methods ◽

Text Clustering ◽

Clustering Methods ◽

Clustering Method ◽

Clustering Ensemble ◽

Text Data ◽

Multi Objective ◽

User Queries

Information Retrieval based on Cluster Analysis Approach

International Journal of Computer Science and Information Technology ◽

10.5121/ijcsit.2021.13502 ◽

2021 ◽

Vol 13 (5) ◽

pp. 21-29

Author(s):

Orabe Almanaseer

Keyword(s):

Information Retrieval ◽

Phase 1 ◽

Evaluation Criteria ◽

Retrieval Process ◽

Text Documents ◽

Data Set ◽

Analysis Process ◽

High Utility ◽

High Utility Patterns ◽

User Queries

The huge volume of text documents available on the internet has made it difficult for users to find valuable information. The need for efficient applications that extract knowledge of interest from textual documents is vitally important. This paper addresses the problem of responding to user queries by fetching the most relevant documents from a clustered set of documents. For this purpose, a cluster-based information retrieval framework is proposed in order to design and develop a system for analysing and extracting useful patterns from text documents. In this approach, a preprocessing step is first performed to find frequent and high-utility patterns in the data set. A Vector Space Model (VSM) is then used to represent the data set. The system was implemented in two main phases. In phase 1, the clustering analysis process groups documents into several clusters; in phase 2, an information retrieval process ranks clusters according to the user queries in order to retrieve the relevant documents from the clusters deemed relevant to the query. The results are then evaluated using recall and precision (P@5, P@10) of the retrieved results. P@5 was 0.660 and P@10 was 0.655.
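
The P@5 and P@10 figures reported in this abstract are instances of precision at k, the fraction of the top k retrieved documents that are relevant. A minimal sketch of that metric follows. The function name `precision_at_k`, the document ids, and the relevance judgments are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
def precision_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the top-k retrieved documents that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = ranked_ids[:k]
    hits = sum(1 for doc_id in top_k if doc_id in relevant_ids)
    # Divide by k (not len(top_k)) so short result lists are penalized.
    return hits / k


# Hypothetical ranked result list and relevance judgments (illustration only).
ranked = ["d3", "d7", "d1", "d9", "d4", "d2", "d8", "d5", "d6", "d0"]
relevant = {"d3", "d1", "d4", "d8", "d6"}

p_at_5 = precision_at_k(ranked, relevant, 5)
p_at_10 = precision_at_k(ranked, relevant, 10)
print(p_at_5, p_at_10)
```

In an evaluation like the one described, a score of this kind would typically be averaged over all test queries.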

Data Efficient Algorithms and Interpretability Requirements for Personalized Assessment of Taskable AI Systems

Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2021/693 ◽

2021 ◽

Author(s):

Pulkit Verma

Keyword(s):

Side Effects ◽

Black Box ◽

Efficient Algorithms ◽

Interpretable Model ◽

Instruction Sequences ◽

High Level ◽

Response Capability ◽

User Queries

The vast diversity of internal designs of black-box AI systems and their nuanced zones of safe functionality make it difficult for a layperson to use them without unintended side effects. The focus of my dissertation is to develop algorithms and requirements of interpretability that would enable a user to assess and understand the limits of an AI system's safe operability. We develop an assessment module that lets an AI system execute high-level instruction sequences in simulators and answer the user queries about its execution of sequences of actions. Our results show that such a primitive query-response capability is sufficient to efficiently derive a user-interpretable model of the system in stationary, fully observable, and deterministic settings.

Crowdsourcing for Creating a Dataset for Training a Medication Chatbot

Studies in Health Technology and Informatics - Public Health and Informatics ◽

10.3233/shti210364 ◽

2021 ◽

Author(s):

Cyril R. Zgraggen ◽

Sebastian B. Kunz ◽

Kerstin Denecke

Keyword(s):

Knowledge Base ◽

Mobile Health ◽

Information Needs ◽

Comprehensive Knowledge ◽

Mobile Health Applications ◽

Health Applications ◽

User Queries

To facilitate interaction with mobile health applications, chatbots are increasingly used. They realize the interaction as a dialog in which users can ask questions and get answers from the chatbot. A big challenge is to create a comprehensive knowledge base comprising patterns and rules that represent the possible user queries the chatbot has to understand and interpret. In this work, we assess how crowdsourcing can be used to generate examples of possible user queries for a medication chatbot. Within one week, the crowdworkers generated 2,738 user questions. The examples provide a large variety of possible formulations and information needs. As a next step, these examples of user queries will be used to train our medication chatbot.

Applying Machine Learning to the Task of Generating Search Queries

Russian Digital Libraries Journal ◽

10.26907/1562-5419-2021-24-2-271-292 ◽

2021 ◽

Vol 24 (2) ◽

pp. 272-293

Author(s):

Александр Михайлович Гусенков ◽

Алина Рафисовна Ситтикова

Keyword(s):

Neural Networks ◽

Semantic Analysis ◽

Short Term Memory ◽

Complete Evaluation ◽

Cosine Measure ◽

Expert Analysis ◽

Transformer Model ◽

Value Decomposition ◽

Gated Recurrent Unit ◽

User Queries

In this paper we study two modifications of recurrent neural networks, Long Short-Term Memory networks and networks with Gated Recurrent Units, each with an added attention mechanism, as well as the Transformer model, in the task of generating queries to search engines. GPT-2 by OpenAI was used as the Transformer and was trained on user queries. Latent semantic analysis was carried out to identify semantic similarities between the corpus of user queries and the queries generated by the neural networks. The corpus was converted into a bag-of-words format, the TF-IDF model was applied to it, and a singular value decomposition was performed. Semantic similarity was calculated based on the cosine measure. In addition, for a more complete evaluation of the applicability of the models to the task, an expert analysis was carried out to assess the coherence of words in artificially created queries.
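
The similarity pipeline in this abstract (bag of words, TF-IDF weighting, cosine measure) can be sketched with the standard library alone. This sketch deliberately omits the singular value decomposition step, so it compares raw TF-IDF vectors rather than latent-semantic ones. The smoothed IDF formula is one common variant and an assumption here, as are the example queries and the helper names `tfidf_vectors` and `cosine`; none of them comes from the paper.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse {term: weight} dict per tokenized document."""
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)  # bag-of-words counts
        # Smoothed IDF (an assumed variant): log((1 + n) / (1 + df)) + 1
        vectors.append(
            {t: c * (math.log((1 + n) / (1 + df[t])) + 1.0) for t, c in tf.items()}
        )
    return vectors


def cosine(a, b):
    """Cosine measure between two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical queries: one "real", one "generated", one unrelated.
queries = [
    "cheap flights to paris".split(),
    "cheap flights paris".split(),
    "python list sort".split(),
]
vecs = tfidf_vectors(queries)
print(cosine(vecs[0], vecs[1]), cosine(vecs[0], vecs[2]))
```

Queries that share terms score above zero, while queries with no terms in common score exactly zero. That limitation is what a decomposition step such as SVD is usually meant to soften.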

Disentangled Representation Learning of User Queries for Database Insider Attack Detection System

KIISE Transactions on Computing Practices ◽

10.5626/ktcp.2021.27.2.76 ◽

2021 ◽

Vol 27 (2) ◽

pp. 76-82

Author(s):

Gwang-Myong Go ◽

Seok-Jun Bu ◽

Sung-Bae Cho

Keyword(s):

Detection System ◽

Representation Learning ◽

Attack Detection ◽

Insider Attack ◽

User Queries

Cluster-based information retrieval using pattern mining

Applied Intelligence ◽

10.1007/s10489-020-01922-x ◽

2020 ◽

Author(s):

Youcef Djenouri ◽

Asma Belhadi ◽

Djamel Djenouri ◽

Jerry Chun-Wei Lin

Keyword(s):

Information Retrieval ◽

Pattern Mining ◽

Spatial Clustering ◽

Clustering Algorithms ◽

User Query ◽

High Quality Information ◽

High Utility ◽

Score Pattern ◽

Mining Algorithms ◽

User Queries

This paper addresses the problem of responding to user queries by fetching the most relevant object from a clustered set of objects. It addresses the common drawbacks of cluster-based approaches and targets fast, high-quality information retrieval. For this purpose, a novel cluster-based information retrieval approach is proposed, named Cluster-based Retrieval using Pattern Mining (CRPM). This approach integrates various clustering and pattern mining algorithms. First, it generates clusters of similar objects. Three clustering algorithms based on k-means, DBSCAN (density-based spatial clustering of applications with noise), and spectral clustering are suggested to minimize the number of shared terms among the clusters of objects. Second, frequent and high-utility pattern mining algorithms are run on each cluster to extract the pattern bases. Third, the clusters of objects are ranked for every query. In this context, two ranking strategies are proposed: i) Score Pattern Computing (SPC), which calculates a score representing the similarity between a user query and a cluster; and ii) Weighted Terms in Clusters (WTC), which calculates a weight for every term and uses the relevant terms to compute the score between a user query and each cluster. Irrelevant information derived from the pattern bases is also used to deal with unexpected user queries. To evaluate the proposed approach, extensive experiments were carried out on two use cases: a documents corpus and a tweets corpus. The results showed that the designed approach outperformed traditional and cluster-based information retrieval approaches in terms of the quality of the returned objects while being very competitive in terms of runtime.
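
The third step, ranking clusters against a query by a pattern-based score, can be illustrated with a toy example. The abstract does not give the SPC or WTC formulas, so the score below is a stand-in assumption: the fraction of a cluster's mined patterns whose terms all appear in the query. The cluster names, patterns, and function names are hypothetical.

```python
def pattern_score(query_terms, patterns):
    """Assumed score: share of a cluster's patterns fully contained in the query."""
    if not patterns:
        return 0.0
    query = set(query_terms)
    matched = sum(1 for pattern in patterns if set(pattern) <= query)
    return matched / len(patterns)


def rank_clusters(query_terms, cluster_patterns):
    """Return cluster names ordered by descending score, ties broken by name."""
    scores = {
        name: pattern_score(query_terms, patterns)
        for name, patterns in cluster_patterns.items()
    }
    return sorted(scores, key=lambda name: (-scores[name], name))


# Hypothetical pattern bases, as if mined per cluster in an earlier step.
clusters = {
    "sports": [("football",), ("match", "score")],
    "finance": [("stock",), ("market", "price")],
    "empty": [],
}
query = ["stock", "market", "price", "today"]

ranking = rank_clusters(query, clusters)
print(ranking)
```

With a ranking of this kind, retrieval would only need to search the objects in the top-ranked clusters instead of the whole collection.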

Concept Blueprints Serving More Focused User Queries

Proceedings of the 2020 Federated Conference on Computer Science and Information Systems ◽

10.15439/2020f113 ◽

2020 ◽

Author(s):

Kurt Englmeier

Keyword(s):

User Queries
