SEMANTIC SEARCH OF SERVICES

2013 ◽  
Vol 07 (03) ◽  
pp. 257-290 ◽  
Author(s):  
KE HAO ◽  
PHILLIP C-Y SHEU ◽  
HIROSHI YAMAGUCHI

This paper addresses semantic search of Web services using natural language processing. We first survey existing approaches, noting that the high cost of current semantic annotation frameworks limits the use of semantic search in large-scale applications. We then propose a service search framework based on the vector space model that combines the traditional frequency-weighted term-document matrix with syntactic information extracted from a lexical database and a dependency grammar parser. In particular, instead of using terms as the rows of the term-document matrix, we propose using synsets from WordNet, which distinguish the different meanings a word takes in different contexts and cluster different words with similar meanings. Based on the characteristics of Web service descriptions, we also propose an approach to identifying semantically important terms and adjusting their weights accordingly. Our experiments show that the proposed approach is effective.
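The synset-rows idea can be sketched in a few lines. The `SYNSET_OF` table below is a hypothetical stand-in for a WordNet lookup (the synset IDs and word list are invented for illustration), showing how synonyms collapse into a single matrix row:

```python
from collections import Counter

# Hypothetical stand-in for WordNet: several surface words map to one
# synset ID, so synonyms share a single row in the term-document matrix.
SYNSET_OF = {
    "car": "auto.n.01", "automobile": "auto.n.01",
    "search": "search.v.01", "find": "search.v.01",
    "service": "service.n.01",
}

def synset_document_vector(tokens):
    """Count synset occurrences instead of raw term occurrences."""
    return Counter(SYNSET_OF.get(t, t) for t in tokens)

doc_a = synset_document_vector(["car", "search", "service"])
doc_b = synset_document_vector(["automobile", "find", "service"])

# The two documents share only one surface word ("service"),
# yet they share all three synset rows.
shared_rows = set(doc_a) & set(doc_b)
```

In a real system the word-to-synset step would require disambiguation (the same surface word can belong to several synsets), which is exactly where the paper's use of context comes in.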

2019 ◽  
Vol 11 (5) ◽  
pp. 114 ◽  
Author(s):  
Korawit Orkphol ◽  
Wu Yang

Words have different meanings (i.e., senses) depending on the context. Disambiguating the correct sense is an important and challenging task for natural language processing. An intuitive approach is to select the sense whose definition has the highest similarity to the context, with sense definitions provided by WordNet, a large lexical database of English in which nouns, verbs, adjectives, and adverbs are grouped into sets of cognitive synonyms interlinked through conceptual-semantic and lexical relations. Traditional unsupervised approaches compute similarity by counting words that overlap between the context and the sense definitions, requiring exact matches. Similarity should instead be computed from how words are related, by representing the context and the sense definitions in a vector space model and analyzing the distributional semantic relationships among them with latent semantic analysis (LSA). However, as a corpus of text grows, LSA consumes much more memory and does not scale flexibly to training on a huge corpus. A word-embedding approach has an advantage here: Word2vec is a popular word-embedding method that represents words in a fixed-size vector space through either the skip-gram or the continuous bag-of-words (CBOW) model, and it captures semantic and syntactic word similarities from a huge corpus more effectively than LSA. Our method uses Word2vec to construct a context sentence vector and sense definition vectors, then scores each word sense by the cosine similarity between those sentence vectors. The sense definitions are also expanded with sense relations retrieved from WordNet. If a score does not exceed a specific threshold, it is combined with the probability of that sense's distribution learned from SEMCOR, a large sense-tagged corpus. The senses with the highest scores are taken as the possible answers. 
Our method achieves a result (50.9%, or 48.7% without the probability of sense distribution) higher than the baselines (i.e., the original, simplified, adapted, and LSA Lesk) and outperforms many unsupervised systems that participated in the SENSEVAL-3 English lexical sample task.
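As a rough sketch of the scoring step (not the authors' implementation): tiny hand-made 3-d vectors stand in for trained Word2vec embeddings, the gloss words, threshold, and sense priors are all invented for illustration, and the way the prior is combined with the score is one plausible choice, since the abstract does not fix it:

```python
import math

# Hand-made 3-d vectors standing in for trained Word2vec embeddings
# (all values hypothetical).
EMB = {
    "bank": [0.5, 0.5, 0.0], "river": [0.9, 0.1, 0.0],
    "water": [0.8, 0.2, 0.0], "money": [0.1, 0.9, 0.0],
    "deposit": [0.2, 0.8, 0.0], "slope": [0.9, 0.0, 0.1],
}

def sent_vec(words):
    """Average the word vectors to get one sentence vector."""
    vs = [EMB[w] for w in words if w in EMB]
    return [sum(col) / len(vs) for col in zip(*vs)]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(y * y for y in b)))

context = sent_vec(["river", "water"])
# Gloss words per candidate sense (illustrative, not real WordNet glosses).
senses = {
    "bank.n.01": ["slope", "river"],    # sloping land beside water
    "bank.n.02": ["money", "deposit"],  # financial institution
}
scores = {s: cosine(context, sent_vec(g)) for s, g in senses.items()}

# If no score clears the threshold, fall back on the sense prior; the
# combination rule and prior values here are assumptions, not the paper's.
THRESHOLD = 0.95
PRIOR = {"bank.n.01": 0.25, "bank.n.02": 0.75}
if max(scores.values()) <= THRESHOLD:
    scores = {s: v * PRIOR[s] for s, v in scores.items()}
best = max(scores, key=scores.get)
```

With a river-related context, the "sloping land" sense wins on cosine similarity alone, so the prior fallback never fires in this toy run.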


2019 ◽  
Vol 3 (2) ◽  
pp. 257-264
Author(s):  
Bayu Sugara ◽  
Dody Dody ◽  
Donny Donny

Information is now very easy to obtain anywhere, and information technology, particularly the internet, supports very rapid exchange of information. The internet has become an information and communication medium used by many people with many interests, especially for retrieving large-scale information; unfortunately, the information presented is sometimes not relevant. The quality of information depends on its relevance, accuracy, and timeliness, yet few effective search systems are available. This study discusses the implementation of an information retrieval system for finding and matching the symptoms of autism disorders using the Vector Space Model (VSM). The VSM measures the similarity between a document and a query: both are represented as vectors in an n-dimensional space, where n is the number of distinct terms. The purpose of this study was to design information retrieval software to find and match the symptoms of autism disorders. Using the VSM, the search engine matches text in the database against given keywords and presents the results as a ranked list.
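A minimal sketch of the VSM matching described above, with hypothetical symptom snippets as the document collection; a real system would add stemming and stopword removal:

```python
import math
from collections import Counter

# Hypothetical document snippets, invented for illustration.
docs = {
    "d1": "delayed speech and language skills",
    "d2": "repetitive behaviors and restricted interests",
    "d3": "difficulty with eye contact and social interaction",
}

def tfidf_vectors(texts):
    """Build smoothed TF-IDF vectors for a dict of id -> text."""
    tokenized = {k: v.split() for k, v in texts.items()}
    n = len(tokenized)
    df = Counter(t for toks in tokenized.values() for t in set(toks))
    idf = {t: math.log(n / df[t]) + 1.0 for t in df}
    vecs = {k: {t: c * idf[t] for t, c in Counter(toks).items()}
            for k, toks in tokenized.items()}
    return vecs, idf

def cosine(u, v):
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

vecs, idf = tfidf_vectors(docs)
query = "speech and language delay"
qv = {t: c * idf.get(t, 0.0) for t, c in Counter(query.split()).items()}
# Present results ranked by similarity to the query, as the paper describes.
ranked = sorted(vecs, key=lambda d: cosine(qv, vecs[d]), reverse=True)
```

Here the query about speech and language correctly ranks the speech-delay snippet first, even though terms unseen in the collection ("delay") contribute nothing.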


2020 ◽  
Author(s):  
Yuqi Kong ◽  
Fanchao Meng ◽  
Ben Carterette

Comparing document semantics is one of the toughest tasks in both Natural Language Processing and Information Retrieval. To date, tools for this task remain rare, and most relevant methods are devised from statistical or vector space model perspectives, with nearly none taking a topological perspective. In this paper, we hope to strike a different note: we propose a novel algorithm based on topological persistence for comparing the semantic similarity of two documents. Our experiments are conducted on a document dataset with human judges' results, and a collection of state-of-the-art methods is selected for comparison. The experimental results show that our algorithm produces highly human-consistent results and outperforms most state-of-the-art methods, tying with NLTK.


2009 ◽  
Vol 18 (02) ◽  
pp. 239-272 ◽  
Author(s):  
SUJEEVAN ASEERVATHAM

Kernels are widely used in Natural Language Processing as similarity measures within inner-product-based learning methods such as the Support Vector Machine. The Vector Space Model (VSM) is extensively used for the spatial representation of documents, but it is a purely statistical representation. In this paper, we present a Concept Vector Space Model (CVSM) representation that uses linguistic prior knowledge to capture the meanings of documents, and we propose a linear kernel and a latent kernel for this space. The linear kernel takes advantage of the linguistic concepts, whereas the latent kernel combines statistical and linguistic concepts: it uses latent concepts extracted by Latent Semantic Analysis (LSA) in the CVSM. The kernels were evaluated on a text categorization task in the biomedical domain using the Ohsumed corpus, which is well known for being difficult to categorize. The results show that the CVSM improves performance compared to the VSM.
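The linear kernel over the concept space can be sketched as follows. The `CONCEPT` table is a toy stand-in for the linguistic prior knowledge (the paper works in the biomedical domain, so a real mapping would come from a medical thesaurus), and the latent kernel would additionally apply LSA to these concept vectors:

```python
from collections import Counter

# Toy concept map standing in for linguistic prior knowledge;
# all entries are invented for illustration.
CONCEPT = {
    "heart": "C_cardiac", "cardiac": "C_cardiac",
    "tumor": "C_neoplasm", "neoplasm": "C_neoplasm", "cancer": "C_neoplasm",
}

def concept_vector(tokens):
    """Map each word to its concept and count occurrences (phi in the CVSM)."""
    return Counter(CONCEPT.get(t, t) for t in tokens)

def linear_kernel(u, v):
    """k(d1, d2) = <phi(d1), phi(d2)> computed in the concept space."""
    return sum(c * v.get(t, 0) for t, c in u.items())

d1 = concept_vector("cardiac tumor study".split())
d2 = concept_vector("heart cancer report".split())
k = linear_kernel(d1, d2)  # synonyms match through shared concepts
```

In a plain term-space VSM these two snippets would have zero inner product; mapping words to concepts first gives them a nonzero kernel value, which is the point of the CVSM.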


2018 ◽  
Vol 17 (2) ◽  
pp. 313-324 ◽  
Author(s):  
Abdul Majid ◽  
Mukhtaj Khan ◽  
Nadeem Iqbal ◽  
Mian Ahmad Jan ◽  
Mushtaq Khan ◽  
...  


Author(s):  
Anthony Anggrawan ◽  
Azhari

Information searching based on users' queries, which aims to find the documents that meet users' needs, is known as Information Retrieval. This research uses the Vector Space Model method to determine the similarity percentage of each student's assignment, implemented with PHP and a MySQL database. The result is presented by ranking documents according to their similarity to the query, with a mean average precision of 0.874, which indicates how closely the application agrees with examination by experts; this value was obtained from an evaluation with 5 queries compared against 25 sample documents. The more assignments with high similarity there are to count, the more time the similarity computation requires, depending on the number of submitted assignments.
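Mean average precision, the evaluation measure quoted above, can be computed as follows; the ranking and relevance sets here are invented for illustration (MAP is then the mean of AP over all queries, five in the paper):

```python
def average_precision(ranked, relevant):
    """AP: mean of precision@k over the ranks k where a relevant doc appears."""
    hits, precisions = 0, []
    for k, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            precisions.append(hits / k)
    return sum(precisions) / len(relevant) if relevant else 0.0

# Hypothetical ranking for one query against five assignments.
ap = average_precision(["a3", "a1", "a5", "a2"], relevant={"a3", "a5"})
```

The relevant documents appear at ranks 1 and 3, giving precisions 1/1 and 2/3, so AP is their mean, 5/6.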


2018 ◽  
Vol 9 (2) ◽  
pp. 97-105
Author(s):  
Richard Firdaus Oeyliawan ◽  
Dennis Gunawan

A library is one of the facilities that provides information and knowledge resources, acting as an academic helper for readers seeking information. Because of the huge number of books a library holds, readers usually have difficulty finding the books they need. Universitas Multimedia Nusantara uses the Senayan Library Management System (SLiMS) as its library catalogue. SLiMS has many features that help readers, but it still lacks a recommendation feature to help readers find books relevant to a specific book they have chosen. The application described here was developed using the Vector Space Model to represent each document as a vector; its recommendations are based on the similarity of the books' descriptions. In a testing phase using a one-language sample of relevant books, the F-measure obtained was 55% at a cosine similarity threshold of 0.1. The books' descriptions and the variety of languages affect the F-measure obtained.

Index Terms—Book Recommendation, Porter Stemmer, SLiMS Universitas Multimedia Nusantara, TF-IDF, Vector Space Model
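The F-measure reported above is the harmonic mean of precision and recall over the recommended set; a minimal sketch with hypothetical book IDs (the idea being that books whose description similarity exceeds the 0.1 threshold form the recommended set):

```python
def f_measure(recommended, relevant):
    """F1 = harmonic mean of precision and recall for a recommendation set."""
    tp = len(recommended & relevant)  # correctly recommended books
    if tp == 0:
        return 0.0
    precision = tp / len(recommended)
    recall = tp / len(relevant)
    return 2 * precision * recall / (precision + recall)

# Hypothetical sets: books passing the similarity threshold vs. the
# books judged relevant by the evaluators.
recommended = {"b1", "b2", "b4"}
relevant = {"b1", "b3", "b4", "b5"}
f1 = f_measure(recommended, relevant)
```

With 2 true positives out of 3 recommendations and 4 relevant books, precision is 2/3 and recall is 1/2, giving F1 = 4/7 (about 57%), in the same spirit as the 55% the paper reports.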

