GeTFIRST: ontology-based keyword search towards semantic disambiguation

2015 ◽  
Vol 11 (4) ◽  
pp. 442-467 ◽  
Author(s):  
Hoang-Minh Nguyen ◽  
Hong-Quang Nguyen ◽  
Khoi-Nguyen Tran ◽  
Xuan-Vinh Vo

Purpose – This paper aims to improve the semantic-disambiguation capability of an information-retrieval system by taking advantage of a well-crafted classification tree. The unstructured nature and sheer volume of information accessible over networks have made it considerably harder for users to find relevant information. Many information-retrieval methods have been developed to address this problem, and the keyword-based approach is amongst the most common. Such an approach is often inadequate to cope with the conceptualization associated with user needs and contents. This brings about the problem of semantic ambiguity, that is, disagreement about the meaning of terms between the parties involved in a communication due to polysemy, which leads to increased complexity and lower accuracy in information integration, migration, retrieval and other related activities. Design/methodology/approach – A novel ontology-based search approach, named GeTFIRST (short for Graph-embedded Tree Fostering Information Retrieval SysTem), is proposed to disambiguate keywords semantically. The contribution is twofold. First, a search strategy is proposed to prune irrelevant concepts for accuracy improvement using our Graph-embedded Tree (GeT)-based ontology. Second, a path-based ranking algorithm is proposed to incorporate and reward content specificity. Findings – An empirical evaluation was performed on United States Patent and Trademark Office (USPTO) patent datasets to compare our approach with full-text patent search approaches. The results showed that GeTFIRST handled ambiguous keywords with higher keyword-disambiguation accuracy than traditional search approaches. Originality/value – The search approach of this paper copes with semantic ambiguity by using our proposed GeT-based ontology and a path-based ranking algorithm.

2016 ◽  
Vol 34 (4) ◽  
pp. 705-732 ◽  
Author(s):  
Young Man Ko ◽  
Min Sun Song ◽  
Seung Jun Lee

Purpose – The purpose of this paper is to construct a structural definition-based terminology ontology system that defines the meanings of academic terms on the basis of properties and links terms with properties that are structured by conceptual categories (classes). This study also aims to test the possibility of semantic searches by generating inference rules and setting very complicated search scenarios. Design/methodology/approach – For the study, 55,236 keywords from the articles of the “Korea Citation Index” were structurally defined, and relationships among terms and properties were built. Then, the authors converted the RDB data into RDF and designed ontologies using the ontology development tool Protégé. The authors also tested the designed ontology with the inference engine of the Protégé editor. The generated inference rules were tested by TBox and SPARQL queries. Findings – The authors generated inference control rules targeting high-input-ratio data in the properties of classes by calculating the input ratio of real input data in the system. The authors then executed semantic searches by SPARQL query, setting very complicated search scenarios for which it would be difficult to deduce results via a simple keyword search. As a result, it was confirmed that the search results show the logical combination of semantically related term data. Practical implications – Because the proposed terminology ontology system was constructed from the author keywords of research papers, it will be useful for retrieving research papers that include those keywords through complex combinations of semantic relations. The Structural Terminology Net database could also be utilized as an index database in retrieval services and in the mining of informal big data through the application of well-defined semantic concepts to each term. Originality/value – This paper presented a methodology for supporting IR using expanded queries based on a novel model of structural terminology-based ontology. A user who wants to access a specific topic can create a query that brings back semantically relevant information. The search results show the logical combination of semantically related term data, which would be difficult to deduce via traditional IR systems.


2017 ◽  
Vol 35 (3) ◽  
pp. 398-409
Author(s):  
Gracielle Mendonça Rodrigues Gomes ◽  
Beatriz Valadares Cendon

Purpose – The study aims to propose the use of the semiotics inspection method (SIM), an interpretative and qualitative method from semiotics engineering (SE) for evaluating the communicability of systems, to evaluate digital libraries and information retrieval systems (IRS). The paper presents the results of applying this method to evaluate the quality of the communicability of the interface and search system of the Coordination for the Improvement of Higher Education Personnel (CAPES) Portal of e-Journals, a major scientific digital library in Brazil. Proposed solutions to improve this system are included. Design/methodology/approach – The study used the SIM to evaluate the system. Two evaluators inspected the system. They compared and analyzed three types of metamessages (metalinguistic, static and dynamic). The metamessages generated by the evaluators were contrasted to find inconsistencies and ambiguities in the CAPES Portal of e-Journals. The last step of the method was the final assessment of the inspection. Findings – The evaluators identified 52 problems of communicability. These problems were ranked according to the severity ratings established by Nielsen (1994). They were grouped into ten types of problems present in the interface and in the search system of the CAPES Portal of e-Journals. Originality/value – This research contributes theoretically to the field of information retrieval and to the area of human–computer interaction and, in particular, to the theory of SE by adapting SE methods that allow the evaluation of communicability to the context of the scientific IRS. Results obtained through scientific methods should contribute to the development of the interface and search tools of IRS to better support query formulation and retrieval of relevant information and to satisfy the information needs of individuals more efficiently.


2015 ◽  
Vol 2015 ◽  
pp. 1-9 ◽  
Author(s):  
K. R. Uthayan ◽  
G. S. Anandha Mala

Ontology is the process of growth and elucidation of the concepts of an information domain that are common to a group of users. Introducing ontology into information retrieval is a common way to improve searches for the relevant information that users require. Matching keywords against a historical or information domain is significant in recent work on finding the best match for specific input queries. This research presents a better querying mechanism for information retrieval, which integrates ontology queries with keyword search. The ontology-based query is changed into a primary order to predicate logic uncertainty which is used for routing the query to the appropriate servers. Matching algorithms are a hot area of research in computer science and artificial intelligence. In text matching, it is more dependable to study the semantic model and the query for the conditions of semantic matching. This research develops the semantic matching results between input queries and information in the ontology field. The contributed algorithm is a hybrid method based on matching instances extracted from the queries and the information field. The queries and the information domain are focused on semantic matching, to discover the best match and to improve the execution process. In conclusion, the hybrid ontology in the semantic web is sufficient to retrieve the documents when compared to a standard ontology.


2013 ◽  
Vol 07 (04) ◽  
pp. 407-426
Author(s):  
Tuukka Ruotsalo ◽  
Matias Frosterus

Structured Web data are increasingly accessed using information retrieval methods, and information retrieval increasingly relies on structured background knowledge. As users' searches are often directed towards finding information about entities rather than text documents, a key affordance of semantic search is the ability to retrieve relevant information about entities more precisely by utilizing rich structured descriptions and background knowledge. Entity search also poses challenges for information retrieval methods. Entity descriptions are often short, and conventional search term matching alone can be insufficient. As a consequence, the search engine should be able to increase the recall of the returned results and select a representative set of entities for a user, that is, to diversify search results. This paper presents an approach to diversify entity search by using semantics present in and inferred from the initial entity search results. Our approach utilizes ontologies as a source of background knowledge to improve the recall of entity retrieval, and independent component analysis to detect independent latent components shared by the entities. The search results are then diversified by selecting a representative set of entities based on their membership in the independent components. We demonstrate the performance of our approach through retrieval experiments conducted using a real-world dataset composed of four entity databases. The results suggest that our approach can significantly improve the effectiveness and diversity of entity search.


Information retrieval has become a buzzword in today’s era of advanced computing. A tremendous amount of information is available over the Internet in the form of documents, which can be either structured or unstructured. It is difficult to retrieve relevant information from such a large pool. Traditional search engines based on keyword search are unable to give the desired relevant results, as they search the web on the basis of the keywords present in the submitted query. On the contrary, ontology-based semantic search engines provide relevant and quick results to the user, as the information stored in the semantic web is more meaningful. The paper gives a comparative study of ontology-based search engines and keyword-based ones. A few engines of each type were selected, and the same queries were run on each of them; the returned results were classified as relevant or non-relevant to compare the precision of the engines.
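The comparison described in this abstract rests on the standard precision measure: the number of returned results judged relevant divided by the number of results returned. The following is a minimal sketch of that calculation, not the authors' evaluation code. The engine names and the relevance judgments are made-up illustrative values, and the function names `precision` and `compare_engines` are hypothetical.

```python
from typing import Dict, List


def precision(judgments: List[bool]) -> float:
    """Fraction of returned results that were judged relevant."""
    if not judgments:
        # No results returned: report 0.0 rather than dividing by zero.
        return 0.0
    return sum(judgments) / len(judgments)


def compare_engines(results: Dict[str, List[bool]]) -> Dict[str, float]:
    """Compute precision per engine for the same query."""
    return {engine: precision(marks) for engine, marks in results.items()}


# Hypothetical judgments for one query: True = relevant, False = non-relevant.
judged = {
    "keyword_engine": [True, False, False, True, False],
    "ontology_engine": [True, True, False, True, True],
}

scores = compare_engines(judged)
for engine, score in scores.items():
    print(f"{engine}: precision = {score:.2f}")
```

Averaging such per-query scores over all test queries would give one figure per engine; whether the paper aggregates this way is not stated in the abstract.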


Mathematics ◽  
2021 ◽  
Vol 9 (3) ◽  
pp. 238
Author(s):  
Yuna Hur ◽  
Jaechoon Jo

A significant amount of digital cultural contents is shared online, but learners do not know where subject matter content is or how to find it. Therefore, there is a need for a service to improve educational quality by effectively providing relevant information in response to searches for content that is useful to learners. This study developed and tested the usability and utility of an intelligent information system that effectively searches and visualizes digital cultural contents. The system collects data on digital cultural contents, automatically classifies them, and creates content triple data to automatically display the results with a 3D timeline, knowledge network map, and keyword relation network map through content search, triple search, and keyword search. We also conducted a survey and in-depth interviews to verify users’ satisfaction with respect to the use and utility of the system. For the experiment, we developed survey questions to measure user satisfaction and conducted in-depth interviews regarding the system’s utility with a total of 65 subjects. The results show that the response for satisfaction with regard to the use and utility was generally “satisfied”. In addition, the system stability was evaluated as “high”.


Heliyon ◽  
2021 ◽  
Vol 7 (2) ◽  
pp. e06257
Author(s):  
Ennio Idrobo-Ávila ◽  
Humberto Loaiza-Correa ◽  
Rubiel Vargas-Cañas ◽  
Flavio Muñoz-Bolaños ◽  
Leon van Noorden

2020 ◽  
pp. 102986492097216
Author(s):  
Gaelen Thomas Dickson ◽  
Emery Schubert

Background: Music is thought to be beneficial as a sleep aid. However, little research has explicitly investigated the specific characteristics of music that aid sleep and some researchers assume that music described as generically sedative (slow, with low rhythmic activity) is necessarily conducive to sleep, without directly interrogating this assumption. This study aimed to ascertain the features of music that aid sleep. Method: As part of an online survey, 161 students reported the pieces of music they had used to aid sleep, successfully or unsuccessfully. The participants reported 167 pieces, some more often than others. Nine features of the pieces were analyzed using a combination of music information retrieval methods and aural analysis. Results: Of the pieces reported by participants, 78% were successful in aiding sleep. The features they had in common were that (a) their main frequency register was middle range frequencies; (b) their tempo was medium; (c) their articulation was legato; (d) they were in the major mode, and (e) lyrics were present. They differed from pieces that were unsuccessful in aiding sleep in that (a) their main frequency register was lower; (b) their articulation was legato, and (c) they excluded high rhythmic activity. Conclusion: Music that aids sleep is not necessarily sedative music, as defined in the literature, but some features of sedative music are associated with aiding sleep. In the present study, we identified the specific features of music that were reported to have been successful and unsuccessful in aiding sleep. The identification of these features has important implications for the selection of pieces of music used in research on sleep.


2021 ◽  
pp. 1-11
Author(s):  
V.S. Anoop ◽  
P. Deepak ◽  
S. Asharaf

Online social networks are considered to be one of the most disruptive platforms, where people communicate with each other on any topic ranging from funny cat videos to cancer support. The widespread diffusion of mobile platforms such as smartphones causes the number of messages shared on such platforms to grow heavily, so more intelligent and scalable algorithms are needed for efficient extraction of useful information. This paper proposes a method for retrieving relevant information from social network messages using a distributional semantics-based framework powered by topic modeling. The proposed framework combines Latent Dirichlet Allocation and a distributional representation of phrases (Phrase2Vec) for effective information retrieval from online social networks. Extensive and systematic experiments on messages collected from Twitter (tweets) show that this approach outperforms some state-of-the-art approaches in terms of precision and accuracy, and that better information retrieval is possible using the proposed method.


2020 ◽  
Vol 36 (S1) ◽  
pp. 10-10
Author(s):  
Vigdis Lauvrak ◽  
Kelly Farrah ◽  
Rosmin Esmail ◽  
Anna Lien Espeland ◽  
Elisabet Hafstad ◽  
...  

Introduction: In 2019, the Norwegian Institute for Public Health and Canadian Agency for Drugs and Technologies in Health (CADTH) received support from HTAi to produce a quarterly current awareness alert for the HTAi Disinvestment and Early Awareness Interest Group in collaboration with the HTAi Information Retrieval Interest Group. The alert focuses on methods and topical issues, and broader forecasts of potentially disruptive technologies that may be of interest to those involved in horizon scanning and disinvestment initiatives in health technology assessment (HTA). Methods: Information specialists at both agencies developed search strategies for disinvestment and for horizon scanning in PubMed and Google. The template for the alert was based on an e-newsletter developed by the Information Retrieval Interest Group. Information specialists and researchers reviewed the monthly (PubMed) and weekly (Google) search results and selected potentially relevant publications. Additional sources were also identified through regular HTA and horizon scanning work. Results: Alerts are posted quarterly on the HTAi Interest Group website; members receive an email notice when new alerts are available. While the revised PubMed searches are identifying relevant information, Google alerts have been disappointing, and this search may need to be revised further or dropped. When the one-year pilot project ends, in Fall 2020, interest group members will be surveyed to see if the alerts were useful, and whether they have suggestions for improving them. Conclusions: Collaborating on this alert service reduces duplication of effort between agencies, and makes new research in horizon scanning and disinvestment more accessible to colleagues in other agencies working in these areas.

