Engineering Knowledge Graph for Keyword Discovery in Patent Search

Author(s):  
Serhad Sarica ◽  
Binyang Song ◽  
En Low ◽  
Jianxi Luo

Abstract: Patent retrieval and analytics have become common tasks in engineering design and innovation. Keyword-based search is the most common method and the core of integrative methods for patent retrieval. Searchers often choose keywords intuitively, according to their own knowledge of the search topic, which may limit the coverage of the retrieval. Although one can identify additional keywords by reading patent texts from prior searches and heuristically refining the query terms, the process is tedious, time-consuming, and prone to human error. In this paper, we propose a method to automate and augment the heuristic and iterative keyword discovery process. Specifically, we train a semantic engineering knowledge graph on the full patent database using natural language processing and semantic analysis, and use it as the basis to retrieve and rank the keywords contained in the retrieved patents. On this basis, searchers do not need to read patent texts but simply select among the recommended keywords to expand their queries. The proposed method improves the completeness of the search keyword set and reduces the human effort for the same task.
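The ranking step described in this abstract can be illustrated with a minimal sketch. It assumes that each term has a vector representation and that candidate keywords are ranked by their mean cosine similarity to the query terms. The abstract does not specify the actual scoring used, so the function `rank_keywords`, the `TOY_EMBEDDINGS` table, and its values are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_keywords(query_terms, candidate_terms, embeddings):
    """Rank candidates by mean similarity to the query terms (assumed scoring)."""
    known_queries = [q for q in query_terms if q in embeddings]
    ranked = []
    for term in candidate_terms:
        # Skip terms already in the query and terms without a vector.
        if term in query_terms or term not in embeddings:
            continue
        sims = [cosine(embeddings[term], embeddings[q]) for q in known_queries]
        if sims:
            ranked.append((term, sum(sims) / len(sims)))
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked


# Hypothetical toy vectors; a real system would learn these from patent text.
TOY_EMBEDDINGS = {
    "drone": [0.9, 0.1, 0.0],
    "quadcopter": [0.8, 0.2, 0.1],
    "rotor": [0.6, 0.4, 0.0],
    "toaster": [0.0, 0.1, 0.9],
}

if __name__ == "__main__":
    candidates = ["quadcopter", "rotor", "toaster", "drone"]
    for term, score in rank_keywords(["drone"], candidates, TOY_EMBEDDINGS):
        print(f"{term}: {score:.3f}")
```

A searcher would then pick from the top of this list instead of reading the retrieved patents to find new query terms.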

2021 ◽  
Vol 47 (05) ◽  
Author(s):  
NGUYỄN CHÍ HIẾU

In recent years, knowledge graphs have been applied in many fields, such as search engines, semantic analysis, and question answering. However, building knowledge graphs faces many obstacles in terms of methodologies, data, and tools. This paper introduces a novel methodology for building a knowledge graph from heterogeneous documents. We use natural language processing and deep learning methods to build this graph. The knowledge graph can be used in question answering systems and information retrieval, especially in the computing domain.


Author(s):  
Radha Guha

Background: In the era of information overload, it is very difficult for a human reader to quickly make sense of the vast information available on the internet. Even for a specific domain such as a college or university website, it may be difficult for a user to browse through all the links to get relevant answers quickly. Objective: In this scenario, the design of a chatbot that can answer questions related to college information and compare colleges will be very useful and novel. Methods: In this paper, a novel conversational-interface chatbot application with information retrieval and text summarization skills is designed and implemented. First, this chatbot has a simple dialog skill: when it can understand the intent of the user query, it responds from the stored collection of answers. Second, for unknown queries, this chatbot can search the internet and then perform text summarization using advanced techniques of natural language processing (NLP) and text mining (TM). Results: The advances in NLP capabilities for information retrieval and text summarization using the machine learning techniques of Latent Semantic Analysis (LSA), Latent Dirichlet Allocation (LDA), Word2Vec, Global Vectors (GloVe), and TextRank are first reviewed and compared in this paper before being implemented in the chatbot design. This chatbot improves the user experience considerably by returning concise answers to specific queries, which takes less time than reading the entire document. Students, parents, and faculty can more efficiently get answers on a variety of information, such as admission criteria, fees, course offerings, notice board, attendance, grades, placements, faculty profiles, research papers, and patents. Conclusion: The purpose of this paper was to follow the advancement in NLP technologies and implement them in a novel application.


Entropy ◽  
2021 ◽  
Vol 23 (6) ◽  
pp. 664
Author(s):  
Nikos Kanakaris ◽  
Nikolaos Giarelis ◽  
Ilias Siachos ◽  
Nikos Karacapilidis

We consider the prediction of future research collaborations as a link prediction problem applied to a scientific knowledge graph. To the best of our knowledge, this is the first work on the prediction of future research collaborations that combines structural and textual information of a scientific knowledge graph through a purposeful integration of graph algorithms and natural language processing techniques. Our work: (i) investigates whether the integration of unstructured textual data into a single knowledge graph affects the performance of a link prediction model, (ii) studies the effect of previously proposed graph-kernel-based approaches on the performance of an ML model, as far as the link prediction problem is concerned, and (iii) proposes a three-phase pipeline that enables the exploitation of structural and textual information, as well as of pre-trained word embeddings. We benchmark the proposed approach against classical link prediction algorithms using accuracy, recall, and precision as our performance metrics. Finally, we empirically test our approach through various feature combinations with respect to the link prediction problem. Our experiments with the new COVID-19 Open Research Dataset demonstrate a significant improvement in the abovementioned performance metrics in the prediction of future research collaborations.
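For context, one classical structural baseline for link prediction is the Jaccard coefficient of two nodes' neighbor sets. The sketch below is an illustration only: the abstract does not say which classical algorithms were used as benchmarks, and the co-authorship edges and author names are hypothetical.

```python
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map from (author, author) pairs."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def jaccard_scores(adjacency):
    """Score each unlinked pair by |shared neighbors| / |all neighbors|."""
    scores = {}
    for u, v in combinations(sorted(adjacency), 2):
        if v in adjacency[u]:
            continue  # already collaborators, nothing to predict
        union = adjacency[u] | adjacency[v]
        if union:
            scores[(u, v)] = len(adjacency[u] & adjacency[v]) / len(union)
    return scores


# Hypothetical co-authorship edges for illustration.
COAUTHORSHIPS = [("ana", "ben"), ("ana", "cy"), ("ben", "cy"), ("cy", "dee")]

if __name__ == "__main__":
    graph = build_adjacency(COAUTHORSHIPS)
    for pair, score in sorted(jaccard_scores(graph).items()):
        print(pair, round(score, 2))
```

A higher score suggests a more likely future collaboration. This baseline uses graph structure only, whereas the approach in the abstract also draws on textual information.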


2013 ◽  
Vol 18 (2) ◽  
pp. 130-144 ◽  
Author(s):  
KEES DE BOT ◽  
CAROL JAENSCH

While research on third language (L3) and multilingualism has recently shown remarkable growth, the fundamental question of what makes trilingualism special compared to bilingualism, and indeed monolingualism, continues to be evaded. In this contribution we consider whether there is such a thing as a true monolingual, and if there is a difference between dialects, styles, registers and languages. While linguistic and psycholinguistic studies suggest differences in the processing of a third, compared to the first or second language, neurolinguistic research has shown that generally the same areas of the brain are activated during language use in proficient multilinguals. It is concluded that while from traditional linguistic and psycholinguistic perspectives there are grounds to differentiate monolingual, bilingual and multilingual processing, a more dynamic perspective on language processing in which development over time is the core issue, leads to a questioning of the notion of languages as separate entities in the brain.


Information ◽  
2021 ◽  
Vol 12 (10) ◽  
pp. 417
Author(s):  
Mohammed Khader ◽  
Marcel Karam ◽  
Hanna Fares

Cybersecurity is a multifaceted global phenomenon representing complex socio-technical challenges for governments and private sectors. With technology constantly evolving, the types and numbers of cyberattacks affect different users in different ways. The majority of recorded cyberattacks can be traced to human errors. Although such awareness is both knowledge- and environment-dependent, studies show that increasing users’ cybersecurity awareness is one of the most effective protective approaches. However, the intangible nature, socio-technical dependencies, constant technological evolutions, and ambiguous impact make it challenging to offer comprehensive strategies for better communicating and combatting cyberattacks. Research in the industrial sector has focused on creating institutional proprietary risk-aware cultures. In contrast, in academia, where cybersecurity awareness should be at the core of an academic institution’s mission to ensure all graduates are equipped with the skills to combat cyberattacks, most of the research has focused on understanding students’ attitudes and behaviors after infusing cybersecurity awareness topics into some courses in a program. This work proposes a conceptual Cybersecurity Awareness Framework to guide the implementation of systems to improve the cybersecurity awareness of graduates in any academic institution. This framework comprises constituents designed to continuously improve the development, integration, delivery, and assessment of cybersecurity knowledge into the curriculum of a university across different disciplines and majors; this framework would thus lead to a better awareness among all university graduates, the future workforce. This framework may serve as a blueprint that, once adjusted by academic institutions to accommodate their missions, guides institutions in developing or amending their policies and procedures for the design and assessment of cybersecurity awareness.


Author(s):  
Ángela Almela ◽  
Gema Alcaraz-Mármol ◽  
Arancha García-Pinar ◽  
Clara Pallejá

In this paper, the methods for developing a database of Spanish writing that can be used for forensic linguistic research are presented, including our data collection procedures. Specifically, the main instrument used for data collection has been translated into Spanish and adapted from Chaski (2001). It consists of ten tasks, by means of which the subjects are asked to write formal and informal texts about different topics. To date, 93 undergraduates from Spanish universities, as well as prisoners convicted of gender-based abuse, have participated in the study. A twofold analysis has been performed, since the data collected have been approached from a semantic and a morphosyntactic perspective. Regarding the semantic analysis, psycholinguistic categories have been used, many of them taken from the LIWC dictionary (Pennebaker et al., 2001). In order to obtain a more comprehensive depiction of the linguistic data, some other ad hoc categories have been created, based on the corpus itself, using a double-check method for their validation so as to ensure inter-rater reliability. Furthermore, as regards morphosyntactic analysis, the natural language processing tool ALIAS TATTLER is being developed for Spanish. Results show that it is possible to differentiate non-abusers from abusers with strong accuracy based on linguistic features.


2001 ◽  
Vol 13 (6) ◽  
pp. 829-843 ◽  
Author(s):  
A. L. Roskies ◽  
J. A. Fiez ◽  
D. A. Balota ◽  
M. E. Raichle ◽  
S. E. Petersen

To distinguish areas involved in the processing of word meaning (semantics) from other regions involved in lexical processing more generally, subjects were scanned with positron emission tomography (PET) while performing lexical tasks, three of which required varying degrees of semantic analysis and one that required phonological analysis. Three closely apposed regions in the left inferior frontal cortex and one in the right cerebellum were significantly active above baseline in the semantic tasks, but not in the nonsemantic task. The activity in two of the frontal regions was modulated by the difficulty of the semantic judgment. Other regions, including some in the left temporal cortex and the cerebellum, were active across all four language tasks. Thus, in addition to a number of regions known to be active during language processing, regions in the left inferior frontal cortex were specifically recruited during semantic processing in a task-dependent manner. A region in the right cerebellum may be functionally related to those in the left inferior frontal cortex. Discussion focuses on the implications of these results for current views regarding neural substrates of semantic processing.


Author(s):  
Shatakshi Singh ◽  
Kanika Gautam ◽  
Prachi Singhal ◽  
Sunil Kumar Jangir ◽  
Manish Kumar

The development of artificial intelligence (AI) in this decade has been astounding, and machine learning (ML) is one of its core subareas. The ML field is growing incessantly, and its demand and importance are rising with it. It has transformed the way data is extracted, analyzed, and interpreted. Computers are trained to operate in a self-training mode so that, when new data is fed in, they can learn, grow, change, and develop themselves without explicit programming. This helps to make useful predictions that can guide better decisions in real-life situations without human interference. Selecting an ML tool is always a challenging task, since choosing an appropriate tool can save time and make it faster and easier to provide a solution. This chapter provides a classification of various machine learning tools on the following aspects: for non-programmers; for model deployment; for computer vision, natural language processing, and audio; for reinforcement learning; and for data mining.


Author(s):  
Subhadra Dutta ◽  
Eric M. O’Rourke

Natural language processing (NLP) is the field of decoding human written language. This chapter responds to the growing interest in using machine learning–based NLP approaches for analyzing open-ended employee survey responses. These techniques address scalability and the ability to provide real-time insights to make qualitative data collection equally or more desirable in organizations. The chapter walks through the evolution of text analytics in industrial–organizational psychology and discusses relevant supervised and unsupervised machine learning NLP methods for survey text data, such as latent Dirichlet allocation, latent semantic analysis, sentiment analysis, word relatedness methods, and so on. The chapter also lays out preprocessing techniques and the trade-offs of growing NLP capabilities internally versus externally, points the readers to available resources, and ends with discussing implications and future directions of these approaches.

