semantic similarity analysis
Recently Published Documents


TOTAL DOCUMENTS

35
(FIVE YEARS 15)

H-INDEX

6
(FIVE YEARS 2)

2020 ◽  
Vol 18 (1) ◽  
pp. 1-7
Author(s):  
Adnen Mahmoud ◽  
Mounir Zrigui

Paraphrase detection determines whether an original and a suspect document convey the same meaning. It has attracted attention from researchers in many Natural Language Processing (NLP) tasks, such as plagiarism detection, question answering, and information retrieval. Traditional methods (e.g., Term Frequency-Inverse Document Frequency (TF-IDF), Latent Dirichlet Allocation (LDA), and Latent Semantic Analysis (LSA)) cannot efficiently capture hidden semantic relations when sentences share no common words or when words rarely co-occur. Therefore, we proposed a deep learning model based on Global Vectors (GloVe) word embeddings and a Recurrent Convolutional Neural Network (RCNN). It captured more contextual dependencies between word vectors with precise semantic meanings. Given the lack of publicly available resources for the Arabic language, we automatically developed a paraphrase corpus. It preserved the syntactic and semantic structures of Arabic sentences using a word2vec model and Part-Of-Speech (POS) annotation. Overall, experiments showed that our proposed model outperformed state-of-the-art methods in terms of precision and recall.
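The limitation of term-based methods noted in this abstract can be illustrated with a small sketch. It is not the authors' model: the smoothed TF-IDF weighting, the function names, and the three English example sentences are assumptions chosen only to show that a paraphrase with no shared words scores zero, while a sentence with a different meaning but shared words scores high.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document (smoothed TF-IDF, an assumed variant)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * (math.log((1 + n_docs) / (1 + df[term])) + 1.0)
            for term, count in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical sentences, not taken from the paper's corpus.
docs = [
    "the physician examined the patient",
    "a doctor checked her client",       # paraphrase with no shared words
    "the physician examined the chart",  # shares most words, different meaning
]
vecs = tfidf_vectors(docs)
paraphrase_score = cosine(vecs[0], vecs[1])
overlap_score = cosine(vecs[0], vecs[2])
print(paraphrase_score, overlap_score)
```

Because the first two sentences have no term in common, their vectors are orthogonal and the score carries no signal about meaning. This is the gap that embedding-based models such as the one described above aim to close.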


2020 ◽  
Vol 107 (4) ◽  
pp. 683-697 ◽  
Author(s):  
Peter D. Galer ◽  
Shiva Ganesan ◽  
David Lewis-Smith ◽  
Sarah E. McKeown ◽  
Manuela Pendziwiat ◽  
...  

Author(s):  
Sukumar Rajendran ◽  
Prabhu J.

Humankind evolves through the exchange of information and the extraction of knowledge from available information. The process of exchanging information differs by the probability of the medium through which the information is exchanged. The Internet of Things (IoT) contains millions of devices with sensors that simultaneously transfer real-time information to other devices as rapid streams of data that need to be processed on the go. This creates a need for effective and efficient approaches to segregating data based on class, relatedness, and differences in the information. Text is extracted from images with Tesseract, irrespective of the language. SciBERT models are used to extract scientific information and are evaluated on a suite of tasks, especially classifying drugs based on free data (tweets, images, etc.). The image- and text-based semantic similarity analysis groups similar drugs together by composition or manufacturer.


The similarity between two synsets or concepts is a numerical measure of the degree to which the two objects are alike; similarity measures express the degree of closeness between two synsets or concepts. Similarity and dissimilarity are jointly referred to as proximity. Proximity measures are defined to have values in the interval [0, 1]. Text similarity covers term similarity, sentence similarity, and document similarity. Term similarity measures the similarity between individual tokens and words, sentence similarity measures the similarity between two or more sentences, and document similarity measures the similarity between two or more corpora. This paper compares knowledge-based, distribution-based, and prediction-based semantic models and shows how knowledge-based methods capture information and how prediction-based methods preserve semantic information.
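As a minimal sketch of a proximity measure bounded in [0, 1], the Jaccard coefficient can be applied at both the term and the sentence level. It is a surface-level stand-in chosen for illustration, not one of the knowledge-based or prediction-based models the paper studies. The helper names and example inputs are assumptions.

```python
def jaccard_similarity(a, b):
    """Similarity in [0, 1]: shared items divided by all distinct items."""
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        return 1.0  # two empty inputs are treated as identical
    return len(set_a & set_b) / len(set_a | set_b)


def jaccard_distance(a, b):
    """Dissimilarity in [0, 1], the complement of the similarity."""
    return 1.0 - jaccard_similarity(a, b)


def char_bigrams(term):
    """Split a term into overlapping two-character pieces."""
    term = term.lower()
    return [term[i:i + 2] for i in range(len(term) - 1)]


# Term level: compare character bigrams of two words.
term_sim = jaccard_similarity(char_bigrams("similar"), char_bigrams("similarity"))

# Sentence level: compare the word sets of two sentences.
sentence_sim = jaccard_similarity("the cat sat".split(), "the cat slept".split())

print(term_sim, sentence_sim)
```

Similarity and dissimilarity here sum to 1, which is one common way to relate the two sides of proximity. Document-level similarity would follow the same pattern over larger token sets.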


Searching is one of the most important ways for any user to find information on the internet. In ‘Syntactic Search’, a keyword-based matching technique is used. Search accuracy is improved by applying filters such as location, preference, and user history. However, the user's query or question and the best available answer or result on the internet may have no terms in common, or only a negligible number of common terms. In such cases, syntactic search cannot give the desired output, and ‘Semantic Search’ becomes important. Semantic search is difficult to implement because resources such as WordNet, ontologies, and annotations are unavailable. This work describes an end-to-end algorithm to improve the accuracy of semantic search. Four classification techniques are used: ANN, Decision Tree, SVM, and Naïve Bayes. The dataset comes from the TDIL project of the Ministry of Electronics and IT, Govt. of India. The repository contains 86 categories of text with more than a million sentences. After impressive results were obtained for the Bengali language, test runs were carried out for other Indian languages, and very good results were achieved. This research is extremely useful for automatic question answering systems, semantic similarity analysis, e-governance, and m-governance.


Author(s):  
Torben Andreas Mahnken

Abstract Background: The operative environment of companies is subject to constant change through digitization, Industry 4.0, artificial intelligence, and megatrends, which pose enormous challenges. One possible way to master these challenges is to develop knowledge from foreign industries. This phenomenon is called cross-industry innovation. The aim of this study is to identify major directions in cross-industry innovation research. An essential part of science is to build on and enhance already published knowledge. Identifying this knowledge through a comprehensible and reproducible process requires a systematic approach. For this study, a systematic literature review will be applied. During the review of the relevant literature, an understanding of the broad and deep nature of the literature emerges, and it becomes possible to identify research gaps, an identification that is often based on qualitative analyses. We will develop a new approach for analyzing literature based on semantic similarity analysis. Methods: To achieve the research objectives, a systematic literature review is applied in a slightly modified form. First, we will apply a systematic literature review protocol to establish reproducible results. Second, we develop a new approach for analyzing literature using a combination of semantic similarity analysis and cluster analysis. Conclusion: This review will aid in determining the directions of cross-industry innovation. Overall, six directions of cross-industry innovation research can be identified. Furthermore, this study offers a new approach for analyzing literature based on quantitative analyses such as semantic similarity analysis.
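The combination of similarity analysis and cluster analysis described in the Methods can be sketched roughly as follows. The abstract does not specify the similarity model or the clustering algorithm, so everything here is an assumption: bag-of-words cosine stands in for a semantic similarity measure, clusters are formed by linking any pair above a threshold (single-link grouping), and the four titles are hypothetical.

```python
import math
from collections import Counter


def bow_cosine(a, b):
    """Cosine similarity of word-count vectors (a stand-in for a semantic measure)."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(count * vb[term] for term, count in va.items())
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(sum(c * c for c in vb.values()))
    return dot / norm if norm else 0.0


def cluster_by_similarity(texts, threshold):
    """Group text indices so that any pair scoring >= threshold shares a cluster."""
    parent = list(range(len(texts)))

    def find(i):
        # Follow parent links to the cluster root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(texts)):
        for j in range(i + 1, len(texts)):
            if bow_cosine(texts[i], texts[j]) >= threshold:
                parent[find(j)] = find(i)  # merge the two clusters

    groups = {}
    for i in range(len(texts)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Hypothetical paper titles, not taken from the review.
titles = [
    "open innovation across industry boundaries",
    "innovation across industry boundaries in practice",
    "analogical thinking for product design",
    "analogical thinking in product design teams",
]
clusters = cluster_by_similarity(titles, threshold=0.5)
print(clusters)
```

Each resulting cluster would then be inspected and labeled as a candidate research direction. The threshold is a free parameter, and a different value or a different clustering method could change the number of directions found.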

