Context-Aware Text Matching Algorithm for Korean Peninsula Language Knowledge Base Based on Density Clustering

Most traditional methods perform text matching at the word level, which is unreliable because the semantic features of the text are ignored. This leads to low recall, high space utilization, and poor comprehensiveness of the matching results. Such methods also cannot process long and short texts simultaneously. The current study proposes a text matching algorithm for a Korean Peninsula language knowledge base based on density clustering. Using the deep multiview semantic document representation model, the semantic vector of the text to be matched is captured, and its semantic dependencies are used to extract the semantic features of the text. Based on the feature extraction results, text similarity is calculated by the subtree matching method, and a semantic classification model based on SWEM and a pseudo-twin network is designed for semantic text classification. Finally, text matching over the Korean Peninsula language knowledge base is carried out by applying a density clustering algorithm. Experimental results show that the proposed method achieves a high matching recall rate with low space requirements and can effectively match long and short texts concurrently.
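The final step described above, grouping semantic vectors by density clustering so that texts in the same cluster count as matches, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the abstract does not name the clustering variant, so a plain DBSCAN with cosine distance is assumed, and the vectors, `eps`, `min_pts`, and the helper `matches_for` are all hypothetical.

```python
import math

def cosine_distance(u, v):
    """Return 1 - cosine similarity for two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def region_query(vectors, idx, eps):
    """Indices of all vectors within eps of vectors[idx], including idx."""
    return [j for j, v in enumerate(vectors)
            if cosine_distance(vectors[idx], v) <= eps]

def dbscan(vectors, eps, min_pts):
    """Label each vector with a cluster id (0, 1, ...) or -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(vectors)
    cluster = -1
    for i in range(len(vectors)):
        if labels[i] is not UNVISITED:
            continue
        neighbours = region_query(vectors, i, eps)
        if len(neighbours) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbours if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point joins the cluster
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbours = region_query(vectors, j, eps)
            if len(j_neighbours) >= min_pts:
                queue.extend(j_neighbours)  # j is a core point: keep expanding
    return labels

def matches_for(query_idx, labels):
    """Hypothetical helper: other texts that share the query's cluster."""
    if labels[query_idx] == -1:
        return []
    return [j for j, lab in enumerate(labels)
            if lab == labels[query_idx] and j != query_idx]

# Toy semantic vectors (assumed values, standing in for real embeddings).
vectors = [
    [1.0, 0.1, 0.0], [0.9, 0.2, 0.0], [1.0, 0.0, 0.1],  # topic A
    [0.0, 1.0, 0.1], [0.1, 0.9, 0.0],                   # topic B
    [0.0, 0.1, 1.0],                                    # isolated text
]
labels = dbscan(vectors, eps=0.05, min_pts=2)
print(labels)                  # [0, 0, 0, 1, 1, -1]
print(matches_for(0, labels))  # [1, 2]
```

Because clusters are defined by density rather than by a fixed count, an isolated text is labelled as noise and returns no matches instead of being forced into the nearest group.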

Character-word Double-dimensional Semantic Classification Model for Judging Illegal and Irregular Behaviors for Internet Food Safety

2020 IEEE 20th International Conference on Software Quality, Reliability and Security Companion (QRS-C) ◽

10.1109/qrs-c51114.2020.00099 ◽

2020 ◽

Author(s):

Min Zuo ◽

Si-Yu He ◽

Qing-Chuan Zhang ◽

Qing-Bang Wang

Keyword(s):

Food Safety ◽

Classification Model ◽

Semantic Classification

Design of Library User Profile System Based on Dynamic Density Clustering Algorithm and Stream Computing

2021 IEEE 5th Advanced Information Technology, Electronic and Automation Control Conference (IAEAC) ◽

10.1109/iaeac50856.2021.9391113 ◽

2021 ◽

Author(s):

Jingpei Liao

Keyword(s):

Clustering Algorithm ◽

User Profile ◽

Stream Computing ◽

Dynamic Density ◽

Density Clustering ◽

Library User

Structure and Semantics of Tactile Verbs in a Comparative Aspect (On the Material of the English, German and Russian Languages)

Uchenye zapiski St. Petersburg University of Management Technologies and Economics ◽

10.35854/2541-8106-2021-3-58-63 ◽

2021 ◽

pp. 58-63

Author(s):

Yu. A. Sakhno

Keyword(s):

Comparative Study ◽

Semantic Features ◽

Semantic Classification ◽

Comparative Aspect ◽

Similarities And Differences ◽

The Comparative Study ◽

Linguistic Units

This article deals with the study of the structural and semantic features of tactile verbs (hereinafter TVs) in English, German and Russian. Particular attention is paid to the comparative study of TVs, which allows us to identify structural and semantic similarities and differences of linguistic units studied. The structural and semantic classification of TVs in the compared languages is also provided.

Feature Extraction and Mapping Construction for Mobile Robot via Ultrasonic MDP and Fuzzy Model

Sensors ◽

10.3390/s18113673 ◽

2018 ◽

Vol 18 (11) ◽

pp. 3673 ◽

Cited By ~ 2

Author(s):

Zhili Long ◽

Ronghua He ◽

Yuxiang He ◽

Haoyao Chen ◽

Zuohua Li

Keyword(s):

Feature Extraction ◽

Clustering Algorithm ◽

Mobile Robotics ◽

Low Cost ◽

Fuzzy Model ◽

Fuzzy Modeling ◽

Fuzzy Classification ◽

Classification Model ◽

Service Robot ◽

Ultrasonic Array

This paper presents a modeling approach to feature classification and environment mapping for indoor mobile robotics via a rotary ultrasonic array and fuzzy modeling. To compensate for the distance error detected by the ultrasonic sensor, a novel feature extraction approach termed “minimum distance of point” (MDP) is proposed to determine the accurate distance and location of target objects. A fuzzy model is established to recognize and classify object features such as flat surfaces, corners, and cylinders. An environmental map is constructed for automated robot navigation based on this fuzzy classification, combined with a clustering algorithm and least-squares fitting. First, the platform of the rotary ultrasonic array is established using four low-cost ultrasonic sensors and a motor. Fundamental measurements, such as the distance of objects at different rotary angles and with different object materials, are carried out. Second, the MDP feature extraction algorithm is proposed to extract precise object locations. Compared with the conventional range of constant distance (RCD) method, the MDP method can compensate for errors in feature location and feature matching. With the data clustering algorithm, a range of ultrasonic distances is attained and used as the input dataset. The fuzzy classification model, which includes rules for data fuzzification, reasoning, and defuzzification, is established to effectively recognize and classify the object feature types. Finally, accurate environment mapping of a service robot, based on MDP and fuzzy modeling of the measurements from the ultrasonic array, is demonstrated. Experimentally, the approach realizes environment mapping for mobile robotics with acceptable accuracy and low cost.
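The least-squares fitting step mentioned above can be illustrated for a flat surface. The sketch below is not the paper's code: it assumes each ultrasonic reading is a (distance, rotary angle) pair, converts the readings to Cartesian points, and fits a line with ordinary least squares. The function names and the sample readings are hypothetical.

```python
import math

def polar_to_cartesian(distance, angle_deg):
    """Convert one (distance, rotary angle) reading to an (x, y) point."""
    theta = math.radians(angle_deg)
    return distance * math.cos(theta), distance * math.sin(theta)

def fit_line(points):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:
        raise ValueError("all x values are equal: the line is vertical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Assumed readings: a wall 2 m in front of the sensor, parallel to the x-axis,
# seen at rotary angles of 60, 90 and 120 degrees.
readings = [(2.0 / math.sin(math.radians(a)), a) for a in (60, 90, 120)]
wall_points = [polar_to_cartesian(d, a) for d, a in readings]
slope, intercept = fit_line(wall_points)
print(round(slope, 6), round(intercept, 6))
```

A fit of the form y = slope * x + intercept cannot represent a wall parallel to the y-axis; a full mapping pipeline would need a different parameterisation for that case, such as fitting x as a function of y.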

An Integrated Approach to Product Delivery Planning and Scheduling

Scientific Journal of Riga Technical University Computer Sciences ◽

10.2478/v10143-011-0049-7 ◽

2011 ◽

Vol 45 (1) ◽

pp. 97-103 ◽

Cited By ~ 1

Author(s):

Galina Merkuryeva ◽

Vitaly Bolshakov ◽

Maksims Kornevs

Keyword(s):

Cluster Analysis ◽

Clustering Algorithm ◽

Expert Knowledge ◽

Real Life ◽

Integrated Approach ◽

Classification Model ◽

Performance Criteria ◽

Planning And Scheduling ◽

Delivery Planning ◽

Product Delivery

Product delivery planning and scheduling is a task of high priority in transport logistics. In distribution centres this task is related to deliveries of various types of goods in predefined time windows. In real-life applications the problem has different stochastic performance criteria and conditions. Optimisation of schedules is itself time-consuming and requires expert knowledge. In this paper an integrated approach to product delivery planning and scheduling is proposed. It is based on a cluster analysis of store demand data to identify typical dynamic demand patterns and tactical product delivery plans, and on simulation optimisation to find optimal parameters of transportation or vehicle schedules. Here, a cluster analysis of the demand data is performed using the K-means clustering algorithm and the mean values of silhouette plots, and an NBTree-based classification model is built. In order to find an optimal grouping of stores into regions based on their geographical locations and the total demand uniformly distributed over regions, a multiobjective optimisation problem is formulated and solved with the NSGA II algorithm.
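The cluster-analysis step described above, K-means combined with mean silhouette values, can be sketched in plain Python. This is an illustration under stated assumptions, not the authors' procedure: the demand points are invented, the candidate range for k is arbitrary, and a deterministic farthest-point initialisation is swapped in for the usual random one so that the example is reproducible.

```python
import math

def dist(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def kmeans(points, k, iterations=50):
    """K-means with farthest-point initialisation (an assumption, see lead-in)."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(dist(p, c) for c in centroids)))
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                updated.append(centroids[c])  # keep an empty cluster's centroid
        if updated == centroids:
            break
        centroids = updated
    return labels

def mean_silhouette(points, labels):
    """Mean silhouette value over all points; singleton clusters score 0."""
    clusters = set(labels)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    scores = []
    for i, p in enumerate(points):
        same = [dist(p, q) for j, q in enumerate(points)
                if labels[j] == labels[i] and j != i]
        if not same:
            scores.append(0.0)
            continue
        a = sum(same) / len(same)  # mean distance within the own cluster
        b = min(                   # mean distance to the nearest other cluster
            sum(dist(p, q) for j, q in enumerate(points) if labels[j] == c)
            / labels.count(c)
            for c in clusters if c != labels[i]
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

# Hypothetical store demand profiles (two features per store).
demand = [(1, 1), (1, 2), (2, 1), (10, 10), (10, 11), (11, 10),
          (20, 1), (20, 2), (21, 1)]
scores = {k: mean_silhouette(demand, kmeans(demand, k)) for k in (2, 3, 4)}
best_k = max(scores, key=scores.get)
print(best_k)  # 3
```

Choosing the k with the highest mean silhouette value is one common way to use silhouette plots; the abstract does not say exactly how the authors applied them.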

Text Mining Drug-Protein Interactions using an Ensemble of BERT, Sentence BERT and T5 models

10.1101/2021.10.26.465944 ◽

2021 ◽

Author(s):

Xin Sui ◽

Wanjing Wang ◽

Jinfeng Zhang

Keyword(s):

Protein Interactions ◽

Clustering Algorithm ◽

Data Augmentation ◽

Majority Vote ◽

Classification Model ◽

Ensemble Model ◽

K Nearest Neighbors ◽

Test Dataset ◽

Improved Performance ◽

Using Data

In this work, we trained an ensemble model for predicting drug-protein interactions within a sentence based only on its semantics. Our ensemble model was built from three separate models: 1) a classification model using a fine-tuned BERT model; 2) a fine-tuned sentence BERT model that embeds every sentence into a vector; and 3) another classification model using a fine-tuned T5 model. In all models, we further improved performance using data augmentation. For model 2, we predicted the label of a sentence using k-nearest neighbors on its embedded vector. We also explored two ways to ensemble these three models: a) a majority vote over the three models; and b) a further ensemble model, based on the HDBSCAN clustering algorithm, trained on features from all the models to make decisions. Our best model achieved an F1 score of 0.753 on the BioCreative VII Track 1 test dataset.
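Two of the steps described above lend themselves to a short sketch: predicting a label by k-nearest neighbors over sentence embeddings, and combining three model outputs by majority vote. The code below is only an illustration; the embeddings, the label names, and the stand-in predictions for the BERT and T5 classifiers are hypothetical, and cosine similarity is an assumed choice of neighbor metric.

```python
import math
from collections import Counter

def cosine_similarity(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def majority_vote(predictions):
    """Most common label; on a tie, the label that appeared first wins."""
    return Counter(predictions).most_common(1)[0][0]

def knn_predict(query, train_vectors, train_labels, k=3):
    """Label a query embedding by voting among its k most similar neighbors."""
    ranked = sorted(range(len(train_vectors)),
                    key=lambda i: cosine_similarity(query, train_vectors[i]),
                    reverse=True)
    return majority_vote([train_labels[i] for i in ranked[:k]])

# Toy sentence embeddings with hypothetical labels.
train_vectors = [[1.0, 0.0], [0.9, 0.1], [0.8, 0.3], [0.0, 1.0], [0.1, 0.9]]
train_labels = ["interaction", "interaction", "interaction", "none", "none"]

knn_label = knn_predict([0.95, 0.05], train_vectors, train_labels, k=3)

# Stand-ins for the outputs of the two fine-tuned classifiers.
bert_label = "interaction"
t5_label = "none"

final_label = majority_vote([bert_label, knn_label, t5_label])
print(knn_label, final_label)  # interaction interaction
```

With three voters and two labels a strict majority always exists; with more labels a three-way tie is possible, and this sketch then falls back to the first model's prediction.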

Lexical and semantic features of the names of herbaceous plants in dialects of the Khanty language

Finno-Ugric World ◽

10.15507/2076-2577.012.2020.04.400-410 ◽

2020 ◽

Vol 12 (4) ◽

pp. 400-410

Author(s):

Fedosia M. Lelkhova

Keyword(s):

Material Culture ◽

Semantic Analysis ◽

Word Formation ◽

Cultural Development ◽

Research Interest ◽

Herbaceous Plants ◽

Semantic Features ◽

Semantic Classification ◽

The People ◽

Plant Names

Introduction. The vocabulary of the plant world of the Khanty language contains a significant amount of information closely connected with the ethno-mentality, ethnography and thinking of the people. In this regard, plant vocabulary is one of the most interesting layers of the lexicon, since it reflects the degree of practical and cultural development of the surrounding nature. The purpose of the article is to establish the lexical and semantic features of the nominations of wild-growing herbs and to define their dialectal features. The aim of the research is to identify the nominations of herbs as completely as possible and to establish the lexical meaning of each name in the dialects of the language. The relevance of the topic is determined by the research interest in the differences between the dialects in theoretical and practical terms, by the attention recently paid to folk spiritual and material culture, and by the loss of certain plant names in the modern Khanty language. Materials and Methods. The study uses a set of methods and techniques for analyzing linguistic material: semantic classification, lexical-semantic analysis, word-formation analysis, and linguistic-geographical analysis, as well as elements of etymological analysis. Description is the main method for studying the names of plants. The material is drawn from Khanty vocabulary collected during field work; the material for the Eastern dialects comes from lexicographic publications. When collecting the lexical material, observation focused mainly on the speech of representatives of the older generation, as well as of people who lead a traditional way of life and retain the patterns of active spoken language. At the same time, not only words in the active vocabulary of speakers were recorded, but also words belonging to the passive vocabulary, which native speakers use only in conversations and when sharing memories of the past. Results and Discussion. The study of dialectal material based on the names of plants in the Khanty language is of great research interest. The life of the Khanty people has been closely connected with nature since ancient times; the vocabulary of the plant world covers almost all spheres of the economic activity of the Khanty and thus makes up a significant part of their vocabulary. In Khanty linguistics, this vocabulary has not yet been the subject of a special and detailed study, which makes it an urgent research task today. The article identifies the features that underlie the motivation of plant names and highlights the borrowed words. Conclusion. The collected vocabulary attests to the richness and breadth of the phytonymic vocabulary of the Khanty language. The authors collected about 50 Khanty names of wild herbaceous plants in the Northern and Eastern dialects of the Khanty language. As a result of the research, new lexemes were identified and described, and the interpretation of the semantics of lexemes was clarified. Late borrowings of Russian origin are recorded. It was found that some dialect words are not actively used in the modern Khanty language. In the flora vocabulary, the diversity and multiplicity of the nomination principles were revealed.

Research and Implementation of the Text Matching Algorithm in the Field of Housing Law and Policy Based on Deep Learning

Complexity ◽

10.1155/2021/3165600 ◽

2021 ◽

Vol 2021 ◽

pp. 1-9

Author(s):

Yin Xu ◽

Hong Ma

Keyword(s):

Deep Learning ◽

Affordable Housing ◽

Legal System ◽

Learning Algorithm ◽

Matching Algorithm ◽

Advantages And Disadvantages ◽

Housing Security ◽

Law And Policy ◽

Depth Learning ◽

Text Matching

Machine learning enables machines to learn rules, through algorithms, from large amounts of data input from the outside world, so as to identify and judge. It is the main task of the government to further emphasize the importance of improving the housing security mechanism, expand the proportion of affordable housing, increase financial investment, improve the construction quality of affordable housing, and ensure fair distribution. The legal system of housing security is essentially a system for solving the social problems brought about by housing marketization, and it is an important part of the national housing system as a whole. More and more attention has been paid to solving the housing difficulties of low- and middle-income people and to establishing a housing security legal system suited to China’s national conditions and development stage. Aiming at the deep learning problem, a text matching algorithm suitable for the field of housing law and policy is proposed. A classifier based on a matching algorithm is a promising classification technique. Research on the legal system of housing security is at an exploratory stage and involves various theoretical and practical studies. The improved deep learning algorithm is compared with the general algorithm in order to clarify the advantages and disadvantages of both. This paper introduces the practical application of the deep learning model and the fast learning algorithm in detail. It also puts forward the proposal to transform it into an independent public law basis or into an independent savings system.
