Topic Detection Using Multiple Semantic Spider Hunting Algorithm

At every moment, a huge volume of data and information is communicated through social networks. Analyzing large amounts of text data manually is tedious, time consuming and expensive, and manual sorting leads to mistakes and inconsistency. Automated document processing is still not capable of extracting data as a human reader does. Furthermore, the significance of the content of a text may differ from one reader to another. The proposed Multiple Spider Hunting Algorithm reduces time complexity by using multiple spiders instead of a single spider move. The construction of spiders is dynamic and depends on the volume of the corpus. In some cases tokens may relate to more than one topic, so topics need to be detected semantically. The Multiple Semantic Spider Hunting Algorithm is proposed based on the semantics among terms, with associations between words drawn using semantic lexicons. Topics or lists of opinions are generated from the knowledge graph. News articles are gathered from five different topics: sports, business, education, tourism and media. The usefulness of the proposed algorithms is evaluated using precision, recall, f-measure, accuracy, true positives, false positives and topic detection percentage. The Multiple Semantic Spider Hunting Algorithm produced good results. The topic detection percentage of the Spider Hunting Algorithm is compared with that of Naïve Bayes, Neural Network, Decision Tree and Particle Swarm Optimization. The Spider Hunting Algorithm produced more than 90% precise detection of topics and subtopics.
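The evaluation factors named in this abstract can all be derived from confusion counts. The following is a minimal sketch using the standard definitions; the function name and the counts are hypothetical and are not taken from the paper.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return precision, recall, F-measure and accuracy from confusion counts."""
    # Precision: share of detected items that are correct.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: share of relevant items that were detected.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F-measure: harmonic mean of precision and recall.
    f_measure = (2 * precision * recall / (precision + recall)
                 if (precision + recall) else 0.0)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "accuracy": accuracy,
    }


# Hypothetical counts for a single topic (illustration only).
metrics = classification_metrics(tp=45, fp=5, fn=3, tn=47)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

For a multi-topic corpus, such counts would typically be computed per topic and then averaged; which averaging scheme the paper used is not stated in the abstract.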


Aspect-level sentiment analysis merged with knowledge graph and graph convolutional neural network

Journal of Physics Conference Series ◽

10.1088/1742-6596/2083/4/042044 ◽

2021 ◽

Vol 2083 (4) ◽

pp. 042044

Author(s):

Zuhua Dai ◽

Yuanyuan Liu ◽

Shilong Di ◽

Qi Fan

Keyword(s):

Neural Network ◽

Sentiment Analysis ◽

Structural Information ◽

Knowledge Graph ◽

Convolutional Network ◽

Text Data ◽

Short Text ◽

Fine Grained ◽

Syntactic Information ◽

Text Information

Abstract. Aspect-level sentiment analysis belongs to fine-grained sentiment analysis, which has attracted extensive research in academic circles in recent years. For this task, the recurrent neural network (RNN) model is usually used for feature extraction, but the model cannot effectively obtain the structural information of the text. Recent studies have begun to use the graph convolutional network (GCN) to model the syntactic dependency tree of the text to solve this problem. For short text data, the text information is not enough to accurately determine the emotional polarity of the aspect words, and the knowledge graph is not effectively used as external knowledge that can enrich the semantic information. To solve these problems, this paper proposes a graph convolutional neural network (GCN) model that can process syntactic information, knowledge graphs and text semantic information. The model works on the "syntax-knowledge" graph to extract syntactic information and common sense information at the same time. Compared with the latest model, the model in this paper effectively improves the accuracy of aspect-level sentiment classification on two datasets.
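To illustrate how a GCN propagates information along the edges of a dependency tree, here is a minimal sketch of a single graph-convolution step. It assumes the widely used symmetric normalisation with self-loops; the abstract does not say which variant the paper uses, and the three-token graph, features and weights are hypothetical.

```python
import math


def gcn_layer(adj, features, weights):
    """One graph-convolution step: ReLU(D^-1/2 (A + I) D^-1/2 H W)."""
    n = len(adj)
    # Add self-loops so that each node keeps part of its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    # Symmetric normalisation by node degree.
    norm = [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]
    n_in, n_out = len(weights), len(weights[0])
    # Aggregate neighbour features: norm @ features.
    agg = [[sum(norm[i][k] * features[k][f] for k in range(n))
            for f in range(n_in)] for i in range(n)]
    # Linear transform followed by ReLU: max(0, agg @ weights).
    return [[max(0.0, sum(agg[i][f] * weights[f][o] for f in range(n_in)))
             for o in range(n_out)] for i in range(n)]


# Hypothetical dependency chain over three tokens: token0 - token1 - token2.
adjacency = [[0, 1, 0],
             [1, 0, 1],
             [0, 1, 0]]
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
hidden = gcn_layer(adjacency, identity, identity)
for row in hidden:
    print([round(v, 3) for v in row])
```

With identity features and weights, the output is just the normalised adjacency matrix, which makes it easy to see that token0 receives nothing from token2 after one layer; stacking layers widens the neighbourhood.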


Spam Mail Filtering Using Data Mining Approach

Handling Priority Inversion in Time-Constrained Distributed Databases - Advances in Data Mining and Database Management ◽

10.4018/978-1-7998-2491-6.ch015 ◽

2020 ◽

pp. 253-282 ◽

Cited By ~ 3

Author(s):

Ajay Kumar Gupta

Keyword(s):

Decision Tree ◽

Classification Accuracy ◽

Time Complexity ◽

Identification Accuracy ◽

Tree Model ◽

Swarm Optimization ◽

Spam Filter ◽

Data Mining Approach ◽

Lower Complexity ◽

Using Data

This chapter presents an overview of spam email as a serious problem in our internet world and creates a spam filter that reduces previous weaknesses and provides better identification accuracy with less complexity. Since the J48 decision tree is a widely used classification technique due to its simple structure, higher classification accuracy, and lower time complexity, it is used as the spam mail classifier here. With lower complexity, however, it becomes difficult to achieve higher accuracy when the number of records is large. To overcome this problem, particle swarm optimization is used to optimize the spam base dataset, thus optimizing the decision tree model as well as reducing the time complexity. Once the records have been standardized, the decision tree is used again to check the accuracy of the classification. The chapter also presents a study of various spam-related issues, the filters used, related work, and the potential scope of spam filtering.


K-Nearest Neighbor Berbasis Particle Swarm Optimization untuk Analisis Sentimen Terhadap Tokopedia

Jurnal Teknik Informatika dan Sistem Informasi ◽

10.28932/jutisi.v6i2.2658 ◽

2020 ◽

Vol 6 (2) ◽

Author(s):

Dicki Pajri ◽

Yuyun Umaidah ◽

Tesa Nur Padilah

Keyword(s):

Particle Swarm Optimization ◽

Nearest Neighbor ◽

Confusion Matrix ◽

Particle Swarm ◽

K Nearest Neighbor ◽

Swarm Optimization ◽

Evaluation Result ◽

Source Of Information ◽

F Measure ◽

Evaluation Accuracy

Tokopedia is a popular e-commerce marketplace in Indonesia. Customers' perceptions of Tokopedia expressed on Twitter can be an important source of information and can be processed into useful insights. Sentiment analysis is one way to process these perceptions, here using K-Nearest Neighbor based on Particle Swarm Optimization. The purpose of this study is to classify customers' perceptions into positive, neutral, and negative classes. The test is carried out with four different scenarios and k values, which are evaluated using a confusion matrix. The best evaluation result, 88.11%, was obtained with a 90:10 dataset split and k = 1. Feature selection was then applied using Particle Swarm Optimization with 20 iterations and 10 particles. It produced a best evaluation accuracy of 97.9%, with 96.17% precision, 96.62% recall, and a 96.39% f-measure.
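As a quick consistency check, the reported f-measure can be recomputed from the reported precision and recall, assuming the usual definition of the f-measure as their harmonic mean (the abstract does not state the formula explicitly):

```latex
\[
F = \frac{2PR}{P + R}
  = \frac{2 \times 0.9617 \times 0.9662}{0.9617 + 0.9662}
  = \frac{1.8584}{1.9279}
  \approx 0.9639
\]
```

This agrees with the 96.39% given in the abstract.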


Hotel Recommender System based on Knowledge Graph and Collaborative Approach

International Journal of Computing ◽

10.47839/ijc.20.1.2093 ◽

2021 ◽

pp. 63-71

Author(s):

Yousef Abuzir ◽

Mohamed Dwieb

Keyword(s):

Recommendation System ◽

Hybrid Approach ◽

Recommendation Systems ◽

Knowledge Graph ◽

Similarity Function ◽

Collaborative Approach ◽

Analysis Techniques ◽

Business Sectors ◽

F Measure

With the rapid growth of information technology, online services and social media, recommendation systems have become an important issue and a need for both customers and business sectors. The main aim of traditional and online recommendation systems is to recommend desired and necessary services that are appropriate for users. Traditional recommendation systems often suffer from inefficient data analysis techniques, rate the different services without regard to users' previous preferences, and do not meet users' personal demands. Therefore, in this paper we use a hybrid approach based on a knowledge graph and a machine learning similarity function as a recommendation system. We used real datasets to conduct the experiment. We built the knowledge graph for the visitors, hotels and their ranks, and used the knowledge graph and similarity scores to recommend a hotel or a set of hotels to visitors based on their former preferences and on the ratings of other visitors. The results show significant accuracy and good quality of service for the recommender system, with an f-measure of 93.5%.
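The general idea of combining a visitor-hotel graph with a similarity function can be sketched as follows. This is only an illustration under stated assumptions: the graph is reduced to a dictionary of rating-weighted edges, cosine similarity stands in for the paper's unspecified similarity function, and all visitor names, hotel names and ratings are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two rating dicts; unrated hotels count as 0."""
    dot = sum(u[h] * v[h] for h in set(u) & set(v))
    norm_u = math.sqrt(sum(r * r for r in u.values()))
    norm_v = math.sqrt(sum(r * r for r in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def recommend(target, ratings, top_n=1):
    """Rank hotels the target has not rated by similarity-weighted rating."""
    scores, weights = {}, {}
    for other, rated in ratings.items():
        if other == target:
            continue
        sim = cosine(ratings[target], rated)
        if sim <= 0:
            continue
        for hotel, rating in rated.items():
            if hotel in ratings[target]:
                continue  # only recommend hotels the target has not visited
            scores[hotel] = scores.get(hotel, 0.0) + sim * rating
            weights[hotel] = weights.get(hotel, 0.0) + sim
    ranked = sorted(scores, key=lambda h: scores[h] / weights[h], reverse=True)
    return ranked[:top_n]


# Hypothetical visitor -> {hotel: rating} edges (illustration only).
ratings = {
    "ana": {"Hotel A": 5, "Hotel B": 1},
    "ben": {"Hotel A": 5, "Hotel B": 1, "Hotel C": 5, "Hotel D": 2},
    "cy": {"Hotel A": 1, "Hotel B": 5, "Hotel C": 1, "Hotel D": 5},
}
print(recommend("ana", ratings, top_n=2))
```

Because "ana" rates hotels much like "ben" does, the hotels "ben" liked are weighted more heavily than those preferred by "cy".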


Sentence Similarity Calculation Based on Probabilistic Tolerance Rough Sets

Mathematical Problems in Engineering ◽

10.1155/2021/1635708 ◽

2021 ◽

Vol 2021 ◽

pp. 1-9

Author(s):

Ruiteng Yan ◽

Dong Qiu ◽

Haihuan Jiang

Keyword(s):

Language Processing ◽

Rough Set ◽

Time Complexity ◽

Computation Model ◽

Text Data ◽

Efficient Performance ◽

Sentence Similarity ◽

Similarity Calculation ◽

Tolerance Rough Set ◽

Similarity Computation

Sentence similarity calculation is one of the important foundations of natural language processing. Existing sentence similarity measures are based either on shallow semantics, which capture latent semantic information inadequately, or on deep learning algorithms, which are limited by their need for supervision. In this paper, we improve the traditional tolerance rough set model so that it has lower time complexity and becomes incremental. We then propose a sentence similarity computation model, based on the probabilistic tolerance rough set model, from the perspective of the uncertainty of text data. It is able to mine latent semantic information and is unsupervised. Sentence similarity experiments on the SICK2014 task and the STSbenchmark dataset show significant and efficient performance of our model.
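For orientation, the traditional tolerance rough set model that the paper starts from can be sketched as below: each term gets a tolerance class of terms that co-occur with it at least `theta` times, and a sentence is enriched to its upper approximation before comparison. This is not the probabilistic variant proposed in the paper; the Jaccard comparison, the threshold and the toy corpus are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def tolerance_classes(sentences, theta):
    """Map each term to itself plus terms co-occurring in >= theta sentences."""
    cooccurrence = Counter()
    vocabulary = set()
    for sentence in sentences:
        terms = set(sentence)
        vocabulary |= terms
        for a, b in combinations(sorted(terms), 2):
            cooccurrence[(a, b)] += 1
    classes = {term: {term} for term in vocabulary}
    for (a, b), count in cooccurrence.items():
        if count >= theta:
            classes[a].add(b)
            classes[b].add(a)
    return classes


def upper_approximation(sentence, classes):
    """All terms whose tolerance class overlaps the sentence."""
    terms = set(sentence)
    return {term for term, cls in classes.items() if cls & terms}


def jaccard(a, b):
    """Set overlap used here as a simple similarity score."""
    return len(a & b) / len(a | b) if (a | b) else 0.0


# Hypothetical toy corpus (illustration only).
corpus = [
    ["car", "engine", "road"],
    ["car", "engine", "fuel"],
    ["automobile", "engine", "fuel"],
    ["automobile", "engine", "road"],
    ["cat", "milk"],
]
classes = tolerance_classes(corpus, theta=2)
raw = jaccard({"car"}, {"automobile"})
enriched = jaccard(upper_approximation(["car"], classes),
                   upper_approximation(["automobile"], classes))
print(raw, round(enriched, 3))
```

The two one-word sentences share no surface term, yet their upper approximations overlap through "engine", which is the kind of latent relation the rough set view is meant to expose.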


A Swarm Intelligence Based Weighted Feature Extraction and Classification using SVM for Sentimental Exploration

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.c4077.098319 ◽

2019 ◽

Vol 8 (3) ◽

pp. 883-890

Keyword(s):

Swarm Intelligence ◽

Fitness Function ◽

Performance Measure ◽

Support Vector ◽

Relevant Feature ◽

Data Set ◽

Swarm Optimization ◽

Recall Accuracy ◽

Classification Technique ◽

F Measure

The goal of Sentiment Exploration (SE) is to mine accurate sentiments, which are very beneficial for businesses, governments, and individuals, as opinions, recommendations, ratings, and feedback are becoming an important aspect of present scenarios. The proposed methodology introduces a swarm intelligence based supervised sentiment methodology. To obtain a relevant feature set from a large number of data samples, the method uses particle swarm optimization to find the optimum feature set. The optimum feature set is evaluated using the Minimum Redundancy and Maximum Relevancy measure as the fitness function. The extracted feature set is classified with the Support Vector Machine classification technique. The experimental outcome of the suggested method is evaluated using four performance measures (precision, recall, accuracy, and f-measure) and shows that the proposed swarm intelligence based classification method performs better on the IMDB, Movie Lens and Trip Advisor data samples.


Klasterisasi Jenis Musik Menggunakan Kombinasi Algoritma Neural Network, K-Means dan Particle Swarm Optimization

Jurnal Buana Informatika ◽

10.24002/jbi.v6i3.431 ◽

2015 ◽

Vol 6 (3) ◽

Author(s):

Alhaji Sheku Sankoh ◽

Ahmad Reza Musthafa ◽

Muhammad Imron Rosadi ◽

Agus Zainal Arifin

Keyword(s):

Neural Network ◽

Particle Swarm Optimization ◽

Particle Swarm ◽

Improved Method ◽

Processing Stage ◽

Text Documents ◽

Swarm Optimization ◽

A Value ◽

Audio Files ◽

F Measure

Abstract. A large number of audio files in a directory can leave the files in an unstructured arrangement. This makes it difficult for users to sort a collection of audio files by a particular category, especially by type of music. Previous studies proposed methods for grouping documents on a web page. However, those studies dealt with files containing text documents, not with other files such as audio files. In this study, we develop a file grouping method that combines a pre-processing approach, neural networks, k-means, and particle swarm optimization, taking audio files as input and producing collections of audio files grouped by type of music. The result of this study is a system with an improved method of grouping audio files by type of music. In the pre-processing stage, the best results were obtained with the approach based on spectrum analysis of guitar melody and bass, which achieved 95% precision, 100% recall and an F-measure of 97.44%. Keywords: Cluster, Music, NN, K-Means, PSO


Experimental Analysis of Time Complexity and Solution Quality of Swarm Intelligence Algorithm

10.20944/preprints202007.0517.v1 ◽

2020 ◽

Author(s):

Mohammed Ajuji ◽

Aliyu Abubakar ◽

Datti, Useni Emmanuel

Keyword(s):

Particle Swarm Optimization ◽

Time Complexity ◽

Artificial Bee Colony ◽

Optimization Problems ◽

Particle Swarm ◽

Worst Case ◽

Swarm Optimization ◽

Bee Colony ◽

Nature Inspired Algorithms

Nature-inspired algorithms are very popular tools for solving optimization problems. However, there is no guarantee that an optimal solution can be obtained using a randomly selected algorithm. As such, the problem can be addressed by trial and error with different optimization algorithms. Therefore, the study proposed in this paper analyzes the time complexity and efficacy of some nature-inspired algorithms, including Artificial Bee Colony, Bat Algorithm and Particle Swarm Optimization. For each algorithm, experiments were conducted several times with iterations and a comparative analysis was made. The results show that Artificial Bee Colony outperformed the other algorithms in terms of solution quality, Particle Swarm Optimization was time efficient, and Artificial Bee Colony yielded the worst case in terms of time complexity.
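To make the time-complexity discussion concrete, here is a minimal sketch of a basic global-best Particle Swarm Optimization loop. The inertia and acceleration values, the sphere test function and all names are assumptions chosen for illustration; the study's own experimental settings are not given in the abstract.

```python
import random


def sphere(x):
    """Simple test objective with its minimum of 0 at the origin."""
    return sum(v * v for v in x)


def pso(objective, dim=2, particles=20, iterations=200,
        w=0.7, c1=1.5, c2=1.5, bound=5.0, seed=0):
    """Basic global-best PSO; returns (best_position, best_value)."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-bound, bound) for _ in range(dim)]
           for _ in range(particles)]
    vel = [[0.0] * dim for _ in range(particles)]
    best_pos = [p[:] for p in pos]            # personal best positions
    best_val = [objective(p) for p in pos]    # personal best values
    g = min(range(particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]  # global best so far
    for _ in range(iterations):
        for i in range(particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull towards personal best + pull towards global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (g_pos[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = objective(pos[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], val
                if val < g_val:
                    g_pos, g_val = pos[i][:], val
    return g_pos, g_val


solution, value = pso(sphere)
print(solution, value)
```

In this form the loop performs one objective evaluation per particle per iteration, so its running time grows with iterations × particles × the cost of the objective, which is the kind of quantity a time-complexity comparison between swarm algorithms would measure.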


Low-time complexity and low-cost binary particle swarm optimization algorithm for task scheduling and load balancing in cloud computing

Applied Intelligence ◽

10.1007/s10489-019-01448-x ◽

2019 ◽

Vol 49 (9) ◽

pp. 3308-3330 ◽

Cited By ~ 12

Author(s):

Jean Pepe Buanga Mapetu ◽

Zhen Chen ◽

Lingfu Kong

Keyword(s):

Cloud Computing ◽

Particle Swarm Optimization ◽

Load Balancing ◽

Task Scheduling ◽

Optimization Algorithm ◽

Time Complexity ◽

Particle Swarm Optimization Algorithm ◽

Low Cost ◽

Binary Particle Swarm Optimization ◽

Swarm Optimization


Hierarchical and K-means Clustering in the Line Drawing Data Shape Using Procrustes Analysis

JOIV International Journal on Informatics Visualization ◽

10.30630/joiv.5.3.532 ◽

2021 ◽

Vol 5 (3) ◽

pp. 306

Author(s):

Ridho Ananda ◽

Agi Prasetiadi

Keyword(s):

Time Complexity ◽

Goodness Of Fit ◽

Line Drawing ◽

Jaccard Index ◽

Rand Index ◽

Procrustes Analysis ◽

Main Process ◽

Minimum Value ◽

Shape Clustering ◽

F Measure

One problem in the clustering process is that the objects under inquiry may be multivariate measures containing geometrical information, which requires shape clustering. Because Procrustes analysis is a technique for obtaining the similarity measure of two shapes, it can provide a solution. Therefore, this paper uses Procrustes as the main process in the clustering method. Four algorithms are proposed for shape clustering using Procrustes: hierarchical the goodness-of-fit of Procrustes (HGoFP), k-means the goodness-of-fit of Procrustes (KMGoFP), hierarchical ordinary Procrustes analysis (HOPA), and k-means ordinary Procrustes analysis (KMOPA). The algorithms were evaluated using the Rand index, Jaccard index, F-measure, and Purity. The data used was a line drawing dataset consisting of 180 drawings classified into six clusters. The results showed that the HGoFP, KMGoFP, HOPA and KMOPA algorithms performed well on the Rand index, F-measure, and Purity, with 0.697 as the minimum value. On the Jaccard index, only the HGoFP, KMGoFP, and HOPA algorithms gave good clustering results, with 0.561 as the minimum value. KMGoFP has the worst result on the Jaccard index, about 0.300. In terms of time complexity, the fastest algorithm is HGoFP, with a value of 4.733. Based on these results, the algorithms proposed in this paper deserve to be considered as new algorithms for clustering the objects in the line drawing dataset, and HGoFP is suggested for clustering the objects in the dataset used.
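The shape similarity that Procrustes analysis provides can be sketched for 2-D shapes with matched points as below. This uses a standard textbook form, the full Procrustes distance computed with complex numbers (translation, scale and rotation removed, reflections not considered); whether the paper's HGoFP and HOPA variants use exactly this normalisation is an assumption, and the example shapes are hypothetical.

```python
import cmath
import math


def procrustes_distance(shape_a, shape_b):
    """Full Procrustes distance between two 2-D shapes with matched points."""
    if len(shape_a) != len(shape_b):
        raise ValueError("shapes must have the same number of points")

    def preshape(points):
        # Represent points as complex numbers, then remove location and scale.
        z = [complex(x, y) for x, y in points]
        centroid = sum(z) / len(z)
        z = [p - centroid for p in z]
        norm = math.sqrt(sum(abs(p) ** 2 for p in z))  # zero if all points coincide
        return [p / norm for p in z]

    a, b = preshape(shape_a), preshape(shape_b)
    # The modulus of the complex inner product is invariant to rotation.
    inner = sum(p.conjugate() * q for p, q in zip(a, b))
    return math.sqrt(max(0.0, 1.0 - abs(inner) ** 2))


def transform(points, scale, angle, shift):
    """Scale, rotate and translate a shape (used only to build the example)."""
    rot = cmath.exp(1j * angle)
    moved = [complex(x, y) * scale * rot + shift for x, y in points]
    return [(p.real, p.imag) for p in moved]


square = [(0, 0), (1, 0), (1, 1), (0, 1)]
rectangle = [(0, 0), (3, 0), (3, 1), (0, 1)]
moved_square = transform(square, scale=2.0, angle=0.7, shift=3 - 1j)

print(procrustes_distance(square, moved_square))
print(procrustes_distance(square, rectangle))
```

A pairwise matrix of such distances could then be passed to a hierarchical or k-means style procedure, which is the role Procrustes plays as the main process in the clustering methods described above.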
