Topic Detection Using Multiple Semantic Spider Hunting Algorithm

2021
Author(s):
E. Elakiya
R. Kanagaraj
N. Rajkumar

At every moment, a huge volume of data and information is communicated through social networks. Analyzing large amounts of text data is tedious, time-consuming, and expensive, and manual sorting leads to mistakes and inconsistency. Automated document processing is still not capable of extracting information as a human reader does. Furthermore, the significance of the content in a text may differ from one reader to another. The proposed Multiple Spider Hunting Algorithm reduces time complexity by moving multiple spiders rather than a single spider. The construction of spiders is dynamic and depends on the volume of the corpus. In some cases, tokens may relate to more than one topic, so topics need to be detected semantically. The Multiple Semantic Spider Hunting Algorithm is therefore proposed; it is based on the semantics among terms, and associations between words are drawn using semantic lexicons. Topics or lists of opinions are generated from the knowledge graph. News articles were gathered from five different topics: sports, business, education, tourism, and media. The usefulness of the proposed algorithms was evaluated using precision, recall, F-measure, accuracy, true positives, false positives, and topic detection percentage. The Multiple Semantic Spider Hunting Algorithm produced good results. The topic detection percentage of the Spider Hunting Algorithm was compared with that of other algorithms: Naïve Bayes, Neural Network, Decision Tree, and Particle Swarm Optimization. The Spider Hunting Algorithm achieved more than 90% precise detection of topics and subtopics.

2021
Vol 2083 (4)
pp. 042044
Author(s):
Zuhua Dai
Yuanyuan Liu
Shilong Di
Qi Fan

Abstract. Aspect-level sentiment analysis is a form of fine-grained sentiment analysis that has attracted extensive research in academic circles in recent years. For this task, the recurrent neural network (RNN) model is usually used for feature extraction, but the model cannot effectively obtain the structural information of the text. Recent studies have begun to use the graph convolutional network (GCN) to model the syntactic dependency tree of the text to solve this problem. For short text data, the text information is not enough to accurately determine the emotional polarity of the aspect words, and the knowledge graph is not effectively used as external knowledge that can enrich the semantic information. To solve these problems, this paper proposes a graph convolutional network (GCN) model that can process syntactic information, knowledge graphs, and text semantic information. The model works on the “syntax-knowledge” graph to extract syntactic information and common-sense information at the same time. Compared with the latest model, the model in this paper effectively improves the accuracy of aspect-level sentiment classification on two datasets.


Author(s):  
Ajay Kumar Gupta

This chapter presents an overview of spam email as a serious problem in the internet world and creates a spam filter that reduces previous weaknesses and provides better identification accuracy with less complexity. Because the J48 decision tree is a widely used classification technique, owing to its simple structure, higher classification accuracy, and lower time complexity, it is used here as the spam mail classifier. With lower complexity, however, it becomes difficult to achieve higher accuracy when the number of records is large. To overcome this problem, particle swarm optimization is used to optimize the spam base dataset, thus optimizing the decision tree model as well as reducing the time complexity. Once the records have been standardized, the decision tree is used again to check the accuracy of the classification. The chapter presents a study on various spam-related issues, the various filters used, related work, and the potential scope of spam filtering.


Author(s):  
Dicki Pajri
Yuyun Umaidah
Tesa Nur Padilah

Tokopedia is a popular e-commerce marketplace in Indonesia. Customers’ perceptions of Tokopedia expressed on Twitter can be an important source of information and can be processed into useful insights. Sentiment analysis using K-Nearest Neighbor based on Particle Swarm Optimization is one way to process these perceptions. The purpose of this study is to classify customers’ perceptions into positive, neutral, and negative classes. The test was carried out with four different scenarios and k values, which were evaluated using a confusion matrix. The evaluation showed that a 90:10 split of the dataset with k = 1 gave the best result, 88.11%. Feature selection was then performed using Particle Swarm Optimization with 20 iterations and 10 particles. This produced the best evaluation results: 97.9% accuracy, 96.17% precision, 96.62% recall, and 96.39% F-measure.
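As a quick consistency check on the figures above, the F-measure is the harmonic mean of precision and recall. The sketch below is illustrative only and is not taken from the study; the function name `f_measure` and its `beta` parameter are assumptions introduced here. Applied to the reported 96.17% precision and 96.62% recall, it agrees with the reported 96.39% F-measure to two decimal places.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall (F1 when beta == 1).

    Hypothetical helper for illustration; inputs are fractions in [0, 1].
    """
    if precision < 0 or recall < 0:
        raise ValueError("precision and recall must be non-negative")
    denom = beta ** 2 * precision + recall
    if denom == 0:
        # Both precision and recall are zero: define the score as 0.
        return 0.0
    return (1 + beta ** 2) * precision * recall / denom


# Precision and recall as reported in the abstract, expressed as fractions.
f1 = f_measure(0.9617, 0.9662)
print(f"F-measure: {f1 * 100:.2f}%")
```

With `beta` left at 1, precision and recall are weighted equally, which is the usual reading of "F-measure" when no weighting is stated.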


2021
pp. 63-71
Author(s):
Yousef Abuzir
Mohamed Dwieb

With the rapid growth of information technology, online services, and social media, recommendation systems have become an important need for both customers and businesses. The main aim of traditional and online recommendation systems is to recommend the desired and necessary services that are appropriate for users. Traditional recommendation systems often suffer from inefficient data analysis techniques, rate services without regard to users' previous preferences, and do not meet users' personal demands. Therefore, in this paper we used a hybrid approach based on a knowledge graph and a machine learning similarity function as a recommendation system. We used real datasets to conduct the experiment. We built the knowledge graph for the visitors, hotels, and their ranks, and we used the knowledge graph and similarity scores to recommend a hotel or a set of hotels to visitors based on former preferences and ratings of other visitors. The results show significant accuracy and good quality of service for the recommender system, with an F-measure of 93.5%.


2021
Vol 2021
pp. 1-9
Author(s):
Ruiteng Yan
Dong Qiu
Haihuan Jiang

Sentence similarity calculation is one of the important foundations of natural language processing. Existing sentence similarity measures are based either on shallow semantics, which capture latent semantic information inadequately, or on deep learning algorithms, which are limited by the need for supervision. In this paper, we improve the traditional tolerance rough set model; the improved model has lower time complexity and is incremental, unlike the traditional one. We then propose a sentence similarity computation model, based on the probabilistic tolerance rough set model, that approaches the problem from the perspective of the uncertainty of text data. It can mine latent semantic information and is unsupervised. Experiments on the SICK2014 task and the STSbenchmark dataset demonstrate the significant and efficient performance of our model in calculating sentence similarity.


The goal of Sentiment Exploration (SE) is to mine accurate sentiments, which are very beneficial for businesses, governments, and individuals; opinions, recommendations, ratings, and feedback are becoming increasingly important at present. The proposed methodology introduces a swarm-intelligence-based supervised approach to sentiment classification. To obtain a relevant feature set from a large number of data samples, the method uses particle swarm optimization to find the optimal feature set. The feature set is evaluated using the Minimum Redundancy and Maximum Relevancy measure as the fitness function. The extracted feature set is then classified with the Support Vector Machine technique. The suggested method was evaluated using four performance measures (precision, recall, accuracy, and F-measure), and the results showed that the proposed swarm-intelligence-based classification method has better performance on the IMDB, Movie Lens, and Trip Advisor data samples.


2015
Vol 6 (3)
Author(s):
Alhaji Sheku Sankoh
Ahmad Reza Musthafa
Muhammad Imron Rosadi
Agus Zainal Arifin

Abstract. A large number of audio files in a directory can leave the files arranged without structure. This makes it difficult for users to sort a collection of audio files by a particular category, especially by type of music. Previous studies grouped documents on a web page; however, those studies were carried out on files containing text documents rather than on other files such as audio files. In this study, we develop a file-grouping method that combines pre-processing, neural networks, k-means, and particle swarm optimization, taking audio files as input and producing collections of audio files grouped by type of music. The result of this study is a system with an improved method of grouping audio files by type of music. In the pre-processing stage, the best results were obtained with the approach based on spectrum analysis of guitar melody and bass, which gave a precision of 95%, a recall of 100%, and an F-measure of 97.44%.
Keywords: Cluster, Music, NN, K-Means, PSO


Author(s):  
Mohammed Ajuji
Aliyu Abubakar
Datti, Useni Emmanuel

Nature-inspired algorithms are very popular tools for solving optimization problems. However, there is no guarantee that an optimal solution can be obtained using a randomly selected algorithm. As such, the problem can be addressed by trial and error with different optimization algorithms. The study proposed in this paper therefore analyzes the time complexity and efficacy of several nature-inspired algorithms: Artificial Bee Colony, Bat Algorithm, and Particle Swarm Optimization. For each algorithm, experiments were run several times over a number of iterations and a comparative analysis was made. The results show that Artificial Bee Colony outperformed the other algorithms in terms of solution quality, that Particle Swarm Optimization was the most time-efficient, and that Artificial Bee Colony was the worst in terms of time complexity.
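For readers unfamiliar with one of the algorithms being compared, the following is a minimal sketch of a textbook global-best Particle Swarm Optimization loop. It is not the implementation used in the study: the sphere test function, the parameter values (`w`, `c1`, `c2`, swarm size, iteration count, bounds), and all names are assumptions chosen for illustration.

```python
import random


def sphere(x):
    """Illustrative test objective: sum of squares, minimized at the origin."""
    return sum(v * v for v in x)


def pso(objective, dim=2, n_particles=20, n_iters=200, bounds=(-5.0, 5.0),
        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Global-best PSO sketch; every parameter value here is an assumption."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                   # best position seen per particle
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # best position seen by the swarm

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Move the particle and clamp it to the search bounds.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


best_x, best_val = pso(sphere)
print(best_x, best_val)
```

Each iteration costs one objective evaluation per particle, so the running time of this sketch grows with the product of swarm size, iteration count, and the cost of the objective.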


2021
Vol 5 (3)
pp. 306
Author(s):
Ridho Ananda
Agi Prasetiadi

One of the problems in the clustering process is that the objects under inquiry may be multivariate measures containing geometrical information, which requires shape clustering. Because Procrustes is a technique for obtaining a similarity measure between two shapes, it can provide a solution. This paper therefore uses Procrustes as the main process in the clustering method. Four algorithms are proposed for shape clustering using Procrustes: hierarchical goodness-of-fit of Procrustes (HGoFP), k-means goodness-of-fit of Procrustes (KMGoFP), hierarchical ordinary Procrustes analysis (HOPA), and k-means ordinary Procrustes analysis (KMOPA). The algorithms were evaluated using the Rand index, Jaccard index, F-measure, and Purity. The data used was a line drawing dataset consisting of 180 drawings classified into six clusters. The results showed that the HGoFP, KMGoFP, HOPA, and KMOPA algorithms performed well on the Rand index, F-measure, and Purity, with a minimum value of 0.697. On the Jaccard index, however, only the HGoFP, KMGoFP, and HOPA algorithms are reported as giving good clustering results, with a minimum value of 0.561, while KMGoFP is reported as having the worst Jaccard result, about 0.300. In terms of time complexity, the fastest algorithm is HGoFP, with a value of 4.733. Based on these results, the algorithms proposed in this paper deserve consideration as new algorithms for clustering the objects in the line drawing dataset, and HGoFP is the suggested algorithm for clustering the objects in the dataset used.
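Two of the evaluation measures named above, the Rand index and the Jaccard index, can both be computed by counting pairs of objects. The sketch below uses the standard pair-counting definitions and is not taken from the paper; the function names and the toy labels are assumptions made for illustration.

```python
from itertools import combinations


def pair_counts(true_labels, pred_labels):
    """Count object pairs by agreement between two labelings.

    Returns (ss, sd, ds, dd): same/same, same-true/different-pred,
    different-true/same-pred, different/different.
    """
    if len(true_labels) != len(pred_labels):
        raise ValueError("labelings must have the same length")
    ss = sd = ds = dd = 0
    for i, j in combinations(range(len(true_labels)), 2):
        same_true = true_labels[i] == true_labels[j]
        same_pred = pred_labels[i] == pred_labels[j]
        if same_true and same_pred:
            ss += 1
        elif same_true:
            sd += 1
        elif same_pred:
            ds += 1
        else:
            dd += 1
    return ss, sd, ds, dd


def rand_index(true_labels, pred_labels):
    """Fraction of pairs on which the two labelings agree."""
    ss, sd, ds, dd = pair_counts(true_labels, pred_labels)
    total = ss + sd + ds + dd
    return (ss + dd) / total if total else 1.0


def jaccard_index(true_labels, pred_labels):
    """Like the Rand index, but ignoring pairs separated in both labelings."""
    ss, sd, ds, _ = pair_counts(true_labels, pred_labels)
    denom = ss + sd + ds
    return ss / denom if denom else 1.0


# Toy example (hypothetical labels, not data from the paper).
truth = [0, 0, 0, 1, 1, 1]
found = [0, 0, 1, 1, 1, 1]
print(rand_index(truth, found), jaccard_index(truth, found))
```

Because the Jaccard index discards the pairs that both labelings keep apart, it is never higher than the Rand index for the same clustering, which is in keeping with the lower Jaccard values reported above.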

