Evaluating the compliance of modern electronic banking and digital cryptocurrency systems with the information society's requirements

Subject. The digital economy emerged as a new generation of financial instruments, such as cryptocurrencies, was invented and proliferated; these instruments were able to counteract global challenges. Opponents of legitimizing digital assets and integrating them into the payment infrastructure see no material advantages in them and do not support drastic transformations of the existing financial system. However, even though digital payments are very risky, the use of cryptocurrency continues to grow. The article presents the results of a text-mining analysis of feedback left by users of electronic banking and digital cryptocurrency systems, from which we determined how satisfied they are with the various systems. Objectives. The study aims to provide a theoretical and methodological rationale for, and to test in practice, a model that identifies key themes in unstructured big data and automatically evaluates user satisfaction with various payment systems. Methods. We used formal logic, the systems approach, comparative analysis, text mining, and latent semantic analysis. Results. We analyzed reviews posted on www.banki.ru and www.otzovik.ru through parsing, stop-word elimination, stemming, and probabilistic topic modeling based on latent semantic analysis. We assessed how satisfied users are with the various systems by examining their reviews through sentiment analysis, the k-nearest neighbor algorithm, and automated scoring of unrated reviews. Conclusions and Relevance. Text mining of unstructured big data shows that digital platforms, notwithstanding their infancy and high risks, already satisfy social needs to a greater extent than electronic banking systems, which makes it reasonable to integrate them into the payment system to unlock their potential.
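
The automated scoring of unrated reviews mentioned in the abstract can be illustrated with a minimal sketch. It assumes a bag-of-words representation, cosine similarity, and a plain average over the k nearest rated reviews; the stop-word list, the sample reviews, and the function names are hypothetical, and stemming and the LSA step are omitted for brevity. It is not the authors' implementation.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list; a real study would use a full list for the review language.
STOP_WORDS = {"the", "a", "is", "and", "to", "i", "was", "my", "it", "of"}

def tokenize(text):
    """Lowercase, keep alphabetic tokens, drop stop words (stemming omitted)."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def knn_score(unrated, rated, k=3):
    """Average the ratings of the k rated reviews most similar to the unrated one."""
    query = Counter(tokenize(unrated))
    ranked = sorted(rated, key=lambda pair: cosine(query, Counter(tokenize(pair[0]))), reverse=True)
    top = ranked[:k]
    return sum(score for _, score in top) / len(top)

# Invented sample data: (review text, user rating on a 1-5 scale).
rated_reviews = [
    ("Fast transfer and helpful support", 5),
    ("Transfer was fast, low fees", 4),
    ("Support ignored my complaint, money lost", 1),
    ("Lost money, terrible support", 1),
]
predicted = knn_score("fast transfer with low fees", rated_reviews, k=2)
print(predicted)
```

With k=2 the unrated review inherits the mean rating of its two closest rated neighbors; a weighted vote or a larger k would be equally plausible design choices.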

Dialogue Act Classification, Instance-Based Learning, and Higher Order Dialogue Structure

Dialogue & Discourse ◽ 10.5087/dad.2010.002 ◽ 2010 ◽ Vol 1 (2) ◽ pp. 1-24 ◽ Cited By ~ 7

Author(s): Barbara Di Eugenio ◽ Zhuli Xie ◽ Riccardo Serafin

Keyword(s): Latent Semantic Analysis ◽ Semantic Analysis ◽ Nearest Neighbor ◽ K Nearest Neighbor ◽ Linguistic Features ◽ Instance Based Learning ◽ Dialogue Structure ◽ K Nearest Neighbor Algorithm ◽ Semantic Spaces ◽ Better Than

In this paper, we explore instance-based learning methods for dialogue act classification on two corpora, MapTask and CallHome Spanish. We start with Latent Semantic Analysis (LSA), and extend it as Feature Latent Semantic Analysis (FLSA). FLSA adds richer linguistic features to LSA, which only uses words. In particular, we explore the extended dialogue context, both linearly (the previous dialogue act) and hierarchically (conversational games). We show how the k-Nearest Neighbor algorithm obtains its best results when applied to the reduced semantic spaces generated by FLSA. Empirically, our results are better than previously published results on these two corpora; linguistically, we confirm and extend previous observations that the hierarchical dialogue structure encoded via the notion of Game is of primary importance for dialogue act recognition.
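
As a rough illustration of the idea, the sketch below builds a feature-by-utterance matrix in which each utterance contributes its words plus one pseudo-word for the previous dialogue act, reduces it with a truncated SVD, and classifies a new utterance with kNN in the reduced space. The toy utterances, the `PREV=` feature encoding, the rank, and the use of NumPy are all assumptions made for this example; the paper's actual feature set and corpora are not reproduced here.

```python
import numpy as np
from collections import Counter

# Invented toy data: (tokens, previous dialogue act, dialogue act label).
train = [
    (["go", "left", "at", "the", "mill"], "start", "instruct"),
    (["turn", "left", "then", "go", "down"], "acknowledge", "instruct"),
    (["okay", "right"], "instruct", "acknowledge"),
    (["okay", "got", "it"], "instruct", "acknowledge"),
]

def features(tokens, prev_act):
    """Words plus one pseudo-word encoding the previous dialogue act (assumed encoding)."""
    return tokens + ["PREV=" + prev_act]

vocab = sorted({f for toks, prev, _ in train for f in features(toks, prev)})
index = {f: i for i, f in enumerate(vocab)}

def vectorize(tokens, prev_act):
    vec = np.zeros(len(vocab))
    for f, c in Counter(features(tokens, prev_act)).items():
        if f in index:  # features unseen in training are ignored
            vec[index[f]] = c
    return vec

# Feature-by-utterance matrix and its truncated SVD (the LSA step).
A = np.column_stack([vectorize(t, p) for t, p, _ in train])
U, _, _ = np.linalg.svd(A, full_matrices=False)
rank = 2
U_k = U[:, :rank]

def project(vec):
    """Fold a feature vector into the rank-k semantic space."""
    return U_k.T @ vec

train_points = [project(A[:, j]) for j in range(A.shape[1])]
labels = [label for _, _, label in train]

def cos(a, b):
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0

def classify(tokens, prev_act, k=1):
    """kNN by cosine similarity in the reduced space, majority vote over k neighbors."""
    q = project(vectorize(tokens, prev_act))
    ranked = sorted(range(len(train_points)), key=lambda j: cos(q, train_points[j]), reverse=True)
    votes = Counter(labels[j] for j in ranked[:k])
    return votes.most_common(1)[0][0]

prediction = classify(["okay"], "instruct")
print(prediction)
```

Dropping the `PREV=` pseudo-word from `features` gives a words-only LSA baseline, which makes it easy to compare the two feature sets on the same data.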

A hybrid machine learning approach of fuzzy-rough-k-nearest neighbor, latent semantic analysis, and ranker search for efficient disease diagnosis

Journal of Intelligent & Fuzzy Systems ◽ 10.3233/jifs-211820 ◽ 2021 ◽ pp. 1-16

Author(s): Sunil Kumar Jha ◽ Ninoslav Marina ◽ Jinwei Wang ◽ Zulfiqar Ahmad

Keyword(s): Machine Learning ◽ Latent Semantic Analysis ◽ Semantic Analysis ◽ Nearest Neighbor ◽ Disease Diagnosis ◽ Learning Approach ◽ Learning Approaches ◽ K Nearest Neighbor ◽ Machine Learning Approach ◽ Hybrid Machine

Machine learning approaches contribute substantially to improving the competency of automated decision systems. Past studies have developed several machine learning approaches for predicting the diagnosis of individual diseases. The present study aims to develop a hybrid machine learning approach for predicting the diagnosis of multiple diseases, based on a combination of efficient feature generation, selection, and classification methods. Specifically, a combination of latent semantic analysis, ranker search, and fuzzy-rough-k-nearest neighbor is proposed and validated on diagnosis prediction for primary tumor, post-operative, breast cancer, lymphography, audiology, fertility, immunotherapy, and COVID-19, among others. The performance of the proposed approach is compared with single and other hybrid machine learning approaches in terms of accuracy, analysis time, precision, recall, F-measure, the area under the ROC curve, and the Kappa coefficient. The proposed hybrid approach performs better than single and other hybrid approaches in the diagnosis prediction of each of the selected diseases. Specifically, it achieved a maximum recognition accuracy of 99.12% for primary tumor, 96.45% for breast cancer Wisconsin, 94.44% for cryotherapy, and 93.81% for audiology, along with significant improvements in classification accuracy and other evaluation metrics for the rest of the selected diseases. It also handles missing values in the dataset effectively.
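
Several of the evaluation metrics listed above can be computed directly from a confusion matrix. The sketch below assumes a binary task with labels 0 and 1 and uses invented predictions; the function name and data are hypothetical and unrelated to the study's datasets.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall, F-measure and Cohen's kappa for 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    n = len(y_true)
    accuracy = (tp + tn) / n
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (2 * precision * recall / (precision + recall)) if precision + recall else 0.0
    # Agreement expected by chance, from the marginal totals of each class.
    p_expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    kappa = (accuracy - p_expected) / (1 - p_expected) if p_expected != 1 else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall,
            "f_measure": f_measure, "kappa": kappa}

# Invented example: 4 positive and 6 negative cases.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Kappa is lower than accuracy here because part of the observed agreement would be expected by chance given the class proportions.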

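Klasifikasi Topik Multi Label pada Hadis Shahih Bukhari Menggunakan K-Nearest Neighbor dan Latent Semantic Analysis [Multi-Label Topic Classification of Sahih Bukhari Hadith Using K-Nearest Neighbor and Latent Semantic Analysis]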

JURIKOM (Jurnal Riset Komputer) ◽ 10.30865/jurikom.v7i1.2013 ◽ 2020 ◽ Vol 7 (1) ◽ pp. 140

Author(s): Dian Chusnul Hidayati ◽ Said Al Faraby ◽ Adiwijaya Adiwijaya

Keyword(s): Latent Semantic Analysis ◽ Semantic Analysis ◽ Nearest Neighbor ◽ Islamic Law ◽ Computation Time ◽ K Nearest Neighbor ◽ Space Model ◽ Binary Relevance ◽ Long Time ◽ Vector Dimension

Hadith is the second source of Islamic law after the Al-Quran, which makes it important to study. However, learning hadith presents some difficulties, such as determining whether a hadith belongs to the topic of suggestions, prohibitions, or information. This obstructs the hadith learning process, especially for Muslims. It is therefore necessary to classify hadiths into the topics of suggestions, prohibitions, information, and combinations of the three, which is known as multi-label topic classification. The classification can be done with K-Nearest Neighbor (KNN), one of the best methods in the Vector Space Model and the simplest yet quite effective. However, KNN copes poorly with high vector dimensions, which results in long classification times. For that reason, this research classifies Sahih Bukhari's hadiths into the topics of suggestions, prohibitions, and information using the Latent Semantic Analysis - K-Nearest Neighbor (LSA-KNN) method. The Binary Relevance method is also employed to process the multi-label data. The results show that LSA-KNN achieves a performance of 90.28% with a computation time of 19 minutes 21 seconds, while KNN achieves 90.23% with a computation time of 37 minutes 06 seconds, which means that LSA-KNN performs better than KNN.
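
Binary Relevance turns a multi-label problem into one independent yes/no classifier per label. The sketch below pairs it with a plain cosine-similarity kNN over word counts; it leaves out the LSA reduction, and the label names in code, the sample texts (which are invented and are not hadith), and the function names are assumptions for illustration only.

```python
import math
from collections import Counter

LABELS = ("suggestion", "prohibition", "information")

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def knn_binary(query, examples, label, k):
    """One binary kNN: does a majority of the k nearest examples carry `label`?"""
    nearest = sorted(examples, key=lambda ex: cosine(query, ex[0]), reverse=True)[:k]
    votes = sum(1 for _, labels in nearest if label in labels)
    return votes * 2 > len(nearest)

def binary_relevance_predict(text, examples, k=3):
    """Run one independent binary kNN per label and collect the positive labels."""
    query = Counter(text.lower().split())
    return {label for label in LABELS if knn_binary(query, examples, label, k)}

# Invented training texts with their label sets.
train = [
    (Counter("please give charity often".split()), {"suggestion"}),
    (Counter("give charity and help neighbors".split()), {"suggestion"}),
    (Counter("do not cheat in trade".split()), {"prohibition"}),
    (Counter("do not lie in trade".split()), {"prohibition"}),
    (Counter("give charity and do not cheat".split()), {"suggestion", "prohibition"}),
    (Counter("the market opened at dawn".split()), {"information"}),
]
predicted = binary_relevance_predict("give charity do not cheat", train, k=3)
print(sorted(predicted))
```

Because each label is decided separately, a text can receive zero, one, or several labels, which is what makes the combined topics possible.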

An Enhanced Performance of K-Nearest Neighbor (K-NN) Classifier to Meet New Big Data Necessities

IOP Conference Series Materials Science and Engineering ◽ 10.1088/1757-899x/928/3/032013 ◽ 2020 ◽ Vol 928 ◽ pp. 032013

Author(s): Ihab L.Hussein Alsammak ◽ Humam M. Abdul Sahib ◽ Wasan H.Itwee

Keyword(s): Big Data ◽ Nearest Neighbor ◽ K Nearest Neighbor

Precision Pig Farming Image Analysis Using Random Forest and Boruta Predictive Big Data Analysis Using Neural Network and K- Nearest Neighbor

2021 2nd International Conference on Intelligent Engineering and Management (ICIEM) ◽ 10.1109/iciem51511.2021.9445328 ◽ 2021

Author(s): S. A. Shaik Mazhar ◽ G. Suseendran

Keyword(s): Neural Network ◽ Image Analysis ◽ Big Data ◽ Data Analysis ◽ Random Forest ◽ Nearest Neighbor ◽ Big Data Analysis ◽ K Nearest Neighbor ◽ Pig Farming

Assessment of Latent Semantic Analysis (LSA) Text Mining Algorithms for Large Scale Mapping of Patent and Scientific Publication Documents

SSRN Electronic Journal ◽ 10.2139/ssrn.2096159 ◽ 2011 ◽ Cited By ~ 2

Author(s): Bart Van Looy ◽ Bart Baesens ◽ Tom Magerman ◽ Koenraad Debackere

Keyword(s): Text Mining ◽ Latent Semantic Analysis ◽ Large Scale ◽ Semantic Analysis ◽ Scientific Publication ◽ Mining Algorithms

Parallel kNN Queries for Big Data Based on Voronoi Diagram Using MapReduce

Advances in Data Mining and Database Management - Handbook of Research on Innovative Database Query Processing Techniques ◽ 10.4018/978-1-4666-8767-7.ch014 ◽ 2015 ◽ pp. 392-414

Author(s): Wei Yan

Keyword(s): Big Data ◽ Voronoi Diagram ◽ Spatial Databases ◽ Nearest Neighbor ◽ Programming Model ◽ Dimensional Space ◽ Data Sets ◽ Two Dimensional ◽ K Nearest Neighbor ◽ K Nearest Neighbors

In cloud computing environments, parallel kNN queries for big data are an important issue. The k nearest neighbor query (kNN query), designed to find the k nearest neighbors from a dataset S for every object in another dataset R, is a primitive operator widely adopted by many applications, including knowledge discovery, data mining, and spatial databases. This chapter proposes a parallel method for kNN queries on big data using the MapReduce programming model. First, it proposes an approximate algorithm based on mapping multi-dimensional data sets into two-dimensional data sets and transforming kNN queries into a sequence of two-dimensional point searches. Then, in two-dimensional space, it proposes a partitioning method using the Voronoi diagram, which incorporates the Voronoi diagram into an R-tree. Furthermore, it proposes an efficient algorithm for processing kNN queries based on the R-tree using the MapReduce programming model. Finally, the chapter presents the results of extensive experimental evaluations, which indicate the efficiency of the proposed approach.
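
The map-then-merge shape of such a query can be sketched in a few lines. The example below is an exact, single-process baseline that splits S into arbitrary chunks, finds local candidates per chunk (the "map" step), and merges them (the "reduce" step). It does not implement the chapter's Voronoi partitioning, R-tree index, dimensionality mapping, or an actual MapReduce runtime; the chunking and function names are assumptions.

```python
import heapq
import math

def local_knn(r, partition, k):
    """Map step: the k closest points to r within one partition, as (distance, point)."""
    return heapq.nsmallest(k, ((math.dist(r, s), s) for s in partition))

def merge_knn(candidates, k):
    """Reduce step: keep the k globally closest points among all local candidates."""
    return [s for _, s in heapq.nsmallest(k, candidates)]

def knn_join(R, partitions, k):
    """For every r in R, find its k nearest neighbors across all partitions of S."""
    result = {}
    for r in R:
        candidates = [c for part in partitions for c in local_knn(r, part, k)]
        result[r] = merge_knn(candidates, k)
    return result

R = [(0.0, 0.0), (5.0, 5.0)]
S = [(1.0, 0.0), (0.0, 2.0), (4.0, 4.0), (6.0, 5.0), (9.0, 9.0)]
partitions = [S[:3], S[3:]]  # arbitrary chunks standing in for a real partitioning scheme
result = knn_join(R, partitions, k=2)
print(result)
```

Each partition only needs to report k candidates per query point, so the merge step handles at most k times the number of partitions; a spatial partitioning such as the one the chapter describes would additionally let most partitions be skipped.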

Twitter text mining for sentiment analysis on government’s response to forest fires with vader lexicon polarity detection and k-nearest neighbor algorithm

Journal of Physics Conference Series ◽ 10.1088/1742-6596/1567/3/032024 ◽ 2020 ◽ Vol 1567 ◽ pp. 032024

Author(s): T Mustaqim ◽ K Umam ◽ M A Muslim

Keyword(s): Text Mining ◽ Sentiment Analysis ◽ Forest Fires ◽ Nearest Neighbor ◽ K Nearest Neighbor ◽ Nearest Neighbor Algorithm ◽ K Nearest Neighbor Algorithm

Centroid Based Classifier With TF – IDF – ICF for Classfication of Student’s Complaint at Appliation E-Complaint in Muhammadiyah University of Sidoarjo

JEEE-U (Journal of Electrical and Electronic Engineering-UMSIDA) ◽ 10.21070/jeee-u.v1i1.23 ◽ 2016 ◽ Vol 1 (1) ◽ pp. 17 ◽ Cited By ~ 1

Author(s): Mochamad Alfan Rosid ◽ Gunawan Gunawan ◽ Edwin Pramana

Keyword(s): Text Mining ◽ Decision Tree ◽ Nearest Neighbor ◽ Naive Bayes ◽ Naïve Bayes ◽ K Nearest Neighbor ◽ Base Classifier

Text mining refers to the process of extracting high-quality information from text. High-quality information is usually obtained by forecasting patterns and trends through means such as statistical pattern learning. One important activity in text mining is text classification, or categorization. Text categorization currently has various methods, including K-Nearest Neighbor, Naïve Bayes, the Centroid Based Classifier, and decision tree classification. In this study, student complaints are classified using the centroid based classifier with TF-IDF-ICF features. Five stages are carried out to obtain the classification results. The complaint data collection stage is followed by preprocessing, which prepares the unstructured data for further processing; next, the data are split into training data and test data; then comes the training stage, which produces the classification model; and the final stage is testing, in which the classification model built during training is evaluated on the test data. The complaints used for testing are taken from the database of the e-complaint application of Universitas Muhammadiyah Sidoarjo. The test results show that complaint classification with the centroid based classifier algorithm and TF-IDF-ICF features has a fairly high average accuracy of 79.5%. Accuracy increases as the training data grow, while system efficiency decreases as the training data grow.
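
A centroid based classifier with TF-IDF-ICF weighting can be sketched as follows. The abstract does not give the weighting formula, so this example assumes weight = tf × (1 + ln(N/df)) × (1 + ln(C/cf)), where N is the number of documents, df the document frequency, C the number of classes, and cf the number of classes containing the term. The complaint texts, class names, and function names are invented for illustration.

```python
import math
from collections import Counter, defaultdict

def fit(docs):
    """docs: list of (tokens, class). Returns per-term idf*icf weights and class centroids."""
    n_docs = len(docs)
    classes = sorted({c for _, c in docs})
    df = Counter(t for toks, _ in docs for t in set(toks))
    class_terms = defaultdict(set)
    for toks, c in docs:
        class_terms[c].update(toks)
    cf = Counter(t for c in classes for t in class_terms[c])
    # Assumed smoothing: 1 + ln(ratio) for both the idf and the icf factor.
    weight = {t: (1 + math.log(n_docs / df[t])) * (1 + math.log(len(classes) / cf[t]))
              for t in df}
    centroids = {}
    for c in classes:
        members = [toks for toks, cls in docs if cls == c]
        total = Counter()
        for toks in members:
            for t, tf in Counter(toks).items():
                total[t] += tf * weight[t]
        centroids[c] = {t: v / len(members) for t, v in total.items()}
    return weight, centroids

def cosine(a, b):
    dot = sum(v * b.get(t, 0.0) for t, v in a.items())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def predict(tokens, weight, centroids):
    """Assign the class whose centroid is most similar to the weighted document vector."""
    vec = {t: tf * weight[t] for t, tf in Counter(tokens).items() if t in weight}
    return max(centroids, key=lambda c: cosine(vec, centroids[c]))

# Invented training complaints: (tokens, class).
docs = [
    (["wifi", "slow", "library"], "facility"),
    (["classroom", "projector", "broken"], "facility"),
    (["grade", "not", "posted"], "academic"),
    (["lecturer", "grade", "late"], "academic"),
]
weight, centroids = fit(docs)
label = predict(["projector", "slow"], weight, centroids)
print(label)
```

Training reduces each class to a single averaged vector, so prediction costs one similarity computation per class rather than one per training document as in kNN.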

Text Mining dan Klasterisasi Sentimen Pada Ulasan Produk Toko Online [Text Mining and Sentiment Clustering on Online Store Product Reviews]

Jurnal Teknologi dan Ilmu Komputer Prima (JUTIKOMP) ◽ 10.34012/jutikomp.v2i1.456 ◽ 2019 ◽ Vol 2 (1) ◽ pp. 41-48

Author(s): Rimbun Siringoringo ◽ Jamaludin Jamaludin

Keyword(s): Text Mining ◽ Nearest Neighbor ◽ K Nearest Neighbor ◽ F Measure

The growth of social media and e-commerce has changed the way people interact and convey views, opinions, and moods. Product reviews are one form of conveying consumer opinions and sentiment about a product online. Product reviews currently play a very important role in influencing consumer interest in a product. Sentiment analysis is a widely used approach for extracting information and mining opinions related to product reviews. Sentiment analysis faces several challenges: first, the sentiment produced by prediction models very often differs from the actual sentiment; second, the way consumers express sentiment and mood always differs from one situation to the next. In this study, sentiment analysis is performed on reviews of Trendy Shoes products of the Denim brand. The sentiment analysis stages consist of data collection, preprocessing, data transformation, feature selection, and classification using a Support Vector Machine. Preprocessing applies the text mining steps of case folding, non-alphanumeric removal, stop-word removal, and stemming. The sentiment analysis results are measured using Accuracy, G-Mean, and F-Measure. Tests on three types of sentiment data show that the Support Vector Machine classifies sentiment well. The performance of the Support Vector Machine is compared with that of the K-Nearest Neighbor method. The sentiment classification results of the Support Vector Machine are superior to those of K-Nearest Neighbor.
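
The preprocessing steps and the G-Mean metric named above can be sketched briefly. The example assumes a tiny hypothetical Indonesian stop-word list and an invented review, skips stemming (which would need a language-specific stemmer), and uses the common binary definition of G-Mean as the geometric mean of sensitivity and specificity.

```python
import math
import re

# Hypothetical stop-word list; a real pipeline would use a full Indonesian list.
STOP_WORDS = {"yang", "dan", "di", "ini", "sangat"}

def preprocess(text):
    """Case folding, non-alphanumeric removal, then stop-word removal (no stemming)."""
    folded = text.lower()
    cleaned = re.sub(r"[^a-z0-9\s]", " ", folded)
    return [tok for tok in cleaned.split() if tok not in STOP_WORDS]

def g_mean(tp, fn, tn, fp):
    """Geometric mean of sensitivity (recall on positives) and specificity."""
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return math.sqrt(sensitivity * specificity)

tokens = preprocess("Sepatu ini SANGAT nyaman, dan murah!!!")
score = g_mean(tp=8, fn=2, tn=9, fp=1)
print(tokens, round(score, 4))
```

G-Mean drops toward zero when either class is classified poorly, which makes it more informative than accuracy on imbalanced sentiment data.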
