Journal of Computing and Information Technology

In this paper, we first introduce a new graph-theoretic problem with numerous applications, called the diameter cycle problem. A longest cycle in a graph G = (V, E) is referred to as a diameter cycle of G iff the distance in G of every vertex on the cycle to the rest of the on-cycle vertices is maximal. We then present two algorithms for finding a diameter cycle of a biconnected graph. The first algorithm is an abstract, intuitive algorithm that uses a brute-force mechanism for expanding an initial cycle by repeatedly replacing paths on the cycle with longer paths. The second algorithm is a concrete algorithm that uses fundamental cycles in the expansion process and has time and space complexity of O(n^6) and O(n^2), respectively. To the best of our knowledge, this problem has been neither defined nor addressed in the literature. The diameter cycle problem distinguishes itself from other cycle finding problems by identifying cycles that are maximally long while maximizing the distances between vertices in the cycle. Existing cycle finding algorithms, such as fundamental and longest cycle algorithms, do not discover cycles where the distances between vertices are maximized while also maximizing the length of the cycle.
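The abstract says the second algorithm uses fundamental cycles in the expansion process but does not give details. As a rough illustration only, the sketch below shows one common way to obtain a set of fundamental cycles: build a BFS spanning tree and close one cycle per non-tree edge. The function name `fundamental_cycles` and the adjacency-dictionary input are assumptions made here, and the code does not reproduce the authors' expansion procedure.

```python
from collections import deque


def fundamental_cycles(adj):
    """Return one cycle (as a vertex list) per non-tree edge of a BFS tree.

    `adj` maps each vertex to an iterable of its neighbours. The graph is
    assumed to be simple, undirected, and connected (hypothetical input format).
    """
    root = next(iter(adj))
    parent = {root: None}
    depth = {root: 0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in parent:
                parent[v] = u
                depth[v] = depth[u] + 1
                queue.append(v)

    tree_edges = {frozenset((v, p)) for v, p in parent.items() if p is not None}
    seen = set()
    cycles = []
    for u in adj:
        for v in adj[u]:
            edge = frozenset((u, v))
            if edge in tree_edges or edge in seen:
                continue
            seen.add(edge)
            # Climb from both endpoints until they meet at a common ancestor.
            left, right = [u], [v]
            a, b = u, v
            while a != b:
                if depth[a] >= depth[b]:
                    a = parent[a]
                    left.append(a)
                else:
                    b = parent[b]
                    right.append(b)
            # left runs u -> ancestor, reversed right runs ancestor -> v;
            # the non-tree edge (v, u) closes the cycle.
            cycles.append(left + right[-2::-1])
    return cycles


# Square 0-1-2-3 with the diagonal 0-2: m - n + 1 = 5 - 4 + 1 = 2 cycles.
square = {0: [1, 3, 2], 1: [0, 2], 2: [1, 3, 0], 3: [2, 0]}
cycles = fundamental_cycles(square)
```

A connected graph with n vertices and m edges has m - n + 1 fundamental cycles with respect to any spanning tree, which is why the example graph yields two.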

An Effective Data Sampling Procedure for Imbalanced Data Learning on Health Insurance Fraud Detection

Journal of Computing and Information Technology ◽

10.20532/cit.2020.1005216 ◽

2021 ◽

Vol 28 (4) ◽

pp. 269-285

Author(s):

Shamitha S Kotekani ◽

Ilango Velchamy

Keyword(s):

Neural Networks ◽

Health Insurance ◽

Academic Research ◽

Imbalanced Data ◽

Fraud Detection ◽

Sampling Procedure ◽

Insurance Fraud ◽

Data Sampling ◽

Decision Tree Classifier ◽

Tree Classifier

Fraud detection has received considerable attention from academia and industry worldwide due to its increasing popularity. Insurance datasets are enormous, with skewed distributions and high dimensionality. Skewed class distribution and data volume are considered significant problems when analyzing insurance datasets, as these issues increase misclassification rates. Although sampling approaches such as random oversampling and SMOTE can help balance the data, they can also increase the computational complexity and lead to a deterioration of the model's performance. More sophisticated techniques are therefore needed to balance the skewed classes efficiently. This research focuses on optimizing the learner for fraud detection by applying a Fused Resampling and Cleaning Ensemble (FusedRCE) for effective sampling in health insurance fraud detection. We hypothesized that meticulous oversampling followed by guided data cleaning would improve the prediction performance and the learner's understanding of the minority fraudulent classes compared to other sampling techniques. The proposed model works in three steps. In the first step, PCA is applied to extract the necessary features and reduce the dimensionality of the data. In the second step, a hybrid combination of k-means clustering and SMOTE oversampling is used to resample the imbalanced data. Oversampling introduces a lot of noise into the data. In the third step, the balanced data is therefore thoroughly cleaned using the Tomek Link algorithm to remove the noisy samples generated during oversampling. The Tomek Link algorithm clears the boundary between minority and majority class samples and makes the data more precise and freer from noise. The resulting dataset is used by four different classification algorithms: Logistic Regression, Decision Tree Classifier, k-Nearest Neighbors, and Neural Networks, evaluated using repeated 5-fold cross-validation. Compared to the other classifiers, Neural Networks with FusedRCE had the highest average prediction rate of 98.9%. The results were also measured using metrics such as F1 score, Precision, Recall and AUC values. The results obtained show that the proposed method performed significantly better than any other fraud detection approach in health insurance by predicting more fraudulent data with greater accuracy and a 3x increase in speed during training.
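To make the cleaning step more concrete, the following is a minimal pure-Python sketch of Tomek link detection: two samples form a Tomek link when they belong to different classes and are each other's nearest neighbour. The function names, the Euclidean distance, and the choice to drop only the majority-class member of each link are assumptions for illustration; the abstract does not state which members FusedRCE removes.

```python
import math


def tomek_links(points, labels):
    """Return index pairs (i, j) with i < j that form Tomek links."""

    def nearest(i):
        # Brute-force nearest neighbour under Euclidean distance.
        best, best_d = None, math.inf
        for j, q in enumerate(points):
            if j == i:
                continue
            d = math.dist(points[i], q)
            if d < best_d:
                best, best_d = j, d
        return best

    nn = [nearest(i) for i in range(len(points))]
    links = []
    for i, j in enumerate(nn):
        # Mutual nearest neighbours with different class labels.
        if j is not None and i < j and nn[j] == i and labels[i] != labels[j]:
            links.append((i, j))
    return links


def drop_majority_in_links(points, labels, majority):
    """Remove the majority-class member of every Tomek link (assumed policy)."""
    drop = {k for pair in tomek_links(points, labels)
            for k in pair if labels[k] == majority}
    keep = [k for k in range(len(points)) if k not in drop]
    return [points[k] for k in keep], [labels[k] for k in keep]


# Toy data: samples 0 and 1 sit on the class boundary; the rest do not.
points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.2, 5.0), (9.0, 9.0)]
labels = [0, 1, 0, 0, 1]
links = tomek_links(points, labels)
clean_points, clean_labels = drop_majority_in_links(points, labels, majority=0)
```

The brute-force neighbour search is quadratic in the number of samples, so a real pipeline on a large insurance dataset would likely rely on an indexed nearest-neighbour search instead.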

Automatic Detection of Display Defects for Smart Meters based on Deep Learning

Journal of Computing and Information Technology ◽

10.20532/cit.2020.1005158 ◽

2021 ◽

Vol 28 (4) ◽

pp. 241-254

Author(s):

Ye Chen ◽

Zhihu Hong ◽

Yaohua Liao ◽

Mengmeng Zhu ◽

Tong Han ◽

...

Keyword(s):

Deep Learning ◽

Fault Detection ◽

Defect Detection ◽

Automatic Detection ◽

Detection Task ◽

Detection Methods ◽

Detection Accuracy ◽

Smart Meter ◽

Smart Meters ◽

The Common

The smart meter is an essential part of an intelligent grid system. Defects in the LCD screens of smart meters affect their use. Therefore, detecting LCD screen defects in smart meters is of great significance for the management and use of smart electricity meters. At present, detection is mainly carried out either manually or automatically using machine vision. However, the performance of these two methods is not satisfactory. The fault detection task for a smart meter LCD screen can be divided into two parts: smart meter LCD localization and LCD fault detection. Therefore, this paper proposes a two-stage system based on deep learning, which combines YOLOv5 with ResNet34. YOLOv5 is used for smart meter LCD localization and a classification network based on ResNet34 is used for LCD fault detection. We have constructed an LCD screen localization dataset and an LCD screen defect detection dataset to train and test our model. As a result, our model achieves a defect detection accuracy of 98.9% on the dataset proposed in this paper and can accurately detect the common defects of an LCD screen.
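The two-stage structure described above (localize the LCD first, then classify the cropped region) can be sketched independently of any particular model. In the sketch below, `toy_localize` and `toy_classify` are hypothetical stand-ins for the YOLOv5 detector and the ResNet34-based classifier; the image representation, box convention, and all names are assumptions made here, not the paper's implementation.

```python
from typing import Callable, List, Optional, Tuple

Image = List[List[int]]          # grayscale image as rows of pixel values
Box = Tuple[int, int, int, int]  # (top, left, bottom, right), end-exclusive


def crop(image: Image, box: Box) -> Image:
    top, left, bottom, right = box
    return [row[left:right] for row in image[top:bottom]]


def inspect_meter(
    image: Image,
    localize: Callable[[Image], Optional[Box]],
    classify: Callable[[Image], str],
) -> Optional[str]:
    """Stage 1 finds the screen region; stage 2 labels the cropped region."""
    box = localize(image)
    if box is None:
        return None  # no screen found, nothing to classify
    return classify(crop(image, box))


def toy_localize(image: Image) -> Optional[Box]:
    # Placeholder detector: bounding box of all non-zero pixels.
    rows = [r for r, row in enumerate(image) if any(row)]
    cols = [c for row in image for c, px in enumerate(row) if px]
    if not rows:
        return None
    return (min(rows), min(cols), max(rows) + 1, max(cols) + 1)


def toy_classify(region: Image) -> str:
    # Placeholder classifier: any dark pixel inside the screen is a defect.
    return "defect" if any(px == 0 for row in region for px in row) else "normal"


good = [[0, 0, 0, 0], [0, 255, 255, 0], [0, 255, 255, 0], [0, 0, 0, 0]]
bad = [[0, 0, 0, 0], [0, 255, 255, 0], [0, 255, 0, 0], [0, 0, 0, 0]]
blank = [[0, 0], [0, 0]]
results = [inspect_meter(img, toy_localize, toy_classify) for img in (good, bad, blank)]
```

Passing the two stages in as callables keeps the pipeline logic separate from the models, so either stage could be swapped or evaluated on its own.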

A Hybrid Approach for Clustering Uncertain Time Series

Journal of Computing and Information Technology ◽

10.20532/cit.2020.1004802 ◽

2021 ◽

Vol 28 (4) ◽

pp. 255-267

Author(s):

Ruizhe Ma ◽

Xiaoping Zhu ◽

Li Yan

Keyword(s):

Time Series ◽

Clustering Algorithm ◽

Time Series Data ◽

Knowledge Engineering ◽

Uncertain Data ◽

Hybrid Approach ◽

Series Data ◽

Adjusted Rand Index ◽

Clustering Approach ◽

Uncertain Time

Information uncertainty is widespread in real-world applications, and uncertain data processing and analysis have become a crucial issue in the area of data and knowledge engineering. In this paper, we concentrate on clustering uncertain time series data, in which the uncertain values at time points are represented by probability density functions. We propose a hybrid clustering approach for uncertain time series. Our clustering approach first partitions the uncertain time series data into a set of micro-clusters and then merges the micro-clusters following the idea of hierarchical clustering. We evaluate our approach with experiments. The experimental results show that, compared with the traditional UK-means clustering algorithm, our clustering results achieve a clearly higher Adjusted Rand Index (ARI). In addition, the time efficiency of our clustering approach is significantly improved.
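Since the evaluation is reported in terms of the Adjusted Rand Index, a small reference computation may help in reading those results. The sketch below implements the standard pair-counting form of the ARI from a contingency table; the function name and the handling of degenerate inputs (returning 1.0) are choices made here, not taken from the paper.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand Index of two labelings of the same items."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("labelings must have the same length")
    n = len(labels_true)
    if n < 2:
        return 1.0  # no pairs to disagree on (assumed convention)

    # Pairs placed together in both labelings, in the first, in the second.
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_true, labels_pred)).values())
    together_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    together_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())

    expected = together_true * together_pred / comb(n, 2)
    maximum = (together_true + together_pred) / 2
    if maximum == expected:
        return 1.0  # both labelings are trivial (assumed convention)
    return (together_both - expected) / (maximum - expected)


same = adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])      # relabelled copy
crossed = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])   # maximally mixed
```

The index is 1 for identical partitions regardless of label names, close to 0 for random agreement, and can be negative when agreement is worse than chance.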

Editorial for Vol.28, No.4

Journal of Computing and Information Technology ◽

10.20532/cit.2020.1005308 ◽

2021 ◽

Vol 28 (4) ◽

pp. i-ii

Keyword(s):

Machine Learning ◽

Information Technology ◽

Computer Vision ◽

Time Series ◽

Graph Theory ◽

Regular Section

In the fourth, and final, issue of CIT. Journal of Computing and Information Technology (Vol. 28, No. 4; December 2020), we bring four papers from the regular section, which address subjects in graph theory, computer vision, time series, and machine learning.

Audit Risk Evaluation Model for Financial Statement Based on Artificial Intelligence

Journal of Computing and Information Technology ◽

10.20532/cit.2020.1005180 ◽

2021 ◽

Vol 28 (3) ◽

pp. 207-223

Keyword(s):

Risk Evaluation ◽

Homomorphic Encryption ◽

Evaluation Model ◽

Financial Statements ◽

Financial Statement ◽

Practical Significance ◽

Audit Risk ◽

Risk Indices ◽

Verification Model ◽

Risk Evaluation Model

In recent years, the economy in China has been steadily improving. The financial situation of enterprises in mainstream industries has become the focus of public concern. However, financial statement frauds, which occur frequently, greatly disrupt the economic order in the country. Thus, it is of practical significance to accurately identify and evaluate the audit risks of financial statements. For this purpose, this paper proposes an audit risk evaluation model for financial statements based on artificial neural networks (ANN). Firstly, the authors designed the audit risk indices and quantified the fraud factors of financial statements. Next, an audit risk verification model was established for financial statements and used to verify the predictions on three aspects of financial statements, namely, audit violation penalty (AVP), audit violation announcement (AVA), and financial statement restatement (FSR). Finally, a feedforward neural network was constructed based on the homomorphic encryption algorithm, which was subsequently used to evaluate and predict the audit risks of financial statements. The effectiveness of our model was verified through experiments.

A WGFS Based Approach to Extract Factors Influencing the Marketing of Korean Language in GCC

Journal of Computing and Information Technology ◽

10.20532/cit.2020.1004996 ◽

2021 ◽

Vol 28 (3) ◽

pp. 165-181

Keyword(s):

Feature Selection ◽

Hybrid Model ◽

High Performance ◽

Gulf Cooperation Council ◽

Korean Language ◽

Minimal Set ◽

Empirical Results ◽

Analysis Process ◽

Factors Influencing

This research proposes an approach for determining the minimal set of important factors that influence the desire to learn the Korean language in the Gulf Cooperation Council (GCC). Those factors will then influence marketing of the Korean language in the GCC by guiding interested people to increase their commercial abilities, improve their information about Korean drama, and prepare them to study in or travel to Korea. A total of 500 responses out of 526 questionnaires were used for the analysis process. A combination of the weight by SVM and the weight guided feature selection (WGFS) techniques was proposed to build a strong hybrid reduction model for the investigated dataset. Five different classifiers were used to test the results. Empirical results have shown that the generated factors (the reduct) are very significant for testing the ability/inability to learn the Korean language. SVM performed best, with an accuracy of 94%. This research contributes to the literature by highlighting the importance of the Korean language in the GCC and by presenting the important factors that influence learners of the Korean language: encouragements and obstacles. Moreover, the current research presents the best classifier, which yields high classification performance.

A Survey of Citation Recommendation Tasks and Methods

Journal of Computing and Information Technology ◽

10.20532/cit.2020.1005160 ◽

2021 ◽

Vol 28 (3) ◽

pp. 183-205

Keyword(s):

Machine Learning ◽

Language Processing ◽

State Of The Art ◽

Scientific Production ◽

Machine Learning Methods ◽

Citation Function ◽

Key Aspects ◽

Global And Local ◽

Machine Learning Models

Scientific articles store vast amounts of knowledge amassed through many decades of research. They serve not only to communicate research results among scientists but also for learning and tracking progress in the field. However, scientific production has risen to levels that make it difficult even for experts to keep up with work in their field. As a remedy, specialized search engines are being deployed, incorporating novel natural language processing and machine learning methods. The task of citation recommendation, in particular, has attracted much interest as it holds promise for improving the quality of scientific production. In this paper, we present the state of the art in citation recommendation: we survey the methods for global and local approaches to the task, the evaluation setups and datasets, and the most successful machine learning models. In addition, we give an overview of two tasks complementary to citation recommendation: extraction of key aspects and entities from articles, and citation function classification. With this survey, we hope to provide the ground for understanding current efforts and stimulate further research in this exciting and promising field.

Editorial for Vol.28, No.3

Journal of Computing and Information Technology ◽

10.20532/cit.2020.1005278 ◽

2021 ◽

Vol 28 (3) ◽

pp. i-ii

Keyword(s):

Machine Learning ◽

Information Technology ◽

Natural Language Processing ◽

Natural Language ◽

Software Testing ◽

Language Processing ◽

Business Management ◽

Testing And Debugging ◽

Regular Section

This third issue (September 2020) of CIT. Journal of Computing and Information Technology comprises four papers from the regular section, tackling topics from the areas of software testing and debugging, machine learning, natural language processing and business management processing.

A Review of Software Reliability Testing Techniques

Journal of Computing and Information Technology ◽

10.20532/cit.2020.1005155 ◽

2021 ◽

Vol 28 (3) ◽

pp. 147-164

Keyword(s):

Software Reliability ◽

Intelligent Systems ◽

Test Case ◽

Reliability Testing ◽

Reliability Modeling ◽

Software Technology ◽

Intelligent Software ◽

Challenges And Opportunities ◽

Testing Technology ◽

The Impact

In the era of intelligent systems, the safety and reliability of software have received more attention. Software reliability testing is a significant method to ensure the reliability, safety and quality of software. Intelligent software technology has not only offered new opportunities but also posed challenges to software reliability technology. The focus of this paper is to explore software reliability testing technology under the impact of intelligent software technology. In this study, the basic theories of traditional software and intelligent software reliability testing were investigated via related previous works, and a general software reliability testing framework was established. Then, the technologies of software reliability testing were analyzed, including reliability modeling, test case generation, reliability evaluation, testing criteria and testing methods. Finally, the challenges and opportunities of software reliability testing technology were discussed.

Journal of Computing and Information Technology
Latest Publications

Published By "Faculty Of Electrical Engineering And Computing, Univ. Of Zagreb"

Algorithms for Finding Diameter Cycles of Biconnected Graphs

An Effective Data Sampling Procedure for Imbalanced Data Learning on Health Insurance Fraud Detection

Automatic Detection of Display Defects for Smart Meters based on Deep Learning

A Hybrid Approach for Clustering Uncertain Time Series

Editorial for Vol.28, No.4

Audit Risk Evaluation Model for Financial Statement Based on Artificial Intelligence

A WGFS Based Approach to Extract Factors Influencing the Marketing of Korean Language in GCC

A Survey of Citation Recommendation Tasks and Methods

Editorial for Vol.28, No.3

A Review of Software Reliability Testing Techniques
