Refined Clustering of Software Components by Using K-Mean and Neural Network

Author(s):  
Indu Verma ◽  
Amarjeet Kaur ◽  
Iqbaldeep Kaur

Data mining is the extraction of relevant information from a data set. A data warehouse is a location where information is stored. Data mining offers various services, and clustering is one of them. Clustering is an effort to group similar data into a single cluster. In this paper we propose and implement k-means and a neural network for clustering similar components into a single cluster. Clustering reduces the search space by grouping similar test cases together according to the requirements, which minimizes the search time for retrieving test cases and reduces time complexity. We propose an approach for the reusability of test cases that combines unsupervised and supervised learning: k-means for the unsupervised part and a neural network for the supervised part. We have designed the algorithm for requirement and test case document clustering according to its tf-idf vector space, and the output is a set of highly cohesive pattern groups.
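The abstract gives no implementation details, so the following is only a minimal sketch of clustering short test case descriptions by their tf-idf vectors with k-means. The function names, the whitespace tokenizer, the `log(N/df)` idf variant, and the deterministic farthest-first initialization are all assumptions made for illustration, not the authors' method.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one unit-length sparse tf-idf vector (dict term -> weight) per document."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vec = {t: (c / len(tokens)) * math.log(n_docs / df[t]) for t, c in tf.items()}
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({t: w / norm for t, w in vec.items()})
    return vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two sparse vectors."""
    return sum((a.get(k, 0.0) - b.get(k, 0.0)) ** 2 for k in set(a) | set(b))


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of sparse vectors."""
    total = Counter()
    for v in vectors:
        for t, w in v.items():
            total[t] += w
    return {t: w / len(vectors) for t, w in total.items()}


def kmeans(vectors, k, iterations=20):
    """Plain k-means; returns one cluster label per input vector."""
    # Assumption: farthest-first initialization keeps the sketch deterministic.
    centroids = [vectors[0]]
    while len(centroids) < k:
        centroids.append(max(vectors, key=lambda v: min(sq_dist(v, c) for c in centroids)))
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: nearest centroid for every vector.
        labels = [min(range(k), key=lambda c: sq_dist(v, centroids[c])) for v in vectors]
        # Update step: recompute each centroid from its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = mean_vector(members)
    return labels


# Hypothetical test case descriptions, invented for this example.
docs = [
    "login page accepts valid password",
    "login page rejects invalid password",
    "report export writes csv file",
    "report export writes pdf file",
]
labels = kmeans(tfidf_vectors(docs), k=2)
```

With these four descriptions, the two login cases share one label and the two report cases share the other, which is the kind of grouping that narrows the search when retrieving reusable test cases.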

2011 ◽  
Vol 204-210 ◽  
pp. 600-603
Author(s):  
Gang Li ◽  
Xing San Qian ◽  
Chun Ming Ye ◽  
Lin Zhao

This paper focuses on a clustering method for pruning a Fully Connected Backpropagation (FCBP) neural network. The initial neural network is fully connected. After training with sample data, a clustering method is used to cluster the weights from the input to the hidden layer and from the hidden to the output layer, and connections that are relatively unnecessary are deleted, so that the initial network becomes a Partially Connected Backpropagation (PCBP) neural network. PCBP can be used in prediction or data mining more efficiently than FCBP. At the end of the paper, an experiment is conducted to illustrate the effects of PCBP using the submersible pump repair data set.
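The abstract does not say which clustering method is used or how "relatively unnecessary" connections are identified. As one possible reading, the sketch below splits the weight magnitudes of a single layer into two groups with one-dimensional 2-means and zeroes the group of smaller magnitudes. The function name and the magnitude criterion are assumptions for illustration only.

```python
def prune_by_magnitude_clusters(weights, iterations=50):
    """Zero the weights whose magnitude falls in the lower of two 1-D clusters.

    `weights` is a list of rows (a layer's weight matrix). The assumption that
    small-magnitude weights are the unnecessary ones is this sketch's, not the paper's.
    """
    mags = [abs(w) for row in weights for w in row]
    low, high = min(mags), max(mags)
    if low == high:
        # All magnitudes are equal, so there is nothing to separate.
        return [list(row) for row in weights]
    for _ in range(iterations):
        midpoint = (low + high) / 2
        small = [m for m in mags if m <= midpoint]
        large = [m for m in mags if m > midpoint]
        new_low = sum(small) / len(small)
        new_high = sum(large) / len(large)
        if (new_low, new_high) == (low, high):
            break  # Centroids stopped moving.
        low, high = new_low, new_high
    threshold = (low + high) / 2
    # A zeroed weight stands for a deleted connection.
    return [[w if abs(w) > threshold else 0.0 for w in row] for row in weights]


# Hypothetical trained weights between a 3-unit and a 2-unit layer.
layer = [[0.9, -0.02, 0.8], [0.01, -0.85, 0.03]]
pruned = prune_by_magnitude_clusters(layer)
```

Here the three weights with magnitudes near 0.02 are zeroed and the three large weights are kept, leaving a partially connected layer.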


2019 ◽  
Vol 8 (3) ◽  
pp. 4373-4378

Data belonging to different domains is being stored at a rapid pace in repositories across the globe. Extracting useful information from such huge volumes of data is always difficult because of the dynamic nature of the data being stored. Data mining is a knowledge discovery process used to extract hidden information, in the form of patterns, from the data stored in various repositories, termed warehouses. One of the popular tasks of data mining is classification, which assigns every instance of a data set to one of the predefined class labels. Banking is one of the real-world domains that collects huge amounts of client data on a daily basis. In this work, we have collected two variants of the bank marketing data set pertaining to a Portuguese financial institution, consisting of 41188 and 45211 instances, and performed classification on them using two data reduction techniques. Attribute subset selection has been performed on the first data set, and the training data with the selected features are used in classification. Principal Component Analysis has been performed on the second data set, and the training data with the extracted features are used in classification. A deep neural network classification algorithm based on backpropagation has been developed to perform classification on both data sets. Finally, the performance of each deep neural network classifier is compared with four standard classifiers, namely decision trees, Naïve Bayes, support vector machines, and k-nearest neighbors. It has been found that the deep neural network classifier outperforms the existing classifiers in terms of accuracy.


2016 ◽  
Vol 4 (4) ◽  
pp. 56-70 ◽  
Author(s):  
Ahmad A. Saifan ◽  
Emad Alsukhni ◽  
Hanadi Alawneh ◽  
Ayat AL Sbaih

Software testing is a process of ratifying the functionality of software. It is a crucial area which consumes a great deal of time and cost. The time spent on testing is mainly concerned with testing large numbers of unreliable test cases. The authors' goal is to reduce that number and offer more reliable test cases, which can be achieved by using selection techniques to choose a subset of the existing test cases. The main goal of test case selection is to identify a subset of the test cases that is capable of satisfying the requirements as well as exposing most of the existing faults. The state of practice among test case selection heuristics is cyclomatic complexity and code coverage. The authors used a clustering algorithm, a data mining approach, to reduce the number of test cases. Their approach was able to obtain 93 unique effective test cases out of a total of 504.


Author(s):  
T. Z. Ibragimov ◽  

Methods of data mining were used to predict Septoria leaf blotch of wheat. A system has been developed that allows parallel forecasting on the same data set using an artificial neural network, a decision tree, and a naive Bayesian classifier. The system allows the user to interactively adjust the design parameters for each method, view the results obtained, and evaluate their effectiveness.


2014 ◽  
Vol 971-973 ◽  
pp. 2180-2185
Author(s):  
Sheng Long Yang

Using a Grey neural network combined with sampling data from the Yangtze estuary wetland, measured at fifteen sampling sites during rising and falling tides in May 2010, this study carries out an intelligent comprehensive evaluation of the sea water quality of the Yangtze estuary wetland. The results showed that the sea water quality of the sampled data was rated Ⅰ. The precision on the training and testing data sets showed that the Grey neural network had good generalization capacity, with good fitting precision and strong predictive ability. It can be applied to calculations on similar data sets.


2015 ◽  
Vol 738-739 ◽  
pp. 191-196
Author(s):  
Yun Jie Li ◽  
Hui Song

In this paper, several data mining techniques are discussed and analyzed with the objective of recognizing human daily activities from a continuous sensing data set. The data mining techniques of decision tree, Naïve Bayes and neural network were successfully applied to the data set. The paper also proposes combining the neural network with the decision tree, and the results show that the combination works much better than either the typical neural network or the typical decision tree model.


2017 ◽  
Vol 8 (1) ◽  
pp. 21-41
Author(s):  
Emad Alsukhni ◽  
Ahmad A. Saifan ◽  
Hanadi Alawneh

Test cases do not have the same importance when used to detect faults in software; therefore, it is more efficient to test the system with the test cases that have the ability to detect the faults. This research proposes a new framework that combines data mining techniques to prioritize the test cases. It enhances fault prediction and detection using two different techniques: 1) the data mining regression classifier that depends on software metrics to predict defective modules, and 2) the k-means clustering technique that is used to select and prioritize test cases to identify faults early. Our approach of test case prioritization yields good results in comparison with other studies. The authors used the Average Percentage of Faults Detection (APFD) metric to evaluate the proposed framework, which results in 19.9% for all system modules and 25.7% for defective ones. Our results indicate that it is effective to start the testing process with the most defective modules instead of testing all modules arbitrarily.
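For readers unfamiliar with APFD, the sketch below computes the metric in its commonly used form, APFD = 1 - (TF_1 + ... + TF_m) / (n * m) + 1 / (2n), where n is the number of test cases, m the number of faults, and TF_i the position of the first test in the ordering that detects fault i. The function name and the example data are hypothetical, and the sketch assumes every fault is detected by at least one test in the ordering.

```python
def apfd(test_order, faults_detected_by):
    """Compute APFD for one ordering of a test suite.

    test_order: list of test identifiers in execution order.
    faults_detected_by: dict mapping each fault to the set of tests that detect it.
    Assumes each fault is detected by at least one test in test_order.
    """
    n = len(test_order)
    # 1-based position of every test in the ordering.
    position = {test: i + 1 for i, test in enumerate(test_order)}
    # TF_i: position of the first test that exposes fault i.
    first_positions = [
        min(position[t] for t in tests) for tests in faults_detected_by.values()
    ]
    m = len(first_positions)
    return 1 - sum(first_positions) / (n * m) + 1 / (2 * n)


# Hypothetical suite of four tests and three faults.
faults = {"F1": {"T3"}, "F2": {"T1", "T3"}, "F3": {"T4"}}
unprioritized = apfd(["T1", "T2", "T3", "T4"], faults)
prioritized = apfd(["T3", "T4", "T1", "T2"], faults)
```

In this made-up example, running T3 and T4 first exposes all three faults within two tests, so the prioritized ordering scores a higher APFD than the original one.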


2017 ◽  
Vol 7 (1.3) ◽  
pp. 74
Author(s):  
Lakshmi Prasad Mudarakola ◽  
J. K. R. Sastry

Testing an embedded system is required to locate bugs in software, to reduce risk and development and repair costs, and to improve performance for both users and the company. Embedded software testing tools are useful for catching defects during unit, integration and system testing. In many cases, embedded systems must be optimized by exercising their crucial areas while considering all factors of the input domain. The most important concern is to build a set of test cases, based on the design of the requirements, that can detect more faults at the least cost and time in the major sections of an embedded system. This paper proposes a Neural Network Based Strategy (NNBS) to generate optimized test cases based on the considerations of the system. A tool called NNTCG (Neural Network Test Case Generator) has been built based on the method proposed in this paper. Test cases for an embedded system are generated using NNTCG, and the same test cases are used to determine both the expected output through the neural network and the output generated by the actual firmware. The faulty paths within the firmware are identified when the output generated by the neural network is not the same as the output generated by the firmware.


Author(s):  
Amit Verma ◽  
Simranjeet Kaur

Test Case Prioritization (TCP) has gained widespread acceptance because it often results in good-quality software free from defects. As the rate of faults in software increases, traditional prioritization techniques result in increased cost and time. The main challenge in TCP is the difficulty of manually validating the priorities of different test cases, owing to the large size of test suites, and little emphasis has been placed on automating the TCP process. The objective of this paper is to detect the priorities of different test cases using an artificial neural network (ANN), which helps to predict the correct priorities with the help of the backpropagation algorithm. In the proposed work, one such method is implemented in which priorities are assigned to different test cases based on their frequency. After the priorities are assigned, the ANN predicts whether the correct priority has been assigned to every test case; if a wrong priority is assigned, it generates an interrupt. Classifiers are used to classify the test cases of different priorities. The proposed algorithm is effective because it reduces complexity with robust efficiency and automates the prioritization of test cases.


2017 ◽  
Vol 16 (2) ◽  
pp. 55
Author(s):  
Anak Agung Gede Bagus Ariana ◽  
I Ketut Gede Darma Putra ◽  
Linawati Linawati

Abstract: This study investigates the performance of artificial neural network methods for data clustering. The data set is the 2009 customer profile of UD. Fenny, with Recency, Frequency and Monetary attributes. The clustering methods compared in this study are the Self Organizing Map and Adaptive Resonance Theory 2. The performance of each method is evaluated by measuring validation indices of the resulting clusters. The cluster validation indices used include the Davies-Bouldin index, the Dunn index and the Silhouette index. The test results show that the Self Organizing Map method is better at clustering the data. Index terms: Data Mining, Artificial Neural Network, Self Organizing Map, Adaptive Resonance Theory 2.
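As an illustration of how one of these validation indices works, the following sketch computes the Davies-Bouldin index for a given partition, using Euclidean distance and centroid-based scatter. Lower values indicate more compact, better separated clusters. The function names and the toy points are assumptions for this example and are not taken from the study's data; the sketch also assumes no two clusters share a centroid.

```python
import math


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length points."""
    return [sum(col) / len(points) for col in zip(*points)]


def davies_bouldin(clusters):
    """Davies-Bouldin index of a partition given as a list of clusters of points.

    Assumes at least two clusters with distinct centroids.
    """
    cents = [centroid(c) for c in clusters]
    # Scatter S_i: mean distance from the members of cluster i to its centroid.
    scatter = [
        sum(math.dist(p, cents[i]) for p in c) / len(c)
        for i, c in enumerate(clusters)
    ]
    k = len(clusters)
    # For each cluster, take the worst (largest) similarity ratio to any other cluster.
    worst = [
        max(
            (scatter[i] + scatter[j]) / math.dist(cents[i], cents[j])
            for j in range(k)
            if j != i
        )
        for i in range(k)
    ]
    return sum(worst) / k


# Two hypothetical partitions of the same four 2-D points.
compact = [[(0, 0), (0, 1)], [(10, 0), (10, 1)]]
stretched = [[(0, 0), (10, 0)], [(0, 1), (10, 1)]]
db_compact = davies_bouldin(compact)
db_stretched = davies_bouldin(stretched)
```

The partition that groups nearby points scores far lower than the one that groups distant points, which is how such an index can rank the output of two clustering methods on the same data.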

