irrelevant attributes

2022 ◽  
pp. 1109-1138
Author(s):  
B. Subashini ◽  
D. Jeya Mala

Software testing is used to find bugs so that a quality product reaches end users. Test suites are used to detect failures, but they may be redundant and take a long time to execute. In this article, a large number of test cases is generated using combinatorial test design algorithms. Attribute reduction is an important preprocessing task in data mining: weak and irrelevant attributes are removed to reduce complexity. After preprocessing, it is not necessary to test the software with every combination of test cases; because the generated test cases are numerous and redundant, the most effective ones are identified using a data mining algorithm. The resulting test suite still identifies the defects in the software while providing better coverage and reducing execution time.
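
The abstract does not name the combinatorial design algorithm or the data mining technique it uses, so the following is only an illustrative sketch of the general idea of shrinking an exhaustive test suite. It swaps in a plain greedy pairwise (all-pairs) selection as a stand-in; the parameter names and values are hypothetical.

```python
from itertools import combinations, product


def pairs_of(case):
    """Return every pair of (parameter index, value) settings in one test case."""
    return {((i, case[i]), (j, case[j]))
            for i, j in combinations(range(len(case)), 2)}


def greedy_pairwise(parameters):
    """Greedily pick test cases until every pair of parameter values is covered.

    This is a simple stand-in for a combinatorial test design algorithm,
    not the method used in the article.
    """
    candidates = list(product(*parameters))  # the exhaustive suite
    uncovered = set().union(*(pairs_of(c) for c in candidates))
    suite = []
    while uncovered:
        # Choose the candidate that covers the most still-uncovered pairs.
        best = max(candidates, key=lambda c: len(pairs_of(c) & uncovered))
        suite.append(best)
        uncovered -= pairs_of(best)
    return suite


# Hypothetical configuration space: 2 * 3 * 2 = 12 exhaustive combinations.
parameters = [["chrome", "firefox"], ["linux", "windows", "macos"], ["en", "de"]]
suite = greedy_pairwise(parameters)
print(len(list(product(*parameters))), "exhaustive cases reduced to", len(suite))
```

The reduced suite still exercises every pair of parameter values, which is the usual trade-off behind pairwise testing: fewer executions at the cost of not covering every higher-order combination.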


2021 ◽  
Vol 11 (4) ◽  
pp. 2976-2997
Author(s):  
Saba Saleem ◽  
Mehmood Ahmed ◽  
Luqman Shah ◽  
Ali Imran Jehangiri ◽  
Muhammad Naeem ◽  
...  

Data mining methods have been used extensively in decision making. Their popularity is due to the availability of high-speed algorithms and the processing and storage power of modern computers. Effective use of data mining methods helps in mining datasets and making better decisions. Data need to be preprocessed before data mining methods are applied. Some datasets require little preparation, such as handling missing and redundant instances, while some high-dimensional datasets require heavier processing such as dimensionality reduction. One technique for dimensionality reduction is feature selection. This study uses a graph-based centrality measure for feature selection: features are ranked by centrality, and the ranking is used to remove irrelevant attributes. Comparison with other approaches shows that the proposed approach reduces the feature space without compromising accuracy. The results also show that the proposed approach outperforms some other feature selection approaches, not only in accuracy but also in achieving a larger reduction in feature space.
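
The abstract does not say how the feature graph is built or which centrality measure is used. As a rough sketch under stated assumptions, the example below treats features as nodes, weights each edge by the absolute Pearson correlation between two features, and ranks features by weighted degree centrality. The graph construction, the centrality choice, and the toy columns are all assumptions made for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y) if norm_x and norm_y else 0.0


def rank_by_degree_centrality(columns):
    """Rank features by weighted degree in a correlation graph (assumed design)."""
    names = list(columns)
    scores = {
        a: sum(abs(pearson(columns[a], columns[b])) for b in names if b != a)
        for a in names
    }
    ranking = sorted(names, key=scores.get, reverse=True)
    return ranking, scores


# Toy columns (hypothetical): "noise" is nearly uncorrelated with the rest.
columns = {
    "f1": [1, 2, 3, 4, 5],
    "f2": [2, 4, 6, 8, 10],
    "f3": [1, 2, 3, 5, 4],
    "noise": [1, -1, 0, -1, 1],
}
ranking, scores = rank_by_degree_centrality(columns)
print(ranking)
```

Under this assumed design, a feature that is weakly connected to the rest of the graph ends up at the bottom of the ranking and becomes a candidate for removal; a real study would also have to account for relevance to the class label.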


2021 ◽  
Vol 9 (3) ◽  
pp. 359
Author(s):  
Dharma Putra ◽  
I Gusti Agung Gede Arya Kadnyanana

Feature selection is a research area in data mining concerned with datasets that have relatively many attributes. Eliminating attributes that are irrelevant to the class label can improve the performance of a classification algorithm. Information Gain is one algorithm for identifying features that are irrelevant to the class label. It is a filter technique: it scores each attribute against the class label independently of any classifier. This research implements feature selection using Information Gain on the NSL-KDD intrusion detection dataset, which has a relatively large number of attributes. The dataset with the selected attributes is then passed to a classification algorithm, so that attribute reduction can speed up computation and improve the accuracy of the resulting model.
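
Information gain is the reduction in class-label entropy obtained by splitting on an attribute, IG(Y; X) = H(Y) - H(Y | X). The sketch below computes it for categorical attributes using only the standard library. The tiny dataset and its column names are hypothetical and are not taken from NSL-KDD.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """H(labels) minus the weighted entropy of labels within each feature value."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


# Hypothetical toy data: one informative and one irrelevant attribute.
labels = ["attack", "attack", "normal", "normal"]
features = {
    "informative": ["x", "x", "y", "y"],   # fully determines the label
    "irrelevant": ["p", "q", "p", "q"],    # tells us nothing about the label
}
gains = {name: information_gain(col, labels) for name, col in features.items()}
ranking = sorted(gains, key=gains.get, reverse=True)
print(gains, ranking)
```

Attributes whose gain falls below a chosen threshold would be dropped before training; the threshold itself is a tuning decision that the abstract does not specify.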


2021 ◽  
Author(s):  
Gabriel Marcondes Santos ◽  
Emmanuel Tavares Ferreira Affonso ◽  
Alisson Marques Silva ◽  
Gray Farias Moita

Computational Intelligence (CI) algorithms have proven highly effective in pattern classification and recognition. However, some databases contain irrelevant attributes that may be detrimental to the learning of the classification model. Feature Selection (FS) methods are commonly used to detect and exclude input attributes that are poorly representative of the data sets presented to classification algorithms. The goal of feature selection methods is to minimize the number of input attributes processed by a classifier in order to improve its accuracy. This work analyzes solutions to classification problems with three different classification algorithms: the unsupervised Fuzzy C-Means (FCM) algorithm, a supervised version of FCM, and a variation of supervised FCM with feature selection. The feature selection method incorporated in FCM is called Mean Ratio Feature Selection (MRFS); it was developed to have low computational cost, to avoid complex mathematical equations, and to be easily incorporated into any classifier. In the experiments, the three versions (unsupervised FCM, supervised FCM, and FCM with attribute selection) were run to verify whether there is a significant improvement between the variations of FCM. The results showed that FCM with MRFS is promising, outperforming both the original algorithm and its supervised version.
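
The abstract does not define MRFS, so it is not reproduced here. For context, the sketch below shows only the two alternating update steps of standard unsupervised Fuzzy C-Means on one-dimensional data: the membership update u_ij = 1 / sum_k (d_ij / d_kj)^(2/(m-1)) and the weighted center update. The data points, initial centers, and iteration count are hypothetical.

```python
def memberships(points, centers, m=2.0):
    """Return u[j][i], the degree to which point j belongs to cluster i."""
    power = 2.0 / (m - 1.0)
    result = []
    for x in points:
        dists = [abs(x - c) for c in centers]
        if 0.0 in dists:
            # The point coincides with a center: assign crisp membership.
            zeros = dists.count(0.0)
            result.append([1.0 / zeros if d == 0.0 else 0.0 for d in dists])
            continue
        result.append([1.0 / sum((di / dk) ** power for dk in dists)
                       for di in dists])
    return result


def update_centers(points, u, n_clusters, m=2.0):
    """Recompute each center as the mean of the points weighted by u ** m."""
    centers = []
    for i in range(n_clusters):
        weights = [row[i] ** m for row in u]
        centers.append(sum(w * x for w, x in zip(weights, points)) / sum(weights))
    return centers


# Hypothetical 1-D data with two well-separated groups.
points = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
centers = [0.0, 6.0]
for _ in range(20):  # fixed iteration count keeps the sketch short
    u = memberships(points, centers)
    centers = update_centers(points, u, n_clusters=2)
print(centers)
```

A feature selection step such as the one the authors describe would sit in front of this loop, reducing the dimensionality of the points before distances are computed.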


2020 ◽  
Vol 4 (1) ◽  
pp. 41-53
Author(s):  
Abdelkader Derbali ◽  
Lamia Jamel ◽  
Mohamed Bechir Chenguel ◽  
Ali Lamouchi ◽  
Ahmed K Elnagar ◽  
...  

The purpose of this paper is to examine whether creditors take a firm’s governance attributes into account when setting the cost of debt. Using a sample of 486 US firms over the period 1998-2017, we synthesized governance into six factorial axes. We demonstrate that audit quality (independence, frequency of meetings, auditor’s reputation, existence of a charter) and financial expertise (percentage of financial experts and ownership of institutional investors) are informative tools for creditors, providing information on the quality and reliability of financial reporting. They negatively and significantly affect the cost of debt. Moreover, creditors value the presence of independent directors on the board and reduce the required cost of debt. Furthermore, the independence of the nomination and compensation committees proves to be an irrelevant governance attribute from the creditors’ perspective, because it does not reduce their agency risk. However, the attributes of the board (the size, the number of meetings, the existence of specialized committees, and meetings) are misunderstood by creditors, who increase the interest rate. In addition, the cost of debt increases with the concentration of managerial ownership and majority shareholders. Similarly, attributes reflecting managerial entrenchment (duality of CEO tenure) are positively correlated with the cost of debt.


Author(s):  
Abdiya Alaoui ◽  
Zakaria Elberrichi

Classification algorithms are widely applied in the medical domain to classify data for diagnosis. Medical datasets often contain many irrelevant attributes, and diagnosis is costly because many tests are required to predict a disease. Feature selection is one of the significant tasks of the data preprocessing phase. It extracts a subset of attributes from a large set and excludes redundant, irrelevant, or noisy attributes. By selecting the features that are important for predicting a disease, the authors can decrease the cost of diagnosis by avoiding numerous tests. Applied to the task of supervised classification, the authors construct a robust learning model for disease prediction. The search for a subset of features is an NP-hard problem, which can be addressed with metaheuristics. This chapter proposes a wrapper approach that hybridizes an ant colony algorithm with AdaBoost over decision trees to improve classification. The authors use an enhanced global pheromone updating rule. The experimental results show that this approach performs well.


Author(s):  
Talha Mahboob Alam

Malignant mesothelioma is a rare proliferative cancer that develops in the thin layer of tissue surrounding the lungs. It is associated with an extremely poor prognosis, and the majority of patients do not show symptoms. The epidemiology of mesothelioma is important for identifying the disease. The primary aim of this study is to explore the risk factors associated with mesothelioma. The dataset consists of healthy and mesothelioma patients, but only mesothelioma patients were selected for the identification of symptoms. The raw dataset was pre-processed, and the Apriori method was then used to generate association rules under various configurations. Pre-processing involved removing duplicated and irrelevant attributes, balancing the dataset, and converting numerical attributes to nominal ones before the association rules were created. Strong associations among disease factors (asbestos exposure, duration of asbestos exposure, duration of symptoms, erythrocyte sedimentation rate, and pleural to serum LDH ratio) were determined via the Apriori algorithm. Identifying the risk factors associated with mesothelioma may prevent patients from reaching the high-risk stage of the disease. It will also help to control the comorbidities associated with mesothelioma, which are cardiovascular diseases, cancer-related emotional distress, diabetes, anemia, and hypothyroidism.
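
To make the Apriori step concrete, here is a minimal sketch of level-wise frequent itemset mining with the usual support and confidence definitions. The transactions use neutral placeholder items ("a", "b", "c") and a hypothetical support threshold; they are not patient data and do not reproduce the study's configuration.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for all itemsets meeting min_support."""
    n = len(transactions)

    def support(itemset):
        return sum(itemset <= t for t in transactions) / n

    items = sorted({i for t in transactions for i in t})
    current = [frozenset([i]) for i in items
               if support(frozenset([i])) >= min_support]
    frequent = {}
    k = 1
    while current:
        for itemset in current:
            frequent[itemset] = support(itemset)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: keep candidates whose (k-1)-subsets are all frequent.
        current = [c for c in candidates
                   if all(frozenset(s) in frequent for s in combinations(c, k - 1))
                   and support(c) >= min_support]
    return frequent


def confidence(frequent, antecedent, consequent):
    """Confidence of antecedent -> consequent: support(A and C) / support(A)."""
    a, c = frozenset(antecedent), frozenset(consequent)
    return frequent[a | c] / frequent[a]


# Placeholder transactions (hypothetical, nominal attributes as items).
transactions = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"a", "b", "c"}, {"c"}]
frequent = apriori(transactions, min_support=0.6)
print(sorted((sorted(s), v) for s, v in frequent.items()))
print(confidence(frequent, {"b"}, {"a"}))
```

Converting numerical attributes to nominal ones, as the pre-processing step describes, is what allows each record to be treated as a set of items in this way.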

