Analysis of Different Evolutionary Techniques on Fuzzy Rule Base Generation

2019 · Vol 16 (9) · pp. 4008-4014
Author(s): Savita Wadhawan, Gautam Kumar, Vivek Bhatnagar

This paper presents an analysis of different population-based algorithms for rule base generation from numerical data sets, as fuzzy rule base generation is one of the key issues in fuzzy modeling. The algorithms are applied to a rapid Ni–Cd battery charger data set. We compare the efficiency of the different algorithms and conclude that SCA combined with local search gives remarkable efficiency compared to SCA alone. We also found that the efficiency of SCA with local search is comparable to that of memetic algorithms.
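
A minimal sketch of the hybrid idea, assuming SCA here denotes the sine cosine algorithm and using a simple perturbation-based hill climb as the local search; the objective, bounds, and parameters are illustrative stand-ins, not the authors' actual rule base encoding:

```python
import numpy as np

def sca_with_local_search(objective, dim, bounds, pop=30, iters=200, seed=0):
    """Sine cosine algorithm with a greedy local-search refinement step."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    X = rng.uniform(lo, hi, size=(pop, dim))        # candidate solutions
    fit = np.apply_along_axis(objective, 1, X)
    best = X[np.argmin(fit)].copy()

    for t in range(iters):
        r1 = 2.0 * (1 - t / iters)                  # exploration -> exploitation
        for i in range(pop):
            r2 = rng.uniform(0, 2 * np.pi, dim)
            r3 = rng.uniform(0, 2, dim)
            r4 = rng.uniform(size=dim)
            step = np.where(r4 < 0.5,
                            r1 * np.sin(r2) * np.abs(r3 * best - X[i]),
                            r1 * np.cos(r2) * np.abs(r3 * best - X[i]))
            X[i] = np.clip(X[i] + step, lo, hi)
        # local search: small random perturbations around the incumbent best
        for _ in range(10):
            cand = np.clip(best + rng.normal(0, 0.01 * (hi - lo), dim), lo, hi)
            if objective(cand) < objective(best):
                best = cand
        fit = np.apply_along_axis(objective, 1, X)
        if fit.min() < objective(best):
            best = X[np.argmin(fit)].copy()
    return best

# toy usage: minimise the sphere function as a stand-in for rule base error
print(sca_with_local_search(lambda x: np.sum(x**2), dim=5, bounds=(-5.0, 5.0)))
```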

2018 · Vol 11 (2) · pp. 53-67
Author(s): Ajay Kumar, Shishir Kumar

Several initial center selection algorithms have been proposed in the literature for numerical data, but the values of categorical data are unordered, so these methods are not applicable to categorical data sets. This article investigates the initial center selection process for categorical data and then presents a new support-based initial center selection algorithm. The proposed algorithm measures the weight of the unique data points of an attribute with the help of support and then integrates these weights along the rows to get the support of every row. A data object having the largest support is chosen as the initial center, followed by finding the other centers that are at the greatest distance from the initially selected center. The quality of the proposed algorithm is compared with the random initial center selection method, Cao's method, Wu's method, and the method introduced by Khan and Ahmad. Experimental analysis on real data sets shows the effectiveness of the proposed algorithm.
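
A minimal sketch of how such a support-based selection could look, assuming "support" means the relative frequency of an attribute value and using Hamming distance between categorical rows; function and variable names are illustrative, not from the paper:

```python
import numpy as np

def support_based_centers(X, k):
    """Pick k initial centers for categorical data by row support, then distance.

    X: 2-D array of categorical values (strings or codes), one row per object.
    """
    n, m = X.shape
    # support of each attribute value = its relative frequency in that column
    row_support = np.zeros(n)
    for j in range(m):
        values, counts = np.unique(X[:, j], return_counts=True)
        freq = dict(zip(values, counts / n))
        row_support += np.array([freq[v] for v in X[:, j]])

    centers = [int(np.argmax(row_support))]       # most "supported" row first
    while len(centers) < k:
        # Hamming distance from every row to its nearest chosen center
        dist = np.array([
            min(np.sum(X[i] != X[c]) for c in centers) for i in range(n)
        ])
        centers.append(int(np.argmax(dist)))      # farthest row becomes a center
    return X[centers]

# toy usage on a tiny categorical data set
X = np.array([["a", "x"], ["a", "y"], ["b", "y"], ["b", "z"], ["a", "x"]])
print(support_based_centers(X, k=2))
```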


Thorax · 2017 · Vol 73 (4) · pp. 339-349
Author(s): Margreet Lüchtenborg, Eva J A Morris, Daniela Tataru, Victoria H Coupland, Andrew Smith, ...

Introduction: The International Cancer Benchmarking Partnership (ICBP) identified significant international differences in lung cancer survival. Differing levels of comorbid disease across ICBP countries have been suggested as a potential explanation of this variation but, to date, no studies have quantified its impact. This study investigated whether comparable, robust comorbidity scores can be derived from the different routine population-based cancer data sets available in the ICBP jurisdictions and, if so, to use them to quantify international variation in comorbidity and determine its influence on outcome. Methods: Linked population-based lung cancer registry and hospital discharge data sets were acquired from nine ICBP jurisdictions in Australia, Canada, Norway and the UK, providing a study population of 233,981 individuals. For each person in this cohort, Charlson, Elixhauser and inpatient bed day comorbidity scores were derived relating to the 4–36 months prior to their lung cancer diagnosis. The scores were then compared to assess their validity and feasibility of use in international survival comparisons. Results: It was feasible to generate the three comorbidity scores for each jurisdiction, and these were found to have good content, face and concurrent validity. Predictive validity was limited, and there was evidence that reliability was questionable. Conclusion: The results presented here indicate that interjurisdictional comparability of recorded comorbidity was limited due to probable differences in coding and hospital admission practices in each area. Before the contribution of comorbidity to international differences in cancer survival can be investigated, an internationally harmonised comorbidity index is required.
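
For orientation, a small sketch of how a Charlson-type score is typically derived from coded comorbidity flags; the weights shown are a subset of the standard Charlson weights (Charlson et al., 1987), and the actual ICBP derivation from ICD-coded hospital discharge data is far more involved:

```python
# subset of the standard Charlson comorbidity weights (Charlson et al., 1987)
CHARLSON_WEIGHTS = {
    "myocardial_infarction": 1,
    "congestive_heart_failure": 1,
    "chronic_pulmonary_disease": 1,
    "diabetes": 1,
    "renal_disease": 2,
    "moderate_severe_liver_disease": 3,
    "metastatic_solid_tumour": 6,
}

def charlson_score(conditions):
    """Sum the weights of the comorbid conditions recorded for one patient."""
    return sum(CHARLSON_WEIGHTS.get(c, 0) for c in conditions)

# toy usage: conditions extracted from discharge records in a look-back window
print(charlson_score({"diabetes", "chronic_pulmonary_disease"}))  # -> 2
```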


Author(s): Antonia J. Jones, Dafydd Evans, Steve Margetts, Peter J. Durrant

The Gamma Test is a non-linear modelling analysis tool that allows us to quantify the extent to which a numerical input/output data set can be expressed as a smooth relationship. In essence, it allows us to efficiently calculate that part of the variance of the output that cannot be accounted for by the existence of any smooth model based on the inputs, even though this model is unknown. A key aspect of this tool is its speed: the Gamma Test has time complexity O(M log M), where M is the number of data points. For data sets consisting of a few thousand points and a reasonable number of attributes, a single run of the Gamma Test typically takes a few seconds. In this chapter we show how the Gamma Test can be used in the construction of predictive models and classifiers for numerical data. In doing so, we demonstrate the use of this technique for feature selection, and for the selection of the embedding dimension when dealing with a time series.
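
A minimal sketch of the underlying computation, assuming the standard near-neighbour formulation of the Gamma Test (delta/gamma pairs over k nearest neighbours, with the regression intercept estimating the noise variance); this is an illustrative implementation, not the authors' code:

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def gamma_test(X, y, k=10):
    """Estimate the noise variance Var(r) of y = f(X) + r for smooth f.

    Returns the intercept of the regression of gamma on delta, where
    delta[p] = mean squared input distance to the p-th nearest neighbour,
    gamma[p] = half the mean squared output difference for those neighbours.
    """
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X)
    dist, idx = nn.kneighbors(X)            # idx[:, 0] is the point itself
    deltas, gammas = [], []
    for p in range(1, k + 1):
        deltas.append(np.mean(dist[:, p] ** 2))
        gammas.append(0.5 * np.mean((y[idx[:, p]] - y) ** 2))
    slope, intercept = np.polyfit(deltas, gammas, 1)
    return intercept                        # the Gamma statistic

# toy usage: y = sin(x1) + noise with variance 0.01
rng = np.random.default_rng(0)
X = rng.uniform(0, 2 * np.pi, size=(2000, 2))
y = np.sin(X[:, 0]) + rng.normal(0, 0.1, 2000)
print(gamma_test(X, y))                     # should be close to 0.01
```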


2019 · Vol 152 (Supplement_1) · pp. S82-S82
Author(s): Emilia Calvaresi, Jonathan Genzen

Objectives: The World Health Organization recommends measurement of G6PD activity prior to initiation of 8-aminoquinolines for the treatment of P. vivax malaria. An estimated 400 million people worldwide have G6PD deficiency, making them susceptible to hemolysis under oxidative stress. A new single-dose therapy (radical cure) for malaria with tafenoquine is contraindicated in patients with <70% of normal G6PD activity due to its prolonged circulating half-life. However, most clinical laboratories report G6PD activity in U/g Hb or units/10¹² RBC and do not provide the percentage of normal activity, making potential eligibility determination challenging. Methods: Using an IRB-exempt protocol, a limited data set of consecutive quantitative G6PD results was retrieved from the clinical laboratory's enterprise data warehouse. Laboratory testing of these specimens had previously been performed at 37°C using an automated enzymatic assay (Pointe Scientific) configured on a cobas c501 chemistry analyzer (Roche Diagnostics). Data were assembled to include adults aged 18 to 89 years, excluding repeat results from the same patients as well as outliers. Results: The final data set included 52,216 results (female, 55.9%, n = 29,173; male, 44.1%, n = 23,043) from 47 US states. An adjusted male median (100% G6PD activity) was derived using an approach proposed by the Bangkok Workshop guidelines (Domingo et al., Malaria Journal, 2013), modified to more accurately differentiate bimodal peaks in population G6PD histograms. The laboratory-specific adjusted male median was 12.7 U/g Hb and was similar to peak values derived from alternative curve-fitting approaches. Applying this median to gender-specific data sets, 5.4% of males and 3.8% of females were found to have <70% of normal activity (<8.9 U/g Hb). Conclusion: This study demonstrates the feasibility of percentage-based G6PD result reporting in adults; further studies will query percentage-based reporting in pediatric populations. Population-based medians will vary based on G6PD assay type and temperature and should be established by laboratories prior to percentage-based reporting.
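
A small sketch of the percentage-based conversion described above, using the abstract's adjusted male median of 12.7 U/g Hb (laboratory-specific values will differ); the function name and threshold handling are illustrative:

```python
ADJUSTED_MALE_MEDIAN = 12.7   # U/g Hb, defines 100% G6PD activity (lab-specific)

def percent_of_normal(activity_u_per_g_hb, median=ADJUSTED_MALE_MEDIAN):
    """Convert a quantitative G6PD result to percent of normal activity."""
    return 100.0 * activity_u_per_g_hb / median

# a result below 70% of normal (here < 8.9 U/g Hb) would flag
# tafenoquine ineligibility under the criterion described above
for result in (14.2, 8.9, 4.1):
    pct = percent_of_normal(result)
    print(f"{result:5.1f} U/g Hb -> {pct:5.1f}% of normal, "
          f"{'<70%: ineligible' if pct < 70 else '>=70%'}")
```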


2019 · Vol 8 (3) · pp. 108-122
Author(s): Halima Salah, Mohamed Nemissi, Hamid Seridi, Herman Akdag

Setting up a compact and accurate rule base is the principal objective in designing fuzzy rule-based classifiers. To this end, the authors propose a design scheme based on the combination of subtractive clustering (SC) and particle swarm optimization (PSO). The main idea relies on applying SC to each class separately, with a different radius per class, in order to generate more accurate regions and to represent each region by a fuzzy rule. The number of rules, however, depends on the radii, which are the main preset parameters of SC. PSO is therefore used to find the optimal radii. To obtain a good accuracy-compactness compromise, the authors propose using a multi-objective function for the PSO. The performance of the proposed method is tested on well-known data sets and compared with several state-of-the-art methods.
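
A minimal sketch of the optimization loop, assuming a weighted-sum stand-in for the multi-objective fitness and a nearest-center classifier in place of the full fuzzy rule base; the subtractive clustering is a compact version of Chiu's potential method, and all parameter values are illustrative:

```python
import numpy as np

def subtractive_clustering(X, radius):
    """Minimal potential-based center selection (after Chiu, 1994)."""
    alpha = 4.0 / radius**2
    beta = 4.0 / (1.5 * radius)**2
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    P = np.exp(-alpha * d2).sum(1)
    centers, p_first = [], P.max()
    while True:
        i = int(np.argmax(P))
        if P[i] < 0.15 * p_first or len(centers) == len(X):
            break
        centers.append(i)
        P = P - P[i] * np.exp(-beta * d2[i])     # suppress nearby potential
    return X[centers]

def fitness(radii, X, y, w_err=1.0, w_rules=0.01):
    """Weighted-sum surrogate for the accuracy-compactness trade-off."""
    centers, labels = [], []
    for c, r in zip(np.unique(y), radii):        # one radius per class
        cc = subtractive_clustering(X[y == c], float(np.clip(r, 0.1, 2.0)))
        centers.append(cc); labels += [c] * len(cc)
    C, L = np.vstack(centers), np.array(labels)
    pred = L[((X[:, None, :] - C[None, :, :]) ** 2).sum(-1).argmin(1)]
    return w_err * np.mean(pred != y) + w_rules * len(C)

# bare-bones PSO over the per-class radii
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 2)); y = (X[:, 0] > 0).astype(int)
n_cls, pop = 2, 12
pos = rng.uniform(0.1, 2.0, (pop, n_cls)); vel = np.zeros_like(pos)
pbest, pcost = pos.copy(), np.array([fitness(p, X, y) for p in pos])
for _ in range(30):
    g = pbest[np.argmin(pcost)]
    vel = (0.7 * vel + 1.5 * rng.random(pos.shape) * (pbest - pos)
                     + 1.5 * rng.random(pos.shape) * (g - pos))
    pos = np.clip(pos + vel, 0.1, 2.0)
    cost = np.array([fitness(p, X, y) for p in pos])
    improved = cost < pcost
    pbest[improved], pcost[improved] = pos[improved], cost[improved]
print("best radii:", pbest[np.argmin(pcost)])
```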


Entropy · 2019 · Vol 21 (11) · pp. 1051
Author(s): Jerzy W. Grzymala-Busse, Zdzislaw S. Hippe, Teresa Mroczek

Results of experiments on numerical data sets discretized using two methods, global versions of Equal Frequency per Interval and Equal Interval Width, are presented. Globalization of both methods is based on entropy. For the discretized data sets, left and right reducts were computed. For each discretized data set, and for the two data sets based, respectively, on left and right reducts, we applied ten-fold cross-validation using the C4.5 decision tree generation system. Our main objective was to compare the quality of all three types of data sets in terms of error rate. Additionally, we compared the complexity of the generated decision trees. We show that reduction of data sets may only increase the error rate and that decision trees generated from reduced data sets are not simpler than decision trees generated from non-reduced data sets.
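
For reference, a minimal sketch of the two basic discretization schemes named above; the paper's versions are global and entropy-driven, whereas this shows only the plain per-attribute forms, with illustrative bin counts:

```python
import numpy as np

def equal_interval_width(x, k):
    """Split the range of x into k intervals of equal width."""
    edges = np.linspace(x.min(), x.max(), k + 1)
    return np.clip(np.digitize(x, edges[1:-1]), 0, k - 1)

def equal_frequency(x, k):
    """Split x into k intervals holding (roughly) equal numbers of points."""
    edges = np.quantile(x, np.linspace(0, 1, k + 1))
    return np.clip(np.digitize(x, edges[1:-1]), 0, k - 1)

x = np.random.default_rng(1).exponential(size=1000)
for name, d in [("width", equal_interval_width(x, 4)),
                ("frequency", equal_frequency(x, 4))]:
    print(name, np.bincount(d, minlength=4))   # interval occupancy counts
```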


2016 · Vol 25 (2) · pp. 263-282
Author(s): Renu Bala, Saroj Ratnoo

Fuzzy rule-based systems (FRBSs) are proficient in dealing with cognitive uncertainties like vagueness and ambiguity, which are imperative to real-world decision-making situations. Fuzzy classification rules (FCRs) based on fuzzy logic provide a framework for flexible, human-like reasoning involving linguistic variables. Appropriate membership functions (MFs) and a suitable number of linguistic terms, chosen according to the actual distribution of the data, strengthen the knowledge base (rule base [RB] + data base [DB]) of FRBSs. An RB is expected to be accurate and interpretable, and a DB must contain appropriate fuzzy constructs (type of MFs, number of linguistic terms, and positioning of the parameters of MFs) for the success of any FRBS. Moreover, it would be fascinating to know how a system behaves in rare or exceptional circumstances, and what action ought to be taken in situations where generalized rules cease to work. In this article, we propose a three-phase approach for the discovery of FCRs augmented with intra- and inter-class exceptions. In the first phase, a pre-processing algorithm tunes the DB in terms of the MFs and the number of linguistic terms for each attribute of a data set. The second phase discovers FCRs employing a genetic algorithm approach. Subsequently, intra- and inter-class exceptions are incorporated into the rules in the third phase. The proposed approach is illustrated on an example data set and further validated on six UCI machine learning repository data sets. The results show that the approach is able to discover more accurate, interpretable, and interesting rules. Rules with intra-class exceptions tell us about the unique objects of a category, and rules with inter-class exceptions enable us to take the right decision in exceptional circumstances.
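
To make the DB side concrete, here is a small sketch of one common fuzzy construct mentioned above: a uniform partition of an attribute into k triangular MFs. The paper tunes MF types and term counts to the data distribution; the uniform placement here is only illustrative:

```python
import numpy as np

def triangular_partition(lo, hi, k):
    """Return a function giving membership in k triangular MFs over [lo, hi]."""
    peaks = np.linspace(lo, hi, k)             # requires k >= 2
    half = (hi - lo) / (k - 1)                 # half-width of each triangle

    def membership(x):
        # degree of x in each linguistic term, shape (k,)
        return np.clip(1.0 - np.abs(x - peaks) / half, 0.0, 1.0)
    return membership

# three linguistic terms ("low", "medium", "high") on a [0, 10] attribute
mu = triangular_partition(0.0, 10.0, 3)
print(mu(2.5))   # partial membership in "low" and "medium": [0.5, 0.5, 0.0]
```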


The demand for data mining in the medical industry is now unavoidable due to its many applications, including predicting diseases at an early stage. Data mining methods make it easy to extract useful patterns and to recognize task-based outcomes quickly. Classification models in data mining are useful for building classes over medical data sets for accurate future analysis. In addition, association rules are a promising technique for finding hidden patterns in a medical data set, and they have been successfully applied to market basket data, census data, and financial data. The Apriori algorithm, considered a classic algorithm, is useful for mining frequent item sets in a database containing a large number of transactions and for deriving the relevant association rules. Association rules capture relationships among the items present in a data set; however, when the data set contains continuous attributes, the existing algorithms may not work, so discretization can be applied first in order to find relations between the various patterns in the data set. In this paper, we use a discretized Apriori approach to predict by-diseases in people diagnosed with diabetic syndrome, and we analyze the extracted rules. In the discretization step, numerical data are discretized and then fed to the Apriori algorithm to obtain better association rules for predicting the diseases.
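
A minimal sketch of the discretize-then-mine pipeline, using pandas for binning and the mlxtend implementation of Apriori (assuming mlxtend is available; the column names, bin edges, and thresholds are illustrative, not from the paper):

```python
import pandas as pd
from mlxtend.frequent_patterns import apriori, association_rules

# toy medical records with a continuous attribute (hypothetical columns)
df = pd.DataFrame({
    "glucose": [95, 160, 180, 105, 210, 150, 99, 175],
    "neuropathy": [0, 1, 1, 0, 1, 0, 0, 1],
})

# discretization: bin the continuous attribute into labelled intervals
df["glucose_level"] = pd.cut(df["glucose"], bins=[0, 140, 200, 400],
                             labels=["normal", "high", "very_high"])

# one-hot encode into the boolean item matrix Apriori expects
items = pd.get_dummies(df[["glucose_level"]]).astype(bool)
items["neuropathy"] = df["neuropathy"].astype(bool)

frequent = apriori(items, min_support=0.2, use_colnames=True)
# note: some mlxtend versions also require a num_itemsets argument here
rules = association_rules(frequent, metric="confidence", min_threshold=0.7)
print(rules[["antecedents", "consequents", "support", "confidence"]])
```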


2009 · Vol 28 (2) · pp. 305-324
Author(s): Mark J. Nigrini, Steven J. Miller

SUMMARY: Auditors are required to use analytical procedures to identify the existence of unusual transactions, events, and trends. Benford's Law gives the expected patterns of the digits in numerical data, and has been advocated as a test for the authenticity and reliability of transaction level accounting data. This paper describes a new second-order test that calculates the digit frequencies of the differences between the ordered (ranked) values in a data set. These digit frequencies approximate the frequencies of Benford's Law for most data sets. The second-order test is applied to four sets of transactional data. The second-order test detected errors in data downloads, rounded data, data generated by statistical procedures, and the inaccurate ordering of data. The test can be applied to any data set and nonconformity usually signals an unusual issue related to data integrity that might not have been easily detectable using traditional analytical procedures.
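
A minimal sketch of the second-order test as described: sort the data, take the differences between adjacent ordered values, and compare the first-digit frequencies of those differences with Benford's expected proportions (first digits only here; the paper also examines first-two-digit frequencies):

```python
import numpy as np

def first_digit(x):
    """Leading decimal digit of each positive value in x."""
    mag = np.floor(np.log10(x))
    return (x // 10.0 ** mag).astype(int)

def second_order_test(data):
    """First-digit frequencies of the differences between ordered values."""
    d = np.diff(np.sort(np.asarray(data, dtype=float)))
    d = d[d > 0]                                   # ignore ties
    digits = first_digit(d)
    observed = np.bincount(digits, minlength=10)[1:10] / len(digits)
    expected = np.log10(1 + 1 / np.arange(1, 10))  # Benford proportions
    return observed, expected

# toy usage on simulated transaction amounts
amounts = np.random.default_rng(0).lognormal(mean=6, sigma=1, size=10000)
obs, exp = second_order_test(amounts)
for d, (o, e) in enumerate(zip(obs, exp), start=1):
    print(f"digit {d}: observed {o:.3f}  expected {e:.3f}")
```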

