An Efficient Probability Estimation Decision Tree Postprocessing Method for Mining Optimal Profitable Knowledge for Enterprises with Multi-Class Customers

Enterprises often classify their customers by degree of profitability, in decreasing order, as C1, C2, ..., Cn. Customers in class Cn generally yield zero profit because they migrate to a competitor. They are called attritors (or churners) and are a prime cause of large losses for enterprises. Customers in the intermediate classes, meanwhile, are reluctant, contribute only small profits to varying degrees, and introduce uncertainty. Data mining models such as decision trees, built from customers' profiles, are limited to classifying customers as attritors or non-attritors and do not provide profitable, actionable knowledge. In this paper, we present an efficient algorithm for the automatic extraction of profit-maximizing knowledge for business applications with multi-class customers by postprocessing the probability estimation decision tree (PET). When the PET predicts that a customer belongs to one of the less profitable classes, our algorithm suggests cost-sensitive actions to move her/him to the most profitable status attainable. In the proposed approach, the PET is represented in compressed form as a bit-pattern matrix, and the postprocessing is performed on the bit patterns using bitwise AND operations. The computational performance of the proposed method is strong owing to the use of effective data structures. Extensive experiments on UCI datasets, real mobile phone service data and other benchmark datasets demonstrate that the proposed method substantially outperforms state-of-the-art methods.
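
The bit-pattern idea in this abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the attribute tests, leaf names, profits and action costs are hypothetical, and a single expected profit per leaf stands in for the PET's class probabilities. It is not the paper's actual matrix layout or algorithm. Each leaf is encoded as a bitmask of the tests on its path, and a customer reaches a leaf when the bitwise AND of the two masks reproduces the leaf's pattern.

```python
# Hypothetical attribute-value tests; each one is assigned its own bit.
TESTS = ["plan=basic", "plan=premium", "support=low", "support=high"]
BIT = {name: 1 << i for i, name in enumerate(TESTS)}

def mask(*names):
    """Build a bit pattern from the named tests."""
    m = 0
    for n in names:
        m |= BIT[n]
    return m

# Hypothetical leaves: (bit pattern of the path, expected profit at the leaf).
LEAVES = {
    "leaf_A": (mask("plan=basic", "support=low"), 0.0),
    "leaf_B": (mask("plan=basic", "support=high"), 40.0),
    "leaf_C": (mask("plan=premium"), 100.0),
}

# Hypothetical cost of making each test true for a customer.
ACTION_COST = {"plan=basic": 0.0, "plan=premium": 70.0,
               "support=low": 0.0, "support=high": 20.0}

def find_leaf(customer):
    """Return the first leaf whose pattern is fully contained in the customer's bits."""
    for name, (pattern, _) in LEAVES.items():
        if pattern & customer == pattern:
            return name
    return None

def required_changes(customer, target_leaf):
    """Tests on the target path that the customer does not yet satisfy."""
    pattern, _ = LEAVES[target_leaf]
    missing = pattern & ~customer
    return [t for t in TESTS if BIT[t] & missing]

def best_target(customer):
    """Pick the leaf with the largest profit gain net of action costs."""
    current = find_leaf(customer)
    base = LEAVES[current][1]
    best = (current, 0.0)
    for name, (_, profit) in LEAVES.items():
        cost = sum(ACTION_COST[t] for t in required_changes(customer, name))
        gain = profit - base - cost
        if gain > best[1]:
            best = (name, gain)
    return best

customer = mask("plan=basic", "support=low")
print(find_leaf(customer), required_changes(customer, "leaf_C"), best_target(customer))
```

Because every path test is a single bit, both the leaf lookup and the computation of the changes needed to reach a target leaf reduce to one AND per leaf, which is the kind of saving a compressed bit representation is meant to offer.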

Adaptive One-Class Gaussian processes allow accurate prioritization of oncology drug targets

Bioinformatics ◽

10.1093/bioinformatics/btaa968 ◽

2020 ◽

Author(s):

Antonio de Falco ◽

Zoltan Dezso ◽

Francesco Ceccarelli ◽

Luigi Cerulo ◽

Angelo Ciaramella ◽

...

Keyword(s):

Gaussian Processes ◽

Drug Targets ◽

Process Model ◽

New Drugs ◽

Supplementary Information ◽

Novel Approach ◽

Benchmark Datasets ◽

Hyperparameter Selection ◽

The Cost ◽

Selection Of

Abstract. Motivation: The cost of drug development has dramatically increased in the last decades, with the number of new drugs approved per billion US dollars spent on R&D halving every year or less. The selection and prioritization of targets is one of the most influential decisions in drug discovery. Here we present a Gaussian Process model for the prioritization of drug targets, cast as a problem of learning with only positive and unlabeled examples. Results: Since the absence of negative samples does not allow standard methods for automatic selection of hyperparameters, we propose a novel approach for hyperparameter selection of the kernel in One-Class Gaussian Processes. We compare our method with state-of-the-art approaches on benchmark datasets and then show its application to druggability prediction of oncology drugs. Our score reaches an AUC of 0.90 on a set of clinical trial targets, starting from a small training set of 102 validated oncology targets. Our score recovers the majority of known drug targets and can be used to identify a novel set of proteins as drug target candidates. Availability: Source code implemented in Python is freely available for download at https://github.com/AntonioDeFalco/Adaptive-OCGP. Supplementary information: Supplementary data are available at Bioinformatics online.
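
As a rough illustration of scoring with positive examples only, the sketch below uses the predictive mean of a zero-mean Gaussian process regression in which every training label is set to 1, one common one-class GP formulation. It is an assumption that this matches the model used in the paper, and the authors' adaptive hyperparameter selection is not reproduced: `length_scale` and `noise` are simply fixed by hand, and the training points are hypothetical.

```python
from math import exp, sqrt

def rbf(a, b, length_scale):
    """Squared-exponential kernel between two points."""
    d2 = sum((x - y) ** 2 for x, y in zip(a, b))
    return exp(-d2 / (2 * length_scale ** 2))

def cholesky(A):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L

def solve_spd(A, b):
    """Solve A x = b via forward and back substitution on the Cholesky factor."""
    L = cholesky(A)
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def fit_one_class_gp(X, length_scale=1.0, noise=0.1):
    """Return a scoring function: the GP predictive mean with all labels equal to 1."""
    n = len(X)
    K = [[rbf(X[i], X[j], length_scale) + (noise if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    alpha = solve_spd(K, [1.0] * n)
    def score(x):
        return sum(a * rbf(x, xi, length_scale) for a, xi in zip(alpha, X))
    return score

# Hypothetical 2-D feature vectors for known positives.
positives = [(0.0, 0.0), (0.2, 0.1), (-0.1, 0.3)]
score = fit_one_class_gp(positives)
print(score((0.1, 0.1)), score((5.0, 5.0)))
```

Points close to the known positives receive a score near 1 and distant points a score near 0, so unlabeled candidates can be ranked by this value. The ranking depends heavily on the kernel length scale, which is why hyperparameter selection without negative samples is the difficult part.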

718: Modeling the Cost of Management Options for Stage I Nonseminomatous Germ Cell Tumors: A Decision Tree Analysis

The Journal of Urology ◽

10.1016/s0022-5347(18)35950-0 ◽

2005 ◽

Vol 173 (4S) ◽

pp. 195-196

Author(s):

Richard E. Link ◽

Mohamad E. Allaf ◽

Roberto Pili ◽

Louis R. Kavoussi

Keyword(s):

Decision Tree ◽

Germ Cell ◽

Germ Cell Tumors ◽

Stage I ◽

Decision Tree Analysis ◽

Management Options ◽

Tree Analysis ◽

The Cost ◽

Nonseminomatous Germ Cell Tumors

Paper 21: Planning for Failures

Proceedings of the Institution of Mechanical Engineers Conference Proceedings ◽

10.1243/pime_conf_1969_184_053_02 ◽

1969 ◽

Vol 184 (2) ◽

pp. 156-160

Author(s):

R. H. W. Brook

Keyword(s):

Reliability Analysis ◽

Gas Turbine ◽

Gas Turbine Engine ◽

Failure Pattern ◽

Management Decisions ◽

Turbine Engine ◽

The Cost ◽

Service Data ◽

Component Failures ◽

Realistic Prediction

When a serious failure situation has developed, an expensive crash programme is usually required. If in-service data are analysed as a routine, then impending trouble may be foreseen and management decisions made to minimize the cost. A reliability analysis can help to establish a failure pattern compatible with intuitive engineering assessment so that, from a realistic prediction, alternative courses of action can be considered. A recent gas-turbine engine problem which has caused six component failures is analysed, and alternative replacement strategies are considered. It is suggested that to adopt the intuitive compromise strategy could be the most expensive in this case.

Fault Diagnosis of Automobile ECUs with Data Mining Technologies

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.40-41.156 ◽

2010 ◽

Vol 40-41 ◽

pp. 156-161 ◽

Cited By ~ 1

Author(s):

Yang Li ◽

Yan Qiang Li ◽

Zhi Xue Wang

Keyword(s):

Data Mining ◽

Fault Diagnosis ◽

Decision Tree ◽

Data Stream ◽

Electronic Control Unit ◽

Rapid Development ◽

Reliability And Validity ◽

Control Unit ◽

Use Of Data ◽

The Cost

With the rapid development of automotive ECUs (Electronic Control Units), fault diagnosis has become increasingly complicated, and the link between fault and symptom has become less obvious. To improve maintenance quality and efficiency, the paper proposes a fault diagnosis approach based on data mining technologies. Making full use of the data stream, we first extract fault symptom vectors by processing the data stream, then build a diagnosis decision tree with the ID3 decision tree algorithm, and finally store the link rules between faults and their related symptoms in a historical fault database as a foundation for fault diagnosis. The database also provides a basis for judging the trend toward future faults. To verify this approach, an example of diagnosing faults in an entertainment ECU is shown. The test result confirms the reliability and validity of this diagnostic method and shows that it reduces the cost of ECU diagnosis.
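
The attribute-selection step of ID3 mentioned in this abstract can be sketched as follows. The symptom names, fault labels and data are hypothetical and are not taken from the paper; only the entropy and information-gain computation is shown, not the full recursive tree construction.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())

def information_gain(rows, labels, attr):
    """Reduction in label entropy obtained by splitting on attr."""
    total = len(rows)
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attr], []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

def best_attribute(rows, labels, attrs):
    """ID3 splits on the attribute with the highest information gain."""
    return max(attrs, key=lambda a: information_gain(rows, labels, a))

# Hypothetical fault symptom vectors (1 = symptom observed) and fault labels.
rows = [
    {"no_audio": 1, "display_flicker": 0},
    {"no_audio": 1, "display_flicker": 1},
    {"no_audio": 0, "display_flicker": 1},
    {"no_audio": 0, "display_flicker": 0},
]
labels = ["amp_fault", "amp_fault", "ok", "ok"]

print(best_attribute(rows, labels, ["no_audio", "display_flicker"]))
```

In this toy data `no_audio` separates the two classes perfectly, so it has a gain of 1 bit and would become the root test, while `display_flicker` has a gain of 0. A full ID3 implementation repeats this selection recursively on each resulting subset.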

Automatic Extraction of Semantic Relations from Wikipedia

International Journal of Artificial Intelligence Tools ◽

10.1142/s0218213015400102 ◽

2015 ◽

Vol 24 (02) ◽

pp. 1540010 ◽

Cited By ~ 8

Author(s):

Patrick Arnold ◽

Erhard Rahm

Keyword(s):

Finite State Machines ◽

Background Knowledge ◽

Semantic Relations ◽

Automatic Extraction ◽

State Machines ◽

High Quality ◽

Novel Approach ◽

Finite State ◽

Semantic Ontology

We introduce a novel approach to extract semantic relations (e.g., is-a and part-of relations) from Wikipedia articles. These relations are used to build up a large and up-to-date thesaurus providing background knowledge for tasks such as determining semantic ontology mappings. Our automatic approach uses a comprehensive set of semantic patterns, finite state machines and NLP techniques to extract millions of relations between concepts. An evaluation for different domains shows the high quality and effectiveness of the proposed approach. We also illustrate the value of the newly found relations for improving existing ontology mappings.
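
A very small stand-in for pattern-based relation extraction is sketched below. The two regular expressions are hypothetical and far simpler than the comprehensive semantic patterns, finite state machines and NLP techniques the abstract describes. They only show how a definition-style sentence can yield a (source, relation, target) triple.

```python
import re

# Hypothetical patterns; the more specific part-of pattern is tried first.
PATTERNS = [
    ("part-of", re.compile(
        r"^(?:(?:An?|The)\s+)?(?P<src>[\w ]+?) is (?:a )?part of "
        r"(?:an?|the) (?P<dst>[\w ]+?)[.,]")),
    ("is-a", re.compile(
        r"^(?:(?:An?|The)\s+)?(?P<src>[\w ]+?) is "
        r"(?:an?|the) (?P<dst>[\w ]+?)[.,]")),
]

def extract_relation(sentence):
    """Return (source, relation, target) for the first matching pattern, else None."""
    for relation, pattern in PATTERNS:
        m = pattern.match(sentence)
        if m:
            return (m.group("src").strip().lower(),
                    relation,
                    m.group("dst").strip().lower())
    return None

for s in ["A netbook is a small laptop.",
          "The keyboard is part of a computer.",
          "It rained yesterday."]:
    print(extract_relation(s))
```

A realistic extractor would also need to find the head noun of the target phrase, handle lists of several targets, and cope with the wide variety of phrasings in encyclopedia text.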

Novel Approach for Battery Type Determination: A Mere Electrical Alternative

10.21203/rs.3.rs-858317/v1 ◽

2021 ◽

Author(s):

İsmail Can Dikmen ◽

Teoman Karadağ

Keyword(s):

Integrated Circuits ◽

Decision Tree ◽

Statistical Significance ◽

High Capacity ◽

Electrical Energy ◽

Machine Learning Algorithms ◽

Battery Management ◽

Separation Function ◽

Novel Approach ◽

Tree Algorithms

Today, the storage of electrical energy is one of the most important technical challenges. The increasing number of high-capacity, high-power applications, especially electric vehicles and grid energy storage, means that a large number of batteries will need to be recycled and separated in the near future. This study discusses an alternative to the methods currently used for separating these batteries according to their chemistry. The method can be applied even on integrated circuits owing to its ease of implementation and low operational cost. It can therefore also be used in multi-chemistry battery management systems to detect the chemistry of the connected battery. To implement the method, the batteries are connected to two different loads alternately, so that current and voltage values are measured for two different loads without allowing the battery to relax. The obtained data are pre-processed with a separation function developed on the basis of statistical significance. Two machine learning algorithms, an artificial neural network and a decision tree, are trained with the processed data and used to determine battery chemistry with 100% accuracy. The efficiency and ease of implementation of the decision tree algorithm in such a categorization method are presented comparatively.

Localization Approach of Damage in Welded Joint Based on Acoustic Emission Beamforming

Volume 13: Vibration, Acoustics and Wave Propagation ◽

10.1115/imece2014-37658 ◽

2014 ◽

Author(s):

Denghong Xiao ◽

Tian He ◽

Xiandong Liu ◽

Yingchun Shan

Keyword(s):

Acoustic Emission ◽

Welded Joints ◽

Welded Joint ◽

Steel Tube ◽

Area Of Interest ◽

Novel Approach ◽

Line Array ◽

The Cost ◽

Localization Approach ◽

Ae Signals

A novel approach to locating damage in welded joints is proposed based on acoustic emission (AE) beamforming, which is particularly applicable to complex plate-like structures. First, five AE sensors, which capture the AE signals generated by damage, are distributed on the surface of the structure in a uniform line array. Then the beamforming method is applied only to the welded joints in the area of interest, rather than to every point of the whole structure, to determine the location of AE sources and obtain information about them. To study the capability of the proposed method more comprehensively, a rectangular steel tube with welded joints is used for a pencil-lead break test. The localization results indicate that the proposed approach can effectively localize the failed welded joints. Compared with traditional beamforming, this improvement greatly reduces the cost of computation and improves the efficiency of localization.

Privacy-preserving energy storage sharing with blockchain and secure multi-party computation

ACM SIGEnergy Energy Informatics Review ◽

10.1145/3508467.3508471 ◽

2021 ◽

Vol 1 (1) ◽

pp. 32-50

Author(s):

Nan Wang ◽

Sid Chi-Kin Chau ◽

Yue Zhou

Keyword(s):

Energy Storage ◽

Energy Demand ◽

Empirical Evaluation ◽

Privacy Preserving ◽

Third Party ◽

User Privacy ◽

External Energy ◽

Novel Approach ◽

Significant Cost Reduction ◽

The Cost

Energy storage provides an effective way of shifting temporal energy demands and supplies, which enables significant cost reduction under time-of-use energy pricing plans. Despite its promising benefits, present energy storage remains expensive, which is a major obstacle to practical deployment. A more viable way to improve cost-effectiveness is to share energy storage, for example through community sharing, cloud energy storage and peer-to-peer sharing. However, revealing private energy demand data to an external energy storage operator may compromise user privacy and is susceptible to data misuse and breaches. In this paper, we explore a novel approach to support energy storage sharing with privacy protection, based on privacy-preserving blockchain and secure multi-party computation. We present an integrated solution to enable privacy-preserving energy storage sharing, such that energy storage service scheduling and cost-sharing can be attained without knowledge of individual users' demands. It also supports auditing and verification by the grid operator via blockchain. Furthermore, our privacy-preserving solution can safeguard against a majority of dishonest users, who may collude in cheating, without requiring a trusted third party. We implemented our solution as a smart contract on a real-world Ethereum blockchain platform and provide an empirical evaluation in this paper.
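
The core idea of computing an aggregate without revealing individual demands can be sketched with additive secret sharing, a standard building block of secure multi-party computation. This is an assumption made for illustration and not the protocol from the paper: the sketch only covers honest-but-curious parties, whereas the paper targets a dishonest majority, and the demand values and modulus are hypothetical.

```python
import secrets

PRIME = 2**61 - 1  # modulus for the shares; any prime above the largest total works

def share(value, n_parties):
    """Split value into n_parties random shares that sum to value modulo PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % PRIME)
    return shares

def reconstruct(shares):
    """Recombine shares by adding them modulo PRIME."""
    return sum(shares) % PRIME

# Hypothetical private energy demands (kWh) of three users.
demands = [3, 7, 5]
n = len(demands)

# Each user splits their demand and sends one share to every party.
all_shares = [share(d, n) for d in demands]

# Party j adds the shares it received and publishes only that partial sum.
partial_sums = [sum(all_shares[i][j] for i in range(n)) % PRIME for j in range(n)]

# The published partial sums reveal the total demand and nothing else.
total = reconstruct(partial_sums)
print(total)
```

Any single share is uniformly random, so no party learns another user's demand, yet the operator can still schedule storage against the total. Protecting against colluding or cheating users requires additional machinery, such as commitments and zero-knowledge proofs, that this sketch omits.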

AUTOMATIC MRF-BASED REGISTRATION OF HIGH RESOLUTION SATELLITE VIDEO DATA

ISPRS Annals of Photogrammetry Remote Sensing and Spatial Information Sciences ◽

10.5194/isprsannals-iii-1-121-2016 ◽

2016 ◽

Vol III-1 ◽

pp. 121-128 ◽

Cited By ~ 1

Author(s):

C. Platias ◽

M. Vakalopoulou ◽

K. Karantzalos

Keyword(s):

High Resolution ◽

Markov Random Fields ◽

Video Data ◽

Registration Accuracy ◽

Registration Method ◽

Computational Performance ◽

Markov Random ◽

Video Frames ◽

Reference Map ◽

The Cost

In this paper we propose a deformable registration framework for high resolution satellite video data that can automatically and accurately co-register satellite video frames and/or register them to a reference map/image. The proposed approach performs non-rigid registration and formulates a Markov Random Fields (MRF) model, while efficient linear programming is employed to reach the lowest potential of the cost function. The developed approach has been applied and validated on satellite video sequences from Skybox Imaging and compared with a rigid, descriptor-based registration method. Regarding computational performance, both the MRF-based and the descriptor-based methods were quite efficient, with the former converging within minutes and the latter within seconds. Regarding registration accuracy, the proposed MRF-based method significantly outperformed the descriptor-based one in all the experiments performed.

Application Layer Packet Processing Using PISA Switches

Sensors ◽

10.3390/s21238010 ◽

2021 ◽

Vol 21 (23) ◽

pp. 8010

Author(s):

Ismail Butun ◽

Yusuf Tuncel ◽

Kasim Oztoprak

Keyword(s):

Streaming Data ◽

The Novel ◽

Application Layer ◽

Domain Specific ◽

Switch Architecture ◽

Novel Approach ◽

Data Processor ◽

Application Identification ◽

The Cost ◽

Telecommunication Operators

This paper investigates and proposes a solution for Protocol Independent Switch Architecture (PISA) to process application layer data, enabling the inspection of application content. PISA is a novel approach in networking where the switch does not run any embedded binary code but rather interpreted code written in a domain-specific language. The main motivation behind this approach is that telecommunication operators do not want to be locked in by a vendor for any type of networking equipment and want to develop their own networking code in a hardware environment that is not governed by a single equipment manufacturer. This approach also eases the modeling of equipment in a simulation environment, as all of the components of a hardware switch run the same compatible code in a software-modeled switch. The novel techniques in this paper exploit the main functions of a programmable switch and combine them with a streaming data processor to give telecommunication operators the desired effect: lower costs and comprehensive governance of the network. The results indicate that the proposed solution using PISA switches enables application visibility with outstanding performance. This ability helps operators close a fundamental gap between flexibility and scalability by making the best use of limited compute resources in identifying applications and responding to them. The experimental study indicates that, without any optimization, the proposed solution increases the performance of application identification systems by 5.5 to 47.0 times. This study suggests that DPI (deep packet inspection), NGFW (Next-Generation Firewall) and similar application layer systems, which have quite high costs per unit of traffic volume and cannot scale to the Tbps level, can be combined with PISA to overcome the cost and scalability issues.
