An Efficient Probability Estimation Decision Tree Postprocessing Method for Mining Optimal Profitable Knowledge for Enterprises with Multi-Class Customers

2019 ◽  
Vol 22 (64) ◽  
pp. 63-84
Author(s):  
JanapatyI Naga Muneiah ◽  
Ch D V SubbaRao

Enterprises often classify their customers by degree of profitability, in decreasing order, as C1, C2, ..., Cn. Generally, customers in class Cn yield zero profit because they migrate to a competitor. They are called attritors (or churners) and are the prime reason for the huge losses of enterprises. Customers in the intermediate classes are reluctant, yield small profits of varying degree, and introduce uncertainty. Data mining models such as decision trees, which are built from customer profiles, are limited to classifying customers as attritors or non-attritors and do not provide profitable, actionable knowledge. In this paper, we present an efficient algorithm for the automatic extraction of profit-maximizing knowledge for business applications with multi-class customers by postprocessing the probability estimation decision tree (PET). When the PET predicts that a customer belongs to any of the less profitable classes, our algorithm suggests cost-sensitive actions to move her/him to the most profitable status possible. In the proposed approach, the PET is represented in compressed form as a bit-pattern matrix, and the postprocessing task is performed on the bit patterns by applying bitwise AND operations. The computational performance of the proposed method is strong owing to the use of effective data structures. Substantial experiments conducted on UCI datasets, real mobile phone service data, and other benchmark datasets demonstrate that the proposed method remarkably outperforms the state-of-the-art methods.

Author(s):  
Antonio de Falco ◽  
Zoltan Dezso ◽  
Francesco Ceccarelli ◽  
Luigi Cerulo ◽  
Angelo Ciaramella ◽  
...  

Motivation: The cost of drug development has dramatically increased in the last decades, with the number of new drugs approved per billion US dollars spent on R&D halving every year or less. The selection and prioritization of targets is one of the most influential decisions in drug discovery. Here we present a Gaussian Process model for the prioritization of drug targets, cast as a problem of learning with only positive and unlabeled examples. Results: Since the absence of negative samples does not allow standard methods for automatic selection of hyperparameters, we propose a novel approach for hyperparameter selection of the kernel in One Class Gaussian Processes. We compare our method with state-of-the-art approaches on benchmark datasets and then show its application to druggability prediction of oncology drugs. Our score reaches an AUC of 0.90 on a set of clinical trial targets starting from a small training set of 102 validated oncology targets. Our score recovers the majority of known drug targets and can be used to identify a novel set of proteins as drug target candidates. Availability: Source code implemented in Python is freely available for download at https://github.com/AntonioDeFalco/Adaptive-OCGP. Supplementary information: Supplementary data are available at Bioinformatics online.


Author(s):  
R. H. W. Brook

When a serious failure situation has developed, an expensive crash programme is usually required. If in-service data are analysed as a routine, then impending trouble may be foreseen and management decisions made to minimize the cost. A reliability analysis can help to establish a failure pattern compatible with intuitive engineering assessment so that, from a realistic prediction, alternative courses of action can be considered. A recent gas-turbine engine problem which has caused six component failures is analysed, and alternative replacement strategies are considered. It is suggested that to adopt the intuitive compromise strategy could be the most expensive in this case.


2010 ◽  
Vol 40-41 ◽  
pp. 156-161 ◽  
Author(s):  
Yang Li ◽  
Yan Qiang Li ◽  
Zhi Xue Wang

With the rapid development of automotive ECUs (Electronic Control Units), fault diagnosis has become increasingly complicated, and the link between fault and symptom has become less obvious. To improve maintenance quality and efficiency, the paper proposes a fault diagnosis approach based on data mining technologies. We first extract fault symptom vectors by processing the data stream, then build a diagnosis decision tree with the ID3 decision tree algorithm, and finally store the rules linking faults to their related symptoms in a historical fault database as a foundation for fault diagnosis. The database also provides the basis for trend judgments about future faults. To verify this approach, an example of diagnosing faults of an entertainment ECU is shown. The test result demonstrates the reliability and validity of this diagnostic method, which reduces the cost of ECU diagnosis.
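
ID3 chooses the attribute to split on by information gain, that is, the reduction in label entropy after partitioning on that attribute. As a minimal sketch of that selection step, the snippet below uses a made-up set of symptom vectors; the symptom names (`rpm_drop`, `voltage_low`) and fault labels are hypothetical and do not come from the paper.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def information_gain(rows, labels, attribute):
    """Entropy reduction obtained by partitioning the rows on one attribute."""
    total = len(labels)
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attribute], []).append(label)
    remainder = sum(len(p) / total * entropy(p) for p in partitions.values())
    return entropy(labels) - remainder

def best_attribute(rows, labels, attributes):
    """ID3 split criterion: pick the attribute with the highest gain."""
    return max(attributes, key=lambda a: information_gain(rows, labels, a))

# Hypothetical symptom vectors and the fault observed for each one.
rows = [
    {"rpm_drop": "yes", "voltage_low": "yes"},
    {"rpm_drop": "yes", "voltage_low": "no"},
    {"rpm_drop": "no", "voltage_low": "yes"},
    {"rpm_drop": "no", "voltage_low": "no"},
]
labels = ["sensor", "sensor", "wiring", "wiring"]

print(best_attribute(rows, labels, ["rpm_drop", "voltage_low"]))
```

In this toy data `rpm_drop` separates the two faults perfectly while `voltage_low` carries no information, so the former is selected as the root test. A full ID3 implementation would recurse on each partition until the labels are pure or no attributes remain.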


2015 ◽  
Vol 24 (02) ◽  
pp. 1540010 ◽  
Author(s):  
Patrick Arnold ◽  
Erhard Rahm

We introduce a novel approach to extract semantic relations (e.g., is-a and part-of relations) from Wikipedia articles. These relations are used to build up a large and up-to-date thesaurus providing background knowledge for tasks such as determining semantic ontology mappings. Our automatic approach uses a comprehensive set of semantic patterns, finite state machines and NLP techniques to extract millions of relations between concepts. An evaluation for different domains shows the high quality and effectiveness of the proposed approach. We also illustrate the value of the newly found relations for improving existing ontology mappings.


2021 ◽  
Author(s):  
İsmail Can Dikmen ◽  
Teoman Karadağ

Today, the storage of electrical energy is one of the most important technical challenges. The increasing number of high-capacity, high-power applications, especially electric vehicles and grid energy storage, means that a large number of batteries will need to be recycled and separated in the near future. This study discusses an alternative to the methods currently used for separating these batteries according to their chemistry. The method can be applied even on integrated circuits because of its ease of implementation and low operational cost. It can therefore also be used in multi-chemistry battery management systems to detect the chemistry of the connected battery. To implement the method, the batteries are connected to two different loads alternately. In this way, current and voltage values are measured for two different loads without allowing the battery to relax. The obtained data are pre-processed with a separation function developed on the basis of statistical significance. Two machine learning algorithms, an artificial neural network and a decision tree, are trained with the processed data and used to determine battery chemistry with 100% accuracy. The efficiency and ease of implementation of the decision tree algorithm in such a categorization method are presented comparatively.


Author(s):  
Denghong Xiao ◽  
Tian He ◽  
Xiandong Liu ◽  
Yingchun Shan

A novel approach for locating damage in welded joints is proposed based on acoustic emission (AE) beamforming, which is particularly applicable to complex plate-like structures. First, five AE sensors used to obtain the AE signals generated by damage are distributed on the surface of the structure in a uniform line array. Then the beamforming method is applied to the weld joints in the area of interest, rather than to all points of the whole structure, to determine the location of the AE sources and obtain information about them. To study the capability of the proposed method more comprehensively, a rectangular steel tube with welded joints is used for the pencil-lead break test. The localization results indicate that the proposed approach can effectively localize the failed welded joints. This improvement greatly reduces the cost of computation and also improves the efficiency of localization work compared with traditional beamforming.
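
The idea of scanning only the weld joints can be illustrated with a conventional delay-and-sum beamformer. The sketch below is not the authors' implementation: the wave speed, sampling rate, sensor spacing, candidate joint coordinates, and the idealized one-sample impulse signals are all assumptions chosen to keep the example small.

```python
import math

def arrival_samples(point, sensors, speed, fs):
    """Propagation time from `point` to each sensor, in whole samples."""
    return [round(math.dist(point, s) / speed * fs) for s in sensors]

def delay_and_sum_energy(signals, sensors, point, speed, fs):
    """Align the channels for a candidate point, average them, return energy."""
    delays = arrival_samples(point, sensors, speed, fs)
    ref = min(delays)                      # only relative delays matter
    n = len(signals[0])
    energy = 0.0
    for t in range(n):
        total = 0.0
        for sig, d in zip(signals, delays):
            idx = t + d - ref              # advance each channel by its extra delay
            if idx < n:
                total += sig[idx]
        energy += (total / len(signals)) ** 2
    return energy

# Assumed setup: 5 sensors in a uniform line array, 0.1 m apart.
SPEED = 5000.0       # assumed wave speed, m/s
FS = 1_000_000       # assumed sampling rate, Hz
sensors = [(0.1 * k, 0.0) for k in range(5)]
weld_joints = [(0.10, 0.20), (0.30, 0.20), (0.20, 0.30)]   # candidate points only
true_source = weld_joints[1]

# Simulate an idealized AE event: a unit impulse reaching each sensor
# after its propagation delay (plus an arbitrary emission offset).
signals = []
for d in arrival_samples(true_source, sensors, SPEED, FS):
    sig = [0.0] * 200
    sig[10 + d] = 1.0
    signals.append(sig)

located = max(
    weld_joints,
    key=lambda p: delay_and_sum_energy(signals, sensors, p, SPEED, FS),
)
print(located)
```

The impulses add coherently only when the candidate matches the true source, so that joint has the highest output energy. Evaluating three candidate joints instead of a dense grid over the whole plate is where the computational saving described in the abstract would come from.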


2021 ◽  
Vol 1 (1) ◽  
pp. 32-50
Author(s):  
Nan Wang ◽  
Sid Chi-Kin Chau ◽  
Yue Zhou

Energy storage provides an effective way of shifting temporal energy demands and supplies, which enables significant cost reduction under time-of-use energy pricing plans. Despite its promising benefits, the cost of present energy storage remains high, presenting a major obstacle to practical deployment. A more viable way to improve cost-effectiveness is to share energy storage, for example through community sharing, cloud energy storage, and peer-to-peer sharing. However, revealing private energy demand data to an external energy storage operator may compromise user privacy and is susceptible to data misuse and breaches. In this paper, we explore a novel approach to support energy storage sharing with privacy protection, based on privacy-preserving blockchain and secure multi-party computation. We present an integrated solution to enable privacy-preserving energy storage sharing, such that energy storage service scheduling and cost-sharing can be attained without knowledge of individual users' demands. It also supports auditing and verification by the grid operator via blockchain. Furthermore, our privacy-preserving solution can safeguard against a majority of dishonest users, who may collude in cheating, without requiring a trusted third party. We implemented our solution as a smart contract on a real-world Ethereum blockchain platform and provide an empirical evaluation in this paper.
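
The abstract does not specify the multi-party protocol, so the following is only a sketch of one generic building block, additive secret sharing, showing how an aggregate demand can be computed without any party seeing an individual value. The modulus, the number of parties, and the demand figures are assumptions made for illustration.

```python
import secrets

PRIME = 2**61 - 1  # modulus for the shares; an assumption for this sketch

def share(value, n_parties):
    """Split `value` into n random shares that sum to it modulo PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % PRIME)
    return shares

def reconstruct(shares):
    """Recombine shares (or partial sums of shares) into the secret."""
    return sum(shares) % PRIME

# Hypothetical private energy demands of three users (e.g. in kWh).
demands = [5, 12, 7]
n = len(demands)

# Each user splits its demand and sends one share to every party.
all_shares = [share(d, n) for d in demands]

# Party j only sees the j-th share of each user and adds them locally.
partial_sums = [
    sum(user_shares[j] for user_shares in all_shares) % PRIME for j in range(n)
]

# Combining the partial sums reveals the total demand and nothing else.
total = reconstruct(partial_sums)
print(total)
```

Any subset of fewer than all the shares of a value is uniformly random, so individual demands stay hidden. This basic scheme assumes that parties follow the protocol; protection against users who cheat, which the paper claims, needs additional machinery that is not shown here.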


Author(s):  
C. Platias ◽  
M. Vakalopoulou ◽  
K. Karantzalos

In this paper we propose a deformable registration framework for high-resolution satellite video data that can automatically and accurately co-register satellite video frames and/or register them to a reference map/image. The proposed approach performs non-rigid registration and formulates a Markov Random Fields (MRF) model, while efficient linear programming is employed to reach the lowest potential of the cost function. The developed approach has been applied and validated on satellite video sequences from Skybox Imaging and compared with a rigid, descriptor-based registration method. Regarding computational performance, both the MRF-based and the descriptor-based methods were quite efficient, with the first converging in some minutes and the second in some seconds. Regarding registration accuracy, the proposed MRF-based method significantly outperformed the descriptor-based one in all the experiments performed.


Sensors ◽  
2021 ◽  
Vol 21 (23) ◽  
pp. 8010
Author(s):  
Ismail Butun ◽  
Yusuf Tuncel ◽  
Kasim Oztoprak

This paper investigates and proposes a solution for Protocol Independent Switch Architecture (PISA) to process application layer data, enabling the inspection of application content. PISA is a novel approach in networking where the switch does not run any embedded binary code but rather interpreted code written in a domain-specific language. The main motivation behind this approach is that telecommunication operators do not want to be locked in by a vendor for any type of networking equipment and instead want to develop their own networking code in a hardware environment that is not governed by a single equipment manufacturer. This approach also eases the modeling of equipment in a simulation environment, as all of the components of a hardware switch run the same compatible code in a software-modeled switch. The novel techniques in this paper exploit the main functions of a programmable switch and combine them with a streaming data processor to lower costs and govern the network in a comprehensive manner from a telecommunication operator's perspective. The results indicate that the proposed solution using PISA switches enables application visibility with outstanding performance. This ability helps operators close a fundamental gap between flexibility and scalability by making the best use of limited compute resources in identifying applications and responding to them. The experimental study indicates that, without any optimization, the proposed solution increases the performance of application identification systems by 5.5 to 47.0 times. This study suggests that DPI (Deep Packet Inspection), NGFW (Next-Generation Firewall), and similar application layer systems, which have quite high costs per unit of traffic volume and cannot scale to the Tbps level, can be combined with PISA to overcome the cost and scalability issues.

