Encrypting and Preserving Sensitive Attributes in Customer Churn Data Using Novel Dragonfly Based Pseudonymizer Approach

Information ◽  
2019 ◽  
Vol 10 (9) ◽  
pp. 274 ◽  
Author(s):  
Kalyan Nagaraj ◽  
Sharvani GS ◽  
Amulyashree Sridhar

With diverse information accessible in public repositories, consumer data forms the knowledge base for anticipating client preferences. For instance, subscriber details are examined in the telecommunication sector to assess growth, customer engagement, and upcoming opportunities for service improvement. Among such parameters, the churn rate is essential for tracking migrating consumers. However, predicting churn carries a prevalent risk of exposing subscribers' sensitive information, so it is worth safeguarding sensitive details prior to customer-churn assessment. A dual approach based on dragonfly and pseudonymizer algorithms is adopted to secure the confidentiality of customer data. This twofold approach ensures sensitive attributes are protected prior to churn analysis. The accuracy of this method is investigated by comparing the performance of conventional privacy-preserving models against the current model. Furthermore, churn detection is evaluated before and after data preservation to measure information loss. It was found that the privacy-based feature selection method secured sensitive attributes more effectively than traditional approaches. Moreover, information loss estimated before and after security concealment identified the random forest classifier as the best churn detection model, with an accuracy of 94.3% and a minimal information loss of 0.32%. This approach can likewise be adopted in several domains to shield vulnerable information prior to data modeling.
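A minimal sketch of the pipeline this abstract describes, assuming a tabular churn dataset: sensitive identifiers are pseudonymized with a salted hash before a random forest is trained. The file name, column names, and salt are illustrative assumptions, and the paper's dragonfly-based feature selection is not reproduced here.

```python
import hashlib
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

def pseudonymize(df, sensitive_cols, salt="demo-salt"):
    """Replace sensitive values with truncated salted SHA-256 digests."""
    out = df.copy()
    for col in sensitive_cols:
        out[col] = out[col].astype(str).map(
            lambda v: hashlib.sha256((salt + v).encode()).hexdigest()[:12])
    return out

df = pd.read_csv("churn.csv")                       # hypothetical dataset
df = pseudonymize(df, ["customer_id", "phone_number"])

# Identifiers stay pseudonymized in the released table but are dropped
# from the model's feature matrix.
X = pd.get_dummies(df.drop(columns=["customer_id", "phone_number", "churn"]))
y = df["churn"]
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
print("churn accuracy:", accuracy_score(y_te, clf.predict(X_te)))
```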

2020 ◽  
Vol 2020 ◽  
pp. 1-10
Author(s):  
Fasee Ullah ◽  
Izhar Ullah ◽  
Atif Khan ◽  
M. Irfan Uddin ◽  
Hashem Alyami ◽  
...  

There is a need to develop an effective data preservation scheme with minimal information loss when patients' data are shared in the public interest for different research activities. Prior studies have devised different approaches for data preservation in healthcare domains; however, there is still room for improvement in the design of an elegant data preservation approach. With that motivation, this study proposes a medical healthcare-IoT-based infrastructure with restricted access. The infrastructure comprises two algorithms. The first algorithm protects a patient's sensitive information while quantifying the minimum information loss during the anonymization process. The algorithm also designs access policies, comprising public access, doctor access, and nurse access to a patient's sensitive information, based on the clustering concept. The second algorithm is a K-anonymity privacy preservation scheme based on local coding, which relies on cell suppression. This algorithm uses a mapping method to classify the data into different regions such that data of the same group are placed in the same region. The benefit of using local coding is that it restricts third-party users, such as doctors and nurses, from inserting incorrect values in order to access real patient data. The efficiency of the proposed algorithms is evaluated against a state-of-the-art algorithm through extensive simulations. Simulation results demonstrate the benefits of the proposed algorithms in terms of efficient cluster formation in minimum time, minimum information loss, and low execution time for data dissemination.
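A hedged sketch of k-anonymity by cell suppression, the core of the second algorithm described above: quasi-identifier combinations appearing fewer than k times are suppressed. The column names, the value of k, and the suppression marker are assumptions; the paper's local-coding and clustering machinery is not reproduced.

```python
import pandas as pd

def k_anonymize(df, quasi_ids, k=3):
    """Suppress quasi-identifier cells for groups smaller than k."""
    sizes = df.groupby(quasi_ids)[quasi_ids[0]].transform("size")
    out = df.copy()
    out.loc[sizes < k, quasi_ids] = "*"
    return out

records = pd.DataFrame({
    "age":       ["34", "34", "34", "51", "62"],
    "zip":       ["4000", "4000", "4000", "4100", "4200"],
    "diagnosis": ["flu", "flu", "cold", "asthma", "flu"],
})
# The three (34, 4000) rows form a group of size 3 and survive;
# the two singleton rows have their quasi-identifiers suppressed.
print(k_anonymize(records, ["age", "zip"], k=3))
```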


Author(s):  
Sitara Bibi ◽  
Waseem Ul Hameed ◽  
Bushra Anwer ◽  
Muneeba Saleem ◽  
Shafqat Ali Niaz

Customer engagement (CE) has become a hot topic in today's business world. It has been widely recognized as one of the most important drivers of a business's prosperity. Customer engagement is considered a strong predictor of a firm's growth, as it works as an effective strategy to build and maintain strong relationships between customers and firms. This chapter aims to explore the theoretical foundations of customer engagement and provides clear insight and comprehensive knowledge about CE and the CE models widely used by firms to engage their customers, the contribution of CE to the value creation process, and the barriers apparent in introducing customer engagement analytics, along with their capabilities for dealing with large customer datasets. This chapter is beneficial in that it provides comprehensive knowledge about customer engagement to academics (i.e., marketing students, scholars, and researchers) as well as practitioners in various industries.


2021 ◽  
Author(s):  
Vicente López ◽  
Rebeca Egea ◽  
Lledó Museros ◽  
Ismael Sanz

The business environment today is characterized by high competition and saturated markets, and pay-TV platforms are no exception. As a consequence, the cost of acquiring new customers is much higher than the cost of retaining existing ones, so it is important for pay-TV platforms to keep customer churn under control. This paper therefore studies existing models used to predict customer churn in other contexts, such as telecommunication companies, and adapts them to the pay-TV context. Another major problem addressed in the paper is that the dataset used contains no personal metrics, which are indispensable for solving the problem. The approach therefore defines new metrics in order to be able to predict customer churn.
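A brief sketch of how behavioural metrics might be derived from a viewing log when personal attributes are unavailable, in the spirit of the new metrics the paper defines. The log schema and the specific aggregates are assumptions, not the paper's definitions.

```python
import pandas as pd

views = pd.read_csv("viewing_log.csv", parse_dates=["timestamp"])  # hypothetical log
now = views["timestamp"].max()

# Per-subscriber behavioural features: recency, frequency, and breadth
# of viewing, usable as churn predictors in place of personal metrics.
features = views.groupby("subscriber_id").agg(
    days_since_last_view=("timestamp", lambda t: (now - t.max()).days),
    weekly_sessions=("timestamp",
                     lambda t: len(t) / max(1.0, (t.max() - t.min()).days / 7)),
    distinct_channels=("channel", "nunique"),
).reset_index()
print(features.head())
```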


2021 ◽  
Vol 11 (1) ◽  
pp. 1-35
Author(s):  
Amit Singh ◽  
Abhishek Tiwari

Phishing emerged in 1996 and is now among the biggest cybercrime challenges. Phishing is a way to deceive users over the internet, with the purpose of extracting their sensitive information. Researchers have been working on solutions to the phishing problem, but the parallel evolution of cybercrime techniques has made it a tough nut to crack. Recently, machine learning-based solutions have been widely adopted to tackle the menace of phishing. This survey studies various feature selection and dimensionality reduction methods and examines how they perform with machine learning-based classifiers. The selection of features is vital for developing a well-performing machine learning model. This work compares three broad categories of feature selection methods, namely filter, wrapper, and embedded methods, to reduce the dimensionality of data. The effectiveness of these methods is assessed on several machine learning classifiers using k-fold cross-validation score, accuracy, precision, recall, and time.
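A compact sketch contrasting the three feature-selection families the survey compares, using scikit-learn on a synthetic stand-in dataset; the choice of estimators and of k is illustrative, not taken from the survey.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE, SelectFromModel, SelectKBest, mutual_info_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, n_features=30, n_informative=8, random_state=0)

selectors = {
    # Filter: scores features independently of any classifier.
    "filter (mutual information)": SelectKBest(mutual_info_classif, k=8),
    # Wrapper: searches subsets by repeatedly fitting a classifier.
    "wrapper (RFE)": RFE(LogisticRegression(max_iter=1000), n_features_to_select=8),
    # Embedded: selection falls out of the model's own importances.
    "embedded (tree importances)": SelectFromModel(
        RandomForestClassifier(random_state=0), max_features=8, threshold=-np.inf),
}
for name, selector in selectors.items():
    X_sel = selector.fit_transform(X, y)
    score = cross_val_score(LogisticRegression(max_iter=1000), X_sel, y, cv=5).mean()
    print(f"{name}: 5-fold accuracy = {score:.3f}")
```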


Author(s):  
Kirsten Wahlstrom ◽  
John F. Roddick ◽  
Rick Sarre ◽  
Vladimir Estivill-Castro ◽  
Denise de Vries

To paraphrase Winograd (1992), we bring to our communities a tacit comprehension of right and wrong that makes social responsibility an intrinsic part of our culture. Our ethics are the moral principles we use to assert social responsibility and to perpetuate safe and just societies. Moreover, the introduction of new technologies can have a profound effect on our ethical principles. The emergence of very large databases, and the associated automated data analysis tools, presents yet another set of ethical challenges to consider. Socio-ethical issues have been identified as pertinent to data mining, and there is a growing concern regarding the (ab)use of sensitive information (Clarke, 1999; Clifton et al., 2002; Clifton and Estivill-Castro, 2002; Gehrke, 2002). Estivill-Castro et al. discuss surveys of public opinion on personal privacy that show a raised level of concern about the use of private information (Estivill-Castro et al., 1999). There is some justification for this concern: a 2001 InfoWeek survey found that over 20% of companies store customer medical profiles and/or customer demographics together with salary and credit information, and over 15% store information about customers' legal histories.


2021 ◽  
pp. 1-19
Author(s):  
Nagaraju Pamarthi ◽  
N. Nagamalleswara Rao

The innovative trend in cloud computing is the outsourcing of data to cloud servers by individuals or enterprises. Recently, various techniques have been devised to facilitate privacy protection on untrusted cloud platforms. However, classical privacy-preserving techniques fail to prevent leakage and cause huge information loss. This paper devises a novel methodology, namely the Exponential Ant-lion Rider optimization algorithm based bilinear map coefficient generation (Exponential-AROA based BMCG) method, for privacy preservation in cloud infrastructure. The proposed Exponential-AROA is devised by integrating the exponential weighted moving average (EWMA), Ant Lion optimizer (ALO), and Rider optimization algorithm (ROA). The input data is fed to the privacy preservation process, wherein the data matrix and the bilinear map coefficient generation (BMCG) coefficient are multiplied through a Hilbert space-based tensor product. Here, the bilinear map coefficient is obtained by multiplying the original data matrix with modified elliptical curve cryptography (MECC) encryption to maintain data security. The bilinear map coefficient is used to handle both the utility and the sensitive information; hence, an optimization-driven algorithm is used to evaluate the optimal bilinear map coefficient. The fitness function is newly devised considering both privacy and utility. The proposed Exponential-AROA based BMCG provided superior performance, with a maximal accuracy of 94.024%, a maximal fitness of 1, and a minimal information loss of 5.977%.
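Since the abstract does not give the fitness function's exact form, the following is only a generic sketch of a weighted privacy-utility objective of the kind described; both component measures and the weighting are assumptions, not the paper's formulation.

```python
import numpy as np

def information_loss(original, transformed):
    """Relative Frobenius distance between original and transformed matrices."""
    return np.linalg.norm(original - transformed) / np.linalg.norm(original)

def fitness(original, transformed, alpha=0.5):
    """Assumed weighted privacy-utility objective; higher is better."""
    utility = 1.0 - information_loss(original, transformed)
    deviation = np.mean(np.abs(original - transformed) / (np.abs(original) + 1e-9))
    privacy = min(deviation, 1.0)   # crude proxy: more deviation, more privacy
    return alpha * utility + (1.0 - alpha) * privacy

rng = np.random.default_rng(0)
data = rng.normal(size=(100, 5))
candidate = data + rng.normal(scale=0.1, size=data.shape)  # a perturbed candidate
print("fitness:", round(fitness(data, candidate), 3))
```

An optimizer such as the hybrid the paper proposes would evaluate this score for each candidate coefficient and keep the best.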


2020 ◽  
Vol 65 (9) ◽  
pp. 7-27
Author(s):  
Andrzej Młodak

The paper presents the most important methods of assessing the information loss caused by statistical disclosure control (SDC). The aim of SDC is to protect an individual against identification, and against anyone unauthorised obtaining sensitive information about them. The application of methods based either on the concealment of specific data or on their perturbation results in information loss, which affects the quality of output data, including the distributions of variables, the forms of relationships between them, and any estimations. The aim of this paper is to critically analyse the strengths and weaknesses of the particular types of methods for assessing the information loss resulting from SDC. Moreover, some novel ideas on how to obtain effective and well-interpretable measures are proposed, including an innovative use of a cyclometric function (arcus tangent) to determine the deviation of values from the originals as a result of SDC. Additionally, the inverse correlation matrix is applied in order to assess the influence of SDC on the strength of relationships between variables. The first method yields effective and well-interpretable measures, while the second makes it possible to fully exploit the mutual relationships between variables (including those difficult to detect by means of classical statistical methods) for a better analysis of the consequences of SDC. Among other findings, the empirical verification of the utility of the suggested methods confirmed the superiority of the cyclometric function in measuring the distance between the deviated values and the original data, and also highlighted the need for a skilful correction of its flattening when large-value arguments occur.
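A hedged sketch of the two ideas mentioned above: an arctan-bounded deviation measure and a comparison of inverse correlation matrices before and after SDC. The exact formulas are illustrative reconstructions; the paper's definitions may differ.

```python
import numpy as np

def arctan_deviation(original, perturbed):
    """Pass relative deviations through arctan so each term is bounded in [0, pi/2)."""
    rel = np.abs(perturbed - original) / (np.abs(original) + 1e-9)
    return np.mean(np.arctan(rel))

def partial_structure_shift(original, perturbed):
    """Frobenius distance between inverse correlation matrices, which reflect
    partial (conditional) relationships between variables."""
    p0 = np.linalg.inv(np.corrcoef(original, rowvar=False))
    p1 = np.linalg.inv(np.corrcoef(perturbed, rowvar=False))
    return np.linalg.norm(p0 - p1)

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 4))
X_sdc = X + rng.normal(scale=0.2, size=X.shape)   # stand-in for an SDC perturbation
print("arctan deviation:", round(arctan_deviation(X, X_sdc), 4))
print("inverse-correlation shift:", round(partial_structure_shift(X, X_sdc), 4))
```

The arctan keeps single large deviations from dominating the average, which is also why the paper notes its flattening must be corrected for large arguments.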


2018 ◽  
Vol 7 (2.3) ◽  
pp. 4 ◽  
Author(s):  
Md Maruf Hassan ◽  
Touhid Bhuyian ◽  
M Khaled Sohel ◽  
Md Hasan Sharif ◽  
Saikat Biswas

Communicating and delivering services to consumers through web applications has become very popular due to their user-friendly interfaces, global accessibility, and easy manageability. Careless design and development of web applications are the key reasons for security breaches, which are alarming for users as well as web administrators. Local File Inclusion (LFI) vulnerabilities are currently found in many web applications and can lead to remote code execution on the host server and the disclosure of sensitive information. Detecting LFI vulnerabilities is therefore a critical concern for web owners, who must take effective measures to mitigate the risk. After reviewing the literature, we found little research on the automated detection of LFI vulnerabilities. This paper proposes an automated LFI vulnerability detection model, SAISAN, for web applications and implements it as a tool. 265 web applications from four different sectors were examined, and the tool achieved 88% accuracy compared with manual penetration testing.
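A minimal sketch of the kind of probe an automated LFI scanner might issue: inject a path-traversal payload into a request parameter and look for a known file signature in the response. The URL, parameter name, and payload are placeholders; this is not SAISAN's actual logic, and such probes should only be run against systems one is authorized to test.

```python
import requests

PAYLOAD = "../../../../etc/passwd"
SIGNATURE = "root:x:0:0:"          # first line of a typical /etc/passwd

def probe_lfi(base_url, param="page", timeout=5):
    """Return True if a traversal payload appears to echo file contents."""
    try:
        resp = requests.get(base_url, params={param: PAYLOAD}, timeout=timeout)
        return SIGNATURE in resp.text
    except requests.RequestException:
        return False

if probe_lfi("http://testsite.example/index.php"):
    print("possible LFI: traversal payload echoed file contents")
else:
    print("no LFI indication for this payload")
```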


2019 ◽  
Vol 27 (2) ◽  
pp. 41-54 ◽  

Phishing is a kind of cyber-attack that targets naive online users by tricking them into revealing sensitive information. Many anti-phishing solutions have been proposed to date, such as blacklists or whitelists, heuristic-based methods, and machine learning-based methods. However, online users are still being trapped into revealing sensitive information on phishing websites. In this paper, we propose a novel phishing webpage detection model based on features extracted from the URL, the HTML source code, and third-party services to represent the basic characteristics of phishing webpages, and it uses a deep learning method, the Stacked Autoencoder (SAE), to detect phishing webpages. To bring features to the same order of magnitude, three kinds of normalization methods are adopted. In particular, a method to calculate correlation coefficients between the weight matrices of the SAE is proposed to determine the optimal width of the hidden layers, which shows high computational efficiency and feasibility. Based on testing on a set of phishing and benign webpages, the model using the SAE achieves the best performance compared to other algorithms such as Naive Bayes (NB), Support Vector Machine (SVM), Convolutional Neural Networks (CNN), and Recurrent Neural Networks (RNN). This indicates that the proposed detection model is promising and can be applied effectively to phishing detection.
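A hedged sketch of greedy layer-wise stacked-autoencoder pretraining followed by a classifier head, using tf.keras. The layer widths, feature dimensionality, and training settings are placeholders, and the paper's weight-matrix-correlation method for choosing hidden-layer widths is not reproduced here.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

def pretrain_autoencoder(X, width):
    """Train one autoencoder layer; return its encoder and the encoded data."""
    inp = keras.Input(shape=(X.shape[1],))
    code = layers.Dense(width, activation="relu")(inp)
    out = layers.Dense(X.shape[1], activation="linear")(code)
    ae = keras.Model(inp, out)
    ae.compile(optimizer="adam", loss="mse")
    ae.fit(X, X, epochs=20, batch_size=32, verbose=0)
    encoder = keras.Model(inp, code)
    return encoder, encoder.predict(X, verbose=0)

X = np.random.rand(1000, 30).astype("float32")   # stand-in normalized webpage features
y = np.random.randint(0, 2, 1000)                # stand-in phishing/benign labels

# Greedy pretraining: each autoencoder learns to reconstruct the previous code.
enc1, H1 = pretrain_autoencoder(X, 16)
enc2, H2 = pretrain_autoencoder(H1, 8)

# Stack the pretrained encoders under a sigmoid head and fine-tune end to end.
clf = keras.Sequential([enc1, enc2, layers.Dense(1, activation="sigmoid")])
clf.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
clf.fit(X, y, epochs=10, batch_size=32, verbose=0)
```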


2020 ◽  
Vol 33 (3) ◽  
pp. 627-653 ◽  
Author(s):  
Ali Abdallah Alalwan ◽  
Raed Salah Algharabat ◽  
Abdullah Mohammed Baabdullah ◽  
Nripendra P. Rana ◽  
Zainah Qasem ◽  
...  

Purpose
This study aims to examine the impact of mobile interactivity dimensions (active control, personalization, ubiquitous connectivity, connectedness, responsiveness and synchronicity) on customer engagement.

Design/methodology/approach
A quantitative field survey study was conducted to collect the required data from actual users of mobile shopping in three countries: Jordan, the United Kingdom (UK) and Saudi Arabia.

Findings
The results are based on structural equation modelling and support the impact of five dimensions of mobile interactivity: active control, personalization, ubiquitous connectivity, responsiveness and synchronicity. The impact of connectedness is not supported. The results also support the significant impact of customer engagement on customer loyalty.

Research limitations/implications
This study only considered the shopping activities conducted by mobile channels, while other channels (e.g., online channels, traditional channels and social media shopping channels) were not considered. Furthermore, the current model does not consider the impact of personal factors (e.g., technology readiness, self-efficacy and user experience). The results of the current study present a foundation that can guide marketers and practitioners in the area of mobile shopping.

Originality/value
This study enriches the current understanding of the impact of mobile interactivity on mobile shopping, as well as how mobile interactivity can enhance the level of customer engagement.

