2. Why is big data special?

Author(s):  
Dawn E. Holmes

The rapid growth in computing power and storage has led to progressively more data being collected. Big datasets are certainly large and complex, but in order to fully define ‘big data’ we need first to understand ‘small data’ and its role in statistical analysis. ‘Why is big data special?’ considers the four main characteristics of big data: volume, variety, velocity, and veracity, which present a considerable challenge in data management. The advantages we expect to gain from meeting this challenge and the questions we hope to answer with big data can be understood through data mining. The use of big data mining in credit card fraud detection is discussed.

Bank marketers still struggle to find the best way to implement above-the-line credit card promotions, particularly promotions based on customer preferences at point-of-interest (POI) locations such as malls and shopping centers. Customers at those POIs, in turn, are keen to receive recommendations on what the bank is offering. In this paper we propose the design architecture and implementation of a big data platform that supports a bank's credit card campaign program by collecting data and extracting topics from Twitter. We built a data pipeline consisting of a Twitter streamer, a text preprocessor, a topic extractor using Latent Dirichlet Allocation, and a dashboard that visualizes the recommendations. As a result, we successfully generated topics related to specific locations in Jakarta during given time windows, which bank marketers can use as recommendations when creating promotion programs for their customers. We also present an analysis of computing power usage indicating that the strategy is well implemented on the big data platform.
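The text preprocessing stage of such a pipeline can be sketched as follows. This is a minimal illustration, not the authors' implementation: the stopword list and regular expressions are assumptions, and the cleaned token lists would then be handed to an LDA implementation for topic extraction.

```python
import re

# Hypothetical stopword list; a real deployment would use a fuller list
# matched to the language mix of the Jakarta tweet stream.
STOPWORDS = {"the", "a", "an", "is", "at", "in", "on", "and", "to", "of"}

def preprocess_tweet(text: str) -> list[str]:
    """Clean one raw tweet into a token list ready for topic modeling."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)   # drop URLs
    text = re.sub(r"[@#]\w+", " ", text)        # drop mentions and hashtags
    tokens = re.findall(r"[a-z]+", text)        # keep alphabetic tokens only
    return [t for t in tokens if t not in STOPWORDS and len(t) > 2]

print(preprocess_tweet("Big SALE at @GrandIndonesia mall! https://t.co/x #promo"))
```

Each tweet becomes a bag of content words; a corpus of such bags, grouped by POI and time window, is the input a topic model expects.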


Web Services ◽  
2019 ◽  
pp. 618-638
Author(s):  
Goran Klepac ◽  
Kristi L. Berg

This chapter proposes a new analytical approach that consolidates the traditional analytical approach for solving problems such as churn detection, fraud detection, building predictive models, and segmentation modeling with data sources and analytical techniques from the big data area. The solutions presented offer a structured approach for integrating the different concepts into one, which helps analysts as well as managers use the potential of the different areas in a systematic way. By using this concept, companies have the opportunity to introduce big data potential into everyday data mining projects. As the chapter shows, neglecting big data potential often results in incomplete analytical results, which implies incomplete information for business decisions and can lead to bad business decisions. The chapter also provides suggestions on how to recognize useful data sources from the big data area and how to analyze them along with traditional data sources to achieve higher-quality information for business decisions.


Author(s):  
Roberto Marmo

As a consequence of the expansion of modern technology, the number and variety of fraud scenarios are increasing dramatically. The resulting reputational damage and financial losses are the primary motivations behind the technologies and methodologies for fraud detection that have been applied successfully in several economic activities. Detection involves monitoring the behavior of users based on huge data sets such as logged data and user behavior records. The aim of this contribution is to present data mining techniques for fraud detection and prevention, with applications in credit cards and telecommunications, within a business context of mining data to achieve higher cost savings, and also in the interest of securing potential legal evidence. The problem is very difficult because fraud takes many different forms and fraudsters are adaptive: they will usually look for ways to evade every security measure.
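One elementary form of the behavior monitoring described above can be sketched as a per-user outlier test on transaction amounts. This is an illustrative baseline rather than a technique from the chapter, and the three-standard-deviation threshold is an assumption:

```python
from statistics import mean, stdev

def flag_outliers(history: list[float], new_amounts: list[float],
                  threshold: float = 3.0) -> list[bool]:
    """Flag transactions that deviate strongly from a user's spending history.

    A transaction more than `threshold` standard deviations away from the
    user's historical mean charge is marked for further review.
    """
    mu, sigma = mean(history), stdev(history)
    return [abs(a - mu) / sigma > threshold for a in new_amounts]

history = [20.0, 35.0, 25.0, 30.0, 40.0, 22.0]   # past card charges
print(flag_outliers(history, [28.0, 950.0]))     # ordinary vs. suspicious
```

Real systems layer many such behavioral signals (merchant category, geography, timing) and combine them with learned models, precisely because adaptive fraudsters quickly learn to stay under any single fixed threshold.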


Author(s):  
Arun Thotapalli Sundararaman

The study of data quality for data mining applications has always been a complex topic; in recent years, this topic has gained further complexity with the advent of big data as the source for data mining and business intelligence (BI) applications. In a big data environment, data is consumed in various states and various forms serving as input for data mining, and this is the main source of added complexity. These new complexities and challenges arise from the underlying dimensions of big data (volume, variety, velocity, and value) together with the ability to consume data at various stages of transition from raw data to standardized datasets. They have created a need to expand the traditional data quality (DQ) factors into big data quality (BDQ) factors, as well as a need for new BDQ assessment and measurement frameworks for data mining and BI applications. However, research and industry have made very limited advances on the topic of BDQ and its relevance and criticality for data mining and BI applications. Data quality in data mining refers to the quality of the patterns or results of the models built using mining algorithms. DQ for data mining in business intelligence applications should be aligned with the objectives of the BI application. Objective measures, training/modeling approaches, and subjective measures are the three major approaches that exist to measure DQ for data mining. However, there is no agreement yet on definitions, measurements, or interpretations of DQ for data mining. Defining the factors of DQ for data mining and their measurement for a BI system has been one of the major challenges for researchers as well as practitioners. This chapter provides an overview of existing research in the area of BDQ definitions and measurement for data mining for BI, analyzes the gaps therein, and provides a direction for future research and practice in this area.
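Objective DQ measures of the kind surveyed here can be made concrete with simple completeness and validity scores over a record set. This is a minimal sketch: the field names and domain rule below are illustrative assumptions, not a framework from the chapter.

```python
def completeness(records: list[dict], fields: list[str]) -> float:
    """Fraction of required field slots that hold a non-empty value."""
    total = len(records) * len(fields)
    filled = sum(1 for r in records for f in fields
                 if r.get(f) not in (None, ""))
    return filled / total

def validity(records: list[dict], field: str, predicate) -> float:
    """Fraction of present values in `field` that satisfy a domain rule."""
    values = [r[field] for r in records if r.get(field) not in (None, "")]
    return sum(1 for v in values if predicate(v)) / len(values)

records = [
    {"id": 1, "amount": 25.0, "country": "US"},
    {"id": 2, "amount": -3.0, "country": ""},    # missing country, bad amount
    {"id": 3, "amount": 40.0, "country": "DE"},
]
print(completeness(records, ["id", "amount", "country"]))  # 8 of 9 slots filled
print(validity(records, "amount", lambda a: a > 0))        # 2 of 3 values valid
```

Scores like these are computed per DQ dimension and per stage of the raw-to-standardized pipeline; the open research question the chapter raises is how to extend and weight them for the volume, variety, and velocity of big data sources.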


Complexity ◽  
2018 ◽  
Vol 2018 ◽  
pp. 1-9 ◽  
Author(s):  
Massimiliano Zanin ◽  
Miguel Romance ◽  
Santiago Moral ◽  
Regino Criado

The detection of fraud in credit card transactions is a major topic in financial research, with profound economic implications. While it has hitherto been tackled through data analysis techniques, the resemblances between this and other problems, like the design of recommendation systems and of diagnostic/prognostic medical tools, suggest that a complex network approach may yield important benefits. In this paper we present a first hybrid data mining/complex network classification algorithm, able to detect illegal instances in a real card transaction data set. It is based on a recently proposed network reconstruction algorithm that allows the creation of representations of the deviation of one instance from a reference group. We show how the inclusion of features extracted from the network data representation improves the score obtained by a standard, neural network-based classification algorithm, and additionally how this combined approach can outperform a commercial fraud detection system in specific operational niches. Beyond these specific results, this contribution represents a new example of how complex networks and data mining can be integrated as complementary tools, with the former providing a view of the data beyond the capabilities of the latter.
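The core idea of augmenting a classifier with deviation-from-reference features can be sketched as follows. Note that this is a deliberate simplification: the paper's network reconstruction algorithm builds a network representation per instance, whereas this stand-in uses plain per-feature z-scores against the legal reference group to illustrate how such derived features are appended to the raw input of a standard classifier.

```python
from statistics import mean, stdev

def deviation_features(instance: list[float],
                       reference_group: list[list[float]]) -> list[float]:
    """Per-feature z-scores of one instance against a legal reference group.

    These scores play the role that network-derived features play in the
    hybrid approach: they quantify how far a transaction departs from the
    'normal' group, and are appended to the raw features before training
    or scoring a standard classifier.
    """
    cols = list(zip(*reference_group))  # column-wise view of the group
    return [(x - mean(col)) / stdev(col) for x, col in zip(instance, cols)]

reference = [[10.0, 1.0], [12.0, 2.0], [11.0, 3.0], [9.0, 2.0]]  # legal txns
txn = [50.0, 2.0]                                                # suspect txn
augmented = txn + deviation_features(txn, reference)
print(augmented)  # raw features followed by deviation features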


2016 ◽  
Vol 20 (1) ◽  
pp. 12-28 ◽  
Author(s):  
Son K. Lam ◽  
Stefan Sleep ◽  
Thorsten Hennig-Thurau ◽  
Shrihari Sridhar ◽  
Alok R. Saboo

The advent of new forms of data, modern technology, and advanced data analytics offer service providers both opportunities and risks. This article builds on the phenomenon of big data and offers an integrative conceptual framework that captures not only the benefits but also the costs of big data for managing the frontline employee (FLE)-customer interaction. Along the positive path, the framework explains how the “3Vs” of big data (volume, velocity, and variety) have the potential to improve service quality and reduce service costs by influencing big data value and organizational change at the firm and FLE levels. However, the 3Vs of big data also increase big data veracity, which casts doubt about the value of big data. The authors further propose that because of heterogeneity in big data absorptive capacities at the firm level, the costs of adopting big data in FLE management may outweigh the benefits. Finally, while FLEs can benefit from big data, extracting knowledge from such data does not discount knowledge derived from FLEs’ small data. Rather, combining and integrating the firm’s big data with FLEs’ small data are crucial to absorbing and applying big data knowledge. An agenda for future research concludes.


The use of credit cards for online and everyday purchases is booming, and so is the fraud associated with it. Fraud detection is an area in which cumulative improvements can yield huge benefits for banks and clients alike. Numerous sophisticated techniques, such as data mining, genetic programming, and neural networks, are used to identify fraudulent transactions. In online transactions, data mining plays an indispensable role in the discovery of credit card fraud. This paper uses gradient boosted trees, a neural network, a clustering technique, a genetic algorithm, and a hidden Markov model to detect fraudulent transactions. All of these models are emerging approaches to credit card fraud detection. The essential aims are to expose fraudulent transactions and to corroborate the test data for further use. This paper surveys these techniques and pinpoints the top fraud cases.
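Of the models listed, the hidden Markov model admits a compact sketch: the forward algorithm scores how likely a sequence of discretized transaction amounts is under a model of a cardholder's behavior, and sequences with unusually low likelihood can be flagged. The states, observation symbols, and probabilities below are illustrative assumptions, not parameters from the paper.

```python
import math

def forward_loglike(obs: list[int], start: list[float],
                    trans: list[list[float]],
                    emit: list[list[float]]) -> float:
    """Log-likelihood of an observation sequence under an HMM (forward algorithm)."""
    states = range(len(start))
    alpha = [start[s] * emit[s][obs[0]] for s in states]   # initialization
    for o in obs[1:]:                                      # induction step
        alpha = [sum(alpha[p] * trans[p][s] for p in states) * emit[s][o]
                 for s in states]
    return math.log(sum(alpha))

# Hypothetical two-state model: state 0 = normal spending, 1 = fraudulent;
# observation symbols: 0 = low, 1 = medium, 2 = high transaction amount.
start = [0.9, 0.1]
trans = [[0.95, 0.05], [0.3, 0.7]]
emit  = [[0.6, 0.3, 0.1], [0.1, 0.2, 0.7]]

print(forward_loglike([0, 1, 0], start, trans, emit))  # typical sequence
print(forward_loglike([2, 2, 2], start, trans, emit))  # suspicious sequence
```

The typical low/medium/low sequence scores a higher log-likelihood than a run of high-value charges; in practice the model is trained on each cardholder's history and a likelihood drop beyond a tuned threshold triggers review.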

