A new methodology for customer behavior analysis using time series clustering

Kybernetes ◽  
2019 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Hossein Abbasimehr ◽  
Mostafa Shabani

Purpose – The purpose of this paper is to propose a new methodology that handles the issue of the dynamic behavior of customers over time.
Design/methodology/approach – A new methodology based on time series clustering is presented to extract dominant behavioral patterns of customers over time. The methodology is applied to bank customers' transaction data, which take the form of time series. The data comprise the recency (R), frequency (F) and monetary (M) attributes of businesses that use the point-of-sale (POS) devices of a bank, and were obtained from the bank's data analysis department.
Findings – An empirical study of the transaction data of 2,531 business customers using the bank's POS devices uncovered the dominant trends of behavior via the proposed methodology. The obtained trends were analyzed from a marketing viewpoint. Based on the analysis of the monetary attribute, customers were divided into four main segments: high-value growing customers, middle-value growing customers, prone-to-churn customers and churners. For each resulting group of customers with a distinctive trend, effective and practical marketing recommendations were devised to improve the bank's relationship with that group. The prone-to-churn segment contains most of the customers; therefore, the bank should run attractive promotions to retain this segment.
Practical implications – The discovered trends of customer behavior and the proposed marketing recommendations can help banks devise segment-specific marketing strategies, as they illustrate the dynamic behavior of customers over time. The obtained trends are visualized so that they can be easily interpreted and used by banks. This paper contributes to the literature on customer relationship management (CRM), as the proposed methodology can be applied to different businesses to reveal trends in customer behavior.
Originality/value – In current business conditions, customer behavior changes continually over time and customers churn because of reduced switching costs. Choosing an effective customer segmentation methodology that can account for the dynamic behavior of customers is therefore essential for every business. This paper proposes a new methodology to capture customer dynamics using time series clustering on time-ordered data, an improvement over previous studies, which have often adopted static segmentation approaches. To the best of the authors' knowledge, this is the first study to combine the recency, frequency and monetary (RFM) model with time series clustering to reveal trends in customer behavior.
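
The paper does not publish code, but the core step it describes, clustering customers' RFM time series into a handful of trends, can be sketched in a few lines. The snippet below is a minimal illustration under our own assumptions (tslearn's DTW k-means and a synthetic `monetary` matrix stand in for the bank data), not the authors' implementation:

```python
# A minimal sketch, assuming DTW k-means over a hypothetical (n_customers,
# n_months) monetary matrix; the authors' actual pipeline may differ.
import numpy as np
from tslearn.preprocessing import TimeSeriesScalerMeanVariance
from tslearn.clustering import TimeSeriesKMeans

rng = np.random.default_rng(0)
monetary = rng.gamma(shape=2.0, scale=100.0, size=(200, 24))  # toy stand-in data

X = TimeSeriesScalerMeanVariance().fit_transform(monetary)    # z-normalize each series
km = TimeSeriesKMeans(n_clusters=4, metric="dtw", random_state=0)
labels = km.fit_predict(X)                                    # one trend label per customer

for k in range(4):
    print(f"cluster {k}: {np.sum(labels == k)} customers")
```

Normalizing each series first makes the clustering respond to the shape of a customer's trajectory (growing, declining, churning) rather than its absolute spending level.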

2016 ◽  
Vol 50 (1) ◽  
pp. 41-57 ◽  
Author(s):  
Linghe Huang ◽  
Qinghua Zhu ◽  
Jia Tina Du ◽  
Baozhen Lee

Purpose – A wiki is a new form of information production and organization that has become one of the most important knowledge resources. In recent years, as the number of users in wikis has grown, the "free rider" problem has become serious. To motivate editors to contribute more to a wiki system, it is important to fully understand their contribution behavior. The purpose of this paper is to explore the dynamics of editors' contribution behavior in wikis.
Design/methodology/approach – After developing a dynamic model of contribution behavior, the authors employed both metrological and clustering methods to process the time series data. The experimental data were collected from Baidu Baike, a renowned Chinese wiki system similar to Wikipedia.
Findings – There are four categories of editors: "testers," "dropouts," "delayers" and "stickers." Testers contribute the least content and stop contributing rapidly after editing a few articles. Dropouts stop contributing completely after editing a large amount of content. Delayers do not stop contributing during the observation period, but they may stop in the near future. Stickers, who keep contributing and edit the most content, are the core editors. In addition, there are significant time-of-day and holiday effects on the number of editors' contributions.
Originality/value – Using time series analysis, new characteristics of editors and editor types were found. This research also drew on a larger sample than previous studies, so the results are more robust and representative and can help managers optimize wiki systems and formulate incentive strategies for editors.
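
The four editor types are essentially shape classes of contribution curves, so a rough stand-in for the clustering step (our own construction, not the study's code) might cluster normalized cumulative edit counts:

```python
# A rough illustration under our own assumptions: group editors by the shape of
# their cumulative contribution curves, echoing the four reported types.
# `daily_edits` is a hypothetical (n_editors, n_days) count matrix.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
daily_edits = rng.poisson(lam=0.5, size=(300, 180))

curves = daily_edits.cumsum(axis=1).astype(float)
curves /= np.maximum(curves[:, [-1]], 1.0)   # normalize so shape, not volume, drives clustering

labels = KMeans(n_clusters=4, n_init=10, random_state=1).fit_predict(curves)
print(np.bincount(labels))                   # editors per cluster (testers/dropouts/delayers/stickers)
```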


2021 ◽  
Vol 3 (1) ◽  
Author(s):  
Hitoshi Iuchi ◽  
Michiaki Hamada

Abstract – Time-course experiments using parallel sequencers have the potential to uncover gradual changes in cells over time that cannot be observed in a two-point comparison. An essential step in time-series data analysis is the identification of temporal differentially expressed genes (TEGs) under two conditions (e.g. control versus case). Model-based approaches, which are typical TEG detection methods, often set one parameter (e.g. polynomial degree or degrees of freedom) per dataset. This risks modeling linearly increasing genes with higher-order functions, or fitting cyclic gene expression with linear functions, leading to false positives/negatives. Here, we present a Jonckheere–Terpstra–Kendall (JTK)-based non-parametric algorithm for TEG detection. Benchmarks on simulated data show that the JTK-based approach outperforms existing methods, especially in long time-series experiments. Additionally, applying JTK to time-series RNA-seq data from seven tissue types across developmental stages in mouse and rat suggested that wave patterns, rather than differences in expression levels, drive JTK's TEG identification. This suggests that JTK is a suitable algorithm when the focus is on expression patterns over time rather than expression levels, such as in comparisons between different species. These results show that JTK is an excellent candidate for TEG detection.
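
To make the nonparametric intuition concrete, the toy sketch below applies only the Kendall-tau trend component per gene (the paper's full JTK procedure is more involved); the data matrices and the drift we inject are our own assumptions:

```python
# A toy sketch of the rank-based trend idea only, not the paper's full JTK
# algorithm. `expr_ctrl` / `expr_case` are hypothetical (n_genes, n_timepoints)
# expression matrices.
import numpy as np
from scipy.stats import kendalltau

rng = np.random.default_rng(2)
t = np.arange(10)
expr_ctrl = rng.normal(size=(100, 10))
expr_case = expr_ctrl + 0.5 * t            # case condition drifts upward over time

for name, expr in [("control", expr_ctrl), ("case", expr_case)]:
    taus = np.array([kendalltau(t, g)[0] for g in expr])  # tau per gene vs. time
    print(name, "mean |tau|:", np.abs(taus).mean())
```

Because Kendall's tau depends only on the ordering of values, the statistic reacts to the wave pattern over time rather than to absolute expression levels, matching the behavior the abstract reports.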


2021 ◽  
Author(s):  
Sadnan Al Manir ◽  
Justin Niestroy ◽  
Maxwell Adam Levinson ◽  
Timothy Clark

Introduction: Transparency of computation is a requirement for assessing the validity of computed results and research claims based upon them; and it is essential for access to, assessment, and reuse of computational components. These components may be subject to methodological or other challenges over time. While reference to archived software and/or data is increasingly common in publications, a single machine-interpretable, integrative representation of how results were derived, that supports defeasible reasoning, has been absent. Methods: We developed the Evidence Graph Ontology, EVI, in OWL 2, with a set of inference rules, to provide deep representations of supporting and challenging evidence for computations, services, software, data, and results, across arbitrarily deep networks of computations, in connected or fully distinct processes. EVI integrates FAIR practices on data and software, with important concepts from provenance models, and argumentation theory. It extends PROV for additional expressiveness, with support for defeasible reasoning. EVI treats any com- putational result or component of evidence as a defeasible assertion, supported by a DAG of the computations, software, data, and agents that produced it. Results: We have successfully deployed EVI for very-large-scale predictive analytics on clinical time-series data. Every result may reference its own evidence graph as metadata, which can be extended when subsequent computations are executed. Discussion: Evidence graphs support transparency and defeasible reasoning on results. They are first-class computational objects, and reference the datasets and software from which they are derived. They support fully transparent computation, with challenge and support propagation. The EVI approach may be extended to include instruments, animal models, and critical experimental reagents.
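
EVI itself is an OWL 2 ontology, but the challenge-propagation behavior it enables can be mimicked on a plain DAG. The sketch below is a simplification under our own assumptions (node and relation names are invented for illustration):

```python
# A simplified sketch of support/challenge propagation over an evidence DAG;
# this mimics the behavior, it is not the EVI ontology or its inference rules.
import networkx as nx

g = nx.DiGraph()
g.add_edge("dataset_v1", "computation_1", relation="usedBy")
g.add_edge("software_v2", "computation_1", relation="usedBy")
g.add_edge("computation_1", "result_A", relation="supports")
g.add_edge("retraction_note", "dataset_v1", relation="challenges")

def is_challenged(node):
    # A result is defeasible: a challenge anywhere upstream undermines it.
    for anc in nx.ancestors(g, node):
        for pred in g.predecessors(anc):
            if g[pred][anc].get("relation") == "challenges":
                return True
    return False

print("result_A challenged:", is_challenged("result_A"))  # True, via dataset_v1
```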


2018 ◽  
Vol 11 (4) ◽  
pp. 486-495
Author(s):  
Ke Yi Zhou ◽  
Shaolin Hu

Purpose – The similarity measurement of time series is an important research topic in time series detection and underpins time series clustering, anomaly discovery, prediction and many other data mining problems. The purpose of this paper is to design a new similarity measurement algorithm that improves on the performance of the original algorithm. The proposed algorithm takes subsequence morphological information into account and represents a time series as a pattern, so the similarity measurement is more accurate.
Design/methodology/approach – Following previous research on similarity measurement, an improved method is presented. This new method combines morphological representation and the dynamic time warping (DTW) technique to measure the similarity of time series. After segmenting the time series data, three parameters – the median, the number of points and the slope of each segment – are introduced into the improved distance measurement formula. The effectiveness of the morphological weighted DTW algorithm (MW-DTW) is demonstrated on momentum wheel data from an aircraft attitude control system.
Findings – The improved method is insensitive to distortion and expansion of the time axis and can be used to detect morphological changes in time series data. Simulation results confirm that the proposed method achieves high accuracy in similarity measurement.
Practical implications – The improved method has been used to solve the similarity measurement problem for time series, which arises widely in science and engineering, for example in control, measurement, monitoring, process signal processing and economic analysis.
Originality/value – In the similarity measurement of time series, the distance between sequences is often used as the only detection index. The results of similarity measurement should not be affected by longitudinal or transverse stretching and translation of the sequence, so it is necessary to incorporate the morphological changes of the sequence into similarity measurement. MW-DTW is better suited to this situation. At the same time, the MW-DTW algorithm reduces computational complexity by operating on subsequences rather than raw points.
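
The segment-then-warp structure is easy to sketch. The following is our reading of the idea with invented weights and a fixed segment length (the paper's exact segmentation and distance formula may differ):

```python
# A hand-rolled sketch of the MW-DTW idea: describe fixed-length segments by
# (median, point count, slope), then run DTW over weighted feature distances.
# Segment length and weights `w` are illustrative assumptions.
import numpy as np

def segment_features(x, seg_len=10):
    feats = []
    for i in range(0, len(x) - seg_len + 1, seg_len):
        seg = x[i:i + seg_len]
        slope = np.polyfit(np.arange(seg_len), seg, 1)[0]
        feats.append((np.median(seg), len(seg), slope))
    return np.array(feats)

def mw_dtw(a, b, w=(1.0, 0.1, 5.0)):
    w = np.asarray(w)
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.sqrt(np.sum(w * (a[i - 1] - b[j - 1]) ** 2))  # weighted feature distance
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

t = np.linspace(0, 4 * np.pi, 100)
d = mw_dtw(segment_features(np.sin(t)), segment_features(np.sin(t * 1.1)))
print("MW-DTW distance:", round(d, 3))
```

Warping over segments rather than raw points is also where the claimed complexity reduction comes from: the DTW matrix shrinks from (length x length) to (segments x segments).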


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Najimu Saka ◽  
Abdullahi Babatunde Saka ◽  
Opeoluwa Akinradewo ◽  
Clinton O. Aigbavboa

Purpose – The complex interaction of politics and the economy is a critical factor for the sustainable growth and development of the construction sector (CNS). This study aims to investigate the effects of the type of political administration, democratic or military, on the performance of the CNS, using the Nigerian Construction Sector (NCS) as a case study.
Design/methodology/approach – A 48-year (1970–2017) time series dataset (TSD) on the NCS and the gross domestic product (GDP), based on 2010 constant USD, was extracted from the United Nations Statistics Division database. Analysis of variance (ANOVA) and analysis of covariance (ANCOVA) models were used to analyze the TSD; the ANCOVA model includes GDP as a covariate.
Findings – Estimates from the ANOVA model indicate that democratic administration is significantly better than military administration for construction performance. However, the ANCOVA model indicates that GDP matters more than political administration for the performance of the CNS. The study recommends a new national construction policy, favourable fiscal and monetary policy, a local content development policy and a construction credit guaranty scheme for the rapid growth and development of the NCS.
Originality/value – Hitherto, little has been known about the influence of political administration on the performance of the CNS. This study provides empirical evidence from a developing-economy perspective, presents the relationships and highlights recommendations for driving growth in the construction industry.
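
The two models differ only by the covariate, which a short statsmodels sketch makes explicit. Variable names, the toy data and the administration split are our own illustrative assumptions, not the paper's series:

```python
# A schematic of ANOVA vs. ANCOVA on annual data; all values are synthetic
# stand-ins and the democracy/military split is a crude illustration.
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

rng = np.random.default_rng(3)
years = np.arange(1970, 2018)
df = pd.DataFrame({
    "cns": rng.normal(100, 10, len(years)),        # toy construction output
    "gdp": rng.normal(1000, 100, len(years)),      # toy GDP, 2010 constant USD
    "admin": np.where(years >= 1999, "democracy", "military"),
})

anova = ols("cns ~ C(admin)", data=df).fit()        # administration type only
ancova = ols("cns ~ C(admin) + gdp", data=df).fit() # GDP added as covariate
print(sm.stats.anova_lm(anova, typ=2))
print(sm.stats.anova_lm(ancova, typ=2))
```

Comparing the two tables shows how an administration effect that looks significant in the ANOVA can be absorbed by the covariate in the ANCOVA, which is exactly the pattern the findings describe.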


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Zulkifli Halim ◽  
Shuhaida Mohamed Shuhidan ◽  
Zuraidah Mohd Sanusi

Purpose – In previous studies of financial distress prediction, deep learning techniques have performed better than traditional techniques on time-series data. This study investigates the performance of deep learning models (recurrent neural network, long short-term memory and gated recurrent unit) for financial distress prediction among Malaysian public listed corporations on time-series data. It also compares the performance of logistic regression, support vector machine, neural network, decision tree and the deep learning models on single-year data.
Design/methodology/approach – The data are the financial data of public listed companies in Malaysia classified as PN17 status (distress) and non-PN17 (not distress). The study was conducted using machine learning libraries for the Python programming language.
Findings – All deep learning models used in this study achieved 90% accuracy or above, with long short-term memory (LSTM) and gated recurrent unit (GRU) reaching 93%. The deep learning models also consistently performed well relative to the other models on single-year data: LSTM and GRU achieved 90% accuracy, the recurrent neural network (RNN) 88%, and LSTM and GRU obtained better precision and recall than RNN. The findings show that a deep learning approach leads to better performance in financial distress prediction, and that time-series data should be emphasized in such studies given their impact on credit risk assessment.
Research limitations/implications – First, hyperparameter tuning was applied only to the deep learning models. Second, the time-series data were used only for the deep learning models, since the other models fit optimally on single-year data.
Practical implications – This study recommends deep learning as a new approach that leads to better performance in financial distress prediction. In addition, time-series data should be emphasized in any financial distress prediction study, given their impact on the assessment of credit risk.
Originality/value – To the best of the authors' knowledge, this is the first study to use the gated recurrent unit for financial distress prediction based on time-series data for Malaysian public listed companies. The findings can help financial institutions and investors find a better, more accurate approach to credit risk assessment.
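
Since the study works in Python, a minimal Keras sketch conveys the setup. The architecture, input shapes and data below are our guesses for illustration, not the authors' configuration:

```python
# A minimal sketch, assuming a few years of financial ratios per firm and a
# binary PN17 label; hyperparameters are illustrative, not the study's.
import numpy as np
from tensorflow.keras import Sequential
from tensorflow.keras.layers import LSTM, Dense

rng = np.random.default_rng(4)
X = rng.normal(size=(500, 5, 8)).astype("float32")  # 500 firms, 5 years, 8 ratios
y = rng.integers(0, 2, size=500).astype("float32")  # 1 = PN17 (distress)

model = Sequential([
    LSTM(32, input_shape=(5, 8)),   # swap in GRU(32) for the GRU variant
    Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=3, batch_size=32, verbose=0)
print(model.evaluate(X, y, verbose=0))              # [loss, accuracy]
```

The single-year baselines (logistic regression, SVM, decision tree) would instead consume a flat (firms x ratios) matrix, which is why only the recurrent models can exploit the time dimension.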


2020 ◽  
Vol 49 (2) ◽  
pp. 229-248
Author(s):  
Tamson Pietsch

Purpose – The purpose of this paper is to create comparable time series data on university income in Australia and the UK that might be used as a resource for those seeking to understand the changing funding profile of universities in the two countries and for those seeking to investigate how such data were produced and utilised.
Design/methodology/approach – A statistical analysis of university income from all sources in the UK and Australia.
Findings – The article produces a new time series for Australia and a comparable time series for the UK. It suggests some of the ways these data related to broader patterns of economic change, sketches the possibility of strategic influence, and outlines some of their limitations.
Originality/value – This is the first study to systematically create a time series on Australian university income across the twentieth century and present it alongside a comparable dataset for the UK.


2017 ◽  
Vol 10 (1) ◽  
pp. 82-110
Author(s):  
Syed Ali Raza ◽  
Mohd Zaini Abd Karim

Purpose – This study aims to investigate the influence of systemic banking crises, currency crises and the global financial crisis on the relationship between exports and economic growth in China, using annual time series data from 1972 to 2014.
Design/methodology/approach – The Johansen and Juselius cointegration, autoregressive distributed lag (ARDL) bounds testing cointegration, Gregory and Hansen cointegration and pooled ordinary least squares techniques, together with an error correction model, were used.
Findings – Results indicate a positive and significant effect of exports of goods and services on economic growth in both the long and short run, whereas systemic banking crises and currency crises have a negative influence on economic growth. The impact of exports of goods and services on economic growth becomes insignificant in the presence of systemic banking crises and currency crises, with currency crises weakening the export–growth link to a greater extent than systemic banking crises. Surprisingly, exports during the global financial crisis had a positive and significant influence on economic growth in China, suggesting that the crisis did not drastically affect the export–growth nexus.
Originality/value – This paper makes a unique contribution to the literature with reference to China, being a pioneering attempt to investigate the effects of systemic banking crises and currency crises on the relationship between exports and economic growth by using long time series data and applying rigorous econometric techniques.
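
One of the named tests, Johansen's cointegration test, is available in statsmodels. The sketch below uses synthetic series and our own variable names purely to show the call pattern:

```python
# A sketch of one of the tests the authors name: Johansen's trace test for
# cointegration between log exports and log GDP. Data are synthetic stand-ins.
import numpy as np
import pandas as pd
from statsmodels.tsa.vector_ar.vecm import coint_johansen

rng = np.random.default_rng(5)
n = 43                                        # 1972-2014 annual observations
common = rng.normal(size=n).cumsum()          # shared stochastic trend
df = pd.DataFrame({
    "log_gdp": common + rng.normal(scale=0.1, size=n),
    "log_exports": 0.8 * common + rng.normal(scale=0.1, size=n),
})

res = coint_johansen(df, det_order=0, k_ar_diff=1)
print("trace statistics:", res.lr1)           # compare against res.cvt critical values
```

The crisis effects in the paper would then enter as dummy variables in the long-run and error-correction equations, which is where the insignificance of exports during crises shows up.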


2019 ◽  
Vol 14 (2) ◽  
pp. 182-207 ◽  
Author(s):  
Benoît Faye ◽  
Eric Le Fur

Abstract – This article tests the stability of the main hedonic wine price coefficients over time. We draw on an extensive literature review to identify the most frequently used methodology and define a standard hedonic model. We estimate this model on monthly subsamples of a worldwide auction database of the most commonly exchanged fine wines. This provides, for each attribute, a monthly time series of hedonic coefficients from 2003 to 2014. Using a multivariate autoregressive model, we then study the stability of these coefficients over time and test for structural or cyclical changes related to fluctuations in general price levels. We find that most hedonic coefficients are variable and exhibit either structural or cyclical variation over time. These findings cast doubt on the relevance of both short- and long-run hedonic estimations. (JEL Classifications: C13, C22, D44, G11)
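
The two-stage design (monthly hedonic regressions, then a VAR on the coefficient series) can be compressed into a short sketch. Attribute names and the synthetic data below are illustrative assumptions, not the article's model or dataset:

```python
# A compressed sketch of the two-stage design: a hedonic OLS per month, then a
# VAR on the resulting coefficient time series. All names/data are illustrative.
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.tsa.api import VAR

rng = np.random.default_rng(6)
months, coefs = pd.period_range("2003-01", "2014-12", freq="M"), []
for _ in months:
    X = pd.DataFrame({"age": rng.uniform(1, 30, 200), "rating": rng.uniform(80, 100, 200)})
    y = 0.02 * X["age"] + 0.05 * X["rating"] + rng.normal(scale=0.3, size=200)  # log price
    coefs.append(sm.OLS(y, sm.add_constant(X)).fit().params[["age", "rating"]])

coef_ts = pd.DataFrame(coefs, index=months.to_timestamp())  # one hedonic series per attribute
var_res = VAR(coef_ts).fit(maxlags=2)
print(var_res.summary())
```

Instability then shows up as significant dynamics in `coef_ts`: a truly stable hedonic coefficient would be flat up to noise, whereas structural or cyclical variation appears in the VAR estimates.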

