Research on power-law distribution of long-tail data and its application to tourism recommendation

Purpose One challenge for tourism recommendation systems (TRSs) is the long-tail phenomenon of ratings or popularity among tourist products. This paper aims to improve the diversity and efficiency of TRSs by utilizing the power-law distribution of long-tail data. Design/methodology/approach Using Sina Weibo check-in data as an example, this paper demonstrates that the long-tail phenomenon exists in user travel behaviors and fits the long-tail travel data with a power-law distribution. To address data sparsity in the long-tail part and increase the recommendation diversity of TRSs, the paper proposes a collaborative filtering (CF) recommendation algorithm combined with the power-law distribution. Furthermore, by combining the power-law distribution with locality sensitive hashing (LSH), the paper optimizes user similarity calculation to improve the calculation efficiency of TRSs. Findings The comparison experiments show that the proposed algorithm greatly improves recommendation diversity and calculation efficiency while maintaining high precision and recall, providing a basis for further dynamic recommendation. Originality/value TRSs provide a better solution to the problem of information overload in the tourism field. However, because they rely on historical travel data over the whole population, most current TRSs tend to recommend hot and similar spots to users, lacking diversity and failing to provide personalized recommendations. Meanwhile, the large, high-dimensional, sparse data in online social networks (OSNs) incurs a huge computational cost when calculating user similarity with traditional CF algorithms. In this paper, by integrating the power-law distribution of travel data with tourism recommendation technology, the authors’ work addresses the problem in traditional TRSs that recommendation results are overly narrow and lack serendipity; it provides users with a wider range of choices and hence improves user experience in TRSs. Meanwhile, utilizing locality sensitive hash functions, the authors’ work hashes users from high-dimensional vectors to one-dimensional integers and maps similar users into the same buckets, which realizes fast nearest-neighbor search in high-dimensional space and addresses the extreme sparsity problem of high-dimensional travel data. Furthermore, by applying the hashing results to user similarity calculation, the paper greatly reduces computational complexity and improves the calculation efficiency of TRSs, which reduces the system load and enables TRSs to provide effective and timely recommendations for users.
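
The idea of hashing users from high-dimensional vectors to integers so that similar users share a bucket can be sketched as follows. This is a minimal illustration only: it assumes a random-hyperplane (cosine) LSH family, which the abstract does not specify, and the names `make_hyperplanes`, `lsh_key`, `bucket_users` and the toy check-in vectors are hypothetical, not taken from the paper.

```python
import random
from collections import defaultdict


def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random hyperplanes (assumed LSH family: sign of a random projection)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def lsh_key(vector, hyperplanes):
    """Map a high-dimensional vector to one integer: one bit per hyperplane side."""
    key = 0
    for plane in hyperplanes:
        dot = sum(v * p for v, p in zip(vector, plane))
        key = (key << 1) | (1 if dot >= 0 else 0)
    return key


def bucket_users(user_vectors, hyperplanes):
    """Group users by hash key; similarity is then computed only within a bucket."""
    buckets = defaultdict(list)
    for user, vec in user_vectors.items():
        buckets[lsh_key(vec, hyperplanes)].append(user)
    return dict(buckets)


# Hypothetical check-in counts per user over four spots.
users = {
    "u1": [5, 0, 1, 0],
    "u2": [10, 0, 2, 0],  # same direction as u1, so it gets the same key
    "u3": [0, 4, 0, 3],
}
planes = make_hyperplanes(dim=4, n_bits=8)
buckets = bucket_users(users, planes)
```

Restricting the pairwise similarity computation to users inside one bucket is what would cut the cost relative to comparing every pair of users; how the paper combines this with the power-law fit is not described in the abstract.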

The two sides of CEO pay injustice: a commentary

Management Research: The Journal of the Iberoamerican Academy of Management ◽ 10.1108/mrjiam-10-2017-0785 ◽ 2018 ◽ Vol 16 (1) ◽ pp. 90-96

Author(s): Albert Cannella ◽ Valerie Sy

Keyword(s): Power Law ◽ Design Methodology ◽ CEO Compensation ◽ Power Law Distribution ◽ Content Type ◽ Compensation Process ◽ CEO Pay ◽ Two Sides ◽ Theoretical Issues ◽ Research Domain

Purpose The purpose of this paper is to extend discussions in the CEO compensation research domain. Specifically, this paper provides a critical analysis of the power law conceptualization and pay injustice contribution by Aguinis, Martin, Gomez-Mejia, O’Boyle and Joo. Design/methodology/approach This commentary addresses statistical and theoretical issues of the power law distribution with respect to prior compensation research and offers additional perspectives on the issue of CEO pay deservingness. Findings The power law is worth investigating further, but more attention should be paid to outliers and fit to the distribution. Stronger theory is needed for using the power law to explain CEO compensation phenomena, especially regarding standard firm performance measures and anomalies in the compensation process. Finally, “injustice” and “deservingness” in discussions of CEO pay exist in the eye of the beholder. Originality/value This paper offers additional considerations for scholars to explore when applying the power law distribution to compensation research.

Examining the power-law distribution among eWOM communities: a characterisation approach of the Long Tail

Technology Analysis and Strategic Management ◽ 10.1080/09537325.2015.1122187 ◽ 2015 ◽ Vol 28 (5) ◽ pp. 601-613 ◽ Cited By ~ 10

Author(s): M. Olmedilla ◽ M. R. Martínez-Torres ◽ S. L. Toral

Keyword(s): Power Law ◽ Power Law Distribution ◽ Long Tail

The long tail thesis

Chinese Management Studies ◽ 10.1108/cms-03-2019-0109 ◽ 2019 ◽ Vol 14 (2) ◽ pp. 433-454

Author(s): Shuanping Dai ◽ Markus Taube

Keyword(s): Success Factors ◽ Business Models ◽ Developing Economies ◽ Business Practices ◽ Base Of The Pyramid ◽ Long Tail ◽ Content Type ◽ Customer Base ◽ Business Approach ◽ Long Tail Phenomenon

Purpose This paper aims to explore the functionality of long tail markets (LTM), where consumers cannot be reached or are ignored by traditional mainstream businesses, in new product and business development. Design/methodology/approach First, the authors review two Chinese entrepreneurial practices, in the Fintech sector and in low-speed electric vehicles (LSEV), and describe their stylized facts; second, they explore a possible theoretical LTM framework to underscore these practices; third, they connect LTM with existing business models and analyze its significance and practical implications for business, in particular in developing economies. Findings The LTM business approach has helped Chinese companies in the Fintech sector and LSEVs gain global attention. The success factors of LTM for businesses are identifying a specific customer base, being aware of localization products and playing skillfully with regulations; the LTM approach has several overlaps with existing studies on niche products and the base of the pyramid market. Originality/value Based on some emerging and attractive business practices in China, this paper attempts to theorize them as a long tail phenomenon. The LTM thesis provides a potential reference framework for similar methods elsewhere and may illuminate entrepreneurship to be explored in similar markets.

The elusive linkage between CEO pay and performance

Management Research: The Journal of the Iberoamerican Academy of Management ◽ 10.1108/mrjiam-10-2017-0787 ◽ 2018 ◽ Vol 16 (1) ◽ pp. 57-65

Author(s): Gerald Edward Ledford ◽ Edward E. Lawler

Keyword(s): Organizational Performance ◽ Power Law ◽ Chief Executive ◽ Power Law Distribution ◽ Content Type ◽ CEO Tenure ◽ Dependent Variables ◽ CEO Pay ◽ External Conditions ◽ And Performance

Purpose The authors comment on the paper by Aguinis et al. (2018). The authors believe that their hypotheses probably are true, but their methodology is flawed and their data do not support their conclusions. Design/Methodology The authors review and comment on the paper by Aguinis et al. (2018). Findings The data do not adequately demonstrate a power law distribution for chief executive officer’s (CEO) performance because the analysis confounded external conditions affecting performance, and the authors use inappropriate dependent variables. The analysis does not demonstrate a power law distribution for CEO pay because the analysis does not take into account changes in pay level and mix over time. The analysis does not show a lack of overlap between the two distributions because it does not take into account the way that the CEOs are paid for performance and because it uses CEO pay averaged over CEO tenure. Research limitations/implications A more convincing analysis of the authors’ hypothesis would require the use of total shareholder return (TSR) as the dependent variable for organizational performance and would require a number of much more specific controls. Practical implications The authors call for greater use of power law thinking by practitioners in setting CEO pay. Their analysis indicates that practitioners already think in power law terms and allocate CEO pay accordingly. Moreover, power law theory and findings could be misused as an excuse for paying average CEOs much more than they are already paid. Social implications The authors add another perspective on CEO pay. Originality/value The authors’ perspective is informed both by research and by consulting experience on CEO pay projects.

Defining the three Rs of commercial property market performance

Journal of Property Investment and Finance ◽ 10.1108/jpif-08-2014-0054 ◽ 2015 ◽ Vol 33 (6) ◽ pp. 481-493 ◽ Cited By ~ 1

Author(s): David Higgins

Keyword(s): Standard Deviation ◽ Power Law ◽ Extreme Values ◽ Market Performance ◽ Downside Risk ◽ Property Market ◽ Power Law Distribution ◽ Content Type ◽ Commercial Property ◽ Price Fluctuations

Purpose – Modern property investment allocation techniques are typically based on recognised measures of return and risk. Whilst these models work well in theory under stable conditions, they can fail when stable assumptions cease to hold and extreme volatility occurs. This is evident in commercial property markets, which can experience extended stable periods followed by large concentrated negative price fluctuations as a result of major unpredictable events. This extreme volatility may not be fully reflected in traditional risk calculations and can lead to ruin. The paper aims to discuss these issues. Design/methodology/approach – This research studies 28 years of quarterly Australian direct commercial property market performance data for normal distribution features and signs of extreme downside risk. For the extreme values, Power Law distribution models were examined to provide a better probability measure of large negative price fluctuations. Findings – The results show that the normal bell curve distribution underestimated actual extreme values in both frequency and extent, by at least 30 per cent for the outermost data point. For the statistical outliers beyond 2 SD, a Power Law distribution can overcome many of the shortcomings of the standard deviation approach and therefore better measure the probability of ruin, that is, extreme downside risk. Practical implications – In highlighting the challenges of measuring property market performance, analysis of extreme downside risk should be separated from traditional standard deviation risk calculations. In recognising these two different types of risk, extreme downside risk has a magnified domino effect with the tendency of bad news to come in crowds. Big price changes can lead to market crashes and financial ruin, which is well beyond the standard deviation risk measure. This needs to be recognised and developed, as there is evidence that extreme downside risk determinants are increasing in magnitude, frequency and impact. Originality/value – Analysis of extreme downside risk should form a key part of the property decision process and be included in the property investment manager’s toolkit. Modelling techniques for estimating measures of tail risk provide challenges and have been shown to be beyond traditional risk management practices, which offer too narrow and constraining a definition. Measuring extreme risk and the likelihood of ruin is the first step in analysing and dealing with risk in both an asset class and portfolio context.
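
To make the contrast between a normal-distribution risk measure and a power-law tail concrete, the sketch below estimates a tail exponent from losses above a threshold and compares exceedance probabilities. It is an illustration under stated assumptions, not the paper's procedure: it assumes a continuous Pareto tail fitted by maximum likelihood (the Hill-type estimator), and the function names and the loss figures are hypothetical and synthetic.

```python
import math
from statistics import NormalDist, mean, stdev


def pareto_tail_exponent(losses, threshold):
    """Maximum-likelihood exponent of a Pareto tail for losses above the threshold."""
    tail = [x for x in losses if x > threshold]
    if not tail:
        raise ValueError("no observations above threshold")
    return len(tail) / sum(math.log(x / threshold) for x in tail)


def power_law_exceedance(x, threshold, alpha, tail_fraction):
    """P(loss > x) under the fitted tail, for x at or above the threshold."""
    return tail_fraction * (x / threshold) ** (-alpha)


def normal_exceedance(x, mu, sd):
    """P(loss > x) if losses were normally distributed."""
    return 1.0 - NormalDist(mu, sd).cdf(x)


# Synthetic loss magnitudes (illustrative only, not the paper's data).
losses = [0.5, 0.8, 1.2, 1.5, 2.0, 3.5, 6.0]
threshold = 1.0  # assumed cut-off for the "extreme" region
alpha = pareto_tail_exponent(losses, threshold)
tail_fraction = sum(x > threshold for x in losses) / len(losses)

p_power = power_law_exceedance(6.0, threshold, alpha, tail_fraction)
p_normal = normal_exceedance(6.0, mean(losses), stdev(losses))
```

Comparing `p_power` with `p_normal` for a large loss shows how the two models can disagree about the likelihood of extreme events; the choice of threshold strongly affects the fitted exponent and would need care in practice.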

The structure of co-publications multilayer network

Computational Social Networks ◽ 10.1186/s40649-021-00089-w ◽ 2021 ◽ Vol 8 (1)

Author(s): Ghislain Romaric Meleu ◽ Paulin Yonta Melatagia

Keyword(s): Power Law ◽ Degree Distribution ◽ Preferential Attachment ◽ Small World ◽ Generation Model ◽ Small World Network ◽ Power Law Distribution ◽ Multilayer Networks ◽ Multilayer Network ◽ Scientific Papers

Abstract Using the headers of scientific papers, we have built multilayer networks of entities involved in research, namely authors, laboratories, and institutions. We have analyzed some properties of such networks built from data extracted from the HAL archives and found that the network at each layer is a small-world network with a power-law degree distribution. In order to simulate such co-publication networks, we propose a multilayer network generation model based on the formation of cliques at each layer and the affiliation of each new node to the higher layers. The clique is built from new and existing nodes selected using preferential attachment. We also show that the degree distribution of generated layers follows a power law. From the simulations of our model, we show that the generated multilayer networks reproduce the studied properties of co-publication networks.

Computing Expectiles Using k-Nearest Neighbours Approach

Symmetry ◽ 10.3390/sym13040645 ◽ 2021 ◽ Vol 13 (4) ◽ pp. 645

Author(s): Muhammad Farooq ◽ Sehrish Sarfraz ◽ Christophe Chesneau ◽ Mahmood Ul Hassan ◽ Muhammad Ali Raza ◽ ...

Keyword(s): Computational Cost ◽ Real Life ◽ Distance Measures ◽ Computational Time ◽ High Dimensional ◽ Test Error ◽ Nearest Neighbours ◽ Comparable Performance ◽ Asymmetric Least Squares ◽ Low Computational Cost

Expectiles have gained considerable attention in recent years due to their wide applications in many areas. In this study, the k-nearest neighbours approach, together with the asymmetric least squares loss function, called ex-kNN, is proposed for computing expectiles. Firstly, the effect of various distance measures on ex-kNN in terms of test error and computational time is evaluated. It is found that the Canberra, Lorentzian, and Soergel distance measures lead to minimum test error, whereas Euclidean, Canberra, and Average of (L1,L∞) lead to a low computational cost. Secondly, the performance of ex-kNN is compared with the existing packages er-boost and ex-svm for computing expectiles on nine real-life examples. Depending on the nature of the data, ex-kNN showed two to 10 times better performance than er-boost and comparable performance with ex-svm regarding test error. Computationally, ex-kNN was found to be two to five times faster than ex-svm and much faster than er-boost, particularly in the case of high-dimensional data.
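
The combination of k-nearest neighbours with an asymmetric least squares loss can be sketched as below. This is a hedged illustration, not the authors' ex-kNN implementation: it assumes Euclidean distance (one of several measures the abstract mentions), solves the asymmetric least squares problem by iteratively reweighted averaging, and uses hypothetical names (`expectile`, `knn_expectile`) and toy data.

```python
import math


def expectile(values, tau=0.5, n_iter=100, tol=1e-10):
    """tau-expectile (0 < tau < 1) of a sample: the minimiser of the
    asymmetric least squares loss, found by iteratively reweighted means."""
    e = sum(values) / len(values)
    for _ in range(n_iter):
        # Observations above the current estimate get weight tau, others 1 - tau.
        weights = [tau if y > e else 1.0 - tau for y in values]
        new_e = sum(w * y for w, y in zip(weights, values)) / sum(weights)
        if abs(new_e - e) < tol:
            return new_e
        e = new_e
    return e


def knn_expectile(train_x, train_y, query, k=3, tau=0.5):
    """Expectile of the responses of the k training points closest to query."""
    order = sorted(range(len(train_x)),
                   key=lambda i: math.dist(train_x[i], query))  # Euclidean: an assumption
    neighbours = [train_y[i] for i in order[:k]]
    return expectile(neighbours, tau)


# Toy one-dimensional data (hypothetical).
train_x = [[0.0], [1.0], [2.0], [10.0]]
train_y = [1.0, 2.0, 3.0, 100.0]
centre = knn_expectile(train_x, train_y, [1.0], k=3, tau=0.5)
upper = knn_expectile(train_x, train_y, [1.0], k=3, tau=0.9)
```

With `tau=0.5` the expectile reduces to the mean of the neighbours' responses; larger `tau` shifts the estimate towards the upper part of the local response distribution. Swapping `math.dist` for another distance function is the natural place to try the alternative measures the study compares.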

Power-law behavior of transcription factor dynamics at the single-molecule level implies a continuum affinity model

Nucleic Acids Research ◽ 10.1093/nar/gkab072 ◽ 2021

Author(s): David A Garcia ◽ Gregory Fettweis ◽ Diego M Presman ◽ Ville Paakinaho ◽ Christopher Jarzynski ◽ ...

Keyword(s): Transcription Factor ◽ Single Molecule ◽ Power Law ◽ Dwell Time ◽ Specific Binding ◽ Power Law Distribution ◽ Single Molecule Level ◽ Binding Behavior ◽ Chromatin Template ◽ Nuclear Domains

Abstract Single-molecule tracking (SMT) allows the study of transcription factor (TF) dynamics in the nucleus, giving important information regarding the diffusion and binding behavior of these proteins in the nuclear environment. Dwell time distributions obtained by SMT for most TFs appear to follow bi-exponential behavior. This has been ascribed to two discrete populations of TFs—one non-specifically bound to chromatin and another specifically bound to target sites, as implied by decades of biochemical studies. However, emerging studies suggest alternate models for dwell-time distributions, indicating the existence of more than two populations of TFs (multi-exponential distribution), or even the absence of discrete states altogether (power-law distribution). Here, we present an analytical pipeline to evaluate which model best explains SMT data. We find that a broad spectrum of TFs (including glucocorticoid receptor, oestrogen receptor, FOXA1, CTCF) follow a power-law distribution of dwell-times, blurring the temporal line between non-specific and specific binding, suggesting that productive binding may involve longer binding events than previously believed. From these observations, we propose a continuum of affinities model to explain TF dynamics, that is consistent with complex interactions of TFs with multiple nuclear domains as well as binding and searching on the chromatin template.

The power-law distribution of gene family size is driven by the pseudogenisation rate's heterogeneity between gene families

Gene ◽ 10.1016/j.gene.2008.02.014 ◽ 2008 ◽ Vol 414 (1-2) ◽ pp. 85-94 ◽ Cited By ~ 13

Author(s): Timothy Hughes ◽ David A. Liberles

Keyword(s): Gene Family ◽ Family Size ◽ Power Law ◽ Gene Families ◽ Power Law Distribution

Diversifying systemic risk in agriculture

Agricultural Finance Review ◽ 10.1108/afr-06-2016-0061 ◽ 2016 ◽ Vol 76 (4) ◽ pp. 512-531 ◽ Cited By ~ 3

Author(s): Xiaoguang Feng ◽ Dermot Hayes

Keyword(s): Crop Yield ◽ Systemic Risk ◽ Crop Insurance ◽ Tail Dependence ◽ Model Potential ◽ High Dimensional ◽ Computationally Efficient ◽ Content Type ◽ Insurance Portfolio ◽ Systemic Nature

Purpose Portfolio risk in crop insurance due to the systemic nature of crop yield losses has inhibited the development of private crop insurance markets. Government subsidy or reinsurance has therefore been used to support crop insurance programs. The purpose of this paper is to investigate the possibility of converting systemic crop yield risk into “poolable” risk. Specifically, this study examines whether it is possible to remove the co-movement as well as tail dependence of crop yield variables by enlarging the risk pool across different crops and countries. Design/methodology/approach Hierarchical Kendall copula (HKC) models are used to model potential non-linear correlations of the high-dimensional crop yield variables. A Bayesian estimation approach is applied to account for estimation risk in the copula parameters. A synthetic insurance portfolio is used to evaluate the systemic risk and diversification effect. Findings The results indicate that the systemic nature – both positive correlation and lower tail dependence – of crop yield risks can be eliminated by combining crop insurance policies across crops and countries. Originality/value The study applies the HKC in the context of agricultural risks. Compared to other advanced copulas, the HKC achieves both flexibility and parsimony. The flexibility of the HKC makes it appropriate to precisely represent various correlation structures of crop yield risks while the parsimony makes it computationally efficient in modeling high-dimensional correlation structure.
