Increasing the Discriminatory Power of DEA in the Presence of the Sample Heterogeneity with Cluster Analysis and Decision Trees

Author(s):  
Sergey V. Samoilenko ◽  
Kweku-Muata Osei-Bryson

2021 ◽  
Vol 13 (6) ◽  
pp. 3297
Author(s):  
Alejandro García-Jurado ◽  
José Javier Pérez-Barea ◽  
Francisco Fernández-Navarro

Profiles of millennial reviewers and gamification can contribute to digital sustainability as a driver of innovation and growth. This study examines whether reviewers can be grouped into profiles, so that gamification tailored to each profile can be applied and sustained over time. Doing so would generate more information through reviews, helping responsible consumers make better purchase decisions. The objective of this study is twofold. First, it aims to characterize online product reviewers based on their intrinsic motivations and self-perception when they comment, identifying their main motivations. Second, it aims to classify these individuals based on their acceptance of gamification elements while commenting, and to relate that acceptance to the intrinsic attributes that determine their behavior. A survey design was used to capture responses from 187 millennial Amazon reviewers in Spain. The relationships between motivations and reviewer types were extracted from the dataset using decision trees (DTs), specifically the J48 algorithm. To address the second objective, the paper develops a reviewer typology based on cluster analysis and DTs. The results confirm that online product reviewers can be characterized by their intrinsic motivations, which are mainly egoistic motives, competence, and social relatedness. The J48 DT achieves a classification accuracy of approximately 95% in identifying reviewers based on intrinsic motivations. Egoistic intrinsic motives are likewise decisive in focusing gamification strategies.
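For readers unfamiliar with J48: it is Weka's implementation of the C4.5 decision tree algorithm, which chooses split attributes by gain ratio (information gain divided by split information). The sketch below illustrates that criterion only. The attribute names, the profile labels, and the four toy records are hypothetical and are not the survey data from the study; pruning and numeric-attribute handling are omitted.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def gain_ratio(rows, attribute, target):
    """Information gain of `attribute` divided by its split information."""
    total = len(rows)
    base = entropy([r[target] for r in rows])
    # Partition the target labels by the value of the candidate attribute.
    partitions = {}
    for r in rows:
        partitions.setdefault(r[attribute], []).append(r[target])
    remainder = sum(len(p) / total * entropy(p) for p in partitions.values())
    split_info = entropy([r[attribute] for r in rows])
    if split_info == 0:
        return 0.0  # attribute takes a single value, so it cannot split
    return (base - remainder) / split_info


# Hypothetical toy records (assumption): NOT data from the study.
reviewers = [
    {"egoistic": "high", "competence": "high", "profile": "A"},
    {"egoistic": "high", "competence": "low", "profile": "A"},
    {"egoistic": "low", "competence": "high", "profile": "B"},
    {"egoistic": "low", "competence": "low", "profile": "B"},
]

# The attribute with the highest gain ratio would be chosen for the root split.
best = max(["egoistic", "competence"],
           key=lambda a: gain_ratio(reviewers, a, "profile"))
print(best)
```

In this toy set the hypothetical "egoistic" attribute separates the two profiles perfectly, so it would be selected as the first split.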


2021 ◽  
Vol 14 (11) ◽  
pp. 544
Author(s):  
Mirjana Pejić Bach ◽  
Jasmina Pivar ◽  
Božidar Jaković

The goal of the paper is to present a framework for combining clustering and classification for churn management in telecommunications. Considering the value of market segmentation, we propose a three-stage approach to explain and predict churn in telecommunications separately for different market segments using cluster analysis and decision trees. In the first stage, a case study churn dataset is prepared for the analysis, consisting of demographics, usage of telecom services, contracts and billing, monetary value, and churn. In the second stage, k-means cluster analysis is used to identify market segments, and chi-square analysis is then applied to detect the clusters with the highest churn ratio. In the third stage, the chi-squared automatic interaction detector (CHAID) decision tree algorithm is used to develop classification models that identify churn determinants in the clusters with the highest churn level. The contribution of this paper resides in the development of a structured approach to churn management using clustering and classification, which was tested on a churn dataset with a rich variable structure. The proposed approach is continuous, since the results of market segmentation and the rules for churn prediction can be fed back to the customer database to improve the efficacy of churn management.
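The second stage described above can be illustrated with a minimal sketch of the chi-square screening step: given churn counts per cluster, it computes the Pearson chi-square statistic for the cluster-by-churn contingency table and picks the cluster with the highest churn ratio. The cluster names and counts are hypothetical, the clusters are assumed to have been produced already (for example by k-means), and the CHAID modelling of the third stage is not shown.

```python
def chi_square(table):
    """Pearson chi-square statistic for a contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of cluster and churn.
            expected = row_totals[i] * col_totals[j] / grand
            stat += (observed - expected) ** 2 / expected
    return stat


# Hypothetical counts per cluster as (churned, retained); NOT from the paper.
clusters = {
    "cluster_0": (10, 90),
    "cluster_1": (40, 60),
    "cluster_2": (25, 75),
}

stat = chi_square([list(counts) for counts in clusters.values()])
dof = (len(clusters) - 1) * (2 - 1)  # (rows - 1) * (columns - 1)

# Churn ratio per cluster; the highest one is the candidate for tree modelling.
churn_ratio = {
    name: churned / (churned + retained)
    for name, (churned, retained) in clusters.items()
}
target = max(churn_ratio, key=churn_ratio.get)
print(stat, dof, target)
```

The statistic would then be compared against a chi-square critical value for `dof` degrees of freedom to judge whether churn differs across clusters before a classification model is fitted to the `target` cluster.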


Author(s):  
Rachel M. Harter ◽  
Pinliang (Patrick) Chen ◽  
Joseph P. McMichael ◽  
Edgardo S. Cureg ◽  
Samson A. Adeshiyan ◽  
...  

The 2015 Residential Energy Consumption Survey design called for stratification of primary sampling units to improve estimation. Two methods of defining strata from multiple stratification variables were proposed, leading to this investigation. All stratification methods use stratification variables available for the entire frame. We reviewed textbook guidance on the general principles and desirable properties of stratification variables and the assumptions on which the two methods were based. Using principal components combined with cluster analysis on the stratification variables to define strata focuses on relationships among stratification variables. Decision trees, regressions, and correlation approaches focus more on relationships between the stratification variables and prior outcome data, which may be available for just a sample of units. Using both principal components/cluster analysis and decision trees, we stratified primary sampling units for the 2009 Residential Energy Consumption Survey and compared the resulting strata.


Author(s):  
Thomas W. Shattuck ◽  
James R. Anderson ◽  
Neil W. Tindale ◽  
Peter R. Buseck

Individual particle analysis involves the study of tens of thousands of particles using automated scanning electron microscopy and elemental analysis by energy-dispersive X-ray emission spectroscopy (EDS). EDS produces large data sets that must be analyzed using multivariate statistical techniques. A complete study uses cluster analysis, discriminant analysis, and factor or principal components analysis (PCA). The three techniques are used in the study of particles sampled during the FeLine cruise to the mid-Pacific Ocean in the summer of 1990. The mid-Pacific aerosol provides information on long-range particle transport, iron deposition, sea salt ageing, and halogen chemistry.

Aerosol particle data sets suffer from a number of difficulties for pattern recognition using cluster analysis. There is a great disparity in the number of observations per cluster and the range of the variables in each cluster. The variables are not normally distributed, they are subject to considerable experimental error, and many values are zero because of finite detection limits. Many of the clusters show considerable overlap because of natural variability, agglomeration, and chemical reactivity.


Author(s):  
Matthew L. Hall ◽  
Stephanie De Anda

Purpose: The purposes of this study were (a) to introduce “language access profiles” as a viable alternative construct to “communication mode” for describing experience with language input during early childhood for deaf and hard-of-hearing (DHH) children; (b) to describe the development of a new tool for measuring DHH children's language access profiles during infancy and toddlerhood; and (c) to evaluate the novelty, reliability, and validity of this tool.

Method: We adapted an existing retrospective parent report measure of early language experience (the Language Exposure Assessment Tool) to make it suitable for use with DHH populations. We administered the adapted instrument (DHH Language Exposure Assessment Tool [D-LEAT]) to the caregivers of 105 DHH children aged 12 years and younger. To measure convergent validity, we also administered another novel instrument: the Language Access Profile Tool. To measure test–retest reliability, half of the participants were interviewed again after 1 month. We identified groups of children with similar language access profiles by using hierarchical cluster analysis.

Results: The D-LEAT revealed DHH children's diverse experiences with access to language during infancy and toddlerhood. Cluster analysis groupings were markedly different from those derived from more traditional grouping rules (e.g., communication modes). Test–retest reliability was good, especially for the same-interviewer condition. Content, convergent, and face validity were strong.

Conclusions: To optimize DHH children's developmental potential, stakeholders who work at the individual and population levels would benefit from replacing communication mode with language access profiles. The D-LEAT is the first tool that aims to measure this novel construct. Despite limitations that future work aims to address, the present results demonstrate that the D-LEAT represents progress over the status quo.

