multivariate outliers
Recently Published Documents


TOTAL DOCUMENTS

75
(FIVE YEARS 15)

H-INDEX

15
(FIVE YEARS 2)

2021 ◽  
Author(s):  
Mutlu Yaşar ◽  
Fatih Dikbaş

The accuracy of descriptive statistics can be affected by outliers in a data set. An observation that would not be considered an outlier in the univariate case may still be a multivariate outlier. Identifying outliers can therefore make multivariate analysis more robust by allowing the required corrections to be made before modelling. This paper applies the two-dimensional correlation method to the detection of multivariate outliers among the observations of six precipitation stations in Turkey. The method considers the averages of parts of the series rather than the average of the whole series, which makes it possible to locate the outlier within the compared series. The results indicate that outlier analysis for hydrologic variables should consider two-directional behavior, and that the two-dimensional correlation method is a strong alternative for outlier and irregularity detection even when only a limited number of data are available. The 2DCorr software used in the study is freely provided as supplementary material.


2021 ◽  
Vol 6 (2) ◽  
pp. 199-215
Author(s):  
Nunik Kusnilawati ◽  
Aprih Santoso

This study analyzes and empirically examines the effect of personal characteristics, human capital, and social capital on entrepreneurial business performance, with entrepreneurial characteristics as the intervening variable. The sample comprised 182 respondents surveyed with a questionnaire. Assumption testing showed that the data did not deviate from normality. The multivariate outlier test used 22 degrees of freedom, giving a critical Mahalanobis distance of χ² (0.001; 22) = 48.268, from which it was concluded that the data contain no outliers. Residual testing after model respecification showed that the loading factors of all indicators were > 0.5 and that the standardized residuals were within ± 2.58, so the modified model was judged acceptable. Hypothesis testing showed that personal characteristics and human capital have a significant effect on entrepreneurial characteristics, and that human capital and business characteristics have a significant effect on business performance. However, there was no significant effect of (1) social capital on entrepreneurial characteristics or (2) personal characteristics on business performance. The novelty of the study is a business performance research model with entrepreneurial characteristics as the intervening variable, with the survey conducted during the economic crisis caused by the COVID-19 pandemic. Key words: personal characteristics, human capital, social capital, entrepreneurial characteristics, entrepreneurial business performance
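The critical value quoted above can be checked by hand. The following is a minimal sketch, not the authors' procedure: it assumes the usual convention of comparing squared Mahalanobis distances against a chi-square quantile, and it uses the closed-form upper-tail probability that holds only for even degrees of freedom. The function names are illustrative.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable.

    Uses the closed form exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!,
    which is valid only when df is a positive even integer.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k          # builds (x/2)^k / k! incrementally
        total += term
    return math.exp(-half) * total

def chi2_critical_even_df(alpha, df, lo=0.0, hi=1000.0):
    """Find x with P(X > x) = alpha by bisection (even df only)."""
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if chi2_sf_even_df(mid, df) > alpha:
            lo = mid              # tail still too heavy: move right
        else:
            hi = mid
    return (lo + hi) / 2.0

tail = chi2_sf_even_df(48.268, 22)          # tail area beyond the quoted value
critical = chi2_critical_even_df(0.001, 22)  # quantile recovered numerically
print(f"tail probability: {tail:.6f}, critical value: {critical:.3f}")
```

With 22 degrees of freedom and a 0.001 significance level, the tail probability at 48.268 should come out very close to 0.001, which is consistent with the value reported in the abstract.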


2021 ◽  
Author(s):  
Abduruhman Fahad Alajmi ◽  
Hmoud Al-Olimat ◽  
Reham Abu Ghaboush ◽  
Nada A. Al Buniaian

An online questionnaire was distributed to the target population (N = ~2000); 226 completed forms were received from respondents. Missing values in all variables did not exceed 6% of cases. The missing data were then analyzed with Little’s (1988) missing completely at random test. The results were not significant, χ² (59) = 73.340, p = .099, suggesting that the values were missing entirely by chance. Thus, the missing values in the dataset were estimated with the expectation–maximization algorithm. To examine outliers among cases, data were evaluated for univariate and multivariate outliers by examining the Mahalanobis distance for each participant. An outlier was defined as a case whose Mahalanobis distance exceeded the critical value (cv = 55.32); 31 cases (13%) were univariate or multivariate outliers (Tabachnick & Fidell, 2013; McLachlan, 1999).


Technologies ◽  
2020 ◽  
Vol 8 (4) ◽  
pp. 58
Author(s):  
Ramona Ruiz Blázquez ◽  
Mario Muñoz-Organero

Mobile devices have become smart computing platforms that incorporate a wide range of embedded sensors such as accelerometers, gyroscopes, barometers, GPS receivers, and magnetometers. Smartphones are valuable devices for gathering user-related data and transforming it into value-added information for the user. This study presents a novel mechanism for processing sensor data from mobile devices to detect the type of area a user is crossing while walking in an urban setting. The method combines outlier analysis and classification techniques applied to data collected by several pedestrians while traversing an urban environment. A theoretical framework, composed of methods for detecting multivariate outliers combined with supervised classification techniques, is proposed to identify different situations and physical barriers while walking. Each type of element to be detected is characterized by a feature vector computed from the detected outliers. Finally, a radial SVM is used for the classification task. The classifier is trained in a supervised way with data from 20 different segments containing several physical barriers and is later used to assign a class to new unlabelled data. The results obtained with this approach are very promising, with an average accuracy of around 95% when detecting different types of physical barriers.


2020 ◽  
pp. 084456212093205
Author(s):  
Maher M. El-Masri ◽  
Fabrice I. Mowbray ◽  
Susan M. Fox-Wasylyshyn ◽  
David Kanters

The presence of statistical outliers is a shared concern in research. If ignored or improperly handled, outliers have the potential to distort parameter estimates and possibly compromise the validity of research findings. The purpose of this paper is to provide a conceptual and practical overview of multivariate outliers with a focus on common techniques used to identify and manage multivariate outliers. Specifically, this paper discusses the use of Mahalanobis distance and residual statistics as common multivariate outlier identification techniques. It also discusses the use of leverage and Cook’s distance as two common techniques to determine the influence that multivariate outliers may have on statistical models. Finally, this paper discusses techniques that are commonly used to handle influential multivariate outlier cases.
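As a small illustration of Mahalanobis-distance screening, the sketch below flags a point that looks ordinary on each variable separately but breaks the joint pattern. It is a toy example under stated assumptions, not the procedure of any paper listed here: the data are made up, only two variables are used so the covariance matrix can be inverted by hand, and the cutoff is the chi-square quantile at α = 0.001 (for 2 degrees of freedom this is exactly −2 ln α).

```python
import math
import statistics

def mahalanobis_sq_2d(points):
    """Squared Mahalanobis distance of each (x, y) point from the sample mean."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Sample variances and covariance (n - 1 in the denominator).
    var_x = sum((x - mean_x) ** 2 for x, _ in points) / (n - 1)
    var_y = sum((y - mean_y) ** 2 for _, y in points) / (n - 1)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in points) / (n - 1)
    det = var_x * var_y - cov_xy ** 2
    if det <= 0:
        raise ValueError("covariance matrix is singular")
    result = []
    for x, y in points:
        dx, dy = x - mean_x, y - mean_y
        # d^T S^{-1} d with the 2x2 inverse written out explicitly.
        result.append((var_y * dx * dx - 2 * cov_xy * dx * dy + var_x * dy * dy) / det)
    return result

def z_scores(values):
    """Univariate z-scores using the sample standard deviation."""
    mean, sd = statistics.mean(values), statistics.stdev(values)
    return [(v - mean) / sd for v in values]

# Made-up data: 30 strongly correlated points plus one point (index 30)
# whose x and y are each well inside the observed ranges.
inliers = [(float(i), float(i + (7 * i) % 5 - 2)) for i in range(30)]
points = inliers + [(15.0, 0.0)]

alpha = 0.001                          # assumed significance level
critical = -2.0 * math.log(alpha)      # chi-square quantile for df = 2
distances = mahalanobis_sq_2d(points)
flagged = [i for i, d in enumerate(distances) if d > critical]

zx = z_scores([x for x, _ in points])
zy = z_scores([y for _, y in points])
print("flagged indices:", flagged)
print("univariate z-scores of last point:", round(zx[-1], 2), round(zy[-1], 2))
```

The last point is flagged by its Mahalanobis distance even though neither of its univariate z-scores is extreme. In practice a matrix library would handle more than two variables, and a robust covariance estimate is often preferred because the outliers themselves distort the mean and covariance.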


2020 ◽  
Vol 42 ◽  
pp. e16
Author(s):  
Josino José Barbosa ◽  
Anderson Ribeiro Duarte ◽  
Helgem Souza Ribeiro Martins

Methodologies for identifying multivariate outliers are extremely important in statistical analysis. Outliers may reveal relevant information about the variables under investigation. Statistical applications without prior identification of possible extreme values may yield controversial results and lead to mistaken decisions. In many contexts, outliers are points of great practical interest. Given this, this paper discusses methodologies for the detection of multivariate outliers, compared fairly and adequately through a simulation procedure. The comparison considers detection techniques based on Mahalanobis distance as well as a methodology based on cluster analysis. Sensitivity, specificity, and accuracy are used to measure the quality of each method. The computational time required to perform the procedures is also evaluated. The technique based on cluster analysis showed a noticeable superiority over the others in detection quality and also in execution time.
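The three quality metrics named above have standard definitions, which can be sketched as follows. This is an illustrative helper, not code from the paper; the function name and the example labels are made up.

```python
def detection_metrics(is_outlier, flagged):
    """Sensitivity, specificity, and accuracy of an outlier detector.

    is_outlier: ground-truth booleans (True = simulated outlier)
    flagged:    detector output booleans (True = flagged as outlier)
    """
    if len(is_outlier) != len(flagged):
        raise ValueError("inputs must have the same length")
    pairs = list(zip(is_outlier, flagged))
    tp = sum(1 for t, f in pairs if t and f)          # outliers caught
    fn = sum(1 for t, f in pairs if t and not f)      # outliers missed
    fp = sum(1 for t, f in pairs if not t and f)      # false alarms
    tn = sum(1 for t, f in pairs if not t and not f)  # clean points kept
    return {
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
        "accuracy": (tp + tn) / len(pairs) if pairs else float("nan"),
    }

# Made-up example: 3 true outliers among 10 observations.
truth = [True, True, False, False, False, False, False, False, True, False]
guess = [True, False, False, True, False, False, False, False, True, False]
metrics = detection_metrics(truth, guess)
print(metrics)
```

In a simulation study the ground truth is known because the outliers are planted, so these metrics can be averaged over many replications to compare detectors.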

