minimum volume ellipsoid
Recently Published Documents


TOTAL DOCUMENTS

39
(FIVE YEARS 12)

H-INDEX

9
(FIVE YEARS 1)

2021 ◽  
Author(s):  
Antonio A. M. Raposo ◽  
Valeska Martins de Souza ◽  
Luís Roberto A. G. Filho

Crystals ◽  
2021 ◽  
Vol 11 (12) ◽  
pp. 1539
Author(s):  
Mateusz Banach

A computer algorithm for assessment of globularity of protein structures is presented. By enclosing the input protein in a minimum volume ellipsoid (MVEE) and calculating a profile measuring how voxelized space within this shape (cubes on a uniform grid) is occupied by atoms, it is possible to estimate how well the molecule resembles a globule. For any protein to satisfy the proposed globularity criterion, its ellipsoid profile (EP) should first confirm that atoms adequately fill the ellipsoid’s center. This property should then propagate towards the surface of the ellipsoid, although with diminishing importance. It is not required to compute the molecular surface. Globular status (full or partial) is assigned to proteins with values of their ellipsoid profiles, called here the ellipsoid indexes (EI), above certain levels. Due to structural outliers which may considerably distort the measurements, a companion method for their detection and reduction of their influence is also introduced. It is based on kernel density estimation and is shown to work well as an optional input preparation step for MVEE. Finally, the complete workflow is applied to over two thousand representatives of SCOP 2.08 domain superfamilies, surveying the landscape of tertiary structure of proteins from the Protein Data Bank.
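The abstract does not say how the enclosing ellipsoid is computed. As an illustration only, the following is a minimal sketch of Khachiyan's iterative algorithm, one standard way to approximate a minimum volume enclosing ellipsoid; it is not necessarily the method used in the paper. The names `inverse`, `quad_form`, `mvee`, `tol`, and `max_iter` are hypothetical, and the sketch assumes the points do not all lie in a common hyperplane.

```python
import math

def inverse(a):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    m = [list(row) + [float(i == j) for j in range(n)] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        scale = m[col][col]
        m[col] = [v / scale for v in m[col]]
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [v - f * w for v, w in zip(m[r], m[col])]
    return [row[n:] for row in m]

def quad_form(a, v):
    """Return v^T a v."""
    n = len(v)
    return sum(v[i] * a[i][j] * v[j] for i in range(n) for j in range(n))

def mvee(points, tol=1e-4, max_iter=100000):
    """Approximate the enclosing ellipsoid (x - c)^T A (x - c) <= 1; returns (A, c)."""
    n, d = len(points), len(points[0])
    q = [list(p) + [1.0] for p in points]  # points lifted to d + 1 dimensions
    u = [1.0 / n] * n                      # one weight per point, summing to 1
    for _ in range(max_iter):
        x = [[sum(u[k] * q[k][i] * q[k][j] for k in range(n))
              for j in range(d + 1)] for i in range(d + 1)]
        x_inv = inverse(x)
        m = [quad_form(x_inv, qk) for qk in q]
        j = max(range(n), key=m.__getitem__)  # point that sticks out the most
        step = (m[j] - d - 1) / ((d + 1) * (m[j] - 1))
        new_u = [(1 - step) * uk for uk in u]
        new_u[j] += step
        err = math.sqrt(sum((a - b) ** 2 for a, b in zip(new_u, u)))
        u = new_u
        if err <= tol:  # weights have stopped moving
            break
    c = [sum(u[k] * points[k][i] for k in range(n)) for i in range(d)]
    s = [[sum(u[k] * points[k][i] * points[k][j] for k in range(n)) - c[i] * c[j]
          for j in range(d)] for i in range(d)]
    a = [[v / d for v in row] for row in inverse(s)]
    return a, c
```

A point `p` lies inside the returned ellipsoid when `quad_form(A, [pi - ci for pi, ci in zip(p, c)])` is at most 1, up to the tolerance of the iteration. A voxel-occupancy profile of the kind the abstract describes could then be built by testing grid cells against this inequality.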


PeerJ ◽  
2021 ◽  
Vol 9 ◽  
pp. e11436
Author(s):  
Thomas R. Etherington

The Mahalanobis distance is a statistical technique that has been used in statistics and data science for data classification and outlier detection, and in ecology to quantify species-environment relationships in habitat and ecological niche models. Mahalanobis distances are based on the location and scatter of a multivariate normal distribution, and can measure how distant any point in space is from the centre of this kind of distribution. Three different methods for calculating the multivariate location and scatter are commonly used: the sample mean and variance-covariance, the minimum covariance determinant, and the minimum volume ellipsoid. The minimum covariance determinant and minimum volume ellipsoid were developed to be robust to outliers by minimising the multivariate location and scatter for a subset of the full sample, with the proportion of the full sample forming the subset being controlled by a user-defined parameter. This outlier robustness means the minimum covariance determinant and the minimum volume ellipsoid are highly relevant for ecological niche analyses, which are usually based on natural history observations that are likely to contain errors. However, natural history observations will also contain extreme bias, to which the minimum covariance determinant and the minimum volume ellipsoid will also be sensitive. To provide guidance for selecting and parameterising a multivariate location and scatter method, a series of virtual ecological niche modelling experiments were conducted to demonstrate the performance of each multivariate location and scatter method under different levels of sample size, errors, and bias. The results show that there is no optimal modelling approach, and that choices need to be made based on the individual data and question. The sample mean and variance-covariance method will perform best on very small sample sizes if the data are free of error and bias. At larger sample sizes the minimum covariance determinant and minimum volume ellipsoid methods perform as well or better, but only if they are appropriately parameterised. Modellers who are more concerned about the prevalence of errors should retain a smaller proportion of the full data set, while modellers more concerned about the prevalence of bias should retain a larger proportion of the full data set. I conclude that Mahalanobis distances are a useful niche modelling technique, but only for questions relating to the fundamental niche of a species where the assumption of multivariate normality is reasonable. Users of the minimum covariance determinant and minimum volume ellipsoid methods must also clearly report their parameterisations so that the results can be interpreted correctly.
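To make the first of the three methods concrete, here is a minimal sketch of a Mahalanobis distance computed from the sample mean and variance-covariance matrix. The function names `mean_and_covariance`, `solve`, and `mahalanobis` are hypothetical and are not taken from the paper. The robust variants mentioned in the abstract would swap in a location and scatter estimated from a subset of the sample, which this sketch does not attempt.

```python
import math

def mean_and_covariance(points):
    """Sample mean vector and (n - 1)-normalised variance-covariance matrix."""
    n, d = len(points), len(points[0])
    mean = [sum(p[j] for p in points) / n for j in range(d)]
    cov = [[sum((p[i] - mean[i]) * (p[j] - mean[j]) for p in points) / (n - 1)
            for j in range(d)] for i in range(d)]
    return mean, cov

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    d = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(d):
        pivot = max(range(col, d), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, d):
            f = m[r][col] / m[col][col]
            for k in range(col, d + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * d
    for r in reversed(range(d)):
        tail = sum(m[r][k] * x[k] for k in range(r + 1, d))
        x[r] = (m[r][d] - tail) / m[r][r]
    return x

def mahalanobis(x, mean, cov):
    """Distance of x from mean, scaled by the scatter matrix cov."""
    diff = [xi - mi for xi, mi in zip(x, mean)]
    z = solve(cov, diff)  # z = cov^{-1} diff without forming the inverse
    return math.sqrt(sum(di * zi for di, zi in zip(diff, z)))
```

For the four points (0, 0), (2, 0), (0, 2), (2, 2), the mean is (1, 1) and the covariance is 4/3 times the identity, so the point (2, 2) has a squared distance of 1.5 from the centre.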


2020 ◽  
pp. 198-205
Author(s):  
Pavel Shcherbakov

We propose six heuristic methods for finding an approximate solution to the following combinatorial problem: Given N points in n-dimensional space, find the minimum-size ellipsoid covering exactly N - k of them, where k is much less than N. Various assumptions on the nature of the points and their number are considered; the results of illustrative numerical experiments with the algorithms are discussed.


2020 ◽  
Vol 28 (4) ◽  
Author(s):  
Habshah Midi ◽  
Jayanthi Arasan ◽  
Hassan Uraibi ◽  
Hasan Talib Hendi

High Leverage Points (HLPs) are outlying observations in the X-direction. It is imperative to detect HLPs because their presence affects the computed values of various estimates. It is now evident that the Diagnostic Robust Generalized Potential based on the Minimum Volume Ellipsoid (DRGP(MVE)) is capable of detecting multiple HLPs. However, its computational running time is very long. Another diagnostic measure, based on Index Set Equality and denoted DRGP(ISE), was put forward with the main aim of reducing the running time. Nonetheless, it is computationally unstable and still suffers from masking and swamping effects. Hence, in this paper, we propose another version of the diagnostic measure, based on Reweighted Fast Consistent and High Breakdown (RFCH) estimators. We call this measure Diagnostic Robust Generalized Potential based on √n RFCH, denoted DRGP(RFCH). The results of a simulation study and real data indicate that our proposed method outperforms DRGP(MVE) and DRGP(ISE), with the least computing time, the highest percentage of correct detection of HLPs, and the smallest percentages of swamping and masking effects.
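As background for what a leverage point is, the sketch below computes classical hat-matrix leverages for a regression with an intercept and a single predictor, where h_i = 1/n + (x_i - mean)^2 / Sxx. This is the non-robust baseline, not the DRGP measure described in the abstract. The function names and the 2p/n cutoff (a common rule of thumb) are assumptions made for illustration.

```python
def leverages(x):
    """Hat-matrix diagonal for simple linear regression with an intercept."""
    n = len(x)
    mean = sum(x) / n
    sxx = sum((xi - mean) ** 2 for xi in x)
    return [1.0 / n + (xi - mean) ** 2 / sxx for xi in x]

def flag_high_leverage(x, p=2):
    """Indices whose leverage exceeds the 2p/n rule of thumb (p = number of coefficients)."""
    cutoff = 2.0 * p / len(x)
    return [i for i, h in enumerate(leverages(x)) if h > cutoff]
```

With `x = [1, 2, 3, 4, 20]` the leverages sum to 2 (the number of coefficients) and only the last observation exceeds the cutoff of 0.8. Because the mean and Sxx are themselves pulled by outlying values, several HLPs can hide one another, which is the masking effect that robust diagnostics aim to avoid.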


2020 ◽  
Vol 7 (3) ◽  
pp. 12-29
Author(s):  
M. Fevzi Esen

Insider trading is one of the most common deceptive trading practices in securities markets. Data mining appears to be an effective approach to tackling fraud detection problems with high accuracy. In this study, the authors aim to detect outlying insider transactions depending on the variables affecting insider trading profitability. 1,241,603 sales and purchases by insiders, ranging from 2010 to 2017, are analyzed using classical and robust outlier detection methods. They compute robust distance scores based on minimum volume ellipsoid, Stahel-Donoho, and fast minimum covariance determinant estimators. To investigate the outlying observations that are likely to be fraudulent, they employ event study analysis to measure abnormal returns of outlying transactions. The results are compared to the abnormal returns of non-outlying transactions. They find that outlying transactions gain higher abnormal returns than transactions that are not flagged as outliers. Business intelligence and analytics may be a useful strategy for companies to detect and prevent financial fraud.


2019 ◽  
Vol 8 (4) ◽  
pp. 439-450
Author(s):  
Jeffri Nelwin J. O. Siburian ◽  
Rita Rahmawati ◽  
Abdul Hoyyi

Robust principal component regression with the S-estimator is principal component regression that applies a robust approach in the principal component analysis and the S-estimator in the regression analysis. Its aim is to overcome multicollinearity problems in Ordinary Least Squares (OLS) multiple linear regression and outlier problems in principal component regression, so as to obtain the most effective model. Minimum Volume Ellipsoid (MVE) is one of the robust approaches that can be applied in principal component analysis, and the S-estimator is one of the estimation methods that can be applied in principal component regression analysis. The case in this study is the factors that influence the Number of Unemployment in Central Java in 2017. The model that most effectively handles multicollinearity and outliers in this case study is robust principal component regression MVE-(S-Estimator), with an Adjusted R2 value of 0.9615 and an RSE value of 0.4073. Keywords: Robust Principal Component Regression S-Estimator, Multicollinearity, Outliers, Minimum Volume Ellipsoid (MVE), Number of Unemployment.

