k-Means+++: Outliers-Resistant Clustering

Algorithms ◽  
2020 ◽  
Vol 13 (12) ◽  
pp. 311
Author(s):  
Adiel Statman ◽  
Liat Rozenberg ◽  
Dan Feldman

The k-means problem is to compute a set of k centers (points) that minimizes the sum of squared distances to a given set of n points in a metric space. Arguably, the most common algorithm for solving it is k-means++, which is easy to implement and provides a provably small approximation error in time that is linear in n. We generalize k-means++ to support outliers in two senses (simultaneously): (i) nonmetric spaces, e.g., M-estimators, where the distance dist(p,x) between a point p and a center x is replaced by min{dist(p,x), c} for an appropriate constant c that may depend on the scale of the input; and (ii) k-means clustering with m≥1 outliers, i.e., where the m farthest points from any given k centers are excluded from the total sum of distances. This is done by using a simple reduction to (k+m)-means clustering (with no outliers).
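
To make the two notions of outlier resistance in this abstract concrete, here is a minimal Python sketch. It is not the authors' implementation and does not reproduce their reduction to (k+m)-means; it only illustrates D²-style seeding in the spirit of k-means++ with the capped distance min{dist(p,x), c}, and a cost that ignores the m farthest points. The function names, the default seed, and the sample data are all assumptions made for illustration.

```python
import math
import random


def truncated_sq_dist(p, x, c=math.inf):
    """Squared distance after capping the distance at c: min(dist(p, x), c) ** 2."""
    return min(math.dist(p, x), c) ** 2


def seed_centers(points, k, c=math.inf, rng=None):
    """Pick k centers by D^2 sampling (k-means++ style) under the capped distance.

    Hypothetical helper: the seed and the fallback for zero weights are
    illustrative choices, not taken from the paper.
    """
    rng = rng or random.Random(0)
    centers = [rng.choice(points)]
    while len(centers) < k:
        # Weight of a point = capped squared distance to its nearest chosen center.
        weights = [min(truncated_sq_dist(p, x, c) for x in centers) for p in points]
        if sum(weights) == 0:
            # Every point coincides with a chosen center: sample uniformly instead.
            centers.append(rng.choice(points))
        else:
            centers.append(rng.choices(points, weights=weights, k=1)[0])
    return centers


def cost_with_outliers(points, centers, m=0, c=math.inf):
    """Sum of capped squared distances, excluding the m farthest points."""
    per_point = sorted(
        min(truncated_sq_dist(p, x, c) for x in centers) for p in points
    )
    keep = max(0, len(per_point) - m)
    return sum(per_point[:keep])


# Made-up data: two tight groups and one far-away point.
points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (100.0, 100.0)]
centers = seed_centers(points, k=2, c=10.0)
print(cost_with_outliers(points, centers, m=1, c=10.0))
```

With c left at infinity and m = 0, the sketch reduces to ordinary k-means++ seeding and the ordinary k-means cost, which is one way to see both variants as generalizations of the same procedure.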

2013 ◽  
Vol 1 ◽  
pp. 200-231 ◽  
Author(s):  
Andrea C.G. Mennucci

In this paper we discuss asymmetric length structures and asymmetric metric spaces. A length structure induces a (semi)distance function; by using the total variation formula, a (semi)distance function induces a length. In the first part we identify a topology in the set of paths that best describes when the above operations are idempotent. As a typical application, we consider the length of paths defined by a Finslerian functional in Calculus of Variations. In the second part we generalize the setting of General metric spaces of Busemann, and discuss the newly found aspects of the theory: we identify three interesting classes of paths, and compare them; we note that a geodesic segment (as defined by Busemann) is not necessarily continuous in our setting; hence we present three different notions of intrinsic metric space.
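
For orientation, the two operations mentioned in this abstract can be written in their standard textbook form. The notation below (b for the asymmetric distance, γ for a path, Len and b^ℓ for the induced objects) is assumed here for illustration and may differ from the paper's own symbols.

```latex
% Length induced by a (semi)distance b via total variation.
% The order of the arguments matters because b need not be symmetric.
\[
  \operatorname{Len}^{b}(\gamma)
  \;=\;
  \sup_{\alpha = t_0 < t_1 < \dots < t_n = \beta}
  \;\sum_{i=1}^{n} b\bigl(\gamma(t_{i-1}),\, \gamma(t_i)\bigr),
  \qquad \gamma : [\alpha, \beta] \to M .
\]

% (Semi)distance induced by a length structure: infimum over admissible
% paths that start at x and end at y.
\[
  b^{\ell}(x, y)
  \;=\;
  \inf \bigl\{ \operatorname{Len}(\gamma) \;:\;
  \gamma(\alpha) = x,\ \gamma(\beta) = y \bigr\}.
\]
```

The question of idempotency raised in the abstract is then whether applying these two operations in succession returns the distance or length one started with.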


2016 ◽  
Vol 46 (1) ◽  
pp. 207-215
Author(s):  
F. Soleimany ◽  
M. Iranmanesh

2020 ◽  
Vol 2 (7) ◽  
pp. 91-99
Author(s):  
E. V. KOSTYRIN ◽  
M. S. SINODSKAYA

The article analyzes the impact of certain factors on the volume of environmental investment. Regression equations are constructed that describe the relationship between the volume of environmental investment and each influencing factor, and Pearson pairwise correlation coefficients are calculated between the dependent variable and the influencing factors, as well as among the influencing factors themselves. The average approximation error is determined for each regression equation. A correlation matrix is constructed and conclusions are drawn. The developed econometric model is implemented in the program for separate collection of municipal solid waste (MSW) in Moscow. The efficiency of the environmental investment management model is evaluated using the example of growth in planned investments in companies specializing in the export and processing of solid waste.
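
The three quantities named in this abstract can be sketched in a few lines of Python. This is an illustration only: the abstract does not give the article's formulas or data, so the sketch assumes a one-factor least-squares line and takes the average approximation error to be the mean absolute relative error in percent, a common convention. The function names and the sample numbers are made up.

```python
import math


def pearson(xs, ys):
    """Pearson pairwise correlation coefficient between two equal-length samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx


def mean_approximation_error(xs, ys, slope, intercept):
    """Mean of |y - y_hat| / |y| in percent (assumed definition; requires y != 0)."""
    errors = [abs((y - (slope * x + intercept)) / y) for x, y in zip(xs, ys)]
    return 100.0 * sum(errors) / len(errors)


# Made-up numbers: one influencing factor and the investment volume.
factor = [1.0, 2.0, 3.0, 4.0, 5.0]
investment = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(factor, investment)
print(pearson(factor, investment))
print(mean_approximation_error(factor, investment, slope, intercept))
```

Applying `pearson` to every pair of columns (dependent variable and factors) yields the entries of a correlation matrix of the kind the abstract mentions.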

