Euclidean distance versus Manhattan distance for skin detection using the SFA database

2022 ◽  
Vol 14 (1) ◽  
pp. 46
Author(s):  
Ouarda Soltani ◽  
Souad Benabdelkader
2020 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Shumpei Haginoya ◽  
Aiko Hanayama ◽  
Tamae Koike

Purpose: The purpose of this paper was to compare the accuracy of linking crimes using geographical proximity between three distance measures: Euclidean (distance measured by the length of a straight line between two locations), Manhattan (distance obtained by summing the north-south distance and the east-west distance) and shortest route distances.

Design/methodology/approach: A total of 194 cases committed by 97 serial residential burglars in Aomori Prefecture in Japan between 2004 and 2015 were used in the present study. The Mann–Whitney U test was used to compare linked (two offenses committed by the same offender) and unlinked (two offenses committed by different offenders) pairs for each distance measure. Discrimination accuracy between linked and unlinked crime pairs was evaluated using the area under the receiver operating characteristic curve (AUC).

Findings: The Mann–Whitney U test showed that the distances of the linked pairs were significantly shorter than those of the unlinked pairs for all distance measures. Comparison of the AUCs showed that the shortest route distance achieved significantly higher accuracy than the Euclidean distance, whereas there was no significant difference between the Euclidean and the Manhattan distance or between the Manhattan and the shortest route distance. These findings give partial support to the idea that distance measures taking the impact of environmental factors into consideration might be able to identify a crime series more accurately than Euclidean distances.

Research limitations/implications: Although the results suggested a difference between the Euclidean and the shortest route distance, it was small, and all distance measures resulted in outstanding AUC values, probably because of ceiling effects. Further investigation that makes the same comparison in a narrower area is needed to avoid this potential inflation of discrimination accuracy.

Practical implications: The shortest route distance might contribute to improving the accuracy of crime linkage based on geographical proximity. However, further investigation is needed before the shortest route distance can be recommended for use in practice. Given that the targeted area in the present study was relatively large, the findings may contribute especially to improving the accuracy of proactive comparative case analysis, which estimates the overall distribution of serial crimes in a region, by selecting a more effective distance measure.

Social implications: Improving the accuracy of linking crimes may contribute to assisting crime investigations and the earlier arrest of offenders.

Originality/value: The results of the present study provide an initial indication of the efficacy of using distance measures that take environmental factors into account.
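
The two planar measures defined in this abstract, and the AUC used to compare them, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: locations are taken to be (x, y) pairs in a projected coordinate system, the function names are hypothetical, and the shortest route distance is omitted because it requires road network data. The AUC is computed through its standard relation to the Mann–Whitney U statistic, that is, the share of (linked, unlinked) pairs in which the linked distance is shorter, with ties counted as one half.

```python
import math


def euclidean(p, q):
    """Straight-line distance between two (x, y) locations."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def manhattan(p, q):
    """Sum of the east-west and north-south distances."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def auc_shorter_is_linked(linked, unlinked):
    """AUC for the rule 'a shorter distance suggests a linked pair'.

    Equals U / (n_linked * n_unlinked), where U is the Mann-Whitney
    statistic; ties contribute 0.5.
    """
    wins = 0.0
    for d_linked in linked:
        for d_unlinked in unlinked:
            if d_linked < d_unlinked:
                wins += 1.0
            elif d_linked == d_unlinked:
                wins += 0.5
    return wins / (len(linked) * len(unlinked))
```

With made-up inputs, `auc_shorter_is_linked([1.0, 2.0], [3.0, 4.0])` gives 1.0 because every linked distance is shorter than every unlinked one; a value near 0.5 would mean the distance carries no linkage information.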


Author(s):  
Qibin Zhou ◽  
Qinggang Su ◽  
Peng Xiong

Assisted download is an effective way to address the insufficient coverage range of Wi-Fi access in VANET. To tackle the low utilization of time-space resources within blind areas and the unbalanced download services in VANET, this paper proposes an approximately globally optimal scheme, based on WebGIS, for selecting vehicles to assist downloads. The scheme uses two-dimensional matrices to define the time-space resource and the vehicle selection behavior respectively, uses a Markov Decision Process to solve the allocation of time-space resources within the blind area, and exploits the communication features of VANET to simplify the behavior space of vehicle selection and so reduce computational complexity. Euclidean distance and Manhattan distance serve as the basis of vehicle selection so that, when assisted download services are balanced, the target vehicles can effectively increase the total amount of user downloads. Experimental results show that, owing to the wider access range and platform independence of WebGIS, the total amount of downloads increases by more than 20% when the user's download services are relatively balanced. Moreover, WebGIS usually needs only a Web browser (sometimes with plug-ins) on the client side, so the system cost is greatly reduced.


2021 ◽  
Vol 25 (01) ◽  
pp. 80-91
Author(s):  
Saba K. Naji ◽  
Muthana H. Hamd ◽  

Advances in electronic technology have reinforced the need to establish people's identities, and a variety of identification methods and databases have emerged. In this paper, we compare the results of two texture analysis methods: Local Binary Pattern (LBP) and Local Ternary Pattern (LTP). The comparison is based on the facial texture features extracted for 40 and 401 subjects taken from the ORL and UFI databases, respectively. It also takes into account three distance measures: Manhattan Distance (MD), Euclidean Distance (ED), and Cosine Distance (CD). The maximum accuracy of the LBP method (99.23%) is obtained with the Manhattan distance on the ORL database, while the LTP method attains 98.76% using the same distance and database. The UFI facial database, by contrast, is of low quality and yields recognition rates of 75.98% and 73.82% using LBP and LTP, respectively, with the Manhattan distance.
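
As a rough illustration of the ingredients named in this abstract, the sketch below computes a basic 8-neighbour LBP code for one 3x3 grayscale patch and a cosine distance between two feature vectors. It is not the authors' pipeline: the clockwise bit ordering, the `>=` comparison and the function names are assumptions, and LTP (which adds a tolerance band around the centre value) is not shown.

```python
import math


def lbp_code(patch):
    """Basic LBP code of the centre pixel of a 3x3 patch (list of rows).

    Neighbours are visited clockwise from the top-left corner; this
    ordering is an assumption, as conventions differ between papers.
    """
    center = patch[1][1]
    neighbours = [
        patch[0][0], patch[0][1], patch[0][2],
        patch[1][2], patch[2][2], patch[2][1],
        patch[2][0], patch[1][0],
    ]
    code = 0
    for bit, value in enumerate(neighbours):
        if value >= center:  # neighbour at least as bright as the centre
            code |= 1 << bit
    return code


def cosine_distance(u, v):
    """One minus the cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)
```

In a typical recognition setup, histograms of such codes over image regions would form the feature vector, and a probe face would be assigned to the gallery subject at the smallest Manhattan, Euclidean or cosine distance.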


Author(s):  
Parag Jain

Most popular machine learning algorithms, such as k-nearest neighbour, k-means and SVM, use a metric to measure the distance (or similarity) between data instances. The performance of these algorithms clearly depends heavily on the metric being used. In the absence of prior knowledge about the data, we can only use general-purpose metrics such as Euclidean distance, cosine similarity or Manhattan distance, but these metrics often fail to capture the correct behaviour of the data, which directly affects the performance of the learning algorithm. The solution to this problem is to tune the metric according to the data and the problem. Manually deriving the metric for high-dimensional data, which is often difficult even to visualize, is not only tedious but extremely difficult. This motivates work on metric learning, which fits the metric to the geometry of the data. The goal of a metric learning algorithm is to learn a metric that assigns a small distance to similar points and a relatively large distance to dissimilar points.
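
To make the dependence on the metric concrete, here is a minimal sketch using a per-feature weighted Euclidean distance, one of the simplest parametrised families a metric learning method could tune. The weights below are set by hand purely for illustration; an actual metric learning algorithm would estimate them (or a more general transformation) from pairs labelled similar or dissimilar. The function names are hypothetical.

```python
import math


def weighted_euclidean(x, y, weights):
    """Euclidean distance with one positive weight per feature."""
    return math.sqrt(sum(w * (a - b) ** 2 for a, b, w in zip(x, y, weights)))


def nearest(query, points, weights):
    """Index of the point closest to `query` under the weighted distance."""
    return min(
        range(len(points)),
        key=lambda i: weighted_euclidean(query, points[i], weights),
    )


points = [(1.0, 0.0), (0.0, 3.0)]
query = (0.0, 0.0)

# Equal weights reduce to the ordinary Euclidean distance.
plain_choice = nearest(query, points, (1.0, 1.0))

# Stressing the first feature and discounting the second flips the result.
tuned_choice = nearest(query, points, (100.0, 0.01))
```

The same query gets a different nearest neighbour once the weights change, which is the effect a learned metric exploits in k-nearest neighbour or k-means.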


2021 ◽  
Vol 38 (6) ◽  
pp. 1843-1851
Author(s):  
Ouarda Soltani ◽  
Souad Benabdelkader

The human skin color image database called SFA, specifically designed to assist research in the area of face recognition, is a very important resource, particularly for the challenging task of skin detection. It has shown high performance compared with other existing databases. The SFA database provides multiple skin and non-skin samples, which can be combined with each other in various ways to create new samples that could be more useful and more effective. The present paper investigates this aspect by creating four new representative skin samples according to the four rules of minimum, maximum, mean and median. The obtained samples are then used for skin segmentation on the basis of the well-known Euclidean and Manhattan distance metrics. The performance of the new representative skin samples is then compared with that of the skin samples originally provided by SFA. Simulation results on both the SFA and UTD (University of Texas at Dallas) color face databases indicate that detection rates higher than 92% can be achieved with either measure.
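
The idea of combining samples into a representative and then thresholding a distance can be sketched as follows. Several details are assumptions made only for illustration, since the abstract does not give them: each skin sample is reduced to a single RGB triple, the four rules are applied channel by channel, and a pixel is labelled skin when its distance to the representative is at most a hypothetical threshold.

```python
import statistics

# The four combination rules named in the abstract.
RULES = {
    "min": min,
    "max": max,
    "mean": statistics.mean,
    "median": statistics.median,
}


def representative(samples, rule):
    """Combine RGB samples channel by channel with the chosen rule."""
    combine = RULES[rule]
    return tuple(combine(channel) for channel in zip(*samples))


def euclidean(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5


def manhattan(p, q):
    return sum(abs(a - b) for a, b in zip(p, q))


def is_skin(pixel, reference, distance, threshold):
    """Label a pixel as skin if it lies close enough to the reference."""
    return distance(pixel, reference) <= threshold


# Made-up sample colours, not taken from SFA.
samples = [(200, 150, 120), (220, 170, 140), (210, 160, 130)]
reference = representative(samples, "mean")
```

Swapping `manhattan` for `euclidean` in `is_skin` changes the shape of the accepted region in color space, so the threshold would normally be tuned separately for each measure.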


Repositor ◽  
2020 ◽  
Vol 2 (7) ◽  
pp. 945
Author(s):  
Adnan Burhan Hidayat Kiat ◽  
Yufiz Azhar ◽  
Vinna Rahmayanti

Customer segmentation helps a company make future decisions more easily. In this study the data came from an automotive company, PT Hasjrat Abadi Ambon, and consist of transaction data and motor vehicle customer data. The RFM model groups customers based on the values of the Recency, Frequency and Monetary variables. The result of the RFM model is a new status for each customer, on a scale from best to worst. Customers who have been assigned a status are then grouped into several clusters using the K-Means method. The Elbow method is applied to determine the optimal number of clusters. Two distance measures are used in forming the clusters, Euclidean Distance and Manhattan Distance, and the quality of the resulting clusters is compared using the Silhouette Coefficient. The result of this study is a data set divided into 5 groups, obtained by running five tests to determine the best centroids. The best clustering is visualized to make decision-making easier for the company. Based on the Silhouette Coefficient, the better measure is Manhattan Distance, with an s(i) value of 0.152695.
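
The comparison step described here can be sketched as a silhouette computation that accepts either distance. This is a minimal illustration, not the study's code: the function names are hypothetical, the toy points are made up, and the clustering itself (K-Means, plus the Elbow method for choosing the number of clusters) is assumed to have been run already, so only the labels are passed in.

```python
def euclidean(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5


def manhattan(p, q):
    return sum(abs(a - b) for a, b in zip(p, q))


def silhouette(points, labels, distance):
    """Mean silhouette coefficient s(i) over all points."""
    scores = []
    for i, p in enumerate(points):
        # Distances from point i to every other point, grouped by cluster.
        by_cluster = {}
        for j, q in enumerate(points):
            if j != i:
                by_cluster.setdefault(labels[j], []).append(distance(p, q))
        own = by_cluster.pop(labels[i], None)
        if not own or not by_cluster:
            # Singleton cluster or a single cluster overall: s(i) = 0.
            scores.append(0.0)
            continue
        a = sum(own) / len(own)  # mean distance within own cluster
        b = min(sum(d) / len(d) for d in by_cluster.values())  # nearest other
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return sum(scores) / len(scores)


points = [(0, 0), (0, 1), (10, 0), (10, 1)]
good = silhouette(points, [0, 0, 1, 1], manhattan)
bad = silhouette(points, [0, 1, 0, 1], manhattan)
```

Values close to 1 indicate compact, well-separated clusters and negative values indicate points that sit closer to another cluster, so a score such as the 0.152695 reported above points to weakly separated groups.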

