Detection of Outliers through Influence Function on Affinity

2007 ◽  
Vol 6 (2) ◽  
pp. 34-44
Author(s):  
P. Rajalakshmi ◽  
P. Geetha

Outliers are atypical observations that lie at abnormal distances from the other observations in a random sample. Such outliers are often seen as contaminating the data. In general, rejecting influential outliers improves the accuracy of the estimators, so the identification of outliers has become one of the most important aspects of any data analysis. Outlier detection finds applications in areas such as data cleaning, fraud detection, network intrusion, pharmaceutical research and exploration in scientific databases. Distance-based outlier detection is the most commonly used method. In this paper, the influence function for affinity is explained, and the detection of outliers in classification problems using this influence function is illustrated for univariate data through a few examples.
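As a minimal sketch of the distance-based idea mentioned in this abstract (not the influence-function method of the paper itself), the following Python code scores each univariate observation by the distance to its k-th nearest neighbour and flags scores above a cut-off. The function names, the choice k = 2, the threshold and the sample values are all illustrative assumptions.

```python
def knn_distance_scores(values, k=2):
    """Return, for each observation, the distance to its k-th nearest neighbour."""
    if k < 1 or k >= len(values):
        raise ValueError("k must satisfy 1 <= k < len(values)")
    scores = []
    for i, x in enumerate(values):
        # Distances from x to every other observation, smallest first.
        dists = sorted(abs(x - y) for j, y in enumerate(values) if j != i)
        scores.append(dists[k - 1])
    return scores


def flag_outliers(values, k=2, threshold=3.0):
    """Flag observations whose k-NN distance exceeds a user-chosen threshold."""
    return [s > threshold for s in knn_distance_scores(values, k)]


# Illustrative sample: five values near 10 and one far away.
sample = [9.8, 10.1, 10.0, 9.9, 10.2, 17.5]
scores = knn_distance_scores(sample, k=2)
flags = flag_outliers(sample, k=2, threshold=3.0)
```

In this sketch only the last observation is flagged, because every other point has at least two close neighbours. The threshold has to be chosen for the data at hand; nothing here fixes it.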

2020 ◽  
Vol 9 (3) ◽  
pp. 336-345
Author(s):  
Alvi Waldira ◽  
Abdul Hoyyi ◽  
Dwi Ispriyanti

Transportation has a strategic role and has become one of the community's main needs, especially air transportation services. The number of air passengers varies from month to month. One such variation occurs around Eid al-Fitr, whose date changes every year because it follows the Islamic calendar rather than the Gregorian (Masehi) calendar. This shift in the date of Eid al-Fitr forms a pattern called calendar variation. The effects of calendar variation can be handled by adding a variable, such as a dummy variable, to an ARIMAX model. Time series observations are also often affected by unexpected events that appear as outliers, which make the results of data analysis less valid, so outlier detection is added in this study. Based on the analysis, an ARIMA calendar variation model of order (1, 0, [12]) is obtained, with a time variable t, a dummy variable, and one added outlier. This model has a MAPE value of 0.07079609, which means it is very good for forecasting. The forecasts show an increase in the number of passengers during the two months before Eid. Keywords: Passenger, calendar variation, outlier detection
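Two ingredients of this abstract can be sketched in a few lines of Python: a 0/1 dummy regressor marking the periods affected by the holiday, and the MAPE used to judge the forecasts. This is only an illustration under stated assumptions: the function names, the period indices and the numbers are hypothetical, and the abstract does not say whether its MAPE is reported as a fraction or as a percentage (the sketch returns a fraction).

```python
def mape(actual, forecast):
    """Mean absolute percentage error, returned as a fraction (0.07 means 7%)."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    return sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)


def holiday_dummy(n_periods, holiday_periods):
    """Return a 0/1 regressor that is 1 in periods containing the holiday."""
    marked = set(holiday_periods)
    return [1 if t in marked else 0 for t in range(n_periods)]


# Illustrative numbers only, not data from the study.
actual = [100.0, 120.0, 150.0, 110.0]
forecast = [110.0, 114.0, 135.0, 110.0]
error = mape(actual, forecast)

# Hypothetical: the holiday falls in periods 2 and 5 of a six-period series.
dummy = holiday_dummy(6, holiday_periods=[2, 5])
```

The dummy series would be passed as an exogenous regressor to whatever ARIMAX fitting routine is used; the fitting itself is outside the scope of this sketch.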


Author(s):  
P. Ingram

It is well established that unique physiological information can be obtained by rapidly freezing cells in various functional states and analyzing the cell element content and distribution by electron probe x-ray microanalysis. (The other techniques of microanalysis that are amenable to imaging, such as electron energy loss spectroscopy, secondary ion mass spectroscopy, particle induced x-ray emission etc., are not addressed in this tutorial.) However, the usual processes of data acquisition are labor intensive and lengthy, requiring that x-ray counts be collected from individually selected regions of each cell in question and that data analysis be performed subsequent to data collection. A judicious combination of quantitative elemental maps and static raster probes not only adds an overall perception of what is occurring during a particular biological manipulation or event, but also substantially increases data productivity. Recent advances in microcomputer instrumentation and software have made the acquisition and processing of digital quantitative x-ray maps of one to several cells readily feasible.


2017 ◽  
Vol 2 (2) ◽  
pp. 155-168 ◽  
Author(s):  
David Wong

This research analyzes (1) the effect of a vendor’s ability, benevolence, and integrity on e-commerce customers’ trust in UBM; (2) the effect of a vendor’s ability, benevolence, and integrity on the level of e-commerce customers’ participation in Indonesia; and (3) the effect of trust on the level of e-commerce customers’ participation in UBM. The research samples are UBM e-commerce users, and data are collected with a Likert-scale questionnaire sent to 200 respondents. A Structural Equation Model is used for data analysis. Of the three predictor variables (ability, benevolence, and integrity), only the vendor’s integrity has a positive and significant effect on customers’ trust. In turn, only the vendor’s integrity and customers’ trust have a positive and significant effect on e-commerce customers’ participation in UBM. Keywords: e-commerce customers’ participation, ability, benevolence, integrity


2021 ◽  
Vol 5 (1) ◽  
pp. 41
Author(s):  
Christos Katris

This paper studies whether and how the COVID-19 situation affected the unemployment rate in Greece. To this end, a vector autoregression (VAR) model is employed and data analysis is carried out. A further question is whether the situation affected female and youth (under 25 years old) unemployment more heavily than overall unemployment. To predict the future impact of COVID-19 on these variables, the impulse response function is used. Furthermore, the impact of the pandemic in Greece is compared with that in other European countries for overall, female, and youth unemployment rates. Finally, the forecasting ability of the model is compared with that of univariate ARIMA and ANN models.
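To make the impulse response idea concrete, here is a minimal Python sketch for the simplest case, a VAR(1) model y_t = A y_{t-1} + e_t, where the response h periods after a one-unit shock is given by the matrix power A^h. The lag order, the coefficient matrix and the function names are assumptions chosen for illustration; they are not the model or the estimates of the paper, and the responses are non-orthogonalised.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def impulse_responses(coef, horizons):
    """Return [A^0, A^1, ..., A^horizons] for a VAR(1) with coefficient matrix A.

    Entry [h][i][j] is the response of variable i, h periods after a
    one-unit shock to variable j (non-orthogonalised responses).
    """
    n = len(coef)
    current = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    out = [current]
    for _ in range(horizons):
        current = mat_mul(coef, current)
        out.append(current)
    return out


# Hypothetical coefficients for two series; not estimates from the paper.
A = [[0.5, 0.1],
     [0.2, 0.3]]
irf = impulse_responses(A, horizons=2)
```

Here `irf[2][0][1]` would be read as the effect on the first series, two periods later, of a unit shock to the second series. With a stable coefficient matrix these responses die out as the horizon grows.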


Algorithms ◽  
2021 ◽  
Vol 14 (1) ◽  
pp. 18
Author(s):  
Michael Li ◽  
Santoso Wibowo ◽  
Wei Li ◽  
Lily D. Li

Extreme learning machine (ELM) is a popular randomization-based learning algorithm that provides a fast solution for many regression and classification problems. In this article, we present a method based on ELM for solving the spectral data analysis problem, which is essentially a class of inverse problems. It requires determining the structural parameters of a physical sample from the given spectroscopic curves. We propose that the unknown target inverse function be approximated by an ELM with an added linear neuron that corrects the localized effect caused by Gaussian basis functions. Unlike conventional methods involving intensive numerical computations, under the new conceptual framework the task of performing spectral data analysis becomes a task of learning from data. As spectral data are typically high-dimensional, the dimensionality reduction technique of principal component analysis (PCA) is applied to reduce the dimension of the dataset and ensure convergence. The proposed conceptual framework is illustrated using a set of simulated Rutherford backscattering spectra. The results show that the proposed method can achieve prediction inaccuracies of less than 1%, which outperforms the predictions from the multi-layer perceptron and numerical-based techniques. The presented method could be implemented as application software for real-time spectral data analysis by integrating it into a spectroscopic data collection system.


2021 ◽  
Vol 9 (2) ◽  
pp. 88-104
Author(s):  
Devis Tuia ◽  
Ribana Roscher ◽  
Jan Dirk Wegner ◽  
Nathan Jacobs ◽  
Xiaoxiang Zhu ◽  
...  

2021 ◽  
Author(s):  
Yi-Ting Chen ◽  

The homogeneity of a product or sample affects whether it meets its intended scope of application and purpose. For example, reference materials (RM) produced by a reference material producer (RMP) and proficiency test items selected by a proficiency testing provider (PTP) should be shown to have a certain degree of homogeneity, to ensure that they have consistent characteristics or comparability. Before a homogeneity assessment can be performed, the characteristic parameters of the reference materials or proficiency test items must be measured to obtain enough measured values for data analysis, but these values may contain outliers that affect the data analysis and the interpretation of the results. Therefore, this article refers to ASTM E178-16a:2016[1], ISO 5725-2:1994[2], ISO 13528:2015[3], etc., to introduce several outlier detection and homogeneity assessment methods, supplemented by case studies. Finally, the article notes precautions for using these methods, so that readers can choose an appropriate method for their own analysis.
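The abstract does not say which outlier tests the article presents. As one illustrative possibility, the sketch below computes the Grubbs statistic G = max |x_i - mean| / s for a single suspect value, a test of the kind commonly associated with such standards. The function names and sample values are hypothetical, and the critical value is deliberately left to the caller, who would look it up in a published table for the sample size and significance level in use.

```python
import statistics


def grubbs_statistic(values):
    """Return (G, index) where G = max |x_i - mean| / s and index locates it."""
    if len(values) < 3:
        raise ValueError("need at least three observations")
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1)
    if sd == 0:
        raise ValueError("all observations are identical")
    index = max(range(len(values)), key=lambda i: abs(values[i] - mean))
    return abs(values[index] - mean) / sd, index


def is_grubbs_outlier(values, critical_value):
    """Compare G with a critical value the caller looks up for n and alpha."""
    g, index = grubbs_statistic(values)
    return g > critical_value, index


# Illustrative measurements only; 15.0 is the suspect value.
measurements = [9.0, 10.0, 11.0, 10.0, 15.0]
g, suspect = grubbs_statistic(measurements)

# 1.5 is a placeholder cut-off, NOT a tabulated critical value.
flagged, _ = is_grubbs_outlier(measurements, critical_value=1.5)
```

Keeping the critical value as an explicit argument avoids hard-coding table entries and makes the significance level a visible choice of the analyst.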


2015 ◽  
Vol 32 (22) ◽  
pp. 224019 ◽  
Author(s):  
A S Silbergleit ◽  
J W Conklin ◽  
M I Heifetz ◽  
T Holmes ◽  
J Li ◽  
...  

2018 ◽  
Vol 2018 ◽  
pp. 1-13 ◽  
Author(s):  
Laura Millán-Roures ◽  
Irene Epifanio ◽  
Vicente Martínez

A functional data analysis (FDA) based methodology for detecting anomalous flows in urban water networks is introduced. Primary hydraulic variables are recorded in real time by telecontrol systems, so they are functional data (FD). In the first stage, the data are validated (false data are detected) and reconstructed, since there could be not only false data but also missing and noisy data. FDA tools such as tolerance bands for FD and smoothing for dense and sparse FD are used. In the second stage, functional outlier detection tools are used in two phases. In Phase I, the data are cleared of anomalies to ensure that they are representative of the in-control system. The objective of Phase II is system monitoring. A new functional outlier detection method based on archetypal analysis is also proposed. The methodology is applied and illustrated with real data. A simulation study is also carried out to assess the performance of the outlier detection techniques, including our proposal. The results are very promising.

