Outlier detection in linear models: a comparative study in simple linear regression

1986 ◽  
Vol 15 (12) ◽  
pp. 3589-3597 ◽  
Author(s):  
Uditha Balasooriya ◽  
Y.K. Tse
PeerJ ◽  
2016 ◽  
Vol 4 ◽  
pp. e2662 ◽  
Author(s):  
Christiaan W. Winterbach ◽  
Sam M. Ferreira ◽  
Paul J. Funston ◽  
Michael J. Somers

Background: The range, population size and trend of large carnivores are important parameters to assess their status globally and to plan conservation strategies. One can use linear models to assess population size and trends of large carnivores from track-based surveys on suitable substrates. The conventional approach of a linear model with intercept may not pass through the origin, but may fit the data better than a linear model through the origin. We assess whether a linear regression through the origin is more appropriate than a linear regression with intercept to model large African carnivore densities and track indices. Methods: We fitted simple linear regression with intercept and simple linear regression through the origin, and used the confidence interval for ß in the linear model y = αx + ß, Standard Error of Estimate, Mean Squares Residual and Akaike Information Criteria to evaluate the models. Results: The Lion on Clay and Low Density on Sand models with intercept were not significant (P > 0.05). The other four models with intercept and the six models through the origin were all significant (P < 0.05). The models using linear regression with intercept all included zero in the confidence interval for ß, and the null hypothesis that ß = 0 could not be rejected. All models showed that the linear model through the origin provided a better fit than the linear model with intercept, as indicated by the Standard Error of Estimate and Mean Square Residuals. Akaike Information Criteria showed that linear models through the origin were better and that none of the linear models with intercept had substantial support. Discussion: Our results showed that linear regression through the origin is justified over the more typical linear regression with intercept for all models we tested. A general model can be used to estimate large carnivore densities from track densities across species and study areas. The formula observed track density = 3.26 × carnivore density can be used to estimate densities of large African carnivores using track counts on sandy substrates in areas where carnivore densities are 0.27 carnivores/100 km² or higher. To improve the current models, we need independent data to validate the models and data to test for a non-linear relationship between track indices and true density at low densities.
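The comparison described in this abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' analysis: the data values are hypothetical, the function names are invented for the example, and the AIC used is the uncorrected Gaussian least-squares form (the paper may have used a small-sample correction).

```python
import math

def fit_with_intercept(x, y):
    """Least squares for y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def fit_through_origin(x, y):
    """Least squares for y = slope * x (intercept fixed at zero)."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)

def residual_sum_of_squares(x, y, slope, intercept=0.0):
    return sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))

def aic(rss, n, n_params):
    """Gaussian least-squares AIC, up to an additive constant."""
    return n * math.log(rss / n) + 2 * n_params

# Hypothetical data, NOT from the paper: carnivore density vs. track density.
carnivore_density = [1.0, 2.0, 3.0, 4.0, 5.0]
track_density = [3.1, 6.7, 9.6, 13.2, 16.3]
n = len(carnivore_density)

slope_i, intercept_i = fit_with_intercept(carnivore_density, track_density)
slope_o = fit_through_origin(carnivore_density, track_density)

rss_i = residual_sum_of_squares(carnivore_density, track_density, slope_i, intercept_i)
rss_o = residual_sum_of_squares(carnivore_density, track_density, slope_o)

# Mean square residual divides by the residual degrees of freedom.
msr_i = rss_i / (n - 2)
msr_o = rss_o / (n - 1)

# Parameter counts include the error variance: 3 with intercept, 2 without.
aic_i = aic(rss_i, n, 3)
aic_o = aic(rss_o, n, 2)

print(f"with intercept: y = {slope_i:.3f}x + {intercept_i:.3f}, MSR = {msr_i:.4f}, AIC = {aic_i:.2f}")
print(f"through origin: y = {slope_o:.3f}x, MSR = {msr_o:.4f}, AIC = {aic_o:.2f}")
```

The model with intercept can never have a larger residual sum of squares than the model through the origin, so the comparison only becomes informative once degrees of freedom (mean square residual) or a parameter penalty (AIC) are taken into account.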


2020 ◽  
pp. 65-92
Author(s):  
Bendix Carstensen

This chapter evaluates regression models, focusing on the normal linear regression model. The normal linear regression model establishes a relationship between a quantitative response (also called outcome or dependent) variable, assumed to be normally distributed, and one or more explanatory (also called regression, predictor, or independent) variables about which no distributional assumptions are made. The model is usually referred to as 'the general linear model'. The chapter then differentiates between simple linear regression and multiple regression. The term 'simple linear regression' covers the regression model where there is one response variable and one explanatory variable, assuming a linear relationship between the two. The chapter also discusses the model formulae in R; generalized linear models; collinearity and aliasing; and logarithmic transformations.
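In symbols, the models described above are commonly written as follows. The notation (β for coefficients, ε for the error term, σ² for its variance, i for the observation index, p for the number of explanatory variables) is assumed here and is not taken from the chapter.

```latex
% Simple linear regression: one response, one explanatory variable
y_i = \beta_0 + \beta_1 x_i + \varepsilon_i,
\qquad \varepsilon_i \sim \mathcal{N}(0, \sigma^2) \ \text{independently},
\quad i = 1, \dots, n

% Multiple regression: p explanatory variables
y_i = \beta_0 + \sum_{j=1}^{p} \beta_j x_{ij} + \varepsilon_i
```

Only the response (through the error term) carries a distributional assumption; the explanatory variables are treated as given.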


2019 ◽  
Vol 16 (1) ◽  
Author(s):  
Chioneso Marange ◽  
Yongsong Qin

The application of goodness-of-fit (GoF) tests in linear regression modeling is a common practice in applied statistical sciences. For instance, in simple linear regression the assumption of normality of residuals is always necessary to test before making any further inferences. The growing popularity of the use of powerful and efficient empirical likelihood ratio (ELR) based GoF tests in checking for departures from normality in various continuous distributions can be of great use in checking for distributional assumptions of residuals in linear models. Motivated by the attractive properties of the ELR based GoF tests the researchers conducted an extensive Type I error rate assessment as well as a Monte Carlo power comparison of selected ELR GoF tests with well-known existing tests against symmetric and asymmetric alternative OLS and BLUS residuals. Under the simulated scenarios, all the studied tests have good control of Type I error rates. The Monte Carlo experiments revealed the superiority of the ELR GoF tests under certain alternatives of both the OLS and BLUS residuals. Our findings also demonstrated the superiority of OLS over BLUS residuals when one is testing for normality in simple linear regression models. A real data study further revealed the applicability of the ELR based GoF tests in testing normality of residuals in linear regression models.
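The general workflow of testing residual normality after a simple linear regression can be sketched as below. This is not one of the ELR tests studied in the paper: the Jarque-Bera statistic is swapped in as a simple, well-known stand-in, and the simulated data and function names are assumptions made for the example.

```python
import math
import random

def ols_residuals(x, y):
    """Fit y = intercept + slope * x by least squares and return the residuals."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

def jarque_bera(residuals):
    """Jarque-Bera statistic and its asymptotic chi-square(2) p-value."""
    n = len(residuals)
    mean = sum(residuals) / n
    m2 = sum((r - mean) ** 2 for r in residuals) / n
    m3 = sum((r - mean) ** 3 for r in residuals) / n
    m4 = sum((r - mean) ** 4 for r in residuals) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    statistic = n / 6 * (skewness ** 2 + (kurtosis - 3) ** 2 / 4)
    # The survival function of a chi-square with 2 degrees of freedom is exp(-x / 2).
    p_value = math.exp(-statistic / 2)
    return statistic, p_value

# Simulated data (an assumption for illustration): linear trend plus normal errors.
rng = random.Random(0)
x = [i / 10 for i in range(1, 101)]
y = [2.0 + 0.5 * xi + rng.gauss(0.0, 1.0) for xi in x]

residuals = ols_residuals(x, y)
jb_stat, jb_p = jarque_bera(residuals)
print(f"Jarque-Bera statistic = {jb_stat:.3f}, asymptotic p-value = {jb_p:.3f}")
```

The asymptotic p-value is only approximate in small samples, which is one reason simulation studies such as the one described above compare Type I error rates and power directly.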


10.32698/0642 ◽  
2019 ◽  
Vol 2 (2) ◽  
pp. 120
Author(s):  
Wiwi Delfita ◽  
Neviyarni S. ◽  
Riska Ahmad

Some students perceive lesbian, gay, bisexual, and transgender (LGBT) positively, even though LGBT is a sexual deviation that is not appropriate with values and norms. There are several factors that influence an individual's perception of LGBT, including sexual identity. This study aims to examine the contribution of sexual identity to student perceptions of LGBT. This research used a quantitative approach with a descriptive method and simple linear regression analysis. The sample consisted of 385 students drawn from 15,752 undergraduate students of Universitas Negeri Padang; the sample size was determined using the Slovin formula, followed by a proportional random sampling technique. The instruments used were a Guttman-model sexual identity scale and a Likert-model scale of students' perceptions of LGBT. After analyzing the data with the descriptive technique and simple linear regression analysis, the results showed that sexual identity significantly contributed to the students' perceptions of LGBT. This research has implications as a basis for counselors to help students avoid sexual identity mismatches and prevent the emergence of positive perceptions of LGBT.


2019 ◽  
Vol 4 (2) ◽  
pp. 17
Author(s):  
Dedy Mulyadi ◽  
Didik Purwanto

Compensation is a sensitive matter and a driver of people's willingness to work, because it affects employee morale and discipline. Therefore, any agency or organization should provide compensation commensurate with the workload so that an efficient and effective workforce can be realized. Beyond that, the company's goal is to improve performance. Performance assessment is a subjective process that involves human judgment. It is therefore prone to error and easily influenced by sources that are not factual, so this must be taken into account and considered reasonable. Performance appraisals are considered to meet their target if they have a good impact on the employees whose performance has just been rated. Simple linear regression analysis using SPSS version 12.00 yielded the regression equation Y = 74 + 0.487X, where X = award, 74 = constant, 0.487 = award coefficient, and Y = performance. Based on this equation, an increase of one unit in the award increases performance by 0.487 units. If company policy eliminates the award, performance remains at the constant level of 74 units. (A) The significance test of the constant gave t count (12.574) > t table (1.960), so Ho is rejected, meaning the constant is significant. (B) The significance test of the award coefficient gave t count (2.164) > t table (1.96), so Ho is rejected, meaning the award coefficient affects performance. (C) Correlation analysis was done by calculating the Pearson product-moment correlation to test whether there is a strong relationship between the variables X and Y. Based on the SPSS results, the calculated correlation coefficient r (0.310) > r table for α = 0.05 (0.291), so Ho is rejected, which means there is a relationship between award and performance. In the interpretation table for correlation coefficients, this value falls in the interval from 0.20 to 0.399, which indicates a low relationship.
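The reported equation can be applied directly. The sketch below reads it as Y = 74 + 0.487X (constant 74, award coefficient 0.487, as the abstract's own explanation of the terms states). The function names are hypothetical, and only the "low" band (0.20 to 0.399) comes from the abstract; the other interpretation bands are an assumed conventional table included so the lookup is complete.

```python
def predict_performance(award, constant=74.0, coefficient=0.487):
    """Predicted performance from the reported equation Y = 74 + 0.487X."""
    return constant + coefficient * award

def interpret_correlation(r):
    """Map |r| to a verbal label; only the 'low' band is from the abstract."""
    bands = [
        (0.20, "very low"),   # assumed band
        (0.40, "low"),        # 0.20 to 0.399, as reported
        (0.60, "moderate"),   # assumed band
        (0.80, "strong"),     # assumed band
    ]
    magnitude = abs(r)
    for upper_bound, label in bands:
        if magnitude < upper_bound:
            return label
    return "very strong"      # assumed band

# With no award, predicted performance equals the constant.
print(predict_performance(0))
# Each additional unit of award adds 0.487 units of predicted performance.
print(predict_performance(10) - predict_performance(9))
print(interpret_correlation(0.310))
```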


2019 ◽  
Vol 10 (9) ◽  
pp. 902-909
Author(s):  
Umbas Krisnanto ◽  
Conny Marpaung ◽  

This study aims to determine and analyze the influence of Service Quality and Customer Satisfaction on Customer Loyalty in Jabodetabek Commuter Line. The sample of this study was 50 people. Data were collected by distributing questionnaires and analyzed using simple linear regression, the t test and the coefficient of determination. The results showed that 1) Service Quality has a positive and significant effect on Customer Loyalty in Jabodetabek Commuter Line, with a significance level of 0.048, supported by the results of hypothesis testing with a t-count value of 4.433 > t-table value of 1.95 and a significance of 0.048 (< 0.05); 2) Customer Satisfaction has a positive and significant effect on Customer Loyalty in Jabodetabek Commuter Line, with a significance level of 0.000, supported by the results of hypothesis testing with a t-count value of 4.969 > t-table value of 1.95 and a significance of 0.000 (< 0.05); 3) Service Quality and Customer Satisfaction have a positive and significant effect on Customer Loyalty in Jabodetabek Commuter Line, with a significance level of 0.000. This means that the hypothesis H0 is rejected and Ha is accepted, so it can be concluded that service quality and customer satisfaction together have a positive and significant effect on customer loyalty in Jabodetabek Commuter Line.


2019 ◽  
Vol 3 (2) ◽  
pp. 26
Author(s):  
Niken Ayu Wulandari ◽  
Tegoeh Hari Abrianto ◽  
Edi Santoso

This research analyzes and evaluates the effect of intellectual capital on financial performance as measured by return on equity, asset turnover and growth in revenue. The population in this study is consumer goods companies listed on the Stock Exchange in 2015-2017. The research sample comprised 21 companies selected using a purposive sampling technique. The analytical method used is simple linear regression analysis with the SPSS version 20 application, and the VAIC™ method is used to measure intellectual capital. The results of this study indicate that intellectual capital has a significant effect on financial performance as measured by return on equity, but does not have a significant effect on financial performance as measured by asset turnover and growth in revenue.


Author(s):  
Fransiskus Ginting ◽  
Efori Buulolo ◽  
Edward Robinson Siagian

Data mining is the discovery of information by extracting patterns and trends from very large amounts of data, and it assists in using stored data to make decisions in the future. To determine the patterns, classification techniques work from a collected set of records (the training set). Regional income is generally derived from local taxes and levies. Local taxes are one source of regional funding, but on the national average they have not yet made a large contribution to local revenue. Regional revenue data can be used to forecast future regional revenue that matches reality, so that the planned draft regional budget (RAPBD) can run smoothly. Simple Linear Regression (SLR) is a statistical method used in production to make predictions about the characteristics of quality and quantity; here it describes the processes associated with processing data on the acquisition of regional income. In the testing phase, the Visual Basic .NET implementation helps process valid regional revenue data. Keywords: Data Mining, Local Revenue, Simple Linear Regression Algorithm, Visual Basic .NET 2008

