The impact of measurement error in models using police recorded crime rates

2021 ◽  
Author(s):  
Jose Pina-Sánchez ◽  
David Buil-Gil ◽  
Ian Brunton-Smith ◽  
Alexandru Cernat

Objectives: Assess the extent to which measurement error in police recorded crime rates impacts the estimates of regression models exploring the causes and consequences of crime.

Methods: We focus on linear models where crime rates are included either as the response or as an explanatory variable, in their original scale or log-transformed. Two measurement error mechanisms are considered: systematic errors in the form of under-recorded crime, and random errors in the form of recording inconsistencies across areas. The extent to which such measurement error mechanisms impact model parameters is demonstrated algebraically, using formal notation, and graphically, using simulations.

Results: Most coefficients and measures of uncertainty from models where crime rates are included in their original scale are severely biased. However, in many cases this problem can be minimised, or altogether eliminated, by log-transforming crime rates. This transforms the multiplicative measurement error observed in police recorded crime rates into a less harmful additive mechanism.

Conclusions: The validity of findings from regression models where police recorded crime rates are used in their original scale is called into question. In interpreting the large evidence base exploring the causes and consequences of crime using police statistics, we urge researchers to consider the biasing effects shown here. Equally, we urge researchers to log-transform crime rates before introducing them into statistical models.
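The central claim, that log-transforming turns multiplicative recording error into a harmless additive shift, can be illustrated with a short simulation. This is a minimal sketch: the 50% under-recording rate, the coefficients, and the error distributions below are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 100_000
true_crime = rng.lognormal(3.0, 0.5, n)   # latent true crime rate per area
recorded = 0.5 * true_crime               # assumed systematic 50% under-recording

# --- original scale: multiplicative error rescales the slope ---
y_lin = 1.0 + 0.8 * true_crime + rng.normal(0, 1.0, n)
X = np.column_stack([np.ones(n), recorded])
b_lin = np.linalg.lstsq(X, y_lin, rcond=None)[0]
# the estimated slope is roughly 0.8 / 0.5 = 1.6, i.e. severely biased

# --- log scale: the same error becomes an additive intercept shift ---
y_log = 1.0 + 0.8 * np.log(true_crime) + rng.normal(0, 0.3, n)
Xl = np.column_stack([np.ones(n), np.log(recorded)])
b_log = np.linalg.lstsq(Xl, y_log, rcond=None)[0]
# the slope is recovered; only the intercept absorbs log(0.5)
```

With a constant multiplicative error, log(recorded) differs from log(true_crime) only by the constant log(0.5), so the slope on the log scale is untouched while the original-scale slope is inflated by the reciprocal of the recording rate.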

2018 ◽  
Vol 620 ◽  
pp. A168 ◽  
Author(s):  
G. Valle ◽  
M. Dell’Omodarme ◽  
P. G. Prada Moroni ◽  
S. Degl’Innocenti

Aims. We aim to perform a theoretical investigation of the direct impact of measurement errors in the observational constraints on the recovered age of stars in the main sequence (MS) and red giant branch (RGB) phases. We assumed that a mix of classical (effective temperature Teff and metallicity [Fe/H]) and asteroseismic (Δν and νmax) constraints was available for the objects. Methods. Artificial stars were sampled from a reference isochrone and subjected to random Gaussian perturbations in their observational constraints to simulate observational errors. The ages of these synthetic objects were then recovered by means of a Markov chain Monte Carlo approach over a grid of pre-computed stellar models. To account for observational uncertainties, the grid covers different values of the initial helium abundance and mixing-length parameter, which act as nuisance parameters in the age estimation. Results. The differences between the recovered and true ages were modelled against the errors in the observables, by means of linear models and projection pursuit regression models. The first class of statistical models provides an easily generalizable result, whose robustness is checked with the second method. From the linear models we find that no single error source dominates in all evolutionary phases. Assuming typical observational uncertainties, the most important error source in the reconstructed MS age is the effective temperature of the star: an offset of 75 K accounts for an underestimation of the stellar age of 0.4 to 0.6 Gyr from initial to terminal MS. An error of 2.5% in νmax proved to be the second most important source of uncertainty, accounting for about −0.3 Gyr. The 0.1 dex error in [Fe/H] proved particularly important only at the end of the MS, producing an age error of −0.4 Gyr. 
For the RGB phase the dominant source of uncertainty is νmax, causing an underestimation of about 0.6 Gyr; the offsets in the effective temperature and Δν caused, respectively, an underestimation and an overestimation of 0.3 Gyr. We find that the inference from the linear model is a good proxy for that from the projection pursuit regression models; inference from linear models can therefore be safely used, thanks to its broader generalizability. Finally, we explored the impact on age estimates of adding the luminosity to the observational constraints discussed above. To this purpose we assumed, for computational reasons, a 2.5% error in luminosity, much lower than the average error in the Gaia DR2 catalogue. Even in this optimistic case, however, the addition of the luminosity does not increase the precision of the age estimates. Moreover, the luminosity emerged as a major contributor to the variability in the estimated ages, accounting for an error of about −0.3 Gyr in the explored evolutionary phases.
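The second stage of the procedure, regressing age errors on the perturbations in the observables, can be sketched as follows. The perturbation scales echo the uncertainties quoted above, but the age sensitivities are hypothetical numbers chosen only to make the fit verifiable, not the paper's results.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000
# Gaussian perturbations mimicking typical observational uncertainties
d_teff = rng.normal(0, 75, n)      # K
d_feh = rng.normal(0, 0.1, n)      # dex
d_dnu = rng.normal(0, 0.012, n)    # relative error in Delta nu (assumed)
d_numax = rng.normal(0, 0.025, n)  # relative error in nu_max

# hypothetical linear sensitivities of the recovered age (Gyr per unit error)
age_err = (-0.006 * d_teff - 4.0 * d_feh + 8.0 * d_dnu
           - 12.0 * d_numax + rng.normal(0, 0.1, n))

# linear model of age error vs. errors in the observables
X = np.column_stack([np.ones(n), d_teff, d_feh, d_dnu, d_numax])
coef = np.linalg.lstsq(X, age_err, rcond=None)[0]
# coef[1:] recovers the assumed sensitivities; e.g. a +75 K offset
# maps to roughly -0.006 * 75 = -0.45 Gyr, comparable in magnitude
# to the MS effect reported above
```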


2005 ◽  
Vol 30 (2) ◽  
pp. 169-187 ◽  
Author(s):  
David Kaplan

This article considers the problem of estimating dynamic linear regression models when the data are generated from a finite mixture probability density function whose mixture components are characterized by different dynamic regression model parameters. Specifically, conventional linear models assume that the data are generated by a single probability density function characterized by a single set of regression model parameters. However, when the true generating model is a finite mixture density function, estimation of conventional linear models under the assumption of a single density function may lead to erroneous conclusions. Instead, it may be desirable to estimate the regression model under the assumption that the data are derived from a finite mixture density function and to examine differences in the parameters of the model within each mixture component. Dynamic regression models, and subsequent dynamic response analysis using dynamic multipliers, are also likely to be affected by the existence of a finite mixture density, because dynamic multipliers are functions of the regression model parameters. Applying finite mixture modeling to two real data examples, this article shows that dynamic responses to changes in exogenous variables can be quite different depending on the number and nature of the underlying mixture components. Implications for substantive conclusions based on the use of dynamic multipliers are discussed.
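A mixture of regressions is typically estimated with an EM algorithm: the E-step assigns each observation a responsibility for each component, and the M-step refits each component by weighted least squares. The following is a minimal numpy-only sketch for two components with simulated data (the data-generating slopes are illustrative, not from the article).

```python
import numpy as np

def em_mixture_regression(X, y, k=2, iters=200, seed=0):
    """EM for a k-component mixture of linear regressions (shared design X)."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    betas = rng.normal(0, 1, (k, p))
    sigmas = np.full(k, y.std())
    pis = np.full(k, 1.0 / k)
    for _ in range(iters):
        # E-step: responsibilities (normal densities; constants cancel)
        resid = y[None, :] - betas @ X.T                      # (k, n)
        dens = np.exp(-0.5 * (resid / sigmas[:, None]) ** 2) / sigmas[:, None]
        w = pis[:, None] * dens
        w /= np.maximum(w.sum(axis=0, keepdims=True), 1e-300)
        # M-step: weighted least squares per component
        for j in range(k):
            XtW = X.T * w[j]
            betas[j] = np.linalg.solve(XtW @ X, XtW @ y)
            sigmas[j] = np.sqrt((w[j] * (y - X @ betas[j]) ** 2).sum() / w[j].sum())
        pis = w.mean(axis=1)
    return betas, sigmas, pis

# two latent regimes with different regression parameters
rng = np.random.default_rng(1)
n = 2000
x = rng.uniform(-2, 2, n)
z = rng.random(n) < 0.5
y = np.where(z, 1.0 + 2.0 * x, -1.0 - 1.5 * x) + rng.normal(0, 0.2, n)
X = np.column_stack([np.ones(n), x])
betas, _, _ = em_mixture_regression(X, y)
# the two recovered slopes approximate 2.0 and -1.5
```

A single-component fit to these data would average the two regimes and misstate the response to a change in x, which is the article's point about dynamic multipliers.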


2020 ◽  
Vol 224 (2) ◽  
pp. 1381-1403
Author(s):  
Maciej J Mendecki ◽  
Judyta Odrobińska ◽  
Renata Patyńska ◽  
Adam F Idziak

SUMMARY This paper presents the results of new research on ground-motion relations from three areas in the Upper Silesia Coal Basin (USCB) in Poland and compares them with previously published ground-motion relations. These three mining areas of the USCB were investigated in order to better predict ground motion caused by seismic events. The study focused on variations in regression parameters and in the predicted PGA (peak ground acceleration) across areas, to better understand the influence of geology. To compare our results with previous models we had to unify the known ground-motion prediction equations (GMPEs). We then used the various regression models to predict the PGA values corresponding to a relatively strong USCB seismic event with an energy of 10^8 J (ML = 3.3) and compared their results. The regression model parameters were compared with each other, particularly those related to energy and distance, the latter corresponding to geometrical spreading (attenuation) of seismic waves as well as the influence of wave type (body or surface). Finally, building upon several established regression models, our analysis showed a strong linear correlation between the two regression parameters corresponding to energy and distance. An open question remains, however, whether this relation can be explained physically or whether, from a mathematical point of view, it is an effect of the linear dependence of the matrix vectors logE and logR. Comparing different GMPEs allows for better verification of what is known about the impact of tremors on ground motion in the USCB.
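Fitting a GMPE of the common form log10(PGA) = a + b·logE + c·logR and then predicting the PGA of a 10^8 J event can be sketched as below. The coefficient values, distances, and noise level are illustrative assumptions, not the paper's fitted parameters.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 400
logE = rng.uniform(5, 9, n)          # log10 of seismic energy [J]
R = rng.uniform(200, 3000, n)        # hypocentral distance [m]
logR = np.log10(R)

# hypothetical GMPE: log10(PGA) = a + b*logE + c*logR + noise
a, b, c = -1.5, 0.4, -1.2
log_pga = a + b * logE + c * logR + rng.normal(0, 0.15, n)

# ordinary least-squares regression recovers the model parameters
X = np.column_stack([np.ones(n), logE, logR])
coef = np.linalg.lstsq(X, log_pga, rcond=None)[0]

# predicted PGA for a 1e8 J (ML ~ 3.3) event at 1 km
pred_pga = 10 ** (coef @ np.array([1.0, 8.0, np.log10(1000.0)]))
```

The linear-dependence question raised above concerns exactly the design matrix built here: when the logE and logR columns are highly correlated in the observed catalogue, the estimated b and c trade off against each other.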


Author(s):  
H.A.M. van der Steen ◽  
P.W. Knap

Available technology allows pig breeding companies to automate feed intake recording during performance test. This provides data on ‘average daily feed intake’ as recorded with more traditional manual systems. It also yields feed intake curves, i.e. the relationship between ‘days on test’ and ‘daily feed intake’. This information can be used in different ways. The feed intake curve may be described using sophisticated linear or non-linear models; these may describe the curve accurately, but their parameters cannot easily be used in genetic/economic evaluation in the context of a breeding programme. A simple method to describe feed intake curves is therefore used in this paper, allowing for easy interpretation of the results. The objective is to study the impact of existing selection procedures on the feed intake curve and the utilisation of variation in its shape in pig breeding.

Performance test data of 1331 boars of a Large White based line, collected from November 1990 to March 1993, were analysed. Boars are tested over a 12-week period, starting at approximately 30 kg. Feed intake data are recorded with the Hunday FIRE system.
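One simple, easily interpretable description of a feed intake curve is a per-animal straight-line fit of daily intake against days on test, alongside the traditional average. The sketch below uses simulated intake data over a 12-week (84-day) test; the intake level and slope are illustrative, not values from the study.

```python
import numpy as np

rng = np.random.default_rng(5)
days = np.arange(1, 85)  # 12-week test period, one record per day
# simulated daily feed intake [kg/day]: rising roughly linearly with age
intake = 1.2 + 0.025 * days + rng.normal(0, 0.15, days.size)

# simple curve description: intercept (level) and slope (shape)
X = np.column_stack([np.ones_like(days, dtype=float), days])
intercept, slope = np.linalg.lstsq(X, intake, rcond=None)[0]

avg_daily = intake.mean()  # the traditional manual-system summary
```

The intercept and slope are single numbers per boar, so unlike parameters of sophisticated non-linear curves they can be carried directly into genetic/economic evaluation.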


Energies ◽  
2021 ◽  
Vol 14 (10) ◽  
pp. 2958
Author(s):  
Leon Prochowski ◽  
Mateusz Ziubiński ◽  
Patryk Szwajkowski ◽  
Mirosław Gidlewski ◽  
Tomasz Pusty ◽  
...  

The introduction of autonomous cars will help to improve road traffic safety, and the use of a cargo trailer improves the energy efficiency of transport. One of the critical (collision) road situations has been considered, where immediate counteraction is required in a space that has been only partly defined. This research work was aimed at determining the impact of the trajectory planning method and the values of some parameters of the control system on the feasibility of safe avoidance of an obstacle that has suddenly appeared. The obstacle is assumed to be a motor vehicle moving on a road intersection along a collision path in relation to the autonomous car-trailer unit (CT unit) travelling at high speed. Analysis of cooperation between several non-linear models (representing the car, trailer, tyre–road interaction, and driving controller) has been carried out. Mathematical models of the control system and the CT unit have been built. The process of selection of temporary and variable parameters, applied to the control system for the time of the critical situation under consideration, has been shown. The research work carried out has made it possible to recommend appropriate parameter values for the control system.


Sensors ◽  
2022 ◽  
Vol 22 (2) ◽  
pp. 412
Author(s):  
Luigi Borzì ◽  
Ivan Mazzetta ◽  
Alessandro Zampogna ◽  
Antonio Suppa ◽  
Fernanda Irrera ◽  
...  

Background: Current telemedicine approaches lack standardised procedures for the remote assessment of axial impairment in Parkinson’s disease (PD). Unobtrusive wearable sensors may be a feasible tool to provide clinicians with practical medical indices reflecting axial dysfunction in PD. This study aims to predict the postural instability/gait difficulty (PIGD) score in PD patients by monitoring gait through a single inertial measurement unit (IMU) and machine-learning algorithms. Methods: Thirty-one PD patients underwent a 7-m timed-up-and-go test while monitored through an IMU placed on the thigh, both under (ON) and not under (OFF) dopaminergic therapy. After pre-processing procedures and feature selection, a support vector regression model was implemented to predict PIGD scores and to investigate the impact of L-Dopa and freezing of gait (FOG) on the regression models. Results: Specific time- and frequency-domain features correlated with PIGD scores. After optimizing the dimensionality reduction methods and the model parameters, the regression algorithms demonstrated different performance in the PIGD prediction in patients OFF and ON therapy (r = 0.79 and 0.75 and RMSE = 0.19 and 0.20, respectively). Similarly, the regression models showed different performance in the PIGD prediction in patients with FOG, ON and OFF therapy (r = 0.71 and RMSE = 0.27; r = 0.83 and RMSE = 0.22, respectively) and in those without FOG, ON and OFF therapy (r = 0.85 and RMSE = 0.19; r = 0.79 and RMSE = 0.21, respectively). Conclusions: Optimized support vector regression models are a feasible tool for predicting PIGD scores in PD. L-Dopa and FOG affect regression model performance. Overall, a single inertial sensor may help to remotely assess axial motor impairment in PD patients.
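The modelling pipeline, standardised gait features fed into a support vector regression evaluated by cross-validation with r and RMSE, can be sketched with scikit-learn. Everything here is simulated: the six features, their relation to the score, and the SVR hyperparameters are placeholders, not the study's selected features or tuned model.

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(3)
n, p = 31, 6  # 31 patients, a handful of selected gait features (hypothetical)
X = rng.normal(size=(n, p))
pigd = 2.0 + X @ rng.normal(0, 0.5, p) + rng.normal(0, 0.2, n)  # synthetic scores

# scale features, then fit an RBF support vector regressor;
# cross_val_predict yields one held-out prediction per patient
model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0, epsilon=0.1))
pred = cross_val_predict(model, X, pigd, cv=5)

r = np.corrcoef(pigd, pred)[0, 1]
rmse = np.sqrt(np.mean((pigd - pred) ** 2))
```

Comparing r and rmse computed this way across ON/OFF or FOG subgroups mirrors how the performance differences above are quantified.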


Author(s):  
Carlos Alberto Huaira Contreras ◽  
Carlos Cristiano Hasenclever Borges ◽  
Camila Borelli Zeller ◽  
Amanda Romanelli

This paper proposes a weighted cross-validation (WCV) algorithm to select the linear regression model with a change-point, under a scale mixtures of normal (SMN) distribution, that yields the best prediction results. SMN distributions are used to construct regression models that are robust to the influence of outliers on the parameter estimation process. We thus relax the usual normality assumption of regression models and consider random errors that follow an SMN distribution, specifically the Student-t distribution. In addition, we allow the parameters of the regression model to change at a specific and unknown point, called the change-point. In this context, the estimates of the model parameters, which include the change-point, are obtained via an EM-type (Expectation-Maximization) algorithm. The WCV method is used to select the model that presents greater robustness and a smaller prediction error, with the weighting values coming from the E-step of the EM-type algorithm. Finally, numerical examples considering simulated and real data (television audience data) are presented to illustrate the proposed methodology.
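The weighting idea can be sketched as follows: in a Student-t regression, the E-step produces per-observation weights (ν + 1)/(ν + δ²) that shrink toward zero for outliers, and a weighted cross-validation error down-weights those observations. This is a simplified stand-in (no change-point, fixed ν, weights recomputed per fold) rather than the paper's full algorithm.

```python
import numpy as np

def t_weights(resid, sigma, nu=4.0):
    """E-step-style weights of a Student-t regression: outliers get small weights."""
    d2 = (resid / sigma) ** 2
    return (nu + 1.0) / (nu + d2)

def weighted_cv_error(x, y, nu=4.0, k=5, seed=0):
    """k-fold CV error where each held-out squared residual is down-weighted
    by its Student-t weight (a simplified stand-in for the paper's WCV)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    folds = np.array_split(idx, k)
    X = np.column_stack([np.ones_like(x), x])
    errs = []
    for f in folds:
        train = np.setdiff1d(idx, f)
        beta = np.linalg.lstsq(X[train], y[train], rcond=None)[0]
        r = y[f] - X[f] @ beta
        sigma = np.std(y[train] - X[train] @ beta)
        w = t_weights(r, sigma, nu)
        errs.append(np.sum(w * r ** 2) / np.sum(w))
    return float(np.mean(errs))

# heavy-tailed (outlier-contaminated) data: the weighted criterion is
# much less inflated by the outliers than a plain mean squared error
rng = np.random.default_rng(2)
x = rng.uniform(0, 10, 200)
y = 1.0 + 0.5 * x + rng.standard_t(df=3, size=200)
err = weighted_cv_error(x, y)
```

Because the weights decrease with the squared residual, the weighted criterion is bounded above by the unweighted one, which is what makes it a robust basis for model selection.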


2015 ◽  
Vol 19 (7) ◽  
pp. 3153-3179 ◽  
Author(s):  
M. S. Raleigh ◽  
J. D. Lundquist ◽  
M. P. Clark

Abstract. Physically based models provide insights into key hydrologic processes but are associated with uncertainties due to deficiencies in forcing data, model parameters, and model structure. Forcing uncertainty is enhanced in snow-affected catchments, where weather stations are scarce and prone to measurement errors, and meteorological variables exhibit high variability. Hence, there is limited understanding of how forcing error characteristics affect simulations of cold region hydrology and which error characteristics are most important. Here we employ global sensitivity analysis to explore how (1) different error types (i.e., bias, random errors), (2) different error probability distributions, and (3) different error magnitudes influence physically based simulations of four snow variables (snow water equivalent, ablation rates, snow disappearance, and sublimation). We use the Sobol' global sensitivity analysis, which is typically used for model parameters but adapted here for testing model sensitivity to coexisting errors in all forcings. We quantify the Utah Energy Balance model's sensitivity to forcing errors with 1 840 000 Monte Carlo simulations across four sites and five different scenarios. Model outputs were (1) consistently more sensitive to forcing biases than random errors, (2) generally less sensitive to forcing error distributions, and (3) critically sensitive to different forcings depending on the relative magnitude of errors. For typical error magnitudes found in areas with drifting snow, precipitation bias was the most important factor for snow water equivalent, ablation rates, and snow disappearance timing, but other forcings had a more dominant impact when precipitation uncertainty was due solely to gauge undercatch. Additionally, the relative importance of forcing errors depended on the model output of interest. Sensitivity analysis can reveal which forcing error characteristics matter most for hydrologic modeling.
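A variance-based (Sobol') first-order sensitivity analysis can be sketched with a Saltelli-style pick-freeze estimator. The "model" below is a toy additive surrogate in which a precipitation bias dominates, echoing the qualitative result above; the coefficients are illustrative only and have nothing to do with the Utah Energy Balance model.

```python
import numpy as np

def sobol_first_order(f, d, n=20_000, seed=0):
    """Saltelli-style estimator of first-order Sobol' indices for a model f
    with d independent standard-normal inputs."""
    rng = np.random.default_rng(seed)
    A = rng.normal(size=(n, d))
    B = rng.normal(size=(n, d))
    fA, fB = f(A), f(B)
    var = np.var(np.concatenate([fA, fB]))
    S = np.empty(d)
    for i in range(d):
        ABi = A.copy()
        ABi[:, i] = B[:, i]          # freeze all inputs except input i
        S[i] = np.mean(fB * (f(ABi) - fA)) / var
    return S

# toy snow-output surrogate driven by three forcing biases
# (precipitation bias dominates; coefficients are illustrative only)
def snow_response(X):
    precip_bias, temp_bias, sw_bias = X[:, 0], X[:, 1], X[:, 2]
    return 3.0 * precip_bias + 1.0 * temp_bias + 0.3 * sw_bias

S = sobol_first_order(snow_response, d=3)
# for this additive model the indices are analytic: S ≈ [0.89, 0.10, 0.01]
```

Replacing the surrogate with a full model run per sample is what makes the study's 1 840 000 simulations necessary: each index requires many model evaluations across the joint error space.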


2019 ◽  
Vol 2019 (1) ◽  
pp. 331-338 ◽  
Author(s):  
Jérémie Gerhardt ◽  
Michael E. Miller ◽  
Hyunjin Yoo ◽  
Tara Akhavan

In this paper we discuss a model to estimate the power consumption and lifetime (LT) of an OLED display based on its pixel values and the brightness setting of the screen (scbr). This model is used to illustrate the effect of OLED aging on display color characteristics. Model parameters are based on power consumption measurements of a given display for a number of pixel and scbr combinations. OLED LT is often given for the most stressful display operating situation, i.e. a white image at maximum scbr, but the ability to predict the LT for other configurations can be valuable for estimating the impact and quality of new image processing algorithms. After explaining our model, we present a use case illustrating how we use it to evaluate the impact of an image processing algorithm for brightness adaptation.
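A pixel-value-plus-brightness power model of this kind can be sketched as below. The per-channel coefficients, static power term, and gamma value are illustrative placeholders, not fitted measurements from the paper's display.

```python
import numpy as np

def oled_power(img, scbr, p_static=0.1, k=(1.0, 0.8, 1.4), gamma=2.2):
    """Estimate display power from image content and brightness setting.

    img: HxWx3 array of pixel values in [0, 1]; scbr: brightness setting in [0, 1].
    Per-subpixel power scales with the linearized drive level (gamma is an
    assumed display response) and with scbr; p_static covers driver overhead.
    All coefficients are hypothetical, in arbitrary power units.
    """
    r, g, b = img[..., 0], img[..., 1], img[..., 2]
    lin = lambda c: c ** gamma                  # approximate linearization
    per_px = k[0] * lin(r) + k[1] * lin(g) + k[2] * lin(b)
    return p_static + scbr * per_px.mean()

white = np.ones((100, 100, 3))
black = np.zeros((100, 100, 3))
p_white_max = oled_power(white, 1.0)  # worst case: white image, max scbr
p_black = oled_power(black, 1.0)      # only the static term remains
```

Feeding an image through a brightness-adaptation algorithm before and after, and comparing the predicted power, is the kind of use case described above; an LT estimate would then follow from the predicted per-pixel stress rather than from the worst-case white-at-maximum configuration alone.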

