Monitoring Aggregated Poisson Data for Processes with Time-Varying Sample Sizes

2017 ◽  
Vol 40 (2) ◽  
pp. 243-262 ◽  
Author(s):  
Victor Hugo Morales ◽  
José Alberto Vargas

This article deals with the effect of data aggregation when Poisson processes with varying sample sizes are monitored. Aggregation procedures are necessary or convenient in many applications and can simplify the monitoring process. In health surveillance it is common practice to aggregate the observations over a certain time period and monitor the process at the end of it. In such applications the sample size also frequently varies over time, so that instead of monitoring the mean of the process, as would be done for Poisson observations with a constant sample size, the occurrence rate of an adverse event is monitored. Two control charts for monitoring Poisson count data with time-varying sample sizes were proposed by Shen et al. (2013) and Dong et al. (2008). We use the average run length (ARL) to compare the performance of these control charts under different levels of aggregation, two scenarios for generating the sample sizes, and different out-of-control states. Simulation studies show the effect of data aggregation in some situations, as well as the situations in which aggregation may be used without significantly compromising the prompt detection of out-of-control signals. We also illustrate the effect of data aggregation with an application to health surveillance data.
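As a rough illustration of the kind of comparison described above, the sketch below estimates by simulation the ARL of a generic Shewhart-type u-chart (rate chart) whose limits adapt to the aggregated sample size of each plotted point, at two aggregation levels. It is not the chart of Shen et al. (2013) or Dong et al. (2008); the in-control rate, sample-size distribution, and shift size are assumed values for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)

def u_chart_run_length(u0, u_actual, n_mean, agg=1, max_points=50_000):
    """Run length (in plotted points) of a generic Shewhart-type u-chart whose
    3-sigma limits adapt to the aggregated sample size of each plotted point.
    Illustration only; not the charts of Shen et al. (2013) or Dong et al. (2008)."""
    for t in range(1, max_points + 1):
        n = rng.poisson(n_mean, size=agg).sum()    # time-varying sample size, `agg` periods pooled
        y = rng.poisson(u_actual * n)              # total adverse-event count for this point
        u_hat = y / n
        if abs(u_hat - u0) > 3 * np.sqrt(u0 / n):  # limits widen/narrow with the current n
            return t
    return max_points

# illustrative in-control and out-of-control ARLs at two aggregation levels
for agg in (1, 4):
    arl0 = np.mean([u_chart_run_length(0.05, 0.05, 500, agg) for _ in range(1000)])
    arl1 = np.mean([u_chart_run_length(0.05, 0.07, 500, agg) for _ in range(1000)])
    print(f"aggregation={agg}: ARL0 ~ {arl0:.0f} points, ARL1 ~ {arl1:.1f} points")
```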

2020 ◽  
Vol 16 (3) ◽  
pp. 325
Author(s):  
Elsa Resa Sari

One technique used in statistical quality control is the Poisson control chart. The Poisson control chart is applied to data whose mean and variance are equal and is used to monitor the number of defects under study. In some cases, differing sample sizes influence the performance of the control chart. That performance can be measured using the average run length (ARL): the smaller the ARL value, the better the control chart. In this study, we used different sample sizes and mean values. The results show that the best performance is obtained when m = 200, because this setting has the smallest ARL value.
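For a Shewhart-type chart with independent samples, the ARL is the reciprocal of the per-sample signal probability, which is how ARL values like the ones compared above are typically obtained analytically. The sketch below computes this for a simple Poisson c-chart; the limits and parameter values are illustrative assumptions, not the chart studied here.

```python
import math
from scipy import stats

def poisson_c_chart_arl(lam0, lam1, L=3.0):
    """ARL of a Shewhart c-chart with 3-sigma-style limits: ARL = 1 / P(signal).
    lam0 is the in-control mean count, lam1 the actual mean (illustrative chart only)."""
    ucl = lam0 + L * math.sqrt(lam0)
    lcl = max(lam0 - L * math.sqrt(lam0), 0.0)
    p_low = stats.poisson.cdf(math.ceil(lcl) - 1, lam1) if lcl > 0 else 0.0  # P(X < LCL)
    p_high = stats.poisson.sf(math.floor(ucl), lam1)                          # P(X > UCL)
    return 1.0 / (p_low + p_high)

print(poisson_c_chart_arl(9, 9))    # in-control ARL
print(poisson_c_chart_arl(9, 12))   # ARL after an upward shift in the mean count
```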


Mathematics ◽  
2020 ◽  
Vol 8 (5) ◽  
pp. 698
Author(s):  
Chanseok Park ◽  
Min Wang

Control charts based on X̄ and S are widely used to monitor the mean and variability of a process variable and can help quality engineers identify and investigate causes of process variation. The usual requirement behind these control charts is that the sample sizes from the process are all equal, but this requirement may not be satisfied in practice due to missing observations, cost constraints, etc. To deal with this situation, several conventional methods have been proposed. However, some methods based on weighted-average approaches and on an average sample size often degrade the performance of the control charts because the adopted estimators are biased towards underestimating the true population parameters. These observations motivate us to investigate the existing methods with rigorous proofs, and we provide a guideline for practitioners on the best choice for constructing the X̄ and S control charts when the sample sizes are not equal.
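The underestimation issue mentioned above can be seen numerically. The sketch below compares a sample-size-weighted average of subgroup standard deviations with a c4-corrected version when subgroup sizes are unequal; the weighting scheme and subgroup-size range are assumptions for illustration, not the specific estimators analyzed in the paper.

```python
import numpy as np
from scipy.special import gammaln

rng = np.random.default_rng(0)

def c4(n):
    """Unbiasing constant for the sample standard deviation: E[S] = c4(n) * sigma."""
    n = np.asarray(n, dtype=float)
    return np.exp(0.5 * np.log(2.0 / (n - 1.0)) + gammaln(n / 2.0) - gammaln((n - 1.0) / 2.0))

sigma_true = 2.0
sizes = rng.integers(3, 10, size=50)          # unequal subgroup sizes (hypothetical)
weights = sizes / sizes.sum()                 # one simple sample-size weighting

plain, corrected = [], []
for _ in range(2000):
    s = np.array([np.std(rng.normal(0.0, sigma_true, n), ddof=1) for n in sizes])
    plain.append(np.sum(weights * s))                   # weighted average of S_i: biased low
    corrected.append(np.sum(weights * s / c4(sizes)))   # each S_i / c4(n_i) is unbiased for sigma
print(f"weighted S-bar: {np.mean(plain):.3f}, c4-corrected: {np.mean(corrected):.3f}, true sigma: {sigma_true}")
```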


2016 ◽  
Vol 39 (6) ◽  
pp. 883-897 ◽  
Author(s):  
Rashid Mehmood ◽  
Muhammad Riaz ◽  
Tahir Mahmood ◽  
Saddam Akbar Abbasi ◽  
Nasir Abbas

In this article, we extend the design structures of dual auxiliary-information-based control charts under a variety of sampling strategies and runs-rules schemes. We consider the cases of known and unknown skewed distributions by using the skewness correction (SC) method. The design structures under the SC method depend on the degree of skewness of the study variable, the amount of correlation between the study variable and the auxiliary variable, and the sample size. We investigate the performance of the developed structures in terms of probability of signal, false alarm rate, and average run length under symmetric distributions, skewed distributions, heavy-tailed distributions, and contaminated environments. The results show that control charts based on extreme ranked set strategies have a higher probability of detecting an out-of-control signal and are comparatively more robust than other control charts, especially for known distributions. Furthermore, control charts for unknown skewed process distributions under extreme ranked set strategies are relatively more robust for small sample sizes, followed by other ranked-set-strategy-based control charts for large sample sizes. Moreover, we include a real-life example on the monitoring of groundwater variables to highlight the application of our proposals.
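As a concrete illustration of the extreme ranked set strategy mentioned above, the sketch below implements one common variant of extreme ranked set sampling (ERSS) to form a subgroup. The skewed process distribution and set size are assumed for illustration, and in practice the ranking would typically be done on an inexpensive auxiliary variable rather than by exact measurement.

```python
import numpy as np

rng = np.random.default_rng(1)

def extreme_rss(draw, m):
    """One common variant of extreme ranked set sampling (ERSS): draw m sets of m
    units each, then keep the minimum from odd-numbered sets and the maximum from
    even-numbered sets. The ranking is usually based on a cheap auxiliary variable
    correlated with the study variable."""
    selected = []
    for i in range(m):
        units = draw(m)
        selected.append(units.min() if i % 2 == 0 else units.max())
    return np.array(selected)

# illustrative subgroup from a skewed (gamma) process, e.g. to plot on a control chart
subgroup = extreme_rss(lambda k: rng.gamma(shape=2.0, scale=1.0, size=k), m=5)
print(subgroup, subgroup.mean())
```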


Author(s):  
Muhammad Anwar Mughal ◽  
Muhammad Azam ◽  
Muhammad Aslam

Repetitive sampling has become very popular for improving control chart techniques over the last few years. In repetitive-sampling-based control charts, there are two additional control limits inside the usual upper control limit (UCL) and lower control limit (LCL). If a subgroup crosses these inner limits but remains inside the outer limits, the decision is deferred and the subgroup is replaced with another selection. The process is declared out of control if a subgroup falls outside the UCL/LCL. In this article, the technique is modified by introducing a relation between the outer and inner control limits in terms of a ratio, and the need for this modification is justified by highlighting a gap in the existing technique. Using Monte Carlo simulation, results are generated for different sample sizes and ratios. The results are described with the help of average run length (ARL) tables showing how the efficiency of the control chart is affected by different ratios. The modification also provides a variety of alternatives within the scope of repetitive-sampling-based control charts. All the discussed options are summarized in one table to show how the control limits under this technique behave and how they affect the detection of shifts in the process average. The schemes are interpreted in the light of the above ratio, and their comparison is described for different sample sizes, enabling the user to select the most appropriate scheme for a desired process control. An example based on one of the proposed schemes shows the application and performance of the proposed control chart in a manufacturing process.
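The decision rule described above (signal outside the outer limits, defer and resample between the inner and outer limits, accept inside the inner limits) is sketched below for an X̄-type chart. Placing the inner limits at a fixed ratio of the outer-limit width is only one plausible reading of the ratio introduced in the article, and all numerical values are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(7)

def repetitive_rl(mu0, sigma, n, shift=0.0, k_outer=3.0, ratio=0.6, max_draws=100_000):
    """Run length (in plotted decisions) of an X-bar chart under repetitive sampling:
    signal if the subgroup mean falls outside the outer limits, resample if it falls
    between the inner and outer limits, accept if it falls inside the inner limits.
    Setting the inner limits at `ratio` times the outer-limit width is an assumed
    parameterization of the inner/outer ratio, used only for illustration."""
    se = sigma / np.sqrt(n)
    outer, inner = k_outer * se, ratio * k_outer * se
    decisions = 0
    for _ in range(max_draws):
        d = abs(rng.normal(mu0 + shift, se) - mu0)
        if d > outer:
            return decisions + 1      # out-of-control signal
        if d <= inner:
            decisions += 1            # point accepted; move to the next subgroup
        # otherwise: defer, discard this subgroup, and draw a replacement
    return decisions

arl1 = np.mean([repetitive_rl(0.0, 1.0, 5, shift=0.5) for _ in range(2000)])
print(f"ARL under a 0.5-sigma shift ~ {arl1:.1f}")
```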


2021 ◽  
Vol 13 (3) ◽  
pp. 368
Author(s):  
Christopher A. Ramezan ◽  
Timothy A. Warner ◽  
Aaron E. Maxwell ◽  
Bradley S. Price

The size of the training data set is a major determinant of classification accuracy. Nevertheless, the collection of a large training data set for supervised classifiers can be a challenge, especially for studies covering a large area, which may be typical of many real-world applied projects. This work investigates how variations in training set size, ranging from a large sample size (n = 10,000) to a very small sample size (n = 40), affect the performance of six supervised machine-learning algorithms applied to classify large-area high-spatial-resolution (HR) (1–5 m) remotely sensed data within the context of a geographic object-based image analysis (GEOBIA) approach. GEOBIA, in which adjacent similar pixels are grouped into image-objects that form the unit of classification, offers the potential benefit of allowing multiple additional variables, such as measures of object geometry and texture, to be incorporated, thus increasing the dimensionality of the classification input data. The six supervised machine-learning algorithms are support vector machines (SVM), random forests (RF), k-nearest neighbors (k-NN), single-layer perceptron neural networks (NEU), learning vector quantization (LVQ), and gradient-boosted trees (GBM). RF, the algorithm with the highest overall accuracy, was notable for its negligible decrease in overall accuracy, 1.0%, when the training sample size decreased from 10,000 to 315 samples. GBM provided overall accuracy similar to RF; however, the algorithm was very expensive in terms of training time and computational resources, especially with large training sets. In contrast to RF and GBM, NEU and SVM were particularly sensitive to decreasing sample size, with NEU classifications generally producing overall accuracies that were on average slightly higher than SVM classifications for larger sample sizes, but lower than SVM for the smallest sample sizes. NEU, however, required a longer processing time. The k-NN classifier saw less of a drop in overall accuracy than NEU and SVM as training set size decreased; however, the overall accuracies of k-NN were typically lower than those of the RF, NEU, and SVM classifiers. LVQ generally had the lowest overall accuracy of all six methods, but was relatively insensitive to sample size, down to the smallest sample sizes. Overall, due to its relatively high accuracy with small training sample sets, minimal variation in overall accuracy between very large and small sample sets, and relatively short processing time, RF was a good classifier for large-area land-cover classifications of HR remotely sensed data, especially when training data are scarce. However, as the performance of different supervised classifiers varies in response to training set size, investigating multiple classification algorithms is recommended to achieve optimal accuracy for a project.
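An experiment of this kind can be prototyped in a few lines. The sketch below trains RF and SVM classifiers on progressively smaller subsets of a synthetic data set and reports test-set overall accuracy; the synthetic features, class count, and subset sizes are stand-ins for the GEOBIA features and sampling design used in the study.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

# Synthetic stand-in for object-based image features; all sizes are illustrative only.
X, y = make_classification(n_samples=12_000, n_features=20, n_informative=10,
                           n_classes=5, n_clusters_per_class=1, random_state=0)
X_train_full, X_test = X[:10_000], X[10_000:]
y_train_full, y_test = y[:10_000], y[10_000:]

for n_train in (10_000, 1_000, 315, 40):
    idx = np.random.default_rng(0).choice(len(X_train_full), n_train, replace=False)
    for name, clf in (("RF", RandomForestClassifier(random_state=0)),
                      ("SVM", SVC(kernel="rbf", gamma="scale"))):
        clf.fit(X_train_full[idx], y_train_full[idx])
        acc = accuracy_score(y_test, clf.predict(X_test))
        print(f"n={n_train:>6}  {name}: overall accuracy = {acc:.3f}")
```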


2020 ◽  
Vol 2020 ◽  
pp. 1-9
Author(s):  
Johnson A. Adewara ◽  
Kayode S. Adekeye ◽  
Olubisi L. Aako

In this paper, two control chart methods are proposed to monitor a process based on the two-parameter Gompertz distribution: the Gompertz Shewhart approach and the Gompertz skewness correction method. A simulation study was conducted to compare the performance of the proposed charts with that of the skewness correction approach for various sample sizes. Furthermore, real-life data on the thickness of paint on refrigerators, which are nonnormal and follow a Gompertz distribution, were used to illustrate the proposed control chart. The coverage probability (CP), control limit interval (CLI), and average run length (ARL) were used to measure the performance of the two methods. It was found that the Gompertz exact method, in which the control limits are calculated from the percentiles of the underlying distribution, has the highest coverage probability, while the Gompertz Shewhart approach and Gompertz skewness correction method have the smallest CLI and ARL. Hence, the two-parameter Gompertz-based methods detect out-of-control conditions faster for Gompertz-based X̄ charts.
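The percentile-based idea referred to above can be sketched with SciPy's Gompertz distribution: fit the distribution and place the limits at quantiles matching the usual 3-sigma tail probabilities. The shape and scale values, the individuals-chart framing, and the fitting step are assumptions for illustration rather than the paper's exact procedure.

```python
from scipy import stats

# Probability limits for an individuals-type chart based on a fitted two-parameter
# Gompertz distribution (shape c, scale b). The 0.00135 / 0.99865 quantiles mirror
# the tail probabilities of conventional 3-sigma limits.
data = stats.gompertz.rvs(c=1.5, scale=2.0, size=200, random_state=0)  # simulated in-control data
c_hat, loc_hat, scale_hat = stats.gompertz.fit(data, floc=0)           # fix location at 0
lcl = stats.gompertz.ppf(0.00135, c_hat, loc=loc_hat, scale=scale_hat)
cl  = stats.gompertz.ppf(0.5,     c_hat, loc=loc_hat, scale=scale_hat)  # median as centre line
ucl = stats.gompertz.ppf(0.99865, c_hat, loc=loc_hat, scale=scale_hat)
print(lcl, cl, ucl)
```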


2013 ◽  
Vol 113 (1) ◽  
pp. 221-224 ◽  
Author(s):  
David R. Johnson ◽  
Lauren K. Bachan

In a recent article, Regan, Lakhanpal, and Anguiano (2012) highlighted the lack of evidence for different relationship outcomes between arranged and love-based marriages. Yet the sample size (n = 58) used in the study is insufficient for making such inferences. This reply discusses and demonstrates how small sample sizes reduce the utility of this research.
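One way to make the sample-size concern concrete is a simple power calculation. The sketch below assumes two roughly equal groups drawn from n = 58 and a medium standardized effect (Cohen's d = 0.5); both assumptions are for illustration only and are not taken from the original studies.

```python
from statsmodels.stats.power import TTestIndPower

# Assumed for illustration: two groups of ~29 and a medium effect (d = 0.5).
analysis = TTestIndPower()
power = analysis.power(effect_size=0.5, nobs1=29, alpha=0.05, ratio=1.0)
print(f"power with n = 58 total: {power:.2f}")   # well below the conventional 0.80
n_needed = analysis.solve_power(effect_size=0.5, power=0.80, alpha=0.05, ratio=1.0)
print(f"n per group needed for 80% power: {n_needed:.0f}")
```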


2010 ◽  
Vol 42 (3) ◽  
pp. 260-275 ◽  
Author(s):  
Anne G. Ryan ◽  
William H. Woodall

2012 ◽  
Vol 2012 ◽  
pp. 1-8 ◽  
Author(s):  
Louis M. Houston

We derive a general equation for the probability that a measurement falls within a range of n standard deviations from an estimate of the mean. In doing so, we provide a form that is compatible with a confidence interval centered about the mean and is naturally independent of the sample size. The equation is obtained by interpolating between theoretical results for extreme sample sizes, and an intermediate value of the equation is confirmed with a computational test.
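The two limiting cases that such an interpolation connects can be written down directly: the known-parameter (large-sample) normal coverage and a small-sample coverage based on the t-distribution. The sketch below evaluates both; the interpolated equation itself is not reproduced here.

```python
import math
from scipy import stats

def coverage_normal(n_sd):
    """P(|X - mu| < n_sd * sigma) when mu and sigma are known (large-sample limit)."""
    return math.erf(n_sd / math.sqrt(2))

def coverage_t(n_sd, sample_size):
    """Approximate coverage when the mean is estimated from `sample_size` observations,
    using a t-distribution with sample_size - 1 degrees of freedom."""
    df = sample_size - 1
    return stats.t.cdf(n_sd, df) - stats.t.cdf(-n_sd, df)

for k in (1, 2, 3):
    print(k, coverage_normal(k), coverage_t(k, sample_size=5))
```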

