Sample size and its evolution in research

2022 ◽  
Vol 1 ◽  
pp. 9-13
Author(s):  
Sai Prashanti Gumpili ◽  
Anthony Vipin Das

Objective: Sample size is one of the crucial and basic steps in planning any study. This article aims to trace the evolution of sample size across the years, from hundreds to thousands to millions and towards a trillion in the near future (H-K-M-B-T). It also aims to understand the importance of sampling in the era of big data. Study Design, Primary Outcome Measure, Methods, Results, and Interpretation: A sample size that is too small will not be a true representation of the population, whereas an overly large sample size puts more individuals at risk. An optimum sample size is needed to identify statistically significant differences if they exist and to obtain scientifically valid results. The design of the study, the primary outcome, the sampling method, the dropout rate, the effect size, the power, the level of significance, and the standard deviation are among the factors that affect the sample size, and all of them need to be taken into account when calculating it. Many sources are available for calculating sample size, and discretion is needed in choosing the right one. The large volumes of data, and the corresponding number of data points being analyzed, are redefining many industries, including healthcare. A larger sample size yields more insightful information, identification of rare side effects, a smaller margin of error, a higher confidence level, and more accurate models. Advances in the digital era have removed most of the traditional obstacles to statistical sampling, yet the era brings its own set of challenges. Hence, considerable effort and time should be invested in selecting appropriate sampling techniques and in reducing sampling bias and errors. This will ensure the reliability and reproducibility of the results obtained.
Along with a large sample size, the focus should be on getting to know the data better: the sampling frame and the context in which the data were collected. We need to focus on creating good-quality data and structured systems to capture the sample. Good data quality management ensures that the data are structured appropriately.
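The factors the abstract lists (effect size, power, level of significance, standard deviation, dropout rate) combine in a standard closed-form calculation. As an illustration not taken from the article, here is a minimal sketch of the per-group sample size for comparing two means, using only the Python standard library; the function name and default values are assumptions:

```python
from math import ceil
from statistics import NormalDist

def sample_size_two_means(effect, sd, alpha=0.05, power=0.80, dropout=0.0):
    """Per-group n for a two-sided, two-sample comparison of means.

    effect  : smallest difference worth detecting (same units as sd)
    sd      : assumed common standard deviation
    dropout : anticipated dropout fraction (n is inflated to compensate)
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # e.g. 1.96 for alpha = 0.05
    z_beta = z.inv_cdf(power)            # e.g. 0.84 for 80% power
    n = 2 * (z_alpha + z_beta) ** 2 * sd ** 2 / effect ** 2
    return ceil(n / (1 - dropout))       # inflate for expected dropout

# Detect a 5-unit difference, sd = 10, with 10% expected dropout
print(sample_size_two_means(effect=5, sd=10, dropout=0.10))
```

Doubling the detectable difference cuts the required n roughly fourfold, which is why the assumed effect size dominates the calculation.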

Author(s):  
Alison Sizer ◽  
Oliver Duke-Williams

Background and Rationale: The ONS Longitudinal Study ('the LS') covers England and Wales and includes individual data from the 1971–2011 decennial censuses and linked information on births, deaths, and cancer registrations. It is representative of the population of England and Wales. Aim: This presentation describes the LS and the linked administrative data, and showcases recent and prominent examples of research. Methods and Approach: The LS is built around samples drawn from the decennial censuses, with its initial sample drawn from the 1971 Census. It also contains information about other people living in a sample member's household. Substantial emphasis is placed on security of access to the data and its responsible use; all research outputs are checked and are only released to users once disclosure control requirements are met. Linkage of study members from one census to another, and to vital events, is carried out by ONS. Results: The LS has been used for a variety of research. Using linked census and death records, occupational differences in mortality rates have been researched. Individual records from all five censuses have contributed to research on social mobility, and research has also investigated the effects of long-term exposure to air pollution. Research has provided evidence of impact on social policy issues, e.g., health inequalities and the State Pension Age Review. Discussion: The main strength of the LS is its large sample size (>1 million), making it the largest nationally representative longitudinal dataset in the UK. This allows analysis of small areas and specific population groups. Sampling bias is almost nil, and response rates are very high relative to other cohort and panel studies. Conclusion: The ONS Longitudinal Study is a vital UK research asset, providing access to a large sample of census data linked across five censuses and strengthened through linkage to events data.


2018 ◽  
pp. 437-445
Author(s):  
Gregory S. Thomas

The chapter Heart Rate Response to Exercise reviews the studies performed to estimate a patient's maximum predicted heart rate. While the commonly used formula (220 − age), developed in 1971, is easy to remember, it underestimates the actual maximum heart rate in older persons. Studies with large sample sizes have found the maximum heart rate to be relatively independent of sex and physical fitness, but to decline incrementally with age; the decrease is less than 1 beat per minute per year, however. A more accurate and recommended formula, developed by Tanaka and colleagues, is 208 − (0.7 × age).
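The divergence between the two formulas is easy to see numerically: they agree near age 40 and split at the extremes. A short comparison, illustrative only (function names are ours):

```python
def hr_max_classic(age):
    """Commonly used 1971 formula; underestimates HRmax in older adults."""
    return 220 - age

def hr_max_tanaka(age):
    """Tanaka and colleagues' formula: 208 - 0.7 * age."""
    return 208 - 0.7 * age

# The classic formula reads higher for the young, lower for the old
for age in (20, 40, 60, 80):
    print(age, hr_max_classic(age), hr_max_tanaka(age))
```

At age 60 the classic formula predicts 160 bpm while Tanaka's predicts 166 bpm, consistent with the chapter's point that 220 − age underestimates maximum heart rate in older persons.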


1970 ◽  
Vol 7 (01) ◽  
pp. 1-20 ◽  
Author(s):  
Ora Engleberg Percus ◽  
Jerome K. Percus

A generating function technique is used to determine the probability that the deviation between two empirical distributions drawn from the same population lies within a given band a specified number of times. We also treat the asymptotic problem of very large sample size, and obtain explicit expressions when the relative number of failures is very small or very large.
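The paper derives exact and asymptotic expressions via generating functions; as a rough independent illustration of the quantity being studied, the probability that the deviation between two empirical distributions stays within a given band can also be estimated by brute-force simulation. A minimal sketch, assuming a uniform parent population (the population choice and all names here are illustrative, not the authors' method):

```python
import bisect
import random

def max_ecdf_deviation(x, y):
    """Maximum absolute difference between two empirical CDFs
    (the two-sample Kolmogorov-Smirnov statistic)."""
    xs, ys = sorted(x), sorted(y)
    n, m = len(xs), len(ys)
    return max(abs(bisect.bisect_right(xs, t) / n - bisect.bisect_right(ys, t) / m)
               for t in set(x) | set(y))

def prob_within_band(n, c, trials=2000, seed=1):
    """Monte Carlo estimate of P(sup |F_n - G_n| <= c) when both samples
    of size n are drawn from the same (here: uniform) population."""
    rng = random.Random(seed)
    hits = sum(
        max_ecdf_deviation([rng.random() for _ in range(n)],
                           [rng.random() for _ in range(n)]) <= c
        for _ in range(trials)
    )
    return hits / trials

print(prob_within_band(n=50, c=0.3))
```

As n grows the deviation shrinks like 1/sqrt(n), which is the regime the paper's asymptotic analysis addresses exactly rather than by simulation.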


2019 ◽  
Vol 24 (4) ◽  
pp. 408-419
Author(s):  
Hongu Meng ◽  
Antony Warden ◽  
Lulu Zhang ◽  
Ting Zhang ◽  
Yiyang Li ◽  
...  

Mass cytometry (CyTOF) is a critical cell-profiling tool for acquiring multiparameter proteome data at the single-cell level. A major challenge in CyTOF analysis is sample-to-sample variance arising from the pipetting process, staining variation, and instrument sensitivity. To reduce such variation, cell barcoding strategies that enable individual samples to be combined prior to antibody staining and data acquisition on CyTOF are often utilized. The most prevalent barcoding strategy is based on a binary scheme that checks the presence or absence of certain mass signals; however, it is limited by low barcoding efficiency and high cost, especially for large sample sizes. Herein, we present a novel barcoding method for CyTOF based on mass ratiometry. Different mass tags at specific fixed ratios are used to label CD45 antibody to achieve sample barcoding. The presented method exponentially increases the number of possible barcoded samples for the same number of mass tags compared with conventional methods. It also reduces the overall labeling time to 40 min and avoids the need for expensive commercial barcoding buffer reagents. Moreover, unlike the conventional barcoding process, this strategy does not require pre-permeabilization of cells before barcoding, which offers additional benefits in preserving surface biomarker signals.
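The claimed exponential gain can be made concrete with a back-of-the-envelope count. The encoding below is an illustrative assumption, not the authors' exact scheme: a conventional binary code choosing j of k mass tags, versus a ratiometric code where each tag can carry one of several distinguishable fixed ratios:

```python
from math import comb

def binary_barcodes(k, j):
    """Conventional binary scheme: j of k mass tags present per sample."""
    return comb(k, j)

def ratiometric_barcodes(k, levels):
    """Hypothetical ratiometric scheme: each of k tags carries one of
    `levels` distinguishable fixed ratios (illustrative assumption)."""
    return levels ** k

# With 6 tags: 3-of-6 binary vs 3 ratio levels per tag
print(binary_barcodes(6, 3), ratiometric_barcodes(6, 3))
```

With six tags, a 3-of-6 binary code yields 20 barcodes while even three ratio levels per tag yield 729, and the ratiometric count grows exponentially in the number of tags.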


2019 ◽  
Vol 7 (9) ◽  
pp. 1801053
Author(s):  
Liu Xie ◽  
Rui Tong ◽  
Wen Zhang ◽  
Dejian Wang ◽  
Tao Liu ◽  
...  

2021 ◽  
pp. 1-12
Author(s):  
Jing Wang ◽  
Jie Wei ◽  
Long Li ◽  
Lijian Zhang

With the rapid development of evidence-based medicine, translational medicine, and pharmacoeconomics in China, as well as the country's strong commitment to clinical research, the demand for physician-led research continues to increase. In recent years, real-world studies (RWS) have attracted growing attention in health care; as a method for post-marketing re-evaluation of drugs, RWS can better reflect the effects of drugs in real clinical settings. In the past, however, the large sample sizes required and the large volumes of medical data involved made such studies not only time-consuming and labor-intensive but also prone to human error, so it was difficult to ensure data quality and efficient research implementation. This paper analyzes and summarizes existing big data analytics platforms and concludes that platforms using natural language processing, machine learning, and other artificial intelligence technologies can help RWS quickly complete the collection, integration, processing, statistics, and analysis of large amounts of medical data, and deeply mine the intrinsic value of the data. Real-world research thus has broad application prospects for multi-level and multi-angle needs such as new drug development, drug discovery, pharmacoeconomics, medical insurance cost control, indication/contraindication evaluation, and clinical guidance.

