The Meaninglessness of the Mean

2018 ◽  
pp. 98-107
Author(s):  
Erwin B. Montgomery

The mean (average), or any other measure of central tendency of a set of data, is an internal construct that does not necessarily reflect reality. It is possible to determine the central tendency from any arbitrary collection of data as long as they vary on the same dimension. Even when applied to a relevant sample of data, the central tendency may be a poor reflection of the data. A virtually infinite number of different collections of data may have the same central tendency and variance. This has very important implications when reasoning from studies reporting means and standard deviations. The same concerns apply to medians as the central tendencies and quartiles as the variability. When translating studies to the individual patient, the cumulative percentage (probability) function may be more helpful. There is a strong inclination to attribute some ontological status (reality) to measures of central tendency, which can be misleading.
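
The point that different collections of data can share a mean and variance can be illustrated with a small sketch. The two samples below are invented for illustration and do not come from the chapter; `ecdf` is a hypothetical helper that computes the cumulative percentage of observations at or below a value.

```python
from statistics import mean, pvariance


def ecdf(data, x):
    """Fraction of observations less than or equal to x (empirical cumulative function)."""
    return sum(v <= x for v in data) / len(data)


# Two invented samples: one spread around the centre, one split into two clusters.
sample_a = [2, 4, 4, 4, 5, 5, 7, 9]
sample_b = [3, 3, 3, 3, 7, 7, 7, 7]

# Both samples have the same mean and the same population variance.
summary_a = (mean(sample_a), pvariance(sample_a))
summary_b = (mean(sample_b), pvariance(sample_b))
print(summary_a, summary_b)

# The cumulative function still tells them apart: the share of
# observations at or below 3 differs between the two samples.
print(ecdf(sample_a, 3), ecdf(sample_b, 3))
```

In this sketch a reader given only the mean and variance could not distinguish the two samples, whereas the cumulative function shows that half of the second sample lies at or below 3 compared with one eighth of the first.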

1979 ◽  
Vol 22 (2) ◽  
pp. 295-310 ◽  
Author(s):  
Michael G. Block ◽  
Terry L. Wiley

Acoustic-reflex growth functions and loudness-balance judgments were obtained for three normal-hearing subjects with normal middle-ear function. The hypothesis that acoustic reflex-activating signals producing proportionately equal acoustic-impedance changes are judged equal in loudness was evaluated. The mean acoustic impedance and associated standard deviations were computed for the baseline (static) and activator (reflex) portions of each reflex event. An acoustic-impedance change exceeding two standard deviations of baseline was defined as the criterion acoustic-reflex response. Acoustic impedance was measured as a function of activator SPL for broadband noise and a 1000-Hz tone from criterion magnitude to the maximum acoustic impedance (or 120-dB SPL). This was defined as the dynamic range of reflex growth. Loudness-balance measurements were made for the 1000-Hz tone and broadband noise at SPLs representing 30, 50, and 70% of the individual dynamic range. The data supported the hypothesis.


2018 ◽  
pp. 93-97
Author(s):  
Erwin B. Montgomery

Making sense of the enormous variety of patient phenomena creates the epistemic conundrum. Is each and every patient taken as a unique entity, or is there an economical set of principles and facts from which each and every patient can be reconstructed? Empiric medicine favors the former, risking Solipsism of the Present Moment. Rationalist/allopathic medicine favors the latter but makes application of knowledge to the individual patient problematic. The conundrum cannot be resolved by simply taking the “average” of all patients or some other measure of central tendency. While it is possible to find the average weight of animals in Dundas, Ontario, it would have little meaning, except perhaps in exceptional circumstances. A central question in statistics is whether the mean (average) or the range (variance) reflects the true nature of the phenomenon. Assuming the former is greatly enabling in medical decision-making and research but may be misleading.


2018 ◽  
Vol 36 (01) ◽  
pp. 067-073
Author(s):  
Kristin Dotson ◽  
Sarah Anderson ◽  
Stacy Harris ◽  
Lorie Harper ◽  
Alan Tita ◽  
...  

Objective: We sought to validate the SunTech Medical Advantage Model 2 Series with firmware LX 3.40.8 algorithm noninvasive blood pressure module in a pregnant population, including those with preeclampsia. Study Design: Validation study of an oscillometric noninvasive blood pressure module using the ANSI/AAMI ISO 81060-2:2013 standard guidelines. Pregnant women were enrolled into three subgroups: normotensive, hypertensive without proteinuria, and preeclampsia (hypertensive with random protein-to-creatinine ratio ≥ 0.3 or a 24-hour urine protein > 300 mg). Two trained research nurses, blinded to each other's measurements, used a mercury sphygmomanometer to validate the module by following the protocol set forth in the ANSI/AAMI ISO 81060-2:2013 standard guidelines. Results: A total of 45 patients, 15 in each subgroup, were included. The mean systolic and diastolic differences with standard deviations between the module and the mean observers' measurements for all participants were −2.3 ± 7.3 and 0.2 ± 6.5 mm Hg, respectively. The systolic and diastolic standard deviations of the mean of the individual patient's paired module and observers' measurements were 6.27 and 5.98 mm Hg, respectively. The test device, relative to a mercury sphygmomanometer, underestimated the systolic blood pressure in patients with preeclampsia by at least 10 mm Hg in 24% (11/45) of paired measurements. Conclusion: The SunTech Medical Advantage Model 2 Series with firmware LX 3.40.8 algorithm noninvasive blood pressure module is validated in pregnancy, including patients with preeclampsia; however, it may underestimate systolic blood pressure measurements in patients with preeclampsia.
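
A summary of the form "mean difference ± standard deviation" between a device and a reference can be sketched as follows. The readings are invented, the function name `paired_difference_summary` is hypothetical, and the use of the sample standard deviation is an assumption; the exact procedure required by the ANSI/AAMI ISO standard is not reproduced here.

```python
from statistics import mean, stdev


def paired_difference_summary(device, reference):
    """Return (mean, sample SD) of device-minus-reference differences."""
    if len(device) != len(reference):
        raise ValueError("device and reference must have the same length")
    diffs = [d - r for d, r in zip(device, reference)]
    return mean(diffs), stdev(diffs)


# Invented systolic readings in mm Hg (not data from the study).
device_readings = [118, 131, 142, 150]
reference_readings = [120, 130, 146, 158]

mean_diff, sd_diff = paired_difference_summary(device_readings, reference_readings)
print(f"{mean_diff:.2f} ± {sd_diff:.2f} mm Hg")
```

A negative mean difference in this sketch means the device read lower than the reference on average. The summary alone does not show how many individual pairs differ by a clinically important amount.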


2014 ◽  
Vol 25 (05) ◽  
pp. 441-448 ◽  
Author(s):  
Defne Abur ◽  
Nicholas J. Horton ◽  
Susan E. Voss

Background: Power reflectance measurements are an active area of research related to the development of noninvasive middle-ear assessment methods. There are limited data related to test-retest measures of power reflectance. Purpose: This study investigates test-retest features of power reflectance, including comparisons of intrasubject versus intersubject variability and how ear-canal measurement location affects measurements. Research Design: Repeated measurements of power reflectance were made at about weekly intervals. The subjects returned for four to eight sessions. Measurements were made at three ear-canal locations: a deep insertion depth (with a foam plug flush at the entrance to the ear canal) and both 3 and 6 mm more lateral to this deep insertion. Study Sample: Repeated measurements on seven subjects are reported. All subjects were female, between 19 and 22 yr old, and enrolled at an undergraduate women’s college. Data Collection and Analysis: Measurements on both the right and left ears were made at three ear-canal locations during each of four to eight measurement sessions. Random-effects regression models were used for the analysis to account for repeated measures within subjects. The mean power reflectance for each position over all sessions was calculated for each subject. Results: The comparison of power reflectance from the left and right ears of an individual subject varied greatly over the seven subjects; the difference between the power reflectance measured on the left and that measured on the right was compared at 248 frequencies, and depending on the subject, the percentage of tested frequencies for which the left and right ears differed significantly ranged from 10% to 93% (some with left values greater than right values and others with the opposite pattern). Although the individual subjects showed left-right differences, the overall population generally did not show significant differences between the left and right ears. 
The mean power reflectance for each measurement position over all sessions depended on the location of the probe in the ear for frequencies of less than 1000 Hz. The standard deviation between subjects' mean power reflectance after controlling for ear (left or right) was found to be greater than the standard deviation within the individual subject’s mean power reflectance. The intrasubject standard deviation in power reflectance was smallest at the deepest insertion depths. Conclusions: All subjects had differences in power reflectance between their left and right ears at some frequencies; the percentage of frequencies at which differences occurred varied greatly across subjects. The intrasubject standard deviations were smallest for the deepest probe insertion depths, suggesting clinical measurements should be made with as deep an insertion as practically possible to minimize variability. This deep insertion will reduce both acoustic leaks and the effect of low-frequency ear-canal losses. The within-subject standard deviations were about half the magnitude of the overall standard deviations, quantifying the extent of intrasubject versus intersubject variability.


2015 ◽  
Vol 26 (04) ◽  
pp. 346-354 ◽  
Author(s):  
Richard H. Wilson

Background: In 1940, a cooperative effort by the radio networks and Bell Telephone produced the volume unit (vu) meter that has been the mainstay instrument for monitoring the level of speech signals in commercial broadcasting and research laboratories. With the use of computers, today the amplitude of signals can be quantified easily using the root mean square (rms) algorithm. Researchers had previously reported that amplitude estimates of sentences and running speech were 4.8 dB higher when measured with a vu meter than when calculated with rms. This study addresses the vu–rms relation as applied to the carrier phrase and target word paradigm used to assess word-recognition abilities, the premise being that by definition the word-recognition paradigm is a special and different case from that described previously. Purpose: The purpose was to evaluate the vu and rms amplitude relations for the carrier phrases and target words commonly used to assess word-recognition abilities. In addition, the relations with the target words between rms level and recognition performance were examined. Research Design: Descriptive and correlational. Study Sample: Two recorded versions of the Northwestern University Auditory Test No. 6 were evaluated, the Auditec of St. Louis (Auditec) male speaker and the Department of Veterans Affairs (VA) female speaker. Data Collection and Analysis: Using both visual and auditory cues from a waveform editor, the temporal onsets and offsets were defined for each carrier phrase and each target word. The rms amplitudes for those segments then were computed and expressed in decibels with reference to the maximum digitization range. The data were maintained for each of the four Northwestern University Auditory Test No. 6 word lists. Descriptive analyses were used, with linear regressions to evaluate the reliability of the measurement technique and the relation between the rms levels of the target words and recognition performances.
Results: Although there was a 1.3 dB difference between the calibration tones, the mean levels of the carrier phrases for the two recordings were −14.8 dB (Auditec) and −14.1 dB (VA) with standard deviations <1 dB. For the target words, the mean amplitudes were −19.9 dB (Auditec) and −18.3 dB (VA) with standard deviations ranging from 1.3 to 2.4 dB. The mean durations for the carrier phrases of both recordings were 593–594 msec, with the mean durations of the target words a little different, 509 msec (Auditec) and 528 msec (VA). Random relations were observed between the recognition performances and rms levels of the target words. Amplitude and temporal data for the individual words are provided. Conclusions: The rms levels of the carrier phrases closely approximated (±1 dB) the rms levels of the calibration tones, both of which were set to 0 vu (dB). The rms levels of the target words were 5–6 dB below the levels of the carrier phrases and were substantially more variable than the levels of the carrier phrases. The relation between the rms levels of the target words and recognition performances on the words was random.
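
An rms level expressed in decibels relative to the maximum digitization range can be sketched as below. This is a minimal illustration, not the author's procedure: it assumes samples normalized so that full scale is 1.0, and the function name `rms_level_db` and the test tone are hypothetical.

```python
import math


def rms_level_db(samples, full_scale=1.0):
    """rms level of a segment in dB relative to full scale (assumed to be 1.0)."""
    if not samples:
        raise ValueError("segment is empty")
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    if rms == 0:
        return -math.inf  # digital silence
    return 20 * math.log10(rms / full_scale)


# One full cycle of a sine wave that peaks at full scale.
tone = [math.sin(2 * math.pi * k / 1000) for k in range(1000)]
tone_level = rms_level_db(tone)

# The same tone at half the amplitude.
quiet_level = rms_level_db([0.5 * s for s in tone])

print(round(tone_level, 2), round(quiet_level, 2))
```

A full-scale sine has an rms of 1/√2 of its peak, so its level is about 3 dB below full scale, and halving the amplitude lowers the level by about 6 dB. The negative levels reported in the abstract are expressed on this kind of scale.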


2018 ◽  
pp. 186-194
Author(s):  
Erwin B. Montgomery

Widespread irreproducibility of biomedical research has raised concerns. Journal editors and grant administrators are calling for greater safeguards. The causes go far beyond fraud, lack of transparency, and poor statistical analyses, as commonly thought. The root cause may stem from the same epistemic issues that confront medical reasoning, the necessary use of logical fallacies. However, the use of these fallacies increases the risk of uncertainty and subsequent irreproducibility. Furthermore, many procedures in data analysis actually result in an irretrievable loss of information by the Second Law of Thermodynamics as Applied to Information, thereby increasing the risk for irreproducibility. The second law holds that any irreversible process, such as operating only from the central tendency, as in the mean of a sample, results in a loss of information about the actual sample. The loss of that information makes the study less informative about the management of the individual patient.


1974 ◽  
Vol 13 (02) ◽  
pp. 193-206
Author(s):  
L. Conte ◽  
L. Mombelli ◽  
A. Vanoli

Summary: We have put forward a method to be used in the field of nuclear medicine for calculating internally absorbed doses in patients. The simplicity and flexibility of this method allow one to make a rapid estimation of risk both to the individual and to the population. In order to calculate the absorbed doses we based our procedure on the concept of the mean absorbed fraction, taking into account anatomical and functional variability, which is highly important in the calculation of internal doses in children. With this aim in mind we prepared tables which take into consideration anatomical differences and which permit the calculation of the mean absorbed doses in the whole body, in the organs accumulating radioactivity, in the gonads and in the marrow; all this for those radionuclides most widely used in nuclear medicine. By comparing our results with doses obtained from the use of M.I.R.D.'s method it can be seen that when the errors inherent in these types of calculation are taken into account, the results of both methods are in close agreement.


1974 ◽  
Vol 75 (2) ◽  
pp. 274-285 ◽  
Author(s):  
A. Gordin ◽  
P. Saarinen ◽  
R. Pelkonen ◽  
B.-A. Lamberg

ABSTRACT Serum thyrotrophin (TSH) was determined by the double-antibody radioimmunoassay in 58 patients with primary hypothyroidism and was found to be elevated in all but 2 patients, one of whom had overt and one clinically borderline hypothyroidism. Six (29%) out of 21 subjects with symptomless autoimmune thyroiditis (SAT) had an elevated serum TSH level. There was little correlation between the severity of the disease and the serum TSH values in individual cases. However, the mean serum TSH value in overt hypothyroidism (93.4 μU/ml) was significantly higher than the mean value both in clinically borderline hypothyroidism (34.4 μU/ml) and in SAT (8.8 μU/ml). The response to the thyrotrophin-releasing hormone (TRH) was increased in all 39 patients with overt or borderline hypothyroidism and in 9 (43%) of the 21 subjects with SAT. The individual TRH response in these two groups showed a marked overlap, but the mean response was significantly higher in overt (149.5 μU/ml) or clinically borderline hypothyroidism (99.9 μU/ml) than in SAT (35.3 μU/ml). Thus a normal basal TSH level in connection with a normal response to TRH excludes primary hypothyroidism, but nevertheless not all patients with elevated TSH values or increased responses to TRH are clinically hypothyroid.


2003 ◽  
Vol 128 (1) ◽  
pp. 17-26 ◽  
Author(s):  
David J. Kay ◽  
Richard M. Rosenfeld

OBJECTIVE: The goal was to validate the SN-5 survey as a measure of longitudinal change in health-related quality of life (HRQoL) for children with persistent sinonasal symptoms. DESIGN AND SETTING: We conducted a before and after study of 85 children aged 2 to 12 years in a metropolitan pediatric otolaryngology practice. Caregivers completed the SN-5 survey at entry and at least 4 weeks later. The survey included 5 symptom-cluster items covering the domains of sinus infection, nasal obstruction, allergy symptoms, emotional distress, and activity limitations. RESULTS: Good test-retest reliability ( R = 0.70) was obtained for the overall SN-5 score and the individual survey items ( R ≥ 0.58). The mean baseline SN-5 score was 3.8 (SD, 1.0) of a maximum of 7.0, with higher scores indicating poorer HRQoL. All SN-5 items had adequate correlation ( R ≥ 0.36) with external constructs. The mean change in SN-5 score after routine clinical care was 0.88 (SD, 1.19) with an effect size of 0.74 indicating good responsiveness to longitudinal change. The change scores correlated appropriately with changes in related external constructs ( R ≥ 0.42). CONCLUSIONS: The SN-5 is a valid, reliable, and responsive measure of HRQoL for children with persistent sinonasal symptoms, suitable for use in outcomes studies and routine clinical care.
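
The reported effect size is numerically consistent with dividing the mean change score by the standard deviation of the change scores. Whether the authors computed it this way is an assumption, and the function name below is hypothetical. Only the two figures quoted in the abstract are used.

```python
def standardized_change(mean_change, sd_change):
    """Mean change divided by the SD of the change scores (assumed definition)."""
    if sd_change <= 0:
        raise ValueError("standard deviation must be positive")
    return mean_change / sd_change


# Figures quoted in the abstract: mean change 0.88, SD 1.19.
effect = standardized_change(0.88, 1.19)
print(round(effect, 2))  # 0.74
```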


Genetics ◽  
1986 ◽  
Vol 113 (4) ◽  
pp. 1077-1091
Author(s):  
John H Gillespie

ABSTRACT A statistical analysis of DNA sequences from four nuclear loci and five mitochondrial loci from different orders of mammals is described. A major aim of the study is to describe the variation in the rate of molecular evolution of proteins and DNA. A measure of rate variability is the statistic R, the ratio of the variance in the number of substitutions to the mean number. For proteins, R is found to be in the range 0.16 < R < 35.55, thus extending in both directions the values seen in previous studies. An analysis of codons shows that there is a highly significant excess of double substitutions in the first and second positions, but not in the second and third or first and third positions. The analysis of the dynamics of nucleotide evolution showed that the ergodic Markov chain models that are the basis of most published formulas for correcting for multiple substitutions are incompatible with the data. A bootstrap procedure was used to show that the evolution of the individual nucleotides, even the third positions, show the same variation in rates as seen in the proteins. It is argued that protein and silent DNA evolution are uncoupled, with the evolution at both levels showing patterns that are better explained by the action of natural selection than by neutrality. This conclusion is based primarily on a comparison of the nuclear and mitochondrial results.
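
The statistic R defined in the abstract, the variance in the number of substitutions divided by the mean number, can be sketched as below. The counts are invented for illustration, the function name is hypothetical, and the use of the sample variance is an assumption; the paper's actual estimator may include corrections not shown here.

```python
from statistics import mean, variance


def dispersion_ratio(counts):
    """Ratio of the sample variance to the mean of substitution counts."""
    m = mean(counts)
    if m == 0:
        raise ValueError("mean number of substitutions is zero")
    return variance(counts) / m


# Invented substitution counts along five lineages (not data from the paper).
substitutions = [4, 6, 5, 9, 6]
r_value = dispersion_ratio(substitutions)
print(round(r_value, 3))
```

For counts that follow a Poisson distribution the variance equals the mean, so R is expected to be near 1. Values well above or below 1 indicate more or less variation in the number of substitutions than a Poisson process would produce.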

