Data Analysis for Chemistry

Published By Oxford University Press

ISBN: 9780195162103, 9780197562086

Author(s): D. Brynn Hibbert, J. Justin Gooding

• To describe the linear calibration model and how to estimate uncertainties in the calibration parameters and in test concentrations determined from the model.
• To show how to perform calibration calculations using Excel.
• To calculate parameters and uncertainties in the standard addition method.
• To calculate detection limits from measurements of blanks and uncertainties of the calibration model.

Calibration is at the heart of chemical analysis. It is the process by which the response of an instrument (in metrology, the "indication of the measuring instrument") is related to the value of the measurand, in chemistry often the concentration of the analyte. Without proper calibration of instruments, measurement results are not traceable and may not even be correct. Scales in supermarkets are periodically calibrated to ensure they indicate the correct mass. Petrol pumps and gas and electricity meters must all be calibrated and recalibrated at appropriate times. A typical example in analytical chemistry is the calibration of a GC (gas chromatography) analysis. The heights of GC peaks are measured as a function of the concentration of the analyte in a series of standard solutions ("calibration solutions"), and a linear equation is fitted to the data. Before the advent of computers, a graph would be plotted by hand and used for calibration and subsequent measurement. Having drawn the best straight line through the points, the unknown test solution would be measured, and the peak height read across to the calibration line and then down to the x-axis to give the concentration (figure 5.1). Nowadays, the regression equation is computed from the calibration data and then inverted to give the concentration of the test solution. Although the graph is no longer necessary to determine the parameters of the calibration equation, it is good practice to plot it as a rapid visual check for outliers or curvature.

Because we can choose the values that the calibration concentrations will take, the concentration is the independent variable, with the instrumental output being the dependent variable (the output of the instrument depends on the concentration).
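The fit-then-invert procedure described above can be sketched in Python (the chapter itself performs these calculations in Excel). The calibration data and test response below are invented for illustration; the uncertainty expression is the standard formula for a concentration read off a fitted straight line.

```python
import numpy as np

# Hypothetical calibration data: GC peak height vs. analyte concentration (mg/L).
conc = np.array([0.0, 2.0, 4.0, 6.0, 8.0])      # independent variable (chosen by us)
height = np.array([0.1, 4.1, 8.0, 12.1, 15.9])  # dependent variable (instrument response)

# Least-squares fit of height = b * conc + a.
b, a = np.polyfit(conc, height, 1)

# Residual standard deviation of the fit (n - 2 degrees of freedom).
n = len(conc)
resid = height - (a + b * conc)
s_y = np.sqrt(np.sum(resid**2) / (n - 2))

# Invert the calibration: a test solution measured m times gives mean response y0.
m, y0 = 3, 10.0
x0 = (y0 - a) / b

# Standard uncertainty of x0 from the usual linear-calibration formula.
s_xx = np.sum((conc - conc.mean())**2)
s_x0 = (s_y / b) * np.sqrt(1/m + 1/n + (y0 - height.mean())**2 / (b**2 * s_xx))
```

Plotting `height` against `conc` alongside the fitted line remains the quick visual check for outliers or curvature that the text recommends.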


Author(s): D. Brynn Hibbert, J. Justin Gooding

• What ANOVA is, and what it is used for.
• To perform and interpret a one-way ANOVA.
• To determine which effects are significant using the least significant difference.
• To perform and interpret a two-way ANOVA.

ANOVA is the workhorse statistical method for comparing means and determining the effects of influence factors on measurement results (i.e., anything that can be varied or measured that may affect the result). In chapter 3 we learned how to use Student t-tests to compare two means. There is nothing to stop us performing a series of t-tests on pairs of means that must be compared, but a different approach that looks at the variance of the data, ANOVA, can decide whether there is a significant effect caused by a factor for which we have any number of sets of data. ANOVA relies on an understanding of two things: first, how the variances of different components combine to give the overall observed variance of the data; second, that a difference in means leads to a spread in the combined data that can be detected as an increased variance. As an example, consider an attempt to determine whether there is a significant difference between the means of replicate analyses performed by two methods. The standard deviation of each set of results estimates the repeatability of the measurement. If the two methods have different means, then the standard deviation of the combined data will be increased by any differences arising from the methods. This is illustrated in figure 4.1: when the means are far apart, even though the individual standard deviations are not great, the combined data have a much greater standard deviation.

ANOVA is powerful because it can determine whether there is a significant difference among a number of instances of the same factor (e.g., whether there is any difference in the result between three or more analytical methods), and also among different factors (e.g., what are the effects of temperature and concentration on the yield of a reaction?).
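The variance-partitioning idea behind a one-way ANOVA can be sketched from first principles in Python (the replicate results for three hypothetical methods below are invented): split the spread of the combined data into a between-group part and a within-group part, and form their ratio.

```python
import numpy as np

# Hypothetical replicate results (e.g., % analyte) from three analytical methods.
groups = [
    np.array([10.1, 10.3, 10.2, 10.0]),
    np.array([10.6, 10.8, 10.7, 10.5]),
    np.array([10.2, 10.1, 10.4, 10.3]),
]

all_data = np.concatenate(groups)
grand_mean = all_data.mean()
k = len(groups)      # number of groups (levels of the factor)
N = len(all_data)    # total number of observations

# Between-group sum of squares: spread of the group means about the grand mean.
ss_between = sum(len(g) * (g.mean() - grand_mean)**2 for g in groups)
# Within-group sum of squares: spread of each result about its own group mean
# (this estimates the measurement repeatability).
ss_within = sum(((g - g.mean())**2).sum() for g in groups)

ms_between = ss_between / (k - 1)   # mean square between groups
ms_within = ss_within / (N - k)     # mean square within groups
F = ms_between / ms_within
```

A large F (compared with the tabulated F for k − 1 and N − k degrees of freedom) indicates that the differences between the group means are too great to be explained by repeatability alone, which is exactly the increased-variance signature described in the text.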


Author(s): D. Brynn Hibbert, J. Justin Gooding

• To understand the concept of the null hypothesis and the role of Type I and Type II errors.
• To test whether data are normally distributed and whether a datum is an outlier.
• To determine whether there is systematic error in the mean of measurement results.
• To perform tests to compare the means of two sets of data.

One of the uses to which data analysis is put is to answer questions about the data, or about the system that the data describe. In the former category are "are the data normally distributed?" and "are there any outliers in the data?" (see the discussions in chapter 1). Questions about the system might be "is the level of alcohol in the suspect's blood greater than 0.05 g/100 mL?" or "does the new sensor give the same results as the traditional method?" In answering these questions we determine the probability of finding the data given the truth of a stated hypothesis; hence "hypothesis testing." A hypothesis is a statement that might, or might not, be true. Usually the hypothesis is set up in such a way that it is possible to calculate the probability (P) of the data (or of a test statistic calculated from the data) given the hypothesis, and then to decide whether the hypothesis is to be accepted (high P) or rejected (low P). A particular case of a hypothesis test is one that determines whether or not the difference between two values is significant: a significance test. In this case we put forward the hypothesis that there is no real difference and that the observed difference arises from random effects; this is called the null hypothesis (H₀). If the probability that the data are consistent with the null hypothesis falls below a predetermined low value (say 0.05 or 0.01), then the hypothesis is rejected at that probability.

Therefore, P < 0.05 means that if the null hypothesis were true, we would obtain the observed data (or, more precisely, a value of the test statistic at least as great as the one calculated from the data) in fewer than 5% of repeated experiments.
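As a minimal sketch of such a significance test, the pooled two-sample t-test below compares the means of two invented sets of replicate results in Python; 2.306 is the tabulated two-tailed Student t for 8 degrees of freedom at P = 0.05.

```python
import numpy as np

# Hypothetical replicate measurements of the same sample by two methods.
a = np.array([5.2, 5.4, 5.3, 5.5, 5.3])
b = np.array([5.6, 5.7, 5.5, 5.8, 5.6])

na, nb = len(a), len(b)
# Pooled estimate of the common variance (assumes the methods have
# comparable precision).
s2 = ((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1)) / (na + nb - 2)
# Test statistic for H0: the two means are equal.
t = (a.mean() - b.mean()) / np.sqrt(s2 * (1/na + 1/nb))
# Reject H0 at the 5% level if |t| exceeds the tabulated two-tailed
# critical value t(0.05, 8) = 2.306.
reject = abs(t) > 2.306
```

Here the observed |t| exceeds the critical value, so the null hypothesis of no difference between the methods would be rejected at P = 0.05.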


Author(s): D. Brynn Hibbert, J. Justin Gooding

This chapter is called the Readers' Guide because chapter 1 is clearly the proper start of the book, with introductions and discussions of what measurement really is, and so on. This chapter was compiled last, and attempts to be the first stop for a reader who does not want the edifying discourse on measurement but is desperate to find out how to do a t-test. In the glossary, we define terms and concepts used in the book, with a section reference to where each term or concept is explained in detail. If you half know what you are after, the memory jog from seeing the definition may suffice, but at some time return to the text and reacquaint yourself with the theory. There follow "frequently asked questions" that represent just that: questions we are often asked by our students (and colleagues). The order roughly follows that of the book, but you may have to do some scanning before the particular question that is yours springs out of the page. Finally, we have lodged a number of Excel spreadsheet functions that are most useful to a chemist faced with data to subdue. The list brings together those functions that are not obviously dealt with elsewhere, and does not claim to be complete; but have a look there if you cannot find a function elsewhere. The definitions given below are not always the official statistical or metrological definitions. They are given in the context of chemical analysis, and are the authors' best attempt at understandable descriptions of the terms.


Author(s): D. Brynn Hibbert, J. Justin Gooding

• To understand the concepts of mean, variance, and standard deviation as they pertain both to a large sample (population) and to a small sample.
• To define the standard deviation of the mean of a number of repeated measurements and understand its relation to the sample standard deviation.
• To define confidence intervals about a mean and show how to use them to indicate measurement precision.
• To introduce robust estimators of the average and the sample standard deviation.
• To appreciate the difference between measurement repeatability and reproducibility.

Why do we bother with means and standard deviations? Because these two statistics tell us a great deal about the data and the population from which they come. The mean of a number of repeated measurements of the concentration of a test solution is an estimate of that concentration, and the sample standard deviation gives a measure of the random scatter of the values obtained by measurement. Together with the appropriate units, they represent the result. This information is not necessarily the answer to "What is the concentration of the test solution, and how sure are you of that answer?" To answer this question an uncertainty budget must be prepared, which includes errors, random and systematic, arising from all aspects of the experiment (of which the standard deviation of repeated measurements is just one). Why is it good to repeat analytical measurements? There might be an argument for the "quit while you are ahead" school, but repeating a measurement gives increased confidence in the result, especially if the numbers appear to agree. But apart from the appearance of consistency, do you get better answers by repeating measurements, and are more repeats better than fewer? The answer to both questions is "yes," as we shall see in this chapter.

Note that the statistical treatment of repeated results does not tell us about systematic error unless we can compare our mean with a known or assigned value of the quantity being measured.
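The statistics discussed here can be sketched in a few lines of Python (the repeated measurements below are invented; 2.776 is the tabulated two-tailed Student t for n − 1 = 4 degrees of freedom at 95% confidence).

```python
import numpy as np

# Hypothetical repeated measurements of a test-solution concentration (mg/L).
x = np.array([4.95, 5.02, 4.98, 5.05, 5.00])

n = len(x)
mean = x.mean()
s = x.std(ddof=1)        # sample standard deviation (n - 1 in the denominator)
sem = s / np.sqrt(n)     # standard deviation of the mean

# 95% confidence interval about the mean, using the tabulated Student t
# for 4 degrees of freedom (t = 2.776).
half_width = 2.776 * sem
ci = (mean - half_width, mean + half_width)
```

Because `sem` shrinks as the square root of n, each additional repeat narrows the confidence interval, which is the quantitative sense in which more repeats give a better answer.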


Author(s): D. Brynn Hibbert, J. Justin Gooding

• To understand that chemical measurements are made for a purpose, usually to answer a nonchemical question.
• To define measurement and related terms.
• To understand types of error and how they are estimated.
• What makes a valid analytical measurement.

Chemistry, like all sciences, relies on measurement, yet a poll of our students and colleagues showed that few could even begin to give a reasonable explanation of "measurement." Reading textbooks on data analysis revealed that this most basic act of science is rarely defined. Believe it or not, there are people who specialize in the science of measurement: a field of study called metrology. The definition of measurement used in this book is a "set of operations having the object of determining the value of a quantity." We will come back to this, but first . . . The world spent an estimated US$3.1 billion on chemical measurements for medical diagnosis in 1998, most of this measurement being done in the United States and the European Union. These measurements were carried out to discover something about the patients. The sequence of events that involves a chemical measurement is: (1) state the real-world problem; (2) decide what chemical measurement can help answer that problem; (3) find a method that will deliver the appropriate measurement; (4) do the measurement and obtain a result (value and uncertainty, including appropriate units); and (5) give a solution to the problem based on the measurement result. It is important to understand the relationship between the real-world problem and the proposed measurement. The chemical measurement may give only part of the answer, and should not be confused with the answer itself. In forensic analytical chemistry, matching a suspect's DNA with DNA sampled at the crime scene does not necessarily mean that the suspect is guilty.

In health care, a cholesterol measurement might tell the doctor about the likelihood of a patient contracting heart disease, but a full analysis of high- and low-density lipids and other fats will be more useful.

