Confidence Intervals for Sample Means

(Section 6.4 in Zar, 2010)

Confidence intervals, also referred to as confidence limits, are another way to describe the error in an estimate. Remember that "error" is being used in the statistical sense, i.e., variation, and not in the "someone messed up" sense. A confidence interval defines a range within which we have a specified degree of confidence that the actual value of the parameter we are trying to estimate lies. For example, if we estimate μ = 10 (because our sample mean was equal to 10) and report a 95% confidence interval of 2, it means that we are 95% confident that the actual value of μ lies between 8 and 12.

If we know the distribution of sample means around the population mean, we can establish the limits that contain 95% of the observations. For example, we know that for a normal distribution:

95% of the observations lie within 1.960 standard deviations of the mean ( μ ± 1.960σ)

And so for parameters that are normally distributed, we can establish the confidence intervals as ± 1.960σ, and estimate them as ± 1.960s.
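The 1.960 figure can be checked numerically. The following is a minimal Python sketch (not one of the R programs linked from this page) that uses the standard library's NormalDist class; the function name proportion_within is made up for this illustration.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
std_normal = NormalDist(mu=0.0, sigma=1.0)


def proportion_within(k):
    """Proportion of a normal distribution lying within k standard deviations of the mean.

    (Hypothetical helper name, used only for this illustration.)
    """
    return std_normal.cdf(k) - std_normal.cdf(-k)


print(round(proportion_within(1.960), 4))  # 0.95
```

Because the calculation is done in units of standard deviation, the same proportion applies to any normal distribution, whatever its mean and standard deviation.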

One quantity that is normally distributed is the mean of samples drawn from a normally distributed statistical population. This is true regardless of the size of the sample used to generate the sample means. This is demonstrated below, where the bars are the frequencies of 5000 sample means drawn from a normally distributed population, with μ = 0 and σ = 1, with the sample sizes (n) for the sample means indicated on the graph. The red line is the expected normal distribution for the mean and standard deviation of those observations:

If the above animation does not work, or you wish to examine the individual graphs, they can be viewed HERE
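For those who would rather check this numerically than graphically, here is a rough Python sketch of the same kind of simulation. It is not the R program used for the animation; the function name, the seed, and the particular sample sizes are assumptions chosen for illustration. For each sample size it draws 5000 sample means from a normal population with μ = 0 and σ = 1, then compares the standard deviation of those means with σ/√n and counts the fraction of means that fall within 1.960 × σ/√n of μ.

```python
import random
import statistics


def simulate_sample_means(n, n_means=5000, mu=0.0, sigma=1.0, seed=1):
    """Draw n_means samples of size n from a normal population; return their means."""
    rng = random.Random(seed)
    return [
        statistics.fmean([rng.gauss(mu, sigma) for _ in range(n)])
        for _ in range(n_means)
    ]


for n in (2, 5, 30):
    means = simulate_sample_means(n)
    expected_sd = 1.0 / n ** 0.5  # sigma / sqrt(n), with sigma = 1
    # Fraction of sample means within 1.960 of these standard deviations of mu = 0.
    within = sum(abs(m) <= 1.960 * expected_sd for m in means) / len(means)
    print(n, round(statistics.stdev(means), 3), round(expected_sd, 3), round(within, 3))
```

If the claim holds, the second and third columns should agree closely and the last column should sit near 0.95 for every sample size, including the very small ones.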

If you recall from our discussion of descriptive statistics, the standard error (SE) of the mean estimates the standard deviation of the distribution of sample means around the population mean (μ). Knowing that these means are normally distributed around μ tells us that 95% of the sample means are within 1.960 SE of μ. Thus, for a sample drawn from a normally distributed statistical population, we can conclude that 95% of the time μ is within 1.960 SE of the value estimated by our sample mean, because our sample mean should fall within 1.960 SE of μ 95% of the time. Thus, we can calculate the 95% confidence intervals for a sample mean drawn from a normal population as:

95% CI = 1.960 × SE

Question 3: Describe how to calculate the 99% confidence intervals for a sample mean drawn from a normally distributed statistical population.

While this might seem like a terribly useful piece of knowledge, it is an extremely rare occasion when we know that the statistical population that we are sampling from is normally distributed (in other words, this is not the approach that you should use for question 4). For this reason, as I mentioned earlier, the normal probability distribution has limited application in direct comparisons and, sadly, for reporting confidence intervals. What we need to consider then is the distribution of sample means drawn from statistical populations that are not normally distributed. As it turns out, the distribution of sample means drawn from any statistical population with a finite variance approaches a normal distribution as the sample size increases. This is an important concept known as the central limit theorem. We can demonstrate this by drawing sample means from a lognormal distribution, an example of which is shown below:

This clearly is a positively skewed distribution. It is referred to as lognormal because the logarithms of the X values are normally distributed, so plotting the distribution using the log of the X values produces a normal distribution:

Although it is a distraction from our current train of thought, this demonstrates an important point. In some cases where data do not meet the assumptions of an analysis, e.g., an assumption of normality, transformation of those data (in this case, log transformation) may fix the problem. Now, back to the issue at hand...

Drawing 5000 samples (and calculating the means of those samples) from the right-skewed, lognormal distribution, using the sample sizes depicted on the following graph (the R program used to generate the data for this and the previous animation can be viewed HERE) shows that the distribution of those sample means becomes more symmetrical (once again, the red line indicates the normal expectation) as the sample sizes used to generate the sample means increase:

If the above animation does not work, or you wish to examine the individual graphs, they can be viewed HERE
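The same pattern can be seen in numbers rather than pictures. The sketch below is a Python illustration, not the R program linked above; the lognormal parameters (0 and 1 on the log scale), the sample sizes, the seed, and the function names are all assumptions made for this example. It summarizes the asymmetry of the 5000 sample means with a simple skewness coefficient, which is 0 for a perfectly symmetrical distribution and positive for a right-skewed one.

```python
import random
import statistics


def skewness(values):
    """Moment-based skewness: 0 for a symmetrical distribution, > 0 for right skew."""
    m = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum((v - m) ** 3 for v in values) / (len(values) * sd ** 3)


def skewness_of_sample_means(n, n_means=5000, seed=1):
    """Skewness of n_means sample means, each from n lognormal observations."""
    rng = random.Random(seed)
    means = [
        statistics.fmean([rng.lognormvariate(0.0, 1.0) for _ in range(n)])
        for _ in range(n_means)
    ]
    return skewness(means)


for n in (2, 10, 50):
    print(n, round(skewness_of_sample_means(n), 2))
```

The skewness should shrink toward 0 as n increases, but it should still be clearly positive at the larger sample sizes, which is the point made in the next paragraph.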

While this might seem like promising news, pay careful attention to the sample sizes. While sample means drawn from any distribution will approach a normal distribution, they only fit a normal distribution when n = ∞. That's a pretty high standard to meet! Fortunately, W. S. Gosset made certain that it mattered little that the sample means are not normally distributed, by describing very precisely how the distribution approaches a normal distribution. He published this distribution (the t-distribution) under the pseudonym "Student", because his employer (the Guinness Brewing Company) would not let employees publish using their own names.

Student's t-distribution, at low sample sizes for the sample means, is wider and more peaked than the normal distribution, as shown below where the bars represent 10000 draws of sample means from the t-distribution (using THIS R program), and the red line represents the normal distribution:

Note that instead of sample size (n) on the graph, the size of the sample is inferred from degrees of freedom (df). This is the convention for reporting probability distributions, and one that you will become familiar with. It essentially describes the number of values that can vary. For example, in the calculation of a sample mean of 10 observations, if you had the sample mean and 9 other observations, the value of the 10th observation could be easily calculated, because it can only be one possible value. Thus, the degrees of freedom for the distribution as it applies to sample means is calculated as (n-1). The calculation for degrees of freedom varies depending on the parameter (it typically is the denominator of the variance calculation), but it always is directly related to sample size.
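The "10th observation" argument is just arithmetic, and can be sketched in a few lines of Python. The nine observations and the sample mean below are hypothetical values invented for this illustration, as is the function name.

```python
def missing_observation(sample_mean, n, known_values):
    """Recover the one remaining observation from the sample mean and the other n - 1 values."""
    if len(known_values) != n - 1:
        raise ValueError("exactly n - 1 observations must be supplied")
    # The n observations must sum to n times the mean, so the last value is fixed.
    return sample_mean * n - sum(known_values)


known = [4, 7, 5, 6, 8, 3, 9, 5, 6]  # nine hypothetical observations
print(missing_observation(6.0, 10, known))  # 7.0
```

Once the mean is fixed, only 9 of the 10 values are free to vary, which is where n-1 comes from.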

Appendix B in your textbook contains a number of tables, many of which describe probability distributions. Table B.3 (starting on page 678 of the fifth edition) contains the critical values for the t distribution. Critical values correspond to the number of standard deviations from the mean within which some proportion of the distribution lies. For example, 95% of the observations in a normal distribution lie within 1.960 standard deviations of the mean, leaving 2.5% of the observations in either tail of the distribution. Thus, the critical values for 95% confidence are -1.960 and 1.960. We will find out why these are referred to as critical values next week, but for now just recognize that these are the boundaries in units of standard deviation.
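To see why the boundaries for a normal distribution cannot simply be reused for small samples, consider the rough Python simulation below. It is a sketch under stated assumptions: samples of n = 5 from a normal population with μ = 0 and σ = 1, an SE estimated from each sample, and a boundary of 2.776, which is the two-tailed 0.05 value for 4 degrees of freedom given in standard t tables (check it against Table B.3). The function name and seed are made up for the example.

```python
import random
import statistics


def coverage(n, critical_value, n_samples=10000, seed=1):
    """Fraction of samples whose mean lies within critical_value estimated SEs of mu."""
    rng = random.Random(seed)
    mu, sigma = 0.0, 1.0
    hits = 0
    for _ in range(n_samples):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        se = statistics.stdev(sample) / n ** 0.5  # SE estimated from the sample itself
        if abs(statistics.fmean(sample) - mu) <= critical_value * se:
            hits += 1
    return hits / n_samples


print(coverage(5, 1.960))  # boundary borrowed from the normal distribution
print(coverage(5, 2.776))  # boundary for 4 degrees of freedom
```

The first fraction should fall noticeably short of 0.95, and the second should land close to it. The boundaries have to be wider when the SE is itself estimated from a small sample.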

Looking at Table B.3, we can see a column on the left labelled "v". The "v" stands for degrees of freedom, which for our current purposes (as mentioned above) are determined as n-1. There also are two rows of probabilities across the top, α(2) and α(1). Last week we were introduced to the concept of a decision rule, the probability below which we would consider an event to be too improbable to have occurred by chance. We will designate this probability as α (alpha). It also was mentioned that we typically use 0.05 as that level of probability, giving us a confidence level (1 - α) of 1 - 0.05 = 0.95 for those decisions. The upper row of probabilities across the top of the table is for two-tailed probabilities, α(2), where that 0.05 is divided between the two tails of the distribution (0.025 on either side). The lower row is for one-tailed probabilities, α(1), where the entire 0.05 is confined to one tail of the distribution. We will discuss the application of one-tailed vs two-tailed probabilities next week. For the present, the probabilities that we have been describing are two-tailed, and so we will apply the α(2) probabilities.
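The difference between α(2) and α(1) is easiest to see for the normal distribution, which is the limiting case of the t-distribution. The Python sketch below uses the standard library's NormalDist class; the function name normal_critical_value and its tails argument are invented for this illustration and are not anything from the textbook.

```python
from statistics import NormalDist


def normal_critical_value(alpha, tails=2):
    """Critical value of the standard normal distribution for a given alpha.

    tails=2 splits alpha between both tails; tails=1 puts all of it in one tail.
    """
    if tails not in (1, 2):
        raise ValueError("tails must be 1 or 2")
    return NormalDist().inv_cdf(1.0 - alpha / tails)


print(round(normal_critical_value(0.05, tails=2), 3))  # 1.96
print(round(normal_critical_value(0.05, tails=1), 3))  # 1.645
```

With the 0.05 split between the two tails, only 0.025 sits beyond each boundary, so the boundary has to be farther from the mean than when the entire 0.05 is placed in one tail.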

Look at the second page of the table, for v = ∞, and a two-tailed probability, α(2), of 0.05. This number (1.960) should look familiar. It tells you that 95% of the observations will be within 1.960σ of μ, which is another demonstration of the central limit theorem (the distribution of sample means is normal when n = ∞). Moving up the column, to lower values for the degrees of freedom, we can see that the boundaries increase. Essentially we are looking at the approach of the t-distribution to normality in reverse. Because the t-distribution describes the distribution of sample means around μ for sample sizes of v + 1, we can use the t-distribution to generate 95% confidence intervals for estimates of μ without worrying about the underlying distribution of the statistical population. The table values provide the boundaries, in units of standard deviation (remember that the standard deviation of sample means is SE), between which 95% of the observations should occur. Thus, we can calculate the 95% confidence intervals for a sample mean calculated from n observations as:

95% CI = t_{0.05(2), df} × SE, where df = n-1

The rather cumbersome subscript for t (the 95% CI calculation is just t×SE) indicates where to look on the table, with a two-tailed probability, α(2), of 0.05, and degrees of freedom (shown as df above, but denoted as v on Table B.3) equal to n-1. DO NOT add and subtract the interval from the mean when you report the results. Report the mean, type in "+/-" (or "±" if you are feeling fancy) and then report the interval. If the value for the degrees of freedom that you need is not present on the table, use the next lower number on the table, which gives a slightly larger value of t and therefore a slightly more conservative interval.
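The whole procedure, including the rule for degrees of freedom that are missing from the table, can be sketched in Python as follows. This is an illustration under stated assumptions, not a replacement for Table B.3: the dictionary holds only a small excerpt of two-tailed 0.05 values as they appear in standard t tables, the data are hypothetical, and the names T_CRIT_05_2, t_critical, and mean_and_ci95 are made up for this example.

```python
import statistics
from bisect import bisect_right

# Excerpt of two-tailed alpha = 0.05 critical values of t, keyed by degrees of freedom.
# Assumed to match standard t tables; verify against Table B.3 before relying on them.
T_CRIT_05_2 = {
    1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
    6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262, 10: 2.228,
    15: 2.131, 20: 2.086, 30: 2.042, 60: 2.000, 120: 1.980,
}


def t_critical(df):
    """Look up t for df, falling back to the next lower df present in the table."""
    available = sorted(T_CRIT_05_2)
    position = bisect_right(available, df)
    if position == 0:
        raise ValueError("df must be at least 1")
    return T_CRIT_05_2[available[position - 1]]


def mean_and_ci95(sample):
    """Return the sample mean and the 95% confidence interval (t x SE)."""
    n = len(sample)
    se = statistics.stdev(sample) / n ** 0.5
    return statistics.fmean(sample), t_critical(n - 1) * se


data = [9.1, 10.4, 11.2, 8.7, 10.9, 9.8, 10.3, 9.6, 10.0, 10.0]  # hypothetical sample
mean, ci = mean_and_ci95(data)
print(f"{mean:.2f} ± {ci:.2f}")
```

Note that the result is reported as the mean, "±", and the interval, rather than as a pair of limits. For degrees of freedom above the largest entry in this excerpt, the lookup falls back to the last row, which errs on the conservative side.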

Download this week's Excel workbook from HERE. It contains 3 data sets that we already have examined: condition factor (K) for bluegill sunfishes and bluegill x green sunfish hybrids (worksheet "fish"), bone alkaline phosphatase assays from cats (worksheet "BAP") and large seed masses for a population of barbed goatgrass from California (worksheet "goat").

Question 4: Calculate and report the sample means and 95% confidence intervals for all 4 samples in the Excel workbook (there are 2 samples for the sunfish data). Also, calculate and report the 99% confidence intervals for the BAP data.

If you used 1.96 for all of the 95% confidence intervals in question 4, it means that you know that the data are normally distributed, or that your sample size is infinite; neither of which is true. It also means that you didn't read anything past question 3 before answering question 4. You might want to take another stab at it...

Yet another application of the normal distribution involves the calculation of confidence intervals for variances...

Send comments, suggestions, and corrections to: Derek Zelmer