### The finite population correction

by Pamela Narins, Manager of Market Research

In the last column, we discussed a formula that could be used to determine the number of people that should be surveyed using a simple random sampling methodology. This formula works in many situations, but, of course, is not appropriate all of the time. Several readers asked for more information about a specific case, and that's what I'd like to address this time.

What happens if the population from which you are sampling is small? You might be doing research on city-wide elections, where the voting population is several thousand. Alternatively, your study might address employee satisfaction for a company of 500 people. If your population seems small, you can apply the finite population correction to the equation that we discussed last time (see Keywords 54).

The sample size derived using the finite population correction will be smaller than that derived from the uncorrected equation. This makes sense: you would expect to need fewer respondents from a smaller population.

Brief review
Below is the equation we used last time to determine sample size.

Equation 1

$$n = \frac{(P_y)(P_n)}{(\text{Standard Error})^2}$$

Remember that this equation requires an estimate of the percent of responses to a dichotomous variable indicated by Py and Pn [usually set at (.50)(.50) for the most conservative estimate], and the standard error. The addition of the finite population correction is simply the addition of two other algebraic terms.

Equation 2

$$n = \frac{(P_y)(P_n)}{(\text{Standard Error})^2 + \dfrac{(P_y)(P_n)}{N_1}}$$

[This formula is from Kish, Survey Sampling, (Wiley, 1965). There are other, slightly different versions of this formula, although all principally do the same thing.]

Basically, this new formula adds in numbers that we already know. The only term in the equation that you have not yet seen is N1, the size of the true population. Let's remember that the formula for determining the standard error is:

Equation 3

$$\text{Standard Error} = \frac{\text{absolute error}}{\text{Z or t coefficient}}$$

The Z or t distribution coefficient is determined by the chosen level of confidence, that is, 1.96 for 95%, 2.58 for 99%, and so on. If we want to be 95% certain that our results are plus or minus 5 percentage points from the actual score, the (Standard Error)² is (.05/1.96)², or .0006507.
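The arithmetic in this step can be checked with a short script. This is only a sketch: the function names `z_coefficient` and `standard_error_squared` are made up for illustration, and it assumes the normal (Z) distribution rather than t.

```python
from statistics import NormalDist


def z_coefficient(confidence: float) -> float:
    """Two-sided standard normal coefficient for a confidence level such as 0.95."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def standard_error_squared(margin: float, z: float) -> float:
    """Square of (absolute error / Z coefficient), following Equation 3."""
    return (margin / z) ** 2


# Coefficients for the two confidence levels mentioned in the text.
print(round(z_coefficient(0.95), 2))  # 1.96
print(round(z_coefficient(0.99), 2))  # 2.58

# Plus or minus 5 percentage points at 95% confidence, using the rounded 1.96.
print(standard_error_squared(0.05, 1.96))  # about 0.00065
```

Using the rounded coefficient 1.96, as the column does, keeps the result in line with the figure quoted above.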

The new term in the denominator of Equation 2 is simply the proportion of those answering "Yes" times those answering "No" [the most conservative estimate again being (.5)(.5), or .25] divided by N1, the actual number in the population.

An example
Let's use the voting example, where the population is 10,000. If we don't want to survey every voter, what sample size do we need if we want to have an absolute error of plus or minus 5 percentage points at the 95% confidence level (to be further discussed below)? Using equation 2:

$$n = \frac{(.5)(.5)}{.0006507 + \dfrac{(.5)(.5)}{10{,}000}}$$

we arrive at the sample size

$$n \approx 370$$

Using equation 1, the standard formula, the sample size estimate is 384.
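For readers who want to reproduce the example, here is a minimal sketch of both calculations. The function names are hypothetical, and the second formula is written as the text describes it: the uncorrected denominator plus the product of the two proportions divided by the population size. The company of 500 mentioned earlier is included for comparison.

```python
def sample_size(p_yes: float, se_squared: float) -> float:
    """Uncorrected sample size: (Py)(Pn) / (Standard Error)^2."""
    return p_yes * (1 - p_yes) / se_squared


def sample_size_fpc(p_yes: float, se_squared: float, population: int) -> float:
    """Sample size with the finite population correction added to the denominator."""
    variance = p_yes * (1 - p_yes)
    return variance / (se_squared + variance / population)


# Plus or minus 5 percentage points at the 95% confidence level.
se_squared = (0.05 / 1.96) ** 2

print("uncorrected:", round(sample_size(0.5, se_squared)))
for population in (10_000, 500):
    corrected = sample_size_fpc(0.5, se_squared, population)
    print(f"population {population}: corrected sample size {round(corrected)}")
```

The smaller the population, the more the correction matters: the adjustment is slight for 10,000 voters but large for a company of 500.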

A rule of thumb
If we get to the point where the finite population correction does little to change our suggested sample size (as in the example above), it pays to take the more conservative option and select the larger sample that would result from not using the correction. At what point do we want to do this? As with most things related to sampling, there are many considerations and judgment calls and few hard and fast rules. But there is a rule of thumb that can guide you:

If your sample size is more than 5% of the population size, consider using the finite population correction.

The municipal elections example provides us with a suggested sample of 384, which is about 4% of the total known population of 10,000. In this example, the finite population correction, which reduces the sample size by about 14 (from 384 to roughly 370), would save little time or cost.
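The rule of thumb can be written as a small decision function. This is a sketch, not a standard routine: the name `recommended_sample_size`, its default arguments, and the choice to round to the nearest whole respondent (to match the 384 used in this column) are all assumptions.

```python
def recommended_sample_size(
    population: int,
    margin: float = 0.05,
    z: float = 1.96,
    p_yes: float = 0.5,
    threshold: float = 0.05,
) -> tuple[int, bool]:
    """Return (sample size, whether the finite population correction was applied)."""
    variance = p_yes * (1 - p_yes)
    se_squared = (margin / z) ** 2
    uncorrected = variance / se_squared

    # Rule of thumb: apply the correction only when the uncorrected sample
    # would exceed the threshold share (5% by default) of the population.
    if uncorrected / population > threshold:
        corrected = variance / (se_squared + variance / population)
        return round(corrected), True
    return round(uncorrected), False


print(recommended_sample_size(10_000))  # (384, False): under 5%, keep the larger sample
print(recommended_sample_size(500))     # (217, True): well over 5%, apply the correction
```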

Likert scales
The other question posed by several readers is what to do if you do not have dichotomous variables, but have multi-category variables or Likert scales. Could we modify the equation from (Py)(Pn) to (P1)(P2)(P3)(P4)(P5), where P1 is the proportion of respondents selecting the "1" response (perhaps "Strongly Agree"), P2 is the proportion answering "2," and so on?

The bad news is no, there is no easy way to do it. The good news is that because we can take the most conservative approach (by setting Py and Pn to .50), all things being equal, we are in relatively little danger of taking too few respondents if we use the dichotomous method.
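One way to see why the (.5)(.5) setting is the conservative one is to tabulate the product for a range of proportions. The sketch below does that. Reading any single Likert category as "this response versus all others" is an illustrative assumption, not a method described in this column.

```python
# Product p * (1 - p) for proportions from 0.1 to 0.9.
proportions = [i / 10 for i in range(1, 10)]
products = {p: p * (1 - p) for p in proportions}

for p, product in products.items():
    print(f"p = {p:.1f}  p(1 - p) = {product:.2f}")

# The product peaks at p = 0.5, so a sample sized for (.5)(.5)
# is at least as large as one sized for any other split.
worst_case = max(products, key=products.get)
print("largest product at p =", worst_case)
```

Because the numerator of the sample size equation is this product, any split other than 50/50 calls for a smaller sample at the same standard error.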

Next time
The next edition of "Survey Samplings" will discuss some of the issues relating to questionnaire design, levels of measurement and their implications, and the pros and cons of various administration methods.
