Session 9, Part B: Selecting the Sample

There are many different ways to randomly select 10 sub-regions. Many of these methods begin by numbering the 100 sub-regions. In this section, we will use the numbering system below, which numbers the sub-regions from 00 through 99:

Locating number positions is easier if we put digits on the outside borders as shown. Each number in the grid corresponds to a red and blue number combination; the red number is the first digit, and the blue number is the second digit:


Problem B2


Think of a way to pick 10 numbers between 00 and 99 at random. (You may prefer to select each digit individually, or to select the entire two-digit number at once.) Then use your method to generate the 10 random numbers.

Stop! Do the above problem before you proceed.
Tip: You may wish to use a random-number-generating device, such as a calculator, a 10-sided die, or computer software, to generate the random numbers.


One possible method for solving Problem B2 is to use two 10-sided dice, one red and one blue. Roll both dice and read the red die as the first digit and the blue die as the second digit; the resulting two-digit number identifies the sub-region.
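For readers who prefer computer software to physical dice, the two-dice method can be imitated in a few lines of Python. This is only a sketch: the function name roll_two_dice is made up for illustration, and it assumes each die has faces 0 through 9.

```python
import random

def roll_two_dice(rng):
    """Simulate a red and a blue 10-sided die (faces 0-9, an assumption)
    and return the two-digit sub-region label, red digit first."""
    red = rng.randrange(10)    # first digit
    blue = rng.randrange(10)   # second digit
    return f"{red}{blue}"

rng = random.Random()          # unseeded, so each run gives different rolls
picks = [roll_two_dice(rng) for _ in range(10)]
print(picks)                   # ten labels between "00" and "99"; repeats are possible
```

Because each roll is independent of the others, the same label can come up more than once in the list.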

You might notice that the random selection process will sometimes produce duplicates. There is a greater than one-third chance (about 37%) that 10 numbers picked at random between 00 and 99 will include at least one duplicate, and about an 87% chance that 20 such numbers will include at least one duplicate.
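The chance of a duplicate can be checked by first computing the chance that every pick is different and then subtracting from 1. The sketch below does this in Python; the function name prob_duplicate is illustrative, and it assumes every number from 00 to 99 is equally likely on each pick.

```python
def prob_duplicate(k, n=100):
    """Chance that k independent, equally likely picks from n labels
    include at least one repeat."""
    p_all_distinct = 1.0
    for i in range(k):
        # The (i+1)-th pick must avoid the i labels already seen.
        p_all_distinct *= (n - i) / n
    return 1.0 - p_all_distinct

print(round(prob_duplicate(10), 3))   # 0.372
print(round(prob_duplicate(20), 3))   # 0.87
```

The same reasoning underlies the well-known "birthday problem," with 100 sub-regions in place of 365 days.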

For instance, you might find that seven tosses of the dice produced these sub-region choices:
19 22 39 50 34 05 39

If we do not want duplicates, we can skip them until we get 10 distinct numbers, for example:
19 22 39 50 34 05 75 62 87 13

This is called sampling without replacement, since each time we choose a sub-region we remove it from the list of sub-regions we can choose on the next toss of the dice. In some experiments, it may be impractical or impossible to exclude duplicates from the random selection process. If duplicates are allowed, it is called sampling with replacement.
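The difference between the two kinds of sampling can be seen in a short Python sketch. The function name draw_distinct is made up for illustration; it imitates rolling again whenever a duplicate appears.

```python
import random

def draw_distinct(k, rng, n=100):
    """Pick labels one at a time, skipping repeats, until k distinct
    labels have been collected (sampling without replacement)."""
    chosen = []
    while len(chosen) < k:
        label = rng.randrange(n)
        if label not in chosen:        # a duplicate is skipped, like re-rolling the dice
            chosen.append(label)
    return [f"{label:02d}" for label in chosen]

rng = random.Random()

# Sampling without replacement: all 10 labels are different.
without_replacement = draw_distinct(10, rng)

# Sampling with replacement: every pick is kept, so repeats may appear.
with_replacement = [f"{rng.randrange(100):02d}" for _ in range(10)]

print(without_replacement)
print(with_replacement)
```

Python's standard library also offers random.sample(range(100), 10), which returns 10 distinct values in a single call.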

The 10 distinct numbers (19, 22, 39, 50, 34, 05, 75, 62, 87, 13) correspond to these 10 sub-regions:

Here is a look at the number of penguins in each of the 10 sub-regions we selected:

The estimate of the total number of penguins for the entire region based on this random sample of 10 sub-regions is as follows:

100 x [(5 + 6 + 6 + 7 + 5 + 2 + 1 + 5 + 5 + 3)/10] = 100 x (45/10) = 450
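The same arithmetic can be written as a small Python sketch: find the average number of penguins per sampled sub-region, then scale up to all 100 sub-regions. The function and variable names are illustrative; the counts are the ones from the calculation above.

```python
def estimate_total(sample_counts, n_subregions=100):
    """Scale the sample mean up to the whole region."""
    mean_per_subregion = sum(sample_counts) / len(sample_counts)
    return n_subregions * mean_per_subregion

# Penguin counts in the 10 sampled sub-regions, as used in the calculation above.
counts = [5, 6, 6, 7, 5, 2, 1, 5, 5, 3]

estimate = estimate_total(counts)
print(estimate)   # 450.0
```

To work Problem B3, one could replace the list of counts with the counts from one's own 10 sub-regions.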


Problem B3


Use the random sample you found in Problem B2 to estimate the total number of penguins in the region. Find your 10 random sub-regions in the chart below:


Problem B4

Did you expect your estimate from Problem B3 to equal the estimate of 450 computed above from the sample of 10 sub-regions? Why or why not? What explains this variation? If the sample size were increased to 20 sub-regions, would you expect the variation in the estimates to increase or decrease? Why?


© Annenberg Foundation 2017. All rights reserved. Legal Policy