Statisticians often use a random sample to estimate characteristics of a population when the population is very large and they cannot obtain data on every individual in the population. Statistical estimation asks the fundamental questions "What can I say about a whole population based on information from a random sample of that population?" and "To what degree can I say that my estimate is accurate?" Let's put random sampling into action to answer a question about demographics: "How many penguins are there on a particular ice floe in the Antarctic?"
Counting a penguin population can be tricky. Penguins tend to move around and swim off, and it's cold! So scientists use aerial photographs and statistical sampling to estimate population size. Some of the techniques they use are quite sophisticated, but we can look at a simplified version of their approach to examine the basic ideas of random sampling and estimation.
Imagine a large, snow-covered, square region of the Antarctic that is inhabited by penguins. From above, it would look like a white square sprinkled with black dots.
If you had access to such an aerial view, you could count the dots to determine the number of penguins in this region. But suppose the region were too large to capture in one photo. You might instead divide it into 100 smaller square sub-regions, photograph each one, count the penguins in each photo, and add the counts to obtain a total for the entire region.
However, photographing all 100 sub-regions might take too long and cost too much. Here's an alternative: select a representative sample of the sub-regions, obtain photos of only these, and use their counts to estimate the total number of penguins in the entire region.
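To make the idea concrete, here is a minimal sketch in Python of one simple way to turn sample counts into an estimate: average the counts from the sampled sub-regions and multiply by the total number of sub-regions. The penguin counts below are made-up illustrative data, and the function name `estimate_total`, the sample size of 10, and the random seed are assumptions chosen for this example, not part of the scientists' actual method.

```python
import random


def estimate_total(counts, sample_size, rng):
    """Estimate a population total from a simple random sample of sub-regions.

    counts      -- penguin count for every sub-region (normally unknown in full)
    sample_size -- how many sub-regions to photograph
    rng         -- a random.Random instance, passed in for reproducibility
    """
    # Draw sub-regions at random, without replacement.
    sampled = rng.sample(counts, sample_size)
    # Average count per sampled sub-region.
    mean_per_subregion = sum(sampled) / sample_size
    # Scale the average up to all sub-regions.
    return mean_per_subregion * len(counts)


rng = random.Random(0)

# Hypothetical data: a made-up count for each of the 100 sub-regions.
counts = [rng.randint(0, 12) for _ in range(100)]

true_total = sum(counts)                    # what a full count would give
estimate = estimate_total(counts, 10, rng)  # what 10 photos would suggest

print("True total:", true_total)
print("Estimate from 10 sub-regions:", estimate)
```

Running the sketch with different seeds or sample sizes shows that the estimate varies from sample to sample, which is exactly why the second question, "To what degree can I say that my estimate is accurate?", matters.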