Solutions for Session 9, Part D

Problem D1

More of the estimates from the distribution for sample size 20 fall in the 4H and 5L stems (i.e., in the range 450-549) than of the estimates from the distribution for sample size 10. This suggests that the estimates from 20 sub-regions are more accurate.

Problem D2

a. Here is the completed table:

                                Proportion of Estimates in Interval
Interval    Interval Length     Sample Size 10     Sample Size 20
350-650     300                 100/100            100/100
375-625     250                 98/100             100/100
400-600     200                 94/100             98/100
425-575     150                 84/100             94/100
450-550     100                 69/100             83/100
475-525     50                  37/100             55/100

b. Every interval contains at least as high a proportion of the estimates from samples of 20 sub-regions as from samples of 10, and every interval narrower than 350-650 contains a strictly higher proportion. For instance, the interval 450-550 contains 83/100 of the estimates from samples of size 20, compared to 69/100 of the estimates from samples of size 10. In other words, a higher proportion of the estimates falls within 50 penguins of the actual population size (500) when samples of size 20 are used. This suggests that increasing the sample size from 10 to 20 noticeably improves the accuracy of the estimates.

Problem D3

a. With 100 ordered estimates, the median is in position (100 + 1)/2 = 50.5, so it is the average of the 50th and 51st values in the ordered list. Each of these values is 500, so the median is 500.

b. Each half of the ordered list contains 50 values, so the quartiles are in position (50 + 1)/2 = 25.5 of their respective halves. Each quartile is therefore the average of the 25th and 26th values in its half.

c. Here is the completed table:

                        Sample Size 10
Maximum                 620
Upper Quartile (Q3)     540
Median                  500
Lower Quartile (Q1)     470
Minimum                 360

Problem D4
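The five-number summaries in Problems D3 and D4 both follow the same position rule: the median is the middle value of the ordered list (or the average of the two middle values), and each quartile is the median of one half of the list. The following is a minimal Python sketch of that rule. The function names and the short list of estimates are hypothetical, chosen only for illustration; they are not the 100 estimates used in the problems.

```python
def median_of_sorted(values):
    """Median of an already sorted list, using the (n + 1) / 2 position rule."""
    n = len(values)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: position (n + 1) / 2 is a single middle value.
        return values[mid]
    # Even count: position (n + 1) / 2 falls between two values; average them.
    return (values[mid - 1] + values[mid]) / 2


def five_number_summary(data):
    """Minimum, Q1, median, Q3, maximum, with quartiles as medians of the halves."""
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # For an odd count, the middle value is left out of both halves.
    upper = ordered[half:] if n % 2 == 0 else ordered[half + 1:]
    return {
        "min": ordered[0],
        "q1": median_of_sorted(lower),
        "median": median_of_sorted(ordered),
        "q3": median_of_sorted(upper),
        "max": ordered[-1],
    }


# Hypothetical estimates, for illustration only.
sample = [380, 440, 460, 480, 500, 510, 530, 550, 580, 600]
summary = five_number_summary(sample)
print(summary)
```

With 100 estimates, the same function averages the 50th and 51st values for the median and the 25th and 26th values of each half for the quartiles, matching the positions worked out in Problem D3.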

Here is the completed table:

                        Sample Size 20
Maximum                 610
Upper Quartile (Q3)     530
Median                  500
Lower Quartile (Q1)     482.5
Minimum                 390

Problem D5

Here are the completed box plots:

[Figure: box plots of the estimates for sample size 10 and sample size 20; not reproduced]

Problem D6

a. The sample-to-sample variation goes down as the sample size increases. This is exhibited by the shrinking box portion of the graphs.

b. The estimates are closer to the actual value as the sample size increases. Both the range and the interquartile range decrease from sample size 10 to sample size 20: the range falls from 260 (620 - 360) to 220 (610 - 390), and the interquartile range falls from 70 (540 - 470) to 47.5 (530 - 482.5).
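The comparison of spreads in part (b) can be checked directly from the five-number summaries in Problems D3 and D4. The sketch below is one way to do that in Python. The dictionary layout and the helper name `spread` are assumptions made for this example; the numbers are copied from the two tables.

```python
# Five-number summaries copied from the tables in Problems D3 and D4,
# keyed by sample size.
summaries = {
    10: {"min": 360, "q1": 470, "median": 500, "q3": 540, "max": 620},
    20: {"min": 390, "q1": 482.5, "median": 500, "q3": 530, "max": 610},
}


def spread(summary):
    """Return (range, interquartile range) for one five-number summary."""
    value_range = summary["max"] - summary["min"]
    iqr = summary["q3"] - summary["q1"]
    return value_range, iqr


for size, s in summaries.items():
    value_range, iqr = spread(s)
    print(f"Sample size {size}: range = {value_range}, IQR = {iqr}")
```

The interquartile range corresponds to the length of the box in each box plot, which is why the box shrinks as the sample size increases.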