# Appendix C. Answers to Exercises

**2-1** The mean is the sum of the observations divided by the number of observations, 24: 965/24 = 40.2. To find the median we list the observations in order, then select the (50/100)(24 + 1) = 12.5th point, which is the average of the 12th and 13th observations, (29 + 30)/2 = 29.5. The standard deviation is the square root of the sum of the squared differences between the observations and the sample mean divided by the sample size minus 1, 29.8. The 25th percentile is the (25/100)(24 + 1) = 6.25th point. Thus, the 25th percentile is between the 6th and 7th observations, which we average to obtain (13 + 13)/2 = 13. Likewise, the 75th percentile is the (75/100)(24 + 1) = 18.75th point, so we average the 18th and 19th observations to obtain (70 + 70)/2 = 70. The fact that the median is very different from the mean (29.5 versus 40.2) and not located roughly equidistant between the top and bottom quartiles indicates that the data were probably not drawn from a normal distribution. (If the data were symmetrically distributed about the median, we could have further checked for normality by computing the 2.5th, 16th, 84th and 97.5th percentiles and comparing them with values 2 and 1 standard deviations below and above the mean, as described in Fig. 2-10.)
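
The calculations in this answer can be sketched in a few lines of Python. This is only an illustration of the (p/100)(n + 1) rule as used above, where a fractional position is resolved by averaging the two neighboring observations. The function names `percentile_point` and `summarize` are hypothetical, and the example data are the integers 1 through 24, not the observations from the exercise.

```python
import math

def percentile_point(sorted_obs, p):
    """Percentile located at the (p/100)(n + 1)th point of the ordered data."""
    n = len(sorted_obs)
    pos = (p / 100) * (n + 1)      # 1-based position, e.g. 6.25 for n = 24, p = 25
    lo = max(1, math.floor(pos))   # observation just below the position
    hi = min(n, math.ceil(pos))    # observation just above the position
    # When pos is a whole number, lo == hi and this returns that observation;
    # otherwise it averages the two neighboring observations.
    return (sorted_obs[lo - 1] + sorted_obs[hi - 1]) / 2

def summarize(obs):
    data = sorted(obs)
    n = len(data)
    mean = sum(data) / n
    # Sample standard deviation: divide the sum of squares by n - 1.
    sd = math.sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))
    return {
        "mean": mean,
        "sd": sd,
        "p25": percentile_point(data, 25),
        "median": percentile_point(data, 50),
        "p75": percentile_point(data, 75),
    }

# Hypothetical data (NOT the exercise's observations): the integers 1..24.
example = summarize(range(1, 25))
print(example)
```

With 24 observations the positions are 6.25, 12.5 and 18.75, exactly as in the answer, so the sketch averages the 6th and 7th, 12th and 13th, and 18th and 19th ordered values.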

**2-2** Mean = 61,668, median = 13,957, standard deviation = 117,539, 25th percentile = 8914, 75th percentile = 63,555, mean − 0.67 standard deviations = −17,083, mean + 0.67 standard deviations = 140,419. These data appear not to be drawn from a normally distributed population for several reasons. (1) The mean and median are very different. (2) All the observations are (and have to be, since you cannot have a negative viral load) greater than zero and the standard deviation is larger than the mean. If the population were normally distributed, it would have to include negative values of viral load, which is impossible. (3) The relationship between the percentiles and the number of standard deviations about the mean is different from what you would expect if the data were drawn from a normally distributed population: the 25th and 75th percentiles of a normal distribution lie about 0.67 standard deviations below and above the mean.
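
The comparison in reasons (2) and (3) can be checked numerically. The following is a minimal sketch that uses only the summary values quoted in this answer and the standard-library `statistics.NormalDist` class. The variable names are illustrative and the raw observations are not needed.

```python
from statistics import NormalDist

# Summary values quoted in the answer above.
mean, sd = 61_668, 117_539
q25_obs, q75_obs = 8_914, 63_555

# In a normal distribution the 75th percentile lies about 0.674 standard
# deviations above the mean; the text rounds this to 0.67.
z75 = NormalDist().inv_cdf(0.75)

lower = mean - 0.67 * sd   # where the 25th percentile would fall if normal
upper = mean + 0.67 * sd   # where the 75th percentile would fall if normal
print(round(lower), round(upper))   # -17083 140419

# Fraction of a normal population with this mean and standard deviation
# that would fall below zero, which is impossible for a viral load.
frac_below_zero = NormalDist(mean, sd).cdf(0)
print(f"observed quartiles: {q25_obs}, {q75_obs}")
print(f"fraction below zero under normality: {frac_below_zero:.2f}")
```

The observed quartiles (8914 and 63,555) are far from the values a normal distribution would predict, and a normal population with this mean and standard deviation would place roughly 30 percent of its members below zero.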

**2-3** Mean = 4.30, median = 4.15, standard deviation = 0.67, 25th percentile = 3.93, 75th percentile = 4.79, mean − 0.67 standard deviations = 3.85, mean + 0.67 standard deviations = 4.75. These data appear to be drawn from a normally distributed population on the basis of the comparisons in the answer to Prob. 2-2.

**2-4** Mean = 1709, median = 1750, standard deviation = 825, 25th percentile = 825, 75th percentile = 2400, mean − 0.67 standard deviations = 1157, mean + 0.67 standard deviations = 2262. These data appear to be drawn from a normally distributed population on the basis of the comparisons in the answer to Prob. 2-2.

**2-5** There is 1 chance in 6 of getting each of the following values: 1, ...