Appendix D: Image Descriptions

Figure 1.1 Image Description: A larger ellipse in blue is labelled “Population.” Inside the blue ellipse is a smaller, white ellipse outlined in orange labelled “Sample.” The smaller ellipse is entirely inside the larger, showing that a sample is a portion of a population. [Return to Figure 1.1]

Table 1.1 Image Description: The table has 50 columns and 20 rows. The columns are divided into groups of 10 and the rows are in groups of 5. Each cell has a random number from 0 to 9. The number 82 at the intersection of 05 with 00 and 01 is highlighted, and a set of red arrows shows the numbers selected after 82. [Return to Table 1.1]

Snapshot 1.1 Image Description: A screenshot of the R commander window. The section for input, called R Script, has the following lines: “set dot seed (4061) line-break sample (1:10 comma 10).” The output window repeats the above lines with a new line underneath that says “[1] 53 14 57 13 8 45 11 50 59 25.” [Return to Snapshot 1.1]

Snapshot 1.2 Image Description: A screenshot of the R commander window. The section for input, called R Script, has the following lines: “set dot seed (6194) line-break sample (1:100 comma 5).” The output window repeats the above lines with a new line underneath that says “[1] 59 9 1 40 77.” [Return to Snapshot 1.2]

Figure 1.2 Image Description: The left panel is a bar graph showing the relative frequency of how students came to school. The vertical axis marks relative frequency ranging from 0 to 0.5 in intervals of 0.1. The horizontal axis shows the categories of the variable “Transport”. The values are car (at 0.194), public (at 0.498), bike (at 0.033), walk (at 0.271), and others (at 0.004). The right panel is a pie chart showing percentages for the same data, with one slice per category. Clockwise from the top, the slices show that 19.4% of students came by car, 0.4% by other means, 27.1% walked, 3.3% came by bike, and 49.8% came by public transportation. [Return to Figure 1.2]

Figure 1.3 Image Description: The left panel is a pie chart showing percentages of how female students came to school. Clockwise from the top, the slices show that 16.9% of female students came by car, 0% by other means, 25.7% walked, 3.4% came by bike, and 54.1% came by public transportation. The right panel is also a pie chart, showing percentages of how male students came to school. Clockwise from the top, the slices show that 22.4% of male students came by car, 0.8% by other means, 28.8% walked, 3.2% came by bike, and 44.8% came by public transportation. [Return to Figure 1.3]

Figure 1.4 Image Description: A side-by-side bar graph comparing relative frequency of how female and male students came to school. The vertical axis marks relative frequency ranging from 0 to 1 in intervals of 0.2. The horizontal axis shows the categories of variable “Transport”. The values are car (at 0.169 for females and 0.224 for males), public (at 0.541 for females and 0.448 for males), bike (at 0.034 for females and 0.032 for males),  walk (at 0.257 for females and 0.288 for males), others (at 0 for females and 0.008 for males). [Return to Figure 1.4]

Figure 1.5 Image Description: A histogram of number of siblings. The y-axis (vertical) is frequency, going from 0 to 35 in increments of 5. The x-axis (horizontal) is the number of siblings with labelled values of 0, 1, 2, 3, and larger than 3. Five bars are shown as follows: x at 0 has a height of 10, x at 1 has a height of 30, x at 2 has a height of 35, x at 3 has a height of 15, and x at more than 3 has a height of 10. [Return to Figure 1.5]

Figure 1.6 Image Description: A histogram of grade. The y-axis is frequency, going from 0 to 14 in increments of 2. The x-axis is grades from 0 to 100 with 10 bars taking up intervals of width 10. The heights of the bars are as follows: x at 0 to 10 has a height of 1, x at 10 to 20 has a height of 0, x at 20 to 30 has a height of 2, x at 30 to 40 has a height of 4, x at 40 to 50 has a height of 8, x at 50 to 60 has a height of 14, x at 60 to 70 has a height of 10, x at 70 to 80 has a height of 6, x at 80 to 90 has a height of 3, and x at 90 to 100 has a height of 2. [Return to Figure 1.6]

Figure 1.7 Image Description: A stem and leaf diagram of grade. The stems are listed vertically to the left of a vertical line, starting from 0 to 9. The leaves of each stem are listed horizontally on the right of the vertical line. The leaf of 0 is 9, no leaf for stem 1. Leaves of 2 are 4 and 9, leaves of 3 are 4, 7, 7, and 9. Leaves of 4 are 0, 2, 6, 7, 8, 8, 9, 9. Leaves of 5 are 0, 1, 2, 3, 4, 5, 5, 5, 6, 6, 7, 9, 9 and 9. Leaves of 6 are 0, 0, 1, 2, 5, 5, 8, 8, 8, and 8. Leaves of 7 are 1, 2, 4, 5, 6, and 9. Leaves of 8 are 1, 3 and 5. The last row are the leaves of 9: 0 and 2. [Return to Figure 1.7]

Figure 1.8 Image Description: Nine special shapes of distributions presented in three rows and three columns. The three figures in the first row are as follows: figure (a) is a bell-shaped curve, (b) is an isosceles triangle, and (c) is a rectangle called a uniform distribution. The three figures in the second row are as follows: figure (d) is a non-symmetrical curve with a peak and a longer tail at the right end, called right skewed; figure (e) is a non-symmetrical curve with a peak and a longer tail at the left end, called left skewed; and figure (f) is non-symmetrical and slopes upward continually, like a capital letter J. The three figures in the third row are as follows: figure (g) is non-symmetrical and slopes downward continually, like a reversed capital letter J; figure (h) is a symmetrical curve with two peaks, called a bimodal distribution; and figure (i) is a non-symmetrical curve with three peaks, called multi-modal. [Return to Figure 1.8]

Figure 1.8.1 Image Description: A histogram of grade. The y-axis is frequency from 0 to 14 in increments of 2. The x-axis is grades from 0 to 100 with 10 bars taking up intervals of width 10. The heights of the bars are as follows: x at 0 to 10 has a height of 1, x at 10 to 20 has a height of 0, x at 20 to 30 has a height of 2, x at 30 to 40 has a height of 4, x at 40 to 50 has a height of 8, x at 50 to 60 has a height of 14, x at 60 to 70 has a height of 10, x at 70 to 80 has a height of 6, x at 80 to 90 has a height of 3, and x at 90 to 100 has a height of 2. [Return to Figure 1.8.1]

Figure 1.9 Image Description: A histogram of survival time after diagnosis of cancer. The y-axis is frequency from 0 to 1000 in increments of 200. The x-axis is survival time in years from 0 to 32 with 32 bars in increments of 1 year. The heights of the bars are as follows: x at 1 is close to 550, x at 2 is close to 1000, x at 3 is close to 1050 (this is the peak of the histogram), and x at 4 is close to 900. Following 4, the height of the bars decreases as x increases. The bars after x at 15 have a height very close to zero. [Return to Figure 1.9]

Figure 1.10 Image Description: A histogram of salary. The y-axis is frequency from 0 to 8 in increments of 2. The x-axis is salary from 20 to 60 with 9 bars in increments of 5. The heights of the bars are as follows: x at 20 to 25 has a height of 1, x at 25 to 30 has a height of 3, x at 30 to 35 has a height of 8, x at 35 to 40 has a height of 7, x at 40 to 45 has a height of 5, x at 45 to 50 has a height of 9, x at 50 to 55 has a height of 7, x at 55 to 60 has a height of 1, and x at 60 to 65 has a height of 1. [Return to Figure 1.10]

Assignment 1 Question 2 Image Description: The table shows the first thirty entries of the home sale spreadsheet. Each row is numbered, representing a single house. The columns are labelled “Size” in square feet, “Pool” yes or no, “Area” in square feet, “Age” in years, “Bath” in number of bathrooms, “Stories” in number of stories, “Garage” yes or no, “Traffic” yes or no, “Roof” tile or non-tile, and “Price” in dollars. The first entry (number one) has the following data: Size is 1865, Area is 9509.4, Age is 18, Bath is 2.5, Stories is 1, Garage is 2, Traffic is no, Roof is non-tile, and Price is 145950. For more entries, please download M01_SaleHome.xlsx from the top of the assignment page. [Return to Question 2]

Figure 2.1 Image Description: Three histograms in a row. The histogram on the left panel is roughly symmetric, the mean (red solid vertical line) and the median (the blue dashed vertical line) are almost identical. The histogram in the middle is right skewed with a longer tail on the right, the mean (red solid vertical line) is on the right of the median (the blue dashed vertical line). The histogram on the right panel is left skewed with a longer tail on the left, the mean (red solid vertical line) is on the left of the median (the blue dashed vertical line). [Return to Figure 2.1]

Figure 2.2 Image Description: A histogram of survival time after diagnosis of cancer. The y-axis is frequency from 0 to 1000 in increments of 200. The x-axis is survival time in years from 0 to 32 with 32 bars in increments of 1 year. The heights of the bars are as follows: x at 1 is close to 550, x at 2 is close to 1000, x at 3 is close to 1050 (this is the peak of the histogram), and x at 4 is close to 900. Following 4, the height of the bars decreases as x increases. The bars after x at 15 have a height very close to zero. [Return to Figure 2.2]

Figure 2.3 Image Description: A vertical box plot for the given data. The y-axis is in increments of 5 from 0 to 20. The boxplot’s bottom adjacent value is 1, the smallest value within the lower and upper limits. The lower whisker (a short dashed line) extends from the bottom adjacent value, 1, to the first quartile which is 4. The box begins at the first quartile and extends to the third quartile which is 10. A horizontal line is drawn at the median which is 7. And the upper whisker (a short dashed line) extends from the third quartile which is 10 to the top adjacent value which is 11. The observation 21 is an outlier indicated as a circle. [Return to Figure 2.3]
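The lower and upper limits mentioned in the Figure 2.3 description can be checked with a few lines of Python. This is a minimal sketch that assumes the common 1.5 × IQR rule for the limits (the description itself does not state the multiplier); the variable names are illustrative only.

```python
# Quartiles and notable observations as read from the Figure 2.3 description.
q1, q3 = 4, 10
candidate_points = [1, 11, 21]  # bottom adjacent value, top adjacent value, flagged point

iqr = q3 - q1                  # interquartile range
lower_limit = q1 - 1.5 * iqr   # assumption: 1.5 x IQR rule
upper_limit = q3 + 1.5 * iqr

# Observations outside the limits are drawn as individual outlier points.
outliers = [x for x in candidate_points if x < lower_limit or x > upper_limit]

print(f"IQR = {iqr}, limits = ({lower_limit}, {upper_limit}), outliers = {outliers}")
```

Under that assumption the limits are negative 5 and 19, so 21 falls outside them while 11 is the largest observation still inside.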

Figure 2.4 Image Description: Three box plots with the same y-axis in a row. The y-axis is in increments of 0.2 from 0 to 1. The leftmost panel is a box plot of a right skewed distribution ranging from 0 to 0.5. The lower whisker (a dashed line extending from the smallest observation to the first quartile) is shorter than the upper whisker (a dashed line extending from the third quartile to the upper adjacent value). The distance between the first quartile and the median is also shorter than the distance between the median and the third quartile. There are also several outliers on the top. The middle panel presents a box plot of a symmetric distribution ranging from 0.2 to 0.8. The lower whisker and the upper whisker are roughly of the same length. The distance between the first quartile and the median and the distance between the median and the third quartile are roughly the same. There is one outlier on the top. The rightmost panel presents a box plot of a left skewed distribution ranging from 0.5 to 1. The lower whisker is longer than the upper whisker. The distance between the first quartile and the median is also larger than the distance between the median and the third quartile. There are also several outliers at the bottom. [Return to Figure 2.4]

Figure 2.5 Image Description: A pair of boxplots, titled “Boxplot of Non-attendees & Attendees”. The y-axis is labelled “Final grade” and is in increments of 5 from 35 to 95. There are two boxplots represented here, the one on the left is labelled “Non-attendee” and the one on the right is labelled “Attendee”. There are no outliers for either group. The range of the boxplot for “Non-attendee” is roughly between 35 and 87 with a median of 65, and the box (IQR) goes from about 52 to 78. The range of the box plot for “Attendee” is roughly between 48 and 97 with a median of 78, and the box (IQR) goes from about 70 to 85. [Return to Figure 2.5]

Figure 2.6 Image Description: A vertical box plot for the  given data. The y-axis is in increments of 1 from negative 5 to 1. The boxplot’s bottom adjacent value is 0.05, the smallest value within the lower and upper limits. The lower whisker extends from the bottom adjacent value, 0.05, to the first quartile which is 0.2. The box begins at the first quartile and extends to the third quartile which is 0.7. A horizontal line is drawn at the median which is indicated as Q2 equals 0.5. And the upper whisker goes from the third quartile which is 0.7 to the top adjacent value which is 0.95. The observation negative 5 is an outlier indicated as a circle at the bottom. [Return to Figure 2.6]

Figure 3.1 Image Description: A scatter plot with y-axis labelled “Proportion of Heads” and x-axis labelled “# of Experiments”. The y-axis is in increments of 0.02 from 0.44 to 0.54 and the x-axis is in increments of 20000 from 0 to 100000. As the number of experiments increases the proportion of heads starts with 0.45 and then goes up to 0.5 and then fluctuates under a red horizontal dashed line at 0.5. The points are getting closer to 0.5 when the number of experiments goes above 70000. [Return to Figure 3.1]

Figure 3.2 Image Description: Four Venn diagrams are presented in a 2 by 2 matrix. Each Venn diagram has a rectangle with a label “S” on the top-left corner representing the sample space. The top-left Venn diagram is labelled “E” under the rectangle and has an oval in the center. The oval has a letter “E” in the center and is filled with light blue. The top-right Venn diagram is labelled “not E” under the rectangle and has an oval in the center. The oval has a letter “E” in the center and the whole rectangle except the oval is filled with light blue. The bottom-left Venn diagram is labelled “A & B” under the rectangle and has two overlapped ovals in the center; the left oval is labelled “A”, the right oval is labelled “B”, and their overlap is labelled “A & B”. The overlap is filled with blue. The bottom-right Venn diagram is labelled “A or B” under the rectangle and has two overlapped ovals in the center; the left oval is labelled “A” and the right oval is labelled “B”. Both ovals and their overlap are filled with blue. [Return to Figure 3.2]

Figure 3.3 Image Description: A Venn diagram shows event A as a subset of event B. A rectangle with a label “S” on the top-left corner represents the sample space. There is a big oval labelled “B” in the center of the rectangle and filled with light blue. There is a small oval labelled “A” inside the big oval “B” and filled with dark blue. [Return to Figure 3.3]

Figure 3.4 Image Description: This tree diagram grows from the left to the right and has two levels. The first level has two branches representing two possible outcomes of Midterm I. The upper branch is labelled “P(A1) equals 0.15” above and “greater than or equal to 90” below the branch. The lower branch reads “P(B1) equals 0.85” below and “less than 90” above the branch. Both the upper and lower branches of the first level have two sub-branches representing the outcomes of Midterm II given the result of Midterm I. The topmost branch is connected to the first upper branch and reads “P(A2 given A1) equals 0.8” above and “greater than or equal to 90” below the sub-branch. The next branch down is connected to the first upper branch and reads “P(B2 given A1) equals 0.2” below and “less than 90” above the sub-branch. The next branch down is connected to the first lower branch and reads “P(A2 given B1) equals 0.1” above and “greater than or equal to 90” below the sub-branch. The bottommost branch is connected to the first lower branch and reads “P(B2 given B1) equals 0.9” below and “less than 90” above the sub-branch. The tree has four paths; the outcomes and their associated probabilities are listed in two columns to the right of the tree. The outcome of the first path is “A2 & A1” with probability “0.15 times 0.8 equals 0.12”. The outcome of the second path is “B2 & A1” with probability “0.15 times 0.2 equals 0.03”. The outcome of the third path is “A2 & B1” with probability “0.85 times 0.1 equals 0.085”. The outcome of the fourth path is “B2 & B1” with probability “0.85 times 0.9 equals 0.765”. [Return to Figure 3.4]
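The path probabilities listed to the right of the tree follow from multiplying along each branch. A minimal Python sketch, using only the probabilities quoted in the description (the dictionary layout and variable names are illustrative):

```python
import math

# First-level probabilities (Midterm I) and conditional probabilities (Midterm II).
first_level = {"A1": 0.15, "B1": 0.85}
second_level = {
    "A1": {"A2": 0.8, "B2": 0.2},
    "B1": {"A2": 0.1, "B2": 0.9},
}

# Multiplication rule: P(second & first) = P(first) * P(second given first).
path_probs = {
    (second, first): first_level[first] * p_cond
    for first, branches in second_level.items()
    for second, p_cond in branches.items()
}

for (second, first), p in path_probs.items():
    print(f"P({second} & {first}) = {p:.3f}")

# The four paths cover every outcome, so their probabilities should sum to 1.
assert math.isclose(sum(path_probs.values()), 1.0)
```

The final check is a useful habit when reading any tree diagram: if the path probabilities do not add up to 1, a branch has been mislabelled or omitted.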

Example 3.1 Image Description: This tree diagram grows from the left to the right and has two levels. The first level has two branches representing two possible outcomes of smoking status. The upper first branch reads “P(S) equals 0.2” above the branch. The lower first branch reads “P(not S)” below the branch. Both the upper and lower branches of the first level have two sub-branches representing the outcomes of breast cancer status given their smoking status. The topmost branch is connected to the first upper branch and reads “P(B given S) equals one-third” above the sub-branch. The next branch down is connected to the first upper branch and reads “P(not B given S) equals two-thirds” below the sub-branch. The next branch down is connected to the first lower branch and reads “P(B given not S) equals three-seventeenths” above the sub-branch. The bottom most branch is connected to the first lower branch and reads “P(not B given not S) equals fourteen-seventeenths” below the sub-branch. The tree has four paths, the outcome events and their associated probabilities are listed in two columns to the right of the tree. The event of the first path is “ B & S” with probability “0.2 times one-third equals 0.0667”. The event of the second path is “not B & S” with probability “0.2 times two-thirds equals 0.1333”. The event of the third path is “B & not S” with probability “0.8 times three-seventeenths equals 0.1412”. The event of the fourth path is “not B & not S” with probability “0.8 times fourteen-seventeenths equals 0.6588”. [Return to Example 3.1]

Figure 4.1 Image Description: A mapping for the random variable X, the number of tails that occur if we flip a coin twice. There are two ovals. The one labelled S shows the possible outcomes of the two coin flips and the one labelled X shows all possible numeric values of the random variable X, the number of tails. The possible outcomes listed in S are HH, HT, TH, and TT. The possible values listed in X are 0, 1, and 2. The value HH from S is connected to X with an arrow pointing to 0. The values HT and TH in S are connected to X with an arrow each, both pointing to 1. The value TT in S is connected to X with an arrow pointing to 2. [Return to Figure 4.1]

Figure 4.2 Image Description: A mapping for the random variable X, number of siblings if we randomly select a student. There are two ovals. The one representing S shows the randomly selected student and the one labelled X shows all possible values of the random variable x, the number of siblings. The possible outcomes in S are the five selected students: Mark, John, Rebecca, Sarah, and Mary. The possible values of X are 0, 1, 2, and 3. Mark (in S) is connected to X with an arrow pointing to 0.  John (in S) is connected to X with an arrow pointing to 1. Rebecca and Sarah (in S) are connected to X with an arrow each, both pointing to 2. Mary (in S) is connected to X with an arrow pointing to 3. [Return to Figure 4.2]

Figure 4.3 Image Description: A histogram of number of siblings. The y-axis is probability (i.e., relative frequency) from 0 to 0.4 in increments of 0.1. The x-axis is the number of siblings with bars at 0, 1, 2, and 3. The heights of the bars are as follows: x at 0 has a height of 0.2, x at 1 has a height of 0.2, x at 2 has a height of 0.4, and x at 3 has a height of 0.2. [Return to Figure 4.3]
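The bar heights in Figure 4.3 can be recovered from the mapping described for Figure 4.2. The sketch below assumes each of the five students is equally likely to be selected, which is what “randomly select a student” suggests; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction

# Mapping from each student (an outcome in S) to the value of X, per Figure 4.2.
siblings = {"Mark": 0, "John": 1, "Rebecca": 2, "Sarah": 2, "Mary": 3}

# Assumption: every student has the same chance of being picked.
counts = Counter(siblings.values())
pmf = {x: Fraction(n, len(siblings)) for x, n in sorted(counts.items())}

for x, p in pmf.items():
    print(f"P(X = {x}) = {float(p):.1f}")
```

Two students map to the value 2, which is why that bar is twice as tall as the others.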

Figure 5.1 Image Description: The graph on the left is titled “Probability Histogram of Grade”. The y-axis is Probability from 0 to 0.4 in increments of 0.1. The x-axis is Grades with 7 bars in increments of 10 starting at 30 and going to 100. The heights of the bars are as follows: x between 30 and 40 has a height of 0.003, x between 40 and 50 has a height of 0.019, x between 50 and 60 has a height of 0.15, x between 60 and 70 has a height of 0.33, x between 70 and 80 has a height of 0.332, x between 80 and 90 has a height of 0.144, and x between 90 and 100 has a height of 0.022. These are the same as the values given in Table 5.1. The three left-most bins are filled with black. The Density Curve of Grade on the right panel is almost identical to the Probability Histogram on the left panel, except that the y-axis is labelled “Density” and is in increments of 0.01 from 0 to 0.04. There is a red bell-shaped curve on top of the bars in the density graph. [Return to Figure 5.1]
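The two panels of Figure 5.1 differ only in vertical scale: a density height is a bin probability divided by the bin width. A small Python sketch using the probabilities listed above and the bin width of 10 (the variable names are illustrative):

```python
# Bin probabilities from the Figure 5.1 description, keyed by the left edge of each bin.
bin_width = 10
probabilities = {30: 0.003, 40: 0.019, 50: 0.15, 60: 0.33, 70: 0.332, 80: 0.144, 90: 0.022}

# Density height = probability / bin width, so each bar's AREA equals its probability.
densities = {left: p / bin_width for left, p in probabilities.items()}

# Total area of the three left-most (black) bins, i.e. grades between 30 and 60.
left_three_area = sum(densities[left] * bin_width for left in (30, 40, 50))

print(densities)
print(round(left_three_area, 3))
```

The tallest density bar comes out a little above 0.03, which is consistent with a density axis that runs from 0 to 0.04.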

Figure 5.1.1 Image Description: Three identical density curves are presented in a row. In the left panel, the area under the density curve to the left of a vertical line x equals a is shaded in grey. In the middle panel, the area under the density curve to the right of a vertical line x equals b is shaded in grey. In the right panel, the area under the density curve between vertical lines x equals a and x equals b (a less than b) is shaded in grey. [Return to Figure 5.1.1]

Figure 5.2 Image Description: This graph shows three normal density curves. The x-axis labelled as “z” is in increments of 5 from negative 5 to 5. The y-axis labelled as “f(z)” is in increments of 0.1 from 0 to 5. The black solid bell-shaped curve centred at 0 is the density curve for a normal distribution with mean 0 and standard deviation 2. The red dashed bell-shaped curve centred at 0 and taller than the black solid curve is the density curve for a normal distribution with mean 0 and standard deviation 1. The blue dotted bell-shaped curve centred at 4 and of the same shape as the red dashed curve is the density curve for a normal distribution with mean 4 and standard deviation 1. [Return to Figure 5.2]

Figure 5.3 Image Description: This figure illustrates the empirical rule of a normal distribution. The normal curve is shown over a horizontal axis. The axis is labelled “X” with points µ minus 3 σ, µ minus 2 σ, µ minus σ, µ, µ plus σ, µ plus 2 σ, and µ plus 3 σ. Two red vertical lines connect the axis to the curve at labelled points µ minus 3 σ and µ plus 3 σ. A red horizontal line connects these two points on the curve and reads “99.74%” indicating that the area under the normal curve between these points contains 99.74% of observations. Two blue vertical lines connect the axis to the curve at labelled points µ minus 2 σ and µ plus 2 σ. A blue horizontal line connects these two points on the curve and reads “95.44%” indicating that the area under the normal curve between these points contains 95.44% of observations. Two green vertical lines connect the axis to the curve at labelled points µ minus σ and µ plus σ. A green line connects these two points on the curve and reads “68.26%” indicating that the area under the normal curve contains 68.26% of observations. [Return to Figure 5.3]
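The three percentages can be compared against areas computed directly. The sketch below uses `statistics.NormalDist` from Python's standard library; it is an illustration, not the method used to produce the figure. The computed areas (about 68.27%, 95.45%, and 99.73%) differ from the figure's labels in the last digit, presumably because the labels were obtained from a normal table rounded to four decimals.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, standard deviation 1

# Area within k standard deviations of the mean, for k = 1, 2, 3.
within = {k: z.cdf(k) - z.cdf(-k) for k in (1, 2, 3)}

for k, area in within.items():
    print(f"within {k} standard deviation(s): {100 * area:.2f}%")
```

Because the areas depend only on the number of standard deviations from the mean, the same three values apply to every normal distribution, whatever its µ and σ.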

Figure 5.4 Image Description: There are two normal density curves: the one on the left is the normal density curve for a normal distribution with mean µ and standard deviation σ; the one on the right is the normal density curve for a standard normal distribution with mean 0 and standard deviation 1. The normal curve has labelled points a and b, with the area between a and b under the curve shaded in grey. The standard curve has points labelled (a minus µ) divided by σ and (b minus µ) divided by σ. The area between these points is shaded grey. Arrows point to both grey areas and they are labelled “equal areas.” [Return to Figure 5.4]
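The “equal areas” label can be checked numerically. The values of µ, σ, a, and b below are hypothetical, chosen only for illustration; the figure itself gives no numbers.

```python
import math
from statistics import NormalDist

mu, sigma = 70.0, 10.0   # hypothetical mean and standard deviation
a, b = 60.0, 85.0        # hypothetical endpoints with a < b

x_dist = NormalDist(mu, sigma)
z_dist = NormalDist(0.0, 1.0)

# Area between a and b under the Normal(mu, sigma) curve.
area_x = x_dist.cdf(b) - x_dist.cdf(a)

# Area between the standardised endpoints under the standard normal curve.
z_a, z_b = (a - mu) / sigma, (b - mu) / sigma
area_z = z_dist.cdf(z_b) - z_dist.cdf(z_a)

assert math.isclose(area_x, area_z)
print(z_a, z_b, round(area_x, 4))
```

Any other choice of µ, σ, a, and b (with σ positive) should pass the same check, which is why a single standard normal table is enough for every normal distribution.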

Figure 5.5 Image Description: Three normal density curves are shown on a single horizontal axis. The x-axis is in increments of 1 from negative 6 to 18. The green normal density curve has a mean µ equals negative 2 and a standard deviation σ equals 1. The red normal density curve has a mean µ equals 4 and a standard deviation σ equals 0.5; a smaller standard deviation makes the red density curve taller and slimmer than the green density curve. The blue normal density curve has a mean µ equals 10 and a standard deviation σ equals 2; a larger standard deviation makes the blue density curve flatter than the green density curve. All three normal density curves can be converted to the standard normal density curve through the standardisation method z equals (X minus µ) divided by σ. The standard normal density curve with mean µ equals 0 and standard deviation σ equals 1 is in black and is shown over a horizontal axis that goes from negative 4 to 4 in increments of 1. The black curve has the same shape as the green normal density curve, since they have the same standard deviation. [Return to Figure 5.5]

Figure 5.6 Image Description: This figure shows part of the second page of Table II (area under the standard normal curve for positive z scores). The first column of the table (labelled “z”) gives the first decimal place of the z-score in increments of 0.1 from 0.0 to 1.9. The first row of the table reads “Second decimal place in z” and the second row gives the second decimal place of the z-score in increments of 0.01 from 0.00 to 0.09. The elements of the main body of the table are the area (in four decimal places) to the left of the corresponding z-scores under the standard normal curve. The graph also shows that the area to the left of 1.96 is 0.9750. [Return to Figure 5.6]

Figure 5.7 Image Description: Three standard normal density curves are presented in a row. The left-hand graph has the area to the left of a vertical line z equals 1.96 under the standard normal density curve shaded in grey. This area is labelled as equalling 0.975. The middle graph has the area to the right of a vertical line z equals 1.96 under the standard normal density curve shaded in grey. This area is labelled “area equals 1 minus P(Z less than 1.96) equals 1 minus 0.975 equals 0.025”. The rightmost graph has the area between vertical lines z equals negative 1.96 and z equals 1.96 under the standard normal density curve shaded in grey. This area is labelled “area equals P(negative 1.96 less than Z less than 1.96) equals P(Z less than 1.96) minus P(Z less than negative 1.96) equals 0.975 minus 0.025 equals 0.95”. [Return to Figure 5.7]
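The three areas in Figure 5.7 can be reproduced as follows. This sketch uses `statistics.NormalDist` in place of Table II, so the results agree with the table only after rounding to four decimals.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal distribution

left = z.cdf(1.96)                     # P(Z < 1.96)
right = 1 - z.cdf(1.96)                # P(Z > 1.96), by the complement rule
between = z.cdf(1.96) - z.cdf(-1.96)   # P(-1.96 < Z < 1.96)

print(round(left, 4), round(right, 4), round(between, 4))
```

Only left-tail areas are ever looked up; the right-tail and between-two-values areas are obtained from them by subtraction, exactly as in the figure's labels.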

Figure 5.8 Image Description: Four standard normal density curves are presented in a row. Each curve is shown over a horizontal axis labelled “Z” and has a dashed vertical line at the center z equals 0. The first graph corresponds to question 1 in this example. It is titled “Area to the Left of negative 2: P(Z less than negative 2)”. The area to the left of a vertical line z equals negative 2 under the standard normal density curve is shaded in grey. The second graph corresponds to question 2 in this example. It is titled “Area to the Right of 2: P(Z greater than 2)”. The area to the right of a vertical line z equals 2 under the standard normal density curve is shaded in grey. The third graph corresponds to question 3 in this example. It is titled “Area Beyond negative 2 and 2: P(absolute Z greater than 2)”. The area to the left of a vertical line z equals negative 2 and the area to the right of a vertical line z equals 2 under the standard normal density curve is shaded in grey. The fourth graph corresponds to question 4 in this example. It is titled “Area Between negative 4 and 5: P(negative 4 less than Z less than 5)”. The area between a vertical line z equals negative 4 and a vertical line z equals 5 under the standard normal density curve is shaded in grey. [Return to Figure 5.8]

Example 5.1 Image Description: The leftmost image is a standard normal density curve with the label “z equals ?” under the horizontal axis on the left end. A vertical line at the label “z equals ?” is drawn and the area to its left under the bell-shaped curve is shaded in grey. On the top left corner of the graph, it reads “area equals 0.1”. Pause here to answer. The middle image is a standard normal density curve with the label “z equals ?” under the horizontal axis on the right end. A vertical line at the label “z equals ?” is drawn and the area to its right under the bell-shaped curve is shaded in grey. On the top right corner of the graph, it reads “area equals 0.1”. Pause here to answer. The rightmost image is a standard normal density curve with the label “z equals ?” under the horizontal axis on the right end, a little further to the right than the one in the middle image. A vertical line at the label “z equals ?” is drawn and the area to its right under the bell-shaped curve is shaded in grey. On the top right corner of the graph, it reads “area equals 0.05”. [Return to Example 5.1]
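Questions of this kind run the table lookup in reverse: the area is given and the z-score is sought. A hedged sketch using `NormalDist.inv_cdf` from Python's standard library, as a substitute for reading the standard normal table backwards (the variable names are illustrative):

```python
from statistics import NormalDist

z = NormalDist()  # standard normal distribution

# inv_cdf expects the area to the LEFT of the unknown z-score.
z_left_10 = z.inv_cdf(0.10)        # area 0.1 to the left
z_right_10 = z.inv_cdf(1 - 0.10)   # area 0.1 to the right means area 0.9 to the left
z_right_05 = z.inv_cdf(1 - 0.05)   # area 0.05 to the right means area 0.95 to the left

print(z_left_10, z_right_10, z_right_05)
```

The first two results are mirror images of each other because the standard normal curve is symmetric about 0.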

Figure 5.9 Image Description: From left to right, it reads “X approximates Normal (µ, σ)” on the first line and “x value” on the second line. Then there are two parallel horizontal lines, each with an arrow at one end. The arrow of the top line is at the right end and points to the right, with “Z equals (X minus µ) divided by σ” written above the line. The arrow of the bottom line is at the left end and points to the left, with “x equals µ plus z times σ” written below the line. Then it reads “z equals (x minus µ) divided by σ”. Then there are two horizontal lines with arrows at both ends, with “Table II (standard normal table)” written above the top line. Finally, it reads “area under the curve” on the first line and “probability/percentage” on the second line. [Return to Figure 5.9]

Figure 5.10 Image Description: This graph titled “Normal Probability Plot” is a scatter plot for the data given in Table 5.2. The x-axis labelled “Normal Score” is in increments of 0.5 from negative 1 to 1. The y-axis labelled “Sorted Grade” is in increments of 10 from 40 to 90. There are six points plotted, at (negative 1.28, 40), (negative 0.64, 75), (negative 0.20, 75), (0.20, 80), (0.64, 85), and (1.28, 90). [Return to Figure 5.10]
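The normal scores on the x-axis can be approximated in code. The sketch below assumes Blom's plotting positions, (i minus 0.375) divided by (n plus 0.25), which reproduce the six x-values quoted above after rounding to two decimals; the description does not say which formula or table produced them.

```python
from statistics import NormalDist

grades = [40, 75, 75, 80, 85, 90]   # sorted grades from the description
n = len(grades)

# Assumption: Blom's plotting position for the i-th smallest observation.
positions = [(i - 0.375) / (n + 0.25) for i in range(1, n + 1)]

# The normal score is the z-score with that much area to its left.
normal_scores = [round(NormalDist().inv_cdf(p), 2) for p in positions]

for score, grade in zip(normal_scores, grades):
    print(score, grade)
```

Other plotting-position formulas give slightly different scores, but the visual judgement (whether the points fall roughly on a straight line) is usually unaffected.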

Figure 5.11 Image Description: Three normal probability plots titled “(a)”, “(b)”, and “(c)” respectively are presented in a row. They share the same x-axis and y-axis. The x-axis labelled “theoretical” is in increments of 1 from negative 2 to 2. The y-axis labelled “sample” is in increments of 0.25 from 0 to 1.00. The points in panel (a) are roughly on a straight line. The points in panel (b) show a “J” shape. The points in panel (c) also show a “J” shape. Most of the points in panel (c) are roughly on a straight line, except several extremely small and large observations. [Return to Figure 5.11]

Figure 5.12 Image Description: A total of eighteen graphs are shown here, in two groups of nine. The first nine graphs are presented in a 3 by 3 matrix. Histograms of a left skewed distribution, a normal distribution, and a right skewed distribution are in the first row. Their corresponding normal probability plots and boxplots are given in the second and third rows respectively. For a left skewed distribution, the histogram has a longer tail on the left-hand side; the normal probability plot is concave; the boxplot has a longer lower whisker and a larger distance between Q1 and Q2. For a normal distribution, the histogram is roughly symmetric and bell-shaped; the normal probability plot shows a strong linear pattern; the lower and upper whiskers are roughly of the same length in the boxplot. For a right skewed distribution, the histogram has a longer tail on the right-hand side; the normal probability plot has a “J” shape; the boxplot has a shorter lower whisker and a smaller distance between Q1 and Q2.

The second set of nine graphs is also presented in a 3 by 3 matrix. Histograms of a multimodal distribution, a normal distribution with outliers, and a uniform distribution are in the first row. Their corresponding normal probability plots and boxplots are given in the second and third rows respectively. For a multimodal distribution, the histogram shows a roughly symmetric distribution with three peaks; the normal probability plot shows a flattening “S” shape; the lower and upper whiskers are roughly of the same length in the boxplot. For a normal distribution with outliers, the histogram is roughly symmetric and bell-shaped except for some observations on both ends; except for several observations on both ends, all the remaining points are roughly on a straight line in the normal probability plot; the boxplot is symmetric except for the outliers at both ends. For a uniform distribution, the histogram looks like a rectangle; the normal probability plot has a flattening “S” shape; the lengths of the lower and upper whiskers, the distance between Q1 and Q2, and the distance between Q2 and Q3 are all roughly the same. [Return to Figure 5.12]

Table 6.1 Image Description: There are five students in the population. If we randomly pick two students, there are 10 (5 choose 2) different samples and the sample means are 160, 165, 170, 175, 170, 175, 180, 180, 185, and 190. The mean and standard deviation of these ten sample means are 175 and 8.66 respectively. If we randomly pick three students, there are 10 (5 choose 3) different samples and the sample means are 165, 168.33, 171.67, 171.67, 175, 178.33, 175, 178.33, 181.67, and 185. The mean and standard deviation of these ten sample means are 175 and 5.77 respectively. If we randomly pick four students, there are 5 (5 choose 4) different samples and the sample means are 170, 172.5, 175, 177.5, and 180. The mean and standard deviation of these five sample means are 175 and 3.54 respectively. [Return to Table 6.1]
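
The summary figures in this description can be checked from the listed sample means alone. The Python sketch below is an illustration, not part of the original table. It assumes the reported standard deviations are population standard deviations (dividing by the number of samples rather than by one less), since that convention reproduces 8.66, 5.77, and 3.54.

```python
from statistics import mean, pstdev

# Sample means exactly as listed in the Table 6.1 description,
# keyed by sample size n.
sample_means = {
    2: [160, 165, 170, 175, 170, 175, 180, 180, 185, 190],
    3: [165, 168.33, 171.67, 171.67, 175, 178.33, 175, 178.33, 181.67, 185],
    4: [170, 172.5, 175, 177.5, 180],
}

def summarize(values):
    """Return (mean, population SD), each rounded to two decimals."""
    return round(mean(values), 2), round(pstdev(values), 2)

summary = {n: summarize(values) for n, values in sample_means.items()}

for n, (m, sd) in summary.items():
    print(f"n = {n}: mean of sample means = {m}, SD = {sd}")
```

The mean stays at 175 for every sample size while the spread shrinks as n grows.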

Figure 6.1 Image Description: The probability distribution of the sample mean for n equals 2 is given on the left-hand side in a table with two columns. The first column labelled as “x-bar” lists all possible values of the sample mean as values of x-bar. Their associated probabilities (relative frequency) are given in the second column labelled as “P(upper case X bar equals lower case x bar)”. The values are as follows: for x-bar at 160, one-tenth equals 0.1; for x-bar at 165, one-tenth equals 0.1; for x-bar at 170, two-tenths equals 0.2; for x-bar at 175, two-tenths equals 0.2; for x-bar at 180, two-tenths equals 0.2; for x-bar at 185, one-tenth equals 0.1; and for x-bar at 190, one-tenth equals 0.1. The probability histogram is shown on the right-hand side. The y-axis is probability (i.e., relative frequency) from 0 to 0.2 in increments of 0.1 and the x-axis is “Average Height of Two” with bars for each x-bar. The heights of the bars are the same as their second column values. [Return to Figure 6.1]
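
Each probability in this table is the number of samples giving that x-bar divided by the total number of samples. As a minimal sketch, assuming the ten sample means listed for n equals 2 in the Table 6.1 description, the distribution can be tallied in Python as follows (the variable names are illustrative only):

```python
from collections import Counter
from fractions import Fraction

# The ten sample means for n = 2, as listed in the Table 6.1 description.
sample_means_n2 = [160, 165, 170, 175, 170, 175, 180, 180, 185, 190]

counts = Counter(sample_means_n2)
total = len(sample_means_n2)

# Probability of each x-bar = (number of samples with that mean) / (number of samples).
distribution = {x: Fraction(count, total) for x, count in sorted(counts.items())}

for x, p in distribution.items():
    print(f"x-bar = {x}: P = {counts[x]}/{total} = {float(p):.1f}")

# A valid probability distribution must sum to 1.
total_probability = sum(distribution.values())
print("sum of probabilities =", total_probability)
```

Checking that the probabilities sum to 1 is a quick way to catch a miscounted row.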

Figure 6.2 Image Description: Two graphs are presented in a row. The one on the left is titled “Population Distribution of Grade”. The y-axis labelled “Density” is in increments of 0.05 from 0 to 0.20. The x-axis labelled “Grade” is in increments of 20 from 20 to 100. The graph consists of 4 bars symmetric at grade equals 70, and a black bell-shaped curve ranging from 20 to 120 is drawn on the top of the bars. A red vertical line is drawn at grade equals 70. In the middle of the left-hand side of the graph, it reads “population mean equals 70” in the first line and “population SD equals 10” in the second line. The one on the right is titled “Q-Q Plot of Grade”. The y-axis labelled “Observed Quantiles: Grade” is in increments of 20 from 40 to 100. The x-axis labelled “Theoretical Quantiles: Normal Score” is in increments of 2 from negative 4 to 4. The points show an almost perfect straight line. [Return to Figure 6.2]

Figure 6.3 Image Description: Three density curves of the sample mean for sample size n equals 2, 5 and 30 are presented in a row. These three graphs have identical x- and y-axes. Their corresponding normal probability plots are shown below. The first graph on the first row has the title “Distribution of Sample Mean With n equals 2”. The y-axis labelled “Density” is in increments of 0.05 from 0 to 0.20. The x-axis labelled “Average Grade” is in increments of 20 from 20 to 100. The graph consists of bars, and a black bell-shaped curve ranging from 35 to 105 is drawn on the top of the bars. A red solid vertical line representing the population mean and a blue dashed vertical line representing the mean of the sample means are drawn. The red and blue lines are almost identical. In the middle of the left-hand side of the graph, it reads “mean of sample mean equals 70” in the first line and “SD of sample mean equals 7.2” in the second line.

The second graph on the first row has the title “Distribution of Sample Mean With n equals 5”. The y-axis labelled “Density” is in increments of 0.05 from 0 to 0.20. The x-axis labelled “Average Grade” is in increments of 20 from 20 to 100. The graph consists of bars, and a black bell-shaped curve ranging from 50 to 90 is drawn on the top of the bars. A red solid vertical line representing the population mean and a blue dashed vertical line representing the mean of the sample means are drawn. The red and blue lines are almost identical. In the middle of the left-hand side of the graph, it reads “mean of sample mean equals 70” in the first line and “SD of sample mean equals 4.5” in the second line.

The third graph on the first row has the title “Distribution of Sample Mean With n equals 30”. The y-axis labelled “Density” is in increments of 0.05 from 0 to 0.20. The x-axis labelled “Average Grade” is in increments of 20 from 20 to 100. The graph consists of bars, and a black bell-shaped curve ranging from 60 to 80 is drawn on the top of the bars. A red solid vertical line representing the population mean and a blue dashed vertical line representing the mean of the sample means are drawn. The red and blue lines are almost identical. In the middle of the left-hand side of the graph, it reads “mean of sample mean equals 70” in the first line and “SD of sample mean equals 1.8” in the second line.

Three normal probability plots corresponding to the density curves above are presented in the second row. The first probability plot has the title “Q-Q Plot of Sample Mean With n equals 2”. The y-axis labelled “Observed Quantiles: Average Grade” is in increments of 10 from 40 to 100. The x-axis labeled “Theoretical Quantiles: Normal Score” is in increments of 2 from negative 4 to 4. The points show an almost perfect straight line. The second probability plot has the title “Q-Q Plot of Sample Mean With n equals 5”. The y-axis labelled “Observed Quantiles: Average Grade” is in increments of 5 from 55 to 85. The x-axis labelled “Theoretical Quantiles: Normal Score” is in increments of 2 from negative 4 to 4. The points show an almost perfect straight line. The third probability plot has the title “Q-Q Plot of Sample Mean With n equals 30”. The y-axis labelled “Observed Quantiles: Average Grade” is in increments of 2 from 62 to 76. The x-axis labelled “Theoretical Quantiles: Normal Score” is in increments of 2 from negative 4 to 4. The points show an almost perfect straight line. [Return to Figure 6.3]

Figure 6.4 Image Description: Density curve of the outcome of rolling a die, titled “Population Distribution”. The y-axis labelled “Density” is in increments of 0.5 from 0 to 1.5. The x-axis labelled “Outcome of one die” is in increments of 1 from 1 to 6. The graph consists of 6 bars, and the height of each bar is one-sixth. A red vertical line is drawn at outcome equals 3.5. In the middle of the left-hand side of the graph, it reads “population mean equals 3.5” in the first line and “population SD equals 1.71” in the second line. [Return to Figure 6.4]
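
The population mean and SD annotated in this graph follow directly from the six equally likely outcomes. The short Python sketch below is an added illustration that assumes a fair six-sided die; it uses exact fractions so that only the final square root is approximate.

```python
from fractions import Fraction
from math import sqrt

outcomes = range(1, 7)          # faces 1 to 6
p = Fraction(1, 6)              # each face is equally likely (fair die assumed)

# Population mean: sum of outcome times probability.
mu = sum(x * p for x in outcomes)

# Population variance: probability-weighted squared deviation from the mean.
variance = sum((x - mu) ** 2 * p for x in outcomes)
sigma = sqrt(variance)

print(f"population mean = {float(mu)}")      # 3.5
print(f"population variance = {variance}")   # 35/12
print(f"population SD = {sigma:.2f}")        # 1.71
```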

Figure 6.5 Image Description: Three density curves of the sample mean for sample size n equals 2, 5 and 30 are presented in a row. These three graphs have identical x- and y-axes. Their corresponding normal probability plots are shown below. The first graph in the first row is titled “Distribution of Sample Mean With n equals 2”. The y-axis labelled “Density” is in increments of 0.5 from 0 to 1.5. The x-axis labelled “Average of n equals 2 Dice” is in increments of 1 from 1 to 6. The graph consists of bars, and a black symmetric triangular curve ranging from negative 0.5 to 6.5 is drawn on the top of the bars. A red solid vertical line representing the population mean and a blue dashed vertical line representing the mean of the sample means are drawn. The red and blue lines are almost identical. In the middle of the left-hand side of the graph, it reads “mean of sample mean equals 3.5” in the first line and “SD of sample mean equals 1.2” in the second line.

The second graph in the first row is titled “Distribution of Sample Mean With n equals 5”. The y-axis labelled “Density” is in increments of 0.5 from 0 to 1.5. The x-axis labelled “Average of n equals 5 Dice” is in increments of 1 from 1 to 6. The graph consists of bars, and a black bell-shaped curve ranging from negative 0.5 to 6.5 is drawn on the top of the bars. A red solid vertical line representing the population mean and a blue dashed vertical line representing the mean of the sample means are drawn. The red and blue lines are almost identical. In the middle of the left-hand side of the graph, it reads “mean of sample mean equals 3.5” in the first line and “SD of sample mean equals 0.8” in the second line.

The third graph in the first row is titled “Distribution of Sample Mean With n equals 30”. The y-axis labelled “Density” is in increments of 0.5 from 0 to 1.5. The x-axis labelled “Average of n equals 30 Dice” is in increments of 1 from 1 to 6. The graph consists of bars, and a black bell-shaped curve ranging from 2 to 5 is drawn on the top of the bars. A red solid vertical line representing the population mean and a blue dashed vertical line representing the mean of the sample means are drawn. The red and blue lines are almost identical. In the middle of the left-hand side of the graph, it reads “mean of sample mean equals 3.5” in the first line and “SD of sample mean equals 0.31” in the second line.

Three normal probability plots corresponding to the density curves above are presented in the second row. The first probability plot is titled “Q-Q Plot of Sample Mean With n equals 2”. The y-axis labelled “Observed Quantiles: Average of n equals 2 Dice” is in increments of 1 from 1 to 6. The x-axis labelled “Theoretical Quantiles: Normal Score” is in increments of 2 from negative 4 to 4. The points show a stair with 11 steps going upward from the left to the right. The second probability plot is titled “Q-Q Plot of Sample Mean With n equals 5”. The y-axis labelled “Observed Quantiles: Average of n equals 5 Dice” is in increments of 1 from 1 to 6. The x-axis labelled “Theoretical Quantiles: Normal Score” is in increments of 2 from negative 4 to 4. The points show a stair with 26 steps going upward from the left to the right. The points are roughly on a straight line. The third probability plot is titled “Q-Q Plot of Sample Mean With n equals 30”. The y-axis labelled “Observed Quantiles: Average of n equals 30 Dice” is in increments of 1 from 1 to 6. The x-axis labelled “Theoretical Quantiles: Normal Score” is in increments of 2 from negative 4 to 4. The points show a stair with many steps going upward from the left to the right. Therefore, the points appear to form a straight line. [Return to Figure 6.5]
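
The SD values annotated in the three density graphs of Figure 6.5 (1.2, 0.8 and 0.31) agree with the standard result that the SD of the sample mean equals the population SD divided by the square root of n. The Python sketch below is an added check, assuming independent rolls of a fair die whose population SD is the square root of 35 over 12 (about 1.71).

```python
from math import sqrt

# Population SD of one fair die roll: sqrt(35/12), about 1.71.
sigma = sqrt(35 / 12)

def sd_of_sample_mean(sigma, n):
    """SD of the mean of n independent observations with population SD sigma."""
    return sigma / sqrt(n)

sd_by_n = {n: sd_of_sample_mean(sigma, n) for n in (2, 5, 30)}

for n, sd in sd_by_n.items():
    print(f"n = {n}: SD of sample mean = {sd:.2f}")
```

The shrinking SD is why the bell-shaped curve for n equals 30 spans a much narrower range than the curve for n equals 2.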

Figure 6.6 Image Description: Two graphs are presented in a row. The one on the left is titled “Population Distribution of Survival Time”. The y-axis labelled “Density” is in increments of 0.1 from 0.0 to 0.5. The x-axis labelled “Survival Time (year)” is in increments of 5 from 0 to 20. The graph consists of 4 bars whose heights decrease from left to right. The heights are as follows: x between 0 and 5 has a height of roughly 0.13, x between 5 and 10 has a height of roughly 0.05, x between 10 and 15 has a height of roughly 0.02, and x between 15 and 20 has a height of roughly 0.01. A black reversed “J” shaped curve ranging from negative 1 to 21 is drawn on the top of the bars. A red vertical line is drawn at survival time equals 5. In the middle of the right-hand side of the graph, it reads “population mean equals 5” in the first line and “population SD equals 5” in the second line.

The graph on the right panel is titled “Q-Q Plot of Survival Time”. The y-axis labelled “Observed Quantiles: Survival Time” is in increments of 10 from 0 to 40. The x-axis labelled “Theoretical Quantiles: Normal Score” is in increments of 2 from negative 4 to 4. The points show a “J” shaped curve. [Return to Figure 6.6]

Figure 6.7 Image Description: Three density curves of the sample mean for sample size n equals 2, 5 and 30 are presented in a row. These three graphs have identical x- and y-axis. Their corresponding normal probability plots are shown below. The first graph is titled “Distribution of Sample Mean With n equals 2”. The y-axis labelled “Density” is in increments of 0.1 from 0.0 to 0.5. The x-axis labelled “Survival Time (year)” is incremented in intervals of 5 from 0 to 20. The graph consists of bars showing a right-skewed distribution, and a black curve ranging from negative 1 to 21 is drawn on the top of the bars. The curve goes upward from survival time equals negative 1 to the peak at survival time equals 3 and goes downward until survival equals 21. A red solid vertical line representing the population mean and a blue dashed vertical line representing the mean of the sample means are drawn. The red and blue lines are almost identical. In the middle of the right-hand side of the graph, it reads “mean of sample mean equals 5” in the first line and “SD of sample mean equals 3.6” in the second line.

The second graph is titled “Distribution of Sample Mean With n equals 5”. The y-axis labelled “Density” is in increments of 0.1 from 0.0 to 0.5. The x-axis labelled “Survival Time (year)” is incremented in intervals of 5 from 0 to 20. The graph consists of bars showing a right-skewed distribution, and a black curve ranging from negative 1 to 19 is drawn on the top of the bars. The curve goes upward from survival time equals negative 1 to the peak at survival time equals 4 and goes downward till survival equals 19. A red solid vertical line representing the population mean and a blue dashed vertical line representing the mean of the sample means are drawn. The red and blue lines are almost identical. In the middle of the right-hand side of the graph, it reads “mean of sample mean equals 5” in the first line and “SD of sample mean equals 2.2” in the second line.

The third graph is titled “Distribution of Sample Mean With n equals 30”. The y-axis labelled “Density” is in increments of 0.1 from 0.0 to 0.5. The x-axis labelled “Survival Time (year)” is incremented in intervals of 5 from 0 to 20. The graph consists of bars showing a symmetric distribution, and a black bell-shaped (roughly) curve ranging from 1.5 to 9.5 is drawn on the top of the bars. A red solid vertical line representing the population mean and a blue dashed vertical line representing the mean of the sample means are drawn. The red and blue lines are almost identical. In the middle of the right-hand side of the graph, it reads “mean of sample mean equals 5” in the first line and “SD of sample mean equals 0.9” in the second line.

Three normal probability plots corresponding to the density curves above are presented in the second row. The first probability plot is titled “Q-Q Plot of Sample Mean With n equals 2”. The y-axis labelled “Observed Quantiles: Average Survival Time” is in increments of 5 from 0 to 25. The x-axis labelled “Theoretical Quantiles: Normal Score” is in increments of 2 from negative 4 to 4. The points show a “J” shaped curve. The second probability plot is titled “Q-Q Plot of Sample Mean With n equals 5”. The y-axis labelled “Observed Quantiles: Average Survival Time” is in increments of 5 from 0 to 15. The x-axis labelled “Theoretical Quantiles: Normal Score” is in increments of 2 from negative 4 to 4. The points show a flattened “J” shaped curve. The third probability plot is titled “Q-Q Plot of Sample Mean With n equals 30”. The y-axis labelled “Observed Quantiles: Average Survival Time” is in increments of 1 from 3 to 9. The x-axis labelled “Theoretical Quantiles: Normal Score” is in increments of 2 from -4 to 4. The points are roughly on a straight line. [Return to Figure 6.7]

Exercise 6.1 Image Description: The graph is titled “Density Curve of Rent”. The y-axis labelled “Density” is in increments of 0.01 from 0.00 to 0.12. The x-axis labelled “Rent of One-Bedroom Apartment ($100)” is incremented in intervals of 5 from 0 to 30. The curve goes upward from rent equals 0 to the peak at rent equals 5 and then goes downward until rent equals 30. [Return to Exercise 6.1]

Chapter 6 Review Question 7 Image Description: The graph is titled “Density Curve of Rent”. The y-axis labelled “Density” is in increments of 0.01 from 0.00 to 0.12. The x-axis labelled “Rent of One-Bedroom Apartment ($100)” is incremented in intervals of 5 from 0 to 30. The curve goes upward from rent equals 0 to the peak at rent equals 5 and then goes downward until rent equals 30. [Return to Question 7]

Assignment 6 Question 7 Image Description: The graph is titled “Density Curve of Rent”. The y-axis labelled “Density” is in increments of 0.01 from 0.00 to 0.12. The x-axis labelled “Rent of One-Bedroom Apartment ($100)” is incremented in intervals of 5 from 0 to 30. The curve goes upward from rent equals 0 to the peak at rent equals 5 and then goes downward until rent equals 30. [Return to Question 7]

Figure 7.1 Image Description: The graph is titled “Sample Mean”. A horizontal line is drawn with labels at 339, 340, 341, 342 and 343. A green vertical line is drawn at 341, and two blue vertical lines are drawn at 340.02 and 341.98. Twenty horizontal lines with very short vertical bars at each end represent twenty 95% confidence intervals. The center of each interval (the sample mean) is shown as a red diamond. All intervals are of the same length which is 1.96. There is one interval that does not intersect the green vertical line. [Return to Figure 7.1]

Figure 7.2 Image Description: A horizontal line representing a confidence interval centred at 339 (indicated as x-bar below the number) is drawn. The left end point of the interval is calculated as 339 minus 0.98 equals 338.02. The formula to calculate the value is given below the equation: x-bar minus z sub alpha over 2 times sigma over root of n. The right end point of the interval is calculated as 339 plus 0.98 equals 339.98. The formula to calculate the value is given below the equation: x-bar plus z sub alpha over 2 times sigma over root of n. The interval is divided into two halves indicated by two horizontal lines with arrows at both ends. An equation “capital E equals z sub alpha over 2 times sigma over root of n equals 0.98” is written above the horizontal line of each half. A red vertical line indicating the population mean mu is drawn. A green vertical line indicating value 341 is shown to be outside of the interval. [Return to Figure 7.2]
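
The endpoint arithmetic in this figure can be reproduced in a few lines. The Python sketch below is an added illustration: the sample mean 339 and margin of error 0.98 come from the description, while the 95% confidence level (and hence the implied standard error) is an assumption, since the description does not state the level for this figure.

```python
from statistics import NormalDist

x_bar = 339.0   # centre of the interval, from the description
margin = 0.98   # E = z_{alpha/2} * sigma / sqrt(n), from the description

lower = x_bar - margin
upper = x_bar + margin
print(f"interval: ({lower:.2f}, {upper:.2f})")   # (338.02, 339.98)

# Assumption: a 95% confidence level, so z_{alpha/2} is about 1.96.
z = NormalDist().inv_cdf(0.975)
standard_error = margin / z   # implied sigma / sqrt(n)
print(f"implied standard error = {standard_error:.2f}")

# The green line at 341 lies outside the interval.
contains_341 = lower <= 341 <= upper
print("interval contains 341:", contains_341)
```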

Figure 7.3 Image Description: Several t-curves at varying degrees of freedom are compared to the standard normal curve. The y-axis of the graph is “Density” in increments of 0.1 from 0 to 0.4 and the x-axis is “X” in increments of 1 from negative 4 to 4. Four bell-shaped curves are shown in this figure. The blue dashed curve representing a t distribution with 1 degree of freedom is the flattest curve. The purple dashed curve representing a t distribution with 3 degrees of freedom is the second flattest curve. The red dashed-dotted curve representing a t distribution with 15 degrees of freedom is slightly flatter than, but looks almost the same as, the black density curve representing the standard normal distribution. [Return to Figure 7.3]

Figure 7.4 Image Description: This figure shows part of the first page of Table IV (Values of t sub alpha of t distribution). The first column of the table (labelled “df”) gives the degrees of freedom of a t distribution. The first row of the table reads “alpha: Area to the Right of t sub alpha”. The second row gives the values of alpha: 0.40, 0.30, 0.20, 0.15, 0.10, 0.05, 0.025, 0.010, 0.0075, 0.005, 0.0025 and 0.0005. The entries in the main body of the table are t-scores (to three decimal places), each having an area of alpha (given as the column name) to its right for the given degrees of freedom (given as the row name). The graph shows that the t-score with an area of 0.025 to its right under the t distribution with df equals 9 is 2.262. The graph also shows that the area to the right of a t-score of 1.5 under the t distribution with df equals 9 is between 0.10 and 0.05. [Return to Figure 7.4]

Figure 7.5 Image Description: The bell-shaped curve shows the density of a t distribution with 14 degrees of freedom (df equals 14). Several coloured lines show important critical values according to Table IV. The largest area equalling 0.1 is shown to the right of a purple line at 1.345 under the density curve. The second largest area equalling 0.05 is shown to the right of a light blue line at 1.761. The middling area equalling 0.025 is shown to the right of a light green line at 2.145. The second smallest area equalling 0.01 is shown to the right of a red line at 2.624. The smallest area equalling 0.005 is shown to the right of a brown line at 2.977. [Return to Figure 7.5]

Example 7.1 Image Description: This figure shows part of Table IV (Values of t sub alpha of t distribution) for df equals 1 to 5 and df equals 31 to 50. For the complete table, see Appendix B, Table IV. [Return to Example 7.1]

Figure 8.1 Image Description: Two overlapping density curves are shown over a horizontal axis labelled “Sugar level in blood”. The red curve on the left indicates the blood sugar level for patients without diabetes and the green curve on the right indicates the blood sugar level for patients with diabetes. There is a black vertical line labelled “cut-off C”. The area to the left of the cut-off C under the density curve for patients with diabetes is shaded in blue and is labelled “False negative, Type II error”. The area to the right of the cut-off C under the density curve for diabetes-free patients is shaded in gold and is labelled “False positive, Type I error”. [Return to Figure 8.1]

Figure 8.2 Image Description: The figure summarises a table illustrating the main idea of hypothesis testing. We should reject the null hypothesis H sub 0: µ equals µ sub 0 and claim the alternative H sub a: µ not equals µ sub 0 if the sample mean x-bar is either too large or too small. We should reject the null hypothesis H sub 0: µ less than or equal to µ sub 0 and claim the alternative H sub a: µ greater than µ sub 0 if the sample mean x-bar is too large. We should reject the null hypothesis H sub 0: µ greater than or equal to µ sub 0 and claim the alternative H sub a: µ less than µ sub 0 if the sample mean x-bar is too small. [Return to Figure 8.2]

Figure 8.3 Image Description: Two identical unimodal and right-skewed density curves are shown over a horizontal axis. The one on the left is labelled at critical value C and the area to its right under the density curve is shaded in grey. This shaded area is called the rejection region. The density curve on the right is labelled at x sub 0 and the area to its right under the density curve is shaded in grey. This shaded area is called the p-value. [Return to Figure 8.3]

Figure 8.4 Image Description: The figure illustrates the rejection regions of z tests. The rejection region of a two-tailed z test is the z-score either greater than z sub alpha over 2 or less than negative z sub alpha over 2; the rejection region of a right-tailed z test is the z-score greater than z sub alpha; the rejection region of a left-tailed z test is the z-score less than negative z sub alpha. [Return to Figure 8.4]

Figure 8.5 Image Description: The figure summarises a table illustrating the calculation of a p-value. For a two-tailed z test with the null hypothesis H sub 0: µ equals µ sub 0 and the alternative hypothesis H sub a: µ not equals µ sub 0, the p-value equals twice the area under a standard normal curve to the right of the absolute value of the observed z-score z sub o. For a right-tailed z test with the null hypothesis H sub 0: µ less than or equal to µ sub 0 and the alternative hypothesis H sub a: µ greater than µ sub 0, the p-value equals the area under a standard normal curve to the right of the observed z-score z sub o. For a left-tailed z test with the null hypothesis H sub 0: µ greater than or equal to µ sub 0 and the alternative hypothesis H sub a: µ less than µ sub 0, the p-value equals the area under a standard normal curve to the left of the observed z-score z sub o. [Return to Figure 8.5]
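
The three p-value rules described for Figure 8.5 translate directly into code. The following Python sketch is an added illustration; the function name `z_test_p_value` and the `tail` labels are hypothetical choices, and the standard normal areas come from `statistics.NormalDist` in the standard library.

```python
from statistics import NormalDist

def z_test_p_value(z_obs, tail):
    """p-value of a z test given the observed z-score.

    tail: "two" for H_a: mu != mu_0, "right" for H_a: mu > mu_0,
          "left" for H_a: mu < mu_0.
    """
    std_normal = NormalDist()  # mean 0, SD 1
    if tail == "two":
        # Twice the area to the right of |z_obs|.
        return 2 * (1 - std_normal.cdf(abs(z_obs)))
    if tail == "right":
        # Area to the right of z_obs.
        return 1 - std_normal.cdf(z_obs)
    if tail == "left":
        # Area to the left of z_obs.
        return std_normal.cdf(z_obs)
    raise ValueError("tail must be 'two', 'right' or 'left'")

# Illustrative values only: an observed z-score of 1.96 (or -1.96).
print(round(z_test_p_value(1.96, "two"), 3))     # about 0.05
print(round(z_test_p_value(1.96, "right"), 3))   # about 0.025
print(round(z_test_p_value(-1.96, "left"), 3))   # about 0.025
```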

Example 8.1 Image Description: A standard normal density curve. Both the area under the curve to the right of 4 and to the left of negative 4 are shaded in grey. Each area has a label saying “area equals p-value divided by 2”. Part of Table II: area under the standard normal curve for negative z is also shown. It indicates that the area to the left of negative 3.99 under the standard normal curve is 0.0000. [Return to Example 8.1]

Example 8.2 Image Description: A standard normal density curve is shown. Both the area under the curve to the right of 1.96 (which is z sub alpha over 2) and to the left of negative 1.96 (which is negative z sub alpha over 2) are shaded in grey. Each area has a label saying “area equals alpha divided by 2”. [Return to Example 8.2]

Example 8.3 Image Description: Two t-density curves with 35 degrees of freedom are shown. In the graph on the left panel, the area to the left of negative 0.714 under the t-curve is shaded in grey. In the graph on the right panel, the area to the right of 0.714 under the t-curve is shaded in grey. The two areas are the same. Part of Table IV: Values of t sub alpha of t distribution is also shown. From the table, we know that for a t distribution with degrees of freedom df equals 35, the area to the right of 0.714 is between 0.20 and 0.30. [Return to Example 8.3]

Example 8.4 Image Description: A t-density curve with degrees of freedom df equals 35 is drawn. The area under the curve to the left of negative 2.438 is shaded in grey with a label saying “area equals alpha equals 0.01”. A purple vertical line at negative 0.714 is drawn. We know that negative 0.714 is outside the shaded area, the rejection region of the test. [Return to Example 8.4]

Figure 8.6 Image Description: The bell-shaped curve shows the density of a t-distribution with 35 degrees of freedom (df equals 35). Several critical values according to Table IV are shown. The largest area equalling 0.1 is to the right of a pink line at 1.306 under the density curve. The second largest area equalling 0.05 is to the right of a dark blue line at 1.690. The middling area equalling 0.025 is to the right of a bright green line at 2.030. The second smallest area equalling 0.01 is to the right of a red line at 2.438. The smallest area equalling 0.005 is shown to the right of a black line at 2.724. [Return to Figure 8.6]

Figure 9.1 Image Description: Two big identical ovals are presented side by side. The oval on the left-hand side is labelled as “Population 1” and the one on the right-hand side is labelled as “Population 2”. Below the two ovals it reads “Greek letter mu sub 1” and “Greek letter mu sub 2” respectively. It reads “Two independent samples” between the two bigger ovals. There is a smaller oval inside each of the bigger ovals representing a simple random sample from each population. Inside the smaller oval of population 1, it reads vertically “n sub 1, x-bar sub 1 and s sub 1”. Inside the smaller oval of population 2, it reads vertically “n sub 2, x-bar sub 2 and s sub 2”. [Return to Figure 9.1]

Figure 9.2 Image Description: A t-curve with degrees of freedom df equals 35 is drawn. The area under the curve to the right of 2.403 (t-score with area 0.01 to its right) is shaded in grey with a label saying “rejection region” in the first line and “area equals alpha equals 0.01” in the second line. A purple vertical line at 5.332 is drawn. Because 5.332 is greater than 2.403, the graph indicates that the observed t-score t sub o falls inside the shaded area, the rejection region of the test. [Return to Figure 9.2]

Figure 9.3 Image Description: A one-sided confidence interval starting from 9.887 pointing towards positive infinity is drawn. A short green vertical line is shown at 0 and a short blue vertical line is shown at 5. A short red vertical line is drawn somewhere within the one-sided interval with a label saying “Greek letter mu sub 1 minus Greek letter mu sub 2 greater than 9.887”. [Return to Figure 9.3]

Figure 9.4 Image Description: This graph titled “Normal Probability Plot on Differences” is a normal Q-Q plot on the paired differences given in the third column of Table 9.3. The x-axis labelled “norm quantiles” is in increments of 0.5 from negative 1.5 to 1.5. The y-axis labelled “Difference” is in increments of 10 from negative 10 to 50. There are 11 points plotted; all points are roughly on a red straight line and within the 95% simultaneous confidence band. The smallest and the largest differences have the numbers “8” and “10” respectively next to their points. [Return to Figure 9.4]

Figure 10.1 Image Description: Four histograms of the sample proportion for sample size n equals 50, 100, 200 and 1000 are presented in a row. The first graph is titled “n equals 50”. The y-axis labelled “Frequency” is in increments of 200 from 0 to 1400. The x-axis labelled “Sample proportion” is incremented in intervals of 0.05 from 0 to 0.2. The first graph consists of eight bars with a width of 0.02 each from 0 to 0.16. The heights of the bars decrease from above 1400 at the first bar to close to 0 at the last bar, showing an extremely right-skewed distribution. A red solid vertical line representing the population proportion and a blue dashed vertical line representing the mean of the sample proportions are drawn. The red and blue lines coincide at sample proportion equals 0.05.

The second graph is titled “n equals 100”. The y-axis labelled “Frequency” is in increments of 200 from 0 to 1400. The x-axis labelled “Sample proportion” is incremented in intervals of 0.05 from 0 to 0.2. The graph consists of 13 bars of width 0.01 from 0 to 0.13. The heights of the bars increase from around 200 at the first bar to the peak around 900 at sample proportion equals 0.05, then decrease to around 0 at the last bar. The distribution is still right-skewed, but less so. A red solid vertical line representing the population proportion and a blue dashed vertical line representing the mean of the sample proportions are drawn. The red and blue lines coincide at sample proportion equals 0.05.

The third graph is titled “n equals 200”. The y-axis labelled “Frequency” is in increments of 200 from 0 to 1400. The x-axis labelled “Sample proportion” is incremented in intervals of 0.05 from 0 to 0.2. The graph consists of 11 bars of width 0.01 from 0 to 0.11. The heights of the bars increase from around 20 at the first bar to the peak around 1200 at sample proportion equals 0.05, then decrease to around 0 at the last bar. The distribution is only slightly right-skewed. A red solid vertical line representing the population proportion and a blue dashed vertical line representing the mean of the sample proportions are drawn. The red and blue lines coincide at sample proportion equals 0.05.

The fourth graph is titled “n equals 1000”. The y-axis labelled “Frequency” is in increments of 200 from 0 to 1400. The x-axis labelled “Sample proportion” is incremented in intervals of 0.05 from 0 to 0.2. The graph consists of 12 bars of width 0.005 from 0.025 to 0.085. The heights of the bars increase from 0 at the first bar to the peak around 1400 at sample proportion equals 0.05, then decrease to around 0 at the last bar. The distribution is now roughly symmetric. A red solid vertical line representing the population proportion and a blue dashed vertical line representing the mean of the sample proportions are drawn. The red and blue lines coincide at sample proportion equals 0.05. [Return to Figure 10.1]

Figure 10.2 Image Description: A curve from 0 to 1 is shown with a red dot at the peak. The y-axis of the graph labelled as “p-hat times (1-p-hat)” is in increments of 0.05 from 0 to 0.25. The x-axis labelled as “p-hat” is in increments of 0.2 from 0 to 1. The curve is smooth and symmetric; it keeps going upward from the coordinates (0, 0) to peak with coordinates (0.5, 0.25) and then starts going downward to the coordinates (1, 0). [Return to Figure 10.2]
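
The shape of this curve can be confirmed numerically. The Python sketch below is an added illustration that evaluates p-hat times (1 minus p-hat) on a grid of values from 0 to 1 (the grid spacing of 0.01 is an arbitrary choice) and locates the peak.

```python
# Evaluate p_hat * (1 - p_hat) on a grid from 0 to 1 in steps of 0.01.
grid = [i / 100 for i in range(101)]
values = [p_hat * (1 - p_hat) for p_hat in grid]

# Locate the grid point where the product is largest.
best_value, best_p = max(zip(values, grid))

print(f"value at p-hat = 0: {values[0]}")        # 0.0
print(f"value at p-hat = 1: {values[-1]}")       # 0.0
print(f"peak: p-hat = {best_p}, value = {best_value}")   # 0.5 and 0.25
```

Because the product never exceeds 0.25, using p-hat equals 0.5 gives the most conservative (largest) value of p-hat times (1 minus p-hat).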

Figure 11.1 Image Description: The y-axis of the graph is “Density” in increments of 0.1 from 0 to 0.4 and the x-axis is “X” in increments of 5 from 0 to 30. Five chi-square curves are shown in this figure. The black curve goes to infinity as x approaches 0 and approaches 0 as x approaches 30. This is the chi-square curve with 1 degree of freedom. The red curve increases from y approximating 0.2 at x equals 0 to y approximating 0.24 at x equals 3, then approaches 0 as x approaches 30. This is the chi-square curve with 3 degrees of freedom. The blue curve is right-skewed with a peak at coordinates (4, 0.15). This is the chi-square curve with 5 degrees of freedom. The purple curve is right-skewed with a peak at coordinates (7, 0.1). This is the chi-square curve with 9 degrees of freedom. The green curve is slightly right-skewed with a peak at coordinates (14, 0.08). This is the chi-square curve with 15 degrees of freedom. [Return to Figure 11.1]

Table 11.1 Image Description: This figure shows part of Table V (Values of Greek letter chi-square sub Greek letter alpha of chi-square distribution). The first column of the table (labelled “df”) gives the degrees of freedom for chi-square distributions. The first row of the table reads “Greek letter alpha: Area to the Right of Greek letter chi-square sub alpha”. The second row gives the values of alpha: 0.995, 0.990, 0.975, 0.950, 0.9, 0.1, 0.05, 0.025, 0.01 and 0.005. The entries in the main body of the table are chi-square scores (to three decimal places), each having an area of alpha (given as the column name) to its right for the given degrees of freedom (given as the row name). [Return to Table 11.1]

Figure 11.2 Image Description: Three bars are presented for “Smoker”, “Non-Smoker” and “Total” from left to right. Each bar has two segments: the green segment at the bottom represents individuals with cancer and the red segment on the top represents individuals without cancer. The y-axis labeled “Relative Frequency” is in increments of 0.2 from 0 to 1. The heights of the green segments are 0.333 for Smoker, 0.176 for Non-Smoker, and 0.2 for Total. [Return to Figure 11.2]

Figure 12.1 Image Description: Three big identical ovals are presented in a row. The first oval is labeled “Population 1”, the second “Population 2”, and the third “Population k”. Below the ovals it reads “Greek letter mu sub 1”, “Greek letter mu sub 2” and “Greek letter mu sub k” respectively. It reads “Two independent samples” between the first two big ovals and between the last two big ovals. There is a smaller oval inside each of the bigger ovals, representing a simple random sample from each population. Inside the smaller oval of population 1, it reads vertically “n sub 1, x-bar sub 1 and s sub 1”. Inside the smaller oval of population 2, it reads vertically “n sub 2, x-bar sub 2 and s sub 2”. Inside the smaller oval of population k, it reads vertically “n sub k, x-bar sub k and s sub k”. [Return to Figure 12.1]

Figure 12.2 Image Description: The x-axis is in increments of 2 from 2 to 10. Data set 1 has red circles at 1, 2, 3, 3, 4, and 5 and blue crosses at 6, 7, 8, 8, 9, and 10. Data set 2 has red circles at 1, 3, 4, 5, 8, and 9 and blue crosses at 2, 3, 6, 7, 8, and 10. [Return to Figure 12.2]

Figure 12.3 Image Description: The y-axis of the graph is “Density” in increments of 0.2 from 0 to 1.2 and the x-axis is “X” in increments of 1 from 0 to 5. Five F-density curves are shown in this figure. The black curve is right-skewed with a peak at coordinates (0.3, 0.7). This is the F distribution with (3, 30) degrees of freedom. The red curve is right-skewed with a peak at coordinates (0.6, 0.6). This is the F distribution with (30, 3) degrees of freedom. The green curve is right-skewed with a peak at coordinates (0.5, 0.6). This is the F distribution with (15, 3) degrees of freedom. The purple curve is right-skewed with a peak at coordinates (0.8, 0.98). This is the F distribution with (15, 30) degrees of freedom. The blue curve is right-skewed with a peak at coordinates (0.9, 1.18). This is the F distribution with (30, 30) degrees of freedom. [Return to Figure 12.3]

Table 12.3 Image Description: This figure shows part of Table VI (Values of F sub alpha of F distribution) (Table 6). The elements of the main body of the table are F scores having an area of alpha (given as the row name) to its right for a given numerator degrees of freedom df sub n (given as the column name). We see df sub n equals 1, 2, up to 10 in this image. The alpha values provided are 0.5, 0.1, 0.05, 0.025, 0.01, 0.005, and 0.001. The table is grouped by the denominator degrees of freedom df sub d. This image shows seven rows and ten columns of F scores for df sub d equals 1, another seven rows for df sub d equals 2, and another seven rows for df sub d equals 3. [Return to Table 12.3]

Figure 12.4 Image Description: Six images are shown in a 2 by 3 matrix. The three graphs in the first row are the side-by-side histograms of downloading time for 7 AM, 5 PM and 12 AM from left to right. Their corresponding side-by-side boxplots are presented in the second row. [Return to Figure 12.4]

Figure 12.5 Image Description: Six images are shown in a 2 by 3 matrix. The three graphs in the first row are the side-by-side histograms of bone density for the control, high jump and low jump groups from left to right. Their corresponding side-by-side boxplots are presented in the second row. [Return to Figure 12.5]

Figure 13.1 Image Description: The x-axis labeled “Age (Years)” is in increments of 2 from 2 to 12. The y-axis labeled “Price ($1000)” is in increments of 2 from 4 to 14. Fifteen points are plotted at (1, 14), (1, 13), (3, 13), (4, 10), (4, 10), (5, 9), (5, 9), (6, 7), (7, 7), (7, 8), (8, 7), (8, 6), (10, 5), (10, 4) and (13, 3). The points roughly fall on a straight line going downward. [Return to Figure 13.1]

Figure 13.2 Image Description: A straight line given by “y equals b sub 0 plus b sub 1 times x” is drawn. The straight line goes downward and intersects the y-axis at coordinates (0, b sub 0). Two vertical lines at x and x plus 1 are drawn to show that the slope of the straight line b sub 1 is the change in the response Y when X increases by 1 unit. [Return to Figure 13.2]

Figure 13.3 Image Description: The fitted least-squares straight line is added to the scatter plot of the 15 used cars, with “Age (Years)” on the x-axis and “Price ($1000)” on the y-axis. Two points on the straight line are identified at age equals 3 and age equals 6; these two points are denoted as y-hat. Connecting the two fitted values y-hat with their corresponding observed values shows that the residual (defined as y minus y-hat) of the observation with age equals 3 is positive and the residual of the observation with age equals 6 is negative. [Return to Figure 13.3]

Figure 13.4 Image Description: On the scatter plot with the fitted least-squares straight line for those 15 used cars, three points on the least-squares straight line are chosen with age equals 2, 10 and 20. Their coordinates are (2, 12.2316), (10, 4.686), and (20, negative 4.746). The line slopes down through all three points. [Return to Figure 13.4]

Figure 13.5 Image Description: The x-axis is in increments of 5 from 0 to 20 and the y-axis is in increments of 10 from 0 to 40. Except for an outlier point at coordinates (20, 42), all data points are within the region [0, 5] by [0, 10]. The fitted least-squares straight lines with and without the outlier are almost identical. [Return to Figure 13.5]

Figure 13.6 Image Description: The x-axis is in increments of 5 from 0 to 20 and the y-axis is in increments of 2 from 0 to 8. Except for an outlier point at coordinates (20, 1), all data points are within the region [0, 4] by [3, 9]. The fitted least-squares straight lines with and without the outlier are almost orthogonal. The slope of the line with the outlier is negative, but the slope of the one without the outlier is positive. [Return to Figure 13.6]

Figure 13.7 Image Description: Four scatter plots are shown in a row. The first graph labelled a shows data points forming a “U” shape with a wide mouth. The second graph labelled b shows data points falling in a wide band going upward. The third graph labelled c shows the data points falling in a narrower but still wide band going downward. The fourth graph labelled d shows the data points falling in a very narrow band going upward. [Return to Figure 13.7]

Figure 13.8 Image Description: A vector Y starting from the origin represents the observed y values. The projection of Y onto a hyperplane is denoted as Y-hat. The projection of the vector Y minus Y-bar is denoted as Y-hat minus Y-bar. The angle between the vectors Y minus Y-bar and Y-hat minus Y-bar is theta. The residual vector Y minus Y-hat is orthogonal to the hyperplane. The vectors Y minus Y-bar, Y-hat minus Y-bar and Y minus Y-hat form a right triangle. [Return to Figure 13.8]

Figure 13.9 Image Description: On the scatter plot with the fitted least-squares straight line for those 15 used cars, three identical vertical bell-shaped curves are shown at age equals 4, 8, and 10. The centers of the bell-shaped curves gather around the least-squares straight line. [Return to Figure 13.9]

Figure 13.10 Image Description: Six graphs are presented in a 2 by 3 matrix. The three graphs in the first row are residual plots with “Standardised Residuals” as the y-axis, which is in increments of 1 from negative 4 to 4. All three residual plots have a red horizontal line at y equals 0. The corresponding normal probability plots of the three sets of residuals are presented in the second row. The first residual plot, labelled a, shows the data points scattered randomly within a horizontal band from y equals negative 3 to y equals 3. The second residual plot, labelled b, shows the data points in a “U” shape with a wide mouth. The third residual plot, labelled c, shows the data points having a wider range on the y-axis as x increases.

The second row shows the corresponding Q-Q plots. The first normal probability plot is of residual plot a. The y-axis labelled “Sample Quantiles” is in increments of 1 from negative 2 to 2. The x-axis labelled “Theoretical Quantiles” is in increments of 1 from negative 2 to 2. All points roughly fall on a straight line. The normal probability plot of residual plot b has a y-axis labelled “Sample Quantiles” in increments of 1 from negative 1 to 4. The x-axis is labelled “Theoretical Quantiles” and is in increments of 1 from negative 2 to 2. The points show a flattened “J” shape. The normal probability plot of residual plot c has a y-axis labelled “Sample Quantiles” in increments of 1 from negative 1 to 3. The x-axis is labelled “Theoretical Quantiles” and is in increments of 1 from negative 2 to 2. The points show an “s” shape. [Return to Figure 13.10]

Table 13.2 Image Description: A table of used car values. The given variables are: Age in years, Price in thousands of dollars, y-hat given by price-hat equals 14.118 minus 0.9432 times age, and residual given by error at i equals y minus y-hat. The first car has the following values: Age equals 1, Price equals 14, y-hat equals 13.1748, and Residual equals 0.8252. [Return to Table 13.2]
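The fitted value and residual quoted for the first car can be reproduced from the equation given in the description. The following is a minimal Python sketch; the names `INTERCEPT`, `SLOPE`, `fitted_price` and `residual` are illustrative assumptions, and only the coefficients 14.118 and 0.9432 come from the text.

```python
# Minimal sketch: fitted values and residuals from the least-squares line
# price-hat = 14.118 - 0.9432 * age quoted in the Table 13.2 description.
# The function and constant names are illustrative, not from the textbook.

INTERCEPT = 14.118   # b sub 0, in $1000
SLOPE = -0.9432      # b sub 1, in $1000 per year of age

def fitted_price(age: float) -> float:
    """Return y-hat, the fitted price (in $1000) for a car of the given age."""
    return INTERCEPT + SLOPE * age

def residual(age: float, price: float) -> float:
    """Return the residual y minus y-hat for an observed (age, price) pair."""
    return price - fitted_price(age)

# First car in the table: age 1 year, price 14 (in $1000).
print("y-hat:", round(fitted_price(1), 4))
print("residual:", round(residual(1, 14), 4))

# Ages highlighted on the fitted line in Figure 13.4.
for age in (2, 10, 20):
    print("age", age, "y-hat", round(fitted_price(age), 4))
```

The same function gives the three points on the line listed for Figure 13.4. The negative fitted price at age 20 illustrates what happens when the line is extended well beyond the observed ages.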

Figure 13.11 Image Description: The first graph is a normal Q-Q plot. The y-axis labelled “Sorted Residuals” is in increments of 0.5 from negative 1.5 to 1.5. The x-axis labelled “Normal Score” is in increments of 1 from negative 2 to 2. Fifteen points are plotted and they fall roughly on a straight line.

The second image is a scatter plot of standardised residuals. The y-axis labelled “Standardised Residuals” is in increments of 1 from negative 3 to 3. The x-axis labelled “Age (Years)” is in increments of 1 from 1 to 13. Fifteen points are plotted and they scatter randomly within a horizontal band from negative 2 to 2. [Return to Figure 13.11]

Figure 13.12 Image Description: A histogram showing one possible distribution of the slope. The y-axis labelled “Density” is in increments of 1 from 0 to 5. The x-axis labelled “Slope b sub 1 (in $1000 per year)” is in increments of 0.1 from negative 1.2 to negative 0.7. The graph consists of bars, and a black bell-shaped curve ranging from negative 1.2 to negative 0.7 is drawn on top of the bars. A red solid vertical line representing the population slope beta sub 1 and a blue dashed vertical line representing the mean of the least-squares estimate b sub 1 are drawn. The red and blue lines almost coincide at around slope equals negative 0.945. [Return to Figure 13.12]

Figure 13.13 Image Description: Two histograms are presented side by side. The graph on the left panel is a histogram of the distribution for the conditional mean. The y-axis labelled “Density” is in increments of 0.5 from 0 to 2. The x-axis labelled “Conditional Mean (in $1000)” is in increments of 1 from 5 to 10. The graph consists of bars, and a black bell-shaped curve ranging from 6.5 to 8.5 is drawn on top of the bars. A red solid vertical line representing the population conditional mean and a blue dashed vertical line representing the mean of the sample conditional mean are drawn. The red and blue lines almost coincide at around mean equals 7.5. The graph on the right panel is a histogram of the distribution of a single value response. The y-axis labelled “Density” is in increments of 0.1 from 0 to 0.5. The x-axis labelled “A single Response (in $1000)” is in increments of 1 from 5 to 10. The graph consists of bars, and a black bell-shaped curve ranging from 4 to 10 is drawn on top of the bars. A red solid vertical line representing the population response Y and a blue dashed vertical line representing the mean of the fitted value Y-hat are drawn. The red and blue lines almost coincide at around single value equals 7.5. [Return to Figure 13.13]

License

Introduction to Applied Statistics Copyright © 2024 by Wanhua Su is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License, except where otherwise noted.