{"id":211,"date":"2020-06-28T15:51:59","date_gmt":"2020-06-28T19:51:59","guid":{"rendered":"https:\/\/openbooks.macewan.ca\/rcommander\/?post_type=chapter&#038;p=211"},"modified":"2025-05-07T17:37:40","modified_gmt":"2025-05-07T21:37:40","slug":"2-1-centre-of-a-distribution","status":"publish","type":"chapter","link":"https:\/\/openbooks.macewan.ca\/introstats\/chapter\/2-1-centre-of-a-distribution\/","title":{"raw":"2.1 Centre of a Distribution","rendered":"2.1 Centre of a Distribution"},"content":{"raw":"The centre of a distribution is in general referred to as the most typical value of the distribution. There are three ways to describe the centre of a distribution\u2014median, mean, and mode.\r\n<h2>2.1.1 Median<\/h2>\r\nWhat is the centre of a ruler? The centre could be viewed as the balance point that cuts the ruler into two halves with equal weight. A similar idea can be applied to the centre of a distribution: the centre of a distribution can be viewed as the value that divides the sorted data into two halves with an equal number of observations. This value is known as the <strong>median<\/strong>. That is, 50% of the observations are below the median and another 50% are above the median. 
Here are the steps to find the median of a set of data:\r\n<ol>\r\n \t<li>Sort the data from the smallest to the largest.<\/li>\r\n \t<li>If the total number of observations <em>n<\/em> is odd, the median is the observation standing in the middle of the sorted list.<\/li>\r\n \t<li>If <em>n<\/em> is an even number, the median is the average of the two values in the middle of the sorted list.<\/li>\r\n<\/ol>\r\n<div class=\"textbox textbox--examples\"><header class=\"textbox__header\">\r\n<p class=\"textbox__title\">Example: Find the Median<\/p>\r\n\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n\r\nFind the Median of the Data 3, 5, 3, 7, 7.\r\n\r\nSteps:\r\n<ul>\r\n \t<li>Sort into 3, 3, <strong>5<\/strong>, 7, 7.<\/li>\r\n \t<li><em>n<\/em> = 5 which is odd, median = 5.<\/li>\r\n<\/ul>\r\n<\/div>\r\n<\/div>\r\n<div style=\"height: 55px; margin-top: 2.1428571429em;\">\r\n\r\n<img class=\"size-full wp-image-99 alignleft\" src=\"https:\/\/openbooks.macewan.ca\/rcommander\/wp-content\/uploads\/sites\/8\/2020\/06\/activity.png\" alt=\"\" width=\"250\" height=\"50\" \/>\r\n\r\n<\/div>\r\n<div class=\"textbox textbox--exercises\"><header class=\"textbox__header\">\r\n<p class=\"textbox__title\">Exercise: Find the Median<\/p>\r\n\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n\r\nFind the median of the following data sets:\r\n<ol>\r\n \t<li>3, 5, 3, 7, 997<\/li>\r\n \t<li>3, 5, 3, 7<\/li>\r\n \t<li>Male, Female, Male, Male, Female, Female, Female<\/li>\r\n<\/ol>\r\n&nbsp;\r\n\r\n<details><summary>Show\/Hide Answer<\/summary>\r\n<ol>\r\n \t<li>3, 5, 3, 7, 997\r\nSort into 3, 3, 5, 7, 997.\r\nn = 5 which is odd, median = 5.<\/li>\r\n \t<li>3, 5, 3, 7\r\nSort into 3, 3, 5, 7\r\nn = 4 which is even, median = [latex]\\frac{3+5}{2} = 4 [\/latex].<\/li>\r\n \t<li>Male, Female, Male, Male, Female, Female, Female\r\nSort: cannot sort Male and Female; therefore, the median does not exist.<\/li>\r\n<\/ol>\r\n<\/details><\/div>\r\n<\/div>\r\n<h2><strong>2.1.2 
Mean<\/strong><\/h2>\r\nThe <strong>mean<\/strong> is the average of all observations, which equals the total divided by the number of observations. Suppose the population has [latex]N[\/latex] observations denoted as [latex]x_1, x_2, ..., x_N[\/latex] to distinguish them. Therefore, [latex] x_i[\/latex] refers to the ith observation, [latex]i=1, 2, ... , N[\/latex]. The <strong>population mean<\/strong>, denoted as [latex]\\mu[\/latex] can be calculated as\r\n[latex] \\mu = \\frac{\\text{total}}{N} = \\frac{\\text{sum}}{N} = \\frac{x_1 + x_2+ ... + x_N}{N} = \\frac{\\sum_{i=1}^N x_i}{N} = \\frac{\\sum x_i}{N}. [\/latex]\r\nThe notation \"[latex]\\sum [\/latex]\" is the summation sign which means taking the sum of the observations as indicated in the index, i.e., for all values of [latex]i[\/latex] from 1 to [latex]N[\/latex]. Here [latex]N[\/latex] denotes the population size, the number of individuals in the population.\r\n\r\nIf we have a sample of size [latex]n, x_1, x_2, \\dots, x_n[\/latex], we can calculate the <strong>sample mean<\/strong>, denoted as [latex]\\bar{x}[\/latex] (read as x-bar), as follows:\r\n<p align=\"center\">[latex]\\bar{x} = \\frac{\\text{sum}}{n} = \\frac{x_1 + x_2+ ... + x_n}{n} = \\frac{\\sum_{i=1}^n x_i}{n} = \\frac{\\sum x_i}{n}. 
[\/latex]<\/p>\r\nHere [latex]n[\/latex] is the sample size, the number of individuals in the sample.\r\n\r\nIn inferential statistics, we often use the sample mean [latex]\\bar x[\/latex] to estimate the value of the population mean [latex]\\mu[\/latex].\r\n<div class=\"textbox textbox--examples\"><header class=\"textbox__header\">\r\n<p class=\"textbox__title\">Example: Find the Mean<\/p>\r\n\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n\r\nFind the Sample Mean of the Data 3, 5, 3, 7, 7.\r\n<p style=\"text-align: center;\">[latex]\\bar{x} = \\frac{\\sum x_i}{n} = \\frac{3+5+3+7+7}{5} = \\frac{25}{5} = 5.[\/latex]<\/p>\r\n\r\n<\/div>\r\n<\/div>\r\n<div style=\"height: 55px; margin-top: 2.1428571429em;\">\r\n\r\n<img class=\"size-full wp-image-99 alignleft\" src=\"https:\/\/openbooks.macewan.ca\/rcommander\/wp-content\/uploads\/sites\/8\/2020\/06\/activity.png\" alt=\"\" width=\"250\" height=\"50\" \/>\r\n\r\n<\/div>\r\n<div class=\"textbox textbox--exercises\"><header class=\"textbox__header\">\r\n<p class=\"textbox__title\">Exercises: Find the Mean<\/p>\r\n\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n\r\nFind the mean of the following data sets:\r\n<ol>\r\n \t<li>3, 5, 3, 7, 997<\/li>\r\n \t<li>3, 5, 3, 7<\/li>\r\n \t<li>Male, Female, Male, Male, Female, Female, Female<\/li>\r\n<\/ol>\r\n&nbsp;\r\n\r\n<details><summary>Show\/Hide Answer<\/summary>\r\n<ol>\r\n \t<li>[latex]\\bar{x} = \\frac{\\sum x_i}{n} = \\frac{3+5+3+7+997}{5} = \\frac{1015}{5} = 203.[\/latex]\r\nThe sample mean is much larger than the mean in the previous example due to the extremely large observation of 997.<\/li>\r\n \t<li>[latex]\\bar{x} = \\frac{\\sum x_i}{n} = \\frac{3+5+3+7}{4} = \\frac{18}{4} = 4.5.[\/latex]<\/li>\r\n \t<li>We cannot calculate the average of qualitative data; therefore, the sample mean does not exist.<\/li>\r\n<\/ol>\r\n<\/details><\/div>\r\n<\/div>\r\n<h2>2.1.3 Mode<\/h2>\r\nThe last measure of centre covered in this course is the <strong>mode, 
<\/strong>which is the observation that occurs most often. If at least two observations occur most often, the data set has multiple modes; if all observations occur once, there is no mode.\r\n<div class=\"textbox textbox--examples\"><header class=\"textbox__header\">\r\n<p class=\"textbox__title\">Example: Find the Mode<\/p>\r\n\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n\r\nFind the Mode of the Data 3, 5, 3, 7, 7.\r\n\r\nThe observation \"3\" occurs twice, and so does the observation \"7\". The observation \"5\" occurs only once. Therefore, the modes are 3 and 7.\r\n\r\n<\/div>\r\n<\/div>\r\n<div style=\"height: 55px; margin-top: 2.1428571429em;\">\r\n\r\n<img class=\"size-full wp-image-99 alignleft\" src=\"https:\/\/openbooks.macewan.ca\/rcommander\/wp-content\/uploads\/sites\/8\/2020\/06\/activity.png\" alt=\"\" width=\"250\" height=\"50\" \/>\r\n\r\n<\/div>\r\n<div class=\"textbox textbox--exercises\"><header class=\"textbox__header\">\r\n<p class=\"textbox__title\">Exercise: Find the Mode<\/p>\r\n\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n\r\nFind the mode of the following data sets:\r\n<ol>\r\n \t<li>3, 5, 3, 7, 997<\/li>\r\n \t<li>3, 5, 9, 7<\/li>\r\n \t<li>Male, Female, Male, Male, Female, Female, Female<\/li>\r\n<\/ol>\r\n&nbsp;\r\n\r\n<details><summary>Show\/Hide Answer<\/summary>\r\n<ol>\r\n \t<li>3, 5, 3, 7, 997\r\nThe observation \"3\" occurs twice, the observations \"5,\" \"7,\" and \"997\" each occur only once. Therefore, the mode is 3.<\/li>\r\n \t<li>3, 5, 9, 7\r\nEach observation occurs only once, so there is no mode.<\/li>\r\n \t<li>Male, Female, Male, Male, Female, Female, Female\r\nThere are three males and four females. Thus, the mode is \"Female.\"<\/li>\r\n<\/ol>\r\n<\/details><\/div>\r\n<\/div>\r\n<h2>2.1.4 Choose the Proper Measure to Describe Centre<\/h2>\r\nMean, median, and mode are the three measures of the centre. 
The following section provides some practical guidelines for choosing the proper measure to describe the centre of the data.\r\n<h3><strong>Mean Versus Median<\/strong><\/h3>\r\nBoth the mean and the median are measures of the centre of a distribution. When a distribution is symmetric, the mean and the median are equal. However, it is better to use the mean to describe the centre of a symmetric distribution for several reasons (one of which is introduced in Chapter 6). On the other hand, when a distribution is skewed or when it contains outliers, it is better to use the median. This is because the mean uses the value of every observation in a data set and, as such, it is easily influenced by extremely large or small values (called outliers). Conversely, the median depends only on the middle value(s) of the sorted data. For this reason, the median is highly resistant to outliers. Recall the two data sets in the previous examples: {3, 3, 5, 7, 7} and {3, 3, 5, 7, 997}. These two data sets have the same median of 5; however, they have very different sample means (5 versus 203) due to the extreme observation (i.e., 997) in the second data set.\r\n\r\nThe following graphs show the relationship between the mean (red solid) and the median (blue dashed) for symmetric, right-skewed, and left-skewed distributions.<a id=\"retfig2.1\"><\/a>\r\n\r\n[caption id=\"attachment_256\" align=\"aligncenter\" width=\"1455\"]<img class=\"wp-image-256 size-full\" src=\"https:\/\/openbooks.macewan.ca\/introstats\/wp-content\/uploads\/sites\/8\/2020\/06\/m02_Mean-Median_Distribution.png\" alt=\"Three histograms in a row showing the differences between mean and median for symmetric and non-symmetric distributions. 
Image description available\" width=\"1455\" height=\"539\" \/> <strong>Figure 2.1<\/strong>: Compare Mean (red solid) and Median (blue dashed) [<a href=\"https:\/\/openbooks.macewan.ca\/introstats\/back-matter\/image-description\/#fig2.1\">Image Description (See Appendix D Figure 2.1)<\/a>][\/caption]We can tell from the figures:\r\n<ul>\r\n \t<li>For the right-skewed distribution (longer tail on the right-hand side), mean &gt; median because the observations in the right tail drag the mean to the right.<\/li>\r\n \t<li>For the symmetric distribution, mean = median. Both divide the distribution into two parts with a roughly equal number of observations.<\/li>\r\n \t<li>For the distribution that is skewed to the left (longer tail on the left-hand side), mean &lt; median because the observations in the left tail drag the mean to the left.<\/li>\r\n<\/ul>\r\n<h3><strong>Summary of the Centre<\/strong><\/h3>\r\nHere are some guidelines for choosing the proper measure to describe the centre of a distribution:\r\n<ul>\r\n \t<li>Use the median when the distribution is extremely skewed or contains outliers.<\/li>\r\n \t<li>Use the mean when the distribution is symmetric and there are no outliers.<\/li>\r\n \t<li>For qualitative\/categorical data, we can only use the mode to describe the centre.<\/li>\r\n \t<li>For quantitative data, the mode can also be computed. However, it is not as informative as the median or the mean.<\/li>\r\n<\/ul>","rendered":"<p>The centre of a distribution is in general referred to as the most typical value of the distribution. There are three ways to describe the centre of a distribution\u2014median, mean, and mode.<\/p>\n<h2>2.1.1 Median<\/h2>\n<p>What is the centre of a ruler? The centre could be viewed as the balance point that cuts the ruler into two halves with equal weight. 
A similar idea can be applied to the centre of a distribution: the centre of a distribution can be viewed as the value that divides the sorted data into two halves with an equal number of observations. This value is known as the <strong>median<\/strong>. That is, 50% of the observations are below the median and another 50% are above the median. Here are the steps to find the median of a set of data:<\/p>\n<ol>\n<li>Sort the data from the smallest to the largest.<\/li>\n<li>If the total number of observations <em>n<\/em> is odd, the median is the observation standing in the middle of the sorted list.<\/li>\n<li>If <em>n<\/em> is an even number, the median is the average of the two values in the middle of the sorted list.<\/li>\n<\/ol>\n<div class=\"textbox textbox--examples\">\n<header class=\"textbox__header\">\n<p class=\"textbox__title\">Example: Find the Median<\/p>\n<\/header>\n<div class=\"textbox__content\">\n<p>Find the Median of the Data 3, 5, 3, 7, 7.<\/p>\n<p>Steps:<\/p>\n<ul>\n<li>Sort into 3, 3, <strong>5<\/strong>, 7, 7.<\/li>\n<li><em>n<\/em> = 5 which is odd, median = 5.<\/li>\n<\/ul>\n<\/div>\n<\/div>\n<div style=\"height: 55px; margin-top: 2.1428571429em;\">\n<p><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-99 alignleft\" src=\"https:\/\/openbooks.macewan.ca\/rcommander\/wp-content\/uploads\/sites\/8\/2020\/06\/activity.png\" alt=\"\" width=\"250\" height=\"50\" srcset=\"https:\/\/openbooks.macewan.ca\/introstats\/wp-content\/uploads\/sites\/8\/2020\/06\/activity.png 250w, https:\/\/openbooks.macewan.ca\/introstats\/wp-content\/uploads\/sites\/8\/2020\/06\/activity-65x13.png 65w, https:\/\/openbooks.macewan.ca\/introstats\/wp-content\/uploads\/sites\/8\/2020\/06\/activity-225x45.png 225w\" sizes=\"auto, (max-width: 250px) 100vw, 250px\" \/><\/p>\n<\/div>\n<div class=\"textbox textbox--exercises\">\n<header class=\"textbox__header\">\n<p class=\"textbox__title\">Exercise: Find the Median<\/p>\n<\/header>\n<div 
class=\"textbox__content\">\n<p>Find the median of the following data sets:<\/p>\n<ol>\n<li>3, 5, 3, 7, 997<\/li>\n<li>3, 5, 3, 7<\/li>\n<li>Male, Female, Male, Male, Female, Female, Female<\/li>\n<\/ol>\n<p>&nbsp;<\/p>\n<details>\n<summary>Show\/Hide Answer<\/summary>\n<ol>\n<li>3, 5, 3, 7, 997<br \/>\nSort into 3, 3, 5, 7, 997.<br \/>\nn = 5 which is odd, median = 5.<\/li>\n<li>3, 5, 3, 7<br \/>\nSort into 3, 3, 5, 7<br \/>\nn = 4 which is even, median = [latex]\\frac{3+5}{2} = 4[\/latex].<\/li>\n<li>Male, Female, Male, Male, Female, Female, Female<br \/>\nSort: cannot sort Male and Female; therefore, the median does not exist.<\/li>\n<\/ol>\n<\/details>\n<\/div>\n<\/div>\n<h2><strong>2.1.2 Mean<\/strong><\/h2>\n<p>The <strong>mean<\/strong> is the average of all observations, which equals the total divided by the number of observations. Suppose the population has [latex]N[\/latex] observations denoted as [latex]x_1, x_2, ..., x_N[\/latex] to distinguish them. Therefore, [latex]x_i[\/latex] refers to the ith observation, [latex]i=1, 2, ... , N[\/latex]. The <strong>population mean<\/strong>, denoted as [latex]\\mu[\/latex] can be calculated as<br \/>\n[latex]\\mu = \\frac{\\text{total}}{N} = \\frac{\\text{sum}}{N} = \\frac{x_1 + x_2+ ... + x_N}{N} = \\frac{\\sum_{i=1}^N x_i}{N} = \\frac{\\sum x_i}{N}.[\/latex]<br \/>\nThe notation &#8220;[latex]\\sum[\/latex]&#8221; is the summation sign which means taking the sum of the observations as indicated in the index, i.e., for all values of [latex]i[\/latex] from 1 to [latex]N[\/latex]. Here [latex]N[\/latex] denotes the population size, the number of individuals in the population.<\/p>\n<p>If we have a sample of size [latex]n, x_1, x_2, \\dots, x_n[\/latex], we can calculate the <strong>sample mean<\/strong>, denoted as [latex]\\bar{x}[\/latex] (read as x-bar), as follows:<\/p>\n<p style=\"text-align: center;\">[latex]\\bar{x} = \\frac{\\text{sum}}{n} = \\frac{x_1 + x_2+ ... 
+ x_n}{n} = \\frac{\\sum_{i=1}^n x_i}{n} = \\frac{\\sum x_i}{n}.[\/latex]<\/p>\n<p>Here [latex]n[\/latex] is the sample size, the number of individuals in the sample.<\/p>\n<p>In inferential statistics, we often use the sample mean [latex]\\bar x[\/latex] to estimate the value of the population mean [latex]\\mu[\/latex].<\/p>\n<div class=\"textbox textbox--examples\">\n<header class=\"textbox__header\">\n<p class=\"textbox__title\">Example: Find the Mean<\/p>\n<\/header>\n<div class=\"textbox__content\">\n<p>Find the Sample Mean of the Data 3, 5, 3, 7, 7.<\/p>\n<p style=\"text-align: center;\">[latex]\\bar{x} = \\frac{\\sum x_i}{n} = \\frac{3+5+3+7+7}{5} = \\frac{25}{5} = 5.[\/latex]<\/p>\n<\/div>\n<\/div>\n<div style=\"height: 55px; margin-top: 2.1428571429em;\">\n<p><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-99 alignleft\" src=\"https:\/\/openbooks.macewan.ca\/rcommander\/wp-content\/uploads\/sites\/8\/2020\/06\/activity.png\" alt=\"\" width=\"250\" height=\"50\" srcset=\"https:\/\/openbooks.macewan.ca\/introstats\/wp-content\/uploads\/sites\/8\/2020\/06\/activity.png 250w, https:\/\/openbooks.macewan.ca\/introstats\/wp-content\/uploads\/sites\/8\/2020\/06\/activity-65x13.png 65w, https:\/\/openbooks.macewan.ca\/introstats\/wp-content\/uploads\/sites\/8\/2020\/06\/activity-225x45.png 225w\" sizes=\"auto, (max-width: 250px) 100vw, 250px\" \/><\/p>\n<\/div>\n<div class=\"textbox textbox--exercises\">\n<header class=\"textbox__header\">\n<p class=\"textbox__title\">Exercises: Find the Mean<\/p>\n<\/header>\n<div class=\"textbox__content\">\n<p>Find the mean of the following data sets:<\/p>\n<ol>\n<li>3, 5, 3, 7, 997<\/li>\n<li>3, 5, 3, 7<\/li>\n<li>Male, Female, Male, Male, Female, Female, Female<\/li>\n<\/ol>\n<p>&nbsp;<\/p>\n<details>\n<summary>Show\/Hide Answer<\/summary>\n<ol>\n<li>[latex]\\bar{x} = \\frac{\\sum x_i}{n} = \\frac{3+5+3+7+997}{5} = \\frac{1015}{5} = 203.[\/latex]<br \/>\nThe sample mean is much larger than the mean in the 
previous example due to the extremely large observation of 997.<\/li>\n<li>[latex]\\bar{x} = \\frac{\\sum x_i}{n} = \\frac{3+5+3+7}{4} = \\frac{18}{4} = 4.5.[\/latex]<\/li>\n<li>We cannot calculate the average of qualitative data; therefore, the sample mean does not exist.<\/li>\n<\/ol>\n<\/details>\n<\/div>\n<\/div>\n<h2>2.1.3 Mode<\/h2>\n<p>The last measure of centre covered in this course is the <strong>mode, <\/strong>which is the observation that occurs most often. If at least two observations occur most often, the data set has multiple modes; if all observations occur once, there is no mode.<\/p>\n<div class=\"textbox textbox--examples\">\n<header class=\"textbox__header\">\n<p class=\"textbox__title\">Example: Find the Mode<\/p>\n<\/header>\n<div class=\"textbox__content\">\n<p>Find the Mode of the Data 3, 5, 3, 7, 7.<\/p>\n<p>The observation &#8220;3&#8221; occurs twice, and so does the observation &#8220;7&#8221;. The observation &#8220;5&#8221; occurs only once. Therefore, the modes are 3 and 7.<\/p>\n<\/div>\n<\/div>\n<div style=\"height: 55px; margin-top: 2.1428571429em;\">\n<p><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-99 alignleft\" src=\"https:\/\/openbooks.macewan.ca\/rcommander\/wp-content\/uploads\/sites\/8\/2020\/06\/activity.png\" alt=\"\" width=\"250\" height=\"50\" srcset=\"https:\/\/openbooks.macewan.ca\/introstats\/wp-content\/uploads\/sites\/8\/2020\/06\/activity.png 250w, https:\/\/openbooks.macewan.ca\/introstats\/wp-content\/uploads\/sites\/8\/2020\/06\/activity-65x13.png 65w, https:\/\/openbooks.macewan.ca\/introstats\/wp-content\/uploads\/sites\/8\/2020\/06\/activity-225x45.png 225w\" sizes=\"auto, (max-width: 250px) 100vw, 250px\" \/><\/p>\n<\/div>\n<div class=\"textbox textbox--exercises\">\n<header class=\"textbox__header\">\n<p class=\"textbox__title\">Exercise: Find the Mode<\/p>\n<\/header>\n<div class=\"textbox__content\">\n<p>Find the mode of the following data sets:<\/p>\n<ol>\n<li>3, 5, 3, 7, 
997<\/li>\n<li>3, 5, 9, 7<\/li>\n<li>Male, Female, Male, Male, Female, Female, Female<\/li>\n<\/ol>\n<p>&nbsp;<\/p>\n<details>\n<summary>Show\/Hide Answer<\/summary>\n<ol>\n<li>3, 5, 3, 7, 997<br \/>\nThe observation &#8220;3&#8221; occurs twice, the observations &#8220;5,&#8221; &#8220;7,&#8221; and &#8220;997&#8221; each occur only once. Therefore, the mode is 3.<\/li>\n<li>3, 5, 9, 7<br \/>\nEach observation occurs only once, so there is no mode.<\/li>\n<li>Male, Female, Male, Male, Female, Female, Female<br \/>\nThere are three males and four females. Thus, the mode is &#8220;Female.&#8221;<\/li>\n<\/ol>\n<\/details>\n<\/div>\n<\/div>\n<h2>2.1.4 Choose the Proper Measure to Describe Centre<\/h2>\n<p>Mean, median, and mode are the three measures of the centre. The following section provides some practical guidelines for choosing the proper measure to describe the centre of the data.<\/p>\n<h3><strong>Mean Versus Median<\/strong><\/h3>\n<p>Both the mean and the median are measures of the centre of a distribution. When a distribution is symmetric, the mean and the median are equal. However, it is better to use the mean to describe the centre of a symmetric distribution for several reasons (one of which is introduced in Chapter 6). On the other hand, when a distribution is skewed or when it contains outliers, it is better to use the median. This is because the mean uses the value of every observation in a data set and, as such, it is easily influenced by extremely large or small values (called outliers). Conversely, the median depends only on the middle value(s) of the sorted data. For this reason, the median is highly resistant to outliers. Recall the two data sets in the previous examples: {3, 3, 5, 7, 7} and {3, 3, 5, 7, 997}. 
These two data sets have the same median of 5; however, they have very different sample means (5 versus 203) due to the extreme observation (i.e., 997) in the second data set.<\/p>\n<p>The following graphs show the relationship between the mean (red solid) and the median (blue dashed) for symmetric, right-skewed, and left-skewed distributions.<a id=\"retfig2.1\"><\/a><\/p>\n<figure id=\"attachment_256\" aria-describedby=\"caption-attachment-256\" style=\"width: 1455px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-256 size-full\" src=\"https:\/\/openbooks.macewan.ca\/introstats\/wp-content\/uploads\/sites\/8\/2020\/06\/m02_Mean-Median_Distribution.png\" alt=\"Three histograms in a row showing the differences between mean and median for symmetric and non-symmetric distributions. Image description available\" width=\"1455\" height=\"539\" srcset=\"https:\/\/openbooks.macewan.ca\/introstats\/wp-content\/uploads\/sites\/8\/2020\/06\/m02_Mean-Median_Distribution.png 1455w, https:\/\/openbooks.macewan.ca\/introstats\/wp-content\/uploads\/sites\/8\/2020\/06\/m02_Mean-Median_Distribution-300x111.png 300w, https:\/\/openbooks.macewan.ca\/introstats\/wp-content\/uploads\/sites\/8\/2020\/06\/m02_Mean-Median_Distribution-1024x379.png 1024w, https:\/\/openbooks.macewan.ca\/introstats\/wp-content\/uploads\/sites\/8\/2020\/06\/m02_Mean-Median_Distribution-768x285.png 768w, https:\/\/openbooks.macewan.ca\/introstats\/wp-content\/uploads\/sites\/8\/2020\/06\/m02_Mean-Median_Distribution-65x24.png 65w, https:\/\/openbooks.macewan.ca\/introstats\/wp-content\/uploads\/sites\/8\/2020\/06\/m02_Mean-Median_Distribution-225x83.png 225w, https:\/\/openbooks.macewan.ca\/introstats\/wp-content\/uploads\/sites\/8\/2020\/06\/m02_Mean-Median_Distribution-350x130.png 350w\" sizes=\"auto, (max-width: 1455px) 100vw, 1455px\" \/><figcaption id=\"caption-attachment-256\" class=\"wp-caption-text\"><strong>Figure 2.1<\/strong>: Compare Mean (red solid) 
and Median (blue dashed) [<a href=\"https:\/\/openbooks.macewan.ca\/introstats\/back-matter\/image-description\/#fig2.1\">Image Description (See Appendix D Figure 2.1)<\/a>]<\/figcaption><\/figure>\n<p>We can tell from the figures:<\/p>\n<ul>\n<li>For the right-skewed distribution (longer tail on the right-hand side), mean &gt; median because the observations in the right tail drag the mean to the right.<\/li>\n<li>For the symmetric distribution, mean = median. Both divide the distribution into two parts with a roughly equal number of observations.<\/li>\n<li>For the distribution that is skewed to the left (longer tail on the left-hand side), mean &lt; median because the observations in the left tail drag the mean to the left.<\/li>\n<\/ul>\n<h3><strong>Summary of the Centre<\/strong><\/h3>\n<p>Here are some guidelines for choosing the proper measure to describe the centre of a distribution:<\/p>\n<ul>\n<li>Use the median when the distribution is extremely skewed or contains outliers.<\/li>\n<li>Use the mean when the distribution is symmetric and there are no outliers.<\/li>\n<li>For qualitative\/categorical data, we can only use the mode to describe the centre.<\/li>\n<li>For quantitative data, the mode can also be computed. 
However, it is not as informative as the median or the mean.<\/li>\n<\/ul>\n","protected":false},"author":19,"menu_order":1,"template":"","meta":{"pb_show_title":"on","pb_short_title":"","pb_subtitle":"","pb_authors":[],"pb_section_license":""},"chapter-type":[],"contributor":[],"license":[],"class_list":["post-211","chapter","type-chapter","status-publish","hentry"],"part":209,"_links":{"self":[{"href":"https:\/\/openbooks.macewan.ca\/introstats\/wp-json\/pressbooks\/v2\/chapters\/211","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/openbooks.macewan.ca\/introstats\/wp-json\/pressbooks\/v2\/chapters"}],"about":[{"href":"https:\/\/openbooks.macewan.ca\/introstats\/wp-json\/wp\/v2\/types\/chapter"}],"author":[{"embeddable":true,"href":"https:\/\/openbooks.macewan.ca\/introstats\/wp-json\/wp\/v2\/users\/19"}],"version-history":[{"count":83,"href":"https:\/\/openbooks.macewan.ca\/introstats\/wp-json\/pressbooks\/v2\/chapters\/211\/revisions"}],"predecessor-version":[{"id":5484,"href":"https:\/\/openbooks.macewan.ca\/introstats\/wp-json\/pressbooks\/v2\/chapters\/211\/revisions\/5484"}],"part":[{"href":"https:\/\/openbooks.macewan.ca\/introstats\/wp-json\/pressbooks\/v2\/parts\/209"}],"metadata":[{"href":"https:\/\/openbooks.macewan.ca\/introstats\/wp-json\/pressbooks\/v2\/chapters\/211\/metadata\/"}],"wp:attachment":[{"href":"https:\/\/openbooks.macewan.ca\/introstats\/wp-json\/wp\/v2\/media?parent=211"}],"wp:term":[{"taxonomy":"chapter-type","embeddable":true,"href":"https:\/\/openbooks.macewan.ca\/introstats\/wp-json\/pressbooks\/v2\/chapter-type?post=211"},{"taxonomy":"contributor","embeddable":true,"href":"https:\/\/openbooks.macewan.ca\/introstats\/wp-json\/wp\/v2\/contributor?post=211"},{"taxonomy":"license","embeddable":true,"href":"https:\/\/openbooks.macewan.ca\/introstats\/wp-json\/wp\/v2\/license?post=211"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}