{"id":1011,"date":"2021-05-30T22:54:42","date_gmt":"2021-05-31T02:54:42","guid":{"rendered":"https:\/\/openbooks.macewan.ca\/rcommander\/?post_type=chapter&#038;p=1011"},"modified":"2024-02-08T14:20:05","modified_gmt":"2024-02-08T19:20:05","slug":"9-1-distribution-of-the-difference-between-two-sample-means-for-two-independent-samples","status":"publish","type":"chapter","link":"https:\/\/openbooks.macewan.ca\/introstats\/chapter\/9-1-distribution-of-the-difference-between-two-sample-means-for-two-independent-samples\/","title":{"raw":"9.1 Distribution of the Difference between Two Sample Means for Two Independent Samples","rendered":"9.1 Distribution of the Difference between Two Sample Means for Two Independent Samples"},"content":{"raw":"Suppose two populations have means [latex]\\mu_1[\/latex], [latex]\\mu_2[\/latex] and standard deviations [latex]\\sigma_1[\/latex], [latex]\\sigma_2[\/latex]. Further, suppose that we obtain from each population simple random samples, from which we obtain sample means [latex]\\bar{x}_1[\/latex] and [latex]\\bar{x}_2[\/latex]. Our objective is to make inferences about [latex]\\mu_1[\/latex] \u2013 [latex]\\mu_2[\/latex] using the unbiased estimate [latex]\\bar{x}_1[\/latex] - [latex]\\bar{x}_2[\/latex] and as such, we need to know the distribution of [latex]\\bar{X}_1 - \\bar{X}_2[\/latex].<a id=\"retfig9.1\"><\/a>\r\n\r\n&nbsp;\r\n\r\n[caption id=\"attachment_2881\" align=\"aligncenter\" width=\"500\"]<img class=\"wp-image-2881\" src=\"https:\/\/openbooks.macewan.ca\/rcommander\/wp-content\/uploads\/sites\/8\/2021\/05\/two_sample_t_crop-1024x772.png\" alt=\"A figure demonstrating that two independent populations have independent samples. Image description available.\" width=\"500\" height=\"377\" \/> <strong>Figure 9.1<\/strong>: Two Independent Samples. 
[<a href=\"https:\/\/openbooks.macewan.ca\/introstats\/back-matter\/image-description\/#fig9.1\">Image Description (See Appendix D Figure 9.1)<\/a>][\/caption]Recall the conclusions about the sampling distribution of the sample mean [latex]\\bar{X}[\/latex] based on samples of size <em>n<\/em> taken from a population with mean [latex]\\mu[\/latex] and standard deviation [latex]\\sigma[\/latex]:\r\n<div class=\"textbox\">\r\n<ol>\r\n \t<li>The mean of [latex]\\bar{X}[\/latex] equals the population mean [latex]\\mu[\/latex], i.e., [latex]\\mu_{\\scriptsize \\bar{X}} = \\mu[\/latex].<\/li>\r\n \t<li>The standard deviation of [latex]\\bar{X}[\/latex] equals the population standard deviation divided by the square root of the sample size <em>n<\/em>, i.e., [latex]\\sigma_{\\scriptsize\\bar{X}} = \\frac{\\sigma}{\\sqrt{n}}[\/latex].\r\n<strong> These two conclusions are always true regardless of the population distribution and the sample size <em>n<\/em>.<\/strong><\/li>\r\n \t<li>The shape of the distribution of [latex]\\bar{X}[\/latex]:\r\n<ol type=\"a\">\r\n \t<li>If the population is normally distributed, so is [latex]\\bar{X}[\/latex] regardless of the sample size <em>n<\/em>.<\/li>\r\n \t<li>If the population is not normally distributed, but the sample size <em>n<\/em> is relatively large, say [latex]n \\geq 30[\/latex], then the sample mean [latex]\\bar{X}[\/latex] is approximately normally distributed.<\/li>\r\n<\/ol>\r\n<\/li>\r\n<\/ol>\r\n<\/div>\r\nA similar idea applies to the distribution of [latex]\\bar{X_1} - \\bar{X_2}[\/latex].\r\n<div class=\"textbox textbox--key-takeaways\"><header class=\"textbox__header\">\r\n<p class=\"textbox__title\">Key Facts: Sampling Distribution of [latex]\\color{white}\\bar{X_1}-\\bar{X_2}[\/latex]<\/p>\r\n\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n<ol>\r\n \t<li>The mean of [latex]\\bar{X_1} - \\bar{X_2}[\/latex] equals the difference of the population means: [latex]\\mu_{\\scriptsize \\bar{X_1} - \\bar{X_2}} = 
\\mu_1 - \\mu_2[\/latex].<\/li>\r\n \t<li>The standard deviation of [latex]\\bar{X_1} - \\bar{X_2}[\/latex] is: [latex]\\sigma_{\\scriptsize \\bar{X_1} - \\bar{X_2}} = \\sqrt{\\frac{\\sigma_1^2}{n_1} + \\frac{\\sigma_2^2}{n_2}}[\/latex].\r\n<strong>These two conclusions are always true regardless of the population distributions and the sample sizes [latex]n_1[\/latex] and [latex]n_2[\/latex].<\/strong><\/li>\r\n \t<li>The shape of the distribution of [latex]\\bar{X_1} - \\bar{X_2}[\/latex]:\r\n<ol type=\"a\">\r\n \t<li>If the populations are normally distributed, [latex]\\bar{X_1} - \\bar{X_2}[\/latex] is exactly normally distributed regardless of the sample sizes [latex]n_1[\/latex] and [latex]n_2[\/latex].<\/li>\r\n \t<li>If the populations are not normally distributed, but sample sizes [latex]n_1[\/latex] and [latex]n_2[\/latex] are relatively large, say [latex]n_1 \\geq 30[\/latex] and [latex]n_2 \\geq 30[\/latex], then by the central limit theorem both [latex]\\bar{X_1}[\/latex] and [latex]\\bar{X_2}[\/latex] are approximately normally distributed. The difference of two independent normally distributed random variables is itself normally distributed; therefore, for [latex]n_1 \\geq 30[\/latex] and [latex]n_2 \\geq 30[\/latex], [latex]\\bar{X_1} - \\bar{X_2}[\/latex] is approximately normally distributed.<\/li>\r\n<\/ol>\r\n<\/li>\r\n<\/ol>\r\n<\/div>\r\n<\/div>\r\nTo summarize, for normal populations <strong>OR <\/strong>large sample sizes\r\n\r\n[latex]\\bar{X_1} - \\bar{X_2} \\sim N \\left( \\mu_1 - \\mu_2, \\sqrt{ \\frac{\\sigma_1^2}{n_1} + \\frac{\\sigma_2^2}{n_2}} \\right). 
[\/latex]\r\n\r\nWe can also standardize [latex]\\bar{X_1} - \\bar{X_2}[\/latex] to convert it into a standard normal random variable:\r\n\r\n[latex]Z = \\frac{(\\bar{X_1} - \\bar{X_2}) - (\\mu_1 - \\mu_2)}{\\sqrt{ \\frac{\\sigma_1^2}{n_1} + \\frac{\\sigma_2^2}{n_2}}} \\sim N(0, 1).[\/latex]\r\n\r\nIf the population standard deviations [latex]\\sigma_1[\/latex] and [latex]\\sigma_2[\/latex] are unknown and estimated by sample standard deviations [latex]s_1[\/latex] and [latex]s_2[\/latex], the studentized version of [latex]\\bar{X_1} - \\bar{X_2}[\/latex] is\r\n\r\n[latex]t = \\frac{(\\bar{X_1} - \\bar{X_2}) - (\\mu_1 - \\mu_2)}{\\sqrt{ \\frac{s_1^2}{n_1} + \\frac{s_2^2}{n_2}}} \\sim t \\text{ distribution}[\/latex]\r\n\r\nwith degrees of freedom\r\n\r\n[latex] df = \\frac{ \\left( \\frac{s_1^2}{n_1} + \\frac{s_2^2}{n_2} \\right)^2}{\\frac{1}{n_1 - 1} \\left( \\frac{s_1^2}{n_1} \\right)^2 + \\frac{1}{n_2 - 1} \\left( \\frac{s_2^2}{n_2} \\right)^2 } \\text{ rounded down to the nearest integer}.[\/latex]\r\n\r\n&nbsp;\r\n<div style=\"height: 55px; margin-top: 5px;\"><img class=\"size-full wp-image-99 alignleft\" src=\"https:\/\/openbooks.macewan.ca\/rcommander\/wp-content\/uploads\/sites\/8\/2020\/06\/instructornote.png\" alt=\"\" width=\"250\" height=\"50\" \/><\/div>\r\nThe degrees of freedom calculation given in the above equation is very complicated, so for exams, you can use the conservative lower bound, which is defined as the smaller value of [latex]n_1 - 1[\/latex] and [latex]n_2 - 1[\/latex]. That is, you may use [latex]df = \\min\\{n_1 -1, n_2 - 1 \\}[\/latex].\r\n\r\nFor example, if [latex]n_1 = 40, n_2 = 50[\/latex], then [latex]df = \\min\\{n_1 -1, n_2 - 1 \\} = \\min\\{40-1, 50-1 \\} = \\min\\{39, 49 \\} = 39.[\/latex]","rendered":"<p>Suppose two populations have means [latex]\\mu_1[\/latex], [latex]\\mu_2[\/latex] and standard deviations [latex]\\sigma_1[\/latex], [latex]\\sigma_2[\/latex]. 
Further, suppose that we obtain from each population simple random samples, from which we obtain sample means [latex]\\bar{x}_1[\/latex] and [latex]\\bar{x}_2[\/latex]. Our objective is to make inferences about [latex]\\mu_1[\/latex] \u2013 [latex]\\mu_2[\/latex] using the unbiased estimate [latex]\\bar{x}_1[\/latex] &#8211; [latex]\\bar{x}_2[\/latex] and as such, we need to know the distribution of [latex]\\bar{X}_1 - \\bar{X}_2[\/latex].<a id=\"retfig9.1\"><\/a><\/p>\n<p>&nbsp;<\/p>\n<figure id=\"attachment_2881\" aria-describedby=\"caption-attachment-2881\" style=\"width: 500px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-2881\" src=\"https:\/\/openbooks.macewan.ca\/rcommander\/wp-content\/uploads\/sites\/8\/2021\/05\/two_sample_t_crop-1024x772.png\" alt=\"A figure demonstrating that two independent populations have independent samples. Image description available.\" width=\"500\" height=\"377\" srcset=\"https:\/\/openbooks.macewan.ca\/introstats\/wp-content\/uploads\/sites\/8\/2021\/05\/two_sample_t_crop-1024x772.png 1024w, https:\/\/openbooks.macewan.ca\/introstats\/wp-content\/uploads\/sites\/8\/2021\/05\/two_sample_t_crop-300x226.png 300w, https:\/\/openbooks.macewan.ca\/introstats\/wp-content\/uploads\/sites\/8\/2021\/05\/two_sample_t_crop-768x579.png 768w, https:\/\/openbooks.macewan.ca\/introstats\/wp-content\/uploads\/sites\/8\/2021\/05\/two_sample_t_crop-1536x1158.png 1536w, https:\/\/openbooks.macewan.ca\/introstats\/wp-content\/uploads\/sites\/8\/2021\/05\/two_sample_t_crop-2048x1544.png 2048w, https:\/\/openbooks.macewan.ca\/introstats\/wp-content\/uploads\/sites\/8\/2021\/05\/two_sample_t_crop-65x49.png 65w, https:\/\/openbooks.macewan.ca\/introstats\/wp-content\/uploads\/sites\/8\/2021\/05\/two_sample_t_crop-225x170.png 225w, https:\/\/openbooks.macewan.ca\/introstats\/wp-content\/uploads\/sites\/8\/2021\/05\/two_sample_t_crop-350x264.png 350w\" sizes=\"auto, (max-width: 500px) 100vw, 500px\" 
\/><figcaption id=\"caption-attachment-2881\" class=\"wp-caption-text\"><strong>Figure 9.1<\/strong>: Two Independent Samples. [<a href=\"https:\/\/openbooks.macewan.ca\/introstats\/back-matter\/image-description\/#fig9.1\">Image Description (See Appendix D Figure 9.1)<\/a>]<\/figcaption><\/figure>\n<p>Recall the conclusions about the sampling distribution of the sample mean [latex]\\bar{X}[\/latex] based on samples of size <em>n<\/em> taken from a population with mean [latex]\\mu[\/latex] and standard deviation [latex]\\sigma[\/latex]:<\/p>\n<div class=\"textbox\">\n<ol>\n<li>The mean of [latex]\\bar{X}[\/latex] equals the population mean [latex]\\mu[\/latex], i.e., [latex]\\mu_{\\scriptsize \\bar{X}} = \\mu[\/latex].<\/li>\n<li>The standard deviation of [latex]\\bar{X}[\/latex] equals the population standard deviation divided by the square root of the sample size <em>n<\/em>, i.e., [latex]\\sigma_{\\scriptsize\\bar{X}} = \\frac{\\sigma}{\\sqrt{n}}[\/latex].<br \/>\n<strong> These two conclusions are always true regardless of the population distribution and the sample size <em>n<\/em>.<\/strong><\/li>\n<li>The shape of the distribution of [latex]\\bar{X}[\/latex]:\n<ol type=\"a\">\n<li>If the population is normally distributed, so is [latex]\\bar{X}[\/latex] regardless of the sample size <em>n<\/em>.<\/li>\n<li>If the population is not normally distributed, but the sample size <em>n<\/em> is relatively large, say [latex]n \\geq 30[\/latex], then the sample mean [latex]\\bar{X}[\/latex] is approximately normally distributed.<\/li>\n<\/ol>\n<\/li>\n<\/ol>\n<\/div>\n<p>A similar idea applies to the distribution of [latex]\\bar{X_1} - \\bar{X_2}[\/latex].<\/p>\n<div class=\"textbox textbox--key-takeaways\">\n<header class=\"textbox__header\">\n<p class=\"textbox__title\">Key Facts: Sampling Distribution of [latex]\\color{white}\\bar{X_1}-\\bar{X_2}[\/latex]<\/p>\n<\/header>\n<div class=\"textbox__content\">\n<ol>\n<li>The mean of [latex]\\bar{X_1} - 
\\bar{X_2}[\/latex] equals the difference of the population means: [latex]\\mu_{\\scriptsize \\bar{X_1} - \\bar{X_2}} = \\mu_1 - \\mu_2[\/latex].<\/li>\n<li>The standard deviation of [latex]\\bar{X_1} - \\bar{X_2}[\/latex] is: [latex]\\sigma_{\\scriptsize \\bar{X_1} - \\bar{X_2}} = \\sqrt{\\frac{\\sigma_1^2}{n_1} + \\frac{\\sigma_2^2}{n_2}}[\/latex].<br \/>\n<strong>These two conclusions are always true regardless of the population distributions and the sample sizes [latex]n_1[\/latex] and [latex]n_2[\/latex].<\/strong><\/li>\n<li>The shape of the distribution of [latex]\\bar{X_1} - \\bar{X_2}[\/latex]:\n<ol type=\"a\">\n<li>If the populations are normally distributed, [latex]\\bar{X_1} - \\bar{X_2}[\/latex] is exactly normally distributed regardless of the sample sizes [latex]n_1[\/latex] and [latex]n_2[\/latex].<\/li>\n<li>If the populations are not normally distributed, but sample sizes [latex]n_1[\/latex] and [latex]n_2[\/latex] are relatively large, say [latex]n_1 \\geq 30[\/latex] and [latex]n_2 \\geq 30[\/latex], then by the central limit theorem both [latex]\\bar{X_1}[\/latex] and [latex]\\bar{X_2}[\/latex] are approximately normally distributed. 
The difference of two independent normally distributed random variables is itself normally distributed; therefore, for [latex]n_1 \\geq 30[\/latex] and [latex]n_2 \\geq 30[\/latex], [latex]\\bar{X_1} - \\bar{X_2}[\/latex] is approximately normally distributed.<\/li>\n<\/ol>\n<\/li>\n<\/ol>\n<\/div>\n<\/div>\n<p>To summarize, for normal populations <strong>OR <\/strong>large sample sizes<\/p>\n<p>[latex]\\bar{X_1} - \\bar{X_2} \\sim N \\left( \\mu_1 - \\mu_2, \\sqrt{ \\frac{\\sigma_1^2}{n_1} + \\frac{\\sigma_2^2}{n_2}} \\right).[\/latex]<\/p>\n<p>We can also standardize [latex]\\bar{X_1} - \\bar{X_2}[\/latex] to convert it into a standard normal random variable:<\/p>\n<p>[latex]Z = \\frac{(\\bar{X_1} - \\bar{X_2}) - (\\mu_1 - \\mu_2)}{\\sqrt{ \\frac{\\sigma_1^2}{n_1} + \\frac{\\sigma_2^2}{n_2}}} \\sim N(0, 1).[\/latex]<\/p>\n<p>If the population standard deviations [latex]\\sigma_1[\/latex] and [latex]\\sigma_2[\/latex] are unknown and estimated by sample standard deviations [latex]s_1[\/latex] and [latex]s_2[\/latex], the studentized version of [latex]\\bar{X_1} - \\bar{X_2}[\/latex] is<\/p>\n<p>[latex]t = \\frac{(\\bar{X_1} - \\bar{X_2}) - (\\mu_1 - \\mu_2)}{\\sqrt{ \\frac{s_1^2}{n_1} + \\frac{s_2^2}{n_2}}} \\sim t \\text{ distribution}[\/latex]<\/p>\n<p>with degrees of freedom<\/p>\n<p>[latex]df = \\frac{ \\left( \\frac{s_1^2}{n_1} + \\frac{s_2^2}{n_2} \\right)^2}{\\frac{1}{n_1 - 1} \\left( \\frac{s_1^2}{n_1} \\right)^2 + \\frac{1}{n_2 - 1} \\left( \\frac{s_2^2}{n_2} \\right)^2 } \\text{ rounded down to the nearest integer}.[\/latex]<\/p>\n<p>&nbsp;<\/p>\n<div style=\"height: 55px; margin-top: 5px;\"><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-99 alignleft\" src=\"https:\/\/openbooks.macewan.ca\/rcommander\/wp-content\/uploads\/sites\/8\/2020\/06\/instructornote.png\" alt=\"\" width=\"250\" height=\"50\" \/><\/div>\n<p>The degrees of freedom calculation given in the above equation is very complicated, so for exams, you can use the conservative lower bound, which is defined as the 
smaller value of [latex]n_1 - 1[\/latex] and [latex]n_2 - 1[\/latex]. That is, you may use [latex]df = \\min\\{n_1 -1, n_2 - 1 \\}[\/latex].<\/p>\n<p>For example, if [latex]n_1 = 40, n_2 = 50[\/latex], then [latex]df = \\min\\{n_1 -1, n_2 - 1 \\} = \\min\\{40-1, 50-1 \\} = \\min\\{39, 49 \\} = 39.[\/latex]<\/p>\n","protected":false},"author":19,"menu_order":1,"template":"","meta":{"pb_show_title":"on","pb_short_title":"","pb_subtitle":"","pb_authors":[],"pb_section_license":""},"chapter-type":[],"contributor":[],"license":[],"class_list":["post-1011","chapter","type-chapter","status-publish","hentry"],"part":1007,"_links":{"self":[{"href":"https:\/\/openbooks.macewan.ca\/introstats\/wp-json\/pressbooks\/v2\/chapters\/1011","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/openbooks.macewan.ca\/introstats\/wp-json\/pressbooks\/v2\/chapters"}],"about":[{"href":"https:\/\/openbooks.macewan.ca\/introstats\/wp-json\/wp\/v2\/types\/chapter"}],"author":[{"embeddable":true,"href":"https:\/\/openbooks.macewan.ca\/introstats\/wp-json\/wp\/v2\/users\/19"}],"version-history":[{"count":43,"href":"https:\/\/openbooks.macewan.ca\/introstats\/wp-json\/pressbooks\/v2\/chapters\/1011\/revisions"}],"predecessor-version":[{"id":5303,"href":"https:\/\/openbooks.macewan.ca\/introstats\/wp-json\/pressbooks\/v2\/chapters\/1011\/revisions\/5303"}],"part":[{"href":"https:\/\/openbooks.macewan.ca\/introstats\/wp-json\/pressbooks\/v2\/parts\/1007"}],"metadata":[{"href":"https:\/\/openbooks.macewan.ca\/introstats\/wp-json\/pressbooks\/v2\/chapters\/1011\/metadata\/"}],"wp:attachment":[{"href":"https:\/\/openbooks.macewan.ca\/introstats\/wp-json\/wp\/v2\/media?parent=1011"}],"wp:term":[{"taxonomy":"chapter-type","embeddable":true,"href":"https:\/\/openbooks.macewan.ca\/introstats\/wp-json\/pressbooks\/v2\/chapter-type?post=1011"},{"taxonomy":"contributor","embeddable":true,"href":"https:\/\/openbooks.macewan.ca\/introstats\/wp-json\/wp\/v2\/contributor?post=1011"},{"taxonomy":"l
icense","embeddable":true,"href":"https:\/\/openbooks.macewan.ca\/introstats\/wp-json\/wp\/v2\/license?post=1011"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}