class: center, middle, inverse, title-slide

.title[
# Week 2: Normal RVs
]
.author[
### STAT 021 with Suzanne Thornton
]
.institute[
### Swarthmore College
]

---

<style type="text/css">
pre {
  background: #FFBB33;
  max-width: 100%;
  overflow-x: scroll;
}
.scroll-output {
  height: 75%;
  overflow-y: scroll;
}
.scroll-small {
  height: 50%;
  overflow-y: scroll;
}
.red{color: #ce151e;}
.green{color: #26b421;}
.blue{color: #426EF0;}
</style>

# Normal Random Variables: Pictures and Code

Here we are going to go over how to find the following attributes of a Normally distributed random variable (RV) using R code:

- density curve/histograms
- critical value
- quantiles
- probabilities (area under the curves)
- standardization/Z-score

---

# Normal Random Variables: Pictures and Code

## Random sample

To generate a random sample from a Normal distribution with mean `\(3\)` and standard deviation `\(1.5\)`, we use the `rnorm` function. Below, I store the sample size of `\(10\)` as `sample_size` and then draw a random sample of that size with the specified parameters:

```r
sample_size <- 10
rnorm(sample_size, mean=3, sd=1.5)
```

```
## [1] 2.427049 2.088816 4.296314 3.819302 1.190547 2.091762 1.575756 1.509311
## [9] 1.957886 5.262236
```

This is a *computer simulated* random sample.

---

# Normal Random Variables: Pictures and Code

## Probabilities

Given a quantile, or observation of a random variable, we can use the `pnorm` function to calculate how frequently this value or a smaller value will occur from this Normal distribution with mean `\(3\)` and standard deviation `\(1.5\)`. The prefix "p" stands for "probability"; by default, `pnorm` returns a lower-tailed probability.
The example below calculates the probability that a Normally distributed RV will be less than or equal to the number `my_quantile`:

```r
my_quantile <- 3.45
pnorm(my_quantile, mean=3, sd=1.5, lower.tail=TRUE)
```

```
## [1] 0.6179114
```

---

# Normal Random Variables: Pictures and Code

## Quantiles

The *inverse* of the setting on the previous slide is to find a quantile given some probability. Given a lower-tailed probability, we can use the `qnorm` function to calculate the value that a Normally distributed RV with mean `\(3\)` and standard deviation `\(1.5\)` falls at or below with that probability. The prefix "q" stands for "quantile".

The example below calculates the quantile, that is, the upper bound of the values of this Normal RV that occur with total probability `my_probability`:

```r
my_probability <- 0.75
qnorm(my_probability, mean=3, sd=1.5, lower.tail=TRUE)
```

```
## [1] 4.011735
```

---

## Normal/Gaussian Random Variables

These are continuous, quantitative random variables whose sample space is always `\(S= (-\infty, \infty)\)`.

`$$X \sim N(\mu, \sigma^2)$$`

The mean `\((\mu)\)` describes the balancing point and center of the distribution. The standard deviation `\((\sigma)\)` and variance `\((\sigma^2)\)` measure the width or spread of the histogram.

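As a rough check on what the mean and standard deviation describe, the sketch below simulates a large sample and compares its sample mean, standard deviation, and variance with the parameters. The seed, the sample size of 10,000, and the name `big_sample` are illustrative choices, not part of the course materials.

```r
# Illustrative assumptions: seed and sample size are arbitrary choices
set.seed(21)
big_sample <- rnorm(10000, mean = 3, sd = 1.5)

mean(big_sample)  # sample center, should be close to the mean parameter 3
sd(big_sample)    # sample spread, should be close to the sd parameter 1.5
var(big_sample)   # sample variance, should be close to 1.5^2 = 2.25
```

With a sample this large, the simulated values typically land within a few hundredths of the parameters; a smaller sample would wander further from them.
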
.center[<img src = "normal_hist-1.png"/>]

---

## Normal/Gaussian Random Variables

`$$X \sim N(\mu, \sigma^2)$$`

.center[<img src = "normal_hist2-1.png"/>]

For Normally distributed RVs we have the following rule: about `\(68\%\)` of values drawn from this distribution fall within **one** standard deviation of the mean, about `\(95\%\)` fall within **two** standard deviations of the mean, and about `\(99.7\%\)` fall within **three** standard deviations of the mean.

---

## Normal/Gaussian Random Variables

### Standard Normal RV

`$$Z \sim N(0,1)$$`

### In R some useful functions for Normal RVs are

```r
rnorm(100, mean=2, sd=0.9) ## This generates 100 independent random draws from a Normal distribution with mean 2 and variance 0.9^2
pnorm(1.1, mean=2, sd=0.9, lower.tail=FALSE) ## This finds the probability that a Normally distributed (mean 2, variance 0.9^2) RV takes on a value of 1.1 or higher
qnorm(0.18, mean=2, sd=0.9, lower.tail=TRUE) ## This finds the lower 18th percentile for an RV with a Normal distribution (mean 2, variance 0.9^2)
```

---

## Properties of Normal RVs

**1) Linear combinations of Normal RVs**

If `\(X \sim N(\mu_{X}, \sigma^2_{X})\)` and `\(Y \sim N(\mu_{Y}, \sigma^2_{Y})\)` are jointly Normal (for example, independent), then for any constant numbers `\(a,b,\)` and `\(c\)`,

`$$aX + bY + c \sim N(\mu_{combined}, \sigma^2_{combined}), \quad\text{where}$$`

`$$\mu_{combined} = a\mu_{X} + b\mu_{Y}+c \quad \text{and} \quad \sigma^2_{combined} = a^2\sigma^2_{X} + b^2\sigma^2_{Y} + 2ab\,Cov(X,Y).$$`

**2) Unique specifications of Normal RVs**

The mean `\((\mu)\)` and variance `\((\sigma^2)\)` of a Normal RV completely and uniquely specify its distribution. This means if two random variables are both Normally distributed and have the same mean and the same variance, then they have exactly the same distribution.
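
The linear-combination property can be checked by simulation. The sketch below is a minimal example under stated assumptions: `\(X\)` and `\(Y\)` are drawn independently (so `\(Cov(X,Y)=0\)`), and the constants `a_coef`, `b_coef`, `shift`, the seed, and the sample size are illustrative values that do not come from the slides. It compares the simulated mean and variance of `\(aX + bY + c\)` with `\(a\mu_X + b\mu_Y + c\)` and `\(a^2\sigma^2_X + b^2\sigma^2_Y\)`.

```r
# Illustrative assumptions: all constants below are arbitrary choices
set.seed(21)
n <- 100000
a_coef <- 2
b_coef <- -1
shift <- 5
mu_x <- 3; sd_x <- 1.5
mu_y <- 2; sd_y <- 0.9

x <- rnorm(n, mean = mu_x, sd = sd_x)
y <- rnorm(n, mean = mu_y, sd = sd_y)  # drawn independently of x, so Cov(X, Y) = 0
w <- a_coef * x + b_coef * y + shift   # the linear combination aX + bY + c

# Theoretical mean and variance of the combination under independence
theory_mean <- a_coef * mu_x + b_coef * mu_y + shift
theory_var <- a_coef^2 * sd_x^2 + b_coef^2 * sd_y^2

c(simulated = mean(w), theory = theory_mean)
c(simulated = var(w), theory = theory_var)
```

Because the covariance term vanishes here, this only illustrates the independent case; checking the general formula would require simulating correlated Normal draws.
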