Prologue: The Importance of Simulation and Power

What is simulation? To put it in a simple way, instead of following the theory, we ‘’let the system play out.’’ As you will soon see, simulation is an essential part of data science and statistics. If the background theoretical work exists, simulation is a good way to verify that theory is indeed correct. Oftentimes, however, there is not a theoretical background, and in such cases, simulation is a great way to get an accurate estimate.

In this lab, you will explore the concept of power (of a hypothesis test). Put simply, the power of a test is the ability of that test to (correctly) reject the null hypothesis. Generally speaking, when comparing two tests, the “better” test is the one with higher power.

More specifically, you will consider the power of the \(z\)-test. Suppose that our observations come from the \(\text{N}(\mu, \sigma=1)\) distribution, where \(\mu\) is unknown, and we wish to test \(\text{H}_0: \mu = 0\) against \(\text{H}_A: \mu \neq 0\). For simplicity, we set \(\alpha\), the probability of making a Type I Error, to be \(0.05\).

As usual, after you are done with each step, change eval = F to eval = T at the beginning of each chunk of codes, and knit the document to HTML to make sure it works. With a warning that knitting might take a while, let us commence.

Part 1: Power Calculation (2.5 Points)

1.1: Generate A Normal Sample (0.5 Points)

Use the rnorm() function to generate a sample of size \(4\) from a normal distribution with mean \(1\) and standard deviation \(1\). Save your output into a vector called sample.

sample <- COMPLETE
sample

1.2: Extract \(p\)-value; Comparison to \(0.05\) (0.5 Points)

Now, grab sample and run it through the command z.test, as given below. You will see that the output is quite complex, and we are only interested in the \(p\)-value. Not to worry—the code to extract the \(p\)-value is given below. Write a logical statement that returns TRUE if the \(p\)-value is less than or equal to \(0.05\), and FALSE otherwise.

Play around with the command a little to see that you can get both TRUE and FALSE as outcomes. You need not show this in your write-up, though.

Pro-tip: If you are thinking of an if-else statement, there is a shortcut. Ask your lab instructor about it.

z.test(sample, sigma.x = 1) ## DO NOT CHANGE
pVal <- z.test(sample, sigma.x = 1)$p.value ## DO NOT CHANGE
compare <- COMPLETE
compare ## DO NOT CHANGE

1.3: Calculate Power (1 Point)

To estimate the power of the test when \(\mu=1\), we want to see how many times, out of a large number of trials, that our test correctly reject the null hypothesis. Thus, we repeat the same process above for \(10,000\) times, and every time the \(p\)-value falls below \(0.05\), we add one to our count. At the end of this process, we take our count and divide by \(10,000\)—that will be our estimated power for the \(z\)-test when \(\mu = 1\). Complete the chunk of codes below to execute this task.

Pro-tip: You can write a conditional (i.e. if-else) to check if your \(p\)-value does not go above \(0.05\). However, \(\textsf{R}\) treats TRUE as \(1\) and FALSE as \(0\), so you can speed up the process by exploiting this fact!

count <- 0 ## DO NOT CHANGE 
for (i in 1:10000) { ## YOU CAN COMBINE STEPS IF YOU WANT---THESE ARE SUGGESTIONS
  sample <- COMPLETE
  pVal <- COMPLETE
  compare <- COMPLETE
  count <- COMPLETE
}
power1 <- count/10000 ## DO NOT CHANGE 
power1 ## DO NOT CHANGE

1.4: Comparison to Actual Power (0.5 Points)

How good of a job that we did? The snippet of codes below gives you the actual value of the power of the \(z\)-test when \(\mu=1\). Print out both this value and our estimated power, and comment on what you see. Feel free to run the chunk of codes above to get different estimations of this value.

## DO NOT CHANGE THIS CHUNK OF CODE
realPwr <- pwr.norm.test(d = 1, n = 4, sig.level = 0.05, alternative = "two.sided")$power
c(power1, realPwr)

Part 2: Writing A Function (2 Points)

Now, the codes above (should) work all the time, but it is very annoying to go through all the motions again if we want to work with different sets of parameters. What if we want to see the effect of different sample sizes on power? How about different standard deviations? Is there a way to automate this process? We can resolve these problems by condense everything into one function.

2.1: Assemble From Part 1 (1.5 Points)

Assemble the process above into one function. The parameters are given to you, so think about what can be generalized for your function.

Power <- function(numTrial, n, mu, sd) { ## DO NOT CHANGE
  count <- 0 ## DO NOT CHANGE 
  for (i in 1:numTrial) {
    COMPLETE
  }
  pow <- count/numTrial ## DO NOT CHANGE 
  return(pow) ## DO NOT CHANGE 
}

2.2: Compare to Previous Calculations (0.5 Points)

Run the function that you just wrote with \(n = 4\), \(\mu = 1\), and \(\sigma = 1\)—yes, the same parameters as in Part 1. Compare the output you got to the values in Problem 1.4.

## DO NOT CHANGE THIS CHUNK OF CODE
power2 <- Power(10000, n = 4, mu = 1, sd = 1)
c(power1, realPwr, power2)

Part 3: Getting A Power Curve (2.5 Points)

A picture is worth a thousand words—and in this case, instead of focusing on one set of parameters, we will see a fuller picture of the power of the \(z\)-test across different values of \(\mu\).

3.1: Generate More Power Values (1.5 Points)

The vector effect gives you numbers from \((-3)\) to \(3\), in increments of \(0.1\). Loop through this vector to get different values of the power of the \(z\)-test when \(\mu\) ranges from \((-3)\) to \(3\) and \(\sigma=1\). Save all of these values into a new vector called powerVector—they should be in correspondence with the elements of effect. (That is, the first element of powerVector should be the estimated power when \(\mu =-3\); the second element should be the estimated power when \(\mu = -2.9\), and so on.) There are at least two ways to do this—you can use either a for-loop or the sapply function.

effect <- seq(-3, 3, by = 0.1) ## DO NOT CHANGE

powerVector <- COMPLETE

powerVector ## DO NOT CHANGE

3.2: The First Power Curve (0.5 Points)

Now, using the codes below, plot the values of powerVector (estimated power) and the theoretical power curve (given below) against effect. Your plot should look like the one in the file Plot_3_2.pdf.

## DO NOT CHANGE THIS CHUNK OF CODE
realPower <- unlist(sapply(effect, pwr.norm.test, n = 4, sig.level = 0.05, alternative = "two.sided")["power", ])

colors0 <- c("Theoretical" = "blue", "Simulation" = "deeppink")

plot_3_2 <- ggplot() +
  geom_point(aes(x = effect, y = powerVector, color = "Simulation"), size = 0.6) +
  geom_line(aes(x = effect, y = powerVector, color = "Simulation"), linewidth = 0.3) +
  geom_point(aes(x = effect, y = realPower, color = "Theoretical"), size = 0.6) +
  geom_line(aes(x = effect, y = realPower, color = "Theoretical"), linewidth = 0.3) +
  labs(x = "Effect",
       y = "Power", 
       title = "Power of the Z-Test, Standard Deviation 1, Sample Size 4",
       color = "Legend") +
  scale_color_manual(values = colors0) +
  scale_x_continuous(breaks = seq(-3, 3, by = 0.5), expand = c(0, 0.12)) +
  scale_y_continuous(breaks = seq(0, 1, by = 0.1), expand = c(0, 0), limits = c(-0.025, 1.025))

plot_3_2

3.3: Observation (0.5 Points)

Comment on the power curve. Some questions to consider: What is the shape of the power curve? What do you see when the size of \(\mu\) increases? What is the power when \(\mu=0\)? Does the simulated curve agree with the theoretical curve? Do these observations match your intuition?

Part 4: Comparison in Different Settings (3 Points)

So far, we have seen how different values of \(\mu\) can affect the power. What if we change other parameters, though? In this section, we investigate what impact changing our sample size \(n\) and standard deviation \(\sigma\) can have on the power of the \(z\)-test.

4.1: Power Across Different Sample Sizes (1.5 Points)

In the first scenario, we fix \(\sigma=1\) while changing our sample size \(n\). (Think of that as obtaining more information about our population.) Use similar code to Section 3.1 to obtain three new vectors that contains the power of the \(z\)-test when we change our sample size—the first, second, and third vector should correspond to the cases where \(n=4\), \(n=6\), and \(n=8\), respectively. In addition, provided below is the code that plots this the power of the \(z\)-test against our effect. Comment on the power of the test as the sample size increases. (Your plot should look like the one in the file Plot_4_1.pdf.) Does this match your intuition?

powerVector1a <- COMPLETE # n = 4
powerVector1b <- COMPLETE # n = 6
powerVector1c <- COMPLETE # n = 8
## DO NOT CHANGE THIS CHUNK OF CODE
colors1 <- c("4" = "red", "6" = "gold", "8" = "green")

plot_4_1 <- ggplot() +
  geom_point(aes(x = effect, y = powerVector1a, color = "4"), size = 0.6) +
  geom_line(aes(x = effect, y = powerVector1a, color = "4"), linewidth = 0.3) +
  geom_point(aes(x = effect, y = powerVector1b, color = "6"), size = 0.6) +
  geom_line(aes(x = effect, y = powerVector1b, color = "6"), linewidth = 0.3) +
  geom_point(aes(x = effect, y = powerVector1c, color = "8"), size = 0.6) +
  geom_line(aes(x = effect, y = powerVector1c, color = "8"), linewidth = 0.3) +
  labs(x = "Effect",
       y = "Power", 
       title = "Power of the Z-Test, Standard Deviation 1, Various Sample Sizes",
       color = "Sample Size") +
  scale_color_manual(values = colors1) +
  scale_x_continuous(breaks = seq(-3, 3, by = 0.5), expand = c(0, 0.12)) +
  scale_y_continuous(breaks = seq(0, 1, by = 0.1), expand = c(0, 0), limits = c(-0.025, 1.025))

plot_4_1

4.2: Power Across Different Standard Deviations (1.5 Points)

Now, we fix \(n=4\) while varying our standard deviation \(\sigma\). (Think of that as adding noises to our data.) Repeat the previous part; however, this time, the first, second, and third vector should correspond to the case where \(\sigma=1\), \(\sigma=1.5\), and \(\sigma=2\), respectively. Plot the result using the code below. (Your plot should look like the one in the file Plot_4_2.pdf.) Does this match what you believe?

powerVector2a <- COMPLETE # sd = 1.0
powerVector2b <- COMPLETE # sd = 1.5
powerVector2c <- COMPLETE # sd = 2.0
## DO NOT CHANGE THIS CHUNK OF CODE
colors2 <- c("1.0" = "red", "1.5" = "gold", "2.0" = "green")

plot_4_2 <- ggplot() +
  geom_point(aes(x = effect, y = powerVector2a, color = "1.0"), size = 0.6) +
  geom_line(aes(x = effect, y = powerVector2a, color = "1.0"), linewidth = 0.3) +
  geom_point(aes(x = effect, y = powerVector2b, color = "1.5"), size = 0.6) +
  geom_line(aes(x = effect, y = powerVector2b, color = "1.5"), linewidth = 0.3) +
  geom_point(aes(x = effect, y = powerVector2c, color = "2.0"), size = 0.6) +
  geom_line(aes(x = effect, y = powerVector2c, color = "2.0"), linewidth = 0.3) +
  labs(x = "Effect",
       y = "Power", 
       title = "Power of the Z-Test, Various Standard Deviations, Sample Size 4",
       color = "Standard Deviation") +
  scale_color_manual(values = colors2) +
  scale_x_continuous(breaks = seq(-3, 3, by = 0.5), expand = c(0, 0.12)) +
  scale_y_continuous(breaks = seq(0, 1, by = 0.1), expand = c(0, 0), limits = c(-0.025, 1.025))

plot_4_2

Epilogue

That’s it—we hope that you gain a better understanding of power as well as the importance of simulation, and that you have fun along the way!