Simulating a Study with a Binary Outcome

An Introduction to Writing Your Own Simulations


Aaron R. Caldwell


January 29, 2021


I was on Twitter the other day and saw that Andrew Althouse had a nice thread on simulating an RCT and I wanted to expand on his suggestions and provide some of my own recommendations. In Andrew’s thread he explicitly used base R and only default stats functions that are native to R. While these are great tools, they are limited and most researchers wanting to simulate a study will find these tools lacking when trying to simulate their own particular study. This is not a criticism of Dr. Althouse’s thread, it is a wonderful thread, but I want to so a “tidy” version of this tutorial.

In this blog post I will use simstudy as a way to pseudo-replicate these results. My hope is that this post will help make the process of creating your own simulations a little bit easier. In particular, I will advocate that you create your own functions to simulate your studies. This way you can run multiple variations on your simulation with only a few more lines of code (rather than copy and pasting the whole simulation again and again).

R Packages

For this simulation I will try to restrict my use of packages to as few as possible. To simulate the data again I will need the simstudy package and I will also tidyverse set of packages to make the code “tidy”. In addition, I will need the


According to Dr. Althouse’s documentation he was trying to simulate the following study:

  • This code will mimic a 2-group parallel-arm randomized trial using 1:1 allocation of patients to treatment 1 versus treatment 2
  • For this example, we will use a binary outcome of “death”
  • Patients receiving treatment 1 will have 40% probability of death
  • Patients receiving treatment 2 will have 30% probability of death
  • Analysis will be performed using a logistic regression model
  • We will run 1000 simulated RCT’s and report the odds ratio, 95% confidence interval, and p-value for each simulated trial
  • The “true” treatment effect for a treatment that reduces probability of outcome from 40% to 30% is about OR = 0.642
  • The power of a trial with N=1000 patients and exactly 1:1 allocation under these assumptions is about 91-92%


So I imagine some people reading this are already starting to sweat about the task ahead. But trust me, if I can do this so can you! OVerall, this is a fairly straightforward process. The process is only made easier when you use the many tools available in R. Also, I am going to create “functions” which essentially means if I want to run the same simulation again (but maybe change 1 or 2 parameters) this can be done with only a few lines of code. This is efficient and if you are serious about writing your own simulations I would highly recommend writing your own functions.

Step 1: Create a Data Generating Function

This is part looks more complex than it is. All I am doing is making a function that I will call gen_bidat (short for “generate binomial data”). One of the things Althouse mentions in his code is that his allocation in the simulation is inappropriate. We can get around that by incorporating the functions from simstudy (which randomly assigns group) and keeps things organized through the %>% function. We also will import the data.table package because the simstudy package generates data as data.table type objects.

# Define function
# Parameters
## N = sample size
## props = proportions in each group (length is equal to number of groups)
gen_bidat = function(props,N){
  # Create data with N participants and balanced assignment to treatment group
  df = genData(N) %>% # generate data with N participants
    trtAssign(n = length(props),
              balanced = TRUE,
              grpName = "trt") # Randomly assign treatment (tr) in a balanced fashion  
  # Get the number of gropus by the number or proportions entered
  grps = length(props)
  # generate condition for each group
  # This creates a conditional output 
  # I.e., the odds of the outcome is determined by treatment group
  for(i in 1:grps){ # run loop once for each group in the study
    grp = i-1 # the simulation runs from 1 to the total number of groups
    # the i-1 starts the loop at zero
    # We then assign the group (grp) by which number in the loop we are in
    con_run = paste0("trt == ", grp)
    # We then grab the assign the correct proportion to the groups in order
    # We have to create the object dc first (i == 1)
    # All iterations of the loop add to dc rather than creating a new dc
    if (i == 1) {
      dc = defCondition(
        condition = con_run,
        formula = props[i],
        dist = "binary",
        link = "identity"
    } else{
      dc = defCondition(
        condition = con_run,
        formula = props[i],
        dist = "binary",
        link = "identity"
  # Now we generate the outcome based on the group (condition)
  dat <- addCondition(condDefs = dc,
                      dtOld = df,
                      newvar = "y")

# Test run
    id y trt
 1:  1 0   1
 2:  2 0   1
 3:  3 1   1
 4:  4 0   0
 5:  5 0   0
 6:  6 1   1
 7:  7 0   1
 8:  8 0   0
 9:  9 0   0
10: 10 0   0

In some cases the function above is enough. We may just want to generate a data set reflective of study we have designed. We can then “play” with the data set to plan analyses for future study. However, that is not what we are after today and we will move onto the power analysis.

Step 2: Simulation

Now we get fancy and run a simulation. All this means is that we run the simulated data (above) for a number of iterations or repetitions (tyically for thousands of iterations). We can make it a power analysis by counting the number of positive results (e.g., below the significance threshold).

For the power analysis I created a pwr_bidat function that performs a power analysis with a certain number of repetitions (reps argument). Notice below that I am using the function I just created (gen_bidat) within the pwr_bidat. That is why the user must supply the same information as the last function (the proportions, props, and the sample size, N). In addition, there are two arguments needed for the power analysis: alpha and conf.level. The alpha argument sets the alpha-level for the analyses (i.e., the significance cutoff). While the conf.level argument sets the confidence level (e.g., 95%) for the confidence intervals for the power analysis. We can calculate confidence intervals for a simulation because we have a number of “success” over a number of attempts (total number of reps). We can use the prop.test function which provides confidence intervals for proportions. This is helpful when for when we are running a small number of simulations and want an idea of what estimates of power are reasonable.

pwr_bidat = function(props,N,
                     conf.level = .95){

  # Create 1 to reps simulated data sets
  sims = replicate(n = reps,
                   gen_bidat(props, N = N),
                   simplify = FALSE)
  # Run an equivalent number of analyses
  sims2 = purrr::map(sims,  ~ glm(y ~ trt,
                                  data = .x,
                                  family = binomial(link = "logit")))
  # Get the summary coefficients from our models
  sims3 = purrr::map(sims2,  ~ summary(.x)$coefficients)
  # Put all the results into a data frame (tibble)
  sims4 = purrr::map(sims3,
                     ~ tibble::rownames_to_column(,
  # Combine all the data frames into one
  simsdf = bind_rows(sims4, .id = "nrep")
  # Summarize results by coefficient
  simspow = simsdf %>%
    group_by(coef) %>%
    # Calculate the power (number of results with p-value < alpha)
      estimate = mean(Estimate),
      power = mean(`Pr(>|z|)` < alpha),
      # Calculate confidence intervals
      power.lci = prop.test(sum(`Pr(>|z|)` < alpha), reps)$[1],
      power.uci = prop.test(sum(`Pr(>|z|)` < alpha), reps)$[2],
      .groups = 'drop'
  # Return table of results

pwr_res = pwr_bidat(c(.4,.3),N=1000)

Now that we have the results we can create a table as output using the kable function.

# Create pretty table to print results
knitr::kable(pwr_res %>%
               rename(Coefficients = coef,
                      `Average Log Odds` = estimate,
                      Power = power,
                      `Lower C.I.` = power.lci,
                      `Upper C.I.` = power.uci),
             digits = 2,
             caption = "Result from Power Simulation")
Result from Power Simulation
Coefficients Average Log Odds Power Lower C.I. Upper C.I.
(Intercept) -0.40 0.99 0.94 1.00
trt -0.47 0.95 0.88 0.98

Based on these results we can can conclude that a study of 1000 patients randomly assigned to one of two treatment groups wherein 30% of participants die in treatment group #1 and 40% perish in treatment group #2 will have approximately 95% power [0.88,0.98]. Notice, that compared to Dr. Althouse’s thread, I estimated the power at 95% (thread noted power at ~92.3%). This is to be expected when simulating data and is why a high number of repetitions are needed to determine power.

Why more complicated code?

Well, here is why create functions: it is easier on future me. Say, I run three separate simulations with minor differences so I go about with the Althouse approach (maybe even copy and paste the code a few times). Later, I notice a flaw in my code, or maybe there is a small change that alters all three simulations. Well, if I take the time to create the simulations then all I have to do is change the code in 1 place (where I defined the function) rather than with every chunk of code that includes the simulation code.

Also, it is easier to produce variations on the same design. Let’s imagine we are doing a study on a treatment for an infectious disease that has a hospitalization rate of at least 3.5% people that are infected (so treatment occurs immediately upon diagnosis). The study investigators want to know if 2000 patients are enough to detect if the proposed treatment reduces the hospitalization rate at least by half (1.75%).

All I have to do is use the same function I have created above but change the arguments in the function.

pwr_2 = pwr_bidat(c(.035,.0175),N=2000)

             caption = "Another Study of Binary Outcomes")
Another Study of Binary Outcomes
coef estimate power power.lci power.uci
(Intercept) -3.343981 1.00 0.9538987 1.000000
trt -0.704106 0.69 0.5885509 0.776633

Now, we have the new results in only 4 lines of code! Based on those results we would likely advise the investigators that they will need a larger sample size to conclude effectiveness for their proposed treatment.


In this blog post, I have demonstrated how to create 2 R functions that 1) generate data and 2) generate a simulation based power analysis. Simulation is not necessary for a simple analysis such as this where analytic solutions exist in programs like GPower. However, as I will detail in future posts, simulation becomes very useful when the design becomes complicated (or when we want to violate the assumptions of the models we use).

Hopefully, this process appears straightforward. If not send me a message and I can try to bridge the gap!