Choice-based conjoint surveys in R with surveydown

R
tutorial
conjoint

A how-to guide for using R to design and implement choice-based conjoint surveys using the surveydown R package

Author

John Paul Helveston

Published

2024-08-28

Because surveydown surveys run as a shiny app, you can include custom logic in the background by writing some code in your server. In this post, I’m going to show you one approach for using surveydown to create a particular type of complex survey: a choice-based conjoint survey.

Note

If you’re unfamiliar with what a conjoint survey is, take a look at this quick introduction.

The key component of a choice-based conjoint survey is asking respondents to make choices from randomized sets of choice questions. So the hard part is figuring out a way to show each respondent a different set of randomized questions. This post shows how you can achieve this in surveydown.

Throughout this post, I will use a demo survey about people’s preferences for apples with three attributes: type, price, and freshness.[1]

You can view the live demo survey here, and all files used to create the survey are on this GitHub repo.

Introduction

If you’ve never used surveydown before, take a look at this post to get a quick introduction to the package and how to use it to make a survey.

The basic concept is:

  1. Design your survey as a Quarto shiny document using markdown and R code.
  2. Render your doc into a shiny app that can be hosted online and sent to respondents.
  3. Store your survey responses in a supabase database.

Getting started

If you want to start from a blank slate, take a look at the Getting Started documentation page.

For this post, we recommend starting from the demo survey available at this GitHub repo. It provides an already working survey that you can modify to the needs of your conjoint survey.

The demo repo has a lot of files in it, but the main files defining the survey itself are:

  • survey.qmd: The main body of the survey.
  • server.R: The server defining the logic implemented in the survey, including randomizing questions, connecting to a database, etc.

In a typical surveydown survey, the server code chunk (at the bottom of the survey.qmd file) is short, so we just keep everything there. In this case, since we have a lot more going on in our server, we use a split file structure: most of the server code lives in a separate server.R file, which we then source in the server code chunk with source("server.R"). This makes the server code easier to edit, as server.R can be opened in its own tab rather than scrolling up and down the longer survey.qmd file.

Note

We recommend opening the survey.Rproj if you’re working in RStudio to make sure RStudio opens to the correct project folder.

Content in the survey body

After the setup code chunk (where we load the package and establish a database connection with sd_database()), we have a series of pages (defined with ::: fences) that include markdown-formatted text and survey questions (defined with sd_question()). You can modify any of this content as you wish to suit the needs of your survey.

In this demo, we have a few other examples included, like a conditionally displayed question (the fav_fruit question will not display if you choose “No” on the first question about liking fruit) as well as a question that skips people to the end (if you choose “blue” and not “red” on the screening page). The logic controlling the conditional display and skipping is defined in the sd_config() function inside the server.R file.

None of this is necessary for a conjoint survey, but these are features you will often want to include, such as screening people out of the survey if they don’t qualify to take it, so we include them for demonstration purposes.

Defining the choice questions

The central component of every conjoint survey is the set of randomized choice questions. To implement these in surveydown, we pre-define our choice questions in a design file that we later use in the survey to select randomized sets of choice questions to display to each respondent.

We use the cbcTools package to create the pre-defined design file. The code to create the choice questions for this demo survey is in the make_choice_questions.R file in the demo repo. This code generates a data frame of randomized choice questions that we then save in the project directory as choice_questions.csv.
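If you just want to see the shape of the design file before installing anything, here is a minimal base R stand-in. It is not the cbcTools code in make_choice_questions.R; it only draws random attribute levels into the same long layout (one row per alternative) that the server code later in this post expects. The attribute levels, the number of respondents, and the helper name make_toy_design() are all assumptions made up for illustration.

```r
# Hypothetical stand-in for a design file: random levels, one row per alternative.
# This is NOT the cbcTools design used in the demo repo.
make_toy_design <- function(n_resp = 10, n_q = 6, n_alts = 3) {
  # One row for every respondent x question x alternative combination
  design <- expand.grid(
    altID  = seq_len(n_alts),
    qID    = seq_len(n_q),
    respID = seq_len(n_resp)
  )
  n <- nrow(design)

  # Assumed attribute levels (illustrative only)
  types  <- c("Fuji", "Gala", "Honeycrisp")
  prices <- seq(1, 5, by = 0.5)

  design$type  <- sample(types, n, replace = TRUE)
  design$price <- sample(prices, n, replace = TRUE)

  # Image file name derived from the type, e.g. "fuji.jpg"
  design$image <- paste0(tolower(design$type), ".jpg")

  design[, c("respID", "qID", "altID", "type", "price", "image")]
}

set.seed(123)
design <- make_toy_design()
head(design, 3)

# Save it where the server code expects to find it:
# write.csv(design, "choice_questions.csv", row.names = FALSE)
```

A purely random draw like this can put identical alternatives in the same question, which is one reason to prefer a dedicated design package for a real study.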

Implementing the choice questions

The choice questions are implemented at the top of the server.R file in the demo repo. This code does the following steps:

1. Read in the design file

Pretty straightforward - this is one line to read in the choice_questions.csv design file that we saved in the project folder.

design <- readr::read_csv("choice_questions.csv")

2. Sample and store a random respondent ID

Since we want each respondent to see a different set of choice questions, we randomly sample a respondent ID from the set of all respondent IDs in the design file. We also need to keep track of this and store it in our response data so that later we can know what each respondent was actually shown.

Since this is a value that we generated in the server (and not a value from a survey question to a respondent), we have to manually add it to the survey response data using sd_store_value(). Here we set the name to "respID", which becomes the column name in the resulting survey data.

# Sample a random respondentID
respondentID <- sample(design$respID, 1)

# Store the respondentID
sd_store_value(respondentID, "respID")

3. Filter the design for the respondentID

We create a subset dataframe called df that stores only the rows for the randomly chosen respondent ID. We also append the "images/" string onto the values in the image column as this will create the relative path to the images in our survey, e.g. "images/fuji.jpg" (all the images we show are in the "images" folder in the repo).

# Filter for the rows for the chosen respondentID
df <- design |>
  filter(respID == respondentID) |>
  mutate(image = paste0("images/", image))
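A quick way to convince yourself that steps 2 and 3 do what you expect is to run them on a small toy design outside of the survey. The sketch below uses base R equivalents of filter() and mutate() so it runs without any packages; the toy design sizes and the single image name are assumptions for illustration.

```r
# Toy design: 4 respondents, 6 questions, 3 alternatives (assumed sizes)
design <- expand.grid(altID = 1:3, qID = 1:6, respID = 1:4)
design$image <- "fuji.jpg"

set.seed(42)
# Sampling from unique IDs gives every design respondent the same chance
respondentID <- sample(unique(design$respID), 1)

# Base R equivalent of the filter() + mutate() pipeline above
df <- design[design$respID == respondentID, ]
df$image <- paste0("images/", df$image)

# Each respondent should get n_q * n_alts rows: 6 * 3 = 18
nrow(df)

# Every image path should now point into the images folder
unique(df$image)
```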

4. Define a function to create question options

This is the most complex component in the server logic. Here we created a function that takes a dataframe and returns a named vector defining the options to show in each choice question. In this case, we only have 3 options per choice question, so each time we call this function we will use a small dataframe that has just 3 rows defining the 3 choice alternatives in a single choice question.

The function does several things. First, it extracts three single-row data frames that store the values of each of the 3 alternatives (alt1, alt2, and alt3). It then creates an options vector that has just 3 values: "option_1", "option_2", and "option_3". Then we have to define the names of each of those options. Remember that the values in the options vector are what gets stored in our resulting survey data based on what the respondent chooses, but the names are what respondents see. So in the context of a choice survey like this, we need to embed all of the attributes and their levels in the names of the options vector.

We use the glue() function to easily inject the values stored in alt1, alt2, and alt3 into our labels. The glue() function is similar to paste() in that it just concatenates object values into a string, but it has an easier syntax to work with. Anything inside {} brackets is evaluated, and the resulting value is inserted into the string. So for example, the line glue("1 plus 1 equals {1+1}") would produce the string "1 plus 1 equals 2".
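To make that concrete, here is a small sketch of glue() pulling values out of a one-row data frame, which is the same pattern the labels in this post rely on. The data frame alt and its values are made up for illustration.

```r
library(glue)

# A one-row data frame standing in for a single alternative (made-up values)
alt <- data.frame(type = "Fuji", price = 2.5)

# Expressions inside {} are evaluated and inserted into the string
label <- glue("**Type**: {alt$type}<br>**Price**: $ {alt$price} / lb")
label
#> **Type**: Fuji<br>**Price**: $ 2.5 / lb
```

When the template spans several lines, glue() also trims the leading and trailing blank lines and the common indentation, which is why the multi-line labels can be indented to match the surrounding code.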

In our case, we’re including some html code to insert an image of the apple type (<img src='{alt1$image}' width=100>), the apple type itself (**Type**: {alt1$type}), and the apple price (**Price**: $ {alt1$price} / lb).

Notice also that we’re mixing markdown (e.g. **Option 1**) and html (e.g. <br>), which will all get rendered into proper html in the resulting shiny app. The full function looks like this:

# Function to create the labels for a choice question
# based on the values in df

make_cbc_options <- function(df) {
  alt1 <- df |> filter(altID == 1)
  alt2 <- df |> filter(altID == 2)
  alt3 <- df |> filter(altID == 3)

  options <- c("option_1", "option_2", "option_3")

  names(options) <- c(
    glue("
      **Option 1**<br>
      <img src='{alt1$image}' width=100><br>
      **Type**: {alt1$type}<br>
      **Price**: $ {alt1$price} / lb
    "),
    glue("
      **Option 2**<br>
      <img src='{alt2$image}' width=100><br>
      **Type**: {alt2$type}<br>
      **Price**: $ {alt2$price} / lb
    "),
    glue("
      **Option 3**<br>
      <img src='{alt3$image}' width=100><br>
      **Type**: {alt3$type}<br>
      **Price**: $ {alt3$price} / lb
    ")
  )
  return(options)
}

5. Create the options for each choice question

One of the benefits of making the function the way we did in the previous step is that we can now easily call it to generate the option vector for each of the 6 choice questions in df:

# Create the options for each choice question

cbc1_options <- make_cbc_options(df |> filter(qID == 1))
cbc2_options <- make_cbc_options(df |> filter(qID == 2))
cbc3_options <- make_cbc_options(df |> filter(qID == 3))
cbc4_options <- make_cbc_options(df |> filter(qID == 4))
cbc5_options <- make_cbc_options(df |> filter(qID == 5))
cbc6_options <- make_cbc_options(df |> filter(qID == 6))
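If the number of choice questions changes, the six explicit calls can also be written as one pass over the questions. The sketch below is one possible way to do that with split() and lapply(). To keep it self-contained it uses a toy df and a simplified stand-in called make_options_simple() (a hypothetical name) that returns the same shape as make_cbc_options() but with plain-text labels.

```r
# Toy subset for one respondent: 6 questions x 3 alternatives (made-up values)
df <- expand.grid(altID = 1:3, qID = 1:6)
df$type  <- rep(c("Fuji", "Gala", "Honeycrisp"), times = 6)
df$price <- rep(c(1.5, 2, 2.5), times = 6)

# Simplified stand-in for make_cbc_options(): values are "option_k",
# names are the labels respondents would see
make_options_simple <- function(q) {
  options <- paste0("option_", q$altID)
  names(options) <- sprintf("Option %d: %s, $ %s / lb", q$altID, q$type, q$price)
  options
}

# split() returns a list with one data frame per qID, in order
cbc_options <- lapply(split(df, df$qID), make_options_simple)

# cbc_options[[1]] plays the role of cbc1_options, and so on
cbc_options[[1]]
```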

6. Create each choice question (6 in total)

Finally, we now have everything we need to generate each choice question. Here we’re using the mc_buttons question type so that the labels we generated will be displayed on a large button, which looks good both on a computer and phone. We give the question a unique id (e.g. cbc_q1) and a label, and then set the option argument to the corresponding option vector we defined above.

sd_question(
  type   = 'mc_buttons',
  id     = 'cbc_q1',
  label  = "(1 of 6) If these were your only options, which would you choose?",
  option = cbc1_options
)

# ...and 5 more questions like this
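Since the six calls differ only in their id, the counter in the label, and the option vector, the ids and labels can be generated rather than typed out. This sketch only builds those two pieces and stops short of calling sd_question(), so that it runs without a survey session; the variable names are assumptions.

```r
n_questions <- 6

# Ids matching the pattern used above: cbc_q1, cbc_q2, ...
question_ids <- paste0("cbc_q", seq_len(n_questions))

# Labels with a running "(k of n)" counter
question_labels <- sprintf(
  "(%d of %d) If these were your only options, which would you choose?",
  seq_len(n_questions), n_questions
)

question_ids
question_labels[1]
```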

Remember that since the labels in the options are being dynamically generated on each new session (each respondent), they have to be created in the server, not in the main survey body. As a result, sd_question() must also be called in the server code (if you put this code in the main body, only one random set of choice options will be generated, and they’ll be the same for everyone).

To display each question in the survey body, we use sd_output("id", type = "question"), changing id to each corresponding choice question we created. In the demo survey.qmd file, you’ll see that there are 6 choice questions displayed in the main survey body (each on its own page), and each of those 6 questions is defined in the server.R file.

When rendered, a choice question will look like this, with the values matching whatever alternative was chosen in the design file:

[Figure: screenshot of a rendered choice question]

And that’s it! You now have 6 randomized choice questions!

Preview and check

The rest of the server.R file has the remaining components we need, like any conditional display or skip logic using sd_config(). These are all standard features of any surveydown survey, so we won’t cover them in detail here and instead direct you to the documentation for details.

But before you go live, it’s a good idea to do some quick testing. You can test your survey even without having it connected to a database by setting ignore = TRUE in the sd_database() function. Of course, you probably should also test it after connecting it to a database to ensure that responses are being properly stored.

When testing, you might get an error - don’t panic! Read the terminal output carefully and debug. There’s a good chance you may have missed a bug somewhere in your server code. But since it’s in a separate server.R file, you should be able to open this and run through your code to check that things are working properly.
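One thing worth checking outside the app is the design file itself, since a malformed design will only show up as an error once a respondent draws the affected rows. Here is a minimal sketch of such a check in base R. The function name check_design() is hypothetical and not part of surveydown, and the toy designs are made up for illustration.

```r
# Hypothetical pre-flight check (not part of surveydown):
# returns a character vector of problems, empty if none are found
check_design <- function(design, n_q = 6, n_alts = 3) {
  # Count rows for every respID x qID combination
  counts <- table(design$respID, design$qID)
  problems <- character(0)

  if (ncol(counts) != n_q) {
    problems <- c(problems, sprintf(
      "expected %d questions, found %d", n_q, ncol(counts)
    ))
  }
  if (any(counts != n_alts)) {
    problems <- c(problems, sprintf(
      "some questions do not have exactly %d alternatives", n_alts
    ))
  }
  problems
}

good <- expand.grid(altID = 1:3, qID = 1:6, respID = 1:5)
bad  <- good[-1, ]  # drop one alternative to simulate a broken design

check_design(good)  # character(0): no problems found
check_design(bad)   # reports the question with a missing alternative
```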

Getting the data

Once your survey is live and you start collecting responses, you can easily access your data with the sd_get_data() function. This is typically done in a separate R file, which might look something like this:

library(surveydown)

db <- sd_database(
  user       = 'postgres.axzkymswaxcasjdflkurrj',
  host       = 'aws-0-us-east-1.pooler.supabase.com',
  port       = 5678,
  db_name    = 'postgres',
  table_name = 'my_table'
)

data <- sd_get_data(db)
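To analyze the choices, you will typically need to join the responses back to the design file so that each alternative a respondent saw is marked as chosen or not. The sketch below shows one way to do that in base R with toy data. It assumes the response data has the stored respID column plus one column per choice question (cbc_q1 to cbc_q6) holding values like "option_2", which matches the ids and option values defined earlier; the toy rows themselves are made up.

```r
# Toy design: 2 design respondents x 6 questions x 3 alternatives
design <- expand.grid(altID = 1:3, qID = 1:6, respID = 1:2)

# Toy response row, shaped like the survey data is assumed to be:
# the stored respID plus one column per choice question
data <- data.frame(
  respID = 2L,
  cbc_q1 = "option_3", cbc_q2 = "option_1", cbc_q3 = "option_2",
  cbc_q4 = "option_2", cbc_q5 = "option_1", cbc_q6 = "option_3"
)

# Reshape to long: one row per respondent x question
q_cols <- paste0("cbc_q", 1:6)
long <- data.frame(
  respID = rep(data$respID, times = length(q_cols)),
  qID    = rep(1:6, each = nrow(data)),
  chosen = as.integer(sub("option_", "", unlist(data[q_cols], use.names = FALSE)))
)

# Attach each choice to the alternatives that respondent was shown
choice_data <- merge(design, long, by = c("respID", "qID"))

# 1 for the chosen alternative, 0 for the others
choice_data$choice <- as.integer(choice_data$altID == choice_data$chosen)
choice_data <- choice_data[order(choice_data$respID, choice_data$qID, choice_data$altID), ]

head(choice_data, 3)
```

If two survey takers happen to draw the same design respID, carry a per-session identifier through the reshape as well so their rows stay distinct.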

Obviously your settings in sd_database() would need to match those of your Supabase database that you created for your survey.

And that’s it! We hope this post was helpful, and do go check out this GitHub repo to try out the demo yourself.


Footnotes

  1. Yes, people have actually done conjoint surveys on fruit before.