Function to query allofus observation table for survey responses

Description

Extracts survey responses in a tidy format that also includes ‘skip’ responses and collapses across all versions of the personal health / personal medical history surveys. Currently, responses in the ‘ds_survey’ table omit skipped responses. Responses are returned as "Yes" if the respondent answered that the individual had the condition, "No" if the respondent answered that the individual did not have that condition (or omitted it when selecting from related conditions), a skip response if the question was skipped, and NA if the respondent did not answer the question. Returns a data frame or SQL tbl with the initial cohort table along with a column for each question included in questions and answers for each person_id in the cells. To find the desired survey questions, use the All of Us data dictionary, survey codebook, Athena, data browser, or the modified codebook which can be found in the allofus R package.

Usage

aou_survey(
  cohort = NULL,
  questions,
  question_output = "concept_code",
  clean_answers = TRUE,
  collect = FALSE,
  ...,
  con = getOption("aou.default.con")
)

Arguments

cohort Reference to a remote table or local dataframe with a column called "person_id"
questions A vector of concept_ids or concept_codes identifying the survey questions to return
question_output How to name the columns. Options are "concept_code" (the text of the concept code), "concept_id" (concept ids preceded by "x_"), or a custom vector of column names matching the vector of questions. Defaults to "concept_code".
clean_answers whether to clean the answers to the survey questions. Defaults to TRUE.
collect Whether to bring the resulting table into local memory (collect = TRUE) as a dataframe or leave as a reference to a database table (for continued analysis using, e.g., dbplyr). Defaults to FALSE.
... additional arguments passed to collect() when collect = TRUE
con connection to the allofus SQL database. Defaults to getOption("aou.default.con"), which is created automatically with aou_connect()

Details

The function will return a dataframe or SQL tbl with the initial cohort table along with a column for each question included in questions and answers for each person_id in the cells. The column names (questions) can be returned as the concept_code or concept_id or by providing new column names. For each question, a column with the suffix "_date" is included with the date on which the question was answered. When questions can have multiple answers ("checkbox"-style questions), answers are returned as a comma-separated string.
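As a minimal sketch of handling checkbox-style answers after a result has been collected locally: the toy data frame below only stands in for collected aou_survey() output. Its column name (checkbox_q), its values, and the assumption that answers are separated by a comma optionally followed by a space are illustrative and not taken from the package.

```r
# Toy stand-in for a collected aou_survey() result (hypothetical values).
survey <- data.frame(
  person_id = c(1, 2, 3),
  checkbox_q = c("Option A, Option B", "Option C", NA),
  stringsAsFactors = FALSE
)

# Split each comma-separated answer string into a character vector.
# Assumes a comma, optionally followed by whitespace, separates answers.
answers <- strsplit(survey$checkbox_q, ",\\s*")

# Build one row per person-answer pair; a person with NA keeps one NA row.
long <- data.frame(
  person_id = rep(survey$person_id, lengths(answers)),
  answer = unlist(answers),
  stringsAsFactors = FALSE
)
long
```

Reshaping to one row per answer makes it straightforward to count how often each option was selected, at the cost of no longer having one row per person_id.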

To find the desired survey questions, use the All of Us data dictionary, survey codebook, Athena, data browser, or the allofus R package modified codebook which can be found here: https://roux-ohdsi.github.io/allofus/vignettes/searchable_codebook.html

For questions regarding an individual’s health history or family health history, the function requires the specific concept_id (or concept_code) for the individual in question, whether that is "self" or another relative. Responses are returned as "Yes" if the respondent answered that the individual had the condition, "No" if the respondent answered that the individual did not have that condition (or omitted it when selecting from related conditions), a skip response if the question was skipped, and NA if the respondent did not answer the question.
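The four response types described above can be tabulated as in the following sketch. The vector is toy data, and the label "Skip" is only a placeholder: the exact text of the skip response is not stated here, so check the values actually returned before filtering on it.

```r
# Hypothetical responses for one health history question.
# "Skip" is a placeholder label for the skip response (an assumption).
responses <- c("Yes", "No", "Skip", NA, "No", "Yes", "No")

# Count every response type, keeping NA (question not answered) visible.
counts <- table(responses, useNA = "ifany")
counts

# Proportion of "Yes" among respondents who gave a Yes or No answer.
answered <- responses[responses %in% c("Yes", "No")]
prop_yes <- mean(answered == "Yes")
prop_yes
```

Keeping skipped and NA responses separate from "No" avoids treating respondents who never saw or answered the question as if they had reported not having the condition.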

Value

A dataframe if collect = TRUE; a reference to a remote database table if not.

Examples

library("allofus")



# Connect to the database and build a small cohort as a remote table
con <- aou_connect()
cohort <- dplyr::tbl(con, "person") %>%
  dplyr::filter(person_id > 5000000) %>%
  dplyr::select(person_id, year_of_birth, gender_concept_id)

# Name the output columns by concept code (the default)
aou_survey(
  cohort,
  questions = c(1585375, 1586135),
  question_output = "concept_code"
)

# Name the output columns with a custom vector, one name per question
aou_survey(
  cohort,
  questions = c(1585811, 1585386),
  question_output = c("pregnancy", "insurance")
)

# collect = FALSE (the default) keeps the result as a remote table
aou_survey(
  cohort,
  questions = c(1585375, 1586135, 1740719, 43529932),
  question_output = c("income", "birthplace", "grandpa_bowel_obstruction", "t2dm"),
  collect = FALSE
)

# Tabulate the responses to a single question
aou_survey(
  cohort,
  questions = 1384452,
  question_output = "osteoarthritis"
) %>%
  dplyr::count(osteoarthritis)