This function processes submissions from a local directory or Google Drive folder containing HTML/XML files. It extracts tables from the files, filters them based on a pattern and key variables, and returns either a summary tibble or a combined tibble with all the data.

Usage

process_submissions(
  path,
  title = ".",
  return_value = "Summary",
  key_vars = NULL,
  verbose = 0,
  keep_file_name = NULL,
  emails = NULL
)

Arguments

path

The path to the local directory containing the HTML/XML files, or a Google Drive folder URL. If it's a Google Drive URL, the function will download individual files to a temporary directory.

title

A character vector of patterns to match against the file names (default: "."). Each pattern is processed separately and results are combined.

return_value

The type of value to return. Allowed values are "Summary" (default) or "All".

key_vars

A character vector of key variables to extract from the "id" column (default: NULL).

verbose

An integer specifying the verbosity level. 0: no messages, 1: file count messages, 2: some detailed messages about files, 3: detailed messages including all file problems (default: 0).

keep_file_name

Specifies whether to keep the file name in the summary tibble. Allowed values are NULL (default), "All" (keep entire file name), "Space" (keep up to first space), or "Underscore" (keep up to first underscore). Only used when return_value is "Summary".

emails

A character vector of email addresses to filter results by, "*" to include all emails, or NULL to skip email filtering (default: NULL).
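As a rough illustration of the title and keep_file_name arguments, the base R sketch below mimics the described behaviour on made-up file names. It is not the function's actual implementation. It assumes that each title pattern is treated as a regular expression (consistent with the "." default) and that "up to first space" and "up to first underscore" exclude the delimiter itself.

```r
# Hypothetical file names, invented for illustration only
files <- c("final_jane doe.html", "final_john_smith.html", "midterm_ann lee.html")

# title: each pattern is matched against the file names separately,
# then the matches are combined (patterns assumed to be regular expressions)
titles <- c("final", "midterm")
matched <- unlist(lapply(titles, function(p) files[grepl(p, files)]))

# keep_file_name = "Space": keep the part before the first space
# (names without a space are left unchanged in this sketch)
up_to_space <- sub(" .*$", "", matched)

# keep_file_name = "Underscore": keep the part before the first underscore
up_to_underscore <- sub("_.*$", "", matched)

print(up_to_space)
print(up_to_underscore)
```

How the real function handles a name that lacks the delimiter is not stated above, so check the output on your own files before relying on the truncated names.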

Value

If return_value is "Summary", returns a tibble with one row per file, one column for each of the key_vars, and an additional "answers" column giving the number of rows in the table extracted from that file. If return_value is "All", returns a single tibble that combines the data from all the files.
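The difference between the two return shapes can be sketched with plain data frames. Everything in this example is hypothetical: the file names, the "answer" column name, and the values are invented, and the real function returns tibbles built from the submitted files rather than from a hand-written list.

```r
# Hypothetical per-file tables standing in for tables extracted from two files
tables <- list(
  "a.html" = data.frame(id = c("email", "q1", "q2"),
                        answer = c("a@example.com", "x", "y")),
  "b.html" = data.frame(id = c("email", "q1"),
                        answer = c("b@example.com", "z"))
)

key_vars <- "email"

# "Summary"-like shape: one row per file, one column per key variable,
# plus an "answers" column holding the row count of that file's table
summary_rows <- lapply(tables, function(tbl) {
  keys <- setNames(
    lapply(key_vars, function(k) as.character(tbl$answer[tbl$id == k][1])),
    key_vars
  )
  data.frame(keys, answers = nrow(tbl))
})
summary_df <- do.call(rbind, summary_rows)

# "All"-like shape: every row from every file stacked together
all_df <- do.call(rbind, tables)

print(summary_df)
print(nrow(all_df))
```

In this sketch summary_df has two rows (one per file) and all_df has five rows (three from the first table and two from the second).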

Examples

if (FALSE) { # \dontrun{
# Process submissions from local directory
process_submissions(path = "path/to/directory")

# Process submissions whose file names match a pattern and extract a key variable
process_submissions(path = "path/to/directory", title = "final", key_vars = c("email"))

# Process submissions and include all emails (no email filtering)
process_submissions(path = "path/to/directory", key_vars = "email", emails = "*")

# Process submissions and return all data
process_submissions(path = "path/to/directory", return_value = "All")

# Process submissions from a Google Drive folder with the most detailed messages (verbose = 3)
process_submissions(path = "https://drive.google.com/drive/folders/your_folder_id", verbose = 3)

# Process submissions and keep the entire file name in the summary tibble
process_submissions(path = "path/to/directory", return_value = "Summary", keep_file_name = "All")
} # }