Preceptor Table¹

| Unit/Time¹ | | Potential Outcomes¹ | | Treatment¹ | Covariates¹ | |
|---|---|---|---|---|---|---|
| Senator | Session Year | Support Bill | Oppose Bill | Lobbying Contact | Senator Age | More |
| … | … | … | … | … | … | … |
| … | … | … | … | … | … | … |
| … | … | … | … | … | … | … |
| … | … | … | … | … | … | … |

¹ ...
Overview
This vignette introduces the `make_p_tables()` function in the `primer.tutorials` package, which inserts a three-chunk, Quarto-ready template into your open document for creating Preceptor Tables and Population Tables.
These tables are designed to support both causal and predictive modeling workflows by clearly labeling variables with spanners and encouraging detailed documentation via footnotes.
What Are Preceptor and Population Tables?
Preceptor Tables and Population Tables help represent structured information about observational units, treatment status, potential or predicted outcomes, and covariates in a standardized format. They are especially useful in modeling workflows, particularly in education or social science contexts where modeling assumptions must be made transparent.
This format draws inspiration from the Cardinal Virtues article, and aims to make tables interpretable in isolation by including both clear labeling and explanatory footnotes.
Preceptor Table
A Preceptor Table contains hypothetical or expected outcomes for units (such as students or senators). It often includes unknowns (denoted by `"..."`) where real data is not yet available, and reflects researcher or instructor expectations. The table automatically includes a blank third row and a “More” column for additional covariates.
Population Table
A Population Table contains a merged view of observed data (from the population) alongside preceptor-defined expectations. It includes an additional column, `Source`, that distinguishes between actual data (`"Data"`) and expectations (`"Preceptor"`). The table follows an 11-row structure with proper spacing between data and preceptor sections.
Key Features
The output of `make_p_tables()` includes:
- Editable footnotes for documentation
- An empty `tibble` for the Preceptor Table (`p_tibble`)
- An empty `tibble` for data input (`d_tibble`)
- `gt` code to render both tables with grouped column headers (“spanners”)
- Automatic addition of missing rows and “More” column during rendering
- Column alignment in the tribble code for easier editing
Spanner Structure
Each table includes spanners for:
- Unit/Time (for the unit columns)
- Potential Outcomes (for causal models) or Outcome (for predictive models)
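- Treatment (the intervention in causal models; per the argument descriptions later in this vignette, the column is kept in predictive models, where it holds the predictor variable)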
- Covariates (includes the covariate column and the “More” column)
Note: All table entries must be surrounded by double quotes, even for numeric values (e.g., `"42"`).
The goal is to visually communicate which variables play which roles in your modeling. Each spanner groups columns of a shared type. Footnotes help document the rationale and context for each set of variables.
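As a rough illustration of the spanner grouping described above, the sketch below builds a placeholder Preceptor Table by hand with `tibble` and `gt`. It is not the code that `make_p_tables()` inserts; the object names `p_tibble` and `preceptor_gt` and the exact sequence of calls are assumptions made for this example.

```r
library(tibble)
library(gt)

# Placeholder rows; every entry is a quoted string, as the note above requires.
p_tibble <- tribble(
  ~`Senator`, ~`Session Year`, ~`Support Bill`, ~`Oppose Bill`, ~`Lobbying Contact`, ~`Senator Age`, ~`More`,
  "...",      "...",           "...",           "...",          "...",               "...",          "...",
  "...",      "...",           "...",           "...",          "...",               "...",          "..."
)

# Group the columns under one spanner per role (assumed layout for a causal table).
preceptor_gt <- p_tibble |>
  gt() |>
  tab_header(title = "Preceptor Table") |>
  tab_spanner(label = "Unit/Time", columns = c("Senator", "Session Year")) |>
  tab_spanner(label = "Potential Outcomes", columns = c("Support Bill", "Oppose Bill")) |>
  tab_spanner(label = "Treatment", columns = c("Lobbying Contact")) |>
  tab_spanner(label = "Covariates", columns = c("Senator Age", "More"))

preceptor_gt
```

Each `tab_spanner()` call covers the columns that share a role, so a reader can tell at a glance which columns are units, outcomes, treatment, or covariates.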
Package Requirements
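`make_p_tables()` automatically constructs code that depends on other packages. The ones named in this vignette are `tibble` (for `tibble::tribble()`) and `gt` (for rendering the tables).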
Running make_p_tables()
Preceptor and Population Tables are inserted together. The Population Table includes a `"Source"` column as its first column (controlled by the `source_col` argument), which takes the value `"Data"` or `"Preceptor"` depending on each row's origin. This structure encourages comparison between expected and observed values.
Behind the scenes, these tables are generated using `tibble::tribble()` for easier manual editing by row. The tribble code is formatted with aligned columns to help authors maintain visual structure while editing.
When you run `primer.tutorials::make_p_tables()` without any argument values, you’ll get an error because the function requires specific labels to create the template.
Understanding the Function Arguments
The `make_p_tables()` function takes a set of user-defined labels and options that control how the Preceptor and Population tables are built and displayed. Each argument serves a clear conceptual role and is used to populate column headers, spanner labels, and the default content in `tibble::tribble()` calls. Here is a detailed breakdown:
- We use the term “label” rather than “vars” to indicate that these are the labels in the table rather than the variable names from the data. As such, they are often human-readable phrases with spaces, like “Math Score if in Small Class”. These descriptions should be concise but meaningful.
`type` (Character)
- Set to `"causal"` to generate a causal table structure, which includes:
  - Multiple columns for Potential Outcomes (specified in `outcome_label`).
  - A Treatment column representing an intervention or assigned condition.
- Set to `"predictive"` for a predictive model:
  - Includes outcome columns as specified in `outcome_label`.
  - Treatment column is still included but represents the predictor variable.
- This determines not only what variables appear in the tables, but also how they are spanned and labeled in the rendered `gt` tables.
`unit_label` (Character vector of length 2)
- Human-readable names for the unit of analysis, e.g., `c("Senator", "Session Year")` or `c("Student", "Grade Level")`.
- These will appear as the first two columns and are grouped under the `"Unit/Time"` spanner.
- The labels should be capitalized and concise.
`outcome_label` (Character vector)
- Describes the key outcomes being predicted or causally modeled.
- For causal models, typically includes multiple potential outcomes like `c("Support Bill", "Oppose Bill")`.
- For predictive models, might be a single outcome like `c("Test Score")`.
- Should be interpretable phrases that clearly describe the outcomes.
`treatment_label` (Character)
- Label for the treatment/predictor column, such as `"Phone Call"` or `"Tutoring Program"`.
- Used to title the corresponding `gt::tab_spanner()`.
- Required for both causal and predictive models.
`covariate_label` (Character)
- Label for the main covariate column relevant to the analysis.
- This is grouped under the `"Covariates"` spanner along with the “More” column.
- Should be a simple phrase like `"Age"` or `"School Type"`.
`source_col` (Logical, default `TRUE`)
- Controls whether the Population Table includes a `"Source"` column.
- When `TRUE`, adds a column distinguishing between `"Data"` and `"Preceptor"` rows.
- When `FALSE`, the Population Table omits the source column.
Each of these labels should be understood as descriptive display names, not as variable names from an existing dataset. The goal is clarity and interpretability for readers of the resulting Quarto document.
Helper Functions
The `make_p_tables()` function relies on two key helper functions that handle the generation and formatting of the table templates:
write_input_tribble()
This function generates properly formatted `tibble::tribble()` code with aligned columns for easy manual editing:
- Purpose: Creates a character string representing R code for a tribble with placeholder values (`"..."`)
- Input: Character vector of column names
- Output: Formatted tribble code with columns aligned under their headers
- Key Feature: Calculates appropriate spacing so that column values align vertically, making it easier to see which column you’re editing

The function ensures that:
- Column headers are wrapped in backticks and prefixed with `~`
- All placeholder values are `"..."`
- Column widths accommodate both header names and placeholder text
- The resulting code is properly formatted for insertion into Quarto documents
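The alignment logic described above can be approximated in a few lines of base R. This is a hypothetical re-creation for illustration only: `sketch_tribble_code()` and its `n_rows` argument are invented names, and the packaged `write_input_tribble()` may work differently.

```r
# Hypothetical stand-in for write_input_tribble(); not the package's code.
sketch_tribble_code <- function(col_names, n_rows = 3) {
  headers <- paste0("~`", col_names, "`")
  placeholder <- "\"...\""
  # Column width = longer of header and placeholder, plus 1 for the trailing comma.
  widths <- pmax(nchar(headers), nchar(placeholder)) + 1

  format_row <- function(cells, last = FALSE) {
    cells <- paste0(cells, ",")
    # Right-pad every cell with spaces so the columns line up vertically.
    padded <- paste0(cells, strrep(" ", widths - nchar(cells)))
    line <- trimws(paste(padded, collapse = " "), which = "right")
    if (last) line <- sub(",$", "", line)  # no comma after the final value
    paste0("  ", line)
  }

  rows <- vapply(
    seq_len(n_rows),
    function(i) format_row(rep(placeholder, length(col_names)), last = (i == n_rows)),
    character(1)
  )
  paste(c("tibble::tribble(", format_row(headers), rows, ")"), collapse = "\n")
}

code <- sketch_tribble_code(c("Senator", "Session Year"))
cat(code, "\n")
```

Padding each cell to its column's width, rather than to one global width, keeps narrow columns compact while still lining values up under their headers.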
expand_input_tibble()
This function processes the user-filled tibbles to add missing structural elements before rendering:
- Purpose: Adds missing rows and the “More” column to create the final table structure
- Input: List of tibbles, table type (“preceptor” or “population”), and source column option
- Output: A single expanded tibble ready for `gt` rendering

For Preceptor Tables:
- Ensures at least 4 rows by adding a blank row in the third position
- Adds a “More” column filled with `"..."`

For Population Tables:
- Creates the 11-row structure with proper spacing
- Combines data and preceptor sections with blank rows
- Handles the “Source” column labeling (“Data” vs “Preceptor”)
This function ensures that the final rendered tables have consistent structure regardless of how much data the user initially provides in their tibbles.
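To make the Preceptor Table expansion concrete, here is a minimal sketch under the assumption that the input is the three-row template. `sketch_expand_preceptor()` is a hypothetical name, not the package's `expand_input_tibble()`, and the row values are made up for illustration.

```r
library(tibble)

# Hypothetical stand-in for the Preceptor Table branch of expand_input_tibble().
sketch_expand_preceptor <- function(p_tibble) {
  stopifnot(nrow(p_tibble) == 3)  # assumption: the three-row template
  # One row of "..." with the same column names as the input.
  blank_row <- as_tibble(lapply(p_tibble, function(col) "..."))
  # Put the blank row in the third position: rows 1-2, blank, row 3.
  expanded <- rbind(p_tibble[1:2, ], blank_row, p_tibble[3, ])
  # Add the "More" column, filled with placeholders.
  expanded$More <- "..."
  expanded
}

# Illustrative values only; not real data.
p_tibble <- tribble(
  ~`Senator`,  ~`Session Year`, ~`Senator Age`,
  "Senator A", "2019",          "61",
  "Senator B", "2020",          "47",
  "Senator C", "2021",          "55"
)

expanded <- sketch_expand_preceptor(p_tibble)
expanded
```

The blank third row signals that the listed units stand in for a longer table, and the “More” column does the same for covariates.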
Understanding the Footnotes
Footnotes in these tables document your analytical assumptions and connect to the cardinal virtues of data science. When you use `make_p_tables()`, it generates editable placeholders for ten footnotes, five for each table.
Preceptor Table Footnotes
- `pre_title_footnote`: Makes clear the question we are trying to answer. That question helps to define the universe of interest.
- `pre_units_footnote`: Defines each unit/row and connects to stability and representativeness. Explains what each row represents and any temporal/spatial scope. The missing rows (indicated by “…”) represent the rest of the population from which both your data and expectations are drawn.
- `pre_outcome_footnote`: For causal tables, connects to validity by explaining how the potential outcomes relate to the true causal effects you want to measure. For predictive tables, simply describes the outcome variable and its measurement.
- `pre_treatment_footnote`: Defines the treatment and connects to unconfoundedness. Explains the treatment assignment mechanism and what makes it “as good as random” for causal inference.
- `pre_covariates_footnote`: Explains covariate selection and the “…” in the More column, indicating additional variables that might matter but aren’t included.
Population Table Footnotes
- `pop_title_footnote`: Describes how this table combines observed data with researcher expectations from the Preceptor Table.
- `pop_units_footnote`: Distinguishes between Data rows (observed units) and Preceptor rows (researcher expectations), connecting to stability and representativeness. The “…” rows represent the broader population from which both are drawn.
- `pop_outcome_footnote`: Documents data sources and measurement procedures. For causal tables, connects to validity by explaining how observed outcomes relate to the potential outcomes of interest.
- `pop_treatment_footnote`: Explains how treatment was assigned or observed in the data, connecting to unconfoundedness assumptions about the assignment mechanism.
- `pop_covariates_footnote`: Describes covariate data sources and any measurement differences between observed data and researcher expectations.
The key insight is that question marks in the Preceptor Table represent the fundamental problem of causal inference - we can never observe both potential outcomes for the same unit. These footnotes make your assumptions about this missing data explicit and connect them to the cardinal virtues that make causal inference possible: validity, stability, representativeness, and unconfoundedness.
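One plausible way to attach footnotes like these to a `gt` table is sketched below. The footnote variable names come from the lists above, but the wiring (the `unit_time` spanner id, the footnote text, and the choice of `tab_footnote()` locations) is an assumption for illustration, not necessarily what the generated template does.

```r
library(tibble)
library(gt)

# Footnote text is illustrative; in the template these start as editable placeholders.
pre_title_footnote <- "Question: does lobbying contact change a senator's vote?"
pre_units_footnote <- "Each row is one senator in one session year."

footnoted_gt <- tribble(
  ~`Senator`, ~`Session Year`,
  "...",      "...",
  "...",      "..."
) |>
  gt() |>
  tab_header(title = "Preceptor Table") |>
  # Giving the spanner an explicit id makes it easy to target with a footnote.
  tab_spanner(label = "Unit/Time", id = "unit_time",
              columns = c("Senator", "Session Year")) |>
  tab_footnote(footnote = pre_title_footnote,
               locations = cells_title(groups = "title")) |>
  tab_footnote(footnote = pre_units_footnote,
               locations = cells_column_spanners(spanners = "unit_time"))

footnoted_gt
```

Attaching each footnote to its spanner keeps the assumption next to the columns it describes, which is what lets the table be read in isolation.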
Examples
When you run:
```r
make_p_tables(
  type = "causal",
  unit_label = c("Senator", "Session Year"),
  outcome_label = c("Support Bill", "Oppose Bill"),
  treatment_label = "Lobbying Contact",
  covariate_label = "Senator Age"
)
```
The following chunks are inserted:
1. Footnotes and Data Setup
2. Preceptor Table
3. Population Table
Population Table¹

| | Unit/Time¹ | | Potential Outcomes¹ | | Treatment¹ | Covariates¹ | |
|---|---|---|---|---|---|---|---|
| Source | Senator | Session Year | Support Bill | Oppose Bill | Lobbying Contact | Senator Age | More |
| … | … | … | … | … | … | … | … |
| Data | … | … | … | … | … | … | … |
| Data | … | … | … | … | … | … | … |
| Data | … | … | … | … | … | … | … |
| Data | … | … | … | … | … | … | … |
| … | … | … | … | … | … | … | … |
| Preceptor | … | … | … | … | … | … | … |
| Preceptor | … | … | … | … | … | … | … |
| Preceptor | … | … | … | … | … | … | … |
| Preceptor | … | … | … | … | … | … | … |
| … | … | … | … | … | … | … | … |

¹ ...
After filling in the tibbles and footnotes with actual data, you would see properly formatted tables with:
- Preceptor Table: 4 rows (3 content + 1 blank) with a “More” column
- Population Table: 11 rows with proper separation between data and preceptor sections
- Column alignment: Easy-to-edit tribble format with aligned columns
- Source labeling: Clear distinction between “Data” and “Preceptor” rows
Working Example: Gubernatorial Elections and Longevity
Let’s walk through a complete example using real data from the `governors` dataset in `primer.data`. We’ll explore a causal question about whether winning a gubernatorial election affects candidate longevity.
Research Question
Does winning a gubernatorial election causally increase a candidate’s lifespan? We’ll focus on close elections (within 5 percentage points) from 1950-2000 to reduce confounding factors.
Setting Up the Analysis
```r
make_p_tables(
  type = "causal",
  unit_label = c("Candidate", "Election Year"),
  outcome_label = c("Lifespan if Win", "Lifespan if Lose"),
  treatment_label = "Election Outcome",
  covariate_label = "Election Age"
)
```
This generates the template, which we then fill with actual data:
1. Completed Data Setup
2. Rendered Preceptor Table
Preceptor Table

| Unit/Time¹ | | Potential Outcomes² | | Treatment³ | Covariates | |
|---|---|---|---|---|---|---|
| Candidate | Election Year | Lifespan if Win | Lifespan if Lose | Election Outcome | Election Age | ... |
| John Smith | 1975 | 78 | 75 | Won | 52 | … |
| Mary Johnson | 1982 | 82 | 79 | Lost | 48 | … |
| … | … | … | … | … | … | … |
| Robert Wilson | 1990 | 75 | 81 | Won | 45 | … |

¹ Each row represents a candidate in a close gubernatorial election (1950-2000, margin ≤5%). Missing rows represent the broader population (stability, representativeness).
² Potential lifespans under winning vs. losing. Question marks show unobserved counterfactuals (validity).
³ Election outcome determined by vote margin. Close races approximate random assignment (unconfoundedness).
3. Rendered Population Table
Population Table

| Source | Unit/Time¹ | | Potential Outcomes² | | Treatment³ | Covariates | |
|---|---|---|---|---|---|---|---|
| | Candidate | Election Year | Lifespan if Win | Lifespan if Lose | Election Outcome | Election Age | ... |
| … | … | … | … | … | … | … | … |
| Data | Frank Miller | 1978 | 73 | ? | Won | 55 | … |
| Data | Susan Davis | 1984 | ? | 76 | Lost | 49 | … |
| Data | … | … | … | … | … | … | … |
| Data | David Brown | 1992 | 68 | ? | Won | 58 | … |
| … | … | … | … | … | … | … | … |
| Preceptor | John Smith | 1975 | 78 | 75 | Won | 52 | … |
| Preceptor | Mary Johnson | 1982 | 82 | 79 | Lost | 48 | … |
| Preceptor | … | … | … | … | … | … | … |
| Preceptor | Robert Wilson | 1990 | 75 | 81 | Won | 45 | … |
| … | … | … | … | … | … | … | … |

¹ Data rows: actual elections; Preceptor rows: expectations. Missing rows represent broader population (stability, representativeness).
² Historical lifespans; question marks show unobserved counterfactuals (validity).
³ Election outcomes from vote tallies. Close margins approximate randomization (unconfoundedness).
Table Structure Details
Preceptor Table
- Uses `p_tibble` as input (3 rows of placeholders)
- Processed by `expand_input_tibble()` to add a blank third row and “More” column
- Results in 4 total rows for the final table
Population Table
- Uses `d_tibble` for data input (3 rows of placeholders)
- Creates 4 data rows (3 content + 1 blank in 3rd position)
- Uses the expanded preceptor table (4 rows)
- Combines into 11-row structure: blank + 4 data + blank + 4 preceptor + blank
- All rows are properly labeled in the Source column
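The 11-row layout listed above (blank + 4 data + blank + 4 preceptor + blank) can be assembled roughly as follows. The helper names `blank_like()`, `add_blank_third_row()`, and `sketch_population_rows()` are hypothetical, the input values are made up, and the packaged implementation may differ.

```r
library(tibble)

# One row of "..." with the same columns as `tbl` (hypothetical helper).
blank_like <- function(tbl) as_tibble(lapply(tbl, function(col) "..."))

# Turn a three-row tibble into four rows with a blank in the third position.
add_blank_third_row <- function(tbl) {
  stopifnot(nrow(tbl) == 3)  # assumption: the three-row template
  rbind(tbl[1:2, ], blank_like(tbl), tbl[3, ])
}

sketch_population_rows <- function(d_tibble, p_tibble) {
  data_part <- add_blank_third_row(d_tibble)
  data_part$Source <- "Data"
  preceptor_part <- add_blank_third_row(p_tibble)
  preceptor_part$Source <- "Preceptor"
  spacer <- blank_like(data_part)  # every cell, including Source, is "..."
  combined <- rbind(spacer, data_part, spacer, preceptor_part, spacer)
  combined$More <- "..."
  # Source first, More last, everything else in its original order.
  middle <- setdiff(names(combined), c("Source", "More"))
  combined[, c("Source", middle, "More")]
}

# Illustrative values only; not real data.
d_tibble <- tribble(
  ~`Senator`, ~`Session Year`,
  "D1",       "2019",
  "D2",       "2020",
  "D3",       "2021"
)
p_tibble <- tribble(
  ~`Senator`, ~`Session Year`,
  "P1",       "2025",
  "P2",       "2025",
  "P3",       "2025"
)

population <- sketch_population_rows(d_tibble, p_tibble)
population
```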
Column Alignment
The tribble code generated by `write_input_tribble()` is formatted with aligned columns to make editing easier:
- Headers and values are padded to align vertically
- Minimum column width accommodates the `"..."` placeholder
- Makes it easy to see which column you’re editing
Summary
The `make_p_tables()` function simplifies the creation of interpretable, spanner-labeled tables for modeling workflows. It promotes clarity, transparency, and rigor by encouraging authors to:
- Replace placeholders with meaningful values
- Use proper formatting with double quotes around all entries
- Fill in footnotes with useful context
- Take advantage of the aligned column structure for easy editing
- Understand the automatic row and column additions during rendering
This workflow supports better modeling documentation and instructional design, with helper functions ensuring consistent formatting and structure.
For more on how and why to use these tables, see:
- The Cardinal Virtues article from primer.tutorials