framework.db

Overview

Every Framework project includes a framework.db SQLite database for internal tracking. This lightweight database stores metadata about your data sources, cached computations, saved outputs, and more.

You generally don't interact with framework.db directly—Framework manages it automatically when you use functions like data_read(), cache_remember(), and save_table().

It is recommended to commit this file to git/GitHub and consider it as part of your project.

Database Schema

Table	Purpose
`data`	Tracks data sources, file paths, types, and integrity hashes
`cache`	Manages cached computation metadata and expiration
`results`	Registry of saved outputs (tables, figures, models)
`connections`	Database connection metadata
`meta`	Key-value store for project metadata

Data Integrity

When you read a data file using data_read(), Framework stores a hash of the file contents.

data:
  inputs.raw.patients:
    path: inputs/raw/patients.csv
    locked: true

patients <- data_read("inputs.raw.patients")

# First read: computes and stores hash in framework.db
# Subsequent reads: verifies hash matches, errors if file changed

If you set locked to true in the data catalog, it will ensure your source data hasn't been modified unexpectedly by erring. This is critical for reproducible research.

Cache Tracking

The cache table tracks what's cached, when it expires, and integrity hashes:

result <- cache_remember("expensive_model", {
  fit_complex_model(data)
}, expire = "7 days")  # Strings like "7 days" or numeric hours are accepted
# Metadata stored in framework.db
# Actual data stored in outputs/cache/expensive_model.rds

Framework uses the database to:

Check if a cached result exists and is still valid
Track expiration times
Verify cached data hasn't been corrupted

Durations: expire_after (or expire) accepts numeric hours (e.g., 4) or strings like "1 day" or "7 days".

Results Registry

When you save outputs, Framework automatically records them:

save_table(results, "analysis_results")
save_figure(plot, "summary_chart")
save_model(model, "regression_v1")

The results table tracks:

Output name and type (table, figure, model, report, notebook)
Whether it's public or private (for sensitive projects)
File hash for integrity
Creation and update timestamps

Use result_list() to query saved results:

result_list()                        # All results
result_list(type = "figure")         # Only figures
result_list(type = "table", public = TRUE)  # Public tables only

See Saving Work and Viewing Work for complete documentation on saving and tracking outputs.

Querying the Database

For advanced use, you can query framework.db directly using the built-in "framework" connection:

# View cached items
cache_entries <- db_query("SELECT name, expire_at FROM cache", "framework")

# Check data registry
data_registry <- db_query("SELECT name, path, hash, locked FROM data", "framework")

# List saved results
results <- db_query("SELECT name, type, created_at FROM results", "framework")

Location

The database lives at your project root:

my-project/
├── framework.db          # Project database
├── settings.yml
├── notebooks/
└── ...

It's typically gitignored since it contains local state that may differ between collaborators.

← scaffold()

Packages →