Caching

Caching expensive computations for faster iteration

Overview

Framework's caching system lets you save the results of expensive computations and reuse them in future sessions. This dramatically speeds up iterative analysis work.

Cached items are stored as RDS files in your project's cache directory. Framework tracks creation time, expiration, and a hash for integrity verification.

Default cache directories by project type:

Project Type Cache Directory
project outputs/cache
project_sensitive outputs/private/cache
course cache
presentation cache

Cache directories are created lazily—they don't exist until you first cache something. You can customize the cache directory per project type in the GUI under Project Defaults, or in settings.yml:

directories:
  cache: my/cache/path

Basic Caching

Use cache_remember() to cache a computation:

# First run: executes the code and caches result
processed_data <- cache_remember("my_analysis", {
  # Expensive operation
  raw_data |>
    mutate(date = parse_date(date_string)) |>
    filter(valid == TRUE) |>
    group_by(category) |>
    summarize(total = sum(amount))
})

# Subsequent runs: returns cached result instantly
processed_data <- cache_remember("my_analysis", {
  # This code won't run - cached result returned
})

Cache Expiration

Set an expiration time for cached data. You can pass hours as a number, or human-readable strings:

# Cache expires after 1 hour
fresh_data <- cache_remember("live_data", {
  fetch_from_api()
}, expire_after = 1)

# Human-readable strings also work
cache_remember("daily", expr, expire_after = "1 day")
cache_remember("weekly", expr, expire_after = "7 days")

Force Refresh

Force re-computation even if cached:

# Ignore cache and recompute
updated <- cache_remember("my_analysis", {
  # This runs even if cached
  expensive_computation()
}, refresh = TRUE)

Store a Value

# Manually cache a value
cache("processed_results", my_data)

# With expiration (hours)
cache("temp_results", my_data, expire_after = 24)

Retrieve a Value

# Get cached value (returns NULL if not found or expired)
cached <- cache_get("processed_results")

Delete a Cached Item

# Remove a specific cache
cache_forget("my_analysis")

List All Cached Items

cache_list()
# Returns data frame with name, expire_at, created_at, status

Clear All Cached Items

# Remove all cached items
cache_flush()

Best Practices

1. Use Descriptive Names

# Good
cache_remember("sales_2024_by_region", ...)

# Avoid
cache_remember("data1", ...)

2. Cache at the Right Level

Cache expensive operations, not simple ones:

# Good: Cache expensive database query
results <- cache_remember("quarterly_report", {
  query_get("complex_aggregation")
})

# Avoid: Don't cache simple operations
# x <- cache_remember("add", { 1 + 1 })

3. Use Expiration for Live Data

# Good: External data should expire
api_data <- cache_remember("exchange_rates", {
  fetch_rates_from_api()
}, expire_after = "1 hour")