Publishing
Publishing notebooks, data, and files to S3 storage
Overview
Framework provides functions for publishing notebooks, data, and files to S3-compatible storage (AWS S3, MinIO, DigitalOcean Spaces, etc.). Share rendered reports, datasets, and analysis outputs with colleagues or the public.
Configuring S3
Define S3 connections in settings.yml (either format works):
Preferred (connections.storage_buckets):
connections:
storage_buckets:
default:
bucket: my-reports-bucket
region: us-east-1
prefix: projects/analysis
access_key: env("S3_ACCESS_KEY")
secret_key: env("S3_SECRET_KEY")
static_hosting: true
backup:
bucket: backup-bucket
endpoint: https://minio.internal:9000
access_key: env("MINIO_ACCESS_KEY")
secret_key: env("MINIO_SECRET_KEY")
default_storage_bucket: default
Store credentials in .env:
S3_ACCESS_KEY=AKIA...
S3_SECRET_KEY=...
Configuration Options
| Option | Description |
|---|---|
bucket |
S3 bucket name |
region |
AWS region (default: us-east-1) |
prefix |
Path prefix for all uploads |
endpoint |
Custom endpoint for S3-compatible services |
access_key |
AWS access key (use env()) |
secret_key |
AWS secret key (use env()) |
static_hosting |
If true, uploads notebooks as dest/index.html |
default |
(connections format) Mark this bucket as the default |
Set a default bucket with default_storage_bucket (or default: true on a bucket). If no default is set, Framework falls back to the first bucket defined.
Publishing Notebooks
Render and upload Quarto notebooks in one step:
# Publish a notebook
publish_notebook("notebooks/analysis.qmd")
# -> https://bucket.s3.amazonaws.com/prefix/analysis.html
# Custom destination
publish_notebook("notebooks/analysis.qmd", dest = "reports/q4-2024")
# Use specific connection
publish_notebook("notebooks/analysis.qmd", connection = "backup")
Static Hosting Mode
When static_hosting: true, notebooks are uploaded as index.html files for clean URLs:
publish_notebook("analysis.qmd", dest = "reports/q4")
# Uploads to: reports/q4/index.html
# Access at: https://bucket.s3.amazonaws.com/reports/q4/
Without static hosting, files keep their extension:
publish_notebook("analysis.qmd", dest = "reports/q4")
# Uploads to: reports/q4.html
# Access at: https://bucket.s3.amazonaws.com/reports/q4.html
Self-Contained vs Multi-File
By default, notebooks are rendered as self-contained HTML with all resources embedded:
# Self-contained (default) - single file, portable
publish_notebook("analysis.qmd", self_contained = TRUE)
# Multi-file - smaller HTML, separate assets (requires static_hosting)
publish_notebook("analysis.qmd", self_contained = FALSE)
Publishing Data
Upload data frames or existing files:
# Publish a data frame as CSV
publish_data(my_results, "datasets/results.csv")
# Publish as other formats
publish_data(my_results, "datasets/results.parquet", format = "parquet")
publish_data(my_results, "datasets/results.rds", format = "rds")
# Publish with compression
publish_data(my_results, "datasets/results.csv", compress = TRUE)
# -> uploads as results.csv.gz
# Publish an existing file
publish_data("outputs/model.rds", "models/v2/model.rds")
Publishing Directories
Upload entire directories:
# Upload a directory
publish_dir("outputs/dashboard/")
# Custom destination
publish_dir("outputs/dashboard/", dest = "dashboards/v2/")
# Filter files
publish_dir("outputs/", pattern = "\\.html$")
Generic Publish
For any file or directory:
# Single file
publish("outputs/report.html")
# With custom destination
publish("outputs/report.html", dest = "reports/final.html")
# Directory
publish("outputs/charts/", dest = "visualizations/")
Workflow Example
A typical publishing workflow:
library(framework)
scaffold()
# Run analysis
results <- analyze_data(my_data)
summary_plot <- create_visualization(results)
# Save outputs locally
save_table(results, "final_results")
save_figure(summary_plot, "summary")
# Render notebook
save_notebook("notebooks/analysis.qmd")
# Publish to S3
publish_notebook("notebooks/analysis.qmd", dest = "reports/q4-analysis")
publish_data(results, "data/q4-results.csv")
# Share the URL
# https://my-bucket.s3.amazonaws.com/reports/q4-analysis/
Best Practices
Use Prefixes for Organization
connections:
storage_buckets:
default:
prefix: team-name/project-name
default_storage_bucket: default
This keeps your bucket organized:
team-name/project-name/reports/...
team-name/project-name/data/...
Version Important Outputs
Include dates or versions in destinations:
publish_notebook("analysis.qmd", dest = "reports/analysis-2024-01-15")
publish_data(results, "data/results-v2.csv")
Use Static Hosting for Reports
Enable static_hosting: true for cleaner URLs when sharing reports:
# With static hosting
https://bucket.s3.amazonaws.com/reports/q4/
# Without
https://bucket.s3.amazonaws.com/reports/q4.html