Storage Backends
Configure persistent storage for catalog metadata.
Table of contents
- Overview
- Quick Start
- Memory Backend
- Local Filesystem
- AWS S3
- Google Cloud Storage
- Azure Blob Storage
- MinIO (Self-Hosted S3)
- Kubernetes Horizontal Scaling
- Backup and Restore
- Performance Tuning
- Troubleshooting
- Next Steps
Overview
Rustberg stores catalog metadata in SlateDB (100% pure Rust). The storage URL scheme selects the backend:
| Backend | URL Scheme | K8s HA | Use Case |
|---|---|---|---|
| Memory | memory:// | ❌ | Development, testing |
| Local File | file:///path | ❌ | Single-node production |
| AWS S3 | s3://bucket/prefix | ✅ | Cloud production |
| GCS | gs://bucket/prefix | ✅ | Cloud production |
| Azure Blob | az://container/prefix | ✅ | Cloud production |
| MinIO | s3://bucket + endpoint | ✅ | Air-gapped |
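The table can be read as a lookup from URL scheme to backend. The sketch below is only an illustration of that mapping, not Rustberg's own parsing code; the `BACKENDS` dictionary and `describe_backend` function are hypothetical names introduced here.

```python
from urllib.parse import urlsplit

# Scheme -> (backend, usable with multiple K8s replicas), copied from the table.
# MinIO shares the s3:// scheme and is distinguished only by its endpoint setting.
BACKENDS = {
    "memory": ("Memory", False),
    "file": ("Local File", False),
    "s3": ("AWS S3 / MinIO", True),
    "gs": ("GCS", True),
    "az": ("Azure Blob", True),
}


def describe_backend(url):
    """Return (backend, ha_capable) for a storage URL, or raise ValueError."""
    scheme = urlsplit(url).scheme
    if scheme not in BACKENDS:
        raise ValueError(f"unsupported storage scheme: {scheme!r}")
    return BACKENDS[scheme]
```

A pre-deployment check could call `describe_backend` on the configured URL and refuse to start more than one replica when the second element is `False`.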
Quick Start
Memory (Default)
# In-memory storage (data lost on restart)
./rustberg
Local Filesystem
# Persistent local storage
./rustberg --storage file:///var/lib/rustberg
AWS S3
# S3 backend (K8s HA ready)
export AWS_ACCESS_KEY_ID=your_key
export AWS_SECRET_ACCESS_KEY=your_secret
./rustberg --storage s3://my-bucket/rustberg-catalog
Memory Backend
URL: memory://
Best for:
- Development
- CI/CD testing
- Ephemeral workloads
[storage]
object_store_url = "memory://"
Data is lost when the process restarts. Not for production.
Local Filesystem
URL: file:///absolute/path
Best for:
- Single-node production
- Edge deployments
- Simple setups
Configuration
[storage]
object_store_url = "file:///var/lib/rustberg"
Directory Structure
/var/lib/rustberg/
├── slatedb/ # SlateDB LSM-tree data
│ ├── wal/ # Write-ahead log
│ ├── sst/ # Sorted string tables
│ └── manifest/ # Metadata
└── backup/ # Optional backup location
Permissions
# Create directory
sudo mkdir -p /var/lib/rustberg
sudo chown rustberg:rustberg /var/lib/rustberg
sudo chmod 700 /var/lib/rustberg
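Use an absolute path after file:// (three slashes in total, as in file:///var/lib/rustberg). Relative paths may be rejected or resolved against an unexpected directory.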
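One common way a relative path goes wrong: under standard URL parsing, `file://data/rustberg` treats `data` as a host name, not as a directory. The sketch below shows this with Python's standard library. How Rustberg itself handles such URLs is not stated here, and `local_path_from_url` is a hypothetical helper.

```python
from urllib.parse import urlsplit


def local_path_from_url(url):
    """Extract the directory from a file:// URL, rejecting non-absolute forms."""
    parts = urlsplit(url)
    if parts.scheme != "file":
        raise ValueError(f"not a file URL: {url!r}")
    if parts.netloc not in ("", "localhost"):
        # file://data/rustberg parses "data" as the host, leaving "/rustberg" as the path.
        raise ValueError(f"{url!r} has host {parts.netloc!r}; use file:///absolute/path")
    if not parts.path.startswith("/"):
        raise ValueError(f"{url!r} does not contain an absolute path")
    return parts.path
```

For example, `local_path_from_url("file:///var/lib/rustberg")` returns `/var/lib/rustberg`, while the two-slash form raises an error instead of silently pointing somewhere else.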
AWS S3
URL: s3://bucket/prefix
Best for:
- Kubernetes deployments
- High availability
- Multi-replica setups
Configuration
[storage]
object_store_url = "s3://my-bucket/rustberg-catalog"
aws_region = "us-east-1"
Authentication
Environment Variables (Recommended)
export AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE
export AWS_SECRET_ACCESS_KEY=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
export AWS_REGION=us-east-1
IAM Role (EC2/EKS)
# EKS Service Account with IRSA
apiVersion: v1
kind: ServiceAccount
metadata:
  name: rustberg
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/rustberg-role
IAM Policy
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::my-bucket",
        "arn:aws:s3:::my-bucket/rustberg-catalog/*"
      ]
    }
  ]
}
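The two `Resource` entries follow directly from the storage URL: `s3:ListBucket` applies to the bucket ARN, and the object actions apply to keys under the prefix. A minimal sketch that derives the same policy from an `s3://` URL follows. The `catalog_policy` function is a hypothetical helper, not part of Rustberg or the AWS SDK.

```python
import json
from urllib.parse import urlsplit


def catalog_policy(storage_url):
    """Build the IAM policy shown above for a given s3://bucket/prefix URL."""
    parts = urlsplit(storage_url)
    if parts.scheme != "s3" or not parts.netloc:
        raise ValueError(f"expected s3://bucket/prefix, got {storage_url!r}")
    bucket = parts.netloc
    prefix = parts.path.strip("/")
    object_arn = f"arn:aws:s3:::{bucket}/{prefix}/*" if prefix else f"arn:aws:s3:::{bucket}/*"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "s3:GetObject",
                    "s3:PutObject",
                    "s3:DeleteObject",
                    "s3:ListBucket",
                ],
                # Bucket ARN for ListBucket, object ARN for the other actions.
                "Resource": [f"arn:aws:s3:::{bucket}", object_arn],
            }
        ],
    }


policy_json = json.dumps(catalog_policy("s3://my-bucket/rustberg-catalog"), indent=2)
```

Printing `policy_json` yields a document that can be attached to the role referenced by the service account.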
S3 Bucket Settings
| Setting | Recommended Value | Why |
|---|---|---|
| Versioning | Enabled | Disaster recovery |
| Encryption | SSE-S3 or SSE-KMS | Data protection |
| Lifecycle | Expire noncurrent versions after 30 days | Cost optimization |
| Replication | Optional | Multi-region HA |
Google Cloud Storage
URL: gs://bucket/prefix
Best for:
- GKE deployments
- Google Cloud workloads
Configuration
[storage]
object_store_url = "gs://my-bucket/rustberg-catalog"
Authentication
Service Account Key
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json
Workload Identity (GKE)
apiVersion: v1
kind: ServiceAccount
metadata:
  name: rustberg
  annotations:
    iam.gke.io/gcp-service-account: rustberg@project.iam.gserviceaccount.com
IAM Permissions
gsutil iam ch serviceAccount:rustberg@project.iam.gserviceaccount.com:objectAdmin \
gs://my-bucket
Azure Blob Storage
URL: az://container/prefix
Best for:
- AKS deployments
- Azure workloads
Configuration
[storage]
object_store_url = "az://my-container/rustberg-catalog"
azure_storage_account = "mystorageaccount"
Authentication
Access Key
export AZURE_STORAGE_ACCOUNT=mystorageaccount
export AZURE_STORAGE_KEY=your_storage_key
Managed Identity (AKS)
apiVersion: v1
kind: ServiceAccount
metadata:
  name: rustberg
  annotations:
    azure.workload.identity/client-id: <client-id>
MinIO (Self-Hosted S3)
URL: s3://bucket with custom endpoint
Best for:
- Air-gapped environments
- On-premises deployments
- Development with S3 API
Configuration
[storage]
object_store_url = "s3://rustberg-bucket/catalog"
aws_endpoint = "http://minio.local:9000"
aws_region = "us-east-1"
aws_allow_http = true # Only for development
Docker Compose Example
version: '3.8'
services:
  minio:
    image: minio/minio
    ports:
      - "9000:9000"
      - "9001:9001"
    environment:
      MINIO_ROOT_USER: minioadmin
      MINIO_ROOT_PASSWORD: minioadmin
    command: server /data --console-address ":9001"
  rustberg:
    image: ghcr.io/hupe1980/rustberg:latest
    ports:
      - "8181:8181"
    environment:
      RUSTBERG_STORAGE: "s3://rustberg/catalog"
      AWS_ENDPOINT_URL: "http://minio:9000"
      AWS_ACCESS_KEY_ID: minioadmin
      AWS_SECRET_ACCESS_KEY: minioadmin
      AWS_REGION: us-east-1
    depends_on:
      - minio
Kubernetes Horizontal Scaling
SlateDB enables horizontal scaling without external coordination:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: rustberg
spec:
  replicas: 3 # ✅ Multiple replicas!
  selector:
    matchLabels:
      app: rustberg
  template:
    metadata:
      labels:
        app: rustberg # Must match spec.selector.matchLabels
    spec:
      containers:
        - name: rustberg
          image: ghcr.io/hupe1980/rustberg:latest
          env:
            - name: RUSTBERG_STORAGE
              value: "s3://my-bucket/rustberg-catalog"
How It Works
- No leader election - SlateDB’s writer_epoch fencing handles coordination
- CAS operations - Object storage provides atomic compare-and-swap
- Automatic retry - Contention resolved with exponential backoff
- 11-nines durability - Inherits S3/GCS durability
┌─────────────────────────────────────────────────────────────────┐
│                     Rustberg K8s Deployment                     │
├─────────────────────────────────────────────────────────────────┤
│             ┌──────────┐  ┌──────────┐  ┌──────────┐            │
│             │  Pod 1   │  │  Pod 2   │  │  Pod 3   │            │
│             │ Rustberg │  │ Rustberg │  │ Rustberg │            │
│             └────┬─────┘  └────┬─────┘  └────┬─────┘            │
│                  └─────────────┼─────────────┘                  │
│                                ▼                                │
│                 ┌─────────────────────────────┐                 │
│                 │           SlateDB           │                 │
│                 │   (writer_epoch fencing)    │                 │
│                 └──────────────┬──────────────┘                 │
│                                ▼                                │
│                 ┌─────────────────────────────┐                 │
│                 │      S3 / GCS / MinIO       │                 │
│                 │  (CAS + 11-nines durable)   │                 │
│                 └─────────────────────────────┘                 │
└─────────────────────────────────────────────────────────────────┘
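To make the interplay of fencing, compare-and-swap, and backoff concrete, here is a toy simulation. It is a sketch under stated assumptions, not SlateDB's implementation: `InMemoryStore`, `cas_update`, `Writer`, and the manifest layout are all hypothetical, and the in-memory store stands in for an object store with conditional writes.

```python
import random
import time

MANIFEST = "manifest"


class FencedError(Exception):
    """Raised when a writer with a newer epoch has taken over."""


class InMemoryStore:
    """Toy object store exposing a single compare-and-swap primitive."""

    def __init__(self):
        self._objects = {}  # key -> (version, value)

    def get(self, key):
        return self._objects.get(key, (0, None))

    def put_if_version(self, key, expected_version, value):
        current_version, _ = self.get(key)
        if current_version != expected_version:
            return False  # another writer got there first
        self._objects[key] = (current_version + 1, value)
        return True


def cas_update(store, mutate, max_attempts=5, base_delay=0.01, sleep=time.sleep):
    """Read-modify-write the manifest, retrying with exponential backoff."""
    for attempt in range(max_attempts):
        version, manifest = store.get(MANIFEST)
        candidate = mutate(dict(manifest or {"writer_epoch": 0, "tables": []}))
        if store.put_if_version(MANIFEST, version, candidate):
            return candidate
        # Lost the race: wait 1x, 2x, 4x ... the base delay (with jitter), then re-read.
        sleep(base_delay * (2 ** attempt) * random.uniform(0.5, 1.0))
    raise RuntimeError("CAS contention did not resolve")


class Writer:
    def __init__(self, store):
        self.store = store

        def claim(manifest):
            # Bumping the epoch fences every writer that holds an older one.
            manifest["writer_epoch"] += 1
            return manifest

        self.epoch = cas_update(store, claim)["writer_epoch"]

    def create_table(self, name):
        def add(manifest):
            if manifest["writer_epoch"] > self.epoch:
                raise FencedError(
                    f"epoch {self.epoch} superseded by {manifest['writer_epoch']}"
                )
            manifest["tables"] = manifest["tables"] + [name]
            return manifest

        return cas_update(self.store, add)
```

Because the epoch check runs inside the same read-modify-write cycle as the update, a stale writer either sees the newer epoch and stops, or loses the CAS and sees it on the retry. No lock service or leader election is involved.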
Backup and Restore
Backup
# Backup catalog to archive
./rustberg backup \
--storage s3://my-bucket/rustberg-catalog \
--output /backups/catalog-2026-01-24.tar.gz
Restore
# Restore from backup
./rustberg restore \
--input /backups/catalog-2026-01-24.tar.gz \
--storage s3://my-bucket/rustberg-catalog
Validate Backup
# Verify backup integrity
./rustberg validate-backup \
--input /backups/catalog-2026-01-24.tar.gz
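For a scheduled job, the backup and validation commands can be chained so that an archive is checked right after it is written. The sketch below only assembles the command lines shown above with a dated file name. `backup_commands` and `run_backup` are hypothetical helpers, and the archive naming follows the example paths in this section.

```python
import datetime
import subprocess


def backup_commands(storage_url, backup_dir, day=None):
    """Return the backup and validate-backup command lines for one day."""
    day = day or datetime.date.today()
    archive = f"{backup_dir.rstrip('/')}/catalog-{day.isoformat()}.tar.gz"
    return [
        ["./rustberg", "backup", "--storage", storage_url, "--output", archive],
        ["./rustberg", "validate-backup", "--input", archive],
    ]


def run_backup(storage_url, backup_dir):
    """Run both steps in order; check=True aborts on the first failure."""
    for command in backup_commands(storage_url, backup_dir):
        subprocess.run(command, check=True)
```

Calling `run_backup("s3://my-bucket/rustberg-catalog", "/backups")` from a cron job or a Kubernetes CronJob would then fail loudly if either step returns a non-zero exit code.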
Performance Tuning
S3 Optimization
[storage]
object_store_url = "s3://my-bucket/catalog"
aws_region = "us-east-1"
# Performance settings
s3_multipart_threshold_mb = 8
s3_multipart_chunk_size_mb = 8
s3_max_concurrent_requests = 100
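As a rough guide to what the multipart settings mean, the sketch below estimates how many parts an object of a given size is split into. It assumes that objects larger than the threshold are uploaded in chunk-sized parts and that the `_mb` values are mebibytes; neither is confirmed here. The two S3 limits (5 MiB minimum part size except for the last part, 10,000 parts per upload) are AWS constraints. `multipart_parts` is a hypothetical helper.

```python
import math

MIB = 1024 * 1024
S3_MIN_PART = 5 * MIB   # every part except the last must be at least this large
S3_MAX_PARTS = 10_000   # upper bound on parts in one multipart upload


def multipart_parts(size_bytes, threshold_mb=8, chunk_size_mb=8):
    """Number of parts for one object; 1 means a single PUT."""
    if size_bytes <= threshold_mb * MIB:  # assumption: threshold is inclusive
        return 1
    chunk = chunk_size_mb * MIB
    if chunk < S3_MIN_PART:
        raise ValueError("chunk size is below the S3 minimum part size of 5 MiB")
    parts = math.ceil(size_bytes / chunk)
    if parts > S3_MAX_PARTS:
        raise ValueError("object needs more than 10,000 parts; raise the chunk size")
    return parts
```

With the defaults above, a 20 MiB object is uploaded in three parts, while anything up to 8 MiB goes out as a single request.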
Local Filesystem
[storage]
object_store_url = "file:///var/lib/rustberg"
# Use SSD for best performance
# Mount with noatime for reduced I/O
Troubleshooting
S3 Access Denied
# Verify credentials
aws sts get-caller-identity
# Test bucket access
aws s3 ls s3://my-bucket/rustberg-catalog/
# Check bucket policy
aws s3api get-bucket-policy --bucket my-bucket
GCS Permission Denied
# Verify service account
gcloud auth list
# Test bucket access
gsutil ls gs://my-bucket/rustberg-catalog/
Local Filesystem Issues
# Check permissions
ls -la /var/lib/rustberg
# Check disk space
df -h /var/lib/rustberg
# Check for lock files
ls -la /var/lib/rustberg/slatedb/
Next Steps
- Encryption Guide - Encrypt data at rest
- Kubernetes Guide - Production K8s deployment
- Backup Guide - Disaster recovery