Per-Principal Quotas
Rate limiting for multi-tenant deployments.
Table of contents
- Overview
- Quota Types
- Quota Resolution Order
- Protocol
- CLI Usage
- Configuration
- Client Handling
- Monitoring
- Best Practices
- Internal Architecture
- Security
Overview
Rivven supports per-principal quotas to limit throughput on a per-user, per-client-id, or per-consumer-group basis. This prevents noisy neighbors in multi-tenant deployments and ensures fair resource allocation.
Quotas can be configured at different granularities:
| Entity Type | Description | Use Case |
|---|---|---|
| user | Per authenticated principal | User-based throttling |
| client-id | Per client identifier | Application-based throttling |
| consumer-group | Per consumer group | Group throughput limits |
| default | Fallback for all entities | Global baseline limits |
Quota Types
| Quota | Unit | Default | Description |
|---|---|---|---|
| produce_bytes_rate | bytes/sec | 50 MB/s | Producer throughput limit |
| consume_bytes_rate | bytes/sec | 100 MB/s | Consumer throughput limit |
| request_rate | requests/sec | 1000/s | Request rate limit |
Quota Resolution Order
When checking quotas, Rivven resolves in this order:
1. User + Client ID specific: most specific, highest priority
2. User specific: per-user limits
3. Client ID specific: per-application limits
4. Default for entity type: e.g., default user limits
5. Global default: fallback baseline
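The lookup order above can be sketched as a walk over candidate keys from most to least specific. This is a minimal illustration, not Rivven's implementation: the `QuotaKey` enum, the `resolve_quota` function, and the flat `HashMap` store are all hypothetical names introduced here.

```rust
use std::collections::HashMap;

/// Hypothetical key naming the level at which a limit is configured.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
enum QuotaKey {
    UserAndClient(String, String),
    User(String),
    Client(String),
    DefaultUser,
    DefaultClient,
}

/// Returns the first configured limit, trying the most specific key first
/// and falling back to the global default when nothing matches.
fn resolve_quota(
    configs: &HashMap<QuotaKey, u64>,
    global_default: u64,
    user: Option<&str>,
    client_id: Option<&str>,
) -> u64 {
    let mut candidates = Vec::new();
    if let (Some(u), Some(c)) = (user, client_id) {
        candidates.push(QuotaKey::UserAndClient(u.to_string(), c.to_string()));
    }
    if let Some(u) = user {
        candidates.push(QuotaKey::User(u.to_string()));
    }
    if let Some(c) = client_id {
        candidates.push(QuotaKey::Client(c.to_string()));
    }
    // Entity-type defaults apply only when that entity is present.
    if user.is_some() {
        candidates.push(QuotaKey::DefaultUser);
    }
    if client_id.is_some() {
        candidates.push(QuotaKey::DefaultClient);
    }

    candidates
        .iter()
        .find_map(|key| configs.get(key).copied())
        .unwrap_or(global_default)
}
```

With a user-level limit for alice and a client-level limit for app-1, a request from alice via app-1 resolves to alice's limit, while a request from another user via app-1 falls through to the client-level limit.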
Protocol
Describe Quotas
Request::DescribeQuotas {
    // Empty = all quotas, or specific entities
    entities: vec![
        ("user".to_string(), Some("alice".to_string())),
        ("client-id".to_string(), Some("app-1".to_string())),
    ],
}
Response:
Response::QuotasDescribed {
    entries: vec![
        QuotaEntry {
            entity_type: "user",
            entity_name: Some("alice"),
            quotas: {
                "produce_bytes_rate": 10_000_000,
                "consume_bytes_rate": 20_000_000,
                "request_rate": 500,
            },
        },
    ],
}
Alter Quotas
Request::AlterQuotas {
    alterations: vec![
        QuotaAlteration {
            entity_type: "user".to_string(),
            entity_name: Some("alice".to_string()),
            quota_key: "produce_bytes_rate".to_string(),
            quota_value: Some(10_000_000), // 10 MB/s
        },
        QuotaAlteration {
            entity_type: "client-id".to_string(),
            entity_name: Some("batch-processor".to_string()),
            quota_key: "request_rate".to_string(),
            quota_value: Some(100), // 100 requests/sec
        },
    ],
}
Response:
Response::QuotasAltered {
    altered_count: 2,
}
Throttle Response
When a client exceeds their quota, they receive a Throttled response:
Response::Throttled {
    throttle_time_ms: 500,
    quota_type: "produce_bytes_rate",
    entity: "user:alice",
}
The client should wait throttle_time_ms before retrying.
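As a rough sketch of how a client might interpret these fields, the snippet below turns throttle_time_ms into a Duration and splits the entity string into its type and name. The `Throttled` struct here is a hypothetical client-side mirror of the response, and the `type:name` layout of `entity` is an assumption inferred from the "user:alice" example rather than a documented format.

```rust
use std::time::Duration;

/// Hypothetical client-side mirror of the `Throttled` response fields.
struct Throttled {
    throttle_time_ms: u64,
    quota_type: String,
    entity: String,
}

impl Throttled {
    /// How long to wait before retrying the request.
    fn retry_delay(&self) -> Duration {
        Duration::from_millis(self.throttle_time_ms)
    }

    /// Splits `entity` at the first ':' into (entity type, entity name).
    /// Returns None if the string does not follow the assumed layout.
    fn entity_parts(&self) -> Option<(&str, &str)> {
        self.entity.split_once(':')
    }
}
```

Keeping the parsed entity type alongside quota_type makes it easier to log which limit was hit and for whom.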
CLI Usage
# List all quotas
rivven quota list
# Set user quota
rivven quota set --user alice \
--produce-bytes-rate 10000000 \
--consume-bytes-rate 20000000
# Set client-id quota
rivven quota set --client-id batch-processor \
--request-rate 100
# Set default quotas
rivven quota set --default \
--produce-bytes-rate 50000000 \
--consume-bytes-rate 100000000
# Remove a quota (revert to defaults)
rivven quota delete --user alice --quota produce_bytes_rate
Configuration
Server-side default quotas can be configured at startup:
# rivven.yaml
quotas:
  defaults:
    produce_bytes_rate: 52428800   # 50 MB/s
    consume_bytes_rate: 104857600  # 100 MB/s
    request_rate: 1000
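The byte values in this file are binary megabytes (1 MB taken as 1024 × 1024 bytes), whereas the CLI examples use round decimal values such as 50000000. A small helper makes the arithmetic explicit; `mib_per_sec` is a hypothetical name used only for illustration.

```rust
/// Converts a rate in binary megabytes per second to bytes per second.
const fn mib_per_sec(mib: u64) -> u64 {
    mib * 1024 * 1024
}

// Values matching the configuration example above.
const PRODUCE_BYTES_RATE: u64 = mib_per_sec(50); // 52_428_800
const CONSUME_BYTES_RATE: u64 = mib_per_sec(100); // 104_857_600
```

Picking one convention (binary or decimal) and using it in both the config file and CLI calls avoids limits that differ by roughly 5%.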
Client Handling
Clients should handle throttle responses gracefully:
use std::time::Duration;

use rivven_client::Client;

// `Result` is assumed to be the result alias exported by the client crate.
async fn publish_with_retry(client: &Client, topic: &str, value: &[u8]) -> Result<()> {
    loop {
        match client.publish(topic, value).await {
            Ok(response) => {
                // The publish succeeded, but the server may still ask us to slow down.
                if let Some(throttle_ms) = response.throttle_time_ms() {
                    if throttle_ms > 0 {
                        tracing::warn!("Throttled for {}ms", throttle_ms);
                        tokio::time::sleep(Duration::from_millis(throttle_ms)).await;
                    }
                }
                return Ok(());
            }
            // The request was rejected for exceeding a quota: wait, then retry.
            Err(e) if e.is_throttled() => {
                let delay = e.throttle_time().unwrap_or(Duration::from_millis(100));
                tracing::warn!("Throttled, retrying after {:?}", delay);
                tokio::time::sleep(delay).await;
            }
            // Any other error is returned to the caller unchanged.
            Err(e) => return Err(e),
        }
    }
}
Monitoring
Quota metrics are exposed via the metrics endpoint:
# HELP rivven_quota_produce_violations_total Total produce quota violations
# TYPE rivven_quota_produce_violations_total counter
rivven_quota_produce_violations_total 42
# HELP rivven_quota_consume_violations_total Total consume quota violations
# TYPE rivven_quota_consume_violations_total counter
rivven_quota_consume_violations_total 7
# HELP rivven_quota_request_violations_total Total request rate violations
# TYPE rivven_quota_request_violations_total counter
rivven_quota_request_violations_total 15
# HELP rivven_quota_throttle_time_ms_total Total throttle time in milliseconds
# TYPE rivven_quota_throttle_time_ms_total counter
rivven_quota_throttle_time_ms_total 23450
Best Practices
1. Start with Generous Defaults
Begin with generous default quotas and tighten based on observed usage:
quotas:
  defaults:
    produce_bytes_rate: 104857600  # 100 MB/s
    consume_bytes_rate: 209715200  # 200 MB/s
    request_rate: 5000
2. Set Quotas for Known Heavy Users
Identify applications that need higher or lower limits:
# Batch processor needs higher throughput
rivven quota set --client-id batch-etl \
--produce-bytes-rate 500000000
# Rate-limit the test environment
rivven quota set --user test-service \
--produce-bytes-rate 1000000 \
--request-rate 50
3. Use Unlimited for Admin/Internal Services
# Admin tools should not be throttled
rivven quota set --user admin \
--produce-bytes-rate unlimited \
--consume-bytes-rate unlimited
4. Monitor and Alert
Set up alerts for quota violations:
# Prometheus alert rule
- alert: HighQuotaViolations
  expr: rate(rivven_quota_produce_violations_total[5m]) > 10
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: "High produce quota violations"
Internal Architecture
The quota system uses a sliding window algorithm for rate tracking:
- Window Duration: 1 second sliding window
- Tracking: Atomic counters for thread-safe updates
- Cleanup: Idle entity state is cleaned up after 1 hour
- Resolution: Quotas checked from most to least specific
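To make the sliding-window idea concrete, here is a minimal single-threaded sketch that keeps a timestamped log of recorded amounts and sums only those inside the window. It is not Rivven's implementation: the source describes atomic counters, while this version swaps in a plain VecDeque log for readability, and the `SlidingWindow` type and its `record` method are hypothetical. The current time is passed in explicitly so the behaviour is deterministic.

```rust
use std::collections::VecDeque;
use std::time::{Duration, Instant};

/// Sliding-log tracker: each recorded amount is stored with its timestamp,
/// and only entries younger than `window` count toward `limit`.
struct SlidingWindow {
    window: Duration,
    limit: u64,
    events: VecDeque<(Instant, u64)>,
    total: u64,
}

impl SlidingWindow {
    fn new(limit: u64, window: Duration) -> Self {
        Self { window, limit, events: VecDeque::new(), total: 0 }
    }

    /// Records `amount` at time `now`. Returns Ok(()) when the window total
    /// stays within the limit, otherwise Err with a suggested wait until the
    /// oldest entry leaves the window. Rejected amounts are not recorded.
    fn record(&mut self, now: Instant, amount: u64) -> Result<(), Duration> {
        // Evict entries that have slid out of the window.
        while let Some(&(t, n)) = self.events.front() {
            if now.duration_since(t) >= self.window {
                self.total -= n;
                self.events.pop_front();
            } else {
                break;
            }
        }
        if self.total + amount > self.limit {
            let wait = match self.events.front() {
                // Remaining lifetime of the oldest entry still in the window.
                Some(&(t, _)) => self.window - now.duration_since(t),
                None => self.window,
            };
            return Err(wait);
        }
        self.events.push_back((now, amount));
        self.total += amount;
        Ok(())
    }
}
```

A production version would more likely keep per-entity counters behind atomics or a lock and evict idle entities, as the list above describes; the log-based form is used here only because it shows the window semantics directly.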
Enforcement Path
All three quota checks (request rate, produce bytes, consume bytes) are enforced inside handle_with_principal():
- Anonymous path: handle() delegates to handle_with_principal(request, None, None), so quotas are tracked against the default entity.
- Authenticated path: AuthenticatedHandler extracts the principal name from the auth session and delegates to handle_with_principal(request, Some(user), client_id), so quotas are tracked against the specific user/client.
This ensures:
- No double-counting of quotas between auth and handler layers
- Consume quotas are enforced with the real principal (not just anonymous)
- All three quota types use the same enforcement point
┌─────────────────────────────────────────────────────┐
│ QuotaManager │
├─────────────────────────────────────────────────────┤
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │
│ │ Defaults │ │ Configs │ │ States │ │
│ │ (baseline) │ │ (per-entity)│ │(sliding win)│ │
│ └─────────────┘ └─────────────┘ └─────────────┘ │
├─────────────────────────────────────────────────────┤
│ record_produce(user, client_id, bytes) → Result │
│ record_consume(user, client_id, bytes) → Result │
│ record_request(user, client_id) → Result │
└─────────────────────────────────────────────────────┘
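The single enforcement point can be sketched as two entry points that both funnel into one function, so each request is recorded exactly once. Everything below is a simplified stand-in: the `QuotaManager` here only counts requests per entity, and `Handler`, `handle_authenticated`, the string-based request, and the "default" entity label are assumptions made for illustration.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for QuotaManager: counts requests per entity.
#[derive(Default)]
struct QuotaManager {
    request_counts: HashMap<String, u64>,
}

impl QuotaManager {
    fn record_request(&mut self, user: Option<&str>, _client_id: Option<&str>) {
        // Anonymous traffic is tracked against the default entity.
        let entity = user.unwrap_or("default").to_string();
        *self.request_counts.entry(entity).or_insert(0) += 1;
    }
}

struct Handler {
    quotas: QuotaManager,
}

impl Handler {
    /// Anonymous path: no principal, no client id.
    fn handle(&mut self, request: &str) -> String {
        self.handle_with_principal(request, None, None)
    }

    /// Authenticated path: the principal comes from the auth session.
    fn handle_authenticated(
        &mut self,
        request: &str,
        session_user: &str,
        client_id: Option<&str>,
    ) -> String {
        self.handle_with_principal(request, Some(session_user), client_id)
    }

    /// The only place quotas are recorded, which avoids double-counting.
    fn handle_with_principal(
        &mut self,
        request: &str,
        user: Option<&str>,
        client_id: Option<&str>,
    ) -> String {
        self.quotas.record_request(user, client_id);
        format!("handled {}", request)
    }
}
```

Because neither entry point touches the quota state itself, adding another front end (for example a different auth mechanism) cannot accidentally count a request twice.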
Security
- Authorization: DescribeQuotas requires Describe on Cluster
- Authorization: AlterQuotas requires Alter on Cluster (admin only)
- Audit: All quota changes are logged for compliance