Going to Production
1. Choose the Right Backend
Picking the right storage backend is the most important production decision. Apalis supports several:
| Backend | Crate | Best For |
|---|---|---|
| PostgreSQL | apalis-postgres | Durable jobs, existing Postgres infra |
| MySQL/MariaDB | apalis-mysql | Durable jobs, existing MySQL infra |
| SQLite | apalis-sqlite | Low-traffic or single-node deployments |
| Redis | apalis-redis | High-throughput, low-latency job queues |
| AMQP | apalis-amqp | Message broker-based architectures |
| PGMQ | apalis-pgmq | Postgres-native message queues with at-least-once delivery |
| NATS | apalis-nats | Distributed messaging, cloud-native and edge deployments |
| RSMQ | apalis-rsmq | Redis-backed simple message queues |
| Cron | apalis-cron | Schedule-driven jobs — use with pipe_to for persistence |
For most production systems, PostgreSQL or Redis is the recommended choice. SQLite should be avoided for multi-node or high-concurrency deployments: it serializes writes, so it cannot deliver the concurrent write throughput those workloads need.
PostgreSQL is preferred when job durability, transactional guarantees, and the ability to query job state directly from your main database are priorities. Redis is preferred when raw throughput and minimal latency matter most.
Setting Up PostgreSQL Storage
```toml
# Cargo.toml
[dependencies]
apalis = { version = "1.0.0-rc.4" }
apalis-postgres = { version = "1.0.0-rc.4" }
```

```rust
use apalis_sql::postgres::PostgresStorage;
use sqlx::PgPool;

let pool = PgPool::connect(&database_url).await?;

// Run migrations to create the jobs table
PostgresStorage::setup(&pool).await?;

let storage = PostgresStorage::new(pool);
```
2. Build for Production
Always build in release mode for production. Debug builds are significantly slower due to the lack of optimizations.
```bash
cargo build --release
```
For smaller binary sizes, add the following to your Cargo.toml:
```toml
[profile.release]
opt-level = 3
lto = true          # Link-time optimization
codegen-units = 1   # Better optimization at the cost of compile time
strip = true        # Strip debug symbols from the binary
```
These settings can reduce binary size by 30–60% and meaningfully improve runtime performance.
3. Configuration & Environment
Never hardcode credentials or connection strings. Use environment variables for all runtime configuration.
```rust
use std::env;

let database_url = env::var("DATABASE_URL")
    .expect("DATABASE_URL must be set");
let redis_url = env::var("REDIS_URL")
    .expect("REDIS_URL must be set");
```
A minimal .env file for reference (do not commit this to version control):
```bash
DATABASE_URL=postgres://user:password@host:5432/mydb
REDIS_URL=redis://:password@host:6379
RUST_LOG=info
WORKER_CONCURRENCY=10
```
Use a crate like dotenvy to load .env files in non-containerized environments:
```toml
dotenvy = "0.15"
```
```rust
dotenvy::dotenv().ok(); // Load .env if present, silently skip if not
```
4. Concurrency & Worker Tuning
Concurrency controls how many jobs a single worker processes simultaneously. Setting it too low wastes resources; too high can overwhelm your database, downstream APIs, or hit memory limits.
```rust
WorkerBuilder::new("email-worker")
    .parallelize(tokio::spawn) // Process jobs in parallel with tokio::spawn
    .concurrency(10)           // Process up to 10 jobs concurrently
    .backend(storage)
    .build_fn(send_email)
```
Recommended starting points:
- CPU-bound jobs (image processing, encoding): set concurrency to the number of CPU cores (`num_cpus::get()`)
- I/O-bound jobs (HTTP calls, DB writes, emails): set concurrency to 10–50 or more, depending on downstream capacity
- Rate-limited jobs (third-party APIs): use the `RateLimitLayer` (see section 8) rather than just limiting concurrency
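The core-count heuristic above can be sketched with the standard library alone; `default_concurrency` and the I/O multiplier are illustrative, not an apalis API. `std::thread::available_parallelism` is a stdlib alternative to the `num_cpus` crate:

```rust
use std::thread;

// Illustrative helper: derive a starting concurrency from the workload class.
fn default_concurrency(cpu_bound: bool) -> usize {
    let cores = thread::available_parallelism()
        .map(|n| n.get())
        .unwrap_or(1);
    if cpu_bound {
        cores // CPU-bound: one job per core
    } else {
        cores * 8 // I/O-bound: oversubscribe, since jobs mostly wait on the network
    }
}
```

Treat the I/O multiplier as a starting point to tune against downstream capacity, not a fixed rule.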
You can make concurrency configurable via an environment variable:
```rust
let concurrency: usize = env::var("WORKER_CONCURRENCY")
    .unwrap_or_else(|_| "10".to_string())
    .parse()
    .expect("WORKER_CONCURRENCY must be a number");
```
5. Graceful Shutdown
Apalis's Monitor supports graceful shutdown out of the box. It waits for in-progress jobs to complete before exiting, preventing data loss or incomplete operations on SIGTERM.
```rust
use apalis::prelude::*;
use tokio::signal;

Monitor::new()
    .register(|run_id| {
        WorkerBuilder::new("email-worker")
            .backend(storage)
            .parallelize(tokio::spawn)
            .concurrency(10)
            .build_fn(send_email)
    })
    .on_event(|e| tracing::info!("{e}"))
    .shutdown_timeout(std::time::Duration::from_secs(30)) // Wait up to 30s for jobs to finish
    .run_with_signal(signal::ctrl_c()) // Gracefully stop on Ctrl+C / SIGINT
    .await?;
```
In containerized environments, also handle SIGTERM (what Kubernetes sends on pod termination):
```rust
use tokio::signal::unix::{signal, SignalKind};

async fn shutdown_signal() {
    let mut sigterm = signal(SignalKind::terminate()).unwrap();
    let mut sigint = signal(SignalKind::interrupt()).unwrap();
    tokio::select! {
        _ = sigterm.recv() => tracing::info!("Received SIGTERM"),
        _ = sigint.recv() => tracing::info!("Received SIGINT"),
    }
}

Monitor::new()
    // ...
    .run_with_signal(shutdown_signal())
    .await?;
```
Set your container's `terminationGracePeriodSeconds` in Kubernetes to be longer than your `shutdown_timeout` to allow jobs to finish cleanly.
6. Error Handling & Retries
Production jobs will fail. Design for it explicitly.
Return Errors from Job Handlers
```rust
use apalis::prelude::*;

async fn send_email(job: Email, _: Data<()>) -> Result<(), BoxDynError> {
    smtp_client.send(&job.to, &job.body).await?;
    Ok(())
}
```
Returning `Err(...)` marks the job as failed and triggers retry logic if configured.
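A dedicated error type makes failure classes explicit in logs and retry decisions. A minimal sketch (`EmailError` and its variants are illustrative, not part of apalis):

```rust
use std::fmt;

// Illustrative error type separating retryable from permanent failures.
#[derive(Debug)]
enum EmailError {
    Transient(String), // e.g. SMTP timeout; a retry may succeed
    Permanent(String), // e.g. invalid recipient; retrying cannot help
}

impl fmt::Display for EmailError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            EmailError::Transient(msg) => write!(f, "transient: {msg}"),
            EmailError::Permanent(msg) => write!(f, "permanent: {msg}"),
        }
    }
}

impl std::error::Error for EmailError {}
```

Because `BoxDynError` is a boxed `std::error::Error`, a handler can return either variant with `Err(Box::new(...))`, and a custom retry policy can inspect the variant to decide whether retrying is worthwhile.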
Add a Retry Layer
```rust
use apalis::layers::retry::{RetryLayer, RetryPolicy};
use tower::ServiceBuilder;

WorkerBuilder::new("email-worker")
    .layer(RetryLayer::new(RetryPolicy::retries(3))) // Retry up to 3 times
    .backend(storage)
    .build_fn(send_email)
```
Implement Custom Retry Logic
For more control (e.g., exponential backoff, only retry on specific errors):
```rust
use std::future::Future;
use std::pin::Pin;
use std::time::Duration;
use tower::retry::Policy;

#[derive(Clone)]
struct ExponentialBackoff { attempts: usize }

// Implements the tower 0.4 `Policy` trait; each retry doubles the delay.
impl<Req: Clone, Res, E> Policy<Req, Res, E> for ExponentialBackoff {
    type Future = Pin<Box<dyn Future<Output = Self> + Send>>;

    fn retry(&self, _req: &mut Req, result: Result<&Res, &E>) -> Option<Self::Future> {
        if result.is_err() && self.attempts < 5 {
            // Exponential backoff: 100ms, 200ms, 400ms, ...
            let delay = Duration::from_millis(100 * 2u64.pow(self.attempts as u32));
            let next = ExponentialBackoff { attempts: self.attempts + 1 };
            Some(Box::pin(async move {
                tokio::time::sleep(delay).await;
                next
            }))
        } else {
            None
        }
    }

    fn clone_request(&self, req: &Req) -> Option<Req> {
        Some(req.clone())
    }
}
```
Catch Panics
```rust
use apalis::prelude::*;

WorkerBuilder::new("email-worker")
    .backend(storage)
    .catch_panic() // Convert panics in job handlers into job failures
    .build_fn(send_email)
```
Dead Letter Queues
Rather than discarding permanently failed jobs, consider moving them to a separate storage namespace or queue for later inspection:
```rust
// With Redis, use a dedicated namespace for the DLQ
let dlq_config = apalis_redis::Config::default()
    .set_namespace("my-app::dead-letter");
let dlq_storage = RedisStorage::new_with_config(conn.clone(), dlq_config);
```
7. Observability: Logging, Tracing & Metrics
Structured Logging with tracing
Apalis integrates natively with the tracing ecosystem. Enable tracing in your builder and configure a subscriber at startup:
```rust
use tracing_subscriber::{layer::SubscriberExt, util::SubscriberInitExt};

tracing_subscriber::registry()
    .with(tracing_subscriber::EnvFilter::new(
        std::env::var("RUST_LOG").unwrap_or_else(|_| "info".to_string()),
    ))
    .with(tracing_subscriber::fmt::layer().json()) // JSON logs for production
    .init();
```
Then enable tracing in your worker:
```rust
WorkerBuilder::new("email-worker")
    .enable_tracing() // Automatically traces each job execution
    .backend(storage)
    .build_fn(send_email)
```
Prometheus Metrics
With the prometheus feature, apalis can expose job metrics (job counts, durations, failures):
```toml
apalis = { version = "1.0.0-rc.4", features = ["prometheus"] }
```
```rust
use apalis::layers::prometheus::PrometheusLayer;
use tower::ServiceBuilder;

WorkerBuilder::new("email-worker")
    .layer(PrometheusLayer::new())
    .backend(storage)
    .build_fn(send_email)
```
Expose a /metrics endpoint using your HTTP server (e.g., Axum or Actix Web) that serves the Prometheus registry output.
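Whichever server you use, the endpoint simply returns the registry rendered in the Prometheus text exposition format. As a shape reference only (the metric name below is illustrative; the real names come from the Prometheus layer in use):

```rust
// Illustrative: the text format a /metrics endpoint returns.
// Actual metric names are defined by the Prometheus layer in use.
fn render_metrics(succeeded: u64, failed: u64) -> String {
    format!(
        "# TYPE job_total counter\n\
         job_total{{status=\"succeeded\"}} {succeeded}\n\
         job_total{{status=\"failed\"}} {failed}\n"
    )
}
```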
Sentry Integration
```toml
apalis = { version = "1.0.0-rc.4", features = ["sentry"] }
```
```rust
use sentry_tower::NewSentryLayer;
use apalis::{layers::sentry::SentryLayer, prelude::*};

WorkerBuilder::new("email-worker")
    .layer(NewSentryLayer::new_from_top())
    .layer(SentryLayer::new())
    .backend(storage)
    .build_fn(send_email)
```
8. Rate Limiting & Backpressure
Protect downstream services and comply with third-party API rate limits using the RateLimitLayer:
```toml
apalis = { version = "1.0.0-rc.4", features = ["limit"] }
tower = { version = "0.4", features = ["limit"] }
```
```rust
use std::time::Duration;

WorkerBuilder::new("sendgrid-worker")
    .rate_limit(100, Duration::from_secs(1)) // At most 100 jobs per second
    .backend(storage)
    .build_fn(send_email)
```
You can also apply a ConcurrencyLimitLayer to cap total concurrent executions:
```rust
WorkerBuilder::new("sendgrid-worker")
    .concurrency(10) // Cap total concurrent executions
    .backend(storage)
    .build_fn(send_email)
```
9. Monitoring with apalis Board
Apalis Board is an optional web UI for monitoring and managing jobs. It provides a real-time view of queued, running, failed, and completed jobs.
```toml
apalis-board = { version = "1.0.0-rc.4" }
apalis-board-api = { version = "1.0.0-rc.4" }
```
Integrate it with an Axum-based HTTP server:
```rust
use apalis_board::Board;
use axum::Router;

let api = ApiBuilder::new(Router::new())
    .register(email_store.clone())
    .build();

let router = Router::new()
    .nest("/api/v1", api)
    .fallback_service(ServeUI::new())
    .layer(Extension(broadcaster.clone()));

let listener = tokio::net::TcpListener::bind(&args.api_host).await.unwrap();
axum::serve(listener, router)
    .with_graceful_shutdown(ctrl_c().map(|_| ()))
    .await?;
```
Security note: In production, protect the apalis Board behind authentication middleware. It exposes job data and allows manual job management, so it should never be publicly accessible without authorization.
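As the simplest possible guard (a sketch only; `authorized` is illustrative, not an apalis or axum API, and a real deployment would use proper auth middleware), a shared-token check might look like:

```rust
// Illustrative guard: require a shared bearer token on every request.
fn authorized(auth_header: Option<&str>, expected_token: &str) -> bool {
    match auth_header {
        Some(h) => h == format!("Bearer {expected_token}"),
        None => false,
    }
}
```

In practice, plug an equivalent check into tower/axum middleware in front of the Board's routes and compare tokens in constant time.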
10. Scaling Workers
Horizontal Scaling
Because apalis backends (Postgres, Redis, AMQP) are distributed by design, you can run multiple worker processes or pods without any special coordination. Each worker independently polls for and claims jobs using atomic operations, so there is no double-processing.
```bash
# Run multiple worker instances pointing at the same backend
./my-worker &
./my-worker &
./my-worker &
```
Multiple Workers in One Process
You can also register multiple workers with a single Monitor to process different job types in one process:
```rust
Monitor::new()
    .register(
        WorkerBuilder::new("email-worker")
            .concurrency(20)
            .backend(email_storage)
            .build_fn(send_email)
    )
    .register(
        WorkerBuilder::new("report-worker")
            .concurrency(5)
            .backend(report_storage)
            .build_fn(generate_report)
    )
    .run_with_signal(shutdown_signal())
    .await?;
```
Worker Naming
Give each worker a unique, descriptive name. This name is used in monitoring and logs, so a name like "email-worker-us-east" is more useful than "worker-1".
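One way to get such names without hardcoding them is to derive the suffix from deployment metadata. A sketch (`worker_name` is illustrative, and the REGION/HOSTNAME variables are assumptions about your environment):

```rust
use std::env;

// Illustrative: build a descriptive worker name from deployment metadata,
// falling back to "local" when neither variable is set.
fn worker_name(base: &str) -> String {
    let suffix = env::var("REGION")
        .or_else(|_| env::var("HOSTNAME"))
        .unwrap_or_else(|_| "local".to_string());
    format!("{base}-{suffix}")
}
```

The result can be passed straight to `WorkerBuilder::new(&worker_name("email-worker"))`.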
11. Deployment Patterns
Docker
A minimal production Dockerfile:
```dockerfile
FROM rust:1.80 AS builder
WORKDIR /app
COPY . .
RUN cargo build --release

FROM debian:bookworm-slim
RUN apt-get update && apt-get install -y ca-certificates libssl3 && rm -rf /var/lib/apt/lists/*
COPY --from=builder /app/target/release/my-worker /usr/local/bin/my-worker
CMD ["my-worker"]
```
Kubernetes
A basic deployment manifest:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: email-worker
spec:
  replicas: 3
  selector:
    matchLabels:
      app: email-worker
  template:
    metadata:
      labels:
        app: email-worker
    spec:
      terminationGracePeriodSeconds: 60 # Must exceed shutdown_timeout
      containers:
        - name: email-worker
          image: my-registry/my-worker:latest
          env:
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: db-secret
                  key: url
            - name: RUST_LOG
              value: "info"
            - name: WORKER_CONCURRENCY
              value: "10"
          resources:
            requests:
              memory: "128Mi"
              cpu: "100m"
            limits:
              memory: "512Mi"
              cpu: "500m"
```
Systemd (bare metal / VMs)
```ini
[Unit]
Description=My Apalis Worker
After=network.target postgresql.service

[Service]
Type=simple
User=myapp
EnvironmentFile=/etc/myapp/worker.env
ExecStart=/usr/local/bin/my-worker
Restart=always
RestartSec=5
# Send SIGTERM to the main process only
KillMode=process
# Allow up to 60s for graceful shutdown
TimeoutStopSec=60

[Install]
WantedBy=multi-user.target
```
12. Security Checklist
- Never commit secrets — use environment variables, Kubernetes Secrets, or a secrets manager (Vault, AWS Secrets Manager, etc.)
- Use TLS for all backend connections — ensure `DATABASE_URL` and `REDIS_URL` use TLS (`sslmode=require` for Postgres, `rediss://` for Redis)
- Restrict backend access — workers should connect to the database/Redis from a private network, not a public endpoint
- Protect apalis Board — put it behind authentication (e.g., HTTP Basic Auth, OAuth, or an internal-only network)
- Validate job payloads — treat incoming job data as untrusted input; use `serde` validation and reject malformed payloads
- Set resource limits — apply memory and CPU limits in Docker/Kubernetes to prevent a runaway job from taking down the host
13. Production Checklist
Before going live, verify all of the following:
- Release build compiled with `--release`
- Production backend chosen (Postgres or Redis recommended)
- Database migrations run (`PostgresStorage::setup(&pool).await?`)
- All configuration via environment variables (no hardcoded secrets)
- Graceful shutdown configured with appropriate timeout
- Error handling: all job handlers return `Result`
- Retry policy configured for transient failures
- Dead letter queue or failed job visibility strategy defined
- `tracing` / structured logging initialized with JSON format
- Metrics exposed (Prometheus) and dashboards created
- Rate limiting applied for jobs calling external APIs
- Worker concurrency tuned for your workload type
- apalis Board deployed (if used) and protected behind auth
- Container image built with a minimal base image
- `terminationGracePeriodSeconds` exceeds `shutdown_timeout` in Kubernetes
- Load tested: backend can handle expected job throughput
- Alerting set up on job failure rate and queue depth
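The last item can be made concrete with a simple alert condition; the function name and threshold below are illustrative:

```rust
// Illustrative alert condition: fire when the failure rate over a window
// exceeds a threshold, e.g. should_alert(failed, total, 0.05) for 5%.
fn should_alert(failed: u64, total: u64, max_failure_rate: f64) -> bool {
    total > 0 && (failed as f64 / total as f64) > max_failure_rate
}
```

Feed it counters from your Prometheus metrics over a sliding window, and pair it with a separate alert on absolute queue depth so a stalled worker fleet is also caught.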