
Fila

A message broker that makes fair scheduling and per-key throttling first-class primitives.

The problem

Most brokers deliver messages in strict FIFO order. When multiple tenants, customers, or workload types share a queue, a single noisy producer can starve everyone else. Rate limiting is pushed to the consumer — which means the consumer has to fetch a message, check the limit, and re-enqueue it if the limit is exceeded. That wastes work and adds latency.

Fila moves scheduling decisions into the broker:

  • Deficit Round Robin (DRR) fair scheduling — each fairness key gets its fair share of delivery bandwidth. No tenant starves another.
  • Token bucket throttling — per-key rate limits enforced at the broker, before delivery. Consumers only receive messages that are ready to process.
  • Lua rules engine — on_enqueue and on_failure hooks let you define scheduling policy (assign fairness keys, set weights, decide retry vs. dead-letter) in user-supplied Lua scripts.
  • Zero wasted work — consumers never receive a message they can’t act on.

Quickstart

Docker

docker run -p 5555:5555 ghcr.io/faiscadev/fila:dev

Install script

curl -fsSL https://raw.githubusercontent.com/faiscadev/fila/main/install.sh -o install.sh
less install.sh
bash install.sh
fila-server

Cargo

cargo install fila-server fila-cli
fila-server

Try it out

Once the broker is running on localhost:5555:

# Create a queue
fila queue create orders

# Enqueue a message
fila enqueue orders --header tenant=acme --payload hello

# Check queue stats
fila queue inspect orders

Core Concepts

This document explains the key concepts behind Fila’s scheduling and message handling.

Message lifecycle

A message moves through these states:

Producer                      Broker                        Consumer
   |                            |                              |
   |-- Enqueue ----------------->|                              |
   |                            |-- on_enqueue (Lua) --------->|
   |                            |   assigns fairness_key,      |
   |                            |   weight, throttle_keys      |
   |                            |                              |
   |                            |-- Stored (pending) --------->|
   |                            |                              |
   |                            |-- DRR scheduler picks ------>|
   |                            |   checks throttle tokens     |
   |                            |                              |
   |                            |-- Consume (leased) -------->|-- Processing
   |                            |                              |
   |                            |<-------- Ack ----------------|  (success)
   |                            |   message deleted            |
   |                            |                              |
   |                            |<-------- Nack ---------------|  (failure)
   |                            |-- on_failure (Lua) --------->|
   |                            |   retry or dead-letter       |
   |                            |                              |
   |                            |-- Visibility timeout ------->|
   |                            |   re-enqueue if not acked    |

  1. Enqueue — producer sends a message to a queue. If the queue has an on_enqueue Lua script, it runs to assign scheduling metadata.
  2. Pending — the message is persisted in RocksDB and indexed by fairness key.
  3. Scheduled — the DRR scheduler picks the next fairness key and checks throttle tokens. If tokens are available, the message is delivered to a waiting consumer.
  4. Leased — the consumer is processing the message. A visibility timeout timer starts.
  5. Acked — the consumer confirms success. The message is deleted.
  6. Nacked — the consumer reports failure. The on_failure hook decides: retry (re-enqueue) or dead-letter.
  7. Expired — if the visibility timeout fires before ack/nack, the message is automatically re-enqueued.

Fairness groups

Every message belongs to a fairness group identified by its fairness_key. The key is assigned during enqueue — either by an on_enqueue Lua script or defaulting to "default".

Common fairness key strategies:

  • Per-tenant: msg.headers["tenant_id"] — prevents one tenant from monopolizing the queue
  • Per-customer: msg.headers["customer_id"] — fair delivery across customers
  • Per-priority: msg.headers["priority"] — combined with weights for priority scheduling

Deficit Round Robin (DRR)

Fila uses the DRR algorithm to schedule delivery across fairness groups:

  1. Each fairness key has a deficit counter (starts at 0) and a weight (default 1).
  2. In each scheduling round, every key receives weight * quantum additional deficit.
  3. The scheduler delivers messages from a key as long as its deficit is positive, decrementing by 1 per delivery.
  4. When a key’s deficit reaches 0 or it has no pending messages, the scheduler moves to the next key.

Example: Two tenants with equal weight and quantum=1000. Each gets 1000 deficit per round — the scheduler delivers ~1000 messages from tenant A, then ~1000 from tenant B, then back to A. A noisy tenant sending 100x more messages doesn’t starve the quiet tenant.

Weights: A key with weight=3 gets 3x the deficit of a key with weight=1, so it receives ~3x the delivery bandwidth. Use weights for priority lanes.
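The four steps above can be sketched in a few lines of Python. This is a simplified model for illustration, not Fila's implementation (the function and variable names here are made up):

```python
from collections import deque

def drr_schedule(queues, weights, quantum, rounds):
    """Simplified DRR over fairness keys: queues maps key -> deque of messages."""
    deficits = {key: 0 for key in queues}
    delivered = []
    for _ in range(rounds):
        for key in queues:
            if not queues[key]:
                continue
            # Step 2: each key earns weight * quantum deficit per round
            deficits[key] += weights.get(key, 1) * quantum
            # Step 3: deliver while deficit is positive, decrementing 1 per message
            while deficits[key] > 0 and queues[key]:
                delivered.append((key, queues[key].popleft()))
                deficits[key] -= 1
            if not queues[key]:
                deficits[key] = 0  # empty queue: reset its deficit
    return delivered

# A noisy tenant (100 pending) and a quiet one (5 pending), quantum=3:
queues = {"noisy": deque(range(100)), "quiet": deque(range(5))}
order = drr_schedule(queues, {"noisy": 1, "quiet": 1}, quantum=3, rounds=2)
# Delivery interleaves: 3 noisy, 3 quiet, 3 noisy, 2 quiet — the quiet
# tenant is never stuck behind the noisy tenant's backlog.
```

Running the model shows why a backlog doesn't translate into starvation: each round, every key with pending messages gets its quantum regardless of how deep the other keys' backlogs are.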

Token bucket throttling

Fila supports per-key rate limiting via token bucket throttlers. Each throttle key has:

  • rate — tokens refilled per second
  • burst — maximum tokens the bucket can hold

When the scheduler is about to deliver a message, it checks all of the message’s throttle_keys. If any bucket is empty, the message is held until tokens refill. The consumer never receives a message it would have to reject for rate limiting.
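The refill-and-check logic can be modeled in Python. A simplified sketch of the behavior described above, not the broker's code:

```python
import time

class TokenBucket:
    """Per-key token bucket: `rate` tokens/sec refill, at most `burst` held."""
    def __init__(self, rate, burst, now=None):
        self.rate = rate
        self.burst = burst
        self.tokens = burst  # start full
        self.last = time.monotonic() if now is None else now

    def try_consume(self, now=None):
        now = time.monotonic() if now is None else now
        # Refill in proportion to elapsed time, capped at burst
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # bucket empty: hold the message until tokens refill

# rate=10/s, burst=20: the first 20 deliveries drain the burst,
# then delivery proceeds at roughly 10 messages per second.
bucket = TokenBucket(rate=10, burst=20, now=0.0)
```

The broker performs the equivalent of try_consume for each of the message's throttle_keys before delivery; a False on any key holds the message.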

Setting up throttle rates

Throttle rates are managed via runtime configuration:

# Allow 10 requests/second with burst of 20 for the "api" throttle key
fila config set throttle:api:rate 10
fila config set throttle:api:burst 20

Messages are assigned throttle keys in the on_enqueue Lua hook:

function on_enqueue(msg)
  return {
    fairness_key = msg.headers["tenant"],
    throttle_keys = { msg.headers["api_endpoint"] }
  }
end

Lua hooks

Fila embeds a Lua 5.4 runtime for user-defined scheduling policy. Scripts run inside a sandbox with configurable timeouts and memory limits.

on_enqueue

Runs when a message is enqueued. Returns scheduling metadata:

function on_enqueue(msg)
  -- msg.headers       — table of string key-value pairs
  -- msg.payload_size  — byte count of the payload
  -- msg.queue         — queue name

  return {
    fairness_key = msg.headers["tenant"] or "default",
    weight = tonumber(msg.headers["priority"]) or 1,
    throttle_keys = { msg.headers["endpoint"] }
  }
end

Return fields:

Field         | Type            | Default   | Description
fairness_key  | string          | "default" | Groups the message for DRR scheduling
weight        | number          | 1         | DRR weight for this fairness key
throttle_keys | list of strings | []        | Token bucket keys to check before delivery

on_failure

Runs when a consumer nacks a message. Decides retry vs. dead-letter:

function on_failure(msg)
  -- msg.headers   — table of string key-value pairs
  -- msg.id        — message UUID
  -- msg.attempts  — current attempt count
  -- msg.queue     — queue name
  -- msg.error     — error description from the nack

  if msg.attempts >= 3 then
    return { action = "dlq" }
  end
  return { action = "retry", delay_ms = 1000 * msg.attempts }
end

Return fields:

Field    | Type              | Description
action   | "retry" or "dlq"  | Whether to re-enqueue or dead-letter
delay_ms | number (optional) | Delay before re-enqueue (retry only)

Lua API

Scripts can read runtime configuration from the broker:

local limit = fila.get("rate_limit:tenant_a")  -- returns string or nil

Safety

Setting                         | Default | Description
lua.default_timeout_ms          | 10      | Max script execution time
lua.default_memory_limit_bytes  | 1 MB    | Max memory per script
lua.circuit_breaker_threshold   | 3       | Consecutive failures before circuit break
lua.circuit_breaker_cooldown_ms | 10000   | Cooldown period after circuit break

When the circuit breaker trips, Lua hooks are bypassed and messages use default scheduling (fairness_key="default", weight=1, no throttle keys). The circuit breaker resets automatically after the cooldown period.
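The trip/cooldown cycle can be sketched as follows — a hypothetical Python model of the behavior just described, using the defaults from the settings table (not Fila's internals):

```python
class LuaCircuitBreaker:
    """Bypass Lua hooks after repeated failures; retry after a cooldown."""
    def __init__(self, threshold=3, cooldown_ms=10_000):
        self.threshold = threshold
        self.cooldown_ms = cooldown_ms
        self.failures = 0
        self.tripped_at = None  # timestamp (ms) when the breaker tripped

    def record(self, success, now_ms):
        if success:
            self.failures = 0
        else:
            self.failures += 1
            if self.failures >= self.threshold:
                self.tripped_at = now_ms  # trip: messages fall back to defaults

    def hooks_enabled(self, now_ms):
        if self.tripped_at is None:
            return True
        if now_ms - self.tripped_at >= self.cooldown_ms:
            # Cooldown elapsed: reset automatically and run hooks again
            self.tripped_at = None
            self.failures = 0
            return True
        return False
```

While hooks_enabled returns False, enqueues would use the default scheduling metadata (fairness_key="default", weight=1, no throttle keys).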

Dead letter queue

Messages that exhaust retries (when on_failure returns { action = "dlq" }) are moved to a dead letter queue named <queue>.dlq. For example, messages dead-lettered from orders go to orders.dlq.

Inspecting and redriving

# Check how many messages are in the DLQ
fila queue inspect orders.dlq

# Move 10 messages back to the source queue
fila redrive orders.dlq --count 10

Redrive moves pending (non-leased) messages from the DLQ back to the original source queue, where they go through the normal enqueue flow again.

Runtime configuration

The broker maintains a key-value configuration store that persists across restarts. Values are accessible from Lua scripts via fila.get(key) and managed through the CLI or API.

fila config set feature:new_flow enabled
fila config get feature:new_flow
fila config list --prefix feature:

Common use cases:

  • Feature flags: toggle behavior in Lua scripts without redeployment
  • Throttle rates: throttle:<key>:rate and throttle:<key>:burst
  • Dynamic routing: change fairness key assignment logic based on config values

Visibility timeout

When a consumer receives a message via Consume, the message is “leased” for a configurable duration (set per-queue at creation time via visibility_timeout_ms). During this lease:

  • The message is not delivered to other consumers
  • A timer tracks the lease expiry

If the consumer does not Ack or Nack the message before the timeout expires, the message is automatically re-enqueued and becomes available for delivery again. This prevents messages from being lost when consumers crash.

The default visibility timeout is set per-queue at creation:

fila queue create orders --visibility-timeout 30000  # 30 seconds
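The lease bookkeeping can be modeled in a few lines of Python — an illustrative sketch of the lifecycle above, not the broker's code:

```python
class LeaseTracker:
    """Tracks leased messages and re-enqueues them when the lease expires."""
    def __init__(self, visibility_timeout_ms):
        self.timeout = visibility_timeout_ms
        self.leases = {}   # msg_id -> expiry time (ms)
        self.pending = []  # message ids re-enqueued after lease expiry

    def lease(self, msg_id, now_ms):
        # Consume: message hidden from other consumers until expiry
        self.leases[msg_id] = now_ms + self.timeout

    def ack(self, msg_id):
        self.leases.pop(msg_id, None)  # success: message deleted

    def tick(self, now_ms):
        # Expire leases whose timer fired: re-enqueue the message
        for msg_id, expiry in list(self.leases.items()):
            if now_ms >= expiry:
                del self.leases[msg_id]
                self.pending.append(msg_id)

tracker = LeaseTracker(visibility_timeout_ms=30_000)
tracker.lease("m1", now_ms=0)
tracker.tick(now_ms=31_000)  # consumer crashed: m1 becomes available again
```

An acked message never reaches pending; an unacked one reappears after the timeout, which is why processing should be idempotent.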

Tutorials

Step-by-step guides for common Fila use cases. Each tutorial assumes you have a running broker (see quickstart).

Multi-tenant fair scheduling

Goal: Prevent a noisy tenant from starving other tenants in a shared queue.

1. Create a queue with tenant-aware fairness

fila queue create orders \
  --on-enqueue 'function on_enqueue(msg)
    return { fairness_key = msg.headers["tenant_id"] or "default" }
  end'

The on_enqueue hook extracts a tenant_id header and uses it as the fairness key. Each unique tenant gets its own DRR scheduling group.

2. Produce messages from multiple tenants

# Python SDK
from fila import FilaClient

client = FilaClient("localhost:5555")

# Noisy tenant sends 1000 messages
for i in range(1000):
    client.enqueue("orders", {"tenant_id": "noisy-corp"}, f"order-{i}")

# Other tenants send a few each
for tenant in ["acme", "globex", "initech"]:
    for i in range(10):
        client.enqueue("orders", {"tenant_id": tenant}, f"{tenant}-order-{i}")

3. Consume and observe fairness

stream = client.consume("orders")
for msg in stream:
    print(f"tenant={msg.metadata.fairness_key} id={msg.id}")
    client.ack("orders", msg.id)

Without Fila, all 1000 noisy-corp messages would be delivered first. With DRR scheduling, each tenant gets interleaved delivery — acme, globex, and initech messages arrive alongside noisy-corp’s, not after.

4. Verify with stats

fila queue inspect orders

The per-key breakdown shows each tenant’s pending count and current DRR deficit.

Weighted fairness

Give premium tenants more bandwidth by setting weights:

fila queue create orders \
  --on-enqueue 'function on_enqueue(msg)
    local weights = { premium = 3, standard = 1 }
    local tier = msg.headers["tier"] or "standard"
    return {
      fairness_key = msg.headers["tenant_id"] or "default",
      weight = weights[tier] or 1
    }
  end'

A premium tenant with weight=3 gets 3x the delivery bandwidth of a standard tenant with weight=1.


Per-provider throttling

Goal: Rate-limit outgoing API calls per external provider without wasting consumer resources.

1. Create a queue with throttle keys

fila queue create api-calls \
  --on-enqueue 'function on_enqueue(msg)
    local keys = {}
    if msg.headers["provider"] then
      table.insert(keys, "provider:" .. msg.headers["provider"])
    end
    return {
      fairness_key = msg.headers["tenant"] or "default",
      throttle_keys = keys
    }
  end'

2. Set throttle rates

# Stripe: 100 requests/second, burst up to 150
fila config set throttle.provider:stripe 100,150

# SendGrid: 10 requests/second, burst up to 20
fila config set throttle.provider:sendgrid 10,20

The format is rate,burst. Rate is tokens per second; burst is the maximum bucket capacity.

3. Produce messages

// Go SDK
client, _ := fila.Connect("localhost:5555")

// These will be throttled to 100/s
for i := 0; i < 500; i++ {
    client.Enqueue(ctx, "api-calls", map[string]string{
        "tenant":   "acme",
        "provider": "stripe",
    }, []byte(fmt.Sprintf("charge-%d", i)))
}

4. Consume — the broker does the throttling

stream, _ := client.Consume(ctx, "api-calls")
for msg := range stream {
    // Every message received is within the rate limit.
    // No need to check limits client-side.
    callExternalAPI(msg.Payload)
    client.Ack(ctx, "api-calls", msg.ID)
}

Consumers receive messages at the provider’s rate limit. No consumer-side rate checking, no wasted fetches, no re-enqueue loops.

Adjusting rates at runtime

Change rates without restarting the broker:

# Double Stripe's rate
fila config set throttle.provider:stripe 200,300

The token bucket updates immediately.


Exponential backoff retry

Goal: Retry failed messages with increasing delays, then dead-letter after max attempts.

1. Create a queue with retry logic

fila queue create jobs \
  --on-enqueue 'function on_enqueue(msg)
    return { fairness_key = msg.headers["job_type"] or "default" }
  end' \
  --on-failure 'function on_failure(msg)
    local max_attempts = tonumber(fila.get("max_retries") or "5")
    if msg.attempts >= max_attempts then
      return { action = "dlq" }
    end
    -- Exponential backoff: 1s, 2s, 4s, 8s, 16s...
    local delay = math.min(1000 * (2 ^ (msg.attempts - 1)), 60000)
    return { action = "retry", delay_ms = delay }
  end' \
  --visibility-timeout 30000

2. Configure max retries at runtime

fila config set max_retries 3

The on_failure hook reads this with fila.get("max_retries"). Change it without redeploying.

3. Process messages with failure handling

// JavaScript SDK
const { FilaClient } = require('fila-client');

async function main() {
  const client = await FilaClient.connect('localhost:5555');
  const stream = client.consume('jobs');

  for await (const msg of stream) {
    try {
      await processJob(msg.payload);
      await client.ack('jobs', msg.id);
    } catch (err) {
      // Nack triggers on_failure hook — broker handles retry/DLQ
      await client.nack('jobs', msg.id, err.message);
    }
  }
}

main().catch(console.error);

4. Monitor and redrive

# Check how many messages ended up in the DLQ
fila queue inspect jobs.dlq

# After fixing the root cause, redrive them
fila redrive jobs.dlq --count 0  # 0 = all messages

Customizing backoff per job type

Read config to customize behavior per job type:

function on_failure(msg)
  -- Different retry strategies per job type
  local job_type = msg.headers["job_type"] or "default"
  local max = tonumber(fila.get("max_retries:" .. job_type) or "5")

  if msg.attempts >= max then
    return { action = "dlq" }
  end

  -- Critical jobs: shorter delays, more retries
  -- Batch jobs: longer delays, fewer retries
  local base_ms = tonumber(fila.get("retry_base_ms:" .. job_type) or "1000")
  local delay = math.min(base_ms * (2 ^ (msg.attempts - 1)), 300000)
  return { action = "retry", delay_ms = delay }
end

fila config set max_retries:payment 10
fila config set retry_base_ms:payment 500

fila config set max_retries:report 3
fila config set retry_base_ms:report 5000

Lua Hook Patterns

Copy-paste patterns for common scheduling scenarios. See concepts for hook API details.

Tenant fairness

Assign each tenant its own fairness group so the DRR scheduler gives equal delivery bandwidth:

function on_enqueue(msg)
  return {
    fairness_key = msg.headers["tenant_id"] or "default"
  }
end

With weighted tiers

Premium tenants get more bandwidth:

function on_enqueue(msg)
  local tier = msg.headers["tier"] or "standard"
  local weight = 1
  if tier == "premium" then weight = 3 end
  if tier == "enterprise" then weight = 5 end

  return {
    fairness_key = msg.headers["tenant_id"] or "default",
    weight = weight
  }
end

With dynamic weights from config

function on_enqueue(msg)
  local tenant = msg.headers["tenant_id"] or "default"
  local weight = tonumber(fila.get("weight:" .. tenant) or "1")

  return {
    fairness_key = tenant,
    weight = weight
  }
end

Set weights at runtime: fila config set weight:acme 5


Provider throttling

Rate-limit outbound calls per external API provider:

function on_enqueue(msg)
  local keys = {}
  if msg.headers["provider"] then
    table.insert(keys, "provider:" .. msg.headers["provider"])
  end

  return {
    fairness_key = msg.headers["tenant"] or "default",
    throttle_keys = keys
  }
end

Set rates: fila config set throttle.provider:stripe 100,200

Multi-dimensional throttling

Throttle by both provider and tenant (composite key):

function on_enqueue(msg)
  local tenant = msg.headers["tenant"] or "default"
  local provider = msg.headers["provider"]

  local keys = {}
  if provider then
    -- Global provider limit
    table.insert(keys, "provider:" .. provider)
    -- Per-tenant-per-provider limit
    table.insert(keys, "tenant-provider:" .. tenant .. ":" .. provider)
  end

  return {
    fairness_key = tenant,
    throttle_keys = keys
  }
end

# Global: Stripe allows 1000 req/s total
fila config set throttle.provider:stripe 1000,1500

# Per-tenant: each tenant gets at most 100 req/s to Stripe
fila config set throttle.tenant-provider:acme:stripe 100,150
fila config set throttle.tenant-provider:globex:stripe 100,150

Exponential backoff retry

Retry with increasing delays, dead-letter after max attempts:

function on_failure(msg)
  if msg.attempts >= 5 then
    return { action = "dlq" }
  end

  -- 1s, 2s, 4s, 8s, 16s
  local delay = math.min(1000 * (2 ^ (msg.attempts - 1)), 60000)
  return { action = "retry", delay_ms = delay }
end

With configurable max retries

function on_failure(msg)
  local max = tonumber(fila.get("max_retries") or "5")
  if msg.attempts >= max then
    return { action = "dlq" }
  end

  local delay = math.min(1000 * (2 ^ (msg.attempts - 1)), 60000)
  return { action = "retry", delay_ms = delay }
end

Change at runtime: fila config set max_retries 10

Linear backoff

function on_failure(msg)
  if msg.attempts >= 5 then
    return { action = "dlq" }
  end

  -- 5s, 10s, 15s, 20s, 25s
  return { action = "retry", delay_ms = 5000 * msg.attempts }
end

Immediate retry (no delay)

function on_failure(msg)
  if msg.attempts >= 3 then
    return { action = "dlq" }
  end
  return { action = "retry", delay_ms = 0 }
end

Header-based routing

Use headers to make dynamic scheduling decisions.

Route by priority

function on_enqueue(msg)
  local priority = msg.headers["priority"] or "normal"
  local weights = {
    critical = 10,
    high = 5,
    normal = 2,
    low = 1
  }

  return {
    fairness_key = "priority:" .. priority,
    weight = weights[priority] or 2
  }
end

Route by region

function on_enqueue(msg)
  local region = msg.headers["region"] or "default"

  return {
    fairness_key = "region:" .. region,
    throttle_keys = { "region:" .. region }
  }
end

# Rate limit per region
fila config set throttle.region:us-east 500,750
fila config set throttle.region:eu-west 300,450

Conditional dead-letter by error type

function on_failure(msg)
  -- Permanent errors: dead-letter immediately
  if msg.error:find("4%d%d") then  -- HTTP 4xx
    return { action = "dlq" }
  end

  -- Transient errors: retry with backoff
  if msg.attempts >= 5 then
    return { action = "dlq" }
  end

  local delay = 1000 * (2 ^ (msg.attempts - 1))
  return { action = "retry", delay_ms = delay }
end

Feature flag gating

function on_enqueue(msg)
  local tenant = msg.headers["tenant"] or "default"
  local new_flow = fila.get("feature:new_flow:" .. tenant)

  if new_flow == "enabled" then
    return { fairness_key = tenant .. ":v2", weight = 1 }
  end

  return { fairness_key = tenant, weight = 1 }
end

# Enable new flow for one tenant
fila config set feature:new_flow:acme enabled

Built-in helpers

Fila provides fila.helpers — a set of convenience functions for common patterns. These are available in all scripts alongside fila.get().

fila.helpers.exponential_backoff(attempts, base_ms, max_ms)

Returns a delay in milliseconds with exponential growth and ±25% jitter.

  • attempts — current attempt count
  • base_ms — base delay (first attempt delay)
  • max_ms — maximum delay cap

function on_failure(msg)
  if msg.attempts >= 5 then
    return { action = "dlq" }
  end
  return {
    action = "retry",
    delay_ms = fila.helpers.exponential_backoff(msg.attempts, 1000, 60000)
  }
end
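In Python, an equivalent computation might look like this. The exact jitter distribution is an assumption — the docs above specify only exponential growth, a cap, and ±25% jitter:

```python
import random

def exponential_backoff(attempts, base_ms, max_ms, rng=random):
    """base_ms doubles per attempt, capped at max_ms, then +/-25% jitter."""
    delay = min(base_ms * (2 ** (attempts - 1)), max_ms)
    return delay * (1 + rng.uniform(-0.25, 0.25))

# attempt 3 with base 1000ms: 4000ms nominal, returned in [3000, 5000]
```

Jitter spreads out retries so that a burst of failures doesn't produce a synchronized thundering herd on the next attempt.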

fila.helpers.tenant_route(msg, header_name)

Extracts a header value as the fairness key. Returns { fairness_key = "default" } if the header is missing.

function on_enqueue(msg)
  return fila.helpers.tenant_route(msg, "tenant_id")
end

fila.helpers.rate_limit_keys(msg, patterns)

Generates throttle key strings from patterns. Each {placeholder} is replaced by the corresponding header value. Patterns with missing headers are omitted.

function on_enqueue(msg)
  local route = fila.helpers.tenant_route(msg, "tenant_id")
  route.throttle_keys = fila.helpers.rate_limit_keys(msg, {
    "provider:{provider}",
    "region:{region}"
  })
  return route
end
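The placeholder substitution can be modeled in Python — a sketch of the described behavior (expand from headers, omit patterns with missing headers), not Fila's implementation:

```python
import re

def rate_limit_keys(headers, patterns):
    """Expand {placeholder} from headers; drop patterns with missing headers."""
    keys = []
    for pattern in patterns:
        names = re.findall(r"\{(\w+)\}", pattern)
        if all(name in headers for name in names):
            key = pattern
            for name in names:
                key = key.replace("{" + name + "}", headers[name])
            keys.append(key)
    return keys

headers = {"provider": "stripe"}  # no "region" header
print(rate_limit_keys(headers, ["provider:{provider}", "region:{region}"]))
# -> ['provider:stripe']   (the region pattern is omitted)
```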

fila.helpers.max_retries(attempts, max)

Returns { action = "retry" } if attempts < max, otherwise { action = "dlq" }.

function on_failure(msg)
  return fila.helpers.max_retries(msg.attempts, 5)
end

Combining helpers

Helpers compose naturally. Here’s a complete on_failure script using multiple helpers:

function on_failure(msg)
  local decision = fila.helpers.max_retries(msg.attempts, 5)
  if decision.action == "retry" then
    decision.delay_ms = fila.helpers.exponential_backoff(msg.attempts, 1000, 60000)
  end
  return decision
end

SDK Quick Start

Get up and running with Fila in your language of choice. Each section covers installation, connection, and the core enqueue/consume/ack flow.

Prerequisites

# Start a Fila broker
fila-server &

# Create a queue
fila queue create demo

Rust

cargo add fila-sdk tokio tokio-stream

use fila_sdk::FilaClient;
use std::collections::HashMap;
use tokio_stream::StreamExt;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = FilaClient::connect("http://localhost:5555").await?;

    // Enqueue a message
    let mut headers = HashMap::new();
    headers.insert("tenant".to_string(), "acme".to_string());
    let id = client.enqueue("demo", headers, b"hello".to_vec()).await?;
    println!("enqueued: {id}");

    // Consume messages
    let mut stream = client.consume("demo").await?;
    if let Some(msg) = stream.next().await {
        let msg = msg?;
        println!("received: {}", String::from_utf8_lossy(&msg.payload));

        // Acknowledge
        client.ack("demo", &msg.id).await?;
    }

    Ok(())
}

Go

go get github.com/faiscadev/fila-go

package main

import (
    "context"
    "fmt"
    "log"

    fila "github.com/faiscadev/fila-go"
)

func main() {
    client, err := fila.Connect("localhost:5555")
    if err != nil {
        log.Fatal(err)
    }
    defer client.Close()

    ctx := context.Background()

    // Enqueue
    id, err := client.Enqueue(ctx, "demo",
        map[string]string{"tenant": "acme"}, []byte("hello"))
    if err != nil {
        log.Fatal(err)
    }
    fmt.Println("enqueued:", id)

    // Consume
    stream, err := client.Consume(ctx, "demo")
    if err != nil {
        log.Fatal(err)
    }

    msg, err := stream.Recv()
    if err != nil {
        log.Fatal(err)
    }
    fmt.Printf("received: %s\n", msg.Payload)

    // Ack
    if err := client.Ack(ctx, "demo", msg.ID); err != nil {
        log.Fatal(err)
    }
}

Python

pip install fila

import asyncio
from fila import FilaClient

async def main():
    client = await FilaClient.connect("localhost:5555")

    # Enqueue
    msg_id = await client.enqueue("demo",
        headers={"tenant": "acme"}, payload=b"hello")
    print(f"enqueued: {msg_id}")

    # Consume
    async for msg in client.consume("demo"):
        print(f"received: {msg.payload}")
        await client.ack("demo", msg.id)
        break

asyncio.run(main())

JavaScript / Node.js

npm install fila-client

const { FilaClient } = require('fila-client');

async function main() {
    const client = await FilaClient.connect('localhost:5555');

    // Enqueue
    const id = await client.enqueue('demo',
        { tenant: 'acme' }, Buffer.from('hello'));
    console.log('enqueued:', id);

    // Consume
    const stream = client.consume('demo');
    for await (const msg of stream) {
        console.log('received:', msg.payload.toString());
        await client.ack('demo', msg.id);
        break;
    }
}

main().catch(console.error);

Ruby

gem install fila-client

require 'fila'

client = Fila::Client.new('localhost:5555')

# Enqueue
id = client.enqueue('demo',
  headers: { 'tenant' => 'acme' }, payload: 'hello')
puts "enqueued: #{id}"

# Consume
client.consume('demo') do |msg|
  puts "received: #{msg.payload}"
  client.ack('demo', msg.id)
  break
end

Java

<dependency>
    <groupId>dev.faisca</groupId>
    <artifactId>fila-client</artifactId>
    <version>0.1.0</version>
</dependency>

import dev.faisca.fila.FilaClient;
import java.util.Map;

public class Demo {
    public static void main(String[] args) throws Exception {
        var client = FilaClient.connect("localhost:5555");

        // Enqueue
        var id = client.enqueue("demo",
            Map.of("tenant", "acme"), "hello".getBytes());
        System.out.println("enqueued: " + id);

        // Consume
        var stream = client.consume("demo");
        var msg = stream.next();
        System.out.println("received: " + new String(msg.getPayload()));

        // Ack
        client.ack("demo", msg.getId());
    }
}

Security

TLS

Connect to a TLS-enabled broker by specifying the CA certificate or using system trust store.

SDK        | TLS Connection
Rust       | FilaClient::builder("https://host:5555").with_tls().connect().await?
Go         | fila.Connect("host:5555", fila.WithTLS())
Python     | FilaClient.connect("host:5555", tls=True)
JavaScript | FilaClient.connect('host:5555', { tls: true })
Ruby       | Fila::Client.new('host:5555', tls: true)
Java       | FilaClient.connect("host:5555", FilaClient.withTLS())

For custom CA certificates:

SDK        | Custom CA
Rust       | .with_tls_ca("path/to/ca.crt")
Go         | fila.WithTLSCA("path/to/ca.crt")
Python     | tls_ca="path/to/ca.crt"
JavaScript | { tlsCa: 'path/to/ca.crt' }
Ruby       | tls_ca: 'path/to/ca.crt'
Java       | FilaClient.withTLSCA("path/to/ca.crt")

API Key Authentication

SDK        | API Key
Rust       | .with_api_key("your-key")
Go         | fila.WithAPIKey("your-key")
Python     | api_key="your-key"
JavaScript | { apiKey: 'your-key' }
Ruby       | api_key: 'your-key'
Java       | FilaClient.withAPIKey("your-key")

SDK Examples

Working code for every SDK showing the core enqueue -> consume -> ack flow.

Setup

All examples assume a running broker with a queue:

fila-server &
fila queue create demo

Rust

use std::collections::HashMap;
use std::time::Duration;

use fila_sdk::FilaClient;
use tokio_stream::StreamExt;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = FilaClient::connect("http://localhost:5555").await?;

    // Enqueue
    let mut headers = HashMap::new();
    headers.insert("tenant".to_string(), "acme".to_string());
    let id = client.enqueue("demo", headers, b"hello".to_vec()).await?;
    println!("enqueued: {id}");

    // Consume
    let mut stream = client.consume("demo").await?;
    let msg = tokio::time::timeout(Duration::from_secs(5), stream.next())
        .await?                  // timeout elapsed?
        .expect("stream ended")?; // unwrap Option, then the message Result

    println!("received: {} ({})", msg.id, String::from_utf8_lossy(&msg.payload));

    // Ack
    client.ack("demo", &msg.id).await?;
    println!("acked: {}", msg.id);

    Ok(())
}

Add to Cargo.toml:

[dependencies]
fila-sdk = "0.1"
tokio = { version = "1", features = ["full"] }
tokio-stream = "0.1"

Rust with API key authentication

let opts = ConnectOptions::new("127.0.0.1:5555")
    .with_api_key("my-secret-key");
let client = FilaClient::connect_with_options(opts).await?;

Go

package main

import (
	"context"
	"fmt"
	"log"

	fila "github.com/faiscadev/fila-go"
)

func main() {
	ctx := context.Background()
	client, err := fila.Connect("localhost:5555")
	if err != nil {
		log.Fatal(err)
	}
	defer client.Close()

	// Enqueue
	id, err := client.Enqueue(ctx, "demo", map[string]string{
		"tenant": "acme",
	}, []byte("hello"))
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("enqueued:", id)

	// Consume
	stream, err := client.Consume(ctx, "demo")
	if err != nil {
		log.Fatal(err)
	}

	msg, err := stream.Next()
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("received: %s (%s)\n", msg.ID, string(msg.Payload))

	// Ack
	if err := client.Ack(ctx, "demo", msg.ID); err != nil {
		log.Fatal(err)
	}
	fmt.Println("acked:", msg.ID)
}

Python

from fila import FilaClient

client = FilaClient("localhost:5555")

# Enqueue
msg_id = client.enqueue("demo", {"tenant": "acme"}, b"hello")
print(f"enqueued: {msg_id}")

# Consume
stream = client.consume("demo")
msg = next(stream)
print(f"received: {msg.id} ({msg.payload.decode()})")

# Ack
client.ack("demo", msg.id)
print(f"acked: {msg.id}")

JavaScript / Node.js

const { FilaClient } = require('fila-client');

async function main() {
  const client = await FilaClient.connect('localhost:5555');

  // Enqueue
  const id = await client.enqueue('demo', { tenant: 'acme' }, Buffer.from('hello'));
  console.log('enqueued:', id);

  // Consume
  const stream = client.consume('demo');
  for await (const msg of stream) {
    console.log(`received: ${msg.id} (${msg.payload.toString()})`);

    // Ack
    await client.ack('demo', msg.id);
    console.log('acked:', msg.id);
    break; // just one message for the demo
  }
}

main().catch(console.error);

Ruby

require 'fila'

client = Fila::Client.new('localhost:5555')

# Enqueue
id = client.enqueue('demo', { 'tenant' => 'acme' }, 'hello')
puts "enqueued: #{id}"

# Consume
client.consume('demo') do |msg|
  puts "received: #{msg.id} (#{msg.payload})"

  # Ack
  client.ack('demo', msg.id)
  puts "acked: #{msg.id}"
  break
end

Java

import dev.faisca.fila.FilaClient;
import dev.faisca.fila.Message;

public class Demo {
    public static void main(String[] args) throws Exception {
        FilaClient client = FilaClient.connect("localhost:5555");

        // Enqueue
        String id = client.enqueue("demo",
            java.util.Map.of("tenant", "acme"),
            "hello".getBytes());
        System.out.println("enqueued: " + id);

        // Consume
        var stream = client.consume("demo");
        Message msg = stream.next();
        System.out.printf("received: %s (%s)%n", msg.getId(),
            new String(msg.getPayload()));

        // Ack
        client.ack("demo", msg.getId());
        System.out.println("acked: " + msg.getId());

        client.close();
    }
}

Integration Patterns

Common messaging patterns with Fila.

Producer / Consumer

The most basic pattern: one service produces messages, another consumes them.

┌──────────┐    enqueue    ┌──────┐    consume    ┌──────────┐
│ Producer │ ────────────► │ Fila │ ────────────► │ Consumer │
└──────────┘               └──────┘               └──────────┘

Use when: You want to decouple a producer from a consumer — the producer doesn’t need to wait for processing to complete.

# producer.py
import asyncio
from fila import FilaClient

async def produce():
    client = await FilaClient.connect("localhost:5555")
    for i in range(100):
        await client.enqueue("orders",
            headers={"tenant": "acme"},
            payload=f"order-{i}".encode())

asyncio.run(produce())
# consumer.py
import asyncio
from fila import FilaClient

async def consume():
    client = await FilaClient.connect("localhost:5555")
    async for msg in client.consume("orders"):
        print(f"processing: {msg.payload}")
        # ... do work ...
        await client.ack("orders", msg.id)

asyncio.run(consume())

Fan-out

Multiple consumers process different types of work from a shared queue. Fila’s fairness keys ensure each workload type gets its fair share.

                               ┌──────────────┐
                          ┌──► │ Consumer (A) │
┌──────────┐  enqueue     │    └──────────────┘
│ Producer │ ──────────► ┌┴─────┐
└──────────┘             │ Fila │  fairness_key routing
                         └┬─────┘
                          └──► ┌──────────────┐
                               │ Consumer (B) │
                               └──────────────┘

Use when: Different message types need fair scheduling. Without fairness keys, a burst of type-A messages would starve type-B consumers.

# Producer assigns fairness keys via Lua on_enqueue
# Queue created with:
# fila queue create work --on-enqueue '
#   function on_enqueue(msg)
#     return { fairness_key = msg.headers["type"] or "default" }
#   end
# '

# producer.py
async def produce():
    client = await FilaClient.connect("localhost:5555")
    # Type A messages (high volume)
    for i in range(1000):
        await client.enqueue("work",
            headers={"type": "email"}, payload=f"email-{i}".encode())
    # Type B messages (low volume but equally important)
    for i in range(10):
        await client.enqueue("work",
            headers={"type": "sms"}, payload=f"sms-{i}".encode())
    # Fila's DRR scheduler ensures SMS messages aren't starved by emails

Request-Reply

Implement synchronous-style request/reply over async messaging using a correlation ID and a reply queue.

┌─────────┐  enqueue(request)  ┌──────┐  consume   ┌─────────┐
│ Client  │ ─────────────────► │ Fila │ ────────► │ Service │
│         │ ◄───────────────── │      │ ◄──────── │         │
└─────────┘  consume(reply)    └──────┘  enqueue   └─────────┘
                                         (reply)

Use when: A service needs a response but you still want the decoupling and reliability benefits of message queues.

// client.go — sends request, waits for reply
package main

import (
    "context"
    "fmt"

    "github.com/google/uuid"
    fila "github.com/faiscadev/fila-go"
)

func main() {
    client, _ := fila.Connect("localhost:5555")
    ctx := context.Background()

    correlationID := uuid.New().String()
    replyQueue := "replies-" + correlationID

    // Create a temporary reply queue
    client.CreateQueue(ctx, replyQueue)
    defer client.DeleteQueue(ctx, replyQueue)

    // Send request with correlation ID
    client.Enqueue(ctx, "requests",
        map[string]string{
            "correlation_id": correlationID,
            "reply_to":       replyQueue,
        },
        []byte("what is 2+2?"))

    // Wait for reply
    stream, _ := client.Consume(ctx, replyQueue)
    reply, _ := stream.Recv()
    fmt.Printf("reply: %s\n", reply.Payload)
    client.Ack(ctx, replyQueue, reply.ID)
}
// service.go — processes requests, sends replies
package main

import (
    "context"
    "fmt"
    "log"

    fila "github.com/faiscadev/fila-go"
)

func main() {
    client, _ := fila.Connect("localhost:5555")
    ctx := context.Background()

    stream, _ := client.Consume(ctx, "requests")
    for {
        msg, err := stream.Recv()
        if err != nil {
            log.Fatal(err)
        }

        // Process request
        answer := fmt.Sprintf("answer: 4 (to: %s)", string(msg.Payload))

        // Send reply to the caller's reply queue
        replyTo := msg.Headers["reply_to"]
        client.Enqueue(ctx, replyTo,
            map[string]string{
                "correlation_id": msg.Headers["correlation_id"],
            },
            []byte(answer))

        client.Ack(ctx, "requests", msg.ID)
    }
}

Configuration Reference

Fila reads configuration from a TOML file. It searches for:

  1. fila.toml in the current working directory
  2. /etc/fila/fila.toml

If no file is found, all defaults are used. The broker runs with zero configuration.

Environment variables

| Variable | Default | Description |
|---|---|---|
| FILA_DATA_DIR | data | Path to the RocksDB data directory |
| FILA_FIBP_PORT_FILE | (none) | When set, the server writes the actual FIBP listen address to this file after binding (useful for test harnesses with port 0) |

Full configuration

[server]
listen_addr = "0.0.0.0:5555"    # FIBP listen address

[scheduler]
command_channel_capacity = 10000 # internal command channel buffer size
idle_timeout_ms = 100            # scheduler idle timeout between rounds (ms)
quantum = 1000                   # DRR quantum per fairness key per round

[lua]
default_timeout_ms = 10              # max script execution time (ms)
default_memory_limit_bytes = 1048576 # max memory per script (1 MB)
circuit_breaker_threshold = 3        # consecutive failures before circuit break
circuit_breaker_cooldown_ms = 10000  # cooldown period after circuit break (ms)

[telemetry]
otlp_endpoint = "http://localhost:4317"  # OTLP endpoint (omit to disable)
service_name = "fila"                     # OTel service name
metrics_interval_ms = 10000              # metrics export interval (ms)

# Uncomment to enable the web management GUI
# [gui]
# listen_addr = "0.0.0.0:8080"  # HTTP port for the dashboard

# FIBP transport tuning (defaults are optimized — override only if needed)
# [fibp]
# listen_addr = "0.0.0.0:5555"       # TCP listen address
# max_frame_size = 16777216           # max frame size in bytes (16 MB)
# keepalive_interval_secs = 15        # heartbeat ping interval
# keepalive_timeout_secs = 10         # timeout waiting for pong

Section reference

[server]

| Key | Type | Default | Description |
|---|---|---|---|
| listen_addr | string | "0.0.0.0:5555" | Address and port for the FIBP server |

[scheduler]

| Key | Type | Default | Description |
|---|---|---|---|
| command_channel_capacity | integer | 10000 | Size of the bounded channel between FIBP connection handlers and the scheduler loop. Increase if you see backpressure under high load. |
| idle_timeout_ms | integer | 100 | How long the scheduler waits when there's no work before checking again. Lower values reduce latency at the cost of CPU. |
| quantum | integer | 1000 | DRR quantum. Each fairness key gets weight * quantum deficit per scheduling round. Higher values mean more messages delivered per key per round (coarser interleaving). |
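The quantum semantics can be illustrated with a minimal DRR loop. This is a sketch of the classic algorithm, not Fila's internals; the function and variable names are hypothetical:

```python
from collections import deque

def drr_round(queues, deficits, weights, quantum, cost=1):
    """One DRR scheduling round: each fairness key earns weight * quantum
    deficit, then delivers messages while it can still afford them."""
    delivered = []
    for key, q in queues.items():
        deficits[key] += weights.get(key, 1) * quantum
        while q and deficits[key] >= cost:
            delivered.append(q.popleft())
            deficits[key] -= cost
        if not q:
            deficits[key] = 0  # idle keys don't accumulate credit
    return delivered

# Two keys sharing one round: with quantum=2 each key can deliver
# up to two unit-cost messages before the scheduler moves on.
queues = {"acme": deque(["a1", "a2", "a3"]), "globex": deque(["g1"])}
deficits = {"acme": 0, "globex": 0}
out = drr_round(queues, deficits, {"acme": 1, "globex": 1}, quantum=2)
print(out)  # ['a1', 'a2', 'g1']
```

A larger quantum lets each key drain more messages per visit, which raises throughput per key but makes the interleaving between keys coarser.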

[lua]

| Key | Type | Default | Description |
|---|---|---|---|
| default_timeout_ms | integer | 10 | Maximum execution time for Lua scripts. Enforced via instruction count hook (approximate). |
| default_memory_limit_bytes | integer | 1048576 (1 MB) | Maximum memory a Lua script can allocate. |
| circuit_breaker_threshold | integer | 3 | Number of consecutive Lua execution failures before the circuit breaker trips. When tripped, Lua hooks are bypassed and default scheduling is used. |
| circuit_breaker_cooldown_ms | integer | 10000 | How long to wait after the circuit breaker trips before retrying Lua execution. |

[telemetry]

Telemetry export is optional. When otlp_endpoint is omitted, the broker uses plain tracing-subscriber logging only.

| Key | Type | Default | Description |
|---|---|---|---|
| otlp_endpoint | string | (none) | OTLP endpoint for exporting traces and metrics. Example: "http://localhost:4317". |
| service_name | string | "fila" | Service name reported in OTel traces and metrics. |
| metrics_interval_ms | integer | 10000 | How often metrics are exported to the OTLP endpoint. |

[gui]

Web management GUI. Disabled by default. When enabled, serves a read-only dashboard on a separate HTTP port.

| Key | Type | Default | Description |
|---|---|---|---|
| listen_addr | string | "0.0.0.0:8080" | Address and port for the web dashboard HTTP server |

[fibp]

FIBP (Fila Binary Protocol) transport configuration. FIBP is Fila’s binary TCP protocol, using length-prefixed frames with a 6-byte header (flags, op code, correlation ID).

When [tls] is configured, FIBP connections are TLS-wrapped. When [auth] is configured, FIBP clients must send an OP_AUTH frame (containing the API key) as the first frame after handshake. Per-queue ACL checks apply to all data and admin operations.

FIBP supports all admin operations (create/delete queue, list queues, queue stats, redrive) using protobuf-encoded payloads for schema evolution.

| Key | Type | Default | Description |
|---|---|---|---|
| listen_addr | string | "0.0.0.0:5555" | TCP address for the FIBP transport |
| max_frame_size | integer | 16777216 (16 MB) | Maximum frame size in bytes. Frames exceeding this limit are rejected. |
| keepalive_interval_secs | integer | 15 | Interval between keepalive heartbeat pings. |
| keepalive_timeout_secs | integer | 10 | Timeout waiting for a keepalive pong before closing the connection. |
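To make the framing concrete, here is a sketch of encoding and decoding a length-prefixed frame. The exact field widths are not documented here, so this assumes a u32 length prefix followed by a 6-byte header laid out as 1-byte flags, 1-byte op code, and a 4-byte correlation ID, all big-endian:

```python
import struct

def encode_frame(flags, op, correlation_id, payload):
    # Assumed layout: u32 length prefix, then 6-byte header (flags, op,
    # correlation ID), then the payload bytes.
    header = struct.pack(">BBI", flags, op, correlation_id)
    return struct.pack(">I", len(header) + len(payload)) + header + payload

def decode_frame(buf):
    (length,) = struct.unpack_from(">I", buf, 0)
    flags, op, corr = struct.unpack_from(">BBI", buf, 4)
    payload = buf[10:4 + length]
    return flags, op, corr, payload

frame = encode_frame(0x01, 0x10, 42, b"hello")
print(decode_frame(frame))  # (1, 16, 42, b'hello')
```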

SDK connection

The Rust SDK connects via FIBP:

let client = FilaClient::connect("127.0.0.1:5555").await?;

// With TLS and API key
let opts = ConnectOptions::new("127.0.0.1:5555")
    .with_tls()
    .with_api_key("my-key");
let client = FilaClient::connect_with_options(opts).await?;

Notes:

  • The address is a raw TCP address (not a URL with http://).
  • TLS and API key options are set on ConnectOptions.
  • The FIBP wire format supports multi-message enqueue in a single frame.

OpenTelemetry metrics

When telemetry is enabled, Fila exports the following metrics:

| Metric | Type | Description |
|---|---|---|
| fila.messages.enqueued | Counter | Messages enqueued |
| fila.messages.delivered | Counter | Messages delivered to consumers |
| fila.messages.acked | Counter | Messages acknowledged |
| fila.messages.nacked | Counter | Messages rejected |
| fila.messages.expired | Counter | Messages expired (visibility timeout) |
| fila.messages.dead_lettered | Counter | Messages moved to DLQ |
| fila.messages.redriven | Counter | Messages redriven from DLQ |
| fila.queue.depth | Gauge | Pending messages per queue |
| fila.queue.in_flight | Gauge | Leased messages per queue |
| fila.queue.consumers | Gauge | Active consumers per queue |
| fila.queue.fairness_keys | Gauge | Active fairness keys per queue |
| fila.delivery.latency | Histogram | Time from enqueue to consumer delivery |
| fila.lua.executions | Counter | Lua script executions (by hook type and outcome) |
| fila.throttle.limited | Counter | Messages held due to throttle limits |

All queue-scoped metrics include a queue attribute.

Deployment Guide

This guide covers deploying Fila in production environments.

Docker (single node)

The quickest way to run Fila:

docker run -d \
  --name fila \
  -p 5555:5555 \
  -v fila-data:/var/lib/fila \
  ghcr.io/faiscadev/fila:latest

With a custom config file:

docker run -d \
  --name fila \
  -p 5555:5555 \
  -v fila-data:/var/lib/fila \
  -v ./fila.toml:/etc/fila/fila.toml:ro \
  ghcr.io/faiscadev/fila:latest \
  fila-server --config /etc/fila/fila.toml

systemd

For bare-metal or VM deployments on Linux.

1. Install the binary

curl -fsSL https://raw.githubusercontent.com/faiscadev/fila/main/install.sh -o install.sh
bash install.sh

2. Create system user and directories

sudo useradd --system --no-create-home --shell /usr/sbin/nologin fila
sudo mkdir -p /etc/fila /var/lib/fila
sudo chown fila:fila /var/lib/fila

3. Create configuration

Copy the example config and customize:

sudo cp deploy/fila.toml /etc/fila/fila.toml
sudo editor /etc/fila/fila.toml

See Configuration Reference for all options.

4. Install and start the service

sudo cp deploy/fila.service /etc/systemd/system/
sudo systemctl daemon-reload
sudo systemctl enable --now fila
sudo systemctl status fila

5. Verify

fila queue create test-queue
fila queue inspect test-queue

Docker Compose cluster

Run a 3-node cluster locally using Docker Compose.

1. Start the cluster

docker compose -f deploy/docker-compose.cluster.yml up -d

This starts three Fila nodes:

  • node1 — bootstrap node, port 5555
  • node2 — port 5565
  • node3 — port 5575

2. Create a queue and test

# Connect to any node
fila --addr localhost:5555 queue create orders

# Enqueue via node1, consume via node2 (transparent routing)
fila --addr localhost:5555 enqueue orders --payload "hello"
fila --addr localhost:5565 queue inspect orders

3. Stop the cluster

docker compose -f deploy/docker-compose.cluster.yml down
# Add -v to also remove data volumes

Kubernetes with Helm

Single-node deployment

helm install fila deploy/helm/fila/

Clustered deployment

helm install fila deploy/helm/fila/ \
  --set replicaCount=3 \
  --set cluster.enabled=true \
  --set cluster.replicationFactor=3 \
  --set persistence.enabled=true \
  --set persistence.size=20Gi

With TLS and authentication

# Create TLS secret first
kubectl create secret tls fila-tls \
  --cert=server.crt --key=server.key

helm install fila deploy/helm/fila/ \
  --set tls.enabled=true \
  --set tls.certSecret=fila-tls \
  --set auth.bootstrapApiKey=your-secret-key

With OpenTelemetry

helm install fila deploy/helm/fila/ \
  --set telemetry.enabled=true \
  --set telemetry.otlpEndpoint=http://otel-collector:4317

Custom values file

Create a my-values.yaml:

replicaCount: 3

cluster:
  enabled: true
  replicationFactor: 3

persistence:
  enabled: true
  size: 50Gi
  storageClass: gp3

tls:
  enabled: true
  certSecret: fila-tls

auth:
  bootstrapApiKey: "change-me-in-production"

telemetry:
  enabled: true
  otlpEndpoint: "http://otel-collector.monitoring:4317"

resources:
  requests:
    cpu: 500m
    memory: 512Mi
  limits:
    cpu: 2000m
    memory: 2Gi
helm install fila deploy/helm/fila/ -f my-values.yaml

Production checklist

Before going to production, verify these items:

  • Data persistence — data directory is on a persistent volume (not ephemeral container storage)
  • TLS enabled — all client connections encrypted; consider mTLS for service-to-service
  • API key auth — bootstrap API key set; per-queue ACLs configured for multi-tenant
  • Backups — data directory backup strategy in place (RocksDB is crash-consistent, point-in-time snapshots work)
  • Monitoring — OTel exporter configured, scraping endpoint or collector receiving metrics
  • Log level — set RUST_LOG=info (default); use debug only for troubleshooting
  • File descriptors — LimitNOFILE=65536 or higher for high-connection workloads
  • Cluster sizing — for HA, run 3+ nodes with replication_factor = 3
  • Resource limits — set CPU/memory limits appropriate for workload
  • Lua safety — review script timeout and memory limits per queue

Cluster Scaling Benchmark Methodology

This document describes how to measure Fila’s horizontal scaling characteristics using the fila-bench harness against a multi-node cluster.

Prerequisites

  • fila-server binary (one binary can be started three times with different configs)
  • fila-bench binary (from crates/fila-bench)
  • fila CLI binary (from crates/fila-cli)

Setting Up a 3-Node Local Cluster

Create three configuration files, one per node.

node1.toml

[server]
listen_addr = "127.0.0.1:5555"

[cluster]
enabled = true
node_id = 1
bind_addr = "127.0.0.1:5556"
bootstrap = true
peers = []

node2.toml

[server]
listen_addr = "127.0.0.1:5565"

[cluster]
enabled = true
node_id = 2
bind_addr = "127.0.0.1:5566"
bootstrap = false
peers = ["127.0.0.1:5556"]

node3.toml

[server]
listen_addr = "127.0.0.1:5575"

[cluster]
enabled = true
node_id = 3
bind_addr = "127.0.0.1:5576"
bootstrap = false
peers = ["127.0.0.1:5556"]

Starting the Cluster

# Terminal 1
fila-server --config node1.toml

# Terminal 2
fila-server --config node2.toml

# Terminal 3
fila-server --config node3.toml

Verify the cluster is healthy:

fila --addr 127.0.0.1:5555 queue list
# Should show "Cluster nodes: 3" at the bottom

Queue-to-Node Assignment

When a queue is created in cluster mode, Fila automatically distributes queue leadership across nodes for balanced load:

  • Preferred leader selection: each new queue’s initial leader is the node with the fewest current queue leaderships (tie-break: lowest node ID).
  • Node subset selection: when the cluster has more nodes than replication_factor (default: 3), only the N least-loaded nodes participate in each queue’s Raft group. This spreads I/O across more nodes.
  • No manual placement: operators do not need to specify which nodes a queue runs on. The cluster handles assignment automatically.
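The selection rule above can be sketched as follows. The helper name and the dict shape are illustrative, not Fila's internal API:

```python
def pick_raft_group(leadership_counts, replication_factor=3):
    """Pick the preferred leader (fewest current leaderships, lowest node ID
    on a tie) plus the N least-loaded nodes for the queue's Raft group."""
    ranked = sorted(leadership_counts, key=lambda n: (leadership_counts[n], n))
    members = ranked[:replication_factor]
    return members[0], members

# Node 2 currently leads the fewest queues, so it becomes the new
# queue's initial leader; nodes 3 and 4 round out the Raft group.
leader, group = pick_raft_group({1: 4, 2: 1, 3: 2, 4: 3})
print(leader, group)  # 2 [2, 3, 4]
```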

Configure replication_factor in the [cluster] section:

[cluster]
enabled = true
node_id = 1
replication_factor = 3  # default: 3 nodes per queue Raft group

Inspect queue leadership via fila queue inspect <name> — the “Raft leader” line shows which node currently leads each queue.

Running Scaling Benchmarks

Step 1: Baseline (Single Node)

Run fila-bench against a single standalone node to establish baseline throughput:

fila-bench --addr 127.0.0.1:5555 --category enqueue_throughput
fila-bench --addr 127.0.0.1:5555 --category consume_throughput
fila-bench --addr 127.0.0.1:5555 --category enqueue_consume_mixed

Record the messages/second for each category.

Step 2: Cluster Throughput

Create queues distributed across the cluster, then run the same benchmarks:

# Create test queues (they'll be replicated across all 3 nodes)
fila --addr 127.0.0.1:5555 queue create bench-q1
fila --addr 127.0.0.1:5555 queue create bench-q2
fila --addr 127.0.0.1:5555 queue create bench-q3

# Check leadership distribution
fila --addr 127.0.0.1:5555 queue list

Run benchmarks targeting different nodes to exercise the full cluster:

# Enqueue through node 1 (may forward to queue leaders on other nodes)
fila-bench --addr 127.0.0.1:5555 --category enqueue_throughput

# Enqueue through node 2
fila-bench --addr 127.0.0.1:5565 --category enqueue_throughput

# Mixed workload across all nodes (run in parallel)
fila-bench --addr 127.0.0.1:5555 --category enqueue_consume_mixed &
fila-bench --addr 127.0.0.1:5565 --category enqueue_consume_mixed &
fila-bench --addr 127.0.0.1:5575 --category enqueue_consume_mixed &
wait

Step 3: Measure Scaling Factor

Compare cluster throughput to single-node baseline:

scaling_factor = cluster_total_msgs_per_sec / single_node_msgs_per_sec

For a well-distributed workload across 3 queues on 3 nodes, the ideal scaling factor is 3.0x. In practice, expect 2.0x-2.5x due to Raft consensus overhead (log replication, leader forwarding).
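As a worked example (with made-up throughput numbers), the arithmetic looks like this:

```python
def scaling_factor(cluster_total_msgs_per_sec, single_node_msgs_per_sec):
    """Horizontal scaling factor relative to the single-node baseline."""
    return cluster_total_msgs_per_sec / single_node_msgs_per_sec

# A hypothetical 3-node cluster delivering 220k msg/s combined against a
# 100k msg/s single-node baseline lands at 2.2x, inside the expected
# 2.0x-2.5x range for a well-distributed workload.
print(scaling_factor(220_000, 100_000))  # 2.2
```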

What to Measure

| Metric | How | Tool |
|---|---|---|
| Enqueue throughput (msg/s) | fila-bench --category enqueue_throughput | fila-bench |
| Consume throughput (msg/s) | fila-bench --category consume_throughput | fila-bench |
| P99 enqueue latency | fila-bench --category enqueue_latency | fila-bench |
| Raft consensus overhead | Compare single-node vs cluster latency | fila-bench |
| Leadership distribution | fila queue list (LEADER column) | fila CLI |
| Failover time | Kill leader, measure time to new leader election | manual |

Interpreting Results

  • Linear scaling means the scaling factor approaches N (number of nodes). Fila achieves this when queues are distributed across different leaders.
  • Sub-linear scaling is expected for single-queue workloads since all writes go through one Raft leader regardless of cluster size.
  • Raft overhead adds ~1-3ms per write (log replication to majority). This is the cost of durability and fault tolerance.
  • Forwarding overhead adds latency when a client connects to a non-leader node. The request is transparently forwarded to the leader via the internal cluster protocol.

Notes

  • The fila-bench harness runs single-node benchmarks by default. To benchmark a cluster, point it at any cluster node’s client address.
  • Queue creation in cluster mode goes through the meta Raft group. All nodes in the cluster will create the queue locally.
  • Consumer streams are served by the queue’s Raft leader. If the leader changes (failover), consumers automatically reconnect to the new leader.

Troubleshooting

Common issues and their solutions.

Connection refused

Symptom: transport error: connection refused or failed to connect

Causes:

  • Broker not running. Start it: fila-server or docker run -p 5555:5555 ghcr.io/faiscadev/fila:latest
  • Wrong address. Default is localhost:5555. Check your connection string.
  • Firewall blocking port 5555. Verify: nc -zv localhost 5555
  • In Docker: use host.docker.internal:5555 from containers, not localhost

TLS errors

Symptom: tls handshake error, certificate verify failed, or unknown ca

Solutions:

  • Self-signed cert: Provide the CA certificate to the SDK (with_tls_ca("ca.crt"))
  • System trust store: Use with_tls() (no CA path) if the broker’s cert is issued by a public CA
  • Expired cert: Check cert expiry: openssl x509 -in server.crt -noout -enddate
  • Wrong hostname: The cert’s CN/SAN must match the hostname you’re connecting to
  • mTLS required: If the broker requires client certs, provide both client cert and key

Authentication failures

Symptom: UNAUTHENTICATED error status

Solutions:

  • Missing API key: Set api_key / with_api_key() on your client connection
  • Wrong key: Verify the key matches what’s configured on the broker
  • Key revoked: Check if the key was deleted. List keys: fila auth list
  • Permission denied (PERMISSION_DENIED): The key exists but lacks permission for the queue. Check ACLs: fila auth acl list

Queue not found

Symptom: NOT_FOUND error status on enqueue or consume

Solutions:

  • Queue doesn’t exist. Create it: fila queue create <name>
  • Typo in queue name. List queues: fila queue list
  • In clustered mode: if you get UNAVAILABLE instead of NOT_FOUND, the node is still starting up — retry after a moment

Consumer timeout

Symptom: Consumer stream hangs with no messages delivered

Causes:

  • Queue is empty. Check depth: fila queue inspect <name>
  • All messages are leased to other consumers. Check pending count in queue stats.
  • Visibility timeout expired and messages were re-queued. Increase timeout or process faster.
  • Throttle rate limit hit. Check throttle config: fila config list

Message redelivery

Symptom: Same message delivered multiple times

Causes:

  • Visibility timeout expiry: If a consumer takes too long to ack, the message is re-delivered. Solution: ack promptly, or increase the visibility timeout.
  • Nack without DLQ: If on_failure returns { action = "retry" }, the message goes back to the queue. After enough retries, consider dead-lettering.
  • Consumer crash: Leased messages are re-delivered after visibility timeout. This is by design — at-least-once delivery.

Best practice: Make consumers idempotent. Use msg.id as a deduplication key.
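A minimal sketch of that deduplication guard, using an in-memory seen-set (a production consumer would back this with a durable store such as a database or Redis):

```python
def make_idempotent(handler):
    """Wrap a message handler so redeliveries of the same msg.id are no-ops."""
    seen = set()

    def wrapped(msg_id, payload):
        if msg_id in seen:
            return "skipped"   # redelivery: already processed, just ack
        handler(payload)       # actual business logic runs exactly once
        seen.add(msg_id)
        return "processed"

    return wrapped

results = []
handle = make_idempotent(results.append)
print(handle("m1", "hello"))  # processed
print(handle("m1", "hello"))  # skipped
print(results)                # ['hello']
```

With at-least-once delivery, the second delivery of m1 is acked without re-running the handler.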

Lua script errors

Symptom: Warning logs about script failures, circuit breaker tripping

Solutions:

  • Syntax error: Validate before setting: fila queue create test --on-enqueue 'function on_enqueue(msg) return {} end'
  • Runtime error: Check logs for the specific error. Common: accessing nil headers, type mismatches.
  • Timeout: Script took too long. Default limit is 10ms. Simplify the script or increase lua.default_timeout_ms.
  • Memory limit: Script uses too much memory. Default is 1MB. Check for unbounded table growth.
  • Circuit breaker: After 3 consecutive failures, scripts are bypassed for 10 seconds. Fix the script to reset the breaker.
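The breaker behavior described above (three consecutive failures trip it, then a 10-second cooldown) can be sketched as follows. This models the documented behavior; the class and method names are hypothetical:

```python
import time

class LuaCircuitBreaker:
    """After `threshold` consecutive failures, hooks are bypassed until
    `cooldown_ms` elapses, then Lua execution is retried."""

    def __init__(self, threshold=3, cooldown_ms=10_000, clock=time.monotonic):
        self.threshold = threshold
        self.cooldown = cooldown_ms / 1000.0
        self.clock = clock
        self.failures = 0
        self.tripped_at = None

    def allow(self):
        if self.tripped_at is None:
            return True
        if self.clock() - self.tripped_at >= self.cooldown:
            self.tripped_at = None  # cooldown over: retry Lua execution
            self.failures = 0
            return True
        return False  # still tripped: bypass hooks, use default scheduling

    def record(self, ok):
        if ok:
            self.failures = 0   # any success resets the streak
        else:
            self.failures += 1
            if self.failures >= self.threshold:
                self.tripped_at = self.clock()

t = [0.0]
cb = LuaCircuitBreaker(clock=lambda: t[0])
for _ in range(3):
    cb.record(ok=False)
print(cb.allow())   # False: breaker tripped after 3 failures
t[0] += 10.0
print(cb.allow())   # True: cooldown elapsed, hooks retried
```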

High memory usage

Solutions:

  • Check queue depths: fila queue inspect <name>. Large queues consume memory.
  • RocksDB block cache can be tuned via configuration.
  • In clustered mode, each node replicates data for its assigned queues.

Cluster issues

Symptom: Node not joining cluster, split-brain, leader election failures

Solutions:

  • Node not joining: Check cluster.peers config — all nodes must list each other’s cluster addresses (port 5556, not 5555).
  • Bootstrap: Exactly one node should have bootstrap = true. Start it first.
  • Network: Verify nodes can reach each other on the cluster port: nc -zv node2 5556
  • Node ID: Each node must have a unique node_id. Duplicates cause undefined behavior.

API Reference

Fila exposes two service groups on the same port (default 5555) via FIBP (Fila Binary Protocol). Protobuf message definitions are in proto/fila/v1/.

Hot-path service (fila.v1.FilaService)

Used by producers and consumers for message operations.

Enqueue

Enqueue one or more messages. Single-message enqueue is a batch of one.

rpc Enqueue(EnqueueRequest) returns (EnqueueResponse)

Request (EnqueueRequest):

| Field | Type | Description |
|---|---|---|
| messages | repeated EnqueueMessage | One or more messages to enqueue |

EnqueueMessage:

| Field | Type | Description |
|---|---|---|
| queue | string | Queue name |
| headers | map<string, string> | Arbitrary key-value headers (accessible in Lua hooks) |
| payload | bytes | Message body |

Response (EnqueueResponse):

| Field | Type | Description |
|---|---|---|
| results | repeated EnqueueResult | One result per input message (same order) |

EnqueueResult:

| Field | Type | Description |
|---|---|---|
| message_id | string | UUID assigned (on success) |
| error | EnqueueError | Error details (on failure) |

Only one of message_id or error is set (protobuf oneof).

EnqueueErrorCode:

| Code | Description |
|---|---|
| ENQUEUE_ERROR_CODE_QUEUE_NOT_FOUND | Queue does not exist |
| ENQUEUE_ERROR_CODE_STORAGE | Storage layer error |
| ENQUEUE_ERROR_CODE_LUA | Lua hook rejected the message |
| ENQUEUE_ERROR_CODE_PERMISSION_DENIED | Caller lacks permission |
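Because each EnqueueResult carries either a message_id or an error (a protobuf oneof), a batching client should inspect results individually. A sketch of that handling, using illustrative dict shapes rather than the SDK's actual types:

```python
def partition_results(results):
    """Split per-message enqueue results into successes and failures,
    keeping the input index so failures can be retried selectively."""
    ok, failed = [], []
    for i, r in enumerate(results):
        if "message_id" in r:
            ok.append(r["message_id"])
        else:
            failed.append((i, r["error"]["code"]))
    return ok, failed

results = [
    {"message_id": "id-1"},
    {"error": {"code": "ENQUEUE_ERROR_CODE_QUEUE_NOT_FOUND"}},
]
ok, failed = partition_results(results)
print(ok, failed)  # ['id-1'] [(1, 'ENQUEUE_ERROR_CODE_QUEUE_NOT_FOUND')]
```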

StreamEnqueue

Bidirectional streaming enqueue with sequence tracking. The client sends batches of messages on the request stream and receives per-batch results on the response stream. Sequence numbers allow the client to correlate responses with requests.

rpc StreamEnqueue(stream StreamEnqueueRequest) returns (stream StreamEnqueueResponse)

Request (stream, StreamEnqueueRequest):

| Field | Type | Description |
|---|---|---|
| messages | repeated EnqueueMessage | Messages to enqueue in this batch |
| sequence_number | uint64 | Client-assigned sequence number for correlation |

Response (stream, StreamEnqueueResponse):

| Field | Type | Description |
|---|---|---|
| sequence_number | uint64 | Echoed sequence number from the request |
| results | repeated EnqueueResult | One result per input message |
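Client-side correlation over the two streams can be sketched like this. The tracker keeps in-flight batches keyed by sequence number and matches each echoed response back to its batch; the class and dict shapes are illustrative, not the SDK's wire types:

```python
class BatchTracker:
    """Track in-flight StreamEnqueue batches by client-assigned sequence
    number and resolve them when the echoed response arrives."""

    def __init__(self):
        self.next_seq = 0
        self.in_flight = {}

    def send(self, messages):
        seq = self.next_seq
        self.next_seq += 1
        self.in_flight[seq] = messages
        return {"sequence_number": seq, "messages": messages}

    def on_response(self, resp):
        # The echoed sequence number identifies which batch these
        # per-message results belong to.
        return self.in_flight.pop(resp["sequence_number"])

t = BatchTracker()
req = t.send(["m1", "m2"])
print(t.on_response({"sequence_number": req["sequence_number"]}))  # ['m1', 'm2']
```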

Consume

Open a server-streaming connection to receive messages. The broker delivers messages according to the DRR scheduler, respecting fairness groups and throttle limits.

rpc Consume(ConsumeRequest) returns (stream ConsumeResponse)

Request:

| Field | Type | Description |
|---|---|---|
| queue | string | Queue name to consume from |

Response (stream, ConsumeResponse):

| Field | Type | Description |
|---|---|---|
| messages | repeated Message | One or more delivered messages (see Message below) |

The stream stays open until the client disconnects. Messages are delivered as they become available — the stream blocks when no messages are ready.

Errors:

| Error Status | Condition |
|---|---|
| NOT_FOUND | Queue does not exist |

Ack

Acknowledge one or more messages. Removes acknowledged messages from the broker.

rpc Ack(AckRequest) returns (AckResponse)

Request (AckRequest):

| Field | Type | Description |
|---|---|---|
| messages | repeated AckMessage | One or more messages to acknowledge |

AckMessage:

| Field | Type | Description |
|---|---|---|
| queue | string | Queue name |
| message_id | string | ID of the message to acknowledge |

Response (AckResponse):

| Field | Type | Description |
|---|---|---|
| results | repeated AckResult | One result per input message |

AckResult:

| Field | Type | Description |
|---|---|---|
| success | AckSuccess | Empty message (on success) |
| error | AckError | Error details (on failure) |

Only one of success or error is set (protobuf oneof).

AckErrorCode:

| Code | Description |
|---|---|
| ACK_ERROR_CODE_MESSAGE_NOT_FOUND | Message does not exist or is not leased |
| ACK_ERROR_CODE_STORAGE | Storage layer error |
| ACK_ERROR_CODE_PERMISSION_DENIED | Caller lacks permission |

Nack

Reject one or more messages. Triggers the on_failure Lua hook (if configured) to decide retry vs. dead-letter.

rpc Nack(NackRequest) returns (NackResponse)

Request (NackRequest):

| Field | Type | Description |
|---|---|---|
| messages | repeated NackMessage | One or more messages to reject |

NackMessage:

| Field | Type | Description |
|---|---|---|
| queue | string | Queue name |
| message_id | string | ID of the message to reject |
| error | string | Error description (passed to on_failure hook as msg.error) |

Response (NackResponse):

| Field | Type | Description |
|---|---|---|
| results | repeated NackResult | One result per input message |

NackResult:

| Field | Type | Description |
|---|---|---|
| success | NackSuccess | Empty message (on success) |
| error | NackError | Error details (on failure) |

Only one of success or error is set (protobuf oneof).

NackErrorCode:

| Code | Description |
|---|---|
| NACK_ERROR_CODE_MESSAGE_NOT_FOUND | Message does not exist or is not leased |
| NACK_ERROR_CODE_STORAGE | Storage layer error |
| NACK_ERROR_CODE_PERMISSION_DENIED | Caller lacks permission |

Admin service (fila.v1.FilaAdmin)

Used by operators and the fila CLI for queue management, configuration, and diagnostics.

CreateQueue

Create a new queue with optional Lua hooks and visibility timeout.

rpc CreateQueue(CreateQueueRequest) returns (CreateQueueResponse)

Request:

| Field | Type | Description |
|---|---|---|
| name | string | Queue name |
| config | QueueConfig | Optional configuration (see below) |

QueueConfig:

| Field | Type | Default | Description |
|---|---|---|---|
| on_enqueue_script | string | (none) | Lua script run on every enqueue |
| on_failure_script | string | (none) | Lua script run on every nack |
| visibility_timeout_ms | uint64 | 30000 | Lease duration in milliseconds |

Response:

| Field | Type | Description |
|---|---|---|
| queue_id | string | Queue identifier |

Errors:

| Error Status | Condition |
|---|---|
| ALREADY_EXISTS | Queue with that name already exists |

DeleteQueue

Delete a queue and all its messages.

rpc DeleteQueue(DeleteQueueRequest) returns (DeleteQueueResponse)

Request:

| Field | Type | Description |
|---|---|---|
| queue | string | Queue name |

Errors:

| Error Status | Condition |
|---|---|
| NOT_FOUND | Queue does not exist |

ListQueues

List all queues with summary statistics.

rpc ListQueues(ListQueuesRequest) returns (ListQueuesResponse)

Response:

| Field | Type | Description |
|---|---|---|
| queues | repeated QueueInfo | List of queues |

QueueInfo:

| Field | Type | Description |
|---|---|---|
| name | string | Queue name |
| depth | uint64 | Number of pending messages |
| in_flight | uint64 | Number of leased (in-flight) messages |
| active_consumers | uint32 | Number of connected consumers |

SetConfig

Set a runtime configuration key-value pair. Persisted across restarts.

rpc SetConfig(SetConfigRequest) returns (SetConfigResponse)

Request:

| Field | Type | Description |
|---|---|---|
| key | string | Configuration key |
| value | string | Configuration value |

GetConfig

Retrieve a configuration value by key.

rpc GetConfig(GetConfigRequest) returns (GetConfigResponse)

Request:

| Field | Type | Description |
|---|---|---|
| key | string | Configuration key |

Response:

| Field | Type | Description |
|---|---|---|
| value | string | Configuration value |

Errors:

| Error Status | Condition |
|---|---|
| NOT_FOUND | Key does not exist |

ListConfig

List configuration entries, optionally filtered by prefix.

rpc ListConfig(ListConfigRequest) returns (ListConfigResponse)

Request:

| Field | Type | Description |
|---|---|---|
| prefix | string | Filter entries by key prefix (empty = all) |

Response:

| Field | Type | Description |
|---|---|---|
| entries | repeated ConfigEntry | Key-value pairs |
| total_count | uint32 | Total number of matching entries |

ConfigEntry:

| Field | Type | Description |
|---|---|---|
| key | string | Configuration key |
| value | string | Configuration value |

GetStats

Get detailed statistics for a queue, including per-fairness-key and per-throttle-key breakdowns.

rpc GetStats(GetStatsRequest) returns (GetStatsResponse)

Request:

| Field | Type | Description |
|---|---|---|
| queue | string | Queue name |

Response:

| Field | Type | Description |
|---|---|---|
| depth | uint64 | Total pending messages |
| in_flight | uint64 | Messages currently leased |
| active_fairness_keys | uint64 | Number of fairness keys with pending messages |
| active_consumers | uint32 | Connected consumers |
| quantum | uint32 | DRR quantum value |
| per_key_stats | repeated PerFairnessKeyStats | Per-fairness-key breakdown |
| per_throttle_stats | repeated PerThrottleKeyStats | Per-throttle-key breakdown |

PerFairnessKeyStats:

| Field | Type | Description |
|---|---|---|
| key | string | Fairness key |
| pending_count | uint64 | Pending messages for this key |
| current_deficit | int64 | Current DRR deficit |
| weight | uint32 | DRR weight |

PerThrottleKeyStats:

| Field | Type | Description |
|---|---|---|
| key | string | Throttle key |
| tokens | double | Current available tokens |
| rate_per_second | double | Token refill rate |
| burst | double | Maximum bucket capacity |
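The three numeric fields map directly onto a standard token bucket: tokens refill at rate_per_second up to burst, and a delivery consumes one token. A minimal sketch of that model (not Fila's implementation):

```python
class TokenBucket:
    """Tokens refill at `rate` per second up to `burst`; a delivery
    needs at least one whole token."""

    def __init__(self, rate_per_second, burst):
        self.rate = rate_per_second
        self.burst = burst
        self.tokens = burst   # start full
        self.last = 0.0

    def try_consume(self, now):
        # Refill based on elapsed time, capped at the bucket capacity.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False  # message stays queued until tokens refill

b = TokenBucket(rate_per_second=2.0, burst=2.0)
print(b.try_consume(0.0), b.try_consume(0.0), b.try_consume(0.0))  # True True False
print(b.try_consume(0.5))  # True: 0.5 s at 2 tokens/s refilled one token
```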

Errors:

| Error Status | Condition |
|---|---|
| NOT_FOUND | Queue does not exist |

Redrive

Move pending messages from a dead letter queue back to the source queue.

rpc Redrive(RedriveRequest) returns (RedriveResponse)

Request:

| Field | Type | Description |
|---|---|---|
| dlq_queue | string | DLQ name (e.g., orders.dlq) |
| count | uint64 | Maximum number of messages to redrive |

Response:

| Field | Type | Description |
|---|---|---|
| redriven | uint64 | Number of messages actually moved |

Only pending (non-leased) messages are redriven. Leased messages in the DLQ are skipped to avoid interfering with active consumers.

Errors:

| Error Status | Condition |
|---|---|
| NOT_FOUND | DLQ does not exist |

Message types

Message

The core message envelope returned by Consume.

```proto
message Message {
  string id = 1;
  map<string, string> headers = 2;
  bytes payload = 3;
  MessageMetadata metadata = 4;
  MessageTimestamps timestamps = 5;
}
```

| Field | Type | Description |
|---|---|---|
| id | string | UUID assigned at enqueue |
| headers | `map<string, string>` | Headers set by the producer |
| payload | bytes | Message body |
| metadata | MessageMetadata | Broker-assigned scheduling metadata |
| timestamps | MessageTimestamps | Lifecycle timestamps |

MessageMetadata

Scheduling metadata assigned by the broker (via Lua on_enqueue or defaults).

```proto
message MessageMetadata {
  string fairness_key = 1;
  uint32 weight = 2;
  repeated string throttle_keys = 3;
  uint32 attempt_count = 4;
  string queue_id = 5;
}
```

| Field | Type | Description |
|---|---|---|
| fairness_key | string | DRR fairness group key |
| weight | uint32 | DRR weight for this key |
| throttle_keys | repeated string | Token bucket keys checked before delivery |
| attempt_count | uint32 | Number of delivery attempts |
| queue_id | string | Queue this message belongs to |

MessageTimestamps

```proto
message MessageTimestamps {
  google.protobuf.Timestamp enqueued_at = 1;
  google.protobuf.Timestamp leased_at = 2;
}
```

| Field | Type | Description |
|---|---|---|
| enqueued_at | Timestamp | When the message was first enqueued |
| leased_at | Timestamp | When the message was last delivered to a consumer |

Benchmarks

This page presents Fila’s benchmark results: self-benchmarks measuring single-node performance, and competitive comparisons against Kafka, RabbitMQ, and NATS.

Results from commit 1e5bb0e on 2026-03-26. Run benchmarks on your own hardware for results relevant to your environment. See Reproducing results for instructions.

Self-benchmarks

Self-benchmarks measure Fila’s single-node performance across throughput, latency, scheduling, and resource usage. The benchmark suite is in crates/fila-bench/ and uses the Fila SDK as a black-box client against a real server instance.

Throughput

| Metric | Value | Unit |
|---|---|---|
| Enqueue throughput (1KB payload) | 5,051 | msg/s |
| Enqueue throughput (1KB payload) | 4.93 | MB/s |

Single producer, sustained over a 3-second measurement window after 1-second warmup.

End-to-end latency

Round-trip latency: produce a message, consume it, measure the interval. 100 samples per load level.

| Load level | Producers | p50 | p95 | p99 |
|---|---|---|---|---|
| Light | 1 | 0.00 ms | 0.00 ms | 0.00 ms |

Fair scheduling overhead

Compares throughput with DRR fair scheduling enabled vs plain FIFO delivery.

| Mode | Throughput (msg/s) |
|---|---|
| FIFO baseline | 1,307 |
| Fair scheduling (DRR) | 1,247 |
| Overhead | 3.1% |

The DRR scheduler adds minimal overhead compared to FIFO delivery (< 5% target).

Fairness accuracy

Messages enqueued across 5 fairness keys with weights 1:2:3:4:5. 2,000 messages per key (10,000 total), consuming a window of 5,000.

| Key | Weight | Expected share | Actual share | Deviation |
|---|---|---|---|---|
| tenant-1 | 1 | 6.7% | 6.7% | 0.2% |
| tenant-2 | 2 | 13.3% | 13.4% | 0.2% |
| tenant-3 | 3 | 20.0% | 20.0% | 0.1% |
| tenant-4 | 4 | 26.7% | 26.6% | 0.1% |
| tenant-5 | 5 | 33.3% | 33.3% | 0.1% |

The DRR scheduler distributes messages proportionally to weight within any delivery window. Max deviation is < 1%, well within the < 5% NFR target.

Lua script overhead

Measures per-message overhead of executing an on_enqueue Lua hook.

| Metric | Value | Unit |
|---|---|---|
| Throughput without Lua | 987 | msg/s |
| Throughput with on_enqueue hook | 943 | msg/s |
| Per-message overhead | 31.7 | us |

The Lua hook adds roughly 32 us of per-message overhead, well within the < 50 us NFR target.

Fairness key cardinality scaling

Scheduling throughput as the number of distinct fairness keys increases.

| Key count | Throughput (msg/s) |
|---|---|
| 10 | 1,479 |
| 1,000 | 818 |
| 10,000 | 509 |

Consumer concurrency scaling

Aggregate consume throughput with increasing concurrent consumer streams.

| Consumers | Throughput (msg/s) |
|---|---|
| 1 | 66 |
| 10 | 1,009 |
| 100 | 1,863 |

Memory footprint

| Metric | Value |
|---|---|
| RSS idle | 351 MB |
| RSS under load (10K messages) | 351 MB |

Memory usage is dominated by the RocksDB buffer pool, not message count.

RocksDB compaction impact

| Metric | p99 latency |
|---|---|
| Idle (no compaction) | 0.00 ms |
| Active compaction | 0.00 ms |
| Delta | < 0.39 ms |

Compaction has no measurable negative impact on tail latency in single-node benchmarks.

Batch benchmarks

Batch benchmarks measure throughput and latency of multi-message Enqueue (multiple EnqueueMessage items per EnqueueRequest) and compare it against single-message enqueue. These benchmarks are gated behind FILA_BENCH_BATCH=1 because they exercise batch-specific code paths and take additional time.

Enable with FILA_BENCH_BATCH=1:

```sh
FILA_BENCH_BATCH=1 cargo bench -p fila-bench --bench system
```

Multi-message enqueue throughput

Measures multi-message Enqueue throughput at various batch sizes with 1KB messages. Reports both messages/s and batches/s.

| Batch size | Throughput (msg/s) | Batches/s |
|---|---|---|
| 1 | | |
| 10 | | |
| 50 | | |
| 100 | | |
| 500 | | |

Batch size scaling

Measures throughput as a function of batch size (1 to 1000) to identify the point of diminishing returns.

| Batch size | Throughput (msg/s) |
|---|---|
| 1 | |
| 5 | |
| 10 | |
| 25 | |
| 50 | |
| 100 | |
| 250 | |
| 500 | |
| 1000 | |

Auto-batching latency

Measures end-to-end latency (multi-message enqueue to consume) at various producer concurrency levels. Simulates client-side auto-batching by accumulating messages and flushing via the Enqueue RPC with 50 messages per request.

| Producers | p50 | p95 | p99 | p99.9 | p99.99 | max |
|---|---|---|---|---|---|---|
| 1 | | | | | | |
| 10 | | | | | | |
| 50 | | | | | | |

Batched vs unbatched comparison

Runs identical workloads (3,000 messages) with three approaches and reports throughput and speedup ratios.

| Mode | Throughput (msg/s) | Speedup |
|---|---|---|
| Unbatched | | 1.0x |
| Explicit batch (size 100) | | —x |
| Auto-batch (size 100) | | —x |

Speedup ratios are computed relative to the unbatched baseline.

Delivery batching throughput

Measures consumer throughput with varying concurrent consumer counts. Messages are pre-loaded and continuously produced via multi-message Enqueue.

| Consumers | Throughput (msg/s) |
|---|---|
| 1 | |
| 10 | |
| 100 | |

Concurrent producer batching

Measures aggregate throughput with multiple concurrent producers all using multi-message Enqueue (batch size 100).

| Producers | Throughput (msg/s) |
|---|---|
| 1 | |
| 5 | |
| 10 | |
| 50 | |

Subsystem benchmarks

Subsystem benchmarks isolate and measure each internal component independently, bypassing the full server stack. This helps identify where time is spent and which component dominates in different workloads.

Enable with FILA_BENCH_SUBSYSTEM=1:

```sh
FILA_BENCH_SUBSYSTEM=1 cargo bench -p fila-bench --bench system
```

RocksDB raw write throughput

Measures raw put_message throughput directly against RocksDB, bypassing scheduler, FIBP, and serialization. Isolates storage engine performance.

| Payload | Throughput (ops/s) | p50 latency | p99 latency |
|---|---|---|---|
| 1KB | | | |
| 64KB | | | |

Protobuf serialization throughput

Measures protobuf encode and decode throughput for EnqueueRequest and ConsumeResponse at three payload sizes. Isolates serialization overhead.

| Payload | Encode (MB/s) | Encode (ns/msg) | Decode (ns/msg) |
|---|---|---|---|
| 64B | | | |
| 1KB | | | |
| 64KB | | | |

Reported for both EnqueueRequest (producer path) and ConsumeResponse (consumer path).

DRR scheduler throughput

Measures next_key() + consume_deficit() cycle throughput at varying active key counts. Isolates the scheduling algorithm from storage I/O.

| Active keys | Throughput (sel/s) |
|---|---|
| 10 | |
| 1,000 | |
| 10,000 | |

FIBP round-trip overhead

Measures round-trip latency for a minimal (1-byte payload) Enqueue request. Quantifies the fixed per-call overhead of FIBP framing, separate from message processing.

| Metric | Value | Unit |
|---|---|---|
| p50 latency | | us |
| p99 latency | | us |
| p99.9 latency | | us |
| Throughput | | ops/s |

Lua execution throughput

Measures on_enqueue hook execution throughput for three script complexity levels, directly against the Lua VM (no server, no FIBP).

| Script | Throughput (exec/s) | p50 | p99 |
|---|---|---|---|
| No-op (return defaults) | | | |
| Header-set (read 2 headers) | | | |
| Complex routing (string ops, conditionals, table insert) | | | |

Competitive comparison

Fila is compared against Kafka, RabbitMQ, and NATS on queue-oriented workloads. All brokers run in Docker containers and are benchmarked using native Rust clients via the bench-competitive binary. See Methodology for details.

How to run competitive benchmarks

```sh
cd bench/competitive
make bench-competitive
```

Results are written to bench/competitive/results/bench-{broker}.json.

Workloads

Each broker is tested with identical workloads using its recommended high-throughput configuration:

| Workload | Description | Batching |
|---|---|---|
| Throughput | Sustained message production rate (64B, 1KB, 64KB payloads) | Each broker’s recommended batching |
| Latency | Produce-consume round-trip (p50/p95/p99) | Unbatched |
| Lifecycle | Full enqueue-consume-ack cycle (1,000 messages) | Unbatched |
| Multi-producer | 3 concurrent producers aggregate throughput | Each broker’s recommended batching |
| Resources | CPU and memory during benchmark | |

Broker configurations

| Broker | Version | Mode | Throughput batching |
|---|---|---|---|
| Fila | latest | Docker container, DRR scheduler | AccumulatorMode::Auto (4 concurrent producers) |
| Kafka | 3.9 | KRaft (no ZooKeeper), 1 partition | linger.ms=5, batch.num.messages=1000 |
| RabbitMQ | 3.13 | Quorum queues, durable, manual ack | Per-message (no client-side batching) |
| NATS | 2.11 | JetStream, file storage, pull-subscribe | Per-message (no client-side batching) |

All competitors use production-recommended settings, not development defaults. All brokers use native Rust client libraries (rdkafka, lapin, async-nats). Throughput scenarios use each broker’s recommended batching strategy for a fair comparison. Lifecycle scenarios are unbatched for all brokers.

Run make bench-competitive on your hardware to generate comparison tables.

Results

These are reference numbers from a single run. Your results will vary by hardware. All brokers run in Docker containers. Throughput uses each broker’s recommended batching; lifecycle is unbatched.

Throughput (messages/second, batched)

| Payload | Fila | Kafka | RabbitMQ | NATS |
|---|---|---|---|---|
| 64B | | | | |
| 1KB | | | | |
| 64KB | | | | |

Previous unbatched results (Fila 2,637 msg/s vs Kafka 143,278 msg/s at 1KB) were an unfair comparison: Kafka used linger.ms=5 batching while Fila sent 1 message per RPC. The updated benchmark uses each broker’s recommended batching.

End-to-end latency (1KB payload)

| Percentile | Fila | Kafka | RabbitMQ | NATS |
|---|---|---|---|---|
| p50 | 0.92 ms | 101.62 ms | 1.46 ms | 0.29 ms |
| p95 | 2.82 ms | 105.07 ms | 3.32 ms | 0.42 ms |
| p99 | 4.79 ms | 105.30 ms | 5.59 ms | 0.79 ms |

Lifecycle throughput (enqueue + consume + ack, 1KB, unbatched)

| Broker | msg/s |
|---|---|
| NATS | 25,763 |
| Fila | 2,724 |
| RabbitMQ | 658 |
| Kafka | 356 |

Multi-producer throughput (3 producers, 1KB)

| Broker | msg/s |
|---|---|
| Kafka | 186,708 |
| NATS | 150,676 |
| RabbitMQ | 63,660 |
| Fila | 6,769 |

Resource usage

| Broker | CPU | Memory |
|---|---|---|
| NATS | 1.3% | 12 MB |
| Kafka | 2.1% | 1,276 MB |
| Fila | 3.7% | 874 MB |
| RabbitMQ | 56.8% | 654 MB |

Methodology

Measurement parameters

| Parameter | Value |
|---|---|
| Warmup period | 1 second (discarded) |
| Measurement window | 3 seconds |
| Latency samples | 100 per level |
| Runs for CI regression | 3 (median) |
| Competitive runs | 1 (relative comparison) |

Limitations

  • Single-node only. All brokers run as single instances. Clustering performance is not tested.
  • No network latency. Brokers run on localhost. Real deployments have network overhead.
  • Docker containers. All brokers run in Docker containers for a fair comparison.
  • Hardware-specific. Results will vary on different hardware. Always include hardware specs when citing numbers.

Reproducing results

Self-benchmarks:

```sh
# Build and run the full benchmark suite
cargo bench -p fila-bench --bench system

# Results written to crates/fila-bench/bench-results.json
```

Competitive benchmarks:

```sh
cd bench/competitive

# Run all brokers
make bench-competitive

# Or individual brokers
make bench-kafka
make bench-rabbitmq
make bench-nats
make bench-fila

# Clean up Docker containers
make bench-clean
```

See bench/competitive/METHODOLOGY.md for complete methodology documentation including broker configuration details and justifications.

CI regression detection

The bench-regression GitHub Actions workflow runs on every push to main and on pull requests:

  • Runs the self-benchmark suite 3 times, takes the median
  • On main pushes: saves results as the baseline
  • On PRs: compares against the baseline and flags regressions exceeding the threshold (default: 10%)
  • Results are uploaded as workflow artifacts for every run

Traceability

Results in this document are from commit 1e5bb0e (2026-03-26). Run cargo bench -p fila-bench --bench system to generate results for the current version. The JSON output includes the commit hash and timestamp for traceability.

Versioning & Compatibility Policy

Versioning

Fila uses semantic versioning (semver) for all releases:

  • MAJOR — Breaking proto/API changes. Existing SDK versions may not work.
  • MINOR — New features, backward compatible. Existing SDKs continue to work; new features require SDK update.
  • PATCH — Bug fixes only. No behavior changes.

Proto Backward Compatibility

The fila.v1 proto package follows strict additive-only rules within a MAJOR version:

  • New RPCs may be added
  • New fields may be added to existing messages
  • Existing fields are never removed, renamed, or retyped
  • Field numbers are never reused
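As an illustration of these rules (hypothetical fields, not Fila's actual protos): a later MINOR release only appends new field numbers, and a future fila.v2 package reserves any dropped numbers and names so they can never be reused with a different meaning:

```proto
// Hypothetical MINOR-version addition: a new field appended with a fresh number.
message GetConfigResponse {
  string value = 1;
  google.protobuf.Timestamp updated_at = 2; // new, backward compatible
}

// Hypothetical fila.v2 message: a removed field leaves its number and name reserved.
message ConfigEntry {
  reserved 3;
  reserved "deprecated_scope";
  string key = 1;
  string value = 2;
}
```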

A MAJOR version bump (e.g., v1 → v2) would introduce a new proto package (fila.v2) and may remove deprecated RPCs or fields.

Deprecation Policy

  • Features are deprecated with at least 1 MINOR version warning before removal
  • Deprecated RPCs and fields are documented in release notes and proto comments
  • Removal only occurs in a MAJOR version bump