
Configuration

SierraDB can be configured through command line arguments, a configuration file, and environment variables. This page describes each method and every available option.

Configuration Methods

Command Line Arguments

The simplest way to configure SierraDB:

sierradb-server \
--dir ./data \
--client-address 0.0.0.0:9090 \
--cluster-address /ip4/0.0.0.0/tcp/7890 \
--node-count 3 \
--node-index 0 \
--log info \
--config sierradb.toml \
--mdns false

Available CLI Options:

  • --dir, -d: Path to database data directory
  • --client-address: Network address for client connections
  • --cluster-address: Network address for inter-node cluster communication
  • --node-count, -n: Total number of nodes in the cluster
  • --node-index, -i: Index of this node in the cluster (0-based)
  • --log, -l: Log filter string (e.g., "info", "debug", "error")
  • --config, -c: Path to configuration file (TOML, YAML, or JSON)
  • --mdns: Enable/disable mDNS auto discovery

Configuration File

For complex setups, use a configuration file. The example below uses TOML; YAML and JSON files are also accepted:

# sierradb.toml
[bucket]
count = 4

[partition]
count = 32

[segment]
size_bytes = 268435456

[replication]
factor = 1

[network]
client_address = "0.0.0.0:9090"
cluster_address = "/ip4/0.0.0.0/tcp/0"

Load with:

sierradb-server --config sierradb.toml

Environment Variables

All configuration options can be set via environment variables:

export SIERRADB_CLIENT_ADDRESS="0.0.0.0:9090"
export SIERRADB_PARTITION_COUNT=64
export SIERRADB_BUCKET_COUNT=8

sierradb-server --dir ./data

Configuration Sections

Bucket Configuration

Controls top-level data organization and I/O parallelism:

[bucket]
count = 4 # Number of buckets (default: 4)
# ids = [0, 1, 2, 3] # Optional: explicit bucket IDs

count:

  • Default: 4
  • Purpose: Number of buckets for data partitioning
  • Impact: Affects I/O parallelism and write thread distribution
  • Recommendations: Start with 4, increase for high-core systems

ids (Optional):

  • Default: Auto-generated range from 0 to count - 1
  • Purpose: Explicit bucket ID assignment for this node
  • Use case: Custom bucket distribution in multi-node setups

Cache Configuration

Controls in-memory block cache for read performance:

[cache]
capacity_bytes = 268435456 # Cache size in bytes (default: 256MB)

capacity_bytes:

  • Default: 268435456 (256MB)
  • Purpose: Maximum memory used for caching data blocks
  • Impact: Higher values improve read performance but use more RAM
  • Recommendations: 25-50% of available system memory
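
As a rough illustration of the 25-50% guideline, the sketch below turns a memory size and a fraction into a capacity_bytes value. The function name and the 16 GiB host are illustrative assumptions, not part of SierraDB.

```python
GIB = 1024 ** 3

def cache_capacity_bytes(memory_bytes: int, fraction: float = 0.25) -> int:
    """Return a capacity_bytes value as a fraction of the given memory size."""
    if not 0.25 <= fraction <= 0.5:
        raise ValueError("fraction is outside the recommended 25-50% range")
    return int(memory_bytes * fraction)

# Hypothetical host with 16 GiB of memory
print(cache_capacity_bytes(16 * GIB))       # 4294967296 (4 GiB)
print(cache_capacity_bytes(16 * GIB, 0.5))  # 8589934592 (8 GiB)
```

The printed number can be pasted into the [cache] section as capacity_bytes.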

Directory Configuration

Specifies data storage location:

dir = "/path/to/data"        # Database data directory

dir:

  • Default: Platform-specific user data directory
  • Purpose: Root directory for all database files
  • Structure: Contains buckets/ subdirectory with segment files
  • Permissions: Must be readable/writable by SierraDB process

Heartbeat Configuration

Controls cluster node health monitoring:

[heartbeat]
interval_ms = 1000 # Heartbeat send interval (default: 1000ms)
timeout_ms = 6000 # Heartbeat timeout (default: 6000ms)

interval_ms:

  • Default: 1000 (1 second)
  • Purpose: How often nodes send heartbeat messages
  • Impact: Lower values = faster failure detection, higher network overhead

timeout_ms:

  • Default: 6000 (6 seconds)
  • Purpose: Time before considering a node unreachable
  • Constraint: Must be greater than interval_ms
  • Recommendations: 3-6x the interval value
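
The constraint and recommendation above can be checked before starting a node. The following is a minimal sketch; check_heartbeat is a hypothetical helper, not a SierraDB API, and it only encodes the two rules listed in this section.

```python
def check_heartbeat(interval_ms: int, timeout_ms: int) -> list[str]:
    """Return a list of problems found in a [heartbeat] section."""
    problems = []
    if timeout_ms <= interval_ms:
        # Hard constraint from the documentation
        problems.append("timeout_ms must be greater than interval_ms")
    elif not 3 * interval_ms <= timeout_ms <= 6 * interval_ms:
        # Soft recommendation from the documentation
        problems.append("timeout_ms is outside the recommended 3-6x interval_ms range")
    return problems

print(check_heartbeat(1000, 6000))  # defaults: []
print(check_heartbeat(1000, 800))   # violates the constraint
```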

Network Configuration

Controls client and cluster communication:

[network]
cluster_enabled = true # Enable cluster mode (default: true)
client_address = "0.0.0.0:9090" # Client connections (default)
cluster_address = "/ip4/0.0.0.0/tcp/0" # Inter-node communication (default)
mdns = false # Enable mDNS auto-discovery (default: false)

cluster_enabled:

  • Default: true
  • Purpose: Enable/disable cluster networking
  • Impact: When false, only single-node mode is supported

client_address:

  • Default: "0.0.0.0:9090"
  • Format: "host:port"
  • Purpose: Address for client connections (RESP3 protocol)
  • Examples: "127.0.0.1:9090" (localhost only), "0.0.0.0:9091" (custom port)

cluster_address:

  • Default: "/ip4/0.0.0.0/tcp/0"
  • Format: libp2p multiaddr
  • Purpose: Address for inter-node cluster communication
  • Examples: "/ip4/192.168.1.10/tcp/7890" (specific IP/port)
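
For readers unfamiliar with the multiaddr format, the sketch below splits an address into protocol/value pairs. It is a deliberate simplification that assumes every protocol carries exactly one value, which holds for the ip4/tcp addresses shown on this page but not for multiaddrs in general. split_multiaddr is a hypothetical helper.

```python
def split_multiaddr(addr: str) -> list[tuple[str, str]]:
    """Split a simple multiaddr into (protocol, value) pairs."""
    parts = addr.strip("/").split("/")
    if not addr.startswith("/") or len(parts) % 2 != 0:
        raise ValueError(f"unsupported multiaddr: {addr!r}")
    return list(zip(parts[0::2], parts[1::2]))

print(split_multiaddr("/ip4/192.168.1.10/tcp/7890"))
# [('ip4', '192.168.1.10'), ('tcp', '7890')]
print(split_multiaddr("/ip4/0.0.0.0/tcp/0"))
# [('ip4', '0.0.0.0'), ('tcp', '0')]
```

A TCP port of 0 conventionally asks the operating system to pick a free port, so a fixed port such as 7890 is likely the better choice when other nodes need to dial this address directly.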

mdns:

  • Default: false
  • Purpose: Enable automatic peer discovery using multicast DNS
  • Use case: Simplifies cluster setup in local networks
  • Security: Consider disabling in production environments

Node Configuration

Defines cluster node identity and count:

[node]
count = 3 # Total nodes in cluster (optional, auto-detected)
index = 0 # This node's index (0-based)

count (Optional):

  • Default: Auto-detected from nodes array or CLI
  • Purpose: Total number of nodes in the cluster
  • Constraints: Must be > 0, affects replication factor limits

index:

  • Default: 0 in single-node setups; must be set explicitly in multi-node clusters
  • Purpose: Zero-based index identifying this node
  • Constraints: Must be < count

Partition Configuration

Controls data distribution and write concurrency:

[partition]
count = 32 # Number of partitions (default: 32)
# ids = [0, 1, 2, ...] # Optional: explicit partition IDs

count:

  • Default: 32
  • Purpose: Number of partitions for write parallelism
  • Constraints: Must be >= number of buckets and >= number of nodes
  • Recommendations: 32-64 for moderate loads, 128+ for high throughput

ids (Optional):

  • Default: Auto-assigned based on bucket ownership
  • Purpose: Explicit partition assignment for this node
  • Use case: Custom data distribution patterns
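
The constraints listed in the Node and Partition sections interact, so it can help to check them together. The sketch below is one way to do that; check_topology is a hypothetical helper and the error strings are illustrative, not SierraDB's own messages.

```python
def check_topology(node_count: int, node_index: int,
                   bucket_count: int, partition_count: int) -> list[str]:
    """Collect violations of the documented node and partition constraints."""
    errors = []
    if node_count <= 0:
        errors.append("node.count must be > 0")
    if not 0 <= node_index < node_count:
        errors.append("node.index must be < node.count")
    if partition_count < bucket_count:
        errors.append("partition.count must be >= bucket.count")
    if partition_count < node_count:
        errors.append("partition.count must be >= node.count")
    return errors

print(check_topology(node_count=3, node_index=0, bucket_count=4, partition_count=32))
# []
print(check_topology(node_count=3, node_index=3, bucket_count=4, partition_count=2))
# three violations
```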

Replication Configuration

Controls data redundancy, consistency, and out-of-order write handling:

[replication]
factor = 3 # Number of replicas (default: min(node_count, 3))
buffer_size = 1000 # Max out-of-order writes per partition (default: 1000)
buffer_timeout_ms = 8000 # Max wait for missing sequences (default: 8000ms)
catchup_timeout_ms = 2000 # Catchup operation timeout (default: 2000ms)

factor:

  • Default: min(node_count, 3)
  • Purpose: Number of replicas for each write
  • Constraints: Cannot exceed node count
  • Recommendations: 1 (development), 3 (production)
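
The default and the constraint for factor can be written out directly. This is a sketch of the documented rule only; both function names are hypothetical.

```python
def default_replication_factor(node_count: int) -> int:
    """Documented default: min(node_count, 3)."""
    return min(node_count, 3)

def check_replication_factor(factor: int, node_count: int) -> None:
    """Raise if the factor exceeds the number of nodes."""
    if factor > node_count:
        raise ValueError(
            f"replication.factor ({factor}) cannot exceed node count ({node_count})"
        )

for nodes in (1, 2, 3, 5):
    print(nodes, default_replication_factor(nodes))
# 1 1
# 2 2
# 3 3
# 5 3
```

In other words, a single-node development setup defaults to no replication, and clusters of three or more nodes default to three replicas.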

buffer_size (Advanced):

  • Default: 1000
  • Purpose: Maximum out-of-order writes buffered per partition
  • Errors: BufferFull when exceeded, BufferEvicted for oldest entries
  • Tuning: Increase for network jitter, decrease to save memory

buffer_timeout_ms (Advanced):

  • Default: 8000 (8 seconds)
  • Purpose: Maximum wait time for missing sequence numbers
  • Behavior: Triggers garbage collection and catchup on timeout
  • Tuning: Increase for high-latency networks

catchup_timeout_ms (Advanced):

  • Default: 2000 (2 seconds)
  • Purpose: Timeout for catchup operations when sequence gaps detected
  • Impact: Shorter timeouts = quicker recovery, more aggressive retries

Segment Configuration

Controls storage file sizes and performance:

[segment]
size_bytes = 268435456 # Segment size in bytes (default: 256MB)

size_bytes:

  • Default: 268435456 (256MB)
  • Purpose: Maximum size of individual segment files
  • Constraints: Minimum 128KB, maximum 10GB
  • Impact: Larger segments = fewer files, more memory usage per read
  • Recommendations: 256MB (balanced), 512MB (high-throughput)
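
Because size_bytes is given as a raw byte count, a small conversion helper avoids arithmetic mistakes. The sketch below assumes binary units (1 MB = 1024 × 1024 bytes), which matches the documented default of 268435456 for 256MB; applying the same interpretation to the 128KB and 10GB limits is an assumption.

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB

MIN_SEGMENT_BYTES = 128 * KIB  # documented minimum, binary units assumed
MAX_SEGMENT_BYTES = 10 * GIB   # documented maximum, binary units assumed

def segment_size_bytes(mebibytes: int) -> int:
    """Convert a size in MiB to bytes and check it against the documented limits."""
    size = mebibytes * MIB
    if not MIN_SEGMENT_BYTES <= size <= MAX_SEGMENT_BYTES:
        raise ValueError(f"segment size {size} is outside the allowed range")
    return size

print(segment_size_bytes(64))   # 67108864
print(segment_size_bytes(256))  # 268435456
print(segment_size_bytes(512))  # 536870912
```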

Sync Configuration

Controls when data is flushed from memory to disk:

[sync]
interval_ms = 5 # Max time before sync (default: 5ms)
max_batch_size = 50 # Max events before sync (default: 50)
min_bytes = 4096 # Min bytes before sync (default: 4096)

interval_ms:

  • Default: 5
  • Purpose: Maximum time to wait before flushing to disk
  • Impact: Lower values = better durability, higher I/O overhead

max_batch_size:

  • Default: 50
  • Purpose: Maximum number of events to batch before flushing
  • Impact: Higher values = better throughput, increased memory usage

min_bytes:

  • Default: 4096 (4KB)
  • Purpose: Minimum bytes to accumulate before flushing
  • Impact: Ensures efficient disk writes for small events

Thread Configuration

Controls CPU resource allocation:

[threads]
read = 8 # Read thread pool size (default: cores * 2, clamped 4-32)
write = 4 # Write thread pool size (default: bucket count)

read:

  • Default: (CPU cores × 2), clamped between 4-32
  • Purpose: Thread pool size for read operations
  • Recommendations: 1-2x number of CPU cores

write:

  • Default: Number of assigned buckets
  • Purpose: Thread pool size for write operations
  • Constraints: Must evenly divide the bucket count (and therefore cannot exceed it)
  • Examples: For 8 buckets, valid values are 1, 2, 4, 8
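
The thread defaults and the write-thread constraint can be reproduced with a few lines of code. This is a sketch of the documented rules, not SierraDB's implementation; both function names are hypothetical.

```python
import os

def default_read_threads(cpu_cores=None) -> int:
    """Documented default: CPU cores x 2, clamped to the range 4-32."""
    cores = cpu_cores if cpu_cores is not None else (os.cpu_count() or 1)
    return max(4, min(32, cores * 2))

def valid_write_threads(bucket_count: int) -> list[int]:
    """All values that evenly divide the bucket count."""
    return [n for n in range(1, bucket_count + 1) if bucket_count % n == 0]

print(default_read_threads(1))   # 4  (clamped up)
print(default_read_threads(8))   # 16
print(default_read_threads(64))  # 32 (clamped down)
print(valid_write_threads(8))    # [1, 2, 4, 8]
```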

Nodes Configuration

Defines per-node configuration overrides:

[[nodes]]
# Node 0 configuration
[nodes.replication]
factor = 3

[[nodes]]
# Node 1 configuration
[nodes.network]
client_address = "0.0.0.0:9091"

nodes:

  • Default: Empty (single-node configuration)
  • Purpose: Array of node-specific configuration overrides
  • Usage: Each array element defines configuration for one node
  • Override: Settings override global defaults for specific nodes
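
One way to picture how overrides combine with global settings is a recursive merge, sketched below on plain dictionaries that mirror a parsed configuration file. This assumes that the position of an entry in the nodes array corresponds to the node index and that overrides are merged key by key; neither detail is confirmed here, and deep_merge and effective_config are hypothetical helpers.

```python
import copy

# Mirrors a config with global [replication]/[network] sections and two [[nodes]] entries
config = {
    "replication": {"factor": 1},
    "network": {"client_address": "0.0.0.0:9090", "mdns": False},
    "nodes": [
        {"replication": {"factor": 3}},                   # node 0 override
        {"network": {"client_address": "0.0.0.0:9091"}},  # node 1 override
    ],
}

def deep_merge(base: dict, override: dict) -> dict:
    """Return a copy of base with override applied recursively."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged

def effective_config(doc: dict, node_index: int) -> dict:
    """Global settings with the override for one node applied on top."""
    base = {k: v for k, v in doc.items() if k != "nodes"}
    nodes = doc.get("nodes", [])
    override = nodes[node_index] if node_index < len(nodes) else {}
    return deep_merge(base, override)

print(effective_config(config, 0)["replication"]["factor"])      # 3
print(effective_config(config, 1)["network"]["client_address"])  # 0.0.0.0:9091
print(effective_config(config, 1)["network"]["mdns"])            # False (inherited)
```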

Environment-Specific Configurations

Development Setup

Optimized for quick development:

[bucket]
count = 2

[cache]
capacity_bytes = 134217728 # 128MB

[partition]
count = 8

[segment]
size_bytes = 67108864 # 64MB

[threads]
read = 4
write = 2

[replication]
factor = 1 # No replication for dev

[sync]
interval_ms = 10
max_batch_size = 25

[network]
cluster_enabled = false # Single node
mdns = false

Production Setup

Optimized for performance and reliability:

[bucket]
count = 8

[cache]
capacity_bytes = 536870912 # 512MB

[partition]
count = 64

[segment]
size_bytes = 536870912 # 512MB

[threads]
read = 16
write = 8

[replication]
factor = 3

[sync]
interval_ms = 5
max_batch_size = 50

[heartbeat]
interval_ms = 500 # Faster heartbeats
timeout_ms = 3000

[network]
cluster_enabled = true
mdns = false

High-Latency Network Setup

For clusters with network delays or jitter:

[replication]
factor = 3
buffer_size = 2000 # Handle more out-of-order writes
buffer_timeout_ms = 15000 # Wait longer for missing sequences
catchup_timeout_ms = 5000 # Longer catchup timeout

[heartbeat]
interval_ms = 2000 # Less frequent heartbeats
timeout_ms = 12000 # Longer timeout tolerance

Monitoring Configuration

Logging

Control log levels with the --log flag:

# Debug level logging
sierradb-server --config production.toml --log debug

# Info level logging
sierradb-server --log info

# Quiet mode (errors only)
sierradb-server --log error

# Target-specific logging (warn by default, debug for sierradb modules)
sierradb-server --log warn,sierradb=debug

The --log value uses the tracing EnvFilter directive syntax, which allows fine-grained control over log output per module.
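
To make the structure of a filter string such as warn,sierradb=debug concrete, the sketch below splits it into a default level and per-target levels. It is a simplified illustration, not the real EnvFilter parser: it treats any directive without "=" as the default level and ignores the span and field syntax that EnvFilter also supports. parse_log_filter is a hypothetical helper.

```python
def parse_log_filter(spec: str):
    """Split a filter string into (default_level, {target: level})."""
    default_level = None
    targets = {}
    for directive in spec.split(","):
        directive = directive.strip()
        if not directive:
            continue
        if "=" in directive:
            # target=level, e.g. sierradb=debug
            target, level = directive.split("=", 1)
            targets[target] = level
        else:
            # bare level, e.g. warn (simplification, see lead-in)
            default_level = directive
    return default_level, targets

print(parse_log_filter("warn,sierradb=debug"))
# ('warn', {'sierradb': 'debug'})
print(parse_log_filter("info"))
# ('info', {})
```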

Security Configuration

Network Security

[network]
# Bind to specific interface
client_address = "192.168.1.10:9090"

File System Security

# Restrict data directory permissions
chmod 700 ./data

# Run as non-root user
sudo useradd sierradb
sudo chown -R sierradb:sierradb ./data
sudo -u sierradb sierradb-server --config config.toml

Advanced Configuration

Memory Management

Control memory usage:

[cache]
capacity_bytes = 268435456 # 256MB default cache size

Troubleshooting Configuration

Common Issues

Configuration not loading:

# Check file exists and is readable
ls -la sierradb.toml

# Validate TOML syntax
toml-cli check sierradb.toml

Performance issues:

# Monitor resource usage
htop

# Check thread configuration
# Ensure threads don't exceed CPU cores significantly

Memory issues:

# Check memory usage
free -h

# Reduce cache sizes or thread counts
[cache]
capacity_bytes = 134217728 # 128MB instead of 256MB

Next Steps