7 Replication


VSKI supports read-replica architecture for horizontal scaling. This allows you to distribute read traffic across multiple replica servers while maintaining a single master for writes.

Note

Schemas are append-only. You cannot rename columns in a database through the interface; dropping and renaming columns are manual tasks.

Note

Tables and columns are always soft-deleted when using the VSKI API. Data migrations and breaking changes are admin tasks that must be done carefully and with a cool head.

Warning

Except for workflows, real-time features are disabled on replicas. If you need distributed real-time updates, you can combine triggers on the master with a Redis Cluster (changes emitted to Redis -> consumed by your service -> sent to subscribers).

Overview

                ┌─────────────┐
                │   Master    │
                │ (Read/Write)│
                └──────┬──────┘
                       │
       ┌───────────────┼───────────────┐
       │               │               │
       ▼               ▼               ▼
┌─────────────┐ ┌─────────────┐ ┌─────────────┐
│  Replica 1  │ │  Replica 2  │ │  Replica 3  │
│  (Read-only)│ │  (Read-only)│ │  (Read-only)│
└─────────────┘ └─────────────┘ └─────────────┘
  • Master: Handles all write operations and serves read requests
  • Replicas: Read-only copies that sync from master and serve read traffic

Quick Start

1. Start Master Server

# Master server (default mode)
JWT_SECRET=your-shared-secret DATA_DIR=./data vski-prod serve --port 3001

2. Start Replica Server

# Replica server - just needs MASTER_URL and same JWT_SECRET
REPLICA_MODE=replica \
MASTER_URL=http://localhost:3001 \
JWT_SECRET=your-shared-secret \
DATA_DIR=./data-replica \
SYNC_INTERVAL=5 \
vski-prod serve --port 3002

The replica will automatically:

  1. Generate a JWT token using the shared secret
  2. Connect to master and authenticate
  3. Download all databases
  4. Start periodic sync

That's it! No need to register replicas on master. Just ensure both servers use the same JWT_SECRET.

Configuration

Environment Variables

Variable       Description                                  Default  Required
REPLICA_MODE   Server mode: master or replica               master   No
MASTER_URL     URL of the master server                     -        Yes (if replica)
SYNC_INTERVAL  Sync interval in seconds (0 = manual only)   60       No
JWT_SECRET     Must be the same on master and replicas      -        Yes

JWT Secret Requirement

Both master and replicas must use the same JWT_SECRET. This ensures that:

  • Tokens generated on master are valid on replicas
  • Users can authenticate on any server
  • Replicas can authenticate with master
# Master
JWT_SECRET=shared-secret-key

# Replica (must match!)
JWT_SECRET=shared-secret-key

How Authentication Works

  1. Replica generates its own JWT token using JWT_SECRET
  2. Replica sends this token to master in the Authorization: Bearer header
  3. Master validates the token using the same JWT_SECRET
  4. No pre-registration needed - any server with the shared secret can sync

This simplified approach means you can spin up new replicas without any configuration on the master.
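The handshake above can be sketched with nothing but node:crypto — a minimal HS256 signer a replica could use to self-issue a token from the shared secret. The claim names (`sub`, `iat`) and the replica ID are illustrative assumptions, not VSKI's actual token payload.

```typescript
import { createHmac } from "node:crypto";

// Base64url encoding as required by the JWT spec (no padding, URL-safe alphabet)
function base64url(input: string | Buffer): string {
  const buf = typeof input === "string" ? Buffer.from(input) : input;
  return buf
    .toString("base64")
    .replace(/\+/g, "-")
    .replace(/\//g, "_")
    .replace(/=+$/, "");
}

// Sign a minimal HS256 JWT with the shared JWT_SECRET.
// The payload shape here is an assumption for illustration.
function signReplicaToken(secret: string, replicaId: string): string {
  const header = base64url(JSON.stringify({ alg: "HS256", typ: "JWT" }));
  const payload = base64url(
    JSON.stringify({ sub: replicaId, iat: Math.floor(Date.now() / 1000) })
  );
  const signature = createHmac("sha256", secret)
    .update(`${header}.${payload}`)
    .digest();
  return `${header}.${payload}.${base64url(signature)}`;
}

// The replica would then call master with: Authorization: Bearer <token>
const token = signReplicaToken("shared-secret-key", "replica-001");
console.log(token.split(".").length); // 3 segments: header.payload.signature
```

Because the master validates with the same secret, any server holding `JWT_SECRET` can produce a token the master accepts, which is why no pre-registration step exists.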

Sync Process

Initial Sync

When a replica first connects or has no replication state, a full sync occurs:

  1. Replica requests database status from master
  2. Replica downloads the entire database file
  3. Replica atomically replaces local database with downloaded copy
  4. Replica resets master's session to ensure clean changeset tracking

Incremental Sync

For ongoing updates, replicas use SQLite changesets for efficient incremental sync:

  1. Replica computes schema hash from local sqlite_master
  2. Replica fetches schema hash from master
  3. If hashes match: Replica requests changeset since last sync version
  4. If hashes differ: Schema migration occurs first (see below)
  5. Replica applies changeset to local database
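A plausible sketch of the hash comparison in steps 1-2: canonicalize the sqlite_master rows (sort so row order doesn't matter) and hash the result. The exact fields and canonicalization VSKI uses are assumptions.

```typescript
import { createHash } from "node:crypto";

// One row of sqlite_master, reduced to the fields relevant for hashing
interface SchemaObject {
  type: string; // "table" | "index" | "trigger" | "view"
  name: string;
  sql: string | null;
}

// Compute an order-independent schema hash: identical schemas produce
// identical hashes regardless of the order rows come back in.
function schemaHash(objects: SchemaObject[]): string {
  const canonical = objects
    .filter((o) => o.sql !== null)
    .map((o) => `${o.type}:${o.name}:${o.sql}`)
    .sort()
    .join("\n");
  return createHash("sha256").update(canonical).digest("hex");
}
```

If the replica's hash matches the master's, only a data changeset is needed; a mismatch triggers the schema migration described next.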

Schema Synchronization

When schema changes are detected (new collections, field changes, indexes), replicas automatically migrate their schema:

  1. Hash-Based Detection: Schema hashes are computed from sqlite_master (tables, indexes, triggers, views)
  2. Schema Fetch: Replica fetches full schema from master via /api/replica/schema-sync
  3. Diff Computation: Replica computes schema diff:
    • New tables to create
    • Tables with new columns (ALTER TABLE)
    • New/dropped indexes, triggers, views
  4. Migration Application:
    • CREATE TABLE for new collections
    • ALTER TABLE ADD COLUMN for new fields
    • CREATE INDEX, CREATE TRIGGER, CREATE VIEW for new objects
    • DROP INDEX, DROP TRIGGER, DROP VIEW for removed objects
    • Tables are never dropped on replica
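The diff in step 3 can be sketched as a comparison of table/column maps. The types and function name here are illustrative, not VSKI's internal API; note that tables present only on the replica are deliberately left alone, matching the "tables are never dropped on replica" rule.

```typescript
type Schema = Record<string, string[]>; // table name -> column names

// Compute what the replica must create to catch up with master:
// whole tables it lacks, and columns missing from tables it has.
function diffSchema(master: Schema, replica: Schema) {
  const newTables: string[] = [];
  const newColumns: Record<string, string[]> = {};
  for (const [table, cols] of Object.entries(master)) {
    if (!(table in replica)) {
      newTables.push(table); // -> CREATE TABLE
    } else {
      const missing = cols.filter((c) => !replica[table].includes(c));
      if (missing.length > 0) newColumns[table] = missing; // -> ALTER TABLE ADD COLUMN
    }
  }
  // Replica-only tables are ignored: tables are never dropped on a replica.
  return { newTables, newColumns };
}
```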

Changeset Sync

After schema migration (if needed), data changes are synced:

  1. Replica tracks the last sync version
  2. Requests changeset from master since that version
  3. Master generates changeset from SQLite session (INSERT, UPDATE, DELETE)
  4. Replica applies changeset directly to SQLite database

Omitted Databases

The following system databases are automatically excluded from replication:

  • stats - Statistics and logs (replica generates its own)
  • workflows - Workflow data (replica generates its own workflow runs)

You can also exclude custom databases using the REPLICA_OMIT_DBS environment variable:

# Exclude specific databases from replication
REPLICA_OMIT_DBS=temp,cache,logs

Sync Interval

Configure how often replicas sync:

# Sync every 5 seconds (near real-time)
SYNC_INTERVAL=5

# Sync every minute (default)
SYNC_INTERVAL=60

# Manual sync only
SYNC_INTERVAL=0

Sync Latency Considerations

Replicas may be slightly behind the master depending on the SYNC_INTERVAL configuration. This is acceptable for most read-heavy workloads, but there are important considerations:

When to Route to Master

Route requests to master when you need:

  • Realtime updates: Immediately see changes after writes
  • Strong consistency: Read-your-writes guarantee
  • Realtime subscriptions: WebSocket connections for live data

When Replicas Are Fine

Replicas are suitable for:

  • Read-heavy workloads: Offload read traffic from master
  • Eventual consistency tolerance: Slight delay is acceptable
  • Geographic distribution: Serve reads from nearby replicas

Use Case               Sync Interval    Notes
Realtime features      Master only      Use master for realtime subscriptions
Near real-time reads   5-10 seconds     Acceptable for most interactive apps
Standard reads         30-60 seconds    Default, good for most use cases
Analytics / Reporting  60+ seconds      Longer intervals reduce master load

Example Configuration

# For near real-time consistency (10 second sync)
SYNC_INTERVAL=10

# For standard read replicas (1 minute sync)
SYNC_INTERVAL=60

Read-Only Behavior

When running in replica mode:

  • All write operations (POST, PUT, PATCH, DELETE) return 403 Forbidden
  • Auth endpoints (login, token refresh) are allowed
  • All read operations (GET) work normally

Example

# Read operation - allowed on replica
curl http://replica:3002/api/collections/posts/records

# Write operation - blocked on replica
curl -X POST http://replica:3002/api/collections/posts/records \
  -H "Content-Type: application/json" \
  -d '{"title": "New Post"}'
# Returns: {"error": "replica is read-only", "code": 403}
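A client can turn this 403 into an automatic retry against the master. This is a sketch, not SDK behavior: the fetch-like function is injected so the helper stays testable, and the URLs and path are placeholders.

```typescript
// Minimal request shape, kept local so no DOM types are needed
type Init = { method: string; headers: Record<string, string>; body: string };
type FetchLike = (url: string, init: Init) => Promise<{ status: number }>;

// Send a write to the replica first; if it answers 403 (read-only),
// retry the identical request against the master.
async function writeWithFallback(
  doFetch: FetchLike,
  replicaUrl: string,
  masterUrl: string,
  path: string,
  body: unknown
): Promise<{ status: number }> {
  const init: Init = {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
  };
  const res = await doFetch(`${replicaUrl}${path}`, init);
  // Replicas reject writes with 403 Forbidden — fall back to master
  return res.status === 403 ? doFetch(`${masterUrl}${path}`, init) : res;
}
```

In practice it is cleaner to route writes to the master at the load balancer (see below), but a fallback like this protects clients that hit a replica directly.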

Headers

Replicas include helpful headers in responses:

Header          Description
X-Replica-Mode  Current mode: replica or master
X-Replica-Id    Unique identifier for this replica
X-Master-Url    URL of the master server (replica only)

Example:

curl -I http://replica:3002/health

HTTP/1.1 200 OK
X-Replica-Mode: replica
X-Replica-Id: replica-001
X-Master-Url: http://master:3001

Routing by Replica ID

Replicas can be configured with a unique ID and public URL, allowing load balancers to route requests to specific replicas based on the X-Replica-Id header or subdomain patterns.

Configuration

# Replica with ID and public URL
REPLICA_MODE=replica \
REPLICA_ID=r001 \
PUBLIC_URL=https://r001.instance.com \
MASTER_URL=http://master:3001 \
vski-prod serve --port 3002

Subdomain-Based Routing

A common pattern is to use the replica ID as a subdomain (e.g., r001.instance.com). The load balancer can route requests based on the X-Replica-Id header sent by the client SDK.

// Client SDK automatically captures replicaId from responses
const records = await client.collection("posts").getList(1, 10);
console.log(client.replicaId); // "r001" - captured from X-Replica-Id header

// Subsequent requests include the header
// Load balancer can route to the correct replica
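The capture-and-echo behavior described above can be sketched as a small affinity helper. The class is illustrative, not the SDK's internals; it relies only on the X-Replica-Id header documented earlier (header lookups are case-insensitive, so the lowercase key matters for the Map-based sketch).

```typescript
// Track the replica a client is "stuck" to, so the load balancer can
// keep routing follow-up requests to the same backend.
class ReplicaAffinity {
  replicaId: string | null = null;

  // Call with the response headers of every reply; remembers the last
  // X-Replica-Id seen.
  capture(headers: Map<string, string>): void {
    const id = headers.get("x-replica-id");
    if (id) this.replicaId = id;
  }

  // Headers to attach to outgoing requests so the balancer can route them.
  outgoingHeaders(): Record<string, string> {
    return this.replicaId ? { "X-Replica-Id": this.replicaId } : {};
  }
}
```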

HAProxy Example (Header-Based Routing)

frontend http_front
    bind *:80
    
    # Route to specific replica based on X-Replica-Id header
    acl replica_r001 hdr(X-Replica-Id) r001
    acl replica_r002 hdr(X-Replica-Id) r002
    use_backend replica_r001 if replica_r001
    use_backend replica_r002 if replica_r002
    
    # Default to master for writes and new connections
    default_backend master

backend replica_r001
    server r001 r001.instance.com:3002

backend replica_r002
    server r002 r002.instance.com:3002

backend master
    server master master:3001

Nginx Example (Subdomain-Based Routing)

# Map replica ID to upstream
map $http_x_replica_id $replica_backend {
    default  "master";
    "r001"   "replica_r001";
    "r002"   "replica_r002";
}

upstream master {
    server master:3001;
}

upstream replica_r001 {
    server r001.instance.com:3002;
}

upstream replica_r002 {
    server r002.instance.com:3002;
}

server {
    listen 80;
    
    location / {
        proxy_pass http://$replica_backend;
        proxy_set_header X-Replica-Id $http_x_replica_id;
    }
}

Traefik Example (Subdomain Routing)

# docker-compose.yml
services:
  traefik:
    image: traefik:v3
    command:
      - "--providers.docker=true"
    ports:
      - "80:80"

  replica-r001:
    image: vski:latest
    environment:
      REPLICA_MODE: replica
      REPLICA_ID: r001
      PUBLIC_URL: https://r001.instance.com
    labels:
      - "traefik.http.routers.r001.rule=Host(`r001.instance.com`)"

  replica-r002:
    image: vski:latest
    environment:
      REPLICA_MODE: replica
      REPLICA_ID: r002
      PUBLIC_URL: https://r002.instance.com
    labels:
      - "traefik.http.routers.r002.rule=Host(`r002.instance.com`)"

Load Balancing

Use a load balancer to distribute traffic:

HAProxy Example

frontend http_front
    bind *:80
    
    # Route writes to master
    acl is_write method POST PUT PATCH DELETE
    use_backend master if is_write
    
    # Route reads to all servers
    default_backend all_servers

backend master
    server master1 master:3001

backend all_servers
    balance roundrobin
    server master1 master:3001
    server replica1 replica1:3002
    server replica2 replica2:3002

Nginx Example

# Route by request method: writes to master, reads across all servers.
# (proxy_pass cannot be used inside a server-level "if"; a map selecting
# the upstream is the idiomatic way to do this.)
map $request_method $backend {
    default   writes;
    GET       reads;
    HEAD      reads;
    OPTIONS   reads;
}

upstream reads {
    server master:3001;
    server replica1:3002;
    server replica2:3002;
}

upstream writes {
    server master:3001;
}

server {
    listen 80;

    location / {
        proxy_pass http://$backend;
    }
}

Client SDK Usage

The SDK works with both master and replicas seamlessly:

import { VskiClient } from "@vski/sdk";

// Connect to replica for reads
const readClient = new VskiClient("http://replica:3002");
await readClient.admins.authWithPassword("admin@example.com", "password");

// Read operations work on replica
const posts = await readClient.collection("posts").getList(1, 10);

// For writes, connect to master
const writeClient = new VskiClient("http://master:3001");
await writeClient.admins.authWithPassword("admin@example.com", "password");

// Write operations only work on master
await writeClient.collection("posts").create({
  title: "New Post",
  content: "Hello World",
});
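The two-client split above can be hidden behind a thin router. This is an illustrative sketch, not part of the SDK: it depends only on the `collection().getList()` and `collection().create()` calls shown above, expressed as a structural interface so it accepts any compatible client.

```typescript
// Structural interface covering only the SDK calls used in this example
interface CollectionApi {
  getList(page: number, perPage: number): Promise<unknown>;
  create(data: Record<string, unknown>): Promise<unknown>;
}
interface ClientLike {
  collection(name: string): CollectionApi;
}

// Route reads to the replica-backed client and writes to the master-backed
// client, so call sites don't need to know about the split.
class RoutedClient {
  constructor(private reads: ClientLike, private writes: ClientLike) {}

  list(name: string, page = 1, perPage = 10) {
    return this.reads.collection(name).getList(page, perPage);
  }

  create(name: string, data: Record<string, unknown>) {
    return this.writes.collection(name).create(data);
  }
}
```

Usage would be `new RoutedClient(readClient, writeClient)` with the two authenticated clients from the example above.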

Monitoring

Health Check

Check replica status:

curl http://replica:3002/health

Response includes headers indicating replica status.

Sync Status

Check replication status on master:

curl -H "Authorization: Bearer <replica-jwt>" http://master:3001/api/replica/status

Response:

{
  "dbs": [
    {
      "name": "default",
      "size": 1048576,
      "sha256": "abc123...",
      "schemaVersion": 5
    }
  ]
}
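A monitoring job can compare the sha256 values in this response against locally computed hashes to flag databases that have drifted. The field names match the response above; the helper itself is a sketch, not a VSKI API.

```typescript
// Shape of one entry in the /api/replica/status "dbs" array
interface DbStatus {
  name: string;
  size: number;
  sha256: string;
  schemaVersion: number;
}

// Return the names of databases whose local hash no longer matches
// the master's reported hash (missing locally counts as out of sync).
function outOfSync(master: DbStatus[], localHashes: Map<string, string>): string[] {
  return master
    .filter((db) => localHashes.get(db.name) !== db.sha256)
    .map((db) => db.name);
}
```

Feeding the result into an alerting system gives a simple sync-lag check on top of the health endpoint.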

Security

Authentication

Replicas authenticate using JWT tokens generated from the shared JWT_SECRET. No additional keys or registration required.

Best Practices

  1. Use HTTPS in production
  2. Keep JWT_SECRET secret and consistent across all servers
  3. Use strong secrets - at least 32 characters of random data
  4. Rotate secrets by updating all servers simultaneously

Limitations

Current limitations (may be addressed in future releases):

  1. Read-only replicas: All writes must go to master
  2. Manual failover: No automatic promotion of replica to master
  3. Full database sync: Changesets apply to entire database, not per-collection

Troubleshooting

Replica Not Syncing

  1. Check network connectivity to master
  2. Verify MASTER_URL is correct
  3. Verify JWT_SECRET is identical on master and replica
  4. Check master logs for authentication errors

Authentication Fails on Replica

Ensure JWT_SECRET is identical on both master and replica:

# On master
echo $JWT_SECRET

# On replica (must match!)
echo $JWT_SECRET

Data Not Appearing on Replica

  1. Check sync interval: SYNC_INTERVAL
  2. Wait for next sync cycle
  3. Check replica logs for sync errors

Schema Not Syncing

If new collections appear on master but not on replica:

  1. Check replica logs for "schema mismatch" messages
  2. Verify replica can reach /api/replica/schema-sync endpoint
  3. Check for SQL errors in replica logs during migration

Best Practices

  1. Monitor sync lag - Ensure replicas stay in sync
  2. Use consistent JWT_SECRET - Critical for authentication
  3. Monitor replica health - Set up health checks
  4. Plan capacity - Ensure replicas can handle read load
  5. Test failover - Know how to promote a replica if master fails
  6. Secure communication - Use HTTPS between components