BETA: Ferrosa v1.0.0-beta.4 is in active development. APIs may change before 1.0.

Cassandra. Reimagined.
Rust-native. S3-backed.

Drop-in CQL compatibility. No GC pauses. Ephemeral nodes that recover in seconds. Your data lives in S3 — no EBS snapshots, no backup jobs. Point-in-time recovery for 10 TB costs $2/month, not $2,250. Start on a single node, scale to a full production cluster — same binary, same config, no migration. Cheap, fast, easy — pick all three.

Get Started · See How It Works
0ms
GC Pauses
<10s
Node Recovery
1000x
Cheaper Point-in-Time Recovery
1→N
Seamless Scaling

Most of your database bill
isn't serving a single query

Pull up your cloud bill. Look past the compute, past the live storage. The largest line items are EBS snapshots, cross-region replicas, and backup retention — copies of data that nobody reads, nobody queries, and nobody thinks about until something goes wrong. It's the most expensive insurance policy in your infrastructure, and it scales with every gigabyte you store.

💸

Your Backups Cost More Than Your Database

Point-in-time recovery requires EBS snapshots — a full second copy of your dataset that exists only for "just in case." 10 TB with 30-day retention costs $2,250/month in snapshots alone — on top of the $3,000–7,500/month you're already paying for the live EBS volumes. A year of retention? $15,000/month. That's $180,000/year for data you hope you never need to read.

📀

You Pay for Every Byte Three Times

Once on the live EBS volume. Once in EBS snapshots for recovery. Once more if you replicate across regions. At petabyte scale with RF=3, you're storing 3 PB on EBS ($300K/month), snapshotting 3 PB ($150K/month), and none of that snapshot storage serves a single read.

GC Pauses Kill Your P99

Java's garbage collector causes unpredictable tail latency spikes. No amount of JVM tuning eliminates the problem — it just moves it.

Hours to Replace a Node

When a node dies, streaming hundreds of gigabytes from replicas takes hours. Your cluster runs degraded the entire time.

🔧

Ops Expertise Required

Compaction tuning, repair scheduling, bootstrap orchestration, heap sizing — Cassandra demands a dedicated team to keep it healthy.

Core Capabilities

Everything you need.
Nothing you don't.

Ferrosa keeps the architecture that made Cassandra great and replaces everything that held it back.

Eliminate Your Backup Bill Entirely

That $2,250/month you're spending on EBS snapshots for 10 TB? That $180,000/year insurance policy? Ferrosa makes it $2.30/month. Not because we compress better or found cheaper storage — because the backup step doesn't exist. Your data is already durable in S3. Point-in-time recovery is just keeping commit log segments longer, and S3 Lifecycle rules tier them to Glacier automatically.

One command sets your retention policy: ferrosa-ctl storage retention set --years 1. Ferrosa configures S3 Lifecycle rules that tier your archives through Standard, Infrequent Access, and Glacier automatically. A full year of point-in-time recovery costs ~$45/TB — actually cheaper than 3 months without tiering, because data spends most of its life in Deep Archive at $0.00099/GB.

30d recovery: $2.30 vs $2,250 · 1yr: $45/TB · DynamoDB max: 35 days
| 10 TB Recovery | Cassandra | Ferrosa |
| --- | --- | --- |
| 7 days | $525/mo | $0.55/mo |
| 30 days | $2,250/mo | $2.30/mo |
| 1 year | $15,000/mo | $3.75/mo |
| 3 years | $30,000/mo | $5.20/mo |

Cassandra: EBS snapshots at $0.05/GB, 5% daily change rate, RF=3. Ferrosa: S3 Lifecycle tiered commit log retention.
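To see why tiering dominates the cost, here is a back-of-envelope model. The per-GB rates are approximate published S3 list prices (us-east-1), and the tiering schedule is an illustrative assumption — this is not Ferrosa's actual billing math, just the shape of it:

```python
# Back-of-envelope model of tiered archive cost. Rates are approximate
# S3 list prices; the schedule is an assumption, not Ferrosa's policy.
RATES = {                 # $/GB-month
    "standard": 0.023,
    "ia": 0.0125,         # S3 Standard-Infrequent Access
    "deep_archive": 0.00099,
}

def archive_cost(gb, schedule):
    """Total storage cost for `gb` of commit log archives held through
    `schedule`, a list of (days_in_tier, tier_name) pairs."""
    return sum(gb * RATES[tier] * days / 30 for days, tier in schedule)

# One year of retention: 30 days Standard, 60 days IA, rest Deep Archive.
one_year = [(30, "standard"), (60, "ia"), (275, "deep_archive")]
print(round(archive_cost(1024, one_year), 2))   # ~$58 per TB-year here
```

Under these assumed rates, 275 of the 365 days cost under $10 — the data spends most of its life in Deep Archive, which is why long retention barely moves the bill.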

S3-Backed Storage

Durable storage lives in S3. Local NVMe is a hot cache, not a commitment. Nodes are ephemeral — lose one, spin up a replacement, and it's serving reads from S3 in seconds instead of streaming for hours.

NVMe is a bounded cache layer; S3 is the unbounded durable store. Data that doesn't fit on NVMe lives in S3 and is pulled through on demand. The local_cache_max_bytes setting controls how much NVMe is used. Size your local disks for your working set, not your total dataset.

Retention is one command away (ferrosa-ctl storage retention set --years 1), with S3 Lifecycle rules handling the tiering as described above. Cold SSTables transition to cheaper storage classes automatically while staying instantly readable through the block cache — 66% savings on storage with zero performance impact. Node retirement is just as simple: decommission a node and its S3 objects are tagged for cleanup.

Works with AWS S3, MinIO, Cloudflare R2, or any S3-compatible endpoint. No vendor lock-in.

S3: ~$0.023/GB/mo · 1yr recovery: ~$45/TB · SSTable tiering: 66% savings
Write: Commit Log → Memtable → ACK
Flush: Memtable → SSTable (local)
Upload: SSTable → S3 (async)
Read: Cache hit? → local NVMe → S3 fallback
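The read path above can be sketched as a toy: a bounded local cache in front of an unbounded object store. Everything here except the local_cache_max_bytes knob is illustrative — the real engine caches SSTable blocks on NVMe, not whole values, and the eviction policy below is a plain LRU assumption:

```python
# Toy read path: bounded local cache, S3 fallback on miss.
from collections import OrderedDict

class FakeS3(dict):
    """Stand-in for the durable object store."""

class CachedStore:
    def __init__(self, s3, local_cache_max_bytes):
        self.s3 = s3
        self.max = local_cache_max_bytes
        self.cache = OrderedDict()       # key -> bytes, in LRU order

    def get(self, key):
        if key in self.cache:            # cache hit: serve from local NVMe
            self.cache.move_to_end(key)
            return self.cache[key]
        value = self.s3[key]             # miss: fall back to S3
        self._admit(key, value)
        return value

    def _admit(self, key, value):
        self.cache[key] = value
        while sum(len(v) for v in self.cache.values()) > self.max:
            self.cache.popitem(last=False)   # evict least recently used

s3 = FakeS3(a=b"x" * 40, b=b"y" * 40, c=b"z" * 40)
store = CachedStore(s3, local_cache_max_bytes=100)
store.get("a"); store.get("b"); store.get("c")   # third read evicts "a"
print(sorted(store.cache))   # -> ['b', 'c']
```

The point of the design: losing the cache loses nothing durable, which is what makes nodes ephemeral.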

Rust-Native Performance

No JVM. No garbage collector. No stop-the-world pauses. Lock-free reads via ArcSwap, sharded memtables with nanosecond-level write locks, async I/O on Tokio. Predictable latency from the first request to the billionth.

P99 latency: no GC tail

Drop-In CQL Compatibility

CQL protocol v5 — the same wire protocol your drivers already speak. Python, Java, Go, Node.js drivers connect without code changes. DDL, DML, prepared statements, batch operations, system keyspaces — all there.

0 driver changes required

Native Graph Queries

Cypher queries on your CQL tables. No separate graph database. Vertices and edges are normal tables with schema extensions. Variable-length paths, aggregations (count, sum, avg), expression evaluation, and worst-case optimal joins via leapfrog triejoin for cyclic pattern matching. Bolt v5 wire protocol for Neo4j driver compatibility alongside HTTP/JSON.

Cypher · Bolt v5 · HTTP/JSON · leapfrog triejoin · SUBSCRIBE

Built-In Pub/Sub & CDC

SUBSCRIBE to any CQL or Cypher query for real-time change streaming. Interval polling with EVERY, or push-on-write with DELTA mode. No external Kafka, Debezium, or CDC connector pipeline needed — change data capture is built into the storage engine.

Fully backward compatible with existing CQL drivers. Subscribe to tables, graph traversals, or observability views with the same syntax.

-- Poll for changes every 5 seconds
SUBSCRIBE social.users EVERY 5s;

-- Push changes as they happen
SUBSCRIBE social.users DELTA;

-- Subscribe to a graph traversal
SUBSCRIBE MATCH (a:Person)-[:FOLLOWS]->(b)
RETURN b.name EVERY 10s;

Pluggable Secondary Indexes

Eight index types sit behind a single trait, among them B-tree for range scans, hash for O(1) lookups, composite for multi-column queries, phonetic for fuzzy string matching (Soundex, Metaphone, Double Metaphone, Caverphone), filtered partial indexes, and two vector index methods for approximate nearest neighbor search.

Indexes are storage-attached — built asynchronously after SSTable flush with zero impact on the write path. Per-index staleness tracking and operational metrics via system_views.secondary_indexes let you monitor build progress and lag in real time. CQL-compatible DDL with standard CREATE INDEX ... USING 'type' syntax.

B-tree · Hash · Composite · Phonetic · Vector (HNSW) · Vector (IVFFlat)
-- B-tree index for range queries
CREATE INDEX idx_email ON users (email)
  USING 'btree';

-- Phonetic index for fuzzy name matching
CREATE INDEX idx_name ON users (last_name)
  USING 'phonetic'
  WITH OPTIONS = {'algorithm': 'double_metaphone'};

-- Vector index for ANN search
CREATE INDEX idx_embed ON documents (embedding)
  USING 'vector'
  WITH OPTIONS = {'method': 'hnsw',
    'metric': 'cosine', 'dimensions': '768'};

-- Nearest neighbor query
SELECT * FROM documents
  ORDER BY embedding ANN OF [0.1, 0.2, ...]
  LIMIT 10;

Lock-Free Architecture

Readers never block. ArcSwap provides wait-free atomic view loading. Writers touch sharded memtables for nanoseconds. No global locks, no contention, no coordination overhead on the hot path.

wait-free reads · zero contention
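Ferrosa does this in Rust with ArcSwap; purely as an illustration of the access pattern (not Ferrosa's code), the same read-copy-update idea looks like this in Python — readers load one snapshot reference and never wait, while the writer publishes a fresh immutable version:

```python
# Illustration only: readers grab an immutable snapshot; the writer
# builds a new version and swaps the reference in atomically.
import threading

class View:
    def __init__(self, data):
        self.data = data            # treated as immutable after publish

class Store:
    def __init__(self):
        self._view = View({})
        self._write_lock = threading.Lock()

    def read(self, key):
        view = self._view           # single reference load, never blocks
        return view.data.get(key)

    def write(self, key, value):
        with self._write_lock:      # writers coordinate; readers don't
            new = dict(self._view.data)
            new[key] = value
            self._view = View(new)  # publish the new version

s = Store()
s.write("k", 1)
print(s.read("k"))   # -> 1
```

A reader that loaded the old view keeps a consistent snapshot until it drops the reference — no torn reads, no reader-side locks.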
📊

Built-In Observability

Query cluster state through virtual tables in the system_observability keyspace — connections, storage stats, active queries, topology. Native Prometheus /metrics endpoint with no exporter needed. Web console with REST API and WebSocket live updates. ferrosa-ctl CLI with TUI dashboard for real-time monitoring. SUBSCRIBE to any view for streaming updates.

virtual tables · /metrics · WebSocket · ferrosa-ctl TUI

Production Cluster Mode

Clusters of three or more nodes with Raft consensus, 8 tunable consistency levels (ONE through ALL, LOCAL_ONE, LOCAL_QUORUM, EACH_QUORUM), and hinted handoff for replica failure recovery. Operator-controlled node join with an approval gate, graceful decommission with a streaming protocol, and skew-aware token rebalancing. Background maintenance handles auto-flush, compaction, and commit log GC. Ships as a systemd service with .deb packaging and Docker Compose support.

Raft · tunable CL · hinted handoff · token rebalancing

Reconnection & Resilience

Automatic reconnection with exponential backoff for internode links. Graceful drain with configurable timeout ensures in-flight requests complete before shutdown. Connection drop detection and per-connection request limiting provide backpressure under load. Production-quality code built to handle real-world failure scenarios.

auto-reconnect · graceful drain · backpressure
📋

Hinted Handoff

When a replica is temporarily unavailable, writes are stored as hints and replayed automatically when the node recovers. Configurable hint window and TTL ensure data consistency without manual repair for transient failures.

auto hint storage · replay on recovery
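The mechanism can be sketched as follows. The class and field names are illustrative, not Ferrosa's internal API; the behavior — buffer a write for a down replica, replay it inside the TTL window on recovery — is the standard hinted handoff contract described above:

```python
# Sketch of hinted handoff with a TTL window (illustrative names).
import time

class Coordinator:
    def __init__(self, replicas, hint_ttl_s=3 * 3600):
        self.replicas = replicas      # name -> {"up": bool, "data": {}}
        self.hints = []               # (timestamp, replica, key, value)
        self.hint_ttl_s = hint_ttl_s

    def write(self, key, value, now=None):
        now = now or time.time()
        for name, rep in self.replicas.items():
            if rep["up"]:
                rep["data"][key] = value
            else:                     # store a hint instead of failing
                self.hints.append((now, name, key, value))

    def mark_up(self, name, now=None):
        now = now or time.time()
        rep = self.replicas[name]
        rep["up"] = True
        remaining = []
        for ts, target, key, value in self.hints:
            if target == name and now - ts <= self.hint_ttl_s:
                rep["data"][key] = value      # replay hint on recovery
            elif target != name:
                remaining.append((ts, target, key, value))
            # expired hints for this node are dropped: that is the
            # case where a repair would still be needed
        self.hints = remaining

replicas = {"r1": {"up": True, "data": {}}, "r2": {"up": False, "data": {}}}
c = Coordinator(replicas)
c.write("k", "v")
c.mark_up("r2")
print(replicas["r2"]["data"])   # -> {'k': 'v'}
```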
🔄

Node Lifecycle Management

Operator-controlled join with approval gate prevents accidental cluster changes. Graceful decommission streams data to remaining nodes before removal. Token rebalancing with skew-aware algorithm redistributes load evenly across the ring.

join · decommission · rebalance
📈

Prometheus Metrics

Native /metrics endpoint exposes cluster health, query latencies, storage utilization, and connection counts in Prometheus format. No sidecar exporter needed — scrape directly from each Ferrosa node. Integrates with Grafana, Datadog, and any Prometheus-compatible monitoring stack.

native /metrics · no exporter · Grafana-ready
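A scrape returns plain Prometheus text exposition, so it's easy to inspect by hand. A minimal parsing sketch — the metric names below are hypothetical examples, not Ferrosa's documented metric set:

```python
# Parse the Prometheus text exposition format returned by /metrics.
# Metric names here are hypothetical placeholders.
sample = """\
# HELP ferrosa_queries_total Total CQL queries served.
# TYPE ferrosa_queries_total counter
ferrosa_queries_total{keyspace="social"} 12345
ferrosa_connections_active 42
"""

def parse_metrics(text):
    """Minimal parser: returns {metric_with_labels: float}."""
    out = {}
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue                  # skip HELP/TYPE comment lines
        name, _, value = line.rpartition(" ")
        out[name] = float(value)
    return out

print(parse_metrics(sample)["ferrosa_connections_active"])   # -> 42.0
```

In practice you would point a Prometheus scrape job at each node's endpoint rather than parse by hand; the sketch just shows the wire format.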

WebAssembly Functions

User-defined functions execute as sandboxed WebAssembly — no Java, no JavaScript. Upload a compiled WASM binary and reference it in CREATE FUNCTION with an AS clause. Write functions in Rust (preferred), C, Go, or any language that compiles to WASM. Full Wasmtime integration with a WIT contract, compilation caching, and sandbox enforcement. Memory-limited, CPU-time-limited, no network or filesystem access. Deterministic execution guaranteed.

WASM sandbox · WIT contract · Wasmtime compilation · moka cache · UDF + UDA

See It In Action

Same CQL. Same drivers.
Better everything else.

Your existing application code works unchanged. Just point your driver at Ferrosa.

-- Create a keyspace with S3-backed replication
CREATE KEYSPACE social WITH replication = {
  'class': 'SimpleStrategy',
  'replication_factor': 3
};

-- Tables work exactly like Cassandra
CREATE TABLE social.users (
  user_id uuid PRIMARY KEY,
  name text,
  email text,
  created_at timestamp
);

-- Inserts, selects, updates — all standard CQL
INSERT INTO social.users (user_id, name, email, created_at)
VALUES (uuid(), 'Alice', 'alice@example.com', toTimestamp(now()));

SELECT * FROM social.users WHERE user_id = ?;
-- Graph queries on the same data — no separate database
-- Mark tables as graph entities via schema extensions

ALTER TABLE social.users
  WITH extensions = {'graph.type': 'vertex', 'graph.label': 'Person'};

ALTER TABLE social.follows
  WITH extensions = {
    'graph.type': 'edge',
    'graph.label': 'FOLLOWS',
    'graph.source': 'Person',
    'graph.target': 'Person'
  };

-- Then query with Cypher via HTTP/JSON endpoint
MATCH (a:Person {name: 'Alice'})-[:FOLLOWS]->(b:Person)
RETURN b.name, b.email;

-- Multi-hop: friends of friends
MATCH (a:Person)-[:FOLLOWS*2]->(c:Person)
WHERE a.name = 'Alice'
RETURN DISTINCT c.name;
-- B-tree index for sorted range queries
CREATE INDEX idx_email ON social.users (email) USING 'btree';

-- Hash index for O(1) equality lookups
CREATE INDEX idx_uid ON social.users (user_id) USING 'hash';

-- Composite index on multiple columns
CREATE INDEX idx_name ON social.users (last_name, first_name)
  USING 'composite';

-- Phonetic index — fuzzy name matching
CREATE INDEX idx_snd ON social.users (last_name)
  USING 'phonetic'
  WITH OPTIONS = {'algorithm': 'double_metaphone'};

SELECT * FROM social.users WHERE last_name SOUNDS LIKE 'Smith';

-- Filtered index — partial index over a row subset
CREATE INDEX idx_active ON social.users (email) USING 'btree'
  WHERE status = 'active';

-- Vector index with HNSW for ANN search
CREATE INDEX idx_embed ON docs.articles (embedding)
  USING 'vector'
  WITH OPTIONS = {'method': 'hnsw', 'metric': 'cosine',
    'dimensions': '768', 'm': '16', 'ef_construction': '200'};

SELECT * FROM docs.articles
  ORDER BY embedding ANN OF [0.1, 0.2, ...]
  LIMIT 10;

-- Monitor index health
SELECT index_name, status, lag_seconds
  FROM system_views.secondary_indexes;
# Your existing Python code — just change the contact point
from cassandra.cluster import Cluster

# Before: Cassandra
# cluster = Cluster(['cassandra-node-1.prod'])

# After: Ferrosa — same driver, same API
cluster = Cluster(['ferrosa-node-1.prod'])
session = cluster.connect('social')

# Prepared statements work identically
stmt = session.prepare("""
    SELECT name, email FROM users WHERE user_id = ?
""")

rows = session.execute(stmt, [user_id])
print(rows.one().name)

# Batch operations, async queries, retry policies —
# everything your driver supports works unchanged.
-- Built-in observability via CQL virtual tables
SELECT * FROM system_observability.active_queries;
SELECT * FROM system_observability.storage_stats;
SELECT * FROM system_observability.connections;

-- Stream live updates with SUBSCRIBE
SUBSCRIBE system_observability.connections
  EVERY 5s;

-- Or subscribe to your own data for CDC-style streaming
SUBSCRIBE social.users;

-- Prometheus metrics at /metrics, CLI: ferrosa-ctl monitor (TUI)
-- WebSocket push: ws://host:9090/api/ws (subscribe/unsubscribe)

Ferrosa vs. the alternatives

The performance of ScyllaDB. The compatibility of Cassandra. The cost of object storage.

| | Ferrosa | Cassandra | ScyllaDB | DynamoDB | Keyspaces |
| --- | --- | --- | --- | --- | --- |
| Language | Rust | Java | C++ | Proprietary | Managed |
| GC Pauses | None | Yes (JVM) | None | N/A | N/A |
| Memory Safety | Guaranteed | GC-managed | Manual (C++) | N/A | N/A |
| CQL Compatible | v5 | Yes | Yes | No | Partial |
| S3-Native Storage | Yes | No | No | No | No |
| Graph Queries | Cypher + Bolt | No | No | No | No |
| Vector Search | HNSW + IVFFlat | SAI (limited) | No | No | No |
| Secondary Indexes | 8 types | SAI / 2i | SI | GSI / LSI | GSI |
| Real-Time Pub/Sub | SUBSCRIBE | CDC (external) | CDC (external) | DynamoDB Streams | No |
| UDF Language | WASM | Java | Lua / WASM | N/A | N/A |
| Built-in Observability | CQL + TUI + Web + WS | JMX/nodetool | Prometheus | CloudWatch | CloudWatch |
| Production Ops | Hinted handoff, auto-reconnect, graceful drain, token rebalance, systemd | Manual (nodetool) | Manual + Operator | Managed | Managed |
| Consensus | Raft (openraft) | Paxos (LWT); Accord planned | Raft + Paxos (LWT) | N/A (managed) | N/A (managed) |
| Node Recovery | Seconds | Hours | Hours | Automatic | Automatic |
| Storage Cost (PB) | ~$23K/mo (tiered: ~$8K) | ~$100–250K/mo | ~$100–250K/mo | ~$250K/mo | ~$250K/mo |
| Recovery Cost (PB, 30d) | ~$2/mo (commit log only) | ~$150K/mo (EBS snapshots) | ~$150K/mo (EBS snapshots) | ~$250K/mo (on-demand) | ~$250K/mo (on-demand) |
| Recovery Cost (PB, 1yr) | ~$45/mo (Glacier tiered) | ~$500K+/mo | ~$500K+/mo | N/A (35 days max) | N/A (35 days max) |
| Vendor Lock-In | None | None | Cloud tier | Full (AWS) | Full (AWS) |
| Open Source | Coming Soon | Yes (Apache 2.0) | Partial (AGPL) | No | No |

Under the Hood

Modular Rust crates.
Compose what you need.

Each layer is an independent, tested crate. Use the full stack or embed individual components.

Protocol: ferrosa-cql · ferrosa-net · Graph HTTP + Bolt
Observe: system_observability · /metrics (Prometheus) · REST API + WebSocket · ferrosa-ctl TUI
Query: CQL Parser · ferrosa-graph · Prepared Cache
Schema: ferrosa-schema · RBAC · Graph Extensions
Index: ferrosa-index · B-tree / Hash · Composite / Phonetic · Vector (HNSW / IVFFlat)
Storage: Memtable · Commit Log · SSTable (BTI) · Compaction
Durability: Local NVMe Cache · S3 / MinIO / R2
Cluster: ferrosa-cluster · Single / Pair / Full Cluster · Raft (openraft) · Tunable CL · Hinted Handoff · Token Ring Coordinator · Streaming · Token Rebalancing

Built For

From migration to greenfield

Migration

Move off Cassandra without rewriting

Same CQL protocol, same drivers, same consistency model. Import existing SSTables directly. Migrate one keyspace at a time with dual-read verification. Your application never knows the difference.
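The dual-read step can be sketched like this. The helper is an assumption for illustration, not a Ferrosa-provided tool, and FakeSession stands in for two live driver sessions — in a real migration both would be cassandra-driver sessions, one per cluster:

```python
# Sketch of dual-read verification during a staged migration:
# read the same key from both clusters and compare before cutover.
def dual_read(primary, shadow, query, params):
    """Read from both clusters; return the primary row and a mismatch flag."""
    a = primary.execute(query, params)
    b = shadow.execute(query, params)
    return a, a != b

class FakeSession:
    """Stand-in for a live driver session; only `execute` is needed."""
    def __init__(self, rows):
        self.rows = rows
    def execute(self, query, params):
        return self.rows.get(params[0])

cassandra = FakeSession({"u1": ("Alice", "alice@example.com")})
ferrosa = FakeSession({"u1": ("Alice", "alice@example.com")})

row, mismatch = dual_read(
    cassandra, ferrosa, "SELECT ... WHERE user_id = ?", ["u1"])
print(mismatch)   # -> False
```

Run reads through the helper for a keyspace, log mismatches, and cut traffic over once the mismatch rate stays at zero.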

Scale Seamlessly

Dev to production without re-architecting

Start with a single node on your laptop. Add a second node for high availability with automatic pair mode. Scale to 3+ nodes and Ferrosa forms a full production cluster with Raft consensus, tunable consistency levels, hinted handoff, and node lifecycle management. Same binary, same config format — just add nodes.

Graph + Relational

One database for tables and relationships

Stop running Cassandra alongside Neo4j. Ferrosa speaks CQL for your transactional workloads and Cypher for graph traversals — same tables, same cluster, same operational surface.

One binary. Every scale.

Start on your laptop, ship to production. No re-architecture, no migration between tiers. The same Ferrosa binary handles every stage of your growth.

1 Node

Development & prototyping

Run a single Ferrosa node locally. Full CQL compatibility, graph queries, and SUBSCRIBE all work out of the box. No cluster setup, no coordination overhead. Your application code stays exactly the same as you scale up.

2 Nodes

High availability with pair mode

Add a second node and Ferrosa automatically enters pair mode — synchronous replication with instant failover. All 11 DDL operations replicate automatically: CREATE, ALTER, and DROP for keyspaces, tables, and roles, plus GRANT and REVOKE. Schema catch-up on rejoin, operator switchover, and force-promote for split-brain recovery.

3+ Nodes

Full production cluster

Add a third node and Ferrosa transitions to full cluster mode: Raft consensus for metadata (openraft), Murmur3 token ring for data sharding, and a coordinator pattern with tunable consistency levels (ONE, QUORUM, ALL, LOCAL_QUORUM, EACH_QUORUM). Hinted handoff stores writes for temporarily failed replicas and replays on recovery. Streaming protocol handles node bootstrap and decommission. Skew-aware token rebalancing keeps load evenly distributed. Deploy with Docker Compose or bare metal.
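The consistency-level arithmetic follows standard Cassandra semantics, which the levels above mirror: a QUORUM operation needs a strict majority of the replicas for a key. A quick sketch:

```python
# Required acknowledgments per consistency level, standard Cassandra
# semantics: QUORUM is floor(RF / 2) + 1.
def required_acks(level, rf):
    return {"ONE": 1, "QUORUM": rf // 2 + 1, "ALL": rf}[level]

# RF=3: QUORUM tolerates one replica down; ALL tolerates none.
print([required_acks(l, 3) for l in ("ONE", "QUORUM", "ALL")])   # -> [1, 2, 3]
```

Because two majorities of the same replica set always overlap, QUORUM writes plus QUORUM reads guarantee a read sees the latest acknowledged write — the reason QUORUM/QUORUM is the usual production default.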

Download

Coming Soon

Ferrosa is under active development. Sign up below to be notified when the first release is available.

See the getting started guide for architecture and design details.

Getting Started

Install, configure, and connect your first CQL driver in minutes.

Examples & Tutorials

Step-by-step walkthroughs for IoT, analytics, e-commerce, and 10 more use cases — with runnable CQL scripts.

CQL Compatibility

Full CQL reference with supported statements, types, and functions.

Migration Guide

Move from Apache Cassandra to Ferrosa without rewriting your application.

Ready to leave GC pauses behind?

Ferrosa is in active development. Binary releases are coming soon.

Read the Docs · Browse Examples