Move from Apache Cassandra to Ferrosa without rewriting application code. Same CQL protocol, same drivers, same consistency model.
Ferrosa is designed as a drop-in replacement for Apache Cassandra at the protocol level. The migration strategy is incremental:
| Feature | Status | Notes |
|---|---|---|
| CQL protocol v5 | Compatible | All 16 opcodes |
| SASL authentication | Compatible | PasswordAuthenticator flow |
| Frame compression | Compatible | LZ4, Snappy |
| Prepared statements | Compatible | W-TinyLFU cache |
| Batch operations | Compatible | Unlogged batches |
| system_schema.* | Compatible | All standard tables |
| system.local | Compatible | Including tokens column |
| Murmur3Partitioner | Compatible | Same token distribution |
| BTI SSTables | Compatible | Read Cassandra 5.x SSTables directly |
| Consistency levels | Compatible | ONE, TWO, THREE, QUORUM, ALL, LOCAL_ONE, LOCAL_QUORUM, EACH_QUORUM |
| cqlsh | Compatible | Tested with Cassandra cqlsh |
| Hinted handoff | Compatible | Stores hints for down nodes, replays on recovery |
| Node lifecycle | Compatible | Join and decommission via ferrosa-ctl |
| Token rebalancing | Compatible | Operator-triggered via ferrosa-ctl rebalance |
| Secondary indexes | Compatible | 8 index types including vector (HNSW, IVFFlat) |
| UDTs | Supported | CREATE/ALTER/DROP TYPE fully implemented |
| Materialized views | Not yet | Planned |
| LWT | Not yet | Planned |
| Gossip protocol | Replaced | Ferrosa uses Raft for metadata (not wire-compatible) |
| Internode protocol | Replaced | Custom binary protocol (not wire-compatible) |
Export your Cassandra schema and check for unsupported features. Ferrosa supports all standard CQL types, partition/clustering keys, table options, and UDTs. Check for materialized views or UDFs — these are not yet supported.
```bash
# Export schema from Cassandra
cqlsh -e "DESCRIBE SCHEMA" > schema.cql

# Check for unsupported features
grep -i "MATERIALIZED VIEW\|CREATE FUNCTION" schema.cql
```
Run Ferrosa alongside your existing cluster. It doesn't need to join the Cassandra ring.
```bash
# Build and start
cargo build --release
FERROSA_CQL_BIND=0.0.0.0:9042 \
FERROSA_AUTH_DISABLED=true \
./target/release/ferrosa
```
Run your exported schema against Ferrosa. Remove any unsupported objects first.
```bash
# Apply schema to Ferrosa
cqlsh ferrosa-host 9042 -f schema.cql
```
Two options: SSTable import (fastest for large datasets) or CQL COPY/INSERT (simpler for smaller datasets).
```bash
# Option A: CQL COPY (simple, works for moderate datasets)
# Export from Cassandra
cqlsh cassandra-host -e "COPY social.users TO '/tmp/users.csv'"

# Import to Ferrosa
cqlsh ferrosa-host -e "COPY social.users FROM '/tmp/users.csv'"

# Option B: SSTable import (see SSTable Import section below)
```
Run your application against both Cassandra and Ferrosa, comparing results. See the dual-read verification section below.
Once validation passes, update your driver contact points from Cassandra to Ferrosa.
```python
# Your application code — just change the host
# Before:
# cluster = Cluster(['cassandra-node-1.prod'])

# After:
cluster = Cluster(['ferrosa-node-1.prod'])
```
Ferrosa's ferrosa-sstable crate reads Cassandra BTI (Big Trie-Indexed) SSTables
natively — the default format in Cassandra 5.x. This means you can import existing SSTable
files directly without an intermediate conversion step.
A BTI SSTable consists of 7 component files:
- `*-Data.db` — row data
- `*-Partitions.db` — partition index (on-disk trie)
- `*-Rows.db` — row-level index
- `*-Filter.db` — Bloom filter
- `*-CompressionInfo.db` — compression metadata
- `*-Statistics.db` — SSTable statistics
- `*-TOC.txt` — table of contents

Copy these files from your Cassandra data directory to Ferrosa's data directory, organized by keyspace and table. Ferrosa will pick them up on next read.
A `ferrosa-import` CLI tool with validation and progress reporting is planned for a future release.
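Until that tool lands, the copy step above can be scripted. The sketch below is an illustration, not an official utility: it assumes Ferrosa lays out data as `<data_dir>/<keyspace>/<table>/` and that component files share a per-SSTable prefix (e.g. `nb-1-bti-Data.db`), and it skips SSTables that are missing any of the 7 components.

```python
import shutil
from pathlib import Path

# The 7 BTI component suffixes listed above.
COMPONENTS = ("Data.db", "Partitions.db", "Rows.db", "Filter.db",
              "CompressionInfo.db", "Statistics.db", "TOC.txt")

def copy_sstables(cassandra_table_dir: str, ferrosa_data_dir: str,
                  keyspace: str, table: str) -> int:
    """Copy every complete BTI SSTable from a Cassandra table directory
    into Ferrosa's data directory, grouped by keyspace and table.
    Returns the number of SSTables copied."""
    src = Path(cassandra_table_dir)
    dst = Path(ferrosa_data_dir) / keyspace / table
    dst.mkdir(parents=True, exist_ok=True)
    copied = 0
    # Group component files by SSTable generation,
    # e.g. "nb-1-bti-Data.db" -> prefix "nb-1-bti-".
    for data_file in src.glob("*-Data.db"):
        prefix = data_file.name[: -len("Data.db")]
        parts = [src / f"{prefix}{c}" for c in COMPONENTS]
        if not all(p.exists() for p in parts):
            continue  # skip incomplete SSTables
        for p in parts:
            shutil.copy2(p, dst / p.name)
        copied += 1
    return copied
```

Copying only complete component sets avoids importing an SSTable whose index or Bloom filter is missing.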
Ferrosa supports LZ4 and Zstd compressed SSTables. No decompression step needed — compressed SSTables are read directly.
Big-format SSTables (the pre-5.x default) are not yet readable; run `nodetool upgradesstables` on the Cassandra side to force conversion to BTI first. Big format read support is planned for a future Ferrosa release.
Before cutting over production traffic, run dual reads against both databases and compare results:
```python
from cassandra.cluster import Cluster

# Connect to both
cass = Cluster(['cassandra-host']).connect('social')
ferro = Cluster(['ferrosa-host']).connect('social')

# Compare results
query = "SELECT * FROM users WHERE user_id = ?"
stmt_c = cass.prepare(query)
stmt_f = ferro.prepare(query)

for uid in sample_user_ids:
    row_c = cass.execute(stmt_c, [uid]).one()
    row_f = ferro.execute(stmt_f, [uid]).one()
    assert row_c == row_f, f"Mismatch for {uid}"
```
Run this against a representative sample of your production queries, covering every query pattern your application issues.
For production, configure S3 as the durable storage backend. Local disk acts as a hot cache.
```bash
# AWS S3 with IAM instance profile (no explicit keys needed)
FERROSA_S3_ENDPOINT=https://s3.amazonaws.com \
FERROSA_S3_BUCKET=my-ferrosa-data \
FERROSA_S3_REGION=us-east-1 \
FERROSA_DATA_DIR=/var/lib/ferrosa \
./target/release/ferrosa

# MinIO for local development
FERROSA_S3_ENDPOINT=http://localhost:9000 \
FERROSA_S3_BUCKET=ferrosa \
FERROSA_S3_ACCESS_KEY_ID=minioadmin \
FERROSA_S3_SECRET_ACCESS_KEY=minioadmin \
FERROSA_S3_ALLOW_HTTP=true \
./target/release/ferrosa
```
Ferrosa works with any S3-compatible object store. Set FERROSA_S3_ENDPOINT for non-AWS providers:
| Provider | Endpoint |
|---|---|
| AWS S3 | (default — no endpoint needed) |
| MinIO | http://minio:9000 |
| Cloudflare R2 | https://<account>.r2.cloudflarestorage.com |
| DigitalOcean Spaces | https://<region>.digitaloceanspaces.com |
| Backblaze B2 | https://s3.<region>.backblazeb2.com |
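As an illustration of how the endpoint table maps onto the environment variables shown earlier, the helper below builds the `FERROSA_S3_*` settings for a chosen provider. The function name and the provider keys are my own; the variable names and endpoint patterns come from the examples and table above.

```python
def s3_env(provider: str, bucket: str, **kw) -> dict:
    """Build the FERROSA_S3_* environment for a given provider.
    Endpoint templates match the provider table; placeholders such as
    {account} or {region} are filled from keyword arguments."""
    endpoints = {
        "aws": None,  # default -- no endpoint needed
        "minio": "http://minio:9000",
        "r2": "https://{account}.r2.cloudflarestorage.com",
        "spaces": "https://{region}.digitaloceanspaces.com",
        "b2": "https://s3.{region}.backblazeb2.com",
    }
    env = {"FERROSA_S3_BUCKET": bucket}
    endpoint = endpoints[provider]
    if endpoint is not None:
        env["FERROSA_S3_ENDPOINT"] = endpoint.format(**kw)
        if endpoint.startswith("http://"):
            # Plain HTTP needs the explicit opt-in, as in the MinIO example.
            env["FERROSA_S3_ALLOW_HTTP"] = "true"
    return env
```

For example, `s3_env("r2", "my-ferrosa-data", account="abc123")` yields the R2 endpoint with the account filled in, while `s3_env("aws", ...)` sets no endpoint at all, leaving the AWS default.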
With S3-backed storage, you trade local disk costs for object storage costs — typically a 4-10x reduction:
| Scale | EBS (gp3, 3 replicas) | S3 Standard | Savings |
|---|---|---|---|
| 1 TB | ~$240/mo | ~$23/mo | ~10x |
| 10 TB | ~$2,400/mo | ~$230/mo | ~10x |
| 100 TB | ~$24,000/mo | ~$2,300/mo | ~10x |
| 1 PB | ~$240,000/mo | ~$23,000/mo | ~10x |
S3 request costs (GET/PUT) add overhead for high-throughput workloads, but the local NVMe cache absorbs most read traffic. Write-behind uploads batch SSTables to amortize PUT costs.
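The capacity numbers in the table reduce to a two-line calculation. The sketch below assumes published US-East list prices of roughly $0.08/GB-month for gp3 and $0.023/GB-month for S3 Standard, a replication factor of 3 on the Cassandra side, and decimal terabytes (1 TB = 1000 GB); it ignores request and transfer costs.

```python
# Approximate list prices (us-east-1).
GP3_PER_GB = 0.08    # $/GB-month, EBS gp3
S3_PER_GB = 0.023    # $/GB-month, S3 Standard
REPLICAS = 3         # Cassandra stores every byte three times on EBS

def monthly_costs(tb: float) -> tuple[float, float, float]:
    """Return (EBS cost, S3 cost, savings ratio) per month for `tb` terabytes."""
    gb = tb * 1000                      # decimal TB, matching the table
    ebs = gb * GP3_PER_GB * REPLICAS    # 3 full copies on gp3
    s3 = gb * S3_PER_GB                 # one durable copy in S3
    return ebs, s3, ebs / s3

ebs, s3, ratio = monthly_costs(1)
# 1 TB: ~$240/mo on EBS vs ~$23/mo on S3, roughly 10x
```

Because both costs scale linearly with capacity, the ~10x ratio holds at every row of the table.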
| Area | Cassandra | Ferrosa |
|---|---|---|
| Cluster membership | Gossip protocol | Raft (openraft) for metadata consensus |
| Internode protocol | Cassandra messaging | Custom binary protocol with 3 priority lanes, PSK auth |
| Storage durability | Local disk (EBS/SSD) | S3 (local NVMe as cache) |
| Node recovery | Stream from replicas (hours) | Read from S3 (seconds) |
| GC pauses | JVM stop-the-world | None (Rust, no GC) |
| Observability | JMX + nodetool | CQL virtual tables + Prometheus + TUI + Web console |
| Real-time pub/sub | CDC (requires Kafka/Debezium) | Built-in SUBSCRIBE with EVERY/DELTA modes |
| Graph queries | Not supported | Cypher via HTTP/JSON on the same tables |
| SSTable format | Big + BTI | Reads BTI, writes BTI (native format planned) |
| Commit log | Standard | CAS-allocated segments, 3 sync modes, built-in CDC |
If you need to roll back to Cassandra:
- Use `cqlsh COPY` if you need to sync writes that went only to Ferrosa
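That reverse sync is the forward `COPY` migration run in the other direction. The sketch below is illustrative only: the helper names are mine, and `social.users` is carried over from the examples above; list whichever tables received writes after cutover.

```python
import subprocess

# Tables that received Ferrosa-only writes; adjust to your schema.
TABLES = ["social.users"]

def copy_commands(table: str, src_host: str, dst_host: str) -> list[list[str]]:
    """Build the pair of cqlsh COPY commands that export `table` from
    src_host to a CSV file and re-import it on dst_host."""
    csv = f"/tmp/{table.replace('.', '_')}.csv"
    return [
        ["cqlsh", src_host, "-e", f"COPY {table} TO '{csv}'"],
        ["cqlsh", dst_host, "-e", f"COPY {table} FROM '{csv}'"],
    ]

def sync_back(ferrosa_host: str, cassandra_host: str) -> None:
    """Replay Ferrosa-only writes into Cassandra, table by table."""
    for table in TABLES:
        for cmd in copy_commands(table, ferrosa_host, cassandra_host):
            subprocess.run(cmd, check=True)
```

Note that `COPY FROM` re-inserts rows, so rows deleted on Cassandra after the export will reappear; for a clean rollback, freeze writes during the sync.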