Migrates Quickwit's metastore from file-backed (S3) to PostgreSQL.
The file-backed metastore stores one `metastore.json` per index on S3. At scale this causes gRPC timeouts on `list_splits`, slow GC, and a full JSON download/parse/rewrite on every metadata operation.
The PostgreSQL metastore fixes all of this with indexed queries, but there's no built-in migration path; this tool fills that gap.
- Reads from the file-backed metastore using Quickwit's own `FileBackedMetastore` (handles all JSON versioning, manifest loading, and S3 access)
- Writes to PostgreSQL using raw `sqlx` with the same schema Quickwit uses (runs upstream migrations, UNNEST-based batch inserts)
- Per-index transactions: if anything fails, that index rolls back cleanly
- Post-migration verification compares counts between source and target
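The UNNEST-based batch insert works by decomposing N rows into parallel per-column arrays and binding one array per column, so a whole batch lands in a single round trip. A minimal sketch of that decomposition (field names here are illustrative, not the real schema):

```rust
// Illustrative split record; the real schema has more columns.
struct SplitRow {
    split_id: String,
    split_state: String,
    delete_opstamp: i64,
}

/// Turn rows into per-column vectors, ready to bind as Postgres arrays in
/// `INSERT ... SELECT * FROM UNNEST($1::text[], $2::text[], $3::bigint[])`.
fn to_columns(rows: &[SplitRow]) -> (Vec<String>, Vec<String>, Vec<i64>) {
    let mut ids = Vec::with_capacity(rows.len());
    let mut states = Vec::with_capacity(rows.len());
    let mut opstamps = Vec::with_capacity(rows.len());
    for row in rows {
        ids.push(row.split_id.clone());
        states.push(row.split_state.clone());
        opstamps.push(row.delete_opstamp);
    }
    (ids, states, opstamps)
}

fn main() {
    let rows = vec![
        SplitRow { split_id: "split-a".into(), split_state: "Published".into(), delete_opstamp: 0 },
        SplitRow { split_id: "split-b".into(), split_state: "Staged".into(), delete_opstamp: 3 },
    ];
    let (ids, states, opstamps) = to_columns(&rows);
    println!("{} rows -> arrays of len {}/{}/{}", rows.len(), ids.len(), states.len(), opstamps.len());
}
```

One query per batch instead of one per row is what keeps the insert phase fast even with many thousands of splits.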
Delete task opstamp remapping is the trickiest part. The file-backed metastore has per-index delete task
opstamp sequences (1, 2, 3...) and splits reference them via delete_opstamp.
PostgreSQL uses a global BIGSERIAL, so we insert delete tasks one-by-one to get
the new auto-assigned opstamps, then remap split delete_opstamp values using
a mapping table. If a split has delete_opstamp=2, we find the highest new
opstamp whose original was <= 2.
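That "highest new opstamp whose original was <= 2" lookup can be sketched with a sorted map (a minimal sketch; the mapping-table representation is an assumption):

```rust
use std::collections::BTreeMap;

/// Map an old per-index delete_opstamp to the new global opstamp:
/// take the highest new opstamp whose original opstamp was <= `old`.
/// `mapping` is old opstamp -> new (BIGSERIAL-assigned) opstamp.
fn remap_opstamp(old: u64, mapping: &BTreeMap<u64, u64>) -> u64 {
    mapping
        .range(..=old)
        .next_back()
        .map(|(_, new)| *new)
        // No delete task at or below `old`: the split predates all delete tasks.
        .unwrap_or(0)
}

fn main() {
    // Old per-index opstamps 1, 2, 3 were re-inserted and got
    // global opstamps 101, 105, 109.
    let mapping: BTreeMap<u64, u64> = [(1, 101), (2, 105), (3, 109)].into();
    assert_eq!(remap_opstamp(2, &mapping), 105);
    assert_eq!(remap_opstamp(0, &mapping), 0);
}
```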
Maturity timestamps map as: mature -> timestamp 0, immature -> create_timestamp + maturation_period.
This is the same logic as upstream `quickwit-metastore/src/metastore/postgres/utils.rs`.
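A sketch of that mapping, assuming an immature split is one that still carries a maturation period (the real types live in the upstream utils module):

```rust
/// Compute the Postgres maturity timestamp for a split.
/// Mature splits get 0; an immature split matures at
/// create_timestamp + maturation_period.
fn maturity_timestamp(create_timestamp: i64, maturation_period_secs: Option<i64>) -> i64 {
    match maturation_period_secs {
        None => 0, // already mature
        Some(period) => create_timestamp + period,
    }
}

fn main() {
    assert_eq!(maturity_timestamp(1_700_000_000, None), 0);
    assert_eq!(maturity_timestamp(1_700_000_000, Some(3_600)), 1_700_003_600);
}
```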
The migration tool only reads from the file-backed metastore (never modifies it), so your source data is always safe. The risk is new data arriving during the migration window that ends up only in the old metastore. Here's the full path:
Provision a Postgres instance. Quickwit is not heavy on metastore queries, so nothing fancy needed — the same instance you'd use for any small-to-medium service. Make sure your Quickwit nodes can reach it.
Run the migration with --dry-run first to see what would be migrated and
catch any config issues:
```
quickwit-metastore-migration \
  --source-config node.yaml \
  --target-postgres-url postgresql://quickwit:pass@pg-host:5432/quickwit_metastore \
  --dry-run
```
You need to stop everything that writes to the metastore. That means:
| Component | Why stop it | What happens if you don't |
|---|---|---|
| Indexers | Create new splits, run merges | New splits appear in file-backed metastore but not in Postgres — data loss after switchover |
| Control plane | Schedules indexing plans, assigns shards | Could trigger new indexing work during migration |
| Janitor | Runs GC, deletes MarkedForDeletion splits from S3 | Could delete split files that the migration tool is about to reference |
Searchers can stay running during migration for read availability. They're read-only against the metastore. Users can keep querying while you migrate.
If you're using Kafka sources: Quickwit tracks consumer offsets (checkpoints) in the metastore. After migration, Quickwit will resume from the last committed checkpoint — no data loss, it just reprocesses from where it left off.
If you're using the ingest API: wait for any in-flight data to be committed to
splits before stopping indexers. Check that no splits are in Staged state
(meaning they haven't been published yet). You can check via:
```
curl http://quickwit:7280/api/v1/indexes/{index_id}/splits?split_states=Staged
```
```
quickwit-metastore-migration \
  --source-config node.yaml \
  --target-postgres-url postgresql://quickwit:pass@pg-host:5432/quickwit_metastore
```
The tool will:
- Run Postgres schema migrations (creates tables, indexes, triggers)
- Migrate each index in its own transaction
- Print progress and verification results
- Exit 0 if everything matches
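The verification step boils down to a per-(index, split_state) count comparison, with any mismatch failing the run. A sketch (the real tool queries both metastores for these counts):

```rust
use std::collections::BTreeMap;

/// Compare per-(index_id, split_state) counts between source and target.
/// Returns the keys whose counts differ; empty means verification passed.
fn diff_counts(
    source: &BTreeMap<(String, String), u64>,
    target: &BTreeMap<(String, String), u64>,
) -> Vec<(String, String)> {
    let mut mismatches = Vec::new();
    // Walk keys from both sides so missing rows count as mismatches too.
    for key in source.keys().chain(target.keys()) {
        if source.get(key) != target.get(key) && !mismatches.contains(key) {
            mismatches.push(key.clone());
        }
    }
    mismatches
}

fn main() {
    let mut source = BTreeMap::new();
    source.insert(("logs".to_string(), "Published".to_string()), 10u64);
    let mut target = source.clone();
    assert!(diff_counts(&source, &target).is_empty());
    // An extra Staged row on the target side is flagged.
    target.insert(("logs".to_string(), "Staged".to_string()), 1);
    assert_eq!(diff_counts(&source, &target).len(), 1);
}
```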
For large clusters, you can migrate a single index first to test:
```
quickwit-metastore-migration \
  --source-config node.yaml \
  --target-postgres-url postgresql://... \
  --index my-important-index
```
The tool runs verification automatically, but you can also sanity-check directly:
```sql
-- Check index count
SELECT COUNT(*) FROM indexes;

-- Check splits per index
SELECT i.index_id, s.split_state, COUNT(*)
FROM splits s JOIN indexes i ON s.index_uid = i.index_uid
GROUP BY i.index_id, s.split_state ORDER BY 1, 2;
```

Change `metastore_uri` in your Quickwit config (node.yaml, Helm values, env vars — however you deploy) from the S3 path to the Postgres URL:
```yaml
# Before
metastore_uri: s3://my-bucket/indexes

# After
metastore_uri: postgresql://quickwit:pass@pg-host:5432/quickwit_metastore
```

Everything else stays the same — `default_index_root_uri` still points to S3 because that's where the actual split data files live. Only the metastore (index metadata, split registry, delete tasks) moves to Postgres.
Start indexers, control plane, janitor. Searchers that were still running will need a restart to pick up the new metastore URI.
Kafka sources will resume from their last checkpoint automatically.
After you're confident everything works, you can remove the old
metastore.json files from S3. They're not needed anymore. Don't delete
the split data files — those are still referenced by the Postgres metastore.
If anything goes wrong after switchover, just point metastore_uri back to the
S3 path and restart. The file-backed metastore was never modified. You'll lose
any data that was ingested after the switchover (it's only in Postgres), but
everything before the migration is intact.
Only write downtime — no new data ingested during the migration window. Read downtime is zero if you keep searchers running. The migration itself takes seconds to minutes depending on how many splits you have (it's just reading JSON from S3 and inserting rows into Postgres).
```
quickwit-metastore-migration \
  --source-config <path-to-node.yaml> \
  --target-postgres-url postgresql://user:pass@host:5432/quickwit_metastore
```
| Flag | Default | Description |
|---|---|---|
| `--source-config` | required | Path to Quickwit node.yaml (metastore_uri + S3 config) |
| `--target-postgres-url` | required | PostgreSQL connection string |
| `--dry-run` | false | Show what would be migrated, don't write |
| `--index <id>` | all | Only migrate this one index |
| `--batch-size <n>` | 500 | Splits per INSERT batch |
| `--skip-schema-setup` | false | Skip running PostgreSQL migrations |
Example source config (node.yaml):

```yaml
version: 0.8
metastore_uri: s3://my-bucket/indexes
default_index_root_uri: s3://my-bucket/indexes
storage:
  s3:
    region: us-east-1
```

Building needs protoc and cmake:

```
apt-get install -y protobuf-compiler cmake   # or: brew install protobuf cmake
RUSTFLAGS="--cfg tokio_unstable" cargo build --release
```

Or with Docker (recommended, avoids local toolchain issues):

```
docker build -t quickwit-metastore-migration .
```
The migrations/ directory contains copies of the upstream Quickwit PostgreSQL
migrations from quickwit-metastore/migrations/postgresql/ at commit
3bfdbbbbf. They're copied (not symlinked) so this tool is self-contained.
These create all the tables Quickwit expects: indexes, splits,
delete_tasks, shards, index_templates, etc.
```
src/
  main.rs      CLI entry, glues everything together
  reader.rs    Reads from FileBackedMetastore via MetastoreService RPCs
  writer.rs    Writes to PostgreSQL via sqlx (schema setup + inserts)
  migrator.rs  Per-index orchestration: read -> transform -> write in a tx
  verify.rs    Post-migration count comparison (source vs target)
```
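The per-index orchestration in migrator.rs can be sketched as an all-or-nothing pipeline: every record must transform cleanly or the whole index fails (and, in the real tool, the surrounding sqlx transaction rolls back). A generic sketch:

```rust
/// All-or-nothing per-index pipeline: transform every record, or fail the
/// whole index on the first error. `collect()` over an iterator of Results
/// stops at the first Err, which models the rollback behavior.
fn migrate_index<T, U>(
    records: Vec<T>,
    transform: impl Fn(T) -> Result<U, String>,
) -> Result<Vec<U>, String> {
    records.into_iter().map(transform).collect()
}

fn main() {
    // All records transform cleanly: the index "commits".
    let ok = migrate_index(vec![1, 2, 3], |n| Ok::<_, String>(n * 10));
    assert_eq!(ok, Ok(vec![10, 20, 30]));

    // One bad record: the whole index fails, nothing partial survives.
    let err = migrate_index(vec![1, -1], |n| {
        if n > 0 { Ok(n) } else { Err("negative".to_string()) }
    });
    assert!(err.is_err());
}
```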
There are two test setups, both using docker-compose.
Tests that the migration tool correctly reads hand-crafted metastore.json
files and inserts the data into PostgreSQL with correct row counts and states.
No Quickwit instance involved - just MinIO + Postgres + the migration tool.
Test data (`test/test_data/`):
- `test-index-clean`: 5 Published splits, 0 delete tasks, simple
- `test-index-dirty`: 20 splits (10 Published, 5 MarkedForDeletion, 5 Staged), 2 delete tasks, tags on splits
Run:

```
docker compose up -d minio postgres
docker compose run --rm minio-setup
docker compose run --rm migration
# Then verify:
./test/verify.sh postgresql://quickwit:quickwit@localhost:15432/quickwit_metastore
docker compose down -v
```

`verify.sh` checks Postgres tables directly: index counts, split counts per state, delete task counts, tags stored as TEXT[].
The real deal. Spins up actual Quickwit instances, indexes 5000 real documents, migrates the metastore, then verifies every single document matches between the old and new metastore backends.
What it does:
- Starts MinIO (S3) + PostgreSQL
- Starts Quickwit with file-backed metastore on MinIO
- Creates an index, ingests 5000 docs in 10 batches (varied log messages, timestamps spanning ~10 hours, 4 log levels, 14 services)
- Runs baseline queries (total count, per-level counts, text search, tag filter, time range filter), dumps all 5000 docs
- Stops file-backed Quickwit
- Runs the migration tool (file-backed -> PostgreSQL)
- Starts Quickwit with PostgreSQL metastore (same S3 for split storage)
- Runs the exact same queries and compares every result
- Dumps all 5000 docs again and compares doc-by-doc (sorted by timestamp+message, field-level comparison)
- Ingests 100 more docs to prove the postgres metastore is writable
- Checks split metadata integrity (time ranges, doc count sums)
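The split metadata integrity check in the last step boils down to summing per-split doc counts and comparing the combined time range, roughly like this (field names are assumptions, not the real API):

```rust
/// Illustrative split metadata; field names are assumptions.
struct SplitMeta {
    num_docs: u64,
    time_start: i64,
    time_end: i64,
}

/// Integrity check after migration: per-split doc counts must sum to the
/// expected total, and the splits' combined time range must match.
fn check_integrity(splits: &[SplitMeta], expected_docs: u64, expected_range: (i64, i64)) -> bool {
    let total: u64 = splits.iter().map(|s| s.num_docs).sum();
    let min_ts = splits.iter().map(|s| s.time_start).min();
    let max_ts = splits.iter().map(|s| s.time_end).max();
    total == expected_docs
        && min_ts == Some(expected_range.0)
        && max_ts == Some(expected_range.1)
}

fn main() {
    let splits = vec![
        SplitMeta { num_docs: 3000, time_start: 100, time_end: 180 },
        SplitMeta { num_docs: 2000, time_start: 150, time_end: 200 },
    ];
    assert!(check_integrity(&splits, 5000, (100, 200)));
    assert!(!check_integrity(&splits, 5100, (100, 200)));
}
```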
Run:

```
./test/e2e-test.sh
```

Takes ~3-4 minutes. Uses the quickwit/quickwit:edge image. Output shows PASS/FAIL for each check.
Generate test data (already committed, but if you want to regenerate):

```
python3 test/generate-docs.py
```

Creates test/batches/batch_001.ndjson through batch_010.ndjson (500 docs each, seeded random so reproducible).
Results are saved to `test/results/` after a run:
- `baseline_docs.jsonl` / `postgres_docs.jsonl` - full doc dumps
- `baseline_sorted.jsonl` / `postgres_sorted.jsonl` - sorted for diff
- `e2e-output.log` - full test output
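The doc-by-doc comparison over the sorted dumps can be sketched as follows (a minimal sketch; real docs carry more fields than timestamp and message):

```rust
/// Sort both dumps by (timestamp, message) and compare element-wise.
/// Returns the number of differing positions; 0 means the dumps match.
fn diff_docs(mut a: Vec<(i64, String)>, mut b: Vec<(i64, String)>) -> usize {
    a.sort();
    b.sort();
    // Mismatched pairs plus any length difference both count as diffs.
    let mismatched = a.iter().zip(b.iter()).filter(|(x, y)| x != y).count();
    mismatched + a.len().abs_diff(b.len())
}

fn main() {
    // Same docs, different dump order: zero differences after sorting.
    let baseline = vec![(2, "b".to_string()), (1, "a".to_string())];
    let postgres = vec![(1, "a".to_string()), (2, "b".to_string())];
    assert_eq!(diff_docs(baseline, postgres), 0);
}
```

Sorting first makes the comparison independent of the order in which each backend returns hits.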
- All 5000 documents are searchable after migration (not just metadata counts)
- Text search, tag filtering, and time range filtering all work identically
- First and last document content matches exactly
- Doc-by-doc comparison of all 5000 docs: zero differences
- Post-migration ingest works (metastore is fully writable)
- Split metadata (time ranges, doc counts) is preserved