We just moved all run logs and spans from PostgreSQL to @ClickHouseDB. Here's why this matters and what we learned migrating billions of OpenTelemetry events 🧵
PostgreSQL hit a wall with dynamic attribute search. Every OTEL span has different attributes, and creating indexes for every possible JSON path wasn't feasible. Filtering/grouping across runs became impossibly slow at scale.
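For a sense of the problem, a sketch (not our actual schema; `task_events` and its columns are hypothetical): a GIN index on a `jsonb` column covers containment lookups, but every other access pattern on an arbitrary attribute path wants its own expression index, and you can't pre-create those for attributes you haven't seen yet.

```sql
-- Rough shape of the problem in Postgres (hypothetical table and columns).
-- The GIN index helps the containment filter, but the range filter on an
-- arbitrary attribute path has no usable index unless an expression index
-- exists for that exact path.
CREATE INDEX idx_task_events_attributes ON task_events USING GIN (attributes);

SELECT run_id, count(*)
FROM task_events
WHERE attributes @> '{"http.status_code": 500}'
  AND (attributes->>'gen_ai.usage.output_tokens')::int > 1000
GROUP BY run_id;
```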
ClickHouse's columnar storage changes everything. Instead of storing entire rows together, it stores each column separately. This means querying just `run_id` and `duration` only reads those columns from disk, not the entire event structure.
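A minimal illustration (table and column names here and in the later sketches are assumptions, not our exact schema):

```sql
-- Only the run_id and duration column files are read for this query;
-- the wide attributes and message columns never leave disk.
SELECT run_id, sum(duration) AS total_duration
FROM task_events
GROUP BY run_id
ORDER BY total_duration DESC
LIMIT 20;
```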
The compression gains are also pretty amazing. Similar log attributes compress extremely well when stored together column-wise. We use ZSTD with delta encoding for timestamps and specialized codecs for different data patterns.
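Roughly what that looks like per column (codec choices and types are illustrative, not our exact definitions):

```sql
CREATE TABLE task_events_codec_sketch
(
    -- near-monotonic values: delta-encode, then ZSTD
    timestamp      DateTime64(9) CODEC(Delta(8), ZSTD(1)),
    -- low-cardinality strings become a dictionary
    environment_id LowCardinality(String),
    -- repetitive attribute/log text: ZSTD does most of the work
    message        String CODEC(ZSTD(1)),
    -- integer series such as durations suit T64 or DoubleDelta
    duration       UInt64 CODEC(T64, ZSTD(1))
)
ENGINE = MergeTree
ORDER BY timestamp;
```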
Our schema uses some clever @ClickHouseDB features:
- Bloom filters for exact ID lookups
- Token-based full-text search on attributes
- minmax indexes for duration queries
- Native JSON type that preserves full OTEL structure
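Put together, a table sketch with those features (parameters, granularities, and column names are guesses, not our production DDL; assumes a recent ClickHouse with the native JSON type):

```sql
CREATE TABLE task_events
(
    environment_id  LowCardinality(String),
    run_id          String,
    trace_id        String,
    span_id         String,
    timestamp       DateTime64(9),
    duration        UInt64,
    expires_at      DateTime64(3),
    attributes      JSON,    -- native JSON type keeps the full OTEL attribute structure
    attributes_text String MATERIALIZED toJSONString(attributes),  -- explained in the next tweet

    INDEX idx_trace_id  trace_id        TYPE bloom_filter(0.001)     GRANULARITY 4,  -- exact ID lookups
    INDEX idx_attr_text attributes_text TYPE tokenbf_v1(32768, 3, 0) GRANULARITY 4,  -- token search
    INDEX idx_duration  duration        TYPE minmax                  GRANULARITY 4   -- duration ranges
)
ENGINE = MergeTree
PARTITION BY toDate(timestamp)
ORDER BY (environment_id, timestamp, trace_id);
```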
The `attributes_text` field is a materialized column: ClickHouse generates it as `toJSONString(attributes)` on insert. It lets us run full-text search over the JSON structure with tokenbf_v1 indexes, without the application having to write the payload twice.
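So a search over attributes looks something like this (the token is made up), with the tokenbf_v1 index skipping granules whose token set can't match:

```sql
SELECT trace_id, span_id, timestamp
FROM task_events
WHERE hasToken(attributes_text, 'PaymentFailedError')  -- hypothetical error token
  AND timestamp >= now() - INTERVAL 6 HOUR;
```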
Partitioning by date is perfect for logs since most queries are time-range based. Combined with ORDER BY on (environment_id, timestamp, trace_id), queries that filter by environment and time range are lightning fast.
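A typical dashboard query against that layout (values are illustrative): the date partition prunes everything outside the time range, and the (environment_id, timestamp, trace_id) primary key narrows the scan inside each surviving partition.

```sql
SELECT trace_id, count() AS spans, max(duration) AS slowest
FROM task_events
WHERE environment_id = 'env_prod_123'      -- hypothetical environment ID
  AND timestamp >= now() - INTERVAL 1 HOUR
GROUP BY trace_id
ORDER BY slowest DESC
LIMIT 50;
```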
TTL is built into the table definition: `TTL toDateTime(expires_at) + INTERVAL 7 DAY`. ClickHouse drops expired data automatically. No cron jobs, no manual cleanup; retention is handled natively.
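On the sketch schema above, that's one clause:

```sql
-- Parts whose rows have all expired are dropped by background merges; no external jobs.
ALTER TABLE task_events
    MODIFY TTL toDateTime(expires_at) + INTERVAL 7 DAY;
```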
We built a dual-store architecture during migration. Both PostgreSQL and ClickHouse implement the same `IEventRepository` interface, with feature flags controlling which store is active. This enabled zero-downtime migration and easy rollback.
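In TypeScript terms, a minimal sketch of the pattern (names other than `IEventRepository` are made up, and the real interface is certainly larger):

```ts
// Hypothetical shapes standing in for the real event/filter types.
type TaskEvent = { runId: string; traceId: string; spanId: string; attributes: Record<string, unknown> };
type EventFilter = { environmentId: string; from: Date; to: Date };

interface IEventRepository {
  insertEvents(events: TaskEvent[]): Promise<void>;
  queryEvents(filter: EventFilter): Promise<TaskEvent[]>;
}

class PostgresEventRepository implements IEventRepository {
  async insertEvents(_events: TaskEvent[]): Promise<void> { /* existing Postgres path */ }
  async queryEvents(_filter: EventFilter): Promise<TaskEvent[]> { return []; }
}

class ClickHouseEventRepository implements IEventRepository {
  async insertEvents(_events: TaskEvent[]): Promise<void> { /* new ClickHouse path */ }
  async queryEvents(_filter: EventFilter): Promise<TaskEvent[]> { return []; }
}

// The feature flag picks the active store; flipping it back is the rollback path.
function eventRepository(flags: { clickhouseEnabled: boolean }): IEventRepository {
  return flags.clickhouseEnabled
    ? new ClickHouseEventRepository()
    : new PostgresEventRepository();
}
```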
ClickHouse gives us eventual consistency instead of ACID guarantees. For observability data, slight delays are acceptable in exchange for massive performance gains. You don't need perfect consistency to debug a failed task run.
Materialized views unlock real-time usage metrics without expensive on-the-fly aggregations. They automatically maintain pre-aggregated stats as events are inserted.
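For example, per-run usage could be maintained like this (view name, bucket, and metrics are assumptions, not our actual views):

```sql
CREATE MATERIALIZED VIEW run_usage_mv
ENGINE = SummingMergeTree
ORDER BY (environment_id, run_id, bucket)
AS
SELECT
    environment_id,
    run_id,
    toStartOfHour(timestamp) AS bucket,
    count()       AS events,
    sum(duration) AS total_duration
FROM task_events
GROUP BY environment_id, run_id, bucket;

-- Read with an outer aggregate so partial rows from unmerged parts add up correctly:
SELECT run_id, sum(events) AS events, sum(total_duration) AS total_duration
FROM run_usage_mv
WHERE environment_id = 'env_prod_123'
GROUP BY run_id;
```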
The hardest parts? Unicode handling (unpaired surrogates make ClickHouse inserts fail), async inserts for write throughput, and preserving nanosecond timestamp precision (DateTime64(9)) for accurate trace correlation.
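The write-path pieces, roughly (settings values are illustrative; sanitizing unpaired surrogates happens application-side before insert):

```sql
-- Let the server buffer and batch many small inserts into fewer parts.
SET async_insert = 1;
SET wait_for_async_insert = 1;  -- 0 acks before flush: lower latency, weaker durability

-- DateTime64(9) keeps nanosecond span timestamps intact for trace correlation.
SELECT
    toDateTime64('2024-01-01 00:00:00.123456789', 9) AS ts,
    toUnixTimestamp64Nano(ts) AS ns;
```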
Result: 50,000 logs per run in the dashboard (up from ~10k), instant filtering/aggregation, and we're now building log search on top of this foundation.
@triggerdotdev @ClickHouseDB This makes sense in general. To clarify, were you using generic Postgres? No extensions from @TigerDatabase or @neondatabase?
@triggerdotdev @ClickHouseDB Perfect, I need to do something similar soon and will steal your schema.
@triggerdotdev @ClickHouseDB The migration sounds solid. ClickHouse's performance for log data is a game changer, especially with large datasets. How's the learning curve with those specialized features?
@triggerdotdev @ClickHouseDB Interesting to see the shift from PostgreSQL to ClickHouse for logs.