3.2x
Macro endpoint (1,476ms → 465ms warm)
13x
Regime endpoint (1,048ms → 79ms warm)
<5ms
Signal feed latency (Redis cache)
87ms
p99 ClickHouse query latency overall
CHAPTER 01
Argus generated regime signals, novelty anomalies, and trend-following calls from a 1,400-feature engine built on AVX2 SIMD. The output of that computation was only valuable if downstream consumers could query it at low latency. Two classes of consumers existed. First, Apex: the trading executor that consumed signals from Redis Streams and needed regime context from ClickHouse. Second, the Avo public site: a Next.js App Router application serving market dashboards, signal feeds, screener pages, and a real-time intelligence view to end users.
The ClickHouse deployment ran on a single server with 32 GB of RAM allocated to the engine, a maximum of 100 concurrent queries, and 16 threads per query. This was not a distributed cluster. It was a single-node OLAP instance with purpose-built codecs and vectorized query execution. The challenge was making that single node serve both low-latency dashboard queries and high-throughput signal generation without one workload starving the other.
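Expressed as ClickHouse configuration, that budget looks roughly like the following (one plausible rendering of the stated limits, not the deployed files):

```xml
<!-- config.xml: server-wide limits -->
<clickhouse>
    <max_concurrent_queries>100</max_concurrent_queries>
    <max_server_memory_usage>34359738368</max_server_memory_usage> <!-- 32 GiB -->
</clickhouse>

<!-- users.xml: per-query thread cap in the default profile -->
<profiles>
    <default>
        <max_threads>16</max_threads>
    </default>
</profiles>
```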
Three concrete problems emerged as the system grew:
First, the public macro endpoint (/api/macro) queried ClickHouse for macroeconomic indicator data on every request with no caching. Cold-cache latency was 1,476 ms. That was unacceptable for a public-facing API route.
Second, the regime endpoint (/api/intelligence/regime) performed a collection scan across 209,033 Redis regime keys on every request. Cold-cache latency was 1,048 ms. This endpoint was called on every page load of the intelligence dashboard.
Third, the daily bar downloader and minute bar ingest ran as concurrent ClickHouse INSERT workloads. Under high ingestion load, analytical SELECT queries from the dashboard and signal generation pipeline experienced latency spikes of up to 3x normal. The async insert buffer (4 threads, 10 MB) was sized correctly for sustained ingestion but provided insufficient isolation from read workloads.
---
CHAPTER 02
The solution operated at three layers: codec-level storage optimization in ClickHouse, a Redis caching layer with tiered TTLs for each query category, and a background cache warmer daemon that precomputed high-cost queries outside the request path.
ClickHouse was configured with the Gorilla codec on all OHLCV price columns (exploiting the smooth, autocorrelated nature of financial time series), DoubleDelta on timestamp columns (minimal delta encoding for regular-interval series), LowCardinality on symbol string columns (turning what would be 93.64M string comparisons into integer dictionary lookups), and ZSTD(3) as the final compression layer. The result was 5 to 10x compression versus raw storage, keeping the bars_1m table (714.4M rows) within the bounds of the 32 GB query memory budget.
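A schema sketch showing how those codecs compose (table layout and column names are assumed for illustration, not taken from the deployed DDL):

```sql
-- Codec layout sketch for an OHLCV minute-bar table
CREATE TABLE bars_1m
(
    symbol LowCardinality(String),                  -- dictionary-encoded symbol
    ts     DateTime    CODEC(DoubleDelta, ZSTD(3)), -- regular-interval timestamps
    open   Float64     CODEC(Gorilla, ZSTD(3)),     -- smooth, autocorrelated prices
    high   Float64     CODEC(Gorilla, ZSTD(3)),
    low    Float64     CODEC(Gorilla, ZSTD(3)),
    close  Float64     CODEC(Gorilla, ZSTD(3)),
    volume Float64     CODEC(Gorilla, ZSTD(3))
)
ENGINE = MergeTree
ORDER BY (symbol, ts);
```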
The Redis caching layer sat between the Next.js API routes and ClickHouse. Cache keys were namespaced under avo:cache:*. The background warmer ran as a persistent process updating high-cost keys on a schedule tied to each key's expected refresh cadence.
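On the route side, that layer reduces to a small cache-aside helper; this TypeScript sketch is illustrative (helper name and connection handling are assumptions, using the node-redis v4 client):

```typescript
// Cache-aside read path for the API routes. The client is assumed to be
// connected at startup; keys are namespaced under avo:cache:*.
import { createClient } from "redis";

const redis = createClient({ url: process.env.REDIS_URL });

export async function cached<T>(
  key: string,                  // e.g. "avo:cache:macro"
  ttlSeconds: number,
  compute: () => Promise<T>     // the ClickHouse query, run only on a miss
): Promise<T> {
  const hit = await redis.get(key);
  if (hit !== null) return JSON.parse(hit) as T;

  const value = await compute();
  await redis.set(key, JSON.stringify(value), { EX: ttlSeconds });
  return value;
}
```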
---
ARCHITECTURE OVERVIEW
PRODUCERS · Rust 1.84, Tokio 1.40
STREAM · ClickHouse 24.3, ordered / durable
CONSUMERS · Redis 7.2, Next.js App Router
SINK · ISR (stale-while-revalidate), ack + replay
CHAPTER 03
Macro endpoint caching. The /api/macro route queried ClickHouse for macroeconomic indicator values (FRED series: GDP, CPI, unemployment rate, Fed Funds rate, and 27 other series with data from 2010 to present). The query joined across multiple macro tables and sorted by latest observation date. This query ran in 1,476 ms cold. The caching strategy applied a 5-minute Redis TTL on the result set. On warm cache, response time dropped to 465 ms. The 70% improvement in perceived latency came entirely from eliminating the ClickHouse round trip on repeated requests. The warmer updated this key every 4 minutes, ensuring the cache was always pre-populated before TTL expiry.
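A minimal sketch of that warmer cycle, assuming a hypothetical fetchMacroFromClickHouse query helper and the redis client from the cache-aside sketch above (key name illustrative):

```typescript
// Warmer cycle for the macro key: a 4-minute refresh under a 5-minute TTL
// means user requests should never observe an expired key.
declare function fetchMacroFromClickHouse(): Promise<unknown[]>;

const MACRO_KEY = "avo:cache:macro"; // key name illustrative
const TTL_SECONDS = 300;             // 5-minute TTL
const REFRESH_MS = 4 * 60 * 1000;    // 4-minute warmer cadence

async function warmMacro(): Promise<void> {
  const rows = await fetchMacroFromClickHouse();
  await redis.set(MACRO_KEY, JSON.stringify(rows), { EX: TTL_SECONDS });
}

setInterval(() => {
  warmMacro().catch((err) => console.error("macro warm failed", err));
}, REFRESH_MS);
```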
Regime collection caching. The /api/intelligence/regime route originally iterated over all 209,033 Redis regime keys using a SCAN loop to build a regime summary. SCAN on 209,033 keys took 1,048 ms even with Redis's non-blocking approach, because the collection traversal was synchronous from the API route's perspective. The fix was a single pre-aggregated cache key (avo:cache:regime:summary) that the regime pipeline updated after every write cycle. Instead of scanning 209,033 keys per request, each request performed a single GET. Latency dropped from 1,048 ms to 79 to 90 ms (an 11 to 13x improvement). The warmer updated this key in sync with argus-regime's 30-second polling cycle.
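Both halves of that fix are small. The sketch below assumes a hypothetical buildRegimeSummary aggregator on the pipeline side; the 209,033 per-symbol keys stay in place for Apex's point lookups:

```typescript
// Pre-aggregated summary key: the pipeline writes once per cycle,
// the API route reads once per request.
declare function buildRegimeSummary(): Promise<Record<string, unknown>>;

const SUMMARY_KEY = "avo:cache:regime:summary";

// Producer side: called after every regime write cycle (~30 s).
async function publishRegimeSummary(): Promise<void> {
  const summary = await buildRegimeSummary();
  await redis.set(SUMMARY_KEY, JSON.stringify(summary), { EX: 90 }); // TTL illustrative
}

// Consumer side: the entire per-request cost is one GET.
async function getRegimeSummary(): Promise<Record<string, unknown> | null> {
  const raw = await redis.get(SUMMARY_KEY);
  return raw ? (JSON.parse(raw) as Record<string, unknown>) : null;
}
```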
Signal feed (real-time). The /api/intelligence/latest route served the most recent signals from argus.signals via a Redis key (avo:cache:home:signals) with a 30-second TTL and 60-second revalidate. The underlying ClickHouse query was:
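A sketch of its shape, a ts-ordered tail read (column names beyond ts are illustrative):

```sql
-- Latest 50 signals, served directly off the ts ordering
SELECT ts, symbol, source, direction, confidence
FROM argus.signals
ORDER BY ts DESC
LIMIT 50
```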
At 1.59 million rows with a ts index, this query returned in under 30 ms. The cache served as a rate limiter (preventing 100 concurrent users from all hitting ClickHouse simultaneously) rather than a performance optimization. The 30-second TTL was chosen to match the argus-signals emission cadence: signals were emitted in batches every 30 to 60 seconds, so a cache TTL shorter than 30 seconds would have produced unnecessary ClickHouse queries with identical results.
Backpressure between ingest and analytics. ClickHouse's async insert setting (4 threads, 10 MB buffer, 200 ms flush interval) handled the ingest write path. The read path (dashboard queries, signal queries) ran in separate query threads. The 100-query concurrency limit was the shared resource. Under simultaneous heavy ingest and dashboard load, the ingest write threads consumed up to 8 of the 100 slots, leaving 92 for read queries. In practice, read query concurrency from the Avo site peaked at 12 to 15 during testing (10 concurrent users each triggering 1 to 2 API routes), well within the remaining capacity.
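One plausible rendering of those ingest settings as ClickHouse session settings, following the figures above (the thread count is a server-level setting, noted in the comment):

```sql
-- Applied per INSERT session or in the ingest user's profile
SET async_insert = 1;
SET async_insert_max_data_size = 10485760;  -- 10 MB buffer
SET async_insert_busy_timeout_ms = 200;     -- 200 ms flush interval
-- async_insert_threads = 4 is configured server-side in config.xml
```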
Lag monitoring for the query layer. The argus-health binary tracked query latency by issuing a calibration query every 60 seconds:
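One plausible shape for such a probe (illustrative: a cheap read against the ts ordering key that still exercises the full query path):

```sql
-- Calibration probe: small, indexed, and representative of dashboard reads
SELECT count()
FROM argus.signals
WHERE ts >= now() - INTERVAL 5 MINUTE
```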
This query measured both ClickHouse availability and approximate query responsiveness. Expected return time was under 50 ms. If the query took over 200 ms, it indicated either index saturation or high concurrent query load. If it took over 1,000 ms, it indicated a ClickHouse issue (memory pressure, background merge contention, or disk I/O saturation).
Consumer group model for downstream analytics. Apex consumed signals from Redis Streams via consumer group apex-consumer. The Avo site consumed signals via the Redis cache key (single GET, no consumer group). These two access patterns were completely decoupled: Apex used the durable stream with acknowledgment; the Avo site used the ephemeral cache snapshot. A slow Avo page load therefore could not delay Apex signal processing, and an Apex crash could not break the Avo site, which would simply keep serving the last cached snapshot until the warmer refreshed it.
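The two paths side by side (stream key, consumer name, and entry handling are illustrative; node-redis v4 client from the earlier sketch):

```typescript
// Apex path: durable, ordered, acknowledged.
const batch = await redis.xReadGroup(
  "apex-consumer",                     // consumer group
  "apex-1",                            // this consumer's name (illustrative)
  [{ key: "argus:signals", id: ">" }], // only entries never delivered to the group
  { COUNT: 10, BLOCK: 5000 }
);
// ...process each entry, then:
// await redis.xAck("argus:signals", "apex-consumer", entryId);

// Avo path: ephemeral snapshot, no group membership, no acknowledgment.
const snapshot = await redis.get("avo:cache:home:signals");
```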
Replay capability. For historical analytics (screener queries, 5-year chart views, regime history), the Avo site queried ClickHouse's bars_1d directly with time-range filtering. These queries were cached with longer TTLs (1 hour for daily historical data, 3,600 seconds stale-while-revalidate). Historical signal replay was available through the argus-api binary at port 8080, which supported filtering by symbol, date range, source, and confidence threshold. The full argus.signals table (1.59M rows) was retained in ClickHouse without deletion, making any historical period replayable.
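In App Router terms, the historical routes pair the long Redis TTL with route-level ISR; revalidate is the App Router's stale-while-revalidate knob, and queryBars1d is a hypothetical ClickHouse helper (file path illustrative):

```typescript
// app/api/history/bars/route.ts
import { cached } from "@/lib/cache"; // the cache-aside helper sketched earlier

declare function queryBars1d(): Promise<unknown[]>;

export const revalidate = 3600; // 1-hour stale-while-revalidate window

export async function GET(): Promise<Response> {
  const bars = await cached("avo:cache:history:bars_1d", 3600, queryBars1d);
  return Response.json(bars);
}
```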
---
TECH STACK
Rust 1.84 · Tokio 1.40 · ClickHouse 24.3 · Redis 7.2 · Next.js App Router
CHAPTER 04
- Macro endpoint: 1,476 ms cold reduced to 465 ms warm. Cache hit rate measured at approximately 94% during testing; the 6% miss rate corresponded to the first request after TTL expiry but before the warmer refresh.
- Regime endpoint: 1,048 ms cold reduced to 79 to 90 ms warm, an 11.6 to 13.2x improvement from eliminating the 209,033-key Redis scan per request.
- Signal feed latency: the ClickHouse query for the latest 50 signals returned in under 30 ms. A Redis cache GET returned in under 5 ms. Aggregate API response time (including Next.js overhead) was under 80 ms warm.
- p99 ClickHouse query latency overall: measured with the calibration query over a 24-hour window. Median was 18 ms; p99 was 87 ms. The peak (heavy ingest coinciding with a dashboard spike) reached 210 ms, well under the 500 ms WARN threshold.
- Cache warmer reliability: over a 72-hour window, the warmer refreshed all monitored cache keys before TTL expiry on 99.1% of cycles. The 0.9% miss rate corresponded to a planned ClickHouse restart, during which the warmer retried with exponential backoff and repopulated all keys within 45 seconds of ClickHouse becoming available again.
- Compression: the bars_1m table at 714.4 million rows occupied approximately 117 GB on disk (measured via system.parts). Uncompressed storage would have been approximately 700 to 850 GB, so the Gorilla plus ZSTD(3) codec combination delivered roughly 6 to 7x compression, consistent with the 5 to 10x design target.
- All API routes under 500 ms cold: after caching was applied across the macro, regime, and signal feed routes, every primary API route benchmarked under 500 ms cold and under 250 ms warm, versus a pre-caching state where macro and regime were both over 1,000 ms cold.
---
CHAPTER 05
DECISION · 01
Redis as a query result cache is fundamentally different from Redis as a data store. The regime key pattern (209,033 individual HSET writes, one per symbol-timeframe combination) was a data store pattern. It worked for Apex's point-lookup access pattern (read one regime key for one symbol at a time) but was catastrophic for the collection scan pattern needed by the dashboard. The fix was not to abandon Redis but to add a layer: maintain the individual keys for point lookups, add a pre-aggregated summary key for collection queries. Both patterns were satisfied without architectural change.
DECISION · 02
Background warmers eliminate the cold-start problem at the cost of slight staleness. A 5-minute cache TTL with a 4-minute warmer update cycle guaranteed that cache misses were rare (happening only during warmer downtime or first deployment). The tradeoff was that macro data was always up to 5 minutes stale relative to ClickHouse. For macroeconomic indicators published by FRED with daily to weekly update cadences, 5-minute staleness was irrelevant. The warmer cycle was matched to the data's natural refresh rate, not to an arbitrary performance target.
DECISION · 03
Query isolation by access pattern prevents workload interference. The decision to separate the durable stream consumer path (Apex via XREADGROUP) from the ephemeral cache path (Avo site via GET) meant that slow dashboard queries never affected signal delivery latency. In a design where both consumers shared the same query path against ClickHouse, a slow historical chart query from a site visitor could have delayed regime data available to Apex. Routing access patterns to their appropriate mechanism (streams for durability, cache for read scalability) removed this coupling.
DECISION · 04
Codec selection is a one-time decision with long-term compounding returns. Choosing Gorilla for price columns and DoubleDelta for timestamps at schema creation time delivered compression benefits across every subsequent insert without any additional engineering. At 714.4 million rows, the difference between 6x and 2x compression represented roughly 350 GB of disk capacity recovered. That capacity is now available for continued data growth without hardware changes.