# Relay Response Cache — Analysis & Status

**Date:** 2025-02-14
**From:** StreamRift
**Branch:** `streamrift-0` (uncommitted work on top of your commits)

---

## TL;DR

We analyzed the relay's data flow to figure out where duplicate API calls are burning Kalshi rate limit budget. The answer: **discovery/bootstrap REST calls** (fetching today's games, events, markets). The real-time path is already efficient — pure WebSocket push, no polling. We wrote a response cache for the relay but are **leaving it uncommitted for now** since the current system is stable and working. This doc explains what was built, why, and what to know if/when we decide to land it.

---

## The Problem

When multiple users load the dashboard, each one triggers the same discovery flow:

1. `GET /trade-api/v2/events?series_ticker=KXNBA&status=open` — what NBA events exist today?
2. `GET /trade-api/v2/markets?event_ticker=...` — what markets exist per event?
3. `GET /trade-api/v2/orderbook/v2/...` — initial orderbook snapshots

These responses are **identical for every user** — they're public data, not user-specific. But each dashboard instance makes its own calls, burning API quota for the same answers.

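Once loaded, everything switches to WebSocket push (`ticker`, `trade`, `orderbook_delta` channels), with no polling and no duplicate calls in the normal case; the one exception is the `OrderMonitor` REST fallback covered under Real-Time Data Flow. The waste is isolated to the bootstrap path.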

The smart relay's `KalshiFetcher` also re-discovers every `SMART_RELAY_REFRESH_MS` (default 5 min), which is another source of repeated identical GETs.

---

## What Was Built

Three files were modified/created (all uncommitted on `streamrift-0`):

### 1. `apps/relay/src/responseCache.ts` (new file)

A `ResponseCache` class that caches GET responses for known Kalshi endpoints:

- **Keyed by URL** (query params sorted for consistency) — auth headers differ per user but responses are the same
- **Pattern-matched TTLs:**

| Endpoint Pattern                 | TTL    | Rationale                                    |
| -------------------------------- | ------ | -------------------------------------------- |
| `/trade-api/v2/markets?...`      | 5 min  | Market discovery — game slate changes rarely |
| `/trade-api/v2/events/...`       | 30 min | Event metadata — very stable once created    |
| `/trade-api/v2/orderbook/v2/...` | 30 sec | Live orderbook — must stay fresh for trading |

- Auto-prunes stale entries every 60 seconds
- Tracks hit/miss stats (exposed via `/health`)
- Clean `destroy()` for graceful shutdown
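To make the behaviour above concrete, here is a minimal sketch of such a cache. It is not the code on the branch: the `TtlRule` shape, the `keyFor` helper, the injectable `now` clock, and the exact regexes are assumptions made for illustration. Only the TTL values, the sorted-query key, the 60-second prune, the hit/miss stats, and `destroy()` come from the description above.

```typescript
// Hypothetical sketch; names and shapes are assumptions, not the branch code.
interface TtlRule {
  pattern: RegExp;
  ttlMs: number;
}

interface CacheEntry {
  body: string;
  expiresAt: number;
}

// TTL values taken from the table above; the regexes themselves are assumed.
const DEFAULT_RULES: TtlRule[] = [
  { pattern: /\/trade-api\/v2\/markets\?/, ttlMs: 5 * 60_000 },
  { pattern: /\/trade-api\/v2\/events\//, ttlMs: 30 * 60_000 },
  { pattern: /\/trade-api\/v2\/orderbook\/v2\//, ttlMs: 30_000 },
];

class ResponseCache {
  private readonly entries = new Map<string, CacheEntry>();
  private readonly rules: TtlRule[];
  private readonly now: () => number;
  private readonly pruneTimer: ReturnType<typeof setInterval>;
  private hits = 0;
  private misses = 0;

  constructor(rules: TtlRule[] = DEFAULT_RULES, now: () => number = () => Date.now()) {
    this.rules = rules;
    this.now = now;
    // Auto-prune expired entries every 60 seconds.
    this.pruneTimer = setInterval(() => this.prune(), 60_000);
  }

  // Path plus query params in sorted order, so ?b=2&a=1 and ?a=1&b=2 share a key.
  static keyFor(url: string): string {
    const q = url.indexOf("?");
    if (q === -1) return url;
    const path = url.slice(0, q);
    const sorted = url.slice(q + 1).split("&").filter(Boolean).sort().join("&");
    return sorted ? `${path}?${sorted}` : path;
  }

  // Returns the cached body, or undefined on a miss or an expired entry.
  get(url: string): string | undefined {
    const entry = this.entries.get(ResponseCache.keyFor(url));
    if (entry && entry.expiresAt > this.now()) {
      this.hits++;
      return entry.body;
    }
    this.misses++;
    return undefined;
  }

  // Stores the body if some rule matches the URL; returns whether it was stored.
  set(url: string, body: string): boolean {
    const key = ResponseCache.keyFor(url);
    const rule = this.rules.find((r) => r.pattern.test(key));
    if (!rule) return false;
    this.entries.set(key, { body, expiresAt: this.now() + rule.ttlMs });
    return true;
  }

  prune(): void {
    const t = this.now();
    this.entries.forEach((entry, key) => {
      if (entry.expiresAt <= t) this.entries.delete(key);
    });
  }

  stats(): { hits: number; misses: number; size: number } {
    return { hits: this.hits, misses: this.misses, size: this.entries.size };
  }

  // Stops the prune timer so the process can exit cleanly.
  destroy(): void {
    clearInterval(this.pruneTimer);
    this.entries.clear();
  }
}
```

Injecting the clock is purely a testing convenience in this sketch; it lets TTL expiry be exercised without waiting on real time.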

### 2. `apps/relay/src/config.ts` (modified)

Added `relayCacheEnabled` config flag:

- Default: `true`
- Opt-out: set `RELAY_CACHE_ENABLED=false` in env
- No other config changes
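A flag with these semantics could be parsed roughly as follows. This is a sketch, not the contents of `config.ts`: the helper name `parseRelayCacheEnabled` is hypothetical, and treating only the literal value `false` (case-insensitive) as an opt-out is an assumption.

```typescript
// Hypothetical helper; the real config.ts may parse the flag differently.
function parseRelayCacheEnabled(env: Record<string, string | undefined>): boolean {
  const raw = env["RELAY_CACHE_ENABLED"];
  // Unset means "use the default", which is enabled.
  if (raw === undefined) return true;
  // Assumption: only an explicit "false" disables the cache.
  return raw.trim().toLowerCase() !== "false";
}

// Usage in a Node process would look like:
//   const relayCacheEnabled = parseRelayCacheEnabled(process.env);
```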

### 3. `apps/relay/src/index.ts` (modified)

Wired the cache into the HTTP relay route:

- **Before forwarding:** check cache for GET requests → return cached response on hit
- **After forwarding:** store successful (200) GET responses for cacheable URLs
- Cache stats included in `/health` endpoint response
- Cache cleaned up on server shutdown
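The check-before, store-after flow can be sketched independently of the HTTP framework. The function name `relayWithCache`, the `CacheLike` and `RelayResponse` shapes, and the `cached` marker on the result are all assumptions for illustration; the actual wiring in `index.ts` may be structured differently.

```typescript
// Hypothetical shapes; the real relay route may carry headers, streams, etc.
interface RelayResponse {
  status: number;
  body: string;
}

interface CacheLike {
  get(url: string): string | undefined;
  set(url: string, body: string): boolean;
}

async function relayWithCache(
  cache: CacheLike | null, // null models relayCacheEnabled = false
  method: string,
  url: string,
  forward: () => Promise<RelayResponse>, // performs the real upstream call
): Promise<RelayResponse & { cached: boolean }> {
  const cacheable = cache !== null && method === "GET";

  // Before forwarding: serve GET requests from the cache on a hit.
  if (cacheable) {
    const hit = cache.get(url);
    if (hit !== undefined) return { status: 200, body: hit, cached: true };
  }

  const res = await forward();

  // After forwarding: store only successful (200) GET responses.
  if (cacheable && res.status === 200) cache.set(url, res.body);

  return { ...res, cached: false };
}
```

One property of this sketch worth noting: identical requests that arrive while the first upstream call is still in flight are each forwarded, because nothing is stored until a response comes back.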

---

## Real-Time Data Flow (No Changes Needed)

For context, here's how the live data path works — this is **not** affected by caching:

| Data Type         | Transport            | Cycle        | Notes                                                            |
| ----------------- | -------------------- | ------------ | ---------------------------------------------------------------- |
| Market prices     | WebSocket push       | Event-driven | Kalshi pushes `ticker` + `trade` events as they happen           |
| Orderbook deltas  | WebSocket push       | Event-driven | Kalshi pushes `orderbook_delta` on changes                       |
| User fills/orders | WebSocket push       | Event-driven | Via user stream subscription                                     |
| Order monitoring  | REST poll (fallback) | 2 sec        | `OrderMonitor` class — only used when WS user stream unavailable |

The WebSocket relay (`wsRelay.ts`) forwards raw bytes bidirectionally with no buffering or batching. Latency = network hops only. The 10-second ping/pong is just keepalive per Kalshi's protocol.

---

## Open Questions / Things to Consider

### 1. Event listing endpoint pattern

The current cache rules match `/trade-api/v2/events/` (with a trailing slash, i.e. path-based lookups like `/events/SOME_TICKER`). But step 1 of the discovery flow listed under The Problem is the query-string listing form, `GET /trade-api/v2/events?series_ticker=KXNBA&status=open`. If that is what the dashboard actually sends, that URL would **miss** the cache rule. Worth checking whether the actual discovery calls use the path form or the query form, and adjusting the regex if needed.

A simple fix would be to also match: `{ pattern: /\/trade-api\/v2\/events\?/, ttlMs: 30 * 60_000 }`
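The difference between the two URL forms is easy to check directly. The snippet below tests the existing path-style pattern, the proposed query-style pattern, and a single combined pattern against both forms; the combined regex is an alternative suggested here, not something already on the branch.

```typescript
const pathOnly = /\/trade-api\/v2\/events\//; // current rule (path lookups)
const queryOnly = /\/trade-api\/v2\/events\?/; // proposed additional rule
const either = /\/trade-api\/v2\/events[\/?]/; // hypothetical combined rule

const listing = "/trade-api/v2/events?series_ticker=KXNBA&status=open";
const lookup = "/trade-api/v2/events/SOME_TICKER";

const results = {
  pathOnlyMatchesListing: pathOnly.test(listing), // false: the listing call is not cached
  pathOnlyMatchesLookup: pathOnly.test(lookup), // true
  queryOnlyMatchesListing: queryOnly.test(listing), // true
  queryOnlyMatchesLookup: queryOnly.test(lookup), // false
  eitherMatchesBoth: either.test(listing) && either.test(lookup), // true
};

console.log(results);
```

Keeping two separate rules leaves room to give the listing form a different TTL from single-event lookups; the combined pattern is shorter but forces them to share one.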

### 2. Market discovery TTL could be longer

5 minutes is conservative for "what games are on today." The daily slate rarely changes mid-session, so a TTL of 15 to 30 minutes would save more API calls with little staleness risk. The tradeoff: if a new event gets listed (rare), users would not see it until the cached entry expires, which is up to 30 min at the top of that range.
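To put rough numbers on the TTL tradeoff, the sketch below computes an upper bound on upstream fetches per hour for one unique URL, assuming requests for that URL arrive continuously so the entry is refetched as soon as it expires. The function name is hypothetical, and real traffic will usually produce fewer fetches than this bound.

```typescript
// Upper bound: one upstream fetch per TTL window, per unique cached URL.
function maxUpstreamFetchesPerHour(ttlMinutes: number): number {
  return Math.ceil(60 / ttlMinutes);
}

for (const ttl of [5, 15, 30]) {
  console.log(`TTL ${ttl} min: at most ${maxUpstreamFetchesPerHour(ttl)} fetches/hour per URL`);
}
// TTL 5 min: at most 12 fetches/hour per URL
// TTL 15 min: at most 4 fetches/hour per URL
// TTL 30 min: at most 2 fetches/hour per URL
```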

### 3. Orderbook cache is aggressive for trading

30-second TTL on orderbook snapshots means users could see stale book data on initial load. For the bootstrap case this is fine (WS deltas take over immediately). But if something re-fetches the orderbook via REST mid-session, it could get a 30-second-old snapshot. Probably fine since the WS maintains local state after that, but worth being aware of.

### 4. Smart Relay overlap

The smart relay (`SmartRelay` + `MarketCache`) already caches market data server-side for its streaming clients. The response cache is a separate, simpler layer for the basic HTTP relay path. They serve different use cases:

- **Smart relay cache:** Real-time streaming to `/stream/markets` WebSocket clients. Event-driven updates, subscriber-based.
- **Response cache:** Deduplicates identical REST GETs through `/relay/http`. Request-response, TTL-based.

Both can coexist. Long-term, if all clients move to the smart relay streaming path, the response cache becomes less important. But for now, most dashboard instances use the standard REST discovery → WS subscription flow.

### 5. No cache invalidation mechanism

The cache is purely TTL-based — no way to force-invalidate entries. If you ever need to bust the cache (e.g., after a deploy, or if Kalshi data is stale), the only option is restarting the relay process. A simple `POST /admin/cache/clear` endpoint would be easy to add if needed.
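A clear endpoint could look roughly like the sketch below. It is framework-agnostic on purpose: `handleCacheClear`, the `ClearableCache` interface, and the `clear()` method returning a count are all hypothetical, since the current cache has no such method. A real version would also need some form of access control, which is omitted here.

```typescript
// Hypothetical interface; the current ResponseCache exposes no clear() method.
interface ClearableCache {
  clear(): number; // returns how many entries were dropped
}

interface AdminResponse {
  status: number;
  body: string;
}

// Returns null when the request is not for this route, so the caller can
// fall through to its normal routing.
function handleCacheClear(
  method: string,
  path: string,
  cache: ClearableCache,
): AdminResponse | null {
  if (path !== "/admin/cache/clear") return null;
  if (method !== "POST") {
    return { status: 405, body: JSON.stringify({ error: "method not allowed" }) };
  }
  const cleared = cache.clear();
  return { status: 200, body: JSON.stringify({ cleared }) };
}

// Tiny in-memory stand-in used only to exercise the handler.
class DemoCache implements ClearableCache {
  private readonly entries = new Map<string, string>();

  set(key: string, value: string): void {
    this.entries.set(key, value);
  }

  size(): number {
    return this.entries.size;
  }

  clear(): number {
    const n = this.entries.size;
    this.entries.clear();
    return n;
  }
}
```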

---

## Current Decision

**Leave it uncommitted.** The system is stable and working. The cache is a nice optimization but not critical — we're not hitting rate limits today, and the current call volume is manageable. The code is clean and ready to land whenever we decide to.

If we start seeing rate limit pressure (429s from Kalshi) or add more concurrent users, this is the first lever to pull.

---

## Files Reference

```
apps/relay/src/responseCache.ts    # NEW — ResponseCache class
apps/relay/src/config.ts           # MODIFIED — added relayCacheEnabled flag
apps/relay/src/index.ts            # MODIFIED — wired cache into relay route
```

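All changes are on `streamrift-0`, uncommitted. Run `git status` to list them and `git diff HEAD` to view the modifications to tracked files. Note that `git stash` would shelve the changes rather than show them, and that the new `responseCache.ts`, if it is still untracked, is listed by `git status` but does not appear in the diff.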
