ZAI LLM reverse proxy and metrics dashboard
jedarden 9799d75d2b feat(dashboard): add cache token tracking and running totals panel
Adds cache_read and cache_write token directions throughout the
observability stack so Anthropic prompt-cache billing is visible.

- model/metrics.go: TokensCacheRead, TokensCacheWrite, TokenRateCacheRead,
  TokenRateCacheWrite fields on MetricSnapshot
- collector: reads direction=cache_read/cache_write from
  zai_proxy_tokens_total Prometheus metric
- frontend types.ts: matching TS fields
- TokenPanel: rewritten to show all 4 directions (input, output,
  cache_read, cache_write) on the rate chart; running-total summary
  strip above the chart shows window totals (e.g. "5h window: 1.2M
  input / 340k output / 89k cache_read / 12k cache_write")

Also updates docs/plan/plan.md to accurately document the full
dashboard architecture (backend API, storage schema, SSE hub,
frontend panels, Grafana layer, env vars).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-16 23:08:28 -04:00
Name        Last commit                                                           Date
dashboard   feat(dashboard): add cache token tracking and running totals panel   2026-05-16 23:08:28 -04:00
docs        feat(dashboard): add cache token tracking and running totals panel   2026-05-16 23:08:28 -04:00
proxy       chore: update module paths and add evaluation package                2026-05-16 16:03:50 -04:00
.gitignore  feat: initial zai-proxy ecosystem repo                               2026-05-16 15:53:52 -04:00
README.md   feat: initial zai-proxy ecosystem repo                               2026-05-16 15:53:52 -04:00

zai-proxy

LLM reverse proxy and metrics dashboard for the Z.AI API.

Components

proxy/

OpenAI-compatible reverse proxy that fronts the Z.AI API. Features:

  • Request/response body parsing and token counting (tiktoken + GLM tokenizers)
  • Rate limiting with configurable burst and steady-state limits
  • Prometheus metrics export
  • Blue/green and canary deployment support
  • Translation layer for provider-specific request/response formats
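
The burst and steady-state limits listed above are commonly implemented with a token bucket. The following is a minimal sketch of that technique, not the proxy's actual limiter: the names TokenBucket, NewTokenBucket, Allow, and demo are hypothetical, and the rate and burst values are chosen only for illustration.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// TokenBucket is a hypothetical limiter. It holds at most `burst` tokens and
// refills at `rate` tokens per second, which is the steady-state limit.
type TokenBucket struct {
	mu     sync.Mutex
	rate   float64   // tokens added per second
	burst  float64   // maximum number of tokens the bucket can hold
	tokens float64   // tokens currently available
	last   time.Time // time of the most recent refill
}

// NewTokenBucket returns a bucket that starts full at time now.
func NewTokenBucket(rate float64, burst int, now time.Time) *TokenBucket {
	return &TokenBucket{rate: rate, burst: float64(burst), tokens: float64(burst), last: now}
}

// Allow reports whether one request may proceed at time now. The clock is
// passed in explicitly so the behaviour is deterministic in tests.
func (b *TokenBucket) Allow(now time.Time) bool {
	b.mu.Lock()
	defer b.mu.Unlock()
	if elapsed := now.Sub(b.last).Seconds(); elapsed > 0 {
		b.tokens += elapsed * b.rate
		if b.tokens > b.burst {
			b.tokens = b.burst // never accumulate more than the burst size
		}
		b.last = now
	}
	if b.tokens < 1 {
		return false
	}
	b.tokens--
	return true
}

// demo sends four requests at the same instant, then one more 500 ms later.
func demo() []bool {
	start := time.Unix(0, 0)
	b := NewTokenBucket(2, 3, start) // assumed limits: 2 req/s, burst of 3
	var results []bool
	for i := 0; i < 4; i++ {
		results = append(results, b.Allow(start))
	}
	return append(results, b.Allow(start.Add(500*time.Millisecond)))
}

func main() {
	fmt.Println(demo()) // prints [true true true false true]
}
```

The first three requests drain the burst allowance, the fourth is rejected, and after half a second at 2 tokens per second exactly one token has been refilled.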

See proxy/README.md for setup and configuration.

dashboard/

Go backend + React frontend for visualizing proxy metrics, token usage, and request history.

  • Collects metrics from the proxy's Prometheus endpoint
  • Stores aggregated data in SQLite
  • Serves the React frontend (built with Vite and Tailwind) and pushes live updates to it over SSE
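
As a rough illustration of the collection step, the sketch below sums zai_proxy_tokens_total samples per direction label from Prometheus text-format output. It is not the dashboard's collector: the function names, the sample payload, and the HELP text are made up for the example, and the parser deliberately ignores escaped quotes and commas inside label values.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

const metricName = "zai_proxy_tokens_total"

// sample is an invented scrape payload; the values are illustrative only.
const sample = `# HELP zai_proxy_tokens_total Tokens processed by the proxy.
# TYPE zai_proxy_tokens_total counter
zai_proxy_tokens_total{direction="input"} 1200000
zai_proxy_tokens_total{direction="output"} 340000
zai_proxy_tokens_total{direction="cache_read"} 89000
zai_proxy_tokens_total{direction="cache_write"} 12000
`

// labelValue extracts one label from a label set such as
// `model="x",direction="input"`. Escaped quotes are not handled.
func labelValue(labels, key string) (string, bool) {
	for _, pair := range strings.Split(labels, ",") {
		k, v, ok := strings.Cut(strings.TrimSpace(pair), "=")
		if ok && k == key {
			return strings.Trim(v, `"`), true
		}
	}
	return "", false
}

// tokensByDirection sums the metric's samples per direction label,
// skipping comments, other metrics, and malformed lines.
func tokensByDirection(exposition string) map[string]float64 {
	totals := make(map[string]float64)
	sc := bufio.NewScanner(strings.NewReader(exposition))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if !strings.HasPrefix(line, metricName+"{") {
			continue
		}
		end := strings.LastIndex(line, "}")
		if end < 0 {
			continue
		}
		dir, ok := labelValue(line[len(metricName)+1:end], "direction")
		fields := strings.Fields(line[end+1:]) // value, then optional timestamp
		if !ok || len(fields) == 0 {
			continue
		}
		if v, err := strconv.ParseFloat(fields[0], 64); err == nil {
			totals[dir] += v
		}
	}
	return totals
}

func main() {
	totals := tokensByDirection(sample)
	for _, dir := range []string{"input", "output", "cache_read", "cache_write"} {
		fmt.Printf("%s: %.0f\n", dir, totals[dir])
	}
}
```

Summing per direction means that series which differ only in other labels collapse into one total per direction. A production collector would more likely use a full exposition-format parser than string splitting.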

See dashboard docs for deployment and monitoring.

Docs

  • docs/plan/ — architecture decisions and roadmaps
  • docs/notes/ — deployment, operations, monitoring, canary procedures
  • docs/research/ — tokenizer research, metrics references

Git remotes

  • Forgejo (primary): https://git.ardenone.com/jedarden/zai-proxy.git