ZAI LLM reverse proxy and metrics dashboard
Adds cache_read and cache_write token directions throughout the observability stack so Anthropic prompt-cache billing is visible.
- model/metrics.go: TokensCacheRead, TokensCacheWrite, TokenRateCacheRead, TokenRateCacheWrite fields on MetricSnapshot
- collector: reads direction=cache_read/cache_write from the zai_proxy_tokens_total Prometheus metric
- frontend types.ts: matching TS fields
- TokenPanel: rewritten to show all 4 directions (input, output, cache_read, cache_write) on the rate chart; a running-total summary strip above the chart shows window totals (e.g. "5h window: 1.2M input / 340k output / 89k cache_read / 12k cache_write")
Also updates docs/plan/plan.md to accurately document the full dashboard architecture (backend API, storage schema, SSE hub, frontend panels, Grafana layer, env vars).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
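The collector change described above can be illustrated with a minimal sketch that sums `zai_proxy_tokens_total` samples by their `direction` label from Prometheus text exposition. This is not the repository's collector: the `TokenTotals` struct, the `parseTokenTotals` and `labelValue` helpers, and the `input`/`output` label values are assumptions made for illustration; only the metric name and the `cache_read`/`cache_write` label values come from the commit message.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

const metricPrefix = "zai_proxy_tokens_total{"

// TokenTotals is a hypothetical stand-in for the token fields on MetricSnapshot.
type TokenTotals struct {
	Input, Output, CacheRead, CacheWrite float64
}

// parseTokenTotals scans Prometheus text exposition and sums
// zai_proxy_tokens_total samples by their direction label.
func parseTokenTotals(exposition string) (TokenTotals, error) {
	var t TokenTotals
	sc := bufio.NewScanner(strings.NewReader(exposition))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if !strings.HasPrefix(line, metricPrefix) {
			continue // comments, blank lines, and other metrics
		}
		end := strings.LastIndex(line, "}")
		if end < 0 {
			continue // malformed sample without a closing brace
		}
		labels := line[len(metricPrefix):end]
		fields := strings.Fields(line[end+1:]) // value, optional timestamp
		if len(fields) == 0 {
			continue
		}
		v, err := strconv.ParseFloat(fields[0], 64)
		if err != nil {
			return t, fmt.Errorf("bad sample value in %q: %w", line, err)
		}
		switch labelValue(labels, "direction") {
		case "input": // assumed label value
			t.Input += v
		case "output": // assumed label value
			t.Output += v
		case "cache_read":
			t.CacheRead += v
		case "cache_write":
			t.CacheWrite += v
		}
	}
	return t, sc.Err()
}

// labelValue returns one label's value from a comma-separated label set.
// It assumes label values contain no commas or escaped quotes.
func labelValue(labels, name string) string {
	for _, pair := range strings.Split(labels, ",") {
		k, v, ok := strings.Cut(strings.TrimSpace(pair), "=")
		if ok && k == name {
			return strings.Trim(v, `"`)
		}
	}
	return ""
}

func main() {
	sample := `# TYPE zai_proxy_tokens_total counter
zai_proxy_tokens_total{direction="input"} 1200
zai_proxy_tokens_total{direction="cache_read"} 89
`
	totals, err := parseTokenTotals(sample)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Printf("%+v\n", totals)
}
```

Summing across all other labels keeps the sketch independent of whatever extra labels the proxy attaches. A production collector would more likely use a Prometheus parsing library than hand-rolled string handling.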
zai-proxy
LLM reverse proxy and metrics dashboard for the Z.AI API.
Components
proxy/
OpenAI-compatible reverse proxy that fronts the Z.AI API. Features:
- Request/response body parsing and token counting (tiktoken + GLM tokenizers)
- Rate limiting with configurable burst and steady-state limits
- Prometheus metrics export
- Blue/green and canary deployment support
- Translation layer for provider-specific request/response formats
See proxy/README.md for setup and configuration.
dashboard/
Go backend + React frontend for visualizing proxy metrics, token usage, and request history.
- Collects metrics from the proxy's Prometheus endpoint
- Stores aggregated data in SQLite
- Serves a Tailwind/Vite frontend and pushes live updates to it over SSE
See dashboard docs for deployment and monitoring.
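To make the SSE live-update path concrete, the following sketch shows how a Go handler might stream snapshots to the browser. Everything named here is an assumption for illustration: the `Snapshot` struct, its JSON field names, the `snapshot` event name, and the `streamSnapshots` and `formatSSE` helpers are hypothetical and are not taken from the dashboard's code.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
)

// Snapshot is a hypothetical, trimmed-down metrics payload.
type Snapshot struct {
	TokensCacheRead  int64 `json:"tokens_cache_read"`
	TokensCacheWrite int64 `json:"tokens_cache_write"`
}

// formatSSE encodes one Server-Sent Events frame: an event name, a single
// JSON data line, and the blank line that terminates the frame.
func formatSSE(event string, payload any) (string, error) {
	data, err := json.Marshal(payload)
	if err != nil {
		return "", err
	}
	return fmt.Sprintf("event: %s\ndata: %s\n\n", event, data), nil
}

// streamSnapshots writes each snapshot received on updates to the client
// until the client disconnects or the channel is closed.
func streamSnapshots(updates <-chan Snapshot) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		flusher, ok := w.(http.Flusher)
		if !ok {
			http.Error(w, "streaming unsupported", http.StatusInternalServerError)
			return
		}
		w.Header().Set("Content-Type", "text/event-stream")
		w.Header().Set("Cache-Control", "no-cache")
		for {
			select {
			case <-r.Context().Done():
				return // client went away
			case s, open := <-updates:
				if !open {
					return
				}
				frame, err := formatSSE("snapshot", s)
				if err != nil {
					return
				}
				fmt.Fprint(w, frame)
				flusher.Flush() // push the frame immediately
			}
		}
	}
}

func main() {
	// Print one frame to show the wire format; a server would instead
	// register streamSnapshots(ch) on a route of its choosing.
	frame, err := formatSSE("snapshot", Snapshot{TokensCacheRead: 89, TokensCacheWrite: 12})
	if err != nil {
		fmt.Println("encode error:", err)
		return
	}
	fmt.Print(frame)
}
```

A single shared channel delivers each snapshot to only one connected client, so a dashboard with several viewers would need a hub that fans each snapshot out to a per-client channel.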
Docs
- docs/plan/: architecture decisions and roadmaps
- docs/notes/: deployment, operations, monitoring, canary procedures
- docs/research/: tokenizer research, metrics references
Git remotes
- Forgejo (primary): https://git.ardenone.com/jedarden/zai-proxy.git