Prometheus metrics
Scrape TokenID cost and token counters into your existing Grafana / Prometheus stack.
Endpoint
/metrics
Returns metrics in the Prometheus text exposition format, scoped to the authenticated organization. Authentication uses the same Bearer token as the rest of the API; pass it in your Prometheus scrape config.
What's exposed
| Metric | Type | Labels |
|---|---|---|
| llm_tokens_total | counter | provider, model, org_id |
| llm_cost_usd_total | counter | provider, model, org_id |
| llm_requests_total | counter | provider, model, org_id |
| llm_cost_usd_rolling_24h | gauge | org_id |
llm_tokens_total is the sum of input_tokens + output_tokens across all api_call events for the org. llm_cost_usd_rolling_24h is the cost summed over the trailing 24 hours, which makes it well suited as an alerting input.
Example scrape config
scrape_configs:
  - job_name: tokenid
    metrics_path: /metrics
    scheme: https
    static_configs:
      - targets: ['token-api.audit.id']
    authorization:
      type: Bearer
      credentials: 'eyJhbGciOi…' # your TokenID API token
    scrape_interval: 60s
Endpoint details
/metrics

| Property | Value |
|---|---|
| Auth | Authorization: Bearer td_live_xxxx — same token as the rest of the API |
| Scope | Per-organization — the response only contains metrics for the calling org |
| Content type | text/plain; version=0.0.4; charset=utf-8 (Prometheus text exposition) |
| Suggested scrape interval | 60s (computed on demand from event-log aggregates) |
Metric reference
| Metric | Type | Labels | Meaning |
|---|---|---|---|
| llm_requests_total | counter | provider, model, org_id | Cumulative number of api_call events |
| llm_tokens_total | counter | provider, model, org_id | Cumulative input_tokens + output_tokens |
| llm_cost_usd_total | counter | provider, model, org_id | Cumulative cost in USD |
| llm_cost_usd_rolling_24h | gauge | org_id | Sum of cost over the trailing 24 hours |
All label values are emitted in lowercase. Events with an unknown value (for example, an event without a model) are dropped from the per-(provider, model) counters, so they never appear as model="unknown" and inflate your cardinality.
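Because the three `_total` metrics are cumulative counters, they are usually queried through `rate()` or `increase()` rather than read directly. The recording rules below are a minimal sketch of that pattern, not an official TokenID rule set: the group name, the recorded series names, and the 1h window are illustrative assumptions, and only the source metric and label names come from the table above.

```yaml
groups:
  - name: tokenid-recording            # hypothetical group name
    interval: 60s                      # matches the suggested scrape interval
    rules:
      # Spend per (provider, model) over the last hour, derived from the counter.
      - record: provider_model:llm_cost_usd:increase1h    # hypothetical series name
        expr: sum by (provider, model) (increase(llm_cost_usd_total[1h]))

      # Average tokens per request over the last hour.
      - record: provider_model:llm_tokens_per_request:ratio_rate1h   # hypothetical series name
        expr: |
          sum by (provider, model) (rate(llm_tokens_total[1h]))
            /
          sum by (provider, model) (rate(llm_requests_total[1h]))
```

The second rule evaluates to NaN for any (provider, model) pair that had no requests in the window, so dashboards built on it may want to filter those series out.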
Sample prometheus.yml
scrape_configs:
  - job_name: tokenid
    metrics_path: /metrics
    scheme: https
    static_configs:
      - targets: ['token.audit.id']
    bearer_token: td_live_xxxx
    scrape_interval: 60s
For Kubernetes-hosted Prometheus, store the token in a Secret and reference it via bearer_token_file:
bearer_token_file: /etc/prometheus/secrets/tokenid/api-key
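The path above has the shape that Prometheus Operator uses when it mounts Secrets listed under `spec.secrets` (`/etc/prometheus/secrets/<secret-name>/<key>`). Assuming that setup, the Secret could be sketched as follows. The Secret name `tokenid` and key `api-key` are inferred from the path; the `monitoring` namespace and the Prometheus resource name `k8s` are hypothetical placeholders, and the `Prometheus` document is a fragment showing only the relevant field.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: tokenid                # becomes the directory name under /etc/prometheus/secrets/
  namespace: monitoring        # assumption: must be the namespace of the Prometheus resource
type: Opaque
stringData:
  api-key: td_live_xxxx        # becomes the file name; replace with your TokenID API token
---
# Fragment only: merge the `secrets` entry into your existing Prometheus resource.
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: k8s                    # hypothetical resource name
  namespace: monitoring
spec:
  secrets:
    - tokenid                  # mounted at /etc/prometheus/secrets/tokenid/
```

If Prometheus is deployed without the Operator, the same Secret can be mounted as an ordinary volume at that path instead.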
Alerting on burn rate
groups:
  - name: tokenid
    rules:
      - alert: TokenIDDailyBurnHigh
        expr: llm_cost_usd_rolling_24h > 500
        for: 5m
        annotations:
          summary: "TokenID 24h spend exceeded $500"
This complements the in-product Spend Alerts feature. Prometheus alerting is the right channel for ops teams that already run a Grafana / Alertmanager stack and want LLM cost alerts in the same place as their other alerts.
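The 24-hour gauge reacts slowly to a sudden spike, so some teams may also want a shorter-window rule built on the `llm_cost_usd_total` counter. The rule below is one possible sketch under stated assumptions: the alert name, the $50 threshold, the 1h window, and the `severity` label are illustrative and should be tuned to your own spend profile.

```yaml
groups:
  - name: tokenid-burn                  # hypothetical group name
    rules:
      - alert: TokenIDHourlyBurnHigh    # hypothetical alert name
        # Spend over the last hour, summed across providers and models per org.
        expr: sum by (org_id) (increase(llm_cost_usd_total[1h])) > 50
        for: 10m
        labels:
          severity: warning             # assumption: route as needed in Alertmanager
        annotations:
          summary: 'TokenID spend exceeded $50 in the last hour (org {{ $labels.org_id }})'
          description: 'Last-hour spend: {{ $value | printf "%.2f" }} USD'
```

Pairing a short-window rule like this with the 24-hour rule gives both fast detection of spikes and a check on sustained overspend.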