Per-Prompt Analytics

Latency, success rate, cost, and version comparison for every prompt.

PromptHelm records every gateway call as an execution log and rolls those logs into per-prompt analytics. The result is a single dashboard where you can spot latency regressions, runaway cost, and quality drift across versions — without exporting to a third-party warehouse.

Where to find it

Open any prompt and click the Analytics tab. The view is scoped to that single prompt; for organization-wide rollups, use the Cost and Logs sections of the dashboard.

Time windows

Name   | Type   | Description
24h    | preset | Granularity: 1 hour. Best for catching live regressions.
7d     | preset | Granularity: 1 hour or 1 day. The default tradeoff between resolution and noise.
30d    | preset | Granularity: 1 day. Trend view for cost and version adoption.
Custom | range  | Pick any start/end inside the 90-day retention window.

The backend caps queries at 90 days. Aggregations beyond that fall back to monthly billing summaries.
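
If you build a custom range picker against the API, the same cap applies. The sketch below clamps a window client-side before issuing a request; the helper name and retention constant are illustrative assumptions, not part of any PromptHelm SDK.

// Clamp a custom analytics window to the 90-day retention cap before querying.
// RETENTION_DAYS and clampRange are illustrative assumptions, not SDK exports.
const RETENTION_DAYS = 90;

function clampRange(start: Date, end: Date): { start: Date; end: Date } {
  const now = new Date();
  const earliest = new Date(now.getTime() - RETENTION_DAYS * 24 * 60 * 60 * 1000);
  const clampedEnd = end > now ? now : end;
  const clampedStart = start < earliest ? earliest : start;
  if (clampedStart >= clampedEnd) {
    throw new Error("start must precede end inside the retention window");
  }
  return { start: clampedStart, end: clampedEnd };
}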

Metrics

Every prompt surfaces six metric families:

  • Volume — total calls in the window plus a calls-per-day trend.
  • Latency — p50, p95, p99 server-observed latency.
  • Cost — total + average per call, with cached-token discount highlighted separately.
  • Success rate — successful calls divided by total calls.
  • By version — per-version usage breakdown, useful before and after a promotion.
  • Errors — error-code distribution (timeouts, validation, provider failures) with links into the raw logs.

How the numbers are computed

Analytics run as a single Mongo $facet aggregation against the executionlogs collection, indexed by (tenantId, promptId, createdAt). Percentiles use the $percentile t-digest accumulator. The pipeline is filtered server-side by tenantId, so cross-tenant leakage is structurally impossible.
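
As a rough illustration of that pipeline (field names such as latencyMs, costUsd, status, and promptVersion are assumptions for this sketch, not the actual schema), a $facet query with the Node.js MongoDB driver looks roughly like this:

import { MongoClient } from "mongodb";

// Illustrative per-prompt $facet aggregation. Field names are assumptions.
async function promptAnalytics(
  client: MongoClient,
  tenantId: string,
  promptId: string,
  start: Date,
  end: Date,
) {
  return client
    .db()
    .collection("executionlogs")
    .aggregate([
      // Tenant + prompt + window filter; served by the (tenantId, promptId, createdAt) index.
      { $match: { tenantId, promptId, createdAt: { $gte: start, $lt: end } } },
      {
        $facet: {
          volume: [{ $group: { _id: { $dateTrunc: { date: "$createdAt", unit: "day" } }, calls: { $sum: 1 } } }],
          latency: [
            {
              $group: {
                _id: null,
                // t-digest percentile accumulator (MongoDB 7.0+).
                percentiles: { $percentile: { input: "$latencyMs", p: [0.5, 0.95, 0.99], method: "approximate" } },
              },
            },
          ],
          cost: [{ $group: { _id: null, total: { $sum: "$costUsd" }, avgPerCall: { $avg: "$costUsd" } } }],
          successRate: [{ $group: { _id: "$status", calls: { $sum: 1 } } }],
          byVersion: [{ $group: { _id: "$promptVersion", calls: { $sum: 1 } } }],
          errors: [{ $match: { status: "error" } }, { $group: { _id: "$errorCode", calls: { $sum: 1 } } }],
        },
      },
    ])
    .toArray();
}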

REST API

The same data is available over REST for custom dashboards:

curl -G https://api.prompthelm.app/api/v1/execution-logs/aggregations/prompt/$PROMPT_ID \
  -H "Authorization: Bearer $PROMPTHELM_API_KEY" \
  --data-urlencode "start=2026-04-01T00:00:00Z" \
  --data-urlencode "end=2026-05-01T00:00:00Z" \
  --data-urlencode "granularity=day"

The response shape mirrors the dashboard tiles — see the API reference for the full schema.
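
A minimal TypeScript client for the endpoint might look like the sketch below; the AnalyticsSummary interface only names the tiles described above and is an assumption, not the full schema from the API reference.

// Illustrative client for the aggregation endpoint (Node 18+ global fetch).
// AnalyticsSummary is an assumed shape covering only the tiles above.
interface AnalyticsSummary {
  volume: { totalCalls: number };
  latency: { p50: number; p95: number; p99: number };
  cost: { total: number; averagePerCall: number };
  successRate: number;
  byVersion: Array<{ version: string; calls: number }>;
  errors: Array<{ code: string; calls: number }>;
}

async function fetchPromptAnalytics(promptId: string, start: string, end: string): Promise<AnalyticsSummary> {
  const params = new URLSearchParams({ start, end, granularity: "day" });
  const res = await fetch(
    `https://api.prompthelm.app/api/v1/execution-logs/aggregations/prompt/${promptId}?${params}`,
    { headers: { Authorization: `Bearer ${process.env.PROMPTHELM_API_KEY}` } },
  );
  if (!res.ok) throw new Error(`analytics request failed: ${res.status}`);
  return (await res.json()) as AnalyticsSummary;
}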

Use cases

  • Catch regressions across versions. Promote a new version, watch the latency tile diverge from the prior version's baseline, revert if needed.
  • Find cost outliers. Sort the by-version breakdown by average cost per call to surface a model swap that doubled spend.
  • Track quality over time. Pair the success-rate metric with your own application's downstream metrics (e.g. user thumbs-up rate) to detect quality drift across providers.

Logs vs analytics

Analytics give you the rolled-up trend; the Logs view gives you the raw per-call payloads (request, response, headers, latency) filtered by prompt, version, status, or correlation ID.
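
If you need those raw payloads programmatically rather than in the dashboard, a listing endpoint under /api/v1/execution-logs is the natural place to look; the path and query parameter names below are assumptions inferred from the aggregation URL above, so check the API reference before relying on them.

// Hypothetical sketch: fetch raw error logs for one prompt and correlation ID.
// The listing path and parameter names are assumptions, not confirmed API.
async function fetchExecutionLogs(promptId: string, correlationId: string) {
  const params = new URLSearchParams({ promptId, correlationId, status: "error" });
  const res = await fetch(`https://api.prompthelm.app/api/v1/execution-logs?${params}`, {
    headers: { Authorization: `Bearer ${process.env.PROMPTHELM_API_KEY}` },
  });
  if (!res.ok) throw new Error(`log query failed: ${res.status}`);
  return res.json();
}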
