Per-Prompt Analytics

Latency, success rate, cost, and version comparison for every prompt.

PromptHelm records every gateway call as an execution log and rolls those logs into per-prompt analytics. The result is a single dashboard where you can spot latency regressions, runaway cost, and quality drift across versions — without exporting to a third-party warehouse.

Where to find it

Open any prompt and click the Analytics tab. The view is scoped to that single prompt; for organization-wide rollups, use the Cost and Logs sections of the dashboard.

Time windows

Name   | Type   | Description
24h    | preset | Granularity: 1 hour. Best for catching live regressions.
7d     | preset | Granularity: 1 hour or 1 day. The default tradeoff between resolution and noise.
30d    | preset | Granularity: 1 day. Trend view for cost and version adoption.
Custom | range  | Pick any start/end inside the 90-day retention window.

The backend caps queries at 90 days. Aggregations beyond that fall back to monthly billing summaries.
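
If you build a custom range picker against the API, the same cap applies. The sketch below clamps a window client-side before issuing a request; the helper name and retention constant are illustrative assumptions, not part of any PromptHelm SDK.

// Clamp a custom analytics window to the 90-day retention cap before querying.
// RETENTION_DAYS and clampRange are illustrative assumptions, not SDK exports.
const RETENTION_DAYS = 90;

function clampRange(start: Date, end: Date): { start: Date; end: Date } {
  const now = new Date();
  const earliest = new Date(now.getTime() - RETENTION_DAYS * 24 * 60 * 60 * 1000);
  const clampedEnd = end > now ? now : end;
  const clampedStart = start < earliest ? earliest : start;
  if (clampedStart >= clampedEnd) {
    throw new Error("start must precede end inside the retention window");
  }
  return { start: clampedStart, end: clampedEnd };
}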

Metrics

Every prompt surfaces six metric families:

  • Volume — total calls in the window plus a calls-per-day trend.
  • Latency — p50, p95, p99 server-observed latency.
  • Cost — total + average per call, with cached-token discount highlighted separately.
  • Success rate — successful calls divided by total calls.
  • By version — per-version usage breakdown, useful before and after a promotion.
  • Errors — error-code distribution (timeouts, validation, provider failures) with links into the raw logs.

How the numbers are computed

Analytics run as a single Mongo $facet aggregation against the executionlogs collection, indexed by (tenantId, promptId, createdAt). Percentiles use the $percentile t-digest accumulator. The pipeline is filtered server-side by tenantId, so cross-tenant leakage is structurally impossible.
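
As a rough illustration of that pipeline (field names such as latencyMs, costUsd, status, and promptVersion are assumptions for this sketch, not the actual schema), a $facet query with the Node.js MongoDB driver looks roughly like this:

import { MongoClient } from "mongodb";

// Illustrative per-prompt $facet aggregation. Field names are assumptions.
async function promptAnalytics(
  client: MongoClient,
  tenantId: string,
  promptId: string,
  start: Date,
  end: Date,
) {
  return client
    .db()
    .collection("executionlogs")
    .aggregate([
      // Tenant + prompt + window filter; served by the (tenantId, promptId, createdAt) index.
      { $match: { tenantId, promptId, createdAt: { $gte: start, $lt: end } } },
      {
        $facet: {
          volume: [{ $group: { _id: { $dateTrunc: { date: "$createdAt", unit: "day" } }, calls: { $sum: 1 } } }],
          latency: [
            {
              $group: {
                _id: null,
                // t-digest percentile accumulator (MongoDB 7.0+).
                percentiles: { $percentile: { input: "$latencyMs", p: [0.5, 0.95, 0.99], method: "approximate" } },
              },
            },
          ],
          cost: [{ $group: { _id: null, total: { $sum: "$costUsd" }, avgPerCall: { $avg: "$costUsd" } } }],
          successRate: [{ $group: { _id: "$status", calls: { $sum: 1 } } }],
          byVersion: [{ $group: { _id: "$promptVersion", calls: { $sum: 1 } } }],
          errors: [{ $match: { status: "error" } }, { $group: { _id: "$errorCode", calls: { $sum: 1 } } }],
        },
      },
    ])
    .toArray();
}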

REST API

The same data is available over REST for custom dashboards:

curl -G https://api.prompthelm.app/api/v1/execution-logs/aggregations/prompt/$PROMPT_ID \
  -H "Authorization: Bearer $PROMPTHELM_API_KEY" \
  --data-urlencode "start=2026-04-01T00:00:00Z" \
  --data-urlencode "end=2026-05-01T00:00:00Z" \
  --data-urlencode "granularity=day"

The response shape mirrors the dashboard tiles — see the API reference for the full schema.
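
A minimal TypeScript client for the endpoint might look like the sketch below; the AnalyticsSummary interface only names the tiles described above and is an assumption, not the full schema from the API reference.

// Illustrative client for the aggregation endpoint (Node 18+ global fetch).
// AnalyticsSummary is an assumed shape covering only the tiles above.
interface AnalyticsSummary {
  volume: { totalCalls: number };
  latency: { p50: number; p95: number; p99: number };
  cost: { total: number; averagePerCall: number };
  successRate: number;
  byVersion: Array<{ version: string; calls: number }>;
  errors: Array<{ code: string; calls: number }>;
}

async function fetchPromptAnalytics(promptId: string, start: string, end: string): Promise<AnalyticsSummary> {
  const params = new URLSearchParams({ start, end, granularity: "day" });
  const res = await fetch(
    `https://api.prompthelm.app/api/v1/execution-logs/aggregations/prompt/${promptId}?${params}`,
    { headers: { Authorization: `Bearer ${process.env.PROMPTHELM_API_KEY}` } },
  );
  if (!res.ok) throw new Error(`analytics request failed: ${res.status}`);
  return (await res.json()) as AnalyticsSummary;
}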

Use cases

  • Catch regressions across versions. Promote a new version, watch the latency tile diverge from the prior version's baseline, revert if needed.
  • Find cost outliers. Sort the by-version breakdown by average cost per call to surface a model swap that doubled spend.
  • Track quality over time. Pair the success-rate metric with your own application's downstream metrics (e.g. user thumbs-up rate) to detect quality drift across providers.

Logs vs analytics

Analytics give you the rolled-up trend; the Logs view gives you the raw per-call payloads (request, response, headers, latency) filtered by prompt, version, status, or correlation ID.
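
If you need those raw payloads programmatically rather than in the dashboard, a listing endpoint under /api/v1/execution-logs is the natural place to look; the path and query parameter names below are assumptions inferred from the aggregation URL above, so check the API reference before relying on them.

// Hypothetical sketch: fetch raw error logs for one prompt and correlation ID.
// The listing path and parameter names are assumptions, not confirmed API.
async function fetchExecutionLogs(promptId: string, correlationId: string) {
  const params = new URLSearchParams({ promptId, correlationId, status: "error" });
  const res = await fetch(`https://api.prompthelm.app/api/v1/execution-logs?${params}`, {
    headers: { Authorization: `Bearer ${process.env.PROMPTHELM_API_KEY}` },
  });
  if (!res.ok) throw new Error(`log query failed: ${res.status}`);
  return res.json();
}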
