PromptHelm Docs
Concepts

The Multi-Provider Gateway

How PromptHelm routes calls to OpenAI, Anthropic, Google, DeepSeek and beyond.

The PromptHelm gateway is a single streaming endpoint that fronts every LLM provider you use. One bearer token, one URL, one wire format — PromptHelm handles auth, retries, cost accounting, and provider quirks on your behalf.

One endpoint, every provider

Provider    Models
OpenAI      GPT-5, GPT-4.1, GPT-4o, o4-mini, embeddings
Anthropic   Claude Sonnet 4.5, Opus 4.7, Haiku 4.5
Google      Gemini 2.5 Pro, Flash, Flash-Lite
DeepSeek    DeepSeek V3, R1

More providers are rolling out monthly. Adding one never changes your client code — only the model field on the prompt.

Authentication

Every gateway request carries a bearer API token (see API tokens):

curl -X POST https://api.prompthelm.app/api/v1/gateway/execute \
  -H "Authorization: Bearer $PROMPTHELM_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"promptSlug":"support-triage","variables":{"ticket":"..."}}'

Two request shapes

The gateway accepts either a saved-prompt call or a free-form call at the same endpoint. A saved-prompt call references a stored prompt by slug and supplies its variables:

{
  "promptSlug": "support-triage",
  "environment": "production",
  "variables": { "ticket": "Password reset email never arrived." },
  "model": "gpt-5",
  "temperature": 0.2
}
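A free-form call skips the prompt library and supplies the model and input inline. The shape below is illustrative; field names other than model and temperature are assumptions, not confirmed by this page:

```json
{
  "model": "claude-sonnet-4-5",
  "messages": [
    { "role": "user", "content": "Summarize this ticket: ..." }
  ],
  "temperature": 0.2
}
```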

Override fields (model, temperature, maxTokens, topP, stopSequences) take precedence over the values stored on the saved prompt — useful for A/B tests without forking a new version.
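The precedence rule amounts to a simple merge in which request-level fields win. A minimal sketch (illustrative only, not the gateway's actual implementation):

```python
# Request-level overrides take precedence over values stored on the saved prompt.
OVERRIDABLE = ("model", "temperature", "maxTokens", "topP", "stopSequences")

def resolve_params(saved_prompt: dict, request: dict) -> dict:
    # Start from the saved prompt's stored values...
    params = {k: saved_prompt[k] for k in OVERRIDABLE if k in saved_prompt}
    # ...then let any explicit request override replace them.
    for key in OVERRIDABLE:
        if key in request:
            params[key] = request[key]
    return params

saved = {"model": "gpt-4o", "temperature": 0.7}
request = {"model": "gpt-5", "temperature": 0.2}
print(resolve_params(saved, request))  # {'model': 'gpt-5', 'temperature': 0.2}
```

Because the merge happens per request, an A/B test is just two requests with different override fields against the same saved prompt version.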

Streaming with Server-Sent Events

For interactive UIs, hit the streaming variant. Responses arrive as text/event-stream:

curl -N https://api.prompthelm.app/api/v1/gateway/stream \
  -H "Authorization: Bearer $PROMPTHELM_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"promptSlug":"support-triage","variables":{"ticket":"..."}}'

Three event types arrive in order:

  • delta — incremental text chunks
  • done — final usage + cost summary
  • error — terminal failure with an error code
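A client consumes the stream by splitting on blank lines and dispatching on the event name. The sketch below parses a text/event-stream body; the event names match the list above, but the data payload shapes are assumptions:

```python
import json

def parse_sse(stream_text: str):
    """Yield (event, data) pairs from a text/event-stream body.
    Payload fields here are illustrative, not the gateway's documented schema."""
    event, data_lines = None, []
    for line in stream_text.splitlines():
        if line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data_lines.append(line[len("data:"):].strip())
        elif line == "" and event:
            # A blank line terminates one event.
            yield event, json.loads("\n".join(data_lines))
            event, data_lines = None, []

raw = (
    "event: delta\ndata: {\"text\": \"Hel\"}\n\n"
    "event: delta\ndata: {\"text\": \"lo\"}\n\n"
    "event: done\ndata: {\"costUsd\": 0.0004}\n\n"
)
chunks = [d["text"] for e, d in parse_sse(raw) if e == "delta"]
print("".join(chunks))  # Hello
```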

Cost and retry policy

Every response carries usage.costUsd calculated from input, output, and cached token rates per model. SDK clients retry 5xx and 429 with exponential backoff; 4xx failures (validation, auth, missing prompt) are surfaced immediately.
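Client-side, the documented retry policy can be sketched like this (a minimal illustration, not the SDK's actual code):

```python
import time

def with_retries(call, max_attempts=4, base_delay=0.5):
    """Retry 5xx and 429 with exponential backoff; fail fast on other 4xx."""
    for attempt in range(max_attempts):
        status, body = call()
        if status < 400:
            return body
        if status == 429 or status >= 500:
            # Transient failure: back off exponentially, then retry.
            if attempt < max_attempts - 1:
                time.sleep(base_delay * (2 ** attempt))
                continue
        # 4xx (validation, auth, missing prompt) or retries exhausted.
        raise RuntimeError(f"gateway error {status}")

# Fake transport for illustration: two 503s, then success.
responses = iter([(503, None), (503, None), (200, {"output": "ok"})])
print(with_retries(lambda: next(responses), base_delay=0)["output"])  # ok
```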

Provider keys never leave the server

Provider keys are stored AES-256-GCM encrypted with HKDF-derived subkeys. They are decrypted in-memory for a single request — never sent to your client, never logged, never written to disk in plaintext.
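HKDF-derived subkeys mean each purpose gets its own key expanded from one master secret, so a leak of one subkey never exposes the master. The sketch below implements RFC 5869 HKDF with the standard library to show the derivation step only; it is not PromptHelm's code, and the AES-256-GCM encryption step (which needs a crypto library) is omitted:

```python
import hashlib, hmac

def hkdf_sha256(master_key: bytes, salt: bytes, info: bytes, length: int = 32) -> bytes:
    """RFC 5869 HKDF (SHA-256): extract a PRK, then expand into a subkey.
    Distinct `info` labels yield independent subkeys from one master key."""
    prk = hmac.new(salt, master_key, hashlib.sha256).digest()  # extract
    okm, block, counter = b"", b"", 1
    while len(okm) < length:                                   # expand
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]

# Hypothetical labels for illustration.
subkey = hkdf_sha256(b"master-secret", salt=b"per-install-salt", info=b"provider-keys/openai")
print(len(subkey))  # 32
```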
