PromptHelm Docs
Concepts

The Multi-Provider Gateway

How PromptHelm routes calls to OpenAI, Anthropic, Google, DeepSeek and beyond.

The PromptHelm gateway is a single streaming endpoint that fronts every LLM provider you use. One bearer token, one URL, one wire format — PromptHelm handles auth, retries, cost accounting, and provider quirks on your behalf.

One endpoint, every provider

Provider    Models
OpenAI      GPT-5, GPT-4.1, GPT-4o, o4-mini, embeddings
Anthropic   Claude Sonnet 4.5, Opus 4.7, Haiku 4.5
Google      Gemini 2.5 Pro, Flash, Flash-Lite
DeepSeek    DeepSeek V3, R1

More providers are rolling out monthly. Adding one never changes your client code — only the model field on the prompt.

Authentication

Every gateway request carries a bearer API token (see API tokens):

curl -X POST https://api.prompthelm.app/api/v1/gateway/execute \
  -H "Authorization: Bearer $PROMPTHELM_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"promptSlug":"support-triage","variables":{"ticket":"..."}}'

Two request shapes

The gateway accepts either a saved-prompt call or a free-form call at the same endpoint. A saved-prompt call references a stored prompt by slug and supplies its variables:

{
  "promptSlug": "support-triage",
  "environment": "production",
  "variables": { "ticket": "Password reset email never arrived." },
  "model": "gpt-5",
  "temperature": 0.2
}
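A free-form call skips the prompt library and supplies the model and input inline. The shape below is illustrative; field names other than model and temperature are assumptions, not confirmed by this page:

```json
{
  "model": "claude-sonnet-4-5",
  "messages": [
    { "role": "user", "content": "Summarize this ticket: ..." }
  ],
  "temperature": 0.2
}
```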

Override fields (model, temperature, maxTokens, topP, stopSequences) take precedence over the values stored on the saved prompt — useful for A/B tests without forking a new version.
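The precedence rule amounts to a simple merge in which request-level fields win. A minimal sketch (illustrative only, not the gateway's actual implementation):

```python
# Request-level overrides take precedence over values stored on the saved prompt.
OVERRIDABLE = ("model", "temperature", "maxTokens", "topP", "stopSequences")

def resolve_params(saved_prompt: dict, request: dict) -> dict:
    # Start from the saved prompt's stored values...
    params = {k: saved_prompt[k] for k in OVERRIDABLE if k in saved_prompt}
    # ...then let any explicit request override replace them.
    for key in OVERRIDABLE:
        if key in request:
            params[key] = request[key]
    return params

saved = {"model": "gpt-4o", "temperature": 0.7}
request = {"model": "gpt-5", "temperature": 0.2}
print(resolve_params(saved, request))  # {'model': 'gpt-5', 'temperature': 0.2}
```

Because the merge happens per request, an A/B test is just two requests with different override fields against the same saved prompt version.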

Streaming with Server-Sent Events

For interactive UIs, hit the streaming variant. Responses arrive as text/event-stream:

curl -N https://api.prompthelm.app/api/v1/gateway/stream \
  -H "Authorization: Bearer $PROMPTHELM_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"promptSlug":"support-triage","variables":{"ticket":"..."}}'

Three event types arrive in order:

  • delta — incremental text chunks
  • done — final usage + cost summary
  • error — terminal failure with an error code
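A client consumes the stream by splitting on blank lines and dispatching on the event name. The sketch below parses a text/event-stream body; the event names match the list above, but the data payload shapes are assumptions:

```python
import json

def parse_sse(stream_text: str):
    """Yield (event, data) pairs from a text/event-stream body.
    Payload fields here are illustrative, not the gateway's documented schema."""
    event, data_lines = None, []
    for line in stream_text.splitlines():
        if line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data_lines.append(line[len("data:"):].strip())
        elif line == "" and event:
            # A blank line terminates one event.
            yield event, json.loads("\n".join(data_lines))
            event, data_lines = None, []

raw = (
    "event: delta\ndata: {\"text\": \"Hel\"}\n\n"
    "event: delta\ndata: {\"text\": \"lo\"}\n\n"
    "event: done\ndata: {\"costUsd\": 0.0004}\n\n"
)
chunks = [d["text"] for e, d in parse_sse(raw) if e == "delta"]
print("".join(chunks))  # Hello
```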

Cost and retry policy

Every response carries usage.costUsd calculated from input, output, and cached token rates per model. SDK clients retry 5xx and 429 with exponential backoff; 4xx failures (validation, auth, missing prompt) are surfaced immediately.
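Client-side, the documented retry policy can be sketched like this (a minimal illustration, not the SDK's actual code):

```python
import time

def with_retries(call, max_attempts=4, base_delay=0.5):
    """Retry 5xx and 429 with exponential backoff; fail fast on other 4xx."""
    for attempt in range(max_attempts):
        status, body = call()
        if status < 400:
            return body
        if status == 429 or status >= 500:
            # Transient failure: back off exponentially, then retry.
            if attempt < max_attempts - 1:
                time.sleep(base_delay * (2 ** attempt))
                continue
        # 4xx (validation, auth, missing prompt) or retries exhausted.
        raise RuntimeError(f"gateway error {status}")

# Fake transport for illustration: two 503s, then success.
responses = iter([(503, None), (503, None), (200, {"output": "ok"})])
print(with_retries(lambda: next(responses), base_delay=0)["output"])  # ok
```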

Provider keys never leave the server

Provider keys are stored AES-256-GCM encrypted with HKDF-derived subkeys. They are decrypted in-memory for a single request — never sent to your client, never logged, never written to disk in plaintext.
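HKDF-derived subkeys mean each purpose gets its own key expanded from one master secret, so a leak of one subkey never exposes the master. The sketch below implements RFC 5869 HKDF with the standard library to show the derivation step only; it is not PromptHelm's code, and the AES-256-GCM encryption step (which needs a crypto library) is omitted:

```python
import hashlib, hmac

def hkdf_sha256(master_key: bytes, salt: bytes, info: bytes, length: int = 32) -> bytes:
    """RFC 5869 HKDF (SHA-256): extract a PRK, then expand into a subkey.
    Distinct `info` labels yield independent subkeys from one master key."""
    prk = hmac.new(salt, master_key, hashlib.sha256).digest()  # extract
    okm, block, counter = b"", b"", 1
    while len(okm) < length:                                   # expand
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]

# Hypothetical labels for illustration.
subkey = hkdf_sha256(b"master-secret", salt=b"per-install-salt", info=b"provider-keys/openai")
print(len(subkey))  # 32
```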
