The Confidential Proxy is a local termination proxy bundled with @premai/api-sdk. It exposes OpenAI and Anthropic compatible HTTP routes on your machine and handles all end-to-end encryption transparently. Point any OpenAI or Anthropic client at it by changing a single base URL — no SDK changes, and it works from any language (Python, Go, Java, …).
Already on the TypeScript SDK? You don’t need the proxy — the SDK encrypts in-process. The proxy is for everything else: other languages, existing OpenAI/Anthropic codebases, and tools that only speak HTTP.

How it works

The proxy runs on your machine and performs the same client-side encryption the SDK does. Your plaintext is encrypted before it leaves the proxy, so the Prem API Gateway only ever sees ciphertext and decryption happens inside the enclave’s Trusted Execution Environment. For the full cryptographic design — XWing key exchange, the two-server model, and the threat model — see Encryption.

Running the server

Run the proxy directly with bunx or npx (no install required), or install it globally:
# Run without installing (bun or npm)
bunx -p @premai/api-sdk confidential-proxy
npx -p @premai/api-sdk confidential-proxy

# Or install globally, then run (ensure your global bin dir is on your PATH)
npm i -g @premai/api-sdk   # or: bun i -g @premai/api-sdk
confidential-proxy
By default the server listens on http://127.0.0.1:8000.
Set PROXY_URL and ENCLAVE_URL to the values for your environment. Get the latest from dashboard.prem.io/endpoints.json.

Configuration

The proxy is configured through environment variables or CLI flags (flags take precedence).
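
The precedence rule can be pictured with a short sketch. This is not the proxy's actual option parser: `resolve_option` is a hypothetical helper, and only the variable names (`HOST`, `PORT`) and their defaults are taken from this page.

```python
import os
from typing import Mapping, Optional


def resolve_option(
    flag_value: Optional[str],
    env_name: str,
    default: Optional[str] = None,
    env: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Return the CLI flag if given, else the environment variable, else the default."""
    if flag_value is not None:
        return flag_value
    if env_name in env:
        return env[env_name]
    return default


# Example: PORT=9000 is set in the environment, but --port 8080 was passed.
example_env = {"PORT": "9000"}
port = resolve_option("8080", "PORT", default="8000", env=example_env)
host = resolve_option(None, "HOST", default="127.0.0.1", env=example_env)
print(port, host)  # prints "8080 127.0.0.1"
```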

Environment variables

| Variable | Required | Default | Description |
| --- | --- | --- | --- |
| ENCLAVE_URL | Yes |  | Enclave endpoint that decrypts and runs inference |
| PROXY_URL | Yes |  | Prem API Gateway endpoint that routes encrypted payloads |
| CLIENT_KEK | Yes |  | Your Key Encryption Key, which wraps DEKs (32 bytes, base64) |
| JSON_BODY_LIMIT | No | 32mb | Max request body size |
| HOST | No | 127.0.0.1 | Interface to bind |
| PORT | No | 8000 | Port to listen on |
| CONFIDENTIAL_PROXY_LOG_LEVEL | No | info | error, warn, info, http, verbose, debug, or silly |

There is no API-key environment variable. Each calling client sends its own Prem API key on every request — Authorization: Bearer <key> for OpenAI routes, x-api-key: <key> for Anthropic routes. The proxy caches a client in memory per API key. CLIENT_KEK is a separate, server-side secret used only to wrap encryption keys.
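
As an illustration of the "32 bytes, base64" shape that CLIENT_KEK is described as having, the sketch below generates a random value of that shape and checks whether a given string matches it. It assumes the standard base64 alphabet (not the URL-safe variant), which this page does not specify, and `generate_kek` / `is_valid_kek` are hypothetical helpers, not part of the SDK. If your KEK is issued to you rather than self-generated, only the validation half applies.

```python
import base64
import binascii
import secrets

KEK_BYTES = 32  # from the table: "32 bytes, base64"


def generate_kek() -> str:
    """Return 32 random bytes encoded as standard base64 (assumed alphabet)."""
    return base64.b64encode(secrets.token_bytes(KEK_BYTES)).decode("ascii")


def is_valid_kek(value: str) -> bool:
    """Check that value decodes as standard base64 to exactly 32 bytes."""
    try:
        raw = base64.b64decode(value, validate=True)
    except (binascii.Error, ValueError):
        return False
    return len(raw) == KEK_BYTES
```

A value produced this way could then be exported as CLIENT_KEK or passed with --kek before starting the proxy.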

CLI options

All commands accept the same server options:
# Bind host / port
confidential-proxy --host 127.0.0.1 --port 8000

# Override backend endpoints
confidential-proxy --proxy-url https://gateway.prem.io --enclave-url https://conf-engine.prem.io

# Pass the client KEK inline
confidential-proxy --kek your-kek

# Raise the JSON body size limit
confidential-proxy --json-body-limit 64mb

Compatibility modes

Choose which API surface to expose with --compat:

| Mode | Routes | Description |
| --- | --- | --- |
| openai | /v1/* | OpenAI-compatible API only |
| anthropic | /v1/* | Anthropic-compatible Messages API only |
| both | /openai/v1/* and /anthropic/v1/* | Both APIs side-by-side under separate prefixes |

# OpenAI only (default surface)
confidential-proxy --compat openai

# Anthropic only
confidential-proxy --compat anthropic

# Both, with custom prefixes
confidential-proxy --compat both --openai-prefix /openai --anthropic-prefix /anthropic
In both mode the two APIs are served under separate prefixes to avoid route conflicts. The Anthropic surface translates incoming Anthropic Messages requests into the internal OpenAI-compatible enclave pipeline, then pipes the response back as Anthropic SSE events.
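
The route layout in the table can be summarized as a small lookup. The sketch below is only a reading aid: `route_url` is a hypothetical helper, and the prefix defaults are assumed to be the `/openai` and `/anthropic` values shown on this page.

```python
def route_url(
    api: str,
    path: str,
    compat: str = "openai",
    host: str = "127.0.0.1",
    port: int = 8000,
    openai_prefix: str = "/openai",
    anthropic_prefix: str = "/anthropic",
) -> str:
    """Return the full proxy URL for an API path such as "/v1/messages"."""
    if api not in ("openai", "anthropic"):
        raise ValueError(f"unknown api: {api!r}")
    if compat not in ("openai", "anthropic", "both"):
        raise ValueError(f"unknown compat mode: {compat!r}")
    if compat == "both":
        # Both surfaces are served, each under its own prefix.
        prefix = openai_prefix if api == "openai" else anthropic_prefix
        return f"http://{host}:{port}{prefix.rstrip('/')}{path}"
    if compat != api:
        raise ValueError(f"{api} routes are not served with --compat {compat}")
    # Single-surface modes serve the API directly under /v1/*.
    return f"http://{host}:{port}{path}"


print(route_url("openai", "/v1/chat/completions"))
print(route_url("anthropic", "/v1/messages", compat="both"))
```

With the defaults, the two calls print `http://127.0.0.1:8000/v1/chat/completions` and `http://127.0.0.1:8000/anthropic/v1/messages`.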

Connecting a client

OpenAI

Point any OpenAI-compatible client at the proxy’s /v1 base URL and send your Prem API key as a bearer token:
curl http://127.0.0.1:8000/v1/chat/completions \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-v4-pro",
    "messages": [{"role": "user", "content": "Hello!"}],
    "stream": false
  }'
Or use the OpenAI SDK in Node.js:
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.PREM_API_KEY!,
  baseURL: "http://127.0.0.1:8000/v1",
});

const stream = await client.chat.completions.create({
  model: "deepseek-v4-pro",
  messages: [{ role: "user", content: "Count to 10" }],
  stream: true,
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content || "");
}
Any other language works the same way — for example, Python:
from openai import OpenAI

client = OpenAI(
    api_key="your-api-key",
    base_url="http://127.0.0.1:8000/v1",
)

response = client.chat.completions.create(
    model="deepseek-v4-pro",
    messages=[{"role": "user", "content": "Hello, privately."}],
)

print(response.choices[0].message.content)
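
For environments without the OpenAI SDK, the same request can be made with the Python standard library alone. This is a minimal sketch that mirrors the curl call above; `build_chat_request` and `chat` are hypothetical helpers, and the response is assumed to follow the usual OpenAI chat completion shape (`choices[0].message.content`).

```python
import json
import urllib.request

PROXY_BASE_URL = "http://127.0.0.1:8000/v1"  # default host and port from this page


def build_chat_request(
    api_key: str, prompt: str, model: str = "deepseek-v4-pro"
) -> urllib.request.Request:
    """Build a non-streaming chat completion request for the local proxy."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }
    return urllib.request.Request(
        f"{PROXY_BASE_URL}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # Prem API key as bearer token
            "Content-Type": "application/json",
        },
        method="POST",
    )


def chat(api_key: str, prompt: str) -> str:
    """Send the request to a running proxy and return the first choice's text."""
    request = build_chat_request(api_key, prompt)
    with urllib.request.urlopen(request, timeout=60) as response:
        payload = json.load(response)
    return payload["choices"][0]["message"]["content"]
```

Calling `chat("your-api-key", "Hello, privately.")` requires the proxy to be running; `build_chat_request` alone can be inspected offline.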

Anthropic

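When running with --compat anthropic (or both), the proxy exposes an Anthropic-compatible Messages API. Authenticate with x-api-key and send the anthropic-version header. The examples here assume --compat anthropic, where the route is /v1/messages; in both mode the same route sits under the Anthropic prefix, for example /anthropic/v1/messages: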
curl http://127.0.0.1:8000/v1/messages \
  -H "x-api-key: your-api-key" \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-v4-pro",
    "max_tokens": 1024,
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
Add "stream": true for incremental responses:
curl -N http://127.0.0.1:8000/v1/messages \
  -H "x-api-key: your-api-key" \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-v4-pro",
    "max_tokens": 1024,
    "messages": [{"role": "user", "content": "Count to 10"}],
    "stream": true
  }'
The Anthropic surface supports system prompts, tool use, image inputs, stop sequences, temperature, and top_p. Streaming responses follow the Anthropic SSE format (message_start, content_block_start, content_block_delta, content_block_stop, message_delta, message_stop).
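
To show how a client might consume that event sequence, here is a minimal SSE parsing sketch that collects the streamed text. It assumes the payload shapes of Anthropic's public streaming format (text arriving in `content_block_delta` events as `delta.type == "text_delta"` with a `delta.text` field); this page names the events but does not show their payloads, so treat those field names as an assumption. `iter_sse_events` and `collect_text` are hypothetical helpers.

```python
import json
from typing import Iterable, Iterator, List, Tuple


def iter_sse_events(lines: Iterable[str]) -> Iterator[Tuple[str, dict]]:
    """Yield (event_name, data) pairs from SSE lines; a blank line ends an event."""
    event_name = ""
    data_lines: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("event:"):
            event_name = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data_lines.append(line[len("data:"):].strip())
        elif line == "" and data_lines:
            yield event_name, json.loads("\n".join(data_lines))
            event_name, data_lines = "", []
    if data_lines:  # stream ended without a trailing blank line
        yield event_name, json.loads("\n".join(data_lines))


def collect_text(lines: Iterable[str]) -> str:
    """Concatenate text deltas until message_stop."""
    parts: List[str] = []
    for name, data in iter_sse_events(lines):
        delta = data.get("delta", {})
        if name == "content_block_delta" and delta.get("type") == "text_delta":
            parts.append(delta["text"])
        elif name == "message_stop":
            break
    return "".join(parts)


# Hand-written sample in the assumed shape, not captured proxy output.
SAMPLE = [
    "event: content_block_delta",
    'data: {"type": "content_block_delta", "index": 0, "delta": {"type": "text_delta", "text": "Hel"}}',
    "",
    "event: content_block_delta",
    'data: {"type": "content_block_delta", "index": 0, "delta": {"type": "text_delta", "text": "lo!"}}',
    "",
    "event: message_stop",
    'data: {"type": "message_stop"}',
    "",
]
print(collect_text(SAMPLE))  # prints "Hello!"
```

In practice the `lines` argument would be the decoded lines of the streaming HTTP response from /v1/messages.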

Running as a daemon

Beyond the default foreground mode, the CLI can manage the proxy as a background daemon.

| Command | Description |
| --- | --- |
| confidential-proxy | Run in the foreground, attached to the terminal |
| confidential-proxy start | Start the server as a background daemon |
| confidential-proxy stop | Gracefully stop the running daemon |
| confidential-proxy status | Check whether the daemon is running and reachable |

1. start: Checks for an existing PID file (refusing to start a duplicate), spawns itself as a child process with logs directed to the configured log file, writes a PID file, and polls the HTTP endpoint until the server is reachable. It then exits, leaving the daemon running.
2. stop: Sends SIGTERM and waits up to 5 seconds for graceful shutdown. If the process is still alive, it escalates to SIGKILL and cleans up the PID file.
3. status: Checks both process liveness and HTTP reachability.

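
The stop behavior described above (SIGTERM, a bounded wait, then SIGKILL) follows a common POSIX pattern, sketched below. This is not the CLI's implementation: `pid_alive` and `stop_process` are hypothetical helpers, reading and removing the PID file is left out, and the code only applies to POSIX systems.

```python
import os
import signal
import time


def pid_alive(pid: int) -> bool:
    """Signal 0 checks that a process exists without delivering a signal."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def stop_process(pid: int, grace_seconds: float = 5.0, poll_interval: float = 0.1) -> str:
    """Send SIGTERM, wait up to grace_seconds, then escalate to SIGKILL."""
    if not pid_alive(pid):
        return "not-running"
    try:
        os.kill(pid, signal.SIGTERM)
    except ProcessLookupError:
        return "terminated"  # exited between the check and the signal
    deadline = time.monotonic() + grace_seconds
    while time.monotonic() < deadline:
        if not pid_alive(pid):
            return "terminated"
        time.sleep(poll_interval)
    try:
        os.kill(pid, signal.SIGKILL)  # grace period expired
    except ProcessLookupError:
        return "terminated"
    return "killed"
```

The return value distinguishes a clean exit ("terminated") from a forced one ("killed"), which a wrapper could log before deleting the PID file.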
Daemon-specific options (for start / stop / status):

| Option | Default | Description |
| --- | --- | --- |
| --pid-file | <data-dir>/proxy.pid | Custom PID file path |
| --log-file | stdout/stderr | File to write daemon logs (with start) |
| --log-level | info | Log verbosity (error through silly) |
| --shutdown-timeout | 30000 | Max ms to wait for in-flight requests during graceful shutdown |

# Start in the background, then confirm it's up
confidential-proxy start --compat openai
confidential-proxy status

# Stop it when you're done
confidential-proxy stop

Next steps

Chat completions: The chat API in detail, with streaming and vision payloads.

Encryption: How key exchange and end-to-end encryption work.

The same proxy powers confidential-claude, a convenience integration shipped in the SDK that launches Claude Code wired to the encrypted gateway. All traffic runs through this proxy.