No encryption expertise required. PCCI is designed so that the encryption layer is completely invisible in your application code. If you’ve built anything with the OpenAI API, you already know how to use PCCI. The SDK handles all cryptography automatically — you write normal API calls and get normal responses.

Two Ways to Integrate

Option 1: TypeScript SDK (Node.js)

Install the SDK and use it like any OpenAI client:
```typescript
import { createRvencClient } from "@premai/pcci-sdk-ts";

const client = await createRvencClient({
  apiKey: process.env.PCCI_API_KEY,
  clientKEK: process.env.PCCI_KEK, // Your master key — you generate it, we never see it
});

// This looks exactly like an OpenAI call — because it is
const chat = await client.chat.completions.create({
  model: "your-model",
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "Summarize this quarterly report." },
  ],
  stream: true,
});

for await (const chunk of chat) {
  process.stdout.write(chunk.choices[0]?.delta?.content || "");
}
```
Behind the scenes, the SDK encrypts your messages before sending, performs a secure key exchange with the enclave, and decrypts each streaming chunk as it arrives. You see none of this — your code looks like any other OpenAI integration.

Option 2: Local Proxy Server (Any Language)

If you use Python, Go, Java, or any other language with an OpenAI-compatible client library, the SDK includes a local proxy server that handles encryption transparently:
```bash
# Start the local proxy (one command)
npx @premai/pcci-sdk-ts proxy --api-key $PCCI_API_KEY --kek $PCCI_KEK
```
Then point your existing code at localhost:
```python
from openai import OpenAI

# Your existing OpenAI code — just change the base URL
client = OpenAI(
    base_url="http://localhost:3100/v1",
    api_key="unused",  # Auth is handled by the local proxy
)

response = client.chat.completions.create(
    model="your-model",
    messages=[{"role": "user", "content": "Hello, privately."}],
)
```
Zero code changes to your application logic. The local proxy encrypts outbound requests and decrypts responses automatically.

What You Can Do

Chat with AI Models

Full OpenAI-compatible chat API:
| Feature | Details |
| --- | --- |
| Streaming | Real-time word-by-word output, each chunk encrypted individually |
| JSON mode | Structured output for reliable parsing |
| System messages | Control model behavior and personality |
| Multi-turn conversations | Full conversation history and context management |
| Audio transcription | Convert speech to text (Whisper, Deepgram) |
| Audio translation | Translate audio to English |
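Multi-turn context management is the standard append-and-replay pattern used with any OpenAI-compatible client. A minimal sketch (the `add_turn` helper is illustrative, not part of the SDK, and `"your-model"` is a placeholder):

```python
# Minimal sketch of multi-turn context management with any OpenAI-compatible
# client pointed at the local proxy. add_turn is an illustrative helper,
# not part of the SDK.

def add_turn(history: list, role: str, content: str) -> list:
    """Append one message and return the running conversation history."""
    history.append({"role": role, "content": content})
    return history

history = [{"role": "system", "content": "You are a helpful assistant."}]
add_turn(history, "user", "Summarize this quarterly report.")

# Send the full history on every call so the model keeps context:
#   client.chat.completions.create(model="your-model", messages=history)
# ...then append the assistant's reply before the next user turn:
add_turn(history, "assistant", "Revenue grew quarter over quarter ...")
```

Because the proxy encrypts every request independently, replaying the full history works exactly as it would against any other OpenAI-compatible endpoint.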

Error Handling

Standard HTTP status codes with structured error responses:
| Code | What It Means | What to Do |
| --- | --- | --- |
| 400 | Invalid request format | Check your input against the API spec |
| 401 | Invalid API key | Verify your API key is correct and active |
| 403 | Insufficient permissions | Check your API key’s scopes |
| 429 | Rate limited | Implement exponential backoff (examples in Rate Limits) |
| 500 | Server error | Retry with an idempotency key |
| 503 | Temporarily unavailable | Wait and retry |
Every error includes a support_id you can share with our team for debugging.
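One way to act on these codes is to classify them into transient (retry) and permanent (fail) errors and log the `support_id` either way. A hedged sketch; the exact shape of the error body is an assumption for illustration:

```python
# Hedged sketch: classify PCCI HTTP errors into retry vs. fail and surface
# the support_id. The JSON body shape is assumed for illustration.

RETRYABLE = {429, 500, 503}  # rate limited, server error, temporarily unavailable

def classify_error(status: int, body: dict) -> str:
    """Return "retry" for transient errors, "fail" otherwise."""
    support_id = body.get("support_id")
    if support_id:
        print(f"Share this support_id with support: {support_id}")
    return "retry" if status in RETRYABLE else "fail"
```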

Rate Limits

Rate limits are applied per organization, across three dimensions:

| Dimension | What It Limits | Why |
| --- | --- | --- |
| RPS (Requests per second) | How fast you can send requests | Prevents bursts from overwhelming the system |
| TPM (Tokens per minute) | Total token throughput | Manages inference capacity |
| Concurrent | Simultaneous active requests | Ensures fair resource sharing |
Limits increase across tiers (Free, Tier 1, Tier 2, Tier 3) as your usage grows. See Rate Limits for specific values and retry strategies with code examples.
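The retry strategy referenced above is typically full-jitter exponential backoff. A minimal sketch (the function names are illustrative; see Rate Limits for the official examples):

```python
import random
import time

def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0) -> float:
    """Full-jitter delay: uniform in [0, min(cap, base * 2**attempt)] seconds."""
    return random.uniform(0, min(cap, base * (2 ** attempt)))

def with_retries(call, max_attempts: int = 5):
    """Run call(), retrying transient failures (e.g. 429s) with backoff."""
    for attempt in range(max_attempts):
        try:
            return call()
        except RuntimeError:  # stand-in for a rate-limit error from your client
            if attempt == max_attempts - 1:
                raise
            time.sleep(backoff_delay(attempt))
```

The jitter spreads retries from many clients over time, so a burst of 429s does not turn into a synchronized retry storm.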
Ready to start building? See the Quickstart for step-by-step setup, or browse Recipes for copy-paste examples.