No encryption expertise required. PCCI is designed so that the encryption layer is completely invisible in your application code. If you’ve built anything with the OpenAI API, you already know how to use PCCI. The SDK handles all cryptography automatically — you write normal API calls and get normal responses.
## Two Ways to Integrate
### Option 1: PCCI TypeScript SDK (Recommended)
Install the SDK and use it like any OpenAI client:
```typescript
import { createRvencClient } from "@premai/pcci-sdk-ts";

const client = await createRvencClient({
  apiKey: process.env.PCCI_API_KEY,
  clientKEK: process.env.PCCI_KEK, // Your master key — you generate it, we never see it
});

// This looks exactly like an OpenAI call — because it is
const chat = await client.chat.completions.create({
  model: "your-model",
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "Summarize this quarterly report." },
  ],
  stream: true,
});

for await (const chunk of chat) {
  process.stdout.write(chunk.choices[0]?.delta?.content || "");
}
```
Behind the scenes, the SDK encrypts your messages before sending, performs a secure key exchange with the enclave, and decrypts each streaming chunk as it arrives. You see none of this — your code looks like any other OpenAI integration.
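To make "each streaming chunk encrypted individually" concrete, here is an illustrative sketch of per-chunk authenticated encryption using standard AES-256-GCM from Node's `crypto` module. This is not PCCI's actual protocol: the real key exchange, key sizes, and wire format are internal to the SDK, and `sessionKey` here stands in for whatever key the enclave handshake would derive.

```typescript
// Illustrative only: shows the shape of per-chunk authenticated encryption,
// not the PCCI wire protocol.
import { randomBytes, createCipheriv, createDecipheriv } from "node:crypto";

// In PCCI this would come from the key exchange with the enclave (assumption).
const sessionKey = randomBytes(32);

function encryptChunk(plaintext: string): { iv: Buffer; tag: Buffer; data: Buffer } {
  const iv = randomBytes(12); // fresh nonce for every chunk
  const cipher = createCipheriv("aes-256-gcm", sessionKey, iv);
  const data = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  return { iv, tag: cipher.getAuthTag(), data };
}

function decryptChunk(chunk: { iv: Buffer; tag: Buffer; data: Buffer }): string {
  const decipher = createDecipheriv("aes-256-gcm", sessionKey, chunk.iv);
  decipher.setAuthTag(chunk.tag); // a tampered chunk fails authentication here
  return Buffer.concat([decipher.update(chunk.data), decipher.final()]).toString("utf8");
}

const roundTrip = decryptChunk(encryptChunk("Summarize this quarterly report."));
```

The point of GCM's authentication tag is that a modified or reordered chunk is rejected rather than silently decrypted to garbage.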
### Option 2: Local Proxy Server (Any Language)
If you use Python, Go, Java, or any other language with an OpenAI-compatible client library, the SDK includes a local proxy server that handles encryption transparently:
```bash
# Start the local proxy (one command)
npx @premai/pcci-sdk-ts proxy --api-key $PCCI_API_KEY --kek $PCCI_KEK
```
Then point your existing code at localhost:
```python
from openai import OpenAI

# Your existing OpenAI code — just change the base URL
client = OpenAI(
    base_url="http://localhost:3100/v1",
    api_key="unused",  # Auth is handled by the local proxy
)

response = client.chat.completions.create(
    model="your-model",
    messages=[{"role": "user", "content": "Hello, privately."}],
)
```
Zero code changes to your application logic. The local proxy encrypts outbound requests and decrypts responses automatically.
## What You Can Do
### Chat with AI Models
Full OpenAI-compatible chat API:
| Feature | Details |
|---|---|
| Streaming | Real-time word-by-word output, each chunk encrypted individually |
| JSON mode | Structured output for reliable parsing |
| System messages | Control model behavior and personality |
| Multi-turn conversations | Full conversation history and context management |
| Audio transcription | Convert speech to text (Whisper, Deepgram) |
| Audio translation | Translate audio to English |
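For JSON mode, the response body still arrives as a string that your code must parse and validate. A minimal sketch, assuming the OpenAI-style response shape that PCCI's compatibility implies (the `response_format: { type: "json_object" }` parameter shown in the comment is the standard OpenAI spelling, assumed here to carry over):

```typescript
// The subset of the OpenAI-style completion shape we rely on.
type ChatCompletion = {
  choices: { message: { content: string | null } }[];
};

// Parse a JSON-mode completion defensively: the model returns a JSON string
// in `message.content`, which you should still validate on your side.
function parseJsonCompletion<T>(completion: ChatCompletion): T {
  const content = completion.choices[0]?.message?.content;
  if (!content) throw new Error("empty completion");
  return JSON.parse(content) as T;
}

// The live call would look like (assumed parameter, per OpenAI compatibility):
//   await client.chat.completions.create({
//     model: "your-model",
//     response_format: { type: "json_object" },
//     messages: [{ role: "user", content: 'Return {"revenue": number}.' }],
//   });
// Here we use a mocked response so the parsing logic stands alone:
const mock: ChatCompletion = {
  choices: [{ message: { content: '{"revenue": 12.5}' } }],
};
const report = parseJsonCompletion<{ revenue: number }>(mock);
```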
## Error Handling
Standard HTTP status codes with structured error responses:
| Code | What It Means | What to Do |
|---|---|---|
| 400 | Invalid request format | Check your input against the API spec |
| 401 | Invalid API key | Verify your API key is correct and active |
| 403 | Insufficient permissions | Check your API key’s scopes |
| 429 | Rate limited | Implement exponential backoff (examples in Rate Limits) |
| 500 | Server error | Retry with an idempotency key |
| 503 | Temporarily unavailable | Wait and retry |
Every error includes a support_id you can share with our team for debugging.
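A small sketch of surfacing these errors in logs. `support_id` is documented above; the surrounding field names (`error.message`, `error.type`) are assumptions modeled on common OpenAI-style error bodies:

```typescript
// Assumed error body shape; only `support_id` is confirmed by the docs.
type PcciErrorBody = {
  error?: { message?: string; type?: string };
  support_id?: string;
};

// Turn a status code + body into a log line, flagging which codes the
// table above marks as worth retrying (429, 500, 503).
function describeError(status: number, body: PcciErrorBody): string {
  const retriable = status === 429 || status === 500 || status === 503;
  const parts = [
    `HTTP ${status}: ${body.error?.message ?? "unknown error"}`,
    retriable ? "(retriable)" : "(fix the request before retrying)",
  ];
  if (body.support_id) parts.push(`support_id=${body.support_id}`); // share with support
  return parts.join(" ");
}

const msg = describeError(429, {
  error: { message: "Rate limit exceeded", type: "rate_limit_error" },
  support_id: "sup_123",
});
```

Logging the `support_id` at the point of failure saves a round trip when you later open a support ticket.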
## Rate Limits
Rate limits are per-organization, across three dimensions:
| Dimension | What It Limits | Why |
|---|---|---|
| RPS (Requests per second) | How fast you can send requests | Prevents bursts from overwhelming the system |
| TPM (Tokens per minute) | Total token throughput | Manages inference capacity |
| Concurrent | Simultaneous active requests | Ensures fair resource sharing |
Limits increase across tiers (Free, Tier 1, Tier 2, Tier 3) as your usage grows. See Rate Limits for specific values and retry strategies with code examples.
Ready to start building? See the Quickstart for step-by-step setup, or browse Recipes for copy-paste examples.