---
name: Pcci
description: Use when building end-to-end encrypted AI applications with chat completions, audio processing, and file encryption. Reach for this skill when implementing OpenAI-compatible APIs with zero-knowledge architecture, managing API keys and authentication, handling rate limits, or integrating encrypted AI services into applications.
metadata:
    mintlify-proj: pcci
    version: "1.0"
---

# Prem API Skill Reference

## Product summary

Prem API is an **end-to-end encrypted, OpenAI-compatible API** for chat completions, audio transcription/translation, and file encryption. All data is encrypted client-side before transmission — the platform never sees plaintext. Use the TypeScript SDK (`@premai/api-sdk`) or run the bundled local proxy (`pcci-proxy`) to connect with any OpenAI client. Key files: `.env` (stores `PREM_API_KEY`, `PROXY_URL`, `ENCLAVE_URL`); manage API keys at `dashboard.prem.io/api-keys`. Primary docs: https://docs.prem.io

## When to use

Reach for this skill when:
- Building applications that need **end-to-end encrypted AI** (chat, audio, vision models)
- Integrating **OpenAI-compatible APIs** without changing client code
- Managing **API authentication** with scoped permissions and IP restrictions
- Handling **rate limits** and implementing retry logic
- Processing **audio files** (transcription, translation)
- Uploading and encrypting **files** with zero-knowledge architecture
- Debugging **encryption key management** or secure key exchange (XWing)
- Implementing **streaming chat completions** or non-streaming requests
- Troubleshooting **403 Forbidden**, **429 Too Many Requests**, or **401 Unauthorized** errors

## Quick reference

### SDK Installation & Setup

```bash
npm install @premai/api-sdk
```

### Environment Variables (Required)

```bash
PREM_API_KEY=your-api-key
PROXY_URL=https://gateway.prem.io
ENCLAVE_URL=https://conf-engine.prem.io
```

Get the latest endpoint URLs from: `https://dashboard.prem.io/endpoints.json`

### Basic Client Initialization

```typescript
import createRvencClient from "@premai/api-sdk";

const client = await createRvencClient({
  apiKey: process.env.PREM_API_KEY,
  clientKEK: process.env.CLIENT_KEK  // optional, auto-generated if omitted
});
```

### Common API Calls

| Task | Code |
|------|------|
| **Chat completion (non-streaming)** | `client.chat.completions.create({ model: "openai/gpt-oss-120b", messages: [...] })` |
| **Chat completion (streaming)** | `client.chat.completions.create({ model: "...", messages: [...], stream: true })` |
| **Audio transcription** | `client.audio.transcriptions.create({ file: audioFile, model: "openai/whisper-large-v3" })` |
| **Audio translation** | `client.audio.translations.create({ file: audioFile, model: "openai/whisper-large-v3" })` |
| **Vision/image analysis** | `client.chat.completions.create({ model: "OpenGVLab/InternVL3-38B", messages: [{ role: "user", content: [{ type: "image_url", image_url: { url: "..." } }] }] })` |
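For streaming calls, the response is consumed as an async iterable. The helper below is a minimal sketch that accumulates delta content into a full reply; it assumes chunks follow the OpenAI streaming shape (`choices[0].delta.content`), which the SDK's OpenAI compatibility implies — verify against the SDK's actual types.

```typescript
// Sketch: accumulate streamed delta content into a full reply.
// StreamChunk mirrors the OpenAI-style streaming chunk shape (assumption).
type StreamChunk = { choices: { delta?: { content?: string } }[] };

async function collectStream(stream: AsyncIterable<StreamChunk>): Promise<string> {
  let text = "";
  for await (const chunk of stream) {
    // Each chunk carries an incremental content delta; missing fields are skipped.
    text += chunk.choices[0]?.delta?.content ?? "";
  }
  return text;
}

// Usage against the SDK (requires a configured client):
// const stream = await client.chat.completions.create({
//   model: "openai/gpt-oss-120b",
//   messages: [{ role: "user", content: "Hello!" }],
//   stream: true,
// });
// const reply = await collectStream(stream);
```

Iterating to completion also matters for resource cleanup (see the streaming gotcha below).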

### OpenAI-Compatible Proxy

```bash
npx -p @premai/api-sdk pcci-proxy
# Server runs on http://localhost:3000/v1
```

Then use any OpenAI client:

```typescript
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.PREM_API_KEY,
  baseURL: "http://127.0.0.1:3000/v1"
});
```

### API Key Scopes

| Scope | Purpose |
|-------|---------|
| `api_keys.read` | View and list API keys |
| `chats.completion` | Chat completion access |
| `audio.transcription` | Transcribe audio |
| `audio.translation` | Translate audio |
| `files.encrypted.read` | Read encrypted files |
| `files.encrypted.create` | Upload encrypted files |
| `files.encrypted.delete` | Delete encrypted files |
| `tools.execute` | Execute tools and integrations |

### Rate Limit Tiers (Inference Requests)

| Tier | RPS | TPM | Concurrent | Audio (sec/min) |
|------|-----|-----|-----------|-----------------|
| BASE | 5 | 38,000 | 5 | 10 |
| TIER_1 | 5 | 540,000 | 50 | 60 |
| TIER_2 | 5 | 1,000,000 | 500 | 300 |
| TIER_3 | 5 | 2,500,000 | 5,000 | 1,200 |
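To stay under your tier's concurrent-request limit client-side, you can gate calls through a small semaphore. This is an illustrative sketch, not part of the SDK; `createLimiter` is a hypothetical helper name.

```typescript
// Sketch: cap in-flight requests to your tier's concurrent limit (e.g. 5 on BASE).
function createLimiter(maxConcurrent: number) {
  let available = maxConcurrent;
  const waiting: (() => void)[] = [];

  const acquire = (): Promise<void> => {
    if (available > 0) {
      available--;
      return Promise.resolve();
    }
    // No slot free: queue until a running task releases one.
    return new Promise((resolve) => waiting.push(resolve));
  };

  const release = () => {
    const next = waiting.shift();
    if (next) next(); // hand the slot directly to the next waiter
    else available++;
  };

  return async function run<T>(task: () => Promise<T>): Promise<T> {
    await acquire();
    try {
      return await task();
    } finally {
      release();
    }
  };
}

// const limit = createLimiter(5); // BASE tier
// const replies = await Promise.all(
//   prompts.map((p) => limit(() => client.chat.completions.create({ /* ... */ })))
// );
```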

## Decision guidance

### When to use SDK vs Proxy

| Scenario | Use SDK | Use Proxy |
|----------|---------|----------|
| Building Node.js/TypeScript app | ✓ | |
| Using OpenAI client library | | ✓ |
| Need direct encryption control | ✓ | |
| Integrating with existing OpenAI code | | ✓ |
| Running in browser/frontend | ✓ | |
| Multi-language support needed | | ✓ |

### When to use streaming vs non-streaming

| Scenario | Streaming | Non-streaming |
|----------|-----------|---------------|
| Real-time user feedback | ✓ | |
| Long responses (word-by-word) | ✓ | |
| Simple request-response | | ✓ |
| Batch processing | | ✓ |
| Chat UI with live updates | ✓ | |

### Encryption key management

| Scenario | Action |
|----------|--------|
| First-time setup | Let SDK auto-generate keys with `createRvencClient({ apiKey })` |
| Reuse keys across sessions | Pre-generate with `generateEncryptionKeys()` and store securely |
| Key rotation needed | Generate new keys and update stored reference |
| Lost master key (KEK) | Data is permanently inaccessible — no recovery |

## Workflow

### 1. Set up authentication

1. Open https://dashboard.prem.io and sign in (or register)
2. Navigate to **Developers > API Keys**
3. Create a new API key and copy it
4. Store in `.env` as `PREM_API_KEY` (never commit to source control)
5. Get latest endpoint URLs from `https://dashboard.prem.io/endpoints.json`
6. Add `PROXY_URL` and `ENCLAVE_URL` to `.env`
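Since missing variables cause cryptic failures later (see gotchas), it helps to validate them at startup. A minimal sketch; `requireEnv` is an illustrative helper, not SDK API:

```typescript
// Sketch: fail fast at startup if any required variable is missing.
function requireEnv(names: string[]): Record<string, string> {
  const missing = names.filter((n) => !process.env[n]);
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(", ")}`);
  }
  // All present: return a plain name→value map for convenience.
  return Object.fromEntries(names.map((n) => [n, process.env[n] as string]));
}

// const env = requireEnv(["PREM_API_KEY", "PROXY_URL", "ENCLAVE_URL"]);
```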

### 2. Initialize the client

```typescript
import createRvencClient from "@premai/api-sdk";

const client = await createRvencClient({
  apiKey: process.env.PREM_API_KEY
});
```

The SDK auto-generates encryption keys on first call. For persistent keys across sessions, pre-generate and store them.

### 3. Make your first request

```typescript
const response = await client.chat.completions.create({
  model: "openai/gpt-oss-120b",
  messages: [{ role: "user", content: "Hello!" }]
});

console.log(response.choices[0].message.content);
```

### 4. Handle errors and rate limits

Check for error status codes and implement retry logic:

```typescript
try {
  const response = await client.chat.completions.create({...});
} catch (error: any) {
  if (error.status === 429) {
    // Rate limited — wait and retry
    const retryAfter = Number(error.headers?.["retry-after"]) || 1;
    await new Promise(r => setTimeout(r, retryAfter * 1000));
  } else if (error.status === 401) {
    // Invalid API key
    console.error("Check PREM_API_KEY");
  } else if (error.status === 403) {
    // Missing scope — check API key permissions
    console.error("API key lacks required scope");
  }
}
```
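For production use, the single retry above should become exponential backoff with jitter, preferring the server's `Retry-After` hint when present. A sketch under those assumptions; `withRetries` and `backoffDelayMs` are illustrative helpers, not SDK API:

```typescript
// Sketch: "full jitter" exponential backoff — delay doubles per attempt
// up to a cap, then a random fraction of it spreads retries out.
function backoffDelayMs(attempt: number, baseMs = 500, capMs = 30_000): number {
  const exp = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.random() * exp;
}

async function withRetries<T>(fn: () => Promise<T>, maxAttempts = 5): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (error: any) {
      // Only retry rate limits, and give up after maxAttempts.
      if (error?.status !== 429 || attempt + 1 >= maxAttempts) throw error;
      const hinted = Number(error?.headers?.["retry-after"]);
      const delay = Number.isFinite(hinted) && hinted > 0
        ? hinted * 1000          // server told us how long to wait
        : backoffDelayMs(attempt); // otherwise back off with jitter
      await new Promise((r) => setTimeout(r, delay));
    }
  }
}

// const response = await withRetries(() =>
//   client.chat.completions.create({ model: "openai/gpt-oss-120b", messages: [...] })
// );
```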

### 5. Verify encryption is working

All data is encrypted client-side automatically. To verify:
- Check that plaintext never appears in network logs (use browser DevTools or Wireshark)
- Confirm the enclave receives only encrypted payloads
- Use the `/auth/debug_token` endpoint to inspect your authenticated request

## Common gotchas

- **Missing environment variables**: All three (`PREM_API_KEY`, `PROXY_URL`, `ENCLAVE_URL`) are required. If any is missing, requests will fail silently or with cryptic errors. Always load from `.env` and validate on startup.

- **API key scopes**: A 403 error means your API key lacks the required scope. Check the scope list in the dashboard and regenerate with the correct permissions (e.g., `chats.completion` for chat requests).

- **Lost master key (KEK)**: If you lose your encryption key, all encrypted data becomes permanently inaccessible. There is no recovery. Always back up your keys to multiple secure locations.

- **Rate limit retries**: Retrying immediately after a 429 error will fail again. Respect the `Retry-After` header and implement exponential backoff with jitter. Continuous retries count against your rate limit.

- **Streaming response handling**: When using `stream: true`, you must iterate through all chunks with `for await (const chunk of stream)`. Incomplete iteration leaves the connection open and wastes resources.

- **Streaming buffer size**: The `maxBufferSize` option (default 10 MB) caps the SSE buffer for streaming responses. For large streaming responses, increase it: `createRvencClient({ ..., maxBufferSize: 20 * 1024 * 1024 })`.

- **Proxy endpoint mismatch**: If using the local proxy, ensure your OpenAI client points to `http://127.0.0.1:3000/v1` (or your configured port), not the remote gateway.

- **Encryption key reuse**: Pre-generated keys should be stored securely and reused across sessions for consistency. Generating new keys each time creates new encryption contexts.

- **Audio file format**: Transcription/translation support MP3, WAV, M4A, FLAC, and other formats. Verify your file is in a supported format before uploading.

- **Model names**: Always use the full model identifier (e.g., `openai/gpt-oss-120b`, `OpenGVLab/InternVL3-38B`). Partial names will fail with 400 Bad Request.

## Verification checklist

Before submitting work with Prem API:

- [ ] API key is stored in `.env` and never committed to source control
- [ ] All three environment variables (`PREM_API_KEY`, `PROXY_URL`, `ENCLAVE_URL`) are set
- [ ] Client is initialized with `await createRvencClient({ apiKey })`
- [ ] Requests use correct model names (check `/models` endpoint or docs)
- [ ] Error handling includes checks for 401 (auth), 403 (scope), 429 (rate limit)
- [ ] Rate limit retry logic uses exponential backoff and respects `Retry-After` header
- [ ] Streaming requests iterate through all chunks with `for await`
- [ ] API key has required scopes for the operation (e.g., `chats.completion` for chat)
- [ ] Encryption keys are backed up if using pre-generated keys
- [ ] No plaintext data appears in logs or network traces
- [ ] Concurrent request count stays below tier limit (5 for BASE, 50 for TIER_1, etc.)

## Resources

**Comprehensive navigation**: https://docs.prem.io/llms.txt

**Critical documentation pages**:
- [Quickstart](https://docs.prem.io/basics/build/quickstart) — Get running in minutes with SDK or proxy
- [Encryption Architecture](https://docs.prem.io/developer-resources/get-started/encryption) — Understand zero-knowledge E2EE, XWing key exchange, and file encryption
- [API Keys & Authentication](https://docs.prem.io/developer-resources/get-started/api-keys) — Manage scopes, IP restrictions, and organization-level keys
- [Rate Limits](https://docs.prem.io/developer-resources/get-started/rate-limits) — Understand tiers, token limits, concurrent requests, and retry strategies
- [Error Handling](https://docs.prem.io/developer-resources/get-started/errors) — Common status codes and debugging with `support_id`
- [Chat Completions API](https://docs.prem.io/developer-resources/reference/rvenc/chat-completions) — Streaming, non-streaming, and configuration options
- [Recipes](https://docs.prem.io/recipes/overview) — Step-by-step guides for chat, audio, and vision tasks

---

> For additional documentation and navigation, see: https://docs.prem.io/llms.txt