
The Simple Version

Before we get into architecture diagrams, here’s the core idea:
  1. You type a prompt
  2. Your device encrypts it before sending anything over the network
  3. Our gateway receives the encrypted payload — it handles authentication and billing, but it cannot read your data
  4. The encrypted payload enters a sealed hardware environment (a Confidential Virtual Machine) where it gets decrypted, processed by the AI model, and the response is encrypted again
  5. The encrypted response travels back to your device, where it’s decrypted and displayed
At no point does your data exist in plaintext outside of (a) your device and (b) the sealed hardware environment. Not on our servers, not in our logs, not in transit. The hardware itself — made by AMD, Intel, and NVIDIA — enforces this seal. It’s not a software setting that someone can turn off.
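The five steps above can be sketched in code. This is a toy illustration only: the `toy_encrypt`/`toy_decrypt` helpers below use a trivially insecure XOR keystream as a stand-in for the real quantum-resistant cryptography, and all names are invented for this sketch — it shows the shape of the flow, not the actual protocol.

```python
import hashlib
import os

def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Derive a pseudorandom keystream from key + nonce (toy construction)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]

def toy_encrypt(key: bytes, plaintext: bytes) -> tuple[bytes, bytes]:
    nonce = os.urandom(16)
    ct = bytes(p ^ k for p, k in zip(plaintext, keystream(key, nonce, len(plaintext))))
    return nonce, ct

def toy_decrypt(key: bytes, nonce: bytes, ct: bytes) -> bytes:
    return bytes(c ^ k for c, k in zip(ct, keystream(key, nonce, len(ct))))

# 1. You type a prompt; 2. your device encrypts it before anything hits the network.
device_key = os.urandom(32)  # in reality, established with the enclave via key exchange
nonce, payload = toy_encrypt(device_key, b"What is confidential computing?")

# 3. The gateway only ever holds ciphertext bytes — it cannot recover the prompt.

# 4. Inside the sealed environment: decrypt, process, re-encrypt.
prompt = toy_decrypt(device_key, nonce, payload)
r_nonce, response = toy_encrypt(device_key, b"Answer: ...")

# 5. Back on your device: decrypt and display.
print(toy_decrypt(device_key, r_nonce, response).decode())
```

The key point the sketch captures: plaintext only ever exists where `toy_decrypt` is called — on your device and inside the enclave.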

Architecture Overview

The Components

Your Device — The PCCI SDK

The SDK runs on your side — your laptop, your server, your application, your browser. It’s the only place (besides the sealed enclave) where your data exists in readable form. What it does:
  • Encrypts everything before it leaves your device — using modern, quantum-resistant cryptography
  • Holds your master encryption key — a key you generate, that never leaves your device
  • Decrypts responses when they come back
From your code’s perspective, it looks and feels like the standard OpenAI SDK. The encryption is invisible — the SDK handles it automatically.
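To make "looks and feels like the standard OpenAI SDK" concrete, here is a mock client with that call shape. The class and model names are invented for illustration; the real SDK would encrypt the messages transparently where the comment indicates, rather than echoing locally.

```python
from types import SimpleNamespace

class MockPCCIClient:
    """Stand-in for a PCCI SDK client (hypothetical name) that mimics the
    OpenAI-style surface. The real SDK encrypts transparently; this mock
    only demonstrates the interface."""

    def __init__(self, api_key: str):
        self.api_key = api_key
        self.chat = SimpleNamespace(completions=SimpleNamespace(create=self._create))

    def _create(self, model: str, messages: list) -> SimpleNamespace:
        # Real SDK: encrypt messages -> POST to gateway -> decrypt response.
        reply = f"[{model}] echo: {messages[-1]['content']}"
        return SimpleNamespace(
            choices=[SimpleNamespace(message=SimpleNamespace(content=reply))]
        )

client = MockPCCIClient(api_key="pcci_...")
resp = client.chat.completions.create(
    model="llama-3.3-70b",
    messages=[{"role": "user", "content": "Hello"}],
)
print(resp.choices[0].message.content)
```

Application code written against the standard `chat.completions.create(...)` pattern would work unchanged; only the client construction differs.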

PCCI API — The Blind Gateway

The proxy is the front door to the platform. It handles the operational side — checking your API key, enforcing rate limits, tracking usage for billing, routing requests. The critical point: it never sees your actual data. The proxy processes only encrypted payloads and metadata (like API keys and timestamps). It has no encryption keys and no way to decrypt what passes through it.
| What the Proxy does | What the Proxy cannot do |
| --- | --- |
| Validate your API key and permissions | Read your prompts or responses |
| Enforce rate limits for your organization | Access any encryption keys |
| Route encrypted payloads to the right enclave | Log or inspect your data |
| Track usage for billing | Decrypt files you’ve uploaded |
Even if the proxy were fully compromised by an attacker, they’d get encrypted bytes and metadata — never your actual content.
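A hypothetical request envelope makes the split concrete: the gateway can read routing metadata, while the payload fields are opaque ciphertext. The field names below are invented for illustration, not the actual wire format.

```python
import base64
import json
import os

# Hypothetical wire format — field names are illustrative only.
envelope = {
    # Metadata the gateway CAN read (auth, billing, routing):
    "api_key_id": "key_123",
    "timestamp": "2025-01-01T12:00:00Z",
    "model_hint": "llama-3.3-70b",
    # Payload the gateway CANNOT read — opaque ciphertext:
    "ciphertext": base64.b64encode(os.urandom(64)).decode(),
    "nonce": base64.b64encode(os.urandom(16)).decode(),
}

def gateway_view(env: dict) -> dict:
    """Everything the gateway can act on; the ciphertext is just bytes to route."""
    visible = {k: v for k, v in env.items() if k not in ("ciphertext", "nonce")}
    visible["payload_bytes"] = len(env["ciphertext"])
    return visible

print(json.dumps(gateway_view(envelope), indent=2))
```

Even with full access to the envelope, the gateway's view contains no decryptable content — only metadata and a byte count.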

PCCI Enclave — The Sealed Processing Environment

This is where your data is actually processed. The enclave runs inside a Trusted Execution Environment (TEE) — a sealed area of the processor with its own encrypted memory that the rest of the system cannot access.
Think of it like a bank vault inside a building. The building owner has keys to every room — but the vault has its own lock that even the building owner cannot open. In this analogy, the “building” is the server, the “building owner” is whoever operates the server (us, or our infrastructure provider), and the “vault” is the TEE.
Inside the enclave:
  1. Your encrypted payload arrives
  2. The enclave decrypts it using a secure key exchange
  3. The AI model processes your request
  4. The response is encrypted before leaving
  5. All plaintext is wiped from memory
The enclave runs on AMD SEV-SNP or Intel TDX processors, with NVIDIA Hopper and Blackwell architecture GPUs in confidential compute mode. The hardware enforces isolation — it is not a software setting that can be turned off with admin privileges.
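The five enclave steps can be sketched as a single handler. All helper names are invented, the XOR "cipher" is a toy stand-in for real AEAD decryption, and the memory wipe is only a gesture — real enclaves zeroize memory at the hardware/runtime level, below anything Python can express.

```python
def decrypt(key: bytes, data: bytes) -> bytes:
    # Toy stand-in for real AEAD decryption (XOR with a repeating key).
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

encrypt = decrypt  # XOR is its own inverse — toy only

def run_model(prompt: bytes) -> bytes:
    # Stand-in for LLM inference running inside the CVM.
    return b"response to: " + prompt

def handle_request(encrypted_payload: bytes, session_key: bytes) -> bytes:
    # 1-2. Encrypted payload arrives and is decrypted with the session key.
    plaintext = bytearray(decrypt(session_key, encrypted_payload))
    try:
        # 3. The AI model processes the request.
        reply = run_model(bytes(plaintext))
        # 4. The response is encrypted before leaving the enclave.
        return encrypt(session_key, reply)
    finally:
        # 5. All plaintext is wiped from memory.
        for i in range(len(plaintext)):
            plaintext[i] = 0

key = bytes(range(32))
sealed = encrypt(key, b"hello enclave")
print(decrypt(key, handle_request(sealed, key)).decode())
```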

Model Router

A service that directs AI requests to the right model — all hosted within our confidential infrastructure. It manages which models are available, performs health checks, and selects the appropriate backend. It runs inside the same sealed environment as the enclave, so it never exposes your data outside the confidential compute boundary. No requests leave our infrastructure — all models run on our own hardware inside CVMs.
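At its core, routing of this kind is a health-checked lookup from model name to backend. A minimal sketch, with backend names and the registry structure entirely invented:

```python
# Hypothetical registry: model name -> candidate backends inside CVMs.
BACKENDS = {
    "llama-3.3-70b": ["cvm-gpu-01", "cvm-gpu-02"],
    "whisper-large": ["cvm-gpu-03"],
}
# Health state as maintained by periodic health checks (values invented).
HEALTHY = {"cvm-gpu-01": False, "cvm-gpu-02": True, "cvm-gpu-03": True}

def route(model: str) -> str:
    """Pick the first healthy backend for a model, or fail fast."""
    for backend in BACKENDS.get(model, []):
        if HEALTHY.get(backend):
            return backend
    raise RuntimeError(f"no healthy backend for {model}")

print(route("llama-3.3-70b"))  # skips the unhealthy cvm-gpu-01
```

Because both the router and every backend it selects run inside CVMs, the decrypted request never crosses the confidential compute boundary during routing.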

Everything Runs in Sealed Environments

The enclave isn’t the only component inside the sealed environment. Every service that processes your data runs inside Confidential Virtual Machines (CVMs):
| Service | What It Does | Runs in CVM? |
| --- | --- | --- |
| Enclave | Decrypts, orchestrates, encrypts | Yes |
| Model Router | Routes to the right AI model | Yes |
| LLM Inference | Runs the AI model on your prompt (all models self-hosted) | Yes |
| Speech-to-Text (Deepgram, Whisper) | Transcribes your audio | Yes |
There is no gap in the chain. From the moment your data is decrypted to the moment the response is re-encrypted, every service touching your data is running inside hardware-sealed, attested confidential compute. The only component outside the CVM is the API Gateway — and it only handles encrypted bytes.

Where the Infrastructure Lives

PCCI runs on a hybrid infrastructure — a mix of hardware we own and capacity we rent:
  • Owned infrastructure is located in Switzerland, under Swiss data protection law
  • Rented infrastructure is primarily in Europe, with some deployments in the United States
All machines — owned or rented — are unattended. There is no SSH access, no remote desktop, no debug console. Nobody logs into these machines. The TEE hardware enforces isolation regardless of who owns the physical server, and attestation provides the same cryptographic proof in both environments.
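At its simplest, attestation means comparing a hardware-signed measurement of the enclave's code against a value you expect. The sketch below reduces this to a hash comparison with invented names; real attestation reports (SEV-SNP, TDX) are signed structures verified against a vendor certificate chain, not bare hashes.

```python
import hashlib
import hmac

# The measurement you expect: a hash of the enclave image you trust.
trusted_image = b"enclave-build-v1.2.3"  # stand-in for the real enclave binary
expected_measurement = hashlib.sha384(trusted_image).hexdigest()

def verify_attestation(reported_measurement: str) -> bool:
    """Accept the enclave only if its reported measurement matches exactly.

    compare_digest avoids leaking the match position through timing.
    """
    return hmac.compare_digest(reported_measurement, expected_measurement)

# A genuine enclave reports the matching measurement; a tampered one cannot.
assert verify_attestation(hashlib.sha384(trusted_image).hexdigest())
assert not verify_attestation(hashlib.sha384(b"tampered-build").hexdigest())
```

This is why ownership of the physical server does not matter: the proof is cryptographic, and it is identical for owned and rented machines.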

What Happens When You Send a Message

A chat request follows the full lifecycle described above: encrypted on your device, routed through the blind gateway, decrypted and processed inside the enclave, re-encrypted, and returned. For streaming responses (ChatGPT-style word-by-word output), each chunk is individually encrypted inside the enclave before being sent; the proxy forwards chunks without buffering or inspecting them.
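Per-chunk streaming encryption can be sketched like this: each chunk gets its own keystream derived from the session key and a counter, so the client can decrypt tokens as they arrive without buffering. As before, the XOR construction and all names are invented stand-ins for the real cryptography.

```python
import hashlib

def chunk_key(session_key: bytes, counter: int) -> bytes:
    # Derive a fresh keystream per chunk from the session key + counter.
    return hashlib.sha256(session_key + counter.to_bytes(8, "big")).digest()

def seal_chunk(session_key: bytes, counter: int, chunk: bytes) -> bytes:
    ks = chunk_key(session_key, counter)
    return bytes(b ^ ks[i % len(ks)] for i, b in enumerate(chunk))

open_chunk = seal_chunk  # XOR with the same keystream inverts itself — toy only

session_key = b"\x01" * 32
tokens = [b"Conf", b"idential ", b"computing."]

# Enclave side: encrypt each chunk as it is generated...
wire = [seal_chunk(session_key, i, t) for i, t in enumerate(tokens)]
# ...client side: decrypt incrementally as chunks arrive.
print(b"".join(open_chunk(session_key, i, c) for i, c in enumerate(wire)).decode())
```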

What You Don’t Need to Manage

The SDK handles all of this automatically. You don’t need to:
  • Understand or manage encryption algorithms
  • Perform key exchanges manually
  • Encrypt or decrypt anything in your application code
  • Handle streaming decryption
From your application’s perspective, you make standard API calls and get standard responses. The encryption layer is completely invisible.
For the full cryptographic details (algorithms, key types, protocols), see the Encryption reference. To understand the security guarantees and their limits, continue to Security Model.