# Xcity AI OS — full documentation dump
# Generated 2026-06-03T10:12:04.254Z
# Source: https://xcity.ai/docs


---

# Inference API
URL: https://xcity.ai/docs/en/api-reference/inference
Description: OpenAI-compatible chat, completions, and embeddings endpoints served by tokenhub.xcity.one.


The inference gateway lives at `https://tokenhub.xcity.one/v1` and speaks the OpenAI REST contract. Any OpenAI SDK works.

## Authentication

All requests require a bearer token from `/dashboard/keys`:

```
Authorization: Bearer sk-...
```

Keys are revocable from the dashboard or via the [Keys API](/docs/en/api-reference/keys). Rotating a key takes effect within ~5s globally.

## POST /v1/chat/completions

Standard OpenAI chat-completions shape.

```bash
curl https://tokenhub.xcity.one/v1/chat/completions \
  -H "Authorization: Bearer $XCITY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-6",
    "messages": [
      {"role": "system", "content": "You are concise."},
      {"role": "user", "content": "Summarize the Argentina project in two sentences."}
    ],
    "stream": false
  }'
```

Response:

```json
{
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "created": 1747353600,
  "model": "claude-sonnet-4-6",
  "choices": [
    {
      "index": 0,
      "message": { "role": "assistant", "content": "..." },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 32,
    "completion_tokens": 64,
    "total_tokens": 96
  }
}
```
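
As a minimal sketch of consuming that response, the helper below pulls out the assistant text and the token total. The function name `extract_reply` and the sample body are illustrative assumptions; only the field names come from the response shape shown above.

```python
from typing import Any


def extract_reply(response: dict[str, Any]) -> tuple[str, int]:
    """Return (assistant text, total tokens) from a chat.completion body."""
    # The first choice carries the assistant message in the shape shown above.
    choice = response["choices"][0]
    text = choice["message"]["content"]
    total = response["usage"]["total_tokens"]
    return text, total


# Hypothetical sample body mirroring the documented fields.
sample = {
    "object": "chat.completion",
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Hi."},
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 32, "completion_tokens": 64, "total_tokens": 96},
}

text, total = extract_reply(sample)
print(text, total)
```

Reading `usage.total_tokens` per call is one way to keep a client-side running total alongside the figures on `/dashboard/usage`.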

## POST /v1/completions

Legacy completions endpoint. Supported for OpenAI parity, but new code should use `/v1/chat/completions`.

## POST /v1/embeddings

```bash
curl https://tokenhub.xcity.one/v1/embeddings \
  -H "Authorization: Bearer $XCITY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{ "model": "text-embedding-3-small", "input": "hello world" }'
```

## GET /v1/models

Returns the models allowed by the requesting key's plan whitelist, not the global catalog. Use this to populate UI model pickers without showing models the user's plan doesn't include.

## Streaming

Set `"stream": true` for SSE-style streaming. The wire format matches OpenAI exactly:

```
data: {"choices":[{"delta":{"content":"He"}}]}
data: {"choices":[{"delta":{"content":"llo"}}]}
data: [DONE]
```
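
If you are not using an SDK, the stream can be reassembled by hand. The sketch below assumes you already have the response body split into lines; `collect_stream` is a hypothetical helper name, and it only handles the `data:` lines shown above.

```python
import json
from typing import Iterable


def collect_stream(lines: Iterable[str]) -> str:
    """Concatenate the content deltas from SSE `data:` lines."""
    parts = []
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and any other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # sentinel that ends the stream
        chunk = json.loads(payload)
        delta = chunk["choices"][0].get("delta", {})
        # Some chunks may carry no content (for example a role-only delta).
        parts.append(delta.get("content") or "")
    return "".join(parts)


wire = [
    'data: {"choices":[{"delta":{"content":"He"}}]}',
    'data: {"choices":[{"delta":{"content":"llo"}}]}',
    "data: [DONE]",
]
print(collect_stream(wire))
```

The OpenAI SDKs do this parsing for you when `stream=True`; the manual version is mainly useful for proxies and tests.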

## Error codes

| Status | Meaning |
|---|---|
| `401` | Invalid or revoked key |
| `402` | Budget cap exceeded (per-request or monthly) |
| `403` | Model not in your plan's whitelist |
| `429` | Rate limit hit; retry with exponential backoff |
| `5xx` | Upstream provider or gateway issue; safe to retry idempotent calls |

All error bodies follow the OpenAI shape:

```json
{ "error": { "message": "...", "type": "...", "code": "..." } }
```
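
The status table suggests a simple client-side retry policy. The following is a sketch under assumptions: the function names, the base delay, and the delay cap are illustrative, and full-jitter backoff is one common choice rather than something the gateway mandates.

```python
import random


def should_retry(status: int, idempotent: bool = True) -> bool:
    """Decide whether a failed call is worth retrying."""
    if status == 429:
        return True  # rate limited: back off and try again
    if 500 <= status <= 599:
        return idempotent  # only retry calls that are safe to repeat
    return False  # 401, 402, 403: fix the key, budget, or plan instead


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0) -> float:
    """Full-jitter exponential backoff: uniform in [0, min(cap, base * 2**attempt)]."""
    return random.uniform(0, min(cap, base * 2 ** attempt))


for attempt in range(4):
    print(attempt, round(backoff_delay(attempt), 3))
```

Jitter spreads retries from many clients over time, which helps avoid synchronized bursts after a rate-limit window resets.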


---

# Auth API
URL: https://xcity.ai/docs/en/api-reference/auth
Description: Session, registration, password, and identity endpoints exposed by xcity-home.


These endpoints live at `https://www.xcity.one/api/auth/*` and are consumed by the website's own forms plus any sub-product needing to know "who is logged in." All set or read the `xcity_session` cookie.

## POST /api/auth/register

```http
POST /api/auth/register
Content-Type: application/json

{ "email": "...", "password": "...", "name": "..." }
```

Creates a GoTrue user, sends a confirmation email, and returns `{ ok: true }`. No session is established until the user confirms the email and logs in.

## POST /api/auth/login

```http
POST /api/auth/login
Content-Type: application/json

{ "email": "...", "password": "..." }
```

Sets the `xcity_session` cookie. Returns `{ user: { id, email, name, plan } }`.

## POST /api/auth/signout

Clears the session cookie. Returns `{ ok: true }`.

## GET /api/auth/me

Returns the current user (or `401` if unauthenticated). Used by sub-products to confirm identity.

```json
{
  "user": {
    "id": "uuid",
    "email": "you@example.com",
    "name": "...",
    "plan": "pro"
  }
}
```

## POST /api/auth/forgot-password

```http
POST /api/auth/forgot-password
Content-Type: application/json

{ "email": "..." }
```

Sends a reset email. Always returns `{ ok: true }` — we never disclose whether an address exists.

## CORS

Every `/api/auth/*` endpoint accepts requests from `https://*.xcity.one` (regex match) and dev origins listed in `XCT_CORS_EXTRA_ORIGINS`. Pre-flight (`OPTIONS`) responses are cached for 24h.
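
To illustrate the allowlist idea, here is a rough sketch of an origin check. The regex, the function names, and the comma-separated format for `XCT_CORS_EXTRA_ORIGINS` are all assumptions made for the example; the real rules live in xcity-home and may differ.

```python
import re

# Hypothetical pattern: any https subdomain of xcity.one, with no port or path.
XCITY_ORIGIN = re.compile(r"https://[a-z0-9-]+(\.[a-z0-9-]+)*\.xcity\.one")


def parse_extra_origins(value: str) -> set[str]:
    """Split an extra-origins string, assumed here to be comma-separated."""
    return {origin.strip() for origin in value.split(",") if origin.strip()}


def is_allowed_origin(origin: str, extra: set[str]) -> bool:
    """Allow *.xcity.one over https, plus exact matches from the extra list."""
    # fullmatch anchors both ends, so look-alike suffixes cannot slip through.
    return XCITY_ORIGIN.fullmatch(origin) is not None or origin in extra


extra = parse_extra_origins("http://localhost:3000, http://localhost:4321")
print(is_allowed_origin("https://chat.xcity.one", extra))
print(is_allowed_origin("https://xcity.one.evil.com", extra))
```

Anchoring the whole origin matters: an unanchored pattern would accept hosts that merely contain `xcity.one` somewhere in the name.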

See [Concepts: Authentication flow](/docs/en/concepts/auth-flow) for the full sub-product story.


---

# Billing API
URL: https://xcity.ai/docs/en/api-reference/billing
Description: Plan, checkout, portal, and invoice endpoints backed by Stripe.


## GET /api/billing/plan

Returns the active plan for the current user.

```json
{
  "plan": {
    "id": "pro",
    "name": "Xcity Pro",
    "renews_at": "2026-06-01T00:00:00Z",
    "entitlements": ["full-catalog", "priority-routing"]
  }
}
```

Used by sub-products to render gating and entitlements without each app holding Stripe credentials.

## POST /api/billing/checkout

```http
POST /api/billing/checkout
Content-Type: application/json

{ "price_id": "price_..." }
```

Returns `{ url: "..." }` — a one-time Stripe Checkout session URL. Redirect the user there. On success they return to `/dashboard/billing?success=1`.

## GET /api/billing/portal

Returns `{ url: "..." }` — a one-time Stripe Customer Portal URL where the user manages payment methods, switches plans, and downloads invoices.

## GET /api/billing/invoices

Returns the user's invoice history.

```json
{
  "invoices": [
    { "id": "in_...", "amount": 2900, "currency": "usd", "status": "paid", "created": 1747000000, "pdf_url": "..." }
  ]
}
```
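
A small sketch of rendering one invoice row follows. It assumes `amount` is in the currency's minor unit (cents for USD, as Stripe reports amounts) and that `created` is a Unix timestamp in seconds; `describe_invoice` and the sample `id` are hypothetical.

```python
from datetime import datetime, timezone


def describe_invoice(inv: dict) -> str:
    """Format one invoice entry as a human-readable line."""
    # Assumes a two-decimal currency such as USD; zero-decimal currencies differ.
    amount = inv["amount"] / 100
    when = datetime.fromtimestamp(inv["created"], tz=timezone.utc).date().isoformat()
    return f'{inv["id"]}: {amount:.2f} {inv["currency"].upper()} ({inv["status"]}, {when})'


invoice = {
    "id": "in_example",  # placeholder id for illustration
    "amount": 2900,
    "currency": "usd",
    "status": "paid",
    "created": 1747000000,
}
print(describe_invoice(invoice))
```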

## POST /api/billing/webhook

**Stripe → xcity-home only.** Not for client use. Receives:

- `checkout.session.completed`
- `customer.subscription.created`
- `customer.subscription.updated`
- `customer.subscription.deleted`
- `invoice.payment_failed`

Updates `app_metadata.plan` on the GoTrue user. Signature verification is mandatory — see [Operations: Stripe webhooks](/docs/en/operations/stripe-webhooks).
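
For readers unfamiliar with what signature verification involves, here is a sketch of Stripe's documented scheme using only the standard library: the `Stripe-Signature` header carries a timestamp `t` and one or more `v1` signatures, each an HMAC-SHA256 of `"{t}.{raw body}"` keyed with the endpoint secret. The function name and the 300-second tolerance are choices made for this example; in production, Stripe's official library (`stripe.Webhook.construct_event`) is the usual route, and this is not a description of the xcity-home handler itself.

```python
import hashlib
import hmac
import time
from typing import Optional


def verify_stripe_signature(payload: bytes, header: str, secret: str,
                            tolerance: int = 300,
                            now: Optional[float] = None) -> bool:
    """Check a Stripe-Signature header against the raw request body."""
    timestamp = None
    candidates = []
    # Header format: "t=<unix seconds>,v1=<hex digest>[,v1=<hex digest>...]"
    for item in header.split(","):
        key, _, value = item.strip().partition("=")
        if key == "t":
            timestamp = value
        elif key == "v1":
            candidates.append(value)
    if timestamp is None or not timestamp.isdigit() or not candidates:
        return False
    current = time.time() if now is None else now
    if abs(current - int(timestamp)) > tolerance:
        return False  # too old or future-dated: treat as a possible replay
    signed = timestamp.encode() + b"." + payload
    expected = hmac.new(secret.encode(), signed, hashlib.sha256).hexdigest()
    # Constant-time comparison against every v1 signature in the header.
    return any(hmac.compare_digest(expected.encode(), c.encode()) for c in candidates)
```

The check must run against the raw request bytes; re-serializing parsed JSON can change whitespace and break the digest.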


---

# Keys API
URL: https://xcity.ai/docs/en/api-reference/keys
Description: Provision, list, and revoke inference API keys programmatically.


The Keys API at `/api/keys` lets you provision and revoke inference keys without going through the dashboard. Useful for B2B partners minting per-tenant keys.

## GET /api/keys

List the keys owned by the current account.

```json
{
  "keys": [
    { "id": "k_...", "label": "prod", "last_used": "2026-05-14T18:21:00Z", "created": "..." }
  ]
}
```

We never return the secret value after creation — only the prefix and metadata.

## POST /api/keys

```http
POST /api/keys
Content-Type: application/json

{ "label": "prod" }
```

Returns the key **once**:

```json
{ "key": { "id": "k_...", "label": "prod", "value": "sk-..." } }
```

The `value` is shown a single time. Store it; we cannot recover it.

## DELETE /api/keys/:id

Revokes the key. New requests are rejected within ~5 seconds globally.

```json
{ "ok": true }
```

## Quotas

- Free plan: 1 key
- Pro: 5 keys
- Team: 20 keys (shared across seats)
- Enterprise: unlimited


---

# Architecture Overview
URL: https://xcity.ai/docs/en/concepts/architecture
Description: How identity, billing, and inference flow through the Xcity stack.


Xcity's stack has three planes — **identity**, **billing**, and **inference** — that any product on the platform composes against.

```
                ┌────────────────────────────────────────────┐
   Browser  →   │  xcity-home (Astro, *.xcity.one)           │
                │   ├── /api/auth/*     identity BFF         │
                │   ├── /api/billing/*  Stripe BFF           │
                │   └── /api/me/*       plan / key resolver  │
                └──────┬───────────┬────────────────┬────────┘
                       │           │                │
                       ▼           ▼                ▼
                  GoTrue       Stripe          LiteLLM
                (auth.xcity)   (billing)     (tokenhub.xcity)
                                                    │
                                                    ▼
                                              Solar Compute (AR)
```

## Identity plane

Authentication is centralized on `auth.xcity.one` (a self-hosted GoTrue instance with Supabase as a dev fallback). The xcity-home Astro app issues a host-only, SameSite=Lax session cookie. Because every `*.xcity.one` subdomain is same-site with xcity-home, browser sub-products inherit that session by calling xcity-home BFF endpoints with `credentials: 'include'`.

See [Sub-product Integration](/docs/en/guides/sub-product-integration) for the integration recipe.

## Billing plane

Stripe is the source of truth for plans, subscriptions, and invoices. The `xcity-home` server holds the only set of Stripe credentials; sub-products never call Stripe directly. Plan and entitlement state is mirrored onto each user's GoTrue `app_metadata` via Stripe webhooks landing at `/api/billing/webhook`.

See [Billing model](/docs/en/concepts/billing) and the [Stripe webhook reference](/docs/en/api-reference/billing-webhook).

## Inference plane

`tokenhub.xcity.one` runs LiteLLM in front of one or more upstream model providers and our own self-hosted models. Every API key is bound to a Xcity account, a plan whitelist, and a budget cap. Requests are logged for the usage ledger that drives `/dashboard/usage` and overage billing.

## Sub-products

Anything that lives on a `*.xcity.one` subdomain (xct-chat, xct-flow, xct-agent-marketplace, future ones) integrates via three BFF endpoints and never holds its own Stripe or auth credentials:

| Endpoint | Returns |
|---|---|
| `GET /api/auth/me` | Current user |
| `GET /api/me/litellm-key` | Bearer + plan + allowed models |
| `GET /api/billing/plan` | Plan id, entitlements, renewal |

CORS is enforced via a regex allowlist over `*.xcity.one` plus localhost for development; see `src/lib/cors.ts`.

## Where data lives

| Domain | Store | Region |
|---|---|---|
| User identity | GoTrue Postgres | San Juan, AR (primary) |
| Subscriptions | Stripe | Global (US) |
| API usage | LiteLLM Postgres | San Juan, AR |
| Audit logs | Object storage | San Juan + DR |

Enterprise contracts can pin a region or require an on-prem-style deployment — see [Enterprise: Data Residency](/docs/en/enterprise/data-residency).


---

# Plans, keys, and budgets
URL: https://xcity.ai/docs/en/concepts/plans
Description: How accounts map to plans, how keys are scoped, and how usage is metered.


## Plans

Xcity ships three default plans plus a custom Enterprise tier:

| Plan | Monthly | Includes | Cap |
|---|---|---|---|
| Free | $0 | basic model access, dashboard | hard cap, no overage |
| Pro | $29 | full model catalog, priority routing | soft cap + overage |
| Team | $99 | seats, shared budgets, audit log | soft cap + overage |
| Enterprise | custom | SSO, custom SLA, regional pinning, DPA | negotiated |

Stripe is the system of record for plan state. The xcity-home app reads it through `src/lib/billing.ts` and mirrors the active plan onto each user's GoTrue `app_metadata.plan` field via the `/api/billing/webhook` handler.

## Keys

A key (`sk-…`) is bound to:

1. A **Xcity account** — invalidating the account invalidates all its keys.
2. A **plan whitelist** — only models allowed by the account's current plan can be invoked. Routes to disallowed models return `403`.
3. A **budget envelope** — per-request budget enforced at LiteLLM admission, plus cumulative monthly cap.

Keys are minted via the dashboard or — for B2B integrations — through the [Keys API](/docs/en/api-reference/keys). They can be revoked at any time without affecting the user's other keys.

## Budgets

Two budget signals matter:

- **Per-request cost ceiling.** LiteLLM rejects a request whose estimated cost exceeds the per-request ceiling for the plan. This prevents accidental large prompts from draining a month's budget in one call.
- **Cumulative monthly cost.** The usage ledger sums to a per-account total. Free plans hard-block at the cap; paid plans switch to overage at the published rate.

You can inspect both via `GET /api/me/litellm-key` (returns the active envelope) and `GET /dashboard/usage` (visualizes the spend).
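
The two budget signals can be sketched as a single admission decision. This is an illustration only: the `Envelope` fields, the dollar figures, and the exact comparison at the cap boundary are assumptions, not the gateway's actual data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Envelope:
    plan: str                   # e.g. "free", "pro", "team"
    per_request_ceiling: float  # max estimated cost of one call
    monthly_cap: float          # cumulative cap for the billing cycle
    spent_this_month: float     # running total from the usage ledger


def admit(env: Envelope, estimated_cost: float) -> str:
    """Return "allow", "overage", or "reject" for one request."""
    if estimated_cost > env.per_request_ceiling:
        return "reject"  # a single oversized call never goes through
    if env.spent_this_month + estimated_cost > env.monthly_cap:
        # Free plans hard-block at the cap; paid plans continue as overage.
        return "reject" if env.plan == "free" else "overage"
    return "allow"


free = Envelope("free", 0.50, 5.00, 4.90)  # hypothetical numbers
pro = Envelope("pro", 2.00, 29.00, 28.50)  # hypothetical numbers
print(admit(free, 0.20), admit(pro, 1.00))
```

Checking the per-request ceiling first means an oversized call is refused regardless of how much monthly budget remains.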

## Plan changes

Upgrades and downgrades flow through the Stripe Customer Portal. We listen for `customer.subscription.updated` and reflect new entitlements within seconds (typically <2s end-to-end). Downgrades are the exception: they take effect at the end of the current billing cycle to avoid mid-cycle lockouts.

See [Operations: Stripe webhooks](/docs/en/operations/stripe-webhooks) for failure modes.


---

# Authentication flow
URL: https://xcity.ai/docs/en/concepts/auth-flow
Description: How sessions, cookies, and identity are propagated across xcity.one and its sub-products.


## The 30-second model

```
user logs in once on        →   xcity-home sets a host-only
www.xcity.one (GoTrue)           SameSite=Lax session cookie

user visits chat.xcity.one
  app boots →
    fetch('https://www.xcity.one/api/me/litellm-key',
          { credentials: 'include' })
    ↑ browser auto-attaches the cookie because *.xcity.one is same-site
    ↑ xcity-home returns: { key, plan, models, api_base }

  app hits tokenhub.xcity.one/v1/{models,chat/completions} with bearer
  ↑ tokenhub enforces the user's plan whitelist + budget per request
```

Three BFF endpoints do all the work — sub-products never call GoTrue, Stripe, or LiteLLM admin APIs directly.

## Why this shape

- **Single sign-on without OAuth dance.** Users log in once at `www.xcity.one` and their session works across every `*.xcity.one` sub-product, with no per-product redirect.
- **Sub-products hold no secrets.** They never see the user's password, refresh token, or LiteLLM master key. The worst-case compromise of a sub-product is a leaked short-lived inference key.
- **Centralized policy.** Plan whitelists, model gating, and budget enforcement live in one place (xcity-home + LiteLLM) — sub-products can't drift from policy.

## Production vs dev

In production the session cookie is `Secure`, so cookies will not attach to plain `http://` hosts. For local development, allow your dev origin via `XCT_CORS_EXTRA_ORIGINS=http://localhost:3000`.

## Desktop (Electron)

Browser cookie inheritance doesn't apply to Electron — see [Guides: Desktop integration](/docs/en/guides/desktop-integration) for the OAuth-style flow we use for `xct-agent-desktop`.


---

# Using LLMs and agents with Xcity docs
URL: https://xcity.ai/docs/en/concepts/llms-and-ai
Description: Drop-in context for AI assistants — llms.txt, semantic anchors, structured references.


We publish docs in a shape that's easy for LLM-powered tools (Cursor, Continue, Claude, ChatGPT) to ingest.

## llms.txt

[/llms.txt](/llms.txt) and [/llms-full.txt](/llms-full.txt) follow the [llms.txt convention](https://llmstxt.org/):

- `/llms.txt` — a curated index pointing at every important reference page.
- `/llms-full.txt` — the same index plus the full text of every doc concatenated, suitable for one-shot context loading.

Both files regenerate on every build. They're committed to the deploy so AI tools that crawl them get a consistent, fresh view.

## Markdown source

Every doc page can be fetched as raw markdown by appending `?raw` to the URL:

```
GET https://xcity.one/docs/en/api-reference/inference?raw
```

This skips the wrapper layout and returns the markdown content + frontmatter. Useful when wiring Xcity docs into a RAG pipeline.

## Stable anchors

Every `<h2>` and `<h3>` gets a slug derived from the heading text and rendered as `id="..."`. These slugs are stable across edits — when we rename a heading, we add a backwards-compat anchor.
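
If you need to predict an anchor when building links from a RAG pipeline, a typical heading slugger looks roughly like the sketch below. This is a generic approach and an assumption on our part, not the exact algorithm the docs site uses, so verify generated anchors against the rendered page.

```python
import re
import unicodedata


def slugify(heading: str) -> str:
    """Lowercase, strip accents, and join alphanumeric runs with hyphens."""
    # Decompose accented characters, then drop anything outside ASCII.
    text = unicodedata.normalize("NFKD", heading)
    text = text.encode("ascii", "ignore").decode("ascii").lower()
    # Collapse every run of non-alphanumerics into a single hyphen.
    text = re.sub(r"[^a-z0-9]+", "-", text)
    return text.strip("-")


print(slugify("POST /v1/chat/completions"))
print(slugify("Where data lives"))
```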

## Cite-friendly structure

Each page has:

- A single `<h1>` matching the frontmatter `title`
- A 1-2 sentence summary right under it (= frontmatter `description`)
- Headings in semantic order, no skipping levels
- Tables for matrix-style facts (plans, error codes, SLAs)

That structure plays well with extractive summarizers and quote-citation tools.


---

# Service Level Agreement
URL: https://xcity.ai/docs/en/enterprise/sla
Description: Uptime, latency, and support response commitments for paid plans.


This SLA applies to paid plans. Free is best-effort with no commitments.

## Uptime

| Plan | Monthly uptime | Credit |
|---|---|---|
| Pro | 99.5% | 5% of MRR per 0.1% missed below 99.5% |
| Team | 99.9% | 10% of MRR per 0.1% missed below 99.9% |
| Enterprise | 99.95% (custom available) | Negotiated; up to 30% MRR cap |

Uptime is measured at the gateway edge against synthetic probes from three regions. Maintenance windows announced ≥48h in advance do not count against uptime; emergency security patches do not count if they take <15 minutes.
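
As a worked example of the credit formula for Pro and Team, the sketch below counts whole 0.1-point steps below the target. Rounding partial steps down is an assumption, as is the absence of a cap for these two plans (the table only states a cap for Enterprise); `credit_percent` is a hypothetical helper.

```python
from decimal import Decimal

# plan -> (uptime target in percent, credit in % of MRR per 0.1 point missed)
SLA = {
    "pro": (Decimal("99.5"), Decimal("5")),
    "team": (Decimal("99.9"), Decimal("10")),
}


def credit_percent(plan: str, measured_uptime: str) -> Decimal:
    """Credit as a percentage of MRR for one month's measured uptime."""
    target, per_step = SLA[plan]
    shortfall = target - Decimal(measured_uptime)
    if shortfall <= 0:
        return Decimal("0")  # target met, no credit
    # Whole 0.1-point steps only (assumption: partial steps round down).
    steps = shortfall // Decimal("0.1")
    return steps * per_step


print(credit_percent("pro", "99.2"))
print(credit_percent("team", "99.6"))
```

Under these assumptions a Pro account measured at 99.2% is 0.3 points short, which is three steps at 5% each, so 15% of MRR. `Decimal` avoids the float rounding that would otherwise miscount steps.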

## Latency

| Plan | p50 | p95 |
|---|---|---|
| Pro | <800ms | <2000ms |
| Team | <600ms | <1500ms |
| Enterprise | custom | custom |

Measured from gateway receipt to first token for chat/completions on the default model. Requests with large inputs (>4k input tokens) get a proportionally larger latency budget.

## Support response

| Severity | Pro | Team | Enterprise |
|---|---|---|---|
| **S1** — production down | 4h | 1h | 15 min |
| **S2** — degraded but functional | next business day | 4h | 1h |
| **S3** — question / non-blocking | 3 business days | 1 business day | 4h |

S1 always escalates to PagerDuty. Enterprise S1 includes a phone bridge.

## Credit claims

Submit within 60 days of the missed window. Credits apply to the next invoice and do not roll over past 12 months.

Contact `support@xcity.one`. Enterprise customers use their dedicated channel in the customer portal.


---

# Data residency (Enterprise)
URL: https://xcity.ai/docs/en/enterprise/data-residency
Description: Region pinning options and the dedicated-cluster deployment model.


Enterprise contracts unlock data-residency controls beyond the default Argentina-primary footprint:

## Region pinning

Choose a single region to host **all** customer data:

- **AR (San Juan)** — solar-native, lowest cost.
- **EU (Frankfurt)** — GDPR-native.
- **US (Virginia)** — for latency to North American users.

Pinning disables cross-region DR; we recommend coupling with an explicit DR plan (snapshots-only or warm standby in the same region).

## Dedicated cluster

For regulated workloads, we deploy a single-tenant cluster:

- Isolated Postgres for identity + inference logs
- Isolated LiteLLM gateway with the customer's choice of providers
- Customer's own KMS root keys (BYOK)
- Customer-supplied VPN or PrivateLink for ingress

Operated by Xcity, billed at a different rate than shared infra. Talk to us via `enterprise@xcity.one`.

## Customer-hosted

For air-gapped or fully-sovereign environments, the Xcity gateway can be deployed inside the customer's cloud (AWS / Azure / GCP). The customer pays for compute; Xcity provides the software, updates, and support. Identity and billing still flow through xcity.one unless the customer opts for fully-offline mode.

## Reach out

Email `enterprise@xcity.one`. Typical evaluation cycle is 4–8 weeks; we'll walk you through a deployment plan, run a proof-of-concept against your data, and produce a custom DPA.


---

# Procurement & contracts
URL: https://xcity.ai/docs/en/enterprise/procurement
Description: How to buy Xcity at the enterprise tier — paper, security review, SSO, invoicing.


## Standard package

Every Enterprise contract includes:

- A **Master Service Agreement** (MSA) — we can use yours or our template.
- A **Data Processing Addendum** — see [/dpa](/dpa) for the standard text.
- A **custom SLA** — defaults are in [SLA](/docs/en/enterprise/sla), negotiated up from there.
- **SSO (SAML or OIDC)** — IdP-initiated or SP-initiated.
- **SCIM provisioning** — auto-provision and de-provision seats from your IdP.
- **Audit log export** — periodic delivery to your SIEM via S3 or webhook.
- **Dedicated support contact** — named CSM + Slack Connect.

## Buying process

1. **Discovery call** (30 min) — we scope your use case and identify gaps vs the shared plan.
2. **Security review** — we share our SOC 2 progress, DPA, sub-processor list, pen-test summary. Your team's review form is welcome.
3. **POC** (typically 2–4 weeks) — full Pro-tier access, your real workload, success criteria agreed up front.
4. **Paper** — MSA + Order Form + DPA. We aim for sign in <2 weeks from POC end.
5. **Provisioning** — SSO + SCIM + custom domain + dedicated capacity (if applicable) in <1 week.

## Pricing models

- **Per-seat** — $$ per seat / month with a minimum.
- **Usage-based** — committed monthly compute, overage at published rate.
- **Hybrid** — base platform fee + metered usage.
- **Custom deployment** — annual platform fee + compute pass-through.

## Invoicing

Annual prepaid (NET-30 from invoice). Multi-year discounts available. Custom payment terms negotiable above a threshold.

## Renewals

Contracts auto-renew by default with a 60-day cancellation notice. We send a renewal reminder 90 days before the term ends, along with an open invitation to upgrade or right-size.

Contact: `enterprise@xcity.one`.


---

# Enterprise onboarding playbook
URL: https://xcity.ai/docs/en/enterprise/onboarding-playbook
Description: What happens between signing the order form and your first production traffic.


A 30-day playbook the Xcity CSM follows on every Enterprise signup. We share it so you know what to expect — and so your procurement team can map our checkpoints to yours.

## Week 0 — kickoff

- **Day 0**: Signed order form + DPA filed.
- **Day 1**: CSM intro email; calendar holds for weekly sync + one-time kickoff (60 min).
- **Day 2**: Slack Connect channel created (`#xcity-<customer>`); shared Notion page with this playbook + your tracker.

## Week 1 — provisioning

- SSO connection (SAML or OIDC) configured on both sides.
- SCIM token issued; we test provisioning end-to-end with a dummy user.
- Initial seats created from your IdP. Pilot users invited.
- Dedicated capacity allocated (if applicable). Healthcheck dashboard linked.

## Week 2 — proof of value

- Pilot users run real workloads. Daily usage report from our side.
- We surface latency / cost / error breakdown vs. your previous provider.
- Tuning calls as needed — model routing, prompt patterns, budgets.

## Week 3 — production cutover plan

- Define the cutover window. Rollback steps written down on both sides.
- Synthetic probes pointed at your endpoints. SLA baseline measured.
- Incident escalation contacts on file in PagerDuty.

## Week 4 — go-live

- Production traffic ramped per the cutover plan (typically 10% → 50% → 100%).
- 24h hyper-care window. CSM available on Slack Connect.
- Post-go-live retro at the end of week 4.

## Ongoing

- Weekly sync drops to monthly after 90 days.
- Quarterly business review with usage trend, SLA scorecard, roadmap preview.
- Annual contract renewal touchpoints 90 and 30 days before the term ends.


---

# Support tier matrix
URL: https://xcity.ai/docs/en/enterprise/support-matrix
Description: Channels, response targets, and what each support tier includes.


## Tier comparison

| Capability | Free | Pro | Team | Enterprise |
|---|---|---|---|---|
| Docs + community | ✓ | ✓ | ✓ | ✓ |
| Email support (24h) | — | ✓ | ✓ | ✓ |
| In-product chat | — | ✓ | ✓ | ✓ |
| Dedicated Slack Connect | — | — | — | ✓ |
| Phone bridge for S1 | — | — | — | ✓ |
| Named CSM | — | — | — | ✓ |
| 24×7 on-call (PagerDuty) | — | — | — | ✓ |
| Architectural reviews | — | — | annual | quarterly |
| Roadmap input | — | — | community | direct |

## Severity definitions

- **S1** — production down or unsafe data exposure. Page immediately.
- **S2** — degraded but workable. File via support channel.
- **S3** — bug, feature request, question. File via support channel.

## Response targets

See [SLA](/docs/en/enterprise/sla). Enterprise customers get 15-min S1 acknowledgement and 1h status update cadence until resolution.

## How to reach us

| Channel | Best for |
|---|---|
| `support@xcity.one` | All tiers; default for S2/S3 |
| In-product chat | Quick questions, all paid tiers |
| Slack Connect | Enterprise, anything non-S1 |
| PagerDuty number | Enterprise S1 only |
| `security@xcity.one` | Vulnerabilities, all tiers |

Every Enterprise customer gets a one-page contact card in their kickoff packet — laminate-friendly version available on request.


---

# Status & incident communication
URL: https://xcity.ai/docs/en/enterprise/status-page
Description: Where to watch operational state and how we communicate during incidents.


## Status page

Live at [status.xcity.one](https://status.xcity.one). Components:

- **Gateway (tokenhub)** — inference API availability + p95 latency
- **Identity (auth.xcity.one)** — login, session, registration
- **Billing** — Stripe webhook ingestion and plan-state sync
- **Console** — `xcity.one/dashboard`
- **Docs** — `xcity.one/docs`

Each component publishes a 90-day uptime history. JSON feed at `/status.json`; RSS at `/status.rss`.

## Incident severity

| Level | Definition | Public update cadence |
|---|---|---|
| **Investigating** | Anomaly detected, scope unconfirmed | within 15 min of detection |
| **Identified** | Root cause known, fix in progress | every 30 min |
| **Monitoring** | Fix deployed, watching | every hour |
| **Resolved** | Service restored, postmortem pending | postmortem within 5 business days |

Enterprise customers get the same updates pushed to their dedicated Slack Connect channel within 60 seconds of the public post.

## Postmortems

We publish a postmortem within 5 business days of any S1 incident: a redacted version goes on the status page, and the full version (including timelines, alerts, and corrective actions) goes to affected Enterprise customers.

Our template includes: detection → diagnosis → mitigation → root cause → corrective action → action items with owners. We follow blameless review practice — naming systems and processes, not individuals.

## Scheduled maintenance

Announced ≥48h in advance via:

- Status page (banner)
- Email to billing-admin contacts
- Slack Connect for Enterprise

Emergency security patches with <15 min duration may go out without advance notice. We document them retroactively on the status page.


---

# Introduction
URL: https://xcity.ai/docs/en/get-started/introduction
Description: What Xcity AI OS is, who it's for, and how the pieces fit together.


Xcity AI OS is the operating system for a solar-powered AI civilization. It bundles four product layers — Builder, Exchange, Agent, and Runtime — on top of a 100 GW solar compute base in Argentina, and exposes them through a single account, a single billing plane, and a single OpenAI-compatible inference gateway.

This documentation covers everything a developer, integrator, or enterprise buyer needs to ship on top of Xcity: account setup, the inference API, billing model, sub-product integration, deployment, and operational guarantees.

## Who this is for

- **Developers** building applications against the inference gateway (`tokenhub.xcity.one`).
- **Integrators** wiring a browser sub-product into the unified identity + billing flow.
- **Enterprises** evaluating Xcity for compliance, SLA, residency, and procurement.
- **Operators** running the Xcity stack — staging deploys, rotating secrets, debugging webhooks.

## What you get

| Layer | What it does | URL |
|---|---|---|
| L1 Builder | Hosted app and agent platform | `xcity.one` |
| L2 Exchange | Model + agent marketplace | `market.xcity.one` |
| L3 Agent / Runtime | OpenAI-compatible inference gateway | `tokenhub.xcity.one` |
| L4 Infra | Solar compute zone, network, identity | (operated by Xcity) |

All four layers share one identity (Xcity Account), one billing relationship (Stripe-backed), and one usage ledger.

## How the docs are organized

- **Get Started** — install, authenticate, make your first request.
- **Concepts** — mental model for accounts, plans, keys, agents, and the gateway.
- **Products** — per-vertical guides (AI Platform, Energy, Estates, Immigration).
- **API Reference** — every public endpoint with request/response examples.
- **Guides** — task-shaped recipes (integrate auth, ship a sub-product, handle webhooks).
- **Operations** — runbooks, deploy, SLAs, incident response.
- **Security & Compliance** — DPAs, regions, encryption, audit posture.
- **Enterprise** — procurement, custom SLAs, dedicated capacity.

If you're new, start with [Quick Start](/docs/en/get-started/quickstart). If you're integrating an existing app, jump to [Sub-product Integration](/docs/en/guides/sub-product-integration).


---

# Quick Start
URL: https://xcity.ai/docs/en/get-started/quickstart
Description: Create an account, get an API key, and make your first inference call in under five minutes.


This page walks you from zero to your first response from the Xcity inference gateway.

## 1. Create an account

1. Open [xcity.one/register](https://xcity.one/register).
2. Sign up with email + password. We send a confirmation link via Resend; the account is activated when you click it.
3. You'll land on the dashboard at `xcity.one/dashboard`. New accounts start on the **Free** plan.

## 2. Get an API key

The gateway uses bearer tokens scoped to your account. Two ways to retrieve one:

**Dashboard** — go to `/dashboard/keys`. Click **Create key**, copy the value (we only show it once), and give it a descriptive label.

**Programmatic** — keys can also be minted via the LiteLLM admin API once your account has an entitlement. See [Provisioning keys](/docs/en/api-reference/keys).

```bash
export XCITY_API_KEY="sk-..."
```

## 3. Make your first request

The gateway is OpenAI-compatible. Any OpenAI client library works — just swap the base URL.

```bash
curl https://tokenhub.xcity.one/v1/chat/completions \
  -H "Authorization: Bearer $XCITY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-6",
    "messages": [{"role": "user", "content": "Say hello in one sentence."}]
  }'
```

Python:

```python
import os

from openai import OpenAI

client = OpenAI(
    api_key=os.environ["XCITY_API_KEY"],
    base_url="https://tokenhub.xcity.one/v1",
)

resp = client.chat.completions.create(
    model="claude-sonnet-4-6",
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
)
print(resp.choices[0].message.content)
```

## 4. Check usage

Every request is logged. Visit `/dashboard/usage` for a live readout of:

- Tokens consumed today / this billing cycle
- Per-model breakdown
- Remaining budget against your plan cap

Free-tier accounts get a hard ceiling — paid plans get a soft cap with overage billing. See [Plans & Pricing](/docs/en/concepts/plans) for the full matrix.

## What's next

- [Concepts: Plans, keys, and budgets](/docs/en/concepts/plans)
- [Guide: Integrate auth into a sub-product](/docs/en/guides/sub-product-integration)
- [API Reference: chat/completions](/docs/en/api-reference/inference)


---

# Installation
URL: https://xcity.ai/docs/en/get-started/installation
Description: SDK install steps for Node.js, Python, and Go clients, plus environment setup.


Xcity's gateway is OpenAI-compatible, so you install the same SDK you'd use for OpenAI and point it at our base URL.

## Node.js

```bash
npm install openai
```

```ts
import OpenAI from 'openai';

export const xcity = new OpenAI({
  apiKey: process.env.XCITY_API_KEY!,
  baseURL: 'https://tokenhub.xcity.one/v1',
});
```

## Python

```bash
pip install openai
```

```python
import os

from openai import OpenAI

xcity = OpenAI(
    api_key=os.environ["XCITY_API_KEY"],
    base_url="https://tokenhub.xcity.one/v1",
)
```

## Go

```bash
go get github.com/sashabaranov/go-openai
```

```go
// Requires imports: "os" and openai "github.com/sashabaranov/go-openai".
cfg := openai.DefaultConfig(os.Getenv("XCITY_API_KEY"))
cfg.BaseURL = "https://tokenhub.xcity.one/v1"
client := openai.NewClientWithConfig(cfg)
```

## Environment variables

| Variable | Purpose |
|---|---|
| `XCITY_API_KEY` | Bearer token issued from `/dashboard/keys` |
| `XCITY_BASE_URL` | Optional override; defaults to `https://tokenhub.xcity.one/v1` |
| `XCITY_DEFAULT_MODEL` | Optional default model id used by your app |

Never commit `XCITY_API_KEY` to git. Store it in your platform's secret manager — see [Operations: Secrets](/docs/en/operations/secrets).
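
The variables in the table can be read by a small settings loader. The sketch below is one possible way for an application to do that, not part of any SDK: the OpenAI client does not read `XCITY_BASE_URL` or `XCITY_DEFAULT_MODEL` on its own, and `Settings` and `load_settings` are hypothetical names introduced for illustration.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

DEFAULT_BASE_URL = "https://tokenhub.xcity.one/v1"


@dataclass(frozen=True)
class Settings:
    api_key: str
    base_url: str
    default_model: Optional[str]


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the Xcity variables, failing fast if the key is missing."""
    api_key = env.get("XCITY_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("XCITY_API_KEY is not set")
    return Settings(
        api_key=api_key,
        # Optional override; fall back to the documented gateway URL.
        base_url=env.get("XCITY_BASE_URL") or DEFAULT_BASE_URL,
        # Optional; None means the app must pick a model per request.
        default_model=env.get("XCITY_DEFAULT_MODEL") or None,
    )
```

Pass `settings.api_key` and `settings.base_url` to the client constructor shown above; failing at startup is usually easier to debug than a `401` on the first request.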


---

# Integrate a browser sub-product
URL: https://xcity.ai/docs/en/guides/sub-product-integration
Description: Wire any *.xcity.one app into the unified identity, plan, and inference flow.


How to wire a new browser sub-product (anything on `*.xcity.one`) into the xcity-home identity + tokenhub flow.

> **Scope**: this guide covers **browser** sub-products only — apps that run inside a normal Chromium / Firefox / Safari window on a `*.xcity.one` subdomain. For the Electron desktop app see [Desktop integration](/docs/en/guides/desktop-integration).

## Prerequisites

1. **Subdomain on `xcity.one`** — your app must be served from `https://<your-app>.xcity.one`. Anything else needs an explicit allowlist entry in xcity-home env (`XCT_CORS_EXTRA_ORIGINS=https://your-other-host.com`) and a security review.
2. **HTTPS in production** — the session cookie has `Secure`, so cookies will not attach to plain http hosts in production.
3. **User has a Xcity account** — first-time visitors get redirected to `xcity-home/login`; we don't ship a per-product signup form.

## Step 1 — fetch the identity envelope

```ts
async function getXcityIdentity() {
  const res = await fetch('https://www.xcity.one/api/me/litellm-key', {
    credentials: 'include',
  });
  if (res.status === 401) {
    window.location.href = 'https://www.xcity.one/login?return=' +
      encodeURIComponent(window.location.href);
    return null;
  }
  return res.json() as Promise<{
    key: string;
    plan: string;
    models: string[];
    api_base: string;
  }>;
}
```

That single call gives you the bearer, the user's plan, the models they can hit, and the gateway base URL.

## Step 2 — make inference calls

```ts
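async function sayHello(): Promise<string | null> {
  const id = await getXcityIdentity();
  if (!id) return null; // the browser is redirecting to login

  const res = await fetch(`${id.api_base}/chat/completions`, {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${id.key}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      model: id.models[0],
      messages: [{ role: 'user', content: 'Hello' }],
    }),
  });
  if (!res.ok) throw new Error(`chat/completions returned ${res.status}`);
  const data = await res.json();
  return data.choices[0].message.content;
}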
```

## Step 3 — react to plan changes

The `plan` field returned by `/api/me/litellm-key` is only as fresh as the last page load. For long-lived sessions, re-fetch the envelope on window focus or every 5 minutes. After a plan downgrade, `/chat/completions` starts returning `403` for models the user could previously access. Handle that by re-fetching `models` and updating your UI rather than surfacing a raw error.
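
The refresh rule above comes down to a small cache with a time limit and an invalidation hook. The sketch below shows that logic in Python for brevity; it ports directly to TypeScript. `IdentityCache`, the injected `fetch` callable (standing in for the `GET /api/me/litellm-key` call), and treating `401` like `403` are assumptions made for illustration.

```python
import time
from typing import Callable, Optional

REFRESH_SECONDS = 5 * 60  # "every 5 minutes" from the guidance above


class IdentityCache:
    """Caches the identity envelope and re-fetches when stale or rejected."""

    def __init__(
        self,
        fetch: Callable[[], dict],
        clock: Callable[[], float] = time.monotonic,
    ):
        self._fetch = fetch  # hypothetical: wraps GET /api/me/litellm-key
        self._clock = clock
        self._envelope: Optional[dict] = None
        self._fetched_at = 0.0

    def get(self) -> dict:
        stale = self._clock() - self._fetched_at >= REFRESH_SECONDS
        if self._envelope is None or stale:
            self._envelope = self._fetch()
            self._fetched_at = self._clock()
        return self._envelope

    def on_response_status(self, status: int) -> None:
        # 401: bearer was rotated; 403: model left the plan whitelist.
        if status in (401, 403):
            self._envelope = None  # force a re-fetch on the next get()
```

Call `on_response_status` with the status of every gateway response; the next `get()` then returns a fresh `models` list for the UI.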

## Common mistakes

- **Forgetting `credentials: 'include'`.** Without it the browser doesn't send the session cookie and you'll always get `401`.
- **Hardcoding `api_base`.** Always read it from the identity envelope. We will move the gateway hostname in the future.
- **Caching the bearer.** It's short-lived (rotated when the user revokes it or the plan changes). Re-fetch on `401`.
- **Calling Stripe / GoTrue directly.** Don't. Sub-products own no shared secrets.

## Reference implementations

- [xct-chat](https://github.com/xcity/xct-chat) — the chat sub-product, MIT licensed.
- [xct-flow](https://github.com/xcity/xct-flow) — agent workflow builder.

Both use the same `useXcityIdentity()` hook — copy it.


---

# Deployment
URL: https://xcity.ai/docs/en/guides/deployment
Description: Deploy xcity-home to Cloudflare Pages with the Node adapter + Wrangler.


xcity-home ships as an Astro hybrid app behind the Node adapter, with Cloudflare Pages as the production runtime.

## Local build

```bash
npm install
npm run build       # astro build → ./dist
npm run start       # node ./dist/server/entry.mjs — preview the production bundle
```

## Cloudflare Pages

```bash
npm run pages:dev      # wrangler pages dev ./dist
npm run pages:deploy   # build + wrangler pages deploy ./dist
```

`wrangler.toml` pins the project name; Cloudflare environment variables hold all secrets — `STRIPE_SECRET_KEY`, `GOTRUE_ADMIN_TOKEN`, `LITELLM_MASTER_KEY`, etc. See [Operations: Secrets](/docs/en/operations/secrets) for the full list.

## Required environment variables

| Var | Purpose |
|---|---|
| `PUBLIC_SUPABASE_URL` | Local dev fallback identity provider |
| `PUBLIC_SUPABASE_ANON_KEY` | Anon key for dev |
| `GOTRUE_URL` | Self-hosted GoTrue base (`https://auth.xcity.one`) |
| `GOTRUE_ADMIN_TOKEN` | Long-lived service JWT for webhook user updates |
| `STRIPE_SECRET_KEY` | Stripe secret |
| `STRIPE_WEBHOOK_SECRET` | Stripe webhook signing secret |
| `STRIPE_PRICE_PRO_MONTHLY` | Stripe price id for Pro |
| `STRIPE_PRICE_TEAM_MONTHLY` | Stripe price id for Team |
| `LITELLM_BASE_URL` | TokenHub gateway base URL (`https://tokenhub.xcity.one`) |
| `LITELLM_MASTER_KEY` | TokenHub master key for key provisioning |
| `XCT_CORS_EXTRA_ORIGINS` | Comma-separated dev/external origins |

## Smoke test the deploy

After every deploy:

1. Hit `/` — homepage 200.
2. Hit `/api/auth/me` — should be `401` when not signed in.
3. Hit `/dashboard` while signed in — sidebar shell renders.
4. Trigger a Stripe test event (`stripe trigger checkout.session.completed`) — webhook returns `200`.

Or just run the smoke suite:

```bash
npm run test:smoke
```

See [Operations: Test framework](/docs/en/operations/testing).


---

# Desktop integration (Electron)
URL: https://xcity.ai/docs/en/guides/desktop-integration
Description: How xct-agent-desktop and other native apps authenticate users without browser cookies.


Browser cookie inheritance does not work in Electron, mobile, or any native app. Those clients use an OAuth-style device flow against `xcity-home`.

## Flow

```
desktop app                      xcity-home                    user browser
   │                                  │                              │
   │  POST /api/auth/device/start ───▶│                              │
   │  { client: "xct-agent-desktop"}  │                              │
   │                                  │                              │
   │ ◀── { device_code, user_code,    │                              │
   │       verify_url, interval }     │                              │
   │                                  │                              │
   │ open verify_url in browser ─────────────────────────────────────▶│
   │                                  │ ◀── user logs in, approves ──│
   │                                  │                              │
   │  POST /api/auth/device/poll ────▶│                              │
   │  { device_code }                 │                              │
   │                                  │                              │
   │ ◀── { access_token, refresh,     │                              │
   │       expires_in }               │                              │
```

Poll every `interval` seconds (default 5s). Stop on `400 expired_token` or `200`.
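
The polling rule can be sketched as a loop with an injected HTTP call. This is a minimal illustration, not the desktop client's implementation: the `poll` callable, the `error` field in the response body, the treatment of every other status as "still pending", and the `max_attempts` ceiling are all assumptions.

```python
import time
from typing import Callable, Optional, Tuple

# Hypothetical transport: POSTs { device_code } to /api/auth/device/poll
# and returns (HTTP status, parsed JSON body).
PollFn = Callable[[str], Tuple[int, dict]]


def poll_for_token(
    poll: PollFn,
    device_code: str,
    interval: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    max_attempts: int = 120,
) -> Optional[dict]:
    """Poll until the user approves (200) or the code expires (400 expired_token)."""
    for _ in range(max_attempts):
        status, body = poll(device_code)
        if status == 200:
            return body  # token payload from the flow diagram above
        if status == 400 and body.get("error") == "expired_token":
            return None  # caller restarts from /api/auth/device/start
        sleep(interval)  # assumed pending: wait before the next poll
    return None
```

Use the `interval` value returned by `/api/auth/device/start` rather than the default whenever the server supplies one.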

## Token storage

Store the access token in the OS keychain:

- **macOS** — `keytar` writing to Keychain.
- **Windows** — Credential Manager.
- **Linux** — Secret Service (gnome-keyring/kwallet).

Never persist to plain files. Refresh on `401`.

## Using the token

Same as browser — but pass it via `Authorization: Bearer` header instead of relying on the cookie:

```ts
const res = await fetch('https://www.xcity.one/api/me/litellm-key', {
  headers: { Authorization: `Bearer ${accessToken}` },
});
```

The `/api/me/litellm-key` envelope is identical to the browser response, and from there the inference flow is the same as in the browser guide.

## Updates

xct-agent-desktop self-updates from `/api/agent-desktop/releases`. The endpoint returns a manifest with current/min versions and a signed download URL.


---

# Handle Xcity webhooks
URL: https://xcity.ai/docs/en/guides/handle-webhooks
Description: Subscribe to plan, key, and usage events from xcity-home.


Xcity emits webhooks for Enterprise customers and B2B partners that need to react to plan, key, or usage changes.

## Subscribe

Configure the webhook URL via `enterprise@xcity.one` or — when self-serve is available — `/dashboard/webhooks`.

## Events

| Event | When |
|---|---|
| `user.plan_changed` | A user's plan changed (upgrade, downgrade, or expiry) |
| `user.key_created` | A new inference key was minted |
| `user.key_revoked` | A key was revoked |
| `usage.threshold_crossed` | An account crossed a configured % of its monthly cap |

Payload shape:

```json
{
  "id": "evt_...",
  "type": "user.plan_changed",
  "created": 1747353600,
  "data": {
    "user_id": "uuid",
    "previous_plan": "pro",
    "current_plan": "team"
  }
}
```

## Signatures

Every webhook is signed with `X-Xcity-Signature: t=<unix>,v1=<hmac>`. Verify with the secret returned at subscription time:

```ts
import crypto from 'node:crypto';

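function verify(payload: string, header: string, secret: string): boolean {
  // Header format: t=<unix>,v1=<hmac>
  const [t, v1] = header.split(',').map((p) => p.split('=')[1]);
  if (!t || !v1) return false;
  const expected = crypto
    .createHmac('sha256', secret)
    .update(`${t}.${payload}`)
    .digest('hex');
  const received = Buffer.from(v1);
  const wanted = Buffer.from(expected);
  // timingSafeEqual throws on a length mismatch, so compare lengths first.
  return received.length === wanted.length && crypto.timingSafeEqual(received, wanted);
}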
```

## Retries

We retry on non-2xx for up to 24h with exponential backoff. Always return `2xx` *quickly* — defer slow work to your own queue.

Idempotency: every event has a unique `id`. Use it as the idempotency key in your handler.
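
A receiver that follows both rules (answer quickly, apply each event once) might look like the sketch below. It assumes signature verification has already passed. `WebhookReceiver`, the `enqueue` callable, and the in-memory set are illustrative only; a real deployment would keep seen ids in a durable store.

```python
import json
from typing import Callable, Set


class WebhookReceiver:
    """Acknowledges every delivery and queues each event id only once."""

    def __init__(self, enqueue: Callable[[dict], None]):
        self._enqueue = enqueue       # hypothetical: push to your own job queue
        self._seen: Set[str] = set()  # in production: a durable store with a TTL

    def handle(self, raw_body: str) -> int:
        event = json.loads(raw_body)
        event_id = event["id"]
        if event_id in self._seen:
            return 200  # duplicate from a retry: acknowledge, do nothing
        self._enqueue(event)  # defer slow work so the response stays fast
        self._seen.add(event_id)
        return 200
```

Recording the id only after the enqueue succeeds means a crash between the two steps leads to a retry rather than a lost event.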


---

# Secrets & environment
URL: https://xcity.ai/docs/en/operations/secrets
Description: Every secret xcity-home needs, where it lives, and how to rotate it.


## Source of truth

| Secret | Where stored | Owner |
|---|---|---|
| `STRIPE_SECRET_KEY` | Cloudflare Pages env | Billing |
| `STRIPE_WEBHOOK_SECRET` | Cloudflare Pages env | Billing |
| `GOTRUE_JWT_SECRET` | Railway (auth.xcity.one) | Identity |
| `GOTRUE_ADMIN_TOKEN` | Cloudflare Pages env | Identity |
| `LITELLM_MASTER_KEY` | Cloudflare Pages env | Inference |
| `RESEND_API_KEY` | Railway (auth.xcity.one) | Identity |

Local development reads from `.env`; never commit it — `.gitignore` already includes it.

## Rotation

**Stripe key** — rotate from Stripe Dashboard → API keys → Roll. Update the value in Cloudflare Pages, redeploy. Stripe lets the old key linger for 12h.

**Stripe webhook secret** — re-create the webhook endpoint in Stripe Dashboard with the new secret. Update env, redeploy. Old endpoint can be deleted after one successful event under the new key.

**GoTrue admin token** — re-run the token-minting snippet from [Concepts: Authentication flow](/docs/en/concepts/auth-flow). Update env, redeploy. Old tokens are valid until their `exp` claim; rotate the JWT secret if you suspect compromise.

**LiteLLM master key** — generate a new one from the TokenHub admin UI, swap env, redeploy. Per-user inference keys are unaffected — only the admin/provisioning surface is.

## Audit

Every rotation should be logged with date, actor, and reason in `docs/security/rotation-log.md` (private repo). Enterprise tenants can request rotation logs as part of their compliance package — see [Security & Compliance](/docs/en/security/overview).


---

# Stripe webhooks
URL: https://xcity.ai/docs/en/operations/stripe-webhooks
Description: How billing events flow into xcity-home, what can fail, and how to replay.


`/api/billing/webhook` is the only endpoint Stripe ever calls. It must be:

- **Idempotent.** Stripe retries any delivery that does not get a `2xx` response, so the same event can arrive more than once. Don't double-apply effects.
- **Signature-checked.** Every request is verified against `STRIPE_WEBHOOK_SECRET`. Unsigned events return `400`.
- **Fast.** Stripe times out after ~30s. Defer slow work to a background queue.

## Events we listen for

| Event | Effect |
|---|---|
| `checkout.session.completed` | Mark user as paid, set initial plan |
| `customer.subscription.created` | Set plan on `app_metadata.plan` |
| `customer.subscription.updated` | Re-sync plan + entitlements |
| `customer.subscription.deleted` | Downgrade to Free |
| `invoice.payment_failed` | Tag user `payment_failed=true`, send dunning email |

## Failure modes

**Signature mismatch** — `400`. Caller logs the discrepancy. Most often: wrong env (test secret in prod) or a stale webhook endpoint.

**GoTrue down** — webhook returns `500`. Stripe retries with exponential backoff for ~3 days. If GoTrue stays down past then, manually replay from the Stripe dashboard.

**Plan id unknown** — webhook returns `200` but logs a warning. We don't fail the event because Stripe state is the truth; instead, the plan resolver in `src/lib/billing.ts` falls back to Free until the price id is added to the plan map.
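
The "plan id unknown" fallback can be sketched as a lookup with a logged default. This is not the code in `src/lib/billing.ts`; it is a Python illustration of the behavior described above, and the price ids in the map are made-up placeholders.

```python
import logging
from typing import Mapping

logger = logging.getLogger("billing")

# Hypothetical price ids; real values come from STRIPE_PRICE_PRO_MONTHLY
# and STRIPE_PRICE_TEAM_MONTHLY.
PLAN_BY_PRICE_ID: Mapping[str, str] = {
    "price_pro_monthly_example": "pro",
    "price_team_monthly_example": "team",
}


def resolve_plan(price_id: str, plan_map: Mapping[str, str] = PLAN_BY_PRICE_ID) -> str:
    """Map a Stripe price id to a plan, falling back to Free when unknown."""
    plan = plan_map.get(price_id)
    if plan is None:
        # Do not fail the webhook: Stripe state is the truth, so log and degrade.
        logger.warning("unknown Stripe price id %r; falling back to free", price_id)
        return "free"
    return plan
```

Because the handler still returns `200`, Stripe does not retry; the warning in the logs is the only signal that the plan map needs a new entry.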

## Replay

From Stripe Dashboard → Developers → Webhooks → click the failed event → **Resend**.

To replay from a terminal during an incident, use the Stripe CLI. `stripe events resend` takes an event id, so resend failed events one at a time:

```bash
stripe events resend evt_...
```

## Local development

```bash
stripe listen --forward-to http://localhost:4321/api/billing/webhook
```

Copy the printed `whsec_...` signing secret into `.env` as `STRIPE_WEBHOOK_SECRET`. Then:

```bash
stripe trigger checkout.session.completed
```


---

# Test framework
URL: https://xcity.ai/docs/en/operations/testing
Description: How the Playwright test suite is structured, run, and aggregated.


xcity-home ships with a multi-tier Playwright suite. Every PR runs **smoke** + **e2e** + **api**; the nightly job runs everything including a11y and visual regression, then writes the [test dashboard](/docs/en/operations/test-dashboard).

## Suite layout

```
tests/
├── smoke/        every route loads with 200 + a key DOM marker
├── e2e/          flows: auth, blog filter, share buttons, dashboard, pricing
├── api/          contract tests for /api/* endpoints (status + shape)
├── a11y/         axe scan on the top 10 pages
├── visual/       per-section snapshots (nightly only)
└── fixtures/     reusable mocks (Stripe, Supabase, LiteLLM)
```

## Local commands

```bash
npm run test            # all suites except visual
npm run test:smoke      # < 30 seconds — route load + 200
npm run test:e2e        # full user flows
npm run test:api        # API contract tests
npm run test:a11y       # axe-core scan
npm run test:visual     # snapshot suite (use --update-snapshots after intentional UI changes)
npm run test:dashboard  # aggregate recent runs → docs/test-dashboard
```

## Picking the right suite

| Change | Run |
|---|---|
| Markdown / blog content | `test:smoke` |
| Component refactor | `test:smoke` + `test:e2e` |
| New API endpoint | `test:api` + add a smoke for it |
| Visual change | `test:visual --update-snapshots` after review |
| Pre-merge full sweep | `npm test` |

## What CI runs

- **Pull request**: `test:smoke` + `test:e2e` + `test:api` — blocks merge on failure.
- **Nightly main**: all suites + `test:visual` + dashboard publish.

See `.github/workflows/test.yml` for the full pipeline.


---

# Test dashboard
URL: https://xcity.ai/docs/en/operations/test-dashboard
Description: Where to read pass-rate, flake, and coverage trends from the Playwright suite.


The dashboard at [/internal/test-dashboard](/internal/test-dashboard) aggregates the last 30 runs of the Playwright suite. It's regenerated:

- After every CI run on `main`
- After every successful nightly job
- On demand: `npm run test:dashboard`

## What it shows

| Panel | Source |
|---|---|
| **Pass rate trend** | `tests.passed / tests.total` across the last 30 runs |
| **Top 10 flakiest tests** | Tests whose verdict has changed in the last 30 runs without code changes |
| **Slowest 10 tests** | Median wall-clock from the JSON reporter |
| **Route coverage matrix** | Every route under `src/pages/` × which suite touches it |
| **Last failure log** | Stack + screenshot of the most recent failure per spec |
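
The first two panels can be computed from run history alone. The sketch below is not `scripts/aggregate-test-results.mjs`; it is a Python illustration that assumes a hypothetical history format (one dict per run with a `commit` id and a `results` map) and interprets "without code changes" as "between consecutive runs of the same commit".

```python
from collections import defaultdict
from typing import Dict, List


def pass_rate(run: dict) -> float:
    """tests.passed / tests.total for a single run."""
    results = run["results"]
    passed = sum(1 for verdict in results.values() if verdict == "passed")
    return passed / len(results) if results else 0.0


def flakiest(runs: List[dict], top: int = 10) -> List[str]:
    """Rank tests by verdict flips between consecutive runs of the same commit."""
    flips: Dict[str, int] = defaultdict(int)
    window = runs[-30:]  # the dashboard looks at the last 30 runs
    for prev, curr in zip(window, window[1:]):
        if prev["commit"] != curr["commit"]:
            continue  # code changed, so a new verdict is not a flake signal
        for name, verdict in curr["results"].items():
            if name in prev["results"] and prev["results"][name] != verdict:
                flips[name] += 1
    ranked = sorted(flips, key=lambda name: (-flips[name], name))
    return ranked[:top]
```

Sorting by flip count and then by name keeps the list stable between regenerations when two tests tie.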

## Where data is stored

- Per-run JSON: `playwright-report/results.json`
- Rolling history: `.test-history/{YYYY-MM-DD}-{run-id}.json` (gitignored)
- Aggregated output: `docs/test-dashboard/index.html` (committed)

The aggregator script is `scripts/aggregate-test-results.mjs` — it reads recent history, computes the metrics, and writes a static dashboard. No external service required.

## Cleaning up flakes

When a test lands on the flake list:

1. Reproduce locally with `--repeat-each=10`.
2. If genuinely flaky, mark `test.fixme` with a `// flake: <issue-link>` note. **Do not** use `.skip` — fixme keeps the test discoverable.
3. File a Linear ticket against the owning team. Flakes auto-decay off the list after 7 days without recurrence.

We treat sustained pass rate <99% as a release blocker.


---

# Runbooks
URL: https://xcity.ai/docs/en/operations/runbooks
Description: Step-by-step procedures for common operational tasks and incidents.


## Deploy a hotfix

1. Branch from `main`, name it `hotfix/<slug>`.
2. Make the change. Run `npm run test:smoke` locally.
3. PR → request 1 reviewer → smoke + e2e gate must pass.
4. Merge. `pages:deploy` runs from CI on `main`.
5. Verify via `/internal/test-dashboard` — the next smoke run should be green within 5 minutes.

## Roll back

```bash
npm run pages:deploy -- --branch <previous-deploy-id>
```

Or from Cloudflare Pages UI → Deployments → Rollback. The rollback takes effect in ~30 seconds.

Database schema changes do **not** roll back automatically. If the bad deploy migrated, restore from the backup at `gs://xcity-backups/<date>` first.

## Stripe webhook is failing

1. Stripe Dashboard → Developers → Webhooks → check error rate.
2. If signature mismatch, verify `STRIPE_WEBHOOK_SECRET` matches the endpoint's signing secret in Stripe.
3. If `500`, check Cloudflare Pages logs for `/api/billing/webhook`. Most common cause: GoTrue admin token expired — see [Secrets](/docs/en/operations/secrets) for rotation.
4. Replay failed events: Stripe Dashboard → Webhooks → Failed → Resend.

## A user can't log in

1. `/dashboard/keys` — confirm user exists in GoTrue admin.
2. Check `email_confirmed_at` — unconfirmed users can't sign in.
3. Resend confirmation: GoTrue admin → user → Resend.
4. If email never arrived, check Resend dashboard — most often a domain reputation issue, not GoTrue.

## Gateway latency spike

1. Check TokenHub status page.
2. If a single model is slow, LiteLLM should auto-failover; verify routing in admin.
3. If the whole gateway is slow, escalate to infra — Solar Compute capacity or network path.

## Where alerts go

Pages: PagerDuty rotation. Slack: `#ops-alerts`. Email: `ops@xcity.one`.


---

# AI Platform
URL: https://xcity.ai/docs/en/products/ai-platform
Description: The four-hub Xcity AI platform — Builder, Exchange, Agent, Runtime.


The Xcity AI Platform is composed of four hubs that share identity, billing, and observability:

| Hub | What it does | Primary URL |
|---|---|---|
| **L1 Builder** | Hosted app + agent platform — bring your code, we run it | `xcity.one/dashboard/agent-desktop` |
| **L2 Exchange** | Marketplace for models and agents with revenue share | `market.xcity.one` |
| **L3 Agent** | Long-running agent runtime with workflow primitives | `flow.xcity.one` |
| **L4 Runtime** | OpenAI-compatible inference gateway | `tokenhub.xcity.one` |

## L1 Builder

Push code, get a URL. We handle TLS, CDN, secrets, and observability. Three runtimes:

- **Static** — any framework that builds to HTML/CSS/JS.
- **Node** — long-running Node services with autoscale.
- **Sandbox** — ephemeral compute for agent tools (Python, Bash, headless browser).

## L2 Exchange

Two-sided marketplace for AI assets:

- **Models** — submit a fine-tune, set a price, earn per token.
- **Agents** — publish an agent recipe with declared inputs and a price-per-run.

Revenue share: 70% creator / 25% platform / 5% solar-credit fund.
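
As a worked example of the split, the sketch below divides a sale expressed in integer cents. The 70/25/5 percentages come from the text above; sending the rounding remainder to the platform share is an assumption, since the source does not say how fractions of a cent are handled.

```python
from typing import Dict

SHARES_PERCENT = {"creator": 70, "platform": 25, "solar_credit_fund": 5}


def split_revenue(gross_cents: int) -> Dict[str, int]:
    """Split a sale (in cents) 70/25/5 without losing or inventing a cent."""
    if gross_cents < 0:
        raise ValueError("gross_cents must be non-negative")
    creator = gross_cents * SHARES_PERCENT["creator"] // 100
    fund = gross_cents * SHARES_PERCENT["solar_credit_fund"] // 100
    # Assumption: the platform share absorbs any rounding remainder.
    platform = gross_cents - creator - fund
    return {"creator": creator, "platform": platform, "solar_credit_fund": fund}
```

For a $10.00 sale this yields $7.00 for the creator, $2.50 for the platform, and $0.50 for the solar-credit fund.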

## L3 Agent

Workflow primitives for multi-step, long-running agents:

- Steps with retries, timeouts, idempotency keys
- Durable state (resume after restart)
- Tool registry shared across agents
- Per-run trace exposed in the dashboard

## L4 Runtime

Documented in detail in [API Reference: Inference](/docs/en/api-reference/inference). Key features:

- OpenAI-compatible REST and SSE
- Plan-based model whitelist
- Per-request and monthly budget envelopes
- Solar-tied carbon attribution per token (Enterprise)


---

# Energy Platform
URL: https://xcity.ai/docs/en/products/energy-platform
Description: 100 GW solar compute zone — capacity model, contracts, and developer access.


The Xcity Energy Platform turns 400,000 hectares of San Juan solar irradiance into compute capacity. As a developer you don't interact with it directly — your inference runs on it by default. As an enterprise buyer you can contract for dedicated capacity, carbon attribution, or off-take agreements.

## Capacity

| Phase | Capacity | Target date |
|---|---|---|
| Phase 1 (pilot) | 100 MW | 2026 Q3 |
| Phase 2 | 5 GW | 2027 |
| Phase 3 | 30 GW | 2029 |
| Final | 100 GW | 2031 |

San Juan averages 300 sunny days/year. The capacity factor on tracking arrays is projected at 28–30%; see the [Solar Compute paper](/argentina-project).
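
To relate nameplate capacity to delivered energy, multiply by the capacity factor and the hours in a year. The helper below is a back-of-the-envelope sketch using that standard formula; it ignores curtailment, degradation, and storage losses, and `annual_energy_mwh` is a name introduced here.

```python
HOURS_PER_YEAR = 8760  # 365 days x 24 h, ignoring leap years


def annual_energy_mwh(nameplate_mw: float, capacity_factor: float) -> float:
    """Expected yearly output: nameplate x capacity factor x hours in a year."""
    if not 0.0 <= capacity_factor <= 1.0:
        raise ValueError("capacity_factor must be between 0 and 1")
    return nameplate_mw * capacity_factor * HOURS_PER_YEAR
```

For the 100 MW Phase 1 pilot, a 28–30% capacity factor would correspond to roughly 245,000 to 263,000 MWh per year.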

## Inference attribution

Every inference request gets a carbon line in its response header (Enterprise only):

```
X-Xcity-Carbon-gCO2e: 0.34
X-Xcity-Energy-Source: solar
```

We don't claim 100% solar coverage outside daylight hours — battery storage covers ~80%, grid backup the rest. Enterprise customers can choose **solar-only** mode (requests queue during low-availability windows, never grid-backed).
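
A client that wants to total its footprint can read the carbon header from each response. This is a small sketch under the assumption that the value is a plain decimal number of grams, as in the example above; `carbon_grams` is a hypothetical helper, and accounts without Enterprise attribution simply get `None`.

```python
from typing import Mapping, Optional


def carbon_grams(headers: Mapping[str, str]) -> Optional[float]:
    """Return gCO2e for one response, or None if the header is absent."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    raw = lowered.get("x-xcity-carbon-gco2e")
    return float(raw) if raw is not None else None
```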

## Off-take

For enterprises with their own compute, we sell renewable energy off-take contracts in 10 MW blocks. Includes:

- Settlement at agreed PPA price
- REC delivery
- Optional fiber + direct-current handoff from the solar zone

Contact `energy@xcity.one`.


---

# Security & Compliance
URL: https://xcity.ai/docs/en/security/overview
Description: Encryption, residency, audit posture, and how to request compliance artifacts.


## Threat model

Xcity defends against three primary threats:

1. **Credential theft from sub-products** — mitigated by never sharing Stripe/GoTrue/LiteLLM secrets with sub-products. The worst-case compromise yields short-lived inference keys, not the user's account.
2. **Inference key abuse** — mitigated by plan whitelists and per-request budget envelopes enforced at the gateway. A leaked key can't drain a month's budget in a single call and can't access models outside the plan.
3. **Webhook impersonation** — mitigated by HMAC signature checks on every Stripe event.

## Encryption

| Layer | At rest | In transit |
|---|---|---|
| GoTrue Postgres | AES-256 (Railway managed) | TLS 1.3 |
| LiteLLM Postgres | AES-256 (Railway managed) | TLS 1.3 |
| Stripe data | (managed by Stripe — PCI-DSS Level 1) | TLS 1.3 |
| Object storage (audit) | AES-256-GCM | TLS 1.3 |
| Cloudflare Pages | (managed by Cloudflare) | TLS 1.3 |

## Data residency

| Domain | Region | Notes |
|---|---|---|
| Identity (GoTrue) | San Juan, AR (primary) | DR mirror in EU |
| Inference logs (LiteLLM) | San Juan, AR | DR mirror in EU |
| Billing (Stripe) | US (Stripe-managed) | Required by PCI scope |
| Audit object storage | San Juan + EU mirror | Customer-pinnable on Enterprise |

Enterprise customers may pin a single region — see [Enterprise: Data Residency](/docs/en/enterprise/data-residency).

## Audit log

Every privileged action (key rotation, plan override, admin login) is recorded with: actor, timestamp, source IP, action, target, result. Retained 365 days. Enterprise customers can request an export via their account team.

## Compliance posture

| Standard | Status |
|---|---|
| SOC 2 Type II | Q4 2026 target |
| GDPR | DPA available — see [DPA](/dpa) |
| HIPAA | Roadmap; contact for BAA discussion |
| ISO 27001 | Q1 2027 target |

## Vulnerability reporting

Email `security@xcity.one`. We aim for a first response within 24h and a fix or mitigation within 30 days for critical issues. We do not currently run a public bounty program.

## Subprocessors

Listed at [/legal](/legal#subprocessors). Updated within 30 days of any change.


---

# Data residency
URL: https://xcity.ai/docs/en/security/data-residency
Description: Where customer data lives, what crosses borders, and how to constrain it.


## Default footprint

A free or paid (non-Enterprise) account's data lives in:

- **Argentina (San Juan)** — primary GoTrue Postgres, LiteLLM Postgres, object storage.
- **EU (Frankfurt)** — DR mirror, 6h lag.
- **Stripe (US, EU)** — billing only.

## What crosses borders

- Identity → only on DR mirror (encrypted, internal).
- Inference content → does not leave the gateway's region in production; we don't echo to a logging sink outside the region.
- Stripe billing data → outside our control, governed by Stripe's [International Data Transfers](https://stripe.com/legal/idpa).

## Enterprise residency pinning

Enterprise contracts can pin to:

- **AR only** — disable EU DR; data lives in San Juan exclusively. RPO is degraded to "best-effort" since DR is off.
- **EU only** — primary served from EU, DR in AR.
- **Single tenant on dedicated cluster** — separate Postgres/LiteLLM/object storage instances in the customer's preferred region.

See [Enterprise: Custom deployment](/docs/en/enterprise/custom-deployment).

## Sub-processor changes

We notify all customers ≥30 days before adding or removing a sub-processor that handles customer data. Enterprise customers can object to a sub-processor change; we'll work with you on a remediation path.


---

# Inference API
URL: https://xcity.ai/docs/zh/api-reference/inference
Description: OpenAI-compatible chat/completions, completions, and embeddings endpoints served by tokenhub.xcity.one.


The inference gateway lives at `https://tokenhub.xcity.one/v1` and speaks the OpenAI REST protocol. Any OpenAI SDK works out of the box.

## Authentication

All requests require a bearer token issued from `/dashboard/keys`:

```
Authorization: Bearer sk-...
```

Keys can be revoked from the dashboard or via the [Keys API](/docs/zh/api-reference/keys); revocation takes effect globally within 5 seconds.

## POST /v1/chat/completions

Standard OpenAI chat-completions shape.

```bash
curl https://tokenhub.xcity.one/v1/chat/completions \
  -H "Authorization: Bearer $XCITY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-6",
    "messages": [
      {"role": "system", "content": "Be concise."},
      {"role": "user", "content": "Summarize the Argentina project in two sentences."}
    ],
    "stream": false
  }'
```

## POST /v1/embeddings

```bash
curl https://tokenhub.xcity.one/v1/embeddings \
  -H "Authorization: Bearer $XCITY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{ "model": "text-embedding-3-small", "input": "hello world" }'
```

## GET /v1/models

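Returns the models **allowed by the current key's plan** (not the global catalog). Use it to drive UI model pickers without exposing models the user is not entitled to.

## Streaming

`"stream": true` enables an SSE stream. The wire protocol is identical to OpenAI's: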

```
data: {"choices":[{"delta":{"content":"你"}}]}
data: {"choices":[{"delta":{"content":"好"}}]}
data: [DONE]
```
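
A client can rebuild the full message by concatenating the `delta.content` fragments until the `[DONE]` sentinel. The parser below is a minimal sketch that works on already-decoded text lines; `collect_stream` is a hypothetical helper, and a production client would normally rely on the SDK's streaming support instead.

```python
import json
from typing import Iterable


def collect_stream(lines: Iterable[str]) -> str:
    """Concatenate delta content from SSE lines until the [DONE] sentinel."""
    parts = []
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and SSE comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            # Some chunks carry no content (for example a role-only delta).
            parts.append(choice.get("delta", {}).get("content") or "")
    return "".join(parts)
```

Feeding it the three lines from the example above returns the two fragments joined into one string.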

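## Error codes

| Status | Meaning |
|---|---|
| `401` | Key is invalid or revoked |
| `403` | Model is not in the plan whitelist |
| `402` | Budget exceeded (per-request or monthly) |
| `429` | Rate limited; retry with exponential backoff |
| `5xx` | Upstream or gateway problem; idempotent requests can be retried |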
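
The retry guidance in the table can be sketched as a wrapper that backs off on `429` and `5xx` and returns every other status immediately. The delays, jitter, and attempt count below are illustrative assumptions, not documented gateway limits, and `send` is a hypothetical callable that performs one request.

```python
import random
import time
from typing import Callable, Tuple


def call_with_backoff(
    send: Callable[[], Tuple[int, dict]],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, dict]:
    """Retry 429 and 5xx with exponential backoff; return other statuses at once."""
    for attempt in range(max_attempts):
        status, body = send()
        retryable = status == 429 or 500 <= status <= 599
        # 401, 402, and 403 fall through here: retrying cannot fix them.
        if not retryable or attempt == max_attempts - 1:
            return status, body
        # 0.5s, 1s, 2s, ... plus up to 10% jitter to spread out retries.
        delay = base_delay * (2 ** attempt)
        sleep(delay + random.uniform(0, delay * 0.1))
    raise ValueError("max_attempts must be at least 1")
```

Only wrap requests that are safe to repeat; a `5xx` does not guarantee the upstream call never ran.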


---

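# Architecture overview
URL: https://xcity.ai/docs/zh/concepts/architecture
Description: How the identity, billing, and inference planes flow through the Xcity stack.


The Xcity stack is split into three planes: **identity**, **billing**, and **inference**. Every product on the platform is composed from them.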

```
                ┌────────────────────────────────────────────┐
   Browser  →   │  xcity-home (Astro, *.xcity.one)           │
                │   ├── /api/auth/*     identity BFF         │
                │   ├── /api/billing/*  Stripe BFF           │
                │   └── /api/me/*       plan / key resolver  │
                └──────┬───────────┬────────────────┬────────┘
                       │           │                │
                       ▼           ▼                ▼
                  GoTrue       Stripe          LiteLLM
                (auth.xcity)  (billing)       (tokenhub.xcity)
                                                    │
                                                    ▼
                                            Solar Compute (AR)
```

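## Identity plane

Authentication is centralized at `auth.xcity.one` (self-hosted GoTrue, with Supabase as the development fallback). The xcity-home Astro service issues a host-only, SameSite=Lax session cookie scoped to `*.xcity.one`. Every `.xcity.one` sub-product inherits the session automatically by sending requests with `credentials: 'include'`.

See [Sub-product integration](/docs/zh/guides/sub-product-integration) for details.

## Billing plane

Stripe is the source of truth for plans, subscriptions, and invoices. The xcity-home service holds the only copy of the Stripe credentials; sub-products never call Stripe directly. Plans and entitlements are mirrored onto the user's GoTrue `app_metadata` via `/api/billing/webhook`.

See [Billing model](/docs/zh/concepts/plans) and the [Stripe webhook reference](/docs/zh/operations/stripe-webhooks).

## Inference plane

`tokenhub.xcity.one` runs LiteLLM in front of multiple upstream model providers and our self-hosted models. Every API key is bound to a Xcity account, a plan whitelist, and a budget cap. Every request is written to the usage ledger, which drives `/dashboard/usage` and overage billing.

## Where data lives

| Domain | Store | Region |
|---|---|---|
| User identity | GoTrue Postgres | San Juan, Argentina (primary) |
| Subscriptions | Stripe | US (global) |
| API usage | LiteLLM Postgres | San Juan, Argentina |
| Audit logs | Object storage | San Juan + DR |

Enterprise contracts can pin a region or require a dedicated deployment. See [Enterprise: Data residency](/docs/zh/enterprise/data-residency).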


---

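# Plans, keys, and budgets
URL: https://xcity.ai/docs/zh/concepts/plans
Description: How accounts map to plans, how keys are scoped, and how usage is metered.


## Plans

Xcity offers three standard tiers plus a custom Enterprise tier:

| Plan | Monthly price | Includes | Cap |
|---|---|---|---|
| Free | $0 | Base models, dashboard | Hard cap, no overage |
| Pro | $29 | All models, priority routing | Soft cap + overage |
| Team | $99 | Multiple seats, shared budget, audit | Soft cap + overage |
| Enterprise | Negotiated | SSO, custom SLA, region pinning, DPA | Negotiated |

Stripe is the source of truth for plan state. xcity-home reads it through `src/lib/billing.ts` and mirrors it to GoTrue's `app_metadata.plan` field via `/api/billing/webhook`.

## Keys

A key (`sk-...`) is bound to:

1. **A Xcity account**: deleting the account revokes all of its keys.
2. **A plan whitelist**: only models in the current plan are allowed; anything else returns `403`.
3. **A budget envelope**: LiteLLM checks the per-request cost at admission and adds it to the monthly total.

Create keys from the dashboard or the [Keys API](/docs/zh/api-reference/keys). Any key can be revoked at any time without affecting the others.

## Budgets

There are two budget signals:

- **Per-request cost cap.** At admission, LiteLLM rejects any request whose estimated cost exceeds the per-request cap, so a single large prompt cannot exhaust the monthly budget.
- **Cumulative monthly cost.** The usage ledger rolls up to the account level. Free plans are hard-blocked at the cap; paid plans switch to overage billed at list rates.

`GET /api/me/litellm-key` returns the current envelope; `/dashboard/usage` visualizes consumption.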
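
The two budget signals combine into a single admission decision. The sketch below illustrates that decision only; it is not LiteLLM's implementation, and the `BudgetEnvelope` field names and the dollar amounts used with it are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BudgetEnvelope:
    plan: str                    # "free", "pro", "team", ...
    per_request_cap_usd: float   # hypothetical field names
    monthly_cap_usd: float
    spent_this_month_usd: float


def admit(envelope: BudgetEnvelope, estimated_cost_usd: float) -> str:
    """Return "reject", "allow", or "allow_overage" for one request."""
    if estimated_cost_usd > envelope.per_request_cap_usd:
        return "reject"  # one oversized prompt must not drain the month
    projected = envelope.spent_this_month_usd + estimated_cost_usd
    if projected <= envelope.monthly_cap_usd:
        return "allow"
    # Over the monthly cap: Free is hard-blocked, paid plans bill overage.
    return "reject" if envelope.plan == "free" else "allow_overage"
```

Both rejection paths would surface to the caller as `402`, so a client cannot tell them apart from the status code alone.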

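## Plan changes

Upgrades and downgrades go through the Stripe Customer Portal. We listen for `customer.subscription.updated` and sync entitlements within seconds (typically <2s). Downgrades take effect at the end of the current billing cycle, so you are not locked out mid-cycle.

For failure modes, see [Operations: Stripe webhooks](/docs/zh/operations/stripe-webhooks).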


---

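# Service level agreement
URL: https://xcity.ai/docs/zh/enterprise/sla
Description: Availability, latency, and support-response commitments for paid plans.


This SLA applies to paid plans. The Free plan is best-effort with no commitments.

## Availability

| Plan | Monthly availability | Credit |
|---|---|---|
| Pro | 99.5% | 5% of MRR for every 0.1% below 99.5% |
| Team | 99.9% | 10% of MRR for every 0.1% below 99.9% |
| Enterprise | 99.95% (customizable) | Negotiated; capped at 30% of MRR |

Availability is measured by probes in three locations at the gateway edge. Maintenance windows announced at least 48h in advance are excluded, as are emergency security patches shorter than 15 minutes.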
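
The credit schedule for Pro and Team can be worked through with integer arithmetic. The sketch below is an interpretation, not contract language: counting a partial 0.1% shortfall as a full step and capping the credit at 100% of MRR are both assumptions, and Enterprise terms are left out because they are negotiated.

```python
from typing import Dict, Tuple

# plan -> (target availability in basis points, credit % of MRR per 0.1% shortfall)
SLA_TERMS: Dict[str, Tuple[int, int]] = {"pro": (9950, 5), "team": (9990, 10)}


def sla_credit_percent(plan: str, availability_bp: int) -> int:
    """Credit as % of MRR; availability is in basis points (9950 = 99.50%)."""
    target_bp, rate = SLA_TERMS[plan]
    shortfall_bp = max(0, target_bp - availability_bp)
    # Ceiling division: a partial 0.1% counts as a full step (assumption).
    steps = -(-shortfall_bp // 10)
    # Capping at 100% of MRR is also an assumption.
    return min(100, steps * rate)
```

Under these assumptions, a Pro account that saw 99.30% availability in a month would receive a credit of 10% of MRR.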

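## Latency

| Plan | p50 | p95 |
|---|---|---|
| Pro | <800ms | <2000ms |
| Team | <600ms | <1500ms |
| Enterprise | Custom | Custom |

Measured from the gateway to first token for chat/completions on the default model. For inputs above 4k tokens, the latency budget is relaxed proportionally.

## Support response

| Severity | Pro | Team | Enterprise |
|---|---|---|---|
| **S1**: production down | 4h | 1h | 15 minutes |
| **S2**: degraded service | Next business day | 4h | 1h |
| **S3**: question / non-blocking | 3 business days | 1 business day | 4h |

S1 incidents escalate to PagerDuty automatically. Enterprise S1 includes a phone bridge.

## Claiming credits

Submit claims within 60 days of the incident. Credits are applied to the next invoice and do not accumulate beyond 12 months.

Contact `support@xcity.one`; Enterprise customers use their dedicated customer portal.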


---

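# Introduction
URL: https://xcity.ai/docs/zh/get-started/introduction
Description: What Xcity AI OS is, who it is for, and how the pieces fit together.


Xcity AI OS is the operating system for a solar-powered AI civilization. It builds four product layers (Builder, Exchange, Agent, Runtime) on top of a 100 GW solar compute base in Argentina and exposes them through a unified account, unified billing, and a unified OpenAI-compatible inference gateway.

This documentation is for developers, integrators, and enterprise buyers. It covers accounts, the inference API, the billing model, sub-product integration, deployment, and operations.

## Who it's for

- **Developers**: build applications on the inference gateway `tokenhub.xcity.one`.
- **Integrators**: wire browser sub-products into unified identity and billing.
- **Enterprises**: evaluate compliance, SLA, data residency, and procurement.
- **Operators**: handle releases, key rotation, and webhook troubleshooting.

## What's included

| Layer | Role | Domain |
|---|---|---|
| L1 Builder | App and agent hosting platform | `xcity.one` |
| L2 Exchange | Marketplace for models and agents | `market.xcity.one` |
| L3 Agent / Runtime | OpenAI-compatible inference gateway | `tokenhub.xcity.one` |
| L4 Infra | Solar compute, network, identity | (operated by Xcity) |

All four layers share one identity (Xcity Account), one billing system (Stripe), and one usage ledger.

## How the docs are organized

- **Get started**: installation, authentication, first call.
- **Core concepts**: mental models for accounts, plans, keys, agents, and the gateway.
- **Products**: guides for each vertical (AI platform, energy, real estate, immigration).
- **API reference**: request/response examples for public endpoints.
- **Guides**: task-oriented walkthroughs (integrating identity, launching a sub-product, handling webhooks).
- **Operations**: runbooks, releases, SLA, incident response.
- **Security & compliance**: DPA, regions, encryption, audit.
- **Enterprise**: procurement, custom SLA, dedicated compute.

New users should start with the [Quickstart](/docs/zh/get-started/quickstart); to integrate an existing app, go straight to [Sub-product integration](/docs/zh/guides/sub-product-integration).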


---

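# Quickstart
URL: https://xcity.ai/docs/zh/get-started/quickstart
Description: Register an account, get an API key, and make your first inference call in under 5 minutes.


This page takes you from zero to your first response from the Xcity inference gateway.

## 1. Register an account

1. Open [xcity.one/register](https://xcity.one/register).
2. Sign up with email and password. Resend delivers a verification email; the account is activated once you click the link.
3. You are redirected to `xcity.one/dashboard`. New accounts start on the **Free** plan.

## 2. Get an API key

The gateway uses a bearer token bound to your account. There are two ways to get one:

**Dashboard**: go to `/dashboard/keys` → **Create key** → copy the key (it is shown only once) and give it a meaningful label.

**Programmatic**: once the account is authorized, you can request keys through the LiteLLM admin API. See the [Keys API](/docs/zh/api-reference/keys).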

```bash
export XCITY_API_KEY="sk-..."
```

## 3. Make your first call

The gateway is OpenAI-compatible, so any OpenAI SDK works once you change the base URL.

```bash
curl https://tokenhub.xcity.one/v1/chat/completions \
  -H "Authorization: Bearer $XCITY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-6",
    "messages": [{"role": "user", "content": "Say hello in one sentence."}]
  }'
```

Python:

```python
import os

from openai import OpenAI

# Point the standard OpenAI client at the Xcity gateway.
client = OpenAI(
    api_key=os.environ["XCITY_API_KEY"],
    base_url="https://tokenhub.xcity.one/v1",
)

resp = client.chat.completions.create(
    model="claude-sonnet-4-6",
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
)
print(resp.choices[0].message.content)
```

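## 4. Check usage

Every request is recorded in the usage ledger. `/dashboard/usage` shows, in real time:

- Tokens consumed today and in the current billing period
- A breakdown by model
- Remaining budget

The Free plan has a hard cap; paid plans have a soft cap plus overage billing. See [Plans and pricing](/docs/zh/concepts/plans) for the full matrix.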
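For a local cross-check against `/dashboard/usage`, the `usage` object carried by each chat-completions response can be tallied per model. A minimal sketch: `tallyUsage` and the two interfaces are hypothetical names introduced here, and the dashboard ledger remains the source of truth.

```typescript
// Shapes mirror the `model` and `usage` fields of an OpenAI-style chat completion.
interface ChatUsage {
  prompt_tokens: number;
  completion_tokens: number;
  total_tokens: number;
}

interface ChatResponseLike {
  model: string;
  usage?: ChatUsage;
}

// Hypothetical helper: sums total_tokens per model across responses.
function tallyUsage(responses: ChatResponseLike[]): Record<string, number> {
  const totals: Record<string, number> = {};
  for (const r of responses) {
    if (!r.usage) continue; // skip responses that carry no usage object
    totals[r.model] = (totals[r.model] ?? 0) + r.usage.total_tokens;
  }
  return totals;
}
```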

## Next steps

- [Concepts: plans, keys, and budgets](/docs/zh/concepts/plans)
- [Guide: integrating a sub-product](/docs/zh/guides/sub-product-integration)
- [API reference: chat/completions](/docs/zh/api-reference/inference)


---

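# Integrating a browser sub-product
URL: https://xcity.ai/docs/zh/guides/sub-product-integration
Description: Connect any *.xcity.one app to unified identity, plans, and inference.


How to connect a new browser sub-product (any `*.xcity.one` app) to xcity-home identity and tokenhub.

> **Scope**: this guide covers **browser** sub-products only (Chromium / Firefox / Safari). For Electron desktop apps, see [Desktop integration](/docs/zh/guides/desktop-integration).

## Prerequisites

1. **An xcity.one subdomain**: the app must be served from `https://<app>.xcity.one`. Any other domain requires `XCT_CORS_EXTRA_ORIGINS` to be configured in xcity-home and must pass a security review.
2. **HTTPS in production**: the session cookie is set with `Secure`, so it is not sent to http origins in production.
3. **The user already has an Xcity account**: first-time visitors are redirected to `xcity-home/login`; sub-products do not provide their own registration form.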

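## Step 1: Fetch the identity envelope

```ts
async function getXcityIdentity() {
  const res = await fetch('https://www.xcity.one/api/me/litellm-key', {
    // Send the session cookie cross-origin; without it the call returns 401.
    credentials: 'include',
  });
  if (res.status === 401) {
    // Not signed in: send the user to login and return here afterwards.
    window.location.href = 'https://www.xcity.one/login?return=' +
      encodeURIComponent(window.location.href);
    return null;
  }
  return res.json() as Promise<{
    key: string;
    plan: string;
    models: string[];
    api_base: string;
  }>;
}
```

A single call returns the bearer token, the user's plan, the available models, and the gateway address.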

## Step 2: Run inference

```ts
async function sendHello() {
  const id = await getXcityIdentity();
  if (!id) return null; // the user is being redirected to login

  // Read the gateway address and bearer token from the envelope; never hardcode them.
  return fetch(`${id.api_base}/chat/completions`, {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${id.key}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      model: id.models[0],
      messages: [{ role: 'user', content: 'Hello' }],
    }),
  });
}
```

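## Step 3: React to plan changes

For long-lived sessions, refresh the identity envelope on the window `focus` event or every 5 minutes. After a downgrade, `chat/completions` returns `403` for models that are no longer enabled; refresh `models` gracefully and update the UI.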
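The refresh policy above could be wired up roughly as follows. This is a sketch under assumptions: `watchIdentity`, `FocusSource`, and the callback names are hypothetical, and the focus source and fetcher are injected so the logic does not depend on a browser global. In a page you would pass `window` and the identity fetcher from step 1.

```typescript
// Minimal shapes so the sketch also runs outside a browser.
interface IdentityEnvelope {
  key: string;
  plan: string;
  models: string[];
  api_base: string;
}

interface FocusSource {
  addEventListener(type: string, listener: () => void): void;
  removeEventListener(type: string, listener: () => void): void;
}

const REFRESH_INTERVAL_MS = 5 * 60 * 1000; // "every 5 minutes" from the guide

// Hypothetical helper: refreshes on focus and on a timer; returns a stop function.
function watchIdentity(
  source: FocusSource,
  fetchIdentity: () => Promise<IdentityEnvelope | null>,
  onUpdate: (id: IdentityEnvelope) => void,
  intervalMs: number = REFRESH_INTERVAL_MS,
): () => void {
  let stopped = false;
  const refresh = (): void => {
    fetchIdentity()
      .then((id) => {
        // Ignore results that arrive after stop() or when the user is signed out.
        if (id && !stopped) onUpdate(id);
      })
      .catch(() => {
        // Transient failure: keep the previous envelope and try again on the next trigger.
      });
  };
  source.addEventListener("focus", refresh);
  const timer = setInterval(refresh, intervalMs);
  return () => {
    stopped = true;
    source.removeEventListener("focus", refresh);
    clearInterval(timer);
  };
}
```

The `onUpdate` callback is the natural place to replace the model picker's options with the new `models` list, which also covers the `403` case after a downgrade.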

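## Common mistakes

- **Forgetting `credentials: 'include'`**: the browser does not send the cookie, so every call returns `401`.
- **Hardcoding `api_base`**: always read it from the envelope. We will migrate the gateway domain.
- **Caching the bearer token**: it is short-lived (revocation or a plan change invalidates it). Re-fetch it on `401`.
- **Calling Stripe / GoTrue directly**: don't. Sub-products do not hold any shared secrets.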
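The "re-fetch on `401`" advice can be packaged as a small wrapper. A minimal sketch, assuming a single retry is enough: `callWithFreshKey` is a hypothetical name, and both the identity fetcher and the request sender are injected so the retry logic can be exercised without a network.

```typescript
interface Envelope {
  key: string;
  api_base: string;
  models: string[];
}

interface HttpResult {
  status: number;
}

// Hypothetical wrapper: uses a cached envelope if present, and on 401
// re-fetches the envelope once and retries the request with the new key.
async function callWithFreshKey<R extends HttpResult>(
  fetchIdentity: () => Promise<Envelope | null>,
  send: (id: Envelope) => Promise<R>,
  cached: Envelope | null = null,
): Promise<R | null> {
  let id = cached ?? (await fetchIdentity());
  if (!id) return null; // signed out: the identity fetcher handles the login redirect

  let res = await send(id);
  if (res.status === 401) {
    // The key was revoked or invalidated by a plan change.
    id = await fetchIdentity();
    if (!id) return null;
    res = await send(id);
  }
  return res;
}
```

In a sub-product, `send` would wrap the `fetch` call to `${id.api_base}/chat/completions` from step 2, and `fetchIdentity` would be the function from step 1.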