Quick Start

Create an account, get an API key, and make your first inference call in under five minutes.

This page walks you from zero to your first response from the Xcity inference gateway.

1. Create an account

  1. Open xcity.one/register.
  2. Sign up with an email address and password. We send a confirmation link via Resend; the account is activated when you click it.
  3. You’ll land on the dashboard at xcity.one/dashboard. New accounts start on the Free plan.

2. Get an API key

The gateway uses bearer tokens scoped to your account. Two ways to retrieve one:

Dashboard — go to /dashboard/keys. Click Create key, copy the value (we only show it once), and give it a descriptive label.

Programmatic — keys can also be minted via the LiteLLM admin API once your account has an entitlement. See Provisioning keys.

Whichever route you use, export the key so the examples below can read it from the environment:

export XCITY_API_KEY="sk-..."
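Before making a call, it can help to confirm that the key is actually visible to your process. The following is a minimal sketch, assuming the key is stored in the XCITY_API_KEY environment variable; the helper name auth_headers is illustrative and not part of any Xcity SDK.

```python
import os


def auth_headers(env=os.environ):
    """Build bearer-token headers from XCITY_API_KEY, failing fast if it is unset.

    `auth_headers` is a hypothetical helper for illustration only.
    """
    key = env.get("XCITY_API_KEY", "").strip()
    if not key:
        raise RuntimeError("XCITY_API_KEY is not set; export it first")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Calling auth_headers() with no arguments reads the real environment; passing a plain dict makes the helper easy to exercise without touching your shell.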

3. Make your first request

The gateway is OpenAI-compatible. Any OpenAI client library works — just swap the base URL.

curl https://tokenhub.xcity.one/v1/chat/completions \
  -H "Authorization: Bearer $XCITY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-6",
    "messages": [{"role": "user", "content": "Say hello in one sentence."}]
  }'

Python:

import os  # needed for os.environ below

from openai import OpenAI

client = OpenAI(
    api_key=os.environ["XCITY_API_KEY"],
    base_url="https://tokenhub.xcity.one/v1",
)

resp = client.chat.completions.create(
    model="claude-sonnet-4-6",
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
)
print(resp.choices[0].message.content)
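Because the gateway accepts the OpenAI chat-completions format, the same call can in principle be made without the SDK. The sketch below uses only the Python standard library; build_chat_request and send are hypothetical helper names, and the base URL and model name are copied from the examples above. It assumes the response follows the usual OpenAI shape (choices[0].message.content).

```python
import json
import os
import urllib.request

BASE_URL = "https://tokenhub.xcity.one/v1"  # same base URL as the examples above


def build_chat_request(prompt, model="claude-sonnet-4-6", api_key=None):
    """Return a urllib Request for POST {BASE_URL}/chat/completions."""
    api_key = api_key or os.environ["XCITY_API_KEY"]
    body = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request):
    """Send the request and return the assistant's reply text.

    Assumes an OpenAI-style response body.
    """
    with urllib.request.urlopen(request, timeout=30) as resp:
        payload = json.load(resp)
    return payload["choices"][0]["message"]["content"]
```

To try it, run print(send(build_chat_request("Say hello in one sentence."))) with XCITY_API_KEY exported. Separating request construction from sending keeps the payload easy to inspect before anything goes over the network.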

4. Check usage

Every request is logged. Visit /dashboard/usage for a live readout of:

  • Tokens consumed today / this billing cycle
  • Per-model breakdown
  • Remaining budget against your plan cap

Free-tier accounts get a hard ceiling — paid plans get a soft cap with overage billing. See Plans & Pricing for the full matrix.
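The difference between the two cap styles can be sketched as follows. This is an illustration only, not gateway code: the function name, the plan labels, and the assumption that a hard ceiling blocks requests once the cap is reached are all hypothetical, and the real limits and overage rates are in Plans & Pricing.

```python
def check_budget(used_tokens, cap_tokens, plan):
    """Illustrate hard-ceiling vs soft-cap behaviour (hypothetical model).

    Returns a tuple (allowed, remaining, overage), all in tokens.
    """
    remaining = max(cap_tokens - used_tokens, 0)
    overage = max(used_tokens - cap_tokens, 0)
    if plan == "free":
        # Hard ceiling (assumed): no further requests once the cap is reached.
        return used_tokens < cap_tokens, remaining, 0
    # Soft cap (assumed): requests continue; usage past the cap counts as overage.
    return True, remaining, overage
```

For example, check_budget(400, 1000, "free") reports 600 tokens remaining, while a paid account at 1200 of 1000 tokens would still be allowed, with 200 tokens counted as overage.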
