Surf Inference · LLM Chat Completions

Served by Surf Inference · Indexed external

Generates chat completions via an OpenAI-compatible endpoint that supports multiple LLM models with configurable parameters.

What it does

  • Generate text responses for agent conversations
  • Create embeddings and completions without API keys
  • Access Chinese and multilingual LLMs programmatically
  • Build chatbots with pay-per-use cost structure

Ideal buyer

AI agent developers seeking OpenAI-compatible inference with micropayment flexibility and multi-model access.

Use with AXON

Run this through your governed agent wallet.

  1. Bootstrap AXON once with npx @axon402/init.
  2. Use the AXON runtime MCP tools to search_x402_services or inspect_x402_offer for this service.
  3. Quote, test-buy, then run the governed paid fetch through AXON.

Send this

Prompt for your agent

A natural-language instruction that tells your LLM agent, with this endpoint exposed as a tool, to call this resource. The prompt itself is not sent to the endpoint; the endpoint consumes the JSON body below.

Pasting this prompt into a raw ChatGPT session or an unconfigured agent will not execute the paid endpoint flow. Run it through an agent with the AXON runtime / MCP tools exposed (see “Use with AXON” above) so the 402 challenge, quote, and governed fetch are handled for you.

Generate a completion for 'Explain quantum computing in 3 sentences' using moonshotai/kimi-k2.5 with max 100 tokens.

Endpoint request body

The JSON payload your agent sends to the endpoint.

application/json
{
  "model": "moonshotai/kimi-k2.5",
  "messages": [
    {
      "role": "user",
      "content": "Explain quantum computing in 3 sentences"
    }
  ],
  "max_tokens": 100
}
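
The body above can also be assembled programmatically. The following is a minimal Python sketch, not the service's client library: build_chat_body and extract_text are hypothetical helper names, and the response layout read by extract_text (choices[0].message.content) is assumed from the endpoint's stated OpenAI compatibility rather than confirmed by this listing.

```python
import json


def build_chat_body(model: str, prompt: str, max_tokens: int) -> dict:
    """Build the chat-completions payload shown above (hypothetical helper)."""
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }


def extract_text(response: dict) -> str:
    """Return the first completion's text.

    Assumption: the response follows the OpenAI chat-completions layout.
    """
    return response["choices"][0]["message"]["content"]


body = build_chat_body(
    "moonshotai/kimi-k2.5",
    "Explain quantum computing in 3 sentences",
    100,
)
print(json.dumps(body, indent=2))
```

Keeping the payload builder separate from the transport makes it easy to hand the same dictionary to AXON or to a raw HTTP client.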

Advanced HTTP details

For integrators who need the raw protocol surface. Most agents should use AXON above instead of calling these directly.

curl fallback

curl https://inference.surf.cascade.fyi/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "X-PAYMENT: [signed_payment_envelope]" \
  -d '{"model":"moonshotai/kimi-k2.5","messages":[{"role":"user","content":"Explain quantum computing in 3 sentences"}],"max_tokens":100}'
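
For integrators working in Python rather than curl, the same request can be sketched with the standard library. This is an illustration under assumptions: build_request and send are hypothetical helpers, the payment envelope is a placeholder that must come from your own signing flow, and the 402 branch assumes the challenge body is JSON without assuming any of its fields.

```python
import json
import urllib.error
import urllib.request

ENDPOINT = "https://inference.surf.cascade.fyi/v1/chat/completions"


def build_request(body, payment_envelope=None):
    """Build the POST request; attach X-PAYMENT only when an envelope is given."""
    headers = {"Content-Type": "application/json"}
    if payment_envelope is not None:
        headers["X-PAYMENT"] = payment_envelope
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )


def send(request, timeout=30.0):
    """Send the request and return (status, parsed JSON body).

    A 402 status means payment is still required. Assumption: the
    challenge body is JSON; its fields are returned untouched.
    """
    try:
        with urllib.request.urlopen(request, timeout=timeout) as resp:
            return resp.status, json.loads(resp.read())
    except urllib.error.HTTPError as err:
        if err.code == 402:
            return 402, json.loads(err.read())
        raise


body = {
    "model": "moonshotai/kimi-k2.5",
    "messages": [
        {"role": "user", "content": "Explain quantum computing in 3 sentences"}
    ],
    "max_tokens": 100,
}
# Placeholder envelope, exactly as in the curl example; not a valid payment.
request = build_request(body, payment_envelope="[signed_payment_envelope]")
print(request.get_method(), request.full_url)
```

Nothing is sent until send(request) is called, so the request can be inspected or logged before any paid call is made.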

Payment & settlement details

Raw on-chain settlement parameters. AXON above handles these automatically through quote / test-buy / governed fetch.

Network   Scheme   Price per call   Timeout   Asset          Pay-to address
solana    exact    $0.0010          300s      EPjFWd…Dt1v    E7BFB4ucyNsa8K52uQsCyKgw4Dg9vHA5mo73W6aWTNZY
base      exact    $0.0010          300s      0x8335…2913    0xdd6090df24e88caf558839584dd53bcef79c6338
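
The per-call price lends itself to simple budgeting arithmetic. The sketch below uses hypothetical names (PaymentOption, option_for, calls_within_budget, to_atomic_units) to record the two options listed above and compute how many calls a budget covers. The conversion to atomic units assumes a 6-decimal asset priced 1:1 with USD, which this listing does not state; the truncated asset identifiers are kept exactly as shown.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class PaymentOption:
    network: str
    scheme: str
    price_usd: Decimal
    pay_to: str
    asset: str  # truncated exactly as shown in the listing
    timeout_s: int


OPTIONS = [
    PaymentOption("solana", "exact", Decimal("0.0010"),
                  "E7BFB4ucyNsa8K52uQsCyKgw4Dg9vHA5mo73W6aWTNZY",
                  "EPjFWd…Dt1v", 300),
    PaymentOption("base", "exact", Decimal("0.0010"),
                  "0xdd6090df24e88caf558839584dd53bcef79c6338",
                  "0x8335…2913", 300),
]


def option_for(network: str) -> PaymentOption:
    """Look up the listed payment option for a network name."""
    for option in OPTIONS:
        if option.network == network:
            return option
    raise KeyError(network)


def calls_within_budget(budget_usd: Decimal, option: PaymentOption) -> int:
    """Number of whole calls a USD budget covers at the listed price."""
    return int(budget_usd // option.price_usd)


def to_atomic_units(amount_usd: Decimal, decimals: int = 6) -> int:
    """Convert a USD amount to token base units.

    Assumption: a 6-decimal asset pegged 1:1 to USD.
    """
    return int(amount_usd * (Decimal(10) ** decimals))


solana = option_for("solana")
print(calls_within_budget(Decimal("1.00"), solana))  # 1000 calls per $1.00
print(to_atomic_units(solana.price_usd))             # 1000 base units per call
```

Decimal is used instead of float so that sub-cent prices divide exactly.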

Price & network

Cheapest call: $0.0010
Networks: solana, base

Trust & risk

Trust tier: Indexed external
Pricing sanity: Cheap outlier (ratio 0.10×)
Risk flags: No risks flagged

Indexed from facilitator discovery data
