ZeroReader · Llama 3.1 8B Fast Inference

Served byapi.zeroreader.com Indexed external

Generate fast, cost-efficient text completions using Llama 3.1 8B with OpenAI-compatible chat format.

What it does

Generate fast, cost-efficient text completions using Llama 3.1 8B with OpenAI-compatible chat format.

  • Power high-volume agent conversations with low latency
  • Route cost-sensitive tasks to efficient 8B model
  • Build responsive chat interfaces with streaming
  • Fallback from larger models for speed-critical paths

Ideal buyer

AI agent developers and applications that need fast, affordable LLM inference for high-volume or latency-sensitive workloads.

Use with AXON

Run this through your governed agent wallet.

  1. 01
    Bootstrap AXON once with npx @axon402/init.
  2. 02
    Use the AXON runtime MCP tools to search_x402_services or inspect_x402_offer for this service.
  3. 03
    Quote, test-buy, then run the governed paid fetch through AXON.

Send this

Prompt for your agent

A natural-language instruction for your LLM agent — with this endpoint exposed as a tool — to call this resource. Not sent to the endpoint; the endpoint consumes the JSON body below.

Pasting this prompt into a raw ChatGPT or unconfigured agent will notexecute the paid endpoint flow. Run it through an agent with the AXON runtime / MCP tools exposed (see “Use with AXON” above) so the 402 challenge, quote, and governed fetch are handled for you.

Summarize this paragraph in 10 words: 'The quick brown fox jumps over the lazy dog while the sun sets behind the mountains.'

Endpoint request body

The JSON payload your agent sends to the endpoint.

application/json
{
  "messages": [
    {
      "role": "user",
      "content": "Hello, how are you?"
    }
  ],
  "max_tokens": 256,
  "temperature": 0.7
}

Advanced HTTP details

For integrators who need the raw protocol surface. Most agents should use AXON above instead of calling these directly.

curl fallback

curl https://api.zeroreader.com/v1/ai/llama-8b \
  -H "Content-Type: application/json" \
  -H "X-PAYMENT: [signed_payment_envelope]" \
  -d '{"messages":[{"role":"user","content":"Hello, how are you?"}],"max_tokens":256,"temperature":0.7}'

Payment & settlement details

Raw on-chain settlement parameters. AXON above handles these automatically through quote / test-buy / governed fetch.

baseexact
$0.0020
per call
Pay-to address0xca99149c1a5959f7e5968259178f974aacc70f55
T/O: 300s asset 0x8335…2913

Price & network

Cheapest call$0.0020
Networks
base

Trust & risk

Trust tier Indexed external
Pricing sanityCheap outlierratio 0.10×
Risk flagsNo risks flagged
View JSON bundle

Indexed from facilitator discovery data

Last enriched: