API-first reply drafting · public beta

asktiberius
anything.

Every conversation. Every objection. Every SOP, pricing sheet, tone-of-voice example your team ever wrote — welded into one editable knowledge graph, exposed as one API, grounding every reply.

Try the API · See the graph
POST /api/v1/agents/:id/reply
~300ms retrieval p95 · 14 chunkers · ~95% groundedness
/the flow

One API call. Five stages. Sub-second.

Tiberius isn't a wrapper around a chat model. It's a deterministic retrieval pipeline that feeds the generator exactly the context a human op would pull — and scores the result before anyone sees it.

01
Message in

Prospect pings you on WhatsApp, Telegram, Intercom, or your own widget. Forward the payload to /reply.

trigger_message: "What's pricing for $50M/mo USDC?"
02
Hybrid retrieve

pgvector HNSW + Postgres FTS fused via Reciprocal Rank Fusion, plus entity-triggered + metadata-filtered lookups.

retrieved: 17 chunks · RRF top-25 → LLM rerank → top-7
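The fusion step is small enough to sketch. A minimal Reciprocal Rank Fusion over two ranked lists — one from the vector index, one from full-text search — using the conventional k = 60 constant; function and variable names here are illustrative, not Tiberius internals:

```python
def rrf_fuse(rankings, k=60):
    """Reciprocal Rank Fusion: score each doc by the sum of 1/(k + rank)
    across every ranked list it appears in, then sort by that score."""
    scores = {}
    for ranked_ids in rankings:
        for rank, doc_id in enumerate(ranked_ids, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

vector_hits = ["kb-3", "kb-9", "sop-1", "tov-2"]   # pgvector HNSW order
fts_hits    = ["sop-1", "kb-3", "convo-5"]          # Postgres FTS order
fused = rrf_fuse([vector_hits, fts_hits])
# chunks that rank well in *both* lists float to the top;
# the top-N of `fused` is what goes on to the LLM reranker
```

Chunks that appear high in both lists dominate, which is why RRF needs no score normalization between the two very different retrievers.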
03
Fill slots

Facts, SOPs, ToV examples, similar past convos, state, history — each slot populated from retrieval, citeable by tag.

slots: [kb-3, sop-1, tov-2, convo-5]
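Conceptually, slot filling is just routing retrieved chunks into named buckets the prompt template can cite by tag. A toy sketch — the slot names and chunk shape are illustrative assumptions:

```python
def fill_slots(chunks):
    """Group retrieved chunks into prompt slots, keyed by chunk type."""
    slots = {"facts": [], "sops": [], "tov": [], "convos": []}
    routing = {"kb": "facts", "sop": "sops", "tov": "tov", "convo": "convos"}
    for chunk in chunks:
        kind = chunk["id"].split("-")[0]        # e.g. "sop-1" -> "sop"
        slots[routing[kind]].append(chunk)
    return slots

retrieved = [
    {"id": "kb-3",  "text": "USDC pay-ins settle same-day …"},
    {"id": "sop-1", "text": "Above $10M/mo, route to custom pricing …"},
    {"id": "tov-2", "text": "Short sentences. No exclamation marks."},
]
slots = fill_slots(retrieved)
# slots["sops"][0]["id"] == "sop-1" — citeable by tag in the draft
```

Because each slot keeps its chunk IDs, the generator can emit `[sop-1]`-style citations that map straight back to an editable node in the graph.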
04
Draft

The generator drafts a reply grounded in the slots, with inline citations the sales op can double-check in one click.

reply_text: "At $50M/mo pricing is custom — [sop-1]…"
05
Score

Multi-signal confidence: retrieval coverage, intent classifier, LLM-judged groundedness, self-consistency. Below threshold → flag.

confidence: 0.87 · below_threshold: false
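A minimal version of that multi-signal score is a weighted blend with a flag threshold. The weights and the 0.7 cutoff below are invented for illustration, not the production values:

```python
def score_reply(signals, weights=None, threshold=0.7):
    """Blend per-signal scores (each in 0..1) into one confidence value
    and flag the reply for human review when it falls below threshold."""
    weights = weights or {"retrieval": 0.3, "intent": 0.2,
                          "groundedness": 0.4, "self_consistency": 0.1}
    confidence = sum(weights[name] * signals[name] for name in weights)
    return {"confidence": round(confidence, 2),
            "below_threshold": confidence < threshold}

print(score_reply({"retrieval": 0.99, "intent": 0.9,
                   "groundedness": 0.80, "self_consistency": 0.8}))
# → {'confidence': 0.88, 'below_threshold': False}
```

The useful property is the breakdown: a high blended score with a weak `groundedness` component is exactly the case a sales op wants surfaced, which is why the API returns `confidence_breakdown` alongside the scalar.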
/knowledge graph

Your ops. One graph.
Every chunk editable, every edge earned.

Product docs, SOPs, glossaries, chat transcripts, tone-of-voice samples — chunked by content-type-aware splitters, enriched with stage / intent / entities, embedded, linked. Similarity edges come from cosine neighbours; co-retrieval edges earn themselves when two chunks keep getting pulled together for real replies.
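Co-retrieval edges are easy to picture: count how often two chunks land in the same reply, and add an edge once a pair clears a floor. A toy sketch — the floor of 3 co-occurrences is an illustrative assumption:

```python
from collections import Counter
from itertools import combinations

def co_retrieval_edges(reply_logs, min_count=3):
    """Build edges between chunk pairs that keep getting retrieved together.
    reply_logs: one list of chunk IDs per answered reply."""
    pair_counts = Counter()
    for chunk_ids in reply_logs:
        for pair in combinations(sorted(set(chunk_ids)), 2):
            pair_counts[pair] += 1
    return {pair: n for pair, n in pair_counts.items() if n >= min_count}

logs = [["kb-3", "sop-1"], ["kb-3", "sop-1", "tov-2"],
        ["kb-3", "sop-1"], ["kb-3", "tov-2"]]
print(co_retrieval_edges(logs))
# → {('kb-3', 'sop-1'): 3}
```

Run over real reply logs, this is what lets the graph privilege chunks that answer questions together over chunks that merely look similar.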

[Interactive knowledge graph — color: stage (any / cold / qualifying / scheduled / scheduling / stalled), size: node degree. Hover to focus a neighborhood, click to expand.]
Hybrid retrieval, not pure vector

pgvector HNSW + Postgres FTS, fused via RRF, then an LLM listwise reranker. Metadata filters + entity-triggered lookups stack on top.

Every chunk is editable

Click a node, fix the sentence, save. The chunk re-embeds, the graph updates. No CI, no redeploys, no prompt engineering.

Edges the graph earned

Co-retrieval edges grow from real reply logs. Over time the graph shows which chunks actually answer questions — not just which ones we indexed.

/api-first · mcp-first

API-first.
MCP-first.
Human-last.

Most AI tools are UIs dragging an API behind them. asktiberius was API-first on day zero — the exact endpoints the web app uses are your endpoints too: /api/v1/*, Bearer-auth, OpenAPI spec, Swagger UI.

And because MCP is just the next layer over HTTP APIs, every asktiberius agent will ship as an MCP server — drop it into Claude, ChatGPT, Cursor, or your own agent stack and your knowledge graph becomes a first-class tool.

asktiberius · quickstart
curl -X POST "$TIB_BASE/api/v1/agents/$TIB_AGENT/reply" \
  -H "Authorization: Bearer $TIB_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "trigger_message": "Hey, what does Ivy charge for USDC pay-ins? We do ~$50M/mo EU.",
    "history": [
      {"role": "assistant", "content": "Hi — Felix from Ivy, saw your USDC volume is growing."}
    ]
  }'

# → {
#     "reply_text": "Thanks — at that volume, pricing would be custom. …",
#     "confidence": 0.87,
#     "confidence_breakdown": { "retrieval": 0.99, "groundedness": 0.80 },
#     "detected_intent": "pricing",
#     "suggested_tool": "send_calendly_link"
#   }
/agents, scaled

One agent per job.
Infinite reach.

Each agent gets its own knowledge graph, its own config, its own API keys. Supabase RLS keeps tenants clean; API keys are agent-scoped so a support bot can't answer a legal question by accident. Spin up ten — or ten thousand — from the same codebase.

Sales Pre-Discovery
B2B outbound · WhatsApp + Telegram
412 chunks · 4 intents
pricing · objection · scheduling · timeline
Tier-1 Support
Customer help · Intercom + Email
1,284 chunks · 4 intents
how-to · bug · billing · refund
HR Policy Bot
Internal · Slack DM
296 chunks · 3 intents
pto · benefits · onboarding
Legal Q&A
Contract desk · Email + Teams
174 chunks · 4 intents
nda · dpa · liability · redline
Multi-tenant by design. One Supabase project. N agents. M graphs. RLS on every table. Zero shared state between agents — and one dashboard to see them all.
/business case

Wherever humans
reply to humans.

Sales was the wedge. The pattern is universal: any chat-based job where an op needs the right context in three seconds, cites a source, and moves on.

B2B Sales

Outbound reps replying on WhatsApp / Telegram / LinkedIn need the right pricing, ToV, and SOP in three seconds.

Customer Support

Tier-1 agents covering thousands of tickets a week can't memorize every product page — but the graph can.

Partner Success

CSMs draft check-ins that pull the partner's usage, last QBR, and open tickets without switching tabs.

Internal Help Desk

Employees ping HR / IT / Finance over Slack. One graph per function, answers that cite the actual policy.

Recruiting

Sourcers mirror the candidate's tone, pull the most recent JD, and keep the story consistent across 40 replies/day.

Field Ops

Dispatch + technicians answering from the cab. Chunks land on mobile as clean, grounded text — no hallucinated part numbers.

~300ms
retrieval p95
14
content-type chunkers
~95%
groundedness (eval set)
/get started

Get a key.
Post a message.
Get a reply.

No SDK. No SaaS onboarding. A bearer token and an HTTP client are enough.

curl -X POST https://asktiberius.de/api/v1/agents/$ID/reply \
  -H "Authorization: Bearer $KEY" \
  -H "Content-Type: application/json" \
  -d '{ "trigger_message": "What are you charging per USDC txn?" }'
asktiberius
© 2026 · thinc! × Ivy · built in 48h