Architecture

How Kioku's components fit together.

┌─────────────────────────────────────────────────────────────┐
│                        Client Layer                          │
│  Kioku CLI (Rust)    MCP Client    HTTP API Consumer        │
└────────────┬──────────────┬─────────────┬───────────────────┘
             │              │             │
┌────────────▼──────────────▼─────────────▼───────────────────┐
│                    Hivemind API (:9100)                      │
│  Auth + Sessions + Knowledge Search + MCP Server             │
│  Embeddings (Ollama) → Vector Store (Qdrant)                 │
│  Postgres (pgvector) for relational data                    │
└────────────┬────────────────────────────────────┬───────────┘
             │                                     │
             │  POST /vexa/bots                    │ knowledge search
             ▼                                     ▼
┌──────────────────────────┐    ┌───────────────────────────────┐
│    Vexa API Gateway       │    │     Knowledge Pipeline         │
│  (:8056)                  │    │  PDF → text → embed → Qdrant   │
│  ├─ Meeting API (8080)    │    │  Meeting transcript → embed    │
│  ├─ Admin API (8001)      │    └───────────────────────────────┘
│  ├─ Agent API (8100)      │
│  ├─ Runtime API (8090)    │
│  │   └─ spawns bot pods   │
│  ├─ MCP (18888)           │
│  ├─ TTS Service (8002)    │
│  ├─ Transcription (80)    │
│  ├─ Redis (6379)          │
│  └─ MinIO (9000)          │
└──────────────────────────┘
         │ Runtime API
         ▼
┌──────────────────┐
│   Bot Pod (GPU)   │
│  Playwright +     │
│  Whisper + Xvfb   │
│  Lives per meeting│
└──────────────────┘
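For local debugging it can help to have the ports from the diagram in one place. The snippet below is a minimal sketch, not part of Kioku itself: the service labels, the `address` and `is_reachable` helpers, and the `localhost` default are all assumptions made for illustration. Only the port numbers come from the diagram.

```python
import socket

# Port numbers are taken from the diagram above.
# The dictionary keys are hypothetical labels, not official service names.
SERVICE_PORTS = {
    "hivemind-api": 9100,
    "vexa-gateway": 8056,
    "meeting-api": 8080,
    "admin-api": 8001,
    "agent-api": 8100,
    "runtime-api": 8090,
    "mcp": 18888,
    "tts": 8002,
    "transcription": 80,
    "redis": 6379,
    "minio": 9000,
}


def address(service: str, host: str = "localhost") -> tuple[str, int]:
    """Return the (host, port) pair for a service label.

    The default host is an assumption; a real deployment will likely
    use cluster-internal hostnames instead.
    """
    return (host, SERVICE_PORTS[service])


def is_reachable(service: str, host: str = "localhost", timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to the service port succeeds."""
    try:
        with socket.create_connection(address(service, host), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling `is_reachable("runtime-api")` only checks that something accepts TCP connections on that port; it does not verify that the service behind it is healthy.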

Data Flow

Meeting → Knowledge

1. User requests a bot via POST /vexa/bots (Hivemind proxies the request to Vexa)
2. Vexa runtime-api spawns a GPU bot pod
3. Bot joins the meeting (Google Meet, Zoom, or Teams) and captures audio
4. Whisper transcribes audio in real time → Redis streams
5. Transcription collector writes to Postgres
6. Meeting completes → transcript sent to Hivemind via POST /meetings
7. Hivemind embeds transcript chunks → Qdrant
8. Transcript becomes searchable via POST /knowledge/search
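From the client's side, the first and last steps of this flow are plain HTTP calls to Hivemind. The following is a rough sketch using only the Python standard library. The base URL host, the bearer-token header, and the JSON field names (`meeting_url`, `query`) are assumptions for illustration; only the paths and the `:9100` port come from this page.

```python
import json
import urllib.request

# Port 9100 is from the diagram; the host is an assumption.
HIVEMIND_URL = "http://localhost:9100"


def hivemind_post(path: str, payload: dict, token: str) -> urllib.request.Request:
    """Build (but do not send) an authenticated JSON POST to Hivemind."""
    return urllib.request.Request(
        url=f"{HIVEMIND_URL}{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: bearer-token auth; the real scheme may differ.
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


# First step: ask Hivemind to proxy a bot request to Vexa.
# The "meeting_url" field name is hypothetical.
bot_req = hivemind_post(
    "/vexa/bots",
    {"meeting_url": "https://meet.google.com/abc-defg-hij"},
    token="TOKEN",
)

# Last step: search the embedded transcript once the meeting is done.
# The "query" field name is hypothetical.
search_req = hivemind_post("/knowledge/search", {"query": "action items"}, token="TOKEN")

# To actually send a request: urllib.request.urlopen(bot_req)
```

Building the requests separately from sending them keeps the sketch runnable without a live Hivemind instance.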

Document → Knowledge

1. User uploads a PDF via the CLI or POST /knowledge/documents
2. Hivemind extracts text (pdf-extract)
3. Text chunked → embedded via Ollama → stored in Qdrant
4. Searchable via POST /knowledge/search
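The chunking step can be illustrated with a simple fixed-size splitter with overlap. This is only a sketch of the general technique: this page does not say how Hivemind actually chunks text, so the character-based splitting, the `chunk_text` name, and the default sizes are all assumptions.

```python
def chunk_text(text: str, size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into character chunks of at most `size`, each sharing
    `overlap` characters with the previous chunk.

    Hypothetical helper: Hivemind's real chunking strategy may differ
    (for example, it could split on tokens or sentences).
    """
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("require size > 0 and 0 <= overlap < size")

    step = size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        # Stop once this chunk reaches the end, so the tail is not repeated.
        if start + size >= len(text):
            break
        start += step
    return chunks
```

Each returned chunk would then be embedded and stored as its own vector. The overlap keeps a sentence that straddles a boundary present in full in at least one chunk, as long as it is shorter than the overlap.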

MCP Integration

AI client (Claude, Cursor) connects to Hivemind MCP endpoint
MCP tools available: kioku_search, kioku_list_meetings, etc.
AI client can search knowledge, list meetings, and get transcripts, all through an authenticated MCP session
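MCP messages are JSON-RPC 2.0, and tools are invoked with the `tools/call` method. As a sketch, a call to `kioku_search` might be built as shown below. The `mcp_tool_call` helper and the `query` argument name are assumptions, since the tool's input schema is not documented on this page; a real client would read the schema from the server's `tools/list` response.

```python
import json
from itertools import count

# JSON-RPC requests need a unique id per session; a counter is enough here.
_request_ids = count(1)


def mcp_tool_call(tool: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return json.dumps(
        {
            "jsonrpc": "2.0",
            "id": next(_request_ids),
            "method": "tools/call",
            "params": {"name": tool, "arguments": arguments},
        }
    )


# The "query" argument name is hypothetical.
message = mcp_tool_call("kioku_search", {"query": "Q3 roadmap decisions"})
```

In practice an MCP client library handles this framing, along with session setup and authentication, so hand-built messages like this are mainly useful for debugging.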
