Embedding Model
Kioku uses nomic-embed-text-v2-moe via Ollama for all embeddings.
| Metric | Value |
|---|---|
| MTEB score | 63.9 |
| Dimensions | 256–768 (configurable) |
| Latency (GPU) | 5–20ms |
| Latency (CPU) | 50–200ms |
| Cost | Free (compute only) |
| Privacy | Data stays on your server |
It outperforms text-embedding-3-small (62.3 MTEB) on benchmarks while running entirely on your hardware.
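The configurable 256–768 dimension range can be illustrated with a small sketch. It assumes the shorter sizes are produced by Matryoshka-style truncation (keep the leading components, then re-normalize to unit length); this page does not say how Kioku applies the setting, and `truncate_embedding` is a hypothetical helper name.

```rust
/// Keep the first `dims` components of an embedding and rescale the result
/// to unit length. Hypothetical helper: assumes the configured dimension is
/// applied by truncating the full 768-dimensional vector.
///
/// Returns `None` if `dims` is zero, exceeds the input length, or the
/// truncated vector has zero norm (so it cannot be normalized).
fn truncate_embedding(full: &[f32], dims: usize) -> Option<Vec<f32>> {
    if dims == 0 || dims > full.len() {
        return None;
    }
    let head = &full[..dims];
    // L2 norm of the truncated vector.
    let norm = head.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm == 0.0 {
        return None;
    }
    Some(head.iter().map(|x| x / norm).collect())
}
```

Calling `truncate_embedding(&vector, 256)` on a 768-dimensional vector would give a unit-length 256-dimensional vector. Smaller vectors reduce storage and search cost, usually with some loss of retrieval quality.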
Pipeline
Documents (PDF)
- PDF uploaded via `POST /knowledge/documents`
- Text extracted using the `pdf-extract` crate
- Text split into chunks
- Each chunk embedded via Ollama HTTP API
- Embeddings + metadata stored in Qdrant
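The chunking step above could look roughly like the following. This is a minimal sketch, not Kioku's actual splitter: it assumes word-based chunks with a fixed overlap, and the function name `chunk_words` and its parameters are hypothetical.

```rust
/// Split `text` into chunks of at most `max_words` whitespace-separated
/// words. Consecutive chunks share `overlap` words so that a sentence cut
/// at a boundary still appears whole in one of them.
///
/// Panics if `max_words` is zero or `overlap >= max_words`.
fn chunk_words(text: &str, max_words: usize, overlap: usize) -> Vec<String> {
    assert!(
        max_words > 0 && overlap < max_words,
        "overlap must be smaller than max_words"
    );
    let words: Vec<&str> = text.split_whitespace().collect();
    let step = max_words - overlap;
    let mut chunks = Vec::new();
    let mut start = 0;
    while start < words.len() {
        let end = (start + max_words).min(words.len());
        chunks.push(words[start..end].join(" "));
        if end == words.len() {
            break; // the last chunk reached the end of the text
        }
        start += step;
    }
    chunks
}
```

Each returned string would then be sent to the embedding model as one input. The overlap trades some duplicated storage for better recall on passages that straddle a chunk boundary.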
Meetings (Transcripts)
- Transcript ingested via `POST /meetings`
- Each transcript segment (speaker + text + timestamps) embedded
- Embeddings + meeting metadata stored in Qdrant
- Searchable via `POST /knowledge/search`
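To make the segment step concrete, here is a sketch of how one transcript segment might be turned into the text that gets embedded and the metadata stored next to it. The `Segment` struct, its field names, the `meeting_id` key, and both helper functions are assumptions for illustration; the actual request and payload schema is not shown on this page.

```rust
/// One transcript segment: speaker, text, and timestamps.
/// Field names are illustrative, not Kioku's actual schema.
struct Segment {
    speaker: String,
    text: String,
    start_ms: u64,
    end_ms: u64,
}

/// Text sent to the embedding model. Prefixing the speaker name is one
/// possible choice that lets a search match on who said something.
fn embedding_input(seg: &Segment) -> String {
    format!("{}: {}", seg.speaker.trim(), seg.text.trim())
}

/// Key/value metadata to store alongside the vector, so a search hit can
/// be traced back to a meeting and a time range.
fn segment_metadata(meeting_id: &str, seg: &Segment) -> Vec<(&'static str, String)> {
    vec![
        ("meeting_id", meeting_id.to_string()),
        ("speaker", seg.speaker.trim().to_string()),
        ("start_ms", seg.start_ms.to_string()),
        ("end_ms", seg.end_ms.to_string()),
    ]
}
```

Keeping timestamps in the metadata rather than in the embedded text means they do not influence similarity scores but are still available when rendering a result.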
