raggity
Local-first RAG over your notes, docs, and PDFs, with answers from Claude by default.
Hybrid retrieval (dense + BM25 + RRF), cross-encoder reranking, dedup, verified inline citations, and selective abstention: raggity only answers when it has evidence.
What makes raggity different
| Feature | raggity |
|---|---|
| Hybrid retrieval | Dense vector search + BM25 full-text search, fused with Reciprocal Rank Fusion (RRF, k=60) |
| Cross-encoder reranking | Local ONNX cross-encoder re-scores every candidate — no blind top-k |
| Selective abstention | Returns "I don't have enough information" instead of hallucinating |
| Verified citations | Inline citation markers are cross-checked against retrieved sources before display |
| Zero-GPU default | CPU-only ONNX Runtime — works on any machine |
| Three LLM backends | Claude (default), OpenAI-compatible APIs, Ollama (offline) |
| Two vector stores | LanceDB (local, zero-config) or Qdrant (scalable, multi-user) |
| Full pipeline | Query transforms, parent-doc retrieval, GraphRAG, semantic answer cache, SSE server |
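To make the hybrid retrieval row concrete, here is a minimal sketch of Reciprocal Rank Fusion with k=60. It is an illustration under stated assumptions, not raggity's actual code: the function name `rrf_fuse`, the document IDs, and the tie-breaking rule are all hypothetical.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse ranked lists of document IDs with Reciprocal Rank Fusion.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with ranks starting at 1. (Hypothetical helper, not raggity's API.)
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


dense = ["a", "b", "c"]  # hypothetical dense-search ranking, best first
bm25 = ["b", "d", "a"]   # hypothetical BM25 ranking, best first
fused = rrf_fuse([dense, bm25])
print(fused)
```

Documents that rank well in both lists ("a" and "b" here) rise above documents that appear in only one, which is why rank-based fusion works without calibrating dense and BM25 scores against each other.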
Architecture overview
    Sources → Chunker → Embedder → LanceDB / Qdrant
                                          |
    Query ────────────────────────────────┤
                             dense search ├── RRF fusion (k=60)
                             BM25/FTS     ┘           |
                                            Cross-encoder rerank
                                                      |
                                            Dedup (cosine ≥ 0.92)
                                                      |
                                        Optional rerank-score filter
                                                      |
                                         Lost-in-the-middle reorder
                                                      |
                                          Claude Agent SDK → Answer
                                          (with verified citations)
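Two of the post-rerank stages in the diagram, dedup and lost-in-the-middle reordering, can be sketched in a few lines. This is a sketch under assumptions rather than raggity's implementation: the helper names `cosine`, `dedup`, and `reorder_for_long_context` are hypothetical, dedup is assumed to be a greedy pass over embeddings in rank order, and the reorder is assumed to place the strongest chunks at the edges of the context.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def dedup(candidates, threshold=0.92):
    """Greedy near-duplicate removal (assumed strategy).

    candidates: list of (doc_id, embedding) pairs, best-ranked first.
    A candidate is dropped if its cosine similarity to any already-kept
    candidate is >= threshold.
    """
    kept = []
    for doc_id, vec in candidates:
        if all(cosine(vec, kept_vec) < threshold for _, kept_vec in kept):
            kept.append((doc_id, vec))
    return kept


def reorder_for_long_context(ranked):
    """Put the strongest items at the start and end, the weakest in the middle."""
    front, back = [], []
    for i, item in enumerate(ranked):
        (front if i % 2 == 0 else back).append(item)
    return front + back[::-1]


# Hypothetical 2-d embeddings: "b" is a near-duplicate of "a".
candidates = [("a", [1.0, 0.0]), ("b", [0.99, 0.05]), ("c", [0.0, 1.0])]
unique_ids = [doc_id for doc_id, _ in dedup(candidates)]
print(unique_ids)
print(reorder_for_long_context([1, 2, 3, 4, 5]))
```

With ranks 1 (best) through 5, the reorder yields 1, 3, 5, 4, 2: the two best chunks sit at the edges of the prompt, where long-context models tend to attend most reliably.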
Quick links
- Installation
- Quickstart
- Configuration reference
- Backends — LLM + vector stores
- Retrieval pipeline
- Ingestion — file types & connectors
- Server & API
- Deploy — Docker & observability
License
GNU AGPL-3.0-or-later. If you modify raggity and distribute it — or run a modified version as a hosted service — you must release your source under the AGPL as well. Using raggity as-is to query your own documents has no such obligation.