What is Indexed
Indexed is a local-first indexing engine that gives AI agents deep context over your codebase, documentation, Jira tickets, and Confluence pages. It builds a hybrid search index (BM25 + dense vectors) on your machine and exposes it via the Model Context Protocol (MCP) — so any MCP-compatible AI tool can search your indexed knowledge as naturally as it reads files.
Works with Claude Code, Claude Desktop, OpenAI Codex, Cursor, GitHub Copilot, and any other MCP-compatible agent.
What
Indexed turns your local sources into searchable collections. A collection is a named, typed snapshot of a source — files, a Jira project, or a Confluence space — stored as a local FAISS vector index at ~/.indexed/data/collections/<name>/.
~/.indexed/data/collections/my-docs/
├── manifest.json # source path, connector type, timestamps
├── documents.json # parsed document metadata
├── chunks.json # chunk text and positions
└── index.faiss # the binary vector index
Three steps to working search:
- Connect — point Indexed at a source (a directory, a Jira project, or a Confluence space)
- Index — code is parsed via tree-sitter AST, documents via Docling, everything is chunked and embedded using a local model. No API calls, no data sent anywhere.
- Search — hybrid BM25 + dense vector retrieval from the CLI, or let AI agents search via MCP
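To make the "hybrid" part of the Search step concrete, here is a minimal, self-contained sketch of combining a BM25 ranking with a dense-vector ranking. It is an illustration only: the function names are hypothetical, the vectors are hand-written toy values rather than real embeddings, and reciprocal rank fusion is an assumed merging strategy. The source does not say how Indexed actually fuses the two result lists.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized document against the query with Okapi BM25."""
    n = len(docs)
    avg_len = sum(len(d) for d in docs) / n
    # Document frequency: in how many documents each term appears.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rrf(rankings, k=60):
    """Reciprocal rank fusion: each ranking contributes 1 / (k + rank)."""
    fused = Counter()
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document id.
    return [doc_id for doc_id, _ in sorted(fused.items(), key=lambda x: (-x[1], x[0]))]


def hybrid_search(query_tokens, query_vec, docs, doc_vecs):
    """Rank documents by fusing a lexical (BM25) and a dense (cosine) ranking."""
    lexical = bm25_scores(query_tokens, docs)
    dense = [cosine(query_vec, v) for v in doc_vecs]
    by_lexical = sorted(range(len(docs)), key=lambda i: -lexical[i])
    by_dense = sorted(range(len(docs)), key=lambda i: -dense[i])
    return rrf([by_lexical, by_dense])


# Toy corpus: the vectors are made up for illustration, not model output.
docs = [["sso", "refresh", "logic"], ["deploy", "runbook"], ["auth", "token", "rotation"]]
doc_vecs = [[0.9, 0.1], [0.0, 1.0], [0.8, 0.2]]
print(hybrid_search(["auth", "token", "rotation"], [1.0, 0.0], docs, doc_vecs))
```

In this toy corpus the first document shares no keywords with the query, so BM25 alone gives it a score of zero; the dense ranking is what lets it compete with the keyword match. That is the gap hybrid retrieval is meant to close.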
Why
Your Jira tickets are a black hole
Your team has two years of Jira tickets — decisions, bug investigations, architecture discussions, deployment runbooks. A new engineer asks "how do we handle auth token rotation?" The answer is in PROJ-1847 from eight months ago, but Jira search is keyword-based and the ticket title says "SSO refresh logic." Indexed makes all of it semantically searchable — from the CLI or directly through your AI agent.
Your AI agent can't see your full codebase
Coding agents like Claude Code read files one at a time and grep for keywords. That works for small repos, but it burns tokens on large ones and misses semantic connections. Indexed gives agents a pre-built search index over your entire codebase — AST-aware via tree-sitter — so they find the right function, module, or pattern across thousands of files in milliseconds.
Your docs and wikis are invisible to AI
Confluence pages, Markdown vaults, Obsidian notebooks, Notion exports, PDFs — your team wrote all of it, but your AI assistant can't see any of it. Indexed parses and indexes them with Docling (including OCR for scanned documents) and makes them searchable through one unified MCP endpoint.
Privacy
Your data never leaves your machine
Indexed runs a local embedding model (all-MiniLM-L6-v2) directly on your machine via ONNX Runtime. The only network calls are to your own Jira/Confluence instances when fetching documents, plus a one-time download of the embedding model on first run. None of your data is sent to any third party. No telemetry. Everything is stored locally at ~/.indexed/.
What stays local — everything:
- Parsing, chunking, and embedding run entirely on your machine
- The FAISS vector index lives at ~/.indexed/ and never leaves your disk
- Search queries are embedded locally; no query text is sent to any external service
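Because a collection is just ordinary files on disk, you can audit what is stored with standard tools. The sketch below assumes the collection layout described on this page (a directory containing `manifest.json` alongside the other files); `describe_collection` is a hypothetical helper, not part of Indexed, and it treats the manifest as opaque JSON because its exact fields are not documented here.

```python
import json
from pathlib import Path


def describe_collection(collection_dir):
    """Return per-file sizes and the parsed manifest for one collection directory."""
    root = Path(collection_dir).expanduser()
    # Size in bytes of every regular file directly inside the collection.
    sizes = {p.name: p.stat().st_size for p in sorted(root.iterdir()) if p.is_file()}
    manifest_path = root / "manifest.json"
    manifest = None
    if manifest_path.exists():
        # No particular keys are assumed; the manifest is returned as parsed.
        manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
    return {"files": sizes, "manifest": manifest}


if __name__ == "__main__":
    # "my-docs" is the example collection name used on this page.
    example = Path("~/.indexed/data/collections/my-docs").expanduser()
    if example.exists():
        print(describe_collection(example))
```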
What network calls happen:
| When | What | Where |
|---|---|---|
| Indexing Jira | Fetches issues via Jira REST API | Your Jira instance |
| Indexing Confluence | Fetches pages via Confluence REST API | Your Confluence instance |
| First run | Downloads the embedding model (~80 MB) | HuggingFace (once only) |
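After the initial model download, Indexed works completely offline for local sources and for search; only indexing Jira or Confluence needs a connection, and only to your own instances. There is no telemetry, no analytics, no usage reporting of any kind.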
Protect your index directory
~/.indexed/data/collections/*/chunks.json contains actual text excerpts from your documents. Treat ~/.indexed/ with the same access controls as the source documents.
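One way to apply that advice on a POSIX system is to restrict the index directory to its owner. This is a minimal sketch, not something Indexed does for you as far as this page states: `restrict_to_owner` is a hypothetical helper, and the 0700/0600 modes are an assumed policy that you should adapt to match the access rules of your source documents.

```python
import os
import stat
from pathlib import Path


def restrict_to_owner(root):
    """Make every directory 0700 and every file 0600 under root (POSIX only)."""
    root = Path(root).expanduser()
    for dirpath, _dirnames, filenames in os.walk(root):
        # Owner keeps read/write/traverse on directories; group and others get nothing.
        os.chmod(dirpath, stat.S_IRWXU)
        for name in filenames:
            # Owner keeps read/write on files such as chunks.json.
            os.chmod(os.path.join(dirpath, name), stat.S_IRUSR | stat.S_IWUSR)


if __name__ == "__main__":
    index_dir = Path("~/.indexed").expanduser()
    if index_dir.exists():
        restrict_to_owner(index_dir)
```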
Continue reading: Quick Start · Indexing Overview · Configuration Guide