What is Indexed

Indexed is a local-first indexing engine that builds a hybrid search index (BM25 + dense vectors) over your codebase, documentation, Jira tickets, and Confluence pages — all on your machine, with no data leaving your environment.

Search directly from the CLI (indexed index search), or expose the same index to AI agents via MCP. Both are first-class: CLI for quick lookups and scripting, MCP for letting Claude Code, Cursor, Copilot, or any MCP-compatible agent search for you.

What

Indexed turns your local sources into searchable collections. A collection is a named, typed snapshot of a source — files, a Jira project, or a Confluence space — stored as a local FAISS vector index at ~/.indexed/data/collections/<name>/.

~/.indexed/data/collections/my-docs/
├── manifest.json      # source path, connector type, timestamps
├── documents.json     # parsed document metadata
├── chunks.json        # chunk text and positions
└── index.faiss        # the binary vector index
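
As a quick sanity check on that layout, the sketch below lists each collection directory and reports any of the four files that are absent. Only the file names and the ~/.indexed/data/collections/ path come from the layout above; the helper names (missing_collection_files, list_collections) are hypothetical and not part of Indexed, and the code does not assume anything about what is inside the files.

```python
from pathlib import Path

# File names taken from the layout above; the helpers are hypothetical.
EXPECTED_FILES = ("manifest.json", "documents.json", "chunks.json", "index.faiss")


def missing_collection_files(collection_dir: Path) -> list[str]:
    """Return the expected file names that are absent from a collection directory."""
    return [name for name in EXPECTED_FILES if not (collection_dir / name).is_file()]


def list_collections(root: Path) -> dict[str, list[str]]:
    """Map each collection name under `root` to its list of missing files."""
    if not root.is_dir():
        return {}
    return {
        entry.name: missing_collection_files(entry)
        for entry in sorted(root.iterdir())
        if entry.is_dir()
    }


if __name__ == "__main__":
    root = Path.home() / ".indexed" / "data" / "collections"
    for name, missing in list_collections(root).items():
        status = "ok" if not missing else "missing: " + ", ".join(missing)
        print(f"{name}: {status}")
```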

Three steps to working search:

1. Connect: point Indexed at a source (a directory, a Jira project, or a Confluence space).
2. Index: code is parsed into an AST with tree-sitter and documents are parsed with Docling; everything is then chunked and embedded using a local model. Embedding makes no external API calls, and no data is sent anywhere.
3. Search: run indexed index search from the CLI, or connect an MCP-compatible agent to search for you. Both paths use the same index.
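
To make "hybrid search" concrete, here is a toy sketch of one common way to merge a keyword (BM25) ranking with a dense-vector ranking: reciprocal rank fusion. This page does not say how Indexed combines the two signals, so treat the technique, the function name, the constant k = 60, and every ticket ID except PROJ-1847 as illustrative assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of document IDs into one ranking.

    Each list contributes 1 / (k + rank) to a document's score, so items
    that rank well in more than one list rise to the top.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results for one query from the two retrievers.
bm25_ranking = ["PROJ-1847", "PROJ-2001", "PROJ-0042"]
dense_ranking = ["PROJ-0042", "PROJ-1847", "PROJ-3310"]

fused = reciprocal_rank_fusion([bm25_ranking, dense_ranking])
print(fused)
```

Documents that appear in both lists (here PROJ-1847 and PROJ-0042) outrank those found by only one retriever, which is the behavior a hybrid index is after.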

Why

Your Jira tickets are a black hole

Your team has two years of Jira tickets — decisions, bug investigations, architecture discussions, deployment runbooks. A new engineer asks "how do we handle auth token rotation?" The answer is in PROJ-1847 from eight months ago, but Jira search is keyword-based and the ticket title says "SSO refresh logic." Indexed makes all of it semantically searchable — from the CLI or directly through your AI agent.

Your AI agent can't see your full codebase

Coding agents like Claude Code read files one at a time and grep for keywords. That works for small repos, but it burns tokens on large ones and misses semantic connections. Indexed gives agents a pre-built search index over your entire codebase — AST-aware via tree-sitter — so they find the right function, module, or pattern across thousands of files in milliseconds.

Your docs and wikis are invisible to AI

Confluence pages, Markdown vaults, Obsidian notebooks, Notion exports, PDFs — your team wrote all of it, but your AI assistant can't see any of it. Indexed parses and indexes them with Docling (including OCR for scanned documents) and makes them searchable — from the CLI or via MCP in your AI agent.

Privacy

Your data never leaves your machine

Indexed runs a local embedding model (all-MiniLM-L6-v2) directly on your machine via ONNX Runtime. Apart from a one-time download of that model on first run, the only network calls go to your own Jira and Confluence instances when fetching documents. No data is sent to any third party, and there is no telemetry. Everything is stored locally under ~/.indexed/.

What stays local — everything:

Parsing, chunking, and embedding run entirely on your machine
The FAISS vector index lives at ~/.indexed/ and never leaves your disk
Search queries are embedded locally; no query text is sent to any external service

What network calls happen:

When	What	Where
Indexing Jira	Fetches issues via Jira REST API	Your Jira instance
Indexing Confluence	Fetches pages via Confluence REST API	Your Confluence instance
First run	Downloads the embedding model (~80 MB)	HuggingFace (once only)

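After the initial model download, searching and indexing local files work completely offline; only Jira and Confluence indexing needs network access, and only to your own instances. There is no telemetry, no analytics, no usage reporting of any kind.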

Protect your index directory

~/.indexed/data/collections/*/chunks.json contains actual text excerpts from your documents. Treat ~/.indexed/ with the same access controls as the source documents.
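
One way to apply that advice on a POSIX system is to make the directory readable by its owner only. The sketch below is a suggestion, not something Indexed ships or requires: restrict_to_owner is a hypothetical helper, and the 0o700 / 0o600 modes are an assumption about what "same access controls" means for a single-user machine.

```python
import os
import stat
from pathlib import Path


def restrict_to_owner(root: Path) -> None:
    """Make `root` and everything under it accessible to the owning user only."""
    os.chmod(root, stat.S_IRWXU)  # 0o700 on the top-level directory
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):  # leave symlink targets untouched
                os.chmod(path, stat.S_IRWXU)  # 0o700 for directories
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)  # 0o600 for files
```

Calling restrict_to_owner(Path.home() / ".indexed") would apply it to the index directory. On shared machines or Windows, use the platform's own ACL tooling instead.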

Continue reading: Quick Start · Indexing Overview · Configuration Guide
