Indexed

Configuration Guide

Complete guide to configuring Indexed — workspace setup, credentials, storage layout, and advanced options.

By the end of this guide, you will know how to initialize and inspect your workspace, change and remove configuration values, manage connector credentials securely, read the storage layout, and tune advanced options such as chunk size and the embedding model.

This page is the walkthrough. For command flags, every config key in table form, and the INDEXED__… override pattern, use the Config commands reference (indexed config --help lists subcommands; each has its own --help).

Initialize

indexed config init creates a .indexed/ directory in your current working directory with config.toml and .env.example. To get the familiar layout under your home directory, run it after cd ~ — that yields ~/.indexed/. For a repository-scoped workspace, run it from the repo root instead.

Terminal
cd ~
indexed config init

Use --force (-f) only when you intend to overwrite existing workspace config files in that directory. Collections are not deleted by config init.

After config init, run indexed init once to download the default embedding model and create supporting directories (see indexed init --help).

Inspect

View merged configuration (global + workspace + environment):

Terminal
indexed config inspect

Show a single section or include defaults:

Terminal
indexed config inspect sources
indexed config inspect --defaults

Machine-readable JSON:

Terminal
indexed --simple-output config inspect

Global config updates

indexed config update replaces or edits global configuration — interactively, or from a file with --file path/to/config.toml. See indexed config update --help.

Change and Remove Values

indexed config set and indexed config delete modify the workspace .indexed/config.toml (the directory created by config init in your current working directory). indexed config update changes global configuration only — see Config commands reference.

Set a configuration value by its full key path:

Terminal
indexed config set core.v1.indexing.chunk_size 1024

Preview a change without applying it:

Terminal
indexed config set core.v1.indexing.chunk_size 1024 --dry-run
Would set core.v1.indexing.chunk_size = 1024 (currently 512)

Reset a key to its default value (prompts for confirmation unless you pass --force / -f):

Terminal
indexed config delete core.v1.indexing.chunk_overlap
indexed config delete core.v1.indexing.chunk_overlap --force

Validate the active configuration against the CLI’s validation rules (exits with status 1 if there are errors):

Terminal
indexed config validate

Run validate after manual edits to config.toml to catch typos before they cause indexing errors.
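
As a minimal sketch of using that exit status in a script, the helper below stops early when validation fails. The function name validate_config is hypothetical; the only behavior it relies on is that indexed config validate exits non-zero when it finds errors.

```shell
# Hypothetical helper: stop a script early when the configuration is invalid.
# Relies only on the documented behavior that `indexed config validate`
# exits with a non-zero status when it finds errors.
validate_config() {
  if indexed config validate; then
    echo "configuration is valid"
  else
    status=$?   # exit status of the validate command above
    echo "configuration has errors (exit status $status)" >&2
    return "$status"
  fi
}

# Example use before an indexing step:
#   validate_config || exit 1
```

Putting the check in a function keeps the exit status available to the caller, so a CI job can fail on it without parsing any output.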

Precedence: Files, Environment, and Flags

indexed config inspect shows the merged result: global + workspace + environment (per indexed config inspect --help). In practice, later layers override earlier ones:

  1. Built-in defaults (lowest)
  2. Global config at ~/.indexed/config.toml when it exists
  3. Workspace config at .indexed/config.toml for the workspace you initialized (overrides global for keys that are set)
  4. Environment variables — including INDEXED__… overrides and connector credential variables
  5. CLI flags on the command you run (highest)

Workspace config is whichever .indexed/ directory applies to your current working directory and setup (for example ~/.indexed/ after cd ~, or <repo>/.indexed/ from a project root).
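
The layering above can be modeled with a few lines of shell. This is an illustration of the precedence rule only, not how Indexed is implemented; resolve_value is a hypothetical name, and an empty argument stands for a layer that does not set the key.

```shell
# Illustrative model only: return the value from the highest-precedence layer
# that sets the key. Arguments are ordered lowest to highest precedence:
# default, global, workspace, environment, CLI flag.
# An empty string means "this layer does not set the key".
resolve_value() {
  result=""
  for layer in "$@"; do
    if [ -n "$layer" ]; then
      result=$layer   # a later (higher) layer replaces the earlier value
    fi
  done
  printf '%s\n' "$result"
}

# Default is 512, the workspace sets 1024, nothing else is set:
resolve_value 512 "" 1024 "" ""     # prints 1024
# Same, but the environment layer also sets 256:
resolve_value 512 "" 1024 256 ""    # prints 256
```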

Environment Variable Overrides

Any configuration key can be overridden at runtime using the INDEXED__<section>__<key> pattern. Double underscores (__) separate each level of the config hierarchy.

Terminal
export INDEXED__core__v1__indexing__chunk_size=1024
export INDEXED__core__v1__embedding__model_name="all-MiniLM-L6-v2"

Environment variables override values from config files. This is useful for CI pipelines and Docker deployments where you want to inject settings without modifying the config file.
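
The mapping from a dotted key path to its override variable is mechanical, as the two exports above suggest. The sketch below derives the variable name from a key path; env_name_for_key is a hypothetical helper, not part of the Indexed CLI.

```shell
# Hypothetical helper: turn a dotted config key path into the matching
# INDEXED__... variable name by replacing each dot with a double underscore.
env_name_for_key() {
  printf 'INDEXED__%s\n' "$(printf '%s' "$1" | sed 's/\./__/g')"
}

env_name_for_key core.v1.indexing.chunk_size
# prints INDEXED__core__v1__indexing__chunk_size
```

Standard shell syntax also allows setting such a variable for a single command by prefixing it, for example INDEXED__core__v1__indexing__chunk_size=1024 indexed config inspect, which leaves the rest of the session unchanged.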

Connector Credentials

Connector credentials are read from environment variables (never from config.toml). Names and usage are in Config commands — Connector credentials; CLI flags are in Index commands (indexed index create jira --help / indexed index create confluence --help).

| Variable | Used by | Description |
|---|---|---|
| ATLASSIAN_EMAIL | Jira Cloud, Confluence Cloud | Atlassian account email |
| ATLASSIAN_TOKEN | Jira Cloud, Confluence Cloud | Atlassian API token |
| JIRA_TOKEN | Jira Server/DC | Bearer or personal access token |
| JIRA_LOGIN | Jira Server/DC | Username (with JIRA_PASSWORD) |
| JIRA_PASSWORD | Jira Server/DC | Password (with JIRA_LOGIN) |
| CONF_TOKEN | Confluence Server/DC | Bearer or personal access token |
| CONF_LOGIN | Confluence Server/DC | Username (with CONF_PASSWORD) |
| CONF_PASSWORD | Confluence Server/DC | Password (with CONF_LOGIN) |

For token setup, Cloud vs Server/DC behavior, and troubleshooting auth, see the Jira and Confluence connector guides.

Keep credentials out of version control

Put secrets in .indexed/.env next to that workspace’s config.toml (or export them in your shell) — never in config.toml and never committed. indexed config init creates .env.example as a template; copy or rename as needed for your environment.

.env
# Jira or Confluence Cloud (typical)
ATLASSIAN_EMAIL=your-email@company.com
ATLASSIAN_TOKEN=your_api_token_here

# Jira Server/DC (example)
# JIRA_TOKEN=your_pat_here

# Confluence Server/DC (example)
# CONF_TOKEN=your_pat_here
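
For a repository-scoped workspace, one way to follow this advice is sketched below. It assumes you run it from the repo root and that the workspace directory is .indexed/; the function name secure_env_file is hypothetical, and the file permissions and ignore rule are general shell and Git practice rather than something Indexed requires.

```shell
# Hypothetical helper for a repository-scoped workspace; run from the repo root.
secure_env_file() {
  # Start from the template created by `indexed config init` if no .env exists yet.
  if [ ! -f .indexed/.env ] && [ -f .indexed/.env.example ]; then
    cp .indexed/.env.example .indexed/.env
  fi
  # Restrict the secrets file to the owner (read and write only).
  if [ -f .indexed/.env ]; then
    chmod 600 .indexed/.env
  fi
  # Add the ignore rule once: -x matches the whole line, -F treats it literally.
  if ! grep -qxF '.indexed/.env' .gitignore 2>/dev/null; then
    printf '%s\n' '.indexed/.env' >> .gitignore
  fi
}

# Example use:
#   cd /path/to/repo && secure_env_file
```

Running the helper a second time changes nothing, because both the copy and the ignore rule are skipped when already present.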

Storage Layout

All Indexed data lives at ~/.indexed/:

~/.indexed/
├── config.toml                          # Your configuration
└── data/
    └── collections/
        ├── my-docs/
        │   ├── manifest.json            # Source path, connector type, timestamps
        │   ├── documents.json           # Document metadata (title, ID, source path)
        │   ├── chunks.json              # Actual text chunks — treat as sensitive
        │   └── index.faiss              # Binary vector index
        └── eng-tickets/
            ├── manifest.json
            ├── documents.json
            ├── chunks.json
            └── index.faiss

Check disk usage for a specific collection:

Terminal
du -sh ~/.indexed/data/collections/my-docs
5.1M    /Users/you/.indexed/data/collections/my-docs

Do not manually edit collection files

Do not manually edit, move, or rename individual files within a collection directory. The four files (manifest.json, documents.json, chunks.json, index.faiss) must stay in sync. If a collection becomes corrupted, remove it with indexed index remove <name> --force and recreate it.
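
Before deciding to remove a collection, a quick read-only check can confirm whether any of the four files is missing. This sketch only tests for presence; it cannot tell whether the files are in sync with each other. check_collection is a hypothetical helper, not an Indexed command.

```shell
# Hypothetical read-only check: report which of the four expected files
# are missing from a collection directory. Returns 1 if any is missing.
check_collection() {
  dir=$1
  missing=0
  for name in manifest.json documents.json chunks.json index.faiss; do
    if [ ! -f "$dir/$name" ]; then
      echo "missing: $dir/$name"
      missing=1
    fi
  done
  return "$missing"
}

# Example use:
#   check_collection ~/.indexed/data/collections/my-docs
```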

The data directory location (~/.indexed/data/) is not currently configurable via a config key. To move it to another location (e.g., a different disk), use a symlink:

Terminal
mv ~/.indexed/data /path/to/new/location
ln -s /path/to/new/location ~/.indexed/data

Advanced Options

Chunk Size and Overlap

Chunk size controls how many tokens each embedded segment contains. Smaller chunks produce more precise retrieval — a narrow question is more likely to land on the exact answer. Larger chunks provide more context per result — useful when answers span several paragraphs.

The defaults (chunk_size = 512, chunk_overlap = 50) work well for most use cases. Increase chunk size if results feel too narrow; decrease it if results are losing their focus.

Terminal
# Larger chunks (more context per result)
indexed config set core.v1.indexing.chunk_size 1024
indexed config set core.v1.indexing.chunk_overlap 100

# Smaller chunks (more precise retrieval)
indexed config set core.v1.indexing.chunk_size 256
indexed config set core.v1.indexing.chunk_overlap 25
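
To get a feel for how these settings affect index size, the sketch below estimates how many chunks a document produces. It assumes a simple sliding window in which each chunk starts chunk_size minus chunk_overlap tokens after the previous one; that is an assumption for illustration, and Indexed's actual chunker may split differently. estimate_chunks is a hypothetical helper.

```shell
# Rough estimate under an assumed sliding-window chunker (not confirmed
# to match Indexed's implementation). Requires overlap < size.
# Arguments: document length in tokens, chunk size, chunk overlap.
estimate_chunks() {
  tokens=$1
  size=$2
  overlap=$3
  if [ "$tokens" -le "$size" ]; then
    echo 1   # the whole document fits in one chunk
    return
  fi
  stride=$((size - overlap))
  # Ceiling of (tokens - overlap) / stride using integer arithmetic.
  echo $(( (tokens - overlap + stride - 1) / stride ))
}

estimate_chunks 2000 512 50     # prints 5
estimate_chunks 2000 1024 100   # prints 3
estimate_chunks 2000 256 25     # prints 9
```

Under this model, halving the chunk size roughly doubles the number of vectors stored per document, which is worth keeping in mind for large collections.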

After changing chunk size, existing collections must be re-indexed — the stored vectors were generated with a different chunk size and will produce inconsistent results if mixed with new ones:

Terminal
indexed index create files -c my-docs -p ./docs --force

Embedding Model

The default embedding model is all-MiniLM-L6-v2, an 80 MB sentence-transformers model that balances retrieval quality with speed and size. It runs locally via ONNX Runtime.

Any model from the sentence-transformers collection on HuggingFace can be used:

Terminal
indexed config set core.v1.embedding.model_name "paraphrase-multilingual-MiniLM-L12-v2"

Changing the model invalidates all existing collections

Each embedding model produces vectors in its own vector space, so vectors from different models are not comparable. If you change the model, you must re-index all existing collections; querying an index built with one model using embeddings from another returns meaningless results. Run indexed index create ... --force for each collection after changing the model.

Stick with all-MiniLM-L6-v2 unless you have a specific reason to change — for example, multilingual content (use paraphrase-multilingual-MiniLM-L12-v2) or domain-specific retrieval.

Caching

Indexed caches parsed document representations so that re-indexing (e.g., after adding files) doesn't re-parse documents that haven't changed. The cache is used by default.

Bypass the cache when source files have changed but their timestamps have not (e.g., after copying files with a tool that preserves modification times, such as rsync -t or cp -p):

Terminal
indexed index create files -c my-docs -p ./docs --no-cache

Re-enable the cache explicitly (it is on by default):

Terminal
indexed index create files -c my-docs -p ./docs --use-cache
