Configuration Guide
By the end of this guide, you will know how to initialize and inspect your workspace, change and remove configuration values, manage connector credentials securely, read the storage layout, and tune advanced options such as chunk size and the embedding model.
This page is the walkthrough. For command flags, every config key in table form, and the INDEXED__… override pattern, use the Config commands reference (indexed config --help lists subcommands; each has its own --help).
Initialize
indexed config init creates a .indexed/ directory in your current working directory with config.toml and .env.example. To get the familiar layout under your home directory, run it after cd ~ — that yields ~/.indexed/. For a repository-scoped workspace, run it from the repo root instead.
cd ~
indexed config init
Use --force (-f) only when you intend to overwrite existing workspace config files in that directory. Collections are not deleted by config init.
After config init, run indexed init once to download the default embedding model and create supporting directories (see indexed init --help).
Inspect
View merged configuration (global + workspace + environment):
indexed config inspect
Show a single section or include defaults:
indexed config inspect sources
indexed config inspect --defaults
Machine-readable JSON:
indexed --simple-output config inspect
Global config updates
indexed config update replaces or edits global configuration — interactively, or from a file with --file path/to/config.toml. See indexed config update --help.
Change and Remove Values
indexed config set and indexed config delete modify the workspace .indexed/config.toml (the directory created by config init in your current working directory). indexed config update changes global configuration only — see Config commands reference.
Set a configuration value by its full key path:
indexed config set core.v1.indexing.chunk_size 1024
Preview a change without applying it:
indexed config set core.v1.indexing.chunk_size 1024 --dry-run
Would set core.v1.indexing.chunk_size = 1024 (currently 512)
Reset a key to its default value (prompts for confirmation unless you pass --force / -f):
indexed config delete core.v1.indexing.chunk_overlap
indexed config delete core.v1.indexing.chunk_overlap --force
Validate the active configuration against the CLI’s validation rules (exits with status 1 if there are errors):
indexed config validate
Run validate after manual edits to config.toml to catch typos before they cause indexing errors.
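Because validation signals errors through its exit status, it can gate a script or CI job. The following is a minimal sketch, not part of the CLI: validate_config is a hypothetical wrapper name, and it assumes only that indexed config validate exits non-zero on errors, as described above.

```shell
# Hypothetical wrapper for scripts and CI jobs: stop early when validation fails.
# Relies only on the documented behavior that a failing validation exits non-zero.
validate_config() {
  if indexed config validate; then
    echo "config OK"
  else
    echo "config validation failed; fix config.toml before indexing" >&2
    return 1
  fi
}
```

Calling validate_config before any indexed index create step keeps a broken config.toml from reaching the indexing stage.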
Precedence: Files, Environment, and Flags
indexed config inspect shows the merged result: global + workspace + environment (per indexed config inspect --help). In practice, later layers override earlier ones:
- Built-in defaults (lowest)
- Global config at ~/.indexed/config.toml when it exists
- Workspace config at .indexed/config.toml for the workspace you initialized (overrides global for keys that are set)
- Environment variables, including INDEXED__… overrides and connector credential variables
- CLI flags on the command you run (highest)
Workspace config is whichever .indexed/ directory applies to your current working directory and setup (for example ~/.indexed/ after cd ~, or <repo>/.indexed/ from a project root).
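One way to picture the layering is "the first layer that sets a value wins", reading from the highest-precedence layer down. The sketch below is only a mental model, not how the CLI is implemented; first_set is a hypothetical helper and the sample values are made up for illustration.

```shell
# Illustrative only: print the first non-empty argument.
# Pass candidates from highest precedence (CLI flag) down to lowest (built-in default).
first_set() {
  for candidate in "$@"; do
    if [ -n "$candidate" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}

# Made-up values: no CLI flag, environment sets 1024, workspace sets 768,
# global config is silent, built-in default is 512.
first_set "" "1024" "768" "" "512"
```

With these inputs the environment value is printed, because no CLI flag is present and the environment outranks both config files.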
Environment Variable Overrides
Any configuration key can be overridden at runtime using the INDEXED__<section>__<key> pattern. Double underscores (__) separate each level of the config hierarchy.
export INDEXED__core__v1__indexing__chunk_size=1024
export INDEXED__core__v1__embedding__model_name="all-MiniLM-L6-v2"
Environment variables override values from config files. This is useful for CI pipelines and Docker deployments where you want to inject settings without modifying the config file.
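The variable name can be derived mechanically from a dotted key path. A small sketch, assuming every dot maps to a double underscore and the key segments keep their original case, as in the examples above; indexed_env_name is a hypothetical helper, not a CLI command.

```shell
# Hypothetical helper: turn a dotted config key into its override variable name.
# Assumption: each "." becomes "__" and segment case is preserved.
indexed_env_name() {
  printf 'INDEXED__%s\n' "$(printf '%s' "$1" | sed 's/\./__/g')"
}

# Export an override for the current shell session.
export "$(indexed_env_name core.v1.indexing.chunk_size)=1024"
```

This avoids hand-typing long variable names when a script overrides several keys.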
Connector Credentials
Connector credentials are read from environment variables (never from config.toml). Names and usage are in Config commands — Connector credentials; CLI flags are in Index commands (indexed index create jira --help / indexed index create confluence --help).
| Variable | Used by | Description |
|---|---|---|
| ATLASSIAN_EMAIL | Jira Cloud, Confluence Cloud | Atlassian account email |
| ATLASSIAN_TOKEN | Jira Cloud, Confluence Cloud | Atlassian API token |
| JIRA_TOKEN | Jira Server/DC | Bearer or personal access token |
| JIRA_LOGIN | Jira Server/DC | Username (with JIRA_PASSWORD) |
| JIRA_PASSWORD | Jira Server/DC | Password (with JIRA_LOGIN) |
| CONF_TOKEN | Confluence Server/DC | Bearer or personal access token |
| CONF_LOGIN | Confluence Server/DC | Username (with CONF_PASSWORD) |
| CONF_PASSWORD | Confluence Server/DC | Password (with CONF_LOGIN) |
For token setup, Cloud vs Server/DC behavior, and troubleshooting auth, see the Jira and Confluence connector guides.
Keep credentials out of version control
Put secrets in .indexed/.env next to that workspace’s config.toml (or export them in your shell) — never in config.toml and never committed. indexed config init creates .env.example as a template; copy or rename as needed for your environment.
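For a repository-scoped workspace, one way to make sure the secrets file is never committed is to list it in .gitignore. This is a sketch that assumes the workspace sits in a Git repository; ignore_path is a hypothetical helper, and the .indexed/.env path assumes it is run from the repository root.

```shell
# Hypothetical helper: add an entry to a .gitignore file only if it is not already listed.
ignore_path() {
  entry=$1
  file=${2:-.gitignore}
  touch "$file"
  # -x matches whole lines, -F treats the entry as a literal string.
  grep -qxF -- "$entry" "$file" || printf '%s\n' "$entry" >> "$file"
}
```

Running ignore_path .indexed/.env from the repository root is safe to repeat; the entry is written once.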
# Jira or Confluence Cloud (typical)
ATLASSIAN_EMAIL=your-email@company.com
ATLASSIAN_TOKEN=your_api_token_here
# Jira Server/DC (example)
# JIRA_TOKEN=your_pat_here
# Confluence Server/DC (example)
# CONF_TOKEN=your_pat_here
Storage Layout
All Indexed data lives at ~/.indexed/:
~/.indexed/
├── config.toml                  # Your configuration
└── data/
    └── collections/
        ├── my-docs/
        │   ├── manifest.json    # Source path, connector type, timestamps
        │   ├── documents.json   # Document metadata (title, ID, source path)
        │   ├── chunks.json      # Actual text chunks; treat as sensitive
        │   └── index.faiss      # Binary vector index
        └── eng-tickets/
            ├── manifest.json
            ├── documents.json
            ├── chunks.json
            └── index.faiss
Check disk usage for a specific collection:
du -sh ~/.indexed/data/collections/my-docs
5.1M /Users/you/.indexed/data/collections/my-docs
Do not manually edit collection files
Do not manually edit, move, or rename individual files within a collection directory. The four files (manifest.json, documents.json, chunks.json, index.faiss) must stay in sync. If a collection becomes corrupted, remove it with indexed index remove <name> --force and recreate it.
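Before removing a collection that looks broken, a read-only check can confirm whether any of the four files is missing. This is a sketch with a hypothetical helper name, check_collection. It only tests that the files exist and cannot tell whether their contents are in sync.

```shell
# Hypothetical read-only check: report any of the four expected files
# that is missing from a collection directory. Does not modify anything.
check_collection() {
  dir=$1
  missing=0
  for f in manifest.json documents.json chunks.json index.faiss; do
    if [ ! -f "$dir/$f" ]; then
      printf 'missing: %s\n' "$dir/$f"
      missing=1
    fi
  done
  # Exit status 0 when all four files are present, 1 otherwise.
  return "$missing"
}
```

For example, check_collection ~/.indexed/data/collections/my-docs prints nothing and returns 0 when the directory is complete.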
The data directory location (~/.indexed/data/) is not currently configurable via a config key. To move it to another location (e.g., a different disk), use a symlink:
mv ~/.indexed/data /path/to/new/location
ln -s /path/to/new/location ~/.indexed/data
Advanced Options
Chunk Size and Overlap
Chunk size controls how many tokens each embedded segment contains. Smaller chunks produce more precise retrieval — a narrow question is more likely to land on the exact answer. Larger chunks provide more context per result — useful when answers span several paragraphs.
The defaults (chunk_size = 512, chunk_overlap = 50) work well for most use cases. Increase chunk size if results feel too narrow; decrease it if results are losing their focus.
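To get a feel for how these two settings affect collection size, the number of chunks per document can be estimated. The sketch below assumes a simple sliding window in which each chunk starts chunk_size minus chunk_overlap tokens after the previous one; the chunker that Indexed actually uses may split differently (for example on sentence boundaries), so treat the result as a rough estimate. estimate_chunks is a hypothetical helper.

```shell
# Hypothetical estimate: chunks produced for a document of N tokens,
# assuming a fixed-stride sliding window (stride = chunk_size - chunk_overlap).
estimate_chunks() {
  tokens=$1
  size=$2
  overlap=$3
  stride=$((size - overlap))
  if [ "$stride" -le 0 ]; then
    echo "chunk_overlap must be smaller than chunk_size" >&2
    return 1
  fi
  if [ "$tokens" -le "$size" ]; then
    echo 1
  else
    # One chunk for the first window, plus ceil((tokens - size) / stride) more.
    echo $(( (tokens - size + stride - 1) / stride + 1 ))
  fi
}
```

Under this assumption a 2000-token document yields 5 chunks with the defaults, 3 chunks at 1024/100, and 9 chunks at 256/25.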
# Larger chunks (more context per result)
indexed config set core.v1.indexing.chunk_size 1024
indexed config set core.v1.indexing.chunk_overlap 100
# Smaller chunks (more precise retrieval)
indexed config set core.v1.indexing.chunk_size 256
indexed config set core.v1.indexing.chunk_overlap 25
After changing chunk size, existing collections must be re-indexed, because the stored vectors were generated with a different chunk size and will produce inconsistent results if mixed with new ones:
indexed index create files -c my-docs -p ./docs --force
Embedding Model
The default embedding model is all-MiniLM-L6-v2, an 80 MB sentence-transformers model that balances retrieval quality with speed and size. It runs locally via ONNX Runtime.
Any model from the sentence-transformers collection on HuggingFace can be used:
indexed config set core.v1.embedding.model_name "paraphrase-multilingual-MiniLM-L12-v2"
Changing the model invalidates all existing collections
Each embedding model produces vectors in a different mathematical space. If you change the model, you must re-index all existing collections — searching with mixed models produces garbage results. Run indexed index create ... --force for each collection after changing the model.
Stick with all-MiniLM-L6-v2 unless you have a specific reason to change — for example, multilingual content (use paraphrase-multilingual-MiniLM-L12-v2) or domain-specific retrieval.
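Since a model change means every collection has to be rebuilt, it can help to list what exists first. This sketch reads directory names under the collections path from the storage layout; list_collections is a hypothetical helper, and it assumes each subdirectory corresponds to one collection name.

```shell
# Hypothetical helper: print one collection name per line, based on the
# subdirectories of the collections directory (defaults to the documented path).
list_collections() {
  root=${1:-"$HOME/.indexed/data/collections"}
  for dir in "$root"/*/; do
    [ -d "$dir" ] || continue   # skip the unexpanded pattern when nothing matches
    basename "$dir"
  done
}
```

Each listed name can then be re-created with indexed index create ... --force, using that collection's original connector and source arguments.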
Caching
Indexed caches parsed document representations so that re-indexing (e.g., after adding files) doesn't re-parse documents that haven't changed. The cache is used by default.
Bypass the cache when source files have changed but their timestamps are stale (e.g., after a git checkout that preserved mtimes):
indexed index create files -c my-docs -p ./docs --no-cache
Re-enable the cache explicitly (it is on by default):
indexed index create files -c my-docs -p ./docs --use-cache