Local Files & Code
By the end of this guide, you will have a local directory indexed into a searchable collection, with filters configured to include only the files you care about.
Quick start
indexed index create files -c my-docs -p ./documents
For every flag and default, see Index commands (indexed index create files --help).
Prerequisites
- Indexed installed and workspace initialized (indexed config init; see Quick Start)
- A folder with documents (Markdown, PDF, DOCX, TXT, HTML, or any supported format)
What gets indexed
- Text extracted from each supported file type (see Supported file formats)
- File path and modification metadata
- Directory hierarchy
- Code structure when AST chunking is enabled (function/class boundaries for supported languages)
Create a Collection
Point Indexed at your folder. It recursively scans, parses every supported format, chunks, and embeds — all on your machine.
indexed index create files -c project-docs -p ~/work/docs
Indexing collection 'project-docs'...
Parsed 24 documents
Created 96 chunks
Generated embeddings
✓ Collection 'project-docs' created (96 chunks, 5.1 MB)
Indexing is always recursive: all subdirectories are scanned. There is currently no depth-limit flag.
Obsidian vaults
Point -p at your vault root (e.g., ~/Documents/ObsidianVault). Indexed handles nested folders, wiki-links in Markdown, and frontmatter automatically.
Verify the collection was created:
indexed index inspect project-docs
Collection: project-docs
Type: files
Source: /Users/you/work/docs
Documents: 24
Chunks: 96
Size: 5.1 MB
Created: 2026-04-06 10:15:33
Filter what gets indexed
On the CLI, --include and --exclude take regexes matched against the full file path (repeat the flag for each pattern). When both are set, includes are applied first.
Persistent defaults for a workspace live in [sources.files]. Note that the two keys use different syntaxes: include_patterns are glob-style (for example *.md), while exclude_patterns are regexes (see Files connector). With --respect-gitignore (on by default), common directories like node_modules and .git are skipped. See Index commands for all indexed index create files flags.
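To check a pair of patterns before building a collection, it can help to preview which paths they would keep. The sketch below is not part of Indexed: preview_filters is a hypothetical helper that walks a folder with find and applies one include regex and one exclude regex to each full path, includes first. It uses grep -E (POSIX extended regex), which is assumed to agree with Indexed's regex dialect for simple patterns like the ones in this guide, and it ignores .gitignore handling entirely.

```sh
# preview_filters ROOT INCLUDE_REGEX EXCLUDE_REGEX   (hypothetical helper)
# Lists files under ROOT whose full path matches INCLUDE_REGEX and does
# not match EXCLUDE_REGEX. Pass '' to skip either filter.
preview_filters() {
  root=$1
  include=$2
  exclude=$3
  find "$root" -type f | LC_ALL=C sort | while IFS= read -r file_path; do
    # Includes are applied first: drop paths that match no include pattern.
    if [ -n "$include" ] && ! printf '%s\n' "$file_path" | grep -Eq -- "$include"; then
      continue
    fi
    # Excludes then remove anything that survived the include step.
    if [ -n "$exclude" ] && printf '%s\n' "$file_path" | grep -Eq -- "$exclude"; then
      continue
    fi
    printf '%s\n' "$file_path"
  done
}
```

Assuming repeated --include flags combine as alternatives, several patterns can be joined with | in a single argument, for example preview_filters ~/work/docs '.*\.md$|.*\.txt$' '.*WIP.*'.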
# Only index Markdown and text files
indexed index create files -c docs-only -p ~/work/docs \
--include ".*\.md$" --include ".*\.txt$"
# Skip drafts and work-in-progress files
indexed index create files -c final-docs -p ~/work/docs \
--exclude ".*\.draft\.md$" --exclude ".*WIP.*"
Configuration (OCR, change tracking, chunking)
Tune OCR, change tracking, table extraction, and code chunking under [sources.files] (or indexed config set sources.files.<key> <value>). Defaults and key names are in Config commands; narrative setup is in the Configuration guide.
[sources.files]
path = "./documents"
include_patterns = ["*.md", "*.pdf", "*.py"]
exclude_patterns = ["\\.tmp$", "/build/"]
fail_fast = false
respect_gitignore = true
ocr_enabled = true
table_structure = true
code_chunking = true
max_chunk_tokens = 512
change_tracking = "auto"
ocr_enabled defaults to off; set it to true for scanned PDFs and images. max_chunk_tokens applies in the parsing pipeline; the embedding chunk size is set separately by core.v1.indexing.chunk_size (see Config commands).
Code Files
Indexed parses code files (.py, .ts, .go, .rs, .java, .c, .cpp, etc.) as plain text by default. When code_chunking is enabled (the default), tree-sitter AST-aware chunking splits code at semantic boundaries — functions, classes, and methods — rather than arbitrary line counts.
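To see what splitting at semantic boundaries means in practice, here is a rough approximation that needs no parser. It is not how Indexed or tree-sitter works internally: the hypothetical split_at_definitions helper uses awk to start a new chunk at each top-level def or class line of a Python file, so decorators, nested definitions, and other languages are not handled.

```sh
# split_at_definitions FILE   (hypothetical helper, regex-based approximation)
# Prints FILE with a "--- chunk N ---" marker before each chunk. A chunk
# starts at the first line and at every top-level def/class line.
split_at_definitions() {
  awk '
    NR == 1 || /^(async def|def|class) / {
      n++
      print "--- chunk " n " ---"
    }
    { print }
  ' "$1"
}
```

Compared with cutting every N lines, each chunk here holds one whole function or class, which is the property AST-aware chunking aims for.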
Supported languages and formats
See Supported File Formats for the full list of file types, code languages with AST support, and plaintext formats.
Keep the Index Fresh
When your documents change, update only what's new:
indexed index update project-docs
Updating collection 'project-docs'...
Before: 24 documents, 96 chunks
After: 28 documents, 112 chunks
✓ Collection 'project-docs' updated
Indexed picks a change tracking strategy from sources.files.change_tracking (auto, git, content_hash, mtime, or none). See Parsing architecture and Config commands.
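As a rough illustration of what a content_hash strategy involves, the sketch below compares two hash manifests to find files that were added or modified. It shows the general idea only, not Indexed's implementation: hash_file, snapshot, and changed_files are hypothetical helpers, and the script assumes either sha256sum or shasum is on the PATH.

```sh
# hash_file FILE: print "<sha256>  <path>" with whichever tool is installed.
hash_file() {
  if command -v sha256sum >/dev/null 2>&1; then
    sha256sum "$1"
  else
    shasum -a 256 "$1"
  fi
}

# snapshot DIR: print one hash line per file under DIR, sorted by path.
snapshot() {
  (
    cd "$1" || exit 1
    find . -type f | LC_ALL=C sort | while IFS= read -r f; do
      hash_file "$f"
    done
  )
}

# changed_files OLD NEW: print paths that are new or whose hash differs.
changed_files() {
  # Keep lines of NEW that do not appear verbatim in OLD, then strip the hash.
  grep -Fxv -f "$1" "$2" | sed 's/^[0-9a-f]*  *//'
}
```

A file that is only touched, with identical contents, produces the same hash and is not reported. Deleted files are not reported either; detecting them would need the reverse comparison.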
To automate updates:
# Update all Indexed collections every day at 8am (use the path from `which indexed`)
0 8 * * * /path/to/indexed index update >> ~/.indexed/update.log 2>&1
If you need a clean re-index (e.g., after changing chunk size), use --force:
indexed index create files -c project-docs -p ~/work/docs --force
Additional options
All indexed index create files flags and defaults: Index commands. OCR, change tracking, patterns, and excluded_extensions: Files connector and Configuration guide.
Troubleshooting
Unsupported file format error: Check the Supported File Formats list. If your format isn't listed, convert it to PDF or Markdown, which parse most reliably.
No results returned: Verify the collection exists with indexed index inspect <name>, then test from the CLI:
indexed index search "your query here" -c project-docs
OCR not working on scanned PDFs: OCR is off by default (sources.files.ocr_enabled). Enable it in config, then re-create or update the collection (see Parsing architecture).