How Semantic Search Works

Understand embeddings, vectors, and why semantic search finds what keyword search can't.

Indexed uses semantic search to find documents by meaning, not just by matching keywords. This page explains what that means and why it matters.

Traditional search (like Jira's built-in search) matches exact words. If you search for "authentication problems," it only finds documents that contain those exact words. But the answers you need might use different language:

  • A ticket titled "SSO refresh token expiration handling"
  • A doc section about "login timeout after IdP migration"
  • A runbook titled "Fixing session drops for VPN users"

None of these contain the words "authentication problems" — but they're all relevant. Keyword search misses them entirely.

How Semantic Search Solves This

Semantic search works in two steps:

Text becomes vectors (embedding)

An embedding model reads a piece of text and converts it into a vector — a list of numbers that represents the text's meaning. Think of it like coordinates on a map: texts with similar meanings end up at nearby coordinates, even if they use completely different words.

For example, these phrases would all produce vectors that are close to each other:

  • "authentication problems"
  • "SSO login failures"
  • "users can't sign in"
  • "session timeout issues"

The model understands that these all relate to the same concept, even though they share almost no words.
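"Close" here is usually measured with cosine similarity: the cosine of the angle between two vectors, where 1.0 means they point the same way. A minimal sketch of the idea using made-up 4-dimensional toy vectors (real embeddings have 384 dimensions, and the numbers below are illustrative, not real model output):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two vectors: 1.0 = same direction."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy stand-ins for embeddings of three phrases.
auth_problems = np.array([0.9, 0.1, 0.4, 0.0])
sso_failures  = np.array([0.8, 0.2, 0.5, 0.1])  # similar topic -> similar direction
deploy_guide  = np.array([0.0, 0.9, 0.1, 0.8])  # unrelated topic

print(cosine_similarity(auth_problems, sso_failures))  # high (near 1)
print(cosine_similarity(auth_problems, deploy_guide))  # much lower
```

The same comparison works identically at 384 dimensions; only the arithmetic gets bigger.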

Indexed uses the all-MiniLM-L6-v2 model from Sentence Transformers. It runs entirely on your machine — no API calls, no data sent anywhere.

Nearest-neighbor search (FAISS)

When you search, your query is also converted into a vector. Then FAISS (Facebook AI Similarity Search) finds the vectors in your index that are closest to your query vector. "Closest" means most similar in meaning.

This is fast — FAISS can search millions of vectors in milliseconds.

The Indexing Pipeline

Here's what happens when you run indexed index create:

Your Documents
  → Parse (extract text from PDF, DOCX, MD, etc.)
  → Chunk (split into ~500-token segments)
  → Embed (convert each chunk to a 384-dimensional vector)
  → Store (save vectors in a FAISS index on disk)

Why chunking matters

Documents are split into chunks before embedding because:

  • Embedding models have a maximum input length
  • Smaller chunks produce more precise search results — a match on a specific paragraph is more useful than a match on an entire 20-page document
  • Each chunk becomes a separate entry in the search results, so you see the relevant section, not just the document name

The default chunk size and overlap are configurable. See the Configuration Reference for details.
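A simplified sketch of fixed-size chunking with overlap. It splits on whitespace as a stand-in for real tokenization, and the chunk_size and overlap values are illustrative, not Indexed's actual defaults:

```python
def chunk(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping word-based chunks.

    Overlap keeps a sentence that straddles a boundary
    fully visible in at least one chunk.
    """
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break
    return chunks

# A 1200-word document yields 3 overlapping chunks.
doc = " ".join(f"word{i}" for i in range(1200))
parts = chunk(doc)
print(len(parts))  # 3
```

With a step of 450 words, the last 50 words of each chunk reappear at the start of the next, so no boundary-straddling passage is lost.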

A Concrete Example

Suppose you've indexed 200 Jira tickets and you search for "how to handle rate limiting":

  1. Your query becomes a vector: [0.023, -0.412, 0.187, ...] (384 numbers)
  2. FAISS compares this vector to every chunk vector in your index (more than 200, since each ticket may be split into several chunks)
  3. It returns the 5 closest matches, which might include:
    • A ticket about "API throttling with exponential backoff"
    • A ticket about "429 error handling in the payment service"
    • A doc section about "retry strategies for external APIs"

All of these are relevant to "rate limiting" even though none use those exact words.

Limitations

  • Language: The embedding model works best with English text. Other languages may produce less accurate results.
  • Very short text: Single-word or very short queries may produce less precise results. Be descriptive in your searches.
  • Domain jargon: The model has general knowledge but may not understand highly specialized terminology. It can often still match documents that use the same jargon in similar contexts.
  • Not a chatbot: Semantic search finds relevant content — it doesn't generate answers. That's where MCP integration comes in: AI assistants can search your index and then synthesize answers.

What's Next