
Search Architecture

Vault search is a multi-layer pipeline that combines full-text search, sparse TF-IDF scoring, optional dense vector embeddings, and six weighted relevance signals. This page explains every layer.

For a quick overview, see Under the Hood.

A search query flows through four layers:

Query (search_intelligent op)
→ Facade (mode selection, filters)
→ Brain (scoring, hybrid ranking)
→ VaultManager (federated tier search)
→ SQLite FTS5 + optional vector recall

Each layer adds precision. FTS5 provides broad recall, the brain provides relevance ranking, and the vault manager federates across multiple knowledge sources.

Search supports two modes to keep context lean:

| Mode | Returns | Use case |
|------|---------|----------|
| scan | Lightweight results: title, score, snippet, token estimate | Browsing, triage |
| full | Complete entries with score breakdowns | Deep reads, planning |

The recommended workflow: scan first, pick the top 2-4 results by score, then load only those entries. This avoids flooding context with entries you don’t need.

"Search for authentication patterns" → scan (10 lightweight results)
"Load entries auth-jwt-001, auth-002" → full (2 complete entries)

The vault stores entries in a SQLite database with an FTS5 virtual table:

CREATE VIRTUAL TABLE entries_fts USING fts5(
  id, title, description, context, tags,
  content='entries', content_rowid='rowid',
  tokenize='porter unicode61'
);

Porter stemming reduces words to their root form (authentication and authenticating both match). Unicode normalization handles accented characters and non-Latin scripts.

Your natural-language query gets transformed before hitting FTS5:

  1. The query is split into individual terms
  2. Each term is lowercased
  3. Stop words are removed (the, is, a, etc.)
  4. Terms shorter than 2 characters are dropped
  5. Remaining terms are joined with OR for broad matching

A query like “how does JWT validation work” becomes jwt OR validation OR work.
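The pipeline above can be sketched as a single helper. This is a hypothetical illustration (the real stop-word list is much longer than the few words shown here):

```typescript
// Hypothetical sketch of the query-normalization steps; the real
// stop-word list is longer than the few entries shown here.
const STOP_WORDS = new Set(["the", "is", "a", "an", "how", "does", "do", "of", "to"]);

function normalizeQuery(query: string): string {
  return query
    .split(/\s+/)                       // 1. split into individual terms
    .map((t) => t.toLowerCase())        // 2. lowercase
    .filter((t) => !STOP_WORDS.has(t))  // 3. drop stop words
    .filter((t) => t.length >= 2)       // 4. drop terms under 2 characters
    .join(" OR ");                      // 5. join with OR for broad matching
}

// normalizeQuery("how does JWT validation work") → "jwt OR validation OR work"
```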

FTS5 results are ranked using BM25 with per-field weights:

| Field | Weight | Why |
|-------|--------|-----|
| title | 10.0 | Title matches are the strongest signal |
| id | 5.0 | Entry IDs often contain meaningful slugs |
| description | 3.0 | The main content body |
| tags | 2.0 | Tag matches indicate topical relevance |
| context | 1.0 | Context is supplementary |

If BM25 is unavailable (older SQLite builds), search falls back to the default FTS5 rank function.
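Applied in a query, the weights might look like the sketch below. Note that SQLite's bm25() function takes its weights in *column order* (id, title, description, context, tags), so title's 10.0 appears second. The exact SQL in sqlite-provider.ts may differ:

```typescript
// Sketch of per-field BM25 weighting with FTS5. bm25() returns lower
// (more negative) values for better matches, so the sort is ascending.
// Weights follow the fts5 table's column order: id, title, description,
// context, tags.
const FTS_QUERY = `
  SELECT e.*, bm25(entries_fts, 5.0, 10.0, 3.0, 1.0, 2.0) AS rank
  FROM entries_fts
  JOIN entries e ON e.rowid = entries_fts.rowid
  WHERE entries_fts MATCH ?
  ORDER BY rank
`;
```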

The vault manager searches across multiple tiers — separate SQLite databases with different scopes:

| Tier | Scope | Default priority |
|------|-------|------------------|
| agent | Knowledge specific to this agent | Highest |
| project | Shared across agents in a project | Medium |
| team | Shared across team members | Lower |
| dynamic | External connected vaults | Configurable |

Each tier is searched independently. Results are then:

  1. Weighted by tier priority
  2. Deduplicated — if the same entry exists in multiple tiers, the highest-priority version wins
  3. Merged into a single result set, sorted by weighted score

This means an agent-level pattern always outranks the same pattern at team level.
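A minimal sketch of the dedupe-and-merge steps above, assuming each tier result carries a numeric priority (the names here are hypothetical, not the actual vault-manager API):

```typescript
// Hypothetical shape for a result coming back from one tier.
interface TierResult { id: string; score: number; tierPriority: number; }

function mergeTiers(results: TierResult[]): TierResult[] {
  const best = new Map<string, TierResult>();
  for (const r of results) {
    const seen = best.get(r.id);
    // Dedupe: the highest-priority tier's copy of an entry wins.
    if (!seen || r.tierPriority > seen.tierPriority) best.set(r.id, r);
  }
  // Merge into one set, sorted by priority-weighted score.
  return [...best.values()].sort(
    (a, b) => b.score * b.tierPriority - a.score * a.tierPriority
  );
}
```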

FTS5 gives us broad recall. The brain turns it into precise relevance ranking.

The brain requests 3x the desired limit (or 30 results, whichever is larger) from FTS5. This provides headroom — many entries that rank well in FTS5 may score poorly on other signals. Over-fetching ensures the final top-N are truly the best matches across all factors.
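The over-fetch rule reduces to a one-liner (the function name is hypothetical):

```typescript
// Request 3x the desired limit, with a floor of 30, so poor
// cross-signal scorers can be filtered out without starving the top-N.
const ftsFetchLimit = (limit: number): number => Math.max(limit * 3, 30);

// ftsFetchLimit(5) → 30; ftsFetchLimit(20) → 60
```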

Every result is scored across six factors:

| Signal | Weight (FTS only) | Weight (hybrid) | How it works |
|--------|-------------------|-----------------|--------------|
| Semantic | 0.40 | 0.25 | TF-IDF cosine similarity between query and entry |
| Vector | 0.00 | 0.15 | Dense embedding cosine similarity |
| Severity | 0.15 | 0.15 | critical = 1.0, warning = 0.7, suggestion = 0.4 |
| Temporal decay | 0.15 | 0.15 | Exponential decay with 365-day half-life |
| Tag overlap | 0.15 | 0.15 | Jaccard similarity between query tags and entry tags |
| Domain match | 0.15 | 0.15 | Binary: 1.0 if the query domain matches the entry domain |

The total score is the weighted sum of all factors.
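The weighted sum can be sketched as follows, using the hybrid column from the table above (the type and constant names are hypothetical):

```typescript
// Per-factor scores for one result, each in [0, 1].
interface Breakdown {
  semantic: number; vector: number; severity: number;
  temporalDecay: number; tagOverlap: number; domainMatch: number;
}

// Hybrid-mode weights; they sum to 1.0.
const HYBRID_WEIGHTS: Breakdown = {
  semantic: 0.25, vector: 0.15, severity: 0.15,
  temporalDecay: 0.15, tagOverlap: 0.15, domainMatch: 0.15,
};

// Total score = weighted sum of all six factors.
function totalScore(b: Breakdown, w: Breakdown = HYBRID_WEIGHTS): number {
  return (Object.keys(w) as (keyof Breakdown)[])
    .reduce((sum, k) => sum + w[k] * b[k], 0);
}
```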

The brain maintains a TF-IDF vocabulary across all vault entries:

  1. Vocabulary building — tokenizes every entry (title + description + context + tags), computes IDF for each term: IDF = log((docCount + 1) / (df + 1)) + 1
  2. Query vector — tokenizes the query using the same rules, computes a TF-IDF vector
  3. Entry vectors — each entry gets a TF-IDF vector using the same vocabulary
  4. Cosine similarity — the dot product of query and entry vectors, normalized by their magnitudes

This is why rare terms are more valuable than common ones. “JWT” carries more weight than “the” because it appears in fewer entries.
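The two formulas involved, IDF (step 1) and cosine similarity (step 4), are small enough to show directly:

```typescript
// Cosine similarity: dot product of two vectors, normalized by
// their magnitudes. Returns 0 when either vector is all zeros.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return na && nb ? dot / (Math.sqrt(na) * Math.sqrt(nb)) : 0;
}

// IDF from step 1, for a term appearing in `df` of `docCount` entries.
const idf = (docCount: number, df: number): number =>
  Math.log((docCount + 1) / (df + 1)) + 1;
```

A term present in every entry gets the minimum IDF of 1.0, while a term in a single entry out of thousands scores much higher, which is exactly why "JWT" outweighs "the".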

Entries lose relevance over time:

Without a validity window:

decay = exp(-ln(2) * age / halfLife)

Half-life is 365 days. An entry scores 1.0 when new, 0.5 after one year, 0.25 after two years.

With a validity window (entries that have valid_from/valid_until dates):

if remaining > 75% of window: decay = 1.0 (fully valid)
else: decay = remaining / decayZone (linear ramp-down in last 25%)
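Both branches above can be combined into one function (a sketch; the signature and units, days, are assumptions):

```typescript
const HALF_LIFE_DAYS = 365;

// window.total / window.remaining are the validity window's full span
// and its remaining span, in days.
function temporalDecay(
  ageDays: number,
  window?: { total: number; remaining: number }
): number {
  if (!window) {
    // No validity window: exponential decay with a 365-day half-life.
    return Math.exp((-Math.LN2 * ageDays) / HALF_LIFE_DAYS);
  }
  if (window.remaining > window.total * 0.75) return 1.0; // fully valid
  const decayZone = window.total * 0.25;                  // final 25%
  return Math.max(0, window.remaining / decayZone);       // linear ramp-down
}
```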

After 30+ feedback entries (accepted/dismissed results), the brain adjusts its scoring weights:

  • High accept rate (>50%) — increase semantic weight (up to +0.15)
  • Low accept rate (<50%) — decrease semantic weight (down to -0.15)
  • Other weights scale proportionally to maintain a sum of 1.0

This means the search system learns from your usage. If you consistently prefer tag-matched results over text-matched ones, the weights shift accordingly.
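The rebalancing step can be sketched as below. The exact mapping from accept rate to shift size isn't specified here, so this hypothetical helper takes the semantic shift directly and only shows the clamp-and-renormalize logic:

```typescript
// Shift the semantic weight by up to ±0.15 and rescale the other
// weights proportionally so the total still sums to 1.0.
function adaptWeights(
  w: Record<string, number>,
  semanticDelta: number
): Record<string, number> {
  const delta = Math.max(-0.15, Math.min(0.15, semanticDelta));
  const newSemantic = w.semantic + delta;
  const restOld = 1 - w.semantic;   // combined mass of the other weights
  const restNew = 1 - newSemantic;  // what that mass must shrink/grow to
  const out: Record<string, number> = {};
  for (const k of Object.keys(w)) {
    out[k] = k === "semantic" ? newSemantic : (w[k] / restOld) * restNew;
  }
  return out;
}
```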

If the vault has fewer than 50 FTS results and filtering to the requested limit would discard more than 50% of results, the brain returns all results instead. This prevents over-aggressive filtering on small knowledge bases where every entry matters.
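One way to express the small-vault guard above as a predicate (the name is hypothetical):

```typescript
// Return everything when the vault is small (< 50 FTS results) and
// trimming to the requested limit would discard more than half of them.
function shouldReturnAll(ftsCount: number, limit: number): boolean {
  return ftsCount < 50 && (ftsCount - limit) / ftsCount > 0.5;
}
```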

When an embedding provider is configured, search adds a dense vector recall phase:

  1. Query embedding — the query is embedded into a dense vector (e.g., 1536 dimensions for OpenAI models)
  2. Cosine search — brute-force similarity computation against all stored entry vectors
  3. Candidate merging — entries found by vector search but missed by FTS5 are added to the candidate pool
  4. Score integration — vector similarity feeds the vector scoring signal (0.15 weight in hybrid mode)

Dense vectors are stored as binary float32 blobs in an entry_vectors table. Each entry can have one vector per embedding model, allowing multi-model support.
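Encoding and decoding such a blob is straightforward in Node; a sketch (not the actual vault-entries.ts code):

```typescript
// Pack a dense vector into a binary float32 blob for SQLite storage.
function vectorToBlob(vec: number[]): Buffer {
  return Buffer.from(new Float32Array(vec).buffer);
}

// Unpack a stored blob back into a number array (4 bytes per float).
function blobToVector(blob: Buffer): number[] {
  return [...new Float32Array(blob.buffer, blob.byteOffset, blob.byteLength / 4)];
}
```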

For vaults under 100K entries, brute-force cosine search completes in approximately 50ms. No approximate nearest-neighbor index is needed at this scale.

New entries are automatically embedded in batches of 100. Only the title, description, and context fields are embedded (not full entry content). Entries that already have vectors for the active model are skipped.

In full mode, each result pairs the complete entry with its per-factor score breakdown:

{
  entry: IntelligenceEntry,   // Complete entry object
  score: number,              // Total weighted score
  breakdown: {                // Per-factor scores
    semantic: number,
    vector: number,
    severity: number,
    temporalDecay: number,
    tagOverlap: number,
    domainMatch: number
  }
}

In scan mode, each result is a lightweight summary:

{
  id: string,
  title: string,
  score: number,
  type: string,
  domain: string,
  tags: string[],
  snippet: string,        // First 120 chars of description
  tokenEstimate: number   // Rough token count (chars / 4)
}

Memory entries (session history, captured context) are searched separately using the same FTS5 engine. Memory results are merged with vault results and sorted by score. Memory entries use a fixed score of 0.5 as a baseline.

| File | Role |
|------|------|
| packages/core/src/runtime/capture-ops.ts | search_intelligent facade op |
| packages/core/src/brain/brain.ts | Scoring, TF-IDF, hybrid ranking |
| packages/core/src/vault/vault-manager.ts | Federated tier search |
| packages/core/src/vault/vault-entries.ts | FTS5 queries, vector cosine search |
| packages/core/src/persistence/sqlite-provider.ts | SQLite FTS5 implementation |
| packages/core/src/vault/vault-schema.ts | FTS5 table schema |
| packages/core/src/embeddings/pipeline.ts | Batch embedding pipeline |

Next: Security & Privacy — understand where your data lives and who can access it.