
Airia provides multiple search modes that can be combined to balance precision, recall, and relevance for your specific use case. This page covers how semantic search, keyword search, hybrid search, fusion algorithms, and reranking work together.

Search Modes

Semantic Search

Semantic search finds chunks whose meaning is similar to the query, even when the exact words differ. It works by comparing the vector embedding of the query against the vector embeddings of your ingested chunks.
Strengths: Understands synonyms, paraphrases, and conceptual similarity. “What are the company’s revenue targets?” will match a chunk containing “fiscal year income goals.”
Limitations: Can miss results where exact terminology matters, such as product codes, legal clause numbers, or technical identifiers.

Keyword Search

Keyword search finds chunks that contain the exact terms in the query, weighted by term frequency and document rarity (the BM25 algorithm). This is traditional full-text search.
Strengths: Precise for exact matches: part numbers, names, codes, legal references.
Limitations: Misses semantically similar content that uses different words.

Hybrid Search

Hybrid search runs both semantic and keyword searches simultaneously, then combines the results. This gives you the best of both: conceptual understanding from vector search and exact-match precision from keyword search.
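As an illustration of the BM25 weighting mentioned above, here is a minimal self-contained sketch. The tokenization and the parameter defaults (k1 = 1.5, b = 0.75) are textbook conventions, not Airia's actual implementation:

```python
import math

def bm25_scores(query_terms, docs, k1=1.5, b=0.75):
    """Score each tokenized document against the query terms using BM25.

    Terms that are rare across the corpus (high IDF) and frequent within a
    document (high TF, with length normalization) contribute the most.
    """
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    # Document frequency: how many documents contain each query term
    df = {t: sum(1 for d in docs if t in d) for t in query_terms}
    scores = []
    for d in docs:
        score = 0.0
        for t in query_terms:
            if df[t] == 0:
                continue
            idf = math.log(1 + (n - df[t] + 0.5) / (df[t] + 0.5))
            tf = d.count(t)
            score += idf * tf * (k1 + 1) / (tf + k1 * (1 - b + b * len(d) / avgdl))
        scores.append(score)
    return scores

docs = [
    "fiscal year income goals for the region".split(),
    "revenue targets for the fiscal year".split(),
    "employee onboarding checklist".split(),
]
# Only the second document contains the exact terms, so only it scores > 0 --
# which is precisely the limitation semantic search compensates for.
scores = bm25_scores("revenue targets".split(), docs)
```

Note how the first document ("fiscal year income goals") scores zero despite being conceptually relevant: keyword search alone cannot bridge that vocabulary gap.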

Configuring the Hybrid Balance

The Hybrid Search Alpha slider controls the weight between semantic and keyword search:
Alpha Value | Behavior
----------- | --------
1.0 | Semantic search only (purely meaning-based)
0.7 - 0.9 | Semantic-dominant; good for concept-heavy queries, research, general Q&A
0.5 | Balanced; equal weight to meaning and exact terms (default when hybrid is enabled)
0.1 - 0.3 | Keyword-dominant; good for exact term matching, codes, identifiers
0.0 | Keyword search only (purely term-based)
Recommendation: Start with 0.5 (balanced) and adjust based on your testing. If users frequently search for specific identifiers or product names, lean toward keyword. If queries are conversational or exploratory, lean toward semantic.
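Conceptually, alpha acts as the weight in a blend of the two modes' normalized scores. The linear formula below is an illustrative assumption; Airia's exact internal blending is not documented here:

```python
def hybrid_score(semantic_score, keyword_score, alpha=0.5):
    """Blend normalized semantic and keyword relevance scores.

    alpha = 1.0 -> pure semantic; alpha = 0.0 -> pure keyword.
    Both inputs are assumed already normalized to a common [0, 1] scale.
    """
    return alpha * semantic_score + (1 - alpha) * keyword_score

# A chunk with a strong semantic match but a weak keyword match:
# raising alpha moves its blended score toward the semantic side.
balanced = hybrid_score(0.9, 0.2, alpha=0.5)
semantic_only = hybrid_score(0.9, 0.2, alpha=1.0)
```

This makes the tuning advice concrete: sliding alpha toward 0.0 lets exact-term hits dominate, while sliding it toward 1.0 lets conceptual similarity dominate.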

Fusion Algorithms

When hybrid search returns results from both semantic and keyword searches, a fusion algorithm combines the two ranked lists into a single result set. Airia supports two fusion algorithms:

Ranked Fusion (RRF)

Reciprocal Rank Fusion combines results based on their position in each result list, not their raw scores. A chunk that ranks #1 in keyword search and #3 in semantic search will score higher than a chunk that ranks #10 in both.
When to use: When you want a balanced combination that doesn’t favor one search mode’s scoring scale over the other. RRF is robust and works well as a default.

Relative Score Fusion

Relative Score Fusion normalizes the scores from each search mode to a common scale, then combines them. This preserves the magnitude of relevance — a very high-scoring semantic match will outweigh a mediocre keyword match.
When to use: When you want the strength of individual match scores to influence the final ranking. Good when one search mode produces highly confident results and the other produces marginal ones.
Note: Fusion algorithm selection is available when your vector store supports sparse vectors (Pinecone, Weaviate, or Cosmos DB with sparse vectors enabled).
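A minimal sketch of Relative Score Fusion, assuming min-max normalization of each mode's scores to [0, 1] before summing (a common implementation choice; Airia's exact normalization is not documented here):

```python
def relative_score_fusion(result_lists):
    """Normalize each mode's raw scores to [0, 1], then sum per document.

    Unlike RRF, the *magnitude* of each match survives: a near-top score
    in one mode keeps its weight in the fused ranking.
    """
    fused = {}
    for results in result_lists:  # each: {doc_id: raw_score}
        lo, hi = min(results.values()), max(results.values())
        span = (hi - lo) or 1.0  # avoid division by zero if all scores tie
        for doc_id, raw in results.items():
            fused[doc_id] = fused.get(doc_id, 0.0) + (raw - lo) / span
    # Return documents ordered by fused score, highest first
    return dict(sorted(fused.items(), key=lambda kv: kv[1], reverse=True))

# Cosine similarities and BM25 scores live on different scales;
# normalization makes them comparable before they are combined.
semantic = {"A": 0.95, "B": 0.50, "C": 0.20}
keyword  = {"B": 12.0, "C": 11.5, "A": 3.0}
fused = relative_score_fusion([semantic, keyword])
```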

Reranking

Reranking is a second-pass relevance scoring step that runs after initial retrieval. A reranker model reads the query and each retrieved chunk together, producing a more accurate relevance score than embedding similarity alone.

How It Works

  1. Initial search (semantic, keyword, or hybrid) retrieves candidate chunks
  2. The reranker model scores each candidate against the original query
  3. Results are re-ordered by the reranker’s scores
  4. Top results are returned to the LLM
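The four steps above can be sketched as follows, with `score_fn` standing in for the reranker model (a hypothetical placeholder for illustration, not an Airia API):

```python
def rerank(query, candidates, score_fn, top_k=5):
    """Second-pass relevance scoring over already-retrieved candidates.

    Step 1 (initial retrieval) has produced `candidates`; this function
    performs steps 2-4.
    """
    scored = [(score_fn(query, chunk), chunk) for chunk in candidates]  # step 2
    scored.sort(key=lambda pair: pair[0], reverse=True)                 # step 3
    return [chunk for _, chunk in scored[:top_k]]                       # step 4

# Toy stand-in for a reranker: score = query/chunk word overlap.
# A real reranker reads the full query-chunk pair with a model instead.
def overlap(query, chunk):
    return len(set(query.split()) & set(chunk.split()))

candidates = ["revenue targets memo", "lunch menu", "fiscal revenue goals"]
top = rerank("revenue targets", candidates, overlap, top_k=2)
```

The key property is that the reranker sees the query and each chunk *together*, which is why its scores are more accurate than embedding similarity alone — and also why it adds per-call latency and cost.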

When to Enable Reranking

Reranking improves precision at the cost of additional latency. Enable it when:
  • Accuracy is critical — The answer must come from the most relevant chunks, not just similar ones
  • Your data source is large — More candidates means more noise; reranking filters it
  • Knowledge Graph Extraction is enabled — Reranking is automatically activated with Graph RAG to ensure entity-enriched results are properly prioritized

Configuring Reranking

In the Data Search Step configuration:
  1. Toggle on Perform Reranking
  2. The reranker model is selected automatically based on your data store configuration
The search debug panel shows reranking timing metrics (seedRerankMs and finalRerankMs) to help you evaluate the latency impact.
⚠️ Warning: Reranking involves AI processing — each retrieval call invokes the reranker model, which incurs additional token costs. These costs are tracked and viewable in Settings > Token Consumption.

Tuning Retrieval Quality

Beyond search mode and reranking, three parameters control what gets returned:

Max Results

The maximum number of text chunks returned from search. Default: 5. Range: 1 - 10,000. This controls the primary matched chunks before neighboring chunks are added. Setting this higher retrieves more candidates but increases LLM context size and cost.

Relevance Threshold

Filters out chunks below a minimum similarity score. Default: 70. Range: 0 - 100. Maps linearly to cosine similarity (70 = 0.70 cosine similarity). Higher values return fewer, more relevant results. Lower values return more results with potentially lower relevance.
Recommendation: Start at 70 for general use. Increase to 80-90 for precision-critical applications. Decrease to 50-60 if queries frequently return no results.
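The threshold-to-cosine mapping reduces to a simple filter, sketched below (whether the cutoff is inclusive or exclusive is an assumption here):

```python
def apply_relevance_threshold(results, threshold=70):
    """Keep only chunks whose cosine similarity meets threshold / 100.

    `results` is a list of (doc_id, cosine_similarity) pairs.
    """
    cutoff = threshold / 100.0
    return [(doc, sim) for doc, sim in results if sim >= cutoff]

# Raising the threshold from 70 to 80 drops the 0.72-similarity chunk.
hits = [("a", 0.91), ("b", 0.72), ("c", 0.55)]
precise = apply_relevance_threshold(hits, threshold=80)
```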

Neighboring Chunks

Includes surrounding context from the same document. Default: 1. Range: 0 - 10. When set to 1, each matched chunk also returns the chunk immediately before and after it, providing broader context. Higher values include more surrounding text.
Note: With maxResults=5 and neighboringChunks=1, up to 15 chunks may be returned (5 matches + up to 10 neighbors). Factor this into your context window budget.
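The worst-case count in the note above follows from each match contributing itself plus up to `neighboringChunks` chunks on each side:

```python
def max_returned_chunks(max_results, neighboring_chunks):
    """Upper bound on chunks returned to the LLM.

    Each of the `max_results` matches brings up to `neighboring_chunks`
    preceding and `neighboring_chunks` following chunks (fewer at document
    boundaries or when neighbors overlap).
    """
    return max_results * (1 + 2 * neighboring_chunks)

# The defaults from the text: 5 matches + up to 10 neighbors = 15 chunks.
budget = max_returned_chunks(5, 1)
```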

Configuration Summary

Parameter | Where to Configure | Default | Range
--------- | ------------------ | ------- | -----
Hybrid Search Alpha | Data Search Step settings | 0.5 (hybrid) or 1.0 (semantic only) | 0.0 - 1.0
Fusion Algorithm | Data Search Step settings | Relative Score Fusion | RRF or Relative Score
Reranking | Data Search Step toggle | Off (On with Graph RAG) | On / Off
Max Results | Data Search Step settings | 5 | 1 - 10,000
Relevance Threshold | Data Search Step settings | 70 | 0 - 100
Neighboring Chunks | Data Search Step settings | 1 | 0 - 10