
Airia supports multiple retrieval patterns for bringing knowledge into your agents. Choose based on query complexity, cost, and accuracy requirements. For detailed configuration of search parameters (hybrid weighting, fusion algorithms, reranking), see Hybrid Search and Reranking. For how knowledge graphs enhance retrieval, see Graph-Enhanced Retrieval.
This page covers the same content as Add a Data Source in the Agent Basics section, reframed for the Context Engineering pipeline.

Two Core Retrieval Methods

1. Data Search Step

A dedicated pipeline step that performs a single, embedding-based retrieval pass. The full user input is used as the search query, and the retrieved chunks are passed directly to the LLM or the next step. Best for: Simple queries, linear workflows, batch processing. Faster execution and lower cost.
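The single-pass flow can be sketched in plain Python. This is an illustrative sketch, not Airia's implementation: the toy `score` function (token overlap) stands in for real embedding similarity.

```python
def score(query, chunk):
    # Toy relevance score (token overlap); Airia uses real embedding similarity.
    q, c = set(query.lower().split()), set(chunk.lower().split())
    return len(q & c) / len(q | c) if q | c else 0.0

def data_search_step(user_input, chunks, max_results=5):
    # One retrieval pass: the full user input is the search query, and
    # the top chunks are passed directly to the LLM or the next step.
    ranked = sorted(chunks, key=lambda c: score(user_input, c), reverse=True)
    return ranked[:max_results]

chunks = ["billing and invoices", "password reset steps", "api rate limits"]
top = data_search_step("how do I reset my password", chunks, max_results=1)
```

Note that there is no loop: the step runs exactly once, which is why it is faster and cheaper than multi-hop retrieval.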

2. MCP Multi-Hop Retrieval

Sources are attached directly to the LLM, which dynamically decides which sources to query, which retrieval tools to use, and how many times to search. Powered by multi-hop retrieval via the Airia Datasource MCP Server. Best for: Complex queries, conversational agents, accuracy-critical applications.
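The multi-hop pattern can be sketched as a loop in which the model decides whether to search again. In this hedged sketch, `plan_fn` is a scripted stand-in for the LLM's reasoning and `corpus` is toy data; the real decisions are made by the model attached to the Airia Datasource MCP Server.

```python
def multi_hop_retrieve(question, search_fn, plan_fn, max_hops=5):
    # plan_fn stands in for the LLM: given the evidence so far, it returns
    # the next search query, or None once it has enough context to answer.
    evidence = []
    for _ in range(max_hops):
        query = plan_fn(question, evidence)
        if query is None:
            break
        evidence.extend(search_fn(query))  # one or more search calls
    return evidence

# Toy stand-ins for demonstration:
corpus = {"acme ceo": ["Acme's CEO is Jane Doe"],
          "jane doe birthplace": ["Jane Doe was born in Oslo"]}

def search_fn(q):
    return corpus.get(q, [])

def plan_fn(question, evidence):
    # A real LLM would reason over the evidence; this scripted planner hops twice.
    if not evidence:
        return "acme ceo"
    if len(evidence) == 1:
        return "jane doe birthplace"
    return None

facts = multi_hop_retrieve("Where was Acme's CEO born?", search_fn, plan_fn)
```

The second query depends on the first result, which is exactly what a single Data Search Step cannot do.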
| Aspect | Data Search Step | MCP Multi-Hop Retrieval |
| --- | --- | --- |
| Retrieval | Single-hop | Multi-hop |
| Search calls | Always one | One or more (LLM-determined) |
| Speed | Faster | Slower |
| Cost | Lower | Higher |
| Best use | Simple, predictable queries | Complex, conversational, reasoning-heavy |
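The trade-offs above can be condensed into a simple rule of thumb. This chooser is purely illustrative, not an Airia API:

```python
def choose_retrieval(query_is_complex, conversational, accuracy_critical):
    # Rule of thumb from the comparison table: any reasoning-heavy signal
    # favors multi-hop; otherwise take the faster, cheaper single pass.
    if query_is_complex or conversational or accuracy_critical:
        return "MCP Multi-Hop Retrieval"
    return "Data Search Step"
```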

Configuring MCP Multi-Hop Retrieval

Step 1 — Enable MCP Multi-Hop Retrieval

  1. Open your agent pipeline in the Airia builder
  2. Navigate to the AI Step you want to configure
  3. Toggle on Enable Datasource

Step 2 — Select Your Datasource(s)

  1. Click the datasource dropdown that appears
  2. Select one or more datasources — multi-selection is supported
  3. The description and ID of each selected datasource are automatically passed to the LLM context at search time
💡 Tip: Make sure your datasource descriptions are clear and specific. The AI uses these descriptions to determine which source is most relevant for a given query.
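To see why specific descriptions matter, here is a toy stand-in for the model's source-selection step. The word-overlap scorer and the datasource records are invented for illustration; the real selection is done by the LLM reasoning over the descriptions.

```python
def pick_datasource(query, datasources):
    # Toy stand-in for LLM source selection: score each description by
    # word overlap with the query. Vague descriptions score poorly and
    # make the wrong source more likely to be chosen.
    q = set(query.lower().split())
    def overlap(ds):
        return len(q & set(ds["description"].lower().split()))
    return max(datasources, key=overlap)["id"]

# Hypothetical datasources with deliberately specific descriptions:
datasources = [
    {"id": "ds-hr",  "description": "employee handbook, benefits and PTO policies"},
    {"id": "ds-eng", "description": "engineering runbooks and deployment guides"},
]
best = pick_datasource("what is the PTO policy", datasources)
```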

Step 3 — Review Retrieval Tools

Once a datasource is selected, the Airia Datasource MCP Server is automatically deployed and attached to your AI step.
  • By default, all available retrieval tools are enabled
  • You can manually disable individual tools based on your use case (e.g., if you only want vector search and not keyword search)
⚠️ Important: If neither the Airia Datasource MCP Server nor any Airia native retrieval tools are configured, the LLM will not have access to your knowledge base and may produce incorrect or hallucinated answers.

What If No Datasource Is Selected?

If a datasource is not selected in the AI step, the LLM will still require a datasource ID to search against. You must provide it in one of these ways:
  • In the LLM prompt (system or user prompt)
  • In the user input passed to the AI step at runtime
⚠️ Warning: If no datasource ID is supplied through any of these methods and no retrieval tool is configured, the AI has no knowledge source to query. This will likely result in hallucinated or factually incorrect responses.
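Supplying the ID through the system prompt can be as simple as interpolating it into the prompt text. The ID format and wording below are hypothetical; adapt them to your datasource and prompt style.

```python
def build_system_prompt(base_prompt, datasource_id):
    # Embed the datasource ID in the system prompt so the LLM knows
    # which source to pass to its retrieval tools at runtime.
    return (f"{base_prompt}\n"
            f"When answering, search datasource {datasource_id} "
            f"using the available retrieval tools.")

prompt = build_system_prompt("You are a support assistant.",
                             "a1b2c3d4-0000-0000-0000-000000000000")
```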

Configuring a Data Search Step

Semantic Search Settings

  • Max Results — Maximum number of text chunks returned based on semantic similarity. Default: 5. Range: 1-10,000.
  • Relevance Threshold (1-100) — Filters out chunks below a minimum similarity score. Default: 70. Maps to cosine similarity.
  • Neighboring Chunks — Includes surrounding context from matched chunks. Default: 1 (one chunk before and after). Range: 0-10.
  • Hybrid Search — Combines semantic and keyword search with adjustable weighting:
    • 100% Keyword / 0% Semantic: Only exact word matches
    • 50% Keyword / 50% Semantic: Equal importance to meaning and exact words
    • 0% Keyword / 100% Semantic: Only meaning-based matching
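The interplay of these settings can be sketched numerically. This is an illustrative model of the behavior described above, assuming scores on a 0-100 scale; it is not Airia's scoring code.

```python
def hybrid_score(keyword_score, semantic_score, keyword_weight):
    # Weighted fusion mirroring the slider: keyword_weight is the keyword
    # fraction (1.0 = keyword only, 0.0 = semantic only).
    return keyword_weight * keyword_score + (1 - keyword_weight) * semantic_score

def search(chunks, scores, relevance_threshold=70, max_results=5, neighboring=1):
    # Keep chunks at or above the threshold, cap at max_results, then
    # expand each hit with `neighboring` chunks on each side for context.
    hits = [i for i, s in enumerate(scores) if s >= relevance_threshold]
    hits = sorted(hits, key=lambda i: scores[i], reverse=True)[:max_results]
    keep = set()
    for i in hits:
        keep.update(range(max(0, i - neighboring),
                          min(len(chunks), i + neighboring + 1)))
    return [chunks[i] for i in sorted(keep)]

chunks = ["c0", "c1", "c2", "c3", "c4", "c5"]
scores = [10, 95, 20, 5, 5, 80]          # only c1 and c5 clear the threshold
result = search(chunks, scores, relevance_threshold=70, neighboring=1)
```

With the defaults, `c1` and `c5` match, and each brings one neighbor on either side, so `c3` is the only chunk excluded.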
For structured data (.csv and .xlsx files), you can enable Text-to-SQL search:
  • Translates natural language queries into SQL
  • Supports fuzzy search capability (increases query complexity)
  • Recommended models: Claude 4 Sonnet, GPT 4.1, Claude 3.7 Sonnet
Important: For both Semantic and Text-to-SQL search to function, the required indexes must be created when the data source is first configured, at creation time.
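A minimal sketch of the Text-to-SQL flow, using the standard-library `sqlite3` module and sample CSV data: in the real feature, an LLM translates the natural-language query into SQL, so the hand-written statement below is a stand-in for that translation step.

```python
import csv
import io
import sqlite3

# Structured data as it might arrive from a .csv upload (sample data).
csv_text = "region,revenue\nEMEA,120\nAMER,200\nAPAC,90\n"
rows = list(csv.DictReader(io.StringIO(csv_text)))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, revenue INTEGER)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [(r["region"], int(r["revenue"])) for r in rows])

# An LLM would translate "which region has the highest revenue?" into SQL;
# this statement is a hand-written stand-in for that translation.
sql = "SELECT region FROM sales ORDER BY revenue DESC LIMIT 1"
top_region = conn.execute(sql).fetchone()[0]
```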

Available Retrieval Tools (MCP)

When using MCP Multi-Hop Retrieval, the Airia Datasource MCP Server exposes:

| Tool | What it does |
| --- | --- |
| Datastore Semantic and Keyword Search | Searches a single data source using vector and/or keyword matching |
| Multi Data Store Semantic and Keyword Search | Searches across multiple data sources simultaneously |
| Datastore Filename Search | Finds files by name within a data source |
| File Content Retrieval | Retrieves the full content of a specific file |
| Graph Database Cypher Query | Queries a knowledge graph (when Graph RAG is enabled) |
| Multi Data Store SQL Query | Runs natural language-to-SQL queries on structured data |
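Conceptually, the server routes each tool call by name to a handler, and disabling a tool removes it from that registry. The dispatch below is an illustrative sketch with stub handlers, not the real MCP server implementation.

```python
# Stub handlers standing in for the real retrieval implementations.
def semantic_keyword_search(args):
    return f"searched {args['datasource_id']}"

def filename_search(args):
    return f"files matching {args['pattern']}"

# Only enabled tools appear in the registry; a disabled tool is simply absent.
TOOLS = {
    "Datastore Semantic and Keyword Search": semantic_keyword_search,
    "Datastore Filename Search": filename_search,
}

def call_tool(name, args):
    if name not in TOOLS:
        raise ValueError(f"tool disabled or unknown: {name}")
    return TOOLS[name](args)

result = call_tool("Datastore Filename Search", {"pattern": "*.pdf"})
```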

Limitations

⚠️ Known Limitation: Tool calls — including datasource retrieval tools — do not currently work within nested agent (agent-in-agent) configurations. This is a platform-wide limitation affecting all MCPs, not specific to the Datasource MCP Server.

Best Practices

  • Write descriptive datasource names and descriptions for intelligent LLM source selection
  • Retain all retrieval tools unless you’re confident they won’t be needed
  • Ensure retrieval configuration exists in all knowledge-base AI steps
  • Use the Data Search Step for straightforward workflows where cost and speed matter
  • Use the MCP Multi-Hop Retrieval for complex, multi-source, or conversational use cases
  • Test multi-hop retrieval extensively before deployment