Airia exposes your data sources as tools through the Model Context Protocol (MCP), an open standard for connecting AI agents to external context. This means any MCP-compatible client — Claude Desktop, Cursor, custom agent frameworks, or other Airia agents — can search your Airia data sources without being locked into a specific vendor. This guide shows you how to make your Airia knowledge base available to external agents.
How It Works
When you attach a data source to an AI Model step in Airia, the platform automatically deploys the Airia Datasource MCP Server. This server exposes your retrieval capabilities as callable tools:

| Tool | What it does |
|---|---|
| Datastore Semantic and Keyword Search | Searches a single data source using vector and/or keyword matching |
| Multi Data Store Semantic and Keyword Search | Searches across multiple data sources simultaneously |
| Datastore Filename Search | Finds files by name within a data source |
| File Content Retrieval | Retrieves full content of a specific file |
| Graph Database Cypher Query | Queries a knowledge graph (when Graph RAG is enabled) |
| Multi Data Store SQL Query | Runs natural language-to-SQL queries on structured data |
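
Under MCP, a client invokes a tool by sending a JSON-RPC 2.0 `tools/call` request. The sketch below shows roughly what such a message looks like. The tool identifier and argument names are hypothetical; a client would discover the real ones from the server's `tools/list` response.

```python
import json


def build_tool_call(request_id: int, tool_name: str, arguments: dict) -> dict:
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Hypothetical tool identifier and argument names, for illustration only.
request = build_tool_call(
    1,
    "datastore_semantic_and_keyword_search",
    {"datasource_id": "<your-datasource-id>", "query": "PTO policy for EU employees"},
)
print(json.dumps(request, indent=2))
```

Most MCP clients build this message for you; seeing its shape is mainly useful when debugging gateway traffic.
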
Option 1: Expose an Airia Agent as an MCP Tool
The simplest path is to expose an existing Airia agent as a tool that external agents can call:
- Open your agent in the Agent Builder
- Go to Interfaces and add a Tool & MCP Interface
- Configure the tool name and description — make the description clear about what knowledge this agent can retrieve (external agents use this description to decide when to call it)
- Publish the agent
Option 2: Connect to the Airia MCP Gateway
For direct access from external MCP clients:
- Navigate to Admin Hub > Account Settings > MCP Gateway
- Configure access for your MCP client
- In your external client, add Airia as an MCP server using the gateway URL
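
For a custom client, the first message sent to an MCP server is an `initialize` request. The following is a minimal sketch of that request, assuming the gateway speaks MCP's Streamable HTTP transport. The URL is a placeholder, and both the bearer-token header and the protocol version are assumptions to check against your gateway settings.

```python
import json
import urllib.request

# Placeholder: use the gateway URL shown in your MCP Gateway settings.
GATEWAY_URL = "https://example.invalid/mcp"


def build_initialize_request(url: str, token: str) -> urllib.request.Request:
    """Build (but do not send) an MCP `initialize` request over HTTP POST."""
    body = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "initialize",
        "params": {
            "protocolVersion": "2025-03-26",  # assumption: version the gateway accepts
            "capabilities": {},
            "clientInfo": {"name": "example-client", "version": "0.1.0"},
        },
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Streamable HTTP clients accept both plain JSON and SSE responses.
            "Accept": "application/json, text/event-stream",
            "Authorization": f"Bearer {token}",  # assumption: auth scheme may differ
        },
    )


init_request = build_initialize_request(GATEWAY_URL, "<your-token>")
```

To actually connect, pass `init_request` to `urllib.request.urlopen`, or use an MCP SDK that handles the handshake and session for you.
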
Connecting from Claude Desktop
- In Claude Desktop, go to Settings > MCP Servers
- Add a new server with your Airia MCP Gateway URL
- Authenticate with your Airia credentials
- Your Airia data sources appear as available tools in Claude Desktop
- When you ask Claude a question, it can now search your Airia knowledge base as part of its response
Connecting from Other MCP Clients
Any client that supports the MCP standard (Streamable HTTP transport) can connect to Airia’s MCP Gateway. The gateway exposes the same retrieval tools listed above.
Option 3: Use Airia’s API Directly
If your agent framework doesn’t support MCP, you can call Airia’s retrieval endpoints directly via REST API:
- Navigate to Interfaces on your agent and add an API Interface
- Use the API key and endpoint URL to make search requests programmatically
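
A search request made this way might look like the sketch below. The header name and the request body field are hypothetical; copy the real endpoint URL, header, and payload shape from the API Interface you created.

```python
import json
import urllib.request


def build_search_request(endpoint_url: str, api_key: str, question: str) -> urllib.request.Request:
    """Build (but do not send) a POST request to an agent's API endpoint."""
    payload = {"userInput": question}  # hypothetical field name
    return urllib.request.Request(
        endpoint_url,
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "X-API-KEY": api_key,  # hypothetical header name
        },
    )


search_request = build_search_request(
    "https://example.invalid/agent-endpoint",  # placeholder endpoint URL
    "<your-api-key>",
    "What is our PTO policy?",
)
```

Sending it is a single call to `urllib.request.urlopen(search_request)`; keep the API key in an environment variable or secret store rather than in source code.
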
Step-by-Step: Configuring MCP Multi-Hop Retrieval
Step 1 — Enable MCP Multi-Hop Retrieval
- Open your agent pipeline in the Airia builder
- Navigate to the AI Step you want to configure
- Toggle on Enable Datasource
Step 2 — Select Your Datasource(s)
- Click the datasource dropdown that appears
- Select one or more datasources. Multi-selection is supported, so you can connect as many knowledge sources as your use case requires
- The description and ID of each selected datasource are automatically passed to the LLM context at search time
💡 Tip: Make sure your datasource descriptions are clear and specific. The AI uses these descriptions to determine which source is most relevant for a given query.
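To see why specificity matters, the toy sketch below scores two descriptions against a query by counting shared words. This is only a crude stand-in, not how an LLM or Airia actually selects a source, but it shows that a one-word description gives a selector nothing to match on.

```python
import re


def tokens(text: str) -> set[str]:
    """Lowercase a string and split it into a set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def overlap(query: str, description: str) -> int:
    """Count the words shared between a query and a description."""
    return len(tokens(query) & tokens(description))


descriptions = {
    "vague": "Documents",
    "specific": (
        "Internal HR policies including PTO, benefits, onboarding "
        "procedures, and employee handbook."
    ),
}
query = "What are the onboarding procedures and benefits for new hires?"

scores = {name: overlap(query, text) for name, text in descriptions.items()}
print(scores)
```

The vague description shares no words with the query, while the specific one matches on "onboarding", "procedures", and "benefits".
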
Step 3 — Review Retrieval Tools
Once a datasource is selected, the Airia Datasource MCP Server is automatically deployed and attached to your AI step.
- By default, all available retrieval tools are enabled
- You can manually disable individual tools based on your use case (e.g., if you only want vector search and not keyword search)
⚠️ Important: If neither the Airia Datasource MCP Server nor any Airia native retrieval tools are configured, the LLM will not have access to your knowledge base and may produce incorrect or hallucinated answers.
What If No Datasource Is Selected?
If a datasource is not selected in the AI step, the LLM will still require a datasource ID to search against. In this case, you must provide it in one of these ways:
- In the LLM prompt (system or user prompt)
- In the user input passed to the AI step at runtime
⚠️ Warning: If no datasource ID is supplied through any of these methods and no retrieval tool is configured, the AI has no knowledge source to query. This will likely result in hallucinated or factually incorrect responses.
Limitations
⚠️ Known Limitation: Tool calls — including datasource retrieval tools — do not currently work within nested agent (agent-in-agent) configurations. This is a platform-wide limitation affecting all MCPs, not specific to the Datasource MCP Server.
Configuring What Tools Are Available
When using MCP Multi-Hop Retrieval:
- Open the AI Model step settings
- Under the datasource configuration, review the list of available retrieval tools
- All tools are enabled by default — disable any that aren’t needed for your use case
💡 Tip: If your data source doesn’t have structured data (CSV/XLSX), disable SQL-related tools to reduce noise in the LLM’s tool selection. If Knowledge Graph Extraction isn’t enabled, graph query tools won’t appear.
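The rule of thumb in this tip can be written down as a small checklist helper. This is a planning aid only, not an Airia API: the tool names come from the table above, and the `DataSourceProfile` fields are hypothetical labels for what you know about your own data source.

```python
from dataclasses import dataclass

ALL_TOOLS = [
    "Datastore Semantic and Keyword Search",
    "Multi Data Store Semantic and Keyword Search",
    "Datastore Filename Search",
    "File Content Retrieval",
    "Graph Database Cypher Query",
    "Multi Data Store SQL Query",
]


@dataclass
class DataSourceProfile:
    has_structured_data: bool  # CSV/XLSX files are present
    graph_enabled: bool        # Knowledge Graph Extraction is turned on


def tools_to_keep(profile: DataSourceProfile) -> list[str]:
    """Return the tools worth leaving enabled for a given data source."""
    keep = []
    for tool in ALL_TOOLS:
        if "SQL" in tool and not profile.has_structured_data:
            continue  # SQL tools add noise when there is no structured data
        if "Cypher" in tool and not profile.graph_enabled:
            continue  # graph tools are not offered without graph extraction
        keep.append(tool)
    return keep


docs_only = DataSourceProfile(has_structured_data=False, graph_enabled=False)
print(tools_to_keep(docs_only))
```

For a documents-only source, this leaves the two search tools, filename search, and file content retrieval.
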
Writing Effective Datasource Descriptions
When your data sources are exposed via MCP, the LLM uses the data source name and description to decide which source to search. Write descriptions that clearly communicate the content:
- Weak: “Documents” — Too vague. The LLM can’t determine when to search this source.
- Strong: “Internal HR policies including PTO, benefits, onboarding procedures, and employee handbook. Covers US and EU employees. Updated quarterly.” — The LLM can match user queries about benefits or onboarding to this source confidently.

This is especially important when an agent has access to multiple data sources. Clear descriptions enable accurate source selection without unnecessary searches.
Multi-Hop Retrieval in Action
With agentic retrieval via MCP, the LLM doesn’t just run one search. It can:
- Search broadly — “Find documents about Project Atlas” (semantic search)
- Narrow down — “Get the budget section from the Project Atlas proposal” (file content retrieval)
- Cross-reference — “Search the compliance data source for regulations that apply to this project type” (multi-source search)
- Query the graph — “What entities are connected to the vendor mentioned in the proposal?” (Cypher query)
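
The pattern behind these hops is a loop in which each tool result informs the next call. The sketch below illustrates that loop with a scripted policy standing in for the LLM and a stub standing in for the MCP server. The tool names, file name, and results are canned placeholders, not real Airia output.

```python
from typing import Callable, Optional

Step = tuple[str, dict]


def multi_hop(
    question: str,
    choose_next: Callable[[str, list], Optional[Step]],
    call_tool: Callable[[str, dict], object],
    max_hops: int = 4,
) -> list:
    """Call tools until the policy stops or the hop limit is reached."""
    history = []
    for _ in range(max_hops):
        step = choose_next(question, history)
        if step is None:
            break
        tool, arguments = step
        history.append((tool, arguments, call_tool(tool, arguments)))
    return history


def scripted_policy(question: str, history: list) -> Optional[Step]:
    """Stand-in for the LLM: search first, then open the top hit."""
    if not history:
        return ("semantic_search", {"query": question})
    if len(history) == 1:
        first_hit = history[0][2][0]  # file name returned by hop 1
        return ("file_content", {"file": first_hit})
    return None


def stub_tool(tool: str, arguments: dict) -> object:
    """Stand-in for the MCP server: returns canned results."""
    if tool == "semantic_search":
        return ["atlas_proposal.pdf"]
    return f"(contents of {arguments['file']})"


trace = multi_hop("Find documents about Project Atlas", scripted_policy, stub_tool)
for tool, arguments, result in trace:
    print(tool, arguments, result)
```

The `max_hops` cap reflects the cost trade-off of multi-hop retrieval: every extra hop adds another tool call and more tokens.
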
Best Practices
- Name and describe your data sources clearly — this is the single biggest factor in retrieval quality with agentic search
- Keep retrieval tools enabled unless you have a reason to disable them — the LLM is good at selecting the right tool
- Test multi-hop behavior — ask complex questions that require multiple searches and verify the agent is searching effectively
- Monitor token usage — multi-hop retrieval uses more tokens than single-hop. Use the Data Search Step for simple queries where cost matters
- Use the Tool & MCP Interface for cross-agent retrieval — if you have a specialized knowledge agent, expose it as a tool so other agents can call it rather than duplicating data source configurations
💡 Tip: Semantic search and Text-to-SQL search only work if the required indexes were created and the data source was configured for them when it was created.
