This guide walks you through connecting a data source, ingesting documents, and searching them from an agent — in about 10 minutes.

Prerequisites

  • An Airia account with access to a project
  • At least one document to upload (PDF, DOCX, or TXT)

Step 1: Create a Data Source

  1. Open your project and navigate to Data Sources
  2. Click Add Data Source
  3. Choose File Upload as the connector type (simplest for getting started)
  4. Name your data source — use a descriptive name like “Product Documentation” or “HR Policies” (this name is visible to the LLM when using agentic retrieval, so make it meaningful)
  5. Upload one or more files

Step 2: Configure Ingestion Settings

Before ingesting, review the key settings:
  1. PDF Parser — For standard documents, Basic works fine. If your documents contain tables, images, or complex layouts, choose Advanced or Universal. See Ingestion Settings for details on each parser.
  2. Scan Document for Images — Enabled by default. Leave it on if your documents contain relevant images or diagrams.
  3. Vector Database — Leave as Airia DB (default) unless you’re bringing your own vector store.
  4. Knowledge Graph Extraction — Leave off for this quick start. See the Graph RAG guide when you’re ready to try it.
Click Save to start ingestion. You can monitor progress in the data source detail view, where each file shows its processing status.

Step 3: Add the Data Source to an Agent

Once ingestion is complete:
  1. Open or create an agent in the Agent Builder
  2. You have two options:
Option A: Data Search Step (simple)
  • Drag a Data Search Step into your agent flow, before the AI Model step
  • Select your data source
  • Configure search settings:
    • Max Results: 5 (default, good starting point)
    • Relevance Threshold: 70 (default)
    • Neighboring Chunks: 1 (includes surrounding context)
  • Connect the Data Search Step output to your AI Model step’s input
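To make the three search settings concrete, here is a minimal Python sketch of how they might interact. It is an illustration only, not Airia's implementation: the `Chunk` type, the 0 to 100 score scale, the `select_chunks` helper, and the order of operations (filter by threshold, keep the top hits, then add neighbors) are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    doc_id: str
    index: int    # position of the chunk within its document
    text: str
    score: float  # hypothetical relevance score on a 0-100 scale


def select_chunks(chunks, max_results=5, relevance_threshold=70, neighboring_chunks=1):
    """Illustrative only: filter by threshold, keep the top hits, then add neighbors."""
    hits = sorted(
        (c for c in chunks if c.score >= relevance_threshold),
        key=lambda c: c.score,
        reverse=True,
    )[:max_results]

    by_position = {(c.doc_id, c.index): c for c in chunks}
    selected = {}
    for hit in hits:
        # Pull in the chunks immediately before and after each hit.
        for offset in range(-neighboring_chunks, neighboring_chunks + 1):
            neighbor = by_position.get((hit.doc_id, hit.index + offset))
            if neighbor is not None:
                selected[(neighbor.doc_id, neighbor.index)] = neighbor

    # Return in document order so neighbors read as continuous context.
    return [selected[key] for key in sorted(selected)]


# Made-up example data; the scores are invented for illustration.
chunks = [
    Chunk("handbook", 0, "Introduction", 20),
    Chunk("handbook", 1, "Vacation policy overview", 65),
    Chunk("handbook", 2, "Employees accrue 1.5 days per month", 88),
    Chunk("handbook", 3, "Carry-over rules", 72),
    Chunk("handbook", 4, "Expense reports", 30),
]

default_result = [c.index for c in select_chunks(chunks)]
# -> [1, 2, 3, 4]: chunks 2 and 3 pass the threshold, 1 and 4 come in as neighbors
```

In this sketch, a lower threshold admits more hits, while Neighboring Chunks widens the context around each hit without changing which chunks count as hits.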
Option B: MCP Multi-Hop Retrieval (agentic)
  • Open your AI Model Step settings
  • Toggle on Datasources
  • Select your data source
  • The Airia Datasource MCP Server is automatically deployed — the LLM will dynamically search your data as needed
💡 Which should I choose? Start with Option B (MCP Multi-Hop Retrieval) for the most natural experience. The LLM decides when and how to search. Use Option A if you need deterministic, single-pass retrieval every time.
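The difference between the two options can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the behavior of the Airia Datasource MCP Server: `search` is a simple keyword matcher, and `plan_next_query` is a hypothetical stand-in for the LLM's decision about whether to search again.

```python
def search(corpus, query, max_results=2):
    """Toy keyword search: rank texts by how many query words they contain."""
    words = set(query.lower().split())
    scored = [(len(words & set(text.lower().split())), text) for text in corpus]
    ranked = sorted(scored, key=lambda pair: -pair[0])
    return [text for score, text in ranked if score > 0][:max_results]


def single_pass(corpus, question):
    # Option A style: exactly one search, always run, using the question itself.
    return search(corpus, question)


def multi_hop(corpus, question, plan_next_query, max_hops=3):
    # Option B style: a planner (standing in for the LLM) decides whether to
    # search again, and with which query, based on what has been found so far.
    found = []
    query = question
    for _ in range(max_hops):
        if query is None:
            break
        for text in search(corpus, query):
            if text not in found:
                found.append(text)
        query = plan_next_query(question, found)
    return found


def planner(question, found):
    # Hypothetical planner: once "atlas" shows up, look up who maintains it.
    if any("atlas" in t for t in found) and not any("maintained" in t for t in found):
        return "atlas maintained"
    return None


corpus = [
    "billing runs on atlas",
    "atlas is maintained by the platform team",
    "lunch is served at noon",
]
question = "who maintains billing"

one_shot = single_pass(corpus, question)
agentic = multi_hop(corpus, question, planner)
```

Here the single pass finds only the billing text, while the multi-hop loop follows up on "atlas" and also retrieves the text naming the platform team. The trade-off is that the number and content of searches in the second case depend on the planner's decisions.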

Step 4: Test Your Agent

  1. Click Test in the agent builder
  2. Ask a question about the content in your uploaded documents
  3. The agent should respond with information grounded in your data, with source citations
If results are not relevant enough:
  • Lower the Relevance Threshold (e.g., to 50) to return more results
  • Enable Hybrid Search with alpha set to 0.5 to blend semantic and keyword matching
  • Check that your documents were fully ingested (file status should show as “Processed”)
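As a rough intuition for the alpha setting, hybrid search is commonly described as a weighted blend of a semantic score and a keyword score. The Python sketch below assumes that alpha weights the semantic side and that both scores are normalized to the range 0 to 1; the function names and example scores are hypothetical, and the exact formula Airia uses is not stated here.

```python
def hybrid_score(semantic, keyword, alpha=0.5):
    """Blend two scores in [0, 1]. Assumes alpha weights the semantic side."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    return alpha * semantic + (1.0 - alpha) * keyword


def rank(candidates, alpha=0.5):
    """Return candidate names ordered from highest to lowest blended score."""
    return sorted(
        candidates,
        key=lambda name: hybrid_score(*candidates[name], alpha=alpha),
        reverse=True,
    )


# Made-up (semantic, keyword) scores for two documents.
candidates = {
    "Refund policy": (0.9, 0.2),        # close in meaning, few exact term matches
    "SKU-4471 spec sheet": (0.4, 1.0),  # exact term match, weaker semantic match
}

blended_order = rank(candidates, alpha=0.5)
semantic_only_order = rank(candidates, alpha=1.0)
```

With alpha at 0.5, the exact-match document ranks first in this example. With alpha at 1.0 (semantic only, under the assumption above), the order flips. This is why a blend can help when queries contain specific identifiers or product names.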

Next Steps