Connecting to Data Sources
Data sources in the Airia platform enable you to integrate content from various origins into your Agent. Once a data source is added, the data undergoes ingestion, encoding, and indexing so that it can be retrieved by the LLM based on the user’s query. We support sparse and dense vectors for efficient retrieval, as well as SQL indexing for text-to-SQL capabilities. The processed data is then stored in vector databases, making it ready for retrieval-augmented generation (RAG) and tooling operations.
Airia’s Data Source connectors allow you to ingest different file types, which then serve as knowledge for your Agent.
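To make the ingest, encode, index, and retrieve flow more concrete, here is a toy sketch in Python. It is not Airia's implementation: the function names are hypothetical, the sparse vectors are raw term counts, and the "dense" vectors use token hashing as a stand-in for a real embedding model. It only illustrates how sparse and dense scores can both contribute to ranking chunks against a user query.

```python
import hashlib
import math
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase whitespace tokenizer; real pipelines use richer tokenization."""
    return text.lower().split()


def sparse_vector(text: str) -> dict[str, float]:
    """Sparse encoding: one weight per distinct term (raw term counts here)."""
    return {term: float(n) for term, n in Counter(tokenize(text)).items()}


def dense_vector(text: str, dims: int = 32) -> list[float]:
    """Stand-in for a dense embedding: hash each token into a fixed-length vector.

    Assumption: a real system would call an embedding model instead.
    """
    vec = [0.0] * dims
    for token in tokenize(text):
        digest = hashlib.sha256(token.encode("utf-8")).hexdigest()
        vec[int(digest, 16) % dims] += 1.0
    return vec


def sparse_cosine(a: dict[str, float], b: dict[str, float]) -> float:
    """Cosine similarity between two sparse term-weight mappings."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def dense_cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(chunks: list[str]) -> list[tuple[str, dict[str, float], list[float]]]:
    """Indexing step: store each chunk alongside both of its encodings."""
    return [(chunk, sparse_vector(chunk), dense_vector(chunk)) for chunk in chunks]


def retrieve(query: str, index, top_k: int = 2) -> list[tuple[str, float]]:
    """Rank chunks by the sum of sparse and dense similarity to the query."""
    q_sparse, q_dense = sparse_vector(query), dense_vector(query)
    scored = [
        (chunk, sparse_cosine(q_sparse, s) + dense_cosine(q_dense, d))
        for chunk, s, d in index
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]


index = build_index([
    "reset your password from the settings page",
    "invoices are emailed monthly",
    "upload files to the data source",
])
for chunk, score in retrieve("how do i reset my password", index):
    print(f"{score:.3f}  {chunk}")
```

Summing the two scores is only one simple way to combine them; the platform's actual weighting and storage are not described in this document.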
Supported File Types
Airia’s Data Source connectors support the following file types:
- Office Documents
  - Word Documents (.docx, .doc)
  - Excel Files (.xlsx, .xls)
  - PowerPoint Files (.pptx, .ppt)
- Structured Data
  - JSON (.json)
  - CSV (.csv)
- Markdown (.md, .mdx)
- Images
  - JPEG/JPG (.jpeg, .jpg)
  - PNG (.png)
  - BMP (.bmp)
  - TIFF (.tif, .tiff)
  - HEIF (.heif, .heic)
- Plain Text files (.txt)
- PDF (.pdf)
💡 Note on Supported Content:
- JSON files are supported as text only.
- Excel and CSV files are supported as text (with a file size limit of 50MB) and as SQL (with a file size limit of 100MB).
- HEIF files are not supported for Microsoft connectors due to how files are processed by those connectors.
- PDF files are supported in either Text-only mode or Text with Images mode.
⚠️ Warning on Script Files: Files containing script content (such as PHP, JS, etc.) are not supported via direct File Upload, regardless of their file type. Such files can, however, be ingested via any other compatible connector (e.g., through a cloud storage connector if they reside there).
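The file-type list and notes above can be turned into a simple pre-upload check. The sketch below is a hypothetical local helper, not part of any Airia API. It assumes that "MB" means 1024 × 1024 bytes, that a file exactly at the limit is accepted, and that the two modes are named "text" and "sql". It does not inspect file content, so the script-file restriction on File Upload is out of scope here.

```python
from pathlib import Path

MB = 1024 * 1024  # assumption: "MB" is read as 2**20 bytes

# Extensions taken from the supported file types list.
SUPPORTED_EXTENSIONS = {
    ".docx", ".doc", ".xlsx", ".xls", ".pptx", ".ppt",
    ".json", ".csv", ".md", ".mdx",
    ".jpeg", ".jpg", ".png", ".bmp", ".tif", ".tiff", ".heif", ".heic",
    ".txt", ".pdf",
}
TABULAR_EXTENSIONS = {".xlsx", ".xls", ".csv"}
HEIF_EXTENSIONS = {".heif", ".heic"}
# Size limits for Excel and CSV files, per processing mode.
TABULAR_LIMITS = {"text": 50 * MB, "sql": 100 * MB}


def check_file(
    filename: str,
    size_bytes: int,
    mode: str = "text",
    microsoft_connector: bool = False,
) -> list[str]:
    """Return a list of problems; an empty list means no documented rule is violated."""
    if mode not in TABULAR_LIMITS:
        raise ValueError(f"unknown mode: {mode!r} (expected 'text' or 'sql')")

    ext = Path(filename).suffix.lower()
    if ext not in SUPPORTED_EXTENSIONS:
        return [f"unsupported file type: {ext or '(none)'}"]

    problems = []
    if ext in HEIF_EXTENSIONS and microsoft_connector:
        problems.append("HEIF files are not supported for Microsoft connectors")
    if mode == "sql" and ext not in TABULAR_EXTENSIONS:
        problems.append("SQL mode is documented only for Excel and CSV files")
    if ext in TABULAR_EXTENSIONS and size_bytes > TABULAR_LIMITS[mode]:
        limit_mb = TABULAR_LIMITS[mode] // MB
        problems.append(f"file exceeds the {limit_mb}MB limit for {mode} mode")
    return problems


print(check_file("sales.csv", 60 * MB, mode="text"))
print(check_file("sales.csv", 60 * MB, mode="sql"))
```

Returning a list of problems rather than a boolean makes it easy to report every violated rule at once, for example when screening a whole folder before upload.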
How to Add a New Data Source
- Navigate to Data Sources: In your project, go to the Data Sources sub-menu.
- Add New Data Source: Click Add data source.
- Select Connector: Browse the connector library and select the appropriate data source connector (e.g., “Confluence,” “Google Drive,” “File Upload”).
- Configure Connector Details: Provide a name for your data source and fill out any additional required fields specific to the selected connector type.
  💡 Note: Refer to each specific connector’s documentation for detailed configuration requirements.
- Configure Ingestion Settings (Optional): Choose the Vector database, Image scanning mode, and SQL indexing for structured data that best suit your data source’s content.
  💡 Note: For more information on these settings, see Ingestion Settings.
- Create Data Source: Click Done to finalize the setup and initiate ingestion.
Managing a Data Source
After a data source has been created, you can easily manage it from the list view in the Data Sources section.
Selecting an existing data source allows you to:
- View and edit key details.
- Review the data ingested from that source.
- Adjust the selected content for ingestion.
- Upload additional files directly to the platform (“File Upload” data sources only).