Data Components

Data components load data from a source into your flow. They may perform some processing or type checking, like converting raw HTML data into text, or ensuring your loaded file is of an acceptable type.

Use a data component in a flow

The URL data component loads content from a list of URLs. In the component’s URLs field, enter the URL you want to load To add multiple URL fields, click the add button Alternatively, connect a component that outputs the Message type, like the Chat Input component, to supply your URLs from a component

API Request Component

This component makes HTTP requests using URLs or cURL commands.

Setup Instructions

Connect the Data output to a component that accepts the input (for example, connect the API Request component to a Chat Output component)

In the API component’s URLs field, enter the endpoint for your request
In the Method field, enter the type of request (GET, POST, PATCH, PUT, or DELETE)
Optionally, enable the Use cURL button to create a field for pasting curl requests
Click Playground, then click Run Flow to execute your request

Parameters

Inputs

Name	Display Name	Info
urls	URLs	Enter one or more URLs, separated by commas.
curl	cURL	Paste a curl command to populate the dictionary fields for headers and body.
method	Method	The HTTP method to use.
use_curl	Use cURL	Enable cURL mode to populate fields from a cURL command.
query_params	Query Parameters	The query parameters to append to the URL.
body	Body	The body to send with the request as a dictionary (for `POST`, `PATCH`, `PUT`).
headers	Headers	The headers to send with the request as a dictionary.
timeout	Timeout	The timeout to use for the request.
follow_redirects	Follow Redirects	Whether to follow http redirects.
save_to_file	Save to File	Save the API response to a temporary file.
include_httpx_metadata	Include HTTPx Metadata	Include properties such as `headers`, `status_code`, `response_headers`, and `redirection_history` in the output.

Outputs

Name	Display Name	Info
data	Data	The result of the API requests. Returns a Data object containing source URL and results.
dataframe	DataFrame	Converts the API response data into a tabular DataFrame format.

File

This component loads and parses files of various supported formats and converts the content into a Data object. It supports multiple file types and provides options for parallel processing and error handling. To Load a Document

Click the Select files button

Select a local file or a file loaded with File management

Parameters

Inputs

Name	Display Name	Info
path	Files	The path to files to load. Supports individual files or bundled archives.
file_path	Server File Path	A Data object with a `file_path` property pointing to the server file or a Message object with a path to the file. Supersedes ‘Path’ but supports the same file types.
separator	Separator	The separator to use between multiple outputs in Message format.
silent_errors	Silent Errors	If true, errors do not raise an exception.
delete_server_file_after_processing	Delete Server File After Processing	If true, the Server File Path is deleted after processing.
ignore_unsupported_extensions	Ignore Unsupported Extensions	If true, files with unsupported extensions are not processed.
ignore_unspecified_files	Ignore Unspecified Files	If true, `Data` with no `file_path` property is ignored.
use_multithreading	[Deprecated] Use Multithreading	Set ‘Processing Concurrency’ greater than `1` to enable multithreading. This option is deprecated.
concurrency_multithreading	Processing Concurrency	When multiple files are being processed, the number of files to process concurrently. Default is 1. Values greater than 1 enable parallel processing for 2 or more files.

Outputs

Name	Display Name	Info
data	Data	The parsed content of the file as a Data object.
dataframe	DataFrame	The file content as a DataFrame object.
message	Message	The file content as a Message object.

Supported File Types

Text Files

.txt - Plain text files
.md, .mdx - Markdown files
.csv - Comma-separated values
.json - JSON data files
.yaml, .yml - YAML configuration files
.xml - XML documents
.html, .htm - HTML web pages
.pdf - PDF documents
.docx - Microsoft Word documents

Code Files

.py - Python source code
.js - JavaScript files
.ts, .tsx - TypeScript files
.sh - Shell scripts
.sql - SQL query files

Archive Formats

For bundling multiple files:

.zip - ZIP archives
.tar - TAR archives
.tgz - Gzipped TAR archives
.bz2 - Bzip2 compressed files
.gz - Gzip compressed files

SQL Query

This component executes SQL queries on a specified database.

Parameters

Inputs

Name	Display Name	Info
query	Query	The SQL query to execute.
database_url	Database URL	The URL of the database.
include_columns	Include Columns	Include columns in the result.
passthrough	Passthrough	If an error occurs, return the query instead of raising an exception.
add_error	Add Error	Add the error to the result.

Outputs

Name	Display Name	Info
result	Result	The result of the SQL query execution.

URL

This component fetches content from one or more URLs, processes the content, and returns it in various formats. It supports output in plain text or raw HTML. In the component’s URLs field, enter the URL you want to load

To use this component in a flow, connect the DataFrame output to a component that accepts the input. For example, connect the URL component to a Chat Output component.

In the URL component’s URLs field, enter the URL for your request. This example uses llmc.org.
Optionally, in the Max Depth field, enter how many pages away from the initial URL you want to crawl. Select 1 to crawl only the page specified in the URLs field. Select 2 to crawl all pages linked from that page. The component crawls by link traversal, not by URL path depth.
Click Playground, and then click Run Flow. The text contents of the URL are returned to the Playground as a structured DataFrame.
In the URL component, change the output port to Message, and then run the flow again. The text contents of the URL are returned as unstructured raw text, which you can extract patterns from with the Regex Extractor tool.
Connect the URL component to a Regex Extractor and Chat Output.

In the Regex Extractor tool, enter a pattern to extract text from the URL component’s raw output. This example extracts the first paragraph from the “In the News” section of https://en.wikipedia.org/wiki/Main_Page.

Parameters

Inputs

Name	Display Name	Info
urls	URLs	Click the ’+’ button to enter one or more URLs to crawl recursively.
max_depth	Max Depth	Controls how many ‘clicks’ away from the initial page the crawler will go.
prevent_outside	Prevent Outside	If enabled, only crawls URLs within the same domain as the root URL.
use_async	Use Async	If enabled, uses asynchronous loading which can be significantly faster but might use more system resources.
format	Output Format	Output Format. Use `Text` to extract the text from the HTML or `HTML` for the raw HTML content.
timeout	Timeout	Timeout for the request in seconds.
headers	Headers	The headers to send with the request.

Outputs

Name	Display Name	Info
data	Data	A list of Data objects containing fetched content and metadata.
text	Message	The fetched content as formatted text.
dataframe	DataFrame	The content formatted as a DataFrame object.

Webhook

This component defines a webhook trigger that runs a flow when it receives an HTTP POST request. If the input is not valid JSON, the component wraps it in a payload object so that it can be processed and still trigger the flow. The component does not require an API key. When you add a Webhook component to a flow, the flow’s API access pane exposes an additional Webhook cURL tab that contains a POST /v1/webhook/$FLOW_ID code snippet. You can use this request to send data to the Webhook component and trigger the flow. For example: To test the webhook component:

Add a Webhook component to the flow.
Connect the Webhook component’s Data output to the Data input of a Parser component.
Connect the Parser component’s Parsed Text output to the Text input of a Chat Output component.
In the Parser component, under Mode, select Stringify. This mode passes the webhook’s data as a string for the Chat Output component to print.
To send a POST request, copy the code from the Webhook cURL tab in the API pane and paste it into a terminal.
Send the POST request.
Open the Playground. Your JSON data is posted to the Chat Output component, which indicates that the webhook component is correctly triggering the flow.

Parameters

Inputs

Name	Display Name	Description
data	Payload	Receives a payload from external systems through HTTP POST requests.
curl	cURL	The cURL command template for making requests to this webhook.
endpoint	Endpoint	The endpoint URL where this webhook receives requests.

Outputs

Name	Display Name	Description
output_data	Data	Outputs processed data from the webhook input, and returns an empty Data object if no input is provided. If the input is not valid JSON, the component wraps it in a `payload` object.

Gmail Loader

This component loads emails from Gmail using provided credentials and filters. For more information about creating a service account JSON, see Service Account JSON.

Parameters

Inputs

Input	Type	Description
json_string	SecretStrInput	A JSON string containing OAuth 2.0 access token information for service account access.
label_ids	MessageTextInput	A comma-separated list of label IDs to filter emails.
max_results	MessageTextInput	The maximum number of emails to load.

Outputs

Output	Type	Description
data	Data	The loaded email data.

Google Drive Loader

This component loads documents from Google Drive using the provided credentials and a single document ID. For more information about creating a service account JSON, see Service Account JSON.

Parameters

Inputs

Input	Type	Description
json_string	SecretStrInput	A JSON string containing OAuth 2.0 access token information for service account access.
document_id	MessageTextInput	A single Google Drive document ID.

Outputs

Output	Type	Description
docs	Data	The loaded document data.

Google Drive Search

This component searches Google Drive files using provided credentials and query parameters. For more information about creating a service account JSON, see Service Account JSON.

Parameters

Inputs

Input	Type	Description
token_string	SecretStrInput	A JSON string containing OAuth 2.0 access token information for service account access.
query_item	DropdownInput	The field to query.
valid_operator	DropdownInput	The operator to use in the query.
search_term	MessageTextInput	The value to search for in the specified query item.
query_string	MessageTextInput	The query string used for searching.

Outputs

Output	Type	Description
doc_urls	List[str]	The URLs of the found documents.
doc_ids	List[str]	The IDs of the found documents.
doc_titles	List[str]	The titles of the found documents.
Data	Data	The document titles and URLs in a structured format.

Welcome

Get started

Templates

Concepts

Components

Integrations

Data components in LLM Controls

Data Components

Use a data component in a flow

API Request Component

Setup Instructions

File

Supported File Types

Text Files

Code Files

Archive Formats

SQL Query

URL

Webhook

Gmail Loader

Google Drive Loader

Google Drive Search

Welcome

Get started

Templates

Concepts

Components

Integrations

​Data Components

​Use a data component in a flow

​API Request Component

​Setup Instructions

​File

​Supported File Types

​Text Files

​Code Files

​Archive Formats

​SQL Query

​URL

​Webhook

​Gmail Loader

​Google Drive Loader

​Google Drive Search

Data Components

Use a data component in a flow

API Request Component

Setup Instructions

File

Supported File Types

Text Files

Code Files

Archive Formats

SQL Query

URL

Webhook

Gmail Loader

Google Drive Loader

Google Drive Search