Processing

Processing components process and transform data within a flow.

Use a processing component in a flow

The Split Text processing component in this flow splits the incoming Data into chunks to be embedded into the vector store component. The component offers control over chunk size, overlap, and separator, which affect context and granularity in vector store retrieval results.

DataFrame operations

This component performs operations on DataFrame rows and columns.

To use this component in a flow, connect a component that outputs DataFrame to the DataFrame Operations component. This example fetches JSON data from an API. The Smart function component extracts and flattens the results into a tabular DataFrame. The DataFrame Operations component can then work with the retrieved data.

The API Request component retrieves data with only source and result fields. For this example, the desired data is nested within the result field.
Connect a Smart function component to the API Request component, and a Language Model component to the Smart function component. This example connects a Groq model component.
In the Groq model component, add your Groq API key.
To filter the data, in the Smart function component, in the Instructions field, use natural language to describe how the data should be filtered.

TIP: Avoid punctuation in the Instructions field, as it can cause errors.

To run the flow, in the Smart function component, click Run component.
To inspect the filtered data, in the Smart function component, click Inspect output. The result is a structured DataFrame.
Add the DataFrame Operations component, and a Chat Output component to the flow.
In the DataFrame Operations component, in the Operation field, select Filter.
To apply a filter, in the Column Name field, enter a column to filter on. This example filters by name.
Click Playground, and then click Run Flow. The flow extracts the values from the name column.

Operations

This component can perform the following operations on a pandas DataFrame.

Operation	Required Inputs	Info
Add Column	new_column_name, new_column_value	Adds a new column with a constant value.
Drop Column	column_name	Removes a specified column.
Filter	column_name, filter_value	Filters rows based on column value.
Head	num_rows	Returns first `n` rows.
Rename Column	column_name, new_column_name	Renames an existing column.
Replace Value	column_name, replace_value, replacement_value	Replaces values in a column.
Select Columns	columns_to_select	Selects specific columns.
Sort	column_name, ascending	Sorts DataFrame by column.
Tail	num_rows	Returns last `n` rows.
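
To make the table concrete, the following is a minimal plain-Python sketch of what each operation does. It uses a list of row dictionaries as a stand-in for a DataFrame. The function names are hypothetical, the equality-based filter is an assumption, and this is not the component's actual implementation.

```python
# A list of row dicts stands in for a DataFrame (illustrative sample data).
rows = [
    {"name": "Ada", "team": "core", "age": 36},
    {"name": "Grace", "team": "infra", "age": 45},
    {"name": "Linus", "team": "core", "age": 28},
]

def add_column(rows, new_column_name, new_column_value):
    """Add Column: every row receives the same constant value."""
    return [{**row, new_column_name: new_column_value} for row in rows]

def drop_column(rows, column_name):
    """Drop Column: remove one column from every row."""
    return [{k: v for k, v in row.items() if k != column_name} for row in rows]

def filter_rows(rows, column_name, filter_value):
    """Filter: keep rows whose column equals filter_value (equality is assumed)."""
    return [row for row in rows if row.get(column_name) == filter_value]

def head(rows, num_rows=5):
    """Head: first num_rows rows."""
    return rows[:num_rows]

def tail(rows, num_rows=5):
    """Tail: last num_rows rows."""
    return rows[-num_rows:] if num_rows > 0 else []

def rename_column(rows, column_name, new_column_name):
    """Rename Column: change one column name, keep values and order."""
    return [
        {(new_column_name if k == column_name else k): v for k, v in row.items()}
        for row in rows
    ]

def replace_in_column(rows, column_name, replace_value, replacement_value):
    """Replace Value: swap matching values in one column."""
    return [
        {**row, column_name: replacement_value}
        if row.get(column_name) == replace_value else dict(row)
        for row in rows
    ]

def select_columns(rows, columns_to_select):
    """Select Columns: keep only the listed columns."""
    return [{k: row[k] for k in columns_to_select} for row in rows]

def sort_rows(rows, column_name, ascending=True):
    """Sort: order rows by one column."""
    return sorted(rows, key=lambda row: row[column_name], reverse=not ascending)

core_team = filter_rows(rows, "team", "core")
oldest_first = sort_rows(rows, "age", ascending=False)
```

Each function returns a new list rather than modifying its input, so operations can be chained without side effects.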

Parameters

Inputs

Name	Display Name	Info
df	DataFrame	The input DataFrame to operate on.
operation	Operation	The DataFrame operation to perform. Options include Add Column, Drop Column, Filter, Head, Rename Column, Replace Value, Select Columns, Sort, and Tail.
column_name	Column Name	The column name to use for the operation.
filter_value	Filter Value	The value to filter rows by.
ascending	Sort Ascending	Whether to sort in ascending order.
new_column_name	New Column Name	The new column name when renaming or adding a column.
new_column_value	New Column Value	The value to populate the new column with.
columns_to_select	Columns to Select	A list of column names to select.
num_rows	Number of Rows	The number of rows to return for head/tail operations. The default is 5.
replace_value	Value to Replace	The value to replace in the column.
replacement_value	Replacement Value	The value to replace with.

Outputs

Name	Display Name	Info
output	DataFrame	The resulting DataFrame after the operation.

Data operations

This component performs operations on Data objects, including selecting keys, evaluating literals, combining data, filtering values, appending/updating data, removing keys, and renaming keys.

To use this component in a flow, connect a component that outputs Data to the Data Operations component’s input. All operations in the component require at least one Data input.
In the Operations field, select the operation you want to perform. For example, send this request to the Webhook component. Replace YOUR_FLOW_ID with your flow ID.
In the Data Operations component, select the Select Keys operation to extract specific user information. To add additional keys, click Add More.
Filter by name, username, and email to select the values from the request.

Operations

The component supports the following operations. Every operation requires at least one Data input.

Operation	Required Inputs	Info
Select Keys	`select_keys_input`	Selects specific keys from the data.
Literal Eval	None	Evaluates string values as Python literals.
Combine	None	Combines multiple data objects into one.
Filter Values	`filter_key`, `filter_values`, `operator`	Filters data based on key-value pair.
Append or Update	`append_update_data`	Adds or updates key-value pairs.
Remove Keys	`remove_keys_input`	Removes specified keys from the data.
Rename Keys	`rename_keys_input`	Renames keys in the data.
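
As a rough illustration of several of these operations, the sketch below applies them to a plain dictionary standing in for a Data object. The function names, the sample record, and the use of an old-to-new mapping for renaming are assumptions made for the example, not the component's API.

```python
import ast

# Illustrative record; "tags" holds a Python literal stored as a string.
record = {
    "name": "Ada",
    "username": "ada",
    "email": "ada@example.com",
    "tags": "['admin', 'dev']",
}

def select_keys(data, keys):
    """Select Keys: keep only the requested keys that exist."""
    return {k: data[k] for k in keys if k in data}

def literal_eval_values(data):
    """Literal Eval: parse string values that are valid Python literals."""
    result = {}
    for key, value in data.items():
        if isinstance(value, str):
            try:
                result[key] = ast.literal_eval(value)
            except (ValueError, SyntaxError):
                result[key] = value  # not a literal, keep the original string
        else:
            result[key] = value
    return result

def append_or_update(data, new_items):
    """Append or Update: add new keys and overwrite existing ones."""
    return {**data, **new_items}

def remove_keys(data, keys):
    """Remove Keys: drop the listed keys."""
    return {k: v for k, v in data.items() if k not in keys}

def rename_keys(data, mapping):
    """Rename Keys: mapping is {old_key: new_key} (an assumed input shape)."""
    return {mapping.get(k, k): v for k, v in data.items()}

selected = select_keys(record, ["name", "username", "email"])
parsed = literal_eval_values(record)
```

`ast.literal_eval` only evaluates literals such as strings, numbers, lists, and dictionaries, so it is a safer choice than `eval` for untrusted input.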

Parameters

Inputs

Name	Display Name	Info
data	Data	The Data object to operate on.
operations	Operations	The operation to perform on the data.
select_keys_input	Select Keys	A list of keys to select from the data.
filter_key	Filter Key	The key to filter by.
operator	Comparison Operator	The operator to apply for comparing values.
filter_values	Filter Values	A list of values to filter by.
append_update_data	Append or Update	The data to append or update the existing data with.
remove_keys_input	Remove Keys	A list of keys to remove from the data.
rename_keys_input	Rename Keys	A list of keys to rename in the data.

Outputs

Name	Display Name	Info
data_output	Data	The resulting Data object after the operation.

Data to DataFrame

This component converts one or multiple Data objects into a DataFrame. Each Data object corresponds to one row in the resulting DataFrame. Fields from the .data attribute become columns, and the .text field (if present) is placed in a ‘text’ column.
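
The mapping described above can be sketched in plain Python. The `Data` class here is a hypothetical stand-in with only a `data` dictionary and an optional `text` field, and the table is represented as a column list plus row lists rather than a real DataFrame.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Data:
    """Hypothetical stand-in for a Data object."""
    data: dict = field(default_factory=dict)
    text: Optional[str] = None

def to_table(items):
    """Build (columns, rows): one row per Data object, plus a 'text' column."""
    row_dicts = []
    for item in items:
        row = dict(item.data)          # fields from .data become columns
        if item.text is not None:
            row["text"] = item.text    # .text, if present, goes in 'text'
        row_dicts.append(row)

    # Collect column names in first-seen order.
    columns = []
    for row in row_dicts:
        for key in row:
            if key not in columns:
                columns.append(key)

    # Missing values are filled with None.
    rows = [[row.get(col) for col in columns] for row in row_dicts]
    return columns, rows

items = [
    Data(data={"name": "Ada", "role": "dev"}, text="hello"),
    Data(data={"name": "Grace"}),
]
columns, rows = to_table(items)
```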

To use this component in a flow, connect a component that outputs Data to the Data to DataFrame component’s input. This example connects a Webhook component to convert text and data into a DataFrame.
To view the flow’s output, connect a Chat Output component to the Data to DataFrame component.
Send a POST request to the Webhook containing your JSON data. Replace YOUR_FLOW_ID with your flow ID. This example uses the default LLM Controls server address.
In the Playground, view the output of your flow. The Data to DataFrame component converts the webhook request into a DataFrame, with text and data fields as columns.
Send another employee data object.
In the Playground, this request is also converted to a DataFrame.

Parameters

Inputs

Name	Display Name	Info
data_list	Data or Data List	One or multiple Data objects to transform into a DataFrame.

Outputs

Name	Display Name	Info
dataframe	DataFrame	A DataFrame built from each Data object’s fields plus a text column.

LLM router

This component routes requests to the most appropriate LLM based on OpenRouter model specifications.

Parameters

Inputs

Name	Display Name	Info
models	Language Models	A list of LLMs to route between.
input_value	Input	The input message to be routed.
judge_llm	Judge LLM	The LLM that evaluates and selects the most appropriate model.
optimization	Optimization	The optimization preference between quality, speed, cost, or balanced.

Outputs

Name	Display Name	Info
output	Output	The response from the selected model.
selected_model	Selected Model	The name of the chosen model.

Message to data

This component converts Message objects to Data objects.

Parameters

Inputs

Name	Display Name	Info
message	Message	The Message object to convert to a Data object.

Outputs

Name	Display Name	Info
data	Data	The converted Data object.

Parser

This component formats DataFrame or Data objects into text using templates, with an option to convert inputs directly to strings using stringify.

To use this component, create variables for values in the template the same way you would in a Prompt component. For DataFrames, use column names, for example Name: {Name}. For Data objects, use {text}.

To use the Parser component with a Structured Output component, do the following:

Connect a Structured Output component’s DataFrame output to the Parser component’s DataFrame input.
Connect the File component to the Structured Output component’s Message input.
Connect the OpenAI model component’s Language Model output to the Structured Output component’s Language Model input.

In the Structured Output component, click Open Table. This opens a pane for structuring your table. The table contains the rows Name, Description, Type, and Multiple.
Create a table that maps to the data you’re loading from the File loader. For example, to create a table for employees, you might have the rows id, name, and email, all of type string.
In the Template field of the Parser component, enter a template for parsing the Structured Output component’s DataFrame output into structured text. Create variables for values in the template the same way you would in a Prompt component. For example, to present a table of employees in Markdown:
To run the flow, in the Parser component, click Run component.
To view your parsed text, in the Parser component, click Inspect output.
Optionally, connect a Chat Output component, and open the Playground to see the output.

For an additional example of using the Parser component to format a DataFrame from a Structured Output component, see the Market Research template flow.
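
To illustrate how a template with curly-bracket variables turns rows into text, here is a minimal sketch using Python's `str.format`. The function name, the sample rows, and the Markdown row template are assumptions for illustration; the component's own formatting logic may differ.

```python
def parse_rows(rows, pattern, sep="\n"):
    """Fill the template once per row, then join the results with sep."""
    return sep.join(pattern.format(**row) for row in rows)

# Illustrative employee rows with the id, name, and email columns.
employees = [
    {"id": "1", "name": "Ada", "email": "ada@example.com"},
    {"id": "2", "name": "Grace", "email": "grace@example.com"},
]

# Hypothetical template that renders each row as a Markdown table row.
template = "| {id} | {name} | {email} |"

parsed_text = parse_rows(employees, template)
print(parsed_text)
```

A template variable that does not match any column raises a `KeyError` in this sketch, which is a useful reminder to keep variable names and column names identical.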

Parameters

Inputs

Name	Display Name	Info
mode	Mode	The tab selection between “Parser” and “Stringify” modes. “Stringify” converts input to a string instead of using a template.
pattern	Template	The template for formatting using variables in curly brackets. For DataFrames, use column names, such as `Name: {Name}`. For Data objects, use `{text}`.
input_data	Data or DataFrame	The input to parse. Accepts either a DataFrame or Data object.
sep	Separator	The string used to separate rows or items. The default is a newline.
clean_data	Clean Data	When stringify is enabled, this option cleans data by removing empty rows and lines.

Outputs

Name	Display Name	Info
parsed_text	Parsed Text	The resulting formatted text as a Message object.

Regex extractor

This component extracts patterns from text using regular expressions. It can be used to find and extract specific patterns or information from text data. To use this component in a flow:

Connect the Regex Extractor to a URL component and a Chat Output component.

In the Regex Extractor tool, enter a pattern to extract text from the URL component’s raw output. This example extracts the first paragraph from the “In the News” section of https://en.wikipedia.org/wiki/Main_Page:
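
As a general illustration of pattern extraction (separate from the Wikipedia example, and not the pattern used there), the following sketch collects every match of a regular expression with Python's `re` module. The function name, sample text, and email pattern are assumptions chosen for the example.

```python
import re

def extract_matches(input_text, pattern):
    """Return every non-overlapping match of pattern in input_text."""
    return [match.group(0) for match in re.finditer(pattern, input_text)]

text = "Contact ada@example.com or grace@example.org for details."

# A deliberately simple email pattern; real addresses can be more varied.
email_pattern = r"[\w.+-]+@[\w-]+\.[\w.]+"

matches = extract_matches(text, email_pattern)
```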

Save to File

This component saves DataFrames, Data, or Messages to various file formats.

To use this component in a flow, connect a component that outputs DataFrames, Data, or Messages to the Save to File component’s input. The following example connects a Webhook component to two Save to File components to demonstrate the different outputs.

In the Save to File component’s Input Type field, select the expected input type. This example expects Data from the Webhook.
In the File Format field, select the file type for your saved file. This example uses .md in one Save to File component, and .xlsx in another.
In the File Path field, enter the path for your saved file. This example uses ./output/employees.xlsx and ./output/employees.md to save the files in a directory relative to where LLM Controls is running. The component accepts both relative and absolute paths, and creates any necessary directories if they don’t exist.

TIP: If the extension you enter in file_path does not match the selected file_format, the component appends the correct extension to the file name. For example, if the selected file_format is csv, and you enter file_path as ./output/test.txt, the file is saved as ./output/test.txt.csv so the file is not corrupted.
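
The path handling described above can be sketched with `pathlib`. The extension mapping and function names are hypothetical and only approximate the behavior this section describes, namely appending a missing extension and creating missing directories.

```python
from pathlib import Path

# Assumed mapping from file_format values to file extensions.
EXTENSIONS = {
    "csv": ".csv",
    "excel": ".xlsx",
    "json": ".json",
    "markdown": ".md",
    "pdf": ".pdf",
    "txt": ".txt",
}

def resolve_output_path(file_path, file_format):
    """Append the expected extension when file_path does not already end with it."""
    path = Path(file_path)
    expected = EXTENSIONS[file_format]
    if path.suffix.lower() != expected:
        path = path.with_name(path.name + expected)
    return path

def save_text(file_path, file_format, text):
    """Create missing parent directories, then write the text to the file."""
    path = resolve_output_path(file_path, file_format)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(text, encoding="utf-8")
    return path

resolved = resolve_output_path("./output/test.txt", "csv")
unchanged = resolve_output_path("./output/employees.md", "markdown")
```

Appending rather than replacing the extension keeps the name the user typed visible in the final file name.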

Send a POST request to the Webhook containing your JSON data. Replace YOUR_FLOW_ID with your flow ID. This example uses the default LLM Controls server address.
In your local filesystem, open the output directory. You should see two files created from the data you’ve sent: one in .xlsx for structured spreadsheets, and one in Markdown.

File input format options

For DataFrame and Data inputs, the component can create:

csv
excel
json
markdown
pdf

For Message inputs, the component can create:

txt
json
markdown
pdf

Parameters

Inputs (Regex extractor)

Name	Display Name	Info
input_text	Input Text	The text to analyze and extract patterns from.
pattern	Regex Pattern	The regular expression pattern to match in the text.

Inputs (Save to File)

Name	Display Name	Info
input_type	Input Type	The type of input to save.
df	DataFrame	The DataFrame to save.
data	Data	The Data object to save.
message	Message	The Message to save.
file_format	File Format	The file format to save the input in.
file_path	File Path	The full file path including filename and extension.

Outputs (Regex extractor)

Name	Display Name	Info
data	Data	A list of extracted matches as Data objects.
text	Message	The extracted matches formatted as a Message object.

Outputs (Save to File)

Name	Display Name	Info
confirmation	Confirmation	The confirmation message after saving the file.

Smart function

This component uses an LLM to generate a Lambda function for filtering or transforming structured data. To use the Smart function component, you must connect it to a Language Model component, which the component uses to generate a function based on the natural language instructions in the Instructions field.

This example gets JSON data from the https://jsonplaceholder.typicode.com/users API endpoint. The Instructions field in the Smart function component specifies the task `extract emails`. The connected LLM creates a filter based on the instructions and extracts a list of email addresses from the JSON data.
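
For a sense of what such a generated function might look like, the sketch below shows a hand-written lambda for the `extract emails` instruction. The sample records are made up, and the actual function produced by a connected LLM will vary with the model and the data.

```python
# Made-up records shaped like a list of user objects.
users = [
    {"name": "Ada", "email": "ada@example.com"},
    {"name": "Grace", "email": "grace@example.org"},
    {"name": "NoMail"},
]

# One plausible lambda for "extract emails": keep the email of each user that has one.
extract_emails = lambda data: [user["email"] for user in data if "email" in user]

emails = extract_emails(users)
```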

Parameters

Inputs

Name	Display Name	Info
data	Data	The structured data to filter or transform using a Lambda function.
llm	Language Model	The connection port for a Model component.
filter_instruction	Instructions	The natural language instructions for how to filter or transform the data using a Lambda function, such as `Filter the data to only include items where the 'status' is 'active'`.
sample_size	Sample Size	For large datasets, the number of characters to sample from the dataset head and tail.
max_size	Max Size	The number of characters for the data to be considered “large”, which triggers sampling by the `sample_size` value.
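
The relationship between `max_size` and `sample_size` can be sketched as follows. This is an assumption-laden illustration: the function name, the JSON serialization, and the marker placed between the head and tail samples are all hypothetical.

```python
import json

def sample_for_prompt(data, max_size, sample_size):
    """Return the serialized data, or head and tail samples if it is too large."""
    text = json.dumps(data)
    if len(text) <= max_size:
        return text  # small enough to use in full
    # Large data: keep sample_size characters from each end.
    return text[:sample_size] + "\n...\n" + text[-sample_size:]

small = sample_for_prompt({"status": "active"}, max_size=100, sample_size=10)
large = sample_for_prompt(list(range(1000)), max_size=100, sample_size=10)
```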

Outputs

Name	Display Name	Info
filtered_data	Filtered Data	The filtered or transformed Data object.
dataframe	DataFrame	The filtered data as a DataFrame.

Split text

This component splits text into chunks based on specified criteria. It’s ideal for chunking data to be tokenized and embedded into vector databases. The Split Text component outputs Chunks or DataFrame. The Chunks output returns a list of individual text chunks. The DataFrame output returns a structured data format, with additional text and metadata columns applied.

To use this component in a flow, connect a component that outputs Data or DataFrame to the Split Text component’s Data port. This example uses the URL component, which fetches JSON placeholder data.
In the Split Text component, define your data splitting parameters.

This example splits incoming JSON data at the separator },, so each chunk contains one JSON object. The order of precedence is Separator, then Chunk Size, and then Chunk Overlap. If any segment after separator splitting is longer than chunk_size, it is split again to fit within chunk_size. After chunk_size, Chunk Overlap is applied between chunks to maintain context.

Connect a Chat Output component to the Split Text component’s DataFrame output to view its output.
Click Playground, and then click Run Flow. The output contains a table of JSON objects split at },.
Clear the Separator field, and then run the flow again. Instead of JSON objects, the output contains 50-character lines of text with 10 characters of overlap.
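
The order of precedence in this walkthrough (Separator, then Chunk Size, then Chunk Overlap) can be sketched as a small function. This is a simplified illustration under stated assumptions, not the component's implementation: the function name is hypothetical, and it does not merge short segments back together as some text splitters do.

```python
def split_text(text, separator="\n", chunk_size=1000, chunk_overlap=200):
    """Split on separator first, then window any segment longer than chunk_size."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")

    # Step 1: separator. An empty separator leaves the text as one segment.
    segments = text.split(separator) if separator else [text]

    chunks = []
    step = chunk_size - chunk_overlap
    for segment in segments:
        segment = segment.strip()
        if not segment:
            continue
        if len(segment) <= chunk_size:
            chunks.append(segment)
            continue
        # Steps 2 and 3: fixed-size windows that share chunk_overlap characters.
        for start in range(0, len(segment), step):
            chunks.append(segment[start:start + chunk_size])
            if start + chunk_size >= len(segment):
                break
    return chunks

# Splitting JSON-like text at "}," yields one piece per object.
objects = split_text('{"id": 1},{"id": 2},{"id": 3}', separator="},")

# With no separator, the text falls back to 50-character chunks with 10 overlap.
windows = split_text("abcdefghij" * 12, separator="", chunk_size=50, chunk_overlap=10)
```

Note that `str.split` removes the separator itself, so the pieces in `objects` no longer include the trailing `},`.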

Parameters

Inputs

Name	Display Name	Info
data_inputs	Input Documents	The data to split. The component accepts Data or DataFrame objects.
chunk_overlap	Chunk Overlap	The number of characters to overlap between chunks. Default: `200`.
chunk_size	Chunk Size	The maximum number of characters in each chunk. Default: `1000`.
separator	Separator	The character to split on. Default: `newline`.
text_key	Text Key	The key to use for the text column. Default: `text`.

Outputs

Name	Display Name	Info
chunks	Chunks	A list of split text chunks as Data objects.
dataframe	DataFrame	The split text chunks as a DataFrame.

Update data

This component dynamically updates or appends data with specified fields.

Parameters

Inputs

Name	Display Name	Info
old_data	Data	The records to update.
number_of_fields	Number of Fields	The number of fields to add. The maximum is 15.
text_key	Text Key	The key for text content.
text_key_validator	Text Key Validator	Validates the text key presence.

Outputs

Name	Display Name	Info
data	Data	The updated Data objects.
