Processing components process and transform data within a flow.

Use a processing component in a flow

The Split Text processing component in this flow splits the incoming Data into chunks to be embedded into the vector store component. The component offers control over chunk size, overlap, and separator, which affect context and granularity in vector store retrieval results.

DataFrame operations

This component performs operations on DataFrame rows and columns. To use this component in a flow, connect a component that outputs a DataFrame to the DataFrame Operations component. This example fetches JSON data from an API. The Smart function component extracts and flattens the results into a tabular DataFrame. The DataFrame Operations component can then work with the retrieved data.
  1. The API Request component retrieves data with only source and result fields. For this example, the desired data is nested within the result field.
  2. Connect a Smart function component to the API Request component, and a Language Model component to the Smart function component. This example connects a Groq model component.
  3. In the Groq model component, add your Groq API key.
  4. To filter the data, in the Smart function component, in the Instructions field, use natural language to describe how the data should be filtered.
TIP: Avoid punctuation in the Instructions field, as it can cause errors.
  5. To run the flow, in the Smart function component, click Run component.
  6. To inspect the filtered data, in the Smart function component, click Inspect output. The result is a structured DataFrame.
  7. Add the DataFrame Operations component and a Chat Output component to the flow.
  8. In the DataFrame Operations component, in the Operation field, select Filter.
  9. To apply a filter, in the Column Name field, enter a column to filter on. This example filters by name.
  10. Click Playground, and then click Run Flow. The flow extracts the values from the name column.

Operations

This component can perform the following operations on a pandas DataFrame.
Operation | Required Inputs | Info
Add Column | new_column_name, new_column_value | Adds a new column with a constant value.
Drop Column | column_name | Removes a specified column.
Filter | column_name, filter_value | Filters rows based on column value.
Head | num_rows | Returns first n rows.
Rename Column | column_name, new_column_name | Renames an existing column.
Replace Value | column_name, replace_value, replacement_value | Replaces values in a column.
Select Columns | columns_to_select | Selects specific columns.
Sort | column_name, ascending | Sorts DataFrame by column.
Tail | num_rows | Returns last n rows.
Inputs
Name | Display Name | Info
df | DataFrame | The input DataFrame to operate on.
operation | Operation | The DataFrame operation to perform. Options include Add Column, Drop Column, Filter, Head, Rename Column, Replace Value, Select Columns, Sort, and Tail.
column_name | Column Name | The column name to use for the operation.
filter_value | Filter Value | The value to filter rows by.
ascending | Sort Ascending | Whether to sort in ascending order.
new_column_name | New Column Name | The new column name when renaming or adding a column.
new_column_value | New Column Value | The value to populate the new column with.
columns_to_select | Columns to Select | A list of column names to select.
num_rows | Number of Rows | The number of rows to return for head/tail operations. The default is 5.
replace_value | Value to Replace | The value to replace in the column.
replacement_value | Replacement Value | The value to replace with.
Outputs
Name | Display Name | Info
output | DataFrame | The resulting DataFrame after the operation.
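
As a rough illustration of what a few of the operations listed above do, the following sketch applies Filter, Sort, Rename Column, and Head to a small table held as a list of dictionaries. It is plain Python rather than the component's own pandas-based code. The sample rows and helper function names are made up for illustration, and exact-match filtering is an assumption.

```python
# Illustrative only: rows are plain dicts, not a real pandas DataFrame.
rows = [
    {"name": "Ada", "team": "core", "age": 36},
    {"name": "Grace", "team": "tools", "age": 45},
    {"name": "Linus", "team": "core", "age": 28},
]

def filter_rows(rows, column_name, filter_value):
    """Filter: keep rows whose column equals filter_value (assumed exact match)."""
    return [r for r in rows if r.get(column_name) == filter_value]

def sort_rows(rows, column_name, ascending=True):
    """Sort: order rows by a single column."""
    return sorted(rows, key=lambda r: r[column_name], reverse=not ascending)

def head(rows, num_rows=5):
    """Head: return the first num_rows rows (default 5, as in the Inputs table)."""
    return rows[:num_rows]

def rename_column(rows, column_name, new_column_name):
    """Rename Column: copy each row with one key renamed."""
    return [
        {(new_column_name if k == column_name else k): v for k, v in r.items()}
        for r in rows
    ]

core = filter_rows(rows, "team", "core")
youngest_first = sort_rows(core, "age", ascending=True)
result = head(rename_column(youngest_first, "name", "full_name"), num_rows=1)
print(result)
```

Each helper returns a new list instead of modifying its input, which mirrors how the component outputs a new DataFrame after each operation.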

Data operations

This component performs operations on Data objects, including selecting keys, evaluating literals, combining data, filtering values, appending/updating data, removing keys, and renaming keys.
  1. To use this component in a flow, connect a component that outputs Data to the Data Operations component’s input. All operations in the component require at least one Data input.
  2. In the Operations field, select the operation you want to perform. For example, send a request containing user data to the Webhook component, replacing YOUR_FLOW_ID with your flow ID.
  3. In the Data Operations component, select the Select Keys operation to extract specific user information. To add additional keys, click Add More.
  4. Filter by name, username, and email to select the values from the request.

Operations

The component supports the following operations. All operations in the Data operations component require at least one Data input.
Operation | Required Inputs | Info
Select Keys | select_keys_input | Selects specific keys from the data.
Literal Eval | None | Evaluates string values as Python literals.
Combine | None | Combines multiple data objects into one.
Filter Values | filter_key, filter_values, operator | Filters data based on key-value pair.
Append or Update | append_update_data | Adds or updates key-value pairs.
Remove Keys | remove_keys_input | Removes specified keys from the data.
Rename Keys | rename_keys_input | Renames keys in the data.
Inputs
Name | Display Name | Info
data | Data | The Data object to operate on.
operations | Operations | The operation to perform on the data.
select_keys_input | Select Keys | A list of keys to select from the data.
filter_key | Filter Key | The key to filter by.
operator | Comparison Operator | The operator to apply for comparing values.
filter_values | Filter Values | A list of values to filter by.
append_update_data | Append or Update | The data to append or update the existing data with.
remove_keys_input | Remove Keys | A list of keys to remove from the data.
rename_keys_input | Rename Keys | A list of keys to rename in the data.
Outputs
Name | Display Name | Info
data_output | Data | The resulting Data object after the operation.
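
The operations above behave much like ordinary dictionary manipulation. The sketch below mimics Select Keys, Rename Keys, Append or Update, and Literal Eval on a plain Python dict. It is not the component's implementation: the payload, the helper names, and the use of a mapping for renames are assumptions made for illustration.

```python
import ast

# Hypothetical payload; the keys mirror the name, username, and email example.
user = {
    "name": "Leanne",
    "username": "Bret",
    "email": "leanne@example.com",
    "tags": "['admin', 'beta']",  # a string that holds a Python list literal
}

def select_keys(data, keys):
    """Select Keys: keep only the listed keys."""
    return {k: data[k] for k in keys if k in data}

def rename_keys(data, mapping):
    """Rename Keys: mapping is {old_key: new_key} (assumed input shape)."""
    return {mapping.get(k, k): v for k, v in data.items()}

def append_or_update(data, new_items):
    """Append or Update: add new pairs and overwrite existing ones."""
    return {**data, **new_items}

def literal_eval_values(data):
    """Literal Eval: parse string values that are valid Python literals."""
    out = {}
    for key, value in data.items():
        try:
            out[key] = ast.literal_eval(value) if isinstance(value, str) else value
        except (ValueError, SyntaxError):
            out[key] = value  # ordinary strings are left unchanged
    return out

selected = select_keys(user, ["name", "username", "email"])
renamed = rename_keys(selected, {"username": "handle"})
updated = append_or_update(renamed, {"active": True})
parsed = literal_eval_values(user)
print(updated)
print(parsed["tags"])
```

Using `ast.literal_eval` rather than `eval` means only literals such as lists, dicts, numbers, and strings are parsed, so arbitrary expressions in the data are never executed.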

Data to DataFrame

This component converts one or multiple Data objects into a DataFrame. Each Data object corresponds to one row in the resulting DataFrame. Fields from the .data attribute become columns, and the .text field (if present) is placed in a ‘text’ column.
  1. To use this component in a flow, connect a component that outputs Data to the Data to DataFrame component’s input. This example connects a Webhook component to convert text and data into a DataFrame.
  2. To view the flow’s output, connect a Chat Output component to the Data to DataFrame component.
  3. Send a POST request to the Webhook containing your JSON data. Replace YOUR_FLOW_ID with your flow ID. This example uses the default LLM Controls server address.
  4. In the Playground, view the output of your flow. The Data to DataFrame component converts the webhook request into a DataFrame, with text and data fields as columns.
  5. Send another employee data object.
  6. In the Playground, this request is also converted to a DataFrame.
Inputs
Name | Display Name | Info
data_list | Data or Data List | One or multiple Data objects to transform into a DataFrame.
Outputs
Name | Display Name | Info
dataframe | DataFrame | A DataFrame built from each Data object’s fields plus a text column.
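
The row-per-object mapping described in this section can be sketched in plain Python as follows. The `Data` class here is a hypothetical stand-in with only `.data` and `.text` attributes, and the result is a list of row dictionaries rather than a real DataFrame. Filling missing fields with `None` is an assumption.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Data:
    """Hypothetical stand-in for a Data object with .data and .text."""
    data: dict = field(default_factory=dict)
    text: Optional[str] = None

def to_table(items):
    """Build one row per Data object; .text goes into a 'text' column."""
    rows = []
    for item in items:
        row = dict(item.data)
        if item.text is not None:
            row["text"] = item.text
        rows.append(row)
    # Column order follows first appearance across all rows.
    columns = []
    for row in rows:
        for key in row:
            if key not in columns:
                columns.append(key)
    # Fields missing from a row are filled with None (an assumption).
    table = [{col: row.get(col) for col in columns} for row in rows]
    return columns, table

columns, table = to_table([
    Data(data={"name": "Ada", "role": "engineer"}, text="First employee"),
    Data(data={"name": "Grace"}),
])
print(columns)
print(table)
```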

LLM router

This component routes requests to the most appropriate LLM based on OpenRouter model specifications.
Inputs
Name | Display Name | Info
models | Language Models | A list of LLMs to route between.
input_value | Input | The input message to be routed.
judge_llm | Judge LLM | The LLM that evaluates and selects the most appropriate model.
optimization | Optimization | The optimization preference between quality, speed, cost, or balanced.
Outputs
Name | Display Name | Info
output | Output | The response from the selected model.
selected_model | Selected Model | The name of the chosen model.

Message to data

This component converts Message objects to Data objects.
Inputs
Name | Display Name | Info
message | Message | The Message object to convert to a Data object.
Outputs
Name | Display Name | Info
data | Data | The converted Data object.

Parser

This component formats DataFrame or Data objects into text using templates, with an option to convert inputs directly to strings using stringify. To use this component, create variables for values in the template the same way you would in a Prompt component. For DataFrames, use column names, for example Name: {Name}. For Data objects, use {text}. To use the Parser component with a Structured Output component, do the following:
  1. Connect a Structured Output component’s DataFrame output to the Parser component’s DataFrame input.
  2. Connect the File component to the Structured Output component’s Message input.
  3. Connect the OpenAI model component’s Language Model output to the Structured Output component’s Language Model input.
The flow looks like this: [Figure: Processing 5]
  4. In the Structured Output component, click Open Table. This opens a pane for structuring your table. The table contains the columns Name, Description, Type, and Multiple.
  5. Create a table that maps to the data you’re loading from the File loader. For example, to create a table for employees, you might have the rows id, name, and email, all of type string.
  6. In the Template field of the Parser component, enter a template for parsing the Structured Output component’s DataFrame output into structured text. Create variables for values in the template the same way you would in a Prompt component. For example, to present a table of employees in Markdown:
  7. To run the flow, in the Parser component, click Run component.
  8. To view your parsed text, in the Parser component, click Inspect output.
  9. Optionally, connect a Chat Output component, and open the Playground to see the output.
For an additional example of using the Parser component to format a DataFrame from a Structured Output component, see the Market Research template flow.
Inputs
Name | Display Name | Info
mode | Mode | The tab selection between “Parser” and “Stringify” modes. “Stringify” converts input to a string instead of using a template.
pattern | Template | The template for formatting using variables in curly brackets. For DataFrames, use column names, such as Name: {Name}. For Data objects, use {text}.
input_data | Data or DataFrame | The input to parse. Accepts either a DataFrame or Data object.
sep | Separator | The string used to separate rows or items. The default is a newline.
clean_data | Clean Data | When stringify is enabled, this option cleans data by removing empty rows and lines.
Outputs
Name | Display Name | Info
parsed_text | Parsed Text | The resulting formatted text as a Message object.
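
To make the template mechanics concrete, the sketch below fills a curly-bracket template once per row and joins the results with the separator. It uses Python's built-in `str.format` as an approximation of the Parser's behavior. The rows and the template are invented for illustration and are not the template from the example above.

```python
# Hypothetical rows, standing in for a DataFrame with id, name, and email columns.
rows = [
    {"id": "1", "name": "Ada", "email": "ada@example.com"},
    {"id": "2", "name": "Grace", "email": "grace@example.com"},
]

template = "Name: {name}, Email: {email}"  # variables match column names
sep = "\n"                                 # default separator is a newline

# Fill the template once per row, then join the rows with the separator.
parsed_text = sep.join(template.format(**row) for row in rows)
print(parsed_text)
```

Columns that the template does not mention, such as `id` here, are simply ignored by `str.format`.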

Regex extractor

This component extracts patterns from text using regular expressions. It can be used to find and extract specific patterns or information from text data. To use this component in a flow:
  1. Connect the Regex Extractor component to a URL component and a Chat Output component.
  2. In the Regex Extractor component, enter a pattern to extract text from the URL component’s raw output. This example extracts the first paragraph from the “In the News” section of https://en.wikipedia.org/wiki/Main_Page.
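
For readers new to regular expressions, here is a minimal sketch of pattern extraction using Python's `re` module. The sample text and the email pattern are illustrative assumptions and are not the pattern used in the Wikipedia example. The `match` key in the Data-like output is also hypothetical.

```python
import re

# Illustrative input text; a real flow would receive this from another component.
input_text = "Contact ada@example.com or grace@example.org for details."

# A simple, deliberately loose email pattern (assumption for this sketch).
pattern = r"[\w.+-]+@[\w-]+\.[\w.]+"

matches = re.findall(pattern, input_text)

# Two output shapes, loosely mirroring a list of Data objects and a Message.
data = [{"match": m} for m in matches]
text = "\n".join(matches)

print(data)
print(text)
```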

Save to File

This component saves DataFrames, Data, or Messages to various file formats.
  1. To use this component in a flow, connect a component that outputs DataFrames, Data, or Messages to the Save to File component’s input. The following example connects a Webhook component to two Save to File components to demonstrate the different outputs.
  2. In the Save to File component’s Input Type field, select the expected input type. This example expects Data from the Webhook.
  3. In the File Format field, select the file type for your saved file. This example uses .md in one Save to File component, and .xlsx in another.
  4. In the File Path field, enter the path for your saved file. This example uses ./output/employees.xlsx and ./output/employees.md to save the files in a directory relative to where LLM Controls is running. The component accepts both relative and absolute paths, and creates any necessary directories if they don’t exist.
TIP: If you enter a format in the file_path that is not accepted, the component appends the proper format to the file. For example, if the selected file_format is csv, and you enter file_path as ./output/test.txt, the file is saved as ./output/test.txt.csv so the file is not corrupted.
  5. Send a POST request to the Webhook containing your JSON data. Replace YOUR_FLOW_ID with your flow ID. This example uses the default LLM Controls server address.
  6. In your local filesystem, open the output directory. You should see two files created from the data you’ve sent: one in .xlsx for structured spreadsheets, and one in Markdown.
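
The extension handling described in the tip above can be approximated with `pathlib`. This is a sketch under assumptions, not the component's code: `resolve_path` is a hypothetical helper, and it assumes the selected file_format is also the file extension, as in the csv example.

```python
import tempfile
from pathlib import Path

def resolve_path(file_path: str, file_format: str) -> Path:
    """Append the selected format when the path does not already end with it."""
    path = Path(file_path)
    if path.suffix.lower() != f".{file_format}":
        path = path.with_name(f"{path.name}.{file_format}")
    return path

# Mismatched extension: the selected format is appended.
print(resolve_path("./output/test.txt", "csv").name)
# Matching extension: the path is left alone.
print(resolve_path("./output/employees.md", "md").name)

# Missing parent directories are created before writing, as the steps describe.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / resolve_path("output/test.txt", "csv")
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text("id,name\n1,Ada\n")
    saved_name = target.name
    saved_ok = target.exists()
```

Appending rather than replacing the extension keeps the name the user typed visible in the final filename, which matches the ./output/test.txt.csv result in the tip.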

File input format options

For DataFrame and Data inputs, the component can create:
  • csv
  • excel
  • json
  • markdown
  • pdf
For Message inputs, the component can create:
  • txt
  • json
  • markdown
  • pdf
Inputs
Name | Display Name | Info
input_text | Input Text | The text to analyze and extract patterns from.
pattern | Regex Pattern | The regular expression pattern to match in the text.
input_type | Input Type | The type of input to save.
df | DataFrame | The DataFrame to save.
data | Data | The Data object to save.
message | Message | The Message to save.
file_format | File Format | The file format to save the input in.
file_path | File Path | The full file path including filename and extension.
Outputs
Name | Display Name | Info
data | Data | A list of extracted matches as Data objects.
text | Message | The extracted matches formatted as a Message object.
confirmation | Confirmation | The confirmation message after saving the file.

Smart function

This component uses an LLM to generate a Lambda function for filtering or transforming structured data. To use the Smart function component, you must connect it to a Language Model component, which the component uses to generate a function based on the natural language instructions in the Instructions field. This example gets JSON data from the https://jsonplaceholder.typicode.com/users API endpoint. The Instructions field in the Smart function component specifies the task “extract emails”. The connected LLM creates a filter based on the instructions, and extracts a list of email addresses from the JSON data.

Inputs

Name | Display Name | Info
data | Data | The structured data to filter or transform using a Lambda function.
llm | Language Model | The connection port for a Model component.
filter_instruction | Instructions | The natural language instructions for how to filter or transform the data using a Lambda function, such as Filter the data to only include items where the 'status' is 'active'.
sample_size | Sample Size | For large datasets, the number of characters to sample from the dataset head and tail.
max_size | Max Size | The number of characters for the data to be considered “large”, which triggers sampling by the sample_size value.
Outputs
Name | Display Name | Info
filtered_data | Filtered Data | The filtered or transformed Data object.
dataframe | DataFrame | The filtered data as a DataFrame.
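
To give a sense of what this component does, the sketch below shows a lambda of the kind an LLM might generate for the instruction “extract emails”, along with one possible reading of how max_size and sample_size interact. The sample records, the `sample_for_prompt` helper, and the exact head-and-tail format are assumptions. The function a real model produces will vary.

```python
import json

# Hypothetical records shaped like entries from a users API.
users = [
    {"name": "Leanne", "email": "leanne@example.com"},
    {"name": "Ervin", "email": "ervin@example.com"},
]

# A lambda of the kind an LLM might return for "extract emails".
extract_emails = lambda data: [item["email"] for item in data]

def sample_for_prompt(data, max_size, sample_size):
    """Return the full JSON text, or head and tail samples if it is 'large'."""
    text = json.dumps(data)
    if len(text) <= max_size:
        return text
    # Assumed format: sample_size characters from each end, joined by "...".
    return text[:sample_size] + "..." + text[-sample_size:]

emails = extract_emails(users)
preview = sample_for_prompt(users, max_size=40, sample_size=10)
print(emails)
print(preview)
```

Sampling only the head and tail keeps the prompt short while still showing the model the shape of the data it needs to write a function for.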

Split text

This component splits text into chunks based on specified criteria. It’s ideal for chunking data to be tokenized and embedded into vector databases. The Split Text component outputs Chunks or DataFrame. The Chunks output returns a list of individual text chunks. The DataFrame output returns a structured data format, with additional text and metadata columns applied.
  1. To use this component in a flow, connect a component that outputs Data or DataFrame to the Split Text component’s Data port. This example uses the URL component, which fetches JSON placeholder data.
  2. In the Split Text component, define your data splitting parameters. This example splits incoming JSON data at the separator },, so each chunk contains one JSON object. The order of precedence is Separator, then Chunk Size, and then Chunk Overlap. If any segment after separator splitting is longer than chunk_size, it is split again to fit within chunk_size. After chunk_size, Chunk Overlap is applied between chunks to maintain context.
  3. Connect a Chat Output component to the Split Text component’s DataFrame output to view its output.
  4. Click Playground, and then click Run Flow. The output contains a table of JSON objects split at },.
  5. Clear the Separator field, and then run the flow again. Instead of JSON objects, the output contains 50-character lines of text with 10 characters of overlap.
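
The order of precedence described above (Separator, then Chunk Size, then Chunk Overlap) can be sketched as a small function. This is a simplified reading of the description, not the component's actual splitter: `split_text` is a hypothetical helper, the separator is dropped from the output, and overlap is applied only when an oversized segment is split again.

```python
def split_text(text, separator="\n", chunk_size=1000, chunk_overlap=200):
    """Split on separator first, then re-split oversized segments with overlap."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    # 1. Separator: an empty separator means the whole text is one segment.
    segments = text.split(separator) if separator else [text]
    step = chunk_size - chunk_overlap
    chunks = []
    for segment in segments:
        if not segment:
            continue
        # 2. Chunk size: segments that already fit are kept whole.
        if len(segment) <= chunk_size:
            chunks.append(segment)
            continue
        # 3. Chunk overlap: consecutive windows share chunk_overlap characters.
        for start in range(0, len(segment), step):
            chunks.append(segment[start:start + chunk_size])
            if start + chunk_size >= len(segment):
                break
    return chunks

# With a separator, each JSON object lands in its own chunk.
objects = split_text('{"id": 1},{"id": 2}', separator="},", chunk_size=50, chunk_overlap=10)
print(objects)

# Without a separator, the text falls back to fixed-size overlapping windows.
digits = "".join(str(i % 10) for i in range(120))
windows = split_text(digits, separator="", chunk_size=50, chunk_overlap=10)
print([len(w) for w in windows])
```

In the second call, the last 10 characters of each window repeat as the first 10 characters of the next, which is the context-preserving overlap described in step 2.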

Inputs

Name | Display Name | Info
data_inputs | Input Documents | The data to split. The component accepts Data or DataFrame objects.
chunk_overlap | Chunk Overlap | The number of characters to overlap between chunks. Default: 200.
chunk_size | Chunk Size | The maximum number of characters in each chunk. Default: 1000.
separator | Separator | The character to split on. Default: newline.
text_key | Text Key | The key to use for the text column. Default: text.
Outputs
Name | Display Name | Info
chunks | Chunks | A list of split text chunks as Data objects.
dataframe | DataFrame | A list of split text chunks as DataFrame objects.

Update data

This component dynamically updates or appends data with specified fields.
Inputs
Name | Display Name | Info
old_data | Data | The records to update.
number_of_fields | Number of Fields | The number of fields to add. The maximum is 15.
text_key | Text Key | The key for text content.
text_key_validator | Text Key Validator | Validates the text key presence.
Outputs
Name | Display Name | Info
data | Data | The updated Data objects.
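
One way to picture this component is as a dictionary merge applied to every record, with a cap on the number of new fields and an optional check that the text key exists. The sketch below is a guess at that behavior based only on the table above. `update_records` and its error handling are hypothetical.

```python
MAX_FIELDS = 15  # the maximum stated in the Inputs table

def update_records(records, new_fields, text_key=None, validate_text_key=False):
    """Merge new_fields into every record; optionally check that text_key exists."""
    if len(new_fields) > MAX_FIELDS:
        raise ValueError(f"At most {MAX_FIELDS} fields can be added")
    # Existing keys are overwritten and new keys are appended.
    updated = [{**record, **new_fields} for record in records]
    if validate_text_key and text_key is not None:
        for record in updated:
            if text_key not in record:
                raise KeyError(f"text key {text_key!r} is missing from a record")
    return updated

records = [{"text": "hello", "lang": "en"}, {"text": "hola"}]
updated = update_records(
    records, {"lang": "es", "reviewed": True},
    text_key="text", validate_text_key=True,
)
print(updated)
```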