feat: Add new Data utility components for CSV/JSON parsing, routing, and filtering #3776

Merged
merged 15 commits into from
Oct 3, 2024

vasconceloscezar
Collaborator

@vasconceloscezar vasconceloscezar commented Sep 11, 2024

New helper Data components were added:

  • CSV to Data
  • JSON to Data
  • Data Conditional Router
  • Extract Key
  • Filter Data Values
  • Current Date
  • Message to Data

CSV to Data

The CSV to Data component accepts a File, a File Path, or a CSV string and converts it into a list of Data objects.

Inputs (one of these)

  • CSV File: A valid CSV file.
  • CSV File Path: A valid path to the CSV file.
  • CSV String: A valid CSV string to be converted.

Outputs

  • Data List: A list of Data objects, where each object represents a row from the CSV.

Usage

This component is useful when you need to import CSV data into your Langflow project. It can handle parsing errors and provides appropriate error messages if the CSV is invalid.
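
As a rough illustration of the CSV String path only, the conversion could be sketched as below. This is not the component's code: the `Data` class is a hypothetical stand-in for Langflow's Data object, and `csv_string_to_data` is a name made up for this example.

```python
import csv
import io
from dataclasses import dataclass, field


@dataclass
class Data:
    """Hypothetical stand-in for Langflow's Data object (illustration only)."""
    data: dict = field(default_factory=dict)


def csv_string_to_data(csv_string: str) -> list:
    """Parse a CSV string into one Data object per row, keyed by the header row."""
    try:
        reader = csv.DictReader(io.StringIO(csv_string))
        return [Data(data=dict(row)) for row in reader]
    except csv.Error as exc:
        # Surface parsing problems as a readable error message.
        raise ValueError(f"Invalid CSV: {exc}") from exc


rows = csv_string_to_data("name,route\nalice,CMIP\nbob,OTHER\n")
```

Each row becomes its own Data object, so downstream components can treat the result like any other list of Data.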

JSON to Data

The JSON to Data component accepts a File, a File Path, or a JSON string and converts it into a single Data object or a list of Data objects.

Inputs (one of these)

  • JSON File: A valid JSON file.
  • JSON File Path: A valid path to the JSON file.
  • JSON String: A valid JSON string (object or array) to be converted.

Outputs

  • Data: Either a single Data object or a list of Data objects, depending on the input JSON structure.

Usage

Use this component when working with JSON data in your flows. It can handle both single JSON objects and arrays, making it versatile for various data structures.
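
A minimal sketch of the object-versus-array behavior, for the JSON String input only. The `Data` class is a hypothetical stand-in for Langflow's Data object, `json_string_to_data` is an illustrative name, and the sketch assumes that array elements are JSON objects.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Data:
    """Hypothetical stand-in for Langflow's Data object (illustration only)."""
    data: dict = field(default_factory=dict)


def json_string_to_data(json_string: str):
    """Return a single Data for a JSON object, or a list of Data for a JSON array."""
    try:
        parsed = json.loads(json_string)
    except json.JSONDecodeError as exc:
        raise ValueError(f"Invalid JSON: {exc}") from exc
    if isinstance(parsed, list):
        # Assumption: every array element is a JSON object.
        return [Data(data=item) for item in parsed]
    return Data(data=parsed)


single = json_string_to_data('{"route": "CMIP"}')
many = json_string_to_data('[{"route": "CMIP"}, {"route": "OTHER"}]')
```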

Data Conditional Router

The Data Conditional Router component routes a Data object based on a condition applied to a specified key.

Inputs

  • Data Input: The Data object to process.
  • Key Name: The name of the key in the Data object to check.
  • Comparison Operator: The operator to apply for comparing values (e.g., equals, contains, starts with).
  • Compare Value: The value to compare against (not used by the boolean validator).

Outputs

  • True Output: The Data object if the condition is met.
  • False Output: The Data object if the condition is not met.

Usage

This component is essential for creating conditional logic in your flows based on the content of Data objects.
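
The routing idea can be sketched roughly as follows. The `Data` class is a hypothetical stand-in, `route` and `OPERATORS` are names made up for this example, only the three operators listed above are covered, and comparing values as strings is an assumption of the sketch.

```python
from dataclasses import dataclass, field
from typing import Optional, Tuple


@dataclass
class Data:
    """Hypothetical stand-in for Langflow's Data object (illustration only)."""
    data: dict = field(default_factory=dict)


# Illustrative subset of operators; values are compared as strings.
OPERATORS = {
    "equals": lambda actual, expected: actual == expected,
    "contains": lambda actual, expected: expected in actual,
    "starts with": lambda actual, expected: actual.startswith(expected),
}


def route(data: Data, key_name: str, operator: str,
          compare_value: str) -> Tuple[Optional[Data], Optional[Data]]:
    """Return (true_output, false_output); exactly one of them carries the Data."""
    actual = str(data.data.get(key_name, ""))
    matched = OPERATORS[operator](actual, str(compare_value))
    return (data, None) if matched else (None, data)


true_out, false_out = route(Data({"route": "CMIP-6"}), "route", "starts with", "CMIP")
```

Here the Data object goes to the true output because its 'route' value starts with 'CMIP'; the false output stays empty.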

Extract List Key

The Extract List Key component extracts a specific key containing a list from a Data object and returns a list of Data objects.

Inputs

  • Data: The Data object to extract the list key from.
  • Key to Extract: The key in the Data object that contains the list to extract.

Outputs

  • Extracted Data: A list of Data objects extracted from the specified key.

Usage

Use this component when you need to work with nested list data within your Data objects.
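
A rough sketch of the extraction step, assuming that the list stored under the key contains dictionaries. The `Data` class is a hypothetical stand-in and `extract_list_key` is an illustrative name, not the component's API.

```python
from dataclasses import dataclass, field


@dataclass
class Data:
    """Hypothetical stand-in for Langflow's Data object (illustration only)."""
    data: dict = field(default_factory=dict)


def extract_list_key(data: Data, key: str) -> list:
    """Turn the list stored under `key` into a list of Data objects."""
    value = data.data.get(key)
    if not isinstance(value, list):
        raise ValueError(f"Key '{key}' is missing or does not contain a list")
    # Assumption: each list element is a dictionary.
    return [Data(data=item) for item in value]


nested = Data({"items": [{"route": "CMIP"}, {"route": "OTHER"}]})
extracted = extract_list_key(nested, "items")
```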

Filter Data Values

The Filter Data Values component filters a list of data items based on a specified key, filter value, and comparison operator.

Inputs

  • Input Data: The list of data items to filter.
  • Filter Key: The key to filter on (e.g., 'route').
  • Filter Value: The value to filter by (e.g., 'CMIP').
  • Comparison Operator: The operator to apply for comparing the values.

Outputs

  • Filtered Data: A list of Data objects that meet the filter criteria.

Usage

This component is useful for narrowing down large datasets based on specific criteria.
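
Using the 'route' and 'CMIP' example values from the inputs above, the filtering could look roughly like this. The `Data` class is a hypothetical stand-in, `filter_data` and `OPERATORS` are illustrative names, and both the operator set and the string comparison are assumptions of the sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Data:
    """Hypothetical stand-in for Langflow's Data object (illustration only)."""
    data: dict = field(default_factory=dict)


# Illustrative subset of operators; values are compared as strings.
OPERATORS = {
    "equals": lambda actual, expected: actual == expected,
    "contains": lambda actual, expected: expected in actual,
    "starts with": lambda actual, expected: actual.startswith(expected),
}


def filter_data(items: list, filter_key: str, filter_value: str,
                operator: str = "equals") -> list:
    """Keep only the items whose value under `filter_key` satisfies the operator."""
    compare = OPERATORS[operator]
    return [
        item for item in items
        if filter_key in item.data
        and compare(str(item.data[filter_key]), filter_value)
    ]


items = [Data({"route": "CMIP"}), Data({"route": "OTHER"}), Data({"name": "no route"})]
filtered = filter_data(items, "route", "CMIP")
```

Items that lack the filter key are dropped rather than raising an error; that choice is part of the sketch, not something the description specifies.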

Current Date

The Current Date component is not provided in the given code snippets. It likely returns the current date and time, which can be useful in various data processing and logging scenarios.

@dosubot dosubot bot added size:L This PR changes 100-499 lines, ignoring generated files. enhancement New feature or request python Pull requests that update Python code labels Sep 11, 2024
@vasconceloscezar vasconceloscezar changed the title Feat:data-components-utilities Feat: data-components-utilities Sep 11, 2024
@ogabrielluiz
Contributor

Hey @vasconceloscezar

These look awesome. Really useful stuff.

Could you review the PR title to be something a bit more descriptive?

Thanks!

@vasconceloscezar vasconceloscezar changed the title Feat: data-components-utilities feat: Add new Data utility components for CSV/JSON parsing, routing, and filtering Sep 16, 2024
@github-actions github-actions bot added enhancement New feature or request and removed enhancement New feature or request labels Sep 16, 2024
@vasconceloscezar
Collaborator Author

Thanks, I have some ideas for dedicated components for specific file types.

Not sure if we have something like this in the roadmap.

For example, instead of this:
image

We would have this:
image

Contributor

@ogabrielluiz ogabrielluiz left a comment

This is looking good.

What do you think about another component that turns a Message into a Data? It should be as simple as Data(data=message.data).

src/backend/base/langflow/components/helpers/JSONtoData.py (outdated, resolved)
@dosubot dosubot bot added size:XL This PR changes 500-999 lines, ignoring generated files. and removed size:L This PR changes 100-499 lines, ignoring generated files. labels Sep 18, 2024
@vasconceloscezar
Collaborator Author

I changed both the JSON and CSV loaders to accept a File, a Path, or a String representation.

image

And this is the new Message to Data
image

@github-actions github-actions bot added enhancement New feature or request and removed enhancement New feature or request labels Sep 18, 2024
@dosubot dosubot bot added the lgtm This PR has been approved by a maintainer label Sep 19, 2024
@ogabrielluiz ogabrielluiz enabled auto-merge (squash) October 2, 2024 19:44
@ogabrielluiz ogabrielluiz merged commit b8e7a77 into main Oct 3, 2024
27 checks passed
@ogabrielluiz ogabrielluiz deleted the feat/data-components-utilities branch October 3, 2024 15:48
Labels
enhancement New feature or request lgtm This PR has been approved by a maintainer python Pull requests that update Python code size:XL This PR changes 500-999 lines, ignoring generated files.