[Plugin] Helper functions for StructuredDatasets and FlyteFiles in Papermill Plugin #3617

Closed
peridotml opened this issue Apr 24, 2023 · 2 comments

peridotml commented Apr 24, 2023

Use Case

A common use case for the papermill plugin could be automatically generating notebook reports at the end of modeling or ETL pipelines. A notebook is useful because a data scientist can download it and inspect the results further. The alternative is much harder.

Problem

Unfortunately, Papermill's inputs are limited, which makes it difficult to get data and files from Flyte into the notebooks. Doing so currently requires a few extra tasks.

Idea

Provide helper functions that work with common data types such as StructuredDataset, FlyteDirectory, and FlyteFile.

It could be an inputs version of record_outputs, although serializing into JSON might be difficult. See https://github.com/flyteorg/flytekit/blob/master/plugins/flytekit-papermill/flytekitplugins/papermill/task.py#L294.
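
To make the JSON concern concrete, here is a minimal sketch of one possible shape for such helpers. It is not the plugin's API: `RemoteRef`, `encode_inputs`, and `decode_inputs` are hypothetical names, and `RemoteRef` is only a stand-in for a Flyte file, directory, or dataset reference that can be reduced to a URI. The idea is that the task side encodes references into a single JSON string parameter, and the notebook side decodes them back.

```python
import json
from dataclasses import dataclass


@dataclass
class RemoteRef:
    """Hypothetical stand-in for a Flyte file, directory, or dataset reference."""

    kind: str  # e.g. "file", "directory", "structured_dataset"
    uri: str


def encode_inputs(**inputs) -> str:
    """Encode inputs as one JSON string that can be passed as a notebook parameter."""
    payload = {}
    for name, value in inputs.items():
        if isinstance(value, RemoteRef):
            # References are reduced to a tagged dict holding only the URI.
            payload[name] = {"__ref__": value.kind, "uri": value.uri}
        else:
            # Anything else must already be JSON-serializable.
            payload[name] = value
    return json.dumps(payload, sort_keys=True)


def decode_inputs(raw: str) -> dict:
    """Inverse of encode_inputs, intended to be called inside the notebook."""
    decoded = {}
    for name, value in json.loads(raw).items():
        if isinstance(value, dict) and "__ref__" in value:
            decoded[name] = RemoteRef(kind=value["__ref__"], uri=value["uri"])
        else:
            decoded[name] = value
    return decoded


raw = encode_inputs(
    report_data=RemoteRef(kind="structured_dataset", uri="s3://bucket/data"),
    threshold=0.5,
)
inputs = decode_inputs(raw)
```

In this sketch only the URI crosses the JSON boundary, so actually downloading or opening the data would still be left to whatever the notebook does with the decoded reference.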

welcome bot commented Apr 24, 2023

Thank you for opening your first issue here! 🛠

@pingsutw pingsutw added the flytekit FlyteKit Python related issue label Apr 25, 2023