Documentation for Preview Dataset #1757
docs/source/preview_datasets.md
Outdated
This page describes how to preview data from different datasets in a Kedro project with Kedro-Viz. Dataset preview was introduced in Kedro-Viz version 6.3.0, which offers preview for `CSVDatasets` and `ExcelDatasets`.
To provide users with a glimpse of their datasets within a Kedro project, Kedro-Viz offers a preview feature. This feature was introduced in Kedro-Viz version 6.3.0 and expanded upon in version 8.0.0. Initially, it supported `CSVDatasets` and `ExcelDatasets`, and later extended to encompass additional dataset types such as `PlotlyDatasets` and image datasets like `MatplotlibWriter`.
[Nit] `ExcelDatasets`, and was later extended to ..
docs/source/preview_datasets.md
Outdated
```{important}
We recommend that you use the same version of Kedro that was most recently used to test this tutorial (0.19.0). To check the version installed, type `kedro -V` in your terminal window.
```
Whilst we currently support the above datasets. We are soon going to extend this functionality to other datasets. Users with custom datasets can also extend the preview functionality and we will cover that in the following sections.
While we currently support the aforementioned datasets, we are soon going to extend this functionality to include other datasets. Users with custom datasets can also expand the preview functionality, and we will cover that in the following sections.
docs/source/preview_datasets.md
Outdated
**Extend Preview to Custom Datasets**

The page titled [Extend Preview to Custom Datasets](./preview_custom_datasets.md) contains information on how you can set up preview for custom datasets and what types are supported by Kedro-viz.
The page titled Extend Preview to Custom Datasets contains information on how to set up previews for custom datasets and which types are supported by Kedro-Viz.
docs/source/preview_datasets.md
Outdated
To enable dataset preview, add the `preview_args` attribute to the kedro-viz configuration under the `metadata` section in the Data Catalog. Within preview_args, specify `nrows` as the number of rows to preview for the dataset.
To disable dataset previews for specific datasets, you need to set preview: false under the kedro-viz key within the metadata section of your conf.yml file. Here's an example configuration:
Should we mention that previewing is enabled by default from the latest version of Kedro-Viz, and that it is opt-in for anyone using an older version? It would also be a good idea to mention somewhere in the doc which kedro-datasets version this new Kedro-Viz feature supports. Thank you.
Good point.
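To make the opt-out behaviour discussed here concrete, the following is a minimal sketch in Python rather than YAML. It assumes a catalog entry parses to a nested dict with a `preview` flag under `metadata` -> `kedro-viz`, and that preview is on unless that flag is explicitly false. The `is_preview_enabled` helper and the example entry are hypothetical illustrations, not part of the Kedro-Viz API.

```python
# Hypothetical catalog entry, written as the dict the YAML would parse to.
catalog_entry = {
    "type": "pandas.CSVDataset",
    "filepath": "data/01_raw/companies.csv",
    "metadata": {"kedro-viz": {"preview": False}},
}


def is_preview_enabled(entry: dict) -> bool:
    """Hypothetical helper: preview is off only when explicitly set to false."""
    viz_metadata = (entry.get("metadata") or {}).get("kedro-viz") or {}
    # Assumption: a missing `preview` key means preview stays enabled.
    return viz_metadata.get("preview", True)


print(is_preview_enabled(catalog_entry))                  # entry that opts out
print(is_preview_enabled({"type": "pandas.CSVDataset"}))  # entry with no metadata
```

Whether a missing key really means "enabled" depends on the Kedro-Viz version in use, which is the point raised in the comment above.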
```

## Previewing Data on Kedro-viz
FYI - this entire section has now moved to Preview Tabular Data on Kedro-viz
@@ -0,0 +1,74 @@
# Preview Tabular Data in Kedro-viz
The content in this section is not new; it has just been moved to a new page. Earlier it was part of the 'Preview Datasets' page.
Some initial comments, but I'll have another look next week 🙂
from kedro_datasets._typing import TablePreview

class CustomDataset:
    def preview(self, nrows: int = 5) -> TablePreview:
Can we maybe add an example that works, to give users a bit more guidance on how they should realistically implement a `preview()` method?
I will need some help on this as I can't think of a realistic CustomDataset that is not a part of kedro-datasets. @astrojuanlu -- do you have some ideas?
A few minor comments; I think the most important one is an example of `CustomDataset`. Other than that, there are quite a few inconsistent uses of `Dataset`, `Datasets` and `DataSet`.
preview_args:
  nrows: 15
```

If no preview_args are specified, the default preview will show the first 5 rows.
This was asked on Slack once. I think we should make it obvious that `preview_args` are the arguments that get passed into the `preview` function directly, and that users can have arbitrary arguments.
def preview(self, arg1, arg2):
    ...
So I think for pandas datasets it is specific to `nrows`, as we wrote the `preview()` function. But I have updated the custom dataset docs to include arguments. Thanks for highlighting this @noklam.
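The point in this thread, that `preview_args` are handed straight to `preview()` as keyword arguments, can be sketched as follows. This is an illustration only: `InMemoryTableDataset`, its `columns` argument, and the `call_preview` stand-in for the caller are all hypothetical names, and the unpacking with `**` is an assumption about how the arguments are forwarded.

```python
from typing import Any, Optional


class InMemoryTableDataset:
    """Hypothetical custom dataset that keeps its rows in a list of dicts."""

    def __init__(self, rows: list[dict[str, Any]]):
        self._rows = rows

    def preview(self, nrows: int = 5, columns: Optional[list[str]] = None) -> dict:
        """Return the first `nrows` rows, optionally restricted to `columns`."""
        selected = columns or (list(self._rows[0]) if self._rows else [])
        head = self._rows[:nrows]
        return {
            "index": list(range(len(head))),
            "columns": selected,
            "data": [[row[name] for name in selected] for row in head],
        }


def call_preview(dataset: InMemoryTableDataset, preview_args: Optional[dict] = None) -> dict:
    """Hypothetical stand-in for the caller: unpack preview_args as keywords."""
    return dataset.preview(**(preview_args or {}))


rows = [{"id": i, "name": f"item-{i}"} for i in range(10)]
dataset = InMemoryTableDataset(rows)

# No preview_args: the defaults in the preview() signature apply.
default_preview = call_preview(dataset)
# Arbitrary arguments are fine as long as preview() accepts them.
custom_preview = call_preview(dataset, {"nrows": 2, "columns": ["name"]})
```

With this shape, any key placed under `preview_args` must match a parameter name of the dataset's own `preview()` method, which is why the pandas datasets only understand `nrows`.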
When creating a custom dataset, if you wish to enable data preview for that dataset, you must implement a `preview()` function within the custom dataset class. Kedro-Viz currently supports previewing tables, Plotly charts, images, and JSON objects.

The return type of the `preview()` function should match one of the following types, as defined in the `kedro-datasets` source code (_typing.py file):
Any chance to add a link to such file?
Is the GitHub link fine? https://github.com/kedro-org/kedro-plugins/blob/main/kedro-datasets/kedro_datasets/_typing.py
I can't seem to find a docs source code link for `_typing.py`.
Yeah this is not documented, so a link to the source code is fine for now.
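Since the file is undocumented, a small sketch may help readers see what "matching a preview type" looks like for something other than a table. It assumes the preview types are thin `NewType` aliases over built-ins and that an image preview is a base64-encoded string; both are assumptions to check against the linked `_typing.py`. The `ImagePreview` below is a local stand-in so the snippet runs without `kedro-datasets`, and `BytesImageDataset` is a hypothetical class.

```python
import base64
from typing import NewType

# Local stand-in for kedro_datasets._typing.ImagePreview (assumed to alias str).
ImagePreview = NewType("ImagePreview", str)


class BytesImageDataset:
    """Hypothetical custom dataset wrapping raw image bytes."""

    def __init__(self, image_bytes: bytes):
        self._image_bytes = image_bytes

    def preview(self) -> ImagePreview:
        """Return the image as a base64-encoded string."""
        encoded = base64.b64encode(self._image_bytes).decode("utf-8")
        return ImagePreview(encoded)


png_signature = b"\x89PNG\r\n\x1a\n"  # placeholder bytes, not a full image
image_preview = BytesImageDataset(png_signature).preview()
```

In a real dataset the import would come from `kedro_datasets._typing` instead of the stand-in defined here.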
docs/source/preview_datasets.md
Outdated
In your terminal window, navigate to the folder you want to store the project. Generate the spaceflights tutorial project with all the code in place by using the [Kedro starter for the spaceflights tutorial](https://github.com/kedro-org/kedro-starters/tree/main/spaceflights-pandas):
While we currently support the aforementioned datasets, we are soon going to extend this functionality to include other datasets. Users with custom datasets can also expand the preview functionality, which is covered in the section [Extend Preview to Custom Datasets](./preview_custom_datasets.md).
Do we have Vale enabled on this repo? I get the sense that "aforementioned" would be flagged as too wordy.
I don't think so.
We recommend that you use the same version of Kedro that was most recently used to test this tutorial (0.19.0). To check the version installed, type `kedro -V` in your terminal window.
```

In your terminal window, navigate to the folder you want to store the project. Generate the spaceflights tutorial project with all the code in place by using the [Kedro starter for the spaceflights tutorial](https://github.com/kedro-org/kedro-starters/tree/main/spaceflights-pandas):
Probably capitalise Spaceflights?
In your terminal window, navigate to the folder you want to store the project. Generate the spaceflights tutorial project with all the code in place by using the [Kedro starter for the spaceflights tutorial](https://github.com/kedro-org/kedro-starters/tree/main/spaceflights-pandas):
In your terminal window, navigate to the folder you want to store the project. Generate the Spaceflights tutorial project with all the code in place by using the [Kedro starter for the spaceflights tutorial](https://github.com/kedro-org/kedro-starters/tree/main/spaceflights-pandas):
I noticed it's mostly lower-case elsewhere in the docs so I will leave it as is.
Co-authored-by: Juan Luis Cano Rodríguez <[email protected]>
…kedro-viz into docs/preview-datasets
LGTM!
I left some minor comments and a suggestion on adding an example `preview()` method. It's all non-blocking though, so I'll approve and let you decide on whether to add it or not 🙂
@@ -0,0 +1,59 @@
# Extend preview to Custom Datasets
# Extend preview to Custom Datasets
# Extend preview to custom datasets
```

## Examples of Previews
Is there a way to show the `JSONPreview` as well?
So, the JSON preview is actually experiment-tracking oriented, hence I was hesitant to share it as it might create some confusion. In the next couple of sprints, we will enable preview for a `JSONDataset`, and I could add that example then.
class CustomDataset:
    def preview(self, nrows, ncolumns, filters) -> TablePreview:
        # Add logic for generating preview
When I suggested adding a working example, I didn't necessarily mean anything complex, just in this case some code that would produce a working `TablePreview` and demonstrates how `nrows`, `ncolumns` and `filters` would be used.
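One possible shape for such an example is sketched below, using only the standard library so it runs without `kedro-datasets` or pandas. Several things are assumptions: `TablePreview` is a local stand-in for the type in `kedro_datasets._typing` (assumed to alias `dict`), the returned layout mirrors pandas' `DataFrame.to_dict(orient="split")`, `filters` is treated as a column-to-value equality map, and the in-memory `CustomDataset` is purely illustrative.

```python
from typing import Any, NewType, Optional

# Local stand-in for kedro_datasets._typing.TablePreview (assumed to alias dict).
TablePreview = NewType("TablePreview", dict)


class CustomDataset:
    """Hypothetical dataset backed by an in-memory list of row dicts."""

    def __init__(self, rows: list[dict[str, Any]]):
        self._rows = rows

    def preview(
        self,
        nrows: int = 5,
        ncolumns: Optional[int] = None,
        filters: Optional[dict[str, Any]] = None,
    ) -> TablePreview:
        # Keep rows whose values equal every entry in `filters` (assumed semantics).
        filters = filters or {}
        matching = [
            row for row in self._rows
            if all(row.get(col) == value for col, value in filters.items())
        ]
        # Truncate to the first `nrows` rows and the first `ncolumns` columns.
        head = matching[:nrows]
        columns = list(self._rows[0])[:ncolumns] if self._rows else []
        # Mirror the "split" layout: index, columns and row-wise data.
        return TablePreview({
            "index": list(range(len(head))),
            "columns": columns,
            "data": [[row[col] for col in columns] for row in head],
        })


rows = [
    {"id": 1, "company": "A", "approved": True},
    {"id": 2, "company": "B", "approved": False},
    {"id": 3, "company": "C", "approved": True},
]
table_preview = CustomDataset(rows).preview(nrows=5, ncolumns=2, filters={"approved": True})
```

A dataset that already loads a pandas DataFrame could likely do the same filtering and truncation on the DataFrame and return `df.to_dict(orient="split")` instead of building the dict by hand.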
LGTM!
Awesome work with the docs 💯 ...LGTM
Description
Documentation for the new changes in 'Preview Datasets'
Development notes
QA notes
Checklist
`RELEASE.md` file