Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Document how to test examples in user guide, add some more coverage #11178

Merged
merged 2 commits into from
Jul 2, 2024

Conversation

alamb
Copy link
Contributor

@alamb alamb commented Jun 29, 2024

Which issue does this PR close?

Part of #11172 and #1813

Rationale for this change

I am trying to make the examples easier to find / navigate. Right now there are many files in datafusion-examples but they are somewhat overwhelming. There are also some examples in the user guide, but many are not tested

I tried to consolidate the examples in #11173 but @lewiszlw (rightly) pointed out that one giant .rs example file might be even harder to find / navigate

I think the ideal outcome from a users perspective is for the examples to be inline in the user guide so they are both discoverable and the context explained. However, it is critical that the examples be testable

What changes are included in this PR?

  1. Document how to test examples in the user guide
  2. Add test coverage for expressions.rs and fix the example there.

Are these changes tested?

Yes,

Are there any user-facing changes?

Documentation on tests.

@github-actions github-actions bot added the core Core DataFusion crate label Jun 29, 2024
@efredine
Copy link
Contributor

As a newcomer, I can say that more guidance on the examples is good! I think it might also be helpful:

  1. To perhaps provide more guidance on when on when an example should be included in the documentation?
  2. To include links to the appropriate examples in the example directory when appropriate? Especially for more advanced use cases?

I'll also note that the for newcomers the hardest part of examples - and tests - is figuring out how to setup the appropriate pre-conditions. Maybe we could create some "best practices" around how to approach this. Or perhaps its just a matter of reading enough existing examples and tests!

@alamb
Copy link
Contributor Author

alamb commented Jun 30, 2024

Thanks @efredine, this is great feedback

To perhaps provide more guidance on when on when an example should be included in the documentation?

Yes, I agree this would be helpful. I'll think if I can come up with some summary that matches current reality (rather than what I would ideally liek)

To include links to the appropriate examples in the example directory when appropriate? Especially for more advanced use cases?

Yes, I agree -- I think this is the approach that @tshauck took with the user guide (e.g. https://datafusion.apache.org/library-user-guide/working-with-exprs.html has links to various examples)

I'll also note that the for newcomers the hardest part of examples - and tests - is figuring out how to setup the appropriate pre-conditions. Maybe we could create some "best practices" around how to approach this. Or perhaps its just a matter of reading enough existing examples and tests!

When you say "preconditions" what do you mean? Is it the correct data directories / data files checked out?

@efredine
Copy link
Contributor

By pre-conditions I meant that in order to test or illustrate something like an option when reading a file I need to first create a file and write to it with data that is meaningful for the thing being tested.

@alamb
Copy link
Contributor Author

alamb commented Jul 1, 2024

By pre-conditions I meant that in order to test or illustrate something like an option when reading a file I need to first create a file and write to it with data that is meaningful for the thing being tested.

🤔 Maybe we could add some examples of "how to create test data" (specifically we can now use the COPY command thanks to @devinjdangelo so it is significantly easier)

//
// For example, if `user_guide_expressions(line 123)` fails,
// go to `docs/source/user-guide/expressions.md` to find the relevant problem.

#[cfg(doctest)]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it turns out we already have some coverage of tests in the user guide, but it was somewhat unclear. I documented how it works

Copy link
Contributor

@comphead comphead left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm thanks @alamb for documenting this part!

@alamb
Copy link
Contributor Author

alamb commented Jul 2, 2024

Thank you for your review (as always @comphead ) 🙏

@alamb alamb merged commit 1840ab5 into apache:main Jul 2, 2024
24 checks passed
@alamb alamb deleted the alamb/doc_tests branch July 2, 2024 16:22
findepi pushed a commit to findepi/datafusion that referenced this pull request Jul 16, 2024
…pache#11178)

* Document testing examples, add some more coverage

* add docs
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core Core DataFusion crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants