Skip to content

Conversation

@tinuademargaret
Copy link
Contributor

@tinuademargaret tinuademargaret commented Dec 11, 2020

This is my attempt of resolving the issue of adding directive supporting the use of remote data in doctests.
The issue can be found here #2

EDIT: Fix #2

EDIT: Re-posted notes from the author below.

  • A fourth directive .. doctest-remote-data:: that skips the next doctest chunk if --remote-data is not passed as an argument to the pytest command is included
  • If --remote-data option was not passed then check the chunk of strings provided by the result of doctest.DocTestParser.parse for the new directives using regex
  • If the new directive was matched, then skip the test

@pllim pllim changed the title initial attempt ENH: Support remote data in doctest Dec 11, 2020
@pllim
Copy link
Contributor

pllim commented Dec 14, 2020

I think simply skipping as per setting remote_data_strict = false over at https://github.com/astropy/pytest-remotedata is fine for now. But if we ever want to support the equivalent of remote_data_strict = true in doctest (do we?), we also have to make sure that the implementation in this PR will not require complete refactoring to add that feature.

Supporting the equivalent of --remote-data=astropy is probably overkill? I agree with Brigitta that supporting just the equivalent of --remote-data=none and --remote-data=all is sufficient for now.

Some test cases to consider (just brainstorming here, don't take this as a command). These might not all be suitable in CI but should at least be manually tested during development:

  • A plain block of doctest, following by remote data doctest, and then plain block again.
  • Remote data doctest followed by plan block of doctest. Will skipping remote data accidentally skip the plain block as well? It should not. Will specifying --remote-data run everything? It should.
  • All plain blocks of doctest without any marked remote data. Will skipping remote data accidentally skip these? It should not.
  • Several blocks of remote data doctest and only those. Will skipping remote data skip everything? It should. Will specifying --remote-data run everything? It should.
  • Can composite directive work? E.g., consider this:
.. doctest-remote-data::

    >>> some_func()  # doctest: +FLOAT_CMP
    1.2345
    >>> some_other_func_with_warnings()  # doctest: +IGNORE_WARNINGS +ELLIPSIS
    1.23...
  • Will misuse break it? Ideally, the extra +REMOTE_DATA would just be a silent no-op in this case? For example:
.. doctest-remote-data::

    >>> some_func_1()  # doctest: +REMOTE_DATA
    >>> some_func_2()

Thank you for tackling this!

@tinuademargaret
Copy link
Contributor Author

Hi @pllim, thank you for your suggestions. I have written a couple of test cases like you suggested, and I have been able to test for when the --remote-data option is not given, but when I try to pass in the --remote-data option I get an error.

This is the command I call pytest --doctest-plus --doctest-rst --remote-data tests/docs/skip_some.rst

This is the error I get ERROR: 'tests/docs/skip_some.rst' is not a valid source for remote data

If I go by the error message, do I really need to query an actual remote source for my tests?
I have pushed my latest commits so you can have a look.

@pllim
Copy link
Contributor

pllim commented Dec 22, 2020

To test how it is intended to be used, a lightweight remote data access might be nice. Something simple like using urllib to grab https://raw.githubusercontent.com/astropy/astropy-data/gh-pages/README.md might be sufficient (or even the README from this repo, doesn't matter).

But to better debug your problem, it would be nice if you can first reproduce that exact same error in the CI here. It seems like you might need to install pytest-remotedata as a test dependency now.

@bsipocz
Copy link
Member

bsipocz commented Dec 23, 2020

We have debugged this during our weekly call today.

With dev version pytest-remotedata the error message is a bit more verbose:
ERROR: 'tests/docs/skip_some.rst' is not a valid source for remote data, use one of 'astropy', 'any', 'github', 'none'

So, pytest --doctest-plus --doctest-rst --remote-data=any tests/docs/skip_some.rst fixes it. Having a default value for --remote-data, or astropy/pytest-remotedata#46, would be nice, but those are unrelated feature requests for pytest-remotedata.

@bsipocz
Copy link
Member

bsipocz commented Dec 23, 2020

As for the current test failures, that you also asked on slack, pytest-remotedata with these new additions are a requirement for the tests, and therefore need to be installed by tox.
I think we can make it a required test dependency and add it via setup.cfg and options.extras_required and install it for all test runs, but if either of you @pllim or @saimn have strong feelings otherwise, we can hack it to be an optional test dependency, too.

I have a solution that works locally for the first option, please let me know if you want me to push it to this branch. But I'm equally happy if you rather want to figure it out yourself @tinumide (you can use the astropy config files as guidance) as there is a similar situation there.

@bsipocz
Copy link
Member

bsipocz commented Dec 23, 2020

There are two more things that came into my mind while reviewing your tests:

  • I think currently there is no error if remote data is accessed without using the directive (unlike for the remote-data decorator when an IOError is raised with "An attempt was made to connect to the internet". I'm not sure we want this behaviour, it can certainly wait and be added once this first implementation is already been merged).

  • would be nice to add another directive, that works for the whole file/rest of the file and not just for the code block. e.g. doctest-remote-data-all

  • I think it already works like that: please do add tests for composite usage of multiple directives, e.g. adding a nonsensical example like this:

This should be skipped, even when remote data is enabled as satrapy is not installed::

.. doctest-remote-data::
.. doctest-requires:: satrapy
    >>> 1 + 3
    2

@pllim
Copy link
Contributor

pllim commented Dec 23, 2020

p.s. .. doctest-requires:: satrapy should be .. doctest-requires:: astropy 😉

Re: Test dependencies -- I am okay with defining pytest-remotedata under "extras required" as long as we also have a job that runs without it to make sure pytest-remotedata does not accidentally become a required dependency.

@bsipocz
Copy link
Member

bsipocz commented Dec 23, 2020

No, I think one example should be skipping because of the remote data, but it should be remained skipped due you the "missing" dependency

@pllim
Copy link
Contributor

pllim commented Dec 23, 2020

Ahhh... okay... maybe something clearer like .. doctest-requires:: does_not_exist because I have the habit to change "satrapy" back to "astropy"... 😬

@tinuademargaret
Copy link
Contributor Author

As for the current test failures, that you also asked on slack, pytest-remotedata with these new additions are a requirement for the tests, and therefore need to be installed by tox.
I think we can make it a required test dependency and add it via setup.cfg and options.extras_required and install it for all test runs, but if either of you @pllim or @saimn have strong feelings otherwise, we can hack it to be an optional test dependency, too.

I have a solution that works locally for the first option, please let me know if you want me to push it to this branch. But I'm equally happy if you rather want to figure it out yourself @tinumide (you can use the astropy config files as guidance) as there is a similar situation there.

Sure, sounds good to try out.

@tinuademargaret
Copy link
Contributor Author

tinuademargaret commented Dec 24, 2020

There are two more things that came into my mind while reviewing your tests:

  • I think currently there is no error if remote data is accessed without using the directive (unlike for the remote-data decorator when an IOError is raised with "An attempt was made to connect to the internet". I'm not sure we want this behaviour, it can certainly wait and be added once this first implementation is already been merged).
  • would be nice to add another directive, that works for the whole file/rest of the file and not just for the code block. e.g. doctest-remote-data-all
  • I think it already works like that: please do add tests for composite usage of multiple directives, e.g. adding a nonsensical example like this:
This should be skipped, even when remote data is enabled as satrapy is not installed::

.. doctest-remote-data::
.. doctest-requires:: satrapy
    >>> 1 + 3
    2

I think raising an exception when remote data is accessed without using the directive would be a nice to have. I mean if the goal is to avoid spamming data providers as much as possible while testing, then the exception would come in handy in the case where a developer forgets to add the remote-data directive, or a new contributor is not sure of what should be done or is even completely unaware. Although I'm not sure of the likeliness of any of these events occurring.

@tinuademargaret
Copy link
Contributor Author

There are two more things that came into my mind while reviewing your tests:

  • I think currently there is no error if remote data is accessed without using the directive (unlike for the remote-data decorator when an IOError is raised with "An attempt was made to connect to the internet". I'm not sure we want this behaviour, it can certainly wait and be added once this first implementation is already been merged).
  • would be nice to add another directive, that works for the whole file/rest of the file and not just for the code block. e.g. doctest-remote-data-all
  • I think it already works like that: please do add tests for composite usage of multiple directives, e.g. adding a nonsensical example like this:
This should be skipped, even when remote data is enabled as satrapy is not installed::

.. doctest-remote-data::
.. doctest-requires:: satrapy
    >>> 1 + 3
    2

Also @bsipocz I think we can create issues for the new features such as the first point you've suggested here so that we can keep track of them. Or would you rather have them in this MR?

Comment on lines 328 to 344
if config.getoption('remote_data', 'none') != 'any':
matches = [re.match(
r'{}\s+doctest-remote-data\s*::(\s+.*)?'.format(comment_char),
last_line) for last_line in last_lines]

if len(matches) > 1:
match = matches[0] or matches[1]
else:
match = matches[0]

if match:
marker = match.group(1)
if (marker is None or
(marker.strip() == 'win32' and
sys.platform == 'win32')):
skip_next = True
continue
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't need the extra match that you copy-pasted from doctest-requires, as you don't need to capture the upstream dependency here.

Suggested change
if config.getoption('remote_data', 'none') != 'any':
matches = [re.match(
r'{}\s+doctest-remote-data\s*::(\s+.*)?'.format(comment_char),
last_line) for last_line in last_lines]
if len(matches) > 1:
match = matches[0] or matches[1]
else:
match = matches[0]
if match:
marker = match.group(1)
if (marker is None or
(marker.strip() == 'win32' and
sys.platform == 'win32')):
skip_next = True
continue
if config.getoption('remote_data', 'none') != 'any':
matches = [re.match(
r'{}\s+doctest-remote-data\s*::'.format(comment_char),
last_line) for last_line in last_lines]
if len(matches) > 1:
match = matches[0] or matches[1]
else:
match = matches[0]
if match:
skip_next = True
continue

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh I see

@tinuademargaret
Copy link
Contributor Author

Hi @pllim, @saimn you can take a look at my latest commit instead. I fixed the FLOAT_CMP test case. The skipping logic actually works but there is a funny behaviour when it is used with the IGNORE_WARNING directive.

@pllim
Copy link
Contributor

pllim commented Jan 4, 2021

Yes, @tinumide , I agree about IGNORE_WARNING (see #137 (comment)). @astrofrog wrote that part, so hopefully he can advise.

@tinuademargaret
Copy link
Contributor Author

Yes, @tinumide , I agree about IGNORE_WARNING (see #137 (comment)). @astrofrog wrote that part, so hopefully he can advise.

Alright, thanks. @astrofrog kindly take a look

@tinuademargaret tinuademargaret marked this pull request as ready for review January 12, 2021 12:53
@tinuademargaret
Copy link
Contributor Author

Opening a new issue for the issue with the +IGNORE_WARNING directive so that this PR can be reviewed and merged.

Copy link
Contributor

@pllim pllim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please undo all the unnecessary blank lines across different files and fix the CI.

There also needs to be a clear warning that this new directive cannot be used together with IGNORE_WARNINGS.

@tinuademargaret
Copy link
Contributor Author

#148 is merged, @tinumide , please rebase to pick it up; +IGNORE_WARNINGS should now work with your new feature.

p.s. You might have to resolve conflicts, so please be careful during rebase. When in doubt, back up your existing work to another branch before rebase. Thanks!

I've rebased. +IGNORE_WARNINGS now works as it should

Copy link
Contributor

@pllim pllim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All the commits need to be squashed into one commit to remove from the history that vscode file that was accidentally added and then removed.

You might want to set your editor to automatically remove trailing whitespace. I see some of those in your diff, especially if you pasted code from your Python session.

We no longer has a test job that make sure pytest-doctestplus can still run without pytest-remotedata. Do we need that?

Would be nice if @ceb8 or @bsipocz could take this PR for a spin over at astroquery after the review comments are addressed but before merge.

README.rst Outdated
this way:

.. code-block:: python
.. doctest-remote-data::
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is good practice to add blank line after the directive because I think some of the directives out there actually require it (but I am not 100% sure).

Suggested change
.. doctest-remote-data::
.. doctest-remote-data::

README.rst Outdated
Comment on lines 283 to 282
>>> from contextlib import closing
>>> from urllib.request import urlopen
>>> with closing(urlopen('https://www.astropy.org')) as remote:
... remote.read() # doctest: +IGNORE_OUTPUT

Copy link
Contributor

@pllim pllim Jan 19, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The doc in readme isn't tested, so you can simplify the example as follows without fear of introducing extra test dependencies:

Suggested change
>>> from contextlib import closing
>>> from urllib.request import urlopen
>>> with closing(urlopen('https://www.astropy.org')) as remote:
... remote.read() # doctest: +IGNORE_OUTPUT
>>> import requests
>>> r = requests.get('https://www.astropy.org')


This code won't run. So let's make sure that it gets skipped:

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove addition of trailing whitespace.

Suggested change

.. doctest-skip::

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove addition of trailing whitespace.

Suggested change



Remote data block code sandwiched in block codes
===============================================
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
===============================================
================================================


This code block should work just fine::

>>> 1 + 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How would we know if this is actually silently skipped too, say, because it thinks this code is still somehow part of remote date block above? That is, how are we sure the remote data block really ended before this line?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So a blank line within a doctest separates one chunk of code from another in the case of ..doctes-remote-data:: i.e the directive considers the chunk of code after a blank line as a separate chunk of code. That way we know that it does not control the chunk of code.


This code block should work just fine::

>>> 1 + 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How would we know if this is actually silently skipped too, say, because it thinks this code is still somehow part of remote date block above? That is, how are we sure the remote data block really ended before this line?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I'm not sure how we could do that, as currently all the narrative docs testings are done as one test, if it's stopped somewhere the rest of the file is not being picked up. But then we have a tracelog with a failure. Whenever it's skipped, we have no easy way to see that the rest of the file was run as expected. Maybe add a failing one to the very bottom of the file? But then it's a failing test, and here we cannot really have it as an xfail, afaik.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, perhaps that test case is too complicated to be tested this way. If this is tested in the other unit tests, then we don't have to do it here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, good point, there is indeed a better handle of these in the unittest content.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pllim - So if we run this file with --remote-data, then it of course stops with a fail on L98. As expected. So I'm not sure what to do with the file itself, maybe only include it in when --remote-data is not used, and test the working of the remote-data stuff separately, only in unit tests?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe put those tests in a dedicated file that can be run with and without --remote-data ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what was the goal of having all those blocks that should fail ? Since the first failing test will fail the file anyway, we cannot test that the other blocks that should fail actually fail. So I guess we could test the expected failures in the unit tests, and in this file test blocks that should either pass or be skipped ?

p = testdir.makefile(
'.rst',
"""
# This test should fail, but we skip it, thus the test should pass.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# This test should fail, but we skip it, thus the test should pass.
# This test should fail, but we skip it, thus the test should be skipped.

testdir.makeini(
"""
[pytest]
doctest_optionflags = ELLIPSIS
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the purpose of the extra doctest_optionflags = ELLIPSIS in this test?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was trying to test if the ELLIPSIS option works well with the .. doctest-remote-data:: directive, but I think I misunderstood it's function. The correct option here should be NORMALIZE_WHITESPACE

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pllim - we run into some cases where it seemed some of the default flags don't work nicely together, so as mentioned above the purpose was to test some of them.
Maybe the test is superfluous, then we can remove it later.

Copy link
Contributor

@saimn saimn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems mostly ok, just a few comments about the test comments which need some clarifications.

Copy link
Member

@bsipocz bsipocz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should just wrap this up. I still don't know what the best approach with the skip_some.rst file, I like that it includes the remote-data related stuff, but then we cannot test it with remote-data enabled.

Otherwise the only other outstanding idea is to double up the tests, run one with remote-data one without and change only the expected outcome.

Otherwise, this seems to be working as expected with the astroquery PRs that I test it with, so would vote for merging it and making a quick release asap. If anything comes up, we can always fix it in a new release, after all this should not break anything but adds new functionality only.

testdir.makeini(
"""
[pytest]
doctest_optionflags = ELLIPSIS
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pllim - we run into some cases where it seemed some of the default flags don't work nicely together, so as mentioned above the purpose was to test some of them.
Maybe the test is superfluous, then we can remove it later.

@pllim
Copy link
Contributor

pllim commented Jun 30, 2021

Re: tests -- I am okay with getting rid of those that don't work in the interest of wrapping this up and doing a release. We can always revisit if we encounter bugs in production. I am only mainly interested in this not breaking any existing functionality, especially for astropy core lib.

@bsipocz
Copy link
Member

bsipocz commented Jun 30, 2021

I can certainly do a test pr astropy core, but can't see anything that would break it.

tinuademargaret and others added 8 commits June 30, 2021 22:54
more descriptive comments in skip_some.rst

Update tests/test_doctestplus.py

Co-authored-by: Brigitta Sipőcz <[email protected]>

added pytest-remotedata as a requirement to be installed by tox

added version to pytest-remotedata

composite directive test fails

latest commit

removed test for ignore warning directive

test cleanup

removed extra marker

updated README and added changelog entry

review corrections

added vscode setting to .gitignore

removed .vscode

splitted test functions

review corrections
@bsipocz
Copy link
Member

bsipocz commented Jul 1, 2021

@pllim - I tested this locally with astropy main, and all seems good, I see test failures, but I see them with the current doctestplus release, too.


This code block should work just fine::

>>> 1 + 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe put those tests in a dedicated file that can be run with and without --remote-data ?

@bsipocz
Copy link
Member

bsipocz commented Jul 1, 2021

OK, so let's go ahead with this to avoid Friday EOB releases :)

@bsipocz bsipocz merged commit bd19587 into scientific-python:main Jul 1, 2021
@bsipocz
Copy link
Member

bsipocz commented Jul 1, 2021

Thank you very much @tinumide!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add directive supporting the use of remote data in doctests

4 participants