Skip to content

Conversation

@dhomeier
Copy link
Contributor

@dhomeier dhomeier commented Jun 23, 2021

To fix #831 and #824.

This converts the new ~astropy.units.physical.PhysicalType class from 4.3 (astropy/astropy#11204) to string, if necessary, and also uses the appropriate type only from types like PhysicalType({'energy', 'torque', 'work'}). Updated tests to test setting of custom spectral_axis units and types as well.

cm_unit = u.Unit(cm_unit)
if cm_unit.physical_type in ('length', 'frequency'):
cm_type = str(cm_unit.physical_type)
if 'length in cm_type' or 'frequency' in cm_type or 'energy' in cm_type:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if 'length in cm_type' or 'frequency' in cm_type or 'energy' in cm_type:
if 'length' in cm_type or 'frequency' in cm_type or 'energy' in cm_type:

Looks like one of the quotes is misplaced here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, yes; I should stay in Emacs+Flycheck for this stuff.

@dhomeier
Copy link
Contributor Author

I've simplified/reverted some of the comparisons since astropy/astropy#11625 (comment) clarified that PhysicalType has better support for string operations than it appeared.

@pllim
Copy link
Member

pllim commented Jun 23, 2021

The Windows failure is unrelated. And the remaining "allowed" failures appear unrelated.

The Windows failure is a symptom of a larger problem of not marking remote data access as # doctest: +REMOTE_DATA. As a results, all the jobs hammer the same service around the same time.

035     >>> file = download_file(filename, cache=True)
UNEXPECTED EXCEPTION: URLError(gaierror(11001, 'getaddrinfo failed'))

@dhomeier
Copy link
Contributor Author

dhomeier commented Jun 23, 2021

The Windows failure is a symptom of a larger problem of not marking remote data access as # doctest: +REMOTE_DATA. As a results, all the jobs hammer the same service around the same time.

And that download has a lot of redirections I noticed when fetching the file manually; don't know if that causes additional tensions.

The other failures are due to #832, not directly related to astropy (a number of new warnings from the updated WCSLIB though).

@pllim
Copy link
Member

pllim commented Jun 23, 2021

I think the changes are reasonable, so I approved but I'll let the package maintainer merge.

log.debug("Attempting auto-convert of table unit '%s' to "
"user-provided unit '%s'.", tab_unit, cm_unit)
log.debug(f"Attempting auto-convert of table unit {tab_unit} to "
f"user-provided unit {cm_unit}.")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logging library in python is optimized to use %s formatting (ref, ref), which is why you'll find it throughout the codebase. I realize it's not strictly enforced, so this is probably a fine change (easier to parse > optimization), but just for future reference.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting... but does this line run so much that it warrants optimization?

In [14]: %timeit log.debug(f"'{x.unit}'")
3.59 µs ± 176 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

In [15]: %timeit log.debug("'%s'" % x.unit)
1.49 µs ± 5.85 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's actually log.debug("'%s'", x.unit) which is probably using .format() internally:

In [8]: %timeit log.debug("'%s'", x.unit)
526 ns ± 8.8 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

In [9]: %timeit log.debug("'%s'".format(x.unit))
611 ns ± 11.2 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

Still, the speedup in this context is very modest compared to the time the actual code takes

In [10]: %timeit x.to(u.nm)
10.6 µs ± 199 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

In [11]: %timeit x.to(u.Hz, equivalencies=u.spectral())
64.5 µs ± 2.22 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

I find the original version easy enough to parse; just needs to be known since f-strings are otherwise strongly endorsed now in astropy core.

Copy link
Contributor

@nmearl nmearl Jun 28, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dhomeier is exactly right, the logging package is built to be passed arguments which are evaluated as-needed. In the case of f-strings, interpolation happens before logging visibility criteria is evaluated, which can lead to cases where string errors are raised, nothing gets logged, and the program exits because of the automatic evaluation of __str__. In %s-style, interpolation happens within the logging package and only as needed. It will therefore be caught as a logging-related issue, an error raised as log message, and the code will continue executing.

Additionally, log aggregation (i.e. the sentry package), works because the %s string template is the same between similar logging statements. Obviously, we don't make prolific use of this, nor do we experience string-related log-avoiding bugs, but there is a reason that python did not switch the package to using {} or f-style string interpolation with python 3.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, just because astropy uses it doesn't mean specutils has to. I don't think astropy uses much logging internally.

Though you probably want to document this somewhere in your dev docs, if not already.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I also took this solely as a recommendation, but still think it a good idea to synchronise the styles of coordinated packages unless there are more important considerations against it like here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Though you probably want to document this somewhere in your dev docs, if not already.

I don't think the dev docs here include anything on code style; might be best to just link to https://docs.astropy.org/en/latest/development/codeguide.html
But I believe the performance considerations could be of value there, too, even if logging is currently used very little (found some direct usage of logging in the FITS scripts but hardly any of its custom AstropyLogger) – maybe just adding a note to the recommendation on f-strings.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't bring AstropyLogger into this... 😆 There was a separate discussion, which resulted in doc update clarifying that the logger is for astropy internal use only.

If you want to put the doc upstream in astropy dev docs, I'll have to loop in @embray here since he's working on dev docs revamp over there. They would have a better idea whether this particular doc should go.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had no intention to switch to AstropyLogger here, was just looking where logging is used inside astropy, if at all.
If it isn't, there's obviously not much need to mention its own formatting interface. But it probably could not hurt either...

spectrum.write(tmpfile, format='tabular-fits', overwrite=True)

cmap = {spectral_axis: ('spectral_axis', wlu[spectral_axis]),
cmap = {spectral_axis: ('spectral_axis', 'micron'),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why was this changed? It seems to defeat the purpose of the parameterized test.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wlu[spectral_axis] is simply the original unit disp was created with, so for spec.spectral_axis the original test did not cover custom unit conversion by the mapping (and therefore did not find the bug of not identifying energy as a physical_type correctly).
I guess this might be moved to a separate check, but I don't see any problems with testing spectral_axis and flux units in the same run.

@dhomeier
Copy link
Contributor Author

Switched back to the original log() format and added some comments to explain what's done in the custom_mapping tests.

@nmearl
Copy link
Contributor

nmearl commented Jul 8, 2021

Thanks @dhomeier! Great PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Setting custom units in spectrum_from_column_mapping fails with astropy 4.3

4 participants