Provide a better reference for what units are supported. #562

Open
ChrisBarker-NOAA opened this issue Nov 5, 2024 · 2 comments
Labels
enhancement Proposals to add new capabilities, improve existing ones in the conventions, improve style or format

Comments

@ChrisBarker-NOAA
Contributor

Provide a better reference for what units are supported.

Moderator

TBD

Moderator Status Review [last updated: YYYY-MM-DD]

Brief comment on current status, update periodically

Requirement Summary

Users of CF need to know what unit strings they can use for their data. While the standard names provide canonical units, other units are allowed, and not everything has a standard name.

Consumers of CF data files also need to know what unit strings they might encounter for given data.

So there should be an easy way for users -- both producers and consumers -- to quickly know what the options are for a particular physical quantity, e.g. this data variable is a velocity, so I can use (and may encounter) m/s or meters per second or meter/s or [a large number of combinations!]

CF passes this responsibility on to the UDUNITS package -- that's fine and complete, but the documentation is lacking.

Technical Proposal Summary

The CF spec should include, or reference, a clear and complete definition of all the units and spellings that are allowed.

It's possible that such a document exists, in which case, CF could simply point to it.

Otherwise, a new document / page / section should be created.

Benefits

Everyone that is producing CF compliant files, and even more so folks that are writing code to consume CF compliant files.

Status Quo

Currently in CF, there is this text:

The value of the units attribute is a string that can be recognized by the UDUNITS package [UDUNITS],
...
Note that case is significant in the units strings. Note also that CF depends on UDUNITS only for the definition of legal units strings. CF does not assume or require that the UDUNITS software will be used for units conversion.

There is a link to the UDUNITS page:

https://www.unidata.ucar.edu/software/udunits/

Technically, this is complete, but in practice, the UDUNITS page isn't that helpful for letting people know what they can use. All the definitions are in XML files, but that's not exactly human-readable. And I don't see rules anywhere there for pluralization, how to combine units, etc.
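To illustrate what a more readable view of those XML definitions might look like, here is a minimal Python sketch that flattens unit entries into a name / symbol / alias listing. The embedded XML is a simplified, hand-written excerpt in the style of the UDUNITS-2 database; the element names (`unit`, `name/singular`, `symbol`, `aliases`) are an assumption and should be checked against the real files before relying on this.

```python
import xml.etree.ElementTree as ET

# Simplified, hand-written excerpt in the style of the UDUNITS-2 XML database.
# ASSUMPTION: the element names below mirror the real files; verify before use.
SAMPLE_XML = """
<unit-system>
  <unit>
    <base/>
    <name><singular>meter</singular></name>
    <symbol>m</symbol>
    <aliases>
      <name><singular>metre</singular></name>
    </aliases>
  </unit>
  <unit>
    <base/>
    <name><singular>second</singular></name>
    <symbol>s</symbol>
  </unit>
</unit-system>
"""


def list_units(xml_text):
    """Return one dict per <unit> with its primary name, symbols, and aliases."""
    root = ET.fromstring(xml_text)
    rows = []
    for unit in root.iter("unit"):
        # Primary name: the <name> element that is a direct child of <unit>.
        name = unit.findtext("name/singular")
        # Symbols declared directly under <unit>.
        symbols = [s.text for s in unit.findall("symbol")]
        # Alternative spellings listed under <aliases>.
        aliases = [a.text for a in unit.findall("aliases/name/singular")]
        rows.append({"name": name, "symbols": symbols, "aliases": aliases})
    return rows


for row in list_units(SAMPLE_XML):
    print(f"{row['name']:10} symbols={row['symbols']} aliases={row['aliases']}")
```

A listing like this only covers names and symbols. It says nothing about pluralization, prefixes, or how units may be combined, which is the part that still needs written rules.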

So the only way really to know is to use UDUNITS itself to test what you may want to use, which is a bit in conflict with the statement that "CF does not assume or require that the UDUNITS software will be used for units conversion."

The challenge here is that UDUNITS accepts a LOT of spellings for units, e.g. "m", "meter", "meters" -- "m/s", "meters per second" , "m.s-1".

It is very hard to know what all the options are.
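A rough sense of the scale, as a Python sketch: even a handful of spellings per unit multiplies quickly once they are combined. The lists below are illustrative and far from exhaustive, and the generated strings are only candidate spellings; nothing here checks them against UDUNITS.

```python
from itertools import product

# Illustrative, non-exhaustive spellings (ASSUMPTION: not validated against UDUNITS).
LENGTH = ["m", "meter", "meters", "metre", "metres"]
TIME = ["s", "second", "seconds"]
# Ways of writing "length divided by time".
TEMPLATES = ["{l}/{t}", "{l} per {t}", "{l}.{t}-1", "{l} {t}-1"]


def velocity_spellings():
    """Build every combination of length spelling, time spelling, and template."""
    return [tpl.format(l=l, t=t) for l, t, tpl in product(LENGTH, TIME, TEMPLATES)]


spellings = velocity_spellings()
print(len(spellings))  # 5 length spellings * 3 time spellings * 4 templates = 60
print(spellings[:4])   # ['m/s', 'm per s', 'm.s-1', 'm s-1']
```

That is 60 candidates for one simple quantity, before considering prefixes (km, cm), other units (ft, knots), or mixed case.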

And there are no recommendations for a preferred spelling of units, beyond the canonical units used in standard names.

Associated pull request

TBD

Detailed Proposal

[Incomplete as of now]

Part 1: Provide a clear and complete description of the rules governing unit strings

This can be a link to another doc in UDUNITS (or elsewhere) if it exists -- in which case, this is easy, but I haven't found one.

Or we'll need to write a new doc that lays it out -- ideally with the help of the UDUNITS folks.

Part 2: Provide recommendations for spelling of units.

Postel's law: "be conservative in what you send, be liberal in what you accept".

In this context that means UDUNITS is doing the right thing in accepting a wide variety of spellings for units.

But CF is not being "conservative in what you send".

In order to do that, CF would define (too late for that) or recommend (worth it?) a subset of the UDUNITS accepted spellings.

That is, should someone use:

"m.s-1" or "meter/second" or "meters per second" or ????

Perhaps that ship has sailed long, long ago, but is it a bad idea to start nailing it down more now? maybe.
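To make the "conservative in what you send" idea concrete, here is a minimal Python sketch of what checking against a recommended subset could look like. The style rule (unit symbols with optional integer exponents, joined by "." or a single space) and the symbol list are both hypothetical, chosen only for illustration; neither CF nor UDUNITS defines them.

```python
import re

# HYPOTHETICAL "conservative" style, not defined by CF or UDUNITS:
# symbols with optional integer exponents, joined by "." or one space,
# e.g. "m.s-1" or "kg m-2 s-1".
TOKEN = re.compile(r"([A-Za-z]+)(-?[0-9]+)?")

# Tiny illustrative symbol list (ASSUMPTION); prefixes such as "km" are not handled.
ALLOWED_SYMBOLS = {"m", "s", "kg", "K", "Pa"}


def is_conservative(units):
    """Return True if every token is an allowed symbol with an optional exponent."""
    for token in re.split(r"[. ]", units):
        match = TOKEN.fullmatch(token)
        if match is None or match.group(1) not in ALLOWED_SYMBOLS:
            return False
    return True


for u in ["m.s-1", "meter/second", "meters per second", "kg m-2 s-1"]:
    print(u, "->", is_conservative(u))
```

A producer-side tool could warn on strings that fail such a check, while consumers keep accepting everything UDUNITS does.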

@ChrisBarker-NOAA ChrisBarker-NOAA added the enhancement Proposals to add new capabilities, improve existing ones in the conventions, improve style or format label Nov 5, 2024
@larsbarring
Contributor

larsbarring commented Nov 13, 2024

Chris @ChrisBarker-NOAA, I fully agree with all of this!

I have seen several requests for this (over the years and in various arenas/repos) although I do not have the specific links handy. And at one of the previous CF Workshops I tried a quick hack using an online tool for converting xml to html. It helped a bit, but not as far as one would like/need because the content of the xml files is a bit complex. So, I believe a fair bit of work would be needed to do something on this.

Moreover, the recent standard name tables do in fact include one canonical unit that is not in the UDUNITS2 xml database. I do understand, and accept the reasons for this. But at some point this is something that needs to be sorted out.

All in all, I think that there are a couple of currently unopened cans stowed away somewhere here ....

@larsbarring
Contributor

larsbarring commented Nov 13, 2024

Further thought: is this better placed as discussion?
... and see last point in this comment
