Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Design of parsing methods #428

Closed
bakkot opened this issue Mar 12, 2020 · 11 comments
Closed

Design of parsing methods #428

bakkot opened this issue Mar 12, 2020 · 11 comments

Comments

@bakkot
Copy link
Contributor

bakkot commented Mar 12, 2020

(I haven't been following this proposal closely, sorry. Sorry if this is duplicated elsewhere, and for not raising it earlier. If this has been discussed previously, I will go read that discussion if it is documented somewhere.)

I came across #415 and noticed that the various parsing methods accept inputs which contain information irrelevant to the thing they are parsing for. I think that's bad. (Among other things, it leads to non-obvious questions like the ones raised in that issue.) In my experience, trying to guess the user's intent in cases like this leads to confusing bugs.

For this reason, generally speaking, I think that any parser for a Foo should reject all input strings which contain non-Foo parts. For example, Date.from('2019-02-16T23:45') is attempting to parse a string which contains a time, which is not relevant to Date, and so Date.from should reject it (that is, throw), not just silently ignore the irrelevant part. The correct way to get a Date out of a string formatted like a DateTime should be to parse it as a DateTime and then extract a Date out of it.

@ptomato
Copy link
Collaborator

ptomato commented Mar 16, 2020

There's discussion of some adjacent things in #198 ("Dates/Times/TimeZones may occur individually or in combination') but for the most part this was already the case when I started working on this proposal, maybe someone else can point to an earlier discussion?

@ryzokuken
Copy link
Member

/cc @sffc @pdunkel

@ptomato
Copy link
Collaborator

ptomato commented Mar 19, 2020

We discussed this in today's meeting but need to do some more thinking in order to get everyone in agreement, so we'll discuss it again next week.

Broadly speaking, there are two use cases:

  • Parse an ISO 8601 string and get the exact information that is mandated by ISO 8601, which means that the whole string is relevant.
  • Parse something that might look like an ISO 8601 string but might have been generated incorrectly by some other program. In this case it's important to ignore irrelevant parts of the string.

Some ideas discussed include:

  • For the first use case, use exclusively Absolute.from(), or possibly DateTime.from() as well, and project into the other types to get the type that you need.
  • For the second use case, it seems reasonable to have the other from() methods throw if irrelevant information is present in the string, to prevent would-be users from using them for the first use case.
  • We'd have to have another API to split out parts of an ISO string, e.g. "give me the time part of this ISO string".
  • We could put the whole thing on Temporal.parse() since we are drifting away from the original proposed use case for that anyway.

@bakkot
Copy link
Contributor Author

bakkot commented Mar 19, 2020

We'd have to have another API to split out parts of an ISO string, e.g. "give me the time part of this ISO string".

What's the use case for this beyond just using Absolute.from() or DateTime.from() and projecting into the other types?

@ptomato
Copy link
Collaborator

ptomato commented Mar 19, 2020

If you have an ISO string where the date is invalid, for example, 9999-99-99T13:45:00 and you want to pass it to Time.from(), but Time.from() throws if irrelevant parts are present, then you need to split out the time part of the string somehow. Using Absolute.from() or DateTime.from() for this won't work.

@bakkot
Copy link
Contributor Author

bakkot commented Mar 19, 2020

Ah, OK. Is there a particular reason to support that use case?

@ptomato
Copy link
Collaborator

ptomato commented Mar 19, 2020

We discussed this morning that it would be important when dealing with data supplied from another program.

@ryzokuken
Copy link
Member

@bakkot I guess in cases where you just don't care about the Date component and need to parse the Time component, it is useful to have a good way to ignore validation or balancing in Absolute.from.

I guess there's no major harm in enabling both cases? Of course the default (and preferred) way of doing things would be to just run Absolute.from and get whatever you want from there.

@ptomato
Copy link
Collaborator

ptomato commented Apr 8, 2020

To keep the discussion on this moving, I've written up some possibilities in #503. (Not sure it needs to be merged, but this is just to have something to discuss.)

As for my personal opinion: I prefer proposal 1, I would not oppose proposal 2 but think it's less convenient, and I think proposal 3 would be a mistake. I'm basing my opinon on the assumption that use case 1 is far more common than use case 2, and further on my strong preference that the most obvious way to parse should not yield a valid Temporal object out of an invalid ISO 8601 string. If my understanding of either of those factors were to change, my preference would probably change as well.

@ptomato
Copy link
Collaborator

ptomato commented Apr 16, 2020

I added a table with pros and cons to the document.

@ptomato
Copy link
Collaborator

ptomato commented May 20, 2020

Closing this because the design is settled on; we decided in the meeting of 23 April to go with option 1.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants