Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Algorithm for balance-constrain #857

Closed
sffc opened this issue Aug 28, 2020 · 14 comments · Fixed by #1071
Closed

Algorithm for balance-constrain #857

sffc opened this issue Aug 28, 2020 · 14 comments · Fixed by #1071
Assignees
Labels
documentation Additions to documentation spec-text Specification text involved

Comments

@sffc
Copy link
Collaborator

sffc commented Aug 28, 2020

I was reading balancing.md as well as the logic in ecmascript.mjs. I think there are some open questions in the algorithm, in particular, how to resolve mixed-sign durations (either when mixed signs are given in .from() or a the result of .plus()/.minus()). The doc and the polyfill do not properly support negative durations. I assume this change is being tracked by the negative duration issues, so I won't discuss it here.

I think balance is easier to specify, because the answer is unambiguous. Given a mixed-sign duration as input, convert all fields to nanoseconds, add them together, and then bubble the results up through all the fields.

However, with balanceConstrain (or whatever we end up calling it), the contract is that we perform the "minimum necessary balancing" to resolve the mixed signs. This means that there are potentially ambiguous results. Here are some examples:

  1. PT5H120S minus PT1M (intermediate form: PT5H-1M120S). Is the answer:
    1. PT5H60S because we steal 60 seconds from the smaller unit?
    2. PT4H59M120S because we steal 1 minute from the larger unit?
  2. PT60S minus PT2M (intermediate form: PT-2M60S). Is the answer:
    1. -PT1M because we resolve seconds into minutes?
    2. -PT60S because we resolve the minutes into seconds?
  3. PT1H60S minus PT122M (intermediate form PT1H-122M60S). Is the answer:
    1. -PT61M (minutes only)?
    2. -PT3660S (seconds only)??
    3. -PT1H1M (hours and minutes)??
    4. -PT60M60S (minutes and seconds)??
    5. -PT1H60S (hours and seconds)???
  4. PT15S minus PT2M (intermediate form PT-2M15S). Is the answer:
    1. -PT1M45S (minutes and seconds)?
    2. -PT105S (seconds only)?

I looked for an algorithm to handle this case in ISO-8601-2, but could not find one (see #819).

I think it would be useful to specify a step-by-step algorithm in balancing.md. Consider thinking about whether we start by balance-constraining "up first" (steal from bigger units) or "down first" (steal from smaller units). Also consider whether we want/need to special-case balance-constrain when the overall sign of the duration changes (PT1H-75M), versus when the overall sign will end up staying the same (PT1H-15M).

@justingrant @ptomato

@ptomato

This comment has been minimized.

@justingrant
Copy link
Collaborator

justingrant commented Aug 29, 2020

Here's one other problematic problematic case: (copying from the open issues section of #856)

  • How to handle the case where a duration is currently balanced, and then it's rounded away from zero? For example, PT2H59M55S rounded to the nearest minute without balancing would be PT2H60M. It seems unexpected for round to cause a balanced duration to become unbalanced. Should the default for round be to balance, so that users have to opt in to the confusing balance-constrain behavior instead of opting in? Or should the balance-constrain algorithm try to always keep balanced durations balanced in the result?

@sffc sffc mentioned this issue Sep 5, 2020
@ptomato ptomato added this to the Stable API milestone Sep 11, 2020
@ronaldtse
Copy link

ronaldtse commented Sep 15, 2020

@sffc sorry for the late reply!

In ISO 8601-2 the "composite duration form" was meant to delay the resolution of date time duration until application because some time scale units require contextual information.

Specifically, the following time scale unit conversions are contextual (the "conversion boundaries"):

  1. year <=> day (365, 366 days)
  2. month <=> day (28, 29, 30, 31 days)
  3. minute <=> second (59, 60, 61 seconds)

The key point is that there is always a minimally factored form if you really wish to. As with arithmetic factorization, there is a limit on how much we can factor down to. These limits are the conversion boundaries.

If we do not care about being particularly accurate, one could simplify to omit some boundaries to adopt "nominal duration", such as ignoring the leap second (i.e. ignore rule 3) and treat 1 minute = 60 seconds.

All other conversions can occur interchangeably. Therefore from the 8601 perspective:

  1. PT5H120S minus PT1M (intermediate form: PT5H-1M120S). Is the answer:

    1. PT5H60S because we steal 60 seconds from the smaller unit?

Due to rule 3, we cannot convert between M and S until there is contextual information on whether the actual time span includes the last minute of the year (which is when the negative or positive leap second can apply). In the 8601 perspective this should remain in composite form, i.e. intermediate form: PT5H-1M120S is the most we can factor down to.

  1. PT4H59M120S because we steal 1 minute from the larger unit?

This is possible because the conversion between M <> H is deterministic, i.e. 1H = 60M always stays true (even when the last minute may contain a leap second -- this is the definition of minute by BIPM).

  1. PT60S minus PT2M (intermediate form: PT-2M60S). Is the answer:

    1. -PT1M because we resolve seconds into minutes?
    2. -PT60S because we resolve the minutes into seconds?

Since S <> M crosses conversion boundary we can't factor down the intermediate form (in the 8601 understanding).

  1. PT1H60S minus PT122M (intermediate form PT1H-122M60S). Is the answer:

Since H <> M is deterministic, one can freely convert H and M. This means that 1H60S = 60M60S - 122M => -62M60S.

  1. -PT61M (minutes only)?
  2. -PT3660S (seconds only)??
  3. -PT1H1M (hours and minutes)??
    No to 1-3 because they cross the conversion boundary.
  1. -PT60M60S (minutes and seconds)??
    Maybe you meant PT-62M60S? That would work.
  1. -PT1H60S (hours and seconds)???
    Maybe you meant PT-1H-2M60S? Works too.
  1. PT15S minus PT2M (intermediate form PT-2M15S). Is the answer:

Again this crosses the M <> S boundary, so the intermediate form is the best we can do.

  1. -PT1M45S (minutes and seconds)?
  2. -PT105S (seconds only)?
    (cannot be factored down)

@ptomato ptomato added documentation Additions to documentation spec-text Specification text involved non-prod-polyfill THIS POLYFILL IS NOT FOR PRODUCTION USE! and removed documentation Additions to documentation non-prod-polyfill THIS POLYFILL IS NOT FOR PRODUCTION USE! labels Sep 18, 2020
@ptomato ptomato added the documentation Additions to documentation label Oct 5, 2020
@ptomato ptomato self-assigned this Oct 28, 2020
@ptomato
Copy link
Collaborator

ptomato commented Oct 28, 2020

Note that in Temporal we take the POSIX route of ignoring leap seconds, so we do indeed omit the minute/second boundary. On the other hand, we add other boundaries because durations can be considered relative to dates in particular time zones with DST changes, or dates in other calendar systems than ISO 8601:

  • year/month is not allowed without the context of a date and a calendar system
  • year/week and month/week are never allowed
  • week/day is not allowed without the context of a date and a calendar system
  • day/hour normally assumes 24 hours in a day, but in the context of a date-time with time zone may be longer or shorter (this isn't implemented yet)

Currently the following results come out of the Temporal polyfill and Temporal spec.

Note that the current algorithm is to start by considering the smallest unit, and steal from the next-largest unit ("up first") if its sign is different from the sign of the largest unit in the duration. This does not work if the overall sign flips due to balancing, so that's why case 3 is buggy. So we do need to special-case this.

(That means that almost everything in my comment #857 (comment) was wrong; apparently I didn't think hard enough about it, which is not surprising; writing this comment has taken me several hours. The one thing I do think is correct from that comment is that I think this is too detailed for the docs, and belongs in the spec.)

Case 1:

Temporal.Duration.from({ hours: 5, seconds: 120 }).subtract({ minutes: 1 })  // => PT4H59M120S
Temporal.Duration.from({ hours: 5, seconds: 120 }).subtract({ minutes: 1 }, { overflow: 'balance' })  // => PT5H1M

This seems correct given the up-first algorithm.

Case 2:

Temporal.Duration.from({ seconds: 60 }).subtract({ minutes: 2 })  // => -PT1M

Ditto.

Case 3:

Temporal.Duration.from({ hours: 1, seconds: 60 }).subtract({ minutes: 122 })

This throws, which I believe is definitely a bug, see above.

One way to resolve this bug that still seems consistent with the up-first algorithm, would be to check if the sign of the largest unit was flipped as a result of balancing, and perform the algorithm a second time if so. That would give -PT1H1M as the preferred answer to this case.

It would have to be guaranteed that performing the algorithm again would not keep flipping the sign, leading to an infinite loop, but I believe that is the case; after one pass of the algorithm, no units should have overflows, so I don't believe it's possible for the sign to flip again.

Case 4:

Temporal.Duration.from({ seconds: 15 }).subtract({ minutes: 2 })  // => -PT1M45S

This one again seems correct given the up-first algorithm we have.

Justin's rounding case:

Temporal.Duration.from({ hours: 2, minutes: 59, seconds: 55 }).round({ smallestUnit: 'minutes' })  // => PT3H

I think we did take care in the API design of round() to make sure that balanced durations don't become unbalanced due to a round operation, by making the largestUnit option default to the largest nonzero unit. So I think this case is sufficiently addressed.

@sffc
Copy link
Collaborator Author

sffc commented Oct 28, 2020

Another possibility for Case 3: what if we were to adopt #980 as the algorithm to resolve balance-constrain?

Input Intermediate #980 Bal-Constrain Full Balance
PT1H60S - PT122M PT1H-122M60S -PT1H1M -PT1H1M
PT180M - PT60M PT120M PT120M PT2H
PT75M - PT120S PT75M-120S PT73M PT1H13M
PT1M60S - PT1M PT60S TBD (PT60S or PT1M) PT1M

The algorithm for #980-style Balance Constrain is simple: use the Balance algorithm, but don't bubble up beyond the highest unit in the input.

@ptomato
Copy link
Collaborator

ptomato commented Oct 28, 2020

That seems to give identical results to the algorithm I proposed, for at least all the cases that we have in our test suite (including the extra ones I wrote for this issue), and is simpler, so I'm all for it. (Provided that we only apply it when there are mixed signs to be resolved; if you apply it indiscriminately, then it does change the results.)

@justingrant
Copy link
Collaborator

justingrant commented Oct 28, 2020

The algorithm for #980-style Balance Constrain is simple: use the Balance algorithm, but don't bubble up beyond the highest unit in the input.

I thought we'd already agreed on this, per #856. Specifically, from #856 (comment):

  1. Remove the overflow option from all duration methods except from and with. Always balance, because when you're manipulating a duration (as opposed to just setting fields) then it's unintuitive to start with a balanced duration and end up with an unbalanced one. Also, rounding forces at least some balancing anyways (e.g. 23.6 hours=>1 day) and it will be hard for users to understand why some balancing is performed but other balancing is not.

In other words, there is no such thing as "balance constrain" anymore in Temporal. There's just "no balancing" (Duration's from and with only) which makes no changes whatsoever and "balancing" (every other place in Temporal that constructs a new Duration) which balances up to the largest unit defined by the largestUnit or the default largest unit. But there's no separate balance-constrain algorithm.

Does this match what you guys are thinking?

More details about expected behavior are in OP (edited) of #856. Here's some relevant sections:

  • 2.4 Duration's overflow: 'constrain' option in plus and minus is currently used to control whether the output is fully balanced or minimally balanced. This behavior has a few problems, especially after rounding is in the mix:
    • 2.4.1 Rounding forces at least some balancing anyways (e.g. 23.6 hours=>1 day) and it will be hard for users to understand why balancing due to rounding is performed but balancing down from larger units is not performed.
    • 2.4.2 It's not clear that constrain matches the most common use case for unbalanced durations which is where only the largest unit is unbalanced, e.g. 45 days, 3 hours, and 10 minutes, or @jasonwilliams's case where the BBC measures all program lengths in seconds. It's not clear that "unbalanced in the middle" durations like "45 days and 180 minutes" are a real use case that we need to support in the results of plus, minus, and round.
    • 2.4.3 Even before adding rounding, the "balance constrain" algorithm was one of the hardest concepts in all of Temporal to understand and explain.
  • 2.5 For these reasons, there will be no overflow options on plus, minus, or round. Instead, we'll optimize for the "unbalanced largest unit" use case by automatically setting the largestUnit default to prevent durations from growing into larger units by default. See (3.5.1) below for more details.
    • 2.5.1 Note that the overflow option was already in plus and minus before this proposal. It will be removed from those methods.

7. We'll add an 'auto' value for largestUnit in Duration and non-Duration difference methods where it's used. 'auto', like undefined, will select the default unit. EDIT 2020-10-06: added this section

  • 7.1 On Duration methods, 'auto' means "the largest nonzero unit in the input(s)"

  • 7.2 On difference of non-Duration types 'auto' means to pick the default unit of the type:

    • 7.2.1 Instant: 'seconds'
    • 7.2.2 Date / DateTime - 'days'
    • 7.2.3 ZonedDateTime - 'hours'
    • 7.2.4 Time - 'hours'
    • 7.2.5 YearMonth - 'years'
  • 7.3 If the default chosen above is smaller than the user-provided smallestUnit then set largestUnit = smallestUnit.

EDIT: balance-constrain is already removed from balancing.md in the docs. Here's what the docs say now:

In addition to round() as described above, add() and subtract() also balance their output into either a fully-balanced or a top-heavy duration depending on the largestUnit option.

@sffc
Copy link
Collaborator Author

sffc commented Oct 28, 2020

Provided that we only apply it when there are mixed signs to be resolved; if you apply it indiscriminately, then it does change the results.

If we use two different algorithms, it might produce unexpected results. For example, PT2H90S + PT1M and PT2H90S - PT1M should probably use the same algorithm, even though one has a mixed sign and the other does not.

@justingrant
Copy link
Collaborator

Provided that we only apply it when there are mixed signs to be resolved; if you apply it indiscriminately, then it does change the results.

If we use two different algorithms, it might produce unexpected results. For example, PT2H90S + PT1M and PT2H90S - PT1M should probably use the same algorithm, even though one has a mixed sign and the other does not.

Why would we use different algorithms? Why not the simpler "always balance up to largest unit"?

@ptomato
Copy link
Collaborator

ptomato commented Oct 28, 2020

I suppose that makes this whole issue out of date, then. I had left off #856 until ZonedDateTime lands and hadn't realized that it obsoleted this one.

@justingrant
Copy link
Collaborator

Yep, agreed. So sorry @ptomato that you ended up spending a bunch of time on this. I was offline for a few hours or I wouId have responded sooner that this issue is probably moot.

It does make me wonder if there are other issues like this one that have been superseded by later decisions. I'll try to take a look through open issues to see if there are any obvious others.

ptomato added a commit that referenced this issue Oct 28, 2020
The current algorithm produces correct results for all the test cases we
have, but it will produce wrong results when we fix the bug where addition
or subtraction of a duration flips the overall sign of the receiving
duration (in cases like, 1 hour, -61 minutes; we cannot test these
directly.)

See: #857
ptomato added a commit that referenced this issue Oct 28, 2020
Per #980 it is a recognized operation to get the correct largestUnit value
corresponding with the largest non-zero unit in the duration. Generalize
this operation in the spec, in order to use in #857.
ptomato added a commit that referenced this issue Oct 28, 2020
Removes the 'overflow' option from Duration.add() and Duration.subtract().
This also removes the concept of "balance-constrain." Durations are
balanced up to the largest nonzero unit in the input duration, and a
different largest unit can be selected with the largestUnit option.

The options object in Duration.add() and Duration.subtract() remains for
the time being, even though it accepts no valid options, because rounding
options will be added in #856.

See: #856
Closes: #857
@ptomato
Copy link
Collaborator

ptomato commented Oct 28, 2020

Silver lining is that it wasn't all a waste; the code to implement the proposed balance-constrain algorithm was quite similar to the code to always balance.

@justingrant
Copy link
Collaborator

OK cool. BTW I went through all 93 open Temporal issues and didn't find any others that were obviously superseded by recent decisions. There were a few feedback issues where we'd recently improved things and I posed code samples in those.

@ptomato
Copy link
Collaborator

ptomato commented Oct 29, 2020

Thanks!

Once we deliver the proposal to reviewers I intend to spend time updating all the feedback issues.

Ms2ger pushed a commit that referenced this issue Oct 29, 2020
The current algorithm produces correct results for all the test cases we
have, but it will produce wrong results when we fix the bug where addition
or subtraction of a duration flips the overall sign of the receiving
duration (in cases like, 1 hour, -61 minutes; we cannot test these
directly.)

See: #857
Ms2ger pushed a commit that referenced this issue Oct 29, 2020
Per #980 it is a recognized operation to get the correct largestUnit value
corresponding with the largest non-zero unit in the duration. Generalize
this operation in the spec, in order to use in #857.
Ms2ger pushed a commit that referenced this issue Oct 29, 2020
Removes the 'overflow' option from Duration.add() and Duration.subtract().
This also removes the concept of "balance-constrain." Durations are
balanced up to the largest nonzero unit in the input duration, and a
different largest unit can be selected with the largestUnit option.

The options object in Duration.add() and Duration.subtract() remains for
the time being, even though it accepts no valid options, because rounding
options will be added in #856.

See: #856
Closes: #857
Ms2ger pushed a commit that referenced this issue Oct 29, 2020
The current algorithm produces correct results for all the test cases we
have, but it will produce wrong results when we fix the bug where addition
or subtraction of a duration flips the overall sign of the receiving
duration (in cases like, 1 hour, -61 minutes; we cannot test these
directly.)

See: #857
Ms2ger pushed a commit that referenced this issue Oct 29, 2020
Per #980 it is a recognized operation to get the correct largestUnit value
corresponding with the largest non-zero unit in the duration. Generalize
this operation in the spec, in order to use in #857.
Ms2ger pushed a commit that referenced this issue Oct 29, 2020
Removes the 'overflow' option from Duration.add() and Duration.subtract().
This also removes the concept of "balance-constrain." Durations are
balanced up to the largest nonzero unit in the input duration, and a
different largest unit can be selected with the largestUnit option.

The options object in Duration.add() and Duration.subtract() remains for
the time being, even though it accepts no valid options, because rounding
options will be added in #856.

See: #856
Closes: #857
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Additions to documentation spec-text Specification text involved
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants