Initial work to add a Time type to Base.Dates #12274

Merged
merged 7 commits into from
Jan 24, 2017

Conversation

quinnj
Member

@quinnj quinnj commented Jul 23, 2015

Addresses #12140.

TODO:

  • Figure out the best way to "display" (i.e. show) micro/nanoseconds
  • Address any other needed functionality: @aviks @tbreloff @ScottPJones, anybody else interested
  • Add lots of tests

@quinnj
Member Author

quinnj commented Jul 23, 2015

[CI cancelled because it's not worth running at this point]

(+)(x::Time,y::Dates.TimePeriod) = return Time(NS(value(x)+Dates.tons(y)))
(-)(x::Time,y::Dates.TimePeriod) = return Time(NS(value(x)-Dates.tons(y)))
(+)(y::Dates.TimePeriod,x::Time) = x + y
(-)(y::Dates.TimePeriod,x::Time) = x - y
Member

Do you intend 2 hours - 12:30 to be 10:30?

@tbreloff

Jacob: thanks a ton for this. I certainly think you're a big chunk of the way there. I have a few questions/comments:

  • Is there a reason the Time type isn't a subtype of TimePeriod with the same definition as Nanosecond? If this type is intended to fill the same role as my TimeOfDay type, then I would think this is just a specialized TimePeriod that happens to convert numerically the same way as a Nanosecond.
  • Can you define ranges of Time types?
  • I'm undecided on the checks in the constructor... as @ivarne has pointed out, if you have them in the constructor, they should be in all mutating operations as well, and the efficiency of that worries me. Maybe there could be a specialized constructor that does bounds checking for people that need it?
  • Should line 7 also be named tons instead of NS? Can tons be a different name? I can't stop reading it as one word.

Regarding printing/parsing... I would personally like to see methods like:

function TimeOfDay(str::String)
  tmp = split(str,":")
  minutes = 0
  seconds = 0
  nanos = 0

  hours = parse(Int, tmp[1])
  if length(tmp) > 1
    minutes = parse(Int, tmp[2])
  end

  if length(tmp) > 2
    tmp2 = split(tmp[3], ".")
    seconds = parse(Int, tmp2[1])
    if length(tmp2) > 1
      nanos = parse(Int, rpad(tmp2[2],9,"0")[1:9])
    end
  end

  TimeOfDay(hours * nanosInOneHour + minutes * nanosInOneMinute + seconds * nanosInOneSecond + nanos)
end

function Base.string(timeOfDay::TimeOfDay)
  secsSinceMidnight, nanos = divrem(timeOfDay.nanosSinceMidnight, nanosInOneSecond)
  hours, hourrem = divrem(secsSinceMidnight, secondsInOneHour)
  minutes, seconds = divrem(hourrem, secondsInOneMinute)
  microseconds = div(nanos, millisInOneSecond)
  string(lpad(hours,2,"0"), ":", lpad(minutes,2,"0"), ":", lpad(seconds,2,"0"), ".", lpad(microseconds,6,"0"))
end

Your string/print is very similar, but I don't think I saw a string-based constructor. Of course, once this becomes part of Base, we could specialize some formatting options in the Formatting.jl package.
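
A self-contained version of the parse/print pair above might look like the following. This is only a sketch in current Julia syntax: `TimeOfDay` here is a hypothetical stand-in struct, and the `NANOS_PER_*` constants are assumed names that replace the package constants the snippet above relies on. Unlike the snippet above, it prints all nine fractional digits rather than six.

```julia
# Hypothetical stand-in for the TimeOfDay type discussed above.
struct TimeOfDay
    nanos::Int64   # nanoseconds since midnight
end

# Assumed constant names; the original snippet uses its own package constants.
const NANOS_PER_SECOND = Int64(1_000_000_000)
const NANOS_PER_MINUTE = 60 * NANOS_PER_SECOND
const NANOS_PER_HOUR   = 60 * NANOS_PER_MINUTE

# Parse "HH", "HH:MM", "HH:MM:SS" or "HH:MM:SS.fffffffff".
function TimeOfDay(str::AbstractString)
    parts = split(str, ':')
    hours = parse(Int64, parts[1])
    minutes = length(parts) > 1 ? parse(Int64, parts[2]) : Int64(0)
    seconds = Int64(0)
    nanos = Int64(0)
    if length(parts) > 2
        secparts = split(parts[3], '.')
        seconds = parse(Int64, secparts[1])
        if length(secparts) > 1
            # Pad or truncate the fraction to exactly nine digits.
            nanos = parse(Int64, first(rpad(secparts[2], 9, '0'), 9))
        end
    end
    return TimeOfDay(hours * NANOS_PER_HOUR + minutes * NANOS_PER_MINUTE +
                     seconds * NANOS_PER_SECOND + nanos)
end

# Print as HH:MM:SS.nnnnnnnnn.
function Base.string(t::TimeOfDay)
    secs, nanos = divrem(t.nanos, NANOS_PER_SECOND)
    hours, rest = divrem(secs, 3600)
    minutes, seconds = divrem(rest, 60)
    return string(lpad(hours, 2, '0'), ':', lpad(minutes, 2, '0'), ':',
                  lpad(seconds, 2, '0'), '.', lpad(nanos, 9, '0'))
end
```

No range checking is done here, which matches the open question above about whether bounds checks belong in the constructor.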

Thanks again for the effort!

@JeffreySarnoff
Contributor

postfixes for subsecond units

Unit of Time    Postfix
seconds         s
milliseconds    ms
microseconds    us
nanoseconds     ns
picoseconds     ps
femtoseconds    fs
attoseconds     as
zeptoseconds    zs
yoctoseconds    ys

100 seconds:      100 s   ( "100s"  )
100 microseconds: 100 us  ( "100us" )
100 nanoseconds:  100 ns  ( "100ns" )

@JeffreySarnoff
Contributor

As time of day is frequently 'flattened' into seconds or subseconds of a day, and the preferred subsecond resolution varies with the application or the computational environment, one approach to processing a time of day is to allow parametric content, or to offer a defaulted argument, or to let the content of the string or of the tuple (or, better, a prioritized mix of, say, two of those) select the working subsecond resolution.

@quinnj quinnj added the dates Dates, times, and the Dates stdlib module label Aug 18, 2015
@ViralBShah ViralBShah added this to the 0.5.0 milestone Nov 24, 2015
@ViralBShah
Member

Bump. Would be nice to get this one done.

@JeffreySarnoff
Contributor

Just a few notes and a question:

Time must use an integer underlying type, floats don't work well
because the spacing between adjacent values is not constant.

A week is the largest amount of time in common time units that is
a constant number of seconds (604_800s). If we want to represent
weeks in seconds then Int64 allows picosecond (1e-12) resolution
(1/10 picosecond if that helps). If we let go of weeks as a
quantity of seconds, then days become the largest amount of time
in common time units (86_400s). Then, Int64 allows picosecond
(1/100 picosecond if that helps) resolution. To cover smaller
time intervals requires Int128 -- which covers, for example
1/10 attosecond through 10 exaseconds. Int128 allows encoding
the age of the universe in attoseconds.
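
The range arithmetic in the paragraph above can be checked with a short sketch. The helper name `finest_decimal_resolution` and the two constants are hypothetical, introduced only for this illustration; the function returns the largest exponent `e` such that a span of the given length, counted in units of 10^-e seconds, still fits in the integer type.

```julia
# Hypothetical constants for the spans mentioned above.
const SECONDS_PER_DAY  = 86_400
const SECONDS_PER_WEEK = 604_800

# Largest e such that span_seconds * 10^e <= typemax(T).
# BigInt is used so the check itself cannot overflow.
function finest_decimal_resolution(::Type{T}, span_seconds::Integer) where {T<:Integer}
    limit = big(typemax(T))
    ticks = big(span_seconds)   # the span in whole seconds
    exponent = 0
    while ticks * 10 <= limit
        ticks *= 10
        exponent += 1
    end
    return exponent
end
```

With these definitions, a week in `Int64` gives 13 (a tenth of a picosecond), a day in `Int64` gives 14 (a hundredth of a picosecond), and 10 exaseconds in `Int128` gives 19 (a tenth of an attosecond), which agrees with the figures quoted above.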

Given the state of the art in ultra-high frequency physics, where
femtosecond timing is in use and attosecond timing is useful, imo
limiting our internal resolution to picoseconds would be an error
because it means having a second high-resolution timing library.
Int128 runs quickly; so performance should not be a consideration.

What is of greater import, keeping the storage size to 64 bits
or having a single sub-second time type that covers all spans?

@tkelman
Contributor

tkelman commented Nov 24, 2015

Int128 is not very well-supported on all platforms or compilers.

@JeffreySarnoff
Contributor

that is a shame, and a deciding factor.


@JeffreySarnoff
Contributor

I am not the guy to fold Time in with Date; however if something along
these lines would be helpful, I could make it. And if not, that's perfectly
ok:

#=
1e18 yoctoseconds per microsecond
86_400*1e6 microseconds/day
+/- 200_000 years in microseconds
microsecond #0 is at POSIX time 0 (1970-01-01 00:00:00 GMT)

To simplify this sketch by avoiding a great deal
of bounds checking, we adopt the convention that
times specified in, and times requested in
sub-microsecond units have no microsecond component (x.u==0).

=#

immutable RelativeTime
    u::Int64 # MicroSeconds, 0..±1e18-1
    y::Int64 # YoctoSeconds, 0..1e18-1, y >= 0
end

immutable GivenTime
    u::Int64 # MicroSeconds, 0..±1e18-1, 0 is at POSIX time 0
    y::Int64 # YoctoSeconds, 0..1e18-1, y >= 0
end

Times = Union{RelativeTime, GivenTime}
N64 = Union{Int64, Float64}
idiv{T<:N64,U<:N64}(p::T,q::U) = round(Int64, div(p,q))

# accessors: sub-microsecond units read t.y, coarser units read t.u
Yoctoseconds{T<:Times}(t::T) = t.y
Zeptoseconds{T<:Times}(t::T) = div(t.y, Int64(1e3))
Attoseconds{T<:Times}(t::T) = div(t.y, Int64(1e6))
Femtoseconds{T<:Times}(t::T) = div(t.y, Int64(1e9))
Picoseconds{T<:Times}(t::T) = div(t.y, Int64(1e12))
Nanoseconds{T<:Times}(t::T) = div(t.y, Int64(1e15))
Microseconds{T<:Times}(t::T) = t.u
Milliseconds{T<:Times}(t::T) = div(t.u, Int64(1e3))
Seconds{T<:Times}(t::T) = div(t.u, Int64(1e6))
Minutes{T<:Times}(t::T) = div(t.u, 60*Int64(1e6))
Hours{T<:Times}(t::T) = div(t.u, 3600*Int64(1e6))
Days{T<:Times}(t::T) = div(t.u, 86400*Int64(1e6))
Weeks{T<:Times}(t::T) = div(t.u, 604800*Int64(1e6))

function dSeconds(t::Float64)
    isneg, abst = signbit(t), abs(t)
    ...fldmod..
end
dSeconds{T<:N64}(t::T) = RelativeTime(idiv(t,Int64(1e6)),0)
dMicroseconds{T<:N64}(t::T) = RelativeTime(t,0)
dYoctoseconds{T<:N64}(t::T) = RelativeTime(0,t)

nSeconds{T<:N64}(t::T) = GivenTime(idiv(t,Int64(1e6)),0)
nMicroseconds{T<:N64}(t::T) = GivenTime(t,0)
nYoctoseconds{T<:N64}(t::T) = GivenTime(0,t)


@tkelman
Contributor

tkelman commented Apr 21, 2016

What's the plan here? This has been pretty quiet and it probably isn't 0.5 material unless it gets finished up very soon.

@tkelman tkelman removed this from the 0.5.0 milestone Apr 21, 2016
@Jeffrey-Sarnoff

Jeffrey-Sarnoff commented Apr 21, 2016

I took the non-response as preference for another approach -- I still am not the guy to merge Time into Date (I have not used Date and defer to @quinnj on its construction and what best works to interpose Time within that extensive software subfacility). OTOH, if we want to -- I'd spend the weekend doing some coding to have a Type of Time (and Type of Span of that Type of Time) that understands weeks, days, hours, minutes, seconds, milliseconds, microseconds, nanoseconds. I'd give it correct arithmetic.
If someone else would write how to use that with Date stuff.

@Jeffrey-Sarnoff

@tkelman Is Int128 better supported with v0.5?

@iamed2
Contributor

iamed2 commented Apr 21, 2016

I think there are a couple of different ideas here:

  1. A type representing time of day (A "Time" Type #12140)
  2. New sub-millisecond Period subtypes

Those might each be better as separate PRs. It's not really clear which one each interested party is after. 1) is probably easy to implement using the work here and in the issue. 2) might need more consideration.

@quinnj
Member Author

quinnj commented Apr 21, 2016

I think we should focus on use-case 1). The second case is much better served by going the SIUnits.jl route, where you already get a parametric Second type defined for all the granularities.

I'm happy to push this through in the next few days if people really want it. I also think this would fit well in a package; are there strong feelings one way or the other at this point? (package vs. in Base)

@Jeffrey-Sarnoff

Hi Jacob,
For myself, no preference. For Julia, I prefer Time be with Date. Perhaps going about it the way you did with Date makes sense (first a package, some shakedown and requests/remodels, then in with Base) [why mess with a winning approach].

@tkelman
Contributor

tkelman commented Apr 22, 2016

I think the performance issues that are just now being identified by Simon and others in the Dates code are indicative of just how long it can take to get these things right, and we probably shouldn't rush new features if you can see a good way to separate concerns with Dates in Base and Time types in a package. Some of the Dates internals might also be up for some refactoring to address those performance issues, which seem to have more people working on them right now with a higher priority, so I'd focus there at the moment.

@tkelman
Contributor

tkelman commented Apr 22, 2016

@Jeffrey-Sarnoff Int128 probably works a bit better with the more recent LLVM versions used on master, but it could always use more testing. If we ever look more seriously at supporting a Visual-Studio built version of Julia, there may still be gaps in Int128 support there.

@quinnj quinnj self-assigned this May 9, 2016
@hgeorgako

hgeorgako commented May 13, 2016

Don't know if this is the right issue to comment on this, but from a quant/trader's perspective, having the ability to merge high frequency tick data with nanosecond precision time-stamps is becoming a big deal in the industry. R's xts library does not currently support this. I believe pandas does. I for one would love to see this functionality enabled sooner rather than later.

@quinnj
Member Author

quinnj commented May 13, 2016

Ok, made some progress here and this is actually getting close in terms of the rest of the infrastructure we have in Dates. (plus lots of tests, yay!)

Would definitely love some review/feedback.

One of the biggest design questions I ran into was the following:

  • Do we treat Time more like Date or DateTime or more like a special case of a CompoundPeriod type? (since Time is essentially a more efficient representation of a CompoundPeriod composed of Hours, Minutes, Seconds, etc.)
  • This most readily affects arithmetic: do we allow x::Time + y::Time? In the TimeType respect, we don't allow Date + Date (it doesn't really make sense semantically), but it certainly feels a little more natural with something like Time (if you view it as a special-case CompoundPeriod).
  • It gets a little weird though, since Time has somewhat of a finite range (~2.5 days using nanosecond precision); currently things would just keep wrapping around (i.e. 23 + 1 hour == Hour(0)), so it doesn't necessarily follow CompoundPeriod behavior

I think I'm leaning towards treating Time as a true TimeType, which would disallow x::Time + y::Time and x::DateTime + y::Time. You'd really want to view Time then as a true marker of a specific moment in a day between midnight and 11:59:59.

@omus
Member

omus commented May 13, 2016

I think Time should be treated more like a Date but should also be able to be explicitly converted to a CompoundPeriod via CompoundPeriod(::Time). I think a Time type should be bounded between 00:00:00 and 23:59:59.

Are we going to support x::Time + y::Period? Would Time("23:00:00") + Hour(2) roll over to be Time("01:00:00")?
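
For reference, a small sketch of how the rollover question plays out, assuming the `Dates` standard library as shipped with Julia 1.x (this thread predates it, so treat the behavior shown as an assumption about current releases rather than as something decided in this comment). To my understanding, `Time` plus a `TimePeriod` is computed modulo 24 hours:

```julia
using Dates

late = Time(23, 0, 0)

# Adding a TimePeriod wraps around midnight instead of erroring.
rolled = late + Hour(2)

# Subtracting past midnight wraps the other way.
before_midnight = Time(0) - Nanosecond(1)

println(rolled)
println(before_midnight)
```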

@omus
Member

omus commented May 13, 2016

Something I didn't see in the PR is x::Date + y::Time = z::DateTime. Maybe that isn't possible due to the extra precision in Time.

@tkelman tkelman removed this from the 0.6.0 milestone Jan 19, 2017
@quinnj
Member Author

quinnj commented Jan 19, 2017

I don't think there's anything left to do here though; I was just waiting on the datetime-parsing PR to merge before rebasing and merging this. I can rebase and merge in the next few hours.

@@ -184,6 +191,31 @@ function DateTime(func::Function, y, m, d, h, mi, s; step::Period=Millisecond(1)
return adjust(DateFunction(func, negate, DateTime(y)), DateTime(y, m, d, h, mi, s), step, limit)
end

"""
Time(f::Function, h[, mi, s, ms, us]; step=Second(1), negate=false, limit=10000) -> Time
Contributor

do you want these docstrings to be in the stdlib docs or repl-only?

Contributor

the default step isn't always one second in the implementations below

Member Author

Should the docstrings be in stdlib or repl-only? I don't have a preference either way, but if there's a convention we should follow, happy to do it. Yeah, I would hope it would be obvious that if you're doing sub-Second Time ranges, it wouldn't default to a Second (since that would potentially lose precision in range steps).

Contributor

If the Dates section of the stdlib docs includes other docstrings for non-exported types, then this should be added.

Docstring saying one thing and code doing another isn't obvious.

Contributor

If the Dates section of the stdlib docs includes other docstrings for non-exported types, then this should be added.

So does it?

Member Author

I added Time docs to the stdlib.

(+)(x::Time, y::TimePeriod) = return Time(Nanosecond(value(x) + tons(y)))
(-)(x::Time, y::TimePeriod) = return Time(Nanosecond(value(x) - tons(y)))
(+)(y::Period, x::TimeType) = x + y
(-)(y::Period, x::TimeType) = x - y
Contributor

are these magnitudes then?

Member Author

Not sure I understand your comment. These are generic fallbacks that will end up dispatching to the definitions above. This is just for the switched order of operators.

Contributor

It looks wrong that y - x = x - y

Member Author

Ah, got it. Yes, these are magnitudes. Order doesn't matter.

Contributor

worth a comment then, in case anyone else gets confused looking at this

Member Author

Because we already do

julia> Dates.Day(1) - Date(2016, 1, 1)
2015-12-31

Contributor

wat? why allow that?

Member Author

I don't remember if it was an explicit decision to allow or not. We can open another issue to remove.

Member

I'm in favor of removing this.

Contributor

opened #20205

reference = period == :Week ? " For details see [`$accessor_str(::$typ_str)`](@ref)." : ""
typs = period in (:Microsecond, :Nanosecond) ? ["Time"] :
period in (:Hour, :Minute, :Second, :Millisecond) ? ["Time", "DateTime"] : ["Date","DateTime"]
reference = period == :Week ? " For details see [`$accessor_str(::Union{Date,DateTime})`](:func:`$accessor_str`)." : ""
Contributor

old-style cross-reference format

Member Author

What's the new style?

Contributor

(@ref), the way the line used to be

$($name)(t::Time) -> Int64

The $($name) of a `Time` as an `Int64`.
""" $func(t::Time)
Contributor

spacing is off

Member Author

How so? This follows the spacing of the block directly above it....

Contributor

L134: 4 spaces
L135: 4 spaces
L136: 8 spaces
L137: 12 spaces
L138: NA
L139: 9 spaces <- should be 8
L140: 9 spaces <- should be 8
L141: 4 spaces

Member Author

Ah, thanks. Sorry for missing this. Fixed.

@quinnj
Member Author

quinnj commented Jan 19, 2017

Docs updated.

throwing an error (in the case that `f::Function` is never satisfied).
throwing an error (in the case that `f::Function` is never satisfied). Note that the default step
will adjust to allow for greater precision for the given arguments; i.e. if hour, minute, and second
arguments are provided, the default step will be `Millisecond(1)` instead of `Second(1)`.
Contributor

@tkelman tkelman Jan 20, 2017

I'd either leave out the default in the docstring signature and say something like "the default value of step is the smaller of Second(1) or 1 unit of the next smaller precision than provided as input" in the description, or probably clearer, just write out the different signatures since square brackets for optional args isn't Julia syntax anyway

    Time(f::Function, h, mi=0; step=Second(1), negate=false, limit=10000) -> Time
    Time(f::Function, h, mi, s; step=Millisecond(1), negate=false, limit=10000) -> Time
    Time(f::Function, h, mi, s, ms; step=Microsecond(1), negate=false, limit=10000) -> Time
    Time(f::Function, h, mi, s, ms, us; step=Nanosecond(1), negate=false, limit=10000) -> Time

(edited to fix Microsecond typo)
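
The rule being described, where the default step depends on how many time components are supplied, can be illustrated with plain method dispatch. This is a sketch only: `default_step` is a hypothetical helper that is not part of `Dates`, and the mapping simply mirrors the four signatures listed above.

```julia
using Dates

# Hypothetical helper, not a Dates function: the default step is one unit
# finer than the finest component given, but never coarser than a second.
default_step(h) = Second(1)
default_step(h, mi) = Second(1)
default_step(h, mi, s) = Millisecond(1)
default_step(h, mi, s, ms) = Microsecond(1)
default_step(h, mi, s, ms, us) = Nanosecond(1)

println(default_step(20))
println(default_step(20, 30, 15))
```

Writing the docstring as separate signatures, as suggested, keeps each listed default in line with a mapping like this one.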

Contributor

was not addressed, docstring should not list a default value that isn't correct

@test a in dr

@test all(x->sort(x) == (step(x) < zero(step(x)) ? reverse(x) : x),drs)
@test all(x->step(x) < zero(step(x)) ? issorted(reverse(x)) : issorted(x),drs)
Contributor

space after the comma in these all and map calls would look quite a bit better

Contributor

was not addressed

Member Author

Unfortunately, the entire dates module/files have inconsistent spacing. I can do a separate PR to do a bunch of these spacing issues all at once.

Member Author

Though I'll probably wait until after the datetime-parsing PR so they don't have to rebase as much.

Contributor

newly added lines aren't going to conflict with the other PR if it won't need to touch them

@@ -168,6 +192,12 @@ ms = Dates.Millisecond(1)
@test Dates.Date(d,y) == Dates.Date(1,1,1)
@test Dates.Date(d,m) == Dates.Date(1,1,1)
@test Dates.Date(m,y) == Dates.Date(1,1,1)

@test isfinite(Dates.Date)
@test isfinite(Dates.DateTime)
Contributor

these isfinite lines probably shouldn't be deleted

Member Author

Added them back in.

@quinnj quinnj merged commit 06fa32c into master Jan 24, 2017
@tkelman tkelman deleted the jq/timebomb branch January 24, 2017 04:47
@tkelman tkelman added the needs news A NEWS entry is required for this change label Jan 24, 2017
@tkelman
Contributor

tkelman commented Jan 24, 2017

@tkelman
Contributor

tkelman commented Jan 24, 2017

I'd like to refactor a little bit to make the internal precision parameterised on a period (i.e. you could have a Second-precision Time, Millisecond-precision, etc.).

Correct me if I'm wrong, but that never happened? What changed?

@quinnj
Member Author

quinnj commented Jan 24, 2017

At the time, I thought it'd be more useful to have a Time object that could hold different internal precisions, but after trying out an idea or two, it didn't actually seem as useful as the simple rule of always having it at Nanosecond resolution.
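
A short sketch of what the fixed Nanosecond resolution looks like in use, assuming the `Dates` standard library of Julia 1.x: every `Time` stores a nanosecond count since midnight, and the coarser fields are read back through accessors.

```julia
using Dates

# Components are given as h, mi, s, ms, us, ns.
t = Time(12, 30, 15, 1, 2, 3)

# The underlying value is always a count of nanoseconds since midnight.
one_second = Dates.value(Time(0, 0, 1))

# Each field can be recovered with the matching accessor.
parts = (hour(t), minute(t), second(t),
         millisecond(t), microsecond(t), nanosecond(t))

println(one_second)
println(parts)
```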

@StefanKarpinski
Member

@tkelman, @quinnj: can we capture those todos in an issue?

@quinnj
Member Author

quinnj commented Jan 25, 2017

They're all addressed in #20226
