Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix datetime indexes to account for timezone information #3251

Merged
merged 2 commits into from
Apr 11, 2019

Conversation

mangalaman93
Copy link
Member

@mangalaman93 mangalaman93 commented Apr 3, 2019

Fixes #3245


This change is Reviewable

@mangalaman93 mangalaman93 self-assigned this Apr 3, 2019
@mangalaman93 mangalaman93 requested a review from a team April 3, 2019 10:25
@mangalaman93
Copy link
Member Author

Just realized that this would a breaking change, and all the datetime indexes in existing setups will have to be created again.!!!

Copy link
Contributor

@srfrog srfrog left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 3 files at r1.
Reviewable status: 2 of 3 files reviewed, 3 unresolved discussions (waiting on @mangalaman93)


tok/tok.go, line 211 at r1 (raw file):

	tval := v.(time.Time)
	buf := make([]byte, 2)
	binary.BigEndian.PutUint16(buf[0:2], uint16(tval.UTC().Year()))

This would create the index in UTC even though the data is not. I think this would yield false positives.


tok/tok.go, line 243 at r1 (raw file):

	binary.BigEndian.PutUint16(buf[2:4], uint16(tval.UTC().Month()))
	binary.BigEndian.PutUint16(buf[4:6], uint16(tval.UTC().Day()))
	return []string{string(buf)}, nil

Read my previous comment


tok/tok.go, line 259 at r1 (raw file):

	binary.BigEndian.PutUint16(buf[2:4], uint16(tval.UTC().Month()))
	binary.BigEndian.PutUint16(buf[4:6], uint16(tval.UTC().Day()))
	binary.BigEndian.PutUint16(buf[6:8], uint16(tval.UTC().Hour()))

Same

Copy link
Contributor

@manishrjain manishrjain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 2 of 3 files reviewed, 4 unresolved discussions (waiting on @mangalaman93 and @srfrog)


query/common_test.go, line 552 at r1 (raw file):

		<301> <created_at> "2019-03-28T14:41:57+30:00" .
		<302> <created_at> "2019-03-28T13:41:57+29:00" .
		<303> <created_at> "2019-03-27T14:41:57+06:00" .

Can you add some simpler cases as well and see how they behave? Most users would just create year-month-day-hour, not including the time zone, etc.


tok/tok.go, line 211 at r1 (raw file):

Previously, srfrog (Gus) wrote…

This would create the index in UTC even though the data is not. I think this would yield false positives.

I think that should be ok, because the index queries would still be looking at that.

Needs some simple case testing though. For e.g, dates with just year-month-day-hour? Those are not part of the tests above. How would they be affected?

@mangalaman93 mangalaman93 force-pushed the aman/datetime branch 4 times, most recently from d6bec50 to 4e08e9d Compare April 5, 2019 12:54
Copy link
Member Author

@mangalaman93 mangalaman93 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 3 files reviewed, 4 unresolved discussions (waiting on @mangalaman93, @manishrjain, and @srfrog)


query/common_test.go, line 552 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

Can you add some simpler cases as well and see how they behave? Most users would just create year-month-day-hour, not including the time zone, etc.

Done.


tok/tok.go, line 211 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

I think that should be ok, because the index queries would still be looking at that.

Needs some simple case testing though. For e.g, dates with just year-month-day-hour? Those are not part of the tests above. How would they be affected?

I also realized the time package internally stores UTC time which it then converts into correct value of hour in that timezone when the Hour() function is called. Keeping it UTC makes sense to me.


tok/tok.go, line 243 at r1 (raw file):

Previously, srfrog (Gus) wrote…

Read my previous comment

Done.


tok/tok.go, line 259 at r1 (raw file):

Previously, srfrog (Gus) wrote…

Same

Done.

Copy link
Contributor

@manishrjain manishrjain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: If you can think of more scenarios, particularly around the date time shift, i.e. someone adding a time just before midnight in GMT (so date is still today's), and then trying to find entries from today -- that might no longer work and cause undesired effects? Is that a right understanding?

Reviewable status: 0 of 3 files reviewed, 4 unresolved discussions (waiting on @mangalaman93, @manishrjain, and @srfrog)

@mangalaman93 mangalaman93 force-pushed the aman/datetime branch 2 times, most recently from ec7c74d to 25ecece Compare April 11, 2019 12:44
@mangalaman93
Copy link
Member Author

This test is failing now, also pointed in https://discuss.dgraph.io/t/bug-with-facets-sorting-alias-by-datetime/3177. For the query -

{
	q(func: has(best_friend)){
		uid
		best_friend @facets(orderasc: since) {
			uid
		}
	}
}

The response should be -

{
  "data": {
    "q": [
      {
        "uid": "0x2",
        "best_friend": {
          "uid": "0x40",
          "best_friend|since": "2019-03-28T14:41:57+30:00"
        }
      },
      {
        "uid": "0x4",
        "best_friend": {
          "uid": "0x40",
          "best_friend|since": "2019-03-27T00:00:00Z"
        }
      },
      {
        "uid": "0x3",
        "best_friend": {
          "uid": "0x40",
          "best_friend|since": "2018-03-24T14:41:57+05:30"
        }
      }
    ]
  }
}

The response instead is -

{
  "data": {
    "q": [
      {
        "uid": "0x2",
        "best_friend": {
          "uid": "0x40",
          "best_friend|since": "2019-03-28T14:41:57+30:00"
        }
      },
      {
        "uid": "0x3",
        "best_friend": {
          "uid": "0x40",
          "best_friend|since": "2018-03-24T14:41:57+05:30"
        }
      },
      {
        "uid": "0x4",
        "best_friend": {
          "uid": "0x40",
          "best_friend|since": "2019-03-27T00:00:00Z"
        }
      }
    ]
  }
}

Copy link
Contributor

@manishrjain manishrjain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 3 of 3 files at r3.
Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @manishrjain and @srfrog)

The time.Hour() function, for example, returns the value
of hour stored, without accounting for timezone information.
E.g. for timestamp, 2019-05-28T14:41:57+30:00, the Hour value
would be 14 whereas when we use this value for comparison,
that would not correctly compare two timestamps.

Now, we use UTC value of hour for creating index and for
comparison to ensure that we correctly compare all timestamps
having timezone information.

Note that all the timestmaps are parsed assuming UTC timezone
if no timezone is specified. This is the default behaviour
in Go for the time.Parse function as well.
@mangalaman93 mangalaman93 merged commit 95be3f4 into master Apr 11, 2019
@mangalaman93 mangalaman93 deleted the aman/datetime branch April 11, 2019 21:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

3 participants