
Date/datetime filter autocompletion, new timezone test suite for JS #1282

Merged
merged 6 commits into from
Jan 30, 2021

Conversation

sc1f
Contributor

@sc1f sc1f commented Jan 7, 2021

This PR fixes the datetime filtering issues specified in #1242, and clarifies the behavior of perspective.js with date and datetime columns by introducing a test suite that runs in the US/Eastern timezone.

Previously, the behavior of datetime filters (especially for ==) was inconsistent and partially broken:

  • Users had no hints or pickers to help them find a date format that was guaranteed to be valid
  • Dates are displayed in US locale format, but the filter configuration would not accept US-formatted locale strings
  • Date filters would not "match" the underlying data. Because strings are parsed as UTC, a == filter that tries to match "12/31/2020 12:30:00 PM", for example, would require the string "12/31/2020 5:30:00 PM" (for a browser in US Eastern time) in order to actually match the correct value inside the engine, which stores the values as POSIX timestamps.

This PR implements the following fixes to address the issue:

  • In perspective.js, all filters for date and datetime columns are converted to new Date() before being passed into the engine. This allows Perspective to treat the value as local time and call the getTime API for the POSIX timestamp, meaning that date/datetime filters will "match" in the UI.
  • In perspective-viewer, date and datetime filters now have an autocomplete, which allows users to narrow down the exact filter value, and provides them with a format that is guaranteed to work with the engine:

(Screenshot: date/datetime filter autocomplete in perspective-viewer)
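The first fix above can be illustrated with a minimal sketch. The helper name `filterTermToTimestamp` is hypothetical and is not Perspective's actual code; it only shows the idea of routing a filter term through `new Date()` and `getTime()` so that it is interpreted as browser-local time. The term is built from numeric components here because parsing of non-ISO strings by `new Date(string)` is implementation-defined.

```javascript
// Hypothetical helper (not Perspective's API): interpret a date/datetime
// filter term as browser-local time and return its POSIX timestamp in ms.
function filterTermToTimestamp(term) {
    const date = term instanceof Date ? term : new Date(term);
    const ts = date.getTime();
    if (Number.isNaN(ts)) {
        throw new Error(`Unparseable date filter term: ${term}`);
    }
    return ts;
}

// The same wall-clock fields (Dec 31 2020, 12:30:00), read two ways.
const localTerm = new Date(2020, 11, 31, 12, 30, 0); // local time
const asLocal = filterTermToTimestamp(localTerm);
const asUtc = Date.UTC(2020, 11, 31, 12, 30, 0); // read as UTC instead

// The two readings differ by exactly the local UTC offset at that instant,
// which is why a UTC-parsed filter string missed locally displayed values.
const offsetMs = localTerm.getTimezoneOffset() * 60 * 1000;
console.log(asLocal - asUtc === offsetMs); // true in any timezone
```

In a browser running in US Eastern time the offset on that date is five hours, which matches the 12:30 PM versus 5:30 PM discrepancy described earlier.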

A threshold of 100,000 rows has been added so that the filter does not try to materialize too many values to search through, which can happen on large datasets with large numbers of unique datetimes/strings. This does not change the filter behavior—only the suggestions provided by the autocomplete.
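A rough sketch of how such a guard might look is shown below. The constant name, the function name, the strictness of the comparison, and the substring matching are all assumptions made for illustration; only the 100,000 figure comes from the description above.

```javascript
// Assumed name; only the 100,000 value comes from the PR description.
const MAX_ROWS_FOR_SUGGESTIONS = 100000;

// Hypothetical suggestion helper: skips materializing unique values when
// the table is too large. Filtering itself is unaffected; only the
// autocomplete suggestions are suppressed.
function suggestFilterValues(numRows, getUniqueValues, typed) {
    if (numRows > MAX_ROWS_FOR_SUGGESTIONS) {
        return []; // never calls getUniqueValues()
    }
    const needle = typed.toLowerCase();
    return getUniqueValues().filter((v) => v.toLowerCase().includes(needle));
}

const values = ["12/31/2020, 12:30:00 PM", "01/01/2021, 09:00:00 AM"];
console.log(suggestFilterValues(2, () => values, "12/31"));
```

Passing the unique values as a callback keeps the expensive lookup lazy, so a large table pays nothing for suggestions it will not show.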

Additionally, while behavior regarding timezones and various date container values is explicitly specified for perspective-python, it is less clear for perspective.js. This PR adds a test suite that runs in the US/Eastern timezone in order to assert that datetime behavior is clear and correctly implemented when UTC offsets are involved, similar to the perspective-python test suite targeting the same issue.

Caveats

In order for US locale strings to be treated as valid by the filter validator, the locale string format must be added to the parser format definition in arrow_csv.cpp. This has the side effect of parsing US locale strings as part of the dataset, where I found an issue with strptime in the C library: when given the US locale string format ("%m/%d/%Y, %I:%M:%S %p"), it parses "12:00:00 AM" as "12:00:00 PM", which is incorrect. This seems to be an issue with strptime when compiled in Emscripten; the Python library does not have the same issue. Users should pass in date values as Date() objects, but there can be cases where strings are passed in and this parsing error occurs.
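For reference, the expected 12-hour to 24-hour mapping for the %I and %p fields can be written out as a small sketch. The function `to24Hour` is hypothetical and is not part of the engine's parser; it only states what a correct parse should produce, with 12 AM mapping to hour 0 and 12 PM to hour 12.

```javascript
// Hypothetical illustration of the %I / %p semantics, not engine code.
function to24Hour(hour12, meridiem) {
    if (!Number.isInteger(hour12) || hour12 < 1 || hour12 > 12) {
        throw new RangeError(`12-hour clock value out of range: ${hour12}`);
    }
    const m = meridiem.toUpperCase();
    if (m !== "AM" && m !== "PM") {
        throw new RangeError(`Expected AM or PM, got: ${meridiem}`);
    }
    // 12 wraps to 0 before the PM offset is applied.
    return (hour12 % 12) + (m === "PM" ? 12 : 0);
}

console.log(to24Hour(12, "AM")); // 0  (midnight)
console.log(to24Hour(12, "PM")); // 12 (noon)
```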

Contributor Author

sc1f commented Jan 7, 2021

Summary of datetime handling in perspective.js:

  • On load, Date() values have a Unix timestamp (in milliseconds) extracted through getTime. On display to the user (in datagrid, etc.), the timestamp is passed to new Date() and displayed as browser local time.
  • On load, String values containing datetimes are parsed into Unix timestamps (in milliseconds). On display (in datagrid, etc.), the timestamp is passed to new Date() and displayed as browser local time. This means that date values passed through strings are assumed to be UTC, as Perspective is not timezone-aware (nor does it parse or calculate timezone offsets), and on display to the user they are converted to the browser's local time. Thus a string such as "03/01/2020, 01:30:00 AM" is displayed as "02/29/2020, 08:30:00 PM" (when the browser is in US Eastern time).
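The string case can be reproduced outside the browser with a short sketch. The helper `displayInZone` is hypothetical; it pins the display timezone to America/New_York through `Intl.DateTimeFormat` instead of relying on the machine's local timezone, and it assumes a runtime with ICU timezone data (the default in current Node.js builds).

```javascript
// Hypothetical helper: format a millisecond timestamp as a US-style string
// in a named timezone, standing in for "browser local time".
function displayInZone(timestampMs, timeZone) {
    const parts = new Intl.DateTimeFormat("en-US", {
        timeZone,
        year: "numeric",
        month: "2-digit",
        day: "2-digit",
        hour: "2-digit",
        minute: "2-digit",
        second: "2-digit",
        hour12: true,
    }).formatToParts(new Date(timestampMs));
    // Collect the named fields; separators are rebuilt by hand so the
    // output does not depend on locale-specific whitespace characters.
    const p = Object.fromEntries(parts.map(({ type, value }) => [type, value]));
    return `${p.month}/${p.day}/${p.year}, ${p.hour}:${p.minute}:${p.second} ${p.dayPeriod}`;
}

// "03/01/2020, 01:30:00 AM" is assumed to be UTC on load...
const loaded = Date.UTC(2020, 2, 1, 1, 30, 0);
// ...and is shown five hours earlier in US Eastern time.
console.log(displayInZone(loaded, "America/New_York"));
console.log(displayInZone(loaded, "UTC"));
```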

The issue with strptime not parsing 12AM as 00:00 is unique to Emscripten, it seems - strptime on the Python build correctly parses the datestring.

@sc1f sc1f linked an issue Jan 8, 2021 that may be closed by this pull request
@sc1f sc1f changed the title Parse US and UK locale strings, add autocomplete for date/datetime filter columns Date/datetime filter autocompletion, new timezone test suite for JS Jan 8, 2021
@sc1f sc1f force-pushed the fix-datetime-filters branch from bfa9b02 to 864582f Compare January 8, 2021 19:51
@texodus texodus added the 0.6.1 label Jan 19, 2021
@sc1f sc1f force-pushed the fix-datetime-filters branch 2 times, most recently from 50df4c9 to 32b38c9 Compare January 20, 2021 21:48
@sc1f sc1f self-assigned this Jan 20, 2021
@sc1f sc1f force-pushed the fix-datetime-filters branch from 120cdd2 to 9144258 Compare January 23, 2021 01:21
@sc1f sc1f marked this pull request as ready for review January 25, 2021 15:48
Member

@texodus texodus left a comment

Looks good! Thanks for the PR! Reviewed offline.

sc1f added 5 commits January 30, 2021 03:52
WIP: add US and UK datestrings

Remove UK locale string parsing - too much ambiguity
Add new timezone test module

don't run tz test twice
convert date and datetime filters to new Date(), fix formatting for date filters

refactor test_js script, fix tz tests

Add a ton of tests
@texodus texodus force-pushed the fix-datetime-filters branch from 24f09a9 to 9d639ae Compare January 30, 2021 21:07
@texodus texodus force-pushed the fix-datetime-filters branch from 9d639ae to 9c88df7 Compare January 30, 2021 22:31
Member

texodus commented Jan 30, 2021

These tests flapped because some "filter completion" queries were taking longer than the cursor blink timeout. As a fix, I've added document.activeElement.blur() to the end of any test which calls setAttribute("filter", ..) (assuming the test is not testing the autocomplete menu itself). This makes sure there can be no cursor selected when the screenshot is taken. I also updated a boatload of hashes to remove cursors (and yes, manually verified that these screenshot diffs were just cursors 💯 )

@texodus texodus merged commit 91c89d0 into master Jan 30, 2021
@texodus texodus deleted the fix-datetime-filters branch January 30, 2021 23:00
@texodus texodus removed the 0.6.1 label Jan 30, 2021
@texodus texodus added this to the 0.6.1 milestone Jan 30, 2021
@texodus texodus mentioned this pull request Aug 12, 2021
Labels
enhancement Feature requests or improvements JS
Development

Successfully merging this pull request may close these issues.

Datetime and date == filters does not work as expected
2 participants