Skip to content

narrow: Encode our own structured keys instead of JSON.#4339

Merged
gnprice merged 9 commits intozulip:masterfrom
gnprice:pr-narrow-keys
Dec 15, 2020
Merged

narrow: Encode our own structured keys instead of JSON.#4339
gnprice merged 9 commits intozulip:masterfrom
gnprice:pr-narrow-keys

Conversation

@gnprice
Copy link
Copy Markdown
Member

@gnprice gnprice commented Dec 14, 2020

The main motivation for this change is that it will help us switch to a more sensible type for narrows in our code, rather than the existing arrays of operator/operand pairs -- we'll no longer be coupling our data structures to the details of those values. In particular we intend to make the Narrow values more appropriately structured, and to replace emails and stream names with user IDs and stream IDs: #4333.

As setup for changing how we encode narrows as string keys, we encapsulate doing so into a central function keyFromNarrow, and replace all the JSON.stringify calls that really meant "encode this narrow to use as a key" with calls to that function.

Along the way I discovered a bug in our encoding and decoding of narrows, #4338, which this fixes.

Fixes: #4338

@gnprice gnprice requested a review from chrisbobbe December 14, 2020 23:01
Copy link
Copy Markdown
Contributor

@chrisbobbe chrisbobbe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great! Thanks for tightening this up so much.

One thing I thought of right away is that it seems like a good place to use an opaque type for the return value of keyFromNarrow. You introduced opaque types to our project in #4335. That way, you couldn't accidentally do a JSON.stringify in place of calling keyFromNarrow.

I did run into a lot of Flow errors, though. Maybe a bug? (facebook/flow#5407?)

Error ┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈ src/chat/__tests__/fetchingReducer-test.js:47:10

Cannot assign computed property using NarrowKey [1].

     src/chat/__tests__/fetchingReducer-test.js
      44│
      45│       const expectedState = {
      46│         [HOME_NARROW_STR]: { older: false, newer: false },
      47│         [keyFromNarrow(streamNarrow('some stream'))]: { older: true, newer: false },
      48│       };
      49│
      50│       const newState = fetchingReducer(initialState, action);

     src/utils/narrow.js
 [1] 261│ export function keyFromNarrow(narrow: Narrow): NarrowKey {

Comment thread src/utils/narrow.js
*/
export const parseNarrow = (narrowStr: string): Narrow => JSON.parse(narrowStr);
export const parseNarrow = (narrowStr: string): Narrow => {
const makeError = () => new Error('parseNarrow: bad narrow');
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I could imagine wanting to know what narrowStr is, in a Sentry report. Possibly that's sensitive data that shouldn't go to Sentry, though.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. This code is all self-contained enough (really just the few dozen lines of keyFromNarrow and parseNarrow) that if we learn from Sentry there's a bug here, I feel like we should be able to find it with some combination of more thorough tests and just staring at the code.

And if not, we can always add more logging then 🙂

Comment thread src/utils/narrow.js

case 'topic:': {
// eslint-disable-next-line no-control-regex
const match = /^s:(.*?)\x00(.*)/s.exec(rest);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I don't think I've used the s flag (dotAll) before. I think its use here has something to do with the \x00 in the pattern, but I'm not quite sure what. 🤔

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not related to that character in the pattern, but it's for sort of an inverse kind of problem: it's so that the dots . match absolutely any character, even a newline if there somehow happens to be one in either the stream name or the topic. (The page you linked describes how . by default matches everything except four flavors of newline.)

As of this morning my draft of this change actually had a bug here because I didn't have the /s. I added it after writing the "awkward characters" part of the round-trip test, and finding that it failed on this change!

I'll add a line or so of a comment about this.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Meanwhile the \x00 is there because it's the same character that keyFromNarrow uses as a delimiter. That choice is discussed in a short comment there.)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah OK, got it, thanks!

Comment thread src/boot/store.js
}),

// Change format of keys representing narrows.
'17': dropCache,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'drafts', which uses the new keys, is in storeKeys, so dropCache won't drop it.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, good catch!

I think I'll add a real migration, then. It's probably not real common for someone to happen to have a draft they've just written (or otherwise written recently enough to still want it) at the time the app gets upgraded -- it's possible the number of users who'd notice, on a given upgrade, would be zero -- but if they do it'd be best not to discard it.

Comment thread src/boot/store.js Outdated
narrows: Immutable.Map(state.narrows),
}),

// Change format of keys representing narrows.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

8fa5055 narrow: Encode our own structured keys instead of JSON.

One way this change could affect the code's behavior, in principle, is
that JSON.stringify emits an object's properties in enumeration order
-- which means that narrows we consider equal may not have gotten
serialized as equal:

  > JSON.stringify([{"operator": "stream", "operand": "general"}])
  '[{"operator":"stream","operand":"general"}]'

  > JSON.stringify([{"operand": "general", "operator": "stream"}])
  '[{"operand":"general","operator":"stream"}]'

which could result in looking for the same narrow at different keys.
If we have any cases where this can currently happen, then these
explicitly-structured keys fix that.

Yeah, this has been kind of in the back of our minds for a while. It's really nice to not have to worry about it anymore, even if we don't know of any actual instances of the issue.

@gnprice
Copy link
Copy Markdown
Member Author

gnprice commented Dec 15, 2020

Thanks for the review!

One thing I thought of right away is that it seems like a good place to use an opaque type for the return value of keyFromNarrow. You introduced opaque types to our project in #4335. That way, you couldn't accidentally do a JSON.stringify in place of calling keyFromNarrow.

Yeah, good idea!

I did run into a lot of Flow errors, though. Maybe a bug? (facebook/flow#5407?)

As it happens, I actually ran into this Flow bug just the other day -- I was experimenting with an opaque type for UserId -- but hadn't gone and found that thread. I did happen to discover a workaround for it, while trying to figure out if I was doing something wrong; I've gone and posted the workaround on that thread just now.

I'll revise this branch with the couple of changes mentioned above. (And might try a NarrowKey opaque type in a followup branch.)

@gnprice
Copy link
Copy Markdown
Member Author

gnprice commented Dec 15, 2020

OK, revision pushed!

Copy link
Copy Markdown
Contributor

@chrisbobbe chrisbobbe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, thanks! Just one small comment/query on the migration, below; otherwise, this LGTM.

Comment thread src/boot/store.js
I'd actually forgotten that state.drafts was among these; I was
reminded by seeing its reducer and selector affected in the
upcoming commit migrating to keyFromNarrow!
We'd like to replace the Narrow type in most of our code with a new,
better-structured one.  In general this kind of refactoring gets a lot
of help from the type-checker: you change the type of a value in one
place, and Flow points out all the other places that consume or
provide that value and need to be updated to match.

But one thing it won't catch is where we pass a Narrow to
JSON.stringify, as we do all the time.  That's because the latter will
take any value at all as its argument (the type in the signature is
`mixed`) -- so passing a value of some new type is still well-typed,
but it breaks the code's implicit assumption that the value had the
structure of Narrow.

We can fix this by encapsulating this usage of JSON.stringify inside a
function that specifically expects a Narrow.  Here's that function.
In the next commit, we'll switch over all our uses of JSON.stringify
that should use it.
Done mostly automatically, as follows:

  $ perl -i -0pe '
      if (s<\bJSON\.stringify(\(.*(?i)narrow)>
           <keyFromNarrow\1>g) {
        s<^import .*?\K,?\s*(\} from \S*/narrow.;)>
         <, keyFromNarrow \1>ms
        || s<^import .* from \S*;\K$>
            <\nimport { keyFromNarrow } from "../utils/narrow";>ms;
      }
    ' src/**/*.js
  $ git restore src/api/
  $ tools/fmt

The Perl snippet rewrites `JSON.stringify` calls that appear to be on
anything the code calls a "narrow", and then adds an import as needed.
We leave out a couple of matches in src/api/ which really aren't
about keys in our data structures but about communicating with the
server, so that `JSON.stringify` is the right name for what we're
doing.

Beyond the above commands, we make a couple of manual fixups just in
narrow.js itself.

Searching the codebase for uses of `JSON.stringify`, all of the
remaining ones appear to be about things other than narrows.
Saying "string" in the name doesn't add anything; parsing is always
from a string.
We have a couple of parseNarrow tests that start from specific
strings, but the real spec for this function is that it's the inverse
of keyFromNarrow.  Use that to make some much more thorough tests.

That's also basically the spec for keyFromNarrow, too.  (That, and
that it should be deterministic.)
Here's a bug:
 * Send a message to a topic with a name like "this &amp; that".
 * Look at it in the whole-stream view, or another interleaved view.
   The recipient header will show it nice and correctly:
   "this &amp; that".
 * Tap the recipient header.
 * Expected: We narrow to topic "this &amp; that".
 * Actual: We narrow to a different topic, "this & that".

The cause of the bug is that we're trying to "unescape" a string that
hasn't been "escaped" in the first place -- it's already simply the
string we wanted, and "unescaping" can only corrupt it.

Specifically: in `webViewEventHandlers.js` when we get the relevant
outbound message-list event, it has the target narrow as a property,
encoded as a string; we pass it to this helper `parseNarrowString`,
and instead of simply decoding that string to a narrow, it calls
`lodash.unescape` on it first.

It looks like this bug has been here since the beginning of the
webview-based message list.  The location of it changed in commit
b5f2e53, in 2018-02: before then, the narrow was encoded wrong in
the HTML in the first place.  That commit fixed that problem (well,
part of it -- see 9941453) but introduced this spurious unescaping.

This helper has no other call sites which might be relying on its
current behavior, so we can just fix it directly.

There was a test that gestured in the direction of this kind of
question... but it actually just exercised the broken behavior,
because it wasn't paired with the use of keyFromNarrow.  Its useful
content is covered by the round-trip tests we just added, so delete
it; then add a bunch more round-trip tests that specifically probe
for this class of bug.

Fixes: zulip#4338
In the next commit, we'll change the implementation of keyFromNarrow
so that it uses caseNarrow, which will make it naturally belong in
this area of the file instead of at the top.  To keep that diff
focused and readable, we do the reordering separately here.
The main motivation for this change is that it will help us switch to
a more-structured type for narrows in our code, rather than the
existing arrays of operator/operand pairs -- we'll no longer be
coupling our data structures to the details of those values.

One way this change could affect the code's behavior, in principle, is
that JSON.stringify emits an object's properties in enumeration order
-- which means that narrows we consider equal may not have gotten
serialized as equal:

  > JSON.stringify([{"operator": "stream", "operand": "general"}])
  '[{"operator":"stream","operand":"general"}]'

  > JSON.stringify([{"operand": "general", "operator": "stream"}])
  '[{"operand":"general","operator":"stream"}]'

which could result in looking for the same narrow at different keys.
If we have any cases where this can currently happen, then these
explicitly-structured keys fix that.
@gnprice gnprice merged commit af1713a into zulip:master Dec 15, 2020
@gnprice
Copy link
Copy Markdown
Member Author

gnprice commented Dec 15, 2020

And merged, with a couple of additional comments from the thread above. Thanks again for the review!

@gnprice gnprice deleted the pr-narrow-keys branch December 15, 2020 20:51
chrisbobbe added a commit to chrisbobbe/zulip-mobile that referenced this pull request Dec 22, 2020
…list.

In af1713a, from zulip#4339, a subtle bug was introduced by our use of
the `\x00` character in making keys for narrows (`keyFromNarrow`).
That character doesn't round-trip through being the value of the
`data-narrow` attribute in our message-list HTML.

So, make it so that that attribute never contains `\x00` (or indeed
any other non-handled character [1]) by encoding it with our
`base64UtfEncode` function.

See discussion for how we arrived at this solution [2]. In
particular, it has the nice quality of also reducing our exposure to
bugs stemming from stream names (or topics, or users' email
addresses -- anything that ends up in the result of `keyFromNarrow`)
containing characters that aren't allowed in HTML-attribute values.

[1] https://html.spec.whatwg.org/multipage/syntax.html#attributes-2
[2] https://chat.zulip.org/#narrow/stream/243-mobile-team/topic/.23M4339.20regression/near/1085234
chrisbobbe added a commit to chrisbobbe/zulip-mobile that referenced this pull request Dec 22, 2020
…list.

In af1713a, from zulip#4339, a subtle bug was introduced by our use of
the `\x00` character in making keys for narrows (`keyFromNarrow`).
That character doesn't round-trip through being the value of the
`data-narrow` attribute in our message-list HTML, so we got an error
on pressing a topic-narrow header in the message list for a stream
narrow.

So, make it so that that attribute never contains `\x00` (or indeed
any other non-handled character [1]) by encoding it with our
`base64UtfEncode` function.

See discussion for how we arrived at this solution [2]. In
particular, it has the nice quality of also reducing our exposure to
bugs stemming from stream names (or topics, or users' email
addresses -- anything that ends up in the result of `keyFromNarrow`)
containing characters that aren't allowed in HTML-attribute values.

[1] https://html.spec.whatwg.org/multipage/syntax.html#attributes-2
[2] https://chat.zulip.org/#narrow/stream/243-mobile-team/topic/.23M4339.20regression/near/1085234
gnprice pushed a commit to chrisbobbe/zulip-mobile that referenced this pull request Dec 23, 2020
…list.

In af1713a, from zulip#4339, a subtle bug was introduced by our use of
the `\x00` character in making keys for narrows (`keyFromNarrow`).
That character doesn't round-trip through being the value of the
`data-narrow` attribute in our message-list HTML, so we got an error
on pressing a topic-narrow header in the message list for a stream
narrow.

So, make it so that that attribute never contains `\x00` (or indeed
any other non-handled character [1]) by encoding it with our
`base64UtfEncode` function.

See discussion for how we arrived at this solution [2]. In
particular, it has the nice quality of also reducing our exposure to
bugs stemming from stream names (or topics, or users' email
addresses -- anything that ends up in the result of `keyFromNarrow`)
containing characters that aren't allowed in HTML-attribute values.

[1] https://html.spec.whatwg.org/multipage/syntax.html#attributes-2
[2] https://chat.zulip.org/#narrow/stream/243-mobile-team/topic/.23M4339.20regression/near/1085234
gnprice added a commit to gnprice/zulip-mobile that referenced this pull request Jan 6, 2021
As the comment in migration 19 said, we were doing something a bit
risky there in order to avoid duplicating the then-current version
of `keyFromNarrow`.  We simply invoked `keyFromNarrow` directly --
which called for being careful about migrations for any future change
we made to `keyFromNarrow`.

Then a bit later we changed `keyFromNarrow` quite dramatically, and
didn't pay attention to migrations at all.  Oops.

Specifically, in cf4207f we changed the `Narrow` type to have quite
a different structure.  That's the type that `keyFromNarrow` accepts
-- so the result is that `keyFromNarrow` stopped working on the old
values.  Which are exactly what `JSON.parse` is going to supply us
when this migration is run.  The result is probably that if the user
has any drafts when they take an upgrade that spans migration 19
and commit cf4207f, the app will crash on startup.

We'd probably have found this bug at the alpha stage when next making
a release, so it wouldn't have made it to users other than ourselves.
Better to fix ASAP, though.

The plan for migration 19, as described here:
  zulip#4339 (comment)
knowingly involved breaking the basic rule of migrations: don't invoke
any of the app's model code, because it will change.  I figured it was
OK because there were changes coming up soon which I'd planned out how
I'd add migrations for.  Then later the same week I sent zulip#4346, which
followed that plan... and I hadn't thought through this particular
interaction, and it introduced this bug.  So this is a fresh lesson in
the basic rule of migrations: don't invoke any of the app's model code,
because it will change.

See discussion where we found this bug:
  zulip#4382 (comment)
gnprice added a commit to gnprice/zulip-mobile that referenced this pull request Jan 6, 2021
This completes a major objective of the long string of refactoring
that appeared in the series of PRs zulip#4330, zulip#4332, zulip#4335, zulip#4339, zulip#4342,
then zulip#4346, zulip#4356, zulip#4361, zulip#4364, and most recently zulip#4368.  After this
change, the portion of zulip#4333 that's about PMs, emails, and user IDs --
aka the portion of zulip#3764 that's about narrows -- is complete.

Still open from zulip#4333 is to convert stream and topic narrows from
stream names to stream IDs.  I'm hopeful that that will be simpler:
(a) unlike the situation for PMs, there's just one stream mentioned
at a time, so there's no question of sorting, and (b) there isn't
the "include self or not" complication that's bedeviled much of our
code related to PMs.  In other words, stream and topic narrows
don't suffer from zulip#4035.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Topics get mis-unescaped when narrowing through a recipient bar

2 participants