narrow: Encode our own structured keys instead of JSON.#4339
narrow: Encode our own structured keys instead of JSON.#4339gnprice merged 9 commits intozulip:masterfrom
Conversation
There was a problem hiding this comment.
This is great! Thanks for tightening this up so much.
One thing I thought of right away is that it seems like a good place to use an opaque type for the return value of keyFromNarrow. You introduced opaque types to our project in #4335. That way, you couldn't accidentally do a JSON.stringify in place of calling keyFromNarrow.
I did run into a lot of Flow errors, though. Maybe a bug? (facebook/flow#5407?)
Error ┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈ src/chat/__tests__/fetchingReducer-test.js:47:10
Cannot assign computed property using NarrowKey [1].
src/chat/__tests__/fetchingReducer-test.js
44│
45│ const expectedState = {
46│ [HOME_NARROW_STR]: { older: false, newer: false },
47│ [keyFromNarrow(streamNarrow('some stream'))]: { older: true, newer: false },
48│ };
49│
50│ const newState = fetchingReducer(initialState, action);
src/utils/narrow.js
[1] 261│ export function keyFromNarrow(narrow: Narrow): NarrowKey {
| */ | ||
| export const parseNarrow = (narrowStr: string): Narrow => JSON.parse(narrowStr); | ||
| export const parseNarrow = (narrowStr: string): Narrow => { | ||
| const makeError = () => new Error('parseNarrow: bad narrow'); |
There was a problem hiding this comment.
Hmm, I could imagine wanting to know what narrowStr is, in a Sentry report. Possibly that's sensitive data that shouldn't go to Sentry, though.
There was a problem hiding this comment.
Yeah. This code is all self-contained enough (really just the few dozen lines of keyFromNarrow and parseNarrow) that if we learn from Sentry there's a bug here, I feel like we should be able to find it with some combination of more thorough tests and just staring at the code.
And if not, we can always add more logging then 🙂
|
|
||
| case 'topic:': { | ||
| // eslint-disable-next-line no-control-regex | ||
| const match = /^s:(.*?)\x00(.*)/s.exec(rest); |
There was a problem hiding this comment.
Hmm, I don't think I've used the s flag (dotAll) before. I think its use here has something to do with the \x00 in the pattern, but I'm not quite sure what. 🤔
There was a problem hiding this comment.
It's not related to that character in the pattern, but it's for sort of an inverse kind of problem: it's so that the dots . match absolutely any character, even a newline if there somehow happens to be one in either the stream name or the topic. (The page you linked describes how . by default matches everything except four flavors of newline.)
As of this morning my draft of this change actually had a bug here because I didn't have the /s. I added it after writing the "awkward characters" part of the round-trip test, and finding that it failed on this change!
I'll add a line or so of a comment about this.
There was a problem hiding this comment.
(Meanwhile the \x00 is there because it's the same character that keyFromNarrow uses as a delimiter. That choice is discussed in a short comment there.)
| }), | ||
|
|
||
| // Change format of keys representing narrows. | ||
| '17': dropCache, |
There was a problem hiding this comment.
'drafts', which uses the new keys, is in storeKeys, so dropCache won't drop it.
There was a problem hiding this comment.
Ah, good catch!
I think I'll add a real migration, then. It's probably not real common for someone to happen to have a draft they've just written (or otherwise written recently enough to still want it) at the time the app gets upgraded -- it's possible the number of users who'd notice, on a given upgrade, would be zero -- but if they do it'd be best not to discard it.
| narrows: Immutable.Map(state.narrows), | ||
| }), | ||
|
|
||
| // Change format of keys representing narrows. |
There was a problem hiding this comment.
8fa5055 narrow: Encode our own structured keys instead of JSON.
One way this change could affect the code's behavior, in principle, is
that JSON.stringify emits an object's properties in enumeration order
-- which means that narrows we consider equal may not have gotten
serialized as equal:
> JSON.stringify([{"operator": "stream", "operand": "general"}])
'[{"operator":"stream","operand":"general"}]'
> JSON.stringify([{"operand": "general", "operator": "stream"}])
'[{"operand":"general","operator":"stream"}]'
which could result in looking for the same narrow at different keys.
If we have any cases where this can currently happen, then these
explicitly-structured keys fix that.
Yeah, this has been kind of in the back of our minds for a while. It's really nice to not have to worry about it anymore, even if we don't know of any actual instances of the issue.
|
Thanks for the review!
Yeah, good idea!
As it happens, I actually ran into this Flow bug just the other day -- I was experimenting with an opaque type for I'll revise this branch with the couple of changes mentioned above. (And might try a |
8fa5055 to
f1b858b
Compare
|
OK, revision pushed! |
chrisbobbe
left a comment
There was a problem hiding this comment.
Great, thanks! Just one small comment/query on the migration, below; otherwise, this LGTM.
f1b858b to
4e0a64d
Compare
I'd actually forgotten that state.drafts was among these; I was reminded by seeing its reducer and selector affected in the upcoming commit migrating to keyFromNarrow!
We'd like to replace the Narrow type in most of our code with a new, better-structured one. In general this kind of refactoring gets a lot of help from the type-checker: you change the type of a value in one place, and Flow points out all the other places that consume or provide that value and need to be updated to match. But one thing it won't catch is where we pass a Narrow to JSON.stringify, as we do all the time. That's because the latter will take any value at all as its argument (the type in the signature is `mixed`) -- so passing a value of some new type is still well-typed, but it breaks the code's implicit assumption that the value had the structure of Narrow. We can fix this by encapsulating this usage of JSON.stringify inside a function that specifically expects a Narrow. Here's that function. In the next commit, we'll switch over all our uses of JSON.stringify that should use it.
Done mostly automatically, as follows:
$ perl -i -0pe '
if (s<\bJSON\.stringify(\(.*(?i)narrow)>
<keyFromNarrow\1>g) {
s<^import .*?\K,?\s*(\} from \S*/narrow.;)>
<, keyFromNarrow \1>ms
|| s<^import .* from \S*;\K$>
<\nimport { keyFromNarrow } from "../utils/narrow";>ms;
}
' src/**/*.js
$ git restore src/api/
$ tools/fmt
The Perl snippet rewrites `JSON.stringify` calls that appear to be on
anything the code calls a "narrow", and then adds an import as needed.
We leave out a couple of matches in src/api/ which really aren't
about keys in our data structures but about communicating with the
server, so that `JSON.stringify` is the right name for what we're
doing.
Beyond the above commands, we make a couple of manual fixups just in
narrow.js itself.
Searching the codebase for uses of `JSON.stringify`, all of the
remaining ones appear to be about things other than narrows.
Saying "string" in the name doesn't add anything; parsing is always from a string.
We have a couple of parseNarrow tests that start from specific strings, but the real spec for this function is that it's the inverse of keyFromNarrow. Use that to make some much more thorough tests. That's also basically the spec for keyFromNarrow, too. (That, and that it should be deterministic.)
Here's a bug: * Send a message to a topic with a name like "this & that". * Look at it in the whole-stream view, or another interleaved view. The recipient header will show it nice and correctly: "this & that". * Tap the recipient header. * Expected: We narrow to topic "this & that". * Actual: We narrow to a different topic, "this & that". The cause of the bug is that we're trying to "unescape" a string that hasn't been "escaped" in the first place -- it's already simply the string we wanted, and "unescaping" can only corrupt it. Specifically: in `webViewEventHandlers.js` when we get the relevant outbound message-list event, it has the target narrow as a property, encoded as a string; we pass it to this helper `parseNarrowString`, and instead of simply decoding that string to a narrow, it calls `lodash.unescape` on it first. It looks like this bug has been here since the beginning of the webview-based message list. The location of it changed in commit b5f2e53, in 2018-02: before then, the narrow was encoded wrong in the HTML in the first place. That commit fixed that problem (well, part of it -- see 9941453) but introduced this spurious unescaping. This helper has no other call sites which might be relying on its current behavior, so we can just fix it directly. There was a test that gestured in the direction of this kind of question... but it actually just exercised the broken behavior, because it wasn't paired with the use of keyFromNarrow. Its useful content is covered by the round-trip tests we just added, so delete it; then add a bunch more round-trip tests that specifically probe for this class of bug. Fixes: zulip#4338
In the next commit, we'll change the implementation of keyFromNarrow so that it uses caseNarrow, which will make it naturally belong in this area of the file instead of at the top. To keep that diff focused and readable, we do the reordering separately here.
The main motivation for this change is that it will help us switch to
a more-structured type for narrows in our code, rather than the
existing arrays of operator/operand pairs -- we'll no longer be
coupling our data structures to the details of those values.
One way this change could affect the code's behavior, in principle, is
that JSON.stringify emits an object's properties in enumeration order
-- which means that narrows we consider equal may not have gotten
serialized as equal:
> JSON.stringify([{"operator": "stream", "operand": "general"}])
'[{"operator":"stream","operand":"general"}]'
> JSON.stringify([{"operand": "general", "operator": "stream"}])
'[{"operand":"general","operator":"stream"}]'
which could result in looking for the same narrow at different keys.
If we have any cases where this can currently happen, then these
explicitly-structured keys fix that.
4e0a64d to
af1713a
Compare
|
And merged, with a couple of additional comments from the thread above. Thanks again for the review! |
…list. In af1713a, from zulip#4339, a subtle bug was introduced by our use of the `\x00` character in making keys for narrows (`keyFromNarrow`). That character doesn't round-trip through being the value of the `data-narrow` attribute in our message-list HTML. So, make it so that that attribute never contains `\x00` (or indeed any other non-handled character [1]) by encoding it with our `base64UtfEncode` function. See discussion for how we arrived at this solution [2]. In particular, it has the nice quality of also reducing our exposure to bugs stemming from stream names (or topics, or users' email addresses -- anything that ends up in the result of `keyFromNarrow`) containing characters that aren't allowed in HTML-attribute values. [1] https://html.spec.whatwg.org/multipage/syntax.html#attributes-2 [2] https://chat.zulip.org/#narrow/stream/243-mobile-team/topic/.23M4339.20regression/near/1085234
…list. In af1713a, from zulip#4339, a subtle bug was introduced by our use of the `\x00` character in making keys for narrows (`keyFromNarrow`). That character doesn't round-trip through being the value of the `data-narrow` attribute in our message-list HTML, so we got an error on pressing a topic-narrow header in the message list for a stream narrow. So, make it so that that attribute never contains `\x00` (or indeed any other non-handled character [1]) by encoding it with our `base64UtfEncode` function. See discussion for how we arrived at this solution [2]. In particular, it has the nice quality of also reducing our exposure to bugs stemming from stream names (or topics, or users' email addresses -- anything that ends up in the result of `keyFromNarrow`) containing characters that aren't allowed in HTML-attribute values. [1] https://html.spec.whatwg.org/multipage/syntax.html#attributes-2 [2] https://chat.zulip.org/#narrow/stream/243-mobile-team/topic/.23M4339.20regression/near/1085234
…list. In af1713a, from zulip#4339, a subtle bug was introduced by our use of the `\x00` character in making keys for narrows (`keyFromNarrow`). That character doesn't round-trip through being the value of the `data-narrow` attribute in our message-list HTML, so we got an error on pressing a topic-narrow header in the message list for a stream narrow. So, make it so that that attribute never contains `\x00` (or indeed any other non-handled character [1]) by encoding it with our `base64UtfEncode` function. See discussion for how we arrived at this solution [2]. In particular, it has the nice quality of also reducing our exposure to bugs stemming from stream names (or topics, or users' email addresses -- anything that ends up in the result of `keyFromNarrow`) containing characters that aren't allowed in HTML-attribute values. [1] https://html.spec.whatwg.org/multipage/syntax.html#attributes-2 [2] https://chat.zulip.org/#narrow/stream/243-mobile-team/topic/.23M4339.20regression/near/1085234
As the comment in migration 19 said, we were doing something a bit risky there in order to avoid duplicating the then-current version of `keyFromNarrow`. We simply invoked `keyFromNarrow` directly -- which called for being careful about migrations for any future change we made to `keyFromNarrow`. Then a bit later we changed `keyFromNarrow` quite dramatically, and didn't pay attention to migrations at all. Oops. Specifically, in cf4207f we changed the `Narrow` type to have quite a different structure. That's the type that `keyFromNarrow` accepts -- so the result is that `keyFromNarrow` stopped working on the old values. Which are exactly what `JSON.parse` is going to supply us when this migration is run. The result is probably that if the user has any drafts when they take an upgrade that spans migration 19 and commit cf4207f, the app will crash on startup. We'd probably have found this bug at the alpha stage when next making a release, so it wouldn't have made it to users other than ourselves. Better to fix ASAP, though. The plan for migration 19, as described here: zulip#4339 (comment) knowingly involved breaking the basic rule of migrations: don't invoke any of the app's model code, because it will change. I figured it was OK because there were changes coming up soon which I'd planned out how I'd add migrations for. Then later the same week I sent zulip#4346, which followed that plan... and I hadn't thought through this particular interaction, and it introduced this bug. So this is a fresh lesson in the basic rule of migrations: don't invoke any of the app's model code, because it will change. See discussion where we found this bug: zulip#4382 (comment)
This completes a major objective of the long string of refactoring that appeared in the series of PRs zulip#4330, zulip#4332, zulip#4335, zulip#4339, zulip#4342, then zulip#4346, zulip#4356, zulip#4361, zulip#4364, and most recently zulip#4368. After this change, the portion of zulip#4333 that's about PMs, emails, and user IDs -- aka the portion of zulip#3764 that's about narrows -- is complete. Still open from zulip#4333 is to convert stream and topic narrows from stream names to stream IDs. I'm hopeful that that will be simpler: (a) unlike the situation for PMs, there's just one stream mentioned at a time, so there's no question of sorting, and (b) there isn't the "include self or not" complication that's bedeviled much of our code related to PMs. In other words, stream and topic narrows don't suffer from zulip#4035.
The main motivation for this change is that it will help us switch to a more sensible type for narrows in our code, rather than the existing arrays of operator/operand pairs -- we'll no longer be coupling our data structures to the details of those values. In particular we intend to make the Narrow values more appropriately structured, and to replace emails and stream names with user IDs and stream IDs: #4333.
As setup for changing how we encode narrows as string keys, we encapsulate doing so into a central function
keyFromNarrow, and replace all theJSON.stringifycalls that really meant "encode this narrow to use as a key" with calls to that function.Along the way I discovered a bug in our encoding and decoding of narrows, #4338, which this fixes.
Fixes: #4338