Skip to content

Conversation

@chrisbobbe
Copy link
Contributor

From discussion: https://chat.zulip.org/#narrow/stream/243-mobile-team/topic/Strings.20in.20messages_en.2Ejson.20but.20not.20in.20the.20app/near/1425315

TODO:

  • Incorporate into tools/test, perhaps to run only when --all-files is requested
  • Type-check this file with Flow?
  • Think of any more ways to keep non-UI strings out of possibleUiStringLiterals

@chrisbobbe chrisbobbe added a-tools a-i18n Translation, localization, internationalization labels Aug 26, 2022
@chrisbobbe chrisbobbe requested a review from gnprice August 26, 2022 21:45
@chrisbobbe
Copy link
Contributor Author

chrisbobbe commented Aug 26, 2022

Run with node tools/check-messages-en.js.

walk(fs.opendirSync(path.join(kSrcDirName, subdirName)), subdirName);
} else {
// Something we don't expect to find under src/, probably containing
// no UI strings. (symlinks? fifos, sockets, devices??)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, this function was inspired by a function in tsflower. 😛

@chrisbobbe chrisbobbe force-pushed the pr-dangling-translation-strings branch from 796b88b to f63aa76 Compare August 26, 2022 21:54
chrisbobbe added a commit to chrisbobbe/zulip-mobile that referenced this pull request Dec 22, 2022
…e UI)

Using my draft of a tool for finding these, in zulip#5476.

For a deep dive into where the first few of these came from, and why
they don't lead us to strings we've forgot to include in
messages_en.json, see zulip#5477.
chrisbobbe added a commit to chrisbobbe/zulip-mobile that referenced this pull request Dec 22, 2022
…e UI)

Using my draft of a tool for finding these, in zulip#5476.

For a deep dive into where the first few of these came from, and why
they don't lead us to strings we've forgot to include in
messages_en.json, see zulip#5477.
gnprice pushed a commit to chrisbobbe/zulip-mobile that referenced this pull request Dec 22, 2022
…e UI)

Using my draft of a tool for finding these, in zulip#5476.
chrisbobbe added a commit to chrisbobbe/zulip-mobile that referenced this pull request Dec 29, 2022
…e UI)

Using my draft of a tool for finding these, in zulip#5476.
@chrisbobbe chrisbobbe force-pushed the pr-dangling-translation-strings branch from f63aa76 to 0cd4245 Compare December 29, 2022 15:28
@gnprice
Copy link
Member

gnprice commented Dec 29, 2022

Run with node tools/check-messages-en.js.

You can simplify this part by doing two things:

  • Add a line #!/usr/bin/env node at the very beginning.
  • Run chmod +x tools/check-messages-en.js to make the file executable.

Then just tools/check-messages-en.js will run it. The kernel sees that #! line, and runs that line's command with the file's name as an argument: /usr/bin/env node tools/check-messages-en.js. Then in turn /usr/bin/env's job is just to look up node in your PATH and run that — which effectively gives you the same command running as if you'd typed node tools/check-messages-en.js.

A further refinement is to rename it without the .js suffix. Conventionally when a file is meant to be executed directly, one leaves off any file-type extension.

For an example, see tools/build-icon-font.

@gnprice
Copy link
Member

gnprice commented Dec 29, 2022

From #5624 (comment) :

After merging [#5624], the tool in #5476 now just finds one result, and it's a false positive:

Found message IDs in static/translations/messages_en.json that don't seem to be used in the app:
[
  '{num_of_people, plural,\n' +
    '  one {This message has been <z-link>read</z-link> by {num_of_people} person:}\n' +
    '  other {This message has been <z-link>read</z-link> by {num_of_people} people:}}',
  '{unreadCount, plural,\n' +
    '  one {{unreadCount} unread message}\n' +
    '  other {{unreadCount} unread messages}\n' +
    '}'
]

I say it's a false positive because that message shows up in translation on the read-receipts screen (tested with French).

(That's two results, right? Not one.)

It looks like the first one appears in the source as a template literal:

$ gr -C2 'This message has been' src/ 
src/message/ReadReceiptsScreen.js
132-            text: `\
133-{num_of_people, plural,
134:  one {This message has been <z-link>read</z-link> by {num_of_people} person:}
135:  other {This message has been <z-link>read</z-link> by {num_of_people} people:}\
136-}`,
137-            values: {

So probably that's a different type of node in the AST, and one just needs another method next to visitLiteral in order to cover it.


The second one appears to be a true positive. Here's from src/chat/UnreadNotice.js:

        <ZulipTextIntl
          style={styles.unreadText}
          text={
            // TODO(i18n): Try ICU syntax for plurals, like
            // `{unreadCount, plural,
            //   one {{unreadCount} unread message}
            //   other {{unreadCount} unread messages}
            // }`
            // We hope Transifex gives a nice UI to translators so they can
            // easily translate plurals. We've added the above message to
            // messages_en.json, and we'll see if that's the case? See:
            //   https://chat.zulip.org/#narrow/stream/58-translation/topic/ICU.20Message.20syntax/near/1300245
            unreadCount === 1
              ? '1 unread message'
              : {
                  text: '{unreadCount} unread messages',
                  values: { unreadCount: unreadCount.toString() },
                }
          }
        />

I guess that experiment is complete and we should declare victory on it, and start using that string.

Copy link
Member

@gnprice gnprice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for writing this up! Small comments below, and see also those above. I think it's probably a pretty short distance from here to something we can merge, and that will be helpful for keeping this list of messages garbage-collected.

Comment on lines 9 to 42
// Make a list of files that might contain UI strings, by recursing in src/.
const possibleUIStringFilePaths = [];
const kSrcDirName = 'src/';
function walk(dir, _dirName = '') {
let dirent;
// eslint-disable-next-line no-cond-assign
while ((dirent = dir.readSync())) {
// To reduce false negatives, `continue` when nothing in `dirent` can
// cause UI strings to appear in the app.

if (dirent.isFile()) {
// Non-JS code, and Flow type definitions in .js.flow files.
if (!dirent.name.endsWith('.js')) {
continue;
}

possibleUIStringFilePaths.push(path.join(kSrcDirName, _dirName, dirent.name));
} else if (dirent.isDirectory()) {
const subdirName = path.join(_dirName, dirent.name);

// Test code.
if (subdirName.endsWith('__tests__')) {
continue;
}

walk(fs.opendirSync(path.join(kSrcDirName, subdirName)), subdirName);
} else {
// Something we don't expect to find under src/, probably containing
// no UI strings. (symlinks? fifos, sockets, devices??)
continue;
}
}
}
walk(fs.opendirSync(kSrcDirName));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's encapsulate this as a function that returns possibleUIStringFilePaths.

const messages_en = require('../static/translations/messages_en.json');

// Make a list of files that might contain UI strings, by recursing in src/.
const possibleUIStringFilePaths = [];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: proper camel case

Suggested change
const possibleUIStringFilePaths = [];
const possibleUiStringFilePaths = [];

It's in my backlog to put this in the style guide, but here's the previous reference I have noted down in that backlog:
#5489 (comment)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, right 🙂 I see I wrote this name a few weeks before that comment where you reminded me of this.

Comment on lines 88 to 120
// Look at all files in possibleUIStringFilePaths, and collect all string
// literals that might represent UI strings.
const possibleUiStringLiterals = new Set();
possibleUIStringFilePaths.forEach(filePath => {
const source = fs.readFileSync(filePath).toString();
const ast = parse(source, parseOptions);

visit(ast, {
// Find nodes with type "Literal" in the AST.
/* eslint-disable no-shadow */
visitLiteral(path) {
// To reduce false negatives, return false when `path` definitely
// doesn't represent a string literal for a UI string in the app.

const { value } = path.value;

// Non-string literals: numbers, booleans, etc.
if (typeof value !== 'string') {
return false;
}

// String literals like 'react' in lines like
// import React from 'react';
if (n.ImportDeclaration.check(path.parent.value)) {
return false;
}

possibleUiStringLiterals.add(value);

return this.traverse(path);
},
});
});
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly, this can be a function that gets passed possibleUIStringFilePaths and returns possibleUiStringLiterals.

Comment on lines 122 to 133
// Check each key ("message ID" in formatjs's lingo) against
// possibleUiStringLiterals, and make a list of any that aren't found.
const danglingMessageIds = Object.keys(messages_en).filter(
messageId => !possibleUiStringLiterals.has(messageId),
);

if (danglingMessageIds.length > 0) {
console.warn(
"Found message IDs in static/translations/messages_en.json that don't seem to be used in the app:",
);
console.warn(danglingMessageIds);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And then the rest of this code that's doing things at top level can be in a function called main, and the script ends with a call main();.

This way, the control flow and data flow of the code is structured for readability in much the same way as it would be in any other file, in a file that's meant to be imported as a library rather than run as a script. The main() call at the end becomes the only part that couldn't equally well go in an importable library.

Comment on lines 20 to 22
// Non-JS code, and Flow type definitions in .js.flow files.
if (!dirent.name.endsWith('.js')) {
continue;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: for a comment like this that's stating the situation we're in when the condition is true, I find it clearer to understand when it's placed inside the conditional rather than outside:

Suggested change
// Non-JS code, and Flow type definitions in .js.flow files.
if (!dirent.name.endsWith('.js')) {
continue;
if (!dirent.name.endsWith('.js')) {
// Non-JS code, and Flow type definitions in .js.flow files.
continue;

That way the comment describes something that's true at the point where the comment is.

const subdirName = path.join(_dirName, dirent.name);

// Test code.
if (subdirName.endsWith('__tests__')) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about:

Suggested change
if (subdirName.endsWith('__tests__')) {
if (subdirName.endsWith('tests__')) {

That way it also covers __flow-tests__.

Comment on lines 128 to 132
if (danglingMessageIds.length > 0) {
console.warn(
"Found message IDs in static/translations/messages_en.json that don't seem to be used in the app:",
);
console.warn(danglingMessageIds);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For a failure result, in addition to printing output also say process.exit(1);. That way the script exits with status 1 (instead of 0, which is what you get if it reaches the end of the script code.)

Exit status 0 means success and anything else means failure, by convention and in particular as interpreted by the shell:
https://www.gnu.org/software/bash/manual/html_node/Exit-Status.html
so that's what we'll want for running this from tools/test.

@chrisbobbe chrisbobbe force-pushed the pr-dangling-translation-strings branch from 0cd4245 to 6cf3742 Compare December 30, 2022 18:02
@chrisbobbe chrisbobbe changed the title [WIP] Tool to find dangling message IDs in messages_en.json tools/test: Add intl suite Dec 30, 2022
@chrisbobbe chrisbobbe marked this pull request as ready for review December 30, 2022 18:02
@chrisbobbe
Copy link
Contributor Author

chrisbobbe commented Dec 30, 2022

Thanks for the review! Revision pushed, with a few more checks added to tools/check-messages-en, and a tools/test intl suite added, in separate commits.

@chrisbobbe chrisbobbe force-pushed the pr-dangling-translation-strings branch from 6cf3742 to 3f805cb Compare December 30, 2022 18:05
@chrisbobbe
Copy link
Contributor Author

chrisbobbe commented Dec 30, 2022

Some questions prompted by adding those two added checks (for straight quotes, and for mismatched message entries):

  • Should we exit as soon as one of the three checks fails, or continue on to the next check?
  • Probably helpful for each failure message to say something like "[fix the thing] and then rerun tools/test intl", right, so contributors don't have to wait for CI to error again if they get something wrong?

@gnprice
Copy link
Member

gnprice commented Dec 31, 2022

Thanks for the revision! I'll take a look at the actual code when I'm back, but quick replies just to the questions above:

  • Should we exit as soon as one of the three checks fails, or continue on to the next check?

Yeah, in general that's a helpful thing for test suites and type-checkers and linters and so on to do.

Because there's only three of these and probably each of them is quick to fix when it fails, the impact of that nuance isn't big here. So if it seems complicated to do, just let it be.

  • Probably helpful for each failure message to say something like "[fix the thing] and then rerun tools/test intl", right, so contributors don't have to wait for CI to error again if they get something wrong?

In general the advice of "run tools/test locally to not have to wait for CI" applies to all our tests. So if people aren't getting that message, we should see about communicating it more overall.

I guess is this PR adding it as a non-default suite? In that case some more specific advice may be helpful. I think tools/test itself is probably the right place for that: something like, at the very end,

To rerun the suites that failed, run:
  $ tools/test flow intl

(with the list of suites being whatever suites indeed failed.)

@chrisbobbe chrisbobbe force-pushed the pr-dangling-translation-strings branch from 3f805cb to 91bdfe7 Compare January 3, 2023 23:07
@chrisbobbe
Copy link
Contributor Author

Thanks for the review! Revision pushed.

Copy link
Member

@gnprice gnprice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the revision! Generally this looks good; a few comments below.

  • Should we exit as soon as one of the three checks fails, or continue on to the next check?

Yeah, in general that's a helpful thing for test suites and type-checkers and linters and so on to do.

Rereading this I see that my answer was pretty ambiguous. 🙂 What I meant was that continuing on (and not exiting) is a helpful thing for such tools to do. Seems like you guessed my meaning correctly.

tools/test Outdated
Comment on lines 296 to 299
prettier) run_prettier ;;
deps) run_deps ;;
tsflower) run_tsflower ;;
intl) run_intl ;;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: match alignment of neighboring lines

Suggested change
prettier) run_prettier ;;
deps) run_deps ;;
tsflower) run_tsflower ;;
intl) run_intl ;;
prettier) run_prettier ;;
deps) run_deps ;;
tsflower) run_tsflower ;;
intl) run_intl ;;

tools/test Outdated
prettier
deps
tsflower
intl (Doesn't run by default, as it takes several seconds.)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Earlier in the usage message:

By default, run all suites, but only on files changed in this branch
as found by \`tools/git changed-files \$(tools/git base)\`.

So that should be updated too.

return (
<View style={styles.container}>
<ZulipTextIntl style={styles.text} text="That conversation doesn't seem to exist." />
<ZulipTextIntl style={styles.text} text="That conversation doesnt seem to exist." />
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We caught all the straight quotes in f8351f0a6, but it looks like
one crept in after that. Add this check so it doesn't happen again.

So to really accomplish that, there's one key step which this PR doesn't yet take: causing this suite to run in CI.

What runs in CI is controlled by .github/workflows/. That config is kept quite simple, with the interesting logic almost all pushed inside tools/test. The key bit is:

    - name: Run tools/test
      run: TERM=dumb tools/test --all

So basically it runs tools/test --all.

And --all doesn't currently change the set of suites to run:

  --all       In the given suites, run all tests on all files. Same as
              --all-files --platform both.

So that line of CI config is effectively relying on the fact that currently all suites run by default.


The crude fix here would be to add to that line of CI config a list of all the suites.

A cleaner fix would be to let the CI config just express "give me all the suites, whatever they are". Probably a good version of that is to change the definition of --all like so:

  --all       In the given suites, run all tests on all files, equivalently
              to \`--all-files --platform both\`. If no list of suites was specified,
              run all suites.

That way a simple tools/test --all does exactly what it sounds like it should do, while a command like tools/test --all lint still works as desired.

(Effectively --all has also been relying on the fact that all suites are run by default, so that this distinction doesn't matter.)

@chrisbobbe chrisbobbe force-pushed the pr-dangling-translation-strings branch from 91bdfe7 to 13d00d3 Compare January 13, 2023 01:48
@chrisbobbe
Copy link
Contributor Author

Thanks for the review! Revision pushed.

@gnprice
Copy link
Member

gnprice commented Jan 25, 2023

Thanks for the revision!

I think the new CLI-parsing code for handling non-default suites can be somewhat cleaner, so I'm pushing back here a version of the branch with an additional commit revising that. Everything else looks good — so if that change looks good to you, please merge at will.

@gnprice gnprice force-pushed the pr-dangling-translation-strings branch from 13d00d3 to 21f32e5 Compare January 25, 2023 02:53
Currently, all suites run by default, so the only functional change
is the changed usage-message text.
gnprice and others added 7 commits January 25, 2023 18:35
This makes for a bit less duplication in the code; fewer places
to have to update when adding a new suite.  The CLI-parsing logic
also gets a bit simpler.

I think this also makes the usage message a bit simpler to understand.
Like the existing version, it's a bit awkward as long as there are no
non-default suites; but we'll be adding one of those shortly.
…n.json

As discussed on CZO:
  https://chat.zulip.org/#narrow/stream/243-mobile-team/topic/Strings.20in.20messages_en.2Ejson.20but.20not.20in.20the.20app/near/1425641

Run with:
  tools/check-messages-en

This gives just one result, which we'll address next.

We add flow-parser at <0.159.0. We'd like a later version --
probably ideal would be to sync it with flow-bin, which is ^0.170.0
right now -- but facebook/flow@5de4ea57e, released in v0.159.0, was
a change that isn't yet handled by any version of ast-types [1]. A
fix for ast-types is open as the issue benjamn/ast-types#728 and PR
benjamn/ast-types#727.

[1] I get this error output:

/Users/chrisbobbe/dev/zulip-mobile/node_modules/ast-types/lib/types.js:669
            throw new Error("did not recognize object of type " +
            ^

Error: did not recognize object of type "PropertyDefinition"
    at Object.getFieldNames (/Users/chrisbobbe/dev/zulip-mobile/node_modules/ast-types/lib/types.js:669:19)
    at visitChildren (/Users/chrisbobbe/dev/zulip-mobile/node_modules/ast-types/lib/path-visitor.js:185:36)
    at Visitor.PVp.visitWithoutReset (/Users/chrisbobbe/dev/zulip-mobile/node_modules/ast-types/lib/path-visitor.js:167:20)
    at NodePath.each (/Users/chrisbobbe/dev/zulip-mobile/node_modules/ast-types/lib/path.js:88:26)
    at visitChildren (/Users/chrisbobbe/dev/zulip-mobile/node_modules/ast-types/lib/path-visitor.js:179:18)
    at Visitor.PVp.visitWithoutReset (/Users/chrisbobbe/dev/zulip-mobile/node_modules/ast-types/lib/path-visitor.js:167:20)
    at visitChildren (/Users/chrisbobbe/dev/zulip-mobile/node_modules/ast-types/lib/path-visitor.js:204:25)
    at Visitor.PVp.visitWithoutReset (/Users/chrisbobbe/dev/zulip-mobile/node_modules/ast-types/lib/path-visitor.js:167:20)
    at visitChildren (/Users/chrisbobbe/dev/zulip-mobile/node_modules/ast-types/lib/path-visitor.js:204:25)
    at Visitor.PVp.visitWithoutReset (/Users/chrisbobbe/dev/zulip-mobile/node_modules/ast-types/lib/path-visitor.js:167:20)
We ran an experiment to see how translators handled this form, with
the help of Transifex's special UI for translating plurals. Greg
concluded that we should go ahead and use translations for this
message:
  https://chat.zulip.org/#narrow/stream/58-translation/topic/ICU.20Message.20syntax/near/1422025

The recently added tools/check-messages-en flagged this as an unused
message in messages_en.json.
We caught all the straight quotes in f8351f0, but it looks like
one crept in after that. Add this check so it doesn't happen again.
@chrisbobbe chrisbobbe force-pushed the pr-dangling-translation-strings branch from 21f32e5 to ac59787 Compare January 26, 2023 02:35
@chrisbobbe chrisbobbe merged commit ac59787 into zulip:main Jan 26, 2023
@chrisbobbe
Copy link
Contributor Author

Great, thanks! Done.

@chrisbobbe chrisbobbe deleted the pr-dangling-translation-strings branch January 26, 2023 02:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

a-i18n Translation, localization, internationalization a-tools

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants