Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Test WASM Translations in CI #927

Merged
merged 17 commits into from
Nov 14, 2024
Merged

Test WASM Translations in CI #927

merged 17 commits into from
Nov 14, 2024

Conversation

nordzilla
Copy link
Contributor

@nordzilla nordzilla commented Nov 11, 2024

This PR adds functionality to test our generated JavaScript WASM translations end-to-end in CI, as well as adding an eslint config/task for the new JS files.

This is a fairly large changeset, but each commit is broken out into a purposeful set of changes. I recommend reviewing commit by commit.


@mozilla/releng @bhearsum I think the only directly related commit for you is Add new lint tasks for eslint, which modifies the relevant kind.yml file.


@gregtatum all of the new functionality can be run/tested locally with the following commands:

task lint-eslint
task lint-eslint-fix
task lint
task lint-fix
task docker-run -- task inference-clean
task docker-run -- task inference-test-wasm

@nordzilla nordzilla changed the title Test Bergamot WASM Translations in CI Test WASM Translations in CI Nov 11, 2024
@nordzilla nordzilla force-pushed the wasm-tests branch 4 times, most recently from 5af3410 to 2dc0b4f Compare November 12, 2024 03:38
nordzilla and others added 6 commits November 12, 2024 11:34
This commit consolidates all preexisting inference-related gitignore
directives into the `.gitignore` file in the `inference` directory.
This commit adds an eslint config for the WASM tests.
I think this overall adheres to Mozilla's code-format
preferences and is good enough for a first pass.

I have found the linting config to be a bit finnicky,
so my preference would be to improve the linting in a
follow-up, if needed/desired at all.
This commit adds new tasks to run the eslint linter
on relevant JavaScript files in the project, as well
as hooks up the tasks to a kind.yml file to run in CI.
The name of the file that we use in the mozilla-unified source tree
is `bergamot-translator.js`, but the name of the file generated here
is `bergamot-translator-worker.js`.

I wanted the names to match, so I am renaming the CMake directives
that dictate the generated file's names such that the generated
WASM code will be `bergamot-translator.js`.

This is the first step of that process.
This is the second step of the previous commit, which renames
the WASM-related directives to simply be `bergamot-translator`.

This results in the generated JavaScript file being `bergamot-translator.js`
instead of `bergamot-translator-worker.js`.
Given the issue where building the WASM within the Docker container 
fails on multiple threads only if the host operating system is macOS,
I have moved that default logic within the script itself. 

The default can still be overridden by passing the `-j` flag, but rather
than call sites having to know to do the "right" thing for macOS, I'm making 
it the default intrinsic behavior within the script.
@nordzilla nordzilla marked this pull request as ready for review November 12, 2024 18:28
@nordzilla nordzilla requested review from a team as code owners November 12, 2024 18:28
@nordzilla nordzilla force-pushed the wasm-tests branch 2 times, most recently from f469772 to 9009504 Compare November 12, 2024 22:44
Copy link
Member

@gregtatum gregtatum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the clean commits, it made it really easy to review. I didn't test things locally at this time, but am relying on the CI to prove that it all works.

inference/CMakeLists.txt Show resolved Hide resolved
inference/wasm/bindings/bergamot-translator.d.ts Outdated Show resolved Hide resolved
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thought: These are quite big, and will live in the .git object storage forever. I guess it's fine, given that it unblocks the current work.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I wasn't considering that these would live in tree forever, but you are right.

I can't imagine that it's hard to set up git-lfs.

That is what we will eventually have once we merge the models repo.

I'm going to consider that a blocker for this PR and just do it now.

*
* @type {RegExp}
*/
const whitespaceRegex = /^(\s*)(.*?)(\s*)$/s;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thought: Hmm... this is a bit of a maintenance burden since we now have code duplication. A good example of this is in the whitespaceRegex code. It'll be a bit hard to keep the two implementations up to date with each other. I'm not sure what our long term plan here with this code. We should probably discuss it a bit (not in this PR). I know we had some conversation on Matrix regarding perf testing code, which is basically this same problem.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm definitely on the same page in general. I mentally justified doing this because this implementation is an extremely simplified/streamlined version of what is in the mozilla-unified tree. That is, it doesn't deal with work queues with async time slices. It doesn't manage multiple translators with keep-alive timeouts etc.

The translations engine here just has room to translate one single message at a time, and that's all it does. I actually think there is a bit of value in having a bare-bones example of the connections without the Firefox complexity, even if there is a bit of code duplication.

Regarding the whitespaceRegex, I almost just excluded it entirely, and I might still take it out, so I'm going to send this comment back your way for a response. We don't really need to handle the whitespace here in our test cases.

So, a few follow-up questions:

  1. Do you similarly see value in having this bare-bones implementation even if some code is duplicated?
  2. Do you agree that we can remove the whitespaceRegex here in the tests, as that's more of a Firefox-algorithm data cleaning detail?
  3. Where would you like to have this broader conversation about code duplication?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Yes I think it's fine for now, more of a long term thinking.
  2. Yeah I think it can be removed, but doesn't hurt to land initially.
  3. I would say let me do some of my perf work in here to get a feel for the repo and how it feels to work with it, then we can follow up in Matrix or a team meeting to discuss. If we need to we can write a document, but I don't think it needs to be that formal.

inference/wasm/tests/test-cases/shared.mjs Show resolved Hide resolved
subprocess.run(["npm", "install"], cwd=WASM_TESTS_PATH, check=True)

print("\n📊 Running Translations WASM JS tests\n")
subprocess.run(["npm", "run", "test"], cwd=WASM_TESTS_PATH, check=True)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: In CI I would rather have all this output actually show up so it's easy to deal with test failures. In jest you can have each test that is run enumerated in the log output. It's not clear from the task output that the tests actually ran, or what tests were run.

https://github.com/mozilla/translations/pull/927/checks?check_run_id=32892931631

[task 2024-11-12T23:03:00.175Z] 📂 Decompressing model files required for WASM testing
[task 2024-11-12T23:03:00.175Z] 
[task 2024-11-12T23:03:00.175Z]    Decompressed models in /builds/worker/checkouts/vcs/inference/wasm/tests/models
[task 2024-11-12T23:03:00.175Z] 
[task 2024-11-12T23:03:00.175Z] 
[task 2024-11-12T23:03:00.175Z] 🔧 Installing npm dependencies for WASM JS tests
[task 2024-11-12T23:03:00.175Z] 
[task 2024-11-12T23:03:00.175Z] 
[task 2024-11-12T23:03:00.175Z] 📊 Running Translations WASM JS tests
[task 2024-11-12T23:03:00.175Z] 
[task 2024-11-12T23:03:00.175Z] 
[task 2024-11-12T23:03:00.175Z] ✅ test-wasm.py completed successfully.
[task 2024-11-12T23:03:00.175Z] 
[taskcluster 2024-11-12 23:03:00.768Z] === Task Finished ===

Copy link
Contributor Author

@nordzilla nordzilla Nov 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, you know, this is odd.

Locally, the output looks like this:

🚀 Starting build-wasm.py

🏗️  Build directory /home/nordzilla/src/translations/inference/build-wasm already exists.

   Pass the --clobber flag to rebuild if desired.

📁 Copying generated build artifacts to the WASM test directory

   Copied the following artifacts to /home/nordzilla/src/translations/inference/wasm/tests/generated:
     - /home/nordzilla/src/translations/inference/build-wasm/bergamot-translator.js
     - /home/nordzilla/src/translations/inference/build-wasm/bergamot-translator.wasm

🔑 Calculating SHA-256 hash of /home/nordzilla/src/translations/inference/build-wasm/bergamot-translator.js

   Hash of /home/nordzilla/src/translations/inference/build-wasm/bergamot-translator.js written to
   /home/nordzilla/src/translations/inference/wasm/tests/generated/bergamot-translator.js.sha256

📂 Decompressing model files required for WASM testing

   Decompressed models in /home/nordzilla/src/translations/inference/wasm/tests/models


🔧 Installing npm dependencies for WASM JS tests

(node:111180) ExperimentalWarning: CommonJS module /home/nordzilla/.nvm/versions/node/v23.1.0/lib/node_modules/npm/node_modules/debug/src/node.js is loading ES Module /home/nordzilla/.nvm/versions/node/v23.1.0/lib/node_modules/npm/node_modules/supports-color/index.js using require().
Support for loading ES Module in require() is an experimental feature and might change at any time
(Use `node --trace-warnings ...` to show where the warning was created)

up to date, audited 135 packages in 1s

38 packages are looking for funding
  run `npm fund` for details

found 0 vulnerabilities

📊 Running Translations WASM JS tests


> [email protected] test
> vitest --run


 RUN  v2.1.4 /home/nordzilla/src/translations/inference/wasm/tests

 ✓ test-cases/translate-html-no-pivot.test.mjs (4) 1578ms
   ✓ HTML Non-Pivot Translations (4) 1578ms
     ✓ (es -> en): Translate "<b>El perro</b> azul." 374ms
     ✓ (en -> es): Translate "<b>The blue</b> dog." 423ms
     ✓ (fr -> en): Translate "<b>Le chien</b> bleu." 406ms
     ✓ (en -> fr): Translate "<b>The blue</b> dog." 374ms
 ✓ test-cases/translate-html-with-pivot.test.mjs (2) 1271ms
   ✓ HTML Pivot Translations (2) 1271ms
     ✓ (es -> fr): Translate "<b>El perro</b> azul." 610ms
     ✓ (fr -> es): Translate "<b>Le chien</b> bleu." 660ms
 ✓ test-cases/translate-plain-text-no-pivot.test.mjs (4) 1601ms
   ✓ Plain-Text Non-Pivot Translations (4) 1601ms
     ✓ (es -> en): Translate "Hola mundo" 384ms
     ✓ (en -> es): Translate "Hello world" 415ms
     ✓ (fr -> en): Translate "Bonjour le monde" 414ms
     ✓ (en -> fr): Translate "Hello world" 387ms
 ✓ test-cases/translate-plain-text-with-pivot.test.mjs (2) 1263ms
   ✓ Plain-Text Pivot Translations (2) 1263ms
     ✓ (es -> fr): Translate "El perro azul." 592ms
     ✓ (fr -> es): Translate "Le chien bleu." 671ms

 Test Files  4 passed (4)
      Tests  12 passed (12)
   Start at  12:32:46
   Duration  1.94s (transform 42ms, setup 0ms, collect 98ms, tests 5.71s, environment 1ms, prepare 230ms)


✅ test-wasm.py completed successfully.

But if you look at the link you pasted for the logs, the output from anything related to node is inserted above the entire Python script's output:

[task 2024-11-12T23:02:55.451Z] (node:8372) ExperimentalWarning: CommonJS module /builds/worker/.nvm/versions/node/v23.1.0/lib/node_modules/npm/node_modules/debug/src/node.js is loading ES Module /builds/worker/.nvm/versions/node/v23.1.0/lib/node_modules/npm/node_modules/supports-color/index.js using require().
[task 2024-11-12T23:02:55.451Z] Support for loading ES Module in require() is an experimental feature and might change at any time
[task 2024-11-12T23:02:55.451Z] (Use `node --trace-warnings ...` to show where the warning was created)
[task 2024-11-12T23:02:57.257Z] 
[task 2024-11-12T23:02:57.257Z] added 134 packages, and audited 135 packages in 2s
[task 2024-11-12T23:02:57.258Z] 
[task 2024-11-12T23:02:57.258Z] 38 packages are looking for funding
[task 2024-11-12T23:02:57.258Z]   run `npm fund` for details
[task 2024-11-12T23:02:57.259Z] 
[task 2024-11-12T23:02:57.259Z] found 0 vulnerabilities
[task 2024-11-12T23:02:57.385Z] 
[task 2024-11-12T23:02:57.385Z] > [email protected] test
[task 2024-11-12T23:02:57.385Z] > vitest --run
[task 2024-11-12T23:02:57.385Z] 
[task 2024-11-12T23:02:57.697Z] 
[task 2024-11-12T23:02:57.697Z]  RUN  v2.1.4 /builds/worker/checkouts/vcs/inference/wasm/tests
[task 2024-11-12T23:02:57.697Z] 
[task 2024-11-12T23:02:59.706Z]  ✓ test-cases/translate-plain-text-with-pivot.test.mjs  (2 tests) 1715ms
[task 2024-11-12T23:02:59.706Z]    ✓ Plain-Text Pivot Translations > (es -> fr): Translate "El perro azul." 853ms
[task 2024-11-12T23:02:59.706Z]    ✓ Plain-Text Pivot Translations > (fr -> es): Translate "Le chien bleu." 860ms
[task 2024-11-12T23:02:59.758Z]  ✓ test-cases/translate-html-with-pivot.test.mjs  (2 tests) 1768ms
[task 2024-11-12T23:02:59.759Z]    ✓ HTML Pivot Translations > (es -> fr): Translate "<b>El perro</b> azul." 882ms
[task 2024-11-12T23:02:59.759Z]    ✓ HTML Pivot Translations > (fr -> es): Translate "<b>Le chien</b> bleu." 885ms
[task 2024-11-12T23:03:00.093Z]  ✓ test-cases/translate-plain-text-no-pivot.test.mjs  (4 tests) 2103ms
[task 2024-11-12T23:03:00.093Z]    ✓ Plain-Text Non-Pivot Translations > (es -> en): Translate "Hola mundo" 545ms
[task 2024-11-12T23:03:00.093Z]    ✓ Plain-Text Non-Pivot Translations > (en -> es): Translate "Hello world" 515ms
[task 2024-11-12T23:03:00.093Z]    ✓ Plain-Text Non-Pivot Translations > (fr -> en): Translate "Bonjour le monde" 544ms
[task 2024-11-12T23:03:00.093Z]    ✓ Plain-Text Non-Pivot Translations > (en -> fr): Translate "Hello world" 498ms
[task 2024-11-12T23:03:00.129Z]  ✓ test-cases/translate-html-no-pivot.test.mjs  (4 tests) 2136ms
[task 2024-11-12T23:03:00.129Z]    ✓ HTML Non-Pivot Translations > (es -> en): Translate "<b>El perro</b> azul." 549ms
[task 2024-11-12T23:03:00.129Z]    ✓ HTML Non-Pivot Translations > (en -> es): Translate "<b>The blue</b> dog." 555ms
[task 2024-11-12T23:03:00.129Z]    ✓ HTML Non-Pivot Translations > (fr -> en): Translate "<b>Le chien</b> bleu." 532ms
[task 2024-11-12T23:03:00.129Z]    ✓ HTML Non-Pivot Translations > (en -> fr): Translate "<b>The blue</b> dog." 498ms
[task 2024-11-12T23:03:00.152Z] 
[task 2024-11-12T23:03:00.153Z]  Test Files  4 passed (4)
[task 2024-11-12T23:03:00.153Z]       Tests  12 passed (12)
[task 2024-11-12T23:03:00.154Z]    Start at  23:02:57
[task 2024-11-12T23:03:00.154Z]    Duration  2.45s (transform 46ms, setup 0ms, collect 98ms, tests 7.72s, environment 1ms, prepare 325ms)
[task 2024-11-12T23:03:00.154Z] 
[task 2024-11-12T23:03:00.175Z] 
[task 2024-11-12T23:03:00.175Z] 🚀 Starting build-wasm.py
[task 2024-11-12T23:03:00.175Z] 
[task 2024-11-12T23:03:00.175Z] 📁 Copying generated build artifacts to the WASM test directory
[task 2024-11-12T23:03:00.175Z] 
[task 2024-11-12T23:03:00.175Z]    Copied the following artifacts to /builds/worker/checkouts/vcs/inference/wasm/tests/generated:
[task 2024-11-12T23:03:00.175Z]      - /builds/worker/checkouts/vcs/inference/build-wasm/bergamot-translator.js
[task 2024-11-12T23:03:00.175Z]      - /builds/worker/checkouts/vcs/inference/build-wasm/bergamot-translator.wasm
[task 2024-11-12T23:03:00.175Z] 
[task 2024-11-12T23:03:00.175Z] 🔑 Calculating SHA-256 hash of /builds/worker/checkouts/vcs/inference/build-wasm/bergamot-translator.js
[task 2024-11-12T23:03:00.175Z] 
[task 2024-11-12T23:03:00.175Z]    Hash of /builds/worker/checkouts/vcs/inference/build-wasm/bergamot-translator.js written to
[task 2024-11-12T23:03:00.175Z]    /builds/worker/checkouts/vcs/inference/wasm/tests/generated/bergamot-translator.js.sha256
[task 2024-11-12T23:03:00.175Z] 
[task 2024-11-12T23:03:00.175Z] 📂 Decompressing model files required for WASM testing
[task 2024-11-12T23:03:00.175Z] 
[task 2024-11-12T23:03:00.175Z]    Decompressed models in /builds/worker/checkouts/vcs/inference/wasm/tests/models
[task 2024-11-12T23:03:00.175Z] 
[task 2024-11-12T23:03:00.175Z] 
[task 2024-11-12T23:03:00.175Z] 🔧 Installing npm dependencies for WASM JS tests
[task 2024-11-12T23:03:00.175Z] 
[task 2024-11-12T23:03:00.175Z] 
[task 2024-11-12T23:03:00.175Z] 📊 Running Translations WASM JS tests
[task 2024-11-12T23:03:00.175Z] 
[task 2024-11-12T23:03:00.175Z] 
[task 2024-11-12T23:03:00.175Z] ✅ test-wasm.py completed successfully.

I don't know why it's out of order in the taskcluster logs. Maybe just a Taskcluster oddity (cc @bhearsum)?

In any case, once we have more tests, I think it will be harder to miss. It is there, though.

@nordzilla
Copy link
Contributor Author

nordzilla commented Nov 13, 2024

@bhearsum @gregtatum

I'm going to have to re-request your reviews here after applying review feedback.


For @bhearsum

I've added a new commit that affects taskcluster docker images:

It made the most sense to me to modify the base docker image, since the configuration for git-lfs is at the project root in the .gitattributes file.

There is a Matrix discussion thread that you can read about my approach here.

I also have a question for you in this comment.


For @gregtatum

Add a subset of models for testing using git-lfs now adds only the git-lfs definitions instead of the whole model archive.

Extract test models from archives in test-wasm.py now pulls the files via git-lfs before extracting them from their archives.

This ensures that we're not checking the large model files into tree.

This moves the logic that is currently in the mozilla-unified tree,
of adding the licensing, and wrapping the generated WASM JavaScript
module in a function.

This will be paired with a downstream-pr that removes this step on
the mozilla-unified end.
This commit adds some Typescript bindings to the
test directory that match the generated JS.

I spent some time trying to get emscripten to generate
these automatically, but I gave up on my time-boxed effort.
@nordzilla nordzilla force-pushed the wasm-tests branch 5 times, most recently from c57bb80 to 2753894 Compare November 14, 2024 16:31
nordzilla and others added 9 commits November 14, 2024 11:02
This commit adds support for pulling files via `git-lfs`
to the Dockerfile for the base docker image. In order
to pull the files, we need to install `git-lfs` from apt,
but also add github.com to the list of known ssh hosts.
This commit adds the gzipped artifacts for

* `enes`
* `enfr`
* `esen`
* `fren`

These are used for testing for the moment, but I view this as a
temporary solution that is good enough for this PR.

In the future, we will need to merge the `firefox-translations-models`
repository here.
This commit adds a Python script for testing the WASM,
which runs the WASM build script (if needed), and then
invokes the test runner.
This commit modifies the new `test-wasm.py` script to
extract the model artifacts from their gzipped files
in the `models` test directory.

The non-gzip artifacts are ignored in the .gitignore,
as well as removed in the clean script.
This commit taks the WASM artifacts generated by the
build script and copies them to a directory for use
in tests.
This commit computes a hash of the generated JavaScript,
since the test runner adds it to the worker global scope
using `eval`. This ensures that our test runner will only 
`eval` the intended script.
This commit adds a minimal API surface of the WorkerGlobalScope
API functionality that we use for Translations within Firefox,
wrapping the Node.js worker_threads equivalent behavior underneath.

This allows us to test the generated code in a Node.js environment
with the same API calls that we use in Firefox.
This commit adds a simplified and minimal implementation of our
Translations Engine from the mozilla-unified source tree, which
is capable of starting a web-worker translator between a given
language pair and translating a single message at a time.
Adds test cases that test the current translation functionality
end-to-end, including plaint-text translations and HTML translations.
@nordzilla
Copy link
Contributor Author

Okay @bhearsum, I believe I've fixed all of the issues with the docker known_hosts.

The primary commit is still Add support for git-lfs to base docker image

@nordzilla nordzilla requested a review from bhearsum November 14, 2024 17:21
Copy link
Member

@gregtatum gregtatum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes look good to me!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks like it worked

Git LFS file not shown

@nordzilla nordzilla merged commit 60e8730 into main Nov 14, 2024
12 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants