-
Notifications
You must be signed in to change notification settings - Fork 34
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Test WASM Translations in CI #927
Conversation
807685a
to
38ad259
Compare
5af3410
to
2dc0b4f
Compare
This commit consolidates all preexisting inference-related gitignore directives into the `.gitignore` file in the `inference` directory.
This commit adds an eslint config for the WASM tests. I think this overall adheres to Mozilla's code-format preferences and is good enough for a first pass. I have found the linting config to be a bit finnicky, so my preference would be to improve the linting in a follow-up, if needed/desired at all.
This commit adds new tasks to run the eslint linter on relevant JavaScript files in the project, as well as hooks up the tasks to a kind.yml file to run in CI.
The name of the file that we use in the mozilla-unified source tree is `bergamot-translator.js`, but the name of the file generated here is `bergamot-translator-worker.js`. I wanted the names to match, so I am renaming the CMake directives that dictate the generated file's names such that the generated WASM code will be `bergamot-translator.js`. This is the first step of that process.
This is the second step of the previous commit, which renames the WASM-related directives to simply be `bergamot-translator`. This results in the generated JavaScript file being `bergamot-translator.js` instead of `bergamot-translator-worker.js`.
Given the issue where building the WASM within the Docker container fails on multiple threads only if the host operating system is macOS, I have moved that default logic within the script itself. The default can still be overridden by passing the `-j` flag, but rather than call sites having to know to do the "right" thing for macOS, I'm making it the default intrinsic behavior within the script.
2dc0b4f
to
e0f5c6a
Compare
f469772
to
9009504
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the clean commits, it made it really easy to review. I didn't test things locally at this time, but am relying on the CI to prove that it all works.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thought: These are quite big, and will live in the .git
object storage forever. I guess it's fine, given that it unblocks the current work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I wasn't considering that these would live in tree forever, but you are right.
I can't imagine that it's hard to set up git-lfs.
That is what we will eventually have once we merge the models repo.
I'm going to consider that a blocker for this PR and just do it now.
* | ||
* @type {RegExp} | ||
*/ | ||
const whitespaceRegex = /^(\s*)(.*?)(\s*)$/s; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thought: Hmm... this is a bit of a maintenance burden since we now have code duplication. A good example of this is in the whitespaceRegex code. It'll be a bit hard to keep the two implementations up to date with each other. I'm not sure what our long term plan here with this code. We should probably discuss it a bit (not in this PR). I know we had some conversation on Matrix regarding perf testing code, which is basically this same problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm definitely on the same page in general. I mentally justified doing this because this implementation is an extremely simplified/streamlined version of what is in the mozilla-unified
tree. That is, it doesn't deal with work queues with async time slices. It doesn't manage multiple translators with keep-alive timeouts etc.
The translations engine here just has room to translate one single message at a time, and that's all it does. I actually think there is a bit of value in having a bare-bones example of the connections without the Firefox complexity, even if there is a bit of code duplication.
Regarding the whitespaceRegex, I almost just excluded it entirely, and I might still take it out, so I'm going to send this comment back your way for a response. We don't really need to handle the whitespace here in our test cases.
So, a few follow-up questions:
- Do you similarly see value in having this bare-bones implementation even if some code is duplicated?
- Do you agree that we can remove the whitespaceRegex here in the tests, as that's more of a Firefox-algorithm data cleaning detail?
- Where would you like to have this broader conversation about code duplication?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Yes I think it's fine for now, more of a long term thinking.
- Yeah I think it can be removed, but doesn't hurt to land initially.
- I would say let me do some of my perf work in here to get a feel for the repo and how it feels to work with it, then we can follow up in Matrix or a team meeting to discuss. If we need to we can write a document, but I don't think it needs to be that formal.
subprocess.run(["npm", "install"], cwd=WASM_TESTS_PATH, check=True) | ||
|
||
print("\n📊 Running Translations WASM JS tests\n") | ||
subprocess.run(["npm", "run", "test"], cwd=WASM_TESTS_PATH, check=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion: In CI I would rather have all this output actually show up so it's easy to deal with test failures. In jest you can have each test that is run enumerated in the log output. It's not clear from the task output that the tests actually ran, or what tests were run.
[task 2024-11-12T23:03:00.175Z] 📂 Decompressing model files required for WASM testing
[task 2024-11-12T23:03:00.175Z]
[task 2024-11-12T23:03:00.175Z] Decompressed models in /builds/worker/checkouts/vcs/inference/wasm/tests/models
[task 2024-11-12T23:03:00.175Z]
[task 2024-11-12T23:03:00.175Z]
[task 2024-11-12T23:03:00.175Z] 🔧 Installing npm dependencies for WASM JS tests
[task 2024-11-12T23:03:00.175Z]
[task 2024-11-12T23:03:00.175Z]
[task 2024-11-12T23:03:00.175Z] 📊 Running Translations WASM JS tests
[task 2024-11-12T23:03:00.175Z]
[task 2024-11-12T23:03:00.175Z]
[task 2024-11-12T23:03:00.175Z] ✅ test-wasm.py completed successfully.
[task 2024-11-12T23:03:00.175Z]
[taskcluster 2024-11-12 23:03:00.768Z] === Task Finished ===
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, you know, this is odd.
Locally, the output looks like this:
🚀 Starting build-wasm.py
🏗️ Build directory /home/nordzilla/src/translations/inference/build-wasm already exists.
Pass the --clobber flag to rebuild if desired.
📁 Copying generated build artifacts to the WASM test directory
Copied the following artifacts to /home/nordzilla/src/translations/inference/wasm/tests/generated:
- /home/nordzilla/src/translations/inference/build-wasm/bergamot-translator.js
- /home/nordzilla/src/translations/inference/build-wasm/bergamot-translator.wasm
🔑 Calculating SHA-256 hash of /home/nordzilla/src/translations/inference/build-wasm/bergamot-translator.js
Hash of /home/nordzilla/src/translations/inference/build-wasm/bergamot-translator.js written to
/home/nordzilla/src/translations/inference/wasm/tests/generated/bergamot-translator.js.sha256
📂 Decompressing model files required for WASM testing
Decompressed models in /home/nordzilla/src/translations/inference/wasm/tests/models
🔧 Installing npm dependencies for WASM JS tests
(node:111180) ExperimentalWarning: CommonJS module /home/nordzilla/.nvm/versions/node/v23.1.0/lib/node_modules/npm/node_modules/debug/src/node.js is loading ES Module /home/nordzilla/.nvm/versions/node/v23.1.0/lib/node_modules/npm/node_modules/supports-color/index.js using require().
Support for loading ES Module in require() is an experimental feature and might change at any time
(Use `node --trace-warnings ...` to show where the warning was created)
up to date, audited 135 packages in 1s
38 packages are looking for funding
run `npm fund` for details
found 0 vulnerabilities
📊 Running Translations WASM JS tests
> [email protected] test
> vitest --run
RUN v2.1.4 /home/nordzilla/src/translations/inference/wasm/tests
✓ test-cases/translate-html-no-pivot.test.mjs (4) 1578ms
✓ HTML Non-Pivot Translations (4) 1578ms
✓ (es -> en): Translate "<b>El perro</b> azul." 374ms
✓ (en -> es): Translate "<b>The blue</b> dog." 423ms
✓ (fr -> en): Translate "<b>Le chien</b> bleu." 406ms
✓ (en -> fr): Translate "<b>The blue</b> dog." 374ms
✓ test-cases/translate-html-with-pivot.test.mjs (2) 1271ms
✓ HTML Pivot Translations (2) 1271ms
✓ (es -> fr): Translate "<b>El perro</b> azul." 610ms
✓ (fr -> es): Translate "<b>Le chien</b> bleu." 660ms
✓ test-cases/translate-plain-text-no-pivot.test.mjs (4) 1601ms
✓ Plain-Text Non-Pivot Translations (4) 1601ms
✓ (es -> en): Translate "Hola mundo" 384ms
✓ (en -> es): Translate "Hello world" 415ms
✓ (fr -> en): Translate "Bonjour le monde" 414ms
✓ (en -> fr): Translate "Hello world" 387ms
✓ test-cases/translate-plain-text-with-pivot.test.mjs (2) 1263ms
✓ Plain-Text Pivot Translations (2) 1263ms
✓ (es -> fr): Translate "El perro azul." 592ms
✓ (fr -> es): Translate "Le chien bleu." 671ms
Test Files 4 passed (4)
Tests 12 passed (12)
Start at 12:32:46
Duration 1.94s (transform 42ms, setup 0ms, collect 98ms, tests 5.71s, environment 1ms, prepare 230ms)
✅ test-wasm.py completed successfully.
But if you look at the link you pasted for the logs, the output from anything related to node
is inserted above the entire Python script's output:
[task 2024-11-12T23:02:55.451Z] (node:8372) ExperimentalWarning: CommonJS module /builds/worker/.nvm/versions/node/v23.1.0/lib/node_modules/npm/node_modules/debug/src/node.js is loading ES Module /builds/worker/.nvm/versions/node/v23.1.0/lib/node_modules/npm/node_modules/supports-color/index.js using require().
[task 2024-11-12T23:02:55.451Z] Support for loading ES Module in require() is an experimental feature and might change at any time
[task 2024-11-12T23:02:55.451Z] (Use `node --trace-warnings ...` to show where the warning was created)
[task 2024-11-12T23:02:57.257Z]
[task 2024-11-12T23:02:57.257Z] added 134 packages, and audited 135 packages in 2s
[task 2024-11-12T23:02:57.258Z]
[task 2024-11-12T23:02:57.258Z] 38 packages are looking for funding
[task 2024-11-12T23:02:57.258Z] run `npm fund` for details
[task 2024-11-12T23:02:57.259Z]
[task 2024-11-12T23:02:57.259Z] found 0 vulnerabilities
[task 2024-11-12T23:02:57.385Z]
[task 2024-11-12T23:02:57.385Z] > [email protected] test
[task 2024-11-12T23:02:57.385Z] > vitest --run
[task 2024-11-12T23:02:57.385Z]
[task 2024-11-12T23:02:57.697Z]
[task 2024-11-12T23:02:57.697Z] RUN v2.1.4 /builds/worker/checkouts/vcs/inference/wasm/tests
[task 2024-11-12T23:02:57.697Z]
[task 2024-11-12T23:02:59.706Z] ✓ test-cases/translate-plain-text-with-pivot.test.mjs (2 tests) 1715ms
[task 2024-11-12T23:02:59.706Z] ✓ Plain-Text Pivot Translations > (es -> fr): Translate "El perro azul." 853ms
[task 2024-11-12T23:02:59.706Z] ✓ Plain-Text Pivot Translations > (fr -> es): Translate "Le chien bleu." 860ms
[task 2024-11-12T23:02:59.758Z] ✓ test-cases/translate-html-with-pivot.test.mjs (2 tests) 1768ms
[task 2024-11-12T23:02:59.759Z] ✓ HTML Pivot Translations > (es -> fr): Translate "<b>El perro</b> azul." 882ms
[task 2024-11-12T23:02:59.759Z] ✓ HTML Pivot Translations > (fr -> es): Translate "<b>Le chien</b> bleu." 885ms
[task 2024-11-12T23:03:00.093Z] ✓ test-cases/translate-plain-text-no-pivot.test.mjs (4 tests) 2103ms
[task 2024-11-12T23:03:00.093Z] ✓ Plain-Text Non-Pivot Translations > (es -> en): Translate "Hola mundo" 545ms
[task 2024-11-12T23:03:00.093Z] ✓ Plain-Text Non-Pivot Translations > (en -> es): Translate "Hello world" 515ms
[task 2024-11-12T23:03:00.093Z] ✓ Plain-Text Non-Pivot Translations > (fr -> en): Translate "Bonjour le monde" 544ms
[task 2024-11-12T23:03:00.093Z] ✓ Plain-Text Non-Pivot Translations > (en -> fr): Translate "Hello world" 498ms
[task 2024-11-12T23:03:00.129Z] ✓ test-cases/translate-html-no-pivot.test.mjs (4 tests) 2136ms
[task 2024-11-12T23:03:00.129Z] ✓ HTML Non-Pivot Translations > (es -> en): Translate "<b>El perro</b> azul." 549ms
[task 2024-11-12T23:03:00.129Z] ✓ HTML Non-Pivot Translations > (en -> es): Translate "<b>The blue</b> dog." 555ms
[task 2024-11-12T23:03:00.129Z] ✓ HTML Non-Pivot Translations > (fr -> en): Translate "<b>Le chien</b> bleu." 532ms
[task 2024-11-12T23:03:00.129Z] ✓ HTML Non-Pivot Translations > (en -> fr): Translate "<b>The blue</b> dog." 498ms
[task 2024-11-12T23:03:00.152Z]
[task 2024-11-12T23:03:00.153Z] Test Files 4 passed (4)
[task 2024-11-12T23:03:00.153Z] Tests 12 passed (12)
[task 2024-11-12T23:03:00.154Z] Start at 23:02:57
[task 2024-11-12T23:03:00.154Z] Duration 2.45s (transform 46ms, setup 0ms, collect 98ms, tests 7.72s, environment 1ms, prepare 325ms)
[task 2024-11-12T23:03:00.154Z]
[task 2024-11-12T23:03:00.175Z]
[task 2024-11-12T23:03:00.175Z] 🚀 Starting build-wasm.py
[task 2024-11-12T23:03:00.175Z]
[task 2024-11-12T23:03:00.175Z] 📁 Copying generated build artifacts to the WASM test directory
[task 2024-11-12T23:03:00.175Z]
[task 2024-11-12T23:03:00.175Z] Copied the following artifacts to /builds/worker/checkouts/vcs/inference/wasm/tests/generated:
[task 2024-11-12T23:03:00.175Z] - /builds/worker/checkouts/vcs/inference/build-wasm/bergamot-translator.js
[task 2024-11-12T23:03:00.175Z] - /builds/worker/checkouts/vcs/inference/build-wasm/bergamot-translator.wasm
[task 2024-11-12T23:03:00.175Z]
[task 2024-11-12T23:03:00.175Z] 🔑 Calculating SHA-256 hash of /builds/worker/checkouts/vcs/inference/build-wasm/bergamot-translator.js
[task 2024-11-12T23:03:00.175Z]
[task 2024-11-12T23:03:00.175Z] Hash of /builds/worker/checkouts/vcs/inference/build-wasm/bergamot-translator.js written to
[task 2024-11-12T23:03:00.175Z] /builds/worker/checkouts/vcs/inference/wasm/tests/generated/bergamot-translator.js.sha256
[task 2024-11-12T23:03:00.175Z]
[task 2024-11-12T23:03:00.175Z] 📂 Decompressing model files required for WASM testing
[task 2024-11-12T23:03:00.175Z]
[task 2024-11-12T23:03:00.175Z] Decompressed models in /builds/worker/checkouts/vcs/inference/wasm/tests/models
[task 2024-11-12T23:03:00.175Z]
[task 2024-11-12T23:03:00.175Z]
[task 2024-11-12T23:03:00.175Z] 🔧 Installing npm dependencies for WASM JS tests
[task 2024-11-12T23:03:00.175Z]
[task 2024-11-12T23:03:00.175Z]
[task 2024-11-12T23:03:00.175Z] 📊 Running Translations WASM JS tests
[task 2024-11-12T23:03:00.175Z]
[task 2024-11-12T23:03:00.175Z]
[task 2024-11-12T23:03:00.175Z] ✅ test-wasm.py completed successfully.
I don't know why it's out of order in the taskcluster logs. Maybe just a Taskcluster oddity (cc @bhearsum)?
In any case, once we have more tests, I think it will be harder to miss. It is there, though.
9f2678b
to
475cdef
Compare
I'm going to have to re-request your reviews here after applying review feedback. For @bhearsum I've added a new commit that affects taskcluster docker images: It made the most sense to me to modify the There is a Matrix discussion thread that you can read about my approach here. I also have a question for you in this comment. For @gregtatum Add a subset of models for testing using git-lfs now adds only the Extract test models from archives in test-wasm.py now pulls the files via This ensures that we're not checking the large model files into tree. |
This moves the logic that is currently in the mozilla-unified tree, of adding the licensing, and wrapping the generated WASM JavaScript module in a function. This will be paired with a downstream-pr that removes this step on the mozilla-unified end.
This commit adds some Typescript bindings to the test directory that match the generated JS. I spent some time trying to get emscripten to generate these automatically, but I gave up on my time-boxed effort.
475cdef
to
b4def84
Compare
c57bb80
to
2753894
Compare
This commit adds support for pulling files via `git-lfs` to the Dockerfile for the base docker image. In order to pull the files, we need to install `git-lfs` from apt, but also add github.com to the list of known ssh hosts.
This commit adds the gzipped artifacts for * `enes` * `enfr` * `esen` * `fren` These are used for testing for the moment, but I view this as a temporary solution that is good enough for this PR. In the future, we will need to merge the `firefox-translations-models` repository here.
This commit adds a Python script for testing the WASM, which runs the WASM build script (if needed), and then invokes the test runner.
This commit modifies the new `test-wasm.py` script to extract the model artifacts from their gzipped files in the `models` test directory. The non-gzip artifacts are ignored in the .gitignore, as well as removed in the clean script.
This commit taks the WASM artifacts generated by the build script and copies them to a directory for use in tests.
This commit computes a hash of the generated JavaScript, since the test runner adds it to the worker global scope using `eval`. This ensures that our test runner will only `eval` the intended script.
This commit adds a minimal API surface of the WorkerGlobalScope API functionality that we use for Translations within Firefox, wrapping the Node.js worker_threads equivalent behavior underneath. This allows us to test the generated code in a Node.js environment with the same API calls that we use in Firefox.
This commit adds a simplified and minimal implementation of our Translations Engine from the mozilla-unified source tree, which is capable of starting a web-worker translator between a given language pair and translating a single message at a time.
Adds test cases that test the current translation functionality end-to-end, including plaint-text translations and HTML translations.
2753894
to
c8380a8
Compare
Okay @bhearsum, I believe I've fixed all of the issues with the docker The primary commit is still Add support for git-lfs to base docker image |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes look good to me!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Looks like it worked
Git LFS file not shown
This PR adds functionality to test our generated JavaScript WASM translations end-to-end in CI, as well as adding an
eslint
config/task for the new JS files.This is a fairly large changeset, but each commit is broken out into a purposeful set of changes. I recommend reviewing commit by commit.
@mozilla/releng @bhearsum I think the only directly related commit for you is Add new lint tasks for eslint, which modifies the relevant
kind.yml
file.@gregtatum all of the new functionality can be run/tested locally with the following commands: