Skip to content

Conversation

@christian-byrne
Copy link
Contributor

@christian-byrne christian-byrne commented Jun 1, 2025

Adds cleanup step to i18n workflow to remove null entries that can occur when translation service encounters timing issues with node output changes.

┆Issue is synchronized with this Notion page by Unito

- name: Clean null values from translations
run: |
# Remove null entries from outputs arrays in all translated files
find src/locales/*/nodeDefs.json -not -path "*/en/*" -exec sed -i 's/,\s*null//g; s/null,\s*//g; s/\[\s*null\s*\]/[]/g' {} \;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this generated? It's doing some weird stuff. Where are we going to get ,\s*null?

For any bash script using sed, verification is reasonably important. It looks like the middle script is the only one that's actually wanted.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes this is generated. I'm currently testing by recreating conditions of issue detailed in #3967 and verifying it correctly solves the problem.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More context:

  Where the null values come from:
  The null values appear in the translated locale files (not the English source) when the OpenAI
  translation service encounters timing issues. Specifically, when node outputs are removed/added between
  script runs, the translation service sometimes produces malformed arrays like:
  "outputs": [
    null,
    null,
    null,
    {
      "name": "normal"
    }
  ]

  Instead of the correct object format:
  "outputs": {
    "0": {
      "name": "imagen"
    },
    "1": {
      "name": "mask"
    }
  }

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah fair enough! Must be doing something weird with arrays and having them serialise as objects. Originally couldn't find anything that matched when I searched.

@christian-byrne christian-byrne force-pushed the docs/node-def-translations branch from 2fc1427 to 01a6ac4 Compare June 1, 2025 12:46
@christian-byrne
Copy link
Contributor Author

christian-byrne commented Jun 1, 2025

Changes:

  • Convert from using sed inside workflow => using js script which also coerces malformed outputs to correct data structure, in addition to removing syntactically invalid null. Confirmed to work by testing with the malformed outputs generated here

@christian-byrne christian-byrne changed the title [fix] Prevent null values in translated node definition outputs [fix] Automatically fix malformed node def translations Jun 1, 2025
Adds cleanup step to i18n workflow to remove null entries that can occur
when translation service encounters timing issues with node output changes.

Also adds concise documentation for the node definition translation script.
@christian-byrne christian-byrne force-pushed the docs/node-def-translations branch from 01a6ac4 to 14ae679 Compare June 1, 2025 12:51
Copy link
Contributor

@webfiltered webfiltered left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assuming you're verifying by diffing the output of test vs the actual expected output, this looks great.

If the actual file contents haven't been checked, they should be. Reason is it looks like it works by:

  1. Removing nulls/undefined from the array and stuffing it back as an object
  2. If this invalidates the entire array, it should replace all outputs 👍

However if it simply adds any missing values (like it does elsewhere in i18n), then the first output or three will have the (completely) wrong translation. 🙅

@christian-byrne
Copy link
Contributor Author

This commit is entirely the result of running the script: 3bee3c5

@christian-byrne christian-byrne merged commit ec4ced2 into main Jun 1, 2025
10 checks passed
@christian-byrne christian-byrne deleted the docs/node-def-translations branch June 1, 2025 13:45
Copy link
Contributor

@huchenlei huchenlei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would expect claude write ts script under scripts/ and call it via tsx like other files under scripts/. Maybe add that instruction to CLAUDE.md?

@webfiltered
Copy link
Contributor

webfiltered commented Jun 1, 2025

image

@christian-byrne Reverting this for now. This script must be reviewed using the diff of a 100% valid, working run.

@christian-byrne
Copy link
Contributor Author

What's expected/correct behavior that the script should do in that screenshot?

@webfiltered
Copy link
Contributor

It's only setting "index 5" on an "array object" - which is a bit weird to begin with but we know about that. It's missing indices 0-4 (removed by script). So they'll be undefined instead of null.

So

{
  "0": {name:"n1"},
  "1": {name:"n2"},
  // ... 2, 3, and 4 go here
  "5": {name:"sanc_banana"}
}

@webfiltered
Copy link
Contributor

webfiltered commented Jun 2, 2025

Or if it really only has one output, not 6, then just {"0": {name:"sanc_banana"}}

Edit: Or if it's only translating input 6, then that's fine. But in this instance, it's not - the english section had all outputs listed. This is one of a few styles of broken things I noticed.

You can just get a working example (i18n had no timeouts) and a broken example of the same changes from another run, then diff. Is there a reason not to do this? It's basically the only thing worth verifying here.

@christian-byrne
Copy link
Contributor Author

christian-byrne commented Jun 2, 2025

I need to start from beginning to better understand original error that Terry fixed and work from there.

@christian-byrne
Copy link
Contributor Author

christian-byrne commented Jun 2, 2025

The immediately apparent solution is to fill the mistaken null values generated by lobe-i18n with the translations from a previous version if it exists (otherwise the untranslated name). It will require re-running the emulation and field extraction, as the order of outputs may have genuinely switched since last time the script ran. And it requires accessing the history of the translation file and parsing the relevant fields to determine if a previous translation can be used.

It is tempting to rewrite the translation automation process using better/maintained tooling. There are so many better tools for this, it doesn't make sense why we need to use a obscure open-source tool when we can easily make our own agent. But even in that case you may still want this type of validation process anyway.

Lowest effort solution is to accept (presumably) small margin of error with current tool and just add null check here:

#addOutputs(outputs: OutputSpec[]) {
for (const output of outputs) {
const { name, type, is_list } = output
const shapeOptions = is_list ? { shape: LiteGraph.GRID_SHAPE } : {}
const nameKey = `${this.#nodeKey}.outputs.${output.index}.name`
const typeKey = `dataTypes.${normalizeI18nKey(type)}`
const outputOptions = {
...shapeOptions,
// If the output name is different from the output type, use the output name.
// e.g.
// - type ("INT"); name ("Positive") => translate name
// - type ("FLOAT"); name ("FLOAT") => translate type
localized_name:
type !== name ? st(nameKey, name) : st(typeKey, name)
}
this.addOutput(name, type, outputOptions)
}
}

in which case the lost translation is just simply lost but everything still works/renders otherwise.

@webfiltered
Copy link
Contributor

Can't we just delete corrupt entries and rerun i18n?

@webfiltered
Copy link
Contributor

Actually let's just break this down today quickly.

@christian-byrne
Copy link
Contributor Author

The corruption is from lobe-cli itself

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants