Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Localization differences - Address EmbedLiveSample case with its "localized" argument #4971

Open
Tracked by #5216
SphinxKnight opened this issue Nov 21, 2021 · 17 comments
Labels
🧑‍🤝‍🧑 community contributions by our wonderful community idle localization i18n & l10n macros tracking issues related to kumascript macros p4 Not urgent, only if time allows tx localizer/translator experience

Comments

@SphinxKnight
Copy link
Member

SphinxKnight commented Nov 21, 2021

As far as I understand, EmbedLiveSample macro builds the sample content based on a given heading ID (first parameter) or (I just discovered this) fallbacks without argument by using an algorithm based on content.

The thing is, EmbedLiveSample is:

  • considered as a macro of interest for "localization differences"
  • still vastly using an argument based on the heading ID

As the localization differences take into account macros and their arguments, and as headings' IDs are generated based on their content… We often encounter what I think are false positive signals when translating (see the first flaw on mdn/translated-content#3158 (comment) for example). Before corresponding PRs it was actually one of the most important source of issues for French content on mdn/translated-content.

There are a few ways to fix this:

  • Remove IDs in EmbedLiveSample macro calls in mdn/content and mdn/translated-content (as it mdn/content acts as the source of truth/reference for mdn/translated-content)
  • Remove EmbedLiveSample from the list of "macros of interest" for localization differences
  • Add some complexity in the localization difference detection algorithm to specifically deal with EmbedLiveSample (and any other macros with arguments based on localizable content)

I'm not sure of the way I prefer. EmbedLiveSample has long been frail regarding l10n (cf. mdn/translated-content#2844 and mdn/translated-content#2842 for example among others).

  • Using no argument as the default would indeed solve the root cause of the issue but updating this in current content would need a large set of changes in mdn/content and mdn/translated-content.
  • Removing EmbedLiveSample from the localization differences check is the most easy one but it feels wrong as, I believe, good live samples are essential to the quality of a page (whether it's an English or a localized one)
  • Making an exception for EmbedLiveSample in the current algorithm only has an impact on mdn/yari but feels a bit like a "ad hoc" solution.

Pointers:

@escattone @wbamberg @lex111 @mfuji09 @hochan222, let me know if you have any thoughts about this.

@lex111
Copy link
Member

lex111 commented Nov 21, 2021

You raised an important issue. Indeed, currently it is not very convenient to work with embedded live samples, it is definitely worth to automate them somehow. Especially it's not very scalable option in case of translated content.

However I'm worried about another question -- will MDN platform continue to evolve or not? I don't see any progress after Peter left the project. I have a feeling that it is in an abandoned state. Maybe there is financial difficulties, I don't know, so I would like to know the current situation with the content platform.
Moreover, it is not such an easy project to contribute in, so it requires to maintain by the team members or who is familiar with the codebase. I'd like to help, but don't think I can get into the project quickly, it's pretty complicated.

@mfuji09
Copy link
Contributor

mfuji09 commented Nov 22, 2021

I am using a HTML heading element (<hX>) for headings of the live samples because the EmbedLiveSample macro doesn't handle a heading element with an id has a localized (Japanese) string correctly. You know, the markdown processor adds a content of a heading (##...) as an id. In a translated heading, the id becomes Japanese, so the macro doesn't work.

I think, if the macro can process a localized id correctly, the best approach seems using the macro without the argument -- then we will be able to use markdown instead of HTML heading elements.

@ddbeck
Copy link
Contributor

ddbeck commented Nov 22, 2021

I'm not in a position yet to comment on the issue mentioned here, but I wanted to respond to a concern raised:

However I'm worried about another question -- will MDN platform continue to evolve or not? I don't see any progress after Peter left the project. I have a feeling that it is in an abandoned state. Maybe there is financial difficulties, I don't know, so I would like to know the current situation with the content platform.

A fair question, @lex111. There's been some changes to the size of the team and some near-term changes in who was working on what as a consequence. It's not abandoned, but it is taking us time to get back to a level of activity we sustained before. It's one thing for me to say it's not abandoned, however; I hope @schalkneethling and others on the team will soon be able to prove it through renewed efforts on the platform. Thanks for your understanding and patience.

@schalkneethling
Copy link

+1 to what @ddbeck said. We did drop the ball here but, I am trying to pick it back up. We have had some good conversations all round and we are working on way forward that will benefit all interested parties. Thank you for your patience and please, keep filing those issues.

@schalkneethling
Copy link

@escattone You might also want to have a look at this issue.

@wbamberg
Copy link
Collaborator

As far as I understand, EmbedLiveSample macro builds the sample content based on a given heading ID (first parameter) or (I just discovered this) fallbacks without argument by using an algorithm based on content.

As the localization differences take into account macros and their arguments, and as headings' IDs are generated based on their content… We often encounter what I think are false positive signals when translating (see the first flaw on mdn/translated-content#3158 (comment) for example).

OK, so as I understand it the problem is:

  • the Yari flaw checker complains when there are macro calls in en-US that don't match macro calls in fr, and it includes arguments in this determination
  • EmbedLiveSample typically uses a heading ID as its argument, and this is (especially now in Markdown) derived from the heading text

...so the flaw checker reports (probably almost every?) live sample call as a flaw.

(this isn't unique to live samples of course. At least cross-reference macros often include localizable text in their arguments. But I guess the flaw checker ignores these?)

Back when we were discussing what the impact of Markdown on live samples would be, @escattone and I thought it would be a good principle that as far as possible macro calls should not need to be locale-sensitive, and that was one of the reasons we came up with this way for a live sample not to need to pass the argument, but for Yari to figure it out from the context (aka "(Auto)EmbedLiveSample"). So in principle I guess I am in favour of updating macro calls to remove the argument.

There are some risks around this in that we've not tried applying it on a large scale yet. But having spent a lot of time picking at live samples in the last few months for Markdowning, I think it will probably work in at least the vast majority of the time. But still, as you say, it's a project.

I am a little worried about this "(Auto)EmbedLiveSample" system, because it seems very implicit/magical. In general I prefer systems where it's clearly expressed in the code/content what's going on, and this isn't like that. But it's the least-bad idea I could think of. Long term I like the idea of separating out examples from pages and embedding the complete example at build time, as we proposed to do in stumptown. But that seems a long way off.

I am using a HTML heading element () for headings of the live samples because the EmbedLiveSample macro doesn't handle a heading element with an id has a localized (Japanese) string correctly. You know, the markdown processor adds a content of a heading (##...) as an id. In a translated heading, the id becomes Japanese, so the macro doesn't work.

I didn't know that, and I'm sorry to hear it. I would have expected Yari would be able to handle this case. If it's not an easy fix in Yari, that seems like another reason to want to move away from including IDs in live sample arguments.

@hochan222
Copy link
Member

hochan222 commented Nov 23, 2021

Back when we were discussing what the impact of Markdown on live samples would be, @escattone and I thought it would be a good principle that as far as possible macro calls should not need to be locale-sensitive ...

For this reason, there is also a 'Translating heading IDs' principle in 'General guidelines for MDN translated content'. So I'm also in favor of updating the macro calls to remove the argument if it needs to change.

Using no argument as the default would indeed solve the root cause of the issue but updating this in current content would need a large set of changes in mdn/content and mdn/translated-content.

@SphinxKnight The above option seems to be the most likely. Of course, it will depend on how we choose. If so, what specifically problems might arise from a locale(translated-content) contributor's point of view?

@SphinxKnight
Copy link
Member Author

SphinxKnight commented Nov 24, 2021

@wbamberg, regarding application of "(Auto)EmbedLiveSample" and its effect across the English docset, I've searched/replaced the macro calls to replace the first arg with an empty string and crawled all the pages locally via a script.
I got an MacroLiveSampleError for the 8 following pages (which is quite nice vs the 1506 files where this macro is used):

I intend to build a more thorough "scanner" by fetching the actual generated content and comparing the result before/after this change.

@mfuji09 I wasn't aware of this issue as well, I struggled a bit with some French headings but I guess rewording kind of worked for this locale :x

@hochan222 I don't expect particular issues for localizers/contributors apart from the set of PRs which will need to be made/reviewed and, later on, explaining newcomers why there is an empty string as the first argument ^^

@wbamberg
Copy link
Collaborator

@wbamberg, regarding application of "(Auto)EmbedLiveSample" and its effect across the English docset, I've searched/replaced the macro calls to replace the first arg with an empty string and crawled all the pages locally via a script.

That's amazing, @SphinxKnight !

I got an MacroLiveSampleError for the 8 following pages (which is quite nice vs the 1506 files where this macro is used):

* http://localhost:5000/en-us/docs/mdn/contribute/howto/create_an_interactive_exercise_to_help_learning_the_web/distant_example (I was not subtle enough to deal specifically with this "meta" page, that was not my goal yet :) )

* http://localhost:5000/en-us/docs/web/html/global_attributes/itemscope

* http://localhost:5000/en-us/docs/web/html/element/input/date

* http://localhost:5000/en-us/docs/web/html/element/input/month

* http://localhost:5000/en-us/docs/web/css/css_positioning/understanding_z_index/adding_z-index

* http://localhost:5000/en-us/docs/web/css/css_positioning/understanding_z_index/stacking_and_float

* http://localhost:5000/en-us/docs/web/css/css_positioning/understanding_z_index/stacking_without_z-index

* http://localhost:5000/en-us/docs/web/api/canvas_api/tutorial/compositing/example

Yes, that's pretty good! (I was expecting it would be OK but nice to have this extra evidence.) I'll have a go at fixing these.

I intend to build a more thorough "scanner" by fetching the actual generated content and comparing the result before/after this change.

@mfuji09 I wasn't aware of this issue as well, I struggled a bit with some French headings but I guess rewording kind of worked for this locale :x

@hochan222 I don't expect particular issues for localizers/contributors apart from the set of PRs which will need to be made/reviewed and, later on, explaining newcomers why there is an empty string as the first argument ^^

Was talking this over with Ryan yesterday and it might be a good time to move to a new macro or to revise the arguments it can take. Already it takes 2 arguments that are always ignored ("width" and "hide Codepen buttons").

@wbamberg
Copy link
Collaborator

I've filed mdn/content#10758 which tries to fix those 8 pages. Unfortunately it looks like there is a bug in the new algorithm, as well as apparently some difference in the way it handles whitespace.

@schalkneethling
Copy link

I've filed mdn/content#10758 which tries to fix those 8 pages. Unfortunately it looks like there is a bug in the new algorithm, as well as apparently some difference in the way it handles whitespace.

Could you file an issue with regards to the algorithm bug and whitespace differences?

@schalkneethling
Copy link

Was talking this over with Ryan yesterday and it might be a good time to move to a new macro or to revise the arguments it can take. Already it takes 2 arguments that are always ignored ("width" and "hide Codepen buttons").

Is there a tracking bug to discuss and do this work? Either creating a new macro or refactoring the existing one.

@SphinxKnight
Copy link
Member Author

SphinxKnight commented Nov 29, 2021

I've filed mdn/content#10758 which tries to fix those 8 pages. Unfortunately it looks like there is a bug in the new algorithm, as well as apparently some difference in the way it handles whitespace.

Could you file an issue with regards to the algorithm bug and whitespace differences?

If I may: #5006 and #5005

@wbamberg
Copy link
Collaborator

Was talking this over with Ryan yesterday and it might be a good time to move to a new macro or to revise the arguments it can take. Already it takes 2 arguments that are always ignored ("width" and "hide Codepen buttons").

Is there a tracking bug to discuss and do this work? Either creating a new macro or refactoring the existing one.

I just filed #5016 for that.

@SphinxKnight
Copy link
Member Author

For cross-reference, another "frailty" of EmbedLiveSample when it comes to embedded images for mdn/translated-content #5215

@schalkneethling schalkneethling added the 🚉 platform keeping the platform healthy label Feb 16, 2022
@github-actions github-actions bot added the idle label Apr 15, 2022
@schalkneethling schalkneethling added localization i18n & l10n 🧑‍🤝‍🧑 community contributions by our wonderful community macros tracking issues related to kumascript macros and removed idle labels Apr 17, 2022
@github-actions github-actions bot added the 🐌 idle Issues and PRs without recent activity. Flagged for maintainer follow-up. label May 28, 2022
@caugner
Copy link
Contributor

caugner commented Nov 15, 2022

@SphinxKnight @wbamberg If this is still an issue, could you please try and describe the problem and the desired solution again from today's perspective?

@caugner caugner added need more info Further information is requested and removed 🚉 platform keeping the platform healthy labels Nov 15, 2022
@caugner caugner added p4 Not urgent, only if time allows tx localizer/translator experience labels Nov 15, 2022
@wbamberg
Copy link
Collaborator

The situation is described above and hasn't changed. To summarise:

If we want to go ahead with the ID-less version (and that's at least partly a choice for translators, but in this thread they seem pretty keen) then we should make sure it is working properly with our content and consistently use it in en-US and other locales. We should also consider revising the EmbedLiveSample API at the same time (#5016).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🧑‍🤝‍🧑 community contributions by our wonderful community idle localization i18n & l10n macros tracking issues related to kumascript macros p4 Not urgent, only if time allows tx localizer/translator experience
Projects
Status: No status
Development

No branches or pull requests

8 participants