Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CLDR-15817 Update Example Generator #3978

Conversation

haytenf
Copy link
Contributor

@haytenf haytenf commented Aug 21, 2024

CLDR-15817

Locale Before After
de Screenshot 2024-09-03 at 5 37 59 PM Screenshot 2024-09-03 at 5 43 26 PM
it Screenshot 2024-09-03 at 5 38 46 PM Screenshot 2024-09-03 at 5 43 42 PM
zh Screenshot 2024-09-03 at 5 39 27 PM Screenshot 2024-09-03 at 5 43 55 PM
  • This PR partially completes the ticket.

ALLOW_MANY_COMMITS=true

@haytenf
Copy link
Contributor Author

haytenf commented Aug 21, 2024

Incomplete draft that creates a new method that calls createExampleHtml passing a list as an additional argument and updates one of the handlers ( handleDateRangePattern) to use this list -- doesn't pass tests since only one handler has been changed to fit new approach. Just to make sure on the right track for updating the handlers. If anyone could take a look when they get a chance @macchiati @srl295

@btangmu
Copy link
Member

btangmu commented Aug 27, 2024

@haytenf to fix the formatting you can run this:
mvn --file=tools/pom.xml spotless:apply && sh tools/cldr-apps/js/test/pretty.sh

@haytenf haytenf requested review from btangmu and srl295 August 29, 2024 20:00
Copy link
Member

@btangmu btangmu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! The code looks cleaner with your changes, except a couple of commented-out bits of code that should be removed. Hurray, the tests all pass! If there were lots of time to do more work on this I would suggest adding another unit test or two, but that doesn't seem essential. A couple screenshots of examples would be nice to see in the PR or the ticket to confirm the examples look the same as they did before the refactoring

Copy link
Member

@macchiati macchiati left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good so far. I suggest not changing the main dispatch to a switch in a different PR; that will make it easier to test.

Also, for refactorings like this, once you are statisfied with this, you can pull in the original file as ExampleGeneratorOld.java, and write a test case that checks that the results are identical for old and new, iterating across all locales, and iterating across all paths in that locale. That's a small amount of code that gives us a great deal of confidence!

@haytenf
Copy link
Contributor Author

haytenf commented Sep 4, 2024

@btangmu @srl295 @macchiati In the most recent commit, I've added a unit test comparing the old/new example generator, cleaned up the commented out code, and added before/after screenshots that show the example doesn't change.

I also wanted to explain a change I made that isn't related to the purpose of this ticket. With the new unit test I added (TestRefactoring), I ended up running into a null pointer exception (in both the old example generator and the updated example generator) that I traced back to the getOtherGender method that handleMinimalPairs calls. The exception was an issue with calling grammarInfo.get(GrammaticalTarget.nominal, GrammaticalFeature.grammaticalGender, GrammaticalScope.units); in line 1221 since grammarInfo could be null, so that's why I made that change.

I also wanted to note for that the unit test, TestRefactoring, I ended up using the unresolved cldr files to get the paths that I would test over, since attempting with the resolved file led to an extremely long run time (in fact, it never completed and I ran out of memory before it could). I'm not sure if there's something I could do to improve the efficiency of the test and eventually use the resolved files, but for now that's how I've left it.

@haytenf haytenf marked this pull request as ready for review September 4, 2024 01:00
@@ -1215,7 +1214,7 @@ private void handleMinimalPairs(
}

private String getOtherGender(String gender) {
if (gender == null) {
if (gender == null || grammarInfo == null) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch

@btangmu
Copy link
Member

btangmu commented Sep 5, 2024

attempting with the resolved file led to an extremely long run time

that's a puzzle -- does anyone know why?

@srl295
Copy link
Member

srl295 commented Sep 5, 2024

attempting with the resolved file led to an extremely long run time

that's a puzzle -- does anyone know why?

locally or on the github build? if it's the github build could have just been random.

@haytenf
Copy link
Contributor Author

haytenf commented Sep 5, 2024

attempting with the resolved file led to an extremely long run time

that's a puzzle -- does anyone know why?

locally or on the github build? if it's the github build could have just been random.

I'm still not sure what could be the issue. It was an issue locally--it ended up running for 12+ hours at which point it ran into a runtime exception/memory error from running out of heap space.

Copy link
Member

@srl295 srl295 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looking good and I think it's an improvement

Comment on lines 2466 to 2467
examples.add("");
return;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make sense to add a "" example here? or just to exit without adding anything?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! Yes, i think it makes sense to just return in this case.

@@ -2076,4 +2079,27 @@ private String getResult(String starredPath, String attr) {
public String sampleAttrAndValue(PathStarrer ps, final String separator, String value) {
return ps.getAttributesString(separator) + "➔«" + value + "»";
}

public void TestRefactoring() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, i see it's for the temporary test.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ouch. All locales X all resolved paths will be way too much.

Actually what you probably want to do here is to limit it by coverage and pathheader. I'll get some sample code for it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this code (will need some setup) should show whether a path would even be shown to users or not. If not, it should be skipped from the test:

// adapted from STFactory.isValidSurveyToolVote

        PathHeader.Factory phf = PathHeader.getFactory(null); // do this once!


        private boolean isValidSurveyToolVote(String xpath) {
            PathHeader ph = phf.fromPath(xpath); // will need a 
 PathHeaderFactory
            if (ph == null) return false;
            if (ph.getSurveyToolStatus() == PathHeader.SurveyToolStatus.DEPRECATED) return false;
            if (ph.getSurveyToolStatus() == PathHeader.SurveyToolStatus.HIDE
                    || ph.getSurveyToolStatus() == PathHeader.SurveyToolStatus.READ_ONLY) {
                return false;
            }

            if (supplementalDataInfo.getCoverageValue(xpath, locale.getBaseName())
                    > org.unicode.cldr.util.Level.COMPREHENSIVE.getLevel()) {
                return false;
            }
            return true;
        }

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. I'll revise to test over only the paths that'd be shown to users. Even over that reduced set of paths, and considering the extra check needed to determine if a path shown to users, I'm guessing the resolved files may still be too large to test over.

Copy link
Member

@srl295 srl295 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think ExampleGeneratorOld.java shouldn't be in the PR

@haytenf
Copy link
Contributor Author

haytenf commented Sep 9, 2024

@srl295 @btangmu @macchiati I've modified the test so that it only checks the paths that are shown to users. If that's looking ok, I can remove the old file (ExampleGeneratorOld.java) from the PR.

Copy link
Member

@srl295 srl295 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes look good

Copy link
Member

@srl295 srl295 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Anyway, I think if you remove teh ExampleGeneratorOld and TestRefactoring this is ready to go - test passes!!

@haytenf haytenf requested a review from srl295 September 9, 2024 20:12
@srl295 srl295 merged commit 5c98aa7 into unicode-org:CLDR-15817-Update-ExampleGenerator Sep 10, 2024
9 checks passed
@srl295
Copy link
Member

srl295 commented Sep 10, 2024

@haytenf Thanks! 🎉

@haytenf
Copy link
Contributor Author

haytenf commented Sep 10, 2024

@haytenf Thanks! 🎉

Thank you!

srl295 added a commit that referenced this pull request Sep 13, 2024
haytenf added a commit to haytenf/cldr that referenced this pull request Sep 17, 2024
conradarcturus pushed a commit that referenced this pull request Sep 25, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants