Expose java.lang.Character.getType() and constant fields like COMBINING_SPACING_MARK #103

Closed
ctjlewis opened this issue Jul 20, 2020 · 2 comments


@ctjlewis

ctjlewis commented Jul 20, 2020

It seems that J2CL does not expose Character.getType or the Unicode category constant fields (like Character.COMBINING_SPACING_MARK); referencing them fails with a "symbol not found" error.

For reference, see google/closure-compiler#3639, where Closure Compiler was unable to interpret a composite Unicode sequence as a valid IdentifierPart. Part of CC's parsing process relies on Scanner.java, a class that determines whether a given token is an IdentifierStart or IdentifierPart in compliance with the ECMAScript spec. All Unicode category checks are currently done by testing whether the character falls within a set of hard-coded Unicode ranges (see below). I replicated that approach for this fix, but it is neither as future-proof nor as legible as Character.getType(ch) == Character.COMBINING_SPACING_MARK, which would keep working as the Unicode standard evolves over time.

private static boolean isCombiningMark(char ch) {
  // TODO (ctjl): Implement in a more reliable and future-proofed way, i.e.:
  // return Character.getType(ch) == Character.NON_SPACING_MARK;
  return
      // 0300-036F: Combining Diacritical Marks
      (0x0300 <= ch && ch <= 0x036F)
      // 1AB0-1AFF: Combining Diacritical Marks Extended
      || (0x1AB0 <= ch && ch <= 0x1AFF)
      // 1DC0-1DFF: Combining Diacritical Marks Supplement
      || (0x1DC0 <= ch && ch <= 0x1DFF)
      // 20D0-20FF: Combining Diacritical Marks for Symbols
      || (0x20D0 <= ch && ch <= 0x20FF)
      // FE20-FE2F: Combining Half Marks
      || (0xFE20 <= ch && ch <= 0xFE2F);
}

This hard-coded, manual approach is used for every Unicode category check in the jsComp library because the J2CL compile must succeed before a release can be pushed (code that calls Character.getType() compiles with Maven, but not with Bazel). It would benefit the CC library if J2CL could support these APIs.
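For comparison, here is a rough sketch of what the category-based check could look like on a standard JVM, where Character.getType is available (it will not compile under J2CL, which is the point of this issue). The class name CombiningMarkCheck and both method names are hypothetical and are not taken from Scanner.java. The sketch assumes the ECMAScript 5 definition of UnicodeCombiningMark, which covers both the Mn (NON_SPACING_MARK) and Mc (COMBINING_SPACING_MARK) categories, so it tests for either constant rather than only one.

```java
// Hypothetical illustration only; not the actual Closure Compiler code.
final class CombiningMarkCheck {

  // Category-based check: delegates to the JDK's Unicode tables, so it tracks
  // whatever Unicode version the running JDK supports.
  static boolean isCombiningMark(char ch) {
    int type = Character.getType(ch);
    return type == Character.NON_SPACING_MARK          // Unicode category Mn
        || type == Character.COMBINING_SPACING_MARK;   // Unicode category Mc
  }

  // Range-based check, mirroring the hard-coded ranges quoted in this issue.
  static boolean isCombiningMarkByRange(char ch) {
    return (0x0300 <= ch && ch <= 0x036F)
        || (0x1AB0 <= ch && ch <= 0x1AFF)
        || (0x1DC0 <= ch && ch <= 0x1DFF)
        || (0x20D0 <= ch && ch <= 0x20FF)
        || (0xFE20 <= ch && ch <= 0xFE2F);
  }

  public static void main(String[] args) {
    char acute = '\u0301';    // COMBINING ACUTE ACCENT (Mn)
    char visarga = '\u0903';  // DEVANAGARI SIGN VISARGA (Mc)
    System.out.println(isCombiningMark(acute) + " " + isCombiningMarkByRange(acute));
    System.out.println(isCombiningMark(visarga) + " " + isCombiningMarkByRange(visarga));
  }
}
```

The two methods agree on U+0301 but disagree on U+0903, which lies outside the hard-coded ranges. This is the kind of gap the category-based form avoids.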

@ctjlewis
Author

It seems like this might be a Guava issue rather than a J2CL one - please close this if so.

@gkdn
Member

gkdn commented Jul 21, 2020

J2CL doesn't support various java.lang.Character APIs since they are too costly to support on the web.
