Accept all existing Greek letters using unicode characters in math mode #410
src/Lexer.js
@@ -46,6 +46,7 @@ function Token(text, data, position) {
var tokenRegex = new RegExp(
"([ \r\n\t]+)|(" + // whitespace
"---?" + // special combinations
"|[\u0391-\u03A9\u03B1-\u03C9]" + // Greek letters
\u03D6 is \varpi ϖ, \u03D5 is \phi ϕ, \u03F5 is \epsilon ϵ. KaTeX outputs all these (try \varpi \phi \epsilon and look at the Unicode code points of the output); do you want to include them?
And \u03D1 is \vartheta ϑ, \u03F1 is \varrho ϱ, also supported.
May as well add them since we have those glyphs. Good catch.
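The mappings listed in the review comments above can be checked with a small script. This is only a sketch: the `variantGreek` table, the `greekLetter` pattern, and the `codePointLabel` helper are hypothetical names made up for illustration, and the extended character class is an assumption about what "adding them" might look like, not the regex that was actually committed.

```javascript
// Variant Greek letters named in the review comments, keyed by the
// Unicode character and mapped to the TeX command said to produce it.
const variantGreek = {
    "\u03D1": "\\vartheta",
    "\u03D5": "\\phi",
    "\u03D6": "\\varpi",
    "\u03F1": "\\varrho",
    "\u03F5": "\\epsilon",
};

// Hypothetical character class: the two ranges from the diff plus the
// five variant code points (an assumption, not the committed regex).
const greekLetter =
    /^[\u0391-\u03A9\u03B1-\u03C9\u03D1\u03D5\u03D6\u03F1\u03F5]$/;

// Format a single character as "U+XXXX" for inspection.
function codePointLabel(ch) {
    return "U+" + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0");
}

// Print each variant, its command, and whether the extended class accepts it.
for (const ch of Object.keys(variantGreek)) {
    console.log(codePointLabel(ch), variantGreek[ch], greekLetter.test(ch));
}
```

The original two ranges alone do not cover any of these five code points, which is why they have to be listed explicitly.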
bf8c8ad to d2ae5ed
Similarly, U+03B5 ↦ I would include The AMS (and KaTeX) function It’s a problem. The
@ronkok your comment made me realize that I hadn't updated the token regex. I have added the glyphs for
@ronkok Good catch about U+03F0
After thinking about the
@wchargin, you are correct about U+03B5, U+03C2, and U+03C6. They do not require any special casing. My bad. @kevinbarabash, good call. It's okay to go slow. I don't hear the world calling out for an expedited decision on \digamma.
@wchargin I misread my own regex. This looks like it's good to go.
Haha, okay :) The regex seems good to me except for the bizarre case of U+03A2, which is not a valid Unicode character for some reason. This seemed weird to me, but I confirmed it in my official Unicode Consortium book and it is correct. For reference, here's the full Greek table: (The next page, which has all the codepoints' official names and decompositions, lists it as "reserved.")
Other than that, the characters that match the regex are ΑΒΓΔΕΖΗΘΙΚΛΜΝΞΟΠΡΣΤΥΦΧΨΩαβγδεζηθικλμνξοπρςστυφχψω, which seems fine to me.
# Python 3 (range() excludes its end point, so go one past each regex bound; skip the reserved U+03A2)
''.join(chr(x) for x in list(range(0x0391, 0x03AA)) + list(range(0x03B1, 0x03CA)) if x != 0x03A2)
(I'm definitely not qualified to review the actual KaTeX aspect of it though, sorry 😕 )
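The same check can be sketched in JavaScript, the language of the lexer itself. A regex character-class range is inclusive at both ends, so one way to see exactly what the class from the diff accepts is to walk the Greek and Coptic block (U+0370 to U+03FF) and test each code point. The names `tokenClass` and `matched` are made up for this illustration.

```javascript
// The character class from the diff, on its own (no surrounding token regex).
const tokenClass = /[\u0391-\u03A9\u03B1-\u03C9]/;

// Collect every character in the Greek and Coptic block that the class accepts.
const matched = [];
for (let cp = 0x0370; cp <= 0x03FF; cp++) {
    const ch = String.fromCodePoint(cp);
    if (tokenClass.test(ch)) {
        matched.push(ch);
    }
}

console.log(matched.length);             // 50: two ranges of 25 code points each
console.log(matched.includes("\u03A9")); // true: the upper bound is included
console.log(matched.includes("\u03A2")); // true: the reserved code point also matches
```

The regex has no notion of which code points are assigned, so U+03A2 matches the class even though no character is defined there.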
If it's reserved it's probably going to be hard for a person to type it in. There are some characters (capital letters) which we don't have glyphs for. I'll see what happens when they're typed in and report back.
It throws on all of the Greek glyphs that we don't have, e.g.
Yeah, that's a tricky one. It always bothered me that TeX has no In the spirit of forward-compatibility, your decision to disallow them for now sounds good to me; you can always add them later, but they'd be harder to remove. @xymostech ?
Should
@ronkok definitely worth a test. I'll see to it Monday.
@kevinbarabash The code of this looks good! Do you want to write the test that @ronkok suggested? Also, what happens when you put these in
@xymostech will do. I totally forgot about that test. Thanks for reminding me.
d2ae5ed to f25a3b6
@xymostech I finally got around to testing
@ronkok sorry for the delay. I just tested
f25a3b6 to 13fc10b
Just bumped into this older PR. Looks like everything was good to go, and just needs a rebase and review?
I'll rebase this this evening.
I did more reading of the various unicode PRs. I'm a little confused about what the latest/best approach is, but it does make me wonder whether this one will be necessary... Intuitively, Perhaps we should add another option argument to
13fc10b to b2a757b
defineSymbol(math, main, mathord, "\u03bc", "\\mu", true);
defineSymbol(math, main, mathord, "\u03bd", "\\nu", true);
defineSymbol(math, main, mathord, "\u03be", "\\xi", true);
defineSymbol(math, main, mathord, "\u03bf", "\\omicron", true);
We have the omicron glyph in our fonts so we may as well use it.
if (acceptUnicodeChar) {
module.exports[mode][replace] = module.exports[mode][name];
}
Love this new approach! Avoids repetition of the Unicode symbol, and makes the decision of "include this unicode character" clear. I think if we ever want a unicode symbol to point to one not matching the font, then we can manually define that symbol.
This should extend pretty easily to other unicode characters that are simple matches.
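To illustrate the aliasing idea discussed in this thread, here is a minimal, self-contained sketch. It is not KaTeX's actual implementation: the `symbols` table, the string values passed for mode, font, and group, and the shape of each stored entry are simplified assumptions. Only the `acceptUnicodeChar` flag and the aliasing assignment mirror the diff.

```javascript
// Simplified stand-in for a symbol table, keyed by mode and then by name.
// (Hypothetical structure; the real table lives in module.exports.)
const symbols = { math: {}, text: {} };

// Register a command. When acceptUnicodeChar is truthy, the replacement
// character itself becomes a second key pointing at the same entry.
function defineSymbol(mode, font, group, replace, name, acceptUnicodeChar) {
    symbols[mode][name] = { font, group, replace };
    if (acceptUnicodeChar) {
        symbols[mode][replace] = symbols[mode][name];
    }
}

defineSymbol("math", "main", "mathord", "\u03bc", "\\mu", true);
defineSymbol("math", "main", "mathord", "\u03bd", "\\nu"); // flag omitted

console.log(symbols.math["\u03bc"] === symbols.math["\\mu"]); // true: same entry
console.log("\u03bd" in symbols.math);                        // false: no alias
```

Because the alias shares the entry object instead of copying it, the Unicode character and the command cannot drift apart, and each `defineSymbol` call states explicitly whether the raw character is accepted.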
It looks like there's an error with the screenshot. Can you regenerate?
Test Plan:
- make test
- run screenshot tests on travis-ci

Reviewers: emily
b2a757b to 8150258
* Support Unicode relations

This is the first in a series of PRs to give KaTeX the ability to recognize Unicode character input. The code in this PR follows the style of PR #410. All the characters in this PR will produce rel atoms. I’ll submit PRs for other atom types later.

* Fix lint error.

* Correct mapping errors

This commit fixes a brain cramp of mine.
Test Plan: