Accept all existing Greek letters using unicode characters in math mode #410

kevinbarabash · 2015-12-05T23:57:35Z

Test Plan:

make test
run screenshot tests on travis-ci

wchargin · 2015-12-09T05:46:25Z

src/Lexer.js

@@ -46,6 +46,7 @@ function Token(text, data, position) {
 var tokenRegex = new RegExp(
    "([ \r\n\t]+)|(" +                                // whitespace
    "---?" +                                          // special combinations
+    "|[\u0391-\u03A9\u03B1-\u03C9]" +                 // Greek letters


\u03D6 is \varpi ϖ, \u03D5 is \phi ϕ, \u03F5 is \epsilon ϵ. KaTeX outputs all these (try \varpi \phi \epsilon and look at the Unicode code points of the output); do you want to include them?

And \u03D1 is \vartheta ϑ, \u03F1 is \varrho ϱ, also supported.

May as well add them since we have those glyphs. Good catch.
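For anyone following along, here is a minimal sketch of what the widened character class could look like. It is a standalone regex, not the actual `tokenRegex` in src/Lexer.js, and `isGreekInput` is a hypothetical helper name used only for illustration. The base ranges come from the diff above; the extra code points are the variant letters named in this review.

```javascript
// Sketch only: a standalone character class, not KaTeX's real tokenRegex.
// Base ranges are from the diff; the five extra code points are the variant
// letters mentioned in the review (\vartheta, \phi, \varpi, \varrho, \epsilon).
const greekLetter =
    /[\u0391-\u03A9\u03B1-\u03C9\u03D1\u03D5\u03D6\u03F1\u03F5]/;

// Hypothetical helper: true if the single character would be accepted.
function isGreekInput(ch) {
    return greekLetter.test(ch);
}

console.log(isGreekInput("\u03D6")); // the \varpi code point
console.log(isGreekInput("\u03B1")); // the \alpha code point
console.log(isGreekInput("a"));      // Latin letter, outside the class
```

Listing the variants as individual code points rather than widening the range keeps unrelated characters in U+03CA..U+03F5 out of the match.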

ronkok · 2015-12-26T17:08:37Z

Similarly, U+03B5 ↦ \varepsilon ε, U+03F0 ↦ \varkappa ϰ, U+03C2 ↦ \varsigma ς, U+03C6 ↦ \varphi φ. KaTeX supports all of them.

I would include \digamma, but there is an issue to resolve first.

The AMS (and KaTeX) function \digamma looks like U+03DC, capital letter Ϝ. But the unicode-math and Teubner packages map \digamma to U+03DD, small letter ϝ. It is \Digamma that they map to U+03DC, capital letter Ϝ.

It’s a problem. The unicode-math function names are consistent with the naming convention for other Greek letters. But AMS is more popular. I don’t know the best way to resolve the collision.
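To make the collision concrete, here is a small sketch that records the two conventions as described above and looks up which commands land on a given code point. The object layout and the `commandsFor` helper are hypothetical and exist only for illustration; for AMS/KaTeX the entry records which code point the glyph resembles, as noted above.

```javascript
// Mappings as described in this thread; the data structure is illustrative.
const digammaConventions = {
    // AMS (and KaTeX): the \digamma glyph looks like the capital letter.
    "AMS / KaTeX": { "\\digamma": "\u03DC" },
    // unicode-math and Teubner follow the usual small/capital naming.
    "unicode-math / Teubner": { "\\digamma": "\u03DD", "\\Digamma": "\u03DC" },
};

// Hypothetical helper: which command does each convention use for a code point?
function commandsFor(codePoint) {
    const hits = [];
    for (const [convention, table] of Object.entries(digammaConventions)) {
        for (const [command, ch] of Object.entries(table)) {
            if (ch === codePoint) {
                hits.push(`${convention}: ${command}`);
            }
        }
    }
    return hits;
}

console.log(commandsFor("\u03DC")); // two different command names
console.log(commandsFor("\u03DD")); // only unicode-math / Teubner
```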

kevinbarabash · 2015-12-26T18:16:16Z

@ronkok your comment made me realize that I hadn't updated the token regex. I have added the glyphs for \varepsilon, \varsigma, and \varphi. I will add \varkappa and the AMS version of \digamma. I don't believe any of our fonts includes a glyph for the small letter ϝ.

wchargin · 2015-12-26T20:09:37Z

@ronkok Good catch about U+03F0 \varkappa, but the other three (U+03B5, U+03C2, U+03C6) are handled by [\u03B1-\u03C9], right? (The ε, ς, and φ characters are the "normal" characters output by pressing e, w (for "word-final sigma"), or f, respectively, on a Greek keyboard.) Do they need any additional special casing?
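A quick way to check the point above is to test each of the four code points against the lowercase range on its own. This is only a sketch; the `candidates` table and `results` object are names made up for the example.

```javascript
// The lowercase range from the diff, anchored so it tests one character.
const lowercaseRange = /^[\u03B1-\u03C9]$/;

// Code points from the discussion above, keyed by the command they map to.
const candidates = {
    "\\varepsilon": "\u03B5",
    "\\varsigma": "\u03C2",
    "\\varphi": "\u03C6",
    "\\varkappa": "\u03F0",
};

// Record whether each code point already falls inside the range.
const results = {};
for (const [name, ch] of Object.entries(candidates)) {
    results[name] = lowercaseRange.test(ch);
}

console.log(results);
```

Only U+03F0 falls outside the range, so it is the only one of the four that needs its own entry in the character class.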

kevinbarabash · 2015-12-26T21:06:02Z

After thinking about the \digamma issue, I'm going to punt on it for now. At some point in the future we will probably want to support more of the unicode-math package, and I'd like to avoid changing the behavior later.

ronkok · 2015-12-27T03:15:17Z

@wchargin, you are correct about U+03B5, U+03C2, and U+03C6. They do not require any special casing. My bad.

@kevinbarabash, Good call. It's okay to go slow. I don't hear the world calling out for an expedited decision on \digamma.

kevinbarabash · 2015-12-31T00:26:06Z

@wchargin I misread my own regex. This looks like it's good to go.

wchargin · 2015-12-31T00:40:58Z

Haha, okay :) The regex seems good to me except for the bizarre case of U+03A2, which is not a valid Unicode character for some reason. This seemed weird to me, but I confirmed it in my official Unicode Consortium book and it is correct. For reference, here's the full Greek table:

(The next page, which has all the codepoints' official names and decompositions, lists it as "reserved.")

Other than that, the characters that match the regex are ΑΒΓΔΕΖΗΘΙΚΛΜΝΞΟΠΡΣΤΥΦΧΨΩαβγδεζηθικλμνξοπρςστυφχψω, which seems fine to me.

# Python 3
# range() excludes its upper bound, so go one past U+03A9 and U+03C9; skip the reserved U+03A2
''.join(chr(x) for x in [*range(0x0391, 0x03AA), *range(0x03B1, 0x03CA)] if x != 0x03A2)
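The same check can be sketched in JavaScript, letting the engine's Unicode data report which matched code points are unassigned instead of consulting the printed table. This assumes a runtime with Unicode property escapes (ES2018); `codePointsInRange` is a hypothetical helper.

```javascript
// Collect every code point in an inclusive range, mirroring the regex ranges.
function codePointsInRange(start, end) {
    const out = [];
    for (let cp = start; cp <= end; cp++) {
        out.push(cp);
    }
    return out;
}

const matched = [
    ...codePointsInRange(0x0391, 0x03A9),
    ...codePointsInRange(0x03B1, 0x03C9),
];

// \p{Cn} is the Unicode general category "Unassigned".
const unassigned = matched.filter(
    (cp) => /\p{Cn}/u.test(String.fromCodePoint(cp))
);

// Everything else is an assigned Greek letter.
const assigned = matched
    .filter((cp) => !unassigned.includes(cp))
    .map((cp) => String.fromCodePoint(cp))
    .join("");

console.log(unassigned.map((cp) => "U+" + cp.toString(16).toUpperCase().padStart(4, "0")));
console.log(assigned);
```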

(I'm definitely not qualified to review the actual KaTeX aspect of it though, sorry 😕 )

kevinbarabash · 2015-12-31T00:52:20Z

If it's reserved it's probably going to be hard for a person to type it in. There are some characters (capital letters) which we don't have glyphs for. I'll see what happens when they're typed in and report back.

kevinbarabash · 2015-12-31T01:06:08Z

It throws on all of the Greek glyphs that we don't have, e.g. Α, Β, Ε, etc. which I think is acceptable.

wchargin · 2015-12-31T01:16:11Z

Yeah, that's a tricky one. It always bothered me that TeX has no \Alpha command (whatever happened to "semantics matter"?). The two obvious choices that I see are to disallow entering Α (U+0391 GREEK CAPITAL LETTER ALPHA) or to alias it to A (U+0041 LATIN CAPITAL LETTER A). If we're ever planning to add support for user-defined fonts, for which the two glyphs could be distinct, that might make a difference.

In the spirit of forward-compatibility, your decision to disallow them for now sounds good to me; you can always add them later, but they'd be harder to remove. @xymostech ?
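To illustrate the two options, here is a rough sketch of both policies side by side. None of this is KaTeX code: the `latinLookalikes` table, the `resolveCapital` function, and the policy names are all hypothetical, and the error message is made up.

```javascript
// Greek capitals that have no dedicated TeX command, paired with the Latin
// capital they visually resemble. Illustrative table only.
const latinLookalikes = new Map([
    ["\u0391", "A"], ["\u0392", "B"], ["\u0395", "E"], ["\u0396", "Z"],
    ["\u0397", "H"], ["\u0399", "I"], ["\u039A", "K"], ["\u039C", "M"],
    ["\u039D", "N"], ["\u039F", "O"], ["\u03A1", "P"], ["\u03A4", "T"],
    ["\u03A7", "X"],
]);

// Hypothetical resolver: "disallow" throws, "alias" substitutes the Latin letter.
function resolveCapital(ch, policy) {
    if (!latinLookalikes.has(ch)) {
        return ch; // not one of the glyph-less capitals; pass through
    }
    if (policy === "alias") {
        return latinLookalikes.get(ch);
    }
    throw new Error("Unsupported character: " + ch);
}

console.log(resolveCapital("\u0391", "alias")); // Latin A
console.log(resolveCapital("\u03A9", "disallow")); // Ω passes through unchanged
```

Starting with "disallow" keeps the door open: turning an error into an alias later does not break existing input, while the reverse would.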

ronkok · 2016-01-02T16:40:54Z

Should \mathbf{Ω} render the same as \mathbf{\Omega}? Would a test of that be worthwhile?

kevinbarabash · 2016-01-02T21:56:15Z

@ronkok definitely worth a test. I'll see to it Monday.

xymostech · 2016-04-18T18:53:29Z

@kevinbarabash The code of this looks good! Do you want to write the test that @ronkok suggested? Also, what happens when you put these in \text{}?

kevinbarabash · 2016-04-18T18:55:29Z

@xymostech will do. I totally forgot about that test. Thanks for reminding me.

kevinbarabash · 2017-01-15T04:38:35Z

@xymostech I finally got around to testing \text{} and it blew up b/c I hadn't defined the symbols for text mode. I've updated the diff to include text mode versions of all of these symbols.

kevinbarabash · 2017-01-15T04:43:54Z

@ronkok sorry for the delay. I just tested \mathbf{Ω}\mathbf{\Omega} and they render the same. I'm going to update the screenshot test to include this.

edemaine · 2017-06-15T18:37:16Z

Just bumped into this older PR. Looks like everything was good to go, and just needs a rebase and review?

kevinbarabash · 2017-06-16T13:05:42Z

I'll rebase this this evening.

edemaine · 2017-06-16T13:58:57Z

I did more reading of the various unicode PRs. I'm a little confused about what the latest/best approach is, but it does make me wonder whether this one will be necessary... Intuitively, defineSymbol ought to define both the unicode and \command version of a symbol. But the other work shows that the font's notion of which unicode symbol it is is not always what we want.

Perhaps we should add another optional argument to defineSymbol that gives the unicode character that should be recognized as this one? And then go through symbols by hand? Greek could still be the initial testing ground.

kevinbarabash · 2017-08-12T23:52:07Z

src/symbols.js

+defineSymbol(math, main, mathord, "\u03bc", "\\mu", true);
+defineSymbol(math, main, mathord, "\u03bd", "\\nu", true);
+defineSymbol(math, main, mathord, "\u03be", "\\xi", true);
+defineSymbol(math, main, mathord, "\u03bf", "\\omicron", true);


We have the omicron glyph in our fonts so we may as well use it.

edemaine · 2017-08-13T14:20:57Z

src/symbols.js

+
+    if (acceptUnicodeChar) {
+        module.exports[mode][replace] = module.exports[mode][name];
+    }


Love this new approach! Avoids repetition of the Unicode symbol, and makes the decision of "include this unicode character" clear. I think if we ever want a unicode symbol to point to one not matching the font, then we can manually define that symbol.

This should extend pretty easily to other unicode characters that are simple matches.
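For readers who have not opened the diff, here is a simplified, self-contained sketch of how the flag behaves. It is not the real src/symbols.js: the `symbols` object stands in for `module.exports`, the stored record shape is an assumption, and plain strings replace the `math`, `main`, and `mathord` constants.

```javascript
// Stand-in for the symbol table that src/symbols.js exports (assumed shape).
const symbols = { math: {}, text: {} };

// Simplified defineSymbol: registers the command name and, when
// acceptUnicodeChar is true, also registers the replacement character itself
// as an alias for the same entry.
function defineSymbol(mode, font, group, replace, name, acceptUnicodeChar) {
    symbols[mode][name] = { font, group, replace };

    if (acceptUnicodeChar) {
        symbols[mode][replace] = symbols[mode][name];
    }
}

// With the flag: both "\\mu" and the literal character resolve.
defineSymbol("math", "main", "mathord", "\u03bc", "\\mu", true);
// Without the flag: only the command resolves.
defineSymbol("math", "main", "mathord", "\u03b1", "\\alpha");

console.log(symbols.math["\u03bc"] === symbols.math["\\mu"]);
console.log("\u03b1" in symbols.math);
```

Because the alias points at the same object as the command entry, the Unicode character and the command cannot drift apart.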

edemaine · 2017-08-13T14:22:59Z

It looks like there's an error with the screenshot. Can you regenerate?


* Support Unicode relations

  This is the first in a series of PRs to give KaTeX the ability to recognize Unicode character input. The code in this PR follows the style of PR #410. All the characters in this PR will produce rel atoms. I’ll submit PRs for other atom types later.

* Fix lint error.

* Correct mapping errors

  This commit fixes a brain cramp of mine.

kevinbarabash added the GH Review: review-needed label Dec 5, 2015

kevinbarabash mentioned this pull request Dec 5, 2015

Support unicode in text #15

Closed

wchargin reviewed Dec 9, 2015

kevinbarabash added GH Review: needs-revision and removed GH Review: review-needed labels Dec 9, 2015

kevinbarabash force-pushed the greek_unicode_support branch from bf8c8ad to d2ae5ed Compare December 23, 2015 01:57

kevinbarabash added GH Review: review-needed and removed GH Review: needs-revision labels Dec 23, 2015

kevinbarabash added the GH Review: needs-revision label Dec 26, 2015

kevinbarabash removed the GH Review: review-needed label Dec 26, 2015

kevinbarabash added GH Review: review-needed and removed GH Review: needs-revision labels Dec 31, 2015

xymostech added GH review: accepted and removed GH Review: review-needed labels Apr 18, 2016

kevinbarabash self-assigned this Nov 1, 2016

kevinbarabash force-pushed the greek_unicode_support branch from d2ae5ed to f25a3b6 Compare January 15, 2017 04:36

kevinbarabash force-pushed the greek_unicode_support branch from f25a3b6 to 13fc10b Compare January 15, 2017 05:03

kevinbarabash added GH Review: review-needed and removed GH review: accepted labels Jan 15, 2017

kevinbarabash requested a review from xymostech January 20, 2017 01:42

kevinbarabash requested review from edemaine and removed request for xymostech June 16, 2017 13:04

kevinbarabash added GH Review: needs-revision and removed GH Review: review-needed labels Aug 11, 2017

kevinbarabash force-pushed the greek_unicode_support branch from 13fc10b to b2a757b Compare August 12, 2017 23:50

kevinbarabash commented Aug 12, 2017


kevinbarabash added GH Review: review-needed and removed GH Review: needs-revision labels Aug 12, 2017

edemaine approved these changes Aug 13, 2017


Accept all existing Greek letters using unicode characters in math mode

8150258

Test Plan: - make test - run screenshot tests on travis-ci Reviewers: emily

kevinbarabash force-pushed the greek_unicode_support branch from b2a757b to 8150258 Compare August 14, 2017 04:02

kevinbarabash merged commit e00738d into master Aug 14, 2017

kevinbarabash deleted the greek_unicode_support branch August 23, 2017 01:47

ronkok mentioned this pull request Sep 30, 2017

Add unicode symbols to symbols table #261

Closed

ronkok mentioned this pull request Oct 14, 2017

Support Unicode relations #933

Merged

ronkok mentioned this pull request Jun 5, 2018

[plugin system] Add a utility function (setFontMetrics) to extend builtin fontMetrics #1269

Merged
