Implement AMSMath's \dots #599

Closed
wants to merge 10 commits into from

Member

@edemaine edemaine commented Jan 3, 2017

I spent a few hours reading AMSMath's source code for \dots. Wow, what a beast! This should be a full implementation (fixing #528), except for one issue:

I don't know how to add some positive or negative space left or right of a symbol returned by a function, which is what should happen for \dotsi and \dotso in some cases (as described in the TODO comments). Is there an easy way to do this?

Contributor

@xymostech xymostech left a comment

Hi @edemaine! This looks really awesome! Thanks for going through the effort of reading through that insane implementation. :) I have a suggestion for how to do the spacing in the comments.

Your code looks really good to me! I'll let @gagern or @kevinbarabash do the final review.

src/functions.js Outdated
}
}
}
return {
Contributor

In order to conditionally add some positive/negative space here, you could do something like this (I tested this locally, it seems to work fine):

    const space = new ParseNode("kern", {
        dimension: {
            number: 1,
            unit: "em",
        },
    }, context.parser.mode);

    const bod = new ParseNode("op", {
        limits: false,
        symbol: true,
        body: thedots,
    }, context.parser.mode);

    return {
        type: "op",
        limits: false,
        symbol: false,
        value: [space, bod],
    };

Basically, we're creating a "kern" node and sticking that into our op, which accepts arrays of ParseNodes. Not sure if this is the "right" way to do this, but it certainly beats making a new group type!

Member Author

Perfect, thanks! That's great that value can be an array, and kern is the other feature I was looking for. I'll finish the implementation now.

@edemaine
Member Author

edemaine commented Jan 4, 2017

Hmm, I'm having another problem. Even with the PR code as is, which returns a symbol similar to existing \bigvee, rendering \dots gives me a warning:

No character metrics for '…' in style 'Size1-Regular'

groupTypes.op() seems to set the fontName to 'Size1-Regular' or 'Size2-Regular', but I don't understand how fonts work, so I'm not sure whether this is correct... \ldots works fine, using font 'Main-Regular' I assume. The warning is problematic as it leads to the character being higher than intended. (Same thing with \dots+, which expands to \cdots+.)

Relatedly, I'm vaguely worried that my \dots macro is returning a type: "op" instead of a type: "inner", but maybe that's OK... (As I understand the existing function infrastructure, it would be impossible to return a type: "inner" with a string as a value.) Not sure whether this will affect spacing, though.

@edemaine
Member Author

edemaine commented Jan 6, 2017

Seeing macro support in #605, I wonder if it would make more sense to have \dots expanded at the macro level, so that it can literally expand to an \ldots, \cdots, \!\cdots, \cdots\,, etc. and be reparsed accordingly. This is how it works in TeX, of course.

But currently functions can't return strings. Does this seem like a good direction to go? If so, I can try.

@kevinbarabash kevinbarabash self-assigned this Jan 7, 2017
@kevinbarabash
Member

kevinbarabash commented Jan 7, 2017

@edemaine thanks for the PR. I checked out your branch. The wrong font is definitely a problem. A person could fix groupTypes.op in buildHTML.js, but that seems hacky. It would be great if we could use macros for this, but this seems doubtful with our current macro support b/c it looks like we'd need conditionals which we don't have yet. In the short term maybe we set type: "inner" instead of type: "op". It clears up the font issue and appears to be what LaTeX is doing:

Running the following using pdflatex:

\documentclass{article}
\usepackage{amsmath}
\begin{document}
\tracingonline=1
\showboxbreadth=\maxdimen
\showboxdepth=\maxdimen
$1\dots n\showlists$
\end{document}

produces the following (partial) output:

\mathord
.\fam0 1
\mathinner
.\mathpunct
..\fam1 :
.\mathpunct
..\fam1 :
.\mathpunct
..\fam1 :
\mathord
.\fam1 n

@kevinbarabash
Member

kevinbarabash commented Jan 7, 2017

Please ignore the suggestion to use type: "inner" in my previous comment... it doesn't actually work. :(

@edemaine
Member Author

edemaine commented Jan 8, 2017

Yeah, op is pretty magical/special.

How crazy would it be to add support for returning a string from a defined function, and having that trigger macro expansion? That's exactly what I want here... (\dots returning \ldots or etc.) Alternatively, adding support for Javascript functions in a defined macro...

@gagern
Collaborator

gagern commented Jan 10, 2017

Returning a string from a function would work as a shortcut, but calling something else might be better. Doing macro expansion there might be tricky, since built-in functions are working on the level of parse tree nodes, not input text tokens. What we could do is parse a whole string using a separate parser instance, and then extract the root node out of that tree and include it in the current tree. This should work well for some simple functions, but might be problematic once you have arguments or state to deal with.
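
A minimal sketch of the "separate parser instance" idea, for illustration only. The toy `parseString` function, the `handlers` table, and the node shapes are all hypothetical and only loosely mimic KaTeX's parse nodes; this is not how KaTeX's Parser works.

```javascript
// Hypothetical toy parser; not KaTeX's real Parser or ParseNode.
const handlers = new Map([
    // A handler that returns a string instead of a parse node.
    ["\\dots", () => "\\ldots"],
]);

function parseString(input) {
    // Toy tokenizer: control sequences (\name) or single non-space characters.
    const tokens = input.match(/\\[a-zA-Z]+|\S/g) || [];
    const children = tokens.map((token) => {
        const handler = handlers.get(token);
        if (handler === undefined) {
            return {type: "symbol", value: token};
        }
        const result = handler();
        // String result: run a fresh parse over it and graft the root
        // of that tree into the current tree.
        return typeof result === "string" ? parseString(result) : result;
    });
    return {type: "ordgroup", value: children};
}

const tree = parseString("1\\dots n");
```

Here the handler takes no arguments and shares no state with the outer parse, which is exactly the simple case; anything that needs to look at the surrounding input would not fit this shape.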

@edemaine
Member Author

@gagern What if I moved this from a function to a macro, adding support for functions in defineMacro? I think that would be the right place to do this, assuming macros have nextToken defined... (In TeX, there are no functions, only macros...)

@gagern
Collaborator

gagern commented Jan 10, 2017

Right now, macros don't have access to the next token, but perhaps we should change that. In the long run we'll want to support things like \@ifstar, which in turn builds on \futurelet. The question is, do you need the next token from the input stream, or the next unexpandable token from the final stream handed on to the parser? The former we get from the lexer, the latter from the macro expander. In either case it should be OK to unget the token once we have examined it, although we have to be careful about whitespace handling if we got the token from the lexer instead.
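
To make the peek-then-unget step concrete, here is a minimal sketch. The `TokenSource` class and its `popToken`, `pushToken`, and `peekToken` methods are hypothetical names chosen for illustration, not KaTeX's lexer or macro expander API, and the toy tokenizer simply drops whitespace, which sidesteps the whitespace issue mentioned above rather than solving it.

```javascript
// Hypothetical sketch; not KaTeX's real Lexer or MacroExpander.
class TokenSource {
    constructor(input) {
        // Toy tokenizer: \name, \<single char>, or one non-space character.
        // Whitespace is discarded here, unlike in a real TeX lexer.
        this.pending = input.match(/\\[a-zA-Z]+|\\.|\S/g) || [];
        this.pending.reverse(); // so that pop() yields tokens in input order
    }

    // Remove and return the next token, or "EOF" when the input is exhausted.
    popToken() {
        return this.pending.length > 0 ? this.pending.pop() : "EOF";
    }

    // "Unget": put a token back so it is the next one returned.
    pushToken(token) {
        this.pending.push(token);
    }

    // Examine the next token without consuming it.
    peekToken() {
        const token = this.popToken();
        if (token !== "EOF") {
            this.pushToken(token);
        }
        return token;
    }
}

const source = new TokenSource("\\dots + n");
source.popToken();                // consume \dots
const next = source.peekToken();  // look at "+", leaving it in the stream
```

The same stack could sit in front of either the raw lexer or the macro expander; which one feeds it decides whether the peeked token is the next input token or the next unexpandable token.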

@edemaine
Member Author

edemaine commented Jan 10, 2017

I checked the amsmath code, and it uses \expandafter, so I believe we want the next unexpandable token from the final stream (latter). Relevant definitions:

\let\@xp=\expandafter
\DeclareRobustCommand{\dots}{%
  \ifmmode \@xp\mdots@\else \@xp\textellipsis \fi
}
\def\FN@{\futurelet\@let@token}
\def\mdots@{\FN@\mdots@@}

In general, though, I think we might need either one -- a plain \futurelet without these \expandafter shenanigans probably wants the very next token in the input (former). But maybe we can use direct access to the lexer in this case, via a context object.

@gagern
Collaborator

gagern commented Jan 11, 2017

What exactly does \expandafter do? Given input \expandafter\foo\bar I'd assume it would remove \foo from the sequence, then expand \bar one level, then put \foo back into the sequence and start the next cycle with expanding that. So it does not expand \bar as far as possible, that's what \edef is for.

I guess \expandafter could use two utility functions factored out from the current nextToken function: one to obtain the next token of input either from the stack or from the lexer, and one to perform a single level of expansion without doing so recursively. This should definitely be built on the argument-handling macro expander from #605, I'd say, so perhaps I should see to that first.
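
The one-level expansion described here can be sketched on plain token arrays. The macro table and the `expandOnce` and `expandAfter` helpers below are made up for illustration and ignore macro arguments entirely; they are not the expander from #605.

```javascript
// Hypothetical toy expander; macro names and bodies are invented.
const macros = new Map([
    ["\\bar", ["\\baz", "x"]],
    ["\\baz", ["y"]],
]);

// Expand the first token of `tokens` by one level only (no recursion).
function expandOnce(tokens) {
    const [head, ...rest] = tokens;
    const body = macros.get(head);
    return body === undefined ? tokens : [...body, ...rest];
}

// \expandafter: set the first token aside, expand the second token one
// level, then put the first token back in front.
function expandAfter(tokens) {
    const [first, ...rest] = tokens;
    return [first, ...expandOnce(rest)];
}

const result = expandAfter(["\\foo", "\\bar", "z"]);
// → ["\\foo", "\\baz", "x", "z"]
```

Note that \baz is left unexpanded in the result, which is the difference from \edef-style full expansion.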

@edemaine
Member Author

Ah, you're right about \expandafter. See this tutorial. In that case, just getting the next unexpanded token from the lexer would do pretty well for us, because we have so few macros and instead have lots of functions and symbols. The \expandafter in \dots is to get the \DOTSB etc. at the beginning of macros (which tell \dots what type of dots to be), but given that all our symbols are implemented via a different mechanism, this feature isn't critical to support in \dots at this time.

Anyway, I'd like to request that #605 be extended to allow a JavaScript function in defineMacro, which gives access to at least the lexer (presumably via a context object). Then I can move this PR to use that instead. Once \expandafter is also supported (which doesn't sound too difficult...), I can use it in \dots too.
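
Roughly what I have in mind, as a sketch only: the function-valued `defineMacro`, the `context.future()` accessor, and the toy `expand` loop are all assumptions for illustration and do not exist in this form. The rule for choosing the dots is also heavily simplified compared with amsmath.

```javascript
// Hypothetical API sketch; a function-valued defineMacro and context.future()
// are assumptions, not existing KaTeX features.
const macros = new Map();
function defineMacro(name, body) {
    macros.set(name, body);
}

// Simplified stand-in for amsmath's classification of the following token.
const binaryOrRelation = new Set(["+", "-", "=", "<", ">", "\\times", "\\cdot"]);

// \dots looks at the upcoming token and picks which dots to expand to.
defineMacro("\\dots", (context) =>
    binaryOrRelation.has(context.future()) ? "\\cdots" : "\\ldots");

// Toy expander over an array of tokens. A real expander would re-tokenize
// and re-expand the returned string; this one just substitutes it.
function expand(tokens) {
    const out = [];
    for (let i = 0; i < tokens.length; i++) {
        const body = macros.get(tokens[i]);
        if (body === undefined) {
            out.push(tokens[i]);
            continue;
        }
        const context = {future: () => tokens[i + 1]};
        out.push(typeof body === "function" ? body(context) : body);
    }
    return out;
}

const sum = expand(["1", "+", "\\dots", "+", "n"]);
const list = expand(["1", ",", "\\dots", ",", "n"]);
```

With this shape, `sum` ends up using \cdots and `list` ends up using \ldots, and the result is an ordinary token that the normal symbol machinery can handle.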

@gagern
Collaborator

gagern commented Jan 11, 2017

I'd not modify #605; it already does contain three distinct features, which might make review slower than it would otherwise be. But having a branch build on #605 and posted as a separate PR either immediately or once that got merged sounds like a good idea. Usually I'd say I'll have a look, but I feel we are opening up new areas of development faster than we are closing them right now, so I'm not sure I'll get to that in a timely manner.

@edemaine
Member Author

Sounds good; please reinterpret my request as features for a future follow-up to #605, then. I agree that waiting for #605 to be approved first might make sense. Then you or I can look at the requested extensions.

@edemaine edemaine force-pushed the master branch 2 times, most recently from cf1e646 to 08f40ee Compare January 13, 2017 20:58
remove 'var's and add screenshot test
@kohler
Collaborator

kohler commented Jul 25, 2017

I think this should be closed, right @edemaine?

@edemaine
Member Author

@kohler Nope, still need to revise this to work with new macro parser. \dots remains unimplemented in KaTeX.

@kevinbarabash
Member

Are the screenshot images correct? Here's what I get using quicklatex.com:
[image: quicklatex.com rendering]

and KaTeX:
[image: KaTeX rendering, dots-chrome screenshot]

@edemaine
Member Author

Definitely not correct. There were various issues, described above, with building the parse structure directly. Macros should let it actually expand to \cdots or \ldots.

@kevinbarabash
Member

@edemaine I'm going to check out your branch and see what's causing the square dots.

@kevinbarabash
Member

edemaine#2 fixes the issue with square dots.

@kevinbarabash kevinbarabash self-requested a review August 13, 2017 06:56
Member

@kevinbarabash kevinbarabash left a comment

Square dots issue needs to be resolved.

add group type 'dots' so that we get the right font w/o special casing group type 'op'
@kevinbarabash
Member

I'm not sure why the Dots screenshot test is failing now.

@edemaine
Member Author

Should be replaced by #794.

@edemaine edemaine closed this Aug 13, 2017