Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

\newcommand-like definitions with 'bare' command name #4470

Closed
minoki opened this issue Mar 19, 2018 · 4 comments · Fixed by #10573
Closed

\newcommand-like definitions with 'bare' command name #4470

minoki opened this issue Mar 19, 2018 · 4 comments · Fixed by #10573

Comments

@minoki
Copy link
Contributor

minoki commented Mar 19, 2018

When defining a LaTeX macro, subsequent occurrences of \renewcommand or \providecommand with 'bare' command name (i.e. not enclosed in { }) fails because the (bare) command name gets wrongly expanded.

Input:

\newcommand{\foo}{123}\renewcommand\foo{456}\foo

Expected output:

456

Actual output (with 2.1.3 from git master):

$ echo '\newcommand{\foo}{123}\renewcommand\foo{456}\foo' | stack exec pandoc -- -f latex -t native

Error at "source" (line 1, column 36):
unexpected 123
\newcommand{\foo}{123}\renewcommand\foo{456}\foo
                                   ^
@minoki
Copy link
Contributor Author

minoki commented Mar 19, 2018

My understanding is:

In newcommand function, the part that reads the command name seems to be guarded from macro expansion with withVerbatimMode:

newcommand :: PandocMonad m => LP m (Text, Macro)
newcommand = do
pos <- getPosition
Tok _ (CtrlSeq mtype) _ <- controlSeq "newcommand" <|>
controlSeq "renewcommand" <|>
controlSeq "providecommand" <|>
controlSeq "DeclareRobustCommand"
optional $ symbol '*'
Tok _ (CtrlSeq name) txt <- withVerbatimMode $ anyControlSeq <|>
(symbol '{' *> spaces *> anyControlSeq <* spaces <* symbol '}')

But this withVerbatimMode has no effect for 'bare' command name, because the expansion is already done in satisfyTok that read the prior \renewcommand token (in controlSeq "renewcommand"), which calls doMacros 0:

satisfyTok :: PandocMonad m => (Tok -> Bool) -> LP m Tok
satisfyTok f =
try $ do
res <- tokenPrim (T.unpack . untoken) updatePos matcher
doMacros 0 -- apply macros on remaining input stream
return res

i.e. satisfyTok should not expand the following token by calling doMacros 0.

@jgm
Copy link
Owner

jgm commented Mar 19, 2018

I tried moving doMacros 0 to the beginning of the do block, and all tests pass (including your test above). However, there is some performance penalty, which is no doubt why I put doMacros 0 where it is in the first place. The idea is that we don't want to resolve macros every time we test for a new token, but only when we actually read a token. The perf penalty isn't huge but it's significant.

Perhaps this could be worked around by having a state variable that gets set when a token is read, and reset when doMacros 0 is done, and making doMacros 0 conditional on this variable.

@jgm
Copy link
Owner

jgm commented Mar 19, 2018

Oddly, the technique I sketched didn't work at all.

On a crude benchmark:

doMacro after getting token: 1.4s

doMacro before getting token: 1.56s

doMacro conditional on a state variable, ensuring that we don't run it too many times: 1.8s!

I can't really explain this.

@jgm
Copy link
Owner

jgm commented Mar 19, 2018

here's the code I tried:

diff --git a/src/Text/Pandoc/Readers/LaTeX.hs b/src/Text/Pandoc/Readers/LaTeX.hs
index 23b68361e..44a9771f9 100644
--- a/src/Text/Pandoc/Readers/LaTeX.hs
+++ b/src/Text/Pandoc/Readers/LaTeX.hs
@@ -159,6 +159,7 @@ data LaTeXState = LaTeXState{ sOptions       :: ReaderOptions
                             , sLogMessages   :: [LogMessage]
                             , sIdentifiers   :: Set.Set String
                             , sVerbatimMode  :: Bool
+                            , sExpanded      :: Bool
                             , sCaption       :: Maybe Inlines
                             , sInListItem    :: Bool
                             , sInTableCell   :: Bool
@@ -178,6 +179,7 @@ defaultLaTeXState = LaTeXState{ sOptions       = def
                               , sLogMessages   = []
                               , sIdentifiers   = Set.empty
                               , sVerbatimMode  = False
+                              , sExpanded      = False
                               , sCaption       = Nothing
                               , sInListItem    = False
                               , sInTableCell   = False
@@ -401,8 +403,12 @@ untoken (Tok _ _ t) = t
 satisfyTok :: PandocMonad m => (Tok -> Bool) -> LP m Tok
 satisfyTok f =
   try $ do
+    expanded <- sExpanded <$> getState
+    unless expanded $ do
+      doMacros 0 -- apply macros on remaining input stream
+      updateState $ \st -> st{ sExpanded = True }
     res <- tokenPrim (T.unpack . untoken) updatePos matcher
-    doMacros 0 -- apply macros on remaining input stream
+    updateState $ \st -> st{ sExpanded = False }
     return res
   where matcher t | f t       = Just t
                   | otherwise = Nothing

silby added a commit to silby/pandoc that referenced this issue Jan 28, 2025
@jgm jgm closed this as completed in 0be07a5 Jan 29, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants