Commit a13e9dd

userdiff: extend Scheme support for other Lisp dialects
Common Lisp, Emacs Lisp, and other dialects have some top-level forms, most importantly 'defun', that are not matched by the current Scheme pattern. Also, it is common in these dialects, when defining user macros intended as top-level forms, to prefix their names with "def" instead of "define"; such forms are also not currently matched. Some such forms don't even begin with "def".

On the other hand, it is an established formatting convention in the Lisp community that only top-level forms start at the left margin. So matching any unindented line starting with an open parenthesis is an acceptable heuristic; false positives will be rare.

However, there are also cases where notionally top-level forms are grouped together within some containing form. At least in the Common Lisp community, it is conventional to indent these by two spaces, or sometimes one. But matching just an open parenthesis indented by two spaces would be too broad; so the pattern added by this commit requires an indented form to start with "(def". It is believed that this strikes a good balance between potential false positives and false negatives.

This commit disjoins a regexp employing these heuristics to the existing Scheme regexp, so it will still match everything that it did previously.
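
The heuristics above can be tried out in isolation. The following is only a sketch, not Git's own code: it assumes that each newline-separated line of a funcname pattern is an independent POSIX extended regex and that a line qualifies when any of them matches. The helper name `is_funcname_line` is hypothetical; the two pattern strings are copied from the "scheme" entry in the diff.

```c
#include <assert.h>
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * The two alternatives of the "scheme" funcname pattern. In userdiff.c
 * they are joined by "\n"; keeping them as separate strings and trying
 * each in turn is an assumption of this sketch.
 */
static const char *const funcname_patterns[] = {
	/* A possibly indented left paren followed by a Scheme keyword. */
	"^[\t ]*(\\(((define|def(struct|syntax|class|method|rules|record|proto|alias)?)[-*/ \t]|(library|module|struct|class)[*+ \t]).*)$",
	/* An unindented left paren, or "(def" indented by one or two spaces. */
	"^((\\(| {1,2}\\([Dd][Ee][Ff]).*)$",
};

/* Hypothetical helper: true if `line` matches any of the patterns. */
static bool is_funcname_line(const char *line)
{
	size_t n = sizeof(funcname_patterns) / sizeof(funcname_patterns[0]);
	bool found = false;

	for (size_t i = 0; i < n && !found; i++) {
		regex_t re;
		int rc = regcomp(&re, funcname_patterns[i],
				 REG_EXTENDED | REG_NOSUB);

		assert(rc == 0); /* the patterns are compile-time constants */
		(void)rc;
		found = regexec(&re, line, 0, NULL, 0) == 0;
		regfree(&re);
	}
	return found;
}
```

Under these assumptions, `is_funcname_line("(defun foo ()")` and `is_funcname_line("  (defmacro bar ()")` are true, while a two-space-indented `(let ...` form is rejected. A `(defun` indented by four spaces is also rejected, whereas a `(define` at the same depth is still accepted by the original Scheme alternative.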
1 parent da99bb0 commit a13e9dd

1 file changed, 12 insertions(+), 13 deletions(-)
userdiff.c

Lines changed: 12 additions & 13 deletions
@@ -249,14 +249,6 @@ PATTERNS("kotlin",
 	 "|[.][0-9][0-9_]*([Ee][-+]?[0-9]+)?[fFlLuU]?"
 	 /* unary and binary operators */
 	 "|[-+*/<>%&^|=!]==?|--|\\+\\+|<<=|>>=|&&|\\|\\||->|\\.\\*|!!|[?:.][.:]"),
-PATTERNS("lisp",
-	 /* Either an unindented left paren, or a slightly indented line
-	  * starting with "(def" */
-	 "^((\\(|:space:{1,2}\\(def).*)$",
-	 /* Common Lisp symbol syntax allows arbitrary strings between vertical bars */
-	 "\\|([^\\\\]|\\\\\\\\|\\\\\\|)*\\|"
-	 /* All other words are delimited by spaces or parentheses/brackets/braces */
-	 "|([^][(){} \t])+"),
 PATTERNS("markdown",
 	 "^ {0,3}#{1,6}[ \t].*",
 	 /* -- */
@@ -352,14 +344,21 @@ PATTERNS("rust",
 	 "|[0-9][0-9_a-fA-Fiosuxz]*(\\.([0-9]*[eE][+-]?)?[0-9_fF]*)?"
 	 "|[-+*\\/<>%&^|=!:]=|<<=?|>>=?|&&|\\|\\||->|=>|\\.{2}=|\\.{3}|::"),
 PATTERNS("scheme",
-	 "^[\t ]*(\\(((define|def(struct|syntax|class|method|rules|record|proto|alias)?)[-*/ \t]|(library|module|struct|class)[*+ \t]).*)$",
+	 /* A possibly indented left paren followed by a Scheme keyword. */
+	 "^[\t ]*(\\(((define|def(struct|syntax|class|method|rules|record|proto|alias)?)[-*/ \t]|(library|module|struct|class)[*+ \t]).*)$\n"
+	 /*
+	  * For other Lisp dialects: either an unindented left paren, or a
+	  * slightly indented line starting with "(def".
+	  */
+	 "^((\\(| {1,2}\\([Dd][Ee][Ff]).*)$",
 	 /*
-	  * R7RS valid identifiers include any sequence enclosed
-	  * within vertical lines having no backslashes
+	  * The union of R7RS and Common Lisp symbol syntax: allows arbitrary
+	  * strings between vertical bars, including escaped backslashes and
+	  * vertical bars.
 	  */
-	 "\\|([^\\\\]*)\\|"
+	 "\\|([^\\\\]|\\\\\\\\|\\\\\\|)*\\|"
 	 /* All other words should be delimited by spaces or parentheses */
-	 "|([^][)(}{[ \t])+"),
+	 "|([^][)(}{ \t])+"),
 PATTERNS("tex", "^(\\\\((sub)*section|chapter|part)\\*{0,1}\\{.*)$",
 	 "\\\\[a-zA-Z@]+|\\\\.|([a-zA-Z0-9]|[^\x01-\x7f])+"),
 { .name = "default", .binary = -1 },
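
The updated word regex for "scheme" can be checked the same way. This is a minimal sketch under stated assumptions: it compiles only the two alternatives shown in the diff as one POSIX extended regex, ignoring anything the `PATTERNS` macro may append, and the helper name `first_word_length` is hypothetical.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Word regex copied from the new "scheme" entry in the diff. */
static const char word_pattern[] =
	/* A symbol between vertical bars; \\ and \| are allowed inside. */
	"\\|([^\\\\]|\\\\\\\\|\\\\\\|)*\\|"
	/* Any other run of non-delimiter characters. */
	"|([^][)(}{ \t])+";

/*
 * Hypothetical helper: length of the word that the regex matches at
 * the very start of `text`, or 0 if no match starts there.
 */
static size_t first_word_length(const char *text)
{
	regex_t re;
	regmatch_t m;
	size_t len = 0;
	int rc = regcomp(&re, word_pattern, REG_EXTENDED);

	assert(rc == 0); /* the pattern is a compile-time constant */
	(void)rc;
	if (regexec(&re, text, 1, &m, 0) == 0 && m.rm_so == 0)
		len = (size_t)(m.rm_eo - m.rm_so);
	regfree(&re);
	return len;
}
```

With a POSIX leftmost-longest matcher, `first_word_length("|foo bar| baz")` covers the whole bar-quoted symbol including the embedded space, and the C string `"|a\\|b| c"` is treated as a single symbol containing an escaped vertical bar. A plain word such as `defun` in `"defun foo"` stops at the first space.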
