Add escape(), unescape(), and Minimatch.hasMagic()
Also, treat single-character brace classes as their literal character,
without magic.

So for example, `[f]` would be parsed as just `'f'`, and not treated as
a magic pattern.
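
The single-character rule can be illustrated with a minimal sketch. It mirrors the check added to `parseClass` in `src/brace-expressions.ts` in this commit, but `literalFromClass` is a hypothetical helper name used only here. The inputs are assumed to be the already-escaped `ranges`, the `negs` list, and the `negate` flag that `parseClass` collects while scanning.

```typescript
// Escape all regexp magic characters (same pattern as regexpEscape in the diff).
const regexpEscape = (s: string): string =>
  s.replace(/[-[\]{}()*+?.,\\^$|#\s]/g, '\\$&')

// Hypothetical helper: returns the regexp source for a literal character
// when the class holds exactly one positive, single-character range,
// or undefined when the class is genuinely magic.
const literalFromClass = (
  ranges: string[],
  negs: string[],
  negate: boolean
): string | undefined => {
  const first = ranges[0]
  if (
    first !== undefined &&
    negs.length === 0 &&
    ranges.length === 1 &&
    /^\\?.$/.test(first) &&
    !negate
  ) {
    // Drop the brace-level backslash escape, if any, then re-escape for regexp use.
    const r = first.length === 2 ? first.slice(-1) : first
    return regexpEscape(r)
  }
  return undefined
}

const plain = literalFromClass(['f'], [], false) // 'f'
const star = literalFromClass(['*'], [], false) // '\\*' (a literal star)
const range = literalFromClass(['a-z'], [], false) // undefined, still magic
const negated = literalFromClass(['f'], [], true) // undefined, still magic
console.log({ plain, star, range, negated })
```

Under these assumptions, `[f]` and `[*]` reduce to literal characters, while `[a-z]` and `[!f]` remain magic.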
isaacs committed Mar 1, 2023
1 parent 75a8b84 commit 327cb60
Showing 13 changed files with 2,328 additions and 67 deletions.
55 changes: 54 additions & 1 deletion README.md
@@ -122,14 +122,29 @@ var mm = new Minimatch(pattern, options)

### Methods

- `makeRe` Generate the `regexp` member if necessary, and return it.
- `makeRe()` Generate the `regexp` member if necessary, and return it.
Will return `false` if the pattern is invalid.
- `match(fname)` Return true if the filename matches the pattern, or
false otherwise.
- `matchOne(fileArray, patternArray, partial)` Take a `/`-split
filename, and match it against a single row in the `regExpSet`. This
method is mainly for internal use, but is exposed so that it can be
used by a glob-walker that needs to avoid excessive filesystem calls.
- `hasMagic()` Returns true if the parsed pattern contains any
magic characters. Returns false if all comparator parts are
string literals. If the `magicalBraces` option is set on the
constructor, then it will consider brace expansions which are
not otherwise magical to be magic. If not set, then a pattern
like `a{b,c}d` will return `false`, because neither `abd` nor
`acd` contain any special glob characters.

This does **not** mean that the pattern string can be used as a
literal filename, as it may contain magic glob characters that
are escaped. For example, the pattern `\\*` or `[*]` would not
be considered to have magic, as the matching portion parses to
the literal string `'*'` and would match a path named `'*'`,
not `'\\*'` or `'[*]'`. The `minimatch.unescape()` method may
be used to remove escape characters.
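
As a rough illustration of the rule above, the following sketch models the parsed pattern set as arrays of string literals and `RegExp` parts. The `Part` type and the `hasMagicSketch` name are assumptions made for this example; the real `Minimatch` instance builds its `set` internally and may hold other part types as well.

```typescript
// Simplified model of a parsed pattern part: a literal string or a compiled matcher.
type Part = string | RegExp

// Sketch of the hasMagic() logic shown in this commit, taking the set as an argument.
const hasMagicSketch = (set: Part[][], magicalBraces = false): boolean => {
  // With magicalBraces, more than one expanded pattern counts as magic.
  if (magicalBraces && set.length > 1) return true
  // Otherwise, any non-string part means the pattern has magic.
  return set.some(pattern => pattern.some(part => typeof part !== 'string'))
}

// `a{b,c}d` expands to two patterns made only of literals.
const braceSet: Part[][] = [['abd'], ['acd']]
// `\*` or `[*]` parses to the literal string '*'.
const escapedStar: Part[][] = [['*']]
// `src/*.js` has one literal part and one magic part.
const realGlob: Part[][] = [['src', /^[^\/]*\.js$/]]

const results = {
  braces: hasMagicSketch(braceSet), // false
  bracesMagical: hasMagicSketch(braceSet, true), // true
  escaped: hasMagicSketch(escapedStar), // false
  glob: hasMagicSketch(realGlob), // true
}
console.log(results)
```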

All other methods are internal, and will be called as necessary.

@@ -150,6 +165,34 @@ supplied argument, suitable for use with `Array.filter`. Example:
var javascripts = fileList.filter(minimatch.filter('*.js', { matchBase: true }))
```

### minimatch.escape(pattern, options = {})

Escape all magic characters in a glob pattern, so that it will
only ever match literal strings.

If the `windowsPathsNoEscape` option is used, then characters are
escaped by wrapping in `[]`, because a magic character wrapped in
a character class can only be satisfied by that exact character.

Slashes (and backslashes in `windowsPathsNoEscape` mode) cannot
be escaped or unescaped.
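
The two escaping modes can be compared with a standalone sketch. The function body copies the replacement logic from `src/escape.ts` in this commit; the name `escapeGlob` and the positional boolean parameter are assumptions made here to keep the example self-contained, not the library's signature.

```typescript
// Standalone copy of the escape logic, with the option passed as a plain boolean.
const escapeGlob = (s: string, windowsPathsNoEscape = false): string =>
  windowsPathsNoEscape
    ? s.replace(/[?*()[\]]/g, '[$&]') // wrap each magic character in a class
    : s.replace(/[?*()[\]\\]/g, '\\$&') // prefix each magic character with a backslash

const posixEscaped = escapeGlob('file(1)*.txt')
const windowsEscaped = escapeGlob('file(1)*.txt', true)

console.log(posixEscaped) // file\(1\)\*.txt
console.log(windowsEscaped) // file[(]1[)][*].txt
```

Note that only the default mode escapes `\` itself; in `windowsPathsNoEscape` mode a backslash is left alone because it is treated as a path separator.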

### minimatch.unescape(pattern, options = {})

Un-escape a glob string that may contain some escaped characters.

If the `windowsPathsNoEscape` option is used, then square-brace
escapes are removed, but not backslash escapes. For example, it
will turn the string `'[*]'` into `*`, but it will not turn
`'\\*'` into `'*'`, because `\` is a path separator in
`windowsPathsNoEscape` mode.

When `windowsPathsNoEscape` is not set, then both brace escapes
and backslash escapes are removed.

Slashes (and backslashes in `windowsPathsNoEscape` mode) cannot
be escaped or unescaped.
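
A minimal sketch of the behavior described above, reusing the regular expressions from `src/unescape.ts` in this commit. The name `unescapeGlob` and the positional boolean parameter are assumptions for illustration only.

```typescript
// Standalone copy of the unescape logic, with the option passed as a plain boolean.
const unescapeGlob = (s: string, windowsPathsNoEscape = false): string =>
  windowsPathsNoEscape
    ? // only square-bracket escapes are removed
      s.replace(/\[([^\/\\])\]/g, '$1')
    : // square-bracket escapes first, then backslash escapes
      s
        .replace(/((?!\\).|^)\[([^\/])\]/g, '$1$2')
        .replace(/\\([^\/])/g, '$1')

const fromBrackets = unescapeGlob('[*]') // '*'
const fromBackslash = unescapeGlob('\\*') // '*'
const windowsBrackets = unescapeGlob('[*]', true) // '*'
const windowsBackslash = unescapeGlob('\\*', true) // '\\*', left unchanged

console.log({ fromBrackets, fromBackslash, windowsBrackets, windowsBackslash })
```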

### minimatch.match(list, pattern, options)

Match against the list of
@@ -212,6 +255,16 @@ When a match is not found by `minimatch.match`, return a list containing
the pattern itself if this option is set. When not set, an empty list
is returned if there are no matches.

### magicalBraces

This only affects the results of the `Minimatch.hasMagic` method.

If the pattern contains brace expansions, such as `a{b,c}d`, but
no other magic characters, then the `Minimatch.hasMagic()` method
will return `false` by default. When this option is set, it will
return `true` for brace expansion as well as other magic glob
characters.

### matchBase

If set, then patterns without slashes will be matched
6 changes: 6 additions & 0 deletions changelog.md
@@ -1,5 +1,11 @@
# change log

## 7.4

- Add `escape()` method
- Add `unescape()` method
- Add `Minimatch.hasMagic()` method

## 7.3

- Add support for posix character classes in a unicode-aware way.
56 changes: 38 additions & 18 deletions src/brace-expressions.ts
@@ -20,15 +20,21 @@ const posixClasses: { [k: string]: [e: string, u: boolean, n?: boolean] } = {
}

// only need to escape a few things inside of brace expressions
const regExpEscape = (s: string) => s.replace(/[[\]\\-]/g, '\\$&')

const rangesToString = (ranges: string[]): string => {
return (
ranges
// .map(r => r.replace(/[[\]]/g, '\\$&').replace(/^-/, '\\-'))
.join('')
)
}
// escapes: [ \ ] -
const braceEscape = (s: string) => s.replace(/[[\]\\-]/g, '\\$&')
// escape all regexp magic characters
const regexpEscape = (s: string) =>
s.replace(/[-[\]{}()*+?.,\\^$|#\s]/g, '\\$&')

// everything has already been escaped, we just have to join
const rangesToString = (ranges: string[]): string => ranges.join('')

export type ParseClassResult = [
src: string,
uFlag: boolean,
consumed: number,
hasMagic: boolean
]

// takes a glob string at a posix brace expression, and returns
// an equivalent regular expression source, and boolean indicating
@@ -39,7 +45,7 @@ const rangesToString = (ranges: string[]): string => {
export const parseClass = (
glob: string,
position: number
): [string, boolean, number] => {
): ParseClassResult => {
const pos = position
/* c8 ignore start */
if (glob.charAt(pos) !== '[') {
@@ -84,7 +90,7 @@ export const parseClass = (
if (glob.startsWith(cls, i)) {
// invalid, [a-[] is fine, but not [a-[:alpha]]
if (rangeStart) {
return ['$.', false, glob.length - pos]
return ['$.', false, glob.length - pos, true]
}
i += cls.length
if (neg) negs.push(unip)
@@ -101,9 +107,9 @@ export const parseClass = (
// throw this range away if it's not valid, but others
// can still match.
if (c > rangeStart) {
ranges.push(regExpEscape(rangeStart) + '-' + regExpEscape(c))
ranges.push(braceEscape(rangeStart) + '-' + braceEscape(c))
} else if (c === rangeStart) {
ranges.push(regExpEscape(c))
ranges.push(braceEscape(c))
}
rangeStart = ''
i++
@@ -113,7 +119,7 @@ export const parseClass = (
// now might be the start of a range.
// can be either c-d or c-] or c<more...>] or c] at this point
if (glob.startsWith('-]', i + 1)) {
ranges.push(regExpEscape(c + '-'))
ranges.push(braceEscape(c + '-'))
i += 2
continue
}
@@ -124,20 +130,34 @@ export const parseClass = (
}

// not the start of a range, just a single character
ranges.push(regExpEscape(c))
ranges.push(braceEscape(c))
i++
}

if (endPos < i) {
// didn't see the end of the class, not a valid class,
// but might still be valid as a literal match.
return ['', false, 0]
return ['', false, 0, false]
}

// if we got no ranges and no negates, then we have a range that
// cannot possibly match anything, and that poisons the whole glob
if (!ranges.length && !negs.length) {
return ['$.', false, glob.length - pos]
return ['$.', false, glob.length - pos, true]
}

// if we got one positive range, and it's a single character, then that's
// not actually a magic pattern, it's just that one literal character.
// we should not treat that as "magic", we should just return the literal
// character. [_] is a perfectly valid way to escape glob magic chars.
if (
negs.length === 0 &&
ranges.length === 1 &&
/^\\?.$/.test(ranges[0]) &&
!negate
) {
const r = ranges[0].length === 2 ? ranges[0].slice(-1) : ranges[0]
return [regexpEscape(r), false, endPos - pos, false]
}

const sranges = '[' + (negate ? '^' : '') + rangesToString(ranges) + ']'
@@ -149,5 +169,5 @@ export const parseClass = (
? sranges
: snegs

return [comb, uflag, endPos - pos]
return [comb, uflag, endPos - pos, true]
}
23 changes: 23 additions & 0 deletions src/escape.ts
@@ -0,0 +1,23 @@
import { MinimatchOptions } from './index.js'
/**
* Escape all magic characters in a glob pattern.
*
* If the {@link windowsPathsNoEscape | GlobOptions.windowsPathsNoEscape}
* option is used, then characters are escaped by wrapping in `[]`, because
* a magic character wrapped in a character class can only be satisfied by
* that exact character. In this mode, `\` is _not_ escaped, because it is
* not interpreted as a magic character, but instead as a path separator.
*/
export const escape = (
s: string,
{
windowsPathsNoEscape = false,
}: Pick<MinimatchOptions, 'windowsPathsNoEscape'> = {}
) => {
// don't need to escape +@! because we escape the parens
// that make those magic, and escaping ! as [!] isn't valid,
// because [!]] is a valid glob class meaning not ']'.
return windowsPathsNoEscape
? s.replace(/[?*()[\]]/g, '[$&]')
: s.replace(/[?*()[\]\\]/g, '\\$&')
}
38 changes: 34 additions & 4 deletions src/index.ts
@@ -1,5 +1,7 @@
import expand from 'brace-expansion'
import { parseClass } from './brace-expressions.js'
import { escape } from './escape.js'
import { unescape } from './unescape.js'

export interface MinimatchOptions {
nobrace?: boolean
@@ -15,6 +17,7 @@ export interface MinimatchOptions {
dot?: boolean
nocase?: boolean
nocaseMagicOnly?: boolean
magicalBraces?: boolean
matchBase?: boolean
flipNegate?: boolean
preserveMultipleSlashes?: boolean
@@ -182,6 +185,16 @@ export const defaults = (def: MinimatchOptions): typeof minimatch => {
}
},

unescape: (
s: string,
options: Pick<MinimatchOptions, 'windowsPathsNoEscape'> = {}
) => orig.unescape(s, ext(def, options)),

escape: (
s: string,
options: Pick<MinimatchOptions, 'windowsPathsNoEscape'> = {}
) => orig.escape(s, ext(def, options)),

filter: (pattern: string, options: MinimatchOptions = {}) =>
orig.filter(pattern, ext(def, options)),

@@ -353,6 +366,18 @@ export class Minimatch {
this.make()
}

hasMagic(): boolean {
if (this.options.magicalBraces && this.set.length > 1) {
return true
}
for (const pattern of this.set) {
for (const part of pattern) {
if (typeof part !== 'string') return true
}
}
return false
}

debug(..._: any[]) {}

make() {
@@ -1182,12 +1207,12 @@ export class Minimatch {
case '[':
// swallow any state-tracking char before the [
clearStateChar()
const [src, needUflag, consumed] = parseClass(pattern, i)
const [src, needUflag, consumed, magic] = parseClass(pattern, i)
if (consumed) {
re += src
uflag = uflag || needUflag
i += consumed - 1
hasMagic = true
hasMagic = hasMagic || magic
} else {
re += '\\['
}
Expand Down Expand Up @@ -1303,7 +1328,7 @@ export class Minimatch {
// unescape anything in it, though, so that it'll be
// an exact match against a file etc.
if (!hasMagic) {
return globUnescape(pattern)
return globUnescape(re)
}

const flags = (options.nocase ? 'i' : '') + (uflag ? 'u' : '')
@@ -1496,5 +1521,10 @@ export class Minimatch {
return minimatch.defaults(def).Minimatch
}
}

/* c8 ignore start */
export { escape } from './escape.js'
export { unescape } from './unescape.js'
/* c8 ignore stop */
minimatch.Minimatch = Minimatch
minimatch.escape = escape
minimatch.unescape = unescape
25 changes: 25 additions & 0 deletions src/unescape.ts
@@ -0,0 +1,25 @@
import { MinimatchOptions } from './index.js'
/**
* Un-escape a string that has been escaped with {@link escape}.
*
* If the {@link windowsPathsNoEscape} option is used, then square-brace
* escapes are removed, but not backslash escapes. For example, it will turn
* the string `'[*]'` into `*`, but it will not turn `'\\*'` into `'*'`,
* because `\` is a path separator in `windowsPathsNoEscape` mode.
*
* When `windowsPathsNoEscape` is not set, then both brace escapes and
* backslash escapes are removed.
*
* Slashes (and backslashes in `windowsPathsNoEscape` mode) cannot be escaped
* or unescaped.
*/
export const unescape = (
s: string,
{
windowsPathsNoEscape = false,
}: Pick<MinimatchOptions, 'windowsPathsNoEscape'> = {}
) => {
return windowsPathsNoEscape
? s.replace(/\[([^\/\\])\]/g, '$1')
: s.replace(/((?!\\).|^)\[([^\/])\]/g, '$1$2').replace(/\\([^\/])/g, '$1')
}