False positives, false negatives #6

Open
pygy opened this issue Jan 22, 2020 · 0 comments

Besides #3, you can also trivially fool the detector with comments, strings, or even regexps.

e.g. if you feed it this file, it returns true:

// fool-is-module.js
const foo = ";import {} from ''";
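To make the false positive concrete, here is a minimal sketch. `naiveImport` is a hand-written approximation of the import branch of the current pattern (an assumption for illustration, not the library's actual source), and `naiveIsModule` is a hypothetical wrapper around it.

```javascript
// Hypothetical approximation of the detector's "import ... from" branch.
const naiveImport = /(?:^\s*|[}{();,\n]\s*)import\s+[^"'()\n;]+\s+from\s+['"]/;

// Hypothetical wrapper: reports whether the pattern matches anywhere in the code.
function naiveIsModule(code) {
  return naiveImport.test(code);
}

// The file from this issue: the only "import" sits inside a string literal.
const foolingSource = `// fool-is-module.js
const foo = ";import {} from ''";
`;

console.log(naiveIsModule(foolingSource)); // true, although the file is not a module
```

The `;` inside the string satisfies the "statement boundary" alternative, so the pattern cannot tell the string contents from real code.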

The current regexps are also pretty contrived. They are equivalent to

const ES6ImportExportRegExp = sequence(
  either(/^\s*/, /[}{\(\);,\n]\s*/),
  capture(
    either(
      sequence('import', /\s+['"]/),
      sequence(
        capture(either('import', 'module')),
        /\s+[^"'\(\)\n;]+\s+/, "from", /\s+['"]/
      ),
      sequence("export", /\s+/, capture(either(
        "*",
        "{",
        "default", "function", "var", "const", "let",
        /[_$a-zA-Z\xA0-\uFFFF][_$a-zA-Z0-9\xA0-\uFFFF]*/
      )))
    )
  )
)

const ES6AliasRegExp = sequence(
  either(/^\s*/, /[}{\(\);,\n]\s*/),
  capture(
    "export", /\s*/, "*", /\s*/, "from", /\s*/, either(
        sequence("'", capture(/[^']+/), "'"),
        sequence('"', capture(/[^"]+/), '"')
    )
  )
)
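For readers unfamiliar with the combinator notation, the sketch below shows one way such helpers can map onto plain `RegExp` sources. The helper names mirror the ones used above, but these implementations are hypothetical stand-ins written for this example; compose-regexp's real implementation handles more cases (flags, back-references, and so on).

```javascript
// Hypothetical stand-ins for the combinators, NOT compose-regexp's own code.
const escapeLiteral = (s) => s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
// Strings are matched literally; RegExps contribute their source unchanged.
const toSource = (x) => (x instanceof RegExp ? x.source : escapeLiteral(String(x)));

const sequence = (...parts) =>
  new RegExp(parts.map((p) => `(?:${toSource(p)})`).join(""));
const either = (...parts) =>
  new RegExp(`(?:${parts.map(toSource).join("|")})`);
const capture = (...parts) =>
  new RegExp(`(${sequence(...parts).source})`);

// Same shape as the ES6AliasRegExp shown above.
const ES6AliasRegExp = sequence(
  either(/^\s*/, /[}{\(\);,\n]\s*/),
  capture(
    "export", /\s*/, "*", /\s*/, "from", /\s*/, either(
      sequence("'", capture(/[^']+/), "'"),
      sequence('"', capture(/[^"]+/), '"')
    )
  )
);

const aliasMatch = ES6AliasRegExp.exec("export * from 'foo'");
console.log(aliasMatch[1]); // the whole export statement
console.log(aliasMatch[2]); // the module specifier inside single quotes
```

With these stand-ins, group 1 is the outer `capture`, group 2 the single-quoted specifier, and group 3 the double-quoted one.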

I propose we move to something like

const moduleFinder = sequence(
  zeroplus(either(
    strings, comment, identifiers,
    sequence(avoid(moduleStatement), /[\s\S]/)
  )),
  moduleStatement
)

Where moduleStatement is defined according to the JS spec (1 and 2). strings would also cover the various bits of template literals. By matching identifiers explicitly, we avoid the issue where an identifier ends in import or export (JS regexps are eager). Edit: more thinking is required here... Edit2: done, it should work.
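The same idea can be sketched without any dependency as a small scanner built from sticky regexps. Everything here is an assumption made for illustration: `skipTokens`, `moduleStatement`, and `looksLikeModule` are hypothetical names, and `moduleStatement` is a rough approximation rather than the spec grammar. Template literal interpolations and regexp literals are not handled.

```javascript
// Tokens to skip over. All patterns are sticky, so they only match at lastIndex.
const skipTokens = [
  /\/\/[^\n]*/y,                                      // line comment
  /\/\*[\s\S]*?\*\//y,                                // block comment
  /'(?:[^'\\\n]|\\[\s\S])*'/y,                        // single-quoted string
  /"(?:[^"\\\n]|\\[\s\S])*"/y,                        // double-quoted string
  /`(?:[^`\\]|\\[\s\S])*`/y,                          // template literal, interpolations NOT parsed
  /[_$a-zA-Z\xA0-\uFFFF][_$a-zA-Z0-9\xA0-\uFFFF]*/y,  // identifier or keyword
];

// Rough approximation of an import/export statement (assumption, not the spec grammar).
const moduleStatement = new RegExp(
  "import(?:\\s*['\"{*]|\\s+[_$a-zA-Z][\\w$]*\\s*(?:,|from\\b))" +
  "|export(?:\\s*[*{]|\\s+(?:default|function|class|var|const|let|async)\\b)",
  "y"
);

function looksLikeModule(code) {
  let i = 0;
  while (i < code.length) {
    // Try the module statement first, then any skippable token, else advance one char.
    moduleStatement.lastIndex = i;
    if (moduleStatement.test(code)) return true;
    const hit = skipTokens.find((re) => {
      re.lastIndex = i;
      return re.test(code); // on success, lastIndex points just past the token
    });
    i = hit ? hit.lastIndex : i + 1;
  }
  return false;
}

console.log(looksLikeModule(`const foo = ";import {} from ''";`)); // false
console.log(looksLikeModule(`import {a} from "./a.js"`));          // true
```

Because whole identifiers are consumed in one step, a name such as `reimport` is skipped before `import` can be tried at an offset inside it.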

The module could still be fooled by a regexp that's been designed to trick it (e.g. r = /;import {} from ''/). Avoiding this would require more logic than what RegExps provide AFAICT.

Edit: assuming you decided to use compose-regexp, it can be kept as a dev-dependency.

Edit3: dealing with interpolations in template literals would also require an actual parser (though not a complex one).
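As a rough idea of how small that parser could be, here is a sketch of two mutually recursive functions that skip a template literal including nested `${...}` interpolations. `skipTemplate` and `skipInterpolation` are hypothetical names; the sketch assumes interpolations contain only braces, plain strings and nested templates, and it ignores comments and regexp literals inside them.

```javascript
// Returns the index just past the template literal that starts at code[start]
// (which must be a backtick), or -1 if the literal is unterminated.
function skipTemplate(code, start) {
  let i = start + 1;
  while (i < code.length) {
    const ch = code[i];
    if (ch === "\\") { i += 2; continue; }      // escaped character
    if (ch === "`") return i + 1;               // closing backtick
    if (ch === "$" && code[i + 1] === "{") {
      i = skipInterpolation(code, i + 2);
      if (i === -1) return -1;
      continue;
    }
    i += 1;
  }
  return -1;
}

// Skips the expression inside ${...}, starting just after the "{".
// Returns the index just past the matching "}", or -1 if it is missing.
function skipInterpolation(code, start) {
  let depth = 1;
  let i = start;
  while (i < code.length) {
    const ch = code[i];
    if (ch === "`") {                           // nested template literal
      i = skipTemplate(code, i);
      if (i === -1) return -1;
      continue;
    }
    if (ch === "'" || ch === '"') {             // plain string inside the expression
      i += 1;
      while (i < code.length && code[i] !== ch) i += code[i] === "\\" ? 2 : 1;
      i += 1;
      continue;
    }
    if (ch === "{") depth += 1;
    else if (ch === "}" && --depth === 0) return i + 1;
    i += 1;
  }
  return -1;
}

const nested = "`a${`b${1}`}c`;rest";
console.log(nested.slice(skipTemplate(nested, 0))); // ";rest"
```

A scanner could call `skipTemplate` whenever it meets a backtick instead of relying on a single regexp for template literals.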
