Test with fuzzer #20

Open
pvdz opened this issue Aug 18, 2013 · 3 comments

@pvdz
Owner

pvdz commented Aug 18, 2013

Should write a fuzzer to try and poke holes. And because it's fun :)

@michaelficarra

I have a WIP fuzzer for JavaScript parsers that implement Mozilla's Reflect.parse API: https://github.com/michaelficarra/esfuzz

@pvdz
Owner Author

pvdz commented Jun 8, 2014

Here's a PoC token fuzzer that has already caught a few nice bugs. It's far from finished, of course.

It checks whether the browser groks the input (by compiling it in a Function) and whether ZeParser2 accepts it too. It still needs some tweaking. This is not the fuzzer I had in mind, only a small part of it.

  var tokens = [
    'identifier',
    '$ident',
    '_ident',
    'iden$fier',
    'ident_ier',
    '\\u0065foo',
    'foo\\u0065bar',
    '500',
    '.50',
    '.050',
    '0e5',
    '5e2',
    '.5e05',
    '"string"',
    "'string'",
    '/foo/',
    '/foo/g',

    '(',
    ')',
    '[',
    ']',
    '{',
    '}',
    '!',
    '!=',
    '!==',
    '%',
    '%=',
    '^',
    '^=',
    '&',
    '&&',
    '&=',
    '*',
    '*=',
    '|',
    '|=',
    '-',
    '--',
    '-=',
    '+',
    '++',
    '+=',
    '~',
    '~=',
    ':',
    ';',
    '<',
    '<=',
    '<<',
    '<<=',
    '>',
    '>=',
    '>>',
    '>>=',
    '>>>',
    '>>>=',
    ',',
    '.',
    '/',
    '/=',
    '\\',
    '=',
    '==',
    '===',
    'break',
    'do',
    'instanceof',
    'typeof',
    'case',
    'else',
    'new',
    'var',
    'catch',
    'finally',
    'return',
    'void',
    'continue',
    'for',
    'switch',
    'while',
    'debugger',
    'function',
    'this',
    'with',
    'default',
    'if',
    'throw',
    'delete',
    'in',
    'try',
    'class',
    'enum',
    'extends',
    'super',
    'const',
    'export',
    'import',
  ];
  var white = [' ', '\t', '\r', '\n', '\r\n', '\u2028', '\u2029'];

  function compiles(code) {
    var func = null;
    var err = '';
    try {
      func = Function(code);
    } catch(e) {
      // Chrome doesn't handle \u2028 and \u2029 very well, so retry with plain newlines
      if (code.indexOf('\u2029') >= 0 || code.indexOf('\u2028') >= 0) {
        var again = code.replace(/[\u2028\u2029]/g, '\n');
        try {
          func = Function(again);
        } catch (e) {
          err = e.toString();
        }
      } else {
        err = e.toString();
      }
    }
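    // Overrides for known oddities: force the browser verdict for these inputs.
    // Leading `-->>>`: flagged as a Chrome bug, so treat it as an error.
    if (/^\s*-->>>/.test(code)) err = 'chrome bug', func=null;
    // `const` declarations: always treat as an error.
    else if (/(^|\s)const\s/.test(code)) err = 'const', func=null;
    // A digit immediately followed by `instanceof`: always treat as accepted.
    else if (/\dinstanceof/.test(code)) func = 'digit instanceof', err='';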

    return err;
  }
  function parse(code) {
    var err = '';
    try {
      Par.parse(code, {
        functionMode: true, // we'll use `Function` to partially validate anyways
        regexNoClassEscape: false,
        saveTokens: true,
        strictAssignmentCheck: true,
        strictForInCheck: true,
      });
    } catch (e) {
      err = e.toString();
    }

    return err;
  }

  var total = 0;
  var n = 0;
  function repeat(){

    var list = [];
    for (var i = 0; i < 10; ++i) {
      var len = Math.floor(Math.random() * 10) + 1;
      var s = '';
      while (len--) {
        s += tokens[Math.floor(Math.random() * tokens.length)];
        while (Math.random() > 0.4) s += white[Math.floor(Math.random() * white.length)];
      }
      list.push(s);
    }

    document.body.innerHTML = '<pre>'+total+':\n'+(list.join('<hr>'))+'</pre>';

    setTimeout(function () {
      ++total;
      while (list.length) {
        var test = list.pop();
        var testEscaped = test.replace(/[^\w\d \u0020-\u00ff]/g, function(m){
          var x = m.charCodeAt(0).toString(16);
          while (x.length < 4) x = '0'+x;
          return '\\u'+x;
        });
        var testNewlined = test.replace(/[\u000a\u000d\u2028\u2029]/g, '\u21b5').replace(/[^\w\d \t\u0020-\u00ff\u21b5]/g, function(m){
          var x = m.charCodeAt(0).toString(16);
          while (x.length < 4) x = '0'+x;
          return '\\u'+x;
        });

        var browserError = compiles(test);
        var parserError = parse(test);

        if ((!browserError) !== (!parserError)) {
          console.log('browser:',!browserError, 'parser:', !parserError);
          console.log([testNewlined, browserError || parserError]);
          console.log([testEscaped]);
        }
      }

      if (total < 1000) repeat();
    }, 1);

  }

  repeat();
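
The escaping logic in the loop above appears twice with slightly different character sets. As a minimal sketch (the name `escapeForLog` is made up here and is not part of the original paste), the first variant could be pulled into a helper that turns everything outside the printable Latin-1 range into a `\uXXXX` escape:

```javascript
// Hypothetical helper, not from the original paste: makes invisible
// characters (newlines, tabs, \u2028, \u2029, ...) visible in log output.
function escapeForLog(str) {
  return str.replace(/[^\w\d \u0020-\u00ff]/g, function (m) {
    var hex = m.charCodeAt(0).toString(16);
    // Left-pad to four hex digits, as the original loop does.
    while (hex.length < 4) hex = '0' + hex;
    return '\\u' + hex;
  });
}
```

Something like `console.log([escapeForLog(test)])` would then replace the inline `testEscaped` computation.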

@pvdz
Owner Author

pvdz commented Jul 27, 2014

There's a fuzzer in the repo now that randomly combines tokens and tries to parse the result. It uses the browser to check whether the input should parse at all. Added some flags for specific parsers to circumvent certain oddities.
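
For readers skimming the thread, the core of that check can be sketched in a few lines. This is only an illustration of the idea, not the code in the repo: `engineAccepts` and `findDisagreement` are hypothetical names, and `parserAccepts` stands in for whichever parser is under test.

```javascript
// Hypothetical helper: true if the host engine compiles `code` as a function body.
function engineAccepts(code) {
  try {
    new Function(code);
    return true;
  } catch (e) {
    return false;
  }
}

// `parserAccepts` is an assumed callback: it should return true when the
// parser under test accepts `code`, false when it throws or rejects it.
function findDisagreement(code, parserAccepts) {
  var engine = engineAccepts(code);
  var parser = !!parserAccepts(code);
  // Only inputs where the two verdicts differ are interesting.
  return engine === parser ? null : { code: code, engine: engine, parser: parser };
}
```

A non-null result means either a parser bug or an engine oddity, which is where per-parser flags and overrides come in.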

This is only one part of what I wanted so I'm leaving this issue open. I also want a more semantic fuzzer that actually tries to construct valid code. I already pretty much know which combinatoric approach I wanna take, I just have to code the damn thing :p
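
To make "construct valid code" a bit more concrete, here is one common way to do it: expand a small grammar at random with a depth limit. This is a generic sketch and an assumption on my part, not the combinatoric approach mentioned above; the grammar, the rule names and `generate` are all made up for illustration.

```javascript
// Toy grammar. Each rule lists its productions; anything that is not a
// rule name is emitted verbatim. The first production of every rule
// eventually bottoms out in terminals, so it is used at the depth limit.
var grammar = {
  Program: [['Statement'], ['Statement', ' ', 'Program']],
  Statement: [
    ['var ', 'Ident', ' = ', 'Expr', ';'],
    ['Expr', ';'],
    ['if (', 'Expr', ') ', 'Statement']
  ],
  Expr: [['Ident'], ['Number'], ['Expr', ' + ', 'Expr'], ['(', 'Expr', ')'], ['typeof ', 'Expr']],
  Ident: [['foo'], ['$bar'], ['_baz']],
  Number: [['500'], ['.50'], ['5e2']]
};

// Expands `symbol` into source text. `random` is injectable (defaults to
// Math.random in practice) so runs can be made reproducible.
function generate(symbol, depth, random) {
  if (!Object.prototype.hasOwnProperty.call(grammar, symbol)) return symbol; // terminal
  var rules = grammar[symbol];
  var rule = depth <= 0 ? rules[0] : rules[Math.floor(random() * rules.length)];
  var out = '';
  for (var i = 0; i < rule.length; ++i) out += generate(rule[i], depth - 1, random);
  return out;
}
```

Calling `generate('Program', 6, Math.random)` yields strings that should always compile, so any parser rejection is a bug candidate rather than noise. The token fuzzer and this kind of generator complement each other: one probes what must be rejected, the other what must be accepted.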
