ruby_parser (RP) is a ruby parser written in pure ruby (utilizing racc–which does by default use a C extension). It outputs s-expressions which can be manipulated and converted back to ruby via the ruby2ruby gem.
As an example:
def conditional1 arg1 return 1 if arg1 == 0 return 0 end
becomes:
s(:defn, :conditional1, s(:args, :arg1), s(:if, s(:call, s(:lvar, :arg1), :==, s(:lit, 0)), s(:return, s(:lit, 1)), nil), s(:return, s(:lit, 0)))
Tested against 801,039 files from the latest of all rubygems (as of 2013-05):
-
1.8 parser is at 99.9739% accuracy, 3.651 sigma
-
1.9 parser is at 99.9940% accuracy, 4.013 sigma
-
2.0 parser is at 99.9939% accuracy, 4.008 sigma
-
2.6 parser is at 99.9972% accuracy, 4.191 sigma
-
3.0 parser has a 100% parse rate.
-
Tested against 2,672,412 unique ruby files across 167k gems.
-
As do all the others now, basically.
-
-
Pure ruby, no compiles.
-
Includes preceding comment data for defn/defs/class/module nodes!
-
Incredibly simple interface.
-
Output is 100% equivalent to ParseTree.
-
Can utilize PT’s SexpProcessor and UnifiedRuby for language processing.
-
-
Known Issue: Speed is now pretty good, but can always improve:
-
RP parses a corpus of 3702 files in 125s (avg 108 Kb/s)
-
MRI+PT parsed the same in 67.38s (avg 200.89 Kb/s)
-
-
Known Issue: Code is much better, but still has a long way to go.
-
Known Issue: Totally awesome.
-
Known Issue: line number values can be slightly off. Parsing LR sucks.
RubyParser.new.parse "1+1" # => s(:call, s(:lit, 1), :+, s(:lit, 1))
You can also use Ruby19Parser, Ruby18Parser, or RubyParser.for_current_ruby:
RubyParser.for_current_ruby.parse "1+1" # => s(:call, s(:lit, 1), :+, s(:lit, 1))
To add a new version:
-
New parser should be generated from lib/ruby_parser.yy.
-
Extend lib/ruby_parser.yy with new class name.
-
Add new version number to V2/V3 in Rakefile for rule creation.
-
Add new ‘ruby_parse “x.y.z”` line to Rakefile for rake compare (line ~300).
-
Require generated parser in lib/ruby_parser.rb.
-
Add new V## = ::Ruby##Parser; end to ruby_parser.rb (bottom of file).
-
Add empty TestRubyParserShared##Plus module and TestRubyParserV## to test/test_ruby_parser.rb.
-
Extend Manifest.txt with generated file names.
-
Add new version number to sexp_processor’s pt_testcase.rb in all_versions.
Until all of these are done, you won’t have a clean test run.
-
ruby. woot.
-
sexp_processor for Sexp and SexpProcessor classes, and testing.
-
racc full package for parser development (compiling .y to .rb).
-
sudo gem install ruby_parser
(The MIT License)
Copyright © Ryan Davis, seattle.rb
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the ‘Software’), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED ‘AS IS’, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.