-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Liquid::ParseTreeVisitor #1025
Liquid::ParseTreeVisitor #1025
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was initially thinking something more declarative, specifically the visitor pattern implemented similarly to graphql-ruby's visitor. Static analysis can be expressed very elegantly in such a form.
What is this needed for? |
@pushrax Wouldn't such a visitor still need a way to actually get at the nodes, however? Or would you suggest building one like that that knows about the instance variable names, etc? |
The example here uses a |
Yeah, we definitely need a way to get at the children. It doesn't necessarily need to be a public API though, and it could be better-defined. In the context of Perusing some types:
So, what is a node in this new world? I guess it's a sum type of all these classes and their members. Expression.parse can return strings, integers, ranges, floats, and VariableLookups, so these must all be node types as well. I suppose there is nothing intrinsically wrong with that, but it's hard to reason about what you'll get when you call Your choice to reject strings in the implementation of The code that depends on such a A refactor to create a proper Node class that is inherited globally and provides a consistent API is a potential way to alleviate some complexity. |
@pushrax How would you suggest making it non-public? Have the |
Hmm, so, probably having all the subnodes be readable anyway (via public attr_readers) is desirable -- I've run into a few I want in my code outside of the traversal. So maybe the right thing to do is expose all of these, and then the logic of which things are children can live in the visitor but we don't need to use Thoughts? |
That sounds quite reasonable to me. Having a Ruby-visible public |
cdede9c
to
21e3132
Compare
21e3132
to
a959cc3
Compare
Implemented, cleaned up, and added tests |
Not sure what's up with the policial failures... they don't match local rubocop run. Is policial not using the config from the repo? |
7f69097
to
aa5fd4c
Compare
Ok, it passes Policial here now. Local rubocop failures in files I didn't modify persist, but they seem to be present on master |
Yeah, the rubocop config has been broken in this repo for a few months since there hasn't been much active work. Will address that at some point. |
@pushrax Ok, cool. So if we're following policial I think this is good for review. |
aa5fd4c
to
b325070
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the bike shedding, I haven't been in the head space of this problem and haven't made my goals for this solution clear.
lib/liquid/traversal.rb
Outdated
# frozen_string_literal: true | ||
|
||
module Liquid | ||
class Traversal |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Visitor is a name that carries more useful connotation for me here. Is it the wrong name for a reason I'm missing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I felt like since this is just one possible instance of the visitor pattern, taking the name of the whole pattern for this case was over-broad. But happy to call it Visitor
if that makes more sense to you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AllTypesVisitor
? I'm basically looking for the ability to have a decent idea what this does without reading the implementation.
lib/liquid/traversal.rb
Outdated
module Liquid | ||
class Traversal | ||
def self.for(node, callbacks = Hash.new(proc {})) | ||
kase = CASES.find { |(klass, _)| node.is_a?(klass) }&.last |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CASES
isn't used for anything else, maybe it should be inlined to make the code clearer?
traversal_class = case node
when Liquid::Assign then Assign
when Liquid::Case then Case
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doing it as an inline case
expression fails policial because there are too many cases. Doing it with data also makes it easier to extend this to a registry-style model in the future (maybe for custom liquid tags to add to or similar?)
lib/liquid/traversal.rb
Outdated
def children | ||
@node.lookups | ||
end | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure this is better than colocating these definitions alongside their corresponding classes. Perhaps the convention could be:
# variable_lookup.rb
module Liquid
class VariableLookup
class Traversal < Liquid::Traversal
def children
@node.lookups
end
end
end
end
or do away with this abstraction altogether and put a children method on VariableLookup etc directly. If the latter, picking the method name is the hardest problem, given the rant about type consistency I posted earlier. Perhaps something to the effect of children_of_all_types
, maybe phrased more eloquently. Then the new public accessors can be removed as well. Reflecting back, I think my dislike of the method added in the original PR was:
- the method name was ambiguous
- I'd rather have consumers use the visitor pattern to access the data
Having the method at all wasn't the issue for me.
lib/liquid/traversal.rb
Outdated
@callbacks = callbacks | ||
end | ||
|
||
def callback_for(*classes, &block) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not super happy with this name, but can't think of any great alternatives. Some options I considered:
add_callback
add_callback_for
add_visitor
add_visitor_for
on_visit
[]=
WDYT?
7685e03
to
a83dbd3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A comment about why you think the "subclass Traversal and add missing getters" is better than "private-ish method" would be useful for posterity.
The last concern I have is that there isn't test coverage for the individual Traversal classes. Table tests are a concise way to implement tests for this sort of thing.
lib/liquid/traversal.rb
Outdated
# frozen_string_literal: true | ||
|
||
module Liquid | ||
class Traversal |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AllTypesVisitor
? I'm basically looking for the ability to have a decent idea what this does without reading the implementation.
They individual classes do have test coverage from the integration tests -- I did it that way because setting up the tree by parsing a Liquid snippet seemed much cleaner than doing manual object-unit setup. I chose the full Visitor pattern instead of a private-ish method for a couple reasons: (1) the idea of a private-ish method is very bothersome, nothing but docs to even suggest it is not meant to be used directly (2) no pollution of the core-LIquid namespaces with methods that are only used by a different module |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose the only non-hacky way to make sure the variables visited is equal to the variables that exist would be to force storage in an array, but that's a bit of a major refactor and adds a runtime cost. This is probably an issue we can live with.
@fw42 can you give this a review?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a bunch of small comments. I'm not entirely sure what the purpose and the intended API of this is exactly. Could you update the PR description so that that's more clear (especially to people outside of Shopify who can't read the Slack conversation that you linked)?
lib/liquid/parse_tree_visitor.rb
Outdated
|
||
def add_callback_for(*classes, &block) | ||
cb = block | ||
cb = ->(node, _) { block[node] } if block.arity.abs == 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is the abs
needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
when there are optional arguments, ruby returns a negative number for arity
lib/liquid/tags/for.rb
Outdated
attr_reader :collection_name | ||
attr_reader :variable_name | ||
attr_reader :collection_name, :variable_name, :limit, :from, | ||
:for_block, :else_block |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why did for_block
and else_block
need to be accessible? doesn't seem like you are using that anywhere
[ | ||
@node.template_name_expr, | ||
@node.variable_name_expr | ||
] + @node.attributes.values |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why only the values
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the keys in the hash are the name of attributes to be assigned, not expressions
class ParseTreeVisitorTest < Minitest::Test | ||
include Liquid | ||
|
||
def traversal(template) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpick: I think helpers like this are conventionally put at the end of the test and made private
ParseTreeVisitor | ||
.for(Template.parse(template).root) | ||
.add_callback_for(VariableLookup, &:name) | ||
.visit.flatten.compact |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why the visit.flatten.compact
here? by making this a part of the test, I might accidentally break the API that you wanted without your tests covering it. For example I could change the code to always add a nil
to the return value of every visit
and that would most likely break consumers of this new API, but none of your tests would cover that. Same for the flatten
(I could add some garbage array nesting and your tests wouldn't catch it).
I think you should add at least one or two tests or so to explicitly assert that the array structure is what you wanted. If you always want this to be flat and compact then maybe #visit
itself should do that (rather than the caller)?
module Liquid | ||
class ParseTreeVisitor | ||
def self.for(node, callbacks = Hash.new(proc {})) | ||
if defined?(node.class::ParseTreeVisitor) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like this subclassing pattern seems overkill. Why not just have each class define a children
method (then you also wouldn't have to make so many accessors public) and then only have one ParseTreeVisitor
class (rather than one subclass per tag)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussed above, using a full visitor pattern instead of a method that has to be public but that we don't want anyone to call seemed cleaner, and also allows us to keep traversal-related things out of the main namespace.
If you feel very strongly, I can change this back again.
This enables traversal over whole document tree.
19e9f0b
to
cf8dcaf
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks correct to me. Don't feel strongly about the extra class comment. Not sure why you marked my other comments as resolved even though you didn't respond to them nor change any of the code I commented on. I think those comments are still valid. But if you disagree then feel free to merge.
This puts all knowledge of the traversal in the same file, and removes the need for a CASES registry.
cf8dcaf
to
8217a8d
Compare
@fw42 Sorry about that -- somehow didn't push all my changes but thought I had. Made those other changes you requested now. |
Currently, there is no way to traverse the parse tree of a liquid document without using
instance_variable_get
and a lot of knowledge of library internals, which is quite brittle.This change enables writing code to traverse the tree for reasons other than rendering (such as static analysis).