Question on allowing any protocol #54

Closed
digitalmoksha opened this issue Jun 4, 2024 · 5 comments · Fixed by #55

Comments

@digitalmoksha
Contributor

I have a situation where any protocols are allowed, except a few.

I have a handler that checks every href and filters out bad protocols. I can do the same thing with Sanitize.
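For illustration, the kind of per-href check described above might look roughly like the sketch below. It is plain Ruby rather than an actual Selma or Sanitize handler, and the denylist contents and the helper names (`BLOCKED_SCHEMES`, `href_scheme`, `allowed_href?`) are assumptions, since the issue does not say which protocols are blocked.

```ruby
# Hypothetical denylist; the issue does not name the blocked protocols.
BLOCKED_SCHEMES = %w[javascript vbscript data].freeze

# Returns the lowercased scheme of an href, or nil for relative URLs.
def href_scheme(href)
  # Browsers ignore leading whitespace and embedded tabs/newlines in a
  # scheme, so remove them before matching.
  cleaned = href.to_s.gsub(/[\t\r\n]/, "").lstrip
  match = cleaned.match(/\A([a-zA-Z][a-zA-Z0-9+.\-]*):/)
  match && match[1].downcase
end

# True unless the href uses one of the blocked schemes.
def allowed_href?(href)
  !BLOCKED_SCHEMES.include?(href_scheme(href))
end
```

A handler would call something like `allowed_href?` on each link and drop the attribute when it returns false; everything not on the denylist, including relative URLs and custom schemes, passes through.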

However, I can't seem to convince Selma to not filter on protocols for the href. I've tried adding only an a element, etc. I tried each of the following with no luck.

html = '<a href="http://example.com">test</a>'
sanitizer = Selma::Sanitizer.new(Selma::Sanitizer::Config::DEFAULT)
sanitizer.allow_element(['a'])
sanitizer.allow_attribute('a', ['href'])
sanitizer.allow_protocol('a', 'href', [''])
sanitizer.allow_protocol('a', 'href', [nil])
sanitizer.allow_protocol('a', 'href', ['*'])

It always filters out the href, giving <a>test</a>.

Is this possible?

@gjtorikian
Owner

Actually...this functionality doesn't exist! 😓 However, I looked through the code, and if you want to propose a keyword you'd like, I can implement it pretty quickly.

Since protocols are nested within attributes, which are nested within elements, the minimum config for this would be:

sanitizer = Selma::Sanitizer.new({
  elements: ["a"],
  attributes: { "a" => ["href"] },
  protocols: { "a" => { "href" => [:all] } },
})

We already have :relative to allow any relative URL, so I thought :all might be a nice counterpart. Or it could be protocols: { "a" => { "href" => ["*"] } }. What do you think?
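To make the proposed semantics concrete, here is a minimal sketch of how a per-attribute allowlist with `:all` and `:relative` could be evaluated. It is a plain-Ruby model written for this thread, not Selma's implementation; the function name `protocol_allowed?` and the assumption that `:all` also covers relative URLs are hypothetical.

```ruby
# Hypothetical model of a per-attribute protocol allowlist; not Selma's code.
# `allowed` is the array configured for one element/attribute pair,
# e.g. ["http", "https"], [:relative], or [:all].
def protocol_allowed?(url, allowed)
  # Extract the scheme if present; nil means the URL is relative.
  scheme = url[/\A\s*([a-zA-Z][a-zA-Z0-9+.\-]*):/, 1]&.downcase

  # Proposed keyword: accept any URL (assumed to include relative ones).
  return true if allowed.include?(:all)
  # Existing keyword: relative URLs pass only when :relative is listed.
  return allowed.include?(:relative) if scheme.nil?

  allowed.include?(scheme)
end
```

Under this model, `protocol_allowed?("ftp://example.com", [:all])` is true, while the same URL checked against `["http", "https"]` is rejected.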

@digitalmoksha
Contributor Author

I like :all. But since the config (mostly) mirrors the Sanitize config, what about doing it the way Sanitize does: if no protocols are specified for an attribute, its protocol isn't filtered at all.

sanitizer = Selma::Sanitizer.new({
  elements: ["a", "img"],
  attributes: { "a" => ["href"], "img" => ["src"] },
  protocols: { "img" => { "src" => ["http", "https"] } },
})

So this would filter protocols on img tags, but not on a tags.

Though I also see the benefit in being explicit, so in that case I like :all
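The implicit rule described in this comment can be sketched as a simple lookup: an attribute with no entry under `protocols` yields `nil`, which is read as "do not filter". The helper names below (`configured_protocols`, `filters_protocols?`) are hypothetical and only model the idea; they are not part of Selma or Sanitize.

```ruby
# Config shaped like the example above.
CONFIG = {
  elements: ["a", "img"],
  attributes: { "a" => ["href"], "img" => ["src"] },
  protocols: { "img" => { "src" => ["http", "https"] } },
}.freeze

# Returns the protocol list for an element/attribute pair, or nil when
# nothing is configured for it.
def configured_protocols(config, element, attribute)
  config.fetch(:protocols, {}).dig(element, attribute)
end

# Under the implicit rule, protocols are filtered only when a list exists.
def filters_protocols?(config, element, attribute)
  !configured_protocols(config, element, attribute).nil?
end
```

The trade-off is the one raised in the thread: the implicit form needs less configuration, while an explicit `:all` makes the intent visible in the config itself.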

@gjtorikian
Owner

Yeah, I'm an advocate for explicitness.

Cool, I'll hack on adding :all shortly.

@digitalmoksha
Contributor Author

Hey thanks for this! I haven't had a chance to test it yet but I will, my week has gone sideways.

@digitalmoksha
Contributor Author

@gjtorikian worked like a charm, thanks! 🙇
