Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(path): correctly encode delimiters and relative paths #108

Merged
merged 17 commits into from
Apr 22, 2021
Merged
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
73 changes: 37 additions & 36 deletions lib/imgix/path.rb
Original file line number Diff line number Diff line change
Expand Up @@ -11,44 +11,12 @@ def initialize(prefix, secure_url_token, path = "/")
@prefix = prefix
@secure_url_token = secure_url_token
#TODO(luqven): should this method not be invoked on initialization?
@path = sanitize_path(path)
@path = path
@options = {}
end

# Escape and encode any characters in path that are reserved and not utf8 encoded.
# This includes " +?:#" characters. If a path is being used as a proxy, utf8
# encode everything. If it is not being used as proxy, leave certain chars, like
# "/", alone. Method assumes path is not already encoded.
def sanitize_path(path)
# remove the leading "/", we'll add it back after encoding
path = path.slice(1, path.length) if Regexp.new('^/') =~ path
# if path is being used as a proxy, encode the entire thing
if /^https?/ =~ path
return encode_URI_Component(path)
else
# otherwise, encode only specific characters
return encode_URI(path)
end
end

# URL encode the entire path
def encode_URI_Component(path)
return "/" + CGI.escape(path)
end

# URL encode every character in the path, including
# " +?:#" characters.
def encode_URI(path)
result = []
path.split("/").each do |str|
escaped_key = ERB::Util.url_encode(str)
result << escaped_key
end
result = "/" + result.join("/")
return result;
end

def to_url(opts = {})
@path = sanitize_path(@path)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 👍

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this cause multiple encoding issues if to_url is called multiple times on the same Path? Perhaps a different ivar name and ||= sanitize_path(@path) could be used instead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great suggestion 💯 , I'll try this out locally and push the change if all looks good.

prev_options = @options.dup
@options.merge!(opts)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not related to this PR, but it would be great if this state mutation could be removed by passing state in method calls, like you're doing with sanitized_path.


Expand Down Expand Up @@ -161,8 +129,41 @@ def method_missing(method, *args, &block)

private

def signature(current_path_and_params)
Digest::MD5.hexdigest(@secure_url_token + current_path_and_params)
# Escape and encode any characters in path that are reserved and not utf8 encoded.
# This includes " +?:#" characters. If a path is being used as a proxy, utf8
# encode everything. If it is not being used as proxy, leave certain chars, like
# "/", alone. Method assumes path is not already encoded.
def sanitize_path(path)
# remove the leading "/", we'll add it back after encoding
path = path.slice(1, path.length) if Regexp.new('^/') =~ path
# if path is being used as a proxy, encode the entire thing
if /^https?/ =~ path
return encode_URI_Component(path)
else
# otherwise, encode only specific characters
return encode_URI(path)
end
end
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the Variant PR, which shifts a lot of this escaping/encoding work to application boot, this would ideally be done in the Variant initializer: https://github.com/imgix/imgix-rb/pull/104/files#diff-f7f33d40dff4d8a40100ac8a68f7fab2e082a98a683b678c8ee04c1bc930102aR8

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haven't had a chance to dig into #104 too much, but loved what I saw so far. Will definitely keep this in mind.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the suggestion @stevehodgkiss. I could not find any encoding issues when to_url is called multiple times on the same Path, but I’m sure there will be edge cases that I haven’t thought of. Also, I think the ivar for sanitized_path creates a more predictable experience for the user.

I’ve implemented a solution based on your idea:

def to_url(opts = {})
sanitized_path ||= sanitize_path(@path)
prev_options = @options.dup
@options.merge!(opts)
current_path_and_params = path_and_params(sanitized_path)
url = @prefix + current_path_and_params
if @secure_url_token
url += (has_query? ? "&" : "?") + "s=#{signature(current_path_and_params)}"
end

Before these changes, @path was being modified in a way that was not explicit. There may be a case in the future where we want an @sanitized_path var or a santize_path! func, but for the purposes of this PR that is out of scope.

There are a lot of improvements needed in this file and the library throughout, but I felt these changes have to least potential for producing breaking changes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! I take back my previous comment about "this would ideally be done in the Variant initializer". This can't be done there since it's about the Path itself, not the Variant.


# URL encode the entire path
def encode_URI_Component(path)
return "/" + CGI.escape(path)
end

# URL encode every character in the path, including
# " +?:#" characters.
def encode_URI(path)
result = []
path.split("/").each do |str|
escaped_key = ERB::Util.url_encode(str)
result << escaped_key
end
result = "/" + result.join("/")
luqven marked this conversation as resolved.
Show resolved Hide resolved
return result;
end

def signature(path_and_params)
Digest::MD5.hexdigest(@secure_url_token + path_and_params)
end

def path_and_params
Expand Down