Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion source/common/http/http1/codec_impl.cc
Original file line number Diff line number Diff line change
Expand Up @@ -533,7 +533,7 @@ void ServerConnectionImpl::handlePath(HeaderMapImpl& headers, unsigned int metho
// forward the received Host field-value.
headers.insertHost().value(std::string(absolute_url.host_and_port()));

headers.insertPath().value(std::string(absolute_url.path()));
headers.insertPath().value(std::string(absolute_url.path_and_query_params()));
active_request_->request_url_.clear();
}

Expand Down
15 changes: 12 additions & 3 deletions source/common/http/utility.cc
Original file line number Diff line number Diff line change
Expand Up @@ -57,10 +57,19 @@ bool Utility::Url::initialize(absl::string_view absolute_url) {
// RFC allows the absolute-uri to not end in /, but the absolute path form
// must start with
if ((u.field_set & (1 << UF_PATH)) == (1 << UF_PATH) && u.field_data[UF_PATH].len > 0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be a bit easier to see the differences here if we split out length settings, e.g.

uint64_t path_len = u.field_data[UF_PATH].len;
if ([query param exists) {
path_len += 1 + u.field_data[UF_PATH].len
}
path_ = path_ = absl::string_view(absolute_url.data() + u.field_data[UF_PATH].off, path_len);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes it looks much better.

path_ = absl::string_view(absolute_url.data() + u.field_data[UF_PATH].off,
u.field_data[UF_PATH].len);
uint64_t path_len = u.field_data[UF_PATH].len;
if ((u.field_set & (1 << UF_QUERY)) == (1 << UF_QUERY) && u.field_data[UF_QUERY].len > 0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My one remaining question - is it possible for UF_QUERY to be set and u.field_data[UF_QUERY].len == 0?

If path length is 0 we substitute a hard-coded "/". If the +1 for length is due to the "?" being considered the delimiter I'd like to make sure that http://foo.com? would get proxied without losing the ?. Can you add a test for that corner case?

Copy link
Contributor Author

@lukidzi lukidzi Apr 4, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I investigated it a bit.
Http url parser works different for path / and ?<-query start :
e.g: http://foo.com/? it sets bit to 1 for UF_PATH but won't set UF_QUERY.
When there is query start sign(?) it will go to the next iteration and won't set UF_QUERY bit to 1. So it is impossible to get this information from parser. Same for http://foo.com?. When it gets to (?) sign and won't see another character after (?) it won't set UF_QUERY bit to 1.
I added more test cases and fixed issue when url looks like: http://foo.com?query=param

for url: http://foo.com/ -> UF_PATH bit is set and len = 1

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh nice, thanks for digging into this, and adding the extra corner case testing!

So while this is clearly an improvement over what we have, I'm still pretty uncomfortable with /? being normalized to / unless it's illegal by the spec.

I'd be fine

  1. if this is correct as defined by some corner of the URL spec (@PiotrSikora do you know this one?)
  2. landing this as is today to get most query params in this Envoy build, and leaving the bug open to sort out to retain the ? next week
  3. setting the host to "everything after the beginning of the path delimiter", though we're getting close enough to merge freeze that might lose us this landing in this release.

I'd lean towards 2. @mattklein123 thoughts/preferences?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(2) SGTM, agreed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool. Merged because I'd prefer the clearly-better-behavior to make next week's release.

Lukasz, thanks so much for tackling this. Bonus points if you'd be willing to poke through the spec or sync with @PiotrSikora on expected behavior for /? and tackle that in a follow-up :-)

path_len += 1 + u.field_data[UF_QUERY].len;
}
path_and_query_params_ =
absl::string_view(absolute_url.data() + u.field_data[UF_PATH].off, path_len);
} else if ((u.field_set & (1 << UF_QUERY)) == (1 << UF_QUERY) && u.field_data[UF_QUERY].len > 0) {
// Http parser skips question mark and starts count from first character after ?
// so we need to move left by one
path_and_query_params_ = absl::string_view(absolute_url.data() + u.field_data[UF_QUERY].off - 1,
u.field_data[UF_QUERY].len + 1);
} else {
path_ = absl::string_view(kDefaultPath, 1);
path_and_query_params_ = absl::string_view(kDefaultPath, 1);
}
return true;
}
Expand Down
6 changes: 3 additions & 3 deletions source/common/http/utility.h
Original file line number Diff line number Diff line change
Expand Up @@ -24,19 +24,19 @@ namespace Utility {

/**
* Given a fully qualified URL, splits the string_view provided into scheme,
* host and path components.
* host and path with query parameters components.
*/
class Url {
public:
bool initialize(absl::string_view absolute_url);
absl::string_view scheme() { return scheme_; }
absl::string_view host_and_port() { return host_and_port_; }
absl::string_view path() { return path_; }
absl::string_view path_and_query_params() { return path_and_query_params_; }

private:
absl::string_view scheme_;
absl::string_view host_and_port_;
absl::string_view path_;
absl::string_view path_and_query_params_;
};

/**
Expand Down
2 changes: 1 addition & 1 deletion source/common/router/router.cc
Original file line number Diff line number Diff line change
Expand Up @@ -77,7 +77,7 @@ bool convertRequestHeadersForInternalRedirect(Http::HeaderMap& downstream_header
// Replace the original host, scheme and path.
downstream_headers.insertScheme().value(std::string(absolute_url.scheme()));
downstream_headers.insertHost().value(std::string(absolute_url.host_and_port()));
downstream_headers.insertPath().value(std::string(absolute_url.path()));
downstream_headers.insertPath().value(std::string(absolute_url.path_and_query_params()));

return true;
}
Expand Down
33 changes: 32 additions & 1 deletion test/common/http/utility_test.cc
Original file line number Diff line number Diff line change
Expand Up @@ -754,7 +754,7 @@ void ValidateUrl(absl::string_view raw_url, absl::string_view expected_scheme,
ASSERT_TRUE(url.initialize(raw_url)) << "Failed to initialize " << raw_url;
EXPECT_EQ(url.scheme(), expected_scheme);
EXPECT_EQ(url.host_and_port(), expected_host_port);
EXPECT_EQ(url.path(), expected_path);
EXPECT_EQ(url.path_and_query_params(), expected_path);
}

TEST(Url, ParsingTest) {
Expand All @@ -766,12 +766,43 @@ TEST(Url, ParsingTest) {
ValidateUrl("http://www.host.com:80/", "http", "www.host.com:80", "/");
ValidateUrl("http://www.host.com/", "http", "www.host.com", "/");

// Test url with "?".
ValidateUrl("http://www.host.com:80/?", "http", "www.host.com:80", "/");
ValidateUrl("http://www.host.com/?", "http", "www.host.com", "/");

// Test url with "?" but without slash.
ValidateUrl("http://www.host.com:80?", "http", "www.host.com:80", "/");
ValidateUrl("http://www.host.com?", "http", "www.host.com", "/");

// Test url with multi-character path
ValidateUrl("http://www.host.com:80/path", "http", "www.host.com:80", "/path");
ValidateUrl("http://www.host.com/path", "http", "www.host.com", "/path");

// Test url with multi-character path and ? at the end
ValidateUrl("http://www.host.com:80/path?", "http", "www.host.com:80", "/path");
ValidateUrl("http://www.host.com/path?", "http", "www.host.com", "/path");

// Test https scheme
ValidateUrl("https://www.host.com", "https", "www.host.com", "/");

// Test url with query parameter
ValidateUrl("http://www.host.com:80/?query=param", "http", "www.host.com:80", "/?query=param");
ValidateUrl("http://www.host.com/?query=param", "http", "www.host.com", "/?query=param");

// Test url with query parameter but without slash
ValidateUrl("http://www.host.com:80?query=param", "http", "www.host.com:80", "?query=param");
ValidateUrl("http://www.host.com?query=param", "http", "www.host.com", "?query=param");

// Test url with multi-character path and query parameter
ValidateUrl("http://www.host.com:80/path?query=param", "http", "www.host.com:80",
"/path?query=param");
ValidateUrl("http://www.host.com/path?query=param", "http", "www.host.com", "/path?query=param");

// Test url with multi-character path and more than one query parameter
ValidateUrl("http://www.host.com:80/path?query=param&query2=param2", "http", "www.host.com:80",
"/path?query=param&query2=param2");
ValidateUrl("http://www.host.com/path?query=param&query2=param2", "http", "www.host.com",
"/path?query=param&query2=param2");
}

} // namespace Http
Expand Down