Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add headers policy #497

Merged
merged 2 commits into from
Nov 27, 2017
Merged

Add headers policy #497

merged 2 commits into from
Nov 27, 2017

Conversation

davidor
Copy link
Contributor

@davidor davidor commented Nov 22, 2017

Closes #490

end

local function add_resp_headers(headers)
for header, value in pairs(headers) do
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pairs can't be JITed. So we should either use Luafun or make it a numerically indexed table.
I know this is pain, but that is a limitation of LuaJIT.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather evaluate Luafun later.
Do you see any other option apart from having two separate arrays in the config? Something like this:

          {
            "name": "apicast.policy.headers",
            "configuration":
              {
                "response":
                  {
                    "headers_to_add": { "Header1", "Header2" }
                    "values_of_headers_to_add": { "value1", "value2" }
                  }
              }
          }

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking about the configuration a bit and it really depends how the UI will look like.
But something like this seemed flexible:

 {
            "name": "apicast.policy.headers",
            "configuration":
              {
                "response":
                  [
                    { "name": "Header1", "op": "set", "value": "foobar" },
                    { "name": "Header1", "op": "add", "value": "foobar1" },
                    { "name": "Header2", "op": "remove" },
                  ]
              }
          }

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That could work 👍
Also, I see "add" there. What would that do? Add only if the header has not been set already?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@davidor add is the array operation (in my mind). Because all headers that have , are technically arrays and can be sent as individual headers:

Set-Cookie: foo=bar, one=two

Is the same as:

Set-Cookie: foo=bar
Set-Cookie: one=two

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How would 'set' work in the case of repeated headers like your cookie example?
Blows away the entire array of values and sets a new value (or an array of values if that is allowed?).

I assume 'remove' removes all values if the header was an array?

Is it possible to 'add' an array of values?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@andrewdavidmackenzie
If I understood correctly:

  1. Yes. I think that to be coherent with @mikz definition, we should delete the exiting array of values and initialize it with the value passed to the set op.
  2. Yes.
  3. That's something we need to define.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I'll try to spec it out in more detail.

set

Ability to override any value (including non existing) to a specific value. Removes all previous values and sets the one desired.

{ "op": "set", "name": "X-Header", "value": "foo" }

Examples:

GET /
X-Header: bar, daz

will be transformed to:

GET /
X-Header: foo

Request:

GET /
X-Foo: sometihing

will be transformed to:

GET /
X-Foo: something
X-Header: foo

remove

Can be implemented as set with value of empty string. Nginx will not send out empty headers.

{ "op": "set", "name": "X-Header", "value": "" }

add

{ "op": "add", "name": "X-Header", "value": "foo" }

Will append , value to the existing value of the header or set a new value if there is no previous value. Duplicates allowed.

request

GET /

transformed to

GET / 
X-Header: foo

request

GET /
X-Header: foo

transformed to

GET
X-Header: foo, foo

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perfect 👌

There's only one question left. @andrewdavidmackenzie wrote it above. Should we allow:

{ "op": "add", "name": "X-Header", "value": "abc, def" }

To transform the request

GET /
X-Header: foo

Into:

GET /
X-Header: foo, abc, def

?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@davidor I think that is fine and within the specs. There is no escaping of ,. Less rules - better :)

@davidor davidor changed the title Add headers policy [WIP] Add headers policy Nov 23, 2017
@davidor davidor changed the title [WIP] Add headers policy Add headers policy Nov 23, 2017
@davidor
Copy link
Contributor Author

davidor commented Nov 23, 2017

Ready for review @mikz

end

local function add_request_header(header, value)
local current_value = ngx.req.get_headers()[header]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should check the performance of this. get_headers will read all the headers. That might not be needed.
You can access individual headers as ngx var http_header_name (ngx.var.http_x_request_id for example).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. Just did a little experiment.

For a request:

GET /
Foo: bar
Foo: foo

ngx.req.get_headers() returns: foo = { "bar", "foo" }
ngx.var.http_foo returns "bar"

That is very unfortunate.

If we need to use get_headers then it should be done once and not in a loop.

Copy link
Contributor

@mikz mikz Nov 24, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From openresty/lua-nginx-module#161 (comment) it looks like both the http_HEADER var and get_headers should have same performance, so better to use get_headers once.

edit: from the nginx source it looks like that reading headers once and then using a table will be faster !
http://lxr.nginx.org/source/src/http/ngx_http_variables.c#0953

Because both have to go through the array to find the key. I guess we could do some performance optimizations to read the headers once. There is some code that is using $http_HEADER and could benefit from reading the headers just once. But that can wait.

local function set_resp_header(header, value)
ngx.header[header] = nil

if value and value ~= "" then
Copy link
Contributor

@mikz mikz Nov 24, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the condition? We want to set empty string / nil to remove the header no?

edit:

so this should work:

ngx.header[name] = value


local function new_header_value(current_value, value)
if current_value then
return current_value .. ', ' .. value
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I read https://tools.ietf.org/html/rfc7230#section-3.2.2 and it looks like this is more complex than I though because there is an exception: set-cookie header.

Also this is different on request/response. Request can't have duplicated headers and they have to be sent in one field with commas, response can (and they can be concatenated with comma with some exception).

Also the A.2 of describes Cookie in the request has the same issue. https://arxiv.org/pdf/cs/0105018.pdf?

As it is concatenated with semicolon instead of a comma.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A sender MUST NOT generate multiple header fields with the same field
name in a message unless either the entire field value for that
header field is defined as a comma-separated list [i.e., #(values)]
or the header field is a well-known exception (as noted below).

A recipient MAY combine multiple header fields with the same field
name into one "field-name: field-value" pair, without changing the
semantics of the message, by appending each subsequent field value to
the combined field value in order, separated by a comma. The order
in which header fields with the same field name are received is
therefore significant to the interpretation of the combined field
value; a proxy MUST NOT change the order of these field values when
forwarding a message.

A "sender" is the one sending the message (in HTTP request that would be the client, in HTTP response the server). A "recipient" is the party reading ht HTTP message (server in the request, client in the response).

So my understanding is that the message header can be split into several headers with the same name if the value is a comma separated list and it is not on a blacklist (Cookie, Set-Cookie, ...).

So I think it should be safe to "add" as a second header with the same name.
Because it is allowed for the sender to send it split as several headers if the header is array-like. We can't maintain a list of array-like headers but we can tell users to figure it out.

So I think we would be following the RFC if we would "add" a header as:

GET /
Header: val
Header: other

And then the recipient can concatenate them with comma. And we leave the blacklist on the user. If they do this operation on some unsupported header (like Host) then it is their problem.

]
}
--- backend
location /transactions/authrep.xml {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to have this in each policy test? I'd say it might be too much and actually testing the apicast policy.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking the same. Would be good to have a way of declaring a 'default backend' so we don't copy paste the same thing over and over :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that this is too verbose, but at the same time, I think it's important to check that adding this policy does not break the normal auth flow.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but we could verify it with the phase logger policy? Just add it after and see it works.

And we can provide default backend in the blackbox test base. If no backend is defined we can add there whatever.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather address this in a separate PR, as this is a problem that we have in multiple test files.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd love get rid of those. Or at least have it in just one test. Because this is making plenty of noise that is not helping the test readability.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@davidor missed your previous comment. Yes we can address it in different PR. 👍

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also when dealing with this we could consider making api_backend and backend endpoint optional - that could disable the call and result in smaller more focused configuration file.
I guess when writing these files by hand we want them to be as small and focused as possible - as our tests.

@davidor davidor force-pushed the headers-policy branch 3 times, most recently from 8c69629 to 3a23c8e Compare November 26, 2017 17:35
@davidor
Copy link
Contributor Author

davidor commented Nov 27, 2017

@mikz I made some changes as we agreed based on https://tools.ietf.org/html/rfc7230#section-3.2.2

Now the 'add' operation adds a new header with the same name if it already exists, instead of concatenating values separated by commas in the value of the existing header.

I also added a 'push' operation.

-- 2) When the header is set, replaces its value with the given one.
-- 3) Deletes a header when the value is "".
-- The push operation:
-- 1) When the header is not set, it does nothing.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought push would always create new header. Thats why it is push no ? Like the array operation (and assumption that no header is an empty array).

-- 2) header
-- 3) value
-- The add operation:
-- 1) When the header is not set, creates it with the given value.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought the add would be set only when the header is missing. Like a default value.
Like the shdict operation add https://github.com/openresty/lua-nginx-module/blob/master/README.markdown#ngxshareddictadd

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah. So there has been a misunderstanding here. I got 'push' and 'add' the opposite way.
The way you describe is coherent with what openresty does, so I think it would be good to follow that.

end

local function add_resp_header(header, value)
local new_value = new_header_value(ngx.header[header], value)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd consider renaming header to name or header_name. Because it looks like there is too much use of header (because of the ngx.header).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea 👍


if ngx.req.get_headers()['New-Header'] ~= 'config_value_nh' then
ngx.log(ngx.ERR, "The header 'New-Header' does not have the correct value")
elseif existing_header_values[1] ~= 'request_value_eh1' or
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should use use assert.same here ?

It should be easy as assert = require("luassert").

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. Good idea 👍

}
}
],
"api_backend": "http://test:$TEST_NGINX_SERVER_PORT/",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be nice to reorganise the configuration in a way the most useful options fit the screen with the upstream/request definition. So the backend block could be gone (or moved to the top), the "api_backend" could be on the top of the config and on the bottom would be the policy_chain configuration. The next would be upstream block and the request+response. That should fit one screen so you could read the test and understand all important parts without scrolling...
Just my 2ç.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point 👍


for _, command in ipairs(commands) do
local command_func = command_functions[header_type][command.op]
command_func(command.header, command.value, req_headers)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this is passing the request headers to the response commands too, right?
That might be confusing a bit.
What about calling it cache and pass it from the outside (rewrite handler)?

Copy link
Contributor Author

@davidor davidor Nov 27, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. This is passing req headers to response commands too.
Not sure I follow. what would you call cache?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cache is a bad name. Probably just better to use ... varargs and pass the get_headers() from the rewrite function.

@davidor davidor force-pushed the headers-policy branch 2 times, most recently from 623abfe to e80c59d Compare November 27, 2017 14:42
Copy link
Contributor

@mikz mikz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants