Skip to content

Documented a common integration tests problem#12933

Merged
alyssawilk merged 3 commits intoenvoyproxy:masterfrom
yosrym93:patch-3
Sep 9, 2020
Merged

Documented a common integration tests problem#12933
alyssawilk merged 3 commits intoenvoyproxy:masterfrom
yosrym93:patch-3

Conversation

@yosrym93
Copy link
Contributor

@yosrym93 yosrym93 commented Sep 1, 2020

Commit Message:
Documented a common integration tests problem when the response content-length header does not match the actual body length.
Signed-off-by: Yosry Ahmed yosryahmed@google.com

Risk Level: N/A
Testing: N/A
Docs Changes: N/A
Release Notes: N/A

…s not match the actual body length

Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
Copy link
Contributor

@alyssawilk alyssawilk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cool, thanks for adding more debugging breadcrumbs!

In addition to the existing test framework, which allows for carefully timed interaction and ordering of events between downstream, Envoy, and Upstream, there is now an “autonomous” framework which simplifies the common case where the timing is not essential (or bidirectional streaming is desired). When AutonomousUpstream is used, by setting `autonomous_upstream_ = true` before `initialize()`, upstream will by default create AutonomousHttpConnections for each incoming connection and AutonomousStreams for each incoming stream. By default, the streams will respond to each complete request with “200 OK” and 10 bytes of payload, but this behavior can be altered by setting various request headers, as documented in [`autonomous_upstream.h`](autonomous_upstream.h)

## Common Problems
- If a response body length does not match the `content-length` header (or there is no header at all), any mock calls to wait for the response completion such as `sendRequestAndWaitForResponse` or `response_->waitForEndStream` could timeout. Also, any asserts that the response was completed such as `EXPECT_TRUE(response_->complete())` would fail. Make sure that the response body length matches the `content-length` header.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this could use a tweak - most requests are "sent" without a content length header, but helper functions add them, for example sendRequestAndWaitForResponse will send an end stream bit, indicating if the response is done (and sometimes causing content length to be added)

I think if you remove "(or there is no header at all)" it'd be more accurate.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also timeout -> time out

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I faced this problem when I was sending a request using makeHeaderOnlyRequest and encoding the response using upstream_request_->encodeXXX(). Even when I set end_stream, for e.g. by using upstream_request_->encodeData(_, true), the call for response->waitForEndStream times out. This happened because I did not include a content-length to the encoded headers at all. That's why I added the "no header at all" part.

I guess whether the content-length header is added or not depends on the protocol, so it should not be something we depend on. I know this because when I was missing the content-length header, the integration test passed for some protocols and timed out for others.

I'd say its better for whoever is writing the test to add a correct content-length header, even if some functions/protocols would implicitly add it anyway. That's why I also chose the wording "it could time out" rather than "it will time out".

Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
@alyssawilk
Copy link
Contributor

alyssawilk commented Sep 3, 2020 via email

@yosrym93
Copy link
Contributor Author

yosrym93 commented Sep 3, 2020

Can you share the code of your previously failing test with me?

In #12832 that I just submitted recently, the test I added to protocol_integration_test here:
https://github.com/envoyproxy/envoy/pull/12832/files#diff-1d3e9c7609566ad39972e60840c4d384

In the continue-headers-only-inject-body-filter code here:
https://github.com/envoyproxy/envoy/pull/12832/files#diff-50d09ef0a52a4f57d86897574f214739

If you comment line 32 (headers.setContentLength(body_.length()) in encodeHeaders), this will cause a time out.

@yosrym93
Copy link
Contributor Author

yosrym93 commented Sep 3, 2020

Interestingly, some protocol combinations pass, some fail, and some timeout. Also in some combinations the content-length header is non-existent, while in others a value of 0 is added. Yet there's no direct correlation between this and the time out. For example:
IPv4_HttpDownstream_HttpUpstream puts a content-length value of 0 but it doesn't time out, it fails with a body mismatch.
IPv6_Http2Downstream_HttpUpstream also puts a content-length value of 0, but it does time out.

But I noticed that all the cases where no content-length header is added at all (null) pass. On the other hand, cases where the the content-length is set to 0 (which I guess is a default behavior somewhere) either fail or time out. Maybe this is the problem? The value of 0 that gets set by default in some protocol combinations?

@alyssawilk
Copy link
Contributor

alyssawilk commented Sep 3, 2020 via email

@yosrym93
Copy link
Contributor Author

yosrym93 commented Sep 3, 2020

I tried it before with addEncodedData, same outcome. So I don't think it's about the method the filters uses to add the body.

  1. It seems like something that's happening beyond the HCM and the filter manager, maybe in the codec? I don't know. I have no familiarity whatsoever with code beyond the filter manager and the HCM. The outcome change based on the protocols does suggest that it has something to do with the codecs though.
  2. I agree. If this is not a common failure mode it should not be documented as a common problem.

Unfortunately I am not able to investigate this. I can open an issue to report the bug, and then someone else might pick it up later. What do you think?

@alyssawilk
Copy link
Contributor

Still a bit confused - if your "inject body" test only works with a content length up front, how are you going to combine chunks when you may not know the ultimate size? I think you have to solve that problem before you can land your caching feature.

Either way I'm going to still say that given I've shown that the integration test framework (client and server) work fine without content length, saying you need to have content length seems misleading and I would not like to add that. I'm happy to call in another maintainer for a second opinion if you feel strongly it should be left in!

@alyssawilk
Copy link
Contributor

Ah, per your earlier example

If you comment line 32 (headers.setContentLength(body_.length()) in encodeHeaders), this will cause a time out.

If you comment out the line it causes problems, but if you replace with
headers.removeContentLength()
it is fine.

My guess is the HCM was setting the content length to 0 in the utility functions, when there was no body
https://github.com/envoyproxy/envoy/blob/master/source/common/http/conn_manager_utility.cc#L93
https://github.com/envoyproxy/envoy/blob/master/source/common/http/conn_manager_utility.cc#L402

So if you didn't clear the content length explicitly, you accidentally ended up with the wrong content length. So again I think the part of your change discussing incorrect content lengths is worth adding, but in this case setting a content length wasn't necessary either.

@yosrym93
Copy link
Contributor Author

yosrym93 commented Sep 8, 2020

Still a bit confused - if your "inject body" test only works with a content length up front, how are you going to combine chunks when you may not know the ultimate size? I think you have to solve that problem before you can land your caching feature.

The CacheFilter knows the length of the total body up front and sets the content length header accordingly in encodeHeaders.

So if you didn't clear the content length explicitly, you accidentally ended up with the wrong content length. So again I think the part of your change discussing incorrect content lengths is worth adding, but in this case setting a content length wasn't necessary either.

I will modify the added docs to only mention incorrect content lengths.

Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
@yosrym93
Copy link
Contributor Author

yosrym93 commented Sep 8, 2020

@alyssawilk looking again at mutateResponseHeaders utility at:
https://github.com/envoyproxy/envoy/blob/master/source/common/http/conn_manager_utility.cc#L402

It seems like if there's no Content-Length header, the HCM will set it to 0. Isn't this ultimately equivalent to having an incorrect Content-Length? (unless the response has no body ofc).

Maybe I should modify the wording to something like "If the response has a body, and its Content-Length header does not match the body length or does not exist at all, ... ". What do you think?

@alyssawilk
Copy link
Contributor

Ugh I'm sorry, that was a bad link.
That line is force-setting content length 0 on the upgrade path, not in general
The place we force-set to zero that's relevant to your PR is this
https://github.com/envoyproxy/envoy/blob/master/source/common/http/http1/codec_impl.cc#L165

Where we set it to 0 for header-only requests (which the test client was originally sending, before your filter altered things)

Copy link
Contributor

@alyssawilk alyssawilk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, pending CI churn - thanks for improving our docs!

@yosrym93
Copy link
Contributor Author

yosrym93 commented Sep 8, 2020

Where we set it to 0 for header-only requests (which the test client was originally sending, before your filter altered things)

Makes sense, thanks!

The place we force-set to zero that's relevant to your PR is this
https://github.com/envoyproxy/envoy/blob/master/source/common/http/http1/codec_impl.cc#L165

I'm curious why the content-length is set to 0 in that path though. The filter returns ContinueAndDontEndStream, which should change end_stream to false. I will look into it.

@yosrym93
Copy link
Contributor Author

yosrym93 commented Sep 8, 2020

The place we force-set to zero that's relevant to your PR is this
https://github.com/envoyproxy/envoy/blob/master/source/common/http/http1/codec_impl.cc#L165

I tried removing the setContentLength from the test filter and putting an assert in that path. As I suspected, end_stream is false, so it seems like this is not what causes the 0 content-length. Line 165 was never hit.

Regardless, I think it should be documented somewhere that if a filter adds a body to headers-only response, either through ContineAndDontEndStream and inject.. or through addEncodedData in encodeHeaders (or any other way), it should set the content-length header accordingly. Do you agree? If so, where do you think this should be documented?

@alyssawilk alyssawilk merged commit f02faee into envoyproxy:master Sep 9, 2020
@yosrym93 yosrym93 deleted the patch-3 branch September 9, 2020 17:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants