Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OTLP HTTP exporter errors are never reported #1578

Closed
mdf-observe opened this issue Aug 22, 2022 · 9 comments · Fixed by #1581
Closed

OTLP HTTP exporter errors are never reported #1578

mdf-observe opened this issue Aug 22, 2022 · 9 comments · Fixed by #1581
Assignees
Labels
area:exporter:otlp OpenTelemetry Protocol (OTLP) Exporter bug Something isn't working

Comments

@mdf-observe
Copy link

Describe your environment
nothing relevant

Steps to reproduce
Configure the http exporter with an URL that will parse correctly but will report errors, e.g. 404 because of an invalid export path

What is the expected behavior?
I expected to see something in the log (at least DEBUG) about a failure to export

What is the actual behavior?
A log line:
[OTLP HTTP Client] DEBUG: Export 1 trace span(s) success

Additional context
n/a

I'm not sure if this should be considered a bug report or a feature request. But it's very surprising to me that a misconfigured OtlpHttpClientOptions.url can result in no errors. In my case I had the right http address but an incorrect path, and the collector was replying with a 404. This error showed up nowhere.

I can see that the otlp_http_client.cc's ResponseHandler::OnResponse does see the 404, but without console_debug_ set this isn't even logged. The OnResponse function returns void so it doesn't seem to have a way to report this error to the calling framework.

I expected at least an OTEL_INTERNAL_LOG_DEBUG from OnRespnse() about the error, but ideally it would actually return a status to the framework that was propagated to the return value from Export().

@mdf-observe mdf-observe added the bug Something isn't working label Aug 22, 2022
@mdf-observe
Copy link
Author

mdf-observe commented Aug 22, 2022

I think maybe this patch is about correct to fix the issue?

--- a/exporters/otlp/src/otlp_http_client.cc	2022-08-22 17:46:51.146707254 +0000
+++ b/exporters/otlp/src/otlp_http_client.cc	2022-08-22 17:49:53.642246980 +0000
@@ -85,12 +85,20 @@
    */
   void OnResponse(http_client::Response &response) noexcept override
   {
+    sdk::common::ExportResult result = sdk::common::ExportResult::kSuccess;
+
     // Lock the private members so they can't be read while being modified
     {
       std::unique_lock<std::mutex> lk(mutex_);
 
       // Store the body of the request
       body_ = std::string(response.GetBody().begin(), response.GetBody().end());
+      auto code = response.GetStatusCode();
+      if (code != 200 && code != 202) {
+        // This may be a lie; depending on the result this could be retryable; e.g. the
+        // submission target is restarting.
+        result = sdk::common::ExportResult::kFailure;
+        OTEL_INTERNAL_LOG_ERROR("[OTLP HTTP Client] ERROR: Export failed: status=" << code
+                                << " body=" << body_);
+      }
 
       if (console_debug_)
       {
@@ -110,7 +120,7 @@
       bool expected = false;
       if (stopping_.compare_exchange_strong(expected, true, std::memory_order_release))
       {
-        Unbind(sdk::common::ExportResult::kSuccess);
+        Unbind(result);
       }
     }
   }

@lalitb
Copy link
Member

lalitb commented Aug 22, 2022

Good catch. Yes, the status code handling seems to be missing. The patch look good to log the error. As of now, we don't report the status code to the calling framework (i.e, HTTP exporter), which probably needs to be changed when we decide to start supporting retrying failed requests (say in the case the status code is 500, 409 or 428). cc @owent

Also, do you plan to contribute this patch :)

@ThomsonTan
Copy link
Contributor

Yes, patch is welcome:) Should all 2xx be considered as success?

@mdf-observe
Copy link
Author

I wasn't sure how to handle non-200 success codes. The OTEL spec says that a collector must return 200. So I think it's up to the project to decide whether to be permissive or strictly conforming.

@lalitb
Copy link
Member

lalitb commented Aug 22, 2022

I wasn't sure how to handle non-200 success codes. The OTEL spec says that a collector must return 200.

I think we can keep 200 as a success in that case. That's what other languages - java, js and go - are doing it too :)

@lalitb lalitb added the area:exporter:otlp OpenTelemetry Protocol (OTLP) Exporter label Aug 23, 2022
@owent
Copy link
Member

owent commented Aug 23, 2022

Yes, patch is welcome:) Should all 2xx be considered as success?

Agree, All 2xx are successful responses of HTTP.
@mdf-observe Could you please raise a PR to apply this patch?Or I can raise it if you don't mind.

@lalitb
Copy link
Member

lalitb commented Aug 23, 2022

Would be good to fix it for elastic search log exporter too:

void OnResponse(http_client::Response &response) noexcept override

Zipkin exporter uses Sync HTTP Client, and it is already having the check for HTTP status codes 200 and 202, so should be good.

@mdf-observe
Copy link
Author

@owent please feel free. I am not very familiar with how to do these things on github.com.

@owent
Copy link
Member

owent commented Aug 24, 2022

Sorry, I've had a lot going on recently, and I will raise a PR tomorrow.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:exporter:otlp OpenTelemetry Protocol (OTLP) Exporter bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants