Skip to content
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 1 addition & 2 deletions src/sources/http_client/client.rs
Original file line number Diff line number Diff line change
Expand Up @@ -307,8 +307,7 @@ impl http_client::HttpClientContext for HttpClientContext {
fn on_response(&mut self, _url: &Uri, _header: &Parts, body: &Bytes) -> Option<Vec<Event>> {
// get the body into a byte array
let mut buf = BytesMut::new();
let body = String::from_utf8_lossy(body);
buf.extend_from_slice(body.as_bytes());
buf.extend_from_slice(body);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this changing the behavior for all of the other codecs? They previously always received the lossy conversion, now only json does.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you are correct. I took a quick scan over the existing deserializers and Syslog, NativeJson, and Gelf will no longer apply a lossy conversion, which is undesirable. Conversely, lossy conversion is no longer applied to the input of Bytes and Native which is desired.

Do you have any suggestions for a way forward? We could add the conversion into the Deserializers itself, similar to what we did for the json deserializer.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we do something along the lines of:

match self.decoder.deserializer {
  // apply lossy conversion to Syslog, NativeJson, Gelf
}

?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That does seem like the easiest way to slide this change in without additional updates to other codecs. Longer term it does seem appropriate to standardize more on the lossy behavior, but I don't know what the priority/capacity is for that.

Copy link
Collaborator

@jszwedko jszwedko Jun 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, yeah, I neglected that other codecs also need UTF-8 aside from the json codec when thinking about this issue. I think ideally all codecs than need UTF-8 would grow the new lossy option. I'd support doing that now since it doesn't seem like a huge lift based on #17628, but, if it is, I'd support the workaround mentioned by Kyle.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😄 sorry for making more work for you @dsmith3197

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No worries, I'll go ahead with adding the lossy option to the relevant codecs 😄

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has now been resolved thanks to #17688.


let events = self.decode_events(&mut buf);

Expand Down