Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

http_in returns internal server error if a field "time" is provided as a string #3036

Closed
dobesv opened this issue Jun 10, 2020 · 13 comments
Closed
Labels

Comments

@dobesv
Copy link
Contributor

dobesv commented Jun 10, 2020

Describe the bug

When submitting logs to the http input, if the logs contain a time field that isn't a floating point seconds since the epoch, a status 500 is returned. I think it should return a status 400 instead, if the data is not valid.

Further, there's nothing mentioned in the documentation that the field time inside a log record has any special significance or constraint upon it. time is only documented as a query parameter.

To Reproduce

Just for example:

curl 'https://fluentd-hostname/web' \
  -H 'content-type: application/x-www-form-urlencoded' \
  -H 'accept: */*' \
  -H 'accept-language: en-GB,en-US;q=0.9,en;q=0.8' \
  --data-raw 'json=%5B%7B%22anonymousId%22%3A%225eb1b213517fe50000ea9b52%22%2C%22name%22%3A%22formative%22%2C%22level%22%3A40%2C%22levelName%22%3A%22warn%22%2C%22msg%22%3A%22test%22%2C%22time%22%3A%222020-06-10T17%3A30%3A29.802Z%22%2C%22timestamp%22%3A%222020-06-10T17%3A30%3A29.804Z%22%7D%5D'

This causes an internal server error parsing the "time" field. The error message is returned to the client with status 500 "internal server error".

Furthermore, observe that the documentation does not mention the special nature of the time field in a log record.

Expected behavior

A status 400 should be returned to indicate it was supposedly the client that made the mistake.

Additionally, the documentation should describe the usage and restrictions of the time field of log records submitted to the http input.

Your Environment

fluentd (1.11.0) running in kubernetes using the docker debian image.

@mlasevich
Copy link

see #2591

There is no problem in the submission of the event, your problem is likely with the output instead.

The issue is likely with output buffering, it seems to confuse it when you specify buffering by time, as it does not know which "time" you mean ("time" means both event time and record field "time")

However, if "time" is actual time of the transaction, you can you <extract> to convert it into event timestamp and remove it from the record - problem solved. If time is another field, you can use a <filter> between input and output to rename the field

@dobesv
Copy link
Contributor Author

dobesv commented Jun 18, 2020

@mlasevich I investigated the issue a bit and did confirm the 500 error came from http_in itself.

The other issue would probably result in http_in accepting the log without complaint, then never writing it to the output.

@mlasevich
Copy link

mlasevich commented Jun 19, 2020

Ahh, that seems to be something specific to in_http - as per documentation here time is a special parameter that allows you to set the time - I guess it needs to be in some specific format. Seems silly as there is now a better way for plugins to accomplish this via <extract> directive. If you intend to actually provide your own time, provide it in the expected format - but if you want this field to just be part of the record, I strongly advise renaming the field since it will cause you problems elsewhere as well.

@mlasevich
Copy link

Also, check your parser config, but default parser_json uses "time" as time_key (https://docs.fluentd.org/parser/json)

@dobesv
Copy link
Contributor Author

dobesv commented Jun 19, 2020

@mlasevich I've already fixed this myself, but I opened the issue for the benefit of others.

My request here are (1) that it return error 400 with an appropriate message and (2) this feature / limitation is actually mentioned in the documentation.

Even just (2) might be enough to save others the time I spent debugging.

@mlasevich
Copy link

While the docs leave much to be desired, but I am not sure they would understand what you are asking for. FWIW, still do not

@dobesv
Copy link
Contributor Author

dobesv commented Jun 19, 2020

For docs, I'm thinking something like this:

fluent/fluentd-docs-gitbook#187

@dobesv
Copy link
Contributor Author

dobesv commented Jun 19, 2020

Although the change noted above might obsolete my docs suggestion as it adds a whole new feature for parsing the time.

@mlasevich
Copy link

mlasevich commented Jun 19, 2020

Keep in mind, if I understand it correctly(and it is a significant if, to be fair), the above change should affect ONLY the query parameter, not field in data/json (you should already be able to control format for that one).

@dobesv
Copy link
Contributor Author

dobesv commented Jun 19, 2020

Based on e.g. https://github.com/fluent/fluentd/pull/3046/files#diff-9c1e429b7574065cdb9677d43880606dR208 I would say that it is parsing the time field of the records in the request body.

@mlasevich
Copy link

Based on e.g. https://github.com/fluent/fluentd/pull/3046/files#diff-9c1e429b7574065cdb9677d43880606dR208 I would say that it is parsing the time field of the records in the request body.

Right, which is normally done in a parser plugin and not in the input itself - and it seems to be indicated the configuration of the test, since test is explicitly configuring the parser plugin a few lines above. But apparently it is ALSO processing it in the input. No wonder it is a mess- there are multiple plugins trying to do the same work with separate/competing configurations :-(

@github-actions
Copy link

This issue has been automatically marked as stale because it has been open 90 days with no activity. Remove stale label or comment or this issue will be closed in 30 days

@github-actions github-actions bot added the stale label Dec 18, 2020
@github-actions
Copy link

This issue was automatically closed because of stale in 30 days

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants