Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add datetime conversion for unwrapping #3866

Closed

Conversation

dannykopping
Copy link
Contributor

@dannykopping dannykopping commented Jun 18, 2021

What this PR does / why we need it:
This PR adds support for parsing dates in log lines.

Given a log line with a timestamp:

{"host":"235.123.149.110", "user-identifier":"connelly2350", "datetime":"18/Jun/2021:10:50:22 +0000", "method": "HEAD", "request": "/collaborative/frictionless", "protocol":"HTTP/1.1", "status":302, "bytes":18697, "referer": "http://www.chiefe-services.org/streamline"}

If we wanted to find the most recent line for each method label, it is not currently possible - to my knowledge.
This PR introduces a new datetime parser for unwrapping dates into UNIX timestamps, which can be used in aggregations.

last_over_time(
  {job="random"}
    | json
    | unwrap datetime(datetime)[1m]
) by (method)

image

Of course, as a follow up to this, we may want to expose each line's actual ingested timestamp as a meta-label, like __ts__ for example. This would obviate the need to parse the timestamp out of the line again.

Which issue(s) this PR fixes:
Fixes #

Special notes for your reviewer:
Using time.Parse doesn't work here because timestamps in the log line can be in any format. I opted to use https://github.com/araddon/dateparse as it's very thorough and widely used.

Just threw this together really quick for an experiment so still need to write docs, and maybe expand the tests to include benchmarks and maybe more examples.

Checklist

  • Documentation added
  • Tests updated

Copy link
Member

@owen-d owen-d left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the idea, but don't want to develop multiple ways to parse these across the code base (i.e. https://grafana.com/docs/loki/latest/clients/promtail/stages/timestamp/). I'd prefer to find a way to align our approaches, perhaps something like

ts() # no arguments, uses the underlying log's timestamp in loki
ts(foo, ISO_ABC) # parses the foo field, using ISO_ABC format

We could fill them with our formats from https://grafana.com/docs/loki/latest/clients/promtail/stages/timestamp/#reference-time

What do you think?

@kavirajk
Copy link
Contributor

There were some similar proposals previously!
#2925

@sandeepsukhani
Copy link
Contributor

I like the idea, but don't want to develop multiple ways to parse these across the code base (i.e. https://grafana.com/docs/loki/latest/clients/promtail/stages/timestamp/). I'd prefer to find a way to align our approaches, perhaps something like

ts() # no arguments, uses the underlying log's timestamp in loki
ts(foo, ISO_ABC) # parses the foo field, using ISO_ABC format

We could fill them with our formats from https://grafana.com/docs/loki/latest/clients/promtail/stages/timestamp/#reference-time

What do you think?

I like Owen's idea about unifying the parsing. The only problem is users would have to explicitly specify the format of the time, which would make it a little harder for the users to adapt. I doubt many users would be easily able to identify which ISO format the time is in. We can maybe automate the parsing everywhere using the dateparse code to make it seamless.

@dannykopping
Copy link
Contributor Author

Thanks for the thoughts, folks! Apologies for the delayed response.

There were some similar proposals previously!

Thanks @kavirajk! I missed this; I will check those out.

I like the idea, but don't want to develop multiple ways to parse these across the code base

@owen-d I agree 100%. Do you think there's room for adding another "magic" format like what I've suggested below in my response to Sandeep?

We can maybe automate the parsing everywhere using the dateparse code to make it seamless.

@sandeepsukhani totally! I'm thinking of perhaps introducing a "Autodetect" format alongside the predefined list which uses dateparse; would anyone be opposed to that?

The amount of time I've wasted trying to get date parsing right in Promtail vastly exceeds the amount of time I've spent on configuring the rest of Promtail, so this is something I'd personally love to have streamlined.

I'm going to look into implementing @owen-d's suggestion here as I think it's an improvement over my current design, and aligns with the existing patterns better.

ts() # no arguments, uses the underlying log's timestamp in loki
ts(foo, ISO_ABC) # parses the foo field, using ISO_ABC format

@cyriltovena
Copy link
Contributor

Love it, I also think date parsing could be leverage during label filtering.

| date_label > datetime(....)

@owen-d
Copy link
Member

owen-d commented Jul 8, 2021

@sandeepsukhani good points. Perhaps we can introduce a third form

ts() # no arguments, uses the underlying log's timestamp in loki
ts(foo) # tries to parse the foo field without a format hint
ts(foo, ISO_ABC) # parses the foo field, using ISO_ABC format

@stale
Copy link

stale bot commented Aug 21, 2021

This issue has been automatically marked as stale because it has not had any activity in the past 30 days. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale A stale issue or PR that will automatically be closed. label Aug 21, 2021
@stale stale bot closed this Aug 28, 2021
@marcusteixeira
Copy link
Contributor

Reopen?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
size/XL stale A stale issue or PR that will automatically be closed.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants