Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JsonConsoleFormatter: Should not put the same key in the scopes multiple times (not allowed in Json) #39664

Closed
maryamariyan opened this issue Jul 20, 2020 · 10 comments · Fixed by #39860

Comments

@maryamariyan
Copy link
Member

Description

From #34742 (comment):

If there are multiple scopes which are a string with a place holder, the {OriginalFormat} get added multiple time, but having duplicated keys in JSON, while correct, it is generally not recommended (see RFC8259 section 4. Also I'm unsure if it makes sense to have {OriginalFormat} when the content is only the place holder and it gets pulled out as key.

Repro code

using (logger.BeginScope(new { Message = "Hello" }))
using (logger.BeginScope(new KeyValuePair<string, string>("key", "value")))
using (logger.BeginScope(new KeyValuePair<string, object>("anotherkey", "anothervalue")))
using (logger.BeginScope(new Dictionary<string, object> { ["yetanotherkey"] = "yetanothervalue" }))
using (logger.BeginScope("A string"))
using (logger.BeginScope("This is a scope message with number: {CustomNumber}", 11123))
using (logger.BeginScope("{AnotherNumber}", 42))
{
    logger.LogInformation(new Exception(), "exception message with {0}", "stacktrace");
}

Output:

{
  "Timestamp": "2020-07-19T20:21:33.8836859\u002B01:00",
  "EventId": 0,
  "LogLevel": "Information",
  "Category": "JsonLogging.Program",
  "Message": "exception message with stacktrace",
  "Exception": {
    "Message": "Exception of type \u0027System.Exception\u0027 was thrown.",
    "Type": "System.Exception",
    "StackTrace": [],
    "HResult": -2146233088
  },
  "0": "stacktrace",
  "{OriginalFormat}": "exception message with {0}",
  "Scopes": {
    "0": "{ Message = Hello }",
    "1": "[key, value]",
    "2": "[anotherkey, anothervalue]",
    "yetanotherkey": "yetanothervalue",
    "3": "A string",
    "CustomNumber": "11123",
    "{OriginalFormat}": "This is a scope message with number: {CustomNumber}",
    "AnotherNumber": "42",
    "{OriginalFormat}": "{AnotherNumber}"
  }
}

Regression?

No

Other information

@ghost
Copy link

ghost commented Jul 20, 2020

Tagging subscribers to this area: @maryamariyan
Notify danmosemsft if you want to be subscribed.

@Dotnet-GitSync-Bot Dotnet-GitSync-Bot added the untriaged New issue has not been triaged by the area owner label Jul 20, 2020
@maryamariyan maryamariyan added this to the 5.0.0 milestone Jul 20, 2020
@maryamariyan maryamariyan removed the untriaged New issue has not been triaged by the area owner label Jul 20, 2020
@maryamariyan
Copy link
Member Author

maryamariyan commented Jul 21, 2020

As I was in search for a proper expected behavior, I looked up what Serilog would do. Here's a sample:

Code:

            Log.Logger = new LoggerConfiguration()
                .Enrich.FromLogContext()
                .WriteTo.Console(outputTemplate: "[{Timestamp:HH:mm:ss} {Level:u3}] {Message:lj} " +
                                                 "{Properties:j}{Exception}")
                .CreateLogger(); 

            var exampleUser = new { Id = 1, Name = "Jess", Created = DateTime.Now };
            using (LogContext.PushProperty("This is a scope message with number: {CustomNumber}", 11123))
            using (LogContext.PushProperty("{AnotherNumber}", 42))
            {
                Log.Information("Created {@User} on {Created}", exampleUser, DateTime.Now);
            }

            Log.CloseAndFlush();

Output:

[16:20:36 INF] Created {"Id": 1, "Name": "Jess", "Created": "2020-07-20T16:20:36.3982501-07:00"} on 07/20/2020 16:20:36 {"{AnotherNumber}": 42, "This is a scope message with number: {CustomNumber}": 11123}

Takeaway

For the sample in the description of this issue, we could theoretically consider changing the below Actual behavior:

    "CustomNumber": "11123",
    "{OriginalFormat}": "This is a scope message with number: {CustomNumber}",
    "AnotherNumber": "42",
    "{OriginalFormat}": "{AnotherNumber}"

To below expected behavior

    "This is a scope message with number: {CustomNumber}": "11123",
    "{AnotherNumber}": "42",

But, BeginScopes allows to have multiple named properties like below:

using (logger.BeginScope("This is a scope message with number: {CustomNumber}, and {CustomString}", 11123, "some string"))

which I'm not sure if its possible with Serilog. Serilog has LogContext.PushProperties but that doesn't allow to take a single string with multiple named properties.

cc: @davidfowl @nblumhardt

@maryamariyan
Copy link
Member Author

maryamariyan commented Jul 21, 2020

I think the proper expected behavior could look like this:

"{OriginalFormat}" : [ 
   "This is a scope message with number: {CustomNumber} and {CustomString}",
   "{AnotherNumber}"
],
"CustomNumber" : "11123", 
"CustomString" : "some string"
...

@alefranz
Copy link
Contributor

alefranz commented Jul 22, 2020

I think the proper expected behavior could look like this:

"{OriginalFormat}" : [  
   "This is a scope message with number: {CustomNumber} and {CustomString}",
   "{AnotherNumber}"
],
"CustomNumber" : "11123", 
"CustomString" : "some string"
...

Do you mean {OriginalFormat} to be an array?

Also a similar problem will arise not only with the format but also with properties, if you have for example multiple scopes with the same property name, or if a property as the same name of one of the keys used in the json, which practically became banned property name, unless the properties are moved in a dedicated object.

Serilog put the properties in a property bag and collapse all the properties, keeping the value of the last one. It also combines the properties between the scopes and the message. Of course this is an opinionated solution that looe expressiveness.

Extending the example to also add duplicated properties:

            using (logger.BeginScope(new { Message = "Hello" }))
            using (logger.BeginScope(new KeyValuePair<string, string>("key", "value")))
            using (logger.BeginScope(new KeyValuePair<string, object>("anotherkey", "anothervalue")))
            using (logger.BeginScope(new Dictionary<string, object> { ["yetanotherkey"] = "yetanothervalue" }))
            using (logger.BeginScope("A string"))
            using (logger.BeginScope("This is a scope message with number: {CustomNumber}", 11123))
            using (logger.BeginScope("{AnotherNumber}{FinalNumber}", 2, 42))
            using (logger.BeginScope("{AnotherNumber}", 3))
            {
                logger.LogInformation(new Exception(), "exception message with {0} and {CustomNumber}", "stacktrace", 123);
            }

MEL JsonConsoleFormatter:

{
  "Timestamp": "2020-07-22T08:44:01.1163808\u002B01:00",
  "EventId": 0,
  "LogLevel": "Information",
  "Category": "JsonLogging.Program",
  "Message": "exception message with stacktrace and 123",
  "Exception": {
    "Message": "Exception of type \u0027System.Exception\u0027 was thrown.",
    "Type": "System.Exception",
    "StackTrace": [],
    "HResult": -2146233088
  },
  "0": "stacktrace",
  "CustomNumber": "123",
  "{OriginalFormat}": "exception message with {0} and {CustomNumber}",
  "Scopes": {
    "0": "{ Message = Hello }",
    "1": "[key, value]",
    "2": "[anotherkey, anothervalue]",
    "yetanotherkey": "yetanothervalue",
    "3": "A string",
    "CustomNumber": "11123",
    "{OriginalFormat}": "This is a scope message with number: {CustomNumber}",
    "AnotherNumber": "2",
    "FinalNumber": "42",
    "{OriginalFormat}": "{AnotherNumber}{FinalNumber}",
    "AnotherNumber": "3",
    "{OriginalFormat}": "{AnotherNumber}"
  }
}

Serilog JsonFormatter:

{
    "Timestamp": "2020-07-22T08:44:01.1241557+01:00",
    "Level": "Information",
    "MessageTemplate": "exception message with {0} and {CustomNumber}",
    "RenderedMessage": "exception message with \"stacktrace\" and 123",
    "Exception": "System.Exception: Exception of type 'System.Exception' was thrown.",
    "Properties": {
        "0": "stacktrace",
        "CustomNumber": 123,
        "SourceContext": "JsonLogging.Program",
        "AnotherNumber": 3,
        "FinalNumber": 42,
        "yetanotherkey": "yetanothervalue",
        "Scope": [
            "{ Message = Hello }",
            "[key, value]",
            "[anotherkey, anothervalue]",
            "A string",
            "This is a scope message with number: 11123",
            "242",
            "3"
        ]
    }
}

Another option for the JsonFormatter, to avoid having to collapse the properties and loosing information, could be to keep the scopes separated, however this would make aggregating logs harder so it's not probably a good idea:
included only a few for brevity

{
    "Scopes": [
        {
            "Message": "{ Message = Hello }"
        },
        {
            "Message": "[anotherkey, anothervalue]",
            "yetanotherkey": "yetanothervalue"
        },
        {
            "Message": "This is a scope message with number: {CustomNumber}",
            "{OriginalFormat}": "This is a scope message with number: {CustomNumber}",
            "Properties": { "CustomNumber": "11123" }
        }
    ]
}

I think your suggestion to aggregate the {OriginalFormat} is the best one, but I wonder if it would make sense to do the same with the the rendered message, instead of using numbered keys (it's true that in some systems you can navigate arrays the same if it is an array or an object with zero based keys).

Also note the current implementation is missing the rendered message for some of the scopes, not sure if that should be tracked as separate issue?

I've putted the sample code with this changes to have more edge cases here: https://github.com/alefranz/DotNetPlayground/blob/json-logging/JsonLogging/Program.cs

Sorry if I am trying to push the scope of this feature. I've had to deal with formatting logs from ASP.NET Core microservices and it would be nice to get an out of the box solution to have logs in a great format that can be easily queried and aggregated in systems like Seq, Kibana, etc...

Thank you,
Alessio

@maryamariyan
Copy link
Member Author

maryamariyan commented Jul 22, 2020

Do you mean {OriginalFormat} to be an array?

yes.

Sorry if I am trying to push the scope of this feature. I've had to deal with formatting logs from ASP.NET Core microservices and it would be nice to get an out of the box solution to have logs in a great format that can be easily queried and aggregated in systems like Seq, Kibana, etc...

Thanks for trying out the preview. These are great feedback which should definitely be looked into.

Also note the current implementation is missing the rendered message for some of the scopes, not sure if that should be tracked as separate issue?

Yes. Would you please add a separate issue to track this too? Though I just think the rendered message is just the value of Message in the Json logs.

@nblumhardt
Copy link

nblumhardt commented Jul 23, 2020

Hi @maryamariyan - sorry I'm late to reply; just a thought, we consider JsonFormatter to be "legacy" for Serilog, and generally recommend/use the (quite different) CompactJsonFormatter or RenderedCompactJsonFormatter from https://github.com/serilog/serilog-formatting-compact whenever asked.

Not sure how relevant that is to the current issue (only skim read), but there could be more learnings in our newer JSON formatter worth considering, if it's not too late to change the schema of MEL's JSON output.

In particular, the new format uses very short names for built-in properties like Message - we use @m - and this adds up to quite a few bytes saved when logging a lot of small events.

We also use a naming scheme that generally leaves user-defined properties at the root level in the resulting JSON document, e.g. if a user event carries a first-class property called Message (maybe it's a message-driven app), then this ends up in the document where the user would expect, since built-in properties in Serilog's new format all begin with @ and thus very rarely conflict.

Some format details in the linked repo, hope it helps!

@maryamariyan
Copy link
Member Author

maryamariyan commented Jul 24, 2020

In particular, the new format uses very short names for built-in properties like Message - we use @m - and this adds up to quite a few bytes saved when logging a lot of small events.

Thanks for the idea. It should make sense for MEL's JSON then to also rename Message to @m or @msg for the same reasons. I'll open a separate issue to address this. Perhaps JsonConsoleFormatterOptions could have an extra property called IsCompact, which would switch:
Message to @m, Timestamp -> @t, etc.
Similar to what Serilog does for compact json formatting in your link: https://github.com/serilog/serilog-formatting-compact

@nblumhardt
Copy link

Cool! 🙂 FWIW, there are a few tools built to work with the Serilog ("CLEF") format that might be interesting to check out, if you'd consider using the same schema:

Could be food for thought, anyway :-) - HTH!

@maryamariyan
Copy link
Member Author

maryamariyan commented Jul 25, 2020

@nblumhardt thanks for the links. I created #39917 in the meantime. But will check the above resources to make comparisons.

@alefranz
Copy link
Contributor

alefranz commented Jul 25, 2020

Using those compact identifiers will only mitigate it as it would still cause a collision in the unlikely event that you use that name as a property e.g. logger.BeginScope("{@m}", 3)
I have a couple of alternatives in mind.

  1. I believe there is special handling in place for curly braces, which is the reason why {OriginalFormat} has no risk of collision in the property bag, so using something like {Message} as property name should solve the collision problem, as if you had that as a token it would be ignored e.g. logger.BeginScope("{{Message}}", 3) does not add a {Message} property.
    While this will not affect readability much , having a special character it will make writing queries in a log ingestion system harder.

  2. Moving the properties to a dedicated nested object (e.g. State) would avoid any risk of collision. This is what is already happening for the top level message, so it would feel natural to also have the same behaviour in the scopes.

@ghost ghost closed this as completed in #39860 Jul 27, 2020
@ghost ghost locked as resolved and limited conversation to collaborators Dec 8, 2020
This issue was closed.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants