Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

"Unsafe" in UnsafeRelaxedJsonEscaping is misleading from a security standpoint, in both directions #87138

Closed
davidmatson opened this issue Jun 5, 2023 · 2 comments

Comments

@davidmatson
Copy link

davidmatson commented Jun 5, 2023

JSON is defined as a sequence of Unicode characters/code points (see both the RFC and ECMA standard), and as such, correct processing of JSON means all Unicode characters are "safe" to use in JSON.

Having the word "Unsafe" in UnsafeRelaxedJsonEscaping is misleading from a security standpoint, in both directions:

  1. Calling this implementation "Unsafe" is misleading in terms of the risk of using it.

There's nothing inherently "Unsafe" about a JSON document that contains characters from other scripts (such as Greek, Hebrew, Arabic, etc.) any more than English. The JSON RFC's security considerations section does not mention any risks with some scripts rather than others. The only risk here is with buggy code, but elsewhere .NET does not generally call something "Unsafe" if it meets the behavior of a standard. For example, we don't call it UnsafeProcess because if you pass an untrusted command line, bad things can happen, or UnsafeXmlSerializer, because if you fail to escape its output in a SQL command, bad things can happen.

This terminology impacts the decision customers make. For example, consider a StackOverflow response on this question, and the following comment:
"Using an "unsafe" encoding is not the answer, the answer from ahsonkhan is correct"

By calling this encoding "unsafe," we cause folks to conclude it is, by definition, the wrong answer, rather than understanding that the alternative simply adds an extra layer of defense against failure to encode output properly or bugs in intermediate layers.

  1. On the same principle, not calling the Default implementation "Unsafe" is misleading in terms of the risks of using it.

The default implementation does not escape * or /. If someone adds JSON output as part of a generated source file inside a multi-line comment, the JSON text produced by default is not safe - the JSON text can terminate the multi-line comment and be treated as executable source code. The same motivation for calling the other "Unsafe" with regards to embedding JSON in HTML and forgetting to escape it properly applies to the Default implementation, if the use is C/C#/C++ rather than HTML.

The possible fixes here include both docs and APIs.

In terms of docs, we can update them to make it explicit that "Unsafe" here is misleading and inconsistent.

In terms of APIs, the existing "Unsafe" property name likely cannot be removed for back-compat reasons, so I think there are two possible solutions in code:

  1. Add another property without the "Unsafe" prefix.
  2. Add another property with a better name altogether.

I tend to prefer the second option, specifically, a new "Unicode" property that does not escape any characters beyond the required ones (as suggested in #86800).

@ghost ghost added the untriaged New issue has not been triaged by the area owner label Jun 5, 2023
@ghost
Copy link

ghost commented Jun 5, 2023

Tagging subscribers to this area: @dotnet/area-system-text-json, @gregsdennis
See info in area-owners.md if you want to be subscribed.

Issue Details

JSON is defined as a sequence of Unicode characters/code points (see both the RFC and ECMA standard), and as such, correct processing of JSON means all Unicode characters are "safe" to use in JSON.

Having the word "Unsafe" in UnsafeRelaxedJsonEscaping is misleading from a security standpoint, in both directions:

  1. Calling this implementation "Unsafe" is misleading in terms of the risk of using it.

There's nothing inherently "Unsafe" about a JSON document that contains characters from other scripts (such as Greek, Hebrew, Arabic, etc.) any more than English. The JSON RFC's security considerations section does not mention any risks with some scripts rather than others. The only risk here is with buggy code, but elsewhere .NET does not generally call something "Unsafe" if it meets the behavior of a standard. For example, we don't call it UnsafeProcess because if you pass an untrusted command line, bad things can happen, or UnsafeXmlSerializer, because if you fail to escape its output in a SQL command, bad things can happen.

This terminology impacts the decision customers make. For example, consider a StackOverflow response on this question, and the following comment:
"Using an "unsafe" encoding is not the answer, the answer from ahsonkhan is correct"

By calling this encoding "unsafe," we cause folks to conclude it is, by definition, the wrong answer, rather than understanding that the alternative simply adds an extra layer of defense against failure to encode output properly or bugs in intermediate layers.

  1. On the same principle, not calling the Default implementation "Unsafe" is misleading in terms of the risks of using it.

The default implementation does not escape * or /. If someone writes JSON as part of a source code file inside a multi-line comment, the JSON text produced by default is not safe - the JSON text can terminate the multi-line comment and be treated as executable source code. The same motivation for calling the other "Unsafe" with regards to embedding JSON in HTML and forgetting to escape it properly applies to the Default implementation, if the use is C/C#/C++ rather than HTML.

The possible fixes here include both docs and APIs.

In terms of docs, we can update them to make it explicit that "Unsafe" here is misleading and inconsistent.

In terms of APIs, the existing "Unsafe" property name likely cannot be removed for back-compat reasons, so I think there are two possible solutions in code:

  1. Add another property without the "Unsafe" prefix.
  2. Add another property with a better name altogether.

I tend to prefer the second option, specifically, a new "Unicode" property that does not escape any characters beyond the required ones (as suggested in #86800).

Author: davidmatson
Assignees: -
Labels:

area-System.Text.Json, untriaged

Milestone: -

@ghost ghost added the in-pr There is an active PR which will close this issue when it is merged label Jun 5, 2023
davidmatson added a commit to davidmatson/runtime that referenced this issue Jun 5, 2023
Also fixes dotnet#86800.
Also fixes dotnet#87138 (except docs outside this repo).
@eiriktsarpalis
Copy link
Member

Closing in favor of #87153.

@ghost ghost removed the untriaged New issue has not been triaged by the area owner label Jun 6, 2023
@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label Jul 5, 2023
@ghost ghost locked as resolved and limited conversation to collaborators Aug 5, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants