Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unable to marshal response error #2662

Closed
calummoore opened this issue Oct 16, 2018 · 16 comments
Closed

Unable to marshal response error #2662

calummoore opened this issue Oct 16, 2018 · 16 comments
Assignees
Labels
kind/bug Something is broken.

Comments

@calummoore
Copy link

If you suspect this could be a bug, follow the template.

  • What version of Dgraph are you using?
    v1.0.9 (latest)

  • Have you tried reproducing the issue with latest release?
    Yes

  • What is the hardware spec (RAM, OS)?
    MacOS + Docker (4GB RAM for Docker) - --lru_mb=1024

  • Steps to reproduce the issue (command/config used to run Dgraph).
    Add the following mutation:

{
  "set": [
      { "log.message": "\u001b[32mHello World 1!\u001b[39m\n" }
  ]
}

Then run the following query (replacing 0x1d788 for new uid):

query {
  node (func: uid(0x1d788)) {
    log.message
  }
}
  • Expected behaviour and actual result.

Expected - return the log message that was set

Actual - {"errors":[{"code":"Error","message":"Unable to marshal response"}],"data":null}

@calummoore
Copy link
Author

So it looks like the escaping of \u is causing all the problems.

The mutation succeeds because it is encoded (e.g. via JSON.stringify(obj)) - but it seems like that escaping is being lost somewhere on ingestion into Dgraph.

@srfrog srfrog added the investigate Requires further investigation label Oct 16, 2018
@charlie-wasp
Copy link

I am experiencing, I guess, the same issue right now 🙂 In my case it's the string "KHL\u0000"

@manishrjain manishrjain assigned srfrog and unassigned gitlw Nov 26, 2018
@srfrog
Copy link
Contributor

srfrog commented Nov 28, 2018

This has to do with unicode optimizations done by Go. The escape sequence \u001b is converted to \x1b and \u0000 is \x0. Escape sequences/control chars are not allowed in JSON strings.

I'd suggest to add slash to those chars you know will be control chars and try again. e.g.,, \\u001b

@srfrog
Copy link
Contributor

srfrog commented Dec 13, 2018

any luck with this? I will close this issue, re-open if you think of any alternatives.

@srfrog srfrog closed this as completed Dec 13, 2018
@calummoore
Copy link
Author

@srfrog why is this being closed? The workaround is not desirable and is likely to catch a lot of people out. I would have thought it should only be used while a fix is pending?

At the moment, it's not documented anywhere, and it causes the entire query to be unreturnable without any clear error message. I really think this should be fixed!

@srfrog
Copy link
Contributor

srfrog commented Dec 18, 2018

The JSON spec states that control characters are not part of a string. You need to double-escape them with \\. This is not something we can fix in Dgraph per-se, the error comes from the Go json package when we try to marshal the response. It's documented here: https://en.wikipedia.org/wiki/JSON#Data_portability_issues

The alternative I was referring is that we would need to find a way to insert slashes for escape sequences without breaking normal unicode escapes. I think that you would know the data needs the slashes and add another pass to escape them instead of Dgraph doing this auto.

@manishrjain
Copy link
Contributor

I'm able to run this code, with no issues: https://play.golang.org/p/cUKMcr5UqC-

It looks like an issue specific to Dgraph, not Go.

package main

import (
	"encoding/json"
	"fmt"
)

func main() {
	fmt.Println("hey")

	m := make(map[string]string)
	m["key"] = "\u001b[32mHello World 1!\u001b[39m\n"
	fmt.Printf("map = %+v\n", m)

	js, err := json.Marshal(m)
	fmt.Printf("js = %s. Err = %v\n", js, err)
}

@manishrjain manishrjain reopened this Dec 18, 2018
@manishrjain manishrjain added kind/bug Something is broken. and removed investigate Requires further investigation labels Dec 18, 2018
@srfrog
Copy link
Contributor

srfrog commented Dec 18, 2018

That example is not accurate. This is actually what is going on: https://play.golang.org/p/qaUXuiAcP83

func main() {
	m := make(map[string]interface{})
	m["response"] = map[string]json.RawMessage{"key":json.RawMessage("\x1b[32mHello World 1!\x1b[39m\n")}
	fmt.Printf("map = %+v\n", m)

	js, err := json.Marshal(m)
	fmt.Printf("js = %s. Err = %v\n", js, err)
}

For all our responses we are returning a blind map using interface{}, we must do that because we expect different values at various levels (think of the query). So json.Marshal to string won't work. We need to use bytes, and when we convert bytes to string that's when it returns error.

Alternatives:

  1. Pre-scan the input string before we save it, adding slashes to control sequences (this was my suggestion above).
  2. Rewrite the response system to not use plain bytes, but that could be slow(er).
  3. Post-scan the value and add slashes before we return the response. This would be ideal for the data you already had stored.

@manishrjain
Copy link
Contributor

The behavior here (replacing \u00 with \x) is due to the fix for this issue: #1508

We use strconv.Quote when returning string results, which does the replacement. See it in action here:
https://play.golang.org/p/PYmDS8jhmY8

@manishrjain
Copy link
Contributor

manishrjain commented Dec 19, 2018

I made a small commit ae1d9f3
, which avoids calling json.Marshal, and directly writes the JSON response to a bytes.Buffer -- essentially an optimization. This change would allow the JSON response to be created, however, considering Go is converting \u to \x, you'd still get an error while trying to parse the JSON.

We have to use strconv.Quote (equivalent via fmt.Sprintf("%q")) for strings and bytes, there's no way around that. I think you could do what @srfrog is proposing, that is, do an extra escape to ensure that you get the \u back.

@niemeyer: Do you have any ideas regarding this particular issue?

@niemeyer
Copy link

I most likely don't understand the actual issue, as my reading from the above would indicate that the bug is that Dgraph is sending broken JSON on the wire by putting something in a json.RawMessage that is not in fact valid JSON. If that was the case, the straightforward solution would be of course to not do that, and make sure that anything sent through the wire to be consumed as JSON is in fact JSON.

That understanding comes from the two exchanges above where you said that this works but then the reply was that this doesn't. Yes, it doesn't work and it shouldn't, since what's going into RawMessage seems bogus?

But again, I'm sure you already had that much figured out. So what am I missing?

@manishrjain
Copy link
Contributor

Can you look at my last two comments? It includes a link to the issue you created some time ago, and a link to go playground which shows how it happens.

@niemeyer
Copy link

That issue is about a different file format which takes data in a different representation. The fix done there, which makes Dgraph read the format properly and store data in raw form, seems correct to me. The problem here, as far as I could perceive given the above exchanges, is that some data in non-JSON form is being embedded into a JSON document, and as expected that is breaking down. Go's quoting and JSON quoting are not interchangeable.

IOW, this is a bug if the data if the data is being embedded in a JSON result with RawMessage:

We use strconv.Quote when returning string results, which does the replacement.

@manishrjain
Copy link
Contributor

To clarify, this is not a RawMessage issue. We have our own JSON encoder which we use to generate JSON response, which we were then encapsulating in RawMessage. There was no need for that. In fact, I removed the RawMessage usage in my last change.

The issue is around this JSON string creation. So, we store data in the raw format, which is correct. Now, when we need to create the JSON, we have to put quotes around the data to create JSON key-value ("key": "rawdata"), which we do like this:
https://github.com/dgraph-io/dgraph/blob/ae1d9f3fa7ea7292c241a53215064b5ddff289e6/query/outputnode.go#L148

In essence, we use fmt.Sprintf("%q", rawdata) to generate a JSON string. It's possible that's not the best way. In fact, maybe doing a json.Marshal(rawdata), instead of fmt.Sprintf might be the right thing here, as demonstrated by this code snippet:

https://play.golang.org/p/W8Gug_yQOq9

@niemeyer
Copy link

In essence, we use fmt.Sprintf("%q", rawdata) to generate a JSON string. It's possible that's not the best way.

Yeah, that has the same effect, and thus is certainly a bug.

@manishrjain
Copy link
Contributor

We should be correctly dealing with JSON strings now. You can build Dgraph from master and try it out.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug Something is broken.
Development

No branches or pull requests

6 participants