gloo-history Support custom query decoder / encoder #362

Closed
Roba1993 opened this issue Aug 5, 2023 · 9 comments · Fixed by #364

Comments

@Roba1993

Roba1993 commented Aug 5, 2023

Summary

Allow custom query decoder / encoder to support advanced use-cases.

Motivation

I have the problem that I need to serialize and deserialize Rust structs into a query string. This is done via serde and serde_json. For better readability and editability of the URL, I want to use my own serializer / deserializer, in my case along the lines of jsonurl.org.

Detailed Explanation

Unfortunately I don't have a clear idea of how to implement this in gloo right now.

Drawbacks, Rationale, and Alternatives

A drawback could be a more complex API.

An alternative would be to implement jsonurl.org behavior natively in gloo-history, but this would not be fully standards-conformant and would break existing links.

Unresolved Questions

I also tried to use jsonurl to create a query string and push that as the query, but it still gets serialized / deserialized with the default behavior...

@ranile
Collaborator

ranile commented Aug 5, 2023

What prevents you from pushing the query as:

struct Query {
    json_url: String
}

The url will be ?json_url=<query>

@xfbs
Contributor

xfbs commented Aug 6, 2023

Hey, I just ran into the same issue as well. I have a Query struct that is really simple, like this:

#[derive(Serialize, Deserialize)]
struct Query {
    pub query: Vec<Predicate>,
    pub sort: Option<String>,
}

And I get an error.

I believe that the underlying issue for this is that gloo uses serde_urlencoded to do the query string encoding and decoding, which is quite limited in terms of what it supports.

It seems that the serde_qs crate is a more powerful and actively maintained implementation of query string encoding and decoding. It supports any arbitrary Rust structs, enums and basically anything the type system allows.

I already use this crate to decode my queries with axum. For that reason, the crate has an axum feature. When enabled, it offers the serde_qs::axum::QsQuery extractor, whose interface matches that of the built-in extractor.

I would recommend one of these options:

  • adding a feature flag to gloo-history, maybe named query-qs which can be enabled to use serde-qs instead.
  • making the interface accept an enum
  • adding separate, optional methods to use a different encoder

I think this would bring quite a lot of value to people using more advanced query parameters. However, I believe this would be a breaking change.

If you would like, I could draft a PR with my idea for how it could look, in a way that does not break existing code.

Cheers and thanks for the work on this crate!

Edit: Some more context for this, see here. The serde-urlencoded crate sticks very closely to the RFC. As such, it really only supports key-value data, for example ?name=patrick&type=user. The serde docs link to serde_qs as the query string encoder of choice. From what I understand, anything serde-urlencoded can do, serde-qs can do as well, but the latter prioritizes being able to encode any valid Rust type as a query string. For example, it might encode an array of tags as ?tag[]=biking&tag[]=shedding, which is not defined like this in the standard but is used extensively. serde-urlencoded, by contrast, focuses simply on implementing the RFC correctly and only supports key-value types (so basically, single-level structs of strings: no support for Vec, BTreeMap, BTreeSet, or anything nested).
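To make the contrast concrete, here is a dependency-free sketch of the two styles. These are plain-std helper functions written for this illustration, not gloo, serde_urlencoded, or serde_qs APIs:

```rust
// Illustration only: the two query-string encoding styles discussed above.
// Flat key-value pairs are all a strictly RFC-style encoder handles; a
// qs-style encoder adds bracketed/indexed keys so sequences round-trip.

/// Flat form: one `key=value` pair per field, joined with `&`.
fn encode_pairs(pairs: &[(&str, &str)]) -> String {
    pairs
        .iter()
        .map(|(k, v)| format!("{}={}", k, v))
        .collect::<Vec<_>>()
        .join("&")
}

/// qs-style indexed form for a sequence-valued field.
fn encode_seq_qs_style(key: &str, values: &[&str]) -> String {
    values
        .iter()
        .enumerate()
        .map(|(i, v)| format!("{}[{}]={}", key, i, v))
        .collect::<Vec<_>>()
        .join("&")
}

fn main() {
    // Fine for single-level structs of strings:
    println!("{}", encode_pairs(&[("name", "patrick"), ("type", "user")]));
    // -> name=patrick&type=user

    // What a qs-style encoder might emit for a Vec<String> field:
    println!("{}", encode_seq_qs_style("tag", &["biking", "shedding"]));
    // -> tag[0]=biking&tag[1]=shedding
}
```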

@Roba1993
Author

Roba1993 commented Aug 7, 2023

@hamza1311, the answer from @xfbs is really on point.
In addition, regarding your question:

What prevents you from pushing the query as:

struct Query {
     json_url: String
 }

The url will be ?json_url=<query>

The string will still be URL-encoded again by serde_urlencoded, which automatically percent-encodes every reserved character like . ! + [ ] @, and that again destroys readability. But these sub-delimiters in particular need to stay readable for custom serialization / deserialization. RFC 3986 is the spec that describes them.
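For illustration, here is a rough pure-std sketch of form-style percent-encoding roughly as serde_urlencoded applies it (the escaping rules below follow the x-www-form-urlencoded serializer; this helper is written for this example and is not gloo code):

```rust
// Sketch: form-style percent-encoding, to show why readable delimiters
// like ( ) : , ! [ ] are lost when a pre-encoded value is pushed as a
// query parameter and encoded a second time.
fn form_urlencode(input: &str) -> String {
    let mut out = String::new();
    for b in input.bytes() {
        match b {
            // Characters left as-is by the x-www-form-urlencoded serializer:
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'_' | b'.' | b'*' => {
                out.push(b as char)
            }
            b' ' => out.push('+'),
            // Everything else, including RFC 3986 sub-delimiters, is escaped:
            _ => out.push_str(&format!("%{:02X}", b)),
        }
    }
    out
}

fn main() {
    // A jsonurl-style value loses its readable structure once form-encoded:
    let value = "(sort:name,tags:(a,b))";
    println!("{}", form_urlencode(value));
    // -> %28sort%3Aname%2Ctags%3A%28a%2Cb%29%29
}
```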

@xfbs Thanks for the detailed explanation. I didn't know about serde_qs, but it also looks suitable for my use cases.

@ranile
Collaborator

ranile commented Aug 7, 2023

@futursolo what are your thoughts on this? I'm thinking maybe we should switch over to serde_qs, or even use URLSearchParams from the browser so the encoding is handled at the browser level.

@Roba1993, does URLSearchParams work for your use case? You can try it in the browser console.

@Roba1993
Author

Roba1993 commented Aug 7, 2023

@hamza1311 I tested URLSearchParams and it works. I would vote in favor of this approach, because it's the most flexible and closest to the browser. serde_qs could work, but it is not a real standard and I don't know how stable and maintained it is and will be in the future.

@futursolo
Collaborator

futursolo commented Aug 8, 2023

I think using URLSearchParams would lose the current typed query serialisation / deserialisation; I am not sure this is a favourable change in terms of API design.

I am also concerned about introducing non-standard features.

For example, it might encode an array of tags as ?tag[]=biking&tag[]=shedding

I think this is not only non-standard, but also contradicts some of the more popular implementations:

Browser console (JavaScript):

> const qs = new URLSearchParams()
< undefined
> qs.append("tag", "biking")
< undefined
> qs.append("tag", "shedding")
< undefined
> qs.toString()
< "tag=biking&tag=shedding"

Python (urllib.parse.parse_qs):

>>> parse_qs("tag=biking&tag=shedding")
{'tag': ['biking', 'shedding']}
>>> parse_qs("tag[]=biking&tag[]=shedding")
{'tag[]': ['biking', 'shedding']}

If I pass Query { tag: vec!["biking", "shedding"] } to a serialiser, I would also expect tag=biking&tag=shedding as the outcome.

Non-standard nested behaviour is also historically known to cause security concerns, as it can deserialise to objects that are interpreted by MongoDB. One can introduce injections with expressions like value[$gt]=1.

In conclusion, I am in favour of keeping the serde_urlencoded behaviour as the default, as it adheres more closely to the traditional understanding of how x-www-form-urlencoded works.

I think that's also why warp and Axum both use serde_urlencoded as their default implementation for query strings.
tokio-rs/axum#504
seanmonstar/warp#22

However, I think an API allowing the query to be pushed as a string may be introduced to accommodate more flexible non-standard patterns. One can make a wrapper History / Location type with a custom query string if they want to use a non-standard serialiser / deserialiser, by reading / writing the string representation. An API that allows a custom serialiser / deserialiser would also work, but I am not sure how that would work without major internal changes, as we expect the history object to be the same for all browser / hash histories.

@xfbs
Contributor

xfbs commented Aug 8, 2023

I think using URLSearchParams will lose the current typed query serialisation / deserialisation, I am not sure if this is a favourable change in terms of the API design.

I agree with this. I do think that having some type-safe facility is essential. I think it makes sense to have an escape hatch for situations where none of the serde-powered encoding schemes work, but the main API should focus on being type-safe and, to some extent, "magic".

I am also concerned about introduce non-standard features.

Some context here: serde_qs aims for compatibility with the qs npm package, which is a dependency of over 15k packages on npm, including some rather prominent ones. I believe this package traces back to what Ruby on Rails was doing, the OG of magic web frameworks. So while there is no explicit specification, the encoding scheme it uses is not arbitrary.

If I pass Query { tag: vec!["biking", "shedding"] } to a serialiser, I would also expect tag=biking&tag=shedding as the outcome.

I do not disagree; that is a reasonable encoding for the query you have specified. The issue you are overlooking is that in the current situation, if you tried to encode this query you would get a panic: serde_urlencoded does not support arrays, and trying to encode this fails with Custom("unsupported value").

If you pass this to serde_qs, you will get tag[0]=biking&tag[1]=shedding. So the choice here is between not supporting arrays at all and supporting them with bracketed keys. I also prefer the encoding you have suggested, but my priority is to get something workable at all.

Non-standard nested behaviour is also historically known to cause security concerns as it can deserialise to objects that can be interpreted by MongoDB. One can introduce injections with expressions like: value[$gt]=1

I think I don't quite follow. It should not be our concern how MongoDB designs their APIs. This is simply a crate which exposes functionality to set the browser's history state. How this functionality is used, and making sure that it is used safely, is left up to the end user.

In conclusion, I am in favour of keeping serde_urlencoded behaviour as default as that is adhering to traditional belief of how x-www-form-urlencoded works more closely.

I believe that serde_urlencoded is seriously broken, because it does not allow serializing simple constructions such as arrays in query parameters. So we do need an alternative that is compatible with how the web ecosystem works.

That being said, I am also a proponent for leaving serde_urlencoded as the default encoding, because applications rely on it and I would not want to swap out the encoding mechanism from underneath them.

Rather, my proposal is to turn to the type system to find a way to give users a choice of which encoding method they would like to use. I will follow this comment up with a PR shortly to demonstrate what I mean. I am confident that we can find a solution that is pluggable and future-proof.

@xfbs
Contributor

xfbs commented Aug 8, 2023

With the PR I just made, the default behaviour is preserved. So you can write something like this:

#[derive(Serialize, Deserialize)]
struct Query {
    pub name: String,
    pub sort: Option<String>,
}

let query = Query {
    name: "name".into(),
    sort: None,
};

location.replace_with_query("path", &query)?;

And it will work and use serde_urlencoded.

However, if you want to encode something more complex, you can do it like this:

use gloo_history::query::Qs;

let query = Query {
    name: "name".into(),
    sort: None,
};

location.replace_with_query("path", &Qs(query))?;

With that Qs type, anything you wrap in it gets encoded using serde_qs instead of serde_urlencoded. Similarly, if you want to specify a raw query and not use any encoding at all, you can use the Raw type:

use gloo_history::query::Raw;
location.replace_with_query("path", &Raw("name=value&other=xyz"))?;

The same functionality exists for decoding queries as well:

// uses serde_urlencoded
let query: MyType = location.query()?;
// uses serde_qs
let query: MyType = location.query::<Qs<MyType>>()?;
// get raw query (just use query_str() instead)
let query: String = location.query::<Raw<String>>()?;

If you need a custom encoding or decoding strategy, you can manually implement the FromQuery and IntoQuery traits for your type. This achieves dependency inversion: any fancy future serde encoding method someone comes up with, you can use it without needing to update the gloo_history crate.
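As a rough sketch of what such a manual implementation could look like: the trait names FromQuery / IntoQuery come from the PR, but the signatures below and the pipe-separated codec are simplified assumptions made for this example, not the actual gloo_history API:

```rust
// Simplified stand-ins for the PR's extension traits; the real gloo_history
// traits have different signatures and error types.
trait IntoQuery {
    fn into_query(&self) -> Result<String, String>;
}

trait FromQuery: Sized {
    fn from_query(query: &str) -> Result<Self, String>;
}

// A hypothetical type with a custom pipe-separated encoding, chosen so
// that the values stay readable in the URL.
struct Tags(Vec<String>);

impl IntoQuery for Tags {
    fn into_query(&self) -> Result<String, String> {
        Ok(format!("tags={}", self.0.join("|")))
    }
}

impl FromQuery for Tags {
    fn from_query(query: &str) -> Result<Self, String> {
        let value = query
            .strip_prefix("tags=")
            .ok_or_else(|| "missing tags key".to_string())?;
        Ok(Tags(value.split('|').map(str::to_string).collect()))
    }
}

fn main() {
    let tags = Tags(vec!["biking".into(), "shedding".into()]);
    let q = tags.into_query().unwrap();
    println!("{}", q); // tags=biking|shedding

    // The codec round-trips without the history crate knowing about it:
    let back = Tags::from_query(&q).unwrap();
    assert_eq!(back.0, vec!["biking", "shedding"]);
}
```

Because the history API only depends on the traits, any such codec plugs in without changes to the crate itself, which is the dependency inversion described above.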

I believe that this is the most flexible approach, and it does not break existing code (with one hypothetical caveat, which I have explained in the PR).

@ranile ranile linked a pull request Aug 11, 2023 that will close this issue
@ranile
Collaborator

ranile commented Aug 11, 2023

Fixed by #364

@ranile ranile closed this as completed Aug 11, 2023