Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support custom json and base64 encoders for Token and Parser #301

Open
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

dcalsky
Copy link
Contributor

@dcalsky dcalsky commented Apr 2, 2023

This PR adds support for Token and Parser using custom JSON and base64 encoders to encode/decode related data for performance and experimentation purpose.

By implementing the Base64Encoder interface with EncodeToString and DecodeString methods to create a custom base64 encoder, while implementing JSONEncoder with Marshal and Unmarshal methods to create a custom json encoder.

import (
	"github.com/bytedance/sonic"
	asmbase64 "github.com/segmentio/asm/base64"
)

type sonicJsonEncoder struct{}

func (s *sonicJsonEncoder) Marshal(v any) ([]byte, error) {
	return sonic.Marshal(v)
}

func (s *sonicJsonEncoder) Unmarshal(data []byte, v any) error {
	return sonic.Unmarshal(data, v)
}

type asmBase64Encoder struct{}

func (s *asmBase64Encoder) EncodeToString(data []byte) string {
	return asmbase64.StdEncoding.EncodeToString(data)
}

func (s *asmBase64Encoder) DecodeString(data string) ([]byte, error) {
	return asmbase64.RawURLEncoding.DecodeString(data)
}

Then Using WithJSONEncoder and WithBase64Encoder to create a Parser, and using WithTokenJSONEncoder and WithTokenBase64Encoder to create a Token.

// create the token and parser  with json encoder based on sonic and base64 encoder based on asm/base64
token := jwt.NewWithClaims(jwt.SigningMethodHS256, jwt.RegisteredClaims{}, jwt.WithTokenJSONEncoder(&sonicJsonEncoder{}), jwt.WithTokenBase64Encoder(&asmBase64Encoder{}))

parser := jwt.NewParser(jwt.WithJSONEncoder(&sonicJsonEncoder{}), jwt.WithBase64Encoder(&asmBase64Encoder{}))

The disadvantages of this PR are (Any suggestions are welcomed):

  • Introduce the third-party libs into the go module
  • due to almost all of functions and structs are under the same package, the optional methods naming might be confused, like WithJSONEncoder and WithTokenJSONEncoder

Resolves #168

@oxisto
Copy link
Collaborator

oxisto commented Apr 2, 2023

Very nice! I will do a review over the next days, but it’s basically the direction I envisioned. Just some general pointers: We want to avoid the inclusion of any third party libs, even if it’s only in the tests. I would suggest only testing the interface itself in the test without the actual sonic implementation.

@dcalsky dcalsky force-pushed the main branch 2 times, most recently from cd88b49 to 7c726ab Compare April 2, 2023 15:54
@dcalsky
Copy link
Contributor Author

dcalsky commented Apr 2, 2023

Very nice! I will do a review over the next days, but it’s basically the direction I envisioned. Just some general pointers: We want to avoid the inclusion of any third party libs, even if it’s only in the tests. I would suggest only testing the interface itself in the test without the actual sonic implementation.

I've updated the PR according to your advice.

token_option.go Outdated Show resolved Hide resolved
encoder.go Outdated Show resolved Hide resolved
parser.go Outdated Show resolved Hide resolved
parser.go Outdated Show resolved Hide resolved
parser.go Outdated Show resolved Hide resolved
parser.go Outdated Show resolved Hide resolved
parser.go Outdated Show resolved Hide resolved
parser_option.go Outdated Show resolved Hide resolved
parser_option.go Outdated Show resolved Hide resolved
token.go Outdated Show resolved Hide resolved
token.go Outdated Show resolved Hide resolved
@alam-chime
Copy link

thanks @dcalsky
+1 to this PR

@oxisto
Copy link
Collaborator

oxisto commented Sep 13, 2023

Thanks for the initial work on this @dcalsky. I had another go at this, basically splitting up encoding and decoding and also converting most of the interfaces into function types. This should be pretty good integrated in the existing code base now. It even supports both parsing JSON with and without JSON number (as long as the underlying JSON library supports it).

I wonder whether I could also make our "special" base64 options (strict, allow padding) work with external libraries as well. For now, they are disabled when an external decoder is used.

If we are happy with this approach, I want to finalise docs and examples.

cc @mfridman @dcalsky

@oxisto
Copy link
Collaborator

oxisto commented Feb 1, 2024

@mfridman Do we still want to pursue this? It did gather some interest from the community. If yes, I would do another cleanup pass and add docs and examples/tests.

@mfridman
Copy link
Member

mfridman commented Feb 1, 2024

I'll review this in the next few days.

@mfridman
Copy link
Member

mfridman commented Feb 5, 2024

It seems useful, albeit the number of requests for this was quite low.

I was mainly waiting to see what comes of golang/go#63397. If the native std lib could be improved then bringing your own encoding package becomes unnecessary?

I'm a bit on the fence on this one, but if we get it into a good enough shape I wouldn't block. My main hesitation is all the plumbing we have to do to support this in the current /v5 API.

@MikeYast
Copy link

MikeYast commented Jul 4, 2024

Hi,

I am waiting for this change because we are experiencing performance issues with large JWT tokens using the standard JSON parser.

I was mainly waiting to see what comes of golang/go#63397. If the native std lib could be improved then bringing your own encoding package becomes unnecessary?
According to their proposal, they are not going to improve performance significantly (like sonic json parser is doing).

Personally, I am waiting for this change or any other change which allows us to replace standard JSON parser.

@Semerokozlyat
Copy link

Hello, would like to express an urgent demand in this change to boost up our JWT parsing logic on a highly loaded service.

@oxisto
Copy link
Collaborator

oxisto commented Jul 4, 2024

It seems useful, albeit the number of requests for this was quite low.

I was mainly waiting to see what comes of golang/go#63397. If the native std lib could be improved then bringing your own encoding package becomes unnecessary?

I'm a bit on the fence on this one, but if we get it into a good enough shape I wouldn't block. My main hesitation is all the plumbing we have to do to support this in the current /v5 API.

@mfridman seems there is some more interest in this. Can you maybe give this review another go? I think it's not that much plumbing making it work. It is mainly two additional if-clauses in token and parse. The rest are just all optional options.

@mfridman
Copy link
Member

mfridman commented Jul 4, 2024

Thanks for the ping @oxisto, I'll take a pass this weekend. I get the notification but typically lack time during the week for more in-depth reviews.

@dcalsky
Copy link
Contributor Author

dcalsky commented Jul 5, 2024

@oxisto Almost forget this PR... I will fine-tune this PR to ensure minimum changes today later or tomorrow.

@dcalsky
Copy link
Contributor Author

dcalsky commented Jul 6, 2024

I have written some example tests using custom decoder and encoder. the new methods are very smooth and easy to use. It's time to merge them into the our main branch.

@oxisto
Copy link
Collaborator

oxisto commented Jul 6, 2024

I have written some example tests using custom decoder and encoder. the new methods are very smooth and easy to use. It's time to merge them into the our main branch.

Thanks! That also got rid of the pesky code coverage warning.

I agree, I like the public interface part of this. I think @mfridman's concern is more about the inner workings of this inside parse/etc. I would abstain from further review comments now (expect small nits) and give this into the hands of @mfridman since I heavily reworked this PR myself so I am not really neutral as a reviewer here :)

DecodeString(s string) ([]byte, error)
}

type StrictFunc[T Base64Encoding] func() T
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if this type really needs to be exported at all


type StrictFunc[T Base64Encoding] func() T

type Stricter[T Base64Encoding] interface {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This at least needs a comment if we want to keep It exported, it could probably also be private I guess

Strict() T
}

func DoStrict[S Base64Encoding, T Stricter[S]](x T) Base64Encoding {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Double-check if this is actually still needed, which I think it is not?

// unmarshal algorithms.
type JSONUnmarshalFunc func(data []byte, v any) error

type JSONDecoder interface {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs a comment because its a public interface

Decode(v any) error
}

type JSONNewDecoderFunc[T JSONDecoder] func(r io.Reader) T
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs a comment

decoding
}

type decoding struct {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

its not private, but still we probably could use some comments here, especially for this outer struct

encoders
}

type encoders struct {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

its not private, but still we probably could use some comments here, especially for this outer struct

@oxisto oxisto requested a review from mfridman July 6, 2024 07:26
} else {
err = json.Unmarshal(claimBytes, &claims)
// If `useJSONNumber` is enabled, then we must use a dedicated JSONDecoder
// to decode the claims. However, this comes with a performance penalty so
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

small nit: we only know definitly that the performance penalty occurs when using json.NewDecoder. Third-party libraries might not have that penalty, so we might at least state this like ... a performance penalty (when using the standard library decoder).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any evidence?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. #303

@@ -121,8 +124,80 @@ func WithPaddingAllowed() ParserOption {
// WithStrictDecoding will switch the codec used for decoding JWTs into strict
// mode. In this mode, the decoder requires that trailing padding bits are zero,
// as described in RFC 4648 section 3.5.
//
// Note: This is only supported when using [encoding/base64.Encoding], but not
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this note is not true anymore, since we are checking whether the library supports a Strict() function. We need to check this.


// WithJSONDecoder supports a custom JSON decoder to use in parsing the JWT.
// There are two functions that can be supplied:
// - jsonUnmarshal is a [JSONUnmarshalFunc] that is used for the
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: double "the"

}
}

// WithBase64Decoder supports a custom Base64 when decoding a base64 encoded
Copy link
Collaborator

@oxisto oxisto Jul 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

grammar: "a custom Base64 [something missing here]"

@@ -3,3 +3,15 @@ package jwt
// TokenOption is a reserved type, which provides some forward compatibility,
// if we ever want to introduce token creation-related options.
type TokenOption func(*Token)

func WithJSONEncoder(f JSONMarshalFunc) TokenOption {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missing comments here

}
}

func WithBase64Encoder(enc Base64Encoding) TokenOption {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missing comments here

@dcalsky
Copy link
Contributor Author

dcalsky commented Oct 15, 2024

Do we just miss the code annotation part now, if it is supplied, the PR would pass? @oxisto


// Choose our JSON decoder. If no custom function is supplied, we use the standard library.
var unmarshal JSONUnmarshalFunc
if p.jsonUnmarshal != nil {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, why are we determining whether a custom JSON marshaller is present in the Parse() method instead of during the Parser creation? The same applies to other places. Parser.jsonUnmarshal should be initialized during Parser construction, as we're not planning to replace JSON marshallers or base64 encoders on the fly.

@@ -200,18 +227,37 @@ func (p *Parser) ParseUnverified(tokenString string, claims Claims) (token *Toke
// take into account whether the [Parser] is configured with additional options,
// such as [WithStrictDecoding] or [WithPaddingAllowed].
func (p *Parser) DecodeSegment(seg string) ([]byte, error) {
encoding := base64.RawURLEncoding
var encoding Base64Encoding
if p.rawUrlBase64Encoding != nil {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same comment about stage where custom base64 encoder function set. We can set p.rawUrlBase64Encoding to base64.RawURLEncoding or custom function on Parser creation

Copy link
Collaborator

@oxisto oxisto Oct 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added this as a safe guard if people ever came across the crazy idea to directly create a Parser struct, which is unfortunately possible because we did not hide it behind an interface. But yes, if we assume that in the case you are doing that you REALLY know what you are doing we can do it in the init

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that sounds reasonable.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Supporting alternative base64/json encoder/decoder
6 participants