Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add a quickcheck generator for aeson Values that match a swagger schema #162

Merged
merged 7 commits into from
Oct 30, 2018

Conversation

benweitzman
Copy link
Contributor

We can validate ToJSON instances by checking that an arbitrary value of a datatype matches against its swagger schema given by ToSchema when serialized using aeson.

This PR begins to add a way to validate FromJSON instances. In order to do so, we have to generate an arbitrary json value that matches we some swagger schema. We can use the schema to help guide the generator.

One we have a json value, we can then try and parse it to the haskell datatype using aeson. If the FromJSON instance does not match the ToSchema instance, then we will expect to see a parse error occasionally.

@fizruk
Copy link
Member

fizruk commented Aug 15, 2018

Thank you for the contribution!

I can see that validateFromJSON seems like a nice complement to validateToJSON.
However, I don't believe it is as useful (hence I didn't bother implementing it).
Can you elaborate on why you need it or why you think this might be useful?

Can you please add tests for types that are valid under validateFromJSON?

I suspect some types (e.g. IntMap) won't have this property, since Haskell types can express more properties than Swagger can. Can you document that as well?

@benweitzman
Copy link
Contributor Author

Since ToSchema is written separately from the aeson instances there is always a possibility of a mismatch in the implementations. It is easy to forget to update one when updating the other, leaving out a required property, etc. this isn’t a problem if you use generic derivations but that’s not always possible if you are dealing with external APIs that have specific requirements or if you have to maintain some backwards compatibility, for example.

Checking that a ToJSON instance is valid against a schema is useful when you have a server that is sending responses to a client that’s expecting them to be in a certain format or if you have a client that is sending requests.

This supports the opposite (and equally common) case: a server that is expecting request bodies in a certain format or a client that is expecting responses.

I develop an iOS app, so it's critical that the request/response structures continue to match what the app is expecting as I make changes to the server. I'm using servant-swagger to test for backwards compatibility in schemas, but that's not fully useful unless I can make sure that the aeson instances actually match those schemas. I'd like to use this in conjunction with servant-swagger to validate my entire API at once.

Can you clarify why you think IntMap would cause issues? It seems to validate to me:

> quickCheck $ validateFromJSON @(IM.IntMap String) Proxy
+++ OK, passed 100 tests.

@fizruk
Copy link
Member

fizruk commented Aug 15, 2018

This supports the opposite (and equally common) case: a server that is expecting request bodies in a certain format or a client that is expecting responses.

Sure I see the intent here. But a similar effect can also be achieved with validateToJSON + decode . encode ~ id, right? The reason I prefer to only rely on ToJSON is because instance almost always come in pairs and ToJSON is easier to test against ToSchema.

I'd like to use this in conjunction with servant-swagger to validate my entire API at once.

You may notice that validateEveryToJSON from servant-swagger validates both request responses AND request bodies. So it already got you covered if you have decode . encode ~ id test for your FromJSON instances.

Can you clarify why you think IntMap would cause issues?

I forgot that IntMap gets encoded as array. Map Int String should demonstrate what I had in mind, but it also passes your tests:

>>> quickCheck $ validateFromJSON @(Map Int String) Proxy
+++ OK, passed 100 tests.

However that's only because nothing useful is generated for these tests:

>>> sample $ genValue @(Map Int String) Proxy
Object (fromList [])
Object (fromList [])
Object (fromList [])
Object (fromList [])
Object (fromList [])
Object (fromList [])
Object (fromList [])
Object (fromList [])
Object (fromList [])
Object (fromList [])
Object (fromList [])

Swagger cannot put any constraints on its keys, so that at least in these cases your property won't hold.

@benweitzman I don't want to discourage you from working on this! Generating values based on Schema is useful and your version of validateFromJSON should provide a stronger test for types where it is applicable. It's just that I doubt that a real world API would pass validateEveryFromJSON.

@benweitzman
Copy link
Contributor Author

Sure I see the intent here. But a similar effect can also be achieved with validateToJSON + decode . encode ~ id, right? The reason I prefer to only rely on ToJSON is because instance almost always come in pairs and ToJSON is easier to test against ToSchema.

No, this is actually not correct, for a few reasons:

  1. Datatypes that have ToSchema instances do not almost always come in pairs, especially in a "real world API". There is no point in generating/writing ToJSON instances for things that only appear as request bodies, and in some cases it is useful to not write/generate them as it provides an extra compile time check that you aren't sending some data to the client that you shouldn't be. For example, not having a ToJSON instance for a password newtype might give you some extra security that you are not accidentally sending a plaintext password to a client.

  2. There is nothing enforcing that a ToJSON and a FromJSON instance match, they may be intentionally different or they may be accidentally different. Trying to test a FromJSON instance against a ToSchema instance by using decode . encode only checks that the ToJSON instance matches against the schema and that the FromJSON instance is compatible with the ToJSON instance. This does not mean in general that the FromJSON instance matches against the ToSchema instance.

  3. Using decode . encode will never catch an issue where a field that is required by FromJSON is not marked as such in ToSchema.

You may notice that validateEveryToJSON from servant-swagger validates both request responses AND request bodies. So it already got you covered if you have decode . encode ~ id test for your FromJSON instances.

This behavior is actually incorrect for the above reasons and I'll be submitting a PR soon, but this is unrelated to this PR.

Swagger cannot put any constraints on its keys, so that at least in these cases your property won't hold.

Right this is true. I haven't tried to use the additionalProperties field of swagger yet but it could be implemented and used to test map-like things.

I submit that it's actually the ToSchema instance for Map Int which is broken. Swagger cannot adequately describe this type of de/serialization. Writing an inaccurate instance for this type and then testing it via ToJSON gives a false sense of security that the implementation is correct. If the FromJSON test were to generate arbitrary textual keys (which is what is allowed by the swagger spec when you use the additionalProperties field), it would catch this issue.

@fizruk
Copy link
Member

fizruk commented Aug 15, 2018

There is no point in generating/writing ToJSON instances for things that only appear as request bodies

I find it very useful for logging, Haskell clients and integration tests.

Using decode . encode will never catch an issue where a field that is required by FromJSON is not marked as such in ToSchema.

This is indeed a serious argument in favour of your solution. I'm convinced 😄

I submit that it's actually the ToSchema instance for Map Int which is broken. Swagger cannot adequately describe this type of de/serialization.

Well, you still need schemas for these types in real world, right?
What about things like UTCTime and other formatted strings?

Writing an inaccurate instance for this type and then testing it via ToJSON gives a false sense of security that the implementation is correct.

My position on this is that Swagger is far from perfect and you can't really use it for a strong guarantee that your implementation is correct anyway. Although it makes sense to try getting better at it nonetheless 😄

@benweitzman
Copy link
Contributor Author

I added a generation of additional properties to help pick up some more issues.

The format section of the swagger spec is just not specific enough about what can go in that field and what it does, so I think there's nothing that can be done there. There are a few standardish formats that we could try and work off of but I think it's a challenge for another day.

Any type that is expecting some text to be in a certain format (UTCTime, Map Int Bool etc) should at least fail with this version so if this happens to you can decide how to proceed (change your type, remove the test). No false positives at least.

This will take some consideration on how to add to servant-swagger (I am imagining something like type validateEveryFromJSON = validateEveryFromJSONExcept '[] which would give users a way to whitelist specific problem types). Will give it a think.

@benweitzman
Copy link
Contributor Author

Can this be merged?

@benweitzman
Copy link
Contributor Author

@fizruk is there anything that is blocking this from being merged? Please let me know if there is so I can work on fixing it.

@fizruk
Copy link
Member

fizruk commented Sep 24, 2018

Apart from missing documentation I think I am okay with this.

@phadej can you review this please?

@fizruk fizruk requested a review from phadej September 24, 2018 10:03
@phadej phadej merged commit 151331d into GetShopTV:master Oct 30, 2018
@phadej
Copy link
Collaborator

phadej commented Oct 30, 2018

Ok. I think this is good. We can iterate this based on feedback, if needed.

Before releasing this, however, I'd like to drop hspec-dependency however (we'll probably have swagger2-hspec integrating it back). I'll prepare a patch soon-ish.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants