
Possible DOS due to hash collision #864

Closed
TomMD opened this issue Sep 11, 2021 · 48 comments

Comments

@TomMD
Contributor

TomMD commented Sep 11, 2021

As detailed in a blog post, Aeson is susceptible to DoS attacks when given maliciously crafted JSON. This is likely a tired conversation for many developers, but with the public posting today it seems good to have a public issue tracking the discussion and resolution.

@NorfairKing

Author here: Happy to help with this.

@phadej
Collaborator

phadej commented Sep 11, 2021

Indeed, it's an old and known problem. See e.g. https://oleg.fi/gists/posts/2021-05-19-dont-default-to-hashmap.html

Yet I'm not keen on changing Value; rather, I would skip Value in the decoding pipeline entirely. That would allow people to e.g. process key-value pairs in order (if that matters for them), deal with duplicate keys (again, if it matters), possibly work with the as-written values of number literals, etc.

@NorfairKing

Indeed, it's an old and known problem. See e.g. https://oleg.fi/gists/posts/2021-05-19-dont-default-to-hashmap.html

With all due respect, that makes this worse, not better.

Yet I'm not keen on changing Value; rather, I would skip Value in the decoding pipeline entirely.

That's a much bigger breaking change and would make it an entirely new library.
We have played around with this idea by making the equivalent of optparse-applicative, but for JSON parsing without an intermediate type.
That approach has different tradeoffs and can be cool, but it doesn't solve the problem at hand in aeson right now.

@TomMD
Contributor Author

TomMD commented Sep 11, 2021

@NorfairKing In the post you mentioned a rejected PR; could you link it here now that the issue is being discussed publicly?

@phadej Among the proposed paths forward, is there a short-term solution you'd be in favor of? This issue deserves a short-term fix, even if it isn't the desired end state.

@NorfairKing

@NorfairKing In the post you mentioned a rejected PR; could you link it here now that the issue is being discussed publicly?

It's linked in the post but here's the link again: haskell-unordered-containers/unordered-containers#217

@phadej
Collaborator

phadej commented Sep 11, 2021

@TomMD:

  • Introduce a Data.Aeson.Object module with an opaque Object newtype (fromList, toList, member); maybe TextMap v would be simpler, to avoid circular dependencies.
  • Add a manual flag to switch between Map and HashMap; the internals shouldn't matter if they are hidden behind an opaque TextMap.

EDIT: bonus points for providing a HashMap SHA256HashedText variant too; maybe the 64 lower bits would be enough to get collision resistance. That would be interesting benchmark-wise.
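
For illustration, here is a minimal sketch of what such an opaque wrapper could look like (the module name, the TextMap type, and the ORDERED_KEYMAP CPP flag are placeholders, not actual aeson API; as the rest of the thread shows, aeson 2 eventually shipped this idea as Data.Aeson.KeyMap behind an ordered-keymap cabal flag):

{-# LANGUAGE CPP #-}
module Data.Aeson.TextMap (TextMap, fromList, toList, member) where

import Data.Text (Text)

#ifdef ORDERED_KEYMAP
import qualified Data.Map.Strict as M

-- Ordered backend: no hashing at all, so no hash flooding, at the cost of O(log n) operations.
newtype TextMap v = TextMap (M.Map Text v)

fromList :: [(Text, v)] -> TextMap v
fromList = TextMap . M.fromList

toList :: TextMap v -> [(Text, v)]
toList (TextMap m) = M.toList m

member :: Text -> TextMap v -> Bool
member k (TextMap m) = M.member k m
#else
import qualified Data.HashMap.Strict as HM

-- Hashed backend: fast on average, but vulnerable to crafted collisions.
newtype TextMap v = TextMap (HM.HashMap Text v)

fromList :: [(Text, v)] -> TextMap v
fromList = TextMap . HM.fromList

toList :: TextMap v -> [(Text, v)]
toList (TextMap m) = HM.toList m

member :: Text -> TextMap v -> Bool
member k (TextMap m) = HM.member k m
#endif

Because the constructor is not exported, callers only ever see fromList/toList/member, so the flag can flip the representation without breaking downstream code.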

@lehins

lehins commented Sep 11, 2021

Just a suggestion, but maybe Blake2b_64/Blake2s_32 is a good solution here. Blake2b_256 is much faster than SHA-256, but maybe doing Blake2b_64 is even faster than just cropping a 256-bit hash to 64 bits.

I suspect that cooking up collisions for Blake2 (even with a small digest size) would be prohibitively expensive for the purposes of a DoS attack.
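
To make that concrete, here is a hedged sketch using cryptonite's Blake2b_256 and keeping only the first eight bytes of the digest (blakeHash64 is a made-up name, and this crops a 256-bit digest rather than computing a 64-bit one directly; it is not code from aeson or unordered-containers):

import Data.Bits (shiftL, (.|.))
import Data.Word (Word64)
import Data.Text (Text)
import qualified Data.Text.Encoding as TE
import qualified Data.ByteString as BS
import qualified Data.ByteArray as BA
import Crypto.Hash (Digest, hash)
import Crypto.Hash.Algorithms (Blake2b_256)

-- Hash a Text key with Blake2b-256 and fold the first eight bytes of the
-- digest into a Word64, so an attacker would need (truncated) Blake2
-- collisions to trigger worst-case behaviour.
blakeHash64 :: Text -> Word64
blakeHash64 t = BS.foldl' step 0 (BS.take 8 (BA.convert digest))
  where
    digest :: Digest Blake2b_256
    digest = hash (TE.encodeUtf8 t)
    step acc b = (acc `shiftL` 8) .|. fromIntegral b

Whether such a hash is fast enough compared to the default one is exactly the kind of thing that would need benchmarking.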

@konsumlamm

Another solution could be using some map data structure specialized to Text keys (something like a trie, for example text-trie (although I have no idea how performant that implementation is) or a ternary search tree). This would of course be a breaking change, but so would using Data.Map (and it could be hidden behind a newtype, as suggested by @phadej).

This would have the advantage (apart from not being susceptible to hash DoS attacks obviously) that it doesn't (potentially) cause lots of string comparisons (this could be a problem with Data.Map if lots of the keys share the same prefix). Disclaimer: I didn't run any benchmarks, so I'm not sure if this would actually be faster in practice than using Data.Map (which has been extensively optimized after all), especially since text-2.0 will apparently use a more efficient compare implementation (see here).
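
To make the trie idea concrete, here is a hedged, minimal character-trie sketch (illustrative only; it is not the text-trie API and works over String rather than Text):

import qualified Data.Map.Strict as M

-- Each node holds an optional value plus children keyed by the next character.
data Trie v = Trie (Maybe v) (M.Map Char (Trie v))

empty :: Trie v
empty = Trie Nothing M.empty

insert :: String -> v -> Trie v -> Trie v
insert []     v (Trie _  cs) = Trie (Just v) cs
insert (c:cs) v (Trie mv m)  =
  Trie mv (M.insert c (insert cs v (M.findWithDefault empty c m)) m)

-- Lookup walks the key once: no hashing, so no hash flooding, and shared
-- prefixes are only traversed once.
lookupTrie :: String -> Trie v -> Maybe v
lookupTrie []     (Trie mv _) = mv
lookupTrie (c:cs) (Trie _  m) = lookupTrie cs =<< M.lookup c m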

@phadej
Collaborator

phadej commented Sep 11, 2021

@konsumlamm my solution is to make Object opaque. Then people are free to experiment with different representations: secure, fast, both, or neither.

@phadej
Collaborator

phadej commented Sep 11, 2021

More concretely, something like: https://github.com/haskell/aeson/compare/textmap

The rest is just H.foo -> TM.foo all over the place. (And something to make FromJSONKeyCoerce work in happy situations)

I hope someone will complete that and make a PR.

@Boarders
Contributor

I'm going to try to push that through, @phadej.

@joelmccracken

@Boarders @phadej I was just considering that too. Please flag me if you could use any help.

@Boarders
Contributor

@phadej What did you have in mind for the Lift instance? Is there an easy way to write it?

@TravisCardwell

That would allow people to e.g. process key-value pairs in order (if that matters for them), ...

I often work with YAML files where the order of object properties is important. Humans edit YAML files, and a logical order of object properties maximizes usability. I do not know of a way to preserve that order in Haskell using existing libraries, and this issue unfortunately sometimes results in having to use a different language.

This comment is in agreement with the current direction, but I am adding it as a concrete example of a real-world issue that Oleg's suggestion would resolve, in addition to the hash collision issue.

@phadej
Collaborator

phadej commented Sep 13, 2021

@Boarders, thanks for working on this!

@phadej What did you have in mind for the Lift instance? Is there an easy way to write it?

Look at the diff; probably something like:

lift (TextMap m) = [| TextMap (H.fromList . map (first pack) $ m') |]
  where
    m' = map (first unpack) . H.toList $ m

It would be good to first introduce TextMap, and only then fiddle with its implementation.

@piyush-kurur

Just a suggestion, but maybe Blake2b_64/Blake2s_32 is a good solution here. Blake2b_256 is much faster than SHA-256, but maybe doing Blake2b_64 is even faster than just cropping a 256-bit hash to 64 bits.

I suspect that cooking up collisions for Blake2 (even with a small digest size) would be prohibitively expensive for the purposes of a DoS attack.

This is probably relevant:

google/highwayhash#28

@brendanhay

brendanhay commented Sep 13, 2021

I realise the following probably isn't advisable because of (.:) and friends, but it'd be nicer (IMO) as the lowest common denominator (aka default) to use Vector (Text, Value) and avoid {Hash,}Map completely. It's more in line with the spec in that it doesn't specify how duplicate keys are handled, preserves the original ordering of the keys, and doesn't tie you to Ord or Hashable, etc.

That said, TextMap looks like a clear improvement over the status quo. 👍🏽

@TravisCardwell

it'd be nicer (IMO) as the lowest common denominator (aka default) to use Vector (Text, Value) and avoid {Hash,}Map completely. It's more in line with the spec in that it doesn't specify how duplicate keys are handled, preserves the original ordering of the keys, and doesn't tie you to Ord or Hashable, etc.

I completely agree with this, as the order issue has been a significant pain point for me with the current implementation. I brainstormed about Aeson Object Design in my blog. Here are my ideas in brief, which address the (.:) issue.

The type is named ObjectEntries to emphasize that it is a list (Vector):

type ObjectEntries = Vector (Text, Value)

For working with other types, an Object type class defines the required API:

class Object o where
  lookup :: Text -> o -> Maybe Value

(.:) :: (FromJSON a, Object o) => o -> Text -> Parser a

...

Instances can be provided for built-in types, and users are free to define instances for other types:

instance Object ObjectEntries where
  lookup k = fmap snd . Vector.find ((== k) . fst)

instance Object (HashMap Text Value) where
  lookup = HashMap.lookup

...

The "with" function is named withObjectEntries to emphasize that it uses the list (Vector) by default:

withObjectEntries
  :: String
  -> (ObjectEntries -> Parser a)
  -> Value
  -> Parser a

Helper functions can be provided to perform conversions to other types:

asHashMap
  :: (HashMap Text Value -> Parser a)
  -> ObjectEntries
  -> Parser a
asHashMap f = f . HashMap.fromList . Vector.toList

The example from the documentation would be written as follows:

instance FromJSON Coord where
  parseJSON = withObjectEntries "Coord" . asHashMap $ \v -> Coord
    <$> v .: "x"
    <*> v .: "y"

This allows the user to decide how to implement each instance. For example, a public API may prioritize security while an internal or administrative API may prioritize performance.

Generic instances could have a suitable default and provide a compile-time flag that allows users to select an alternative, much like the current fix.

@brendanhay

For my own codebases, I'm not convinced that (.:) introducing a linear scan by way of Vector.find would even be slower, at least since the deserialisation routines (for, say, a record) never perform repeated lookups of the same object key. As mentioned in the Z-Data code I linked above, you're probably paying more to materialise a {Hash,}Map than what the linear scan for a single key introduces? It would naturally require actual benchmarks to prove that I am not, in fact, a moron for claiming this.

@phadej
Collaborator

phadej commented Sep 14, 2021

Isn't this issue about a supposedly O(1) operation degenerating to O(n) (i.e. linear)...

@TravisCardwell

TravisCardwell commented Sep 15, 2021

Isn't this issue about a supposedly O(1) operation degenerating to O(n) (i.e. linear)...

Indeed. (The DoS vulnerability is in the construction of the HashMap. Without collisions, fromList is O(n log n), which is practically O(n) given the high log base. With collisions, however, each insertion becomes linear, resulting in quadratic performance overall.)

The current fix allows users to globally choose between Map and HashMap using a flag. My suggestion provides a way for users to choose which data structure to use for each instance, using the flag to set the default for automatic (Generic) instances.

In some cases, it would even be possible to use the length of the list (better than Vector for this kind of conversion) to mitigate attacks. For example, consider an API that uses 10 properties. If the length of the list is above a certain threshold (10 for a strict API), an error could be returned; otherwise, the HashMap could be constructed and used as usual. This would let users mitigate attacks while keeping the performance of HashMap, with only the additional cost of determining the length of the list (a one-time O(n)).
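
A hedged sketch of that threshold idea (parseBoundedObject and its limit are illustrative, not proposed aeson API):

import qualified Data.HashMap.Strict as HM
import Data.Text (Text)

-- Refuse to build the HashMap for oversized objects, so a collision attack
-- never reaches the quadratic fromList behaviour.
parseBoundedObject :: Int -> [(Text, v)] -> Either String (HM.HashMap Text v)
parseBoundedObject maxPairs pairs
  | length pairs > maxPairs = Left "object has more keys than this API accepts"
  | otherwise               = Right (HM.fromList pairs)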

Would delaying the construction of the (Hash)Map until parseJSON be problematic? If the goal is to minimize API change, the withObject function could do the conversion to HashMap and keep the same type, while users can use the constructor (or new API functions) to access the list/Vector when necessary, in order to use Map for example.

Thank you very much for your time and patience.

@enobayram

enobayram commented Sep 15, 2021

Just wanted to chime in that I agree it'll be great for the Haskell ecosystem if whatever solution comes out of this discussion lets us keep the order of JSON object properties. Regardless of how wrong it may be to rely on the order of the properties, many tools and standards still do it, and it's a nightmare to work with them using Haskell. It suddenly feels like the entire ecosystem is taken away from you (i.e. not only aeson itself, but all the downstream packages that expose a convenient interface over aeson). I find it very unfortunate that this opinionated treatment of object properties is baked so deeply into Haskell's de-facto standard package for working with JSON. Now that it seems like a breaking change is inevitable, this might be the best time to tackle this issue as well.

@TravisCardwell

TravisCardwell commented Sep 18, 2021

For my own codebases, I'm not convinced that (.:) introducing a linear scan by way of Vector.find would even be slower, at least since the deserialisation routines (for, say, a record) never perform repeated lookups of the same object key. As mentioned in the Z-Data code I linked above, you're probably paying more to materialise a {Hash,}Map than what the linear scan for a single key introduces? It would naturally require actual benchmarks to prove that I am not, in fact, a moron for claiming this.

I ended up running some benchmarks (blog entry, code).

On my laptop (UPDATED after feedback from Taylor):

  • Lists marginally outperformed Vector for objects with 1 pair.
  • Lists marginally outperformed Map for objects with 6 or fewer pairs.
  • Lists marginally outperformed HashMap for objects with 8 or fewer pairs.
  • Vector marginally outperformed HashMap for objects with 17 or fewer pairs.
  • Vector marginally outperformed Map for objects with 35 or fewer pairs.
  • Map marginally outperformed HashMap for objects with 13 or fewer pairs.

In the context of the current fix, using Map should not result in decreased performance when dealing with small objects.

@tfausak

tfausak commented Sep 18, 2021

@TravisCardwell: Did you run those benchmarks with stack script as this suggests? If so, your script was effectively interpreted with runghc rather than compiled with ghc and then run. The docs say that if you want to compile with ghc -O0 you can pass --compile and if you want to compile with ghc -O1 you can pass --optimize.

I wouldn't expect Vector to be slower than [], and I especially wouldn't expect looking up about 40 keys to take 200 microseconds. I wrote my own benchmarks and got these results (note this is a log-log plot):

[log-log plot of the lookup benchmark results]

Once you get to n=32, HashMaps are the fastest. InsOrd is slower by a constant factor since it has to track the order of the keys. After n=64, Map.Lazy is the next fastest with Map.Strict just behind. ZVector and Vector are quite a bit slower, and the List is slower still. Looking at smaller numbers, at n=16 the vectors are the fastest. At n=2 List edges out Vector, and at n=0 List is the fastest, which makes sense because it doesn't have any associated startup costs.

My takeaway is that you should basically never use List, Vector is pretty good until you get more than 32 elements, and Map is worse than HashMap but not by much.

@TravisCardwell

Did you run those benchmarks with stack script as this suggests? If so, your script was effectively interpreted with runghc rather than compiled with ghc and then run. The docs say that if you want to compile with ghc -O0 you can pass --compile and if you want to compile with ghc -O1 you can pass --optimize.

Indeed I did! Thank you very much for catching that! I have updated my code and blog entry to fix the mistake. I also updated the stats in my comment above to make sure people are not misled by the problematic stats.

Thanks for sharing your benchmark code. I look forward to using gauge in the future!

My takeaway is that you should basically never use List, Vector is pretty good until you get more than 32 elements, and Map is worse than HashMap but not by much.

That sounds good. On my (old and slow) laptop, HashMap overtakes Vector from 18 pairs onward, so the exact crossover is system-dependent.

@ketzacoatl

Indeed, it's an old and known problem.

@phadej, if it's an old and known problem, why hasn't it been resolved? Is it too difficult? Too time-consuming? Not enough interest? Not in agreement there is a problem?

@hasufell
Member

If anyone has an API-compatible fork that fixes this issue, please let me know. I'll replace all my uses of aeson with it, even if performance is 10 times worse.

@phadej
Collaborator

phadej commented Oct 9, 2021

aeson-2.0.0.0 with the ordered-keymap flag is released.

@phadej phadej closed this as completed Oct 9, 2021
@NorfairKing

NorfairKing commented Oct 9, 2021

@phadej, do I understand correctly that this vulnerability is still there by default?

aeson/aeson.cabal

Lines 62 to 67 in 4ad6b40

flag ordered-keymap
  description:
    Use ordered @Data.Map.Strict@ for KeyMap implementation.
  default: False
  manual: True

@phadej
Collaborator

phadej commented Oct 9, 2021

I changed the default of ordered-keymap in 2.0.1.0: https://hackage.haskell.org/package/aeson-2.0.1.0

Though there is no guarantee it won't be changed again: that's the point of it being internal. If it is important, please pin the flag explicitly in your cabal.project / stack.yaml / ...

@ketzacoatl

@phadej, for us laymen, can you explain how this is resolved and what level of certainty we can place in the solution?

@phadej
Collaborator

phadej commented Oct 10, 2021

@ketzacoatl upgrade to aeson-2.0.1.0 or later; the underlying object implementation then uses Data.Map.

EDIT: i.e., the plan from #864 (comment) was implemented, but with different module/type names.

@ketzacoatl

@phadej, thank you for clarifying!

Are there tests that exist or could be added to this package as a regression test or otherwise help us confirm this is resolved?

@Lysxia
Collaborator

Lysxia commented Oct 11, 2021

@ketzacoatl What's a good way to test this? We could generate colliding inputs and make sure that the resulting map gets constructed in reasonable time, but that only prevents reverting back to unordered-containers specifically. Doesn't that seem overly narrow?

If the lesson learned is "don't use vulnerable hashmaps", I think enough noise was made about this issue to make maintainers vigilant for years to come, at least as far as aeson is concerned. I'm unsure about this, though: I could imagine abusing this line of thinking to justify not testing for any bugs, but at the same time I'm not convinced this issue in particular is one that can easily be tested against.

@phadej
Collaborator

phadej commented Oct 11, 2021

Testing performance issues is tricky. I'll be happy to learn how to make non-flaky performance tests.

@Lysxia
Collaborator

Lysxia commented Oct 11, 2021

When it's about the difference between O(n) and O(n^2), you can set n to a big number like n=10k, and then it's about detecting a difference on the order of a factor of 10k. That doesn't require much sophistication, does it? Make a test that runs in 0.1s without collisions but 100s with collisions, and set the cut-off at 1s.
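
As a hedged sketch of how small that test could be (it assumes some way of producing a collidingJson input whose keys all collide under the hash being tested, which is the genuinely hard part; the names here are made up):

import Control.DeepSeq (force)
import Control.Exception (evaluate)
import Data.Aeson (Value, decode)
import Data.Maybe (isJust)
import qualified Data.ByteString.Lazy as BL
import System.Timeout (timeout)

-- With an O(n log n) object representation this finishes almost instantly;
-- with a quadratic blow-up it hits the one-second cut-off and returns False.
noHashFlooding :: BL.ByteString -> IO Bool
noHashFlooding collidingJson = do
  r <- timeout 1000000 $  -- one second, in microseconds
         evaluate (force (decode collidingJson :: Maybe Value))
  pure (isJust r)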

@phadej
Collaborator

phadej commented Oct 11, 2021

For Data.Map it's O(n log n) always (or it should be). By making one measurement you don't know anything. We'd need to construct a series and fit the n log n curve with some confidence interval (I forgot all the statistics I knew to figure that out). (CPU) time measurements are not very precise (especially in cloud CIs, where the machines are very noisy). It would be better to count instructions, or something else stable. How? perf on Linux?

Doable, but I don't have time to spend on making that work reliably. Such a complexity-testing tool would be nice, though.

@NorfairKing

@phadej, this might fit your use case: https://github.com/nh2/haskell-cpu-instruction-counter

@phadej
Collaborator

phadej commented Oct 11, 2021

Sadly, it's not on Hackage: nh2/haskell-cpu-instruction-counter#8

@Lysxia
Collaborator

Lysxia commented Oct 11, 2021

By making one measurement you don't know anything.

I just explained why that's not the case. You do not need to actually measure the curve, just find one point where the discrepancy is large enough for all reasonable machines where one might run the test:

Make a test that runs in 0.1s without collisions but 100s with collisions, and set the cut-off at 1s.

How do you expect 0.1s on one machine to ever take more than 1s on another machine? (And these numbers are just for the sake of example; if a 10x discrepancy is actually not enough slack you can easily turn it up to 100x or 1000x.) You might get a false positive on a very busy machine, but then you're likely to be in much more trouble already anyway.

@phadej
Collaborator

phadej commented Oct 11, 2021

Why not a curve? For the test to pass, the curve should be e.g. n log n with some confidence. It is easy to make a test which reports true negatives, but we want a test which doesn't report false positives, i.e. one that doesn't say everything is OK when it isn't.

@Bodigrim
Contributor

You can test two points: if processing ten times as much data increases the time 50x, then it's highly unlikely to be O(n log n) and more likely O(n^2). One can use measureCpuTime to measure run time, and there is probably something similar available from criterion-measurement.
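
A hedged sketch of that two-point check, using base's getCPUTime rather than measureCpuTime (function names are illustrative):

import Control.DeepSeq (NFData, force)
import Control.Exception (evaluate)
import System.CPUTime (getCPUTime)

-- Time a fully forced computation in seconds (getCPUTime reports picoseconds).
timed :: NFData a => a -> IO Double
timed x = do
  t0 <- getCPUTime
  _  <- evaluate (force x)
  t1 <- getCPUTime
  pure (fromIntegral (t1 - t0) / 1e12)

-- Pass when ten times the input costs well under 100x the time; a quadratic
-- implementation would be roughly 100x slower on the larger input.
looksSubquadratic :: NFData b => (Int -> b) -> Int -> IO Bool
looksSubquadratic run n = do
  small <- timed (run n)
  big   <- timed (run (10 * n))
  pure (big < 50 * small)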

@ulysses4ever

I think @phadej wants to test a stronger hypothesis, to find out which asymptotics we are in: linearithmic or quadratic. But @Lysxia makes a (perfectly valid, IMO) point that a much easier test -- catching the difference between the two asymptotics (e.g. with ordered-keymap set explicitly vs. the implicit default) -- perfectly suffices: for large enough data you will have a large enough difference to cover any possible noise on CI machines.

@phadej
Collaborator

phadej commented Oct 12, 2021

@ulysses4ever but we have only one implementation at a time to test; there is nothing to compare it to.

Maybe I'm failing to explain myself, so I'd welcome a patch with your suggestions, so that we can discuss something concrete.

@Lysxia
Collaborator

Lysxia commented Oct 13, 2021

but we have only one implementation at a time to test; there is nothing to compare it to.

I could be misunderstanding you, but like all regression tests, I thought the point was to distinguish the implementation before and the implementation after the bug was fixed.

@NorfairKing Your opinion would be valuable on this. Is a regression test worth adding to this library, considering that it could realistically only catch a regression to a specific hashtable implementation with a specific hashing function, which is likely to end up being obsolete if people start to act on the issue in the unordered-containers library (haskell-unordered-containers/unordered-containers#319)?

@NorfairKing

@NorfairKing Your opinion would be valuable on this. Is a regression test worth adding to this library, [...] ?

I don't think so.
The best you could do against a resource exhaustion attack is to create a situation in which resources would be exhausted and then assert that resources are not exhausted.
We could do that in unordered-containers using a type for which collisions are common, but in aeson that involves using text-based collisions and I don't think it's worth the effort and maintenance work TBH.

@juhp
Contributor

juhp commented Oct 15, 2021

(I think it would be interesting to see any benchmark comparison(s) of 1.5 and 2.0 anyway)

@phadej
Collaborator

phadej commented Oct 15, 2021

@juhp

(I think it would be interesting to see any benchmark comparison(s) of 1.5 and 2.0 anyway)

See #883
