Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Store a hash in a NormalizedUri #214

Merged
merged 2 commits into from
Jan 28, 2020
Merged

Store a hash in a NormalizedUri #214

merged 2 commits into from
Jan 28, 2020

Conversation

mpickering
Copy link
Contributor

NormalizedUris are supposed to be used as keys in maps. Using a HashMap
with a cached key is far faster than using a Map with the slow ordering
instance.

In particular, ghcide was hammering updateDiagnostics and the compare
function for NormalizedUri was dominating the profile.

NormalizedUris are supposed to be used as keys in maps. Using a HashMap
with a cached key is far faster than using a Map with the slow ordering
instance.

In particular, ghcide was hammering updateDiagnostics and the `compare`
function for `NormalizedUri` was dominating the profile.
where (Uri t) = maybe uri filePathToUri (uriToFilePath uri)
-- To ensure all `Uri`s have the file path like the created ones by `filePathToUri`
norm = T.pack $ escapeURIString isUnescapedInURI $ unEscapeString $ T.unpack t
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

given this is just a value for indexing, maybe we should skip the T.pack, and store a string?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, that would increase the memory footprint quite a bit. I wold prefer to implement the normalisation logic directly on text.

--
-- If you care about performance then you should use a hash map. The keys
-- are cached in order to make hashing very fast.
data NormalizedUri = NormalizedUri Int !Text
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the hash intentionally non-strict? I guess it saves you a bit if you don’t care about it but if you do care about it, making it strict so it can be unboxed seems benefitiial.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You want to delay the hashing until it is actually used, which may never happen. So it makes sense to keep it non-strict

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The hash is intentionally non-strict for the reasons that Pepe suggests.

@pepeiborra
Copy link
Collaborator

Text has an efficient Hashable instance. Do you need both changes, or maybe just the switch to HashMap?

@mpickering
Copy link
Contributor Author

The hashing method still accounts for 2.8% of runtime if you don't cache it so I would prefer to leave it as it is for now.

Copy link
Contributor

@cocreature cocreature left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, thanks a lot!

@alanz alanz merged commit 65607bf into haskell:master Jan 28, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants