Updating from quick-xml v0.31.0 to v0.32.0, I was surprised to see my project's build suddenly take longer. While quick-xml v0.31.0 takes less than three seconds to compile with optimizations, v0.32.0 takes between 10 and 30 seconds depending on how the stars align. This turns it from "beneath notice" into "the one thing that holds up everything" -- especially because I use Cargo profile overrides to enable optimizations for quick-xml even for debug builds, so tests that have to deal with a lot of XML don't take forever. I did some digging and concluded that this is due to a single function (resolve_html5_entity) whose enormous control flow graph causes several LLVM passes to take several seconds each.
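For context, the profile override I mean looks roughly like this (a minimal sketch; the `opt-level` value is illustrative, not what my project necessarily uses):

```toml
# In Cargo.toml: compile quick-xml with optimizations even in debug builds,
# so XML-heavy tests stay fast. Everything else still builds unoptimized.
[profile.dev.package.quick-xml]
opt-level = 3
```

With this in place, quick-xml sits on the critical path of every debug build, which is why its compile time matters so much to me.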
This is not a new problem: the function has existed in the same form for a long time. I've only noticed it now because it used to be feature-gated until 10d1ff8 and I had never enabled the escape-html feature. But v0.31.0 with that feature enabled compiles just as slowly as v0.32.0 does by default. So the easy workaround (modulo semver concerns) would make that function feature-gated again. Another workaround would be slapping #[inline] on it so that it's not lowered to LLVM IR if downstream crates never call it, but that's silly for other reasons. I assume y'all had reasons to make this function available unconditionally, but as long as it has such a hefty build time impact, I'd appreciate a way to side-step it.
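To make the `#[inline]` workaround concrete, here is a minimal sketch (function name from the crate, but the body and arm count are illustrative, not quick-xml's actual code):

```rust
// Sketch of the #[inline] workaround: an #[inline] function's body is
// kept as MIR for cross-crate inlining and is only lowered to LLVM IR
// in crates that actually call it, so an unused copy costs no codegen
// time downstream. The match body here is a stand-in for the real one.
#[inline]
pub fn resolve_html5_entity(entity: &str) -> Option<&'static str> {
    match entity {
        "amp" => Some("&"), // the real match has on the order of 2000 arms
        "lt" => Some("<"),
        "gt" => Some(">"),
        _ => None,
    }
}

fn main() {
    assert_eq!(resolve_html5_entity("amp"), Some("&"));
    assert_eq!(resolve_html5_entity("zzz"), None);
    println!("ok");
}
```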
Of course, it would be best to address the root of the problem (the huge match statement), though that may be more involved. The standard solution in such cases is to convert it into a data structure of some sort, so the lookup doesn't require O(n) code for n key-value pairs. If the function is hot, a well-tuned perfect hash table will probably also improve performance. From skimming the code that gets generated right now, it seems to do a jump table on entity.len() followed by a lot of byte-by-byte comparisons/jump tables. While that probably works pretty well if you see the same couple of entities repeatedly, I wouldn't expect it to be competitive in more complex workloads.
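As a sketch of the data-structure approach (names and table contents are illustrative, not quick-xml's code), the match could become a lookup into a sorted static table, so each entry costs data rather than control flow; a perfect hash (e.g. via the phf crate) would be the tuned version of the same idea:

```rust
// A tiny excerpt of the HTML5 entity table, kept sorted by name so we
// can use binary search. The real table has roughly 2000 entries, but
// the code driving the lookup stays O(1) in table size.
static ENTITIES: &[(&str, &str)] = &[
    ("amp", "&"),
    ("gt", ">"),
    ("lt", "<"),
    ("quot", "\""),
];

fn resolve_entity(name: &str) -> Option<&'static str> {
    ENTITIES
        .binary_search_by_key(&name, |&(n, _)| n)
        .ok()
        .map(|i| ENTITIES[i].1)
}

fn main() {
    assert_eq!(resolve_entity("lt"), Some("<"));
    assert_eq!(resolve_entity("nope"), None);
    println!("ok");
}
```

This keeps the generated code small regardless of table size, which should fix the LLVM pass blow-up even before considering runtime performance.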
Hm. This is an interesting consequence of making a long function public; I'd never thought about that. It was made public just for the convenience of potential users, so we can hide it behind a feature again.