
Profiling CLE, pyelftools, and pefile #231

Open
ltfish opened this issue Feb 15, 2020 · 7 comments


ltfish commented Feb 15, 2020

Loading binaries is taking longer and longer since recent updates in CLE, pyelftools, and pefile. Profiling them is the first step to make things faster.


rhelmot commented Apr 29, 2020

Here are some preliminary findings:

  • The hottest functions for ELF loading are clemory.load and pyelftools' struct._parse.
  • For the former, I was able to shave a few milliseconds off.
  • The latter is going to be very hard. pyelftools has a very intricate struct-parsing mechanism, and I can't imagine making any changes to it without bringing down a house of cards. The one change I could see improving things is somehow getting pyelftools to use clemory.unpack_word directly for its word unpacking when it is reading from a clemory as a stream, instead of reading the word out as bytes and then unpacking those. I have no idea what percentage of the struct parsing is done over a clemory vs. the binary stream, so take this with a big old grain of salt.
  • How much time is spent on the various aspects of ELF parsing will obviously vary from binary to binary, but on the one I was testing, relocation parsing was the most intensive. Note that this is just parsing the relocations, not performing them, which actually takes relatively little time by comparison. Because of this, another change I made was to disable relocation parsing when relocation performing is disabled. This removes our ability to introspect a binary's relocations without also applying them, but IMO that's an okay tradeoff considering it's a noticeable speed improvement for large binaries.
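To make the unpack_word point above concrete, here is a minimal sketch of the two code paths, using a made-up `FakeClemory` class (the real CLE `Clemory` API differs): pyelftools today reads a word out of the stream as a bytes object and then unpacks it, while a direct `unpack_word`-style call could decode straight out of the backing buffer and skip the intermediate allocation.

```python
import struct

# Illustrative stand-in for CLE's Clemory -- NOT the real API.
class FakeClemory:
    def __init__(self, data, endness="<", word_size=8):
        self._data = data
        self._fmt = endness + {4: "I", 8: "Q"}[word_size]

    def read(self, offset, n):
        # Path 1: what pyelftools effectively does today -- pull raw bytes
        # out of the stream, to be unpacked separately by the caller.
        return self._data[offset:offset + n]

    def unpack_word(self, offset):
        # Path 2: unpack directly from the backing buffer, avoiding the
        # intermediate bytes object entirely.
        return struct.unpack_from(self._fmt, self._data, offset)[0]

mem = FakeClemory(struct.pack("<QQ", 0xdeadbeef, 0xcafe))

# Both paths yield the same word; the second saves one allocation per read.
via_bytes = struct.unpack("<Q", mem.read(0, 8))[0]
via_direct = mem.unpack_word(0)
assert via_bytes == via_direct == 0xdeadbeef
```

The saving per word is tiny, but struct parsing is the hot path, so it multiplies across every field of every header and relocation entry.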

I profiled PE loading back in summer 2017 and found that the same thing applies to pefile as to pyelftools: the hot functions are all struct parsing, and that code is already highly optimized. The big difference between our use of pefile and pyelftools is that we use pefile much more as a monolith, whereas we use pyelftools as a parsing toolkit. It might be possible to remove some unnecessary parsing if we look more carefully into how to use pefile efficiently.


ltfish commented Apr 29, 2020

Are you using load_debug_info=True? If so, are you using the latest pyelftools master? I believe a recent PR added a cache for DIU, which sped up DWARF loading a lot for me.

I was thinking of monkeypatching the struct loading code in pyelftools in CLE using a C-backed implementation. What do you think?


rhelmot commented Apr 29, 2020

All of my tests were with load_debug_info=False. I think your idea could maybe work, but we would need to read the entire file into memory first, and I don't really know how we would keep track of that.


rhelmot commented Apr 29, 2020

Also, which level of abstraction were you thinking of monkeypatching pyelftools at? I can't seem to find a level in between "redo the whole gigantic mess" and "so small I don't think it would help anything".


ltfish commented Apr 29, 2020

I'm thinking of moving elftools/common/construct_utils.py into C.

github-actions commented

This issue has been marked as stale because it has no recent activity. Please comment or add the pinned tag to prevent this issue from being closed.


ltfish commented Nov 16, 2022

One of the timeout binaries that we definitely want to be able to load: asterisk.zip
