accuracy of function perf_get_mcycle64() #745
Comments
Hi Mingxuan,
In this case it seems that adding performance counters, which should slow
the calculation down, has actually sped it up! This is very interesting.
Could you please include the following information:
1. Which platform are you working on?
2. What exact command did you run to build the system used to make the
above measurements?
3. Please publish your git repo and send a link to the branch used to run
the above.
Thanks,
Alan
@limingxuan-pku, can I assume that you used CPU variant "perf" or "perf+cfu" in both cases? We have observed small code changes cause rather large runtime changes. There seems to be some sensitivity to code placement: when you add calls to the perf routines, they should get inlined, which moves the location of other code, adding or removing L1 I-cache collisions. You could look at the disassembly in [...]. Ideally we'd have counters recording I-cache and D-cache misses, but we don't have immediate plans for adding them.
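To make the code-placement point concrete, here is a minimal sketch of how two code addresses can collide in an instruction cache. It assumes a direct-mapped cache with a hypothetical 4 KiB size and 32-byte lines; those numbers, the addresses, and the function names are illustrative only and are not taken from the actual CPU configuration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical cache geometry, chosen only for illustration; the real
 * values depend on the CPU variant that was built. */
#define ICACHE_BYTES 4096u
#define LINE_BYTES   32u
#define NUM_SETS     (ICACHE_BYTES / LINE_BYTES)

/* Set that an instruction address maps to in a direct-mapped cache. */
static inline uint32_t icache_set(uint32_t addr) {
    return (addr / LINE_BYTES) % NUM_SETS;
}

/* Two addresses collide when they map to the same set but belong to
 * different cache lines, so fetching one evicts the other. */
static inline int icache_collide(uint32_t a, uint32_t b) {
    return icache_set(a) == icache_set(b) &&
           (a / LINE_BYTES) != (b / LINE_BYTES);
}

/* Example with made-up addresses: shifting the second routine by one
 * cache line (32 bytes) removes the collision. */
static inline void icache_example(void) {
    assert(icache_collide(0x40000100u, 0x40001100u));
    assert(!icache_collide(0x40000100u, 0x40001120u));
}
```

If a hot loop and a routine it calls happen to collide like this, each can keep evicting the other. Inlining a few extra instructions elsewhere can shift one of them to a different set, which would be consistent with a large change in cycle count in either direction.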
Hi! @alanvgreen @tcal-x
Hello! I have some questions about the function perf_get_mcycle64().
The total cycle count differs depending on whether or not I use perf_counters.
The following shows the result without perf_counters when running the KWS model.
However, if I add perf_enable_counter and perf_disable_counter in conv.h to use some perf_counters, the total cycles change. Strangely, the total drops a lot. I don't think this is measurement error, because the difference is so large.
I know the total cycles are counted using the function perf_get_mcycle64(), which I think should not be affected by whether perf_counters are used. Thanks in advance!
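For reference, a 64-bit cycle count on a 32-bit core is typically assembled from two 32-bit halves, and an elapsed time is the difference of two such reads. The sketch below shows that pattern under the assumption (not checked against the project source) that perf_get_mcycle64() works along these lines. The names read_lo, read_hi, read_cycles64, measure, and fake_work are hypothetical, and the counter is simulated so the code runs on a host machine.

```c
#include <stdint.h>

/* Simulated free-running counter standing in for the hardware cycle
 * registers (an assumption for illustration). It starts just below a
 * 32-bit rollover, and each low-half read advances it by one tick. */
static uint64_t sim_cycles = 0xFFFFFFFEull;

static inline uint32_t read_lo(void) {
    sim_cycles += 1;
    return (uint32_t)sim_cycles;
}

static inline uint32_t read_hi(void) {
    return (uint32_t)(sim_cycles >> 32);
}

/* Read high, low, high again; retry if the high half changed, which
 * means the low half rolled over between the reads. */
static inline uint64_t read_cycles64(void) {
    uint32_t hi, lo, hi2;
    do {
        hi  = read_hi();
        lo  = read_lo();
        hi2 = read_hi();
    } while (hi != hi2);
    return ((uint64_t)hi << 32) | lo;
}

/* Stand-in workload that "takes" 100 simulated cycles. */
static inline void fake_work(void) {
    sim_cycles += 100;
}

/* Elapsed cycles are the difference of two counter reads. */
static inline uint64_t measure(void (*work)(void)) {
    uint64_t start = read_cycles64();
    work();
    return read_cycles64() - start;
}
```

Under this assumption the counter read is independent of the perf counters, so a change in the total would reflect how the measured code runs in each build, not an inaccuracy in the read.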