-
Notifications
You must be signed in to change notification settings - Fork 124
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Change MAX_MEMORY_BANDWIDTH device query to uint64 #2653
base: main
Are you sure you want to change the base?
Change MAX_MEMORY_BANDWIDTH device query to uint64 #2653
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
f3775e3
to
3c22a26
Compare
3c22a26
to
8922e51
Compare
7a156bd
to
35c8baa
Compare
|
||
uint32_t MemoryBandwidth = MemoryClockKHz * MemoryBusWidth * 250; | ||
uint64_t MemoryBandwidth = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is bits per second, don't we need to multiply by 1000 to account for the kHz aspect?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also where does the MemoryBandwidthConstant
come from? I know it was already there, but have providence comments for magic numbers would be useful.
Also, since it's putatively a constant: const
or constexpr
, please
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the original equation is meant to use it as KHz to get it as b/s. I added a comment explaining how to get this constant from in the nvidia doc page. Do you think that should be enough or should I write the computation constants also in a comment?
35c8baa
to
807196a
Compare
807196a
to
804e82d
Compare
The max value that could be assigned to
MemoryBandwidth
variable is (250 * 3200000 * 256) which will overflow the uint32 max capacity. so we should change that to be uint64 to fit the maximum value from this computation.