Skip to content

metric for heap fragmentation #14

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
d-a-v opened this issue Aug 29, 2018 · 12 comments
Closed

metric for heap fragmentation #14

d-a-v opened this issue Aug 29, 2018 · 12 comments

Comments

@d-a-v
Copy link

d-a-v commented Aug 29, 2018

You may find some interest in esp8266/Arduino#5090 .

This could be useful to test the efficiency of realloc().

@devyte
Copy link

devyte commented Sep 13, 2019

@rhempel have you looked at this? It has been in our umm code base for a while and is very useful. With little heap and libs like bearssl which require large allocs, it is rather common to hit out of memory due to fragmentation, and this metric, together with the largest free block, serve to detect the case.

@rhempel
Copy link
Owner

rhempel commented Dec 27, 2019

I have now looked at this and I am curious - I am not sure that I see this in the current esp8266 code repo?

@devyte
Copy link

devyte commented Dec 27, 2019

@rhempel I hope you had a very Merry Christmas!

Our umm code is here:
https://github.com/esp8266/Arduino/tree/master/cores/esp8266/umm_malloc

The fragmentation metric is here:
https://github.com/esp8266/Arduino/blob/698ffc34980191ab422918e3aa22ea90b6596766/cores/esp8266/Esp-frag.cpp#L26

The fragmentation metric together with max contiguous block and total free heap give a pretty good measure of the heap health.

@rhempel
Copy link
Owner

rhempel commented Dec 27, 2019

Thanks for the link to the metric - that was the missing piece. If we do integrate this would it be OK if I moved that code (with credit) into a function within umm_malloc()?

I am a bit wary of the 16 bit limit on heap size - there are some pretty cool Cortex-M devices where a 128k or even 256k heap is practical 😀

@d-a-v
Copy link
Author

d-a-v commented Dec 29, 2019

No problem if you integrate that code !
I am currently thinking that it would be a great addition to keep sumOfFree/max/sumOfSquares up-to-date with little cost at every malloc/realloc/free calls.
It would avoid to walk through every blocks everytime when calling metric functions.

@rhempel
Copy link
Owner

rhempel commented Jan 5, 2020

I am integrating the metric this weekend but I am having trouble with this comment:

@rhempel
Copy link
Owner

rhempel commented Jan 5, 2020

// L2 / Euclidian norm of free block sizes.
// Having getFreeHeap()=sum(hole-size), fragmentation is given by
// 100 * (1 - sqrt(sum(hole-size²)) / sum(hole-size))

fh = ummHeapInfo.freeBlocks * block_size;

Am I missing something? Is sum(hole-size) the correct comment, or should it be sum(blocks*block_size).

@rhempel
Copy link
Owner

rhempel commented Jan 5, 2020

A possible speed improvement is to simplify the summing of squares. We can just sum the squares of the number of free blocks in a free section, then the Euclidean norm simplifies to:

Block size * sqrt( sum of squares of blocks)

Saves two multiplications per free block plus all the math fits into uint32_t

@d-a-v
Copy link
Author

d-a-v commented Jan 6, 2020

block_size is a constant so it can indeed be taken out from the sum calculation.
Fragmentation can be calculated in terms of
f=sqrt(sum((number-of-adjacent-free-blocks)²))/sum(number-of-adjacent-free-blocks)
(with fragmentation-metric M = 100*(1-f))

Whatever the size of the block is, we'd have

|F|F|F|F|F|F|F|F|F|F|F|F|F|F|F|F| f=sqrt(16²)/16=1
|F|F|F|F|F|F|F|F|F|F|F|F|F|F|F|A| f=sqrt(15²)/15=1
|F|F|F|F|F|F|F|F|A|A|A|A|A|A|A|A| f=sqrt(8²)/8=1
|F|A|A|A|A|A|A|A|A|A|A|A|A|A|A|A| f=sqrt(1²)/1=1
|F|F|F|F|F|F|F|A|F|F|F|F|F|F|F|A| f=sqrt(7²+7²)/(7+7)=0.71
|F|F|F|A|F|F|F|A|F|F|F|A|F|F|F|A| f=sqrt(3²+3²+3²+3²)/(3+3+3+3)=0.5
|F|A|F|A|F|A|F|A|F|A|F|A|F|A|F|A| f=sqrt(1²*8)/(1*8)=0.35 (->0 with large number of total block)

We (at esp8266/arduino) should remove the block size from the equation.

Also what do you think of maintaining these values number-of-adjacent-free-blocks and (number-of-adjacent-free-blocks)² live ?
As an example, when

       v
|F|F|F|A|F|F|F|A|F|F|F|A|F|F|F|A| f=sqrt(36)/12=0.5

changes to

|F|F|F|F|F|F|F|A|F|F|F|A|F|F|F|A| f=sqrt(67)/13=0.62
       ^

We'd have -(3²+3²)+(7²) to add to (number-of-adjacent-free-blocks)²
and -(3+3)+(7) to add to (number-of-adjacent-free-blocks).
Then if (36,12) is the current state, the new state would be
(36-(9+9)+49, 12-(3+3)+(7)) = (67, 13) = the new state, at the cost of three squares calculation.
We do know about 3 and 3 and 7 in the malloc() or the free() call don't we ?

@rhempel
Copy link
Owner

rhempel commented Jan 8, 2020

Created PR UMM-14

@d-a-v
Copy link
Author

d-a-v commented Jan 9, 2020

Thanks for #22 !

Do you think it is worth having a look to what's proposed above as an optimization ?
In other words, a program that will monitor the state of its memory using all metrics including fragmentation will surely do that often, and it has a cost.
Updating these metrics live, by calculating three squares at every malloc/free may reduce the overall cost. malloc is expensive anyway, so user will (generally) try to minimize the frequency of these calls.

That would allow to not call the-also-expensive umm_info() and always have a fresh value of Free/Max/Frag metrics.

@rhempel
Copy link
Owner

rhempel commented Jan 9, 2020

That’s the plan - but my normal mode of operation is to implement the easy part along with test cases and then do the continuous calculation and use the same test case 😏

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants