Fix memory leaks #180

Merged: 2 commits into rabbitmq:main on Mar 10, 2023

Conversation

@GXKe (Contributor) commented Mar 4, 2023

Fixes #179

@lukebakken self-assigned this on Mar 4, 2023
@lukebakken added this to the 1.7.1 milestone on Mar 4, 2023
@lukebakken requested a review from @Zerpet on Mar 4, 2023, 16:13
@lukebakken (Contributor) left a comment

Thanks!

@Zerpet (Contributor) left a comment

How does this change address this alleged memory leak?

The reference to queue[0] becomes invalid right after queue = queue[1:]. Since there are no more references to queue[0], the garbage collector will eventually collect and reclaim this piece of memory.

Zerpet added a commit that referenced this pull request Mar 7, 2023
Signed-off-by: Aitor Perez Cedres <[email protected]>
@lukebakken (Contributor)

Maybe this triggers GC faster? I haven't had time to reproduce the issue.

@lukebakken self-requested a review on Mar 7, 2023, 14:31
@Zerpet (Contributor) commented Mar 7, 2023

I wrote an integration test to verify this behaviour. After profiling with your patch, I don't see a difference in allocations.

https://github.com/rabbitmq/amqp091-go/blob/test-for-pr-180/integration_test.go#L2008

(pprof) top
Showing nodes accounting for 1.84GB, 88.98% of 2.07GB total
Dropped 23 nodes (cum <= 10.59MB)
Showing top 10 nodes out of 31
      flat  flat%   sum%        cum   cum%
  321.60MB 15.19% 15.19%   321.60MB 15.19%  github.com/rabbitmq/amqp091-go.newDelivery
  296.07MB 13.98% 29.17%   498.58MB 23.55%  github.com/rabbitmq/amqp091-go.(*reader).parseMethodFrame
  272.57MB 12.87% 42.05%   851.64MB 40.22%  github.com/rabbitmq/amqp091-go.(*Channel).PublishWithDeferredConfirmWithContext
  256.05MB 12.09% 54.14%   579.06MB 27.35%  github.com/rabbitmq/amqp091-go.(*Channel).sendOpen
  239.05MB 11.29% 65.43%   293.05MB 13.84%  github.com/rabbitmq/amqp091-go.(*reader).parseHeaderFrame
  218.51MB 10.32% 75.75%   218.51MB 10.32%  github.com/rabbitmq/amqp091-go.readShortstr
  121.01MB  5.72% 81.47%   121.01MB  5.72%  bytes.(*Buffer).grow
      58MB  2.74% 84.21%   154.01MB  7.27%  github.com/rabbitmq/amqp091-go.(*headerFrame).write
   56.50MB  2.67% 86.88%    56.50MB  2.67%  github.com/rabbitmq/amqp091-go.writeShortstr
   44.50MB  2.10% 88.98%   162.51MB  7.68%  github.com/rabbitmq/amqp091-go.(*methodFrame).write

(pprof) top recv
Active filters:
   focus=recv
Showing nodes accounting for 336.60MB, 15.90% of 2.07GB total
      flat  flat%   sum%        cum   cum%
  321.60MB 15.19% 15.19%   321.60MB 15.19%  github.com/rabbitmq/amqp091-go.newDelivery
      15MB  0.71% 15.90%   336.60MB 15.90%  github.com/rabbitmq/amqp091-go.(*Channel).recvContent
         0     0% 15.90%   321.60MB 15.19%  github.com/rabbitmq/amqp091-go.(*Channel).dispatch
         0     0% 15.90%   336.60MB 15.90%  github.com/rabbitmq/amqp091-go.(*Connection).demux
         0     0% 15.90%   336.60MB 15.90%  github.com/rabbitmq/amqp091-go.(*Connection).dispatchN
         0     0% 15.90%   336.60MB 15.90%  github.com/rabbitmq/amqp091-go.(*Connection).reader

(pprof) top buffer
Active filters:
   focus=buffer
Showing nodes accounting for 160.35MB, 7.57% of 2.07GB total
Showing top 10 nodes out of 14
      flat  flat%   sum%        cum   cum%
  121.01MB  5.72%  5.72%   121.01MB  5.72%  bytes.(*Buffer).grow
   39.34MB  1.86%  7.57%    39.34MB  1.86%  github.com/rabbitmq/amqp091-go.(*consumers).buffer

This is with the patch in this PR.

(pprof) top
Showing nodes accounting for 1.80GB, 88.49% of 2.04GB total
Dropped 29 nodes (cum <= 10.43MB)
Showing top 10 nodes out of 30
      flat  flat%   sum%        cum   cum%
  293.58MB 14.07% 14.07%   512.59MB 24.57%  github.com/rabbitmq/amqp091-go.(*reader).parseMethodFrame
  280.59MB 13.45% 27.52%   280.59MB 13.45%  github.com/rabbitmq/amqp091-go.newDelivery
  277.08MB 13.28% 40.81%   867.14MB 41.57%  github.com/rabbitmq/amqp091-go.(*Channel).PublishWithDeferredConfirmWithContext
  269.55MB 12.92% 53.73%   590.06MB 28.29%  github.com/rabbitmq/amqp091-go.(*Channel).sendOpen
  226.55MB 10.86% 64.59%   268.05MB 12.85%  github.com/rabbitmq/amqp091-go.(*reader).parseHeaderFrame
  218.51MB 10.47% 75.06%   218.51MB 10.47%  github.com/rabbitmq/amqp091-go.readShortstr
  116.01MB  5.56% 80.62%   116.01MB  5.56%  bytes.(*Buffer).grow
   60.50MB  2.90% 83.53%    60.50MB  2.90%  github.com/rabbitmq/amqp091-go.writeShortstr
   57.50MB  2.76% 86.28%   152.51MB  7.31%  github.com/rabbitmq/amqp091-go.(*headerFrame).write
      46MB  2.21% 88.49%   164.01MB  7.86%  github.com/rabbitmq/amqp091-go.(*methodFrame).write

(pprof) top recv
Active filters:
   focus=recv
Showing nodes accounting for 300.59MB, 14.41% of 2.04GB total
      flat  flat%   sum%        cum   cum%
  280.59MB 13.45% 13.45%   280.59MB 13.45%  github.com/rabbitmq/amqp091-go.newDelivery
      20MB  0.96% 14.41%   300.59MB 14.41%  github.com/rabbitmq/amqp091-go.(*Channel).recvContent
         0     0% 14.41%   280.59MB 13.45%  github.com/rabbitmq/amqp091-go.(*Channel).dispatch
         0     0% 14.41%   300.59MB 14.41%  github.com/rabbitmq/amqp091-go.(*Connection).demux
         0     0% 14.41%   300.59MB 14.41%  github.com/rabbitmq/amqp091-go.(*Connection).dispatchN
         0     0% 14.41%   300.59MB 14.41%  github.com/rabbitmq/amqp091-go.(*Connection).reader

(pprof) top buffer
Active filters:
   focus=buffer
Showing nodes accounting for 154.17MB, 7.39% of 2.04GB total
Showing top 10 nodes out of 14
      flat  flat%   sum%        cum   cum%
  116.01MB  5.56%  5.56%   116.01MB  5.56%  bytes.(*Buffer).grow
   38.16MB  1.83%  7.39%    38.16MB  1.83%  github.com/rabbitmq/amqp091-go.(*consumers).buffer
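
For readers who want to reproduce this kind of listing, here is a minimal sketch of how a heap profile can be captured from a Go test and then inspected with "go tool pprof -alloc_space mem.prof" followed by "top" or "top recv". The test name, package name, and the elided workload are placeholders, not the actual integration test linked above.

package amqp091_test

import (
	"os"
	"runtime"
	"runtime/pprof"
	"testing"
)

// TestHeapProfile is a hypothetical example: it runs a workload and then
// writes a heap profile to disk for offline inspection with go tool pprof.
func TestHeapProfile(t *testing.T) {
	// ... publish and consume the messages under test here ...

	runtime.GC() // fold recently freed objects into the profile before writing it

	f, err := os.Create("mem.prof")
	if err != nil {
		t.Fatal(err)
	}
	defer f.Close()

	if err := pprof.WriteHeapProfile(f); err != nil {
		t.Fatal(err)
	}
}

Running such a test and opening the resulting file with the -alloc_space option shows cumulative allocations since program start, which is the view used in the listings above.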

@lukebakken (Contributor)

Thanks for doing that @Zerpet

@GXKe (Contributor, Author) commented Mar 8, 2023

In my test, the memory was not reclaimed even 24 hours after the messages were consumed.
With this patch, it is reclaimed within 10 minutes.

@lukebakken (Contributor)

In my test, the memory was not reclaimed even 24 hours after the messages were consumed. With this patch, it is reclaimed within 10 minutes.

@GXKe - you'll have to provide more information. @Zerpet can't reproduce what you report.

Are you both using ARM architecture machines?

@GXKe (Contributor, Author) commented Mar 8, 2023

I'll run the test again at the weekend.
The messages have already been consumed, so it is abnormal for the process to hold on to so much memory.
The memory is not reclaimed on macOS (M1) or on Ubuntu.
I have verified the 10-minute reclamation on macOS (M1).

@lukebakken (Contributor)

The memory is not reclaimed on macOS (M1) or on Ubuntu. I have verified the 10-minute reclamation on macOS (M1).

This means that you ONLY see this issue using OS X?

@GXKe (Contributor, Author) commented Mar 8, 2023

[screenshot attachment]

And also on an Intel CPU.

@Zerpet (Contributor) commented Mar 10, 2023

The messages have already been consumed, so it is abnormal for the process to hold on to so much memory.

I'm not sure I follow. The in_use object space is a few kilobytes. The numbers you shared in the original issue are the amount of memory allocated since the beginning of the program's execution. Since the test consists of sending 1_000_000 messages, the constructor newDelivery and the program's readers/writers will allocate memory for all of those messages. That does not mean there's a memory leak.

The garbage collector will reclaim the memory of those objects.

amqp091-go/consumers.go, lines 77 to 79 (at 5656876):

	case out <- *queue[0]:
		queue = queue[1:]
	}

Since the reference queue is being replaced by queue[1:], and nothing else in this scope references queue[0], the GC will reclaim that memory. Of course, the memory has already been allocated (and accounted for by pprof) in the heap, but that does not mean it's leaking.

@lars-t-hansen commented Mar 10, 2023

Since the reference queue is being replaced by queue[1:], and nothing else in this scope references queue[0], the GC will reclaim that memory.

I don't think that follows. Given Go's slice semantics, and barring any information available to the compiler that proves that queue is the only pointer to the memory it references, the only meaning that queue = queue[1:] can have is basically queue += sizeof(queue element), i.e., it bumps a pointer. Looking at the generated code for a simple example (on ARM64 in this case) bears this out. So what we're left with is an array that we have a pointer into the middle of. When the GC traces this pointer, it too does not know whether the array has multiple referrers, and so its only sensible choice is to find the beginning of the array, and if the array is not already visited, mark every element in it, including the "dead" pointer.

(Depending on the program dynamics, an element may eventually be appended to the queue when the queue is at capacity, and in this case the live elements are copied into a new array and the old array is left to be GC'd eventually, along with the dead object. But that can take time.)
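
A minimal, self-contained illustration of this point (the sizes and names are illustrative only, not taken from amqp091-go):

package main

import (
	"fmt"
	"runtime"
)

func heapMiB() uint64 {
	runtime.GC()
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return m.HeapInuse >> 20
}

func main() {
	const n = 64
	queue := make([]*[1 << 20]byte, 0, n) // pointers to ~1 MiB blocks
	for i := 0; i < n; i++ {
		queue = append(queue, new([1 << 20]byte))
	}
	fmt.Println("after fill:", heapMiB(), "MiB in use")

	// Pop all but one element by reslicing only. The slice header just
	// advances past each element; the backing array still stores a pointer
	// to every "popped" block, so the GC cannot reclaim any of them.
	for len(queue) > 1 {
		queue = queue[1:]
	}
	fmt.Println("after reslice-only pops:", heapMiB(), "MiB in use") // still roughly n MiB

	runtime.KeepAlive(queue)
}

On a typical run the second figure stays close to the first, which is exactly the effect described above: the dead pointers left behind in the backing array keep every popped block reachable.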

@lukebakken (Contributor)

@lars-t-hansen that's interesting.

@Zerpet there is no harm in adding the statement. I will add a comment to explain it more.
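
For context, a sketch of the pattern the change relies on: clear the vacated slot before reslicing so the backing array no longer references the popped element. This is a simplified stand-in, not the literal diff in this PR.

package main

import "fmt"

type delivery struct{ body []byte }

// pop removes the head of a pointer queue. Setting the vacated slot to nil
// drops the backing array's reference to the popped element, so the GC can
// reclaim it even while the array itself stays alive under the slice.
func pop(queue []*delivery) (*delivery, []*delivery) {
	head := queue[0]
	queue[0] = nil // clear the slot before advancing the slice
	return head, queue[1:]
}

func main() {
	q := []*delivery{{body: []byte("a")}, {body: []byte("b")}}
	var d *delivery
	d, q = pop(q)
	fmt.Printf("popped %q, %d element(s) left\n", d.body, len(q))
}

Clearing the slot is cheap and makes the popped element collectable as soon as the consumer is done with it, regardless of how long the backing array survives.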

@lukebakken merged commit 3acf42c into rabbitmq:main on Mar 10, 2023
@lukebakken (Contributor)

Thanks everyone.

@Zerpet (Contributor) commented Mar 15, 2023

@lars-t-hansen I checked this with a simple program and you are right. Thank you for taking the time to explain this ❤️

@GXKe thank you for your contribution!
