Severe memory leaking on Ruby 2.2 #785

Closed
assembler opened this issue Apr 21, 2015 · 9 comments
Labels
guidance Question that needs advice or information.

Comments

@assembler

I have upgraded to aws-sdk v2 and, at the same time, to Ruby 2.2. All of my processes suffer from memory bloat so severe that I have to restart them every hour to keep things running.

My application processes data at a large scale, so each process issues tens of AWS requests per second. I used the macOS "leaks" utility to diagnose what causes the memory growth. As the process runs, the utility finds tens of thousands of memory leaks. The interesting part is that 99% of them are related to aws-sdk. Here is sample output of: leaks PID | grep "Leak"

Leak: 0x7ff947406730  size=752  zone: DefaultMallocZone_0x101314000  "<ReceiveMessageResponse xmlns="http://queue.amazonaws.com/doc/2012-11-05/">
Leak: 0x7ff94740c730  size=48  zone: DefaultMallocZone_0x101314000  "{"Items":[],"Count":0,"ScannedCount":0}"
Leak: 0x7ff94740dd60  size=48  zone: DefaultMallocZone_0x101314000  "{"Items":[],"Count":0,"ScannedCount":0}"
Leak: 0x7ff9474101f0  size=752  zone: DefaultMallocZone_0x101314000  "{"Items":[{"f":{"SS":["/m/082cb"]},"g":{"S":"public"},"d":{"N":"1205180267"},"e":{"S":"113581533764688022263"},"q":{"S":"#C09068"}}],"Count":1,"ScannedCount":1}"
Leak: 0x7ff947411040  size=48  zone: DefaultMallocZone_0x101314000  "{"Items":[],"Count":0,"ScannedCount":0}"
Leak: 0x7ff9474158b0  size=752  zone: DefaultMallocZone_0x101314000  "<SendMessageResponse xmlns="http://queue.amazonaws.com/doc/2012-11-05/">
Leak: 0x7ff947416f10  size=384  zone: DefaultMallocZone_0x101314000  "<DeleteMessageResponse xmlns="http://queue.amazonaws.com/doc/2012-11-05/">

It feels like every AWS response is leaking memory, and I have thousands of them..

Using:

aws-sdk 2.0.39
ruby 2.2.1

Test script:

require "aws-sdk"

sqs = Aws::SQS::Client.new(endpoint: "http://lvh.me:9324")
queue_url = begin
  sqs.get_queue_url(queue_name: "test")[:queue_url]
rescue Aws::SQS::Errors::NonExistentQueue
  sqs.create_queue(queue_name: "test")[:queue_url]
end
queue_url = queue_url.gsub("localhost", "lvh.me") # avoiding 'invalid queue url error'

loop do
  100.times do
    sqs.send_message(queue_url: queue_url, message_body: "Hello")
  end

  loop do
    messages = sqs.receive_message(queue_url: queue_url)[:messages]
    break if messages.empty?

    messages.each do |message|
      sqs.delete_message(queue_url: queue_url, receipt_handle: message[:receipt_handle])
    end
  end

  GC.start
  memory = `ps -o rss,vsz -p #{Process.pid} | tail -n +2`.strip
  leaks = `leaks #{Process.pid} | grep -c Leak`.strip
  puts "Memory: #{memory}; Leaks: #{leaks}; Heap: #{GC.stat[:heap_live_slots]}"
end

Test output:

Memory: 38788  2487568; Leaks: 403; Heap: 82552
Memory: 40104  2487572; Leaks: 683; Heap: 82489
Memory: 40420  2487572; Leaks: 961; Heap: 82492
Memory: 40580  2487572; Leaks: 1246; Heap: 82492
Memory: 40716  2487572; Leaks: 1534; Heap: 82492
Memory: 41128  2488596; Leaks: 1825; Heap: 82492
Memory: 41868  2488596; Leaks: 2104; Heap: 82492
Memory: 42276  2488596; Leaks: 2395; Heap: 82492
Memory: 42368  2488596; Leaks: 2689; Heap: 82492
Memory: 42440  2488596; Leaks: 2980; Heap: 82492
Memory: 42548  2488596; Leaks: 3274; Heap: 82492
Memory: 42628  2488596; Leaks: 3561; Heap: 82492
Memory: 42660  2488596; Leaks: 3855; Heap: 82492
Memory: 42720  2488596; Leaks: 4147; Heap: 82492
Memory: 42736  2488596; Leaks: 4438; Heap: 82492
Memory: 42900  2488596; Leaks: 4732; Heap: 82492
Memory: 43004  2488596; Leaks: 5025; Heap: 82492
Memory: 43452  2490644; Leaks: 5313; Heap: 82492
Memory: 44648  2490644; Leaks: 5609; Heap: 82492
Memory: 44788  2490644; Leaks: 5904; Heap: 82492
Memory: 45432  2490644; Leaks: 6207; Heap: 82492
Memory: 45520  2490644; Leaks: 6499; Heap: 82492
Memory: 45680  2490644; Leaks: 6793; Heap: 82492
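One way to read these numbers is to compare the first and last report lines. The sketch below parses the report format printed by the test script and prints the change in each field; `parse_report` and `growth` are hypothetical helper names, not part of the script. If the leak count and RSS climb while the live-slot count stays flat, that would suggest the growth is happening outside the Ruby object heap, though this is only an interpretation of the output above.

```ruby
# Matches one line of the "Memory: ...; Leaks: ...; Heap: ..." report.
REPORT = /Memory:\s+(\d+)\s+(\d+); Leaks: (\d+); Heap: (\d+)/

# Turn one report line into a hash of integers.
def parse_report(line)
  m = REPORT.match(line) or raise ArgumentError, "unrecognized line: #{line}"
  { rss: m[1].to_i, vsz: m[2].to_i, leaks: m[3].to_i, heap_slots: m[4].to_i }
end

# Difference between the last and the first sample, per field.
def growth(lines)
  first = parse_report(lines.first)
  last = parse_report(lines.last)
  first.keys.map { |key| [key, last[key] - first[key]] }.to_h
end

# First and last lines of the test output above.
lines = [
  'Memory: 38788  2487568; Leaks: 403; Heap: 82552',
  'Memory: 45680  2490644; Leaks: 6793; Heap: 82492'
]

p growth(lines)
```
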
@trevorrowe
Member

Sorry for the slow response. I've been out of town at RailsConf. I took some time to reproduce this issue and to track the root cause. I'll try to update with more information tomorrow, but I have a good idea of the root cause for the leak and some possible work-arounds.

@trevorrowe
Member

It appears that StringIO, from the Ruby stdlib, is the culprit. The SDK expects the HTTP handler to write the HTTP response body to an IO object, and the default response target is a StringIO object. The following snippet, when run as a script, demonstrates the leak.

require 'stringio'

def report
  GC.start
  memory = `ps -o rss,vsz -p #{Process.pid} | tail -n +2`.strip
  leaks = `leaks #{Process.pid} | grep -c Leak`.strip
  puts "Memory: #{memory}; Leaks: #{leaks}; Heap: #{GC.stat[:heap_live_slots]}"
end

def leak(data)
  io = StringIO.new
  io.write(data)
end

def no_leak(data)
  StringIO.new(data)
end

data = '.' * 1024 * 1024 * 10 # 10MB data
20.times do
  leak(data)
  report
end

Removing the StringIO#write call and replacing it with, say, StringIO.new(data) makes the leak go away. I'm going to raise this with the Ruby core team. If I have more time tomorrow, I'll try to share a code snippet that can work around this issue for now.
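For anyone weighing the two code paths as substitutes, here is a small sketch of how they differ apart from the leak. It uses a 1KB payload instead of 10MB and only inspects the resulting objects; it does not measure memory. Both paths end up with the same content, but the read position and the ownership of the underlying string differ.

```ruby
require 'stringio'

data = '.' * 1024 # small payload; the snippet above uses 10MB

# Path 1: empty StringIO, then write (the path that leaks in the snippet above).
written = StringIO.new
written.write(data)

# Path 2: hand the string to the constructor.
wrapped = StringIO.new(data)

puts written.string == wrapped.string # content comparison
puts written.pos                      # position is left after the written bytes
puts wrapped.pos                      # position starts at the beginning
puts wrapped.string.equal?(data)      # constructor uses the given String as its buffer
puts written.string.equal?(data)      # write copies into a separate buffer
```
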

@assembler
Author

Wow, thanks for the update! Glad you have figured out where the leak is. I've tested this against ruby 2.1.5, and there are no leaks. On ruby 2.2.0 and 2.2.1 there is severe leaking. That explains why the memory went wild after upgrading from 2.1.5 (which suffered its own memory issues) to 2.2.0 (which suffered even more :)

I've changed this line:
https://github.com/aws/aws-sdk-ruby/blob/v2.0.39/aws-sdk-core/lib/seahorse/client/http/response.rb#L60

Into this:

@body = StringIO.new(body_contents + chunk)

And the leaking went away. I know this is a crude fix and that it is Ruby's responsibility to provide a proper one, but it will probably be months before Ruby is fixed. Do you think I can safely monkeypatch aws-sdk locally until then? Do you see any problems with the patch above? I searched the codebase for a place where a custom :body IO is passed to the response, but I don't think there is one.
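To make the trade-off of that one-line change concrete, here is a standalone sketch of the same idea. `RebuiltBody` and `append` are hypothetical names, and the class is only a stand-in, not the SDK's response class. Each chunk replaces the StringIO with a new one built from the old contents plus the chunk, so every append copies everything received so far. That is probably fine for small responses like the SQS ones above, but it would get expensive for large bodies.

```ruby
require 'stringio'

# Hypothetical stand-in for an object that accumulates a response body.
class RebuiltBody
  def initialize
    @body = StringIO.new
  end

  # Everything accumulated so far; leaves the read position at the start.
  def body_contents
    @body.rewind
    contents = @body.read
    @body.rewind
    contents
  end

  # Instead of @body.write(chunk), wrap old contents + chunk in a new StringIO.
  def append(chunk)
    @body = StringIO.new(body_contents + chunk)
    chunk.bytesize
  end
end

body = RebuiltBody.new
%w[<Send Message Response/>].each { |chunk| body.append(chunk) }
puts body.body_contents
```
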

Thank you so much for your help!

@trevorrowe
Member

The only places a custom body is passed into response are:

  • When you provide one as the response target. This can happen when you use Aws::S3::Client#get_object:

    s3 = Aws::S3::Client.new
    File.open('target', 'wb') do |file|
      s3.get_object(bucket:'name', key:'key', response_target:file)
    end
  • When you pass a block to the same method, the response target is replaced with a block yielder.
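As a rough illustration of the second case, a block yielder can be pictured as an IO-like object whose write method hands each chunk to a block instead of buffering it. The sketch below is an assumption about the general shape of such an object; `ChunkYielder` and `bytes_seen` are hypothetical names and this is not the SDK's own class.

```ruby
# Hypothetical IO-like target that forwards each written chunk to a block
# rather than storing it.
class ChunkYielder
  attr_reader :bytes_seen

  def initialize(&block)
    raise ArgumentError, 'block required' unless block
    @block = block
    @bytes_seen = 0
  end

  # Returns the number of bytes accepted, like IO#write does.
  def write(chunk)
    @block.call(chunk)
    @bytes_seen += chunk.bytesize
    chunk.bytesize
  end
end

received = []
target = ChunkYielder.new { |chunk| received << chunk }
['part-1,', 'part-2'].each { |chunk| target.write(chunk) }

puts received.join
puts target.bytes_seen
```

Because nothing is accumulated in a buffer, a target like this never goes through StringIO#write.
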

I was working on a drop-in replacement for StringIO. Here is what I have so far. It implements all of the public interfaces of StringIO required by the SDK. It needs additional testing, but should be functional:

class CustomIO

  def initialize(data = '')
    @data = data
    @offset = 0
  end

  def write(data)
    @data << data
    data.bytesize
  end

  def read(bytes = nil, output_buffer = nil)
    if bytes
      data = partial_read(bytes)
    else
      data = full_read
    end
    output_buffer ? output_buffer.replace(data || '') : data
  end

  def rewind
    @offset = 0
  end

  def truncate(bytes)
    @data = @data[0,bytes]
    bytes
  end

  private

  def partial_read(bytes)
    if @offset >= @data.bytesize
      nil
    else
      # Read at most `bytes` bytes starting at the current byte offset.
      data = @data.byteslice(@offset, bytes)
      bump_offset(bytes)
      data
    end
  end

  def full_read
    # Slice from the current byte offset to the end of the buffer.
    data = @offset == 0 ? @data : @data.byteslice(@offset..-1)
    @offset = @data.bytesize
    data
  end

  def bump_offset(bytes)
    @offset = [@data.bytesize, @offset + bytes].min
  end

end
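Since the replacement still needs testing, one possible approach is to run the same sequence of calls against ::StringIO and against the candidate class and compare what comes back. The sketch below does that for write, rewind, read and truncate; `io_trace` and `behaves_like_stringio?` are hypothetical helper names, and the sequence covers only a handful of cases rather than the full StringIO contract.

```ruby
require 'stringio'

# Run a fixed sequence of calls and record what each write/read returns.
def io_trace(io_class)
  io = io_class.new
  trace = []
  trace << io.write('hello ')
  trace << io.write('world')
  io.rewind
  trace << io.read(5) # partial read from the start
  trace << io.read(3) # partial read from the middle
  trace << io.read    # everything that is left
  trace << io.read(1) # partial read at end of data
  io.rewind
  io.truncate(5)
  trace << io.read    # contents after truncation
  trace
end

# True when the candidate produces the same trace as ::StringIO.
def behaves_like_stringio?(io_class)
  io_trace(io_class) == io_trace(StringIO)
end

puts behaves_like_stringio?(StringIO)
```

With the class above loaded, `behaves_like_stringio?(CustomIO)` would run the same calls against it.
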

@trevorrowe
Member

I was putting together a bug report for Ruby and found a related issue already opened and then resolved: https://bugs.ruby-lang.org/issues/10942

It was closed as fixed 7 days ago. I don't know when this will become available, but the fact the same issue was reported and then fixed is good news.

@assembler
Author

Awesome. So I can just put this in my code and it will all work without memory leaks:

module Seahorse
  class StringIO
    def initialize(data = '')
      @data = data
      @offset = 0
    end

    def write(data)
      @data << data
      data.bytesize
    end

    def read(bytes = nil, output_buffer = nil)
      if bytes
        data = partial_read(bytes)
      else
        data = full_read
      end
      output_buffer ? output_buffer.replace(data || '') : data
    end

    def rewind
      @offset = 0
    end

    def truncate(bytes)
      @data = @data[0,bytes]
      bytes
    end

    private

    def partial_read(bytes)
      if @offset >= @data.bytesize
        nil
      else
        # Read at most `bytes` bytes starting at the current byte offset.
        data = @data.byteslice(@offset, bytes)
        bump_offset(bytes)
        data
      end
    end

    def full_read
      # Slice from the current byte offset to the end of the buffer.
      data = @offset == 0 ? @data : @data.byteslice(@offset..-1)
      @offset = @data.bytesize
      data
    end

    def bump_offset(bytes)
      @offset = [@data.bytesize, @offset + bytes].min
    end
  end
end

Once the patch is applied to ruby, I can safely remove this. Thank you for your help!
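For readers wondering why defining a class under the Seahorse namespace can take effect at all: a bare `StringIO` reference written inside `module Seahorse ... end` is looked up in the enclosing modules before the top level. The sketch below shows that lookup rule with a hypothetical `Vendor` namespace standing in for the library. It assumes the library refers to the bare constant from inside a nested module definition, which is not verified here.

```ruby
require 'stringio'

module Vendor # hypothetical library namespace
  class Response
    def self.new_body
      StringIO.new # bare constant: enclosing modules are searched first
    end
  end
end

before = Vendor::Response.new_body.class

module Vendor
  # Defined later, for example by application code loaded after the library.
  class StringIO
    def initialize(data = String.new)
      @data = data
    end
  end
end

after = Vendor::Response.new_body.class

puts before # class used before the shadowing definition exists
puts after  # class used once Vendor::StringIO is defined
```

Code outside the namespace that asks for `::StringIO` is not affected by the shadowing class.
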

@assembler
Author

I've created a gem that can be included in a project and that works around the memory issues with aws-sdk until Ruby 2.2.3 is released: https://rubygems.org/gems/aws_memfix

@assembler
Author

While developing the gem, I tried a SimpleDelegator-based implementation of StringIO. It turned out that all it takes to avoid the memory leaks is this:

require "delegate"
module Seahorse
  class StringIO < SimpleDelegator
    def initialize(data = '')
      @io = ::StringIO.new(data)
      super(@io)
    end
  end
end

Have no clue why, but it works..
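This does not explain why the leak disappears, but a quick sketch of what the wrapper does and does not do may help anyone adopting it. `WrappedIO` is a hypothetical name for the same shape of class, defined outside any namespace. Calls are forwarded to the wrapped ::StringIO, while the wrapper itself is not a StringIO, so code that checks the class explicitly would not recognize it.

```ruby
require 'delegate'
require 'stringio'

# Same shape as the wrapper above, under a hypothetical name.
class WrappedIO < SimpleDelegator
  def initialize(data = String.new)
    super(::StringIO.new(data))
  end
end

io = WrappedIO.new
io.write('hello')
io.rewind

puts io.read                   # forwarded to the wrapped StringIO
puts io.__getobj__.class       # the object that receives the calls
puts io.respond_to?(:truncate) # forwarded methods are reported as supported
puts io.is_a?(::StringIO)      # class check against the wrapper itself
```
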

@fertobar

fertobar commented Apr 16, 2018

Hi guys! Do you know if the patch is still needed with SDK v3 and Ruby 2.2, or do I need to upgrade Ruby to avoid this issue?
I was using https://github.com/assembler/aws-sdk-ruby-memory-fix with SDK v2, but it is not compatible with SDK v3.

srchase added the guidance label (Question that needs advice or information.) on Dec 24, 2018