gstdec: Avoid leaking memory when reading audio data

We were reading audio data with the Gst.Buffer.extract_dup() method. This allocates new memory using g_malloc() and returns it to the caller. The memory needs to be freed with g_free(), but the PyGObject bindings do not do this.

We can avoid this problem by reading the audio data directly from the underlying Gst.Memory object. In this case the Python interpreter is responsible for copying the data, so it can correctly free the memory once it is no longer needed.

I tested this by calling pyacoustid.fingerprint() on 34 .MP3 files in sequence, and I saw the following difference:

- memory usage without the patch: 557052 KB
- memory usage with the patch: 52752 KB

The generated acoustid fingerprints were identical with and without the patch.
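The commit message does not say how the memory figures were collected. One possible way to take a comparable measurement is sketched below, assuming a Unix system where the standard `resource` module is available. The helper names `peak_rss` and `measure_peak` are hypothetical and are not part of audioread or pyacoustid.

```python
import resource


def peak_rss():
    """Return the peak resident set size of this process so far.

    Assumption: on Linux the value is in kilobytes; macOS reports bytes.
    """
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


def measure_peak(func, items):
    """Call func(item) for each item and return (peak_before, peak_after)."""
    before = peak_rss()
    for item in items:
        func(item)
    return before, peak_rss()
```

Passing the fingerprinting call as `func` and the list of file paths as `items`, once with and once without the patch, would give two peak values to compare. Peak RSS never decreases, so each run should happen in a fresh process.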
beetbox · Jan 12, 2019 · 527ed09
1 parent 92f73b3
Showing 1 changed file with 5 additions and 1 deletion.
diff --git a/audioread/gstdec.py b/audioread/gstdec.py
@@ -311,7 +311,11 @@ def _new_sample(self, sink):
             # New data is available from the pipeline! Dump it into our
             # queue (or possibly block if we're full).
             buf = sink.emit('pull-sample').get_buffer()
-            self.queue.put(buf.extract_dup(0, buf.get_size()))
+            mem = buf.get_all_memory()
+            success, info = mem.map(Gst.MapFlags.READ)
+            data = info.data
+            mem.unmap(info)
+            self.queue.put(data)
         return Gst.FlowReturn.OK
 
     def _unkown_type(self, uridecodebin, decodebin, caps):
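The patched code ignores the `success` flag returned by `mem.map()`. A slightly more defensive variant of the same map, copy, unmap sequence might look like the sketch below. It is not the code from this commit. The helper name `copy_buffer_bytes` is hypothetical, and the map flag is passed in as a parameter so that the function itself does not need to import GStreamer. It also assumes that `bytes(info.data)` yields an independent copy, which holds whether the binding exposes `data` as `bytes` or as a buffer-like object.

```python
def copy_buffer_bytes(buf, read_flag):
    """Copy the contents of a Gst.Buffer-like object into Python bytes.

    buf       -- object with get_all_memory(), as Gst.Buffer has
    read_flag -- the read map flag (Gst.MapFlags.READ in real use)
    """
    mem = buf.get_all_memory()
    success, info = mem.map(read_flag)
    if not success:
        # Mapping failed, so there is no valid data to read.
        raise RuntimeError("could not map buffer memory for reading")
    try:
        # Copy while the mapping is still valid; the copy is an ordinary
        # Python object that the interpreter frees on its own.
        return bytes(info.data)
    finally:
        # Always release the mapping, even if the copy raises.
        mem.unmap(info)
```

Inside `_new_sample` this would be called as `self.queue.put(copy_buffer_bytes(buf, Gst.MapFlags.READ))`. How a mapping failure should be reported to the pipeline is a design choice left open here.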