Skip to content

Conversation

@rschu1ze
Copy link
Contributor

@rschu1ze rschu1ze commented Jul 7, 2024

We (ClickHouse) use azure-sdk-for-cpp as a submodule to connect to Azure Blob storage.

Our comprehensive suite of tests is run under thread/undefined/memory/address sanitizers to detect issues before they hit production. One particular issue detected was a memory leak found and fixed in January 2023 which this PR upstreams.

Notes:

The problem was first addressed in azure 1.6 by these PRs: ClickHouse#4, ClickHouse#5, ClickHouse#6.

Every time azure-sdk-for-cpp was upgraded, the fixes have been picked into the current release. ClickHouse currently pulls in azure-sdk-for-cpp 1.12, also with the fix. This is to say that the fix has been part of our production code for a long while.

There was a previous attempt to upstream the fix (#4647) but 1. that other PR was sort of different from this fix and 2. the contributor unfortunately disappeared and the PR was abandoned.

I cannot determine which exact memory leak the fix originally addressed. The original fixes in ClickHouse's fork of azure-sdk-for-cpp only state that they fix a memory leak but they don't link to a defect report (sorry). I could spend some time doing code and Git archeology but it is probably not worth the trouble. The idea of this PR is to manage memory using RAII - that is generally a good idea as makes the code safer and less prone to errors. I could not find any obvious places where non-use of RAII would lead to errors.

The PR also makes setup/teardown functions XmlGlobalInitialize and XmlGlobalDeinitialize public. This is required for another fix that frees memory in TLS allocated by libxml when azure/libxml is used by multiple threads: https://github.com/ClickHouse/ClickHouse/blob/907a3659262e9029fd2c257913c62fadfc3bc433/programs/server/Server.cpp#L929 (see ClickHouse/ClickHouse#45796).

Two more notes:

  • For some weird reason, the XML wrapper code exists twice:

    • sdk/tables/azure-data-tables/inc/azure/data/tables/internal/xml_wrapper.hpp / sdk/tables/azure-data-tables/inc/azure/data/tables/internal/xml_wrapper.hpp
    • sdk/storage/azure-storage-common/inc/azure/storage/common/internal/xml_wrapper.hpp / sdk/storage/azure-storage-common/src/xml_wrapper.cpp

    This PR only addresses the second file(s). If you think we should rather apply the same fixes in both sets of files, please push another commit into this PR. My personal preference is to kill one of the copies (the other one, ideally) - feel free to do so.

  • I am a Linux user and did not check or compile the code in the AZ_PLATFORM_WINDOWS ifdef. Perhaps the Windows code is good but kindly double-check.

@github-actions github-actions bot added Community Contribution Community members are working on the issue customer-reported Issues that are reported by GitHub users external to the Azure organization. Storage Storage Service (Queues, Blobs, Files) labels Jul 7, 2024
@github-actions
Copy link

github-actions bot commented Jul 7, 2024

Thank you for your contribution @rschu1ze! We will review the pull request and get back to you soon.

@Jinming-Hu
Copy link
Member

@rschu1ze I'd like to confirm, you didn't observe any memory leak in the original implementation. This PR is just to adopt a good practice of memory management.

@rschu1ze
Copy link
Contributor Author

Sorry for the delay, I was on a business trip.

@rschu1ze I'd like to confirm, you didn't observe any memory leak in the original implementation. This PR is just to adopt a good practice of memory management.

I touched on this a bit in #5767 (comment). In retrospective, I can't tell which exact memory leak was addressed, but the title of ClickHouse#4 says "Fix memory leak", so I assume there was some memory leak existed somewhere. In any case, I would argue that using RAII for resource management is a good idea makes generally sense.

@rschu1ze
Copy link
Contributor Author

Thanks for approving. I am good but as I mentioned in the commit message, the XML wrapper class exists as copypasta in another place. We ideally get rid of the other copy or apply the fix in the other location.

@Jinming-Hu
Copy link
Member

@rschu1ze can you fix the CI failures? There are some format issues.

cc @gearama for the copypasta in table package.

@rschu1ze

This comment was marked as resolved.

@Jinming-Hu
Copy link
Member

Hi @rschu1ze , please apply below patch to fix build on Windows

diff --git a/sdk/storage/azure-storage-common/src/xml_wrapper.cpp b/sdk/storage/azure-storage-common/src/xml_wrapper.cpp
index 40f5ce10..ede70454 100644
--- a/sdk/storage/azure-storage-common/src/xml_wrapper.cpp
+++ b/sdk/storage/azure-storage-common/src/xml_wrapper.cpp
@@ -31,7 +31,7 @@ namespace Azure { namespace Storage { namespace _internal {
   void XmlGlobalInitialize() {}
   void XmlGlobalDeinitialize() {}
 
-  struct XmlReaderContext
+  struct XmlReader::XmlReaderContext
   {
     XmlReaderContext()
     {
@@ -99,20 +99,14 @@ namespace Azure { namespace Storage { namespace _internal {
       throw std::runtime_error("Unsupported xml encoding.");
     }
 
-    m_context = context.release();
+    m_context = std::move(context);
   }
 
-  XmlReader::~XmlReader()
-  {
-    if (m_context)
-    {
-      delete static_cast<XmlReaderContext*>(m_context);
-    }
-  }
+  XmlReader::~XmlReader() {}
 
   XmlNode XmlReader::Read()
   {
-    auto context = static_cast<XmlReaderContext*>(m_context);
+    auto& context = m_context;
 
     auto moveToNext = [&]() {
       HRESULT ret = WsReadNode(context->reader, context->error);
@@ -222,7 +216,7 @@ namespace Azure { namespace Storage { namespace _internal {
     }
   }
 
-  struct XmlWriterContext
+  struct XmlWriter::XmlWriterContext
   {
     XmlWriterContext()
     {
@@ -276,20 +270,14 @@ namespace Azure { namespace Storage { namespace _internal {
       throw std::runtime_error("Failed to initialize xml writer.");
     }
 
-    m_context = context.release();
+    m_context = std::move(context);
   }
 
-  XmlWriter::~XmlWriter()
-  {
-    if (m_context)
-    {
-      delete static_cast<XmlWriterContext*>(m_context);
-    }
-  }
+  XmlWriter::~XmlWriter() {}
 
   void XmlWriter::Write(XmlNode node)
   {
-    auto context = static_cast<XmlWriterContext*>(m_context);
+    auto& context = m_context;
     if (node.Type == XmlNodeType::StartTag)
     {
       if (node.HasValue)
@@ -363,7 +351,7 @@ namespace Azure { namespace Storage { namespace _internal {
 
   std::string XmlWriter::GetDocument()
   {
-    auto context = static_cast<XmlWriterContext*>(m_context);
+    auto& context = m_context;
 
     BOOL boolValueTrue = TRUE;
     WS_XML_WRITER_PROPERTY writerProperty[2];

@rschu1ze

This comment was marked as resolved.

@Jinming-Hu
Copy link
Member

@rschu1ze This PR now looks good to me. We can merge it after fixing the clang format issue.

@rschu1ze
Copy link
Contributor Author

@rschu1ze This PR now looks good to me. We can merge it after fixing the clang format issue.

Done, thanks.

@rschu1ze
Copy link
Contributor Author

(Note to myself: bump to v1.12)

@Jinming-Hu Jinming-Hu merged commit 6960ad1 into Azure:main Jul 30, 2024
rschu1ze pushed a commit to ClickHouse/azure-sdk-for-cpp that referenced this pull request Jul 30, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Community Contribution Community members are working on the issue customer-reported Issues that are reported by GitHub users external to the Azure organization. Storage Storage Service (Queues, Blobs, Files)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants