api: add memory desc serialization api

oneapi-src · Mar 27, 2024 · commit 9b848c8
1 parent ebe77b5

Showing 5 changed files with 149 additions and 20 deletions.
diff --git a/doc/advanced/persistent_cache.md b/doc/advanced/persistent_cache.md
@@ -1,28 +1,29 @@
 Persistent Cache{#dev_guide_persistent_cache}
 ===========================================================
 
-Creating some oneDNN abstractions can be costly for various reasons.
+Creating oneDNN abstractions can be costly for various reasons.
 Usually, oneDNN mitigates that overhead by caching such objects but
-the cache has no effect when the objects are created for the first time.
-For some applications it can be critical to reduce that overhead.
+the cache has no effect when the objects are created for the first time.
+For some applications, reducing that overhead is critical.
 
 oneDNN provides an API that can be used to create a persistent cache for
-such oneDNN abstractions. User can use that API to obtain a cache blob ID
-and a cache blob to use them as a key and value respectively.
+oneDNN abstractions. Through the API, users can obtain a cache blob ID
+and a cache blob to use as a key and value respectively.
 
 @note
 Content and size of the cache blob ID and cache blob objects are not specified.
 
 @note
-oneDNN version and git commit hash (@ref dnnl_version_t::hash) affect equality
-of the cache blob IDs. That is, the queried cache blob ID will be different
+The oneDNN version and git commit hash (@ref dnnl_version_t::hash) affect the equality
+of the cache blob IDs. That is, the queried cache blob ID will differ
 for different oneDNN versions and git commit hashes.
 
 @warning
-The git commit hash may not be available if the git package was not found during
+The git commit hash may not be available if the git package is not found during
 a CMake call. In this case, the cache blob ID will be the same for different
 hashes. This may result in fetching a wrong cache blob from persistent cache.
 
+
 ## Primitive
 
 * The cache blob ID can be obtained via @ref dnnl::engine dnnl::primitive_desc_base::get_cache_blob_id
@@ -32,8 +33,8 @@ with the primitive descriptor.
 
 
 ### Relation to Primitive Cache
-In the case when a primitive is created from a cache blob and the identical
-primitive is present in the primitive cache the one from primitive cache will
+When a primitive is created from a cache blob and the identical
+primitive is present in the primitive cache, the one from primitive cache will
 be returned to the user, and the given cache blob will not be used. Otherwise,
 the cache blob will be used to speed up the primitive creation. The information
 about how the primitive was created (`cache_miss`, `cache_hit` or
@@ -69,7 +70,7 @@ using namespace dnnl;
 
 * The cache blob ID can be obtained via @ref dnnl::ocl_interop::get_engine_cache_blob_id
 * The cache blob can be obtained via @ref dnnl::ocl_interop::get_engine_cache_blob
-* Engine can be created with the cache blob via @ref dnnl::ocl_interop::make_engine(cl_device_id, cl_context, const std::vector<uint8_t> &)
+* The engine can be created with the cache blob via @ref dnnl::ocl_interop::make_engine(cl_device_id, cl_context, const std::vector<uint8_t> &)
 
 ### API Usage Example
 
@@ -100,12 +101,28 @@ using namespace dnnl;
 }
 ~~~
 
+## Memory descriptor
+
+When serializing primitives, a binary blob can be obtained from a
+memory descriptor using @ref dnnl::memory::desc::get_blob. Any binary
+blob obtained from @ref dnnl::memory::desc::get_blob can be used to
+create a memory descriptor @ref dnnl::memory::desc.
+
+@note
+When deserializing a constant tensor, the user must verify that the deserialized memory descriptor matches the memory
+descriptor expected by the primitive that will use that memory. The
+only circumstance where both are guaranteed to match is when
+serialization/deserialization happens on the same system and in the same
+environment.
+
+
+
 ## Limitations
 
-* The API is implemented for the OpenCL runtime only. For CPU engine kind and
-other runtimes the library will return #dnnl_unimplemented in the case of the C
-API or throw a corresponding @ref dnnl::error exception in the case of the C++
-API.
-* Currently, the library cannot differentiate cache blob created for devices
-that have different stepping therefore the cache blob can be safely used only
-on the system where it was created.
+* The primitive and engine APIs are implemented for the OpenCL runtime
+only. For the CPU engine and other runtimes, the library will return
+#dnnl_unimplemented (in the case of the C API) or throw a corresponding
+@ref dnnl::error exception (in the case of the C++ API).
+* Currently, the library cannot differentiate cache blobs created for devices
+that have different stepping; therefore, the cache blob can be safely used only
+on the system where it is created.
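The key/value usage that the guide describes (cache blob ID as key, cache blob as value) can be sketched without oneDNN. This is a minimal sketch, assuming a hypothetical in-memory `blob_store_t`; a real persistent cache would write to disk or a database. The byte vectors stand in for whatever the blob ID and blob queries return, and `blob_store_t`, `put`, and `find` are illustrative names, not oneDNN API.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

using bytes_t = std::vector<uint8_t>;

// Hypothetical store: maps a cache blob ID (key) to a cache blob (value).
// Both are treated as opaque bytes, since their content is unspecified.
class blob_store_t {
public:
    void put(const bytes_t &id, const bytes_t &blob) {
        assert(!id.empty()); // an empty ID cannot identify anything
        store_[id] = blob;
    }

    // Copies the blob stored under `id` into `out` and returns true,
    // or returns false on a miss and leaves `out` untouched.
    bool find(const bytes_t &id, bytes_t &out) const {
        auto it = store_.find(id);
        if (it == store_.end()) return false;
        out = it->second;
        return true;
    }

private:
    // std::vector compares lexicographically, so it can serve as a map key.
    std::map<bytes_t, bytes_t> store_;
};
```

On a miss, the caller would create the object the slow way and then `put` the freshly queried blob under its ID.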
diff --git a/include/oneapi/dnnl/dnnl.h b/include/oneapi/dnnl/dnnl.h
@@ -780,6 +780,27 @@ dnnl_status_t DNNL_API dnnl_memory_desc_destroy(dnnl_memory_desc_t memory_desc);
 dnnl_status_t DNNL_API dnnl_memory_desc_clone(dnnl_memory_desc_t *memory_desc,
         const_dnnl_memory_desc_t existing_memory_desc);
 
+/// Retrieves a binary blob associated with the given memory descriptor.
+///
+/// @param blob Output pointer to the binary blob.
+///     If not nullptr, size bytes of the memory descriptor blob are written.
+/// @param size Pointer to the size of the binary blob in bytes.
+///     The size is written if blob is nullptr and read otherwise.
+/// @param memory_desc Input memory descriptor to serialize.
+/// @returns #dnnl_success on success and a status describing the error
+///     otherwise.
+dnnl_status_t DNNL_API dnnl_memory_desc_get_blob(
+        uint8_t *blob, size_t *size, const_dnnl_memory_desc_t memory_desc);
+
+/// Creates a memory descriptor from a memory descriptor binary blob.
+///
+/// @param memory_desc Output pointer to a newly allocated memory descriptor.
+/// @param blob Pointer to a memory descriptor binary blob.
+/// @returns #dnnl_success on success and a status describing the error
+///     otherwise.
+dnnl_status_t DNNL_API dnnl_memory_desc_create_with_blob(
+        dnnl_memory_desc_t *memory_desc, const uint8_t *blob);
+
 /// Creates a memory descriptor using dimensions and strides.
 ///
 /// @note
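The `blob`/`size` pair above follows the common two-call pattern: call once with a null buffer to learn the size, allocate, then call again to fill the buffer. The following is a minimal sketch of that pattern, assuming stand-in types: `toy_desc_t`, `toy_status_t`, `toy_desc_get_blob`, and `serialize` are hypothetical and only mirror the shape of the declared function. The upper-bound check on `*size` is an addition of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Stand-in for a trivially copyable descriptor; NOT the oneDNN type.
struct toy_desc_t {
    int32_t ndims;
    int32_t data_type;
    int64_t dims[4];
};

enum toy_status_t { toy_success = 0, toy_invalid_arguments = 1 };

// With blob == nullptr the required size is written to *size;
// otherwise *size bytes of the descriptor are copied into blob.
toy_status_t toy_desc_get_blob(
        uint8_t *blob, size_t *size, const toy_desc_t *d) {
    if (d == nullptr || size == nullptr) return toy_invalid_arguments;
    if (blob == nullptr) {
        *size = sizeof(toy_desc_t);
        return toy_success;
    }
    // Refuse to read past the end of the descriptor.
    if (*size > sizeof(toy_desc_t)) return toy_invalid_arguments;
    std::memcpy(blob, d, *size);
    return toy_success;
}

// Caller side: query the size, allocate, then fill.
std::vector<uint8_t> serialize(const toy_desc_t &d) {
    size_t size = 0;
    toy_status_t st = toy_desc_get_blob(nullptr, &size, &d);
    assert(st == toy_success);
    std::vector<uint8_t> blob(size);
    st = toy_desc_get_blob(blob.data(), &size, &d);
    assert(st == toy_success);
    (void)st; // silence unused warnings when asserts are compiled out
    return blob;
}
```

Production code would check each returned status rather than assert on it, because asserts disappear in release builds.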

diff --git a/include/oneapi/dnnl/dnnl.hpp b/include/oneapi/dnnl/dnnl.hpp
@@ -2833,6 +2833,17 @@ struct memory : public handle<dnnl_memory_t> {
         /// @param md The C API memory descriptor.
         desc(dnnl_memory_desc_t md) : handle<dnnl_memory_desc_t>(md) {}
 
+        /// Constructs a memory descriptor from a binary blob.
+        ///
+        /// @param blob A binary blob previously queried from a memory descriptor.
+        desc(const std::vector<uint8_t> &blob) {
+            dnnl_memory_desc_t md = nullptr;
+            error::wrap_c_api(
+                    dnnl_memory_desc_create_with_blob(&md, blob.data()),
+                    "could not create a memory descriptor from blob");
+            reset(md);
+        }
+
         /// Constructs a memory descriptor for a region inside an area
         /// described by this memory descriptor.
         //
@@ -3121,6 +3132,21 @@ struct memory : public handle<dnnl_memory_t> {
         size_t get_size() const { return dnnl_memory_desc_get_size(get()); }
 #endif
 
+        /// Returns a binary blob associated with the memory descriptor.
+        /// @returns The memory descriptor blob.
+        std::vector<uint8_t> get_blob() {
+            size_t size;
+            dnnl_status_t status
+                    = dnnl_memory_desc_get_blob(nullptr, &size, get());
+            error::wrap_c_api(
+                    status, "could not get memory descriptor blob size");
+
+            std::vector<uint8_t> out_blob(size);
+            status = dnnl_memory_desc_get_blob(out_blob.data(), &size, get());
+            error::wrap_c_api(status, "could not get memory descriptor blob");
+            return out_blob;
+        }
+
         /// Checks whether the memory descriptor is zero (empty).
         /// @returns @c true if the memory descriptor describes an empty
         ///     memory and @c false otherwise.

diff --git a/src/common/memory_desc.cpp b/src/common/memory_desc.cpp
@@ -1,5 +1,5 @@
 /*******************************************************************************
-* Copyright 2022-2023 Intel Corporation
+* Copyright 2022-2024 Intel Corporation
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
@@ -734,6 +734,27 @@ status_t dnnl_memory_desc_clone(memory_desc_t **memory_desc,
     return success;
 }
 
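+status_t dnnl_memory_desc_get_blob(
+        uint8_t *blob, size_t *size, const memory_desc_t *md) {
+    // `size` is read when `blob` is set and written otherwise, so it is
+    // always required.
+    if (md == nullptr || size == nullptr) return invalid_arguments;
+    if (blob != nullptr) {
+        // Never read past the end of the descriptor.
+        if (*size > sizeof(memory_desc_t)) return invalid_arguments;
+        memcpy(blob, md, *size);
+    } else {
+        *size = sizeof(memory_desc_t);
+    }
+
+    return success;
+}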
+
+status_t dnnl_memory_desc_create_with_blob(
+        memory_desc_t **md, const uint8_t *blob) {
+    if (one_of(nullptr, md, blob)) return invalid_arguments;
+
+    *md = new memory_desc_t();
+    memcpy(*md, blob, sizeof(memory_desc_t));
+    return success;
+}
+
 // This is an internal API that is used only for testing in benchdnn.
 extern "C" status_t DNNL_API dnnl_memory_desc_create_with_string_tag(
         memory_desc_t **memory_desc, int ndims, const dims_t dims,
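Because the blob here is a raw byte copy of the descriptor, and the create call receives no size, the caller is the only party that can validate a blob before trusting it. The following is a minimal sketch of such a check, assuming a stand-in `toy_desc_t`; `same_desc` and `load_if_matching` are hypothetical helpers, not oneDNN API. It rejects blobs of the wrong size and blobs that decode to something other than the descriptor the consumer expects.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>
#include <vector>

// Stand-in descriptor; NOT the oneDNN memory_desc_t.
struct toy_desc_t {
    int32_t ndims;
    int32_t data_type;
    int64_t dims[4];
};

// A raw byte copy is only meaningful for trivially copyable types.
static_assert(std::is_trivially_copyable<toy_desc_t>::value,
        "byte-wise serialization requires a trivially copyable type");

// Field-wise comparison; returns false for an out-of-range ndims.
bool same_desc(const toy_desc_t &a, const toy_desc_t &b) {
    if (a.ndims < 0 || a.ndims > 4) return false;
    if (a.ndims != b.ndims || a.data_type != b.data_type) return false;
    for (int i = 0; i < a.ndims; ++i)
        if (a.dims[i] != b.dims[i]) return false;
    return true;
}

// Fills `out` and returns true only if the blob has the expected size and
// decodes to the descriptor the consumer expects.
bool load_if_matching(const std::vector<uint8_t> &blob,
        const toy_desc_t &expected, toy_desc_t &out) {
    assert(expected.ndims >= 0 && expected.ndims <= 4);
    if (blob.size() != sizeof(toy_desc_t)) return false; // stale or foreign
    toy_desc_t tmp;
    std::memcpy(&tmp, blob.data(), sizeof(tmp));
    if (!same_desc(tmp, expected)) return false;
    out = tmp;
    return true;
}
```

When the check fails, a caller would typically discard the cached tensor and rebuild it from the original data in the layout the primitive expects.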

diff --git a/tests/gtests/test_persistent_cache_api.cpp b/tests/gtests/test_persistent_cache_api.cpp
@@ -1,5 +1,5 @@
 /*******************************************************************************
-* Copyright 2021-2022 Intel Corporation
+* Copyright 2021-2024 Intel Corporation
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
@@ -17,6 +17,7 @@
 #include "dnnl_test_common.hpp"
 #include "gtest/gtest.h"
 
+#include "oneapi/dnnl/dnnl.h"
 #include "oneapi/dnnl/dnnl.hpp"
 
 #if DNNL_GPU_RUNTIME == DNNL_RUNTIME_OCL
@@ -92,4 +93,47 @@ HANDLE_EXCEPTIONS_FOR_TEST(
 }
 #endif
 
+HANDLE_EXCEPTIONS_FOR_TEST(
+        persistent_cache_api_test_t, TestPersistentCacheAPIMemoryDesc) {
+    engine e = get_test_engine();
+    auto pd = convolution_forward::primitive_desc {e,
+            prop_kind::forward_training, algorithm::convolution_direct,
+            {{2, 16, 16, 16}, memory::data_type::f32, memory::format_tag::any},
+            {{16, 16, 3, 3}, memory::data_type::f32, memory::format_tag::any},
+            {{2, 16, 14, 14}, memory::data_type::f32, memory::format_tag::any},
+            {1, 1}, {0, 0}, {0, 0}};
+
+    auto wei_desc = pd.weights_desc();
+
+    // C API check
+    {
+        auto c_wei_desc = wei_desc.get();
+
+        // serialization
+        size_t size;
+        dnnl_memory_desc_get_blob(nullptr, &size, c_wei_desc);
+        std::vector<uint8_t> serialized_wei_desc(size);
+        dnnl_memory_desc_get_blob(
+                serialized_wei_desc.data(), &size, c_wei_desc);
+
+        // deserialization
+        dnnl_memory_desc_t deserialized_wei_desc;
+        dnnl_memory_desc_create_with_blob(
+                &deserialized_wei_desc, serialized_wei_desc.data());
+
+        ASSERT_EQ(pd.weights_desc(), deserialized_wei_desc);
+    }
+
+    // C++ API check
+    {
+        // serialization
+        std::vector<uint8_t> serialized_wei_desc = wei_desc.get_blob();
+
+        // deserialization
+        auto deserialized_desc = memory::desc(serialized_wei_desc);
+
+        ASSERT_EQ(pd.weights_desc(), deserialized_desc);
+    }
+}
+
 } // namespace dnnl