Merged
31 commits
216bd5e
Added DLLEncryptor. Currently unused.
argmarco-tkd Jul 8, 2025
5059384
creating empty DLL Encryptor. Some wiring.
argmarco-tkd Jul 9, 2025
a35c9ac
DLLEncryptor successfully loaded (via the Loader) from the mini-app.
argmarco-tkd Jul 9, 2025
a61d2a9
Refactored and separated the DLLEncryptorLoader from DLLEncryptor impl
argmarco-tkd Jul 11, 2025
9717f7e
Refactoring. Began the renaming of DLLEncryptorLoader into LoadableEn…
argmarco-tkd Jul 14, 2025
e99e3e9
Extracted LoadableEncryptorInterface into its own file.
argmarco-tkd Jul 15, 2025
323e608
adding a few utils to Dockerfile.miniApp
argmarco-tkd Aug 2, 2025
30a2b35
adding parallelism to ./ninja to the helper build scripts
argmarco-tkd Aug 5, 2025
958f8a9
created the parquet/encryption/[external] folder to put external encr…
argmarco-tkd Aug 5, 2025
6243e77
First pass at implementing DBPALibraryWrapper - tests not added yet.
argmarco-tkd Aug 5, 2025
d391d71
cleaning up the impl of LoadFromLibrary()
argmarco-tkd Aug 5, 2025
edff151
Defined DBPATestAgent as a simple agent for testing loading machinery…
argmarco-tkd Aug 5, 2025
6c8f265
moving DBPALibraryWrapper into its own .h and .cc files. No tests yet.
argmarco-tkd Aug 5, 2025
a62b405
Adding the option to pass a function to close the shared library. Thi…
argmarco-tkd Aug 6, 2025
0248ef6
Reviewed and improved the testing suite. All the tests for dbpa_libra…
argmarco-tkd Aug 6, 2025
1db72b6
Reviewed and improved the testing suite. All the tests for dbpa_libra…
argmarco-tkd Aug 6, 2025
3bf7406
ensuring that WIN32 stuff throws an exception (unimplemented)
argmarco-tkd Aug 6, 2025
3b09ef3
commented out the references to loadble_encryptor_utils and to DLL lo…
argmarco-tkd Aug 6, 2025
4d52fc7
Added a default value (function) for the handle_closing_fn parameter …
argmarco-tkd Aug 6, 2025
e91a805
Modified LoadableEncryptorUtils to work with DBPAInterface, instead o…
argmarco-tkd Aug 6, 2025
7d5b847
Adding tests for loadable_encryptor_utils
argmarco-tkd Aug 7, 2025
300ddb3
Added CloseDynamicLibrary() to arrow/util/io_util.cc
argmarco-tkd Aug 7, 2025
bdafe6c
Replaced the "custom" usage of dlopen, dlsym, etc to handle Shared Li…
argmarco-tkd Aug 7, 2025
c33df5c
better casting
argmarco-tkd Aug 7, 2025
be838ef
Added better heuristics to find the *.so file during tests
argmarco-tkd Aug 7, 2025
08b817f
Removed references to DLLEncryptor (sources, CMake configs)
argmarco-tkd Aug 7, 2025
99a0d64
working
sofia-tekdatum Jul 22, 2025
7beec95
closing decryption loop in miniapp. super hacky code, pls ignore
sofia-tekdatum Aug 4, 2025
4747aef
Removing unnecessary prints
sofia-tekdatum Aug 7, 2025
e975ec6
Merge branch 'dev-miniapp' into dll_work
argmarco-tkd Aug 8, 2025
d2c48f8
Some cleanup after feedback in draft PR: https://github.com/protegrit…
argmarco-tkd Aug 8, 2025
20 changes: 20 additions & 0 deletions cpp/src/arrow/util/io_util.cc
@@ -2278,4 +2278,24 @@ Result<void*> GetSymbol(void* handle, const char* name) {
#endif
}

Status CloseDynamicLibrary(void* handle) {
Review comment (Collaborator):

The method looks a bit cryptic. Could you add a comment at the method level to describe what this does?

  if (handle == nullptr) {
    return Status::Invalid("Attempting to close null library handle");
  }
#ifdef _WIN32
  if (FreeLibrary(reinterpret_cast<HMODULE>(handle))) {
    return Status::OK();
  }
  // win32 api doc: "If the function fails, the return value is zero."
  return IOErrorFromWinError(GetLastError(), "FreeLibrary() failed");
#else
  if (dlclose(handle) == 0) {
    return Status::OK();
  }
  // dlclose(3) man page: "On success, dlclose() returns 0; on error, it returns a nonzero value."
  auto* error = dlerror();
  return Status::IOError("dlclose() failed: ", error ? error : "unknown error");
#endif
}

} // namespace arrow::internal
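
The POSIX branch of CloseDynamicLibrary() can be tried outside of Arrow with a minimal sketch like the one below. It assumes a POSIX system (it will not build on Windows). `CloseResult`, `CloseLibrarySketch` and `OpenSelfHandle` are hypothetical names introduced only for illustration, with `CloseResult` standing in for `arrow::Status`.

```cpp
#include <dlfcn.h>  // POSIX only: dlopen, dlclose, dlerror

#include <string>

// Hypothetical stand-in for arrow::Status, used only in this sketch.
struct CloseResult {
  bool ok;
  std::string message;
};

// Mirrors the POSIX branch of CloseDynamicLibrary(): reject a null handle,
// then report a nonzero dlclose() return value as an error.
CloseResult CloseLibrarySketch(void* handle) {
  if (handle == nullptr) {
    return {false, "Attempting to close null library handle"};
  }
  if (dlclose(handle) == 0) {
    return {true, ""};
  }
  const char* error = dlerror();
  return {false, std::string("dlclose() failed: ") + (error ? error : "unknown error")};
}

// Opens a handle to the running program itself, so the sketch does not
// depend on any particular shared library being present on disk.
void* OpenSelfHandle() { return dlopen(nullptr, RTLD_NOW); }
```

Checking for a null handle before calling dlclose() keeps the error message predictable, since what dlclose() does with an invalid handle is up to the platform.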
7 changes: 7 additions & 0 deletions cpp/src/arrow/util/io_util.h
@@ -443,6 +443,13 @@ ARROW_EXPORT Result<void*> LoadDynamicLibrary(const char* path);
/// returned; instead an error will be raised.
ARROW_EXPORT Result<void*> GetSymbol(void* handle, const char* name);

/// \brief Close a dynamic library
///
/// This wraps dlclose() except on Windows, where FreeLibrary() is called.
///
/// \return Status::OK() if the library was closed successfully, otherwise an error is returned.
ARROW_EXPORT Status CloseDynamicLibrary(void* handle);

template <typename T>
Result<T*> GetSymbolAs(void* handle, const char* name) {
ARROW_ASSIGN_OR_RAISE(void* sym, GetSymbol(handle, name));
5 changes: 4 additions & 1 deletion cpp/src/parquet/CMakeLists.txt
@@ -238,8 +238,11 @@ endif()

if(PARQUET_REQUIRE_ENCRYPTION)
list(APPEND PARQUET_SHARED_PRIVATE_LINK_LIBS ${ARROW_OPENSSL_LIBS})
list(APPEND PARQUET_STATIC_LINK_LIBS ${ARROW_OPENSSL_LIBS})
set(PARQUET_SRCS ${PARQUET_SRCS} encryption/encryption_internal.cc
encryption/openssl_internal.cc)
encryption/openssl_internal.cc
encryption/external/loadable_encryptor_utils.cc
Review comment (Collaborator):

For my understanding, do these modules need to be specified differently (with additional flags, maybe) during compilation? Or are they treated the same as any other source?

encryption/external/dbpa_library_wrapper.cc)
# Encryption key management
set(PARQUET_SRCS
${PARQUET_SRCS}
25 changes: 25 additions & 0 deletions cpp/src/parquet/encryption/CMakeLists.txt
@@ -17,3 +17,28 @@

# Headers: public api
arrow_install_all_headers("parquet/encryption")

if(ARROW_TESTING)
  add_library(DBPATestAgent SHARED
    external/dbpa_test_agent.cc
  )

  # DBPATestAgent configuration
  target_link_libraries(DBPATestAgent PUBLIC
    arrow_shared
  )

  set_target_properties(DBPATestAgent PROPERTIES OUTPUT_NAME "DBPATestAgent")
Review comment (Collaborator):

You probably answered this already, but the TestAgent is just for the testing flow, right? (as in ARROW_TESTING=True above)

Reply (Collaborator, Author):

Correct, this is just for the testing flow. The generated library (*.so, *.dll) is not "installed" (i.e. copied) anywhere other than where needed for unit testing.


  # Add test for DBPALibraryWrapper
  add_parquet_test(dbpa-library-wrapper-test
    SOURCES external/dbpa_library_wrapper_test.cc
    LABELS "parquet-tests" "encryption-tests")

  # Add test for LoadableEncryptorUtils
  add_parquet_test(loadable-encryptor-utils-test
    SOURCES external/loadable_encryptor_utils_test.cc
    EXTRA_LINK_LIBS DBPATestAgent
    LABELS "parquet-tests" "encryption-tests")
endif()

32 changes: 32 additions & 0 deletions cpp/src/parquet/encryption/external/dbpa_interface.h
@@ -0,0 +1,32 @@
//TODO: figure out the licensing.

#pragma once

#include <cstdint>
#include <memory>
#include "parquet/platform.h"
#include "arrow/util/span.h"

using ::arrow::util::span;

namespace parquet::encryption::external {

//TODO: this will change once we have a solid definition of the interfaces

class EncryptionResult {
};

class DecryptionResult {
};

class PARQUET_EXPORT DataBatchProtectionAgentInterface {
 public:
  virtual std::unique_ptr<EncryptionResult> Encrypt(
      span<const uint8_t> plaintext,
      span<uint8_t> ciphertext) = 0;

  virtual std::unique_ptr<DecryptionResult> Decrypt(
      span<const uint8_t> ciphertext) = 0;

  virtual ~DataBatchProtectionAgentInterface() = default;
};

}  // namespace parquet::encryption::external
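
To make the shape of the interface concrete, here is a minimal sketch of what an agent implementing it might look like. It does not use the PR's headers: `AgentInterfaceSketch` and `XorAgentSketch` are hypothetical names, `std::vector` stands in for `arrow::util::span`, and the XOR transform is only a placeholder, not real encryption. The result classes are empty here because they are empty in the PR as well.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Simplified stand-ins for the types in dbpa_interface.h (assumptions for
// this sketch; the real interface uses arrow::util::span and is still in flux).
struct EncryptionResult {};
struct DecryptionResult {};

class AgentInterfaceSketch {
 public:
  virtual std::unique_ptr<EncryptionResult> Encrypt(
      const std::vector<std::uint8_t>& plaintext,
      std::vector<std::uint8_t>& ciphertext) = 0;

  virtual std::unique_ptr<DecryptionResult> Decrypt(
      const std::vector<std::uint8_t>& ciphertext) = 0;

  virtual ~AgentInterfaceSketch() = default;
};

// Hypothetical agent: XORs every byte with a fixed key. Not real encryption.
class XorAgentSketch : public AgentInterfaceSketch {
 public:
  explicit XorAgentSketch(std::uint8_t key) : key_(key) {}

  std::unique_ptr<EncryptionResult> Encrypt(
      const std::vector<std::uint8_t>& plaintext,
      std::vector<std::uint8_t>& ciphertext) override {
    ciphertext.resize(plaintext.size());
    for (std::size_t i = 0; i < plaintext.size(); ++i) {
      ciphertext[i] = static_cast<std::uint8_t>(plaintext[i] ^ key_);
    }
    return std::make_unique<EncryptionResult>();
  }

  std::unique_ptr<DecryptionResult> Decrypt(
      const std::vector<std::uint8_t>& ciphertext) override {
    // DecryptionResult carries no data yet, so there is nowhere to put output.
    (void)ciphertext;
    return std::make_unique<DecryptionResult>();
  }

 private:
  std::uint8_t key_;
};
```

Callers only ever hold a pointer to the interface type, which is what lets the concrete class live in a separately loaded shared library.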
80 changes: 80 additions & 0 deletions cpp/src/parquet/encryption/external/dbpa_library_wrapper.cc
@@ -0,0 +1,80 @@
//TODO: figure out the licensing.

#include "parquet/encryption/external/dbpa_library_wrapper.h"
#include "parquet/encryption/external/dbpa_interface.h"
#include "arrow/util/span.h"
#include <functional>
#include <stdexcept>
#include <utility>

#include <iostream>

#include "arrow/util/io_util.h"

using ::arrow::util::span;

namespace parquet::encryption::external {

// Default implementation of the handle-closing function.
// Closing is best-effort: a failure is logged to stderr rather than propagated.
void DefaultSharedLibraryClosingFn(void* library_handle) {
  auto status = arrow::internal::CloseDynamicLibrary(library_handle);
  if (!status.ok()) {
    std::cerr << "Error closing library: " << status.message() << std::endl;
  }
}

DBPALibraryWrapper::DBPALibraryWrapper(
    std::unique_ptr<DataBatchProtectionAgentInterface> agent,
Review comment (Collaborator):

These few lines are quite loaded. Could you add a 2-line comment on what this does?

Reply (Collaborator, Author):

It is described in the .h, but I see your point. Will add a comment here as well.

    void* library_handle,
    std::function<void(void*)> handle_closing_fn)
    : wrapped_agent_(std::move(agent)),
      library_handle_(library_handle),
      handle_closing_fn_(std::move(handle_closing_fn)) {
  // Ensure the wrapped agent is not null
  if (!wrapped_agent_) {
    throw std::invalid_argument("DBPAWrapper: Cannot create wrapper with null agent");
  }
  if (!library_handle_) {
    throw std::invalid_argument("DBPAWrapper: Cannot create wrapper with null library handle");
  }
  if (!handle_closing_fn_) {
    throw std::invalid_argument("DBPAWrapper: Cannot create wrapper with null handle closing function");
  }
}

// DBPALibraryWrapper destructor
// This is the main reason for the decorator/wrapper.
// It (a) destroys the wrapped agent and then (b) closes the shared library.
// wrapped_agent_ would be destroyed automatically when this object is destroyed,
// but only after this destructor body has run, i.e. after the library is closed.
// We therefore destroy it explicitly **before** closing the shared library.
// Otherwise, unloading the library could unload the agent's class definition
// before the agent's destructor completes, which is likely to cause a crash
// (such as a segfault).
DBPALibraryWrapper::~DBPALibraryWrapper() {
  // Explicitly destroy the wrapped agent first
  wrapped_agent_.reset();

  // Now we can close the shared library using the provided function
  handle_closing_fn_(library_handle_);
  library_handle_ = nullptr;
}

// Decorator implementation of Encrypt method
std::unique_ptr<EncryptionResult> DBPALibraryWrapper::Encrypt(
    span<const uint8_t> plaintext,
    span<uint8_t> ciphertext) {
  return wrapped_agent_->Encrypt(plaintext, ciphertext);
}

// Decorator implementation of Decrypt method
std::unique_ptr<DecryptionResult> DBPALibraryWrapper::Decrypt(
    span<const uint8_t> ciphertext) {
  return wrapped_agent_->Decrypt(ciphertext);
}

} // namespace parquet::encryption::external
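
The teardown order that the destructor enforces can be observed without loading a real library. The sketch below is a simplified stand-in for the wrapper, not the PR's code: `AgentSketch`, `WrapperSketch` and `RecordTeardownOrder` are hypothetical names, and the "library" is just the address of a local variable. It injects a recording closing function, in the same spirit as the mock function the wrapper's constructor allows for tests.

```cpp
#include <functional>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an agent whose code lives in the shared library.
class AgentSketch {
 public:
  explicit AgentSketch(std::vector<std::string>* log) : log_(log) {}
  ~AgentSketch() { log_->push_back("agent destroyed"); }

 private:
  std::vector<std::string>* log_;
};

class WrapperSketch {
 public:
  WrapperSketch(std::unique_ptr<AgentSketch> agent, void* library_handle,
                std::function<void(void*)> closing_fn)
      : agent_(std::move(agent)),
        library_handle_(library_handle),
        closing_fn_(std::move(closing_fn)) {
    if (!agent_ || !library_handle_ || !closing_fn_) {
      throw std::invalid_argument("null agent, handle, or closing function");
    }
  }

  ~WrapperSketch() {
    agent_.reset();                // (a) run the agent's destructor while its code is loaded
    closing_fn_(library_handle_);  // (b) only then release the library
  }

 private:
  std::unique_ptr<AgentSketch> agent_;
  void* library_handle_;
  std::function<void(void*)> closing_fn_;
};

// Builds and destroys a wrapper with a recording closing function and
// returns the order in which the two teardown steps happened.
std::vector<std::string> RecordTeardownOrder() {
  std::vector<std::string> log;
  int fake_library = 0;  // any non-null address works as a fake handle
  {
    WrapperSketch wrapper(std::make_unique<AgentSketch>(&log), &fake_library,
                          [&log](void*) { log.push_back("library closed"); });
  }
  return log;
}
```

Without the explicit `reset()`, the agent would only be destroyed during member destruction, which happens after the destructor body and therefore after the closing function has already run.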
59 changes: 59 additions & 0 deletions cpp/src/parquet/encryption/external/dbpa_library_wrapper.h
@@ -0,0 +1,59 @@
//TODO: figure out the licensing.

#pragma once

#include <memory>
#include <functional>

#include "parquet/encryption/external/dbpa_interface.h"
#include "arrow/util/span.h"

using ::arrow::util::span;

namespace parquet::encryption::external {

// Default implementation for shared library closing function
// This is passed into the constructor of DBPALibraryWrapper,
// and is used as the default function to close the shared library.
void DefaultSharedLibraryClosingFn(void* library_handle);

// Decorator/wrapper class for DataBatchProtectionAgentInterface.
// Its main purpose is to close the shared library when Arrow is about to destroy
// an instance of a DBPA agent.
//
// The constructor accepts a function that is used to close the shared library.
// This simplifies testing, as a mock function can be passed to avoid actually
// closing the shared library.
class DBPALibraryWrapper : public DataBatchProtectionAgentInterface {
 private:
  std::unique_ptr<DataBatchProtectionAgentInterface> wrapped_agent_;
  void* library_handle_;
  std::function<void(void*)> handle_closing_fn_;

 public:
  // Constructor that takes ownership of the wrapped agent.
  // Throws std::invalid_argument if any argument is null.
  explicit DBPALibraryWrapper(
      std::unique_ptr<DataBatchProtectionAgentInterface> agent,
      void* library_handle,
      std::function<void(void*)> handle_closing_fn = &DefaultSharedLibraryClosingFn);

  // Destructor
  // This is the main reason for the decorator/wrapper.
  // It (a) destroys the wrapped agent and then (b) closes the shared library.
  // wrapped_agent_ would be destroyed automatically when this object is destroyed,
  // but only after the destructor body has run, i.e. after the library is closed.
  // We therefore destroy it explicitly **before** closing the shared library.
  // Otherwise, unloading the library could unload the agent's class definition
  // before the agent's destructor completes, which is likely to cause a crash
  // (such as a segfault).
  ~DBPALibraryWrapper() override;

  // Decorator implementation of Encrypt method
  std::unique_ptr<EncryptionResult> Encrypt(
      span<const uint8_t> plaintext,
      span<uint8_t> ciphertext) override;

  // Decorator implementation of Decrypt method
  std::unique_ptr<DecryptionResult> Decrypt(
      span<const uint8_t> ciphertext) override;
};

} // namespace parquet::encryption::external