@argmarco-tkd
Collaborator

@argmarco-tkd argmarco-tkd commented Aug 8, 2025

Rationale for this change

In this PR, I'm creating the machinery (utility classes) for:

  • Loading an implementation of the DataBatchProtectionAgent from a shared library (e.g. *.so file)
  • Wrapping that implementation in a decorator that ensures the shared library handle is properly released.
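
To illustrate the second point, here is a minimal sketch of the decorator idea. It is not the code in this PR: `AgentSketch`, `LibraryWrapperSketch`, `RecordingAgent`, `RunWrapperDemo`, and the `Encrypt` method are hypothetical names, and the close function is injected as a callback so the sketch runs without a real shared library. The point it shows is the ordering: the agent's code lives inside the library, so the agent has to be destroyed before the handle is released.

```cpp
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the agent interface; the real
// DataBatchProtectionAgentInterface is still being finalized.
class AgentSketch {
 public:
  virtual ~AgentSketch() = default;
  virtual std::string Encrypt(const std::string& plaintext) = 0;
};

// Decorator: forwards calls to the wrapped agent and owns the library handle.
class LibraryWrapperSketch : public AgentSketch {
 public:
  LibraryWrapperSketch(std::unique_ptr<AgentSketch> agent, void* library_handle,
                       std::function<void(void*)> close_library)
      : agent_(std::move(agent)),
        library_handle_(library_handle),
        close_library_(std::move(close_library)) {}

  ~LibraryWrapperSketch() override {
    // Destroy the agent first, while its code is still loaded...
    agent_.reset();
    // ...and only then release the library handle.
    if (library_handle_ != nullptr && close_library_) {
      close_library_(library_handle_);
    }
  }

  // The wrapper owns the handle, so copying it would lead to a double close.
  LibraryWrapperSketch(const LibraryWrapperSketch&) = delete;
  LibraryWrapperSketch& operator=(const LibraryWrapperSketch&) = delete;

  std::string Encrypt(const std::string& plaintext) override {
    return agent_->Encrypt(plaintext);
  }

 private:
  std::unique_ptr<AgentSketch> agent_;
  void* library_handle_;
  std::function<void(void*)> close_library_;
};

// Toy agent that records its own destruction in an event log.
class RecordingAgent : public AgentSketch {
 public:
  explicit RecordingAgent(std::vector<std::string>* events) : events_(events) {}
  ~RecordingAgent() override { events_->push_back("agent destroyed"); }
  std::string Encrypt(const std::string& plaintext) override {
    return "enc(" + plaintext + ")";
  }

 private:
  std::vector<std::string>* events_;
};

// Builds a wrapper around a fake handle, uses it, lets it go out of scope,
// and returns the recorded sequence of events.
std::vector<std::string> RunWrapperDemo() {
  std::vector<std::string> events;
  int fake_library = 0;  // stands in for a handle from dlopen()/LoadLibrary()
  {
    LibraryWrapperSketch wrapper(
        std::make_unique<RecordingAgent>(&events), &fake_library,
        [&events](void*) { events.push_back("library closed"); });
    events.push_back(wrapper.Encrypt("batch"));
  }
  return events;
}
```

In this sketch, callers only see the agent interface, so they cannot tell that the object they hold also keeps the library alive.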

These changes are described in this doc

(An earlier PR was prepared for the same change. It was closed because it contained unwanted changes and files; I have tried to address the review comments from that PR in this one.)

What changes are included in this PR?

  • Mostly new files + their respective tests.
  • A relatively important, but hopefully uncontroversial, change is the addition of the "CloseDynamicLibrary" function to Arrow's io_util.cc, the file that already contains the functionality for dealing with dynamic libraries.

Are these changes tested?

  • Yes - unit tests have been created.

Additional Notes

  • The interface for DataBatchProtectionAgentInterface is still being worked on. I have created a temporary interface (parquet/encryption/external/DBPAInterface.h), which is hopefully close enough to the final version. Once the interface is finalized, we will incorporate it here and send the final PR.
  • This PR contains the changes between two branches: dev_dll_work (feature branch) and dev-miniapp.

argmarco-tkd and others added 30 commits July 8, 2025 10:47
…yptor related stuff. Begin moving things around.
@github-actions

github-actions bot commented Aug 8, 2025

Thanks for opening a pull request!

If this is not a minor PR, could you open an issue for this pull request on GitHub? https://github.com/apache/arrow/issues/new/choose

Opening GitHub issues ahead of time contributes to the Openness of the Apache Arrow project.

Then could you also rename the pull request title in the following format?

GH-${GITHUB_ISSUE_ID}: [${COMPONENT}] ${SUMMARY}

or

MINOR: [${COMPONENT}] ${SUMMARY}

See also:

@argmarco-tkd argmarco-tkd changed the title from "Dev dll work" to "[Draft] Machinery for loading external Agent (encryptor) out of a shared library" Aug 8, 2025
Collaborator

@avalerio-tkd avalerio-tkd left a comment

LGTM overall, just a few comments. I mostly skipped the unit tests.

#endif
}

Status CloseDynamicLibrary(void* handle) {
Collaborator

The method looks a bit cryptic. Could you add a comment at the method level to describe what this does?
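
For context on what such a method-level comment could cover, here is a hedged sketch of a close function. It is not the implementation added to io_util.cc, which returns an arrow Status and is not shown in full here; `CloseResult` and `CloseLibrarySketch` are hypothetical names used to keep the example self-contained. It relies only on the standard `dlclose()` (POSIX) and `FreeLibrary()` (Windows) calls.

```cpp
#include <string>

#ifdef _WIN32
#include <windows.h>
#else
#include <dlfcn.h>
#endif

// Minimal stand-in for a status type, used only in this sketch.
struct CloseResult {
  bool ok;
  std::string message;
};

// Releases a handle previously obtained from dlopen() (POSIX) or
// LoadLibrary() (Windows). After a successful call, functions and objects
// that came from the library must no longer be used.
CloseResult CloseLibrarySketch(void* handle) {
  if (handle == nullptr) {
    return {false, "cannot close a null library handle"};
  }
#ifdef _WIN32
  // FreeLibrary() returns nonzero on success.
  if (FreeLibrary(reinterpret_cast<HMODULE>(handle))) {
    return {true, ""};
  }
  return {false,
          "FreeLibrary() failed, error code " + std::to_string(GetLastError())};
#else
  // dlclose() returns 0 on success.
  if (dlclose(handle) == 0) {
    return {true, ""};
  }
  const char* error = dlerror();
  return {false,
          std::string("dlclose() failed: ") + (error ? error : "unknown error")};
#endif
}
```

Rejecting a null handle up front is a choice made for this sketch; the function in the PR may handle that case differently.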

set(PARQUET_SRCS ${PARQUET_SRCS} encryption/encryption_internal.cc
encryption/openssl_internal.cc)
encryption/openssl_internal.cc
encryption/external/loadable_encryptor_utils.cc
Collaborator

For my understanding, do these modules need to be specified differently (with additional flags, maybe) during compilation? Or are they treated the same as any other source file?

arrow_shared
)

set_target_properties(DBPATestAgent PROPERTIES OUTPUT_NAME "DBPATestAgent")
Collaborator

You probably answered this already, but the TestAgent is just for the testing flow, right? (as in ARROW_TESTING=True above)

Collaborator Author

Correct, this is just for the testing flow. The generated library (*.so, *.dll) is not "installed" (i.e. copied) anywhere other than where it is needed for unit testing.

}

DBPALibraryWrapper::DBPALibraryWrapper(
std::unique_ptr<DataBatchProtectionAgentInterface> agent,
Collaborator

These few lines are quite loaded. Could you add a two-line comment on what this does?

Collaborator Author

It is described in the .h, but I see your point. Will add a comment here as well.

Collaborator

@avalerio-tkd avalerio-tkd left a comment

Since these are mostly comments and nits, I will approve so you can merge when ready. Thanks.

@argmarco-tkd argmarco-tkd changed the title from "[Draft] Machinery for loading external Agent (encryptor) out of a shared library" to "GH-41 -- Machinery for loading external Agent (encryptor) out of a shared library" Aug 8, 2025
@github-actions

github-actions bot commented Aug 8, 2025

❌ GitHub issue #41 could not be retrieved.

@argmarco-tkd argmarco-tkd changed the title from "GH-41 -- Machinery for loading external Agent (encryptor) out of a shared library" to "GH-41: Machinery for loading external Agent (encryptor) out of a shared library" Aug 8, 2025
@github-actions

github-actions bot commented Aug 8, 2025

❌ GitHub issue #41 could not be retrieved.

@argmarco-tkd argmarco-tkd changed the title from "GH-41: Machinery for loading external Agent (encryptor) out of a shared library" to "Machinery for loading external Agent (encryptor) out of a shared library" Aug 8, 2025
@avalerio-tkd
Collaborator

@argmarco-tkd if there's anything pending from my side on this PR, let me know. Thanks!

@argmarco-tkd
Collaborator Author

> @argmarco-tkd if there's anything pending from my side on this PR, let me know. Thanks!

Nope. I was initially planning to have the finalized DataBatchProtectionAgent.h before merging this in, but on second thought I'll merge it as is, THEN apply the changes to DataBatchProtectionAgent.h.

@argmarco-tkd argmarco-tkd marked this pull request as ready for review August 9, 2025 20:56
@argmarco-tkd argmarco-tkd merged commit d2c48f8 into dev-miniapp Aug 9, 2025
34 of 77 checks passed
4 participants