-
Notifications
You must be signed in to change notification settings - Fork 0
fix: improve NanoTDF support #112
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Summary of ChangesHello @b-long, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request refines the Python SDK by removing unused and placeholder service client interfaces and associated configuration options, leading to a leaner and more focused API. Crucially, it also advances NanoTDF support by enabling and updating its integration test, demonstrating basic encryption and decryption functionality with a directly provided key, paving the way for more robust KAS integration. Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request enables an integration test for NanoTDF encryption and decryption. While the test itself is a good addition for verifying the round-trip functionality, it exposes a critical issue in the underlying NanoTDF implementation. The NanoTDF class in src/otdf_python/nanotdf.py defines methods with the same name multiple times, which is not supported in Python and leads to method overriding. This will cause the new test to fail with a TypeError. The underlying issue in nanotdf.py needs to be addressed before this test can pass.
| config = NanoTDFConfig(kas_info_list=[kas_info], cipher=key.hex()) | ||
|
|
||
| # Create and read NanoTDF | ||
| nanotdf_bytes = nanotdf.create_nanotdf(data, config) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This call to create_nanotdf will fail with a TypeError.
The NanoTDF class in src/otdf_python/nanotdf.py has two method definitions for create_nanotdf. In Python, the second definition overwrites the first. The method being called here (defined at line 442 in nanotdf.py) then makes a call to self.create_nano_tdf on line 455 with an incorrect number of arguments, which causes the TypeError.
A similar issue exists for the read_nanotdf method.
To fix this, the duplicate method names in src/otdf_python/nanotdf.py must be resolved, for instance by renaming the stream-based methods to something like create_nanotdf_to_stream. Since that file is not part of this pull request, this test will remain broken until the underlying code is fixed.
|
/gemini review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request significantly improves NanoTDF support by implementing core encryption and decryption logic and adding a comprehensive suite of integration tests. The changes are a great step forward. My review focuses on two main areas for improvement. First, in ecc_mode.py, the new from_string method should be made safer by validating its input to prevent downstream errors. Second, in nanotdf.py, the logic for retrieving a symmetric key during decryption contains unreachable code and can be refactored for better robustness and readability. The new integration tests are well-structured and provide excellent coverage for the new functionality.
a5ae142 to
c5c5142
Compare
|
/gemini review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request provides a significant improvement to NanoTDF support by implementing the core encryption and decryption logic, and adding a suite of valuable integration tests. The new tests for interoperability with otdfctl and for the Python-only roundtrip are particularly well-done and increase confidence in the implementation. My review includes a couple of minor suggestions to improve code maintainability by replacing magic values and locally-scoped constants with more permanent constants.
| assert otdfctl_size < 10000, "otdfctl NanoTDF should be compact" | ||
| assert python_size < 10000, "Python NanoTDF should be compact" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The magic number 10000 is used here for asserting the file size. It's better to define this as a constant with a descriptive name at the module level (e.g., MAX_EXPECTED_NANO_TDF_SIZE = 10000) and use that constant in these assertions. This improves readability and makes the test easier to maintain.
|
/gemini review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request significantly improves NanoTDF support by implementing KAS integration for key wrapping and unwrapping, adding the ability to derive KAS endpoints from the platform URL, and introducing a comprehensive suite of integration tests to ensure interoperability. The changes are well-structured and the addition of tests is a great step towards ensuring robustness. My review focuses on a few areas for improvement: moving local imports to the top level for better code style and performance, addressing a critical gap in policy handling during KAS unwrap, and simplifying some complex and redundant logic to enhance maintainability and correctness.
| policy_json = "{}" # Default empty policy for now | ||
| # TODO: Extract actual policy from policy_info if needed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The policy is currently hardcoded to an empty JSON object ({}), and a TODO comment indicates that the actual policy extraction is not implemented. This is a significant gap, as it means the unwrap request sent to KAS will not contain the correct policy, potentially leading to incorrect access decisions or failures. The policy information should be extracted from the header_obj and included in the KAS request.
| def _local_unwrap(self, wrapped_key: bytes, config: NanoTDFConfig) -> bytes: | ||
| """Unwrap key locally using private key or mock unwrap (for testing/offline use).""" | ||
| kas_private_key = None | ||
| # Try to get from cipher field if it looks like a PEM key | ||
| if ( | ||
| config.cipher | ||
| and isinstance(config.cipher, str) | ||
| and "-----BEGIN" in config.cipher | ||
| ): | ||
| kas_private_key = config.cipher | ||
|
|
||
| # Check if mock unwrap is enabled in config string | ||
| kas_mock_unwrap = False | ||
| if config.config and "mock_unwrap=true" in config.config.lower(): | ||
| kas_mock_unwrap = True | ||
|
|
||
| if not kas_private_key and not kas_mock_unwrap: | ||
| raise InvalidNanoTDFConfig( | ||
| "Unable to unwrap NanoTDF key: KAS unwrap failed and no local private key available. " | ||
| "Ensure SDK has valid credentials or provide kas_private_key in config for offline use." | ||
| ) | ||
|
|
||
| if kas_mock_unwrap: | ||
| # Use the KAS mock unwrap_nanotdf logic | ||
| from otdf_python.sdk import KAS | ||
|
|
||
| return KAS().unwrap_nanotdf( | ||
| curve=None, | ||
| header=None, | ||
| kas_url=None, | ||
| wrapped_key=wrapped_key, | ||
| kas_private_key=kas_private_key, | ||
| mock=True, | ||
| ) | ||
| else: | ||
| asym = AsymDecryption(kas_private_key) | ||
| return asym.decrypt(wrapped_key) | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The logic in _local_unwrap is overly complex and contains redundant paths. The if kas_mock_unwrap: branch is convoluted, inefficient due to KAS() instantiation, and ultimately performs the same action as the else branch: decrypting the key with AsymDecryption. This complexity increases the maintenance burden and can hide bugs.
The function can be greatly simplified by removing the kas_mock_unwrap logic and relying on a single, clear path for local private key decryption.
def _local_unwrap(self, wrapped_key: bytes, config: NanoTDFConfig) -> bytes:
"""Unwrap key locally using private key from config."""
kas_private_key = None
# Try to get from cipher field if it looks like a PEM key
if (
config.cipher
and isinstance(config.cipher, str)
and "-----BEGIN" in config.cipher
):
kas_private_key = config.cipher
if not kas_private_key:
raise InvalidNanoTDFConfig(
"Unable to unwrap NanoTDF key: KAS unwrap failed and no local private key available. "
"Ensure SDK has valid credentials or provide a PEM private key in config.cipher for offline use."
)
asym = AsymDecryption(kas_private_key)
return asym.decrypt(wrapped_key)| Returns: | ||
| tuple: (wrapped_key, kas_public_key) | ||
| """ | ||
| import logging |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Imports should be at the top of the file, not inside functions. This file has several local imports (e.g., logging here, base64, Header, KeyAccess in _kas_unwrap).
Per PEP 8, imports should be at the top of the module. This practice improves readability, avoids circular import issues, and prevents the performance overhead of re-importing modules if a function is called multiple times in a loop. Please move all local imports to the top of the file.
| except Exception as e: | ||
| # If KAS unwrap fails, log and fall through to local unwrap methods | ||
| import logging | ||
|
|
||
| logging.warning(f"KAS unwrap failed for NanoTDF: {e}, trying local unwrap") | ||
| key = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Catching a broad Exception is generally discouraged as it can mask unexpected programming errors (e.g., AttributeError, TypeError) by treating them as a KAS communication failure. This makes debugging more difficult. It would be more robust to catch more specific exceptions related to network I/O or specific SDKException subclasses that the KAS client might raise.
fix: improve NanoTDF support