@nastra nastra commented Oct 30, 2025

The REST spec currently uses %1F as the UTF-8 encoded namespace separator for multi-part namespaces.
This causes issues, since it's a control character and the Servlet spec can reject such characters.

This PR makes the hard-coded namespace separator configurable: servers can optionally advertise a namespace separator for clients to use instead of %1F. The configuration is entirely optional for REST server implementers, and there is no behavioral change for existing installations.

The actual implementation for this can be seen at #10877
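As a rough illustration of the client side of this change (independent of the implementation referenced above), the sketch below joins the parts of a multipart namespace with either the legacy separator or one advertised by the server. The function name `encode_namespace`, the `overrides` dictionary, and the assumption that the override carries the raw separator character rather than its URL-encoded form are all hypothetical and not taken from the spec.

```python
from typing import Sequence
from urllib.parse import quote

# Unit separator (0x1F): the default named in the spec text.
LEGACY_SEPARATOR = "\x1f"


def encode_namespace(parts: Sequence[str], separator: str = LEGACY_SEPARATOR) -> str:
    """Percent-encode each namespace part and join with the encoded separator.

    Hypothetical helper: assumes no part contains the separator itself.
    """
    encoded_separator = quote(separator, safe="")
    return encoded_separator.join(quote(part, safe="") for part in parts)


# Assumed shape of the /config overrides; only the key name comes from the PR.
overrides = {"namespace-separator": "."}
separator = overrides.get("namespace-separator") or LEGACY_SEPARATOR

print(encode_namespace(["accounting", "tax"]))             # accounting%1Ftax
print(encode_namespace(["accounting", "tax"], separator))  # accounting.tax
```

A client that never reads the override keeps producing `%1F`, which is why existing installations see no behavioral change.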

For backward compatibility, empty string is treated as absent for now.
If parent is a multipart namespace, the parts must be separated by the unit separator (`0x1F`) byte.
If parent is a multipart namespace, the parts must be separated by the namespace separator as
indicated via the /config override `namespace-separator`, which defaults to the unit separator (`0x1F`) byte.
Contributor

how about we name this conf nested-namespace-separator ?

Contributor Author

@nastra nastra Oct 30, 2025

I don't feel strongly about the naming and either is fine, so let's see what others think it should be named.

could you provide an example inside the comment?

If parent is a multipart namespace, the parts must be separated by the unit separator (`0x1F`) byte.
If parent is a multipart namespace, the parts must be separated by the namespace separator as
indicated via the /config override `namespace-separator`, which defaults to the unit separator (`0x1F`) byte.
To be compatible with older clients, servers must use both the advertised separator and `0x1F` as valid separators when decoding namespaces.
Contributor

[doubt] I see. Is there then no way that 0x1F can reliably be treated as a non-separator, even when the client supports the advertised separator and is using it?
Or maybe, if servers want to support that, they could inspect the X-Iceberg-Version header to see whether the client supports this?

Contributor Author

Sorry, I don't think I'm following. Could you rephrase your question, please?
The reason the server must use both separators is that an older client will always use 0x1F, while a newer client will use the advertised separator.

Contributor

@singhpk234 singhpk234 Oct 31, 2025

Agree. My question was: if the server knows the client is new, i.e. it respects the configurable separator (for example, the client sends the Iceberg SDK version to the server as part of a header), can the server choose not to treat 0x1F as a separator?

Contributor Author

While this would technically be possible with the Java client, it would be more challenging with other client implementations, as you'd need to keep track of every client version known to be a newer client. Hence I think we should always treat 0x1F as the legacy separator.
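The decoding rule discussed in this thread (accept both the advertised separator and `0x1F`) could look roughly like the following on the server side. This is a minimal sketch, not the project's implementation: `decode_namespace` is a hypothetical name, the input is assumed to be the still percent-encoded path or query value, and escapes are assumed to use uppercase hex (`%1F`, not `%1f`).

```python
import re
from typing import List
from urllib.parse import quote, unquote

# Unit separator (0x1F): always accepted for older clients.
LEGACY_SEPARATOR = "\x1f"


def decode_namespace(encoded: str, advertised: str = LEGACY_SEPARATOR) -> List[str]:
    """Split an encoded namespace on either separator, then decode each part.

    Hypothetical helper; assumes uppercase percent-escapes in the input.
    """
    separators = {quote(advertised, safe=""), quote(LEGACY_SEPARATOR, safe="")}
    # Try longer separators first so a short one cannot match inside a longer one.
    ordered = sorted(separators, key=len, reverse=True)
    pattern = "|".join(re.escape(s) for s in ordered)
    return [unquote(part) for part in re.split(pattern, encoded)]


# An older client keeps sending %1F; a newer one sends the advertised ".".
print(decode_namespace("accounting%1Ftax", advertised="."))  # ['accounting', 'tax']
print(decode_namespace("accounting.tax", advertised="."))    # ['accounting', 'tax']
```

Because both separators are always accepted, the server does not need to track client versions, at the cost that `0x1F` can never appear as ordinary namespace content.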

github-actions bot commented Dec 4, 2025

This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the project's mailing list. Thank you for your contributions.

@nastra nastra force-pushed the configurable-namespace-separator-spec-change branch from 0a02d02 to ecfa72d Compare December 4, 2025 15:04
@nastra nastra requested a review from Fokko December 4, 2025 15:07
Contributor

@singhpk234 singhpk234 left a comment

LGTM ! thanks @nastra

Contributor Author

nastra commented Dec 8, 2025

thanks @kamcheungting-db, @amogh-jahagirdar, @singhpk234 for reviewing!

@nastra nastra merged commit 642b852 into apache:main Dec 8, 2025
12 checks passed
@nastra nastra deleted the configurable-namespace-separator-spec-change branch December 8, 2025 15:40
sfc-gh-prsingh added a commit to singhpk234/iceberg that referenced this pull request Jan 11, 2026
- Rename parameter from 'referenced-list' to 'referenced-by'
- Update namespace separator docs to align with PR apache#14448
- Clarify parsing rules for dot separator between namespace and view name
- Add example showing multiple comma-separated views
- Regenerate Python models with explicit typing imports

Format: ?referenced-by=namespace.view1,namespace.view2
where namespace parts use configurable separator (default %1F)
sfc-gh-prsingh added a commit to singhpk234/iceberg that referenced this pull request Jan 11, 2026
- Rename parameter from 'referenced-list' to 'referenced-by'
- Update namespace separator docs to align with PR apache#14448
- Clarify parsing rules for dot separator between namespace and view name
- Add example showing multiple comma-separated views

Format: ?referenced-by=namespace.view1,namespace.view2
where namespace parts use configurable separator (default %1F)

Note: Python file not regenerated to avoid CI conflicts due to
different datamodel-codegen versions between environments.
sfc-gh-prsingh added a commit to singhpk234/iceberg that referenced this pull request Jan 11, 2026
- Rename parameter from 'referenced-list' to 'referenced-by'
- Update namespace separator docs to align with PR apache#14448
- Clarify parsing rules for dot separator between namespace and view name
- Document comma separator for multiple view identifiers
- Add example showing multiple comma-separated views
- Specify URL encoding for commas in view names (%2C)

Format: ?referenced-by=namespace.view1,namespace.view2
where namespace parts use configurable separator (default %1F)

Separators used:
- %1F (or configured): between namespace parts
- . (dot): between namespace and view name
- , (comma): between multiple view identifiers
sfc-gh-prsingh added a commit to singhpk234/iceberg that referenced this pull request Jan 13, 2026
- Rename parameter from 'referenced-list' to 'referenced-by'
- Update namespace separator docs to align with PR apache#14448
- Clarify parsing rules for dot separator between namespace and view name
- Document comma separator for multiple view identifiers
- Add example showing multiple comma-separated views
- Specify URL encoding for commas in view names (%2C)

Format: ?referenced-by=namespace.view1,namespace.view2
where namespace parts use configurable separator (default %1F)

Separators used:
- %1F (or configured): between namespace parts
- . (dot): between namespace and view name
- , (comma): between multiple view identifiers
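To make the three separator levels in these commit messages concrete, here is a hedged sketch of parsing a raw `referenced-by` value. The function name `parse_referenced_by` is hypothetical, and the choice to split namespace from view name at the last literal dot is an assumption, since the exact parsing rule is not spelled out here. The sketch also assumes the value has not yet been percent-decoded, so `%2C` inside a view name is still distinguishable from a separating comma.

```python
from typing import List, Tuple
from urllib.parse import unquote


def parse_referenced_by(
    raw_value: str, encoded_separator: str = "%1F"
) -> List[Tuple[List[str], str]]:
    """Parse 'ns.view1,ns.view2' into (namespace parts, view name) pairs.

    Hypothetical helper working on the raw, still percent-encoded value.
    """
    identifiers = []
    for item in raw_value.split(","):  # comma: between view identifiers
        # Assumption: the last literal dot separates namespace and view name.
        namespace, _, view = item.rpartition(".")
        parts = (
            [unquote(p) for p in namespace.split(encoded_separator)]
            if namespace
            else []
        )
        identifiers.append((parts, unquote(view)))
    return identifiers


print(parse_referenced_by("accounting%1Ftax.paid,accounting%1Ftax.owed"))
# [(['accounting', 'tax'], 'paid'), (['accounting', 'tax'], 'owed')]
```

If the configured namespace separator were itself a dot, this splitting would become ambiguous, so a real implementation would need a rule for that case.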