@nastra nastra commented Jan 8, 2026

This fixes an issue that @haizhou-zhao brought up in #14334.
The test added in #14334 performs a concurrent update of the same view version, but fails with:

org.apache.iceberg.exceptions.ValidationException: Cannot set last added schema: no schema has been added
	at org.apache.iceberg.exceptions.ValidationException.check(ValidationException.java:49)
	at org.apache.iceberg.view.ViewMetadata$Builder.addVersionInternal(ViewMetadata.java:297)
	at org.apache.iceberg.view.ViewMetadata$Builder.addVersion(ViewMetadata.java:277)
	at org.apache.iceberg.MetadataUpdate$AddViewVersion.applyTo(MetadataUpdate.java:508)
	at org.apache.iceberg.rest.CatalogHandlers.lambda$commit$11(CatalogHandlers.java:624)
	at java.base/java.util.ArrayList.forEach(ArrayList.java:1511)
	at org.apache.iceberg.rest.CatalogHandlers.lambda$commit$12(CatalogHandlers.java:624)

This is caused by our internal state tracking of lastAddedSchemaId: adding the view version assumes that it has already been set and performs the following check:

if (version.schemaId() == LAST_ADDED) {
  ValidationException.check(lastAddedSchemaId != null, "Cannot set last added schema: no schema has been added");
  version = ImmutableViewVersion.builder().from(version).schemaId(lastAddedSchemaId).build();
}

I added a test to TestViewMetadata that reproduces the issue by setting the schema ID to -1, which indicates that the schema ID should be assigned automatically.
Once this change is merged, we should also merge #14334, since it reproduces the issue and includes a good test for it.
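
To make the failure mode concrete, here is a minimal, self-contained sketch. It is a simplified model and not the Iceberg code: LastAddedSketch, its Builder, and the string-based schemas are hypothetical stand-ins, and it assumes that re-adding an existing schema reuses its ID without setting lastAddedSchemaId. The same pair of updates (add schema, then add a version with schema ID -1) is applied twice, the second time on top of the already committed result.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-in for the builder's state tracking; NOT the Iceberg implementation.
final class LastAddedSketch {
  static final int LAST_ADDED = -1;

  static final class Builder {
    // Schema definitions (modelled as plain strings) keyed by schema ID.
    private final Map<Integer, String> schemasById = new HashMap<>();
    private Integer lastAddedSchemaId = null;

    Builder() {}

    // Start from already committed metadata, as a server does when applying updates.
    Builder(Map<Integer, String> committedSchemas) {
      schemasById.putAll(committedSchemas);
    }

    int addSchema(String schema) {
      for (Map.Entry<Integer, String> entry : schemasById.entrySet()) {
        if (entry.getValue().equals(schema)) {
          // Assumption: a reused schema leaves lastAddedSchemaId untouched.
          return entry.getKey();
        }
      }
      int newId = schemasById.size();
      schemasById.put(newId, schema);
      lastAddedSchemaId = newId;
      return newId;
    }

    // Returns the schema ID that the added version ends up with.
    int addVersion(int schemaId) {
      if (schemaId == LAST_ADDED) {
        if (lastAddedSchemaId == null) {
          throw new IllegalStateException(
              "Cannot set last added schema: no schema has been added");
        }
        return lastAddedSchemaId;
      }
      return schemaId;
    }

    Map<Integer, String> schemas() {
      return new HashMap<>(schemasById);
    }
  }

  public static void main(String[] args) {
    // First commit: the schema is new, so -1 can be resolved.
    Builder first = new Builder();
    first.addSchema("x: long");
    System.out.println("first commit resolved schema ID " + first.addVersion(LAST_ADDED));

    // Identical commit applied on top of the first one: the schema already exists.
    Builder second = new Builder(first.schemas());
    second.addSchema("x: long");
    try {
      second.addVersion(LAST_ADDED);
    } catch (IllegalStateException e) {
      System.out.println("second commit failed: " + e.getMessage());
    }
  }
}
```

In this model the second application fails only because of builder-local state, even though the resulting metadata would be identical to what is already committed.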

@huaxingao, @singhpk234, @amogh-jahagirdar since you guys reviewed #14334 already, could you please review this one as well?

// (indicating that the schema ID should be automatically assigned)
// this scenario can happen with concurrent updates in REST cases where the same update is
// applied twice. The view version gets a new ID assigned because
// ViewMetadata#sameViewVersion(current, updated) isn't true, because the current's schemaId was
Contributor Author

We should probably deduplicate this properly. I'm working on a fix for this.

@nastra nastra marked this pull request as draft January 8, 2026 14:04
private int addSchemaInternal(Schema schema) {
  int newSchemaId = reuseOrCreateNewSchemaId(schema);
  if (schemasById.containsKey(newSchemaId)) {
    if (null == lastAddedSchemaId) {
Contributor Author

Just FYI: this is most likely wrong as it stands, because setting lastAddedSchemaId here means that the metadata update will contain -1 as the schema ID in addVersionInternal().
The internal state tracking is quite delicate here, so I'm exploring a few other ways to keep a concurrent replace operation from failing due to the internal state tracking in ViewMetadata.

@nastra nastra force-pushed the last-added-schemaid branch 3 times, most recently from c7c9c51 to fd9bbc7 Compare January 9, 2026 11:41
Comment on lines 370 to 372
    && (one.schemaId() == two.schemaId()
        || (two.schemaId() == LAST_ADDED
            && Objects.equals(lastSeenExistingSchemaId, one.schemaId())));
Contributor

This is a bit ugly: one and two are treated differently.

Contributor

The method name suggests that the comparison is symmetrical, but it is not.

Contributor Author

Yes, I agree that this should be made symmetrical. I was mostly exploring this part to properly deduplicate the edge case that we're testing for.

Contributor Author

I've updated this to be symmetrical.
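
For readers following the thread, one way a symmetrical schema ID comparison could look is sketched below. This is a hypothetical illustration and not necessarily the change that was pushed: SchemaIdComparison, resolve, and sameSchemaId are made-up names, and the sketch assumes that lastSeenExistingSchemaId holds the ID of the schema that was reused (or null if none was). The idea is to resolve the -1 placeholder on both sides before comparing, so that swapping the arguments cannot change the result.

```java
import java.util.Objects;

// Hypothetical helper for illustration; not the code from this pull request.
final class SchemaIdComparison {
  static final int LAST_ADDED = -1;

  // Replace the LAST_ADDED placeholder with the reused schema's ID, if one is known.
  static Integer resolve(int schemaId, Integer lastSeenExistingSchemaId) {
    return schemaId == LAST_ADDED ? lastSeenExistingSchemaId : Integer.valueOf(schemaId);
  }

  // Symmetrical: both arguments go through the same resolution step.
  static boolean sameSchemaId(int one, int two, Integer lastSeenExistingSchemaId) {
    if (one == two) {
      return true;
    }
    return Objects.equals(
        resolve(one, lastSeenExistingSchemaId), resolve(two, lastSeenExistingSchemaId));
  }

  public static void main(String[] args) {
    // The placeholder matches the reused schema regardless of argument order.
    System.out.println(sameSchemaId(3, LAST_ADDED, 3));
    System.out.println(sameSchemaId(LAST_ADDED, 3, 3));
    // Without a known reused schema, the placeholder does not match a concrete ID.
    System.out.println(sameSchemaId(3, LAST_ADDED, null));
  }
}
```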

@nastra nastra changed the title from "Core: Set lastAddedSchemaId in case the same view version is being added as part of a concurrent update" to "Core: Adding same view version as part of a concurrent update shouldn't fail" on Jan 9, 2026
@nastra nastra force-pushed the last-added-schemaid branch from fd9bbc7 to 718e098 Compare January 15, 2026 11:18
@nastra nastra marked this pull request as ready for review January 15, 2026 11:32
@nastra nastra requested a review from pvary January 22, 2026 06:47
@nastra nastra force-pushed the last-added-schemaid branch from 718e098 to 38d3376 Compare January 22, 2026 06:54