Skip to content

Migrate Procedure sub-classes in spark-extensions to JUnit5 and AssertJ style#9760

Merged
nastra merged 6 commits intoapache:mainfrom
tomtongue:mig-junit5-procedures-extensions
Feb 24, 2024
Merged

Migrate Procedure sub-classes in spark-extensions to JUnit5 and AssertJ style#9760
nastra merged 6 commits intoapache:mainfrom
tomtongue:mig-junit5-procedures-extensions

Conversation

@tomtongue
Copy link
Contributor

@tomtongue tomtongue commented Feb 20, 2024

Migrate the following "Procedure" sub-classes in spark-extensions to JUnit 5 and AssertJ style along with #9613, and #9086.

TestAddFiles was included in #9613, it's skipped in this PR.

Current Progress

  • TestAncestorsOfProcedure
  • TestCallStatementParser
  • (Table Migration)
    • TestSnapshotTableProcedure
    • TestMigrateTableProcedure
    • TestRegisterTableProcedure
  • (Snapshot management)
    • TestRollbackToSnapshotProcedure
    • TestRollbackToTimestampProcedure
    • TestCherrypickSnapshotProcedure
    • TestSetCurrentSnapshotProcedure
    • TestPublishChangesProcedure
    • TestFastForwardBranchProcedure
  • (Metadata management)
    • TestExpireSnapshotsProcedure
    • TestRemoveOrphanFilesProcedure
    • TestRewriteDataFilesProcedure
    • TestRewriteManifestsProcedure
    • TestRewritePositionDeleteFiles
    • TestRewritePositionDeleteFilesProcedure
  • (CDC)
    • TestCreateChangelogViewProcdure
    • TestChangelogTable

@tomtongue tomtongue changed the title Migrate Procedure sub-classes in spark-extensions to JUnit5 and AssertJ style [WIP] Migrate Procedure sub-classes in spark-extensions to JUnit5 and AssertJ style Feb 20, 2024
@github-actions github-actions bot added the spark label Feb 20, 2024
@tomtongue
Copy link
Contributor Author

@nastra the migration for procedures is complete. When you have a chance, could you review the changes?

@tomtongue tomtongue changed the title [WIP] Migrate Procedure sub-classes in spark-extensions to JUnit5 and AssertJ style Migrate Procedure sub-classes in spark-extensions to JUnit5 and AssertJ style Feb 21, 2024
(CallStatement) parser.parsePlan("CALL c.n.func(1, '2', 3L, true, 1.0D, 9.0e1, 900e-1BD)");
Assert.assertEquals(
ImmutableList.of("c", "n", "func"), JavaConverters.seqAsJavaList(call.name()));
assertThat(seqAsJavaList(call.name())).hasSameElementsAs(ImmutableList.of("c", "n", "func"));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
assertThat(seqAsJavaList(call.name())).hasSameElementsAs(ImmutableList.of("c", "n", "func"));
assertThat(seqAsJavaList(call.name())).containsExactly("c", "n", "func");

please also update the other places

private <T> T checkCast(Object value, Class<T> expectedClass) {
Assert.assertTrue(
"Expected instance of " + expectedClass.getName(), expectedClass.isInstance(value));
assertThat(expectedClass.isInstance(value))
Copy link
Contributor

@nastra nastra Feb 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
assertThat(expectedClass.isInstance(value))
assertThat(expectedClass).isInstanceOf(value.getClass())

Copy link
Contributor Author

@tomtongue tomtongue Feb 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried this previously, but the tests related to this method failed with the following error:

// Example
java.lang.AssertionError: 
Expecting actual:
  org.apache.spark.sql.catalyst.plans.logical.PositionalArgument
to be an instance of:
  org.apache.spark.sql.catalyst.plans.logical.PositionalArgument
but was instance of:
  java.lang.Class

value.getClass seems not to be resolved the correct class name inside of .isInstanceOf.

This works well when using the current code like expectedClass.isInstance(value)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the problem with the current assertion is that it only will print true/false if the assertion fails. What we rather want to achieve for assertions is to have enough context to know what actual/expected was exactly. That being said, we want to avoid the usage of isTrue() / isFalse() as much as possible.

What was the assertion that you previously used?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the reply. Yes, I understand the situation and we should avoid using isTrue()/isFalse().

In the current PR, I change it to assertThat(expectedClass.isInstance(value)).isTrue(),
Before submitting the PR, I tested assertThat(expectedClass).isInstanceOf(value.getClass()), but this failed due to the error above. This is the reason assertThat(expectedClass.isInstance(value)).isTrue() is used here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry my bad, the check should be assertThat(value).isInstanceOf(expectedClass)

Copy link
Contributor Author

@tomtongue tomtongue Feb 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes, you're correct. sorry for that. Let me fix it.

Assert.assertTrue(
"Expected instance of " + expectedClass.getName(), expectedClass.isInstance(value));
assertThat(expectedClass.isInstance(value))
.as("Expected instance of " + expectedClass.getName())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
.as("Expected instance of " + expectedClass.getName())

sql(
"CALL %s.system.create_changelog_view(table => '%s', identifier_columns => array('id'), net_changes => true)",
catalogName, tableName));
Assertions.assertThatThrownBy(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Assertions.assertThatThrownBy(
assertThatThrownBy(


Map<String, String> props = createdTable.properties();
Assert.assertEquals("Should have extra property set", "bar", props.get("foo"));
assertThat(props.get("foo")).as("Should have extra property set").isEqualTo("bar");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
assertThat(props.get("foo")).as("Should have extra property set").isEqualTo("bar");
assertThat(props).containsEntry("foo", "bar");


Table table = validationCatalog.loadTable(tableIdent);
Assert.assertEquals("Should override user value", "true", table.properties().get("migrated"));
assertThat(table.properties().get("migrated"))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should also be using containsEntry()as this will show the full output of the map if the assertion ever fails


Assertions.assertThat(statsLocation.exists()).as("stats file should exist").isTrue();
Assertions.assertThat(statsLocation.length())
assertThat(statsLocation.exists()).as("stats file should exist").isTrue();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
assertThat(statsLocation.exists()).as("stats file should exist").isTrue();
assertThat(statsLocation).exists();

Assertions.assertThat(statsLocation.exists()).as("stats file should exist").isTrue();
Assertions.assertThat(statsLocation.length())
assertThat(statsLocation.exists()).as("stats file should exist").isTrue();
assertThat(statsLocation.length())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
assertThat(statsLocation.length())
assertThat(statsLocation).hasSize(statisticsFile.fileSizeInBytes())

.as("Deleted files")
.containsExactly(statsLocation.toURI().toString());
Assertions.assertThat(statsLocation.exists()).as("stats file should be deleted").isFalse();
assertThat(statsLocation.exists()).as("stats file should be deleted").isFalse();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
assertThat(statsLocation.exists()).as("stats file should be deleted").isFalse();
assertThat(statsLocation).doesNotExist();

+ "options => map("
+ "'foo', 'bar'))",
catalogName, tableIdent));
Assertions.assertThatThrownBy(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Assertions.assertThatThrownBy(
assertThatThrownBy(


Map<String, String> props = createdTable.properties();
Assert.assertEquals("Should have extra property set", "bar", props.get("foo"));
assertThat(props.get("foo")).as("Should have extra property set").isEqualTo("bar");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please use containsEntry() here

.as("No Snapshoting with Alternate locations with Hadoop Catalogs")
.doesNotContain("hadoop");
String location = temp.toFile().toString();
String snapshotLocation = temp.toFile().toString();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

both now point to the same location, whereas in the previous version of the code they were pointing to different folders, so I think we should fix that

Assert.assertEquals(
"Should override user value", "false", props.get(TableProperties.GC_ENABLED));
assertThat(props.get("snapshot")).as("Should override user value").isEqualTo("true");
assertThat(props.get(TableProperties.GC_ENABLED))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should also use containsEntry()

@tomtongue
Copy link
Contributor Author

Thanks for the review. fix them tomorrow.

Assume.assumeTrue(catalogName.equals("spark_catalog"));
String location = temp.newFolder().toString();
assumeThat(catalogName).isEqualToIgnoringCase("spark_catalog");
String location = temp.toFile().toString();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in other places we use Files.createTempDirectory(temp, "junit").toFile().toString(). We should probably do the same here and in other places from this PR

Copy link
Contributor Author

@tomtongue tomtongue Feb 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other sub-classes in spark-extensions like TestAddFilesProcedure and TestWriteAborts are newly updated with File.createTempDirectory in the new commit.

sql(
"CREATE TABLE %s (id bigint NOT NULL, data string) USING iceberg LOCATION '%s'",
tableName, temp.newFolder());
tableName, temp.toFile());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Files.createTempDirectory(temp, "junit").toFile()

public void setupTempDirs() {
fileTableDir = temp.toFile();
public void setupTempDirs() throws IOException {
fileTableDir = Files.createTempDirectory(temp, "junit").toFile();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

temp is already a temporary directory, so we don't need to do this here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay sure. will revert this part.

*/
package org.apache.iceberg.spark.extensions;

import static java.nio.file.Files.createTempDirectory;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we typically only statically import stuff from AssertJ (and a few other things) but avoid statically importing stuff like these methods, so please use Files.createTempDirectory to be in-line with the rest of the codebase

Copy link
Contributor Author

@tomtongue tomtongue Feb 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nastra Sure, got it. I statically import the method here because the java.nio.file.Files is conflicting with the same class name like org.apache.iceberg.Files. Will define a method for this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there's something better for this, please let me know.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please use java.nio.file.Files.createTempDirectory in that case

Copy link
Contributor Author

@tomtongue tomtongue Feb 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, thank you. Will update all the related lines.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated.

Copy link
Contributor

@nastra nastra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks @tomtongue

@nastra nastra merged commit 56da99b into apache:main Feb 24, 2024
@tomtongue
Copy link
Contributor Author

@nastra Thanks for your reviews! There are still some spark-extensions sub-classes that are necessary to be migrated (I believe they are 5 classes). I will submit the remaining classes.

@tomtongue tomtongue deleted the mig-junit5-procedures-extensions branch February 24, 2024 14:16
bitsondatadev pushed a commit to bitsondatadev/iceberg that referenced this pull request Mar 3, 2024
devangjhabakh pushed a commit to cdouglas/iceberg that referenced this pull request Apr 22, 2024
zachdisc pushed a commit to zachdisc/iceberg that referenced this pull request Dec 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants