Move HDFS-repo YAML to Rest tests #140142

Merged
mhl-b merged 6 commits into elastic:main from mhl-b:hdfs-yaml-to-rest-tests
Jan 20, 2026

Conversation

@mhl-b
Contributor

mhl-b commented Jan 3, 2026

Move HDFS repository tests from YAML to Rest. I kept the original structure of the tests, with slight changes around grouping create/get/delete into a single test. The assertions are the same.

There is a substantial time gap between finding a free port and binding it in the HDFS fixture, which can lead to java.net.BindException: Address already in use (https://gradle-enterprise.elastic.co/s/inqfl34rgjszo/failures#1). With Rest tests there is no need for early port binding: HDFS binds to an ephemeral port that can be queried at repository creation time.
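The race and its fix can be illustrated with plain java.net (a minimal sketch only; the fixture itself configures MiniDFSCluster's name node port, not a ServerSocket):

```java
import java.io.IOException;
import java.net.ServerSocket;

public class EphemeralPortSketch {
    public static void main(String[] args) throws IOException {
        // Racy pattern: discover a free port, close the socket, then bind
        // later - another process can grab the port in between, producing
        // "Address already in use".
        //
        // Ephemeral pattern: bind to port 0 directly; the OS assigns a free
        // port atomically, and the caller queries it after the bind.
        try (ServerSocket socket = new ServerSocket(0)) {
            int boundPort = socket.getLocalPort(); // queried after binding
            System.out.println(boundPort > 1023);  // ephemeral ports sit above the well-known range
        }
    }
}
```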

@mhl-b mhl-b added Team:Data Management (obsolete) DO NOT USE. This team no longer exists. :Distributed/HDFS HDFS repository issues >test Issues or PRs that are addressing/adding tests labels Jan 3, 2026
@mhl-b mhl-b requested a review from DaveCTurner January 3, 2026 07:49
@mhl-b mhl-b marked this pull request as ready for review January 3, 2026 07:49
@elasticsearchmachine
Collaborator

Pinging @elastic/es-data-management (Team:Data Management)

Contributor

@DaveCTurner left a comment

I like it. Left a few comments about ways we can strengthen these tests a bit - not technically necessary to make this change but nice to have if you don't mind doing a bit more work. Also a few other comments.

Comment on lines 2299 to 2304
try {
    client().performRequest(new Request("GET", "/_snapshot/" + repoName));
    fail("repository [" + repoName + "] must not exist");
} catch (ResponseException e) {
    assertEquals(404, e.getResponse().getStatusLine().getStatusCode());
}
Contributor

suggest expectThrows() for this pattern
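For reference, the suggested pattern collapses the try/fail/catch into a single call. A self-contained sketch follows; expectThrows here is a stand-in for the helper the Elasticsearch test framework provides, and a plain IllegalStateException stands in for ResponseException:

```java
public class ExpectThrowsSketch {
    interface ThrowingRunnable {
        void run() throws Throwable;
    }

    // Minimal stand-in for the test framework's expectThrows helper: runs
    // the code, fails unless it throws the expected type, and returns the
    // caught exception for further assertions.
    static <T extends Throwable> T expectThrows(Class<T> expected, ThrowingRunnable runnable) {
        try {
            runnable.run();
        } catch (Throwable t) {
            if (expected.isInstance(t)) {
                return expected.cast(t);
            }
            throw new AssertionError("unexpected exception type: " + t.getClass(), t);
        }
        throw new AssertionError("expected " + expected.getName() + " but nothing was thrown");
    }

    public static void main(String[] args) {
        // In the PR this would wrap the GET /_snapshot request and then
        // assert on the response status code of the caught exception.
        IllegalStateException e = expectThrows(IllegalStateException.class, () -> {
            throw new IllegalStateException("404");
        });
        System.out.println(e.getMessage());
    }
}
```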


  MiniDFSCluster.Builder builder = new MiniDFSCluster.Builder(cfg);
- builder.nameNodePort(explicitPort);
+ builder.nameNodePort(0);
Contributor

Nice.

  public class RepositoryHdfsRestIT extends AbstractRepositoryHdfsRestIT {

-     public static HdfsFixture hdfsFixture = new HdfsFixture();
+     static final HdfsFixture hdfs = new HdfsFixture();
Contributor

nit: would prefer to keep the old name hdfsFixture here

Comment on lines 45 to 68
@Override
public void testCreateGetDeleteRepository() throws IOException {
    super.testCreateGetDeleteRepository();
}

@Override
public void testCreateAndVerifyRepository() throws IOException {
    super.testCreateAndVerifyRepository();
}

@Override
public void testCreateListDeleteSnapshots() throws IOException {
    super.testCreateListDeleteSnapshots();
}

@Override
public void testCreateReadOnlyRepo() throws IOException {
    super.testCreateReadOnlyRepo();
}

@Override
public void testRestore() throws IOException {
    super.testRestore();
}
Contributor

Why override these methods if we're just deferring right back to super()?

Contributor Author

Maybe it's just me, but running individual tests in IntelliJ does not work well when I try to execute one from the abstract class: it always goes through the IntelliJ runner, not Gradle. So I keep them for convenience.

Contributor Author

Removed.

public void testCreateGetDeleteRepository() throws IOException {
    final var repoName = randomIdentifier();
    final var path = "test/" + randomIdentifier();
    for (int i = 0; i < 10; i++) {
Contributor

Twice should be enough, right?

Contributor Author

Switched to at least twice.

}

final var listSnapshots = listAllSnapshots(repoName);
assertEquals(snapshotNames.size(), ((List<?>) listSnapshots.get("snapshots")).size());
Contributor

Could we sometimes unregister and re-register the repo at this point to confirm the snapshots persist across that?

Contributor Author

Done

final var path = "/user/elasticsearch/existing/readonly-repository";
registerHdfsRepository(hdfsUri0(), repoName, path, false, true);
final var listSnapshots = listAllSnapshots(repoName);
assertEquals(1, ((List<?>) listSnapshots.get("snapshots")).size());
Contributor

Where does this 1 snapshot come from?

Contributor Author

@mhl-b commented Jan 5, 2026

From the existing path "/user/elasticsearch/existing/readonly-repository". I tried a random path and it didn't work (obviously), then used the same path as the YAML test and it worked, and I was happy with that.

From the fixture:

// Install a pre-existing repository into HDFS
String directoryName = "readonly-repository";
String archiveName = directoryName + ".tar.gz";
URL readOnlyRepositoryArchiveURL = getClass().getClassLoader().getResource(archiveName);
if (readOnlyRepositoryArchiveURL != null) {
    Path tempDirectory = Files.createTempDirectory(getClass().getName());
    File readOnlyRepositoryArchive = tempDirectory.resolve(archiveName).toFile();
    FileUtils.copyURLToFile(readOnlyRepositoryArchiveURL, readOnlyRepositoryArchive);
    FileUtil.unTar(readOnlyRepositoryArchive, tempDirectory.toFile());
    fs.copyFromLocalFile(
        true,
        true,
        new org.apache.hadoop.fs.Path(tempDirectory.resolve(directoryName).toAbsolutePath().toUri()),
        esUserPath.suffix("/existing/" + directoryName)
    );
    FileUtils.deleteDirectory(tempDirectory.toFile());
}

Contributor Author

I will extract the hardcoded path from the fixture into the test.

Contributor

👍 thanks, I was looking for the string existing/readonly-repository but drew a blank

assertEquals(snapshotNames.size(), ((List<?>) listSnapshots.get("snapshots")).size());

for (var snapshotName : snapshotNames) {
    deleteSnapshot(repoName, snapshotName, false);
Contributor

Could we confirm that if we delete a snapshot then it's not returned when you list the repository?

Contributor Author

Done. Now asserting in a loop that the remaining snapshots do not contain the deleted ones.

deleteRepository(repoName);
}

public void testCreateReadOnlyRepo() throws IOException {
Contributor

Could we confirm in this case that creating or deleting a snapshot fails with an appropriate error?

Contributor Author

Stumbled over error codes: I was expecting 400 but got 500. Fixing in a separate PR, #140200; once that is merged I will proceed with this one.

Comment on lines +101 to +108
final var indexRecovery = getIndexRecovery(indexName);
final var shard0 = indexName + ".shards.0.";
assertEquals("SNAPSHOT", indexRecovery.get(shard0 + "type"));
assertEquals("DONE", indexRecovery.get(shard0 + "stage"));
assertEquals(Integer.valueOf(1), indexRecovery.get(shard0 + "index.files.recovered"));
assertTrue((int) indexRecovery.get(shard0 + "index.size.recovered_in_bytes") >= 0);
assertEquals(Integer.valueOf(0), indexRecovery.get(shard0 + "index.files.reused"));
assertEquals(Integer.valueOf(0), indexRecovery.get(shard0 + "index.size.reused_in_bytes"));
Contributor

I'd kinda like us to verify that a document indexed before creating the snapshot is visible to searches at this point.

Contributor Author

Hm, which document? My understanding is that the original test created only an index, without documents.

Contributor

Yes I think this was an unnecessary omission, ideally we'd confirm that the snapshotted data was actually restored and not just an empty index. Not essential, just a nice-to-have.

@DaveCTurner
Contributor

Oh yeah, one more thing: we have a YAML test that verifies you can't create an HDFS repository with a file:// URL, but that has not been replicated here.

@mhl-b
Contributor Author

mhl-b commented Jan 5, 2026

Thanks David, I'll address all of the above.

@mhl-b mhl-b force-pushed the hdfs-yaml-to-rest-tests branch from 396f643 to 6fd907c on January 9, 2026 21:18
@mhl-b mhl-b requested a review from DaveCTurner January 9, 2026 23:18
@mhl-b
Contributor Author

mhl-b commented Jan 10, 2026

Hold on, I forgot the 'file://' URL test.

@mhl-b
Contributor Author

mhl-b commented Jan 12, 2026

Ready for review

Contributor

@DaveCTurner left a comment

LGTM


@mhl-b mhl-b merged commit d21f790 into elastic:main Jan 20, 2026
35 checks passed
spinscale pushed a commit to spinscale/elasticsearch that referenced this pull request Jan 21, 2026

Labels

:Distributed/HDFS HDFS repository issues Team:Data Management (obsolete) DO NOT USE. This team no longer exists. >test Issues or PRs that are addressing/adding tests v9.4.0
