@bowenlan-amzn
Member

@bowenlan-amzn bowenlan-amzn commented Nov 10, 2025

Description

This PR fixes a bootstrap failure that occurs when streaming transport is used with remote cluster state. Without this fix, nodes fail to start with the error:

can't overwrite as repositories are already present
	at org.opensearch.repositories.RepositoriesService.updateRepositoriesMap(RepositoriesService.java:885)
	at org.opensearch.node.remotestore.RemoteStoreNodeService.createAndVerifyRepositories(RemoteStoreNodeService.java:163)
	at org.opensearch.node.Node$LocalNodeFactory.apply(Node.java:2355)
	at org.opensearch.node.Node$LocalNodeFactory.apply(Node.java:2323)
	at org.opensearch.transport.TransportService.doStart(TransportService.java:402)
	at org.opensearch.common.lifecycle.AbstractLifecycleComponent.start(AbstractLifecycleComponent.java:77)
	at org.opensearch.node.Node.start(Node.java:1850)

Root Cause

When streaming transport is enabled, the node bootstrap process creates two separate LocalNodeFactory instances:

  1. One for the regular TransportService
  2. Another for the StreamTransportService

During node startup, both transport services are started sequentially:

if (streamTransportService != null) {
    streamTransportService.start();  // First call to LocalNodeFactory.apply()
}
transportService.start();  // Second call to LocalNodeFactory.apply()

Both services inherit TransportService.doStart(), which calls:

localNode = localNodeFactory.apply(transport.boundAddress());

Each call to LocalNodeFactory.apply() triggers:

  1. DiscoveryNode creation
  2. Remote store repository creation and verification via remoteStoreNodeService.createAndVerifyRepositories()

Since both factories attempt to register the same repositories (configured via node attributes), the second call fails with "can't overwrite as repositories are already present".
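
The failure mode described above can be reproduced in isolation. The following is a minimal sketch, not the OpenSearch code: RepositoryRegistry, NodeFactory, the repository name, and the addresses are hypothetical stand-ins, and the registry's "refuse to overwrite" rule is an assumption inferred from the error message.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins for illustration; these are NOT the OpenSearch classes.
class DuplicateRegistrationDemo {

    /** Simplified registry that refuses to register when repositories already exist (assumed behavior). */
    static class RepositoryRegistry {
        private final Map<String, String> repositories = new HashMap<>();

        void registerAll(Map<String, String> incoming) {
            if (!repositories.isEmpty()) {
                throw new IllegalStateException("can't overwrite as repositories are already present");
            }
            repositories.putAll(incoming);
        }
    }

    /** Simplified factory: every apply() call tries to register the configured repositories. */
    static class NodeFactory {
        private final RepositoryRegistry registry;
        private final Map<String, String> configured;

        NodeFactory(RepositoryRegistry registry, Map<String, String> configured) {
            this.registry = registry;
            this.configured = configured;
        }

        String apply(String boundAddress) {
            registry.registerAll(configured);
            return "node@" + boundAddress;
        }
    }

    /** Starts the stream factory, then the regular one; returns the second call's error message, or null. */
    static String startBoth() {
        RepositoryRegistry registry = new RepositoryRegistry();
        Map<String, String> repos = Map.of("remote-state-repo", "fs"); // hypothetical repository
        NodeFactory streamFactory = new NodeFactory(registry, repos);
        NodeFactory regularFactory = new NodeFactory(registry, repos);

        streamFactory.apply("127.0.0.1:9400"); // first call registers the repositories
        try {
            regularFactory.apply("127.0.0.1:9300"); // second call hits the overwrite check
            return null;
        } catch (IllegalStateException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(startBoth());
    }
}
```

Because both factories share one registry and carry the same configuration, whichever service starts second is the one that fails, regardless of which transport it belongs to.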

Solution

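Skip remote store repository creation and verification when LocalNodeFactory.apply() is invoked for the stream transport, so that the repositories are registered only once, by the regular TransportService.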
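
One way to picture this approach is a factory that knows which transport it serves. This is a sketch under stated assumptions only: the forStreamTransport flag, the class names, and the addresses are hypothetical, and the description does not say how the merged change actually distinguishes the two factories.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration; these are NOT the OpenSearch classes.
class SkipRepositoryDemo {

    /** Simplified registry that refuses to register when repositories already exist (assumed behavior). */
    static class RepositoryRegistry {
        private final List<String> repositories = new ArrayList<>();

        void registerAll(List<String> incoming) {
            if (!repositories.isEmpty()) {
                throw new IllegalStateException("can't overwrite as repositories are already present");
            }
            repositories.addAll(incoming);
        }

        int size() {
            return repositories.size();
        }
    }

    /** Factory that skips repository registration when it serves the stream transport. */
    static class NodeFactory {
        private final RepositoryRegistry registry;
        private final List<String> configured;
        private final boolean forStreamTransport; // hypothetical flag, not taken from the PR

        NodeFactory(RepositoryRegistry registry, List<String> configured, boolean forStreamTransport) {
            this.registry = registry;
            this.configured = configured;
            this.forStreamTransport = forStreamTransport;
        }

        String apply(String boundAddress) {
            if (!forStreamTransport) {
                // Only the regular transport creates and verifies repositories.
                registry.registerAll(configured);
            }
            return "node@" + boundAddress;
        }
    }

    /** Starts both factories in the order used at node startup; returns the registered repository count. */
    static int startBoth() {
        RepositoryRegistry registry = new RepositoryRegistry();
        List<String> repos = List.of("remote-state-repo"); // hypothetical repository
        NodeFactory streamFactory = new NodeFactory(registry, repos, true);
        NodeFactory regularFactory = new NodeFactory(registry, repos, false);

        streamFactory.apply("127.0.0.1:9400");  // skips registration
        regularFactory.apply("127.0.0.1:9300"); // registers exactly once
        return registry.size();
    }

    public static void main(String[] args) {
        System.out.println("repositories registered: " + startBoth());
    }
}
```

With the stream factory skipping registration, the startup order of the two services no longer matters for the repositories.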

Related Issues

Resolves #[Issue number to be closed when this PR is merged]

Check List

  • Functionality includes testing.
  • API changes companion pull request created, if applicable.
  • Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@github-actions
Contributor

❌ Gradle check result for 566d7f6: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@bowenlan-amzn bowenlan-amzn force-pushed the streaming-transport-bootstrap branch 2 times, most recently from b64bdc7 to 68fb9d8 Compare November 11, 2025 02:33
@github-actions
Contributor

❌ Gradle check result for 68fb9d8: FAILURE


@bowenlan-amzn bowenlan-amzn changed the title from "Reuse local node factory when streaming transport bootstrap" to "Fix node bootstrap error when enable stream transport and remote cluster state" Nov 11, 2025
@bowenlan-amzn bowenlan-amzn marked this pull request as ready for review November 11, 2025 05:36
@bowenlan-amzn bowenlan-amzn requested a review from a team as a code owner November 11, 2025 05:36
@github-actions
Contributor

✅ Gradle check result for 3688824: SUCCESS

@codecov

codecov bot commented Nov 11, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 73.32%. Comparing base (50f2231) to head (40873de).
⚠️ Report is 4 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff              @@
##               main   #19948      +/-   ##
============================================
+ Coverage     73.25%   73.32%   +0.06%     
- Complexity    71615    71634      +19     
============================================
  Files          5789     5790       +1     
  Lines        327471   327549      +78     
  Branches      47168    47181      +13     
============================================
+ Hits         239905   240160     +255     
+ Misses        68315    68117     -198     
- Partials      19251    19272      +21     


@bowenlan-amzn bowenlan-amzn force-pushed the streaming-transport-bootstrap branch from a7986d3 to 0f95d1c Compare November 17, 2025 06:00
@github-actions
Contributor

❌ Gradle check result for 0f95d1c: null


@github-actions
Contributor

✅ Gradle check result for 0f95d1c: SUCCESS

Signed-off-by: bowenlan-amzn <[email protected]>
Signed-off-by: bowenlan-amzn <[email protected]>
Signed-off-by: bowenlan-amzn <[email protected]>
@bowenlan-amzn bowenlan-amzn force-pushed the streaming-transport-bootstrap branch from 0f95d1c to 3e1a00d Compare November 21, 2025 19:13
@github-actions
Contributor

❌ Gradle check result for 3e1a00d: FAILURE


Signed-off-by: bowenlan-amzn <[email protected]>
@github-actions
Contributor

✅ Gradle check result for 40873de: SUCCESS

@andrross andrross merged commit 3ceeffa into opensearch-project:main Nov 21, 2025
33 checks passed
@bowenlan-amzn bowenlan-amzn deleted the streaming-transport-bootstrap branch November 22, 2025 01:10
kkewwei pushed a commit to kkewwei/OpenSearch that referenced this pull request Nov 28, 2025