Generate right-length node name #30491

turing85 · 2023-01-19T16:47:01Z

Background

Currently, when we pass a value to quarkus.transaction-manager.node-name that is too long, the application fails to start with an exception:

2023-01-19 17:37:06,788 WARN  [com.arj.ats.arjuna] (Quarkus Main Thread) ARJUNA012138: Node name cannot exceed 28 bytes!
2023-01-19 17:37:06,788 WARN  [com.arj.ats.arjuna] (Quarkus Main Thread) ARJUNA012138: Node name cannot exceed 28 bytes!
2023-01-19 17:37:06,797 ERROR [io.qua.run.Application] (Quarkus Main Thread) Failed to start application (with profile dev): java.lang.IllegalArgumentException
	at com.arjuna.ats.arjuna.coordinator.TxControl.setXANodeName(TxControl.java:178)
	at io.quarkus.narayana.jta.runtime.NarayanaJtaRecorder.setNodeName(NarayanaJtaRecorder.java:36)
	at io.quarkus.deployment.steps.NarayanaJtaProcessor$build75739834.deploy_0(Unknown Source)
	at io.quarkus.deployment.steps.NarayanaJtaProcessor$build75739834.deploy(Unknown Source)
	at io.quarkus.runner.ApplicationImpl.doStart(Unknown Source)
	at io.quarkus.runtime.Application.start(Application.java:101)
	at io.quarkus.runtime.ApplicationLifecycleManager.run(ApplicationLifecycleManager.java:110)
	at io.quarkus.runtime.Quarkus.run(Quarkus.java:69)
	at io.quarkus.runtime.Quarkus.run(Quarkus.java:42)
	at io.quarkus.runtime.Quarkus.run(Quarkus.java:122)
	at io.quarkus.runner.GeneratedMain.main(Unknown Source)
	at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:104)
	at java.base/java.lang.reflect.Method.invoke(Method.java:578)
	at io.quarkus.runner.bootstrap.StartupActionImpl$1.run(StartupActionImpl.java:103)
	at java.base/java.lang.Thread.run(Thread.java:1589)

2023-01-19 17:37:06,797 ERROR [io.qua.run.Application] (Quarkus Main Thread) Failed to start application (with profile dev): java.lang.IllegalArgumentException
	at com.arjuna.ats.arjuna.coordinator.TxControl.setXANodeName(TxControl.java:178)
	at io.quarkus.narayana.jta.runtime.NarayanaJtaRecorder.setNodeName(NarayanaJtaRecorder.java:36)
	at io.quarkus.deployment.steps.NarayanaJtaProcessor$build75739834.deploy_0(Unknown Source)
	at io.quarkus.deployment.steps.NarayanaJtaProcessor$build75739834.deploy(Unknown Source)
	at io.quarkus.runner.ApplicationImpl.doStart(Unknown Source)
	at io.quarkus.runtime.Application.start(Application.java:101)
	at io.quarkus.runtime.ApplicationLifecycleManager.run(ApplicationLifecycleManager.java:110)
	at io.quarkus.runtime.Quarkus.run(Quarkus.java:69)
	at io.quarkus.runtime.Quarkus.run(Quarkus.java:42)
	at io.quarkus.runtime.Quarkus.run(Quarkus.java:122)
	at io.quarkus.runner.GeneratedMain.main(Unknown Source)
	at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:104)
	at java.base/java.lang.reflect.Method.invoke(Method.java:578)
	at io.quarkus.runner.bootstrap.StartupActionImpl$1.run(StartupActionImpl.java:103)
	at java.base/java.lang.Thread.run(Thread.java:1589)

It would be a better developer experience when if quarkus-narayana-jta transformed the value to a valid value.

Story

AS a developer
WHEN I provide a value for quarkus.transaction-manager.node-name
THEN the extension quarkus-narayana-jta guarantees that the value is transformed to a valid node name if necessary.

Implementation ideas

No response

The text was updated successfully, but these errors were encountered:

quarkus-bot · 2023-01-19T18:28:41Z

/cc @mmusgrov (narayana)

maxandersen · 2023-01-19T18:29:52Z

Wondering if we should just sha any value longer than 28(?) bytes ?

What could go wrong :)

turing85 · 2023-01-20T15:07:16Z

FTR: relevant zulip discussion

mmusgrov · 2023-04-19T11:59:14Z

@maxandersen Wouldn't truncating the hash increase the probability of a collision.

If there is someone else here with a background in hashing functions that can provide an estimate of the two probabilities (truncated hash versus the truncated hashes of any of the other pods) then we can decide how exposed we'd be if we go with truncating the hash.

mmusgrov · 2023-04-19T13:04:44Z

But if the hashes for two different nodes collide then you could end up with two transaction recovery managers independently recovering the same in doubt transaction, probably resulting in a heuristic outcome, that would be wrong ;-)

maxandersen · 2023-04-20T01:26:38Z

So your concern is that a sha256 is 32 characters and we only have room for 28?

Where does that 28 size come from anyway ? ::)

maxandersen · 2023-04-20T01:38:45Z

Just realizing sha-224 exists which would fit into 28 bytes. Thus not be truncated.

My understanding is that collisions on small values is still very unlikely - one can potentially have a hit if one generate a very long string that coincidentally has same hash. But that is not bound to happen in any realistic usage of these key to signify identity.

Don't know the exact numbers of these but should be possible to find.

maxandersen · 2023-04-20T02:12:44Z

Okey so late night reading. The hash collision is the same as the birthday problem. This article explains it pretty well https://kevingal.com/blog/collisions.html with a calculator https://kevingal.com/apps/collision.html

Let's say we expect there to be on average 4(?) nodes participating; how likely is it their values will have hash collisions given their values are most likely unique (i.e. host names are user entered keys with the intention to be unique) then the simplified calculation is:

4^2/2*(2^224) = 215679573337205118357336120696157045389097155380324579848828881993728

Approximate: 2.15*10^68

It's way more likely a asteroid hits and eliminate the planet in our lifetime than there will be collisions Afaics :)

maxandersen · 2023-04-20T02:16:40Z

Ps. Check my math - I simply based it on the article explaining the approximation for the birthday problem.

mmusgrov · 2023-04-20T09:01:29Z

Interesting, this is a useful analysis.

The limitation on the size of the node name is because we include it in the (X/Open) Xid which is a fixed size data structure.
When deployed on k8s we need to further restrict the node name/pod name length to 23 characters.
We would need to reverse the hash to get back the node name (I think this is what Jonathan was referring to in the linked zulip chat) but hash functions are not reversible in general. In the chat someone did suggest a way to "re-establish" the connection between the cryptic value and the actual name.

But I think you have kick-started another round of discussions about how to work around the issues ...

maxandersen · 2023-04-23T09:05:16Z

Is there a technical reason for needing the reversal? If yes; then we should consider using compression instead.

If it's just to locate the node; doesn't the protocol allow to annotate/tag the log to get more context ?

turing85 · 2023-05-06T22:43:10Z

...When deployed on k8s we need to further restrict the node name/pod name length to 23 characters. ...

@mmusgrov Why is that? I know that there is a limit on route URLs (I think of 63 characters). I definitively know that we have pod-names with more than 23 and 28 character in production in an OpenShift cluster.

turing85 · 2023-05-06T22:55:50Z

Just realizing sha-224 exists which would fit into 28 bytes. Thus not be truncated.

@maxandersen while this is true, not all of the generated bytes represent printable characters. So sha224sum would actually result in a 56-character long hexdec string (28 * 2). Not sure how relevant this is for the X/Open Xid. Maybe @mmusgrov has some additional input?

turing85 · 2023-05-06T23:41:48Z

Okey so late night reading. The hash collision is the same as the birthday problem. This article explains it pretty well https://kevingal.com/blog/collisions.html with a calculator https://kevingal.com/apps/collision.html

Let's say we expect there to be on average 4(?) nodes participating; how likely is it their values will have hash collisions given their values are most likely unique (i.e. host names are user entered keys with the intention to be unique) then the simplified calculation is:

4^2/2*(2^224) = 215679573337205118357336120696157045389097155380324579848828881993728

Approximate: 2.15*10^68

It's way more likely a asteroid hits and eliminate the planet in our lifetime than there will be collisions Afaics :)

Not sure the math is correct, but the order of magnitude seems sensible.

So basically

f(k, n) = prod_(i = 0)^(k - 1) ((2^n - i) / (2^n))

should be the probability to have k hashes of bit-length n that do not have a collision (f ~ "(collision-)free").

Conversely,

c(k, n) = 1 - f(k, n) 
        = 1 - prod_(i = 0)^(k - 1) ((2^n - i) / (2^n))

is the probability that at least one collision occurs (c ~ "collision").

Wolframalpha refuses to calculate the value c(4, 244) (wolframalpha.com), but happily computates c(4,1000) as 0 (wolframalpha.com), so the probability seems astronomically low.

If we let wolframalpha calculate f(4, 224) instead, it is pretty darn close to 1 (wolframalpah.com).

so yeah... sha224 should be sufficient.

turing85 · 2023-05-07T00:01:12Z

Okay, I used the approximation used by the link provided by @maxandersen and calculated the number of nodes necessary to reach a probability of 0.01 = 1% for a collision (wolframalpha.com), which comes down to something in the range of 10^32. For comparison: the estimated number of grains of sand on earth is in the order of 10^18. So yeah... this should be sufficient.

turing85 · 2023-09-07T19:18:12Z

bump so... do we wanna do this?

mmusgrov · 2023-09-13T09:23:42Z

I'm not sure what to say here other than to state our needs:

We embed the nodeIdentifier into XIDs (these identify resource manager XA branches) and we need the nodeIdentifier to stop independent recovery managers from trying to recover the same branch.
The XID data structure is a fixed size and after we have put all of the other information, that we and XA demand, into the structure we are only left with 23 bytes for the nodeIdentifier.

So our key requirement is uniqueness and I don't feel sufficiently qualified to comment on the solutions that Marco and Max have presented here. So to answer your question, we would like to do this but I'm uncomfortable with the solutions presented so far. Can we not detect the problem, of someone setting too long a node name, at build time?

turing85 · 2023-10-21T05:53:55Z

@mmusgrov The node identifier is something one would set at run time. Think, for example, the pod name of a pod in a stateful set. We cannot detect this value at build time. Also: where are the 23 bytes coming from if the error message says 28 bytes?

mmusgrov · 2023-10-27T13:27:10Z

Thanks.

The 23 byte node identifier limit is something that was originally imposed when running WildFly on OpenShift and subsequently got added as a WildFly restriction too and there are a number of JIRAs for it, none of which say why. I've asked around a few times without success and the two people that I think could know the answer no longer work in this area (and no longer work for RedHat). Perhaps it's just one of those things that got "lost in the mists of time" and is no longer relevant, I know that OpenShift has changed a lot since the restriction was added. So let's just assume this isn't applicable to quarkus, I'm comfortable doing that. Note that the 28 bytes for the datum is still a hard requirement since we inherit this limitation from the XA specification which constrains the size of an Xid.

turing85 · 2023-10-27T13:38:48Z

28 bytes is fine, we have hash-algorithms that generate 28-byte hashes. But there is nothing that generates 23-byte hashes.

turing85 · 2023-10-27T13:48:44Z

So would an identifier like "Rӽ[/d��h�}1�:��o¦Z��2��" be fine, or is this problematic? (This is basically the result of hashing some string with SHA224)

mmusgrov · 2023-10-27T13:54:50Z

That would be fine with me. I know it was said elsewhere that this would make debugging a pain but that complaint can be side-stepped if the boot log can report the original value prior to hashing?

turing85 · 2023-10-27T14:00:53Z

Printing the initial value is no problem. So the last question... do we want to always hash, or only if the name is too long (and print information accordingly)?

mmusgrov · 2023-10-27T14:06:35Z

I don't know. But if you only hash when necessary then if someone complained about it then we could tell them to use a shorter node name so personally I'd go for only hash if necessary.

mmusgrov · 2023-10-27T14:09:25Z

Oh, and thanks very much for sorting through the issues and coming up with a good solution.

… with SHA-224

turing85 · 2023-10-27T14:46:56Z

@maxandersen @mmusgrov PR is open. You are welcome to review it 🙂
When the issue is closed, I'd love to see a backport to 3.2.x.

… with SHA-224

Generate right-length nodeIdentifier if nodeIdentifier is longer than 28 bytes, the XID generated by narayana ist too long, and it breaks. This shortens it, while nodeIdentifier remains unique. Fix ported from quarkus quarkusio/quarkus#30491

… with SHA-224

Generate right-length nodeIdentifier if nodeIdentifier is longer than 28 bytes, the XID generated by narayana ist too long, and it breaks. This shortens it, while nodeIdentifier remains unique. Fix ported from quarkus quarkusio/quarkus#30491

* Update NarayanaPropertiesInitializer.java Generate right-length nodeIdentifier if nodeIdentifier is longer than 28 bytes, the XID generated by narayana ist too long, and it breaks. This shortens it, while nodeIdentifier remains unique. Fix ported from quarkus quarkusio/quarkus#30491 * Replace deprecated API usage * Compute XA recovery nodes from node identifier by default * Upgrade versions --------- Co-authored-by: deepred-dev <[email protected]> Co-authored-by: Benjamin Graf <[email protected]>

turing85 added the kind/enhancement New feature or request label Jan 19, 2023

quarkus-bot bot added the triage/needs-triage label Jan 19, 2023

maxandersen added area/narayana Transactions / Narayana and removed triage/needs-triage labels Jan 19, 2023

turing85 added a commit to turing85/quarkus that referenced this issue Oct 27, 2023

quarkusio#30491: If the node name is longer than 28 bytes, shorten it…

e29d3a8

… with SHA-224

turing85 mentioned this issue Oct 27, 2023

If the node name is longer than 28 bytes, shorten it with SHA-224 #36752

Merged

turing85 added a commit to turing85/quarkus that referenced this issue Oct 31, 2023

quarkusio#30491: If the node name is longer than 28 bytes, shorten it…

07b3a4c

… with SHA-224

gsmet closed this as completed in #36752 Nov 6, 2023

quarkus-bot bot added this to the 3.6 - main milestone Nov 6, 2023

DavideD pushed a commit to DavideD/quarkus that referenced this issue Nov 27, 2023

quarkusio#30491: If the node name is longer than 28 bytes, shorten it…

7fa76ca

… with SHA-224

deepred-dev mentioned this issue Feb 6, 2024

Generate right-length nodeIdentifier/ XID snowdrop/narayana-spring-boot#132

Closed

holly-cummins pushed a commit to holly-cummins/quarkus that referenced this issue Feb 8, 2024

quarkusio#30491: If the node name is longer than 28 bytes, shorten it…

a74521f

… with SHA-224

marcosgopen mentioned this issue Feb 26, 2024

Narayana update to 7.0.1.Final #39009

Merged

This was referenced Apr 22, 2024

Revert "Generate right-length nodeIdentifier/ XID" snowdrop/narayana-spring-boot#141

Merged

Patch from #30491 "Generate right-length node name" seems invalid #40256

Closed

marcosgopen mentioned this issue May 20, 2024

Ensure the Transaction Manager node name is less than or equal to 28 bytes #40613

Merged

mmusgrov mentioned this issue Jul 30, 2024

JBTM-3883 use a base64 encoder if required jbosstm/narayana#2281

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Generate right-length node name #30491

Generate right-length node name #30491

turing85 commented Jan 19, 2023 •

edited

Loading

quarkus-bot bot commented Jan 19, 2023

maxandersen commented Jan 19, 2023

turing85 commented Jan 20, 2023

mmusgrov commented Apr 19, 2023 •

edited

Loading

mmusgrov commented Apr 19, 2023

maxandersen commented Apr 20, 2023

maxandersen commented Apr 20, 2023

maxandersen commented Apr 20, 2023 •

edited

Loading

maxandersen commented Apr 20, 2023

mmusgrov commented Apr 20, 2023

maxandersen commented Apr 23, 2023

turing85 commented May 6, 2023

turing85 commented May 6, 2023

turing85 commented May 6, 2023 •

edited

Loading

turing85 commented May 7, 2023

turing85 commented Sep 7, 2023

mmusgrov commented Sep 13, 2023 •

edited

Loading

turing85 commented Oct 21, 2023 •

edited

Loading

mmusgrov commented Oct 27, 2023

turing85 commented Oct 27, 2023

turing85 commented Oct 27, 2023 •

edited

Loading

mmusgrov commented Oct 27, 2023

turing85 commented Oct 27, 2023

mmusgrov commented Oct 27, 2023 •

edited

Loading

mmusgrov commented Oct 27, 2023

turing85 commented Oct 27, 2023

Generate right-length node name #30491

Generate right-length node name #30491

Comments

turing85 commented Jan 19, 2023 • edited Loading

Background

Story

Implementation ideas

quarkus-bot bot commented Jan 19, 2023

maxandersen commented Jan 19, 2023

turing85 commented Jan 20, 2023

mmusgrov commented Apr 19, 2023 • edited Loading

mmusgrov commented Apr 19, 2023

maxandersen commented Apr 20, 2023

maxandersen commented Apr 20, 2023

maxandersen commented Apr 20, 2023 • edited Loading

maxandersen commented Apr 20, 2023

mmusgrov commented Apr 20, 2023

maxandersen commented Apr 23, 2023

turing85 commented May 6, 2023

turing85 commented May 6, 2023

turing85 commented May 6, 2023 • edited Loading

turing85 commented May 7, 2023

turing85 commented Sep 7, 2023

mmusgrov commented Sep 13, 2023 • edited Loading

turing85 commented Oct 21, 2023 • edited Loading

mmusgrov commented Oct 27, 2023

turing85 commented Oct 27, 2023

turing85 commented Oct 27, 2023 • edited Loading

mmusgrov commented Oct 27, 2023

turing85 commented Oct 27, 2023

mmusgrov commented Oct 27, 2023 • edited Loading

mmusgrov commented Oct 27, 2023

turing85 commented Oct 27, 2023

turing85 commented Jan 19, 2023 •

edited

Loading

mmusgrov commented Apr 19, 2023 •

edited

Loading

maxandersen commented Apr 20, 2023 •

edited

Loading

turing85 commented May 6, 2023 •

edited

Loading

mmusgrov commented Sep 13, 2023 •

edited

Loading

turing85 commented Oct 21, 2023 •

edited

Loading

turing85 commented Oct 27, 2023 •

edited

Loading

mmusgrov commented Oct 27, 2023 •

edited

Loading