Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Signature parsing change with java.lang.classfile backend #1771

Open
cushon opened this issue Feb 13, 2025 · 2 comments
Open

Signature parsing change with java.lang.classfile backend #1771

cushon opened this issue Feb 13, 2025 · 2 comments

Comments

@cushon
Copy link

cushon commented Feb 13, 2025

Hi,

I've been experimenting with testing the new java.lang.classfile backend with JDK 25 EA builds. So far a lot of things seem to just work, but I ran into one regression that I think may be a usage error in the code using ByteBuddy, but that I was interested in input on.

Running a test using apache beam with JDK 25 and bytebuddy at 6441732, I saw:

Caused by: java.lang.IllegalArgumentException: Unexpected character < at position 0, expected a type signature: <T:Ljava/lang/Object;>Ljava/lang/Object;Ljava/io/Serializable;
	at java.base/jdk.internal.classfile.impl.SignaturesImpl.error(SignaturesImpl.java:377)
	at java.base/jdk.internal.classfile.impl.SignaturesImpl.unexpectedError(SignaturesImpl.java:371)
	at java.base/jdk.internal.classfile.impl.SignaturesImpl.referenceTypeSig(SignaturesImpl.java:152)
	at java.base/jdk.internal.classfile.impl.SignaturesImpl.typeSig(SignaturesImpl.java:135)
	at java.base/jdk.internal.classfile.impl.SignaturesImpl.parseSignature(SignaturesImpl.java:98)
	at java.base/java.lang.classfile.Signature.parseFrom(Signature.java:68)
	at codes.rafael.asmjdkbridge.JdkClassWriter$WritingMethodVisitor.lambda$visitLocalVariable$0(JdkClassWriter.java:1008)
	at codes.rafael.asmjdkbridge.JdkClassWriter$WritingMethodVisitor.lambda$visitEnd$3(JdkClassWriter.java:1084)
	at java.base/java.util.ArrayList.forEach(ArrayList.java:1604)
	at codes.rafael.asmjdkbridge.JdkClassWriter$WritingMethodVisitor.lambda$visitEnd$2(JdkClassWriter.java:1084)
	at java.base/jdk.internal.classfile.impl.DirectCodeBuilder.build(DirectCodeBuilder.java:92)
	at java.base/jdk.internal.classfile.impl.DirectMethodBuilder.withCode(DirectMethodBuilder.java:118)
	at java.base/jdk.internal.classfile.impl.DirectMethodBuilder.withCode(DirectMethodBuilder.java:125)
	at codes.rafael.asmjdkbridge.JdkClassWriter$WritingMethodVisitor.lambda$visitEnd$1(JdkClassWriter.java:1083)
	at java.base/jdk.internal.classfile.impl.DirectMethodBuilder.run(DirectMethodBuilder.java:139)
	at java.base/jdk.internal.classfile.impl.DirectClassBuilder.withMethod(DirectClassBuilder.java:117)
	at java.base/java.lang.classfile.ClassBuilder.withMethod(ClassBuilder.java:313)
	at codes.rafael.asmjdkbridge.JdkClassWriter$WritingMethodVisitor.lambda$visitEnd$0(JdkClassWriter.java:1032)
	at codes.rafael.asmjdkbridge.JdkClassWriter.lambda$visitEnd$2(JdkClassWriter.java:1114)
	at java.base/java.util.ArrayList.forEach(ArrayList.java:1604)
	at codes.rafael.asmjdkbridge.JdkClassWriter.lambda$visitEnd$1(JdkClassWriter.java:1114)
	at java.base/jdk.internal.classfile.impl.ClassFileImpl.build(ClassFileImpl.java:145)
	at java.base/java.lang.classfile.ClassFile.build(ClassFile.java:558)
	at codes.rafael.asmjdkbridge.JdkClassWriter.visitEnd(JdkClassWriter.java:1114)
	at net.bytebuddy.jar.asm.ClassVisitor.visitEnd(ClassVisitor.java:395)
	at net.bytebuddy.dynamic.scaffold.TypeWriter$Default$ForCreation.create(TypeWriter.java:6077)
	at net.bytebuddy.dynamic.scaffold.TypeWriter$Default.make(TypeWriter.java:2246)
	at net.bytebuddy.dynamic.DynamicType$Builder$AbstractBase$UsingTypeWriter.make(DynamicType.java:4085)
	at net.bytebuddy.dynamic.DynamicType$Builder$AbstractBase.make(DynamicType.java:3769)
	at net.bytebuddy.dynamic.DynamicType$Builder$AbstractBase$Delegator.make(DynamicType.java:4021)
	at org.apache.beam.sdk.transforms.reflect.ByteBuddyDoFnInvokerFactory.generateInvokerClass(ByteBuddyDoFnInvokerFactory.java:518)

ByteBuddyDoFnInvokerFactory is https://github.com/apache/beam/blob/48743e5c3ae4430a85e5ad6d8cd5e3a28a59417d/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/reflect/ByteBuddyDoFnInvokerFactory.java

Debugging showed the bad signature is coming from here: https://github.com/apache/beam/blob/48743e5c3ae4430a85e5ad6d8cd5e3a28a59417d/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/reflect/ByteBuddyDoFnInvokerFactory.java#L1314

The relevant bit is

this.returnType = targetMethod.getReturnType().asErasure();
...
        mv.visitLocalVariable(
            "res",
            returnType.getDescriptor(),
            returnType.getGenericSignature(),
            wrapStart,
            wrapEnd,
            returnVarIndex);

Apparently targetMethod.getReturnType().asErasure().getGenericSignature() returns a generic method signature which include type variable declarations, which is illegal as a variable signature. I guess if you want an erased type you just shouldn't be passing a signature at all, so maybe the behaviour of .asErasure().getGenericSignature() is working as intended? Updating it to pass null instead of the bogus signature fixes the crash.

The other noteworthy thing here is that java.lang.classfile is doing stricter validation of signatures, which is a good thing, but is also an incompatible change. I wondered if it's worth thinking about adding stricter signature validation to the other bytebuddy backends so the behaviour is consistent, and to help people catch these issues now and prepare for the switch?

cushon added a commit to cushon/beam that referenced this issue Feb 13, 2025
This code is creating a field with the same type as a method return
type. The return type is erased, but calling `getGenericSignature()`
turns the erased type back into a generic type. For methods with type
parameters, this results in signatures that are invalid for field
declarations, for example
`<T:Ljava/lang/Object;>Ljava/lang/Object;Ljava/io/Serializable;`. Future
versions of bytebuddy reject the invalid signature.

See also raphw/byte-buddy#1771
Abacn pushed a commit to apache/beam that referenced this issue Feb 13, 2025
This code is creating a field with the same type as a method return
type. The return type is erased, but calling `getGenericSignature()`
turns the erased type back into a generic type. For methods with type
parameters, this results in signatures that are invalid for field
declarations, for example
`<T:Ljava/lang/Object;>Ljava/lang/Object;Ljava/io/Serializable;`. Future
versions of bytebuddy reject the invalid signature.

See also raphw/byte-buddy#1771
@raphw
Copy link
Owner

raphw commented Feb 13, 2025

This is a misuse of returnType.getGenericSignature() which as you say returns a generic representation of a type declaration, whereas a local variable needs to be declared as the generic representation of a generic type specification.

Likely, this should just be ǹull, or one would retain the generic type from targetMethod.getReturnType()` and use it, what would reflect the correct declaration.

In order to impose the same strictness as the Class File API, one would need to decompose any generic signature of a class file, what would be costly. Also, there are plenty of compilers (Scala, Kotlin) which have generated dozens of faulty signatures over the years that Byte Buddy needs to handle. There are even multiple explicit handlers in the code base, to fallback to a certain behaviour, if a faulty signature is discovered. Changing that default would be painful to many agents, so I'd rather not touch this.

I think however it is a healthy approach to force in the Class File API to validate something.

@cushon
Copy link
Author

cushon commented Feb 13, 2025

That all makes sense, thanks for taking a look. I'll fix that usage to pass null.

In order to impose the same strictness as the Class File API, one would need to decompose any generic signature of a class file, what would be costly

Out of curiosity do you know if that means that the Class File API is making a performance tradeoff to do that validation? Or is ASM doing similar but more lenient parsing, and the issue is that would have to be duplicated in bytebuddy if you wanted to stricter validation?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants