Conversation
Created using spr 1.3.5-bogner
|
@llvm/pr-subscribers-mc Author: Fangrui Song (MaskRay) Changesgas has supported " quoted symbols since 2015: Close #138390 Full diff: https://github.com/llvm/llvm-project/pull/138817.diff 2 Files Affected:
diff --git a/llvm/lib/MC/MCContext.cpp b/llvm/lib/MC/MCContext.cpp
index f70087e14f702..41caf20b331b2 100644
--- a/llvm/lib/MC/MCContext.cpp
+++ b/llvm/lib/MC/MCContext.cpp
@@ -212,6 +212,27 @@ MCDataFragment *MCContext::allocInitialFragment(MCSection &Sec) {
MCSymbol *MCContext::getOrCreateSymbol(const Twine &Name) {
SmallString<128> NameSV;
StringRef NameRef = Name.toStringRef(NameSV);
+ if (NameRef.contains('\\')) {
+ NameSV = NameRef;
+ size_t S = 0;
+ // Support escaped \\ and \" as in GNU Assembler. GAS issues a warning for
+ // other characters following \\, which we do not implement due to code
+ // structure.
+ for (size_t I = 0, E = NameSV.size(); I < E; ++I) {
+ char C = NameSV[I];
+ if (C == '\\') {
+ switch (NameSV[I + 1]) {
+ case '"':
+ case '\\':
+ C = NameSV[++I];
+ break;
+ }
+ }
+ NameSV[S++] = C;
+ }
+ NameSV.resize(S);
+ NameRef = NameSV;
+ }
assert(!NameRef.empty() && "Normal symbols cannot be unnamed!");
diff --git a/llvm/test/MC/ELF/symbol-names.s b/llvm/test/MC/ELF/symbol-names.s
index f605c723d4d4d..8c891052ebd9a 100644
--- a/llvm/test/MC/ELF/symbol-names.s
+++ b/llvm/test/MC/ELF/symbol-names.s
@@ -1,12 +1,25 @@
-// RUN: llvm-mc -triple i686-pc-linux -filetype=obj %s -o - | llvm-readobj --symbols - | FileCheck %s
+// RUN: llvm-mc -triple=x86_64 -filetype=obj %s | llvm-objdump -tdr - | FileCheck %s
// MC allows ?'s in symbol names as an extension.
+// CHECK-LABEL:SYMBOL TABLE:
+// CHECK-NEXT: 0000000000000001 l F .text 0000000000000000 a"b\{{$}}
+// CHECK-NEXT: 0000000000000000 g F .text 0000000000000000 foo?bar
+// CHECK-NEXT: 0000000000000000 *UND* 0000000000000000 a"b\q{{$}}
+// CHECK-EMPTY:
+
.text
.globl foo?bar
.type foo?bar, @function
foo?bar:
ret
-// CHECK: Symbol
-// CHECK: Name: foo?bar
+// CHECK-LABEL:<a"b\>:
+// CHECK-NEXT: callq {{.*}} <a"b\>
+// CHECK-NEXT: callq {{.*}}
+// CHECK-NEXT: R_X86_64_PLT32 a"b\q-0x4
+.type "a\"b\\", @function
+"a\"b\\":
+ call "a\"b\\"
+/// GAS emits a warning for \q
+ call "a\"b\q"
|
Created using spr 1.3.5-bogner
Created using spr 1.3.5-bogner
gas has supported " quoted symbols since 2015: https://sourceware.org/pipermail/binutils/2015-August/090003.html We don't handle \\ or \" , leading to clang -c --save-temps vs clang -c difference for the following C code: ``` int x asm("a\"\\b"); ``` Fix #138390 MC/COFF/safeseh.h looks incorrect. \01 in `.safeseh "\01foo"` is not a correct escape sequence. Change it to \\ Pull Request: llvm/llvm-project#138817
|
This broke the bots it seems: https://lab.llvm.org/buildbot/#/builders/153/builds/31071 |
Reverts #138817 The BOLT testing is failing after this change.
Reverts llvm/llvm-project#138817 The BOLT testing is failing after this change.
gas has supported " quoted symbols since 2015. Both \ and " need to be escaped. https://sourceware.org/pipermail/binutils/2015-August/090003.html We don't unescape \\ or \" in assembly strings, leading to clang -c --save-temps vs clang -c difference for the following C code: ``` int x asm("a\"\\b"); ``` Fix #138390 MC/COFF/safeseh.h looks incorrect. \01 in `.safeseh "\01foo"` is not a correct escape sequence. Change it to \\ Pull Request: #138817
gas has supported " quoted symbols since 2015. Both \ and " need to be escaped. https://sourceware.org/pipermail/binutils/2015-August/090003.html We don't unescape \\ or \" in assembly strings, leading to clang -c --save-temps vs clang -c difference for the following C code: ``` int x asm("a\"\\b"); ``` Fix #138390 MC/COFF/safeseh.h looks incorrect. \01 in `.safeseh "\01foo"` is not a correct escape sequence. Change it to \\ Pull Request: llvm/llvm-project#138817
| } | ||
| NameSV.resize(S); | ||
| NameRef = NameSV; | ||
| } |
There was a problem hiding this comment.
Shouldn't this unescaping only be done when parsing textual assembly in AsmParser, but not when programmatically creating symbols?
I'd expect the LLVM IR @"\\\22" = constant i8 0 to result in "\\\"": rather than "\"":, as it does with this patch. (Previously it resulted in "\\"":, which is also wrong, but at least only affects the case where you're emitting assembly.)
See rust-lang/rust#146065 for a regression from this change. Of course we could easily apply additional escaping on the Rust side, but I think the current behavior here isn't right.
There was a problem hiding this comment.
Sorry, just saw this message. I agree that ideally this should be limited to texual assembly in MC/MCParser/AsmParser.cpp. Unfortunately this is challenging because the wide-used API AsmParser::parseIdentifier (used ~107 times) takes a StringRef instead of an owned string. It seems a lot of work to fix parseIdentifier...
There was a problem hiding this comment.
Yes, this is a problem. I put up #158106 to put at least some of these uses behind a common API.
A simple option would be to just keep StringRef and have the backing storage in an Allocator. This would avoid API changes and should not really be a performance or memory usage problem as long as quoted symbols are rare.
Though I'm not sure whether all uses of parseIdentifier should parse quoted strings or not.
The MCContext::getOrCreateSymbol change in llvm#138817 was a workaround. With llvm#158106, we can replace `getOrCreateSymbol` with `parseSymbol`, in llvm/lib/MC/MCParser to handle backslash-escaped symbols.
The MCContext::getOrCreateSymbol change in llvm#138817 was a workaround. With llvm#158106, we can replace `getOrCreateSymbol` with `parseSymbol`, in llvm/lib/MC/MCParser to handle backslash-escaped symbols. (cherry picked from commit 0cf6688)
The MCContext::getOrCreateSymbol change in llvm#138817 was a workaround. With llvm#158106, we can replace `getOrCreateSymbol` with `parseSymbol`, in llvm/lib/MC/MCParser to handle backslash-escaped symbols. (cherry picked from commit 0cf6688)
The MCContext::getOrCreateSymbol change in llvm#138817 was a workaround. With llvm#158106, we can replace `getOrCreateSymbol` with `parseSymbol`, in llvm/lib/MC/MCParser to handle backslash-escaped symbols. (cherry picked from commit 0cf6688)
gas has supported " quoted symbols since 2015:
https://sourceware.org/pipermail/binutils/2015-August/090003.html
We don't handle \ or " , leading to clang -c --save-temps vs clang -c
difference for the following C code:
Fix #138390
MC/COFF/safeseh.h looks incorrect. \01 in
.safeseh "\01foo"is not acorrect escape sequence. Change it to \