Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Sema] Mark alias/ifunc targets used and consider mangled names #87130

Conversation

MaskRay
Copy link
Member

@MaskRay MaskRay commented Mar 30, 2024

https://reviews.llvm.org/D54188 marked "alias" targets as used in C to
fix -Wunused false positives. This patch extends the approach to handle
mangled names to support global scope names in C++ and the
overloadable attribute in C.

In addition, we mark ifunc targets as used to fix #63957.

While our approach has false negatives for namespace scope names, the
majority of alias/ifunc C++ uses (global scope with no overloads) are
handled.

Note: The following function with internal linkage but C language
linkage type is mangled in Clang but not in GCC. This inconsistency
makes alias/ifunc difficult to use in C++ with portability (#88593).

extern "C" {
static void f0() {}
// GCC: void g0() __attribute__((alias("_ZL2f0v")));
// Clang: void g0() __attribute__((alias("f0")));
}

Created using spr 1.3.5-bogner
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:frontend Language frontend issues, e.g. anything involving "Sema" labels Mar 30, 2024
@llvmbot
Copy link
Member

llvmbot commented Mar 30, 2024

@llvm/pr-subscribers-clang

Author: Fangrui Song (MaskRay)

Changes

https://reviews.llvm.org/D54188 marked "alias" targets as used in C to
fix -Wunused false positives. This patch extends the approach to
handle mangled names to support global scope names in C++. We fix false
positives while causing false negatives when overloads are present.

In addition, we mark ifunc targets as used to fix #63957.

While our approach has false negatives and does not handle namespace
scope names, the majority of alias/ifunc C++ uses (global scope with no
overloads) are handled.


Full diff: https://github.com/llvm/llvm-project/pull/87130.diff

4 Files Affected:

  • (modified) clang/lib/Sema/CMakeLists.txt (+1)
  • (modified) clang/lib/Sema/SemaDeclAttr.cpp (+20-11)
  • (modified) clang/test/AST/ast-dump-attr-json.cpp (+1)
  • (modified) clang/test/Sema/alias-unused.c (+29-1)
diff --git a/clang/lib/Sema/CMakeLists.txt b/clang/lib/Sema/CMakeLists.txt
index e8bff07ced0cfa..7657536fabd1eb 100644
--- a/clang/lib/Sema/CMakeLists.txt
+++ b/clang/lib/Sema/CMakeLists.txt
@@ -1,5 +1,6 @@
 set(LLVM_LINK_COMPONENTS
   Core
+  Demangle
   FrontendHLSL
   FrontendOpenMP
   MC
diff --git a/clang/lib/Sema/SemaDeclAttr.cpp b/clang/lib/Sema/SemaDeclAttr.cpp
index f25f3afd0f4af2..8ed5242f325e5e 100644
--- a/clang/lib/Sema/SemaDeclAttr.cpp
+++ b/clang/lib/Sema/SemaDeclAttr.cpp
@@ -42,6 +42,7 @@
 #include "clang/Sema/SemaInternal.h"
 #include "llvm/ADT/STLExtras.h"
 #include "llvm/ADT/StringExtras.h"
+#include "llvm/Demangle/Demangle.h"
 #include "llvm/IR/Assumptions.h"
 #include "llvm/MC/MCSectionMachO.h"
 #include "llvm/Support/Error.h"
@@ -1980,6 +1981,23 @@ static void handleWeakRefAttr(Sema &S, Decl *D, const ParsedAttr &AL) {
   D->addAttr(::new (S.Context) WeakRefAttr(S.Context, AL));
 }
 
+// Mark alias/ifunc target as used. For C++, we look up the demangled name
+// ignoring parameters. This should handle the majority of use cases while
+// leaveing false positives for namespace scope names and false negatives in
+// the presence of overloads.
+static void markUsedForAliasOrIfunc(Sema &S, Decl *D, const ParsedAttr &AL,
+                                    StringRef Str) {
+  char *Demangled = llvm::itaniumDemangle(Str, /*ParseParams=*/false);
+  if (Demangled)
+    Str = Demangled;
+  const DeclarationNameInfo target(&S.Context.Idents.get(Str), AL.getLoc());
+  LookupResult LR(S, target, Sema::LookupOrdinaryName);
+  if (S.LookupQualifiedName(LR, S.getCurLexicalContext()))
+    for (NamedDecl *ND : LR)
+      ND->markUsed(S.Context);
+  free(Demangled);
+}
+
 static void handleIFuncAttr(Sema &S, Decl *D, const ParsedAttr &AL) {
   StringRef Str;
   if (!S.checkStringLiteralArgumentAttr(AL, 0, Str))
@@ -1992,6 +2010,7 @@ static void handleIFuncAttr(Sema &S, Decl *D, const ParsedAttr &AL) {
     return;
   }
 
+  markUsedForAliasOrIfunc(S, D, AL, Str);
   D->addAttr(::new (S.Context) IFuncAttr(S.Context, AL, Str));
 }
 
@@ -2026,17 +2045,7 @@ static void handleAliasAttr(Sema &S, Decl *D, const ParsedAttr &AL) {
     }
   }
 
-  // Mark target used to prevent unneeded-internal-declaration warnings.
-  if (!S.LangOpts.CPlusPlus) {
-    // FIXME: demangle Str for C++, as the attribute refers to the mangled
-    // linkage name, not the pre-mangled identifier.
-    const DeclarationNameInfo target(&S.Context.Idents.get(Str), AL.getLoc());
-    LookupResult LR(S, target, Sema::LookupOrdinaryName);
-    if (S.LookupQualifiedName(LR, S.getCurLexicalContext()))
-      for (NamedDecl *ND : LR)
-        ND->markUsed(S.Context);
-  }
-
+  markUsedForAliasOrIfunc(S, D, AL, Str);
   D->addAttr(::new (S.Context) AliasAttr(S.Context, AL, Str));
 }
 
diff --git a/clang/test/AST/ast-dump-attr-json.cpp b/clang/test/AST/ast-dump-attr-json.cpp
index 051c2956abfdf7..883e584bfedf07 100644
--- a/clang/test/AST/ast-dump-attr-json.cpp
+++ b/clang/test/AST/ast-dump-attr-json.cpp
@@ -46,6 +46,7 @@ __thread __attribute__ ((tls_model ("local-exec"))) int tls_model_var;
 // CHECK-NEXT:    "tokLen": 11
 // CHECK-NEXT:   }
 // CHECK-NEXT:  },
+// CHECK-NEXT:  "isUsed": true,
 // CHECK-NEXT:  "name": "global_decl",
 // CHECK-NEXT:  "mangledName": "global_decl",
 // CHECK-NEXT:  "type": {
diff --git a/clang/test/Sema/alias-unused.c b/clang/test/Sema/alias-unused.c
index de9fc8cc737662..030b5737a93a21 100644
--- a/clang/test/Sema/alias-unused.c
+++ b/clang/test/Sema/alias-unused.c
@@ -1,7 +1,35 @@
-// RUN: %clang_cc1 -triple x86_64-linux-gnu -Wunneeded-internal-declaration -x c -verify %s
+// RUN: %clang_cc1 -triple x86_64-linux-gnu -Wunused -x c -verify %s
+// RUN: %clang_cc1 -triple x86_64-linux-gnu -Wunused -x c++ -verify %s
+
+#ifdef __cplusplus
+extern "C" {
+#else
 // expected-no-diagnostics
+#endif
 static int f(void) { return 42; }
 int g(void) __attribute__((alias("f")));
 
 static int foo [] = { 42, 0xDEAD };
 extern typeof(foo) bar __attribute__((unused, alias("foo")));
+
+static int (*resolver(void))(void) { return f; }
+int ifunc(void) __attribute__((ifunc("resolver")));
+
+#ifdef __cplusplus
+}
+
+/// We demangle alias/ifunc target and mark all found functions as used.
+static int f1(int) { return 42; }
+static int f1(void) { return 42; }
+int g1(void) __attribute__((alias("_ZL2f1v")));
+
+static int (*resolver1(void))(void) { return f; }
+static int (*resolver1(int))(void) { return f; }
+int ifunc1(void) __attribute__((ifunc("_ZL9resolver1v")));
+
+namespace ns {
+static int f2(int) { return 42; } // expected-warning{{unused function 'f2'}}
+static int f2(void) { return 42; } // expected-warning{{unused function 'f2'}}
+int g2(void) __attribute__((alias("_ZN2nsL2f2Ev")));
+}
+#endif

Copy link
Collaborator

@tstellar tstellar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to NAK this change for now. Given the recent developments with the xz project and considering this feature was requested by a user associated with the project, I think we need to be extra careful with this change. I would like to see at least two trusted code owners review and approve this change before it is committed.

Copy link
Collaborator

@erichkeane erichkeane left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A quick few things I noticed, not nearly a thorough review.

clang/lib/Sema/SemaDeclAttr.cpp Outdated Show resolved Hide resolved
// the presence of overloads.
static void markUsedForAliasOrIfunc(Sema &S, Decl *D, const ParsedAttr &AL,
StringRef Str) {
char *Demangled = llvm::itaniumDemangle(Str, /*ParseParams=*/false);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can't really assume 'itanium demangle' here, this could be a non-itanium build.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does one specify a "non-itanium build?"

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Triple is the only way I am aware of IIRC. So we should be checking that triple. I think ifunc is ELF/Itanium only, and I'm unsure about alias. That said, I'm aware of at least 1 downstream (IIRC) that used an ELF system with an MSVC mangling (one of the offload compilers).

Copy link
Member

@nickdesaulniers nickdesaulniers Apr 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like there's getLangOpts().CXXABI. TargetCXXABI::Microsoft is the one that doesn't use itanium FWICT (looking at ASTContext::createCXXABI) And TargetInfo::getCXXABI (looking at ASTContext::createMangleContext). isItaniumFamily comes up, too.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like there's getLangOpts().CXXABI. TargetCXXABI::Microsoft is the one that doesn't use itanium FWICT (looking at ASTContext::createCXXABI) And TargetInfo::getCXXABI (looking at ASTContext::createMangleContext). isItaniumFamily comes up, too.

Yep, Microsoft is the exception. We DO typically use https://clang.llvm.org/doxygen/classclang_1_1TargetCXXABI.html to determine which.

Copy link
Member Author

@MaskRay MaskRay Apr 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

alias(mangled) referring to non-existent definitions are rejected by another diagnostic.

It's an error

% fclang -c b.cc
b.cc:3:28: error: ifunc must point to a defined function
    3 | int i1(int) __attribute__((ifunc("f")));
      |                            ^
b.cc:3:28: note: the function specified in an ifunc must refer to its mangled name
b.cc:3:28: note: function by that name is mangled as "_Z1fv"
    3 | int i1(int) __attribute__((ifunc("f")));
      |                            ^~~~~~~~~~
      |                            ifunc("_Z1fv")
1 error generated.

The reported issue has nothing to do with security as all. It's about a -Wunused-function false positive. (https://git.tukaani.org/?p=xz.git;a=commitdiff;h=a37a2763383e6c204fe878e1416dd35e7711d3a9)

The warning (under -Wall) becomes an error if users provide -Werror, not what Linux distributions do.
In addition, the backdoor is skipped unless GCC is used.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While we can add S.getASTContext().getCXXABIKind() != TargetCXXABI::Microsoft check for downstream users, it doesn't appear to add any value?

If we know the target ABI uses Microsoft mangling, then we should be using microsoftDemangle rather than itaniumDemangle. There's also llvm::demangle which can try to figure this out.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, I agree that we should not be assuming itanium demangling here; we definitely should care about Microsoft mangling as well because alias is supported on all targets, including Windows.

The reported issue has nothing to do with security as all. It's about a -Wunused-function false positive.

Eh, I don't agree. Replacing false positives with false negatives can be a security concern because the false negatives could be hiding an issue. @nickdesaulniers pointed out a potential in https://github.com/llvm/llvm-project/pull/87130/files#r1546691566. (The reverse can also be true -- replacing false negatives with false positives can hide security issues because low-quality output may either cause the user to disable the diagnostic entirely or may drown out the valid issues.)

Regardless, I think the direction of this patch is heading where we want it to go. But urging caution in this area is very reasonable IMO.

Copy link
Member Author

@MaskRay MaskRay Apr 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a S.getASTContext().getCXXABIKind() != TargetCXXABI::Microsoft check.

Note: while alias works for Windows, it is only checked in tests CodeGenCXX/visibility-dllstorageclass.cpp/CodeGen/debug-info-alias-pointer.c . I don't consider an internal linkage target used or cared by users.

I've also updated the description to state

Note: The following function with internal linkage but C language
linkage type is mangled in Clang but not in GCC. This inconsistency
makes alias/ifunc difficult to use in C++ with portability.

extern "C" {
static void f0() {}
// GCC: void g0() __attribute__((alias("_ZL2f0v")));
// Clang: void g0() __attribute__((alias("f0")));
}

This makes internal linkage target difficult to use in C++ even on Linux.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a S.getASTContext().getCXXABIKind() != TargetCXXABI::Microsoft check.

Note: while alias works for Windows, it is only checked in tests CodeGenCXX/visibility-dllstorageclass.cpp/CodeGen/debug-info-alias-pointer.c . I don't consider an internal linkage target used or cared by users.

I'm not certain why there's so much resistance to doing this when three people have pointed out the concern. It still works for Windows. e.g.,

F:\source\llvm-project>cat "C:\Users\aballman\OneDrive - Intel Corporation\Desktop\test.cpp"
namespace NS {
  extern "C" int puts(const char *);
  void f() { puts("I did the thing!"); }
}

void func() __attribute__((alias("?f@NS@@YAXXZ")));

int main() {
  func();
}

F:\source\llvm-project>llvm\out\build\x64-Debug\bin\clang.exe "C:\Users\aballman\OneDrive - Intel Corporation\Desktop\test.cpp"

F:\source\llvm-project>a.exe
I did the thing!

We should support it properly rather than only halfway, especially considering the context of the bug report.

LookupResult LR(S, target, Sema::LookupOrdinaryName);
if (S.LookupQualifiedName(LR, S.getCurLexicalContext()))
for (NamedDecl *ND : LR)
ND->markUsed(S.Context);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This loop doesn't seem quite right either. First, we need to ensure we're only marking 'functions' as used, not all possible named decls I think.

Second, we need to ensure we're doing a better job with testing, this doesn't seem to be adding enough test infrastructure to make sure we're picking up the right function.

Finally, what does this mean in the case of template declarations/instantiations? We need to make sure we're marking those correctly (or not picking them up at all perhaps?).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe you can put alias attributes on non-functions: https://godbolt.org/z/s1PbojYr4.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe you can put alias attributes on non-functions: https://godbolt.org/z/s1PbojYr4.

ifuncs cannot, so we perhaps need to be more careful there.

Copy link
Member Author

@MaskRay MaskRay Apr 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"alias" can target variables. "ifunc" has an existing diagnostic to reject non-function targets.

% cat a.c
static int x;
extern void y() __attribute__((ifunc("x")));
% clang -c a.c
a.c:2:32: error: ifunc must point to a defined function
    2 | extern void y() __attribute__((ifunc("x")));
      |                                ^
1 error generated.

I am adding a template test:

template <class T>
static void *f3(T) { return nullptr; }
static void *f3() { return nullptr; } // expected-warning{{unused function 'f3'}}
extern void g3() __attribute__((ifunc("_ZL2f3IiEPvT_")));
void *use3 = f3(0);

I think the current code does the right thing for templates because mangled names for templates contain <>.

% c++filt -p <<< _ZL2f3IiEPvT_
f3<int>

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because of the nature of ifunc and alias, would it make sense to have an assertion here that what we found was either a variable or a function, just to be defensive?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added code to check the mangled name to avoid -Wunused-function false negatives.

Copy link
Member

@nickdesaulniers nickdesaulniers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As the author of https://reviews.llvm.org/D54188 and the TODO added from there, this patch is the right thing to do. I fixed a false positive diagnostic for C and knew that the fix was insufficient for C++.

I do appreciate the abundance of caution here, but this looks generally like what we want to do.

We fix false positives while causing false negatives when overloads are present.

Can we avoid the false negatives?

clang/test/Sema/alias-unused.c Outdated Show resolved Hide resolved

namespace ns {
static int f2(int) { return 42; } // expected-warning{{unused function 'f2'}}
static int f2(void) { return 42; } // expected-warning{{unused function 'f2'}}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why warn for f2(void)? Would _ZNL2nsL2f2Ev be the mangling? (an additional L between N and 2)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

c++filt -p <<< _ZNL2nsL2f2Ev => ns::f2. We attempt to find ns::f2 in the scope and will find nothing (we should use f2).

This weakness is mentioned in the description as "... does not handle namespace scope names"

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LookupParsedName smells like it might handle namespaced lookups?

clang/test/Sema/alias-unused.c Outdated Show resolved Hide resolved
// the presence of overloads.
static void markUsedForAliasOrIfunc(Sema &S, Decl *D, const ParsedAttr &AL,
StringRef Str) {
char *Demangled = llvm::itaniumDemangle(Str, /*ParseParams=*/false);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does one specify a "non-itanium build?"

Created using spr 1.3.5-bogner
@MaskRay MaskRay requested a review from rupprecht as a code owner April 2, 2024 06:48
@llvmbot llvmbot added the bazel "Peripheral" support tier build system: utils/bazel label Apr 2, 2024
// the presence of overloads.
static void markUsedForAliasOrIfunc(Sema &S, Decl *D, const ParsedAttr &AL,
StringRef Str) {
char *Demangled = llvm::itaniumDemangle(Str, /*ParseParams=*/false);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, I agree that we should not be assuming itanium demangling here; we definitely should care about Microsoft mangling as well because alias is supported on all targets, including Windows.

The reported issue has nothing to do with security as all. It's about a -Wunused-function false positive.

Eh, I don't agree. Replacing false positives with false negatives can be a security concern because the false negatives could be hiding an issue. @nickdesaulniers pointed out a potential in https://github.com/llvm/llvm-project/pull/87130/files#r1546691566. (The reverse can also be true -- replacing false negatives with false positives can hide security issues because low-quality output may either cause the user to disable the diagnostic entirely or may drown out the valid issues.)

Regardless, I think the direction of this patch is heading where we want it to go. But urging caution in this area is very reasonable IMO.

Comment on lines 1993 to 1994
const DeclarationNameInfo target(&S.Context.Idents.get(Str), AL.getLoc());
LookupResult LR(S, target, Sema::LookupOrdinaryName);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
const DeclarationNameInfo target(&S.Context.Idents.get(Str), AL.getLoc());
LookupResult LR(S, target, Sema::LookupOrdinaryName);
DeclarationNameInfo Target(&S.Context.Idents.get(Str), AL.getLoc());
LookupResult LR(S, Target, Sema::LookupOrdinaryName);

Coding style nits

clang/lib/Sema/SemaDeclAttr.cpp Outdated Show resolved Hide resolved
@@ -1980,6 +1981,23 @@ static void handleWeakRefAttr(Sema &S, Decl *D, const ParsedAttr &AL) {
D->addAttr(::new (S.Context) WeakRefAttr(S.Context, AL));
}

// Mark alias/ifunc target as used. For C++, we look up the demangled name
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should also happen in C because of __attribute__((overloadable)) (so the code is correct, but we should fix up the comment and add a test case in C for this situation)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for mentioning overloadable. Added a test and fixed the comment.

LookupResult LR(S, target, Sema::LookupOrdinaryName);
if (S.LookupQualifiedName(LR, S.getCurLexicalContext()))
for (NamedDecl *ND : LR)
ND->markUsed(S.Context);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because of the nature of ifunc and alias, would it make sense to have an assertion here that what we found was either a variable or a function, just to be defensive?

@@ -1,7 +1,43 @@
// RUN: %clang_cc1 -triple x86_64-linux-gnu -Wunneeded-internal-declaration -x c -verify %s
// RUN: %clang_cc1 -triple x86_64-linux-gnu -Wunused -x c -verify %s
// RUN: %clang_cc1 -triple x86_64-linux-gnu -Wunused -x c++ -verify %s
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to see a non-Itanium RUN line as well. Perhaps we should be using -triple %itanium_abi_triple and -triple %ms_abi_triple rather than a specific target?

MaskRay added 3 commits April 4, 2024 15:39
Created using spr 1.3.5-bogner
Created using spr 1.3.5-bogner
Created using spr 1.3.5-bogner
Copy link
Collaborator

@AaronBallman AaronBallman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Marking as requesting changes until we get agreement on the behavior for the Microsoft ABI.

Copy link
Member

@nickdesaulniers nickdesaulniers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Progressing in the right direction and I'm in favor of fixing this issue in clang, with similar levels of caution as my fellow reviewers.

I think we can support Microsoft ABI mangling trivially here.

That we mangle globals with static linkage in extern "C" blocks looks to me like a bug in clang that should get fixed first BEFORE this lands. Otherwise that seems like a massive compatibility issue. Perhaps another approach would be to at least document this difference. But then again, this behavioral difference is surprising, and I worry that it might be used to try to hide something.

The namespace lookup false positive is something that perhaps doesn't need to get fixed here; if we don't plan to remove that final false positive in the PR, then please file a bug @MaskRay and link to it from a TODO comment in the test case and/or sources.

/// We should report "unused function" for f3(int).
namespace ns {
static int f3(int) { return 42; } // cxx-warning{{unused function 'f3'}}
static int f3() { return 42; } // cxx-warning{{unused function 'f3'}}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While our approach has false negatives for namespace scope names

Isn't this with ns::f3() here a false positive, not a false negative? Consider adding TODO: to make it even more blatant.

Is this a bug in Sema::LookupOrdinaryName ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To support ns::f3, it seems that we need to use LookupNestedNameSpecifierName first to find ns, then find f3. The process is quite complex and may not fit into the specific mark* function.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Surely there's another case where we need to go from mangled string back to named decl. Perhaps there's somewhere else in clang that already does so and can be reused here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

may not fit

?

Copy link
Member Author

@MaskRay MaskRay Apr 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To fully support qualified names in alias/ifunc attributes for the -Wunused-function diagnostic, it would take a lot of code due to complexity of C++ name lookup when namespaces are involved.

This PR intends to address the ifunc -Wunused-function issue. The C++ support is a side product. And it would seem odd to add lots of code to fix a -Wunused-function false positive related to qualified names (which people likely don't use).

The test is added here to test the current behavior.

Comment on lines 7 to 8
static int f(void) { return 42; }
int g(void) __attribute__((alias("f")));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems wrong to me, a false negative.

FWICT, for C++ mode, even in an extern "C" block, it seems that functions with static linkage get mangled. Or... Oh, look, more behavioral differences from GCC: https://godbolt.org/z/sa4e7aYqj

We mangle, they do not. That's going to lead to further compatibility issues when trying to use this function attribute.


Ah! right you pointed this out in your note. Perhaps we should fix/change that first in clang?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah! right you pointed this out in your note. Perhaps we should fix/change that first in clang?

I remember an existing issue about the mangling difference, but I don't recall the link now.
I have analyzed it a bit and run into a feature where we need mangling.
The discrepancy from GCC is indeed unfortunate and in practice makes alias/ifunc not useful for internal linkage targets in C++.

https://eel.is/c++draft/dcl.link gives an example

extern "C" {
  static void f4();             // the name of the function f4 has internal linkage,
                                // so f4 has no language linkage; its type has C language linkage
}

I am unsure whether "no language linkage" means that ignoring the C language linkage is fine.


Improving GCC consistency is not the goal of this PR. This PR intends to make clangSema and clangCodeGen consistent. If we want to follow GCC in the future, the updated test will show the differences.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Improving GCC consistency is not the goal of this PR. This PR intends to make clangSema and clangCodeGen consistent.

Is there a bug on file wrt to this inconsistency? I would like to link to it from the test, sources, and public docs on the function attribute if we don't fix this inconsistency BEFORE landing this PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please file a bug and link to it from a TODO comment in the tests and/or sources.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Created #88593

But I don't think a TODO applies. Clang's behavior is probably desired. So this just records a difference.

This difference probably doesn't make a difference because people rarely use alias for C++ code.

clang/test/Sema/alias-unused.cpp Outdated Show resolved Hide resolved
clang/test/Sema/alias-unused.cpp Show resolved Hide resolved
clang/lib/Sema/SemaDeclAttr.cpp Show resolved Hide resolved
clang/test/Sema/alias-unused.cpp Outdated Show resolved Hide resolved
Created using spr 1.3.5-bogner

static int __attribute__((overloadable)) f0(int x) { return x; } // expected-warning{{unused function 'f0'}}
static float __attribute__((overloadable)) f0(float x) { return x; } // expected-warning{{unused function 'f0'}}
int g0(void) __attribute__((alias("?f0@@YAHH@Z")));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As someone not familiar with microsoft's ABI, is there a tool like llvm-cxxfilt that can help me check this mangling from the command line, on my Linux host?


I'm guessing this test file is demonstrating the lack of msabi mangling support? And will function as a change detector in the future?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm guessing this test file is demonstrating the lack of msabi mangling support? And will function as a change detector in the future?

Yes, a change detector. The test shall be pre-committed once good enough.

llvm-cxxfilt seems Itanium only.... I compiled the file locally and used llvm-objdump -t a.o to get the symbol name.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, I use http://demangler.com/ to help check manglings, which isn't quite what you were asking for but might help.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

got it; please add a TODO that the overload of f0 that accepts an int should not warn (similar to the TODO you added in clang/test/Sema/alias-unused.cpp for the namespace part. Same for g1 and the overload of f3 that accepts no parameters.

@MaskRay
Copy link
Member Author

MaskRay commented Apr 5, 2024

I really appreciate the suggestions. alias-unused.cpp and alias-unused-win.cpp contain test improvement that should be pre-commited once they look good enough.
Then this PR can be changed to show the difference.

On a separate note, I wanted to clarify that -Wunused-function false positives/negatives shouldn't be automatically considered security bugs.
Categorizing all these warning improvements as "security bugs" would dilute the meaning of "security bugs" and make it harder to prioritize real vulnerabilities.

(

What is a GCC security bug?
===========================

    A security bug is one that threatens the security of a system or
    network, or might compromise the security of data stored on it.
    In the context of GCC, there are multiple ways in which this might
    happen and some common scenarios are detailed below.

)

(An attacker can easily bypass warnings: remove -Werror (uncommon in distros anyway), remove -Wall (which covers -Wunused-function, or use a pragma to disable -Wunused-function locally.
)

The description contains an example about name mangling differences (https://github.com/llvm/llvm-project/pull/87130/files#r1554029811)
and I mentioned that "This inconsistency makes alias/ifunc difficult to use in C++ with portability."

extern "C" {
static void f0() {}
// GCC: void g0() __attribute__((alias("_ZL2f0v")));
// Clang: void g0() __attribute__((alias("f0")));
}

I added microsoftDemangle tests to show the current behavior.
Since the feature that demangles to the function name without parameters (f3 instead of f3(int)) appears to be missing, I cannot address -Wunused-function false positives for microsoftDemangle with reasonable time complexity.

@erichkeane
Copy link
Collaborator

I really appreciate the suggestions. alias-unused.cpp and alias-unused-win.cpp contain test improvement that should be pre-commited once they look good enough. Then this PR can be changed to show the difference.

On a separate note, I wanted to clarify that -Wunused-function false positives/negatives shouldn't be automatically considered security bugs. Categorizing all these warning improvements as "security bugs" would dilute the meaning of "security bugs" and make it harder to prioritize real vulnerabilities.

(

What is a GCC security bug?
===========================

    A security bug is one that threatens the security of a system or
    network, or might compromise the security of data stored on it.
    In the context of GCC, there are multiple ways in which this might
    happen and some common scenarios are detailed below.

)

(An attacker can easily bypass warnings: remove -Werror (uncommon in distros anyway), remove -Wall (which covers -Wunused-function, or use a pragma to disable -Wunused-function locally. )

The description contains an example about name mangling differences (https://github.com/llvm/llvm-project/pull/87130/files#r1554029811) and I mentioned that "This inconsistency makes alias/ifunc difficult to use in C++ with portability."

extern "C" {
static void f0() {}
// GCC: void g0() __attribute__((alias("_ZL2f0v")));
// Clang: void g0() __attribute__((alias("f0")));
}

I added microsoftDemangle tests to show the current behavior. Since the feature that demangles to the function name without parameters (f3 instead of f3(int)) appears to be missing, I cannot address -Wunused-function false positives for microsoftDemangle with reasonable time complexity.

I don't believe we ARE trying to make that change to security definitions. However, we are all being particularly security conscious of this patch because the reporter is the person who JUST did the xz exploit over years. IMO, this bug report (and the original C only fix) were used in part to help cover his tracks, so being particularly careful here is paramount.

@MaskRay
Copy link
Member Author

MaskRay commented Apr 5, 2024

I really appreciate the suggestions. alias-unused.cpp and alias-unused-win.cpp contain test improvement that should be pre-commited once they look good enough. Then this PR can be changed to show the difference.
On a separate note, I wanted to clarify that -Wunused-function false positives/negatives shouldn't be automatically considered security bugs. Categorizing all these warning improvements as "security bugs" would dilute the meaning of "security bugs" and make it harder to prioritize real vulnerabilities.
(

What is a GCC security bug?
===========================

    A security bug is one that threatens the security of a system or
    network, or might compromise the security of data stored on it.
    In the context of GCC, there are multiple ways in which this might
    happen and some common scenarios are detailed below.

)
(An attacker can easily bypass warnings: remove -Werror (uncommon in distros anyway), remove -Wall (which covers -Wunused-function, or use a pragma to disable -Wunused-function locally. )
The description contains an example about name mangling differences (#87130 (files)) and I mentioned that "This inconsistency makes alias/ifunc difficult to use in C++ with portability."

extern "C" {
static void f0() {}
// GCC: void g0() __attribute__((alias("_ZL2f0v")));
// Clang: void g0() __attribute__((alias("f0")));
}

I added microsoftDemangle tests to show the current behavior. Since the feature that demangles to the function name without parameters (f3 instead of f3(int)) appears to be missing, I cannot address -Wunused-function false positives for microsoftDemangle with reasonable time complexity.

I don't believe we ARE trying to make that change to security definitions. However, we are all being particularly security conscious of this patch because the reporter is the person who JUST did the xz exploit over years. IMO, this bug report (and the original C only fix) were used in part to help cover his tracks, so being particularly careful here is paramount.

The description might derail from the primary purpose of the patch.

The -Wunused-function false positive was not used in part to cover the issue.
"Hans Jansen" claimed 5% improvement for his app that called the crc func and contributed that ifunc code.
Lasse Collin's commit (https://git.tukaani.org/?p=xz.git;a=commit;h=a37a2763383e6c204fe878e1416dd35e7711d3a9), confirmed with the author on IRC, aimed to fix a configure issue with Clang -Wall.
Lasse did not report the issue. "Jia Tan" did.

The attacker did not need to care about the -Wunused-function false positive.
They could remove static from the resolver function to avoid the -Wunused-function false positive.

Anyhow, the backdoor doesn't target Clang (if test "x$CC" != 'xgcc' > /dev/null), which might be related to other codegen discrepancy the attacker did not intend to control.

@emaste
Copy link
Member

emaste commented Apr 5, 2024

Giving this extra scrutiny is certainly warranted given the context, but #63957 is a legitimate bug.

@MaskRay
Copy link
Member Author

MaskRay commented Apr 12, 2024

Ping:)

Copy link
Member

@nickdesaulniers nickdesaulniers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM; though I have a strong preference to get bugs on file and linked to regarding:

  • issues with non-itanium mangling
  • issues with namespaced identifiers.


static int __attribute__((overloadable)) f0(int x) { return x; } // expected-warning{{unused function 'f0'}}
static float __attribute__((overloadable)) f0(float x) { return x; } // expected-warning{{unused function 'f0'}}
int g0(void) __attribute__((alias("?f0@@YAHH@Z")));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

got it; please add a TODO that the overload of f0 that accepts an int should not warn (similar to the TODO you added in clang/test/Sema/alias-unused.cpp for the namespace part. Same for g1 and the overload of f3 that accepts no parameters.

Comment on lines 7 to 8
static int f(void) { return 42; }
int g(void) __attribute__((alias("f")));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please file a bug and link to it from a TODO comment in the tests and/or sources.

/// We should report "unused function" for f3(int).
namespace ns {
static int f3(int) { return 42; } // cxx-warning{{unused function 'f3'}}
static int f3() { return 42; } // cxx-warning{{unused function 'f3'}}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

may not fit

?

Created using spr 1.3.5-bogner
@MaskRay
Copy link
Member Author

MaskRay commented Apr 15, 2024

I think I've addressed all comments and changed the patch to rebase on the improved tests.

@AaronBallman @tstellar Are you happy with the current version?

@AaronBallman
Copy link
Collaborator

I think I've addressed all comments and changed the patch to rebase on the improved tests.

@AaronBallman @tstellar Are you happy with the current version?

Please file an issue for this:

I added microsoftDemangle tests to show the current behavior.
Since the feature that demangles to the function name without parameters (f3 instead of f3(int)) appears to be missing, I cannot > address -Wunused-function false positives for microsoftDemangle with reasonable time complexity.

and file a second one to come back and fix this issue once we can demangle with parameters.

Copy link
Collaborator

@tstellar tstellar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with this as long as @AaronBallman approves.

Copy link
Collaborator

@AaronBallman AaronBallman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LG enough TM, thank you @MaskRay!

@MaskRay
Copy link
Member Author

MaskRay commented Apr 16, 2024

I think I've addressed all comments and changed the patch to rebase on the improved tests.
@AaronBallman @tstellar Are you happy with the current version?

Please file an issue for this:

I added microsoftDemangle tests to show the current behavior.
Since the feature that demangles to the function name without parameters (f3 instead of f3(int)) appears to be missing, I cannot > address -Wunused-function false positives for microsoftDemangle with reasonable time complexity.

and file a second one to come back and fix this issue once we can demangle with parameters.

Created #88823 and #88825

Thanks!

Created using spr 1.3.5-bogner
@MaskRay MaskRay merged commit 2ac562a into main Apr 16, 2024
3 of 4 checks passed
@MaskRay MaskRay deleted the users/MaskRay/spr/sema-mark-aliasifunc-targets-used-and-consider-mangled-names branch April 16, 2024 00:58
@joker-eph
Copy link
Collaborator

This broke a bot, I reverted and it's back green here: https://lab.llvm.org/buildbot/#/builders/272/builds/14069

@MaskRay
Copy link
Member Author

MaskRay commented Apr 16, 2024

This broke a bot, I reverted and it's back green here: lab.llvm.org/buildbot/#/builders/272/builds/14069

Thanks. llvm-project/libc has a pattern like the following.

namespace libc {
double log2(double x);
}
extern "C" double log2(double);

namespace std {
using ::log2;
}
using std::log2;

namespace libc {
decltype(libc::log2) __log2_impl__ __asm__("log2");
decltype(libc::log2) log2 __attribute__((alias("log2")));
double __log2_impl__(double x) { return x; }
}

ItaniumMangleContextImpl::mangleCXXName called by mangleName (new code in this PR) asserts that the parameter is one of FunctionDecl/VariableDecl/TemplateParamObjectDecl and fails on a UsingShadowDecl.
We can check FunctionDecl before mangleCXXName (we only handle global namespace names, where variables are not mangled).

To make the regression test more useful, I'll place it in CodeGen/alias.cpp separately.

MaskRay added a commit that referenced this pull request Apr 16, 2024
The pattern is quite involved and deserves a specific codegen test.
This test would catch the bug in the first attempt of #87130
MaskRay added a commit that referenced this pull request Apr 16, 2024
https://reviews.llvm.org/D54188 marked "alias" targets as used in C to
fix -Wunused false positives. This patch extends the approach to handle
mangled names to support global scope names in C++ and the
`overloadable` attribute in C.

(Note: we should skip `UsingShadowDecl`, which would trigger an
assertion failure in `ItaniumMangleContextImpl::mangleCXXName`.
See regression test added by commit 1c2afba.)

In addition, we mark ifunc targets as used to fix #63957 (temporarily
used by xz; ifunc was removed by
tukaani-project/xz@689ae24)

While our approach has false negatives for namespace scope names, the
majority of alias/ifunc C++ uses (global scope with no overloads) are
handled.

Note: The following function with internal linkage but C language
linkage type is mangled in Clang but not in GCC. This inconsistency
makes alias/ifunc difficult to use in C++ with portability (#88593).
```
extern "C" {
static void f0() {}
// GCC: void g0() __attribute__((alias("_ZL2f0v")));
// Clang: void g0() __attribute__((alias("f0")));
}
```

Pull Request: #87130
petk added a commit to petk/php-build-system that referenced this pull request May 19, 2024
Clang 19 has fixed the false warning of unused function resolve_foo:
llvm/llvm-project#87130
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bazel "Peripheral" support tier build system: utils/bazel clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[clang] Reporting unused function with ifunc resolver
9 participants