LLVM and SPIRV-LLVM-Translator pulldown (WW31) #2182

Merged
merged 596 commits into from
Jul 29, 2020


alexbatashev
Contributor

@alexbatashev alexbatashev commented Jul 27, 2020

MaskRay and others added 30 commits July 22, 2020 10:16
Otherwise if 'ld' is an older system LLD (FreeBSD; or if someone adds 'ld' to
point to an LLD from a different installation) which does not support the
current ModuleSummaryIndex::BitCodeSummaryVersion, the test will fail.

Add lit feature 'binutils_lto'. GNU ld is more common than GNU gold, so
we can just require 'is_binutils_lto_supported' to additionally support GNU ld.

Reviewed By: myhsu

Differential Revision: https://reviews.llvm.org/D84133
For now, xdrrec_create is only intercepted on Linux, as its signature
is different on Solaris.

The method of intercepting xdrrec_create isn't super ideal but I
couldn't think of a way around it: Using an AddrHashMap combined
with wrapping the userdata field.

We can't just allocate a handle on the heap in xdrrec_create and leave
it at that, since there'd be no way to free it later. This is because it
doesn't seem to be possible to access handle from the XDR struct, which
is the only argument to xdr_destroy.
On the other hand, the callbacks don't have a way to get at the
x_private field of XDR, which is what I chose for the HashMap key. So we
need to wrap the handle parameter of the callbacks. But we can't just
pass x_private as handle (as it hasn't been set yet). We can't put the
wrapper struct into the HashMap and pass its pointer as handle, as the
key we need (x_private again) hasn't been set yet.

So I allocate the wrapper struct on the heap, pass its pointer as
handle, and put it into the HashMap so xdr_destroy can find it later and
destroy it.

Differential Revision: https://reviews.llvm.org/D83358
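
The wrapping scheme described above can be sketched roughly as follows. This is not the sanitizer code: `FakeXdr`, `FakeRecStream`, `RealCreate`, `RealDestroy`, and `FakeRead` are hypothetical stand-ins for the real XDR library, and a `std::unordered_map` stands in for AddrHashMap. The point is only the ordering: the wrapper is heap-allocated and passed as the handle first, and it is registered under `x_private` once the create call has set that field.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <unordered_map>

// Hypothetical stand-in for the XDR struct; only the field used here.
struct FakeXdr {
  void *x_private = nullptr;  // filled in by the create call
};

using IoCallback = int (*)(void *handle, char *buf, int len);

// Hypothetical stand-in for the library's private record stream state.
struct FakeRecStream {
  void *handle;
  IoCallback read;
};

static void RealCreate(FakeXdr *xdr, void *handle, IoCallback read) {
  xdr->x_private = new FakeRecStream{handle, read};
}

static void RealDestroy(FakeXdr *xdr) {
  delete static_cast<FakeRecStream *>(xdr->x_private);
  xdr->x_private = nullptr;
}

// Simulates the library invoking the read callback with the stored handle.
int FakeRead(FakeXdr *xdr, char *buf, int len) {
  auto *s = static_cast<FakeRecStream *>(xdr->x_private);
  return s->read(s->handle, buf, len);
}

// Wrapper handed to the library in place of the user's handle.
struct HandleWrapper {
  void *user_handle;
  IoCallback user_read;
};

// Side table keyed by x_private so the destroy hook can find the wrapper.
static std::unordered_map<void *, std::unique_ptr<HandleWrapper>> g_wrappers;

std::size_t WrapperCount() { return g_wrappers.size(); }

// The callback only receives the handle, so the wrapper must travel as it.
static int ReadTrampoline(void *handle, char *buf, int len) {
  auto *w = static_cast<HandleWrapper *>(handle);
  assert(w && "callback invoked without a wrapper");
  return w->user_read(w->user_handle, buf, len);
}

void InterceptedCreate(FakeXdr *xdr, void *handle, IoCallback read) {
  auto wrapper = std::make_unique<HandleWrapper>(HandleWrapper{handle, read});
  RealCreate(xdr, wrapper.get(), ReadTrampoline);
  // x_private is only known after the create call, so register it now.
  g_wrappers[xdr->x_private] = std::move(wrapper);
}

void InterceptedDestroy(FakeXdr *xdr) {
  g_wrappers.erase(xdr->x_private);  // frees the wrapper
  RealDestroy(xdr);
}
```

In this sketch the wrapper's lifetime is tied to the map entry, so the destroy hook needs nothing beyond the XDR struct to release it.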
This was structured in a way that implied every split argument is in
memory, or in registers. It is possible to pass an original argument
partially in registers, and partially in memory. Transpose the logic
here to only consider a single piece at a time. Every individual
CCValAssign should be treated independently, and any merge to original
value needs to be handled later.

This is in preparation for merging some preprocessing hacks in the
AMDGPU calling convention lowering into the generic code.

I'm also not sure what the correct behavior is for memlocs where the
promoted size is larger than the original value. I've opted to clamp
the memory access size to not exceed the value register to avoid the
explicit trunc/extend/vector widen/vector extract instruction. This
happens for AMDGPU for i8 arguments that end up stack passed, which
are promoted to i16 (I think this is a preexisting DAG bug though, and
they should not really be promoted when in memory).
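
As a rough illustration of the clamping described above, assuming a simplified, hypothetical `MemPiece` record rather than the real CCValAssign API: the access size is taken as the smaller of the promoted location size and the value register size, so an i8 promoted to i16 on the stack is still accessed as one byte.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical, simplified view of one stack-passed argument piece.
struct MemPiece {
  uint32_t LocSizeBytes;  // size of the (possibly promoted) location type
  uint32_t ValSizeBytes;  // size of the value register being loaded/stored
};

// Clamp the memory access so it never exceeds the value register size.
uint32_t ClampedAccessSize(const MemPiece &P) {
  assert(P.ValSizeBytes != 0 && "value register must have a size");
  return std::min(P.LocSizeBytes, P.ValSizeBytes);
}
```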
If we use the default of None, we get a python exception in
find_and_diagnose_missing() instead of printing a sensible error message.

Reviewed By: #libc, ldionne

Differential Revision: https://reviews.llvm.org/D84342
The underlying infrastructure supports this already, just add the
pattern matching for linalg.generic.

Differential Revision: https://reviews.llvm.org/D84335
v_cvt_f32_f16 can still accept this value as a literal constant. This
showed up in GlobalISel since it doesn't have constant folding for
G_FPEXT.
This implements OpenMP runtime support for the OpenMP TR8 `present`
map type modifier.  The previous patch in this series implements Clang
front end support.  See that patch summary for behaviors that are not
yet supported.

Reviewed By: grokos, jdoerfert

Differential Revision: https://reviews.llvm.org/D83062
There's no reason to involve the hassle of a virtual method targets
have to override for a simple boolean.

Not sure exactly what's going on with Mips, but it seems to define its
own totally separate handler classes.
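
A minimal sketch of the idea, with hypothetical class names that are not taken from the patch: the direction is stored as a plain boolean set by the constructor, so subclasses pass a value instead of overriding a virtual method.

```cpp
#include <cassert>

// Hypothetical handler base class; names are illustrative only.
class ValueHandlerSketch {
public:
  explicit ValueHandlerSketch(bool IsIncoming) : IsIncoming(IsIncoming) {}
  virtual ~ValueHandlerSketch() = default;

  // Plain accessor: no per-target override is needed.
  bool isIncomingArgumentHandler() const { return IsIncoming; }

private:
  const bool IsIncoming;
};

struct IncomingHandlerSketch : ValueHandlerSketch {
  IncomingHandlerSketch() : ValueHandlerSketch(/*IsIncoming=*/true) {}
};

struct OutgoingHandlerSketch : ValueHandlerSketch {
  OutgoingHandlerSketch() : ValueHandlerSketch(/*IsIncoming=*/false) {}
};

// Example consumer that requires an incoming handler and reports success.
bool LowerIncomingSketch(const ValueHandlerSketch &H) {
  assert(H.isIncomingArgumentHandler() && "expected an incoming handler");
  return true;
}
```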
We can just use the definition from config.h. This means we need to move
a few lines around in CMakeLists.txt - the TF_AOT detection needs to be
before the spot we process the config.h.cmake files.

Differential Revision: https://reviews.llvm.org/D84349
After lots of follow-up fixes, there are still problems, such as
-Wno-suggest-override getting passed to the Windows Resource Compiler
because it was added with add_definitions in the CMake file.

Rather than piling on another fix, let's revert so this can be re-landed
when there's a proper fix.

This reverts commit 21c0b4c.
This reverts commit 81d68ad.
This reverts commit a361aa5.
This reverts commit fa42b7c.
This reverts commit 955f87f.
This reverts commit 8b16e45.
This reverts commit 308a127.
This reverts commit 274b6b0.
This reverts commit 1c7037a.
This upgrade should be friction-less because we've already been ensuring
that CMake >= 3.13.4 is used.

This is part of the effort discussed on llvm-dev here:

  http://lists.llvm.org/pipermail/llvm-dev/2020-April/140578.html

Differential Revision: https://reviews.llvm.org/D78648
…n DwarfCompileUnit.h. NFC.

Also remove DIE.h include from DwarfCompileUnit.h and replace with forward declarations.
…entation

The implementation of the xvtlsbb builtins/intrinsics was not correct, as the
intrinsics previously used i1 as an argument type. This patch changes the i1
argument type used in these intrinsics to i32, since having the second
argument as an i1 can lead to issues in the backend.

Differential Revision: https://reviews.llvm.org/D84291
This allows simplifying the implementation of barriers.

This is a re-commit of 1ac403b, which had to be reverted in
64a9c94 because the minimum CMake version wasn't high enough.
Now that we've upgraded, we can do this.

Differential Revision: https://reviews.llvm.org/D75243
We want to be sure that atomic<size_t> is always lock-free, or the code
will be much slower than expected (and could even conceivably fail if
the lock implementation somehow calls back into libc++abi).
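
One way to express such a guarantee, shown here only as a sketch and not necessarily how the patch does it, is a compile-time check on `std::atomic<std::size_t>::is_always_lock_free` (C++17). `BumpCounter` is a hypothetical helper added for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Refuse to build on platforms where atomic<size_t> would need a lock.
static_assert(std::atomic<std::size_t>::is_always_lock_free,
              "atomic<size_t> must be lock-free");

// Hypothetical helper: increments the counter and returns the new value.
std::size_t BumpCounter(std::atomic<std::size_t> &counter) {
  assert(counter.is_lock_free() && "runtime check agrees with static_assert");
  return counter.fetch_add(1, std::memory_order_relaxed) + 1;
}
```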
Add pass dependencies:
  - TargetTransformInfoWrapperPass
  - TargetPassConfig
  - LoopInfoWrapperPass
  - TargetLibraryInfoWrapperPass

To fix inconsistencies when passes are added to the pipeline.

Reviewers: efriedma, kmclaughlin, paquette

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D84346
Summary:
When a constant array of empty strings goes through constant folding, the result
is something that contains no bytes.  If this array is passed to the intrinsic
function `RESHAPE()`, we were not handling things correctly.  I fixed this by
checking for an empty destination when calling the function `CopyFrom()` on an
array of strings.

I also added a test with a couple of different examples that trigger the
problem.

Reviewers: klausler, tskeith, DavidTruby

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D84352
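
The empty-destination check can be pictured with the following sketch. It is written in C++ with a hypothetical `StringArray` type and `CopyElements` function; it does not reproduce flang's actual `CopyFrom()` signature, only the guard for an array whose elements contain no bytes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical flat storage for an array of fixed-length strings.
struct StringArray {
  std::size_t length;  // characters per element
  std::size_t count;   // number of elements
  std::string bytes;   // count * length characters; empty if length == 0
};

// Copies as many elements as both arrays hold; returns the number copied.
std::size_t CopyElements(StringArray &dst, const StringArray &src) {
  // An array of empty strings holds no bytes, so there is nothing to copy.
  if (dst.bytes.empty() || src.bytes.empty())
    return 0;
  assert(dst.length == src.length && "element lengths must match");
  std::size_t n = std::min(dst.count, src.count);
  std::memcpy(&dst.bytes[0], src.bytes.data(), n * dst.length);
  return n;
}
```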
CopyOp gets vectorized to vector.transfer_read followed by vector.transfer_write.

Differential Revision: https://reviews.llvm.org/D83739
This was missed out of 1030e82, but hopefully fixes the issues
reported with NEON accidentally generating MVE instructions.
I was trying to pick this up a bit when reviewing D48426 (& perhaps D69778) - in any case, looks like D48426 added a module level flag that might not be needed.

The D48426 implementation worked by setting a module level flag; then, when code generating contents from the PCH, a special case in ASTContext::DeclMustBeEmitted would be used to delay emitting the definition of these functions if they came from a Module with this flag.

This strategy is similar to the one initially implemented for modular codegen that was removed in D29901 in favor of the modular decls list and a bit on each decl to specify whether it's homed to a module.

One major difference between PCH object support and modular code generation, other than the specific list of decls that are homed, is the compilation model: MSVC PCH modules are built into the object file for some other source file (when compiling that source file /Yc is specified to say "this compilation is where the PCH is homed"), whereas modular code generation invokes a separate compilation for the PCH alone. So the current modular code generation test to decide if a decl should be emitted, "is the module where this decl is serialized the current main file", has to be extended (as Lubos did in D69778) to also test the command line flag -building-pch-with-obj.

Otherwise the whole thing is basically streamlined down to the modular code generation path.

This even offers one extra material improvement compared to the existing divergent implementation: Homed functions are not emitted into object files that use the pch. Instead at -O0 they are not emitted into the IR at all, and at -O1 they are emitted using available_externally (existing functionality implemented for modular code generation). The pch-codegen test has been updated to reflect this new behavior.

[If possible: I'd love it if we could not have the extra MSVC-style way of accessing dllexport-pch-homing, and just do it the modular codegen way, but I understand that it might be a limitation of existing build systems. @hans / @Thakis: Do either of you know if it'd be practical to move to something more similar to .pcm handling, where the pch itself is passed to the compilation, rather than homed as a side effect of compiling some other source file?]

Reviewers: llunak, hans

Differential Revision: https://reviews.llvm.org/D83652
Alexander Batashev and others added 4 commits July 27, 2020 13:45
The test broke after LLVM commit df952cb ("[llvm-readobj] Print error
when executed with no input files", 2020-07-20), but it turns out we
actually do not need this test because the translator does not attempt
to preserve XRay instrumentation.
Update the test according to LLVM commit ce6de37 ("[DebugInfo]
Drop location ranges for variables which exist entirely outside the
variable's scope", 2020-07-22).
vladimirlaz
vladimirlaz previously approved these changes Jul 27, 2020
@bader
Contributor

bader commented Jul 27, 2020

@alexbatashev you can contact @AGindinson for resolving CI issue with FPGA test.

@bader
Contributor

bader commented Jul 27, 2020

If possible, could you also cherry-pick KhronosGroup/SPIRV-LLVM-Translator@9333920 to this pulldown, please?

@alexbatashev
Contributor Author

@bader Are you ok with disabling failing test while @AGindinson is working on a proper fix? We can then manually cherry-pick and enable it back.

@AGindinson
Contributor

@bader Are you ok with disabling failing test while @AGindinson is working on a proper fix? We can then manually cherry-pick and enable it back.

KhronosGroup/SPIRV-LLVM-Translator#655 should fix the problem.

MrSidims and others added 3 commits July 28, 2020 13:27
This extension adds a function parameter decoration that is useful
for FPGA targets. This decoration indicates that a particular global
memory pointer can only access a particular physical memory location.
Knowing this information at compile time can allow FPGA compilers to
generate load store units of lower area for accesses done through such
a pointer.

Specification: intel#2084

Signed-off-by: Dmitry Sidorov <[email protected]>
The method did not correctly handle the case where an addrspacecast instruction
had more than one use.

Signed-off-by: Alexey Sotkin <[email protected]>
The initial implementation had required that the first
ptr.annotation operand be a result of a load instruction.
However, it may be the case that the LSU built-in gets applied
to a cast result/to a result of a function call, therefore
this patch drops the unfeasible requirement.

Signed-off-by: Artem Gindinson <[email protected]>
@bader
Contributor

bader commented Jul 28, 2020

@bader Are you ok with disabling failing test while @AGindinson is working on a proper fix? We can then manually cherry-pick and enable it back.

KhronosGroup/SPIRV-LLVM-Translator#655 should fix the problem.

Let's try to cherry-pick this patch. If it doesn't work, I'm okay with disabling the test.

@AGindinson
Contributor

Does the merge commit message need to be updated, given the latest Translator additions?

@bader
Contributor

bader commented Jul 28, 2020

@alexbatashev, could you also fix the link to LLVM project commit in the description, please? It gives me 404.

@vladimirlaz
Contributor

/summary:run

@alexbatashev
Contributor Author

Does the merge commit message need to be updated, given the latest Translator additions?

@alexbatashev, could you also fix the link to LLVM project commit in the description, please? It gives me 404.

Done

@vladimirlaz vladimirlaz merged commit b539b89 into intel:sycl Jul 29, 2020
@alexbatashev alexbatashev deleted the llvmspirv_pulldown branch September 17, 2021 06:46