Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Win64 ABI fix: return non-POD structs via sret. #856

Merged
merged 3 commits into from
Apr 8, 2015

Conversation

kinke
Copy link
Member

@kinke kinke commented Feb 28, 2015

And get rid of obsolete integer rewrites on Win64 and PPC and some general rewrites on System V x86_64 platforms.

And get rid of obsolete integer rewrites on Win64 and PPC.
@redstar
Copy link
Member

redstar commented Mar 1, 2015

Does experimental mean that you are still working on it?

@kinke
Copy link
Member Author

kinke commented Mar 2, 2015

I wanted to check whether that introduces any regressions. Seems like it doesn't for VS 2015 (fixing at least one failing unit test in std.algorithm), so I guess it also doesn't for VS 2013. And apparently the more thorough typesAreEuivalent() check also works nicely for the System V ABI.
I'm not perfectly happy with that LLTypeMemoryLayout struct in abi-generic.h. So in case someone has a better name and/or location for it, pls tell me.

PS: We're now down to 50 failing test modules for VS 2015.

@kinke kinke mentioned this pull request Mar 3, 2015
I.e., pass them explicitly ByVal and return them via sret.
Just like DMD. Inline assembly in druntime/phobos depends on this.
Only relevant for MinGW-w64 as MSVC uses 64-bit real.
@kinke
Copy link
Member Author

kinke commented Mar 21, 2015

@redstar: This one's good to go unless you have a better name and/or location for the LLTypeMemoryLayout struct.

@redstar
Copy link
Member

redstar commented Apr 1, 2015

I am still unsure about the LLTypeMemorylayout solution. This adds a (possible recursive) call to every struct parameter. What is the performance impact? (See #830 and #539.)


// single element? then discard wrapping struct
if (numElements == 1)
return elements[0];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it is better to check for the one-element case first?
This is really a question - are there one member structs in Phobos and druntime?

@kinke
Copy link
Member Author

kinke commented Apr 2, 2015

Subjectively, I haven't noticed any significant slowdown, but I don't have any numbers (yet). It's not invoked for every struct function parameter; on Win64, only for structs of power-of-2 sizes <= 8 bytes, on PowerPC similar iirc, and for System V only structs which can be passed in registers, i.e., generally <= 16 bytes, so that the recursion depth isn't deep. But currently the results aren't cached for each already visited type, so this could be added if there's really a significant slowdown.

Of course there are single-member structs - in upstream druntime, every associative array is a struct containing an opaque pointer, just to name a popular example. And besides druntime and phobos there's always user code too ;). The many unnecessary rewrites cluttering the LL code produced by LDC are the reason for this more thorough (and hence recursive) check.

@kinke
Copy link
Member Author

kinke commented Apr 4, 2015

Alright, some numbers for a rough estimate:

  • Clean build directory
  • make ldc2
  • time make -j3 all druntime-ldc-unittest-debug phobos2-ldc-unittest-debug, i.e., build druntime and phobos in debug & release + druntime and phobos unittests, debug only

Current merge-2.067 head + LLVM 3.5.2 in my Xubuntu x64 VM using 3 of my 4 physical cores:

real    11m36.532s
user    33m10.192s
sys 0m36.870s

With this PR:

real    11m33.704s
user    33m0.260s
sys 0m34.958s

redstar added a commit that referenced this pull request Apr 8, 2015
Win64 ABI fix: return non-POD structs via sret.
@redstar redstar merged commit 1dacce4 into ldc-developers:master Apr 8, 2015
@JohanEngelen
Copy link
Member

@kinke With these changes, I can no longer build certain files in druntime and phobos. For example runtime\druntime\src\core\sys\windows\stacktrace.d fails. They all fail with a ICE:
Assertion failed: Ty && "Invalid GetElementPtrInst indices for type!", file E:\ldc_msvc\LLVM36\include\llvm/IR/Instructions.h, line 782

I am using MSVC 2013 and LLVM 3.6. If I revert the changes from this PR, the problem is gone.

Part of the backtrace (line numbers from ldc-merge-2.067 branch):

0x000007FED9007642 (0x000000000000000A 0x0000000000000000 0x000000000012C4B8 0x000000000012C5C0), _initptd() + 0x912 bytes(s)
0x000007FED9132044 (0x000000000012CBD0 0x0000000000000240 0x000007FED91F8430 0x00000001420FFA10), abort() + 0x24 bytes(s)
0x000007FED911BF78 (0x00000001420FFA10 0x00000001420F2BD0 0xCCCCCCCC0000030E 0xCCCCCCCCCCCCCCCC), _wassert() + 0x108 bytes(s)
0x000000013FF8D00C (0x0000000000000000 0x000000000012D0D0 0xCCCCCCCCCCCCCCCC 0xCCCCCCCCCCCCCCCC), llvm::checkGEPType() + 0x4C bytes(s), e:\ldc_msvc\llvm36\include\llvm\ir\instructions.h, line 782 + 0x2E byte(s)
0x00000001400ACF14 (0x0000000002F3F598 0x0000000002F387C8 0x000000000012D150 0x000000000012D1B8), llvm::ExtractValueInst::ExtractValueInst() + 0x64 bytes(s), e:\ldc_msvc\llvm36\include\llvm\ir\instructions.h, line 1938 + 0x64 byte(s)
0x00000001400AD70B (0x0000000002F387C8 0x000000000012D1E0 0x000000000012D1B8 0x0000000000000000), llvm::ExtractValueInst::Create() + 0x8B bytes(s), e:\ldc_msvc\llvm36\include\llvm\ir\instructions.h, line 1883 + 0x5A byte(s)
0x00000001400AD9BC (0x0000000002D42F40 0x0000000002F387C8 0x000000000012D330 0x000000000012D250), llvm::IRBuilder<1,llvm::ConstantFolder,llvm::IRBuilderDefaultInserter<1> >::CreateExtractValue() + 0xDC bytes(s), e:\ldc_msvc\llvm36\include\llvm\ir\irbuilder.h, line 1488 + 0x42 byte(s)
0x00000001400BCB59 (0x0000000002F387C8 0x0000000002B37C30 0x0000000002F1B8A0 0x000007FED9000000), DtoAggrPaint() + 0xD9bytes(s), e:\ldc_msvc\ldc\gen\tollvm.cpp, line 890 + 0x79 byte(s)
0x000000014020C848 (0x0000000002D057F8 0x0000000000364CC0 0x0000000002EF6770 0x0000000002BCF290), DtoCallFunction() + 0x11B8 bytes(s), e:\ldc_msvc\ldc\gen\tocall.cpp, line 510 + 0x1D byte(s)
0x00000001401F79DA (0x000000000012DE88 0x0000000002D057F0 0xCCCCCCCCCCCCCCCC 0xCCCCCCCCCCCCCCCC), ToElemVisitor::visit() + 0x1DDA bytes(s), e:\ldc_msvc\ldc\gen\toir.cpp, line 1113 + 0x3A byte(s)
0x000000014001409B (0x0000000002D057F0 0x000000000012DE88 0xCCCCCCCCCCCCCCCC 0xCCCCCCCCCCCCCCCC), CallExp::accept() + 0x3B bytes(s), e:\ldc_msvc\ldc\dmd2\expression.h, line 933 + 0x3B byte(s)
0x00000001401E8737 (0x0000000002D057F0 0x0000000002ECDFA0 0x0000000002F1AEE0 0x0000000002F1AE80), toElem() + 0x47 bytes(s), e:\ldc_msvc\ldc\gen\toir.cpp, line 3091

Verbose compilation output right before the crash:

* * * * * * * * * DtoCallFunction()
* * * * * * * * * * Building type: pure nothrow @system inout(ulong)[](inout(ulong)[] a)
* * * * * * * * * * * DtoFunctionType(pure nothrow @system inout(ulong)[](inout(ulong)[] a))
* * * * * * * * * * * * Rewriting argument type inout(ulong)[]
* * * * * * * * * * * * * { i64, i64* } => { i64, i64* }*
* * * * * * * * * * * * Final function type: void ({ i64, i64* }*, { i64, i64* }*)
* * * * * * * * * * doing normal arguments
* * * * * * * * * * Arguments so far: (1)
* * * * * * * * * * *   %.rettmp = alloca { i64, i64* }, align 8
* * * * * * * * * * Function type: pure nothrow @system inout(ulong)[](inout(ulong)[] a)
* * * * * * * * * * DtoArgument
* * * * * * * * * * * VarExp::toElem: a @ const(ulong)[]
* * * * * * * * * * * * DtoSymbolAddress ('a' of type 'const(ulong)[]')
* * * * * * * * * * * * * function param
* * * * * * * * * * * * * type: const(ulong)[]
* * * * * * * * * * Rewrite: putParam
* * * * * * * * * * repainting return value from 'inout(ulong)[]' to 'const(ulong)[]'

So it seems there is a problem with inout(T)[] _rawDup(T)(inout(T)[] a) in object.di.

In tollvm.cpp:

LLValue* DtoAggrPaint(LLValue* aggr, LLType* as)
{
    if (aggr->getType() == as)
        return aggr;

    LLValue* res = llvm::UndefValue::get(as);

    LLValue* V = gIR->ir->CreateExtractValue(aggr, 0);   <-- ASSERTION CRASH

called from tocall.cpp line 511:

    // repaint the type if necessary
    if (resulttype)
    {
        Type* rbase = stripModifiers(resulttype->toBasetype());
        Type* nextbase = stripModifiers(tf->nextOf()->toBasetype());
        if (!rbase->equals(nextbase))
        {
            IF_LOG Logger::println("repainting return value from '%s' to '%s'", tf->nextOf()->toChars(), rbase->toChars());
            retllval->getType()->dump();
            DtoType(rbase)->dump();
            std::cout << std::endl;
            switch(rbase->ty)
            {
            case Tarray:
                if (tf->isref)
                    retllval = DtoBitCast(retllval, DtoType(rbase->pointerTo()));
                else
                retllval = DtoAggrPaint(retllval, DtoType(rbase));  <---------
                break;

I do not understand enough of the code to help you fix this :(

@redstar
Copy link
Member

redstar commented Apr 13, 2015

@kinke Should we revert this PR?

@JohanEngelen
Copy link
Member

Reverting is OK, but I'd like to understand what is going on too :)
retllval is of type %c = alloca { i64, i64* }, align 8
which is a pointer (LLVM tells me), and thus CreateExtractValue cannot be used on it.
Overall DtoCallFunction is very long, and hard to understand :'(

@redstar
Copy link
Member

redstar commented Apr 13, 2015

Only speculating: the value for retllval is a memory allocation. The pointer must be loaded with DtoLoad.
e.g. the alloc:

LLValue* darray = DtoRawAlloca(dtype, 0, ".array");

and the load:
DtoLoad(darray)

Edit: changed the line numbers!

@kinke
Copy link
Member Author

kinke commented Apr 13, 2015

Thanks Johan.

but I'd like to understand what is going on too :)

Heh me too. ;) I've never liked this repainting in case the types differ. In your example, the result (pointer to an array returned via sret) should obviously not be repainted at all, as the formal inout array result is simply converted to a const array as the single inout parameter is a const array (passed by ref and hence rewritten).
Based on your output, tf->nextOf() is inout(ulong)[] while rbase = stripModifiers(resulttype->toBasetype()) is const(ulong)[]. Stripping the modifiers of the array type itself via stripModifiers() doesn't yield an identical type; both array types are mutable. The types only differ in the qualifiers of their element type.
I.e., returning inout(ulong[]) would work, but the LDC code fails for inout(ulong)[].

I guess the best course of action would be stripping the qualifiers transitively (i.e., recursively) for array types.

@JohanEngelen
Copy link
Member

@kinke Yep, I thought the same and had tried to figure out how to strip all qualifiers (recursively) but couldn't figure it out so quickly.

@JohanEngelen
Copy link
Member

@kinke @redstar Got it!
In:

Type * stripModifiers( Type * type )
{
    if (type->ty == Tfunction)
        return type;

-    return type->castMod(0);
+    return type->unqualify(MODimmutable | MODconst | MODwild);
}

unqualify works transitively.

@JohanEngelen
Copy link
Member

@kinke Now the next error (probably related to this PR) is:

[12/497] Generating src/core/time.obj
FAILED: cmd.exe /c cd /D E:\ldc_msvc\ldc && E:\ldc_msvc\ldc\build\bin\ldc2.exe --output-o -c -IE:/ldc_msvc/ldc/runtime/druntime/src -IE:/ldc_msvc/ldc/runtime/druntime/src/gc E:/ldc_msvc/ldc/runtime/druntime/src/core/time.d -ofE:/ldc_msvc/ldc/build/runtime/src/core/time.obj -w -d -O3 -release -disable-invariants
E:/ldc_msvc/ldc/build/import\object.di(681): Error: Function type does not match previously declared function with the same mangled name: _d_newarrayU

The error is generated in functions.cpp. Dumping the LLVM types of the two unequal function types

        func->getFunctionType()->dump();
        functype->dump();

gives

{ i64, i8* } (%object.TypeInfo*, i64)
void ({ i64, i8* }*, %object.TypeInfo*, i64)

Thanks for any hints.

@kinke
Copy link
Member Author

kinke commented Apr 13, 2015

I've come up with a +5 lines solution, yours is way more elegant. ;)

Your function type mismatch appears to be related to sret. func->getFunctionType() returns a dynamic array as normal 16-bytes LL struct. functype returns it via sret though, i.e., the LL struct is allocated by the caller and the callee gets its address as implicit first parameter (and returns void, an LLVM sret convention).
So func->getFunctionType() of this runtime function seems to have bypassed our target ABI; probably because it's one of the special runtime functions which have never returned something via sret before.

PS: I think LLVM 3.6 is particularly picky about proper forward declarations. I've fixed a couple runtime function signatures lately, but apparently gABI doesn't get a say when generating the runtime function types. With LLVM 3.6 you'll probably also run into serious exception handling issues. I'd strongly recommend using LLVM 3.7.

@JohanEngelen
Copy link
Member

@kinke OK, I'll look into this tomorrow.
Note that the problematic function is extern (C). I do not know how C-calling convention handles struct return types. But I'm guessing it is not sret? So perhaps functype is the problem? (our gABI wrongly applied on extern (C) functions?)

@kinke
Copy link
Member Author

kinke commented Apr 14, 2015

There's not a lot of diffs between D and C calling conventions, apart from the reversed parameter order (to be fixed sometime) and variadic functions. So on Win64, a 16-bytes struct (dynamic D array) is to be passed by ref and to be returned via sret, for both calling conventions.

@JohanEngelen
Copy link
Member

@kinke OK, I've fixed the function fwd decl, but again hitting on another LLVM assert. I'm starting to wonder if this ever actually worked on Win64! ;-)

:'(

@JohanEngelen
Copy link
Member

@kinke In DtoNewDynArray, the CreateCallOrInvoke2 is calling _d_newarrayU runtime function, which has an sret parameter.
I don't see how that CreateCallOrInvoke2 call can succeed, without allocating memory or passing of the sret argument.
The problem arises from this commit kinke@d24fc71.
If I revert that particular change, things work better.
Edit: so I think that means there is no target, other than Win64, configured for sret dynamic arrays?

@kinke
Copy link
Member Author

kinke commented Apr 15, 2015

@JohanEngelen To be honest, I'm not sure whether I've properly tested the 2 last commits on Win64 (with LLVM 3.7) as they appeared pretty minuscule. I'll have to check that when I get home.
And yep, Win64 is currently the only target which returns dynamic arrays via sret. It doesn't really have to be this way, I just found it appropriate given that the Win64 C calling convention returns all structs > 8 bytes via sret and I think we should treat a dynamic D array as normal struct.

@JohanEngelen
Copy link
Member

@kinke I bet a bottle of whisky that it's not going to work with any version of LLVM ;)
I will change the code so that it does not use sret for arrays for now. So then at least I can develop again!

I agree it is good to follow the Win64 C calling convention, so I will try later.
To get the runtime functions to work with sret is some work, and will add some ugly code; but I'll write it so we can see the result. If there is a way to add the runtime functions properly to the AST as fwd decls (instead of directly as LLVM fwd decls), that would allow us to solve the problem in a nicer way I think (instead of special calling code we could just do a DtoCall?)

@JohanEngelen
Copy link
Member

The work continues in #901

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants