Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

implement sizeof and alignof operator #5664

Closed
wants to merge 42 commits into from

Conversation

krux02
Copy link
Contributor

@krux02 krux02 commented Apr 5, 2017

Don't pull yet. It's almost done. I only want to see if I already broke something by running the tests on the server.

This is the issue I am working on #5493

@zah
Copy link
Member

zah commented Apr 5, 2017

I've always had a paranoid idea for implementing this feature btw.

You can also modify the codegen to introduce a static assert that Nim's assumption was right. Below each C type declaration produced by Nim, you can use the NIM_CHECK_SIZE macro from nimbase.h to verify that your calculation is working correctly (if we don't run into issue in the next few months, we'll disable these checks).

@krux02
Copy link
Contributor Author

krux02 commented Apr 5, 2017

Well I checked how the alignment of C structs actually behave. primitive types get self aligned, and structs get aligned to it's biggest alignment of a member. Compilers may not behave different here, because otherwise headers would not work for interfaces when they are compiled to a different memory layout.

But there are ways to change the behaviour of alignment. I think that when I am already doing this, there should be an align pragma that allows me to set alignment of any type manually.

@arnetheduck
Copy link
Contributor

keep in mind that alignment changes with os, architecture and compiler options - here's a telling example:

A double (eight bytes) will be 8-byte aligned on Windows and 4-byte aligned on Linux (8-byte with -malign-double compile time option).

https://en.wikipedia.org/wiki/Data_structure_alignment#Architectures

@arnetheduck
Copy link
Contributor

there's also a feature -d:checkAbi if you want to compare what nim thinks with what C thinks for importc types - unfortunately the nim standard library is severly lacking in this regard and many crucial structs are simply wrong from an abi-point-of-view

https://github.com/rust-lang/rust/tree/master/src/librustc_back/target is a good reference for common layouts - look for the data_layout string and the docs in http://llvm.org/docs/LangRef.html#data-layout

@krux02
Copy link
Contributor Author

krux02 commented Apr 8, 2017

@arnetheduck Thanks for the info, I will take that into consideration. Currently I am on Linux (64-bit) and here the double is 8 byte aligned, so I would not have noticed. I think the best thing to do here would be an alignment map (just a few constants that can be initialized with a few when statements.

result = szNonConcreteType
result = 1

proc computeObjectOffsetsRecursive(n: PNode, initialOffset: BiggestInt): tuple[size, align: BiggestInt] =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is now enough code that it should reside it its own module. Suggested name: sizeofimpl.nim

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where should that file be? same folder as types.nim, and then just include it from types.nim?

Copy link
Member

@Araq Araq Apr 19, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

module would be nicer than include file and it should not have many dependencies (ast.nim, and not much else?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well I guess that would be possible

@krux02
Copy link
Contributor Author

krux02 commented Apr 19, 2017

I got a very weird problem:

In computeObjectOffsetsRecursive I print the result, to check if it has the correct value, and in fact it has. But when I print the exact same values from
where I called the function

This is the output I get:

result.size: 16 result.align: 8
       size: 24        align: 8

I get different values. How is that possible? I made sure that the output is only printed for one single type, where I know that the result is wrong.

this is the current testfile in the root of the project: ./koch temp c -r testsizeof

@krux02
Copy link
Contributor Author

krux02 commented Apr 19, 2017

ok, I got the problem fixed. There was one layer of recursion more of the same function that I thought that there would be and that changed the result. Now a lot more tests actually pass.

@krux02 krux02 force-pushed the sizeof-alignof branch 2 times, most recently from 6f84bba to 7cc43d5 Compare April 25, 2017 17:08
@Araq
Copy link
Member

Araq commented May 4, 2017

What's the status of this? Ready to be reviewed? (I might need this feature soon.)

@krux02
Copy link
Contributor Author

krux02 commented May 5, 2017

I thought so, but then I realized I am missing out on the packed pragma. That is the latest TODO I added. But there is one question that I have. I realized that enums have always an alignment of 2 bytes, no matter the size. Is that intentional? I thought an enum of the size of an int should also have the alignment of an int.

@Araq
Copy link
Member

Araq commented May 5, 2017

I realized that enums have always an alignment of 2 bytes, no matter the size. Is that intentional? I thought an enum of the size of an int should also have the alignment of an int.

That's a bug. Most enums fit in a single byte and so should get "byte alignment" (i.e. no alignment). An int sized enum should be aligned like an int, yes.

@jangko
Copy link
Contributor

jangko commented May 9, 2017

recently I stumbled upon something like this in windows related header files of mingw-gcc.

#include <pshpack2.h>
  typedef struct tagBITMAPFILEHEADER {
    WORD bfType;
    DWORD bfSize;
    WORD bfReserved1;
    WORD bfReserved2;
    DWORD bfOffBits;
  } BITMAPFILEHEADER,*LPBITMAPFILEHEADER,*PBITMAPFILEHEADER;
#include <poppack.h>

gcc documentation says those kind of struct alignment exist to provide compatibility for Microsoft's compiler. gcc itself do not use them in it's own header files, only in windows related header files.

pshpackXX force the compiler to use certain alignment of 1,2,4, and 8.

As far as I know, Nim compiler only support byte alignment with packed pragma (I might be wrong).

clang compiler usually have compatible features with gcc although it is not in C standard.

icc also support this feature because it claims source and binary compatibility with msvc compiler.

I think if Nim want to support this feature, this PR is the right place to do it since sizeof, alignof and offsetof all deals with alignment.

alignment up to 32 bytes might be needed with AVX instructions.

@Araq, @krux02, what do you think?

@krux02
Copy link
Contributor Author

krux02 commented May 9, 2017

I don't know of the pshpack2 header and how it works. But since it's a non-standard C feature, I think there is no need to support it. That means that wrappers for that non-standard struct should not expose the struct as an object but make the fields accessable via getters/setters. Or use the packed pragma und add alignment bytes manually. An align pragma I think should be supported. I will see what I can do.

@jangko
Copy link
Contributor

jangko commented May 9, 2017

pshpackXX and poppack contains something like this:

#pragma pack(push,2)

#pragma pack(pop)

#pragma pack(push,1) basically equals with Nim packed pragma.

dotExpr = n[1][0]
else:
#echo "evaluating: ", n, " type1: ", type1, " type2: ", type2
debug n[1]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

debug code left over.

@Parashurama
Copy link
Contributor

Parashurama commented Jun 15, 2017

I checked-out the PR locally, here are some results running tests/misc/tsizeof.nim: (x86_64 GNU/Linux)

I could get any useful results with packed pragma (some debug ouput).

RESULTS without {.packed.}
running nim -r c --cc:gcc tests/misc/tsizeof.nim:

TrivialType:           size:    3 !=     4 align:    1 at line tsizeof.nim(233,18)
InheritanceB.b offset: 16 != 10 at line tsizeof.nim(263,15)
InheritanceC.c offset: 24 != 12 at line tsizeof.nim(264,15)

running nim -r c --cc:gcc --cpu:i386 --passc: -m32 --passL: -m32 tests/misc/tsizeof.nim:

TrivialType:           size:    3 !=     4 align:    1 at line tsizeof.nim(233,18)
SimpleAlignment:       size:   12 !=    16 align:    8 at line tsizeof.nim(233,18)
AlignAtEnd:            size:   12 !=    16 align:    8 at line tsizeof.nim(233,18)
SimpleBranch:          size:   12 !=    16 align:    8 at line tsizeof.nim(233,18)
PaddingBeforeBranchA:  size:   12 !=    16 align:    8 at line tsizeof.nim(233,18)
RecursiveStuff:        size:   28 !=    40 align:    8 at line tsizeof.nim(233,18)
SimpleAlignment.c offset: 4 != 8 at line tsizeof.nim(241,15)
SimpleBranch.a offset: 4 != 8 at line tsizeof.nim(246,15)
SimpleBranch.b offset: 4 != 8 at line tsizeof.nim(247,15)
SimpleBranch.c offset: 4 != 8 at line tsizeof.nim(248,15)
PaddingBeforeBranchA.a offset: 4 != 8 at line tsizeof.nim(251,15)
Bazing.a offset: 4 != 8 at line tsizeof.nim(260,15)
InheritanceB.b offset: 8 != 6 at line tsizeof.nim(263,15)
InheritanceC.c offset: 12 != 8 at line tsizeof.nim(264,15)
RecursiveStuff.a offset: 4 != 8 at line tsizeof.nim(276,15)
RecursiveStuff.b offset: 4 != 8 at line tsizeof.nim(277,15)
RecursiveStuff.kind2 offset: 4 != 8 at line tsizeof.nim(278,15)
RecursiveStuff.ca1 offset: 8 != 16 at line tsizeof.nim(279,15)
RecursiveStuff.ca2 offset: 12 != 20 at line tsizeof.nim(280,15)
RecursiveStuff.cb offset: 8 != 16 at line tsizeof.nim(281,15)
RecursiveStuff.cc offset: 8 != 16 at line tsizeof.nim(282,15)
RecursiveStuff.d1 offset: 16 != 24 at line tsizeof.nim(283,15)
RecursiveStuff.d2 offset: 20 != 32 at line tsizeof.nim(284,15)

running nim -r c --cc:clang tests/misc/tsizeof.nim:

TrivialType:           size:    3 !=     4 align:    1 at line tsizeof.nim(233,18)
InheritanceB.b offset: 16 != 10 at line tsizeof.nim(263,15)
InheritanceC.c offset: 24 != 12 at line tsizeof.nim(264,15)

@krux02
Copy link
Contributor Author

krux02 commented Jun 15, 2017

@Parashurama thanks for the help. I am pretty sure where the problem is. Depending on the architecture and the platform the alignment might not be self aligned for all primitive types. My implementation does not handle those cases properly. Here is where the branching needs to take place:

https://github.com/nim-lang/Nim/pull/5664/files#diff-76f3d75559ee07c14682d1bd5f58e5f9R202

Here some branching based on the platform needs to happen, to get the correct alignment for the different types.

@krux02
Copy link
Contributor Author

krux02 commented Jul 24, 2017

Well this pull request is generally correct. The only problem it is facing now is that the alignment values for the builtin types are not set correctly by the compiler as described here:
#6094

@Parashurama
Copy link
Contributor

Parashurama commented Aug 19, 2017

Well, alignment issues appear to be fixed with latest devel. (I tested by rebasing your branch on devel)
Can you revisit your PR?

if size == szImportedType:
# Forward to the c code generation to emit a `sizeof` in the C code.
result = n
elif size >= 0:
Copy link
Member

@timotheecour timotheecour Oct 8, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will that work with incomplete types that have at least 1 field?

type
  MyStruct {.importc: "MyStruct".} = object
    field1: int
    # missing field2

see also #9250 for a more robust solution (that can be implemented after your PR)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

originally I was looking at the wrong location for the importc flag. Now it should work properly.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@krux02 is my assessment in #9250 correct that currently you must pessimistically reject computing sizeof for any type annotated as importc ? feel free to comment in #9250

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you are correct.

@timotheecour
Copy link
Member

@krux02
github says: "Conflicting files"
btw, instead of a giant merge that will mess up very old git history for everyone, would the following be ok:

  • either attempt git rebase -i (potentially painful in this case, I don't know)
  • or, "manual squash + rebase", by locally, merge against devel, then create a single commit based on the diff from devel

@krux02
Copy link
Contributor Author

krux02 commented Oct 11, 2018

@Araq this is the test that is currently failing:
https://github.com/krux02/Nim/blob/sizeof-alignof/tests/typerel/trectuples.nim#L6

Can you explain me, why this should be an illegal recursion?

@timotheecour
Copy link
Member

timotheecour commented Oct 11, 2018

Can you explain me, why this should be an illegal recursion?

odd indeed that these are invalid:
type Node = tuple[left: ref Node]
and
type A = tuple[B: ptr A]

given that these are valid:

  type A = object
    B: ptr A

  type A = object
    B: ref A

also, the test in https://github.com/krux02/Nim/blob/sizeof-alignof/tests/typerel/trectuples.nim#L6 is not great since the lines after line 6 are not processed

@krux02 krux02 closed this Oct 11, 2018
@krux02 krux02 reopened this Oct 11, 2018
@krux02 krux02 closed this Oct 11, 2018
@krux02 krux02 reopened this Oct 11, 2018
@krux02
Copy link
Contributor Author

krux02 commented Oct 12, 2018

Recursive types need to be nominal types. Otherwise type equality checks need to have cycle detection, and that is expensive. Forbidding that kind of recursive types isn't really a big limitation. The reason for this limitation could be documented better though.

@timotheecour
Copy link
Member

=> squashed + rebased against devel in #9356 to allow fast fwd merge

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants