-
Notifications
You must be signed in to change notification settings - Fork 38
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add uint32_t i32 field to Value union #517
Conversation
This ultimately need typed execution API because |
What was the blocker on this? It would make the Rust API a tiny bit nicer too. |
Unit tests expectations update. |
Codecov Report
@@ Coverage Diff @@
## master #517 +/- ##
==========================================
- Coverage 99.31% 99.30% -0.02%
==========================================
Files 72 72
Lines 10257 10304 +47
==========================================
+ Hits 10187 10232 +45
- Misses 70 72 +2
Flags with carried forward coverage won't be shown. Click here to find out more.
|
This is now blocked on explicitly typing number literals in the tests. |
@@ -96,7 +97,6 @@ namespace fizzy::test | |||
{ | |||
inline uint32_t as_uint32(fizzy::Value value) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Eventually we should also remove this, if we merge this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@chfast do you want to remove it in a separate commit, or a separate PR? I can do the removal too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, actually shouldn't all the test code be using as<uint32_t>()
? We can apply that parallel to this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, actually shouldn't all the test code be using
as<uint32_t>()
? We can apply that parallel to this PR.
The Result()
was checking this to some distinct. But I went with additional type checking to solved it for good.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
1cfb307
to
bf0c745
Compare
@@ -115,7 +115,7 @@ class OperandStack | |||
m_top = m_bottom - 1; | |||
|
|||
const auto local_variables = std::copy_n(args, num_args, m_locals); | |||
std::fill_n(local_variables, num_local_variables, 0); | |||
std::fill_n(local_variables, num_local_variables, Value{}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, I'm confused by union initialization again. Does default initialization like this set the active member of the union?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm confused too. This was only found by GCC coverage build.
The 0
is definitely wrong because it calls the special constructor and inits .i32
only.
But the default constructor should be wrong too because it should only init the first field and .i32
is the first one. This works however for current compilers. I wander if MSan can help here. Would be good if you can check the spec yourself.
One solution to solve this is to provide own implementation of the default constructor which inits .i64
(or define i64 = 0;
). But with that the Value
looses some properties like "is trivially constructable".
Second solution is to move .i64
to the first position. This guarantees the Value{}
to be fully zero-initialized. Same for FizzyValue
in C API.
But I don't see any sanity solution to init locals with zero without knowing the type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't believe I'm doing it again to myself, but:
- The
Value{}
is value-initialization. - if T is a class type with a default constructor that is neither user-provided nor deleted (that is, it may be a class with an implicitly-defined or defaulted default constructor), the object is zero-initialized.
- Zero-initialization: If T is a union type, the first non-static named data member is zero-initialized and all padding is initialized to zero bits.
So I think the Value{}
is correct here as it zeros all object's bytes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems to depend on the interpretation of what is "padding" for a union - whether unused bits of inactive members are "padding". I see some people having doubts about this here: https://stackoverflow.com/questions/11812555/how-to-zero-initialize-an-union#comment15704643_11812606 I have not found a definite answer to this yet.
However, I had a concern about something else (perhaps not directly related to this PR): we zero-initialize Value
s of locals here, which might or might not choose the active member of the union (I have not found the answer to this either). Then function may access them right away with local.get
, then some other operation may access some member of it - can be any type member. Then this may or may not be UB.
We can add a test for this, but as I remember sanitizers didn't catch union type punning.
And if we're required to set the right type, then there's no way around needing the types of locals when initializing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can add a test for this, but as I remember sanitizers didn't catch union type punning.
Well, the tests you added here (execute.local_init
) already check this actually. Because checking the result of a function reads some specific member of Value
union.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems to depend on the interpretation of what is "padding" for a union - whether unused bits of inactive members are "padding". I see some people having doubts about this here: https://stackoverflow.com/questions/11812555/how-to-zero-initialize-an-union#comment15704643_11812606 I have not found a definite answer to this yet.
Agree. This definition issues would be nice to clear.
However, I had a concern about something else (perhaps not directly related to this PR): we zero-initialize
Value
s of locals here, which might or might not choose the active member of the union (I have not found the answer to this either).
"the object's first non-static named data member is zero initialized and padding is initialized to zero bits"
This indicates that the first member is initialized then the active one.
Then function may access them right away with local.get
, then some other operation may access some member of it - can be any type member. Then this may or may not be UB.
We can add a test for this, but as I remember sanitizers didn't catch union type punning.
This is true. I have found multiple notes that GCC explicitly supports union-based type punning as some kind of C++ extension. But I cannot find it in documentation.
It is also said that Clang's UBSan should report all UB's Clang recognizes (otherwise it is a compiler bug), so this can indicate this is also supported in Clang.
And if we're required to set the right type, then there's no way around needing the types of locals when initializing.
That's would be very pedantic way of handling this. I would stay with investigating C++ standard and compiler documentation + have many tests for this.
Another alternative to investigate is to use std::memset()
as a way of fixing theoretical UB (similarly to std::memcpy()
being a (the only) correct way of doing type punning).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did some experiments and it looks like GCC and MSVC inits the whole object while Clang only the first element: https://godbolt.org/z/cWscY8.
See also: https://stackoverflow.com/questions/37226837/union-zero-initialization-with-clang-vs-gcc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This may be a good reason to make i64
the first member. Are there any advantages for it to be the second other than aesthetical?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just esthetics. I'm keeping it though for now to catch issues and add regression tests.
4678dff
to
338d46c
Compare
I don't see any significant speed difference except for noise. The GCC LTO build is much noisier.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks fine to me
52786d4
to
37a3470
Compare
@@ -47,7 +47,7 @@ TEST(api, execution_result_value) | |||
const ExecutionResult result = Value{1234_u32}; | |||
EXPECT_FALSE(result.trapped); | |||
EXPECT_TRUE(result.has_value); | |||
EXPECT_EQ(result.value.i64, 1234_u32); | |||
EXPECT_EQ(result.value.i32, 1234_u32); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Notpick, we do not actually need _u32
here, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No. But using _u32
gives higher confidence (i.e. someone has thought what the result type is expected). Still gray area though.
This allows using i32 values directly what is less confusing than previous emulation with storing these as uint64_t.
Problems this introduces
stack.top() = 0
only zeros the.i32
now.FizzyValue{1}
only inits the.i32
part. This is excessively used in C API tests. And cannot be disabled. In C++20 you can doFizzyValue{.i64 = 1}
.