-
Notifications
You must be signed in to change notification settings - Fork 3.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update core wasm32 typedefs to be compatible with reference-sysroot. #7799
Update core wasm32 typedefs to be compatible with reference-sysroot. #7799
Conversation
Change `time_t` from `long` (32-bit) to `long long` (64-bit). This fixes the [2038 bug](https://en.wikipedia.org/wiki/Year_2038_problem). It's also consistent with the [x32 ABI](https://en.wikipedia.org/wiki/X32_ABI). Change the fallback for `wchar_t` and `wint_t` from `unsigned` to `int`. This is not an ABI change in practice because the headers prefer to use `__WCHAR_TYPE__` and `__WINT_TYPE__` when they're defined, and clang defines those to be `int`. So this just makes the fallback cases for other hypothetical compilers consistent with clang. (And fwiw, no one specifically chose `int` for these, that's just the default in clang if the target doesn't override it. So if anyone feels strongly that wchar_t/wint_t should be something different, we can change them). Change `suseconds_t` from `long` to `long long`. This matches x32, uses what would otherwise be padding a `struct timeval`, and generally makes it easier for users to avoid overflow in time calculations. Change `_Reg` from `__PTRDIFF_TYPE__` (32-bit) to `__INT64_TYPE__` (64-bit). `_Reg` is only used for `nlink_t` and `register_t`. For `nlink_t`, this matches x32, and theoretically supports filesystems that support billions of links. For `register_t`, since wasm supports native `i64` values, a 64-bit int is the closest mapping of this concept.
This seems good to me. Let's see what others think though. Also would be good to confirm that this is in the right direction for #7649 (cc @rianhunter). I suspect some tests will fail here, in particular This will require a minor version bump when landing so system libraries are rebuilt. |
This includes off_t, ino_t, and several other POSIX types.
It turns out that I was confused about the relationship between alltypes.h.in and alltypes.h. I've updated a few more types now so that people can see what I'm proposing, though I still need to sanity-check everything. |
These changes are consistent with changes I also had plans to make in #7649.
My only request is the addition of a minimal "emscripten metadata" section for the generated wasm before making ABI changes like these. Without that it will be impossible for wasmjit (and others) to detect that this code is incompatible with the ABI it currently targets. Users who unwittingly build wasm files with incompatible versions of emscripten won't be able to easily discover the root problem, and think either wasmjit or emscripten is broken.
|
Update library_syscall.js to reflect that several types are i64 now. Struct stat is changing, so let's also update it to something with fewer odd legacy fields.
@@ -1,22 +1,20 @@ | |||
/* copied from kernel definition, but with padding replaced | |||
* by the corresponding correctly-sized userspace types. */ | |||
|
|||
struct stat |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what struct stat
definition is this mirroring? is it the stat used in the x32 linux ABI?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it's the x32 linux ABI. I've now added a comment.
Would you be willing to submit a patch for this part? I don't know what you need here, and I'm not very familiar with the parts of Emscripten and Binaryen that would be involved. |
That is fair, I can do that ASAP.
|
src/library.js
Outdated
} | ||
return ret; | ||
}, | ||
|
||
difftime: function(time1, time0) { | ||
return time1 - time0; | ||
difftime: function(time1lo, time1hi, time0lo, time0hi) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please use l, h
instead of lo, hi
for consistency with other similar things in this file
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
src/library_syscall.js
Outdated
return low; | ||
}, | ||
// Like get64, but read high first and low second. | ||
get64hilo: function() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how about reversed
or highLow
instead of hilo
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed; changed to reversed
.
src/library_syscall.js
Outdated
// Like get64, but read high first and low second. | ||
get64hilo: function() { | ||
var high = SYSCALLS.get(), low = SYSCALLS.get(); | ||
if (low >= 0) assert(high === 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could this check with a >> 31
like in library.js? or is there a reason to do it differently here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was just copied from the existing get64
. I've now updated both versions to use the >> 31
check.
system/lib/libc/musl/ldso/dynlink.c
Outdated
@@ -858,7 +858,7 @@ static void *dl_mmap(size_t n) | |||
void *p; | |||
int prot = PROT_READ|PROT_WRITE, flags = MAP_ANONYMOUS|MAP_PRIVATE; | |||
#ifdef SYS_mmap2 | |||
p = (void *)__syscall(SYS_mmap2, 0, n, prot, flags, -1, 0); | |||
p = (void *)__syscall(SYS_mmap2, 0, n, prot, flags, -1, 0, 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this change is in general musl code, not in the emscripten-specific part - why is it needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, you're right. It isn't needed. I've now reverted this change.
system/lib/libc/musl/src/mman/mmap.c
Outdated
return (void *)syscall(SYS_mmap2, start, len, prot, flags, fd, off/UNIT); | ||
off_t pgoffset = off / UNIT; | ||
return (void *)syscall(SYS_mmap2, start, len, prot, flags, fd, | ||
(int32_t)pgoffset, ((int32_t)(pgoffset >> 32))); | ||
#else |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same question as above
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same answer :-).
(['-O2'], 18, ['assert'], ['waka'], 12616, 17, 15, 34), # noqa | ||
(['-O3'], 7, [], [], 2706, 10, 2, 22), # noqa; in -O3, -Os and -Oz we metadce | ||
(['-Os'], 7, [], [], 2706, 10, 2, 22), # noqa | ||
(['-Oz'], 7, [], [], 2706, 10, 2, 22), # noqa |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks like some extra function survives metadce, so it's actually used - what is it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's nullFunc_jiji
. We have one more unique signature now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm confused, I don't see any nullFunc methods when I build say ./emcc tests/hello_world.c -Os -profiling
. Reading emscripten.py those should only be emitted when ASSERTIONS are on. So that doesn't explain the change in the metadce levels here, -O3, -Os, -Oz
?
@@ -1,4 +1,4 @@ | |||
zlib version 1.2.5 = 4688, compile flags = 85 | |||
zlib version 1.2.5 = 4688, compile flags = 149 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what's this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
zlib apparently prints out some flags and one of them reflects the size of off_t
.
https://github.com/kripken/emscripten/blob/incoming/tests/zlib/zutil.c#L55
https://github.com/kripken/emscripten/blob/incoming/tests/zlib/example.c#L531
@@ -1 +1 @@ | |||
"1.38.21" | |||
"1.38.22" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of bumping minor version, we should bump to 1.39.0
, because this is a breaking ABI change. (like we did in #5916 (comment))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not opposed to that, but I don't think we have a strict rule here - even minor version number updates can break ABI, e.g. with an LLVM update. I did agree in the linked issue that we should do a major one because it was a large change, and because it could be particularly confusing and error-prone, but in this PR here the change is minor (unless I'm missing something), so I don't feel strongly either way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1, I think this is worth bumping to 1.39
I don't recall whether we came to a strict rule but I hope we can eventually (soon) get to the point where we can have a strict rule of bumping the minor for ABI breaks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good. Then we should land this with a minor version, wait to see we are reasonably stable, then bump the major version, as documented here: https://github.com/kripken/emscripten/blob/incoming/docs/process.md#major-version-update-1xy-to-1x10
ABI versioning PR here: #7815 |
Hey @sunfishcode could you use some help with this? I have some spare cycles this week, I might be able to shepherd this in if you're prioritizing other things. |
Hi @rianhunter, sorry for being slow to respond here. If you're still able to help out here, that'd be a great! |
Related to emscripten-core#7649 (point 5) and subset of emscripten-core#7799 * Update python.bc * Update output wasm sizes * Update wasm backend tests
These changes are now best pursued by independent PRs such as #8467. |
Related to emscripten-core#7649 (point 5) and subset of emscripten-core#7799 * Update python.bc * Update output wasm sizes * Update wasm backend tests
Change
time_t
fromlong
(32-bit) tolong long
(64-bit). This fixes the 2038 bug. It's also consistent with the x32 ABI.Change the fallback for
wchar_t
andwint_t
fromunsigned
toint
. This is not an ABI change in practice because the headers prefer to use__WCHAR_TYPE__
and__WINT_TYPE__
when they're defined, and clang defines those to beint
. So this just makes the fallback cases for other hypothetical compilers consistent with clang. (And fwiw, no one specifically choseint
for these, that's just the default in clang if the target doesn't override it. So if anyone feels strongly that wchar_t/wint_t should be something different, we can change them).Change
suseconds_t
fromlong
tolong long
. This matches x32, uses what would otherwise be padding astruct timeval
, and generally makes it easier for users to avoid overflow in time calculations.Change
_Reg
from__PTRDIFF_TYPE__
(32-bit) to__INT64_TYPE__
(64-bit)._Reg
is only used fornlink_t
andregister_t
. Fornlink_t
, this matches x32, and theoretically supports filesystems that support billions of links. Forregister_t
, since wasm supports nativei64
values, a 64-bit int is the closest mapping of this concept.This PR doesn't currently include any provision for ABI versioning. At least in theory, the wasm32 ABI shouldn't be considered stable yet; the upstream LLVM component hasn't emerged from Experimental status yet (though it is expected to in a few months), though I'm open to practical concerns here.