Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update core wasm32 typedefs to be compatible with reference-sysroot. #7799

Closed

Conversation

sunfishcode
Copy link
Collaborator

Change time_t from long (32-bit) to long long (64-bit). This fixes the 2038 bug. It's also consistent with the x32 ABI.

Change the fallback for wchar_t and wint_t from unsigned to int. This is not an ABI change in practice because the headers prefer to use __WCHAR_TYPE__ and __WINT_TYPE__ when they're defined, and clang defines those to be int. So this just makes the fallback cases for other hypothetical compilers consistent with clang. (And fwiw, no one specifically chose int for these, that's just the default in clang if the target doesn't override it. So if anyone feels strongly that wchar_t/wint_t should be something different, we can change them).

Change suseconds_t from long to long long. This matches x32, uses what would otherwise be padding a struct timeval, and generally makes it easier for users to avoid overflow in time calculations.

Change _Reg from __PTRDIFF_TYPE__ (32-bit) to __INT64_TYPE__ (64-bit). _Reg is only used for nlink_t and register_t. For nlink_t, this matches x32, and theoretically supports filesystems that support billions of links. For register_t, since wasm supports native i64 values, a 64-bit int is the closest mapping of this concept.

This PR doesn't currently include any provision for ABI versioning. At least in theory, the wasm32 ABI shouldn't be considered stable yet; the upstream LLVM component hasn't emerged from Experimental status yet (though it is expected to in a few months), though I'm open to practical concerns here.

Change `time_t` from `long` (32-bit) to `long long` (64-bit). This fixes the
[2038 bug](https://en.wikipedia.org/wiki/Year_2038_problem). It's also
consistent with the [x32 ABI](https://en.wikipedia.org/wiki/X32_ABI).

Change the fallback for `wchar_t` and `wint_t` from `unsigned` to `int`. This
is not an ABI change in practice because the headers prefer to use
`__WCHAR_TYPE__` and `__WINT_TYPE__` when they're defined, and clang defines
those to be `int`. So this just makes the fallback cases for other
hypothetical compilers consistent with clang. (And fwiw, no one specifically
chose `int` for these, that's just the default in clang if the target doesn't
override it. So if anyone feels strongly that wchar_t/wint_t should be
something different, we can change them).

Change `suseconds_t` from `long` to `long long`. This matches x32, uses what
would otherwise be padding a `struct timeval`, and generally makes it easier
for users to avoid overflow in time calculations.

Change `_Reg` from `__PTRDIFF_TYPE__` (32-bit) to `__INT64_TYPE__` (64-bit).
`_Reg` is only used for `nlink_t` and `register_t`. For `nlink_t`, this
matches x32, and theoretically supports filesystems that support billions
of links. For `register_t`, since wasm supports native `i64` values, a 64-bit
int is the closest mapping of this concept.
@kripken
Copy link
Member

kripken commented Jan 3, 2019

This seems good to me. Let's see what others think though. Also would be good to confirm that this is in the right direction for #7649 (cc @rianhunter).

I suspect some tests will fail here, in particular library_syscall.js may depend on those types, and if so would need updates.

This will require a minor version bump when landing so system libraries are rebuilt.

This includes off_t, ino_t, and several other POSIX types.
@sunfishcode
Copy link
Collaborator Author

It turns out that I was confused about the relationship between alltypes.h.in and alltypes.h. I've updated a few more types now so that people can see what I'm proposing, though I still need to sanity-check everything.

@rianhunter
Copy link
Contributor

rianhunter commented Jan 3, 2019 via email

Update library_syscall.js to reflect that several types are i64 now.

Struct stat is changing, so let's also update it to something with
fewer odd legacy fields.
@@ -1,22 +1,20 @@
/* copied from kernel definition, but with padding replaced
* by the corresponding correctly-sized userspace types. */

struct stat
Copy link
Contributor

@rianhunter rianhunter Jan 3, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what struct stat definition is this mirroring? is it the stat used in the x32 linux ABI?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it's the x32 linux ABI. I've now added a comment.

@sunfishcode
Copy link
Collaborator Author

These changes are consistent with changes I also had plans to make in #7649. My only request is the addition of a minimal "emscripten metadata" section for the generated wasm before making ABI changes like these. Without that it will be impossible for wasmjit (and others) to detect that this code is incompatible with the ABI it currently targets. Users who unwittingly build wasm files with incompatible versions of emscripten won't be able to easily discover the root problem, and think either wasmjit or emscripten is broken.

Would you be willing to submit a patch for this part? I don't know what you need here, and I'm not very familiar with the parts of Emscripten and Binaryen that would be involved.

@rianhunter
Copy link
Contributor

rianhunter commented Jan 3, 2019 via email

src/library.js Outdated
}
return ret;
},

difftime: function(time1, time0) {
return time1 - time0;
difftime: function(time1lo, time1hi, time0lo, time0hi) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please use l, h instead of lo, hi for consistency with other similar things in this file

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

return low;
},
// Like get64, but read high first and low second.
get64hilo: function() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how about reversed or highLow instead of hilo?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed; changed to reversed.

// Like get64, but read high first and low second.
get64hilo: function() {
var high = SYSCALLS.get(), low = SYSCALLS.get();
if (low >= 0) assert(high === 0);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could this check with a >> 31 like in library.js? or is there a reason to do it differently here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was just copied from the existing get64. I've now updated both versions to use the >> 31 check.

@@ -858,7 +858,7 @@ static void *dl_mmap(size_t n)
void *p;
int prot = PROT_READ|PROT_WRITE, flags = MAP_ANONYMOUS|MAP_PRIVATE;
#ifdef SYS_mmap2
p = (void *)__syscall(SYS_mmap2, 0, n, prot, flags, -1, 0);
p = (void *)__syscall(SYS_mmap2, 0, n, prot, flags, -1, 0, 0);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this change is in general musl code, not in the emscripten-specific part - why is it needed?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, you're right. It isn't needed. I've now reverted this change.

return (void *)syscall(SYS_mmap2, start, len, prot, flags, fd, off/UNIT);
off_t pgoffset = off / UNIT;
return (void *)syscall(SYS_mmap2, start, len, prot, flags, fd,
(int32_t)pgoffset, ((int32_t)(pgoffset >> 32)));
#else
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same question as above

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same answer :-).

(['-O2'], 18, ['assert'], ['waka'], 12616, 17, 15, 34), # noqa
(['-O3'], 7, [], [], 2706, 10, 2, 22), # noqa; in -O3, -Os and -Oz we metadce
(['-Os'], 7, [], [], 2706, 10, 2, 22), # noqa
(['-Oz'], 7, [], [], 2706, 10, 2, 22), # noqa
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like some extra function survives metadce, so it's actually used - what is it?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's nullFunc_jiji. We have one more unique signature now.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused, I don't see any nullFunc methods when I build say ./emcc tests/hello_world.c -Os -profiling. Reading emscripten.py those should only be emitted when ASSERTIONS are on. So that doesn't explain the change in the metadce levels here, -O3, -Os, -Oz?

@@ -1,4 +1,4 @@
zlib version 1.2.5 = 4688, compile flags = 85
zlib version 1.2.5 = 4688, compile flags = 149
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's this?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -1 +1 @@
"1.38.21"
"1.38.22"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of bumping minor version, we should bump to 1.39.0, because this is a breaking ABI change. (like we did in #5916 (comment))

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not opposed to that, but I don't think we have a strict rule here - even minor version number updates can break ABI, e.g. with an LLVM update. I did agree in the linked issue that we should do a major one because it was a large change, and because it could be particularly confusing and error-prone, but in this PR here the change is minor (unless I'm missing something), so I don't feel strongly either way.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, I think this is worth bumping to 1.39
I don't recall whether we came to a strict rule but I hope we can eventually (soon) get to the point where we can have a strict rule of bumping the minor for ABI breaks.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good. Then we should land this with a minor version, wait to see we are reasonably stable, then bump the major version, as documented here: https://github.com/kripken/emscripten/blob/incoming/docs/process.md#major-version-update-1xy-to-1x10

@rianhunter
Copy link
Contributor

ABI versioning PR here: #7815

@rianhunter
Copy link
Contributor

Hey @sunfishcode could you use some help with this? I have some spare cycles this week, I might be able to shepherd this in if you're prioritizing other things.

@sunfishcode
Copy link
Collaborator Author

Hi @rianhunter, sorry for being slow to respond here. If you're still able to help out here, that'd be a great!

kripken pushed a commit that referenced this pull request Apr 24, 2019
Related to #7649 (point 5) and subset of #7799

* Update python.bc

* Update output wasm sizes

* Update wasm backend tests
VirtualTim pushed a commit to VirtualTim/emscripten that referenced this pull request May 21, 2019
Related to emscripten-core#7649 (point 5) and subset of emscripten-core#7799

* Update python.bc

* Update output wasm sizes

* Update wasm backend tests
@sbc100 sbc100 closed this Jan 30, 2020
@sbc100 sbc100 reopened this Jan 31, 2020
@sbc100 sbc100 changed the base branch from incoming to master January 31, 2020 01:21
@sunfishcode
Copy link
Collaborator Author

These changes are now best pursued by independent PRs such as #8467.

@sunfishcode sunfishcode deleted the match-reference-sysroot branch March 29, 2020 16:23
belraquib pushed a commit to belraquib/emscripten that referenced this pull request Dec 23, 2020
Related to emscripten-core#7649 (point 5) and subset of emscripten-core#7799

* Update python.bc

* Update output wasm sizes

* Update wasm backend tests
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants