-
Notifications
You must be signed in to change notification settings - Fork 30k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
zlib Illegal Instruction on CPU without SSSE3 support #32553
Comments
That is a bit odd, since we guard the optimization on a flag that will be set only if the CPU supports the required instructions (https://cs.chromium.org/chromium/src/third_party/zlib/deflate.c?l=1525). It seems that SSE3 was introduced in early 2004 (https://en.wikipedia.org/wiki/SSE3), or 16 years ago. Unfortunately I don't have any hardware available without SSE3 to debug. Could you provide further details about the environment? (i.e. CPU model, etc) |
@Adenilson It looks like it’s I would have assumed that instead of passing compiler flags, the optimized zlib code uses (Also, minor side note just so we’re on the same page: SSSE3 !== SSE3.) |
CPU seems to be a Xeon 7140M http://www.cpu-world.com/CPUs/Xeon/Intel-Xeon%20MP%207140M%20-%20LF80550KG096007%20(BX805507140M).html |
That makes sense, the 7100 series were probably the last Xeons without SSSE3 and SSE4.2 support. I'm not sure what to do here. This kind of breakage in a release line isn't great but on the other hand, we're talking 14 year old systems here. I'm inclined to mark the PR as "don't backport to the LTS branches" (edit: or remove the |
Agreed @bnoordhuis, even though we don't have any guarantees on which CPUs are supported within a major release (at least not according to https://github.com/nodejs/node/blob/master/BUILDING.md), we should prevent this breaking change at least on LTS.
We should test this firsts to make sure it works. Any tips on how @Strum355 can test it? |
A docker image would probably be a convenient way for me to test this if that suits :) |
The portable C code should work fine without SSSE3 flags. Chromium has SSE2 as minimal requirements (https://support.google.com/chrome/a/answer/7100626?hl=en), so crashing on a Xeon without SSSE3 is not acceptable. @addaleax yes, we are moving towards using attribute(target()) (e.g. https://cs.chromium.org/chromium/src/third_party/zlib/contrib/optimizations/insert_string.h?l=23) which I believe is better than using explicit compiler switches. IIRC, node has its own build system so it should be simple to fix (i.e. don't pass the compiler flag to deflate.c). If that requires further changes (i.e. moving code around), we can coordinate the work to make sure that upstream (Chromium zlib) adopts the same fixes. I will have breakfast and start looking on which changes would be required on Chromium side. |
Would be ok to share a test binary here so we validate if it is crashing (or not) on a machine without SSSE3? Or maybe someone could provide me remote access to such machine? |
Fwiw, here’s the somewhat straightforward diff I put together for testing this: diff in the folddiff --git a/deps/zlib/adler32_simd.c b/deps/zlib/adler32_simd.c
index 1354915cc099..0170c8df870d 100644
--- a/deps/zlib/adler32_simd.c
+++ b/deps/zlib/adler32_simd.c
@@ -53,6 +53,7 @@
#include <tmmintrin.h>
+__attribute__((target("ssse3")))
uint32_t ZLIB_INTERNAL adler32_simd_( /* SSSE3 */
uint32_t adler,
const unsigned char *buf,
diff --git a/deps/zlib/crc32_simd.c b/deps/zlib/crc32_simd.c
index c8e5592f38ef..a77d629f2288 100644
--- a/deps/zlib/crc32_simd.c
+++ b/deps/zlib/crc32_simd.c
@@ -21,6 +21,7 @@
#include <smmintrin.h>
#include <wmmintrin.h>
+__attribute__((target("sse4.2,pclmul")))
uint32_t ZLIB_INTERNAL crc32_sse42_simd_( /* SSE4.2+PCLMUL */
const unsigned char *buf,
z_size_t len,
diff --git a/deps/zlib/crc_folding.c b/deps/zlib/crc_folding.c
index 48d77744aaf4..24621d6c5449 100644
--- a/deps/zlib/crc_folding.c
+++ b/deps/zlib/crc_folding.c
@@ -39,6 +39,7 @@
_mm_storeu_si128((__m128i *)s->crc0 + 4, xmm_crc_part);\
} while (0);
+__attribute__((target("sse4.2,pclmul")))
ZLIB_INTERNAL void crc_fold_init(deflate_state *const s)
{
CRC_LOAD(s)
@@ -53,6 +54,7 @@ ZLIB_INTERNAL void crc_fold_init(deflate_state *const s)
s->strm->adler = 0;
}
+__attribute__((target("sse4.2,pclmul")))
local void fold_1(deflate_state *const s,
__m128i *xmm_crc0, __m128i *xmm_crc1,
__m128i *xmm_crc2, __m128i *xmm_crc3)
@@ -79,6 +81,7 @@ local void fold_1(deflate_state *const s,
*xmm_crc3 = _mm_castps_si128(ps_res);
}
+__attribute__((target("sse4.2,pclmul")))
local void fold_2(deflate_state *const s,
__m128i *xmm_crc0, __m128i *xmm_crc1,
__m128i *xmm_crc2, __m128i *xmm_crc3)
@@ -113,6 +116,7 @@ local void fold_2(deflate_state *const s,
*xmm_crc3 = _mm_castps_si128(ps_res31);
}
+__attribute__((target("sse4.2,pclmul")))
local void fold_3(deflate_state *const s,
__m128i *xmm_crc0, __m128i *xmm_crc1,
__m128i *xmm_crc2, __m128i *xmm_crc3)
@@ -153,6 +157,7 @@ local void fold_3(deflate_state *const s,
*xmm_crc3 = _mm_castps_si128(ps_res32);
}
+__attribute__((target("sse4.2,pclmul")))
local void fold_4(deflate_state *const s,
__m128i *xmm_crc0, __m128i *xmm_crc1,
__m128i *xmm_crc2, __m128i *xmm_crc3)
@@ -219,6 +224,7 @@ local const unsigned zalign(32) pshufb_shf_table[60] = {
0x0201008f,0x06050403,0x0a090807,0x0e0d0c0b /* shl 1 (16 -15)/shr15*/
};
+__attribute__((target("sse4.2,pclmul")))
local void partial_fold(deflate_state *const s, const size_t len,
__m128i *xmm_crc0, __m128i *xmm_crc1,
__m128i *xmm_crc2, __m128i *xmm_crc3,
@@ -269,6 +275,7 @@ local void partial_fold(deflate_state *const s, const size_t len,
*xmm_crc3 = _mm_castps_si128(ps_res);
}
+__attribute__((target("sse4.2,pclmul")))
ZLIB_INTERNAL void crc_fold_copy(deflate_state *const s,
unsigned char *dst, const unsigned char *src, long len)
{
@@ -425,6 +432,7 @@ local const unsigned zalign(16) crc_mask2[4] = {
0x00000000, 0xFFFFFFFF, 0xFFFFFFFF, 0xFFFFFFFF
};
+__attribute__((target("sse4.2,pclmul")))
unsigned ZLIB_INTERNAL crc_fold_512to32(deflate_state *const s)
{
const __m128i xmm_mask = _mm_load_si128((__m128i *)crc_mask);
diff --git a/deps/zlib/zlib.gyp b/deps/zlib/zlib.gyp
index c73b9adba902..6f5a8ce3464b 100644
--- a/deps/zlib/zlib.gyp
+++ b/deps/zlib/zlib.gyp
@@ -91,20 +91,6 @@
'x86.c',
],
'conditions': [
- ['OS!="win" or llvm_version!="0.0"', {
- 'cflags': [
- '-mssse3',
- '-msse4.2',
- '-mpclmul',
- ],
- 'xcode_settings': {
- 'OTHER_CFLAGS': [
- '-mssse3',
- '-msse4.2',
- '-mpclmul',
- ],
- },
- }],
['target_arch=="x64"', {
'defines': [ 'INFLATE_CHUNK_READ_64LE' ],
}], I assume that would need to be polished up a bit, but I can upload the |
@addaleax any chance that could be built with glibc 2.23? Ubuntu 16.04 only provides 2.23 in the repos, upgrading to 2.31 fails because locales requires a higher libc-bin version, which requires a higher libc6 version which requires a higher locales version... |
@Strum355 I … probably don’t want to downgrade glibc on my laptop. 😅 It probably doesn’t make sense to put a test build together from this, but I can see if I could build that patch on one of the Ubuntu 16.04 CI machines. |
Upon a closer look, it doesn't seem to be an issue upstream. We have separated modules with their own flags, for optimized modules (https://cs.chromium.org/chromium/src/third_party/zlib/BUILD.gn?l=201) and vanilla ones (https://cs.chromium.org/chromium/src/third_party/zlib/BUILD.gn?l=246). My GYP-fu is a bit rusty, but my guess is that you shouldn't use this compiler switches (https://github.com/nodejs/node/blob/master/deps/zlib/zlib.gyp#L94) on the portable code. While we work upstream towards removing compiler switches, I guess the quick fix (without diverging from upstream) for Node would be to follow the same build strategy as Chromium (i.e. use source modules with the same compiler switches). Does that sound reasonable? |
Yeah, if somebody from @nodejs/build-files knows how to specify build flags only for specific files, I guess that works too. |
@addaleax had the better idea of just running an ubuntu:18.04 container with glib 2.27, the binary you provided didnt SIGILL 👍 |
I think we'd have to split the files into a new static library target and set the flags for that. cc @nodejs/gyp |
That's right. |
Excelent! Please feel free to reach out if you guys run into any unexpected issue. |
Fix the compile flags so that zlib can run on CPUs that do not have SSSE3/SSE4.2/etc. Do not compile zlib with flags that indicate that those features are available, and instead enable them selectively for functions that use them. There are probably better way to do this, e.g. through gyp file modifications as suggested in the issue. However, this patch should do just fine until that happens. Fixes: nodejs#32553
Opened #32627 which is mostly the same as my patch above, since that has been confirmed to work for two people now. If somebody comes up with a gyp-only solution, great, but I’d like to get this resolved in time for v14.0.0 one way or another. |
Fix the compile flags so that zlib can run on CPUs that do not have SSSE3/SSE4.2/etc. Do not compile zlib with flags that indicate that those features are available, and instead enable them selectively for functions that use them. There are probably better way to do this, e.g. through gyp file modifications as suggested in the issue. However, this patch should do just fine until that happens. Fixes: #32553 PR-URL: #32627 Reviewed-By: Gireesh Punathil <[email protected]> Reviewed-By: Ben Noordhuis <[email protected]> Reviewed-By: Tobias Nießen <[email protected]> Reviewed-By: James M Snell <[email protected]>
Fix the compile flags so that zlib can run on CPUs that do not have SSSE3/SSE4.2/etc. Do not compile zlib with flags that indicate that those features are available, and instead enable them selectively for functions that use them. There are probably better way to do this, e.g. through gyp file modifications as suggested in the issue. However, this patch should do just fine until that happens. Fixes: #32553 PR-URL: #32627 Reviewed-By: Gireesh Punathil <[email protected]> Reviewed-By: Ben Noordhuis <[email protected]> Reviewed-By: Tobias Nießen <[email protected]> Reviewed-By: James M Snell <[email protected]>
Fix the compile flags so that zlib can run on CPUs that do not have SSSE3/SSE4.2/etc. Do not compile zlib with flags that indicate that those features are available, and instead enable them selectively for functions that use them. There are probably better way to do this, e.g. through gyp file modifications as suggested in the issue. However, this patch should do just fine until that happens. Fixes: nodejs#32553 PR-URL: nodejs#32627 Reviewed-By: Gireesh Punathil <[email protected]> Reviewed-By: Ben Noordhuis <[email protected]> Reviewed-By: Tobias Nießen <[email protected]> Reviewed-By: James M Snell <[email protected]>
Fix the compile flags so that zlib can run on CPUs that do not have SSSE3/SSE4.2/etc. Do not compile zlib with flags that indicate that those features are available, and instead enable them selectively for functions that use them. There are probably better way to do this, e.g. through gyp file modifications as suggested in the issue. However, this patch should do just fine until that happens. Fixes: #32553 PR-URL: #32627 Reviewed-By: Gireesh Punathil <[email protected]> Reviewed-By: Ben Noordhuis <[email protected]> Reviewed-By: Tobias Nießen <[email protected]> Reviewed-By: James M Snell <[email protected]>
Fix the compile flags so that zlib can run on CPUs that do not have SSSE3/SSE4.2/etc. Do not compile zlib with flags that indicate that those features are available, and instead enable them selectively for functions that use them. There are probably better way to do this, e.g. through gyp file modifications as suggested in the issue. However, this patch should do just fine until that happens. Fixes: nodejs#32553 PR-URL: nodejs#32627 Reviewed-By: Gireesh Punathil <[email protected]> Reviewed-By: Ben Noordhuis <[email protected]> Reviewed-By: Tobias Nießen <[email protected]> Reviewed-By: James M Snell <[email protected]>
Fix the compile flags so that zlib can run on CPUs that do not have SSSE3/SSE4.2/etc. Do not compile zlib with flags that indicate that those features are available, and instead enable them selectively for functions that use them. There are probably better way to do this, e.g. through gyp file modifications as suggested in the issue. However, this patch should do just fine until that happens. Fixes: nodejs#32553 PR-URL: nodejs#32627 Reviewed-By: Gireesh Punathil <[email protected]> Reviewed-By: Ben Noordhuis <[email protected]> Reviewed-By: Tobias Nießen <[email protected]> Reviewed-By: James M Snell <[email protected]>
Fix the compile flags so that zlib can run on CPUs that do not have SSSE3/SSE4.2/etc. Do not compile zlib with flags that indicate that those features are available, and instead enable them selectively for functions that use them. There are probably better way to do this, e.g. through gyp file modifications as suggested in the issue. However, this patch should do just fine until that happens. Fixes: nodejs#32553 PR-URL: nodejs#32627 Reviewed-By: Gireesh Punathil <[email protected]> Reviewed-By: Ben Noordhuis <[email protected]> Reviewed-By: Tobias Nießen <[email protected]> Reviewed-By: James M Snell <[email protected]>
Fix the compile flags so that zlib can run on CPUs that do not have SSSE3/SSE4.2/etc. Do not compile zlib with flags that indicate that those features are available, and instead enable them selectively for functions that use them. There are probably better way to do this, e.g. through gyp file modifications as suggested in the issue. However, this patch should do just fine until that happens. Fixes: nodejs#32553 PR-URL: nodejs#32627 Reviewed-By: Gireesh Punathil <[email protected]> Reviewed-By: Ben Noordhuis <[email protected]> Reviewed-By: Tobias Nießen <[email protected]> Reviewed-By: James M Snell <[email protected]>
Fix the compile flags so that zlib can run on CPUs that do not have SSSE3/SSE4.2/etc. Do not compile zlib with flags that indicate that those features are available, and instead enable them selectively for functions that use them. There are probably better way to do this, e.g. through gyp file modifications as suggested in the issue. However, this patch should do just fine until that happens. Fixes: nodejs#32553 PR-URL: nodejs#32627 Reviewed-By: Gireesh Punathil <[email protected]> Reviewed-By: Ben Noordhuis <[email protected]> Reviewed-By: Tobias Nießen <[email protected]> Reviewed-By: James M Snell <[email protected]>
Fix the compile flags so that zlib can run on CPUs that do not have SSSE3/SSE4.2/etc. Do not compile zlib with flags that indicate that those features are available, and instead enable them selectively for functions that use them. There are probably better way to do this, e.g. through gyp file modifications as suggested in the issue. However, this patch should do just fine until that happens. Fixes: nodejs#32553 PR-URL: nodejs#32627 Reviewed-By: Gireesh Punathil <[email protected]> Reviewed-By: Ben Noordhuis <[email protected]> Reviewed-By: Tobias Nießen <[email protected]> Reviewed-By: James M Snell <[email protected]>
Fix the compile flags so that zlib can run on CPUs that do not have SSSE3/SSE4.2/etc. Do not compile zlib with flags that indicate that those features are available, and instead enable them selectively for functions that use them. There are probably better way to do this, e.g. through gyp file modifications as suggested in the issue. However, this patch should do just fine until that happens. Fixes: nodejs#32553 PR-URL: nodejs#32627 Reviewed-By: Gireesh Punathil <[email protected]> Reviewed-By: Ben Noordhuis <[email protected]> Reviewed-By: Tobias Nießen <[email protected]> Reviewed-By: James M Snell <[email protected]>
Fix the compile flags so that zlib can run on CPUs that do not have SSSE3/SSE4.2/etc. Do not compile zlib with flags that indicate that those features are available, and instead enable them selectively for functions that use them. There are probably better way to do this, e.g. through gyp file modifications as suggested in the issue. However, this patch should do just fine until that happens. Fixes: nodejs#32553 PR-URL: nodejs#32627 Reviewed-By: Gireesh Punathil <[email protected]> Reviewed-By: Ben Noordhuis <[email protected]> Reviewed-By: Tobias Nießen <[email protected]> Reviewed-By: James M Snell <[email protected]>
Fix the compile flags so that zlib can run on CPUs that do not have SSSE3/SSE4.2/etc. Do not compile zlib with flags that indicate that those features are available, and instead enable them selectively for functions that use them. There are probably better way to do this, e.g. through gyp file modifications as suggested in the issue. However, this patch should do just fine until that happens. Fixes: nodejs#32553 PR-URL: nodejs#32627 Reviewed-By: Gireesh Punathil <[email protected]> Reviewed-By: Ben Noordhuis <[email protected]> Reviewed-By: Tobias Nießen <[email protected]> Reviewed-By: James M Snell <[email protected]>
Fix the compile flags so that zlib can run on CPUs that do not have SSSE3/SSE4.2/etc. Do not compile zlib with flags that indicate that those features are available, and instead enable them selectively for functions that use them. There are probably better way to do this, e.g. through gyp file modifications as suggested in the issue. However, this patch should do just fine until that happens. Fixes: nodejs#32553 PR-URL: nodejs#32627 Reviewed-By: Gireesh Punathil <[email protected]> Reviewed-By: Ben Noordhuis <[email protected]> Reviewed-By: Tobias Nießen <[email protected]> Reviewed-By: James M Snell <[email protected]>
Fix the compile flags so that zlib can run on CPUs that do not have SSSE3/SSE4.2/etc. Do not compile zlib with flags that indicate that those features are available, and instead enable them selectively for functions that use them. There are probably better way to do this, e.g. through gyp file modifications as suggested in the issue. However, this patch should do just fine until that happens. Fixes: nodejs#32553 PR-URL: nodejs#32627 Reviewed-By: Gireesh Punathil <[email protected]> Reviewed-By: Ben Noordhuis <[email protected]> Reviewed-By: Tobias Nießen <[email protected]> Reviewed-By: James M Snell <[email protected]>
Fix the compile flags so that zlib can run on CPUs that do not have SSSE3/SSE4.2/etc. Do not compile zlib with flags that indicate that those features are available, and instead enable them selectively for functions that use them. There are probably better way to do this, e.g. through gyp file modifications as suggested in the issue. However, this patch should do just fine until that happens. Fixes: nodejs#32553 PR-URL: nodejs#32627 Reviewed-By: Gireesh Punathil <[email protected]> Reviewed-By: Ben Noordhuis <[email protected]> Reviewed-By: Tobias Nießen <[email protected]> Reviewed-By: James M Snell <[email protected]>
Fix the compile flags so that zlib can run on CPUs that do not have SSSE3/SSE4.2/etc. Do not compile zlib with flags that indicate that those features are available, and instead enable them selectively for functions that use them. There are probably better way to do this, e.g. through gyp file modifications as suggested in the issue. However, this patch should do just fine until that happens. Fixes: nodejs#32553 PR-URL: nodejs#32627 Reviewed-By: Gireesh Punathil <[email protected]> Reviewed-By: Ben Noordhuis <[email protected]> Reviewed-By: Tobias Nießen <[email protected]> Reviewed-By: James M Snell <[email protected]>
Fix the compile flags so that zlib can run on CPUs that do not have SSSE3/SSE4.2/etc. Do not compile zlib with flags that indicate that those features are available, and instead enable them selectively for functions that use them. There are probably better way to do this, e.g. through gyp file modifications as suggested in the issue. However, this patch should do just fine until that happens. Fixes: #32553 PR-URL: #32627 Reviewed-By: Gireesh Punathil <[email protected]> Reviewed-By: Ben Noordhuis <[email protected]> Reviewed-By: Tobias Nießen <[email protected]> Reviewed-By: James M Snell <[email protected]> PR-URL: #45387 Reviewed-By: Rafael Gonzaga <[email protected]>
Fix the compile flags so that zlib can run on CPUs that do not have SSSE3/SSE4.2/etc. Do not compile zlib with flags that indicate that those features are available, and instead enable them selectively for functions that use them. There are probably better way to do this, e.g. through gyp file modifications as suggested in the issue. However, this patch should do just fine until that happens. Fixes: #32553 PR-URL: #32627 Reviewed-By: Gireesh Punathil <[email protected]> Reviewed-By: Ben Noordhuis <[email protected]> Reviewed-By: Tobias Nießen <[email protected]> Reviewed-By: James M Snell <[email protected]> PR-URL: #45387 Reviewed-By: Rafael Gonzaga <[email protected]>
Fix the compile flags so that zlib can run on CPUs that do not have SSSE3/SSE4.2/etc. Do not compile zlib with flags that indicate that those features are available, and instead enable them selectively for functions that use them. There are probably better way to do this, e.g. through gyp file modifications as suggested in the issue. However, this patch should do just fine until that happens. Fixes: #32553 PR-URL: #32627 Reviewed-By: Gireesh Punathil <[email protected]> Reviewed-By: Ben Noordhuis <[email protected]> Reviewed-By: Tobias Nießen <[email protected]> Reviewed-By: James M Snell <[email protected]> PR-URL: #45387 Reviewed-By: Rafael Gonzaga <[email protected]>
Fix the compile flags so that zlib can run on CPUs that do not have SSSE3/SSE4.2/etc. Do not compile zlib with flags that indicate that those features are available, and instead enable them selectively for functions that use them. There are probably better way to do this, e.g. through gyp file modifications as suggested in the issue. However, this patch should do just fine until that happens. Fixes: #32553 PR-URL: #32627 Reviewed-By: Gireesh Punathil <[email protected]> Reviewed-By: Ben Noordhuis <[email protected]> Reviewed-By: Tobias Nießen <[email protected]> Reviewed-By: James M Snell <[email protected]> PR-URL: #45387 Reviewed-By: Rafael Gonzaga <[email protected]>
Fix the compile flags so that zlib can run on CPUs that do not have SSSE3/SSE4.2/etc. Do not compile zlib with flags that indicate that those features are available, and instead enable them selectively for functions that use them. There are probably better way to do this, e.g. through gyp file modifications as suggested in the issue. However, this patch should do just fine until that happens. Fixes: #32553 PR-URL: #32627 Reviewed-By: Gireesh Punathil <[email protected]> Reviewed-By: Ben Noordhuis <[email protected]> Reviewed-By: Tobias Nießen <[email protected]> Reviewed-By: James M Snell <[email protected]> PR-URL: #45387 Reviewed-By: Rafael Gonzaga <[email protected]>
Fix the compile flags so that zlib can run on CPUs that do not have SSSE3/SSE4.2/etc. Do not compile zlib with flags that indicate that those features are available, and instead enable them selectively for functions that use them. There are probably better way to do this, e.g. through gyp file modifications as suggested in the issue. However, this patch should do just fine until that happens. Fixes: #32553 PR-URL: #32627 Reviewed-By: Gireesh Punathil <[email protected]> Reviewed-By: Ben Noordhuis <[email protected]> Reviewed-By: Tobias Nießen <[email protected]> Reviewed-By: James M Snell <[email protected]> PR-URL: #45387 Reviewed-By: Rafael Gonzaga <[email protected]>
Running
npm i
for the following package.json:gives Illegal Instruction. In gdb, the backtrace is
With specifically
>│0x132a53f <fill_window+1007> pshufb 0xebc708(%rip),%xmm0 # 0x21e6c50
being the issue, where pshufb is an SSSE3 instruction not supported by the CPUThe text was updated successfully, but these errors were encountered: