Skip to content
This repository has been archived by the owner on Feb 5, 2019. It is now read-only.

Update to 4.5.0 #17

Closed
wants to merge 749 commits into from
Closed

Update to 4.5.0 #17

wants to merge 749 commits into from

Conversation

arthurprs
Copy link

@arthurprs arthurprs commented May 9, 2017

jemalloc came a long way since 4.0.3. Let's try to update. Once this is merged I'll follow-up with the rust-lang/rust PR.

jasone and others added 30 commits February 28, 2016 00:01
Stack corruption happens in x64 bit

This resolves jemalloc#347.
Add a cast to avoid comparing a ssize_t value to a uint64_t value that
is always larger than a 32-bit ssize_t.  This silences an innocuous
compiler warning from e.g. gcc 4.2.1 about the comparison always having
the same result.
Initial implementation of a twopass pairing heap with aux list.
Research papers linked in comments.

Where search/nsearch/last aren't needed, this gives much faster first(),
delete(), and insert().  Insert is O(1), and first/delete don't have to
walk the whole tree.

Also tested rb_old with parent pointers - it was better than the current
rb.h for memory loads, but still much worse than a pairing heap.

An array-based heap would be much faster if everything fits in memory,
but on a cold cache it has many more memory loads for most operations.
Use pairing heap instead of red black tree in arena runs_avail.  The
extra links are unioned with the bitmap_t, so this change doesn't use
any extra memory.

Canaries show this change to be a 1% cpu win, and 2% latency win.  In
particular, large free()s, and small bin frees are now O(1) (barring
coalescing).

I also tested changing bin->runs to be a pairing heap, but saw a much
smaller win, and it would mean increasing the size of arena_run_s by two
pointers, so I left that as an rb-tree for now.
Add (size_t) casts to MALLOCX_ALIGN() macros so that passing the integer
constant 0x80000000 does not cause a compiler warning about invalid
shift amount.

This resolves jemalloc#354.
Also avoid deleting the VERSION file while trying to (re)generate it.

This resolves jemalloc#305.
Specialize fast path to avoid code that cannot execute for dependent
loads.

Manually unroll.
Restructure the test program master header to avoid blindly enabling
assertions.  Prior to this change, assertion code in e.g. arena.h was
always enabled for tests, which could skew performance-related testing.
The arenas_extend() function was renamed to arenas_init() in commit
8bb3198, but its function declaration
was not removed from jemalloc_internal.h.in.
Variables s and slen are declared inside a switch statement, but outside
a case scope. clang reports these variable definitions as "unreachable",
though this is not really meaningful in this case. This is the only
-Wunreachable-code warning in jemalloc.

src/util.c:501:5 [-Wunreachable-code] code will never be executed

This resolves jemalloc#364.
Move chunk_dalloc_arena()'s implementation into chunk_dalloc_wrapper(),
so that if the dalloc hook fails, proper decommit/purge/retain cascading
occurs.  This fixes three potential chunk leaks on OOM paths, one during
dss-based chunk allocation, one during chunk header commit (currently
relevant only on Windows), and one during rtree write (e.g. if rtree
node allocation fails).

Merge chunk_purge_arena() into chunk_purge_default() (refactor, no
change to functionality).
jasone and others added 25 commits January 24, 2017 12:50
Avoid the name secure_getenv to avoid redeclaring secure_getenv when
secure_getenv is present but its use is manually disabled via
ac_cv_func_secure_getenv=no.
Introduces gen_travis.py, which generates .travis.yml, and updates .travis.yml
to be the generated version.

The travis build matrix approach doesn't play well with mixing and matching
various different environment settings, so we generate every build explicitly,
rather than letting them do it for us.

To avoid abusing travis resources (and save us time waiting for CI results), we
don't test every possible combination of options; we only check up to 2 unusual
settings at a time.
This regression was caused by 8f61fde
(Uniformly cast mallctl[bymib]() oldp/newp arguments to (void *).).

This resolves jemalloc#538.
Synchronize tcaches with tcaches_mtx rather than ctl_mtx.  Add missing
synchronization for tcache flushing.  This bug was introduced by
1cb181e (Implement explicit tcache
support.), which was first released in 4.0.0.
malloc_conf does not reliably work with MSVC, which complains of
"inconsistent dll linkage", i.e. its inability to support the
application overriding malloc_conf when dynamically linking/loading.
Work around this limitation by adding test harness support for per test
shell script sourcing, and converting all tests to use MALLOC_CONF
instead of malloc_conf.
This will hopefully catch some windows-specific bugs.
This fixes interactions with witness_assert_depth[_to_rank](), which was
added in dad74bd (Convert
witness_assert_lockless() to witness_assert_lock_depth().).
This makes it possible to make lock state assertions about precisely
which locks are held.
In some cases the prof machinery allocates (in order to modify the
bt2gctx hash table), and such operations are synchronized via
bt2gctx_mtx.  Rather than asserting that no locks are held on entry
into functions that may call prof_gdump(), make the weaker assertion
that no "core" locks are held.  The prof machinery enqueues dumps
triggered by prof_gdump() calls when bt2gctx_mtx is held, so this
weakened assertion avoids false failures in such cases.
Fix chunk_alloc_dss() to account for bytes that are not a multiple of
the chunk size.  This regression was introduced by
e2bcf03 (Make dss operations
lockless.), which was first released in 4.3.0.
These bugs were introduced by b599b32
(Add "J" (JSON) support to malloc_stats_print().), which was first
released in 4.3.0.

This resolves jemalloc#615.
Implement and test a JSON validation parser.  Use the parser to validate
JSON output from malloc_stats_print(), with a significant subset of
supported output options.

This resolves jemalloc#583.
This regression was caused by
b9408d7 (Fix/simplify chunk_recycle()
allocation size computations.).

This resolves jemalloc#647.
Fix lg_chunk clamping to take into account cache-oblivious large
allocation.  This regression only resulted in incorrect behavior if
!config_fill (false unless --disable-fill specified) and
config_cache_oblivious (true unless --disable-cache-oblivious
specified).

This regression was introduced by
8a03cf0 (Implement cache index
randomization for large allocations.), which was first released in
4.0.0.

This resolves jemalloc#555.
When multiple threads calling stats_print, race could happen as we read the
counters in separate mallctl calls; and the removed assertion could fail when
other operations happened in between the mallctl calls. For simplicity, output
"race" in the utilization field in this case.

This resolves jemalloc#616.
Convert CFLAGS to be a concatenation:

  CFLAGS := CONFIGURE_CFLAGS SPECIFIED_CFLAGS EXTRA_CFLAGS

This ordering makes it possible to override the flags set by the
configure script both during and after configuration, with CFLAGS and
EXTRA_CFLAGS, respectively.

This resolves jemalloc#619.
Detect whether chunks start off as THP-capable by default (according to
the state of /sys/kernel/mm/transparent_hugepage/enabled), and use this
as the basis for whether to call pages_nohuge() once per chunk during
first purge of any of the chunk's page runs.

Add the --disable-thp configure option, as well as the the opt.thp
mallctl.

This resolves jemalloc#541.
This avoids signed/unsigned comparison warnings when specifying integer
constants as inputs.

Clean up whitespace and add clarifying parentheses for
CONF_HANDLE_SIZE_T(opt_lg_chunk, ...).
@alexcrichton
Copy link
Member

@arthurprs arthurprs deleted the update-to-4.5.0 branch May 9, 2017 18:12
@arthurprs arthurprs restored the update-to-4.5.0 branch May 10, 2017 08:28
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.