
Conversation

@ericl
Contributor

@ericl ericl commented Dec 8, 2018

What do these changes do?

This adds an experimental redis_max_memory flag that bounds the redis memory used per data shard. Note that this only applies to the non-primary shards, which store the majority of the task and object metadata, so data such as client metadata is never evicted.

Since profiling data has a nested structure and cannot be LRU-evicted, this also makes profiling controlled by a collect_profiling_data flag and disables it when redis_max_memory is set.
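
For context, a minimal usage sketch of the new flag (the ray.init keyword below is an assumption about how the flag is exposed; on a cluster it would instead be passed as --redis-max-memory to ray start, as discussed later in this thread):

import ray

# Hypothetical single-node example: cap each non-primary redis shard at
# roughly 500 MB. The primary shard (client metadata, etc.) is never
# evicted, so only task and object metadata is subject to LRU eviction.
ray.init(redis_max_memory=500 * 1024 * 1024)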

Analysis of redis's approximate LRU eviction algorithm: https://github.com/antirez/redis/blob/a2131f907a752e62c78ea6bb719daf9fe2f91402/src/evict.c#L118

We use maxmemory_samples 10. There is also a persisted eviction pool of 16 entries. This effectively gives us 26 tries per eviction to hit an old key (lower bound). Let's assume the most recent 30% of keys are required for stable operation and that we evict at 10000 QPS. Then:

>>> p_no_bad_eviction = 1 - 0.3**26
0.9999999999999746
>>> p_no_bad_eviction_year = p_no_bad_eviction**(10000*60*60*24*365)
0.9920143099318832

So a lower bound on reliability with approximate LRU eviction is about 99% per year. The actual reliability will be much higher, of course, since it's unlikely we need even 30% of the metadata, and the eviction pool is persisted over time.
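
For reference, the same back-of-envelope estimate as a standalone script (the 30% hot-key fraction and the 10000 QPS eviction rate are the assumptions stated above, not measured values):

# Lower bound on the probability that approximate LRU never evicts a
# recently used ("hot") key over a year of continuous eviction.
samples_per_eviction = 10 + 16   # maxmemory_samples plus the persisted eviction pool
hot_fraction = 0.3               # assume the most recent 30% of keys must survive
evictions_per_second = 10000     # assumed eviction rate

p_no_bad_eviction = 1 - hot_fraction ** samples_per_eviction
seconds_per_year = 60 * 60 * 24 * 365
p_no_bad_eviction_year = p_no_bad_eviction ** (evictions_per_second * seconds_per_year)

print(p_no_bad_eviction)       # ~0.9999999999999746
print(p_no_bad_eviction_year)  # ~0.992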

TODO:

  • stress test with long-running Ape-X cluster

Related issue number

#3306
#954
#3452

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/9851/
Test FAILed.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/9849/
Test FAILed.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/9852/
Test FAILed.

// The task was not in the GCS task table. It must therefore be in the
// lineage cache.
RAY_CHECK(lineage_cache_.ContainsTask(task_id));
RAY_CHECK(lineage_cache_.ContainsTask(task_id))
Contributor Author


This seems to be the narrow waist at which we access evicted lineage, though there could be other sites I'm missing.

handler_warning_timeout_ms_(100),
heartbeat_timeout_milliseconds_(100),
num_heartbeats_timeout_(100),
num_heartbeats_timeout_(300),
Contributor Author


Raising this to 30s (100ms heartbeats × 300 missed heartbeats) since 10s is too easy to hit with random pauses (e.g., a forking process takes a long time, or the kernel stalls compacting hugepages).

Collaborator


Sounds good. It's possible that some of the tests are currently waiting for the full 10s, in which case that will become really slow. If that's the case and we observe that, then we can configure this parameter specifically in those tests.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/9856/
Test FAILed.

@ericl
Contributor Author

ericl commented Dec 8, 2018

Checked, and Ape-X seems to be stable with an aggressive 500MB redis memory limit.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/9862/
Test FAILed.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/9864/
Test FAILed.

Contributor

@pcmoritz pcmoritz left a comment


This is working as intended for me.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/9894/
Test FAILed.


# Check that we get warning messages for both raylets.
wait_for_errors(ray_constants.REMOVED_NODE_ERROR, 2, timeout=20)
wait_for_errors(ray_constants.REMOVED_NODE_ERROR, 2, timeout=40)
Collaborator


Ah for this test, can we actually do

internal_config=json.dumps({"num_heartbeats_timeout": 40}) or something like that?

Contributor Author


Done
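
For reference, a sketch of what that suggestion might look like in the test. Only the JSON payload comes from the comment above; ray.worker._init and the _internal_config keyword are assumptions about the test harness of that era, not part of this diff:

import json
import ray

# Hypothetical: shorten the node-removal timeout for this test only, so
# wait_for_errors does not have to wait out the new 30s default.
ray.worker._init(
    start_ray_local=True,
    num_local_schedulers=2,
    _internal_config=json.dumps({"num_heartbeats_timeout": 40}))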

default=None,
type=int,
help="--redis-max-memory to pass to Ray."
" This only has an affect in local mode.")
Contributor


Hi @ericl What does "local mode" mean? Does this work in multi-node mode?

Contributor Author


It just means that when using a cluster, you need to pass --redis-max-memory to ray start rather than to train.py.

@llan-ml
Contributor

llan-ml commented Dec 9, 2018

For now, the memory flush policy does not support multiple redis shards. I notice that this PR can limit the memory used by each redis shard.

If this PR is merged, does it mean we no longer need redis memory flushing, at least to some extent?

@ericl
Contributor Author

ericl commented Dec 9, 2018

@llan-ml that's right, this should supersede redis flushing. I updated the doc page to remove the old flushing documentation.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/9895/
Test FAILed.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/9896/
Test FAILed.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/9897/
Test FAILed.
