Skip to content

Commit

Permalink
mm: workingset: fix use-after-free in shadow node shrinker
Browse files Browse the repository at this point in the history
Several people report seeing warnings about inconsistent radix tree
nodes followed by crashes in the workingset code, which all looked like
use-after-free access from the shadow node shrinker.

Dave Jones managed to reproduce the issue with a debug patch applied,
which confirmed that the radix tree shrinking indeed frees shadow nodes
while they are still linked to the shadow LRU:

  WARNING: CPU: 2 PID: 53 at lib/radix-tree.c:643 delete_node+0x1e4/0x200
  CPU: 2 PID: 53 Comm: kswapd0 Not tainted 4.10.0-rc2-think+ #3
  Call Trace:
     delete_node+0x1e4/0x200
     __radix_tree_delete_node+0xd/0x10
     shadow_lru_isolate+0xe6/0x220
     __list_lru_walk_one.isra.4+0x9b/0x190
     list_lru_walk_one+0x23/0x30
     scan_shadow_nodes+0x2e/0x40
     shrink_slab.part.44+0x23d/0x5d0
     shrink_node+0x22c/0x330
     kswapd+0x392/0x8f0

This is the WARN_ON_ONCE(!list_empty(&node->private_list)) placed in the
inlined radix_tree_shrink().

The problem is with 14b4687 ("mm: workingset: move shadow entry
tracking to radix tree exceptional tracking"), which passes an update
callback into the radix tree to link and unlink shadow leaf nodes when
tree entries change, but forgot to pass the callback when reclaiming a
shadow node.

While the reclaimed shadow node itself is unlinked by the shrinker, its
deletion from the tree can cause the left-most leaf node in the tree to
be shrunk.  If that happens to be a shadow node as well, we don't unlink
it from the LRU as we should.

Consider this tree, where the s are shadow entries:

       root->rnode
            |
       [0       n]
        |       |
     [s    ] [sssss]

Now the shadow node shrinker reclaims the rightmost leaf node through
the shadow node LRU:

       root->rnode
            |
       [0        ]
        |
    [s     ]

Because the parent of the deleted node is the first level below the
root and has only one child in the left-most slot, the intermediate
level is shrunk and the node containing the single shadow is put in
its place:

       root->rnode
            |
       [s        ]

The shrinker again sees a single left-most slot in a first level node
and thus decides to store the shadow in root->rnode directly and free
the node - which is a leaf node on the shadow node LRU.

  root->rnode
       |
       s

Without the update callback, the freed node remains on the shadow LRU,
where it causes later shrinker runs to crash.

Pass the node updater callback into __radix_tree_delete_node() in case
the deletion causes the left-most branch in the tree to collapse too.

Also add warnings when linked nodes are freed right away, rather than
wait for the use-after-free when the list is scanned much later.

Fixes: 14b4687 ("mm: workingset: move shadow entry tracking to radix tree exceptional tracking")
Reported-by: Dave Chinner <[email protected]>
Reported-by: Hugh Dickins <[email protected]>
Reported-by: Andrea Arcangeli <[email protected]>
Reported-and-tested-by: Dave Jones <[email protected]>
Signed-off-by: Johannes Weiner <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Chris Leech <[email protected]>
Cc: Lee Duncan <[email protected]>
Cc: Jan Kara <[email protected]>
Cc: Kirill A. Shutemov <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
  • Loading branch information
hnaz authored and torvalds committed Jan 8, 2017
1 parent b0b9b3d commit ea07b86
Show file tree
Hide file tree
Showing 3 changed files with 14 additions and 4 deletions.
4 changes: 3 additions & 1 deletion include/linux/radix-tree.h
Original file line number Diff line number Diff line change
Expand Up @@ -306,7 +306,9 @@ void radix_tree_iter_replace(struct radix_tree_root *,
void radix_tree_replace_slot(struct radix_tree_root *root,
void **slot, void *item);
void __radix_tree_delete_node(struct radix_tree_root *root,
struct radix_tree_node *node);
struct radix_tree_node *node,
radix_tree_update_node_t update_node,
void *private);
void *radix_tree_delete_item(struct radix_tree_root *, unsigned long, void *);
void *radix_tree_delete(struct radix_tree_root *, unsigned long);
void radix_tree_clear_tags(struct radix_tree_root *root,
Expand Down
11 changes: 9 additions & 2 deletions lib/radix-tree.c
Original file line number Diff line number Diff line change
Expand Up @@ -640,6 +640,7 @@ static inline void radix_tree_shrink(struct radix_tree_root *root,
update_node(node, private);
}

WARN_ON_ONCE(!list_empty(&node->private_list));
radix_tree_node_free(node);
}
}
Expand All @@ -666,6 +667,7 @@ static void delete_node(struct radix_tree_root *root,
root->rnode = NULL;
}

WARN_ON_ONCE(!list_empty(&node->private_list));
radix_tree_node_free(node);

node = parent;
Expand Down Expand Up @@ -767,6 +769,7 @@ static void radix_tree_free_nodes(struct radix_tree_node *node)
struct radix_tree_node *old = child;
offset = child->offset + 1;
child = child->parent;
WARN_ON_ONCE(!list_empty(&node->private_list));
radix_tree_node_free(old);
if (old == entry_to_node(node))
return;
Expand Down Expand Up @@ -1824,15 +1827,19 @@ EXPORT_SYMBOL(radix_tree_gang_lookup_tag_slot);
* __radix_tree_delete_node - try to free node after clearing a slot
* @root: radix tree root
* @node: node containing @index
* @update_node: callback for changing leaf nodes
* @private: private data to pass to @update_node
*
* After clearing the slot at @index in @node from radix tree
* rooted at @root, call this function to attempt freeing the
* node and shrinking the tree.
*/
void __radix_tree_delete_node(struct radix_tree_root *root,
struct radix_tree_node *node)
struct radix_tree_node *node,
radix_tree_update_node_t update_node,
void *private)
{
delete_node(root, node, NULL, NULL);
delete_node(root, node, update_node, private);
}

/**
Expand Down
3 changes: 2 additions & 1 deletion mm/workingset.c
Original file line number Diff line number Diff line change
Expand Up @@ -473,7 +473,8 @@ static enum lru_status shadow_lru_isolate(struct list_head *item,
if (WARN_ON_ONCE(node->exceptional))
goto out_invalid;
inc_node_state(page_pgdat(virt_to_page(node)), WORKINGSET_NODERECLAIM);
__radix_tree_delete_node(&mapping->page_tree, node);
__radix_tree_delete_node(&mapping->page_tree, node,
workingset_update_node, mapping);

out_invalid:
spin_unlock(&mapping->tree_lock);
Expand Down

0 comments on commit ea07b86

Please sign in to comment.