<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/tools/testing/radix-tree/multiorder.c, branch linux-4.17.y</title>
<subtitle>Hosts the 0x221E linux distro kernel.</subtitle>
<id>https://universe.0xinfinity.dev/distro/kernel/atom?h=linux-4.17.y</id>
<link rel='self' href='https://universe.0xinfinity.dev/distro/kernel/atom?h=linux-4.17.y'/>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/'/>
<updated>2018-05-19T00:17:12Z</updated>
<entry>
<title>radix tree test suite: multi-order iteration race</title>
<updated>2018-05-19T00:17:12Z</updated>
<author>
<name>Ross Zwisler</name>
<email>ross.zwisler@linux.intel.com</email>
</author>
<published>2018-05-18T23:09:01Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=fd8f58c40b703e47697c9f12bc16c31f14c161f1'/>
<id>urn:sha1:fd8f58c40b703e47697c9f12bc16c31f14c161f1</id>
<content type='text'>
Add a test which shows a race in the multi-order iteration code.  This
test reliably hits the race in under a second on my machine, and is the
result of a real bug report against kernel a production v4.15 based
kernel (4.15.6-300.fc27.x86_64).  With a real kernel this issue is hit
when using order 9 PMD DAX radix tree entries.

The race has to do with how we tear down multi-order sibling entries
when we are removing an item from the tree.  Remember that an order 2
entry looks like this:

  struct radix_tree_node.slots[] = [entry][sibling][sibling][sibling]

where 'entry' is in some slot in the struct radix_tree_node, and the
three slots following 'entry' contain sibling pointers which point back
to 'entry.'

When we delete 'entry' from the tree, we call :

  radix_tree_delete()
    radix_tree_delete_item()
      __radix_tree_delete()
        replace_slot()

replace_slot() first removes the siblings in order from the first to the
last, then at then replaces 'entry' with NULL.  This means that for a
brief period of time we end up with one or more of the siblings removed,
so:

  struct radix_tree_node.slots[] = [entry][NULL][sibling][sibling]

This causes an issue if you have a reader iterating over the slots in
the tree via radix_tree_for_each_slot() while only under
rcu_read_lock()/rcu_read_unlock() protection.  This is a common case in
mm/filemap.c.

The issue is that when __radix_tree_next_slot() =&gt; skip_siblings() tries
to skip over the sibling entries in the slots, it currently does so with
an exact match on the slot directly preceding our current slot.
Normally this works:

                                      V preceding slot
  struct radix_tree_node.slots[] = [entry][sibling][sibling][sibling]
                                              ^ current slot

This lets you find the first sibling, and you skip them all in order.

But in the case where one of the siblings is NULL, that slot is skipped
and then our sibling detection is interrupted:

                                             V preceding slot
  struct radix_tree_node.slots[] = [entry][NULL][sibling][sibling]
                                                    ^ current slot

This means that the sibling pointers aren't recognized since they point
all the way back to 'entry', so we think that they are normal internal
radix tree pointers.  This causes us to think we need to walk down to a
struct radix_tree_node starting at the address of 'entry'.

In a real running kernel this will crash the thread with a GP fault when
you try and dereference the slots in your broken node starting at
'entry'.

In the radix tree test suite this will be caught by the address
sanitizer:

  ==27063==ERROR: AddressSanitizer: heap-buffer-overflow on address
  0x60c0008ae400 at pc 0x00000040ce4f bp 0x7fa89b8fcad0 sp 0x7fa89b8fcac0
  READ of size 8 at 0x60c0008ae400 thread T3
      #0 0x40ce4e in __radix_tree_next_slot /home/rzwisler/project/linux/tools/testing/radix-tree/radix-tree.c:1660
      #1 0x4022cc in radix_tree_next_slot linux/../../../../include/linux/radix-tree.h:567
      #2 0x4022cc in iterator_func /home/rzwisler/project/linux/tools/testing/radix-tree/multiorder.c:655
      #3 0x7fa8a088d50a in start_thread (/lib64/libpthread.so.0+0x750a)
      #4 0x7fa8a03bd16e in clone (/lib64/libc.so.6+0xf516e)

Link: http://lkml.kernel.org/r/20180503192430.7582-5-ross.zwisler@linux.intel.com
Signed-off-by: Ross Zwisler &lt;ross.zwisler@linux.intel.com&gt;
Cc: Christoph Hellwig &lt;hch@lst.de&gt;
Cc: CR, Sapthagirish &lt;sapthagirish.cr@intel.com&gt;
Cc: Dan Williams &lt;dan.j.williams@intel.com&gt;
Cc: Dave Chinner &lt;david@fromorbit.com&gt;
Cc: Jan Kara &lt;jack@suse.cz&gt;
Cc: Matthew Wilcox &lt;willy@infradead.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm, truncate: do not check mapping for every page being truncated</title>
<updated>2017-11-16T02:21:06Z</updated>
<author>
<name>Mel Gorman</name>
<email>mgorman@techsingularity.net</email>
</author>
<published>2017-11-16T01:37:41Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=c7df8ad2910e965a6241b6d8f52fd122e26b0315'/>
<id>urn:sha1:c7df8ad2910e965a6241b6d8f52fd122e26b0315</id>
<content type='text'>
During truncation, the mapping has already been checked for shmem and
dax so it's known that workingset_update_node is required.

This patch avoids the checks on mapping for each page being truncated.
In all other cases, a lookup helper is used to determine if
workingset_update_node() needs to be called.  The one danger is that the
API is slightly harder to use as calling workingset_update_node directly
without checking for dax or shmem mappings could lead to surprises.
However, the API rarely needs to be used and hopefully the comment is
enough to give people the hint.

sparsetruncate (tiny)
                              4.14.0-rc4             4.14.0-rc4
                             oneirq-v1r1        pickhelper-v1r1
Min          Time      141.00 (   0.00%)      140.00 (   0.71%)
1st-qrtle    Time      142.00 (   0.00%)      141.00 (   0.70%)
2nd-qrtle    Time      142.00 (   0.00%)      142.00 (   0.00%)
3rd-qrtle    Time      143.00 (   0.00%)      143.00 (   0.00%)
Max-90%      Time      144.00 (   0.00%)      144.00 (   0.00%)
Max-95%      Time      147.00 (   0.00%)      145.00 (   1.36%)
Max-99%      Time      195.00 (   0.00%)      191.00 (   2.05%)
Max          Time      230.00 (   0.00%)      205.00 (  10.87%)
Amean        Time      144.37 (   0.00%)      143.82 (   0.38%)
Stddev       Time       10.44 (   0.00%)        9.00 (  13.74%)
Coeff        Time        7.23 (   0.00%)        6.26 (  13.41%)
Best99%Amean Time      143.72 (   0.00%)      143.34 (   0.26%)
Best95%Amean Time      142.37 (   0.00%)      142.00 (   0.26%)
Best90%Amean Time      142.19 (   0.00%)      141.85 (   0.24%)
Best75%Amean Time      141.92 (   0.00%)      141.58 (   0.24%)
Best50%Amean Time      141.69 (   0.00%)      141.31 (   0.27%)
Best25%Amean Time      141.38 (   0.00%)      140.97 (   0.29%)

As you'd expect, the gain is marginal but it can be detected.  The
differences in bonnie are all within the noise which is not surprising
given the impact on the microbenchmark.

radix_tree_update_node_t is a callback for some radix operations that
optionally passes in a private field.  The only user of the callback is
workingset_update_node and as it no longer requires a mapping, the
private field is removed.

Link: http://lkml.kernel.org/r/20171018075952.10627-3-mgorman@techsingularity.net
Signed-off-by: Mel Gorman &lt;mgorman@techsingularity.net&gt;
Acked-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Cc: Andi Kleen &lt;ak@linux.intel.com&gt;
Cc: Dave Chinner &lt;david@fromorbit.com&gt;
Cc: Dave Hansen &lt;dave.hansen@intel.com&gt;
Cc: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>radix tree test suite: Fix split/join memory leaks</title>
<updated>2017-02-14T02:44:08Z</updated>
<author>
<name>Matthew Wilcox</name>
<email>mawilcox@microsoft.com</email>
</author>
<published>2017-01-29T07:00:31Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=3b7869c31f9358a63e502c8c5c7664daf1c6d8b0'/>
<id>urn:sha1:3b7869c31f9358a63e502c8c5c7664daf1c6d8b0</id>
<content type='text'>
The last of the memory leaks in the test suite was a couple of places in
the split/join testing where I forgot to free the element being removed
from the tree.

Signed-off-by: Matthew Wilcox &lt;mawilcox@microsoft.com&gt;
Reviewed-by: Rehas Sachdeva &lt;aquannie@gmail.com&gt;
</content>
</entry>
<entry>
<title>radix tree test suite: Dial down verbosity with -v</title>
<updated>2017-02-14T02:44:04Z</updated>
<author>
<name>Rehas Sachdeva</name>
<email>aquannie@gmail.com</email>
</author>
<published>2017-01-04T16:55:00Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=73bc029b76482260a144219786d19951f561716e'/>
<id>urn:sha1:73bc029b76482260a144219786d19951f561716e</id>
<content type='text'>
Make the output of radix tree test suite less verbose by default and add
-v and -vv command line options for increasing level of verbosity.

Signed-off-by: Rehas Sachdeva &lt;aquannie@gmail.com&gt;
Signed-off-by: Matthew Wilcox &lt;mawilcox@microsoft.com&gt;
</content>
</entry>
<entry>
<title>radix tree test suite: Build separate binaries for some tests</title>
<updated>2017-02-14T02:44:02Z</updated>
<author>
<name>Matthew Wilcox</name>
<email>mawilcox@microsoft.com</email>
</author>
<published>2016-12-19T03:56:05Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=8ac04868315c6ffcb2c5a5ad9cd5cec61cad3576'/>
<id>urn:sha1:8ac04868315c6ffcb2c5a5ad9cd5cec61cad3576</id>
<content type='text'>
To allow developers to run a subset of tests, build separate multiorder
and idr-test binaries which will run just the tests in those files.

Signed-off-by: Matthew Wilcox &lt;mawilcox@microsoft.com&gt;
Reviewed-by: Rehas Sachdeva &lt;aquannie@gmail.com&gt;
</content>
</entry>
<entry>
<title>radix-tree: ensure counts are initialised</title>
<updated>2016-12-15T00:04:10Z</updated>
<author>
<name>Matthew Wilcox</name>
<email>mawilcox@microsoft.com</email>
</author>
<published>2016-12-14T23:09:31Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=e8de4340767dd002978c285e3adddaeda8ac652c'/>
<id>urn:sha1:e8de4340767dd002978c285e3adddaeda8ac652c</id>
<content type='text'>
radix_tree_join() was freeing nodes with a non-zero -&gt;exceptional count,
and radix_tree_split() wasn't zeroing -&gt;exceptional when it allocated
the new node.  Fix this by making all callers of radix_tree_node_alloc()
pass in the new counts (and some other always-initialised fields), which
will prevent the problem recurring if in future we decide to do
something similar.

Link: http://lkml.kernel.org/r/1481667692-14500-3-git-send-email-mawilcox@linuxonhyperv.com
Signed-off-by: Matthew Wilcox &lt;mawilcox@microsoft.com&gt;
Cc: Kirill A. Shutemov &lt;kirill.shutemov@linux.intel.com&gt;
Cc: Konstantin Khlebnikov &lt;koct9i@gmail.com&gt;
Cc: Ross Zwisler &lt;ross.zwisler@linux.intel.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>radix tree test suite: check multiorder iteration</title>
<updated>2016-12-15T00:04:10Z</updated>
<author>
<name>Matthew Wilcox</name>
<email>mawilcox@microsoft.com</email>
</author>
<published>2016-12-14T23:09:10Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=3e3cdc68bede179a957fcd6be7b833a83df4e5de'/>
<id>urn:sha1:3e3cdc68bede179a957fcd6be7b833a83df4e5de</id>
<content type='text'>
The random iteration test only inserts order-0 entries currently.
Update it to insert entries of order between 7 and 0.  Also make the
maximum index configurable, make some variables static, make the test
duration variable, remove some useless spinning, and add a fifth thread
which calls tag_tagged_items().

Link: http://lkml.kernel.org/r/1480369871-5271-62-git-send-email-mawilcox@linuxonhyperv.com
Signed-off-by: Matthew Wilcox &lt;mawilcox@microsoft.com&gt;
Tested-by: Kirill A. Shutemov &lt;kirill.shutemov@linux.intel.com&gt;
Cc: Konstantin Khlebnikov &lt;koct9i@gmail.com&gt;
Cc: Ross Zwisler &lt;ross.zwisler@linux.intel.com&gt;
Cc: Matthew Wilcox &lt;mawilcox@microsoft.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>radix-tree: fix replacement for multiorder entries</title>
<updated>2016-12-15T00:04:10Z</updated>
<author>
<name>Matthew Wilcox</name>
<email>mawilcox@microsoft.com</email>
</author>
<published>2016-12-14T23:09:07Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=a90eb3a2a405cf7e96093ed531a285067dfdbc9d'/>
<id>urn:sha1:a90eb3a2a405cf7e96093ed531a285067dfdbc9d</id>
<content type='text'>
When replacing an entry with NULL, we need to delete any sibling
entries.  Also account deleting exceptional entries properly.  Also fix
a bug with radix_tree_iter_replace() where we would fail to remove
entirely freed nodes.  Also fix accounting bug when switching between
normal and exceptional entries with replace_slot.  Also add testcases
for all these bugs.

Link: http://lkml.kernel.org/r/1480369871-5271-61-git-send-email-mawilcox@linuxonhyperv.com
Signed-off-by: Matthew Wilcox &lt;mawilcox@microsoft.com&gt;
Tested-by: Kirill A. Shutemov &lt;kirill.shutemov@linux.intel.com&gt;
Cc: Konstantin Khlebnikov &lt;koct9i@gmail.com&gt;
Cc: Ross Zwisler &lt;ross.zwisler@linux.intel.com&gt;
Cc: Matthew Wilcox &lt;mawilcox@microsoft.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>radix-tree: add radix_tree_split_preload()</title>
<updated>2016-12-15T00:04:10Z</updated>
<author>
<name>Matthew Wilcox</name>
<email>willy@linux.intel.com</email>
</author>
<published>2016-12-14T23:09:04Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=2791653a6814d170fa893344618563a7b1da95c6'/>
<id>urn:sha1:2791653a6814d170fa893344618563a7b1da95c6</id>
<content type='text'>
Calculate how many nodes we need to allocate to split an old_order entry
into multiple entries, each of size new_order.  The test suite checks
that we allocated exactly the right number of nodes; neither too many
(checked by rtp-&gt;nr == 0), nor too few (checked by comparing
nr_allocated before and after the call to radix_tree_split()).

Link: http://lkml.kernel.org/r/1480369871-5271-60-git-send-email-mawilcox@linuxonhyperv.com
Signed-off-by: Matthew Wilcox &lt;willy@linux.intel.com&gt;
Tested-by: Kirill A. Shutemov &lt;kirill.shutemov@linux.intel.com&gt;
Cc: Konstantin Khlebnikov &lt;koct9i@gmail.com&gt;
Cc: Ross Zwisler &lt;ross.zwisler@linux.intel.com&gt;
Cc: Matthew Wilcox &lt;mawilcox@microsoft.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>radix-tree: add radix_tree_split</title>
<updated>2016-12-15T00:04:10Z</updated>
<author>
<name>Matthew Wilcox</name>
<email>willy@linux.intel.com</email>
</author>
<published>2016-12-14T23:09:01Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=e157b555945fb16ddc6cce605a1eb6b4135ea5f1'/>
<id>urn:sha1:e157b555945fb16ddc6cce605a1eb6b4135ea5f1</id>
<content type='text'>
This new function splits a larger multiorder entry into smaller entries
(potentially multi-order entries).  These entries are initialised to
RADIX_TREE_RETRY to ensure that RCU walkers who see this state aren't
confused.  The caller should then call radix_tree_for_each_slot() and
radix_tree_replace_slot() in order to turn these retry entries into the
intended new entries.  Tags are replicated from the original multiorder
entry into each new entry.

Link: http://lkml.kernel.org/r/1480369871-5271-59-git-send-email-mawilcox@linuxonhyperv.com
Signed-off-by: Matthew Wilcox &lt;willy@linux.intel.com&gt;
Tested-by: Kirill A. Shutemov &lt;kirill.shutemov@linux.intel.com&gt;
Cc: Konstantin Khlebnikov &lt;koct9i@gmail.com&gt;
Cc: Ross Zwisler &lt;ross.zwisler@linux.intel.com&gt;
Cc: Matthew Wilcox &lt;mawilcox@microsoft.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
</feed>
