<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/fs/fuse/dev.c, branch linux-5.1.y</title>
<subtitle>Hosts the 0x221E linux distro kernel.</subtitle>
<id>https://universe.0xinfinity.dev/distro/kernel/atom?h=linux-5.1.y</id>
<link rel='self' href='https://universe.0xinfinity.dev/distro/kernel/atom?h=linux-5.1.y'/>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/'/>
<updated>2019-06-15T09:53:00Z</updated>
<entry>
<title>fuse: retrieve: cap requested size to negotiated max_write</title>
<updated>2019-06-15T09:53:00Z</updated>
<author>
<name>Kirill Smelkov</name>
<email>kirr@nexedi.com</email>
</author>
<published>2019-03-27T10:15:19Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=a5ee124e47918b1905c1810236d4fb5f2c0bb85a'/>
<id>urn:sha1:a5ee124e47918b1905c1810236d4fb5f2c0bb85a</id>
<content type='text'>
[ Upstream commit 7640682e67b33cab8628729afec8ca92b851394f ]

FUSE filesystem server and kernel client negotiate during initialization
phase, what should be the maximum write size the client will ever issue.
Correspondingly the filesystem server then queues sys_read calls to read
requests with buffer capacity large enough to carry request header + that
max_write bytes. A filesystem server is free to set its max_write in
anywhere in the range between [1*page, fc-&gt;max_pages*page]. In particular
go-fuse[2] sets max_write by default as 64K, wheres default fc-&gt;max_pages
corresponds to 128K. Libfuse also allows users to configure max_write, but
by default presets it to possible maximum.

If max_write is &lt; fc-&gt;max_pages*page, and in NOTIFY_RETRIEVE handler we
allow to retrieve more than max_write bytes, corresponding prepared
NOTIFY_REPLY will be thrown away by fuse_dev_do_read, because the
filesystem server, in full correspondence with server/client contract, will
be only queuing sys_read with ~max_write buffer capacity, and
fuse_dev_do_read throws away requests that cannot fit into server request
buffer. In turn the filesystem server could get stuck waiting indefinitely
for NOTIFY_REPLY since NOTIFY_RETRIEVE handler returned OK which is
understood by clients as that NOTIFY_REPLY was queued and will be sent
back.

Cap requested size to negotiate max_write to avoid the problem.  This
aligns with the way NOTIFY_RETRIEVE handler works, which already
unconditionally caps requested retrieve size to fuse_conn-&gt;max_pages.  This
way it should not hurt NOTIFY_RETRIEVE semantic if we return less data than
was originally requested.

Please see [1] for context where the problem of stuck filesystem was hit
for real, how the situation was traced and for more involving patch that
did not make it into the tree.

[1] https://marc.info/?l=linux-fsdevel&amp;m=155057023600853&amp;w=2
[2] https://github.com/hanwen/go-fuse

Signed-off-by: Kirill Smelkov &lt;kirr@nexedi.com&gt;
Cc: Han-Wen Nienhuys &lt;hanwen@google.com&gt;
Cc: Jakob Unterwurzacher &lt;jakobunt@gmail.com&gt;
Signed-off-by: Miklos Szeredi &lt;mszeredi@redhat.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>Merge branch 'page-refs' (page ref overflow)</title>
<updated>2019-04-14T22:09:40Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2019-04-14T22:09:40Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=6b3a707736301c2128ca85ce85fb13f60b5e350a'/>
<id>urn:sha1:6b3a707736301c2128ca85ce85fb13f60b5e350a</id>
<content type='text'>
Merge page ref overflow branch.

Jann Horn reported that he can overflow the page ref count with
sufficient memory (and a filesystem that is intentionally extremely
slow).

Admittedly it's not exactly easy.  To have more than four billion
references to a page requires a minimum of 32GB of kernel memory just
for the pointers to the pages, much less any metadata to keep track of
those pointers.  Jann needed a total of 140GB of memory and a specially
crafted filesystem that leaves all reads pending (in order to not ever
free the page references and just keep adding more).

Still, we have a fairly straightforward way to limit the two obvious
user-controllable sources of page references: direct-IO like page
references gotten through get_user_pages(), and the splice pipe page
duplication.  So let's just do that.

* branch page-refs:
  fs: prevent page refcount overflow in pipe_buf_get
  mm: prevent get_user_pages() from overflowing page refcount
  mm: add 'try_get_page()' helper function
  mm: make page ref count overflow check tighter and more explicit
</content>
</entry>
<entry>
<title>fs: prevent page refcount overflow in pipe_buf_get</title>
<updated>2019-04-14T17:00:04Z</updated>
<author>
<name>Matthew Wilcox</name>
<email>willy@infradead.org</email>
</author>
<published>2019-04-05T21:02:10Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=15fab63e1e57be9fdb5eec1bbc5916e9825e9acb'/>
<id>urn:sha1:15fab63e1e57be9fdb5eec1bbc5916e9825e9acb</id>
<content type='text'>
Change pipe_buf_get() to return a bool indicating whether it succeeded
in raising the refcount of the page (if the thing in the pipe is a page).
This removes another mechanism for overflowing the page refcount.  All
callers converted to handle a failure.

Reported-by: Jann Horn &lt;jannh@google.com&gt;
Signed-off-by: Matthew Wilcox &lt;willy@infradead.org&gt;
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>fuse: clean up aborted</title>
<updated>2019-02-13T12:15:14Z</updated>
<author>
<name>Miklos Szeredi</name>
<email>mszeredi@redhat.com</email>
</author>
<published>2019-01-24T09:40:16Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=eb98e3bdf3aa7b15b40c65063ea935f953f60c6b'/>
<id>urn:sha1:eb98e3bdf3aa7b15b40c65063ea935f953f60c6b</id>
<content type='text'>
The only caller that needs fc-&gt;aborted set is fuse_conn_abort_write().
Setting fc-&gt;aborted is now racy (fuse_abort_conn() may already be in
progress or finished) but there's no reason to care.

Signed-off-by: Miklos Szeredi &lt;mszeredi@redhat.com&gt;
</content>
</entry>
<entry>
<title>fuse: Protect ff-&gt;reserved_req via corresponding fi-&gt;lock</title>
<updated>2019-02-13T12:15:14Z</updated>
<author>
<name>Kirill Tkhai</name>
<email>ktkhai@virtuozzo.com</email>
</author>
<published>2018-11-09T10:33:32Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=6b675738ce900dac8f8478b1d8487df420a315f3'/>
<id>urn:sha1:6b675738ce900dac8f8478b1d8487df420a315f3</id>
<content type='text'>
This is rather natural action after previous patches, and it just decreases
load of fc-&gt;lock.

Signed-off-by: Kirill Tkhai &lt;ktkhai@virtuozzo.com&gt;
Signed-off-by: Miklos Szeredi &lt;mszeredi@redhat.com&gt;
</content>
</entry>
<entry>
<title>fuse: Verify userspace asks to requeue interrupt that we really sent</title>
<updated>2019-02-13T12:15:13Z</updated>
<author>
<name>Kirill Tkhai</name>
<email>ktkhai@virtuozzo.com</email>
</author>
<published>2018-11-08T09:05:42Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=b782911b5297c4be0c2de88fc3b36a760eca6321'/>
<id>urn:sha1:b782911b5297c4be0c2de88fc3b36a760eca6321</id>
<content type='text'>
When queue_interrupt() is called from fuse_dev_do_write(), it came from
userspace directly. Userspace may pass any request id, even the request's
we have not interrupted (or even background's request). This patch adds
sanity check to make kernel safe against that.

Signed-off-by: Kirill Tkhai &lt;ktkhai@virtuozzo.com&gt;
Signed-off-by: Miklos Szeredi &lt;mszeredi@redhat.com&gt;
</content>
</entry>
<entry>
<title>fuse: Do some refactoring in fuse_dev_do_write()</title>
<updated>2019-02-13T12:15:13Z</updated>
<author>
<name>Kirill Tkhai</name>
<email>ktkhai@virtuozzo.com</email>
</author>
<published>2018-11-08T09:05:36Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=7407a10de57f5810f3f0e778eb1db0923d24ab16'/>
<id>urn:sha1:7407a10de57f5810f3f0e778eb1db0923d24ab16</id>
<content type='text'>
This is needed for next patch.

Signed-off-by: Kirill Tkhai &lt;ktkhai@virtuozzo.com&gt;
Signed-off-by: Miklos Szeredi &lt;mszeredi@redhat.com&gt;
</content>
</entry>
<entry>
<title>fuse: Wake up req-&gt;waitq of only if not background</title>
<updated>2019-02-13T12:15:13Z</updated>
<author>
<name>Kirill Tkhai</name>
<email>ktkhai@virtuozzo.com</email>
</author>
<published>2018-11-08T09:05:31Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=5e0fed717a383ce6e894334cffe7a2acc9dc17b9'/>
<id>urn:sha1:5e0fed717a383ce6e894334cffe7a2acc9dc17b9</id>
<content type='text'>
Currently, we wait on req-&gt;waitq in request_wait_answer() function only,
and it's never used for background requests.  Since wake_up() is not a
light-weight macros, instead of this, it unfolds in really called function,
which makes locking operations taking some cpu cycles, let's avoid its call
for the case we definitely know it's completely useless.

Signed-off-by: Kirill Tkhai &lt;ktkhai@virtuozzo.com&gt;
Signed-off-by: Miklos Szeredi &lt;mszeredi@redhat.com&gt;
</content>
</entry>
<entry>
<title>fuse: Optimize request_end() by not taking fiq-&gt;waitq.lock</title>
<updated>2019-02-13T12:15:13Z</updated>
<author>
<name>Kirill Tkhai</name>
<email>ktkhai@virtuozzo.com</email>
</author>
<published>2018-11-08T09:05:25Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=217316a601016d0d2913290070e6338576ae73f6'/>
<id>urn:sha1:217316a601016d0d2913290070e6338576ae73f6</id>
<content type='text'>
We take global fiq-&gt;waitq.lock every time, when we are in this function,
but interrupted requests are just small subset of all requests. This patch
optimizes request_end() and makes it to take the lock when it's really
needed.

queue_interrupt() needs small change for that. After req is linked to
interrupt list, we do smp_mb() and check for FR_FINISHED again. In case of
FR_FINISHED bit has appeared, we remove req and leave the function:

Signed-off-by: Kirill Tkhai &lt;ktkhai@virtuozzo.com&gt;
Signed-off-by: Miklos Szeredi &lt;mszeredi@redhat.com&gt;
</content>
</entry>
<entry>
<title>fuse: Kill fasync only if interrupt is queued in queue_interrupt()</title>
<updated>2019-02-13T12:15:13Z</updated>
<author>
<name>Kirill Tkhai</name>
<email>ktkhai@virtuozzo.com</email>
</author>
<published>2018-11-08T09:05:20Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=8da6e9183275c837e7d2401b3018ff81f657a19a'/>
<id>urn:sha1:8da6e9183275c837e7d2401b3018ff81f657a19a</id>
<content type='text'>
We should sent signal only in case of interrupt is really queued.  Not a
real problem, but this makes the code clearer and intuitive.

Signed-off-by: Kirill Tkhai &lt;ktkhai@virtuozzo.com&gt;
Signed-off-by: Miklos Szeredi &lt;mszeredi@redhat.com&gt;
</content>
</entry>
</feed>
