<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/drivers/md/raid1.c, branch linux-3.12.y</title>
<subtitle>Hosts the 0x221E linux distro kernel.</subtitle>
<id>https://universe.0xinfinity.dev/distro/kernel/atom?h=linux-3.12.y</id>
<link rel='self' href='https://universe.0xinfinity.dev/distro/kernel/atom?h=linux-3.12.y'/>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/'/>
<updated>2017-05-09T06:19:50Z</updated>
<entry>
<title>md:raid1: fix a dead loop when reading from a WriteMostly disk</title>
<updated>2017-05-09T06:19:50Z</updated>
<author>
<name>Wei Fang</name>
<email>fangwei1@huawei.com</email>
</author>
<published>2016-03-21T11:18:32Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=204c56d845fcec43176160f831b86444d745cf67'/>
<id>urn:sha1:204c56d845fcec43176160f831b86444d745cf67</id>
<content type='text'>
commit 816b0acf3deb6d6be5d0519b286fdd4bafade905 upstream.

If first_bad == this_sector when we get the WriteMostly disk
in read_balance(), a valid disk will be returned with zero
max_sectors. That leads to a dead loop in make_request(), and
OOM follows from the endless allocation of struct bio.

Since we can't get any data from this disk in this case,
continue on to another disk.
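
For illustration, a toy user-space model (not kernel code; the numbers
are hypothetical) of why a zero max_sectors hangs the split loop:

    #include &lt;stdio.h&gt;

    int main(void)
    {
        long first_bad = 4096, this_sector = 4096;   /* equal: the bad case */
        long max_sectors = first_bad - this_sector;  /* == 0 */
        long handled = 0, to_read = 8;
        int passes = 0;

        /* make_request()-style splitting: each pass would allocate a bio */
        while (handled &lt; to_read &amp;&amp; passes &lt; 5) {     /* capped for the demo */
            handled += max_sectors;                  /* adds 0: no progress */
            passes++;
        }
        printf("%d passes, %ld of %ld sectors handled\n",
               passes, handled, to_read);
        return 0;
    }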

Signed-off-by: Wei Fang &lt;fangwei1@huawei.com&gt;
Signed-off-by: Shaohua Li &lt;shli@fb.com&gt;
Cc: Julia Lawall &lt;julia.lawall@lip6.fr&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;

</content>
</entry>
<entry>
<title>md/raid1: submit_bio_wait() returns 0 on success</title>
<updated>2015-11-12T13:09:03Z</updated>
<author>
<name>Jes Sorensen</name>
<email>Jes.Sorensen@redhat.com</email>
</author>
<published>2015-10-20T16:09:12Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=4ab46026e088c6e7801e278101d66a60ce6b72b8'/>
<id>urn:sha1:4ab46026e088c6e7801e278101d66a60ce6b72b8</id>
<content type='text'>
commit 203d27b0226a05202438ddb39ef0ef1acb14a759 upstream.

This was introduced by commit 9e882242c6193ae6f416f2d8d8db0d9126bd996b,
which changed submit_bio_wait() to return != 0 on error but didn't
update this caller accordingly.
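
As a sketch of the pattern (not the literal patch), the caller in
narrow_write_error() has to treat a negative return as the failure case;
in this kernel era the signature is submit_bio_wait(int rw, struct bio *bio):

    /* sketch: record bad blocks only when the write actually failed */
    if (submit_bio_wait(WRITE, wbio) &lt; 0)
        /* failure! */
        ok = rdev_set_badblocks(rdev, sector, sectors, 0) &amp;&amp; ok;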

Fixes: 9e882242c6 ("block: Add submit_bio_wait(), remove from md")
Reported-by: Bill Kuzeja &lt;William.Kuzeja@stratus.com&gt;
Signed-off-by: Jes Sorensen &lt;Jes.Sorensen@redhat.com&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.com&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;

</content>
</entry>
<entry>
<title>md/raid1: extend spinlock to protect raid1_end_read_request against inconsistencies</title>
<updated>2015-08-25T14:56:56Z</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.com</email>
</author>
<published>2015-07-27T01:48:52Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=44174364dd6ce4a345f402474294d7694f7312ab'/>
<id>urn:sha1:44174364dd6ce4a345f402474294d7694f7312ab</id>
<content type='text'>
commit 423f04d63cf421ea436bcc5be02543d549ce4b28 upstream.

raid1_end_read_request() assumes that the In_sync bits are consistent
with the -&gt;degraded count.
raid1_spare_active() updates the In_sync bit before the -&gt;degraded count
and so exposes an inconsistency, as does error().
So extend the spinlock in raid1_spare_active() and error() to hide those
inconsistencies.
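
As a sketch (not the literal diff), the idea is to hold conf-&gt;device_lock
across both the In_sync update and the -&gt;degraded update, so a concurrent
raid1_end_read_request() never sees one without the other:

    /* sketch, in the style of raid1_spare_active() */
    spin_lock_irqsave(&amp;conf-&gt;device_lock, flags);
    for (i = 0; i &lt; conf-&gt;raid_disks; i++) {
        struct md_rdev *rdev = conf-&gt;mirrors[i].rdev;

        if (rdev
            &amp;&amp; !test_bit(Faulty, &amp;rdev-&gt;flags)
            &amp;&amp; !test_and_set_bit(In_sync, &amp;rdev-&gt;flags)) {
            count++;
            sysfs_notify_dirent_safe(rdev-&gt;sysfs_state);
        }
    }
    mddev-&gt;degraded -= count;   /* now updated under the same lock */
    spin_unlock_irqrestore(&amp;conf-&gt;device_lock, flags);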

This should probably be part of
  Commit: 34cab6f42003 ("md/raid1: fix test for 'was read error from
  last working device'.")
as it addresses the same issue.  It fixes the same bug and should go
to -stable for the same reasons.

Fixes: 76073054c95b ("md/raid1: clean up read_balance.")
Signed-off-by: NeilBrown &lt;neilb@suse.com&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;

</content>
</entry>
<entry>
<title>md/raid1: fix test for 'was read error from last working device'.</title>
<updated>2015-08-19T06:36:38Z</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.com</email>
</author>
<published>2015-07-23T23:22:16Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=7ceb73910bf3e50b19ae16ad9481f3eb4d09df65'/>
<id>urn:sha1:7ceb73910bf3e50b19ae16ad9481f3eb4d09df65</id>
<content type='text'>
commit 34cab6f42003cb06f48f86a86652984dec338ae9 upstream.

When we get a read error from the last working device, we don't
try to repair it, and don't fail the device.  We simply report a
read error to the caller.

However the current test for 'is this the last working device' is
wrong.
When there is only one fully working device, it assumes that any
non-faulty device is that device.  However, a spare which is rebuilding
is also non-faulty, yet it is not that fully working device.

So change the test from "!Faulty" to "In_sync".  If -&gt;degraded says
there is only one fully working device and this device is in_sync,
this must be the one.
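
A sketch of the corrected test (not the literal patch) as it would appear
in raid1_end_read_request():

    /* "don't retry": report the error upwards only if this really is
     * the last fully working device */
    if (r1_bio-&gt;mddev-&gt;degraded == conf-&gt;raid_disks ||
        (r1_bio-&gt;mddev-&gt;degraded == conf-&gt;raid_disks - 1 &amp;&amp;
         test_bit(In_sync, &amp;conf-&gt;mirrors[mirror].rdev-&gt;flags)))
        uptodate = 1;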

This bug has existed since we allowed read_balance() to read from
a recovering spare in v3.0.

Reported-and-tested-by: Alexander Lyakas &lt;alex.bolshoy@gmail.com&gt;
Fixes: 76073054c95b ("md/raid1: clean up read_balance.")
Signed-off-by: NeilBrown &lt;neilb@suse.com&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;

</content>
</entry>
<entry>
<title>md/raid1: fix read balance when a drive is write-mostly.</title>
<updated>2015-03-05T14:37:07Z</updated>
<author>
<name>Tomáš Hodek</name>
<email>tomas.hodek@volny.cz</email>
</author>
<published>2015-02-23T00:00:38Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=8107488ca4689830bc4a11d80017cee009f458ee'/>
<id>urn:sha1:8107488ca4689830bc4a11d80017cee009f458ee</id>
<content type='text'>
commit d1901ef099c38afd11add4cfb3312c02ef21ec4a upstream.

When a drive is marked write-mostly it should only be the
target of reads if there is no other option.

This behaviour was broken by

commit 9dedf60313fa4dddfd5b9b226a0ef12a512bf9dc
    md/raid1: read balance chooses idlest disk for SSD

which causes a write-mostly device to be *preferred* in some cases.

Restore correct behaviour by checking and setting
best_dist_disk and best_pending_disk rather than best_disk.

We only need to test one of these, as they both change
from -1 to &gt;=0 at the same time.

As we leave min_pending and best_dist unchanged, any non-write-mostly
device will appear better than the write-mostly device.
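
A simplified sketch of the write-mostly branch in read_balance() after the
fix (bad-block handling omitted):

    /* remember a write-mostly disk only as a fallback; never pick it
     * as the immediate best_disk */
    if (test_bit(WriteMostly, &amp;rdev-&gt;flags)) {
        if (best_dist_disk &lt; 0) {
            best_dist_disk = disk;
            best_pending_disk = disk;
        }
        continue;
    }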

Reported-by: Tomáš Hodek &lt;tomas.hodek@volny.cz&gt;
Reported-by: Dark Penguin &lt;darkpenguin@yandex.ru&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
Link: http://marc.info/?l=linux-raid&amp;m=135982797322422
Fixes: 9dedf60313fa4dddfd5b9b226a0ef12a512bf9dc
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
</entry>
<entry>
<title>md/raid1: fix_read_error should act on all non-faulty devices.</title>
<updated>2014-10-13T13:41:38Z</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.de</email>
</author>
<published>2014-09-18T01:09:04Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=995925f1ecd8c80bb265153119362c495648a91e'/>
<id>urn:sha1:995925f1ecd8c80bb265153119362c495648a91e</id>
<content type='text'>
commit b8cb6b4c121e1bf1963c16ed69e7adcb1bc301cd upstream.

If a device is being recovered it is not InSync and is not Faulty.

If a read error is experienced on that device, fix_read_error()
will be called, but it ignores non-InSync devices.  So it will
neither fix the error nor fail the device.

It is incorrect that fix_read_error() ignores non-InSync devices.
It should only ignore Faulty devices.  So fix it.
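
A sketch of the corrected test (not the literal patch) in the write-back
loop of fix_read_error():

    /* skip only Faulty devices; a recovering (non-InSync) device should
     * still be written to, or failed if the write fails */
    if (rdev &amp;&amp;
        !test_bit(Faulty, &amp;rdev-&gt;flags))     /* was: test_bit(In_sync, ...) */
        r1_sync_page_io(rdev, sect, s, conf-&gt;tmppage, WRITE);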

This became a bug when we allowed reading from a device that was being
recovered.  It is suitable for any subsequent -stable kernel.

Fixes: da8840a747c0dbf49506ec906757a6b87b9741e9
Reported-by: Alexander Lyakas &lt;alex.bolshoy@gmail.com&gt;
Tested-by: Alexander Lyakas &lt;alex.bolshoy@gmail.com&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
</entry>
<entry>
<title>md/raid1,raid10: always abort recovery on write error.</title>
<updated>2014-09-17T15:06:51Z</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.de</email>
</author>
<published>2014-07-31T00:16:29Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=8c74345afe07978d0b7efdaf561183bdd888a3e1'/>
<id>urn:sha1:8c74345afe07978d0b7efdaf561183bdd888a3e1</id>
<content type='text'>
commit 2446dba03f9dabe0b477a126cbeb377854785b47 upstream.

Currently we don't abort recovery on a write error if the write error
to the recovering device was triggered by normal IO (as opposed to
recovery IO).

This means that for one bitmap region, the recovery might write to the
recovering device for a few sectors, then not bother for subsequent
sectors (as it never writes to failed devices).  In this case
the bitmap bit will be cleared, but it really shouldn't be.

The result is that if the recovering device fails and is then re-added
(after fixing whatever hardware problem triggered the failure),
the second recovery won't redo the region it was in the middle of,
so some of the device will not be recovered properly.

If we abort the recovery, the region being processed will be cancelled
(its bit not cleared) and the whole region will be retried.

As the bug can result in data corruption the patch is suitable for
-stable.  For kernels prior to 3.11 there is a conflict in raid10.c
which will require care.

Original-from: jiao hui &lt;jiaohui@bwstor.com.cn&gt;
Reported-and-tested-by: jiao hui &lt;jiaohui@bwstor.com.cn&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;

</content>
</entry>
<entry>
<title>md/raid1: r1buf_pool_alloc: free allocated pages when subsequent allocation fails.</title>
<updated>2014-05-29T09:38:14Z</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.de</email>
</author>
<published>2014-04-09T02:25:43Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=6c662df0d0c0770b92283c45b9566e748812d7b8'/>
<id>urn:sha1:6c662df0d0c0770b92283c45b9566e748812d7b8</id>
<content type='text'>
commit da1aab3dca9aa88ae34ca392470b8943159e25fe upstream.

When performing a user-request check/repair (MD_RECOVERY_REQUEST is set)
on a raid1, we allocate multiple bios each with their own set of pages.

If the page allocation for one bio fails, we currently do *not* free
the pages allocated for the previous bios, nor do we free the bio itself.

This patch frees all the already-allocated pages, and makes sure that
all the bios are freed as well.
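
The unwind pattern, as a standalone sketch (not the kernel code): when the
j-th allocation fails, free allocations 0..j-1 before reporting failure.

    #include &lt;stdlib.h&gt;

    /* allocate n buffers of 'size' bytes; on partial failure free
     * everything already allocated and return NULL */
    static void **alloc_all_or_nothing(int n, size_t size)
    {
        void **bufs = calloc(n, sizeof(*bufs));
        if (!bufs)
            return NULL;
        for (int j = 0; j &lt; n; j++) {
            bufs[j] = malloc(size);
            if (!bufs[j]) {
                while (--j &gt;= 0)   /* unwind the earlier allocations */
                    free(bufs[j]);
                free(bufs);
                return NULL;
            }
        }
        return bufs;
    }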

This bug can cause a memory leak which can ultimately OOM a machine.
It was introduced in 3.10-rc1.

Fixes: a07876064a0b73ab5ef1ebcf14b1cf0231c07858
Cc: Kent Overstreet &lt;koverstreet@google.com&gt;
Reported-by: Russell King - ARM Linux &lt;linux@arm.linux.org.uk&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
</entry>
<entry>
<title>md/raid1: restore ability for check and repair to fix read errors.</title>
<updated>2014-02-22T21:32:29Z</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.de</email>
</author>
<published>2014-02-05T01:17:01Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=93bc05512d3f7bb51e1d0631be065099292708f2'/>
<id>urn:sha1:93bc05512d3f7bb51e1d0631be065099292708f2</id>
<content type='text'>
commit 1877db75589a895bbdc4c4c3f23558e57b521141 upstream.

commit 30bc9b53878a9921b02e3b5bc4283ac1c6de102a
    md/raid1: fix bio handling problems in process_checks()

Moved the bio_reset() call to a point before where BIO_UPTODATE is checked,
so that the check now always reports that the bio is uptodate, even if it is not.

This causes process_checks() to sometimes treat read errors as
successful matches, so the good data isn't written out.

This patch preserves the flag until it is needed.
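
A sketch of the approach (not the literal patch) in process_checks():
latch the flag before bio_reset() wipes it, then act on the saved copy.

    /* bio_reset() reinitialises bi_flags with BIO_UPTODATE set, so the
     * read result must be captured first */
    int uptodate = test_bit(BIO_UPTODATE, &amp;b-&gt;bi_flags);

    bio_reset(b);
    if (!uptodate)
        clear_bit(BIO_UPTODATE, &amp;b-&gt;bi_flags);   /* keep the read error visible */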

The bug was introduced in 3.11, but the offending commit was backported to
3.10-stable (as it fixed an even worse bug).  So this is suitable for any
-stable since 3.10.

Reported-and-tested-by: Michael Tokarev &lt;mjt@tls.msk.ru&gt;
Fixes: 30bc9b53878a9921b02e3b5bc4283ac1c6de102a
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>md: Fix skipping recovery for read-only arrays.</title>
<updated>2013-10-24T01:55:17Z</updated>
<author>
<name>Lukasz Dorau</name>
<email>lukasz.dorau@intel.com</email>
</author>
<published>2013-10-24T01:55:17Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=61e4947c99c4494336254ec540c50186d186150b'/>
<id>urn:sha1:61e4947c99c4494336254ec540c50186d186150b</id>
<content type='text'>
Since:
        commit 7ceb17e87bde79d285a8b988cfed9eaeebe60b86
        md: Allow devices to be re-added to a read-only array.

spares are activated on a read-only array. For the raid1 and raid10
personalities this causes not-in-sync devices to be marked in-sync
without checking whether recovery has finished.

If a read-only array is degraded and one of its devices is not in-sync
(because the array has been only partially recovered), recovery will be skipped.

This patch adds a check that recovery has finished before marking a device
in-sync for the raid1 and raid10 personalities. For the raid5 personality
such a check is already present (at raid5.c:6029).
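
A sketch of the raid1 side (not the literal patch): in raid1_spare_active(),
treat a device as in-sync only once its recovery has actually completed.

    /* a fully recovered device has recovery_offset == MaxSector */
    if (rdev
        &amp;&amp; rdev-&gt;recovery_offset == MaxSector
        &amp;&amp; !test_bit(Faulty, &amp;rdev-&gt;flags)
        &amp;&amp; !test_and_set_bit(In_sync, &amp;rdev-&gt;flags)) {
        count++;
        sysfs_notify_dirent_safe(rdev-&gt;sysfs_state);
    }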

The bug was introduced in 3.10 and causes data corruption.

Cc: stable@vger.kernel.org
Signed-off-by: Pawel Baldysiak &lt;pawel.baldysiak@intel.com&gt;
Signed-off-by: Lukasz Dorau &lt;lukasz.dorau@intel.com&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</content>
</entry>
</feed>
