summaryrefslogtreecommitdiff
path: root/block/bio-integrity-auto.c
AgeCommit message (Collapse)Author
2026-01-10block: zero non-PI portion of auto integrity bufferCaleb Sander Mateos
The auto-generated integrity buffer for writes needs to be fully initialized before being passed to the underlying block device, otherwise the uninitialized memory can be read back by userspace or anyone with physical access to the storage device. If protection information is generated, that portion of the integrity buffer is already initialized. The integrity data is also zeroed if PI generation is disabled via sysfs or the PI tuple size is 0. However, this misses the case where PI is generated and the PI tuple size is nonzero, but the metadata size is larger than the PI tuple. In this case, the remainder ("opaque") of the metadata is left uninitialized. Generalize the BLK_INTEGRITY_CSUM_NONE check to cover any case when the metadata is larger than just the PI tuple. Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Fixes: c546d6f43833 ("block: only zero non-PI metadata tuples in bio_integrity_prep") Reviewed-by: Anuj Gupta <anuj20.g@samsung.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-11-04block: make bio auto-integrity deadlock safeChristoph Hellwig
The current block layer automatic integrity protection allocates the actual integrity buffer, which has three problems: - because it happens at the bottom of the I/O stack and doesn't use a mempool it can deadlock under load - because the data size in a bio is almost unbounded when using lage folios it can relatively easily exceed the maximum kmalloc size - even when it does not exceed the maximum kmalloc size, it could exceed the maximum segment size of the device Fix this by limiting the I/O size so that we can allocate at least a 2MiB integrity buffer, i.e. 128MiB for 8 byte PI and 512 byte integrity intervals, and create a mempool as a last resort for this maximum size, mirroring the scheme used for bvecs. As a nice upside none of this can fail now, so we remove the error handling and open code the trivial addition of the bip vec. The new allocation helpers sit outside of bio-integrity-auto.c because I plan to reuse them for file system based PI in the near future. Fixes: 7ba1ba12eeef ("block: Block layer data integrity support") Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Anuj Gupta <anuj20.g@samsung.com> Reviewed-by: Kanchan Joshi <joshi.k@samsung.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-11-04block: blocking mempool_alloc doesn't failChristoph Hellwig
So remove the error check for it in bio_integrity_prep. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Anuj Gupta <anuj20.g@samsung.com> Reviewed-by: Kanchan Joshi <joshi.k@samsung.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-07-01block: rename tuple_size field in blk_integrity to metadata_sizeAnuj Gupta
The tuple_size field in blk_integrity currently represents the total size of metadata associated with each data interval. To make the meaning more explicit, rename tuple_size to metadata_size. This is a purely mechanical rename with no functional changes. Suggested-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Anuj Gupta <anuj20.g@samsung.com> Link: https://lore.kernel.org/20250630090548.3317-2-anuj20.g@samsung.com Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-05-12block: always allocate integrity buffer when requiredKeith Busch
Many nvme metadata formats can not strip or generate the metadata on the controller side. For these formats, a host provided integrity buffer is mandatory even if it isn't checked. The block integrity read_verify and write_generate attributes prevent allocating the metadata buffer, but we need it when the format requires it, otherwise reads and writes will be rejected by the driver with IO errors. Assume the integrity buffer can be offloaded to the controller if the metadata size is the same as the protection information size. Otherwise provide an unchecked host buffer when the read verify or write generation attributes are disabled. This fixes the following nvme warning: ------------[ cut here ]------------ WARNING: CPU: 1 PID: 371 at drivers/nvme/host/core.c:1036 nvme_setup_rw+0x122/0x210 ... RIP: 0010:nvme_setup_rw+0x122/0x210 ... Call Trace: <TASK> nvme_setup_cmd+0x1b4/0x280 nvme_queue_rqs+0xc4/0x1f0 [nvme] blk_mq_dispatch_queue_requests+0x24a/0x430 blk_mq_flush_plug_list+0x50/0x140 __blk_flush_plug+0xc1/0x100 __submit_bio+0x1c1/0x360 ? submit_bio_noacct_nocheck+0x2d6/0x3c0 submit_bio_noacct_nocheck+0x2d6/0x3c0 ? submit_bio_noacct+0x47/0x4c0 submit_bio_wait+0x48/0xa0 __blkdev_direct_IO_simple+0xee/0x210 ? current_time+0x1d/0x100 ? current_time+0x1d/0x100 ? __bio_clone+0xb0/0xb0 blkdev_read_iter+0xbb/0x140 vfs_read+0x239/0x310 ksys_read+0x58/0xc0 do_syscall_64+0x6c/0x180 entry_SYSCALL_64_after_hwframe+0x4b/0x53 Signed-off-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20250509153802.3482493-1-kbusch@meta.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-03block: split struct bio_integrity_payloadChristoph Hellwig
Many of the fields in struct bio_integrity_payload are only needed for the default integrity buffer in the block layer, and the variable sized array at the end of the structure makes it very hard to embed into caller allocated structures. Reduce struct bio_integrity_payload to the minimal structure needed in common code and create two separate containing structures for the automatically generated payload and the caller allocated payload. The latter is a simple wrapper for struct bio_integrity_payload and the bvecs, while the former contains the additional fields moved out of struct bio_integrity_payload. Always use a dedicated mempool for automatic integrity metadata instead of depending on bio_set that is submitter controlled and thus often doesn't have the mempool initialized and stop using mempools for the submitter buffers as they aren't in the NOIO I/O submission path where we need to guarantee forward progress. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Tested-by: Anuj Gupta <anuj20.g@samsung.com> Reviewed-by: Anuj Gupta <anuj20.g@samsung.com> Reviewed-by: Kanchan Joshi <joshi.k@samsung.com> Link: https://lore.kernel.org/r/20250225154449.422989-4-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-03block: move the block layer auto-integrity code into a new fileChristoph Hellwig
The code that automatically creates a integrity payload and generates and verifies the checksums for bios that don't have submitter-provided integrity payload currently sits right in the middle of the block integrity metadata infrastructure. Split it into a separate file to make the different layers clear. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Anuj Gupta <anuj20.g@samsung.com> Reviewed-by: Kanchan Joshi <joshi.k@samsung.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/20250225154449.422989-3-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>