Bug #8876

closed

kcephfs: hang on read of length 0

Added by Sage Weil almost 10 years ago. Updated almost 8 years ago.

Status: Resolved
Priority: Urgent
Assignee: -
Category: -
Target version: -
% Done: 0%
Source: Q/A
Tags:
Backport:
Regression:
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): kceph
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

[674656.313542] libceph: osd5 down
[674657.368218] libceph: tid 148530 reply has 49839 bytes we had only 0 bytes ready
[674680.713864] libceph: osd5 up
[674683.728762] libceph: tid 148530 reply has 49839 bytes we had only 0 bytes ready

flab:367203 01:25 PM # cat /proc/2080/stack
[<ffffffffa06c9bdf>] ceph_osdc_wait_request+0x2f/0x100 [libceph]
[<ffffffffa06cb26b>] ceph_osdc_readpages+0x14b/0x1c0 [libceph]
[<ffffffffa0713c43>] striped_read+0x143/0x350 [ceph]
[<ffffffffa071471c>] ceph_sync_read.isra.12+0x1dc/0x300 [ceph]
[<ffffffffa0714a52>] ceph_aio_read+0x212/0x360 [ceph]
[<ffffffff811d1f0a>] do_sync_read+0x5a/0x90
[<ffffffff811d30e1>] vfs_read+0xb1/0x180
[<ffffffff811d335f>] SyS_read+0x4f/0xb0
[<ffffffff8178527f>] tracesys+0xe1/0xe6
[<ffffffffffffffff>] 0xffffffffffffffff

2014-07-18 13:24:50.294071 7fa84a9b3700  1 -- 10.214.133.134:6826/26983 <== client.206191 10.214.131.102:0/921266500 1 ==== osd_op(client.206191.1:148530 10001296a89.00000000 [read 3428405~0 [1@-1]] 0.a95e59dd RETRY=1 retry+read e348892) v4 ==== 159+0+0 (3831072992 0 0) 0x23436240 con 0x18a14b00
2014-07-18 13:24:53.286215 7fa8439a5700  1 -- 10.214.133.134:6826/26983 --> 10.214.131.102:0/921266500 -- osd_op_reply(148530 10001296a89.00000000 [read 3428405~49839] v0'0 uv423857 ondisk = 0) v6 -- ?+0 0x2740500 con 0x18a14b00

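I think the bug here is that the kernel client is sending a read of length 0 in the first place, right? The request in the OSD log is "read 3428405~0", and the OSD is (correctly) translating that into a read of the rest of the object starting from that offset ("read 3428405~49839" in the reply). The client has no buffer prepared for that data, which matches the "reply has 49839 bytes we had only 0 bytes ready" messages and would explain why the reader stays blocked in ceph_osdc_wait_request.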
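
To make the mismatch concrete, here is a minimal sketch in C of the extent arithmetic described above. It is a model of the behavior, not the libceph or OSD code: the function names osd_effective_read_len, client_should_send_read and reply_fits are hypothetical, and the object size used in the usage note (3478244 bytes) is an assumption inferred from the logged offset plus the logged reply length.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model of how an OSD resolves a read extent "off~len".
 * Per the report, a length of 0 is treated as "read from off to the
 * end of the object" rather than "read nothing".
 */
uint64_t osd_effective_read_len(uint64_t object_size, uint64_t off, uint64_t len)
{
    if (off >= object_size)
        return 0;                       /* nothing left to read */

    uint64_t remaining = object_size - off;

    if (len == 0 || len > remaining)
        return remaining;               /* 0 means "to end of object" */
    return len;
}

/*
 * Hypothetical client-side guard: a zero-length read can be completed
 * locally (returning 0 bytes) without sending a request to the OSD.
 */
bool client_should_send_read(uint64_t len)
{
    return len != 0;
}

/*
 * The client can only accept a reply whose data fits in the buffer it
 * prepared when it built the request.
 */
bool reply_fits(uint64_t reply_len, uint64_t prepared_len)
{
    return reply_len <= prepared_len;
}
```

With the values from the logs, osd_effective_read_len(3478244, 3428405, 0) yields 49839, while the client prepared 0 bytes, so reply_fits(49839, 0) is false. A guard along the lines of client_should_send_read would avoid issuing the request at all; whether the actual fix took this form is not stated in this report.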
