Bug #4122
osd: possible corruption of osd caps
Description
Three users, all running bobtail, have reported a caps problem that I haven't been able to reproduce: EPERM on an operation that depends on caps. On IRC today, mjevans captured OSD logs of this happening, and it appears the caps were corrupted:
2013-02-13 19:39:25.467916 7f766fdb4700 10 osd.0 10 session 0x2c8cc60 client.libvirt has caps osdcap[grant(object_prefix rbd^@children class-read),grant(pool libvirt^@pool^@test rwx)] 'allow class-read object_prefix rbd_children, allow pool libvirt-pool-test rwx'
2013-02-13 19:39:25.468162 7f76787c7700 20 osd.0 10 _dispatch 0x3960240 osd_op(client.4254.0:1 rbd_directory [read 0~0] 3.30a98c1c) v4
2013-02-13 19:39:25.468218 7f76787c7700 15 osd.0 10 require_same_or_newer_map 10 (i am 10) 0x3960240
2013-02-13 19:39:25.468223 7f76787c7700 20 osd.0 10 _share_map_incoming client.4254 10.x.y.z:0/1004293 10
2013-02-13 19:39:25.468232 7f76787c7700 15 osd.0 10 enqueue_op 0x334cb40 prio 63 cost 0 latency 0.000160 osd_op(client.4254.0:1 rbd_directory [read 0~0] 3.30a98c1c) v4
2013-02-13 19:39:25.468266 7f76737bd700 10 osd.0 10 dequeue_op 0x334cb40 prio 63 cost 0 latency 0.000195 osd_op(client.4254.0:1 rbd_directory [read 0~0] 3.30a98c1c) v4 pg pg[3.1c( v 10'2 (0'0,10'2] local-les=8 n=1 ec=4 les/c 8/8 7/7/7) [0] r=0 lpr=7 mlcod 10'2 active+degraded]
2013-02-13 19:39:25.468312 7f76737bd700 20 osd.0 pg_epoch: 10 pg[3.1c( v 10'2 (0'0,10'2] local-les=8 n=1 ec=4 les/c 8/8 7/7/7) [0] r=0 lpr=7 mlcod 10'2 active+degraded] op_has_sufficient_caps pool=3 (libvirt-pool-test) owner=0 need_read_cap=1 need_write_cap=0 need_class_read_cap=0 need_class_write_cap=0 -> NO
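The corruption shows up in the "has caps" log line: the parsed form contains NUL bytes (rendered as `^@`) where the quoted original string has `_` or `-`. As a rough way to scan other OSD logs for the same symptom, a minimal sketch follows. The helper name `check_caps_line` and the regular expression are assumptions made for illustration and are not part of Ceph; the log format is taken only from the single line quoted above.

```python
import re

# Hypothetical helper, not part of Ceph. It matches the "has caps" line format
# seen in the log above: the parsed osdcap[...] text followed by the original
# capability string in single quotes.
CAPS_RE = re.compile(r"has caps (osdcap\[.*\]) '([^']*)'")


def check_caps_line(line):
    """Return (parsed, original, corrupted), or None if the line has no caps."""
    m = CAPS_RE.search(line)
    if m is None:
        return None
    parsed, original = m.group(1), m.group(2)
    # A NUL may appear literally or as the caret notation "^@", depending on
    # how the log was viewed or pasted.
    corrupted = "^@" in parsed or "\x00" in parsed
    return parsed, original, corrupted


line = (
    "2013-02-13 19:39:25.467916 7f766fdb4700 10 osd.0 10 session 0x2c8cc60 "
    "client.libvirt has caps osdcap[grant(object_prefix rbd^@children class-read),"
    "grant(pool libvirt^@pool^@test rwx)] "
    "'allow class-read object_prefix rbd_children, allow pool libvirt-pool-test rwx'"
)
result = check_caps_line(line)
print(result[2])
```

A line flagged this way only indicates the symptom seen in this report; it says nothing about the cause.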
The later comments on https://bugs.launchpad.net/glance/+bug/1077045 may be the same problem as well.
Updated by Sage Weil about 11 years ago
- Category set to OSD
- Status changed from New to Need More Info
That output line is right after OSDCap::parse() is called, and the result appears to be wrong. My guess is that there is some subtle problem with the version of boost you are using. I've added a unit test for exactly this case in 2ce28ef1d7f95e71e1043912dfa269ea3b0d1599 (current master) and it's passing on all of our build machines.
What OS are you running? Where did you get the packages? What does 'dpkg -l | grep boost' (or equivalent) say?
Updated by Sage Weil about 11 years ago
AHA, fails on quantal:
Testing input 'allow class-read object_prefix rbd_children, allow pool libvirt-pool-test rwx'
test/osd/osdcap.cc:435: Failure
Value of: stringify(cap)
  Actual: "osdcap[grant(object_prefix rbd\0children class-read),grant(pool libvirt\0pool\0test rwx)]"
Expected: test_values[i].output
Which is: "osdcap[grant(object_prefix rbd_children class-read),grant(pool libvirt-pool-test rwx)]"
http://gitbuilder.sepia.ceph.com/gitbuilder-quantal-amd64/log.cgi?log=2ce28ef1d7f95e71e1043912dfa269ea3b0d1599
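To see which characters are affected in the failing test output, the expected and actual strings can be compared position by position. The sketch below is illustrative only: `replaced_chars` is a hypothetical helper, and the two strings are copied from the test failure above, with `\0` written as `\x00`.

```python
def replaced_chars(expected, actual):
    """Return the set of expected characters that appear as NUL in actual."""
    if len(expected) != len(actual):
        raise ValueError("strings differ in length; not a 1:1 substitution")
    return {e for e, a in zip(expected, actual) if a == "\x00" and e != a}


expected = (
    "osdcap[grant(object_prefix rbd_children class-read),"
    "grant(pool libvirt-pool-test rwx)]"
)
actual = (
    "osdcap[grant(object_prefix rbd\x00children class-read),"
    "grant(pool libvirt\x00pool\x00test rwx)]"
)
print(sorted(replaced_chars(expected, actual)))  # ['-', '_']
```

In this one sample, only `_` and `-` inside the object prefix and pool name are replaced, while the same characters in `object_prefix` and `class-read` come through intact. That is an observation about this test case, not a diagnosis of the underlying bug.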
Updated by Sage Weil about 11 years ago
- Status changed from Need More Info to Fix Under Review
wip-4122
Updated by Sage Weil about 11 years ago
- Status changed from Fix Under Review to Resolved
- Backport set to bobtail