Bug #1158
closed
Unfinished freeze hangs fsstress
Description
I've got a freeze that doesn't finish, and it's blocking fsstress. Logs in kai:~gregf/logs/fsstress/freeze_not_finishing.
Haven't diagnosed it any further than that.
Updated by Greg Farnum almost 13 years ago
Although actually, based on how long fsstress is taking on this disk, maybe nothing was blocked and it was just going slow. Ugh.
Updated by Greg Farnum almost 13 years ago
I managed to reproduce this on my mds_rename branch.
Updated by Greg Farnum almost 13 years ago
Well, it's a nested auth pin.
gregf@kai:~/logs/fsstress/freeze_not_finishing2$ grep -o "dir(10000000002) adjust_nested_auth_pins [-0-9]*/[-0-9]*" out/mds.as | sort | uniq -c
    658 dir(10000000002) adjust_nested_auth_pins -1/-1
    200 dir(10000000002) adjust_nested_auth_pins -1/0
    201 dir(10000000002) adjust_nested_auth_pins 1/0
    658 dir(10000000002) adjust_nested_auth_pins 1/1
Further analysis forthcoming, when I figure out a good way to do it...
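One rough way to turn those counts into a single number is to sum the signed adjustments per directory. The following is only a sketch: it assumes log lines contain exactly the substring that the grep above matches, and the function name net_nested_auth_pins is made up for illustration. It makes no claim about what the two halves of the a/b pair mean inside the MDS; it just adds each half up.

```python
import re
from collections import defaultdict

# Same substring the grep above extracts, e.g.
# "dir(10000000002) adjust_nested_auth_pins -1/0"
PATTERN = re.compile(r"dir\(([0-9a-f]+)\) adjust_nested_auth_pins (-?\d+)/(-?\d+)")


def net_nested_auth_pins(lines):
    """Sum both halves of every a/b adjustment, keyed by directory inode."""
    totals = defaultdict(lambda: [0, 0])
    for line in lines:
        m = PATTERN.search(line)
        if m is None:
            continue  # unrelated log line
        ino = m.group(1)
        totals[ino][0] += int(m.group(2))
        totals[ino][1] += int(m.group(3))
    return {ino: tuple(pair) for ino, pair in totals.items()}


# Rebuild the input from the uniq -c counts reported above.
counts = {"-1/-1": 658, "-1/0": 200, "1/0": 201, "1/1": 658}
sample = [
    f"dir(10000000002) adjust_nested_auth_pins {delta}"
    for delta, n in counts.items()
    for _ in range(n)
]

for ino, (first, second) in sorted(net_nested_auth_pins(sample).items()):
    print(f"dir({ino}) net {first}/{second}")
# prints: dir(10000000002) net 1/0
```

With the counts above the net comes out to 1/0, i.e. one more "1/0" adjustment than "-1/0", which matches the picture of a single nested auth pin that was never dropped.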
Updated by Sage Weil almost 13 years ago
If you can reproduce, you can enable the auth pin set define in mdstypes.h, which tracks who the pinners are.
//#define MDS_AUTHPIN_SET // define me for debugging auth pin leaks
Updated by Greg Farnum almost 13 years ago
Unfortunately, adjust_nested_auth_pins never sees who actually grabbed the pin. The other calls print the grabbing pointer, so it's easy enough to grep | sort | uniq to figure out who's taking and not dropping, even without the set being kept in memory (assuming you've got debugging cranked up).
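The "who's taking and not dropping" bookkeeping can be sketched as a small script rather than a grep pipeline. The log format here is an assumption, not something confirmed in this ticket: it presumes lines of the form "auth_pin by 0x... on ..." and "auth_unpin by 0x... on ...", and unbalanced_pinners is a hypothetical helper name. Adjust the regex to whatever the MDS debug output actually looks like.

```python
import re
from collections import Counter

# ASSUMED format (hypothetical): "auth_pin by 0x1234 ..." / "auth_unpin by 0x1234 ..."
PIN_RE = re.compile(r"\bauth_(pin|unpin) by (0x[0-9a-f]+)")


def unbalanced_pinners(lines):
    """Return {pointer: pins - unpins} for every pointer whose balance is nonzero."""
    balance = Counter()
    for line in lines:
        m = PIN_RE.search(line)
        if m is None:
            continue  # not a pin/unpin line
        kind, who = m.groups()
        balance[who] += 1 if kind == "pin" else -1
    return {who: n for who, n in balance.items() if n != 0}


# Made-up example lines, only to show the shape of the input and result.
example = [
    "mds auth_pin by 0xaaa on [dir 10000000002]",
    "mds auth_unpin by 0xaaa on [dir 10000000002]",
    "mds auth_pin by 0xbeef on [dir 10000000002]",
    "dir(10000000002) adjust_nested_auth_pins 1/1",
]
print(unbalanced_pinners(example))
```

Pointer values can be reused over the life of a log, so a nonzero balance is a hint about where to look rather than proof of a leak.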
Updated by Sage Weil almost 13 years ago
- Target version changed from v0.30 to v0.31
Updated by Sage Weil almost 13 years ago
- Story points set to 5
Updated by Sage Weil almost 13 years ago
- Target version changed from v0.31 to v0.32
Updated by Sage Weil almost 13 years ago
- Target version changed from v0.32 to v0.33
Updated by Sage Weil almost 13 years ago
- Status changed from New to Can't reproduce
FWIW I've hit several of these over the past two weeks and they've all boiled down to unstable locks, usually due to issues with client revocation or cap migration.
Enough has changed that I think it makes sense to close this out unless/until we see it on current master.
Updated by John Spray over 7 years ago
- Project changed from Ceph to CephFS
- Category deleted (1)
Bulk updating project=ceph category=mds bugs so that I can remove the MDS category from the Ceph project to avoid confusion.