Bug #17413
Closed
objecter read request stuck for 2 days from OSD
Description
While testing http://tracker.ceph.com/issues/17275 I stumbled upon a read request that had been stuck for more than 2 days. ceph-fuse is 0.94.9 with the patch from #17275. The cluster is 10.2.1.
Example of request:
{ "tid": 629298, "pg": "1.1ab4db54", "osd": 44, "object_id": "1000024655f.00025010", "object_locator": "@1", "target_object_id": "1000024655f.00025010", "target_object_locator": "@1", "paused": 0, "used_replica": 0, "precalc_pgid": 0, "last_sent": "2016-09-24 04:13:37.639395", "attempts": 1, "snapid": "head", "snap_context": "0=[]", "mtime": "0.000000", "osd_ops": [ "read 131072~131072" ] },
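A request like the one above can be picked out of a larger dump by the age of its `last_sent` timestamp. The following is a minimal sketch, assuming the requests have already been loaded as a list of dicts with the fields shown above. The helper name `find_stuck_ops`, the one-hour threshold, and the `now` value are hypothetical choices for illustration and are not part of Ceph.

```python
import json
from datetime import datetime, timedelta


def find_stuck_ops(ops, now, threshold=timedelta(hours=1)):
    """Return (tid, osd, age) for every op whose last_sent is at least `threshold` old.

    `ops` is assumed to be a list of dicts shaped like the request shown above.
    """
    stuck = []
    for op in ops:
        # last_sent uses the "YYYY-MM-DD HH:MM:SS.ffffff" format seen in the dump.
        sent = datetime.strptime(op["last_sent"], "%Y-%m-%d %H:%M:%S.%f")
        age = now - sent
        if age >= threshold:
            stuck.append((op["tid"], op["osd"], age))
    return stuck


# Trimmed copy of the request from this report.
sample = json.loads("""
[
  {"tid": 629298, "osd": 44,
   "last_sent": "2016-09-24 04:13:37.639395",
   "attempts": 1, "osd_ops": ["read 131072~131072"]}
]
""")

# Hypothetical "current time", roughly two days after last_sent.
now = datetime(2016, 9, 26, 12, 0, 0)

for tid, osd, age in find_stuck_ops(sample, now):
    print(f"tid {tid} on osd.{osd} has been outstanding for {age}")
```

The `osd` field of each reported entry identifies which OSD's in-flight ops would be worth inspecting next.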
I am also attaching some debug info:
client cache: ceph-post-file: cd539449-cca1-4b36-bf32-54f0353c0815
mds cache: ceph-post-file: 477d14b0-dcc9-4291-a7fd-fe654483b6b8
ceph-fuse is no longer running, so I cannot collect additional debug output from it, but if you give me instructions I can collect it if the issue is found again.
Updated by Sage Weil almost 7 years ago
- Status changed from New to Closed
The helpful piece of info here would have been the output from 'ceph daemon osd.nnn ops' (for the OSD where the request is blocked). I'm guessing this is a dup of the rare hung op when it collides with snap trimming, but it's hard to say here! Closing this since there isn't any info here that'll help with that one.