Bug #36284
closedBluestore might be hanging OSD
0%
Description
/a/dzafman-2018-09-26_22:31:44-rados-wip-zafman-testing-distro-basic-smithi/3074689
Bluetooth stuck in a loop starting after trying to push an object:
Soon after it starts looping. The primary's backfill (osd 2) hangs waiting for push reply from osd.0:
remote/smithi135/log/ceph-osd.2.log.gz
2018-09-27 14:00:42.237 7fb3431b1700 10 osd.2 pg_epoch: 37 pg[1.7( v 15'6190 (15'6100,15'6190] local-lis/les=23/24 n=6190 ec=10/10 lis/c 23/13 les/c/f 24/14/0 17/23/23) [1,0]/[2] backfill=[0,1] r=0 lpr=23 pi=[13,23)/2 rops=1 bft=0,1 crt=15'6190 lcod 0'0 mlcod 0'0 active+undersized+degraded+remapped+backfilling mbc={255={(1+0)=1}}] pushed 1:fbb557fb:::benchmark_data_smithi135_14561_object35464:head, still waiting for push ack from 1 others
remote/smithi135/log/ceph-osd.0.log.gz
2018-09-27 14:00:42.233 7f85ce1cf700 10 bluestore(/var/lib/ceph/osd/ceph-0) queue_transactions ch 0x56459b843800 1.7_head 2018-09-27 14:00:42.233 7f85ce1cf700 20 bluestore(/var/lib/ceph/osd/ceph-0) _txc_create osr 0x56459b7a4640 = 0x56459d044100 seq 10680 2018-09-27 14:00:42.233 7f85ce1cf700 20 bluestore(/var/lib/ceph/osd/ceph-0).collection(1.7_head 0x56459b843800) get_onode oid #1:fbb557fb:::benchmark_data_smithi135_14561_object35464:head# key 0x7f8000000000000001fbb557fb216265'nchmark_data_smithi135_14561_object35464!='0xfffffffffffffffeffffffffffffffff'o' 2018-09-27 14:00:42.233 7f85ce1cf700 20 bluestore(/var/lib/ceph/osd/ceph-0).collection(1.7_head 0x56459b843800) r -2 v.len 0 2018-09-27 14:00:42.233 7f85ce1cf700 10 bluestore(/var/lib/ceph/osd/ceph-0) _txc_add_transaction op 13 got ENOENT on #1:fbb557fb:::benchmark_data_smithi135_14561_object35464:head# 2018-09-27 14:00:42.233 7f85ce1cf700 20 bluestore(/var/lib/ceph/osd/ceph-0).collection(1.7_head 0x56459b843800) get_onode oid #1:fbb557fb:::benchmark_data_smithi135_14561_object35464:head# key 0x7f8000000000000001fbb557fb216265'nchmark_data_smithi135_14561_object35464!='0xfffffffffffffffeffffffffffffffff'o' 2018-09-27 14:00:42.233 7f85ce1cf700 20 bluestore(/var/lib/ceph/osd/ceph-0).collection(1.7_head 0x56459b843800) r -2 v.len 0 2018-09-27 14:00:42.233 7f85ce1cf700 15 bluestore(/var/lib/ceph/osd/ceph-0) _touch 1.7_head #1:fbb557fb:::benchmark_data_smithi135_14561_object35464:head# 2018-09-27 14:00:42.233 7f85ce1cf700 20 bluestore(/var/lib/ceph/osd/ceph-0) _assign_nid 44191 2018-09-27 14:00:42.233 7f85ce1cf700 10 bluestore(/var/lib/ceph/osd/ceph-0) _touch 1.7_head #1:fbb557fb:::benchmark_data_smithi135_14561_object35464:head# = 0 2018-09-27 14:00:42.233 7f85ce1cf700 15 bluestore(/var/lib/ceph/osd/ceph-0) _truncate 1.7_head #1:fbb557fb:::benchmark_data_smithi135_14561_object35464:head# 0x1 2018-09-27 14:00:42.233 7f85ce1cf700 15 bluestore(/var/lib/ceph/osd/ceph-0) _do_truncate 1.7_head #1:fbb557fb:::benchmark_data_smithi135_14561_object35464:head# 0x1 2018-09-27 14:00:42.233 7f85ce1cf700 10 bluestore(/var/lib/ceph/osd/ceph-0) _truncate 1.7_head #1:fbb557fb:::benchmark_data_smithi135_14561_object35464:head# 0x1 = 0 2018-09-27 14:00:42.233 7f85ce1cf700 15 bluestore(/var/lib/ceph/osd/ceph-0) _set_alloc_hint 1.7_head #1:fbb557fb:::benchmark_data_smithi135_14561_object35464:head# object_size 1 write_size 1 flags sequential_write+sequential_read+append_only+immutable 2018-09-27 14:00:42.233 7f85ce1cf700 10 bluestore(/var/lib/ceph/osd/ceph-0) _set_alloc_hint 1.7_head #1:fbb557fb:::benchmark_data_smithi135_14561_object35464:head# object_size 1 write_size 1 flags sequential_write+sequential_read+append_only+immutable = 0 2018-09-27 14:00:42.233 7f85ce1cf700 15 bluestore(/var/lib/ceph/osd/ceph-0) _write 1.7_head #1:fbb557fb:::benchmark_data_smithi135_14561_object35464:head# 0x0~1 2018-09-27 14:00:42.233 7f85ce1cf700 20 bluestore(/var/lib/ceph/osd/ceph-0) _do_write #1:fbb557fb:::benchmark_data_smithi135_14561_object35464:head# 0x0~1 - have 0x1 (1) bytes fadvise_flags 0x28 2018-09-27 14:00:42.233 7f85ce1cf700 20 bluestore(/var/lib/ceph/osd/ceph-0) _choose_write_options will prefer large blob and csum sizes 2018-09-27 14:00:42.233 7f85ce1cf700 20 bluestore(/var/lib/ceph/osd/ceph-0) _choose_write_options prefer csum_order 14 target_blob_size 0x10000 2018-09-27 14:00:42.233 7f85ce1cf700 10 bluestore(/var/lib/ceph/osd/ceph-0) _do_write_small 0x0~1 2018-09-27 14:00:42.233 7f85ce1cf700 20 bluestore(/var/lib/ceph/osd/ceph-0) _pad_zeros pad 0x0 + 0xfff on front/back, now 0x0~1000 2018-09-27 14:00:42.233 7f85ce1cf700 20 bluestore(/var/lib/ceph/osd/ceph-0) _do_alloc_write txc 0x56459d044100 1 blobs 2018-09-27 14:00:42.233 7f85ce1cf700 10 fbmap_alloc 0x564598525000 allocate 0x4000/4000,4000,0 2018-09-27 14:00:42.233 7f85ce1cf700 10 fbmap_alloc 0x564598525000 allocate 0x2a244000~4000/4000,4000,0 2018-09-27 14:00:42.233 7f85ce1cf700 20 bluestore(/var/lib/ceph/osd/ceph-0) _do_alloc_write prealloc [0x2a244000~4000] 2018-09-27 14:00:42.233 7f85ce1cf700 20 bluestore(/var/lib/ceph/osd/ceph-0) _do_alloc_write forcing csum_order to block_size_order 12 2018-09-27 14:00:42.233 7f85ce1cf700 20 bluestore(/var/lib/ceph/osd/ceph-0) _do_alloc_write initialize csum setting for new blob Blob(0x5645a8ecd490 blob([]) use_tracker(0x0 0x0) SharedBlob(0x5645a8ecd500 sbid 0x0)) csum_type crc32c csum_order 12 csum_length 0x4000 2018-09-27 14:00:42.233 7f85ce1cf700 20 bluestore(/var/lib/ceph/osd/ceph-0) _do_alloc_write blob Blob(0x5645a8ecd490 blob([0x2a244000~4000] csum crc32c/0x1000) use_tracker(0x0 0x0) SharedBlob(0x5645a8ecd500 sbid 0x0)) 2018-09-27 14:00:42.233 7f85ce1cf700 20 bluestore.blob(0x5645a8ecd490) get_ref 0x0~1 Blob(0x5645a8ecd490 blob([0x2a244000~4000] csum+has_unused crc32c/0x1000 unused=0xfff0) use_tracker(0x0 0x0) SharedBlob(0x5645a8ecd500 sbid 0x0)) 2018-09-27 14:00:42.233 7f85ce1cf700 20 bluestore.blob(0x5645a8ecd490) get_ref init 0x4000, 4000 2018-09-27 14:00:42.233 7f85ce1cf700 20 bluestore(/var/lib/ceph/osd/ceph-0) _do_alloc_write lex 0x0~1: 0x0~1 Blob(0x5645a8ecd490 blob([0x2a244000~4000] csum+has_unused crc32c/0x1000 unused=0xfff0) use_tracker(0x4000 0x1) SharedBlob(0x5645a8ecd500 sbid 0x0)) 2018-09-27 14:00:42.233 7f85ce1cf700 20 bluestore.BufferSpace(0x5645a8ecd518 in 0x56459822cc40) _discard 0x0~1000 2018-09-27 14:00:42.233 7f85ce1cf700 20 bluestore.extentmap(0x5645a8f76f10) dirty_range mark inline shard dirty 2018-09-27 14:00:42.233 7f85ce1cf700 10 bluestore(/var/lib/ceph/osd/ceph-0) _write 1.7_head #1:fbb557fb:::benchmark_data_smithi135_14561_object35464:head# 0x0~1 = 0 2018-09-27 14:00:42.233 7f85ce1cf700 15 bluestore(/var/lib/ceph/osd/ceph-0) _setattrs 1.7_head #1:fbb557fb:::benchmark_data_smithi135_14561_object35464:head# 2 keys 2018-09-27 14:00:42.233 7f85ce1cf700 10 bluestore(/var/lib/ceph/osd/ceph-0) _setattrs 1.7_head #1:fbb557fb:::benchmark_data_smithi135_14561_object35464:head# 2 keys = 0 2018-09-27 14:00:42.233 7f85ce1cf700 15 bluestore(/var/lib/ceph/osd/ceph-0) _omap_setkeys 1.7_head #1:e0000000::::head# 2018-09-27 14:00:42.233 7f85ce1cf700 20 bluestore(/var/lib/ceph/osd/ceph-0) _omap_setkeys 0x0000000000000434'._info' <- _info 2018-09-27 14:00:42.233 7f85ce1cf700 10 bluestore(/var/lib/ceph/osd/ceph-0) _omap_setkeys 1.7_head #1:e0000000::::head# = 0 2018-09-27 14:00:42.233 7f85ce1cf700 10 bluestore(/var/lib/ceph/osd/ceph-0) _txc_calc_cost 0x56459d044100 cost 10049 (2 ios * 4000 + 2049 bytes) 2018-09-27 14:00:42.233 7f85ce1cf700 20 bluestore(/var/lib/ceph/osd/ceph-0) _txc_write_nodes txc 0x56459d044100 onodes 0x5645a8f76dc0 shared_blobs 2018-09-27 14:00:42.233 7f85ce1cf700 20 bluestore.extentmap(0x5645a8f76f10) update #1:fbb557fb:::benchmark_data_smithi135_14561_object35464:head# 2018-09-27 14:00:42.233 7f85ce1cf700 20 bluestore.extentmap(0x5645a8f76f10) update inline shard 32 bytes from 1 extents 2018-09-27 14:00:42.233 7f85ce1cf700 20 bluestore(/var/lib/ceph/osd/ceph-0) _record_onode onode #1:fbb557fb:::benchmark_data_smithi135_14561_object35464:head# is 416 (378 bytes onode + 2 bytes spanning blobs + 36 bytes inline extents)
Other than heartbeat handling most other messages are the following repeating pattern.
2018-09-27 14:02:08.353 7f85e39fa700 20 bluestore.MempoolThread(0x564599058b48) _trim_shards cache_size: 2845415832 kv_alloc: 1191853424 kv_used: 897 meta_alloc: 1124744560 meta_used: 71284700 data_alloc: 528817847 data_used: 0 2018-09-27 14:02:08.353 7f85e39fa700 20 bluestore.2QCache(0x56459822c460) _trim onodes 6409 / 85125 buffers 0 / 66102230 2018-09-27 14:02:08.353 7f85e39fa700 20 bluestore.2QCache(0x56459822c380) _trim onodes 715 / 85125 buffers 0 / 66102230 2018-09-27 14:02:08.353 7f85e39fa700 20 bluestore.2QCache(0x56459822c540) _trim onodes 6208 / 85125 buffers 0 / 66102230 2018-09-27 14:02:08.353 7f85e39fa700 20 bluestore.2QCache(0x56459822d500) _trim onodes 6201 / 85125 buffers 0 / 66102230 2018-09-27 14:02:08.353 7f85e39fa700 20 bluestore.2QCache(0x56459822d420) _trim onodes 6100 / 85125 buffers 0 / 66102230 2018-09-27 14:02:08.353 7f85e39fa700 20 bluestore.2QCache(0x56459822d340) _trim onodes 6124 / 85125 buffers 0 / 66102230 2018-09-27 14:02:08.353 7f85e39fa700 20 bluestore.2QCache(0x56459822d260) _trim onodes 6066 / 85125 buffers 0 / 66102230 2018-09-27 14:02:08.353 7f85e39fa700 20 bluestore.2QCache(0x56459822cc40) _trim onodes 5338 / 85125 buffers 0 / 66102230 2018-09-27 14:02:08.405 7f85e39fa700 20 bluestore.MempoolThread(0x564599058b48) _trim_shards cache_size: 2845415832 kv_alloc: 1191853424 kv_used: 897 meta_alloc: 1124744560 meta_used: 71284700 data_alloc: 528817847 data_used: 0 2018-09-27 14:02:08.405 7f85e39fa700 20 bluestore.2QCache(0x56459822c460) _trim onodes 6409 / 85125 buffers 0 / 66102230 2018-09-27 14:02:08.405 7f85e39fa700 20 bluestore.2QCache(0x56459822c380) _trim onodes 715 / 85125 buffers 0 / 66102230 2018-09-27 14:02:08.405 7f85e39fa700 20 bluestore.2QCache(0x56459822c540) _trim onodes 6208 / 85125 buffers 0 / 66102230 2018-09-27 14:02:08.405 7f85e39fa700 20 bluestore.2QCache(0x56459822d500) _trim onodes 6201 / 85125 buffers 0 / 66102230 2018-09-27 14:02:08.405 7f85e39fa700 20 bluestore.2QCache(0x56459822d420) _trim onodes 6100 / 85125 buffers 0 / 66102230 2018-09-27 14:02:08.405 7f85e39fa700 20 bluestore.2QCache(0x56459822d340) _trim onodes 6124 / 85125 buffers 0 / 66102230 2018-09-27 14:02:08.405 7f85e39fa700 20 bluestore.2QCache(0x56459822d260) _trim onodes 6066 / 85125 buffers 0 / 66102230 2018-09-27 14:02:08.405 7f85e39fa700 20 bluestore.2QCache(0x56459822cc40) _trim onodes 5338 / 85125 buffers 0 / 66102230
Updated by Adam Kupczyk over 5 years ago
I have been working on a problem that seems to be related.
Using FIO with rados ioengine stops.
This seems to be related to cases when txc->osr does not contain txc as an element.
Normal:
2018-10-25 13:47:07.913 7f4c07965700 20 bluestore(/home/adam/ceph-2/build/dev/osd0) _txc_finish 0x5569a2112f00 onodes 0x5569a4377080 2018-10-25 13:47:07.913 7f4c07965700 20 bluestore(/home/adam/ceph-2/build/dev/osd0) _txc_finish txc 0x5569a2112f00 done 2018-10-25 13:47:07.913 7f4c07965700 20 bluestore(/home/adam/ceph-2/build/dev/osd0) _txc_finish osr 0x5569a13a57a0 q now empty
Faulty:
2018-10-25 13:47:07.925 7f4c07965700 20 bluestore(/home/adam/ceph-2/build/dev/osd0) _txc_finish 0x5569a2113800 onodes 0x5569a4377b80 2018-10-25 13:47:07.925 7f4c07965700 20 bluestore(/home/adam/ceph-2/build/dev/osd0) _txc_finish osr 0x5569a13a61c0 q now empty
Updated by Adam Kupczyk over 5 years ago
Observation: when deferred_aggressive==false, kv_sync_thread goes to sleep with deferred_done_queue nonempty.
Sometimes after that it wakes up, but sometimes it does not.
if (kv_queue.empty() && ((deferred_done_queue.empty() && deferred_stable_queue.empty()) || !deferred_aggressive)) { if (kv_stop) break; dout(20) << __func__ << " sleep kv_queue=" << kv_queue.size() << " deferred_done_queue=" << deferred_done_queue.size() << " deferred_stable_queue=" << deferred_stable_queue.size() << " deferred_aggressive=" << deferred_aggressive << dendl;
Gives:
2018-10-25 21:46:14.277 7f1035cb1700 20 bluestore(/home/adam/ceph-2/build/dev/osd0) _kv_sync_thread sleep kv_queue=0 deferred_done_queue=12 deferred_stable_queue=0 deferred_aggressive=0
Updated by Adam Kupczyk over 5 years ago
My problem was fixed by:
https://github.com/ceph/ceph/commit/f755bed3e438d2e7d5ed0df30b8d5bebf2d0cf5a
I expect this problem will be gone too.
Updated by Sage Weil over 5 years ago
- Is duplicate of Bug #36473: hung osd_repop, bluestore committed but failed to trigger repop_commit added