Project

General

Profile

Actions

Bug #36284

closed

Bluestore might be hanging OSD

Added by David Zafman over 5 years ago. Updated over 5 years ago.

Status:
Duplicate
Priority:
Normal
Assignee:
-
Target version:
-
% Done:

0%

Source:
Development
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

/a/dzafman-2018-09-26_22:31:44-rados-wip-zafman-testing-distro-basic-smithi/3074689

Bluetooth stuck in a loop starting after trying to push an object:

Soon after it starts looping. The primary's backfill (osd 2) hangs waiting for push reply from osd.0:

remote/smithi135/log/ceph-osd.2.log.gz

2018-09-27 14:00:42.237 7fb3431b1700 10 osd.2 pg_epoch: 37 pg[1.7( v 15'6190 (15'6100,15'6190] local-lis/les=23/24 n=6190 ec=10/10 lis/c 23/13 les/c/f 24/14/0 17/23/23) [1,0]/[2] backfill=[0,1] r=0 lpr=23 pi=[13,23)/2 rops=1 bft=0,1 crt=15'6190 lcod 0'0 mlcod 0'0 active+undersized+degraded+remapped+backfilling mbc={255={(1+0)=1}}] pushed 1:fbb557fb:::benchmark_data_smithi135_14561_object35464:head, still waiting for push ack from 1 others

remote/smithi135/log/ceph-osd.0.log.gz

2018-09-27 14:00:42.233 7f85ce1cf700 10 bluestore(/var/lib/ceph/osd/ceph-0) queue_transactions ch 0x56459b843800 1.7_head
2018-09-27 14:00:42.233 7f85ce1cf700 20 bluestore(/var/lib/ceph/osd/ceph-0) _txc_create osr 0x56459b7a4640 = 0x56459d044100 seq 10680
2018-09-27 14:00:42.233 7f85ce1cf700 20 bluestore(/var/lib/ceph/osd/ceph-0).collection(1.7_head 0x56459b843800) get_onode oid #1:fbb557fb:::benchmark_data_smithi135_14561_object35464:head# key 0x7f8000000000000001fbb557fb216265'nchmark_data_smithi135_14561_object35464!='0xfffffffffffffffeffffffffffffffff'o'
2018-09-27 14:00:42.233 7f85ce1cf700 20 bluestore(/var/lib/ceph/osd/ceph-0).collection(1.7_head 0x56459b843800)  r -2 v.len 0
2018-09-27 14:00:42.233 7f85ce1cf700 10 bluestore(/var/lib/ceph/osd/ceph-0) _txc_add_transaction op 13 got ENOENT on #1:fbb557fb:::benchmark_data_smithi135_14561_object35464:head#
2018-09-27 14:00:42.233 7f85ce1cf700 20 bluestore(/var/lib/ceph/osd/ceph-0).collection(1.7_head 0x56459b843800) get_onode oid #1:fbb557fb:::benchmark_data_smithi135_14561_object35464:head# key 0x7f8000000000000001fbb557fb216265'nchmark_data_smithi135_14561_object35464!='0xfffffffffffffffeffffffffffffffff'o'
2018-09-27 14:00:42.233 7f85ce1cf700 20 bluestore(/var/lib/ceph/osd/ceph-0).collection(1.7_head 0x56459b843800)  r -2 v.len 0
2018-09-27 14:00:42.233 7f85ce1cf700 15 bluestore(/var/lib/ceph/osd/ceph-0) _touch 1.7_head #1:fbb557fb:::benchmark_data_smithi135_14561_object35464:head#
2018-09-27 14:00:42.233 7f85ce1cf700 20 bluestore(/var/lib/ceph/osd/ceph-0) _assign_nid 44191
2018-09-27 14:00:42.233 7f85ce1cf700 10 bluestore(/var/lib/ceph/osd/ceph-0) _touch 1.7_head #1:fbb557fb:::benchmark_data_smithi135_14561_object35464:head# = 0
2018-09-27 14:00:42.233 7f85ce1cf700 15 bluestore(/var/lib/ceph/osd/ceph-0) _truncate 1.7_head #1:fbb557fb:::benchmark_data_smithi135_14561_object35464:head# 0x1
2018-09-27 14:00:42.233 7f85ce1cf700 15 bluestore(/var/lib/ceph/osd/ceph-0) _do_truncate 1.7_head #1:fbb557fb:::benchmark_data_smithi135_14561_object35464:head# 0x1
2018-09-27 14:00:42.233 7f85ce1cf700 10 bluestore(/var/lib/ceph/osd/ceph-0) _truncate 1.7_head #1:fbb557fb:::benchmark_data_smithi135_14561_object35464:head# 0x1 = 0
2018-09-27 14:00:42.233 7f85ce1cf700 15 bluestore(/var/lib/ceph/osd/ceph-0) _set_alloc_hint 1.7_head #1:fbb557fb:::benchmark_data_smithi135_14561_object35464:head# object_size 1 write_size 1 flags sequential_write+sequential_read+append_only+immutable
2018-09-27 14:00:42.233 7f85ce1cf700 10 bluestore(/var/lib/ceph/osd/ceph-0) _set_alloc_hint 1.7_head #1:fbb557fb:::benchmark_data_smithi135_14561_object35464:head# object_size 1 write_size 1 flags sequential_write+sequential_read+append_only+immutable = 0
2018-09-27 14:00:42.233 7f85ce1cf700 15 bluestore(/var/lib/ceph/osd/ceph-0) _write 1.7_head #1:fbb557fb:::benchmark_data_smithi135_14561_object35464:head# 0x0~1
2018-09-27 14:00:42.233 7f85ce1cf700 20 bluestore(/var/lib/ceph/osd/ceph-0) _do_write #1:fbb557fb:::benchmark_data_smithi135_14561_object35464:head# 0x0~1 - have 0x1 (1) bytes fadvise_flags 0x28
2018-09-27 14:00:42.233 7f85ce1cf700 20 bluestore(/var/lib/ceph/osd/ceph-0) _choose_write_options will prefer large blob and csum sizes
2018-09-27 14:00:42.233 7f85ce1cf700 20 bluestore(/var/lib/ceph/osd/ceph-0) _choose_write_options prefer csum_order 14 target_blob_size 0x10000
2018-09-27 14:00:42.233 7f85ce1cf700 10 bluestore(/var/lib/ceph/osd/ceph-0) _do_write_small 0x0~1
2018-09-27 14:00:42.233 7f85ce1cf700 20 bluestore(/var/lib/ceph/osd/ceph-0) _pad_zeros pad 0x0 + 0xfff on front/back, now 0x0~1000
2018-09-27 14:00:42.233 7f85ce1cf700 20 bluestore(/var/lib/ceph/osd/ceph-0) _do_alloc_write txc 0x56459d044100 1 blobs
2018-09-27 14:00:42.233 7f85ce1cf700 10 fbmap_alloc 0x564598525000 allocate 0x4000/4000,4000,0
2018-09-27 14:00:42.233 7f85ce1cf700 10 fbmap_alloc 0x564598525000 allocate 0x2a244000~4000/4000,4000,0
2018-09-27 14:00:42.233 7f85ce1cf700 20 bluestore(/var/lib/ceph/osd/ceph-0) _do_alloc_write prealloc [0x2a244000~4000]
2018-09-27 14:00:42.233 7f85ce1cf700 20 bluestore(/var/lib/ceph/osd/ceph-0) _do_alloc_write forcing csum_order to block_size_order 12
2018-09-27 14:00:42.233 7f85ce1cf700 20 bluestore(/var/lib/ceph/osd/ceph-0) _do_alloc_write initialize csum setting for new blob Blob(0x5645a8ecd490 blob([]) use_tracker(0x0 0x0) SharedBlob(0x5645a8ecd500 sbid 0x0)) csum_type crc32c csum_order 12 csum_length 0x4000
2018-09-27 14:00:42.233 7f85ce1cf700 20 bluestore(/var/lib/ceph/osd/ceph-0) _do_alloc_write blob Blob(0x5645a8ecd490 blob([0x2a244000~4000] csum crc32c/0x1000) use_tracker(0x0 0x0) SharedBlob(0x5645a8ecd500 sbid 0x0))
2018-09-27 14:00:42.233 7f85ce1cf700 20 bluestore.blob(0x5645a8ecd490) get_ref 0x0~1 Blob(0x5645a8ecd490 blob([0x2a244000~4000] csum+has_unused crc32c/0x1000 unused=0xfff0) use_tracker(0x0 0x0) SharedBlob(0x5645a8ecd500 sbid 0x0))
2018-09-27 14:00:42.233 7f85ce1cf700 20 bluestore.blob(0x5645a8ecd490) get_ref init 0x4000, 4000
2018-09-27 14:00:42.233 7f85ce1cf700 20 bluestore(/var/lib/ceph/osd/ceph-0) _do_alloc_write  lex 0x0~1: 0x0~1 Blob(0x5645a8ecd490 blob([0x2a244000~4000] csum+has_unused crc32c/0x1000 unused=0xfff0) use_tracker(0x4000 0x1) SharedBlob(0x5645a8ecd500 sbid 0x0))
2018-09-27 14:00:42.233 7f85ce1cf700 20 bluestore.BufferSpace(0x5645a8ecd518 in 0x56459822cc40) _discard 0x0~1000
2018-09-27 14:00:42.233 7f85ce1cf700 20 bluestore.extentmap(0x5645a8f76f10) dirty_range mark inline shard dirty
2018-09-27 14:00:42.233 7f85ce1cf700 10 bluestore(/var/lib/ceph/osd/ceph-0) _write 1.7_head #1:fbb557fb:::benchmark_data_smithi135_14561_object35464:head# 0x0~1 = 0
2018-09-27 14:00:42.233 7f85ce1cf700 15 bluestore(/var/lib/ceph/osd/ceph-0) _setattrs 1.7_head #1:fbb557fb:::benchmark_data_smithi135_14561_object35464:head# 2 keys
2018-09-27 14:00:42.233 7f85ce1cf700 10 bluestore(/var/lib/ceph/osd/ceph-0) _setattrs 1.7_head #1:fbb557fb:::benchmark_data_smithi135_14561_object35464:head# 2 keys = 0
2018-09-27 14:00:42.233 7f85ce1cf700 15 bluestore(/var/lib/ceph/osd/ceph-0) _omap_setkeys 1.7_head #1:e0000000::::head#
2018-09-27 14:00:42.233 7f85ce1cf700 20 bluestore(/var/lib/ceph/osd/ceph-0) _omap_setkeys  0x0000000000000434'._info' <- _info
2018-09-27 14:00:42.233 7f85ce1cf700 10 bluestore(/var/lib/ceph/osd/ceph-0) _omap_setkeys 1.7_head #1:e0000000::::head# = 0
2018-09-27 14:00:42.233 7f85ce1cf700 10 bluestore(/var/lib/ceph/osd/ceph-0) _txc_calc_cost 0x56459d044100 cost 10049 (2 ios * 4000 + 2049 bytes)
2018-09-27 14:00:42.233 7f85ce1cf700 20 bluestore(/var/lib/ceph/osd/ceph-0) _txc_write_nodes txc 0x56459d044100 onodes 0x5645a8f76dc0 shared_blobs
2018-09-27 14:00:42.233 7f85ce1cf700 20 bluestore.extentmap(0x5645a8f76f10) update #1:fbb557fb:::benchmark_data_smithi135_14561_object35464:head#
2018-09-27 14:00:42.233 7f85ce1cf700 20 bluestore.extentmap(0x5645a8f76f10) update  inline shard 32 bytes from 1 extents
2018-09-27 14:00:42.233 7f85ce1cf700 20 bluestore(/var/lib/ceph/osd/ceph-0) _record_onode onode #1:fbb557fb:::benchmark_data_smithi135_14561_object35464:head# is 416 (378 bytes onode + 2 bytes spanning blobs + 36 bytes inline extents)

Other than heartbeat handling most other messages are the following repeating pattern.

2018-09-27 14:02:08.353 7f85e39fa700 20 bluestore.MempoolThread(0x564599058b48) _trim_shards cache_size: 2845415832 kv_alloc: 1191853424 kv_used: 897 meta_alloc: 1124744560 meta_used: 71284700 data_alloc: 528817847 data_used: 0
2018-09-27 14:02:08.353 7f85e39fa700 20 bluestore.2QCache(0x56459822c460) _trim onodes 6409 / 85125 buffers 0 / 66102230
2018-09-27 14:02:08.353 7f85e39fa700 20 bluestore.2QCache(0x56459822c380) _trim onodes 715 / 85125 buffers 0 / 66102230
2018-09-27 14:02:08.353 7f85e39fa700 20 bluestore.2QCache(0x56459822c540) _trim onodes 6208 / 85125 buffers 0 / 66102230
2018-09-27 14:02:08.353 7f85e39fa700 20 bluestore.2QCache(0x56459822d500) _trim onodes 6201 / 85125 buffers 0 / 66102230
2018-09-27 14:02:08.353 7f85e39fa700 20 bluestore.2QCache(0x56459822d420) _trim onodes 6100 / 85125 buffers 0 / 66102230
2018-09-27 14:02:08.353 7f85e39fa700 20 bluestore.2QCache(0x56459822d340) _trim onodes 6124 / 85125 buffers 0 / 66102230
2018-09-27 14:02:08.353 7f85e39fa700 20 bluestore.2QCache(0x56459822d260) _trim onodes 6066 / 85125 buffers 0 / 66102230
2018-09-27 14:02:08.353 7f85e39fa700 20 bluestore.2QCache(0x56459822cc40) _trim onodes 5338 / 85125 buffers 0 / 66102230
2018-09-27 14:02:08.405 7f85e39fa700 20 bluestore.MempoolThread(0x564599058b48) _trim_shards cache_size: 2845415832 kv_alloc: 1191853424 kv_used: 897 meta_alloc: 1124744560 meta_used: 71284700 data_alloc: 528817847 data_used: 0
2018-09-27 14:02:08.405 7f85e39fa700 20 bluestore.2QCache(0x56459822c460) _trim onodes 6409 / 85125 buffers 0 / 66102230
2018-09-27 14:02:08.405 7f85e39fa700 20 bluestore.2QCache(0x56459822c380) _trim onodes 715 / 85125 buffers 0 / 66102230
2018-09-27 14:02:08.405 7f85e39fa700 20 bluestore.2QCache(0x56459822c540) _trim onodes 6208 / 85125 buffers 0 / 66102230
2018-09-27 14:02:08.405 7f85e39fa700 20 bluestore.2QCache(0x56459822d500) _trim onodes 6201 / 85125 buffers 0 / 66102230
2018-09-27 14:02:08.405 7f85e39fa700 20 bluestore.2QCache(0x56459822d420) _trim onodes 6100 / 85125 buffers 0 / 66102230
2018-09-27 14:02:08.405 7f85e39fa700 20 bluestore.2QCache(0x56459822d340) _trim onodes 6124 / 85125 buffers 0 / 66102230
2018-09-27 14:02:08.405 7f85e39fa700 20 bluestore.2QCache(0x56459822d260) _trim onodes 6066 / 85125 buffers 0 / 66102230
2018-09-27 14:02:08.405 7f85e39fa700 20 bluestore.2QCache(0x56459822cc40) _trim onodes 5338 / 85125 buffers 0 / 66102230

Related issues 1 (0 open1 closed)

Is duplicate of RADOS - Bug #36473: hung osd_repop, bluestore committed but failed to trigger repop_commitResolved

Actions
Actions #1

Updated by Adam Kupczyk over 5 years ago

I have been working on a problem that seems to be related.
Using FIO with rados ioengine stops.

This seems to be related to cases when txc->osr does not contain txc as an element.
Normal:

2018-10-25 13:47:07.913 7f4c07965700 20 bluestore(/home/adam/ceph-2/build/dev/osd0) _txc_finish 0x5569a2112f00 onodes 0x5569a4377080
2018-10-25 13:47:07.913 7f4c07965700 20 bluestore(/home/adam/ceph-2/build/dev/osd0) _txc_finish  txc 0x5569a2112f00 done
2018-10-25 13:47:07.913 7f4c07965700 20 bluestore(/home/adam/ceph-2/build/dev/osd0) _txc_finish osr 0x5569a13a57a0 q now empty

Faulty:
2018-10-25 13:47:07.925 7f4c07965700 20 bluestore(/home/adam/ceph-2/build/dev/osd0) _txc_finish 0x5569a2113800 onodes 0x5569a4377b80
2018-10-25 13:47:07.925 7f4c07965700 20 bluestore(/home/adam/ceph-2/build/dev/osd0) _txc_finish osr 0x5569a13a61c0 q now empty

Actions #2

Updated by Adam Kupczyk over 5 years ago

Observation: when deferred_aggressive==false, kv_sync_thread goes to sleep with deferred_done_queue nonempty.
Sometimes after that it wakes up, but sometimes it does not.

if (kv_queue.empty() &&
    ((deferred_done_queue.empty() && deferred_stable_queue.empty()) ||
     !deferred_aggressive)) {
      if (kv_stop)
    break;
      dout(20) << __func__ << " sleep kv_queue=" << kv_queue.size() <<
              " deferred_done_queue=" << deferred_done_queue.size() <<
              " deferred_stable_queue=" << deferred_stable_queue.size() <<
              " deferred_aggressive=" << deferred_aggressive << dendl;

Gives:

2018-10-25 21:46:14.277 7f1035cb1700 20 bluestore(/home/adam/ceph-2/build/dev/osd0) _kv_sync_thread sleep kv_queue=0 deferred_done_queue=12 deferred_stable_queue=0 deferred_aggressive=0

Actions #3

Updated by Adam Kupczyk over 5 years ago

My problem was fixed by:
https://github.com/ceph/ceph/commit/f755bed3e438d2e7d5ed0df30b8d5bebf2d0cf5a

I expect this problem will be gone too.

Actions #4

Updated by Sage Weil over 5 years ago

  • Status changed from New to Duplicate
Actions #5

Updated by Sage Weil over 5 years ago

  • Is duplicate of Bug #36473: hung osd_repop, bluestore committed but failed to trigger repop_commit added
Actions

Also available in: Atom PDF