Bug #41213 (closed)

BlueStore OSD taking more than 60 minutes to boot

Added by Vikhyat Umrao over 4 years ago. Updated over 4 years ago.

Status:
Duplicate
Priority:
Normal
Assignee:
-
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

2019-08-09 20:46:45.770661 7f9c2c29bd80  0 ceph version 12.2.8-128.el7cp (030358773c5213a14c1444a5147258672b2dc15f) luminous (stable), process ceph-osd, pid 6938
2019-08-09 20:46:46.589792 7f9c2c29bd80  0 pidfile_write: ignore empty --pid-file

2019-08-09 20:46:48.286858 7f9c2c29bd80  1 bluestore(/var/lib/ceph/osd/ceph-0) _open_alloc opening allocation metadata
2019-08-09 20:46:48.341199 7f9c2c29bd80  1 bluestore(/var/lib/ceph/osd/ceph-0) _open_alloc loaded 112GiB in 27 extents
2019-08-09 21:59:27.176258 7f9c2c29bd80  0 <cls> /builddir/build/BUILD/ceph-12.2.8/src/cls/cephfs/cls_cephfs.cc:197: loading cephfs
2019-08-09 21:59:27.176438 7f9c2c29bd80  0 <cls> /builddir/build/BUILD/ceph-12.2.8/src/cls/hello/cls_hello.cc:296: loading cls_hello
2019-08-09 21:59:27.176862 7f9c2c29bd80  0 _get_class not permitted to load kvs
2019-08-09 21:59:27.177169 7f9c2c29bd80  0 _get_class not permitted to load lua
2019-08-09 21:59:27.183178 7f9c2c29bd80  0 _get_class not permitted to load sdk
2019-08-09 21:59:27.184062 7f9c2c29bd80  0 osd.0 1206 crush map has features 288514051259236352, adjusting msgr requires for clients
2019-08-09 21:59:27.184073 7f9c2c29bd80  0 osd.0 1206 crush map has features 288514051259236352 was 8705, adjusting msgr requires for mons
2019-08-09 21:59:27.184080 7f9c2c29bd80  0 osd.0 1206 crush map has features 1009089991638532096, adjusting msgr requires for osds
2019-08-09 21:59:27.184133 7f9c2c29bd80  0 osd.0 1206 load_pgs
2019-08-09 21:59:27.184138 7f9c2c29bd80  0 osd.0 1206 load_pgs opened 0 pgs
2019-08-09 21:59:27.184148 7f9c2c29bd80  0 osd.0 1206 using weightedpriority op queue with priority op cut off at 64.
2019-08-09 21:59:27.185149 7f9c2c29bd80 -1 osd.0 1206 log_to_monitors {default=true}
2019-08-09 21:59:27.190570 7f9c2c29bd80  0 osd.0 1206 done with init, starting boot process
2019-08-09 21:59:27.190606 7f9c2c29bd80  1 osd.0 1206 start_boot
2019-08-09 21:59:28.427835 7f9c19092700  1 osd.0 1218 state: booting -> active


Files

ceph-osd.0.log-20190810.gz (395 KB) - Logs with debug_bluestore = 20 and debug_rocksdb = 20. Vikhyat Umrao, 08/12/2019 08:41 PM
osd.0.perf.dump.txt (26.6 KB) - osd.0 perf dump. Vikhyat Umrao, 08/12/2019 08:55 PM

Related issues (1 total: 0 open, 1 closed)

Related to bluestore - Bug #36482: High amount of Read I/O on BlueFS/DB when listing omap keys (Resolved, Igor Fedotov, 10/17/2018)

Actions #1

Updated by Vikhyat Umrao over 4 years ago

2019-08-09 03:46:17.568940 7fd72fb05d80  1 bluestore(/var/lib/ceph/osd/ceph-0) _open_alloc opening allocation metadata
2019-08-09 03:46:17.628419 7fd72fb05d80  1 bluestore(/var/lib/ceph/osd/ceph-0) _open_alloc loaded 112GiB in 29 extents
2019-08-09 04:19:00.254361 7fd72fb05d80 20 bluestore(/var/lib/ceph/osd/ceph-0) statfs store_statfs(0x1113c00000/0x1bf2800000, stored 0x47abd0/0x14bc000, compress 0x0/0x0/0x0)
2019-08-09 04:19:00.254384 7fd72fb05d80 10 bluestore(/var/lib/ceph/osd/ceph-0) _reconcile_bluefs_freespace
2019-08-09 04:19:00.254392 7fd72fb05d80 10 bluestore(/var/lib/ceph/osd/ceph-0) _reconcile_bluefs_freespace we agree bluefs has 0x[1500000~100000000,10f900000~4c3200000,5d3a40000~28980000,5ff140000~9040000,608240000~1db00000,629400000~e0a00000,7b6900000~125900000,d6a100000~11e400000,12c4700000~511f00000,17d6700000~15800000,17ec000000~15100000,1801200000~14900000,1815c00000~124600000,193a300000~e300000,1948700000~df00000,1956700000~db00000,1964300000~12b200000,1a8f600000~4800000,1a93f00000~12400000,1aa6400000~bec00000,1b65100000~29300000,1b8e500000~612c0000]
2019-08-09 04:19:00.254409 7fd72fb05d80 10 bluestore(/var/lib/ceph/osd/ceph-0) _kv_start

2019-08-09 04:59:39.803981 7fd72fb05d80 10 bluestore(/var/lib/ceph/osd/ceph-0) collect_metadata
2019-08-09 04:59:39.804901 7fd72fb05d80 20 bluestore(/var/lib/ceph/osd/ceph-0) statfs store_statfs(0x1113400000/0x1bf2800000, stored 0x47abd0/0x14bc000, compress 0x0/0x0/0x0)
2019-08-09 04:59:39.805536 7fd72fb05d80  0 osd.0 1165 done with init, starting boot process
2019-08-09 04:59:39.805571 7fd72fb05d80  1 osd.0 1165 start_boot

2019-08-09 04:59:41.920252 7fd71d0fb700 20 bluestore(/var/lib/ceph/osd/ceph-0) _kv_sync_thread sleep
2019-08-09 04:59:41.920255 7fd71d8fc700 20 bluestore(/var/lib/ceph/osd/ceph-0) _kv_finalize_thread wake
2019-08-09 04:59:41.920261 7fd71d8fc700 20 bluestore(/var/lib/ceph/osd/ceph-0) _kv_finalize_thread kv_committed <0x55fef2986840>
2019-08-09 04:59:41.920266 7fd71d8fc700 20 bluestore(/var/lib/ceph/osd/ceph-0) _kv_finalize_thread deferred_stable <>
2019-08-09 04:59:41.920267 7fd71d8fc700 10 bluestore(/var/lib/ceph/osd/ceph-0) _txc_state_proc txc 0x55fef2986840 kv_submitted
2019-08-09 04:59:41.920269 7fd71d8fc700 20 bluestore(/var/lib/ceph/osd/ceph-0) _txc_committed_kv txc 0x55fef2986840
2019-08-09 04:59:41.920282 7fd71d8fc700 20 bluestore(/var/lib/ceph/osd/ceph-0) _deferred_queue txc 0x55fef2986840 osr 0x55fee935dc00
2019-08-09 04:59:41.920291 7fd71d8fc700 20 bluestore.DeferredBatch(0x55ff0926b680) prepare_write seq 1 0x4000~1000 crc 75adefd8
2019-08-09 04:59:41.920298 7fd71d8fc700 20 bluestore(/var/lib/ceph/osd/ceph-0) _kv_finalize_thread sleep
2019-08-09 04:59:41.920377 7fd71c8fa700  1 osd.0 1186 state: booting -> active
2019-08-09 04:59:41.921702 7fd723908700 20 bluestore.MempoolThread(0x55fec7df71e0) _trim_shards cache_size: 8310739918 kv_alloc: 114085068 kv_used: 106910773 meta_alloc: 13421772 meta_used: 60766 data_alloc: 6710886 data_used: 8192

Actions #2

Updated by Vikhyat Umrao over 4 years ago

This is an SSD OSD that was used for a heavy omap workload.

# ceph osd metadata 0
{
    "id": 0,
    "arch": "x86_64",
    "back_addr": "10.74.128.24:6805/6938",
    "back_iface": "em1",
    "bluefs": "1",
    "bluefs_db_access_mode": "blk",
    "bluefs_db_block_size": "4096",
    "bluefs_db_dev": "253:3",
    "bluefs_db_dev_node": "dm-3",
    "bluefs_db_driver": "KernelDevice",
    "bluefs_db_model": "",
    "bluefs_db_partition_path": "/dev/dm-3",
    "bluefs_db_rotational": "0",
    "bluefs_db_size": "120032591872",
    "bluefs_db_type": "ssd",
    "bluefs_single_shared_device": "1",
    "bluestore_bdev_access_mode": "blk",
    "bluestore_bdev_block_size": "4096",
    "bluestore_bdev_dev": "253:3",
    "bluestore_bdev_dev_node": "dm-3",
    "bluestore_bdev_driver": "KernelDevice",
    "bluestore_bdev_model": "",
    "bluestore_bdev_partition_path": "/dev/dm-3",
    "bluestore_bdev_rotational": "0",
    "bluestore_bdev_size": "120032591872",
    "bluestore_bdev_type": "ssd",
    "ceph_version": "ceph version 12.2.8-128.el7cp (030358773c5213a14c1444a5147258672b2dc15f) luminous (stable)",
    "cpu": "Intel(R) Xeon(R) CPU E5-2609 v3 @ 1.90GHz",
    "default_device_class": "ssd",
    "distro": "rhel",
    "distro_description": "Employee SKU",
    "distro_version": "7.6",
    "front_addr": "10.74.128.24:6800/6938",
    "front_iface": "em1",
    "hb_back_addr": "10.74.128.24:6806/6938",
    "hb_front_addr": "10.74.128.24:6807/6938",
    "hostname": "dell-per630-2.gsslab.pnq2.redhat.com",
    "journal_rotational": "0",
    "kernel_description": "#1 SMP Wed Dec 19 10:46:58 EST 2018",
    "kernel_version": "3.10.0-957.5.1.el7.x86_64",
    "mem_swap_kb": "13631484",
    "mem_total_kb": "16015856",
    "os": "Linux",
    "osd_data": "/var/lib/ceph/osd/ceph-0",
    "osd_objectstore": "bluestore",
    "rotational": "0" 
}
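
For anyone comparing affected and unaffected OSDs, the storage-layout fields relevant here can be pulled out of that JSON with a few lines of Python. This is a hypothetical helper (the function name and field selection are mine, taken from the output above):

```python
import json

# Fields from `ceph osd metadata <id>` that describe the storage layout
# relevant to this report: shared DB/block device, device class, size.
FIELDS = ("bluefs_single_shared_device", "bluestore_bdev_type",
          "bluestore_bdev_size", "osd_objectstore")

def storage_summary(metadata_json: str) -> dict:
    """Extract the storage-layout fields from the JSON printed by
    `ceph osd metadata <id>`."""
    meta = json.loads(metadata_json)
    return {k: meta.get(k) for k in FIELDS}

# Trimmed-down sample of the output shown above.
sample = ('{"bluefs_single_shared_device": "1", '
          '"bluestore_bdev_type": "ssd", '
          '"bluestore_bdev_size": "120032591872", '
          '"osd_objectstore": "bluestore"}')
print(storage_summary(sample)["bluestore_bdev_type"])  # ssd
```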

Actions #4

Updated by Vikhyat Umrao over 4 years ago

- This happens on every OSD restart. Today I restarted again and hit the same delay.

2019-08-13 03:58:15.889151 7f03338d1d80  1 bluestore(/var/lib/ceph/osd/ceph-0) _open_alloc loaded 112GiB in 28 extents
2019-08-13 05:10:45.611824 7f03206c8700  1 osd.0 1220 state: booting -> active
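
The gap between those two lines can be computed directly from the log timestamps. A minimal Python sketch (the two log lines above pasted in as strings):

```python
from datetime import datetime

def boot_delay(start_line: str, end_line: str) -> float:
    """Return the seconds elapsed between two ceph-osd log lines.

    Assumes the standard ceph log timestamp format
    'YYYY-MM-DD HH:MM:SS.ffffff' at the start of each line.
    """
    fmt = "%Y-%m-%d %H:%M:%S.%f"
    ts = lambda line: datetime.strptime(" ".join(line.split()[:2]), fmt)
    return (ts(end_line) - ts(start_line)).total_seconds()

start = "2019-08-13 03:58:15.889151 7f03338d1d80  1 bluestore(/var/lib/ceph/osd/ceph-0) _open_alloc loaded 112GiB in 28 extents"
end = "2019-08-13 05:10:45.611824 7f03206c8700  1 osd.0 1220 state: booting -> active"
print(boot_delay(start, end) / 60)  # roughly 72 minutes
```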
Actions #5

Updated by Vikhyat Umrao over 4 years ago

- One more finding: as I mentioned in comment #2, this only happens on metadata OSDs (RGW index and multi-site sync logs, i.e. a heavy omap workload).

- On the data OSDs, we do not see this log message in the log file:

1 bluestore(/var/lib/ceph/osd/ceph-x) _open_alloc loaded y GiB in z extents
Actions #6

Updated by Vikhyat Umrao over 4 years ago

- Today I restarted the OSD with `debug_bluestore = 20` and `debug_bluefs = 20` and I see the following logs.

2019-08-13 21:30:44.270714 7f3885faad80  1 bluestore(/var/lib/ceph/osd/ceph-0) _open_alloc opening allocation metadata

2019-08-13 21:30:44.270721 7f3885faad80 10 freelist enumerate_next start
2019-08-13 21:30:44.270798 7f3885faad80 10 freelist enumerate_next 0xb0000~f08000
2019-08-13 21:30:44.270804 7f3885faad80 10 stupidalloc 0x0x55da6aa99110 init_add_free 0xb0000~f08000
2019-08-13 21:30:44.270817 7f3885faad80 10 freelist enumerate_next 0x1500000~100000000
2019-08-13 21:30:44.270818 7f3885faad80 10 stupidalloc 0x0x55da6aa99110 init_add_free 0x1500000~100000000
2019-08-13 21:30:44.270823 7f3885faad80 10 freelist enumerate_next 0x101510000~17c000
2019-08-13 21:30:44.270824 7f3885faad80 10 stupidalloc 0x0x55da6aa99110 init_add_free 0x101510000~17c000
2019-08-13 21:30:44.270899 7f3885faad80 10 freelist enumerate_next 0x1016c0000~4b10000
2019-08-13 21:30:44.270903 7f3885faad80 10 stupidalloc 0x0x55da6aa99110 init_add_free 0x1016c0000~4b10000
2019-08-13 21:30:44.270907 7f3885faad80 10 freelist enumerate_next 0x1061e8000~20000
2019-08-13 21:30:44.270908 7f3885faad80 10 stupidalloc 0x0x55da6aa99110 init_add_free 0x1061e8000~20000
2019-08-13 21:30:44.270931 7f3885faad80 10 bluefs _read_random h 0x55da6c2db100 0x52d52e~99d from file(ino 43889 size 0x577043 mtime 2019-08-09 03:36:30.984811 bdev 1 allocated 580000 extents [1:0xac40000+580000])
2019-08-13 21:30:44.270942 7f3885faad80 20 bluefs _read_random read buffered 0x52d52e~99d of 1:0xac40000+580000
2019-08-13 21:30:44.271069 7f3885faad80 20 bluefs _read_random got 2461
2019-08-13 21:30:44.271101 7f3885faad80 10 bluefs _read_random h 0x55da6c2db300 0xd0be6c~ff7 from file(ino 33103 size 0xdf3756 mtime 2019-07-05 05:06:28.861433 bdev 1 allocated e00000 extents [1:0x1add780000+e00000])
2019-08-13 21:30:44.271119 7f3885faad80 20 bluefs _read_random read buffered 0xd0be6c~ff7 of 1:0x1add780000+e00000
2019-08-13 21:30:44.271269 7f3885faad80 20 bluefs _read_random got 4087
2019-08-13 21:30:44.271304 7f3885faad80 10 freelist enumerate_next 0x106210000~3b64000
2019-08-13 21:30:44.271305 7f3885faad80 10 stupidalloc 0x0x55da6aa99110 init_add_free 0x106210000~3b64000
2019-08-13 21:30:44.271352 7f3885faad80 10 freelist enumerate_next 0x109d9c000~2d58000
2019-08-13 21:30:44.271355 7f3885faad80 10 stupidalloc 0x0x55da6aa99110 init_add_free 0x109d9c000~2d58000
2019-08-13 21:30:44.271398 7f3885faad80 10 freelist enumerate_next 0x10cb34000~2c94000
2019-08-13 21:30:44.271401 7f3885faad80 10 stupidalloc 0x0x55da6aa99110 init_add_free 0x10cb34000~2c94000
2019-08-13 21:30:44.271407 7f3885faad80 10 freelist enumerate_next 0x10f900000~4c3200000
2019-08-13 21:30:44.271408 7f3885faad80 10 stupidalloc 0x0x55da6aa99110 init_add_free 0x10f900000~4c3200000
2019-08-13 21:30:44.271423 7f3885faad80 10 freelist enumerate_next 0x5d2b10000~f00000
2019-08-13 21:30:44.271424 7f3885faad80 10 stupidalloc 0x0x55da6aa99110 init_add_free 0x5d2b10000~f00000
2019-08-13 21:30:44.271525 7f3885faad80 10 bluefs _read_random h 0x55da6c2db300 0xd0ce63~ff4 from file(ino 33103 size 0xdf3756 mtime 2019-07-05 05:06:28.861433 bdev 1 allocated e00000 extents [1:0x1add780000+e00000])
2019-08-13 21:30:44.271533 7f3885faad80 20 bluefs _read_random read buffered 0xd0ce63~ff4 of 1:0x1add780000+e00000
2019-08-13 21:30:44.271680 7f3885faad80 20 bluefs _read_random got 4084
2019-08-13 21:30:44.271903 7f3885faad80 10 bluefs _read_random h 0x55da6c2db300 0xd0de57~ff5 from file(ino 33103 size 0xdf3756 mtime 2019-07-05 05:06:28.861433 bdev 1 allocated e00000 extents [1:0x1add780000+e00000])
2019-08-13 21:30:44.271912 7f3885faad80 20 bluefs _read_random read buffered 0xd0de57~ff5 of 1:0x1add780000+e00000
2019-08-13 21:30:44.272123 7f3885faad80 20 bluefs _read_random got 4085
2019-08-13 21:30:44.272343 7f3885faad80 10 bluefs _read_random h 0x55da6c2db300 0xd0ee4c~ff5 from file(ino 33103 size 0xdf3756 mtime 2019-07-05 05:06:28.861433 bdev 1 allocated e00000 extents [1:0x1add780000+e00000])

<...> and so on ...

2019-08-13 21:30:44.330209 7f3885faad80 10 stupidalloc 0x0x55da6aa99110 init_add_free 0x1b65100000~29300000
2019-08-13 21:30:44.330215 7f3885faad80 10 freelist enumerate_next 0x1b8e500000~612c0000
2019-08-13 21:30:44.330216 7f3885faad80 10 stupidalloc 0x0x55da6aa99110 init_add_free 0x1b8e500000~612c0000
2019-08-13 21:30:44.330221 7f3885faad80 10 freelist enumerate_next 0x1bef814000~2fec000
2019-08-13 21:30:44.330222 7f3885faad80 10 stupidalloc 0x0x55da6aa99110 init_add_free 0x1bef814000~2fec000
2019-08-13 21:30:44.330224 7f3885faad80 10 freelist enumerate_next end

2019-08-13 21:30:44.330228 7f3885faad80  1 bluestore(/var/lib/ceph/osd/ceph-0) _open_alloc loaded 112GiB in 28 extents

2019-08-13 21:30:44.330240 7f3885faad80 10 stupidalloc 0x0x55da6aa99110 init_rm_free 0x1500000~100000000
2019-08-13 21:30:44.330245 7f3885faad80 20 stupidalloc 0x0x55da6aa99110 init_rm_free bin 9 rm 0x[1500000~100000000]
2019-08-13 21:30:44.330249 7f3885faad80 10 stupidalloc 0x0x55da6aa99110 init_rm_free 0x10f900000~4c3200000
2019-08-13 21:30:44.330251 7f3885faad80 20 stupidalloc 0x0x55da6aa99110 init_rm_free bin 9 rm 0x[10f900000~4c3200000]
2019-08-13 21:30:44.330253 7f3885faad80 10 stupidalloc 0x0x55da6aa99110 init_rm_free 0x5d3a40000~28980000
2019-08-13 21:30:44.330255 7f3885faad80 20 stupidalloc 0x0x55da6aa99110 init_rm_free bin 9 rm 0x[5d3a40000~28980000]
2019-08-13 21:30:44.330257 7f3885faad80 10 stupidalloc 0x0x55da6aa99110 init_rm_free 0x5ff140000~9040000
2019-08-13 21:30:44.330259 7f3885faad80 20 stupidalloc 0x0x55da6aa99110 init_rm_free bin 9 rm 0x[5ff140000~9040000]
2019-08-13 21:30:44.330261 7f3885faad80 10 stupidalloc 0x0x55da6aa99110 init_rm_free 0x608240000~1db00000
2019-08-13 21:30:44.330263 7f3885faad80 20 stupidalloc 0x0x55da6aa99110 init_rm_free bin 9 rm 0x[608240000~1db00000]
2019-08-13 21:30:44.330264 7f3885faad80 10 stupidalloc 0x0x55da6aa99110 init_rm_free 0x629400000~e0a00000
2019-08-13 21:30:44.330266 7f3885faad80 20 stupidalloc 0x0x55da6aa99110 init_rm_free bin 9 rm 0x[629400000~e0a00000]
2019-08-13 21:30:44.330268 7f3885faad80 10 stupidalloc 0x0x55da6aa99110 init_rm_free 0x7b6900000~125900000
2019-08-13 21:30:44.330269 7f3885faad80 20 stupidalloc 0x0x55da6aa99110 init_rm_free bin 9 rm 0x[7b6900000~125900000]
2019-08-13 21:30:44.330271 7f3885faad80 10 stupidalloc 0x0x55da6aa99110 init_rm_free 0xd6a100000~11e400000

<...>

,608240000~1db00000,629400000~e0a00000,7b6900000~125900000,d6a100000~11e400000,12c4700000~511f00000,17d6700000~15800000,17ec000000~15100000,1801200000~14900000,1815c00000~124600000,193a300000~e300000,1948700000~df00000,1956700000~db00000,1964300000~12b200000,1a8f600000~4800000,1a93f00000~12400000,1aa6400000~bec00000,1b65100000~29300000,1b8e500000~612c0000] as allocated
2019-08-13 21:30:44.330331 7f3885faad80 10 bluestore(/var/lib/ceph/osd/ceph-0) _open_collections
2019-08-13 21:30:44.330411 7f3885faad80 20 bluestore(/var/lib/ceph/osd/ceph-0) _open_collections opened meta 0x55da6b238700 cnode(bits 0)
2019-08-13 21:30:44.330421 7f3885faad80 10 bluefs _read_random h 0x55da6c2db780 0x104a~1056 from file(ino 41330 size 0x49f1cfe mtime 2019-07-05 14:05:26.908612 bdev 1 allocated 4a00000 extents [1:0xdfed40000+4a00000])
2019-08-13 21:30:44.330433 7f3885faad80 20 bluefs _read_random read buffered 0x104a~1056 of 1:0xdfed40000+4a00000
2019-08-13 21:30:44.330588 7f3885faad80 20 bluefs _read_random got 4182
2019-08-13 21:30:44.330603 7f3885faad80 10 bluefs _read_random h 0x55da6af5db00 0x197a~104a from file(ino 33104 size 0x45a803b mtime 2019-07-05 05:06:31.164583 bdev 1 allocated 45c0000 extents [1:0x1ade580000+45c0000])
2019-08-13 21:30:44.330611 7f3885faad80 20 bluefs _read_random read buffered 0x197a~104a of 1:0x1ade580000+45c0000
2019-08-13 21:30:44.330754 7f3885faad80 20 bluefs _read_random got 4170
2019-08-13 21:30:44.330771 7f3885faad80 10 bluefs _read_random h 0x55da6c2db780 0x20a0~1056 from file(ino 41330 size 0x49f1cfe mtime 2019-07-05 14:05:26.908612 bdev 1 allocated 4a00000 extents [1:0xdfed40000+4a00000])
2019-08-13 21:30:44.330780 7f3885faad80 20 bluefs _read_random read buffered 0x20a0~1056 of 1:0xdfed40000+4a00000
2019-08-13 21:30:44.330928 7f3885faad80 20 bluefs _read_random got 4182
2019-08-13 21:30:44.330940 7f3885faad80 10 bluefs _read_random h 0x55da6af5db00 0x29c4~f29 from file(ino 33104 size 0x45a803b mtime 2019-07-05 05:06:31.164583 bdev 1 allocated 45c0000 extents [1:0x1ade580000+45c0000])
2019-08-13 21:30:44.330948 7f3885faad80 20 bluefs _read_random read buffered 0x29c4~f29 of 1:0x1ade580000+45c0000
2019-08-13 21:30:44.331075 7f3885faad80 20 bluefs _read_random got 3881
2019-08-13 21:30:44.331086 7f3885faad80 10 bluefs _read_random h 0x55da6c2db780 0x30f6~104a from file(ino 41330 size 0x49f1cfe mtime 2019-07-05 14:05:26.908612 bdev 1 allocated 4a00000 extents [1:0xdfed40000+4a00000])

... and the same messages repeat.
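
The pattern in the logs above is a freelist `enumerate_next` loop feeding extents into the allocator via `init_add_free`, interleaved with thousands of small buffered `bluefs _read_random` calls; it is the interleaved reads, not the extent replay itself, that stretch `_open_alloc` out. As a rough illustration, a heavily simplified sketch of the extent-replay part (hypothetical names; not Ceph's actual StupidAllocator code):

```python
def load_free_extents(extents):
    """Replay (offset, length) free extents into a merged free list,
    roughly how the enumerate_next/init_add_free loop builds the
    in-memory allocator state at startup."""
    merged = []
    for offset, length in sorted(extents):
        # coalesce with the previous extent when they are contiguous
        if merged and merged[-1][0] + merged[-1][1] == offset:
            merged[-1] = (merged[-1][0], merged[-1][1] + length)
        else:
            merged.append((offset, length))
    return merged

# First three extents from the log above; none are contiguous,
# so all three survive as separate free extents.
extents = [(0xb0000, 0xf08000), (0x1500000, 0x100000000),
           (0x101510000, 0x17c000)]
print(len(load_free_extents(extents)))  # 3
```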

Actions #7

Updated by Vikhyat Umrao over 4 years ago

Looks like a duplicate of this one - https://tracker.ceph.com/issues/36482.

Actions #8

Updated by Vikhyat Umrao over 4 years ago

  • Related to Bug #36482: High amount of Read I/O on BlueFS/DB when listing omap keys added
Actions #9

Updated by Vikhyat Umrao over 4 years ago

  • Affected Versions v12.2.8 added
Actions #10

Updated by Sage Weil over 4 years ago

  • Status changed from New to Duplicate

basically a dup of #36482
