Bug #40684

closed

bluestore objectstore_blackhole=true violates read-after-write

Added by Sage Weil almost 5 years ago. Updated over 4 years ago.

Status: Resolved
Priority: Normal
Assignee: -
Target version: -
% Done: 0%
Source:
Tags:
Backport: nautilus
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

- osd has objectstore_blackhole set
- osd receives osdmap 18, queues it for write
- bluestore drops the transaction (blackhole=true)
- osd receives osdmap 19, tries to read 18, gets ENOENT, asserts
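
The sequence above can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyStore`, `queue_transaction`, `read`, and `write_then_read` are hypothetical names invented for illustration and are not BlueStore's or the OSD's actual interfaces. The epoch number is arbitrary. The point is that a store which silently discards writes breaks a caller that expects to read back what it just wrote.

```python
import errno


class ToyStore:
    """Hypothetical in-memory stand-in for an object store (not BlueStore's API)."""

    def __init__(self, blackhole=False):
        self.blackhole = blackhole
        self.objects = {}

    def queue_transaction(self, name, data):
        # With blackhole enabled the write is silently discarded;
        # the caller gets no indication that nothing was stored.
        if self.blackhole:
            return
        self.objects[name] = data

    def read(self, name):
        # Return (0, data) on success, or (-ENOENT, b"") if the object is absent.
        if name not in self.objects:
            return -errno.ENOENT, b""
        return 0, self.objects[name]


def write_then_read(store, epoch):
    """Write a map for `epoch`, then read it back as a later map update would."""
    name = f"osdmap.{epoch}"
    store.queue_transaction(name, b"map-bytes")
    return store.read(name)
```

With `ToyStore(blackhole=False)` the read returns the bytes that were written. With `ToyStore(blackhole=True)` it returns `-errno.ENOENT`, which corresponds to the `-2` read result in the log before the assert fires.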

2019-07-05T19:16:33.737+0000 7f7ce5189700  3 osd.0 17 handle_osd_map epochs [17,18], i have 17, src has [1,18]
2019-07-05T19:16:33.737+0000 7f7ce5189700 10 osd.0 17 handle_osd_map  got inc map for epoch 18
2019-07-05T19:16:33.737+0000 7f7ce5189700 15 bluestore(/var/lib/ceph/osd/ceph-0) read meta #-1:3340e826:::osdmap.17:0# 0x0~0
2019-07-05T19:16:33.737+0000 7f7ce5189700 20 bluestore(/var/lib/ceph/osd/ceph-0) _do_read 0x0~991 size 0x991 (2449)
2019-07-05T19:16:33.737+0000 7f7ce5189700 20 bluestore(/var/lib/ceph/osd/ceph-0) _do_read will do buffered read
2019-07-05T19:16:33.737+0000 7f7ce5189700 20 bluestore(/var/lib/ceph/osd/ceph-0) _do_read  blob Blob(0xf446930 blob([0x90000~4000] csum+has_unused crc32c/0x1000 unused=0xfff0) use_tracker(0x4000 0x991) SharedBlob(0xf4468c0 sbid 0x0)) need 0x0~991 cache has 0x[0~991]
2019-07-05T19:16:33.737+0000 7f7ce5189700 10 bluestore(/var/lib/ceph/osd/ceph-0) read meta #-1:3340e826:::osdmap.17:0# 0x0~991 = 2449
2019-07-05T19:16:33.737+0000 7f7ce5189700 10 osd.0 17 add_map_bl 17 2449 bytes
2019-07-05T19:16:33.741+0000 7f7ce5189700 20 osd.0 17 got_full_map 18, nothing requested
2019-07-05T19:16:33.741+0000 7f7ce5189700 20 osd.0 17 handle_osd_map pg_num_history pg_num_history(e18 pg_nums {1={9=8},2={15=1}} deleted_pools )
2019-07-05T19:16:33.741+0000 7f7ce5189700 10 snap_mapper.record_purged_snaps purged_snaps {18={}}
2019-07-05T19:16:33.741+0000 7f7ce5189700 10 snap_mapper.record_purged_snaps rm 0 keys, set 1 keys
2019-07-05T19:16:33.741+0000 7f7ce5189700 10 osd.0 17 write_superblock sb(9bbea360-2931-4756-b65d-016a23f23e15 osd.0 516b9b2e-6bb7-4cc9-9f20-4b67a6e2c2bf e18 [1,18] lci=[8,18])
2019-07-05T19:16:33.741+0000 7f7ce5189700  0 bluestore(/var/lib/ceph/osd/ceph-0) queue_transactions objectstore_blackhole = TRUE, dropping transaction
...
2019-07-05T19:16:35.653+0000 7f7ce05f0700  3 osd.0 17 handle_osd_map epochs [18,19], i have 18, src has [1,19]
2019-07-05T19:16:35.653+0000 7f7ce05f0700 10 osd.0 17 handle_osd_map  got inc map for epoch 19
2019-07-05T19:16:35.653+0000 7f7cf3dd6700 10 --2- v2:172.21.15.165:6806/12938 >> v2:172.21.15.165:6810/12940 conn(0xf3bcd80 0xf347180 crc :-1 s=READY pgs=4 cs=0 l=0 rx=0 tx=0).write_event try send msg ack, acked 1 messages
2019-07-05T19:16:35.653+0000 7f7ce05f0700 15 bluestore(/var/lib/ceph/osd/ceph-0) read meta #-1:39c0e826:::osdmap.18:0# 0x0~0
2019-07-05T19:16:35.653+0000 7f7ce05f0700 20 bluestore(/var/lib/ceph/osd/ceph-0).collection(meta 0xcc11000) get_onode oid #-1:39c0e826:::osdmap.18:0# key 0x7f7fffffffffffffff39c0e8'&!osdmap.18!='0x0000000000000000ffffffffffffffff'o'
2019-07-05T19:16:35.653+0000 7f7cf3dd6700 10 -- v2:172.21.15.165:6806/12938 >> v2:172.21.15.165:6810/12940 conn(0xf3bcd80 msgr2=0xf347180 crc :-1 s=STATE_CONNECTION_ESTABLISHED l=0)._try_send sent bytes 57 remaining bytes 0
2019-07-05T19:16:35.653+0000 7f7ce05f0700 20 bluestore(/var/lib/ceph/osd/ceph-0).collection(meta 0xcc11000)  r -2 v.len 0
2019-07-05T19:16:35.653+0000 7f7ce05f0700 10 bluestore(/var/lib/ceph/osd/ceph-0) read meta #-1:39c0e826:::osdmap.18:0# 0x0~0 = -2
2019-07-05T19:16:35.661+0000 7f7ce05f0700 -1 /build/ceph-15.0.0-2508-g145e26b/src/osd/OSD.cc: In function 'void OSD::handle_osd_map(MOSDMap*)' thread 7f7ce05f0700 time 2019-07-05T19:16:35.654196+0000
/build/ceph-15.0.0-2508-g145e26b/src/osd/OSD.cc: 7721: FAILED ceph_assert(p != added_maps_bl.end())

 ceph version 15.0.0-2508-g145e26b (145e26ba775f7f5b3d0aa7cf16958fbb530e3f2e) octopus (dev)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x152) [0xc7e2a6]
 2: (ceph::__ceph_assertf_fail(char const*, char const*, int, char const*, char const*, ...)+0) [0xc7e481]
 3: (OSD::handle_osd_map(MOSDMap*)+0x1dcb) [0xcfd40b]
 4: (OSD::_dispatch(Message*)+0xc3) [0xd0f0f3]
 5: (OSD::ms_dispatch(Message*)+0x68) [0xd0f4d8]
 6: (DispatchQueue::entry()+0x10f3) [0x163b4a3]
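
For triage, the pattern in the log above (a dropped transaction followed by a failed osdmap read) can be picked out mechanically. The following is a rough sketch, assuming log lines shaped like the ones quoted in this report; the regular expression and the helper name `failed_reads_after_drop` are illustrative assumptions, and message formats may differ in other Ceph versions or debug levels.

```python
import re

# Matches completed metadata reads of osdmap objects, e.g.
# "... read meta #-1:39c0e826:::osdmap.18:0# 0x0~0 = -2"
# Group 1 is the epoch, group 2 is the read result.
READ_RESULT = re.compile(r"read meta \S*osdmap\.(\d+):\S+ \S+ = (-?\d+)")

# Substring of the message logged when a transaction is dropped.
DROPPED = "objectstore_blackhole = TRUE, dropping transaction"


def failed_reads_after_drop(lines):
    """Return epochs whose osdmap read failed after a dropped transaction."""
    seen_drop = False
    failed = []
    for line in lines:
        if DROPPED in line:
            seen_drop = True
            continue
        m = READ_RESULT.search(line)
        # Negative results are errno-style failures (e.g. -2 for ENOENT).
        if m and seen_drop and int(m.group(2)) < 0:
            failed.append(int(m.group(1)))
    return failed
```

Fed the lines from this report, the helper would flag only the read of `osdmap.18`, since the earlier read of `osdmap.17` succeeded and precedes the dropped transaction.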

Related issues 1 (0 open, 1 closed)

Copied to bluestore - Backport #42041: nautilus: bluestore objectstore_blackhole=true violates read-after-write (Resolved, Igor Fedotov)
Actions #1

Updated by Neha Ojha over 4 years ago

  • Status changed from 12 to Need More Info

Don't see the test run for this.

Actions #2

Updated by Sage Weil over 4 years ago

  • Status changed from Need More Info to Pending Backport
  • Backport set to nautilus
  • Pull request ID set to 30475

Note that for the backport we only want one commit: 6c2a8e472dc71b962d7de008e30631f125b148c3

Actions #3

Updated by Nathan Cutler over 4 years ago

  • Copied to Backport #42041: nautilus: bluestore objectstore_blackhole=true violates read-after-write added
Actions #4

Updated by Igor Fedotov over 4 years ago

  • Status changed from Pending Backport to Resolved