Bug #64282
closed
osd crashes due to unexpected pg creation
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Description
DEBUG 2024-01-30 05:30:06,943 [shard 2] osd - ShardServices::dispatch_context_transaction: empty transaction
DEBUG 2024-01-30 05:30:06,943 [shard 2] osd - peering_event(id=33554908, detail=PeeringEvent(from=0 pgid=2.2 sent=16 requested=16 evt=epoch_sent: 16 epoch_requested: 16 RenewLease)): exit
DEBUG 2024-01-30 05:30:06,943 [shard 2] osd - 0x0 LocalPeeringEvent::start: peering_event(id=33554908, detail=PeeringEvent(from=0 pgid=2.2 sent=16 requested=16 evt=epoch_sent: 16 epoch_requested: 16 RenewLease)): complete
INFO 2024-01-30 05:30:07,428 [shard 0] prioritycache - prioritycache tune_memory target: 4294967296 mapped: 15130624 unmapped: 729088 heap: 15859712 old mem: 2845415832 new mem: 2845415832
INFO 2024-01-30 05:30:07,651 [shard 0] alienstore - stat
DEBUG 2024-01-30 05:30:07,695 [shard 2] osd - pg_advance_map(id=33554905, detail=PGAdvanceMap(pg=6.7 from=100 to=112)): advancing map to 102
DEBUG 2024-01-30 05:30:07,695 [shard 2] osd - pg_epoch 101 pg[6.7( empty local-lis/les=0/0 n=0 ec=73/73 lis/c=0/0 les/c/f=0/0/0 sis=100) [] r=-1 lpr=100 pi=[73,100)/1 crt=0'0 mlcod 0'0 unknown NOTIFY PeeringState::advance_map handle_advance_map {}/{} -- -1/-1
DEBUG 2024-01-30 05:30:07,695 [shard 2] osd - pg_epoch 102 pg[6.7( empty local-lis/les=0/0 n=0 ec=73/73 lis/c=0/0 les/c/f=0/0/0 sis=100) [] r=-1 lpr=100 pi=[73,100)/1 crt=0'0 mlcod 0'0 unknown NOTIFY state<Started>: Started advmap
DEBUG 2024-01-30 05:30:07,695 [shard 2] osd - pg_epoch 102 pg[6.7( empty local-lis/les=0/0 n=0 ec=73/73 lis/c=0/0 les/c/f=0/0/0 sis=100) [] r=-1 lpr=100 pi=[73,100)/1 crt=0'0 mlcod 0'0 unknown NOTIFY check_recovery_sources no source osds () went down
DEBUG 2024-01-30 05:30:07,695 [shard 2] osd - pg_advance_map(id=33554905, detail=PGAdvanceMap(pg=6.7 from=100 to=112)): start: getting map 103
DEBUG 2024-01-30 05:30:07,695 [shard 0] osd - get_local_map loading osdmap.103 from disk
INFO 2024-01-30 05:30:07,695 [shard 0] osd - load_map osdmap.103
INFO 2024-01-30 05:30:07,695 [shard 0] osd - load_map osdmap.103
INFO 2024-01-30 05:30:07,695 [shard 2] osd - pg_epoch 88 pg[5.d( DNE empty local-lis/les=0/0 n=0 ec=0/0 lis/c=0/0 les/c/f=0/0/0 sis=0) [] r=-1 lpr=0 crt=0'0 mlcod 0'0 unknown enter Initial
DEBUG 2024-01-30 05:30:07,695 [shard 0] osd - load_map_bl loading osdmap.103 from disk
INFO 2024-01-30 05:30:07,695 [shard 2] osd - Entering state: Initial
DEBUG 2024-01-30 05:30:07,695 [shard 2] osd - snap_mapper.reset_prefix_itr::from <0> to <CEPH_NOSNAP> ::update_bits
DEBUG 2024-01-30 05:30:07,695 [shard 2] osd - pg_epoch 88 pg[5.d( DNE empty local-lis/les=0/0 n=0 ec=0/0 lis/c=0/0 les/c/f=0/0/0 sis=0) [] r=-1 lpr=0 crt=0'0 mlcod 0'0 unknown ScrubState::ScrubState: entering state ScrubMachine/Inactive
Segmentation fault on shard 2.
Backtrace:
DEBUG 2024-01-30 05:30:07,696 [shard 0] osd - pg_epoch 112 pg[1.0( v 111'92 (0'0,111'92] local-lis/les=13/15 n=0 ec=13/13 lis/c=13/13 les/c/f=15/18/0 sis=13) [3,0] r=1 lpr=13 luod=0'0 lua=0'0 crt=111'93 lcod 107'91 mlcod 111'93 active PeeringState::update_last_complete_ondisk updating last_complete_ondisk to: 111'92
DEBUG 2024-01-30 05:30:07,696 [shard 0] osd - replicated_request(id=1358, detail=RepRequest(from=3 req=osd_repop(client.4112.0:128 1.0 e111/13 1:30306672:devicehealth::main.db.0000000000000000:head v 111'93, mlcod=111'93) v3)): complete
DEBUG 2024-01-30 05:30:07,696 [shard 0] osd - replicated_request(id=1358, detail=RepRequest(from=3 req=osd_repop(client.4112.0:128 1.0 e111/13 1:30306672:devicehealth::main.db.0000000000000000:head v 111'93, mlcod=111'93) v3)): exit
0# 0x00005593591ABE41 in ceph-osd
1# 0x00005593591AC2F5 in ceph-osd
2# 0x000055935A8DFA68 in ceph-osd
3# 0x000055935A8DFDD1 in ceph-osd
4# 0x000055935A916C5E in ceph-osd
5# 0x000055935A917A56 in ceph-osd
6# 0x000055935A8B5832 in ceph-osd
7# 0x00002BA18CE8C1CA in /lib64/libpthread.so.0
8# clone in /lib64/libc.so.6
Dump of siginfo:
si_signo: 11
si_errno: 0
si_code: 1
si_pid: 24
si_uid: 0
si_status: 0
si_utime: 0
si_stime: 0
si_int: 0
si_ptr: 0
si_overrun: 0
si_timerid: 24
si_addr: 0x18
si_band: 24
si_fd: 0
si_addr_lsb: 0
si_lower: 0
si_upper: 0
si_pkey: 0
si_call_addr: 0x18
si_syscall: 0
si_arch: 0
The crash was caused by attempted pg creation after the corresponding pool had already been removed.
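To illustrate the failure mode described above, here is a minimal C++ sketch of a creation path that checks for the pool before building the pg. It is not the code from the pull request and does not use Ceph's real classes: `OsdMapSketch`, `PoolInfo`, `PgId`, `PgSketch`, and `try_create_pg` are all hypothetical names chosen for this example. The only assumption carried over from the report is that creating a pg needs information from its pool, and that the pool may already be gone from the map by the time creation runs.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <optional>
#include <string>

// Hypothetical stand-ins for illustration only; these are not Ceph types.
struct PoolInfo {
  std::string name;
};

struct OsdMapSketch {
  std::map<int64_t, PoolInfo> pools;

  // Returns nullptr when the pool is absent from this map,
  // for example because it was deleted in an earlier epoch.
  const PoolInfo* get_pool(int64_t pool_id) const {
    auto it = pools.find(pool_id);
    return it == pools.end() ? nullptr : &it->second;
  }
};

struct PgId {
  int64_t pool;   // pool the pg belongs to
  uint32_t seed;  // pg number within the pool
};

struct PgSketch {
  PgId id;
  std::string pool_name;
};

// Creates a pg only if its pool still exists in the given map.
// Without the nullptr check, a removed pool would lead to a
// null-pointer dereference when reading pool->name.
std::optional<PgSketch> try_create_pg(const OsdMapSketch& osdmap, PgId id) {
  const PoolInfo* pool = osdmap.get_pool(id.pool);
  if (pool == nullptr) {
    return std::nullopt;  // pool removed: skip creation
  }
  return PgSketch{id, pool->name};
}
```

With a map that contains only pool 6, `try_create_pg(osdmap, PgId{6, 7})` returns a pg, while `try_create_pg(osdmap, PgId{5, 0xd})` returns `std::nullopt` instead of crashing. Returning an empty optional keeps the decision about how to handle the stale creation request with the caller.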
Updated by Matan Breizman 2 months ago
- Status changed from New to Fix Under Review
- Pull request ID set to 55407
Updated by Matan Breizman 13 days ago
- Status changed from Fix Under Review to Resolved