Bug #64347

open

src/osd/PG.cc: FAILED ceph_assert(!bad || !cct->_conf->osd_debug_verify_cached_snaps)

Added by Laura Flores 3 months ago. Updated about 1 month ago.

Status: Pending Backport
Priority: Normal
Category: -
Target version: -
% Done: 0%
Source:
Tags: backport_processed
Backport: quincy,reef,squid
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

/a/lflores-2024-02-06_20:55:47-rados-wip-yuri4-testing-2024-02-05-0849-distro-default-smithi/7548965

2024-02-06T22:42:10.739 INFO:tasks.ceph.ceph_manager.ceph:no progress seen, keeping timeout for now
2024-02-06T22:42:10.739 DEBUG:teuthology.orchestra.run.smithi079:> sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 120 ceph --cluster ceph pg dump --format=json
2024-02-06T22:42:10.740 DEBUG:teuthology.orchestra.run.smithi113:> sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 30 ceph --cluster ceph --admin-daemon /var/run/ceph/ceph-osd.6.asok dump_ops_in_flight
2024-02-06T22:42:10.744 INFO:tasks.ceph.osd.7.smithi113.stderr:2024-02-06T22:42:10.642+0000 7f281b9e5640 -1 osd.7 pg_epoch: 3046 pg[104.2( v 2995'18 (2983'14,2995'18] local-lis/les=2562/2563 n=10 ec=296/296 lis/c=2562/2562 les/c/f=2563/2563/0 sis=2562) [7,0,6] r=0 lpr=2562 crt=2995'18 lcod 2991'16 mlcod 2991'16 active+clean+scrubbing [ 104.2:  ]  trimq=[3~4](3)] on_active_advmap removed_snaps already contains [3~1]
2024-02-06T22:42:10.744 INFO:tasks.ceph.osd.7.smithi113.stderr:./src/osd/PG.cc: In function 'virtual void PG::on_active_advmap(const OSDMapRef&)' thread 7f281b9e5640 time 2024-02-06T22:42:10.649362+0000
2024-02-06T22:42:10.744 INFO:tasks.ceph.osd.7.smithi113.stderr:./src/osd/PG.cc: 1901: FAILED ceph_assert(!bad || !cct->_conf->osd_debug_verify_cached_snaps)
2024-02-06T22:42:10.745 INFO:tasks.ceph.osd.7.smithi113.stderr: ceph version 19.0.0-1269-g633ab857 (633ab857b9926af935a3e6291c3e1d9251aca357) squid (dev)
2024-02-06T22:42:10.745 INFO:tasks.ceph.osd.7.smithi113.stderr: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x118) [0x5580a3ff3dd8]
2024-02-06T22:42:10.745 INFO:tasks.ceph.osd.7.smithi113.stderr: 2: ceph-osd(+0x3f4f8f) [0x5580a3ff3f8f]
2024-02-06T22:42:10.745 INFO:tasks.ceph.osd.7.smithi113.stderr: 3: ceph-osd(+0x38e4b6) [0x5580a3f8d4b6]
2024-02-06T22:42:10.745 INFO:tasks.ceph.osd.7.smithi113.stderr: 4: (PeeringState::Active::react(PeeringState::AdvMap const&)+0x19e) [0x5580a43eb8be]
2024-02-06T22:42:10.745 INFO:tasks.ceph.osd.7.smithi113.stderr: 5: ceph-osd(+0x82dbc1) [0x5580a442cbc1]
2024-02-06T22:42:10.745 INFO:tasks.ceph.osd.7.smithi113.stderr: 6: (PeeringState::advance_map(std::shared_ptr<OSDMap const>, std::shared_ptr<OSDMap const>, std::vector<int, std::allocator<int> >&, int, std::vector<int, std::allocator<int> >&, int, PeeringCtx&)+0x266) [0x5580a43b96f6]
2024-02-06T22:42:10.745 INFO:tasks.ceph.osd.7.smithi113.stderr: 7: (PG::handle_advance_map(std::shared_ptr<OSDMap const>, std::shared_ptr<OSDMap const>, std::vector<int, std::allocator<int> >&, int, std::vector<int, std::allocator<int> >&, int, PeeringCtx&)+0xfb) [0x5580a41f3d1b]
2024-02-06T22:42:10.745 INFO:tasks.ceph.osd.7.smithi113.stderr: 8: (OSD::advance_pg(unsigned int, PG*, ThreadPool::TPHandle&, PeeringCtx&)+0x39a) [0x5580a4170c1a]
2024-02-06T22:42:10.745 INFO:tasks.ceph.osd.7.smithi113.stderr: 9: (OSD::dequeue_peering_evt(OSDShard*, PG*, std::shared_ptr<PGPeeringEvent>, ThreadPool::TPHandle&)+0x237) [0x5580a417d997]
2024-02-06T22:42:10.745 INFO:tasks.ceph.osd.7.smithi113.stderr: 10: (ceph::osd::scheduler::PGPeeringItem::run(OSD*, OSDShard*, boost::intrusive_ptr<PG>&, ThreadPool::TPHandle&)+0x51) [0x5580a43a8e81]
2024-02-06T22:42:10.745 INFO:tasks.ceph.osd.7.smithi113.stderr: 11: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0xab3) [0x5580a4187493]
2024-02-06T22:42:10.745 INFO:tasks.ceph.osd.7.smithi113.stderr: 12: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x293) [0x5580a467f2d3]
2024-02-06T22:42:10.745 INFO:tasks.ceph.osd.7.smithi113.stderr: 13: ceph-osd(+0xa80834) [0x5580a467f834]
2024-02-06T22:42:10.746 INFO:tasks.ceph.osd.7.smithi113.stderr: 14: /lib/x86_64-linux-gnu/libc.so.6(+0x94b43) [0x7f2840062b43]
2024-02-06T22:42:10.746 INFO:tasks.ceph.osd.7.smithi113.stderr: 15: /lib/x86_64-linux-gnu/libc.so.6(+0x126a00) [0x7f28400f4a00]

Description: rados/thrash/{0-size-min-size-overrides/3-size-2-min-size 1-pg-log-overrides/short_pg_log 2-recovery-overrides/{more-active-recovery} 3-scrub-overrides/{max-simultaneous-scrubs-1} backoff/peering_and_degraded ceph clusters/{fixed-2 openstack} crc-failures/default d-balancer/read mon_election/connectivity msgr-failures/osd-dispatch-delay msgr/async objectstore/bluestore-stupid rados supported-random-distro$/{ubuntu_latest} thrashers/pggrow thrashosds-health workloads/rados_api_tests}

This failure occurred with the read balancer enabled (d-balancer/read), so it is worth checking whether the two are related.
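
For anyone reading the log above: the message logged just before the assert appears to report that a newly removed snap interval, [3~1], was already present in the PG's trimq=[3~4]. Below is a minimal sketch of that overlap, assuming the intervals read as start~length. The helper names (parse_interval, overlap, fmt) are hypothetical and this is not the Ceph code, only an illustration of the arithmetic.

```python
# Illustration only: hypothetical helpers, not the Ceph implementation.
from typing import Optional, Tuple

# An interval is (start, length), written in the log as "start~length".
Interval = Tuple[int, int]


def parse_interval(text: str) -> Interval:
    """Parse 'start~length' into (start, length).

    Assumption: values are hexadecimal. This makes no difference for the
    single-digit values in this log.
    """
    start, length = text.split("~")
    return int(start, 16), int(length, 16)


def overlap(a: Interval, b: Interval) -> Optional[Interval]:
    """Return the common part of two intervals, or None if they are disjoint."""
    lo = max(a[0], b[0])
    hi = min(a[0] + a[1], b[0] + b[1])  # exclusive end of the common part
    return (lo, hi - lo) if hi > lo else None


def fmt(iv: Interval) -> str:
    """Format an interval the way it appears in the log, e.g. '[3~1]'."""
    return f"[{iv[0]:x}~{iv[1]:x}]"


trimq = parse_interval("3~4")          # covers snaps 3, 4, 5, 6
newly_removed = parse_interval("3~1")  # covers snap 3 only

common = overlap(trimq, newly_removed)
bad = common is not None  # the condition the log message complains about
if bad:
    print("already contains", fmt(common))
```

With the values from this run the two intervals share snap 3, which matches the [3~1] printed in the log. The assert only fires when osd_debug_verify_cached_snaps is set, as its condition shows.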


Related issues 5 (3 open, 2 closed)

Related to RADOS - Bug #65559: src/osd/PG.cc: FAILED ceph_assert(!bad || !cct->_conf->osd_debug_verify_cached_snaps) (Closed)
Has duplicate RADOS - Bug #64514: LibRadosTwoPoolsPP.PromoteSnapScrub test failed (Duplicate, Matan Breizman)
Copied to RADOS - Backport #65305: reef: src/osd/PG.cc: FAILED ceph_assert(!bad || !cct->_conf->osd_debug_verify_cached_snaps) (In Progress, Matan Breizman)
Copied to RADOS - Backport #65306: squid: src/osd/PG.cc: FAILED ceph_assert(!bad || !cct->_conf->osd_debug_verify_cached_snaps) (In Progress, Matan Breizman)
Copied to RADOS - Backport #65307: quincy: src/osd/PG.cc: FAILED ceph_assert(!bad || !cct->_conf->osd_debug_verify_cached_snaps) (In Progress, Matan Breizman)
