Fix #6990
closed
osd crash when running mixed versions of dumpling and master
% Done:
0%
Source:
Q/A
Description
Steps to reproduce:
1. Run a cluster of 2 nodes with the dumpling version of ceph.
2. Upgrade only the osds on the first node to the master branch.
3. Thrash the osds.
This causes the osd running the master branch to crash.
Logs are copied to ubuntu@mira052.front.sepia.ceph.com:/home/ubuntu/bug
2013-12-12 14:28:26.317723 7fc955f97700 -1 osd/ReplicatedPG.cc: In function 'virtual void ReplicatedPG::do_backfill(OpRequestRef)' thread 7fc955f97700 time 2013-12-12 14:28:26.316292
osd/ReplicatedPG.cc: 1439: FAILED assert(is_replica())
 ceph version 0.67.4-37-ga447fb7 (a447fb7d04fbad84f9ecb57726396bb6ca29d8f6)
 1: (ReplicatedPG::do_backfill(std::tr1::shared_ptr<OpRequest>)+0xbd5) [0x5d6125]
 2: (PG::do_request(std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x3f0) [0x706c80]
 3: (OSD::dequeue_op(boost::intrusive_ptr<PG>, std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x330) [0x65ae10]
 4: (OSD::OpWQ::_process(boost::intrusive_ptr<PG>, ThreadPool::TPHandle&)+0x4a0) [0x671510]
 5: (ThreadPool::WorkQueueVal<std::pair<boost::intrusive_ptr<PG>, std::tr1::shared_ptr<OpRequest> >, boost::intrusive_ptr<PG> >::_void_process(void*, ThreadPool::TPHandle&)+0x9c) [0x6acb8c]
 6: (ThreadPool::worker(ThreadPool::WorkThread*)+0x4e6) [0x8b4f06]
 7: (ThreadPool::WorkThread::entry()+0x10) [0x8b6d10]
 8: (()+0x7e9a) [0x7fc96a09ce9a]
 9: (clone()+0x6d) [0x7fc9681e8ccd]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
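When triaging several OSD logs from a thrash run, it can help to pull the failed assertion out of each log automatically. The following is a minimal sketch, not part of Ceph or teuthology: the helper name `find_failed_assert` and the regular expression are assumptions based only on the shape of the `FAILED assert(...)` line in the log above.

```python
import re

# Assumed shape of the assertion line, taken from the log in this report:
#   "<source file>: <line number>: FAILED assert(<expression>)"
ASSERT_RE = re.compile(
    r"(?P<file>\S+):\s*(?P<line>\d+):\s*FAILED assert\((?P<expr>.*?)\)(?=\s|$)"
)


def find_failed_assert(log_text):
    """Return (file, line, expression) for the first failed assert, or None."""
    match = ASSERT_RE.search(log_text)
    if match is None:
        return None
    return match.group("file"), int(match.group("line")), match.group("expr")


# Excerpt of the log line quoted in this report.
sample = (
    "osd/ReplicatedPG.cc: 1439: FAILED assert(is_replica()) "
    "ceph version 0.67.4-37-ga447fb7"
)
print(find_failed_assert(sample))
```

Run against the excerpt above, the helper reports the source file, line number, and asserted expression, which makes it easier to see whether different OSDs hit the same assertion.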
config file to reproduce the issue:
tamil@tamil-VirtualBox:~/tam_final/teuthology$ cat up_master.yaml
overrides:
  ceph:
    log-whitelist:
    - wrongly marked me down
    - objects unfound and apparently lost
    - log bound mismatch
roles:
- [mon.a, mon.b, osd.0, osd.1, osd.2, mds.a]
- [mon.c, osd.3, osd.4, osd.5, client.0]
targets:
  ubuntu@mira023.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCw8G36ubCLJBcN7Ys9+3erO+GTlJyGJirlP2p1zdkuB4gNpG0scx9lZcM+id8D9ywrA+gQK5DMKaYBuhDHzk8tvbtX9X5TsCdXHpQJtrXmvUCSPKKOK7efnhw/qRB43CYa2p4sM+X1i7QTCXBOjk8syYzM5sxumjsxswsTsVnZ75xRcOIK30W8Cog3wwVsbr4ZaJ8YlMxNObzPqOYlfYCsl+AJ8ELa7hPd+8JTP3EBYjiVvfjntkmYr8CWA+z9kXRxp6Iv9ADr4OAB9uJOkQpOAievN2qF1hCFLoI0Qxlw2px0fVpLl0SFOctVRFnefzWnuYeN+CjNHgnUAVN5HaBj
  ubuntu@mira052.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC3sW7EMc9QRG2qjunPv8uQ3rCKTYjs/P/6/aYnNUJ8CM3IkJHexkNlkGYdTD5fOyVzQBC1c+SoqPpyRYPcJvNSOOiJpoQuUE1eyVNYLdtrFaqGCN9nmQg0turDQMwDlE8nK2Fmk74xB1Bc7lvaGm9/EqZrYYMq0KSTKGlIXUD/lAHzdAbe0uItRuEi7g7FALZ9lVgUBVdW3zE+pBpIW/yqP3NKNzP6cwaDu00tUGYgnQi8tjDo+0zZEMTa4hFb8dbO4HVz+10J7qZZCPATiX0SAZvGpm9YferGLxUdGG0qeuo/SHjc2UCMg1TfFug3oRSLDlUI3BllscyCWuWXZZ2j
tasks:
- chef:
- install:
    branch: dumpling
- ceph:
    fs: xfs
- install.upgrade:
    osd.0:
      branch: master
- ceph.restart:
    daemons: [osd.0, osd.1, osd.2]
- thrashosds:
    chance_pgnum_grow: 1
    chance_pgpnum_fix: 1
    timeout: 1200
- ceph.restart:
    daemons: [mon.a]
    wait-for-healthy: false
    wait-for-osds-up: true
- workunit:
    clients:
      client.0:
      - rados/test.sh
- ceph.restart:
    daemons: [mon.b]
    wait-for-healthy: false
    wait-for-osds-up: true
- workunit:
    clients:
      client.0:
      - rados/test.sh
- ceph.restart:
    daemons: [mon.c]
    wait-for-healthy: false
    wait-for-osds-up: true
- ceph.wait_for_mon_quorum: [a, b, c]
- workunit:
    clients:
      client.0:
      - rados/test.sh