Bug #23465
"Mutex.cc: 110: FAILED assert(r == 0)" ("AttributeError: 'tuple' object has no attribute 'split'") in powercycle
Status:
New
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
% Done:
0%
Source:
Q/A
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
powercycle
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
I see the latest commit https://github.com/ceph/ceph/commit/c6760eba50860d40e25483c3e4cee772f3ad4468#diff-289c6ff15fd25acee61b31126e02dd06
but I'm unsure how it could break this.
Run: http://pulpito.ceph.com/yuriw-2018-03-26_17:15:42-powercycle-wip_master-yuriw_3.24.18-distro-basic-smithi/
Jobs: '2325406', '2325404', '2325405'
Logs: yuriw-2018-03-26_17:15:42-powercycle-wip_master-yuriw_3.24.18-distro-basic-smithi/2325406/
2018-03-26T17:42:37.491 ERROR:teuthology.run_tasks:Manager failed: internal.archive
Traceback (most recent call last):
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/run_tasks.py", line 159, in run_tasks
    suppress = manager.__exit__(*exc_info)
  File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__
    self.gen.next()
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/task/internal/__init__.py", line 365, in archive
    fetch_binaries_for_coredumps(path, rem)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/task/internal/__init__.py", line 312, in fetch_binaries_for_coredumps
    dump_program = dump_out.split("from '")[1].split(' ')[0]
AttributeError: 'tuple' object has no attribute 'split'
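The traceback suggests `dump_out` is a `(stdout, stderr)`-style tuple rather than the string the parsing code expects. A minimal sketch of the parsing step and a defensive fix (the function name and tuple layout are assumptions for illustration, not teuthology's actual API):

```python
def extract_dump_program(dump_out):
    """Parse the program name out of `file` output for a coredump.

    `file` prints something like:
      "...core file x86-64, version 1 (SYSV), SVR4-style,
       from 'ceph-osd -f --cluster ceph -i 1'"

    If the remote-command helper returns a (stdout, stderr) tuple
    instead of a string, calling .split() on it raises:
      AttributeError: 'tuple' object has no attribute 'split'
    """
    if isinstance(dump_out, tuple):
        # Hypothetical fix: take stdout from the tuple before parsing.
        dump_out = dump_out[0]
    # Grab the first word inside the "from '...'" clause.
    return dump_out.split("from '")[1].split(' ')[0]
```

With the `file` output quoted in comment #1 below, this yields `ceph-osd`.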
Related issues
History
#1 Updated by Josh Durgin about 6 years ago
This isn't related to that suite commit. Running 'file' manually on this core file returns "remote/smithi150/coredump/1522085413.12350.core: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'ceph-osd -f --cluster ceph -i 1'".
The crash appears to be a race during shutdown:
2018-03-26 17:30:13.185 7f1e4ace4700 10 osd.1 pg_epoch: 82 pg[1.1( empty local-lis/les=9/10 n=0 ec=9/9 lis/c 9/9 les/c/f 10/10/0 9/9/9) [0,1] r=1 lpr=10 crt=0'0 active mbc={}] on_shutdown
2018-03-26 17:30:13.185 7f1e4ace4700 10 osd.1 pg_epoch: 82 pg[1.1( empty local-lis/les=9/10 n=0 ec=9/9 lis/c 9/9 les/c/f 10/10/0 9/9/9) [0,1] r=1 lpr=10 DELETING crt=0'0 active mbc={}] cancel_copy_ops
2018-03-26 17:30:13.185 7f1e4ace4700 10 osd.1 pg_epoch: 82 pg[1.1( empty local-lis/les=9/10 n=0 ec=9/9 lis/c 9/9 les/c/f 10/10/0 9/9/9) [0,1] r=1 lpr=10 DELETING crt=0'0 active mbc={}] cancel_flush_ops
2018-03-26 17:30:13.185 7f1e4ace4700 10 osd.1 pg_epoch: 82 pg[1.1( empty local-lis/les=9/10 n=0 ec=9/9 lis/c 9/9 les/c/f 10/10/0 9/9/9) [0,1] r=1 lpr=10 DELETING crt=0'0 active mbc={}] cancel_proxy_ops
2018-03-26 17:30:13.185 7f1e4ace4700 10 osd.1 pg_epoch: 82 pg[1.1( empty local-lis/les=9/10 n=0 ec=9/9 lis/c 9/9 les/c/f 10/10/0 9/9/9) [0,1] r=1 lpr=10 DELETING crt=0'0 active mbc={}] clear_backoffs
2018-03-26 17:30:13.185 7f1e4ace4700 20 osd.1 pg_epoch: 82 pg[1.1( empty local-lis/les=9/10 n=0 ec=9/9 lis/c 9/9 les/c/f 10/10/0 9/9/9) [0,1] r=1 lpr=10 DELETING crt=0'0 active mbc={}] exit NotTrimming
2018-03-26 17:30:13.185 7f1e4ace4700 20 osd.1 pg_epoch: 82 pg[1.1( empty local-lis/les=9/10 n=0 ec=9/9 lis/c 9/9 les/c/f 10/10/0 9/9/9) [0,1] r=1 lpr=10 DELETING crt=0'0 active mbc={}] enter NotTrimming
2018-03-26 17:30:13.185 7f1e4ace4700 10 osd.1 pg_epoch: 82 pg[1.1( empty local-lis/les=9/10 n=0 ec=9/9 lis/c 9/9 les/c/f 10/10/0 9/9/9) [0,1] r=1 lpr=10 DELETING crt=0'0 active mbc={}] on_change
2018-03-26 17:30:13.185 7f1e4ace4700 10 osd.1 pg_epoch: 82 pg[1.1( empty local-lis/les=9/10 n=0 ec=9/9 lis/c 9/9 les/c/f 10/10/0 9/9/9) [0,1] r=1 lpr=10 DELETING crt=0'0 active mbc={}] clear_async_reads
2018-03-26 17:30:13.185 7f1e4ace4700 10 osd.1 pg_epoch: 82 pg[1.1( empty local-lis/les=9/10 n=0 ec=9/9 lis/c 9/9 les/c/f 10/10/0 9/9/9) [0,1] r=1 lpr=10 DELETING crt=0'0 active mbc={}] clear_primary_state
2018-03-26 17:30:13.185 7f1e4ace4700 10 osd.1 pg_epoch: 82 pg[1.1( empty local-lis/les=9/10 n=0 ec=9/9 lis/c 9/9 les/c/f 10/10/0 9/9/9) [0,1] r=1 lpr=10 DELETING crt=0'0 active mbc={}] release_backoffs [1:80000000::::head,1:a0000000::::head)
2018-03-26 17:30:13.185 7f1e4ace4700 20 osd.1 pg_epoch: 82 pg[1.1( empty local-lis/les=9/10 n=0 ec=9/9 lis/c 9/9 les/c/f 10/10/0 9/9/9) [0,1] r=1 lpr=10 DELETING crt=0'0 active mbc={}] agent_stop
2018-03-26 17:30:13.185 7f1e4ace4700 10 osd.1 pg_epoch: 82 pg[1.1( empty local-lis/les=9/10 n=0 ec=9/9 lis/c 9/9 les/c/f 10/10/0 9/9/9) [0,1] r=1 lpr=10 DELETING crt=0'0 active mbc={}] cancel_recovery
2018-03-26 17:30:13.185 7f1e4ace4700 10 osd.1 pg_epoch: 82 pg[1.1( empty local-lis/les=9/10 n=0 ec=9/9 lis/c 9/9 les/c/f 10/10/0 9/9/9) [0,1] r=1 lpr=10 DELETING crt=0'0 active mbc={}] clear_recovery_state
2018-03-26 17:30:13.185 7f1e3325f700 20 osd.1 op_wq(6) _process empty q, waiting
2018-03-26 17:30:13.185 7f1e35a64700 20 osd.1 op_wq(1) _process empty q, waiting
2018-03-26 17:30:13.185 7f1e3125b700 20 osd.1 op_wq(2) _process empty q, waiting
2018-03-26 17:30:13.185 7f1e31a5c700 20 osd.1 op_wq(1) _process empty q, waiting
2018-03-26 17:30:13.185 7f1e30a5a700 20 osd.1 op_wq(3) _process empty q, waiting
2018-03-26 17:30:13.185 7f1e3ca72700 -1 /build/ceph-13.0.1-3240-gdcc62bb/src/common/Mutex.cc: In function 'void Mutex::Lock(bool)' thread 7f1e3ca72700 time 2018-03-26 17:30:13.186776
/build/ceph-13.0.1-3240-gdcc62bb/src/common/Mutex.cc: 110: FAILED assert(r == 0)
 ceph version 13.0.1-3240-gdcc62bb (dcc62bb2d0243a458251a2c80b510155ad4bfa5e) mimic (dev)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x102) [0x7f1e525ce602]
 2: (()+0x2ce7d7) [0x7f1e525ce7d7]
 3: (Mutex::Lock(bool)+0x1c3) [0x7f1e525a1833]
 4: (TrackedOp::mark_event(char const*, utime_t)+0x69) [0x55b6b71b60a9]
 5: (ReplicatedBackend::op_commit(ReplicatedBackend::InProgressOp*)+0x6d) [0x55b6b720ea0d]
 6: (Context::complete(int)+0x9) [0x55b6b6f77ef9]
 7: (PrimaryLogPG::BlessedContext::finish(int)+0x56) [0x55b6b70f4c66]
 8: (Context::complete(int)+0x9) [0x55b6b6f77ef9]
 9: (Finisher::finisher_thread_entry()+0x1a7) [0x7f1e525ccc67]
 10: (()+0x76ba) [0x7f1e5108d6ba]
 11: (clone()+0x6d) [0x7f1e508b641d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
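The backtrace shows a Finisher thread completing an op (frames 4-9: `mark_event` via `op_commit`) while the OSD is shutting down, so the mutex it locks may already be destroyed. As a loose Python analogue only (not Ceph's actual C++ code; the class names are illustrative stand-ins for TrackedOp and Finisher), the safe shutdown ordering is to drain the finisher before tearing down the ops its contexts reference:

```python
import queue
import threading

class TrackedOp:
    """Stand-in for an op whose event list is guarded by a lock."""
    def __init__(self):
        self.lock = threading.Lock()
        self.events = []

    def mark_event(self, event):
        # In the C++ code this lock acquisition is what trips
        # "Mutex.cc: 110: FAILED assert(r == 0)" if the mutex is gone.
        with self.lock:
            self.events.append(event)

class Finisher:
    """Stand-in for a finisher thread running queued completion contexts."""
    def __init__(self):
        self._q = queue.Queue()
        self._t = threading.Thread(target=self._run)
        self._t.start()

    def queue_context(self, fn):
        self._q.put(fn)

    def _run(self):
        while True:
            fn = self._q.get()
            if fn is None:
                return
            fn()

    def drain_and_stop(self):
        # Safe shutdown ordering: run every queued context, then stop,
        # BEFORE the ops those contexts reference are destroyed.
        # Skipping this join is the race seen above.
        self._q.put(None)
        self._t.join()

op = TrackedOp()
fin = Finisher()
fin.queue_context(lambda: op.mark_event("commit"))
fin.drain_and_stop()  # guarantees the context ran before op teardown
print(op.events)      # -> ['commit']
```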
#2 Updated by Yuri Weinstein about 6 years ago
- Subject changed from "AttributeError: 'tuple' object has no attribute 'split'" in powercycle to "Mutex.cc: 110: FAILED assert(r == 0)" ("AttributeError: 'tuple' object has no attribute 'split'") in powercycle
#3 Updated by Greg Farnum almost 6 years ago
- Project changed from Ceph to RADOS
#4 Updated by Josh Durgin almost 6 years ago
- Assignee deleted (Sage Weil)
- Priority changed from Urgent to Normal
#5 Updated by Neha Ojha about 5 years ago
- Related to Bug #38594: mimic: common/Mutex.cc: 110: FAILED assert(r == 0) in powercycle added