Bug #20549
cephfs-journal-tool: segfault during journal reset
% Done:
0%
Source:
Q/A
Tags:
Backport:
luminous
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
tools
Labels (FS):
crash
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
2017-07-07T03:48:20.095 INFO:teuthology.orchestra.run.smithi028:Running: 'cephfs-journal-tool --debug-mds=4 --debug-objecter=1 journal reset --force'
...
2017-07-07T03:48:20.218 INFO:teuthology.orchestra.run.smithi028.stderr:2017-07-07 03:48:20.212533 7f5769379e40 4 Successfully wrote new journal pointer and header for rank 5:0
2017-07-07T03:48:20.218 INFO:teuthology.orchestra.run.smithi028.stderr:2017-07-07 03:48:20.212518 7f5759cb5700 1 -- 172.21.15.28:6808/3212126529 <== osd.1 172.21.15.197:6800/32322 4 ==== osd_op_reply(8 200.00000006 [delete] v57'7 uv5 ondisk = -2 ((2) No such file or directory)) v8 ==== 156+0+0 (3157165485 0 0) 0x7f576a5ccd00 con 0x7f576a7d3000
2017-07-07T03:48:20.218 INFO:teuthology.orchestra.run.smithi028.stderr:*** Caught signal (Segmentation fault) **
2017-07-07T03:48:20.218 INFO:teuthology.orchestra.run.smithi028.stderr: in thread 7f574f4a0700 thread_name:fn_mds_utility
2017-07-07T03:48:20.218 INFO:teuthology.orchestra.run.smithi028.stderr:2017-07-07 03:48:20.212662 7f5759cb5700 1 -- 172.21.15.28:6808/3212126529 <== osd.1 172.21.15.197:6800/32322 5 ==== osd_op_reply(5 200.00000003 [delete] v57'35 uv33 ondisk = -2 ((2) No such file or directory)) v8 ==== 156+0+0 (4253037945 0 0) 0x7f576a5ccd00 con 0x7f576a7d3000
2017-07-07T03:48:20.218 INFO:teuthology.orchestra.run.smithi028.stderr:2017-07-07 03:48:20.212831 7f5759cb5700 1 -- 172.21.15.28:6808/3212126529 <== osd.1 172.21.15.197:6800/32322 6 ==== osd_op_reply(6 200.00000004 [delete] v57'36 uv33 ondisk = -2 ((2) No such file or directory)) v8 ==== 156+0+0 (3891149632 0 0) 0x7f576a5ccd00 con 0x7f576a7d3000
2017-07-07T03:48:20.218 INFO:teuthology.orchestra.run.smithi028.stderr: ceph version 12.0.3-2341-g7250d71 (7250d71d0b423ef87a7ac7b7c5def16842eb8208) luminous (dev)
2017-07-07T03:48:20.218 INFO:teuthology.orchestra.run.smithi028.stderr: 1: (()+0x4559c1) [0x7f57697ee9c1]
2017-07-07T03:48:20.219 INFO:teuthology.orchestra.run.smithi028.stderr: 2: (()+0xf370) [0x7f575e941370]
2017-07-07T03:48:20.219 INFO:teuthology.orchestra.run.smithi028.stderr: 3: (std::_Rb_tree_increment(std::_Rb_tree_node_base*)+0x22) [0x7f575de533f2]
2017-07-07T03:48:20.219 INFO:teuthology.orchestra.run.smithi028.stderr: 4: (Journaler::_finish_prezero(int, unsigned long, unsigned long)+0x124) [0x7f57696e6534]
2017-07-07T03:48:20.219 INFO:teuthology.orchestra.run.smithi028.stderr: 5: (Context::complete(int)+0x9) [0x7f57694b7489]
2017-07-07T03:48:20.219 INFO:teuthology.orchestra.run.smithi028.stderr: 6: (Finisher::finisher_thread_entry()+0x1c5) [0x7f575fe022b5]
2017-07-07T03:48:20.219 INFO:teuthology.orchestra.run.smithi028.stderr: 7: (()+0x7dc5) [0x7f575e939dc5]
2017-07-07T03:48:20.219 INFO:teuthology.orchestra.run.smithi028.stderr: 8: (clone()+0x6d) [0x7f575d5fc73d]
2017-07-07T03:48:20.219 INFO:teuthology.orchestra.run.smithi028.stderr:2017-07-07 03:48:20.213027 7f574f4a0700 -1 *** Caught signal (Segmentation fault) **
2017-07-07T03:48:20.219 INFO:teuthology.orchestra.run.smithi028.stderr: in thread 7f574f4a0700 thread_name:fn_mds_utility
Related issues
History
#1 Updated by Zheng Yan over 6 years ago
The reason is that Resetter::reset() creates the Journaler on the stack. The Journaler is destroyed before all of the prezero osd_op_reply messages have been received, so the late completions (Journaler::_finish_prezero in the backtrace) run against freed memory.
#2 Updated by Patrick Donnelly almost 6 years ago
- Assignee set to Zheng Yan
- Priority changed from Normal to Urgent
- Target version set to v13.0.0
- Backport set to luminous
- Labels (FS) crash added
#3 Updated by Zheng Yan almost 6 years ago
- Status changed from New to Fix Under Review
#4 Updated by Patrick Donnelly almost 6 years ago
- Status changed from Fix Under Review to Pending Backport
#5 Updated by Nathan Cutler almost 6 years ago
- Copied to Backport #23936: luminous: cephfs-journal-tool: segfault during journal reset added
#6 Updated by Nathan Cutler almost 6 years ago
- Status changed from Pending Backport to Resolved