Project

General

Profile

Bug #36384

src/osdc/Journaler.cc: 420: FAILED ceph_assert(!r)

Added by Patrick Donnelly 2 months ago. Updated about 2 months ago.

Status:
New
Priority:
High
Category:
-
Target version:
Start date:
10/10/2018
Due date:
% Done:

0%

Source:
Development
Tags:
Backport:
mimic,luminous
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
MDS
Labels (FS):
crash
Pull request ID:

Description

2018-10-09T21:34:42.766 INFO:tasks.ceph.mds.a-s.smithi175.stderr:/build/ceph-14.0.0-3932-g841b270/src/osdc/Journaler.cc: In function 'void Journaler::_finish_reread_head_and_probe(int, C_OnFinisher*)' thread 7f89da15f700 time 2018-10-09 21:34:42.768350
2018-10-09T21:34:42.767 INFO:tasks.ceph.mds.a-s.smithi175.stderr:/build/ceph-14.0.0-3932-g841b270/src/osdc/Journaler.cc: 420: FAILED ceph_assert(!r)
2018-10-09T21:34:42.768 INFO:teuthology.orchestra.run.smithi046.stderr:2018-10-09 21:34:42.766 7fabe9bad700  1  Processor -- start
2018-10-09T21:34:42.768 INFO:teuthology.orchestra.run.smithi046.stderr:2018-10-09 21:34:42.766 7fabe9bad700  1 --  start start
2018-10-09T21:34:42.768 INFO:teuthology.orchestra.run.smithi046.stderr:2018-10-09 21:34:42.766 7fabe9bad700  1 --  --> 172.21.15.46:6789/0 -- auth(proto 0 30 bytes epoch 0) v1 -- 0x7fabe4090dd0 con 0
2018-10-09T21:34:42.768 INFO:teuthology.orchestra.run.smithi046.stderr:2018-10-09 21:34:42.766 7fabe9bad700  1 --  --> 172.21.15.175:6789/0 -- auth(proto 0 30 bytes epoch 0) v1 -- 0x7fabe4091160 con 0
2018-10-09T21:34:42.769 INFO:teuthology.orchestra.run.smithi046.stderr:2018-10-09 21:34:42.766 7fabda85f700  1 -- 172.21.15.46:40176/3240564089 learned_addr learned my addr 172.21.15.46:40176/3240564089 (peer_addr_for_me 172.21.15.46:40176/0)
2018-10-09T21:34:42.769 INFO:teuthology.orchestra.run.smithi046.stderr:2018-10-09 21:34:42.766 7fabe105c700  1 -- 172.21.15.46:40176/3240564089 <== mon.0 172.21.15.46:6789/0 1 ==== auth_reply(proto 2 0 (0) Success) v1 ==== 33+0+0 (603656868 0 0) 0x7fabcc0010c0 con 0x7fabe4091a90
2018-10-09T21:34:42.769 INFO:teuthology.orchestra.run.smithi046.stderr:2018-10-09 21:34:42.766 7fabe105c700  1 -- 172.21.15.46:40176/3240564089 --> 172.21.15.46:6789/0 -- auth(proto 2 32 bytes epoch 0) v1 -- 0x7fabc8003180 con 0
2018-10-09T21:34:42.769 INFO:teuthology.orchestra.run.smithi046.stderr:2018-10-09 21:34:42.766 7fabe105c700  1 -- 172.21.15.46:40176/3240564089 <== mon.1 172.21.15.175:6789/0 1 ==== auth_reply(proto 2 0 (0) Success) v1 ==== 33+0+0 (423033113 0 0) 0x7fabd4000e80 con 0x7fabe41a6250
2018-10-09T21:34:42.769 INFO:teuthology.orchestra.run.smithi046.stderr:2018-10-09 21:34:42.766 7fabe105c700  1 -- 172.21.15.46:40176/3240564089 --> 172.21.15.175:6789/0 -- auth(proto 2 32 bytes epoch 0) v1 -- 0x7fabc8004590 con 0
2018-10-09T21:34:42.769 INFO:teuthology.orchestra.run.smithi046.stderr:2018-10-09 21:34:42.766 7fabe105c700  1 -- 172.21.15.46:40176/3240564089 <== mon.0 172.21.15.46:6789/0 2 ==== auth_reply(proto 2 0 (0) Success) v1 ==== 206+0+0 (847600691 0 0) 0x7fabcc0010c0 con 0x7fabe4091a90
2018-10-09T21:34:42.769 INFO:teuthology.orchestra.run.smithi046.stderr:2018-10-09 21:34:42.766 7fabe105c700  1 -- 172.21.15.46:40176/3240564089 --> 172.21.15.46:6789/0 -- auth(proto 2 165 bytes epoch 0) v1 -- 0x7fabc80034d0 con 0
2018-10-09T21:34:42.769 INFO:teuthology.orchestra.run.smithi046.stderr:2018-10-09 21:34:42.766 7fabe105c700  1 -- 172.21.15.46:40176/3240564089 <== mon.0 172.21.15.46:6789/0 3 ==== mon_map magic: 0 v1 ==== 256+0+0 (3862443278 0 0) 0x7fabcc0016c0 con 0x7fabe4091a90
2018-10-09T21:34:42.769 INFO:teuthology.orchestra.run.smithi046.stderr:2018-10-09 21:34:42.766 7fabe105c700  1 -- 172.21.15.46:40176/3240564089 <== mon.1 172.21.15.175:6789/0 2 ==== auth_reply(proto 2 0 (0) Success) v1 ==== 206+0+0 (3136949635 0 0) 0x7fabd4000f20 con 0x7fabe41a6250
2018-10-09T21:34:42.770 INFO:teuthology.orchestra.run.smithi046.stderr:2018-10-09 21:34:42.766 7fabe105c700  1 -- 172.21.15.46:40176/3240564089 --> 172.21.15.175:6789/0 -- auth(proto 2 165 bytes epoch 0) v1 -- 0x7fabc8003c40 con 0
2018-10-09T21:34:42.770 INFO:teuthology.orchestra.run.smithi046.stderr:2018-10-09 21:34:42.766 7fabe105c700  1 -- 172.21.15.46:40176/3240564089 <== mon.1 172.21.15.175:6789/0 3 ==== mon_map magic: 0 v1 ==== 256+0+0 (3862443278 0 0) 0x7fabd40014a0 con 0x7fabe41a6250
2018-10-09T21:34:42.770 INFO:teuthology.orchestra.run.smithi046.stderr:2018-10-09 21:34:42.766 7fabe105c700  1 -- 172.21.15.46:40176/3240564089 <== mon.0 172.21.15.46:6789/0 4 ==== auth_reply(proto 2 0 (0) Success) v1 ==== 580+0+0 (352379544 0 0) 0x7fabcc000e50 con 0x7fabe4091a90
2018-10-09T21:34:42.770 INFO:teuthology.orchestra.run.smithi046.stderr:2018-10-09 21:34:42.766 7fabe105c700  1 -- 172.21.15.46:40176/3240564089 >> 172.21.15.175:6789/0 conn(0x7fabe41a6250 legacy :-1 s=STATE_CONNECTION_ESTABLISHED l=1).mark_down
2018-10-09T21:34:42.770 INFO:teuthology.orchestra.run.smithi046.stderr:2018-10-09 21:34:42.766 7fabe105c700  1 -- 172.21.15.46:40176/3240564089 --> 172.21.15.46:6789/0 -- mon_subscribe({config=0+,monmap=0+}) v3 -- 0x7fabe408def0 con 0
2018-10-09T21:34:42.770 INFO:teuthology.orchestra.run.smithi046.stderr:2018-10-09 21:34:42.766 7fabe9bad700  1 -- 172.21.15.46:40176/3240564089 --> 172.21.15.46:6789/0 -- mon_subscribe({mgrmap=0+}) v3 -- 0x7fabe4090dd0 con 0
2018-10-09T21:34:42.770 INFO:teuthology.orchestra.run.smithi046.stderr:2018-10-09 21:34:42.766 7fabe9bad700  1 -- 172.21.15.46:40176/3240564089 --> 172.21.15.46:6789/0 -- mon_subscribe({osdmap=0}) v3 -- 0x7fabe4091160 con 0
2018-10-09T21:34:42.770 INFO:teuthology.orchestra.run.smithi046.stderr:2018-10-09 21:34:42.766 7fabe105c700  1 -- 172.21.15.46:40176/3240564089 <== mon.0 172.21.15.46:6789/0 5 ==== config(0 keys) v1 ==== 4+0+0 (0 0 0) 0x7fabcc0016c0 con 0x7fabe4091a90
2018-10-09T21:34:42.771 INFO:teuthology.orchestra.run.smithi046.stderr:2018-10-09 21:34:42.766 7fabe105c700  1 -- 172.21.15.46:40176/3240564089 <== mon.0 172.21.15.46:6789/0 6 ==== mon_map magic: 0 v1 ==== 256+0+0 (3862443278 0 0) 0x7fabcc000e20 con 0x7fabe4091a90
2018-10-09T21:34:42.771 INFO:teuthology.orchestra.run.smithi046.stderr:2018-10-09 21:34:42.766 7fabe105c700  1 -- 172.21.15.46:40176/3240564089 <== mon.0 172.21.15.46:6789/0 7 ==== mgrmap(e 3) v1 ==== 1316+0+0 (1999093257 0 0) 0x7fabcc001090 con 0x7fabe4091a90
2018-10-09T21:34:42.771 INFO:teuthology.orchestra.run.smithi046.stderr:2018-10-09 21:34:42.766 7fabe105c700  1 -- 172.21.15.46:40176/3240564089 <== mon.0 172.21.15.46:6789/0 8 ==== osd_map(39..39 src has 1..39) v4 ==== 4369+0+0 (2427028789 0 0) 0x7fabcc006810 con 0x7fabe4091a90
2018-10-09T21:34:42.771 INFO:tasks.ceph.mds.a-s.smithi175.stderr: ceph version 14.0.0-3932-g841b270 (841b27044263faff77f3bed42b6fcb06b916d3a7) nautilus (dev)
2018-10-09T21:34:42.771 INFO:tasks.ceph.mds.a-s.smithi175.stderr: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x158) [0x7f89e7ff6f4f]
2018-10-09T21:34:42.771 INFO:tasks.ceph.mds.a-s.smithi175.stderr: 2: (ceph::__ceph_assertf_fail(char const*, char const*, int, char const*, char const*, ...)+0) [0x7f89e7ff712c]
2018-10-09T21:34:42.771 INFO:tasks.ceph.mds.a-s.smithi175.stderr: 3: ceph-mds() [0x7c59aa]
2018-10-09T21:34:42.771 INFO:tasks.ceph.mds.a-s.smithi175.stderr: 4: (Context::complete(int)+0x9) [0x4dd669]
2018-10-09T21:34:42.771 INFO:tasks.ceph.mds.a-s.smithi175.stderr: 5: (Journaler::_finish_reread_head(int, ceph::buffer::list&, Context*)+0x2be) [0x7c2ece]
2018-10-09T21:34:42.772 INFO:tasks.ceph.mds.a-s.smithi175.stderr: 6: (Context::complete(int)+0x9) [0x4dd669]
2018-10-09T21:34:42.772 INFO:tasks.ceph.mds.a-s.smithi175.stderr: 7: (Finisher::finisher_thread_entry()+0x15e) [0x7f89e803f4de]
2018-10-09T21:34:42.772 INFO:tasks.ceph.mds.a-s.smithi175.stderr: 8: (()+0x76ba) [0x7f89e78bc6ba]
2018-10-09T21:34:42.772 INFO:tasks.ceph.mds.a-s.smithi175.stderr: 9: (clone()+0x6d) [0x7f89e70e541d]
2018-10-09T21:34:42.772 INFO:tasks.ceph.mds.a-s.smithi175.stderr:2018-10-09 21:34:42.766 7f89da15f700 -1 /build/ceph-14.0.0-3932-g841b270/src/osdc/Journaler.cc: In function 'void Journaler::_finish_reread_head_and_probe(int, C_OnFinisher*)' thread 7f89da15f700 time 2018-10-09 21:34:42.768350
2018-10-09T21:34:42.772 INFO:tasks.ceph.mds.a-s.smithi175.stderr:/build/ceph-14.0.0-3932-g841b270/src/osdc/Journaler.cc: 420: FAILED ceph_assert(!r)
2018-10-09T21:34:42.772 INFO:tasks.ceph.mds.a-s.smithi175.stderr:
2018-10-09T21:34:42.772 INFO:tasks.ceph.mds.a-s.smithi175.stderr: ceph version 14.0.0-3932-g841b270 (841b27044263faff77f3bed42b6fcb06b916d3a7) nautilus (dev)
2018-10-09T21:34:42.772 INFO:tasks.ceph.mds.a-s.smithi175.stderr: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x158) [0x7f89e7ff6f4f]
2018-10-09T21:34:42.773 INFO:tasks.ceph.mds.a-s.smithi175.stderr: 2: (ceph::__ceph_assertf_fail(char const*, char const*, int, char const*, char const*, ...)+0) [0x7f89e7ff712c]
2018-10-09T21:34:42.773 INFO:tasks.ceph.mds.a-s.smithi175.stderr: 3: ceph-mds() [0x7c59aa]
2018-10-09T21:34:42.773 INFO:tasks.ceph.mds.a-s.smithi175.stderr: 4: (Context::complete(int)+0x9) [0x4dd669]
2018-10-09T21:34:42.773 INFO:tasks.ceph.mds.a-s.smithi175.stderr: 5: (Journaler::_finish_reread_head(int, ceph::buffer::list&, Context*)+0x2be) [0x7c2ece]
2018-10-09T21:34:42.773 INFO:tasks.ceph.mds.a-s.smithi175.stderr: 6: (Context::complete(int)+0x9) [0x4dd669]
2018-10-09T21:34:42.773 INFO:tasks.ceph.mds.a-s.smithi175.stderr: 7: (Finisher::finisher_thread_entry()+0x15e) [0x7f89e803f4de]
2018-10-09T21:34:42.773 INFO:tasks.ceph.mds.a-s.smithi175.stderr: 8: (()+0x76ba) [0x7f89e78bc6ba]
2018-10-09T21:34:42.773 INFO:tasks.ceph.mds.a-s.smithi175.stderr: 9: (clone()+0x6d) [0x7f89e70e541d]

From: /ceph/teuthology-archive/pdonnell-2018-10-09_00:54:26-fs-wip-pdonnell-testing-20181008.224656-distro-basic-smithi/3118420/teuthology.log

Core: /ceph/teuthology-archive/pdonnell-2018-10-09_00:54:26-fs-wip-pdonnell-testing-20181008.224656-distro-basic-smithi/3118420/remote/smithi175/coredump/1539120882.18210.core

Branch: https://github.com/ceph/ceph-ci/tree/wip-pdonnell-testing-20181008.224656

Looks unlikely the group of PRs caused this.

History

#1 Updated by Zheng Yan 2 months ago

    -2> 2018-10-09 21:34:42.766 7f89e2970700  1 -- 172.21.15.175:6817/2317968448 <== osd.7 172.21.15.175:6813/17054 11 ==== osd_op_reply(44 200.00000000 [read 0~0] v0'0 uv0 ondisk = -2 ((2) No such file or directory)) v8 ==== 156+0+0 (1020293177 0 0) 0x647d0c0 con 0x2446a00

there is race in the journal migration test

#2 Updated by Patrick Donnelly about 2 months ago

  • Assignee set to Patrick Donnelly

Also available in: Atom PDF