Bug #12517

closed

assert failure in Objecter::_finish_op(Objecter::Op*)

Added by ceph zte over 8 years ago. Updated over 8 years ago.

Status: Duplicate
Priority: High
Assignee: -
Category: -
Target version: -
% Done: 0%
Source: other
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite: rbd
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

I created image im1 in pool song, then copied im1 to pool rbd.

Before the copy finished, I deleted pool song, and rbd crashed with the core dump shown below.

Why does Ceph not hold a lock while an image is being copied out of a pool? The pool should be locked so that the delete operation is forbidden during the copy.

Even though the rbd copy operation dumped core, in pool rbd it looks as if the copy finished successfully:

[root@node64 ~]# rbd info --image im1_bak3
rbd image 'im1_bak3':
        size 10240 MB in 2560 objects
        order 22 (4096 kB objects)
        block_name_prefix: rb.0.15c54.238e1f29
        format: 1
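
The steps described above can be sketched as a small script. This is a hedged reconstruction, not the reporter's actual commands: the pool and image names (song, im1, im1_bak3) and the image size come from this report, while the placement-group count, the delay before the pool is deleted, and the DRY_RUN switch are assumptions added for illustration. By default it only prints the commands; it should be run for real only against a disposable test cluster, since it deletes a pool.

```shell
#!/usr/bin/env bash
# Reproduction sketch for the crash in this report (not the reporter's own script).
# Names song, im1, im1_bak3 come from the report; PG_NUM and DELAY are assumptions.
# DRY_RUN=1 (default) prints the commands; DRY_RUN=0 executes them.

DRY_RUN="${DRY_RUN:-1}"
PG_NUM="${PG_NUM:-64}"   # assumed placement-group count for the scratch pool
DELAY="${DELAY:-5}"      # assumed seconds to let the copy run before deleting the pool

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

reproduce() {
    run ceph osd pool create song "$PG_NUM"
    run rbd create --size 10240 song/im1        # 10240 MB, as in the rbd info output
    run rbd cp song/im1 rbd/im1_bak3 &          # start the copy in the background
    cp_pid=$!
    run sleep "$DELAY"
    # Delete the source pool while the copy is still reading from it.
    run ceph osd pool delete song song --yes-i-really-really-mean-it
    # On an affected client the copy is expected to abort here.
    wait "$cp_pid" || echo "rbd cp exited with status $?"
}

reproduce
```

Setting DRY_RUN=0 runs the commands instead of printing them. Whether the assert fires depends on timing, so DELAY may need adjusting.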

   -34> 2015-07-29 17:21:46.402011 7f473379b880  1 -- 10.118.202.64:0/1024572 --> 10.118.202.64:6810/11531 -- osd_op(client.89172.0:455 rb.0.15c53.2ae8944a.0000000001b9 [read 0~4194304] 7.5453652f ack+read+known_if_redirected e78) v4 -- ?+0 0x103a370 con 0xfe07f0
   -33> 2015-07-29 17:21:46.402347 7f473379b880  1 -- 10.118.202.64:0/1024572 --> 10.118.202.64:6815/11780 -- osd_op(client.89172.0:456 rb.0.15c53.2ae8944a.0000000001ba [read 0~4194304] 7.902b0a69 ack+read+known_if_redirected e78) v4 -- ?+0 0x103acc0 con 0xfe62a0
   -32> 2015-07-29 17:21:46.402350 7f472b57f700  1 -- 10.118.202.64:0/1024572 <== osd.1 10.118.202.64:6805/11173 66 ==== osd_map(79..79 src has 1..79) v3 ==== 214+0+0 (2468074320 0 0) 0x7f46f8002610 con 0xfeb060
   -31> 2015-07-29 17:21:46.402771 7f472b57f700 10 monclient: get_version osdmap req 0x7f4720005760
   -30> 2015-07-29 17:21:46.402788 7f472b57f700 10 monclient: _send_mon_message to mon.node64 at 10.118.202.64:6789/0
   -29> 2015-07-29 17:21:46.402800 7f472b57f700  1 -- 10.118.202.64:0/1024572 --> 10.118.202.64:6789/0 -- mon_get_version(what=osdmap handle=1) v1 -- ?+0 0x7f4720002d40 con 0xfdbbe0
   -28> 2015-07-29 17:21:46.402825 7f472b57f700 10 monclient: get_version osdmap req 0x7f4720011860
   -27> 2015-07-29 17:21:46.402830 7f472b57f700 10 monclient: _send_mon_message to mon.node64 at 10.118.202.64:6789/0
   -26> 2015-07-29 17:21:46.402836 7f472b57f700  1 -- 10.118.202.64:0/1024572 --> 10.118.202.64:6789/0 -- mon_get_version(what=osdmap handle=2) v1 -- ?+0 0x7f47200008c0 con 0xfdbbe0
   -25> 2015-07-29 17:21:46.402902 7f472b57f700 10 monclient: get_version osdmap req 0x7f4720011840
   -24> 2015-07-29 17:21:46.402910 7f472b57f700 10 monclient: _send_mon_message to mon.node64 at 10.118.202.64:6789/0
   -23> 2015-07-29 17:21:46.402915 7f472b57f700  1 -- 10.118.202.64:0/1024572 --> 10.118.202.64:6789/0 -- mon_get_version(what=osdmap handle=3) v1 -- ?+0 0x7f4720002fc0 con 0xfdbbe0
   -22> 2015-07-29 17:21:46.402945 7f472b57f700 10 monclient: get_version osdmap req 0x7f4720000fd0
   -21> 2015-07-29 17:21:46.402949 7f472b57f700 10 monclient: _send_mon_message to mon.node64 at 10.118.202.64:6789/0
   -20> 2015-07-29 17:21:46.402952 7f472b57f700  1 -- 10.118.202.64:0/1024572 --> 10.118.202.64:6789/0 -- mon_get_version(what=osdmap handle=4) v1 -- ?+0 0x7f4720022710 con 0xfdbbe0
   -19> 2015-07-29 17:21:46.402961 7f472b57f700 10 monclient: get_version osdmap req 0x7f4720011880
   -18> 2015-07-29 17:21:46.402964 7f472b57f700 10 monclient: _send_mon_message to mon.node64 at 10.118.202.64:6789/0
   -17> 2015-07-29 17:21:46.402967 7f472b57f700  1 -- 10.118.202.64:0/1024572 --> 10.118.202.64:6789/0 -- mon_get_version(what=osdmap handle=5) v1 -- ?+0 0x7f47200229e0 con 0xfdbbe0
   -16> 2015-07-29 17:21:46.403011 7f472b57f700 10 monclient: get_version osdmap req 0x7f47200098d0
   -15> 2015-07-29 17:21:46.403021 7f472b57f700 10 monclient: _send_mon_message to mon.node64 at 10.118.202.64:6789/0
   -14> 2015-07-29 17:21:46.403027 7f472b57f700  1 -- 10.118.202.64:0/1024572 --> 10.118.202.64:6789/0 -- mon_get_version(what=osdmap handle=6) v1 -- ?+0 0x7f4720022710 con 0xfdbbe0
   -13> 2015-07-29 17:21:46.403050 7f472b57f700 10 monclient: get_version osdmap req 0x7f47200218a0
   -12> 2015-07-29 17:21:46.403057 7f472b57f700 10 monclient: _send_mon_message to mon.node64 at 10.118.202.64:6789/0
   -11> 2015-07-29 17:21:46.403063 7f472b57f700  1 -- 10.118.202.64:0/1024572 --> 10.118.202.64:6789/0 -- mon_get_version(what=osdmap handle=7) v1 -- ?+0 0x7f47200229e0 con 0xfdbbe0
   -10> 2015-07-29 17:21:46.403077 7f472b57f700 10 monclient: get_version osdmap req 0x7f4720021880
    -9> 2015-07-29 17:21:46.403084 7f472b57f700 10 monclient: _send_mon_message to mon.node64 at 10.118.202.64:6789/0
    -8> 2015-07-29 17:21:46.403091 7f472b57f700  1 -- 10.118.202.64:0/1024572 --> 10.118.202.64:6789/0 -- mon_get_version(what=osdmap handle=8) v1 -- ?+0 0x7f4720002450 con 0xfdbbe0
    -7> 2015-07-29 17:21:46.403126 7f472b57f700 10 monclient: get_version osdmap req 0x7f4720002320
    -6> 2015-07-29 17:21:46.403136 7f472b57f700 10 monclient: _send_mon_message to mon.node64 at 10.118.202.64:6789/0
    -5> 2015-07-29 17:21:46.403141 7f472b57f700  1 -- 10.118.202.64:0/1024572 --> 10.118.202.64:6789/0 -- mon_get_version(what=osdmap handle=9) v1 -- ?+0 0x7f47200229e0 con 0xfdbbe0
    -4> 2015-07-29 17:21:46.403150 7f472b57f700 10 monclient: get_version osdmap req 0x7f47200067a0
    -3> 2015-07-29 17:21:46.403152 7f472b57f700 10 monclient: _send_mon_message to mon.node64 at 10.118.202.64:6789/0
    -2> 2015-07-29 17:21:46.403154 7f472b57f700  1 -- 10.118.202.64:0/1024572 --> 10.118.202.64:6789/0 -- mon_get_version(what=osdmap handle=10) v1 -- ?+0 0x7f4720022d80 con 0xfdbbe0
    -1> 2015-07-29 17:21:46.403191 7f472b57f700  1 -- 10.118.202.64:0/1024572 <== osd.2 10.118.202.64:6810/11531 125 ==== osd_op_reply(447 rb.0.15c53.2ae8944a.0000000001b1 [read 0~4194304] v0'0 uv0 ack = -2 ((2) No such file or directory)) v6 ==== 199+0+0 (255113851 0 0) 0x7f470c0011d0 con 0xfe07f0
     0> 2015-07-29 17:21:46.405797 7f472b57f700 -1 osdc/Objecter.cc: In function 'void Objecter::_finish_op(Objecter::Op*)' thread 7f472b57f700 time 2015-07-29 17:21:46.403313
osdc/Objecter.cc: 2536: FAILED assert(check_latest_map_ops.find(op->tid) == check_latest_map_ops.end())

 ceph version 0.87 (c51c8f9d80fa4e0168aa52685b8de40e42758578)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x85) [0x7f473084eaf5]
 2: (Objecter::_finish_op(Objecter::Op*)+0x27c) [0x7f4732cb906c]
 3: (Objecter::handle_osd_op_reply(MOSDOpReply*)+0x15ec) [0x7f4732cc872c]
 4: (Objecter::ms_dispatch(Message*)+0x2b3) [0x7f4732cd2e03]
 5: (DispatchQueue::entry()+0x62a) [0x7f473096463a]
 6: (DispatchQueue::DispatchThread::entry()+0xd) [0x7f47309e383d]
 7: (()+0x7df3) [0x7f47303c6df3]
 8: (clone()+0x6d) [0x7f472f9d23dd]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

--- logging levels ---
   0/ 5 none
   0/ 1 lockdep
   0/ 1 context
   1/ 1 crush
   1/ 5 mds
   1/ 5 mds_balancer
   1/ 5 mds_locker
   1/ 5 mds_log
   1/ 5 mds_log_expire
   1/ 5 mds_migrator
   0/ 1 buffer
   0/ 1 timer
   0/ 1 filer
   0/ 1 striper
   0/ 1 objecter
   0/ 5 rados
   0/ 5 rbd
   0/ 5 rbd_replay
   0/ 5 journaler
   0/ 5 objectcacher
   0/ 5 client
   0/ 5 osd
   0/ 5 optracker
   0/ 5 objclass
   1/ 3 filestore
   1/ 3 keyvaluestore
   1/ 3 journal
   0/ 5 ms
   1/ 5 mon
   0/10 monc
   1/ 5 paxos
   0/ 5 tp
   1/ 5 auth
   1/ 5 crypto
   1/ 1 finisher
   1/ 5 heartbeatmap
   1/ 5 perfcounter
   1/ 5 rgw
   1/10 civetweb
   1/ 5 javaclient
   1/ 5 asok
   1/ 1 throttle
   0/ 0 refs
  -2/-2 (syslog threshold)
  99/99 (stderr threshold)
  max_recent       500
  max_new         1000
  log_file
--- end dump of recent events ---
terminate called after throwing an instance of 'ceph::FailedAssertion'
*** Caught signal (Aborted) **
 in thread 7f472b57f700
 ceph version 0.87 (c51c8f9d80fa4e0168aa52685b8de40e42758578)
 1: rbd() [0x424402]
 2: (()+0xf130) [0x7f47303ce130]
 3: (gsignal()+0x39) [0x7f472f911989]
 4: (abort()+0x148) [0x7f472f913098]
 5: (__gnu_cxx::__verbose_terminate_handler()+0x165) [0x7f472ff139d5]
 6: (()+0x5e946) [0x7f472ff11946]
 7: (()+0x5e973) [0x7f472ff11973]
 8: (()+0x5eb9f) [0x7f472ff11b9f]
 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x27a) [0x7f473084ecea]
 10: (Objecter::_finish_op(Objecter::Op*)+0x27c) [0x7f4732cb906c]
 11: (Objecter::handle_osd_op_reply(MOSDOpReply*)+0x15ec) [0x7f4732cc872c]
 12: (Objecter::ms_dispatch(Message*)+0x2b3) [0x7f4732cd2e03]
 13: (DispatchQueue::entry()+0x62a) [0x7f473096463a]
 14: (DispatchQueue::DispatchThread::entry()+0xd) [0x7f47309e383d]
 15: (()+0x7df3) [0x7f47303c6df3]
 16: (clone()+0x6d) [0x7f472f9d23dd]
2015-07-29 17:21:46.422417 7f472b57f700 -1 *** Caught signal (Aborted) **
 in thread 7f472b57f700

 ceph version 0.87 (c51c8f9d80fa4e0168aa52685b8de40e42758578)
 1: rbd() [0x424402]
 2: (()+0xf130) [0x7f47303ce130]
 3: (gsignal()+0x39) [0x7f472f911989]
 4: (abort()+0x148) [0x7f472f913098]
 5: (__gnu_cxx::__verbose_terminate_handler()+0x165) [0x7f472ff139d5]
 6: (()+0x5e946) [0x7f472ff11946]
 7: (()+0x5e973) [0x7f472ff11973]
 8: (()+0x5eb9f) [0x7f472ff11b9f]
 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x27a) [0x7f473084ecea]
 10: (Objecter::_finish_op(Objecter::Op*)+0x27c) [0x7f4732cb906c]
 11: (Objecter::handle_osd_op_reply(MOSDOpReply*)+0x15ec) [0x7f4732cc872c]
 12: (Objecter::ms_dispatch(Message*)+0x2b3) [0x7f4732cd2e03]
 13: (DispatchQueue::entry()+0x62a) [0x7f473096463a]
 14: (DispatchQueue::DispatchThread::entry()+0xd) [0x7f47309e383d]
 15: (()+0x7df3) [0x7f47303c6df3]
 16: (clone()+0x6d) [0x7f472f9d23dd]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

--- begin dump of recent events ---
     0> 2015-07-29 17:21:46.422417 7f472b57f700 -1 *** Caught signal (Aborted) **
 in thread 7f472b57f700

 ceph version 0.87 (c51c8f9d80fa4e0168aa52685b8de40e42758578)
 1: rbd() [0x424402]
 2: (()+0xf130) [0x7f47303ce130]
 3: (gsignal()+0x39) [0x7f472f911989]
 4: (abort()+0x148) [0x7f472f913098]
 5: (__gnu_cxx::__verbose_terminate_handler()+0x165) [0x7f472ff139d5]
 6: (()+0x5e946) [0x7f472ff11946]
 7: (()+0x5e973) [0x7f472ff11973]
 8: (()+0x5eb9f) [0x7f472ff11b9f]
 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x27a) [0x7f473084ecea]
 10: (Objecter::_finish_op(Objecter::Op*)+0x27c) [0x7f4732cb906c]
 11: (Objecter::handle_osd_op_reply(MOSDOpReply*)+0x15ec) [0x7f4732cc872c]
 12: (Objecter::ms_dispatch(Message*)+0x2b3) [0x7f4732cd2e03]
 13: (DispatchQueue::entry()+0x62a) [0x7f473096463a]
 14: (DispatchQueue::DispatchThread::entry()+0xd) [0x7f47309e383d]
 15: (()+0x7df3) [0x7f47303c6df3]
 16: (clone()+0x6d) [0x7f472f9d23dd]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

--- logging levels ---
   0/ 5 none
   0/ 1 lockdep
   0/ 1 context
   1/ 1 crush
   1/ 5 mds
   1/ 5 mds_balancer
   1/ 5 mds_locker
   1/ 5 mds_log
   1/ 5 mds_log_expire
   1/ 5 mds_migrator
   0/ 1 buffer
   0/ 1 timer
   0/ 1 filer
   0/ 1 striper
   0/ 1 objecter
   0/ 5 rados
   0/ 5 rbd
   0/ 5 rbd_replay
   0/ 5 journaler
   0/ 5 objectcacher
   0/ 5 client
   0/ 5 osd
   0/ 5 optracker
   0/ 5 objclass
   1/ 3 filestore
   1/ 3 keyvaluestore
   1/ 3 journal
   0/ 5 ms
   1/ 5 mon
   0/10 monc
   1/ 5 paxos
   0/ 5 tp
   1/ 5 auth
   1/ 5 crypto
   1/ 1 finisher
   1/ 5 heartbeatmap
   1/ 5 perfcounter
   1/ 5 rgw
   1/10 civetweb
   1/ 5 javaclient
   1/ 5 asok
   1/ 1 throttle
   0/ 0 refs
  -2/-2 (syslog threshold)
  99/99 (stderr threshold)
  max_recent       500
  max_new         1000
  log_file
--- end dump of recent events ---
Aborted (core dumped)

Related issues 1 (0 open, 1 closed)

Is duplicate of Ceph - Bug #10372: FAILED assert(check_latest_map_ops.find(op->tid) (firefly,giant), Resolved, Sage Weil, 12/18/2014

Actions #1

Updated by Haomai Wang over 8 years ago

Is your Ceph version 0.86?

Actions #2

Updated by ceph zte over 8 years ago

Sorry, my Ceph version is 0.87, but 0.87 is not available as a choice in this bug system. Why does the bug system not offer 0.87?

Are you ???? I have read your blog about Ceph, it is so cool. But recently there have been no new articles about Ceph.

Actions #3

Updated by Kefu Chai over 8 years ago

  • Subject changed from ceph copy image core dump! to assert failure in Objecter::_finish_op(Objecter::Op*)
  • Description updated (diff)
Actions #4

Updated by Sage Weil over 8 years ago

  • Status changed from New to Duplicate

This was fixed in #10372. The fix is in later firefly releases and in hammer, but we didn't backport it to giant.
