Project

General

Profile

Actions

Bug #4216

closed

osd: dbojectmap incorrectly skipping ops

Added by Sage Weil about 11 years ago. Updated almost 11 years ago.

Status:
Resolved
Priority:
Normal
Assignee:
Category:
OSD
Target version:
-
% Done:

0%

Source:
Q/A
Tags:
Backport:
Regression:
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

ubuntu@teuthology:/a/sage-ooo6/8248 crashed with a bad array index in remove_object_with_snap_hardlinks because a snapped object had an empty oi.snaps vector. this appears to be because when the object was written (first created) a bit earlier the setxattr failed with

2013-02-20 21:43:56.498875 7f6a9d4e1700 10 filestore(/var/lib/ceph/osd/ceph-1) _do_transaction on 0x3a93838
2013-02-20 21:43:56.498881 7f6a9d4e1700 15 filestore(/var/lib/ceph/osd/ceph-1) clone 0.7_head/d30883e7/plana605797-98/head//0 -> 0.7_head/d30883e7/plana605797-98/13c//0
2013-02-20 21:43:56.499046 7f6a9d4e1700 20 filestore(/var/lib/ceph/osd/ceph-1) _do_clone_range 0~3204735 to 0
2013-02-20 21:43:56.499058 7f6a9d4e1700 20 filestore(/var/lib/ceph/osd/ceph-1) _do_clone_range using copy
2013-02-20 21:43:56.499060 7f6a9d4e1700 20 filestore(/var/lib/ceph/osd/ceph-1) _do_copy_range 0~3204735 to 0
2013-02-20 21:43:56.503670 7f6a9d4e1700 20 filestore(/var/lib/ceph/osd/ceph-1) _do_copy_range 0~3204735 to 0 = 59007
2013-02-20 21:43:56.503683 7f6a9d4e1700 20 filestore(/var/lib/ceph/osd/ceph-1) objectmap clone
2013-02-20 21:43:56.503760 7f6a9d4e1700 20 filestore dbobjectmap: seq is 1992
2013-02-20 21:43:56.503778 7f6a9d4e1700 20 filestore dbobjectmap: seq is 1993
2013-02-20 21:43:56.503789 7f6a9d4e1700 20 filestore set_header: setting seq 1594
2013-02-20 21:43:56.503799 7f6a9d4e1700 20 filestore set_map_header: setting 1991 hoid d30883e7/plana605797-98/head//0 parent seq 1594
2013-02-20 21:43:56.503811 7f6a9d4e1700 20 filestore set_map_header: setting 1992 hoid d30883e7/plana605797-98/13c//0 parent seq 1594
2013-02-20 21:43:56.503907 7f6a9d4e1700 20 filestore(/var/lib/ceph/osd/ceph-1) fgetattrs 45 getting '_'
2013-02-20 21:43:56.503927 7f6a9d4e1700 20 filestore(/var/lib/ceph/osd/ceph-1) fgetattrs 45 getting 'snapset'
2013-02-20 21:43:56.503976 7f6a9d4e1700 15 filestore(/var/lib/ceph/osd/ceph-1) setattrs 0.7_head/d30883e7/plana605797-98/13c//0
2013-02-20 21:43:56.504059 7f6a9d4e1700 10 filestore hoid: d30883e7/plana605797-98/13c//0 skipping op, *spos 13278.1.0 <= header.spos 13278.1.0
2013-02-20 21:43:56.504100 7f6a9d4e1700 10 filestore hoid: d30883e7/plana605797-98/13c//0 skipping op, *spos 13278.1.0 <= header.spos 13278.1.0
2013-02-20 21:43:56.504129 7f6a9d4e1700 10 filestore(/var/lib/ceph/osd/ceph-1) setattrs 0.7_head/d30883e7/plana605797-98/13c//0 = 31
2013-02-20 21:43:56.504135 7f6a9d4e1700 10 filestore(/var/lib/ceph/osd/ceph-1) _set_replay_guard 13278.1.0
2013-02-20 21:43:56.573789 7f6a9dce2700 20 filestore dbobjectmap: seq is 1993

note that this is the first time this object appears on this particular OSD.
Actions #1

Updated by Sage Weil about 11 years ago

from gdb:

(gdb) p oi
$5 = {
  soid = {
    oid = {
      name = {
        static npos = <optimized out>, 
        _M_dataplus = {
          <std::allocator<char>> = {
            <__gnu_cxx::new_allocator<char>> = {<No data fields>}, <No data fields>}, 
          members of std::basic_string<char, std::char_traits<char>, std::allocator<char> >::_Alloc_hider: 
          _M_p = 0x2c4d738 "plana605797-98" 
        }
      }
    }, 
    snap = {
      val = 18446744073709551614
    }, 
    hash = 3540550631, 
    max = false, 
    pool = 0, 
    nspace = {
      static npos = <optimized out>, 
      _M_dataplus = {
        <std::allocator<char>> = {
          <__gnu_cxx::new_allocator<char>> = {<No data fields>}, <No data fields>}, 
        members of std::basic_string<char, std::char_traits<char>, std::allocator<char> >::_Alloc_hider: 
        _M_p = 0xc4ed98 "" 
      }
    }, 
    key = {
      static npos = <optimized out>, 
      _M_dataplus = {
        <std::allocator<char>> = {
          <__gnu_cxx::new_allocator<char>> = {<No data fields>}, <No data fields>}, 
        members of std::basic_string<char, std::char_traits<char>, std::allocator<char> >::_Alloc_hider: 
        _M_p = 0xc4ed98 "" 
      }
    }
  }, 
  oloc = {
    pool = 0, 
    key = {
      static npos = <optimized out>, 
      _M_dataplus = {
        <std::allocator<char>> = {
          <__gnu_cxx::new_allocator<char>> = {<No data fields>}, <No data fields>}, 
        members of std::basic_string<char, std::char_traits<char>, std::allocator<char> >::_Alloc_hider: 
        _M_p = 0xc4ed98 "" 
      }
    }
  }, 
  category = {
    static npos = <optimized out>, 
    _M_dataplus = {
---Type <return> to continue, or q <return> to quit---
      <std::allocator<char>> = {
        <__gnu_cxx::new_allocator<char>> = {<No data fields>}, <No data fields>}, 
      members of std::basic_string<char, std::char_traits<char>, std::allocator<char> >::_Alloc_hider: 
      _M_p = 0xc4ed98 "" 
    }
  }, 
  version = {
    version = 9, 
    epoch = 7, 
    __pad = 0
  }, 
  prior_version = {
    version = 8, 
    epoch = 7, 
    __pad = 0
  }, 
  user_version = {
    version = 9, 
    epoch = 7, 
    __pad = 0
  }, 
  last_reqid = {
    name = {
      _type = 8 '\b', 
      _num = 4143, 
      static TYPE_MON = 1, 
      static TYPE_MDS = 2, 
      static TYPE_OSD = 4, 
      static TYPE_CLIENT = 8, 
      static NEW = -1
    }, 
    tid = 262, 
    inc = 0
  }, 
  size = 3204735, 
  mtime = {
    tv = {
      tv_sec = 1361424682, 
      tv_nsec = 196749000
    }
  }, 
  lost = false, 
  wrlock_by = {
    name = {
      _type = 0 '\000', 
      _num = 0, 
      static TYPE_MON = 1, 
      static TYPE_MDS = 2, 
      static TYPE_OSD = 4, 
      static TYPE_CLIENT = 8, 
      static NEW = -1
    }, 
    tid = 0, 
    inc = 0
  }, 
  snaps = {
    <std::_Vector_base<snapid_t, std::allocator<snapid_t> >> = {
      _M_impl = {
        <std::allocator<snapid_t>> = {
---Type <return> to continue, or q <return> to quit---
          <__gnu_cxx::new_allocator<snapid_t>> = {<No data fields>}, <No data fields>}, 
        members of std::_Vector_base<snapid_t, std::allocator<snapid_t> >::_Vector_impl: 
        _M_start = 0x0, 
        _M_finish = 0x0, 
        _M_end_of_storage = 0x0
      }
    }, <No data fields>}, 
  truncate_seq = 0, 
  truncate_size = 0, 
  watchers = {
    _M_t = {
      _M_impl = {
        <std::allocator<std::_Rb_tree_node<std::pair<std::pair<unsigned long, entity_name_t> const, watch_info_t> > >> = {
          <__gnu_cxx::new_allocator<std::_Rb_tree_node<std::pair<std::pair<unsigned long, entity_name_t> const, watch_info_t> > >> = {<No data fields>}, <No data fields>}, 
        members of std::_Rb_tree<std::pair<unsigned long, entity_name_t>, std::pair<std::pair<unsigned long, entity_name_t> const, watch_info_t>, std::_Select1st<std::pair<std::pair<unsigned long, entity_name_t> const, watch_info_t> >, std::less<std::pair<unsigned long, entity_name_t> >, std::allocator<std::pair<std::pair<unsigned long, entity_name_t> const, watch_info_t> > >::_Rb_tree_impl<std::less<std::pair<unsigned long, entity_name_t> >, false>: 
        _M_key_compare = {
          <std::binary_function<std::pair<unsigned long, entity_name_t>, std::pair<unsigned long, entity_name_t>, bool>> = {<No data fields>}, <No data fields>}, 
        _M_header = {
          _M_color = std::_S_red, 
          _M_parent = 0x0, 
          _M_left = 0x7f5989ad1930, 
          _M_right = 0x7f5989ad1930
        }, 
        _M_node_count = 0
      }
    }
  }, 
  uses_tmap = false
}
(gdb) 
(gdb) p oi.snaps
$6 = {
  <std::_Vector_base<snapid_t, std::allocator<snapid_t> >> = {
    _M_impl = {
      <std::allocator<snapid_t>> = {
        <__gnu_cxx::new_allocator<snapid_t>> = {<No data fields>}, <No data fields>}, 
      members of std::_Vector_base<snapid_t, std::allocator<snapid_t> >::_Vector_impl: 
      _M_start = 0x0, 
      _M_finish = 0x0, 
      _M_end_of_storage = 0x0
    }
  }, <No data fields>}

but not that when the object was written originally,
2013-02-20 21:43:56.051200 7f6a964d3700 10 osd.1 pg_epoch: 411 pg[0.7( v 406'134 (0'0,406'134] local-les=317 n=15 ec=1 les/c 317/322 316/316/316) [1,4] r=0 lpr=316 lcod 405'132 mlcod 405'132 active+clean] do_osd_op d30883e7/plana605797-98/head//0 [rollback fb]
2013-02-20 21:43:56.051222 7f6a964d3700 10 osd.1 pg_epoch: 411 pg[0.7( v 406'134 (0'0,406'134] local-les=317 n=15 ec=1 les/c 317/322 316/316/316) [1,4] r=0 lpr=316 lcod 405'132 mlcod 405'132 active+clean] do_osd_op  rollback fb
2013-02-20 21:43:56.051243 7f6a964d3700 10 osd.1 pg_epoch: 411 pg[0.7( v 406'134 (0'0,406'134] local-les=317 n=15 ec=1 les/c 317/322 316/316/316) [1,4] r=0 lpr=316 lcod 405'132 mlcod 405'132 active+clean] _rollback_to d30883e7/plana605797-98/head//0 snapid fb
2013-02-20 21:43:56.051265 7f6a964d3700 10 osd.1 pg_epoch: 411 pg[0.7( v 406'134 (0'0,406'134] local-les=317 n=15 ec=1 les/c 317/322 316/316/316) [1,4] r=0 lpr=316 lcod 405'132 mlcod 405'132 active+clean] get_snapset_context plana605797-98 1 -> 2
2013-02-20 21:43:56.051285 7f6a964d3700 10 osd.1 pg_epoch: 411 pg[0.7( v 406'134 (0'0,406'134] local-les=317 n=15 ec=1 les/c 317/322 316/316/316) [1,4] r=0 lpr=316 lcod 405'132 mlcod 405'132 active+clean] find_object_context d30883e7/plana605797-98/fb//0 @fb snap
set 0=[]:[]+head
2013-02-20 21:43:56.051309 7f6a964d3700 10 osd.1 pg_epoch: 411 pg[0.7( v 406'134 (0'0,406'134] local-les=317 n=15 ec=1 les/c 317/322 316/316/316) [1,4] r=0 lpr=316 lcod 405'132 mlcod 405'132 active+clean] get_object_context 0x503c780 d30883e7/plana605797-98/head/
/0 1 -> 2
2013-02-20 21:43:56.051331 7f6a964d3700 10 osd.1 pg_epoch: 411 pg[0.7( v 406'134 (0'0,406'134] local-les=317 n=15 ec=1 les/c 317/322 316/316/316) [1,4] r=0 lpr=316 lcod 405'132 mlcod 405'132 active+clean] find_object_context  d30883e7/plana605797-98/head//0 want 
fb > snapset seq 0 -- HIT d30883e7/plana605797-98/head//0
2013-02-20 21:43:56.051354 7f6a964d3700 10 osd.1 pg_epoch: 411 pg[0.7( v 406'134 (0'0,406'134] local-les=317 n=15 ec=1 les/c 317/322 316/316/316) [1,4] r=0 lpr=316 lcod 405'132 mlcod 405'132 active+clean] put_snapset_context plana605797-98 2 -> 1
2013-02-20 21:43:56.051376 7f6a964d3700 10 osd.1 pg_epoch: 411 pg[0.7( v 406'134 (0'0,406'134] local-les=317 n=15 ec=1 les/c 317/322 316/316/316) [1,4] r=0 lpr=316 lcod 405'132 mlcod 405'132 active+clean] put_object_context 0x503c780 d30883e7/plana605797-98/head/
/0 2 -> 1
2013-02-20 21:43:56.051398 7f6a964d3700 15 osd.1 pg_epoch: 411 pg[0.7( v 406'134 (0'0,406'134] local-les=317 n=15 ec=1 les/c 317/322 316/316/316) [1,4] r=0 lpr=316 lcod 405'132 mlcod 405'132 active+clean] do_osd_op_effects on session 0x2c0a0e0
2013-02-20 21:43:56.051419 7f6a964d3700 20 osd.1 pg_epoch: 411 pg[0.7( v 406'134 (0'0,406'134] local-les=317 n=15 ec=1 les/c 317/322 316/316/316) [1,4] r=0 lpr=316 lcod 405'132 mlcod 405'132 active+clean] make_writeable d30883e7/plana605797-98/head//0 snapset=0x4
0034e0  snapc=13c=[13c,134,133,131,130,12b,129,127,126,120,11f,11e,11b,113,112,10a,105,102,100,fb,f7,f3,d8,ae,96]
2013-02-20 21:43:56.051523 7f6a964d3700 20 osd.1 pg_epoch: 411 pg[0.7( v 406'134 (0'0,406'134] local-les=317 n=15 ec=1 les/c 317/322 316/316/316) [1,4] r=0 lpr=316 lcod 405'132 mlcod 405'132 active+clean] write_info bigbl 428
2013-02-20 21:43:56.051550 7f6a964d3700 10 osd.1 pg_epoch: 411 pg[0.7( v 406'134 (0'0,406'134] local-les=317 n=15 ec=1 les/c 317/322 316/316/316) [1,4] r=0 lpr=316 lcod 405'132 mlcod 405'132 active+clean] create_snap_collection 0.7_13c, set now [96~1,ae~1,d8~1,f3
~1,113~1,120~1,13c~1]
2013-02-20 21:43:56.051587 7f6a964d3700 10 osd.1 pg_epoch: 411 pg[0.7( v 406'134 (0'0,406'134] local-les=317 n=15 ec=1 les/c 317/322 316/316/316) [1,4] r=0 lpr=316 lcod 405'132 mlcod 405'132 active+clean]  cloning v 7'9 to d30883e7/plana605797-98/13c//0 v 411'135
 snaps=[13c,134,133,131,130,12b,129,127,126,120,11f,11e,11b,113,112,10a,105,102,100,fb,f7,f3,d8,ae,96]
2013-02-20 21:43:56.051624 7f6a964d3700 20 osd.1 pg_epoch: 411 pg[0.7( v 406'134 (0'0,406'134] local-les=317 n=15 ec=1 les/c 317/322 316/316/316) [1,4] r=0 lpr=316 lcod 405'132 mlcod 405'132 active+clean] make_writeable d30883e7/plana605797-98/head//0 done, snaps
et=13c=[13c,134,133,131,130,12b,129,127,126,120,11f,11e,11b,113,112,10a,105,102,100,fb,f7,f3,d8,ae,96]:[13c]+head
2013-02-20 21:43:56.051659 7f6a964d3700 10 osd.1 pg_epoch: 411 pg[0.7( v 406'134 (0'0,406'134] local-les=317 n=15 ec=1 les/c 317/322 316/316/316) [1,4] r=0 lpr=316 lcod 405'132 mlcod 405'132 active+clean]  set mtime to 2013-02-20 21:43:56.051131
2013-02-20 21:43:56.051693 7f6a964d3700 10 osd.1 pg_epoch: 411 pg[0.7( v 406'134 (0'0,406'134] local-les=317 n=15 ec=1 les/c 317/322 316/316/316) [1,4] r=0 lpr=316 lcod 405'132 mlcod 405'132 active+clean]  final snapset 13c=[13c,134,133,131,130,12b,129,127,126,12
0,11f,11e,11b,113,112,10a,105,102,100,fb,f7,f3,d8,ae,96]:[13c]+head in d30883e7/plana605797-98/head//0
2013-02-20 21:43:56.051733 7f6a964d3700 10 osd.1 pg_epoch: 411 pg[0.7( v 406'134 (0'0,406'134] local-les=317 n=16 ec=1 les/c 317/322 316/316/316) [1,4] r=0 lpr=316 lcod 405'132 mlcod 405'132 active+clean] append_log log(0'0,406'134] [411'135 (7'9) clone    d30883
e7/plana605797-98/13c//0 by client.4143.0:3045 2013-02-20 21:31:22.196749 snaps [13c,134,133,131,130,12b,129,127,126,120,11f,11e,11b,113,112,10a,105,102,100,fb,f7,f3,d8,ae,96],411'136 (7'9) modify   d30883e7/plana605797-98/head//0 by client.4143.0:3045 2013-02-20
 21:43:56.051131]
2013-02-20 21:43:56.051797 7f6a964d3700 10 osd.1 pg_epoch: 411 pg[0.7( v 411'135 (0'0,411'135] local-les=317 n=16 ec=1 les/c 317/322 316/316/316) [1,4] r=0 lpr=316 luod=406'134 lcod 405'132 mlcod 405'132 active+clean] add_log_entry 411'135 (7'9) clone    d30883e7
/plana605797-98/13c//0 by client.4143.0:3045 2013-02-20 21:31:22.196749 snaps [13c,134,133,131,130,12b,129,127,126,120,11f,11e,11b,113,112,10a,105,102,100,fb,f7,f3,d8,ae,96]
2013-02-20 21:43:56.051861 7f6a964d3700 10 osd.1 pg_epoch: 411 pg[0.7( v 411'136 (0'0,411'136] local-les=317 n=16 ec=1 les/c 317/322 316/316/316) [1,4] r=0 lpr=316 luod=406'134 lcod 405'132 mlcod 405'132 active+clean] add_log_entry 411'136 (7'9) modify   d30883e7
/plana605797-98/head//0 by client.4143.0:3045 2013-02-20 21:43:56.051131

note that the version is 411'135 there, and the prior version is 7'9.. which is what is in the oi xattr. basically, the rollback in DBObjectMap seems to have not worked.

Actions #2

Updated by Sage Weil about 11 years ago

root@plana46:~# gdb /usr/bin/ceph-osd /tmp/1361425499.5801.core
if you want to look at hte core

Actions #3

Updated by Samuel Just about 11 years ago

A bit below continuing on the same thread (7f6a9d4e1700):

2013-02-20 21:43:56.597514 7f6a9d4e1700 20 filestore dbobjectmap: seq is 1993
2013-02-20 21:43:56.597596 7f6a9e4e3700 5 filestore(/var/lib/ceph/osd/ceph-1) _journaled_ahead 0x5c23280 seq 13281 osr(default 0x28dc7d0) 0x4c19750
2013-02-20 21:43:56.597608 7f6a9d4e1700 10 filestore hoid: d30883e7/plana605797-98/13c//0 setting spos to 13278.1.0
2013-02-20 21:43:56.597616 7f6a9e4e3700 5 filestore(/var/lib/ceph/osd/ceph-1) queue_op 0x5c23280 seq 13281 osr(default 0x28dc7d0) 5304 bytes (queue has 28 ops and 31507 bytes)
2013-02-20 21:43:56.597617 7f6a9d4e1700 20 filestore set_map_header: setting 1992 hoid d30883e7/plana605797-98/13c//0 parent seq 1594

Above, we set the spos to 13278.1.0 at the end of the clone (for make_writeable). 13278.1.1 would have set the object_info attr for the clone. However, this is the last we hear from this thread before the OSD restarts. Note, we have _journaled_ahead up to seq 13281.

After the replay:

2013-02-20 21:44:53.074062 7f599b887780 10 filestore(/var/lib/ceph/osd/ceph-1) _do_transaction on 0x296cbd0
2013-02-20 21:44:53.074067 7f599b887780 15 filestore(/var/lib/ceph/osd/ceph-1) _omap_setkeys meta/16ef7597/infos/head//-1
2013-02-20 21:44:53.074136 7f599b887780 10 filestore hoid: 16ef7597/infos/head//-1 not skipping op, *spos 13263.0.0
2013-02-20 21:44:53.074145 7f599b887780 10 filestore > header.spos 0.0.0
2013-02-20 21:44:53.074248 7f599b887780 1 journal _open /var/lib/ceph/osd/ceph-1/journal fd 22: 104857600 bytes, block size 4096 bytes, directio = 1, aio = 1
2013-02-20 21:44:53.074610 7f5994ae9700 20 filestore(/var/lib/ceph/osd/ceph-1) flusher_entry start
2013-02-20 21:44:53.074631 7f5994ae9700 20 filestore(/var/lib/ceph/osd/ceph-1) flusher_entry sleeping
2013-02-20 21:44:53.074721 7f599b887780 5 filestore(/var/lib/ceph/osd/ceph-1) umount /var/lib/ceph/osd/ceph-1
2013-02-20 21:44:53.074757 7f5997aef700 20 filestore(/var/lib/ceph/osd/ceph-1) sync_entry force_sync set
2013-02-20 21:44:53.074808 7f5997aef700 15 filestore(/var/lib/ceph/osd/ceph-1) sync_entry committing 13263 sync_epoch 1
2013-02-20 21:44:53.074805 7f5994ae9700 20 filestore(/var/lib/ceph/osd/ceph-1) flusher_entry awoke
2013-02-20 21:44:53.074814 7f5994ae9700 20 filestore(/var/lib/ceph/osd/ceph-1) flusher_entry finish
2013-02-20 21:44:53.074828 7f5997aef700 15 filestore(/var/lib/ceph/osd/ceph-1) sync_entry doing a full sync (syncfs(2) if possible)
2013-02-20 21:44:53.180041 7f5997aef700 10 filestore(/var/lib/ceph/osd/ceph-1) sync_entry commit took 0.105250, interval was 0.163580
2013-02-20 21:44:53.180076 7f5997aef700 15 filestore(/var/lib/ceph/osd/ceph-1) sync_entry committed to op_seq 13263
2013-02-20 21:44:53.180519 7f599b887780 1 journal close /var/lib/ceph/osd/ceph-1/journal

we have replayed up to 13263.0.0 and stopped rather than reading up to 13281. The rest of the make_writeable transaction in 13278 including the object_info attr set did not get replayed. Still looking.

Actions #4

Updated by Sage Weil almost 11 years ago

  • Status changed from 12 to Resolved
Actions

Also available in: Atom PDF