Bug #20099
osd/filestore: osd/PGLog.cc: 911: FAILED assert(last_e.version.version < e.version.version)
Description
My Ceph cluster went down when the server was powered off,
and when I restarted my OSD, it failed in read_log.
As follows:
```
by client.765048.0:2510440 2017-05-25 14:37:44.210324
2017-05-27 01:14:32.791414 7fc20486f7c0 20 read_log 346'6529417 (346'6529402) modify 2/ccdae3a1/rbd_data.4e2875a574153.0000000000001ac6/head by client.576715.0:4770236 2017-05-25 14:37:28.097068
2017-05-27 01:14:32.791421 7fc20486f7c0 20 read_log 346'6529418 (346'6529417) modify 2/ccdae3a1/rbd_data.4e2875a574153.0000000000001ac6/head by client.576715.0:4770237 2017-05-25 14:37:28.097097
2017-05-27 01:14:32.791429 7fc20486f7c0 20 read_log 346'6529419 (346'6529418) modify 2/ccdae3a1/rbd_data.4e2875a574153.0000000000001ac6/head by client.576715.0:4770239 2017-05-25 14:37:28.100208
2017-05-27 01:14:32.791435 7fc20486f7c0 20 read_log 346'6529420 (346'6528484) modify 2/c173e4c1/rbd_data.5d4e9238e1f29.00000000000009a8/head by client.561926.0:2389556 2017-05-25 14:37:39.001718
2017-05-27 01:14:32.791443 7fc20486f7c0 20 read_log 346'6529421 (346'6529420) modify 2/c173e4c1/rbd_data.5d4e9238e1f29.00000000000009a8/head by client.561926.0:2389557 2017-05-25 14:37:39.003726
2017-05-27 01:14:32.791457 7fc20486f7c0 20 read_log 346'6529422 (346'6529391) modify 2/67ac01c1/rbd_data.ba87d238e1f29.0000000000000374/head by client.764086.0:136303 2017-05-25 14:37:28.961843
2017-05-27 01:14:32.791465 7fc20486f7c0 20 read_log 346'6529423 (346'6529397) modify 2/3ca0c401/rbd_data.ba86e10117899.0000000000000830/head by client.764087.0:1603325 2017-05-25 14:37:29.038499
2017-05-27 01:14:32.791473 7fc20486f7c0 20 read_log 346'6529424 (346'6529404) modify 2/ab76dbe1/rbd_data.4de9374b0dc51.00000000000004a6/head by client.561676.0:3091323 2017-05-25 14:37:29.053888
2017-05-27 01:14:32.791479 7fc20486f7c0 20 read_log 346'6529425 (346'6529424) modify 2/ab76dbe1/rbd_data.4de9374b0dc51.00000000000004a6/head by client.561676.0:3091325 2017-05-25 14:37:29.055988
2017-05-27 01:14:32.791486 7fc20486f7c0 20 read_log 346'6529426 (346'6529249) modify 2/61f5a41/rbd_data.4e2875a574153.00000000000028bb/head by client.576715.0:4770241 2017-05-25 14:37:29.283908
2017-05-27 01:14:32.791494 7fc20486f7c0 20 read_log 346'6529427 (346'6529416) modify 2/232529e1/rbd_data.ba26f5a13810.0000000000007d72/head by client.765048.0:2510476 2017-05-25 14:37:47.221941
2017-05-27 01:14:32.791502 7fc20486f7c0 20 read_log 346'6529428 (346'6529392) modify 2/763a1381/rbd_data.5d4f7594a48ee.0000000000000049/head by client.561927.0:7376282 2017-05-25 14:37:40.231966
2017-05-27 01:14:32.791510 7fc20486f7c0 20 read_log 406'6529418 (346'6514935) modify 2/709a41c1/rbd_header.bb711e3fe196/head by client.794682.0:6 2017-05-26 02:09:57.942893
2017-05-27 01:14:32.792691 7fc20486f7c0 -1 osd/PGLog.cc: In function 'static void PGLog::read_log(ObjectStore*, coll_t, coll_t, ghobject_t, const pg_info_t&, std::map<eversion_t, hobject_t>&, PGLog::IndexedLog&, pg_missing_t&, std::ostringstream&, std::set<std::basic_string<char> >*)' thread 7fc20486f7c0 time 2017-05-27 01:14:32.791516
osd/PGLog.cc: 911: FAILED assert(last_e.version.version < e.version.version)
 ceph version 0.94.9-3.el7cp (7358f71bebe44c463df4d91c2770149e812bbeaa)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x85) [0xb11da5]
 2: (PGLog::read_log(ObjectStore*, coll_t, coll_t, ghobject_t, pg_info_t const&, std::map<eversion_t, hobject_t, std::less<eversion_t>, std::allocator<std::pair<eversion_t const, hobject_t> > >&, PGLog::IndexedLog&, pg_missing_t&, std::basic_ostringstream<char, std::char_traits<char>, std::allocator<char> >&, std::set<std::string, std::less<std::string>, std::allocator<std::string> >*)+0x1a38) [0x733d88]
 3: (PG::read_state(ObjectStore*, ceph::buffer::list&)+0x34f) [0x7b73df]
 4: (OSD::load_pgs()+0xa99) [0x67c499]
 5: (OSD::init()+0x181a) [0x67ff5a]
 6: (main()+0x2aec) [0x619ecc]
 7: (__libc_start_main()+0xf5) [0x7fc202233b35]
 8: /usr/bin/ceph-osd() [0x620e99]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
```
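For context, the assert fires while read_log replays the persisted PG log: each entry's version counter must be strictly greater than the previous entry's. The following is a minimal standalone sketch of that invariant, not Ceph's actual read_log; only the `eversion_t` field names and the asserted condition come from the report, everything else is illustrative:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Simplified stand-in for Ceph's eversion_t: epoch'version, e.g. 346'6529428.
struct eversion_t {
  uint64_t epoch;
  uint64_t version;
};

// Simplified stand-in for a PG log entry; only the version matters here.
struct pg_log_entry_t {
  eversion_t version;
};

// Sketch of the invariant enforced while replaying log entries:
// the version counter must strictly increase, regardless of epoch.
void check_log(const std::vector<pg_log_entry_t>& entries) {
  for (size_t i = 1; i < entries.size(); ++i) {
    const pg_log_entry_t& last_e = entries[i - 1];
    const pg_log_entry_t& e = entries[i];
    // The condition from osd/PGLog.cc:911 in the report.
    assert(last_e.version.version < e.version.version);
  }
}

int main() {
  // The last two entries from the log above: 346'6529428 followed by
  // 406'6529418. The epoch moved forward but the version counter went
  // backwards, so the assert aborts, mirroring the crash.
  std::vector<pg_log_entry_t> entries = {
    {{346, 6529428}},
    {{406, 6529418}},
  };
  check_log(entries);
}
```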
Updated by fang yuxiang almost 7 years ago
I think this is not a functional issue in Ceph; maybe your local fs data is corrupted.
Are you using any block cache suite, such as EnhanceIO or bcache?
Updated by huanwen ren almost 7 years ago
`read_log 406'6529418` and `read_log 346'6529418` have the same version counter (6529418) under two different epochs, and the log jumps back from 346'6529428 to 406'6529418, which is exactly what the assert (versions must strictly increase) rejects.
Also, ceph-kvstore-tool shows the following keys:
```
_USER_0000000000000007_USER_:0000000346.00000000000006529416
_USER_0000000000000007_USER_:0000000346.00000000000006529417
_USER_0000000000000007_USER_:0000000346.00000000000006529418
_USER_0000000000000007_USER_:0000000346.00000000000006529419
_USER_0000000000000007_USER_:0000000346.00000000000006529420
_USER_0000000000000007_USER_:0000000346.00000000000006529421
_USER_0000000000000007_USER_:0000000346.00000000000006529422
_USER_0000000000000007_USER_:0000000346.00000000000006529423
_USER_0000000000000007_USER_:0000000346.00000000000006529424
_USER_0000000000000007_USER_:0000000346.00000000000006529425
_USER_0000000000000007_USER_:0000000346.00000000000006529426
_USER_0000000000000007_USER_:0000000346.00000000000006529427
_USER_0000000000000007_USER_:0000000346.00000000000006529428
_USER_0000000000000007_USER_:0000000406.00000000000006529418
_USER_0000000000000007_USER_:0000000406.00000000000006529419
_USER_0000000000000007_USER_:_biginfo
_USER_0000000000000007_USER_:_epoch
_USER_0000000000000007_USER_:_info
_USER_0000000000000007_USER_:_infover
```
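These keys sort by epoch and then version, so the two 0000000406.* entries come after every 0000000346.* entry even though their version counters are lower; replaying them in key order is what makes the counter go backwards. A small sketch of that observation (the `parse_log_key` helper is hypothetical, assuming only the key layout shown above; it is not part of ceph-kvstore-tool):

```cpp
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

// Parses a PG-log key of the form shown above, e.g.
// "_USER_0000000000000007_USER_:0000000406.00000000000006529418",
// into (epoch, version). Hypothetical helper, for illustration only.
bool parse_log_key(const std::string& key, uint64_t* epoch, uint64_t* version) {
  std::string::size_type colon = key.find(':');
  std::string::size_type dot = key.find('.', colon);
  if (colon == std::string::npos || dot == std::string::npos) return false;
  *epoch = std::stoull(key.substr(colon + 1, dot - colon - 1));
  *version = std::stoull(key.substr(dot + 1));
  return true;
}

int main() {
  // The tail of the dump above, in the order the keys sort on disk.
  std::vector<std::string> keys = {
    "_USER_0000000000000007_USER_:0000000346.00000000000006529427",
    "_USER_0000000000000007_USER_:0000000346.00000000000006529428",
    "_USER_0000000000000007_USER_:0000000406.00000000000006529418",
    "_USER_0000000000000007_USER_:0000000406.00000000000006529419",
  };
  uint64_t last_ver = 0;
  for (const std::string& k : keys) {
    uint64_t epoch = 0, ver = 0;
    if (!parse_log_key(k, &epoch, &ver)) continue;
    // Flag any entry whose version counter does not strictly increase,
    // i.e. the same condition the read_log assert enforces.
    if (last_ver != 0 && ver <= last_ver)
      std::printf("version went backwards: %llu'%llu after %llu\n",
                  (unsigned long long)epoch, (unsigned long long)ver,
                  (unsigned long long)last_ver);
    last_ver = ver;
  }
}
```

Run against the four keys above, this flags 406'6529418 as arriving after version 6529428, the same pair the assert trips on.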
Updated by huanwen ren almost 7 years ago
fang yuxiang wrote:

> I think this is not a functional issue in Ceph; maybe your local fs data is corrupted.
> Are you using any block cache suite, such as EnhanceIO or bcache?
The local fs is XFS, and it's fine.
We don't use any cache features.
Thank you!
Updated by Greg Farnum almost 7 years ago
- Project changed from Ceph to RADOS
- Status changed from New to Need More Info
- Priority changed from Urgent to Normal
Does this still exist, or is it all cleaned up now? The repeating versions are a little weird, but that's not enough data to diagnose the issue.