Bug #20099

osd/filestore: osd/PGLog.cc: 911: FAILED assert(last_e.version.version < e.version.version)

Added by huanwen ren almost 7 years ago. Updated almost 7 years ago.

Status:
Need More Info
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
% Done:
0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

My Ceph cluster went down when the server was powered off,
and when I restarted my OSD, it failed in read_log.
As follows:

by client.765048.0:2510440 2017-05-25 14:37:44.210324
2017-05-27 01:14:32.791414 7fc20486f7c0 20 read_log 346'6529417 (346'6529402) modify   2/ccdae3a1/rbd_data.4e2875a574153.0000000000001ac6/head by client.576715.0:4770236 2017-05-25 14:37:28.097068
2017-05-27 01:14:32.791421 7fc20486f7c0 20 read_log 346'6529418 (346'6529417) modify   2/ccdae3a1/rbd_data.4e2875a574153.0000000000001ac6/head by client.576715.0:4770237 2017-05-25 14:37:28.097097
2017-05-27 01:14:32.791429 7fc20486f7c0 20 read_log 346'6529419 (346'6529418) modify   2/ccdae3a1/rbd_data.4e2875a574153.0000000000001ac6/head by client.576715.0:4770239 2017-05-25 14:37:28.100208
2017-05-27 01:14:32.791435 7fc20486f7c0 20 read_log 346'6529420 (346'6528484) modify   2/c173e4c1/rbd_data.5d4e9238e1f29.00000000000009a8/head by client.561926.0:2389556 2017-05-25 14:37:39.001718
2017-05-27 01:14:32.791443 7fc20486f7c0 20 read_log 346'6529421 (346'6529420) modify   2/c173e4c1/rbd_data.5d4e9238e1f29.00000000000009a8/head by client.561926.0:2389557 2017-05-25 14:37:39.003726
2017-05-27 01:14:32.791457 7fc20486f7c0 20 read_log 346'6529422 (346'6529391) modify   2/67ac01c1/rbd_data.ba87d238e1f29.0000000000000374/head by client.764086.0:136303 2017-05-25 14:37:28.961843
2017-05-27 01:14:32.791465 7fc20486f7c0 20 read_log 346'6529423 (346'6529397) modify   2/3ca0c401/rbd_data.ba86e10117899.0000000000000830/head by client.764087.0:1603325 2017-05-25 14:37:29.038499
2017-05-27 01:14:32.791473 7fc20486f7c0 20 read_log 346'6529424 (346'6529404) modify   2/ab76dbe1/rbd_data.4de9374b0dc51.00000000000004a6/head by client.561676.0:3091323 2017-05-25 14:37:29.053888
2017-05-27 01:14:32.791479 7fc20486f7c0 20 read_log 346'6529425 (346'6529424) modify   2/ab76dbe1/rbd_data.4de9374b0dc51.00000000000004a6/head by client.561676.0:3091325 2017-05-25 14:37:29.055988
2017-05-27 01:14:32.791486 7fc20486f7c0 20 read_log 346'6529426 (346'6529249) modify   2/61f5a41/rbd_data.4e2875a574153.00000000000028bb/head by client.576715.0:4770241 2017-05-25 14:37:29.283908
2017-05-27 01:14:32.791494 7fc20486f7c0 20 read_log 346'6529427 (346'6529416) modify   2/232529e1/rbd_data.ba26f5a13810.0000000000007d72/head by client.765048.0:2510476 2017-05-25 14:37:47.221941
2017-05-27 01:14:32.791502 7fc20486f7c0 20 read_log 346'6529428 (346'6529392) modify   2/763a1381/rbd_data.5d4f7594a48ee.0000000000000049/head by client.561927.0:7376282 2017-05-25 14:37:40.231966
2017-05-27 01:14:32.791510 7fc20486f7c0 20 read_log 406'6529418 (346'6514935) modify   2/709a41c1/rbd_header.bb711e3fe196/head by client.794682.0:6 2017-05-26 02:09:57.942893
2017-05-27 01:14:32.792691 7fc20486f7c0 -1 osd/PGLog.cc: In function 'static void PGLog::read_log(ObjectStore*, coll_t, coll_t, ghobject_t, const pg_info_t&, std::map<eversion_t, hobject_t>&, PGLog::IndexedLog&, pg_missing_t&, std::ostringstream&, std::set<std::basic_string<char> >*)' thread 7fc20486f7c0 time 2017-05-27 01:14:32.791516
osd/PGLog.cc: 911: FAILED assert(last_e.version.version < e.version.version)

 ceph version 0.94.9-3.el7cp (7358f71bebe44c463df4d91c2770149e812bbeaa)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x85) [0xb11da5]
 2: (PGLog::read_log(ObjectStore*, coll_t, coll_t, ghobject_t, pg_info_t const&, std::map<eversion_t, hobject_t, std::less<eversion_t>, std::allocator<std::pair<eversion_t const, hobject_t> > >&, PGLog::IndexedLog&, pg_missing_t&, std::basic_ostringstream<char, std::char_traits<char>, std::allocator<char> >&, std::set<std::string, std::less<std::string>, std::allocator<std::string> >*)+0x1a38) [0x733d88]
 3: (PG::read_state(ObjectStore*, ceph::buffer::list&)+0x34f) [0x7b73df]
 4: (OSD::load_pgs()+0xa99) [0x67c499]
 5: (OSD::init()+0x181a) [0x67ff5a]
 6: (main()+0x2aec) [0x619ecc]
 7: (__libc_start_main()+0xf5) [0x7fc202233b35]
 8: /usr/bin/ceph-osd() [0x620e99]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
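
To make the failing check concrete, here is a minimal sketch (not the actual Ceph source; the types and loop body are simplified) of the invariant that PGLog::read_log() enforces while it walks the on-disk log entries: each entry's sequence number must be strictly greater than the previous one, and only the sequence number is compared, not the epoch.

// Minimal sketch of the osd/PGLog.cc:911 check, assuming simplified types.
// The real code decodes pg_log_entry_t from omap values; only the ordering
// check is reproduced here.
#include <cassert>
#include <cstdint>
#include <vector>

struct eversion_t {
  uint64_t epoch = 0;    // e.g. 346 or 406 in the log above
  uint64_t version = 0;  // per-PG sequence number, e.g. 6529418
};

struct pg_log_entry_t {
  eversion_t version;
};

void check_log_order(const std::vector<pg_log_entry_t>& entries) {
  bool have_last = false;
  pg_log_entry_t last_e;
  for (const auto& e : entries) {
    if (have_last) {
      // This is the assertion that fires: a later entry whose sequence
      // number is not strictly greater than the previous one aborts the
      // OSD, even if its epoch is newer.
      assert(last_e.version.version < e.version.version);
    }
    last_e = e;
    have_last = true;
  }
}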

Actions #1

Updated by fang yuxiang almost 7 years ago

I don't think this is a functional issue in Ceph; maybe your local fs data is corrupted.

Are you using any block cache suite, such as EnhanceIO or bcache?

Actions #2

Updated by huanwen ren almost 7 years ago

`read_log 406'6529418` and `read_log 346'6529418` have the same sequence number (see the sketch after the listing below).

Also, ceph-kvstore-tool shows the following:

_USER_0000000000000007_USER_:0000000346.00000000000006529416
_USER_0000000000000007_USER_:0000000346.00000000000006529417
_USER_0000000000000007_USER_:0000000346.00000000000006529418
_USER_0000000000000007_USER_:0000000346.00000000000006529419
_USER_0000000000000007_USER_:0000000346.00000000000006529420
_USER_0000000000000007_USER_:0000000346.00000000000006529421
_USER_0000000000000007_USER_:0000000346.00000000000006529422
_USER_0000000000000007_USER_:0000000346.00000000000006529423
_USER_0000000000000007_USER_:0000000346.00000000000006529424
_USER_0000000000000007_USER_:0000000346.00000000000006529425
_USER_0000000000000007_USER_:0000000346.00000000000006529426
_USER_0000000000000007_USER_:0000000346.00000000000006529427
_USER_0000000000000007_USER_:0000000346.00000000000006529428
_USER_0000000000000007_USER_:0000000406.00000000000006529418
_USER_0000000000000007_USER_:0000000406.00000000000006529419
_USER_0000000000000007_USER_:_biginfo
_USER_0000000000000007_USER_:_epoch
_USER_0000000000000007_USER_:_info
_USER_0000000000000007_USER_:_infover
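
Assuming the keys above are read back in the order listed (all epoch-346 keys first, then the epoch-406 keys), here is a small worked illustration, not taken from the Ceph source, of why this listing trips the assert: the entry 406'6529418 is decoded right after 346'6529428, and because read_log compares only the sequence numbers, 6529418 < 6529428 fails the check.

// Illustrative only: the last epoch-346 entry and the first epoch-406 entry
// from the listing above, compared the way osd/PGLog.cc:911 compares them.
#include <cstdint>
#include <iostream>

int main() {
  uint64_t last_version = 6529428;  // 346'6529428, last entry of epoch 346
  uint64_t next_version = 6529418;  // 406'6529418, first entry of epoch 406

  // Only the sequence numbers are compared; the newer epoch does not help.
  if (!(last_version < next_version)) {
    std::cout << "assert(last_e.version.version < e.version.version) fails"
              << std::endl;
  }
  return 0;
}
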
Actions #3

Updated by huanwen ren almost 7 years ago

fang yuxiang wrote:

I don't think this is a functional issue in Ceph; maybe your local fs data is corrupted.

Are you using any block cache suite, such as EnhanceIO or bcache?

The local fs is XFS, and it's fine.
We don't use any cache features.

Thank you!

Actions #4

Updated by Greg Farnum almost 7 years ago

  • Project changed from Ceph to RADOS
  • Status changed from New to Need More Info
  • Priority changed from Urgent to Normal

Does this still exist, or is it all cleaned up now? The repeating versions are a little weird, but that's not enough data to diagnose the issue.
