Bug #13802

closed

osd/osd_types.cc: 459: FAILED assert(m_seed < old_pg_num)

Added by Richard Arends over 8 years ago. Updated over 8 years ago.

Status: Rejected
Priority: High
Assignee: -
Category: -
Target version: -
% Done: 0%
Source: Community (user)
Tags:
Backport:
Regression: No
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Increasing the number of placement groups crashes OpenStack instances. This bug should have been fixed in version 0.94.4 (http://tracker.ceph.com/issues/10399). We are running 0.94.5, so it looks like the bug is still present.

Messages from the auth log:
Nov 16 13:10:16 cephmon0 sudo: <username> : TTY=pts/3 ; PWD=/home/<username> ; USER=root ; COMMAND=/usr/bin/ceph osd pool set openstack_instances pg_num 576
Nov 16 13:10:40 cephmon0 sudo: <username> : TTY=pts/3 ; PWD=/home/<username> ; USER=root ; COMMAND=/usr/bin/ceph osd pool set openstack_instances pgp_num 576

Messages from Ceph showing the direct result of increasing the number of PGs on the pool openstack_instances:
2015-11-16 13:10:16.726543 mon.0 <ip address>:6789/0 2216469 : cluster [INF] osdmap e27212: 1296 osds: 1260 up, 1260 in
2015-11-16 13:10:16.737443 mon.0 <ip address>:6789/0 2216470 : cluster [INF] pgmap v6424130: 2368 pgs: 64 creating, 2282 active+clean, 22 active+clean+scrubbing; 344 TB data, 1038 TB used, 3537 TB / 4575 TB avail; 48453 kB/s rd, 162 MB/s wr, 1114 op/s

/var/log/libvirt/qemu/instance-<node id>.log :: Elasticsearch node 12:
osd/osd_types.cc: In function 'bool pg_t::is_split(unsigned int, unsigned int, std::set<pg_t>*) const' thread 7f4737b0d700 time 2015-11-16 13:10:16.747109
osd/osd_types.cc: 459: FAILED assert(m_seed < old_pg_num)
ceph version 0.94.3 (95cefea9fd9ab740263bf8bb4796fd864d9afe2b)
1: (()+0x1541ab) [0x7f47484471ab]
2: (()+0x2237d1) [0x7f47485167d1]
3: (()+0x2238ad) [0x7f47485168ad]
4: (()+0xc5709) [0x7f47483b8709]
5: (()+0xdc7ac) [0x7f47483cf7ac]
6: (()+0xdcffa) [0x7f47483cfffa]
7: (()+0xde562) [0x7f47483d1562]
8: (()+0xe42df) [0x7f47483d72df]
9: (()+0x2c4949) [0x7f47485b7949]
10: (()+0x2f236d) [0x7f47485e536d]
11: (()+0x8182) [0x7f4743f6d182]
12: (clone()+0x6d) [0x7f4743c9a47d]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
terminate called after throwing an instance of 'ceph::FailedAssertion'
2015-11-16 12:10:17.639+0000: shutting down

/var/log/libvirt/qemu/instance-<node id>.log :: Elasticsearch node 8:
osd/osd_types.cc: In function 'bool pg_t::is_split(unsigned int, unsigned int, std::set<pg_t>*) const' thread 7f066daa6700 time 2015-11-16 13:10:16.733543
osd/osd_types.cc: 459: FAILED assert(m_seed < old_pg_num)
ceph version 0.94.3 (95cefea9fd9ab740263bf8bb4796fd864d9afe2b)
1: (()+0x1541ab) [0x7f067e3e01ab]
2: (()+0x2237d1) [0x7f067e4af7d1]
3: (()+0x2238ad) [0x7f067e4af8ad]
4: (()+0xc5709) [0x7f067e351709]
5: (()+0xdc7ac) [0x7f067e3687ac]
6: (()+0xdcffa) [0x7f067e368ffa]
7: (()+0xde562) [0x7f067e36a562]
8: (()+0xe42df) [0x7f067e3702df]
9: (()+0x2c4949) [0x7f067e550949]
10: (()+0x2f236d) [0x7f067e57e36d]
11: (()+0x8182) [0x7f0679f06182]
12: (clone()+0x6d) [0x7f0679c3347d]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
terminate called after throwing an instance of 'ceph::FailedAssertion'
2015-11-16 12:10:17.861+0000: shutting down

Actions #1

Updated by Samuel Just over 8 years ago

Well, both of those backtraces are from 0.94.3.

Actions #2

Updated by Sage Weil over 8 years ago

  • Status changed from New to Rejected

fixed in 0.94.4 by bee86660377cfaa74f7ed668dd02492f25553ff9 ... you need to upgrade the client!

Actions #3

Updated by Richard Arends over 8 years ago

Okay. I double-checked that everything was up to date and that the instances were restarted after the change date. Thanks for the pointer; I will investigate why these nodes were running the old version.
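
For anyone hitting the same assertion, one way to confirm which client version a crashed instance was actually running is to read the `ceph version` line that the backtrace prints into the qemu log. The following is a minimal sketch, not an official Ceph tool: the function names `parse_version` and `stale_client_versions` are hypothetical, and it assumes the log banner has the same shape as in the backtraces above and that 0.94.4 is the first fixed release, as stated in comment #2.

```python
import re

# Matches the banner printed in an assertion backtrace, e.g.
# "ceph version 0.94.3 (95cefea9...)". Assumed format, taken from the logs above.
VERSION_RE = re.compile(r"ceph version (\d+(?:\.\d+)+)")

# First release containing the fix, per comment #2 in this thread.
FIXED_IN = (0, 94, 4)


def parse_version(text):
    """Turn a dotted version string such as '0.94.3' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def stale_client_versions(log_text, fixed_in=FIXED_IN):
    """Return the sorted client versions found in log_text that predate fixed_in."""
    found = {parse_version(m.group(1)) for m in VERSION_RE.finditer(log_text)}
    return sorted(v for v in found if v < fixed_in)


sample = """osd/osd_types.cc: 459: FAILED assert(m_seed < old_pg_num)
ceph version 0.94.3 (95cefea9fd9ab740263bf8bb4796fd864d9afe2b)
"""
print(stale_client_versions(sample))  # [(0, 94, 3)]
```

Running this over each instance log under /var/log/libvirt/qemu/ would list the instances whose client library still predates the fix, regardless of which version is installed on the cluster nodes.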
