Bug #11305

"FAILED assert(ret)" in upgrade:dumpling-x-firefly-distro-basic-vps run

Added by Yuri Weinstein almost 9 years ago. Updated almost 9 years ago.

Status: Duplicate
Priority: High
Assignee: -
Category: -
Target version: -
% Done: 0%
Source: Q/A
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite: upgrade/dumpling-firefly-x, upgrade/dumpling-x
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Run: http://pulpito.ceph.com/teuthology-2015-03-29_19:13:02-upgrade:dumpling-x-firefly-distro-basic-vps/
Job: 827087
Logs: http://qa-proxy.ceph.com/teuthology/teuthology-2015-03-29_19:13:02-upgrade:dumpling-x-firefly-distro-basic-vps/827087/

In teuthology@teuthology:/a/teuthology-2015-03-29_19:13:02-upgrade:dumpling-x-firefly-distro-basic-vps/827087/remote/vpm105/log$ zgrep "^ ceph version" ceph-osd.3.log.gz -b10 a30

ceph-osd.3.log.gz:1218769666-2015-04-01 10:22:46.305071 7f9b628707a0 15 filestore(/var/lib/ceph/osd/ceph-3) omap_get_values meta/16ef7597/infos/head//-1
ceph-osd.3.log.gz:1218769790-2015-04-01 10:22:46.307737 7f9b628707a0 20 osd.3 0 get_map 1138 - loading and decoding 0x2681a00
ceph-osd.3.log.gz:1218769887-2015-04-01 10:22:46.307751 7f9b628707a0 15 filestore(/var/lib/ceph/osd/ceph-3) read meta/a11a088/osdmap.1138/0//-1 0~0
ceph-osd.3.log.gz:1218770006-2015-04-01 10:22:46.307803 7f9b628707a0 10 filestore(/var/lib/ceph/osd/ceph-3) error opening file /var/lib/ceph/osd/ceph-3/current/meta/DIR_8/DIR_8/osdmap.1138__0_0A11A088__none with flags=2: (2) No such file or directory
ceph-osd.3.log.gz:1218770228-2015-04-01 10:22:46.307815 7f9b628707a0 10 filestore(/var/lib/ceph/osd/ceph-3) FileStore::read(meta/a11a088/osdmap.1138/0//-1) open error: (2) No such file or directory
ceph-osd.3.log.gz:1218770397-2015-04-01 10:22:46.308719 7f9b628707a0 -1 osd/OSD.h: In function 'OSDMapRef OSDService::get_map(epoch_t)' thread 7f9b628707a0 time 2015-04-01 10:22:46.307826
ceph-osd.3.log.gz:1218770556-osd/OSD.h: 670: FAILED assert(ret)
ceph-osd.3.log.gz:1218770591-
ceph-osd.3.log.gz:1218770592: ceph version 0.80.9-197-g899738e (899738e10e82b50dcf7dfffe5cc83937179bf323)
ceph-osd.3.log.gz:1218770669- 1: (OSD::load_pgs()+0x2d30) [0x648050]
ceph-osd.3.log.gz:1218770709- 2: (OSD::init()+0x22c0) [0x64ba00]
ceph-osd.3.log.gz:1218770745- 3: (main()+0x35bc) [0x5fe57c]
ceph-osd.3.log.gz:1218770776- 4: (__libc_start_main()+0xfd) [0x7f9b606d8d1d]
ceph-osd.3.log.gz:1218770824- 5: ceph-osd() [0x5fa029]
ceph-osd.3.log.gz:1218770850- NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
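
For anyone triaging a similar run, the excerpt above can be picked apart mechanically. This is a minimal sketch, assuming each pasted line has the shape `<file>:<byte offset><sep><text>`, with `:` as the separator on the line that matched and `-` on context lines. That shape is inferred from this excerpt only. The names `parse_line`, `failed_asserts` and `LogLine` are hypothetical helpers, not part of teuthology or Ceph.

```python
import re
from typing import List, NamedTuple, Optional

# Shape inferred from the excerpt: "<file>:<byte offset><sep><text>",
# where <sep> is ":" on the matching line and "-" on context lines.
PREFIX = re.compile(r"^(?P<file>[^:]+):(?P<offset>\d+)(?P<sep>[-:])(?P<text>.*)$")


class LogLine(NamedTuple):
    file: str
    offset: int
    matched: bool  # True for the line the pattern matched, False for context
    text: str


def parse_line(line: str) -> Optional[LogLine]:
    """Split one pasted line into file, byte offset, match flag and text."""
    m = PREFIX.match(line.rstrip("\n"))
    if m is None:
        return None
    return LogLine(m["file"], int(m["offset"]), m["sep"] == ":", m["text"])


def failed_asserts(lines: List[str]) -> List[str]:
    """Return the text of every line that reports a failed assertion."""
    found = []
    for line in lines:
        parsed = parse_line(line)
        if parsed is not None and "FAILED assert" in parsed.text:
            found.append(parsed.text)
    return found
```

Feeding the lines of the excerpt to `failed_asserts` should return the `osd/OSD.h: 670: FAILED assert(ret)` entry, which is the string to search other jobs for.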


Related issues

Duplicates Ceph - Bug #11429: OSD::load_pgs: we need to handle the case where an upgrade from earlier versions which ignored non-existent pgs resurrects a pg with a prehistoric osdmap (Resolved, 04/20/2015)

History

#1 Updated by Yuri Weinstein almost 9 years ago

  • Subject changed from "FAILED assert(ret)" in pgrade:dumpling-x-firefly-distro-basic-vps run to "FAILED assert(ret)" in upgrade:dumpling-x-firefly-distro-basic-vps run

#2 Updated by Samuel Just almost 9 years ago

2015-04-01 08:45:22.882112 7fbf5cb567a0 10 osd.3 1143 load_pgs: skipping PG 258.1 because we don't have pool 258

This is a bug in dumpling: load_pgs skips the pg if its pool has been removed. Probably won't fix. It is showing up now because we fixed that bug in firefly by backporting 879fd0c192f5d3c6afd36c2df359806ea95827b8. We could probably fix it by backporting to dumpling as well, but I'm not sure it's worth it. If it happens again, we can revisit.
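
One reading of the comment above, together with the title of #11429, is that a pg left behind by the older skip logic still refers to an osdmap epoch that is no longer on disk, so a loader that no longer skips it fails in get_map. The toy model below sketches that reading only. It is not Ceph code: `ToyOSD`, `MissingMapError` and both loader methods are hypothetical, and the epoch and pool values are illustrative rather than taken from the failing job.

```python
class MissingMapError(AssertionError):
    """Stands in for 'FAILED assert(ret)' in OSDService::get_map."""


class ToyOSD:
    def __init__(self, stored_maps, pgs, current_pools):
        self.stored_maps = set(stored_maps)      # osdmap epochs still on disk
        self.pgs = dict(pgs)                     # pgid -> (pool id, epoch the pg refers to)
        self.current_pools = set(current_pools)  # pools in the current map

    def get_map(self, epoch):
        # The real code asserts when the map cannot be read; model that as an error.
        if epoch not in self.stored_maps:
            raise MissingMapError(f"osdmap.{epoch} not found")
        return epoch

    def load_pgs_skipping_removed_pools(self):
        """Older behaviour described above: pgs of removed pools are skipped."""
        loaded = []
        for pgid, (pool, epoch) in self.pgs.items():
            if pool not in self.current_pools:
                continue  # the stale pg stays on disk, untouched
            self.get_map(epoch)
            loaded.append(pgid)
        return loaded

    def load_pgs_without_skipping(self):
        """Newer behaviour: every pg on disk is loaded, stale ones included."""
        loaded = []
        for pgid, (pool, epoch) in self.pgs.items():
            self.get_map(epoch)
            loaded.append(pgid)
        return loaded


if __name__ == "__main__":
    osd = ToyOSD(
        stored_maps=range(1140, 1144),            # older epochs already trimmed
        pgs={"0.1": (0, 1143), "258.1": (258, 1138)},
        current_pools={0},                        # pool 258 has been removed
    )
    print(osd.load_pgs_skipping_removed_pools())
    try:
        osd.load_pgs_without_skipping()
    except MissingMapError as err:
        print("assert:", err)
```

In this model the first loader starts cleanly because it never touches the stale pg, while the second trips over the missing map. That matches the upgrade path in the failing runs, under the assumption stated above.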

#3 Updated by Samuel Just almost 9 years ago

  • Priority changed from Urgent to High

#4 Updated by Yuri Weinstein almost 9 years ago

  • ceph-qa-suite upgrade/dumpling-firefly-x added

Run: http://pulpito.ceph.com/teuthology-2015-03-31_19:13:02-upgrade:dumpling-x-firefly-distro-basic-vps/
Job: 830956
Logs: http://qa-proxy.ceph.com/teuthology/teuthology-2015-03-31_19:13:02-upgrade:dumpling-x-firefly-distro-basic-vps/830956/

In teuthology@teuthology:/a/teuthology-2015-03-31_19:13:02-upgrade:dumpling-x-firefly-distro-basic-vps/830956/remote/vpm159/log$ zgrep "^ ceph version" ceph-osd.3.log.gz -b10 -a30

ceph-osd.3.log.gz:1614687503-    -1> 2015-04-04 06:01:33.996038 7f64fc9b4780 10 filestore(/var/lib/ceph/osd/ceph-3) FileStore::read(meta/ac9631f5/osdmap.551/0//-1) open error: (2) No such file or directory
ceph-osd.3.log.gz:1614687680-     0> 2015-04-04 06:01:34.014275 7f64fc9b4780 -1 osd/OSD.h: In function 'OSDMapRef OSDService::get_map(epoch_t)' thread 7f64fc9b4780 time 2015-04-04 06:01:33.996071
ceph-osd.3.log.gz:1614687847-osd/OSD.h: 670: FAILED assert(ret)
ceph-osd.3.log.gz:1614687882-
ceph-osd.3.log.gz:1614687883: ceph version 0.80.9-197-g899738e (899738e10e82b50dcf7dfffe5cc83937179bf323)
ceph-osd.3.log.gz:1614687960- 1: (OSD::load_pgs()+0x22c1) [0x65cb81]
ceph-osd.3.log.gz:1614688000- 2: (OSD::init()+0x1ba1) [0x666281]
ceph-osd.3.log.gz:1614688036- 3: (main()+0x1eb8) [0x6074f8]
ceph-osd.3.log.gz:1614688067- 4: (__libc_start_main()+0xed) [0x7f64fa7d076d]
ceph-osd.3.log.gz:1614688115- 5: ceph-osd() [0x60b6b9]
ceph-osd.3.log.gz:1614688141- NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
ceph-osd.3.log.gz:1614688234-
ceph-osd.3.log.gz:1614688235---- logging levels ---
ceph-osd.3.log.gz:1614688258-   0/ 5 none
ceph-osd.3.log.gz:1614688271-   0/ 1 lockdep
ceph-osd.3.log.gz:1614688287-   0/ 1 context
ceph-osd.3.log.gz:1614688303-   1/ 1 crush
ceph-osd.3.log.gz:1614688317-   1/ 5 mds
ceph-osd.3.log.gz:1614688329-   1/ 5 mds_balancer
ceph-osd.3.log.gz:1614688350-   1/ 5 mds_locker
ceph-osd.3.log.gz:1614688369-   1/ 5 mds_log
ceph-osd.3.log.gz:1614688385-   1/ 5 mds_log_expire
ceph-osd.3.log.gz:1614688408-   1/ 5 mds_migrator
ceph-osd.3.log.gz:1614688429-   0/ 1 buffer
ceph-osd.3.log.gz:1614688444-   0/ 1 timer
ceph-osd.3.log.gz:1614688458-   0/ 1 filer
ceph-osd.3.log.gz:1614688472-   0/ 1 striper
ceph-osd.3.log.gz:1614688488-   0/ 1 objecter
ceph-osd.3.log.gz:1614688505-   0/ 5 rados
ceph-osd.3.log.gz:1614688519-   0/ 5 rbd
ceph-osd.3.log.gz:1614688531-   0/ 5 journaler
ceph-osd.3.log.gz:1614688549-   0/ 5 objectcacher
ceph-osd.3.log.gz:1614688570-   0/ 5 client
ceph-osd.3.log.gz:1614688585-  20/20 osd
ceph-osd.3.log.gz:1614688597-   0/ 5 optracker
ceph-osd.3.log.gz:1614688615-   0/ 5 objclass
ceph-osd.3.log.gz:1614688632-  20/20 filestore
ceph-osd.3.log.gz:1614688650-   1/ 3 keyvaluestore
ceph-osd.3.log.gz:1614688672-  20/20 journal
ceph-osd.3.log.gz:1614688688-   1/ 1 ms
ceph-osd.3.log.gz:1614688699-   1/ 5 mon
ceph-osd.3.log.gz:1614688711-   0/10 monc
ceph-osd.3.log.gz:1614688724-   1/ 5 paxos
ceph-osd.3.log.gz:1614688738-   0/ 5 tp
ceph-osd.3.log.gz:1614688749-   1/ 5 auth
ceph-osd.3.log.gz:1614688762-   1/ 5 crypto
ceph-osd.3.log.gz:1614688777-   1/ 1 finisher
ceph-osd.3.log.gz:1614688794-   1/ 5 heartbeatmap
ceph-osd.3.log.gz:1614688815-   1/ 5 perfcounter
ceph-osd.3.log.gz:1614688835-   1/ 5 rgw
ceph-osd.3.log.gz:1614688847-   1/10 civetweb
ceph-osd.3.log.gz:1614688864-   1/ 5 javaclient
ceph-osd.3.log.gz:1614688883-   1/ 5 asok
ceph-osd.3.log.gz:1614688896-   1/ 1 throttle
ceph-osd.3.log.gz:1614688913-  -2/-2 (syslog threshold)
ceph-osd.3.log.gz:1614688940-  -1/-1 (stderr threshold)
ceph-osd.3.log.gz:1614688967-  max_recent     10000
ceph-osd.3.log.gz:1614688990-  max_new         1000
ceph-osd.3.log.gz:1614689013-  log_file /var/log/ceph/ceph-osd.3.log
ceph-osd.3.log.gz:1614689053---- end dump of recent events ---
ceph-osd.3.log.gz:1614689087-2015-04-04 06:01:34.048672 7f64fc9b4780 -1 *** Caught signal (Aborted) **
ceph-osd.3.log.gz:1614689161- in thread 7f64fc9b4780
ceph-osd.3.log.gz:1614689185-
ceph-osd.3.log.gz:1614689186: ceph version 0.80.9-197-g899738e (899738e10e82b50dcf7dfffe5cc83937179bf323)
ceph-osd.3.log.gz:1614689263- 1: ceph-osd() [0x99c01a]
ceph-osd.3.log.gz:1614689289- 2: (()+0xfcb0) [0x7f64fbee0cb0]
ceph-osd.3.log.gz:1614689322- 3: (gsignal()+0x35) [0x7f64fa7e50d5]
ceph-osd.3.log.gz:1614689360- 4: (abort()+0x17b) [0x7f64fa7e883b]
ceph-osd.3.log.gz:1614689397- 5: (__gnu_cxx::__verbose_terminate_handler()+0x11d) [0x7f64fb13769d]
ceph-osd.3.log.gz:1614689467- 6: (()+0xb5846) [0x7f64fb135846]
ceph-osd.3.log.gz:1614689501- 7: (()+0xb5873) [0x7f64fb135873]
ceph-osd.3.log.gz:1614689535- 8: (()+0xb596e) [0x7f64fb13596e]
ceph-osd.3.log.gz:1614689569- 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1df) [0xa7ea0f]
ceph-osd.3.log.gz:1614689661- 10: (OSD::load_pgs()+0x22c1) [0x65cb81]
ceph-osd.3.log.gz:1614689702- 11: (OSD::init()+0x1ba1) [0x666281]
ceph-osd.3.log.gz:1614689739- 12: (main()+0x1eb8) [0x6074f8]
ceph-osd.3.log.gz:1614689771- 13: (__libc_start_main()+0xed) [0x7f64fa7d076d]
ceph-osd.3.log.gz:1614689820- 14: ceph-osd() [0x60b6b9]
ceph-osd.3.log.gz:1614689847- NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

#7 Updated by Yuri Weinstein almost 9 years ago

Run: http://pulpito.ceph.com/teuthology-2015-04-10_19:13:01-upgrade:dumpling-x-firefly-distro-basic-vps/
Jobs: ['843769', '843770']

Assertion: osd/OSD.h: 670: FAILED assert(ret)
ceph version 0.80.9-201-g12143ff (12143ff9b25fdd96f8d1a9cecb1329c7f354d414)
 1: (OSD::load_pgs()+0x1d1b) [0x78b2bb]
 2: (OSD::init()+0x1619) [0x78e399]
 3: (main()+0x2366) [0x733d66]
 4: (__libc_start_main()+0xfd) [0x7fad3af79ead]
 5: ceph-osd() [0x737909]
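
The backtraces in this ticket differ in offsets and addresses between builds but name the same functions in the same order. A minimal sketch for comparing them, assuming frames look like ` N: (symbol+0xOFF) [0xADDR]` or ` N: symbol [0xADDR]` as they do in the excerpts here; `frame_symbol` and `signature` are hypothetical helper names.

```python
import re
from typing import List, Optional, Tuple

# Frame shapes seen in this ticket:
#   " 1: (OSD::load_pgs()+0x2d30) [0x648050]"
#   " 5: ceph-osd() [0x5fa029]"
FRAME = re.compile(
    r"^\s*\d+:\s*(?:\((?P<sym>.*)\+0x[0-9a-f]+\)|(?P<plain>\S+))\s*\[0x[0-9a-f]+\]\s*$"
)


def frame_symbol(frame: str) -> Optional[str]:
    """Return the symbol of one frame with offset and address removed."""
    m = FRAME.match(frame)
    if m is None:
        return None
    return m.group("sym") or m.group("plain")


def signature(frames: List[str]) -> Tuple[str, ...]:
    """Build a build-independent signature from the recognisable frames."""
    symbols = (frame_symbol(f) for f in frames)
    return tuple(s for s in symbols if s is not None)
```

Two backtraces with equal `signature` values are likely the same crash, even when the binaries were built from different commits.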

#8 Updated by Samuel Just almost 9 years ago

Changed my mind, going to fix.

#9 Updated by Samuel Just almost 9 years ago

  • Status changed from New to Duplicate

Marking as duplicate of #11429.
