Bug #4785

closed

rbd export-diff: librados/snap_set_diff.cc: 40: FAILED assert(b == r->snaps[r->snaps.size()-1])

Added by Denis kaganovich about 11 years ago. Updated about 11 years ago.

Status: Resolved
Priority: Urgent
Assignee:
Category: -
Target version: -
% Done: 0%
Source: Development
Tags:
Backport:
Regression:
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

I am creating backups. Yesterday I created a snapshot named "backup" for every rbd image. Every day I create a snapshot "backup.tmp" and run export-diff --snap backup.tmp --from-snap backup. Yesterday everything worked. Now, for half of the images, it fails with:

librados/snap_set_diff.cc: 40: FAILED assert(b == r->snaps[r->snaps.size()-1])

ceph version 0.60-608-gb73ef01 (b73ef010bf22059c0ca07086208422542d4093cf)
1: (calc_snap_set_diff(CephContext*, librados::snap_set_t const&, unsigned long, unsigned long, interval_set<unsigned long>*, bool*)+0x25e4) [0x7f8de7329cc4]
2: (librbd::diff_iterate(librbd::ImageCtx*, char const*, unsigned long, unsigned long, int (*)(unsigned long, unsigned long, int, void*), void*)+0x698) [0x7f8de72f7e18]
3: (main()+0x43b7) [0x42ce67]
4: (__libc_start_main()+0xf5) [0x7f8de4925975]
5: rbd() [0x430129]

The same happens on master and wip-3495.
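
The backup step described above can be sketched as a small helper that only builds the two rbd command lines, without running them. This is a minimal sketch, not the reporter's actual script: the function name `export_diff_commands` and the output path are hypothetical, and the image name `vm10_1` is taken from a later comment in this thread. The thread does not say how the "backup" snapshot is rotated afterwards, so that part is left out.

```python
from typing import List


def export_diff_commands(image: str, out_path: str,
                         from_snap: str = "backup",
                         to_snap: str = "backup.tmp") -> List[List[str]]:
    """Build (but do not run) the rbd commands for one incremental backup step.

    Hypothetical helper: it mirrors the workflow in the description, i.e.
    create the temporary snapshot, then export the diff between the two
    snapshots to a file.
    """
    return [
        # Take the new snapshot that marks the end of the diff.
        ["rbd", "snap", "create", f"{image}@{to_snap}"],
        # Export everything that changed between from_snap and to_snap.
        ["rbd", "export-diff", image,
         "--from-snap", from_snap, "--snap", to_snap, out_path],
    ]


if __name__ == "__main__":
    # "vm10_1.diff" is an assumed output file name.
    for cmd in export_diff_commands("vm10_1", "vm10_1.diff"):
        print(" ".join(cmd))
```

The argument lists could be passed to `subprocess.run(cmd, check=True)` on a host with the rbd CLI installed; building them separately keeps the sketch inspectable without a cluster.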


Files

foo (24.7 KB) foo Denis kaganovich, 04/24/2013 11:11 AM
out (27.1 KB) out Denis kaganovich, 04/24/2013 01:15 PM
Actions #1

Updated by Denis kaganovich about 11 years ago

PS: The same happens without "--snap backup.tmp", i.e. when diffing to the active image only.

Actions #2

Updated by Sage Weil about 11 years ago

  • Status changed from New to Need More Info
  • Priority changed from Normal to Urgent

Can you confirm what version the OSDs are running? My first guess is they have v0.60 or older code that doesn't have the fixed LISTSNAPS code.

Actions #3

Updated by Denis kaganovich about 11 years ago

git, wip-3495 branch:

# ceph-osd --version
ceph version 0.60-476-gd3752d2 (d3752d2a09f221f8cee6919ce59d102fd7f2f98b)

(I solved the wip-3495 problem only this Sunday, 19.04, so everything dates from around that day, give or take the time needed to rebuild and restart each of the 3 nodes one at a time.)

Actions #4

Updated by Denis kaganovich about 11 years ago

Now installed the "next" branch on all nodes (since wip-3495 is in it now):

librados/snap_set_diff.cc: 40: FAILED assert(b == r->snaps[r->snaps.size()-1])
ceph version 0.60-639-gd44cfc5 (d44cfc524fc0844c6027c586090302d45f360efb)
1: (calc_snap_set_diff(CephContext*, librados::snap_set_t const&, unsigned long, unsigned long, interval_set<unsigned long>*, bool*)+0x248c) [0x7f16178b0eec]
2: (librbd::diff_iterate(librbd::ImageCtx*, char const*, unsigned long, unsigned long, int (*)(unsigned long, unsigned long, int, void*), void*)+0x63b) [0x7f161787c37b]
3: (main()+0x2784) [0x42d964]
4: (__libc_start_main()+0xf8) [0x7f16150a2d78]
5: rbd() [0x431ac1]

Actions #5

Updated by Denis kaganovich about 11 years ago

PS: Ran /etc/init.d/ceph restart on all nodes.

Actions #6

Updated by Samuel Just about 11 years ago

  • Assignee set to Samuel Just
Actions #7

Updated by Sage Weil about 11 years ago

  • Assignee changed from Samuel Just to Sage Weil
Actions #8

Updated by Sage Weil about 11 years ago

Denis: can you run the rbd failing command with --log-file foo --log-max 1 --debug-ms 1 --debug-rbd 20?

Also, I pushed wip-4785, which fixes the version compatibility issue, although it sounds like that may not be your problem.

Actions #9

Updated by Denis kaganovich about 11 years ago

Sage Weil wrote:

Denis: can you run the rbd failing command with --log-file foo --log-max 1 --debug-ms 1 --debug-rbd 20?

"--log-max 1" gives "rbd: extraneous parameter 1".
With all the other options: see the attached "foo".

Also, pushed wip-4785 that fixes the version compatibility thing.. altho it sounds like that may not be your problem.

Now the "next" branch is installed everywhere.

Actions #10

Updated by Sage Weil about 11 years ago

Can you run 'rados -p rbd listsnaps rb.0.c558.238e1f29.000000000000'?

Also, are you on IRC? That would be quicker for debugging this (sagewk in #ceph on irc.oftc.net).

Thanks!

Actions #11

Updated by Denis kaganovich about 11 years ago

... that log was from "--from-snap backup" to the active image. The log for the diff to the second snapshot differs only in insignificant details.
Command:
rbd export-diff vm10_1 --from-snap backup --log-file foo --debug-ms 1 --debug-rbd 20 xxxx

Actions #12

Updated by Sage Weil about 11 years ago

Denis: can you attach the output from the listsnaps command above?

Actions #13

Updated by Denis kaganovich about 11 years ago

Hmm. If I understand correctly, you want the console output? IMHO it is nearly the same as (and partly duplicates) the already attached "foo" log, but OK, attaching...

Actions #14

Updated by Sage Weil about 11 years ago

I mean the output from the command 'rados -p rbd listsnaps rb.0.c558.238e1f29.000000000000'

Actions #15

Updated by Denis kaganovich about 11 years ago

Or do you want the output from --snap ... --from-snap ...?
I am in doubt!

Actions #16

Updated by Denis kaganovich about 11 years ago

Sage Weil wrote:

I mean the output from the command 'rados -p rbd listsnaps rb.0.c558.238e1f29.000000000000'

OK, sorry:

rb.0.c558.238e1f29.000000000000:
cloneid  snaps  size     overlap
1137     983    4194304  [0~1200128,1232896~5632,1239040~2955264]
head     -      4194304
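
To make the listsnaps output above easier to read next to the failed assertion, here is a minimal Python sketch that parses it and checks each clone row. It rests on several assumptions that the thread does not confirm: that `b` in the assertion corresponds to the clone id, that multiple snaps would be comma-separated, and that each overlap entry reads as `offset~length`. The function names are hypothetical and this is not the code in snap_set_diff.cc.

```python
from typing import List, Tuple

# Output copied from the comment above.
SAMPLE = """\
rb.0.c558.238e1f29.000000000000:
cloneid snaps size overlap
1137 983 4194304 [0~1200128,1232896~5632,1239040~2955264]
head - 4194304
"""

Clone = Tuple[int, List[int], int, List[Tuple[int, int]]]


def parse_clones(text: str) -> List[Clone]:
    """Return (cloneid, snaps, size, overlap) for every numeric clone row."""
    clones = []
    for line in text.splitlines():
        fields = line.split()
        # Skip the object name, the column header and the "head" row.
        if not fields or not fields[0].isdigit():
            continue
        cloneid = int(fields[0])
        snaps = [int(s) for s in fields[1].split(",")]  # assumed separator
        size = int(fields[2])
        overlap = []
        if len(fields) > 3:
            for extent in fields[3].strip("[]").split(","):
                if extent:
                    offset, length = extent.split("~")  # assumed offset~length
                    overlap.append((int(offset), int(length)))
        clones.append((cloneid, snaps, size, overlap))
    return clones


def mismatched_clones(clones: List[Clone]) -> List[Clone]:
    """Clones whose id differs from the last entry of their snaps list."""
    return [c for c in clones if c[0] != c[1][-1]]


def non_overlapping_bytes(size: int, overlap: List[Tuple[int, int]]) -> int:
    """Bytes of the clone that are not covered by the overlap extents."""
    return size - sum(length for _, length in overlap)


if __name__ == "__main__":
    for cloneid, snaps, size, overlap in mismatched_clones(parse_clones(SAMPLE)):
        print(f"clone {cloneid}: last snap is {snaps[-1]}, "
              f"{non_overlapping_bytes(size, overlap)} bytes outside overlap")
```

Under those assumptions, the single clone row is flagged because its id (1137) is not equal to the last snap in its list (983), which is the kind of condition the assertion text describes.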

Actions #17

Updated by Sage Weil about 11 years ago

  • Status changed from Need More Info to 7

Perfect, I see the problem now! Can you try wip-4785-b?

Actions #18

Updated by Denis kaganovich about 11 years ago

Thanks! Diffs completed.

Actions #20

Updated by Sage Weil about 11 years ago

  • Status changed from 7 to Resolved