Bug #4785
rbd export-diff: librados/snap_set_diff.cc: 40: FAILED assert(b == r->snaps[r->snaps.size()-1])
Status: closed
Description
I am creating backups. Yesterday I created a snapshot "backup" for every rbd image. Every day I create a snapshot "backup.tmp" and run export-diff --snap backup.tmp --from-snap backup. Yesterday everything worked. Now, for half of the images:
librados/snap_set_diff.cc: 40: FAILED assert(b == r->snaps[r->snaps.size()-1])
ceph version 0.60-608-gb73ef01 (b73ef010bf22059c0ca07086208422542d4093cf)
1: (calc_snap_set_diff(CephContext*, librados::snap_set_t const&, unsigned long, unsigned long, interval_set<unsigned long>*, bool*)+0x25e4) [0x7f8de7329cc4]
2: (librbd::diff_iterate(librbd::ImageCtx*, char const*, unsigned long, unsigned long, int (*)(unsigned long, unsigned long, int, void*), void*)+0x698) [0x7f8de72f7e18]
3: (main()+0x43b7) [0x42ce67]
4: (__libc_start_main()+0xf5) [0x7f8de4925975]
5: rbd() [0x430129]
- The same happens on master and wip-3495.
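The daily cycle from the description can be sketched as two rbd invocations. This is only an illustration of the steps named above, not the reporter's actual script: the function name `daily_diff_commands` and the destination file name are hypothetical, and the commands are only built and printed, not executed.

```python
from typing import List


def daily_diff_commands(image: str, dest: str,
                        base_snap: str = "backup",
                        tmp_snap: str = "backup.tmp") -> List[List[str]]:
    """Build the rbd command lines for one daily incremental export.

    Hypothetical helper: it only assembles argument lists and runs nothing.
    """
    return [
        # Take today's snapshot of the image.
        ["rbd", "snap", "create", f"{image}@{tmp_snap}"],
        # Export only the changes between the base snapshot and today's one.
        ["rbd", "export-diff", image,
         "--snap", tmp_snap, "--from-snap", base_snap, dest],
    ]


# "vm10_1" is the image named later in this thread; the output file is made up.
for cmd in daily_diff_commands("vm10_1", "vm10_1.diff"):
    print(" ".join(cmd))
```

The second command is the one that hits the assertion in this report.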
Updated by Denis kaganovich about 11 years ago
PS: The same happens without "--snap backup.tmp", i.e. when diffing to the active image only.
Updated by Sage Weil about 11 years ago
- Status changed from New to Need More Info
- Priority changed from Normal to Urgent
Can you confirm what version the OSDs are running? My first guess is they have v0.60 or older code that doesn't have the fixed LISTSNAPS code.
Updated by Denis kaganovich about 11 years ago
git, wip-3495 branch:
- ceph-osd --version
ceph version 0.60-476-gd3752d2 (d3752d2a09f221f8cee6919ce59d102fd7f2f98b)
(I resolved the wip-3495 problem only this Sunday, 19.04, so everything dates from around that day, give or take the time needed to rebuild and restart each of the 3 nodes in turn.)
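To compare the client and OSD builds quoted so far, the version lines can be split into their parts. This is a minimal sketch assuming the strings follow the `git describe` layout (tag, commits since tag, abbreviated hash); `parse_ceph_version` is a hypothetical helper, and commit counts taken on different branches are not strictly comparable.

```python
import re
from typing import NamedTuple


class CephVersion(NamedTuple):
    tag: str      # nearest release tag, e.g. "0.60"
    commits: int  # commits since that tag (0 if absent)
    sha: str      # abbreviated commit hash ("" if absent)


# Assumed layout: "ceph version <tag>[-<commits>-g<sha>] (<full sha>)"
_VERSION_RE = re.compile(
    r"ceph version (\d+(?:\.\d+)*)(?:-(\d+)-g([0-9a-f]+))?")


def parse_ceph_version(line: str) -> CephVersion:
    """Split one 'ceph version ...' line into tag, commit count and hash."""
    m = _VERSION_RE.search(line)
    if m is None:
        raise ValueError(f"unrecognized version line: {line!r}")
    tag, commits, sha = m.groups()
    return CephVersion(tag, int(commits or 0), sha or "")


# Version lines copied from this thread.
client = parse_ceph_version(
    "ceph version 0.60-608-gb73ef01 (b73ef010bf22059c0ca07086208422542d4093cf)")
osd = parse_ceph_version(
    "ceph version 0.60-476-gd3752d2 (d3752d2a09f221f8cee6919ce59d102fd7f2f98b)")
print("client:", client)
print("osd:   ", osd)
```

Both builds sit on the 0.60 tag; the rbd client is 608 commits past it and the OSD 476.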
Updated by Denis kaganovich about 11 years ago
I have now installed the "next" branch on all nodes (wip-3495 is in it now):
librados/snap_set_diff.cc: 40: FAILED assert(b == r->snaps[r->snaps.size()-1])
ceph version 0.60-639-gd44cfc5 (d44cfc524fc0844c6027c586090302d45f360efb)
1: (calc_snap_set_diff(CephContext*, librados::snap_set_t const&, unsigned long, unsigned long, interval_set<unsigned long>*, bool*)+0x248c) [0x7f16178b0eec]
2: (librbd::diff_iterate(librbd::ImageCtx*, char const*, unsigned long, unsigned long, int (*)(unsigned long, unsigned long, int, void*), void*)+0x63b) [0x7f161787c37b]
3: (main()+0x2784) [0x42d964]
4: (__libc_start_main()+0xf8) [0x7f16150a2d78]
5: rbd() [0x431ac1]
Updated by Denis kaganovich about 11 years ago
PS: I ran /etc/init.d/ceph restart on all nodes.
Updated by Sage Weil about 11 years ago
- Assignee changed from Samuel Just to Sage Weil
Updated by Sage Weil about 11 years ago
Denis: can you run the rbd failing command with --log-file foo --log-max 1 --debug-ms 1 --debug-rbd 20?
Also, I pushed wip-4785, which fixes the version compatibility thing, although it sounds like that may not be your problem.
Updated by Denis kaganovich about 11 years ago
Sage Weil wrote:
Denis: can you run the rbd failing command with --log-file foo --log-max 1 --debug-ms 1 --debug-rbd 20?
"--log-max 1" gives "rbd: extraneous parameter 1".
With all the other options it works; the log "foo" is attached.
Also, I pushed wip-4785, which fixes the version compatibility thing, although it sounds like that may not be your problem.
The "next" branch is now running everywhere.
Updated by Sage Weil about 11 years ago
can you do 'rados -p rbd listsnaps rb.0.c558.238e1f29.000000000000'?
also are you on irc? that would be quicker to debug this (sagewk in #ceph on irc.oftc.net)
thanks!
Updated by Denis kaganovich about 11 years ago
... that log was from "--from-snap backup" to the active image. The log for the diff to the second snapshot differs only in insignificant details.
Command:
rbd export-diff vm10_1 --from-snap backup --log-file foo --debug-ms 1 --debug-rbd 20 xxxx
Updated by Sage Weil about 11 years ago
Denis: can you attach the output from the listsnaps command above?
Updated by Denis kaganovich about 11 years ago
Hmm. If I understand correctly, you want the console output? IMHO it is nearly the same as (and partly duplicates) the already attached "foo" log, but OK, attaching...
Updated by Sage Weil about 11 years ago
i mean the output from the command 'rados -p rbd listsnaps rb.0.c558.238e1f29.000000000000'
Updated by Denis kaganovich about 11 years ago
Or do you want the output from --snap ... --from-snap ... ?
I'm not sure!
Updated by Denis kaganovich about 11 years ago
Sage Weil wrote:
i mean the output from the command 'rados -p rbd listsnaps rb.0.c558.238e1f29.000000000000'
OK, sorry:
rb.0.c558.238e1f29.000000000000:
cloneid snaps   size    overlap
1137    983     4194304 [0~1200128,1232896~5632,1239040~2955264]
head    -       4194304
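The listsnaps output above can be checked mechanically against the failed assertion. The sketch below rests on an assumption the thread does not confirm: that `b` in `assert(b == r->snaps[r->snaps.size()-1])` is the clone id. The parser assumes whitespace-separated columns, comma-separated snap ids, and `offset~length` overlap entries; all function names are hypothetical.

```python
from typing import List, NamedTuple, Optional, Tuple


class Clone(NamedTuple):
    cloneid: Optional[int]          # None stands for the "head" row
    snaps: List[int]                # snapshot ids listed for this clone
    size: int                       # object size in bytes
    overlap: List[Tuple[int, int]]  # (offset, length) pairs


def parse_overlap(text: str) -> List[Tuple[int, int]]:
    """Parse '[0~10,20~5]' into [(0, 10), (20, 5)]."""
    body = text.strip("[]")
    if not body:
        return []
    pairs = []
    for item in body.split(","):
        offset, length = item.split("~")
        pairs.append((int(offset), int(length)))
    return pairs


def parse_listsnaps(output: str) -> List[Clone]:
    """Parse the table printed by 'rados listsnaps' for one object."""
    clones = []
    for line in output.splitlines():
        fields = line.split()
        if len(fields) < 3 or fields[0] == "cloneid":
            continue  # skip the object-name line and the header row
        cloneid = None if fields[0] == "head" else int(fields[0])
        snaps = [] if fields[1] == "-" else [int(s) for s in fields[1].split(",")]
        overlap = parse_overlap(fields[3]) if len(fields) > 3 else []
        clones.append(Clone(cloneid, snaps, int(fields[2]), overlap))
    return clones


def clones_with_mismatched_id(clones: List[Clone]) -> List[Clone]:
    """Clones whose last listed snap differs from the clone id.

    Assumption: this is the condition the failed assertion rejects.
    """
    return [c for c in clones
            if c.cloneid is not None and c.snaps and c.snaps[-1] != c.cloneid]


# Output copied from this thread.
SAMPLE = """rb.0.c558.238e1f29.000000000000:
cloneid snaps size overlap
1137 983 4194304 [0~1200128,1232896~5632,1239040~2955264]
head - 4194304
"""

clones = parse_listsnaps(SAMPLE)
for clone in clones_with_mismatched_id(clones):
    print(f"clone {clone.cloneid}: last snap is {clone.snaps[-1]}")
```

For this object the clone id is 1137 while the only listed snap is 983, so under the stated assumption the equality check cannot hold.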
Updated by Sage Weil about 11 years ago
- Status changed from Need More Info to 7
Perfect, I see the problem now! Can you try wip-4785-b?
Updated by Sage Weil about 11 years ago
merged to next, 3604c98232615827812099af27ebc3ed2414c8eb