Bug #18200
RBD diff got SIGABRT with "--whole-object" for RBD whose parent also has fast-diff feature enabled.
Status: Resolved
Priority: High
Assignee: -
Target version: -
% Done: 0%
Source:
Tags:
Backport: jewel,kraken
Regression: No
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
terminate called after throwing an instance of 'ceph::buffer::end_of_buffer'
what(): buffer::end_of_buffer
Program received signal SIGABRT, Aborted.
Root cause:
If the RBD image has a parent, we try to get the parent overlap size at https://github.com/ceph/ceph/blame/master/src/librbd/DiffIterate.cc#L267. But because we use snap=0 instead of CEPH_NOSNAP, get_parent_info(0) cannot find the parent info and just returns NULL, so the size of the child RBD is used as the overlap.
When the child is larger than the parent, this causes a large amount of wasted work at line 286, which iterates over nonexistent (out-of-range) objects. Worse, if --whole-object is specified, we try to access the object_map out of range, which causes the error.
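The failure mode described above can be sketched with a small toy model. This is not librbd code: `ToyChildImage`, `overlap_for_diff`, and `parent_object_states` are hypothetical names, the 4 MiB object size is an assumption, and the parent overlap is assumed to equal the parent size reported in the logs (1992294400 bytes). It only illustrates how falling back to the child size makes the loop index past the end of the parent's object map.

```python
# Toy model only: these classes and helpers are hypothetical, not librbd code.
CEPH_NOSNAP = 2**64 - 2          # 18446744073709551614, the snap id seen in the log
OBJECT_SIZE = 4 * 1024 * 1024    # assumption: 4 MiB objects


def object_count(size_bytes):
    """Number of objects needed to cover size_bytes (rounded up)."""
    return (size_bytes + OBJECT_SIZE - 1) // OBJECT_SIZE


class ToyChildImage:
    def __init__(self, size, parent_overlap):
        self.size = size
        # In this model, parent info can only be found under CEPH_NOSNAP.
        self._parent_info = {CEPH_NOSNAP: parent_overlap}

    def get_parent_overlap(self, snap_id):
        """Return the overlap in bytes, or None if no parent info is found."""
        return self._parent_info.get(snap_id)


def overlap_for_diff(image, snap_id):
    """Mimic the fallback: with no parent info, the child size is used."""
    overlap = image.get_parent_overlap(snap_id)
    return image.size if overlap is None else overlap


def parent_object_states(parent_object_map, overlap):
    """Read one object map entry per object covered by the overlap."""
    return [parent_object_map[i] for i in range(object_count(overlap))]


# Sizes taken from the logs: child 128849018880 bytes, parent 1992294400 bytes.
child = ToyChildImage(size=128849018880, parent_overlap=1992294400)
parent_object_map = [1] * object_count(1992294400)

good = overlap_for_diff(child, CEPH_NOSNAP)  # parent info found
bad = overlap_for_diff(child, 0)             # lookup fails, child size is used

print(object_count(good), object_count(bad))
try:
    parent_object_states(parent_object_map, bad)
except IndexError as exc:
    print("out-of-range object map access:", exc)
```

In this sketch the lookup with snap id 0 silently widens the range to the whole child image, which is the same shape of problem as the wasted iteration and the out-of-range object_map access reported here.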
Logs with --whole-object:
2016-12-08 01:32:26.353581 7f75b171ad80 5 librbd::DiffIterate: diff_iterate from 0 to 18446744073709551614 size from 0 to 128849018880
2016-12-08 01:32:26.353594 7f75b171ad80 10 librbd::DiffIterate: first getting parent diff
2016-12-08 01:32:26.354372 7f75b171ad80 20 librbd::DiffIterate: diff_object_map: loaded object map rbd_object_map.c0623a238e1f29.0000000000000033
2016-12-08 01:32:26.354378 7f75b171ad80 20 librbd::DiffIterate: diff_object_map: computed overlap diffs
2016-12-08 01:32:26.354379 7f75b171ad80 20 librbd::DiffIterate: diff_object_map: object state: 0 ->1
2016-12-08 01:32:26.354381 7f75b171ad80 20 librbd::DiffIterate: diff_object_map: object state: 1 ->1
2016-12-08 01:32:26.354382 7f75b171ad80 20 librbd::DiffIterate: diff_object_map: object state: 2 ->1
2016-12-08 01:32:26.354382 7f75b171ad80 20 librbd::DiffIterate: diff_object_map: object state: 3 ->1
2016-12-08 01:32:26.354383 7f75b171ad80 20 librbd::DiffIterate: diff_object_map: object state: 4 ->1
2016-12-08 01:32:26.354384 7f75b171ad80 20 librbd::DiffIterate: diff_object_map: object state: 5 ->1
2016-12-08 01:32:26.354385 7f75b171ad80 20 librbd::DiffIterate: diff_object_map: object state: 6 ->0
2016-12-08 01:32:26.354385 7f75b171ad80 20 librbd::DiffIterate: diff_object_map: object state: 7 ->0
2016-12-08 01:32:26.354386 7f75b171ad80 20 librbd::DiffIterate: diff_object_map: object state: 8 ->1
2016-12-08 01:32:26.354387 7f75b171ad80 20 librbd::DiffIterate: diff_object_map: object state: 9 ->1
2016-12-08 01:32:26.354387 7f75b171ad80 20 librbd::DiffIterate: diff_object_map: object state: 10 ->1
2016-12-08 01:32:26.354388 7f75b171ad80 20 librbd::DiffIterate: diff_object_map: object state: 11 ->1
2016-12-08 01:32:26.354389 7f75b171ad80 20 librbd::DiffIterate: diff_object_map: object state: 12 ->1
2016-12-08 01:32:26.354390 7f75b171ad80 20 librbd::DiffIterate: diff_object_map: object state: 13 ->1
2016-12-08 01:32:26.354390 7f75b171ad80 20 librbd::DiffIterate: diff_object_map: object state: 14 ->1
2016-12-08 01:32:26.354391 7f75b171ad80 20 librbd::DiffIterate: diff_object_map: object state: 15 ->1
2016-12-08 01:32:26.354392 7f75b171ad80 20 librbd::DiffIterate: diff_object_map: object state: 16 ->1
2016-12-08 01:32:26.354392 7f75b171ad80 20 librbd::DiffIterate: diff_object_map: object state: 17 ->1
2016-12-08 01:32:26.354393 7f75b171ad80 20 librbd::DiffIterate: diff_object_map: object state: 18 ->1
2016-12-08 01:32:26.354394 7f75b171ad80 20 librbd::DiffIterate: diff_object_map: object state: 19 ->1
2016-12-08 01:32:26.354394 7f75b171ad80 20 librbd::DiffIterate: diff_object_map: object state: 20 ->1
2016-12-08 01:32:26.354395 7f75b171ad80 20 librbd::DiffIterate: diff_object_map: object state: 21 ->1
2016-12-08 01:32:26.354395 7f75b171ad80 20 librbd::DiffIterate: diff_object_map: object state: 22 ->1
2016-12-08 01:32:26.354396 7f75b171ad80 20 librbd::DiffIterate: diff_object_map: object state: 23 ->1
2016-12-08 01:32:26.354397 7f75b171ad80 20 librbd::DiffIterate: diff_object_map: object state: 24 ->1
2016-12-08 01:32:26.354398 7f75b171ad80 20 librbd::DiffIterate: diff_object_map: object state: 25 ->1
2016-12-08 01:32:26.354398 7f75b171ad80 20 librbd::DiffIterate: diff_object_map: object state: 26 ->1
2016-12-08 01:32:26.354399 7f75b171ad80 20 librbd::DiffIterate: diff_object_map: object state: 27 ->1
2016-12-08 01:32:26.354400 7f75b171ad80 20 librbd::DiffIterate: diff_object_map: object state: 28 ->1
2016-12-08 01:32:26.354400 7f75b171ad80 20 librbd::DiffIterate: diff_object_map: object state: 29 ->1
2016-12-08 01:32:26.354401 7f75b171ad80 20 librbd::DiffIterate: diff_object_map: object state: 30 ->1
2016-12-08 01:32:26.354402 7f75b171ad80 20 librbd::DiffIterate: diff_object_map: object state: 31 ->1
2016-12-08 01:32:26.354402 7f75b171ad80 20 librbd::DiffIterate: diff_object_map: object state: 32 ->1
2016-12-08 01:32:26.354403 7f75b171ad80 20 librbd::DiffIterate: diff_object_map: object state: 33 ->1
....
2016-12-08 01:32:26.354697 7f75b171ad80 20 librbd::DiffIterate: diff_object_map: object state: 474 ->0
2016-12-08 01:32:26.354698 7f75b171ad80 20 librbd::DiffIterate: diff_object_map: computed resize diffs
2016-12-08 01:32:26.354699 7f75b171ad80 5 librbd::DiffIterate: fast diff enabled
2016-12-08 01:32:26.354700 7f75b171ad80 5 librbd::DiffIterate: diff_iterate from 0 to 51 size from 0 to 1992294400
2016-12-08 01:32:26.354716 7f75b171ad80 20 librbd::DiffIterate: object rbd_data.c0623a238e1f29.0000000000000000
2016-12-08 01:32:26.354723 7f75b171ad80 20 librbd::DiffIterate: object rbd_data.c0623a238e1f29.0000000000000001
2016-12-08 01:32:26.354725 7f75b171ad80 20 librbd::DiffIterate: object rbd_data.c0623a238e1f29.0000000000000002
2016-12-08 01:32:26.354726 7f75b171ad80 20 librbd::DiffIterate: object rbd_data.c0623a238e1f29.0000000000000003
2016-12-08 01:32:26.354728 7f75b171ad80 20 librbd::DiffIterate: object rbd_data.c0623a238e1f29.0000000000000004
2016-12-08 01:32:26.354729 7f75b171ad80 20 librbd::DiffIterate: object rbd_data.c0623a238e1f29.0000000000000005
...
2016-12-08 01:32:26.869895 7f75b171ad80 20 librbd::DiffIterate: object rbd_data.c0623a238e1f29.00000000000001da
2016-12-08 01:32:26.869896 7f75b171ad80 20 librbd::DiffIterate: object rbd_data.c0623a238e1f29.00000000000001db
2016-12-08 01:32:26.869897 7f75b171ad80 20 librbd::DiffIterate: object rbd_data.c0623a238e1f29.00000000000001dc
2016-12-08 01:32:27.054650 7f75b171ad80 -1 *** Caught signal (Aborted) **
in thread 7f75b171ad80 thread_name:rbd
ceph version 10.2.2 (45107e21c568dd033c2f0a3107dec8f0b0e58374)
1: (()+0x1f8ac2) [0x7f75b1384ac2]
2: (()+0x10340) [0x7f759dcea340]
3: (gsignal()+0x39) [0x7f759b5c5cc9]
4: (abort()+0x148) [0x7f759b5c90d8]
5: (__gnu_cxx::__verbose_terminate_handler()+0x155) [0x7f759bed0535]
6: (()+0x5e6d6) [0x7f759bece6d6]
7: (()+0x5e703) [0x7f759bece703]
8: (()+0x5e922) [0x7f759bece922]
9: (()+0x194d2b) [0x7f75a77e9d2b]
10: (()+0x85877) [0x7f75a76da877]
11: (()+0x8f197) [0x7f75a76e4197]
12: (()+0x8fa67) [0x7f75a76e4a67]
13: (()+0xba3c0) [0x7f75a770f3c0]
14: (librbd::Image::diff_iterate2(char const*, unsigned long, unsigned long, bool, bool, int (*)(unsigned long, unsigned long, int, void*), void*)+0x72) [0x7f75a76b3a22]
15: (rbd::action::diff::execute(boost::program_options::variables_map const&)+0x2e0) [0x7f75b12f1b70]
16: (rbd::Shell::execute(std::vector<char const*, std::allocator<char const*> > const&)+0x857) [0x7f75b12dc1f7]
17: (main()+0x62) [0x7f75b12ad682]
Without --whole-object, we can see that we are iterating over out-of-range objects (the parent is 1900 MB):
2016-12-08 01:49:33.853183 7fdbd30b2d80 20 librbd::DiffIterate: object rbd_data.c0623a238e1f29.0000000000000645
2016-12-08 01:49:33.853197 7fdbd30b2d80 20 librbd::DiffIterate: object rbd_data.c0623a238e1f29.000000000000063d: list_snaps (not found)
2016-12-08 01:49:33.853239 7fdbd30b2d80 20 librbd::DiffIterate: object rbd_data.c0623a238e1f29.0000000000000646
2016-12-08 01:49:33.853284 7fdbd30b2d80 20 librbd::DiffIterate: object rbd_data.c0623a238e1f29.0000000000000647
2016-12-08 01:49:33.853290 7fdbd30b2d80 20 librbd::DiffIterate: object rbd_data.c0623a238e1f29.000000000000063e: list_snaps (not found)
2016-12-08 01:49:33.853333 7fdbd30b2d80 20 librbd::DiffIterate: object rbd_data.c0623a238e1f29.0000000000000648
2016-12-08 01:49:33.853339 7fdbd30b2d80 20 librbd::DiffIterate: object rbd_data.c0623a238e1f29.000000000000063f: list_snaps (not found)
2016-12-08 01:49:33.853385 7fdbd30b2d80 20 librbd::DiffIterate: object rbd_data.c0623a238e1f29.0000000000000649
2016-12-08 01:49:33.853427 7fdbd30b2d80 20 librbd::DiffIterate: object rbd_data.c0623a238e1f29.000000000000064a
2016-12-08 01:49:33.853660 7fdbd30b2d80 20 librbd::DiffIterate: object rbd_data.c0623a238e1f29.0000000000000640: list_snaps (not found)
2016-12-08 01:49:33.853668 7fdbd30b2d80 20 librbd::DiffIterate: object rbd_data.c0623a238e1f29.0000000000000641: list_snaps (not found)
2016-12-08 01:49:33.853672 7fdbd30b2d80 20 librbd::DiffIterate: object rbd_data.c0623a238e1f29.0000000000000642: list_snaps (not found)
2016-12-08 01:49:33.853709 7fdbd30b2d80 20 librbd::DiffIterate: object rbd_data.c0623a238e1f29.000000000000064b
2016-12-08 01:49:33.853712 7fdbd30b2d80 20 librbd::DiffIterate: object rbd_data.c0623a238e1f29.0000000000000643: list_snaps (not found)
2016-12-08 01:49:33.853762 7fdbd30b2d80 20 librbd::DiffIterate: object rbd_data.c0623a238e1f29.000000000000064c
2016-12-08 01:49:33.853825 7fdbd30b2d80 20 librbd::DiffIterate: object rbd_data.c0623a238e1f29.000000000000064d
2016-12-08 01:49:33.853831 7fdbd30b2d80 20 librbd::DiffIterate: object rbd_data.c0623a238e1f29.0000000000000644: list_snaps (not found)
2016-12-08 01:49:33.853833 7fdbd30b2d80 20 librbd::DiffIterate: object rbd_data.c0623a238e1f29.0000000000000645: list_snaps (not found)
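To check that the object names in these logs really are beyond the parent, the arithmetic can be done by hand. The sketch below assumes 4 MiB objects (not stated in the report, though it is consistent with the last logged object map index, 474) and uses hypothetical helper names; the sizes and object names are copied from the logs.

```python
# Assumption: 4 MiB objects. Sizes and object names are copied from the logs.
OBJECT_SIZE = 4 * 1024 * 1024
PARENT_SIZE = 1992294400      # "size from 0 to 1992294400" (1900 MB)
CHILD_SIZE = 128849018880     # "size from 0 to 128849018880"


def objects_needed(size_bytes):
    """Number of objects covering size_bytes, rounded up."""
    return -(-size_bytes // OBJECT_SIZE)


def object_number(object_name):
    """Parse the trailing hex field of a name like rbd_data.<id>.<hex>."""
    return int(object_name.rsplit(".", 1)[1], 16)


parent_objects = objects_needed(PARENT_SIZE)
child_objects = objects_needed(CHILD_SIZE)

# Object seen in the log without --whole-object.
missing = object_number("rbd_data.c0623a238e1f29.0000000000000645")
# Last object logged before the abort with --whole-object.
last_before_abort = object_number("rbd_data.c0623a238e1f29.00000000000001dc")

print("parent objects:", parent_objects)
print("child objects:", child_objects)
print("object 0x645 within parent range:", missing < parent_objects)
print("object 0x1dc within parent range:", last_before_abort < parent_objects)
```

Under that assumption the parent covers object numbers 0 through 474, so both object 0x645 (reported as not found) and object 0x1dc (the last one logged before the abort) lie past the end of the parent.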
Updated by Jason Dillaman over 7 years ago
- Status changed from New to Fix Under Review
Updated by Jason Dillaman over 7 years ago
- Priority changed from Normal to High
- Backport set to jewel, kraken
Updated by Jason Dillaman over 7 years ago
- Status changed from Fix Under Review to Pending Backport
Updated by Nathan Cutler over 7 years ago
- Copied to Backport #18278: jewel: RBD diff got SIGABRT with "--whole-object" for RBD whose parent also have fast-diff feature enabled. added
Updated by Nathan Cutler over 7 years ago
- Copied to Backport #18279: kraken: RBD diff got SIGABRT with "--whole-object" for RBD whose parent also have fast-diff feature enabled. added
Updated by Jason Dillaman over 7 years ago
- Backport changed from jewel, kraken to jewel
Updated by Nathan Cutler over 7 years ago
- Backport changed from jewel to jewel,kraken
Updated by Nathan Cutler about 7 years ago
- Status changed from Pending Backport to Resolved