Bug #18200

RBD diff got SIGABRT with "--whole-object" for RBD whose parent also have fast-diff feature enabled.

Added by Xiaoxi Chen 6 months ago. Updated 4 months ago.

Status: Resolved
Priority: High
Assignee: -
Target version: -
Start date: 12/08/2016
Due date:
% Done: 0%
Source:
Tags:
Backport: jewel,kraken
Regression: No
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Release:
Needs Doc: No

Description

terminate called after throwing an instance of 'ceph::buffer::end_of_buffer'

  what():  buffer::end_of_buffer

Program received signal SIGABRT, Aborted.

Root cause:
If the RBD image has a parent, the code at https://github.com/ceph/ceph/blame/master/src/librbd/DiffIterate.cc#L267 tries to get the overlap size. But because it uses snap=0 instead of CEPH_NOSNAP, get_parent_info(0) cannot find the parent info and just returns NULL, so the size of the child RBD is used as the overlap.
When the child is larger than the parent, this leads to a large number of wasted operations at line 286, which iterates over nonexistent (out-of-range) objects. Worse, if --whole-object is specified, we try to access the object_map out of range, which causes the error.
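
To make the failure mode concrete, here is a minimal standalone sketch of the pattern described above. It is not the librbd code: Image, ParentInfo, kHeadSnap, overlap_for and walk_object_map are hypothetical stand-ins, the fallback to the child size follows the description in this report, and the 4 MiB object size used in the usage note is an assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for CEPH_NOSNAP (the "head" snap id); the value is
// the 18446744073709551614 that appears in the log in this report.
constexpr uint64_t kHeadSnap = UINT64_MAX - 1;

struct ParentInfo {
  uint64_t overlap;  // bytes of the child that are backed by the parent
};

struct Image {
  uint64_t size;                           // child image size in bytes
  std::map<uint64_t, ParentInfo> parents;  // parent records keyed by snap id

  // Returns nullptr when no parent record exists for this snap id.
  const ParentInfo* get_parent_info(uint64_t snap_id) const {
    auto it = parents.find(snap_id);
    return it == parents.end() ? nullptr : &it->second;
  }
};

// Falls back to the child size when the lookup fails, as described above.
uint64_t overlap_for(const Image& img, uint64_t snap_id) {
  const ParentInfo* p = img.get_parent_info(snap_id);
  return p ? p->overlap : img.size;
}

// Number of objects needed to cover `bytes`.
uint64_t objects_in(uint64_t bytes, uint64_t object_size) {
  assert(object_size > 0);
  return (bytes + object_size - 1) / object_size;
}

// Walks the parent's object map over the overlap. Returns false if the walk
// runs past the end of the map (the out-of-range access from this report).
bool walk_object_map(const std::vector<uint8_t>& object_map,
                     uint64_t overlap, uint64_t object_size) {
  const uint64_t n = objects_in(overlap, object_size);
  try {
    for (uint64_t i = 0; i < n; ++i) {
      (void)object_map.at(i);  // bounds-checked access
    }
  } catch (const std::out_of_range&) {
    return false;
  }
  return true;
}
```

With a 128849018880-byte child whose parent record is stored under kHeadSnap with a 1992294400-byte overlap, overlap_for(child, 0) yields the child size, and walking a 475-entry parent object map with that overlap runs out of range. overlap_for(child, kHeadSnap) yields the parent overlap and the walk stays in bounds.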

Logs with --whole-object look like this:

2016-12-08 01:32:26.353581 7f75b171ad80  5 librbd::DiffIterate: diff_iterate from 0 to 18446744073709551614 size from 0 to 128849018880
2016-12-08 01:32:26.353594 7f75b171ad80 10 librbd::DiffIterate:  first getting parent diff
2016-12-08 01:32:26.354372 7f75b171ad80 20 librbd::DiffIterate: diff_object_map: loaded object map rbd_object_map.c0623a238e1f29.0000000000000033
2016-12-08 01:32:26.354378 7f75b171ad80 20 librbd::DiffIterate: diff_object_map: computed overlap diffs
2016-12-08 01:32:26.354379 7f75b171ad80 20 librbd::DiffIterate: diff_object_map: object state: 0 ->1
2016-12-08 01:32:26.354381 7f75b171ad80 20 librbd::DiffIterate: diff_object_map: object state: 1 ->1
2016-12-08 01:32:26.354382 7f75b171ad80 20 librbd::DiffIterate: diff_object_map: object state: 2 ->1
2016-12-08 01:32:26.354382 7f75b171ad80 20 librbd::DiffIterate: diff_object_map: object state: 3 ->1
2016-12-08 01:32:26.354383 7f75b171ad80 20 librbd::DiffIterate: diff_object_map: object state: 4 ->1
2016-12-08 01:32:26.354384 7f75b171ad80 20 librbd::DiffIterate: diff_object_map: object state: 5 ->1
2016-12-08 01:32:26.354385 7f75b171ad80 20 librbd::DiffIterate: diff_object_map: object state: 6 ->0
2016-12-08 01:32:26.354385 7f75b171ad80 20 librbd::DiffIterate: diff_object_map: object state: 7 ->0
2016-12-08 01:32:26.354386 7f75b171ad80 20 librbd::DiffIterate: diff_object_map: object state: 8 ->1
2016-12-08 01:32:26.354387 7f75b171ad80 20 librbd::DiffIterate: diff_object_map: object state: 9 ->1
2016-12-08 01:32:26.354387 7f75b171ad80 20 librbd::DiffIterate: diff_object_map: object state: 10 ->1
2016-12-08 01:32:26.354388 7f75b171ad80 20 librbd::DiffIterate: diff_object_map: object state: 11 ->1
2016-12-08 01:32:26.354389 7f75b171ad80 20 librbd::DiffIterate: diff_object_map: object state: 12 ->1
2016-12-08 01:32:26.354390 7f75b171ad80 20 librbd::DiffIterate: diff_object_map: object state: 13 ->1
2016-12-08 01:32:26.354390 7f75b171ad80 20 librbd::DiffIterate: diff_object_map: object state: 14 ->1
2016-12-08 01:32:26.354391 7f75b171ad80 20 librbd::DiffIterate: diff_object_map: object state: 15 ->1
2016-12-08 01:32:26.354392 7f75b171ad80 20 librbd::DiffIterate: diff_object_map: object state: 16 ->1
2016-12-08 01:32:26.354392 7f75b171ad80 20 librbd::DiffIterate: diff_object_map: object state: 17 ->1
2016-12-08 01:32:26.354393 7f75b171ad80 20 librbd::DiffIterate: diff_object_map: object state: 18 ->1
2016-12-08 01:32:26.354394 7f75b171ad80 20 librbd::DiffIterate: diff_object_map: object state: 19 ->1
2016-12-08 01:32:26.354394 7f75b171ad80 20 librbd::DiffIterate: diff_object_map: object state: 20 ->1
2016-12-08 01:32:26.354395 7f75b171ad80 20 librbd::DiffIterate: diff_object_map: object state: 21 ->1
2016-12-08 01:32:26.354395 7f75b171ad80 20 librbd::DiffIterate: diff_object_map: object state: 22 ->1
2016-12-08 01:32:26.354396 7f75b171ad80 20 librbd::DiffIterate: diff_object_map: object state: 23 ->1
2016-12-08 01:32:26.354397 7f75b171ad80 20 librbd::DiffIterate: diff_object_map: object state: 24 ->1
2016-12-08 01:32:26.354398 7f75b171ad80 20 librbd::DiffIterate: diff_object_map: object state: 25 ->1
2016-12-08 01:32:26.354398 7f75b171ad80 20 librbd::DiffIterate: diff_object_map: object state: 26 ->1
2016-12-08 01:32:26.354399 7f75b171ad80 20 librbd::DiffIterate: diff_object_map: object state: 27 ->1
2016-12-08 01:32:26.354400 7f75b171ad80 20 librbd::DiffIterate: diff_object_map: object state: 28 ->1
2016-12-08 01:32:26.354400 7f75b171ad80 20 librbd::DiffIterate: diff_object_map: object state: 29 ->1
2016-12-08 01:32:26.354401 7f75b171ad80 20 librbd::DiffIterate: diff_object_map: object state: 30 ->1
2016-12-08 01:32:26.354402 7f75b171ad80 20 librbd::DiffIterate: diff_object_map: object state: 31 ->1
2016-12-08 01:32:26.354402 7f75b171ad80 20 librbd::DiffIterate: diff_object_map: object state: 32 ->1
2016-12-08 01:32:26.354403 7f75b171ad80 20 librbd::DiffIterate: diff_object_map: object state: 33 ->1

....

2016-12-08 01:32:26.354697 7f75b171ad80 20 librbd::DiffIterate: diff_object_map: object state: 474 ->0
2016-12-08 01:32:26.354698 7f75b171ad80 20 librbd::DiffIterate: diff_object_map: computed resize diffs
2016-12-08 01:32:26.354699 7f75b171ad80  5 librbd::DiffIterate: fast diff enabled
2016-12-08 01:32:26.354700 7f75b171ad80  5 librbd::DiffIterate: diff_iterate from 0 to 51 size from 0 to 1992294400
2016-12-08 01:32:26.354716 7f75b171ad80 20 librbd::DiffIterate: object rbd_data.c0623a238e1f29.0000000000000000
2016-12-08 01:32:26.354723 7f75b171ad80 20 librbd::DiffIterate: object rbd_data.c0623a238e1f29.0000000000000001
2016-12-08 01:32:26.354725 7f75b171ad80 20 librbd::DiffIterate: object rbd_data.c0623a238e1f29.0000000000000002
2016-12-08 01:32:26.354726 7f75b171ad80 20 librbd::DiffIterate: object rbd_data.c0623a238e1f29.0000000000000003
2016-12-08 01:32:26.354728 7f75b171ad80 20 librbd::DiffIterate: object rbd_data.c0623a238e1f29.0000000000000004
2016-12-08 01:32:26.354729 7f75b171ad80 20 librbd::DiffIterate: object rbd_data.c0623a238e1f29.0000000000000005
...
2016-12-08 01:32:26.869895 7f75b171ad80 20 librbd::DiffIterate: object rbd_data.c0623a238e1f29.00000000000001da
2016-12-08 01:32:26.869896 7f75b171ad80 20 librbd::DiffIterate: object rbd_data.c0623a238e1f29.00000000000001db
2016-12-08 01:32:26.869897 7f75b171ad80 20 librbd::DiffIterate: object rbd_data.c0623a238e1f29.00000000000001dc
2016-12-08 01:32:27.054650 7f75b171ad80 -1 *** Caught signal (Aborted) **
 in thread 7f75b171ad80 thread_name:rbd

 ceph version 10.2.2 (45107e21c568dd033c2f0a3107dec8f0b0e58374)
 1: (()+0x1f8ac2) [0x7f75b1384ac2]
 2: (()+0x10340) [0x7f759dcea340]
 3: (gsignal()+0x39) [0x7f759b5c5cc9]
 4: (abort()+0x148) [0x7f759b5c90d8]
 5: (__gnu_cxx::__verbose_terminate_handler()+0x155) [0x7f759bed0535]
 6: (()+0x5e6d6) [0x7f759bece6d6]
 7: (()+0x5e703) [0x7f759bece703]
 8: (()+0x5e922) [0x7f759bece922]
 9: (()+0x194d2b) [0x7f75a77e9d2b]
 10: (()+0x85877) [0x7f75a76da877]
 11: (()+0x8f197) [0x7f75a76e4197]
 12: (()+0x8fa67) [0x7f75a76e4a67]
 13: (()+0xba3c0) [0x7f75a770f3c0]
 14: (librbd::Image::diff_iterate2(char const*, unsigned long, unsigned long, bool, bool, int (*)(unsigned long, unsigned long, int, void*), void*)+0x72) [0x7f75a76b3a22]
 15: (rbd::action::diff::execute(boost::program_options::variables_map const&)+0x2e0) [0x7f75b12f1b70]
 16: (rbd::Shell::execute(std::vector<char const*, std::allocator<char const*> > const&)+0x857) [0x7f75b12dc1f7]
 17: (main()+0x62) [0x7f75b12ad682]

Without --whole-object, we can see that we are iterating over out-of-range objects (the parent is 1900 MB):

2016-12-08 01:49:33.853183 7fdbd30b2d80 20 librbd::DiffIterate: object rbd_data.c0623a238e1f29.0000000000000645
2016-12-08 01:49:33.853197 7fdbd30b2d80 20 librbd::DiffIterate: object rbd_data.c0623a238e1f29.000000000000063d: list_snaps (not found)
2016-12-08 01:49:33.853239 7fdbd30b2d80 20 librbd::DiffIterate: object rbd_data.c0623a238e1f29.0000000000000646
2016-12-08 01:49:33.853284 7fdbd30b2d80 20 librbd::DiffIterate: object rbd_data.c0623a238e1f29.0000000000000647
2016-12-08 01:49:33.853290 7fdbd30b2d80 20 librbd::DiffIterate: object rbd_data.c0623a238e1f29.000000000000063e: list_snaps (not found)
2016-12-08 01:49:33.853333 7fdbd30b2d80 20 librbd::DiffIterate: object rbd_data.c0623a238e1f29.0000000000000648
2016-12-08 01:49:33.853339 7fdbd30b2d80 20 librbd::DiffIterate: object rbd_data.c0623a238e1f29.000000000000063f: list_snaps (not found)
2016-12-08 01:49:33.853385 7fdbd30b2d80 20 librbd::DiffIterate: object rbd_data.c0623a238e1f29.0000000000000649
2016-12-08 01:49:33.853427 7fdbd30b2d80 20 librbd::DiffIterate: object rbd_data.c0623a238e1f29.000000000000064a
2016-12-08 01:49:33.853660 7fdbd30b2d80 20 librbd::DiffIterate: object rbd_data.c0623a238e1f29.0000000000000640: list_snaps (not found)
2016-12-08 01:49:33.853668 7fdbd30b2d80 20 librbd::DiffIterate: object rbd_data.c0623a238e1f29.0000000000000641: list_snaps (not found)
2016-12-08 01:49:33.853672 7fdbd30b2d80 20 librbd::DiffIterate: object rbd_data.c0623a238e1f29.0000000000000642: list_snaps (not found)
2016-12-08 01:49:33.853709 7fdbd30b2d80 20 librbd::DiffIterate: object rbd_data.c0623a238e1f29.000000000000064b
2016-12-08 01:49:33.853712 7fdbd30b2d80 20 librbd::DiffIterate: object rbd_data.c0623a238e1f29.0000000000000643: list_snaps (not found)
2016-12-08 01:49:33.853762 7fdbd30b2d80 20 librbd::DiffIterate: object rbd_data.c0623a238e1f29.000000000000064c
2016-12-08 01:49:33.853825 7fdbd30b2d80 20 librbd::DiffIterate: object rbd_data.c0623a238e1f29.000000000000064d
2016-12-08 01:49:33.853831 7fdbd30b2d80 20 librbd::DiffIterate: object rbd_data.c0623a238e1f29.0000000000000644: list_snaps (not found)
2016-12-08 01:49:33.853833 7fdbd30b2d80 20 librbd::DiffIterate: object rbd_data.c0623a238e1f29.0000000000000645: list_snaps (not found)
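
As a quick sanity check on "out of range", the sketch below converts the hex suffix of an object name into an index and compares it against the number of objects in an image of a given size. The 4 MiB object size is an assumption (the report does not state it), although it is consistent with the 475 object states (0 to 474) logged for the 1900 MB parent. The helper names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Assumed object size: 4 MiB. Not stated in the report.
constexpr uint64_t kAssumedObjectSize = 4ULL << 20;

// Number of objects needed to cover an image of `image_bytes` bytes.
uint64_t object_count(uint64_t image_bytes,
                      uint64_t object_size = kAssumedObjectSize) {
  assert(object_size > 0);
  return (image_bytes + object_size - 1) / object_size;
}

// Parses the hex object number after the last '.' in a data object name,
// e.g. "rbd_data.c0623a238e1f29.0000000000000645".
// Throws std::invalid_argument if the suffix is not hexadecimal.
uint64_t object_index(const std::string& object_name) {
  const auto dot = object_name.rfind('.');
  return std::stoull(object_name.substr(dot + 1), nullptr, 16);
}

// True if the named object falls inside an image of `image_bytes` bytes.
bool in_range(const std::string& object_name, uint64_t image_bytes) {
  return object_index(object_name) < object_count(image_bytes);
}
```

Under that assumption the 1992294400-byte parent has 475 objects and the 128849018880-byte child has 30720. Object ...0645 from the log above is index 1605, well past the end of the parent, while ...01da (index 474) is the parent's last valid object.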

Related issues

Copied to Backport #18278: jewel: RBD diff got SIGABRT with "--whole-object" for RBD whose parent also have fast-diff feature enabled. Resolved
Copied to Backport #18279: kraken: RBD diff got SIGABRT with "--whole-object" for RBD whose parent also have fast-diff feature enabled. Resolved

History

#1 Updated by Jason Dillaman 6 months ago

  • Status changed from New to Need Review

#2 Updated by Jason Dillaman 6 months ago

  • Priority changed from Normal to High
  • Backport set to jewel, kraken

#3 Updated by Jason Dillaman 5 months ago

  • Status changed from Need Review to Pending Backport

#4 Updated by Nathan Cutler 5 months ago

  • Copied to Backport #18278: jewel: RBD diff got SIGABRT with "--whole-object" for RBD whose parent also have fast-diff feature enabled. added

#5 Updated by Nathan Cutler 5 months ago

  • Copied to Backport #18279: kraken: RBD diff got SIGABRT with "--whole-object" for RBD whose parent also have fast-diff feature enabled. added

#6 Updated by Jason Dillaman 5 months ago

  • Backport changed from jewel, kraken to jewel

#7 Updated by Nathan Cutler 5 months ago

  • Backport changed from jewel to jewel,kraken

#8 Updated by Nathan Cutler 4 months ago

  • Status changed from Pending Backport to Resolved
