Bug #37360
closed
bluefs-bdev-expand aborts
Description
root@node1:~# ceph-bluestore-tool bluefs-bdev-expand --path /var/lib/ceph/osd/ceph-16
infering bluefs devices from bluestore path
slot 1 /var/lib/ceph/osd/ceph-16/block
start:
1 : size 0x14ca000000 : own 0x[100000~c00000,e00000~100000,2500000~1500000,3b00000~100000,3d00000~100000,3f00000~5400000,9400000~900000,9e00000~e00000,ad00000~b00000,b900000~100000,bb00000~100000,bd00000~100000,bf00000~100000,c100000~100000,c300000~600000,ca00000~700000,d200000~300000,d600000~100000,db00000~a00000,e600000~100000,ec00000~600000,f300000~600000,fe00000~400000,10300000~100000,10500000~700000,10d00000~100000,10f00000~800000,11800000~100000,11d00000~700000,12500000~100000,12700000~100000,12900000~100000,12b00000~600000,13200000~300000,13600000~d00000,14700000~500000,14d00000~700000,15500000~200000,15800000~100000,15a00000~600000,16400000~400000,16c00000~400000,17100000~200000,17700000~900000,18100000~800000,18a00000~100000,18c00000~100000,18e00000~100000,19000000~100000,19500000~400000,19a00000~100000,19c00000~800000,1a500000~200000,1ab00000~a00000,1b600000~200000,1bc00000~600000,1c300000~700000,1cb00000~800000,1d400000~300000,1db00000~400000,1e000000~700000,1e800000~700000,1f000000~100000,1f600000~600000,1fd00000~700000,20500000~200000,20800000~500000,20e00000~300000,21200000~700000,21a00000~700000,22200000~100000,22700000~7800000,2a000000~700000,2a800000~200000,2ae00000~500000,2b400000~700000,2bc00000~700000,2c400000~100000,2c600000~600000,2cd00000~100000,2cf00000~200000,2d200000~100000,2d400000~600000,2db00000~200000,2e100000~900000,2eb00000~200000,2ee00000~900000,2f800000~700000,30000000~100000,30200000~200000,30900000~a00000,31700000~400000,32000000~500000,32600000~700000,32e00000~200000,33100000~700000,33900000~100000,33b00000~100000,33d00000~700000,34500000~100000,34a00000~500000,35300000~900000,35d00000~100000,35f00000~200000,36200000~100000,36400000~600000,36b00000~100000,36d00000~700000,37500000~300000,37900000~100000,37c00000~100000,38100000~900000,38e00000~500000,39b00000~800000,3a400000~200000,3a700000~100000,3a900000~400000,3ae00000~100000,3b000000~100000,3b200000~100000,3b400000~100000,3b600000~700000,3be00000~800000,3ca00000~400000,3cf0000
0~400000,3d800000~b00000,3e400000~a00000,3ef00000~200000,3f200000~800000,3fb00000~100000,3fd00000~200000,40000000~500000,40900000~600000,41400000~b00000,42000000~100000,42200000~100000,42700000~500000,42d00000~100000,42f00000~600000,43600000~100000,43800000~700000,44000000~300000,44400000~600000,44b00000~100000,44d00000~400000,45500000~b00000,46100000~100000,46a00000~900000,47400000~700000,47c00000~700000,48800000~800000,49100000~600000,49800000~200000,49e00000~16900000,69200000~100000,69400000~100000,69700000~300000,69b00000~100000,120000000~40000000]
/build/ceph-12.2.8/src/include/interval_set.h: In function 'T interval_set<T, Map>::range_end() const [with T = long unsigned int; Map = std::map<long unsigned int, long unsigned int, std::less<long unsigned int>, std::allocator<std::pair<const long unsigned int, long unsigned int> > >]' thread 7fe8fb161f80 time 2018-11-22 02:21:21.298051
/build/ceph-12.2.8/src/include/interval_set.h: 419: FAILED assert(!empty())
ceph version 12.2.8 (ae699615bac534ea496ee965ac6192cb7e0e07c0) luminous (stable)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x102) [0x7fe8f19a7ac2]
2: (main()+0x24aa) [0x558fc13e251a]
3: (__libc_start_main()+0xf1) [0x7fe8eebf62e1]
4: (_start()+0x2a) [0x558fc14627aa]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
(continuation in attachment)
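The failed assertion suggests that `range_end()` was called on an interval set holding no extents, as would be the case for a device slot that owns nothing. Below is a minimal Python model of that behaviour. It is not the Ceph `interval_set` implementation; the class and helper names are illustrative assumptions.

```python
class IntervalSet:
    """Toy model of a start-offset -> length extent map (hypothetical, not Ceph code)."""

    def __init__(self, extents=None):
        # Maps each extent's start offset to its length.
        self.extents = dict(extents or {})

    def empty(self):
        return not self.extents

    def range_end(self):
        # Mirrors the failed check: an empty set has no last extent to end at.
        if self.empty():
            raise AssertionError("range_end() called on an empty interval set")
        last_start = max(self.extents)
        return last_start + self.extents[last_start]


def safe_range_end(iset):
    """Return the end of the last extent, or None when the set is empty."""
    return None if iset.empty() else iset.range_end()
```

A caller that may see empty slots could check `empty()` first, as `safe_range_end` does, instead of relying on the assertion.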
Updated by Igor Fedotov over 5 years ago
Does the bluefs-bdev-sizes command work correctly? What about fsck?
Updated by Марк Коренберг over 5 years ago
root@node1:~# ceph-bluestore-tool bluefs-bdev-sizes --path /var/lib/ceph/osd/ceph-16 infering bluefs devices from bluestore path slot 1 /var/lib/ceph/osd/ceph-16/block 1 : size 0x14ca000000 : own 0x[100000~c00000,e00000~100000,2500000~1500000,3b00000~100000,3d00000~100000,3f00000~5400000,9400000~900000,9e00000~e00000,ad00000~b00000,b900000~100000,bb00000~100000,bd00000~100000,bf00000~100000,c100000~100000,c300000~600000,ca00000~700000,d200000~300000,d600000~100000,db00000~a00000,e600000~100000,ec00000~600000,f300000~600000,fe00000~400000,10300000~100000,10500000~700000,10d00000~100000,10f00000~800000,11800000~100000,11d00000~700000,12500000~100000,12700000~100000,12900000~100000,12b00000~600000,13200000~300000,13600000~d00000,14700000~500000,14d00000~700000,15500000~200000,15800000~100000,15a00000~600000,16400000~400000,16c00000~400000,17100000~200000,17700000~900000,18100000~800000,18a00000~100000,18c00000~100000,18e00000~100000,19000000~100000,19500000~400000,19a00000~100000,19c00000~800000,1a500000~200000,1ab00000~a00000,1b600000~200000,1bc00000~600000,1c300000~700000,1cb00000~800000,1d400000~300000,1db00000~400000,1e000000~700000,1e800000~700000,1f000000~100000,1f600000~600000,1fd00000~700000,20500000~200000,20800000~500000,20e00000~300000,21200000~700000,21a00000~700000,22200000~100000,22700000~7800000,2a000000~700000,2a800000~200000,2ae00000~500000,2b400000~700000,2bc00000~700000,2c400000~100000,2c600000~600000,2cd00000~100000,2cf00000~200000,2d200000~100000,2d400000~600000,2db00000~200000,2e100000~900000,2eb00000~200000,2ee00000~900000,2f800000~700000,30000000~100000,30200000~200000,30900000~a00000,31700000~400000,32000000~500000,32600000~700000,32e00000~200000,33100000~700000,33900000~100000,33b00000~100000,33d00000~700000,34500000~100000,34a00000~500000,35300000~900000,35d00000~100000,35f00000~200000,36200000~100000,36400000~600000,36b00000~100000,36d00000~700000,37500000~300000,37900000~100000,37c00000~100000,38100000~900000,38e00000~500000,39b00000~800000
,3a400000~200000,3a700000~100000,3a900000~400000,3ae00000~100000,3b000000~100000,3b200000~100000,3b400000~100000,3b600000~700000,3be00000~800000,3ca00000~400000,3cf00000~400000,3d800000~b00000,3e400000~a00000,3ef00000~200000,3f200000~800000,3fb00000~100000,3fd00000~200000,40000000~500000,40900000~600000,41400000~b00000,42000000~100000,42200000~100000,42700000~500000,42d00000~100000,42f00000~600000,43600000~100000,43800000~700000,44000000~300000,44400000~600000,44b00000~100000,44d00000~400000,45500000~b00000,46100000~100000,46a00000~900000,47400000~700000,47c00000~700000,48800000~800000,49100000~600000,49800000~200000,49e00000~16900000,69200000~100000,69400000~100000,69700000~300000,69b00000~100000,120000000~40000000] root@node1:~#
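To compare how much of the device BlueFS owns against the reported device size, the output above can be parsed with a short script. This is only a sketch: it assumes each entry in the `own 0x[...]` list is `offset~length` in hexadecimal, and the function names are made up for illustration.

```python
import re


def parse_bdev_sizes(text):
    """Parse '<slot> : size 0x<hex> : own 0x[<off>~<len>,...]' into a dict."""
    m = re.search(r"(\d+) : size 0x([0-9a-f]+) : own 0x\[([^\]]*)\]", text)
    if m is None:
        raise ValueError("unrecognised bluefs-bdev-sizes output")
    # Terminal wrapping can split the list mid-number, so drop all whitespace.
    body = re.sub(r"\s+", "", m.group(3))
    extents = []
    for item in filter(None, body.split(",")):
        off, length = item.split("~")
        extents.append((int(off, 16), int(length, 16)))
    return {"slot": int(m.group(1)), "size": int(m.group(2), 16), "extents": extents}


def owned_bytes(info):
    """Total number of bytes covered by the parsed extents."""
    return sum(length for _, length in info["extents"])
```

For example, `owned_bytes(parse_bdev_sizes(output)) / parse_bdev_sizes(output)["size"]` gives the owned fraction of the device, where `output` is the pasted text.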
Updated by Марк Коренберг over 5 years ago
root@node1:~# ceph-bluestore-tool fsck --path /var/lib/ceph/osd/ceph-16
fsck success
root@node1:~#
Updated by Марк Коренберг over 5 years ago
The problem is still triggered every time.
Updated by Igor Fedotov over 5 years ago
Actually, there are two aspects to this ticket:
1) The tool improperly handles OSD deployments that lack DB and/or WAL volumes. This is a bug and should be fixed in all supported releases. I will do that shortly.
2) In this ticket Mark is trying to expand the main device, which isn't supported. For now, only standalone BlueFS volumes are supposed to benefit from the "expand" feature. I'm going to check how feasible main device expansion is and implement it if it is. I'm not sure, though, whether we plan to backport it to earlier releases.
Mark, meanwhile, could you clarify why you want to expand this volume? Do you want a larger main device for both user data and metadata? Or do you want to expand the DB part only (not sure why this might be needed, though)?
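Whether an OSD has standalone DB or WAL volumes can be checked from its directory before running the tool. The sketch below assumes the usual BlueStore layout, in which the OSD path holds `block`, and optionally `block.db` and `block.wal`, as links to the devices; the function names are hypothetical.

```python
import os


def bluefs_volume_layout(osd_path):
    """Report which BlueStore device links exist in an OSD directory."""
    names = {"main": "block", "db": "block.db", "wal": "block.wal"}
    # lexists() also counts symlinks whose target is currently missing.
    return {
        role: os.path.lexists(os.path.join(osd_path, name))
        for role, name in names.items()
    }


def has_standalone_bluefs_volume(osd_path):
    """True if the OSD has a separate DB and/or WAL volume."""
    layout = bluefs_volume_layout(osd_path)
    return layout["db"] or layout["wal"]
```

For the OSD in this ticket, only slot 1 (`block`) is listed, so `has_standalone_bluefs_volume("/var/lib/ceph/osd/ceph-16")` would be expected to return `False`.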
Updated by Марк Коренберг over 5 years ago
I decided to enlarge the OSD's backing device so that this OSD could store more data without being re-created.
The sequence of my actions:
1. Created an LVM logical volume of size 10G.
2. Deployed an OSD on it using ceph-deploy.
3. Used it for some tests and filled it with data to about 10%.
4. Called lvresize to enlarge the volume while the OSD process was running.
5. Tried to call ceph-bluestore-tool bluefs-bdev-expand --path /var/lib/ceph/osd/ceph-16 WHILE THE OSD PROCESS WAS RUNNING.
6. Restarted the OSD process.
7. ceph -s showed this OSD's usage as 87% and raised a NEAR_FULL warning.
8. Ran ceph osd out osd.16 and waited.
9. The problem still happens.
Updated by Igor Fedotov over 5 years ago
Got it. Thanks, Mark!
So, as I said before, main device resize isn't supported at the moment.
I will probably start adding support for offline resizing of such volumes in Nautilus+ releases.
Updated by Igor Fedotov over 5 years ago
Updated by Igor Fedotov over 5 years ago
- Status changed from In Progress to Fix Under Review
- Affected Versions v14.0.0 added
- Affected Versions deleted (v12.2.8)
Updated by Igor Fedotov over 5 years ago
Mimic fix (which is completely different from the Nautilus one, since we don't backport the main device expansion feature): https://github.com/ceph/ceph/pull/25348
Updated by Igor Fedotov over 5 years ago
- Affected Versions v13.2.3 added
- Affected Versions deleted (v14.0.0)
Updated by Igor Fedotov over 5 years ago
- Affected Versions v12.2.8 added
- Affected Versions deleted (v13.2.3)
Updated by Nathan Cutler over 5 years ago
- Status changed from Fix Under Review to Pending Backport
Updated by Nathan Cutler over 5 years ago
- Copied to Backport #37494: mimic: bluefs-bdev-expand aborts added
Updated by Nathan Cutler over 5 years ago
- Copied to Backport #37495: luminous: bluefs-bdev-expand aborts added
Updated by Igor Fedotov about 5 years ago
- Status changed from Pending Backport to Resolved