Bug #21087

closed

BlueFS Becomes Totally Inaccessible when Failed to Allocate

Added by WANG Guoqin over 6 years ago. Updated over 6 years ago.

Status:
Can't reproduce
Priority:
High
Assignee:
Target version:
% Done:

0%

Source:
Community (user)
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

After trying to insert thousands of missing osdmaps into the BlueFS partition, it now appears to be full and not accessible by ANY of the available tools.

Log:

   -34> 2017-08-23 22:09:46.011201 7f2b666a6e00  4 rocksdb: [/build/ceph-12.1.4/src/rocksdb/db/version_set.cc:2395] Creating manifest 3506

   -33> 2017-08-23 22:09:46.011204 7f2b666a6e00 10 bluefs open_for_write db/MANIFEST-003506
   -32> 2017-08-23 22:09:46.011209 7f2b666a6e00 20 bluefs open_for_write mapping db/MANIFEST-003506 to bdev 1
   -31> 2017-08-23 22:09:46.011214 7f2b666a6e00 10 bluefs open_for_write h 0x564cc4553860 on file(ino 5120 size 0x0 mtime 2017-08-23 22:09:46.011209 bdev 1 extents [])
   -30> 2017-08-23 22:09:46.011233 7f2b666a6e00 10 bluefs _flush 0x564cc4553860 ignoring, length 35 < min_flush_size 524288
   -29> 2017-08-23 22:09:46.011266 7f2b666a6e00 10 bluefs _flush 0x564cc4553860 ignoring, length 3433 < min_flush_size 524288
   -28> 2017-08-23 22:09:46.011286 7f2b666a6e00 10 bluefs _flush 0x564cc4553860 ignoring, length 3453 < min_flush_size 524288
   -27> 2017-08-23 22:09:46.011287 7f2b666a6e00 10 bluefs _flush 0x564cc4553860 ignoring, length 3453 < min_flush_size 524288
   -26> 2017-08-23 22:09:46.011289 7f2b666a6e00 10 bluefs _fsync 0x564cc4553860 file(ino 5120 size 0x0 mtime 2017-08-23 22:09:46.011209 bdev 1 extents [])
   -25> 2017-08-23 22:09:46.011294 7f2b666a6e00 10 bluefs _flush 0x564cc4553860 0x0~d7d to file(ino 5120 size 0x0 mtime 2017-08-23 22:09:46.011209 bdev 1 extents [])
   -24> 2017-08-23 22:09:46.011298 7f2b666a6e00 10 bluefs _flush_range 0x564cc4553860 pos 0x0 0x0~d7d to file(ino 5120 size 0x0 mtime 2017-08-23 22:09:46.011209 bdev 1 extents [])
   -23> 2017-08-23 22:09:46.011303 7f2b666a6e00 10 bluefs _allocate len 0xd7d from 1
   -22> 2017-08-23 22:09:46.011305 7f2b666a6e00 10 bitmapalloc:reserve instance 94887706734432 num_used 20377 total 20378
   -21> 2017-08-23 22:09:46.011308 7f2b666a6e00 10 bitmapalloc:allocate instance 94887706734432 want_size 1048576 alloc_unit 1048576 hint 0
   -20> 2017-08-23 22:09:46.011320 7f2b666a6e00 20 bluefs _flush_range dirty_seq = 651752 (was clean)
   -19> 2017-08-23 22:09:46.011323 7f2b666a6e00 20 bluefs _flush_range file now file(ino 5120 size 0xd7d mtime 2017-08-23 22:09:46.011319 bdev 1 extents [1:0x1e2400000+100000])
   -18> 2017-08-23 22:09:46.011327 7f2b666a6e00 20 bluefs _flush_range in 1:0x1e2400000+100000 x_off 0x0
   -17> 2017-08-23 22:09:46.011331 7f2b666a6e00 20 bluefs _flush_range caching tail of 0xd7d and padding block with 0x283
   -16> 2017-08-23 22:09:46.011372 7f2b666a6e00 20 bluefs _flush_range h 0x564cc4553860 pos now 0xd7d
   -15> 2017-08-23 22:09:46.011377 7f2b666a6e00 10 bluefs _claim_completed_aios got 1 aios
   -14> 2017-08-23 22:09:46.011379 7f2b666a6e00 10 bluefs wait_for_aio 0x564cc4553860
   -13> 2017-08-23 22:09:46.011532 7f2b666a6e00 10 bluefs wait_for_aio 0x564cc4553860 done in 0.000151
   -12> 2017-08-23 22:09:46.011538 7f2b666a6e00 20 bluefs flush_bdev
   -11> 2017-08-23 22:09:46.020780 7f2b666a6e00 20 bluefs _fsync file metadata was dirty (651752) on file(ino 5120 size 0xd7d mtime 2017-08-23 22:09:46.011319 bdev 1 extents [1:0x1e2400000+100000]), flushing log
   -10> 2017-08-23 22:09:46.020810 7f2b666a6e00 20 bluefs _flush_and_sync_log 1 dirty_files
    -9> 2017-08-23 22:09:46.020813 7f2b666a6e00 20 bluefs _flush_and_sync_log   op_file_update file(ino 5120 size 0xd7d mtime 2017-08-23 22:09:46.011319 bdev 1 extents [1:0x1e2400000+100000])
    -8> 2017-08-23 22:09:46.020818 7f2b666a6e00 10 bluefs _flush_and_sync_log txn(seq 651752 len 0x5e crc 0x417b62b0)
    -7> 2017-08-23 22:09:46.020822 7f2b666a6e00 10 bluefs _flush_and_sync_log allocating more log runway (0xffffffffff55a000 remaining)
    -6> 2017-08-23 22:09:46.020824 7f2b666a6e00 10 bluefs _allocate len 0x400000 from 0
    -5> 2017-08-23 22:09:46.020826 7f2b666a6e00 10 bluefs _allocate len 0x400000 from 1
    -4> 2017-08-23 22:09:46.020827 7f2b666a6e00 10 bitmapalloc:reserve instance 94887706734432 num_used 20378 total 20378
    -3> 2017-08-23 22:09:46.020829 7f2b666a6e00  1 bluefs _allocate failed to allocate 0x400000 on bdev 1, free 0x0; fallback to bdev 2
    -2> 2017-08-23 22:09:46.020831 7f2b666a6e00 10 bluefs _allocate len 0x400000 from 2
    -1> 2017-08-23 22:09:46.020832 7f2b666a6e00 -1 bluefs _allocate failed to allocate 0x400000 on bdev 2, dne
     0> 2017-08-23 22:09:46.031080 7f2b666a6e00 -1 /build/ceph-12.1.4/src/os/bluestore/BlueFS.cc: In function 'int BlueFS::_flush_and_sync_log(std::unique_lock<std::mutex>&, uint64_t, uint64_t)' thread 7f2b666a6e00 time 2017-08-23 22:09:46.020835
/build/ceph-12.1.4/src/os/bluestore/BlueFS.cc: 1394: FAILED assert(r == 0)

 ceph version 12.1.4 (a5f84b37668fc8e03165aaf5cbb380c78e4deba4) luminous (rc)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x102) [0x564cba859042]
 2: (BlueFS::_flush_and_sync_log(std::unique_lock<std::mutex>&, unsigned long, unsigned long)+0x183d) [0x564cba7def6d]
 3: (BlueFS::_fsync(BlueFS::FileWriter*, std::unique_lock<std::mutex>&)+0x3a1) [0x564cba7e05d1]
 4: (BlueRocksWritableFile::Sync()+0x63) [0x564cba7f1c03]
 5: (rocksdb::WritableFileWriter::SyncInternal(bool)+0x31c) [0x564cbac2276c]
 6: (rocksdb::WritableFileWriter::Sync(bool)+0x132) [0x564cbac25342]
 7: (rocksdb::SyncManifest(rocksdb::Env*, rocksdb::ImmutableDBOptions const*, rocksdb::WritableFileWriter*)+0x1c3) [0x564cbac28f73]
 8: (rocksdb::VersionSet::LogAndApply(rocksdb::ColumnFamilyData*, rocksdb::MutableCFOptions const&, rocksdb::autovector<rocksdb::VersionEdit*, 8ul> const&, rocksdb::InstrumentedMutex*, rocksdb::Directory*, bool, rocksdb::ColumnFamilyOptions const*)+0x173e) [0x564cbaba899e]
 9: (rocksdb::DBImpl::RecoverLogFiles(std::vector<unsigned long, std::allocator<unsigned long> > const&, unsigned long*, bool)+0x1c48) [0x564cbab71e28]
 10: (rocksdb::DBImpl::Recover(std::vector<rocksdb::ColumnFamilyDescriptor, std::allocator<rocksdb::ColumnFamilyDescriptor> > const&, bool, bool, bool)+0x8c4) [0x564cbab72f34]
 11: (rocksdb::DB::Open(rocksdb::DBOptions const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::vector<rocksdb::ColumnFamilyDescriptor, std::allocator<rocksdb::ColumnFamilyDescriptor> > const&, std::vector<rocksdb::ColumnFamilyHandle*, std::allocator<rocksdb::ColumnFamilyHandle*> >*, rocksdb::DB**)+0xedc) [0x564cbab742fc]
 12: (rocksdb::DB::Open(rocksdb::Options const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, rocksdb::DB**)+0x6b1) [0x564cbab75b91]
 13: (RocksDBStore::do_open(std::ostream&, bool)+0x8ff) [0x564cba75d19f]
 14: (BlueStore::_open_db(bool)+0xf73) [0x564cba6ea3f3]
 15: (BlueStore::_mount(bool)+0x38d) [0x564cba71a2cd]
 16: (main()+0x14ab) [0x564cba1a672b]
 17: (__libc_start_main()+0xf1) [0x7f2b63b0c3f1]
 18: (_start()+0x29) [0x564cba233a69]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
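
If I read the log right, the sequence is: _flush_and_sync_log needs to allocate more log runway, _allocate finds 0 bytes free on bdev 1, falls back to bdev 2 which does not exist, and the resulting error trips the FAILED assert(r == 0) at BlueFS.cc:1394. Below is a rough standalone sketch of that control flow as I understand it (purely illustrative, not the real BlueFS code; all names are made up):

    // Simplified, standalone sketch of the failure path visible in the log above.
    // NOT the real BlueFS source; names and structure are made up for illustration.
    #include <cassert>
    #include <cstdint>
    #include <cstdio>
    #include <vector>

    struct FakeBdev {
        bool exists;
        uint64_t free_bytes;
    };

    // Mimics the fallback behaviour in the log: try a bdev, then fall back to the next one.
    static int fake_allocate(std::vector<FakeBdev>& bdevs, unsigned id, uint64_t want) {
        if (id >= bdevs.size() || !bdevs[id].exists) {
            std::fprintf(stderr, "_allocate failed to allocate 0x%llx on bdev %u, dne\n",
                         (unsigned long long)want, id);
            return -1;  // no space anywhere
        }
        if (bdevs[id].free_bytes < want) {
            std::fprintf(stderr, "_allocate failed on bdev %u, free 0x%llx; fallback to bdev %u\n",
                         id, (unsigned long long)bdevs[id].free_bytes, id + 1);
            return fake_allocate(bdevs, id + 1, want);
        }
        bdevs[id].free_bytes -= want;
        return 0;
    }

    int main() {
        // bdev 1 exists but has 0 bytes free; bdev 2 does not exist (matches the log).
        std::vector<FakeBdev> bdevs = { {true, 0}, {true, 0}, {false, 0} };
        int r = fake_allocate(bdevs, 1, 0x400000);  // the 4 MiB "log runway" request
        // This aborts, mirroring the FAILED assert(r == 0) in _flush_and_sync_log.
        assert(r == 0);
        return 0;
    }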

The problem is that, with almost every tool refusing to access this partition, I have no way to diagnose the issue, nor to modify the data to fix it (e.g. by deleting something to make room).

AFAIK these commands all fail with the same log shown above:
  • ceph-osd with any parameters
  • ceph-objectstore-tool with --op info, log, fsck, dup, export, set-osdmap, list-pgs, etc.
  • ceph-bluestore-tool with --command fsck or deep fsck
  • ceph-bluestore-tool with --command bluefs-export or show-label does work, but that doesn't help

So my attempts to either remove something to make room, or to export the PGs somewhere else, simply don't work.

The only thing left for me to do was to extend the partition, in the hope of making more room. But the result was even stranger. After extending the partition from 10G to 20G, this line in the log,

    -4> 2017-08-23 22:09:46.020827 7f2b666a6e00 10 bitmapalloc:reserve instance 94887706734432 num_used 10378 total 10378

simply became
    -4> 2017-08-23 22:09:46.020827 7f2b666a6e00 10 bitmapalloc:reserve instance 94887706734432 num_used 20378 total 20378

with 10,000 more blocks in total, all of them marked as used.
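
Doing the arithmetic with the 1 MiB alloc_unit shown earlier in the log ("alloc_unit 1048576"), those totals do roughly correspond to 10G and 20G worth of blocks, so the allocator clearly sees the resized partition; it just reports every block, including the newly added ones, as used. A quick back-of-the-envelope check (my own, purely for illustration):

    // Back-of-the-envelope check of the bitmapalloc numbers in the log, assuming
    // the 1 MiB alloc_unit shown earlier ("alloc_unit 1048576"). Illustrative only.
    #include <cstdint>
    #include <cstdio>

    int main() {
        const double mib = 1024.0 * 1024.0;
        const double gib = 1024.0 * mib;

        const uint64_t total_before = 10378;  // "num_used 10378 total 10378" (10G partition)
        const uint64_t total_after  = 20378;  // "num_used 20378 total 20378" (20G partition)

        // Converting the block counts back to capacity: roughly 10 GiB and 20 GiB.
        std::printf("before: %.1f GiB\n", total_before * mib / gib);  // ~10.1 GiB
        std::printf("after:  %.1f GiB\n", total_after  * mib / gib);  // ~19.9 GiB

        // But num_used == total in both cases, i.e. every block (including the
        // ~10000 new ones) is reported as used, leaving 0 bytes free for BlueFS.
        return 0;
    }

In other words, even after doubling the partition there are still 0 free blocks for BlueFS to allocate from, which is presumably why the exact same assert fires.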

Well, I'm totally lost and wondering what I can do next. I hope someone can help.

BTW this editor killed my Chrome dozens of times :(
