Support #21208

Negative Runway in BlueFS?

Added by WANG Guoqin about 2 years ago.

Status: New
Priority: Normal
Assignee: -
Category: OSD
Target version:
Start date: 08/31/2017
Due date:
% Done: 0%
Tags:
Reviewed:
Affected Versions:
Pull request ID:

Description

Last week I tried to insert some osdmaps into OSDs using ceph-objectstore-tool, and now all the OSDs using BlueFS refuse to start. They keep trying to allocate more space and fail with the following log:

   -10> 2017-08-31 21:57:30.799962 7f069abf8e00 10 bluefs sync_metadata
    -9> 2017-08-31 21:57:30.799963 7f069abf8e00 20 bluefs flush_bdev
    -8> 2017-08-31 21:57:30.799966 7f069abf8e00 10 bluefs _flush_and_sync_log txn(seq 651792 len 0x130 crc 0x804e23ec)
    -7> 2017-08-31 21:57:30.799969 7f069abf8e00 10 bluefs _flush_and_sync_log allocating more log runway (0xffffffffff532000 remaining)
    -6> 2017-08-31 21:57:30.799970 7f069abf8e00 10 bluefs _allocate len 0x400000 from 0
    -5> 2017-08-31 21:57:30.799971 7f069abf8e00 10 bluefs _allocate len 0x400000 from 1
    -4> 2017-08-31 21:57:30.799972 7f069abf8e00 10 stupidalloc reserve need 0x400000 num_free 0x3e000 num_reserved 0x0
    -3> 2017-08-31 21:57:30.799974 7f069abf8e00  1 bluefs _allocate failed to allocate 0x400000 on bdev 1, free 0x3e000; fallback to bdev 2
    -2> 2017-08-31 21:57:30.799975 7f069abf8e00 10 bluefs _allocate len 0x400000 from 2
    -1> 2017-08-31 21:57:30.799976 7f069abf8e00 -1 bluefs _allocate failed to allocate 0x400000 on bdev 2, dne
     0> 2017-08-31 21:57:30.810152 7f069abf8e00 -1 /build/ceph-12.2.0/src/os/bluestore/BlueFS.cc: In function 'int BlueFS::_flush_and_sync_log(std::unique_lock<std::mutex>&, uint64_t, uint64_t)' thread 7f069abf8e00 time 2017-08-31 21:57:30.799979
/build/ceph-12.2.0/src/os/bluestore/BlueFS.cc: 1394: FAILED assert(r == 0)

I noticed that the "allocating more log runway" message reports a (seemingly) negative amount of runway remaining. The runway is computed as

  int64_t runway = log_writer->file->fnode.get_allocated() -
    log_writer->get_effective_write_pos();

and

    uint64_t get_effective_write_pos() {
      buffer_appender.flush();
      return pos + buffer.length();
    }

Comparing this with the following line from the log,

  -966> 2017-08-31 21:57:30.776060 7f069abf8e00 10 bluefs mount log write pos set to 0xace000

the runway equals minus pos: 0xffffffffff532000 read as a signed 64-bit value is -0xace000. Earlier logs, from before some additional allocation, show the same pattern, with runway 0xffffffffff55a000 and pos 0xaa6000.
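
To make the arithmetic concrete, here is a minimal standalone sketch (not Ceph code) that reinterprets the logged hex values as signed 64-bit integers. The helper names as_signed, runway and check_values_from_log are hypothetical. Passing an allocated size of 0 is only an inference from the numbers: it is the value that would make runway equal to minus pos, and the log does not state it.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: reinterpret a 64-bit value, as printed in hex by the
// log, as a signed integer. The unsigned-to-signed conversion wraps on
// two's-complement platforms (guaranteed by the standard since C++20).
constexpr int64_t as_signed(uint64_t raw) {
  return static_cast<int64_t>(raw);
}

// Hypothetical helper with the same shape as the quoted computation:
// unsigned "allocated" minus unsigned "write position", stored as int64_t.
constexpr int64_t runway(uint64_t allocated, uint64_t write_pos) {
  return static_cast<int64_t>(allocated - write_pos);
}

// Checks the two (runway, pos) pairs quoted from the logs.
inline void check_values_from_log() {
  // Runway 0xffffffffff532000 with log write pos 0xace000.
  assert(as_signed(0xffffffffff532000ULL) == -0xace000LL);
  // Earlier pair: runway 0xffffffffff55a000 with pos 0xaa6000.
  assert(as_signed(0xffffffffff55a000ULL) == -0xaa6000LL);
  // ASSUMPTION: allocated == 0 reproduces both logged runway values.
  assert(runway(0, 0xace000) == as_signed(0xffffffffff532000ULL));
  assert(runway(0, 0xaa6000) == as_signed(0xffffffffff55a000ULL));
}
```

Calling check_values_from_log() from a small main() passes all four assertions. So both logged runway values are exactly the negated write position, which is what the subtraction would produce if the allocated size it saw were zero.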

I just want to confirm whether this is expected behavior.
