Bug #53814


Pacific cluster crash

Added by Cyprien Devillez over 2 years ago. Updated over 1 year ago.

Status:
Won't Fix
Priority:
Normal
Assignee:
-
Target version:
-
% Done:

0%

Regression:
No
Severity:
3 - minor

Description

Hi all,

Last Thursday, a few days after an Octopus to Pacific upgrade on a 4-host Proxmox install, my Ceph cluster crashed.

6 of the 8 OSDs went down within a few minutes and don't come back (they crash on restart).

ceph status

  cluster:
    id:     19d4d891-5694-457c-9293-25938ba8dcca
    health: HEALTH_WARN
            3 osds down
            1 host (2 osds) down
            Reduced data availability: 257 pgs inactive, 4 pgs down, 25 pgs peering, 89 pgs stale
            Degraded data redundancy: 134052/286740 objects degraded (46.750%), 60 pgs degraded, 60 pgs undersized
            63 daemons have recently crashed
            1 slow ops, oldest one blocked for 3110 sec, mon.pve14 has slow ops

  services:
    mon: 3 daemons, quorum pve13,pve12,pve14 (age 3d)
    mgr: pve13(active, since 3d), standbys: pve12, pve14, pve11
    osd: 8 osds: 2 up (since 43m), 5 in (since 3d); 7 remapped pgs

  data:
    pools:   2 pools, 257 pgs
    objects: 95.58k objects, 352 GiB
    usage:   410 GiB used, 90 GiB / 500 GiB avail
    pgs:     65.370% pgs unknown
             34.630% pgs not active
             134052/286740 objects degraded (46.750%)
             168 unknown
             60  stale+undersized+degraded+peered
             25  stale+peering
             4   stale+down

ceph osd tree

ID  CLASS  WEIGHT   TYPE NAME       STATUS  REWEIGHT  PRI-AFF
-1         3.90637  root default                             
-3         0.97659      host pve11                           
 0    ssd  0.48830          osd.0     down   1.00000  1.00000
 1    ssd  0.48830          osd.1       up   1.00000  1.00000
-5         0.97659      host pve12                           
 3    ssd  0.48830          osd.3       up   1.00000  1.00000
 4    ssd  0.48830          osd.4     down   1.00000  1.00000
-7         0.97659      host pve13                           
 2    ssd  0.48830          osd.2     down         0  1.00000
 5    ssd  0.48830          osd.5     down         0  1.00000
-9         0.97659      host pve14                           
 6    ssd  0.48830          osd.6     down         0  1.00000
 7    ssd  0.48830          osd.7     down   1.00000  1.00000

Most ceph commands (crash ls, pg stat, ...) hang without returning anything.

I tried, without success, to set bluestore_allocator to bitmap, because it seemed to solve the problem for other users hitting the same error I found in my log:

janv. 06 14:53:19 pve11 ceph-osd[24802]: 2022-01-06T14:53:19.214+0100 7f2d01c05f00 -1 bluefs _allocate allocation failed, needed 0x8025e
janv. 06 14:53:19 pve11 ceph-osd[24802]: 2022-01-06T14:53:19.214+0100 7f2d01c05f00 -1 bluefs _flush_range allocated: 0x0 offset: 0x0 length: 0x8025e

Attached are the log of the first crashed OSD and its crash log.
Before the first crash there are many more lines than usual, like:

-2200> 2022-01-06T13:13:45.935+0100 7fce3fd3f700 10 monclient: tick
 -2199> 2022-01-06T13:13:45.935+0100 7fce3fd3f700 10 monclient: _check_auth_rotating have uptodate secrets (they expire after 2022-01-06T13:13:15.939973+0100)
 -2198> 2022-01-06T13:13:46.419+0100 7fce41a7e700  5 prioritycache tune_memory target: 4294967296 mapped: 3969687552 unmapped: 420519936 heap: 4390207488 old mem: 2845415832 new mem: 2845415832
 -2197> 2022-01-06T13:13:46.935+0100 7fce3fd3f700 10 monclient: tick
 -2196> 2022-01-06T13:13:46.935+0100 7fce3fd3f700 10 monclient: _check_auth_rotating have uptodate secrets (they expire after 2022-01-06T13:13:16.940086+0100)
 -2195> 2022-01-06T13:13:47.375+0100 7fce43a91700  4 rocksdb: [compaction/compaction_job.cc:1344] [default] [JOB 987] Generated table #30547: 133561 keys, 69031777 bytes
 -2194> 2022-01-06T13:13:47.375+0100 7fce43a91700  4 rocksdb: EVENT_LOG_v1 {"time_micros": 1641471227382765, "cf_name": "default", "job": 987, "event": "table_file_creation", "file_number": 30547, "file_size": 69031777, "table_properties": {"data_size": 67113541, "index_size": 1583332, "index_partitions": 0, "top_level_index_size": 0, "index_key_is_user_key": 0, "index_value_is_delta_encoded": 0, "filter_size": 334021, "raw_key_size": 11682564, "raw_average_key_size": 87, "raw_value_size": 62798627, "raw_average_value_size": 470, "num_data_blocks": 17217, "num_entries": 133561, "num_deletions": 0, "num_merge_operands": 0, "num_range_deletions": 0, "format_version": 0, "fixed_key_len": 0, "filter_policy": "rocksdb.BuiltinBloomFilter", "column_family_name": "default", "column_family_id": 0, "comparator": "leveldb.BytewiseComparator", "merge_operator": ".T:int64_array.b:bitwise_xor", "prefix_extractor_name": "nullptr", "property_collectors": "[]", "compression": "NoCompression", "compression_options": "window_bits=-14; level=32767; strategy=0; max_dict_bytes=0; zstd_max_train_bytes=0; enabled=0; ", "creation_time": 1640448752, "oldest_key_time": 0, "file_creation_time": 1641471220}}
 -2193> 2022-01-06T13:13:47.423+0100 7fce41a7e700  5 prioritycache tune_memory target: 4294967296 mapped: 3971039232 unmapped: 419168256 heap: 4390207488 old mem: 2845415832 new mem: 2845415832

Any advice?


Files

ceph-crash-osd.0.log.gz (12.6 KB) - Cyprien Devillez, 01/10/2022 11:13 AM
ceph-osd.0.log.1.gz (800 KB) - Cyprien Devillez, 01/10/2022 11:13 AM
ceph_storage size.png (35.4 KB) - Cyprien Devillez, 01/10/2022 02:55 PM

Related issues 1 (0 open, 1 closed)

Related to bluestore - Bug #53466: OSD is unable to allocate free space for BlueFS (Resolved, Igor Fedotov)

Actions #1

Updated by Igor Fedotov over 2 years ago

  • Status changed from New to Won't Fix

Given the following output:
-4> 2022-01-06T15:30:36.463+0100 7fb49cf7bf00 1 bluefs _allocate unable to allocate 0x90000 on bdev 1, allocator name block, allocator type bitmap, capacity 0x7cffc00000, block size 0x1000, free 0xdf5c6e000, fragmentation 1, allocated 0x0
-3> 2022-01-06T15:30:36.463+0100 7fb49cf7bf00 -1 bluefs _allocate allocation failed, needed 0x8025e

it looks like the storage device is almost full (only 0xdf5c6e000 bytes free of 0x7cffc00000 capacity) and the remaining space is totally fragmented (fragmentation 1). As a result, BlueFS fails to allocate any extent of 64K length, which is the minimal BlueFS allocation unit by default (determined by the bluefs_shared_min_alloc_size parameter in this case, as there is no standalone DB device).

So I can see two options to proceed:
1) (PREFERRED) Increase the size of the underlying storage device (e.g. by expanding the underlying LVM volume) and adjust the BlueFS size via ceph-bluestore-tool's bluefs-bdev-expand command (a sketch follows this list).

2) Set bluefs_shared_min_alloc_size to 4096.
This might incur a performance penalty and is irreversible. Please try it on a single OSD first and make sure it works as expected, since this is a non-default setting that has been barely (if ever) tested.
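
A minimal sketch of option 1, assuming the OSD is osd.0 and its block device sits on LVM (the volume and device names below are hypothetical):

```
# stop the OSD before resizing its backing device
systemctl stop ceph-osd@0

# grow the logical volume backing the OSD block device (hypothetical LV name)
lvextend -L +50G /dev/ceph-vg/osd-block-0

# let BlueFS/BlueStore take over the newly added space
ceph-bluestore-tool bluefs-bdev-expand --path /var/lib/ceph/osd/ceph-0

systemctl start ceph-osd@0
```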

Actions #2

Updated by Cyprien Devillez over 2 years ago

Thanks for your reply.

I can't allocate more space, because it's on remote servers with fully allocated disks.
I tried to set bluefs_shared_min_alloc_size on one OSD (I don't care much about the data; I have backups and would probably rebuild the Ceph cluster from scratch after this test), but it seems I'm not using the right command:

root@pve11:~# ceph config set osd.0 bluefs_shared_min_alloc_size 4096
Error EINVAL: unrecognized config option 'bluefs_shared_min_alloc_size'

I found some references to a --bluefs-shared-alloc-size 4096 argument and to bluestore_min_alloc_size_ssd; should I use one of these?

Another thing about storage usage: for a long time (a few weeks) it was stable at around 80%.
The space is used by pre-allocated VM volumes whose cumulative size is around 1 TB (of 1.25 TB total) and should not grow.
But looking at the (attached) Proxmox monitoring graph, the reported size (free and total) started to grow 3 hours before the crash.

Actions #3

Updated by Igor Fedotov over 2 years ago

Sorry, I meant bluefs_shared_alloc_size indeed.

As for the space usage growth, I don't have any ideas at this point; let's bring the OSDs up first.

Cyprien Devillez wrote:

Thanks for your reply.

I can't allocate more space, because it's on remote servers with fully allocated disks.
I tried to set bluefs_shared_min_alloc_size on one OSD (I don't care much about the data; I have backups and would probably rebuild the Ceph cluster from scratch after this test), but it seems I'm not using the right command:

[...]

I found some references to a --bluefs-shared-alloc-size 4096 argument and to bluestore_min_alloc_size_ssd; should I use one of these?

Another thing about storage usage: for a long time (a few weeks) it was stable at around 80%.
The space is used by pre-allocated VM volumes whose cumulative size is around 1 TB (of 1.25 TB total) and should not grow.
But looking at the (attached) Proxmox monitoring graph, the reported size (free and total) started to grow 3 hours before the crash.

Actions #4

Updated by Cyprien Devillez over 2 years ago

With ceph config set osd.0 bluefs_shared_alloc_size 4096 and systemctl start, osd.0 is back and has been running stably for 15 minutes.
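
A quick way to double-check the value the OSD actually picked up (a sketch, assuming osd.0):

```
# show the value currently applied to osd.0
ceph config get osd.0 bluefs_shared_alloc_size
```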

Can I try the same with all the crashed OSDs?

When I rebuild a new cluster, what can I do to prevent this from happening again?

Actions #5

Updated by Igor Fedotov over 2 years ago

Cyprien Devillez wrote:

With ceph config set osd.0 bluefs_shared_alloc_size 4096 and systemctl start, osd.0 is back and has been running stably for 15 minutes.

Can I try the same with all the crashed OSDs?

Yeah, given it worked fine for a single OSD and you don't care much about the data, I think you're good to go.

When I rebuild a new cluster, what can I do to prevent this from happening again?

IMO the issue is caused by both high space fragmentation and pretty high space utilization, so generally you should take care of both. Unfortunately, without a thorough investigation I can only provide some general and straightforward ideas (a sketch follows the list):
1) Use the hybrid allocator from the beginning; this should produce less fragmentation over time. E.g., depending on the Ceph release initially deployed, one could have been running the bitmap allocator for a long period, which tends to fragment space under some access patterns.
2) Keep enough free space / use larger disks; the more free space is available, the higher the probability of finding contiguous blocks. And generally, smaller disks tend to suffer from high fragmentation more frequently.
3) One can start using a standalone DB partition located on the same or a different physical volume; this ensures all blocks on that partition fit the minimal BlueFS allocation size, and BlueFS would spill over to the main device only in the (hopefully rare) case of space exhaustion. But as usual for any static allocation, one might face other issues: proper sizing, potentially inefficient space utilization, etc.
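
A minimal sketch of ideas 1 and 3 for a fresh deployment, assuming ceph-volume is used; the device paths are hypothetical:

```
# idea 1: make sure the hybrid allocator is in use (the default on recent releases)
ceph config set osd bluestore_allocator hybrid

# idea 3: deploy the OSD with a standalone RocksDB/BlueFS (DB) partition
# (hypothetical devices: /dev/sdb for data, /dev/sdc1 for the DB)
ceph-volume lvm create --data /dev/sdb --block.db /dev/sdc1
```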

Actions #6

Updated by Cyprien Devillez over 2 years ago

I haven't had much time since my last post, so I only tried it on another OSD today, and it's not working.

Now I have this error:

```
janv. 13 13:05:58 pve14 systemd[1]: Starting Ceph object storage daemon osd.7...
janv. 13 13:05:58 pve14 systemd[1]: Started Ceph object storage daemon osd.7.
janv. 13 13:06:14 pve14 ceph-osd[2821431]: ./src/os/bluestore/BlueFS.cc: In function 'int BlueFS::_write_super(int)' thread 7fa9df456f00 time 2022-01-13T13:06:14.347694+0100
janv. 13 13:06:14 pve14 ceph-osd[2821431]: ./src/os/bluestore/BlueFS.cc: 926: FAILED ceph_assert(bl.length() <= get_super_length())
janv. 13 13:06:14 pve14 ceph-osd[2821431]: ceph version 16.2.7 (f9aa029788115b5df5eeee328f584156565ee5b7) pacific (stable)
janv. 13 13:06:14 pve14 ceph-osd[2821431]: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x124) [0x5597005b0ae6]
janv. 13 13:06:14 pve14 ceph-osd[2821431]: 2: /usr/bin/ceph-osd(+0xabcc71) [0x5597005b0c71]
janv. 13 13:06:14 pve14 ceph-osd[2821431]: 3: (BlueFS::_write_super(int)+0x586) [0x559700c9ff86]
janv. 13 13:06:14 pve14 ceph-osd[2821431]: 4: (BlueFS::_compact_log_async(std::unique_lock<std::mutex>&)+0xb33) [0x559700caa363]
janv. 13 13:06:14 pve14 ceph-osd[2821431]: 5: (BlueFS::_flush(BlueFS::FileWriter*, bool, std::unique_lock<std::mutex>&)+0x67) [0x559700cab497]
janv. 13 13:06:14 pve14 ceph-osd[2821431]: 6: (BlueRocksWritableFile::Append(rocksdb::Slice const&)+0x100) [0x559700cc37d0]
janv. 13 13:06:14 pve14 ceph-osd[2821431]: 7: (rocksdb::LegacyWritableFileWrapper::Append(rocksdb::Slice const&, rocksdb::IOOptions const&, rocksdb::IODebugContext*)+0x48) [0x55970118a24e]
janv. 13 13:06:14 pve14 ceph-osd[2821431]: 8: (rocksdb::WritableFileWriter::WriteBuffered(char const*, unsigned long)+0x338) [0x559701364d18]
janv. 13 13:06:14 pve14 ceph-osd[2821431]: 9: (rocksdb::WritableFileWriter::Append(rocksdb::Slice const&)+0x5d7) [0x55970136329b]
janv. 13 13:06:14 pve14 ceph-osd[2821431]: 10: (rocksdb::BlockBasedTableBuilder::WriteRawBlock(rocksdb::Slice const&, rocksdb::CompressionType, rocksdb::BlockHandle*, bool)+0x11d) [0x55970152d2d7]
janv. 13 13:06:14 pve14 ceph-osd[2821431]: 11: (rocksdb::BlockBasedTableBuilder::WriteBlock(rocksdb::Slice const&, rocksdb::BlockHandle*, bool)+0x7d0) [0x55970152d0be]
janv. 13 13:06:14 pve14 ceph-osd[2821431]: 12: (rocksdb::BlockBasedTableBuilder::WriteBlock(rocksdb::BlockBuilder*, rocksdb::BlockHandle*, bool)+0x48) [0x55970152c8da]
janv. 13 13:06:14 pve14 ceph-osd[2821431]: 13: (rocksdb::BlockBasedTableBuilder::Flush()+0x9a) [0x55970152c88a]
janv. 13 13:06:14 pve14 ceph-osd[2821431]: 14: (rocksdb::BlockBasedTableBuilder::Add(rocksdb::Slice const&, rocksdb::Slice const&)+0x197) [0x55970152c3bf]
janv. 13 13:06:14 pve14 ceph-osd[2821431]: 15: (rocksdb::BuildTable(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, rocksdb::Env*, rocksdb::FileSystem*, rocksdb::ImmutableCFOptions const&, rocks>
janv. 13 13:06:14 pve14 ceph-osd[2821431]: 16: (rocksdb::DBImpl::WriteLevel0TableForRecovery(int, rocksdb::ColumnFamilyData*, rocksdb::MemTable*, rocksdb::VersionEdit*)+0x5ea) [0x559701228226]
janv. 13 13:06:14 pve14 ceph-osd[2821431]: 17: (rocksdb::DBImpl::RecoverLogFiles(std::vector<unsigned long, std::allocator<unsigned long> > const&, unsigned long*, bool, bool*)+0x1ad1) [0x559701226e9d]
janv. 13 13:06:14 pve14 ceph-osd[2821431]: 18: (rocksdb::DBImpl::Recover(std::vector<rocksdb::ColumnFamilyDescriptor, std::allocator<rocksdb::ColumnFamilyDescriptor> > const&, bool, bool, bool, unsigned long*)+0x159e) [0x5597012243d4]
janv. 13 13:06:14 pve14 ceph-osd[2821431]: 19: (rocksdb::DBImpl::Open(rocksdb::DBOptions const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::vector<rocksdb::ColumnFamilyDescriptor, std:>
janv. 13 13:06:14 pve14 ceph-osd[2821431]: 20: (rocksdb::DB::Open(rocksdb::DBOptions const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::vector<rocksdb::ColumnFamilyDescriptor, std::all>
janv. 13 13:06:14 pve14 ceph-osd[2821431]: 21: (RocksDBStore::do_open(std::ostream&, bool, bool, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0x10a6) [0x5597011398b6]
janv. 13 13:06:14 pve14 ceph-osd[2821431]: 22: (BlueStore::_open_db(bool, bool, bool)+0xa19) [0x559700bb7b19]
janv. 13 13:06:14 pve14 ceph-osd[2821431]: 23: (BlueStore::_open_db_and_around(bool, bool)+0x332) [0x559700bfcb92]
janv. 13 13:06:14 pve14 ceph-osd[2821431]: 24: (BlueStore::_mount()+0x191) [0x559700bff531]
janv. 13 13:06:14 pve14 ceph-osd[2821431]: 25: (OSD::init()+0x58d) [0x5597006a65ed]
janv. 13 13:06:14 pve14 ceph-osd[2821431]: 26: main()
janv. 13 13:06:14 pve14 ceph-osd[2821431]: 27: __libc_start_main()
janv. 13 13:06:14 pve14 ceph-osd[2821431]: 28: _start()
janv. 13 13:06:14 pve14 ceph-osd[2821431]: *** Caught signal (Aborted) **
```

So I tried fsck on the OSD and got:

```
root@pve14:~# ceph-bluestore-tool fsck --path /var/lib/ceph/osd/ceph-7/
2022-01-13T13:44:28.903+0100 7fe341600240 -1 bluefs _check_new_allocations invalid extent 1: 0xaa6b29000~2000: duplicate reference, ino 1
2022-01-13T13:44:28.903+0100 7fe341600240 -1 bluefs mount failed to replay log: (14) Bad address
2022-01-13T13:44:28.903+0100 7fe341600240 -1 bluestore(/var/lib/ceph/osd/ceph-7) _open_bluefs failed bluefs mount: (14) Bad address
2022-01-13T13:44:28.903+0100 7fe341600240 -1 bluestore(/var/lib/ceph/osd/ceph-7) _open_db failed to prepare db environment:
fsck failed: (5) Input/output error
```

This led me to issue https://tracker.ceph.com/issues/48036 (and https://github.com/rook/rook/pull/6793). It seems relevant because over the last year 4 disks have been replaced (SSD wear-out), and each time the OSD was removed and recreated on the new disk with the same name.
Could that be the root cause?

Actions #7

Updated by Igor Fedotov over 2 years ago

Cyprien Devillez wrote:

I haven't had much time since my last post, so I only tried it on another OSD today, and it's not working.

Now I have this error:

Could you please collect an OSD startup log with debug-bluefs set to 10?
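
One possible way to capture such a log, sketched under the assumption of osd.7 and default Debian/Proxmox log paths:

```
# raise BlueFS verbosity for the next start, in /etc/ceph/ceph.conf:
#   [osd.7]
#   debug bluefs = 10

# restart the OSD (it will likely crash again, but log verbosely)
systemctl restart ceph-osd@7

# compress the resulting startup log for upload
gzip -c /var/log/ceph/ceph-osd.7.log > ceph-osd.7.log.gz
```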

Actions #8

Updated by Cyprien Devillez over 2 years ago

Igor Fedotov wrote:

Could you please collect an OSD startup log with debug-bluefs set to 10?

Yes, you can download it at: http://share.bidouille.info/ceph-osd.7.log.gz

Actions #9

Updated by Igor Fedotov about 2 years ago

So unfortunately this last failure looks like a bug caused by downsizing the allocation granularity to 4K, which, as I mentioned, is a risky thing due to limited QA coverage.
With such a small allocation size (and hence potentially many more allocation units), the BlueFS log allocation map takes more space than the BlueFS superblock can hold. Hence the assertion.

At this point I would suggest backing up the data (if needed) and redeploying the cluster.

As a workaround to bring this OSD up, you might want to try disabling BlueFS log compaction by setting the bluefs_log_compact_min_ratio config parameter to a very high value (e.g. 1000000). This would hopefully allow the OSD to come up, or let you run ceph-objectstore-tool and export the PG data. Please don't leave an OSD running with this workaround for regular long-term usage, to avoid additional, still undiscovered issues. It's for data recovery only!
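
A sketch of that recovery path, assuming osd.7; the PG id used here is hypothetical:

```
# effectively disable BlueFS log compaction, in /etc/ceph/ceph.conf:
#   [osd.7]
#   bluefs_log_compact_min_ratio = 1000000

# either try to bring the OSD up temporarily for recovery...
systemctl start ceph-osd@7

# ...or export PG data offline with ceph-objectstore-tool (hypothetical pgid)
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-7 \
    --pgid 2.1a --op export --file /root/pg.2.1a.export
```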

Actions #10

Updated by jianwei zhang over 1 year ago

ceph-14.2.8 has the same problem.

It is unacceptable that the serialized size of the superblock exceeds the expected value.

This needs attention, or at least a repair plan; otherwise a large number of OSDs will be unable to come up.

Actions #11

Updated by Igor Fedotov over 1 year ago

jianwei zhang wrote:

ceph-14.2.8 has the same problem.

It is unacceptable that the serialized size of the superblock exceeds the expected value.

This needs attention, or at least a repair plan; otherwise a large number of OSDs will be unable to come up.

https://github.com/ceph/ceph/pull/48854 addresses the issue.

Actions #12

Updated by jianwei zhang over 1 year ago

Igor Fedotov wrote:

jianwei zhang wrote:

ceph-14.2.8 has the same problem.

It is unacceptable that the serialized size of the superblock exceeds the expected value.

This needs attention, or at least a repair plan; otherwise a large number of OSDs will be unable to come up.

https://github.com/ceph/ceph/pull/48854 addresses the issue.

https://github.com/ceph/ceph/blob/v14.2.8/src/os/bluestore/BlueFS.cc#L686

bluefs_super_t.bluefs_fnode_t.extents is very large.

assert location: ceph_assert(bl.length() <= get_super_length());

int BlueFS::_write_super(int dev)
{
  // build superblock
  bufferlist bl;
  encode(super, bl);
  uint32_t crc = bl.crc32c(-1);
  encode(crc, bl);
  dout(10) << __func__ << " super block length(encoded): " << bl.length() << dendl;
  dout(10) << __func__ << " superblock " << super.version << dendl;
  dout(10) << __func__ << " log_fnode " << super.log_fnode << dendl;
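  // The encoded superblock, including log_fnode's extent list, must fit into
  // the fixed-size superblock region on disk; a heavily fragmented BlueFS log
  // can overflow it and trip the assert below.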
  ceph_assert(bl.length() <= get_super_length());
  bl.append_zero(get_super_length() - bl.length());

  bdev[dev]->write(get_super_offset(), bl, false, WRITE_LIFE_SHORT);
  dout(20) << __func__ << " v " << super.version
           << " crc 0x" << std::hex << crc
           << " offset 0x" << get_super_offset() << std::dec
           << dendl;
  return 0;
}

Actions #13

Updated by Igor Fedotov over 1 year ago

jianwei zhang wrote:

Igor Fedotov wrote:

jianwei zhang wrote:

ceph-14.2.8 has the same problem.

It is unacceptable that the serialized size of the superblock exceeds the expected value.

This needs attention, or at least a repair plan; otherwise a large number of OSDs will be unable to come up.

https://github.com/ceph/ceph/pull/48854 addresses the issue.

[...]

bluefs_super_t.bluefs_fnode_t.extents is very large.

assert location: ceph_assert(bl.length() <= get_super_length());

[...]

Yes, this is a known issue. The PR I shared above fixes it by keeping just a starter part of the BlueFS log in the superblock, hence avoiding its overflow. Additionally, the BlueFS fnode got an incremental update mode in recent Ceph releases (Quincy+); Pacific is to get this in an upcoming release too. Unfortunately, Nautilus won't get any relevant fixes as it is at EOL now.

Actions #14

Updated by jianwei zhang over 1 year ago

Igor Fedotov wrote:

Yes, this is a known issue. The PR I shared above fixes it by keeping just a starter part of the BlueFS log in the superblock, hence avoiding its overflow. Additionally, the BlueFS fnode got an incremental update mode in recent Ceph releases (Quincy+); Pacific is to get this in an upcoming release too. Unfortunately, Nautilus won't get any relevant fixes as it is at EOL now.

Thank you for your reply!

I think your patch should also be associated with this issue, or this issue should be linked to https://tracker.ceph.com/issues/53466.

Actions #15

Updated by Igor Fedotov over 1 year ago

  • Related to Bug #53466: OSD is unable to allocate free space for BlueFS added
Actions #16

Updated by jianwei zhang over 1 year ago

https://github.com/ceph/ceph/pull/48854/commits/b65c780a3b524a44d0f860b0edda3baaac13c539
os/bluestore: prepend compacted BlueFS log with a starter part.

Actions #17

Updated by jianwei zhang over 1 year ago

2022-12-15 10:34:33.924 7fa072da2a80 10 bluefs _write_super super block length(encoded): 4375
2022-12-15 10:34:33.924 7fa072da2a80 10 bluefs _write_super superblock 250504
/// 1:0x514d330000~10000: hexadecimal 10000 is 65536 in decimal (64 KiB per extent)
2022-12-15 10:34:33.924 7fa072da2a80 10 bluefs _write_super log_fnode file(ino 1 size 0x1270000 mtime 0.000000 bdev 0 allocated 1670000 extents [1:0x514d330000~10000,1:0x514d770000~10000,1:0x5152c30000~10000,1:0x515f430000~10000,1:0x5162b20000~10000,1:0x5166210000~10000,1:0x5168820000~10000,1:0x5169d30000~10000,1:0x516a080000~10000,1:0x516e970000~10000,1:0x516ff80000~10000,1:0x517a630000~10000,1:0x5180a20000~10000,1:0x5182880000~10000,1:0x51841b0000~10000,1:0x5189f50000~10000,1:0x518a780000~10000,1:0x5196d00000~10000,1:0x51971e0000~10000,1:0x5198960000~10000,1:0x51994a0000~10000,1:0x519a750000~10000,1:0x519c6c0000~10000,1:0x519f480000~10000,1:0x51a19a0000~10000,1:0x51a1dc0000~10000,1:0x51a5670000~10000,1:0x51a7cd0000~10000,1:0x51a8600000~10000,1:0x51b68b0000~10000,1:0x51b8940000~10000,1:0x51bd6c0000~10000,1:0x51ce310000~10000,1:0x51d7240000~10000,1:0x51da610000~10000,1:0x51dad00000~10000,1:0x51de260000~10000,1:0x51dec20000~10000,1:0x51e6250000~10000,1:0x51e87a0000~10000,1:0x51e9610000~10000,1:0x51ee290000~10000,1:0x51fa9c0000~10000,1:0x51fb1f0000~10000,1:0x51fc350000~10000,1:0x5200130000~10000,1:0x5202200000~10000,1:0x5203a10000~10000,1:0x5206340000~10000,1:0x520ac40000~10000,1:0x520ece0000~10000,1:0x5210940000~10000,1:0x5213ab0000~10000,1:0x521c270000~10000,1:0x521ea30000~10000,1:0x521eda0000~10000,1:0x5226430000~10000,1:0x52272e0000~10000,1:0x52298c0000~10000,1:0x52303e0000~10000,1:0x52393c0000~10000,1:0x5239820000~10000,1:0x523c4d0000~10000,1:0x5242330000~10000,1:0x5242700000~10000,1:0x5246200000~10000,1:0x5246b60000~10000,1:0x5249760000~10000,1:0x524be80000~10000,1:0x524c730000~10000,1:0x524fbd0000~10000,1:0x5253740000~10000,1:0x5260be0000~10000,1:0x5261210000~10000,1:0x5266750000~10000,1:0x5267a80000~10000,1:0x5275690000~10000,1:0x5277370000~10000,1:0x527b2d0000~10000,1:0x5284fc0000~10000,1:0x5287f20000~10000,1:0x528dd00000~10000,1:0x52930f0000~10000,1:0x52973b0000~10000,1:0x5298210000~10000,1:0x5299fc0000~10000,1:0x529b170000~10000,1:0x529b9b0000~10000,1:0x529bfa0000~10000,1:0x529c410000~10000,1:0x52a43b0000~10000,1:0x52a9950000~10000,1:0x52a9fe0000~10000,1:0x52af770000~10000,1:0x52b53d0000~10000,1:0x52b9a70000~10000,1:0x52da2d0000~10000,1:0x52e5f40000~10000,1:0x52e7da0000~10000,1:0x52eb7c0000~10000,1:0x52eba50000~10000,1:0x52ec760000~10000,1:0x52f2160000~10000,1:0x52fa2e0000~10000,1:0x52fd7b0000~10000,1:0x52fee80000~10000,1:0x53003a0000~10000,1:0x5304440000~10000,1:0x5309c40000~10000,1:0x530b000000~10000,1:0x530c750000~10000,1:0x530da20000~10000,1:0x530dee0000~10000,1:0x530f2d0000~10000,1:0x53121b0000~10000,1:0x5316010000~10000,1:0x531b150000~10000,1:0x531faf0000~10000,1:0x5324770000~10000,1:0x5326d10000~10000,1:0x532a080000~10000,1:0x5334fa0000~10000,1:0x5343b60000~10000,1:0x5347f70000~10000,1:0x534bca0000~10000,1:0x534c640000~10000,1:0x534d0d0000~10000,1:0x5351800000~10000,1:0x5356a50000~10000,1:0x535ad00000~10000,1:0x535e330000~10000,1:0x5369e70000~10000,1:0x536aa90000~10000,1:0x536b270000~10000,1:0x5372480000~10000,1:0x53795b0000~10000,1:0x5379c70000~10000,1:0x5380530000~10000,1:0x538b7b0000~10000,1:0x5394110000~10000,1:0x53974b0000~10000,1:0x53a3380000~10000,1:0x53a5720000~10000,1:0x53addd0000~10000,1:0x53aea80000~10000,1:0x53b6c30000~10000,1:0x53b7ba0000~10000,1:0x53b9040000~10000,1:0x53bb350000~10000,1:0x53bde60000~10000,1:0x53cd780000~10000,1:0x53d2ce0000~10000,1:0x53d32c0000~10000,1:0x53d7180000~10000,1:0x53d7980000~10000,1:0x53d9010000~10000,1:0x53da760000~10000,1:0x53dffd0000~10000,1:0x53e1250000~10000,1:0x53e39b0000~10000,1:0x53eb000000~10000,1:0x53ec120000~10000,1:0x53ed
7b0000~10000,1:0x53ef3d0000~10000,1:0x53ef720000~10000,1:0x53efd30000~10000,1:0x53f61a0000~10000,1:0x53f9bd0000~10000,1:0x53fc7f0000~10000,1:0x53ffaf0000~10000,1:0x5400330000~10000,1:0x5403df0000~10000,1:0x5404340000~10000,1:0x540df40000~10000,1:0x540e950000~10000,1:0x540f580000~10000,1:0x5418ac0000~10000,1:0x541fd40000~10000,1:0x5420de0000~10000,1:0x5428580000~10000,1:0x5429460000~10000,1:0x5430770000~10000,1:0x543a9a0000~10000,1:0x5446960000~10000,1:0x54483d0000~10000,1:0x54543f0000~10000,1:0x5456b70000~10000,1:0x5458400000~10000,1:0x545e790000~10000,1:0x54603e0000~10000,1:0x5466e80000~10000,1:0x5468960000~10000,1:0x546d3b0000~10000,1:0x546eca0000~10000,1:0x54704a0000~10000,1:0x5471870000~10000,1:0x54748a0000~10000,1:0x5475e70000~10000,1:0x5477700000~10000,1:0x54784a0000~10000,1:0x547ba50000~10000,1:0x547cb60000~10000,1:0x547e7e0000~10000,1:0x547f1c0000~10000,1:0x54803b0000~10000,1:0x54866f0000~10000,1:0x5489e30000~10000,1:0x548b4d0000~10000,1:0x548f490000~10000,1:0x548fee0000~10000,1:0x5490dc0000~10000,1:0x5492a20000~10000,1:0x5495e90000~10000,1:0x549c4a0000~10000,1:0x54a0f20000~10000,1:0x54a2790000~10000,1:0x54a2850000~10000,1:0x54a68f0000~10000,1:0x54a84c0000~10000,1:0x54ac110000~10000,1:0x54afa80000~10000,1:0x54b1cb0000~10000,1:0x54b2070000~10000,1:0x54b5780000~10000,1:0x54b8f70000~10000,1:0x54bb2a0000~10000,1:0x54bc0b0000~10000,1:0x54bf730000~10000,1:0x54bf780000~10000,1:0x54c9670000~10000,1:0x54cfb90000~10000,1:0x54d0230000~10000,1:0x54d1e30000~10000,1:0x54d3c40000~10000,1:0x54da580000~10000,1:0x54daf80000~10000,1:0x54df340000~10000,1:0x54e48d0000~10000,1:0x54e7d40000~10000,1:0x54e8690000~10000,1:0x54ec1c0000~10000,1:0x54efda0000~10000,1:0x54f3a50000~10000,1:0x54f5bf0000~10000,1:0x54f6220000~10000,1:0x54f6ef0000~10000,1:0x5509540000~10000,1:0x5509ad0000~10000,1:0x550c5e0000~10000,1:0x5512340000~10000,1:0x55139f0000~10000,1:0x5514730000~10000,1:0x5517e90000~10000,1:0x551b580000~10000,1:0x5522c40000~20000,1:0x55256e0000~10000,1:0x5529160000~10000,1:0x5531ae0000~10000,1:0x553a9f0000~10000,1:0x5546870000~10000,1:0x554bab0000~10000,1:0x554c990000~10000,1:0x554cdc0000~10000,1:0x554e0d0000~10000,1:0x5556640000~10000,1:0x55591d0000~10000,1:0x555aab0000~10000,1:0x555da80000~10000,1:0x555e6f0000~10000,1:0x5560100000~10000,1:0x556c590000~10000,1:0x556fb40000~10000,1:0x557b730000~10000,1:0x557c510000~10000,1:0x5582f80000~10000,1:0x5586b10000~10000,1:0x55888f0000~10000,1:0x558cb40000~10000,1:0x5591120000~10000,1:0x55a2a40000~10000,1:0x55a9890000~10000,1:0x55aa260000~10000,1:0x55b7df0000~10000,1:0x55b8e90000~10000,1:0x55ba1e0000~10000,1:0x55c6130000~10000,1:0x55c71c0000~10000,1:0x55ca9b0000~10000,1:0x55cc6c0000~10000,1:0x55d5970000~10000,1:0x55d6e20000~10000,1:0x55da1c0000~10000,1:0x55db360000~10000,1:0x55e95b0000~10000,1:0x505c200000~10000,1:0x505cbc0000~10000,1:0x505d0c0000~10000,1:0x5060510000~10000,1:0x5061de0000~10000,1:0x5064ae0000~10000,1:0x5068790000~10000,1:0x506b670000~10000,1:0x506eb40000~10000,1:0x5070800000~10000,1:0x5072280000~10000,1:0x5074f50000~10000,1:0x5084750000~10000,1:0x5092240000~10000,1:0x509d7f0000~10000,1:0x509ef00000~10000,1:0x50a5a60000~10000,1:0x50a92a0000~10000,1:0x50ad1b0000~10000,1:0x50ad6a0000~10000,1:0x50aff10000~10000,1:0x50b0040000~10000,1:0x50b2930000~10000,1:0x50b2fb0000~10000,1:0x50b3f50000~10000,1:0x50baae0000~10000,1:0x50c2530000~10000,1:0x50c66c0000~10000,1:0x50c8f00000~10000,1:0x50ca3a0000~10000,1:0x50ce350000~10000,1:0x50cf480000~10000,1:0x50d0fd0000~10000,1:0x50d3010000~10000,1:0x50d5d40000~10000,1:0x50df320000~10000,1:0x50dfb60000~10000,1:0x50e4660000
~10000,1:0x50e63f0000~10000,1:0x50ed1a0000~10000,1:0x50f0150000~10000,1:0x50f0820000~10000,1:0x50fd670000~10000,1:0x5100170000~10000,1:0x5101bf0000~10000,1:0x5103020000~10000,1:0x5103dd0000~10000,1:0x51067e0000~10000,1:0x5107bc0000~10000,1:0x510b0d0000~10000,1:0x5112da0000~10000,1:0x5115270000~10000,1:0x5117d90000~10000,1:0x5127370000~10000,1:0x5128ab0000~10000,1:0x512de50000~10000,1:0x5131210000~10000,1:0x5132a00000~10000,1:0x513a6d0000~10000,1:0x5143c40000~10000,1:0x51471c0000~10000,1:0x5148ae0000~10000,1:0x514a4c0000~10000,1:0x514d140000~10000]

src/os/bluestore/BlueFS.cc: In function 'int BlueFS::_write_super(int)' thread 7fa072da2a80 time 2022-12-15 10:34:33.925593
src/os/bluestore/BlueFS.cc: FAILED ceph_assert(bl.length() <= get_super_length())
ceph version 14.2.8 nautilus (stable)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x14a) [0x561f921c4a6f]
2: (()+0x4cac37) [0x561f921c4c37]
3: (BlueFS::_write_super(int)+0x570) [0x561f927c4b40]
4: (BlueFS::_compact_log_async(std::unique_lock<std::mutex>&)+0x1182) [0x561f927dd922]
5: (BlueFS::sync_metadata()+0x2ad) [0x561f927de41d]
6: (BlueFS::umount()+0xfb) [0x561f927de69b]
7: (BlueStore::_close_bluefs()+0xd) [0x561f926af7fd]
8: (BlueStore::_open_db_and_around(bool)+0x170) [0x561f926ef7d0]
9: (BlueStore::_mount(bool, bool)+0x5c2) [0x561f927328b2]
10: (OSD::init()+0x321) [0x561f922cc611]
11: (main()+0x1bf8) [0x561f9222cdb8]
12: (__libc_start_main()+0xf5) [0x7fa06eb3b3d5]
13: (()+0x567e45) [0x561f92261e45]
*** Caught signal (Aborted) **
