Bug #52323

closed

[pwl ssd] incorrect first_valid_entry calculation in retire_entries()

Added by jianpeng ma over 2 years ago. Updated about 2 years ago.

Status: Resolved
Priority: Normal
Assignee:
Target version: -
% Done: 0%
Source:
Tags:
Backport: pacific
Regression: No
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

If a running program that writes data through the pwl/ssd cache is killed, the cache cannot be restarted because it reads back wrong data. There are two reasons:
1:
2021-08-19T15:37:02.026+0800 7f812e5fc700 0 librbd::cache::pwl::ssd::WriteLog: 0x7f836c0151b0 schedule_update_root: New root: pool_size=1073741824 first_valid_entry=228626432 first_free_entry=228044800 flushed_sync_gen=19536
2021-08-19T15:37:02.030+0800 7f8398bc5700 0 librbd::cache::pwl::ssd::WriteLog: 0x7f836c0151b0 schedule_update_root: New root: pool_size=1073741824 first_valid_entry=228626432 first_free_entry=228696064 flushed_sync_gen=19536
>> In the second root update, first_free_entry has advanced past first_valid_entry, so new data overwrites log entries that have not been retired yet. This means the check for whether the cache is full is wrong. It happens because the same allocated space is freed more than once.

2: The location of the WriteLogCacheEntry is calculated incorrectly, which causes decoding to fail.
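
The overwrite in reason 1 can be checked with simple ring arithmetic on the logged values. The following is a minimal sketch, not the librbd implementation: `ring_free_bytes` and `would_overwrite` are hypothetical helpers, and the sketch assumes the whole pool is one ring in which `first_valid_entry == first_free_entry` means "empty" (the real cache layout may reserve space at the start of the pool).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not a librbd function): number of bytes that can be
// appended before the write position reaches first_valid.
// Assumption: first_valid == first_free means the ring is empty.
std::uint64_t ring_free_bytes(std::uint64_t pool_size,
                              std::uint64_t first_valid,
                              std::uint64_t first_free) {
  assert(first_valid < pool_size && first_free < pool_size);
  if (first_free >= first_valid) {
    // Used region is [first_valid, first_free); everything else is free.
    return pool_size - (first_free - first_valid);
  }
  // Write position has wrapped; the free region is [first_free, first_valid).
  return first_valid - first_free;
}

// Hypothetical helper: true if appending len bytes would reach or pass
// first_valid, i.e. touch entries that have not been retired. Reaching
// first_valid exactly is also rejected, because a completely full ring
// would be indistinguishable from an empty one.
bool would_overwrite(std::uint64_t pool_size,
                     std::uint64_t first_valid,
                     std::uint64_t first_free,
                     std::uint64_t len) {
  return len >= ring_free_bytes(pool_size, first_valid, first_free);
}
```

With the first logged root (`first_valid_entry=228626432`, `first_free_entry=228044800`) only 581632 bytes are free under these assumptions. The second root moves `first_free_entry` to 228696064, an advance of 651264 bytes, so `would_overwrite(1073741824, 228626432, 228044800, 651264)` is true and the write runs 69632 bytes past `first_valid_entry`.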


Related issues 2 (0 open, 2 closed)

Related to rbd - Bug #52341: [pwl] m_bytes_allocated is calculated incorrectly on reopen (Resolved, jianpeng ma)
Copied to rbd - Backport #52422: pacific: [pwl ssd] incorrect first_valid_entry calculation in retire_entries() (Resolved, Deepika Upadhyay)

Actions #2

Updated by Ilya Dryomov over 2 years ago

  • Subject changed from Can't restart pwl/ssd after kill program to [pwl ssd] Can't restart after kill program
  • Status changed from New to Fix Under Review
  • Assignee set to jianpeng ma
  • Pull request ID set to 42843
Actions #3

Updated by Ilya Dryomov over 2 years ago

  • Backport set to pacific
Actions #4

Updated by Ilya Dryomov over 2 years ago

The first problem (allocated space accounting) is tracked in https://tracker.ceph.com/issues/52341.

Actions #5

Updated by Ilya Dryomov over 2 years ago

  • Related to Bug #52341: [pwl] m_bytes_allocated is calculated incorrectly on reopen added
Actions #6

Updated by Ilya Dryomov over 2 years ago

  • Subject changed from [pwl ssd] Can't restart after kill program to [pwl ssd] incorrect first_valid_entry calculation in retire_entries()
  • Status changed from Fix Under Review to Pending Backport
Actions #7

Updated by Backport Bot over 2 years ago

  • Copied to Backport #52422: pacific: [pwl ssd] incorrect first_valid_entry calculation in retire_entries() added
Actions #8

Updated by Ilya Dryomov about 2 years ago

  • Status changed from Pending Backport to Resolved

While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved" or "Rejected".
