Bug #54052
mgr/snap-schedule: scheduled snapshots are not created after ceph-mgr restart
Status:
Resolved
Priority:
High
Assignee:
Category:
Correctness/Safety
Target version:
% Done:
0%
Source:
Tags:
Backport:
quincy, pacific
Regression:
No
Severity:
2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
Labels (FS):
task(medium)
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
From ceph-user - https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/SLR3JIGNUV3TNMRJKPSEZUXJK7XBA3HC/
$ ceph fs snap-schedule add /shares/users 1h 2021-10-31T18:00
Schedule set for path /shares/users
$ ceph fs snap-schedule retention add /shares/users 14h10d12m
Retention added to path /shares/users
Wait until the next complete hour.
$ ceph fs snap-schedule status /shares/users
{"fs": "cephfs", "subvol": null, "path": "/shares/users", "rel_path": "/shares/users", "schedule": "1h", "retention": {"h": 14, "d": 10, "m": 12}, "start": "2021-10-31T18:00:00", "created": "2022-01-26T23:52:03", "first": "2022-01-27T00:00:00", "last": "2022-01-27T00:00:00", "last_pruned": "2022-01-27T00:00:00", "created_count": 1, "pruned_count": 1, "active": true}
Now everything looks and works as expected. However, if I restart the active MGR, no new snapshots will be created and the status command does unexpectedly report NULL for some of the properties.
$ systemctl restart ceph-mgr@apollon.service
$ ceph fs snap-schedule status /shares/users
{"fs": "cephfs", "subvol": null, "path": "/shares/users", "rel_path": "/shares/users", "schedule": "1h", "retention": {}, "start": "2021-10-31T18:00:00", "created": "2022-01-26T23:52:03", "first": null, "last": null, "last_pruned": null, "created_count": 0, "pruned_count": 0, "active": true}
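The difference between the two status outputs above can be checked mechanically. The following is a minimal sketch, assuming the status command returns a single JSON object with the fields shown in this report. The helper name `schedule_state_lost` is hypothetical and not part of mgr/snap-schedule. It is only a heuristic: a freshly added schedule that has no retention and has not yet fired would match as well, so it is meaningful only for a schedule known to have run before the restart.

```python
import json


def schedule_state_lost(status: dict) -> bool:
    """Heuristic (hypothetical helper): the schedule is active, but its run
    history and retention are empty, as in the post-restart output above."""
    never_ran = status.get("first") is None and status.get("last") is None
    no_retention = not status.get("retention")
    return bool(status.get("active")) and never_ran and no_retention


# Abbreviated copies of the two status outputs quoted in the report.
before_restart = json.loads(
    '{"path": "/shares/users", "schedule": "1h", '
    '"retention": {"h": 14, "d": 10, "m": 12}, '
    '"first": "2022-01-27T00:00:00", "last": "2022-01-27T00:00:00", '
    '"created_count": 1, "pruned_count": 1, "active": true}'
)
after_restart = json.loads(
    '{"path": "/shares/users", "schedule": "1h", "retention": {}, '
    '"first": null, "last": null, '
    '"created_count": 0, "pruned_count": 0, "active": true}'
)

print(schedule_state_lost(before_restart))
print(schedule_state_lost(after_restart))
```

In practice the input would come from `ceph fs snap-schedule status <path>`, captured before and after restarting the active MGR.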
Interesting snippets from the log:
* 2022-01-27T00:51:51 : remove last non working snapshot schedule
> 2022-01-27T00:51:51.972+0100 7f03b62c0700 0 [snap_schedule DEBUG snap_schedule.fs.schedule_client] SnapDB on cephfs changed for /shares/users, updating next Timer
* 2022-01-27T00:52:03 : add new snapshot schedule
> 2022-01-27T00:52:03.704+0100 7f03b62c0700 0 [snap_schedule DEBUG snap_schedule.fs.schedule_client] repeat is 3600
> 2022-01-27T00:52:03.704+0100 7f03b62c0700 0 [snap_schedule DEBUG snap_schedule.fs.schedule_client] attempting to add schedule /shares/users 1h
> 2022-01-27T00:52:03.704+0100 7f03b62c0700 0 [snap_schedule DEBUG snap_schedule.fs.schedule] schedule with retention {}
> 2022-01-27T00:52:03.708+0100 7f03b62c0700 0 [snap_schedule DEBUG snap_schedule.fs.schedule_client] SnapDB on cephfs changed for /shares/users, updating next Timer
> 2022-01-27T00:52:03.708+0100 7f03b62c0700 0 [snap_schedule DEBUG snap_schedule.fs.schedule_client] Creating new snapshot timer for /shares/users
> 2022-01-27T00:52:03.708+0100 7f03b62c0700 0 [snap_schedule DEBUG snap_schedule.fs.schedule_client] Will snapshot /shares/users in fs cephfs in 477s
* 2022-01-27T00:52:08 : add retention
> 2022-01-27T00:52:08.172+0100 7f03b62c0700 0 [snap_schedule DEBUG snap_schedule.fs.schedule] parse_retention(14h10d12m)
> 2022-01-27T00:52:08.172+0100 7f03b62c0700 0 [snap_schedule DEBUG snap_schedule.fs.schedule] parse_retention(14h10d12m) -> {'h': 14, 'd': 10, 'm': 12}
> 2022-01-27T00:52:08.172+0100 7f03b62c0700 0 [snap_schedule DEBUG snap_schedule.fs.schedule] db result is ('{}',)
> 2022-01-27T00:52:08.172+0100 7f03b62c0700 0 [snap_schedule DEBUG snap_schedule.fs.schedule_client] SnapDB on cephfs changed for /shares/users, updating next Timer
> 2022-01-27T00:52:08.172+0100 7f03b62c0700 0 [snap_schedule DEBUG snap_schedule.fs.schedule_client] Creating new snapshot timer for /shares/users
> 2022-01-27T00:52:08.172+0100 7f03b62c0700 0 [snap_schedule DEBUG snap_schedule.fs.schedule_client] Will snapshot /shares/users in fs cephfs in 472s
* 2022-01-27T01:00:00 : scheduled snapshot created
> 2022-01-27T01:00:00.175+0100 7f039393b700 0 [snap_schedule DEBUG snap_schedule.fs.schedule_client] Scheduled snapshot of /shares/users triggered
> 2022-01-27T01:00:00.187+0100 7f039393b700 0 [snap_schedule INFO snap_schedule.fs.schedule_client] created scheduled snapshot of /shares/users
> 2022-01-27T01:00:00.187+0100 7f039393b700 0 [snap_schedule DEBUG snap_schedule.fs.schedule_client] created scheduled snapshot /shares/users/.snap/scheduled-2022-01-27-00_00_00
> 2022-01-27T01:00:00.187+0100 7f039393b700 0 [snap_schedule DEBUG snap_schedule.fs.schedule_client] SnapDB on cephfs changed for /shares/users, updating next Timer
> 2022-01-27T01:00:00.187+0100 7f039393b700 0 [snap_schedule DEBUG snap_schedule.fs.schedule_client] Creating new snapshot timer for /shares/users
> 2022-01-27T01:00:00.187+0100 7f039393b700 0 [snap_schedule DEBUG snap_schedule.fs.schedule_client] Will snapshot /shares/users in fs cephfs in 3600s
> 2022-01-27T01:00:00.187+0100 7f039393b700 0 [snap_schedule DEBUG snap_schedule.fs.schedule_client] Pruning snapshots
* 2022-01-27T01:04:43 : restart of the MGR; after this point the log contains no references to snapshots being taken.
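The timer values in the log snippets can be reproduced from the schedule parameters, which helps when checking whether a timer was re-armed after a restart. The sketch below is illustrative only and is not the module's code. `parse_retention` borrows the name seen in the log, but its body is a guess at the behaviour. `seconds_until_next` is a hypothetical helper that assumes runs fall on multiples of the repeat interval counted from the schedule's start time (UTC, as in the `start` and `created` fields of the status output).

```python
import re
from datetime import datetime


def parse_retention(spec: str) -> dict:
    """Split a spec such as '14h10d12m' into {'h': 14, 'd': 10, 'm': 12}.
    Illustrative guess, not the snap_schedule implementation."""
    pairs = re.findall(r"(\d+)([a-zA-Z])", spec)
    return {unit: int(count) for count, unit in pairs}


def seconds_until_next(start: datetime, repeat_s: int, now: datetime) -> float:
    """Seconds from `now` to the next multiple of `repeat_s` after `start`
    (hypothetical helper). On an exact boundary the next run is a full
    interval away."""
    if now <= start:
        return (start - now).total_seconds()
    elapsed = (now - start).total_seconds()
    return repeat_s - (elapsed % repeat_s)


start = datetime(2021, 10, 31, 18, 0, 0)  # "start" from the status output
added = datetime(2022, 1, 26, 23, 52, 3)  # "created" from the status output

print(parse_retention("14h10d12m"))
print(seconds_until_next(start, 3600, added))
```

Under these assumptions the computed delay for the moment the schedule was added is 477 seconds, which agrees with the "Will snapshot /shares/users in fs cephfs in 477s" log line. Five seconds later the same calculation gives 472 seconds, matching the line logged after the retention was added.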
Similar issue: https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/7K4T2HI72NJPB6UWEMZAYEUN4MORBL6O/
Updated by Venky Shankar over 2 years ago
- Assignee set to Milind Changire
Milind, please take a look.
Updated by Milind Changire about 2 years ago
- Status changed from New to In Progress
Updated by Venky Shankar about 2 years ago
- Status changed from In Progress to Pending Backport
Updated by Backport Bot about 2 years ago
- Copied to Backport #55055: quincy: mgr/snap-schedule: scheduled snapshots are not created after ceph-mgr restart added
Updated by Backport Bot about 2 years ago
- Copied to Backport #55056: pacific: mgr/snap-schedule: scheduled snapshots are not created after ceph-mgr restart added
Updated by Konstantin Shalygin over 1 year ago
- Status changed from Pending Backport to Resolved
- Tags deleted (backport_processed)