Bug #56270
closed
crash: File "mgr/snap_schedule/module.py", in __init__: self.client = SnapSchedClient(self)
5a837b17fdc77936b4d5208c9f69af044a1a69d3a95437cba6b8ecfd77265f87
Description
Sanitized backtrace:
File "mgr/snap_schedule/module.py", in __init__: self.client = SnapSchedClient(self) File "mgr/snap_schedule/fs/schedule_client.py", in __init__: with self.get_schedule_db(fs_name) as conn_mgr: File "mgr/snap_schedule/fs/schedule_client.py", in get_schedule_db: db.executescript(dump)
Crash dump sample:
{
    "backtrace": [
        " File \"/usr/share/ceph/mgr/snap_schedule/module.py\", line 38, in __init__\n self.client = SnapSchedClient(self)",
        " File \"/usr/share/ceph/mgr/snap_schedule/fs/schedule_client.py\", line 158, in __init__\n with self.get_schedule_db(fs_name) as conn_mgr:",
        " File \"/usr/share/ceph/mgr/snap_schedule/fs/schedule_client.py\", line 192, in get_schedule_db\n db.executescript(dump)",
        "<redacted>"
    ],
    "ceph_version": "17.2.0",
    "crash_id": "2022-06-20T15:13:02.711351Z_f91dc73d-19af-4069-89f9-51528be9fab7",
    "entity_name": "mgr.165587efc0dd735a4de0efee0479fff494913b05",
    "mgr_module": "snap_schedule",
    "mgr_module_caller": "ActivePyModule::load",
    "mgr_python_exception": "OperationalError",
    "os_id": "centos",
    "os_name": "CentOS Stream",
    "os_version": "8",
    "os_version_id": "8",
    "process_name": "ceph-mgr",
    "stack_sig": "c0b46fcd547d24a8595150ee9a6c3b3e5ae43023f0ce3caa788450149499be30",
    "timestamp": "2022-06-20T15:13:02.711351Z",
    "utsname_machine": "x86_64",
    "utsname_release": "5.4.0-120-generic",
    "utsname_sysname": "Linux",
    "utsname_version": "#136-Ubuntu SMP Fri Jun 10 13:40:48 UTC 2022"
}
Updated by Telemetry Bot almost 2 years ago
Updated by Andreas Teuchert almost 2 years ago
The full backtrace is:
"backtrace": [
    " File \"/usr/share/ceph/mgr/snap_schedule/module.py\", line 38, in __init__\n self.client = SnapSchedClient(self)",
    " File \"/usr/share/ceph/mgr/snap_schedule/fs/schedule_client.py\", line 158, in __init__\n with self.get_schedule_db(fs_name) as conn_mgr:",
    " File \"/usr/share/ceph/mgr/snap_schedule/fs/schedule_client.py\", line 192, in get_schedule_db\n db.executescript(dump)",
    "sqlite3.OperationalError: table schedules already exists"
],
The error is probably caused by the ioctx.remove(SNAP_DB_OBJECT_NAME) call failing, see https://tracker.ceph.com/issues/56269.
When the code runs for the first time, the table doesn't exist yet, so db.executescript(dump) succeeds. Then ioctx.remove(SNAP_DB_OBJECT_NAME) is supposed to delete the legacy dump, but that call fails. As a result, the dump is loaded again on every mgr restart, and each repeated db.executescript(dump) call fails because the schedules table already exists.
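The failure mode described above can be reproduced in isolation with Python's sqlite3 module. This is only a minimal sketch, not the snap_schedule code: the schema in LEGACY_DUMP and the helper names load_legacy_dump and simulate_restarts are hypothetical, and an in-memory database stands in for the persistent schedule DB. Each loop iteration plays the role of one mgr restart in which the legacy dump was not removed.

```python
import sqlite3

# Hypothetical stand-in for the legacy dump. The real schema is not shown
# in this report; only the table name "schedules" comes from the backtrace.
LEGACY_DUMP = """
CREATE TABLE schedules(id INTEGER PRIMARY KEY, path TEXT);
INSERT INTO schedules(path) VALUES ('/example');
"""


def load_legacy_dump(db, dump):
    """Replay the dump into the DB, as db.executescript(dump) does."""
    db.executescript(dump)


def simulate_restarts(restarts=2):
    """Load the same dump once per simulated mgr restart.

    The dump is never removed between iterations, which mirrors the
    situation where the removal of the legacy object fails.
    """
    db = sqlite3.connect(":memory:")  # assumption: stands in for the persistent DB
    outcomes = []
    for _ in range(restarts):
        try:
            load_legacy_dump(db, LEGACY_DUMP)
            outcomes.append("ok")
        except sqlite3.OperationalError as exc:
            # The second load hits the existing table.
            outcomes.append(str(exc))
    db.close()
    return outcomes


outcomes = simulate_restarts()
print(outcomes)
```

The first load succeeds and the second raises sqlite3.OperationalError, the same exception class reported in the crash dump.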
Updated by Venky Shankar almost 2 years ago
- Project changed from mgr to CephFS
- Category set to Correctness/Safety
- Assignee set to Milind Changire
- Target version set to v18.0.0
- Backport set to quincy
- Component(FS) mgr/snap_schedule added
Milind, please take a look.
Updated by Patrick Donnelly almost 2 years ago
- Related to Bug #56269: crash: File "mgr/snap_schedule/module.py", in __init__: self.client = SnapSchedClient(self) added
Updated by Telemetry Bot over 1 year ago
- Affected Versions v17.2.1, v17.2.2 added
Updated by Alexander Mamonov over 1 year ago
{"log":"debug 2022-11-03T08:38:12.502+0000 7f46270f5700 -1 mgr load Failed to construct class in 'snap_schedule'\n","stream":"stderr","time":"2022-11-03T08:38:12.504394471Z"}
{"log":" File \"/usr/share/ceph/mgr/snap_schedule/module.py\", line 38, in __init__\n","stream":"stderr","time":"2022-11-03T08:38:12.50754467Z"}
{"log":" File \"/usr/share/ceph/mgr/snap_schedule/fs/schedule_client.py\", line 169, in __init__\n","stream":"stderr","time":"2022-11-03T08:38:12.507552024Z"}
{"log":" File \"/usr/share/ceph/mgr/snap_schedule/fs/schedule_client.py\", line 203, in get_schedule_db\n","stream":"stderr","time":"2022-11-03T08:38:12.507584909Z"}
{"log":"debug 2022-11-03T08:38:12.502+0000 7f46270f5700 -1 mgr operator() Failed to run module in active mode ('snap_schedule')\n","stream":"stderr","time":"2022-11-03T08:38:12.507786143Z"}
Updated by Andreas Teuchert over 1 year ago
If you're running into this bug after upgrading from Pacific to Quincy, you can manually delete the legacy schedule DB as described here: https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/message/6NRHZWC4KHUNHVI7K6OT6MJOMX7CVPHJ/.
Note: This bug is already fixed in recent versions of Quincy, so when upgrading directly to 17.2.5 or later, the bug shouldn't occur.
Updated by Patrick Donnelly about 1 year ago
- Status changed from New to Duplicate
Based on this appearing to have been resolved, I'm closing this as a duplicate of #56269.
Updated by Patrick Donnelly about 1 year ago
- Related to deleted (Bug #56269: crash: File "mgr/snap_schedule/module.py", in __init__: self.client = SnapSchedClient(self))
Updated by Patrick Donnelly about 1 year ago
- Is duplicate of Bug #56269: crash: File "mgr/snap_schedule/module.py", in __init__: self.client = SnapSchedClient(self) added