Bug #58316
Ceph health metric Scraping still broken
Status: New
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS): OSD
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
This was already brought up in #46285, but that issue was marked as rejected.
When I run ceph device scrape-health-metrics HGST_HUH721010AL5200_7JKMZYKG to collect SMART metrics for a device and then list them via ceph device get-health-metrics HGST_HUH721010AL5200_7JKMZYKG, I only get:
{
    "20221220-090607": {
        "dev": "/dev/sdd",
        "error": "smartctl failed",
        "nvme_smart_health_information_add_log_error": "nvme returned an error: sudo: exit status: 1",
        "nvme_smart_health_information_add_log_error_code": -22,
        "nvme_vendor": "hgst",
        "smartctl_error_code": -22,
        "smartctl_output": "smartctl returned an error (1): stderr:\nsudo: exit status: 1\nstdout:\n"
    }
}
The device is NOT an NVMe drive; it is a SAS-attached spinning disk. The same happens for ALL other SAS devices in our cluster. It has behaved this way since the day the device health feature was first released. I had been waiting for it to be fixed eventually, but the issue is still there.
I am running the latest Pacific release and smartmontools 7.1.
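To check how many devices are affected, the JSON returned by ceph device get-health-metrics can be scanned for failed scrapes. The following is only a minimal sketch: the helper names failed_scrapes and nvme_keys are hypothetical and not part of Ceph, and the sample is the output quoted above. It lists entries that carry an "error" key and any keys prefixed with "nvme_", which on a SAS disk may hint that an NVMe-specific code path was attempted.

```python
import json

# Output quoted from this report. The raw string keeps the JSON "\n"
# escapes intact so that json.loads can decode them.
SAMPLE = r'''
{ "20221220-090607": { "dev": "/dev/sdd", "error": "smartctl failed", "nvme_smart_health_information_add_log_error": "nvme returned an error: sudo: exit status: 1", "nvme_smart_health_information_add_log_error_code": -22, "nvme_vendor": "hgst", "smartctl_error_code": -22, "smartctl_output": "smartctl returned an error (1): stderr:\nsudo: exit status: 1\nstdout:\n" } }
'''


def failed_scrapes(metrics_json):
    """Return (timestamp, device, error) for every scrape entry with an "error" key.

    Hypothetical helper for illustration; not a Ceph API.
    """
    metrics = json.loads(metrics_json)
    return [
        (timestamp, entry.get("dev"), entry["error"])
        for timestamp, entry in sorted(metrics.items())
        if isinstance(entry, dict) and "error" in entry
    ]


def nvme_keys(metrics_json):
    """Map each scrape timestamp to the sorted list of its "nvme_"-prefixed keys.

    Hypothetical helper for illustration; not a Ceph API.
    """
    metrics = json.loads(metrics_json)
    return {
        timestamp: sorted(key for key in entry if key.startswith("nvme_"))
        for timestamp, entry in metrics.items()
        if isinstance(entry, dict)
    }


for timestamp, dev, error in failed_scrapes(SAMPLE):
    print(f"{timestamp} {dev}: {error}")
for timestamp, keys in nvme_keys(SAMPLE).items():
    print(f"{timestamp} NVMe-specific keys: {', '.join(keys) or 'none'}")
```

To use it against a live cluster, the SAMPLE string could be replaced with the text produced by ceph device get-health-metrics for each device ID, for example by reading it from standard input.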