Bug #59241
Status: Closed (crimson)
OSD logs with debug 20/20 are not captured
Description
The captured OSD logs do not contain debug-level information even when the debug level is set to 20/20.
The same problem is not observed for the mgr log on the same cluster.
# ceph config dump
2023-03-31T01:05:26.037+0000 7fe84e08f700 -1 WARNING: the following dangerous and experimental features are enabled: crimson
2023-03-31T01:05:26.038+0000 7fe84e08f700 -1 WARNING: the following dangerous and experimental features are enabled: crimson
WHO     MASK                LEVEL     OPTION                                                       VALUE                                                  RO
global                      basic     container_image                                              quay.ceph.io/ceph-ci/ceph@sha256:b667b57bdc127d1992370596b283f9a9992248e5d4f4861a26e85538e5611454  *
global                      advanced  enable_experimental_unrecoverable_data_corrupting_features   crimson
global                      basic     log_to_file                                                  true
global                      advanced  mon_cluster_log_to_file                                      true
global                      advanced  osd_pool_default_pg_autoscale_mode                           off
mon                         advanced  auth_allow_insecure_global_id_reclaim                        false
mon                         advanced  osd_pool_default_crimson                                     true
mon                         advanced  public_network                                               10.1.240.0/23                                          *
mgr                         advanced  debug_mgr                                                    20/20
mgr                         advanced  mgr/cephadm/container_init                                   True                                                   *
mgr                         advanced  mgr/cephadm/migration_current                                5                                                      *
mgr                         advanced  mgr/dashboard/ALERTMANAGER_API_HOST                          http://host.containers.internal:9093                   *
mgr                         advanced  mgr/dashboard/GRAFANA_API_SSL_VERIFY                         false                                                  *
mgr                         advanced  mgr/dashboard/GRAFANA_API_URL                                https://host.containers.internal:3000                  *
mgr                         advanced  mgr/dashboard/PROMETHEUS_API_HOST                            http://host.containers.internal:9095                   *
mgr                         advanced  mgr/dashboard/ssl_server_port                                8443                                                   *
mgr                         advanced  mgr/orchestrator/orchestrator                                cephadm
osd                         advanced  debug_osd                                                    20/20
osd     host:dell-r640-079  basic     osd_memory_target                                            43298420872
osd     host:dell-r640-083  basic     osd_memory_target                                            45803788834
osd     host:dell-r640-084  basic     osd_memory_target                                            44372132113
osd                         advanced  osd_memory_target_autotune                                   true
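For reference, the debug_osd entry above would normally be applied with the standard ceph config commands; the exact invocation used on this cluster is not in the report, so the following is only a sketch:

# raise the OSD debug level cluster-wide and confirm it took effect
ceph config set osd debug_osd 20/20
ceph config get osd debug_osd
# a per-daemon override is also possible (osd.0 here is a placeholder id)
ceph tell osd.0 config set debug_osd 20/20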
Crimson image used - https://shaman.ceph.com/repos/ceph/main/c3f1eeebbd5e7805720d5f665b68bdca160a5855/crimson/275574/
MGR log showing debug 20 level entries:
2023-03-31T01:09:17.696+0000 7f9ffcb22700 15 mgr get_health_checks getting health checks for volumes
2023-03-31T01:09:17.696+0000 7f9ffcb22700 10 mgr update_delta_stats v19734
2023-03-31T01:09:17.696+0000 7f9ffcb22700 10 mgr.server operator() 65 pgs: 65 active+clean; 30 GiB data, 45 GiB used, 1.3 TiB / 1.3 TiB avail
2023-03-31T01:09:17.696+0000 7f9ffcb22700 10 mgr.server operator() 0 health checks
2023-03-31T01:09:17.696+0000 7f9ffcb22700 20 mgr.server operator() health checks: {}
2023-03-31T01:09:17.696+0000 7f9ffcb22700  0 log_channel(cluster) log [DBG] : pgmap v19734: 65 pgs: 65 active+clean; 30 GiB data, 45 GiB used, 1.3 TiB / 1.3 TiB avail
2023-03-31T01:09:17.696+0000 7f9ffcb22700 20 mgr.server adjust_pgs
2023-03-31T01:09:17.696+0000 7f9ffcb22700 10 mgr.server operator() creating_or_unknown 0 max_creating 1024 left 1024
2023-03-31T01:09:17.696+0000 7f9ffcb22700 20 mgr.server operator() misplaced_ratio 0 degraded_ratio 0 inactive_pgs_ratio 0 unknown_pgs_ratio 0; target_max_misplaced_ratio 0.05
2023-03-31T01:09:17.699+0000 7f9fc5d3c700  0 [rbd_support DEBUG root] PerfHandler: tick
2023-03-31T01:09:17.699+0000 7f9fc5d3c700 10 ceph_option_get mgr_stats_period found: 5
2023-03-31T01:09:17.790+0000 7f9febb20700 10 mgr.server handle_report from 0x56526e234c00 mon.dell-r640-083
2023-03-31T01:09:17.790+0000 7f9febb20700 20 mgr.server handle_report updating existing DaemonState for mon.dell-r640-083
2023-03-31T01:09:17.790+0000 7f9febb20700 20 mgr update loading 0 new types, 0 old types, had 119 types, got 1150 bytes of data
Cluster status -
[ceph: root@dell-r640-079 /]# ceph -s
2023-03-31T01:10:12.300+0000 7f4ff4096700 -1 WARNING: the following dangerous and experimental features are enabled: crimson
2023-03-31T01:10:12.300+0000 7f4ff4096700 -1 WARNING: the following dangerous and experimental features are enabled: crimson
  cluster:
    id:     86890d16-cf04-11ed-8f60-78ac443b3b3c
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum dell-r640-079,dell-r640-083,dell-r640-084 (age 10h)
    mgr: dell-r640-079.fomsxp(active, since 10h), standbys: dell-r640-084.dxijhp
    osd: 9 osds: 9 up (since 10h), 9 in (since 10h)

  data:
    pools:   3 pools, 65 pgs
    objects: 78.06k objects, 30 GiB
    usage:   45 GiB used, 1.3 TiB / 1.3 TiB avail
    pgs:     65 active+clean
Cephadm version -
# yum list | grep cephadm
cephadm.noarch            2:18.0.0-3168.gc3f1eeeb.el8   @ceph-noarch
ceph-mgr-cephadm.noarch   2:18.0.0-3168.gc3f1eeeb.el8   ceph-noarch
# cephadm shell -- ceph -v
Inferring fsid 86890d16-cf04-11ed-8f60-78ac443b3b3c
Inferring config /var/lib/ceph/86890d16-cf04-11ed-8f60-78ac443b3b3c/mon.dell-r640-079/config
Using ceph image with id '39c6c7c7a15b' and tag 'c3f1eeebbd5e7805720d5f665b68bdca160a5855-crimson' created on 2023-03-29 22:11:49 +0000 UTC
quay.ceph.io/ceph-ci/ceph@sha256:b667b57bdc127d1992370596b283f9a9992248e5d4f4861a26e85538e5611454
ceph version 18.0.0-3168-gc3f1eeeb (c3f1eeebbd5e7805720d5f665b68bdca160a5855) reef (dev)
Cluster logs - https://drive.google.com/drive/folders/1jrCeceEepaPIl5JYxsSZ61fgaUI5ap1U?usp=sharing
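Since log_to_file is true in the config dump above, the OSD log should also land on disk on each host. A quick check for debug-level entries, assuming the default cephadm log location and using osd.0 as a placeholder id (the path and grep pattern are assumptions, not from the report):

fsid=86890d16-cf04-11ed-8f60-78ac443b3b3c
# crimson-osd emits Seastar-style "DEBUG ... [shard N] osd" lines;
# on the affected cluster this count stays near zero even at 20/20
grep -c 'DEBUG' /var/log/ceph/$fsid/ceph-osd.0.log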
Updated by Aishwarya Mathuria 11 months ago
With debug_osd set to 20, I can see debug logs from my vstart cluster:
DEBUG 2023-05-23 17:28:33,172 [shard 0] osd - pg_epoch 7 pg[1.4( empty local-lis/les=6/7 n=0 ec=6/6 lis/c=6/6 les/c/f=7/7/0 sis=6) [0] r=0 lpr=6 crt=0'0 mlcod 0'0 active+clean PeeringState::prepare_stats_for_publish reporting purged_snaps []
DEBUG 2023-05-23 17:28:33,172 [shard 0] osd - pg_epoch 7 pg[1.4( empty local-lis/les=6/7 n=0 ec=6/6 lis/c=6/6 les/c/f=7/7/0 sis=6) [0] r=0 lpr=6 crt=0'0 mlcod 0'0 active+clean PeeringState::prepare_stats_for_publish publish_stats_to_osd 7:7
DEBUG 2023-05-23 17:28:33,172 [shard 0] osd - log is not dirty
DEBUG 2023-05-23 17:28:33,172 [shard 0] osd - do_peering_event handling epoch_sent: 7 epoch_requested: 7 AllReplicasRecovered for pg: 1.1
INFO 2023-05-23 17:28:33,172 [shard 0] osd - pg_epoch 7 pg[1.1( empty local-lis/les=6/7 n=0 ec=6/6 lis/c=6/0 les/c/f=7/0/0 sis=6) [0] r=0 lpr=6 crt=0'0 mlcod 0'0 active exit Started/Primary/Active/Activating 0.124752 2 0.001637
INFO 2023-05-23 17:28:33,172 [shard 0] osd - Exiting state: Started/Primary/Active/Activating, entered at 1684843113.048149, 0.001637133 spent on 2 events
INFO 2023-05-23 17:28:33,172 [shard 0] osd - pg_epoch 7 pg[1.1( empty local-lis/les=6/7 n=0 ec=6/6 lis/c=6/0 les/c/f=7/0/0 sis=6) [0] r=0 lpr=6 crt=0'0 mlcod 0'0 active enter Started/Primary/Active/Recovered
INFO 2023-05-23 17:28:33,172 [shard 0] osd - Entering state: Started/Primary/Active/Recovered
DEBUG 2023-05-23 17:28:33,173 [shard 0] osd - pg_epoch 7 pg[1.1( empty local-lis/les=6/7 n=0 ec=6/6 lis/c=6/0 les/c/f=7/0/0 sis=6) [0] r=0 lpr=6 crt=0'0 mlcod 0'0 active PeeringState::needs_recovery is recovered
DEBUG 2023-05-23 17:28:33,173 [shard 0] osd - pg_epoch 7 pg[1.1( empty local-lis/les=6/7 n=0 ec=6/6 lis/c=6/0 les/c/f=7/0/0 sis=6) [0] r=0 lpr=6 crt=0'0 mlcod 0'0 active PeeringState::update_calc_stats actingset 0 upset 0 acting_recovery_backfill 0
DEBUG 2023-05-23 17:28:33,173 [shard 0] osd - pg_epoch 7 pg[1.1( empty local-lis/les=6/7 n=0 ec=6/6 lis/c=6/0 les/c/f=7/0/0 sis=6) [0] r=0 lpr=6 crt=0'0 mlcod 0'0 active PeeringState::update_calc_stats shard 0 primary objects 0 missing 0
DEBUG 2023-05-23 17:28:33,173 [shard 0] osd - pg_epoch 7 pg[1.1( empty local-lis/les=6/7 n=0 ec=6/6 lis/c=6/0 les/c/f=7/0/0 sis=6) [0] r=0 lpr=6 crt=0'0 mlcod 0'0 active PeeringState::update_calc_stats object_location_counts {}
I'm going through the logs provided to see why debug logs were not showing up.
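For anyone trying to reproduce this, a rough sketch of an equivalent vstart setup (assuming a built Ceph source tree; vstart.sh's --crimson flag selects crimson-osd, and vstart writes daemon logs under build/out/):

cd build
# single-node dev cluster backed by crimson-osd
MON=1 MGR=1 OSD=1 ../src/vstart.sh -n --crimson
./bin/ceph config set osd debug_osd 20/20
# DEBUG entries like the ones quoted above should start appearing
grep 'DEBUG' out/osd.0.log | tail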
Updated by Harsh Kumar 11 months ago
As Sam informed me previously, the way we enable logging in Crimson differs from what we do for the classic (BlueStore-based) OSDs.
Could you please share the command to enable debug logging in Crimson?
Also, I believe Crimson no longer supports debug levels going all the way up to 20; please correct me if this is wrong.
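For context on the question above: crimson-osd logs through Seastar's logger, which is why the vstart entries earlier have the DEBUG/INFO [shard N] format instead of classic dout lines, and my understanding (an assumption, not confirmed in this ticket) is that the numeric 0-20 debug levels get mapped onto Seastar's trace/debug/info/warn/error levels rather than applied directly. A minimal sketch of bumping the level, assuming the usual ceph config path still reaches crimson-osd:

# cluster-wide; 20/20 should map to Seastar's most verbose levels
ceph config set osd debug_osd 20/20
# if a running crimson daemon does not pick the change up live, restarting
# the OSD service is a workaround (the service name is deployment-specific)
ceph orch restart osd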
Updated by Matan Breizman 2 months ago
- Status changed from New to Resolved
- Pull request ID set to 54813