Actions
Bug #45136
openmgr: crash with "tcmalloc::allocate_full_cpp_throw_oom"
Status:
Need More Info
Priority:
High
Assignee:
-
Category:
ceph-mgr
Target version:
-
% Done:
0%
Source:
Q/A
Tags:
Backport:
Regression:
No
Severity:
2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
- Crash time (02:45:38 UTC).
- Downstream version (14.2.8-28.el7cp) running on bare metal (RHEL7.8), system with 32 GB RAM.
- Cluster setup:
# ceph -s cluster: health: HEALTH_WARN 1 pool(s) have non-power-of-two pg_num application not enabled on 1 pool(s) 1 daemons have recently crashed services: mon: 3 daemons, quorum magna057,magna067,magna120 (age 2h) mgr: magna057(active, since 14h), standbys: magna120 osd: 9 osds: 9 up (since 2h), 9 in (since 2h) rgw: 2 daemons active (magna057.rgw0, magna067.rgw0) task status: data: pools: 5 pools, 132 pgs objects: 116.87k objects, 458 GiB usage: 1.3 TiB used, 6.8 TiB / 8.2 TiB avail pgs: 132 active+clean
Several tcmalloc large alloc
started to be reported (/var/log/messages) 4 hours before the crash (every 30min-2hours):
messages:Apr 16 22:41:48 magna057 ceph-mgr: tcmalloc: large alloc 1073741824 bytes == 0x5628c62f8000 @ 0x7fbf605db35f 0x7fbf605faa56 0x5625e6034976 0x5625e6034b77 0x5625e602a839 0x5625e60a1928 0x5625e60a889b 0x5625e60b0157 0x5625e609a916 0x7fbf61400c29 0x7fbf614ae0ad 0x7fbf5eaccea5 0x7fbf5d77a8dd messages:Apr 17 00:25:53 magna057 ceph-mgr: tcmalloc: large alloc 2147483648 bytes == 0x562b05510000 @ 0x7fbf605db35f 0x7fbf605faa56 0x5625e6034976 0x5625e6034b77 0x5625e602a839 0x5625e60a1928 0x5625e60a889b 0x5625e60b0157 0x5625e609a916 0x7fbf61400c29 0x7fbf614ae0ad 0x7fbf5eaccea5 0x7fbf5d77a8dd messages:Apr 17 01:01:45 magna057 ceph-mgr: tcmalloc: large alloc 2147483648 bytes == 0x562bc5db8000 @ 0x7fbf605db35f 0x7fbf605faa56 0x5625e6034976 0x5625e6034b77 0x5625e602a839 0x5625e60a1928 0x5625e60a889b 0x5625e60b0157 0x5625e609a916 0x7fbf61400c29 0x7fbf614ae0ad 0x7fbf5eaccea5 0x7fbf5d77a8dd messages:Apr 17 01:35:23 magna057 ceph-mgr: tcmalloc: large alloc 2147483648 bytes == 0x562c465b8000 @ 0x7fbf605db35f 0x7fbf605faa56 0x5625e6034976 0x5625e6034b77 0x5625e602a839 0x5625e60a1928 0x5625e60a889b 0x5625e60b0157 0x5625e609a916 0x7fbf61400c29 0x7fbf614ae0ad 0x7fbf5eaccea5 0x7fbf5d77a8dd messages:Apr 17 02:09:16 magna057 ceph-mgr: tcmalloc: large alloc 2147483648 bytes == 0x562b054b8000 @ 0x7fbf605db35f 0x7fbf605faa56 0x5625e6034976 0x5625e6034b77 0x5625e602a839 0x5625e60a1928 0x5625e60a889b 0x5625e60b0157 0x5625e609a916 0x7fbf61400c29 0x7fbf614ae0ad 0x7fbf5eaccea5 0x7fbf5d77a8dd messages:Apr 17 02:45:38 magna057 ceph-mgr: tcmalloc: large alloc 2147483648 bytes == (nil) @ 0x7fbf605db35f 0x7fbf605faa56 0x5625e6034976 0x5625e6034b77 0x5625e602a839 0x5625e60a1928 0x5625e60a889b 0x5625e60b0157 0x5625e609a916 0x7fbf61400c29 0x7fbf614ae0ad 0x7fbf5eaccea5 0x7fbf5d77a8dd
Stats:
magna57 (active ceph-mgr colocated with mon and rgw)
magna57 ceph-mgr log:
2020-04-17 02:45:34.040 7fbf5af87700 0 auth: could not find secret_id=390 2020-04-17 02:45:34.040 7fbf5af87700 0 cephx: verify_authorizer could not get service secret for service mgr secret_id=390 2020-04-17 02:45:35.063 7fbf5af87700 0 auth: could not find secret_id=391 2020-04-17 02:45:35.063 7fbf5af87700 0 cephx: verify_authorizer could not get service secret for service mgr secret_id=391 2020-04-17 02:45:35.878 7fbf2fba6700 0 log_channel(cluster) log [DBG] : pgmap v58306: 132 pgs: 9 stale+peering, 123 active+clean; 458 GiB data, 1.2 TiB used, 4.2 TiB / 5.5 TiB avail; 4.0 KiB/s rd, 0 B/s wr, 6 op/s 2020-04-17 02:45:37.879 7fbf2fba6700 0 log_channel(cluster) log [DBG] : pgmap v58307: 132 pgs: 9 stale+peering, 123 active+clean; 458 GiB data, 1.2 TiB used, 4.2 TiB / 5.5 TiB avail; 4.0 KiB/s rd, 0 B/s wr, 6 op/s 2020-04-17 02:45:39.029 7fbf5577c700 -1 *** Caught signal (Aborted) ** in thread 7fbf5577c700 thread_name:ms_dispatch ceph version 14.2.8-28.el7cp (b32eac9fd60c00c62a2d3c85d88b483be7b55ba1) nautilus (stable) 1: (()+0xf630) [0x7fbf5ead4630] 2: (gsignal()+0x37) [0x7fbf5d6b2387] 3: (abort()+0x148) [0x7fbf5d6b3a78] 4: (__gnu_cxx::__verbose_terminate_handler()+0x165) [0x7fbf5dfc27d5] 5: (()+0x5e746) [0x7fbf5dfc0746] 6: (()+0x5e773) [0x7fbf5dfc0773] 7: (()+0x5e993) [0x7fbf5dfc0993] 8: (()+0x1772b) [0x7fbf605dc72b] 9: (tcmalloc::allocate_full_cpp_throw_oom(unsigned long)+0xf3) [0x7fbf605faa03] 10: (void std::vector<std::pair<Context*, int>, std::allocator<std::pair<Context*, int> > >::_M_realloc_insert<std::pair<Context*, int> >(__gnu_cxx::__normal_iterator<std::pair<Context*, int>*, std::vector<std::pair<Context*, int>, std::allocator<std::pair<Context*, int> > > >, std::pair<Context*, int>&&)+0x56) [0x5625e6034976] 11: (Finisher::queue(Context*, int)+0xe7) [0x5625e6034b77] 12: (ActivePyModules::notify_all(LogEntry const&)+0x109) [0x5625e602a839] 13: (Mgr::handle_log(MLog*)+0x58) [0x5625e60a1928] 14: (Mgr::ms_dispatch(Message*)+0x22b) [0x5625e60a889b] 15: (MgrStandby::ms_dispatch(Message*)+0x97) [0x5625e60b0157] 16: (Dispatcher::ms_dispatch2(boost::intrusive_ptr<Message> const&)+0x26) [0x5625e609a916] 17: (DispatchQueue::entry()+0x1699) [0x7fbf61400c29] 18: (DispatchQueue::DispatchThread::entry()+0xd) [0x7fbf614ae0ad] 19: (()+0x7ea5) [0x7fbf5eaccea5] 20: (clone()+0x6d) [0x7fbf5d77a8dd] NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this. --- begin dump of recent events --- -9999> 2020-04-17 02:32:59.553 7fbf5577c700 4 mgr ms_dispatch service_map(e20 1 svc) v1 -9998> 2020-04-17 02:32:59.908 7fbf30ba8700 4 mgr.server handle_report from 0x56264c176800 mon,magna120 -9997> 2020-04-17 02:33:00.014 7fbf5577c700 4 mgr ms_dispatch active log(5000 entries from seq 232332058 at 2020-04-17 02:32:42.159809) v1 -9996> 2020-04-17 02:33:00.014 7fbf5577c700 4 mgr ms_dispatch log(5000 entries from seq 232332058 at 2020-04-17 02:32:42.159809) v1 -9995> 2020-04-17 02:33:00.058 7fbf5577c700 4 mgr ms_dispatch active service_map(e20 1 svc) v1 -9994> 2020-04-17 02:33:00.058 7fbf5577c700 4 mgr ms_dispatch service_map(e20 1 svc) v1 -9993> 2020-04-17 02:33:00.188 7fbf30ba8700 4 mgr.server handle_report from 0x5625f55c3c00 osd,3 -9992> 2020-04-17 02:33:00.322 7fbf30ba8700 4 mgr.server handle_report from 0x5625f2f72c00 osd,0 -9991> 2020-04-17 02:33:00.694 7fbf51f75700 4 mgr send_beacon active -9990> 2020-04-17 02:33:00.695 7fbf51f75700 10 monclient: _send_mon_message to mon.magna057 at v2:10.8.128.57:3300/0 -9989> 2020-04-17 02:33:01.140 7fbf5477a700 10 monclient: tick -9988> 2020-04-17 02:33:01.140 7fbf5477a700 10 monclient: _check_auth_rotating have uptodate secrets (they expire after 2020-04-17 02:32:31.141196) -9987> 2020-04-17 02:33:01.140 7fbf5477a700 10 log_client log_queue is 5 last_log 57779 sent 57774 num 5 unsent 5 sending 5 -9986> 2020-04-17 02:33:01.140 7fbf5477a700 10 log_client will send 2020-04-17 02:32:51.451241 mgr.magna057 (mgr.154572) 57775 : cluster [DBG] pgmap v57924: 132 pgs: 9 stale+peering, 123 active+clean; 458 GiB data, 1.2 TiB used, 4.2 TiB / 5.5 TiB avail -9985> 2020-04-17 02:33:01.140 7fbf5477a700 10 log_client will send 2020-04-17 02:32:53.452017 mgr.magna057 (mgr.154572) 57776 : cluster [DBG] pgmap v57925: 132 pgs: 9 stale+peering, 123 active+clean; 458 GiB data, 1.2 TiB used, 4.2 TiB / 5.5 TiB avail -9984> 2020-04-17 02:33:01.140 7fbf5477a700 10 log_client will send 2020-04-17 02:32:55.452660 mgr.magna057 (mgr.154572) 57777 : cluster [DBG] pgmap v57926: 132 pgs: 9 stale+peering, 123 active+clean; 458 GiB data, 1.2 TiB used, 4.2 TiB / 5.5 TiB avail -9983> 2020-04-17 02:33:01.140 7fbf5477a700 10 log_client will send 2020-04-17 02:32:57.485798 mgr.magna057 (mgr.154572) 57778 : cluster [DBG] pgmap v57927: 132 pgs: 9 stale+peering, 123 active+clean; 458 GiB data, 1.2 TiB used, 4.2 TiB / 5.5 TiB avail -9982> 2020-04-17 02:33:01.140 7fbf5477a700 10 log_client will send 2020-04-17 02:32:59.486323 mgr.magna057 (mgr.154572) 57779 : cluster [DBG] pgmap v57928: 132 pgs: 9 stale+peering, 123 active+clean; 458 GiB data, 1.2 TiB used, 4.2 TiB / 5.5 TiB avail -9981> 2020-04-17 02:33:01.140 7fbf5477a700 10 monclient: _send_mon_message to mon.magna057 at v2:10.8.128.57:3300/0 -9980> 2020-04-17 02:33:01.158 7fbf5577c700 4 mgr ms_dispatch active log(5000 entries from seq 232337058 at 2020-04-17 02:32:42.190039) v1 -9979> 2020-04-17 02:33:01.158 7fbf5577c700 4 mgr ms_dispatch log(5000 entries from seq 232337058 at 2020-04-17 02:32:42.190039) v1 -9978> 2020-04-17 02:33:01.197 7fbf5af87700 0 auth: could not find secret_id=390 -9977> 2020-04-17 02:33:01.197 7fbf5af87700 0 cephx: verify_authorizer could not get service secret for service mgr secret_id=390 -9976> 2020-04-17 02:33:01.197 7fbf5af87700 10 monclient: handle_auth_request bad authorizer on 0x56264c176400 -9975> 2020-04-17 02:33:01.205 7fbf5577c700 4 mgr ms_dispatch active service_map(e20 1 svc) v1 -9974> 2020-04-17 02:33:01.205 7fbf5577c700 4 mgr ms_dispatch service_map(e20 1 svc) v1 -9973> 2020-04-17 02:33:01.309 7fbf5577c700 4 mgr ms_dispatch active service_map(e20 1 svc) v1 -9972> 2020-04-17 02:33:01.309 7fbf5577c700 4 mgr ms_dispatch service_map(e20 1 svc) v1 -9971> 2020-04-17 02:33:01.310 7fbf5577c700 4 mgr ms_dispatch active mgrdigest v1 -9970> 2020-04-17 02:33:01.310 7fbf5577c700 4 mgr ms_dispatch mgrdigest v1 -9969> 2020-04-17 02:33:01.485 7fbf2fba6700 0 log_channel(cluster) log [DBG] : pgmap v57929: 132 pgs: 9 stale+peering, 123 active+clean; 458 GiB data, 1.2 TiB used, 4.2 TiB / 5.5 TiB avail -9968> 2020-04-17 02:33:01.485 7fbf2fba6700 10 monclient: _send_mon_message to mon.magna057 at v2:10.8.128.57:3300/0 -9967> 2020-04-17 02:33:01.796 7fbf5577c700 4 mgr ms_dispatch active log(5001 entries from seq 232342058 at 2020-04-17 02:32:43.259328) v1 -9966> 2020-04-17 02:33:01.796 7fbf5577c700 4 mgr ms_dispatch log(5001 entries from seq 232342058 at 2020-04-17 02:32:43.259328) v1 -9965> 2020-04-17 02:33:01.836 7fbf30ba8700 4 mgr.server handle_report from 0x5625ec448c00 rgw,magna067.rgw0 -9964> 2020-04-17 02:33:01.841 7fbf5577c700 4 mgr ms_dispatch active service_map(e20 1 svc) v1 -9963> 2020-04-17 02:33:01.841 7fbf5577c700 4 mgr ms_dispatch service_map(e20 1 svc) v1 -9962> 2020-04-17 02:33:01.841 7fbf5577c700 10 log_client handle_log_ack log(last 57779) v1 -9961> 2020-04-17 02:33:01.841 7fbf5577c700 10 log_client logged 2020-04-17 02:32:51.451241 mgr.magna057 (mgr.154572) 57775 : cluster [DBG] pgmap v57924: 132 pgs: 9 stale+peering, 123 active+clean; 458 GiB data, 1.2 TiB used, 4.2 TiB / 5.5 TiB avail -9960> 2020-04-17 02:33:01.841 7fbf5577c700 10 log_client logged 2020-04-17 02:32:53.452017 mgr.magna057 (mgr.154572) 57776 : cluster [DBG] pgmap v57925: 132 pgs: 9 stale+peering, 123 active+clean; 458 GiB data, 1.2 TiB used, 4.2 TiB / 5.5 TiB avail -9959> 2020-04-17 02:33:01.841 7fbf5577c700 10 log_client logged 2020-04-17 02:32:55.452660 mgr.magna057 (mgr.154572) 57777 : cluster [DBG] pgmap v57926: 132 pgs: 9 stale+peering, 123 active+clean; 458 GiB data, 1.2 TiB used, 4.2 TiB / 5.5 TiB avail -9958> 2020-04-17 02:33:01.841 7fbf5577c700 10 log_client logged 2020-04-17 02:32:57.485798 mgr.magna057 (mgr.154572) 57778 : cluster [DBG] pgmap v57927: 132 pgs: 9 stale+peering, 123 active+clean; 458 GiB data, 1.2 TiB used, 4.2 TiB / 5.5 TiB avail -9957> 2020-04-17 02:33:01.841 7fbf5577c700 10 log_client logged 2020-04-17 02:32:59.486323 mgr.magna057 (mgr.154572) 57779 : cluster [DBG] pgmap v57928: 132 pgs: 9 stale+peering, 123 active+clean; 458 GiB data, 1.2 TiB used, 4.2 TiB / 5.5 TiB avail -9956> 2020-04-17 02:33:01.987 7fbf30ba8700 4 mgr.server handle_report from 0x5625ed9b2c00 mon,magna067 -9955> 2020-04-17 02:33:02.293 7fbf30ba8700 4 mgr.server handle_report from 0x5625ec448800 osd,8 -9954> 2020-04-17 02:33:02.297 7fbf5577c700 4 mgr ms_dispatch active log(5000 entries from seq 232347058 at 2020-04-17 02:32:43.287662) v1 -9953> 2020-04-17 02:33:02.297 7fbf5577c700 4 mgr ms_dispatch log(5000 entries from seq 232347058 at 2020-04-17 02:32:43.287662) v1 -9952> 2020-04-17 02:33:02.327 7fbf30ba8700 4 mgr.server handle_report from 0x5625fb403800 osd,5 -9951> 2020-04-17 02:33:02.340 7fbf5577c700 4 mgr ms_dispatch active service_map(e20 1 svc) v1 -9950> 2020-04-17 02:33:02.340 7fbf5577c700 4 mgr ms_dispatch service_map(e20 1 svc) v1 -9949> 2020-04-17 02:33:02.360 7fbf30ba8700 4 mgr.server handle_report from 0x5629d222dc00 osd,7 -9948> 2020-04-17 02:33:02.397 7fbf30ba8700 4 mgr.server handle_report from 0x56288cd56000 osd,1 -9947> 2020-04-17 02:33:02.695 7fbf51f75700 4 mgr send_beacon active -9946> 2020-04-17 02:33:02.695 7fbf51f75700 10 monclient: _send_mon_message to mon.magna057 at v2:10.8.128.57:3300/0 -9945> 2020-04-17 02:33:02.854 7fbf30ba8700 4 mgr.server handle_report from 0x5625f753c400 mgr,magna057 -9944> 2020-04-17 02:33:02.867 7fbf30ba8700 4 mgr.server handle_report from 0x5625ed58a800 mon,magna057 -9943> 2020-04-17 02:33:02.873 7fbf2dba2700 4 mgr get_config get_config key: mgr/prometheus/magna057/rbd_stats_pools -9942> 2020-04-17 02:33:02.873 7fbf2dba2700 4 mgr get_config get_config key: mgr/prometheus/rbd_stats_pools -9941> 2020-04-17 02:33:02.873 7fbf2dba2700 4 mgr get_typed_config get_typed_config [magna057/]rbd_stats_pools not found -9940> 2020-04-17 02:33:02.997 7fbf5577c700 4 mgr ms_dispatch active log(5000 entries from seq 232352058 at 2020-04-17 02:32:43.316695) v1 -9939> 2020-04-17 02:33:02.997 7fbf5577c700 4 mgr ms_dispatch log(5000 entries from seq 232352058 at 2020-04-17 02:32:43.316695) v1 -9938> 2020-04-17 02:33:03.040 7fbf5577c700 4 mgr ms_dispatch active service_map(e20 1 svc) v1 -9937> 2020-04-17 02:33:03.040 7fbf5577c700 4 mgr ms_dispatch service_map(e20 1 svc) v1 -9936> 2020-04-17 02:33:03.043 7fbf30ba8700 4 mgr.server handle_report from 0x5625ec745400 rgw,magna057.rgw0 -9935> 2020-04-17 02:33:03.177 7fbf5577c700 4 mgr ms_dispatch active service_map(e20 1 svc) v1 -9934> 2020-04-17 02:33:03.177 7fbf5577c700 4 mgr ms_dispatch service_map(e20 1 svc) v1 -9933> 2020-04-17 02:33:03.177 7fbf5577c700 4 mgr ms_dispatch active mgrdigest v1 -9932> 2020-04-17 02:33:03.177 7fbf5577c700 4 mgr ms_dispatch mgrdigest v1 -9931> 2020-04-17 02:33:03.312 7fbf5af87700 0 auth: could not find secret_id=391 -9930> 2020-04-17 02:33:03.312 7fbf5af87700 0 cephx: verify_authorizer could not get service secret for service mgr secret_id=391 -9929> 2020-04-17 02:33:03.312 7fbf5af87700 10 monclient: handle_auth_request bad authorizer on 0x5625ec8bf800 -9928> 2020-04-17 02:33:03.486 7fbf2fba6700 0 log_channel(cluster) log [DBG] : pgmap v57930: 132 pgs: 9 stale+peering, 123 active+clean; 458 GiB data, 1.2 TiB used, 4.2 TiB / 5.5 TiB avail -9927> 2020-04-17 02:33:03.486 7fbf2fba6700 10 monclient: _send_mon_message to mon.magna057 at v2:10.8.128.57:3300/0 -9926> 2020-04-17 02:33:03.655 7fbf5577c700 4 mgr ms_dispatch active log(5000 entries from seq 232357058 at 2020-04-17 02:32:44.303167) v1 -9925> 2020-04-17 02:33:03.655 7fbf5577c700 4 mgr ms_dispatch log(5000 entries from seq 232357058 at 2020-04-17 02:32:44.303167) v1 -9924> 2020-04-17 02:33:03.699 7fbf5577c700 4 mgr ms_dispatch active service_map(e20 1 svc) v1 -9923> 2020-04-17 02:33:03.699 7fbf5577c700 4 mgr ms_dispatch service_map(e20 1 svc) v1 -9922> 2020-04-17 02:33:03.884 7fbf5af87700 0 auth: could not find secret_id=390 -9921> 2020-04-17 02:33:03.884 7fbf5af87700 0 cephx: verify_authorizer could not get service secret for service mgr secret_id=390 -9920> 2020-04-17 02:33:03.884 7fbf5af87700 10 monclient: handle_auth_request bad authorizer on 0x5629d222c000 -9919> 2020-04-17 02:33:04.155 7fbf5577c700 4 mgr ms_dispatch active log(5000 entries from seq 232362058 at 2020-04-17 02:32:44.332027) v1 -9918> 2020-04-17 02:33:04.155 7fbf5577c700 4 mgr ms_dispatch log(5000 entries from seq 232362058 at 2020-04-17 02:32:44.332027) v1 -9917> 2020-04-17 02:33:04.195 7fbf5577c700 4 mgr ms_dispatch active service_map(e20 1 svc) v1 -9916> 2020-04-17 02:33:04.195 7fbf5577c700 4 mgr ms_dispatch service_map(e20 1 svc) v1 -9915> 2020-04-17 02:33:04.696 7fbf51f75700 4 mgr send_beacon active -9914> 2020-04-17 02:33:04.696 7fbf51f75700 10 monclient: _send_mon_message to mon.magna057 at v2:10.8.128.57:3300/0 -9913> 2020-04-17 02:33:04.816 7fbf5577c700 4 mgr ms_dispatch active log(5000 entries from seq 232367058 at 2020-04-17 02:32:44.362081) v1 -9912> 2020-04-17 02:33:04.816 7fbf5577c700 4 mgr ms_dispatch log(5000 entries from seq 232367058 at 2020-04-17 02:32:44.362081) v1 -9911> 2020-04-17 02:33:04.858 7fbf5577c700 4 mgr ms_dispatch active service_map(e20 1 svc) v1 -9910> 2020-04-17 02:33:04.859 7fbf5577c700 4 mgr ms_dispatch service_map(e20 1 svc) v1 -9909> 2020-04-17 02:33:04.887 7fbf5af87700 0 auth: could not find secret_id=391 -9908> 2020-04-17 02:33:04.887 7fbf5af87700 0 cephx: verify_authorizer could not get service secret for service mgr secret_id=391 ... -38> 2020-04-17 02:45:35.878 7fbf2fba6700 10 monclient: _send_mon_message to mon.magna057 at v2:10.8.128.57:3300/0 -37> 2020-04-17 02:45:36.359 7fbf5577c700 4 mgr ms_dispatch active log(5000 entries from seq 235915427 at 2020-04-17 02:45:21.649510) v1 -36> 2020-04-17 02:45:36.359 7fbf5577c700 4 mgr ms_dispatch log(5000 entries from seq 235915427 at 2020-04-17 02:45:21.649510) v1 -35> 2020-04-17 02:45:36.400 7fbf5577c700 4 mgr ms_dispatch active service_map(e20 1 svc) v1 -34> 2020-04-17 02:45:36.400 7fbf5577c700 4 mgr ms_dispatch service_map(e20 1 svc) v1 -33> 2020-04-17 02:45:36.826 7fbf30ba8700 4 mgr.server handle_report from 0x5625f19a4c00 rgw,magna067.rgw0 -32> 2020-04-17 02:45:36.877 7fbf5577c700 4 mgr ms_dispatch active log(5000 entries from seq 235920427 at 2020-04-17 02:45:21.678877) v1 -31> 2020-04-17 02:45:36.877 7fbf5577c700 4 mgr ms_dispatch log(5000 entries from seq 235920427 at 2020-04-17 02:45:21.678877) v1 -30> 2020-04-17 02:45:36.917 7fbf5577c700 4 mgr ms_dispatch active service_map(e20 1 svc) v1 -29> 2020-04-17 02:45:36.917 7fbf5577c700 4 mgr ms_dispatch service_map(e20 1 svc) v1 -28> 2020-04-17 02:45:36.974 7fbf30ba8700 4 mgr.server handle_report from 0x5625ec448c00 mon,magna067 -27> 2020-04-17 02:45:37.032 7fbf5577c700 4 mgr ms_dispatch active service_map(e20 1 svc) v1 -26> 2020-04-17 02:45:37.032 7fbf5577c700 4 mgr ms_dispatch service_map(e20 1 svc) v1 -25> 2020-04-17 02:45:37.032 7fbf5577c700 4 mgr ms_dispatch active mgrdigest v1 -24> 2020-04-17 02:45:37.032 7fbf5577c700 4 mgr ms_dispatch mgrdigest v1 -23> 2020-04-17 02:45:37.210 7fbf30ba8700 4 mgr.server handle_report from 0x5625eb821400 osd,8 -22> 2020-04-17 02:45:37.248 7fbf30ba8700 4 mgr.server handle_report from 0x5629d222d800 osd,5 -21> 2020-04-17 02:45:37.411 7fbf30ba8700 4 mgr.server handle_report from 0x5629d222dc00 osd,7 -20> 2020-04-17 02:45:37.412 7fbf51f75700 4 mgr send_beacon active -19> 2020-04-17 02:45:37.412 7fbf51f75700 10 monclient: _send_mon_message to mon.magna057 at v2:10.8.128.57:3300/0 -18> 2020-04-17 02:45:37.446 7fbf30ba8700 4 mgr.server handle_report from 0x56288cd56000 osd,1 -17> 2020-04-17 02:45:37.517 7fbf5577c700 4 mgr ms_dispatch active log(5000 entries from seq 235925427 at 2020-04-17 02:45:22.664953) v1 -16> 2020-04-17 02:45:37.517 7fbf5577c700 4 mgr ms_dispatch log(5000 entries from seq 235925427 at 2020-04-17 02:45:22.664953) v1 -15> 2020-04-17 02:45:37.557 7fbf5577c700 4 mgr ms_dispatch active service_map(e20 1 svc) v1 -14> 2020-04-17 02:45:37.557 7fbf5577c700 4 mgr ms_dispatch service_map(e20 1 svc) v1 -13> 2020-04-17 02:45:37.824 7fbf30ba8700 4 mgr.server handle_report from 0x5625eda8c000 mon,magna057 -12> 2020-04-17 02:45:37.849 7fbf30ba8700 4 mgr.server handle_report from 0x5625ed668800 mgr,magna057 -11> 2020-04-17 02:45:37.879 7fbf2fba6700 0 log_channel(cluster) log [DBG] : pgmap v58307: 132 pgs: 9 stale+peering, 123 active+clean; 458 GiB data, 1.2 TiB used, 4.2 TiB / 5.5 TiB avail; 4.0 KiB/s rd, 0 B/s wr, 6 op/s -10> 2020-04-17 02:45:37.879 7fbf2fba6700 10 monclient: _send_mon_message to mon.magna057 at v2:10.8.128.57:3300/0 -9> 2020-04-17 02:45:38.008 7fbf31bea700 10 monclient: tick -8> 2020-04-17 02:45:38.008 7fbf31bea700 10 monclient: _check_auth_rotating have uptodate secrets (they expire after 2020-04-17 02:45:08.009722) -7> 2020-04-17 02:45:38.050 7fbf5577c700 4 mgr ms_dispatch active log(5000 entries from seq 235930427 at 2020-04-17 02:45:22.694286) v1 -6> 2020-04-17 02:45:38.050 7fbf5577c700 4 mgr ms_dispatch log(5000 entries from seq 235930427 at 2020-04-17 02:45:22.694286) v1 -5> 2020-04-17 02:45:38.070 7fbf30ba8700 4 mgr.server handle_report from 0x5625ec745400 rgw,magna057.rgw0 -4> 2020-04-17 02:45:38.090 7fbf5577c700 4 mgr ms_dispatch active service_map(e20 1 svc) v1 -3> 2020-04-17 02:45:38.090 7fbf5577c700 4 mgr ms_dispatch service_map(e20 1 svc) v1 -2> 2020-04-17 02:45:38.786 7fbf5577c700 4 mgr ms_dispatch active log(5000 entries from seq 235935427 at 2020-04-17 02:45:23.777440) v1 -1> 2020-04-17 02:45:38.786 7fbf5577c700 4 mgr ms_dispatch log(5000 entries from seq 235935427 at 2020-04-17 02:45:23.777440) v1 0> 2020-04-17 02:45:39.029 7fbf5577c700 -1 *** Caught signal (Aborted) ** in thread 7fbf5577c700 thread_name:ms_dispatch ceph version 14.2.8-28.el7cp (b32eac9fd60c00c62a2d3c85d88b483be7b55ba1) nautilus (stable) 1: (()+0xf630) [0x7fbf5ead4630] 2: (gsignal()+0x37) [0x7fbf5d6b2387] 3: (abort()+0x148) [0x7fbf5d6b3a78] 4: (__gnu_cxx::__verbose_terminate_handler()+0x165) [0x7fbf5dfc27d5] 5: (()+0x5e746) [0x7fbf5dfc0746] 6: (()+0x5e773) [0x7fbf5dfc0773] 7: (()+0x5e993) [0x7fbf5dfc0993] 8: (()+0x1772b) [0x7fbf605dc72b] 9: (tcmalloc::allocate_full_cpp_throw_oom(unsigned long)+0xf3) [0x7fbf605faa03] 10: (void std::vector<std::pair<Context*, int>, std::allocator<std::pair<Context*, int> > >::_M_realloc_insert<std::pair<Context*, int> >(__gnu_cxx::__normal_iterator<std::pair<Context*, int>*, std::vector<std::pair<Context*, int>, std::allocator<std::pair<Context*, int> > > >, std::pair<Context*, int>&&)+0x56) [0x5625e6034976] 11: (Finisher::queue(Context*, int)+0xe7) [0x5625e6034b77] 12: (ActivePyModules::notify_all(LogEntry const&)+0x109) [0x5625e602a839] 13: (Mgr::handle_log(MLog*)+0x58) [0x5625e60a1928] 14: (Mgr::ms_dispatch(Message*)+0x22b) [0x5625e60a889b] 15: (MgrStandby::ms_dispatch(Message*)+0x97) [0x5625e60b0157] 16: (Dispatcher::ms_dispatch2(boost::intrusive_ptr<Message> const&)+0x26) [0x5625e609a916] 17: (DispatchQueue::entry()+0x1699) [0x7fbf61400c29] 18: (DispatchQueue::DispatchThread::entry()+0xd) [0x7fbf614ae0ad] 19: (()+0x7ea5) [0x7fbf5eaccea5] 20: (clone()+0x6d) [0x7fbf5d77a8dd] NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this. --- logging levels --- 0/ 5 none 0/ 1 lockdep 0/ 1 context 1/ 1 crush 1/ 5 mds 1/ 5 mds_balancer 1/ 5 mds_locker 1/ 5 mds_log 1/ 5 mds_log_expire 1/ 5 mds_migrator 0/ 1 buffer 0/ 1 timer 0/ 1 filer 0/ 1 striper 0/ 1 objecter 0/ 5 rados 0/ 5 rbd 0/ 5 rbd_mirror 0/ 5 rbd_replay 0/ 5 journaler 0/ 5 objectcacher 0/ 5 client 1/ 5 osd 0/ 5 optracker 0/ 5 objclass 1/ 3 filestore 1/ 3 journal 0/ 0 ms 1/ 5 mon 0/10 monc 1/ 5 paxos 0/ 5 tp 1/ 5 auth 1/ 5 crypto 1/ 1 finisher 1/ 1 reserver 1/ 5 heartbeatmap 1/ 5 perfcounter 1/ 5 rgw 1/ 5 rgw_sync 1/10 civetweb 1/ 5 javaclient 1/ 5 asok 1/ 1 throttle 0/ 0 refs 1/ 5 xio 1/ 5 compressor 1/ 5 bluestore 1/ 5 bluefs 1/ 3 bdev 1/ 5 kstore 4/ 5 rocksdb 4/ 5 leveldb 4/ 5 memdb 1/ 5 kinetic 1/ 5 fuse 1/ 5 mgr 1/ 5 mgrc 1/ 5 dpdk 1/ 5 eventtrace 1/ 5 prioritycache -2/-2 (syslog threshold) -1/-1 (stderr threshold) max_recent 10000 max_new 1000 log_file /var/log/ceph/ceph-mgr.magna057.log --- end dump of recent events ---
Files
Updated by Ernesto Puerta about 4 years ago
Updated by Yaarit Hatuka over 2 years ago
- Crash signature (v1) updated (diff)
Moving the crash data into a note as 'Crash signature (v1)' should hold the value of 'stack_sig' key (which is missing in this case).
{ "os_version_id": "7.8", "utsname_release": "3.10.0-1127.el7.x86_64", "os_name": "Red Hat Enterprise Linux Server", "entity_name": "mgr.magna057", "timestamp": "2020-04-17 02:45:39.021860Z", "process_name": "ceph-mgr", "utsname_machine": "x86_64", "utsname_sysname": "Linux", "os_version": "7.8 (Maipo)", "os_id": "rhel", "utsname_version": "#1 SMP Tue Feb 18 16:39:12 EST 2020", "backtrace": [ "(()+0xf630) [0x7fbf5ead4630]", "(gsignal()+0x37) [0x7fbf5d6b2387]", "(abort()+0x148) [0x7fbf5d6b3a78]", "(__gnu_cxx::__verbose_terminate_handler()+0x165) [0x7fbf5dfc27d5]", "(()+0x5e746) [0x7fbf5dfc0746]", "(()+0x5e773) [0x7fbf5dfc0773]", "(()+0x5e993) [0x7fbf5dfc0993]", "(()+0x1772b) [0x7fbf605dc72b]", "(tcmalloc::allocate_full_cpp_throw_oom(unsigned long)+0xf3) [0x7fbf605faa03]", "(void std::vector<std::pair<Context*, int>, std::allocator<std::pair<Context*, int> > >::_M_realloc_insert<std::pair<Context*, int> >(__gnu_cxx::__normal_iterator<std::pair<Context*, int>*, std::vector<std::pair<Context*, int>, std::allocator<std::pair<Context*, int> > > >, std::pair<Context*, int>&&)+0x56) [0x5625e6034976]", "(Finisher::queue(Context*, int)+0xe7) [0x5625e6034b77]", "(ActivePyModules::notify_all(LogEntry const&)+0x109) [0x5625e602a839]", "(Mgr::handle_log(MLog*)+0x58) [0x5625e60a1928]", "(Mgr::ms_dispatch(Message*)+0x22b) [0x5625e60a889b]", "(MgrStandby::ms_dispatch(Message*)+0x97) [0x5625e60b0157]", "(Dispatcher::ms_dispatch2(boost::intrusive_ptr<Message> const&)+0x26) [0x5625e609a916]", "(DispatchQueue::entry()+0x1699) [0x7fbf61400c29]", "(DispatchQueue::DispatchThread::entry()+0xd) [0x7fbf614ae0ad]", "(()+0x7ea5) [0x7fbf5eaccea5]", "(clone()+0x6d) [0x7fbf5d77a8dd]" ], "utsname_hostname": "magna057.ceph.redhat.com", "crash_id": "2020-04-17_02:45:39.021860Z_e73adf03-9acd-413f-b1ee-a961db95901c", "ceph_version": "14.2.8-28.el7cp" }
Updated by Radoslaw Zarzynski almost 2 years ago
- Status changed from New to Need More Info
Nautilus is OOL. Does the problem still happen on a newer version?
Actions