Support #13211

closed

profiler and getting some memory info with it

Added by Sergey Mir over 8 years ago. Updated over 8 years ago.

Status: Closed
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%


Description

ceph version 0.94.3 (95cefea9fd9ab740263bf8bb4796fd864d9afe2b)
3.13.0-61-generic #100-Ubuntu 2015 x86_64 x86_64 x86_64 GNU/Linux

After turning on the heap profiler on osd.0 (I did this from the mon02 node, which runs the monitor and MDS daemons), I requested a dump with ceph tell osd.0 heap dump. I then got the warning "mon.mon00@0(leader).osd e1109 we have enough reports/reporters to mark osd.0 down" along with slow request messages, and osd.0 stopped working. After I turned the profiler off, the OSD went back to work.
Here is part of the osd.0 log file from when this started:
2015-09-23 16:53:32.904799 7f4d007a5700 0 osd.0 1109 do_command r=0
2015-09-23 16:53:32.923139 7f4d007a5700 0 turning on heap profiler with prefix /var/log/ceph//osd.0.profile
2015-09-23 16:53:32.933646 7f4d007a5700 0 osd.0 1109 do_command r=0
2015-09-23 16:53:40.971444 7f4d007a5700 0 osd.0 1109 do_command r=0
2015-09-23 16:53:41.152338 7f4d007a5700 0 osd.0 1109 do_command r=0
2015-09-23 16:53:51.230343 7f4d0dfc0700 1 heartbeat_map is_healthy 'OSD::osd_op_tp thread 0x7f4d03fac700' had timed out after 15
2015-09-23 16:53:51.230996 7f4d0dfc0700 1 heartbeat_map is_healthy 'OSD::osd_op_tp thread 0x7f4d047ad700' had timed out after 15
2015-09-23 16:53:51.231905 7f4d0c7bd700 1 heartbeat_map is_healthy 'OSD::osd_op_tp thread 0x7f4d03fac700' had timed out after 15
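
For reference, the sequence described above (start the profiler, take a dump, turn the profiler off) can be wrapped so that the profiler is always switched off again, even if the dump step fails. This is only a sketch: the function name heap_dump_once and the CEPH_BIN override (used for dry runs) are made up for illustration, and it does not address why the OSD stalled while the dump was taken.

```shell
# Sketch: take one heap dump from a daemon and always stop the profiler afterwards.
# heap_dump_once and CEPH_BIN are hypothetical names; CEPH_BIN defaults to the real ceph CLI.
heap_dump_once() {
    daemon="$1"                          # e.g. osd.0
    ceph_bin="${CEPH_BIN:-ceph}"

    # Give up early if the profiler cannot be started.
    "$ceph_bin" tell "$daemon" heap start_profiler || return 1

    "$ceph_bin" tell "$daemon" heap dump
    rc=$?

    # Stop the profiler regardless of the dump result,
    # so the daemon is not left running with profiling enabled.
    "$ceph_bin" tell "$daemon" heap stop_profiler
    return "$rc"
}

# Dry run (prints the commands instead of running them):
#   CEPH_BIN=echo heap_dump_once osd.0
```

The dump files are written under the prefix shown in the log line "turning on heap profiler with prefix".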

--
Another problem: I cannot get MDS heap stats.
root@mon02 ~ # ceph tell mds.mon02 heap stats
Error EPERM: problem getting command descriptions from mds.

root@mon02 ~ # cat /var/log/ceph/ceph-mds.mon02.log
2015-09-23 16:36:16.408296 7f825e570700 1 mds.-1.0 handle_command: received command from client without `tell` capability: (mon/mds2_ip):0/4048052415
2015-09-23 16:36:16.408952 7f8259465700 0 -- (mon/mds2_ip):6800/10787 >> (mon/mds2_ip):0/4048052415 pipe(0x49b5000 sd=17 :6800 s=2 pgs=2 cs=1 l=0 c=0x494d440).fault, server, going to standby

Here is some info about the MDS daemons:
379044: (mon/mds2_ip):6800/10787 'mon02' mds.-1.0 up:standby seq 1
416149: (mon/mds0_ip):6800/31743 'mon00' mds.0.39 up:active seq 5 export_targets=1
399506: (mon/mds1_ip):6800/3114 'mon01' mds.1.10 up:active seq 6 export_targets=0

Is there any way to fix that?

Actions #1

Updated by Sergey Mir over 8 years ago

P.S.
Heap stats for the OSD and the mon work fine:
ceph tell osd.0 heap stats
ceph tell mon.mon00 heap stats

Actions #2

Updated by Sergey Mir over 8 years ago

The same thing happens with the MDS memory check on virtual machines (the same check for OSDs and mons works fine):

root@node1:~# ceph tell mds.node1 heap stats
2015-09-24 16:19:12.815946 7f55c7879740 20 client.-1 populate_metadata read hostname 'node1'
2015-09-24 16:19:12.816570 7f55a28d1700 10 client.-1 ms_handle_connect on 192.168.0.204:6789/0
2015-09-24 16:19:12.818458 7f55a28d1700 1 client.544635 handle_mds_map epoch 310
2015-09-24 16:19:12.818511 7f55c7879740 10 client.544635 resolve_mds: resolved ID 'node1' to GID 544106
2015-09-24 16:19:12.818559 7f55c7879740 4 client.544635 mds_command: new command op to 544106 tid=1[{"prefix": "get_command_descriptions"}]
2015-09-24 16:19:12.819618 7f55a28d1700 10 client.544635 ms_handle_connect on 192.168.0.204:6800/885
2015-09-24 16:19:12.820104 7f55a28d1700 10 client.544635 handle_command_reply: tid=1
2015-09-24 16:19:12.820490 7f55c7879740 1 client.544635 shutdown
2015-09-24 16:19:12.821738 7f55c7879740 20 client.544635 trim_cache size 0 max 0
Error EPERM: problem getting command descriptions from mds.node1

Actions #3

Updated by Greg Farnum over 8 years ago

Apparently you're not using a client with enough caps on the MDS to give it instructions. The client.admin key that's created by default should include enough, but anything you've created might not unless you set it up for this.
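
One way to see which MDS capability a key actually carries is to look at the "caps mds" line printed by ceph auth get. The helper below is a minimal sketch that extracts that value from keyring-style text on stdin; the function name mds_cap_from_keyring is hypothetical, and it assumes the usual keyring layout with one caps <service> = "<value>" line per service.

```shell
# Sketch: print the value of the "caps mds" line from keyring-style text on stdin.
# mds_cap_from_keyring is a hypothetical helper name, not part of the ceph CLI.
mds_cap_from_keyring() {
    # Match lines such as:    caps mds = "allow"
    # and print only the text between the double quotes.
    sed -n 's/^[[:space:]]*caps mds = "\(.*\)"[[:space:]]*$/\1/p'
}

# Usage against a live cluster:
#   ceph auth get client.admin | mds_cap_from_keyring
```

If the helper prints nothing, the key has no MDS capability line at all.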

Actions #4

Updated by Sergey Mir over 8 years ago

I ran it even from the MDS node whose heap stats I am trying to get, so I guess the problem here is not in the client...

What about the problem with the OSD? Is it supposed to behave the way I described before?

Actions #5

Updated by Sergey Mir over 8 years ago

Greg Farnum wrote:

Apparently you're not using a client with enough caps on the MDS to give it instructions. The client.admin key that's created by default should include enough, but anything you've created might not unless you set it up for this.

OK, how do I set it up then? Someone said that I need to fully reinstall Ceph to get "mds heap stats" working...
I cannot find any instructions about problems with the profiler on the web.
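
A possible way to set this up without reinstalling, sketched under an assumption: that the key in use (client.admin here) has an MDS capability that does not include `tell`, and that granting allow * on the MDS is what is missing. The thread does not confirm this was the actual fix. Note that ceph auth caps replaces the entity's whole capability set, so the mon and osd capabilities are restated as well. The function name grant_full_caps and the CEPH_BIN override are hypothetical.

```shell
# Sketch: give an entity full mds/mon/osd capabilities.
# grant_full_caps and CEPH_BIN are hypothetical names; CEPH_BIN defaults to the real ceph CLI.
grant_full_caps() {
    entity="${1:-client.admin}"
    ceph_bin="${CEPH_BIN:-ceph}"

    # "ceph auth caps" overwrites all existing caps for the entity,
    # so every service the key should keep access to must be listed here.
    "$ceph_bin" auth caps "$entity" mds 'allow *' mon 'allow *' osd 'allow *'
}

# Dry run (prints the command instead of running it):
#   CEPH_BIN=echo grant_full_caps client.admin
```

After changing the caps, ceph tell mds.mon02 heap stats can be retried from the same node.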

Actions #6

Updated by Sergey Mir over 8 years ago

There is only root, no other users. Even when I run it from localhost I get the same error: 1 mds.0.39 handle_command: received command from client without `tell` capability: (self_ip):0/1305560688

Actions #7

Updated by Greg Farnum over 8 years ago

  • Status changed from New to Closed

I think this got handled in irc.
