Bug #23778 (closed)

rados commands suddenly take a very long time to finish

Added by Lenz Grimmer about 6 years ago. Updated about 6 years ago.

Status: Resolved
Priority: Immediate
Assignee:
Category: -
Target version:
% Done: 0%
Source: Development
Tags:
Backport:
Regression: Yes
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Not sure at which git revision on master this started, but for a short while now, running "radosgw-admin" commands has been taking a very long time to complete in my vstart development environment. In a cluster started with "RGW=1 ../src/vstart.sh -n -x", even a plain "radosgw-admin user list" now takes up to three minutes to complete. I've tried to capture the output of running "radosgw-admin --debug-ms=1 user list" - see the attached file for details.
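
For reference, a minimal sketch of the reproduction in such a vstart environment. This is a sketch only: the paths are relative to the build directory, and redirecting stderr to a file is just one way to capture the --debug-ms output.

RGW=1 ../src/vstart.sh -n -x
# normally this returns within a few seconds; with the bug it takes minutes
time ./bin/radosgw-admin user list
# capture the messenger traffic, as in the attached file
./bin/radosgw-admin --debug-ms=1 user list 2> rgw-admin.out
bzip2 rgw-admin.out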


Files

rgw-admin.out.bz2 (752 KB) - Lenz Grimmer, 04/18/2018 10:53 AM
Actions #1

Updated by Lenz Grimmer about 6 years ago

I have now been waiting for half an hour for the "user create" command to return. FWIW, in the RGW performance counters I can see "objecter.op_active" oscillating between 9 and 13; not sure if this is relevant. On the command line, I observe the following notifications:

bash-4.4# ./bin/radosgw-admin user create --uid=lenz  --display-name="Lenz Grimmer" --system
2018-04-18 10:29:19.093 7f447b35a800 -1 WARNING: all dangerous and experimental features are enabled.
2018-04-18 10:29:19.101 7f447b35a800 -1 WARNING: all dangerous and experimental features are enabled.
2018-04-18 10:29:19.103 7f447b35a800 -1 WARNING: all dangerous and experimental features are enabled.
2018-04-18 10:39:39.993 7f4469328700  0 -- 192.168.178.21:0/690559816 >> 192.168.178.21:6805/155325 conn(0x7f443803d7d0 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=173743 cs=1 l=1).process missed message?  skipped from seq 0 to 1417934880
2018-04-18 10:52:59.947 7f4469b29700  0 -- 192.168.178.21:0/690559816 >> 192.168.178.21:6805/155325 conn(0x7f4438032120 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=265602 cs=1 l=1).process missed message?  skipped from seq 2 to 358266069
2018-04-18 11:08:53.768 7f446a32a700  0 -- 192.168.178.21:0/690559816 >> 192.168.178.21:6805/155325 conn(0x7f44380195e0 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=451167 cs=1 l=1).process missed message?  skipped from seq 3 to 1444234136
2018-04-18 11:09:17.260 7f4469328700  0 -- 192.168.178.21:0/690559816 >> 192.168.178.21:6805/155325 conn(0x7f443801e120 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=454371 cs=1 l=1).process missed message?  skipped from seq 2 to 259303
2018-04-18 11:09:44.534 7f4469328700  0 -- 192.168.178.21:0/690559816 >> 192.168.178.21:6805/155325 conn(0x7f44380204c0 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=458387 cs=1 l=1).process missed message?  skipped from seq 2 to 1178026385
2018-04-18 11:12:31.122 7f4469b29700  0 -- 192.168.178.21:0/690559816 >> 192.168.178.21:6805/155325 conn(0x7f4438026cf0 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=478551 cs=1 l=1).process missed message?  skipped from seq 3 to 418839487
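
One way to watch the objecter.op_active counter mentioned above is through the radosgw admin socket. A sketch, assuming vstart placed the socket at out/radosgw.8000.asok (the out/ directory can be checked for the actual .asok name):

# dump the perf counters of the running radosgw and pick out the objecter ops
./bin/ceph daemon out/radosgw.8000.asok perf dump | grep op_active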
Actions #2

Updated by Lenz Grimmer about 6 years ago

I aborted the process after ~2.5hrs and noticed that out/radosgw.8000.log had grown to 6.3GB in size:

bash-4.4# ls -lh out/radosgw.8000.log 
-rw-r--r--. 1 root root 6.3G Apr 18 13:06 out/radosgw.8000.log
bash-4.4# tail -30 out/radosgw.8000.log
2018-04-18 13:06:49.662 7fde9cb68700  1 -- 192.168.178.21:0/690559816 --> 192.168.178.21:6805/155325 -- osd_op(unknown.0.0:9163942 6.1 6:9f626c48:lc::lc.21:head [call lock.lock] snapc 0=[] ondisk+write+known_if_redirected e31) v8 -- 0x555d1b736a40 con 0
2018-04-18 13:06:49.666 7fdeb739d700  1 -- 192.168.178.21:0/690559816 <== osd.1 192.168.178.21:6805/155325 6072 ==== osd_op_reply(9163942 lc.21 [call] v31'17587 uv17587 ondisk = 0) v8 ==== 149+0+0 (3661307855 0 0) 0x555d1b736a40 con 0x555d1b11ea00
2018-04-18 13:06:49.666 7fde9cb68700  1 -- 192.168.178.21:0/690559816 --> 192.168.178.21:6805/155325 -- osd_op(unknown.0.0:9163943 6.1 6:9f626c48:lc::lc.21:head [call rgw.lc_get_head] snapc 0=[] ondisk+read+known_if_redirected e31) v8 -- 0x555d1b887380 con 0
2018-04-18 13:06:49.667 7fdeb739d700  1 -- 192.168.178.21:0/690559816 <== osd.1 192.168.178.21:6805/155325 6073 ==== osd_op_reply(9163943 lc.21 [call] v0'0 uv17587 ondisk = 0) v8 ==== 149+0+24 (2730243119 0 3151557651) 0x555d1b887380 con 0x555d1b11ea00
2018-04-18 13:06:49.667 7fde9cb68700  1 -- 192.168.178.21:0/690559816 --> 192.168.178.21:6805/155325 -- osd_op(unknown.0.0:9163944 6.1 6:9f626c48:lc::lc.21:head [call rgw.lc_list_entries] snapc 0=[] ondisk+read+known_if_redirected e31) v8 -- 0x555d1b9cf380 con 0
2018-04-18 13:06:49.668 7fdeb739d700  1 -- 192.168.178.21:0/690559816 <== osd.1 192.168.178.21:6805/155325 6074 ==== osd_op_reply(9163944 lc.21 [call] v0'0 uv17587 ondisk = 0) v8 ==== 149+0+11 (3821395741 0 3400070829) 0x555d1b9cf380 con 0x555d1b11ea00
2018-04-18 13:06:49.668 7fde9cb68700  1 -- 192.168.178.21:0/690559816 --> 192.168.178.21:6805/155325 -- osd_op(unknown.0.0:9163945 6.1 6:9f626c48:lc::lc.21:head [call rgw.lc_get_next_entry] snapc 0=[] ondisk+read+known_if_redirected e31) v8 -- 0x555d1b8e9a80 con 0
2018-04-18 13:06:49.668 7fdeb739d700  1 -- 192.168.178.21:0/690559816 <== osd.1 192.168.178.21:6805/155325 6075 ==== osd_op_reply(9163945 lc.21 [call] v0'0 uv17587 ondisk = 0) v8 ==== 149+0+14 (3644652679 0 3937909330) 0x555d1b8e9a80 con 0x555d1b11ea00
2018-04-18 13:06:49.668 7fde9cb68700  1 -- 192.168.178.21:0/690559816 --> 192.168.178.21:6805/155325 -- osd_op(unknown.0.0:9163946 6.1 6:9f626c48:lc::lc.21:head [call lock.unlock] snapc 0=[] ondisk+write+known_if_redirected e31) v8 -- 0x555d1b7e69c0 con 0
2018-04-18 13:06:49.671 7fdeb739d700  1 -- 192.168.178.21:0/690559816 <== osd.1 192.168.178.21:6805/155325 6076 ==== osd_op_reply(9163946 lc.21 [call] v31'17588 uv17588 ondisk = 0) v8 ==== 149+0+0 (3196761590 0 0) 0x555d1b7e69c0 con 0x555d1b11ea00
2018-04-18 13:06:51.838 7fdeb639b700  1 -- 192.168.178.21:0/690559816 --> 192.168.178.21:6801/155013 -- osd_op(unknown.0.0:9163947 4.0 4:05bf5b68:::notify.1:head [watch ping cookie 93858361582592 gen 1155278] snapc 0=[] ondisk+write+known_if_redirected e31) v8 -- 0x555d1a916080 con 0
2018-04-18 13:06:51.838 7fdeb639b700  1 -- 192.168.178.21:0/690559816 --> 192.168.178.21:6801/155013 -- osd_op(unknown.0.0:9163948 4.6 4:7759931f:::notify.3:head [watch ping cookie 93858361584896 gen 1155278] snapc 0=[] ondisk+write+known_if_redirected e31) v8 -- 0x555d1b710d00 con 0
2018-04-18 13:06:51.838 7fdeb639b700  1 -- 192.168.178.21:0/690559816 --> 192.168.178.21:6801/155013 -- osd_op(unknown.0.0:9163949 4.3 4:c609908c:::notify.5:head [watch ping cookie 93858361587200 gen 1155278] snapc 0=[] ondisk+write+known_if_redirected e31) v8 -- 0x555d1b99a3c0 con 0
2018-04-18 13:06:51.838 7fdeb639b700  1 -- 192.168.178.21:0/690559816 --> 192.168.178.21:6805/155325 -- osd_op(unknown.0.0:9163950 4.7 4:f95f44c2:::notify.0:head [watch ping cookie 93858361581440 gen 775875] snapc 0=[] ondisk+write+known_if_redirected e31) v8 -- 0x555d1b9d70c0 con 0
2018-04-18 13:06:51.838 7fdeb639b700  1 -- 192.168.178.21:0/690559816 --> 192.168.178.21:6805/155325 -- osd_op(unknown.0.0:9163951 4.5 4:a93a5511:::notify.2:head [watch ping cookie 93858361583744 gen 775875] snapc 0=[] ondisk+write+known_if_redirected e31) v8 -- 0x555d1b8869c0 con 0
2018-04-18 13:06:51.838 7fdeb639b700  1 -- 192.168.178.21:0/690559816 --> 192.168.178.21:6805/155325 -- osd_op(unknown.0.0:9163952 4.5 4:b4812045:::notify.4:head [watch ping cookie 93858361586048 gen 775875] snapc 0=[] ondisk+write+known_if_redirected e31) v8 -- 0x555d1b9ce000 con 0
2018-04-18 13:06:51.838 7fdeb639b700  1 -- 192.168.178.21:0/690559816 --> 192.168.178.21:6805/155325 -- osd_op(unknown.0.0:9163953 4.4 4:2b04a3e9:::notify.6:head [watch ping cookie 93858361588352 gen 775875] snapc 0=[] ondisk+write+known_if_redirected e31) v8 -- 0x555d1b9cf040 con 0
2018-04-18 13:06:51.838 7fdeb639b700  1 -- 192.168.178.21:0/690559816 --> 192.168.178.21:6809/155637 -- osd_op(unknown.0.0:9163954 4.1 4:93e5b521:::notify.7:head [watch ping cookie 93858364325888 gen 1540482] snapc 0=[] ondisk+write+known_if_redirected e31) v8 -- 0x555d1a906d80 con 0
2018-04-18 13:06:51.838 7fdeb639b700  1 -- 192.168.178.21:0/690559816 --> 192.168.178.21:6805/155325 -- ping magic: 0 v1 -- 0x555d1b466ee0 con 0
2018-04-18 13:06:51.838 7fdeb639b700  1 -- 192.168.178.21:0/690559816 --> 192.168.178.21:6801/155013 -- ping magic: 0 v1 -- 0x555d1b4670c0 con 0
2018-04-18 13:06:51.838 7fdeb639b700  1 -- 192.168.178.21:0/690559816 --> 192.168.178.21:6809/155637 -- ping magic: 0 v1 -- 0x555d1b465c20 con 0
2018-04-18 13:06:51.839 7fdeb6b9c700  1 -- 192.168.178.21:0/690559816 <== osd.0 192.168.178.21:6801/155013 4219 ==== osd_op_reply(9163947 notify.1 [watch ping cookie 93858361582592 gen 1155278] v0'0 uv44543 ondisk = 0) v8 ==== 152+0+0 (323144850 0 0) 0x555d1b99a3c0 con 0x555d1b0e6700
2018-04-18 13:06:51.839 7fdeb739d700  1 -- 192.168.178.21:0/690559816 <== osd.2 192.168.178.21:6809/155637 107 ==== osd_op_reply(9163954 notify.7 [watch ping cookie 93858364325888 gen 1540482] v0'0 uv59939 ondisk = 0) v8 ==== 152+0+0 (1657301086 0 0) 0x555d1a906d80 con 0x555d1a612300
2018-04-18 13:06:51.839 7fdeb739d700  1 -- 192.168.178.21:0/690559816 <== osd.1 192.168.178.21:6805/155325 6077 ==== osd_op_reply(9163951 notify.2 [watch ping cookie 93858361583744 gen 775875] v0'0 uv44629 ondisk = 0) v8 ==== 152+0+0 (1020775032 0 0) 0x555d1a906d80 con 0x555d1b11ea00
2018-04-18 13:06:51.839 7fdeb6b9c700  1 -- 192.168.178.21:0/690559816 <== osd.0 192.168.178.21:6801/155013 4220 ==== osd_op_reply(9163948 notify.3 [watch ping cookie 93858361584896 gen 1155278] v0'0 uv64037 ondisk = 0) v8 ==== 152+0+0 (903092514 0 0) 0x555d1b99a3c0 con 0x555d1b0e6700
2018-04-18 13:06:51.839 7fdeb739d700  1 -- 192.168.178.21:0/690559816 <== osd.1 192.168.178.21:6805/155325 6078 ==== osd_op_reply(9163950 notify.0 [watch ping cookie 93858361581440 gen 775875] v0'0 uv29620 ondisk = 0) v8 ==== 152+0+0 (2110732861 0 0) 0x555d1a906d80 con 0x555d1b11ea00
2018-04-18 13:06:51.839 7fdeb739d700  1 -- 192.168.178.21:0/690559816 <== osd.1 192.168.178.21:6805/155325 6079 ==== osd_op_reply(9163952 notify.4 [watch ping cookie 93858361586048 gen 775875] v0'0 uv83624 ondisk = 0) v8 ==== 152+0+0 (143870924 0 0) 0x555d1a906d80 con 0x555d1b11ea00
2018-04-18 13:06:51.839 7fdeb739d700  1 -- 192.168.178.21:0/690559816 <== osd.1 192.168.178.21:6805/155325 6080 ==== osd_op_reply(9163953 notify.6 [watch ping cookie 93858361588352 gen 775875] v0'0 uv32473 ondisk = 0) v8 ==== 152+0+0 (27727412 0 0) 0x555d1a906d80 con 0x555d1b11ea00
2018-04-18 13:06:51.839 7fdeb6b9c700  1 -- 192.168.178.21:0/690559816 <== osd.0 192.168.178.21:6801/155013 4221 ==== osd_op_reply(9163949 notify.5 [watch ping cookie 93858361587200 gen 1155278] v0'0 uv64182 ondisk = 0) v8 ==== 152+0+0 (678091871 0 0) 0x555d1b99a3c0 con 0x555d1b0e6700
2018-04-18 13:06:51.889 7fdeb3b96700  1 -- 192.168.178.21:0/690559816 --> 192.168.178.21:6800/154723 -- mgrreport(rgw.rgw +0-0 packed 182 status=1) v6 -- 0x555d1a776a00 con 0
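
A quick way to summarize what a log of this size is doing is to tally the object-class calls in the osd_op lines. A sketch, assuming the line format shown above:

# count which cls calls dominate the log
grep -o 'call [a-z_.]*' out/radosgw.8000.log | sort | uniq -c | sort -rn | head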

Actions #3

Updated by Lenz Grimmer about 6 years ago

Some further debugging with Abhishek revealed that the radosgw-admin command does not hang if the option "--rgw-cache-enabled=false" is added.
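
For anyone hitting the same hang, applying the workaround is just a matter of appending the flag mentioned above to the affected command, e.g.:

./bin/radosgw-admin user list --rgw-cache-enabled=false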

Actions #4

Updated by Casey Bodley about 6 years ago

  • Status changed from New to 12
  • Assignee set to Casey Bodley
  • Priority changed from Normal to Urgent
Actions #5

Updated by Yehuda Sadeh about 6 years ago

I'm able to reproduce this without radosgw/radosgw-admin at all. Just running multiple concurrent rados requests is enough to trigger it, e.g. running several instances of the following:

$ rados -p foo put bar -

It seems to me that we can't get more than one open connection to any monitor/osd at any time. Once there is one open connection from any client, the second one would fail.

Actions #6

Updated by Matt Benjamin about 6 years ago

yeah, Casey and others have been digging into this over the past 2 days :(

Matt

Actions #7

Updated by Yehuda Sadeh about 6 years ago

  • Project changed from rgw to Ceph
  • Subject changed from radosgw-admin commands suddenly take a very long time to finish to rados commands suddenly take a very long time to finish

Moving this outside of rgw as it's a rados issue. Just reproduced it on a new cluster without any radosgw commands involved. Scenario as described above:

Open 4 separate terminals; in 3 of them, run:

$ rados --debug-ms=1 put -p foo bar -

On the 4th run:

$ touch /tmp/bla
$ rados --debug-ms=1 put -p foo bar /tmp/bla

The last command will take a very long time to finish.
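
A sketch that drives the same four-terminal reproduction from a single shell using background jobs; it assumes the pool "foo" already exists and that the cluster is reachable from the current environment:

# three writers whose stdin never reaches EOF, so their client connections stay open
for i in 1 2 3; do
  sleep infinity | rados --debug-ms=1 put -p foo bar - &
done
# the fourth client; with the bug present, this is the one that stalls for minutes
touch /tmp/bla
time rados --debug-ms=1 put -p foo bar /tmp/bla
# clean up the background writers afterwards
kill %1 %2 %3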

Actions #8

Updated by Casey Bodley about 6 years ago

  • Project changed from Ceph to rgw

Yehuda Sadeh wrote:

It seems to me that we can't get more than one open connection to any monitor/osd at any time. Once there is one open connection from any client, the second one would fail.

yeah, that seems to be it. once you open a second client connection, the server closes the first with a message like ".handle_connect_msg accept replacing existing (lossy) channel (new one lossy=1)"

the server side uses the client messenger's 'nonce' to distinguish between clients with the same address, and clients use generate_random_number() to get a unique one. but after adding some debug logging, i'm seeing that clients are generating the same nonces - they appear to be using the same random seed

radosgw:

2018-04-20 10:51:32.435 7fcc7e246a40 1 generated nonce=606610977102444280 nonce32=1057699576
2018-04-20 10:51:32.462 7fcc7e246a40 1 generated nonce=11680327234415193037 nonce32=329477069

radosgw-admin:

2018-04-20 10:51:49.367 7fbafe33c800 1 generated nonce=606610977102444280 nonce32=1057699576
2018-04-20 10:51:49.389 7fbafe33c800 1 generated nonce=11680327234415193037 nonce32=329477069

https://github.com/ceph/ceph/pull/20670 merged last week and involved changes to generate_random_number(); reverting that commit appears to resolve the issue.
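
To illustrate the failure mode without touching Ceph internals: two processes that seed their random number generator identically draw identical "random" values, which is why the clients above end up with the same messenger nonce and the server replaces the first connection when the second arrives. A toy sketch in bash, with $RANDOM standing in for generate_random_number() (not Ceph code):

# seed two separate shells with the same value and compare the "nonces" they draw
bash -c 'RANDOM=20180420; echo "client A nonce: $RANDOM"'
bash -c 'RANDOM=20180420; echo "client B nonce: $RANDOM"'
# both commands print the same number, so a server keying clients on
# (address, nonce) cannot tell them apart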

Actions #9

Updated by Casey Bodley about 6 years ago

  • Project changed from rgw to Ceph
Actions #10

Updated by Kefu Chai about 6 years ago

  • Priority changed from Urgent to Immediate
Actions #11

Updated by Casey Bodley about 6 years ago

  • Status changed from 12 to 7
Actions #12

Updated by Sage Weil about 6 years ago

  • Status changed from 7 to Resolved