Project

General

Profile

Bug #6709

rgw upgrade test fails during readwrite

Added by Sage Weil over 10 years ago. Updated about 10 years ago.

Status:
Can't reproduce
Priority:
Urgent
Assignee:
Target version:
-
% Done:

0%

Source:
Q/A
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

2013-11-02T19:02:14.461 INFO:teuthology.orchestra.run.out:[10.214.132.3]: duration: 35398960
2013-11-02T19:02:14.461 INFO:teuthology.orchestra.run.out:[10.214.132.3]: key: gbljaqydj
2013-11-02T19:02:14.461 INFO:teuthology.orchestra.run.out:[10.214.132.3]: start: 1383443632.317819
2013-11-02T19:02:14.461 INFO:teuthology.orchestra.run.out:[10.214.132.3]: type: r
2013-11-02T19:02:14.462 INFO:teuthology.orchestra.run.out:[10.214.132.3]: worker: 3
2013-11-02T19:07:09.685 INFO:teuthology.orchestra.run.err:[10.214.132.3]: Traceback (most recent call last):
2013-11-02T19:07:09.685 INFO:teuthology.orchestra.run.err:[10.214.132.3]:   File "/home/ubuntu/cephtest/s3-tests/virtualenv/bin/s3tests-test-readwrite", line 9, in <module>
2013-11-02T19:07:09.686 INFO:teuthology.orchestra.run.err:[10.214.132.3]:     load_entry_point('s3tests==0.0.1', 'console_scripts', 's3tests-test-readwrite')()
2013-11-02T19:07:09.686 INFO:teuthology.orchestra.run.err:[10.214.132.3]:   File "/home/ubuntu/cephtest/s3-tests/s3tests/readwrite.py", line 264, in main
2013-11-02T19:07:09.686 INFO:teuthology.orchestra.run.err:[10.214.132.3]:     common.nuke_bucket(bucket)
2013-11-02T19:07:09.686 INFO:teuthology.orchestra.run.err:[10.214.132.3]:   File "/home/ubuntu/cephtest/s3-tests/s3tests/common.py", line 42, in nuke_bucket
2013-11-02T19:07:09.686 INFO:teuthology.orchestra.run.err:[10.214.132.3]:     bucket.set_canned_acl('private')
2013-11-02T19:07:09.686 INFO:teuthology.orchestra.run.err:[10.214.132.3]:   File "/home/ubuntu/cephtest/s3-tests/virtualenv/local/lib/python2.7/site-packages/boto/s3/bucket.py", line 746, in set_canned_acl
2013-11-02T19:07:09.686 INFO:teuthology.orchestra.run.err:[10.214.132.3]:     headers=headers, query_args=query_args)
2013-11-02T19:07:09.686 INFO:teuthology.orchestra.run.err:[10.214.132.3]:   File "/home/ubuntu/cephtest/s3-tests/virtualenv/local/lib/python2.7/site-packages/boto/s3/connection.py", line 547, in make_request
2013-11-02T19:07:09.687 INFO:teuthology.orchestra.run.err:[10.214.132.3]:     retry_handler=retry_handler
2013-11-02T19:07:09.687 INFO:teuthology.orchestra.run.err:[10.214.132.3]:   File "/home/ubuntu/cephtest/s3-tests/virtualenv/local/lib/python2.7/site-packages/boto/connection.py", line 1007, in make_request
2013-11-02T19:07:09.687 INFO:teuthology.orchestra.run.err:[10.214.132.3]:     retry_handler=retry_handler)
2013-11-02T19:07:09.687 INFO:teuthology.orchestra.run.err:[10.214.132.3]:   File "/home/ubuntu/cephtest/s3-tests/virtualenv/local/lib/python2.7/site-packages/boto/connection.py", line 963, in _mexe
2013-11-02T19:07:09.687 INFO:teuthology.orchestra.run.err:[10.214.132.3]:     raise BotoServerError(response.status, response.reason, body)
2013-11-02T19:07:09.687 INFO:teuthology.orchestra.run.err:[10.214.132.3]: boto.exception.BotoServerError: BotoServerError: 500 Internal Server Error

rgw log shows the threads start timing out
2013-11-02 18:54:52.553975 7f8bd37ae700  1 ====== starting new request req=0x9e0c00 =====
2013-11-02 18:54:52.867753 7f8be57d2700  1 ====== starting new request req=0x9da800 =====
2013-11-02 18:54:53.831427 7f8bd47b0700  1 ====== starting new request req=0x9ee410 =====
2013-11-02 18:54:54.117006 7f8bf0fe9700  1 ====== starting new request req=0x9c8020 =====
2013-11-02 18:54:55.041521 7f8bcd7a2700  1 ====== starting new request req=0x9c8bb0 =====
2013-11-02 18:55:22.887125 7f8bef7e6700  1 ====== starting new request req=0x9ae7a0 =====
2013-11-02 18:55:23.288716 7f8bcc7a0700  1 ====== starting new request req=0x9e43a0 =====
2013-11-02 18:55:23.736899 7f8bdcfc1700  1 ====== starting new request req=0x9e4f70 =====
2013-11-02 18:55:25.346139 7f8be17ca700  1 ====== starting new request req=0x9b4b70 =====
2013-11-02 18:55:25.448984 7f8be0fc9700  1 ====== starting new request req=0xa1f750 =====
2013-11-02 18:55:25.767384 7f8bf87f8700  1 ====== starting new request req=0xbe5360 =====
2013-11-02 18:55:26.104406 7f8be9fdb700  1 ====== starting new request req=0xbe5f10 =====
2013-11-02 18:55:26.118136 7f8bd9fbb700  1 ====== starting new request req=0xc05af0 =====
2013-11-02 18:55:27.737650 7f8bd4fb1700  1 ====== starting new request req=0xbe36d0 =====
2013-11-02 18:55:27.750137 7f8bf37ee700  1 ====== starting new request req=0xa212c0 =====
2013-11-02 18:55:53.496947 7f8bd6fb5700  1 ====== starting new request req=0xa21e90 =====
2013-11-02 18:55:53.940631 7f8be1fcb700  1 ====== starting new request req=0xa23a70 =====
2013-11-02 18:55:57.879643 7f8bd17aa700  1 ====== starting new request req=0xbdf650 =====
2013-11-02 18:55:59.788570 7f8bcffa7700  1 ====== starting new request req=0xbe1270 =====
2013-11-02 18:56:00.906427 7f8bf27ec700  1 ====== starting new request req=0xbe1e00 =====
2013-11-02 18:56:01.719889 7f8bd3faf700  1 ====== starting new request req=0xc0fa00 =====
2013-11-02 18:56:02.060497 7f8bed7e2700  1 ====== starting new request req=0xa195c0 =====
2013-11-02 18:56:02.497213 7f8bd5fb3700  1 ====== starting new request req=0xa19f80 =====
2013-11-02 18:56:02.920600 7f8bf3fef700  1 ====== starting new request req=0xa1bd70 =====
2013-11-02 18:56:03.179237 7f8bd97ba700  1 ====== starting new request req=0x99a950 =====
2013-11-02 18:56:27.814639 7f8bd8fb9700  1 ====== starting new request req=0x9c0530 =====
2013-11-02 18:56:31.410228 7f8be8fd9700  1 ====== starting new request req=0x9c2220 =====
2013-11-02 18:56:34.484351 7f8bde7c4700  1 ====== starting new request req=0x9c29c0 =====
2013-11-02 18:56:36.039275 7f8be87d8700  1 ====== starting new request req=0x9bc360 =====
2013-11-02 19:02:13.757336 7f8bdf7c6700  1 ====== starting new request req=0x9bc8c0 =====
2013-11-02 19:02:44.613973 7f8bdffc7700  1 ====== starting new request req=0x9b0900 =====
2013-11-02 19:03:15.304979 7f8bfbfff700  1 ====== starting new request req=0xbf1460 =====
2013-11-02 19:03:49.348320 7f8bdefc5700  1 ====== starting new request req=0xbdb220 =====
2013-11-02 19:03:53.384629 7f8c51767700  1 heartbeat_map is_healthy 'RGWProcess::m_tp thread 0x7f8bccfa1700' had timed out after 600
2013-11-02 19:03:53.384647 7f8c51767700  1 heartbeat_map is_healthy 'RGWProcess::m_tp thread 0x7f8bcefa5700' had timed out after 600
2013-11-02 19:03:53.384654 7f8c51767700  1 heartbeat_map is_healthy 'RGWProcess::m_tp thread 0x7f8bd2fad700' had timed out after 600
2013-11-02 19:03:53.384660 7f8c51767700  1 heartbeat_map is_healthy 'RGWProcess::m_tp thread 0x7f8bd87b8700' had timed out after 600
2013-11-02 19:03:53.384664 7f8c51767700  1 heartbeat_map is_healthy 'RGWProcess::m_tp thread 0x7f8bdb7be700' had timed out after 600
2013-11-02 19:03:53.384674 7f8c51767700  1 heartbeat_map is_healthy 'RGWProcess::m_tp thread 0x7f8bdbfbf700' had timed out after 600
2013-11-02 19:03:53.384681 7f8c51767700  1 heartbeat_map is_healthy 'RGWProcess::m_tp thread 0x7f8be5fd3700' had timed out after 600
2013-11-02 19:03:53.384686 7f8c51767700  1 heartbeat_map is_healthy 'RGWProcess::m_tp thread 0x7f8bf07e8700' had timed out after 600
2013-11-02 19:03:53.384690 7f8c51767700  1 heartbeat_map is_healthy 'RGWProcess::m_tp thread 0x7f8bf2fed700' had timed out after 600
2013-11-02 19:03:53.384694 7f8c51767700  1 heartbeat_map is_healthy 'RGWProcess::m_tp thread 0x7f8bf7ff7700' had timed out after 600
2013-11-02 19:03:53.384698 7f8c51767700  1 heartbeat_map is_healthy 'RGWProcess::m_tp thread 0x7f8bf97fa700' had timed out after 600
2013-11-02 19:03:53.384701 7f8c51767700  1 heartbeat_map is_healthy 'RGWProcess::m_tp thread 0x7f8bf9ffb700' had timed out after 600
2013-11-02 19:03:53.384703 7f8c51767700  1 heartbeat_map is_healthy 'RGWProcess::m_tp thread 0x7f8bfa7fc700' had timed out after 600
2013-11-02 19:03:58.384839 7f8c51767700  1 heartbeat_map is_healthy 'RGWProcess::m_tp thread 0x7f8bccfa1700' had timed out after 600
2013-11-02 19:03:58.384856 7f8c51767700  1 heartbeat_map is_healthy 'RGWProcess::m_tp thread 0x7f8bcefa5700' had timed out after 600
2013-11-02 19:03:58.384863 7f8c51767700  1 heartbeat_map is_healthy 'RGWProcess::m_tp thread 0x7f8bd2fad700' had timed out after 600
2013-11-02 19:03:58.384870 7f8c51767700  1 heartbeat_map is_healthy 'RGWProcess::m_tp thread 0x7f8bd87b8700' had timed out after 600
2013-11-02 19:03:58.384876 7f8c51767700  1 heartbeat_map is_healthy 'RGWProcess::m_tp thread 0x7f8bdb7be700' had timed out after 600
2013-11-02 19:03:58.384885 7f8c51767700  1 heartbeat_map is_healthy 'RGWProcess::m_tp thread 0x7f8bdbfbf700' had timed out after 600
....

ubuntu@teuthology:/a/teuthology-2013-11-02_16:22:27-upgrade-next-testing-basic-plana/80721

(several other failures in the same run too)

History

#1 Updated by Ian Colle over 10 years ago

  • Assignee set to Yehuda Sadeh

#2 Updated by Yehuda Sadeh over 10 years ago

It's hard to determine what exactly happened, as we don't turn on the gateway logging anymore. But generally it looks like the gateway sent requests to the osds and responses were slow enough so that client ended timing out.

#3 Updated by Yehuda Sadeh about 10 years ago

  • Status changed from New to Can't reproduce

There's a good chance that this one is #7030, closing it for now.

Also available in: Atom PDF