Bug #9616

upgrade test restarts rgw, test gets 500

Added by Yuri Weinstein over 9 years ago. Updated over 9 years ago.

Status: Resolved
Priority: Immediate
Target version: -
% Done: 0%
Source: Q/A
Tags:
Backport:
Regression:
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-09-27_18:45:01-upgrade:dumpling-dumpling-distro-basic-vps/515469/

2014-09-27T21:15:05.172 INFO:teuthology.orchestra.run.vpm194.stderr:s3tests.fuzz.test.test_fuzzer.test_duplicate_header ... ok
2014-09-27T21:15:05.174 INFO:teuthology.orchestra.run.vpm194.stderr:s3tests.fuzz.test.test_fuzzer.test_expand_headers ... ok
2014-09-27T21:15:05.174 INFO:teuthology.orchestra.run.vpm194.stderr:
2014-09-27T21:15:05.174 INFO:teuthology.orchestra.run.vpm194.stderr:======================================================================
2014-09-27T21:15:05.175 INFO:teuthology.orchestra.run.vpm194.stderr:FAIL: s3tests.functional.test_s3.test_post_object_set_key_from_filename
2014-09-27T21:15:05.175 INFO:teuthology.orchestra.run.vpm194.stderr:----------------------------------------------------------------------
2014-09-27T21:15:05.175 INFO:teuthology.orchestra.run.vpm194.stderr:Traceback (most recent call last):
2014-09-27T21:15:05.175 INFO:teuthology.orchestra.run.vpm194.stderr:  File "/home/ubuntu/cephtest/s3-tests/virtualenv/local/lib/python2.7/site-packages/nose/case.py", line 197, in runTest
2014-09-27T21:15:05.175 INFO:teuthology.orchestra.run.vpm194.stderr:    self.test(*self.arg)
2014-09-27T21:15:05.175 INFO:teuthology.orchestra.run.vpm194.stderr:  File "/home/ubuntu/cephtest/s3-tests/s3tests/functional/test_s3.py", line 1217, in test_post_object_set_key_from_filename
2014-09-27T21:15:05.176 INFO:teuthology.orchestra.run.vpm194.stderr:    eq(r.status_code, 204)
2014-09-27T21:15:05.176 INFO:teuthology.orchestra.run.vpm194.stderr:AssertionError: 500 != 204
2014-09-27T21:15:05.176 INFO:teuthology.orchestra.run.vpm194.stderr:-------------------- >> begin captured logging << --------------------
2014-09-27T21:15:05.176 INFO:teuthology.orchestra.run.vpm194.stderr:boto: DEBUG: path=/test-client.0-2q5qz2k93rwxct7-110/
2014-09-27T21:15:05.177 INFO:teuthology.orchestra.run.vpm194.stderr:boto: DEBUG: auth_path=/test-client.0-2q5qz2k93rwxct7-110/
2014-09-27T21:15:05.177 INFO:teuthology.orchestra.run.vpm194.stderr:boto: DEBUG: Method: PUT
2014-09-27T21:15:05.177 INFO:teuthology.orchestra.run.vpm194.stderr:boto: DEBUG: Path: /test-client.0-2q5qz2k93rwxct7-110/
2014-09-27T21:15:05.178 INFO:teuthology.orchestra.run.vpm194.stderr:boto: DEBUG: Data:
2014-09-27T21:15:05.178 INFO:teuthology.orchestra.run.vpm194.stderr:boto: DEBUG: Headers: {}
2014-09-27T21:15:05.178 INFO:teuthology.orchestra.run.vpm194.stderr:boto: DEBUG: Host: vpm194.front.sepia.ceph.com:7280
2014-09-27T21:15:05.179 INFO:teuthology.orchestra.run.vpm194.stderr:boto: DEBUG: Port: 7280
2014-09-27T21:15:05.179 INFO:teuthology.orchestra.run.vpm194.stderr:boto: DEBUG: Params: {}
2014-09-27T21:15:05.179 INFO:teuthology.orchestra.run.vpm194.stderr:boto: DEBUG: Token: None
2014-09-27T21:15:05.180 INFO:teuthology.orchestra.run.vpm194.stderr:boto: DEBUG: StringToSign:
2014-09-27T21:15:05.180 INFO:teuthology.orchestra.run.vpm194.stderr:PUT
2014-09-27T21:15:05.180 INFO:teuthology.orchestra.run.vpm194.stderr:
2014-09-27T21:15:05.181 INFO:teuthology.orchestra.run.vpm194.stderr:
2014-09-27T21:15:05.181 INFO:teuthology.orchestra.run.vpm194.stderr:Sun, 28 Sep 2014 04:09:23 GMT
2014-09-27T21:15:05.181 INFO:teuthology.orchestra.run.vpm194.stderr:/test-client.0-2q5qz2k93rwxct7-110/
2014-09-27T21:15:05.182 INFO:teuthology.orchestra.run.vpm194.stderr:boto: DEBUG: Signature:
2014-09-27T21:15:05.182 INFO:teuthology.orchestra.run.vpm194.stderr:AWS OFCMCJYFSXAPIJMFDIWW:Y5KOteL5UMT54x1KUosI6dZai2Q=
2014-09-27T21:15:05.182 INFO:teuthology.orchestra.run.vpm194.stderr:boto: DEBUG: Final headers: {'Date': 'Sun, 28 Sep 2014 04:09:23 GMT', 'Content-Length': 0, 'Authorization': u'AWS OFCMCJYFSXAPIJMFDIWW:Y5KOteL5UMT54x1KUosI6dZai2Q=', 'User-Agent': 'Boto/2.32.1 Python/2.7.3 Linux/3.13.0-36-generic'}
2014-09-27T21:15:05.182 INFO:teuthology.orchestra.run.vpm194.stderr:boto: DEBUG: Response headers: [('date', 'Sun, 28 Sep 2014 04:09:23 GMT'), ('transfer-encoding', 'chunked'), ('content-type', 'application/xml'), ('server', 'Apache/2.2.22 (Ubuntu) mod_fastcgi/mod_fastcgi-SNAP-0910052141')]
2014-09-27T21:15:05.183 INFO:teuthology.orchestra.run.vpm194.stderr:requests.packages.urllib3.connectionpool: INFO: Starting new HTTP connection (1): vpm194.front.sepia.ceph.com
2014-09-27T21:15:05.183 INFO:teuthology.orchestra.run.vpm194.stderr:requests.packages.urllib3.connectionpool: DEBUG: "POST /test-client.0-2q5qz2k93rwxct7-110 HTTP/1.1" 500 538
2014-09-27T21:15:05.183 INFO:teuthology.orchestra.run.vpm194.stderr:--------------------- >> end captured logging << ---------------------
archive_path: /var/lib/teuthworker/archive/teuthology-2014-09-27_18:45:01-upgrade:dumpling-dumpling-distro-basic-vps/515469
branch: dumpling
description: upgrade:dumpling/rgw/{0-cluster/start.yaml 1-dumpling-install/v0.67.10.yaml
  2-workload/testrgw.yaml 3-upgrade-sequence/upgrade-osd-mon-mds.yaml 4-final/swift.yaml}
email: ceph-qa@ceph.com
job_id: '515469'
kernel: &id001
  kdb: true
  sha1: distro
last_in_suite: false
machine_type: vps
name: teuthology-2014-09-27_18:45:01-upgrade:dumpling-dumpling-distro-basic-vps
nuke-on-error: true
os_type: ubuntu
overrides:
  admin_socket:
    branch: dumpling
  ceph:
    conf:
      global:
        osd heartbeat grace: 100
      mon:
        debug mon: 20
        debug ms: 1
        debug paxos: 20
      osd:
        debug filestore: 20
        debug journal: 20
        debug ms: 1
        debug osd: 20
    fs: xfs
    log-whitelist:
    - slow request
    - scrub
    sha1: c70331db13bdbf1f967ece48cdbec28b97c3d19c
  ceph-deploy:
    branch:
      dev: dumpling
    conf:
      client:
        log file: /var/log/ceph/ceph-$name.$pid.log
      mon:
        debug mon: 1
        debug ms: 20
        debug paxos: 20
        osd default pool size: 2
  install:
    ceph:
      sha1: c70331db13bdbf1f967ece48cdbec28b97c3d19c
  rgw:
    default_idle_timeout: 1200
  s3tests:
    branch: dumpling
    idle_timeout: 1200
  workunit:
    sha1: c70331db13bdbf1f967ece48cdbec28b97c3d19c
owner: scheduled_teuthology@teuthology
priority: 1000
roles:
- - mon.a
  - mds.a
  - osd.0
  - osd.1
  - osd.2
- - mon.b
  - mon.c
  - osd.3
  - osd.4
  - osd.5
  - client.0
suite: upgrade:dumpling
suite_branch: dumpling
suite_path: /var/lib/teuthworker/src/ceph-qa-suite_dumpling
targets:
  ubuntu@vpm055.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDeSLSKAYC5i7XlAR77bPVsJebdcVckMKSgoyWVnxVDRK9AZQDVh+VvcEohbtFY/P0GBoixTwWmfCQN1nUdZcfa3cu+wdwXfE3PwYjW64n/Gn46GQ3tlh285MS9udphskSbMKtvQ2+Czq68aF32qkY/3Owhidpxb+H1bM2KNRNvw2S3s+dgtECV2jWU5AWklaqLLZy0gJy8VYVSoxXXZtOS4AoDF5vga6LnpUjeHWav/dunLnuala93gkuIGNj+XMzPyod7u37jvdPjO29p0Grm6KF3cx2QQkqGrmCLz1dfLdZpkOyxkkpaomfIRI9AqxsOdMfpTewq/mtLA2o6Bz4f
  ubuntu@vpm194.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDkJfBXospmcmgJ9TGYAPhzvlQ0bmrw/Hoo/HTCdC1SB1VH5+Oul9Wt5i3hDSsHoy211a2V0d9ylX99+Ndzedp8LwYvsH0cTYLxmTjIBjESA+PZo34WU1d9L9LLBM/lZpXWJ8OcM3ynGS7U1neriG1jus7jrW8EvHnr2/+RmKGF+PY3SZeg6Cb4NZVhsvOtG3RG9W3FW855w05Y76ZRhO8Y3uTgzRgFB5+2fuvUbEJhwhFHKop33l+xlGXAk3hq13z9OLbEeZpQtDa7OR3tMjnGTzBU8fpFWJwx+Cq6mHz54kvAzwfdDbmLKx1pqGC7TuG42VU+biiFRaZ6Nw3q1TWD
tasks:
- internal.lock_machines:
  - 2
  - vps
- internal.save_config: null
- internal.check_lock: null
- internal.connect: null
- internal.push_inventory: null
- internal.serialize_remote_roles: null
- internal.check_conflict: null
- internal.check_ceph_data: null
- internal.vm_setup: null
- kernel: *id001
- internal.base: null
- internal.archive: null
- internal.coredump: null
- internal.sudo: null
- internal.syslog: null
- internal.timer: null
- chef: null
- clock.check: null
- install:
    tag: v0.67.10
- ceph: null
- install.upgrade:
    all:
      branch: dumpling
- rgw:
  - client.0
- parallel:
  - workload
  - upgrade-sequence
- swift:
    client.0:
      rgw_server: client.0
teuthology_branch: master
tube: vps
upgrade-sequence:
  sequential:
  - ceph.restart:
    - osd.0
  - sleep:
      duration: 30
  - ceph.restart:
    - osd.1
  - sleep:
      duration: 30
  - ceph.restart:
    - osd.2
  - sleep:
      duration: 30
  - ceph.restart:
    - rgw.client.0
  - sleep:
      duration: 30
  - ceph.restart:
    - osd.3
  - sleep:
      duration: 30
  - ceph.restart:
    - osd.4
  - sleep:
      duration: 30
  - ceph.restart:
    - osd.5
  - sleep:
      duration: 60
  - ceph.restart:
    - mon.a
  - sleep:
      duration: 60
  - ceph.restart:
    - mon.b
  - sleep:
      duration: 60
  - ceph.restart:
    - mon.c
  - sleep:
      duration: 60
  - ceph.restart:
    - mds.a
  - sleep:
      duration: 60
verbose: true
worker_log: /var/lib/teuthworker/archive/worker_logs/worker.vps.3030
workload:
  sequential:
  - s3tests:
      client.0:
        rgw_server: client.0
description: upgrade:dumpling/rgw/{0-cluster/start.yaml 1-dumpling-install/v0.67.10.yaml
  2-workload/testrgw.yaml 3-upgrade-sequence/upgrade-osd-mon-mds.yaml 4-final/swift.yaml}
duration: 1316.7317011356354
failure_reason: 'Command failed on vpm194 with status 1: "S3TEST_CONF=/home/ubuntu/cephtest/archive/s3-tests.client.0.conf
  BOTO_CONFIG=/home/ubuntu/cephtest/boto.cfg /home/ubuntu/cephtest/s3-tests/virtualenv/bin/nosetests
  -w /home/ubuntu/cephtest/s3-tests -v -a ''!fails_on_rgw''"'
flavor: basic
owner: scheduled_teuthology@teuthology
success: false
Actions #1

Updated by Sage Weil over 9 years ago

  • Project changed from Ceph to rgw
  • Priority changed from Normal to High
Actions #2

Updated by Sage Weil over 9 years ago

  • Priority changed from High to Urgent
Actions #3

Updated by Yehuda Sadeh over 9 years ago

Not sure what the test is doing exactly, but the 500 is because the rgw process was restarted in the middle of the test. There were a few 500s coming from apache at that point, which is expected.

Actions #4

Updated by Sage Weil over 9 years ago

Yehuda Sadeh wrote:

Not sure what the test is doing exactly, but the 500 is because the rgw process was restarted in the middle of the test. There were a few 500s coming from apache at that point, which is expected.

Seems like the test is sort of flawed. :/ If we are going to test rgw restart, then we should wait for the tests to finish, then restart, then rerun?

I wonder if there is a way to make apache retry the fastcgi failure instead of generating a 500? I assume this wouldn't happen with civetweb, because the client would see the disconnect and... retry?
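A client-side version of the retry idea can be sketched with requests/urllib3. This is illustrative only: the actual s3-tests drive boto, and `make_retrying_session` is a hypothetical helper, not anything in the suite.

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

def make_retrying_session(total=3, backoff=0.5):
    """Session that retries transient 500/502/503 responses, e.g. the
    errors Apache emits while the radosgw backend is restarting."""
    kwargs = dict(
        total=total,
        backoff_factor=backoff,
        status_forcelist=[500, 502, 503],
    )
    # urllib3 will not retry POST on a status code unless told to; the
    # parameter was renamed from method_whitelist to allowed_methods in
    # urllib3 1.26, so try the new name first.
    methods = frozenset(["GET", "HEAD", "PUT", "POST", "DELETE"])
    try:
        retry = Retry(allowed_methods=methods, **kwargs)
    except TypeError:  # older urllib3
        retry = Retry(method_whitelist=methods, **kwargs)
    adapter = HTTPAdapter(max_retries=retry)
    session = requests.Session()
    session.mount("http://", adapter)
    session.mount("https://", adapter)
    return session
```

Even so, retrying only masks restart-window 500s for operations that are safe to repeat; it does nothing for requests cut short mid-flight.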

Actions #5

Updated by Sage Weil over 9 years ago

  • Subject changed from "FAIL: s3tests.functional.test_s3.test_post_object_set_key_from_filename" in upgrade:dumpling-dumpling-distro-basic-vps run to upgrade test restarts rgw, test gets 500
Actions #6

Updated by Yehuda Sadeh over 9 years ago

Even if we do somehow get it to retry (might require changes to the fastcgi module), we'll still get 500s from requests that are getting cut short by shutting down the radosgw process. And this might be problematic with civetweb too.

Actions #7

Updated by Sage Weil over 9 years ago

  • Assignee set to Yuri Weinstein
Actions #8

Updated by Sage Weil over 9 years ago

  • Priority changed from Urgent to Immediate
Actions #9

Updated by Tamilarasi muthamizhan over 9 years ago

  • Status changed from New to 7

fixed

commit 46caf2d461a00f47529a8e5bd090b82826d35690
Author: tamil <tamil.muthamizhan@inktank.com>
Date:   Tue Oct 21 14:57:53 2014 -0700

    fixes issue: #9616 and further optimization of the upgrade:dumpling suites

    Signed-off-by: tamil <tamil.muthamizhan@inktank.com>
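The shape of the fix is presumably to stop racing the workload against the restarts, as suggested in #4. A hypothetical sketch against the job YAML above (the real change lives in the ceph-qa-suite commit referenced here):

```yaml
# Instead of racing the workload against the restarts:
#   - parallel:
#     - workload
#     - upgrade-sequence
# run them back to back, so no test is in flight when rgw restarts:
tasks:
- sequential:
  - workload          # let s3tests/swift finish first
  - upgrade-sequence  # then restart osds, rgw, mons, mds
```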

Actions #10

Updated by Sage Weil over 9 years ago

  • Status changed from 7 to Resolved