Bug #11570

closed

rgw (?) problem in upgrade:hammer-hammer-distro-basic-typica run

Added by Yuri Weinstein almost 9 years ago. Updated over 8 years ago.

Status: Resolved
Priority: Urgent
Target version: -
% Done: 0%
Source: Q/A
Tags:
Backport: hammer
Regression: No
Severity: 1 - critical
Reviewed:
Affected Versions:
ceph-qa-suite: upgrade/hammer
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Run: http://pulpito-rdu.front.sepia.ceph.com/teuthology-2015-05-07_12:33:35-upgrade:hammer-hammer-distro-basic-typica/
Job: 11454
Logs: http://typica002.front.sepia.ceph.com/teuthology-2015-05-07_12:33:35-upgrade:hammer-hammer-distro-basic-typica/11454/teuthology.log

2015-05-07T12:56:16.850 INFO:teuthology.orchestra.run.typica042.stderr:s3tests.fuzz.test.test_fuzzer.test_expand_headers ... ok
2015-05-07T12:56:16.850 INFO:teuthology.orchestra.run.typica042.stderr:
2015-05-07T12:56:16.851 INFO:teuthology.orchestra.run.typica042.stderr:----------------------------------------------------------------------
2015-05-07T12:56:16.851 INFO:teuthology.orchestra.run.typica042.stderr:Ran 285 tests in 963.300s
2015-05-07T12:56:16.851 INFO:teuthology.orchestra.run.typica042.stderr:
2015-05-07T12:56:16.851 INFO:teuthology.orchestra.run.typica042.stderr:OK (SKIP=4)
2015-05-07T12:56:16.878 INFO:tasks.s3tests:Cleaning up boto...
2015-05-07T12:56:16.878 INFO:teuthology.orchestra.run.typica042:Running: 'rm /home/ubuntu/cephtest/boto.cfg'
2015-05-07T12:56:16.955 INFO:teuthology.orchestra.run.typica042:Running: 'adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage radosgw-admin -n client.0 user rm --uid bar.client.0 --purge-data'
2015-05-07T12:56:22.641 INFO:teuthology.orchestra.run.typica042:Running: 'adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage radosgw-admin -n client.0 user rm --uid foo.client.0 --purge-data'
2015-05-07T12:56:25.619 INFO:teuthology.orchestra.run.typica042.stderr:could not remove user: unable to remove user, unable to delete user data
2015-05-07T12:56:26.361 INFO:tasks.s3tests:Removing s3-tests...
2015-05-07T12:56:26.362 INFO:teuthology.orchestra.run.typica042:Running: 'rm -rf /home/ubuntu/cephtest/s3-tests'
2015-05-07T12:56:26.466 INFO:tasks.rgw:Stopping apache...
2015-05-07T12:56:26.476 INFO:teuthology.misc:Shutting down rgw daemons...
2015-05-07T12:56:26.477 DEBUG:tasks.rgw.client.0:waiting for process to exit
2015-05-07T12:56:26.515 INFO:tasks.rgw.client.0.typica042.stdout:2015-05-07 12:56:26.518613 7fbdf0cb1840 -1 shutting down
2015-05-07T12:56:32.475 INFO:tasks.rgw.client.0:Stopped
2015-05-07T12:56:32.476 INFO:teuthology.orchestra.run.typica042:Running: 'rm -f /home/ubuntu/cephtest/rgw.opslog.client.0.sock'
2015-05-07T12:56:32.489 INFO:tasks.rgw:Removing apache config...
2015-05-07T12:56:32.489 INFO:teuthology.orchestra.run.typica042:Running: 'rm -f /home/ubuntu/cephtest/apache/apache.client.0.conf && rm -f /home/ubuntu/cephtest/apache/htdocs.client.0/rgw.fcgi'
2015-05-07T12:56:32.567 INFO:tasks.rgw:Cleaning up apache directories...
2015-05-07T12:56:32.567 INFO:teuthology.orchestra.run.typica042:Running: 'rm -rf /home/ubuntu/cephtest/apache/tmp.client.0 && rmdir /home/ubuntu/cephtest/apache/htdocs.client.0'
2015-05-07T12:56:32.670 INFO:teuthology.orchestra.run.typica042:Running: 'rmdir /home/ubuntu/cephtest/apache'
2015-05-07T12:56:32.747 ERROR:teuthology.parallel:Exception in parallel execution
Traceback (most recent call last):
  File "/home/teuthworker/src/teuthology_master/teuthology/parallel.py", line 82, in __exit__
    for result in self:
  File "/home/teuthworker/src/teuthology_master/teuthology/parallel.py", line 101, in next
    resurrect_traceback(result)
  File "/home/teuthworker/src/teuthology_master/teuthology/parallel.py", line 19, in capture_traceback
    return func(*args, **kwargs)
  File "/home/teuthworker/src/teuthology_master/teuthology/task/parallel.py", line 50, in _run_spawned
    mgr = run_tasks.run_one_task(taskname, ctx=ctx, config=config)
  File "/home/teuthworker/src/teuthology_master/teuthology/run_tasks.py", line 41, in run_one_task
    return fn(**kwargs)
  File "/home/teuthworker/src/teuthology_master/teuthology/task/sequential.py", line 48, in task
    mgr.__enter__()
  File "/usr/lib/python2.7/contextlib.py", line 17, in __enter__
    return self.gen.next()
  File "/home/teuthworker/src/ceph-qa-suite_hammer/tasks/s3tests.py", line 441, in task
    pass
  File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__
    self.gen.next()
  File "/home/teuthworker/src/teuthology_master/teuthology/contextutil.py", line 44, in nested
    if exit(*exc):
  File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__
    self.gen.next()
  File "/home/teuthworker/src/ceph-qa-suite_hammer/tasks/s3tests.py", line 228, in create_users
    '--purge-data',
  File "/home/teuthworker/src/teuthology_master/teuthology/orchestra/cluster.py", line 64, in run
    return [remote.run(**kwargs) for remote in remotes]
  File "/home/teuthworker/src/teuthology_master/teuthology/orchestra/remote.py", line 156, in run
    r = self._runner(client=self.ssh, name=self.shortname, **kwargs)
  File "/home/teuthworker/src/teuthology_master/teuthology/orchestra/run.py", line 378, in run
    r.wait()
  File "/home/teuthworker/src/teuthology_master/teuthology/orchestra/run.py", line 114, in wait
    label=self.label)
CommandFailedError: Command failed on typica042 with status 22: 'adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage radosgw-admin -n client.0 user rm --uid foo.client.0 --purge-data'
2015-05-07T12:56:32.752 ERROR:teuthology.run_tasks:Saw exception from tasks.
Traceback (most recent call last):
  File "/home/teuthworker/src/teuthology_master/teuthology/run_tasks.py", line 53, in run_tasks
    manager = run_one_task(taskname, ctx=ctx, config=config)
  File "/home/teuthworker/src/teuthology_master/teuthology/run_tasks.py", line 41, in run_one_task
    return fn(**kwargs)
  File "/home/teuthworker/src/teuthology_master/teuthology/task/parallel.py", line 43, in task
    p.spawn(_run_spawned, ctx, confg, taskname)
  File "/home/teuthworker/src/teuthology_master/teuthology/parallel.py", line 89, in __exit__
    raise
CommandFailedError: Command failed on typica042 with status 22: 'adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage radosgw-admin -n client.0 user rm --uid foo.client.0 --purge-data'
2015-05-07T12:56:32.753 DEBUG:teuthology.run_tasks:Unwinding manager ceph

Related issues 1 (0 open, 1 closed)

Related to rgw - Bug #11442: Objects with names starting with underscore become inaccessible after upgrading to Hammer (Resolved, Yehuda Sadeh, 04/21/2015)

Actions #2

Updated by Sage Weil almost 9 years ago

  • Project changed from Ceph to rgw
Actions #3

Updated by Yuri Weinstein almost 9 years ago

  • Backport set to hammer
Actions #6

Updated by Yehuda Sadeh almost 9 years ago

One suspect is the broken objects whose names start with an underscore on older hammer. In a hammer-to-hammer upgrade, trying to remove the user will fail to read those objects, so the user removal fails.

Actions #7

Updated by Yuri Weinstein almost 9 years ago

  • Severity changed from 3 - minor to 1 - critical
Actions #8

Updated by Loïc Dachary almost 9 years ago

  • Status changed from New to 12

Yehuda: is this a blocker for the hammer release? If it was already in v0.94.1 it would not be a blocker, right?

Actions #9

Updated by Sage Weil almost 9 years ago

I don't think it would be a blocker, but should we try to run the repair tool as part of this test suite?

Actions #10

Updated by Yuri Weinstein almost 9 years ago

  • Assignee set to Yehuda Sadeh

Yehuda, can you provide an example of how to add the repair tool to this test suite?

Actions #11

Updated by Yehuda Sadeh almost 9 years ago

I updated the release notes. Basically, the following needs to be run for each affected bucket (without --fix it is a dry run):

   $ radosgw-admin bucket check --check-head-obj-locator \
                                --bucket=<bucket> [--fix]
Actions #12

Updated by Loïc Dachary almost 9 years ago

Yehuda added this to the v0.94.2 release notes, so it's not a blocker: https://github.com/ceph/ceph/pull/4789

Actions #13

Updated by Sage Weil almost 9 years ago

For the upgrade test to work, the above needs to be enclosed in a loop over all buckets. I assume 'radosgw-admin bucket list' would provide them, but it spits out JSON...
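
A minimal sketch of such a loop, in Python since the qa tasks are Python. It assumes that 'radosgw-admin bucket list' prints a JSON array of bucket names; the function names and the injectable `run` parameter are hypothetical and not part of teuthology or ceph-qa-suite. Only the 'bucket check --check-head-obj-locator' invocation is taken from the comment above.

```python
import json
import subprocess


def list_buckets(run=subprocess.check_output):
    """Return bucket names, assuming the command prints a JSON array of strings."""
    out = run(["radosgw-admin", "bucket", "list"])
    if isinstance(out, bytes):
        out = out.decode("utf-8")
    return json.loads(out)


def check_all_buckets(fix=False, run=subprocess.check_output):
    """Run the head-object locator check on every bucket.

    Without fix=True this is a dry run, matching the behaviour
    described for the command when --fix is omitted.
    """
    results = {}
    for bucket in list_buckets(run):
        cmd = [
            "radosgw-admin", "bucket", "check",
            "--check-head-obj-locator",
            "--bucket=" + bucket,
        ]
        if fix:
            cmd.append("--fix")
        # Keep the raw output per bucket so the caller can log or inspect it.
        results[bucket] = run(cmd)
    return results
```

Calling check_all_buckets() first and check_all_buckets(fix=True) afterwards would give a dry run followed by the actual repair. The `run` parameter exists only so the loop can be exercised without a cluster.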

Actions #14

Updated by Sage Weil almost 9 years ago

  • Assignee changed from Yehuda Sadeh to Yuri Weinstein

Split this into an -older and a -newer group, and don't test rgw on the -older one. We did a similar thing in firefly (or dumpling? I forget).

Actions #15

Updated by Yuri Weinstein almost 9 years ago

  • Status changed from 12 to 7
  • Release set to hammer

PR https://github.com/ceph/ceph-qa-suite/pull/465
The number of jobs generated by the new suite will be significantly higher: 189 vs. 63 for the old one!

Surprisingly, the unchanged suite passed: http://pulpito.ceph.redhat.com/teuthology-2015-06-17_16:05:02-upgrade:hammer-hammer-distro-basic-magna/ ?!

Test run off the wip branch: http://pulpito.ceph.com/teuthology-2015-06-17_15:48:23-upgrade:hammer-hammer---basic-vps/

Blocked by #12101 and #11966 at the moment.

Actions #17

Updated by Yuri Weinstein over 8 years ago

  • Status changed from 7 to Resolved