Bug #3734

closed

osd/objecter: misdirected op in librados api tests

Added by Sage Weil over 11 years ago. Updated over 11 years ago.

Status: Resolved
Priority: Urgent
Assignee: -
Category: Objecter
Target version: -
% Done: 0%
Source: Q/A
Tags:
Backport: bobtail
Regression:
Severity:
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

ubuntu@teuthology:/a/sage-2013-01-06_16:48:35-regression-next-testing-basic/36431$ zgrep WRN remote/ubuntu@plana07.front.sepia.ceph.com/log/cluster.mon.a.log.gz
2013-01-06 19:17:24.750111 osd.3 10.214.131.35:6800/7699 15 : [WRN] client.4471 10.214.133.33:0/1004914 misdirected client.4471.0:3 pg 97.db6ee63a to osd.3 in e331, client e330 pg 97.db6ee63a features 268435455
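
For reference, the fields of that warning can be pulled apart with a small script. This is only a sketch: the regular expression is a hypothetical one written against the single line quoted above, not taken from the Ceph source, and the interpretation of the number before the dot in the pg id as the pool id is the usual reading of `pool.seed`.

```python
import re

# Hypothetical pattern fitted to the one warning line quoted in this report;
# other Ceph releases may format the message differently.
MISDIRECTED_RE = re.compile(
    r"misdirected (?P<op>\S+) pg (?P<pool>\d+)\.(?P<seed>[0-9a-f]+) "
    r"to osd\.(?P<osd>\d+) in e(?P<osd_epoch>\d+), client e(?P<client_epoch>\d+)"
)


def parse_misdirected(line):
    """Return the fields of a 'misdirected' warning, or None if the line does not match."""
    m = MISDIRECTED_RE.search(line)
    if m is None:
        return None
    return {
        "op": m.group("op"),
        "pool": int(m.group("pool")),            # pool id: the part of the pg id before the dot
        "pg_seed": m.group("seed"),              # hex part of the pg id after the dot
        "osd": int(m.group("osd")),              # OSD that received the op
        "osd_epoch": int(m.group("osd_epoch")),  # map epoch on the OSD
        "client_epoch": int(m.group("client_epoch")),  # map epoch reported for the client
    }


line = (
    "2013-01-06 19:17:24.750111 osd.3 10.214.131.35:6800/7699 15 : [WRN] "
    "client.4471 10.214.133.33:0/1004914 misdirected client.4471.0:3 "
    "pg 97.db6ee63a to osd.3 in e331, client e330 pg 97.db6ee63a features 268435455"
)
info = parse_misdirected(line)
print(info)
```

The pool id (97) and the two epochs are the values that matter when comparing the warning against the OSD map history.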

the job was
ubuntu@teuthology:/a/sage-2013-01-06_16:48:35-regression-next-testing-basic/36431$ cat orig.config.yaml 
kernel:
  kdb: true
  sha1: null
nuke-on-error: true
overrides:
  ceph:
    conf:
      global:
        ms inject socket failures: 5000
    fs: xfs
    log-whitelist:
    - slow request
    sha1: ce49968938ca3636f48fe543111aa219f36914d8
  s3tests:
    branch: next
  workunit:
    sha1: ce49968938ca3636f48fe543111aa219f36914d8
roles:
- - mon.a
  - osd.0
  - osd.1
  - osd.2
- - mds.a
  - osd.3
  - osd.4
  - osd.5
- - client.0
tasks:
- chef: null
- clock: null
- ceph:
    log-whitelist:
    - wrongly marked me down
    - objects unfound and apparently lost
- thrashosds:
    timeout: 1200
- ceph-fuse: null
- workunit:
    clients:
      client.0:
      - rados/test.sh

the client-side stuff passed, but this was in the log.

Related issues 1 (0 open, 1 closed)

Related to Ceph - Bug #3632: occasional testrados failure: process_8 exited with a signal (Resolved, Josh Durgin, 12/16/2012)

Actions #1

Updated by Sage Weil over 11 years ago

  • Category changed from OSD to Objecter
  • Priority changed from Urgent to High

epoch 328:

pool 97 'foo.4830..' rep size 2 crush_ruleset 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 326 owner 0

epoch 329: the pool is gone.

the request was sent in epoch 331, after the pool had been deleted, so this is an objecter bug. judging by the pool name, this looks like rados_watch_notify.

aha, bug in scan_requests.
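
The reasoning above can be illustrated with a toy model. This is a sketch under stated assumptions, not the Objecter code and not the fix on the wip-3734 branch (which this report does not show): `pools_by_epoch`, `pool_exists`, and `rescan` are hypothetical names, and only pool 97 at epochs 328 and 329 is modeled because those are the only facts given here. The idea is that when a new map arrives, any pending op whose pool no longer exists should be dropped rather than resent.

```python
def pool_exists(pools_by_epoch, epoch, pool_id):
    """True if pool_id is in the newest recorded map at or before the given epoch."""
    known = [e for e in pools_by_epoch if e <= epoch]
    if not known:
        return False
    return pool_id in pools_by_epoch[max(known)]


def rescan(pending_ops, pools_by_epoch, epoch):
    """Split pending ops into those still sendable and those whose pool is gone."""
    keep, drop = [], []
    for op in pending_ops:
        if pool_exists(pools_by_epoch, epoch, op["pool"]):
            keep.append(op)
        else:
            drop.append(op)
    return keep, drop


# Hypothetical map history: pool 97 exists in epoch 328 and is gone in epoch 329.
pools_by_epoch = {328: {97}, 329: set()}
# Illustrative pending op targeting pool 97.
pending = [{"tid": 3, "pool": 97}]

keep_328, drop_328 = rescan(pending, pools_by_epoch, 328)
keep_330, drop_330 = rescan(pending, pools_by_epoch, 330)
print(keep_328, drop_328)
print(keep_330, drop_330)
```

In this model the op is sendable at epoch 328 and should be dropped from epoch 329 on, which is why an op for pool 97 reaching an OSD at e331 points at the client side.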

Actions #2

Updated by Sage Weil over 11 years ago

  • Status changed from 12 to Fix Under Review
  • Priority changed from High to Urgent

wip-3734

Actions #3

Updated by Sage Weil over 11 years ago

  • Status changed from Fix Under Review to Resolved