Actions
Bug #8396
closedosd: message delayed in Session misdirected after split
% Done:
0%
Source:
Q/A
Tags:
Backport:
Regression:
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
logs:ubuntu@teuthology:/a/teuthology-2014-05-18_02:30:21-rados-master-testing-basic-plana/260734
ceph version 0.80-469-g991f7f1
2014-05-18T11:36:19.886 INFO:teuthology.orchestra.run.plana39.stderr:config set <var> <val> [<val>...] 2014-05-18T11:36:19.886 INFO:teuthology.orchestra.run.plana39.stderr:version 2014-05-18T11:36:19.887 INFO:teuthology.orchestra.run.plana39.stderr:2 2014-05-18T11:36:19.887 INFO:teuthology.orchestra.run.plana39.stderr:config get <var> 2014-05-18T11:36:19.887 INFO:teuthology.orchestra.run.plana39.stderr:0 2014-05-18T11:36:19.888 INFO:teuthology.orchestra.run.plana39.stderr:admin_socket: invalid command 2014-05-18T11:36:19.891 INFO:teuthology.task.thrashosds.ceph_manager:waiting on admin_socket for 3, ['dump_ops_in_flight'] 2014-05-18T11:36:20.140 INFO:teuthology.task.ceph.osd.3.plana39.stderr:2014-05-18 11:36:20.139436 7f38ca252780 -1 journal FileJournal::_open: disabling aio for non-block journal. Use journal_force_aio to force use of aio anyway 2014-05-18T11:36:24.745 INFO:teuthology.task.rados.rados.0.plana59.stderr:3885: Error: oid 432 read returned error code -2 2014-05-18T11:36:24.794 INFO:teuthology.task.rados.rados.0.plana59.stderr:./test/osd/RadosModel.h: In function 'virtual void ReadOp::_finish(TestOp::CallbackInfo*)' thread 7fcf5dffb700 time 2014-05-18 11:36:24.743525 2014-05-18T11:36:24.794 INFO:teuthology.task.rados.rados.0.plana59.stderr:./test/osd/RadosModel.h: 1045: FAILED assert(0) 2014-05-18T11:36:24.833 INFO:teuthology.task.rados.rados.0.plana59.stderr: ceph version 0.80-469-g991f7f1 (991f7f15a6e107b33a24bbef1169f21eb7fcce2c) 2014-05-18T11:36:24.833 INFO:teuthology.task.rados.rados.0.plana59.stderr: 1: (ReadOp::_finish(TestOp::CallbackInfo*)+0x12f) [0x41b03f] 2014-05-18T11:36:24.833 INFO:teuthology.task.rados.rados.0.plana59.stderr: 2: (librados::C_AioComplete::finish(int)+0x1d) [0x7fcf68cb694d] 2014-05-18T11:36:24.833 INFO:teuthology.task.rados.rados.0.plana59.stderr: 3: (Context::complete(int)+0x9) [0x7fcf68c94409] 2014-05-18T11:36:24.834 INFO:teuthology.task.rados.rados.0.plana59.stderr: 4: (Finisher::finisher_thread_entry()+0x1c0) [0x7fcf68d47490] 2014-05-18T11:36:24.834 INFO:teuthology.task.rados.rados.0.plana59.stderr: 5: (()+0x7e9a) [0x7fcf688ebe9a] 2014-05-18T11:36:24.834 INFO:teuthology.task.rados.rados.0.plana59.stderr: 6: (clone()+0x6d) [0x7fcf681023fd] 2014-05-18T11:36:24.834 INFO:teuthology.task.rados.rados.0.plana59.stderr: NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this. archive_path: /var/lib/teuthworker/archive/teuthology-2014-05-18_02:30:21-rados-master-testing-basic-plana/260734 branch: master description: rados/thrash/{clusters/fixed-2.yaml fs/xfs.yaml msgr-failures/osd-delay.yaml thrashers/pggrow.yaml workloads/readwrite.yaml} email: null job_id: '260734' kernel: kdb: true sha1: 335cb91ce950ce0e12294af671c64a468d89194c last_in_suite: false machine_type: plana name: teuthology-2014-05-18_02:30:21-rados-master-testing-basic-plana nuke-on-error: true os_type: ubuntu overrides: admin_socket: branch: master ceph: conf: global: ms inject delay max: 1 ms inject delay probability: 0.005 ms inject delay type: osd ms inject internal delays: 0.002 ms inject socket failures: 2500 mon: debug mon: 20 debug ms: 1 debug paxos: 20 osd: debug filestore: 20 debug journal: 20 debug ms: 1 debug osd: 20 osd sloppy crc: true fs: xfs log-whitelist: - slow request sha1: 991f7f15a6e107b33a24bbef1169f21eb7fcce2c ceph-deploy: branch: dev: master conf: client: log file: /var/log/ceph/ceph-$name.$pid.log mon: debug mon: 1 debug ms: 20 debug paxos: 20 osd default pool size: 2 install: ceph: sha1: 991f7f15a6e107b33a24bbef1169f21eb7fcce2c s3tests: branch: master workunit: sha1: 991f7f15a6e107b33a24bbef1169f21eb7fcce2c owner: scheduled_teuthology@teuthology roles: - - mon.a - mon.c - osd.0 - osd.1 - osd.2 - client.0 - - mon.b - mds.a - osd.3 - osd.4 - osd.5 - client.1 suite: rados tasks: - chef: null - clock.check: null - install: null - ceph: log-whitelist: - wrongly marked me down - objects unfound and apparently lost - thrashosds: chance_pgnum_grow: 2 chance_pgpnum_fix: 1 timeout: 1200 - rados: clients: - client.0 objects: 500 op_weights: delete: 10 read: 45 write: 45 ops: 4000 teuthology_branch: master verbose: true
Updated by Samuel Just almost 10 years ago
Hmm... op misdirected to the pre-split pgid (3.7 not 3.47)... I suspect the new queuing code.
Updated by Samuel Just almost 10 years ago
It's the waiting_for_pg queue. It used to get reprocessed through the op queue, but now gets directly enqueue'd into the pg queue.
Updated by Samuel Just almost 10 years ago
- Status changed from New to In Progress
- Assignee set to Samuel Just
Updated by Greg Farnum almost 10 years ago
- Status changed from 7 to Resolved
Merged into master as of commit: 25466839589813047c975e44e67e14f34e32139e
Updated by Sage Weil over 9 years ago
ubuntu@teuthology:/a/sage-2014-07-24_11:53:12-rados-master-testing-basic-plana/376677
Updated by Sage Weil over 9 years ago
- Subject changed from assert in ReadOp::_finish to osd: message delayed in Session misdirected after split
Updated by Sage Weil over 9 years ago
very likely another instance, but i didn't look closely.
ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2014-07-27_02:30:01-rados-next-testing-basic-plana/380416$ grep 'oid 419' teuthology.log 2014-07-27T18:46:45.851 INFO:teuthology.task.rados.rados.0.out:[10.214.133.30]: 419: write initial oid 419 2014-07-27T18:46:46.445 INFO:teuthology.task.rados.rados.0.out:[10.214.133.30]: update_object_version oid 419 v 65 (ObjNum 418 snap 0 seq_num 418) dirty exists 2014-07-27T18:51:55.132 INFO:teuthology.task.rados.rados.0.out:[10.214.133.30]: 1046: read oid 419 snap -1 2014-07-27T19:01:58.428 INFO:teuthology.task.rados.rados.0.out:[10.214.133.30]: 1835: read oid 419 snap -1 2014-07-27T19:03:08.596 INFO:teuthology.task.rados.rados.0.out:[10.214.133.30]: 1902: read oid 419 snap -1 2014-07-27T19:19:34.867 INFO:teuthology.task.rados.rados.0.out:[10.214.133.30]: append oid 419 current snap is 0 2014-07-27T19:19:42.713 INFO:teuthology.task.rados.rados.0.out:[10.214.133.30]: update_object_version oid 419 v 165 (ObjNum 1704 snap 0 seq_num 1704) dirty exists 2014-07-27T19:20:52.849 INFO:teuthology.task.rados.rados.0.out:[10.214.133.30]: 2826: read oid 419 snap -1 2014-07-27T19:26:51.491 INFO:teuthology.task.rados.rados.0.out:[10.214.133.30]: append oid 419 current snap is 0 2014-07-27T19:26:56.633 INFO:teuthology.task.rados.rados.0.out:[10.214.133.30]: update_object_version oid 419 v 176 (ObjNum 1904 snap 0 seq_num 1904) dirty exists 2014-07-27T19:39:14.354 INFO:teuthology.task.rados.rados.0.out:[10.214.133.30]: 3452: read oid 419 snap -1 2014-07-27T19:39:39.875 INFO:teuthology.task.rados.rados.0.err:[10.214.133.30]: 3452: Error: oid 419 read returned error code -2 ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2014-07-27_02:30:01-rados-next-testing-basic-plana/380416$ grep -A2 desc summary.yaml description: rados/thrash/{clusters/fixed-2.yaml fs/xfs.yaml msgr-failures/osd-delay.yaml thrashers/morepggrow.yaml workloads/ec-readwrite.yaml} duration: 4551.11395406723 ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2014-07-27_02:30:01-rados-next-testing-basic-plana/380416$
Updated by Sage Weil over 9 years ago
ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2014-08-01_02:32:01-rados-master-testing-basic-plana/392256
Actions