Bug #2508
osdc/ObjectCacher.cc:761: void ObjectCacher::bh_write_commit(int64_t, sobject_t, loff_t, uint64_t, tid_t, int): Assertion `ob->last_commit_tid < tid' failed.
Description
Hi,
we're seeing random KVM VPS crashes with the following error:
kvm: osdc/ObjectCacher.cc:761: void ObjectCacher::bh_write_commit(int64_t, sobject_t, loff_t, uint64_t, tid_t, int): Assertion `ob->last_commit_tid < tid' failed.
Seems to be similar to #1490 which is closed.
Ceph Version: ceph version 0.47.2-4-ge868b44 (e868b44b3959a71c731f4ec9ff9773dead6dfcb5)
KVM/QEMU Version: QEMU emulator version 1.0 (qemu-kvm-1.0+dfsg-11, Debian), Copyright (c) 2003-2008 Fabrice Bellard
Kernel: 3.2.0-2-amd64 (Debian Wheezy Kernel)
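For readers unfamiliar with the assertion, here is a minimal sketch of the invariant it enforces (a hypothetical simplification for illustration, not the actual Ceph source): each cached object remembers the tid of its last committed write, and write-commit acks must arrive with strictly increasing tids, so a duplicated or reordered ack aborts the process.

    // Illustrative sketch only -- placeholder types, not osdc/ObjectCacher.cc.
    #include <cassert>
    #include <cstdint>

    struct ObjectSketch {
      uint64_t last_commit_tid = 0;   // tid of the last write commit applied to this object
    };

    void bh_write_commit_sketch(ObjectSketch *ob, uint64_t tid) {
      // This is the invariant the crash reports: commits must arrive in
      // strictly increasing tid order per object. An older or duplicate
      // commit ack trips the assert and kills the process (here, KVM).
      assert(ob->last_commit_tid < tid);
      ob->last_commit_tid = tid;
      // ...mark the affected buffers clean and finish any waiters...
    }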
Associated revisions
ReplicatedPG: fix replay op ordering
After a client reconnect, the client replays outstanding ops. The
OSD then immediately responds with success if the op has already
committed (version < ReplicatedPG::get_first_in_progress).
Otherwise, we stick it in waiting_for_ondisk to be replied to when
eval_repop concludes that waitfor_disk is empty.
Fixes #2508
Signed-off-by: Samuel Just <sam.just@inktank.com>
ReplicatedPG: fix replay op ordering
After a client reconnect, the client replays outstanding ops. The
OSD then immediately responds with success if the op has already
committed (version < ReplicatedPG::get_first_in_progress).
Otherwise, we stick it in waiting_for_ondisk to be replied to when
eval_repop concludes that waitfor_disk is empty.
Fixes #2508
Signed-off-by: Samuel Just <sam.just@inktank.com>
Conflicts:
src/osd/ReplicatedPG.cc
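A rough sketch of the replay handling the commit message describes (hedged illustration with placeholder types and names; get_first_in_progress and waiting_for_ondisk are taken from the message above, the rest is not the actual ReplicatedPG code):

    // Sketch of the ordering fix: replayed ops older than the first
    // in-progress version are acked immediately; newer ones wait for commit.
    #include <cstdint>
    #include <list>
    #include <map>

    struct Op {};                      // stand-in for a replayed client op
    using Version = uint64_t;          // stand-in for eversion_t

    struct ReplayHandlerSketch {
      Version first_in_progress = 0;   // cf. ReplicatedPG::get_first_in_progress()
      std::map<Version, std::list<Op*>> waiting_for_ondisk;

      void reply_success(Op *op) { /* send the ondisk ack back to the client */ }

      void handle_replayed_op(Op *op, Version replay_version) {
        if (replay_version < first_in_progress) {
          // Already committed before the reconnect: reply with success right away.
          reply_success(op);
        } else {
          // Still in flight: queue it so eval_repop replies once waitfor_disk drains.
          waiting_for_ondisk[replay_version].push_back(op);
        }
      }
    };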
History
#1 Updated by Josh Durgin over 11 years ago
- Priority changed from Normal to High
Hi Simon,
If this is at all reproducible, could you try setting osd_min_pg_log_entries higher on all your osds, say 5000, and see if it still happens? We might want to raise the default from 1000.
If that doesn't fix it, we'd probably need logs of it happening on both the client and osd. On the client side, 'debug ms = 1' and 'debug objecter = 10'. On the osd side (unfortunately there's no way to know which osd is required, so it'd be needed on all of them) 'debug ms = 1' and 'debug osd = 20'.
Thanks!
Josh
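For reference, those suggestions expressed as a ceph.conf fragment (a sketch; the section placement is assumed from which daemon each option applies to):

    [client]
            debug ms = 1
            debug objecter = 10
    [osd]
            osd min pg log entries = 5000
            debug ms = 1
            debug osd = 20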
#2 Updated by Simon Frerichs over 11 years ago
Hi Josh,
I've increased osd_min_pg_log_entries to 5000. Let's see if it fixes the problem.
Simon
#3 Updated by Sage Weil about 11 years ago
- Subject changed from Random KVM Crash - kvm: osdc/ObjectCacher.cc:761: void ObjectCacher::bh_write_commit(int64_t, sobject_t, loff_t, uint64_t, tid_t, int): Assertion `ob->last_commit_tid < tid' failed. to osdc/ObjectCacher.cc:761: void ObjectCacher::bh_write_commit(int64_t, sobject_t, loff_t, uint64_t, tid_t, int): Assertion `ob->last_commit_tid < tid' failed.
- Category set to msgr
- Status changed from New to 12
- Priority changed from High to Urgent
I just reproduced this several times on master using socket failure injection. This is probably socket errors breaking the messenger's ordering constraints.
The good news is that it appears to be easy to reproduce, and is probably not related to the OSD's dup op detection window!
#4 Updated by Sage Weil about 11 years ago
- Target version set to v0.50
#5 Updated by Sage Weil about 11 years ago
- Category changed from msgr to OSD
- Assignee set to Samuel Just
easy to reproduce with
interactive-on-error: true
overrides:
  ceph:
    conf:
      global:
        ms inject socket failures: 5000
      client:
        debug ms: 20
        debug objecter: 20
        debug objectcacher: 20
      osd:
        debug ms: 1
        debug osd: 20
        debug filestore: 20
    fs: btrfs
    log-whitelist:
    - slow request
    branch: master
roles:
- - mon.a
  - mon.c
  - osd.0
  - osd.1
  - osd.2
- - mon.b
  - mds.a
  - osd.3
  - osd.4
  - osd.5
- - client.0
tasks:
- ceph: null
- ceph-fuse: null
- workunit:
    clients:
      all:
      - suites/blogbench.sh
#6 Updated by Samuel Just about 11 years ago
- Status changed from 12 to Resolved
645ac87b84ae348483c7d52b1968af5f44378dea
#7 Updated by Tamilarasi muthamizhan almost 11 years ago
- Status changed from Resolved to Closed
- Source changed from Community (user) to Q/A
Recent logs: ubuntu@teuthology: /a/teuthology-2012-09-24_19:00:29-regression-master-testing-gcov/29465
2012-09-25T19:46:18.284 INFO:teuthology.orchestra.run.err:osdc/ObjectCacher.cc: In function 'void ObjectCacher::bh_write_commit(int64_t, sobject_t, loff_t, uint64_t, tid_t, int)' thread 7fd64d7fa700 time 2012-09-25 19:46:18.262015
2012-09-25T19:46:18.284 INFO:teuthology.orchestra.run.err:osdc/ObjectCacher.cc: 764: FAILED assert(ob->last_commit_tid < tid)
2012-09-25T19:46:18.298 INFO:teuthology.orchestra.run.err: ceph version 0.51-702-g3ca3cd4 (commit:3ca3cd4c6cd6b96ed4b207196ea3a3f3b95bc1fe)
2012-09-25T19:46:18.299 INFO:teuthology.orchestra.run.err: 1: (ObjectCacher::bh_write_commit(long, sobject_t, long, unsigned long, unsigned long, int)+0xfeb) [0x7fd658c21cbb]
2012-09-25T19:46:18.299 INFO:teuthology.orchestra.run.err: 2: (ObjectCacher::C_WriteCommit::finish(int)+0x9b) [0x7fd658c2c92b]
2012-09-25T19:46:18.299 INFO:teuthology.orchestra.run.err: 3: (Context::complete(int)+0x12) [0x7fd658bc9842]
2012-09-25T19:46:18.299 INFO:teuthology.orchestra.run.err: 4: (librbd::C_Request::finish(int)+0xa1) [0x7fd658c08981]
2012-09-25T19:46:18.299 INFO:teuthology.orchestra.run.err: 5: (Context::complete(int)+0x12) [0x7fd658bc9842]
2012-09-25T19:46:18.299 INFO:teuthology.orchestra.run.err: 6: (librbd::rados_req_cb(void*, void*)+0x5c) [0x7fd658be243c]
2012-09-25T19:46:18.299 INFO:teuthology.orchestra.run.err: 7: (librados::C_AioSafe::finish(int)+0x20) [0x7fd657d73970]
2012-09-25T19:46:18.300 INFO:teuthology.orchestra.run.err: 8: (Finisher::finisher_thread_entry()+0x340) [0x7fd657df1ae0]
2012-09-25T19:46:18.300 INFO:teuthology.orchestra.run.err: 9: (Finisher::FinisherThread::entry()+0x15) [0x7fd657d5cba5]
2012-09-25T19:46:18.300 INFO:teuthology.orchestra.run.err: 10: (Thread::_entry_func(void*)+0x12) [0x7fd657f70b62]
2012-09-25T19:46:18.300 INFO:teuthology.orchestra.run.err: 11: (()+0x7e9a) [0x7fd657360e9a]
2012-09-25T19:46:18.300 INFO:teuthology.orchestra.run.err: 12: (clone()+0x6d) [0x7fd6576684bd]
2012-09-25T19:46:18.300 INFO:teuthology.orchestra.run.err: NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
2012-09-25T19:46:18.301 INFO:teuthology.orchestra.run.err:terminate called after throwing an instance of 'ceph::FailedAssertion'

ubuntu@teuthology:/a/teuthology-2012-09-24_19:00:29-regression-master-testing-gcov/29465$ cat config.yaml
kernel: &id001
  kdb: true
  sha1: 1c17c1aacab817f4621baeee15707ee992c6410a
nuke-on-error: true
overrides:
  ceph:
    conf:
      client:
        rbd cache: true
      global:
        ms inject socket failures: 5000
    coverage: true
    fs: xfs
    log-whitelist:
    - slow request
    sha1: 3ca3cd4c6cd6b96ed4b207196ea3a3f3b95bc1fe
  s3tests:
    branch: master
  workunit:
    sha1: 3ca3cd4c6cd6b96ed4b207196ea3a3f3b95bc1fe
roles:
- - mon.a
  - osd.0
  - osd.1
  - osd.2
- - mds.a
  - osd.3
  - osd.4
  - osd.5
- - client.0
targets:
  ubuntu@plana19.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC0IExVUuQEnwBzUbN6+jgEb/qLLsSDFLAe0OPv+R3Q3uUZn+4QjB+FQ3sMMEwaGpEhUfbpWn3xZbCmi4NngTsgPDjImYJPMeMecaxvVXqAJt4IM6gdBN0415lrKXtbXaGBJCmeFFDB+xN+JN6Qhk8DWN62DnJB8MS5vKn9u0S3HvtGzY/QrnutT3AX9I4097isbTepLFRC4n2CoAC9srXaxAprFgLgOIYHm386B3W7yK3yfIImWofvZxYpPCsr/7ws8DSgdRX9eDucGQfY2WcYKCEIgZclGPBhExm7q/ahHUqYKrzWY3RD93AuXJVgOJk5Yp8C1Ryx8vyJkAxBW4nb
  ubuntu@plana35.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCtCEUU7ZjJxMBOYq9dlPZppwlahU15OIq1a/xqI0YIpAF7ZLf1eis04s6Kzvl45ooKP2J6DW598Lr9zTtFb6j9RoAkgIbIOkWZmLQjeZesi//7B8uXBw7niNzU1iBI83K8dPL2cwTsgwXXtoaD03pW9HCv4ADhH12/CsfoqNPfql7ze4JlI07seHmUqkBuBFxLVgdZDXb+Cjdg2sQnMS5qZo0OGBp2yfBla01Pk8V1sHGg39miF0EDJZyIOL2ziZOcm1f02WVkUAD8MMOtfQbf33DT/ya4Th20YtRMQxNLkUYG6rxcjutzILFKDr7ugbvOwtBhVS7qplqLWAwHdi0p
  ubuntu@plana36.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCe7CpJbnd7W2/n42TTTjDArnVkyZfbRANfmkdgfDM+6AYg6qd9wUhes6LP++eMvhuM96Sz5W4380o8OME0cguG1LkkADbm8pQbPAPZwF1Fj28YxgZKpc2PTPsF+sjOujC+AaXaQ82ffSkLL0oElKZgAiFEGCytSdUNFHZxjztDIOoWlt7kylQCy4sJCEbND8JFwFfeGyyePvMl3CNdbnR7H5GuyIx70iglLBO/XFwArjeOUZ/FboRZWOBivpZQf9IMy8k2rQetzxTyugd7cTVdq1G5N5NeHpbQfv286G2oDaZj1HT252jDF04UP083zMxH1W9gmOoUKzIhl+iXaNLZ
tasks:
- internal.lock_machines: 3
- internal.save_config: null
- internal.check_lock: null
- internal.connect: null
- internal.check_conflict: null
- kernel: *id001
- internal.base: null
- internal.archive: null
- internal.coredump: null
- internal.syslog: null
- internal.timer: null
- chef: null
- clock: null
- ceph:
    log-whitelist:
    - wrongly marked me down
    - objects unfound and apparently lost
- thrashosds:
    timeout: 1200
- rbd_fsx:
    clients:
    - client.0
    ops: 20000

ubuntu@teuthology:/a/teuthology-2012-09-24_19:00:29-regression-master-testing-gcov/29465$ cat summary.yaml
ceph-sha1: 3ca3cd4c6cd6b96ed4b207196ea3a3f3b95bc1fe
client.0-kernel-sha1: 1c17c1aacab817f4621baeee15707ee992c6410a
description: collection:rbd-thrash clusters:6-osd-3-machine.yaml fs:xfs.yaml msgr-failures:few.yaml thrashers:default.yaml workloads:rbd_fsx_cache_writeback.yaml
duration: 3020.0398390293121
failure_reason: 'Command crashed: ''CEPH_CONF=/tmp/cephtest/ceph.conf LD_LIBRARY_PATH=/tmp/cephtest/binary/usr/local/lib /tmp/cephtest/enable-coredump /tmp/cephtest/binary/usr/local/bin/ceph-coverage /tmp/cephtest/archive/coverage /tmp/cephtest/binary/usr/local/bin/test_librbd_fsx -d -W -R -p 100 -P /tmp/cephtest/archive -t 1 -l 1073741824 -S 0 -N 20000 pool_client.0 image_client.0'''
flavor: gcov
mds.a-kernel-sha1: 1c17c1aacab817f4621baeee15707ee992c6410a
mon.a-kernel-sha1: 1c17c1aacab817f4621baeee15707ee992c6410a
owner: scheduled_teuthology@teuthology
success: false
#8 Updated by Sage Weil almost 11 years ago
- Status changed from Closed to In Progress
- Target version changed from v0.50 to v0.54a
#9 Updated by Sage Weil almost 11 years ago
- Status changed from In Progress to Resolved
False alarm; this is librbd with caching enabled, so the tids are artificially generated by LibrbdWriteback, not by the Objecter. This is a different bug.
#10 Updated by Sage Weil almost 11 years ago
Sage Weil wrote:
False alarm; this is librbd with caching enabled, so the tids are artificially generated by LibrbdWriteback, not by the Objecter. This is a different bug.
FWIW, the other bug was in ObjectCacher: not all waiters on rx buffers were woken up when we learn an object doesn't exist. This could lead to op reordering when a later read succeeds immediately but the old one is still waiting on actual reads.
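A rough sketch of the kind of fix that comment implies (illustrative only; the types and the waitfor_read map are placeholders, not the actual ObjectCacher code): once we learn an object does not exist, wake every read waiter queued on its rx buffers with -ENOENT rather than only the one whose read just returned, so an older read can no longer be left parked behind a newer one.

    // Sketch under the assumptions above -- placeholder types, not osdc/ObjectCacher.cc.
    #include <cerrno>
    #include <cstdint>
    #include <list>
    #include <map>

    struct Context { virtual void complete(int r) = 0; virtual ~Context() {} };

    struct CachedObjectSketch {
      bool exists = true;
      // offset -> contexts waiting for data to arrive (hypothetical layout)
      std::map<uint64_t, std::list<Context*>> waitfor_read;
    };

    void on_object_enoent_sketch(CachedObjectSketch *ob) {
      ob->exists = false;
      for (auto& [off, waiters] : ob->waitfor_read) {
        for (Context *c : waiters)
          c->complete(-ENOENT);   // wake every waiter so completions stay in order
        waiters.clear();
      }
      ob->waitfor_read.clear();
    }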