Project

General

Profile

Actions

Bug #2046

closed

filestore: do_op running during commit

Added by Sage Weil about 12 years ago. Updated about 12 years ago.

Status:
Resolved
Priority:
Urgent
Assignee:
Category:
OSD
Target version:
% Done:

0%

Source:
Tags:
Backport:
Regression:
Severity:
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

commit_start() is supposed to quiesce writes, but I see

2012-02-08 16:12:43.487876 7f5324ccb700 filestore(/tmp/cephtest/data/osd.3.data) sync_entry committing 127294 sync_epoch 787
2012-02-08 16:12:43.487912 7f531bab7700 filestore(/tmp/cephtest/data/osd.3.data) _flush_op_queue waiting for apply finisher
2012-02-08 16:12:43.487928 7f531bab7700 filestore(/tmp/cephtest/data/osd.3.data) flush complete
2012-02-08 16:12:43.487939 7f531bab7700 osd.3 1388 _remove_pg 0.17 taking osd_lock
2012-02-08 16:12:43.487954 7f531bab7700 osd.3 1388 _remove_pg 0.17 removing final
2012-02-08 16:12:43.487972 7f5324ccb700 filestore(/tmp/cephtest/data/osd.3.data) taking async snap 'snap_127294'
2012-02-08 16:12:43.487998 7f531bab7700 filestore(/tmp/cephtest/data/osd.3.data) queue_transactions existing osr 0x2bfd2d0/0x2c05770
2012-02-08 16:12:43.488018 7f531bab7700 filestore(/tmp/cephtest/data/osd.3.data) queue_transactions (parallel) 127295 0x7e71c00
2012-02-08 16:12:43.488048 7f531bab7700 filestore(/tmp/cephtest/data/osd.3.data) queue_op 0xa66f6c0 seq 127295 193 bytes   (queue has 1 ops and 193 bytes)
2012-02-08 16:12:43.488081 7f531bab7700 osd.3 1388 pg[0.17( empty n=17 ec=1 les/c 1340/1367 1377/1377/1377) [2,5] r=-1 lpr=1377 lcod 1363'306 stray DELETING] remove_watchers
2012-02-08 16:12:43.488103 7f531bab7700 osd.3 1388 _put_pool 0 1 -> 0
2012-02-08 16:12:43.488116 7f531bab7700 osd.3 1388 _remove_pg 0.17 all done
2012-02-08 16:12:43.488136 7f5322cc7700 filestore(/tmp/cephtest/data/osd.3.data) _do_op 0xa66f6c0 127295 osr 0x2bfd2d0/0x2c05770 start
2012-02-08 16:12:43.488153 7f5322cc7700 filestore(/tmp/cephtest/data/osd.3.data) _do_transaction on 0x7e71c00
2012-02-08 16:12:43.488167 7f5322cc7700 filestore(/tmp/cephtest/data/osd.3.data) _destroy_collection /tmp/cephtest/data/osd.3.data/current/0.17_1df
2012-02-08 16:12:43.488396 7f5322cc7700 filestore(/tmp/cephtest/data/osd.3.data) _destroy_collection /tmp/cephtest/data/osd.3.data/current/0.17_1df = 0
2012-02-08 16:12:43.488410 7f5322cc7700 filestore(/tmp/cephtest/data/osd.3.data) _destroy_collection /tmp/cephtest/data/osd.3.data/current/0.17_244
2012-02-08 16:12:43.488546 7f5322cc7700 filestore(/tmp/cephtest/data/osd.3.data) _destroy_collection /tmp/cephtest/data/osd.3.data/current/0.17_244 = 0
2012-02-08 16:12:43.488558 7f5322cc7700 filestore(/tmp/cephtest/data/osd.3.data) _destroy_collection /tmp/cephtest/data/osd.3.data/current/0.17_254
2012-02-08 16:12:43.491911 7f5322cc7700 filestore(/tmp/cephtest/data/osd.3.data) _destroy_collection /tmp/cephtest/data/osd.3.data/current/0.17_254 = 0
2012-02-08 16:12:43.491940 7f5322cc7700 filestore(/tmp/cephtest/data/osd.3.data) remove meta/5172cdfb/pglog_0.17/0
2012-02-08 16:12:43.493447 7f5322cc7700 filestore(/tmp/cephtest/data/osd.3.data) remove meta/5172cdfb/pglog_0.17/0 = 0
2012-02-08 16:12:43.493464 7f5322cc7700 filestore(/tmp/cephtest/data/osd.3.data) remove meta/28da3187/pginfo_0.17/0
2012-02-08 16:12:43.496779 7f5322cc7700 filestore(/tmp/cephtest/data/osd.3.data) remove meta/28da3187/pginfo_0.17/0 = 0
2012-02-08 16:12:43.496796 7f5322cc7700 filestore(/tmp/cephtest/data/osd.3.data) _destroy_collection /tmp/cephtest/data/osd.3.data/current/0.17_head
2012-02-08 16:12:43.497054 7f5322cc7700 filestore(/tmp/cephtest/data/osd.3.data) _destroy_collection /tmp/cephtest/data/osd.3.data/current/0.17_head = 0
2012-02-08 16:12:43.497068 7f5322cc7700 filestore(/tmp/cephtest/data/osd.3.data) _do_op 0xa66f6c0 127295 r = 0, finisher 0x2be4f10 0
2012-02-08 16:12:43.497079 7f5322cc7700 filestore(/tmp/cephtest/data/osd.3.data) _finish_op on osr 0x2bfd2d0/0x2c05770
2012-02-08 16:12:43.502713 7f5324ccb700 filestore(/tmp/cephtest/data/osd.3.data) async snap create 'snap_127294' transid 0 got (22) Invalid argument
os/FileStore.cc: In function 'void FileStore::sync_entry()' thread 7f5324ccb700 time 2012-02-08 16:12:43.569505
os/FileStore.cc: 3078: FAILED assert(0 == "async snap ioctl error")

Actions #1

Updated by Sage Weil about 12 years ago

this was broken by 259c509a8941bf7cdad8bd4ede0ccd73ca8a83d3, way back in v0.25! Sigh. The wait condition for _op_apply_start() was broken.

Actions

Also available in: Atom PDF