Bug #15019
closed
hammer: fs test fails with log [ERR] : OSD full dropping all updates 100% full
Updated by Wei-Chung Cheng about 8 years ago
- Tracker changed from Bug to Backport
- Release set to hammer
Updated by Wei-Chung Cheng about 8 years ago
- Status changed from New to In Progress
- Assignee set to Wei-Chung Cheng
Updated by Loïc Dachary about 8 years ago
- Related to Backport #13335: hammer: OSD crashed when reached pool's max_bytes quota added
Updated by Loïc Dachary about 8 years ago
- Subject changed from check osd full mechanism will cause failure when osd is too full to hammer: revert "OSD crashed when reached pool's max_bytes quota"
Updated by Loïc Dachary about 8 years ago
- fail http://pulpito.ceph.com/loic-2016-03-01_20:26:33-fs-hammer-backports---basic-multi/
- new bug (happened twice) "log [ERR] : OSD full dropping all updates 100% full"
Trying on hammer
teuthology-suite --priority 1000 --suite fs --filter="fs/recovery/{clusters/2-remote-clients.yaml debug/mds_client.yaml mounts/ceph-fuse.yaml tasks/mds-full.yaml}" --suite-branch hammer --email loic@dachary.org --ceph hammer --machine-type smithi,mira
teuthology-suite --priority 1000 --suite fs --filter="fs/recovery/{clusters/2-remote-clients.yaml debug/mds_client.yaml mounts/ceph-fuse.yaml tasks/mds-full.yaml}" --suite-branch hammer --email loic@dachary.org --ceph v0.94.6 --machine-type smithi,mira
The same test failed on February 8th, 2016: http://pulpito.ceph.com/loic-2016-02-08_23:43:36-fs-hammer-backports---basic-multi/1087/
The same test passed on January 29th, 2016: http://pulpito.ceph.com/loic-2016-01-29_03:02:05-fs-hammer---basic-multi/49423/
git log --merges --since 2016-01-29 --until 2016-02-07 --format='%H' ceph/hammer | \
while read sha1 ; do \
  echo ; git log --format='** %aD "%s":https://github.com/ceph/ceph/commit/%H' ${sha1}^1..${sha1} ; \
done | perl -p -e 'print "* \"PR $1\":https://github.com/ceph/ceph/pull/$1\n" if(/Merge pull request #(\d+)/)'
- PR 7524
- Fri, 5 Feb 2016 21:10:46 -0500 Merge pull request #7524 from ktdreyer/wip-14637-hammer-man-radosgw-admin-orphans
- Wed, 3 Feb 2016 19:51:58 -0700 doc: add orphans commands to radosgw-admin
- Thu, 4 Feb 2016 11:04:39 -0700 man: rebuild manpages
- PR 7526
- Fri, 5 Feb 2016 10:30:22 +0100 Merge pull request #7526 from ceph/wip-14516-hammer
- Mon, 1 Feb 2016 16:33:55 -0800 rgw-admin: document orphans commands in usage
- PR 7441
- Fri, 5 Feb 2016 12:47:33 +0700 Merge pull request #7441 from odivlad/backport-pr-14569
- Thu, 9 Jul 2015 16:56:07 +0800 rgw: Make RGW_MAX_PUT_SIZE configurable
- PR 7442
- Fri, 5 Feb 2016 12:46:54 +0700 Merge pull request #7442 from odivlad/backport-pr-14570
- Mon, 21 Sep 2015 20:32:29 +0200 rgw: fix wrong etag calculation during POST on S3 bucket.
- PR 7454
- Wed, 3 Feb 2016 12:41:56 +0700 Merge pull request #7454 from dachary/wip-14584-hammer
- Tue, 18 Aug 2015 15:22:55 +0800 qa/fsstress.sh: fix 'cp not writing through dangling symlink'
- PR 6918
- Wed, 3 Feb 2016 11:38:57 +0700 Merge pull request #6918 from asheplyakov/hammer-bug-12449
- Wed, 16 Dec 2015 15:31:52 +0300 Check for full before changing the cached obc
- PR 7236
- Sat, 30 Jan 2016 21:42:29 -0500 Merge pull request #7236 from athanatos/wip-14376
- Thu, 14 Jan 2016 08:35:23 -0800 config_opts: increase suicide timeout to 300 to match recovery
- PR 6450
- Sat, 30 Jan 2016 21:42:12 -0500 Merge pull request #6450 from dachary/wip-13672-hammer
- Tue, 3 Nov 2015 00:21:51 +0100 tests: test/librados/test.cc must create profile
- Mon, 2 Nov 2015 20:24:51 +0100 tests: destroy testprofile before creating one
- Mon, 2 Nov 2015 20:23:52 +0100 tests: add destroy_ec_profile{,_pp} helpers
- PR 6680
- Sat, 30 Jan 2016 21:41:39 -0500 Merge pull request #6680 from SUSE/wip-13859-hammer
- Thu, 3 Sep 2015 20:30:50 +0200 ceph.spec.in: fix License line
- PR 6791
- Sat, 30 Jan 2016 21:41:18 -0500 Merge pull request #6791 from branch-predictor/bp-5812-backport
- Mon, 6 Jul 2015 09:56:11 +0200 tools: fix race condition in seq/rand bench
- Wed, 20 May 2015 12:41:22 +0200 tools: add --no-verify option to rados bench
- PR 6973
- Sat, 30 Jan 2016 21:40:38 -0500 Merge pull request #6973 from dreamhost/wip-configure-hammer
- Tue, 5 May 2015 15:07:33 +0800 configure.ac: no use to add "+" before ac_ext=c
- PR 7206
- Sat, 30 Jan 2016 21:40:13 -0500 Merge pull request #7206 from dzafman/wip-14292
- Mon, 15 Jun 2015 17:55:41 -0700 ceph_osd: Add required feature bits related to this branch to osd_required mask
- Thu, 4 Jun 2015 18:47:42 -0700 osd: CEPH_FEATURE_CHUNKY_SCRUB feature now required
- PR 7207
- Sat, 30 Jan 2016 21:39:42 -0500 Merge pull request #7207 from rldleblanc/recency_fix_for_hammer
- Wed, 25 Nov 2015 14:40:26 -0500 osd: recency should look at newest (not oldest) hitsets
- Wed, 25 Nov 2015 14:39:08 -0500 osd/ReplicatedPG: fix promotion recency logic
- PR 7347
- Sat, 30 Jan 2016 21:39:11 -0500 Merge pull request #7347 from tchaikov/wip-hammer-10093
- Tue, 21 Apr 2015 14:04:40 +0800 tools: ceph-monstore-tool must do out_store.close()
- PR 7411
- Sat, 30 Jan 2016 21:38:35 -0500 Merge pull request #7411 from dachary/wip-14467-hammer
- Mon, 18 Jan 2016 08:24:46 -0700 osd: disable filestore_xfs_extsize by default
- PR 7412
- Sat, 30 Jan 2016 21:38:13 -0500 Merge pull request #7412 from dachary/wip-14470-hammer
- Tue, 5 Jan 2016 14:34:05 +0800 tools: monstore: add 'show-versions' command.
- Wed, 16 Sep 2015 18:28:52 +0800 tools: ceph_monstore_tool: add inflate-pgmap command
- Tue, 20 Oct 2015 15:23:49 +0800 tools:support printing the crushmap in readable fashion.
- Mon, 14 Sep 2015 19:50:47 +0800 tools:print the map infomation in human readable format.
- Mon, 14 Sep 2015 19:19:05 +0800 tools:remove the local file when get map failed.
- Mon, 13 Jul 2015 12:35:13 +0100 tools: ceph_monstore_tool: describe behavior of rewrite command
- Fri, 19 Jun 2015 22:57:57 +0800 tools/ceph-monstore-tools: add rewrite command
- Tue, 21 Apr 2015 14:04:40 +0800 tools: ceph-monstore-tool must do out_store.close()
- PR 7446
- Sat, 30 Jan 2016 21:37:46 -0500 Merge pull request #7446 from liewegas/wip-14537-hammer
- Thu, 28 Jan 2016 02:09:53 -0800 mon: compact full epochs also
- PR 7182
- Sat, 30 Jan 2016 11:45:31 -0800 Merge pull request #7182 from dachary/wip-14143-hammer
- Mon, 14 Dec 2015 17:41:49 -0500 librbd: optionally validate RBD pool configuration
- PR 7183
- Sat, 30 Jan 2016 11:45:20 -0800 Merge pull request #7183 from dachary/wip-14283-hammer
- Tue, 18 Aug 2015 16:05:29 -0400 rbd: fix bench-write
- PR 7416
- Sat, 30 Jan 2016 11:45:05 -0800 Merge pull request #7416 from dachary/wip-14466-hammer
- Thu, 21 Jan 2016 13:45:42 +0200 rbd-replay: handle EOF gracefully
- PR 7417
- Sat, 30 Jan 2016 11:44:50 -0800 Merge pull request #7417 from dachary/wip-14553-hammer
- Fri, 22 Jan 2016 11:18:40 -0800 rbd: remove canceled tasks from timer thread
- PR 7407
- Sat, 30 Jan 2016 11:44:32 -0800 Merge pull request #7407 from dillaman/wip-14543-hammer
- Thu, 28 Jan 2016 14:38:20 -0500 librbd: ImageWatcher shouldn't block the notification thread
- Thu, 28 Jan 2016 14:35:54 -0500 librados_test_stub: watch/notify now behaves similar to librados
- Thu, 28 Jan 2016 12:40:18 -0500 tests: simulate writeback flush during snap create
- PR 6980
- Sat, 30 Jan 2016 11:44:12 -0800 Merge pull request #6980 from dillaman/wip-14063-hammer
- Fri, 18 Dec 2015 15:22:13 -0500 librbd: fix merge-diff for >2GB diff-files
- PR 6353
- Fri, 29 Jan 2016 23:31:47 +0700 Merge pull request #6353 from theanalyst/wip-13513-hammer
- Wed, 19 Aug 2015 20:32:39 +0200 rgw: url_decode values from X-Object-Manifest during GET on Swift DLO.
- PR 6620
- Fri, 29 Jan 2016 23:31:16 +0700 Merge pull request #6620 from SUSE/wip-13820-hammer
- Wed, 23 Sep 2015 09:49:36 -0500 rgw: fix modification to index attrs when setting acls
- PR 7186
- Fri, 29 Jan 2016 23:30:57 +0700 Merge pull request #7186 from dachary/wip-13888-hammer
- Thu, 19 Nov 2015 13:38:40 +0300 Fixing NULL pointer dereference
- PR 5789
- Fri, 29 Jan 2016 08:52:51 -0500 Merge pull request #5789 from SUSE/wip-12928-hammer
- Mon, 1 Jun 2015 15:57:03 +0200 ceph.spec.in summary-ended-with-dot
- Mon, 1 Jun 2015 14:58:31 +0200 ceph.spec.in libcephfs_jni1 has no %post and %postun
- PR 7434
- Fri, 29 Jan 2016 08:50:56 -0500 Merge pull request #7434 from tchaikov/wip-14441-hammer
- Wed, 23 Dec 2015 11:23:38 +0800 man: document listwatchers cmd in "rados" manpage
The most likely candidate for this regression is https://github.com/ceph/ceph/pull/6918 (hammer: osd: check for full before changing the cached obc).
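Since the error comes from the OSD, one way to shorten the list of merges above is to keep only those whose changes touch the OSD sources. The following is a minimal sketch, not something that was run for this report: the function name merges_touching is hypothetical, and using src/osd as the path of interest is an assumption.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original report): print the merges made
# on BRANCH between START and END whose diff against the first parent touches
# PATHSPEC.
merges_touching() {
    start=$1 ; end=$2 ; branch=$3 ; pathspec=$4
    git log --merges --since "$start" --until "$end" --format='%H' "$branch" |
    while read -r sha1 ; do
        # Compare the merge with its first parent, restricted to the path.
        if [ -n "$(git diff --name-only "${sha1}^1" "${sha1}" -- "$pathspec")" ] ; then
            git log -1 --format='%h %aD %s' "$sha1"
        fi
    done
}

# Example call, assuming the same branch and date window as the command above:
# merges_touching 2016-01-29 2016-02-07 ceph/hammer src/osd
```

The diff is taken against the first parent so that each merge is judged by what it brought into the branch, mirroring the ${sha1}^1..${sha1} range used in the original command.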
Trying on v0.94.5
Updated by Loïc Dachary about 8 years ago
- Tracker changed from Backport to Bug
- Subject changed from hammer: revert "OSD crashed when reached pool's max_bytes quota" to hammer: fs test fails with log [ERR] : OSD full dropping all updates 100% full
- Status changed from In Progress to Duplicate
- % Done set to 0
This is a manifestation of an incomplete backport, for which a follow-up has been scheduled at http://tracker.ceph.com/issues/14824#note-5. The discussion and rationale for keeping the incomplete backport and failing this test can be found in the QE validation hammer mail thread. The issue http://tracker.ceph.com/issues/14716 is referenced, and http://tracker.ceph.com/issues/14716#note-11 states that it is related to http://tracker.ceph.com/issues/14824.
Closing https://github.com/ceph/ceph/pull/7977 and marking this as a duplicate. The title is changed to match the error message so that it's easier to find when future tests fail.
Updated by Loïc Dachary about 8 years ago
- Related to Backport #14824: hammer: rbd and pool quota do not go well together added