Bug #14698 (closed)

Test failure: test_full_fsync (tasks.cephfs.test_full.TestQuotaFull)

Added by Greg Farnum about 8 years ago. Updated about 8 years ago.

Status: Duplicate
Priority: Urgent
Assignee: Greg Farnum
Category: -
Target version: -
% Done: 0%
Source: other
Regression: No
Severity: 3 - minor

#2 - Updated by Greg Farnum about 8 years ago

Dunno — it's QuotaFull rather than ClusterFull so I created a new ticket.

#3 - Updated by Greg Farnum about 8 years ago

http://pulpito.ceph.com/gregf-2016-02-08_22:02:06-fs-greg-fs-testing-27-1---basic-mira/1009/

Also, this failure seems to be consistent now rather than very erratic. I think we busted something in our code or our environment.

#4 - Updated by Greg Farnum about 8 years ago

I'm confused — the log doesn't seem to include any of the output from the remote Python script, which makes me think we're not reaching the output points. That's possible.

So I added some debug spam. But John, it looks like this test tries to write an amount equal to the full size of the cluster, and then we fsync it and expect that to pass. Perhaps that has passed in the past due to slow updates, but I actually wouldn't expect it to pass, since any extra space already used up ought to make it fail. Am I missing something here, or is this test just racy and in need of being more forgiving? (And couldn't we just write slightly less than the full amount ahead of time, and then cross the line in the loop? A sketch of that follows.)
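A minimal sketch of that idea (hypothetical, not an actual patch; the mount path and the 32 MB capacity figure are taken from the test script John quotes in the next comment, where 32 is the loop bound):

    # Hypothetical restructuring of the test, not the real change: pre-fill
    # ~90% of the expected capacity in one buffered burst, then cross the
    # full threshold in small steps so the write/fsync loop only has to
    # cover the last stretch.
    import os

    FILL_MB = 32   # expected capacity in MB (assumed, matching the test's loop bound)
    path = "/home/ubuntu/cephtest/mnt.0/full_test_file"

    f = os.open(path, os.O_WRONLY | os.O_CREAT)
    os.write(f, b'a' * (int(FILL_MB * 0.9) * 1024 * 1024))
    os.fsync(f)    # still below the threshold, so this flush should succeed

    full = False
    for _ in range(FILL_MB):   # worst case: write well past the remaining ~10%
        os.write(f, b'x' * 1024 * 1024)
        try:
            os.fsync(f)        # with buffered IO, the full error should surface here
        except OSError:
            full = True
            break

    if not full:
        raise RuntimeError("never reached the full condition")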

#5 - Updated by John Spray about 8 years ago

The output is:

2016-02-08T22:42:16.737 INFO:teuthology.orchestra.run.mira101:Running: 'sudo adjust-ulimits daemon-helper kill python -c \'\nimport time\nimport datetime\nimport subprocess\nimport os\n\n# Write some buffered data through before going full, all should be well\nbytes = 0\nf = os.open("/home/ubuntu/cephtest/mnt.0/full_test_file", os.O_WRONLY | os.O_CREAT)\nbytes += os.write(f, \'"\'"\'a\'"\'"\' * 4096)\nos.fsync(f)\n\n# Okay, now we\'"\'"\'re going to fill up the filesystem, and then keep\n# writing until we see an error from fsync.  As long as we\'"\'"\'re doing\n# buffered IO, the error should always only appear from fsync and not\n# from write\nfull = False\n\nfor n in range(0, 32 + 1):\n    try:\n        bytes += os.write(f, \'"\'"\'x\'"\'"\' * 1024 * 1024)\n    except OSError as e:\n        print "Unexpected error %s from write() instead of fsync()" % e\n        raise\n\n    try:\n        os.fsync(f)\n    except OSError as e:\n        print "Reached fullness after %.2f MB" % (bytes / (1024.0 * 1024.0))\n        full = True\n        break\n    else:\n        print "Not full yet after %.2f MB" % (bytes / (1024.0 * 1024.0))\n\n    if n > 32 * 0.8:\n        # Be cautious in the last region where we expect to hit\n        # the full condition, so that we don\'"\'"\'t overshoot too dramatically\n        time.sleep(15.0)\n\nif not full:\n    raise RuntimeError("Failed to reach fullness after writing %d bytes" % bytes)\n\n# The error sticks to the inode until we dispose of it\ntry:\n    os.close(f)\nexcept OSError:\n    print "Saw error from close() as expected"\nelse:\n    raise RuntimeError("Did not see expected error from close()")\n\nos.unlink("/home/ubuntu/cephtest/mnt.0/full_test_file")\n\''
2016-02-08T22:42:37.258 INFO:teuthology.orchestra.run.mira005:Running: 'sudo logrotate /etc/logrotate.d/ceph-test.conf'
2016-02-08T22:42:37.263 INFO:teuthology.orchestra.run.mira101:Running: 'sudo logrotate /etc/logrotate.d/ceph-test.conf'
2016-02-08T22:43:07.284 INFO:teuthology.orchestra.run.mira005:Running: 'sudo logrotate /etc/logrotate.d/ceph-test.conf'
2016-02-08T22:43:07.288 INFO:teuthology.orchestra.run.mira101:Running: 'sudo logrotate /etc/logrotate.d/ceph-test.conf'
2016-02-08T22:43:37.308 INFO:teuthology.orchestra.run.mira005:Running: 'sudo logrotate /etc/logrotate.d/ceph-test.conf'
2016-02-08T22:43:37.312 INFO:teuthology.orchestra.run.mira101:Running: 'sudo logrotate /etc/logrotate.d/ceph-test.conf'
2016-02-08T22:43:50.904 INFO:teuthology.orchestra.run.mira101.stderr:Traceback (most recent call last):
2016-02-08T22:43:50.904 INFO:teuthology.orchestra.run.mira101.stderr:  File "<string>", line 21, in <module>
2016-02-08T22:43:50.904 INFO:teuthology.orchestra.run.mira101.stderr:OSError: [Errno 28] No space left on device
2016-02-08T22:43:50.961 INFO:teuthology.orchestra.run.mira101.stderr:daemon-helper: command failed with exit status 1

(it's hard to see because it comes out at the point the command actually runs rather than when the test failure is reported from the exception)
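For readability, here is the embedded script from that Running: line with the shell quoting unescaped (reconstructed directly from the log above; it's Python 2, as teuthology runs it). Line 21 of this script, where the traceback points, is the os.write() inside the loop:

    import time
    import datetime
    import subprocess
    import os

    # Write some buffered data through before going full, all should be well
    bytes = 0
    f = os.open("/home/ubuntu/cephtest/mnt.0/full_test_file", os.O_WRONLY | os.O_CREAT)
    bytes += os.write(f, 'a' * 4096)
    os.fsync(f)

    # Okay, now we're going to fill up the filesystem, and then keep
    # writing until we see an error from fsync.  As long as we're doing
    # buffered IO, the error should always only appear from fsync and not
    # from write
    full = False

    for n in range(0, 32 + 1):
        try:
            bytes += os.write(f, 'x' * 1024 * 1024)
        except OSError as e:
            print "Unexpected error %s from write() instead of fsync()" % e
            raise

        try:
            os.fsync(f)
        except OSError as e:
            print "Reached fullness after %.2f MB" % (bytes / (1024.0 * 1024.0))
            full = True
            break
        else:
            print "Not full yet after %.2f MB" % (bytes / (1024.0 * 1024.0))

        if n > 32 * 0.8:
            # Be cautious in the last region where we expect to hit
            # the full condition, so that we don't overshoot too dramatically
            time.sleep(15.0)

    if not full:
        raise RuntimeError("Failed to reach fullness after writing %d bytes" % bytes)

    # The error sticks to the inode until we dispose of it
    try:
        os.close(f)
    except OSError:
        print "Saw error from close() as expected"
    else:
        raise RuntimeError("Did not see expected error from close()")

    os.unlink("/home/ubuntu/cephtest/mnt.0/full_test_file")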

#6 - Updated by John Spray about 8 years ago

The weirdness here is that the error is coming from os.write() rather than os.fsync().

The test expects the client to see an osdmap with the full flag only when it's actually trying to flush something (the client doesn't subscribe to map updates continuously). This behaviour is surprising: we finish the previous fsync(), so the objecter should be idle, and yet before we reach the next write() we're apparently receiving an updated osdmap with the full flag set, causing the failure in write().

The other scenario that would cause this is if, for some reason, we were doing direct writes instead of buffered writes.
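As a toy model of that expectation (plain Python purely for illustration; this is not Ceph code, and refresh_osdmap is a made-up stand-in for the real map subscription):

    # Toy model: why buffered write() shouldn't observe the FULL flag,
    # while fsync() can.
    class ToyClient(object):
        def __init__(self):
            self.osdmap_full = False   # only refreshed when we talk to the OSDs
            self.dirty = b""

        def write(self, data):
            # Buffered write: the data lands in the client cache with no OSD
            # round trip, so there is no opportunity to fetch a newer osdmap
            # and see the FULL flag; it should always succeed.
            self.dirty += data

        def fsync(self):
            # Flushing contacts the OSDs, which is when a fresh osdmap (and
            # its FULL flag) becomes visible to the client.
            self.refresh_osdmap()
            if self.osdmap_full:
                raise OSError(28, "No space left on device")
            self.dirty = b""

        def refresh_osdmap(self):
            # Stand-in: the real client fetches maps on demand, not
            # continuously.
            pass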

#7 - Updated by John Spray about 8 years ago

aaaand here's why:

2016-02-09 06:42:18.603243 7ff73b7fe700 10 client.4445.objecter tick
2016-02-09 06:42:18.603261 7ff73b7fe700  2 client.4445.objecter  tid 9 on osd.0 is laggy
2016-02-09 06:42:18.603268 7ff73b7fe700 10 client.4445.objecter _maybe_request_map subscribing (onetime) to next osd map

This is something I had seen on the OpenStack runs, but not on real hardware.

When OSDs are so slow that we have laggy ops, the clients start getting osdmaps all the time, and the subtler rules about exactly when we get things like full flags go out the window.
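In terms of the toy model in #6, lagginess effectively adds a background map refresh: the FULL flag can now land while the client is idle, and the next write() fails instead. Again, this is just a sketch of the mechanism, not the Objecter's actual code path (have_laggy_ops is invented for the illustration):

    # Sketch: a periodic tick notices laggy ops and subscribes to the next
    # osdmap, so the FULL flag can arrive between write() calls.
    class LaggyToyClient(ToyClient):
        def have_laggy_ops(self):
            return True                # pretend an op has exceeded the timeout

        def tick(self):
            if self.have_laggy_ops():  # cf. "_maybe_request_map subscribing (onetime)"
                self.osdmap_full = True  # a new map, possibly with FULL set, arrives

        def write(self, data):
            # The test's assumption breaks: the map may already be marked
            # full before we even buffer the write, so write() raises the
            # ENOSPC that the traceback in #5 shows.
            if self.osdmap_full:
                raise OSError(28, "No space left on device")
            self.dirty += data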

#8 - Updated by Greg Farnum about 8 years ago

  • Assignee set to Greg Farnum

John Spray wrote:

The output is
[...]

(it's hard to see because it comes out at the point the command actually runs rather than when the test failure is reported from the exception)

Yeah, I was just confused because we weren't getting any of the debug output, and I wasn't sure we were breaking out in the first write (which I guess is the case). I've got patches adding some gratuitous debugging output now.

...okay, this is a serious problem with the laggy op detection in Objecter: http://tracker.ceph.com/issues/14750

I'll run that through testing and make sure we're good. Glad our tests spotted that, even if indirectly!

Regarding the slow ops...that's troubling. I wouldn't have expected

#9 - Updated by Greg Farnum about 8 years ago

  • Status changed from New to Duplicate
#10 - Updated by Greg Farnum about 8 years ago

