Bug #10513: ceph_test_librbd_fsx fails with thrasher

Added by Loïc Dachary over 9 years ago. Updated about 9 years ago.

Status: Won't Fix
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source: other
Severity: 3 - minor

Description

http://pulpito.ceph.com/loic-2015-01-08_10:36:47-rbd-giant-backports-testing-basic-vps/690698/

2015-01-08T03:31:52.223 INFO:teuthology.orchestra.run.vpm020.stdout:1368 write    0xcdd426c thru    0xcdda38b    (0x6120 bytes)
2015-01-08T03:31:52.245 INFO:teuthology.orchestra.run.vpm020.stdout:1370 write    0x1c01757 thru    0x1c111a4    (0xfa4e bytes)
2015-01-08T03:31:52.359 INFO:teuthology.orchestra.run.vpm020.stdout:1371 punch    from 0xbd27283 to 0xbd28661, (0x13de bytes)
2015-01-08T03:31:52.360 INFO:teuthology.orchestra.run.vpm020.stdout:1372 read    0x8ec1a40 thru    0x8ed14b4    (0xfa75 bytes)
2015-01-08T03:31:52.362 INFO:teuthology.orchestra.run.vpm020.stdout:1376 read    0x2c2dc2f thru    0x2c3a85c    (0xcc2e bytes)
2015-01-08T03:31:52.363 INFO:teuthology.orchestra.run.vpm020.stdout:1377 write    0x49bbbaa thru    0x49c3cee    (0x8145 bytes)
2015-01-08T03:31:52.381 INFO:teuthology.orchestra.run.vpm020.stdout:1378 clone    18 order 24 su 65536 sc 10
2015-01-08T03:31:53.488 INFO:teuthology.orchestra.run.vpm020.stdout:leaving image image_client.0-clone17 intact
2015-01-08T03:31:54.094 INFO:tasks.thrashosds.thrasher:in_osds:  [0, 5, 4, 3, 2, 1]  out_osds:  [] dead_osds:  [] live_osds:  [1, 0, 2, 3, 5, 4]
2015-01-08T03:31:54.094 INFO:tasks.thrashosds.thrasher:choose_action: min_in 3 min_out 0 min_live 2 min_dead 0
2015-01-08T03:31:54.094 INFO:tasks.thrashosds.thrasher:inject_pause on 2
2015-01-08T03:31:54.094 INFO:tasks.thrashosds.thrasher:Testing filestore_inject_stall pause injection for duration 3
2015-01-08T03:31:54.094 INFO:tasks.thrashosds.thrasher:Checking after 0, should_be_down=False
2015-01-08T03:31:54.095 INFO:teuthology.orchestra.run.vpm020:Running: 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph --admin-daemon /var/run/ceph/ceph-osd.2.asok config set filestore_inject_stall 3'
2015-01-08T03:31:55.767 INFO:teuthology.orchestra.run.vpm020.stdout:checking clone #16, image image_client.0-clone16 against file /home/ubuntu/cephtest/archive/fsx-image_client.0-parent17
2015-01-08T03:31:57.236 INFO:teuthology.orchestra.run.vpm020.stdout:1379 trunc    from 0xcdda38c to 0x356fbb0
2015-01-08T03:31:57.276 INFO:teuthology.orchestra.run.vpm020.stdout:1380 punch    from 0x2dc498a to 0x2dd41a9, (0xf81f bytes)
2015-01-08T03:31:58.230 INFO:teuthology.orchestra.run.vpm020.stdout:1381 write    0x9812d9b thru    0x981dbff    (0xae65 bytes)
2015-01-08T03:31:58.256 INFO:teuthology.orchestra.run.vpm020.stdout:1382 read    0x41e63cb thru    0x41eaaac    (0x46e2 bytes)
2015-01-08T03:31:58.260 INFO:teuthology.orchestra.run.vpm020.stdout:1384 punch    from 0x56cf68e to 0x56d8789, (0x90fb bytes)
2015-01-08T03:31:58.262 INFO:teuthology.orchestra.run.vpm020.stdout:1385 punch    from 0x34d0564 to 0x34de4ae, (0xdf4a bytes)
2015-01-08T03:31:58.503 INFO:teuthology.orchestra.run.vpm020.stdout:1386 read    0x14bc432 thru    0x14cc124    (0xfcf3 bytes)
2015-01-08T03:31:58.516 INFO:teuthology.orchestra.run.vpm020.stdout:1387 trunc    from 0x981dc00 to 0x4b15e63
2015-01-08T03:31:59.360 INFO:teuthology.orchestra.run.vpm020.stdout:1388 write    0x9a9f9e9 thru    0x9aa4022    (0x463a bytes)
2015-01-08T03:31:59.564 INFO:teuthology.orchestra.run.vpm020.stdout:1389 read    0x8ab5000 thru    0x8ab9bf3    (0x4bf4 bytes)
2015-01-08T03:31:59.588 INFO:teuthology.orchestra.run.vpm020.stdout:1390 write    0xc97e2a0 thru    0xc980f4e    (0x2caf bytes)
2015-01-08T03:31:59.607 INFO:teuthology.orchestra.run.vpm020.stdout:1391 write    0x83e230d thru    0x83eb9c6    (0x96ba bytes)
2015-01-08T03:31:59.926 INFO:teuthology.orchestra.run.vpm020.stdout:1392 read    0x393a1d1 thru    0x3941192    (0x6fc2 bytes)
2015-01-08T03:31:59.929 INFO:teuthology.orchestra.run.vpm020.stdout:1393 read    0xa4b4684 thru    0xa4bf84a    (0xb1c7 bytes)
2015-01-08T03:31:59.956 INFO:teuthology.orchestra.run.vpm020.stdout:1394 write    0xea94054 thru    0xea94507    (0x4b4 bytes)
2015-01-08T03:31:59.973 INFO:teuthology.orchestra.run.vpm020.stdout:1398 read    0x4b1ef26 thru    0x4b26525    (0x7600 bytes)
2015-01-08T03:31:59.975 INFO:teuthology.orchestra.run.vpm020.stdout:1399 clone    19 order 21 su 32768 sc 14
2015-01-08T03:32:04.230 INFO:teuthology.orchestra.run.vpm020.stdout:truncating image image_client.0-clone18 from 0xea94508 (overlap 0x356fbb0) to 0x557f91
2015-01-08T03:32:04.549 INFO:tasks.thrashosds.thrasher:in_osds:  [0, 5, 4, 3, 2, 1]  out_osds:  [] dead_osds:  [] live_osds:  [1, 0, 2, 3, 5, 4]
2015-01-08T03:32:04.549 INFO:tasks.thrashosds.thrasher:choose_action: min_in 3 min_out 0 min_live 2 min_dead 0
2015-01-08T03:32:04.549 INFO:tasks.thrashosds.thrasher:Removing osd 4, in_osds are: [0, 5, 4, 3, 2, 1]
2015-01-08T03:32:04.549 INFO:teuthology.orchestra.run.vpm020:Running: 'adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph osd out 4'
2015-01-08T03:32:04.973 INFO:teuthology.orchestra.run.vpm020.stdout:checking clone #17, image image_client.0-clone17 against file /home/ubuntu/cephtest/archive/fsx-image_client.0-parent18
2015-01-08T03:32:08.495 ERROR:teuthology.parallel:Exception in parallel execution
Traceback (most recent call last):
  File "/home/teuthworker/src/teuthology_giant/teuthology/parallel.py", line 82, in __exit__
    for result in self:
  File "/home/teuthworker/src/teuthology_giant/teuthology/parallel.py", line 101, in next
    resurrect_traceback(result)
  File "/home/teuthworker/src/teuthology_giant/teuthology/parallel.py", line 19, in capture_traceback
    return func(*args, **kwargs)
  File "/var/lib/teuthworker/src/ceph-qa-suite_giant/tasks/rbd_fsx.py", line 82, in _run_one_client
    remote.run(args=args)
  File "/home/teuthworker/src/teuthology_giant/teuthology/orchestra/remote.py", line 128, in run
    r = self._runner(client=self.ssh, name=self.shortname, **kwargs)
  File "/home/teuthworker/src/teuthology_giant/teuthology/orchestra/run.py", line 368, in run
    r.wait()
  File "/home/teuthworker/src/teuthology_giant/teuthology/orchestra/run.py", line 103, in wait
    raise CommandCrashedError(command=self.command)
CommandCrashedError: Command crashed: 'adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph_test_librbd_fsx -d -W -R -p 100 -P /home/ubuntu/cephtest/archive -r 1 -w 1 -t 1 -h 1 -l 250000000 -S 0 -N 2000 pool_client.0 image_client.0'
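
The crashed command can in principle be rerun by hand against a test cluster, which helps when bisecting this kind of failure. A minimal sketch, reusing the flags verbatim from the log above; the pool-creation step and /tmp/fsx-archive are stand-ins for a local reproduction, not part of the original run. By fsx convention, -l caps the file length, -N bounds the number of operations, -S 0 picks a random seed, and -P names the directory where fsx saves its comparison files (the fsx-image_client.0-parent files visible in the log).

    # Sketch: manual rerun of the crashed invocation, flags copied from the log.
    # The pool-creation line and /tmp/fsx-archive are assumptions for local use.
    ceph osd pool create pool_client.0 8        # small test pool; 8 PGs is arbitrary
    ceph_test_librbd_fsx -d -W -R -p 100 -P /tmp/fsx-archive \
        -r 1 -w 1 -t 1 -h 1 -l 250000000 -S 0 -N 2000 \
        pool_client.0 image_client.0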

#1 - Updated by Loïc Dachary over 9 years ago

  • Status changed from New to Won't Fix

Josh: There's no trace of the actual error in the teuthology or Ceph client logs. Looking at the syslogs, this and several other failures from that run were caused by running out of memory. This happened with and without caching, so I suspect these tests may simply need more memory in their current configuration: the OSDs share the same hosts' memory, and they use much more of it while thrashing.
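
If memory pressure is suspected on similar runs, the kernel OOM killer leaves recognizable syslog entries near the failure time. A minimal check, assuming standard kernel logging on the test nodes (log paths and exact message text vary by distribution):

    # Search the node's syslogs for OOM-killer activity around the failure window.
    grep -iE 'out of memory|oom-killer|killed process' /var/log/syslog /var/log/kern.log
    # The syslogs collected with the teuthology job can be searched the same way by
    # pointing the grep at the archived copies instead.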

#2 - Updated by Loïc Dachary about 9 years ago

  • Subject changed from ceph_test_librbd_fsx fails with thrasher (giant) to ceph_test_librbd_fsx fails with thrasher

On dumpling this time ( http://pulpito.ceph.com/loic-2015-01-29_15:39:38-rbd-dumpling-backports---basic-multi/730029/ ), the test never completes. It runs on bare metal, and therefore presumably with more RAM. The failure is also different: the test does not crash, it simply hangs forever.
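
For the hanging case, one way to tell whether the client is stuck waiting on the OSDs or wedged internally is to inspect the fsx process while it hangs. A sketch, assuming the client was started with an admin socket enabled (the .asok path below is hypothetical) and that gdb is available on the node:

    # In-flight requests to the OSDs; long-lived entries here point at blocked I/O.
    ceph --admin-daemon /var/run/ceph/ceph-client.admin.asok objecter_requests
    # A backtrace of every thread distinguishes blocked I/O from an internal deadlock.
    gdb -p "$(pgrep ceph_test_librbd_fsx)" -batch -ex 'thread apply all bt'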
