Bug #2689

closed

qemu iozone test hangs

Added by Sage Weil almost 12 years ago. Updated over 10 years ago.

Status:
Resolved
Priority:
Urgent
Assignee:
Target version:
-
% Done:

0%

Source:
Q/A
Tags:
Backport:
Regression:
Severity:
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

kernel: &id001
  branch: testing
  kdb: true
nuke-on-error: true
overrides:
  ceph:
    branch: next
    fs: btrfs
    log-whitelist:
    - slow request
roles:
- - mon.a
  - mon.c
  - osd.0
  - osd.1
  - osd.2
- - mon.b
  - mds.a
  - osd.3
  - osd.4
  - osd.5
- - client.0
targets:
  ubuntu@plana06.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDHBLmNMiGX6Hq9/WJ7HowKhQ8u3qvIHd5SbaQ5DcD40K+5dstZBCFzbZfSuKDwYfMJJ/wwONK6z0rZbz+ox1qjU2xD3Vxwq7IFSHDtYtJIYQ65e6yVmkhZPCULITkLsKkVG0X4PCJQrrceKMWWQtQvTHPzdeCAiTzCperz52tHat4IXq64EGwjzCDybPEa+ZlctV3PPGsvDjPTS+rhekGuS8aNRSIV8LJ4jmQ4BeZWiShO9C42gIMYwvahgIkFdnK+25iA4F0HINHgqGZU0I0eRIFMP7pgE2qQ8NoajEIuCPid3/eZwibCw9W9fya2MklksiOaS9U9eghEC7u0lyrX
  ubuntu@plana64.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDk4GmsUmC8svnRI6Xd+mRX2MwKb4RHECAeLfqTm2COfqfolS2wKGw3U92eJcyvpZ+2p82X7uBrimjZh5JgRtxJ1aGUG4Pi60+JBYF0WpohM/3aYISFegVNET9rcapdDaAi6fFB5vhT06Q/cYEO0tPrdqGb/O3oiDSurtqtfOzkdwSPWSTY/hSegXgOeG6EjuEfvnU4BbgXWkLlDQRXCdgQd35F0SlKJVgMo+J1MgMCEK4qnBMFN614P1gBSzZCBsSUGQdjYBOzZfCRlI2bUdPDtB0kyjp7o5Ns9gLd07TLw8h9oxvI7wxG16XnLOAIzPBNOaH4OztTMGg3wJ/1e26t
  ubuntu@plana65.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDTwfkF9asvpySXF/DOk10UkRDNtRwgGgLww/I/3E2r+JpsfYtW62TA1HMXjtB1g7SrcIolqCiiMd+5MIURIND94n76JiZ2o4DplLKIqUB6ys46gro7mwoeFnZNOuwdAA5bO4dfgeQ3yPtfIqpWTejkCB7ai/kG04C4ekz6EgplwtqWIfvXnij4fNaqvm3s/IxGhnO40DOGNwsAldEJo2fuJN8KHnYzsU/Dx5kJ85jQl2eQJI74VpMoh2Ge7+n9Q8rJhegfcHYPLJsX/Uyrf7Rtk1RfeTyZbIOSJIQDbQNepu278kvc9IEnFg3WfvWespfrUExVgnXq53xd1RIFcx6L
task:
- ceph:
    conf:
      client:
        rbd cache: false
tasks:
- internal.lock_machines: 3
- internal.save_config: null
- internal.check_lock: null
- internal.connect: null
- internal.check_conflict: null
- kernel: *id001
- internal.base: null
- internal.archive: null
- internal.coredump: null
- internal.syslog: null
- internal.timer: null
- chef: null
- clock: null
- ceph: null
- qemu:
    all:
      image_size: 20480
      test: https://raw.github.com/ceph/ceph/master/qa/workunits/suites/iozone.sh

The hang also occurs with various combinations of caching on and off, so the cache setting doesn't seem to matter.
Actions #1

Updated by Josh Durgin almost 12 years ago

  • Status changed from New to In Progress
  • Assignee set to Josh Durgin
Actions #2

Updated by Sage Weil almost 12 years ago

  • Target version set to v0.50
Actions #3

Updated by Sage Weil over 11 years ago

  • Priority changed from Urgent to Normal
Actions #4

Updated by Sage Weil over 11 years ago

  • Assignee deleted (Josh Durgin)
Actions #5

Updated by Sage Weil over 11 years ago

  • Priority changed from Normal to High
Actions #6

Updated by Sage Weil over 11 years ago

  • Status changed from In Progress to 12
Actions #7

Updated by Sage Weil over 11 years ago

Let's retest this now that all of the recent caching fixes are in.

Actions #8

Updated by Josh Durgin over 11 years ago

  • Status changed from 12 to 7
  • Assignee set to Josh Durgin
  • Target version deleted (v0.50)

Testing again since some possible causes were fixed.

Actions #9

Updated by Josh Durgin over 11 years ago

  • Status changed from 7 to In Progress

This seems to still be a problem. I'll try to get more information about what's going on. It looks like there's an error somewhere:

        Iozone: Performance Test of File I/O
                Version $Revision: 3.397 $
                Compiled for 64 bit mode.
                Build: linux-AMD64 

        Contributors:William Norcott, Don Capps, Isom Crawford, Kirby Collins
                     Al Slater, Scott Rhine, Mike Wisner, Ken Goss
                     Steve Landherr, Brad Smith, Mark Kelly, Dr. Alain CYR,
                     Randy Dunlap, Mark Montague, Dan Million, Gavin Brebner,
                     Jean-Marc Zucconi, Jeff Blomberg, Benny Halevy, Dave Boone,
                     Erik Habbinga, Kris Strecker, Walter Wong, Joshua Root,
                     Fabrice Bacchella, Zhenghua Xue, Qin Li, Darren Sawyer.
                     Ben England.

        Run began: Fri Dec 28 19:30:45 2012

        Include close in write timing
        Include fsync in write timing
        File size set to 10485760 KB
        Record Size 1024 KB
        Command line used: iozone -c -e -s 10240M -r 1M -t 1 -F f3 -i 0 -i 1
        Output is in Kbytes/sec
        Time Resolution = 0.000001 seconds.
        Processor cache size set to 1024 Kbytes.
        Processor cache line size set to 32 bytes.
        File stride size set to 17 * record size.
        Throughput test with 1 process
        Each process writes a 10485760 Kbyte file in 1024 Kbyte records

Error writing block 10197, fd= 4

        Children see throughput for  1 initial writers  =       0.00 KB/sec
        Parent sees throughput for  1 initial writers   =       0.00 KB/sec
        Min throughput per process                      =       0.00 KB/sec 
        Max throughput per process                      =       0.00 KB/sec
        Avg throughput per process                      =       0.00 KB/sec
        Min xfer                                        =       0.00 KB

Child 0
f3: No such file or directory
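
To put the failure point in context, here is a minimal sketch that reads the three relevant lines of the iozone output above and works out how far the write got. The function name `summarize_iozone_failure` and the `SAMPLE` string are illustrative, not part of iozone or the test suite. The `kb_before_failure` figure assumes the reported block number equals the count of records written before the error, which the log does not confirm.

```python
import re


def summarize_iozone_failure(log: str) -> dict:
    """Pull file size, record size, and the failing block out of iozone output."""
    file_kb = int(re.search(r"File size set to (\d+) KB", log).group(1))
    record_kb = int(re.search(r"Record Size (\d+) KB", log).group(1))
    block = int(re.search(r"Error writing block (\d+)", log).group(1))
    total_records = file_kb // record_kb
    return {
        "total_records": total_records,
        "failed_block": block,
        # Assumption: the block number is the count of records already written.
        "kb_before_failure": block * record_kb,
        "fraction_done": block / total_records,
    }


# Lines copied from the log in this comment.
SAMPLE = """\
        File size set to 10485760 KB
        Record Size 1024 KB
Error writing block 10197, fd= 4
"""

summary = summarize_iozone_failure(SAMPLE)
print(summary)
```

Under that assumption, the write failed at record 10197 of 10240, more than 99% of the way through the 10 GB file inside the 20480 MB guest image.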

Actions #10

Updated by Josh Durgin about 11 years ago

  • Status changed from In Progress to 12
Actions #11

Updated by Sage Weil over 10 years ago

  • Assignee deleted (Josh Durgin)
Actions #12

Updated by Sage Weil over 10 years ago

  • Priority changed from High to Urgent
Actions #13

Updated by Sage Weil over 10 years ago

  • Assignee set to Sage Weil
Actions #14

Updated by Sage Weil over 10 years ago

  • Assignee changed from Sage Weil to Josh Durgin
Actions #15

Updated by Josh Durgin over 10 years ago

  • Status changed from 12 to Resolved

Retested with qemu 1.5 from the Havana cloud archive for Ubuntu and ceph 0.67.4, and it worked fine. I'm not sure exactly what fixed it, but my guess is the newer version of qemu (maybe the async flush).
