Bug #4204

closed

kclient: regression triggered by direct io path

Added by Sage Weil about 11 years ago. Updated about 11 years ago.

Status: Resolved
Priority: Urgent
Assignee:
Category: -
Target version: -
% Done: 0%
Source: Q/A
Tags:
Backport:
Regression:
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

This is probably related to recent changes in the testing branch (which fails); master passes.

2013-02-19T01:20:29.912 INFO:teuthology.task.workunit.client.0.out:writing pattern
2013-02-19T01:20:29.912 INFO:teuthology.task.workunit.client.0.out:read_direct buf_align 0 offset 4190208 len 1024
2013-02-19T01:20:29.912 INFO:teuthology.task.workunit.client.0.out:read_sync buf_align 0 offset 4190208 len 1024
2013-02-19T01:20:29.913 INFO:teuthology.task.workunit.client.0.out:read_direct buf_align 0 offset 4190208 len 2048
2013-02-19T01:20:29.913 INFO:teuthology.task.workunit.client.0.out:read_sync buf_align 0 offset 4190208 len 2048
2013-02-19T01:20:29.913 INFO:teuthology.task.workunit.client.0.out:read_direct buf_align 0 offset 4190208 len 4096
2013-02-19T01:20:29.913 INFO:teuthology.task.workunit.client.0.out:read_sync buf_align 0 offset 4190208 len 4096
2013-02-19T01:20:29.913 INFO:teuthology.task.workunit.client.0.out:read_direct buf_align 0 offset 4190208 len 8192
2013-02-19T01:20:29.914 INFO:teuthology.task.workunit.client.0.out:read_sync buf_align 0 offset 4190208 len 8192
2013-02-19T01:20:29.914 INFO:teuthology.task.workunit.client.0.out:read_direct buf_align 0 offset 4190208 len 16384
2013-02-19T01:20:29.914 INFO:teuthology.task.workunit.client.0.out:read_sync buf_align 0 offset 4190208 len 16384
2013-02-19T01:20:29.914 INFO:teuthology.task.workunit.client.0.out:read_direct buf_align 0 offset 4190720 len 1024
2013-02-19T01:20:29.915 INFO:teuthology.task.workunit.client.0.out:read_sync buf_align 0 offset 4190720 len 1024
2013-02-19T01:20:29.915 INFO:teuthology.task.workunit.client.0.out:read_direct buf_align 0 offset 4190720 len 2048
2013-02-19T01:20:29.915 INFO:teuthology.task.workunit.client.0.out:read_sync buf_align 0 offset 4190720 len 2048
2013-02-19T01:20:29.915 INFO:teuthology.task.workunit.client.0.out:read_direct buf_align 0 offset 4190720 len 4096
2013-02-19T01:20:29.915 INFO:teuthology.task.workunit.client.0.out:error: offset 4190720 had 4194304

the test is:

kernel:
  kdb: true
  branch: testing
nuke-on-error: true
overrides:
  ceph:
    conf:
      osd:
        osd op thread timeout: 60
    fs: btrfs
    log-whitelist:
    - slow request
roles:
- - mon.a
  - mon.c
  - osd.0
  - osd.1
  - osd.2
- - mon.b
  - mds.a
  - osd.3
  - osd.4
  - osd.5
- - client.0
tasks:
- chef: null
- clock: null
- ceph: null
- kclient: null
- workunit:
    clients:
      all:
      - direct_io

Actions #1

Updated by Alex Elder about 11 years ago

I'm certain it's because of this commit: 29c3c8b721a96b4a82f4224527c7103e5e910b80
Which was done for this issue: http://tracker.ceph.com/issues/4166

I'm just pulling that commit out and am going to abandon that
change completely. It was not well thought out and I didn't
test it adequately, and I'm ashamed of myself.

I'm testing the result now and will update the testing branch
accordingly after I've verified this is fixed.

Actions #2

Updated by Alex Elder about 11 years ago

I'm having a lot of trouble testing this.

I can write to files in a ceph file system, but direct I/O
writes fail with ENXIO ("No such device or address").

I can read from files in a ceph file system, but direct I/O
reads fail with the same error.

This happens with the testing branch, and with the master
branch as well. The results below are from the master branch.

root@plana92:/tmp/cephtest# dd if=/dev/zero of=/tmp/cephtest/mnt.0/foo bs=65536 count=16
16+0 records in
16+0 records out
1048576 bytes (1.0 MB) copied, 0.0593075 s, 17.7 MB/s
root@plana92:/tmp/cephtest# dd if=/tmp/cephtest/mnt.0/foo bs=65536 of=/tmp/zero
dd: reading `/tmp/cephtest/mnt.0/foo': No such device or address
16+0 records in
16+0 records out
1048576 bytes (1.0 MB) copied, 0.00306812 s, 342 MB/s
root@plana92:/tmp/cephtest# dd if=/dev/zero of=/tmp/cephtest/mnt.0/foo bs=65536 count=16 oflag=direct
dd: writing `/tmp/cephtest/mnt.0/foo': No such device or address
1+0 records in
0+0 records out
0 bytes (0 B) copied, 0.127708 s, 0.0 kB/s
root@plana92:/tmp/cephtest# dd if=/tmp/cephtest/mnt.0/foo bs=65536 of=/tmp/zero iflag=direct
dd: reading `/tmp/cephtest/mnt.0/foo': No such device or address
0+0 records in
0+0 records out
0 bytes (0 B) copied, 0.00112755 s, 0.0 kB/s
root@plana92:/tmp/cephtest#
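For anyone who wants to repeat the direct read without dd, the following is a minimal sketch of one O_DIRECT read in Python. It is not part of the direct_io workunit; the function name direct_read is hypothetical, and the 65536-byte length simply mirrors the bs=65536 used above. The buffer comes from an anonymous mmap so that it is page-aligned, which O_DIRECT generally requires.

```python
import errno
import mmap
import os


def direct_read(path, length=65536):
    """Read up to `length` bytes from the start of `path` using O_DIRECT.

    Returns (bytes_read, None) on success, or (0, errno_name) on failure.
    """
    # O_DIRECT is Linux-specific; fall back to a plain read where it is absent.
    flags = os.O_RDONLY | getattr(os, "O_DIRECT", 0)
    try:
        fd = os.open(path, flags)
    except OSError as e:
        return 0, errno.errorcode.get(e.errno, str(e.errno))

    buf = mmap.mmap(-1, length)  # anonymous mapping gives a page-aligned buffer
    try:
        return os.readv(fd, [buf]), None
    except OSError as e:
        return 0, errno.errorcode.get(e.errno, str(e.errno))
    finally:
        buf.close()
        os.close(fd)


if __name__ == "__main__":
    # Path taken from the dd session above; adjust to the local mount point.
    print(direct_read("/tmp/cephtest/mnt.0/foo"))
```

Returning the symbolic errno name makes it easy to tell ENXIO ("No such device or address") apart from ENODEV ("No such device"), which dd's message text alone can obscure.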

Actions #3

Updated by Sage Weil about 11 years ago

  • Status changed from 12 to Resolved

This is working now that Alex yanked that commit.
