Ceph : Issues
https://tracker.ceph.com/
2011-08-09T10:34:32Z
Ceph
Redmine
Ceph - Bug #1382 (Resolved): kclient: crash on resending osd ops
https://tracker.ceph.com/issues/1382
2011-08-09T10:34:32Z
Brian Chrisman
brchrisman@gmail.com
<p>Under performance testing with the SCST iSCSI driver on top of RBD (we'll switch to LIO at some point in the future, but probably a fair bit later), this crash occurs after a few hours of perf benchmarking.<br />I'm investigating whether we can run those same block io tests directly on top of RBD, but they may have Windows initiators (checking into that).</p>
<p>I'm attaching the kernel log, and unfortunately I turned off ceph logging (ms=0) for this run to make sure logging isn't affecting performance.</p>
<p>I can probably recreate with ms=1 if it'll help.</p>
<p>I've included a bunch of the kernel messages in the run-up to the crash (rsyslogd over udp), though most of it is SCST stuff.</p>
Ceph - Bug #1233 (Resolved): ceph -v reports incorrect version
https://tracker.ceph.com/issues/1233
2011-06-27T13:04:07Z
Brian Chrisman
brchrisman@gmail.com
<p>I'm not sure what the .git_version file is supposed to be, as I see no reference in git docs to such a file.<br />From newdream's origin/master as of today,<br />./autogen.sh<br />./configure --without-tcmalloc # for my environment<br />cd src<br />make ceph_ver.h<br />grep CEPH_GIT_NICE_VER ceph_ver.h<br />----<br />#define CEPH_GIT_NICE_VER "0.24.2-2867-g19614bb"</p>
CephFS - Bug #1206 (Closed): NFS reexport file creation lags 1-3 seconds
https://tracker.ceph.com/issues/1206
2011-06-20T17:02:49Z
Brian Chrisman
brchrisman@gmail.com
<p>I'm attaching the kernel logs and mds logs for the creation of a file called scale-product/testfoo3 via a touch running on a remote NFS client.<br />I've looked through the logs, but they're a little difficult to track.</p>
<p>There are a few other file accesses in these logs for other services we're using.<br />However, I think it unlikely that those are causing problems, as I can create files locally on the ceph kernel mount without experiencing this lag.</p>
Ceph - Bug #1194 (Resolved): kclient: NFS reexport does not survive ceph fs remount
https://tracker.ceph.com/issues/1194
2011-06-16T14:23:40Z
Brian Chrisman
brchrisman@gmail.com
<p>NFS doesn't survive restarts/remounts<br />Reproduce:<br />create new ceph fs<br />export via NFS<br />(on NFS client) nfs mount<br />(on NFS client) create a couple files/directories<br />stop nfs server<br />umount ceph filesystem<br />mount ceph filesystem<br />start nfs server<br />(on NFS client) access (cat) file in nfs mount</p>
<p>Result: Stale NFS file handle</p>
<p>If I perform a 'find' on the ceph filesystem before attempting access from the NFS client after the restart, there is no stale file handle.<br />The kernel client is still depending on something being in a cache.</p>
<p>ceph version is ~0.28 era.<br />kernel is 2.6.39, pulled within the last week from the ceph-client git repo.</p>
CephFS - Bug #1114 (Rejected): NFS export extreme slowdown
https://tracker.ceph.com/issues/1114
2011-05-26T15:29:19Z
Brian Chrisman
brchrisman@gmail.com
<p>Attached is debug mds 20 output.<br />Below is ceph -w output for the corresponding period.<br />Time synchronization is < 0.1s.<br />Master branch as of today, 5/26.<br />The use case is copying a 1GB+ source build tree to a ceph filesystem (kernel-client mounted) via an NFS export.<br />The slowdown consistently occurs after a roughly deterministic but unknown number of data/metadata operations in the copy.</p>
<ol>
<li>ceph -w<br />2011-05-26 22:02:02.672197 pg v63: 2376 pgs: 2376 active+clean; 1218 KB data, 31224 KB used, 18240 GB / 18265 GB avail<br />2011-05-26 22:02:02.684161 mds e6: 1/1/1 up {0=scale-192-168-98-110=up:active}, 2 up:standby<br />2011-05-26 22:02:02.684233 osd e14: 12 osds: 12 up, 12 in<br />2011-05-26 22:02:02.684444 log 2011-05-26 21:56:32.882616 mon2 192.168.98.111:6789/0 25 : [WRN] message from mon0 was stamped 0.013284s in the future, clocks not synchronized<br />2011-05-26 22:02:02.684995 mon e1: 3 mons at {0=192.168.98.109:6789/0,1=192.168.98.110:6789/0,2=192.168.98.111:6789/0}<br />2011-05-26 22:02:04.921748 pg v64: 2376 pgs: 2376 active+clean; 64692 KB data, 107 MB used, 18240 GB / 18265 GB avail<br />2011-05-26 22:02:05.984357 pg v65: 2376 pgs: 2376 active+clean; 155 MB data, 194 MB used, 18240 GB / 18265 GB avail<br />2011-05-26 22:02:09.535567 pg v66: 2376 pgs: 2376 active+clean; 159 MB data, 219 MB used, 18240 GB / 18265 GB avail<br />2011-05-26 22:02:11.052284 pg v67: 2376 pgs: 2376 active+clean; 165 MB data, 251 MB used, 18240 GB / 18265 GB avail<br />2011-05-26 22:02:14.489935 pg v68: 2376 pgs: 2376 active+clean; 165 MB data, 251 MB used, 18240 GB / 18265 GB avail<br />2011-05-26 22:02:16.112182 pg v69: 2376 pgs: 2376 active+clean; 165 MB data, 287 MB used, 18240 GB / 18265 GB avail<br />2011-05-26 22:02:19.476662 pg v70: 2376 pgs: 2376 active+clean; 165 MB data, 287 MB used, 18240 GB / 18265 GB avail<br />2011-05-26 22:02:24.578577 pg v71: 2376 pgs: 2376 active+clean; 165 MB data, 287 MB used, 18240 GB / 18265 GB avail<br />2011-05-26 22:02:26.277742 pg v72: 2376 pgs: 2376 active+clean; 165 MB data, 287 MB used, 18240 GB / 18265 GB avail<br />2011-05-26 22:02:29.656958 pg v73: 2376 pgs: 2376 active+clean; 165 MB data, 287 MB used, 18240 GB / 18265 GB avail<br />2011-05-26 22:02:34.767601 pg v74: 2376 pgs: 2376 active+clean; 166 MB data, 287 MB used, 18240 GB / 18265 GB avail<br />2011-05-26 22:02:36.344764 pg v75: 2376 pgs: 2376 
active+clean; 166 MB data, 297 MB used, 18240 GB / 18265 GB avail<br />2011-05-26 22:02:39.503245 pg v76: 2376 pgs: 2376 active+clean; 166 MB data, 297 MB used, 18240 GB / 18265 GB avail<br />2011-05-26 22:02:44.899181 pg v77: 2376 pgs: 2376 active+clean; 166 MB data, 297 MB used, 18240 GB / 18265 GB avail<br />2011-05-26 22:02:46.123144 pg v78: 2376 pgs: 2376 active+clean; 166 MB data, 313 MB used, 18240 GB / 18265 GB avail<br />2011-05-26 22:02:49.457926 pg v79: 2376 pgs: 2376 active+clean; 166 MB data, 313 MB used, 18240 GB / 18265 GB avail<br />2011-05-26 22:02:54.571233 pg v80: 2376 pgs: 2376 active+clean; 166 MB data, 313 MB used, 18240 GB / 18265 GB avail<br />2011-05-26 22:02:56.245927 pg v81: 2376 pgs: 2376 active+clean; 166 MB data, 313 MB used, 18240 GB / 18265 GB avail<br />2011-05-26 22:02:59.885295 pg v82: 2376 pgs: 2376 active+clean; 166 MB data, 313 MB used, 18240 GB / 18265 GB avail<br />2011-05-26 22:03:00.871756 pg v83: 2376 pgs: 2376 active+clean; 166 MB data, 313 MB used, 18240 GB / 18265 GB avail<br />2011-05-26 22:03:04.952345 pg v84: 2376 pgs: 2376 active+clean; 166 MB data, 313 MB used, 18240 GB / 18265 GB avail<br />2011-05-26 22:03:05.801877 pg v85: 2376 pgs: 2376 active+clean; 166 MB data, 325 MB used, 18240 GB / 18265 GB avail<br />2011-05-26 22:03:09.827020 pg v86: 2376 pgs: 2376 active+clean; 166 MB data, 326 MB used, 18240 GB / 18265 GB avail<br />2011-05-26 22:03:14.510075 pg v87: 2376 pgs: 2376 active+clean; 167 MB data, 333 MB used, 18240 GB / 18265 GB avail<br />2011-05-26 22:03:15.576846 pg v88: 2376 pgs: 2376 active+clean; 167 MB data, 349 MB used, 18240 GB / 18265 GB avail<br />2011-05-26 22:03:19.883004 pg v89: 2376 pgs: 2376 active+clean; 167 MB data, 349 MB used, 18240 GB / 18265 GB avail<br />2011-05-26 22:03:25.129790 pg v90: 2376 pgs: 2376 active+clean; 167 MB data, 349 MB used, 18240 GB / 18265 GB avail<br />2011-05-26 22:03:25.966880 pg v91: 2376 pgs: 2376 active+clean; 167 MB data, 369 MB used, 18240 GB / 18265 GB 
avail<br />2011-05-26 22:03:29.806945 pg v92: 2376 pgs: 2376 active+clean; 167 MB data, 369 MB used, 18240 GB / 18265 GB avail<br />2011-05-26 22:03:35.147184 pg v93: 2376 pgs: 2376 active+clean; 167 MB data, 369 MB used, 18240 GB / 18265 GB avail<br />2011-05-26 22:03:36.022737 pg v94: 2376 pgs: 2376 active+clean; 167 MB data, 369 MB used, 18240 GB / 18265 GB avail<br />2011-05-26 22:03:39.726807 pg v95: 2376 pgs: 2376 active+clean; 167 MB data, 369 MB used, 18240 GB / 18265 GB avail<br />2011-05-26 22:03:44.744657 pg v96: 2376 pgs: 2376 active+clean; 167 MB data, 369 MB used, 18240 GB / 18265 GB avail<br />2011-05-26 22:03:45.879049 pg v97: 2376 pgs: 2376 active+clean; 167 MB data, 369 MB used, 18240 GB / 18265 GB avail<br />2011-05-26 22:03:49.756106 pg v98: 2376 pgs: 2376 active+clean; 167 MB data, 369 MB used, 18240 GB / 18265 GB avail<br />2011-05-26 22:03:54.523294 pg v99: 2376 pgs: 2376 active+clean; 167 MB data, 369 MB used, 18240 GB / 18265 GB avail<br />2011-05-26 22:03:55.648484 pg v100: 2376 pgs: 2376 active+clean; 168 MB data, 372 MB used, 18240 GB / 18265 GB avail<br />2011-05-26 22:03:59.963840 pg v101: 2376 pgs: 2376 active+clean; 168 MB data, 372 MB used, 18240 GB / 18265 GB avail<br />2011-05-26 22:04:00.838651 pg v102: 2376 pgs: 2376 active+clean; 169 MB data, 374 MB used, 18240 GB / 18265 GB avail<br />2011-05-26 22:04:05.071747 pg v103: 2376 pgs: 2376 active+clean; 169 MB data, 374 MB used, 18240 GB / 18265 GB avail<br />2011-05-26 22:04:05.825393 pg v104: 2376 pgs: 2376 active+clean; 169 MB data, 374 MB used, 18240 GB / 18265 GB avail<br />2011-05-26 22:04:10.235816 pg v105: 2376 pgs: 2376 active+clean; 169 MB data, 374 MB used, 18240 GB / 18265 GB avail<br />2011-05-26 22:04:15.224835 pg v106: 2376 pgs: 2376 active+clean; 169 MB data, 374 MB used, 18240 GB / 18265 GB avail<br />2011-05-26 22:04:16.160427 pg v107: 2376 pgs: 2376 active+clean; 169 MB data, 374 MB used, 18240 GB / 18265 GB avail<br />2011-05-26 22:04:20.171603 pg v108: 2376 
pgs: 2376 active+clean; 169 MB data, 374 MB used, 18240 GB / 18265 GB avail<br />2011-05-26 22:04:20.943961 pg v109: 2376 pgs: 2376 active+clean; 169 MB data, 374 MB used, 18240 GB / 18265 GB avail<br />2011-05-26 22:04:25.211361 pg v110: 2376 pgs: 2376 active+clean; 169 MB data, 374 MB used, 18240 GB / 18265 GB avail<br />2011-05-26 22:04:25.932742 pg v111: 2376 pgs: 2376 active+clean; 169 MB data, 374 MB used, 18240 GB / 18265 GB avail<br />2011-05-26 22:04:29.591203 pg v112: 2376 pgs: 2376 active+clean; 169 MB data, 374 MB used, 18240 GB / 18265 GB avail<br />2011-05-26 22:04:35.053130 pg v113: 2376 pgs: 2376 active+clean; 170 MB data, 374 MB used, 18240 GB / 18265 GB avail<br />2011-05-26 22:04:36.558676 pg v114: 2376 pgs: 2376 active+clean; 170 MB data, 374 MB used, 18240 GB / 18265 GB avail<br />2011-05-26 22:04:39.625175 pg v115: 2376 pgs: 2376 active+clean; 170 MB data, 374 MB used, 18240 GB / 18265 GB avail<br />2011-05-26 22:04:45.228658 pg v116: 2376 pgs: 2376 active+clean; 170 MB data, 374 MB used, 18240 GB / 18265 GB avail<br />2011-05-26 22:04:46.104386 pg v117: 2376 pgs: 2376 active+clean; 170 MB data, 374 MB used, 18240 GB / 18265 GB avail<br />2011-05-26 22:04:50.041891 pg v118: 2376 pgs: 2376 active+clean; 170 MB data, 374 MB used, 18240 GB / 18265 GB avail<br />2011-05-26 22:04:54.680541 pg v119: 2376 pgs: 2376 active+clean; 170 MB data, 374 MB used, 18240 GB / 18265 GB avail<br />2011-05-26 22:04:55.706833 pg v120: 2376 pgs: 2376 active+clean; 170 MB data, 374 MB used, 18240 GB / 18265 GB avail<br />2011-05-26 22:04:59.685786 pg v121: 2376 pgs: 2376 active+clean; 170 MB data, 374 MB used, 18240 GB / 18265 GB avail<br />2011-05-26 22:05:01.258347 pg v122: 2376 pgs: 2376 active+clean; 170 MB data, 374 MB used, 18240 GB / 18265 GB avail<br />2011-05-26 22:05:04.961594 pg v123: 2376 pgs: 2376 active+clean; 170 MB data, 374 MB used, 18240 GB / 18265 GB avail<br />2011-05-26 22:05:05.924606 pg v124: 2376 pgs: 2376 active+clean; 170 MB data, 374 MB 
used, 18240 GB / 18265 GB avail<br />2011-05-26 22:05:10.130142 pg v125: 2376 pgs: 2376 active+clean; 170 MB data, 374 MB used, 18240 GB / 18265 GB avail<br />2011-05-26 22:05:14.674699 pg v126: 2376 pgs: 2376 active+clean; 170 MB data, 374 MB used, 18240 GB / 18265 GB avail<br />2011-05-26 22:05:15.663701 pg v127: 2376 pgs: 2376 active+clean; 170 MB data, 375 MB used, 18240 GB / 18265 GB avail<br />2011-05-26 22:05:20.208742 pg v128: 2376 pgs: 2376 active+clean; 170 MB data, 375 MB used, 18240 GB / 18265 GB avail<br />2011-05-26 22:05:25.133026 pg v129: 2376 pgs: 2376 active+clean; 170 MB data, 375 MB used, 18240 GB / 18265 GB avail<br />2011-05-26 22:05:35.009655 pg v130: 2376 pgs: 2376 active+clean; 170 MB data, 375 MB used, 18240 GB / 18265 GB avail<br />2011-05-26 22:05:35.956974 pg v131: 2376 pgs: 2376 active+clean; 170 MB data, 375 MB used, 18240 GB / 18265 GB avail<br />2011-05-26 22:05:55.059282 pg v132: 2376 pgs: 2376 active+clean; 170 MB data, 375 MB used, 18240 GB / 18265 GB avail<br />2011-05-26 22:05:55.850950 pg v133: 2376 pgs: 2376 active+clean; 170 MB data, 377 MB used, 18240 GB / 18265 GB avail<br />2011-05-26 22:06:04.488704 pg v134: 2376 pgs: 2376 active+clean; 170 MB data, 376 MB used, 18240 GB / 18265 GB avail</li>
</ol>
CephFS - Bug #1111 (Resolved): file lock requests in wait queue not getting cleaned up after proc...
https://tracker.ceph.com/issues/1111
2011-05-25T11:36:23Z
Brian Chrisman
brchrisman@gmail.com
<p>Our interpretation of events:<br />1) proc1 requests lock<br />2) proc1 receives lock<br />3) proc2 requests lock<br />4) proc2 gets on wait list<br />5) proc2 exits<br />6) proc1 releases lock<br />7) proc2 gets assigned the lock by the mds even though the process is gone</p>
<p>I will add our test code here soon</p>
CephFS - Bug #1087 (Resolved): userspace Client readdir_r failing
https://tracker.ceph.com/issues/1087
2011-05-13T07:24:08Z
Brian Chrisman
brchrisman@gmail.com
<p>I chased this down a fair way, but there's a lot to look through.<br />This log is output from testceph with client debug 20.<br />In particular, the program has already created and opened the directory 'readdir_r_test' and then opens the file 'opened_file' (search for 'ceph_open').<br />Then it calls 'readdir_r'.<br />Either there's a bug or I'm not calling the API correctly.<br />My patch to testceph.cc shows how I'm calling it in the samba vfs module.</p>
<p>2011-05-13 06:13:57.067364 7f04341ec720 client7107 open enter(readdir_r_test/opened_file, 64,438) = <br />2011-05-13 06:13:57.067371 7f04341ec720 client7107 path_walk readdir_r_test/opened_file<br />2011-05-13 06:13:57.067385 7f04341ec720 client7107 0 1.head( cap_refs={} open={} ref=3 caps=pAsLsXs mode=40755 mtime=2011-05-13 06:13:56.968093) readdir_r_test<br />2011-05-13 06:13:57.067393 7f04341ec720 client7107 _lookup have dn readdir_r_test mds-1 ttl 0.000000 seq 0<br />2011-05-13 06:13:57.067402 7f04341ec720 client7107 _lookup on #1/readdir_r_test<br />2011-05-13 06:13:57.067410 7f04341ec720 client7107 dir hash is 2 on readdir_r_test => 2444911409<br />2011-05-13 06:13:57.067417 7f04341ec720 client7107 choose_target_mds 0x7f0420001f40 is_hash=1 hash=2444911409<br />2011-05-13 06:13:57.067422 7f04341ec720 client7107 choose_target_mds from caps<br />2011-05-13 06:13:57.067427 7f04341ec720 client7107 mds is 0<br />2011-05-13 06:13:57.067433 7f04341ec720 client7107 send_request rebuilding request 29 for mds0<br />2011-05-13 06:13:57.067440 7f04341ec720 client7107 encode_cap_releases enter (req: 0x1338890, mds: 0)<br />2011-05-13 06:13:57.067447 7f04341ec720 client7107 send_request set sent_stamp to 2011-05-13 06:13:57.067445<br />2011-05-13 06:13:57.067455 7f04341ec720 client7107 send_request client_request(unknown0:29 lookup #1/readdir_r_test) v1 to mds0<br />2011-05-13 06:13:57.067467 7f04341ec720 client7107 awaiting reply|forward|kick on 0x7fff637de5e0<br />2011-05-13 06:13:57.067922 7f0430d89710 client7107 handle_client_reply got a reply. 
Safe:1 tid 29<br />2011-05-13 06:13:57.067938 7f0430d89710 client7107 insert_trace from 2011-05-13 06:13:57.067445 mds0 is_target=1 is_dentry=1<br />2011-05-13 06:13:57.067944 7f0430d89710 client7107 features 0x1fa<br />2011-05-13 06:13:57.067949 7f0430d89710 client7107 update_snap_trace len 48<br />2011-05-13 06:13:57.067957 7f0430d89710 client7107 update_snap_trace snaprealm(1 nref=4 c=0 seq=1 parent=0 my_snaps=[] cached_snapc=1=[]) seq 1 <= 1 and same parent, SKIPPING<br />2011-05-13 06:13:57.067963 7f0430d89710 client7107 hrm is_target=1 is_dentry=1<br />2011-05-13 06:13:57.067984 7f0430d89710 client7107 add_update_inode had 100000015cd.head( cap_refs={} open={} ref=2 caps=pAsxLsXsxFsx mode=40777 mtime=2011-05-13 06:13:57.007942 parent=0x1338f10) caps pAsxLsXsxFsx<br />2011-05-13 06:13:57.068006 7f0430d89710 client7107 add_update_cap issued pAsxLsXsxFsx -> pAsxLsXsxFsx from mds0 on 100000015cd.head( cap_refs={} open={} ref=2 caps=pAsxLsXsxFsx mode=40777 mtime=2011-05-13 06:13:57.007942 parent=0x1338f10)<br />2011-05-13 06:13:57.068021 7f0430d89710 client7107 add_update_inode had 1.head( cap_refs={} open={} ref=3 caps=pAsLsXs mode=40755 mtime=2011-05-13 06:13:56.968093) caps pAsLsXs<br />2011-05-13 06:13:57.068029 7f0430d89710 client7107 dir hash is 2<br />2011-05-13 06:13:57.068042 7f0430d89710 client7107 update_inode_file_bits 1.head( cap_refs={} open={} ref=3 caps=pAsLsXs mode=40755 mtime=2011-05-13 06:13:56.968093) pAsLsXs mtime 2011-05-13 06:13:56.968093<br />2011-05-13 06:13:57.068059 7f0430d89710 client7107 add_update_cap issued pAsLsXs -> pAsLsXs from mds0 on 1.head( cap_refs={} open={} ref=3 caps=pAsLsXs mode=40755 mtime=2011-05-13 06:13:56.968093)<br />2011-05-13 06:13:57.068066 7f0430d89710 client7107 got dirfrag map for 1 frag * to mds 0<br />2011-05-13 06:13:57.068073 7f0430d89710 client7107 insert_dentry_inode readdir_r_test vino 100000015cd.head in dir 1.head<br />2011-05-13 06:13:57.068079 7f0430d89710 client7107 had dentry readdir_r_test
with correct vino 100000015cd.head<br />2011-05-13 06:13:57.068086 7f0430d89710 client7107 handle_client_reply signalling caller 0x7fff637de5e0<br />2011-05-13 06:13:57.068097 7f0430d89710 client7107 handle_client_reply awaiting kickback on tid 29 0x7f0430d885d0<br />2011-05-13 06:13:57.068142 7f04341ec720 client7107 sendrecv kickback on tid 29 0x7f0430d885d0<br />2011-05-13 06:13:57.068151 7f04341ec720 client7107 lat 0.000705<br />2011-05-13 06:13:57.068161 7f04341ec720 client7107 _lookup res is 0<br />2011-05-13 06:13:57.068181 7f04341ec720 client7107 _lookup 1.head( cap_refs={} open={} ref=3 caps=pAsLsXs mode=40755 mtime=2011-05-13 06:13:56.968093) readdir_r_test = 100000015cd.head( cap_refs={} open={} ref=2 caps=pAsxLsXsxFsx mode=40777 mtime=2011-05-13 06:13:57.007942 parent=0x1338f10)<br />2011-05-13 06:13:57.068195 7f04341ec720 client7107 1 100000015cd.head( cap_refs={} open={} ref=2 caps=pAsxLsXsxFsx mode=40777 mtime=2011-05-13 06:13:57.007942 parent=0x1338f10) opened_file<br />2011-05-13 06:13:57.068206 7f04341ec720 client7107 _lookup on #100000015cd/opened_file<br />2011-05-13 06:13:57.068215 7f04341ec720 client7107 dir hash is 2 on opened_file => 433343845<br />2011-05-13 06:13:57.068221 7f04341ec720 client7107 choose_target_mds 0x7f0420002d50 is_hash=1 hash=433343845<br />2011-05-13 06:13:57.068226 7f04341ec720 client7107 choose_target_mds from caps<br />2011-05-13 06:13:57.068231 7f04341ec720 client7107 mds is 0<br />2011-05-13 06:13:57.068237 7f04341ec720 client7107 send_request rebuilding request 30 for mds0<br />2011-05-13 06:13:57.068244 7f04341ec720 client7107 encode_cap_releases enter (req: 0x1338b60, mds: 0)<br />2011-05-13 06:13:57.068251 7f04341ec720 client7107 send_request set sent_stamp to 2011-05-13 06:13:57.068249<br />2011-05-13 06:13:57.068259 7f04341ec720 client7107 send_request client_request(unknown0:30 lookup #100000015cd/opened_file) v1 to mds0<br />2011-05-13 06:13:57.068272 7f04341ec720 client7107 awaiting reply|forward|kick on
0x7fff637de5e0<br />2011-05-13 06:13:57.068614 7f0430d89710 client7107 handle_client_reply got a reply. Safe:1 tid 30<br />2011-05-13 06:13:57.068630 7f0430d89710 client7107 insert_trace from 2011-05-13 06:13:57.068249 mds0 is_target=0 is_dentry=1<br />2011-05-13 06:13:57.068635 7f0430d89710 client7107 features 0x1fa<br />2011-05-13 06:13:57.068640 7f0430d89710 client7107 update_snap_trace len 48<br />2011-05-13 06:13:57.068649 7f0430d89710 client7107 update_snap_trace snaprealm(1 nref=4 c=0 seq=1 parent=0 my_snaps=[] cached_snapc=1=[]) seq 1 <= 1 and same parent, SKIPPING<br />2011-05-13 06:13:57.068655 7f0430d89710 client7107 hrm is_target=0 is_dentry=1<br />2011-05-13 06:13:57.068675 7f0430d89710 client7107 add_update_inode had 100000015cd.head( cap_refs={} open={} ref=2 caps=pAsxLsXsxFsx mode=40777 mtime=2011-05-13 06:13:57.007942 parent=0x1338f10) caps pAsxLsXsxFsx<br />2011-05-13 06:13:57.068696 7f0430d89710 client7107 add_update_cap issued pAsxLsXsxFsx -> pAsxLsXsxFsx from mds0 on 100000015cd.head( cap_refs={} open={} ref=2 caps=pAsxLsXsxFsx mode=40777 mtime=2011-05-13 06:13:57.007942 parent=0x1338f10)<br />2011-05-13 06:13:57.068703 7f0430d89710 client7107 got dirfrag map for 100000015cd frag * to mds -1<br />2011-05-13 06:13:57.068710 7f0430d89710 client7107 handle_client_reply signalling caller 0x7fff637de5e0<br />2011-05-13 06:13:57.068720 7f0430d89710 client7107 handle_client_reply awaiting kickback on tid 30 0x7f0430d885d0<br />2011-05-13 06:13:57.068734 7f04341ec720 client7107 sendrecv kickback on tid 30 0x7f0430d885d0<br />2011-05-13 06:13:57.068743 7f04341ec720 client7107 lat 0.000492<br />2011-05-13 06:13:57.068750 7f04341ec720 client7107 _lookup res is -2<br />2011-05-13 06:13:57.068765 7f04341ec720 client7107 _lookup 100000015cd.head( cap_refs={} open={} ref=2 caps=pAsxLsXsxFsx mode=40777 mtime=2011-05-13 06:13:57.007942 parent=0x1338f10) opened_file = -2<br />2011-05-13 06:13:57.068775 7f04341ec720 client7107 path_walk readdir_r_test<br />2011-05-13 06:13:57.068786 7f04341ec720 client7107 0 1.head( cap_refs={} open={} ref=3 caps=pAsLsXs mode=40755 mtime=2011-05-13 06:13:56.968093) readdir_r_test<br />2011-05-13 06:13:57.068794 7f04341ec720 client7107 _lookup have dn readdir_r_test mds-1 ttl 0.000000 seq 0<br />2011-05-13 06:13:57.068803 7f04341ec720 client7107 _lookup on #1/readdir_r_test<br />2011-05-13 06:13:57.068811 7f04341ec720 client7107 dir hash is 2 on readdir_r_test => 2444911409<br />2011-05-13 06:13:57.068818 7f04341ec720 client7107 choose_target_mds 0x7f0420001f40 is_hash=1 hash=2444911409<br />2011-05-13 06:13:57.068823 7f04341ec720 client7107 choose_target_mds from caps<br />2011-05-13 06:13:57.068846 7f04341ec720 client7107 mds is 0<br />2011-05-13 06:13:57.068854 7f04341ec720 client7107 send_request rebuilding request 31 for mds0<br />2011-05-13 06:13:57.068861 7f04341ec720 client7107 encode_cap_releases enter (req: 0x1338890, mds: 0)<br />2011-05-13 06:13:57.068868 7f04341ec720 client7107 send_request set sent_stamp to 2011-05-13 06:13:57.068866<br />2011-05-13 06:13:57.068876 7f04341ec720 client7107 send_request client_request(unknown0:31 lookup #1/readdir_r_test) v1 to mds0<br />2011-05-13 06:13:57.068889 7f04341ec720 client7107 awaiting reply|forward|kick on 0x7fff637de5e0<br />2011-05-13 06:13:57.069274 7f0430d89710 client7107 handle_client_reply got a reply.
Safe:1 tid 31<br />2011-05-13 06:13:57.069290 7f0430d89710 client7107 insert_trace from 2011-05-13 06:13:57.068866 mds0 is_target=1 is_dentry=1<br />2011-05-13 06:13:57.069296 7f0430d89710 client7107 features 0x1fa<br />2011-05-13 06:13:57.069301 7f0430d89710 client7107 update_snap_trace len 48<br />2011-05-13 06:13:57.069309 7f0430d89710 client7107 update_snap_trace snaprealm(1 nref=4 c=0 seq=1 parent=0 my_snaps=[] cached_snapc=1=[]) seq 1 <= 1 and same parent, SKIPPING<br />2011-05-13 06:13:57.069316 7f0430d89710 client7107 hrm is_target=1 is_dentry=1<br />2011-05-13 06:13:57.069337 7f0430d89710 client7107 add_update_inode had 100000015cd.head( cap_refs={} open={} ref=2 caps=pAsxLsXsxFsx mode=40777 mtime=2011-05-13 06:13:57.007942 parent=0x1338f10) caps pAsxLsXsxFsx<br />2011-05-13 06:13:57.069359 7f0430d89710 client7107 add_update_cap issued pAsxLsXsxFsx -> pAsxLsXsxFsx from mds0 on 100000015cd.head( cap_refs={} open={} ref=2 caps=pAsxLsXsxFsx mode=40777 mtime=2011-05-13 06:13:57.007942 parent=0x1338f10)<br />2011-05-13 06:13:57.069373 7f0430d89710 client7107 add_update_inode had 1.head( cap_refs={} open={} ref=3 caps=pAsLsXs mode=40755 mtime=2011-05-13 06:13:56.968093) caps pAsLsXs<br />2011-05-13 06:13:57.069380 7f0430d89710 client7107 dir hash is 2<br />2011-05-13 06:13:57.069393 7f0430d89710 client7107 update_inode_file_bits 1.head( cap_refs={} open={} ref=3 caps=pAsLsXs mode=40755 mtime=2011-05-13 06:13:56.968093) pAsLsXs mtime 2011-05-13 06:13:56.968093<br />2011-05-13 06:13:57.069410 7f0430d89710 client7107 add_update_cap issued pAsLsXs -> pAsLsXs from mds0 on 1.head( cap_refs={} open={} ref=3 caps=pAsLsXs mode=40755 mtime=2011-05-13 06:13:56.968093)<br />2011-05-13 06:13:57.069417 7f0430d89710 client7107 got dirfrag map for 1 frag * to mds 0<br />2011-05-13 06:13:57.069424 7f0430d89710 client7107 insert_dentry_inode readdir_r_test vino 100000015cd.head in dir 1.head<br />2011-05-13 06:13:57.069430 7f0430d89710 client7107 had dentry readdir_r_test with
correct vino 100000015cd.head<br />2011-05-13 06:13:57.069437 7f0430d89710 client7107 handle_client_reply signalling caller 0x7fff637de5e0<br />2011-05-13 06:13:57.069446 7f0430d89710 client7107 handle_client_reply awaiting kickback on tid 31 0x7f0430d885d0<br />2011-05-13 06:13:57.069460 7f04341ec720 client7107 sendrecv kickback on tid 31 0x7f0430d885d0<br />2011-05-13 06:13:57.069469 7f04341ec720 client7107 lat 0.000601<br />2011-05-13 06:13:57.069477 7f04341ec720 client7107 _lookup res is 0<br />2011-05-13 06:13:57.069496 7f04341ec720 client7107 _lookup 1.head( cap_refs={} open={} ref=3 caps=pAsLsXs mode=40755 mtime=2011-05-13 06:13:56.968093) readdir_r_test = 100000015cd.head( cap_refs={} open={} ref=2 caps=pAsxLsXsxFsx mode=40777 mtime=2011-05-13 06:13:57.007942 parent=0x1338f10)<br />2011-05-13 06:13:57.069505 7f04341ec720 client7107 _create(100000015cd opened_file, 0666)<br />2011-05-13 06:13:57.069524 7f04341ec720 client7107 dir hash is 2 on opened_file => 433343845<br />2011-05-13 06:13:57.069532 7f04341ec720 client7107 choose_target_mds 0x7f0420002d50 is_hash=1 hash=433343845<br />2011-05-13 06:13:57.069537 7f04341ec720 client7107 choose_target_mds from caps<br />2011-05-13 06:13:57.069542 7f04341ec720 client7107 mds is 0<br />2011-05-13 06:13:57.069548 7f04341ec720 client7107 send_request rebuilding request 32 for mds0<br />2011-05-13 06:13:57.069555 7f04341ec720 client7107 encode_cap_releases enter (req: 0x1338b60, mds: 0)<br />2011-05-13 06:13:57.069560 7f04341ec720 client7107 encode_dentry_release enter(dn:0x1330ae0)<br />2011-05-13 06:13:57.069574 7f04341ec720 client7107 encode_inode_release enter(in:100000015cd.head( cap_refs={} open={} ref=3 caps=pAsxLsXsxFsx mode=40777 mtime=2011-05-13 06:13:57.007942 parent=0x1338f10), req:0x1338b60 mds:0, drop:256, unless:512, have:, force:1)<br />2011-05-13 06:13:57.069585 7f04341ec720 client7107 send_request set sent_stamp to 2011-05-13 06:13:57.069583<br />2011-05-13 06:13:57.069593 7f04341ec720 client7107
send_request client_request(unknown0:32 create #100000015cd/opened_file) v1 to mds0<br />2011-05-13 06:13:57.069606 7f04341ec720 client7107 awaiting reply|forward|kick on 0x7fff637de7c0<br />2011-05-13 06:13:57.070072 7f0430d89710 client7107 mds0 seq now 11<br />2011-05-13 06:13:57.070096 7f0430d89710 client7107 handle_cap_grant on in 100000015cd mds0 seq 7 caps now pAsLsXsxFsx was pAsxLsXsxFsx<br />2011-05-13 06:13:57.070116 7f0430d89710 client7107 update_inode_file_bits 100000015cd.head( cap_refs={} open={} ref=3 caps=pAsxLsXsxFsx mode=40777 mtime=2011-05-13 06:13:57.007942 parent=0x1338f10) pAsxLsXsxFsx mtime 2011-05-13 06:13:57.007942<br />2011-05-13 06:13:57.070125 7f0430d89710 client7107 revocation of Ax<br />2011-05-13 06:13:57.070139 7f0430d89710 client7107 check_caps on 100000015cd.head( cap_refs={} open={} ref=3 caps=pAsLsXsxFsx mode=40777 mtime=2011-05-13 06:13:57.007942 parent=0x1338f10) wanted - used - is_delayed=1<br />2011-05-13 06:13:57.070154 7f0430d89710 client7107 cap mds0 issued pAsLsXsxFsx implemented pAsxLsXsxFsx revoking Ax<br />2011-05-13 06:13:57.070161 7f0430d89710 client7107 completed revocation of Ax<br />2011-05-13 06:13:57.070167 7f0430d89710 client7107 flushing -<br />2011-05-13 06:13:57.070189 7f0430d89710 client7107 send_cap 100000015cd.head( cap_refs={} open={} ref=3 caps=pAsLsXsxFsx mode=40777 mtime=2011-05-13 06:13:57.007942 parent=0x1338f10) mds0 seq 7 used - want - flush - retain pAsLsXsFs held pAsxLsXsxFsx revoking Ax dropping XxFx<br />2011-05-13 06:13:57.073359 7f0430d89710 client7107 handle_client_reply got a reply.
Safe:0 tid 32<br />2011-05-13 06:13:57.073388 7f0430d89710 client7107 insert_trace from 2011-05-13 06:13:57.069583 mds0 is_target=1 is_dentry=1<br />2011-05-13 06:13:57.073394 7f0430d89710 client7107 features 0x1fa<br />2011-05-13 06:13:57.073400 7f0430d89710 client7107 update_snap_trace len 48<br />2011-05-13 06:13:57.073411 7f0430d89710 client7107 update_snap_trace snaprealm(1 nref=4 c=0 seq=1 parent=0 my_snaps=[] cached_snapc=1=[]) seq 1 <= 1 and same parent, SKIPPING<br />2011-05-13 06:13:57.073417 7f0430d89710 client7107 hrm is_target=1 is_dentry=1<br />2011-05-13 06:13:57.073443 7f0430d89710 client7107 add_update_inode adding 100000015ce.head( cap_refs={} open={} ref=0 caps=- mode=100000 mtime=0.000000) caps pAsxLsXsxFscr<br />2011-05-13 06:13:57.073455 7f0430d89710 client7107 update_inode_file_bits 100000015ce.head( cap_refs={} open={} ref=0 caps=- mode=100666 mtime=0.000000) - mtime 2011-05-13 06:13:57.015188<br />2011-05-13 06:13:57.073463 7f0430d89710 client7107 add_update_cap first one, opened snaprealm 0x7f0420006e60<br />2011-05-13 06:13:57.073482 7f0430d89710 client7107 add_update_cap issued - -> pAsxLsXsxFscr from mds0 on 100000015ce.head( cap_refs={} open={} ref=0 caps=pAsxLsXsxFscr mode=100666 mtime=2011-05-13 06:13:57.015188)<br />2011-05-13 06:13:57.073500 7f0430d89710 client7107 add_update_inode had 100000015cd.head( cap_refs={} open={} ref=3 caps=pAsLsXsFs mode=40777 mtime=2011-05-13 06:13:57.007942 parent=0x1338f10) caps pAsLsXsxFsx<br />2011-05-13 06:13:57.073506 7f0430d89710 client7107 dir hash is 2<br />2011-05-13 06:13:57.073521 7f0430d89710 client7107 update_inode_file_bits 100000015cd.head( cap_refs={} open={} ref=3 caps=pAsLsXsFs mode=40777 mtime=2011-05-13 06:13:57.007942 parent=0x1338f10) pAsLsXsFs mtime 2011-05-13 06:13:57.015188<br />2011-05-13 06:13:57.073541 7f0430d89710 client7107 add_update_cap issued pAsLsXsFs -> pAsLsXsxFsx from mds0 on 100000015cd.head( cap_refs={} open={} ref=3 caps=pAsLsXsxFsx mode=40777
mtime=2011-05-13 06:13:57.015188 parent=0x1338f10)<br />2011-05-13 06:13:57.073548 7f0430d89710 client7107 got dirfrag map for 100000015cd frag * to mds -1<br />2011-05-13 06:13:57.073558 7f0430d89710 client7107 insert_dentry_inode opened_file vino 100000015ce.head in dir 100000015cd.head<br />2011-05-13 06:13:57.073564 7f0430d89710 client7107 had vino 100000015ce.head unlinked, linking<br />2011-05-13 06:13:57.073570 7f0430d89710 client7107 setting dn offset to 2<br />2011-05-13 06:13:57.073577 7f0430d89710 client7107 handle_client_reply signalling caller 0x7fff637de7c0<br />2011-05-13 06:13:57.073587 7f0430d89710 client7107 handle_client_reply awaiting kickback on tid 32 0x7f0430d885d0<br />2011-05-13 06:13:57.073602 7f04341ec720 client7107 sendrecv kickback on tid 32 0x7f0430d885d0<br />2011-05-13 06:13:57.073614 7f04341ec720 client7107 lat 0.004029<br />2011-05-13 06:13:57.073628 7f04341ec720 client7107 _lookup have dn opened_file mds-1 ttl 0.000000 seq 0<br />2011-05-13 06:13:57.073652 7f04341ec720 client7107 _lookup 100000015cd.head( cap_refs={} open={} ref=3 caps=pAsLsXsxFsx mode=40777 mtime=2011-05-13 06:13:57.015188 parent=0x1338f10) opened_file = 100000015ce.head( cap_refs={} open={} ref=1 caps=pAsxLsXsxFscr mode=100666 mtime=2011-05-13 06:13:57.015188 parent=0x1330ae0)<br />2011-05-13 06:13:57.073664 7f04341ec720 client7107 _create_fh 100000015ce mode 1<br />2011-05-13 06:13:57.073672 7f04341ec720 client7107 create(#100000015cd/opened_file, 0666 layout 0 0 0 0 -1) = 0<br />2011-05-13 06:13:57.073682 7f04341ec720 client7107 open exit(readdir_r_test/opened_file, 64) = 11<br />ceph_open: success<br />2011-05-13 06:13:57.073727 7f04341ec720 client7107 readdir_r_cb 100000015cd.head( cap_refs={} open={} ref=3 caps=pAsLsXsxFsx mode=40777 mtime=2011-05-13 06:13:57.015188 parent=0x1338f10) offset 0 frag * fragpos 0 at_end=0<br />2011-05-13 06:13:57.073734 7f04341ec720 client7107 including .<br />2011-05-13 06:13:57.073749 7f04341ec720 client7107 fill_dirent '.'
-> 100000015cd type 4 w/ next_off 1<br />2011-05-13 06:13:57.073759 7f04341ec720 client7107 fill_stat on 100000015cd snap/devhead mode 040777 mtime 2011-05-13 06:13:57.015188 ctime 2011-05-13 06:13:57.015188<br />2011-05-13 06:13:57.073765 7f04341ec720 client7107 including ..<br />2011-05-13 06:13:57.073770 7f04341ec720 client7107 fill_dirent '..' -> 100000015cd type 4 w/ next_off 2<br />2011-05-13 06:13:57.073779 7f04341ec720 client7107 fill_stat on 100000015cd snap/devhead mode 040777 mtime 2011-05-13 06:13:57.015188 ctime 2011-05-13 06:13:57.015188<br />ceph_readdir_r: fail, returned: -1<br />2011-05-13 06:13:57.073802 7f04341ec720 client7107 unmounting<br />2011-05-13 06:13:57.073809 7f04341ec720 client7107 waiting on 2 requests<br />2011-05-13 06:13:57.096541 7f0430d89710 client7107 handle_client_reply got a reply. Safe:1 tid 27<br />2011-05-13 06:13:57.096568 7f0430d89710 client7107 insert_trace from 2011-05-13 06:13:57.064945 mds0 is_target=0 is_dentry=0<br />2011-05-13 06:13:57.096574 7f0430d89710 client7107 insert_trace -- no trace<br />2011-05-13 06:13:57.096590 7f0430d89710 client7107 unmounting: trim pass, size was 3+4<br />2011-05-13 06:13:57.096597 7f0430d89710 client7107 unmounting: trim pass, size still 3+4<br />2011-05-13 06:13:57.096609 7f0430d89710 client7107 dump_inode: inode 1 ref 3 dir 0x13338a0<br />2011-05-13 06:13:57.096614 7f0430d89710 client7107 dir size 2<br />2011-05-13 06:13:57.096620 7f0430d89710 client7107 dn barfile ref 0<br />2011-05-13 06:13:57.096625 7f0430d89710 client7107 dump_inode: inode 100000015cc ref 2 dir 0<br />2011-05-13 06:13:57.096631 7f0430d89710 client7107 dn readdir_r_test ref 1<br />2011-05-13 06:13:57.096637 7f0430d89710 client7107 dump_inode: inode 100000015cd ref 3 dir 0x1338e10<br />2011-05-13 06:13:57.096642 7f0430d89710 client7107 dir size 1<br />2011-05-13 06:13:57.096647 7f0430d89710 client7107 dn opened_file ref 0<br />2011-05-13 06:13:57.096652 7f0430d89710 client7107 dump_inode: inode 100000015ce
ref 2 dir 0<br />2011-05-13 06:13:57.096671 7f0430d89710 client7107 dump_cache: inode 100000015cd.head ref 3 dir 0x1338e10 100000015cd.head( cap_refs={} open={} ref=3 caps=pAsLsXsxFsx mode=40777 mtime=2011-05-13 06:13:57.015188 parent=0x1338f10)<br />2011-05-13 06:13:57.096677 7f0430d89710 client7107 dir size 1<br />2011-05-13 06:13:57.096689 7f0430d89710 client7107 dump_cache: inode 1.head ref 3 dir 0x13338a0 1.head( cap_refs={} open={} ref=3 caps=pAsLsXs mode=40755 mtime=2011-05-13 06:13:56.968093)<br />2011-05-13 06:13:57.096695 7f0430d89710 client7107 dir size 2<br />2011-05-13 06:13:57.096708 7f0430d89710 client7107 dump_cache: inode 100000015ce.head ref 2 dir 0 100000015ce.head( cap_refs={} open={1=1} ref=2 caps=pAsxLsXsxFscr mode=100666 mtime=2011-05-13 06:13:57.015188 parent=0x1330ae0)<br />2011-05-13 06:13:57.096722 7f0430d89710 client7107 dump_cache: inode 100000015cc.head ref 2 dir 0 100000015cc.head( cap_refs={} open={1=1} ref=2 caps=pAsxLsXsFscr mode=100666 mtime=2011-05-13 06:13:56.968093 parent=0x1333820)<br />2011-05-13 06:13:57.096736 7f04341ec720 client7107 waiting on 1 requests<br />2011-05-13 06:13:57.236599 7f0430d89710 client7107 handle_client_reply got a reply. 
Safe:1 tid 32<br />2011-05-13 06:13:57.236630 7f0430d89710 client7107 insert_trace from 2011-05-13 06:13:57.069583 mds0 is_target=0 is_dentry=0<br />2011-05-13 06:13:57.236636 7f0430d89710 client7107 insert_trace -- no trace<br />2011-05-13 06:13:57.236653 7f0430d89710 client7107 unmounting: trim pass, size was 3+4<br />2011-05-13 06:13:57.236660 7f0430d89710 client7107 unmounting: trim pass, size still 3+4<br />2011-05-13 06:13:57.236667 7f0430d89710 client7107 dump_inode: inode 1 ref 3 dir 0x13338a0<br />2011-05-13 06:13:57.236672 7f0430d89710 client7107 dir size 2<br />2011-05-13 06:13:57.236678 7f0430d89710 client7107 dn barfile ref 0<br />2011-05-13 06:13:57.236683 7f0430d89710 client7107 dump_inode: inode 100000015cc ref 2 dir 0<br />2011-05-13 06:13:57.236689 7f0430d89710 client7107 dn readdir_r_test ref 1<br />2011-05-13 06:13:57.236694 7f0430d89710 client7107 dump_inode: inode 100000015cd ref 3 dir 0x1338e10<br />2011-05-13 06:13:57.236700 7f0430d89710 client7107 dir size 1<br />2011-05-13 06:13:57.236705 7f0430d89710 client7107 dn opened_file ref 0<br />2011-05-13 06:13:57.236710 7f0430d89710 client7107 dump_inode: inode 100000015ce ref 2 dir 0<br />2011-05-13 06:13:57.236731 7f0430d89710 client7107 dump_cache: inode 100000015cd.head ref 3 dir 0x1338e10 100000015cd.head( cap_refs={} open={} ref=3 caps=pAsLsXsxFsx mode=40777 mtime=2011-05-13 06:13:57.015188 parent=0x1338f10)<br />2011-05-13 06:13:57.236737 7f0430d89710 client7107 dir size 1<br />2011-05-13 06:13:57.236749 7f0430d89710 client7107 dump_cache: inode 1.head ref 3 dir 0x13338a0 1.head( cap_refs={} open={} ref=3 caps=pAsLsXs mode=40755 mtime=2011-05-13 06:13:56.968093)<br />2011-05-13 06:13:57.236755 7f0430d89710 client7107 dir size 2<br />2011-05-13 06:13:57.236769 7f0430d89710 client7107 dump_cache: inode 100000015ce.head ref 2 dir 0 100000015ce.head( cap_refs={} open={1=1} ref=2 caps=pAsxLsXsxFscr mode=100666 mtime=2011-05-13 06:13:57.015188 parent=0x1330ae0)<br />2011-05-13 06:13:57.236783 
7f0430d89710 client7107 dump_cache: inode 100000015cc.head ref 2 dir 0 100000015cc.head( cap_refs={} open={1=1} ref=2 caps=pAsxLsXsFscr mode=100666 mtime=2011-05-13 06:13:56.968093 parent=0x1333820)<br />Warning: Some files were not closed prior to unmounting;<br />Ceph is closing them now.<br />2011-05-13 06:13:57.236848 7f04341ec720 client7107 _release 0x132f3a0 mode 1 on 100000015cc.head( cap_refs={} open={1=1} ref=2 caps=pAsxLsXsFscr mode=100666 mtime=2011-05-13 06:13:56.968093 parent=0x1333820)<br />2011-05-13 06:13:57.236871 7f04341ec720 client7107 check_caps on 100000015cc.head( cap_refs={} open={1=0} ref=2 caps=pAsxLsXsFscr mode=100666 mtime=2011-05-13 06:13:56.968093 parent=0x1333820) wanted - used - is_delayed=0<br />2011-05-13 06:13:57.236888 7f04341ec720 client7107 cap_delay_requeue on 100000015cc.head( cap_refs={} open={1=0} ref=2 caps=pAsxLsXsFscr mode=100666 mtime=2011-05-13 06:13:56.968093 parent=0x1333820)<br />2011-05-13 06:13:57.236903 7f04341ec720 client7107 cap mds0 issued pAsxLsXsFscr implemented pAsxLsXsFscr revoking -<br />2011-05-13 06:13:57.236910 7f04341ec720 client7107 flushing -<br />2011-05-13 06:13:57.236933 7f04341ec720 client7107 send_cap 100000015cc.head( cap_refs={} open={1=0} ref=2 caps=pAsxLsXsFscr mode=100666 mtime=2011-05-13 06:13:56.968093 parent=0x1333820) mds0 seq 8 used - want - flush - retain p held pAsxLsXsFscr revoking - dropping AsxLsXsFscr<br />2011-05-13 06:13:57.236967 7f04341ec720 client7107 _release 0x13381a0 mode 1 on 100000015ce.head( cap_refs={} open={1=1} ref=2 caps=pAsxLsXsxFscr mode=100666 mtime=2011-05-13 06:13:57.015188 parent=0x1330ae0)<br />2011-05-13 06:13:57.236982 7f04341ec720 client7107 check_caps on 100000015ce.head( cap_refs={} open={1=0} ref=2 caps=pAsxLsXsxFscr mode=100666 mtime=2011-05-13 06:13:57.015188 parent=0x1330ae0) wanted - used - is_delayed=0<br />2011-05-13 06:13:57.236997 7f04341ec720 client7107 cap_delay_requeue on 100000015ce.head( cap_refs={} open={1=0} ref=2 caps=pAsxLsXsxFscr 
mode=100666 mtime=2011-05-13 06:13:57.015188 parent=0x1330ae0)<br />2011-05-13 06:13:57.237012 7f04341ec720 client7107 cap mds0 issued pAsxLsXsxFscr implemented pAsxLsXsxFscr revoking -<br />2011-05-13 06:13:57.237019 7f04341ec720 client7107 flushing -<br />2011-05-13 06:13:57.237041 7f04341ec720 client7107 send_cap 100000015ce.head( cap_refs={} open={1=0} ref=2 caps=pAsxLsXsxFscr mode=100666 mtime=2011-05-13 06:13:57.015188 parent=0x1330ae0) mds0 seq 1 used - want - flush - retain p held pAsxLsXsxFscr revoking - dropping AsxLsXsxFscr<br />2011-05-13 06:13:57.237075 7f04341ec720 client7107 _ll_drop_pins<br />2011-05-13 06:13:57.237086 7f04341ec720 client7107 _ll_put 0x7f0420001f40 1 1 -> 0<br />2011-05-13 06:13:57.237092 7f04341ec720 client7107 trim_cache unlinking dn opened_file in dir 100000015cd<br />2011-05-13 06:13:57.237104 7f04341ec720 client7107 remove_cap mds0 on 100000015ce.head( cap_refs={} open={1=0} ref=0 caps=p mode=100666 mtime=2011-05-13 06:13:57.015188)<br />2011-05-13 06:13:57.237114 7f04341ec720 client7107 remove_cap last one, closing snaprealm 0x7f0420006e60<br />2011-05-13 06:13:57.237127 7f04341ec720 client7107 trim_cache unlinking dn barfile in dir 1<br />2011-05-13 06:13:57.237138 7f04341ec720 client7107 remove_cap mds0 on 100000015cc.head( cap_refs={} open={1=0} ref=0 caps=p mode=100666 mtime=2011-05-13 06:13:56.968093)<br />2011-05-13 06:13:57.237144 7f04341ec720 client7107 remove_cap last one, closing snaprealm 0x7f0420006e60<br />2011-05-13 06:13:57.237154 7f04341ec720 client7107 trim_cache unlinking dn readdir_r_test in dir 1<br />2011-05-13 06:13:57.237177 7f04341ec720 client7107 remove_cap mds0 on 1.head( cap_refs={} open={} ref=0 caps=pAsLsXs mode=40755 mtime=2011-05-13 06:13:56.968093)<br />2011-05-13 06:13:57.237185 7f04341ec720 client7107 remove_cap last one, closing snaprealm 0x7f0420006e60<br />2011-05-13 06:13:57.237214 7f04341ec720 client7107 _flush 100000015cd.head( cap_refs={1024=0} open={} ref=2 caps=pAsLsXsxFsx mode=40777 
mtime=2011-05-13 06:13:57.015188)<br />2011-05-13 06:13:57.237228 7f04341ec720 client7107 flush_caps<br />2011-05-13 06:13:57.237244 7f04341ec720 client7107 check_caps on 100000015cd.head( cap_refs={1024=0,8192=0} open={} ref=1 caps=pAsLsXsxFsx mode=40777 mtime=2011-05-13 06:13:57.015188) wanted - used - is_delayed=1<br />2011-05-13 06:13:57.237259 7f04341ec720 client7107 cap mds0 issued pAsLsXsxFsx implemented pAsLsXsxFsx revoking -<br />2011-05-13 06:13:57.237295 7f04341ec720 client7107 flushing -<br />2011-05-13 06:13:57.237318 7f04341ec720 client7107 send_cap 100000015cd.head( cap_refs={1024=0,8192=0} open={} ref=1 caps=pAsLsXsxFsx mode=40777 mtime=2011-05-13 06:13:57.015188) mds0 seq 8 used - want - flush - retain p held pAsLsXsxFsx revoking - dropping AsLsXsxFsx<br />2011-05-13 06:13:57.237339 7f04341ec720 client7107 wait_sync_caps want 0 (last is 0, 0 total flushing)<br />2011-05-13 06:13:57.237348 7f04341ec720 client7107 cache still has 0+1 items, waiting (for caps to release?)<br />2011-05-13 06:13:57.237359 7f04341ec720 client7107 dump_cache: inode 100000015cd.head ref 1 dir 0 100000015cd.head( cap_refs={1024=0,8192=0} open={} ref=1 caps=p mode=40777 mtime=2011-05-13 06:13:57.015188)<br />2011-05-13 06:14:57.015498 7f0430d89710 client7107 handle_client_session client_session(stale seq 11) v1<br />2011-05-13 06:14:57.015527 7f0430d89710 client7107 renew_caps mds0<br />2011-05-13 06:14:57.015550 7f0430d89710 client7107 unmounting: trim pass, size was 0+1<br />2011-05-13 06:14:57.015557 7f0430d89710 client7107 unmounting: trim pass, size still 0+1<br />2011-05-13 06:14:57.015574 7f0430d89710 client7107 dump_cache: inode 100000015cd.head ref 1 dir 0 100000015cd.head( cap_refs={1024=0,8192=0} open={} ref=1 caps=p mode=40777 mtime=2011-05-13 06:13:57.015188)<br />2011-05-13 06:14:57.015915 7f0430d89710 client7107 handle_client_session client_session(renewcaps seq 2) v1<br />2011-05-13 06:14:57.015934 7f0430d89710 client7107 unmounting: trim pass, size was 
0+1<br />2011-05-13 06:14:57.015940 7f0430d89710 client7107 unmounting: trim pass, size still 0+1<br />2011-05-13 06:14:57.015953 7f0430d89710 client7107 dump_cache: inode 100000015cd.head ref 1 dir 0 100000015cd.head( cap_refs={1024=0,8192=0} open={} ref=1 caps=p mode=40777 mtime=2011-05-13 06:13:57.015188)</p>
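<p>A minimal sketch (not Ceph code; the log format is taken from the excerpt above) of how these replies pair up: each tid first gets an unsafe reply (Safe:0) and later a safe, journaled one (Safe:1), and unmount blocks until no tid is missing its safe reply, which is what the "waiting on N requests" lines are counting down.</p>

```python
import re

# Hypothetical helper: scan client log lines for handle_client_reply
# messages and report which tids are still awaiting their "safe"
# (journaled) reply -- the requests that unmount blocks on.
REPLY_RE = re.compile(r"handle_client_reply got a reply\. Safe:(\d) tid (\d+)")

def pending_safe_tids(lines):
    unsafe, safe = set(), set()
    for line in lines:
        m = REPLY_RE.search(line)
        if not m:
            continue
        is_safe, tid = int(m.group(1)), int(m.group(2))
        (safe if is_safe else unsafe).add(tid)
    return unsafe - safe  # tids acked but not yet journaled

log = [
    "client7107 handle_client_reply got a reply. Safe:0 tid 27",
    "client7107 handle_client_reply got a reply. Safe:0 tid 32",
    "client7107 handle_client_reply got a reply. Safe:1 tid 27",
]
print(pending_safe_tids(log))  # tid 32 still lacks its safe reply
```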
Ceph - Bug #1072 (Won't Fix): cfuse mount fails when monitors have incorrect time synchronization
https://tracker.ceph.com/issues/1072
2011-05-06T16:10:59Z
Brian Chrisman
brchrisman@gmail.com
<p>I've run into this a few times, but this time was the most obvious.<br />client debugging:<br />2011-05-06 21:40:11.448803 7f9a74e4a720 monclient(hunting): authenticate timed out after 30<br />cfuse fails to mount after that.</p>
<p>ceph -s reports a warning about the lack of time synchronization, but at that point we should probably return an error on mount rather than time out.</p>
<p>When I sync my clocks on all nodes, it mounts straight away.</p>
<p>client and mon logging:<br /><a class="external" href="http://pastebin.com/2q4MLdpS">http://pastebin.com/2q4MLdpS</a><br /><a class="external" href="http://pastebin.com/vRLtraX8">http://pastebin.com/vRLtraX8</a></p>
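<p>For illustration, a pure-Python sketch of the check the monitors are effectively applying: authentication stalls when monitor clocks disagree by more than an allowed drift, so cfuse just hunts until its 30s timeout. The 0.05s allowance below is an assumption standing in for the monitor's drift setting, not a value taken from this report.</p>

```python
# Sketch, not Ceph source: compare monitor clock samples against an
# assumed allowed drift. A skew beyond the allowance is what leaves the
# client "hunting" instead of mounting.
def max_skew(mon_times):
    return max(mon_times) - min(mon_times)

def clocks_ok(mon_times, drift_allowed=0.05):
    return max_skew(mon_times) <= drift_allowed

# Three monitors, one of them 2 seconds ahead: mount hangs instead of failing fast.
print(clocks_ok([100.00, 100.01, 102.00]))  # False
print(clocks_ok([100.00, 100.01, 100.02]))  # True
```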
Ceph - Bug #989 (Resolved): rstats and fragstat messages in mds logs
https://tracker.ceph.com/issues/989
2011-04-06T16:23:57Z
Brian Chrisman
brchrisman@gmail.com
<p>I'd like to track down what's going wrong here. These messages generate about 500MB of mds log file while I'm copying around 4.1GB of files onto the cluster (kernel client).<br />I'm trying to get our IT group to set up a DMZ here to allow remote access.</p>
<p>2011-04-06 22:01:47.648612 waiting_locks -- <br />2011-04-06 22:01:47.648630 2011-04-06 22:01:47.648863 7f6cd6122710 mds0.server handle_client_file_setlock: start: 0, length: 0, client: 4113, pid: 763, type: 4<br />2011-04-06 22:01:47.648879 2011-04-06 22:01:47.648889 7f6cd6122710 mds0.server state prior to lock change: ceph_lock_state_t. held_locks.size()=1, waiting_locks.size()=0, client_held_lock_counts -- {4113=1}<br /> client_waiting_lock_counts -- {}<br /> held_locks -- start: 1073741826, length: 510, client: 4113, pid: 763, type: 1<br />2011-04-06 22:01:47.648916 waiting_locks -- <br />2011-04-06 22:01:47.648922 2011-04-06 22:01:47.648928 7f6cd6122710 mds0.server got unlock<br />2011-04-06 22:01:47.648968 7f6cd6122710 mds0.server state after lock change: ceph_lock_state_t. held_locks.size()=0, waiting_locks.size()=0, client_held_lock_counts -- {}<br /> client_waiting_lock_counts -- {}<br /> held_locks -- <br /> waiting_locks -- <br />2011-04-06 22:01:47.648978 2011-04-06 22:01:48.947122 7f6cd6122710 mds0.cache.dir(602) mismatch between head items and fnode.fragstat! 
printing dentries<br />2011-04-06 22:01:48.947188 7f6cd6122710 mds0.cache.dir(602) [dentry #100/stray2/1000000012c [2,head] auth (dversion lock) v=2411 inode=0x7f6cd85dc910 state=new | inodepin dirty 0x7f6cc8346348]<br />2011-04-06 22:01:48.947200 7f6cd6122710 mds0.cache.dir(602) [dentry #100/stray2/10000000134 [2,head] auth (dversion lock) v=2413 inode=0x7f6cd85e4980 state=new | inodepin dirty 0x7f6cc8345838]<br />2011-04-06 22:01:48.947211 7f6cd6122710 mds0.cache.dir(602) [dentry #100/stray2/10000000135 [2,head] auth (dversion lock) v=2418 inode=0x7f6cd85de2c0 state=new | inodepin dirty 0x7f6cc8345be8]<br />2011-04-06 22:01:48.947223 7f6cd6122710 mds0.cache.dir(602) [dentry #100/stray2/10000000136 [2,head] auth (dversion lock) pv=2420 v=2414 ap=0+1 inode=0x7f6cd85deb50 state=new | inodepin dirty 0x7f6cc8343ab8]<br />2011-04-06 22:01:48.947234 7f6cd6122710 mds0.cache.dir(602) [dentry #100/stray2/10000000137 [2,head] auth (dversion lock) v=2415 inode=0x7f6cd863a2f0 state=new | inodepin dirty 0x7f6cc83422c0]<br />2011-04-06 22:01:48.947245 7f6cd6122710 mds0.cache.dir(602) [dentry #100/stray2/10000000138 [2,head] auth (dversion lock) v=2417 inode=0x7f6cd85d48a0 state=new | inodepin dirty 0x7f6cc8341b60]<br />2011-04-06 22:01:48.947253 7f6cd6122710 mds0.cache.dir(602) mismatch between child accounted_rstats and my rstats!<br />2011-04-06 22:01:48.947273 7f6cd6122710 mds0.cache.dir(602) [dentry #100/stray2/1000000012c [2,head] auth (dversion lock) v=2411 inode=0x7f6cd85dc910 state=new | inodepin dirty 0x7f6cc8346348] n(v0 1=1+0)<br />2011-04-06 22:01:48.947285 7f6cd6122710 mds0.cache.dir(602) [dentry #100/stray2/10000000134 [2,head] auth (dversion lock) v=2413 inode=0x7f6cd85e4980 state=new | inodepin dirty 0x7f6cc8345838] n(v0 b153 1=1+0)<br />2011-04-06 22:01:48.947296 7f6cd6122710 mds0.cache.dir(602) [dentry #100/stray2/10000000135 [2,head] auth (dversion lock) v=2418 inode=0x7f6cd85de2c0 state=new | inodepin dirty 0x7f6cc8345be8] n(v0 1=1+0)<br />2011-04-06 
22:01:48.947308 7f6cd6122710 mds0.cache.dir(602) [dentry #100/stray2/10000000136 [2,head] auth (dversion lock) pv=2420 v=2414 ap=0+1 inode=0x7f6cd85deb50 state=new | inodepin dirty 0x7f6cc8343ab8] n(v0 1=1+0)<br />2011-04-06 22:01:48.947319 7f6cd6122710 mds0.cache.dir(602) [dentry #100/stray2/10000000137 [2,head] auth (dversion lock) v=2415 inode=0x7f6cd863a2f0 state=new | inodepin dirty 0x7f6cc83422c0] n(v0 1=1+0)<br />2011-04-06 22:01:48.947330 7f6cd6122710 mds0.cache.dir(602) [dentry #100/stray2/10000000138 [2,head] auth (dversion lock) v=2417 inode=0x7f6cd85d48a0 state=new | inodepin dirty 0x7f6cc8341b60] n(v0 1=1+0)<br />2011-04-06 22:01:48.947431 7f6cd6122710 mds0.cache.dir(602) mismatch between head items and fnode.fragstat! printing dentries<br />2011-04-06 22:01:48.947448 7f6cd6122710 mds0.cache.dir(602) [dentry #100/stray2/1000000012c [2,head] auth (dversion lock) v=2411 inode=0x7f6cd85dc910 state=new | inodepin dirty 0x7f6cc8346348]<br />2011-04-06 22:01:48.947459 7f6cd6122710 mds0.cache.dir(602) [dentry #100/stray2/10000000134 [2,head] auth (dversion lock) v=2413 inode=0x7f6cd85e4980 state=new | inodepin dirty 0x7f6cc8345838]<br />2011-04-06 22:01:48.947470 7f6cd6122710 mds0.cache.dir(602) [dentry #100/stray2/10000000135 [2,head] auth (dversion lock) v=2418 inode=0x7f6cd85de2c0 state=new | inodepin dirty 0x7f6cc8345be8]<br />2011-04-06 22:01:48.947481 7f6cd6122710 mds0.cache.dir(602) [dentry #100/stray2/10000000136 [2,head] auth (dversion lock) pv=2420 v=2414 ap=0+1 inode=0x7f6cd85deb50 state=new | inodepin dirty 0x7f6cc8343ab8]<br />2011-04-06 22:01:48.947521 7f6cd6122710 mds0.cache.dir(602) [dentry #100/stray2/10000000137 [2,head] auth (dversion lock) pv=2422 v=2415 ap=0+1 inode=0x7f6cd863a2f0 state=new | inodepin dirty 0x7f6cc83422c0]<br />2011-04-06 22:01:48.947532 7f6cd6122710 mds0.cache.dir(602) [dentry #100/stray2/10000000138 [2,head] auth (dversion lock) v=2417 inode=0x7f6cd85d48a0 state=new | inodepin dirty 0x7f6cc8341b60]<br />2011-04-06 
22:01:48.947539 7f6cd6122710 mds0.cache.dir(602) mismatch between child accounted_rstats and my rstats!</p>
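<p>A toy model (deliberately far simpler than the real MDS structures) of the two consistency checks these messages come from: a dirfrag's fragstat should agree with a count of its head dentries, and its rstat should equal the sum of the children's accounted rstats. All names and numbers below are illustrative.</p>

```python
# Simplified analogue of the MDS checks behind "mismatch between head items
# and fnode.fragstat!" and "mismatch between child accounted_rstats and my
# rstats!". Not actual mds code.
def fragstat_mismatch(dentries, fragstat_nitems):
    # count linked head dentries, as the dentry dump above does
    head_items = sum(1 for d in dentries if d["head"] and d["inode"] is not None)
    return head_items != fragstat_nitems

def rstat_mismatch(child_rbytes, dir_rbytes):
    # the dir's recursive byte count should equal the sum over children
    return sum(child_rbytes) != dir_rbytes

dentries = [{"head": True, "inode": 0x1}, {"head": True, "inode": 0x2}]
print(fragstat_mismatch(dentries, 6))    # True: counts disagree, log fires
print(rstat_mismatch([0, 153, 0], 153))  # False: accounted bytes agree
```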
Ceph - Bug #931 (Resolved): opening up security. xattr
https://tracker.ceph.com/issues/931
2011-03-24T12:44:58Z
Brian Chrisman
brchrisman@gmail.com
<p>Working on samba integration. In order to use the samba module which stores NT ACLs in xattrs, the security namespace/prefix needs to be opened up.<br />While it looks like 'security.NTACL' is most immediately necessary, other samba modules use other elements within the 'security.' namespace.</p>
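<p>A hypothetical sketch of the prefix whitelist being requested here: admit the security. namespace alongside user. so that Samba's ACL-in-xattr module can store security.NTACL. The function and constant names are illustrative, not actual Ceph client code.</p>

```python
# Illustrative xattr-name filter. The request in this bug amounts to adding
# "security." to whatever prefixes the client already passes through.
ALLOWED_PREFIXES = ("user.", "security.")

def xattr_allowed(name):
    return name.startswith(ALLOWED_PREFIXES)

print(xattr_allowed("security.NTACL"))  # True once security. is opened up
print(xattr_allowed("trusted.foo"))     # False
```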
CephFS - Bug #930 (Resolved): libceph not exporting getattr
https://tracker.ceph.com/issues/930
2011-03-24T12:39:27Z
Brian Chrisman
brchrisman@gmail.com
<p>I'm looking at this to include in a samba vfs module.<br />Looks trivial to add, so I'll just plan on it being implemented.</p>
Ceph - Bug #917 (Resolved): OSD sending out-of-order acks
https://tracker.ceph.com/issues/917
2011-03-22T18:00:52Z
Brian Chrisman
brchrisman@gmail.com
<p>This was detected by a cfuse ObjectCacher assert.<br />The relevant messages from the OSD are:<br /><pre>2011-03-24 20:30:19.738078 7f5fe9c4c710 -- 192.168.98.110:0/14840 <== osd1 192.168.98.109:6804/31545 4114 ==== osd_op_reply(24066 10000005dcb.00000009 [write 20480~20480] ack = 0) v1 ==== 106+0+0 (366877589 0 0) 0x7f5fd99a41a0 con 0x1014d70
2011-03-24 20:30:19.738282 7f5fe9c4c710 -- 192.168.98.110:0/14840 <== osd1 192.168.98.109:6804/31545 4115 ==== osd_op_reply(24068 10000005dcb.00000009 [write 315392~16384] ack = 0) v1 ==== 106+0+0 (2786375306 0 0) 0x7f5fd9486410 con 0x1014d70
2011-03-24 20:30:19.738433 7f5fe9c4c710 -- 192.168.98.110:0/14840 <== osd1 192.168.98.109:6804/31545 4116 ==== osd_op_reply(24069 10000005dcb.00000009 [write 339968~8192] ack = 0) v1 ==== 106+0+0 (223975158 0 0) 0x7f5fd920c8c0 con 0x1014d70
2011-03-24 20:30:19.743467 7f5fe9c4c710 -- 192.168.98.110:0/14840 <== osd1 192.168.98.109:6804/31545 4117 ==== osd_op_reply(24066 10000005dcb.00000009 [write 20480~20480] ondisk = 0) v1 ==== 106+0+0 (4070484477 0 0) 0x7f5fd99a41a0 con 0x1014d70
2011-03-24 20:30:19.743765 7f5fe9c4c710 -- 192.168.98.110:0/14840 <== osd1 192.168.98.109:6804/31545 4118 ==== osd_op_reply(24067 10000005dcb.00000009 [write 303104~8192] ondisk = 0) v1 ==== 106+0+0 (1211671729 0 0) 0x7f5fd9486410 con 0x1014d70</pre><br />Each of these is a write operation.</p>
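<p>The ordering invariant the client relies on can be sketched as a small checker (a simplified analogue of the ObjectCacher's assert(ob->last_ack_tid &lt; tid); the parsing is ad hoc, and the sample replies below are synthetic, rearranged from the excerpt to show a violation):</p>

```python
import re

# For each object, ack replies must arrive with strictly increasing tids;
# an ack whose tid is <= the last acked tid is where the client asserts.
OP_RE = re.compile(r"osd_op_reply\((\d+) (\S+) \[[^\]]*\] (ack|ondisk)")

def out_of_order_acks(lines):
    last_ack = {}  # object name -> highest acked tid seen so far
    bad = []
    for line in lines:
        m = OP_RE.search(line)
        if not m:
            continue
        tid, obj, kind = int(m.group(1)), m.group(2), m.group(3)
        if kind != "ack":
            continue  # ondisk replies are tracked separately
        if obj in last_ack and tid <= last_ack[obj]:
            bad.append(tid)
        last_ack[obj] = max(last_ack.get(obj, 0), tid)
    return bad

log = [
    "osd_op_reply(24066 10000005dcb.00000009 [write 20480~20480] ack = 0)",
    "osd_op_reply(24068 10000005dcb.00000009 [write 315392~16384] ack = 0)",
    "osd_op_reply(24067 10000005dcb.00000009 [write 303104~8192] ack = 0)",
]
print(out_of_order_acks(log))  # [24067]
```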
<p>Here's the full debug log for the core dump we're seeing from cfuse.<br />This is on v0.25.1.<br />I can reproduce this if required.</p>
<pre>[write 1204224~57344] ack = 0) v1 ==== 106+0+0 (462844183 0 0)
0x7fc19c0aece0 con 0x238ed80
osdc/ObjectCacher.cc: In function 'void
ObjectCacher::bh_write_ack(int, sobject_t, loff_t, uint64_t, tid_t)',
in thread '0x7fc1ced85710'
osdc/ObjectCacher.cc: 673: FAILED assert(ob->last_ack_tid < tid)
ceph version 0.25.1 (commit:7f4a161e7fd1bb0d0eab3958b427a3c4c585bc05)
1: (ObjectCacher::bh_write_ack(int, sobject_t, long, unsigned long,
unsigned long)+0x6b0) [0x4ef5a0]
2: (ObjectCacher::C_WriteAck::finish(int)+0x5a) [0x4fe30a]
3: (Objecter::handle_osd_op_reply(MOSDOpReply*)+0x93c) [0x4d0d6c]
4: (Client::ms_dispatch(Message*)+0x143) [0x4a4e43]
5: (SimpleMessenger::dispatch_entry()+0x882) [0x454b32]
6: (SimpleMessenger::DispatchThread::entry()+0x1c) [0x44d0ac]
7: (()+0x77e1) [0x7fc1d21b77e1]
8: (clone()+0x6d) [0x7fc1d10a351d]
ceph version 0.25.1 (commit:7f4a161e7fd1bb0d0eab3958b427a3c4c585bc05)
1: (ObjectCacher::bh_write_ack(int, sobject_t, long, unsigned long,
unsigned long)+0x6b0) [0x4ef5a0]
2: (ObjectCacher::C_WriteAck::finish(int)+0x5a) [0x4fe30a]
3: (Objecter::handle_osd_op_reply(MOSDOpReply*)+0x93c) [0x4d0d6c]
4: (Client::ms_dispatch(Message*)+0x143) [0x4a4e43]
5: (SimpleMessenger::dispatch_entry()+0x882) [0x454b32]
6: (SimpleMessenger::DispatchThread::entry()+0x1c) [0x44d0ac]
7: (()+0x77e1) [0x7fc1d21b77e1]
8: (clone()+0x6d) [0x7fc1d10a351d]
*** Caught signal (Aborted) **
in thread 0x7fc1ced85710
ceph version 0.25.1 (commit:7f4a161e7fd1bb0d0eab3958b427a3c4c585bc05)
1: cfuse() [0x553dcc]
2: (()+0xf4c0) [0x7fc1d21bf4c0]
3: (gsignal()+0x35) [0x7fc1d0ff49b5]
4: (abort()+0x175) [0x7fc1d0ff6195]
5: (__gnu_cxx::__verbose_terminate_handler()+0x12d) [0x7fc1d1899aad]
6: (()+0xbcc36) [0x7fc1d1897c36]
7: (()+0xbcc63) [0x7fc1d1897c63]
8: (()+0xbcd5e) [0x7fc1d1897d5e]
9: (ceph::__ceph_assert_fail(char const*, char const*, int, char
const*)+0x373) [0x5400c3]
10: (ObjectCacher::bh_write_ack(int, sobject_t, long, unsigned long,
unsigned long)+0x6b0) [0x4ef5a0]
11: (ObjectCacher::C_WriteAck::finish(int)+0x5a) [0x4fe30a]
12: (Objecter::handle_osd_op_reply(MOSDOpReply*)+0x93c) [0x4d0d6c]
13: (Client::ms_dispatch(Message*)+0x143) [0x4a4e43]
14: (SimpleMessenger::dispatch_entry()+0x882) [0x454b32]
15: (SimpleMessenger::DispatchThread::entry()+0x1c) [0x44d0ac]
16: (()+0x77e1) [0x7fc1d21b77e1]
17: (clone()+0x6d) [0x7fc1d10a351d]</pre>
CephFS - Bug #854 (Duplicate): unsynchronized clocks between kernel-client/cmds cause PJD fstest ...
https://tracker.ceph.com/issues/854
2011-03-02T14:32:43Z
Brian Chrisman
brchrisman@gmail.com
<p>I'm seeing a varied number (generally 5-8) of POSIX tests within the PJD fstest suite failing when the tests are run atop the ceph kernel client on a node whose clock is not synchronized with the node hosting the active MDS.<br />Synchronizing the clocks across the cluster with ntpdate/xntpd returns the PJD fstests to full success.</p>
<p>This is not likely critical because:<br />- the failures are small corner cases (unexpected ctimes during operations like lchown of a symlink, for example)<br />- the workaround of keeping clocks synchronized is reasonable<br />- it may be a known design issue (the mds creating its own timestamp instead of taking the client's timestamp?)</p>
<p>Here's a histogram of the test failures (out of 21 repeat runs)<br />(count) (test number) (status) (line number) (test filename)</p>
<pre><code>19 102:fail:[218 /opt/scale/lib/pjdfstests/tests/chown/00.t]<br /> 21 112:fail:[236 /opt/scale/lib/pjdfstests/tests/chown/00.t]<br /> 13 141:fail:[287 /opt/scale/lib/pjdfstests/tests/chown/00.t]<br /> 17 145:fail:[302 /opt/scale/lib/pjdfstests/tests/chown/00.t]<br /> 21 153:fail:[332 /opt/scale/lib/pjdfstests/tests/chown/00.t]<br /> 13 27:fail:[70 /opt/scale/lib/pjdfstests/tests/chmod/00.t]<br /> 18 31:fail:[78 /opt/scale/lib/pjdfstests/tests/chmod/00.t]<br /> 11 97:fail:[209 /opt/scale/lib/pjdfstests/tests/chown/00.t]</code></pre>
<p>Different tests will fail in different runs... a few (with 21/21) fail consistently.</p>
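<p>For readability, the histogram can be reduced to per-test failure rates (a quick calculation over the numbers quoted above; nothing here beyond the reported counts and the 21 runs):</p>

```python
# Counts copied from the histogram above: test number -> failing runs.
RUNS = 21
fail_counts = {102: 19, 112: 21, 141: 13, 145: 17, 153: 21, 27: 13, 31: 18, 97: 11}

always_fail = sorted(t for t, n in fail_counts.items() if n == RUNS)
rates = {t: round(n / RUNS, 2) for t, n in fail_counts.items()}

print(always_fail)  # tests that fail in every run
print(rates[97])    # the flakiest failure, just over half the runs
```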
Ceph - Bug #735 (Resolved): Manual drive pull testing hangs filesystem
https://tracker.ceph.com/issues/735
2011-01-24T12:31:00Z
Brian Chrisman
brchrisman@gmail.com
<p>It appears that drive failure problems in my configuration are not making their way up through the stack to kill off OSDs.<br />Setup:<br />- journal and data were two partitions on the same drive (for error isolation)<br />- 3 node cluster<br />- 4 SATA disks per node, cosd-per-disk config<br />- one partition from each disk in md (raid 1) root filesystem<br />- I/O generated continuously throughout testing<br />- kernel client running alongside daemons on all nodes<br />- running code is a 0.24rc with 2.6.37rc8 kernel<br />Symptoms:<br />- cosd of pulled drive reported journal errors on raw device journal<br />- md root filesystem recognized failure and responded properly<br />- cosd servicing pulled drive did not die and began inflating memory usage<br />- ceph filesystem unresponsive (waited >> 10 minutes for ls response on client)<br />- with the same setup, if cosd is killed soon after drive pull, no problems at all<br />My Theory(ies):<br />- drive failure not being converted to a cosd I/O error via btrfs, or I/O error ignored by cosd<br />- cosd memory inflation doesn't really matter, as cosd is expected to exit on error to allow re-peering</p>
<p>I can provide detailed hardware specs if it will help.</p>
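<p>A hypothetical sketch of the expected behavior described in the theory above: if btrfs surfaces the drive failure as an I/O error, cosd should treat it as fatal and exit so re-peering can proceed, rather than retry and inflate memory. None of this is actual cosd code; it only illustrates the exit-on-error expectation.</p>

```python
import errno

# Illustrative only: model "cosd is expected to exit on error" as an
# exception raised for fatal disk errnos, while other results are retried.
class OsdExit(Exception):
    pass

def handle_write_result(err):
    if err in (errno.EIO, errno.EROFS):
        # fatal: exiting lets the cluster notice the osd is gone and re-peer
        raise OsdExit("fatal disk error %s: exiting for re-peering" % errno.errorcode[err])
    return "retry"  # transient conditions could be retried

try:
    handle_write_result(errno.EIO)
except OsdExit as e:
    print(e)
```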
Ceph - Bug #639 (Resolved): RHEL6 beta packaging breaks on 'libcls_rbd.so.1.0.0.debug'
https://tracker.ceph.com/issues/639
2010-12-08T11:19:27Z
Brian Chrisman
brchrisman@gmail.com
<p>/usr/lib/debug/usr/lib64/rados-classes/libcls_rbd.so.1.0.0.debug is<br />installed but unpackaged.</p>
<p>In the spec file I see this file is being touched for a purpose that isn't apparent to me.<br />The path gets mangled, with usr/lib/debug prepended to usr/lib64, etc.</p>
<p>ceph.spec.in:80 has an extra usr/lib and should probably be:<br />mkdir -p $RPM_BUILD_ROOT/%{_libdir}/debug/rados-classes/<br />touch $RPM_BUILD_ROOT/%{_libdir}/debug/rados-classes/libcls_rbd.so.1.0.0.debug</p>
<p>But if so, then this directory and file also need to be identified for packaging.<br />I'm not sure why this file/directory exists, so I just removed it in my build.</p>
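<p>A worked example of how the doubled prefix arises (purely illustrative string-building, no rpm involved): the debuginfo machinery places files under /usr/lib/debug plus the real install path, so with %{_libdir} expanding to /usr/lib64 the file ends up with both prefixes, matching the "installed but unpackaged" path in the error.</p>

```python
# Illustrative expansion of the path from the rpmbuild error above.
libdir = "/usr/lib64"  # what %{_libdir} expands to on this platform

# debuginfo files live under /usr/lib/debug + <real install path>:
generated = "/usr/lib/debug" + libdir + "/rados-classes/libcls_rbd.so.1.0.0.debug"
print(generated)
```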