Project

General

Profile

Bug #17270

Updated by Jason Dillaman over 7 years ago

Hello everyone, 

 I have met a ceph-fuse crash when i add osd to osd pool. 

 I am writing data through ceph-fuse,then i add one osd to osd pool, after less than 30 s, the ceph-fuse process crash. 

 The ceph-fuse client is 10.2.2, and the ceph osd is 0.94.3, details beblow: 

 <pre> 
 [root@localhost ~]# rpm -qa | grep ceph 
 libcephfs1-10.2.2-0.el7.centos.x86_64 
 python-cephfs-10.2.2-0.el7.centos.x86_64 
 ceph-common-0.94.3-0.el7.x86_64 
 ceph-fuse-10.2.2-0.el7.centos.x86_64 
 ceph-0.94.3-0.el7.x86_64 
 ceph-mds-10.2.2-0.el7.centos.x86_64 
 [root@localhost ~]#  
 [root@localhost ~]#  
 [root@localhost ~]# rpm -qa | grep rados 
 librados2-devel-0.94.3-0.el7.x86_64 
 librados2-0.94.3-0.el7.x86_64 
 libradosstriper1-0.94.3-0.el7.x86_64 
 python-rados-0.94.3-0.el7.x86_64 
 </pre> 

 ceph stat: 

 <pre> 
 [root@localhost ~]# ceph status 
     cluster a7f64266-0894-4f1e-a635-d0aeaca0e993 
      health HEALTH_WARN 
             clock skew detected on mon.2, mon.0 
             19 pgs stale 
             19 pgs stuck stale 
             Monitor clock skew detected  
      monmap e3: 3 mons at {0=10.222.5.229:6789/0,1=10.222.5.156:6789/0,2=10.222.5.162:6789/0} 
             election epoch 26, quorum 0,1,2 1,2,0 
      mdsmap e58: 1/1/1 up {0=0=up:active}, 1 up:standby 
      osdmap e324: 9 osds: 9 up, 9 in 
       pgmap v3505: 320 pgs, 3 pools, 4638 MB data, 1302 objects 
             23373 MB used, 68695 MB / 92069 MB avail 
                  301 active+clean 
                   19 stale+active+clean 
 </pre> 

 ceph osd stat: 
 <pre> 
 [root@localhost ~]# ceph osd dump 
 epoch 324 
 fsid a7f64266-0894-4f1e-a635-d0aeaca0e993 
 created 2016-09-13 11:08:34.629245 
 modified 2016-09-13 16:21:53.285729 
 flags  
 pool 0 'rbd' replicated size 1 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 1 flags hashpspool stripe_width 0 
 pool 5 'cephfs_data' replicated size 3 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 128 pgp_num 128 last_change 285 flags hashpspool crash_replay_interval 45 stripe_width 0 
 pool 6 'cephfs_metadata' replicated size 3 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 128 pgp_num 128 last_change 282 flags hashpspool stripe_width 0 
 max_osd 9 
 osd.0 up     in    weight 1 up_from 271 up_thru 321 down_at 242 last_clean_interval [169,175) 10.222.5.229:6800/3780 10.222.5.229:6801/3780 10.222.5.229:6802/3780 10.222.5.229:6803/3780 exists,up 1bf6cda4-bf1a-4f8a-836d-b6aec970d257 
 osd.1 up     in    weight 1 up_from 223 up_thru 320 down_at 186 last_clean_interval [20,183) 10.222.5.229:6804/2228 10.222.5.229:6805/2228 10.222.5.229:6806/2228 10.222.5.229:6807/2228 exists,up 3f3ad2fa-52b1-46fd-af6c-05178b814e25 
 osd.2 up     in    weight 1 up_from 224 up_thru 320 down_at 186 last_clean_interval [22,183) 10.222.5.229:6808/2259 10.222.5.229:6809/2259 10.222.5.229:6810/2259 10.222.5.229:6811/2259 exists,up 9199193e-9928-4c5d-8adc-2c32a4c8716b 
 osd.3 up     in    weight 1 up_from 312 up_thru 313 down_at 303 last_clean_interval [0,0) 10.222.5.156:6800/3592 10.222.5.156:6801/3592 10.222.5.156:6802/3592 10.222.5.156:6803/3592 exists,up 9b8f1cb0-51df-42aa-8be4-8f6347235cc2 
 osd.4 up     in    weight 1 up_from 25 up_thru 322 down_at 0 last_clean_interval [0,0) 10.222.5.156:6804/25567 10.222.5.156:6805/25567 10.222.5.156:6806/25567 10.222.5.156:6807/25567 exists,up 0c719e5e-f8fc-46e0-926d-426bf6881ee0 
 osd.5 up     in    weight 1 up_from 27 up_thru 310 down_at 0 last_clean_interval [0,0) 10.222.5.156:6808/25678 10.222.5.156:6809/25678 10.222.5.156:6810/25678 10.222.5.156:6811/25678 exists,up 729e0749-2ce3-426a-a7f1-a3cbfa88ba0b 
 osd.6 up     in    weight 1 up_from 40 up_thru 313 down_at 0 last_clean_interval [0,0) 10.222.5.162:6807/15887 10.222.5.162:6808/15887 10.222.5.162:6809/15887 10.222.5.162:6810/15887 exists,up dea24f0f-4666-40af-98af-5ab8d42c37c6 
 osd.7 up     in    weight 1 up_from 45 up_thru 313 down_at 0 last_clean_interval [0,0) 10.222.5.162:6811/16040 10.222.5.162:6812/16040 10.222.5.162:6813/16040 10.222.5.162:6814/16040 exists,up 0e238745-0091-4790-9b39-c9d36f4ebbee 
 osd.8 up     in    weight 1 up_from 49 up_thru 314 down_at 0 last_clean_interval [0,0) 10.222.5.162:6815/16206 10.222.5.162:6816/16206 10.222.5.162:6817/16206 10.222.5.162:6818/16206 exists,up 59637f86-f283-4397-a63b-474976ee8047 
 [root@localhost ~]#  
 [root@localhost ~]# ceph osd tree 
 ID WEIGHT    TYPE NAME        UP/DOWN REWEIGHT PRIMARY-AFFINITY  
 -1 9.00000 root default                                      
 -5 3.00000       host yxy02                                    
  1 1.00000           osd.1         up    1.00000            1.00000  
  2 1.00000           osd.2         up    1.00000            1.00000  
  0 1.00000           osd.0         up    1.00000            1.00000  
 -6 3.00000       host yxy03                                    
  4 1.00000           osd.4         up    1.00000            1.00000        -------------> OSD JUST ADD! 
  5 1.00000           osd.5         up    1.00000            1.00000  
  3 1.00000           osd.3         up    1.00000            1.00000  
 -7 3.00000       host zwr01                                    
  6 1.00000           osd.6         up    1.00000            1.00000  
  7 1.00000           osd.7         up    1.00000            1.00000  
  8 1.00000           osd.8         up    1.00000            1.00000  
 </pre> 

  


 Here is the client log: 
 <pre> 
    -72> 2016-09-13 16:15:21.586874 7f55783b0700    1 -- 10.222.5.162:0/3252549074 <== osd.4 10.222.5.156:6804/25567 3251 ==== osd_op_reply(17132 100000000b0.00000001 [write 0~2228224] v323'38 uv38 ondisk = 0) v6 ==== 187+0+0 (1642657507 0 0) 0x7f54e4590a70 con 0x7f55500219f0 
    -71> 2016-09-13 16:15:21.586873 7f55782af700    1 -- 10.222.5.162:0/3252549074 <== osd.0 10.222.5.229:6800/3780 786 ==== osd_op_reply(17148 100000000af.00000001 [write 3670016~524288] v323'38 uv38 ondisk = 0) v6 ==== 187+0+0 (239835621 0 0) 0x7f54dc0013f0 con 0x7f555001d7a0 
    -70> 2016-09-13 16:15:21.587052 7f55783b0700    1 -- 10.222.5.162:0/3252549074 <== osd.4 10.222.5.156:6804/25567 3252 ==== osd_op_reply(17138 100000000b0.00000001 [write 2228224~1966080] v323'39 uv39 ondisk = 0) v6 ==== 187+0+0 (1447166925 0 0) 0x7f54e4590a70 con 0x7f55500219f0 
    -69> 2016-09-13 16:15:21.587058 7f55815b5700    5 client.24191 put_cap_ref dropped last FILE_BUFFER ref on 100000000af.head(faked_ino=0 ref=6 ll_ref=1 cap_refs={1024=0,4096=0,8192=0} open={2=0} mode=100755 size=10485760/16777216 mtime=2016-09-13 16:15:17.607516 caps=pAsxLsXsxFsxcrwb(0=pAsxLsXsxFsxcrwb) dirty_caps=Fw flushing_caps=Fw objectset[100000000af ts 0/0 objects 3 dirty_or_tx 0] parents=0x7f550c00c980 0x7f556c0be140) 
    -68> 2016-09-13 16:15:21.587098 7f55815b5700    5 client.24191 put_cap_ref dropped last FILE_CACHE ref on 100000000af.head(faked_ino=0 ref=6 ll_ref=1 cap_refs={1024=0,4096=0,8192=0} open={2=0} mode=100755 size=10485760/16777216 mtime=2016-09-13 16:15:17.607516 caps=pAsxLsXsxFsxcrwb(0=pAsxLsXsxFsxcrwb) dirty_caps=Fw flushing_caps=Fw objectset[100000000af ts 0/0 objects 3 dirty_or_tx 0] parents=0x7f550c00c980 0x7f556c0be140) 
    -67> 2016-09-13 16:15:21.587146 7f55815b5700    5 client.24191 put_cap_ref dropped last FILE_BUFFER ref on 100000000b0.head(faked_ino=0 ref=6 ll_ref=1 cap_refs={1024=0,4096=0,8192=0} open={2=0} mode=100755 size=10485760/16777216 mtime=2016-09-13 16:15:17.591570 caps=pAsxLsXsxFsxcrwb(0=pAsxLsXsxFsxcrwb) dirty_caps=Fw flushing_caps=Fw objectset[100000000b0 ts 0/0 objects 3 dirty_or_tx 0] parents=0x7f54bc5f90e0 0x7f556c0be600) 
    -66> 2016-09-13 16:15:21.587171 7f55815b5700    5 client.24191 put_cap_ref dropped last FILE_CACHE ref on 100000000b0.head(faked_ino=0 ref=6 ll_ref=1 cap_refs={1024=0,4096=0,8192=0} open={2=0} mode=100755 size=10485760/16777216 mtime=2016-09-13 16:15:17.591570 caps=pAsxLsXsxFsxcrwb(0=pAsxLsXsxFsxcrwb) dirty_caps=Fw flushing_caps=Fw objectset[100000000b0 ts 0/0 objects 3 dirty_or_tx 0] parents=0x7f54bc5f90e0 0x7f556c0be600) 
    -65> 2016-09-13 16:15:21.587663 7f55783b0700    1 -- 10.222.5.162:0/3252549074 <== osd.4 10.222.5.156:6804/25567 3253 ==== osd_op_reply(17141 100000000b5.00000001 [write 0~1966080] v323'40 uv40 ondisk = 0) v6 ==== 187+0+0 (4162314622 0 0) 0x7f54e4590a70 con 0x7f55500219f0 
    -64> 2016-09-13 16:15:21.587822 7f55386f7700    1 -- 10.222.5.162:0/3252549074 <== osd.5 10.222.5.156:6808/25678 2865 ==== osd_op_reply(17161 100000000ba.00000000 [write 0~2228224] v323'48 uv48 ondisk = 0) v6 ==== 187+0+0 (3143255082 0 0) 0x7f54e0002f10 con 0x7f5550047900 
    -63> 2016-09-13 16:15:21.587915 7f55386f7700    1 -- 10.222.5.162:0/3252549074 <== osd.5 10.222.5.156:6808/25678 2866 ==== osd_op_reply(17168 100000000ba.00000000 [write 2228224~1966080] v323'49 uv49 ondisk = 0) v6 ==== 187+0+0 (2364584196 0 0) 0x7f54e0002f10 con 0x7f5550047900 
    -62> 2016-09-13 16:15:21.588742 7f55815b5700    5 client.24191 put_cap_ref dropped last FILE_BUFFER ref on 100000000ba.head(faked_ino=0 ref=6 ll_ref=1 cap_refs={1024=0,4096=0,8192=0} open={2=1} mode=100755 size=4194304/4194304 mtime=2016-09-13 16:15:17.676598 caps=pAsxLsXsxFsxcrwb(0=pAsxLsXsxFsxcrwb) flushing_caps=Fw objectset[100000000ba ts 0/0 objects 1 dirty_or_tx 0] parents=0x7f54c8a384d0 0x7f556c1119e0) 
    -61> 2016-09-13 16:15:21.588782 7f55815b5700    5 client.24191 put_cap_ref dropped last FILE_CACHE ref on 100000000ba.head(faked_ino=0 ref=6 ll_ref=1 cap_refs={1024=0,4096=0,8192=0} open={2=1} mode=100755 size=4194304/4194304 mtime=2016-09-13 16:15:17.676598 caps=pAsxLsXsxFsxcrwb(0=pAsxLsXsxFsxcrwb) flushing_caps=Fw objectset[100000000ba ts 0/0 objects 1 dirty_or_tx 0] parents=0x7f54c8a384d0 0x7f556c1119e0) 
    -60> 2016-09-13 16:15:21.645868 7f55783b0700    1 -- 10.222.5.162:0/3252549074 <== osd.4 10.222.5.156:6804/25567 3254 ==== osd_op_reply(17156 100000000b7.00000000 [write 0~1310720] v323'53 uv53 ondisk = 0) v6 ==== 187+0+0 (371286939 0 0) 0x7f54e4590a70 con 0x7f55500219f0 
    -59> 2016-09-13 16:15:21.646023 7f55783b0700    1 -- 10.222.5.162:0/3252549074 <== osd.4 10.222.5.156:6804/25567 3255 ==== osd_op_reply(17165 100000000b7.00000000 [write 1310720~2883584] v323'54 uv54 ondisk = 0) v6 ==== 187+0+0 (719229667 0 0) 0x7f54e4590a70 con 0x7f55500219f0 
    -58> 2016-09-13 16:15:21.646091 7f55815b5700    5 client.24191 put_cap_ref dropped last FILE_BUFFER ref on 100000000b7.head(faked_ino=0 ref=6 ll_ref=1 cap_refs={1024=0,4096=0,8192=0} open={2=1} mode=100755 size=4194304/4194304 mtime=2016-09-13 16:15:17.664874 caps=pAsxLsXsxFsxcrwb(0=pAsxLsXsxFsxcrwb) flushing_caps=Fw objectset[100000000b7 ts 0/0 objects 1 dirty_or_tx 0] parents=0x7f54d0318800 0x7f556c110ba0) 
    -57> 2016-09-13 16:15:21.646151 7f55815b5700    5 client.24191 put_cap_ref dropped last FILE_CACHE ref on 100000000b7.head(faked_ino=0 ref=6 ll_ref=1 cap_refs={1024=0,4096=0,8192=0} open={2=1} mode=100755 size=4194304/4194304 mtime=2016-09-13 16:15:17.664874 caps=pAsxLsXsxFsxcrwb(0=pAsxLsXsxFsxcrwb) flushing_caps=Fw objectset[100000000b7 ts 0/0 objects 1 dirty_or_tx 0] parents=0x7f54d0318800 0x7f556c110ba0) 
    -56> 2016-09-13 16:15:21.995626 7f55783b0700    1 -- 10.222.5.162:0/3252549074 <== osd.4 10.222.5.156:6804/25567 3256 ==== osd_op_reply(17151 100000000b3.00000002 [write 0~2097152] v323'48 uv48 ondisk = 0) v6 ==== 187+0+0 (2547660202 0 0) 0x7f54e4590a70 con 0x7f55500219f0 
    -55> 2016-09-13 16:15:21.995774 7f55815b5700    5 client.24191 put_cap_ref dropped last FILE_BUFFER ref on 100000000b3.head(faked_ino=0 ref=6 ll_ref=1 cap_refs={1024=0,4096=0,8192=0} open={2=0} mode=100755 size=10485760/16777216 mtime=2016-09-13 16:15:17.608865 caps=pAsxLsXsxFsxcrwb(0=pAsxLsXsxFsxcrwb) dirty_caps=Fw flushing_caps=Fw objectset[100000000b3 ts 0/0 objects 3 dirty_or_tx 0] parents=0x7f5534a0e720 0x7f556c0b77c0) 
    -54> 2016-09-13 16:15:21.995839 7f55783b0700    1 -- 10.222.5.162:0/3252549074 <== osd.4 10.222.5.156:6804/25567 3257 ==== osd_op_reply(17154 100000000b5.00000001 [write 1966080~2228224] v323'41 uv41 ondisk = 0) v6 ==== 187+0+0 (2882567974 0 0) 0x7f54e4590a70 con 0x7f55500219f0 
    -53> 2016-09-13 16:15:21.995829 7f55815b5700    5 client.24191 put_cap_ref dropped last FILE_CACHE ref on 100000000b3.head(faked_ino=0 ref=6 ll_ref=1 cap_refs={1024=0,4096=0,8192=0} open={2=0} mode=100755 size=10485760/16777216 mtime=2016-09-13 16:15:17.608865 caps=pAsxLsXsxFsxcrwb(0=pAsxLsXsxFsxcrwb) dirty_caps=Fw flushing_caps=Fw objectset[100000000b3 ts 0/0 objects 3 dirty_or_tx 0] parents=0x7f5534a0e720 0x7f556c0b77c0) 
    -52> 2016-09-13 16:15:21.995978 7f55815b5700    5 client.24191 put_cap_ref dropped last FILE_BUFFER ref on 100000000b5.head(faked_ino=0 ref=6 ll_ref=1 cap_refs={1024=0,4096=0,8192=0} open={2=0} mode=100755 size=10485760/16777216 mtime=2016-09-13 16:15:17.635885 caps=pAsxLsXsxFsxcrwb(0=pAsxLsXsxFsxcrwb) dirty_caps=Fw flushing_caps=Fw objectset[100000000b5 ts 0/0 objects 3 dirty_or_tx 0] parents=0x7f54d0303c20 0x7f556c037ac0) 
    -51> 2016-09-13 16:15:21.996022 7f55815b5700    5 client.24191 put_cap_ref dropped last FILE_CACHE ref on 100000000b5.head(faked_ino=0 ref=6 ll_ref=1 cap_refs={1024=0,4096=0,8192=0} open={2=0} mode=100755 size=10485760/16777216 mtime=2016-09-13 16:15:17.635885 caps=pAsxLsXsxFsxcrwb(0=pAsxLsXsxFsxcrwb) dirty_caps=Fw flushing_caps=Fw objectset[100000000b5 ts 0/0 objects 3 dirty_or_tx 0] parents=0x7f54d0303c20 0x7f556c037ac0) 
    -50> 2016-09-13 16:15:21.998647 7f55783b0700    1 -- 10.222.5.162:0/3252549074 <== osd.4 10.222.5.156:6804/25567 3258 ==== osd_op_reply(17158 100000000b8.00000000 [write 0~1966080] v323'49 uv49 ondisk = 0) v6 ==== 187+0+0 (1584949222 0 0) 0x7f54e4590a70 con 0x7f55500219f0 
    -49> 2016-09-13 16:15:21.998788 7f55783b0700    1 -- 10.222.5.162:0/3252549074 <== osd.4 10.222.5.156:6804/25567 3259 ==== osd_op_reply(17166 100000000b8.00000000 [write 1966080~2228224] v323'50 uv50 ondisk = 0) v6 ==== 187+0+0 (3946494964 0 0) 0x7f54e4590a70 con 0x7f55500219f0 
    -48> 2016-09-13 16:15:21.998913 7f55815b5700    5 client.24191 put_cap_ref dropped last FILE_BUFFER ref on 100000000b8.head(faked_ino=0 ref=6 ll_ref=1 cap_refs={1024=0,4096=0,8192=0} open={2=1} mode=100755 size=4194304/4194304 mtime=2016-09-13 16:15:17.669978 caps=pAsxLsXsxFsxcrwb(0=pAsxLsXsxFsxcrwb) flushing_caps=Fw objectset[100000000b8 ts 0/0 objects 1 dirty_or_tx 0] parents=0x7f5515c3c040 0x7f556c111060) 
    -47> 2016-09-13 16:15:21.998986 7f55815b5700    5 client.24191 put_cap_ref dropped last FILE_CACHE ref on 100000000b8.head(faked_ino=0 ref=6 ll_ref=1 cap_refs={1024=0,4096=0,8192=0} open={2=1} mode=100755 size=4194304/4194304 mtime=2016-09-13 16:15:17.669978 caps=pAsxLsXsxFsxcrwb(0=pAsxLsXsxFsxcrwb) flushing_caps=Fw objectset[100000000b8 ts 0/0 objects 1 dirty_or_tx 0] parents=0x7f5515c3c040 0x7f556c111060) 
    -46> 2016-09-13 16:15:22.720826 7f55797fa700 10 monclient: tick 
    -45> 2016-09-13 16:15:22.720941 7f55797fa700 10 monclient: renew subs? (now: 2016-09-13 16:15:22.720929; renew after: 2016-09-13 16:10:33.093544) -- yes 
    -44> 2016-09-13 16:15:22.720983 7f55797fa700 10 monclient: renew_subs - empty 
    -43> 2016-09-13 16:15:22.721661 7f55805b3700    2 -- 10.222.5.162:0/3252549074 >> 10.222.5.229:6789/0 pipe(0x7f556c133990 sd=12 :55626 s=2 pgs=946 cs=1 l=1 c=0x7f556c09e3a0).reader got KEEPALIVE_ACK 
    -42> 2016-09-13 16:15:22.872793 7f557b7fe700    1 -- 10.222.5.162:0/3252549074 --> 10.222.5.162:6800/24314 -- client_caps(update ino 100000000b0 1394 seq 4 tid 1514 caps=pAsLsXsFsc dirty=Fw wanted=- follows 1 size 10485760/0 ts 1/18446744073709551615 mtime 2016-09-13 16:15:17.591570) v8 -- ?+0 0x7f555c001e20 con 0x7f556c00e790 
    -41> 2016-09-13 16:15:22.872906 7f557b7fe700    1 -- 10.222.5.162:0/3252549074 --> 10.222.5.162:6800/24314 -- client_caps(update ino 100000000b1 1395 seq 4 tid 1515 caps=pAsLsXsFsc dirty=Fw wanted=- follows 1 size 10485760/0 ts 1/18446744073709551615 mtime 2016-09-13 16:15:17.596742) v8 -- ?+0 0x7f555c002910 con 0x7f556c00e790 
    -40> 2016-09-13 16:15:22.873074 7f557b7fe700    1 -- 10.222.5.162:0/3252549074 --> 10.222.5.162:6800/24314 -- client_caps(update ino 100000000af 1393 seq 4 tid 1516 caps=pAsLsXsFsc dirty=Fw wanted=- follows 1 size 10485760/0 ts 1/18446744073709551615 mtime 2016-09-13 16:15:17.607516) v8 -- ?+0 0x7f555c004330 con 0x7f556c00e790 
    -39> 2016-09-13 16:15:22.873126 7f557b7fe700    1 -- 10.222.5.162:0/3252549074 --> 10.222.5.162:6800/24314 -- client_caps(update ino 100000000b3 1397 seq 4 tid 1517 caps=pAsLsXsFsc dirty=Fw wanted=- follows 1 size 10485760/0 ts 1/18446744073709551615 mtime 2016-09-13 16:15:17.608865) v8 -- ?+0 0x7f555c004730 con 0x7f556c00e790 
    -38> 2016-09-13 16:15:22.873206 7f557b7fe700    1 -- 10.222.5.162:0/3252549074 --> 10.222.5.162:6800/24314 -- client_caps(update ino 100000000b4 1398 seq 4 tid 1518 caps=pAsLsXsFsc dirty=Fw wanted=- follows 1 size 10485760/0 ts 1/18446744073709551615 mtime 2016-09-13 16:15:17.604355) v8 -- ?+0 0x7f555c009120 con 0x7f556c00e790 
    -37> 2016-09-13 16:15:22.873358 7f557b7fe700    1 -- 10.222.5.162:0/3252549074 --> 10.222.5.162:6800/24314 -- client_caps(update ino 100000000b2 1396 seq 4 tid 1519 caps=pAsxLsXsxFsxcrwb dirty=Fw wanted=Fxcb follows 1 size 10485760/0 ts 1/18446744073709551615 mtime 2016-09-13 16:15:17.609681) v8 -- ?+0 0x7f555c0096e0 con 0x7f556c00e790 
    -36> 2016-09-13 16:15:22.873440 7f557b7fe700    1 -- 10.222.5.162:0/3252549074 --> 10.222.5.162:6800/24314 -- client_caps(update ino 100000000b5 1399 seq 4 tid 1520 caps=pAsLsXsFsc dirty=Fw wanted=- follows 1 size 10485760/0 ts 1/18446744073709551615 mtime 2016-09-13 16:15:17.635885) v8 -- ?+0 0x7f555c009ca0 con 0x7f556c00e790 
    -35> 2016-09-13 16:15:22.873506 7f557b7fe700    1 -- 10.222.5.162:0/3252549074 --> 10.222.5.162:6800/24314 -- client_caps(update ino 100000000b6 1400 seq 4 tid 1521 caps=pAsxLsXsxFsxcrwb dirty=Fw wanted=Fxcb follows 1 size 10485760/0 ts 1/18446744073709551615 mtime 2016-09-13 16:15:17.653195) v8 -- ?+0 0x7f555c00a260 con 0x7f556c00e790 
    -34> 2016-09-13 16:15:22.873554 7f557b7fe700    1 -- 10.222.5.162:0/3252549074 --> 10.222.5.162:6800/24314 -- client_caps(update ino 100000000b7 1401 seq 1 caps=pAsxLsXsxFsxcrwb dirty=- wanted=pAsxXsxFxwb follows 0 size 4194304/4325376 ts 1/18446744073709551615 mtime 2016-09-13 16:15:17.664874) v8 -- ?+0 0x7f555c00a820 con 0x7f556c00e790 
    -33> 2016-09-13 16:15:22.873689 7f557b7fe700    1 -- 10.222.5.162:0/3252549074 --> 10.222.5.162:6800/24314 -- client_caps(update ino 100000000b8 1402 seq 1 caps=pAsxLsXsxFsxcrwb dirty=- wanted=pAsxXsxFxwb follows 0 size 4194304/4325376 ts 1/18446744073709551615 mtime 2016-09-13 16:15:17.669978) v8 -- ?+0 0x7f555c00ade0 con 0x7f556c00e790 
    -32> 2016-09-13 16:15:22.873731 7f557b7fe700    1 -- 10.222.5.162:0/3252549074 --> 10.222.5.162:6800/24314 -- client_caps(update ino 100000000b9 1403 seq 1 caps=pAsxLsXsxFsxcrwb dirty=- wanted=pAsxXsxFxwb follows 0 size 4194304/4325376 ts 1/18446744073709551615 mtime 2016-09-13 16:15:17.671537) v8 -- ?+0 0x7f555c00b3b0 con 0x7f556c00e790 
    -31> 2016-09-13 16:15:22.873788 7f557b7fe700    1 -- 10.222.5.162:0/3252549074 --> 10.222.5.162:6800/24314 -- client_caps(update ino 100000000bb 1405 seq 1 caps=pAsxLsXsxFsxcrwb dirty=- wanted=pAsxXsxFxwb follows 0 size 4194304/4325376 ts 1/18446744073709551615 mtime 2016-09-13 16:15:17.674291) v8 -- ?+0 0x7f555c00b980 con 0x7f556c00e790 
    -30> 2016-09-13 16:15:22.873899 7f557b7fe700    1 -- 10.222.5.162:0/3252549074 --> 10.222.5.162:6800/24314 -- client_caps(update ino 100000000bc 1406 seq 1 caps=pAsxLsXsxFsxcrwb dirty=- wanted=pAsxXsxFxwb follows 0 size 4194304/4325376 ts 1/18446744073709551615 mtime 2016-09-13 16:15:17.672719) v8 -- ?+0 0x7f555c00bf50 con 0x7f556c00e790 
    -29> 2016-09-13 16:15:22.874009 7f557b7fe700    1 -- 10.222.5.162:0/3252549074 --> 10.222.5.162:6800/24314 -- client_caps(update ino 100000000ba 1404 seq 1 caps=pAsxLsXsxFsxcrwb dirty=- wanted=pAsxXsxFxwb follows 0 size 4194304/4325376 ts 1/18446744073709551615 mtime 2016-09-13 16:15:17.676598) v8 -- ?+0 0x7f555c001e20 con 0x7f556c00e790 
    -28> 2016-09-13 16:15:22.874677 7f557a7fc700    1 -- 10.222.5.162:0/3252549074 <== mds.0 10.222.5.162:6800/24314 8318 ==== client_caps(revoke ino 100000000b2 1396 seq 5 caps=pAsLsXsFsxcrwb dirty=- wanted=Fxcb follows 0 size 10485760/16777216 ts 1/18446744073709551615 mtime 2016-09-13 16:15:17.609681) v8 ==== 216+0+0 (2189111699 0 0) 0x7f556800d780 con 0x7f556c00e790 
    -27> 2016-09-13 16:15:22.874760 7f557a7fc700    5 client.24191 handle_cap_grant on in 100000000b2 mds.0 seq 5 caps now pAsLsXsFsxcrwb was pAsxLsXsxFsxcrwb 
    -26> 2016-09-13 16:15:22.874783 7f557a7fc700    1 -- 10.222.5.162:0/3252549074 --> 10.222.5.162:6800/24314 -- client_caps(update ino 100000000b2 1396 seq 5 caps=pAsLsXsFsxcrwb dirty=- wanted=Fxcb follows 0 size 10485760/0 ts 1/18446744073709551615 mtime 2016-09-13 16:15:17.609681) v8 -- ?+0 0x7f556c0c3600 con 0x7f556c00e790 
    -25> 2016-09-13 16:15:22.874829 7f557a7fc700    1 -- 10.222.5.162:0/3252549074 <== mds.0 10.222.5.162:6800/24314 8319 ==== client_caps(revoke ino 100000000b6 1400 seq 5 caps=pAsLsXsFsxcrwb dirty=- wanted=Fxcb follows 0 size 10485760/16777216 ts 1/18446744073709551615 mtime 2016-09-13 16:15:17.653195) v8 ==== 216+0+0 (2007964154 0 0) 0x7f556800e300 con 0x7f556c00e790 
    -24> 2016-09-13 16:15:22.874867 7f557a7fc700    5 client.24191 handle_cap_grant on in 100000000b6 mds.0 seq 5 caps now pAsLsXsFsxcrwb was pAsxLsXsxFsxcrwb 
    -23> 2016-09-13 16:15:22.874880 7f557a7fc700    1 -- 10.222.5.162:0/3252549074 --> 10.222.5.162:6800/24314 -- client_caps(update ino 100000000b6 1400 seq 5 caps=pAsLsXsFsxcrwb dirty=- wanted=Fxcb follows 0 size 10485760/0 ts 1/18446744073709551615 mtime 2016-09-13 16:15:17.653195) v8 -- ?+0 0x7f556c112820 con 0x7f556c00e790 
    -22> 2016-09-13 16:15:23.688495 7f557affd700    1 -- 10.222.5.162:0/3252549074 --> 10.222.5.156:6804/25567 -- osd_op(client.24191.0:17171 5.91cd2dc2 100000000bd.00000000 [write 0~4194304] snapc 1=[] ondisk+write+known_if_redirected e323) v7 -- ?+0 0x7f5550001410 con 0x7f55500219f0 
    -21> 2016-09-13 16:15:23.688651 7f557affd700    1 -- 10.222.5.162:0/3252549074 --> 10.222.5.156:6804/25567 -- osd_op(client.24191.0:17172 5.d4d2b698 100000000be.00000000 [write 0~4194304] snapc 1=[] ondisk+write+known_if_redirected e323) v7 -- ?+0 0x7f5550024380 con 0x7f55500219f0 
    -20> 2016-09-13 16:15:24.291465 7f55783b0700    1 -- 10.222.5.162:0/3252549074 <== osd.4 10.222.5.156:6804/25567 3260 ==== osd_op_reply(17171 100000000bd.00000000 [write 0~4194304] v323'49 uv49 ondisk = 0) v6 ==== 187+0+0 (3181485819 0 0) 0x7f54e4590a70 con 0x7f55500219f0 
    -19> 2016-09-13 16:15:24.291638 7f55815b5700    5 client.24191 put_cap_ref dropped last FILE_BUFFER ref on 100000000bd.head(faked_ino=0 ref=6 ll_ref=1 cap_refs={1024=0,4096=0,8192=0} open={2=1} mode=100755 size=4194304/4194304 mtime=2016-09-13 16:15:17.689988 caps=pAsxLsXsxFsxcrwb(0=pAsxLsXsxFsxcrwb) flushing_caps=Fw objectset[100000000bd ts 0/0 objects 1 dirty_or_tx 0] parents=0x7f5514004e80 0x7f556c0c2c80) 
    -18> 2016-09-13 16:15:24.291707 7f55815b5700    5 client.24191 put_cap_ref dropped last FILE_CACHE ref on 100000000bd.head(faked_ino=0 ref=6 ll_ref=1 cap_refs={1024=0,4096=0,8192=0} open={2=1} mode=100755 size=4194304/4194304 mtime=2016-09-13 16:15:17.689988 caps=pAsxLsXsxFsxcrwb(0=pAsxLsXsxFsxcrwb) flushing_caps=Fw objectset[100000000bd ts 0/0 objects 1 dirty_or_tx 0] parents=0x7f5514004e80 0x7f556c0c2c80) 
    -17> 2016-09-13 16:15:24.292120 7f55783b0700    1 -- 10.222.5.162:0/3252549074 <== osd.4 10.222.5.156:6804/25567 3261 ==== osd_op_reply(17172 100000000be.00000000 [write 0~4194304] v323'46 uv46 ondisk = 0) v6 ==== 187+0+0 (31370795 0 0) 0x7f54e4590a70 con 0x7f55500219f0 
    -16> 2016-09-13 16:15:24.292231 7f55815b5700    5 client.24191 put_cap_ref dropped last FILE_BUFFER ref on 100000000be.head(faked_ino=0 ref=6 ll_ref=1 cap_refs={1024=0,4096=0,8192=0} open={2=1} mode=100755 size=4194304/4194304 mtime=2016-09-13 16:15:17.690092 caps=pAsxLsXsxFsxcrwb(0=pAsxLsXsxFsxcrwb) flushing_caps=Fw objectset[100000000be ts 0/0 objects 1 dirty_or_tx 0] parents=0x7f551401e490 0x7f556c0c3140) 
    -15> 2016-09-13 16:15:24.292271 7f55815b5700    5 client.24191 put_cap_ref dropped last FILE_CACHE ref on 100000000be.head(faked_ino=0 ref=6 ll_ref=1 cap_refs={1024=0,4096=0,8192=0} open={2=1} mode=100755 size=4194304/4194304 mtime=2016-09-13 16:15:17.690092 caps=pAsxLsXsxFsxcrwb(0=pAsxLsXsxFsxcrwb) flushing_caps=Fw objectset[100000000be ts 0/0 objects 1 dirty_or_tx 0] parents=0x7f551401e490 0x7f556c0c3140) 
    -14> 2016-09-13 16:15:24.532379 7f55604f5700    1 -- 10.222.5.162:0/3252549074 <== osd.1 10.222.5.229:6804/2228 1469 ==== osd_op_reply(17159 100000000b6.00000002 [write 0~1703936] v323'50 uv50 ondisk = 0) v6 ==== 187+0+0 (1709573551 0 0) 0x7f54c4001300 con 0x7f555002c1a0 
    -13> 2016-09-13 16:15:24.732652 7f55825b7700    1 -- 10.222.5.162:0/3252549074 --> 10.222.5.156:6800/3592 -- ping magic: 0 v1 -- ?+0 0x7f554c0078a0 con 0x7f5550037a10 
    -12> 2016-09-13 16:15:26.827049 7f55604f5700    1 -- 10.222.5.162:0/3252549074 <== osd.1 10.222.5.229:6804/2228 1470 ==== osd_op_reply(17162 100000000b6.00000002 [write 1703936~393216] v323'51 uv51 ondisk = 0) v6 ==== 187+0+0 (3172726132 0 0) 0x7f54c4001300 con 0x7f555002c1a0 
    -11> 2016-09-13 16:15:26.832007 7f55602f3700    1 -- 10.222.5.162:0/3252549074 <== osd.8 10.222.5.162:6815/16206 1451 ==== osd_op_reply(17147 100000000b6.00000001 [write 0~1966080] v323'50 uv50 ondisk = 0) v6 ==== 187+0+0 (4058597970 0 0) 0x7f54cc001150 con 0x7f55500461a0 
    -10> 2016-09-13 16:15:26.832353 7f55602f3700    1 -- 10.222.5.162:0/3252549074 <== osd.8 10.222.5.162:6815/16206 1452 ==== osd_op_reply(17155 100000000b6.00000001 [write 1966080~2228224] v323'51 uv51 ondisk = 0) v6 ==== 187+0+0 (2720977930 0 0) 0x7f54cc001150 con 0x7f55500461a0 
     -9> 2016-09-13 16:15:26.843866 7f557a7fc700    1 -- 10.222.5.162:0/3252549074 <== mds.0 10.222.5.162:6800/24314 8320 ==== client_caps(flush_ack ino 100000000b0 1394 seq 4 tid 1488 caps=pAsxLsXsxFsxcrwb dirty=Fw wanted=- follows 0 size 0/0 mtime 0.000000) v8 ==== 216+0+0 (744883454 0 0) 0x7f556800dc90 con 0x7f556c00e790 
     -8> 2016-09-13 16:15:26.843945 7f557a7fc700    5 client.24191 handle_cap_flush_ack mds.0 cleaned - on 100000000b0.head(faked_ino=0 ref=3 ll_ref=1 cap_refs={1024=0,4096=0,8192=0} open={2=0} mode=100755 size=10485760/16777216 mtime=2016-09-13 16:15:17.591570 caps=pAsLsXsFsc(0=pAsLsXsFsc) flushing_caps=Fw objectset[100000000b0 ts 0/0 objects 3 dirty_or_tx 0] parents=0x7f54bc5f90e0 0x7f556c0be600) with Fw 
     -7> 2016-09-13 16:15:26.843990 7f557a7fc700    1 -- 10.222.5.162:0/3252549074 <== mds.0 10.222.5.162:6800/24314 8321 ==== client_caps(flush_ack ino 100000000b1 1395 seq 4 tid 1489 caps=pAsxLsXsxFsxcrwb dirty=Fw wanted=- follows 0 size 0/0 mtime 0.000000) v8 ==== 216+0+0 (3235489794 0 0) 0x7f5568000d70 con 0x7f556c00e790 
     -6> 2016-09-13 16:15:26.844017 7f557a7fc700    5 client.24191 handle_cap_flush_ack mds.0 cleaned - on 100000000b1.head(faked_ino=0 ref=3 ll_ref=1 cap_refs={1024=0,4096=0,8192=0} open={2=0} mode=100755 size=10485760/16777216 mtime=2016-09-13 16:15:17.596742 caps=pAsLsXsFsc(0=pAsLsXsFsc) flushing_caps=Fw objectset[100000000b1 ts 0/0 objects 3 dirty_or_tx 0] parents=0x7f5520029860 0x7f556c0bd380) with Fw 
     -5> 2016-09-13 16:15:26.851251 7f55786b3700    1 -- 10.222.5.162:0/3252549074 <== osd.7 10.222.5.162:6811/16040 2517 ==== osd_op_reply(17134 100000000b2.00000001 [write 0~2752512] v323'46 uv46 ondisk = 0) v6 ==== 187+0+0 (1467277884 0 0) 0x7f5500003fe0 con 0x7f5550016c60 
     -4> 2016-09-13 16:15:26.851253 7f55607f8700    1 -- 10.222.5.162:0/3252549074 <== osd.3 10.222.5.156:6800/3592 8 ==== osd_op_reply(17002 100000009c4.00000002 [write 0~1179648,write 1179648~917504] v323'29 uv29 ondisk = 0) v6 ==== 229+0+0 (2689207009 0 0) 0x7f54c0002350 con 0x7f5550037a10 
     -3> 2016-09-13 16:15:26.851516 7f55607f8700    1 -- 10.222.5.162:0/3252549074 <== osd.3 10.222.5.156:6800/3592 9 ==== osd_op_reply(17130 100000000b6.00000000 [write 0~4194304] v323'52 uv52 ondisk = 0) v6 ==== 187+0+0 (2987763432 0 0) 0x7f54c0002350 con 0x7f5550037a10 
     -2> 2016-09-13 16:15:26.852306 7f55786b3700    1 -- 10.222.5.162:0/3252549074 <== osd.7 10.222.5.162:6811/16040 2518 ==== osd_op_reply(17140 100000000b2.00000001 [write 2752512~1441792] v323'47 uv47 ondisk = 0) v6 ==== 187+0+0 (3703741868 0 0) 0x7f5500003fe0 con 0x7f5550016c60 
     -1> 2016-09-13 16:15:26.852467 7f55607f8700    1 -- 10.222.5.162:0/3252549074 <== osd.3 10.222.5.156:6800/3592 10 ==== osd_op_reply(17149 100000000b2.00000002 [write 0~1572864] v323'37 uv37 ondisk = 0) v6 ==== 187+0+0 (3935522164 0 0) 0x7f54c0002350 con 0x7f5550037a10 
      0> 2016-09-13 16:15:26.853667 7f55815b5700 -1 osdc/ObjectCacher.cc: In function 'void ObjectCacher::Object::try_merge_bh(ObjectCacher::BufferHead*)' thread 7f55815b5700 time 2016-09-13 16:15:26.851601 
 osdc/ObjectCacher.cc: 146: FAILED assert(p->second == bh) 

  ceph version 10.2.2 (45107e21c568dd033c2f0a3107dec8f0b0e58374) 
  1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x85) [0x7f5587528965] 
  2: (ObjectCacher::Object::try_merge_bh(ObjectCacher::BufferHead*)+0x315) [0x7f55873d5475] 
  3: (ObjectCacher::bh_write_commit(long, sobject_t, std::vector<std::pair<long, unsigned long>, std::allocator<std::pair<long, unsigned long> > >&, unsigned long, int)+0x905) [0x7f55873da135] 
  4: (ObjectCacher::C_WriteCommit::finish(int)+0x58) [0x7f55873e8018] 
  5: (Context::complete(int)+0x9) [0x7f558739b2f9] 
  6: (C_Lock::finish(int)+0x29) [0x7f558739bc49] 
  7: (Context::complete(int)+0x9) [0x7f558739b2f9] 
  8: (Finisher::finisher_thread_entry()+0x216) [0x7f558744c6d6] 
  9: (()+0x7dc5) [0x7f5586199dc5] 
  10: (clone()+0x6d) [0x7f558507fced] 
  NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this. 
 </pre> 



 This problem was reproducible. 

  I remove one osd from the osd tree and after one minute, I add the same osd to osd pool and then fuse client crush. 

 Ceph fuse is writing data through smallfile too, and the script is  

 "    python smallfile_cli.py --top /mnt/test --threads 8 --files 20 --file-size 10240 --record-size 512 --operation create" 

 and my remove osd    steps are: 

 1.    kill -9 $pid_num 

 2. ceph osd out $id 

 3. ceph osd down $id 

 4. ceph osd crush remove osd.$id 

 5. ceph auth del osd.$id 

 6. ceph osd rm osd.$id 

 and my add osd steps are: 

 1. mkfs.xfs    and remount my the osd remove.. 

 2.    ceph osd create 

 3. ceph-osd -i $id --mkfs --osd-data=/data/osd/osd.$id --mkkey 

 4. ceph auth add osd.$id osd 'allow *' mon 'allow rwx' -i /data/osd/osd.$id/keyring 

 5. ceph osd crush create-or-move osd.$id 1.0 host=xxxx 

 6. ceph-osd -i $id 






Back