Bug #18816

Updated by Greg Farnum about 7 years ago

Note the "mds_log = false" setting in the ceph.conf below. With that setting, this happens: 

 The MDS daemon crashed while running different payload tests, with the following dump: 

 --- begin dump of recent events --- 
      0> 2017-02-03 02:34:41.974639 7f7e8ec5e700 -1 *** Caught signal (Aborted) ** 
  in thread 7f7e8ec5e700 thread_name:ms_dispatch 

  ceph version 10.2.4-211-g12b091b (12b091b4a40947aa43919e71a318ed0dcedc8734) 
  1: (()+0x5142a2) [0x557c51e092a2] 
  2: (()+0x10b00) [0x7f7e95df2b00] 
  3: (gsignal()+0x37) [0x7f7e93ccb8d7] 
  4: (abort()+0x13a) [0x7f7e93ccccaa] 
  5: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x265) [0x557c51f133d5] 
  6: (MutationImpl::~MutationImpl()+0x28e) [0x557c51bb9e1e] 
  7: (std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release()+0x39) [0x557c51b2ccf9] 
  8: (Locker::check_inode_max_size(CInode*, bool, bool, unsigned long, bool, unsigned long, utime_t)+0x9a7) [0x557c51ca2757] 
  9: (Locker::remove_client_cap(CInode*, client_t)+0xb1) [0x557c51ca38f1] 
  10: (Locker::_do_cap_release(client_t, inodeno_t, unsigned long, unsigned int, unsigned int)+0x90d) [0x557c51ca424d] 
  11: (Locker::handle_client_cap_release(MClientCapRelease*)+0x1cc) [0x557c51ca449c] 
  12: (MDSRank::handle_deferrable_message(Message*)+0xc1c) [0x557c51b33d3c] 
  13: (MDSRank::_dispatch(Message*, bool)+0x1e1) [0x557c51b3c991] 
  14: (MDSRankDispatcher::ms_dispatch(Message*)+0x15) [0x557c51b3dae5] 
  15: (MDSDaemon::ms_dispatch(Message*)+0xc3) [0x557c51b25703] 
  16: (DispatchQueue::entry()+0x78b) [0x557c5200d06b] 
  17: (DispatchQueue::DispatchThread::entry()+0xd) [0x557c51ee5dcd] 
  18: (()+0x8734) [0x7f7e95dea734] 
  19: (clone()+0x6d) [0x7f7e93d80d3d] 
  NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this. 

 How to reproduce the issue: 

 Option 1 is an FIO test. Run the FIO payload on the client machine with the config below, then execute "echo 3 > /proc/sys/vm/drop_caches" and run the test again. 

 [test] 
 blocksize=64k 
 filename=/mnt/mycephfs/payload3G 
 rw=randread 
 direct=1 
 buffered=0 
 ioengine=libaio 
 iodepth=8 
 runtime=300 
 filesize=3G   
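
 The three steps of option 1 can be sketched as a small script. This is only an illustration under assumptions that are not in the report: the job file name test.fio (assumed to contain the [test] section above), the DRY_RUN switch, and the helper names run and reproduce_fio are all hypothetical. By default the script only prints the commands; set DRY_RUN=0 and run it as root on the client to actually execute them.

```shell
#!/bin/sh
# Sketch of reproduction option 1 (assumptions: test.fio and DRY_RUN are
# hypothetical names, not taken from the original report).
JOB_FILE="${JOB_FILE:-test.fio}"
DRY_RUN="${DRY_RUN:-1}"

# Print a command; execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

reproduce_fio() {
    # First pass over the CephFS-backed file.
    run fio "$JOB_FILE"
    # Drop the client's page cache, dentries and inodes (needs root).
    run sh -c 'echo 3 > /proc/sys/vm/drop_caches'
    # Second pass, as described in the report.
    run fio "$JOB_FILE"
}

reproduce_fio
```

 The dry-run default is a safety choice for this sketch, since dropping caches affects the whole client machine.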

 Option 2 is an "inodes payload". Download the latest Joomla distribution, unzip it, and perform this: 

 time for i in `seq 100`; do cp -a joomla /mnt/mycephfs/joomla${i}; done 

 My sandbox is pretty simple and consists of a server machine (cephnode below) and a client machine (payload below): 

 cephnode:~ # ceph -s 
     cluster c848af4a-98ea-498c-87d6-059ebf609287 
      health HEALTH_WARN 
             mds cephnode is laggy 
      monmap e1: 1 mons at {cephnode=192.168.10.20:6789/0} 
             election epoch 9, quorum 0 cephnode 
       fsmap e96: 1/1/1 up {0=cephnode=up:active(laggy or crashed)} 
      osdmap e96: 1 osds: 1 up, 1 in 
             flags sortbitwise,require_jewel_osds 
       pgmap v1832: 204 pgs, 3 pools, 3072 MB data, 787 objects 
             3117 MB used, 396 GB / 399 GB avail 
                  204 active+clean 

 cephnode:~ # cat /etc/ceph/ceph.conf 
 [global] 
 fsid = c848af4a-98ea-498c-87d6-059ebf609287 
 mon_initial_members = cephnode 
 mon_host = 192.168.10.20 
 auth_cluster_required = cephx 
 auth_service_required = cephx 
 auth_client_required = cephx 
 osd_pool_default_size = 1 
 mds_log = false 

 cephnode:~ # lsb_release -a 
 LSB Version:      n/a 
 Distributor ID: SUSE 
 Description:      SUSE Linux Enterprise Server 12 SP2 
 Release:          12.2 
 Codename:         n/a 

 cephnode:~ # uname -a 
 Linux cephnode 4.4.38-93-default #1 SMP Wed Dec 14 12:59:43 UTC 2016 (2d3e9d4) x86_64 x86_64 x86_64 GNU/Linux 

 cephnode:~ # ceph -v 
 ceph version 10.2.4-211-g12b091b (12b091b4a40947aa43919e71a318ed0dcedc8734) 

 payload:~ # ceph -v 
 ceph version 10.2.4-211-g12b091b (12b091b4a40947aa43919e71a318ed0dcedc8734) 

 payload:~ # mount 
 ... 
 192.168.10.20:6789:/ on /mnt/mycephfs type ceph (rw,relatime,name=admin,secret=<hidden>,acl) 

 payload:~ # lsb_release -a 
 LSB Version:      n/a 
 Distributor ID: SUSE 
 Description:      SUSE Linux Enterprise Server 12 SP2 
 Release:          12.2 
 Codename:         n/a 

 payload:~ # uname -a 
 Linux payload 4.4.38-93-default #1 SMP Wed Dec 14 12:59:43 UTC 2016 (2d3e9d4) x86_64 x86_64 x86_64 GNU/Linux 
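
 Since the note at the top of this report ties the crash to "mds_log = false", a quick check for that setting in a config file might look like the sketch below. The function name has_mds_log_disabled is hypothetical, and the check is deliberately simplified: it matches only the literal value "false", ignores which section the line is in, and does not consult settings injected at runtime.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a ceph.conf-style file contains a line that
# sets mds_log to false. Ceph accepts "mds_log", "mds log" and "mds-log" as
# spellings of the same option, so all three are matched.
has_mds_log_disabled() {
    grep -Eiq '^[[:space:]]*mds[_ -]log[[:space:]]*=[[:space:]]*false[[:space:]]*$' "$1"
}

# Usage: check the path given as the first argument, or the default location.
CONF="${1:-/etc/ceph/ceph.conf}"
if [ -r "$CONF" ] && has_mds_log_disabled "$CONF"; then
    echo "mds_log is disabled in $CONF"
fi
```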

 Priority: High 
 Affected Versions: 10.2.4 
 ceph-qa-suite: 'fs' and 'kcephfs' 
 Release: jewel
