Bug #14022

map_sem for read + request_mutex are held indefinitely

Added by Micha Krause over 5 years ago. Updated about 2 years ago.

Status: Resolved
Priority: Low
Assignee:
Category: libceph
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Crash signature (v1):
Crash signature (v2):

Description

Hi,

When using rsync, my kernel rbd device hangs. Here is the data I have collected so far:


root@chaus:~# uname -a
Linux chaus 4.2.0-0.bpo.1-amd64 #1 SMP Debian 4.2.5-1~bpo8+1 (2015-11-02) x86_64 GNU/Linux

root@chaus:~# ls -l /sys/kernel/debug/ceph/
total 0
drwxr-xr-x 2 root root 0 Nov 16 09:49 46e857ee-855c-4165-8413-8950f8d081be.client81525668

root@chaus:~# ls -l /sys/kernel/debug/ceph/46e857ee-855c-4165-8413-8950f8d081be.client81525668/
total 0
-rw------- 1 root root 0 Nov 16 09:49 client_options
-rw------- 1 root root 0 Nov 16 09:49 monc
-rw------- 1 root root 0 Nov 16 09:49 monmap
-rw------- 1 root root 0 Nov 16 09:49 osdc
-rw------- 1 root root 0 Nov 16 09:49 osdmap

root@chaus:~# cat /sys/kernel/debug/ceph/46e857ee-855c-4165-8413-8950f8d081be.client81525668/client_options 
name=rbd-ahaus,secret=<hidden>

root@chaus:~# cat /sys/kernel/debug/ceph/46e857ee-855c-4165-8413-8950f8d081be.client81525668/monc
have osdmap 178267
want next osdmap

root@chaus:~# cat /sys/kernel/debug/ceph/46e857ee-855c-4165-8413-8950f8d081be.client81525668/monmap
epoch 10
        mon0    10.210.32.11:6789
        mon1    10.210.33.11:6789
        mon2    10.210.34.11:6789

root@chaus:~# cat /sys/kernel/debug/ceph/46e857ee-855c-4165-8413-8950f8d081be.client81525668/osdc  <-- Hangs, (does not return to shell prompt)

root@chaus:~# cat /sys/kernel/debug/ceph/46e857ee-855c-4165-8413-8950f8d081be.client81525668/osdmap 
epoch 178267
flags
pool 25 pg_num 8 (7) read_tier -1 write_tier -1
pool 27 pg_num 1 (0) read_tier -1 write_tier -1
pool 28 pg_num 1 (0) read_tier -1 write_tier -1
pool 29 pg_num 1 (0) read_tier -1 write_tier -1
pool 30 pg_num 1 (0) read_tier -1 write_tier -1
pool 31 pg_num 1 (0) read_tier -1 write_tier -1
pool 32 pg_num 1 (0) read_tier -1 write_tier -1
pool 33 pg_num 1 (0) read_tier -1 write_tier -1
pool 34 pg_num 128 (127) read_tier -1 write_tier -1
pool 35 pg_num 512 (511) read_tier -1 write_tier -1
pool 36 pg_num 256 (255) read_tier -1 write_tier -1
pool 40 pg_num 4096 (4095) read_tier -1 write_tier -1
pool 45 pg_num 128 (127) read_tier -1 write_tier -1
pool 46 pg_num 64 (63) read_tier -1 write_tier -1
pool 53 pg_num 1 (0) read_tier -1 write_tier -1
pool 54 pg_num 512 (511) read_tier -1 write_tier -1
pool 56 pg_num 128 (127) read_tier -1 write_tier -1                                                                                                                                           
pool 57 pg_num 128 (127) read_tier -1 write_tier -1                                                                                                                                           
pool 58 pg_num 128 (127) read_tier -1 write_tier -1                                                                                                                                           
pool 59 pg_num 128 (127) read_tier -1 write_tier -1                                                                                                                                           
pool 60 pg_num 8 (7) read_tier -1 write_tier -1                                                                                                                                               
pool 61 pg_num 128 (127) read_tier -1 write_tier -1                                                                                                                                           
pool 62 pg_num 32 (31) read_tier -1 write_tier -1                                                                                                                                             
osd0    10.210.32.21:6800       100%    (exists, up)    100%                                                                                                                                  
osd1    10.210.32.21:6802       100%    (exists, up)    100%                                                                                                                                  
osd2    10.210.32.21:6804        95%    (exists, up)    100%                                                                                                                                  
osd3    10.210.32.21:6806        89%    (exists, up)    100%                                                                                                                                  
osd4    10.210.32.21:6803       100%    (exists, up)    100%                                                                                                                                  
osd5    10.210.32.21:6805        89%    (exists, up)    100%                                                                                                                                  
osd6    10.210.32.21:6825        84%    (exists, up)    100%                                                                                                                                  
osd7    10.210.32.21:6801       100%    (exists, up)    100%                                                                                                                                  
osd8    10.210.32.21:6828       100%    (exists, up)    100%                                                                                                                                  
osd9    10.210.32.21:6813       100%    (exists, up)    100%                                                                                                                                  
osd10   10.210.34.22:6802       100%    (exists, up)    100%
osd11   10.210.34.22:6803       100%    (exists, up)    100%
osd12   10.210.34.22:6806       100%    (exists, up)    100%
osd13   10.210.34.22:6808       100%    (exists, up)    100%
osd14   10.210.34.22:6800       100%    (exists, up)    100%
osd15   10.210.32.23:6801       100%    (exists, up)    100%
osd16   10.210.32.23:6805       100%    (exists, up)    100%
osd17   10.210.32.23:6807       100%    (exists, up)    100%
osd18   10.210.32.23:6809       100%    (exists, up)    100%
osd19   10.210.32.23:6817       100%    (exists, up)    100%
osd20   10.210.34.21:6800       100%    (exists, up)    100%
osd21   10.210.34.21:6801       100%    (exists, up)    100%
osd22   10.210.34.21:6807       100%    (exists, up)    100%
osd23   10.210.34.21:6809       100%    (exists, up)    100%
osd24   10.210.34.21:6816       100%    (exists, up)    100%
osd25   10.210.33.21:6800       100%    (exists, up)    100%
osd26   10.210.33.21:6802       100%    (exists, up)    100%
osd27   10.210.33.21:6804       100%    (exists, up)    100%
osd28   10.210.33.21:6807       100%    (exists, up)    100%
osd29   10.210.33.21:6809       100%    (exists, up)    100%
osd30   10.210.34.22:6811       100%    (exists, up)    100%
osd31   10.210.34.22:6813       100%    (exists, up)    100%
osd32   10.210.34.22:6815       100%    (exists, up)    100%
osd33   10.210.34.22:6817       100%    (exists, up)    100%
osd34   10.210.34.22:6801       100%    (exists, up)    100%
osd35   10.210.34.22:6819       100%    (exists, up)    100%
osd36   10.210.33.22:6803       100%    (exists, up)    100%
osd37   10.210.33.22:6805       100%    (exists, up)    100%
osd38   10.210.33.22:6807       100%    (exists, up)    100%
osd39   10.210.33.22:6809       100%    (exists, up)    100%
osd40   10.210.33.22:6811       100%    (exists, up)    100%
osd41   10.210.33.22:6814       100%    (exists, up)    100%
osd42   10.210.33.22:6804        89%    (exists, up)    100%
osd43   10.210.33.22:6818       100%    (exists, up)    100%
osd44   10.210.33.22:6821       100%    (exists, up)    100%
osd45   10.210.33.22:6800       100%    (exists, up)    100%
osd46   10.210.33.22:6820       100%    (exists, up)    100%
osd47   10.210.32.25:6800        84%    (exists, up)    100%
osd48   10.210.32.25:6804       100%    (exists, up)    100%
osd49   10.210.32.25:6806        94%    (exists, up)    100%
osd50   10.210.32.25:6809       100%    (exists, up)    100%
osd51   10.210.32.25:6815       100%    (exists, up)    100%
osd52   10.210.32.25:6818        89%    (exists, up)    100%
osd53   10.210.32.25:6823        89%    (exists, up)    100%
osd54   10.210.32.25:6829       100%    (exists, up)    100%
osd55   10.210.32.25:6831        79%    (exists, up)    100%
osd56   10.210.32.25:6833       100%    (exists, up)    100%
osd57   10.210.32.25:6835       100%    (exists, up)    100%
osd58   10.210.34.23:6802       100%    (exists, up)    100%
osd59   10.210.34.23:6804       100%    (exists, up)    100%
osd60   10.210.34.23:6813       100%    (exists, up)    100%
osd61   10.210.34.23:6817       100%    (exists, up)    100%
osd62   10.210.34.23:6800       100%    (exists, up)    100%
osd63   10.210.34.23:6821       100%    (exists, up)    100%
osd64   10.210.34.23:6823       100%    (exists, up)    100%
osd65   10.210.34.23:6825        94%    (exists, up)    100%
osd66   10.210.34.23:6827       100%    (exists, up)    100%
osd67   10.210.34.23:6829       100%    (exists, up)    100%
osd68   10.210.34.23:6831       100%    (exists, up)    100%
osd69   10.210.33.23:6811       100%    (exists, up)    100%
osd70   10.210.33.23:6813       100%    (exists, up)    100%
osd71   10.210.33.23:6815       100%    (exists, up)    100%
osd72   10.210.33.23:6818       100%    (exists, up)    100%
osd73   10.210.33.23:6820        94%    (exists, up)    100%
osd74   10.210.33.23:6822       100%    (exists, up)    100%
osd75   10.210.33.23:6826       100%    (exists, up)    100%
osd76   10.210.33.23:6829       100%    (exists, up)    100%
osd77   10.210.33.23:6832       100%    (exists, up)    100%
osd78   10.210.33.23:6836       100%    (exists, up)    100%
osd79   10.210.33.23:6840       100%    (exists, up)    100%
osd80   10.210.32.22:6800       100%    (exists, up)    100%
osd81   10.210.32.22:6803       100%    (exists, up)    100%
osd82   10.210.32.22:6807       100%    (exists, up)    100%
osd83   10.210.32.22:6810        94%    (exists, up)    100%
osd84   10.210.32.22:6812       100%    (exists, up)    100%
osd85   10.210.32.22:6815       100%    (exists, up)    100%
osd86   10.210.32.22:6819       100%    (exists, up)    100%
osd87   10.210.32.22:6823        94%    (exists, up)    100%
osd88   10.210.32.22:6804       100%    (exists, up)    100%
osd89   10.210.32.22:6828        89%    (exists, up)    100%
osd90   10.210.32.22:6833       100%    (exists, up)    100%
osd91   10.210.34.24:6807       100%    (exists, up)    100%
osd92   10.210.34.24:6809        89%    (exists, up)    100%
osd93   10.210.34.24:6811       100%    (exists, up)    100%
osd94   10.210.34.24:6814       100%    (exists, up)    100%
osd95   10.210.34.24:6823        94%    (exists, up)    100%
osd96   10.210.34.24:6825       100%    (exists, up)    100%
osd97   10.210.34.24:6828       100%    (exists, up)    100%
osd98   10.210.34.24:6831       100%    (exists, up)    100%
osd99   10.210.34.24:6834       100%    (exists, up)    100%
osd100  10.210.34.24:6800       100%    (exists, up)    100%
osd101  10.210.34.24:6804        94%    (exists, up)    100%
osd102  10.210.33.24:6804       100%    (exists, up)    100%
osd103  10.210.33.24:6806       100%    (exists, up)    100%
osd104  10.210.33.24:6808       100%    (exists, up)    100%
osd105  10.210.33.24:6811       100%    (exists, up)    100%
osd106  10.210.33.24:6813       100%    (exists, up)    100%
osd107  10.210.33.24:6817       100%    (exists, up)    100%
osd108  10.210.33.24:6820       100%    (exists, up)    100%
osd109  10.210.33.24:6822       100%    (exists, up)    100%
osd110  10.210.33.24:6827        84%    (exists, up)    100%
osd111  10.210.33.24:6834       100%    (exists, up)    100%
osd112  10.210.33.24:6837       100%    (exists, up)    100%
osd113  10.210.34.25:6805       100%    (exists, up)    100%
osd114  10.210.34.25:6801       100%    (exists, up)    100%
osd115  10.210.34.25:6818       100%    (exists, up)    100%
osd116  10.210.34.25:6825       100%    (exists, up)    100%
osd117  10.210.34.25:6827       100%    (exists, up)    100%
osd118  10.210.34.25:6806       100%    (exists, up)    100%
osd119  10.210.34.25:6831       100%    (exists, up)    100%
osd120  10.210.34.25:6833        94%    (exists, up)    100%
osd121  10.210.34.25:6835       100%    (exists, up)    100%
osd122  10.210.34.25:6837        79%    (exists, up)    100%
osd123  10.210.34.25:6839       100%    (exists, up)    100%
osd124  10.210.34.25:6841       100%    (exists, up)    100%
osd125  10.210.33.25:6800       100%    (exists, up)    100%
osd126  10.210.33.25:6802       100%    (exists, up)    100%
osd127  10.210.33.25:6804       100%    (exists, up)    100%
osd128  10.210.33.25:6806       100%    (exists, up)    100%
osd129  10.210.33.25:6808        94%    (exists, up)    100%
osd130  10.210.33.25:6810       100%    (exists, up)    100%
osd131  10.210.33.25:6812       100%    (exists, up)    100%
osd132  10.210.33.25:6814       100%    (exists, up)    100%
osd133  10.210.33.25:6816       100%    (exists, up)    100%
osd134  10.210.33.25:6818       100%    (exists, up)    100%
osd135  10.210.33.25:6820       100%    (exists, up)    100%
osd136  10.210.33.25:6822       100%    (exists, up)    100%
osd137  10.210.32.24:6801       100%    (exists, up)    100%
osd138  10.210.32.24:6803        79%    (exists, up)    100%
osd139  10.210.32.24:6806       100%    (exists, up)    100%
osd140  10.210.32.24:6809       100%    (exists, up)    100%
osd141  10.210.32.24:6805       100%    (exists, up)    100%
osd142  10.210.32.24:6816       100%    (exists, up)    100%
osd143  10.210.32.24:6818       100%    (exists, up)    100%
osd144  10.210.32.24:6821       100%    (exists, up)    100%
osd145  10.210.32.24:6823       100%    (exists, up)    100%
osd146  10.210.32.24:6825       100%    (exists, up)    100%
osd147  10.210.32.24:6827       100%    (exists, up)    100%
osd148  10.210.32.24:6829       100%    (exists, up)    100%
osd149  10.210.33.10:6806       100%    (exists, up)    100%
osd150  10.210.33.10:6801       100%    (exists, up)    100%
osd151  10.210.33.10:6810       100%    (exists, up)    100%
osd152  10.210.33.10:6813       100%    (exists, up)    100%
osd153  10.210.33.10:6817       100%    (exists, up)    100%
osd154  10.210.33.10:6821       100%    (exists, up)    100%
osd155  10.210.33.10:6824       100%    (exists, up)    100%
osd156  10.210.33.10:6826       100%    (exists, up)    100%
osd157  10.210.33.10:6828       100%    (exists, up)    100%
osd158  10.210.33.10:6830       100%    (exists, up)    100%
osd159  10.210.33.10:6832       100%    (exists, up)    100%
osd160  10.210.33.10:6834       100%    (exists, up)    100%
osd161  10.210.32.10:6800       100%    (exists, up)    100%
osd162  10.210.32.10:6802       100%    (exists, up)    100%
osd163  10.210.32.10:6804       100%    (exists, up)    100%
osd164  10.210.32.10:6806       100%    (exists, up)    100%
osd165  10.210.32.10:6808       100%    (exists, up)    100%
osd166  10.210.32.10:6810       100%    (exists, up)    100%
osd167  10.210.32.10:6812       100%    (exists, up)    100%
osd168  10.210.32.10:6814        79%    (exists, up)    100%
osd169  10.210.32.10:6816       100%    (exists, up)    100%
osd170  10.210.32.10:6818       100%    (exists, up)    100%
osd171  10.210.32.10:6820       100%    (exists, up)    100%
osd172  10.210.34.26:6800       100%    (exists, up)    100%
osd173  10.210.34.26:6802       100%    (exists, up)    100%
osd174  10.210.34.26:6804       100%    (exists, up)    100%
osd175  10.210.34.26:6806       100%    (exists, up)    100%
osd176  10.210.34.26:6808       100%    (exists, up)    100%
osd177  10.210.34.26:6810       100%    (exists, up)    100%
osd178  10.210.34.26:6812       100%    (exists, up)    100%
osd179  10.210.34.26:6814       100%    (exists, up)    100%
osd180  10.210.34.26:6816       100%    (exists, up)    100%
osd181  10.210.34.26:6818       100%    (exists, up)    100%
osd182  10.210.34.26:6820       100%    (exists, up)    100%
osd183  10.210.33.8:6800        100%    (exists, up)    100%
osd184  10.210.33.8:6802        100%    (exists, up)    100%
osd185  10.210.33.8:6804        100%    (exists, up)    100%
osd186  10.210.33.8:6806        100%    (exists, up)    100%
osd187  10.210.33.8:6808        100%    (exists, up)    100%
osd188  10.210.33.8:6810        100%    (exists, up)    100%
osd189  10.210.33.8:6812        100%    (exists, up)    100%
osd190  10.210.33.8:6814        100%    (exists, up)    100%
osd191  10.210.33.8:6816        100%    (exists, up)    100%
osd192  10.210.33.8:6818        100%    (exists, up)    100%
osd193  10.210.33.8:6820        100%    (exists, up)    100%

root@chaus:~# ps auxf <-- Hangs

root@chaus:~# netstat -autpn
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      675/sshd        
tcp        0      0 0.0.0.0:25              0.0.0.0:*               LISTEN      767/exim4       
tcp        0      0 0.0.0.0:54907           0.0.0.0:*               LISTEN      658/rpc.statd   
tcp        0      0 0.0.0.0:5666            0.0.0.0:*               LISTEN      749/nrpe        
tcp        0      0 0.0.0.0:111             0.0.0.0:*               LISTEN      649/rpcbind     
tcp      265      0 10.2.0.184:43610        10.210.34.23:6802       CLOSE_WAIT  -               
tcp      265      0 10.2.0.184:37066        10.210.33.23:6811       CLOSE_WAIT  -               
tcp      265      0 10.2.0.184:54940        10.210.33.25:6820       CLOSE_WAIT  -               
tcp      583      0 10.2.0.184:52776        10.210.33.22:6807       CLOSE_WAIT  -               
tcp      265      0 10.2.0.184:35526        10.210.34.24:6823       CLOSE_WAIT  -               
tcp     1219      0 10.2.0.184:54180        10.210.33.25:6818       CLOSE_WAIT  -               
tcp     1219      0 10.2.0.184:52378        10.210.34.23:6829       CLOSE_WAIT  -               
tcp      265      0 10.2.0.184:50624        10.210.34.24:6804       CLOSE_WAIT  -               
tcp      265      0 10.2.0.184:46080        10.210.32.24:6825       CLOSE_WAIT  -               
tcp     1219      0 10.2.0.184:45458        10.210.32.25:6800       CLOSE_WAIT  -               
tcp      583      0 10.2.0.184:49914        10.210.33.23:6826       CLOSE_WAIT  -               
tcp      265      0 10.2.0.184:42036        10.210.34.22:6813       CLOSE_WAIT  -               
tcp      901      0 10.2.0.184:44448        10.210.34.26:6808       CLOSE_WAIT  -               
tcp      265      0 10.2.0.184:37284        10.210.33.25:6808       CLOSE_WAIT  -               
tcp      265      0 10.2.0.184:42858        10.210.34.22:6808       CLOSE_WAIT  -               
tcp        1      0 10.2.0.184:51536        10.210.34.24:6807       CLOSE_WAIT  -               
tcp      265      0 10.2.0.184:60356        10.210.33.22:6814       CLOSE_WAIT  -               
tcp        0      0 10.2.0.184:22           10.6.6.137:50894        ESTABLISHED 18364/7         
tcp      265      0 10.2.0.184:35674        10.210.34.26:6804       CLOSE_WAIT  -               
tcp      583      0 10.2.0.184:52150        10.210.32.10:6820       CLOSE_WAIT  -               
tcp      265      0 10.2.0.184:38264        10.210.34.22:6815       CLOSE_WAIT  -               
tcp        0      0 10.2.0.184:22           10.6.6.137:49258        ESTABLISHED 15872/5         
tcp      265      0 10.2.0.184:57114        10.210.34.26:6810       CLOSE_WAIT  -               
tcp      265      0 10.2.0.184:49298        10.210.33.22:6804       CLOSE_WAIT  -               
tcp      265      0 10.2.0.184:57480        10.210.32.24:6805       CLOSE_WAIT  -               
tcp      901      0 10.2.0.184:53614        10.210.32.25:6804       CLOSE_WAIT  -               
tcp      265      0 10.2.0.184:37266        10.210.33.23:6836       CLOSE_WAIT  -               
tcp      901      0 10.2.0.184:45544        10.210.33.10:6806       CLOSE_WAIT  -               
tcp     1537      0 10.2.0.184:46374        10.210.32.25:6806       CLOSE_WAIT  -               
tcp        1      0 10.2.0.184:56040        10.210.33.11:6789       CLOSE_WAIT  -               
tcp      583      0 10.2.0.184:59506        10.210.33.8:6808        CLOSE_WAIT  -               
tcp     2809      0 10.2.0.184:52040        10.210.32.10:6816       CLOSE_WAIT  -               
tcp     4399      0 10.2.0.184:52222        10.210.32.25:6815       CLOSE_WAIT  -               
tcp      265      0 10.2.0.184:40684        10.210.34.26:6800       CLOSE_WAIT  -               
tcp      265      0 10.2.0.184:59528        10.210.33.8:6804        CLOSE_WAIT  -               
tcp     4825      0 10.2.0.184:40366        10.210.33.21:6804       CLOSE_WAIT  -               
tcp      265      0 10.2.0.184:38218        10.210.33.24:6822       CLOSE_WAIT  -               
tcp      265      0 10.2.0.184:42118        10.210.32.10:6806       CLOSE_WAIT  -               
tcp      265      0 10.2.0.184:47072        10.210.34.25:6801       CLOSE_WAIT  -               
tcp     1219      0 10.2.0.184:36734        10.210.34.25:6805       CLOSE_WAIT  -               
tcp      265      0 10.2.0.184:46782        10.210.32.23:6805       CLOSE_WAIT  -               
tcp        0   4224 10.2.0.184:22           10.6.6.137:50899        ESTABLISHED 18382/8         
tcp     1237      0 10.2.0.184:40304        10.210.32.25:6831       CLOSE_WAIT  -               
tcp      901      0 10.2.0.184:38000        10.210.34.25:6841       CLOSE_WAIT  -               
tcp     1219      0 10.2.0.184:58966        10.210.34.26:6814       CLOSE_WAIT  -               
tcp      583      0 10.2.0.184:43550        10.210.34.25:6835       CLOSE_WAIT  -               
tcp        0      0 10.2.0.184:22           10.6.6.137:49278        ESTABLISHED 15892/6         
tcp      583      0 10.2.0.184:58630        10.210.34.23:6804       CLOSE_WAIT  -               
tcp      265      0 10.2.0.184:35394        10.210.33.22:6809       CLOSE_WAIT  -               
tcp      265      0 10.2.0.184:52048        10.210.34.25:6831       CLOSE_WAIT  -               
tcp     1219      0 10.2.0.184:50384        10.210.32.10:6802       CLOSE_WAIT  -               
tcp      265      0 10.2.0.184:41756        10.210.33.22:6820       CLOSE_WAIT  -               
tcp      265      0 10.2.0.184:47582        10.210.33.10:6830       CLOSE_WAIT  -               
tcp6       0      0 :::22                   :::*                    LISTEN      675/sshd        
tcp6       0      0 :::25                   :::*                    LISTEN      767/exim4       
tcp6       0      0 :::53503                :::*                    LISTEN      658/rpc.statd   
tcp6       0      0 :::5666                 :::*                    LISTEN      749/nrpe        
tcp6       0      0 :::111                  :::*                    LISTEN      649/rpcbind     
udp        0      0 0.0.0.0:35628           0.0.0.0:*                           658/rpc.statd   
udp        0      0 0.0.0.0:111             0.0.0.0:*                           649/rpcbind     
udp        0      0 0.0.0.0:824             0.0.0.0:*                           649/rpcbind     
udp        0      0 127.0.0.1:834           0.0.0.0:*                           658/rpc.statd   
udp6       0      0 :::56597                :::*                                658/rpc.statd   
udp6       0      0 :::111                  :::*                                649/rpcbind     
udp6       0      0 :::824                  :::*                                649/rpcbind     

root@chaus:~# iostat 1 -xm /dev/rbd0
Linux 4.2.0-0.bpo.1-amd64 (chaus)       12/08/2015      _x86_64_        (2 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.51    0.00    1.03   97.35    0.00    1.11

Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
rbd0              0.43     0.32    1.53    1.78     0.05     0.15   122.58     1.45   79.02   32.94  118.66 299.15  99.26

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.00    0.00    0.00  100.00    0.00    0.00

Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
rbd0              0.00     0.00    0.00    0.00     0.00     0.00     0.00   128.00    0.00    0.00    0.00   0.00 100.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.00    0.00    0.00  100.00    0.00    0.00

Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
rbd0              0.00     0.00    0.00    0.00     0.00     0.00     0.00   128.00    0.00    0.00    0.00   0.00 100.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.50    0.00    0.50   99.00    0.00    0.00

Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
rbd0              0.00     0.00    0.00    0.00     0.00     0.00     0.00   128.00    0.00    0.00    0.00   0.00 100.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.00    0.00    0.00  100.00    0.00    0.00

Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
rbd0              0.00     0.00    0.00    0.00     0.00     0.00     0.00   128.00    0.00    0.00    0.00   0.00 100.00

I can provide more data if needed.

Micha Krause

dmesg (130 KB) Micha Krause, 12/08/2015 09:08 AM

journal (299 KB) Micha Krause, 12/11/2015 01:16 PM


Related issues

Related to Linux kernel client - Bug #15891: [rbd] i/o to rbd block device stopped constantly Resolved 05/16/2016

History

#1 Updated by Micha Krause over 5 years ago

Output of


root@chaus:~# echo w >/proc/sysrq-trigger
root@chaus:~# dmesg 

is in the attached file.

Randomly picking one of the hanging cat processes:

root@chaus:~# cat /proc/6996/stack 
[<ffffffffa05165f8>] osdc_show+0x38/0x170 [libceph]
[<ffffffff811e1c4b>] seq_read+0xcb/0x380
[<ffffffff811bfac1>] vfs_read+0x81/0x120
[<ffffffff811c0822>] SyS_read+0x42/0xa0
[<ffffffff815586f2>] system_call_fast_compare_end+0xc/0x6b
[<ffffffffffffffff>] 0xffffffffffffffff
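
Since ps itself hangs on this host, one way to find every blocked task rather than picking one at random is to read the state field from /proc directly. The following is a minimal sketch, not something used in this report: the function names list_blocked and dump_blocked_stacks are hypothetical, and it assumes that /proc/<pid>/stat stays readable while ps does not. Reading /proc/<pid>/stack requires root.

```shell
# list_blocked [PROC_ROOT]: print the PID of each task in state D
# (uninterruptible sleep). PROC_ROOT defaults to /proc.
list_blocked() {
    root="${1:-/proc}"
    for f in "$root"/[0-9]*/stat; do
        [ -r "$f" ] || continue
        # The state is the first character after the "(comm)" field. Strip
        # through the last ") " so a comm with spaces or parentheses is safe.
        state=$(sed 's/^.*) //' "$f" | cut -c1)
        if [ "$state" = "D" ]; then
            pid=${f%/stat}
            echo "${pid##*/}"
        fi
    done
}

# dump_blocked_stacks [PROC_ROOT]: print the kernel stack of each blocked task.
dump_blocked_stacks() {
    root="${1:-/proc}"
    for pid in $(list_blocked "$root"); do
        echo "== $pid"
        cat "$root/$pid/stack" 2>/dev/null
    done
}
```

Running dump_blocked_stacks as root would show whether all the hung cat processes are stuck in osdc_show or whether some are waiting elsewhere.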

#2 Updated by Ilya Dryomov over 5 years ago

  • Project changed from rbd to Linux kernel client
  • Subject changed from RBD device hangs to request_mutex is held indefinitely
  • Assignee set to Ilya Dryomov

#3 Updated by Ilya Dryomov over 5 years ago

Can you do the following:

  1. echo "waiting tasks" >/dev/kmsg
  2. echo w >/proc/sysrq-trigger
  3. echo "end of waiting tasks" >/dev/kmsg
  4. echo "all tasks" >/dev/kmsg
  5. echo t >/proc/sysrq-trigger
  6. echo "end of all tasks" >/dev/kmsg
  7. echo "lockdep" >/dev/kmsg
  8. echo d >/proc/sysrq-trigger
  9. echo "end of lockdep" >/dev/kmsg

and make sure that the file you attach starts with "waiting tasks" and ends with "end of lockdep"?
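
The steps above can be wrapped in a small script so that no marker is forgotten. This is only a sketch: the function names and the KMSG and SYSRQ variables are hypothetical, added so the sequence can be dry-run against ordinary files. With the defaults it writes to the real kernel interfaces and must be run as root.

```shell
# Targets default to the real kernel interfaces. Overriding them is a
# hypothetical convenience for dry runs and is not part of the request above.
KMSG="${KMSG:-/dev/kmsg}"
SYSRQ="${SYSRQ:-/proc/sysrq-trigger}"

# dump_section LABEL KEY: log "LABEL", trigger SysRq KEY, log "end of LABEL".
dump_section() {
    echo "$1" >>"$KMSG"
    echo "$2" >>"$SYSRQ"
    echo "end of $1" >>"$KMSG"
}

# dump_all: run the three dumps in the order requested above.
dump_all() {
    dump_section "waiting tasks" w   # tasks in uninterruptible (blocked) state
    dump_section "all tasks" t       # every task with its stack
    dump_section "lockdep" d         # held locks; needs a lockdep-enabled kernel
}
```

Calling dump_all and then saving the kernel log should give a file that starts with "waiting tasks" and ends with "end of lockdep".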

#4 Updated by Ilya Dryomov over 5 years ago

  • Subject changed from request_mutex is held indefinitely to map_sem for write + request_mutex are held indefinitely

Can you also attach the entire syslog for this boot? Compress it; if it still ends up too big, just email it to me directly.

#5 Updated by Micha Krause over 5 years ago

My dmesg buffer seems to be too small. I hope this output from journalctl is helpful, but there are missed messages in between.

#6 Updated by Micha Krause over 5 years ago

Can't help you with the syslog, because the boot messages are already rotated away.

I could reboot the system and wait for the problem to reappear, if that helps?

#7 Updated by Ilya Dryomov over 5 years ago

Yeah, what you attached is pretty useless. Your kernel is compiled w/o lockdep, so I need to see all the stacktraces to try to figure out what is holding the lock.
Try to recreate the issue, run those echos and make sure you have all of syslog for that boot - maybe allocate more memory to systemd-journal, etc. There can't be any "missed N messages" in there for it to be useful.
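
Before attaching a capture, it can be checked for the two outer markers and for gap notices. This is a rough sketch under assumptions: check_capture is a hypothetical helper, and the exact wording of the gap notice depends on the logger, so the pattern below is a guess that may need adjusting.

```shell
# check_capture FILE: print "ok" only if FILE contains both outer markers
# and no notice about dropped messages.
check_capture() {
    # "waiting tasks" also occurs inside "end of waiting tasks", so exclude
    # that line before looking for the start marker.
    if ! grep -v 'end of waiting tasks' "$1" | grep -q 'waiting tasks'; then
        echo "missing start marker"
        return 1
    fi
    if ! grep -q 'end of lockdep' "$1"; then
        echo "missing end marker"
        return 1
    fi
    # Assumed wording of the gap notice, e.g. "Missed 12 kernel messages".
    if grep -Eiq 'missed [0-9]+ ([a-z]+ )?messages' "$1"; then
        echo "log has gaps"
        return 1
    fi
    echo "ok"
}
```

For example, check_capture journal would report "log has gaps" for a file like the one attached earlier, assuming its gap notices match the pattern.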

#8 Updated by Josh Durgin over 5 years ago

  • Status changed from New to Need More Info

#9 Updated by Ilya Dryomov almost 5 years ago

  • Related to Bug #15891: [rbd] i/o to rbd block device stopped constantly added

#10 Updated by Ilya Dryomov almost 5 years ago

  • Subject changed from map_sem for write + request_mutex are held indefinitely to map_sem for read + request_mutex are held indefinitely

#11 Updated by Ilya Dryomov almost 5 years ago

  • Category set to libceph
  • Priority changed from Normal to Low

OSD client has been rewritten in 4.7.

#12 Updated by Ilya Dryomov about 2 years ago

  • Status changed from Need More Info to Resolved

No new occurrences, request_mutex is gone, closing.
