Bug #4909

closed

mds: stalled/stuck directory (standby)

Added by Denis kaganovich almost 11 years ago. Updated about 10 years ago.

Status:
Can't reproduce
Priority:
High
Assignee:
-
Category:
-
Target version:
-
% Done:
0%

Source:
Development
Severity:
3 - minor

Description

I repeatedly interrupted operations (debugging a MySQL replication script, just multiple dump redirections) that wrote directly to a directory mounted via the kernel client on one of 3 nodes, with the mount pointing at the 2 other nodes (to avoid kernel-to-userspace problems on the same machine). After one of these interruptions the directory became "stuck" (the others are fine). I have restarted the MDS multiple times and even upgraded the whole Ceph installation once; all clients are now connected to the full list of 3 MDSs, but the directory is still stuck. IMHO this is another locking bug. An MDS log at debug level 10 is attached for the current MDS.


Files

ceph-mds.1.log (47.8 KB), Denis kaganovich, 05/03/2013 07:18 PM
Actions #1

Updated by Denis kaganovich almost 11 years ago

Also, (without debug 10) the log is now being flooded on another node (mds.4):

2013-05-04 13:47:27.648019 7fe8c59ca700 0 mds.0.server handle_client_file_setlock: start: 0, length: 0, client: 237010, pid: 28233, type: 4

2013-05-04 13:47:28.647913 7fe8c59ca700 0 mds.0.server handle_client_file_setlock: start: 0, length: 0, client: 237010, pid: 28254, type: 4

2013-05-04 13:47:29.657538 7fe8c59ca700 0 mds.0.server handle_client_file_setlock: start: 0, length: 0, client: 237010, pid: 28285, type: 4

2013-05-04 13:47:30.648111 7fe8c59ca700 0 mds.0.server handle_client_file_setlock: start: 0, length: 0, client: 237010, pid: 28306, type: 4

2013-05-04 13:47:31.647988 7fe8c59ca700 0 mds.0.server handle_client_file_setlock: start: 0, length: 0, client: 237010, pid: 28334, type: 4
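
For anyone triaging a similar flood, here is a minimal sketch (not part of Ceph) that counts these setlock messages per client and collects the pids involved. The function name `summarize_setlock` and the regular expression are assumptions built only from the log lines quoted in this comment; other Ceph versions may format the message differently.

```python
import re
from collections import Counter

# Pattern derived from the quoted log lines only (an assumption, not a Ceph
# specification). \s* tolerates a line wrap between "length" and its value.
SETLOCK_RE = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\s+\S+\s+\d+\s+"
    r"mds\.\d+\.server handle_client_file_setlock:\s*"
    r"start:\s*(?P<start>\d+),\s*length\s*:\s*(?P<length>\d+),\s*"
    r"client:\s*(?P<client>\d+),\s*pid:\s*(?P<pid>\d+),\s*type:\s*(?P<type>\d+)"
)


def summarize_setlock(log_text):
    """Count setlock messages per client and collect the distinct pids seen."""
    per_client = Counter()
    pids = {}
    for m in SETLOCK_RE.finditer(log_text):
        client = int(m.group("client"))
        per_client[client] += 1
        pids.setdefault(client, set()).add(int(m.group("pid")))
    return per_client, pids


if __name__ == "__main__":
    sample = (
        "2013-05-04 13:47:27.648019 7fe8c59ca700 0 mds.0.server "
        "handle_client_file_setlock: start: 0, length: 0, "
        "client: 237010, pid: 28233, type: 4\n"
        "2013-05-04 13:47:28.647913 7fe8c59ca700 0 mds.0.server "
        "handle_client_file_setlock: start: 0, length: 0, "
        "client: 237010, pid: 28254, type: 4\n"
    )
    counts, pids = summarize_setlock(sample)
    print(counts[237010], sorted(pids[237010]))
```

A single client id with a new pid roughly every second, as in the lines above, would point to one host repeatedly spawning a process that takes a lock, rather than to many clients contending.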

Actions #2

Updated by Denis kaganovich almost 11 years ago

Sorry, comment 1 is about ctdbd (IMHO), so please disregard it. Only the main issue remains.

Actions #3

Updated by Denis kaganovich almost 11 years ago

The directory became accessible only after rebooting one of the nodes (with stalled mounts), not after restarting the Ceph daemons alone.

Actions #4

Updated by Sage Weil almost 11 years ago

  • Project changed from Ceph to CephFS
  • Category deleted (1)
  • Priority changed from Normal to High
Actions #5

Updated by Sage Weil about 10 years ago

  • Status changed from New to Can't reproduce