Bug #1150
kclient: ERESTARTSYS from flock/fcntl locks
Description
After upgrading from 0.28.2 to 0.29, amanda backups no longer work when the amanda files are placed on a Ceph fs that is mounted with mount (the kernel client).
I get this message in the amanda log file (/amanda is the Ceph fs mount point):
driver: could not lock log file /amanda/state/servers/log/log: Interrupted system call
With cfuse it works fine.
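To narrow this down without amanda, the failing step can be approximated by taking a lock on a file under the mount point and reporting the errno. The following is a minimal sketch, not amanda's actual locking code: the helper name `try_lock` is hypothetical, and it assumes a whole-file exclusive fcntl lock is close enough to what amanda requests. Note that recent CPython versions may transparently retry calls that fail with EINTR, so a small C program is a closer match for the exact failure.

```python
import errno
import fcntl
import os


def try_lock(path):
    """Take and release an exclusive whole-file fcntl lock.

    Returns None on success, or the symbolic errno name (e.g. "EINTR")
    if opening or locking the file fails.
    """
    fd = None
    try:
        fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
        fcntl.lockf(fd, fcntl.LOCK_EX)  # blocking exclusive lock
        fcntl.lockf(fd, fcntl.LOCK_UN)  # release it again
        return None
    except OSError as e:
        return errno.errorcode.get(e.errno, str(e.errno))
    finally:
        if fd is not None:
            os.close(fd)
```

Calling `try_lock("/amanda/state/servers/log/log")` once on the kernel mount and once on a cfuse mount should show whether the error depends on the client alone.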
History
#1 Updated by Sage Weil almost 13 years ago
- Target version set to v0.30
Which version of the kernel client are you using?
Can you try the latest for-linus branch of ceph-client.git? There is a locking-related fix there for interrupted calls. Your symptoms sound a bit off, but either way we need to see what happens on the latest code.
Thanks!
#2 Updated by Fyodor Ustinov almost 13 years ago
Ok, I will write later about the test results.
#3 Updated by Fyodor Ustinov almost 13 years ago
The same problem occurs on the 3.0.0-rc3 kernel (master branch).
#4 Updated by Sage Weil almost 13 years ago
- Target version changed from v0.30 to v0.31
#7 Updated by Sage Weil over 12 years ago
- Target version changed from v0.31 to v0.32
#12 Updated by Fyodor Ustinov over 12 years ago
Ceph: 0.31
Kernel 3.0.0-rc6
The same problem occurs.
P.S. From mds.0.log:
2011-07-10 15:38:45.094709 2011-07-10 15:38:45.094750 7f9422eeb700 mds0.server handle_client_file_setlock: start: 0, length: 0, client: 4804, pid: 1721, type: 2
2011-07-10 15:38:45.094757 2011-07-10 15:38:45.128499 7f9422eeb700 mds0.server handle_client_file_setlock: start: 0, length: 0, client: 4804, pid: 1727, type: 2
2011-07-10 15:38:45.128516 2011-07-10 15:38:45.128571 7f9422eeb700 mds0.server handle_client_file_setlock: start: 0, length: 0, client: 4804, pid: 1731, type: 2
2011-07-10 15:38:45.128579 2011-07-10 15:38:45.128623 7f9422eeb700 mds0.server handle_client_file_setlock: start: 0, length: 0, client: 4804, pid: 1730, type: 2
2011-07-10 15:38:45.128631 2011-07-10 15:38:45.715763 7f9422eeb700 mds0.server handle_client_file_setlock: start: 0, length: 0, client: 4804, pid: 1717, type: 2
2011-07-10 15:38:45.715780 2011-07-10 15:38:50.273853 7f9422eeb700 mds0.server handle_client_file_setlock: start: 0, length: 0, client: 4804, pid: 1714, type: 2
2011-07-10 15:38:50.273873 2011-07-10 15:41:16.000984 7f9422eeb700 mds0.server handle_client_file_setlock: start: 0, length: 0, client: 4804, pid: 1731, type: 4
2011-07-10 15:41:16.001007 2011-07-10 15:41:16.001080 7f9422eeb700 mds0.server handle_client_file_setlock: start: 0, length: 0, client: 4804, pid: 1724, type: 4
2011-07-10 15:41:16.001089 2011-07-10 15:41:16.001640 7f9422eeb700 mds0.server handle_client_file_setlock: start: 0, length: 0, client: 4804, pid: 1730, type: 4
2011-07-10 15:41:16.001657 2011-07-10 15:41:16.001990 7f9422eeb700 mds0.server handle_client_file_setlock: start: 0, length: 0, client: 4804, pid: 1723, type: 4
2011-07-10 15:41:16.002007 2011-07-10 15:41:16.002077 7f9422eeb700 mds0.server handle_client_file_setlock: start: 0, length: 0, client: 4804, pid: 1722, type: 4
2011-07-10 15:41:16.002087 2011-07-10 15:41:16.002340 7f9422eeb700 mds0.server handle_client_file_setlock: start: 0, length: 0, client: 4804, pid: 1728, type: 4
2011-07-10 15:41:16.002354 2011-07-10 15:41:16.002420 7f9422eeb700 mds0.server handle_client_file_setlock: start: 0, length: 0, client: 4804, pid: 1720, type: 4
2011-07-10 15:41:16.002431 2011-07-10 15:41:16.002483 7f9422eeb700 mds0.server handle_client_file_setlock: start: 0, length: 0, client: 4804, pid: 1729, type: 4
2011-07-10 15:41:16.002493 2011-07-10 15:41:16.002544 7f9422eeb700 mds0.server handle_client_file_setlock: start: 0, length: 0, client: 4804, pid: 1727, type: 4
2011-07-10 15:41:16.002553 2011-07-10 15:41:16.002605 7f9422eeb700 mds0.server handle_client_file_setlock: start: 0, length: 0, client: 4804, pid: 1726, type: 4
2011-07-10 15:41:16.002615 2011-07-10 15:41:16.002668 7f9422eeb700 mds0.server handle_client_file_setlock: start: 0, length: 0, client: 4804, pid: 1721, type: 4
2011-07-10 15:41:16.002677 2011-07-10 15:41:16.003173 7f9422eeb700 mds0.server handle_client_file_setlock: start: 0, length: 0, client: 4804, pid: 1719, type: 4
2011-07-10 15:41:16.003187 2011-07-10 15:41:16.003249 7f9422eeb700 mds0.server handle_client_file_setlock: start: 0, length: 0, client: 4804, pid: 1718, type: 4
2011-07-10 15:41:16.003258 2011-07-10 15:41:16.043456 7f9422eeb700 mds0.server handle_client_file_setlock: start: 0, length: 0, client: 4804, pid: 1714, type: 4
2011-07-10 15:41:16.043473 2011-07-10 15:41:16.043542 7f9422eeb700 mds0.server handle_client_file_setlock: start: 0, length: 0, client: 4804, pid: 1717, type: 4
2011-07-10 15:41:16.043553 2011-07-10 15:41:16.084223 7f9422eeb700 mds0.server handle_client_file_setlock: start: 0, length: 0, client: 4804, pid: 1713, type: 2
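A log excerpt like this is easier to read once it is tallied per pid and lock type. The sketch below parses the `handle_client_file_setlock` lines with a regular expression and counts requests. The function name `tally_setlock` is hypothetical, and the mapping of type values to names (1 shared, 2 exclusive, 4 unlock) is an assumption based on the CEPH_LOCK_* constants, so it should be checked against the source tree in use.

```python
import re
from collections import Counter

# Assumed mapping of the "type" field; verify against CEPH_LOCK_* in ceph_fs.h.
LOCK_TYPES = {1: "shared", 2: "exclusive", 4: "unlock"}

SETLOCK_RE = re.compile(
    r"handle_client_file_setlock: start: (\d+), length: (\d+), "
    r"client: (\d+), pid: (\d+), type: (\d+)"
)


def tally_setlock(lines):
    """Count setlock requests per (pid, lock type name) in MDS log lines."""
    counts = Counter()
    for line in lines:
        m = SETLOCK_RE.search(line)
        if m is None:
            continue  # not a setlock line
        pid = int(m.group(4))
        lock_type = int(m.group(5))
        counts[(pid, LOCK_TYPES.get(lock_type, str(lock_type)))] += 1
    return counts
```

Feeding the lines above through `tally_setlock` gives a quick view of which pids requested a lock and which of them later sent an unlock.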
#13 Updated by Sage Weil over 12 years ago
- Target version changed from v0.32 to v0.33
#14 Updated by Sage Weil over 12 years ago
- Subject changed from amanda and 0.29 to kclient: amanda gets ERESTARTSYS from flock/fcntl locks
#15 Updated by Sage Weil over 12 years ago
- Target version changed from v0.33 to v0.34
#17 Updated by Sage Weil over 12 years ago
- Target version changed from v0.34 to v0.35
#19 Updated by Sage Weil over 12 years ago
- Target version changed from v0.35 to v0.36
#21 Updated by Sage Weil over 12 years ago
- Target version deleted (v0.36)
#23 Updated by Sage Weil over 12 years ago
- Subject changed from kclient: amanda gets ERESTARTSYS from flock/fcntl locks to kclient: ERESTARTSYS from flock/fcntl locks
The ping_pong code is here:
http://junkcode.samba.org/ftp/unpacked/junkcode/ping_pong.c
I start by running ping_pong on just one of the kernel client mount points, like this:
ping_pong /mnt/test.dat 3
It locks very fast:
T02-OSD161:/usr/src/ping_pong# ./ping_pong /mnt/test.dat 3
1664 locks/sec
But when I start a second copy of ping_pong on another kernel client node in my cluster, there are errors:
T02-OSD186:/usr/src/getosd# ./ping_pong /mnt/test.dat 3
lock at 0 failed! - Interrupted system call
lock at 0 failed! - Interrupted system call
lock at 0 failed! - Interrupted system call
lock at 2 failed! - Interrupted system call
lock at 1 failed! - Interrupted system call
lock at 2 failed! - Interrupted system call
Is this a bug in Ceph, or am I doing something wrong?
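For readers without the C source at hand, the locking pattern ping_pong exercises can be sketched roughly as follows: hold a one-byte lock, take the lock on the next byte, release the previous one, and repeat around a ring of byte offsets. This is an illustrative approximation only, not the Samba tool itself. The function name and the `iterations` parameter are hypothetical, and CPython may retry EINTR internally, so failures may not surface the same way as in the C program.

```python
import fcntl
import os


def ping_pong(path, num_locks, iterations):
    """Cycle one-byte exclusive fcntl locks around offsets 0..num_locks-1.

    Returns a list of (offset, error message) tuples for failed lock or
    unlock calls; an empty list means every call succeeded.
    """
    failures = []
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX, 1, 0)  # start by holding byte 0
        current = 0
        for _ in range(iterations):
            nxt = (current + 1) % num_locks
            # Lock the next byte first, then release the one held so far.
            for cmd, offset in ((fcntl.LOCK_EX, nxt), (fcntl.LOCK_UN, current)):
                try:
                    fcntl.lockf(fd, cmd, 1, offset)
                except OSError as e:
                    failures.append((offset, e.strerror))
            current = nxt
    finally:
        os.close(fd)  # closing the descriptor drops any remaining locks
    return failures
```

Running `ping_pong("/mnt/test.dat", 3, 10000)` on two kernel client nodes at once mirrors the two-node test above, with any failures collected in the returned list.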
#24 Updated by Greg Farnum over 12 years ago
- Status changed from New to Duplicate
I forgot about this bug! It's almost certainly the same as #1475 -- EINTR and ERESTARTSYS are synonyms, right? :)