Bug #13926

closed

lockup in multithreaded application

Added by Burkhard Linke over 8 years ago. Updated over 5 years ago.

Status: Closed
Priority: Normal
Assignee:
Category: Correctness/Safety
Target version: -
% Done: 0%
Source: other
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite: fs
Component(FS): ceph-fuse
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

A multithreaded application ends up in a blocked state when multiple threads try to access the same file.
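The access pattern can be illustrated with a minimal sketch in which several threads repeatedly open and read the same file. This is not the application from the report; the function name `hammer_file`, the thread count, and the chunk size are assumptions chosen for illustration, and the demo uses a local temporary file as a stand-in for a file on the ceph-fuse mount.

```python
import os
import tempfile
import threading


def hammer_file(path, n_threads=8, rounds=100, chunk=4096):
    """Read the same file from several threads; return total bytes read."""
    totals = [0] * n_threads  # one slot per thread, so no lock is needed

    def worker(idx):
        for _ in range(rounds):
            # Every thread opens and reads the same path concurrently.
            with open(path, "rb") as fh:
                while True:
                    data = fh.read(chunk)
                    if not data:
                        break
                    totals[idx] += len(data)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(totals)


if __name__ == "__main__":
    # Hypothetical stand-in: replace with a path on the ceph-fuse mount.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"x" * 65536)
    try:
        print(hammer_file(tmp.name))
    finally:
        os.unlink(tmp.name)
```

Whether this simple pattern is enough to trigger the lockup is not established here; it only shows the kind of concurrent access described.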

# apt-cache policy ceph-fuse
ceph-fuse:
  Installed: 0.94.5-1trusty

ceph.conf:

[global]
osd_pool_default_pgp_num = 512
osd_pool_default_min_size = 2
auth_service_required = cephx
mon_initial_members = <one monitor>
fsid = <fs id>
cluster_network = <network>
auth_supported = cephx
auth_cluster_required = cephx
mon_host = <monitor hosts>
auth_client_required = cephx
osd_pool_default_size = 3
osd_pool_default_pg_num = 512
public_network = <network>
#fuse_use_invalidate_cb = True
debug_client=20/20

The locked-up process consumes 100% of one CPU core in system calls at that time (on a machine with 40 CPU cores):

top - 10:38:26 up 7 min, 1 user, load average: 0.99, 0.64, 0.30
Tasks: 40 total, 2 running, 38 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.1 us, 2.5 sy, 0.0 ni, 97.4 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem: 26411547+total, 2673076 used, 26144240+free, 43872 buffers
KiB Swap: 26855424+total, 0 used, 26855424+free. 687924 cached Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 4039 blinke    20   0 16.198g  38060   2152 R 100.1  0.0   2:41.25 hammer

Trying to terminate the process (e.g. CTRL-C) kills the worker thread, but the main thread keeps running. Accessing the list of file handles associated with the process (/proc/$PID/fd) also blocks.
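Since listing /proc/$PID/fd blocks, one possible way to see which threads are still alive and in what state is to read the per-thread status files under /proc/$PID/task instead. The following is a minimal sketch for Linux; the helper name `thread_states` is an assumption, and whether these files remain readable in the situation described above has not been confirmed.

```python
import os


def thread_states(pid):
    """Return {tid: (name, state)} for each thread of `pid`.

    Reads /proc/<pid>/task/<tid>/status and does not touch /proc/<pid>/fd.
    """
    states = {}
    task_dir = f"/proc/{pid}/task"
    for tid in os.listdir(task_dir):
        fields = {}
        try:
            with open(os.path.join(task_dir, tid, "status")) as fh:
                for line in fh:
                    key, _, value = line.partition(":")
                    fields[key] = value.strip()
        except (FileNotFoundError, ProcessLookupError):
            continue  # the thread exited while we were iterating
        states[int(tid)] = (fields.get("Name", "?"), fields.get("State", "?"))
    return states


if __name__ == "__main__":
    # Demo on the current process; pass the PID of the stuck process instead.
    for tid, (name, state) in sorted(thread_states(os.getpid()).items()):
        print(tid, name, state)
```

A thread spinning in a system call would be expected to show state `R (running)`, while threads waiting on it would typically show `S` or `D`.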

Debug output is available with ceph-post-file id a8eb75d5-cc13-430a-bed8-428c8a33d6d8

Actions #1

Updated by Zheng Yan over 8 years ago

Looks like you have debug_client=20. Could you upload the client log somewhere?

Actions #2

Updated by Burkhard Linke over 8 years ago

The logfile has already been uploaded with ceph-post-file, id is a8eb75d5-cc13-430a-bed8-428c8a33d

Actions #3

Updated by Greg Farnum over 8 years ago

  • Assignee set to Zheng Yan
Actions #4

Updated by Greg Farnum about 8 years ago

Zheng, anything come out of this?

Actions #5

Updated by Zheng Yan about 8 years ago

  • Status changed from New to Need More Info

did not find anything in the log

Actions #6

Updated by Loïc Dachary about 8 years ago

  • Target version deleted (v0.94.6)
Actions #7

Updated by Greg Farnum almost 8 years ago

  • Category changed from 45 to Correctness/Safety
  • Component(FS) ceph-fuse added
Actions #8

Updated by Zheng Yan over 5 years ago

  • Status changed from Need More Info to Closed

no update for a long time
