Bug #64607

open

ceph: fstest generic/580 test failure with infinite loop

Added by Xiubo Li 3 months ago. Updated 2 months ago.

Status: Fix Under Review
Priority: Normal
Assignee:
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Crash signature (v1):
Crash signature (v2):

Description

This was reported by Luís; see https://patchwork.kernel.org/project/ceph-devel/patch/20240125023920.1287555-4-xiubli@redhat.com/ for the original thread.

I'm seeing an issue with fstest generic/580, which seems to enter an
infinite loop, effectively rendering the testing VM unusable.  It's pretty
easy to reproduce: just run the test while making sure msgr2 is in use
(I'm mounting the filesystem with 'ms_mode=crc'), and you should see the
following in the logs:

[...]
  libceph: prepare_sparse_read_cont: ret 0x1000 total_resid 0x0 resid 0x0
  libceph: osd1 (2)192.168.155.1:6810 read processing error
  libceph: mon0 (2)192.168.155.1:40608 session established
  libceph: bad late_status 0x1
  libceph: osd1 (2)192.168.155.1:6810 protocol error, bad epilogue
  libceph: mon0 (2)192.168.155.1:40608 session established
  libceph: prepare_sparse_read_cont: ret 0x1000 total_resid 0x0 resid 0x0
  libceph: osd1 (2)192.168.155.1:6810 read processing error
  libceph: mon0 (2)192.168.155.1:40608 session established
  libceph: bad late_status 0x1
[...]

Reverting this patch (commit 8e46a2d068c9 ("libceph: just wait for more
data to be available on the socket")) seems to fix it.  I haven't
investigated it further, but since it'll take me some time to refresh my
memory, I thought I should report it immediately.  Maybe someone has an
idea.

Cheers,
-- 
Luís
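
When triaging a run, it can help to check whether a saved kernel log shows the same repeating failure cycle as the excerpt above. The following is only a minimal sketch, assuming the log has been saved as plain text (for example from dmesg). The function names, the signature labels, and the threshold are hypothetical; the message patterns are taken from the excerpt in this report and may not cover other variants of the failure.

```python
import re
from collections import Counter

# Message fragments taken from the log excerpt in this report.
# The dictionary keys are hypothetical labels, not kernel identifiers.
SIGNATURES = {
    "sparse_read_cont": re.compile(r"libceph: prepare_sparse_read_cont:"),
    "read_error": re.compile(r"libceph: osd\d+ .* read processing error"),
    "bad_late_status": re.compile(r"libceph: bad late_status 0x[0-9a-fA-F]+"),
    "bad_epilogue": re.compile(r"libceph: osd\d+ .* protocol error, bad epilogue"),
}


def count_signatures(log_text):
    """Count how many log lines match each failure signature."""
    counts = Counter()
    for line in log_text.splitlines():
        for name, pattern in SIGNATURES.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


def looks_like_loop(log_text, threshold=2):
    """Heuristic (assumption): treat the log as showing the loop when both
    the read error and the bad late_status message repeat at least
    `threshold` times."""
    counts = count_signatures(log_text)
    return (counts["read_error"] >= threshold
            and counts["bad_late_status"] >= threshold)


if __name__ == "__main__":
    import sys

    # Usage: python check_loop.py saved_kernel_log.txt
    with open(sys.argv[1], encoding="utf-8", errors="replace") as f:
        text = f.read()
    print(dict(count_signatures(text)))
    print("loop suspected:", looks_like_loop(text))
```

A simple repeat count is used here because the excerpt shows the same messages recurring between session re-establishments; a stricter check could also require the messages to appear in the same order.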


Related issues 1 (0 open, 1 closed)

Copied to CephFS - Bug #64654: fscrypt: add mount-syntax/v2 test for fscrypt (Status: Duplicate, Assignee: Xiubo Li)
