Bug #64748
reef: snaptest-git-ceph.sh failure
Status: Closed
Description
- /a/vshankar-2024-03-05_07:34:10-fs-wip-vshankar-testing1-reef-2024-03-05-1017-testing-default-smithi/7582083
- /a/vshankar-2024-03-05_07:34:10-fs-wip-vshankar-testing1-reef-2024-03-05-1017-testing-default-smithi/7582140
The first job has the following failure:
2024-03-05T10:07:24.695 INFO:tasks.workunit.client.0.smithi100.stderr:error: RPC failed; curl 18 transfer closed with outstanding read data remaining
2024-03-05T10:07:24.695 INFO:tasks.workunit.client.0.smithi100.stderr:fetch-pack: unexpected disconnect while reading sideband packet
2024-03-05T10:07:24.697 INFO:tasks.workunit.client.0.smithi100.stderr:fatal: early EOF
2024-03-05T10:07:24.698 INFO:tasks.workunit.client.0.smithi100.stderr:fatal: fetch-pack: invalid index-pack output
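For context, curl error 18 ("transfer closed with outstanding read data remaining") is a transient transport-level failure during the workunit's git fetch. A minimal sketch of the kind of retry wrapper such flaky fetches are commonly handled with — this is a hypothetical helper, not part of the snaptest-git-ceph.sh workunit:

```python
import subprocess
import time

def run_with_retries(cmd, attempts=3, delay=5):
    """Run cmd, retrying on a non-zero exit status.

    Transient network errors (e.g. curl error 18 surfacing as a
    failed git fetch) often succeed on a later attempt.
    """
    for i in range(1, attempts + 1):
        result = subprocess.run(cmd)
        if result.returncode == 0:
            return True
        print(f"attempt {i}/{attempts} failed (rc={result.returncode})")
        if i < attempts:
            time.sleep(delay)
    return False

# Hypothetical usage:
# run_with_retries(["git", "clone", "--depth", "1", "<repo-url>"])
```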
The second one has:
2024-03-05T11:35:55.301 INFO:tasks.workunit.client.0.smithi120.stderr:Updating files: 2% (279/12313)^M [... git checkout progress output trimmed ...] ^MUpdating files: 100% (12313/12313), done.
2024-03-05T11:35:55.302 DEBUG:teuthology.orchestra.run.smithi158:> sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 30 ceph --cluster ceph --admin-daemon /var/run/ceph/ceph-osd.7.asok dump_ops_in_flight
2024-03-05T11:35:55.302 INFO:tasks.ceph.osd.6.smithi158.stderr:2024-03-05T11:35:55.299+0000 7f9c6b20e700 -1 received signal: Hangup from /usr/bin/python3 /bin/daemon-helper kill ceph-osd -f --cluster ceph -i 6 (PID: 89081) UID: 0
2024-03-05T11:35:55.322 DEBUG:teuthology.orchestra.run:got remote process result: 128
2024-03-05T11:35:55.322 INFO:tasks.workunit:Stopping ['fs/snaps'] on client.0...
No coredumps, but the kernel ring buffer shows:
2024-03-05T11:27:44.304184+00:00 smithi120 kernel: ceph: sync getxattr 00000000df3c3b41 during filling trace
2024-03-05T11:27:44.304276+00:00 smithi120 kernel: SELinux: inode_doinit_use_xattr: getxattr returned 16 for dev=ceph ino=1099511628279
2024-03-05T11:27:44.329101+00:00 smithi120 kernel: libceph: osd5 primary-affinity 0xb9f0
2024-03-05T11:27:44.744662+00:00 smithi120 kernel: ceph: sync getxattr 0000000073eef473 during filling trace
2024-03-05T11:27:44.744736+00:00 smithi120 kernel: SELinux: inode_doinit_use_xattr: getxattr returned 16 for dev=ceph ino=1099511628281
2024-03-05T11:27:44.796822+00:00 smithi120 kernel: ceph: sync getxattr 00000000c3d9dfae during filling trace
2024-03-05T11:27:44.796895+00:00 smithi120 kernel: SELinux: inode_doinit_use_xattr: getxattr returned 16 for dev=ceph ino=1099511628281
2024-03-05T11:27:44.974044+00:00 smithi120 kernel: libceph: mds0 (1)172.21.15.158:6837 socket closed (con state OPEN)
2024-03-05T11:27:45.305045+00:00 smithi120 kernel: libceph: mds0 (1)172.21.15.158:6837 socket closed (con state OPEN)
2024-03-05T11:27:49.572057+00:00 smithi120 kernel: libceph: mds0 (1)172.21.15.158:6837 socket closed (con state OPEN)
2024-03-05T11:27:49.848044+00:00 smithi120 kernel: libceph: mds0 (1)172.21.15.158:6837 socket closed (con state OPEN)
2024-03-05T11:27:50.115068+00:00 smithi120 kernel: libceph: mds0 (1)172.21.15.158:6837 socket closed (con state OPEN)
2024-03-05T11:27:50.514082+00:00 smithi120 kernel: libceph: mds0 (1)172.21.15.158:6837 socket closed (con state OPEN)
2024-03-05T11:27:52.407051+00:00 smithi120 kernel: libceph: mds0 (1)172.21.15.158:6837 socket closed (con state OPEN)
2024-03-05T11:27:52.733746+00:00 smithi120 kernel: ceph: sync getxattr 000000009ce92dd3 during filling trace
2024-03-05T11:27:52.733854+00:00 smithi120 kernel: SELinux: inode_doinit_use_xattr: getxattr returned 16 for dev=ceph ino=1099511628286
2024-03-05T11:27:52.760047+00:00 smithi120 kernel: libceph: mds0 (1)172.21.15.158:6837 socket closed (con state OPEN)
2024-03-05T11:27:53.939103+00:00 smithi120 kernel: libceph: mds0 (1)172.21.15.158:6837 socket closed (con state OPEN)
2024-03-05T11:27:54.226045+00:00 smithi120 kernel: libceph: mds0 (1)172.21.15.158:6837 socket closed (con state OPEN)
2024-03-05T11:27:57.788133+00:00 smithi120 kernel: libceph: mds0 (1)172.21.15.158:6837 socket closed (con state OPEN)
2024-03-05T11:27:58.430047+00:00 smithi120 kernel: libceph: mds0 (1)172.21.15.158:6837 socket closed (con state OPEN)
2024-03-05T11:27:59.318047+00:00 smithi120 kernel: libceph: mds0 (1)172.21.15.158:6837 socket closed (con state OPEN)
2024-03-05T11:27:59.627193+00:00 smithi120 kernel: ceph: sync getxattr 000000001da0a228 during filling trace
2024-03-05T11:27:59.627250+00:00 smithi120 kernel: SELinux: inode_doinit_use_xattr: getxattr returned 16 for dev=ceph ino=1099511628287
2024-03-05T11:27:59.667451+00:00 smithi120 kernel: ceph: sync getxattr 000000008c47fcaa during filling trace
2024-03-05T11:27:59.667533+00:00 smithi120 kernel: SELinux: inode_doinit_use_xattr: getxattr returned 16 for dev=ceph ino=1099511628299
2024-03-05T11:27:59.708028+00:00 smithi120 kernel: ceph: sync getxattr 000000007d19aa90 during filling trace
2024-03-05T11:27:59.708116+00:00 smithi120 kernel: SELinux: inode_doinit_use_xattr: getxattr returned 16 for dev=ceph ino=1099511628311
Xiubo, are the messages in the kernel ring buffer telling us anything?
Updated by Xiubo Li about 2 months ago
Venky,
Both of these failures are caused by the EOF issue. There is an existing tracker for it — please see https://tracker.ceph.com/issues/59413. We have a fix, but the backport PR hasn't been merged to reef yet.
This kernel log message is just a warning; the call will be retried, so it won't cause a failure:
2024-03-05T11:27:52.733746+00:00 smithi120 kernel: ceph: sync getxattr 000000009ce92dd3 during filling trace
The following one is also okay, because the caller doesn't do anything when it fails:
2024-03-05T11:27:52.733854+00:00 smithi120 kernel: SELinux: inode_doinit_use_xattr: getxattr returned 16 for dev=ceph ino=1099511628286
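As an aside, error code 16 maps to EBUSY ("Device or resource busy") on Linux — assuming the number in the SELinux message is an errno value, that's what is being reported. This can be checked against the standard errno tables:

```python
import errno
import os

# Decode the "getxattr returned 16" value from the SELinux message,
# assuming it is a Linux errno value.
code = 16
print(errno.errorcode[code], os.strerror(code))
# On Linux this prints: EBUSY Device or resource busy
```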
I checked the kernel logs and didn't see any other errors related to this.
So let's merge the reef backport first, and then see whether the failure shows up again. Does that make sense?
Thanks
- Xiubo
Updated by Venky Shankar about 2 months ago
- Has duplicate Bug #59413: cephfs: qa snaptest-git-ceph.sh failed with "got remote process result: 128" added
Updated by Venky Shankar about 2 months ago
- Status changed from New to Duplicate