Bug #56445

closed

Command failed on smithi162 with status 123: "find /home/ubuntu/cephtest/archive/syslog -name '*.log' -print0 | sudo xargs -0 --no-run-if-empty -- gzip --"

Added by Venky Shankar almost 2 years ago. Updated 8 months ago.

Status:
Can't reproduce
Priority:
Normal
Assignee:
Category:
QA Suite
% Done:
0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Crash signature (v1):
Crash signature (v2):

Description

I haven't seen this before; it started happening quite recently. E.g.: https://pulpito.ceph.com/vshankar-2022-06-29_09:19:00-fs-wip-vshankar-testing-20220627-100931-testing-default-smithi/6904969/

It seems to be erroring out when gathering syslog:

```
2022-06-30T11:21:01.722 INFO:teuthology.orchestra.run.smithi162.stderr:grep: /home/ubuntu/cephtest/archive/syslog/kern.log: Permission denied
2022-06-30T11:21:01.737 INFO:teuthology.task.internal.syslog:Compressing syslogs...
2022-06-30T11:21:01.738 DEBUG:teuthology.orchestra.run.smithi146:> find /home/ubuntu/cephtest/archive/syslog -name '*.log' -print0 | sudo xargs -0 --no-run-if-empty -- gzip --
2022-06-30T11:21:01.744 DEBUG:teuthology.orchestra.run.smithi162:> find /home/ubuntu/cephtest/archive/syslog -name '*.log' -print0 | sudo xargs -0 --no-run-if-empty -- gzip --
2022-06-30T11:21:01.834 INFO:teuthology.orchestra.run.smithi162.stderr:gzip: /home/ubuntu/cephtest/archive/syslog/kern.log.gz already exists; not overwritten
2022-06-30T11:21:01.856 DEBUG:teuthology.orchestra.run:got remote process result: 123
2022-06-30T11:21:01.856 ERROR:teuthology.run_tasks:Manager failed: internal.syslog
```

Perhaps kern.log.gz already exists; I'm not sure why. I'm also not sure which changes in the fs suite are causing this to show up. Other instances:

- https://pulpito.ceph.com/vshankar-2022-06-29_09:19:00-fs-wip-vshankar-testing-20220627-100931-testing-default-smithi/6904976/
- https://pulpito.ceph.com/vshankar-2022-06-29_09:19:00-fs-wip-vshankar-testing-20220627-100931-testing-default-smithi/6904993/
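
For illustration, here is a minimal sketch of a compression step that tolerates a leftover `kern.log.gz` instead of failing. It is not teuthology's code: `compress_logs` is a hypothetical helper, and skipping the file (rather than overwriting it or raising) is an assumed policy.

```python
import gzip
import shutil
import tempfile
from pathlib import Path


def compress_logs(directory):
    """Gzip every *.log file under `directory`, skipping already-compressed ones.

    Hypothetical helper; returns (compressed, skipped) lists of Path objects.
    """
    compressed, skipped = [], []
    for log in sorted(Path(directory).rglob("*.log")):
        if not log.is_file():
            continue
        target = log.with_name(log.name + ".gz")
        if target.exists():
            # The situation where gzip reports "already exists; not overwritten".
            skipped.append(log)
            continue
        with open(log, "rb") as src, gzip.open(target, "wb") as dst:
            shutil.copyfileobj(src, dst)
        log.unlink()  # like gzip, drop the original once compression succeeded
        compressed.append(log)
    return compressed, skipped


def demo():
    """Build a scratch directory with a stale kern.log.gz and compress it."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "kern.log").write_text("kernel messages\n")
        (root / "kern.log.gz").write_bytes(b"")  # stale leftover from an earlier pass
        (root / "misc.log").write_text("other messages\n")
        compressed, skipped = compress_logs(root)
        return [p.name for p in compressed], [p.name for p in skipped]
```

Calling `demo()` compresses `misc.log` and reports `kern.log` as skipped, so one stale archive no longer fails the whole step. Whether a skipped file should be logged, overwritten, or treated as an error is a separate decision.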

Actions #1

Updated by Venky Shankar almost 2 years ago

Formatting the errs:

2022-06-30T11:21:01.722 INFO:teuthology.orchestra.run.smithi162.stderr:grep: /home/ubuntu/cephtest/archive/syslog/kern.log: Permission denied
2022-06-30T11:21:01.737 INFO:teuthology.task.internal.syslog:Compressing syslogs...
2022-06-30T11:21:01.738 DEBUG:teuthology.orchestra.run.smithi146:> find /home/ubuntu/cephtest/archive/syslog -name '*.log' -print0 | sudo xargs -0 --no-run-if-empty -- gzip --
2022-06-30T11:21:01.744 DEBUG:teuthology.orchestra.run.smithi162:> find /home/ubuntu/cephtest/archive/syslog -name '*.log' -print0 | sudo xargs -0 --no-run-if-empty -- gzip --
2022-06-30T11:21:01.834 INFO:teuthology.orchestra.run.smithi162.stderr:gzip: /home/ubuntu/cephtest/archive/syslog/kern.log.gz already exists; not overwritten
2022-06-30T11:21:01.856 DEBUG:teuthology.orchestra.run:got remote process result: 123
2022-06-30T11:21:01.856 ERROR:teuthology.run_tasks:Manager failed: internal.syslog

Actions #2

Updated by Brad Hubbard over 1 year ago

The 123 status is coming from xargs, presumably because the command it invokes (gzip here) is returning an error.

From 'man xargs' EXIT STATUS section.

 123 if any invocation of the command exited with status 1-125

Powercycle suite seems to reproduce this nicely so I'll see what I can work out.
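
The exit-status rule quoted above can be modeled in a few lines. This is only a sketch of the documented behaviour, not xargs source code: `xargs_exit_status` is a hypothetical function, and it covers only child statuses 0, 1-125 and 255 as described in the EXIT STATUS section of the man page.

```python
def xargs_exit_status(child_statuses):
    """Model the exit status xargs reports for a sequence of child exit codes.

    Hypothetical sketch covering only the documented cases:
      0     -> every invocation succeeded
      123   -> some invocation exited with status 1-125
      124   -> an invocation exited with status 255 (xargs stops at once)
    """
    any_failed = False
    for status in child_statuses:
        if status == 255:
            return 124
        if 1 <= status <= 125:
            any_failed = True
        elif status != 0:
            raise ValueError(f"status {status} is outside this sketch's scope")
    return 123 if any_failed else 0


# One gzip invocation exiting nonzero (the value 2 is assumed for illustration)
# is enough for the whole pipeline to report 123.
print(xargs_exit_status([0, 2, 0]))
```

So a single failing `gzip` among otherwise successful ones is enough to produce the 123 seen in the teuthology log.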

Actions #3

Updated by Laura Flores about 1 year ago

/a/yuriw-2023-03-14_20:10:47-rados-wip-yuri-testing-2023-03-14-0714-reef-distro-default-smithi/7207226

And several more on that run.

Actions #4

Updated by Zack Cerza 8 months ago

  • Status changed from New to Need More Info
  • Assignee set to Brad Hubbard

Brad, did you end up finding anything with this issue?

Actions #5

Updated by Brad Hubbard 8 months ago

Zack Cerza wrote:

Brad, did you end up finding anything with this issue?

Do you have a recent example, Zack? It stopped happening for me, I'm afraid, and I've
had trouble reproducing it since.

Actions #6

Updated by Zack Cerza 8 months ago

  • Status changed from Need More Info to Can't reproduce

Nope, we were just scrubbing some bugs.

So the xargs call is failing because the sub-commands are failing, because of the permission error.

Since we're not necessarily seeing this anymore, I'll close it. If it gets reopened, it seems to me debugging should start with determining why the permission error occurs, as it's very normal for a job to read and write to ~ubuntu/cephtest/archive.

We should also track which OSes this happens on, as it could be SELinux-related.
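
If this resurfaces, a small diagnostic like the sketch below could record the file permissions, the OS, and the SELinux mode next to the failure. All function names here are hypothetical and nothing like this is known to exist in teuthology. Reading `/sys/fs/selinux/enforce` assumes a Linux host with selinuxfs mounted; on any other host the sketch reports `None`.

```python
import os
import platform
import stat
import tempfile


def describe_path(path):
    """Return ownership and permission details for `path`."""
    st = os.stat(path)
    return {
        "path": str(path),
        "mode": stat.filemode(st.st_mode),  # e.g. "-rw-r-----"
        "uid": st.st_uid,
        "gid": st.st_gid,
        "readable": os.access(path, os.R_OK),
        "writable": os.access(path, os.W_OK),
    }


def selinux_mode():
    """Return "enforcing", "permissive", or None if SELinux is not available."""
    try:
        with open("/sys/fs/selinux/enforce") as f:
            return "enforcing" if f.read().strip() == "1" else "permissive"
    except OSError:
        return None


def host_report(path):
    """Bundle OS, SELinux, and file details into one record."""
    return {
        "os": platform.platform(),
        "selinux": selinux_mode(),
        "file": describe_path(path),
    }


def demo():
    """Report on a scratch log file restricted to its owner."""
    with tempfile.NamedTemporaryFile(suffix=".log") as tmp:
        os.chmod(tmp.name, 0o600)
        return host_report(tmp.name)
```

Run as the same user that hits the `grep` failure, a record like this would show whether `kern.log` is simply unreadable by that user or whether the mode bits look fine and something else, such as SELinux, is denying access.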
