QA Run #65099
openwip-yuri10-testing-2024-03-24-1159
Description
--- done. these PRs were included:
https://github.com/ceph/ceph/pull/55196 - osd: EC Partial Stripe Reads (Retry of #23138 and #52746)
Updated by Yuri Weinstein about 1 month ago
- QA Runs set to wip-yuri10-testing-2024-03-24-1159
Updated by Laura Flores about 1 month ago
- Status changed from QA Testing to QA Needs Approval
Updated by Laura Flores about 1 month ago
- Status changed from QA Needs Approval to QA Testing
Updated by Laura Flores 29 days ago
Found some related failures.
/a/yuriw-2024-03-24_22:19:24-rados-wip-yuri10-testing-2024-03-24-1159-distro-default-smithi/7620593
2024-03-25T01:02:10.414 INFO:tasks.workunit.client.0.smithi080.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/erasure-code/test-erasure-eio.sh:107: rados_get: local expect=fail
2024-03-25T01:02:10.414 INFO:tasks.workunit.client.0.smithi080.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/erasure-code/test-erasure-eio.sh:112: rados_get: '[' fail = fail ']'
2024-03-25T01:02:10.414 INFO:tasks.workunit.client.0.smithi080.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/erasure-code/test-erasure-eio.sh:114: rados_get: rados --pool pool-jerasure get obj-size-81310-1-10 td/test-erasure-eio/COPY
2024-03-25T01:02:10.615 INFO:tasks.workunit.client.0.smithi080.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/erasure-code/test-erasure-eio.sh:115: rados_get: return
2024-03-25T01:02:10.615 INFO:tasks.workunit.client.0.smithi080.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/erasure-code/test-erasure-eio.sh:243: rados_get_data_bad_size: return 1
2024-03-25T01:02:10.615 INFO:tasks.workunit.client.0.smithi080.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/erasure-code/test-erasure-eio.sh:323: TEST_rados_get_bad_size_shard_1: return 1
/a/yuriw-2024-03-24_22:19:24-rados-wip-yuri10-testing-2024-03-24-1159-distro-default-smithi/7620482
/a/yuriw-2024-03-24_22:19:24-rados-wip-yuri10-testing-2024-03-24-1159-distro-default-smithi/7620682
/a/yuriw-2024-03-24_22:19:24-rados-wip-yuri10-testing-2024-03-24-1159-distro-default-smithi/7620582
2024-03-24T22:41:18.025 INFO:tasks.ceph.osd.1.smithi012.stderr:/home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos9/DIST/centos9/MACHINE_SIZE/gigantic/release/19.0.0-2398-g9c429fb6/rpm/el9/BUILD/ceph-19.0.0-2398-g9c429fb6/src/osd/ECUtil.cc: In function 'int ECUtil::decode(const ECUtil::stripe_info_t&, ceph::ErasureCodeInterfaceRef&, std::set<int>, std::map<int, ceph::buffer::v15_2_0::list>&, ceph::bufferlist*)' thread 7fe31fcb7640 time 2024-03-24T22:41:18.025515+0000
2024-03-24T22:41:18.025 INFO:tasks.ceph.osd.1.smithi012.stderr:/home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos9/DIST/centos9/MACHINE_SIZE/gigantic/release/19.0.0-2398-g9c429fb6/rpm/el9/BUILD/ceph-19.0.0-2398-g9c429fb6/src/osd/ECUtil.cc: 31: FAILED ceph_assert(total_data_size % sinfo.get_chunk_size() == 0)
2024-03-24T22:41:18.028 INFO:tasks.rados.rados.0.smithi012.stdout:887: finishing rollback tid 0 to smithi01241422-11
2024-03-24T22:41:18.029 INFO:tasks.ceph.osd.1.smithi012.stderr: ceph version 19.0.0-2398-g9c429fb6 (9c429fb6abfb5deed62697d56d8997d0f0d6d83f) squid (dev)
2024-03-24T22:41:18.029 INFO:tasks.ceph.osd.1.smithi012.stderr: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x11e) [0x5614923b30f6]
2024-03-24T22:41:18.029 INFO:tasks.ceph.osd.1.smithi012.stderr: 2: ceph-osd(+0x3f62b2) [0x5614923b32b2]
2024-03-24T22:41:18.029 INFO:tasks.ceph.osd.1.smithi012.stderr: 3: ceph-osd(+0x3b7595) [0x561492374595]
2024-03-24T22:41:18.029 INFO:tasks.ceph.osd.1.smithi012.stderr: 4: ceph-osd(+0x6e490c) [0x5614926a190c]
2024-03-24T22:41:18.029 INFO:tasks.ceph.osd.1.smithi012.stderr: 5: (ECCommon::ReadPipeline::complete_read_op(ECCommon::ReadOp&)+0x252) [0x5614926a39a2]
2024-03-24T22:41:18.029 INFO:tasks.ceph.osd.1.smithi012.stderr: 6: (ECBackend::handle_sub_read_reply(pg_shard_t, ECSubReadReply&, ZTracer::Trace const&)+0xd89) [0x5614928b6459]
2024-03-24T22:41:18.029 INFO:tasks.ceph.osd.1.smithi012.stderr: 7: (ECBackend::_handle_message(boost::intrusive_ptr<OpRequest>)+0x9d) [0x5614928b6bcd]
2024-03-24T22:41:18.029 INFO:tasks.ceph.osd.1.smithi012.stderr: 8: (PGBackend::handle_message(boost::intrusive_ptr<OpRequest>)+0x56) [0x5614926bfe76]
2024-03-24T22:41:18.030 INFO:tasks.ceph.osd.1.smithi012.stderr: 9: (PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&, ThreadPool::TPHandle&)+0x80d) [0x56149260941d]
2024-03-24T22:41:18.030 INFO:tasks.ceph.osd.1.smithi012.stderr: 10: (OSD::dequeue_op(boost::intrusive_ptr<PG>, boost::intrusive_ptr<OpRequest>, ThreadPool::TPHandle&)+0x197) [0x561492542417]
2024-03-24T22:41:18.030 INFO:tasks.ceph.osd.1.smithi012.stderr: 11: (ceph::osd::scheduler::PGOpItem::run(OSD*, OSDShard*, boost::intrusive_ptr<PG>&, ThreadPool::TPHandle&)+0x69) [0x5614927902d9]
2024-03-24T22:41:18.030 INFO:tasks.ceph.osd.1.smithi012.stderr: 12: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0xd07) [0x56149255d7a7]
2024-03-24T22:41:18.030 INFO:tasks.ceph.osd.1.smithi012.stderr: 13: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x2aa) [0x561492a63f2a]
2024-03-24T22:41:18.030 INFO:tasks.ceph.osd.1.smithi012.stderr: 14: ceph-osd(+0xaa74d4) [0x561492a644d4]
2024-03-24T22:41:18.030 INFO:tasks.ceph.osd.1.smithi012.stderr: 15: /lib64/libc.so.6(+0x9f802) [0x7fe34349f802]
2024-03-24T22:41:18.030 INFO:tasks.ceph.osd.1.smithi012.stderr: 16: /lib64/libc.so.6(+0x3f450) [0x7fe34343f450]
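For anyone triaging the ECUtil.cc:31 failure above: the assertion requires that the value it calls total_data_size be a whole multiple of the stripe's chunk size. The following is a minimal Python sketch of that alignment check only, not Ceph's code. The function name and the sample sizes (a 4096-byte chunk) are hypothetical and chosen purely for illustration.

```python
def is_chunk_aligned(total_data_size: int, chunk_size: int) -> bool:
    """Return True when total_data_size is a whole multiple of chunk_size.

    Mirrors the condition in the failed assertion:
        total_data_size % sinfo.get_chunk_size() == 0
    Both arguments here are plain integers; the sizes used below are
    illustrative assumptions, not values taken from the failing run.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return total_data_size % chunk_size == 0


# Two full 4096-byte chunks: the check passes.
aligned = is_chunk_aligned(8192, 4096)

# A buffer that ends partway through a chunk: the check fails,
# which is the situation in which the OSD would abort.
unaligned = is_chunk_aligned(6000, 4096)
```

A read that returns a buffer shorter than a full chunk, as a partial stripe read might, is one way such a check could trip, though the logs above do not confirm the actual cause.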
/a/yuriw-2024-03-24_22:19:24-rados-wip-yuri10-testing-2024-03-24-1159-distro-default-smithi/7620629
/a/yuriw-2024-03-24_22:19:24-rados-wip-yuri10-testing-2024-03-24-1159-distro-default-smithi/7620562
/a/yuriw-2024-03-24_22:19:24-rados-wip-yuri10-testing-2024-03-24-1159-distro-default-smithi/7620493
/a/yuriw-2024-03-24_22:19:24-rados-wip-yuri10-testing-2024-03-24-1159-distro-default-smithi/7620766
/a/yuriw-2024-03-24_22:19:24-rados-wip-yuri10-testing-2024-03-24-1159-distro-default-smithi/7620696
2024-03-25T00:03:47.244 INFO:tasks.ceph:Generating config...
2024-03-25T00:03:47.244 ERROR:teuthology.contextutil:Saw exception from nested tasks
Traceback (most recent call last):
File "/home/teuthworker/src/git.ceph.com_teuthology_e691533f9cbb33d85b2187bba20d7102f098636d/teuthology/contextutil.py", line 30, in nested
vars.append(enter())
File "/usr/lib/python3.8/contextlib.py", line 113, in __enter__
return next(self.gen)
File "/home/teuthworker/src/github.com_ceph_ceph-c_9c429fb6abfb5deed62697d56d8997d0f0d6d83f/qa/tasks/ceph.py", line 693, in cluster
mons = get_mons(
File "/home/teuthworker/src/github.com_ceph_ceph-c_9c429fb6abfb5deed62697d56d8997d0f0d6d83f/qa/tasks/ceph.py", line 510, in get_mons
assert mons
AssertionError
/a/yuriw-2024-03-24_22:19:24-rados-wip-yuri10-testing-2024-03-24-1159-distro-default-smithi/7620713
2024-03-25T01:13:39.669 INFO:tasks.workunit.client.0.smithi057.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/mon/mkfs.sh:46: mon_mkfs: ceph-mon --id a --fsid 1b7721ad-53dd-46de-9e5c-35e413869819 --mkfs --mon-data=mkfs/a --mon-initial-members=a --mon-host=127.0.0.1:7110 '--key=corrupted key'
2024-03-25T01:13:40.053 INFO:tasks.workunit.client.0.smithi057.stderr:2024-03-25T01:13:40.073+0000 7fc429255b00 -1 mon.a@-1(???) e0 error decoding keyring [mon.]
2024-03-25T01:13:40.053 INFO:tasks.workunit.client.0.smithi057.stderr: key = corrupted key
2024-03-25T01:13:40.053 INFO:tasks.workunit.client.0.smithi057.stderr: caps mon = "allow *"
2024-03-25T01:13:40.053 INFO:tasks.workunit.client.0.smithi057.stderr:: error setting modifier for [mon.] type=key val=corrupted key: Malformed input [buffer:3]
2024-03-25T01:13:40.053 INFO:tasks.workunit.client.0.smithi057.stderr:2024-03-25T01:13:40.073+0000 7fc429255b00 -1 ceph-mon: error creating monfs: (22) Invalid argument
2024-03-25T01:13:40.054 INFO:tasks.workunit.client.0.smithi057.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/mon/mkfs.sh:132: auth_cephx_key: rm -fr mkfs/a/store.db
2024-03-25T01:13:40.056 INFO:tasks.workunit.client.0.smithi057.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/mon/mkfs.sh:133: auth_cephx_key: rm -fr mkfs/a/kv_backend
2024-03-25T01:13:40.056 INFO:tasks.workunit.client.0.smithi057.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/mon/mkfs.sh:136: auth_cephx_key: mon_mkfs --key=AQDDzwBme7QCKRAAnQOaDLkESezQnTnQYSXS8g==
2024-03-25T01:13:40.057 INFO:tasks.workunit.client.0.smithi057.stderr://home/ubuntu/cephtest/clone.client.0/qa/standalone/mon/mkfs.sh:43: mon_mkfs: uuidgen
2024-03-25T01:13:40.058 INFO:tasks.workunit.client.0.smithi057.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/mon/mkfs.sh:43: mon_mkfs: local fsid=f1cd752b-38fe-411f-8959-298738145745
2024-03-25T01:13:40.058 INFO:tasks.workunit.client.0.smithi057.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/mon/mkfs.sh:46: mon_mkfs: ceph-mon --id a --fsid f1cd752b-38fe-411f-8959-298738145745 --mkfs --mon-data=mkfs/a --mon-initial-members=a --mon-host=127.0.0.1:7110 --key=AQDDzwBme7QCKRAAnQOaDLkESezQnTnQYSXS8g==
2024-03-25T01:13:40.080 INFO:tasks.workunit.client.0.smithi057.stderr:2024-03-25T01:13:40.100+0000 7ff2ea62ab00 -1 'mkfs/a' already exists and is not empty: monitor may already exist
2024-03-25T01:13:40.082 DEBUG:teuthology.orchestra.run:got remote process result: 1
2024-03-25T01:13:40.082 INFO:tasks.workunit.client.0.smithi057.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/mon/mkfs.sh:138: auth_cephx_key: '[' -f mkfs/a/keyring ']'
2024-03-25T01:13:40.082 INFO:tasks.workunit.client.0.smithi057.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/mon/mkfs.sh:138: auth_cephx_key: return 1
2024-03-25T01:13:40.082 INFO:tasks.workunit.client.0.smithi057.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/mon/mkfs.sh:184: run: return 1
2024-03-25T01:13:40.083 INFO:tasks.workunit:Stopping ['mon'] on client.0...
2024-03-25T01:13:40.083 DEBUG:teuthology.orchestra.run.smithi057:> sudo rm -rf -- /home/ubuntu/cephtest/workunits.list.client.0 /home/ubuntu/cephtest/clone.client.0
2024-03-25T01:13:40.415 ERROR:teuthology.run_tasks:Saw exception from tasks.
/a/yuriw-2024-03-24_22:19:24-rados-wip-yuri10-testing-2024-03-24-1159-distro-default-smithi/7620568
2024-03-25T00:39:26.381 INFO:tasks.daemonwatchdog.daemon_watchdog:daemon ceph.mon.a is failed for ~10s
2024-03-25T00:39:28.447 ERROR:teuthology.run_tasks:Manager failed: ceph
Traceback (most recent call last):
File "/home/teuthworker/src/git.ceph.com_teuthology_e691533f9cbb33d85b2187bba20d7102f098636d/teuthology/run_tasks.py", line 154, in run_tasks
suppress = manager.__exit__(*exc_info)
File "/usr/lib/python3.8/contextlib.py", line 120, in __exit__
next(self.gen)
File "/home/teuthworker/src/github.com_ceph_ceph-c_9c429fb6abfb5deed62697d56d8997d0f0d6d83f/qa/tasks/ceph.py", line 1935, in task
mon0_remote.run(
File "/usr/lib/python3.8/contextlib.py", line 120, in __exit__
next(self.gen)
File "/home/teuthworker/src/git.ceph.com_teuthology_e691533f9cbb33d85b2187bba20d7102f098636d/teuthology/contextutil.py", line 54, in nested
raise exc[1]
File "/usr/lib/python3.8/contextlib.py", line 131, in __exit__
self.gen.throw(type, value, traceback)
File "/home/teuthworker/src/github.com_ceph_ceph-c_9c429fb6abfb5deed62697d56d8997d0f0d6d83f/qa/tasks/ceph.py", line 252, in ceph_log
yield
File "/home/teuthworker/src/git.ceph.com_teuthology_e691533f9cbb33d85b2187bba20d7102f098636d/teuthology/contextutil.py", line 46, in nested
if exit(*exc):
File "/usr/lib/python3.8/contextlib.py", line 120, in __exit__
next(self.gen)
File "/home/teuthworker/src/github.com_ceph_ceph-c_9c429fb6abfb5deed62697d56d8997d0f0d6d83f/qa/tasks/ceph.py", line 1449, in run_daemon
teuthology.stop_daemons_of_type(ctx, type_, cluster_name)
File "/home/teuthworker/src/git.ceph.com_teuthology_e691533f9cbb33d85b2187bba20d7102f098636d/teuthology/misc.py", line 1173, in stop_daemons_of_type
daemon.stop()
File "/home/teuthworker/src/git.ceph.com_teuthology_e691533f9cbb33d85b2187bba20d7102f098636d/teuthology/orchestra/daemon/state.py", line 139, in stop
run.wait([self.proc], timeout=timeout)
File "/home/teuthworker/src/git.ceph.com_teuthology_e691533f9cbb33d85b2187bba20d7102f098636d/teuthology/orchestra/run.py", line 473, in wait
check_time()
File "/home/teuthworker/src/git.ceph.com_teuthology_e691533f9cbb33d85b2187bba20d7102f098636d/teuthology/contextutil.py", line 134, in __call__
raise MaxWhileTries(error_msg)
teuthology.exceptions.MaxWhileTries: reached maximum tries (51) after waiting for 300 seconds
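The last traceback ends in a bounded wait that gave up while stopping a daemon. As a rough illustration of that pattern, here is a minimal Python sketch of a polling loop with a timeout. It is not teuthology's implementation: the names wait_until and MaxTriesReached are hypothetical, and the 6-second interval is an assumption chosen so that the numbers line up with the message above (51 tries, 300 seconds).

```python
import time
from typing import Callable


class MaxTriesReached(Exception):
    """Hypothetical stand-in for the exception seen in the traceback."""


def wait_until(
    done: Callable[[], bool],
    timeout: float = 300.0,
    interval: float = 6.0,  # assumed polling interval, not taken from the log
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll done() until it returns True or the total wait reaches timeout.

    Returns the number of tries on success and raises MaxTriesReached once
    the accumulated wait has reached the timeout without success.
    """
    waited = 0.0
    tries = 0
    while True:
        tries += 1
        if done():
            return tries
        if waited >= timeout:
            raise MaxTriesReached(
                f"reached maximum tries ({tries}) "
                f"after waiting for {waited:.0f} seconds"
            )
        sleep(interval)
        waited += interval


# Simulate a daemon that never stops, without actually sleeping.
try:
    wait_until(lambda: False, sleep=lambda seconds: None)
    message = ""
except MaxTriesReached as exc:
    message = str(exc)
```

With these assumed defaults, 50 sleeps of 6 seconds add up to 300 seconds and the loop gives up on the 51st check.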
Updated by Yuri Weinstein 28 days ago
- Status changed from QA Testing to QA Needs Approval
Updated by Yuri Weinstein 27 days ago
- Git Branch set to wip-yuri10-testing-2024-03-24-1159
Updated by Yuri Weinstein 27 days ago
- Git Branch changed from wip-yuri10-testing-2024-03-24-1159 to yuriw/ceph/commits/wip-yuri10-testing-2024-03-24-1159
Updated by Laura Flores 27 days ago
- Assignee changed from Laura Flores to Radoslaw Zarzynski
Updated by Radoslaw Zarzynski 26 days ago
Yes, this is ready for a retest.
BTW: https://tracker.ceph.com/issues/65237 will be a reference point for reviewing.
Updated by Yuri Weinstein 26 days ago
Radoslaw Zarzynski wrote:
> Yes, this is ready for a retest.
> BTW: https://tracker.ceph.com/issues/65237 will be a reference point for reviewing.
I think the latest run includes the latest commits
Did you review it yet?
Updated by Radoslaw Zarzynski 26 days ago
I reviewed the reference point (https://tracker.ceph.com/issues/65237#note-20). As it's broken, there is no point in reviewing this.
Updated by Yuri Weinstein 26 days ago
Do we want this to stay open?
@Radoslaw Smigielski
Updated by Laura Flores 22 days ago
Yuri Weinstein wrote in #note-14:
> Do we want this to stay open?
> @Radoslaw Smigielski
Meant for @Radoslaw Zarzynski
Updated by Yuri Weinstein 21 days ago
Seems like it needs a rebase, as I see a recent commit by @Radoslaw Zarzynski.
Rebasing.
Updated by Yuri Weinstein 21 days ago
- Status changed from QA Needs Approval to QA Needs Rerun/Rebuilt
- Assignee changed from Radoslaw Zarzynski to Yuri Weinstein
Updated by Yuri Weinstein 20 days ago
- Status changed from QA Needs Rerun/Rebuilt to QA Needs Approval
- Assignee changed from Yuri Weinstein to Radoslaw Zarzynski
- Tags set to core
Updated by Radoslaw Zarzynski 20 days ago
In the new run (https://pulpito.ceph.com/yuriw-2024-04-10_14:20:47-rados-wip-yuri10-testing-2024-03-24-1159-distro-default-smithi/) everything failed because of a lab issue:
HTTPSConnectionPool(host='shaman.ceph.com', port=443): Max retries exceeded with url: /api/search?status=ready&project=ceph&flavor=default&distros=centos%2F9%2Fx86_64&ref=wip-yuri10-testing-2024-03-24-1159 (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7fd335862970>: Failed to establish a new connection: [Errno 110] Connection timed out'))
I'm going to rerun the failures.
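To make it easier to see which build the failed shaman lookup was searching for, the percent-encoded query string from the error above can be decoded with the Python standard library. This is only a reading aid for the logged URL: it does not contact shaman, and the variable names are illustrative.

```python
from urllib.parse import parse_qs, urlsplit

# Request path copied verbatim from the HTTPSConnectionPool error above.
request_path = (
    "/api/search?status=ready&project=ceph&flavor=default"
    "&distros=centos%2F9%2Fx86_64"
    "&ref=wip-yuri10-testing-2024-03-24-1159"
)

# parse_qs percent-decodes the values and returns a list per key;
# each key appears once here, so keep the first value.
query = parse_qs(urlsplit(request_path).query)
params = {key: values[0] for key, values in query.items()}

for key, value in params.items():
    print(f"{key}: {value}")
```

The decoded distros value is centos/9/x86_64 and the ref is the branch under test, so the search itself was well formed; the failure was the connection timing out.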