Activity

From 09/01/2022 to 09/30/2022

09/30/2022

11:55 PM Bug #47212 (Resolved): out-of-order "Error: finished tid 3 when last_acked_tid was 5"
No backports for crimson for now. Samuel Just
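
The error text in this bug's title reads like an in-order completion check: each finished operation's tid is expected to be higher than the last one acknowledged. The sketch below is an illustration only, not the test harness's actual code. The class name `AckOrderChecker` and its method are hypothetical, and the strict "greater than" rule is an assumption inferred from the message.

```python
class AckOrderChecker:
    """Hypothetical checker: operations must finish in increasing tid order."""

    def __init__(self):
        self.last_acked_tid = 0

    def finish(self, tid):
        # Assumption: a tid at or below the last acked one means out-of-order completion.
        if tid <= self.last_acked_tid:
            raise RuntimeError(
                f"Error: finished tid {tid} when last_acked_tid was {self.last_acked_tid}"
            )
        self.last_acked_tid = tid


checker = AckOrderChecker()
checker.finish(1)
checker.finish(5)
try:
    checker.finish(3)  # completes after tid 5 was already acked
    message = None
except RuntimeError as exc:
    message = str(exc)
```

Here `message` ends up holding the same wording as the bug title, because tid 3 completes after tid 5 has already been acknowledged.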
11:14 PM Bug #57739 (New): crimson: LogMissingRequest and RepRequest operator<< access possibly invalid req
... Samuel Just
09:49 PM Bug #57738: crimson: repop ordering bug
... Samuel Just
09:43 PM Bug #57738: crimson: repop ordering bug
... Samuel Just
09:39 PM Bug #57738 (Resolved): crimson: repop ordering bug
... Samuel Just

09/28/2022

08:30 AM Bug #57693 (Resolved): Messenger test failed against test_messenger_peer.cc
... Yingxin Cheng
02:44 AM Bug #57549: Crimson: Alienstore not work after ceph enable c++20
Removing --bluestore-devs /dev/nvme7n1 or using a debug build still gives the same problem.
I find a different place comp...
Jianxin Li

09/26/2022

10:27 AM Bug #57578: crimson: assertion failure in _do_transaction_step()
The correct order is:... Radoslaw Zarzynski

09/22/2022

09:47 PM Bug #57654 (New): crimson/osd: check blocked peering ops when we get a new map and cancel any for...
Samuel Just
08:10 PM Bug #57653 (New): crimson/os: remove CollectionRef from FuturizedStore interface
It seems to serve no purpose other than to impose a bunch of complexity to maintain the refcount for FuturizedCollect... Samuel Just
07:21 PM Bug #57629: crimson: segfault during mkfs
Tried https://github.com/ceph/ceph/pull/48203, slightly different backtrace... Samuel Just
01:35 AM Bug #57629: crimson: segfault during mkfs
More complete backtrace from gdb:... Samuel Just
12:07 AM Bug #57629 (New): crimson: segfault during mkfs
Release build: ./do_cmake.sh -DWITH_SEASTAR=ON -DWITH_MGR_DASHBOARD_FRONTEND=OFF -DWITH_CCACHE=ON -DCMAKE_BUILD_TYPE=... Samuel Just
12:02 AM Bug #57549: Crimson: Alienstore not work after ceph enable c++20
Ah, with a release build, I'm getting a segfault during mkfs (https://tracker.ceph.com/issues/57629). Not apparently... Samuel Just

09/21/2022

06:45 PM Bug #57549: Crimson: Alienstore not work after ceph enable c++20
Hmm, pgs are going active+clean for me with 1 osd, will try a release build next. Do you get the same result if you ... Samuel Just
06:41 PM Bug #57549: Crimson: Alienstore not work after ceph enable c++20
Hmm, two differences to investigate: I'm using 3 osds and a debug build. Samuel Just
05:07 PM Bug #57626 (New): crimson: add gate for peering related async tasks
Some peering handlers on PG spawn async tasks on other cores; we need to gate these for clean osd shutdown (or block ... Samuel Just
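
The gating idea in this entry can be sketched as a counter of in-flight tasks that shutdown waits on. This is a minimal illustration in Python's asyncio, not crimson's C++/Seastar code. The names `Gate`, `GateClosed`, and `peering_task` are hypothetical, and refusing new tasks once shutdown has begun is an assumption about the desired behavior.

```python
import asyncio


class GateClosed(Exception):
    """Raised when a task tries to enter a gate that is already closing."""


class Gate:
    """Hypothetical gate: counts in-flight tasks so shutdown can wait for them."""

    def __init__(self):
        self._count = 0
        self._closed = False
        self._drained = asyncio.Event()
        self._drained.set()  # nothing in flight yet

    async def run(self, coro_fn):
        """Run coro_fn() inside the gate, or refuse if shutdown has begun."""
        if self._closed:
            raise GateClosed()
        self._count += 1
        self._drained.clear()
        try:
            return await coro_fn()
        finally:
            self._count -= 1
            if self._count == 0:
                self._drained.set()

    async def close(self):
        """Stop admitting new tasks, then wait for in-flight ones to finish."""
        self._closed = True
        await self._drained.wait()


async def demo():
    gate = Gate()
    finished = []

    async def peering_task(name):
        await asyncio.sleep(0.01)  # stands in for work done on another core
        finished.append(name)

    tasks = [asyncio.create_task(gate.run(lambda n=n: peering_task(n)))
             for n in ("a", "b")]
    await asyncio.sleep(0)  # let both tasks enter the gate
    await gate.close()      # returns only after both tasks have left
    drained = sorted(finished)

    try:
        await gate.run(lambda: peering_task("late"))
        rejected = False
    except GateClosed:
        rejected = True

    await asyncio.gather(*tasks)
    return drained, rejected


result = asyncio.run(demo())
```

In this sketch `close()` returns only after both spawned tasks have finished, and a task submitted afterwards is refused rather than left running during shutdown.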
12:36 AM Bug #57617: crimson: need to actually set version/user_version for duplicate ops
in wip-sjust-testing Samuel Just
12:08 AM Bug #57617 (In Progress): crimson: need to actually set version/user_version for duplicate ops
Samuel Just
12:08 AM Bug #57617 (Resolved): crimson: need to actually set version/user_version for duplicate ops
Tends to create ceph_test_rados failures that look like:... Samuel Just

09/20/2022

06:58 AM Bug #57549: Crimson: Alienstore not work after ceph enable c++20
Tested commit sha1: d64757910360804a94ea80787e1ea5f0853e5aff,
or any commit after it.
Jianxin Li
12:55 AM Bug #57506 (Resolved): crimson: vstart cluster pgs stuck in +wait
Samuel Just
12:54 AM Bug #57494 (Resolved): crimson: IO hang with vstart immediately after cluster start
Samuel Just
12:54 AM Bug #57495 (Resolved): crimson: osd crash
Samuel Just
12:48 AM Bug #57547: Hang with seastore at wait_for_active stage
Sam, you can reproduce it with main + PR 48057 + PR 48059 (I think plain main should be OK too, since the PRs don't fix it),
1.set...
chunmei liu

09/19/2022

11:13 PM Bug #57547: Hang with seastore at wait_for_active stage
uploaded grep '34\.6' <log file for primary> | gzip > pg34.6.log.gz to rados-api-test-3 branch, see the last commit. chunmei liu

09/17/2022

09:17 PM Bug #57547: Hang with seastore at wait_for_active stage
DEBUG 2022-09-16 21:11:31,921 [shard 0] osd - do_peering_event
ignoring epoch_sent: 110 epoch_requested: 110 MLease ...
Samuel Just
09:14 PM Bug #57547: Hang with seastore at wait_for_active stage
The first log from this bug just doesn't seem to be long enough; it only extends for about 0.5s after 111.6 is created, s... Samuel Just
08:55 PM Bug #57547: Hang with seastore at wait_for_active stage
Unfortunately, git-lfs doesn't seem to work:
fetch: Fetching reference refs/heads/rados-api-test-3
batch response...
Samuel Just
01:24 AM Bug #57547: Hang with seastore at wait_for_active stage
I used git lfs to put the big log files on GitHub.
https://github.com/liu-chunmei/ceph/tree/rados-api-test-3, che...
chunmei liu

09/16/2022

09:34 PM Bug #57547: Hang with seastore at wait_for_active stage
From your ceph pg dump, there appear to be no pgs in pool 37. Can you find in the osd logs any evidence of pgs from ... Samuel Just
03:49 AM Bug #57547: Hang with seastore at wait_for_active stage
Samuel Just wrote:
> Can you add to the description:
> - exact sha1 tested (with github link if a branch not yet me...
chunmei liu
04:55 PM Bug #57578 (In Progress): crimson: assertion failure in _do_transaction_step()
Radoslaw Zarzynski
04:55 PM Bug #57578: crimson: assertion failure in _do_transaction_step()
The problem is that we prepend the @clone@:... Radoslaw Zarzynski
02:49 PM Bug #57578: crimson: assertion failure in _do_transaction_step()
The same test fails on BlueStore too. Radoslaw Zarzynski
02:16 PM Bug #57578 (Resolved): crimson: assertion failure in _do_transaction_step()
Cluster based on @dc9b89d619920da9b69b72e80ffdf057f865be50@ deployed with:... Radoslaw Zarzynski
05:39 AM Bug #57548: Hang with alienstore
I think this is because pulling in C++20 broke crimson+alienstore. I will retest once that bug is fixed. chunmei liu

09/15/2022

03:01 PM Bug #57542 (Fix Under Review): crimson: PGAdvanceMap updates from wrong version
Radoslaw Zarzynski
05:39 AM Bug #57549: Crimson: Alienstore not work after ceph enable c++20
Samuel Just wrote:
> So, I can't immediately reproduce this -- vstart clusters with alienstore on current main work ...
Jianxin Li
03:43 AM Bug #57549: Crimson: Alienstore not work after ceph enable c++20
So, I can't immediately reproduce this -- vstart clusters with alienstore on current main work for me locally.
We'...
Samuel Just
03:33 AM Bug #57549 (Closed): Crimson: Alienstore not work after ceph enable c++20
After Ceph enabled C++20 (PR: https://github.com/ceph/ceph/pull/45133), crimson alienstore cannot work normally while ... Jianxin Li
04:07 AM Bug #57530 (Resolved): crimson/seastore: crash in --mkfs with vstart
Samuel Just
04:02 AM Bug #57548: Hang with alienstore
Can you add to the description:
- exact sha1 tested (with github link if a branch not yet merged)
- vstart.sh comma...
Samuel Just
01:23 AM Bug #57548: Hang with alienstore
error log file:
https://gist.github.com/liu-chunmei/37236ed967fcec81cb9d8df22001d730
chunmei liu
01:19 AM Bug #57548 (New): Hang with alienstore
When starting crimson with alienstore and running ceph_test_rados_api_aio_pp, the system still hangs: the client request does not return. chunmei liu
03:58 AM Bug #57547: Hang with seastore at wait_for_active stage
The target raw pg for the op is 37.7fc1f406, but that log snippet does not seem to contain any pgs from that pool (37... Samuel Just
03:49 AM Bug #57547: Hang with seastore at wait_for_active stage
Can you add to the description:
- exact sha1 tested (with github link if a branch not yet merged)
- vstart.sh comma...
Samuel Just
01:17 AM Bug #57547: Hang with seastore at wait_for_active stage
The hung op is "append"; see the (./bin/ceph --admin-daemon asok/client.admin.<pid>.asok objecter_requests | less) output... chunmei liu
01:13 AM Bug #57547: Hang with seastore at wait_for_active stage
https://gist.github.com/liu-chunmei/1370d7925833ab5c9b11217e9d2c9e51
error log file.
chunmei liu
01:12 AM Bug #57547 (New): Hang with seastore at wait_for_active stage
When running the ceph_test_rados_api_aio_pp test while the pg is not active, the client request is put into PGActivationBlocker::w... chunmei liu

09/14/2022

11:46 PM Bug #57530: crimson/seastore: crash in --mkfs with vstart
https://github.com/ceph/ceph/pull/48105
Samuel Just
05:41 PM Bug #57542 (Resolved): crimson: PGAdvanceMap updates from wrong version
... Radoslaw Zarzynski
01:31 PM Bug #57539 (New): crimson osd not showing correct object count
I was running some tests on a 1TB regular SSD with crimson-osd/seastore (using BlockSegmentManager, I had to modify t... Aravind Ramesh
11:00 AM Bug #57536 (In Progress): crimson: OSDMapGate is updated in wrong order
The out-of-order `OSDMapGate::current` updates
==============================================
Local replication
...
Radoslaw Zarzynski

09/13/2022

11:21 PM Bug #57530 (Resolved): crimson/seastore: crash in --mkfs with vstart
main branch, commit 2bdccfd5eab5a18a4b6a69ef7ee31916a3f1968e... Samuel Just
02:18 AM Bug #57506: crimson: vstart cluster pgs stuck in +wait
https://github.com/ceph/ceph/pull/48057 covers the stuck part, but the fact that IO continues regardless means that w... Samuel Just
02:18 AM Bug #57508 (New): crimson: need to actually block IO while read lease expires in +wait state
See https://tracker.ceph.com/issues/57506 -- pgs in +wait should block IO until it clears. See the implementation in... Samuel Just

09/12/2022

11:15 PM Bug #57506: crimson: vstart cluster pgs stuck in +wait
Steady state several minutes later after the rados bench instance completed (successfully!)... Samuel Just
11:05 PM Bug #57506 (Resolved): crimson: vstart cluster pgs stuck in +wait
Main (c49b81c7d619cea23e9707d1f5bcc7de3049c4fd) + sjust/wip-io-hang (https://github.com/ceph/ceph/pull/48057)
<pre...
Samuel Just
11:14 PM Bug #57495: crimson: osd crash
https://github.com/ceph/ceph/pull/48057 Samuel Just
11:14 PM Bug #57494: crimson: IO hang with vstart immediately after cluster start
https://github.com/ceph/ceph/pull/48057 Samuel Just
12:06 AM Bug #57494: crimson: IO hang with vstart immediately after cluster start
Ok, the problem is reusing the same PipeHandle on requeue. Patch that fixes that seems to have solved the problem, w... Samuel Just

09/11/2022

12:15 AM Bug #57495: crimson: osd crash
Essential problem is that we don't atomically unblock and record_unblock. Samuel Just

09/10/2022

11:07 PM Bug #57495 (Resolved): crimson: osd crash
c49b81c7d619cea23e9707d1f5bcc7de3049c4fd with the following debugging (may have changed the continuation timing)
<p...
Samuel Just
10:05 PM Bug #57494: crimson: IO hang with vstart immediately after cluster start
Hmm, in fact ClientRequest::Orderer::requeue does cancel the outstanding handles. Samuel Just
09:33 PM Bug #57494: crimson: IO hang with vstart immediately after cluster start
We don't actually release the pipeline handle when we are interrupted. This can result in a circular dependency if p... Samuel Just
09:22 PM Bug #57494: crimson: IO hang with vstart immediately after cluster start
Least recent went through an actingset change
DEBUG 2022-09-10 21:16:26,115 [shard 0] osd - client_request(id=10, ...
Samuel Just
09:21 PM Bug #57494: crimson: IO hang with vstart immediately after cluster start
Most recent stuck on await_map:
DEBUG 2022-09-10 21:16:38,168 [shard 0] osd - client_request(id=350, detail=m=[osd...
Samuel Just
09:20 PM Bug #57494: crimson: IO hang with vstart immediately after cluster start
./bin/ceph --admin-daemon asok/bench.asok objecter_requests
...
{
"ops": [
{
"tid": 6,
...
Samuel Just
09:19 PM Bug #57494 (Resolved): crimson: IO hang with vstart immediately after cluster start
Reproduces about 1/2 of the time (edit: more like 1/8 of the time now) with
MDS=0 MGR=1 OSD=3 MON=1 ../src/vstart.sh --...
Samuel Just
 
