Activity

From 05/07/2018 to 06/05/2018

06/05/2018

05:34 PM Bug #24365 (Pending Backport): cosbench stuck at booting cosbench driver
Neha Ojha
01:33 AM Bug #24365 (Fix Under Review): cosbench stuck at booting cosbench driver
https://github.com/ceph/ceph/pull/22405 Neha Ojha
04:04 PM Bug #24408 (Pending Backport): tell ... config rm <foo> not idempotent
Kefu Chai
11:00 AM Bug #24423 (Resolved): failed to load OSD map for epoch X, got 0 bytes
After upgrading to Mimic I deleted a non-lvm OSD and recreated it with 'ceph-volume lvm prepare --bluestore --data /d... Sergey Malinin
10:37 AM Bug #24422: Ceph OSDs crashing in BlueStore::queue_transactions() using EC
Same as https://tracker.ceph.com/issues/21475, and I have already set bluestore_deferred_throttle_bytes = 0
bluest...
鹏 张
10:31 AM Bug #24422: Ceph OSDs crashing in BlueStore::queue_transactions() using EC
2018-06-05T17:46:28.273183+08:00 node54 ceph-osd: /work/build/rpmbuild/BUILD/infinity-3.2.5/src/os/bluestore/BlueStor... 鹏 张
10:31 AM Bug #24422: Ceph OSDs crashing in BlueStore::queue_transactions() using EC
鹏 张 wrote:
> ceph version: 12.2.5
> data pool use Ec module 2 + 1.
> When restart one osd,it case crash and restar...
鹏 张
10:26 AM Bug #24422: Ceph OSDs crashing in BlueStore::queue_transactions() using EC
1.-45> 2018-06-05 17:47:56.886142 7f8972974700 -1 bluestore(/var/lib/ceph/osd/ceph-12) _txc_add_transaction error (2)... 鹏 张
10:25 AM Bug #24422 (Duplicate): Ceph OSDs crashing in BlueStore::queue_transactions() using EC
ceph version: 12.2.5
The data pool uses EC mode 3 + 1.
When one OSD is restarted, it causes crashes and more and more restarts.
...
鹏 张
04:42 AM Bug #24419 (Won't Fix): ceph-objectstore-tool unable to open mon store
Hi, everyone;
I am using luminous v12.2.5 and trying to recover the monitor database from the OSDs.
I perform step by step acc...
dovefi Z
03:32 AM Backport #24291 (In Progress): jewel: common: JSON output from rados bench write has typo in max_...
https://github.com/ceph/ceph/pull/22407 Prashant D
02:37 AM Bug #23875: Removal of snapshot with corrupt replica crashes osd

If update_snap_map() ignores the error from remove_oid() we still crash because an op from the primary related to...
David Zafman
02:20 AM Backport #24292 (In Progress): mimic: common: JSON output from rados bench write has typo in max_...
https://github.com/ceph/ceph/pull/22406 Prashant D

06/04/2018

06:32 PM Bug #24368: osd: should not restart on permanent failures
It would, but the previous settings were there for a reason, so I'm not sure if it's feasible to backport this for cep... Greg Farnum
05:10 PM Bug #24371 (Fix Under Review): Ceph-osd crash when activate SPDK
Greg Farnum
04:00 PM Bug #24408 (Fix Under Review): tell ... config rm <foo> not idempotent
https://github.com/ceph/ceph/pull/22395 Sage Weil
03:56 PM Bug #24408 (Resolved): tell ... config rm <foo> not idempotent
... Sage Weil
02:56 PM Backport #24407 (In Progress): mimic: read object attrs failed at EC recovery
Kefu Chai
02:56 PM Backport #24407 (Resolved): mimic: read object attrs failed at EC recovery
https://github.com/ceph/ceph/pull/22394 Kefu Chai
02:54 PM Bug #24406 (Resolved): read object attrs failed at EC recovery
https://github.com/ceph/ceph/pull/22196 Kefu Chai
02:18 PM Backport #24290 (In Progress): luminous: common: JSON output from rados bench write has typo in m...
https://github.com/ceph/ceph/pull/22391 Prashant D
11:53 AM Bug #24366 (Pending Backport): omap_digest handling still not correct
Kefu Chai
06:27 AM Bug #23352: osd: segfaults under normal operation
Looking at the crash in http://tracker.ceph.com/issues/23352#note-14 there's a fairly glaring problem.... Brad Hubbard
12:14 AM Bug #23352: osd: segfaults under normal operation
Hi Kjetil,
Sure, worth a look, but AFAICT all access is protected by SafeTimers locks.
Brad Hubbard
02:08 AM Backport #24258 (In Progress): luminous: crush device class: Monitor Crash when moving Bucket int...
https://github.com/ceph/ceph/pull/22381 Prashant D

06/02/2018

12:04 AM Bug #24365 (In Progress): cosbench stuck at booting cosbench driver
Two things caused this issue:
1. cosbench requires openjdk-8. The cbt task does install this dependency, but we al...
Neha Ojha

06/01/2018

08:05 PM Bug #23352: osd: segfaults under normal operation
Brad Hubbard wrote:
> I've confirmed that in all of the SafeTimer segfaults the 'schedule' multimap is empty, indica...
Kjetil Joergensen
06:01 PM Bug #24368: osd: should not restart on permanent failures
Sounds like something that would be useful in our stable releases - Greg, do you agree? Nathan Cutler
05:56 PM Backport #24360 (Need More Info): luminous: osd: leaked Session on osd.7
Do Not Backport For Now
see https://github.com/ceph/ceph/pull/22339#issuecomment-393574371 for details
Nathan Cutler
05:44 PM Backport #24383 (Resolved): mimic: osd: stray osds in async_recovery_targets cause out of order ops
https://github.com/ceph/ceph/pull/22889 Nathan Cutler
05:28 PM Backport #24381 (Resolved): luminous: omap_digest handling still not correct
https://github.com/ceph/ceph/pull/22375 David Zafman
05:28 PM Backport #24380 (Resolved): mimic: omap_digest handling still not correct
https://github.com/ceph/ceph/pull/22374 David Zafman
08:02 AM Bug #24342: Monitor's routed_requests leak
Greg Farnum wrote:
> What version are you running? The MRoute handling is all pretty old; though we've certainly dis...
Xuehan Xu
07:16 AM Bug #24373 (Fix Under Review): osd: eternal stuck PG in 'unfound_recovery'
Mykola Golub
05:22 AM Bug #24373: osd: eternal stuck PG in 'unfound_recovery'
https://github.com/ceph/ceph/pull/22358
Kouya Shimura
04:57 AM Bug #24373 (Resolved): osd: eternal stuck PG in 'unfound_recovery'
A PG might be eternally stuck in 'unfound_recovery' after some OSDs are marked down.
For example, the following st...
Kouya Shimura
06:12 AM Backport #24375 (In Progress): mimic: mon: auto compaction on rocksdb should kick in more often
Kefu Chai
06:11 AM Backport #24375 (Resolved): mimic: mon: auto compaction on rocksdb should kick in more often
https://github.com/ceph/ceph/pull/22361 Kefu Chai
06:10 AM Backport #24374 (In Progress): luminous: mon: auto compaction on rocksdb should kick in more often
Kefu Chai
06:08 AM Backport #24374 (Resolved): luminous: mon: auto compaction on rocksdb should kick in more often
https://github.com/ceph/ceph/pull/22360 Kefu Chai
06:08 AM Bug #24361 (Pending Backport): auto compaction on rocksdb should kick in more often
Kefu Chai
04:47 AM Bug #24371: Ceph-osd crash when activate SPDK
This is a bug in NVMEDevice; the fix has been committed.
Please review the PR: https://github.com/ceph/ceph...
Anonymous
02:02 AM Bug #24371: Ceph-osd crash when activate SPDK
I'm working on the issue. Anonymous
02:01 AM Bug #24371 (Resolved): Ceph-osd crash when activate SPDK
Enable SPDK and configure bluestore as mentioned in http://docs.ceph.com/docs/master/rados/configuration/bluestore-co... Anonymous
02:56 AM Feature #24363: Configure DPDK with mellanox NIC
Next, compilation passes, but none of the binaries can run.
Output error:
EAL: VFIO_RESOURCE_LIST tailq is already registere...
YongSheng Zhang
02:38 AM Feature #24363: Configure DPDK with mellanox NIC
Log details:
Mellanox NIC over fabric.
When compiling, output errors:
1. lacking the numa and cryptopp libraries
I ...
YongSheng Zhang
12:23 AM Feature #24363: Configure DPDK with mellanox NIC
Addendum:
The NIC is over optical fiber.
YongSheng Zhang
12:07 AM Bug #24160 (Resolved): Monitor down when large store data needs to compact triggered by ceph tell...
Kefu Chai

05/31/2018

11:34 PM Bug #24368 (In Progress): osd: should not restart on permanent failures
https://github.com/ceph/ceph/pull/22349 has the simple restart interval change. Will investigate the options for cond... Greg Farnum
11:25 PM Bug #24368: osd: should not restart on permanent failures
See https://www.freedesktop.org/software/systemd/man/systemd.service.html#Restart= for the details on Restart options. Greg Farnum
11:17 PM Bug #24368 (Resolved): osd: should not restart on permanent failures
Last week at OpenStack I heard a few users report OSDs were not failing hard and fast as they should be on disk issue... Greg Farnum
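The Restart= discussion above can be sketched as a systemd drop-in for the OSD unit. This is an illustrative sketch only: the drop-in mechanism is standard systemd, but the specific thresholds below are assumptions, not the values chosen in the PR.

```ini
# /etc/systemd/system/ceph-osd@.service.d/restart.conf
# Illustrative values only -- not the settings from the PR above.
[Service]
# Restart after abnormal exits (crashes, signals), but stop retrying
# once the unit fails repeatedly in a short window, so an OSD on a
# dead disk fails hard instead of flapping forever:
Restart=on-failure
RestartSec=20s
StartLimitInterval=30min
StartLimitBurst=3
```

With a drop-in like this, systemd stops restarting the unit once it fails StartLimitBurst times within StartLimitInterval, letting the OSD be marked down and out.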
07:01 PM Bug #24366 (In Progress): omap_digest handling still not correct
https://github.com/ceph/ceph/pull/22346 David Zafman
05:39 PM Bug #24366 (Resolved): omap_digest handling still not correct

When running bluestore the object info data_digest is not needed. In that case the omap_digest handling is still b...
David Zafman
06:08 PM Bug #24349 (Pending Backport): osd: stray osds in async_recovery_targets cause out of order ops
Josh Durgin
12:51 AM Bug #24349: osd: stray osds in async_recovery_targets cause out of order ops
https://github.com/ceph/ceph/pull/22330 Josh Durgin
12:46 AM Bug #24349 (Resolved): osd: stray osds in async_recovery_targets cause out of order ops
Related to https://tracker.ceph.com/issues/23827
http://pulpito.ceph.com/yuriw-2018-05-24_17:07:20-powercycle-mast...
Neha Ojha
05:07 PM Bug #24365 (Resolved): cosbench stuck at booting cosbench driver
... Neha Ojha
03:54 PM Bug #24342: Monitor's routed_requests leak
What version are you running? The MRoute handling is all pretty old; though we've certainly discovered a number of le... Greg Farnum
02:17 PM Feature #24363 (New): Configure DPDK with mellanox NIC
Hi all,
Does ceph-13.1.0 support DPDK on a Mellanox NIC?
I found many issues when compiling. I even though handle t...
YongSheng Zhang
01:22 PM Bug #24362 (Triaged): ceph-objectstore-tool incorrectly invokes crush_location_hook
Ceph release being used: 12.2.5 (cad919881333ac92274171586c827e01f554a70a) luminous (stable)
/etc/ceph/ceph.conf c...
Roman Chebotarev
11:50 AM Backport #24359 (In Progress): mimic: osd: leaked Session on osd.7
Kefu Chai
07:39 AM Backport #24359 (Resolved): mimic: osd: leaked Session on osd.7
https://github.com/ceph/ceph/pull/22339 Nathan Cutler
09:40 AM Bug #24361 (Fix Under Review): auto compaction on rocksdb should kick in more often
https://github.com/ceph/ceph/pull/22337 Kefu Chai
09:07 AM Bug #24361 (Resolved): auto compaction on rocksdb should kick in more often
in rocksdb, by default, "max_bytes_for_level_base" is 256MB, "max_bytes_for_level_multiplier" is 10. so with this set... Kefu Chai
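As a quick sanity check of the numbers above, the per-level target sizes implied by max_bytes_for_level_base = 256MB and max_bytes_for_level_multiplier = 10 can be computed directly (a sketch of the arithmetic only, not Ceph code):

```python
# RocksDB level targets: level N holds base * multiplier**(N-1) bytes.
base_mb = 256        # max_bytes_for_level_base, in MB
multiplier = 10      # max_bytes_for_level_multiplier
targets = [base_mb * multiplier ** n for n in range(4)]
print(targets)  # [256, 2560, 25600, 256000]  i.e. L1=256MB ... L4=250GB
```

So with the defaults, lower levels can grow quite large before size-triggered compaction kicks in, which is why more frequent compaction is wanted here.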
07:39 AM Backport #24360 (Resolved): luminous: osd: leaked Session on osd.7
https://github.com/ceph/ceph/pull/29859 Nathan Cutler
07:38 AM Backport #24350 (In Progress): mimic: slow mon ops from osd_failure
Nathan Cutler
07:37 AM Backport #24350 (Resolved): mimic: slow mon ops from osd_failure
https://github.com/ceph/ceph/pull/22297 Nathan Cutler
07:38 AM Backport #24356 (Resolved): luminous: osd: pg hard limit too easy to hit
https://github.com/ceph/ceph/pull/22592 Nathan Cutler
07:38 AM Backport #24355 (Resolved): mimic: osd: pg hard limit too easy to hit
https://github.com/ceph/ceph/pull/22621 Nathan Cutler
07:37 AM Backport #24351 (Resolved): luminous: slow mon ops from osd_failure
https://github.com/ceph/ceph/pull/22568 Nathan Cutler
05:31 AM Bug #20924 (Pending Backport): osd: leaked Session on osd.7
I think https://github.com/ceph/ceph/pull/22292 indeed addresses this issue.
https://github.com/ceph/ceph/pull/22384
Kefu Chai
04:51 AM Backport #24246 (In Progress): mimic: Manager daemon y is unresponsive during teuthology cluster ...
https://github.com/ceph/ceph/pull/22333 Prashant D
02:55 AM Backport #24245 (In Progress): luminous: Manager daemon y is unresponsive during teuthology clust...
https://github.com/ceph/ceph/pull/22331 Prashant D

05/30/2018

11:31 PM Bug #24160 (Fix Under Review): Monitor down when large store data needs to compact triggered by c...
Josh Durgin
10:45 PM Bug #23830: rados/standalone/erasure-code.yaml gets 160 byte pgmeta object
This looks like a similar failure: http://pulpito.ceph.com/nojha-2018-05-30_20:43:02-rados-wip-async-up2-2018-05-30-d... Neha Ojha
02:17 PM Bug #24342: Monitor's routed_requests leak
It seems that this problem has been fixed by https://github.com/ceph/ceph/commit/39e06ef8f070e136e54452bdea3f6105cd79... Xuehan Xu
01:10 PM Bug #24342 (Closed): Monitor's routed_requests leak
Joao Eduardo Luis
12:09 PM Bug #24342: Monitor's routed_requests leak
Sorry, it seems that the latest version doesn't have this problem. Really sorry; please close this. Xuehan Xu
09:36 AM Bug #24342: Monitor's routed_requests leak
https://github.com/ceph/ceph/pull/22315 Xuehan Xu
08:54 AM Bug #24342 (Closed): Monitor's routed_requests leak
Recently, we found that, in our non-leader monitors, there are a lot of routed requests that have not been recycled, a... Xuehan Xu
01:58 PM Bug #24327: osd: segv in pg_log_entry_t::encode()
Sage Weil wrote:
> This crash doesn't look familiar, and it's not clear to me what might cause segfault here. Do yo...
frank lin
01:48 PM Bug #24327 (Need More Info): osd: segv in pg_log_entry_t::encode()
This crash doesn't look familiar, and it's not clear to me what might cause segfault here. Do you have a core file? Sage Weil
01:55 PM Bug #24339: FULL_FORCE ops are dropped if fail-safe full check fails, but not resent in scan_requ...
Josh and I noticed this by code inspection. I'm nailing down out of space handling nits in the kernel client and wan... Ilya Dryomov
01:46 PM Bug #24339: FULL_FORCE ops are dropped if fail-safe full check fails, but not resent in scan_requ...
This is somewhat by design (or lack thereof)... the fail-safe check is there to prevent us from writing when we are *... Sage Weil
05:40 AM Backport #24215 (In Progress): mimic: "process (unknown)" in ceph logs
https://github.com/ceph/ceph/pull/22311 Prashant D
03:29 AM Backport #24214 (In Progress): luminous: Module 'balancer' has failed: could not find bucket -14
https://github.com/ceph/ceph/pull/22308 Prashant D

05/29/2018

11:01 PM Feature #23979: Limit pg log length during recovery/backfill so that we don't run out of memory.
Initial testing is referenced here: https://github.com/ceph/ceph/pull/21508 Josh Durgin
10:59 PM Bug #24243 (Pending Backport): osd: pg hard limit too easy to hit
https://github.com/ceph/ceph/pull/22187 Josh Durgin
10:59 PM Bug #24304 (Fix Under Review): MgrStatMonitor decode crash on 12.2.4->12.2.5 upgrade
wrong bug Josh Durgin
10:58 PM Bug #24304 (Pending Backport): MgrStatMonitor decode crash on 12.2.4->12.2.5 upgrade
https://github.com/ceph/ceph/pull/22187 Josh Durgin
10:03 PM Feature #11601: osd: share cached osdmaps across osd daemons
A vague possibility that the future seastar-based OSD may run each logical disk OSD inside a single process, which co... Greg Farnum
07:38 PM Bug #24339 (New): FULL_FORCE ops are dropped if fail-safe full check fails, but not resent in sca...
FULL_FORCE ops are dropped if fail-safe full check fails in do_op(). scan_requests() uses op->respects_full() which ... Ilya Dryomov
06:49 PM Bug #23646 (Resolved): scrub interaction with HEAD boundaries and clones is broken
David Zafman
01:11 PM Bug #24322 (Pending Backport): slow mon ops from osd_failure
mimic: https://github.com/ceph/ceph/pull/22297 Kefu Chai
12:53 PM Backport #24328 (In Progress): luminous: assert manager.get_num_active_clean() == pg_num on rados...
Kefu Chai
09:40 AM Backport #24328 (Resolved): luminous: assert manager.get_num_active_clean() == pg_num on rados/si...
https://github.com/ceph/ceph/pull/22296 Nathan Cutler
12:47 PM Backport #24329 (Resolved): mimic: assert manager.get_num_active_clean() == pg_num on rados/singl...
Kefu Chai
09:40 AM Backport #24329 (Resolved): mimic: assert manager.get_num_active_clean() == pg_num on rados/singl...
https://github.com/ceph/ceph/pull/22492 Nathan Cutler
10:02 AM Bug #22530 (Resolved): pool create cmd's expected_num_objects is not correctly interpreted
Nathan Cutler
10:02 AM Backport #23316 (Resolved): jewel: pool create cmd's expected_num_objects is not correctly interp...
Nathan Cutler
10:01 AM Backport #24058 (Resolved): jewel: Deleting a pool with active notify linger ops can result in se...
Nathan Cutler
09:59 AM Backport #24244 (Resolved): jewel: osd/EC: slow/hung ops in multimds suite test
Nathan Cutler
09:59 AM Backport #24244 (In Progress): jewel: osd/EC: slow/hung ops in multimds suite test
Nathan Cutler
09:56 AM Backport #24294 (Resolved): mimic: control-c on ceph cli leads to segv
Nathan Cutler
09:55 AM Backport #24294 (In Progress): mimic: control-c on ceph cli leads to segv
Nathan Cutler
09:52 AM Backport #24256 (Resolved): mimic: osd: Assertion `!node_algorithms::inited(this->priv_value_tra...
Nathan Cutler
09:41 AM Backport #24333 (Resolved): luminous: local_reserver double-reservation of backfilled pg
https://github.com/ceph/ceph/pull/23493 Nathan Cutler
09:41 AM Backport #24332 (Resolved): mimic: local_reserver double-reservation of backfilled pg
https://github.com/ceph/ceph/pull/22559 Nathan Cutler
08:26 AM Feature #24231: librbd/libcephfs/librgw should ignore rados_mon/osd_op_timeouts options (requires...
libcephfs doesn't use librados, so it doesn't need any changes.
The rados_mon_op_timeout affects anything that use...
John Spray
07:55 AM Bug #20924: osd: leaked Session on osd.7
https://github.com/ceph/ceph/pull/22292 might address this issue. Kefu Chai
07:37 AM Bug #24327 (Need More Info): osd: segv in pg_log_entry_t::encode()
The affected OSD restarted itself and everything seemed fine afterwards. But what is the cause of the crash?... frank lin
06:37 AM Backport #24204 (In Progress): mimic: LibRadosMiscPool.PoolCreationRace segv
https://github.com/ceph/ceph/pull/22291 Prashant D
06:20 AM Backport #24216 (In Progress): luminous: "process (unknown)" in ceph logs
https://github.com/ceph/ceph/pull/22290 Prashant D
03:32 AM Bug #24321: assert manager.get_num_active_clean() == pg_num on rados/singleton/all/max-pg-per-osd...
mimic: https://github.com/ceph/ceph/pull/22288 Kefu Chai
03:31 AM Bug #24321 (Pending Backport): assert manager.get_num_active_clean() == pg_num on rados/singleton...
Kefu Chai

05/28/2018

10:54 PM Feature #24176: osd: add command to drop OSD cache
Anyone looking into this? If not, I can pick it up. Mohamad Gebai
03:21 PM Bug #24145 (Duplicate): osdmap decode error in rados/standalone/*
Kefu Chai
03:19 PM Bug #23492: Abort in OSDMap::decode() during qa/standalone/erasure-code/test-erasure-eio.sh
/a/kchai-2018-05-28_09:21:54-rados-wip-kefu-testing-2018-05-28-1113-distro-basic-smithi/2601187
on mimic branch.
...
Kefu Chai
11:51 AM Bug #24321 (Fix Under Review): assert manager.get_num_active_clean() == pg_num on rados/singleton...
https://github.com/ceph/ceph/pull/22275 Kefu Chai
05:28 AM Bug #23352: osd: segfaults under normal operation
I've confirmed that in all of the SafeTimer segfaults the 'schedule' multimap is empty, indicating this is the last e... Brad Hubbard
05:16 AM Bug #23352: osd: segfaults under normal operation
If we look at the coredump from 23585 and compare it to this message.
[117735.930255] safe_timer[52573]: segfault ...
Brad Hubbard
04:32 AM Bug #24023 (Duplicate): Segfault on OSD in 12.2.5
Duplicate of 23352 Brad Hubbard
04:30 AM Bug #23564 (Duplicate): OSD Segfaults
Duplicate of 23352 Brad Hubbard
04:28 AM Bug #23585 (Duplicate): osd: safe_timer segfault
Duplicate of 23352 Brad Hubbard
02:47 AM Bug #24160: Monitor down when large store data needs to compact triggered by ceph tell mon.xx com...
PR:
https://github.com/ceph/ceph/pull/22056/
相洋 于

05/27/2018

05:58 PM Feature #11601: osd: share cached osdmaps across osd daemons
Attached the file CephScaleTestMarch2015.pdf
Do we have any plan for this guys?
Chuong Le
02:55 PM Bug #24322 (Fix Under Review): slow mon ops from osd_failure
https://github.com/ceph/ceph/pull/22259 Sage Weil
02:46 PM Bug #23585: osd: safe_timer segfault
Hi Brad, sure, thanks. Alex Gorbachev

05/26/2018

01:51 PM Bug #24322 (Resolved): slow mon ops from osd_failure
... Sage Weil
01:39 PM Bug #24162 (Resolved): control-c on ceph cli leads to segv
Sage Weil
01:38 PM Bug #24219 (Resolved): osd: InProgressOp freed by on_change(); in-flight op may use-after-free in...
Sage Weil
01:36 PM Bug #24321 (Resolved): assert manager.get_num_active_clean() == pg_num on rados/singleton/all/max...
... Sage Weil
01:29 PM Bug #24320 (Resolved): out of order reply and/or osd assert with set-chunks-read.yaml
... Sage Weil
02:00 AM Bug #23614 (Pending Backport): local_reserver double-reservation of backfilled pg
Josh Durgin
01:59 AM Bug #23490 (Duplicate): luminous: osd: double recovery reservation for PG when EIO injected (whil...
Josh Durgin
01:25 AM Bug #23352: osd: segfaults under normal operation
Thanks,
That gives us seven cores across 12.2.4-12.2.5 on Xenial and Centos and one core from the MMgrReport::enco...
Brad Hubbard
12:35 AM Bug #23431 (Duplicate): OSD Segmentation fault in thread_name:safe_timer
Closing as a duplicate of #23352 where we are focussing. Brad Hubbard
12:33 AM Bug #23564: OSD Segfaults
Since the stack from this core is the following, can we also close this as a duplicate of 23352?
(gdb) bt
#0 0x00...
Brad Hubbard
12:31 AM Bug #23585: osd: safe_timer segfault
Alex,
Can we close this bug also as a duplicate of 23352?
Brad Hubbard
12:28 AM Bug #24023: Segfault on OSD in 12.2.5
Alex,
Why are we running multiple trackers for the same issue?
Can we close this as a duplicate?
Brad Hubbard

05/25/2018

10:25 PM Bug #23614 (Fix Under Review): local_reserver double-reservation of backfilled pg
Explanation of the problem and resolution included in the pull request.
https://github.com/ceph/ceph/pull/22255
Neha Ojha
10:06 PM Bug #24219 (Pending Backport): osd: InProgressOp freed by on_change(); in-flight op may use-after...
Sage Weil
09:25 PM Bug #24304 (Fix Under Review): MgrStatMonitor decode crash on 12.2.4->12.2.5 upgrade
This is due to the fast-path decoding for object_stat_sum_t not being updated in the backport. Fix: https://github.co... Josh Durgin
04:22 PM Bug #24304 (Closed): MgrStatMonitor decode crash on 12.2.4->12.2.5 upgrade
This appears to be specific to a downstream build, closing. John Spray
12:29 PM Bug #24304 (Resolved): MgrStatMonitor decode crash on 12.2.4->12.2.5 upgrade
... John Spray
03:08 PM Backport #24297 (Resolved): mimic: RocksDB compression is not supported at least on Debian.
Kefu Chai
11:03 AM Backport #24297 (Resolved): mimic: RocksDB compression is not supported at least on Debian.
https://github.com/ceph/ceph/pull/22183 Nathan Cutler
03:06 PM Bug #24023: Segfault on OSD in 12.2.5
Also posted this in bug http://tracker.ceph.com/issues/23352
Hi Brad, we had one too just now, core dump and log:
...
Alex Gorbachev
08:04 AM Bug #24023: Segfault on OSD in 12.2.5
Hi,
I've noticed a similar (or the same) segfault on my deployment. Random segfaults on random OSDs appear under load or wit...
Jan Krcmar
03:05 PM Bug #23352: osd: segfaults under normal operation
Hi Brad, we had one too just now, core dump and log:
https://drive.google.com/open?id=1t1jfjqwjhUUBzWjxamos3Hr7ghj...
Alex Gorbachev
07:54 AM Bug #23352: osd: segfaults under normal operation
Thanks Beom-Seok,
I've set up a centos environment to debug those cores along with the Xenial ones. I will update ...
Brad Hubbard
03:11 AM Bug #23352: osd: segfaults under normal operation
Today, two OSD crashes.
Coredumps at:
https://drive.google.com/open?id=1rXtW0riZMBwP5OqrJ7QdRIOAsKFr-kYw
https://d...
Beom-Seok Park
02:10 PM Bug #23965: FAIL: s3tests.functional.test_s3.test_multipart_upload_resend_part with ec cache pools
https://github.com/ceph/ceph/pull/22126 merged to remove failures from rgw suite. moving to rados project Casey Bodley
12:28 PM Backport #24259 (Resolved): mimic: crush device class: Monitor Crash when moving Bucket into Defa...
Kefu Chai
11:03 AM Backport #24294 (Resolved): mimic: control-c on ceph cli leads to segv
https://github.com/ceph/ceph/pull/22225 Nathan Cutler
11:03 AM Backport #24293 (Resolved): jewel: mon: slow op on log message
https://github.com/ceph/ceph/pull/22431 Nathan Cutler
11:03 AM Backport #24292 (Resolved): mimic: common: JSON output from rados bench write has typo in max_lat...
https://github.com/ceph/ceph/pull/22406 Nathan Cutler
11:03 AM Backport #24291 (Resolved): jewel: common: JSON output from rados bench write has typo in max_lat...
https://github.com/ceph/ceph/pull/22407 Nathan Cutler
11:03 AM Backport #24290 (Resolved): luminous: common: JSON output from rados bench write has typo in max_...
https://github.com/ceph/ceph/pull/22391 Nathan Cutler
03:47 AM Bug #24045 (Resolved): Eviction still raced with scrub due to preemption
David Zafman
03:47 AM Bug #22881 (Resolved): scrub interaction with HEAD boundaries and snapmapper repair is broken
David Zafman
03:46 AM Backport #24016 (Resolved): luminous: scrub interaction with HEAD boundaries and snapmapper repai...
David Zafman
03:43 AM Backport #23863 (Resolved): luminous: scrub interaction with HEAD boundaries and clones is broken
David Zafman
03:39 AM Backport #24153 (Resolved): luminous: Eviction still raced with scrub due to preemption
David Zafman
03:38 AM Bug #23267 (Resolved): scrub errors not cleared on replicas can cause inconsistent pg state when ...
David Zafman
03:37 AM Backport #23486 (Resolved): jewel: scrub errors not cleared on replicas can cause inconsistent pg...
David Zafman
03:30 AM Bug #23811: RADOS stat slow for some objects on same OSD
... Chang Liu

05/24/2018

08:41 PM Bug #23267: scrub errors not cleared on replicas can cause inconsistent pg state when replica tak...
merged https://github.com/ceph/ceph/pull/21194 Yuri Weinstein
08:38 PM Backport #23316: jewel: pool create cmd's expected_num_objects is not correctly interpreted
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/22050
merged
Yuri Weinstein
08:37 PM Bug #23966: Deleting a pool with active notify linger ops can result in seg fault
merged https://github.com/ceph/ceph/pull/22188 Yuri Weinstein
08:36 PM Bug #23769: osd/EC: slow/hung ops in multimds suite test
jewel backport PR https://github.com/ceph/ceph/pull/22189 merged Yuri Weinstein
06:07 PM Bug #24192: cluster [ERR] Corruption detected: object 2:f59d1934:::smithi14913526-5822:head is mi...
... David Zafman
06:05 PM Bug #24199 (Pending Backport): common: JSON output from rados bench write has typo in max_latency...
Sage Weil
06:03 PM Bug #24162 (Pending Backport): control-c on ceph cli leads to segv
mimic backport https://github.com/ceph/ceph/pull/22225 Sage Weil
05:59 PM Bug #23879: test_mon_osdmap_prune.sh fails
/a/sage-2018-05-23_14:50:29-rados-wip-sage2-testing-2018-05-22-1410-distro-basic-smithi/2576533 Sage Weil
03:40 PM Feature #24232: Add new command ceph mon status
added a card to the backlog: https://trello.com/c/PTgwBpmx Joao Eduardo Luis
01:27 PM Feature #24232: Add new command ceph mon status
Sorry for the confusion, I did not check that we have ceph osd stat and ceph mon stat has the same purpose. I wanted ... Vikhyat Umrao
10:55 AM Feature #24232: Add new command ceph mon status
copy/pasting from the PR opened to address this issue (https://github.com/ceph/ceph/pull/22202):... Joao Eduardo Luis
01:44 PM Bug #24037 (Resolved): osd: Assertion `!node_algorithms::inited(this->priv_value_traits().to_nod...
Sage Weil
01:42 PM Bug #24145: osdmap decode error in rados/standalone/*
... Sage Weil
01:39 PM Bug #17257: ceph_test_rados_api_lock fails LibRadosLockPP.LockExclusiveDurPP
... Sage Weil
12:08 PM Backport #24279 (In Progress): luminous: RocksDB compression is not supported at least on Debian.
Kefu Chai
12:08 PM Backport #24279 (Resolved): luminous: RocksDB compression is not supported at least on Debian.
https://github.com/ceph/ceph/pull/22215 Kefu Chai
09:48 AM Bug #24025 (Pending Backport): RocksDB compression is not supported at least on Debian.
Kefu Chai
09:43 AM Bug #24025: RocksDB compression is not supported at least on Debian.
tested... Kefu Chai
08:22 AM Bug #23352: osd: segfaults under normal operation
Hi Alex,
I notice there are several more coredumps attached to the related bug reports. Are they all separate cras...
Brad Hubbard
03:07 AM Bug #24264: ssd-primary crush rule not working as intended
Sorry, here's my updated rule instead of the one in the document.
rule ssd-primary {
id 2
type r...
Horace Ng
03:05 AM Bug #24264 (Closed): ssd-primary crush rule not working as intended
I've set up the rule according to the doc, but some of the PGs are still being assigned to the same host though my fa... Horace Ng

05/23/2018

09:36 PM Bug #23787 (Rejected): luminous: "osd-scrub-repair.sh'" failures in rados
This is an incompatibility between the OSD version 64ffa817000d59d91379f7335439845930f58530 (luminous) and the versio... David Zafman
06:40 PM Bug #22920 (Resolved): filestore journal replay does not guard omap operations
Nathan Cutler
06:40 PM Backport #22934 (Resolved): luminous: filestore journal replay does not guard omap operations
Nathan Cutler
06:35 PM Bug #23878 (Resolved): assert on pg upmap
Nathan Cutler
06:34 PM Backport #23925 (Resolved): luminous: assert on pg upmap
Nathan Cutler
06:32 PM Backport #24259 (Resolved): mimic: crush device class: Monitor Crash when moving Bucket into Defa...
https://github.com/ceph/ceph/pull/22169 Nathan Cutler
06:32 PM Backport #24258 (Resolved): luminous: crush device class: Monitor Crash when moving Bucket into D...
https://github.com/ceph/ceph/pull/22381 Nathan Cutler
06:32 PM Backport #24244 (New): jewel: osd/EC: slow/hung ops in multimds suite test
Nathan Cutler
05:09 PM Backport #24244 (Resolved): jewel: osd/EC: slow/hung ops in multimds suite test
https://github.com/ceph/ceph/pull/22189
partial backport for mdsmonitor
Abhishek Lekshmanan
06:31 PM Backport #24256 (Resolved): mimic: osd: Assertion `!node_algorithms::inited(this->priv_value_tra...
https://github.com/ceph/ceph/pull/22160 Nathan Cutler
06:31 PM Backport #24246 (Resolved): mimic: Manager daemon y is unresponsive during teuthology cluster tea...
https://github.com/ceph/ceph/pull/22333 Nathan Cutler
06:31 PM Backport #24245 (Resolved): luminous: Manager daemon y is unresponsive during teuthology cluster ...
https://github.com/ceph/ceph/pull/22331 Nathan Cutler
04:27 PM Bug #23352: osd: segfaults under normal operation
Sage, I had tried to do this, but we don't know when these crashes would happen, just that they will occur. Random t... Alex Gorbachev
04:10 PM Bug #23352 (Need More Info): osd: segfaults under normal operation
Alex, how reproducible is this for you? Could you reproduce with debug timer = 20? Sage Weil
04:21 PM Backport #24058 (In Progress): jewel: Deleting a pool with active notify linger ops can result in...
https://github.com/ceph/ceph/pull/22188 Kefu Chai
04:15 PM Bug #24243 (Resolved): osd: pg hard limit too easy to hit
The default ratio of 2x mon_max_pg_per_osd is easy to hit for clusters that have differently weighted disks (e.g. 1 a... Josh Durgin
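The "differently weighted disks" failure mode above can be illustrated with rough arithmetic. The limit values below are assumptions for illustration, not authoritative Ceph defaults:

```python
# Hypothetical per-OSD PG limits -- values assumed for illustration.
mon_max_pg_per_osd = 200
hard_ratio = 2.0                                # the "2x" ratio from the report
hard_limit = mon_max_pg_per_osd * hard_ratio    # 400.0 PGs per OSD

# PGs map to OSDs roughly in proportion to CRUSH weight, so in a
# cluster mixing 1 TB and 8 TB disks the 8 TB OSDs carry far more
# PGs than the mean:
weights = [1.0, 8.0]
mean_weight = sum(weights) / len(weights)       # 4.5
skew = max(weights) / mean_weight               # ~1.78x the mean PG count
print(hard_limit, round(skew, 2))
```

A heavier OSD already sits well above the mean, so transient peaks during peering or backfill push it past the 2x hard limit first.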
03:27 PM Bug #24025: RocksDB compression is not supported at least on Debian.
mimic: https://github.com/ceph/ceph/pull/22183 Kefu Chai
03:25 PM Bug #24025 (Fix Under Review): RocksDB compression is not supported at least on Debian.
https://github.com/ceph/ceph/pull/22181 Kefu Chai
02:53 PM Bug #24025: RocksDB compression is not supported at least on Debian.
Because we fail to pass -DWITH_SNAPPY etc. to CMake while building rocksdb. This bug also impacts the rpm package. I can h... Kefu Chai
01:51 PM Bug #24229 (Triaged): Libradosstriper successfully removes nonexistent objects instead of returni...
Sage Weil
11:57 AM Bug #24242 (New): tcmalloc::ThreadCache::ReleaseToCentralCache on rhel (w/ centos packages)
... Sage Weil
11:43 AM Bug #24222 (Pending Backport): Manager daemon y is unresponsive during teuthology cluster teardown
Sage Weil
08:41 AM Bug #23145: OSD crashes during recovery of EC pg
The OSD in the last peering stage will call pg_log.roll_forward() (at the end of PG::activate). Is it possible that the entry rollbf... Zengran Zhang
06:52 AM Bug #23386 (Pending Backport): crush device class: Monitor Crash when moving Bucket into Default ...
https://github.com/ceph/ceph/pull/22169 Kefu Chai
01:21 AM Bug #24037 (Pending Backport): osd: Assertion `!node_algorithms::inited(this->priv_value_traits(...
Sage Weil

05/22/2018

09:55 PM Bug #24222 (Fix Under Review): Manager daemon y is unresponsive during teuthology cluster teardown
https://github.com/ceph/ceph/pull/22158 Sage Weil
02:20 AM Bug #24222 (Resolved): Manager daemon y is unresponsive during teuthology cluster teardown
... Sage Weil
08:47 PM Feature #24232 (Fix Under Review): Add new command ceph mon status
Add new command ceph mon status
For more information please check - https://tracker.ceph.com/issues/24217
Changed...
Vikhyat Umrao
08:32 PM Bug #23965: FAIL: s3tests.functional.test_s3.test_multipart_upload_resend_part with ec cache pools
Josh Durgin wrote:
> Casey, could you or someone else familiar with rgw look through the logs for this and identify ...
Casey Bodley
03:19 PM Bug #23965: FAIL: s3tests.functional.test_s3.test_multipart_upload_resend_part with ec cache pools
Casey, could you or someone else familiar with rgw look through the logs for this and identify the relevant OSD reque... Josh Durgin
07:17 PM Feature #24231 (New): librbd/libcephfs/librgw should ignore rados_mon/osd_op_timeouts options (re...
librbd/libcephfs/librgw should ignore rados_mon/osd_op_timeouts options
https://bugzilla.redhat.com/show_bug.cgi?id=...
Vikhyat Umrao
04:09 PM Bug #24025 (In Progress): RocksDB compression is not supported at least on Debian.
... Radoslaw Zarzynski
03:48 PM Bug #24037 (Fix Under Review): osd: Assertion `!node_algorithms::inited(this->priv_value_traits(...
https://github.com/ceph/ceph/pull/22156 Radoslaw Zarzynski
02:35 PM Bug #24229 (Triaged): Libradosstriper successfully removes nonexistent objects instead of returni...
libradosstriper remove() call on nonexistent objects returns zero instead of ENOENT.
Tested on luminous 12.2.5-1xe...
Stan K
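The POSIX-style contract the report expects can be sketched with a stand-in object store; `FakeStriper` and its methods are hypothetical illustrations of the intended semantics, not the libradosstriper API:

```python
import errno

# Hypothetical stand-in showing the semantics the report asks for:
# remove() on a nonexistent object should surface ENOENT, not return 0.

class FakeStriper:
    def __init__(self):
        self.objects = {}

    def write(self, name, data):
        self.objects[name] = data
        return 0

    def remove(self, name):
        # The buggy behavior would be: self.objects.pop(name, None); return 0
        if name not in self.objects:
            return -errno.ENOENT   # expected: report the missing object
        del self.objects[name]
        return 0

s = FakeStriper()
s.write("obj1", b"data")
assert s.remove("obj1") == 0              # existing object: success
assert s.remove("obj1") == -errno.ENOENT  # second remove: ENOENT
```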
11:35 AM Feature #24099: osd: Improve workflow when creating OSD on raw block device if there was bluestor...

> Point out that it found existing data on the OSD, and possibly suggest using `ceph-volume lvm zap` if that's what...
John Spray
10:51 AM Bug #24199 (Fix Under Review): common: JSON output from rados bench write has typo in max_latency...
John Spray
07:00 AM Bug #23371: OSDs flaps when cluster network is made down
We have not observed this behavior in Kraken.
Whenever the cluster interface is made down, a few OSDs which go do...
Nokia ceph-users
03:55 AM Bug #23352: osd: segfaults under normal operation
OSD log attached Alex Gorbachev
03:15 AM Bug #23352: osd: segfaults under normal operation
It's an internal comment for others looking at this - though if you (Alex) have an osd log to go with the 'MMgrReport... Josh Durgin
02:59 AM Bug #23352: osd: segfaults under normal operation
Josh, is this something I can extract from the OSD node for you, or is this an internal comment? Alex Gorbachev
01:10 AM Bug #23352: osd: segfaults under normal operation
I put the core file from comment #14 and binaries from 12.2.5 in senta02:/slow/jdurgin/ceph/bugs/tracker_23352/2018-0... Josh Durgin
03:49 AM Backport #24059 (In Progress): luminous: Deleting a pool with active notify linger ops can result...
https://github.com/ceph/ceph/pull/22143 Prashant D

05/21/2018

10:04 PM Bug #24219: osd: InProgressOp freed by on_change(); in-flight op may use-after-free in op_commit()
/a/teuthology-2018-05-21_20:00:50-powercycle-mimic-distro-basic-smithi/2563192
powercycle/osd/{clusters/3osd-1per-...
Sage Weil
09:40 PM Bug #24219 (Fix Under Review): osd: InProgressOp freed by on_change(); in-flight op may use-after...
https://github.com/ceph/ceph/pull/22133 Sage Weil
09:28 PM Bug #24219 (Resolved): osd: InProgressOp freed by on_change(); in-flight op may use-after-free in...
... Sage Weil
07:29 PM Bug #22330 (Need More Info): ec: src/common/interval_map.h: 161: FAILED assert(len > 0)
need to capture some logs... Sage Weil
07:15 PM Bug #23031: FAILED assert(!parent->get_log().get_missing().is_missing(soid))
I hit this issue a couple of times while trying to reproduce #23614... Neha Ojha
06:36 PM Backport #24200 (Resolved): mimic: PrimaryLogPG::try_flush_mark_clean mixplaced ctx release
Sage Weil
08:48 AM Backport #24200 (Resolved): mimic: PrimaryLogPG::try_flush_mark_clean mixplaced ctx release
Nathan Cutler
06:24 PM Bug #23386 (Fix Under Review): crush device class: Monitor Crash when moving Bucket into Default ...
https://github.com/ceph/ceph/pull/22127 Sage Weil
05:14 PM Bug #23386: crush device class: Monitor Crash when moving Bucket into Default root
reproduces on luminous with... Sage Weil
01:52 PM Bug #23386: crush device class: Monitor Crash when moving Bucket into Default root
I suspect the recent pr https://github.com/ceph/ceph/pull/22091 fixed this, but figuring out how to reproduce to be s... Sage Weil
05:59 PM Bug #23965 (Fix Under Review): FAIL: s3tests.functional.test_s3.test_multipart_upload_resend_part...
https://github.com/ceph/ceph/pull/22126 removes ec-cache pools from the rgw suite Casey Bodley
04:55 PM Bug #22656: scrub mismatch on bytes (cache pools)
http://qa-proxy.ceph.com/teuthology/dzafman-2018-05-18_11:33:31-rados-wip-zafman-testing-mimic-distro-basic-smithi/25... David Zafman
04:21 PM Backport #22934: luminous: filestore journal replay does not guard omap operations
Victor Denisov wrote:
> https://github.com/ceph/ceph/pull/21547
merged
Yuri Weinstein
04:13 PM Backport #23925: luminous: assert on pg upmap
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/21818
merged
Yuri Weinstein
04:01 PM Backport #24213 (In Progress): mimic: Module 'balancer' has failed: could not find bucket -14
Nathan Cutler
03:59 PM Backport #24213 (Resolved): mimic: Module 'balancer' has failed: could not find bucket -14
https://github.com/ceph/ceph/pull/22120 Nathan Cutler
03:59 PM Backport #24216 (Resolved): luminous: "process (unknown)" in ceph logs
https://github.com/ceph/ceph/pull/22290 Nathan Cutler
03:59 PM Backport #24215 (Resolved): mimic: "process (unknown)" in ceph logs
https://github.com/ceph/ceph/pull/22311 Nathan Cutler
03:59 PM Backport #24214 (Resolved): luminous: Module 'balancer' has failed: could not find bucket -14
https://github.com/ceph/ceph/pull/22308 Nathan Cutler
03:03 PM Bug #23585 (Triaged): osd: safe_timer segfault
Josh Durgin
02:17 PM Bug #21142: OSD crashes when loading pgs with "FAILED assert(interval.last > last)"
We are experiencing this too. The majority of the OSDs went down. We tried removing the intervals. It works on some OSDs ... Dexter John Genterone
01:44 PM Bug #24167: Module 'balancer' has failed: could not find bucket -14
mimic backport: https://github.com/ceph/ceph/pull/22120 Sage Weil
01:42 PM Bug #24167 (Pending Backport): Module 'balancer' has failed: could not find bucket -14
Sage Weil
01:00 PM Bug #23431: OSD Segmentation fault in thread_name:safe_timer
Hi.
We have the same issue: ...
Aleksei Zakharov
12:07 PM Bug #24123 (Pending Backport): "process (unknown)" in ceph logs
Sage Weil
09:50 AM Backport #24048 (In Progress): luminous: pg-upmap cannot balance in some case
https://github.com/ceph/ceph/pull/22115 Prashant D
09:43 AM Bug #24199: common: JSON output from rados bench write has typo in max_latency key
PR: https://github.com/ceph/ceph/pull/22112 Sandor Zeestraten
06:23 AM Bug #24199 (Resolved): common: JSON output from rados bench write has typo in max_latency key
The JSON output from `rados bench write --format json/json-pretty` has a typo in the `max_latency` key.
It contains ...
Sandor Zeestraten
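Until the fix lands, a consumer of the bench output can tolerate both spellings. The misspelled key below (`max_latancy`) is only a placeholder, since the report is truncated before naming the actual typo:

```python
import json

# Defensive reader for `rados bench ... --format json` output: accept the
# intended "max_latency" key and a misspelled variant. "max_latancy" is a
# placeholder for whatever the real typo is.

def get_max_latency(bench_json: str) -> float:
    doc = json.loads(bench_json)
    for key in ("max_latency", "max_latancy"):
        if key in doc:
            return float(doc[key])
    raise KeyError("no max latency key found in rados bench output")

sample = '{"bandwidth": 101.7, "max_latancy": 0.8421}'  # illustrative output
print(get_max_latency(sample))
```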
08:48 AM Backport #24204 (Resolved): mimic: LibRadosMiscPool.PoolCreationRace segv
https://github.com/ceph/ceph/pull/22291 Nathan Cutler
08:43 AM Bug #24174: PrimaryLogPG::try_flush_mark_clean mixplaced ctx release
mimic: https://github.com/ceph/ceph/pull/22113 Kefu Chai
08:41 AM Bug #24174 (Pending Backport): PrimaryLogPG::try_flush_mark_clean mixplaced ctx release
Kefu Chai
07:11 AM Bug #24076 (Duplicate): rados/test.sh fails in "bin/ceph_test_rados_api_misc --gtest_filter=*Pool...
Kefu Chai
06:24 AM Backport #24198 (In Progress): luminous: mon: slow op on log message
Kefu Chai
06:23 AM Backport #24198 (Resolved): luminous: mon: slow op on log message
https://github.com/ceph/ceph/pull/22109 Kefu Chai
06:20 AM Backport #24195 (Resolved): mimic: mon: slow op on log message
Kefu Chai
02:51 AM Bug #20924: osd: leaked Session on osd.7
osd.4
/a/sage-2018-05-20_18:11:15-rados-wip-sage3-testing-2018-05-20-1031-distro-basic-smithi/2558319
rados/ver...
Sage Weil
02:24 AM Bug #24150 (Pending Backport): LibRadosMiscPool.PoolCreationRace segv
Sage Weil

05/20/2018

06:58 PM Bug #18239 (Duplicate): nan in ceph osd df again
Sage Weil
10:32 AM Bug #24023: Segfault on OSD in 12.2.5
Alexander M wrote:
> Alex Gorbachev wrote:
> > This continues to happen every day, usually during scrub
>
> I've...
Alexander Morozov
10:30 AM Bug #24023: Segfault on OSD in 12.2.5
Alex Gorbachev wrote:
> This continues to happen every day, usually during scrub
I've faced with the same issue
...
Alexander Morozov
09:45 AM Backport #24195 (In Progress): mimic: mon: slow op on log message
https://github.com/ceph/ceph/pull/22104 Kefu Chai
09:42 AM Backport #24195 (Resolved): mimic: mon: slow op on log message
Kefu Chai
09:40 AM Bug #24180 (Pending Backport): mon: slow op on log message
Kefu Chai

05/19/2018

07:04 PM Bug #24192 (Duplicate): cluster [ERR] Corruption detected: object 2:f59d1934:::smithi14913526-582...

davidz@teuthology:/a/dzafman-2018-05-18_11:36:58-rados-wip-zafman-testing-distro-basic-smithi/2549009...
David Zafman

05/18/2018

08:45 PM Bug #24180: mon: slow op on log message
https://github.com/ceph/ceph/pull/22098 Sage Weil
08:44 PM Bug #24180 (Fix Under Review): mon: slow op on log message
https://github.com/ceph/ceph/pull/22098 Sage Weil
08:41 PM Bug #24180 (Resolved): mon: slow op on log message
... Sage Weil
08:37 PM Bug #20924: osd: leaked Session on osd.7
osd.7
/a/sage-2018-05-18_16:20:24-rados-wip-sage-testing-2018-05-18-0817-distro-basic-smithi/2548324
rados/veri...
Sage Weil
02:26 PM Bug #20924: osd: leaked Session on osd.7
osd.7
/a/sage-2018-05-18_13:08:19-rados-wip-sage2-testing-2018-05-17-0701-distro-basic-smithi/2546923
rados/ver...
Sage Weil
08:16 PM Backport #24149 (Resolved): mimic: Eviction still raced with scrub due to preemption
David Zafman
07:24 PM Bug #24162 (Fix Under Review): control-c on ceph cli leads to segv
hacky workaround: https://github.com/ceph/ceph/pull/22093 Sage Weil
07:18 PM Bug #24162: control-c on ceph cli leads to segv
... Sage Weil
07:09 PM Bug #24037: osd: Assertion `!node_algorithms::inited(this->priv_value_traits().to_node_ptr(value...
related?... Sage Weil
01:26 PM Bug #24037 (In Progress): osd: Assertion `!node_algorithms::inited(this->priv_value_traits().to_...
Radoslaw Zarzynski
01:15 PM Bug #24037: osd: Assertion `!node_algorithms::inited(this->priv_value_traits().to_node_ptr(value...
Scenario I can see after static analysis:
1. An instance of `TrackedOp` in `STATE_LIVE` is being dereferenced - th...
Radoslaw Zarzynski
06:59 PM Bug #23352: osd: segfaults under normal operation
The latest ones look like this, below.
Crash dump at https://drive.google.com/open?id=12v95-TCHlkrBZ16ni5UkhYkXRt...
Alex Gorbachev
06:41 PM Bug #23352: osd: segfaults under normal operation
For some reason we are also seeing more of these happening, simultaneous failures and recoveries are occurring during... Alex Gorbachev
02:36 AM Bug #23352: osd: segfaults under normal operation
I run into this issue with 12.2.5, it affects cluster stability heavily. wei jin
06:12 PM Bug #24167 (Fix Under Review): Module 'balancer' has failed: could not find bucket -14
https://github.com/ceph/ceph/pull/22091 Sage Weil
05:02 PM Feature #24176 (Resolved): osd: add command to drop OSD cache
Idea here is to basically make it possible for performance testing on the same data set in RADOS without restarting t... Patrick Donnelly
04:24 PM Feature #22420 (Resolved): Add support for obtaining a list of available compression options
Nathan Cutler
04:04 PM Bug #23487 (Resolved): There is no 'ceph osd pool get erasure allow_ec_overwrites' command
Nathan Cutler
04:04 PM Backport #23668 (Resolved): luminous: There is no 'ceph osd pool get erasure allow_ec_overwrites'...
Nathan Cutler
04:03 PM Bug #23664 (Resolved): cache-try-flush hits wrlock, busy loops
Nathan Cutler
04:03 PM Backport #23914 (Resolved): luminous: cache-try-flush hits wrlock, busy loops
Nathan Cutler
04:02 PM Bug #23860 (Resolved): luminous->master: luminous crashes with AllReplicasRecovered in Started/Pr...
Nathan Cutler
04:02 PM Backport #23988 (Resolved): luminous: luminous->master: luminous crashes with AllReplicasRecovere...
Nathan Cutler
04:02 PM Bug #23980 (Resolved): UninitCondition in PG::RecoveryState::Incomplete::react(PG::AdvMap const&)
Nathan Cutler
04:01 PM Backport #24015 (Resolved): luminous: UninitCondition in PG::RecoveryState::Incomplete::react(PG:...
Nathan Cutler
02:30 PM Backport #24135 (Resolved): mimic: Add support for obtaining a list of available compression options
Sage Weil
02:25 PM Bug #24174: PrimaryLogPG::try_flush_mark_clean mixplaced ctx release
https://github.com/ceph/ceph/pull/22084 Sage Weil
02:24 PM Bug #24174 (Resolved): PrimaryLogPG::try_flush_mark_clean mixplaced ctx release
... Sage Weil

05/17/2018

10:22 PM Bug #24167: Module 'balancer' has failed: could not find bucket -14
It looks like we also don't create weight-sets for new buckets. And if you create buckets and move things into them ... Sage Weil
09:58 PM Bug #24167 (Resolved): Module 'balancer' has failed: could not find bucket -14
crushmap may contain choose_args for deleted buckets... Sage Weil
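The shape of the problem can be sketched with plain dictionaries; the data layout here is illustrative, not the real CRUSH map encoding. A cleanup pass would drop choose_args entries whose bucket id no longer exists:

```python
# Sketch of pruning choose_args entries that refer to deleted buckets.
# Layout is illustrative: choose_args maps a weight-set name to
# {bucket_id: alternative_weights}.

def prune_choose_args(buckets, choose_args):
    live = {b["id"] for b in buckets}
    return {
        name: {bid: w for bid, w in per_bucket.items() if bid in live}
        for name, per_bucket in choose_args.items()
    }

buckets = [{"id": -1, "name": "default"}, {"id": -2, "name": "rack0"}]
choose_args = {
    "compat": {-1: [1.0, 1.0], -2: [0.9, 1.1], -14: [1.0]},  # -14 was deleted
}

pruned = prune_choose_args(buckets, choose_args)
print(pruned)
```

A stale entry like the `-14` above is exactly what makes the balancer fail with "could not find bucket -14".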
05:39 PM Bug #23965: FAIL: s3tests.functional.test_s3.test_multipart_upload_resend_part with ec cache pools
Casey Bodley
03:52 PM Bug #23763 (Resolved): upgrade: bad pg num and stale health status in mixed lumnious/mimic cluster
Kefu Chai
03:52 PM Backport #23808 (Resolved): luminous: upgrade: bad pg num and stale health status in mixed lumnio...
Kefu Chai
03:42 PM Backport #23808: luminous: upgrade: bad pg num and stale health status in mixed lumnious/mimic cl...
Kefu Chai wrote:
> https://github.com/ceph/ceph/pull/21556
merged
Yuri Weinstein
03:45 PM Bug #24162 (Resolved): control-c on ceph cli leads to segv
... Sage Weil
03:43 PM Backport #23668: luminous: There is no 'ceph osd pool get erasure allow_ec_overwrites' command
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/21378
merged
Yuri Weinstein
03:42 PM Backport #23914: luminous: cache-try-flush hits wrlock, busy loops
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/21764
merged
Yuri Weinstein
03:41 PM Backport #23988: luminous: luminous->master: luminous crashes with AllReplicasRecovered in Starte...
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/21964
merged
Yuri Weinstein
03:40 PM Backport #23988: luminous: luminous->master: luminous crashes with AllReplicasRecovered in Starte...
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/21964
merged
Yuri Weinstein
03:38 PM Backport #24015: luminous: UninitCondition in PG::RecoveryState::Incomplete::react(PG::AdvMap con...
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/21993
merged
Yuri Weinstein
01:55 PM Backport #23786 (Resolved): luminous: "utilities/env_librados.cc:175:33: error: unused parameter ...
Sage Weil
01:55 PM Bug #22330: ec: src/common/interval_map.h: 161: FAILED assert(len > 0)
Sage Weil
01:50 PM Bug #23145: OSD crashes during recovery of EC pg
Peter Woodman wrote:
> Each OSD is on its own host; these are small arm64 machines. Unfortunately I've already tried...
Sage Weil
11:49 AM Bug #24159 (Duplicate): Monitor down when large store data needs to compact triggered by ceph tel...
Nathan Cutler
10:38 AM Bug #24159 (Duplicate): Monitor down when large store data needs to compact triggered by ceph tel...
I have run into a monitor problem with the store capacity growing too large in our production environment.
This logical volume for monito...
相洋 于
10:38 AM Bug #24160 (Resolved): Monitor down when large store data needs to compact triggered by ceph tell...
I have run into a monitor problem with the store capacity growing too large in our production environment.
This logical volume for monito...
相洋 于
09:04 AM Bug #23598 (Duplicate): hammer->jewel: ceph_test_rados crashes during radosbench task in jewel ra...
#23290 does not contain any of the PRs mentioned above, so it's not a regression. Kefu Chai
08:33 AM Backport #24153 (In Progress): luminous: Eviction still raced with scrub due to preemption
Nathan Cutler
08:33 AM Backport #24149 (In Progress): mimic: Eviction still raced with scrub due to preemption
Nathan Cutler
08:33 AM Backport #24149 (New): mimic: Eviction still raced with scrub due to preemption
Nathan Cutler
08:27 AM Bug #23962 (Resolved): ceph_daemon.py format_dimless units list index out of range
Nathan Cutler
08:26 AM Bug #24000 (Resolved): mon: snap delete on deleted pool returns 0 without proper payload
Nathan Cutler
08:25 AM Bug #23899 (Resolved): run cmd 'ceph daemon osd.0 smart' cause osd daemon Segmentation fault
Nathan Cutler
07:37 AM Backport #23316 (In Progress): jewel: pool create cmd's expected_num_objects is not correctly int...
Kefu Chai

05/16/2018

10:29 PM Backport #24153: luminous: Eviction still raced with scrub due to preemption
I'm also pulling in these pull requests on top of the existing pull request (not yet merged) https://github.com/ceph/ceph... David Zafman
10:26 PM Backport #24153 (Resolved): luminous: Eviction still raced with scrub due to preemption
https://github.com/ceph/ceph/pull/22044 David Zafman
08:52 PM Bug #24150 (Fix Under Review): LibRadosMiscPool.PoolCreationRace segv
https://github.com/ceph/ceph/pull/22042 Sage Weil
08:51 PM Bug #24150 (Resolved): LibRadosMiscPool.PoolCreationRace segv
... Sage Weil
08:36 PM Backport #24149 (Resolved): mimic: Eviction still raced with scrub due to preemption
https://github.com/ceph/ceph/pull/22041 David Zafman
08:28 PM Bug #24045 (Pending Backport): Eviction still raced with scrub due to preemption
Sage Weil
07:31 PM Bug #24148: Segmentation fault out of ObcLockManager::get_lock_type()
The pg 3.3 involved here was never scrubbed, so unrelated to my changes. David Zafman
07:16 PM Bug #24148 (Duplicate): Segmentation fault out of ObcLockManager::get_lock_type()

teuthology:/a/dzafman-2018-05-16_09:57:45-rados:thrash-wip-zafman-testing-distro-basic-smithi/2539708
remote/smi...
David Zafman
06:53 PM Bug #22354: v12.2.2 unable to create bluestore osd using ceph-disk
kobi ginon wrote:
> Note: I still believe there is a relation to rocksdb somehow and the clearing of disks forces t...
Jon Heese
03:52 PM Backport #24027 (Resolved): mimic: ceph_daemon.py format_dimless units list index out of range
Sage Weil
03:51 PM Backport #24103 (Resolved): mimic: mon: snap delete on deleted pool returns 0 without proper payload
Sage Weil
03:50 PM Backport #24104 (Resolved): mimic: run cmd 'ceph daemon osd.0 smart' cause osd daemon Segmentatio...
Sage Weil
03:49 PM Bug #24145 (Duplicate): osdmap decode error in rados/standalone/*
... Sage Weil
12:03 PM Feature #24099: osd: Improve workflow when creating OSD on raw block device if there was bluestor...
This is not a ceph-volume issue, the description of this issue doesn't point to a ceph-volume operation, but rather, ... Alfredo Deza

05/15/2018

10:58 PM Bug #23145: OSD crashes during recovery of EC pg
Each OSD is on its own host; these are small arm64 machines. Unfortunately I've already tried stopping osd6, it just ... Peter Woodman
10:37 PM Bug #23145: OSD crashes during recovery of EC pg
Hmm, it's possible that if you stop osd.6 that this PG will be able to peer with the remaining OSDs... want to give i... Sage Weil
10:34 PM Bug #23145: OSD crashes during recovery of EC pg
Peter Woodman wrote:
> For the record, I discovered recently that a number of OSDs were operating with write caching...
Sage Weil
10:33 PM Bug #23145: OSD crashes during recovery of EC pg
Hmm, I think the problem comes before that. This is problematic:... Sage Weil
10:20 PM Bug #23145: OSD crashes during recovery of EC pg
For the record, I discovered recently that a number of OSDs were operating with write caching enabled, and because th... Peter Woodman
10:15 PM Bug #23145: OSD crashes during recovery of EC pg
This code appears to be the culprit, at least in this case:... Sage Weil
02:48 PM Bug #24023: Segfault on OSD in 12.2.5
This continues to happen every day, usually during scrub Alex Gorbachev
01:15 PM Backport #24135 (In Progress): mimic: Add support for obtaining a list of available compression o...
Kefu Chai
01:13 PM Backport #24135 (Resolved): mimic: Add support for obtaining a list of available compression options
https://github.com/ceph/ceph/pull/22004 Kefu Chai
12:29 PM Feature #22448 (Resolved): Visibility for snap trim queue length
Already merged to master, luminous and jewel. Piotr Dalek
12:28 PM Backport #22449 (Resolved): jewel: Visibility for snap trim queue length
Piotr Dalek
10:44 AM Bug #23767: "ceph ping mon" doesn't work
Confirmed on my cluster (13.0.2-1969-g49365c7). John Spray
10:37 AM Fix #24126: ceph osd purge command error message improvement
How are you seeing that ugly logfile style output? When I try it, it looks like this:... John Spray
10:32 AM Feature #24127: "osd purge" should print more helpful message when daemon is up
This is completely reasonable as a general point, but not really actionable as a tracker ticket -- we aren't ever goi... John Spray
10:31 AM Bug #23937: FAILED assert(info.history.same_interval_since != 0)
I can't post using ceph-post-file, so I uploaded file here https://eocloud.eu:8080/swift/v1/rwadolowski/ceph-osd.33.l... Rafal Wadolowski
06:31 AM Bug #24007: rados.connect get a segmentation fault
John Spray wrote:
> Is there a backtrace or any other message from the crash?
there are many different backtraces.
xianpao chen
03:15 AM Backport #24015 (In Progress): luminous: UninitCondition in PG::RecoveryState::Incomplete::react(...
https://github.com/ceph/ceph/pull/21993 Prashant D

05/14/2018

09:59 PM Bug #23145: OSD crashes during recovery of EC pg
happy to see action on this ticket. for the record, i still have the data for this pg. Peter Woodman
09:38 PM Bug #22837 (Resolved): discover_all_missing() not always called during activating
David Zafman
09:36 PM Bug #23576 (Can't reproduce): osd: active+clean+inconsistent pg will not scrub or repair
David Zafman
06:12 PM Bug #23576: osd: active+clean+inconsistent pg will not scrub or repair
Sorry for the lack of updates, there were no messages of any sort in the logs when attempting to deep scrub or repair... Michael Sudnick
04:26 PM Bug #23576: osd: active+clean+inconsistent pg will not scrub or repair
No, I never had that message in any of our logs. After a month the PGs ran their own deep-scrub again and I was able... David Turner
09:29 PM Bug #22354: v12.2.2 unable to create bluestore osd using ceph-disk
Hi again,
indeed your method also works.
In my simple test I just cleared 2 GB out of the disk
before zap setting ...
kobi ginon
08:45 PM Bug #22354: v12.2.2 unable to create bluestore osd using ceph-disk
Hi Jon, thanks a lot for the reply.
I've been fighting with this issue for a day now, and I have a very strange observation
...
kobi ginon
08:01 PM Bug #22354: v12.2.2 unable to create bluestore osd using ceph-disk
I think I ran into the same thing last week reusing an OSD disk. I did a dd of /dev/zero to the disk for ~10-15 minu... Jon Heese
02:38 AM Bug #22354: v12.2.2 unable to create bluestore osd using ceph-disk
Hi all,
I'm using the following version: ceph-12.2.2-0.el7.x86_64.
It seems that even with a dd of 100MB or 110MB
i s...
kobi ginon
07:22 PM Feature #24127 (New): "osd purge" should print more helpful message when daemon is up
Compilers like GCC and clang are sometimes able to make suggestions when a user makes certain
common mistakes for wh...
Jesse Williamson
07:19 PM Fix #24126 (New): ceph osd purge command error message improvement
In response to the command "ceph osd purge 1 --yes-i-really-mean-it", we
get:
2018-05-10 15:18:03.444 7f29c0ae2700...
Jesse Williamson
06:54 PM Bug #24123 (Fix Under Review): "process (unknown)" in ceph logs
PR: https://github.com/ceph/ceph/pull/21985 Mykola Golub
06:47 PM Bug #24123 (Resolved): "process (unknown)" in ceph logs
get_process_name from libcommon was broken when cleaning up headers (95fc248). As a result we don't log process name ... Mykola Golub
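On Linux the short process name that should appear in the log line can be recovered from /proc/self/comm. A minimal sketch of the idea (with a portable fallback, since /proc is Linux-only; this is not the libcommon implementation):

```python
import os
import sys

# Sketch of recovering the short process name ("comm") for log lines.
# On Linux, /proc/self/comm holds it; elsewhere fall back to the
# program's basename.

def get_process_name() -> str:
    try:
        with open("/proc/self/comm") as f:
            return f.read().strip()
    except OSError:
        return os.path.basename(sys.argv[0] or sys.executable)

print(get_process_name())
```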
02:44 PM Bug #24007: rados.connect get a segmentation fault
Is there a backtrace or any other message from the crash? John Spray
10:42 AM Bug #24077 (Resolved): test_pool_create_fail (tasks.mgr.dashboard.test_pool.PoolTest) fails
Kefu Chai
08:05 AM Backport #23912 (In Progress): luminous: mon: High MON cpu usage when cluster is changing
Kefu Chai
06:52 AM Backport #23912: luminous: mon: High MON cpu usage when cluster is changing
-http://tracker.ceph.com/issues/23912-
https://github.com/ceph/ceph/pull/21968
Xiaoxi Chen
04:20 AM Backport #23988 (In Progress): luminous: luminous->master: luminous crashes with AllReplicasRecov...
https://github.com/ceph/ceph/pull/21964 Prashant D

05/12/2018

05:25 PM Bug #24022: "ceph tell osd.x bench" writes resulting JSON to stderr instead of stdout.
https://github.com/ceph/ceph/pull/21960 Марк Коренберг
11:36 AM Bug #24022: "ceph tell osd.x bench" writes resulting JSON to stderr instead of stdout.
https://github.com/ceph/ceph/pull/21957 Kefu Chai
11:16 AM Bug #24022: "ceph tell osd.x bench" writes resulting JSON to stderr instead of stdout.
Марк, I am leaving this ticket open. I will close it once
> But, I found the same bug about stdout/stderr for "debug ...
Kefu Chai
11:13 AM Bug #24022 (Resolved): "ceph tell osd.x bench" writes resulting JSON to stderr instead of stdout.
Kefu Chai
12:20 PM Backport #24104 (In Progress): mimic: run cmd 'ceph daemon osd.0 smart' cause osd daemon Segmenta...
Kefu Chai
12:20 PM Backport #24104 (Resolved): mimic: run cmd 'ceph daemon osd.0 smart' cause osd daemon Segmentatio...
https://github.com/ceph/ceph/pull/21959 Kefu Chai
12:16 PM Backport #24103 (In Progress): mimic: mon: snap delete on deleted pool returns 0 without proper p...
Kefu Chai
12:14 PM Backport #24103 (Resolved): mimic: mon: snap delete on deleted pool returns 0 without proper payload
https://github.com/ceph/ceph/pull/21958 Kefu Chai
11:24 AM Bug #23899 (Pending Backport): run cmd 'ceph daemon osd.0 smart' cause osd daemon Segmentation fault
https://github.com/ceph/ceph/pull/21950 Kefu Chai
11:11 AM Bug #23899 (Resolved): run cmd 'ceph daemon osd.0 smart' cause osd daemon Segmentation fault
Kefu Chai
11:22 AM Bug #24000 (Pending Backport): mon: snap delete on deleted pool returns 0 without proper payload
Kefu Chai

05/11/2018

08:06 PM Bug #23195 (Resolved): Read operations segfaulting multiple OSDs
Nathan Cutler
08:06 PM Backport #23850 (Resolved): luminous: Read operations segfaulting multiple OSDs
Nathan Cutler
04:28 PM Backport #23850: luminous: Read operations segfaulting multiple OSDs
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/21911
merged
Yuri Weinstein
03:44 PM Feature #24099: osd: Improve workflow when creating OSD on raw block device if there was bluestor...
Another related issue I found is that zapping requires root, even when the user executing it already has write permis... Niklas Hambuechen
03:32 PM Feature #24099 (New): osd: Improve workflow when creating OSD on raw block device if there was bl...
On Ceph Luminous, when creating a new bluestore OSD on a block device... Niklas Hambuechen
02:16 PM Bug #24077 (Fix Under Review): test_pool_create_fail (tasks.mgr.dashboard.test_pool.PoolTest) fails
Josh, it's not a mon crash. mon was just not happy with this command, please see @handle_bad_get()@ in @cmd_getval(Ce... Kefu Chai
06:54 AM Bug #24094: some objects are lost after one of osd in cache-tier is broken
New findings:
For the object s3://B6-2017-12-22-10-25-42/timecost.txt, whose index is .dir.0089274c-7a8b-4e66-83d...
Lei Xue
04:33 AM Bug #24094 (New): some objects are lost after one of osd in cache-tier is broken
I have a small cluster set up; some configs:
* 9 machines
* 9*2 4T SSDs as cache tier (size == 1)
* 9*14 8T HDDs as...
Lei Xue
04:39 AM Bug #23899 (Fix Under Review): run cmd 'ceph daemon osd.0 smart' cause osd daemon Segmentation fault
https://github.com/ceph/ceph/pull/21691 Kefu Chai
03:20 AM Backport #23986 (In Progress): luminous: recursive lock of objecter session::lock on cancel
https://github.com/ceph/ceph/pull/21939 Prashant D

05/10/2018

11:40 PM Bug #24077: test_pool_create_fail (tasks.mgr.dashboard.test_pool.PoolTest) fails
Looks to have caused a monitor crash:... Josh Durgin
09:13 AM Bug #24077 (Resolved): test_pool_create_fail (tasks.mgr.dashboard.test_pool.PoolTest) fails
... Kefu Chai
08:25 PM Bug #24037: osd: Assertion `!node_algorithms::inited(this->priv_value_traits().to_node_ptr(value...
Here's another that looks related:... Patrick Donnelly
04:33 PM Bug #24041 (Resolved): ceph-disk log is written to /var/run/ceph
Nathan Cutler
03:44 PM Bug #24041: ceph-disk log is written to /var/run/ceph
https://github.com/ceph/ceph/pull/21870
merged
Yuri Weinstein
04:33 PM Backport #24042 (Resolved): luminous: ceph-disk log is written to /var/run/ceph
Nathan Cutler
04:29 PM Backport #24083 (Resolved): luminous: rados: not all exceptions accept keyargs
https://github.com/ceph/ceph/pull/22979 Nathan Cutler
03:11 PM Bug #20924: osd: leaked Session on osd.7
osd.3
/a//yuriw-2018-05-09_22:08:37-rados-mimic-distro-basic-smithi/2511364/remote/smithi118/log/valgrind/osd.3.lo...
Kefu Chai
02:08 PM Bug #23492: Abort in OSDMap::decode() during qa/standalone/erasure-code/test-erasure-eio.sh
i saw a similar abort, except that it came from OSD::init() during qa/standalone/scrub/osd-scrub-repair.sh:... Casey Bodley
12:43 PM Bug #24023: Segfault on OSD in 12.2.5
This is happening on a regular basis, 1-2 per day
Alex Gorbachev
12:29 PM Bug #24078 (New): spdk crash during librados shutdown
I use spdk_tgt from spdk project:
1. rbd create foo --size 1000
2. python /home/sample/build_pool/agent/repo/test/j...
Pawel Kaminski
10:56 AM Bug #23966: Deleting a pool with active notify linger ops can result in seg fault
forward-port for master: https://github.com/ceph/ceph/pull/21831 Kefu Chai
09:34 AM Bug #24033 (Pending Backport): rados: not all exceptions accept keyargs
Kefu Chai
08:48 AM Bug #24076 (Fix Under Review): rados/test.sh fails in "bin/ceph_test_rados_api_misc --gtest_filte...
https://github.com/ceph/ceph/pull/21927 Kefu Chai
08:45 AM Bug #24076: rados/test.sh fails in "bin/ceph_test_rados_api_misc --gtest_filter=*PoolCreationRace*"
The LibRadosMiscPool.PoolCreationRace test creates a ctx for poolrac2.%d before creating it, and sends a bunch of ... Kefu Chai
08:36 AM Bug #24076 (Duplicate): rados/test.sh fails in "bin/ceph_test_rados_api_misc --gtest_filter=*Pool...
it's a regression introduced by https://github.com/ceph/ceph/pull/21609
http://pulpito.ceph.com/kchai-2018-05-09_1...
Kefu Chai
01:28 AM Bug #23119: MD5-checksum of the snapshot for rbd image in Ceph(as OpenStack-Glance backend Storag...
Jason Dillaman wrote:
> Moving to RADOS since it sounds like it's an issue of corruption on your cache tier.
I re...
宏伟 唐

05/09/2018

09:24 PM Bug #23937: FAILED assert(info.history.same_interval_since != 0)
The pg is waiting for state from osd.33 - can you use ceph-post-file to upload the full log from the crash?
You mi...
Josh Durgin
09:11 PM Bug #24000: mon: snap delete on deleted pool returns 0 without proper payload
Josh Durgin
09:11 PM Bug #24006: ceph-osd --mkfs has nondeterministic output
Sounds like we need to flush the log before exiting in ceph-osd. Josh Durgin
09:08 PM Bug #23879: test_mon_osdmap_prune.sh fails
Sounds like we need to block for trimming sometimes when there's a constant propose workload. Josh Durgin
09:02 PM Bug #24037: osd: Assertion `!node_algorithms::inited(this->priv_value_traits().to_node_ptr(value...
Sounds like a use-after-free of some sort, unrelated to other crashes we've seen. Josh Durgin
08:48 PM Bug #24057: cbt fails to copy results to the archive dir
This seems to be an issue with cbt not being able to copy output files to its archive dir, and hence we don't find th... Neha Ojha
12:00 PM Bug #24057: cbt fails to copy results to the archive dir
Neha, mind taking a look? I've run into this failure a couple of times. Kefu Chai
11:59 AM Bug #24057 (Rejected): cbt fails to copy results to the archive dir
/a/kchai-2018-05-08_12:15:21-rados-wip-kefu-testing2-2018-05-08-1834-distro-basic-mira/2501280... Kefu Chai
06:44 PM Backport #24068 (Resolved): luminous: osd sends op_reply out of order
https://github.com/ceph/ceph/pull/23137 Nathan Cutler
06:38 PM Bug #23827 (Pending Backport): osd sends op_reply out of order
Josh Durgin
04:45 PM Bug #23827 (Fix Under Review): osd sends op_reply out of order
The cause for this issue is that we are not tracking enough dup ops for this test, which does multiple writes to the ... Neha Ojha
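The diagnosis above (not tracking enough dup ops) can be sketched generically. This is a hedged Python illustration with assumed names, not Ceph's pg log implementation: the OSD keeps a bounded history of recently completed op ids, and once a retried op's id has been trimmed from that history, the op is re-executed instead of being answered from the cache, producing an out-of-order reply.

```python
from collections import OrderedDict

class DupOpHistory:
    """Bounded history of completed ops (illustrative stand-in for pg log dups)."""

    def __init__(self, max_entries):
        self.max_entries = max_entries
        self.seen = OrderedDict()  # op id -> cached reply

    def record(self, op_id, reply):
        self.seen[op_id] = reply
        while len(self.seen) > self.max_entries:
            self.seen.popitem(last=False)  # trim the oldest entry

    def lookup(self, op_id):
        """Return the cached reply for a retried op, or None if trimmed."""
        return self.seen.get(op_id)

hist = DupOpHistory(max_entries=3)
for i in range(5):
    hist.record(i, f"reply-{i}")

# Ops 0 and 1 were trimmed: a client retry of op 0 is no longer seen as a dup,
# so it would be re-executed and replied to out of order.
retry_hit = hist.lookup(0)
recent_hit = hist.lookup(4)
```

The fix direction implied by the comment is simply a larger `max_entries` for workloads that issue many small writes to the same object.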
04:01 PM Backport #24059 (Resolved): luminous: Deleting a pool with active notify linger ops can result in...
https://github.com/ceph/ceph/pull/22143 Casey Bodley
04:01 PM Backport #24058 (Resolved): jewel: Deleting a pool with active notify linger ops can result in se...
https://github.com/ceph/ceph/pull/22188 Casey Bodley
02:21 PM Bug #24022 (Fix Under Review): "ceph tell osd.x bench" writes resulting JSON to stderr instead of...
I tend to agree: https://github.com/ceph/ceph/pull/21905 John Spray
02:09 PM Backport #24026 (Resolved): mimic: pg-upmap cannot balance in some case
Kefu Chai
12:11 PM Bug #23966 (Pending Backport): Deleting a pool with active notify linger ops can result in seg fault
Kefu Chai
08:05 AM Bug #23851 (Resolved): OSD crashes on empty snapset
Nathan Cutler
08:05 AM Backport #23852 (Resolved): luminous: OSD crashes on empty snapset
Nathan Cutler

05/08/2018

11:09 PM Support #22531: OSD flapping under repair/scrub after receive inconsistent PG LFNIndex.cc: 439: F...
For the record...
I was also hitting this problem on a pg repair. That was because I was following the procedure...
Chris Dunlop
11:05 PM Backport #23852: luminous: OSD crashes on empty snapset
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/21638
merged
Yuri Weinstein
09:37 PM Bug #23909 (Resolved): snaps missing in mapper, should be: 188,18f,191,195,197,198,199,19d,19e,1a...
David Zafman
08:56 PM Backport #24048 (Resolved): luminous: pg-upmap cannot balance in some case
https://github.com/ceph/ceph/pull/22115 Nathan Cutler
05:08 PM Bug #20876: BADAUTHORIZER on mgr, hung ceph tell mon.*
Triggered on Luminous 12.2.5 again.
Mon quorum worked as expected; after restarting all the monitors it was still not healed. all pgs...
Марк Коренберг
04:53 PM Bug #24045 (Resolved): Eviction still raced with scrub due to preemption

We put code in cache tier eviction to check the scrub range, but that isn't sufficient. During scrub preemption re...
David Zafman
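The race described in this entry can be sketched abstractly. This is a hedged Python illustration with assumed names, not Ceph code: eviction checks the scrub range once, but if scrub is preempted at check time and later resumes over a range covering the object, the earlier "safe to evict" answer is stale.

```python
class ScrubState:
    """Minimal stand-in for a PG's active scrub range."""

    def __init__(self):
        self.active = False
        self.begin = 0
        self.end = 0

    def covers(self, obj):
        # The check eviction performs before removing an object.
        return self.active and self.begin <= obj < self.end

scrub = ScrubState()

# Eviction checks while scrub is preempted: the object looks safe to evict.
safe_at_check_time = not scrub.covers(42)

# Scrub resumes over a range including the object before eviction completes,
# so the answer obtained above is now stale.
scrub.active, scrub.begin, scrub.end = True, 0, 100
still_safe = not scrub.covers(42)
```

The point of the sketch is that a one-shot range check is insufficient under preemption; the check and the eviction need to be atomic with respect to scrub resuming, or re-validated.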
06:58 AM Backport #23850 (In Progress): luminous: Read operations segfaulting multiple OSDs
-https://github.com/ceph/ceph/pull/21873- Victor Denisov
06:48 AM Bug #23402: objecter: does not resend op on split interval
we also hit this problem with osd_debug_op_order=true, which results in an "out of order" assert huang jun
04:30 AM Backport #24042 (In Progress): luminous: ceph-disk log is written to /var/run/ceph
Kefu Chai
04:30 AM Backport #24042 (Resolved): luminous: ceph-disk log is written to /var/run/ceph
https://github.com/ceph/ceph/pull/21870 Kefu Chai
04:28 AM Bug #24041: ceph-disk log is written to /var/run/ceph
https://github.com/ceph/ceph/pull/18375 Kefu Chai
04:28 AM Bug #24041 (Resolved): ceph-disk log is written to /var/run/ceph
it should go to /var/log/ceph Kefu Chai

05/07/2018

08:38 PM Bug #24037 (Resolved): osd: Assertion `!node_algorithms::inited(this->priv_value_traits().to_nod...
... Patrick Donnelly
07:24 PM Bug #23909: snaps missing in mapper, should be: 188,18f,191,195,197,198,199,19d,19e,1a1,1a3 was r...

Nevermind. I see you branch was still on ci repo.
$ git branch --contains c20a95b0b9f4082dcebb339135683b91fe39e...
David Zafman
07:18 PM Bug #23909 (Need More Info): snaps missing in mapper, should be: 188,18f,191,195,197,198,199,19d,...

Does your branch include c20a95b0b9f4082dcebb339135683b91fe39ec0a? The change I made was needed to make that fix w...
David Zafman
05:25 PM Bug #23966: Deleting a pool with active notify linger ops can result in seg fault
Alternative Mimic fix: https://github.com/ceph/ceph/pull/21859 Jason Dillaman
02:55 PM Bug #23966: Deleting a pool with active notify linger ops can result in seg fault
I will reset the member variables of C_notify_Finish in its dtor for debugging, to see whether it has been destroyed or not ... Kefu Chai
07:43 AM Bug #23966: Deleting a pool with active notify linger ops can result in seg fault
the test still fails with the fixes above: /a/kchai-2018-05-06_15:50:41-rados-wip-kefu-testing-2018-05-06-2204-distro... Kefu Chai
03:34 PM Bug #24033 (Fix Under Review): rados: not all exceptions accept keyargs
Patrick Donnelly
01:19 PM Bug #24033: rados: not all exceptions accept keyargs
https://github.com/ceph/ceph/pull/21853 Rishabh Dave
12:55 PM Bug #24033 (Resolved): rados: not all exceptions accept keyargs
The method make_ex() in rados.pyx raises exceptions regardless of whether an exception can or cannot handl... Rishabh Dave
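The failure mode can be shown in isolation. This is a hedged sketch, not the actual rados.pyx code (class and helper names are illustrative): passing keyword arguments to an exception class that only accepts positional arguments raises TypeError, so a make_ex()-style factory needs to fall back to positional construction.

```python
class PlainError(Exception):
    """Accepts only positional args, like a bare Exception subclass."""

class RichError(Exception):
    """Accepts an extra keyword argument."""
    def __init__(self, message, errno=None):
        super().__init__(message)
        self.errno = errno

def make_ex(cls, message, **kwargs):
    """Build an exception, dropping kwargs the class cannot accept."""
    try:
        return cls(message, **kwargs)
    except TypeError:
        # e.g. PlainError("msg", errno=5) -> "takes no keyword arguments"
        return cls(message)

e1 = make_ex(RichError, "op failed", errno=5)   # keeps errno
e2 = make_ex(PlainError, "op failed", errno=5)  # falls back, no crash
```

Without the try/except fallback, the second call would propagate a TypeError instead of the intended rados exception, which matches the bug as described.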
02:15 AM Backport #23925 (In Progress): luminous: assert on pg upmap
Prashant D
12:05 AM Bug #24023: Segfault on OSD in 12.2.5
Another one occurred today on a different OSD:
2018-05-06 19:48:33.636221 7f0f55922700 -1 *** Caught signal (Segme...
Alex Gorbachev
 
