Activity

From 06/18/2019 to 07/17/2019

07/17/2019

11:01 PM Backport #39513: mimic: osd: segv in _preboot -> heartbeat
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/28220
merged
Yuri Weinstein
10:44 PM Bug #39152: nautilus osd crash: Caught signal (Aborted) tp_osd_tp
Once this is backported and released (#39693), we should confirm this fixes the problematic osd. Sage Weil
10:29 PM Bug #40809 (New): qa: "Failed to send signal 1: None" in rados
Run: http://pulpito.ceph.com/yuriw-2019-07-15_19:24:27-rados-wip-yuri4-testing-2019-07-15-1517-mimic-distro-basic-smi... Yuri Weinstein
10:18 PM Backport #39311: mimic: crushtool crash on Fedora 28 and newer
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/27986
merged
Yuri Weinstein
10:17 PM Backport #39720: mimic: short pg log+nautilus-p2p-stress-split: "Error: finished tid 3 when last_...
David Zafman wrote:
> https://github.com/ceph/ceph/pull/28089
merged
Yuri Weinstein
10:17 PM Bug #40791: high variance in pg size
This is Luminous, 12.2.12 by now.
Balancing on bytes (reweight-by-utilization) was unable to resolve the issue pre...
Lars Marowsky-Brée
09:16 PM Bug #40791: high variance in pg size
It sure looks like the PG count isn't a power of two, so some of them are simply half size compared to the others. (S... Greg Farnum
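A minimal sketch of why a PG count that is not a power of two leaves some PGs at half the size of others. It assumes object hashes are folded into PGs the way Ceph's `ceph_stable_mod` does; the `pg_num` of 12 and the evenly spread hash values are illustrative only and are not taken from the reported cluster.

```python
from collections import Counter


def stable_mod(x: int, b: int, bmask: int) -> int:
    """Fold hash x into [0, b); bmask is (next power of two >= b) - 1."""
    if (x & bmask) < b:
        return x & bmask
    # Values that would land past the last PG fold back onto a lower PG.
    return x & (bmask >> 1)


def pg_shares(pg_num: int) -> Counter:
    """Count how many of the hash buckets 0..bmask land in each PG."""
    bmask = (1 << (pg_num - 1).bit_length()) - 1
    return Counter(stable_mod(x, pg_num, bmask) for x in range(bmask + 1))


# With 12 PGs, buckets 12..15 fold onto PGs 4..7, which therefore
# receive twice the share of the remaining PGs.
shares = pg_shares(12)
```

With a power-of-two count such as 16, every PG receives exactly one bucket, which is why rounding `pg_num` to a power of two evens out PG sizes.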
09:15 PM Bug #40791 (Need More Info): high variance in pg size
Which ceph version are you using? Neha Ojha
10:17 PM Backport #39374: mimic: ceph tell osd.xx bench help : gives wrong help
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/28097
merged
Yuri Weinstein
10:16 PM Backport #39422: mimic: Don't mark removed osds in when running "ceph osd in any|all|*"
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/28142
merged
Yuri Weinstein
10:13 PM Backport #38341: mimic: pg stuck in backfill_wait with plenty of disk space
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/28201
merged
Yuri Weinstein
09:33 PM Bug #23879: test_mon_osdmap_prune.sh fails

Another time on mimic so I assume Nautilus needs a fix too.
http://qa-proxy.ceph.com/teuthology/yuriw-2019-07-09_1...
David Zafman
09:26 PM Bug #40765: mimic: "Command failed (workunit test rados/test.sh)" in smoke/master/mimic
... Sage Weil
09:19 PM Bug #40726: "OSD::osd_op_tp thread 0x7f6dafcf0700' had timed out after 15"
This happens occasionally on Mira nodes; but if it pops up repeatedly on the same node or test suite that may be evid... Greg Farnum
09:15 PM Bug #40774 (Resolved): mon: interval_set.h: 490: FAILED ceph_assert(p->first > start+len)
Neha Ojha
09:12 PM Bug #40777 (Closed): hit assert in AuthMonitor::update_from_paxos
That assert means there was a read error when the monitor tried to get data off of disk. Check your disk! Greg Farnum
08:56 PM Bug #38238: rados/test.sh: api_aio_pp doesn't seem to start

http://qa-proxy.ceph.com/teuthology/yuriw-2019-07-09_15:21:18-rados-wip-yuri-testing-2019-07-08-2007-mimic-distro-b...
David Zafman
08:56 PM Bug #40070 (Rejected): mon/OSDMonitor: target_size_bytes integer overflow
this is by design. the target_size is new in nautilus, so we don't encode it in the map until require_osd_release >=... Sage Weil
08:54 PM Bug #40081: mon: luminous crash attempting to decode maps after nautilus quorum has been formed
https://github.com/ceph/ceph/pull/28672 (nautilus backport PR) Sage Weil
08:43 PM Bug #40000: osds do not bound xattrs and/or aggregate xattr data in pg log
from the ML,... Sage Weil
08:42 PM Bug #40000 (Need More Info): osds do not bound xattrs and/or aggregate xattr data in pg log
The message dump is 260M (once de-hexified), but the decode of the pg_log_t in the message indicates it is 2484154195... Sage Weil
05:50 PM Bug #40483 (Fix Under Review): Pool settings aren't populated to OSD after restart.
https://github.com/ceph/ceph/pull/29093 Sage Weil
04:57 PM Bug #40755 (Fix Under Review): _txc_add_transaction error (2) No such file or directory not handl...
https://github.com/ceph/ceph/pull/29092 Sage Weil
03:45 PM Bug #40793 (Rejected): mgr mon commands pile up
This was a side-effect of #40792. A targeted mon command was queued for down mon, which forced the MonClient to keep... Sage Weil
03:38 PM Bug #40792 (Fix Under Review): monc: send_command to specific down mon breaks other mon msgs
https://github.com/ceph/ceph/pull/29090 Sage Weil
02:45 PM Bug #40804 (Fix Under Review): ceph mgr module ls -f plain crashes mon
https://github.com/ceph/ceph/pull/29089 Sage Weil
02:28 PM Bug #40804 (Resolved): ceph mgr module ls -f plain crashes mon
Sage Weil
11:05 AM Bug #40410: ceph pg query Segmentation fault in 12.2.10
Hi Brad Hubbard
We manually generated some buckets after deploying the cluster. In order to avoid id repetition, we...
qingbo han
04:50 AM Bug #40410: ceph pg query Segmentation fault in 12.2.10
Brad Hubbard
04:49 AM Bug #40410: ceph pg query Segmentation fault in 12.2.10
qingbo han wrote:
> hi Brad Hubbard:
> I think your theory is correct. I run ceph pg query correctly when I uli...
Brad Hubbard

07/16/2019

05:27 PM Bug #40793 (Rejected): mgr mon commands pile up
on lab cluster, a mon was down for a few days. on restart,... Sage Weil
05:24 PM Bug #40792 (Resolved): monc: send_command to specific down mon breaks other mon msgs
On the lab cluster, the mgr regularly sends mgrbeacons. All is fine.
but, if one mon is down, *and* we send the smart scra...
Sage Weil
05:00 PM Bug #40791 (Closed): high variance in pg size
We're seeing a cluster that has a history of being very unbalanced in terms of OSD utilisation. The balancer in upmap... Jan Fajerski
03:08 PM Bug #40620 (Pending Backport): Explicitly requested repair of an inconsistent PG cannot be schedu...
Sage Weil
03:06 PM Bug #40635 (Fix Under Review): IndexError: list index out of range in thrash_pg_upmap
https://github.com/ceph/ceph/pull/29069 Sage Weil
03:02 PM Bug #40635: IndexError: list index out of range in thrash_pg_upmap
Looks like this triggers when there are no pools, and the pg dump pg_stats is thus empty. Sage Weil
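A minimal sketch of the failure mode described above, assuming the thrasher picks a random entry from the pg_stats list. The helper name `pick_pg` is hypothetical and this is not the code from the linked pull request; it only shows that choosing from an empty list raises `IndexError` unless the empty case is handled first.

```python
import random


def pick_pg(pg_stats, rng=random):
    """Return a random entry from pg_stats, or None when there are no PGs."""
    if not pg_stats:
        # No pools means no PGs: skip instead of raising IndexError.
        return None
    return rng.choice(pg_stats)
```

A caller would treat `None` as "nothing to thrash this round" and simply try again later.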
02:45 PM Bug #40635: IndexError: list index out of range in thrash_pg_upmap
/a/sage-2019-07-15_19:52:54-rados-wip-sage-testing-2019-07-15-0918-distro-basic-smithi/4121793 Sage Weil
09:18 AM Bug #40410: ceph pg query Segmentation fault in 12.2.10
hi Brad Hubbard:
I think your theory is correct. ceph pg query runs correctly when I set ulimit -s 16384. You said s...
qingbo han
04:28 AM Bug #40410: ceph pg query Segmentation fault in 12.2.10
Hello Han,
Many thanks to Radoslaw Zarzynski for the fruitful discussion we had regarding this issue last night. I...
Brad Hubbard
08:14 AM Bug #40785 (Need More Info): In case of osd full scenario 100% pgs went to unknown state, when ad...
After populating more data, the osds became nearfull and full. When more storage was added in this situation, all pgs... servesha dudhgaonkar
02:14 AM Bug #40777: hit assert in AuthMonitor::update_from_paxos
... sdkfzv sdkfzv

07/15/2019

07:41 PM Backport #40639: mimic: osd: report omap/data/metadata usage
Josh Durgin wrote:
> https://github.com/ceph/ceph/pull/28852
merged
Yuri Weinstein
05:49 PM Bug #40774 (Fix Under Review): mon: interval_set.h: 490: FAILED ceph_assert(p->first > start+len)
https://github.com/ceph/ceph/pull/29051 Sage Weil
04:23 PM Bug #40777: hit assert in AuthMonitor::update_from_paxos
Is this reproducible? If so, can you add mon logs (ideally both for peons and leader), at 'debug mon = 10', 'debug pa... Joao Eduardo Luis
09:17 AM Bug #40777 (New): hit assert in AuthMonitor::update_from_paxos
I created the ceph cluster with rook (https://github.com/rook/rook), and the ceph version is 12.2.7 stable.
After I reb...
sdkfzv sdkfzv
01:29 AM Bug #40410: ceph pg query Segmentation fault in 12.2.10
Still looking into this. The issue in the new core is the same as the original coredump. Brad Hubbard

07/13/2019

04:27 PM Bug #40765: mimic: "Command failed (workunit test rados/test.sh)" in smoke/master/mimic
Seems on mimic as well
http://pulpito.ceph.com/teuthology-2019-07-13_06:00:03-smoke-mimic-testing-basic-smithi/
...
Yuri Weinstein

07/12/2019

11:24 PM Bug #40774: mon: interval_set.h: 490: FAILED ceph_assert(p->first > start+len)
similar failure: /ceph/teuthology-archive/pdonnell-2019-07-11_22:52:33-fs-wip-pdonnell-testing-20190711.203149-distro... Patrick Donnelly
11:23 PM Bug #40774 (Resolved): mon: interval_set.h: 490: FAILED ceph_assert(p->first > start+len)
While removing snapshots:... Patrick Donnelly
11:07 PM Bug #40772: mon: pg size change delayed 1 minute because osdmap 35 delay
Kefu can you take a look? See the attached monitor logs. David Zafman
10:48 PM Bug #40772: mon: pg size change delayed 1 minute because osdmap 35 delay

This looks to be a monitor issue. We see that osdmap 35 may be getting hung up during the critical period 00:29:49 ...
David Zafman
09:21 PM Bug #40772 (New): mon: pg size change delayed 1 minute because osdmap 35 delay

osd-recovery-prio.sh TEST_recovery_pool_priority fails intermittently due to a delay in recovery starting on a pg. ...
David Zafman
10:21 PM Bug #40725 (Resolved): osd-scrub-snaps.sh fails
Sage Weil
02:19 AM Bug #40725 (Fix Under Review): osd-scrub-snaps.sh fails
Kefu Chai
09:11 PM Bug #38357: ClsLock.TestExclusiveEphemeralStealEphemeral failed
... Sage Weil
03:24 PM Bug #40765 (Duplicate): mimic: "Command failed (workunit test rados/test.sh)" in smoke/master/mimic
Run: http://pulpito.ceph.com/teuthology-2019-07-12_06:00:03-smoke-mimic-testing-basic-smithi/
Jobs: '4113997', '4113...
Yuri Weinstein
02:08 PM Bug #40755 (Resolved): _txc_add_transaction error (2) No such file or directory not handled on op...
... Sage Weil
01:58 PM Bug #40635: IndexError: list index out of range in thrash_pg_upmap
/a/sage-2019-07-11_17:46:52-rados-wip-sage-testing-2019-07-11-1048-distro-basic-smithi/4111022 Sage Weil
12:33 PM Bug #38124 (Resolved): OSD down on snaptrim.
Nathan Cutler
12:33 PM Backport #39698 (Resolved): mimic: OSD down on snaptrim.
Nathan Cutler

07/11/2019

10:35 PM Backport #40638 (In Progress): luminous: osd: report omap/data/metadata usage
Brad Hubbard
10:34 PM Backport #40638 (Duplicate): luminous: osd: report omap/data/metadata usage
Brad Hubbard
02:00 PM Backport #40638 (In Progress): luminous: osd: report omap/data/metadata usage
Nathan Cutler
10:34 PM Feature #38550 (Duplicate): osd: Implement lazy omap usage statistics per osd
Brad Hubbard
10:28 PM Backport #40744 (Resolved): nautilus: core: lazy omap stat collection
https://github.com/ceph/ceph/pull/29188 Brad Hubbard
10:22 PM Feature #38136: core: lazy omap stat collection
Requires backport of https://github.com/ceph/ceph/pull/26614 and https://github.com/ceph/ceph/pull/28070 Brad Hubbard
10:22 PM Backport #38552 (In Progress): mimic: core: lazy omap stat collection
Requires backport of https://github.com/ceph/ceph/pull/26614 and https://github.com/ceph/ceph/pull/28070 Brad Hubbard
10:21 PM Backport #38551 (In Progress): luminous: core: lazy omap stat collection
Requires backport of https://github.com/ceph/ceph/pull/26614 and https://github.com/ceph/ceph/pull/28070 Brad Hubbard
07:24 PM Backport #40650: luminous: os/bluestore: fix >2GB writes
Neha Ojha wrote:
> https://github.com/ceph/ceph/pull/28965
merged
Yuri Weinstein
04:39 PM Bug #38724: _txc_add_transaction error (39) Directory not empty not handled on operation 21 (op 1...
Rebooted node 4; on nodes 1 and 2, 2 OSDs each crashed and will not start.
The logs are similar, seems to be the BUG ...
Edward Kalk
02:36 PM Bug #38724: _txc_add_transaction error (39) Directory not empty not handled on operation 21 (op 1...
^^this results in the Production VMs becoming unresponsive as their disks are unavailable when we have multiple OSDs ... Edward Kalk
02:33 PM Bug #38724: _txc_add_transaction error (39) Directory not empty not handled on operation 21 (op 1...
Sometimes when this happens, the OSDs repeatedly crash and the Linux system prevents them from being started. It takes 10... Edward Kalk
02:28 PM Bug #38724: _txc_add_transaction error (39) Directory not empty not handled on operation 21 (op 1...
Was Bug 38724:
```ceph-osd.9.log: -3> 2019-07-11 09:15:13.569 7fc7b8243700 -1 bluestore(/var/lib/ceph/osd/ceph-9...
Edward Kalk
02:28 PM Bug #38724: _txc_add_transaction error (39) Directory not empty not handled on operation 21 (op 1...
OSD 9, 15, 10, 13 crashed this AM.
```ceph.log:2019-07-11 09:15:15.501601 mon.synergy0 (mon.0) 4248 : cluster [IN...
Edward Kalk
04:28 PM Bug #40740 (New): "Error: finished tid 3 when last_acked_tid was 5" in upgrade:luminous-x-mimic
Run: http://pulpito.ceph.com/teuthology-2019-07-11_02:25:02-upgrade:luminous-x-mimic-distro-basic-smithi/
Job: 41101...
Yuri Weinstein
03:18 PM Backport #39693: nautilus: _txc_add_transaction error (39) Directory not empty not handled on ope...
Edward Kalk wrote:
> found a few things that seem like fixes for this on github... : https://github.com/ceph/ceph/pu...
Nathan Cutler
02:50 PM Backport #38276 (Resolved): luminous: osd_map_message_max default is too high?
Nathan Cutler
02:36 PM Backport #38751 (In Progress): mimic: should report EINVAL in ErasureCode::parse() if m<=0
Nathan Cutler
02:34 PM Backport #38750 (Resolved): luminous: should report EINVAL in ErasureCode::parse() if m<=0
Nathan Cutler
02:01 PM Backport #40639 (In Progress): mimic: osd: report omap/data/metadata usage
Nathan Cutler
01:59 PM Backport #40730 (In Progress): nautilus: mon: auth mon isn't loading full KeyServerData after res...
Nathan Cutler
01:58 PM Backport #40730 (Resolved): nautilus: mon: auth mon isn't loading full KeyServerData after restart
https://github.com/ceph/ceph/pull/28993 Nathan Cutler
01:58 PM Backport #40732 (Resolved): mimic: mon: auth mon isn't loading full KeyServerData after restart
https://github.com/ceph/ceph/pull/30181 Nathan Cutler
01:58 PM Backport #40731 (Rejected): luminous: mon: auth mon isn't loading full KeyServerData after restart
Nathan Cutler
01:13 PM Backport #39537 (In Progress): luminous: osd/ReplicatedBackend.cc: 1321: FAILED assert(get_parent...
Nathan Cutler
01:12 PM Backport #39538 (Resolved): mimic: osd/ReplicatedBackend.cc: 1321: FAILED assert(get_parent()->ge...
Nathan Cutler
01:12 PM Bug #39582 (Resolved): Binary data in OSD log from "CRC header" message
Nathan Cutler
01:12 PM Backport #39737 (Resolved): mimic: Binary data in OSD log from "CRC header" message
Nathan Cutler
01:11 PM Backport #39744 (Resolved): mimic: mon: "FAILED assert(pending_finishers.empty())" when paxos res...
Nathan Cutler
10:21 AM Bug #40726 (New): "OSD::osd_op_tp thread 0x7f6dafcf0700' had timed out after 15"
osd.7 was marked down by itself because of unhealthy heartbeat.... Kefu Chai
04:37 AM Bug #40725: osd-scrub-snaps.sh fails
David, mind taking a look? Kefu Chai
04:37 AM Bug #40725 (Resolved): osd-scrub-snaps.sh fails
... Kefu Chai
01:54 AM Feature #40420: Introduce an ceph.conf option to disable HEALTH_WARN when nodeep-scrub/scrub flag...
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2019-June/035406.html
https://pad.ceph.com/p/health-mute
Vikhyat Umrao
01:18 AM Bug #40641: OSD failure after PGInfo back to previous versions, resulting in PGLog error rollback
Neha Ojha wrote:
> How did you find out there were unrecoverable objects? Was there any indication in the logs?
T...
tao ning

07/10/2019

09:10 PM Bug #40641: OSD failure after PGInfo back to previous versions, resulting in PGLog error rollback
How did you find out there were unrecoverable objects? Was there any indication in the logs? Neha Ojha
09:06 PM Bug #40674 (Resolved): TEST_corrupt_snapset_scrub_rep fails
Neha Ojha
09:04 PM Bug #40718 (Duplicate): touch in txn on (old) nautilus osd
Josh Durgin
02:44 PM Bug #40718 (Duplicate): touch in txn on (old) nautilus osd
... Sage Weil
07:51 PM Bug #40722: "IOError: [Errno 2] No such file or directory: '/tmp/pip-build-o9ggCd/unknown/setup.p...
@Alfredo can you pls take a look? Yuri Weinstein
07:00 PM Bug #40722 (New): "IOError: [Errno 2] No such file or directory: '/tmp/pip-build-o9ggCd/unknown/s...
Run: http://pulpito.ceph.com/teuthology-2019-07-10_05:10:03-ceph-disk-mimic-distro-basic-mira/
Jobs: '4108064', '410...
Yuri Weinstein
05:39 PM Bug #40721: backfill caught in loop from block
original blocked request is... Sage Weil
04:42 PM Bug #40721: backfill caught in loop from block
actually, this retry is triggered on every osdmap. Sage Weil
04:42 PM Bug #40721 (Can't reproduce): backfill caught in loop from block
... Sage Weil
05:08 PM Bug #40720 (Fix Under Review): mimic, nautilus: make bitmap allocator the default allocator for b...
Neha Ojha
04:29 PM Bug #40720 (Resolved): mimic, nautilus: make bitmap allocator the default allocator for bluestore
The default for nautilus is already bitmap allocator.
We might just need to cherry-pick 231b7dd9c5dc1d22e93a8f81d07e...
Neha Ojha
03:49 PM Backport #40650 (In Progress): luminous: os/bluestore: fix >2GB writes
https://github.com/ceph/ceph/pull/28965 Neha Ojha
03:49 PM Backport #40651 (In Progress): mimic: os/bluestore: fix >2GB writes
https://github.com/ceph/ceph/pull/28967 Neha Ojha
03:48 PM Backport #40652 (In Progress): nautilus: os/bluestore: fix >2GB writes
https://github.com/ceph/ceph/pull/28966 Neha Ojha
03:11 PM Bug #40712: ceph-mon crash with assert(err == 0) after rocksdb->get
I also opened an issue in rocksdb: https://github.com/facebook/rocksdb/issues/5558, and I attached the db file in thi... Yang Dongsheng
12:18 PM Bug #40712 (New): ceph-mon crash with assert(err == 0) after rocksdb->get
(1) I found a very strange problem in our environment: the ceph-mon crashed with the below error in the log:... Yang Dongsheng
03:02 PM Backport #39693: nautilus: _txc_add_transaction error (39) Directory not empty not handled on ope...
found a few things that seem like fixes for this on github... : https://github.com/ceph/ceph/pull/27929/commits Edward Kalk
02:06 PM Backport #39693: nautilus: _txc_add_transaction error (39) Directory not empty not handled on ope...
Will this fix be included in : https://tracker.ceph.com/projects/ceph/roadmap#v14.2.2 ? Edward Kalk
03:01 PM Bug #38724: _txc_add_transaction error (39) Directory not empty not handled on operation 21 (op 1...
found a few things that seem like fixes for this on github... : https://github.com/ceph/ceph/pull/27929/commits Edward Kalk
02:49 PM Bug #38724: _txc_add_transaction error (39) Directory not empty not handled on operation 21 (op 1...
We hit this bug again : "2019-07-10 09:16:27.728 7f73b844c700 -1 bluestore(/var/lib/ceph/osd/ceph-5) _txc_add_transac... Edward Kalk
02:06 PM Bug #38724: _txc_add_transaction error (39) Directory not empty not handled on operation 21 (op 1...
will this be included in : https://tracker.ceph.com/projects/ceph/roadmap#v14.2.2 . ? Edward Kalk
11:56 AM Bug #39555 (In Progress): backfill_toofull while OSDs are not full (Unneccessary HEALTH_ERR)
oops, reverting - I had not seen Joao's question Nathan Cutler
11:54 AM Bug #39555 (Pending Backport): backfill_toofull while OSDs are not full (Unneccessary HEALTH_ERR)
Nathan Cutler
06:11 AM Bug #40410: ceph pg query Segmentation fault in 12.2.10
hi Brad Hubbard
I failed to reproduce the segfault in python several times. I have uploaded the coredump to ceph, the id ...
qingbo han

07/09/2019

08:53 PM Backport #39693: nautilus: _txc_add_transaction error (39) Directory not empty not handled on ope...
I dropped notes in "http://tracker.ceph.com/issues/38724". Not sure I understand the status. "pending backport" says ... Edward Kalk
04:30 PM Backport #38276: luminous: osd_map_message_max default is too high?
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/28640
merged
Yuri Weinstein
04:26 PM Bug #38724: _txc_add_transaction error (39) Directory not empty not handled on operation 21 (op 1...
I am confused by the "Copied to RADOS - Backport #39693: nautilus" status. "Pending Backport 07/03/2019"
Does this m...
Edward Kalk
04:15 PM Bug #38724: _txc_add_transaction error (39) Directory not empty not handled on operation 21 (op 1...
We have hit this as well, it was triggered when I rebooted a node. A few OSD on other hosts crashed. Here's some log:... Edward Kalk
04:04 PM Backport #38750: luminous: should report EINVAL in ErasureCode::parse() if m<=0
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/28111
merged
Yuri Weinstein
02:21 PM Bug #40634 (Pending Backport): mon: auth mon isn't loading full KeyServerData after restart
Kefu Chai
10:50 AM Bug #39555: backfill_toofull while OSDs are not full (Unneccessary HEALTH_ERR)
The pull request provided with the fix has been merged (https://github.com/ceph/ceph/pull/28204). Does anyone still s... Joao Eduardo Luis
01:58 AM Bug #40410: ceph pg query Segmentation fault in 12.2.10
Hello Han,
I don't see any glaring differences in the binaries so far but I did notice this in the dmesg output.
...
Brad Hubbard

07/08/2019

02:05 PM Bug #40692 (New): Ceph daemons failing to start when large unix groups exist
While tracking down this [1] error I found where the error came from in the [2] code and looked into the getgrnam_r f... David Turner

07/07/2019

02:48 AM Bug #38827 (Resolved): valgrind: UninitCondition in ceph::crypto::onwire::AES128GCM_OnWireRxHandl...
Kefu Chai

07/05/2019

02:54 PM Bug #40674 (Fix Under Review): TEST_corrupt_snapset_scrub_rep fails
https://github.com/ceph/ceph/pull/28901 Sage Weil
03:38 AM Bug #40674 (Resolved): TEST_corrupt_snapset_scrub_rep fails
... Kefu Chai

07/04/2019

05:59 PM Bug #40649: set_mon_vals failed to set cluster_network = 10.1.2.0/24: Configuration option 'clust...
Anyway, I generated the ceph.conf file used on client machines using the command "ceph config generate-minimal-conf".... Vandeir Eduardo
05:49 PM Bug #40649: set_mon_vals failed to set cluster_network = 10.1.2.0/24: Configuration option 'clust...
As described by Manuel Rios in https://tracker.ceph.com/issues/40282 , the workaround is to include the configs:
public_n...
Vandeir Eduardo
02:36 PM Bug #20973: src/osdc/ Objecter.cc: 3106: FAILED assert(check_latest_map_ops.find(op->tid) == chec...
... Kefu Chai
12:58 AM Bug #40641: OSD failure after PGInfo back to previous versions, resulting in PGLog error rollback
Neha Ojha wrote:
> Did you see a crash in the logs somewhere? Can you tell us which osd failed and why and also atta...
tao ning
12:53 AM Backport #40667 (In Progress): nautilus: PG scrub stamps reset to 0.000000
David Zafman

07/03/2019

11:43 PM Bug #40668 (Resolved): mon_osd_report_timeout should not be allowed to be less than 2x the value ...
We should have a safety built in that will not allow the mon_osd_report_timeout to be set less than a value that is 2... Neha Ojha
10:48 PM Feature #40640: Network ping monitoring

See also https://pad.ceph.com/p/Network_ping_monitoring
Examples, with warning threshold set to 1 microsecond.
...
David Zafman
12:52 AM Feature #40640 (Resolved): Network ping monitoring

The simplest version of this would be to see warnings if heartbeat ping response time exceeds certain thresholds.
David Zafman
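A rough sketch of the "simplest version" described above: flag any heartbeat peer whose ping response time exceeds a warning threshold. The function name, the dictionary layout, and the millisecond units are assumptions made for illustration; they do not reflect how the feature is implemented in Ceph.

```python
def slow_pings(ping_ms_by_peer, warn_ms):
    """Return the peers whose ping time (ms) exceeds warn_ms, sorted by name."""
    return sorted(peer for peer, ms in ping_ms_by_peer.items() if ms > warn_ms)


# Hypothetical sample: osd.2 is the only peer above a 1 ms threshold.
sample = {"osd.0": 0.4, "osd.1": 0.9, "osd.2": 3.2}
warnings = slow_pings(sample, warn_ms=1.0)
```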
10:35 PM Backport #40667 (Resolved): nautilus: PG scrub stamps reset to 0.000000
https://github.com/ceph/ceph/pull/28869 David Zafman
10:29 PM Bug #40073 (Pending Backport): PG scrub stamps reset to 0.000000
David Zafman
09:51 PM Bug #40666 (New): osd fails to get latest map
... Sage Weil
09:35 PM Bug #40483: Pool settings aren't populated to OSD after restart.
Sage Weil
09:34 PM Fix #40564: Objecter does not have perfcounters for op latency
The TMAP* operations are obsolete and deprecated/removed. Adding some latency stats would be useful, though. Update... Sage Weil
09:30 PM Bug #40622 (Resolved): PG stuck in active+clean+remapped
This looks like crush is just failing to find a good replica because 50% of the osds in a rack are down. Try using t... Sage Weil
09:29 PM Bug #40620 (Fix Under Review): Explicitly requested repair of an inconsistent PG cannot be schedu...
Neha Ojha
01:45 AM Bug #40620: Explicitly requested repair of an inconsistent PG cannot be scheduled timely on a OSD...
PR: https://github.com/ceph/ceph/pull/28839 Jeegn Chen
09:28 PM Bug #40641 (Need More Info): OSD failure after PGInfo back to previous versions, resulting in PGL...
Did you see a crash in the logs somewhere? Can you tell us which osd failed and why and also attach the osd logs? Neha Ojha
03:50 AM Bug #40641 (Need More Info): OSD failure after PGInfo back to previous versions, resulting in PGL...
Ceph Version 12.2.7... tao ning
09:24 PM Bug #40635: IndexError: list index out of range in thrash_pg_upmap
side note: using random.choice(seq) would do the same thing Josh Durgin
06:35 PM Bug #40662 (Rejected): Too many deep scrubs with noscrub set and nodeep-scrub unset
David Zafman
06:06 PM Bug #40662 (Rejected): Too many deep scrubs with noscrub set and nodeep-scrub unset

We intended to add a 1 hour backoff to scrub handling when noscrub is set. This will result in too many deep scrub...
David Zafman
05:18 PM Bug #38403 (Duplicate): osd: leaked from OSDMap::apply_incremental
#20491 Sage Weil
01:53 PM Backport #40655 (Resolved): nautilus: Lower the default value of osd_deep_scrub_large_omap_object...
https://github.com/ceph/ceph/pull/29173 Nathan Cutler
01:53 PM Backport #40654 (Resolved): mimic: Lower the default value of osd_deep_scrub_large_omap_object_ke...
https://github.com/ceph/ceph/pull/29174 Nathan Cutler
01:53 PM Backport #40653 (Resolved): luminous: Lower the default value of osd_deep_scrub_large_omap_object...
https://github.com/ceph/ceph/pull/29175 Nathan Cutler
01:52 PM Backport #40652 (Resolved): nautilus: os/bluestore: fix >2GB writes
https://github.com/ceph/ceph/pull/28966 Nathan Cutler
01:52 PM Backport #40651 (Resolved): mimic: os/bluestore: fix >2GB writes
https://github.com/ceph/ceph/pull/28967 Nathan Cutler
01:52 PM Backport #40650 (Resolved): luminous: os/bluestore: fix >2GB writes
https://github.com/ceph/ceph/pull/28965 Nathan Cutler
01:45 PM Bug #40577 (Resolved): vstart.sh can't work.
Nathan Cutler
01:31 PM Bug #40649 (New): set_mon_vals failed to set cluster_network = 10.1.2.0/24: Configuration option ...
When using any rbd command on a client machine, for example, "rbd ls poolname", those messages are always displayed:
...
Vandeir Eduardo
01:28 PM Bug #40583 (Pending Backport): Lower the default value of osd_deep_scrub_large_omap_object_key_th...
Sage Weil
10:37 AM Bug #40646: FTBFS with devtoolset-8-gcc-c++-8.3.1-3.el7.x86_64 and devtoolset-8-libstdc++-docs-8....
temporary workaround posted at https://github.com/ceph/ceph/pull/28859 Kefu Chai
10:28 AM Bug #40646: FTBFS with devtoolset-8-gcc-c++-8.3.1-3.el7.x86_64 and devtoolset-8-libstdc++-docs-8....
https://bugzilla.redhat.com/show_bug.cgi?id=1726630
alternatively, we can pin on the previous version: devtoolset...
Kefu Chai
10:04 AM Bug #40646 (Resolved): FTBFS with devtoolset-8-gcc-c++-8.3.1-3.el7.x86_64 and devtoolset-8-libstd...
... Kefu Chai
09:41 AM Bug #40642 (Duplicate): Bluestore crash due to mass activation on another pool
looks like a duplicate for https://tracker.ceph.com/issues/38724 Igor Fedotov
04:55 AM Bug #40642 (Duplicate): Bluestore crash due to mass activation on another pool
Newly deployed Nautilus cluster with SSD and HDD pools on Ubuntu 18.04.2 with kernel 4.15.0-54.
When adding a doz...
Nigel Williams
07:57 AM Documentation #40643 (New): clearify begin hour + end hour
documentation doesn't mention if it's allowed to have a scrubbing window across midnight, and if it is, it should spe... Torben Hørup
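One way a scrub window that crosses midnight could be interpreted, sketched under the assumption that the begin hour is inclusive, the end hour is exclusive, and a begin hour greater than the end hour means the window wraps past midnight. The function name is hypothetical, and the documentation in question would still need to state the actual rule.

```python
def hour_in_window(hour: int, begin: int, end: int) -> bool:
    """Check whether hour (0-23) falls inside the window [begin, end)."""
    if begin < end:
        # Ordinary same-day window, e.g. 8 to 18.
        return begin <= hour < end
    # begin >= end: the window wraps past midnight, e.g. 22 to 6.
    # Under this reading, begin == end covers the whole day.
    return hour >= begin or hour < end
```

For example, a window from 22 to 6 would permit scrubbing at 23:00 and 03:00 but not at noon.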
12:20 AM Backport #40639 (Resolved): mimic: osd: report omap/data/metadata usage
https://github.com/ceph/ceph/pull/28852 Josh Durgin
12:19 AM Backport #40638 (Resolved): luminous: osd: report omap/data/metadata usage
https://github.com/ceph/ceph/pull/28851 Josh Durgin
12:17 AM Bug #40637 (Resolved): osd: report omap/data/metadata usage
This is to track the backport of https://github.com/ceph/ceph/pull/18096. This is helpful to tell when a given OSD's ... Josh Durgin

07/02/2019

11:30 PM Bug #39175 (Resolved): RGW DELETE calls partially missed shortly after OSD startup
That's great. Marking this bug as "Resolved" since migrating to BlueStore fixed the issue. Neha Ojha
11:12 PM Bug #40636 (Resolved): os/bluestore: fix >2GB writes
This is related to https://tracker.ceph.com/issues/23527#note-6 Neha Ojha
11:09 PM Bug #40635 (Resolved): IndexError: list index out of range in thrash_pg_upmap
... Sage Weil
11:06 PM Bug #23879: test_mon_osdmap_prune.sh fails
/a/sage-2019-07-02_17:58:21-rados-wip-sage-testing-2019-07-02-1056-distro-basic-smithi/4087740 Sage Weil
11:05 PM Bug #40634: mon: auth mon isn't loading full KeyServerData after restart
https://github.com/ceph/ceph/pull/28850 Sage Weil
11:05 PM Bug #40634 (Fix Under Review): mon: auth mon isn't loading full KeyServerData after restart
https://github.com/ceph/ceph/pull/28850 Sage Weil
10:59 PM Bug #40634 (Resolved): mon: auth mon isn't loading full KeyServerData after restart
/a/sage-2019-07-02_17:58:21-rados-wip-sage-testing-2019-07-02-1056-distro-basic-smithi/4087648
symptom is a failed...
Sage Weil
08:29 PM Backport #40625 (Resolved): nautilus: OSDs get killed by OOM due to a broken switch
https://github.com/ceph/ceph/pull/29391 Nathan Cutler
02:46 PM Bug #40622 (Resolved): PG stuck in active+clean+remapped
The cluster has 6 servers in 3 racks, 2 servers per rack.
A replication rule distributes replicas to the 3 racks: ...
Mike Almateia
01:33 PM Bug #40620 (Resolved): Explicitly requested repair of an inconsistent PG cannot be scheduled time...
Since osd_scrub_during_recovery=false is the default, when an OSD has some recovering PGs, it will not schedule any... Jeegn Chen
01:11 PM Bug #23117: PGs stuck in "activating" after osd_max_pg_per_osd_hard_ratio has been exceeded once
Ceph version was 13.2.5 on the reinstalled host and 13.2.4 on the other hosts. Gaudenz Steinlin
01:09 PM Bug #23117: PGs stuck in "activating" after osd_max_pg_per_osd_hard_ratio has been exceeded once
We also hit this problem with a cluster which had replicated pools with a replication factor of 3 and a CRUSH rule wi... Gaudenz Steinlin
09:54 AM Bug #40586 (Pending Backport): OSDs get killed by OOM due to a broken switch
Kefu Chai
12:27 AM Bug #40586: OSDs get killed by OOM due to a broken switch
Greg Farnum wrote:
> Is this something you're working on, Xie?
Ah, sorry, forgot to link the pr, should be all se...
xie xingguo
09:45 AM Bug #40533 (Resolved): thrashosds/test_pool_min_size races with radosbench tests
Kefu Chai
09:33 AM Bug #40410: ceph pg query Segmentation fault in 12.2.10
Brad Hubbard wrote:
> Interesting, thanks Han.
>
> Would you mind uploading an sosreport from one node where the ...
qingbo han
02:00 AM Documentation #40488 (Resolved): Describe in documentation that EC can't recover below min_size p...
Kefu Chai

07/01/2019

09:44 PM Bug #39175: RGW DELETE calls partially missed shortly after OSD startup
After switching both of these over to using BlueStore for the SSDs, the problem has gone away! Thanks! Bryan Stillwell
09:17 PM Bug #40586: OSDs get killed by OOM due to a broken switch
Is this something you're working on, Xie? Greg Farnum
09:17 PM Documentation #40568 (Fix Under Review): monmaptool: document the new --addv argument
Patrick Donnelly
08:03 PM Backport #39698: mimic: OSD down on snaptrim.
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/28202
merged
Yuri Weinstein
08:00 PM Backport #39538: mimic: osd/ReplicatedBackend.cc: 1321: FAILED assert(get_parent()->get_log().get...
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/28259
merged
Yuri Weinstein
07:57 PM Backport #39737: mimic: Binary data in OSD log from "CRC header" message
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/28503
merged
Yuri Weinstein
07:57 PM Backport #39744: mimic: mon: "FAILED assert(pending_finishers.empty())" when paxos restart
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/28540
merged
Yuri Weinstein
07:25 PM Feature #40610 (New): ceph-objectstore-tool option to "add-clone-metadata"

For a scenario where for some reason snapset information gets corrupt and there is a clone in the objectstore, reco...
David Zafman
05:41 AM Bug #40576: src/osd/PrimaryLogPG.cc: 10513: FAILED assert(head_obc)
I removed the object from osd.12 and osd.16 and the cluster was able to return to HEALTH_OK. I appreciate the help i... Kenneth Van Alstyne
02:02 AM Bug #40601 (Fix Under Review): osd: osd being wrongly reported down because of getloadavg taking ...
Kefu Chai

06/30/2019

02:13 PM Bug #40601: osd: osd being wrongly reported down because of getloadavg taking heartbeat_lock for t...
https://github.com/ceph/ceph/pull/28799 dongdong tao
12:48 PM Bug #40601 (Fix Under Review): osd: osd being wrongly reported down because of getloadavg taking ...
Currently, OSD::heartbeat() calls getloadavg() to get the load info.
Since getloadavg just opens the file /proc...
dongdong tao

06/29/2019

03:25 PM Bug #39059 (Can't reproduce): assert in ceph::net::SocketMessenger::unregister_conn()
Not reproducible anymore. Kefu Chai
12:15 AM Bug #40586 (Resolved): OSDs get killed by OOM due to a broken switch
Jun 11 04:19:26 host-192-168-9-12 kernel: [409881.136278] Node 1 Normal: 26515*4kB (UEM) 1226*8kB (UEM) 40*16kB (UEM)... xie xingguo

06/28/2019

08:11 PM Bug #40576: src/osd/PrimaryLogPG.cc: 10513: FAILED assert(head_obc)

I think you just didn't remove it from osd.12, because osd.12, the primary, is trying to push the stray clone to osd.9.
David Zafman
08:04 PM Bug #40576: src/osd/PrimaryLogPG.cc: 10513: FAILED assert(head_obc)

Where is the line like this in the new crash:
/tmp/b-2I_Suq1c6XVP/build/node/root/packages/ceph/workdir-FunpCBst...
David Zafman
12:00 PM Bug #40576: src/osd/PrimaryLogPG.cc: 10513: FAILED assert(head_obc)
The removal appeared to be successful; however, the OSD still crashes in a similar way:... Kenneth Van Alstyne
03:51 AM Bug #40576: src/osd/PrimaryLogPG.cc: 10513: FAILED assert(head_obc)

Because of the missing _object info_ the --op list mechanism gets an error so it can't generate the JSON for this b...
David Zafman
02:14 AM Bug #40576: src/osd/PrimaryLogPG.cc: 10513: FAILED assert(head_obc)
... David Zafman
02:09 AM Bug #40576: src/osd/PrimaryLogPG.cc: 10513: FAILED assert(head_obc)
The problematic object appears to be rbd_data.554d416b8b4567.000000000000151f:b691. I can't get this OSD to give up ... Kenneth Van Alstyne
02:08 AM Bug #40576 (Closed): src/osd/PrimaryLogPG.cc: 10513: FAILED assert(head_obc)
It appears that a snapshot containing an inconsistent object was removed at some point, causing ceph-osd to now crash... Kenneth Van Alstyne
04:45 PM Bug #40583 (Fix Under Review): Lower the default value of osd_deep_scrub_large_omap_object_key_th...
Neha Ojha
03:42 PM Bug #40583 (Resolved): Lower the default value of osd_deep_scrub_large_omap_object_key_threshold
The current default of 2 million k/v pairs is too high. Recovery takes too long
for bucket index objects with this mu...
Neha Ojha
08:34 AM Documentation #40579 (Fix Under Review): doc: POOL_NEAR_FULL on OSD_NEAR_FULL
http://docs.ceph.com/docs/luminous/rados/operations/health-checks/#pool-near-full only describes pool_near_full when ... Torben Hørup
03:45 AM Bug #40410 (Need More Info): ceph pg query Segmentation fault in 12.2.10
Brad Hubbard
02:32 AM Bug #40577 (Resolved): vstart.sh can't work.
When do_cmake.sh is first run, it creates a ceph.conf in the build dir.
plugin dir = lib
erasure code dir = lib
When d...
jianpeng ma
01:53 AM Bug #40421: osd: lost op?
Greg Farnum wrote:
> Has this recurred on master? What PRs were in that test branch?
I haven't looked at a recent...
Patrick Donnelly

06/27/2019

08:23 PM Feature #40528 (Fix Under Review): Better default value for osd_snap_trim_sleep
Neha Ojha
04:43 PM Backport #40265 (In Progress): nautilus: Setting noscrub causing extraneous deep scrubs
David Zafman
03:10 PM Documentation #40568 (Fix Under Review): monmaptool: document the new --addv argument
Since the introduction of the new Messenger v2 protocol in Ceph Nautilus the @monmaptool@ has been updated to support... Luca Castoro
02:06 PM Bug #40554: "admin_socket" value cannot be configured via the MON config store
Note that this error is printed even if the client *does not* have an override for "admin socket". For non-daemons, t... Jason Dillaman
07:35 AM Fix #40564 (New): Objecter does not have perfcounters for op latency
Currently, Objecter's perf counters are incomplete, for example:
First, the number of CEPH_OSD_OP_TMAPUP, CEPH_OSD_OP_TMAPGET and...
侯 斌

06/26/2019

10:27 PM Bug #40410: ceph pg query Segmentation fault in 12.2.10
Interesting, thanks Han.
Would you mind uploading an sosreport from one node where the failure does happen and one...
Brad Hubbard
06:40 AM Bug #40410: ceph pg query Segmentation fault in 12.2.10
Brad Hubbard wrote:
> Hello Han,
>
> It's still not clear to me exactly what is going on here. There is some sort...
qingbo han
06:31 AM Bug #40410: ceph pg query Segmentation fault in 12.2.10
Hello Han,
It's still not clear to me exactly what is going on here. There is some sort of invalid memory access o...
Brad Hubbard
09:26 PM Feature #40562 (New): More granular setting for alerting full ratios for disks with different sizes
Following is the RFE description:
When we're using heterogeneous hardware, i.e. disks having different sizes.
T...
Neha Ojha
09:10 PM Bug #40421: osd: lost op?
Has this recurred on master? What PRs were in that test branch? Greg Farnum
09:09 PM Bug #40521: cli timeout (e.g., ceph pg dump)
In the second log snippet there:
> remote/smithi203/log/ceph-mon.b.log.gz:2019-06-22T11:51:16.769+0000 7f779cd5c700 ...
Greg Farnum
07:25 AM Bug #40522: on_local_recover doesn't touch?
... jianpeng ma

06/25/2019

10:54 PM Documentation #40488 (In Progress): Describe in documentation that EC can't recover below min_siz...
Neha Ojha
10:23 PM Bug #40388: Mimic: osd crashes during hit_set_trim and hit_set_remove_all if hit set object doesn...
Are there any logs I can add to help with this case? Lazuardi Nasution
03:56 PM Bug #40554 (New): "admin_socket" value cannot be configured via the MON config store
Attempting to set a global "admin_socket" value for all clients does not properly apply and results in warning message... Jason Dillaman
10:18 AM Backport #40537 (Resolved): nautilus: osd/PG.cc: 2410: FAILED ceph_assert(scrub_queued)
https://github.com/ceph/ceph/pull/29372 Nathan Cutler
07:19 AM Bug #40410: ceph pg query Segmentation fault in 12.2.10
Brad Hubbard wrote:
> could you please install the ceph-debuginfo and valgrind packages and then run the following c...
qingbo han
05:57 AM Bug #40410: ceph pg query Segmentation fault in 12.2.10
could you please install the ceph-debuginfo and valgrind packages and then run the following command?... Brad Hubbard
05:15 AM Bug #40451 (Pending Backport): osd/PG.cc: 2410: FAILED ceph_assert(scrub_queued)
Kefu Chai
05:06 AM Bug #40533 (Fix Under Review): thrashosds/test_pool_min_size races with radosbench tests
Kefu Chai
04:24 AM Bug #40533 (Resolved): thrashosds/test_pool_min_size races with radosbench tests
test_pool_min_size checks the existing pools and tries to update the number of alive osds to exercise the feature of re... Kefu Chai
12:10 AM Bug #40530 (Resolved): Scrub reserves from actingbackfill but waits for acting

Scrub sends requests to actingbackfill shards, but waits for only acting.size() grants. This happens when a PG is ...
David Zafman
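
The counting mismatch described in this entry can be sketched as below. This is a minimal illustration, not Ceph's code: the shard names, the `scrub_may_start` helper, and the assumption that actingbackfill holds one extra backfill target are all hypothetical.

```python
# Hypothetical sketch of the mismatch: requests go to every shard in
# actingbackfill, but the wait condition only counts acting.size() grants.

def scrub_may_start(grants, expected_count):
    """Return True once enough reservation grants have arrived."""
    return len(grants) >= expected_count


# Assumption for illustration: one backfill target on top of the acting set.
acting = ["osd.0", "osd.1", "osd.2"]
actingbackfill = acting + ["osd.3"]

requests_sent = list(actingbackfill)      # a request goes to every shard
grants = requests_sent[:len(acting)]      # only the first three have replied

# Waiting for acting.size() grants lets scrub start with a grant outstanding.
early_start = scrub_may_start(grants, len(acting))

# Waiting for as many grants as requests were sent keeps the counts consistent.
consistent_start = scrub_may_start(grants, len(actingbackfill))

pending = [shard for shard in requests_sent if shard not in grants]
print(early_start, consistent_start, pending)
```

In this sketch the request count and the wait count only agree when both are derived from the same set of shards.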

06/24/2019

10:32 PM Bug #38358: short pg log + cache tier ceph_test_rados out of order reply
Avoiding this in the QA suite as of this PR: https://github.com/ceph/ceph/pull/28658 Sage Weil
07:12 PM Documentation #40488: Describe in documentation that EC can't recover below min_size pre-octopus
I think it's important to convey the message that EC pools are not able to recover when below min_size, when running ... Torben Hørup
06:58 PM Documentation #40488: Describe in documentation that EC can't recover below min_size pre-octopus
I believe adding documentation to the master docs should be enough to address this tracker issue.
Torben Hørup: ma...
Neha Ojha
06:48 PM Documentation #40488: Describe in documentation that EC can't recover below min_size pre-octopus
The fix doesn't seem to be backported according to Neha Ojha https://github.com/ceph/ceph/pull/17619#issuecomment-505... Torben Hørup
06:26 PM Bug #40468 (Rejected): mon: assert on remote state in Paxos::dispatch can fail
Bug in branch, not upstream. Greg Farnum
05:42 PM Feature #40528 (Resolved): Better default value for osd_snap_trim_sleep
Currently, this value is set to 0 by default, which is not very helpful.
We should make the default emulate "osd_de...
Neha Ojha
03:32 PM Bug #40527 (Need More Info): merge_from: both pgs incomplete but incorrectly uses the source
this one has me scratching my head...... Sage Weil
02:54 PM Bug #40522 (Can't reproduce): on_local_recover doesn't touch?
... Sage Weil
02:47 PM Bug #40521 (Can't reproduce): cli timeout (e.g., ceph pg dump)
Client gets kicked from the first mon, reconnects to a second one, but doesn't complete the command.... Sage Weil
10:04 AM Backport #40504 (Resolved): nautilus: osd: rollforward may need to mark pglog dirty
https://github.com/ceph/ceph/pull/31034 Nathan Cutler
10:04 AM Backport #40503 (Resolved): mimic: osd: rollforward may need to mark pglog dirty
https://github.com/ceph/ceph/pull/31035 Nathan Cutler
10:04 AM Backport #40502 (Resolved): luminous: osd: rollforward may need to mark pglog dirty
https://github.com/ceph/ceph/pull/31036 Nathan Cutler
07:07 AM Bug #40410: ceph pg query Segmentation fault in 12.2.10
The output of "ceph report" is in the attachment qingbo han
05:23 AM Bug #40410: ceph pg query Segmentation fault in 12.2.10
Thanks for that. Could you attach the output of "ceph report" please? Brad Hubbard
03:50 AM Bug #38219: rebuild-mondb hangs
@Kefu Chai I don't understand why osdmap/first_committed and osdmap/last_committed = 1?
It already set osdmap/l...
huang jun

06/22/2019

10:22 AM Documentation #40488 (Resolved): Describe in documentation that EC can't recover below min_size p...
All versions previous to this fix https://tracker.ceph.com/issues/18749 are unable to recover when below min size.
...
Torben Hørup
10:19 AM Bug #18749: OSD: allow EC PGs to do recovery below min_size
Will this be backported? Torben Hørup
04:59 AM Bug #18749 (Resolved): OSD: allow EC PGs to do recovery below min_size
Kefu Chai

06/21/2019

11:27 AM Bug #40483 (Resolved): Pool settings aren't populated to OSD after restart.
In a vstart-ed cluster, kill an OSD, then restart it using ceph-osd -i N
The restarted OSD doesn't observe pool settings (e.g. blu...
Igor Fedotov

06/20/2019

09:38 PM Bug #40468 (Fix Under Review): mon: assert on remote state in Paxos::dispatch can fail
https://github.com/ceph/ceph/pull/28680 Greg Farnum
06:08 PM Bug #40468 (Rejected): mon: assert on remote state in Paxos::dispatch can fail
Paxos.cc::dispatch, line 1426 in current master:... Greg Farnum
10:04 AM Bug #38664 (Resolved): crush: choose_args array size mis-sized when weight-sets are enabled
Nathan Cutler
10:04 AM Backport #38719 (Resolved): luminous: crush: choose_args array size mis-sized when weight-sets ar...
Nathan Cutler
12:13 AM Backport #38719: luminous: crush: choose_args array size mis-sized when weight-sets are enabled
Prashant D wrote:
> https://github.com/ceph/ceph/pull/27085
merged
Yuri Weinstein
10:04 AM Bug #39284 (Resolved): ceph-objectstore-tool rename dump-import to dump-export
Nathan Cutler
10:03 AM Backport #39343 (Resolved): luminous: ceph-objectstore-tool rename dump-import to dump-export
Nathan Cutler
12:09 AM Backport #39343: luminous: ceph-objectstore-tool rename dump-import to dump-export
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/27636
merged
Yuri Weinstein
10:03 AM Bug #38381 (Resolved): Rados.get_fsid() returning bytes in python3
Nathan Cutler
10:03 AM Backport #38873 (Resolved): luminous: Rados.get_fsid() returning bytes in python3
Nathan Cutler
12:09 AM Backport #38873: luminous: Rados.get_fsid() returning bytes in python3
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/27674
merged
Yuri Weinstein
10:02 AM Bug #39023 (Resolved): osd/PGLog: preserve original_crt to check rollbackability
Nathan Cutler
10:02 AM Bug #38894 (Resolved): osd/PGLog.h: print olog_can_rollback_to before deciding to rollback
Nathan Cutler
10:02 AM Backport #39042 (Resolved): luminous: osd/PGLog: preserve original_crt to check rollbackability
Nathan Cutler
12:08 AM Backport #39042: luminous: osd/PGLog: preserve original_crt to check rollbackability
Prashant D wrote:
> https://github.com/ceph/ceph/pull/27715
merged
Yuri Weinstein
10:01 AM Backport #38905 (Resolved): luminous: osd/PGLog.h: print olog_can_rollback_to before deciding to ...
Nathan Cutler
12:08 AM Backport #38905: luminous: osd/PGLog.h: print olog_can_rollback_to before deciding to rollback
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/27715
merged
Yuri Weinstein
10:01 AM Bug #38945 (Resolved): osd: leaked pg refs on shutdown
Nathan Cutler
10:00 AM Backport #39205 (Resolved): nautilus: osd: leaked pg refs on shutdown
Nathan Cutler
10:00 AM Backport #39204 (Resolved): luminous: osd: leaked pg refs on shutdown
Nathan Cutler
12:07 AM Backport #39204: luminous: osd: leaked pg refs on shutdown
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/27810
merged
Yuri Weinstein
10:00 AM Bug #38784 (Resolved): osd: FAILED ceph_assert(attrs || !pg_log.get_missing().is_missing(soid) ||...
Nathan Cutler
09:59 AM Backport #39218 (Resolved): luminous: osd: FAILED ceph_assert(attrs || !pg_log.get_missing().is_m...
Nathan Cutler
12:06 AM Backport #39218: luminous: osd: FAILED ceph_assert(attrs || !pg_log.get_missing().is_missing(soid...
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/27878
merged
Yuri Weinstein
09:59 AM Bug #39353 (Resolved): Error message displayed when mon_osd_max_split_count would be exceeded is ...
Nathan Cutler
09:59 AM Backport #39563 (Resolved): luminous: Error message displayed when mon_osd_max_split_count would ...
Nathan Cutler
12:05 AM Backport #39563: luminous: Error message displayed when mon_osd_max_split_count would be exceeded...
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/27908
merged
Yuri Weinstein
09:51 AM Backport #39719 (Resolved): luminous: short pg log+nautilus-p2p-stress-split: "Error: finished ti...
Nathan Cutler
12:05 AM Backport #39719: luminous: short pg log+nautilus-p2p-stress-split: "Error: finished tid 3 when la...
David Zafman wrote:
> https://github.com/ceph/ceph/pull/28185
merged
Yuri Weinstein
09:50 AM Bug #40370 (Duplicate): ceph osd pool ls detail -f json doesn't show the pool id
Nathan Cutler
09:22 AM Bug #40403 (Pending Backport): osd: rollforward may need to mark pglog dirty
Kefu Chai
09:19 AM Backport #40465 (Resolved): nautilus: osd beacon sometimes has empty pg list
https://github.com/ceph/ceph/pull/29254 Nathan Cutler
09:19 AM Backport #40464 (Resolved): mimic: osd beacon sometimes has empty pg list
https://github.com/ceph/ceph/pull/29253 Nathan Cutler
04:05 AM Bug #40410: ceph pg query Segmentation fault in 12.2.10
qingbo han wrote:
> Brad Hubbard wrote:
> > Could you provide details of your OS and upload a debug log with debug_...
qingbo han
04:03 AM Bug #40410: ceph pg query Segmentation fault in 12.2.10
Brad Hubbard wrote:
> Could you provide details of your OS and upload a debug log with debug_osd=20 and a coredump? ...
qingbo han
12:07 AM Backport #39431: luminous: Degraded PG does not discover remapped data on originating OSD
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/27751
merged
Yuri Weinstein

06/19/2019

09:16 PM Bug #40287: OSDMonitor: missing `pool_id` field in `osd pool ls` command
This is a duplicate of http://tracker.ceph.com/issues/40370. Neha Ojha
09:12 PM Bug #40388 (Can't reproduce): Mimic: osd crashes during hit_set_trim and hit_set_remove_all if hi...
Neha Ojha
11:05 AM Bug #40388: Mimic: osd crashes during hit_set_trim and hit_set_remove_all if hit set object doesn...
I'm afraid I can no longer reproduce this problem or provide debug logs and a core dump, since I have removed the entire cac... Lazuardi Nasution
12:05 AM Bug #40388: Mimic: osd crashes during hit_set_trim and hit_set_remove_all if hit set object doesn...
Can you upload a coredump as well as details of your OS and a log with debug_osd=20 set please? If the files are larg... Brad Hubbard
09:07 PM Bug #40377 (Pending Backport): osd beacon sometimes has empty pg list
Neha Ojha
09:06 PM Bug #40454: snap_mapper error, scrub gets r -2..repaired
Neha reports this hasn't been seen in master in a while; is there a reason you think it's not a new-in-branch bug? Greg Farnum
07:58 PM Bug #40454: snap_mapper error, scrub gets r -2..repaired
Also note there are a lot of objects in this PG that are similarly affected:... Sage Weil
07:57 PM Bug #40454 (Can't reproduce): snap_mapper error, scrub gets r -2..repaired
... Sage Weil
07:26 PM Bug #39581: osd/PG.cc: 2523: FAILED ceph_assert(scrub_queued)
This was probably caused by 40f71cda0ed4fe78dbcbc1ba73f0cc973bf5c415
osd/: move start_peering_interval and callees ...
David Zafman
07:23 PM Bug #39581 (Duplicate): osd/PG.cc: 2523: FAILED ceph_assert(scrub_queued)
Duplicate of https://tracker.ceph.com/issues/40451 Neha Ojha
06:56 PM Bug #39581: osd/PG.cc: 2523: FAILED ceph_assert(scrub_queued)
/a/nojha-2019-06-17_20:20:02-rados-wip-ec-below-min-size-2019-06-17-distro-basic-smithi/4043727 Neha Ojha
07:21 PM Bug #40451 (Fix Under Review): osd/PG.cc: 2410: FAILED ceph_assert(scrub_queued)
https://github.com/ceph/ceph/pull/28660 Sage Weil
07:14 PM Bug #40451: osd/PG.cc: 2410: FAILED ceph_assert(scrub_queued)
It looks to me like this happened as a side-effect of unblocking the op.... Sage Weil
07:13 PM Bug #40451 (Resolved): osd/PG.cc: 2410: FAILED ceph_assert(scrub_queued)
... Sage Weil
06:59 PM Backport #40192 (Resolved): nautilus: Rados.get_fsid() returning bytes in python3
Jason Dillaman
12:34 PM Bug #23857: flush (manifest) vs async recovery causes out of order op
... Kefu Chai
08:16 AM Feature #40419: [RFE] Estimated remaining time on recovery?
I know it probably won't be accurate, just like any other time-remaining console. However, this is still valuable,... Sébastien Han
03:18 AM Backport #38276 (In Progress): luminous: osd_map_message_max default is too high?
Kefu Chai
01:52 AM Bug #40421 (New): osd: lost op?
Could use some help figuring out what happened here.
MDS got stuck in up:replay because it didn't get a reply to t...
Patrick Donnelly
12:00 AM Bug #40408: OSD:crashed with Caught signal (Aborted) in shutdown
... Brad Hubbard

06/18/2019

11:52 PM Bug #40410: ceph pg query Segmentation fault in 12.2.10
Could you provide details of your OS and upload a debug log with debug_osd=20 and a coredump? You can use http://docs... Brad Hubbard
11:47 AM Bug #40410 (New): ceph pg query Segmentation fault in 12.2.10
I ran ceph pg 16.7ff query on luminous 12.2.10, and it always segfaults.
I ran this command under gdb, the stack is as follo...
qingbo han
10:15 PM Feature #40420 (Resolved): Introduce a ceph.conf option to disable HEALTH_WARN when nodeep-scrub...
RHBZ - https://bugzilla.redhat.com/show_bug.cgi?id=1721703 Vikhyat Umrao
09:41 PM Feature #40419 (Resolved): [RFE] Estimated remaining time on recovery?
It'd be nice (although hard to be accurate) to give an estimated remaining time for an on-going recovery.
We have a ...
Sébastien Han
05:53 PM Bug #40403: osd: rollforward may need to mark pglog dirty
Second PR: https://github.com/ceph/ceph/pull/28621 Neha Ojha
04:17 PM Backport #40192: nautilus: Rados.get_fsid() returning bytes in python3
Jason Dillaman wrote:
> https://github.com/ceph/ceph/pull/28476
merged
Yuri Weinstein
03:08 PM Feature #23493: config: strip/escape single-quotes in values when setting them via conf file/assi...
I think https://github.com/ceph/ceph/pull/28634 is a small step in the right direction.... Kefu Chai
08:39 AM Bug #40408 (New): OSD:crashed with Caught signal (Aborted) in shutdown
#0 0x00007ff669a404ab in raise () from /lib64/libpthread.so.0
#1 0x00005606cc765d2a in reraise_fatal (signum=6)
...
tao ning
 
