Activity

From 05/22/2019 to 06/20/2019

06/20/2019

09:38 PM Bug #40468 (Fix Under Review): mon: assert on remote state in Paxos::dispatch can fail
https://github.com/ceph/ceph/pull/28680 Greg Farnum
06:08 PM Bug #40468 (Rejected): mon: assert on remote state in Paxos::dispatch can fail
Paxos.cc::dispatch, line 1426 in current master:... Greg Farnum
10:04 AM Bug #38664 (Resolved): crush: choose_args array size mis-sized when weight-sets are enabled
Nathan Cutler
10:04 AM Backport #38719 (Resolved): luminous: crush: choose_args array size mis-sized when weight-sets ar...
Nathan Cutler
12:13 AM Backport #38719: luminous: crush: choose_args array size mis-sized when weight-sets are enabled
Prashant D wrote:
> https://github.com/ceph/ceph/pull/27085
merged
Yuri Weinstein
10:04 AM Bug #39284 (Resolved): ceph-objectstore-tool rename dump-import to dump-export
Nathan Cutler
10:03 AM Backport #39343 (Resolved): luminous: ceph-objectstore-tool rename dump-import to dump-export
Nathan Cutler
12:09 AM Backport #39343: luminous: ceph-objectstore-tool rename dump-import to dump-export
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/27636
merged
Yuri Weinstein
10:03 AM Bug #38381 (Resolved): Rados.get_fsid() returning bytes in python3
Nathan Cutler
10:03 AM Backport #38873 (Resolved): luminous: Rados.get_fsid() returning bytes in python3
Nathan Cutler
12:09 AM Backport #38873: luminous: Rados.get_fsid() returning bytes in python3
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/27674
merged
Yuri Weinstein
10:02 AM Bug #39023 (Resolved): osd/PGLog: preserve original_crt to check rollbackability
Nathan Cutler
10:02 AM Bug #38894 (Resolved): osd/PGLog.h: print olog_can_rollback_to before deciding to rollback
Nathan Cutler
10:02 AM Backport #39042 (Resolved): luminous: osd/PGLog: preserve original_crt to check rollbackability
Nathan Cutler
12:08 AM Backport #39042: luminous: osd/PGLog: preserve original_crt to check rollbackability
Prashant D wrote:
> https://github.com/ceph/ceph/pull/27715
merged
Yuri Weinstein
10:01 AM Backport #38905 (Resolved): luminous: osd/PGLog.h: print olog_can_rollback_to before deciding to ...
Nathan Cutler
12:08 AM Backport #38905: luminous: osd/PGLog.h: print olog_can_rollback_to before deciding to rollback
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/27715
merged
Yuri Weinstein
10:01 AM Bug #38945 (Resolved): osd: leaked pg refs on shutdown
Nathan Cutler
10:00 AM Backport #39205 (Resolved): nautilus: osd: leaked pg refs on shutdown
Nathan Cutler
10:00 AM Backport #39204 (Resolved): luminous: osd: leaked pg refs on shutdown
Nathan Cutler
12:07 AM Backport #39204: luminous: osd: leaked pg refs on shutdown
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/27810
merged
Yuri Weinstein
10:00 AM Bug #38784 (Resolved): osd: FAILED ceph_assert(attrs || !pg_log.get_missing().is_missing(soid) ||...
Nathan Cutler
09:59 AM Backport #39218 (Resolved): luminous: osd: FAILED ceph_assert(attrs || !pg_log.get_missing().is_m...
Nathan Cutler
12:06 AM Backport #39218: luminous: osd: FAILED ceph_assert(attrs || !pg_log.get_missing().is_missing(soid...
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/27878
merged
Yuri Weinstein
09:59 AM Bug #39353 (Resolved): Error message displayed when mon_osd_max_split_count would be exceeded is ...
Nathan Cutler
09:59 AM Backport #39563 (Resolved): luminous: Error message displayed when mon_osd_max_split_count would ...
Nathan Cutler
12:05 AM Backport #39563: luminous: Error message displayed when mon_osd_max_split_count would be exceeded...
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/27908
merged
Yuri Weinstein
09:51 AM Backport #39719 (Resolved): luminous: short pg log+nautilus-p2p-stress-split: "Error: finished ti...
Nathan Cutler
12:05 AM Backport #39719: luminous: short pg log+nautilus-p2p-stress-split: "Error: finished tid 3 when la...
David Zafman wrote:
> https://github.com/ceph/ceph/pull/28185
merged
Yuri Weinstein
09:50 AM Bug #40370 (Duplicate): ceph osd pool ls detail -f json doesn't show the pool id
Nathan Cutler
09:22 AM Bug #40403 (Pending Backport): osd: rollforward may need to mark pglog dirty
Kefu Chai
09:19 AM Backport #40465 (Resolved): nautilus: osd beacon sometimes has empty pg list
https://github.com/ceph/ceph/pull/29254 Nathan Cutler
09:19 AM Backport #40464 (Resolved): mimic: osd beacon sometimes has empty pg list
https://github.com/ceph/ceph/pull/29253 Nathan Cutler
04:05 AM Bug #40410: ceph pg query Segmentation fault in 12.2.10
qingbo han wrote:
> Brad Hubbard wrote:
> > Could you provide details of your OS and upload a debug log with debug_...
qingbo han
04:03 AM Bug #40410: ceph pg query Segmentation fault in 12.2.10
Brad Hubbard wrote:
> Could you provide details of your OS and upload a debug log with debug_osd=20 and a coredump? ...
qingbo han
12:07 AM Backport #39431: luminous: Degraded PG does not discover remapped data on originating OSD
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/27751
merged
Yuri Weinstein

06/19/2019

09:16 PM Bug #40287: OSDMonitor: missing `pool_id` field in `osd pool ls` command
This is a duplicate of http://tracker.ceph.com/issues/40370. Neha Ojha
09:12 PM Bug #40388 (Can't reproduce): Mimic: osd crashes during hit_set_trim and hit_set_remove_all if hi...
Neha Ojha
11:05 AM Bug #40388: Mimic: osd crashes during hit_set_trim and hit_set_remove_all if hit set object doesn...
I'm afraid I cannot replicate this problem and do debugging and core dump anymore since I have removed the entire cac... Lazuardi Nasution
12:05 AM Bug #40388: Mimic: osd crashes during hit_set_trim and hit_set_remove_all if hit set object doesn...
Can you upload a coredump as well as details of your OS and a log with debug_osd=20 set please? If the files are larg... Brad Hubbard
09:07 PM Bug #40377 (Pending Backport): osd beacon sometimes has empty pg list
Neha Ojha
09:06 PM Bug #40454: snap_mapper error, scrub gets r -2..repaired
Neha reports this hasn't been seen in master in a while; is there a reason you think it's not a new-in-branch bug? Greg Farnum
07:58 PM Bug #40454: snap_mapper error, scrub gets r -2..repaired
Also note there are a lot of objects in this PG that are similarly affected:... Sage Weil
07:57 PM Bug #40454 (Can't reproduce): snap_mapper error, scrub gets r -2..repaired
... Sage Weil
07:26 PM Bug #39581: osd/PG.cc: 2523: FAILED ceph_assert(scrub_queued)
This was probably caused by 40f71cda0ed4fe78dbcbc1ba73f0cc973bf5c415
osd/: move start_peering_interval and callees ...
David Zafman
07:23 PM Bug #39581 (Duplicate): osd/PG.cc: 2523: FAILED ceph_assert(scrub_queued)
Duplicate of https://tracker.ceph.com/issues/40451 Neha Ojha
06:56 PM Bug #39581: osd/PG.cc: 2523: FAILED ceph_assert(scrub_queued)
/a/nojha-2019-06-17_20:20:02-rados-wip-ec-below-min-size-2019-06-17-distro-basic-smithi/4043727 Neha Ojha
07:21 PM Bug #40451 (Fix Under Review): osd/PG.cc: 2410: FAILED ceph_assert(scrub_queued)
https://github.com/ceph/ceph/pull/28660 Sage Weil
07:14 PM Bug #40451: osd/PG.cc: 2410: FAILED ceph_assert(scrub_queued)
It looks to me like this happened as a side-effect of unblocking the op.... Sage Weil
07:13 PM Bug #40451 (Resolved): osd/PG.cc: 2410: FAILED ceph_assert(scrub_queued)
... Sage Weil
06:59 PM Backport #40192 (Resolved): nautilus: Rados.get_fsid() returning bytes in python3
Jason Dillaman
12:34 PM Bug #23857: flush (manifest) vs async recovery causes out of order op
... Kefu Chai
08:16 AM Feature #40419: [RFE] Estimated remaining time on recovery?
I know it probably won't be accurate, just like any other time-remaining console. However, this is still valuable,... Sébastien Han
03:18 AM Backport #38276 (In Progress): luminous: osd_map_message_max default is too high?
Kefu Chai
01:52 AM Bug #40421 (New): osd: lost op?
Could use some help figuring out what happened here.
MDS got stuck in up:replay because it didn't get a reply to t...
Patrick Donnelly
12:00 AM Bug #40408: OSD:crashed with Caught signal (Aborted) in shutdown
... Brad Hubbard

06/18/2019

11:52 PM Bug #40410: ceph pg query Segmentation fault in 12.2.10
Could you provide details of your OS and upload a debug log with debug_osd=20 and a coredump? You can use http://docs... Brad Hubbard
11:47 AM Bug #40410 (New): ceph pg query Segmentation fault in 12.2.10
I ran ceph pg 16.7ff query in luminous-12.2.10, and it always ends in a segmentation fault.
I ran this command under gdb; the stack is as follo...
qingbo han
10:15 PM Feature #40420 (Resolved): Introduce an ceph.conf option to disable HEALTH_WARN when nodeep-scrub...
RHBZ - https://bugzilla.redhat.com/show_bug.cgi?id=1721703 Vikhyat Umrao
09:41 PM Feature #40419 (Resolved): [RFE] Estimated remaining time on recovery?
It'd be nice (although hard to be accurate) to give an estimated remaining time for an on-going recovery.
We have a ...
Sébastien Han
05:53 PM Bug #40403: osd: rollforward may need to mark pglog dirty
Second PR: https://github.com/ceph/ceph/pull/28621 Neha Ojha
04:17 PM Backport #40192: nautilus: Rados.get_fsid() returning bytes in python3
Jason Dillaman wrote:
> https://github.com/ceph/ceph/pull/28476
merged
Yuri Weinstein
03:08 PM Feature #23493: config: strip/escape single-quotes in values when setting them via conf file/assi...
I think https://github.com/ceph/ceph/pull/28634 is a small step in the right direction.... Kefu Chai
08:39 AM Bug #40408 (New): OSD:crashed with Caught signal (Aborted) in shutdown
#0 0x00007ff669a404ab in raise () from /lib64/libpthread.so.0
#1 0x00005606cc765d2a in reraise_fatal (signum=6)
...
tao ning

06/17/2019

05:35 PM Bug #40403 (Resolved): osd: rollforward may need to mark pglog dirty
This is an improvement over http://tracker.ceph.com/issues/36739
https://github.com/ceph/ceph/pull/27015 - merged
...
Neha Ojha
01:26 PM Bug #18749: OSD: allow EC PGs to do recovery below min_size
I think it would make sense to create a note on this recovery limitation in the documentation http://docs.ceph.com/do... Torben Hørup

06/16/2019

05:56 PM Bug #40388: Mimic: osd crashes during hit_set_trim and hit_set_remove_all if hit set object doesn...
Lazuardi Nasution wrote:
> Bug #19185 still happens on Mimic (v13.2.6). I had to remove the entire cache pool to have affec...
Lazuardi Nasution
05:44 PM Bug #40388 (Can't reproduce): Mimic: osd crashes during hit_set_trim and hit_set_remove_all if hi...
Bug #19185 still happens on Mimic (v13.2.6). I had to remove the entire cache pool to get the affected OSDs back to normal. It is ... Lazuardi Nasution

06/15/2019

09:53 AM Backport #40382 (In Progress): nautilus: RuntimeError: expected MON_CLOCK_SKEW but got none
Nathan Cutler
09:45 AM Backport #40382 (Resolved): nautilus: RuntimeError: expected MON_CLOCK_SKEW but got none
https://github.com/ceph/ceph/pull/28576 Nathan Cutler
09:51 AM Backport #39738 (Resolved): nautilus: Binary data in OSD log from "CRC header" message
Nathan Cutler

06/14/2019

09:23 PM Bug #40377 (Fix Under Review): osd beacon sometimes has empty pg list
https://github.com/ceph/ceph/pull/28566 Sage Weil
09:22 PM Bug #40377 (Resolved): osd beacon sometimes has empty pg list
from a user,... Sage Weil
08:22 PM Bug #40376 (Closed): MON is not processing requests
Rafal Wadolowski wrote:
> It looks like the reply is needed like in this bug https://github.com/ceph/ceph/pull/20467...
Greg Farnum
08:01 PM Bug #40376: MON is not processing requests
It looks like the reply is needed like in this bug https://github.com/ceph/ceph/pull/20467 . Am I right? Rafal Wadolowski
07:48 PM Bug #40376 (Closed): MON is not processing requests
The monitor doesn't consume operations in its queue. On the peon they are stuck.

Function responsible for that is located a...
Rafal Wadolowski
07:28 PM Bug #40370 (Fix Under Review): ceph osd pool ls detail -f json doesn't show the pool id
Neha Ojha
05:11 PM Bug #40370 (Duplicate): ceph osd pool ls detail -f json doesn't show the pool id
"ceph osd pool ls detail" shows the pool id only in the plain text format, not when used with -f json Paul Emmerich
05:06 PM Bug #38893 (Pending Backport): RuntimeError: expected MON_CLOCK_SKEW but got none
https://github.com/ceph/ceph/pull/28353 Neha Ojha
04:55 PM Backport #39738: nautilus: Binary data in OSD log from "CRC header" message
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/28504
merged
Yuri Weinstein
02:52 PM Bug #40367 (Can't reproduce): "*** Caught signal (Segmentation fault) **" in upgrade:luminous-x-n...
Run: http://pulpito.ceph.com/teuthology-2019-06-14_02:25:02-upgrade:luminous-x-nautilus-distro-basic-smithi/
Job: 40...
Yuri Weinstein
02:50 PM Bug #40366 (New): "MaxWhileTries: 'wait_until_healthy'" in upgrade:luminous-x-nautilus
Run: http://pulpito.ceph.com/teuthology-2019-06-14_02:25:02-upgrade:luminous-x-nautilus-distro-basic-smithi/
Jobs: '...
Yuri Weinstein
08:34 AM Bug #39555: backfill_toofull while OSDs are not full (Unneccessary HEALTH_ERR)
In our case this issue surfaces on a relatively empty cluster (top used OSDs are around 35%) during a small rebalance... Maks Kowalik
02:29 AM Backport #39744 (In Progress): mimic: mon: "FAILED assert(pending_finishers.empty())" when paxos ...
https://github.com/ceph/ceph/pull/28540 Prashant D

06/13/2019

02:44 PM Bug #40070: mon/OSDMonitor: target_size_bytes integer overflow
This was resolved by setting "ceph osd require-osd-release nautilus". Nathan Fish
10:57 AM Backport #39743 (In Progress): nautilus: mon: "FAILED assert(pending_finishers.empty())" when pax...
https://github.com/ceph/ceph/pull/28528 Prashant D
10:24 AM Backport #40322 (Resolved): nautilus: nautilus with requrie_osd_release < nautilus cannot increas...
https://github.com/ceph/ceph/pull/29671 Nathan Cutler
07:40 AM Bug #40294: librados mon_command json parser int/float type problem
Greg Farnum wrote:
> What's the failure mode? What version are you running against?
ceph version is 14.2.1
i a...
Dominik Csapak

06/12/2019

09:18 PM Bug #40193: Changing pg_num and other pool settings are ignored
Also check your cluster health — according to the monitor it's got "736 pgs creating" which is unusual, and probably ... Greg Farnum
09:15 PM Bug #40193 (Duplicate): Changing pg_num and other pool settings are ignored
Josh Durgin
09:14 PM Bug #39570 (Pending Backport): nautilus with requrie_osd_release < nautilus cannot increase pg_num
Neha Ojha
09:10 PM Bug #40245: filestore::read() does not assert on EIO
Oh, this originated in the Red Hat tracker: https://bugzilla.redhat.com/show_bug.cgi?id=1682967
More discussion ha...
Greg Farnum
09:07 PM Bug #40294: librados mon_command json parser int/float type problem
What's the failure mode? What version are you running against? Greg Farnum
10:26 AM Bug #40294 (New): librados mon_command json parser int/float type problem
For some commands there is a parameter of type float,
e.g. for 'osd reweight' there is the parameter 'weight' of type...
Dominik Csapak
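As a small illustration of the int/float distinction this report refers to: in Python's standard `json` module, an integer and a float of equal value serialize to different text, so a command payload built with `weight=1` is not identical to one built with `weight=1.0`. The payload shape (`prefix`, `id`, `weight`) and the helper `build_command` are assumptions made for illustration only; the report's actual failure mode is truncated in this log and is not reproduced here.

```python
import json


def build_command(weight):
    """Serialize a hypothetical 'osd reweight' command payload as JSON text."""
    # Assumed payload shape, for illustration; not taken from the report.
    payload = {"prefix": "osd reweight", "id": 0, "weight": weight}
    return json.dumps(payload, sort_keys=True)


# Same numeric value, different JSON types on the wire.
as_int = build_command(1)
as_float = build_command(float(1))

print(as_int)
print(as_float)
```

Passing an explicit `float(...)` on the client side is one way to keep the serialized type stable, whatever the receiving parser expects.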
10:14 AM Bug #39164 (Resolved): "sudo yum -y install python34-cephfs" fails on mimic
Nathan Cutler
10:14 AM Backport #39236 (Resolved): nautilus: "sudo yum -y install python34-cephfs" fails on mimic
Nathan Cutler
10:12 AM Backport #39239 (Resolved): luminous: "sudo yum -y install python34-cephfs" fails on mimic
Nathan Cutler
09:42 AM Backport #39420 (Resolved): luminous: Don't mark removed osds in when running "ceph osd in any|al...
Nathan Cutler
02:56 AM Backport #39738 (In Progress): nautilus: Binary data in OSD log from "CRC header" message
https://github.com/ceph/ceph/pull/28504 Prashant D
02:37 AM Bug #40287 (Fix Under Review): OSDMonitor: missing `pool_id` field in `osd pool ls` command
Chang Liu
02:36 AM Bug #40287 (Resolved): OSDMonitor: missing `pool_id` field in `osd pool ls` command
Chang Liu
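A minimal sketch of how a script might detect the missing field described in this report, assuming the JSON output is a list of per-pool objects and that the key is named `pool_id` as in the bug title. The helper `pools_missing_id`, the `pool_name` key, and the sample data are hypothetical and do not come from a real cluster.

```python
import json


def pools_missing_id(json_text, key="pool_id"):
    """Return the names of pool entries that lack `key`."""
    pools = json.loads(json_text)
    # "pool_name" is an assumed key; entries without it are reported as "<unnamed>".
    return [p.get("pool_name", "<unnamed>") for p in pools if key not in p]


# Hypothetical sample standing in for the JSON output of the command.
sample = json.dumps([
    {"pool_name": "rbd", "pool_id": 1, "size": 3},
    {"pool_name": "cephfs_data", "size": 3},
])

print(pools_missing_id(sample))
```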
02:03 AM Backport #39737 (In Progress): mimic: Binary data in OSD log from "CRC header" message
https://github.com/ceph/ceph/pull/28503 Prashant D

06/11/2019

11:09 PM Backport #39239: luminous: "sudo yum -y install python34-cephfs" fails on mimic
@kefu feel free to resolve, thx Yuri Weinstein
03:49 PM Backport #39239 (In Progress): luminous: "sudo yum -y install python34-cephfs" fails on mimic
Kefu Chai
10:27 PM Bug #39150: mon: "FAILED ceph_assert(session_map.sessions.empty())" when out of quorum
/a/yuriw-2019-06-07_19:41:42-rados-wip-yuri4-testing-2019-06-07-1600-nautilus-distro-basic-smithi/4012630/ Neha Ojha
08:24 PM Bug #40282: set_mon_vals failed to set
The workaround is to move cluster_network and public_network into ceph.conf; then we don't get spam in the logs.
Regards
Manuel Rios
08:21 PM Bug #40282: set_mon_vals failed to set
Image of log: https://gyazo.com/5b16e7241528ad971ad74baa9914492f Manuel Rios
08:20 PM Bug #40282 (New): set_mon_vals failed to set
Hi,
We use centralized ceph management to maintain our conf.
ceph version 14.2.1 (d555a9489eb35f84f2e1ef49b77e1...
Manuel Rios
08:16 PM Backport #40274 (Resolved): nautilus: librados 'buffer::create' and related functions are not exp...
https://github.com/ceph/ceph/pull/29244 Nathan Cutler
08:15 PM Backport #40265 (Resolved): nautilus: Setting noscrub causing extraneous deep scrubs
https://github.com/ceph/ceph/pull/28768 Nathan Cutler
02:02 PM Bug #40193: Changing pg_num and other pool settings are ignored
By the way, I tried rolling back to 14.2.0 (with existing cluster state) and also kernel 4.15. Neither did anything. Nathan Fish
09:59 AM Bug #22278 (Resolved): FreeBSD fails to build with WITH_SPDK=ON
it's in 40da9ab5e360d84cd5bb2b705a72eaeb066628cc of SPDK Kefu Chai
06:03 AM Bug #40198 (Pending Backport): Setting noscrub causing extraneous deep scrubs
Kefu Chai
05:48 AM Bug #39972 (Pending Backport): librados 'buffer::create' and related functions are not exported i...
Kefu Chai

06/10/2019

10:07 PM Backport #39420: luminous: Don't mark removed osds in when running "ceph osd in any|all|*"
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/27728
merged
Yuri Weinstein
10:00 PM Bug #40250 (New): luminous: ""'"'(defer backfill|defer recovery)'"'"' /var/log/ceph/ceph-osd.*.lo...
Runs:
http://pulpito.front.sepia.ceph.com/yuriw-2019-06-06_14:54:17-rados-wip-yuri-testing-2019-06-05-2303-luminous-...
Yuri Weinstein
08:31 PM Bug #40245 (New): filestore::read() does not assert on EIO
Greg Farnum
05:42 PM Bug #40245: filestore::read() does not assert on EIO
Maybe we should add a comment to the code there explaining why m_filestore_fail_eio is not checked and we always retu... David Zafman
05:22 PM Bug #40245 (Fix Under Review): filestore::read() does not assert on EIO
Greg Farnum
05:14 PM Bug #40245 (Won't Fix): filestore::read() does not assert on EIO
... Greg Farnum
04:02 PM Backport #40192 (In Progress): nautilus: Rados.get_fsid() returning bytes in python3
Nathan Cutler
10:27 AM Backport #40228 (Resolved): nautilus: mon: rados/multimon tests fail with clock skew
https://github.com/ceph/ceph/pull/28576 Nathan Cutler
10:22 AM Backport #39476 (Resolved): nautilus: segv in fgets() in collect_sys_info reading /proc/cpuinfo
Nathan Cutler

06/07/2019

03:26 AM Bug #40198 (In Progress): Setting noscrub causing extraneous deep scrubs
David Zafman
03:00 AM Bug #40198 (Resolved): Setting noscrub causing extraneous deep scrubs

ceph osd set noscrub
Wait 1 day or ceph --admin-daemon primary.asok trigger_scrub PGID...
David Zafman

06/06/2019

07:43 PM Bug #40193 (Duplicate): Changing pg_num and other pool settings are ignored
... Nathan Fish
05:16 PM Bug #36739: ENOENT in collection_move_rename on EC backfill target
https://github.com/ceph/ceph/pull/27015 (more complete fix) merged Sage Weil
05:15 PM Bug #20491 (Resolved): objecter leaked OSDMap in handle_osd_map
Sage Weil
03:52 PM Backport #40192 (Resolved): nautilus: Rados.get_fsid() returning bytes in python3
https://github.com/ceph/ceph/pull/28476 Jason Dillaman
12:22 AM Bug #39997: not able to create osd keyring
I am using ceph 13.2 (mimic) pooja gupta

06/05/2019

09:46 PM Backport #40180 (Resolved): nautilus: qa/standalone/scrub/osd-scrub-snaps.sh sometimes fails
https://github.com/ceph/ceph/pull/29252 David Zafman
09:46 PM Backport #40179 (Resolved): mimic: qa/standalone/scrub/osd-scrub-snaps.sh sometimes fails
https://github.com/ceph/ceph/pull/29251 David Zafman
09:43 PM Bug #40078 (Pending Backport): qa/standalone/scrub/osd-scrub-snaps.sh sometimes fails
David Zafman
09:11 PM Bug #40078 (In Progress): qa/standalone/scrub/osd-scrub-snaps.sh sometimes fails
Neha Ojha
09:23 PM Bug #39665 (Fix Under Review): kstore: memory may leak on KStore::_do_read_stripe
Neha Ojha
09:19 PM Bug #39997: not able to create osd keyring
This question is more relevant to the ceph-users mailing list, perhaps with more information about which version you ... Neha Ojha
09:03 PM Bug #40081 (In Progress): mon: luminous crash attempting to decode maps after nautilus quorum has...
Neha Ojha
07:55 PM Backport #39476: nautilus: segv in fgets() in collect_sys_info reading /proc/cpuinfo
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/28141
merged
Yuri Weinstein
03:22 PM Bug #40112 (Pending Backport): mon: rados/multimon tests fail with clock skew
Sage Weil
03:17 AM Bug #38403 (Fix Under Review): osd: leaked from OSDMap::apply_incremental
... Kefu Chai
02:58 AM Bug #38403: osd: leaked from OSDMap::apply_incremental
/a/kchai-2019-06-04_14:23:17-rados-wip-kefu-testing-2019-06-01-2346-distro-basic-smithi/4004812/ Kefu Chai
02:06 AM Bug #40154: nautilus: failed to become clean before timeout expired

With osd_max_backfills defaulting to 1 and all recovery targeting OSD.1, all recovery is waiting behind PG 2.a to finis...
David Zafman

06/04/2019

08:28 PM Bug #40154 (New): nautilus: failed to become clean before timeout expired
... Neha Ojha
02:58 AM Bug #36405 (Resolved): unittest_seastar_messenger failure on ARM
Kefu Chai
02:55 AM Bug #39997: not able to create osd keyring
When I tried to run the following command
osd create <uuid> <osd no> --no-mon-config
keyring is generated
Cou...
pooja gupta
01:58 AM Bug #40119 (New): api_tier_pp hung causing a dead job

http://pulpito.ceph.com/dzafman-2019-05-31_07:47:29-rados-wip-zafman-testing-distro-basic-smithi/3992631...
David Zafman

06/03/2019

09:06 PM Support #40103: ceph monitor cannot start
The ceph-users@ceph.com mailing list is a more reliable way to get help on issues like this. Looks like the OSDMap ha... Greg Farnum
09:00 PM Bug #40117 (Duplicate): PG stuck in WaitActingChange
osd.9 requests a switch to acting set=[5] from [9,5] which never shows up. The teuthology test hangs waiting for tha... Samuel Just
08:44 PM Bug #39282 (Resolved): EIO from process_copy_chunk_manifest
Sage Weil
03:48 PM Bug #40112 (Resolved): mon: rados/multimon tests fail with clock skew
See
http://pulpito.ceph.com/sage-2019-05-30_21:14:09-rados:multimon-master-distro-basic-smithi/
or
http://p...
Sage Weil
02:06 PM Bug #39115: ceph pg repair doesn't fix itself if osd is bluestore
See #39116 for the stack trace.
I initially thought that this and the other issue were two separate problems. How...
Iain Buclaw

06/01/2019

10:23 AM Backport #38850 (Resolved): upgrade: 1 nautilus mon + 1 luminous mon can't automatically form quorum
Nathan Cutler
12:36 AM Bug #39175: RGW DELETE calls partially missed shortly after OSD startup
I'd say the odds are high migrating the bucket indexes to bluestore would fix it - the omap structure there is very s... Josh Durgin

05/31/2019

10:32 PM Bug #39115: ceph pg repair doesn't fix itself if osd is bluestore
Since OSDs are crashing we should get stack traces out of the logs (e.g osd.9). Per http://tracker.ceph.com/issues/39... David Zafman
08:19 PM Backport #38850: upgrade: 1 nautilus mon + 1 luminous mon can't automatically form quorum
Joao Eduardo Luis wrote:
> backport PR to nautilus: https://github.com/ceph/ceph/pull/28262
merged
Yuri Weinstein
07:38 PM Bug #39175: RGW DELETE calls partially missed shortly after OSD startup
We are planning on migrating all of our clusters to BlueStore, but that's going to take the rest of the year. We cou... Bryan Stillwell
07:08 PM Support #40103 (New): ceph monitor cannot start
I have a ceph cluster that has been running for over 2 years, and the monitor began crashing yesterday. I had some flapping OSDs up a... JIANYU LI
01:11 AM Bug #40073 (In Progress): PG scrub stamps reset to 0.000000
David Zafman

05/30/2019

10:49 PM Bug #39175: RGW DELETE calls partially missed shortly after OSD startup
Ok, so this is a different bug then. Any chance you're planning on migrating to bluestore with part of one of the pro... Josh Durgin
09:39 PM Bug #39175: RGW DELETE calls partially missed shortly after OSD startup
Hey Josh,
We backfilled onto the SSDs by creating a new crush rule which just uses the ssd class and switching the...
Bryan Stillwell
02:49 PM Bug #40081: mon: luminous crash attempting to decode maps after nautilus quorum has been formed
-https://github.com/ceph/ceph/pull/28323- (closed; see Pull Request ID field for the real PR)
This actually has us...
Joao Eduardo Luis
10:39 AM Bug #40081 (Closed): mon: luminous crash attempting to decode maps after nautilus quorum has been...
While upgrading, we found a rather annoying corner case:
Assuming we start with 3 luminous ceph-mon, upgrading fro...
Joao Eduardo Luis
01:48 PM Backport #40084 (Resolved): nautilus: osd: Better error message when OSD count is less than osd_p...
https://github.com/ceph/ceph/pull/29992 Nathan Cutler
01:47 PM Backport #40083 (Resolved): mimic: osd: Better error message when OSD count is less than osd_pool...
https://github.com/ceph/ceph/pull/30180 Nathan Cutler
01:47 PM Backport #40082 (Resolved): luminous: osd: Better error message when OSD count is less than osd_p...
https://github.com/ceph/ceph/pull/30298 Nathan Cutler
01:29 PM Feature #38617 (Pending Backport): osd: Better error message when OSD count is less than osd_pool...
Kefu Chai
09:11 AM Backport #39699 (Resolved): nautilus: OSD down on snaptrim.
Nathan Cutler
05:00 AM Bug #23387 (Resolved): Building Ceph on armhf fails due to out-of-memory
I am resolving this issue, as quite a few (probably all) of the issues noted by Louwrentius have been addressed by Daniel... Kefu Chai
12:48 AM Bug #39723 (Duplicate): osd: valgrind Leak_DefinitelyLost
Greg Farnum

05/29/2019

10:34 PM Bug #39175: RGW DELETE calls partially missed shortly after OSD startup
Hey Bryan, Neha's out this week. I'd like to verify whether this could be the same bug we'd seen before (http://track... Josh Durgin
10:07 PM Backport #39699: nautilus: OSD down on snaptrim.
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/28203
merged
Yuri Weinstein
09:42 PM Bug #38827 (Fix Under Review): valgrind: UninitCondition in ceph::crypto::onwire::AES128GCM_OnWir...
https://github.com/ceph/ceph/pull/28305 Radoslaw Zarzynski
02:54 PM Bug #38827: valgrind: UninitCondition in ceph::crypto::onwire::AES128GCM_OnWireRxHandler::authent...
Second run (on slightly amended branch): http://pulpito.front.sepia.ceph.com/rzarzynski-2019-05-29_13:08:09-rgw-wip-b... Radoslaw Zarzynski
09:41 PM Bug #39723 (Fix Under Review): osd: valgrind Leak_DefinitelyLost
Greg Farnum
09:24 PM Bug #39723: osd: valgrind Leak_DefinitelyLost
Okay, simple osdmap pointer assignment snafu. Working on a quick PR. Greg Farnum
09:36 PM Bug #40073: PG scrub stamps reset to 0.000000

When auto repair is enabled a bug causes a regular scrub to reset time stamps which is only intended to happen when...
David Zafman
07:49 PM Bug #40073: PG scrub stamps reset to 0.000000
The similarity to #40066 is so striking I just had to mention it and create a "Relates to" link. Nathan Cutler
06:47 PM Bug #40073: PG scrub stamps reset to 0.000000
A full pg query:... Greg Farnum
06:47 PM Bug #40073 (Resolved): PG scrub stamps reset to 0.000000
From Ceph-users, https://www.spinics.net/lists/ceph-users/msg52869.html
After upgrading from 14.2.0 to 14.2.1, I'v...
Greg Farnum
09:02 PM Bug #40078 (Resolved): qa/standalone/scrub/osd-scrub-snaps.sh sometimes fails

yuriw-2019-05-16_23:32:37-rados-mimic_v13.2.6_QE-distro-basic-smithi/3959865
Command failed (workunit test scrub...
David Zafman
05:39 PM Bug #40070: mon/OSDMonitor: target_size_bytes integer overflow
This worked fine for me in an earlier version of this cluster, which was running 14.2.0. But it's possible things oth... Nathan Fish
05:37 PM Bug #40070 (Rejected): mon/OSDMonitor: target_size_bytes integer overflow
Nautilus 14.2.1 on Ubuntu 18.04 LTS, kernel 4.18 (HWE)
It appears that the "target_size_bytes" setting has an inte...
Nathan Fish
11:26 AM Backport #39375 (Resolved): nautilus: ceph tell osd.xx bench help : gives wrong help
Nathan Cutler
11:25 AM Backport #39421 (Resolved): nautilus: Don't mark removed osds in when running "ceph osd in any|al...
Nathan Cutler
11:25 AM Backport #39721 (Resolved): nautilus: short pg log+nautilus-p2p-stress-split: "Error: finished ti...
Nathan Cutler
11:25 AM Bug #39441 (Resolved): osd acting cycle
Nathan Cutler
11:25 AM Backport #39512 (Resolved): nautilus: osd acting cycle
Nathan Cutler
11:24 AM Backport #39514 (Resolved): nautilus: osd: segv in _preboot -> heartbeat
Nathan Cutler
11:24 AM Backport #39519 (Resolved): nautilus: snaps missing in mapper, should be: ca was r -2...repaired
Nathan Cutler
11:23 AM Backport #39539 (Resolved): nautilus: osd/ReplicatedBackend.cc: 1321: FAILED assert(get_parent()-...
Nathan Cutler
11:23 AM Backport #39043 (Resolved): nautilus: osd/PGLog: preserve original_crt to check rollbackability
Nathan Cutler
11:21 AM Backport #39432 (Resolved): nautilus: Degraded PG does not discover remapped data on originating OSD
Nathan Cutler
11:21 AM Bug #39263 (Resolved): rados/upgrade/nautilus-x-singleton: mon.c@1(electing).elector(11) Shutting...
Nathan Cutler
11:21 AM Backport #39419 (Resolved): nautilus: rados/upgrade/nautilus-x-singleton: mon.c@1(electing).elect...
Nathan Cutler
11:18 AM Backport #39219 (Resolved): nautilus: osd: FAILED ceph_assert(attrs || !pg_log.get_missing().is_m...
Nathan Cutler

05/28/2019

08:58 PM Bug #38827: valgrind: UninitCondition in ceph::crypto::onwire::AES128GCM_OnWireRxHandler::authent...
Scheduled a resurrected run for validation: http://pulpito.front.sepia.ceph.com/rzarzynski-2019-05-28_20:56:45-rgw-wi... Radoslaw Zarzynski
05:43 PM Bug #38827: valgrind: UninitCondition in ceph::crypto::onwire::AES128GCM_OnWireRxHandler::authent...
Changeset: https://github.com/ceph/ceph/compare/master...rzarzynski:wip-bug-38827. Radoslaw Zarzynski
05:13 PM Bug #38827: valgrind: UninitCondition in ceph::crypto::onwire::AES128GCM_OnWireRxHandler::authent...
This bug looks like a duplicate of http://tracker.ceph.com/issues/39449, which has been addressed with a pair ... Radoslaw Zarzynski
04:10 PM Backport #39375: nautilus: ceph tell osd.xx bench help : gives wrong help
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/28035
merged
Yuri Weinstein
04:10 PM Backport #39421: nautilus: Don't mark removed osds in when running "ceph osd in any|all|*"
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/28072
merged
Yuri Weinstein
04:09 PM Backport #39721: nautilus: short pg log+nautilus-p2p-stress-split: "Error: finished tid 3 when la...
David Zafman wrote:
> https://github.com/ceph/ceph/pull/28088
merged
Yuri Weinstein
04:08 PM Backport #39512: nautilus: osd acting cycle
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/28160
merged
Yuri Weinstein
04:08 PM Backport #39514: nautilus: osd: segv in _preboot -> heartbeat
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/28164
merged
Yuri Weinstein
04:07 PM Backport #39519: nautilus: snaps missing in mapper, should be: ca was r -2...repaired
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/28205
merged
Yuri Weinstein
04:07 PM Backport #39539: nautilus: osd/ReplicatedBackend.cc: 1321: FAILED assert(get_parent()->get_log()....
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/28219
merged
Yuri Weinstein
04:06 PM Backport #39043: nautilus: osd/PGLog: preserve original_crt to check rollbackability
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/27632
merged
Yuri Weinstein
04:04 PM Backport #39432: nautilus: Degraded PG does not discover remapped data on originating OSD
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/27744
merged
Yuri Weinstein
04:03 PM Backport #39419: nautilus: rados/upgrade/nautilus-x-singleton: mon.c@1(electing).elector(11) Shut...
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/27771
merged
Yuri Weinstein
04:03 PM Backport #39219: nautilus: osd: FAILED ceph_assert(attrs || !pg_log.get_missing().is_missing(soid...
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/27839
merged
Yuri Weinstein
03:29 PM Bug #39449 (Resolved): Uninit in EVP_DecryptFinal_ex on ceph::crypto::onwire::AES128GCM_OnWireRxH...
This has been backported with:
* https://github.com/ceph/ceph/pull/27320,
* https://github.com/ceph/ceph/pull/27321...
Radoslaw Zarzynski
10:18 AM Bug #40029: ceph-mon: Caught signal (Aborted) in (CrushWrapper::update_choose_args(CephContext*)+...
I removed osd.0 and osd.1 from host-247, and re-ran deployment of osds to host-371. Both got added successfully.
...
Iain Buclaw
09:44 AM Bug #40029: ceph-mon: Caught signal (Aborted) in (CrushWrapper::update_choose_args(CephContext*)+...
Attached logs of primary monitor with:
debug mon 10
debug ms 1
Started prior to osd-57 being added, and stopped ...
Iain Buclaw
09:41 AM Backport #38850: upgrade: 1 nautilus mon + 1 luminous mon can't automatically form quorum
backport PR to nautilus: https://github.com/ceph/ceph/pull/28262 Joao Eduardo Luis
03:48 AM Bug #40035 (New): smoke.sh failing in jenkins "make check" test randomly
... Kefu Chai
02:40 AM Backport #39538 (In Progress): mimic: osd/ReplicatedBackend.cc: 1321: FAILED assert(get_parent()-...
https://github.com/ceph/ceph/pull/28259 Prashant D

05/27/2019

04:22 PM Bug #40029: ceph-mon: Caught signal (Aborted) in (CrushWrapper::update_choose_args(CephContext*)+...
Happens on any host I create osd.57 on. Iain Buclaw
03:53 PM Bug #40029: ceph-mon: Caught signal (Aborted) in (CrushWrapper::update_choose_args(CephContext*)+...
Recreating the OSDs, it seems that the monitors consistently crash when creating osd.57. And they consistently recov... Iain Buclaw
03:21 PM Bug #40029 (Resolved): ceph-mon: Caught signal (Aborted) in (CrushWrapper::update_choose_args(Cep...
When adding a new osd, all primary monitors crashed.... Iain Buclaw

05/25/2019

08:44 PM Bug #39555: backfill_toofull while OSDs are not full (Unneccessary HEALTH_ERR)
Note: I've only seen this in a relatively busy and full environment with quite a few backfills going on. Rene Diepstraten
08:21 PM Bug #39555: backfill_toofull while OSDs are not full (Unneccessary HEALTH_ERR)
Thanks for the PR.
The problem itself seems to be caused as follows:
- A backfill starts to a set of OSDs
- One of th...
Rene Diepstraten

05/24/2019

08:23 PM Bug #20491 (Fix Under Review): objecter leaked OSDMap in handle_osd_map
https://github.com/ceph/ceph/pull/28242
I think we shouldn't backport the fix, as it might upset misbehaved (unloc...
Sage Weil
08:20 PM Bug #20491 (In Progress): objecter leaked OSDMap in handle_osd_map
... Sage Weil
08:45 AM Bug #36405: unittest_seastar_messenger failure on ARM
Another one:... Sebastian Wagner
12:27 AM Backport #39518 (In Progress): mimic: snaps missing in mapper, should be: ca was r -2...repaired
https://github.com/ceph/ceph/pull/28232 Prashant D

05/23/2019

10:30 PM Bug #39175: RGW DELETE calls partially missed shortly after OSD startup
Neha,
It was great meeting with you in Barcelona! I can't remember everything you wanted me to gather, but here's...
Bryan Stillwell
06:49 AM Backport #39513 (In Progress): mimic: osd: segv in _preboot -> heartbeat
https://github.com/ceph/ceph/pull/28220 Prashant D
03:33 AM Backport #39539 (In Progress): nautilus: osd/ReplicatedBackend.cc: 1321: FAILED assert(get_parent...
https://github.com/ceph/ceph/pull/28219 Prashant D
12:47 AM Bug #18643: SnapTrimmer: inconsistencies may lead to snaptrimmer hang
Do we still need to fix something here? https://github.com/ceph/ceph/pull/15635 at least sets a pg to snaptrim_error... David Zafman

05/22/2019

10:00 PM Bug #39555 (In Progress): backfill_toofull while OSDs are not full (Unneccessary HEALTH_ERR)
David Zafman
06:01 AM Bug #39555: backfill_toofull while OSDs are not full (Unneccessary HEALTH_ERR)
The pull request https://github.com/ceph/ceph/pull/28204 generates a warning with a better message.
health: HE...
David Zafman
02:27 PM Bug #40000 (New): osds do not bound xattrs and/or aggregate xattr data in pg log
Currently our cluster is in a HEALTH_ERR state with 4 PGs inactive (3 of which are "peering" and the 4th is "... Vaibhav Bhembre
02:11 PM Bug #39978: Adding OSD to Luminous Cluster will crash the active mon
Indeed the issue is related to adding a new host to the crush map.
I fixed it by manually adding the host to the cru...
Henry Spanka
09:32 AM Bug #39997 (New): not able to create osd keyring
I have set up 1 mon, 1 mgr, and two OSDs on a single node.
When I try to create the OSD keyring via the following command:...
pooja gupta
07:07 AM Backport #39475 (In Progress): mimic: segv in fgets() in collect_sys_info reading /proc/cpuinfo
https://github.com/ceph/ceph/pull/28206 Prashant D
07:06 AM Backport #39519 (In Progress): nautilus: snaps missing in mapper, should be: ca was r -2...repaired
https://github.com/ceph/ceph/pull/28205 Prashant D
06:50 AM Bug #24531: Mimic MONs have slow/long running ops
Joao sent this as a possible fix: https://github.com/ceph/ceph/pull/28177 Dan van der Ster
06:41 AM Bug #24531: Mimic MONs have slow/long running ops
The attached file contains the dump_historic_slow_ops output from three mons.
I deployed Ceph v13.2.5 with Rook in a Kubernetes cluster, I...
jun gong
 
