Activity
From 03/15/2019 to 04/13/2019
04/13/2019
- 07:11 PM Backport #38904 (Resolved): mimic: osd/PGLog.h: print olog_can_rollback_to before deciding to rol...
- 04:03 PM Backport #39237 (Resolved): mimic: "sudo yum -y install python34-cephfs" fails on mimic
- 12:40 PM Bug #39286 (Resolved): primary recovery local missing object did not update obc
- If not, the snapset in the local obc may be inconsistent, and then make_writeable()
will make mistakes.
04/12/2019
- 09:48 PM Bug #39263: rados/upgrade/nautilus-x-singleton: mon.c@1(electing).elector(11) Shutting down becau...
- /a/nojha-2019-04-11_19:53:24-rados-wip-parial-recovery-2019-04-11-distro-basic-smithi/3834700/
- 08:23 PM Backport #38904: mimic: osd/PGLog.h: print olog_can_rollback_to before deciding to rollback
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/27284
merged
- 08:11 PM Bug #39284 (In Progress): ceph-objectstore-tool rename dump-import to dump-export
- 07:01 PM Bug #39284 (Resolved): ceph-objectstore-tool rename dump-import to dump-export
dump-import is a stupid name for this command.
Treat dump-import as undocumented synonym for dump-export.
- 06:40 PM Bug #39281 (In Progress): object_stat_sum_t decode broken if given older version
- 04:58 PM Bug #39281 (Resolved): object_stat_sum_t decode broken if given older version
When the encode/decode for object_stat_sum_t went from version 19 to 20 the fast path wasn't updated....
- 05:46 PM Bug #39282 (Resolved): EIO from process_copy_chunk_manifest
- ...
- 03:14 PM Backport #38901 (Resolved): mimic: Minor rados related documentation fixes
- 03:00 PM Backport #39237: mimic: "sudo yum -y install python34-cephfs" fails on mimic
- Kefu Chai wrote:
> https://github.com/ceph/ceph/pull/27476
merged
- 01:10 PM Bug #39249: Some PGs stuck in active+remapped state
- @Mark: Which version of Mimic are you running?
- 01:04 PM Bug #39249: Some PGs stuck in active+remapped state
- ...
- 01:04 PM Bug #39249: Some PGs stuck in active+remapped state
- #3747 ?
- 12:26 PM Backport #38442 (In Progress): luminous: osd-markdown.sh can fail with CLI_DUP_COMMAND=1
- 12:21 PM Backport #39275 (In Progress): nautilus: osd-markdown.sh can fail with CLI_DUP_COMMAND=1
- 12:04 PM Backport #39275 (Resolved): nautilus: osd-markdown.sh can fail with CLI_DUP_COMMAND=1
- https://github.com/ceph/ceph/pull/27550
- 12:09 PM Backport #39271 (In Progress): nautilus: autoscale down can lead to max_pg_per_osd limit
- 12:03 PM Backport #39271 (Resolved): nautilus: autoscale down can lead to max_pg_per_osd limit
- https://github.com/ceph/ceph/pull/27547
- 11:57 AM Bug #38786 (Pending Backport): autoscale down can lead to max_pg_per_osd limit
- 11:55 AM Bug #38359 (Pending Backport): osd-markdown.sh can fail with CLI_DUP_COMMAND=1
- 09:53 AM Bug #39159 (Fix Under Review): qa: Fix ambiguous store_thrash thrash_store in mon_thrash.py
- 04:32 AM Bug #39099: Give recovery for inactive PGs a higher priority
- Checking acting.size() < pool.info.min_size is wrong. During recovery acting == up. So if active.size() < pool.info...
04/11/2019
- 08:40 PM Bug #38840 (In Progress): snaps missing in mapper, should be: ca was r -2...repaired
- 07:14 PM Bug #39263 (Resolved): rados/upgrade/nautilus-x-singleton: mon.c@1(electing).elector(11) Shutting...
- ...
- 04:50 PM Bug #21388 (Duplicate): inconsistent pg but repair does nothing reporting head data_digest != dat...
- This was merged to master Jul 31, 2018 in https://github.com/ceph/ceph/pull/23217 for a different tracker.
- 04:32 PM Bug #39099 (In Progress): Give recovery for inactive PGs a higher priority
- 12:36 PM Bug #39249: Some PGs stuck in active+remapped state
- OSD.11 previously took part in this PG. I don't know now whether it was the primary or not. The bug happened after I made `ceph os...
- 12:35 PM Bug #39249: Some PGs stuck in active+remapped state
- ...
- 12:23 PM Bug #39249: Some PGs stuck in active+remapped state
- ...
- 12:22 PM Bug #39249: Some PGs stuck in active+remapped state
- ...
- 12:22 PM Bug #39249 (Closed): Some PGs stuck in active+remapped state
- Sometimes my PGs get stuck in this state. When I stop the primary OSD containing this PG, it becomes `active+undersized+degrad...
- 12:14 PM Feature #39248 (New): Add ability to limit number of simultaneously backfilling PGs
- I want to reduce the impact of `ceph osd out osd.xxx`. I already set
--osd-recovery-max-active 1
--osd-max-backfills ...
- 11:46 AM Bug #38783: Changing mon_pg_warn_max_object_skew has no effect.
- Injecting into mgr has solved the issue, thanks!
- 11:07 AM Backport #39239: luminous: "sudo yum -y install python34-cephfs" fails on mimic
- note to myself or anyone who wants to backport this change to luminous, you need to blacklist the python36 package wh...
- 10:59 AM Backport #39239 (Resolved): luminous: "sudo yum -y install python34-cephfs" fails on mimic
- https://github.com/ceph/ceph/pull/28493
- 10:59 AM Bug #39164 (Pending Backport): "sudo yum -y install python34-cephfs" fails on mimic
- 10:54 AM Backport #39236 (In Progress): nautilus: "sudo yum -y install python34-cephfs" fails on mimic
- 02:46 AM Backport #39236: nautilus: "sudo yum -y install python34-cephfs" fails on mimic
- https://github.com/ceph/ceph/pull/27505
- 02:44 AM Backport #39236 (Resolved): nautilus: "sudo yum -y install python34-cephfs" fails on mimic
- https://github.com/ceph/ceph/pull/27505
- 07:56 AM Bug #39174 (In Progress): crushtool crash on Fedora 28 and newer
- 07:10 AM Bug #39174 (Fix Under Review): crushtool crash on Fedora 28 and newer
- 06:02 AM Bug #39174: crushtool crash on Fedora 28 and newer
- https://bugzilla.redhat.com/show_bug.cgi?id=1515858
- 04:36 AM Bug #39174: crushtool crash on Fedora 28 and newer
- Turning up verbosity gives clues to what might be the problem....
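For context, a minimal sketch of the kind of crushtool round-trip involved here, assuming the failing map has been saved as crushmap.bin (the filename is a placeholder; this is not the exact invocation from the report):
```
# Decompile an existing binary crush map to text, then recompile it;
# on the affected Fedora/RHEL builds the compile step is what crashes.
crushtool -d crushmap.bin -o crushmap.txt
crushtool -c crushmap.txt -o crushmap.new.bin
```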
- 02:31 AM Bug #39174: crushtool crash on Fedora 28 and newer
- 02:30 AM Bug #39174: crushtool crash on Fedora 28 and newer
- Vasu Kulkarni wrote:
> very good reason to drop one distro in teuthology and replace it with fedora 28, I think Brad...
- 02:47 AM Backport #39237 (In Progress): mimic: "sudo yum -y install python34-cephfs" fails on mimic
- 02:47 AM Backport #39237 (Resolved): mimic: "sudo yum -y install python34-cephfs" fails on mimic
- https://github.com/ceph/ceph/pull/27476
04/10/2019
- 11:51 PM Bug #39145: luminous: jewel-x-singleton: FAILED assert(0 == "we got a bad state machine event")
- Ah, that's because a jewel osd does not know how to deal with this REJECT in the Started/ReplicaActive/RepNotRecoveri...
- 02:22 AM Bug #39145: luminous: jewel-x-singleton: FAILED assert(0 == "we got a bad state machine event")
- Fails in 1 out of 20 runs http://pulpito.ceph.com/nojha-2019-04-09_17:54:07-rados:upgrade:jewel-x-singleton-luminous-...
- 11:46 PM Bug #39150: mon: "FAILED ceph_assert(session_map.sessions.empty())" when out of quorum
- mon.c timeline:
2019-04-06 08:58:28.846 hits a lease timeout and triggers the election process
2019-04-06 08:58:28....
- 10:03 PM Bug #39150: mon: "FAILED ceph_assert(session_map.sessions.empty())" when out of quorum
- Greg Farnum wrote:
> The monitor was out of quorum for 30 minutes; it probably has to do with holding on to client c...
- 09:59 PM Bug #39150: mon: "FAILED ceph_assert(session_map.sessions.empty())" when out of quorum
- The monitor was out of quorum for 30 minutes; it probably has to do with holding on to client connections or else not...
- 10:19 PM Backport #38720 (Resolved): mimic: crush: choose_args array size mis-sized when weight-sets are e...
- 10:18 PM Bug #38826 (Resolved): upmap broken the crush rule
- 10:18 PM Backport #38858 (Resolved): mimic: upmap broken the crush rule
- 09:48 PM Bug #39085 (Resolved): monmap created timestamp may be blank
- 09:12 PM Bug #39085 (Pending Backport): monmap created timestamp may be blank
- 09:45 PM Bug #38359 (Fix Under Review): osd-markdown.sh can fail with CLI_DUP_COMMAND=1
- 09:45 PM Bug #38359: osd-markdown.sh can fail with CLI_DUP_COMMAND=1
- nope, that didn't fix it:
/a/sage-2019-04-10_15:25:57-rados-wip-sage4-testing-2019-04-10-0709-distro-basic-smithi/3...
- 09:36 PM Bug #38930: ceph osd safe-to-destroy wrongly approves any out osd
- Hmm, maybe the pg_map is purged of any OSD marked out? Although you can have up OSDs that are out so that shouldn't b...
- 09:30 PM Bug #39174: crushtool crash on Fedora 28 and newer
- very good reason to drop one distro in teuthology and replace it with fedora 28, I think Brad brought this up long ti...
- 08:30 PM Bug #39174 (Resolved): crushtool crash on Fedora 28 and newer
- On Fedora 29, Fedora 30, and RHEL 8, /usr/bin/crushtool crashes when trying to compile the map that Rook uses.
<pr...
- 09:28 PM Bug #39054 (Closed): osd push failed because local copy is 4394'133607637
<pr... - 09:28 PM Bug #39054 (Closed): osd push failed because local copy is 4394'133607637
- As Jewel is an outdated release and you ran the potentially-destructive repair tools, you'll have better luck taking ...
- 09:16 PM Backport #38904 (In Progress): mimic: osd/PGLog.h: print olog_can_rollback_to before deciding to ...
- 09:16 PM Backport #38906 (Resolved): nautilus: osd/PGLog.h: print olog_can_rollback_to before deciding to ...
- 09:14 PM Bug #39039: mon connection reset, command not resent
- So it's not the command specifically but that the client doesn't reconnect to a working monitor, right?
- 09:10 PM Backport #38442 (Resolved): luminous: osd-markdown.sh can fail with CLI_DUP_COMMAND=1
- 09:07 PM Backport #39220 (Resolved): mimic: osd: FAILED ceph_assert(attrs || !pg_log.get_missing().is_miss...
- https://github.com/ceph/ceph/pull/27940
- 09:07 PM Bug #36598 (Can't reproduce): osd: "bluestore(/var/lib/ceph/osd/ceph-6) ENOENT on clone suggests ...
- This has not shown up recently, so maybe this got resolved as a result of http://tracker.ceph.com/issues/36739 being ...
- 09:07 PM Backport #39219 (Resolved): nautilus: osd: FAILED ceph_assert(attrs || !pg_log.get_missing().is_m...
- https://github.com/ceph/ceph/pull/27839
- 09:07 PM Backport #39218 (Resolved): luminous: osd: FAILED ceph_assert(attrs || !pg_log.get_missing().is_m...
- https://github.com/ceph/ceph/pull/27878
- 09:05 PM Backport #39206 (Resolved): mimic: osd: leaked pg refs on shutdown
- https://github.com/ceph/ceph/pull/27938
- 09:05 PM Backport #39205 (Resolved): nautilus: osd: leaked pg refs on shutdown
- https://github.com/ceph/ceph/pull/27803
- 09:05 PM Backport #39204 (Resolved): luminous: osd: leaked pg refs on shutdown
- https://github.com/ceph/ceph/pull/27810
- 09:01 PM Bug #39175 (Resolved): RGW DELETE calls partially missed shortly after OSD startup
- We have two separate clusters (physically 2,000+ miles apart) that are seeing
PGs going inconsistent while doing reb...
- 04:06 PM Feature #39162 (In Progress): Improvements to standalone tests.
- 05:58 AM Bug #38892: /ceph/src/tools/kvstore_tool.cc:266:1: internal compiler error: Segmentation fault
- See https://github.com/ceph/ceph/pull/27479 for a viable workaround. Note that this is a bug in gcc7 [1] and the pref...
- 04:46 AM Backport #38567 (In Progress): luminous: osd_recovery_priority is not documented (but osd_recover...
- 04:16 AM Bug #39164: "sudo yum -y install python34-cephfs" fails on mimic
- note to myself or anyone who wants to backport this change to luminous, you need to blacklist the python36 package wh...
- 04:13 AM Bug #39164 (Fix Under Review): "sudo yum -y install python34-cephfs" fails on mimic
- 03:24 AM Bug #39164 (Resolved): "sudo yum -y install python34-cephfs" fails on mimic
- see http://pulpito.ceph.com/yuriw-2019-04-09_19:20:36-multimds-wip-yuri3-testing-2019-04-08-2038-mimic-testing-basic-...
- 03:56 AM Bug #38582: Pool storage MAX AVAIL reduction seems higher when single OSD reweight is done
- Correction in the description.
It looks like the pool's MAX AVAIL value had dropped after there was a hard disk fail...
04/09/2019
- 10:22 PM Bug #38724 (Need More Info): _txc_add_transaction error (39) Directory not empty not handled on o...
- logging level isn't high enough to tell what data is in this pg. :(
- 10:17 PM Bug #38786 (Fix Under Review): autoscale down can lead to max_pg_per_osd limit
- https://github.com/ceph/ceph/pull/27473
- 09:21 PM Feature #39162 (Resolved): Improvements to standalone tests.
Now that OSDs default to bluestore, we need to fix the use of run_osd(). We should replace run_osd_bluestore() with r...
- 08:29 PM Backport #38567: luminous: osd_recovery_priority is not documented (but osd_recovery_op_priority is)
- https://github.com/ceph/ceph/pull/27471
- 02:54 PM Bug #39145: luminous: jewel-x-singleton: FAILED assert(0 == "we got a bad state machine event")
- From the osd log including the thread before the crash....
- 02:36 PM Bug #38219 (Fix Under Review): rebuild-mondb hangs
- 12:25 PM Bug #39159 (Resolved): qa: Fix ambiguous store_thrash thrash_store in mon_thrash.py
- Both store_thrash and thrash_store names are used for the same thing in mon_thrash.py. 'thrash_store' is used here: h...
- 08:13 AM Bug #39154 (Resolved): Don't mark removed osds in when running "ceph osd in any|all|*"
- To reproduce....
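The reproduction steps are elided above; as a rough, hypothetical sketch of the scenario the title describes (osd.3 and the exact command sequence are placeholders, not the reporter's actual steps):
```
# Remove an OSD from the osdmap entirely.
ceph osd down osd.3
ceph osd out osd.3
ceph osd rm osd.3
# Then mark "all" OSDs in; per the title, the removed id should be
# skipped here rather than marked in again.
ceph osd in all
```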
- 01:47 AM Bug #23030 (Fix Under Review): osd: crash during recovery with assert(p != recovery_info.ss.clone...
- https://github.com/ceph/ceph/pull/27273
- 01:04 AM Bug #39152 (Duplicate): nautilus osd crash: Caught signal (Aborted) tp_osd_tp
- OSD continuously crashed
-1> 2019-04-08 17:47:06.615 7f3f3ef62700 -1 /build/ceph-14.2.0/src/os/bluestore/Bl...
04/08/2019
- 11:00 PM Bug #37264 (Resolved): scrub warning check incorrectly uses mon scrub interval
- 10:49 PM Bug #26971 (Duplicate): failed to become clean before timeout expired
- 10:18 PM Bug #26971: failed to become clean before timeout expired
- see http://tracker.ceph.com/issues/39149
- 10:15 PM Bug #26971: failed to become clean before timeout expired
- oh, it's because there's also 1/10th the probability of choosing the second host:...
- 07:59 PM Bug #26971: failed to become clean before timeout expired
- This is just CRUSH failing. I extracted the osdmap from the data/mon.a.tgz and verified with osdmaptool that it's ju...
- 10:37 PM Bug #39150 (Resolved): mon: "FAILED ceph_assert(session_map.sessions.empty())" when out of quorum
- ...
- 08:42 PM Bug #39148 (New): luminous: powercycle: reached maximum tries (500) after waiting for 3000 seconds
- ...
- 07:02 PM Bug #39145 (New): luminous: jewel-x-singleton: FAILED assert(0 == "we got a bad state machine eve...
- ...
- 05:14 PM Bug #37775: some pg_created messages not sent to mon
- /a/yuriw-2019-04-04_00:00:53-rados-luminous-distro-basic-smithi/3806121/
04/05/2019
- 08:56 PM Bug #26971: failed to become clean before timeout expired
- The up set seems to be the problem here.
This is the point when we find out that osd.5 is down... - 08:50 PM Backport #38720: mimic: crush: choose_args array size mis-sized when weight-sets are enabled
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/27082
merged
- 08:50 PM Backport #38858: mimic: upmap broken the crush rule
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/27257
merged
- 08:30 PM Bug #39087: ec_lost_unfound: a EC shard has missing object after `osd lost`
- /a/yuriw-2019-04-02_20:09:55-rados-wip-yuri3-testing-2019-04-02-1623-mimic-distro-basic-smithi/3801955/ - looks like ...
- 08:06 PM Bug #20086: LibRadosLockECPP.LockSharedDurPP gets EEXIST
- /a/yuriw-2019-04-02_20:09:55-rados-wip-yuri3-testing-2019-04-02-1623-mimic-distro-basic-smithi/3801823/
- 08:04 PM Backport #38906: nautilus: osd/PGLog.h: print olog_can_rollback_to before deciding to rollback
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/27302
merged
- 05:45 PM Feature #38940: Allow marking noout by failure domain for maintainance and planned downtime.
- Also related: https://github.com/rook/rook/issues/2825
- 05:43 PM Feature #38940: Allow marking noout by failure domain for maintainance and planned downtime.
- Relevant discussion as this relates to Rook https://github.com/rook/rook/issues/2253
- 04:16 PM Bug #37509: require past_interval bounds mismatch due to osd oldest_map
- /a/yuriw-2019-04-05_00:28:05-rados-wip-yuri2-testing-2019-04-04-1953-nautilus-distro-basic-smithi/3811215/
- 04:12 PM Bug #38238: rados/test.sh: api_aio_pp doesn't seem to start
- /a/yuriw-2019-04-05_00:28:05-rados-wip-yuri2-testing-2019-04-04-1953-nautilus-distro-basic-smithi/3811205/
- 12:00 PM Bug #21174: OSD crash: 903: FAILED assert(objiter->second->version > last_divergent_update)
- Yes, most likely the issue was triggered by a power outage, the 2x OSD FAILED assert and the cluster is unable to rec...
- 11:07 AM Bug #39116: Draining filestore osd, removing, and adding new bluestore osd causes OSDs to crash
- The fix so far is switching the osd back to filestore.
- 08:36 AM Bug #39116: Draining filestore osd, removing, and adding new bluestore osd causes OSDs to crash
- Another PG....
- 07:40 AM Bug #39116: Draining filestore osd, removing, and adding new bluestore osd causes OSDs to crash
- ...
- 06:48 AM Bug #39116: Draining filestore osd, removing, and adding new bluestore osd causes OSDs to crash
- David Zafman wrote:
> Please find a stack trace in the osd log. Is there an assert that would look like this?
> ...
- 08:23 AM Bug #39115: ceph pg repair doesn't fix itself if osd is bluestore
- Another PG, where the missing is reported on osd.0/filestore (not osd.9/bluestore in the previous)....
- 08:08 AM Bug #39115: ceph pg repair doesn't fix itself if osd is bluestore
- ...
- 07:08 AM Bug #39115: ceph pg repair doesn't fix itself if osd is bluestore
- David Zafman wrote:
> It would be helpful to see a ceph pg deep-scrub (wait for it to finish) followed by the output...
- 07:14 AM Bug #39120 (New): rados: Segmentation fault in thread 7f0aebfff700 thread_name:fn_anonymous
- ...
04/04/2019
- 09:29 PM Bug #38931: osd does not proactively remove leftover PGs
- Greg Farnum wrote:
> So should we backport part of that PR, Neha?
>
> To answer your question more directly, Dan:...
- 08:37 PM Bug #38931: osd does not proactively remove leftover PGs
- So should we backport part of that PR, Neha?
To answer your question more directly, Dan: OSDs don't delete PGs the...
- 08:51 PM Bug #38900: EC pools don't self repair on client read error
- Yes, client IO is served. The PG is degraded, but the PG state won't necessarily reflect that.
- 08:33 PM Bug #38900: EC pools don't self repair on client read error
- Just to be clear, this means the object remains degraded, but client IO continues to be served?
- 08:32 PM Backport #38850: upgrade: 1 nautilus mon + 1 luminous mon can't automatically form quorum
- This is super weird; the only other recent reference I see to min_mon_release is https://github.com/ceph/ceph/pull/27...
- 04:52 PM Bug #39116: Draining filestore osd, removing, and adding new bluestore osd causes OSDs to crash
- 04:52 PM Bug #39116: Draining filestore osd, removing, and adding new bluestore osd causes OSDs to crash
- Please find a stack trace in the osd log. Is there an assert that would look like this?
/build/ceph-13.2.5-g###...
- 03:24 PM Bug #39116 (New): Draining filestore osd, removing, and adding new bluestore osd causes OSDs to c...
- ...
- 04:44 PM Bug #39115: ceph pg repair doesn't fix itself if osd is bluestore
- It would be helpful to see a ceph pg deep-scrub (wait for it to finish) followed by the output of rados list-inconsis...
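For reference, roughly the two commands being asked for (assuming the inconsistent PG is 2.5; substitute the real pgid):
```
# Deep-scrub the PG and wait for the scrub to finish.
ceph pg deep-scrub 2.5
# Then dump the inconsistencies the scrub recorded.
rados list-inconsistent-obj 2.5 --format=json-pretty
```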
- 03:15 PM Bug #39115 (Duplicate): ceph pg repair doesn't fix itself if osd is bluestore
- Running ceph pg repair on an inconsistent PG with missing data, I usually notice that the OSD is marked as down/up be...
- 01:48 PM Bug #39111 (New): "ceph config set" accepts osd ID with letters
- 01:48 PM Bug #39111 (New): "ceph config set" accepts osd ID with letters
- ...
- 09:44 AM Bug #38219: rebuild-mondb hangs
- ...
- 04:19 AM Bug #21174: OSD crash: 903: FAILED assert(objiter->second->version > last_divergent_update)
Here is what the bad log looks like that caused one of the crashes. Clearly _head_ is bad because the log ends wit...
- 02:40 AM Bug #21174: OSD crash: 903: FAILED assert(objiter->second->version > last_divergent_update)
- Maybe we could check each log in load_pgs(). If it is corrupt (head != head entry's version), move PG aside and igno...
04/03/2019
- 11:28 PM Bug #39099 (Resolved): Give recovery for inactive PGs a higher priority
Backfill inactive gets priority 220 and we should make sure that if we can have inactive that needs recovery only i...
- 07:23 AM Bug #39087: ec_lost_unfound: a EC shard has missing object after `osd lost`
- Is this the `scrub error` we expect? What we should do is find out why ceph doesn't recover PG 2.4s0.
- 07:16 AM Bug #39087 (New): ec_lost_unfound: a EC shard has missing object after `osd lost`
- http://pulpito.ceph.com/kchai-2019-04-01_10:38:29-rados-wip-kefu-testing-2019-04-01-1531-distro-basic-mira/3797065/
...
- 04:37 AM Feature #38616 (Resolved): Improvements to auto repair
04/02/2019
- 10:22 PM Bug #21174: OSD crash: 903: FAILED assert(objiter->second->version > last_divergent_update)
- Per request on irc.
pg log:
1.cas2 on osd.2: ceph-post-file: d74a0006-c0e9-41b1-a904-7bfe41617253
1.96s3 on osd....
- 07:51 PM Bug #21174: OSD crash: 903: FAILED assert(objiter->second->version > last_divergent_update)
- Output from: ceph-objectstore-tool --no-mon-config --data-path /var/lib/ceph/osd/ceph-0 --op log --pgid 1.cas0
1.c...
- 06:29 PM Bug #21174: OSD crash: 903: FAILED assert(objiter->second->version > last_divergent_update)
- Hi Grant, is there a way you could dump the pg log by using a command like this "ceph-objectstore-tool --no-mon-confi...
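The full form of the command being suggested appears in the 07:51 PM comment above; a minimal sketch, assuming osd.0's default data path and pg 1.cas0 (adjust both for the cluster in question, and stop the OSD first):
```
# Dump the PG log for one placement group directly from the OSD's object store.
ceph-objectstore-tool --no-mon-config \
  --data-path /var/lib/ceph/osd/ceph-0 \
  --op log --pgid 1.cas0
```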
- 09:51 PM Bug #39085 (Fix Under Review): monmap created timestamp may be blank
- 09:51 PM Bug #39085: monmap created timestamp may be blank
- https://github.com/ceph/ceph/pull/27327
- 07:13 PM Bug #39085 (Resolved): monmap created timestamp may be blank
- On at least one old cluster, monmap created timestamp is empty. lab cluster:...
- 07:27 PM Bug #38219: rebuild-mondb hangs
- I reproduced this again on master, http://pulpito.ceph.com/nojha-2019-04-02_17:39:35-rados:singleton-master-distro-ba...
- 11:46 AM Bug #38219: rebuild-mondb hangs
- http://pulpito.ceph.com/kchai-2019-04-02_08:04:13-rados-wip-kefu-testing-2019-04-01-1531-distro-basic-smithi/
- 01:02 PM Bug #38124: OSD down on snaptrim.
- Hello, it's been two months now; is there any update about this bug?
- 12:19 PM Bug #38783: Changing mon_pg_warn_max_object_skew has no effect.
- 12:18 PM Bug #38783: Changing mon_pg_warn_max_object_skew has no effect.
- It's an mgr option. You should instead inject it into the mgr daemon.
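As a concrete illustration of the suggestion above (the value 20 is only a placeholder threshold):
```
# mon_pg_warn_max_object_skew is read by the mgr, so inject it there
# (substitute the active mgr's id):
ceph tell mgr.<active-mgr-id> injectargs '--mon_pg_warn_max_object_skew 20'
# or, on Mimic and later, set it persistently in the config database:
ceph config set mgr mon_pg_warn_max_object_skew 20
```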
- 05:56 AM Backport #38905 (In Progress): luminous: osd/PGLog.h: print olog_can_rollback_to before deciding ...
- https://github.com/ceph/ceph/pull/27715
- 01:52 AM Backport #38983 (Resolved): nautilus: Improvements to auto repair
- 12:12 AM Backport #38906 (In Progress): nautilus: osd/PGLog.h: print olog_can_rollback_to before deciding ...
- https://github.com/ceph/ceph/pull/27302
04/01/2019
- 10:59 PM Bug #37439: Degraded PG does not discover remapped data on originating OSD
- My proposal to fix this bug is to call @discover_all_missing@ not only if there are missing objects, but also when th...
- 09:11 PM Bug #37439: Degraded PG does not discover remapped data on originating OSD
- Hi Jonas, thanks for creating a fix for this bug. Could you please upload the latest logs from nautilus, that you hav...
- 08:58 PM Bug #37439 (Fix Under Review): Degraded PG does not discover remapped data on originating OSD
- 01:07 AM Bug #37439: Degraded PG does not discover remapped data on originating OSD
- More findings, now on Nautilus 14.2.0:
OSD.62 once was part of pg 6.65, but content on it got remapped. A restart ...
- 10:46 PM Bug #21174: OSD crash: 903: FAILED assert(objiter->second->version > last_divergent_update)
- Grant: I notice that the initial event outlined above is from October. Is that the very first anomalous behavior exh...
- 10:45 PM Feature #3362 (Resolved): Warn users before allowing pools to be created with more than N*<num_os...
- 09:22 PM Backport #38442: luminous: osd-markdown.sh can fail with CLI_DUP_COMMAND=1
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/26616
merged
- 04:10 PM Fix #39071 (New): monclient: initial probe is non-optimal with v2+v1
- When we are probing both v2 and v1 addrs for mons, we treat them as separate mons, which means we might be probing N ...
- 02:25 PM Feature #39066 (Fix Under Review): src/ceph-disk/tests/ceph-disk.sh is using hardcoded port
- 02:21 PM Feature #39066 (Resolved): src/ceph-disk/tests/ceph-disk.sh is using hardcoded port
- Currently it's only possible to run `...make; make tests -j8; ctest ...` on the same machine.
Please consider chan...
- 01:48 PM Bug #38945 (Pending Backport): osd: leaked pg refs on shutdown
- 01:09 PM Bug #38219: rebuild-mondb hangs
- I am using the following script to reproduce this issue locally, so far no luck...
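The reproduction script itself is elided above. For context, a rough, hypothetical sketch of the mon-store rebuild that the rebuild-mondb test exercises (paths and keyring are placeholders; see the Ceph docs on recovering the mon store from OSDs for the real procedure):
```
# Rebuild the monitor store from the OSDs (run with the daemons stopped).
ms=/tmp/mon-store
mkdir -p "$ms"
for osd in /var/lib/ceph/osd/ceph-*; do
  ceph-objectstore-tool --data-path "$osd" --no-mon-config \
    --op update-mon-db --mon-store-path "$ms"
done
ceph-monstore-tool "$ms" rebuild -- --keyring /etc/ceph/ceph.client.admin.keyring
```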
- 11:51 AM Bug #39059 (Can't reproduce): assert in ceph::net::SocketMessenger::unregister_conn()
- ...
- 03:22 AM Bug #39056: localize-reads does not increment pg stats read count
- when the '--localize-reads' flag is set, maybe the peer_pg will complete the read task, but the peer_pg will not count read_num....
- 03:09 AM Bug #39056 (New): localize-reads does not increment pg stats read count
- when I mounted ceph-fuse, I set the '--localize-reads' flag. I found during the test that the read_num count was In...
- 12:52 AM Backport #38904: mimic: osd/PGLog.h: print olog_can_rollback_to before deciding to rollback
- https://github.com/ceph/ceph/pull/27284
03/31/2019
- 07:02 PM Bug #39055 (New): OSD's crash when specific PG is trying to backfill
- Hi,
I've got a peculiar issue whereby a specific PG is trying to backfill its objects to the other peers, but th...
- 12:08 PM Bug #39054 (Closed): osd push failed because local copy is 4394'133607637
- ceph-osd.1.log:7085:2019-02-27 13:07:21.336004 7f666b5bb700 -1 log_channel(cluster) log [ERR] : 3.33 push 3:ccb8da9c:...
03/30/2019
- 07:14 PM Bug #38931: osd does not proactively remove leftover PGs
- https://github.com/ceph/ceph/pull/27205/commits/f7c5b01e181630bb15e8b923b0334eb6adfdf50a
- 06:15 PM Bug #39053 (New): changing pool crush rule may lead to IO stop
How to reproduce:
1. create some OSDs
2. change their class to, say, "xxx"
3. create replicated crush rule ref...
- 01:37 PM Backport #38860 (Resolved): nautilus: upmap broken the crush rule
- 08:46 AM Bug #38784 (Pending Backport): osd: FAILED ceph_assert(attrs || !pg_log.get_missing().is_missing(...
- 08:21 AM Backport #38854 (Resolved): luminous: .mgrstat failed to decode mgrstat state; luminous dev version?
- 08:21 AM Backport #38859 (Resolved): luminous: upmap broken the crush rule
- 08:20 AM Backport #38857 (Resolved): luminous: should set EPOLLET flag on del_event()
- 08:18 AM Backport #39044 (Resolved): mimic: osd/PGLog: preserve original_crt to check rollbackability
- https://github.com/ceph/ceph/pull/27629
- 08:18 AM Backport #39043 (Resolved): nautilus: osd/PGLog: preserve original_crt to check rollbackability
- https://github.com/ceph/ceph/pull/27632
- 08:18 AM Backport #39042 (Resolved): luminous: osd/PGLog: preserve original_crt to check rollbackability
- https://github.com/ceph/ceph/pull/27715
03/29/2019
- 11:04 PM Bug #39039 (Duplicate): mon connection reset, command not resent
- ...
- 07:45 PM Backport #38854: luminous: .mgrstat failed to decode mgrstat state; luminous dev version?
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/27207
merged
- 07:45 PM Backport #38859: luminous: upmap broken the crush rule
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/27224
merged
- 07:44 PM Backport #38857: luminous: should set EPOLLET flag on del_event()
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/27226
merged
- 07:12 AM Backport #38872 (In Progress): mimic: Rados.get_fsid() returning bytes in python3
- https://github.com/ceph/ceph/pull/27259
- 04:47 AM Backport #38858 (In Progress): mimic: upmap broken the crush rule
- https://github.com/ceph/ceph/pull/27257
- 03:04 AM Backport #38860 (In Progress): nautilus: upmap broken the crush rule
03/28/2019
- 10:23 PM Bug #39023 (Resolved): osd/PGLog: preserve original_crt to check rollbackability
- Related to the issue discovered in https://tracker.ceph.com/issues/21174#note-11.
- 07:12 PM Feature #39012 (Resolved): osd: distinguish unfound + impossible to find, vs start some down OSDs...
This may be a command that gets information from the primary of a pg listing unfound objects and where they may be ...
- 06:59 PM Documentation #39011 (Resolved): Document how get_recovery_priority() and get_backfill_priority()...
Describe get_recovery_priority() and get_backfill_priority() as they relate to these constants:...
- 06:57 PM Bug #21174: OSD crash: 903: FAILED assert(objiter->second->version > last_divergent_update)
- Hi Grant,
Thanks for applying the patch and updating the logs. Looks like the earlier crash on osd.2(ENOENT on cl...
- 05:22 PM Bug #21174: OSD crash: 903: FAILED assert(objiter->second->version > last_divergent_update)
- I am still seeing crashes with https://github.com/ceph/ceph/pull/27200 backported.
Attached are logs.
osd.2 cep...
- 02:23 AM Bug #21174: OSD crash: 903: FAILED assert(objiter->second->version > last_divergent_update)
- https://github.com/ceph/ceph/pull/27200 attempts to resolve the failure seen on osd.2
- 04:03 PM Bug #39006: ceph tell osd.xx bench help : gives wrong help
- Moreover, it says that the first number is a count of blocks, but actually it is the count of bytes for the whole operation:
...
- 04:01 PM Bug #39006 (Resolved): ceph tell osd.xx bench help : gives wrong help
- ```
$ ceph tell osd.11 bench help
help not valid: help doesn't represent an int
Invalid command: unused arguments...
- 12:34 PM Backport #38859 (In Progress): luminous: upmap broken the crush rule
- 01:39 AM Backport #38859: luminous: upmap broken the crush rule
- https://github.com/ceph/ceph/pull/27224
- 11:10 AM Backport #38510 (Resolved): luminous: ceph CLI ability to change file ownership
- 11:09 AM Backport #38562 (Resolved): luminous: mgr deadlock
- 11:06 AM Backport #38903 (Resolved): nautilus: Minor rados related documentation fixes
- 07:50 AM Bug #38945: osd: leaked pg refs on shutdown
- please note, in luminous, we also need to stop @snap_sleep_timer@ and @scrub_sleep_timer@ in @OSDService::shutdown(...
- 07:43 AM Bug #38945 (Fix Under Review): osd: leaked pg refs on shutdown
- 06:12 AM Bug #38892: /ceph/src/tools/kvstore_tool.cc:266:1: internal compiler error: Segmentation fault
- per Brad
> If we see this again we could try temporarily adding "--param ggc-min-expand=1 --param ggc-min-heapsize...
- 03:22 AM Backport #38993 (Resolved): nautilus: unable to link rocksdb library if use system rocksdb
- https://github.com/ceph/ceph/pull/27601
- 03:04 AM Bug #38992 (Resolved): unable to link rocksdb library if use system rocksdb
- 02:33 AM Backport #38750 (New): luminous: should report EINVAL in ErasureCode::parse() if m<=0
- 02:31 AM Backport #38750 (In Progress): luminous: should report EINVAL in ErasureCode::parse() if m<=0
- 02:21 AM Backport #38857 (In Progress): luminous: should set EPOLLET flag on del_event()
- https://github.com/ceph/ceph/pull/27226
- 02:00 AM Backport #38860: nautilus: upmap broken the crush rule
- https://github.com/ceph/ceph/pull/27225
03/27/2019
- 10:56 PM Bug #38839: .mgrstat failed to decode mgrstat state; luminous dev version?
- Sage, could this have something to do with #38941? The timing is right.
- 05:00 PM Backport #38983 (In Progress): nautilus: Improvements to auto repair
- 04:24 PM Backport #38983 (Resolved): nautilus: Improvements to auto repair
- https://github.com/ceph/ceph/pull/27220
- 04:38 PM Bug #38784 (Fix Under Review): osd: FAILED ceph_assert(attrs || !pg_log.get_missing().is_missing(...
- 04:01 AM Bug #26971: failed to become clean before timeout expired
dzafman-2019-03-26_16:39:54-rados:thrash-wip-zafman-26971-diag-distro-basic-smithi/3776762
Another run with diag...
- 03:44 AM Backport #38854 (In Progress): luminous: .mgrstat failed to decode mgrstat state; luminous dev ve...
- https://github.com/ceph/ceph/pull/27207
- 01:54 AM Bug #38945 (Resolved): osd: leaked pg refs on shutdown
- recovery_request_timer may hold some QueuePeeringEvts which hold PGRefs;
if we don't shut it down earlier, it potentially ca...
- 01:37 AM Feature #38616: Improvements to auto repair
- Also need to backport 0fb951963ff9d03a592bad0d4442049603195e25 with this.
03/26/2019
- 11:49 PM Feature #38616 (Pending Backport): Improvements to auto repair
- 04:56 PM Backport #38510: luminous: ceph CLI ability to change file ownership
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/26758
merged
Reviewed-by: Sébastien Han <seb@redhat.com>
- 04:49 PM Backport #38562: luminous: mgr deadlock
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/26830
merged
- 04:29 PM Bug #38219: rebuild-mondb hangs
- /a/sage-2019-03-26_03:52:56-rados-wip-sage-testing-2019-03-25-1934-distro-basic-smithi/3774206
- 09:38 AM Backport #38903 (In Progress): nautilus: Minor rados related documentation fixes
- 09:29 AM Backport #38901 (In Progress): mimic: Minor rados related documentation fixes
- 09:04 AM Backport #38902 (In Progress): luminous: Minor rados related documentation fixes
- 04:15 AM Feature #38940 (New): Allow marking noout by failure domain for maintainance and planned downtime.
- - Sometimes an entire host can have planned downtime for maintenance.
- Disk failures outside of the affected area ...
03/25/2019
- 10:02 PM Subtask #37731 (Resolved): upgrade/luminous-x - add "require-osd-release nautilus" and clean up
- Yes, done as a part of these.
https://github.com/ceph/ceph/pull/26389
https://github.com/ceph/ceph/pull/26455
- 07:49 PM Subtask #37731: upgrade/luminous-x - add "require-osd-release nautilus" and clean up
- @neha I think this is done, just want to confirm, pls resolve
- 09:02 PM Bug #38041 (Resolved): Fix recovery and backfill priority handling
- 09:01 PM Backport #38275 (Resolved): mimic: Fix recovery and backfill priority handling
- 06:47 PM Bug #38927 (Resolved): should print min_mon_release correctly
- 06:13 PM Bug #38357: ClsLock.TestExclusiveEphemeralStealEphemeral failed
- Similar,...
- 03:04 PM Bug #38195: osd-backfill-space.sh exposes rocksdb hang
- Seen in mimic backport testing with new osd-backfill-prio.sh test.
http://pulpito.ceph.com/dzafman-2019-03-20_19:4... - 11:14 AM Bug #38931 (Resolved): osd does not proactively remove leftover PGs
- (Context: cephfs cluster running v12.2.11)
We had an osd go nearfull this weekend. I reweighted it to move out som...
- 10:58 AM Bug #38930 (Duplicate): ceph osd safe-to-destroy wrongly approves any out osd
- With v12.2.11, we found that ceph osd safe-to-destroy is wrongly reporting that all out osds are safe to destroy.
...
- 10:07 AM Backport #38850: upgrade: 1 nautilus mon + 1 luminous mon can't automatically form quorum
- Agreed, my expectation would be that we can maintain quorum during the entire upgrade period. Even discounting OS upg...
03/24/2019
- 04:08 PM Bug #21174: OSD crash: 903: FAILED assert(objiter->second->version > last_divergent_update)
- Yes I can still reproduce it, the cluster is still in a broken state....
- 03:37 PM Backport #38853 (Resolved): nautilus: .mgrstat failed to decode mgrstat state; luminous dev version?
- 02:31 AM Bug #38927: should print min_mon_release correctly
- > Brad Hubbard wrote:
> 15 - 15 !> 2 ?
>
>
> https://github.com/ceph/ceph/pull/27107 should fix this.
- 02:30 AM Bug #38927 (Pending Backport): should print min_mon_release correctly
- 02:30 AM Bug #38927 (Resolved): should print min_mon_release correctly
dzafman-2019-03-20_19:53:02-rados-wip-zafman-testing-distro-basic-smithi/3754307
rados/upgrade/luminous-x-single...
03/23/2019
- 10:48 PM Backport #38901: mimic: Minor rados related documentation fixes
- Remove "premerge" pg state which doesn't apply in mimic.
- 09:13 PM Backport #38901 (Resolved): mimic: Minor rados related documentation fixes
- https://github.com/ceph/ceph/pull/27188
- 10:48 PM Backport #38902: luminous: Minor rados related documentation fixes
- Remove "premerge" pg state which doesn't apply in luminous.
- 09:13 PM Backport #38902 (Resolved): luminous: Minor rados related documentation fixes
- https://github.com/ceph/ceph/pull/27185
- 09:13 PM Backport #38906 (Resolved): nautilus: osd/PGLog.h: print olog_can_rollback_to before deciding to ...
- https://github.com/ceph/ceph/pull/27302
- 09:13 PM Backport #38905 (Resolved): luminous: osd/PGLog.h: print olog_can_rollback_to before deciding to ...
- https://github.com/ceph/ceph/pull/27715
- 09:13 PM Backport #38904 (Resolved): mimic: osd/PGLog.h: print olog_can_rollback_to before deciding to rol...
- https://github.com/ceph/ceph/pull/27284
- 09:13 PM Backport #38903 (Resolved): nautilus: Minor rados related documentation fixes
- https://github.com/ceph/ceph/pull/27189
- 09:13 PM Backport #38853 (In Progress): nautilus: .mgrstat failed to decode mgrstat state; luminous dev ve...
- 05:41 PM Bug #38900 (New): EC pools don't self repair on client read error
When a replicated client read fails at the primary, it will pull the object from another OSD (see rep_repair_primar...- 11:42 AM Documentation #38896 (Pending Backport): Minor rados related documentation fixes
- 12:22 AM Documentation #38896 (Resolved): Minor rados related documentation fixes
Document all pg states
Add auto repair items
"premerge" is not pg state in luminous nor mimic
03/22/2019
- 09:27 PM Bug #38845 (Resolved): mon.a@-1(probing) e1 current monmap has min_mon_release 15 (luminous) whic...
- 03:57 PM Bug #38845: mon.a@-1(probing) e1 current monmap has min_mon_release 15 (luminous) which is >2 rel...
- https://github.com/ceph/ceph/pull/27131
- 02:28 PM Bug #38845 (Fix Under Review): mon.a@-1(probing) e1 current monmap has min_mon_release 15 (lumino...
- 02:02 AM Bug #38845: mon.a@-1(probing) e1 current monmap has min_mon_release 15 (luminous) which is >2 rel...
- Brad Hubbard wrote:
> 15 - 15 !> 2 ?
https://github.com/ceph/ceph/pull/27107 should fix this.
- 12:10 AM Bug #38845: mon.a@-1(probing) e1 current monmap has min_mon_release 15 (luminous) which is >2 rel...
- 15 - 15 !> 2 ?
- 09:05 PM Bug #38892: /ceph/src/tools/kvstore_tool.cc:266:1: internal compiler error: Segmentation fault
- While I was looking into this I noticed this warning in the Jenkins output....
- 04:46 PM Bug #38892 (Closed): /ceph/src/tools/kvstore_tool.cc:266:1: internal compiler error: Segmentation...
- ...
- 07:12 PM Bug #38894 (Pending Backport): osd/PGLog.h: print olog_can_rollback_to before deciding to rollback
- 05:20 PM Bug #38894 (Resolved): osd/PGLog.h: print olog_can_rollback_to before deciding to rollback
- This is important for debugging failures in merge_object_divergent_entries() before a decision to rollback is made.
- 05:16 PM Bug #38893 (Resolved): RuntimeError: expected MON_CLOCK_SKEW but got none
- ...
- 05:09 PM Cleanup #38635: bluestore: test osd_memory_target
- https://github.com/ceph/ceph/pull/27083 - Merged
Will mark Pending Backport when Part-2 merges.
- 02:07 PM Bug #37766 (Resolved): rados_shutdown hang forever in ~objecter()
- 02:06 PM Backport #38398 (Resolved): mimic: rados_shutdown hang forever in ~objecter()
- 01:05 PM Backport #38881 (Resolved): nautilus: ENOENT in collection_move_rename on EC backfill target
- https://github.com/ceph/ceph/pull/27654
- 01:05 PM Backport #38880 (Resolved): luminous: ENOENT in collection_move_rename on EC backfill target
- https://github.com/ceph/ceph/pull/28110
- 01:04 PM Backport #38879 (Resolved): mimic: ENOENT in collection_move_rename on EC backfill target
- https://github.com/ceph/ceph/pull/27943
- 01:03 PM Backport #38873 (Resolved): luminous: Rados.get_fsid() returning bytes in python3
- https://github.com/ceph/ceph/pull/27674
- 01:03 PM Backport #38872 (Resolved): mimic: Rados.get_fsid() returning bytes in python3
- https://github.com/ceph/ceph/pull/27259
- 01:01 PM Backport #38860 (Resolved): nautilus: upmap broken the crush rule
- https://github.com/ceph/ceph/pull/27225
- 01:01 PM Backport #38859 (Resolved): luminous: upmap broken the crush rule
- https://github.com/ceph/ceph/pull/27224
- 01:01 PM Backport #38858 (Resolved): mimic: upmap broken the crush rule
- https://github.com/ceph/ceph/pull/27257
- 01:00 PM Backport #38857 (Resolved): luminous: should set EPOLLET flag on del_event()
- https://github.com/ceph/ceph/pull/27226
- 01:00 PM Backport #38856 (Resolved): mimic: should set EPOLLET flag on del_event()
- https://github.com/ceph/ceph/pull/29250
- 01:00 PM Backport #38854 (Resolved): luminous: .mgrstat failed to decode mgrstat state; luminous dev version?
- https://github.com/ceph/ceph/pull/27207
- 01:00 PM Backport #38853 (Resolved): nautilus: .mgrstat failed to decode mgrstat state; luminous dev version?
- https://github.com/ceph/ceph/pull/27116
- 01:00 PM Backport #38852 (Resolved): mimic: .mgrstat failed to decode mgrstat state; luminous dev version?
- https://github.com/ceph/ceph/pull/29249
- 11:05 AM Backport #38850: upgrade: 1 nautilus mon + 1 luminous mon can't automatically form quorum
- Just to clarify slightly -- I know the upgrade instructions in the Nautilus release announcement say to "upgrade moni...
- 10:19 AM Backport #38850 (Resolved): upgrade: 1 nautilus mon + 1 luminous mon can't automatically form quorum
- Seen while upgrading Luminous (12.2.10) to Nautilus (14.2.0). Three mon hosts, four osd hosts. The process was:
...
- 09:30 AM Bug #38839: .mgrstat failed to decode mgrstat state; luminous dev version?
- nautilus https://github.com/ceph/ceph/pull/27116
- 09:30 AM Bug #38839 (Pending Backport): .mgrstat failed to decode mgrstat state; luminous dev version?
- 07:37 AM Bug #38826 (Pending Backport): upmap broken the crush rule
- 01:33 AM Bug #21174: OSD crash: 903: FAILED assert(objiter->second->version > last_divergent_update)
- It is possible that the crash we are seeing on osd.2 is due to 1:537949df:::20000a2c834.00000105:head incorrectly rol...
- 01:05 AM Bug #38846: dump_pgstate_history doesn't really produce useful json output, needs an array around...
- It would probably be nice if it dumped the current state stack for each pg as well.
03/21/2019
- 11:06 PM Bug #38846 (Resolved): dump_pgstate_history doesn't really produce useful json output, needs an a...
- ...
- 08:42 PM Bug #38845 (Resolved): mon.a@-1(probing) e1 current monmap has min_mon_release 15 (luminous) whic...
dzafman-2019-03-20_19:53:02-rados-wip-zafman-testing-distro-basic-smithi/3754307
rados/upgrade/luminous-x-single...
- 06:05 PM Bug #38841 (New): Objects degraded higher than 100%
- 1. Working Mimic or Nautilus deployment with Bluestore (haven't tested with Filestore)
2. All OSDs up, all PGs activ...
- 05:29 PM Bug #38840 (Resolved): snaps missing in mapper, should be: ca was r -2...repaired
dzafman-2019-03-20_19:53:02-rados-wip-zafman-testing-distro-basic-smithi/3754443
This looks like a cache tier ev...
- 04:59 PM Bug #38839 (Fix Under Review): .mgrstat failed to decode mgrstat state; luminous dev version?
- https://github.com/ceph/ceph/pull/27101
- 04:57 PM Bug #38839 (Resolved): .mgrstat failed to decode mgrstat state; luminous dev version?
- ...
- 02:26 AM Backport #38719 (In Progress): luminous: crush: choose_args array size mis-sized when weight-sets...
- https://github.com/ceph/ceph/pull/27085
- 01:38 AM Cleanup #38635 (In Progress): bluestore: test osd_memory_target
- https://github.com/ceph/ceph/pull/27083
- 01:31 AM Backport #38720 (In Progress): mimic: crush: choose_args array size mis-sized when weight-sets ar...
- https://github.com/ceph/ceph/pull/27082
03/20/2019
- 10:50 PM Bug #26971: failed to become clean before timeout expired
I'm not sure what this means, but pg 1.0 (size 3) needs to pick another one of the 2 remaining OSDs (4 OSDs in) to ...- 12:05 PM Bug #38582: Pool storage MAX AVAIL reduction seems higher when single OSD reweight is done
- Sorry for the delay. Attaching the required information.
osd 155 is the OSD mentioned in the description. The one which was manually...
- 11:40 AM Bug #38827 (In Progress): valgrind: UninitCondition in ceph::crypto::onwire::AES128GCM_OnWireRxHa...
- 11:24 AM Bug #38827: valgrind: UninitCondition in ceph::crypto::onwire::AES128GCM_OnWireRxHandler::authent...
- the test branch contains https://github.com/ceph/ceph/pull/27012
- 11:21 AM Bug #38827 (Resolved): valgrind: UninitCondition in ceph::crypto::onwire::AES128GCM_OnWireRxHandl...
- ...
- 11:27 AM Bug #38828 (Resolved): should set EPOLLET flag on del_event()
- 10:41 AM Bug #21174: OSD crash: 903: FAILED assert(objiter->second->version > last_divergent_update)
- As requested.
osd.0: ceph-post-file: 17efe900-501c-479f-ba56-dd29fef18c58
osd.4: ceph-post-file: ff22f830-e6bc-4f...
- 12:36 AM Bug #21174: OSD crash: 903: FAILED assert(objiter->second->version > last_divergent_update)
- Hi Grant,
Looking at the logs, it seems that the first crash was seen on osd.2 on pg id 1.cas2...
- 08:27 AM Bug #38826: upmap broken the crush rule
- Here is the crush rule...
- 08:24 AM Bug #38826 (Resolved): upmap broken the crush rule
- I set up a cluster and want to specify the primary osds through a crush rule.
Here is the test script...
- 03:14 AM Backport #38275 (In Progress): mimic: Fix recovery and backfill priority handling
- 12:43 AM Backport #38244 (Resolved): luminous: scrub warning check incorrectly uses mon scrub interval
- 12:43 AM Backport #38274 (Resolved): luminous: Fix recovery and backfill priority handling
03/19/2019
- 11:30 PM Bug #36739 (Pending Backport): ENOENT in collection_move_rename on EC backfill target
- 08:38 PM Backport #38398: mimic: rados_shutdown hang forever in ~objecter()
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/26583
merged
03/18/2019
- 06:55 PM Bug #21174: OSD crash: 903: FAILED assert(objiter->second->version > last_divergent_update)
- Err. I believe I mixed up two different bugs, please disregard my previous comment. I don't currently recall what I ...
- 06:52 PM Bug #21174: OSD crash: 903: FAILED assert(objiter->second->version > last_divergent_update)
- For completeness: The root cause for the crashes I experienced was that I had oversized RADOS objects (2-10GB, max ...
- 02:22 PM Bug #38124: OSD down on snaptrim.
- Hello, any updates about this?
- 06:35 AM Bug #38793 (New): data inconsistent
- I did some tests on rbd snap, and found the data inconsistent.
cluster status:...
03/17/2019
- 10:21 PM Bug #38787 (Fix Under Review): osd: cache tiering flush clone wrongly
- 02:38 AM Bug #38787 (Fix Under Review): osd: cache tiering flush clone wrongly
- because the cephfs file snapcontext seq may start from 1, we find that in a never-snapped fs,
the flush of a file will dele...
- 07:21 PM Bug #38294 (Resolved): osd/PG.cc: 6141: FAILED ceph_assert(info.history.same_interval_since != 0)...
- 10:01 AM Bug #38294 (Fix Under Review): osd/PG.cc: 6141: FAILED ceph_assert(info.history.same_interval_sin...
- https://github.com/ceph/ceph/pull/27018
- 09:57 AM Bug #38294 (In Progress): osd/PG.cc: 6141: FAILED ceph_assert(info.history.same_interval_since !=...
- /a/sage-2019-03-17_00:28:04-upgrade:luminous-x-wip-sage4-testing-2019-03-16-1713-distro-basic-smithi/3737326
pg 1....
- 12:10 AM Bug #21174: OSD crash: 903: FAILED assert(objiter->second->version > last_divergent_update)
- ...
03/16/2019
- 11:20 PM Bug #21174: OSD crash: 903: FAILED assert(objiter->second->version > last_divergent_update)
- I have a similar issue with OSDs dropping out:...
- 06:33 PM Bug #38786 (Resolved): autoscale down can lead to max_pg_per_osd limit
- we adjust pgp_num all the way down to the target, which can make osds hit the max_pgs_per_osd if it's going too far.
...
03/15/2019
- 09:45 PM Bug #38623 (Resolved): 2.8s2 past_intervals [6539,6541) start interval does not contain the requi...
- 08:31 PM Bug #38655 (Resolved): osd: missing, size mismatch, snap mapper errors
- 06:11 PM Bug #36739: ENOENT in collection_move_rename on EC backfill target
- https://github.com/ceph/ceph/pull/26996 is a more complete fix for this issue.
- 06:06 PM Bug #38784 (Resolved): osd: FAILED ceph_assert(attrs || !pg_log.get_missing().is_missing(soid) ||...
- ...
- 05:08 PM Bug #38746 (Resolved): msgr2 leaking buffers
- https://github.com/ceph/ceph/pull/26965
- 03:20 AM Bug #38746: msgr2 leaking buffers
- hmm it happens on some osds but not others.
I added the rxbuf and txbuf lengths to the dout prefix and got this
... - 03:01 AM Bug #38746 (Resolved): msgr2 leaking buffers
- osds with bluestore consume too much ram (seeing 20GB on sepia)
to reproduce with vstart, watch bin/ceph daemon os...
- 05:03 PM Bug #38783 (New): Changing mon_pg_warn_max_object_skew has no effect.
- ...
- 03:20 PM Documentation #38051 (Resolved): doc/rados/configuration: refresh osdmap section
- 03:19 PM Backport #38095 (Resolved): luminous: doc/rados/configuration: refresh osdmap section
- 12:13 PM Bug #38762 (New): Ubuntu/Debian repo has incorrect InRelease
- On Ubuntu Bionic, while trying to update the repo package, I got this error:
E: Failed to fetch https://download.ceph.com/debian-mi...
- 08:59 AM Backport #38751 (Resolved): mimic: should report EINVAL in ErasureCode::parse() if m<=0
- https://github.com/ceph/ceph/pull/28995
- 08:58 AM Backport #38750 (Resolved): luminous: should report EINVAL in ErasureCode::parse() if m<=0
- https://github.com/ceph/ceph/pull/28111