Activity

From 12/05/2017 to 01/03/2018

01/03/2018

09:28 PM Bug #22064: "RadosModel.h: 865: FAILED assert(0)" in rados-jewel-distro-basic-smithi
So Nathan seems to have narrowed it down to https://github.com/ceph/ceph/pull/17815 - can you look at this when you'r... Josh Durgin
09:23 PM Support #22422: Block fsid does not match our fsid
It looks like you may have had a partial prepare there in the past - if you're sure it's the right disk, wipe it with... Josh Durgin
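A minimal wipe sketch for the suggestion above; /dev/sdX is a placeholder, so confirm the device (e.g. with lsblk) before running anything destructive:

DEV=/dev/sdX                                  # placeholder - verify first
ceph-disk zap "$DEV"                          # clear GPT/MBR partition tables
wipefs --all "$DEV"                           # drop leftover fs/label signatures
dd if=/dev/zero of="$DEV" bs=1M count=100 oflag=direct   # zero the first 100 MiB
ceph-disk prepare --bluestore "$DEV"          # re-prepare from a clean state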
09:22 PM Bug #22438 (Resolved): mon: leak in lttng dlopen / __tracepoints__init
Josh Durgin
09:17 PM Support #22466 (Closed): PG failing to map to any OSDs
Josh Durgin
09:08 PM Support #22553: ceph-object-tool can not remove metadata pool's object
Is there possibly something wrong with that disk? Josh Durgin
03:28 PM Bug #22354: v12.2.2 unable to create bluestore osd using ceph-disk
Jon Heese wrote:
> Unfortunately, `ceph-disk zap /dev/sde` does not wipe enough of the disk to avoid this issue. As...
Curt Bruns
01:41 AM Bug #22346 (Resolved): OSD_ORPHAN issues after jewel->luminous upgrade, but orphaned osds not in ...
Not for me.
$ crushtool -d crushmap.bad -o crushmap.bad.txt
$ crushtool -d crushmap.good -o crushmap.good.txt
$ ...
Brad Hubbard

01/02/2018

09:03 PM Bug #22539: bluestore: New OSD - Caught signal - bstore_kv_sync
Alright, that fixed it!
It also fixed the heavy IO issue and the rather large amount of consumption I was s...
Brian Woods
06:20 PM Bug #22539: bluestore: New OSD - Caught signal - bstore_kv_sync
Sorry for the spam.
That broke it good!!!...
Brian Woods
06:15 PM Bug #22539: bluestore: New OSD - Caught signal - bstore_kv_sync
Was able to out them all:... Brian Woods
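A sketch of outing every OSD on a small demo cluster like the one above, assuming ceph osd ls is used to enumerate the ids:

for id in $(ceph osd ls); do
    ceph osd out "$id"                        # mark each OSD out of data placement
done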
06:14 PM Bug #22539: bluestore: New OSD - Caught signal - bstore_kv_sync
I can't mark the OSDs out.... Brian Woods
03:42 PM Bug #22539: bluestore: New OSD - Caught signal - bstore_kv_sync
Hard to say exactly, but I would not be surprised to see any manner of odd behaviors with a huge map like that--we ha... Sage Weil
04:28 PM Bug #22354: v12.2.2 unable to create bluestore osd using ceph-disk
Unfortunately, `ceph-disk zap /dev/sde` does not wipe enough of the disk to avoid this issue. As I mentioned above, ... Jon Heese
01:01 PM Support #22553 (New): ceph-object-tool can not remove metadata pool's object
I put an object into the rbd pool
rados -p rbd put qinli.sh
then stopped the osd and removed it
[root@lab71 ~]# ceph-objec...
peng zhang
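The ceph-objectstore-tool command is truncated above; a sketch of the usual offline remove flow, with the OSD id and pgid as placeholders:

systemctl stop ceph-osd@0                     # the tool needs exclusive access to the store
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 \
    --pgid 0.0 qinli.sh remove                # remove the object by name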

12/31/2017

11:13 PM Bug #21388: inconsistent pg but repair does nothing reporting head data_digest != data_digest fro...
I'm working on fixing all my inconsistent pgs but I'm having issues with rados get... hopefully I'm just doing the co... Ryan Anstey

12/30/2017

02:30 AM Bug #22539: bluestore: New OSD - Caught signal - bstore_kv_sync
I had no idea the ID would impact the map calculations that way (makes sense now)!!! Very good to know! And those I... Brian Woods

12/29/2017

10:34 PM Bug #22539 (In Progress): bluestore: New OSD - Caught signal - bstore_kv_sync
Brian, note that one reason why this triggered is that your osdmap is huge... because you have some osds with very la... Sage Weil

12/28/2017

11:02 PM Bug #22539: bluestore: New OSD - Caught signal - bstore_kv_sync
I'm a bit lost, hence trying to rearrange things:
Let's handle the crash first.
IMO it's caused by throttle value...
Igor Fedotov

12/27/2017

04:46 AM Bug #22539: bluestore: New OSD - Caught signal - bstore_kv_sync
A chunk from the mon log:
https://pastebin.com/MA1BStEc
Some screenshots of the IO:
https://imgur.com/a/BOKWc
...
Brian Woods
04:29 AM Bug #22544 (Resolved): objecter cannot resend split-dropped op when racing with con reset
if (split && con && con->has_features(CEPH_FEATUREMASK_RESEND_ON_SPLIT)) {
  return RECALC_OP_TARGET_NEED_RES...
mingxin liu

12/26/2017

11:03 PM Bug #22539: bluestore: New OSD - Caught signal - bstore_kv_sync
UI lag seems to be related to heavy load on the OS SSD from the monitor services. The monitor service does a lot of I... Brian Woods
10:51 PM Bug #22539: bluestore: New OSD - Caught signal - bstore_kv_sync
Edit: UI is lagging again. But it's odd. SOME things lag, but GLXGears doesn't. IO blocking of some sort? Adding mor... Brian Woods
10:47 PM Bug #22539: bluestore: New OSD - Caught signal - bstore_kv_sync
Confirmed the line was there. Added the extra debug line, but this time when I started it, it came right online (almo... Brian Woods
09:18 PM Bug #22539: bluestore: New OSD - Caught signal - bstore_kv_sync
Given the object names in action, it looks like it's an osd map update or something that triggers the issue. Not the us... Igor Fedotov
06:41 PM Bug #22539: bluestore: New OSD - Caught signal - bstore_kv_sync
Additional note: there is no data on the cluster other than the built-in pools. So there is very little information ... Brian Woods
05:46 PM Bug #22539: bluestore: New OSD - Caught signal - bstore_kv_sync
This will make only the fourth OSD in the cluster. Would that impact the overflowed value? What can I do to capture... Brian Woods
01:22 PM Bug #22539: bluestore: New OSD - Caught signal - bstore_kv_sync
As a workaround one can try to set (temporarily until initial rebuild completes?) bluestore_throttle_bytes = 0 at the... Igor Fedotov
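A sketch of that workaround, assuming the option is set per-OSD in ceph.conf (osd.3 is a placeholder) and reverted once the initial rebuild completes:

cat >> /etc/ceph/ceph.conf <<'EOF'
[osd.3]
bluestore_throttle_bytes = 0                  # disable the byte throttle temporarily
EOF
systemctl restart ceph-osd@3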
01:07 PM Bug #22539: bluestore: New OSD - Caught signal - bstore_kv_sync
32-bit value in throttle_bytes is overflowed - see:
2017-12-25 13:18:06.783304 7f37a7a2a700 10 bluestore(/var/lib/ce...
Igor Fedotov

12/25/2017

09:27 PM Bug #22539: bluestore: New OSD - Caught signal - bstore_kv_sync
Added to ceph.conf:
debug bluestore = 20
debug osd = 20
Waited for crash, captured log, but it's too large even c...
Brian Woods
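A sketch of that capture workflow (the OSD id is hypothetical); ceph-post-file can ship logs too large to attach to the tracker:

cat >> /etc/ceph/ceph.conf <<'EOF'
[osd]
debug bluestore = 20
debug osd = 20
EOF
systemctl restart ceph-osd@4                  # then reproduce the crash
xz -9 /var/log/ceph/ceph-osd.4.log            # debug-20 logs compress well
ceph-post-file /var/log/ceph/ceph-osd.4.log.xz   # upload for the developers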
08:46 PM Bug #22539 (Resolved): bluestore: New OSD - Caught signal - bstore_kv_sync
After rebuilding a demo cluster, OSD on one node can no longer be created.
Looking though the log I see this error...
Brian Woods
03:24 AM Support #22466: PG failing to map to any OSDs
When I deleted the osds outside of the default root, the problem was solved. Amine Liu

12/22/2017

02:15 PM Support #22531 (New): OSD flapping under repair/scrub after receiving inconsistent PG LFNIndex.cc: ...
Hi.
I have a problem when repairing PG 1.f while it copies from OSD.3 to OSD.0. During the upgrade to 12.2.2, all OSDs we...
Jan Michlik
01:11 PM Bug #21262: cephfs ec data pool, many osds marked down
This looks like a Support Case rather than a Tracker Bug. Jos Collin
09:58 AM Bug #22530: pool create cmd's expected_num_objects is not correctly interpreted
fix: https://github.com/ceph/ceph/pull/19651... Honggang Yang
09:52 AM Bug #22530 (Resolved): pool create cmd's expected_num_objects is not correctly interpreted
1. disable merge... Honggang Yang
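For context, expected_num_objects is the trailing argument to pool create, and on filestore it only takes effect once merging is disabled; a sketch with an illustrative pool name and counts:

cat >> /etc/ceph/ceph.conf <<'EOF'
[osd]
filestore merge threshold = -1                # 1. disable merge
EOF
# 2. pass the expected object count when creating the pool
ceph osd pool create mypool 128 128 replicated replicated_rule 1000000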
09:11 AM Bug #22354: v12.2.2 unable to create bluestore osd using ceph-disk
The problem of "ceph-disk activation issue in 12.2.2" has been caught. It can be solved by this:
1. delete osd
2...
Hua Liu
05:33 AM Bug #21388: inconsistent pg but repair does nothing reporting head data_digest != data_digest fro...
I'm also having this issue. I'm getting new scrub errors every few days. No idea what's going on. This is something n... Ryan Anstey

12/21/2017

04:54 PM Bug #22346: OSD_ORPHAN issues after jewel->luminous upgrade, but orphaned osds not in crushmap
That did clean it up, thanks.
It is curious though that if I decompile the crushmap to text, it appears the same b...
Graham Allan
04:12 PM Feature #22528 (New): objects should not be promoted when locked
Hello,
We are faced with immediate object promotion when calling lock on an object.
This behavior makes it very hard to understa...
Aleksei Zakharov
02:57 PM Bug #21557: osd.6 found snap mapper error on pg 2.0 oid 2:0e781f33:::smithi14431805-379 ... :187 ...
/a/sage-2017-12-21_07:24:12-rados-wip-sage3-testing-2017-12-20-2253-distro-basic-smithi/1989672
but didn't have th...
Sage Weil
02:11 PM Bug #18698: BlueFS FAILED assert(0 == "allocate failed... wtf")
There is a PR#18494 addressing an issue with the symptoms similar to ones reported in comment #9 (assert during _bala... Igor Fedotov
12:50 PM Bug #22525 (Resolved): auth: ceph auth add does not sanity-check caps
When adding a keyring with "ceph auth add -i <keyring> <entity>", it does not verify that the contained capability st... Fabian Vogt
11:42 AM Support #22520: nearfull threshold is not cleared when osd really is not nearfull.
When I deleted some data from these osds, the nearfull flag was also cleared.... Konstantin Shalygin
10:56 AM Support #22520 (Closed): nearfull threshold is not cleared when osd really is not nearfull.
Today one of my osds reached the nearfull ratio. mon_osd_nearfull_ratio: '.85'. I increased mon_osd_nearfull_ratio to '... Konstantin Shalygin
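Worth noting here: since luminous the ratios live in the OSDMap, so changing mon_osd_nearfull_ratio on a running cluster has no effect; a sketch of the command that does apply:

ceph osd set-nearfull-ratio 0.9
ceph osd dump | grep ratio                    # verify full/backfillfull/nearfull ratios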
10:10 AM Backport #22502 (In Progress): luminous: Pool Compression type option doesn't apply to new OSD's
-https://github.com/ceph/ceph/pull/19629- Shinobu Kinjo

12/20/2017

08:59 PM Bug #22415: 'pg dump' fails after mon rebuild
/a/yuriw-2017-12-19_20:36:31-rados-wip-yuri4-testing-2017-12-19-1722-distro-basic-smithi/1980900 Sage Weil
08:58 PM Bug #22515 (Resolved): osd-config.sh fails with /usr/bin/ceph-authtool: unexpected '1000'
https://github.com/ceph/ceph/pull/19544 Sage Weil
07:56 PM Bug #22515 (Resolved): osd-config.sh fails with /usr/bin/ceph-authtool: unexpected '1000'
... Sage Weil
08:24 PM Bug #22408: objecter: sent out of order ops
/a/yuriw-2017-12-19_20:40:29-rbd-wip-yuri4-testing-2017-12-19-1722-distro-basic-smithi/1981037
rbd/basic/{base/ins...
Sage Weil
08:19 PM Bug #22369 (Resolved): out of order reply on set-chunks.yaml workload
Sage Weil
01:11 PM Bug #18698: BlueFS FAILED assert(0 == "allocate failed... wtf")
Reproduced with 12.2.2 during deep scrubbing after 7 days of workload.
36 ssds, 200G each contain 400G of rbds and...
Aleksei Gutikov
11:54 AM Backport #22502 (Resolved): luminous: Pool Compression type option doesn't apply to new OSD's
https://github.com/ceph/ceph/pull/20106 Nathan Cutler

12/19/2017

09:18 PM Bug #22486: ceph shows wrong MAX AVAIL with hybrid (chooseleaf firstn 1, chooseleaf firstn -1) CR...
Forgot to put the output in code tags; sadly I can't edit the original, so here it is again to make it more readable:... Patrick Fruh
09:14 PM Bug #22486 (New): ceph shows wrong MAX AVAIL with hybrid (chooseleaf firstn 1, chooseleaf firstn ...
I have the following configuration of OSDs:
ID CLASS WEIGHT REWEIGHT SIZE USE AVAIL %USE VAR PGS
0 hdd 5...
Patrick Fruh
03:35 PM Bug #22266: mgr/PyModuleRegistry.cc: 139: FAILED assert(map.epoch > 0)
/a/sage-2017-12-19_06:01:05-rados-wip-sage2-testing-2017-12-18-2147-distro-basic-smithi/1979661
saw this again on ...
Sage Weil
12:18 PM Bug #22445: ceph osd metadata reports wrong "back_iface"
Hmm, this could well be the first time anyone's really tested the IPv6 path here. John Spray
11:56 AM Support #22466: PG failing to map to any OSDs
More info will be needed to work out if this is a bug -- are the CRUSH rules customized? What is the topology ("ceph... John Spray
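A sketch of gathering the information being asked for (the pgid comes from the ticket):

ceph osd tree                                 # topology and device classes
ceph osd crush rule dump                      # any customized CRUSH rules
ceph osd pool ls detail                       # rule, size, and flags per pool
ceph pg map 9.d07                             # where the stuck pg currently maps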
03:29 AM Bug #21218: thrash-eio + bluestore (hangs with unfound objects or read_log_and_missing assert)
/a/sage-2017-12-18_22:56:18-rados-wip-sage-testing-2017-12-18-1406-distro-basic-smithi/1976871
description: rados/si...
Sage Weil

12/18/2017

07:07 AM Bug #22468 (New): unblock backoff contend with cancel proxy write lead to out of order
1. cache primary sends several proxy writes to the base primary
2. base pg hasn't peered, backoffs these ops
3. base finishes p...
mingxin liu
01:48 AM Support #22466 (Closed): PG failing to map to any OSDs

osdmap e88997 pg 9.d07 (9.d07) -> up [] acting []
health HEALTH_ERR
319 pgs are stuck inacti...
Amine Liu

12/17/2017

04:10 AM Backport #22450: luminous: Visibility for snap trim queue length
Nathan Cutler wrote:
> Shinobu Kinjo wrote:
> > unmerged pr can't be cherry-picked anyway...
>
> Actually, it ca...
Shinobu Kinjo
04:08 AM Backport #22450: luminous: Visibility for snap trim queue length
Shinobu Kinjo wrote:
> unmerged pr can't be cherry-picked anyway...
Actually, it can, but we definitely don't wan...
Nathan Cutler

12/16/2017

04:30 AM Bug #22419 (Pending Backport): Pool Compression type option doesn't apply to new OSD's
Kefu Chai
04:27 AM Bug #22093 (Resolved): osd stuck in loop processing resent ops due to ms inject socket failures: 500
Kefu Chai
02:21 AM Backport #22406 (In Progress): jewel: osd: deletes are performed inline during pg log processing
Josh Durgin
12:48 AM Bug #22462 (Resolved): mon: unknown message type 1537 in luminous->mimic upgrade tests
http://pulpito.ceph.com/teuthology-2017-12-14_22:26:40-upgrade:luminous-x:point-to-point-x-master-distro-basic-ovh/
...
Josh Durgin

12/15/2017

10:22 PM Backport #22450: luminous: Visibility for snap trim queue length
unmerged pr can't be cherry-picked anyway... Shinobu Kinjo
12:05 PM Backport #22450: luminous: Visibility for snap trim queue length
master PR https://github.com/ceph/ceph/pull/19520 has not been merged yet - do not backport! (this backport ticket wa... Nathan Cutler
08:16 AM Backport #22450 (Resolved): luminous: Visibility for snap trim queue length
https://github.com/ceph/ceph/pull/20098 Piotr Dalek
09:50 PM Bug #22354: v12.2.2 unable to create bluestore osd using ceph-disk
So I dug a little deeper on this, and followed this gentleman's efforts to manually set up bluestore OSDs (although h... Jon Heese
02:47 PM Feature #22456 (New): efficient snapshot rollback
#21305:
> Rolling back images is painfully slow. Yes, I know, "rbd clone", but this creates another image and au...
Марк Коренберг
12:10 PM Feature #22448: Visibility for snap trim queue length
@Nathan: yeah, sorry, I thought this process was more manual. Piotr Dalek
12:08 PM Feature #22448: Visibility for snap trim queue length
@Piotr: It's OK to add e.g. "jewel, luminous" to the "Backport" field right from the beginning, though.
When the ...
Nathan Cutler
12:07 PM Feature #22448 (Fix Under Review): Visibility for snap trim queue length
master PR is https://github.com/ceph/ceph/pull/19520 Nathan Cutler
12:06 PM Feature #22448: Visibility for snap trim queue length
@Piotr: Please wait until the master PR is merged before starting the backporting process. Thanks. Nathan Cutler
08:11 AM Feature #22448 (Resolved): Visibility for snap trim queue length
We observed unexplained, constant disk space usage increase on a few of our prod clusters. At first we thought that i... Piotr Dalek
12:04 PM Backport #22449: jewel: Visibility for snap trim queue length
master PR https://github.com/ceph/ceph/pull/19520 has not been merged yet - do not backport! (this backport ticket wa... Nathan Cutler
08:13 AM Backport #22449 (Resolved): jewel: Visibility for snap trim queue length
https://github.com/ceph/ceph/pull/21200 Piotr Dalek
06:23 AM Bug #22093 (Fix Under Review): osd stuck in loop processing resent ops due to ms inject socket fa...
https://github.com/ceph/ceph/pull/19542 Kefu Chai
06:02 AM Bug #22346: OSD_ORPHAN issues after jewel->luminous upgrade, but orphaned osds not in crushmap
Hi Graham,
The consensus is that this was caused by a bug in a previous release which failed to remove the devices...
Brad Hubbard
05:53 AM Bug #22440: New pgs per osd hard limit can cause peering issues on existing clusters
We could definitely add a health warning for when we hit that condition in maybe_wait_for_max_pg()? That should show ... Brad Hubbard

12/14/2017

09:39 PM Bug #22354: v12.2.2 unable to create bluestore osd using ceph-disk
I am getting the exact same behavior during @ceph-deploy osd activate@ (which uses @ceph-disk activate@) on a newly-d... Jon Heese
09:27 PM Bug #22440: New pgs per osd hard limit can cause peering issues on existing clusters
Sure, that makes sense.
If not a new state, how about something that would show up in pg query. I queried the pg wi...
Nick Fisk
09:19 PM Bug #22440: New pgs per osd hard limit can cause peering issues on existing clusters
I'm inclined to think we just need to surface this better (perhaps as a new state?) rather than try and let it peer i... Greg Farnum
11:45 AM Bug #22440 (Resolved): New pgs per osd hard limit can cause peering issues on existing clusters
During upgrade of OSD's in a cluster from Filestore to Bluestore, the CRUSH layout changed in my cluster. This result... Nick Fisk
08:17 PM Bug #22445 (New): ceph osd metadata reports wrong "back_iface"
ceph osd metadata reports wrong "back_iface". Example: ceph osd metadata 0
{
"id": 0,
"arch": "x86_64",
...
Stefan Kooman
08:05 PM Feature #22260: osd: recover after network outages
Thanks Joao for fielding Shinobu's question. Anonymous
04:08 PM Feature #22442 (New): ceph daemon mon.id mon_status -> ceph daemon mon.id status
ceph mon_status is not consistent with the status command for all other daemons. It would bring consistency to ceph w... Stefan Kooman
12:42 PM Bug #22064: "RadosModel.h: 865: FAILED assert(0)" in rados-jewel-distro-basic-smithi
Pushed https://shaman.ceph.com/builds/ceph/wip-jewel-22064/ pointing to 900e8cfc0d3a8057c3528b5e1787560bb6c2f198 whic... Nathan Cutler
12:28 PM Bug #22064: "RadosModel.h: 865: FAILED assert(0)" in rados-jewel-distro-basic-smithi
I'm going through the jewel rados nightlies now to figure out when it started. Here is a list of SHA1s where the bug ... Nathan Cutler
11:12 AM Bug #22064: "RadosModel.h: 865: FAILED assert(0)" in rados-jewel-distro-basic-smithi
Raising priority because this is readily reproducible in baseline jewel. (By "baseline jewel" I mean jewel HEAD, curr... Nathan Cutler
12:36 PM Bug #21997 (Resolved): thrashosds defaults to min_in 3, some ec tests are (2,2)
Nathan Cutler
12:34 PM Backport #22391 (Resolved): luminous: thrashosds defaults to min_in 3, some ec tests are (2,2)
Nathan Cutler
02:13 AM Backport #22391 (Closed): luminous: thrashosds defaults to min_in 3, some ec tests are (2,2)
d21809b is already in luminous. Shinobu Kinjo
11:05 AM Bug #22438 (Fix Under Review): mon: leak in lttng dlopen / __tracepoints__init
https://github.com/ceph/ceph/pull/19515 Kefu Chai
10:17 AM Bug #22438: mon: leak in lttng dlopen / __tracepoints__init
now daemons are linked against ceph-common, and when ceph-common is dlopen'ed @__tracepoints__init()@ is called; when... Kefu Chai
02:18 AM Bug #22438 (Resolved): mon: leak in lttng dlopen / __tracepoints__init
In the 3 valgrind failures here: http://pulpito.ceph.com/yuriw-2017-12-12_20:47:55-rados-wip-yuri2-testing-2017-12-12... Josh Durgin
07:48 AM Backport #22400 (In Progress): jewel: PR #16172 causing performance regression
Nathan Cutler
07:14 AM Bug #22220: osd/ReplicatedPG.h:1667:14: internal compiler error: in force_type_die, at dwarf2out....
Kefu Chai
02:20 AM Bug #22220: osd/ReplicatedPG.h:1667:14: internal compiler error: in force_type_die, at dwarf2out....
How does https://github.com/ceph/ceph/pull/19461 fix the bug in gcc7? Brad Hubbard

12/13/2017

10:58 PM Backport #22399 (In Progress): luminous: Manager daemon x is unresponsive. No standby daemons ava...
https://github.com/ceph/ceph/pull/19501 Shinobu Kinjo
10:49 PM Backport #22402 (In Progress): luminous: osd: replica read can trigger cache promotion
https://github.com/ceph/ceph/pull/19499 Shinobu Kinjo
10:25 PM Backport #22405 (In Progress): jewel: store longer dup op information
Nathan Cutler
09:42 PM Bug #16236: cache/proxied ops from different primaries (cache interval change) don't order proper...
also in
http://pulpito.ceph.com/teuthology-2017-12-09_02:00:03-rados-jewel-distro-basic-smithi/
'1946610'
Yuri Weinstein
09:41 PM Bug #22063: "RadosModel.h: 1703: FAILED assert(!version || comp->get_version64() == version)" inr...
also in
http://pulpito.ceph.com/teuthology-2017-12-09_02:00:03-rados-jewel-distro-basic-smithi/
'1946540'
Yuri Weinstein
09:40 PM Bug #22064: "RadosModel.h: 865: FAILED assert(0)" in rados-jewel-distro-basic-smithi
Also in http://pulpito.ceph.com/teuthology-2017-12-09_02:00:03-rados-jewel-distro-basic-smithi/
Jobs: '1946659', '19...
Yuri Weinstein
02:34 PM Backport #22389 (In Progress): luminous: ceph-objectstore-tool: Add option "dump-import" to exami...
Nathan Cutler
02:21 PM Bug #22419 (Fix Under Review): Pool Compression type option doesn't apply to new OSD's
https://github.com/ceph/ceph/pull/19486 Sage Weil
02:11 PM Bug #22419: Pool Compression type option doesn't apply to new OSD's
Sage Weil
11:49 AM Bug #22419 (Resolved): Pool Compression type option doesn't apply to new OSD's
If you set the pool compression type option to something like snappy, existing bluestore OSD's will then start compre... Nick Fisk
02:13 PM Feature #22420: Add support for obtaining a list of available compression options
This came up on IRC: my suggestion was to include the list of usable plugins in the MOSDBoot metadata. John Spray
12:30 PM Feature #22420 (Resolved): Add support for obtaining a list of available compression options
According to the documentation, Ceph supports a variety of compression algorithms when creating Pools on BlueStore vi... Lenz Grimmer
01:16 PM Backport #22069: luminous: osd/ReplicatedPG.cc: recover_replicas: object added to missing set for...
opened http://tracker.ceph.com/issues/22423 for the backport; waiting for green light from Sage. Nathan Cutler
01:01 PM Backport #22069: luminous: osd/ReplicatedPG.cc: recover_replicas: object added to missing set for...
David Zafman wrote:
> This will backport more cleanly if we also backport #17708 (https://github.com/ceph/ceph/pull/...
Nathan Cutler
01:14 PM Backport #22423: luminous: osd: initial minimal efforts to clean up PG interface
h3. description
Backport of https://github.com/ceph/ceph/pull/17708 which does not have an associated tracker issue.
Nathan Cutler
01:13 PM Backport #22423 (Closed): luminous: osd: initial minimal efforts to clean up PG interface
Nathan Cutler
01:09 PM Backport #22421 (In Progress): mon doesn't send health status after paxos service is inactive tem...
Kefu Chai
12:55 PM Backport #22421 (Resolved): mon doesn't send health status after paxos service is inactive tempor...
https://github.com/ceph/ceph/pull/19481 Jan Fajerski
01:08 PM Support #22422 (New): Block fsid does not match our fsid
Hi, i'm deploying new OSDs with luminous and bluestore.
I'm trying with:
"ceph-disk prepare --bluestore /dev/sda --...
Juan Manuel Rey
12:33 PM Bug #22142 (Pending Backport): mon doesn't send health status after paxos service is inactive tem...
Kefu Chai
12:05 PM Bug #21557: osd.6 found snap mapper error on pg 2.0 oid 2:0e781f33:::smithi14431805-379 ... :187 ...
https://github.com/ceph/ceph/pull/19366 is merged for more verbose logs Kefu Chai
11:29 AM Bug #22220: osd/ReplicatedPG.h:1667:14: internal compiler error: in force_type_die, at dwarf2out....
@Kefu: thanks for the quick fix! Nathan Cutler
04:36 AM Bug #22220 (Fix Under Review): osd/ReplicatedPG.h:1667:14: internal compiler error: in force_type...
https://github.com/ceph/ceph/pull/19461 Kefu Chai
03:58 AM Bug #22220: osd/ReplicatedPG.h:1667:14: internal compiler error: in force_type_die, at dwarf2out....
The build in https://github.com/ceph/ceph/pull/19457 was done on Ubuntu :( Brad Hubbard
03:57 AM Bug #22220: osd/ReplicatedPG.h:1667:14: internal compiler error: in force_type_die, at dwarf2out....
please see https://github.com/ceph/ceph/pull/19426. that's why it popped up recently.
Nathan, to downgrade the GCC...
Kefu Chai
02:07 AM Bug #22220: osd/ReplicatedPG.h:1667:14: internal compiler error: in force_type_die, at dwarf2out....
... Brad Hubbard
01:22 AM Bug #22220: osd/ReplicatedPG.h:1667:14: internal compiler error: in force_type_die, at dwarf2out....
... Brad Hubbard
01:05 AM Bug #22220: osd/ReplicatedPG.h:1667:14: internal compiler error: in force_type_die, at dwarf2out....
Is Jenkins using Fedora? If not I'd suggest we create a bug against the appropriate OS and component. I suspect this ... Brad Hubbard
12:57 AM Bug #22220: osd/ReplicatedPG.h:1667:14: internal compiler error: in force_type_die, at dwarf2out....
Raising priority because this error is now affecting (all?) Jewel PRs. See e.g. https://github.com/ceph/ceph/pull/194... Nathan Cutler
05:29 AM Bug #22369: out of order reply on set-chunks.yaml workload
https://github.com/ceph/ceph/pull/19464 Myoungwon Oh
02:50 AM Bug #22415: 'pg dump' fails after mon rebuild
Probably the mgr commands didn't get included in the rebuilt mon? I think they should get set after the mgr daemon r... Sage Weil
02:50 AM Bug #22415 (Duplicate): 'pg dump' fails after mon rebuild
... Sage Weil
01:12 AM Bug #18239 (New): nan in ceph osd df again
Nathan Cutler
01:05 AM Bug #19700: OSD remained up despite cluster network being inactive?
Nathan Cutler
12:18 AM Bug #22413: can't delete object from pool when Ceph out of space
You can get around this by using rados_write_op_operate with the 'LIBRADOS_OPERATION_FULL_FORCE' flag (128), like the... Josh Durgin

12/12/2017

10:35 PM Bug #22413: can't delete object from pool when Ceph out of space
forgot to mention I get errors like this when it fills up:
192.168.203.54: 2017-12-12 22:33:58.369563 7f2f1cb7ee40...
Ben England
08:57 PM Bug #22413 (Resolved): can't delete object from pool when Ceph out of space
I ran into a situation where python librados script would hang while trying to delete an object when Ceph storage was... Ben England
03:16 PM Bug #22409 (Resolved): ceph_objectstore_tool: no flush before collection_empty() calls; ObjectSto...
Currently we need callers to flush the sequencer before collection_list (and thus collection_empty).
/a/sage-2017-...
Sage Weil
02:51 PM Bug #22408 (Can't reproduce): objecter: sent out of order ops
... Sage Weil
02:23 PM Bug #22232 (Duplicate): ceph df %used output is wrong
Closing as a duplicate of http://tracker.ceph.com/issues/22247 Nathan Cutler
01:55 PM Bug #22114: mon: ops get stuck in "resend forwarded message to leader"
hongpeng lu wrote:
> Hmm, actually, I don't know how to submit a PR.
https://github.com/ceph/ceph/blob/master/Sub...
Nathan Cutler
08:45 AM Backport #22406 (Rejected): jewel: osd: deletes are performed inline during pg log processing
https://github.com/ceph/ceph/pull/19558 Nathan Cutler
08:45 AM Backport #22405 (Rejected): jewel: store longer dup op information
-https://github.com/ceph/ceph/pull/19497-
https://github.com/ceph/ceph/pull/19558
Nathan Cutler
08:45 AM Backport #22403 (Resolved): jewel: osd: replica read can trigger cache promotion
https://github.com/ceph/ceph/pull/21199 Nathan Cutler
08:45 AM Backport #22402 (Resolved): luminous: osd: replica read can trigger cache promotion
Nathan Cutler
08:45 AM Backport #22400 (Rejected): jewel: PR #16172 causing performance regression
-https://github.com/ceph/ceph/pull/19497-
https://github.com/ceph/ceph/pull/19558
Nathan Cutler
08:45 AM Backport #22399 (Resolved): luminous: Manager daemon x is unresponsive. No standby daemons available
https://github.com/ceph/ceph/pull/19501 Nathan Cutler
08:43 AM Backport #22391 (Resolved): luminous: thrashosds defaults to min_in 3, some ec tests are (2,2)
https://github.com/ceph/ceph/pull/18702 Nathan Cutler
08:43 AM Backport #22390 (Rejected): jewel: ceph-objectstore-tool: Add option "dump-import" to examine an ...
https://github.com/ceph/ceph/pull/21193 Nathan Cutler
08:43 AM Backport #22389 (Resolved): luminous: ceph-objectstore-tool: Add option "dump-import" to examine ...
https://github.com/ceph/ceph/pull/19487 Nathan Cutler
08:43 AM Backport #22387 (Resolved): luminous: PG stuck in recovery_unfound
https://github.com/ceph/ceph/pull/20055 Nathan Cutler
07:26 AM Bug #22350: nearfull OSD count in 'ceph -w'
Greg Farnum wrote:
Hi Greg,
> Can you produce logs of the monitor doing this? With "debug mon = 20" set?
Not...
Richard Arends
01:39 AM Bug #22369: out of order reply on set-chunks.yaml workload
Let me take a look at this issue. Myoungwon Oh

12/11/2017

09:44 PM Bug #22350 (Need More Info): nearfull OSD count in 'ceph -w'
Greg Farnum
09:44 PM Bug #22350: nearfull OSD count in 'ceph -w'
Can you produce logs of the monitor doing this? With "debug mon = 20" set? Greg Farnum
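A hedged sketch of raising mon debug at runtime, then reverting to the defaults after reproducing the bad count:

ceph tell 'mon.*' injectargs '--debug-mon 20/20'
# ...watch 'ceph -w' until the wrong nearfull count appears, then:
ceph tell 'mon.*' injectargs '--debug-mon 1/5'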
09:29 PM Bug #19971 (Pending Backport): osd: deletes are performed inline during pg log processing
Ken Dreyer
09:20 PM Bug #22369: out of order reply on set-chunks.yaml workload
Sage Weil
09:19 PM Bug #22369 (Resolved): out of order reply on set-chunks.yaml workload
... Sage Weil

12/09/2017

03:40 AM Feature #22086 (Pending Backport): ceph-objectstore-tool: Add option "dump-import" to examine an ...
David Zafman

12/08/2017

07:29 PM Bug #22354: v12.2.2 unable to create bluestore osd using ceph-disk
Reproduction steps...
=======================
Stopping osd.112
# systemctl stop ceph-osd@112
Removing 112 fro...
Nokia ceph-users
06:35 PM Bug #22354 (Resolved): v12.2.2 unable to create bluestore osd using ceph-disk
Hello,
We are aware that ceph-disk is deprecated in 12.2.2. As part of my testing, I can still use this ceph...
Nokia ceph-users
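Since ceph-disk is deprecated in 12.2.2, the supported replacement is ceph-volume; a minimal sketch (the device name is a placeholder):

ceph-volume lvm create --bluestore --data /dev/sdX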
05:08 PM Bug #22346: OSD_ORPHAN issues after jewel->luminous upgrade, but orphaned osds not in crushmap
Interesting! It seems like we probably removed 30 osds from the old retired hardware, so it's curious that just 3 had... Graham Allan
07:27 AM Bug #22346: OSD_ORPHAN issues after jewel->luminous upgrade, but orphaned osds not in crushmap
So this is happening because entries for "device2", "device14", and "device19" still have entries in the "name_map" s... Brad Hubbard
12:13 AM Bug #22346: OSD_ORPHAN issues after jewel->luminous upgrade, but orphaned osds not in crushmap
Brad Hubbard
12:12 AM Bug #22346: OSD_ORPHAN issues after jewel->luminous upgrade, but orphaned osds not in crushmap
Thanks Graham,
I'll be taking a look into this. I can confirm I can reproduce the issue locally with the osdmaptoo...
Brad Hubbard
04:29 PM Bug #22142 (Fix Under Review): mon doesn't send health status after paxos service is inactive tem...
https://github.com/ceph/ceph/pull/19404 Jan Fajerski
03:51 PM Bug #22142 (In Progress): mon doesn't send health status after paxos service is inactive temporarily
Jan Fajerski
03:49 PM Bug #22142: mon doesn't send health status after paxos service is inactive temporarily
mon/MgrMonitor::send_digests() stops the periodic digests if PaxosService goes inactive for a time (say when a MON go... Jan Fajerski
08:30 AM Bug #22351 (Resolved): Couldn't init storage provider (RADOS)

2017-12-08 16:25:46.172119 7f12bf18de00 0 deferred set uid:gid to 167:167 (ceph:ceph)
2017-12-08 16:25:46.172...
Amine Liu
08:00 AM Bug #22350 (Resolved): nearfull OSD count in 'ceph -w'
Hello,
While looking at the 'ceph -w' output I noticed that sometimes the 'nearfull' information is wrong:
"201...
Richard Arends
07:14 AM Bug #22093: osd stuck in loop processing resent ops due to ms inject socket failures: 500
Sage, let's make it 1000 then, if it helps with the test. ... Kefu Chai
07:12 AM Bug #22349 (New): valgrind: Leak_StillReachable in rocksdb
... Kefu Chai
07:07 AM Bug #22278: FreeBSD fails to build with WITH_SPDK=ON
http://dpdk.org/dev/patchwork/patch/31865/ Kefu Chai

12/07/2017

10:39 PM Bug #22346 (Resolved): OSD_ORPHAN issues after jewel->luminous upgrade, but orphaned osds not in ...
Just updated a fairly long-lived (originally firefly) cluster from jewel to luminous 12.2.2.
One of the issues I se...
Graham Allan
03:32 AM Bug #20924: osd: leaked Session on osd.7
/a/sage-2017-12-06_22:54:32-rados-wip-sage-testing-2017-12-06-1352-distro-basic-smithi/1939984
osd.7 again!
Sage Weil
01:51 AM Feature #22086 (Fix Under Review): ceph-objectstore-tool: Add option "dump-import" to examine an ...
David Zafman
01:51 AM Feature #22086: ceph-objectstore-tool: Add option "dump-import" to examine an export
https://github.com/ceph/ceph/pull/19368 David Zafman

12/06/2017

11:56 PM Bug #22329: mon: Valgrind: mon (Leak_DefinitelyLost, Leak_IndirectlyLost)
I took a look and didn't see anything going through MDSMonitor.* or FSCommand.*.
It looks like a leaked session du...
Patrick Donnelly
10:12 PM Bug #22329: mon: Valgrind: mon (Leak_DefinitelyLost, Leak_IndirectlyLost)
We'll keep this here in case we see it elsewhere, but the leaks I see are of messages and the AuthSessions associated... Greg Farnum
06:27 AM Bug #22329 (Closed): mon: Valgrind: mon (Leak_DefinitelyLost, Leak_IndirectlyLost)
See: /ceph/teuthology-archive/pdonnell-2017-12-05_06:48:09-fs-wip-pdonnell-testing-20171205.044504-testing-basic-smit... Patrick Donnelly
11:49 PM Bug #21557: osd.6 found snap mapper error on pg 2.0 oid 2:0e781f33:::smithi14431805-379 ... :187 ...
saw this again,
/a/sage-2017-12-05_16:19:46-rados-mimic-dev1-distro-basic-smithi/1933230
still confused. it looks...
Sage Weil
11:00 PM Bug #22233: prime_pg_temp breaks on uncreated pgs
/a/sage-2017-12-05_18:31:27-rados-wip-pg-scrub-preempt-distro-basic-smithi/1934001 ? Sage Weil
06:33 AM Bug #22330: ec: src/common/interval_map.h: 161: FAILED assert(len > 0)
Another: /ceph/teuthology-archive/pdonnell-2017-12-05_06:54:06-kcephfs-wip-pdonnell-testing-20171205.044504-testing-b... Patrick Donnelly
06:31 AM Bug #22330 (Resolved): ec: src/common/interval_map.h: 161: FAILED assert(len > 0)
... Patrick Donnelly

12/05/2017

12:30 AM Bug #22064: "RadosModel.h: 865: FAILED assert(0)" in rados-jewel-distro-basic-smithi
Greg Farnum
12:23 AM Support #22132 (Resolved): OSDs stuck in "booting" state after catastrophic data loss
This isn't impossible but I believe you've gone about it the wrong way. See http://docs.ceph.com/docs/master/rados/tr... Greg Farnum
12:17 AM Bug #22144 (Can't reproduce): *** Caught signal (Aborted) ** in thread thread_name:tp_peering
This was discussed on the mailing list thread "[ceph-users] OSD Random Failures - Latest Luminous" and ended without ... Greg Farnum
12:07 AM Support #22224 (Resolved): memory leak
There was an issue in luminous where it was mis-estimating the amount of memory used in bluestore, but that is resolv... Greg Farnum