Activity

From 10/11/2017 to 11/09/2017

11/09/2017

09:52 AM Bug #22095 (Resolved): ceph status shows wrong number of objects
With a large number of objects, the output of ceph status is wrong. Consider the following:
ceph -s:
objects: 683...
Jan Fajerski
05:33 AM Bug #22093 (Resolved): osd stuck in loop processing resent ops due to ms inject socket failures: 500
... Kefu Chai
03:33 AM Bug #22092: ceph-kvstore-tool's store-crc command does not save result to the file as expected
https://github.com/ceph/ceph/pull/18815 Chang Liu
03:12 AM Bug #22092 (Resolved): ceph-kvstore-tool's store-crc command does not save result to the file as ...
Chang Liu

11/08/2017

11:32 PM Bug #22090 (Resolved): cluster [ERR] Unhandled exception from module 'balancer' while running on ...
... Sage Weil
11:13 PM Bug #21907 (Pending Backport): On pg repair the primary is not favored as was intended
David Zafman
05:36 PM Bug #22082 (Fix Under Review): Various odd clog messages for mons
https://github.com/ceph/ceph/pull/18822 John Spray
02:55 PM Bug #22082 (Resolved): Various odd clog messages for mons
We periodically see the address of a monitor printed at INFO level (this is coming from the timecheck code)
We see...
John Spray
05:35 PM Feature #22086 (Resolved): ceph-objectstore-tool: Add option "dump-import" to examine an export

For diagnostic purposes add dump-import option to examine exports. This can allow diagnostics of some issues with ...
David Zafman
05:18 PM Bug #22085 (Can't reproduce): jewel->luminous: "[ FAILED ] LibRadosAioEC.IsSafe" in upgrade:jew...
Run: http://pulpito.ceph.com/teuthology-2017-11-04_04:23:02-upgrade:jewel-x-luminous-distro-basic-smithi/
Jobs: 1811...
Yuri Weinstein
03:50 PM Bug #21833: Multiple asserts caused by DNE pgs left behind after lots of OSD restarts
I got the patch applied to my OSDs and I see the same thing that Michael Schmid reported. The original p.same_interva... Jonathan Light
07:40 AM Bug #22071 (New): ceph-osd segmentation fault due to lockdep turning off
Hello,
I've made the OSD process fail a few times by executing:
ceph tell osd.55 injectargs '--lockdep=false'
all times ...
alex kriulkin
06:37 AM Bug #21925: cluster capacity is much more smaller than it should be
Can you show the message from "ceph osd df"? jianpeng ma
04:07 AM Bug #22039 (Fix Under Review): bluestore: segv during unmount
https://github.com/ceph/ceph/pull/18805 Sage Weil
02:24 AM Backport #22069 (Resolved): luminous: osd/ReplicatedPG.cc: recover_replicas: object added to miss...
https://github.com/ceph/ceph/pull/20081
With http://tracker.ceph.com/issues/21653
David Zafman
01:45 AM Bug #18162 (Pending Backport): osd/ReplicatedPG.cc: recover_replicas: object added to missing set...
We had an immediate need for this to be backported to jewel. Been busy on other things. Setting this to pending bac... David Zafman

11/07/2017

10:29 PM Bug #21825: OSD won't stay online and crashes with abort
My initial attempt to import I used --op remove since this was a bluestore, the initial attempt also crashed, now I'm... Jérôme Poulin
02:18 AM Bug #21825: OSD won't stay online and crashes with abort

The crash of the ceph-objectstore-tool would be caused by removing a PG using "rm -rf" and then trying to import th...
David Zafman
10:05 PM Bug #18859: kraken monitor fails to bootstrap off jewel monitors if it has booted before
Yikes. A 9 month old ticket. I'm sorry. This must have fallen through all the cracks.
Let me take a look this week...
Joao Eduardo Luis
09:02 PM Bug #18859: kraken monitor fails to bootstrap off jewel monitors if it has booted before
This is also the case for going from jewel to luminous.
Our question: Is this something you're planning to...
Kjetil Joergensen
09:21 PM Bug #21388: inconsistent pg but repair does nothing reporting head data_digest != data_digest fro...

We should add code to be_select_auth_object() that checks that data_digest and omap_digest. If we do that then if ...
David Zafman
09:11 PM Bug #21388: inconsistent pg but repair does nothing reporting head data_digest != data_digest fro...

In a case where an object's snapshot is seeing an error, we can't use rados to get and put. Use the following proce...
David Zafman
06:13 PM Bug #22063: "RadosModel.h: 1703: FAILED assert(!version || comp->get_version64() == version)" inr...
Could be related to, or a dupe of, http://tracker.ceph.com/issues/22064 Yuri Weinstein
06:11 PM Bug #22063 (Duplicate): "RadosModel.h: 1703: FAILED assert(!version || comp->get_version64() == v...
Run: http://pulpito.ceph.com/teuthology-2017-11-04_02:00:03-rados-jewel-distro-basic-smithi/
Jobs: '1809487', '18094...
Yuri Weinstein
06:12 PM Bug #22064 (Duplicate): "RadosModel.h: 865: FAILED assert(0)" in rados-jewel-distro-basic-smithi
Run: http://pulpito.ceph.com/teuthology-2017-11-04_02:00:03-rados-jewel-distro-basic-smithi/
Jobs: '1809520', '18096...
Yuri Weinstein
04:15 PM Bug #18162: osd/ReplicatedPG.cc: recover_replicas: object added to missing set for backfill, but ...
Is there a reason this hasn't been marked for backporting to Luminous yet? (Apologies if this is just because you h... Alastair Dewhurst
03:38 AM Bug #22039: bluestore: segv during unmount
http://pulpito.ceph.com/pdonnell-2017-11-06_22:14:57-fs-wip-pdonnell-testing-20171106.200337-testing-basic-smithi/182... Patrick Donnelly

11/06/2017

10:16 PM Bug #22039: bluestore: segv during unmount
http://pulpito.ceph.com/pdonnell-2017-11-06_22:14:57-fs-wip-pdonnell-testing-20171106.200337-testing-basic-smithi/
...
Patrick Donnelly
01:36 PM Bug #22039: bluestore: segv during unmount
Saw this in http://tracker.ceph.com/issues/21830#note-23 as well Abhishek Lekshmanan
06:18 PM Bug #22047: tests: scrubbing terminated not all pgs were a+c
Maybe a timing issue? Abhishek Lekshmanan
04:07 PM Bug #22047 (New): tests: scrubbing terminated not all pgs were a+c
Seen at http://qa-proxy.ceph.com/teuthology/abhi-2017-11-05_16:35:44-rados-wip-abhi-testing-2017-11-05-1320-distro-ba... Abhishek Lekshmanan
05:51 PM Bug #22052 (Resolved): ceph-mon: possible Leak in OSDMap::build_simple_optioned
run: http://qa-proxy.ceph.com/teuthology/abhi-2017-11-06_15:37:57-rgw-wip-abhi-testing-2017-11-05-1320-distro-basic-s... Abhishek Lekshmanan
04:45 PM Bug #22049: crushtool fails to compile its own output, failing with error: parse error at ''
It looks like it's related to the "default~hdd" entry, and is similar to http://tracker.ceph.com/issues/4779
The quest...
Vladimir Pouzanov
04:16 PM Bug #22049 (New): crushtool fails to compile its own output, failing with error: parse error at ''
A crushmap that was previously working with kraken is now failing in luminous. I've re-created the test cluster from ... Vladimir Pouzanov
04:35 PM Bug #22050 (Resolved): ERROR type entries of pglog do not update min_last_complete_ondisk, potent...
We use the rbd discard API to zero the whole range of a very big volume. Many extents of this volume are yet to be written, b... mingxin liu
02:35 AM Bug #22045: OSDMonitor: osd down by monitor is delayed
https://github.com/ceph/ceph/pull/18758 Tang Jin
02:25 AM Bug #22045: OSDMonitor: osd down by monitor is delayed
After new election, the leader monitor doesn't change. The leader will receive many OSDBeacons of part of down osds b... Tang Jin
02:15 AM Bug #22045 (New): OSDMonitor: osd down by monitor is delayed
The cluster is a 3-host cluster and each host has a monitor, a mgr and several osds. The options are all default except m... Tang Jin
01:58 AM Bug #21833 (In Progress): Multiple asserts caused by DNE pgs left behind after lots of OSD restarts
The fix wasn't sufficient. David Zafman

11/05/2017

07:00 AM Backport #21636 (In Progress): luminous: ceph-monstore-tool --readable mode doesn't understand FS...
https://github.com/ceph/ceph/pull/18754 Shinobu Kinjo
06:54 AM Backport #21697 (In Progress): luminous: OSDService::recovery_need_sleep read+updated without loc...
https://github.com/ceph/ceph/pull/18753 Shinobu Kinjo
06:32 AM Backport #21701 (In Progress): luminous: ceph-kvstore-tool does not call bluestore's umount when ...
https://github.com/ceph/ceph/pull/18751 Shinobu Kinjo
06:29 AM Backport #21702 (In Progress): luminous: BlueStore::umount will crash when the BlueStore is opene...
https://github.com/ceph/ceph/pull/18750 Shinobu Kinjo
06:26 AM Backport #21785 (In Progress): luminous: OSDMap cache assert on shutdown
https://github.com/ceph/ceph/pull/18749 Shinobu Kinjo
04:38 AM Backport #21794 (In Progress): luminous: backoff causes out of order op
https://github.com/ceph/ceph/pull/18747 Shinobu Kinjo
03:59 AM Backport #21921 (In Progress): luminous: Objecter::_send_op unnecessarily constructs costly hobje...
https://github.com/ceph/ceph/pull/18745 Shinobu Kinjo
03:50 AM Backport #21922 (In Progress): luminous: Objecter::C_ObjectOperation_sparse_read throws/catches e...
https://github.com/ceph/ceph/pull/18744 Shinobu Kinjo
03:45 AM Backport #21923 (In Progress): jewel: Objecter::C_ObjectOperation_sparse_read throws/catches exce...
https://github.com/ceph/ceph/pull/18743 Shinobu Kinjo
03:42 AM Backport #21924 (In Progress): luminous: ceph_test_objectstore fails ObjectStore/StoreTest.Synthe...
https://github.com/ceph/ceph/pull/18742 Shinobu Kinjo
03:35 AM Backport #22019 (In Progress): luminous: "ceph osd create" is not idempotent
https://github.com/ceph/ceph/pull/18741 Shinobu Kinjo

11/04/2017

03:52 PM Bug #21833 (Resolved): Multiple asserts caused by DNE pgs left behind after lots of OSD restarts
Sage Weil
10:20 AM Bug #21833: Multiple asserts caused by DNE pgs left behind after lots of OSD restarts
David Zafman wrote:
> Tentative fix which still needs testing. Anyone seeing this can try applying this to their OS...
Michael Schmid

11/03/2017

09:57 PM Bug #22039 (Need More Info): bluestore: segv during unmount
https://github.com/ceph/ceph/pull/18714 to get better debug output Sage Weil
09:30 PM Bug #22039 (Resolved): bluestore: segv during unmount
... Patrick Donnelly
03:49 PM Backport #22019 (Resolved): luminous: "ceph osd create" is not idempotent
https://github.com/ceph/ceph/pull/18741 Nathan Cutler
03:45 PM Backport #21150: jewel: tests: btrfs copy_clone returns errno 95 (Operation not supported)
h3. description... Nathan Cutler
01:35 PM Backport #21150 (Resolved): jewel: tests: btrfs copy_clone returns errno 95 (Operation not suppor...
Kefu Chai
05:56 AM Bug #20616 (Resolved): pre-luminous: aio_read returns erroneous data when rados_osd_op_timeout is...
Kefu Chai
05:56 AM Backport #21308 (Resolved): jewel: pre-luminous: aio_read returns erroneous data when rados_osd_o...
Kefu Chai
03:55 AM Bug #22000 (Can't reproduce): test_mon_misc fails in qa/workunits/cephtool/test.sh
not able to reproduce it locally, will reopen it if it happens again. Kefu Chai
03:28 AM Backport #22013: jewel: osd/ReplicatedPG.cc: recover_replicas: object added to missing set for ba...
Breakdown of commits:
6cb93f7d3406efb8955770aef21e2dc791958b0a from pull 11921
6a78b81f37fd6fb495672142fe9d135d...
David Zafman
03:26 AM Backport #22013 (Resolved): jewel: osd/ReplicatedPG.cc: recover_replicas: object added to missing...
https://github.com/ceph/ceph/pull/18690
David Zafman
02:55 AM Bug #21144: daemon-helper: command crashed with signal 1
Hello,
I managed to bump into this during a Jewel -> Luminous upgrade using the docker ceph/daemon container image...
Hey Pas

11/01/2017

05:43 PM Bug #18749: OSD: allow EC PGs to do recovery below min_size
I still owe teuthology tests on this... Greg Farnum
05:09 PM Bug #21833 (In Progress): Multiple asserts caused by DNE pgs left behind after lots of OSD restarts

Tentative fix which still needs testing. Anyone seeing this can try applying this to their OSDs and report back.
...
David Zafman
03:12 PM Bug #21993 (Pending Backport): "ceph osd create" is not idempotent
add backport=luminous, because i think it could help to silence some false alarms in rados qa run with luminous branc... Kefu Chai
02:58 AM Bug #21993 (Fix Under Review): "ceph osd create" is not idempotent
https://github.com/ceph/ceph/pull/18659 Kefu Chai
02:34 AM Bug #21993 (Resolved): "ceph osd create" is not idempotent
It is not idempotent without the "id" and "uuid" parameters.
But since it's considered deprecated, this ticket is just for the r...
Kefu Chai
03:10 PM Bug #22000 (Can't reproduce): test_mon_misc fails in qa/workunits/cephtool/test.sh
... Kefu Chai
01:32 PM Bug #21997 (Resolved): thrashosds defaults to min_in 3, some ec tests are (2,2)
/a/sage-2017-11-01_01:03:53-rados-wip-sage2-testing-2017-10-31-1354-distro-basic-smithi/1797020
see https://github...
Sage Weil

10/31/2017

11:22 PM Bug #21992 (Duplicate): osd: src/common/interval_map.h: 161: FAILED assert(len > 0)
... Patrick Donnelly

10/30/2017

07:57 PM Bug #21977 (Resolved): null map from OSDService::get_map in advance_pg
Run: http://pulpito.ceph.com/teuthology-2017-10-30_04:23:02-upgrade:jewel-x-luminous-distro-basic-ovh/
Jobs: 1791436...
Yuri Weinstein
04:54 AM Bug #21833: Multiple asserts caused by DNE pgs left behind after lots of OSD restarts
Hello,
I wanted to report that I seem to be running into this bug as well.
I had a very overloaded node that OOM'...
Jonathan Light
02:35 AM Bug #21965 (Can't reproduce): mon/MonClient.cc: 478: FAILED assert(authenticate_err == 0)
... Sage Weil

10/29/2017

05:08 PM Bug #21833: Multiple asserts caused by DNE pgs left behind after lots of OSD restarts
I just wanted to report that I managed to produce what certainly seems to be the same error on Ubuntu's 12.2.0 releas... Michael Schmid

10/26/2017

09:14 PM Backport #21544 (Resolved): luminous: mon osd feature checks for osdmap flags and require-osd-rel...
Sage Weil
09:13 PM Backport #21693 (Resolved): luminous: interval_map.h: 161: FAILED assert(len > 0)
Sage Weil
12:10 AM Bug #21931: osd: src/osd/ECBackend.cc: 2164: FAILED assert((offset + length) <= (range.first.get_...
Logs and coredump saved in: teuthology:/home/pdonnell/1773350 Patrick Donnelly

10/25/2017

10:16 PM Bug #21931: osd: src/osd/ECBackend.cc: 2164: FAILED assert((offset + length) <= (range.first.get_...
Dead OSD is accessible at smithi171 as of now.
http://pulpito.ceph.com/pdonnell-2017-10-25_18:05:03-kcephfs-wip-pd...
Patrick Donnelly
10:15 PM Bug #21931 (Resolved): osd: src/osd/ECBackend.cc: 2164: FAILED assert((offset + length) <= (range...
... Patrick Donnelly
06:22 PM Bug #21388: inconsistent pg but repair does nothing reporting head data_digest != data_digest fro...

Wei Jin,
According to the inconsistent output, the object info said the object size is 3461120, but all the shar...
David Zafman
10:19 AM Bug #21388: inconsistent pg but repair does nothing reporting head data_digest != data_digest fro...
I also ran into the same issue and fixed it with the rados get/put commands.
The inconsistent info is not exactly the same ...
wei jin
09:07 AM Bug #21925 (Need More Info): cluster capacity is much more smaller than it should be
Hi all,
I built a ceph cluster on a single host, with 1 mon and 3 osds. One osd was created on a file path, two osds crea...
Jing Li
08:14 AM Backport #21924 (Resolved): luminous: ceph_test_objectstore fails ObjectStore/StoreTest.Synthetic...
Nathan Cutler
08:14 AM Backport #21923 (Resolved): jewel: Objecter::C_ObjectOperation_sparse_read throws/catches excepti...
https://github.com/ceph/ceph/pull/18743 Nathan Cutler
08:14 AM Backport #21922 (Resolved): luminous: Objecter::C_ObjectOperation_sparse_read throws/catches exce...
https://github.com/ceph/ceph/pull/18744 Nathan Cutler
08:13 AM Backport #21921 (Resolved): luminous: Objecter::_send_op unnecessarily constructs costly hobject_t
https://github.com/ceph/ceph/pull/18745 Nathan Cutler
05:55 AM Bug #21880 (Resolved): ObjectStore/StoreTest.Synthetic/1 (filestore) fails with fiemap enabled
Kefu Chai
02:36 AM Bug #21878 (Resolved): bluefs: os/bluestore/BlueFS.cc: 1505: FAILED assert(h->file->fnode.ino != 1)
Sage Weil
02:35 AM Bug #21766 (Resolved): os/bluestore/bluestore_types.h: 740: FAILED assert(p != extents.end()) (ec...
Sage Weil

10/24/2017

08:16 PM Bug #21907 (Resolved): On pg repair the primary is not favored as was intended

The commit cd0d8b0 was supposed to favor the primary as the authoritative copy, but did the opposite.
David Zafman
12:11 PM Feature #21902 (New): Support bytearray in python binding
This will avoid some memory copy to prepare data for rados.
But this requires cython>=0.20, see
https://github....
Mehdi Abaakouk
06:33 AM Bug #21847: osd frequently been marked down and up

Hi Sage, this same issue happened again. Is there any possible reason for this issue?
How should I go forwa...
dongdong tao
03:53 AM Bug #21573 (Resolved): [upgrade] buffer::list ABI broken in luminous release
Kefu Chai
03:53 AM Backport #21899 (Resolved): luminous: [upgrade] buffer::list ABI broken in luminous release
https://github.com/ceph/ceph/pull/18491 Kefu Chai
03:52 AM Backport #21899 (Resolved): luminous: [upgrade] buffer::list ABI broken in luminous release
https://github.com/ceph/ceph/pull/18491 Kefu Chai
03:30 AM Bug #21878 (Pending Backport): bluefs: os/bluestore/BlueFS.cc: 1505: FAILED assert(h->file->fnode...
Sage Weil
02:35 AM Bug #21766: os/bluestore/bluestore_types.h: 740: FAILED assert(p != extents.end()) (ec + compress...
backport https://github.com/ceph/ceph/pull/18501 Sage Weil
02:34 AM Bug #21766 (Pending Backport): os/bluestore/bluestore_types.h: 740: FAILED assert(p != extents.en...
Sage Weil

10/23/2017

03:53 PM Bug #21573 (Pending Backport): [upgrade] buffer::list ABI broken in luminous release
Sage Weil
03:31 PM Bug #21846: Default ms log level results in ~40% performance degradation on RBD 4K random read IO
Ken Dreyer wrote:
> How should we indicate that PR 18418 needs to go into Luminous (v12.2.2?)
Using the magic for...
Jason Dillaman
03:00 PM Bug #21846: Default ms log level results in ~40% performance degradation on RBD 4K random read IO
Jason Dillaman wrote:
> I posted PR https://github.com/ceph/ceph/pull/18418 as a temporary workaround for clients. I...
Ken Dreyer

10/22/2017

05:32 AM Bug #21847: osd frequently been marked down and up
From dmesg, I found lots of "libceph: osd.6 down"
The disks are all good. If you have never seen this kind of weird log, ...
dongdong tao
03:17 AM Bug #21887 (Duplicate): degraded calculation is off during backfill
Sage Weil

10/21/2017

07:53 PM Bug #21887: degraded calculation is off during backfill
We should backport the fix to luminous. It is confusing/scary that the 'degraded' health warning comes up during a r... Sage Weil
04:23 PM Bug #21887 (Duplicate): degraded calculation is off during backfill
The PG is active+remapped+backfill_wait. There are 2 backfill targets, and 3 acting which are all up to date. There... Sage Weil
05:48 PM Bug #21750: scrub stat mismatch on bytes
Yeah, same here. https://github.com/ceph/ceph/pull/18396 was included in my run.
http://pulpito.ceph.com/sage-201...
Sage Weil
01:08 PM Bug #21750: scrub stat mismatch on bytes
Seeing more scrub-errors after https://github.com/ceph/ceph/pull/18396 is applied.
http://pulpito.ceph.com/sage-20...
xie xingguo
05:47 PM Bug #21844 (Pending Backport): Objecter::C_ObjectOperation_sparse_read throws/catches exceptions ...
Sage Weil
05:45 PM Bug #21845 (Pending Backport): Objecter::_send_op unnecessarily constructs costly hobject_t
Sage Weil
04:16 PM Bug #20759: mon: valgrind detects a few leaks
/a/kchai-2017-10-21_09:27:38-rados-wip-kefu-testing-2017-10-21-1049-distro-basic-mira/1757648/remote/mira121/log/ceph... Kefu Chai
04:10 AM Backport #21543 (Resolved): luminous: bluestore fsck took 224.778802 seconds to complete which ca...
Sage Weil
04:09 AM Backport #21783 (Resolved): luminous: cli/crushtools/build.t sometimes fails in jenkins' "make ch...
Sage Weil
02:49 AM Bug #21880 (Fix Under Review): ObjectStore/StoreTest.Synthetic/1 (filestore) fails with fiemap en...
Kefu Chai

10/20/2017

09:33 PM Bug #21880: ObjectStore/StoreTest.Synthetic/1 (filestore) fails with fiemap enabled
https://github.com/ceph/ceph/pull/18452 Sage Weil
09:29 PM Bug #21880 (Resolved): ObjectStore/StoreTest.Synthetic/1 (filestore) fails with fiemap enabled
... Sage Weil
09:29 PM Bug #21716: ObjectStore/StoreTest.FiemapHoles/3 fails with kstore
Enabling fiemap makes FileStore's Synthetic/1 fail reliably:... Sage Weil
03:27 PM Bug #21825: OSD won't stay online and crashes with abort
Would you be interested in having a copy of the 2 GB PG which causes ceph-objectstore-tool to crash? Jérôme Poulin
03:25 PM Bug #21825: OSD won't stay online and crashes with abort
I did a quick check on my 4 hosts and jemalloc is not enabled. The cluster is now back to active+clean. Jérôme Poulin
02:25 PM Bug #21825: OSD won't stay online and crashes with abort
Can you confirm you're not using jemalloc (check /etc/{default,sysconfig}/ceph)?
Sage Weil
02:27 PM Bug #21846: Default ms log level results in ~40% performance degradation on RBD 4K random read IO
I posted PR https://github.com/ceph/ceph/pull/18418 as a temporary workaround for clients. I figured I would leave th... Jason Dillaman
02:19 PM Bug #21846: Default ms log level results in ~40% performance degradation on RBD 4K random read IO
Two options?
1. Just set debug ms = 0 by default for clients.
2. Fix the async msgr to not log the second message...
Sage Weil
02:17 PM Bug #21847 (Need More Info): osd frequently been marked down and up
Is there anything in 'dmesg' output? Maybe a bad disk? Sage Weil
01:54 PM Bug #21845: Objecter::_send_op unnecessarily constructs costly hobject_t
Indeed -- in the OSDs. I only benchmarked the librbd clients under high IOPS workloads and that call is only executed... Jason Dillaman
01:46 PM Bug #21845: Objecter::_send_op unnecessarily constructs costly hobject_t
We have lots of "xx == hobject_t()" checks in the code... Haomai Wang
01:43 PM Bug #21845: Objecter::_send_op unnecessarily constructs costly hobject_t
... I should also note that in unrelated "perf record" sessions for the "debug ms = 0/1" performance degradations, yo... Jason Dillaman
01:39 PM Bug #21845: Objecter::_send_op unnecessarily constructs costly hobject_t
multiple runs of "perf record" didn't lie -- and neither did the fact that moving it increased performance by ~10% un... Jason Dillaman
01:36 PM Bug #21845: Objecter::_send_op unnecessarily constructs costly hobject_t
of course, it's a good cleanup Haomai Wang
01:35 PM Bug #21845: Objecter::_send_op unnecessarily constructs costly hobject_t
I really don't agree that this will cause a 10% performance degradation... the construction should be at the nanosecond level. Haomai Wang
01:33 PM Bug #21845 (Fix Under Review): Objecter::_send_op unnecessarily constructs costly hobject_t
*PR*: https://github.com/ceph/ceph/pull/18427 Jason Dillaman
01:31 PM Bug #21845 (In Progress): Objecter::_send_op unnecessarily constructs costly hobject_t
Jason Dillaman
01:51 PM Bug #21878 (Fix Under Review): bluefs: os/bluestore/BlueFS.cc: 1505: FAILED assert(h->file->fnode...
https://github.com/ceph/ceph/pull/18428 Sage Weil
01:29 PM Bug #21878 (Resolved): bluefs: os/bluestore/BlueFS.cc: 1505: FAILED assert(h->file->fnode.ino != 1)
... Sage Weil
01:38 PM Bug #21842 (Resolved): "repair kvstore failed" in qa/workunits/cephtool/test_kvstore_tool.sh
Kefu Chai
09:29 AM Backport #21872 (Resolved): jewel: ObjectStore/StoreTest.FiemapHoles/3 fails with kstore
https://github.com/ceph/ceph/pull/20143 Nathan Cutler
09:29 AM Backport #21871 (Rejected): luminous: ObjectStore/StoreTest.FiemapHoles/3 fails with kstore
-https://github.com/ceph/ceph/pull/20448- Nathan Cutler
05:30 AM Bug #21573 (Fix Under Review): [upgrade] buffer::list ABI broken in luminous release
https://github.com/ceph/ceph/pull/18408 Kefu Chai
01:24 AM Backport #21693 (Fix Under Review): luminous: interval_map.h: 161: FAILED assert(len > 0)
Anonymous

10/19/2017

09:47 PM Bug #21204 (Resolved): DNS SRV default service name not used anymore
Nathan Cutler
09:43 PM Bug #21365 (Resolved): Daemons(OSD, Mon...) exit abnormally at injectargs command
Nathan Cutler
09:42 PM Backport #21343 (Resolved): luminous: DNS SRV default service name not used anymore
Sage Weil
09:40 PM Backport #21438 (Resolved): luminous: Daemons(OSD, Mon...) exit abnormally at injectargs command
Sage Weil
01:37 PM Bug #21844 (Fix Under Review): Objecter::C_ObjectOperation_sparse_read throws/catches exceptions ...
*PR*: https://github.com/ceph/ceph/pull/18400 Jason Dillaman
01:04 PM Bug #21844 (In Progress): Objecter::C_ObjectOperation_sparse_read throws/catches exceptions on -E...
Jason Dillaman
01:56 AM Bug #21844 (Resolved): Objecter::C_ObjectOperation_sparse_read throws/catches exceptions on -ENOENT
Running RBD small IO performance tests against a mostly sparse image shows that the Objecter is throwing/catching a b... Jason Dillaman
01:30 PM Bug #21750: scrub stat mismatch on bytes
https://github.com/ceph/ceph/pull/18396 probably fixes this! Sage Weil
12:15 PM Backport #21783 (In Progress): luminous: cli/crushtools/build.t sometimes fails in jenkins' "make...
Nathan Cutler
09:14 AM Bug #21573: [upgrade] buffer::list ABI broken in luminous release
this would be a little bit tricky:... Kefu Chai
08:54 AM Backport #21150 (Fix Under Review): jewel: tests: btrfs copy_clone returns errno 95 (Operation no...
https://github.com/ceph/ceph/pull/18165 Kefu Chai
07:35 AM Bug #21842 (Fix Under Review): "repair kvstore failed" in qa/workunits/cephtool/test_kvstore_tool.sh
Kefu Chai
07:34 AM Bug #21842: "repair kvstore failed" in qa/workunits/cephtool/test_kvstore_tool.sh
> I can't figure out what creates "stat file: db". I guess we use this "stat file: db" as our dbname.
and it cons...
Kefu Chai
06:44 AM Bug #21842: "repair kvstore failed" in qa/workunits/cephtool/test_kvstore_tool.sh
Kefu, I know why repair failed.
rocksdb's Env imports a new member function called AreSameFiles. But our BlueRocks...
Chang Liu
06:31 AM Bug #21842: "repair kvstore failed" in qa/workunits/cephtool/test_kvstore_tool.sh
Chang, no worries. i am fixing it. Kefu Chai
03:36 AM Bug #21842: "repair kvstore failed" in qa/workunits/cephtool/test_kvstore_tool.sh
I can't figure out what creates "stat file: db". I guess we use this "stat file: db" as our dbname. Chang Liu
03:13 AM Bug #21842: "repair kvstore failed" in qa/workunits/cephtool/test_kvstore_tool.sh
Rocksdb::RepairDB tries to find all files: https://github.com/facebook/rocksdb/blob/master/db/repair.cc#L168, then it... Chang Liu
02:14 AM Bug #21842: "repair kvstore failed" in qa/workunits/cephtool/test_kvstore_tool.sh
Working on it Chang Liu
01:32 AM Bug #21842 (Resolved): "repair kvstore failed" in qa/workunits/cephtool/test_kvstore_tool.sh
... Kefu Chai
03:53 AM Bug #21847 (Need More Info): osd frequently been marked down and up
Our ceph version is 10.2.5.
We have encountered an issue where one of our OSDs has been marked down and up about 3 times ...
dongdong tao
02:41 AM Bug #21846 (Closed): Default ms log level results in ~40% performance degradation on RBD 4K rando...
Luminous is now 15% slower than Jewel and over 40% slower as compared to when the ms logs are disabled.
v10.2.10 d...
Jason Dillaman
02:25 AM Bug #21845 (Resolved): Objecter::_send_op unnecessarily constructs costly hobject_t
With zero backoffs, just constructing an hobject_t ("hobject_t hoid = op->target.get_hobj();") results in an approxim... Jason Dillaman

10/18/2017

11:30 PM Bug #21833: Multiple asserts caused by DNE pgs left behind after lots of OSD restarts

In the context of the newly created PGs:
pg[10.5a5s3( DNE empty local-lis/les=0/0 n=0 ec=0/0 lis/c 0/0 les/c/f 0...
David Zafman
09:12 PM Bug #21833 (Resolved): Multiple asserts caused by DNE pgs left behind after lots of OSD restarts
... Greg Farnum
09:46 PM Bug #21825: OSD won't stay online and crashes with abort
I had a chance to try and rm osd 3 today and replace the hard disk with a new one, no crash so far, it is rebalancing... Jérôme Poulin
06:26 AM Bug #21825: OSD won't stay online and crashes with abort
I think there is more to this, after active+clean, I shutdown osd.3 and then the PG went active+clean+snaptrim then o... Jérôme Poulin
05:11 AM Bug #21825: OSD won't stay online and crashes with abort
After tampering around with OSD kill and starting many, marking lost and unfound, I finally was able to recover all b... Jérôme Poulin
04:26 AM Bug #21825: OSD won't stay online and crashes with abort
You should bump up the OSD logging to see more of what is happening. David Zafman
03:33 AM Bug #21825 (Closed): OSD won't stay online and crashes with abort
I have an issue where 2 OSDs can't stay up at the same time and one will crash the other causing down PGs,
Exporti...
Jérôme Poulin
05:36 PM Bug #20243: Improve size scrub error handling and ignore system attrs in xattr checking

If we wanted to backport to Jewel it would be helpful to include this pull request first.
https://github.com/cep...
David Zafman

10/17/2017

09:28 PM Bug #21823 (Can't reproduce): on_flushed: object ... obc still alive (ec + cache tiering)
... Sage Weil
08:41 PM Bug #21573: [upgrade] buffer::list ABI broken in luminous release
@Kefu can you pls take a look? Yuri Weinstein
08:40 PM Backport #21544 (Fix Under Review): luminous: mon osd feature checks for osdmap flags and require...
Anonymous
08:20 PM Backport #21544 (In Progress): luminous: mon osd feature checks for osdmap flags and require-osd-...
Anonymous
07:03 PM Backport #21543 (Fix Under Review): luminous: bluestore fsck took 224.778802 seconds to complete ...
Anonymous
06:58 PM Backport #21543 (In Progress): luminous: bluestore fsck took 224.778802 seconds to complete which...
Anonymous
06:40 PM Feature #21760: add tools to stress RADOS omap
https://github.com/ceph/ceph/pull/18361 Douglas Fuller
05:29 PM Bug #21744 (Resolved): Core when `ceph-kvstore-tool exists`
Chang Liu
12:41 PM Bug #19198 (Closed): Bluestore doubles mem usage when caching object content
I talked to Igor. It seems this really is a non-bug, as the UTs use the glibc allocator. A follow-up will be to us... Mohamad Gebai
04:56 AM Bug #21818 (Resolved): ceph_test_objectstore fails ObjectStore/StoreTest.Synthetic/1 (filestore) ...
... Kefu Chai
02:50 AM Bug #16279: assert(objiter->second->version > last_divergent_update) failed
same problem: http://tracker.ceph.com/issues/21174 huang jun

10/16/2017

11:29 PM Bug #20981: ./run_seed_to_range.sh errored out
I was never able to reproduce this with the following command line test.
rm -rf /tmp/td td ; mkdir /tmp/td td ; cd...
David Zafman
09:12 PM Bug #18162 (Fix Under Review): osd/ReplicatedPG.cc: recover_replicas: object added to missing set...
https://github.com/ceph/ceph/pull/18145 David Zafman
06:41 AM Bug #20053: crush compile / decompile looses precision on weight
Loïc Dachary

10/13/2017

08:48 PM Bug #21750 (In Progress): scrub stat mismatch on bytes
Sage Weil
08:48 PM Bug #21766: os/bluestore/bluestore_types.h: 740: FAILED assert(p != extents.end()) (ec + compress...
Sage Weil
05:15 PM Bug #21716 (Pending Backport): ObjectStore/StoreTest.FiemapHoles/3 fails with kstore
Kefu Chai
05:15 PM Bug #21716 (Resolved): ObjectStore/StoreTest.FiemapHoles/3 fails with kstore
Kefu Chai
12:13 PM Backport #21794 (Resolved): luminous: backoff causes out of order op
Nathan Cutler
12:13 PM Backport #21786 (Resolved): jewel: OSDMap cache assert on shutdown
https://github.com/ceph/ceph/pull/21184 Nathan Cutler
12:13 PM Backport #21785 (Resolved): luminous: OSDMap cache assert on shutdown
Nathan Cutler
12:13 PM Backport #21784 (Resolved): jewel: cli/crushtools/build.t sometimes fails in jenkins' "make check...
https://github.com/ceph/ceph/pull/21158 Nathan Cutler
12:12 PM Backport #21783 (Resolved): luminous: cli/crushtools/build.t sometimes fails in jenkins' "make ch...
https://github.com/ceph/ceph/pull/18398 Nathan Cutler
04:11 AM Bug #21603 (Resolved): rocksdb is using slow crc
Kefu Chai
03:30 AM Bug #18162: osd/ReplicatedPG.cc: recover_replicas: object added to missing set for backfill, but ...
David Zafman

10/12/2017

08:08 PM Bug #21737 (Pending Backport): OSDMap cache assert on shutdown
Greg Farnum
05:32 PM Feature #21760 (In Progress): add tools to stress RADOS omap
Douglas Fuller
04:16 PM Bug #21750: scrub stat mismatch on bytes
http://pulpito.front.sepia.ceph.com/yuriw-2017-10-11_19:25:41-rados-wip-yuri3-testing-2017-10-11-1645-distro-basic-sm... Sage Weil
08:26 AM Bug #16279: assert(objiter->second->version > last_divergent_update) failed
I have also hit this problem when testing pulling out a disk and reinserting it; ceph version 0.94.5. According to @huang jun's osd lo... mingyue zhao
05:01 AM Bug #21603 (Fix Under Review): rocksdb is using slow crc
https://github.com/ceph/ceph/pull/18262 Kefu Chai
04:43 AM Bug #20909: Error ETIMEDOUT: crush test failed with -110: timed out during smoke test (5 seconds)
... Kefu Chai
12:46 AM Bug #21716: ObjectStore/StoreTest.FiemapHoles/3 fails with kstore
Kefu, thanks for fixing this. Can you also indicate which of the mentioned PRs need to be backported to fix the test ... Nathan Cutler

10/11/2017

09:49 PM Bug #21766: os/bluestore/bluestore_types.h: 740: FAILED assert(p != extents.end()) (ec + compress...
The problem seems to be that the unsharing code isn't handling compressed extents properly.
https://github.com/ceph/ce...
Sage Weil
09:47 PM Bug #21766 (Resolved): os/bluestore/bluestore_types.h: 740: FAILED assert(p != extents.end()) (ec...
... Sage Weil
05:20 PM Bug #21331 (Resolved): pg recovery priority inversion
https://github.com/ceph/ceph/pull/18025 is luminous backport
Sage Weil
05:19 PM Bug #21417 (Resolved): buffer_anon leak during deep scrub (on otherwise idle osd)
Sage Weil
01:49 PM Feature #21760: add tools to stress RADOS omap
I had a discussion with Douglas, and in the current implementation we can enhance the following points:
1. Adding --he...
Vikhyat Umrao
01:37 PM Feature #21760 (In Progress): add tools to stress RADOS omap
Add the tools omap_create and omap_delete to stress the RADOS object map directly. Douglas Fuller
01:45 PM Bug #21758 (Pending Backport): cli/crushtools/build.t sometimes fails in jenkins' "make check" run
Sage Weil
09:51 AM Bug #21758 (Fix Under Review): cli/crushtools/build.t sometimes fails in jenkins' "make check" run
https://github.com/ceph/ceph/pull/18242 Kefu Chai
09:49 AM Bug #21758 (Resolved): cli/crushtools/build.t sometimes fails in jenkins' "make check" run
... Kefu Chai
09:37 AM Bug #21756: /usr/src/ceph/src/osd/ECTransaction.h: 179: FAILED assert(plan.to_read.count(i.first)...
https://github.com/ceph/ceph/pull/18241 huang jun
06:13 AM Bug #21756: /usr/src/ceph/src/osd/ECTransaction.h: 179: FAILED assert(plan.to_read.count(i.first)...
comment out in ceph.conf
#osd copyfrom max chunk = 524288
if we use this config, it works fine.
but if we comment ...
huang jun
06:01 AM Bug #21756 (New): /usr/src/ceph/src/osd/ECTransaction.h: 179: FAILED assert(plan.to_read.count(i....
steps to reproduce:... huang jun
08:09 AM Bug #21716: ObjectStore/StoreTest.FiemapHoles/3 fails with kstore
https://github.com/ceph/ceph/pull/18240 Kefu Chai
07:50 AM Bug #21757 (New): snapshotted RBD objects can't be automatically evicted from a cache tier when c...
[environment]
1, ceph version:Jewel 10.2.6 or firefly 0.80.7
2, kernel: 3.10.0-229.14.1.el7.x86_64
[procedure to ...
Xiaojun Liao
02:26 AM Bug #21750: scrub stat mismatch on bytes
/a/sage-2017-10-10_20:19:10-rados-wip-sage-testing2-2017-10-10-1320-distro-basic-smithi/1723818
rados/thrash/{0-size...
Sage Weil
 
