Activity

From 05/31/2018 to 06/29/2018

06/29/2018

11:27 PM Bug #23875 (In Progress): Removal of snapshot with corrupt replica crashes osd
Tentative pull request https://github.com/ceph/ceph/pull/22476 is an improvement but doesn't address comment 3 David Zafman
11:25 PM Bug #19753 (In Progress): Deny reservation if expected backfill size would put us over backfill_f...
David Zafman
05:59 PM Bug #24687: Automatically set expected_num_objects for new pools with >=100 PGs per OSD
Also include >1024 PGs overall Douglas Fuller
09:59 AM Bug #23145: OSD crashes during recovery of EC pg
@Sage Weil,
thanks. The environment no longer exists, so I couldn't get the logs with debug_osd=20.
From the previou...
Yong Wang

06/28/2018

05:53 PM Bug #24645: Upload to radosgw fails when there are degraded objects
When the cluster is in recovery, it is expected that we're waiting for the OSDs to respond Abhishek Lekshmanan
05:16 PM Bug #24676: FreeBSD/Linux integration - monitor map with wrong sa_family
I discovered that commit 9099ca5 - "fix the dencoder of entity_addr_t" introduced this kind of interoperability which... Alexander Haemmerle
08:50 AM Bug #24676: FreeBSD/Linux integration - monitor map with wrong sa_family
I investigated further with gdb. Lines 478-501 from msg/msg_types.h seem to be the culprit. Here sa_family is decoded... Alexander Haemmerle
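A minimal sketch of the kind of layout mismatch at play, assuming the classic sockaddr difference between the platforms (Linux uses a single 16-bit sa_family where FreeBSD has uint8 sa_len followed by uint8 sa_family); this illustrates the symptom, not the confirmed root cause:

    import struct

    raw = bytes([0x02, 0x00])  # first two bytes of a sockaddr written on Linux/x86

    # Linux: sa_family is one 16-bit field (little-endian here) -> AF_INET (2)
    linux_family, = struct.unpack('<H', raw)

    # FreeBSD: the same two bytes read as uint8 sa_len then uint8 sa_family,
    # so a byte-for-byte reinterpretation yields sa_len=2, sa_family=0
    bsd_len, bsd_family = struct.unpack('BB', raw)

    print(linux_family, bsd_len, bsd_family)  # 2 2 0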
05:14 PM Bug #17257: ceph_test_rados_api_lock fails LibRadosLockPP.LockExclusiveDurPP
... Neha Ojha
05:08 PM Bug #24485: LibRadosTwoPoolsPP.ManifestUnset failure
/a/nojha-2018-06-27_22:32:36-rados-wip-23979-distro-basic-smithi/2715571/ Neha Ojha
02:50 PM Bug #24686 (In Progress): change default filestore_merge_threshold to -10
Douglas Fuller
02:18 PM Bug #24686 (Resolved): change default filestore_merge_threshold to -10
Performance evaluations of medium to large size Ceph clusters have demonstrated negligible performance impact from un... Douglas Fuller
02:49 PM Bug #24687 (Resolved): Automatically set expected_num_objects for new pools with >=100 PGs per OSD
Field experience has demonstrated significant performance impact from filestore split and merge activity. The expecte... Douglas Fuller
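As a back-of-the-envelope sketch of why pre-splitting matters, using the documented filestore split formula (the option values below are the usual default split multiple plus the new -10 merge threshold from #24686, assumed for illustration):

    # a filestore directory splits once it holds more than about
    # 16 * filestore_split_multiple * abs(filestore_merge_threshold) objects
    filestore_merge_threshold = -10   # negative disables merging (see #24686)
    filestore_split_multiple = 2

    split_limit = 16 * filestore_split_multiple * abs(filestore_merge_threshold)
    print(split_limit)  # 320 objects per directory before a runtime split

    # expected_num_objects at pool creation lets filestore pre-split its
    # directory tree up front instead of splitting under live client load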
10:15 AM Bug #24685 (Resolved): config options: possible inconsistency between flag 'can_update_at_runtime...
I'm wondering if there is an inconsistency between the 'can_update_at_runtime' flag and the 'flags' list for the confi... Tatjana Dehler
08:47 AM Bug #24683: ceph-mon binary doesn't report to systemd why it dies
If I execute the same command that systemd uses, I get a great readable error message:... Erik Bernoth
08:45 AM Bug #24683 (New): ceph-mon binary doesn't report to systemd why it dies
Following the quick start guide I get to a point where the monitor is supposed to come up but it doesn't. It doesn't ... Erik Bernoth
05:53 AM Bug #24587: librados api aio tests race condition
Good news, this is just a bug in the tests. They're submitting a write and then a read without waiting for the write ... Josh Durgin
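A minimal sketch of the corrected pattern using the python-rados bindings (the pool and object names are hypothetical): wait for the aio write to complete before issuing the read.

    import rados

    cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
    cluster.connect()
    ioctx = cluster.open_ioctx('rbd')

    comp = ioctx.aio_write('test-obj', b'payload', 0)
    comp.wait_for_complete()             # without this, the read races the write
    data = ioctx.read('test-obj', 7, 0)  # now guaranteed to see the payload

    ioctx.close()
    cluster.shutdown()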
03:04 AM Bug #21142: OSD crashes when loading pgs with "FAILED assert(interval.last > last)"
Thanks Sage. I'll try to get my hands on another environment and see if I can reproduce and get more details. Will up... Dexter John Genterone

06/27/2018

09:17 PM Bug #24615: error message for 'unable to find any IP address' not shown
Sounds like the log isn't being flushed before exiting Josh Durgin
09:13 PM Bug #24652 (Won't Fix): OSD crashes when repairing pg
This should be fixed in later versions - hammer is end of life.
The crash was:...
Josh Durgin
09:06 PM Bug #24667: osd: SIGSEGV in MMgrReport::encode_payload
Possibly related to a memory corruption we've been seeing related to mgr health reporting on the osd. Josh Durgin
07:54 PM Bug #21142: OSD crashes when loading pgs with "FAILED assert(interval.last > last)"
https://github.com/ceph/ceph/pull/22744 disabled build_past_intervals_parallel in luminous (by default; can be turned... Sage Weil
05:51 PM Bug #21142: OSD crashes when loading pgs with "FAILED assert(interval.last > last)"
Well, I can work around the issue... The build_past_intervals_parallel() is removed entirely in mimic and I can do t... Sage Weil
07:12 AM Bug #21142: OSD crashes when loading pgs with "FAILED assert(interval.last > last)"
> Any chance you can gdb one of the core files for a crashing OSD to identify which PG is it asserting on? and perhap... Dexter John Genterone
06:10 AM Bug #21142: OSD crashes when loading pgs with "FAILED assert(interval.last > last)"
Sage Weil wrote:
> Dexter, anyone: was there a PG split (pg_num increase) on the cluster before this happened? Or m...
Xiaoxi Chen
05:19 PM Bug #24678 (Can't reproduce): ceph-mon segmentation fault after setting pool size to 1 on degrade...
We have an issue with starting any from 3 monitors after changing pool size from 3 to 1. The cluster was in a degrade... Sergey Burdakov
03:09 PM Bug #24423: failed to load OSD map for epoch X, got 0 bytes
I also have this issue on a newly installed Mimic cluster.
I don't know if this is important, but the problem appeared aft...
Sergey Burdakov
12:00 PM Bug #24676 (Resolved): FreeBSD/Linux integration - monitor map with wrong sa_family
We are using a ceph cluster in a mixed FreeBSD/Linux environment. The ceph cluster is based on FreeBSD. Linux clients... Alexander Haemmerle
09:16 AM Bug #23352: osd: segfaults under normal operation
We're getting a few crashes like this per week here on 12.2.5.
Here's a FileStore OSD:...
Dan van der Ster
04:10 AM Backport #24494 (In Progress): mimic: osd: segv in Session::have_backoff
https://github.com/ceph/ceph/pull/22730 Prashant D
04:09 AM Backport #24495 (In Progress): luminous: osd: segv in Session::have_backoff
https://github.com/ceph/ceph/pull/22729 Prashant D
12:41 AM Bug #23395 (Can't reproduce): qa/standalone/special/ceph_objectstore_tool.py causes ceph-mon core...
David Zafman

06/26/2018

11:32 PM Bug #23492 (In Progress): Abort in OSDMap::decode() during qa/standalone/erasure-code/test-erasur...
David Zafman
11:29 PM Feature #13507 (New): scrub APIs to read replica
David Zafman
11:28 PM Bug #24366 (Resolved): omap_digest handling still not correct
David Zafman
11:27 PM Backport #24381 (Resolved): luminous: omap_digest handling still not correct
David Zafman
11:27 PM Backport #24380 (Resolved): mimic: omap_digest handling still not correct
David Zafman
09:08 PM Bug #23352: osd: segfaults under normal operation
Matt,
Can you provide a coredump or full backtrace?
Brad Hubbard
01:54 PM Bug #23352: osd: segfaults under normal operation
Also confirmed on Ubuntu 18.04/Ceph 13.2.0:
ceph-mgr.log
> 2018-06-24 11:14:47.317 7ff17b0db700 -1 mgr.server s...
Matt Dunavant
02:54 AM Bug #23352: osd: segfaults under normal operation
Confirmed.
ceph-mgr.log
2018-06-20 08:46:05.528656 7fb998ff2700 -1 mgr.server send_report send_report osd,215.0x5...
Beom-Seok Park
07:14 PM Bug #21142: OSD crashes when loading pgs with "FAILED assert(interval.last > last)"
Dexter, anyone: was there a PG split (pg_num increase) on the cluster before this happened? Or maybe a split combine... Sage Weil
07:10 PM Bug #21142: OSD crashes when loading pgs with "FAILED assert(interval.last > last)"
... Sage Weil
07:06 PM Bug #21142: OSD crashes when loading pgs with "FAILED assert(interval.last > last)"
... Sage Weil
06:42 PM Bug #21142: OSD crashes when loading pgs with "FAILED assert(interval.last > last)"
Dexter John Genterone wrote:
> Uploaded a few more logs (debug 20) here: https://storage.googleapis.com/ceph-logs/ce...
Sage Weil
07:11 PM Bug #24667 (Can't reproduce): osd: SIGSEGV in MMgrReport::encode_payload
... Patrick Donnelly
07:07 PM Bug #24666 (New): pybind: InvalidArgumentError is missing 'errno' argument
Instead of being derived from 'Error', the 'InvalidArgumentError' should be derived from 'OSError' which will handle ... Jason Dillaman
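A minimal sketch of the proposed change (names assumed from the description, not the actual pybind source): deriving from OSError makes the errno attribute available for free.

    import errno

    # OSError(errno_value, message) populates .errno and .strerror automatically
    class InvalidArgumentError(OSError):
        def __init__(self, message, errno_value=errno.EINVAL):
            super().__init__(errno_value, message)

    e = InvalidArgumentError("bad argument")
    assert e.errno == errno.EINVAL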
01:49 PM Bug #24664 (Resolved): osd: crash in OpTracker::unregister_inflight_op via OSD::get_health_metrics
... Patrick Donnelly
09:47 AM Bug #24660 (New): admin/build-doc fails during autodoc on rados module: "AttributeError: __next__"
I'm trying to send a doc patch and am running admin/build-doc in my local environment, as explained in http://docs.ce... Florian Haas

06/25/2018

09:50 PM Bug #23352: osd: segfaults under normal operation
Same here
2018-06-24 19:42:41.348699 7f3e53a46700 -1 mgr.server send_report send_report osd,226.0x55678069c850 sen...
Alex Gorbachev
09:34 PM Bug #23352: osd: segfaults under normal operation
Brad Hubbard wrote:
> Can anyone confirm seeing the "unknown health metric" messages in the mgr logs prior to the se...
Kjetil Joergensen
02:50 PM Bug #24652 (Won't Fix): OSD crashes when repairing pg
After a deep-scrub on the primary OSD for the pg we get:... Ana Aviles
01:03 PM Bug #24650 (New): mark unfound lost revert: out of order trim
OSD crashes a few seconds after the command 'ceph pg X.XX mark_unfound_lost revert'.
-10> 2018-06-25 15:52:14.49...
Sergey Malinin
07:53 AM Bug #24645 (New): Upload to radosgw fails when there are degraded objects
Hi,
we use Ceph RadosGW for storing and serving millions of small images. Everything is working well until recovery...
Michal Cila
06:46 AM Backport #24471 (In Progress): luminous: Ceph-osd crash when activate SPDK
https://github.com/ceph/ceph/pull/22686 Prashant D
05:01 AM Backport #24472 (In Progress): mimic: Ceph-osd crash when activate SPDK
https://github.com/ceph/ceph/pull/22684 Prashant D

06/22/2018

11:25 PM Bug #21142: OSD crashes when loading pgs with "FAILED assert(interval.last > last)"
Uploaded a few more logs (debug 20) here: https://storage.googleapis.com/ceph-logs/ceph-osd-logs.tar.gz
After runn...
Dexter John Genterone
12:45 PM Bug #21142: OSD crashes when loading pgs with "FAILED assert(interval.last > last)"
Hi Sage,
We've experienced this again on a new environment we setup. Took a snippet of the logs, hope it's enough:...
Dexter John Genterone
04:46 PM Bug #23622 (Resolved): qa/workunits/mon/test_mon_config_key.py fails on master
Nathan Cutler
04:45 PM Backport #23675 (Resolved): luminous: qa/workunits/mon/test_mon_config_key.py fails on master
Nathan Cutler
04:25 PM Backport #23675: luminous: qa/workunits/mon/test_mon_config_key.py fails on master
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/21368
merged
Yuri Weinstein
04:44 PM Bug #23921 (Resolved): pg-upmap cannot balance in some case
Nathan Cutler
04:43 PM Backport #24048 (Resolved): luminous: pg-upmap cannot balance in some case
Nathan Cutler
04:25 PM Backport #24048: luminous: pg-upmap cannot balance in some case
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/22115
merged
Yuri Weinstein
04:43 PM Bug #24025 (Resolved): RocksDB compression is not supported at least on Debian.
Nathan Cutler
04:42 PM Backport #24279 (Resolved): luminous: RocksDB compression is not supported at least on Debian.
Nathan Cutler
04:24 PM Backport #24279: luminous: RocksDB compression is not supported at least on Debian.
Kefu Chai wrote:
> https://github.com/ceph/ceph/pull/22215
merged
Yuri Weinstein
04:40 PM Backport #24329: mimic: assert manager.get_num_active_clean() == pg_num on rados/singleton/all/ma...
original mimic backport https://github.com/ceph/ceph/pull/22288 was merged, but deemed insufficient Nathan Cutler
04:38 PM Backport #24328 (Resolved): luminous: assert manager.get_num_active_clean() == pg_num on rados/si...
Nathan Cutler
04:23 PM Bug #24321: assert manager.get_num_active_clean() == pg_num on rados/singleton/all/max-pg-per-osd...
merged https://github.com/ceph/ceph/pull/22296 Yuri Weinstein
04:15 PM Bug #24635 (New): luminous: LibRadosTwoPoolsPP.SetRedirectRead failed
Probably a race with the redirect code.
From http://qa-proxy.ceph.com/teuthology/yuriw-2018-06-22_03:31:56-rados-w...
Josh Durgin
01:19 PM Bug #23352: osd: segfaults under normal operation
Yeah. We got this mgr line in the log before the ceph-osd segfault:
> mgr.server send_report send_report osd,74.0x560276d34ed8 sent me...
Serg D
03:54 AM Bug #23352: osd: segfaults under normal operation
Brad Hubbard
03:54 AM Bug #23352: osd: segfaults under normal operation
In several of the crashes we are seeing lines like the following prior to the crash.... Brad Hubbard
12:42 PM Bug #17170: mon/monclient: update "unable to obtain rotating service keys when osd init" to sugge...
I see a slightly different effect on v12.2.5, but it may be related:
I have similar logs:...
Peter Gervai
08:44 AM Backport #24351 (Resolved): luminous: slow mon ops from osd_failure
Nathan Cutler
12:23 AM Backport #24351: luminous: slow mon ops from osd_failure
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/22568
merged
Yuri Weinstein
08:44 AM Bug #23386 (Resolved): crush device class: Monitor Crash when moving Bucket into Default root
Nathan Cutler
08:43 AM Backport #24258 (Resolved): luminous: crush device class: Monitor Crash when moving Bucket into D...
Nathan Cutler
12:21 AM Backport #24258: luminous: crush device class: Monitor Crash when moving Bucket into Default root
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/22381
merged
Reviewed-by: Nathan Cutler <ncutler@suse.com>
Yuri Weinstein
08:43 AM Backport #24290 (Resolved): luminous: common: JSON output from rados bench write has typo in max_...
Nathan Cutler
12:20 AM Backport #24290: luminous: common: JSON output from rados bench write has typo in max_latency key
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/22391
merged
Yuri Weinstein
08:41 AM Backport #24356 (Resolved): luminous: osd: pg hard limit too easy to hit
Nathan Cutler
12:18 AM Backport #24356: luminous: osd: pg hard limit too easy to hit
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/22592
merged
Yuri Weinstein
08:41 AM Backport #24618 (Resolved): mimic: osd: choose_acting loop
https://github.com/ceph/ceph/pull/22889 Nathan Cutler
08:41 AM Backport #24617 (Resolved): mimic: ValueError: too many values to unpack due to lack of subdir
https://github.com/ceph/ceph/pull/22888 Nathan Cutler
12:32 AM Bug #24615 (Resolved): error message for 'unable to find any IP address' not shown
Hi,
In my ceph.conf I have the option:...
Francois Lafont

06/21/2018

11:40 PM Bug #24613 (New): luminous: rest/test.py fails with expected 200, got 400
... Neha Ojha
10:57 PM Bug #23492: Abort in OSDMap::decode() during qa/standalone/erasure-code/test-erasure-eio.sh
possibly related lumious run: http://pulpito.ceph.com/yuriw-2018-06-11_16:27:32-rados-wip-yuri3-testing-2018-06-11-14... Josh Durgin
10:15 PM Bug #23352: osd: segfaults under normal operation
Another instance: http://pulpito.ceph.com/yuriw-2018-06-19_21:29:48-rados-wip-yuri-testing-2018-06-19-1953-luminous-d... Josh Durgin
09:01 PM Bug #24487 (Pending Backport): osd: choose_acting loop
Neha Ojha
05:58 PM Bug #24487 (Fix Under Review): osd: choose_acting loop
https://github.com/ceph/ceph/pull/22664 Neha Ojha
06:48 PM Bug #24612 (Resolved): FAILED assert(osdmap_manifest.pinned.empty()) in OSDMonitor::prune_init()
... Neha Ojha
04:51 PM Bug #23879: test_mon_osdmap_prune.sh fails
/a/nojha-2018-06-21_00:18:52-rados-wip-24487-distro-basic-smithi/2686362 Neha Ojha
09:19 AM Bug #21142: OSD crashes when loading pgs with "FAILED assert(interval.last > last)"
We also hit this today and happen to have an osd log with --debug_osd = 20.
FWIW, the cluster has an inconsistent PG and ...
Xiaoxi Chen
01:34 AM Bug #24601 (Resolved): FAILED assert(is_up(osd)) in OSDMap::get_inst(int)
... Neha Ojha
12:52 AM Bug #24600 (Resolved): ValueError: too many values to unpack due to lack of subdir
... Neha Ojha

06/20/2018

10:13 PM Bug #24422: Ceph OSDs crashing in BlueStore::queue_transactions() using EC
Sage Weil wrote:
> Can you generate an osd log with 'debug osd = 20' for the crashing osd that leads up to the crash...
Sage Weil
10:13 PM Bug #24422: Ceph OSDs crashing in BlueStore::queue_transactions() using EC
Can you generate an osd log with 'debug osd = 20' for the crashing osd that leads up to the crash? Sage Weil
09:50 PM Bug #24422 (Duplicate): Ceph OSDs crashing in BlueStore::queue_transactions() using EC
Josh Durgin
10:11 PM Bug #23145: OSD crashes during recovery of EC pg
Two basic theories:
1. There is a bug that prematurely advances can_rollback_to
2. One of Peter's OSDs warped bac...
Sage Weil
10:05 PM Bug #23145: OSD crashes during recovery of EC pg
Sage Weil wrote:
> Zengran Zhang wrote:
> > osd in last peering stage will call pg_log.roll_forward(at last of PG:...
Sage Weil
10:03 PM Bug #23145 (Need More Info): OSD crashes during recovery of EC pg
Yong Wang, can you provide a full osd log with debug osd = 20 for the primary osd for the PG leading up to the crash... Sage Weil
09:22 PM Bug #23145: OSD crashes during recovery of EC pg
Zengran Zhang wrote:
> osd in last peering stage will call pg_log.roll_forward(at last of PG::activate), is there p...
Sage Weil
01:46 AM Bug #23145: OSD crashes during recovery of EC pg
@Sage Weil @Zengran Zhang
could you share any updates about this bug?
Yong Wang
01:44 AM Bug #23145: OSD crashes during recovery of EC pg
Hi all, are there any updates, please? Yong Wang
10:02 PM Backport #24599 (In Progress): mimic: failed to load OSD map for epoch X, got 0 bytes
Nathan Cutler
10:01 PM Backport #24599 (Resolved): mimic: failed to load OSD map for epoch X, got 0 bytes
https://github.com/ceph/ceph/pull/22651 Nathan Cutler
09:47 PM Bug #24448 (Won't Fix): (Filestore) ABRT report for package ceph has reached 10 occurrences
This is likely due to filestore becoming overloaded (hence waiting on throttles) and hitting the filestore op thread ... Josh Durgin
09:38 PM Bug #24511 (Duplicate): osd crushed at thread_name:safe_timer
Josh Durgin
09:37 PM Bug #24515: "[WRN] Health check failed: 1 slow ops, oldest one blocked for 32 sec, mon.c has slow...
Kefu, can you take a look at this? Josh Durgin
09:36 PM Bug #24531: Mimic MONs have slow/long running ops
Joao, could you take a look at this? Josh Durgin
09:34 PM Bug #24549 (Won't Fix): FileStore::read assert (ABRT report for package ceph has reached 1000 occ...
As John described, this is not a bug in ceph but due to failing hardware or the filesystem below. Josh Durgin
09:25 PM Bug #23753 (Can't reproduce): "Error ENXIO: problem getting command descriptions from osd.4" in u...
re-open if it recurs Josh Durgin
09:19 PM Bug #22624: filestore: 3180: FAILED assert(0 == "unexpected error"): error (2) No such file or di...
Josh Durgin
09:12 PM Bug #22085 (Can't reproduce): jewel->luminous: "[ FAILED ] LibRadosAioEC.IsSafe" in upgrade:jew...
assuming this is the mon crush testing timeout, logs are gone so can't be sure Josh Durgin
08:10 PM Bug #24423: failed to load OSD map for epoch X, got 0 bytes
backport for mimic: https://github.com/ceph/ceph/pull/22651 Sage Weil
08:07 PM Bug #24423 (Pending Backport): failed to load OSD map for epoch X, got 0 bytes
Sage Weil
07:46 PM Bug #24597 (Resolved): FAILED assert(0 == "ERROR: source must exist") in FileStore::_collection_m...
... Neha Ojha
06:32 PM Bug #20086: LibRadosLockECPP.LockSharedDurPP gets EEXIST
... Neha Ojha
03:01 PM Bug #23492: Abort in OSDMap::decode() during qa/standalone/erasure-code/test-erasure-eio.sh

Now that I've looked at the code there is nothing surprising about the map handling. There is code in dequeue_op()...
David Zafman
12:37 AM Bug #23492: Abort in OSDMap::decode() during qa/standalone/erasure-code/test-erasure-eio.sh

I was able to reproduce by running a loop of a single test case in qa/standalone/erasure-code/test-erasure-eio.sh
...
David Zafman
01:00 PM Backport #23673 (Resolved): jewel: auth: ceph auth add does not sanity-check caps
Nathan Cutler
12:52 PM Bug #23872 (Resolved): Deleting a pool with active watch/notify linger ops can result in seg fault
Nathan Cutler
12:52 PM Backport #23905 (Resolved): jewel: Deleting a pool with active watch/notify linger ops can result...
Nathan Cutler
12:21 PM Backport #24383 (In Progress): mimic: osd: stray osds in async_recovery_targets cause out of orde...
https://github.com/ceph/ceph/pull/22642 Prashant D
08:42 AM Bug #24588 (Fix Under Review): osd: may get empty info at recovery
-https://github.com/ceph/ceph/pull/22362- John Spray
01:42 AM Bug #24588 (Resolved): osd: may get empty info at recovery
2018-06-15 20:34:16.421720 7f89d2c24700 -1 /home/zzr/ceph.sf/src/osd/PG.cc: In function 'void PG::start_peering_inter... tao ning
08:40 AM Bug #24593: s390x: Ceph Monitor crashed with Caught signal (Aborted)
I expect that only people in possession of s390x hardware will be able to debug this.
I see that there is another t...
John Spray
05:33 AM Bug #24593 (New): s390x: Ceph Monitor crashed with Caught signal (Aborted)
We are trying to set up a ceph cluster on the s390x platform.
ceph-mon service crashed with an error: *** Caught signal ...
Nayana Thorat
05:50 AM Feature #24591 (Fix Under Review): FileStore hasn't impl to get kv-db's statistics
Kefu Chai
03:22 AM Feature #24591: FileStore hasn't impl to get kv-db's statistics
https://github.com/ceph/ceph/pull/22633 Jack Lv
03:22 AM Feature #24591 (Fix Under Review): FileStore hasn't impl to get kv-db's statistics
In BlueStore, you can see kv-db's statistics by "ceph daemon osd.X dump_objectstore_kv_stats", but FileStore hasn't i... Jack Lv
03:22 AM Feature #22147: Set multiple flags in a single command line
I don’t think we should skip it entirely. Many of the places that implement a check like that are using a common flag... Greg Farnum

06/19/2018

11:44 PM Bug #24487 (In Progress): osd: choose_acting loop
This happens when an osd which is part of the acting set but not part of the up set gets chosen as an async_recovery_t... Neha Ojha
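Conceptually (a toy sketch, not Ceph code): the problematic candidate is exactly the set difference between acting and up, so choosing it evicts it from the acting set, peering re-runs choose_acting, and the same OSD is picked again.

    up = {3, 0, 1}
    acting = {2, 3, 0, 1}
    async_candidates = acting - up  # {2}: chosen, evicted from acting,
                                    # then re-chosen on the next peering pass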
10:51 PM Backport #23673: jewel: auth: ceph auth add does not sanity-check caps
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/21367
merged
Yuri Weinstein
10:50 PM Backport #23905: jewel: Deleting a pool with active watch/notify linger ops can result in seg fault
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/21754
merged
Yuri Weinstein
10:49 PM Feature #22147: Set multiple flags in a single command line
It seems fair to assume that "unset" should support this also.
Question: should settings that require --yes-i-real...
Jesse Williamson
10:40 PM Bug #24587: librados api aio tests race condition
http://pulpito.ceph.com/yuriw-2018-06-13_14:55:30-rados-wip-yuri4-testing-2018-06-12-2037-jewel-distro-basic-smithi/2... Josh Durgin
10:38 PM Bug #24587 (Resolved): librados api aio tests race condition
Seen in a jewel integration branch with no OSD changes:
http://pulpito.ceph.com/yuriw-2018-06-12_22:32:43-rados-wi...
Josh Durgin
09:58 PM Bug #23492: Abort in OSDMap::decode() during qa/standalone/erasure-code/test-erasure-eio.sh
I did a run based on d9284902e1b2e292595696caf11cdead18acec96 which is a branch off of master.
http://pulpito.ceph...
David Zafman
07:24 PM Backport #24584 (Resolved): luminous: osdc: wrong offset in BufferHead
https://github.com/ceph/ceph/pull/22865 Nathan Cutler
07:24 PM Backport #24583 (Resolved): mimic: osdc: wrong offset in BufferHead
https://github.com/ceph/ceph/pull/22869 Nathan Cutler
06:02 PM Bug #19971 (Resolved): osd: deletes are performed inline during pg log processing
Nathan Cutler
06:01 PM Backport #22406 (Rejected): jewel: osd: deletes are performed inline during pg log processing
This change was deemed too invasive at such a late stage in Jewel's life cycle. Nathan Cutler
06:01 PM Backport #22405 (Rejected): jewel: store longer dup op information
This change was deemed too invasive at such a late stage in Jewel's life cycle. Nathan Cutler
06:00 PM Backport #22400 (Rejected): jewel: PR #16172 causing performance regression
This change was deemed too invasive at such a late stage in Jewel's life cycle. Nathan Cutler
04:10 PM Bug #24484 (Pending Backport): osdc: wrong offset in BufferHead
Jason Dillaman
11:54 AM Bug #24448: (Filestore) ABRT report for package ceph has reached 10 occurrences
OSD killed by signal, something like OOM incidents perhaps? John Spray
11:53 AM Bug #24450 (Duplicate): OSD Caught signal (Aborted)
http://tracker.ceph.com/issues/24423 Igor Fedotov
11:51 AM Bug #24559 (Fix Under Review): building error for QAT decompress
John Spray
02:10 AM Bug #24559 (Fix Under Review): building error for QAT decompress
The parameter of decompress changes from 'bufferlist::iterator' to 'bufferlist::const_iterator', but this change miss... Qiaowei Ren
11:34 AM Bug #24549: FileStore::read assert (ABRT report for package ceph has reached 1000 occurrences)
Presumably this is underlying FS failures tripping asserts rather than a bug (perhaps people using ZFS on centos, or ... John Spray
07:26 AM Backport #24355 (In Progress): mimic: osd: pg hard limit too easy to hit
https://github.com/ceph/ceph/pull/22621 Prashant D

06/18/2018

05:51 PM Bug #24423: failed to load OSD map for epoch X, got 0 bytes
Sage Weil
11:45 AM Bug #24549 (Won't Fix): FileStore::read assert (ABRT report for package ceph has reached 1000 occ...
FileStore::read(coll_t, ghobject_t const&, unsigned long, unsigned long, ceph::buffer::list&, unsigned int, bool)
...
Kaleb KEITHLEY
07:11 AM Backport #24356 (In Progress): luminous: osd: pg hard limit too easy to hit
https://github.com/ceph/ceph/pull/22592 Prashant D

06/16/2018

02:16 PM Bug #24423: failed to load OSD map for epoch X, got 0 bytes
How can an installed Mimic (upgraded from Luminous) be fixed with this fix? Is there any way to make OSD startup not requestin... Lazuardi Nasution

06/15/2018

11:40 PM Bug #24423: failed to load OSD map for epoch X, got 0 bytes
I've fixed it here: https://github.com/ceph/ceph/pull/22585 Paul Emmerich
01:36 PM Bug #24423: failed to load OSD map for epoch X, got 0 bytes
Not sure if this is related, but for a few days, I'm not able to modify crushmap (like adding or removing OSD) on a l... Michel Nicol
09:23 AM Bug #24423: failed to load OSD map for epoch X, got 0 bytes
Seeing the same here with a new Mimic cluster.
I purged a few OSDs (deployment went wrong) and now they can't star...
Wido den Hollander
03:56 PM Bug #24057: cbt fails to copy results to the archive dir
Neha Ojha
02:48 PM Bug #24531: Mimic MONs have slow/long running ops
... Wido den Hollander
02:41 PM Bug #24531: Mimic MONs have slow/long running ops
What's the output of "ceph versions" on this cluster?
We had issues in the lab with OSD failure reports not gettin...
Greg Farnum
02:20 PM Bug #24531 (Resolved): Mimic MONs have slow/long running ops
When setting up a Mimic 13.2.0 cluster I saw a message like this:... Wido den Hollander
08:39 AM Bug #24529 (New): monitor report empty client io rate when clock not synchronized
We ran rados bench while the cluster was in WARN state and the clock was not synchronized. On the other hand, we watched the IO speed from the resu... hikdata hik
05:08 AM Backport #24351 (In Progress): luminous: slow mon ops from osd_failure
https://github.com/ceph/ceph/pull/22568 Prashant D

06/14/2018

10:21 PM Bug #21142 (Need More Info): OSD crashes when loading pgs with "FAILED assert(interval.last > last)"
Sage Weil
10:20 PM Bug #21142: OSD crashes when loading pgs with "FAILED assert(interval.last > last)"
Tim, Dexter, is this something that is reproducible in your environment? I haven't seen this one, which makes me ver... Sage Weil
07:41 PM Bug #23492: Abort in OSDMap::decode() during qa/standalone/erasure-code/test-erasure-eio.sh

This might be caused by 52dd99e3011bfc787042fe105e02c11b28867c4c which was included in https://github.com/ceph/ceph...
David Zafman
07:27 PM Bug #24526: Mimic OSDs do not start after deleting some pools with size=1
I solved this issue by monkey-patching OSD code:... Vitaliy Filippov
03:48 PM Bug #24526: Mimic OSDs do not start after deleting some pools with size=1
P.S: This happened just after deleting some pool with size=1 - several OSDs died immediately and the latest error mes... Vitaliy Filippov
03:24 PM Bug #24526 (New): Mimic OSDs do not start after deleting some pools with size=1
After some amount of test actions involving creating pools with size=min_size=1 and then deleting them, most OSDs fai... Vitaliy Filippov
07:06 PM Feature #24527 (New): Need a pg query that doesn't include invalid peer information

Some fields in the peer info remain unchanged after a peer transitions from being the primary. This information ma...
David Zafman
01:13 PM Bug #24423: failed to load OSD map for epoch X, got 0 bytes
I am getting the same issue.
I also upgraded from Luminous to Mimic.
I used: ceph osd purge
Grant Slater
11:48 AM Backport #24198 (Resolved): luminous: mon: slow op on log message
Nathan Cutler
11:47 AM Backport #24216 (Resolved): luminous: "process (unknown)" in ceph logs
Nathan Cutler
11:46 AM Bug #24167 (Resolved): Module 'balancer' has failed: could not find bucket -14
Nathan Cutler
11:46 AM Backport #24213 (Resolved): mimic: Module 'balancer' has failed: could not find bucket -14
Nathan Cutler
11:45 AM Backport #24214 (Resolved): luminous: Module 'balancer' has failed: could not find bucket -14
Nathan Cutler
05:54 AM Backport #24332 (In Progress): mimic: local_reserver double-reservation of backfilled pg
https://github.com/ceph/ceph/pull/22559 Prashant D

06/13/2018

10:01 PM Backport #24198: luminous: mon: slow op on log message
Kefu Chai wrote:
> https://github.com/ceph/ceph/pull/22109
merged
Yuri Weinstein
10:00 PM Backport #24216: luminous: "process (unknown)" in ceph logs
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/22290
merged
Yuri Weinstein
09:59 PM Backport #24214: luminous: Module 'balancer' has failed: could not find bucket -14
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/22308
merged
Yuri Weinstein
08:13 PM Bug #24515 (New): "[WRN] Health check failed: 1 slow ops, oldest one blocked for 32 sec, mon.c ha...
This seems to be RHEL-specific.
Run: http://pulpito.ceph.com/yuriw-2018-06-12_21:09:43-fs-master-distro-basic-smith...
Yuri Weinstein
05:19 PM Bug #23966 (Resolved): Deleting a pool with active notify linger ops can result in seg fault
Nathan Cutler
05:19 PM Backport #24059 (Resolved): luminous: Deleting a pool with active notify linger ops can result in...
Nathan Cutler
04:46 PM Backport #24468 (In Progress): mimic: tell ... config rm <foo> not idempotent
Nathan Cutler
04:35 PM Backport #24245 (Resolved): luminous: Manager daemon y is unresponsive during teuthology cluster ...
Nathan Cutler
04:34 PM Backport #24374 (Resolved): luminous: mon: auto compaction on rocksdb should kick in more often
Nathan Cutler
12:56 PM Bug #24511 (Duplicate): osd crushed at thread_name:safe_timer
ENV
ceph version...
Lei Liu
11:29 AM Bug #23049: ceph Status shows only WARN when traffic to cluster fails
Hi,
which release is the fix expected in?
Thanks,
Nokia ceph-users
10:16 AM Backport #24501 (In Progress): luminous: osd: eternal stuck PG in 'unfound_recovery'
Nathan Cutler
10:16 AM Backport #24500 (In Progress): mimic: osd: eternal stuck PG in 'unfound_recovery'
Nathan Cutler

06/12/2018

08:01 AM Backport #24501 (Resolved): luminous: osd: eternal stuck PG in 'unfound_recovery'
https://github.com/ceph/ceph/pull/22546 Nathan Cutler
08:01 AM Backport #24500 (Resolved): mimic: osd: eternal stuck PG in 'unfound_recovery'
https://github.com/ceph/ceph/pull/22545 Nathan Cutler
08:00 AM Backport #24495 (Resolved): luminous: osd: segv in Session::have_backoff
https://github.com/ceph/ceph/pull/22729 Nathan Cutler
08:00 AM Backport #24494 (Resolved): mimic: osd: segv in Session::have_backoff
https://github.com/ceph/ceph/pull/22730 Nathan Cutler
03:22 AM Bug #24486 (Pending Backport): osd: segv in Session::have_backoff
Sage Weil

06/11/2018

09:32 PM Bug #24423: failed to load OSD map for epoch X, got 0 bytes
I am going to add this test for upgrade as well, steps to recreate... Vasu Kulkarni
04:19 AM Bug #24423: failed to load OSD map for epoch X, got 0 bytes
I have also experienced this issue while continuing the Bluestore conversion of OSDs on my Ceph cluster, after carryi... Gavin Baker
02:16 PM Backport #24059: luminous: Deleting a pool with active notify linger ops can result in seg fault
Casey Bodley wrote:
> https://github.com/ceph/ceph/pull/22143
merged
Yuri Weinstein
02:33 AM Bug #24487: osd: choose_acting loop
It looks like the "choose_async_recovery_ec candidates by cost are: 178,2(0)" line is different in the second case.. ... Sage Weil
01:45 AM Bug #24487 (Resolved): osd: choose_acting loop
ec pg looping between [2,3,0,1] and [-,3,0,1].
osd.3 says...
Sage Weil

06/10/2018

06:41 PM Bug #24486 (Fix Under Review): osd: segv in Session::have_backoff
https://github.com/ceph/ceph/pull/22497 Sage Weil
06:34 PM Bug #24486 (Resolved): osd: segv in Session::have_backoff
... Sage Weil
04:41 PM Bug #24485 (Resolved): LibRadosTwoPoolsPP.ManifestUnset failure
... Sage Weil
03:30 PM Bug #24484 (Fix Under Review): osdc: wrong offset in BufferHead
Kefu Chai
03:15 PM Bug #24484: osdc: wrong offset in BufferHead
This bug leads to a "buffer::end_of_buffer" exception thrown in the function "buffer::list::substr_of".
Thi...
dongdong tao
03:08 PM Bug #24484: osdc: wrong offset in BufferHead
PR: https://github.com/ceph/ceph/pull/22495 dongdong tao
03:07 PM Bug #24484 (Resolved): osdc: wrong offset in BufferHead
The offset of BufferHead should be "opos - bh->start()" dongdong tao
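A toy illustration of that off-by-base fix (pure sketch, not the osdc code): a read at absolute position opos that lands inside a cached BufferHead must index the buffer relative to the BufferHead's start.

    class BufferHead:
        def __init__(self, start, data):
            self._start = start
            self.data = data
        def start(self):
            return self._start

    bh = BufferHead(4096, b'ABCDEFGH')  # caches the extent [4096, 4104)
    opos = 4100                         # absolute read position
    offset = opos - bh.start()          # correct offset into bh.data: 4
    print(bh.data[offset:offset + 2])   # b'EF'; indexing by opos alone overruns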
02:12 AM Backport #24329 (In Progress): mimic: assert manager.get_num_active_clean() == pg_num on rados/si...
Kefu Chai

06/09/2018

07:21 PM Bug #24321 (Pending Backport): assert manager.get_num_active_clean() == pg_num on rados/singleton...
Sage Weil
05:56 AM Bug #24321 (Fix Under Review): assert manager.get_num_active_clean() == pg_num on rados/singleton...
https://github.com/ceph/ceph/pull/22485 Kefu Chai
06:50 PM Bug #22462: mon: unknown message type 1537 in luminous->mimic upgrade tests
Maybe i have the same issue during upgrade Jewel->Luminous http://tracker.ceph.com/issues/24481?next_issue_id=24480&p... Aleksandr Rudenko
02:23 PM Bug #24373 (Pending Backport): osd: eternal stuck PG in 'unfound_recovery'
Kefu Chai
11:20 AM Backport #24478 (Resolved): luminous: read object attrs failed at EC recovery
https://github.com/ceph/ceph/pull/24327 Nathan Cutler
11:18 AM Backport #24473 (Resolved): mimic: cosbench stuck at booting cosbench driver
https://github.com/ceph/ceph/pull/22887 Nathan Cutler
11:18 AM Backport #24472 (Resolved): mimic: Ceph-osd crash when activate SPDK
https://github.com/ceph/ceph/pull/22684 Nathan Cutler
11:18 AM Backport #24471 (Resolved): luminous: Ceph-osd crash when activate SPDK
https://github.com/ceph/ceph/pull/22686 Nathan Cutler
11:18 AM Backport #24468 (Resolved): mimic: tell ... config rm <foo> not idempotent
https://github.com/ceph/ceph/pull/22552 Nathan Cutler
06:07 AM Bug #24452 (Resolved): Backfill hangs in a test case in master not mimic
Kefu Chai

06/08/2018

11:03 PM Bug #24423: failed to load OSD map for epoch X, got 0 bytes
I can't reproduce this on any new Mimic cluster, it only happens on clusters upgraded from Luminous (which is why we ... Paul Emmerich
09:04 PM Bug #24423: failed to load OSD map for epoch X, got 0 bytes
I'm trying to make new OSDs with ceph-volume osd create --dmcrypt --bluestore --data /dev/sdg and am getting the same... Michael Sudnick
07:05 PM Bug #24454 (Duplicate): failed to recover before timeout expired
#24452 Sage Weil
12:29 PM Bug #24454 (Duplicate): failed to recover before timeout expired
tons of this on current master
http://pulpito.ceph.com/kchai-2018-06-06_04:56:43-rados-wip-kefu-testing-2018-06-06...
Sage Weil
07:05 PM Bug #24452 (Fix Under Review): Backfill hangs in a test case in master not mimic
https://github.com/ceph/ceph/pull/22478 Sage Weil
02:48 PM Bug #24452: Backfill hangs in a test case in master not mimic

Final messages on primary during backfill about pg 1.0....
David Zafman
04:57 AM Bug #24452 (Resolved): Backfill hangs in a test case in master not mimic

../qa/run-standalone.sh "osd-backfill-stats.sh TEST_backfill_down_out" 2>&1 | tee obs.log
This test times out wa...
David Zafman
02:34 PM Backport #23912: luminous: mon: High MON cpu usage when cluster is changing
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/21968
merged
Yuri Weinstein
02:33 PM Backport #24245: luminous: Manager daemon y is unresponsive during teuthology cluster teardown
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/22331
merged
Yuri Weinstein
02:31 PM Backport #24374: luminous: mon: auto compaction on rocksdb should kick in more often
Kefu Chai wrote:
> https://github.com/ceph/ceph/pull/22360
merged
Yuri Weinstein
08:18 AM Bug #23352: osd: segfaults under normal operation
Experiencing a safe_timer segfault with a freshly deployed cluster. No data on the cluster yet. Just an empty poo... Vangelis Tasoulas

06/07/2018

03:20 PM Bug #24423: failed to load OSD map for epoch X, got 0 bytes
We are also seeing this when creating OSDs with IDs that existed previously.
I verified that the old osd was delet...
Paul Emmerich
01:21 PM Bug #24373: osd: eternal stuck PG in 'unfound_recovery'
https://github.com/ceph/ceph/pull/22456 Sage Weil
01:14 PM Bug #24373: osd: eternal stuck PG in 'unfound_recovery'
Okay, I see the problem. Two fixes: first, reset every pg on down->up (simpler approach), but the bigger issue is th... Sage Weil
12:58 PM Bug #24450: OSD Caught signal (Aborted)
I have the same problem.
http://tracker.ceph.com/issues/24423
Sergey Malinin
12:03 PM Bug #24450 (Duplicate): OSD Caught signal (Aborted)
Hi,
I have done a rolling_upgrade to mimic with ceph-ansible. It works perfectly! Now, I want to deploy new OSDs, bu...
Peter Schulz
11:46 AM Bug #24448 (Won't Fix): (Filestore) ABRT report for package ceph has reached 10 occurrences
https://retrace.fedoraproject.org/faf/reports/bthash/fe768f98e5fff65f0c850668c4bdae8d4da7e086/
https://retrace.fedor...
Kaleb KEITHLEY

06/06/2018

09:11 PM Bug #24264 (Closed): ssd-primary crush rule not working as intended
I don't think there's a good way to express that requirement in the current crush language. The rule in the docs does... Josh Durgin
09:06 PM Bug #24362 (Triaged): ceph-objectstore-tool incorrectly invokes crush_location_hook
Seems like the way to fix this is to stop ceph-objectstore-tool from trying to use the crush location hook at all.
...
Josh Durgin
07:15 AM Bug #23145: OSD crashes during recovery of EC pg
-3> 2018-06-06 15:00:40.462930 7fffddb25700 -1 bluestore(/var/lib/ceph/osd/ceph-12) _txc_add_transaction error (2... Yong Wang
02:45 AM Bug #23145: OSD crashes during recovery of EC pg
@Sage Weil
@Zengran Zhang
We hit the same issue, and the OSD crash has not recovered until now.
The env is 12.2.5, EC 2+1 b...
Yong Wang
06:02 AM Backport #24293 (In Progress): jewel: mon: slow op on log message
https://github.com/ceph/ceph/pull/22431 Prashant D
02:34 AM Bug #24373: osd: eternal stuck PG in 'unfound_recovery'
Attached full log (download ceph-osd.3.log.gz).
Points are:...
Kouya Shimura
12:33 AM Bug #24371 (Pending Backport): Ceph-osd crash when activate SPDK
Kefu Chai

06/05/2018

05:34 PM Bug #24365 (Pending Backport): cosbench stuck at booting cosbench driver
Neha Ojha
01:33 AM Bug #24365 (Fix Under Review): cosbench stuck at booting cosbench driver
https://github.com/ceph/ceph/pull/22405 Neha Ojha
04:04 PM Bug #24408 (Pending Backport): tell ... config rm <foo> not idempotent
Kefu Chai
11:00 AM Bug #24423 (Resolved): failed to load OSD map for epoch X, got 0 bytes
After upgrading to Mimic I deleted a non-lvm OSD and recreated it with 'ceph-volume lvm prepare --bluestore --data /d... Sergey Malinin
10:37 AM Bug #24422: Ceph OSDs crashing in BlueStore::queue_transactions() using EC
Same as https://tracker.ceph.com/issues/21475, and I already set bluestore_deferred_throttle_bytes = 0
bluest...
鹏 张
10:31 AM Bug #24422: Ceph OSDs crashing in BlueStore::queue_transactions() using EC
2018-06-05T17:46:28.273183+08:00 node54 ceph-osd: /work/build/rpmbuild/BUILD/infinity-3.2.5/src/os/bluestore/BlueStor... 鹏 张
10:31 AM Bug #24422: Ceph OSDs crashing in BlueStore::queue_transactions() using EC
鹏 张 wrote:
> ceph version: 12.2.5
> data pool uses EC mode 2 + 1.
> When restarting one OSD, it crashes and restar...
鹏 张
10:26 AM Bug #24422: Ceph OSDs crashing in BlueStore::queue_transactions() using EC
1.-45> 2018-06-05 17:47:56.886142 7f8972974700 -1 bluestore(/var/lib/ceph/osd/ceph-12) _txc_add_transaction error (2)... 鹏 张
10:25 AM Bug #24422 (Duplicate): Ceph OSDs crashing in BlueStore::queue_transactions() using EC
ceph version: 12.2.5
The data pool uses EC mode 3 + 1.
When restarting one OSD, it crashes and restarts more and more.
...
鹏 张
04:42 AM Bug #24419 (Won't Fix): ceph-objectstore-tool unable to open mon store
Hi everyone,
I use Luminous v12.2.5 and am trying to recover the monitor database from the OSDs.
I proceeded step by step acc...
dovefi Z
03:32 AM Backport #24291 (In Progress): jewel: common: JSON output from rados bench write has typo in max_...
https://github.com/ceph/ceph/pull/22407 Prashant D
02:37 AM Bug #23875: Removal of snapshot with corrupt replica crashes osd

If update_snap_map() ignores the error from remove_oid() we still crash because an op from the primary related to...
David Zafman
02:20 AM Backport #24292 (In Progress): mimic: common: JSON output from rados bench write has typo in max_...
https://github.com/ceph/ceph/pull/22406 Prashant D

06/04/2018

06:32 PM Bug #24368: osd: should not restart on permanent failures
It would, but the previous settings were there for a reason so I'm not sure if it's feasible to backport this for cep... Greg Farnum
05:10 PM Bug #24371 (Fix Under Review): Ceph-osd crash when activate SPDK
Greg Farnum
04:00 PM Bug #24408 (Fix Under Review): tell ... config rm <foo> not idempotent
https://github.com/ceph/ceph/pull/22395 Sage Weil
03:56 PM Bug #24408 (Resolved): tell ... config rm <foo> not idempotent
... Sage Weil
02:56 PM Backport #24407 (In Progress): mimic: read object attrs failed at EC recovery
Kefu Chai
02:56 PM Backport #24407 (Resolved): mimic: read object attrs failed at EC recovery
https://github.com/ceph/ceph/pull/22394 Kefu Chai
02:54 PM Bug #24406 (Resolved): read object attrs failed at EC recovery
https://github.com/ceph/ceph/pull/22196 Kefu Chai
02:18 PM Backport #24290 (In Progress): luminous: common: JSON output from rados bench write has typo in m...
https://github.com/ceph/ceph/pull/22391 Prashant D
11:53 AM Bug #24366 (Pending Backport): omap_digest handling still not correct
Kefu Chai
06:27 AM Bug #23352: osd: segfaults under normal operation
Looking at the crash in http://tracker.ceph.com/issues/23352#note-14 there's a fairly glaring problem.... Brad Hubbard
12:14 AM Bug #23352: osd: segfaults under normal operation
Hi Kjetil,
Sure, worth a look, but AFAICT all access is protected by SafeTimers locks.
Brad Hubbard
02:08 AM Backport #24258 (In Progress): luminous: crush device class: Monitor Crash when moving Bucket int...
https://github.com/ceph/ceph/pull/22381 Prashant D

06/02/2018

12:04 AM Bug #24365 (In Progress): cosbench stuck at booting cosbench driver
Two things caused this issue:
1. cosbench requires openjdk-8. The cbt task does install this dependency, but we al...
Neha Ojha

06/01/2018

08:05 PM Bug #23352: osd: segfaults under normal operation
Brad Hubbard wrote:
> I've confirmed that in all of the SafeTimer segfaults the 'schedule' multimap is empty, indica...
Kjetil Joergensen
06:01 PM Bug #24368: osd: should not restart on permanent failures
Sounds like something that would be useful in our stable releases - Greg, do you agree? Nathan Cutler
05:56 PM Backport #24360 (Need More Info): luminous: osd: leaked Session on osd.7
Do Not Backport For Now
see https://github.com/ceph/ceph/pull/22339#issuecomment-393574371 for details
Nathan Cutler
05:44 PM Backport #24383 (Resolved): mimic: osd: stray osds in async_recovery_targets cause out of order ops
https://github.com/ceph/ceph/pull/22889 Nathan Cutler
05:28 PM Backport #24381 (Resolved): luminous: omap_digest handling still not correct
https://github.com/ceph/ceph/pull/22375 David Zafman
05:28 PM Backport #24380 (Resolved): mimic: omap_digest handling still not correct
https://github.com/ceph/ceph/pull/22374 David Zafman
08:02 AM Bug #24342: Monitor's routed_requests leak
Greg Farnum wrote:
> What version are you running? The MRoute handling is all pretty old; though we've certainly dis...
Xuehan Xu
07:16 AM Bug #24373 (Fix Under Review): osd: eternal stuck PG in 'unfound_recovery'
Mykola Golub
05:22 AM Bug #24373: osd: eternal stuck PG in 'unfound_recovery'
https://github.com/ceph/ceph/pull/22358
Kouya Shimura
04:57 AM Bug #24373 (Resolved): osd: eternal stuck PG in 'unfound_recovery'
A PG might be eternally stuck in 'unfound_recovery' after some OSDs are marked down.
For example, the following st...
Kouya Shimura
06:12 AM Backport #24375 (In Progress): mimic: mon: auto compaction on rocksdb should kick in more often
Kefu Chai
06:11 AM Backport #24375 (Resolved): mimic: mon: auto compaction on rocksdb should kick in more often
https://github.com/ceph/ceph/pull/22361 Kefu Chai
06:10 AM Backport #24374 (In Progress): luminous: mon: auto compaction on rocksdb should kick in more often
Kefu Chai
06:08 AM Backport #24374 (Resolved): luminous: mon: auto compaction on rocksdb should kick in more often
https://github.com/ceph/ceph/pull/22360 Kefu Chai
06:08 AM Bug #24361 (Pending Backport): auto compaction on rocksdb should kick in more often
Kefu Chai
04:47 AM Bug #24371: Ceph-osd crash when activate SPDK
This is a bug in NVMEDevice; the fix has been committed.
Please review PR https://github.com/ceph...
Anonymous
02:02 AM Bug #24371: Ceph-osd crash when activate SPDK
I'm working on the issue. Anonymous
02:01 AM Bug #24371 (Resolved): Ceph-osd crash when activate SPDK
Enable SPDK and configure bluestore as mentioned in http://docs.ceph.com/docs/master/rados/configuration/bluestore-co... Anonymous
02:56 AM Feature #24363: Configure DPDK with mellanox NIC
Next, compilation passes, but none of the binaries can run.
Output error:
EAL: VFIO_RESOURCE_LIST tailq is already registere...
YongSheng Zhang
02:38 AM Feature #24363: Configure DPDK with mellanox NIC
Log details:
Mellanox NIC over fabric.
When compiling, output errors:
1. lacking numa and cryptopp libraries
I ...
YongSheng Zhang
12:23 AM Feature #24363: Configure DPDK with mellanox NIC
Addendum:
NIC over optical fiber.
YongSheng Zhang
12:07 AM Bug #24160 (Resolved): Monitor down when large store data needs to compact triggered by ceph tell...
Kefu Chai

05/31/2018

11:34 PM Bug #24368 (In Progress): osd: should not restart on permanent failures
https://github.com/ceph/ceph/pull/22349 has the simple restart interval change. Will investigate the options for cond... Greg Farnum
11:25 PM Bug #24368: osd: should not restart on permanent failures
See https://www.freedesktop.org/software/systemd/man/systemd.service.html#Restart= for the details on Restart options. Greg Farnum
11:17 PM Bug #24368 (Resolved): osd: should not restart on permanent failures
Last week at OpenStack I heard a few users report OSDs were not failing hard and fast as they should be on disk issue... Greg Farnum
07:01 PM Bug #24366 (In Progress): omap_digest handling still not correct
https://github.com/ceph/ceph/pull/22346 David Zafman
05:39 PM Bug #24366 (Resolved): omap_digest handling still not correct

When running bluestore the object info data_digest is not needed. In that case the omap_digest handling is still b...
David Zafman
06:08 PM Bug #24349 (Pending Backport): osd: stray osds in async_recovery_targets cause out of order ops
Josh Durgin
12:51 AM Bug #24349: osd: stray osds in async_recovery_targets cause out of order ops
https://github.com/ceph/ceph/pull/22330 Josh Durgin
12:46 AM Bug #24349 (Resolved): osd: stray osds in async_recovery_targets cause out of order ops
Related to https://tracker.ceph.com/issues/23827
http://pulpito.ceph.com/yuriw-2018-05-24_17:07:20-powercycle-mast...
Neha Ojha
05:07 PM Bug #24365 (Resolved): cosbench stuck at booting cosbench driver
... Neha Ojha
03:54 PM Bug #24342: Monitor's routed_requests leak
What version are you running? The MRoute handling is all pretty old; though we've certainly discovered a number of le... Greg Farnum
02:17 PM Feature #24363 (New): Configure DPDK with mellanox NIC
Hi all,
does ceph-13.1.0 support DPDK on Mellanox NICs?
I found many issues when compiling. I even though handle t...
YongSheng Zhang
01:22 PM Bug #24362 (Triaged): ceph-objectstore-tool incorrectly invokes crush_location_hook
Ceph release being used: 12.2.5 (cad919881333ac92274171586c827e01f554a70a) luminous (stable)
/etc/ceph/ceph.conf c...
Roman Chebotarev
11:50 AM Backport #24359 (In Progress): mimic: osd: leaked Session on osd.7
Kefu Chai
07:39 AM Backport #24359 (Resolved): mimic: osd: leaked Session on osd.7
https://github.com/ceph/ceph/pull/22339 Nathan Cutler
09:40 AM Bug #24361 (Fix Under Review): auto compaction on rocksdb should kick in more often
https://github.com/ceph/ceph/pull/22337 Kefu Chai
09:07 AM Bug #24361 (Resolved): auto compaction on rocksdb should kick in more often
In RocksDB, by default, "max_bytes_for_level_base" is 256MB and "max_bytes_for_level_multiplier" is 10. So with this set... Kefu Chai
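A worked example of what those defaults imply for per-level target sizes (arithmetic only, based on the values quoted above):

    base = 256 * 2**20  # max_bytes_for_level_base: 256 MB target for L1
    multiplier = 10     # max_bytes_for_level_multiplier

    for level in range(1, 5):
        target = base * multiplier ** (level - 1)
        print("L%d: %.2f GiB" % (level, target / 2**30))
    # L1: 0.25 GiB, L2: 2.50 GiB, L3: 25.00 GiB, L4: 250.00 GiB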
07:39 AM Backport #24360 (Resolved): luminous: osd: leaked Session on osd.7
https://github.com/ceph/ceph/pull/29859 Nathan Cutler
07:38 AM Backport #24350 (In Progress): mimic: slow mon ops from osd_failure
Nathan Cutler
07:37 AM Backport #24350 (Resolved): mimic: slow mon ops from osd_failure
https://github.com/ceph/ceph/pull/22297 Nathan Cutler
07:38 AM Backport #24356 (Resolved): luminous: osd: pg hard limit too easy to hit
https://github.com/ceph/ceph/pull/22592 Nathan Cutler
07:38 AM Backport #24355 (Resolved): mimic: osd: pg hard limit too easy to hit
https://github.com/ceph/ceph/pull/22621 Nathan Cutler
07:37 AM Backport #24351 (Resolved): luminous: slow mon ops from osd_failure
https://github.com/ceph/ceph/pull/22568 Nathan Cutler
05:31 AM Bug #20924 (Pending Backport): osd: leaked Session on osd.7
I think https://github.com/ceph/ceph/pull/22292 indeed addresses this issue
https://github.com/ceph/ceph/pull/22384
Kefu Chai
04:51 AM Backport #24246 (In Progress): mimic: Manager daemon y is unresponsive during teuthology cluster ...
https://github.com/ceph/ceph/pull/22333 Prashant D
02:55 AM Backport #24245 (In Progress): luminous: Manager daemon y is unresponsive during teuthology clust...
https://github.com/ceph/ceph/pull/22331 Prashant D
 
