Activity

From 09/02/2017 to 10/01/2017

10/01/2017

09:08 AM Bug #21611 (Closed): rename in BlueFS is not atomic
I was testing the repair command and found that:
1. rocksdb creates a new MANIFEST file during database repair, and wants t...
Chang Liu
02:20 AM Bug #21470: Ceph OSDs crashing in BlueStore::queue_transactions() using EC after applying fix
TSAN unfortunately just caused the OSDs to core dump instantly. I'll see if I can find another way to find threading ... Bob Bobington

09/30/2017

07:22 AM Bug #21603: rocksdb is using slow crc
Mark, please let me know if I should update ceph/rocksdb with this fix and pick it up in ceph/ceph if you think we ne... Kefu Chai
07:20 AM Bug #21603: rocksdb is using slow crc
https://github.com/facebook/rocksdb/pull/2950 Kefu Chai
06:33 AM Bug #21470: Ceph OSDs crashing in BlueStore::queue_transactions() using EC after applying fix
While I'm not intimately familiar with threaded programming, I'm okay with general C++. Could you possibly explain wh... Bob Bobington
03:05 AM Bug #21470: Ceph OSDs crashing in BlueStore::queue_transactions() using EC after applying fix
No luck. I applied 1918c57c7c6304875501f4f4b04b9c82834395a3 from the aforementioned repo to my copy of the official L... Bob Bobington
05:31 AM Bug #21303: rocksdb get a error: "Compaction error: Corruption: block checksum mismatch"
After merging the following patches, the error didn't happen again. You can close the issue. Thanks!

Patch list:
h...
黄 维
04:11 AM Bug #21577 (Pending Backport): ceph-monstore-tool --readable mode doesn't understand FSMap, MgrMap
Kefu Chai

09/29/2017

10:36 PM Bug #20416: "FAILED assert(osdmap->test_flag((1<<15)))" (sortbitwise) on upgraded cluster
https://github.com/ceph/ceph/pull/18047 for the fix. I'll backport it to Luminous if that looks good. Greg Farnum
09:18 PM Bug #21470: Ceph OSDs crashing in BlueStore::queue_transactions() using EC after applying fix
Ah, found it: https://github.com/ceph/ceph-ci/tree/wip-21470-test Bob Bobington
09:12 PM Bug #21470: Ceph OSDs crashing in BlueStore::queue_transactions() using EC after applying fix
I'm not on a Debian or Redhat derivative, is there a Git repository I can get the source from or a tarball you can li... Bob Bobington
06:54 PM Bug #21470: Ceph OSDs crashing in BlueStore::queue_transactions() using EC after applying fix
Ok, that's kind of embarrassing, I think the fix is pretty simple. Can you please test out this branch?
wip-21470-...
Sage Weil
06:39 PM Bug #21303: rocksdb get a error: "Compaction error: Corruption: block checksum mismatch"
Can you repeat the fsck with --debug-bluefs 20?
CEPH_ARGS="--debug-bluestore 20 --debug-bluefs 20 --err-to-stderr ...
Sage Weil
06:11 PM Bug #21382 (Pending Backport): Erasure code recovery should send additional reads if necessary
David Zafman
06:08 PM Bug #21603: rocksdb is using slow crc
Kefu Chai wrote:
> i set a breakpoint in Fast_CRC32() and Slow_CRC32() when debugging ceph-mon, the breakpoint in Fa...
Mark Nelson
05:37 PM Bug #21603: rocksdb is using slow crc
@kefu, that's really elegant work, thanks for the info
Matt
Matt Benjamin
04:49 PM Bug #21603: rocksdb is using slow crc
i set a breakpoint in Fast_CRC32() and Slow_CRC32() when debugging ceph-mon, the breakpoint in Fast_CRC32() is always... Kefu Chai
03:08 PM Bug #21603: rocksdb is using slow crc
Matt Benjamin wrote:
> Just randomly, is this output just from ceph-osd running under perf?
This is output from m...
Mark Nelson
03:00 PM Bug #21603: rocksdb is using slow crc
Just randomly, is this output just from ceph-osd running under perf?
Matt
Matt Benjamin
02:42 PM Bug #21603 (Resolved): rocksdb is using slow crc
... Sage Weil
03:00 PM Bug #21249 (Resolved): Client client.admin marked osd.2 out, after it was down for 1504627577 sec...
Nathan Cutler
02:58 PM Bug #20944 (Resolved): OSD metadata 'backend_filestore_dev_node' is "unknown" even for simple dep...
Nathan Cutler
02:38 PM Bug #21566 (Fix Under Review): OSDService::recovery_need_sleep read+updated without locking
https://github.com/ceph/ceph/pull/18022 should take care of this. Neha Ojha
12:11 PM Backport #21307 (Resolved): luminous: Client client.admin marked osd.2 out, after it was down for...
Sage Weil
12:11 PM Backport #21465 (Resolved): luminous: OSD metadata 'backend_filestore_dev_node' is "unknown" even...
Sage Weil
10:43 AM Bug #21555: src/osd/PGLog.h: 1455: FAILED assert(miter != missing.get_items().end())
osd.6 remove object "0#2:c4b0339b:::benchmark_data_mira035.xsky.com_17216_object7868:head#" from backfillinfo.objects... huang jun
03:53 AM Bug #21555: src/osd/PGLog.h: 1455: FAILED assert(miter != missing.get_items().end())
... huang jun
01:48 AM Bug #21555: src/osd/PGLog.h: 1455: FAILED assert(miter != missing.get_items().end())
... huang jun
12:01 AM Bug #21555: src/osd/PGLog.h: 1455: FAILED assert(miter != missing.get_items().end())
Is this on master?
Shouldn't osd.7 have the 149'793 log entry for the delete, and thus detect the retry as a dupli...
Josh Durgin

09/28/2017

01:30 PM Bug #21417 (Pending Backport): buffer_anon leak during deep scrub (on otherwise idle osd)
Sage Weil
01:27 PM Bug #21592 (Resolved): LibRadosCWriteOps.CmpExt got 0 instead of -4095-1
... Sage Weil

09/27/2017

09:13 PM Bug #21577 (Fix Under Review): ceph-monstore-tool --readable mode doesn't understand FSMap, MgrMap
https://github.com/ceph/ceph/pull/18005
Marking for backport -- I consider this a bugfix because the mdsmap dumpin...
John Spray
06:44 PM Bug #21577 (Resolved): ceph-monstore-tool --readable mode doesn't understand FSMap, MgrMap
Annoying for anyone wanting to inspect these. I never updated it because I don't think I knew it existed :-) John Spray
07:52 PM Bug #21417: buffer_anon leak during deep scrub (on otherwise idle osd)
ok, the problem is that as scrub (or whatever) happens, the bluestore cache is populated, but the attrs weren't in th... Sage Weil
07:50 PM Bug #21417 (Fix Under Review): buffer_anon leak during deep scrub (on otherwise idle osd)
https://github.com/ceph/ceph/pull/18001 Sage Weil
07:41 PM Bug #21580 (Resolved): osd: stalled recovery ends up in recovery_wait
With https://github.com/ceph/ceph/pull/17839 a stalled recovery (due to remaining unfound objects) goes back into rec... Sage Weil
07:28 PM Feature #21579 (Resolved): [RFE] Stop OSD's removal if the OSD's are part of inactive PGs
[RFE] Stop OSD's removal if the OSD's are part of inactive PGs
Description of problem:
[RFE] Stop OSD's removal...
Vikhyat Umrao
02:56 PM Bug #21573 (Resolved): [upgrade] buffer::list ABI broken in luminous release
A client application that was compiled against a pre-Luminous librados C++ API and therefore utilizing bufferlist wil... Jason Dillaman
01:22 PM Backport #17445: jewel: list-snap cache tier missing promotion logic (was: rbd cli segfault when ...
Another case: http://tracker.ceph.com/issues/21537 Jason Dillaman
06:28 AM Bug #21566 (Resolved): OSDService::recovery_need_sleep read+updated without locking
Unless I'm misreading this, OSD::do_recovery() is invoked from the ShardedOpQueue without holding any locks on global... Greg Farnum
03:05 AM Bug #21555: src/osd/PGLog.h: 1455: FAILED assert(miter != missing.get_items().end())
@Josh do you have time to look at it? huang jun

09/26/2017

01:34 PM Bug #21557 (Can't reproduce): osd.6 found snap mapper error on pg 2.0 oid 2:0e781f33:::smithi1443...
... Sage Weil
10:37 AM Bug #21555: src/osd/PGLog.h: 1455: FAILED assert(miter != missing.get_items().end())
osd7:
91'473 (0'0) modify
151'793 (0'0) error
osd.6
91'473 (0'0) modify
149'793 (91'473) delete
huang jun
09:01 AM Bug #21555 (New): src/osd/PGLog.h: 1455: FAILED assert(miter != missing.get_items().end())
pg 2.3s0 up/acting is [7,0,2]/[6,0,2]
in backfill_toofull state, osd.6 got write op, bc object > last_backfill, an...
huang jun
03:30 AM Bug #21338 (Resolved): There is a big risk in function bufferlist::claim_prepend()
Kefu Chai
12:24 AM Bug #20416: "FAILED assert(osdmap->test_flag((1<<15)))" (sortbitwise) on upgraded cluster
Okay. Assuming sortbitwise is just a messaging scheme (I think it is), we should be safe to change the assert to requ... Greg Farnum
12:10 AM Bug #20416: "FAILED assert(osdmap->test_flag((1<<15)))" (sortbitwise) on upgraded cluster
Okay, the one I'm looking at is crashing on pg 126.b7, at epoch 5350. Pool 126 does not presently exist; epoch 5350 (... Greg Farnum

09/25/2017

09:21 PM Backport #21544 (Resolved): luminous: mon osd feature checks for osdmap flags and require-osd-rel...
https://github.com/ceph/ceph/pull/18364 Nathan Cutler
09:21 PM Backport #21543 (Resolved): luminous: bluestore fsck took 224.778802 seconds to complete which ca...
https://github.com/ceph/ceph/pull/18362 Nathan Cutler
05:37 PM Bug #21532: osd: Abort in thread_name:tp_osd_tp
...although even a slow disk shouldn't be long enough for the heartbeat to time out. :/ Sage Weil
05:36 PM Bug #21532: osd: Abort in thread_name:tp_osd_tp
It looks like a zillion threads are blocked at... Sage Weil
05:25 PM Bug #21532 (Need More Info): osd: Abort in thread_name:tp_osd_tp
[10:22:39] <@sage> it looks like everyone is waiting for log flush.. which is deep in snprintf in the core. can't te... Greg Farnum
05:23 PM Bug #21532: osd: Abort in thread_name:tp_osd_tp
The log ends 14 minutes prior to the signal, which I imagine is related to #21507.... Greg Farnum
03:47 AM Bug #21532 (Need More Info): osd: Abort in thread_name:tp_osd_tp
... Patrick Donnelly
02:19 AM Bug #21471 (Pending Backport): mon osd feature checks for osdmap flags and require-osd-release fa...
Sage Weil
02:15 AM Bug #21474 (Pending Backport): bluestore fsck took 224.778802 seconds to complete which caused "t...
Sage Weil
02:13 AM Bug #21511 (Resolved): rados/standalone/scrub.yaml: can't decode 'snapset' attr buffer::malformed...
Sage Weil

09/23/2017

04:39 PM Bug #21470: Ceph OSDs crashing in BlueStore::queue_transactions() using EC after applying fix
With a similar but slightly different setup, this same crash happened to me.
Installed via ceph-deploy install --r...
Roy Hooper
10:15 AM Bug #21303: rocksdb get a error: "Compaction error: Corruption: block checksum mismatch"
@Daniel,
Yes. No OSD running on xfs shows the problem in question. I think one of the differences between the db based o...
wei qiaomiao
02:25 AM Bug #21382: Erasure code recovery should send additional reads if necessary
https://github.com/ceph/ceph/pull/17920 David Zafman
02:25 AM Bug #21382 (Fix Under Review): Erasure code recovery should send additional reads if necessary
David Zafman

09/22/2017

09:49 PM Bug #21511 (Fix Under Review): rados/standalone/scrub.yaml: can't decode 'snapset' attr buffer::m...
https://github.com/ceph/ceph/pull/17927 Sage Weil
06:00 PM Bug #21511 (Resolved): rados/standalone/scrub.yaml: can't decode 'snapset' attr buffer::malformed...
... Sage Weil
09:04 PM Bug #21408 (Resolved): osd: "fsck error: free extent 0x2000~2000 intersects allocated blocks"
Sage Weil
08:43 PM Bug #21218: thrash-eio + bluestore (hangs with unfound objects or read_log_and_missing assert)
@Sage,
Pls, would you have the reproducer for this, so I could give it a try and check it out in my environment? ...
Daniel Oliveira
05:51 PM Bug #21303: rocksdb get a error: "Compaction error: Corruption: block checksum mismatch"
@Wei,
Yes, the log file shows the same error with 12.2.0 build running on. I agree with @Josh and you, it seems t...
Daniel Oliveira
02:45 AM Bug #21303: rocksdb get a error: "Compaction error: Corruption: block checksum mismatch"
@Sage @Daniel Huang and I use the same cluster. We use xfs instead of bluefs for some osds in our cluster, the issue... wei qiaomiao
01:54 AM Bug #21303: rocksdb get a error: "Compaction error: Corruption: block checksum mismatch"
Sage Weil wrote:
> Can you please upgrade to 12.2.0 (or better yet, latest luminous branch), and then run fsck and a...
黄 维
12:51 AM Bug #21303: rocksdb get a error: "Compaction error: Corruption: block checksum mismatch"
Daniel Oliveira wrote:
> @Wei,
>
> Please, would you mind describing a bit more your environment? Also, how ofte...
wei qiaomiao
11:36 AM Bug #20871 (Resolved): core dump when bluefs's mkdir returns -EEXIST
Chang Liu
03:31 AM Bug #20759: mon: valgrind detects a few leaks
/kchai-2017-09-21_06:22:45-rados-wip-kefu-testing-2017-09-21-1013-distro-basic-mira/1654844/remote/mira038/log/valgri... Kefu Chai
03:13 AM Bug #21474: bluestore fsck took 224.778802 seconds to complete which caused "timed out waiting fo...
... Kefu Chai
03:06 AM Bug #21474 (Fix Under Review): bluestore fsck took 224.778802 seconds to complete which caused "t...
https://github.com/ceph/ceph/pull/17902 Kefu Chai

09/21/2017

11:00 PM Bug #21382 (In Progress): Erasure code recovery should send additional reads if necessary
David Zafman
09:50 PM Bug #21470: Ceph OSDs crashing in BlueStore::queue_transactions() using EC after applying fix
Done. Took less than an hour and happened on two OSDs. Uploaded one of them:
ceph-post-file: 6e0ed6ab-1528-428d-aa...
Bob Bobington
08:26 PM Bug #21470 (Need More Info): Ceph OSDs crashing in BlueStore::queue_transactions() using EC after...
Okay, thanks for confirmation that the #21171 fix is applied. can you reproduce with debug bluestore = 20, and then ... Sage Weil
08:25 PM Bug #21475 (Duplicate): 12.2.0 bluestore - OSD down/crash " internal heartbeat not healthy, dropp...
Sage Weil
08:08 PM Bug #20416: "FAILED assert(osdmap->test_flag((1<<15)))" (sortbitwise) on upgraded cluster
Got a report of this happening in downstream Red Hat packages at https://bugzilla.redhat.com/show_bug.cgi?id=1494238
...
Greg Farnum
08:02 PM Bug #21496 (Fix Under Review): doc: Manually editing a CRUSH map, Word 'type' missing.
http://docs.ceph.com/docs/master/rados/operations/crush-map-edits/
In the section "CRUSH map rules", in the overvi...
Anonymous
07:59 PM Bug #21303 (Need More Info): rocksdb get a error: "Compaction error: Corruption: block checksum m...
Can you please upgrade to 12.2.0 (or better yet, latest luminous branch), and then run fsck and attach the output?
...
Sage Weil
04:59 PM Bug #21303: rocksdb get a error: "Compaction error: Corruption: block checksum mismatch"
@Wei,
Please, would you mind describing a bit more your environment? Also, how often does it happen? Can we repro...
Daniel Oliveira
06:11 AM Bug #21303: rocksdb get a error: "Compaction error: Corruption: block checksum mismatch"
This issue can be reproduced in our cluster; we are willing to provide more information if you need it. wei qiaomiao
07:36 PM Bug #20653 (Can't reproduce): bluestore: aios don't complete on very large writes on xenial
I'm going to assume this was #21171 Sage Weil
06:48 PM Bug #21417: buffer_anon leak during deep scrub (on otherwise idle osd)
definitely happens from an ec pool. Sage Weil
04:03 PM Bug #21410 (Resolved): pg_upmap_items can duplicate an item
Sage Weil
02:45 PM Bug #21410 (Pending Backport): pg_upmap_items can duplicate an item
Sage Weil
04:02 PM Bug #21495 (New): src/osd/OSD.cc: 346: FAILED assert(piter != rev_pending_splits.end())
... Sage Weil
04:07 AM Backport #21465 (In Progress): luminous: OSD metadata 'backend_filestore_dev_node' is "unknown" e...
Nathan Cutler
04:05 AM Backport #21438 (In Progress): luminous: Daemons(OSD, Mon...) exit abnormally at injectargs command
Nathan Cutler
04:03 AM Backport #21343 (In Progress): luminous: DNS SRV default service name not used anymore
Nathan Cutler
04:01 AM Backport #21307 (In Progress): luminous: Client client.admin marked osd.2 out, after it was down ...
Nathan Cutler

09/20/2017

08:37 PM Bug #21428: luminous: osd: does not request latest map from mon
Fix:
* master https://github.com/ceph/ceph/pull/17828
* luminous https://github.com/ceph/ceph/pull/17829
Nathan Cutler
03:15 PM Bug #21428 (Resolved): luminous: osd: does not request latest map from mon
Josh Durgin
05:17 AM Bug #21428 (In Progress): luminous: osd: does not request latest map from mon
fixing bug in the patch Josh Durgin
04:39 PM Bug #21408 (Fix Under Review): osd: "fsck error: free extent 0x2000~2000 intersects allocated blo...
https://github.com/ceph/ceph/pull/17845 Sage Weil
04:03 PM Bug #21303: rocksdb get a error: "Compaction error: Corruption: block checksum mismatch"
potentially a bug in bluefs Josh Durgin
03:53 PM Bug #21407: backoff causes out of order op
Josh Durgin
03:46 PM Bug #20924: osd: leaked Session on osd.7
/a/yuriw-2017-09-19_19:54:13-rados-wip-yuri-testing3-2017-09-19-1710-distro-basic-smithi/1648800
osd.7 again! weird
Sage Weil
03:01 PM Bug #21474: bluestore fsck took 224.778802 seconds to complete which caused "timed out waiting fo...
cool. will update the test. Kefu Chai
12:14 PM Bug #21474: bluestore fsck took 224.778802 seconds to complete which caused "timed out waiting fo...
Sigh.. yeah. I can't decide if we should stop doing these fsck's entirely, or reduce the debug level just for fsck, ... Sage Weil
05:30 AM Bug #21474: bluestore fsck took 224.778802 seconds to complete which caused "timed out waiting fo...
Sage, if you believe that it's normal for bluestore to take around 4 minutes to complete a deep fsck, I will prolong ... Kefu Chai
05:28 AM Bug #21474 (Resolved): bluestore fsck took 224.778802 seconds to complete which caused "timed out...
/a/kchai-2017-09-19_14:50:44-rados-wip-kefu-testing-2017-09-19-1954-distro-basic-mira/1648644... Kefu Chai
11:19 AM Bug #21475: 12.2.0 bluestore - OSD down/crash " internal heartbeat not healthy, dropping ping req...
Seems it's a duplicate of this tracker: http://tracker.ceph.com/issues/21180. Please verify. Nokia ceph-users
11:18 AM Bug #21475 (Duplicate): 12.2.0 bluestore - OSD down/crash " internal heartbeat not healthy, dropp...
~~~
2017-09-18 14:51:59.895746 7f1e744e0700 0 log_channel(cluster) log [WRN] : slow request 60.068824 seconds old...
Nokia ceph-users
10:25 AM Bug #21471 (In Progress): mon osd feature checks for osdmap flags and require-osd-release fail if...
https://github.com/ceph/ceph/pull/17831 Brad Hubbard
02:29 AM Bug #21471 (Resolved): mon osd feature checks for osdmap flags and require-osd-release fail if 0 ...
the various checks test get_up_osd_features() but that returns 0 if no osds are up.
needs to be fixed in luminous ...
Sage Weil
02:19 AM Bug #21470: Ceph OSDs crashing in BlueStore::queue_transactions() using EC after applying fix
Oh, forgot to add that I've tried the workarounds on the related issues. Adding this to my ceph.conf makes no differe... Bob Bobington
02:16 AM Bug #21470 (Resolved): Ceph OSDs crashing in BlueStore::queue_transactions() using EC after apply...
This is a copy of http://tracker.ceph.com/issues/21314, which was marked as resolved. It's not resolved after applyin... Bob Bobington

09/19/2017

11:46 PM Bug #21428 (Resolved): luminous: osd: does not request latest map from mon
backport was https://github.com/ceph/ceph/pull/17796 Josh Durgin
07:24 AM Bug #21428 (Fix Under Review): luminous: osd: does not request latest map from mon
https://github.com/ceph/ceph/pull/17795 Josh Durgin
02:02 AM Bug #21428 (In Progress): luminous: osd: does not request latest map from mon
Josh Durgin
12:16 AM Bug #21428: luminous: osd: does not request latest map from mon
I think this is from the fast_dispatch refactor in luminous, and the latest test timing just happened to show it. Josh Durgin
12:12 AM Bug #21428 (Resolved): luminous: osd: does not request latest map from mon
On the current luminous branch, a couple tests saw slow requests > 1 hour due to ops waiting for maps.
One is /a/y...
Josh Durgin
08:25 PM Backport #21465 (Resolved): luminous: OSD metadata 'backend_filestore_dev_node' is "unknown" even...
https://github.com/ceph/ceph/pull/17865 Nathan Cutler
06:01 PM Bug #20944 (Pending Backport): OSD metadata 'backend_filestore_dev_node' is "unknown" even for si...
Sage Weil
11:36 AM Backport #21438 (Resolved): luminous: Daemons(OSD, Mon...) exit abnormally at injectargs command
https://github.com/ceph/ceph/pull/17864 Nathan Cutler
08:20 AM Bug #21287: 1 PG down, OSD fails with "FAILED assert(i->prior_version == last || i->is_error())"
I had to delete affected pool to reclaim occupied space so I am unable to verify any fixes Henrik Korkuc
03:31 AM Bug #21174: OSD crash: 903: FAILED assert(objiter->second->version > last_divergent_update)
duplicate issue: http://tracker.ceph.com/issues/16279 huang jun

09/18/2017

08:57 PM Bug #19790 (Resolved): rados ls on pool with no access returns no error
Nathan Cutler
08:57 PM Backport #20723 (Resolved): jewel: rados ls on pool with no access returns no error
Nathan Cutler
02:45 PM Bug #20909: Error ETIMEDOUT: crush test failed with -110: timed out during smoke test (5 seconds)
with https://github.com/ceph/ceph/pull/17179, we also met this error:
mon.b@0(leader).osd e15 tester.test_with_fork...
huang jun
02:53 AM Bug #21171: bluestore: aio submission deadlock
Since my issue (http://tracker.ceph.com/issues/21314) was marked as a dupe of this and I haven't received a response ... Bob Bobington
01:42 AM Bug #21417 (Resolved): buffer_anon leak during deep scrub (on otherwise idle osd)
observed gobs of ram (11gb rss) and most of it buffer_anon (~8gb) on a basically idle cluster with replication, ec, a... Sage Weil

09/16/2017

05:59 PM Bug #21409 (Resolved): per-pool full flags set incorrectly?
Sage Weil
05:46 AM Bug #21409 (Fix Under Review): per-pool full flags set incorrectly?
https://github.com/ceph/ceph/pull/17763 xie xingguo

09/15/2017

09:54 PM Bug #21408 (In Progress): osd: "fsck error: free extent 0x2000~2000 intersects allocated blocks"
Sage Weil
07:50 PM Bug #21408 (Resolved): osd: "fsck error: free extent 0x2000~2000 intersects allocated blocks"
Run: http://pulpito.ceph.com/teuthology-2017-09-15_17:30:33-upgrade:luminous-x-master-distro-basic-smithi/
Jobs: man...
Yuri Weinstein
08:57 PM Bug #21410 (Fix Under Review): pg_upmap_items can duplicate an item
https://github.com/ceph/ceph/pull/17760 Sage Weil
08:43 PM Bug #21410 (Resolved): pg_upmap_items can duplicate an item
... Sage Weil
08:34 PM Bug #21309 (Resolved): mon/OSDMonitor: deleting pool while pgs are being created leads to assert(...
Nathan Cutler
08:34 PM Backport #21341 (Resolved): luminous: mon/OSDMonitor: deleting pool while pgs are being created l...
Nathan Cutler
08:32 PM Bug #21409 (Resolved): per-pool full flags set incorrectly?
http://pulpito.ceph.com/sage-2017-09-15_15:50:19-rados-wip-sage-testing2-2017-09-14-1256-distro-basic-smithi/1635852
...
Sage Weil
08:28 PM Bug #21407: backoff causes out of order op
https://github.com/ceph/ceph/pull/17759 Sage Weil
07:44 PM Bug #21407: backoff causes out of order op
problem seems to be that we are requeueing waiting_for_peered before we are actually peered. that happens from on_fl... Sage Weil
07:19 PM Bug #21407 (Resolved): backoff causes out of order op
- receive op a and b... Sage Weil
11:53 AM Bug #21365 (Pending Backport): Daemons(OSD, Mon...) exit abnormally at injectargs command
https://github.com/ceph/ceph/pull/17664 Kefu Chai
08:11 AM Bug #21388: inconsistent pg but repair does nothing reporting head data_digest != data_digest fro...
We did an scp of the OSD object file then a rados put and deep-scrub got it back to OK.
Thanks for your quick answer!
Laurent GUERBY

09/14/2017

10:20 PM Bug #21388: inconsistent pg but repair does nothing reporting head data_digest != data_digest fro...
I forgot to mention that on get you have to prevent the code from checking the digest by doing reads that are smalle... David Zafman
10:18 PM Bug #21388: inconsistent pg but repair does nothing reporting head data_digest != data_digest fro...

After the deep-scrub, "rados get" gives us some errors:
# rados list-inconsistent-obj 58.6c1 --format=json-pretty...
Mehdi Abaakouk
03:35 PM Bug #21388: inconsistent pg but repair does nothing reporting head data_digest != data_digest fro...

Doing the following should produce list-inconsistent-obj information:
$ ceph pg deep-scrub 58.6c1
(Wait for scr...
David Zafman
01:37 PM Bug #21388 (Duplicate): inconsistent pg but repair does nothing reporting head data_digest != dat...
ceph pg repair is currently not fixing three "inconsistent" objects
on one of our PGs in a replica 3 pool.
For al...
Laurent GUERBY
10:18 PM Backport #21117 (In Progress): jewel: osd: osd_scrub_during_recovery only considers primary, not ...
This requires merge resolution which I've begun looking at. David Zafman
12:25 PM Bug #20924: osd: leaked Session on osd.7
/a/sage-2017-09-13_13:31:57-rados-wip-sage-testing-2017-09-12-1750-distro-basic-smithi/1627916
is it just me or is...
Sage Weil
12:22 PM Bug #21218: thrash-eio + bluestore (hangs with unfound objects or read_log_and_missing assert)
another run, eio injection led to a read_log_and_missing assert:... Sage Weil
12:15 PM Bug #21387 (Can't reproduce): mark_unfound_lost hangs
... Sage Weil
10:51 AM Bug #21354: Possible bug in interval_set.intersect_of()
great, i was searching this ticket in "My Page" on tracker =D Kefu Chai
09:29 AM Backport #21374 (In Progress): luminous: incorrect erasure-code space in command ceph df
Abhishek Lekshmanan
09:22 AM Documentation #21386: rados: manpage missing import/export
Also there seem to be hidden secret options like '--workers', nowhere to be found (not even in rados --help) but acce... Peter Gervai
09:17 AM Documentation #21386 (New): rados: manpage missing import/export
IMPORT AND EXPORT
export [filename]
Serialize pool contents to a file or standard out.
import [--dry-...
Peter Gervai

09/13/2017

11:59 PM Bug #21382 (Resolved): Erasure code recovery should send additional reads if necessary

We don't send additional reads when recovery experiences errors on some of the shards. For recovery we send k read...
David Zafman
09:54 PM Backport #17445: jewel: list-snap cache tier missing promotion logic (was: rbd cli segfault when ...
https://github.com/ceph/ceph/pull/17707 Sage Weil

09/12/2017

06:59 PM Feature #21084 (Fix Under Review): auth: add osd auth caps based on pool metadata
https://github.com/ceph/ceph/pull/17678 Douglas Fuller
05:19 PM Bug #20981: ./run_seed_to_range.sh errored out
I'm lowering the priority because it appears to me to be a test code issue. The test doesn't detect the failure it i... David Zafman
02:17 PM Backport #21374 (Resolved): luminous: incorrect erasure-code space in command ceph df
https://github.com/ceph/ceph/pull/17724 Nathan Cutler
01:44 PM Bug #21243 (Pending Backport): incorrect erasure-code space in command ceph df
Kefu Chai
12:32 PM Bug #21243 (Resolved): incorrect erasure-code space in command ceph df
Chang Liu
12:31 PM Bug #21258 (Closed): "ceph df"'s MAX AVAIL is not correct
Chang Liu
11:04 AM Bug #21354 (Closed): Possible bug in interval_set.intersect_of()
Closing, as the real reason for the issue was a git-merge that went wrong, leaving extra "insert(start, en-start);" c... Piotr Dalek
04:21 AM Bug #21354: Possible bug in interval_set.intersect_of()
I have tried to reproduce this problem myself, but I got the same results with and without the intersection_size_asym... Zac Medico
07:01 AM Feature #21366: tools/ceph-objectstore-tool: split filestore directories offline to target hash l...
https://github.com/ceph/ceph/pull/17666 Zhi Zhang
07:00 AM Feature #21366 (Resolved): tools/ceph-objectstore-tool: split filestore directories offline to ta...
Currently ceph-objectstore-tool can only split dirs that already meet the usual object number criteria. It won't redu... Zhi Zhang
06:21 AM Bug #21338 (Fix Under Review): There is a big risk in function bufferlist::claim_prepend()
https://github.com/ceph/ceph/pull/17661 Kefu Chai
04:13 AM Bug #21365 (Resolved): Daemons(OSD, Mon...) exit abnormally at injectargs command
Using the tell injectargs command to adjust the log level of an osd produces the following error:... Yan Jun

09/11/2017

11:16 PM Bug #20981: ./run_seed_to_range.sh errored out

This bug was filed because the ceph_test_filestore_idempotent_sequence wasn't completing the _exit() in _inject_fai...
David Zafman
03:44 PM Bug #21354 (Closed): Possible bug in interval_set.intersect_of()
I've been working on a different kind of optimization of pg_pool_t::build_removed_snaps (that gets rid of intersect int... Piotr Dalek
09:39 AM Backport #21341 (In Progress): luminous: mon/OSDMonitor: deleting pool while pgs are being create...
Nathan Cutler
09:37 AM Backport #21341 (Resolved): luminous: mon/OSDMonitor: deleting pool while pgs are being created l...
https://github.com/ceph/ceph/pull/17634 Nathan Cutler
09:38 AM Backport #21343 (Resolved): luminous: DNS SRV default service name not used anymore
https://github.com/ceph/ceph/pull/17863 Nathan Cutler
08:17 AM Bug #21338 (Resolved): There is a big risk in function bufferlist::claim_prepend()
Recently I found a design flaw while studying the bufferlist. There is a big risk if we call buffer::list::claim_pre... Ivan Guan
08:10 AM Bug #21303: rocksdb get a error: "Compaction error: Corruption: block checksum mismatch"
[root@ceph241 hw]# ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-1 fsck
action fsck
2017-09-11 15:37:35.698119 ...
黄 维
04:06 AM Bug #18749: OSD: allow EC PGs to do recovery below min_size
https://github.com/ceph/ceph/pull/17619
Greg Farnum, would you mind taking a look?
Chang Liu

09/10/2017

07:17 PM Bug #21309 (Pending Backport): mon/OSDMonitor: deleting pool while pgs are being created leads to...
Sage Weil
07:15 PM Bug #20924: osd: leaked Session on osd.7
/a/sage-2017-09-10_02:50:18-rados-wip-sage-testing-2017-09-08-1434-distro-basic-smithi/1615133 Sage Weil
06:58 PM Bug #21180 (Resolved): Bluestore throttler causes down OSD
Pretty sure this was #21171; fix merged to master and luminous, will be in 12.2.1. Sage Weil
06:57 PM Bug #21246 (Resolved): bluestore: hang while replaying deferred ios from journal
Pretty sure this was #21171. Fix is merged to master and luminous branch, will be in v12.2.1. Sage Weil
06:57 PM Backport #21325 (Resolved): luminous: bluestore: aio submission deadlock
Sage Weil
06:57 PM Bug #21171 (Resolved): bluestore: aio submission deadlock
Sage Weil
12:41 AM Bug #21331: pg recovery priority inversion
... Sage Weil

09/09/2017

08:38 PM Bug #21314: Ceph OSDs crashing in BlueStore::queue_transactions() using EC
I also tried the workaround in http://tracker.ceph.com/issues/21180 by adding these to ceph.conf but no luck:
<pre...
Bob Bobington
07:19 PM Bug #21314: Ceph OSDs crashing in BlueStore::queue_transactions() using EC
After a few crashes the OSDs become permanently lost, consistently displaying errors like this upon startup:... Bob Bobington
05:31 PM Bug #21314: Ceph OSDs crashing in BlueStore::queue_transactions() using EC
I've applied the changes in the Git pull request referenced in that issue and the issue still persists:... Bob Bobington
05:51 AM Bug #21314: Ceph OSDs crashing in BlueStore::queue_transactions() using EC
Hmm, I found another log file and came across this:... Bob Bobington
07:42 PM Bug #21331: pg recovery priority inversion
Actually, this isn't quite right.
The real problem is that the *primary* has an ancient last_complete, because it ...
Sage Weil
06:53 PM Bug #21331: pg recovery priority inversion
it looks like peer_last_commit_ondisk for osd.26 isn't getting updated since it is not in acting (it's backfill target... Sage Weil
06:21 PM Bug #21331 (Resolved): pg recovery priority inversion
... Sage Weil
08:45 AM Bug #21303: rocksdb get a error: "Compaction error: Corruption: block checksum mismatch"
The gdb info may be helpful. It returns null when rocksdb reads metadata from the sst file
(gdb) n
rocksdb::ReadBlockC...
黄 维
04:08 AM Bug #21204 (Pending Backport): DNS SRV default service name not used anymore
Kefu Chai

09/08/2017

08:21 PM Backport #21325 (In Progress): luminous: bluestore: aio submission deadlock
Nathan Cutler
08:20 PM Backport #21325 (Resolved): luminous: bluestore: aio submission deadlock
https://github.com/ceph/ceph/pull/17601 Nathan Cutler
08:18 PM Bug #21314: Ceph OSDs crashing in BlueStore::queue_transactions() using EC
There are no log entries regarding failed heartbeat checks on the failing OSDs, only on the other OSDs witnessing the... Bob Bobington
07:37 PM Bug #21314 (Duplicate): Ceph OSDs crashing in BlueStore::queue_transactions() using EC
It is hard to tell because the lines preceding the snippet are missing, but I'm pretty sure this is a dup of #21171, ... Sage Weil
06:22 PM Bug #21314: Ceph OSDs crashing in BlueStore::queue_transactions() using EC
... Greg Farnum
03:44 PM Bug #21314 (Duplicate): Ceph OSDs crashing in BlueStore::queue_transactions() using EC
Log is attached. 3 of my 4 OSDs have crashed in a similar manner at different times. I'm running Ceph on a single nod... Bob Bobington
06:37 PM Bug #21250 (Resolved): os/bluestore/BlueFS.cc: 1255: FAILED assert(!log_file->fnode.extents.empty())
Nathan Cutler
06:36 PM Backport #21276 (Resolved): luminous: os/bluestore/BlueFS.cc: 1255: FAILED assert(!log_file->fnod...
Nathan Cutler
03:48 PM Backport #21276: luminous: os/bluestore/BlueFS.cc: 1255: FAILED assert(!log_file->fnode.extents.e...
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/17562
merged
Yuri Weinstein
05:51 PM Bug #21123 (Resolved): osd/PrimaryLogPG: sparse read won't trigger repair correctly
Nathan Cutler
05:50 PM Bug #21162 (Resolved): 'osd crush rule rename' not idempotent
Nathan Cutler
05:50 PM Bug #21207 (Resolved): bluestore: asyn cdeferred_try_submit deadlock
Nathan Cutler
05:13 PM Bug #19605 (Resolved): OSD crash: PrimaryLogPG.cc: 8396: FAILED assert(repop_queue.front() == repop)
Nathan Cutler
05:12 PM Bug #20888 (Resolved): "Health check update" log spam
Nathan Cutler
03:57 PM Backport #21133 (Resolved): luminous: osd/PrimaryLogPG: sparse read won't trigger repair correctly
Sage Weil
03:56 PM Backport #21234 (Resolved): luminous: bluestore: asyn cdeferred_try_submit deadlock
Sage Weil
03:56 PM Backport #21182 (Resolved): luminous: 'osd crush rule rename' not idempotent
Sage Weil
03:41 PM Bug #20370: leaked MOSDOp via PrimaryLogPG::_copy_some and PrimaryLogPG::do_proxy_write
/a/yuriw-2017-09-07_19:30:56-rados-wip-yuri-testing4-2017-09-07-1811-distro-basic-smithi/1607597 Sage Weil
02:42 PM Backport #21308: jewel: pre-luminous: aio_read returns erroneous data when rados_osd_op_timeout i...
Nathan, thanks for creating this ticket! Kefu Chai
08:16 AM Backport #21308 (In Progress): jewel: pre-luminous: aio_read returns erroneous data when rados_os...
Nathan Cutler
08:15 AM Backport #21308 (Resolved): jewel: pre-luminous: aio_read returns erroneous data when rados_osd_o...
https://github.com/ceph/ceph/pull/17594 Nathan Cutler
02:26 PM Backport #21242 (Resolved): luminous: OSD crash: PrimaryLogPG.cc: 8396: FAILED assert(repop_queue...
Sage Weil
02:25 PM Backport #21240 (Resolved): luminous: "Health check update" log spam
Sage Weil
02:24 PM Backport #21238 (Resolved): luminous: test_health_warnings.sh can fail
Sage Weil
12:20 PM Bug #21293 (Resolved): bluestore: spanning blob doesn't match expected ref_map
Sage Weil
12:18 PM Bug #21171 (Pending Backport): bluestore: aio submission deadlock
https://github.com/ceph/ceph/pull/17601 is the backport Sage Weil
12:05 PM Bug #21309 (Fix Under Review): mon/OSDMonitor: deleting pool while pgs are being created leads to...
https://github.com/ceph/ceph/pull/17600 Joao Eduardo Luis
11:56 AM Bug #21309 (In Progress): mon/OSDMonitor: deleting pool while pgs are being created leads to asse...
Joao Eduardo Luis
11:55 AM Bug #21309 (Resolved): mon/OSDMonitor: deleting pool while pgs are being created leads to assert(...
ceph version 13.0.0-429-gbc5fe2e (bc5fe2e9099dbb560c2153d3ac85f38b46593a77) mimic (dev)
Easily reproducible on a v...
Joao Eduardo Luis
08:14 AM Backport #21307 (Resolved): luminous: Client client.admin marked osd.2 out, after it was down for...
https://github.com/ceph/ceph/pull/17862 Nathan Cutler
08:14 AM Bug #20616 (Pending Backport): pre-luminous: aio_read returns erroneous data when rados_osd_op_ti...
Fixed in Infernalis by https://github.com/ceph/ceph/commit/64bca33ae76646879e6801c45e6d91852e488f8b
Needs backport...
Nathan Cutler
07:32 AM Bug #20616 (Fix Under Review): pre-luminous: aio_read returns erroneous data when rados_osd_op_ti...
This only happens if "rados_osd_op_timeout > 0", where the rx_buffer optimization is disabled due to #9582. In that ... Kefu Chai
06:04 AM Bug #21303 (Resolved): rocksdb get a error: "Compaction error: Corruption: block checksum mismatch"
ceph --version
ceph version 12.1.0.5 (27f32562975c5fd3b785a124c818599c677b3f67) luminous (dev)
osd log:
2017-09-...
黄 维

09/07/2017

09:23 PM Bug #21249 (Pending Backport): Client client.admin marked osd.2 out, after it was down for 150462...
Sage Weil
08:47 PM Bug #21171: bluestore: aio submission deadlock
There was also an aio submission bug that dropped ios on the floor. It was consistently reproducible with... Sage Weil
06:42 PM Bug #20910 (In Progress): spurious MON_DOWN, apparently slow/laggy mon
Ok, this is still happening.. and it correlated with (1) bluestore and (2) bluestore fsck on mount, which spews an un... Sage Weil
01:57 AM Bug #20910: spurious MON_DOWN, apparently slow/laggy mon
*master PR for backport*: https://github.com/ceph/ceph/pull/17505 Nathan Cutler
01:29 PM Bug #21293: bluestore: spanning blob doesn't match expected ref_map
... Sage Weil
01:29 PM Bug #21293 (Fix Under Review): bluestore: spanning blob doesn't match expected ref_map
https://github.com/ceph/ceph/pull/17569 Sage Weil
01:11 PM Bug #21293 (Resolved): bluestore: spanning blob doesn't match expected ref_map
... Sage Weil
01:02 PM Backport #21283 (In Progress): luminous: spurious MON_DOWN, apparently slow/laggy mon
Abhishek Lekshmanan
07:36 AM Backport #21283 (Resolved): luminous: spurious MON_DOWN, apparently slow/laggy mon
https://github.com/ceph/ceph/pull/17564 Nathan Cutler
01:00 PM Backport #21276 (In Progress): luminous: os/bluestore/BlueFS.cc: 1255: FAILED assert(!log_file->f...
Abhishek Lekshmanan
07:35 AM Backport #21276 (Resolved): luminous: os/bluestore/BlueFS.cc: 1255: FAILED assert(!log_file->fnod...
https://github.com/ceph/ceph/pull/17562 Nathan Cutler
10:49 AM Bug #20616: pre-luminous: aio_read returns erroneous data when rados_osd_op_timeout is set but no...
I am able to reproduce this issue with the latest jewel, but not master.
reverting 126d0b30e990519b8f845f99ba893fdcd...
Kefu Chai
09:01 AM Bug #21287: 1 PG down, OSD fails with "FAILED assert(i->prior_version == last || i->is_error())"
BTW, the down PG is 1.1735.
Starting OSD 381 crashes 65, 133 and 118. Stopping 65 allows the remaining OSDs to start, start...
Henrik Korkuc
08:14 AM Bug #21287 (Duplicate): 1 PG down, OSD fails with "FAILED assert(i->prior_version == last || i->i...
One PG went down for me during large rebalance (I added racks to OSD placement, almost all data had to be shuffled). ... Henrik Korkuc
08:16 AM Bug #21180: Bluestore throttler causes down OSD
pool used for this workload is blocked by down PG (#21287), but I'll try to replicate on same cluster with newly crea... Henrik Korkuc
05:27 AM Bug #21204 (Fix Under Review): DNS SRV default service name not used anymore
https://github.com/ceph/ceph/pull/17539 Kefu Chai
02:44 AM Bug #21258: "ceph df"'s MAX AVAIL is not correct
Josh Durgin wrote:
> What is your crushmap and device sizes? It looks like you may have different roots, hence diffe...
Chang Liu
01:31 AM Bug #21262: cephfs ec data pool, many osds marked down
Yes. The log is not about only one issue. In total, the issues are as below:
1. slow request, osd marked down, osd op suicide ca...
Yong Wang

09/06/2017

09:03 PM Bug #20910 (Pending Backport): spurious MON_DOWN, apparently slow/laggy mon
Sage Weil
09:02 PM Bug #20910 (Resolved): spurious MON_DOWN, apparently slow/laggy mon
the problem is that bluestore logs so freaking much at debug bluestore = 30 that the mon gets all laggy. Sage Weil
08:52 PM Bug #21250 (Pending Backport): os/bluestore/BlueFS.cc: 1255: FAILED assert(!log_file->fnode.exten...
Sage Weil
05:03 PM Bug #21249 (Fix Under Review): Client client.admin marked osd.2 out, after it was down for 150462...
https://github.com/ceph/ceph/pull/17525 John Spray
04:35 PM Bug #21262 (Need More Info): cephfs ec data pool, many osds marked down
Sage Weil
03:44 PM Bug #21262: cephfs ec data pool, many osds marked down
You're hitting a variety of issues there - some suggesting on-disk corruption, the unexpected error indicating a like... Josh Durgin
02:26 PM Bug #21262: cephfs ec data pool, many osds marked down
Related error
ceph-osd.22.log:/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AV...
Yong Wang
02:16 PM Bug #21262 (Need More Info): cephfs ec data pool, many osds marked down
cephfs ec data pool, many osds marked down
Slow requests, get flow blocked, deal op blocked, etc.
Yong Wang
04:34 PM Bug #21180 (Need More Info): Bluestore throttler causes down OSD
Sage Weil
04:34 PM Bug #21180: Bluestore throttler causes down OSD
Can you try setting bluestore_deferred_throttle_bytes = 0 along with bluestore_throttle_bytes = 0 and see if that res... Sage Weil
04:32 PM Bug #21246: bluestore: hang while replaying deferred ios from journal
This looks like it might be the same as #21171, or one of the related bugs I am currently working on. As soon as I h... Sage Weil
03:18 PM Bug #21258 (Fix Under Review): "ceph df"'s MAX AVAIL is not correct
Ah I see your PR now: https://github.com/ceph/ceph/pull/17513 Josh Durgin
03:16 PM Bug #21258: "ceph df"'s MAX AVAIL is not correct
What is your crushmap and device sizes? It looks like you may have different roots, hence different space available i... Josh Durgin
03:45 AM Bug #21258 (Closed): "ceph df"'s MAX AVAIL is not correct
... Chang Liu
03:18 PM Bug #21263: when disk error happens, osd reports assertion failure without any error information
Will fix it in this PR:
https://github.com/ceph/ceph/pull/17522
Pan Liu
02:38 PM Bug #21263 (Resolved): when disk error happens, osd reports assertion failure without any error i...
I used fio+librbd to test one OSD (bluestore), which was built on an NVMe SSD. After I unplugged this SSD, the OSD reports asse... Pan Liu
11:40 AM Bug #21143: bad RESETSESSION between OSDs?
@yuri, this PR is not merged. Or did I misunderstand your comment here? Kefu Chai
07:44 AM Bug #21243: incorrect erasure-code space in command ceph df
https://github.com/ceph/ceph/pull/17513 Chang Liu
05:56 AM Feature #21198: Monitors don't handle incomplete network splits
the same case:
https://marc.info/?l=ceph-devel&w=2&r=1&s=ceph-mon+leader+election+problem&q=b
zhiang li

09/05/2017

08:49 PM Bug #20041 (Resolved): ceph-osd: PGs getting stuck in scrub state, stalling RBD
Nathan Cutler
08:49 PM Backport #20780 (Resolved): jewel: ceph-osd: PGs getting stuck in scrub state, stalling RBD
Nathan Cutler
08:47 PM Bug #20464 (Resolved): cache tier osd memory high memory consumption
Nathan Cutler
08:47 PM Backport #20511 (Resolved): jewel: cache tier osd memory high memory consumption
Nathan Cutler
08:46 PM Bug #20375 (Resolved): osd: omap threadpool heartbeat is only reset every 100 values
Nathan Cutler
08:46 PM Backport #20492 (Resolved): jewel: osd: omap threadpool heartbeat is only reset every 100 values
Nathan Cutler
07:02 PM Bug #21250 (Fix Under Review): os/bluestore/BlueFS.cc: 1255: FAILED assert(!log_file->fnode.exten...
https://github.com/ceph/ceph/pull/17503 Sage Weil
06:53 PM Bug #21250: os/bluestore/BlueFS.cc: 1255: FAILED assert(!log_file->fnode.extents.empty())
looks like two concurrent threads trying to compact_log_async:... Sage Weil
06:51 PM Bug #21250 (Resolved): os/bluestore/BlueFS.cc: 1255: FAILED assert(!log_file->fnode.extents.empty())
... Sage Weil
04:49 PM Bug #21249 (Resolved): Client client.admin marked osd.2 out, after it was down for 1504627577 sec...
... Sage Weil
03:30 PM Bug #20843 (Resolved): assert(i->prior_version == last) when a MODIFY entry follows an ERROR entry
Nathan Cutler
03:30 PM Backport #20930 (Rejected): kraken: assert(i->prior_version == last) when a MODIFY entry follows ...
Kraken is EOL. Nathan Cutler
03:30 PM Backport #20722 (Rejected): kraken: rados ls on pool with no access returns no error
Kraken is EOL. Nathan Cutler
03:29 PM Backport #20493 (Rejected): kraken: osd: omap threadpool heartbeat is only reset every 100 values
Kraken is EOL. Nathan Cutler
03:21 PM Backport #21242 (In Progress): luminous: OSD crash: PrimaryLogPG.cc: 8396: FAILED assert(repop_qu...
Nathan Cutler
09:10 AM Backport #21242 (Resolved): luminous: OSD crash: PrimaryLogPG.cc: 8396: FAILED assert(repop_queue...
https://github.com/ceph/ceph/pull/17501 Nathan Cutler
03:20 PM Backport #21240 (In Progress): luminous: "Health check update" log spam
Nathan Cutler
09:09 AM Backport #21240 (Resolved): luminous: "Health check update" log spam
https://github.com/ceph/ceph/pull/17500 Nathan Cutler
03:18 PM Backport #21238 (In Progress): luminous: test_health_warnings.sh can fail
Nathan Cutler
09:09 AM Backport #21238 (Resolved): luminous: test_health_warnings.sh can fail
https://github.com/ceph/ceph/pull/17498 Nathan Cutler
03:15 PM Backport #21236 (In Progress): luminous: build_initial_pg_history doesn't update up/acting/etc
Nathan Cutler
09:09 AM Backport #21236 (Resolved): luminous: build_initial_pg_history doesn't update up/acting/etc
https://github.com/ceph/ceph/pull/17496
https://github.com/ceph/ceph/pull/17622
Nathan Cutler
03:13 PM Backport #21235 (In Progress): luminous: thrashosds read error injection doesn't take live_osds i...
Nathan Cutler
09:09 AM Backport #21235 (Resolved): luminous: thrashosds read error injection doesn't take live_osds into...
https://github.com/ceph/ceph/pull/17495 Nathan Cutler
03:12 PM Backport #21234 (In Progress): luminous: bluestore: asyn cdeferred_try_submit deadlock
Nathan Cutler
09:09 AM Backport #21234 (Resolved): luminous: bluestore: asyn cdeferred_try_submit deadlock
https://github.com/ceph/ceph/pull/17494 Nathan Cutler
01:02 PM Bug #21243: incorrect erasure-code space in command ceph df
Not only the ISA plugin; it's a common problem.... Chang Liu
01:00 PM Bug #21243: incorrect erasure-code space in command ceph df
... Chang Liu
11:09 AM Bug #21243 (Resolved): incorrect erasure-code space in command ceph df
ceph osd erasure-code-profile set ISA plugin=isa k=2 m=2 crush-failure-domain=host crush-device-c...
Petr Malkov
12:56 PM Bug #21246 (Resolved): bluestore: hang while replaying deferred ios from journal
Running ceph-osd-11.2.0-0.el7.x86_64 from ceph-stable's CentOS repository, I hit the following problem. The cluster (... Tobias Florek
10:22 AM Bug #21180: Bluestore throttler causes down OSD
just an update - sometimes even with bluestore_throttle_bytes set to 0 I get down OSDs, but it is much more rare and ... Henrik Korkuc
09:51 AM Backport #21182 (In Progress): luminous: 'osd crush rule rename' not idempotent
Nathan Cutler
09:39 AM Backport #21133 (In Progress): luminous: osd/PrimaryLogPG: sparse read won't trigger repair corre...
Nathan Cutler
09:38 AM Backport #21132 (Resolved): luminous: qa/standalone/scrub/osd-scrub-repair.sh timeout
Nathan Cutler
09:09 AM Backport #21239 (Resolved): jewel: test_health_warnings.sh can fail
https://github.com/ceph/ceph/pull/20289 Nathan Cutler

09/04/2017

08:36 PM Bug #20785 (Resolved): osd/osd_types.cc: 3574: FAILED assert(lastmap->get_pools().count(pgid.pool...
Nathan Cutler
01:43 PM Bug #20785: osd/osd_types.cc: 3574: FAILED assert(lastmap->get_pools().count(pgid.pool()))
Thanks Joao, I am commenting on https://github.com/ceph/ceph/pull/17191 so it references https://github.com/ceph/ceph... Kefu Chai
12:57 PM Bug #20785: osd/osd_types.cc: 3574: FAILED assert(lastmap->get_pools().count(pgid.pool()))
doh. I missed the needs-backport tag on the pr :( Joao Eduardo Luis
12:14 PM Bug #20785: osd/osd_types.cc: 3574: FAILED assert(lastmap->get_pools().count(pgid.pool()))
Joao, I changed status to "Pending Backport" but the PR also has the "needs-backport" label, which is perhaps enou... Nathan Cutler
12:13 PM Bug #20785 (Pending Backport): osd/osd_types.cc: 3574: FAILED assert(lastmap->get_pools().count(p...
Nathan Cutler
11:19 AM Bug #20785: osd/osd_types.cc: 3574: FAILED assert(lastmap->get_pools().count(pgid.pool()))
I may be wrong, but it looks like the commit fixing this is only present in current master. I was under the impressio... Joao Eduardo Luis
04:15 PM Bug #21227 (New): [osd]default mkfs.xfs option may make some problem
the default mkfs.xfs osd with -i size 2048
xfs=[
# xfs insists on not overwriting previous fs; even if...
peng zhang
10:39 AM Bug #21171: bluestore: aio submission deadlock
Sage, is there an identifiable behavior when this happens? Do the osds die, or is IO simply forever blocked? Joao Eduardo Luis
09:33 AM Backport #20781 (Rejected): kraken: ceph-osd: PGs getting stuck in scrub state, stalling RBD
Kraken is EOL. Nathan Cutler
06:46 AM Bug #21207 (Pending Backport): bluestore: asyn cdeferred_try_submit deadlock
xie xingguo

09/02/2017

06:36 PM Bug #20888 (Pending Backport): "Health check update" log spam
Sage Weil
06:35 PM Bug #21206 (Pending Backport): thrashosds read error injection doesn't take live_osds into account
Sage Weil
06:34 PM Bug #21203 (Pending Backport): build_initial_pg_history doesn't update up/acting/etc
Sage Weil
04:15 AM Bug #21174: OSD crash: 903: FAILED assert(objiter->second->version > last_divergent_update)
I've got exactly the same problem with kernel client. But fuse client seems fine with ec pool on cephfs George Zhao
01:04 AM Bug #20981: ./run_seed_to_range.sh errored out
one more here http://qa-proxy.ceph.com/teuthology/yuriw-2017-09-01_23:34:11-rados-wip-yuri-testing-2017-08-31-2109-di... Yuri Weinstein
 
