Sage Weil's activity

From 07/20/2018 to 08/18/2018

08/18/2018

07:12 PM Messengers Bug #26963 (New): msg/async: segv in _try_send
... Sage Weil
07:05 PM Ceph Bug #21416: osd/PGLog.cc: 60: FAILED assert(s <= can_rollback_to) after upgrade to luminous
...and the reason is just that the osd.0 is way behind, and the primary is not throttling its work accordingly. And... Sage Weil
06:03 PM Ceph Bug #21416: osd/PGLog.cc: 60: FAILED assert(s <= can_rollback_to) after upgrade to luminous
the fix is clearly to not trim past can_rollback_to. the other question, though, is why can_rollback_to is so far be... Sage Weil
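The idea in the comment above (never trim past can_rollback_to) can be sketched roughly as follows. This is only an illustration, not the actual PGLog code: the function name `clamp_trim_target` and the representation of log versions as `(epoch, version)` tuples are assumptions made for the example.

```python
def clamp_trim_target(requested_trim_to, can_rollback_to):
    """Return a trim point that never goes past can_rollback_to.

    Both arguments are hypothetical (epoch, version) tuples; Python
    compares tuples lexicographically, so min() picks the earlier one.
    """
    return min(requested_trim_to, can_rollback_to)


# A request to trim beyond the rollback point is pulled back to it.
clamped = clamp_trim_target((12, 500), (12, 420))
# A request that is already behind the rollback point is left unchanged.
unchanged = clamp_trim_target((12, 100), (12, 420))
```

Clamping only keeps the assert from firing; it does not address the other question raised in the comment, namely why can_rollback_to lags so far behind.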

08/17/2018

02:27 PM RADOS Bug #26958 (Resolved): osd/ReplicatedBackend.cc: 1321: FAILED assert(get_parent()->get_log().get_...
... Sage Weil
02:06 PM Messengers Bug #23957: msg/async: read connect reply failed, but not retry
Another instance. On receiving end,... Sage Weil

08/16/2018

05:58 PM RADOS Bug #24612: FAILED assert(osdmap_manifest.pinned.empty()) in OSDMonitor::prune_init()
/a/sage-2018-08-15_15:49:39-rados-wip-sage2-testing-2018-08-15-0731-distro-basic-smithi/2908178
Sage Weil

08/14/2018

10:43 PM RADOS Bug #24866: FAILED assert(0 == "past_interval start interval mismatch") in check_past_interval_bo...
Generally yes, but I haven't been able to reproduce to test a solution. I take it this has happened to you?
I'm h...
Sage Weil
09:58 PM RADOS Bug #26947 (Resolved): ENOENT on collection_move_rename from divergent activate
... Sage Weil
04:56 PM RADOS Bug #26940 (Fix Under Review): force-create-pg broken
https://github.com/ceph/ceph/pull/23572 Sage Weil
03:53 PM RADOS Bug #26940 (Resolved): force-create-pg broken
This commit -
https://github.com/ceph/ceph/commit/7797ed67d2f9140b7eb9f182b06d04233e1e309c
has introduced regressio...
Sage Weil
03:34 PM bluestore Bug #24439 (Pending Backport): os/bluestore/BlueStore.cc: 1025: FAILED assert(buffer_bytes >= b->...
Sage Weil

08/13/2018

08:54 PM teuthology Revision d30ae426 (teuthology): Merge pull request #1201 from ceph/wip-syslog-whitelist-info
internal/syslog: whitelist ceph-crash daemon messages Sage Weil
06:01 PM RADOS Bug #26890 (Pending Backport): scrub livelock
Sage Weil
06:01 PM Ceph Bug #22056 (Pending Backport): segv in OSDMap::calc_pg_upmaps from balancer
Sage Weil
05:14 PM bluestore Bug #24439 (Fix Under Review): os/bluestore/BlueStore.cc: 1025: FAILED assert(buffer_bytes >= b->...
https://github.com/ceph/ceph/pull/23552 Sage Weil
05:09 PM bluestore Bug #26902 (Duplicate): ObjectStore/StoreTest.ColSplitTest1Clones/2 failure
Sage Weil

08/12/2018

08:40 PM bluestore Bug #26902 (Duplicate): ObjectStore/StoreTest.ColSplitTest1Clones/2 failure
... Sage Weil
08:38 PM RADOS Bug #21592: LibRadosCWriteOps.CmpExt got 0 instead of -4095-1
/a/sage-2018-08-11_18:40:58-rados-wip-sage-testing-2018-08-11-1120-distro-basic-smithi/2893875... Sage Weil
05:38 PM CephFS Bug #26901 (New): mds: no throttlers set on incoming messages
This means aggressive clients can consume unbounded mds memory.
See mon/mgr/osd throttlers for comparison:...
Sage Weil

08/09/2018

02:00 PM RADOS Bug #26891 (New): backfill reservation deadlock/stall

on backfill target:
- get backfill request, queue RequestBackfillPrio...
Sage Weil
01:34 PM RADOS Bug #26890: scrub livelock
https://github.com/ceph/ceph/pull/23512 Sage Weil
01:32 PM RADOS Bug #26890 (Resolved): scrub livelock
- both osds locally reserve a scrub slot
- both osds send a scrub schedule request
- both scrub requests are reject...
Sage Weil
01:06 PM Ceph Bug #26857 (Pending Backport): log: buffer overrun
Sage Weil

08/08/2018

09:42 PM RADOS Bug #25146: "rocksdb: Corruption: Can't access /000000.sst" in upgrade:mimic-x:parallel-master-di...
I think we need to fix this sooner rather than later. My suggestion is to incorporate enough of the original rocksdb... Sage Weil
06:53 PM rgw Feature #2804 (Rejected): rgw: disallow running multiple gateways on the same fastcgi socket
Sage Weil
06:53 PM Ceph Cleanup #2731 (Closed): ceph tell osd <num> should be ceph osd <subcommand> to match ceph pg <sub...
Sage Weil
06:52 PM RADOS Feature #1126 (Rejected): crush: extend rule definition
actually, you can do the above, just set size=3 and you'll get 2 in first rack and 1 in second rack. Sage Weil
06:51 PM Ceph Feature #364 (Rejected): osd: leave gap after journal entries to avoid full disk rotation
no more futzing with filestore Sage Weil
06:49 PM RADOS Feature #85 (Fix Under Review): osd: pg_num shrink
https://github.com/ceph/ceph/pull/20469 Sage Weil
06:33 PM RADOS Feature #84 (In Progress): mon: auto adjust pg_num as pool grows
Sage Weil
02:08 AM bluestore Bug #25207: ceph-volume lvm create gives segmentation fault
This looks a bit like the error we see when jemalloc is enabled in /etc/{default,sysconfig}/ceph. Can you see if it ... Sage Weil

08/07/2018

10:02 PM RADOS Bug #26875 (Resolved): kv: MergeOperator name() returns string, and caller calls c_str() on the t...
On Tue, 7 Aug 2018, Réka Nikolett Kovács wrote:
> Hi,
>
> I am working on a bug finding tool that looks for a ...
Sage Weil
01:55 PM Ceph Bug #25107 (Pending Backport): common: (mon) command sanitization accepts floats when Int type is...
let's let this bake for a while Sage Weil

08/06/2018

06:14 PM Ceph Bug #26866 (Fix Under Review): OSDMapMapping does not handle active.size() > pool size
https://github.com/ceph/ceph/pull/23449 Sage Weil
05:08 PM Ceph Bug #26866 (Resolved): OSDMapMapping does not handle active.size() > pool size
In some cases the active vector could be larger than the pool size (e.g., residual pg_temp mapping after pool size is... Sage Weil

08/04/2018

03:21 AM Ceph Bug #26857: log: buffer overrun
https://github.com/ceph/ceph/pull/23422 Sage Weil
03:20 AM Ceph Bug #26857 (Resolved): log: buffer overrun
gibberish in log file when large (>64k) entries are written to the log.
caused by 65da5ba216cafb8a91893d0e7fc09220...
Sage Weil

08/02/2018

03:57 PM RADOS Bug #25182: Upmaps forgotten after restarting OSDs
Hmm, I wasn't able to reproduce this... Sage Weil
03:30 PM RADOS Bug #25182: Upmaps forgotten after restarting OSDs
It is expected that the upmaps may evaporate if the "raw" CRUSH mapping changes. This shouldn't happen for osd up/do... Sage Weil

08/01/2018

10:18 PM RADOS Bug #25146: "rocksdb: Corruption: Can't access /000000.sst" in upgrade:mimic-x:parallel-master-di...
another option would be to only partially revert, and keep just the bits that ignore the older deleted log files. Sage Weil
02:12 PM Ceph Wiki edit: CDM_01-AUG-2018 (#7)
Sage Weil
03:46 AM Ceph Wiki edit: CDM_01-AUG-2018 (#6)
Sage Weil
01:06 PM RADOS Bug #25181 (Duplicate): /mon/OSDMonitor.cc: 1821: FAILED assert(osdmap_manifest.pinned.empty())
Sage Weil
01:06 PM RADOS Bug #24612: FAILED assert(osdmap_manifest.pinned.empty()) in OSDMonitor::prune_init()
/a/sage-2018-07-31_21:57:28-rados-wip-sage-testing-2018-07-31-1436-distro-basic-smithi/2844443
/a/sage-2018-07-30_13...
Sage Weil
01:17 AM Ceph Feature #24878 (Pending Backport): [RFE] Filestore split log should show PG that is splitting
Sage Weil
01:16 AM Ceph Bug #25007 (Pending Backport): common: Cond.h:C_SaferCond does not check done before calling cond...
Sage Weil
01:15 AM Messengers Bug #25208 (Duplicate): msg/async/AsyncConnection.cc: 1710: FAILED assert(can_write == WriteStatu...
... Sage Weil

07/31/2018

09:24 PM RADOS Bug #25198 (Pending Backport): FAILED assert(trim_to <= info.last_complete) in PGLog::trim()
Sage Weil
07:25 PM RADOS Bug #24485: LibRadosTwoPoolsPP.ManifestUnset failure
/a/sage-2018-07-31_14:52:20-rados:thrash-wip-sage2-testing-2018-07-30-1049-distro-basic-smithi/2843268 Sage Weil
07:08 PM RADOS Bug #25175 (Pending Backport): rados python bindings use prval from stack
https://github.com/ceph/ceph/pull/23334 Sage Weil
06:37 PM teuthology Bug #16142 (Won't Fix): Exception during internal.connect fails to unlock machines
I think this is moot with fog? Sage Weil
06:36 PM teuthology Bug #18249 (Won't Fix): suite --ceph-repo option doesn't change workunit.py's repo
I think this is by design (or at least fine).. it makes sense to me that workunit would pull from the suite repo and ... Sage Weil
06:25 PM bluestore Bug #25006: bad csum during upgrade test
Nathan Cutler wrote:
> Also, I noticed this in the test yaml:
>
> [...]
>
> The only thing within @parallel@ i...
Sage Weil

07/30/2018

08:17 PM RADOS Bug #24485: LibRadosTwoPoolsPP.ManifestUnset failure
/a/sage-2018-07-30_13:46:50-rados-wip-sage3-testing-2018-07-28-1512-distro-basic-smithi/2838971 Sage Weil
08:16 PM RADOS Bug #25181 (Duplicate): /mon/OSDMonitor.cc: 1821: FAILED assert(osdmap_manifest.pinned.empty())
... Sage Weil
08:13 PM bluestore Bug #25180 (Resolved): ObjectStore/StoreTest.CompressionTest/2 fail
... Sage Weil
07:55 PM Ceph Revision 1ebafdb6 (ceph): Merge pull request #23292 from yuriw/wip-yuriw-25140-master
qa/tests: added 1st draft of mimic-x suite Sage Weil
07:55 PM Ceph Revision c6dd193f (ceph): Merge pull request #23302 from yuriw/wip-yuriw-crontab-master
qa/tests: added mimic-x to the schedule Sage Weil
07:15 PM RADOS Bug #25175 (Resolved): rados python bindings use prval from stack
these methods include
- omap_get_vals
- omap_get_keys
- omap_get_vals_by_keys
Sage Weil
01:42 PM RADOS Bug #25155 (Can't reproduce): mon crash from 'ceph osd erasure-code-profile set lrcprofile name=l...
... Sage Weil
12:39 PM Ceph Wiki edit: CDM_01-AUG-2018 (#5)
Sage Weil

07/28/2018

07:55 PM RADOS Bug #20798: LibRadosLockECPP.LockExclusiveDurPP gets EEXIST
/a/sage-2018-07-27_22:50:28-rados-wip-sage-testing-2018-07-27-0744-distro-basic-smithi/2826326 Sage Weil

07/26/2018

09:42 PM teuthology Bug #25129: Race condition in install task
Why is the install task querying shaman a second time? Seems like that's the source of the race... Sage Weil
04:39 PM bluestore Bug #22102 (Won't Fix): BlueStore crashed on rocksdb checksum mismatch
This appears to be a kernel bug related to swapping.
So far no indication it affects distro kernels.
Sage Weil
04:31 PM Ceph Revision dd471db8 (ceph): Merge pull request #23262 from liewegas/wip-mimic-p2p
mimic: qa/suites/upgrade/mimic-p2p: allow target version to apply Sage Weil
02:41 PM Ceph Wiki edit: CDM_01-AUG-2018 (#3)
Sage Weil

07/25/2018

09:30 PM bluestore Bug #22464 (Won't Fix): Bluestore: many checksum errors, always 0x6706be76 (which matches a zero ...
I'm going to close this given that all of the evidence seems to point to a kernel bug with swap. Sage Weil
09:20 PM bluestore Bug #24903 (Resolved): Update 12.2.5 -> 12.2.6: block.db symlink exists but target unusable
Sage Weil
08:14 PM Ceph Bug #25107 (Fix Under Review): common: (mon) command sanitization accepts floats when Int type is...
https://github.com/ceph/ceph/pull/23243 Sage Weil
05:03 PM mgr Bug #25103 (Resolved): mgr: pgs show in unknown state despite being active
- mgr restarts
- mgr receives reports...
Sage Weil
12:27 PM RADOS Bug #25057: jewel->luminous: osdmap crc mismatch
luminous: https://github.com/ceph/ceph/pull/23227
mimic: https://github.com/ceph/ceph/pull/23226
Sage Weil
12:07 PM RADOS Bug #25057: jewel->luminous: osdmap crc mismatch
The problem was that CRUSH_TUNABLES5 was associated with kraken instead of jewel in 0ceb5c0, backported to luminous ... Sage Weil
12:00 PM RADOS Bug #25057 (Pending Backport): jewel->luminous: osdmap crc mismatch
https://github.com/ceph/ceph/pull/23220 Sage Weil
12:07 PM Ceph Revision 94264708 (ceph): Merge pull request #23227 from liewegas/wip-25057-luminous
luminous: osd/OSDMap: CRUSH_TUNABLES5 added in jewel, not kraken Sage Weil

07/24/2018

10:55 PM Ceph Revision 27728c38 (ceph): Merge pull request #23219 from liewegas/wip-slow-requests-upgrade-luminous
luminous: qa/suites/upgrade/jewel-x: whitelist 'slow requests' Sage Weil
07:18 PM RADOS Bug #25057 (In Progress): jewel->luminous: osdmap crc mismatch
Sage Weil
06:41 PM RADOS Bug #25057: jewel->luminous: osdmap crc mismatch
/a/teuthology-2018-07-20_04:23:01-upgrade:jewel-x-luminous-distro-basic-smithi/2799173
is an instance where the mo...
Sage Weil
02:43 AM Ceph Revision 13b25fd4 (ceph): Merge pull request #23164 from smithfarm/wip-25056-mimic
mimic: tests: upgrade/luminous-x: whitelist REQUEST_SLOW for rados_mon_thrash Sage Weil

07/22/2018

12:55 PM RADOS Bug #25057 (Resolved): jewel->luminous: osdmap crc mismatch
The upgrade/jewel-x runs for 12.2.6 and 12.2.7 threw osdmap crc mismatch errors. Sage Weil
12:41 PM teuthology Feature #24760: Add ability to check/install (/from) chacra.ceph.com and/or download.ceph.com for...
Can we discuss this in standup this week (infra standup monday?) and come to a consensus on the solution?
This gap...
Sage Weil

07/20/2018

08:59 PM Ceph Revision e2465fdc (ceph): Merge pull request #23151 from neha-ojha/wip-25008
qa/suites/powercycle: whitelist MDS_SLOW_REQUEST Sage Weil
06:24 PM Ceph Bug #24948 (Pending Backport): SPDK compiles with -march=native
Sage Weil
12:38 PM RADOS Bug #25017 (Duplicate): log [ERR] : 1.3 past_intervals [182,196) start interval does not contain ...
... Sage Weil
04:02 AM website Bug #25012 (New): change all download links to https, publish checksums
... Sage Weil
 
