Activity

From 08/13/2017 to 09/11/2017

09/11/2017

11:16 PM Bug #20981: ./run_seed_to_range.sh errored out

This bug was filed because the ceph_test_filestore_idempotent_sequence wasn't completing the _exit() in _inject_fai...
David Zafman
03:44 PM Bug #21354 (Closed): Possible bug in interval_set.intersect_of()
I've been working on a different kind of optimization of pg_pool_t::build_removed_snaps (that gets rid of intersect int... Piotr Dalek
09:39 AM Backport #21341 (In Progress): luminous: mon/OSDMonitor: deleting pool while pgs are being create...
Nathan Cutler
09:37 AM Backport #21341 (Resolved): luminous: mon/OSDMonitor: deleting pool while pgs are being created l...
https://github.com/ceph/ceph/pull/17634 Nathan Cutler
09:38 AM Backport #21343 (Resolved): luminous: DNS SRV default service name not used anymore
https://github.com/ceph/ceph/pull/17863 Nathan Cutler
08:17 AM Bug #21338 (Resolved): There is a big risk in function bufferlist::claim_prepend()
Recently, while studying bufferlist, I found a design flaw. There is a big risk if we call buffer::list::claim_pre... Ivan Guan
08:10 AM Bug #21303: rocksdb get a error: "Compaction error: Corruption: block checksum mismatch"
[root@ceph241 hw]# ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-1 fsck
action fsck
2017-09-11 15:37:35.698119 ...
黄 维
04:06 AM Bug #18749: OSD: allow EC PGs to do recovery below min_size
https://github.com/ceph/ceph/pull/17619
Greg Farnum, would you mind taking a look?
Chang Liu

09/10/2017

07:17 PM Bug #21309 (Pending Backport): mon/OSDMonitor: deleting pool while pgs are being created leads to...
Sage Weil
07:15 PM Bug #20924: osd: leaked Session on osd.7
/a/sage-2017-09-10_02:50:18-rados-wip-sage-testing-2017-09-08-1434-distro-basic-smithi/1615133 Sage Weil
06:58 PM Bug #21180 (Resolved): Bluestore throttler causes down OSD
Pretty sure this was #21171, fixed merged to master and luminous, will be in 12.2.1. Sage Weil
06:57 PM Bug #21246 (Resolved): bluestore: hang while replaying deferred ios from journal
Pretty sure this was #21171. Fix is merged to master and luminous branch, will be in v12.2.1. Sage Weil
06:57 PM Backport #21325 (Resolved): luminous: bluestore: aio submission deadlock
Sage Weil
06:57 PM Bug #21171 (Resolved): bluestore: aio submission deadlock
Sage Weil
12:41 AM Bug #21331: pg recovery priority inversion
... Sage Weil

09/09/2017

08:38 PM Bug #21314: Ceph OSDs crashing in BlueStore::queue_transactions() using EC
I also tried the workaround in http://tracker.ceph.com/issues/21180 by adding these to ceph.conf but no luck:
<pre...
Bob Bobington
07:19 PM Bug #21314: Ceph OSDs crashing in BlueStore::queue_transactions() using EC
After a few crashes the OSDs become permanently lost, consistently displaying errors like this upon startup:... Bob Bobington
05:31 PM Bug #21314: Ceph OSDs crashing in BlueStore::queue_transactions() using EC
I've applied the changes in the Git pull request referenced in that issue and the issue still persists:... Bob Bobington
05:51 AM Bug #21314: Ceph OSDs crashing in BlueStore::queue_transactions() using EC
Hmm, I found another log file and came across this:... Bob Bobington
07:42 PM Bug #21331: pg recovery priority inversion
Actually, this isn't quite right.
The real problem is that the *primary* has an ancient last_complete, because it ...
Sage Weil
06:53 PM Bug #21331: pg recovery priority inversion
it looks like peer_last_commit_ondisk for osd.26 isn't getting updated since it is not in acting (it's backfill target... Sage Weil
06:21 PM Bug #21331 (Resolved): pg recovery priority inversion
... Sage Weil
08:45 AM Bug #21303: rocksdb get a error: "Compaction error: Corruption: block checksum mismatch"
The gdb info may be helpful. It returns null when rocksdb reads metadata from the sst file
(gdb) n
rocksdb::ReadBlockC...
黄 维
04:08 AM Bug #21204 (Pending Backport): DNS SRV default service name not used anymore
Kefu Chai

09/08/2017

08:21 PM Backport #21325 (In Progress): luminous: bluestore: aio submission deadlock
Nathan Cutler
08:20 PM Backport #21325 (Resolved): luminous: bluestore: aio submission deadlock
https://github.com/ceph/ceph/pull/17601 Nathan Cutler
08:18 PM Bug #21314: Ceph OSDs crashing in BlueStore::queue_transactions() using EC
There are no log entries regarding failed heartbeat checks on the failing OSDs, only on the other OSDs witnessing the... Bob Bobington
07:37 PM Bug #21314 (Duplicate): Ceph OSDs crashing in BlueStore::queue_transactions() using EC
It is hard to tell because the lines preceding the snippet are missing, but I'm pretty sure this is a dup of #21171, ... Sage Weil
06:22 PM Bug #21314: Ceph OSDs crashing in BlueStore::queue_transactions() using EC
... Greg Farnum
03:44 PM Bug #21314 (Duplicate): Ceph OSDs crashing in BlueStore::queue_transactions() using EC
Log is attached. 3 of my 4 OSDs have crashed in a similar manner at different times. I'm running Ceph on a single nod... Bob Bobington
06:37 PM Bug #21250 (Resolved): os/bluestore/BlueFS.cc: 1255: FAILED assert(!log_file->fnode.extents.empty())
Nathan Cutler
06:36 PM Backport #21276 (Resolved): luminous: os/bluestore/BlueFS.cc: 1255: FAILED assert(!log_file->fnod...
Nathan Cutler
03:48 PM Backport #21276: luminous: os/bluestore/BlueFS.cc: 1255: FAILED assert(!log_file->fnode.extents.e...
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/17562
merged
Yuri Weinstein
05:51 PM Bug #21123 (Resolved): osd/PrimaryLogPG: sparse read won't trigger repair correctly
Nathan Cutler
05:50 PM Bug #21162 (Resolved): 'osd crush rule rename' not idempotent
Nathan Cutler
05:50 PM Bug #21207 (Resolved): bluestore: asyn cdeferred_try_submit deadlock
Nathan Cutler
05:13 PM Bug #19605 (Resolved): OSD crash: PrimaryLogPG.cc: 8396: FAILED assert(repop_queue.front() == repop)
Nathan Cutler
05:12 PM Bug #20888 (Resolved): "Health check update" log spam
Nathan Cutler
03:57 PM Backport #21133 (Resolved): luminous: osd/PrimaryLogPG: sparse read won't trigger repair correctly
Sage Weil
03:56 PM Backport #21234 (Resolved): luminous: bluestore: asyn cdeferred_try_submit deadlock
Sage Weil
03:56 PM Backport #21182 (Resolved): luminous: 'osd crush rule rename' not idempotent
Sage Weil
03:41 PM Bug #20370: leaked MOSDOp via PrimaryLogPG::_copy_some and PrimaryLogPG::do_proxy_write
/a/yuriw-2017-09-07_19:30:56-rados-wip-yuri-testing4-2017-09-07-1811-distro-basic-smithi/1607597 Sage Weil
02:42 PM Backport #21308: jewel: pre-luminous: aio_read returns erroneous data when rados_osd_op_timeout i...
Nathan, thanks for creating this ticket! Kefu Chai
08:16 AM Backport #21308 (In Progress): jewel: pre-luminous: aio_read returns erroneous data when rados_os...
Nathan Cutler
08:15 AM Backport #21308 (Resolved): jewel: pre-luminous: aio_read returns erroneous data when rados_osd_o...
https://github.com/ceph/ceph/pull/17594 Nathan Cutler
02:26 PM Backport #21242 (Resolved): luminous: OSD crash: PrimaryLogPG.cc: 8396: FAILED assert(repop_queue...
Sage Weil
02:25 PM Backport #21240 (Resolved): luminous: "Health check update" log spam
Sage Weil
02:24 PM Backport #21238 (Resolved): luminous: test_health_warnings.sh can fail
Sage Weil
12:20 PM Bug #21293 (Resolved): bluestore: spanning blob doesn't match expected ref_map
Sage Weil
12:18 PM Bug #21171 (Pending Backport): bluestore: aio submission deadlock
https://github.com/ceph/ceph/pull/17601 is the backport Sage Weil
12:05 PM Bug #21309 (Fix Under Review): mon/OSDMonitor: deleting pool while pgs are being created leads to...
https://github.com/ceph/ceph/pull/17600 Joao Eduardo Luis
11:56 AM Bug #21309 (In Progress): mon/OSDMonitor: deleting pool while pgs are being created leads to asse...
Joao Eduardo Luis
11:55 AM Bug #21309 (Resolved): mon/OSDMonitor: deleting pool while pgs are being created leads to assert(...
ceph version 13.0.0-429-gbc5fe2e (bc5fe2e9099dbb560c2153d3ac85f38b46593a77) mimic (dev)
Easily reproducible on a v...
Joao Eduardo Luis
08:14 AM Backport #21307 (Resolved): luminous: Client client.admin marked osd.2 out, after it was down for...
https://github.com/ceph/ceph/pull/17862 Nathan Cutler
08:14 AM Bug #20616 (Pending Backport): pre-luminous: aio_read returns erroneous data when rados_osd_op_ti...
Fixed in Infernalis by https://github.com/ceph/ceph/commit/64bca33ae76646879e6801c45e6d91852e488f8b
Needs backport...
Nathan Cutler
07:32 AM Bug #20616 (Fix Under Review): pre-luminous: aio_read returns erroneous data when rados_osd_op_ti...
this only happens if "rados_osd_op_timeout > 0", where the rx_buffer optimization is disabled, due to #9582. in that ... Kefu Chai
06:04 AM Bug #21303 (Resolved): rocksdb get a error: "Compaction error: Corruption: block checksum mismatch"
ceph --version
ceph version 12.1.0.5 (27f32562975c5fd3b785a124c818599c677b3f67) luminous (dev)
osd log:
2017-09-...
黄 维

09/07/2017

09:23 PM Bug #21249 (Pending Backport): Client client.admin marked osd.2 out, after it was down for 150462...
Sage Weil
08:47 PM Bug #21171: bluestore: aio submission deadlock
There was also an aio submission bug that dropped ios on the floor. It was consistently reproducible with... Sage Weil
06:42 PM Bug #20910 (In Progress): spurious MON_DOWN, apparently slow/laggy mon
Ok, this is still happening.. and it correlated with (1) bluestore and (2) bluestore fsck on mount, which spews an un... Sage Weil
01:57 AM Bug #20910: spurious MON_DOWN, apparently slow/laggy mon
*master PR for backport*: https://github.com/ceph/ceph/pull/17505 Nathan Cutler
01:29 PM Bug #21293: bluestore: spanning blob doesn't match expected ref_map
... Sage Weil
01:29 PM Bug #21293 (Fix Under Review): bluestore: spanning blob doesn't match expected ref_map
https://github.com/ceph/ceph/pull/17569 Sage Weil
01:11 PM Bug #21293 (Resolved): bluestore: spanning blob doesn't match expected ref_map
... Sage Weil
01:02 PM Backport #21283 (In Progress): luminous: spurious MON_DOWN, apparently slow/laggy mon
Abhishek Lekshmanan
07:36 AM Backport #21283 (Resolved): luminous: spurious MON_DOWN, apparently slow/laggy mon
https://github.com/ceph/ceph/pull/17564 Nathan Cutler
01:00 PM Backport #21276 (In Progress): luminous: os/bluestore/BlueFS.cc: 1255: FAILED assert(!log_file->f...
Abhishek Lekshmanan
07:35 AM Backport #21276 (Resolved): luminous: os/bluestore/BlueFS.cc: 1255: FAILED assert(!log_file->fnod...
https://github.com/ceph/ceph/pull/17562 Nathan Cutler
10:49 AM Bug #20616: pre-luminous: aio_read returns erroneous data when rados_osd_op_timeout is set but no...
I am able to reproduce this issue with the latest jewel, but not master.
reverting 126d0b30e990519b8f845f99ba893fdcd...
Kefu Chai
09:01 AM Bug #21287: 1 PG down, OSD fails with "FAILED assert(i->prior_version == last || i->is_error())"
btw down pg is 1.1735.
Starting OSD 381 crashes 65, 133 and 118. Stopping 65 enables starting the remaining OSDs, start...
Henrik Korkuc
08:14 AM Bug #21287 (Duplicate): 1 PG down, OSD fails with "FAILED assert(i->prior_version == last || i->i...
One PG went down for me during large rebalance (I added racks to OSD placement, almost all data had to be shuffled). ... Henrik Korkuc
08:16 AM Bug #21180: Bluestore throttler causes down OSD
pool used for this workload is blocked by down PG (#21287), but I'll try to replicate on same cluster with newly crea... Henrik Korkuc
05:27 AM Bug #21204 (Fix Under Review): DNS SRV default service name not used anymore
https://github.com/ceph/ceph/pull/17539 Kefu Chai
02:44 AM Bug #21258: "ceph df"'s MAX AVAIL is not correct
Josh Durgin wrote:
> What is your crushmap and device sizes? It looks like you may have different roots, hence diffe...
Chang Liu
01:31 AM Bug #21262: cephfs ec data pool, many osds marked down
Yes, the log is not about only one issue. The issues are as follows:
1. slow request, osd marked down, osd op suicide ca...
Yong Wang

09/06/2017

09:03 PM Bug #20910 (Pending Backport): spurious MON_DOWN, apparently slow/laggy mon
Sage Weil
09:02 PM Bug #20910 (Resolved): spurious MON_DOWN, apparently slow/laggy mon
the problem is that bluestore logs so freaking much at debug bluestore = 30 that the mon gets all laggy. Sage Weil
08:52 PM Bug #21250 (Pending Backport): os/bluestore/BlueFS.cc: 1255: FAILED assert(!log_file->fnode.exten...
Sage Weil
05:03 PM Bug #21249 (Fix Under Review): Client client.admin marked osd.2 out, after it was down for 150462...
https://github.com/ceph/ceph/pull/17525 John Spray
04:35 PM Bug #21262 (Need More Info): cephfs ec data pool, many osds marked down
Sage Weil
03:44 PM Bug #21262: cephfs ec data pool, many osds marked down
You're hitting a variety of issues there - some suggesting on-disk corruption, the unexpected error indicating a like... Josh Durgin
02:26 PM Bug #21262: cephfs ec data pool, many osds marked down
related error
ceph-osd.22.log:/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AV...
Yong Wang
02:16 PM Bug #21262 (Need More Info): cephfs ec data pool, many osds marked down
cephfs ec data pool, many osds marked down
slow request and get flow blocked, deal op blocked and etc.
Yong Wang
04:34 PM Bug #21180 (Need More Info): Bluestore throttler causes down OSD
Sage Weil
04:34 PM Bug #21180: Bluestore throttler causes down OSD
Can you try setting bluestore_deferred_throttle_bytes = 0 along with bluestore_throttle_bytes = 0 and see if that res... Sage Weil
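Sage's suggestion translates to a ceph.conf fragment like the following (a diagnostic sketch; the option names are taken from the comment above — verify them against your Ceph release before applying):

```ini
[osd]
# Diagnostic only: disable both BlueStore throttlers to check
# whether the throttler is what is taking the OSD down.
bluestore_throttle_bytes = 0
bluestore_deferred_throttle_bytes = 0
```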
04:32 PM Bug #21246: bluestore: hang while replaying deferred ios from journal
This looks like it might be the same as #21171, or one of the related bugs I am currently working on. As soon as I h... Sage Weil
03:18 PM Bug #21258 (Fix Under Review): "ceph df"'s MAX AVAIL is not correct
Ah I see your PR now: https://github.com/ceph/ceph/pull/17513 Josh Durgin
03:16 PM Bug #21258: "ceph df"'s MAX AVAIL is not correct
What is your crushmap and device sizes? It looks like you may have different roots, hence different space available i... Josh Durgin
03:45 AM Bug #21258 (Closed): "ceph df"'s MAX AVAIL is not correct
... Chang Liu
03:18 PM Bug #21263: when disk error happens, osd reports assertion failure without any error information
Will fix it in this PR:
https://github.com/ceph/ceph/pull/17522
Pan Liu
02:38 PM Bug #21263 (Resolved): when disk error happens, osd reports assertion failure without any error i...
I used fio+librbd to test one osd(bluestore), which built in an NVME SSD. After I plug-out this SSD, osd reports asse... Pan Liu
11:40 AM Bug #21143: bad RESETSESSION between OSDs?
@yuri, this PR is not merged. Or did I misunderstand your comment here? Kefu Chai
07:44 AM Bug #21243: incorrect erasure-code space in command ceph df
https://github.com/ceph/ceph/pull/17513 Chang Liu
05:56 AM Feature #21198: Monitors don't handle incomplete network splits
the same case:
https://marc.info/?l=ceph-devel&w=2&r=1&s=ceph-mon+leader+election+problem&q=b
zhiang li

09/05/2017

08:49 PM Bug #20041 (Resolved): ceph-osd: PGs getting stuck in scrub state, stalling RBD
Nathan Cutler
08:49 PM Backport #20780 (Resolved): jewel: ceph-osd: PGs getting stuck in scrub state, stalling RBD
Nathan Cutler
08:47 PM Bug #20464 (Resolved): cache tier osd memory high memory consumption
Nathan Cutler
08:47 PM Backport #20511 (Resolved): jewel: cache tier osd memory high memory consumption
Nathan Cutler
08:46 PM Bug #20375 (Resolved): osd: omap threadpool heartbeat is only reset every 100 values
Nathan Cutler
08:46 PM Backport #20492 (Resolved): jewel: osd: omap threadpool heartbeat is only reset every 100 values
Nathan Cutler
07:02 PM Bug #21250 (Fix Under Review): os/bluestore/BlueFS.cc: 1255: FAILED assert(!log_file->fnode.exten...
https://github.com/ceph/ceph/pull/17503 Sage Weil
06:53 PM Bug #21250: os/bluestore/BlueFS.cc: 1255: FAILED assert(!log_file->fnode.extents.empty())
looks like two concurrent threads trying to compact_log_async:... Sage Weil
06:51 PM Bug #21250 (Resolved): os/bluestore/BlueFS.cc: 1255: FAILED assert(!log_file->fnode.extents.empty())
... Sage Weil
04:49 PM Bug #21249 (Resolved): Client client.admin marked osd.2 out, after it was down for 1504627577 sec...
... Sage Weil
03:30 PM Bug #20843 (Resolved): assert(i->prior_version == last) when a MODIFY entry follows an ERROR entry
Nathan Cutler
03:30 PM Backport #20930 (Rejected): kraken: assert(i->prior_version == last) when a MODIFY entry follows ...
Kraken is EOL. Nathan Cutler
03:30 PM Backport #20722 (Rejected): kraken: rados ls on pool with no access returns no error
Kraken is EOL. Nathan Cutler
03:29 PM Backport #20493 (Rejected): kraken: osd: omap threadpool heartbeat is only reset every 100 values
Kraken is EOL. Nathan Cutler
03:21 PM Backport #21242 (In Progress): luminous: OSD crash: PrimaryLogPG.cc: 8396: FAILED assert(repop_qu...
Nathan Cutler
09:10 AM Backport #21242 (Resolved): luminous: OSD crash: PrimaryLogPG.cc: 8396: FAILED assert(repop_queue...
https://github.com/ceph/ceph/pull/17501 Nathan Cutler
03:20 PM Backport #21240 (In Progress): luminous: "Health check update" log spam
Nathan Cutler
09:09 AM Backport #21240 (Resolved): luminous: "Health check update" log spam
https://github.com/ceph/ceph/pull/17500 Nathan Cutler
03:18 PM Backport #21238 (In Progress): luminous: test_health_warnings.sh can fail
Nathan Cutler
09:09 AM Backport #21238 (Resolved): luminous: test_health_warnings.sh can fail
https://github.com/ceph/ceph/pull/17498 Nathan Cutler
03:15 PM Backport #21236 (In Progress): luminous: build_initial_pg_history doesn't update up/acting/etc
Nathan Cutler
09:09 AM Backport #21236 (Resolved): luminous: build_initial_pg_history doesn't update up/acting/etc
https://github.com/ceph/ceph/pull/17496
https://github.com/ceph/ceph/pull/17622
Nathan Cutler
03:13 PM Backport #21235 (In Progress): luminous: thrashosds read error injection doesn't take live_osds i...
Nathan Cutler
09:09 AM Backport #21235 (Resolved): luminous: thrashosds read error injection doesn't take live_osds into...
https://github.com/ceph/ceph/pull/17495 Nathan Cutler
03:12 PM Backport #21234 (In Progress): luminous: bluestore: asyn cdeferred_try_submit deadlock
Nathan Cutler
09:09 AM Backport #21234 (Resolved): luminous: bluestore: asyn cdeferred_try_submit deadlock
https://github.com/ceph/ceph/pull/17494 Nathan Cutler
01:02 PM Bug #21243: incorrect erasure-code space in command ceph df
It's not only the ISA plugin; it's a common problem.... Chang Liu
01:00 PM Bug #21243: incorrect erasure-code space in command ceph df
... Chang Liu
11:09 AM Bug #21243 (Resolved): incorrect erasure-code space in command ceph df


ceph osd erasure-code-profile set ISA plugin=isa k=2 m=2 crush-failure-domain=host crush-device-c...
Petr Malkov
12:56 PM Bug #21246 (Resolved): bluestore: hang while replaying deferred ios from journal
Running ceph-osd-11.2.0-0.el7.x86_64 from ceph-stable's CentOS repository, I hit the following problem. The cluster (... Tobias Florek
10:22 AM Bug #21180: Bluestore throttler causes down OSD
just an update - sometimes even with bluestore_throttle_bytes set to 0 I get down OSDs, but it is much more rare and ... Henrik Korkuc
09:51 AM Backport #21182 (In Progress): luminous: 'osd crush rule rename' not idempotent
Nathan Cutler
09:39 AM Backport #21133 (In Progress): luminous: osd/PrimaryLogPG: sparse read won't trigger repair corre...
Nathan Cutler
09:38 AM Backport #21132 (Resolved): luminous: qa/standalone/scrub/osd-scrub-repair.sh timeout
Nathan Cutler
09:09 AM Backport #21239 (Resolved): jewel: test_health_warnings.sh can fail
https://github.com/ceph/ceph/pull/20289 Nathan Cutler

09/04/2017

08:36 PM Bug #20785 (Resolved): osd/osd_types.cc: 3574: FAILED assert(lastmap->get_pools().count(pgid.pool...
Nathan Cutler
01:43 PM Bug #20785: osd/osd_types.cc: 3574: FAILED assert(lastmap->get_pools().count(pgid.pool()))
thanks Joao, i am commenting on https://github.com/ceph/ceph/pull/17191 so it references https://github.com/ceph/ceph... Kefu Chai
12:57 PM Bug #20785: osd/osd_types.cc: 3574: FAILED assert(lastmap->get_pools().count(pgid.pool()))
doh. I missed the needs-backport tag on the pr :( Joao Eduardo Luis
12:14 PM Bug #20785: osd/osd_types.cc: 3574: FAILED assert(lastmap->get_pools().count(pgid.pool()))
Joao, I changed status to "Pending Backport" but the PR is also has the "needs-backport" label, which is perhaps enou... Nathan Cutler
12:13 PM Bug #20785 (Pending Backport): osd/osd_types.cc: 3574: FAILED assert(lastmap->get_pools().count(p...
Nathan Cutler
11:19 AM Bug #20785: osd/osd_types.cc: 3574: FAILED assert(lastmap->get_pools().count(pgid.pool()))
I may be wrong, but it looks like the commit fixing this is only present in current master. I was under the impressio... Joao Eduardo Luis
04:15 PM Bug #21227 (New): [osd]default mkfs.xfs option may make some problem
the default mkfs.xfs for an OSD uses -i size=2048
xfs=[
# xfs insists on not overwriting previous fs; even if...
peng zhang
10:39 AM Bug #21171: bluestore: aio submission deadlock
Sage, is there an identifiable behavior when this happens? Do the osds die, or is IO simply forever blocked? Joao Eduardo Luis
09:33 AM Backport #20781 (Rejected): kraken: ceph-osd: PGs getting stuck in scrub state, stalling RBD
Kraken is EOL. Nathan Cutler
06:46 AM Bug #21207 (Pending Backport): bluestore: asyn cdeferred_try_submit deadlock
xie xingguo

09/02/2017

06:36 PM Bug #20888 (Pending Backport): "Health check update" log spam
Sage Weil
06:35 PM Bug #21206 (Pending Backport): thrashosds read error injection doesn't take live_osds into account
Sage Weil
06:34 PM Bug #21203 (Pending Backport): build_initial_pg_history doesn't update up/acting/etc
Sage Weil
04:15 AM Bug #21174: OSD crash: 903: FAILED assert(objiter->second->version > last_divergent_update)
I've got exactly the same problem with kernel client. But fuse client seems fine with ec pool on cephfs George Zhao
01:04 AM Bug #20981: ./run_seed_to_range.sh errored out
one more here http://qa-proxy.ceph.com/teuthology/yuriw-2017-09-01_23:34:11-rados-wip-yuri-testing-2017-08-31-2109-di... Yuri Weinstein

09/01/2017

09:25 PM Bug #21218 (Resolved): thrash-eio + bluestore (hangs with unfound objects or read_log_and_missing...
... Sage Weil
02:02 PM Bug #21203: build_initial_pg_history doesn't update up/acting/etc
https://github.com/ceph/ceph/pull/17423 Sage Weil
01:30 PM Backport #20512 (Rejected): kraken: cache tier osd memory high memory consumption
Kraken is EOL. Nathan Cutler
06:53 AM Bug #21211: 12.2.0,cephfs(meta replica 2, data ec 2+1),ceph-osd coredump
12.2.0
create cephfs
meta pool: model : replica 2
data pool: model : ec 2+1
ceph-osd coredump after r...
Yong Wang
06:49 AM Bug #21211 (Need More Info): 12.2.0,cephfs(meta replica 2, data ec 2+1),ceph-osd coredump
ceph version 12.2.0 (32ce2a3ae5239ee33d6150705cdb24d43bab910c) luminous (rc)
1: (()+0xa23b21) [0x7fe4a148bb21]
2...
Yong Wang

08/31/2017

09:38 PM Bug #18162: osd/ReplicatedPG.cc: recover_replicas: object added to missing set for backfill, but ...
Hi
My institute has a large cluster running Kraken 11.2.1-0 and using EC 8+3, and we believe we have run into this bug...
Alastair Dewhurst
08:45 PM Bug #21121 (Pending Backport): test_health_warnings.sh can fail
Sage Weil
08:44 PM Bug #21207 (Fix Under Review): bluestore: asyn cdeferred_try_submit deadlock
https://github.com/ceph/ceph/pull/17409 Sage Weil
08:39 PM Bug #21207 (Resolved): bluestore: asyn cdeferred_try_submit deadlock
In deferred_aio_finish we may need to requeue pending deferred via a finisher. Currently we reuse finishers[0], but ... Sage Weil
06:56 PM Bug #21206 (Fix Under Review): thrashosds read error injection doesn't take live_osds into account
https://github.com/ceph/ceph/pull/17406 Sage Weil
06:54 PM Bug #21206 (Resolved): thrashosds read error injection doesn't take live_osds into account
... Sage Weil
04:31 PM Bug #20981: ./run_seed_to_range.sh errored out
David, the first dead job to appear was http://pulpito.ceph.com/smithfarm-2017-08-21_19:38:42-rados-wip-jewel-backpor... Nathan Cutler
03:21 AM Bug #20981: ./run_seed_to_range.sh errored out
Are we sure it isn't http://tracker.ceph.com/issues/20613#note-24 ? Because the dead runs here http://pulpito.ceph.c... David Zafman
03:14 AM Bug #20981: ./run_seed_to_range.sh errored out
I looked at https://github.com/ceph/ceph/pull/15050 and don't see anything that would cause this issue. David Zafman
04:11 PM Bug #21204 (Resolved): DNS SRV default service name not used anymore
Hi,
I am in the process of upgrading from Kraken to Luminous.
I am using DNS SRV records to lookup MON servers.
...
Lionel BEARD
01:56 PM Bug #19605 (Pending Backport): OSD crash: PrimaryLogPG.cc: 8396: FAILED assert(repop_queue.front(...
Kefu Chai
01:52 PM Bug #21203 (Resolved): build_initial_pg_history doesn't update up/acting/etc
The loop doesn't update up/acting/etc values, which means the result is incorrect when there are multiple intervals s... Sage Weil
11:00 AM Bug #20933: All mon nodes down when i use ceph-disk prepare a new osd.
I think I've hit a similar issue. Occurred with 12.1.2 when I tried to add a host / osd (ceph-deploy osd prepare --dmcry... Denis Zadonskii
09:27 AM Feature #21198 (New): Monitors don't handle incomplete network splits
the network between monitors(the minimum rank and the maximum rank) disconnect, the node of the maximum rank always k... zhiang li
03:30 AM Bug #21194 (New): mon clock skew test is fragile
The original observed problem is that it failed to detect clock skew in run /a/sage-2017-08-27_02:16:57-rados-wip-sa... Sage Weil

08/30/2017

06:10 PM Bug #20981: ./run_seed_to_range.sh errored out
My money is on https://github.com/ceph/ceph/pull/15050 Nathan Cutler
06:07 PM Bug #20981: ./run_seed_to_range.sh errored out
David, the jewel failure started occurring in the integration branch that included the following PRs: http://tracker.... Nathan Cutler
04:56 PM Bug #20981: ./run_seed_to_range.sh errored out
I reverted https://github.com/ceph/ceph/pull/15947 to see if that would fix it and it did NOT. David Zafman
03:57 PM Backport #21182 (Resolved): luminous: 'osd crush rule rename' not idempotent
https://github.com/ceph/ceph/pull/17481 Nathan Cutler
03:28 PM Bug #21180 (Resolved): Bluestore throttler causes down OSD
Writing large amount of data to EC RBD pool via NBD causes down OSDs, PGs and drop in traffic due to unhealthy cluste... Henrik Korkuc
02:12 PM Bug #21174: OSD crash: 903: FAILED assert(objiter->second->version > last_divergent_update)
To clarify then: I have not tested this with a replicated cephfs data pool. Only tested with ec data pool as per my 4... Martin Millnert
01:19 PM Bug #21174: OSD crash: 903: FAILED assert(objiter->second->version > last_divergent_update)
Martin: just to confirm, you were seeing this crash while you had EC pools involved, and when you do not have any EC ... John Spray
11:25 AM Bug #21174: OSD crash: 903: FAILED assert(objiter->second->version > last_divergent_update)
... John Spray
06:12 AM Bug #21174 (Rejected): OSD crash: 903: FAILED assert(objiter->second->version > last_divergent_up...
I've setup a cephfs erasure coded pool on a small cluster consisting of 5 bluestore OSDs.
The pools were created as ...
Martin Millnert
01:33 PM Bug #20871 (Fix Under Review): core dump when bluefs's mkdir returns -EEXIST
Chang Liu
01:33 PM Bug #20871: core dump when bluefs's mkdir returns -EEXIST
https://github.com/ceph/ceph/pull/17357 Chang Liu

08/29/2017

09:43 PM Bug #21171 (Fix Under Review): bluestore: aio submission deadlock
https://github.com/ceph/ceph/pull/17352 Sage Weil
02:47 PM Bug #21171 (Resolved): bluestore: aio submission deadlock
- thread a holds deferred_submit_lock, blocks on aio submission (queue is full)
- thread b holds deferred_lock, bloc...
Sage Weil
08:58 PM Bug #21162 (Pending Backport): 'osd crush rule rename' not idempotent
Sage Weil
10:46 AM Bug #21162 (Fix Under Review): 'osd crush rule rename' not idempotent
https://github.com/ceph/ceph/pull/17329 xie xingguo
07:38 PM Documentation #20486 (Resolved): Document how to use bluestore compression
Sage Weil
04:11 PM Bug #21143: bad RESETSESSION between OSDs?
Haomai Wang wrote:
> https://github.com/ceph/ceph/pull/16009
>
> this pr gives a brief about reason. it's really ...
Yuri Weinstein
03:08 PM Bug #21092: OSD sporadically starts reading at 100% of ssd bandwidth
Seems that this is a side effect of too small a value for bluestore_cache_size.
We set it to 50M to reduce osd memory consump...
Aleksei Gutikov
07:36 AM Bug #20981: ./run_seed_to_range.sh errored out
This is occurring in the current jewel branch now too:
https://github.com/ceph/ceph/pull/17317#issuecomment-325580432
Josh Durgin
07:06 AM Backport #16239 (Resolved): 'ceph tell osd.0 flush_pg_stats' fails in rados qa run
h3. description... Nathan Cutler
03:00 AM Bug #21165 (Can't reproduce): 2 pgs stuck in unknown during thrashing
... Sage Weil

08/28/2017

10:20 PM Bug #21162 (Resolved): 'osd crush rule rename' not idempotent
... Sage Weil
06:15 PM Backport #21150: jewel: tests: btrfs copy_clone returns errno 95 (Operation not supported)
Is this causing job failures? I'm having trouble finding anything indicating this would be fatal without an actual I... David Galloway
08:06 AM Backport #21150 (Resolved): jewel: tests: btrfs copy_clone returns errno 95 (Operation not suppor...
https://github.com/ceph/ceph/pull/18165 Kefu Chai
01:55 AM Bug #21016 (Resolved): CRUSH crash on bad memory handling
xie xingguo
01:54 AM Backport #21106 (Resolved): luminous: CRUSH crash on bad memory handling
https://github.com/ceph/ceph/pull/17214 xie xingguo

08/27/2017

05:59 PM Bug #21147 (Resolved): Manager daemon x is unresponsive. No standby daemons available
/a/sage-2017-08-26_20:38:41-rados-luminous-distro-basic-smithi/1567938
The last time I looked this appeared to be ...
Sage Weil
04:04 PM Bug #20924: osd: leaked Session on osd.7
/a/sage-2017-08-26_20:38:41-rados-luminous-distro-basic-smithi/1568055 Sage Weil
04:30 AM Bug #21143: bad RESETSESSION between OSDs?
https://github.com/ceph/ceph/pull/16009
This PR gives a brief explanation of the reason. It's really rare, so I don't do it imm...
Haomai Wang
02:16 AM Backport #21076 (Resolved): luminous: osd/osd_types.cc: 3574: FAILED assert(lastmap->get_pools()....
Kefu Chai
02:15 AM Backport #21095 (Resolved): osd: leak from osd/PGBackend.cc:136 PGBackend::handle_recovery_delete()
Kefu Chai

08/26/2017

06:14 PM Bug #20785 (Resolved): osd/osd_types.cc: 3574: FAILED assert(lastmap->get_pools().count(pgid.pool...
Sage Weil
06:13 PM Bug #20913 (Resolved): osd: leak from osd/PGBackend.cc:136 PGBackend::handle_recovery_delete()
Sage Weil
06:08 PM Bug #21144 (Resolved): daemon-helper: command crashed with signal 1
... Sage Weil
05:56 PM Bug #21143 (Duplicate): bad RESETSESSION between OSDs?
osd.5... Sage Weil
12:11 PM Bug #21142: OSD crashes when loading pgs with "FAILED assert(interval.last > last)"
I uploaded more logs and info files with ceph-post-file
f27fb8a5-baae-4f04-8353-d3b2b314c61a
Ali chips
11:56 AM Bug #21142 (Won't Fix): OSD crashes when loading pgs with "FAILED assert(interval.last > last)"
after upgrading to luminous 12.1.4 rc
we saw several osds crashing with below logs.
the cluster was unhealthy when ...
Ali chips
01:06 AM Bug #20981: ./run_seed_to_range.sh errored out
Stack trace from core dump doesn't include a stack with _inject_failure() in it.
For core dump in /a/kchai-2017-08...
David Zafman

08/25/2017

08:02 PM Backport #21133 (Resolved): luminous: osd/PrimaryLogPG: sparse read won't trigger repair correctly
https://github.com/ceph/ceph/pull/17475 Nathan Cutler
08:02 PM Backport #21132 (Resolved): luminous: qa/standalone/scrub/osd-scrub-repair.sh timeout
https://github.com/ceph/ceph/pull/17264 Nathan Cutler
07:46 PM Bug #21127: qa/standalone/scrub/osd-scrub-repair.sh timeout
https://github.com/ceph/ceph/pull/17264 Sage Weil
07:44 PM Bug #21127 (Pending Backport): qa/standalone/scrub/osd-scrub-repair.sh timeout
Sage Weil
03:01 PM Bug #21127: qa/standalone/scrub/osd-scrub-repair.sh timeout
We need to backport fe81b7e3a5034ce855303f93f3e413f3f2dc74a8 and this change together to luminous. David Zafman
02:59 PM Bug #21127: qa/standalone/scrub/osd-scrub-repair.sh timeout
Caused by:
commit fe81b7e3a5034ce855303f93f3e413f3f2dc74a8
Author: huanwen ren <ren.huanwen@zte.com.cn>
Date: ...
David Zafman
01:46 PM Bug #21127 (Fix Under Review): qa/standalone/scrub/osd-scrub-repair.sh timeout
https://github.com/ceph/ceph/pull/17258 Sage Weil
01:44 PM Bug #21127 (Resolved): qa/standalone/scrub/osd-scrub-repair.sh timeout
... Sage Weil
03:44 PM Bug #21130 (Can't reproduce): "FAILED assert(bh->last_write_tid > tid)" in powercycle-master-test...
Run: http://pulpito.ceph.com/yuriw-2017-08-24_22:38:48-powercycle-master-testing-basic-smithi/
Job: 1560682
Logs: h...
Yuri Weinstein
03:34 PM Backport #20781 (Fix Under Review): kraken: ceph-osd: PGs getting stuck in scrub state, stalling RBD
David Zafman
03:33 PM Backport #20781: kraken: ceph-osd: PGs getting stuck in scrub state, stalling RBD
https://github.com/ceph/ceph/pull/17261 David Zafman
03:22 PM Backport #20780 (Fix Under Review): jewel: ceph-osd: PGs getting stuck in scrub state, stalling RBD
David Zafman
03:09 PM Bug #21123 (Pending Backport): osd/PrimaryLogPG: sparse read won't trigger repair correctly
Sage Weil
03:08 PM Bug #21129 (New): 'ceph -s' hang
... Sage Weil
12:11 PM Backport #21076 (In Progress): luminous: osd/osd_types.cc: 3574: FAILED assert(lastmap->get_pools...
https://github.com/ceph/ceph/pull/17257 Kefu Chai
10:28 AM Bug #21092: OSD sporadically starts reading at 100% of ssd bandwidth
Another stack trace that leads to pread same size and same offset:... Aleksei Gutikov
09:20 AM Bug #21092: OSD sporadically starts reading at 100% of ssd bandwidth
Stacktrace of thread performing reads of 2445312 bytes from offset 96117329920 ... Aleksei Gutikov
10:19 AM Bug #20188 (New): filestore: os/filestore/FileStore.h: 357: FAILED assert(q.empty()) from ceph_te...
/a//kchai-2017-08-25_08:38:31-rados-wip-kefu-testing-distro-basic-smithi/1561884... Kefu Chai
06:35 AM Bug #20785 (Fix Under Review): osd/osd_types.cc: 3574: FAILED assert(lastmap->get_pools().count(p...
/a//joshd-2017-08-25_00:03:46-rados-wip-dup-perf-distro-basic-smithi/1560728/ mon.c Kefu Chai
02:40 AM Backport #21095: osd: leak from osd/PGBackend.cc:136 PGBackend::handle_recovery_delete()
should backport https://github.com/ceph/ceph/pull/17246 also. Kefu Chai
02:38 AM Bug #20913: osd: leak from osd/PGBackend.cc:136 PGBackend::handle_recovery_delete()
https://github.com/ceph/ceph/pull/17246 Kefu Chai
02:09 AM Bug #20876: BADAUTHORIZER on mgr, hung ceph tell mon.*
/a/sage-2017-08-24_17:38:40-rados-wip-sage-testing2-luminous-20170824a-distro-basic-smithi/1560473 Sage Weil

08/24/2017

11:57 PM Bug #21123 (Resolved): osd/PrimaryLogPG: sparse read won't trigger repair correctly
master PR: https://github.com/ceph/ceph/pull/17221 xie xingguo
09:59 PM Bug #21121 (Fix Under Review): test_health_warnings.sh can fail
https://github.com/ceph/ceph/pull/17244 Sage Weil
09:55 PM Bug #21121: test_health_warnings.sh can fail
I believe the fix is to subscribe to osdmaps when in the waiting for healthy state. if we are unhealthy because we a... Sage Weil
09:54 PM Bug #21121 (Resolved): test_health_warnings.sh can fail
- test_mark_all_but_last_osds_down marks all but one osd down
- clears noup
- osd.1 fails the is_healthy check beca...
Sage Weil
07:25 PM Bug #20770: test_pidfile.sh test is failing 2 places
This problem still hasn't been solved. The test is disabled, so moving back to verified. David Zafman
07:23 PM Bug #20770 (Resolved): test_pidfile.sh test is failing 2 places
luminous backport rejected because the test continued to fail Nathan Cutler
07:22 PM Bug #20975 (Resolved): test_pidfile.sh is flaky
luminous backport: https://github.com/ceph/ceph/pull/17241 Nathan Cutler
05:50 PM Feature #18206: osd: osd_scrub_during_recovery only considers primary, not replicas
Nathan Cutler wrote:
> @Vikhyat, I think Abhi just created the luminous backport tracker manually. The jewel one wil...
Vikhyat Umrao
05:22 PM Feature #18206: osd: osd_scrub_during_recovery only considers primary, not replicas
@Vikhyat, I think Abhi just created the luminous backport tracker manually. The jewel one will be created automagical... Nathan Cutler
04:36 PM Feature #18206: osd: osd_scrub_during_recovery only considers primary, not replicas
Thanks Nathan. I think there was some issue and it did not create a tracker for the jewel backport, so I removed luminous so it can ... Vikhyat Umrao
03:56 PM Feature #18206: osd: osd_scrub_during_recovery only considers primary, not replicas
Verified that both commits from https://github.com/ceph/ceph/pull/17039 were cherry-picked to luminous. Nathan Cutler
05:25 PM Backport #21117 (Resolved): jewel: osd: osd_scrub_during_recovery only considers primary, not rep...
https://github.com/ceph/ceph/pull/17815 Nathan Cutler
05:23 PM Bug #21092: OSD sporadically starts reading at 100% of ssd bandwidth
59.log more obviously shows the issue with repeating part:... Aleksei Gutikov
10:10 AM Bug #21092 (New): OSD sporadically starts reading at 100% of ssd bandwidth
luminous v12.1.4
bluestore
Periodically (10 mins) some osd starts reading ssd disk at maximum available speed (45...
Aleksei Gutikov
05:22 PM Backport #21106 (Resolved): luminous: CRUSH crash on bad memory handling
Nathan Cutler
03:54 PM Bug #21096 (New): osd-scrub-repair.sh:381: unfound_erasure_coded: return 1
... Kefu Chai
03:31 PM Backport #21095 (In Progress): osd: leak from osd/PGBackend.cc:136 PGBackend::handle_recovery_del...
https://github.com/ceph/ceph/pull/17233 Kefu Chai
03:30 PM Backport #21095 (Resolved): osd: leak from osd/PGBackend.cc:136 PGBackend::handle_recovery_delete()
... Kefu Chai
03:14 PM Bug #20913 (Pending Backport): osd: leak from osd/PGBackend.cc:136 PGBackend::handle_recovery_del...
Kefu Chai
08:12 AM Bug #19605 (Fix Under Review): OSD crash: PrimaryLogPG.cc: 8396: FAILED assert(repop_queue.front(...
https://github.com/ceph/ceph/pull/17217 Kefu Chai
06:58 AM Bug #19605: OSD crash: PrimaryLogPG.cc: 8396: FAILED assert(repop_queue.front() == repop)
although all ops in repop_queue are canceled upon pg reset (change), and pg discards messages from down OSDs accordin... Kefu Chai
03:46 AM Bug #20785 (Resolved): osd/osd_types.cc: 3574: FAILED assert(lastmap->get_pools().count(pgid.pool...
Kefu Chai
03:46 AM Backport #21090 (Resolved): osd/osd_types.cc: 3574: FAILED assert(lastmap->get_pools().count(pgid...
https://github.com/ceph/ceph/pull/17191 Kefu Chai
03:44 AM Feature #20956 (Resolved): Include front/back interface names in OSD metadata
Kefu Chai
03:36 AM Bug #20970 (Resolved): bug in function reweight_by_utilization
Kefu Chai
03:13 AM Feature #21073: mgr: ceph/rgw: show hostnames and ports in ceph -s status output
... Chang Liu
03:04 AM Backport #21076 (Resolved): luminous: osd/osd_types.cc: 3574: FAILED assert(lastmap->get_pools()....
Sage Weil
03:03 AM Backport #21048 (Resolved): luminous: Include front/back interface names in OSD metadata
Sage Weil
03:02 AM Backport #21077 (Resolved): luminous: osd: osd_scrub_during_recovery only considers primary, not ...
Sage Weil
03:02 AM Backport #21079 (Resolved): bug in function reweight_by_utilization
Sage Weil
12:30 AM Bug #21016 (Pending Backport): CRUSH crash on bad memory handling
xie xingguo

08/23/2017

11:02 PM Bug #20730: need new OSD_SKEWED_USAGE implementation
I've created 2 pull requests for Jewel and Kraken to disable this now.
Jewel: https://github.com/ceph/ceph/pull/172...
David Zafman
08:52 PM Bug #14115: crypto: race in nss init
Still seeing this in Jewel 10.2.7, Ubuntu 16.04.2 running an application using ceph under Apache:... Wyllys Ingersoll
06:33 PM Bug #21016: CRUSH crash on bad memory handling
Sage Weil
05:27 PM Bug #18209 (Resolved): src/common/LogClient.cc: 310: FAILED assert(num_unsent <= log_queue.size())
Nathan Cutler
05:00 PM Backport #20965 (Resolved): luminous: src/common/LogClient.cc: 310: FAILED assert(num_unsent <= l...
Sage Weil
01:46 PM Backport #20965 (In Progress): luminous: src/common/LogClient.cc: 310: FAILED assert(num_unsent <...
Abhishek Lekshmanan
04:09 PM Feature #21084 (Resolved): auth: add osd auth caps based on pool metadata
Add pool-metadata based auth caps. The initial use case is CephFS; if pools are tagged based on filesystem, then auth... Douglas Fuller
01:48 PM Backport #21079 (In Progress): bug in function reweight_by_utilization
Abhishek Lekshmanan
01:47 PM Backport #21079 (Resolved): bug in function reweight_by_utilization
https://github.com/ceph/ceph/pull/17198 Abhishek Lekshmanan
01:37 PM Backport #21051 (In Progress): luminous: Improve size scrub error handling and ignore system attr...
Abhishek Lekshmanan
01:30 PM Backport #21077 (In Progress): luminous: osd: osd_scrub_during_recovery only considers primary, n...
Abhishek Lekshmanan
01:27 PM Backport #21077 (Resolved): luminous: osd: osd_scrub_during_recovery only considers primary, not ...
https://github.com/ceph/ceph/pull/17195 Abhishek Lekshmanan
01:26 PM Backport #21048 (In Progress): luminous: Include front/back interface names in OSD metadata
Abhishek Lekshmanan
01:02 PM Backport #21076 (In Progress): luminous: osd/osd_types.cc: 3574: FAILED assert(lastmap->get_pools...
https://github.com/ceph/ceph/pull/17191 Kefu Chai
12:59 PM Backport #21076 (Resolved): luminous: osd/osd_types.cc: 3574: FAILED assert(lastmap->get_pools()....
https://github.com/ceph/ceph/pull/17191 Kefu Chai
10:20 AM Bug #16553: Removing Writeback Cache Tier Does not clean up Incomplete_Clones
It looks like I hit the same issue on 10.2.9. Henrik Korkuc
08:37 AM Bug #20913 (Fix Under Review): osd: leak from osd/PGBackend.cc:136 PGBackend::handle_recovery_del...
https://github.com/ceph/ceph/pull/17183 Kefu Chai
08:23 AM Feature #21073 (Resolved): mgr: ceph/rgw: show hostnames and ports in ceph -s status output
Similar to the way we do mds and mgr statuses, we could display the rgw endpoints in ceph status as well, the informa... Abhishek Lekshmanan
05:26 AM Bug #20909: Error ETIMEDOUT: crush test failed with -110: timed out during smoke test (5 seconds)
see also https://github.com/ceph/ceph/pull/17179 Kefu Chai

08/22/2017

11:30 PM Bug #20909 (Fix Under Review): Error ETIMEDOUT: crush test failed with -110: timed out during smo...
https://github.com/ceph/ceph/pull/17169 Neha Ojha
04:43 PM Bug #20770: test_pidfile.sh test is failing 2 places
Another change is needed too. I've requested that in the pull request.
https://github.com/ceph/ceph/pull/17052 sh...
David Zafman
04:21 PM Bug #20770: test_pidfile.sh test is failing 2 places
David Zafman wrote:
> To backport all the test-pidfile.sh cherry-pick 4 pull requests using the sha1s in this order:...
Nathan Cutler
04:26 PM Bug #20981: ./run_seed_to_range.sh errored out
See also here =>
http://qa-proxy.ceph.com/teuthology/yuriw-2017-08-22_14:54:54-rados-wip-yuri-testing_2017_8_22-di...
Yuri Weinstein
03:18 PM Bug #20981: ./run_seed_to_range.sh errored out
David, can you take a look? This seems to be showing up pretty consistently in rados runs. Josh Durgin
04:23 PM Bug #20975 (Duplicate): test_pidfile.sh is flaky
Nathan Cutler
03:39 PM Bug #20785 (Pending Backport): osd/osd_types.cc: 3574: FAILED assert(lastmap->get_pools().count(p...
Kefu Chai
02:50 PM Feature #18206 (Pending Backport): osd: osd_scrub_during_recovery only considers primary, not rep...
Kefu Chai
01:18 PM Bug #19605: OSD crash: PrimaryLogPG.cc: 8396: FAILED assert(repop_queue.front() == repop)
... Kefu Chai
01:17 PM Bug #21016 (Fix Under Review): CRUSH crash on bad memory handling
Kefu Chai
06:09 AM Bug #20970 (Pending Backport): bug in function reweight_by_utilization
xie xingguo

08/21/2017

11:53 PM Bug #15741: librados get_last_version() doesn't return correct result after aio completion
This bug still exists. David Zafman
10:34 PM Bug #19487 (Closed): "GLOBAL %RAW USED" of "ceph df" is not consistent with check_full_status
Reopen this if the issue hasn't been fixed in the latest code with the understanding that each OSD has its own fullness d... David Zafman
04:14 PM Backport #21051 (Resolved): luminous: Improve size scrub error handling and ignore system attrs i...
https://github.com/ceph/ceph/pull/17196 Nathan Cutler
04:13 PM Backport #21048 (Resolved): luminous: Include front/back interface names in OSD metadata
https://github.com/ceph/ceph/pull/17193 Nathan Cutler
03:54 PM Bug #19605: OSD crash: PrimaryLogPG.cc: 8396: FAILED assert(repop_queue.front() == repop)
# osd.1 sent failure report of osd.0
# osd.1 sent repop 5386 to osd.0
# mon.a marked osd.0 down in osdmap.27
# osd...
Kefu Chai
03:52 PM Bug #17138 (Resolved): crush: inconsistent ruleset/ruled_id are difficult to figure out
Josh Durgin
07:44 AM Bug #20981: ./run_seed_to_range.sh errored out
/a//kchai-2017-08-21_01:51:35-rados-master-distro-basic-smithi/1545907/teuthology.log has debug heartbeatmap = 20.
<...
Kefu Chai
03:57 AM Bug #20896: export_diff relies on clone_overlap, which is lost when cache tier is enabled
Hi, everyone.
I've found that the reason that clone overlap modifications should pass "is_present_clone" condition...
Xuehan Xu
02:04 AM Bug #20909: Error ETIMEDOUT: crush test failed with -110: timed out during smoke test (5 seconds)
/a//kchai-2017-08-20_09:42:12-rados-wip-kefu-testing-distro-basic-mira/1545387/ Kefu Chai

08/18/2017

11:20 PM Bug #20770: test_pidfile.sh test is failing 2 places

To backport all the test-pidfile.sh cherry-pick 4 pull requests using the sha1s in this order:
https://github.co...
David Zafman
11:08 PM Feature #18206: osd: osd_scrub_during_recovery only considers primary, not replicas
https://github.com/ceph/ceph/pull/17039 David Zafman
09:34 AM Bug #20981: ./run_seed_to_range.sh errored out
/a/kchai-2017-08-18_03:03:28-rados-master-distro-basic-mira/1537335... Kefu Chai
03:12 AM Bug #20243 (Pending Backport): Improve size scrub error handling and ignore system attrs in xattr...
https://github.com/ceph/ceph/pull/16407 David Zafman

08/17/2017

09:47 PM Bug #20332 (Won't Fix): rados bench seq option doesn't work
David Zafman
06:01 PM Feature #18206 (Fix Under Review): osd: osd_scrub_during_recovery only considers primary, not rep...
David Zafman
02:55 PM Bug #20970 (Fix Under Review): bug in function reweight_by_utilization
Kefu Chai
11:19 AM Bug #20970: bug in function reweight_by_utilization
https://github.com/ceph/ceph/pull/17064 xie xingguo
12:14 PM Bug #19605: OSD crash: PrimaryLogPG.cc: 8396: FAILED assert(repop_queue.front() == repop)
excerpt of osd.0.log... Kefu Chai
11:31 AM Bug #20785 (Fix Under Review): osd/osd_types.cc: 3574: FAILED assert(lastmap->get_pools().count(p...
https://github.com/ceph/ceph/pull/17065 Kefu Chai
07:52 AM Bug #21016: CRUSH crash on bad memory handling
I believe this should be fixed by https://github.com/ceph/ceph/pull/17014/commits/6252068ec08c66513e5394188b786978236... xie xingguo

08/16/2017

10:34 PM Bug #21016: CRUSH crash on bad memory handling
...and this was also responsible for at least a couple failures that got detected as such. Greg Farnum
10:15 PM Bug #21016 (Resolved): CRUSH crash on bad memory handling
... Greg Farnum
12:04 PM Feature #18206: osd: osd_scrub_during_recovery only considers primary, not replicas
david, i just read your inquiry over IRC. what would you want me to review for this ticket? do we have a PR for it al... Kefu Chai
01:48 AM Bug #21005 (New): mon: mon_osd_down_out interval can prompt osdmap creation when nothing is happe...
I saw a cluster where we had the whole gamut of no* flags set in an attempt to stop it creating maps.
Unfortunatel...
Greg Farnum

08/15/2017

03:40 PM Bug #20416: "FAILED assert(osdmap->test_flag((1<<15)))" (sortbitwise) on upgraded cluster
Hello,
sorry for the delay
Yes, it appears under flags....
Hey Pas
01:22 AM Bug #20770 (Pending Backport): test_pidfile.sh test is failing 2 places
David Zafman

08/14/2017

10:14 PM Feature #18206 (In Progress): osd: osd_scrub_during_recovery only considers primary, not replicas
David Zafman
09:00 PM Bug #20999 (New): rados python library does not document omap API
The omap API can be fairly important for RADOS applications but it is not documented in the expected location http://... Ben England
08:32 PM Backport #17445: jewel: list-snap cache tier missing promotion logic (was: rbd cli segfault when ...
Note: bug is not present in master, as demonstrated by https://github.com/ceph/ceph/pull/17017 Nathan Cutler
08:31 PM Backport #17445 (In Progress): jewel: list-snap cache tier missing promotion logic (was: rbd cli ...
h3. description
In our ceph cluster some rbd images (create by openstack) make rbd segfault. This is on a ubuntu 1...
Nathan Cutler
10:48 AM Backport #17445: jewel: list-snap cache tier missing promotion logic (was: rbd cli segfault when ...
The pull request https://github.com/ceph/ceph/pull/17017 Xuehan Xu
10:46 AM Backport #17445: jewel: list-snap cache tier missing promotion logic (was: rbd cli segfault when ...
Hi, everyone.
I've just added a new list-snaps test, #17017, which can test whether this problem exists in master br...
Xuehan Xu
07:40 PM Bug #20770 (Fix Under Review): test_pidfile.sh test is failing 2 places
David Zafman
01:55 PM Bug #20985 (Resolved): PG which marks divergent_priors causes crash on startup
Several other confirmations and a healthy test run later, all merged! Greg Farnum

08/13/2017

07:20 PM Feature #14527: Lookup monitors through DNS
The recent code doesn't support IPv6, apparently. Maybe we can choose among ns_t_a and ns_t_aaaa according to conf->m... WANG Guoqin
07:01 PM Bug #20939 (Resolved): crush weight-set + rm-device-class segv
Sage Weil
06:59 PM Bug #20876: BADAUTHORIZER on mgr, hung ceph tell mon.*
/a/sage-2017-08-12_21:09:40-rados-wip-sage-testing-20170812a-distro-basic-smithi/1518429... Sage Weil
09:17 AM Bug #20985: PG which marks divergent_priors causes crash on startup
Stephan Hohn wrote:
> I can confirm that this build worked on my test cluster. It's back to HEALTH_OK and all OSDs a...
Stephan Hohn
09:17 AM Bug #20985: PG which marks divergent_priors causes crash on startup
I can conform that this build worked on my test cluster. It's back to HEALTH_OK and all OSDs are up. Stephan Hohn