Bug #15936
OSDs on cache pool crash after upgrade from Hammer to Jewel
Description
After upgrading my cluster from 0.94.7 to 10.2.1, all OSDs (SSD) backing the cache tier started to crash constantly as soon as the pool with the cache tier was accessed.
Crashes occurred at random intervals, anywhere from one minute to an hour at most.
Cache pool info:
pool 8 'ssdcache' replicated size 3 min_size 1 crush_ruleset 1 object_hash rjenkins pg_num 256 pgp_num 256 last_change 11570 flags hashpspool,incomplete_clones tier_of 6 cache_mode writeback target_bytes 107374182400 hit_set bloom{false_positive_probability: 0.05, target_size: 0, seed: 0} 3600s x1 decay_rate 0 search_last_n 1 min_read_recency_for_promote 1 stripe_width 0
Backing pool info:
pool 6 'hdd10k' replicated size 3 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 512 pgp_num 512 last_change 11561 lfor 11561 flags hashpspool tiers 8 read_tier 8 write_tier 8 min_write_recency_for_promote 1 stripe_width 0
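For context, a rough reconstruction of how a cache tier with these settings would be put in place (pool names and values are taken from the dumps above; the exact commands originally run on this cluster are an assumption):

# create the SSD cache pool and attach it as a writeback tier over hdd10k
ceph osd pool create ssdcache 256 256 replicated
ceph osd tier add hdd10k ssdcache
ceph osd tier cache-mode ssdcache writeback
ceph osd tier set-overlay hdd10k ssdcache

# hit-set and sizing parameters matching the dump (bloom, 3600s x1, 100 GB target)
ceph osd pool set ssdcache hit_set_type bloom
ceph osd pool set ssdcache hit_set_period 3600
ceph osd pool set ssdcache hit_set_count 1
ceph osd pool set ssdcache target_max_bytes 107374182400
ceph osd pool set ssdcache min_read_recency_for_promote 1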
---
Also, after setting ceph osd set sortbitwise post-upgrade, the whole cluster became unusable - no QEMU clients were able to read from or write to their disks. Reverted quickly.
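Presumably the flag was toggled and then reverted with the standard commands (the unset is an assumption about how the revert was done):

ceph osd set sortbitwise
ceph osd unset sortbitwise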
History
#1 Updated by elder one almost 8 years ago
- File ceph-osd.44.zip added
#2 Updated by elder one almost 8 years ago
Ubuntu 14.04, kernel 3.18.33
#3 Updated by Samuel Just over 7 years ago
- Assignee set to Joao Eduardo Luis
#4 Updated by Greg Farnum almost 7 years ago
Ping Joao? This looks to have been a crash in persisting/trimming HitSets, which I know underwent a bunch of changes/fixes around time notation and stuff, so I suspect it's done now...
#5 Updated by Greg Farnum almost 7 years ago
- Project changed from Ceph to RADOS
- Category set to Tiering
- Component(RADOS) OSD added
#6 Updated by Sage Weil over 6 years ago
- Status changed from New to Can't reproduce
Not enough info here to go on.