Bug #21588
upgrade from jewel to luminous on Ubuntu needs firewall restart
Status: Closed
Description
Hello,
I upgraded from jewel to luminous following the official documentation. This is the status of the cluster:
root@testing-controller-03:~# ceph -s
  cluster:
    id:     07cdafaf-6c21-49ba-b689-169827ba9440
    health: HEALTH_WARN
            Reduced data availability: 1032 pgs inactive
            Degraded data redundancy: 1032 pgs unclean

  services:
    mon: 3 daemons, quorum testing-controller-03,node-86,testing-controller-02
    mgr: testing-controller-03(active), standbys: node-86
    osd: 6 osds: 6 up, 6 in
    rgw: 1 daemon active

  data:
    pools:   16 pools, 1032 pgs
    objects: 0 objects, 0 bytes
    usage:   0 kB used, 0 kB / 0 kB avail
    pgs:     100.000% pgs unknown
             1032 unknown
All PGs were in an unknown state. I did not know what had happened, then I found a user who had exactly the same issue:
https://stackoverflow.com/questions/46079301/data-100-unknown-after-ceph-update
I restarted netfilter-persistent (running on Ubuntu Xenial) and, magically, without any rule change, the cluster came back online.
Could this be related to the package upgrade in any way?
Many thanks.
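
For reference, the symptom matches firewall rules blocking the Ceph daemon ports rather than a packaging bug: ceph-mgr, new in luminous, has to reach the OSDs to collect PG stats. A minimal sketch of opening the default ports (mon on 6789/tcp, osd and mgr in the 6800-7300 range) with iptables; the rule placement here is an assumption, adapt it to your existing chains:

iptables -A INPUT -p tcp --dport 6789 -j ACCEPT        # ceph-mon
iptables -A INPUT -p tcp --dport 6800:7300 -j ACCEPT   # ceph-osd / ceph-mgr
netfilter-persistent save                              # persist the rules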
Updated by Luca Cervigni over 6 years ago
Restarting the firewall does not actually solve the issue; after 10 minutes or so, the cluster goes back to 100% unknown PGs.
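
A quick way to check whether the firewall is the culprit again: since ceph-mgr is the daemon that now reports PG state, the PGs read as unknown when its traffic is dropped, even though the OSDs stay up and in. A diagnostic sketch, assuming iptables and the stock tooling on xenial:

ss -tlnp | grep ceph      # which ports the mon/mgr/osd daemons listen on
iptables -L INPUT -nv     # do the rules accept traffic to those ports?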
Updated by Luca Cervigni over 6 years ago
Restarting the firewall again seems to keep it up?
  cluster:
    id:     07cdafaf-6c21-49ba-b689-169827ba9440
    health: HEALTH_WARN
            application not enabled on 1 pool(s)
            too many PGs per OSD (516 > max 300)

  services:
    mon: 3 daemons, quorum testing-controller-03,node-86,testing-controller-02
    mgr: node-86(active)
    osd: 6 osds: 6 up, 6 in
    rgw: 1 daemon active

  data:
    pools:   16 pools, 1032 pgs
    objects: 211 objects, 22104 kB
    usage:   12722 MB used, 919 GB / 932 GB avail
    pgs:     1032 active+clean
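
The two remaining warnings are unrelated to the firewall. The first clears once the pool is tagged with the application that uses it; a sketch, with the pool name and application as placeholders:

ceph osd pool application enable <pool-name> rgw   # or rbd / cephfs

The second means 1032 PGs is oversized for 6 OSDs; PG counts cannot be reduced on luminous, so the practical options are adding OSDs or raising the warning threshold in ceph.conf.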