Ceph : Issues — https://tracker.ceph.com/ — 2020-02-15T17:18:34Z
rbd - Bug #44159 (Resolved): [rbd-mirror] Mirror daemon never recovers from being blacklisted — https://tracker.ceph.com/issues/44159 — 2020-02-15T17:18:34Z — Oliver Freyermuth
<p>I can reproduce this rather reliably by:<br />- Restarting many OSDs (old nodes, slow spinning disks, likely exceeding default blacklist timeout). <br />- Sometimes, it also happens when restarting other RBD mirror daemons (we have 3).</p>
<p>The attached log is extracted from one blacklisted RBD mirror daemon that was unable to recover, at log level 15. <br />RBD volume names and domains are sanitized; otherwise the log is untouched.</p>

Dashboard - Bug #44118 (New): Non-ASCII characters in ObjectGateway users' display_name break das... — https://tracker.ceph.com/issues/44118 — 2020-02-13T12:29:11Z — Oliver Freyermuth
<p>Creating an Object Gateway user via the Dashboard with purely ASCII characters, but a non-ASCII character in the "Full Name" field, correctly creates the user,<br />and "radosgw-admin user info" shows the correct display_name with correct encoding. Afterwards, however, the "User" page in the Dashboard fails to load. <br />The mgr logs contain:<br /><pre>
2020-02-13 13:25:14.592 7ff360fe9700 0 mgr[dashboard] ['{"status": "500 Internal Server Error", "version": "3.2.2", "detail": "The server encountered an unexpected condition which prevented it from fulfilling the request.", "traceback": "Traceback (most recent call last):\\n File \\"/usr/lib/python2.7/site-packages/cherrypy/_cprequest.py\\", line 656, in respond\\n response.body = self.handler()\\n File \\"/usr/lib/python2.7/site-packages/cherrypy/lib/encoding.py\\", line 188, in __call__\\n self.body = self.oldhandler(*args, **kwargs)\\n File \\"/usr/lib/python2.7/site-packages/cherrypy/_cptools.py\\", line 221, in wrap\\n return self.newhandler(innerfunc, *args, **kwargs)\\n File \\"/usr/share/ceph/mgr/dashboard/services/exception.py\\", line 88, in dashboard_exception_handler\\n return handler(*args, **kwargs)\\n File \\"/usr/lib/python2.7/site-packages/cherrypy/_cpdispatch.py\\", line 34, in __call__\\n return self.callable(*self.args, **self.kwargs)\\n File \\"/usr/share/ceph/mgr/dashboard/controllers/__init__.py\\", line 661, in inner\\n ret = func(*args, **kwargs)\\n File \\"/usr/share/ceph/mgr/dashboard/controllers/__init__.py\\", line 854, in wrapper\\n return func(*vpath, **params)\\n File \\"/usr/share/ceph/mgr/dashboard/controllers/rgw.py\\", line 241, in create\\n result = self.proxy(\'PUT\', \'user\', params)\\n File \\"/usr/share/ceph/mgr/dashboard/controllers/rgw.py\\", line 98, in proxy\\n result = instance.proxy(method, path, params, None)\\n File \\"/usr/share/ceph/mgr/dashboard/services/rgw_client.py\\", line 394, in proxy\\n return self._proxy_request(self.admin_path, path, method, params, data)\\n File \\"/usr/share/ceph/mgr/dashboard/rest_client.py\\", line 507, in func_wrapper\\n **kwargs)\\n File \\"/usr/share/ceph/mgr/dashboard/services/rgw_client.py\\", line 389, in _proxy_request\\n method=method, params=params, data=data, raw_content=True)\\n File \\"/usr/share/ceph/mgr/dashboard/rest_client.py\\", line 313, in __call__\\n data, 
raw_content)\\n File \\"/usr/share/ceph/mgr/dashboard/rest_client.py\\", line 406, in do_request\\n resp.status_code, resp.text)\\n File \\"/usr/lib64/python2.7/logging/__init__.py\\", line 1137, in debug\\n self._log(DEBUG, msg, args, **kwargs)\\n File \\"/usr/lib64/python2.7/logging/__init__.py\\", line 1268, in _log\\n self.handle(record)\\n File \\"/usr/lib64/python2.7/logging/__init__.py\\", line 1278, in handle\\n self.callHandlers(record)\\n File \\"/usr/lib64/python2.7/logging/__init__.py\\", line 1318, in callHandlers\\n hdlr.handle(record)\\n File \\"/usr/lib64/python2.7/logging/__init__.py\\", line 749, in handle\\n self.emit(record)\\n File \\"/usr/share/ceph/mgr/mgr_module.py\\", line 65, in emit\\n self._module._ceph_log(ceph_level, self.format(record))\\nUnicodeEncodeError: \'ascii\' codec can\'t encode characters in position 109-110: ordinal not in range(128)\\n"}']
</pre><br />Deleting the user manually resolves the issue.</p>
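<p>The failure at the end of the traceback can be reproduced in isolation. A minimal sketch (the display name and the helper function are made up for illustration; the encode call mirrors what the Python 2 mgr logging path effectively does when it hands a non-ASCII message to the ceph log binding):</p>

```python
# -*- coding: utf-8 -*-
# Minimal reproduction of the UnicodeEncodeError at the end of the
# traceback above: a non-ASCII display_name is pushed through the
# ASCII codec by the log handler. Names here are illustrative.

display_name = u"J\u00fcrgen M\u00fcller"  # non-ASCII "Full Name"

def emit_ascii(msg):
    """Stand-in for the mgr log handler's implicit ASCII encoding."""
    return msg.encode("ascii")  # raises UnicodeEncodeError for non-ASCII

try:
    emit_ascii(display_name)
    failed = False
except UnicodeEncodeError:
    failed = True  # the exception that turns the request into a 500
```

<p>Note that the request itself succeeds up to the logging call; it is the debug log emission that raises and aborts the response.</p>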
<p>Likely related to <a class="issue tracker-1 status-3 priority-4 priority-default closed" title="Bug: mgr/dashboard: error when editing rbd image whose name contains non-ASCII chars. (Resolved)" href="https://tracker.ceph.com/issues/42651">#42651</a>.</p>

rbd - Bug #43429 (Resolved): rbd-mirror daemon command "rbd mirror status" produces invalid JSON — https://tracker.ceph.com/issues/43429 — 2019-12-27T01:30:43Z — Oliver Freyermuth
<p>The trailing brace is missing:<br /><pre>
ceph daemon /var/run/ceph/ceph-client.rbd_mirror_backup.$(systemctl show --property MainPID ceph-rbd-mirror@rbd_mirror_backup.service | sed 's/MainPID=//').*.asok rbd mirror status
{
    "pool_replayers": [
        {
            "pool": "rbd",
            "peer": "uuid: SOME__UUID__HERE cluster: ceph-virt client: client.rbd_mirror",
            "instance_id": "1371140",
            "state": "running",
            "leader_instance_id": "1780996",
            "leader": false,
            "local_cluster_admin_socket": "/var/run/ceph/client.rbd_mirror_backup.522106.ceph.93950850517624.asok",
            "remote_cluster_admin_socket": "/var/run/ceph/client.rbd_mirror.522106.ceph-virt.93950850527864.asok",
            "sync_throttler": {},
            "image_replayers": []
        }
</pre><br />Piped into jq, it discards all input silently.</p>

rbd - Bug #43428 (Resolved): rbd-mirror daemons don't logrotate correctly — https://tracker.ceph.com/issues/43428 — 2019-12-27T01:27:47Z — Oliver Freyermuth
<p>Currently, /etc/logrotate.d/ceph shipped with ceph-base has the following postrotate command:</p>
<pre>
killall -q -1 ceph-mon ceph-mgr ceph-mds ceph-osd ceph-fuse radosgw || pkill -1 -x "ceph-mon|ceph-mgr|ceph-mds|ceph-osd|ceph-fuse|radosgw" || true
</pre>
<p>rbd-mirror is missing here, and as expected, new log files stay empty after the old one is rotated. <br />I have not yet tested whether rbd-mirror responds correctly to SIGHUP, but the fix may be as simple as adding it to the list.</p>

CephFS - Feature #43337 (New): fs: support relatime correctly for CephFS — https://tracker.ceph.com/issues/43337 — 2019-12-17T01:27:41Z — Oliver Freyermuth
<p>As of now, CephFS does not seem to handle atime.</p>
<p>The relatime mount option (the default since Linux 2.6.30) not only updates atime when it is older than mtime/ctime, but also bumps atime when an access happens more than 24 hours after the last update. <br />There is one major use case for this we are interested in: <br />finding old data that is no longer read. This use case probably applies to most long-lived clusters with actual users, who tend not to clean up but keep data forever even when it is no longer needed. <br />From a more positive viewpoint, it can also help users themselves to clean up.</p>
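<p>For reference, the relatime policy described above can be sketched as follows (a simplification of the kernel rule, not CephFS code; timestamps are plain epoch seconds):</p>

```python
# Simplified sketch of the Linux relatime decision: update atime only
# if it lags behind mtime/ctime, or if the last update is >24h old.

DAY = 24 * 3600  # the 24-hour relatime window

def relatime_should_update(atime, mtime, ctime, now):
    """Return True if an access at `now` should persist a new atime."""
    if atime <= mtime or atime <= ctime:
        return True           # atime is stale relative to content changes
    return now - atime >= DAY  # otherwise at most one bump per day
```

<p>This is exactly the behaviour the feature request asks for: reads are cheap in the common case, yet "data not read for months" remains detectable.</p>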
<p>Bumping the atime at most once per day (as other filesystems do) seems reasonable for performance reasons and does not defeat this use case.</p>

Dashboard - Bug #24902 (Resolved): Mimic Dashboard does not allow deletion of snapshots containin... — https://tracker.ceph.com/issues/24902 — 2018-07-13T09:57:30Z — Oliver Freyermuth
<p>Trying to delete an RBD snapshot leads to the following message in the popup window:<br /> '2018-07-13T11:50:37+0200' doesn't match '2018-07-13T11:50:37+0200'. <br />and deletion is denied. <br />Replacing the "+" by something else lets me delete the snapshot.</p>

bluestore - Bug #23165 (Resolved): OSD used for Metadata / MDS storage constantly entering heartb... — https://tracker.ceph.com/issues/23165 — 2018-02-27T22:28:37Z — Oliver Freyermuth
<p>After our stress test creating 100,000,000 small files on CephFS, and now finally deleting all those files, 2 of the 4 OSDs crash continuously. <br />They enter heartbeat timeouts and are finally killed.</p>
<p>The other 2 of the 4 OSDs (replica 4) were recreated and backfilled during the deletion process.</p>
<pre>
# ceph osd df | head
ID CLASS WEIGHT  REWEIGHT SIZE USE    AVAIL %USE VAR  PGS
 0   ssd 0.21829  1.00000 223G  4692M  218G 2.05 1.74 128
 1   ssd 0.21829  1.00000 223G  4218M  219G 1.84 1.56 128
 2   ssd 0.21819  1.00000 223G 12007M  211G 5.25 4.46 128
 3   ssd 0.21819  1.00000 223G 13314M  210G 5.82 4.94 128
</pre><br />osd.0 and osd.1 have been backfilled and are running stably, but osd.2 and osd.3 are affected by the issue. <br />They briefly managed to synchronize and the cluster was healthy at some point, so they all contain the same "information". <br />The main difference is that osd.2 and osd.3 have lived through the mess of 100,000,000 files created and deleted,<br />while osd.0 and osd.1 are still rather fresh.
<p>I have uploaded a debug log with log level 20 for osd.3 here:<br />29275fcd-0dd3-4f0f-bacf-33d8482d85a3<br />After 2018-02-27 21:00, it contains several of those crashes at debug level 1, while the last one I captured at debug level 20. <br />Basically, I just see many:<br /><pre>
heartbeat_map is_healthy 'OSD::osd_op_tp thread 0xXXXXXXXXX' had timed out after 15
</pre><br />before a suicide timeout and abort.</p>
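<p>For context, the two thresholds behind those messages interact roughly like this. A simplified sketch in Python, not Ceph's actual C++; the values assume the usual defaults of osd_op_thread_timeout = 15 and a suicide timeout of 150 seconds:</p>

```python
# Simplified sketch of the heartbeat_map decision: a stalled worker
# first produces the "had timed out after 15" warning, and is only
# aborted once the much larger suicide grace is exceeded.
# Thresholds assume the common defaults (15 s / 150 s).

GRACE = 15            # osd_op_thread_timeout
SUICIDE_GRACE = 150   # suicide timeout

def check_thread(last_heartbeat, now):
    """Return 'healthy', 'timed_out' (warning only), or 'suicide'."""
    stalled = now - last_heartbeat
    if stalled >= SUICIDE_GRACE:
        return "suicide"      # heartbeat_map asserts and the OSD aborts
    if stalled >= GRACE:
        return "timed_out"    # the repeated is_healthy log line
    return "healthy"
```

<p>So a log full of "had timed out after 15" lines means the op threads were stuck for minutes before the final abort, matching what the attached log shows.</p>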
<p>While I could now (and likely will) just recreate those OSDs and backfill them from the healthier ones,<br />I hope the information collected in this ticket and log will help to solve the underlying issue.</p>

bluestore - Bug #23120 (Can't reproduce): OSDs continuously crash during recovery — https://tracker.ceph.com/issues/23120 — 2018-02-25T16:23:44Z — Oliver Freyermuth
<p>I have several OSDs continuously crashing during recovery. This is Luminous 12.2.3.</p>
<pre>
ceph version 12.2.3 (2dab17a455c09584f2a85e6b10888337d1ec8949) luminous (stable)
1: (()+0xa3c591) [0x55b3e5a85591]
2: (()+0xf5e0) [0x7f8c237ca5e0]
3: (gsignal()+0x37) [0x7f8c227f31f7]
4: (abort()+0x148) [0x7f8c227f48e8]
5: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x284) [0x55b3e5ac4664]
6: (BlueStore::_txc_add_transaction(BlueStore::TransContext*, ObjectStore::Transaction*)+0x1487) [0x55b3e5997a27]
7: (BlueStore::queue_transactions(ObjectStore::Sequencer*, std::vector<ObjectStore::Transaction, std::allocator<ObjectStore::Transaction> >&, boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x3a0) [0x55b3e5998a70]
8: (PrimaryLogPG::queue_transactions(std::vector<ObjectStore::Transaction, std::allocator<ObjectStore::Transaction> >&, boost::intrusive_ptr<OpRequest>)+0x65) [0x55b3e5708a85]
9: (ECBackend::handle_sub_write(pg_shard_t, boost::intrusive_ptr<OpRequest>, ECSubWrite&, ZTracer::Trace const&, Context*)+0x631) [0x55b3e5828191]
10: (ECBackend::_handle_message(boost::intrusive_ptr<OpRequest>)+0x327) [0x55b3e5838b27]
11: (PGBackend::handle_message(boost::intrusive_ptr<OpRequest>)+0x50) [0x55b3e573d680]
12: (PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&, ThreadPool::TPHandle&)+0x59c) [0x55b3e56a900c]
13: (OSD::dequeue_op(boost::intrusive_ptr<PG>, boost::intrusive_ptr<OpRequest>, ThreadPool::TPHandle&)+0x3f9) [0x55b3e552ef29]
14: (PGQueueable::RunVis::operator()(boost::intrusive_ptr<OpRequest> const&)+0x57) [0x55b3e57abad7]
15: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0xfce) [0x55b3e555d99e]
16: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x839) [0x55b3e5aca009]
17: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x55b3e5acbfa0]
18: (()+0x7e25) [0x7f8c237c2e25]
19: (clone()+0x6d) [0x7f8c228b634d]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
</pre><br />This is using the officially released RPMs.
<p>I've uploaded the logfile of one such OSD as:<br />ca0a29ae-0993-4faa-be4d-9ba2f7d6f905</p>
<p>The cluster will likely be recreated soon, since the system is now borked anyway, so please let me know quickly if more info is needed.</p>

RADOS - Bug #23117 (Fix Under Review): PGs stuck in "activating" after osd_max_pg_per_osd_hard_ra... — https://tracker.ceph.com/issues/23117 — 2018-02-24T19:14:21Z — Oliver Freyermuth
In the following setup:
<ul>
<li>6 OSD hosts</li>
<li>Each host with 32 disks = 32 OSDs</li>
<li>Pool with 2048 PGs, EC, k=4, m=2, crush failure domain host</li>
</ul>
<p>When (re)installing the 6th host and creating the first OSD on it, PG overdose protection briefly kicks in,<br />since all PGs need to have shards on the 6th host. <br />For this reason, PGs enter the "activating" state and get stuck there.</p>
<p>However, even when all 32 OSDs have been added on the 6th host, the PGs are still stuck in activating and the data stays unavailable (even though OSDs were added). <br />This situation does not resolve by itself.</p>
<p>This issue can be resolved by setting:<br /><pre>
osd_max_pg_per_osd_hard_ratio = 32
</pre><br />before the redeployment of a host, thus effectively turning off overdose protection.</p>
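<p>The arithmetic behind the stuck state can be sketched as follows. This is a back-of-the-envelope check, assuming the default mon_max_pg_per_osd = 200 (the hard limit being mon_max_pg_per_osd × osd_max_pg_per_osd_hard_ratio):</p>

```python
# Back-of-the-envelope check of why overdose protection triggers on
# the first OSD of the reinstalled host. Assumes the default
# mon_max_pg_per_osd = 200; the hard PG limit per OSD is
# mon_max_pg_per_osd * osd_max_pg_per_osd_hard_ratio.

PGS = 2048             # pool PGs
SHARDS = 4 + 2         # EC k=4, m=2 -> 6 shards per PG
HOSTS = 6
OSDS_PER_HOST = 32
MON_MAX_PG_PER_OSD = 200

# Steady state: shards spread evenly over all OSDs.
steady = PGS * SHARDS / (HOSTS * OSDS_PER_HOST)   # ~64 PGs per OSD

# First OSD on the reinstalled host: with failure domain "host",
# one shard of *every* PG must map to that single OSD.
first_osd = PGS                                   # 2048 PGs on one OSD

over_default = first_osd > MON_MAX_PG_PER_OSD * 2    # hard_ratio = 2
over_raised = first_osd > MON_MAX_PG_PER_OSD * 32    # hard_ratio = 32
```

<p>With the default hard ratio the first OSD is far over the limit (2048 vs. 400), which explains why the protection engages; with the ratio raised to 32, the limit (6400) is never hit, which matches the observed workaround.</p>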
For one example PG in the stuck state:<br /><pre>
# ceph pg dump all | grep 2.7f6
dumped all
2.7f6 38086 0 38086 0 0 2403961148 1594 1594 activating+undersized+degraded+remapped 2018-02-24 19:50:01.654185 39755'134350 39946:274873 [153,6,42,95,115,167] 153 [153,NONE,42,95,115,167] 153 39559'109078 2018-02-24 04:01:57.991376 36022'53756 2018-02-22 18:03:40.386421 0
</pre><br />I have uploaded OSD logs from all involved OSDs:
<ul>
<li>c3953bf7-b482-4705-a7a3-df354453a933 for osd.6 (which was reinstalled, so maybe this is irrelevant)</li>
<li>833c07e2-09ff-409c-b68f-1a87e7bfc353 for osd.4, which was the first OSD reinstalled on the new OSD host, so it should have been affected by overdose protection</li>
<li>cb146d33-e6cb-4c84-8b15-543728bbc5dd for osd.42</li>
<li>f716a2d1-e7ef-46d7-b4fc-dfc440e6fe59 for osd.95</li>
<li>fc7ec27a-82c9-4fb4-94dc-5dd64335e3b4 for osd.115</li>
<li>51213f5f-1b91-42b0-8c0c-8acf3622195f for osd.153</li>
<li>3d67f227-4dba-4c93-9fe1-7951d3d32f30 for osd.167</li>
</ul>
<p>I have also uploaded the ceph.conf of osd001 which was the reinstalled OSD host:<br />64744f9a-e136-40f9-a392-4a6f1b34a74e<br />All other OSD hosts have <br /><pre>
osd_max_pg_per_osd_hard_ratio = 32
</pre><br />set (which prevents the issue).</p>
<p>Additionally, I have uploaded all OSD logs of the reinstalled osd001 machine:<br />38ddd08f-6c66-4a88-8e83-f4eff0ae5d10<br />(so this includes osd.4 and osd.6 already linked above).</p>