Bug #648
monclient: PGMap::apply_incremental
Status:
Resolved
Priority:
Normal
Assignee:
-
Category:
Monitor
Target version:
-
% Done:
0%
Description
I left my laptop on last night with a 'ceph -w' running against one of my test machines. This morning I saw:
2010-12-13 22:38:51.497761 7f3907e94710 monclient: hunting for new mon
2010-12-13 22:39:02.919849 pg v42560: 832 pgs: 832 active+clean; 5803 MB data, 16891 MB used, 277 GB / 300 GB avail
2010-12-13 22:39:12.979827 pg v42561: 832 pgs: 832 active+clean; 5803 MB data, 16892 MB used, 277 GB / 300 GB avail
2010-12-13 22:39:24.353785 pg v42562: 832 pgs: 832 active+clean; 5803 MB data, 16892 MB used, 277 GB / 300 GB avail
2010-12-13 22:39:37.422499 pg v42563: 832 pgs: 832 active+clean; 5803 MB data, 16892 MB used, 277 GB / 300 GB avail
2010-12-13 22:39:49.759698 pg v42564: 832 pgs: 832 active+clean; 5803 MB data, 16893 MB used, 277 GB / 300 GB avail
2010-12-13 22:40:00.477234 pg v42565: 832 pgs: 832 active+clean; 5803 MB data, 16892 MB used, 277 GB / 300 GB avail
2010-12-13 22:40:12.024961 pg v42566: 832 pgs: 832 active+clean; 5803 MB data, 16892 MB used, 277 GB / 300 GB avail
2010-12-13 22:40:25.177157 pg v42567: 832 pgs: 832 active+clean; 5803 MB data, 16892 MB used, 277 GB / 300 GB avail
2010-12-13 22:40:35.827663 pg v42568: 832 pgs: 832 active+clean; 5803 MB data, 16892 MB used, 277 GB / 300 GB avail
2010-12-13 22:40:43.069503 mds e886: 1/1/1 up {0=up:active(laggy or crashed)}
2010-12-13 22:40:55.565163 pg v42569: 832 pgs: 832 active+clean; 5803 MB data, 16891 MB used, 277 GB / 300 GB avail
2010-12-13 22:41:01.910177 mds e887: 1/1/1 up {0=up:active}
2010-12-13 22:41:16.599476 log 2010-12-13 22:41:01.908892 mon0 [2a00:f10:113:1:230:48ff:fe8d:a21f]:6789/0 44 : [INF] mds0 [2a00:f10:113:1:230:48ff:fe8d:a21f]:6800/1987 up:active
2010-12-13 22:41:27.167460 pg v42570: 832 pgs: 832 active+clean; 5803 MB data, 16891 MB used, 277 GB / 300 GB avail
2010-12-13 22:41:38.616370 pg v42571: 832 pgs: 832 active+clean; 5803 MB data, 16891 MB used, 277 GB / 300 GB avail
2010-12-13 22:41:50.188709 pg v42572: 832 pgs: 832 active+clean; 5803 MB data, 16891 MB used, 277 GB / 300 GB avail
2010-12-13 22:42:01.686584 pg v42573: 832 pgs: 832 active+clean; 5803 MB data, 16892 MB used, 277 GB / 300 GB avail
2010-12-13 22:42:12.054956 pg v42574: 832 pgs: 832 active+clean; 5803 MB data, 16892 MB used, 277 GB / 300 GB avail
2010-12-13 22:42:24.908336 pg v42575: 832 pgs: 832 active+clean; 5803 MB data, 16891 MB used, 277 GB / 300 GB avail
2010-12-13 22:42:33.920864 pg v42576: 832 pgs: 832 active+clean; 5803 MB data, 16891 MB used, 277 GB / 300 GB avail
2010-12-13 22:42:45.710302 7f3907e94710 monclient: hunting for new mon
./mon/PGMap.h: In function 'void PGMap::apply_incremental(PGMap::Incremental&)':
./mon/PGMap.h:77: FAILED assert(inc.version == version+1)
ceph version 0.24~rc (commit:9add26be7698b55e31d9dff73537f1a726f9ee86)
1: ceph() [0x455a6e]
2: ceph() [0x458763]
3: (Admin::ms_dispatch(Message*)+0xe0) [0x46dfa0]
4: (SimpleMessenger::dispatch_entry()+0x759) [0x473b19]
5: (SimpleMessenger::DispatchThread::entry()+0x1c) [0x45dd8c]
6: (Thread::_entry_func(void*)+0xa) [0x4800da]
7: (()+0x69ca) [0x7f3910b699ca]
8: (clone()+0x6d) [0x7f390ea7070d]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
./mon/PGMap.h: In function 'void PGMap::apply_incremental(PGMap::Incremental&)':
./mon/PGMap.h:77: FAILED assert(inc.version == version+1)
ceph version 0.24~rc (commit:9add26be7698b55e31d9dff73537f1a726f9ee86)
1: ceph() [0x455a6e]
2: ceph() [0x458763]
3: (Admin::ms_dispatch(Message*)+0xe0) [0x46dfa0]
4: (SimpleMessenger::dispatch_entry()+0x759) [0x473b19]
5: (SimpleMessenger::DispatchThread::entry()+0x1c) [0x45dd8c]
6: (Thread::_entry_func(void*)+0xa) [0x4800da]
7: (()+0x69ca) [0x7f3910b699ca]
8: (clone()+0x6d) [0x7f390ea7070d]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
terminate called after throwing an instance of 'ceph::FailedAssertion'
Aborted
The 'monclient: hunting for new mon' message suggests that the monitor crashed and the client switched to another monitor, but there is only one monitor in this setup.
The cluster layout:
- 1 monitor
- 1 MDS
- 3 OSDs
All on the same machine.
There is nothing unusual in the monitor logs at those times (the debug level was low).
I'm not sure whether I can reproduce it, but the 'hunting for new mon' message seems rather odd.
Updated by Sage Weil over 13 years ago
- Target version set to 19
This is a known issue, caused by the pg state trimming. It'll go away eventually with #647. In the meantime, I'll make the trimming less aggressive so it won't come up so often.
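For readers unfamiliar with the failed assert: the check in the backtrace, inc.version == version+1, only accepts an incremental that directly follows the map version the client already holds. The sketch below illustrates that invariant with simplified, hypothetical types (MiniPGMap, Incremental, can_apply and try_apply are illustration-only names, not the real Ceph classes). The try_apply variant is an assumption about one way a client could tolerate a gap, not a description of the actual fix.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified stand-ins for illustration; not the real Ceph types.
struct Incremental {
  uint64_t version;  // version the map will have after this delta is applied
};

struct MiniPGMap {
  uint64_t version = 0;

  // True only if inc is the immediate successor of the current version.
  bool can_apply(const Incremental& inc) const {
    return inc.version == version + 1;
  }

  // Mirrors the check from the backtrace: a version gap aborts the process.
  void apply_incremental(const Incremental& inc) {
    assert(inc.version == version + 1);
    version = inc.version;
  }

  // Tolerant variant (an assumption, not Ceph's behavior): report the gap
  // to the caller instead of aborting, so it could ask for a full map.
  bool try_apply(const Incremental& inc) {
    if (!can_apply(inc)) return false;
    apply_incremental(inc);
    return true;
  }
};
```

With a map at version N, try_apply accepts only an incremental for N+1. Anything later is rejected and leaves the map unchanged, whereas apply_incremental would abort as in the report above.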
Updated by Sage Weil over 13 years ago
- Status changed from New to Resolved
Trimming changed in commit 89d5c91e7d207d646651f8959ee37a15ea199d1b.