https://tracker.ceph.com/https://tracker.ceph.com/favicon.ico2014-12-19T12:04:15ZCeph Ceph - Bug #10399: kvm die with assert(m_seed < old_pg_num)https://tracker.ceph.com/issues/10399?journal_id=458522014-12-19T12:04:15ZLoïc Dacharyloic@dachary.org
<ul></ul><p>Could you please send the output of <strong>ceph report</strong>? It would also be great to have (if possible) a set of steps to reproduce, even if only in theory.</p> Ceph - Bug #10399: kvm die with assert(m_seed < old_pg_num)https://tracker.ceph.com/issues/10399?journal_id=458542014-12-19T12:18:38ZSage Weilsage@newdream.net
<ul><li><strong>Priority</strong> changed from <i>Normal</i> to <i>Urgent</i></li><li><strong>Source</strong> changed from <i>other</i> to <i>Community (user)</i></li></ul> Ceph - Bug #10399: kvm die with assert(m_seed < old_pg_num)https://tracker.ceph.com/issues/10399?journal_id=458552014-12-19T12:20:20ZMehdi Abaakouksileht@sileht.net
<ul><li><strong>File</strong> <a href="/attachments/download/1573/ceph-report.gz">ceph-report.gz</a> added</li></ul><p>I have attached the ceph-report file.</p>
<p>The crashed kvm processes have a block device on the pool 'r2', the one where I changed the pg_num.<br />The pg_num before the crash was 512.<br />I used "ceph osd pool set r2 pg_num 1024" to change the pg_num.<br />The kvm processes then stopped, before Ceph had finished creating the new PGs and before I had changed the pgp_num.</p> Ceph - Bug #10399: kvm die with assert(m_seed < old_pg_num)https://tracker.ceph.com/issues/10399?journal_id=458562014-12-19T12:51:50ZLaurent GUERBYlaurent@guerby.net
<ul></ul><p>If this helps: the cluster was in HEALTH_OK state when the command was launched (the attached ceph-report.gz was taken later, during recovery from the pg_num change).</p> Ceph - Bug #10399: kvm die with assert(m_seed < old_pg_num)https://tracker.ceph.com/issues/10399?journal_id=470132015-01-27T21:17:58ZSage Weilsage@newdream.net
<ul><li><strong>Status</strong> changed from <i>New</i> to <i>Need More Info</i></li></ul><p>What version are you running now, and have you seen this since? The reported version was a random development version from shortly after giant, and we do a lot of split testing in our QA, so I would expect us to see this if it were still present.</p> Ceph - Bug #10399: kvm die with assert(m_seed < old_pg_num)https://tracker.ceph.com/issues/10399?journal_id=476032015-02-07T19:37:52ZSage Weilsage@newdream.net
<ul><li><strong>Status</strong> changed from <i>Need More Info</i> to <i>Can't reproduce</i></li></ul> Ceph - Bug #10399: kvm die with assert(m_seed < old_pg_num)https://tracker.ceph.com/issues/10399?journal_id=565772015-08-13T17:33:45ZRoy Keeneceph@rkeene.org
<ul></ul><p>I just ran into this issue as well, while growing pg_num and pgp_num.<br /><pre>
2015-07-29 16:53:16.018+0000: starting up libvirt version: 1.2.16, qemu version: 2.3.0
LC_ALL=C PATH=/bin:/usr/bin QEMU_AUDIO_DRV=none /bin/qemu-system-x86_64 -name one-30 -S -machine pc-i440fx-2.2,accel=kvm,usb=off -m 4096 -realtime mlock=off -smp 2,sockets=2,cores=1,threads=1 -uuid cb6683fc-d1d1-42ad-aa5c-bcb8a0d322c4 -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/one-30.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=rbd:rbd/one-16-30-0:auth_supported=none:mon_host=aurae-storage-1\:6789\;aurae-storage-2\:6789\;aurae-storage-3\:6789\;aurae-storage-4\:6789\;aurae-storage-5\:6789\;aurae-storage-6\:6789,if=none,id=drive-ide0-0-0,format=raw,cache=writeback -device ide-hd,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=1 -drive file=/var/lib/one//datastores/0/30/disk.2,if=none,id=drive-ide0-0-1,readonly=on,format=raw -device ide-cd,bus=ide.0,unit=1,drive=drive-ide0-0-1,id=ide0-0-1 -drive file=rbd:rbd/one-21-30-1:auth_supported=none:mon_host=aurae-storage-1\:6789\;aurae-storage-2\:6789\;aurae-storage-3\:6789\;aurae-storage-4\:6789\;aurae-storage-5\:6789\;aurae-storage-6\:6789,if=none,id=drive-virtio-disk0,format=raw,cache=writeback -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0 -netdev tap,fd=12,id=hostnet0 -device rtl8139,netdev=hostnet0,id=net0,mac=02:00:0a:50:00:13,bus=pci.0,addr=0x3 -vnc 0.0.0.0:30 -device cirrus-vga,id=video0,bus=pci.0,addr=0x2 -incoming tcp:0.0.0.0:49152 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5 -sandbox on -msg timestamp=on
osd/osd_types.cc: In function 'bool pg_t::is_split(unsigned int, unsigned int, std::set<pg_t>*) const' thread 7f9f895a8700 time 2015-08-13 14:57:29.443302
osd/osd_types.cc: 459: FAILED assert(m_seed < old_pg_num)
ceph version 0.94.1 (e4bfad3a3c51054df7e537a724c8d0bf9be972ff)
1: (()+0x11dfe8) [0x7f9f8f607fe8]
2: (()+0x1e99e1) [0x7f9f8f6d39e1]
3: (()+0x1e9abd) [0x7f9f8f6d3abd]
4: (()+0x8f939) [0x7f9f8f579939]
5: (()+0xa6c73) [0x7f9f8f590c73]
6: (()+0xa74ba) [0x7f9f8f5914ba]
7: (()+0xa89f2) [0x7f9f8f5929f2]
8: (()+0xae8ff) [0x7f9f8f5988ff]
9: (()+0x2887aa) [0x7f9f8f7727aa]
10: (()+0x2b501d) [0x7f9f8f79f01d]
11: (()+0x8354) [0x7f9f8d909354]
12: (clone()+0x6d) [0x7f9f8d64871d]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
terminate called after throwing an instance of 'ceph::FailedAssertion'
2015-08-13 14:57:29.766+0000: shutting down
</pre></p> Ceph - Bug #10399: kvm die with assert(m_seed < old_pg_num)https://tracker.ceph.com/issues/10399?journal_id=565792015-08-13T17:48:03ZJosh Durgin
<ul><li><strong>Status</strong> changed from <i>Can't reproduce</i> to <i>Need More Info</i></li><li><strong>Regression</strong> set to <i>No</i></li></ul><p>Which version of librados was this, Roy? How many VMs saw this crash, and how many are there in total (I am trying to figure out how hard it is to reproduce in tests)?</p> Ceph - Bug #10399: kvm die with assert(m_seed < old_pg_num)https://tracker.ceph.com/issues/10399?journal_id=565842015-08-13T18:17:27ZRoy Keeneceph@rkeene.org
<ul></ul><p>This was Ceph 0.94.1 on Linux/x86_64, so whatever version of librados came with that release.</p>
<p>About 2 of 10 VMs died this way simultaneously.</p>
<p>The QEMU version used is 2.3.0.</p> Ceph - Bug #10399: kvm die with assert(m_seed < old_pg_num)https://tracker.ceph.com/issues/10399?journal_id=566042015-08-14T10:07:03ZWei-Chung Chengfreeze.bilsted@gmail.com
<ul></ul><p>I have the same issue with ceph-0.94.2<br />Ubuntu version: 12.04.5<br />kernel version: 3.13.0-35<br />Qemu version: 2.0.0</p>
<p>8 of 20 VMs shut off (under normal I/O stress).</p>
<p>Without VMs it is hard to reproduce (I have not managed to reproduce it so far).</p>
<p>I am still trying to figure out whether m_seed or old_pg_num is wrong.</p>
<p>thanks!!!!</p> Ceph - Bug #10399: kvm die with assert(m_seed < old_pg_num)https://tracker.ceph.com/issues/10399?journal_id=567542015-08-18T18:02:30ZSage Weilsage@newdream.net
<ul></ul><p>Is this the same as <a class="issue tracker-1 status-10 priority-6 priority-high2 closed" title="Bug: "FAILED assert(m_seed < old_pg_num)" in upgrade:giant-x-hammer-distro-basic-vps run (Duplicate)" href="https://tracker.ceph.com/issues/10543">#10543</a>?</p> Ceph - Bug #10399: kvm die with assert(m_seed < old_pg_num)https://tracker.ceph.com/issues/10399?journal_id=567872015-08-19T05:52:06ZWei-Chung Chengfreeze.bilsted@gmail.com
<ul></ul><p>Sage Weil wrote:</p>
<blockquote>
<p>Is this the same as <a class="issue tracker-1 status-10 priority-6 priority-high2 closed" title="Bug: "FAILED assert(m_seed < old_pg_num)" in upgrade:giant-x-hammer-distro-basic-vps run (Duplicate)" href="https://tracker.ceph.com/issues/10543">#10543</a>?</p>
</blockquote>
<p>Yes, I think so.</p>
<p>The stack trace and the environment look very similar.</p> Ceph - Bug #10399: kvm die with assert(m_seed < old_pg_num)https://tracker.ceph.com/issues/10399?journal_id=568552015-08-21T15:12:07ZJason Dillamandillaman@redhat.com
<ul><li><strong>Status</strong> changed from <i>Need More Info</i> to <i>In Progress</i></li><li><strong>Assignee</strong> set to <i>Jason Dillaman</i></li></ul><p>Easily repeatable when "rbd bench-write" is running in the background while you double the number of PGs.</p> Ceph - Bug #10399: kvm die with assert(m_seed < old_pg_num)https://tracker.ceph.com/issues/10399?journal_id=568562015-08-21T15:31:58ZJason Dillamandillaman@redhat.com
<ul><li><strong>Backport</strong> set to <i>hammer,giant,firefly</i></li></ul> Ceph - Bug #10399: kvm die with assert(m_seed < old_pg_num)https://tracker.ceph.com/issues/10399?journal_id=568592015-08-21T15:45:14ZJason Dillamandillaman@redhat.com
<ul><li><strong>Status</strong> changed from <i>In Progress</i> to <i>Fix Under Review</i></li></ul><p><strong>master PR</strong>: <a class="external" href="https://github.com/ceph/ceph/pull/5648">https://github.com/ceph/ceph/pull/5648</a></p> Ceph - Bug #10399: kvm die with assert(m_seed < old_pg_num)https://tracker.ceph.com/issues/10399?journal_id=568712015-08-21T20:41:41ZLoïc Dacharyloic@dachary.org
<ul><li><strong>Backport</strong> changed from <i>hammer,giant,firefly</i> to <i>hammer,firefly</i></li></ul><p>giant is retired</p> Ceph - Bug #10399: kvm die with assert(m_seed < old_pg_num)https://tracker.ceph.com/issues/10399?journal_id=568972015-08-24T14:56:06ZJason Dillamandillaman@redhat.com
<ul><li><strong>Backport</strong> changed from <i>hammer,firefly</i> to <i>infernalis,hammer,firefly</i></li></ul> Ceph - Bug #10399: kvm die with assert(m_seed < old_pg_num)https://tracker.ceph.com/issues/10399?journal_id=570122015-08-25T15:27:02ZJason Dillamandillaman@redhat.com
<ul><li><strong>Status</strong> changed from <i>Fix Under Review</i> to <i>Pending Backport</i></li></ul> Ceph - Bug #10399: kvm die with assert(m_seed < old_pg_num)https://tracker.ceph.com/issues/10399?journal_id=607482015-10-27T14:37:40ZNathan Cutlerncutler@suse.cz
<ul><li><strong>Status</strong> changed from <i>Pending Backport</i> to <i>Resolved</i></li></ul> Ceph - Bug #10399: kvm die with assert(m_seed < old_pg_num)https://tracker.ceph.com/issues/10399?journal_id=607502015-10-27T14:39:40ZNathan Cutlerncutler@suse.cz
<ul></ul><p>Already in infernalis. Hammer and firefly backports have been merged.</p>
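<p>The assertion in the traces above is a precondition of <code>pg_t::is_split(old_pg_num, new_pg_num, children)</code>: the PG seed being tested must already exist under <code>old_pg_num</code>. The following is a minimal Python sketch of that precondition, not the Ceph implementation. It assumes both pg_num values are powers of two, so that each parent seed gains children at <code>seed + k * old_pg_num</code>; the function name <code>split_children</code> is hypothetical. The failing call at the end is an illustrative scenario (a seed computed against 1024 PGs checked against a stale pg_num of 512), not the root cause diagnosed in this ticket.</p>

```python
def split_children(seed, old_pg_num, new_pg_num):
    """Simplified model of a PG split check (hypothetical helper).

    Assumes old_pg_num and new_pg_num are powers of two. Returns the set
    of child seeds that split off from `seed` when the pool grows.
    """
    # Same precondition as the assert in the trace: the parent PG must
    # already exist under the old pg_num.
    assert seed < old_pg_num, "m_seed < old_pg_num"
    if new_pg_num <= old_pg_num:
        return set()  # the pool did not grow, so nothing splits
    # With power-of-two sizes, children sit at fixed strides above the parent.
    return set(range(seed + old_pg_num, new_pg_num, old_pg_num))


if __name__ == "__main__":
    # Doubling a pool from 512 to 1024 PGs: PG 5 gains one child, PG 517.
    print(split_children(5, 512, 1024))
    # A seed that only exists under the new pg_num, checked against the
    # old one, violates the precondition and raises AssertionError.
    try:
        split_children(700, 512, 1024)
    except AssertionError as exc:
        print("assertion failed:", exc)
```

<p>In this model the assertion can only fire when the seed and <code>old_pg_num</code> come from different views of the pool, which is the question raised earlier in the thread about whether m_seed or old_pg_num is wrong.</p>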