Bug #53000

open

OSDMap/OSDMapTest.BUG_51842/2: ThreadPool::WorkQueue<ParallelPGMapper::Item>::_void_dequeue: pure virtual method called

Added by Mykola Golub over 2 years ago. Updated 10 days ago.

Status: New
Priority: Normal
Assignee:
Category: -
Target version: -
% Done: 0%
Source:
Tags: test-failure
Backport: pacific,quincy
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

This failure was reported by Jenkins for a pacific branch PR [1], though it does not appear to be related to that PR and may not be specific to the pacific branch. It was not reproduced when Jenkins was re-run, and I also failed to reproduce it locally by running the test in a loop for a while. It may be environment-specific.

The test OSDMap/OSDMapTest.BUG_51842/2 aborted with "pure virtual method called" in the thread started by clean_pg_upmaps, in ThreadPool::WorkQueue<ParallelPGMapper::Item>::_void_dequeue().

[ RUN      ] OSDMap/OSDMapTest.BUG_51842/2
ID  CLASS  WEIGHT   TYPE NAME             
-5         3.00000  root infra-1706       
-4         1.00000      host host-0       
 0         1.00000          osd.0         
-6         1.00000      host host-1       
 1         1.00000          osd.1         
-7         1.00000      host host-2       
 2         1.00000          osd.2         
-1               0  root default          
-3               0      rack localrack    
-2               0          host localhost

pure virtual method called
terminate called without an active exception
...
    -2> 2021-10-05T08:33:22.552+0000 7fe076379700  1 BUG_40104::clean_upmap_tp worker finish
    -1> 2021-10-05T08:33:22.552+0000 7fe079b80700  1 BUG_40104::clean_upmap_tp worker finish
     0> 2021-10-05T08:33:22.556+0000 7fe076b7a700 -1 *** Caught signal (Aborted) **
 in thread 7fe076b7a700 thread_name:clean_upmap_tp

 ceph version Development (no_version) pacific (stable)
 1: /home/jenkins-build/build/workspace/ceph-pull-requests/build/bin/unittest_osdmap(+0x2e81e3) [0x557b8f5431e3]
 2: /lib/x86_64-linux-gnu/libpthread.so.0(+0x153c0) [0x7fe086bca3c0]
 3: gsignal()
 4: abort()
 5: /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0x9e911) [0x7fe07c0e9911]
 6: /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0xaa38c) [0x7fe07c0f538c]
 7: /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0xaa3f7) [0x7fe07c0f53f7]
 8: /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0xab155) [0x7fe07c0f6155]
 9: (ThreadPool::WorkQueue<ParallelPGMapper::Item>::_void_dequeue()+0x27) [0x557b8f4d9493]
 10: (ThreadPool::worker(ThreadPool::WorkThread*)+0x50b) [0x7fe07d8ed7bf]
 11: (ThreadPool::WorkThread::entry()+0x36) [0x7fe07d8f1c92]
 12: (Thread::entry_wrapper()+0x87) [0x7fe07d8c8e73]
 13: (Thread::_entry_func(void*)+0x1c) [0x7fe07d8c8de2]
 14: /lib/x86_64-linux-gnu/libpthread.so.0(+0x9609) [0x7fe086bbe609]
 15: clone()
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

[1] https://github.com/ceph/ceph/pull/43415
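
The root cause has not been identified in this ticket. As background on the message itself, the sketch below shows one generic way "pure virtual method called" can arise: a virtual call that reaches an object after its derived part has been destroyed dispatches to the base class. The names `QueueBase`, `ItemQueue`, `poll`, `dequeue`, and `observe_dispatch` are hypothetical and are not Ceph's ThreadPool code; whether a worker thread hits a similar window in this test is only an assumption. The base method is deliberately given a body so the example stays well-defined.

```cpp
#include <string>
#include <utility>

// Hypothetical stand-in for a work-queue base class; NOT Ceph's ThreadPool code.
struct QueueBase {
  explicit QueueBase(std::string* log) : log_(log) {}

  virtual ~QueueBase() {
    // The derived part has already been destroyed at this point, so the
    // virtual call inside poll() dispatches to QueueBase::dequeue().
    *log_ = poll();
  }

  // Non-virtual helper, standing in for a caller such as a worker loop.
  std::string poll() { return dequeue(); }

  // If this were declared "= 0", the call made from the destructor would
  // be undefined behavior; with GCC/libstdc++ it typically aborts with
  // "pure virtual method called".
  virtual std::string dequeue() { return "QueueBase::dequeue"; }

 private:
  std::string* log_;  // outlives the queue so the result can be inspected
};

struct ItemQueue : QueueBase {
  using QueueBase::QueueBase;
  std::string dequeue() override { return "ItemQueue::dequeue"; }
};

// Returns {dispatch target while the object is alive,
//          dispatch target during base-class destruction}.
std::pair<std::string, std::string> observe_dispatch() {
  std::string during_dtor;
  std::string while_alive;
  {
    ItemQueue q(&during_dtor);
    while_alive = q.poll();
  }  // ~ItemQueue runs first, then ~QueueBase records its own dispatch
  return {while_alive, during_dtor};
}
```

Calling `observe_dispatch()` yields "ItemQueue::dequeue" for the live object and "QueueBase::dequeue" for the call made during destruction. In a threaded setting the same effect would require a second thread to call into the object while its destructor is running, which would also explain an intermittent failure, but that remains a hypothesis here.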


Files

clean_pg_upmaps failure.gz (287 KB) Laura Flores, 08/22/2022 03:51 PM

Related issues 2 (0 open, 2 closed)

Copied to RADOS - Backport #57288: pacific: OSDMap/OSDMapTest.BUG_51842/2: ThreadPool::WorkQueue<ParallelPGMapper::Item>::_void_dequeue: pure virtual method called (Rejected)
Copied to RADOS - Backport #57289: quincy: OSDMap/OSDMapTest.BUG_51842/2: ThreadPool::WorkQueue<ParallelPGMapper::Item>::_void_dequeue: pure virtual method called (Rejected)
Actions #1

Updated by Mykola Golub over 2 years ago

Another example, this time for a PR against the master branch [1, 2], and for OSDMap/OSDMapTest.BUG_51842/1:

[ RUN      ] OSDMap/OSDMapTest.BUG_51842/1
ID  CLASS  WEIGHT   TYPE NAME             
-5         3.00000  root infra-1706       
-4         1.00000      host host-0       
 0         1.00000          osd.0         
-6         1.00000      host host-1       
 1         1.00000          osd.1         
-7         1.00000      host host-2       
 2         1.00000          osd.2         
-1               0  root default          
-3               0      rack localrack    
-2               0          host localhost

{
    "rule": {
        "rule_id": 0,
        "rule_name": "replicated_rule",
        "type": 1,
        "steps": [
            {
                "op": "take",
                "item": -1,
                "item_name": "default" 
            },
            {
                "op": "choose_firstn",
                "num": 0,
                "type": "osd" 
            },
            {
                "op": "emit" 
            }
        ]
    },
    "rule": {
        "rule_id": 1,
        "rule_name": "infra-1706",
        "type": 1,
        "steps": [
            {
                "op": "set_chooseleaf_tries",
                "num": 5
            },
            {
                "op": "set_choose_tries",
                "num": 100
            },
            {
                "op": "take",
                "item": -5,
                "item_name": "infra-1706" 
            },
            {
                "op": "chooseleaf_firstn",
                "num": 3,
                "type": "host" 
            },
            {
                "op": "emit" 
            }
        ]
    }
}
2021-10-19T11:21:14.182+0000 7fef7fba9700  1 heartbeat_map reset_timeout 'BUG_40104::clean_upmap_tp thread 0x7fef7fba9700' had timed out after 0.000000000s
pure virtual method called
terminate called without an active exception
*** Caught signal (Aborted) **
 in thread 7fef82baf700 thread_name:clean_upmap_tp
 ceph version Development (no_version) quincy (dev)
 1: /home/jenkins-build/build/workspace/ceph-pull-requests/build/bin/unittest_osdmap() [0x6a7b7b]
 2: /lib/x86_64-linux-gnu/libpthread.so.0(+0x153c0) [0x7fef880073c0]
 3: gsignal()
 4: abort()
 5: /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0x9e911) [0x7fef854eb911]
 6: /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0xaa38c) [0x7fef854f738c]
 7: /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0xaa3f7) [0x7fef854f73f7]
 8: /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0xab155) [0x7fef854f8155]
 9: (ThreadPool::WorkQueue<ParallelPGMapper::Item>::_void_dequeue()+0x19) [0x5fea49]
 10: (ThreadPool::worker(ThreadPool::WorkThread*)+0x641) [0x7fef86f20841]
 11: (ThreadPool::WorkThread::entry()+0x20) [0x7fef86f268c0]
 12: (Thread::entry_wrapper()+0x90) [0x7fef86ef6b10]
 13: (Thread::_entry_func(void*)+0x18) [0x7fef86ef6a68]
 14: /lib/x86_64-linux-gnu/libpthread.so.0(+0x9609) [0x7fef87ffb609]
 15: clone()

[1] https://github.com/ceph/ceph/pull/41323
[2] https://jenkins.ceph.com/job/ceph-pull-requests/84171/console

Actions #4

Updated by Radoslaw Zarzynski over 1 year ago

  • Status changed from New to Fix Under Review
  • Pull request ID set to 41323
Actions #5

Updated by Radoslaw Zarzynski over 1 year ago

  • Backport set to pacific,quincy
Actions #6

Updated by Radoslaw Zarzynski over 1 year ago

  • Status changed from Fix Under Review to Pending Backport
Actions #7

Updated by Backport Bot over 1 year ago

  • Copied to Backport #57288: pacific: OSDMap/OSDMapTest.BUG_51842/2: ThreadPool::WorkQueue<ParallelPGMapper::Item>::_void_dequeue: pure virtual method called added
Actions #8

Updated by Backport Bot over 1 year ago

  • Copied to Backport #57289: quincy: OSDMap/OSDMapTest.BUG_51842/2: ThreadPool::WorkQueue<ParallelPGMapper::Item>::_void_dequeue: pure virtual method called added
Actions #9

Updated by Backport Bot over 1 year ago

  • Tags set to backport_processed
Actions #10

Updated by Mykola Golub over 1 year ago

I don't understand why the PR ID for this bug was set to 41323. I mentioned PR 41323 not as the fix but as an example of a PR whose test run hit this bug.

Actions #11

Updated by Radoslaw Zarzynski over 1 year ago

  • Status changed from Pending Backport to New
  • Pull request ID deleted (41323)

Sorry, moving back to New.

Actions #12

Updated by Kefu Chai over 1 year ago

[ RUN      ] OSDMap/OSDMapTest.BUG_51842/2
ID  CLASS  WEIGHT   TYPE NAME             
-5         3.00000  root infra-1706       
-4         1.00000      host host-0       
 0         1.00000          osd.0         
-6         1.00000      host host-1       
 1         1.00000          osd.1         
-7         1.00000      host host-2       
 2         1.00000          osd.2         
-1               0  root default          
-3               0      rack localrack    
-2               0          host localhost

pure virtual method called
...
    -2> 2022-09-04T14:17:08.035+0000 7efd9e464640  1 BUG_40104::clean_upmap_tp worker finish
    -1> 2022-09-04T14:17:08.035+0000 7efd9d462640  1 BUG_40104::clean_upmap_tp worker finish
     0> 2022-09-04T14:17:08.219+0000 7efd9ac5d640 -1 *** Caught signal (Aborted) **

Spotted on the main branch.

Actions #13

Updated by Radoslaw Zarzynski over 1 year ago

  • Tags changed from backport_processed to test-failure
Actions #14

Updated by Casey Bodley 22 days ago

From https://jenkins.ceph.com/job/ceph-pull-requests/133465/consoleFull:

[ RUN      ] OSDMap/OSDMapTest.BUG_51842/1
ID  CLASS  WEIGHT   TYPE NAME             
-5         3.00000  root infra-1706       
-4         1.00000      host host-0       
 0         1.00000          osd.0         
-6         1.00000      host host-1       
 1         1.00000          osd.1         
-7         1.00000      host host-2       
 2         1.00000          osd.2         
-1               0  root default          
-3               0      rack localrack    
-2               0          host localhost

pure virtual method called
terminate called without an active exception
{
    "rule": {
        "rule_id": 0,
        "rule_name": "replicated_rule",
        "type": 1,
        "steps": [
            {
                "op": "take",
                "item": -1,
                "item_name": "default" 
            },
            {
                "op": "choose_firstn",
                "num": 0,
                "type": "osd" 
            },
            {
                "op": "emit" 
            }
        ]
    },
    "rule": {
        "rule_id": 1,
        "rule_name": "infra-1706",
        "type": 1,
        "steps": [
            {
                "op": "set_chooseleaf_tries",
                "num": 5
            },
            {
                "op": "set_choose_tries",
                "num": 100
            },
            {
                "op": "take",
                "item": -5,
                "item_name": "infra-1706" 
            },
            {
                "op": "chooseleaf_firstn",
                "num": 3,
                "type": "host" 
            },
            {
                "op": "emit" 
            }
        ]
    }
}
*** Caught signal (Aborted) **
 in thread 7ff8c9291640 thread_name:clean_upmap_tp
 ceph version Development (no_version) squid (dev)
 1: /home/jenkins-build/build/workspace/ceph-pull-requests/build/bin/unittest_osdmap(+0x2ec6a2) [0x5621755016a2]
 2: /lib/x86_64-linux-gnu/libc.so.6(+0x42520) [0x7ff8d2328520]
 3: pthread_kill()
 4: raise()
 5: abort()
 6: /lib/x86_64-linux-gnu/libstdc++.so.6(+0xa2bfe) [0x7ff8d26b7bfe]
 7: /lib/x86_64-linux-gnu/libstdc++.so.6(+0xae28c) [0x7ff8d26c328c]
 8: /lib/x86_64-linux-gnu/libstdc++.so.6(+0xae2f7) [0x7ff8d26c32f7]
 9: /lib/x86_64-linux-gnu/libstdc++.so.6(+0xaf025) [0x7ff8d26c4025]
 10: (ThreadPool::WorkQueue<ParallelPGMapper::Item>::_void_dequeue()+0x16) [0x562175448ef6]
 11: (ThreadPool::worker(ThreadPool::WorkThread*)+0x7a5) [0x7ff8d447b195]
 12: (ThreadPool::WorkThread::entry()+0x1a) [0x7ff8d448119a]
 13: (Thread::entry_wrapper()+0x84) [0x7ff8d444ee44]
 14: (Thread::_entry_func(void*)+0x15) [0x7ff8d444eda5]
 15: /lib/x86_64-linux-gnu/libc.so.6(+0x94b43) [0x7ff8d237ab43]
 16: /lib/x86_64-linux-gnu/libc.so.6(+0x126a00) [0x7ff8d240ca00]

Actions #15

Updated by Radoslaw Zarzynski 17 days ago

  • Assignee set to Laura Flores
Actions #16

Updated by Laura Flores 10 days ago

Still looking into this one.
