Bug #53000

OSDMap/OSDMapTest.BUG_51842/2: ThreadPool::WorkQueue<ParallelPGMapper::Item>::_void_dequeue: pure virtual method called

Added by Mykola Golub about 1 year ago. Updated 3 months ago.

Status:
New
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
% Done:
0%

Source:
Tags:
test-failure
Backport:
pacific,quincy
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

This failure was reported by Jenkins for a pacific branch PR [1], though it does not look related to that PR and may not be specific to the pacific branch. It was not reproduced after re-running Jenkins. I also failed to reproduce it locally by running the test in a loop for a while. It may be environment specific.

The test OSDMap/OSDMapTest.BUG_51842/2 aborted with "pure virtual method called" in the thread started by clean_pg_upmaps, in ThreadPool::WorkQueue<ParallelPGMapper::Item>::_void_dequeue().

[ RUN      ] OSDMap/OSDMapTest.BUG_51842/2
ID  CLASS  WEIGHT   TYPE NAME             
-5         3.00000  root infra-1706       
-4         1.00000      host host-0       
 0         1.00000          osd.0         
-6         1.00000      host host-1       
 1         1.00000          osd.1         
-7         1.00000      host host-2       
 2         1.00000          osd.2         
-1               0  root default          
-3               0      rack localrack    
-2               0          host localhost

pure virtual method called
terminate called without an active exception
...
    -2> 2021-10-05T08:33:22.552+0000 7fe076379700  1 BUG_40104::clean_upmap_tp worker finish
    -1> 2021-10-05T08:33:22.552+0000 7fe079b80700  1 BUG_40104::clean_upmap_tp worker finish
     0> 2021-10-05T08:33:22.556+0000 7fe076b7a700 -1 *** Caught signal (Aborted) **
 in thread 7fe076b7a700 thread_name:clean_upmap_tp

 ceph version Development (no_version) pacific (stable)
 1: /home/jenkins-build/build/workspace/ceph-pull-requests/build/bin/unittest_osdmap(+0x2e81e3) [0x557b8f5431e3]
 2: /lib/x86_64-linux-gnu/libpthread.so.0(+0x153c0) [0x7fe086bca3c0]
 3: gsignal()
 4: abort()
 5: /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0x9e911) [0x7fe07c0e9911]
 6: /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0xaa38c) [0x7fe07c0f538c]
 7: /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0xaa3f7) [0x7fe07c0f53f7]
 8: /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0xab155) [0x7fe07c0f6155]
 9: (ThreadPool::WorkQueue<ParallelPGMapper::Item>::_void_dequeue()+0x27) [0x557b8f4d9493]
 10: (ThreadPool::worker(ThreadPool::WorkThread*)+0x50b) [0x7fe07d8ed7bf]
 11: (ThreadPool::WorkThread::entry()+0x36) [0x7fe07d8f1c92]
 12: (Thread::entry_wrapper()+0x87) [0x7fe07d8c8e73]
 13: (Thread::_entry_func(void*)+0x1c) [0x7fe07d8c8de2]
 14: /lib/x86_64-linux-gnu/libpthread.so.0(+0x9609) [0x7fe086bbe609]
 15: clone()
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
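
For readers unfamiliar with the message: "pure virtual method called" is what libstdc++ prints when a pure virtual function is reached through an object's vtable. One general C++ way to get there is for a thread to call a virtual method on an object whose derived-class destructor has already run, so that only the base part of the object remains. This is not a confirmed diagnosis of this bug. The sketch below uses hypothetical stand-in types (QueueBase and DerivedQueue, not Ceph's ThreadPool or ParallelPGMapper classes) and a non-pure virtual so that it runs safely. It only shows that virtual dispatch falls back to the base class once the derived part has been destroyed.

```cpp
#include <string>

// Hypothetical stand-ins for illustration only; these are not Ceph classes.
struct QueueBase {
    explicit QueueBase(std::string* log) : log_(log) {}

    // By the time this destructor body runs, the derived part of the object
    // is already gone, so who() dispatches to QueueBase::who.
    virtual ~QueueBase() { *log_ = who(); }

    // Non-pure on purpose: if this were "= 0", the call in the destructor
    // would be undefined behaviour instead of returning a value.
    virtual std::string who() const { return "base"; }

protected:
    std::string* log_;  // where the destructor records what it observed
};

struct DerivedQueue : QueueBase {
    using QueueBase::QueueBase;
    std::string who() const override { return "derived"; }
};

// Dispatch through a base reference while the whole object is alive.
std::string who_while_alive() {
    std::string log;
    DerivedQueue q(&log);
    const QueueBase& ref = q;
    return ref.who();
}

// Dispatch as observed from inside the base-class destructor.
std::string who_during_base_destruction() {
    std::string log;
    {
        DerivedQueue q(&log);
    }  // q is destroyed here: ~DerivedQueue first, then ~QueueBase
    return log;
}
```

Here who_while_alive() returns "derived" and who_during_base_destruction() returns "base". If who() were pure virtual and another thread invoked it during that window, the behaviour would be undefined, and with GCC it typically ends in the abort shown in the log above.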

[1] https://github.com/ceph/ceph/pull/43415

clean_pg_upmaps failure.gz (287 KB) Laura Flores, 08/22/2022 03:51 PM


Related issues

Copied to RADOS - Backport #57288: pacific: OSDMap/OSDMapTest.BUG_51842/2: ThreadPool::WorkQueue<ParallelPGMapper::Item>::_void_dequeue: pure virtual method called Rejected
Copied to RADOS - Backport #57289: quincy: OSDMap/OSDMapTest.BUG_51842/2: ThreadPool::WorkQueue<ParallelPGMapper::Item>::_void_dequeue: pure virtual method called Rejected

History

#1 Updated by Mykola Golub about 1 year ago

Another occurrence, this time on a PR against the master branch [1, 2], in OSDMap/OSDMapTest.BUG_51842/1:

[ RUN      ] OSDMap/OSDMapTest.BUG_51842/1
ID  CLASS  WEIGHT   TYPE NAME             
-5         3.00000  root infra-1706       
-4         1.00000      host host-0       
 0         1.00000          osd.0         
-6         1.00000      host host-1       
 1         1.00000          osd.1         
-7         1.00000      host host-2       
 2         1.00000          osd.2         
-1               0  root default          
-3               0      rack localrack    
-2               0          host localhost

{
    "rule": {
        "rule_id": 0,
        "rule_name": "replicated_rule",
        "type": 1,
        "steps": [
            {
                "op": "take",
                "item": -1,
                "item_name": "default" 
            },
            {
                "op": "choose_firstn",
                "num": 0,
                "type": "osd" 
            },
            {
                "op": "emit" 
            }
        ]
    },
    "rule": {
        "rule_id": 1,
        "rule_name": "infra-1706",
        "type": 1,
        "steps": [
            {
                "op": "set_chooseleaf_tries",
                "num": 5
            },
            {
                "op": "set_choose_tries",
                "num": 100
            },
            {
                "op": "take",
                "item": -5,
                "item_name": "infra-1706" 
            },
            {
                "op": "chooseleaf_firstn",
                "num": 3,
                "type": "host" 
            },
            {
                "op": "emit" 
            }
        ]
    }
}
2021-10-19T11:21:14.182+0000 7fef7fba9700  1 heartbeat_map reset_timeout 'BUG_40104::clean_upmap_tp thread 0x7fef7fba9700' had timed out after 0.000000000s
pure virtual method called
terminate called without an active exception
*** Caught signal (Aborted) **
 in thread 7fef82baf700 thread_name:clean_upmap_tp
 ceph version Development (no_version) quincy (dev)
 1: /home/jenkins-build/build/workspace/ceph-pull-requests/build/bin/unittest_osdmap() [0x6a7b7b]
 2: /lib/x86_64-linux-gnu/libpthread.so.0(+0x153c0) [0x7fef880073c0]
 3: gsignal()
 4: abort()
 5: /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0x9e911) [0x7fef854eb911]
 6: /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0xaa38c) [0x7fef854f738c]
 7: /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0xaa3f7) [0x7fef854f73f7]
 8: /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0xab155) [0x7fef854f8155]
 9: (ThreadPool::WorkQueue<ParallelPGMapper::Item>::_void_dequeue()+0x19) [0x5fea49]
 10: (ThreadPool::worker(ThreadPool::WorkThread*)+0x641) [0x7fef86f20841]
 11: (ThreadPool::WorkThread::entry()+0x20) [0x7fef86f268c0]
 12: (Thread::entry_wrapper()+0x90) [0x7fef86ef6b10]
 13: (Thread::_entry_func(void*)+0x18) [0x7fef86ef6a68]
 14: /lib/x86_64-linux-gnu/libpthread.so.0(+0x9609) [0x7fef87ffb609]
 15: clone()

[1] https://github.com/ceph/ceph/pull/41323
[2] https://jenkins.ceph.com/job/ceph-pull-requests/84171/console

#4 Updated by Radoslaw Zarzynski 3 months ago

  • Status changed from New to Fix Under Review
  • Pull request ID set to 41323

#5 Updated by Radoslaw Zarzynski 3 months ago

  • Backport set to pacific,quincy

#6 Updated by Radoslaw Zarzynski 3 months ago

  • Status changed from Fix Under Review to Pending Backport

#7 Updated by Backport Bot 3 months ago

  • Copied to Backport #57288: pacific: OSDMap/OSDMapTest.BUG_51842/2: ThreadPool::WorkQueue<ParallelPGMapper::Item>::_void_dequeue: pure virtual method called added

#8 Updated by Backport Bot 3 months ago

  • Copied to Backport #57289: quincy: OSDMap/OSDMapTest.BUG_51842/2: ThreadPool::WorkQueue<ParallelPGMapper::Item>::_void_dequeue: pure virtual method called added

#9 Updated by Backport Bot 3 months ago

  • Tags set to backport_processed

#10 Updated by Mykola Golub 3 months ago

I don't understand why the PR ID for this bug was set to 41323. I mentioned PR 41323 not as the fix but as an example where this bug was observed while testing that PR.

#11 Updated by Radoslaw Zarzynski 3 months ago

  • Status changed from Pending Backport to New
  • Pull request ID deleted (41323)

Sorry, moving back to New.

#12 Updated by Kefu Chai 3 months ago

[ RUN      ] OSDMap/OSDMapTest.BUG_51842/2
ID  CLASS  WEIGHT   TYPE NAME             
-5         3.00000  root infra-1706       
-4         1.00000      host host-0       
 0         1.00000          osd.0         
-6         1.00000      host host-1       
 1         1.00000          osd.1         
-7         1.00000      host host-2       
 2         1.00000          osd.2         
-1               0  root default          
-3               0      rack localrack    
-2               0          host localhost

pure virtual method called
...
    -2> 2022-09-04T14:17:08.035+0000 7efd9e464640  1 BUG_40104::clean_upmap_tp worker finish
    -1> 2022-09-04T14:17:08.035+0000 7efd9d462640  1 BUG_40104::clean_upmap_tp worker finish
     0> 2022-09-04T14:17:08.219+0000 7efd9ac5d640 -1 *** Caught signal (Aborted) **

Spotted on the main branch.

#13 Updated by Radoslaw Zarzynski 3 months ago

  • Tags changed from backport_processed to test-failure
