Bug #7575
osd/ReplicatedPG.cc: 10600: FAILED assert(r >= 0): hit_set_persist() races with agent_load_hit_sets()
Description
2014-03-01 00:13:39.895495 7efefcde2700 10 osd.4 pg_epoch: 76 pg[4.1( v 74'1454 lc 17'513 (0'0,74'1454] local-les=76 n=49 ec=7 les/c 76/70 75/75/75) [4,2] r=0 lpr=75 pi=68-74/2 rops=5 crt=74'1454 lcod 0'0 mlcod 0'0 active+recovering m=49] hit_set_persist
2014-03-01 00:13:39.895612 7efefcde2700 20 osd.4 pg_epoch: 76 pg[4.1( v 74'1454 lc 17'513 (0'0,74'1454] local-les=76 n=49 ec=7 les/c 76/70 75/75/75) [4,2] r=0 lpr=75 pi=68-74/2 rops=5 crt=74'1454 lcod 0'0 mlcod 0'0 active+recovering m=49] get_hit_set_archive_object 1/hit_set_4.1_archive_2014-03-01 00:13:38.002740_2014-03-01 00:13:39.895570/head/.ceph-internal/4
2014-03-01 00:13:39.895522 7efefc5e1700 10 osd.4 76 start_recovery_op pg[3.3( v 74'258 lc 0'0 (0'0,74'258] local-les=76 n=136 ec=6 les/c 76/70 75/75/75) [4,0] r=0 lpr=75 pi=6-74/6 rops=2 crt=74'258 mlcod 0'0 active+recovering m=136] 92ad546f/plana622773-60/head//3 (14/15 rops)
2014-03-01 00:13:39.895692 7efefcde2700 20 osd.4 pg_epoch: 76 pg[4.1( v 74'1454 lc 17'513 (0'0,74'1454] local-les=76 n=49 ec=7 les/c 76/70 75/75/75) [4,2] r=0 lpr=75 pi=68-74/2 rops=5 crt=74'1454 lcod 0'0 mlcod 0'0 active+recovering m=49] hit_set_persist archive 1/hit_set_4.1_archive_2014-03-01 00:13:38.002740_2014-03-01 00:13:39.895570/head/.ceph-internal/4
2014-03-01 00:13:39.941560 7efef85d9700 10 osd.4 pg_epoch: 76 pg[4.1( v 76'1456 lc 17'513 (0'0,76'1456] local-les=76 n=51 ec=7 les/c 76/70 75/75/75) [4,2] r=0 lpr=75 pi=68-74/2 luod=74'1454 rops=6 crt=74'1454 lcod 0'0 mlcod 0'0 active+recovering m=49] agent_load_hit_sets
2014-03-01 00:13:39.941658 7efef85d9700 10 osd.4 pg_epoch: 76 pg[4.1( v 76'1456 lc 17'513 (0'0,76'1456] local-les=76 n=51 ec=7 les/c 76/70 75/75/75) [4,2] r=0 lpr=75 pi=68-74/2 luod=74'1454 rops=6 crt=74'1454 lcod 0'0 mlcod 0'0 active+recovering m=49] agent_load_hit_sets loading 2014-03-01 00:13:38.002740-2014-03-01 00:13:39.895570
2014-03-01 00:13:39.941752 7efef85d9700 20 osd.4 pg_epoch: 76 pg[4.1( v 76'1456 lc 17'513 (0'0,76'1456] local-les=76 n=51 ec=7 les/c 76/70 75/75/75) [4,2] r=0 lpr=75 pi=68-74/2 luod=74'1454 rops=6 crt=74'1454 lcod 0'0 mlcod 0'0 active+recovering m=49] get_hit_set_archive_object 1/hit_set_4.1_archive_2014-03-01 00:13:38.002740_2014-03-01 00:13:39.895570/head/.ceph-internal/4
2014-03-01 00:13:39.941832 7efef85d9700 15 filestore(/var/lib/ceph/osd/ceph-4) read 4.1_head/1/hit_set_4.1_archive_2014-03-01 00:13:38.002740_2014-03-01 00:13:39.895570/head/.ceph-internal/4 0~0
2014-03-01 00:13:39.941663 7efefcde2700 20 osd.4 pg_epoch: 76 pg[3.3( v 74'258 lc 0'0 (0'0,74'258] local-les=76 n=136 ec=6 les/c 76/70 75/75/75) [4,0] r=0 lpr=75 pi=6-74/6 rops=10 crt=74'258 mlcod 0'0 active+recovering m=136] op_has_sufficient_caps pool=3 (base ) owner=0 need_read_cap=1 need_write_cap=0 need_class_read_cap=0 need_class_write_cap=0 -> yes
2014-03-01 00:13:39.941873 7efef85d9700 10 filestore(/var/lib/ceph/osd/ceph-4) error opening file /var/lib/ceph/osd/ceph-4/current/4.1_head/hit\uset\u4.1\uarchive\u2014-03-01 00:13:38.002740\u2014-03-01 00:13:39.895570__head_00000001_.ceph-internal_4 with flags=2: (2) No such file or directory
2014-03-01 00:13:39.941887 7efef85d9700 10 filestore(/var/lib/ceph/osd/ceph-4) FileStore::read(4.1_head/1/hit_set_4.1_archive_2014-03-01 00:13:38.002740_2014-03-01 00:13:39.895570/head/.ceph-internal/4) open error: (2) No such file or directory
- TOO LATE CREATING THE ARCHIVE (the write below lands after the read above has already failed with ENOENT)
2014-03-01 00:13:39.948664 7eff07725700 15 filestore(/var/lib/ceph/osd/ceph-4) write 4.1_head/1/hit_set_4.1_archive_2014-03-01 00:13:38.002740_2014-03-01 00:13:39.895570/head/.ceph-internal/4 0~393 *********
2014-03-01 00:13:39.955756 7efef85d9700 -1 osd/ReplicatedPG.cc: In function 'void ReplicatedPG::agent_load_hit_sets()' thread 7efef85d9700 time 2014-03-01 00:13:39.941895
osd/ReplicatedPG.cc: 10600: FAILED assert(r >= 0)
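For illustration only, the ordering in the log above can be reproduced with a toy model: a write is queued by the persist path, a read on another path runs before the queued write is applied, and the read returns -ENOENT. The sketch below is not Ceph code and is not taken from the fix commit. ToyStore, ToyPG, and the in_flight map are hypothetical names, and keeping not-yet-applied archives in memory is just one possible way to avoid the failed read, shown under those assumptions.

```cpp
#include <cassert>
#include <cerrno>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an object store whose writes are queued
// first and only become visible to reads once they are applied.
struct ToyStore {
    std::map<std::string, std::string> applied;               // visible to reads
    std::vector<std::pair<std::string, std::string>> queued;  // not yet applied

    void queue_write(const std::string& oid, const std::string& data) {
        queued.emplace_back(oid, data);
    }

    void apply_queued() {
        for (const auto& w : queued) applied[w.first] = w.second;
        queued.clear();
    }

    // Returns the number of bytes read, or -ENOENT if the object is not visible.
    int read(const std::string& oid, std::string* out) const {
        auto it = applied.find(oid);
        if (it == applied.end()) return -ENOENT;
        *out = it->second;
        return static_cast<int>(out->size());
    }
};

// Hypothetical mitigation: remember archives whose write is still in
// flight and serve loads from memory until the write has been applied.
struct ToyPG {
    ToyStore store;
    std::map<std::string, std::string> in_flight;

    void persist(const std::string& oid, const std::string& data) {
        in_flight[oid] = data;
        store.queue_write(oid, data);
    }

    void on_applied() {
        store.apply_queued();
        in_flight.clear();
    }

    int load(const std::string& oid, std::string* out) const {
        auto it = in_flight.find(oid);
        if (it != in_flight.end()) {
            *out = it->second;
            return static_cast<int>(out->size());
        }
        return store.read(oid, out);
    }
};
```

Reading directly from ToyStore between queue_write() and apply_queued() gives the -ENOENT seen in the log, which is the value that trips assert(r >= 0). Going through ToyPG::load() in the same window returns the data from in_flight instead.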
Updated by David Zafman about 10 years ago
Log came from dzafman-2014-02-28_12:09:58-rados:thrash-wip-7458-testing-basic-plana/111634
Updated by Sage Weil about 10 years ago
- Status changed from New to 12
- Assignee set to Sage Weil
- Priority changed from Normal to Urgent
Updated by David Zafman about 10 years ago
- Status changed from 12 to 7
- Assignee changed from Sage Weil to David Zafman
Updated by David Zafman about 10 years ago
- Status changed from 7 to Resolved
135c27ec74be352416d06a9d0ad78e63cf477433