Activity

From 07/09/2019 to 08/07/2019

08/07/2019

11:56 PM Bug #41156 (In Progress): dump_float() poor output
David Zafman
10:22 PM Bug #41156 (Rejected): dump_float() poor output

dump_float("15min", 810 / 1000.0) outputs "15min": 0.8100000000000001
This was introduced in https://github.com/...
David Zafman
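
The reported value is what you would expect if the nearest double to 0.81 were printed with 16 significant digits. The sketch below illustrates that effect in Python; it is not Ceph's dump_float() code, and the 16-digit precision is an assumption inferred from the output quoted above.

```python
# Illustration only: the number of significant digits printed decides
# whether 0.81 comes back as "0.81". This is not Ceph's dump_float();
# the format strings are assumptions chosen to match the bug report.
value = 810 / 1000.0  # nearest double to 0.81

shortest = repr(value)           # shortest string that round-trips
sixteen = format(value, ".16g")  # 16 significant digits
fifteen = format(value, ".15g")  # 15 significant digits

print(shortest, sixteen, fifteen)
```

Under this assumption, printing the shortest round-trip form (or at most 15 significant digits) would avoid the trailing noise for values like this one.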
04:12 PM Bug #41154 (New): osd: pg unknown state
Hello. Yesterday my cluster went crazy and a zeroized action was sent for one pg.
osd.119 pg_epoch: 79413 pg[15.7c1( v 794...
Alexander Kazansky
08:16 AM Bug #41150 (Resolved): osd-scrub-test.sh: TEST_interval_changes failure
From http://pulpito.ceph.com/nojha-2019-08-07_01:40:44-rados-wip-lower-bfs-alloc-size-distro-basic-smithi/4190093/
...
Josh Durgin
08:12 AM Bug #41149 (New): LibRadosTwoPoolsPP.ManifestUnset failed with -22
From http://pulpito.ceph.com/nojha-2019-08-07_00:01:13-rados-wip-bluestore-monitor-allocations-distro-basic-smithi/41... Josh Durgin
12:02 AM Bug #40765 (In Progress): mimic: "Command failed (workunit test rados/test.sh)" in smoke/master/m...
Brad Hubbard

08/06/2019

08:59 PM Bug #41145 (Duplicate): osd: bad alloc exception
... Patrick Donnelly
04:04 PM Bug #24531: Mimic MONs have slow/long running ops
I am encountering similar issues on a cluster with all daemons running... Theo O
03:07 AM Bug #40765: mimic: "Command failed (workunit test rados/test.sh)" in smoke/master/mimic
Correcting my statement in comment #12.... Brad Hubbard

08/05/2019

11:04 PM Backport #41092 (Resolved): nautilus: rocksdb: enable rocksdb_rmrange=true by default and make de...
https://github.com/ceph/ceph/pull/29439 Patrick Donnelly
11:03 PM Backport #41086 (Resolved): mimic: Change default for bluestore_fsck_on_mount_deep as false
https://github.com/ceph/ceph/pull/29699 Patrick Donnelly
11:03 PM Backport #41085 (Rejected): luminous: Change default for bluestore_fsck_on_mount_deep as false
Patrick Donnelly
11:03 PM Backport #41084 (Resolved): nautilus: Change default for bluestore_fsck_on_mount_deep as false
https://github.com/ceph/ceph/pull/29697 Patrick Donnelly
06:47 PM Bug #41077 (New): The expected_num_objects parameter when creating the pool. Is it still needed w...
# ceph osd pool create testpool 4096
Error ERANGE: For better initial performance on pools expected to store a large...
Vikhyat Umrao
08:16 AM Bug #41065: new osd added to cluster upgraded from 13 to 14 will down after some days
log in per osd... hoan nv
04:09 AM Bug #41065 (Closed): new osd added to cluster upgraded from 13 to 14 will down after some days
Hi all.
My ceph cluster was upgraded from 13.2.5 to 14.2.2.
I did not enable mgr v2, and I added 2 new mons....
hoan nv
06:44 AM Bug #40765: mimic: "Command failed (workunit test rados/test.sh)" in smoke/master/mimic
I was discussing this with Josh Durgin and he noticed the reuse of the bufferlist for the read and mentioned we'd had... Brad Hubbard
05:36 AM Bug #41064: OSD: assert(objiter->second->version > last_divergent_update) fails when there is onl...
https://github.com/ceph/ceph/pull/29480 Xuehan Xu
04:07 AM Bug #41064 (New): OSD: assert(objiter->second->version > last_divergent_update) fails when there ...
Recently, some OSDs in one of our cluster failed to start after a power outage
One OSD's log is as follows:
<pr...
Xuehan Xu

08/02/2019

08:18 PM Backport #39516: nautilus: osd-backfill-space.sh test failed in TEST_backfill_multi_partial()
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/28187
merged
Yuri Weinstein
08:11 PM Backport #40625: nautilus: OSDs get killed by OOM due to a broken switch
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/29391
merged
Yuri Weinstein
06:51 PM Bug #41017 (Pending Backport): Change default for bluestore_fsck_on_mount_deep as false
Neha Ojha
06:14 PM Bug #38724: _txc_add_transaction error (39) Directory not empty not handled on operation 21 (op 1...
That OSD failure seems to have caused a cascade. Several more OSDs have crashed. 12% of objects were degraded, and I ... Nathan Fish
04:34 PM Bug #38724: _txc_add_transaction error (39) Directory not empty not handled on operation 21 (op 1...
Just lost another OSD on 14.2.2. The cluster is still mostly empty, but a large parallel cp is going on, so rebuild w... Nathan Fish
12:20 AM Bug #41052 (Resolved): nautilus: cbt cosbench workloads failing in rados/perf suite
Neha Ojha

08/01/2019

11:15 PM Bug #41052 (Fix Under Review): nautilus: cbt cosbench workloads failing in rados/perf suite
Neha Ojha
10:28 PM Bug #41052: nautilus: cbt cosbench workloads failing in rados/perf suite
I think we need to backport https://github.com/ceph/ceph/pull/28442 to nautilus since master seems to be fine http://... Neha Ojha
10:06 PM Bug #41052 (Resolved): nautilus: cbt cosbench workloads failing in rados/perf suite
Run: http://pulpito.ceph.com/yuriw-2019-07-31_23:02:48-rados-wip-yuri6-testing-2019-07-31-1929-nautilus-distro-basic-... Yuri Weinstein
10:08 PM Backport #40180: nautilus: qa/standalone/scrub/osd-scrub-snaps.sh sometimes fails
David Zafman wrote:
> https://github.com/ceph/ceph/pull/29252
merged
Yuri Weinstein
04:18 PM Bug #40835 (In Progress): OSDCap.PoolClassRNS test aborts
Kefu Chai
03:22 PM Bug #40765: mimic: "Command failed (workunit test rados/test.sh)" in smoke/master/mimic
Suspect also in => http://pulpito.ceph.com/teuthology-2019-08-01_02:25:03-upgrade:luminous-x-mimic-distro-basic-smithi/ Yuri Weinstein
05:41 AM Bug #40765: mimic: "Command failed (workunit test rados/test.sh)" in smoke/master/mimic
Simplified test case.... Brad Hubbard
03:42 AM Bug #40765: mimic: "Command failed (workunit test rados/test.sh)" in smoke/master/mimic
Ok, slowly closing in on this I think.... Brad Hubbard
12:12 AM Backport #40940 (In Progress): nautilus: Update rocksdb to v6.1.2
https://github.com/ceph/ceph/pull/29440 Neha Ojha

07/31/2019

11:55 PM Bug #40969: rocksdb: enable rocksdb_rmrange=true by default and make delete range optional on num...
nautilus backport: https://github.com/ceph/ceph/pull/29439 Neha Ojha
11:50 PM Bug #40969: rocksdb: enable rocksdb_rmrange=true by default and make delete range optional on num...
let's backport https://github.com/ceph/ceph/pull/27317 and https://github.com/ceph/ceph/pull/29323 Neha Ojha
11:15 PM Backport #40465: nautilus: osd beacon sometimes has empty pg list
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/29254
merged
Yuri Weinstein
11:13 PM Backport #39743: nautilus: mon: "FAILED assert(pending_finishers.empty())" when paxos restart
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/28528
merged
Yuri Weinstein
11:12 PM Backport #40382: nautilus: RuntimeError: expected MON_CLOCK_SKEW but got none
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/28576
merged
Yuri Weinstein
11:09 PM Backport #40274: nautilus: librados 'buffer::create' and related functions are not exported in C+...
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/29244
merged
Yuri Weinstein
09:30 PM Bug #40820: standalone/scrub/osd-scrub-test.sh +3 day failed assert
Can the test be improved so it doesn't assume maps propagate in a certain short time period, but waits for the releva... Josh Durgin
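
The suggestion above amounts to polling for a condition with a timeout instead of sleeping for a fixed period. A minimal sketch of that pattern follows, in Python rather than the test's shell; `wait_until` and the demo predicate are hypothetical names, not part of the Ceph test suite.

```python
import time
from typing import Callable


def wait_until(predicate: Callable[[], bool], timeout: float = 30.0,
               interval: float = 0.5) -> bool:
    """Poll predicate until it returns True or timeout seconds elapse.

    Returns True if the condition was observed, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Hypothetical stand-in for "the OSD has seen the new pool setting":
# the condition becomes true on the third check.
checks = {"count": 0}


def setting_propagated() -> bool:
    checks["count"] += 1
    return checks["count"] >= 3


ok = wait_until(setting_propagated, timeout=2.0, interval=0.01)
print(ok, checks["count"])
```

A test written this way fails only when the condition never becomes true within the timeout, not when propagation is merely slower than a fixed sleep.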
03:31 PM Bug #40720: mimic, nautilus: make bitmap allocator the default allocator for bluestore
Neha Ojha wrote:
> Luminous PR: https://github.com/ceph/ceph/pull/28972
merged
Yuri Weinstein
03:03 AM Feature #40955: Extend the scrub sleep time when the period is outside [osd_scrub_begin_hour, osd...
Updated logic:... Jeegn Chen
03:00 AM Feature #40955: Extend the scrub sleep time when the period is outside [osd_scrub_begin_hour, osd...
One more scenario from dzafman's comment:
@Jeegn-Chen Another way a scrub could happen even with scrub_time_permit() r...
Jeegn Chen
12:58 AM Bug #40765: mimic: "Command failed (workunit test rados/test.sh)" in smoke/master/mimic
Client:... Brad Hubbard
12:12 AM Bug #41017 (Fix Under Review): Change default for bluestore_fsck_on_mount_deep as false
Neha Ojha
12:09 AM Bug #41017: Change default for bluestore_fsck_on_mount_deep as false
Neha - as discussed I have created the tracker and assigned it to you. Vikhyat Umrao
12:08 AM Bug #41017 (Resolved): Change default for bluestore_fsck_on_mount_deep as false
RHBZ - https://bugzilla.redhat.com/show_bug.cgi?id=1734585 Vikhyat Umrao

07/30/2019

11:54 PM Bug #41016 (Resolved): Improve upmap change reporting in logs
1. do not silently skip mappings in _apply_upmap() or anywhere else, when they aren't going to be applied
2. maybe_r...
Vikhyat Umrao
11:40 PM Backport #40744 (Resolved): nautilus: core: lazy omap stat collection
Brad Hubbard
10:12 PM Backport #40744: nautilus: core: lazy omap stat collection
Brad Hubbard wrote:
> https://github.com/ceph/ceph/pull/29188
merged
Yuri Weinstein
10:26 PM Backport #40652: nautilus: os/bluestore: fix >2GB writes
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/28966
merged
Yuri Weinstein
10:25 PM Backport #40652: nautilus: os/bluestore: fix >2GB writes
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/28966
merged
Yuri Weinstein
10:21 PM Backport #40655: nautilus: Lower the default value of osd_deep_scrub_large_omap_object_key_threshold
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/29173
merged
Yuri Weinstein
10:17 PM Backport #40667: nautilus: PG scrub stamps reset to 0.000000
David Zafman wrote:
> https://github.com/ceph/ceph/pull/28869
merged
Yuri Weinstein
10:16 PM Backport #40730: nautilus: mon: auth mon isn't loading full KeyServerData after restart
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/28993
merged
Yuri Weinstein
10:16 PM Backport #39693: nautilus: _txc_add_transaction error (39) Directory not empty not handled on ope...
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/29115
merged
Yuri Weinstein
06:50 PM Feature #40640 (Fix Under Review): Network ping monitoring
Neha Ojha
04:09 PM Backport #39692: mimic: _txc_add_transaction error (39) Directory not empty not handled on operat...
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/29217
merged
Yuri Weinstein
11:13 AM Tasks #40937: Problem "open vSwitch" networkbond set_numa_affinity
During the reboot the following messages are in the log:... Mario Hosse
07:59 AM Tasks #40937: Problem "open vSwitch" networkbond set_numa_affinity
It's all working so far. The question is, does the cpu assignment work? To do this I manually set the following comma... Mario Hosse
11:06 AM Bug #23402: objecter: does not resend op on split interval
duplicated by: https://tracker.ceph.com/issues/22544 mingxin liu
07:26 AM Documentation #41004 (In Progress): doc: pg_num should always be a power of two
Sebastian Wagner
06:21 AM Documentation #41004: doc: pg_num should always be a power of two
https://github.com/ceph/ceph/pull/29364 Kai Wagner
06:20 AM Documentation #41004 (Resolved): doc: pg_num should always be a power of two
Hi,
I updated the pg_num section in the docs just a little to be more strict.
I think we should make it crystal c...
Kai Wagner
06:37 AM Backport #40625 (In Progress): nautilus: OSDs get killed by OOM due to a broken switch
https://github.com/ceph/ceph/pull/29391 Prashant D

07/29/2019

09:09 PM Tasks #40937: Problem "open vSwitch" networkbond set_numa_affinity
Does this visibly break anything or is it just a message in the logs? Greg Farnum
07:33 PM Bug #40998 (Can't reproduce): ceph-objectstore-tool remove broken
After rebuilding the problem went away. David Zafman
05:57 PM Bug #40998 (Can't reproduce): ceph-objectstore-tool remove broken

It seems like the remove function no longer works properly with snaps. Test errors are detected with OSDs down the...
David Zafman
12:23 PM Bug #40791 (Closed): high variance in pg size
https://github.com/ceph/ceph/pull/29364
closing this.
Jan Fajerski
10:01 AM Bug #40994 (New): unittest_erasure_code_shec_all Failed with Timeout
https://jenkins.ceph.com/job/ceph-pull-requests/30238/console... Sebastian Wagner
09:45 AM Backport #40993 (Rejected): mimic: Ceph status in some cases does not report slow ops
We had 2 instances running 13.2.6 that did not report the slow ops of failing disks.
This is from 1 cluster:
<p...
Theofilos Mouratidis
06:38 AM Backport #40537 (In Progress): nautilus: osd/PG.cc: 2410: FAILED ceph_assert(scrub_queued)
https://github.com/ceph/ceph/pull/29372 Prashant D

07/28/2019

11:39 AM Bug #40825 (Duplicate): test_osd_came_back (tasks.mgr.test_progress.TestProgress) ... FAIL
Kefu Chai

07/26/2019

10:18 AM Bug #40765: mimic: "Command failed (workunit test rados/test.sh)" in smoke/master/mimic
I dumped out the log (500 lines, attached) almost exclusively this sequence so we're not communicating with osd.3.
...
Brad Hubbard
06:58 AM Bug #40765: mimic: "Command failed (workunit test rados/test.sh)" in smoke/master/mimic
Looking at a live process it doesn't seem to be a deadlock or "hang" but more like some sort of livelock where the 'm... Brad Hubbard
06:29 AM Bug #40765: mimic: "Command failed (workunit test rados/test.sh)" in smoke/master/mimic
Still working on this but steps to reproduce...... Brad Hubbard
08:32 AM Bug #40637 (Resolved): osd: report omap/data/metadata usage
Nathan Cutler
08:32 AM Backport #40638 (Resolved): luminous: osd: report omap/data/metadata usage
Nathan Cutler
08:30 AM Backport #40940 (Need More Info): nautilus: Update rocksdb to v6.1.2
Nathan Cutler
06:43 AM Feature #40955 (Fix Under Review): Extend the scrub sleep time when the period is outside [osd_sc...
Kefu Chai
04:03 AM Feature #40955: Extend the scrub sleep time when the period is outside [osd_scrub_begin_hour, osd...
PR: https://github.com/ceph/ceph/pull/29342 Jeegn Chen
01:30 AM Bug #40969 (Resolved): rocksdb: enable rocksdb_rmrange=true by default and make delete range opti...
Neha Ojha

07/25/2019

09:45 PM Backport #40638: luminous: osd: report omap/data/metadata usage
Josh Durgin wrote:
> https://github.com/ceph/ceph/pull/28851
merged
Yuri Weinstein
07:44 PM Backport #40940: nautilus: Update rocksdb to v6.1.2
We want this to bake in master for a while. Neha Ojha
08:56 AM Backport #40940 (Resolved): nautilus: Update rocksdb to v6.1.2
https://github.com/ceph/ceph/pull/29440 Nathan Cutler
04:24 PM Bug #40963 (Resolved): mimic: MQuery during Deleting state
... Sage Weil
12:29 PM Feature #40955 (Resolved): Extend the scrub sleep time when the period is outside [osd_scrub_begi...
We already have osd_scrub_begin_week_day, osd_scrub_end_week_day, osd_scrub_begin_hour and osd_scrub_end_hour to tell... Jeegn Chen
12:10 PM Backport #40654 (Resolved): mimic: Lower the default value of osd_deep_scrub_large_omap_object_ke...
Nathan Cutler
12:09 PM Backport #38552 (Resolved): mimic: core: lazy omap stat collection
Nathan Cutler
12:01 PM Bug #36739 (Resolved): ENOENT in collection_move_rename on EC backfill target
Nathan Cutler
12:01 PM Backport #38880 (Resolved): luminous: ENOENT in collection_move_rename on EC backfill target
Nathan Cutler
12:00 PM Bug #39006 (Resolved): ceph tell osd.xx bench help : gives wrong help
Nathan Cutler
11:59 AM Backport #39373 (Resolved): luminous: ceph tell osd.xx bench help : gives wrong help
Nathan Cutler
10:32 AM Feature #39339: prioritize backfill of metadata pools, automatically
since this is only going to be backported to nautilus and since there are two PRs involved, and since one of those PR... Nathan Cutler
08:57 AM Backport #40949 (Resolved): mimic: Better default value for osd_snap_trim_sleep
https://github.com/ceph/ceph/pull/29732 Nathan Cutler
08:57 AM Backport #40948 (Resolved): nautilus: Better default value for osd_snap_trim_sleep
https://github.com/ceph/ceph/pull/29678 Nathan Cutler
08:57 AM Backport #40947 (Resolved): luminous: Better default value for osd_snap_trim_sleep
https://github.com/ceph/ceph/pull/31857 Nathan Cutler
08:56 AM Backport #40943 (Resolved): mimic: mon/OSDMonitor.cc: better error message about min_size
https://github.com/ceph/ceph/pull/29618 Nathan Cutler
08:56 AM Backport #40942 (Resolved): nautilus: mon/OSDMonitor.cc: better error message about min_size
https://github.com/ceph/ceph/pull/29617 Nathan Cutler
08:56 AM Backport #40941 (Rejected): luminous: mon/OSDMonitor.cc: better error message about min_size
Nathan Cutler
07:28 AM Tasks #40937 (New): Problem "open vSwitch" networkbond set_numa_affinity
Hello,
after installing ceph 14.2.1 (Proxmox 6.0-4-6.0) I have the following example message in syslog when starti...
Mario Hosse

07/24/2019

11:01 PM Backport #40654: mimic: Lower the default value of osd_deep_scrub_large_omap_object_key_threshold
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/29174
merged
Yuri Weinstein
10:59 PM Backport #38552: mimic: core: lazy omap stat collection
Brad Hubbard wrote:
> https://github.com/ceph/ceph/pull/29189
merged
Yuri Weinstein
10:43 PM Backport #38880: luminous: ENOENT in collection_move_rename on EC backfill target
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/28110
merged
Yuri Weinstein
10:43 PM Backport #39373: luminous: ceph tell osd.xx bench help : gives wrong help
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/28112
merged
Yuri Weinstein
10:06 PM Feature #39339 (In Progress): prioritize backfill of metadata pools, automatically
Sorry, https://github.com/ceph/ceph/pull/29181 is yet to merge. Neha Ojha
10:03 PM Feature #39339 (Pending Backport): prioritize backfill of metadata pools, automatically
One backport for nautilus: https://github.com/ceph/ceph/pull/29275 Neha Ojha
09:16 PM Bug #40785 (Need More Info): In case of osd full scenario 100% pgs went to unknown state, when ad...
Which ceph version are you running? Can you provide the "ceph -s" output? Neha Ojha
08:27 PM Bug #40791: high variance in pg size
Jan Fajerski wrote:
> Greg Farnum wrote:
> > PGs split by splitting their hash range in half. So if you have not-a-...
Greg Farnum
11:07 AM Bug #40791: high variance in pg size
Greg Farnum wrote:
> PGs split by splitting their hash range in half. So if you have not-a-power-of-two, some of the...
Jan Fajerski
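
The point about non-power-of-two PG counts can be made concrete with a small simulation. The sketch below is a simplified Python model of a stable-mod mapping (Ceph has a helper named ceph_stable_mod; this port and the `pg_shares` function are illustrative assumptions, not the actual placement code). It counts how many slots of a uniform hash range land in each PG.

```python
from collections import Counter


def stable_mod(x: int, b: int, bmask: int) -> int:
    """Map hash x onto b buckets; bmask is (next power of two >= b) - 1."""
    if (x & bmask) < b:
        return x & bmask
    # Buckets that do not exist yet fold back onto their unsplit parent.
    return x & (bmask >> 1)


def pg_shares(pg_num: int) -> Counter:
    """Count how many slots of a uniform hash range map to each PG."""
    bmask = (1 << (pg_num - 1).bit_length()) - 1
    return Counter(stable_mod(x, pg_num, bmask) for x in range(bmask + 1))


shares = pg_shares(12)
for pg in sorted(shares):
    print(pg, shares[pg])
```

With 12 PGs in this model, PGs 4 to 7 each cover two hash slots while the others cover one, so some PGs hold roughly twice as much data as the rest. With 16 PGs every PG covers exactly one slot.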
01:34 PM Bug #40825: test_osd_came_back (tasks.mgr.test_progress.TestProgress) ... FAIL
... Sage Weil
11:12 AM Backport #40502 (Need More Info): luminous: osd: rollforward may need to mark pglog dirty
Is this a follow-on fix for https://github.com/ceph/ceph/pull/27015 which is only in master? Please clarify.
Marki...
Nathan Cutler
11:12 AM Backport #40503 (Need More Info): mimic: osd: rollforward may need to mark pglog dirty
Is this a follow-on fix for https://github.com/ceph/ceph/pull/27015 which is only in master? Please clarify.
Marki...
Nathan Cutler
11:11 AM Backport #40504 (Need More Info): nautilus: osd: rollforward may need to mark pglog dirty
Is this a follow-on fix for https://github.com/ceph/ceph/pull/27015 which is only in master? Please clarify.
Marki...
Nathan Cutler
11:11 AM Bug #40403: osd: rollforward may need to mark pglog dirty
Is this a follow-on fix for https://github.com/ceph/ceph/pull/27015 which is only in master? Please clarify.
Marki...
Nathan Cutler
11:05 AM Backport #40465 (In Progress): nautilus: osd beacon sometimes has empty pg list
Nathan Cutler
11:04 AM Backport #40464 (In Progress): mimic: osd beacon sometimes has empty pg list
Nathan Cutler
11:03 AM Backport #40180 (In Progress): nautilus: qa/standalone/scrub/osd-scrub-snaps.sh sometimes fails
Nathan Cutler
11:02 AM Backport #40179 (In Progress): mimic: qa/standalone/scrub/osd-scrub-snaps.sh sometimes fails
Nathan Cutler
10:59 AM Backport #38856 (In Progress): mimic: should set EPOLLET flag on del_event()
Nathan Cutler
10:59 AM Backport #38852 (In Progress): mimic: .mgrstat failed to decode mgrstat state; luminous dev version?
Nathan Cutler
10:56 AM Backport #38436 (In Progress): luminous: crc cache should be invalidated when posting preallocate...
Nathan Cutler
10:54 AM Backport #38437 (In Progress): mimic: crc cache should be invalidated when posting preallocated r...
Nathan Cutler
10:50 AM Backport #38351 (In Progress): mimic: Limit loops waiting for force-backfill/force-recovery to ha...
Nathan Cutler
10:49 AM Backport #40274 (In Progress): nautilus: librados 'buffer::create' and related functions are not ...
https://github.com/ceph/ceph/pull/29244 Prashant D
10:30 AM Documentation #38896 (Resolved): Minor rados related documentation fixes
Nathan Cutler
10:29 AM Backport #38902 (Resolved): luminous: Minor rados related documentation fixes
Nathan Cutler
10:27 AM Backport #38610 (In Progress): luminous: mon: osdmap prune
Nathan Cutler
09:11 AM Backport #38277 (In Progress): mimic: osd_map_message_max default is too high?
Nathan Cutler
09:04 AM Backport #38206 (In Progress): mimic: osds allows to partially start more than N+2
Nathan Cutler
08:56 AM Backport #38163 (Need More Info): mimic: maybe_remove_pg_upmaps incorrectly cancels valid pending...
part of a complicated, interrelated set of PRs - assigning to the author of the luminous backport https://github.com/... Nathan Cutler
08:17 AM Bug #40835: OSDCap.PoolClassRNS test aborts
Brad Hubbard
01:24 AM Feature #40528 (Pending Backport): Better default value for osd_snap_trim_sleep
Kefu Chai
12:12 AM Bug #40915 (Pending Backport): Update rocksdb to v6.1.2
Neha Ojha

07/23/2019

09:16 PM Bug #40915 (Fix Under Review): Update rocksdb to v6.1.2
Neha Ojha
09:16 PM Bug #40915 (Resolved): Update rocksdb to v6.1.2
Neha Ojha
08:10 PM Bug #40910 (Resolved): mon/OSDMonitor.cc: better error message about min_size
Neha Ojha
07:49 PM Bug #38724: _txc_add_transaction error (39) Directory not empty not handled on operation 21 (op 1...
It seems this will be in 14.2.3. When the fix comes out, will my crashed OSDs work again, or should I just purge and ... Nathan Fish
03:10 PM Backport #39694 (Need More Info): luminous: _txc_add_transaction error (39) Directory not empty n...
non-trivial backport Nathan Cutler
03:01 PM Backport #39692 (In Progress): mimic: _txc_add_transaction error (39) Directory not empty not han...
Nathan Cutler
09:42 AM Backport #38450 (Need More Info): mimic: src/osd/OSDMap.h: 1065: FAILED assert(__null != pool)
A naive backport - https://github.com/ceph/ceph/pull/26594 - was closed because it resulted in the following build fa... Nathan Cutler
09:06 AM Bug #40198 (Resolved): Setting noscrub causing extraneous deep scrubs
Nathan Cutler
09:06 AM Backport #40265 (Resolved): nautilus: Setting noscrub causing extraneous deep scrubs
Nathan Cutler
08:24 AM Backport #40891 (Resolved): nautilus: Pool settings aren't populated to OSD after restart.
https://github.com/ceph/ceph/pull/32123 Nathan Cutler
08:24 AM Backport #40890 (Resolved): mimic: Pool settings aren't populated to OSD after restart.
https://github.com/ceph/ceph/pull/32125 Nathan Cutler
08:24 AM Backport #40889 (Rejected): luminous: Pool settings aren't populated to OSD after restart.
Nathan Cutler
08:23 AM Backport #40885 (Resolved): nautilus: ceph mgr module ls -f plain crashes mon
https://github.com/ceph/ceph/pull/29566 Nathan Cutler
08:23 AM Backport #40884 (Resolved): mimic: ceph mgr module ls -f plain crashes mon
https://github.com/ceph/ceph/pull/29593 Nathan Cutler
08:23 AM Backport #40883 (Rejected): luminous: ceph mgr module ls -f plain crashes mon
Nathan Cutler
08:22 AM Backport #39475 (Resolved): mimic: segv in fgets() in collect_sys_info reading /proc/cpuinfo
Nathan Cutler
08:18 AM Backport #40650 (Resolved): luminous: os/bluestore: fix >2GB writes
Nathan Cutler
08:18 AM Backport #40651 (Resolved): mimic: os/bluestore: fix >2GB writes
Nathan Cutler
08:16 AM Bug #40720 (In Progress): mimic, nautilus: make bitmap allocator the default allocator for bluestore
Changing status to "In Progress" so the backport-create-issue script doesn't create backport issues. Nathan Cutler
08:15 AM Bug #40720: mimic, nautilus: make bitmap allocator the default allocator for bluestore
master/nautilus PR containing the commit to be backported: https://github.com/ceph/ceph/pull/21825
mimic backport PR...
Nathan Cutler
08:10 AM Bug #38682 (Resolved): should report EINVAL in ErasureCode::parse() if m<=0
Nathan Cutler
08:10 AM Backport #38751 (Resolved): mimic: should report EINVAL in ErasureCode::parse() if m<=0
Nathan Cutler
04:37 AM Backport #38551 (In Progress): luminous: core: lazy omap stat collection
https://github.com/ceph/ceph/pull/29190 Brad Hubbard
03:44 AM Backport #38552 (In Progress): mimic: core: lazy omap stat collection
https://github.com/ceph/ceph/pull/29189 Brad Hubbard
03:14 AM Backport #40744 (In Progress): nautilus: core: lazy omap stat collection
Nautilus only requires https://github.com/ceph/ceph/pull/28070 as it already has https://github.com/ceph/ceph/pull/26... Brad Hubbard
12:15 AM Feature #39339 (Fix Under Review): prioritize backfill of metadata pools, automatically
https://github.com/ceph/ceph/pull/29180
https://github.com/ceph/ceph/pull/29181
Neha Ojha

07/22/2019

09:12 PM Backport #40265: nautilus: Setting noscrub causing extraneous deep scrubs
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/28768
merged
Yuri Weinstein
07:01 PM Bug #40720 (Pending Backport): mimic, nautilus: make bitmap allocator the default allocator for b...
Luminous PR: https://github.com/ceph/ceph/pull/28972 Neha Ojha
06:40 PM Bug #40804 (Pending Backport): ceph mgr module ls -f plain crashes mon
Sage Weil
06:40 PM Bug #40483 (Pending Backport): Pool settings aren't populated to OSD after restart.
Sage Weil
06:38 PM Bug #40635 (Resolved): IndexError: list index out of range in thrash_pg_upmap
Sage Weil
05:37 PM Feature #40870 (Resolved): Implement mon_memory_target
Use the priority cache tuner for mon caches. Also implement a config observer to handle changes to mon cache sizes. Neha Ojha
05:28 PM Backport #40653 (In Progress): luminous: Lower the default value of osd_deep_scrub_large_omap_obj...
https://github.com/ceph/ceph/pull/29175 Neha Ojha
05:25 PM Backport #40654 (In Progress): mimic: Lower the default value of osd_deep_scrub_large_omap_object...
https://github.com/ceph/ceph/pull/29174 Neha Ojha
05:21 PM Backport #40655 (In Progress): nautilus: Lower the default value of osd_deep_scrub_large_omap_obj...
https://github.com/ceph/ceph/pull/29173 Neha Ojha
03:31 PM Bug #40868 (New): src/common/config_proxy.h: 70: FAILED ceph_assert(p != obs_call_gate.end())
qa/workunits/rbd/test_librbd_python.sh failure... Sage Weil
02:01 PM Backport #39513 (Resolved): mimic: osd: segv in _preboot -> heartbeat
Nathan Cutler
01:58 PM Backport #39311 (Resolved): mimic: crushtool crash on Fedora 28 and newer
Nathan Cutler
01:56 PM Backport #39374 (Resolved): mimic: ceph tell osd.xx bench help : gives wrong help
Nathan Cutler
01:56 PM Bug #39154 (Resolved): Don't mark removed osds in when running "ceph osd in any|all|*"
Nathan Cutler
01:56 PM Backport #39422 (Resolved): mimic: Don't mark removed osds in when running "ceph osd in any|all|*"
Nathan Cutler
01:55 PM Bug #38034 (Resolved): pg stuck in backfill_wait with plenty of disk space
Nathan Cutler
01:55 PM Backport #38341 (Resolved): mimic: pg stuck in backfill_wait with plenty of disk space
Nathan Cutler
10:39 AM Backport #40840 (Need More Info): nautilus: Explicitly requested repair of an inconsistent PG can...
non-trivial because it depends on d938b28565c801b1a6de8e8ce585f2389595311b which itself does not apply to nautilus cl... Nathan Cutler
08:20 AM Backport #40840 (Resolved): nautilus: Explicitly requested repair of an inconsistent PG cannot be...
https://github.com/ceph/ceph/pull/29748 Nathan Cutler
10:30 AM Backport #40639 (Resolved): mimic: osd: report omap/data/metadata usage
Nathan Cutler
09:52 AM Backport #40744 (Need More Info): nautilus: core: lazy omap stat collection
Requires backport of https://github.com/ceph/ceph/pull/26614 and https://github.com/ceph/ceph/pull/28070 - the latter... Nathan Cutler
09:52 AM Backport #38552 (Need More Info): mimic: core: lazy omap stat collection
-not sure of the status here?-
https://github.com/ceph/ceph/pull/28070 is non-trivial, hence assigned to the devel...
Nathan Cutler
09:51 AM Backport #38551 (Need More Info): luminous: core: lazy omap stat collection
-not sure of the status here?-
https://github.com/ceph/ceph/pull/28070 is non-trivial, hence assigned to the devel...
Nathan Cutler
04:03 AM Bug #40835 (New): OSDCap.PoolClassRNS test aborts
Brad Hubbard
02:59 AM Bug #40835 (Can't reproduce): OSDCap.PoolClassRNS test aborts
Brad Hubbard
12:01 AM Bug #40835: OSDCap.PoolClassRNS test aborts
... Brad Hubbard
12:00 AM Bug #40835 (Resolved): OSDCap.PoolClassRNS test aborts
... Brad Hubbard

07/19/2019

07:50 PM Bug #40635: IndexError: list index out of range in thrash_pg_upmap
https://github.com/ceph/ceph/pull/29144 too Sage Weil
07:35 PM Bug #40777: hit assert in AuthMonitor::update_from_paxos
Ah, ENOENT might be a code bug. Unless you have debug logs of the monitor from when it was writing that data to disk ... Greg Farnum
07:31 PM Bug #40791: high variance in pg size
Lars Marowsky-Brée wrote:
> This is Luminous, 12.2.12 by now.
>
> Balancing on bytes (reweight-by-utilization) wa...
Greg Farnum
12:30 PM Bug #40791: high variance in pg size
Jan Fajerski wrote:
> Greg Farnum wrote:
> > It sure looks like the PG count isn't a power of two, so some of them ...
Jan Fajerski
06:44 AM Bug #40791: high variance in pg size
Greg Farnum wrote:
> It sure looks like the PG count isn't a power of two, so some of them are simply half size comp...
Jan Fajerski
06:24 PM Bug #38724: _txc_add_transaction error (39) Directory not empty not handled on operation 21 (op 1...
Edward Kalk wrote:
> Sometimes when this happens, the OSDs repeatedly crash and Linux system prevents them from bein...
Nathan Fish
11:18 AM Bug #40831 (New): compression segfaults with zstd 1.3.8 and incompatibilities with zstd 1.4.0
Hey,
I'm currently working on packaging ceph 14.2.1 for Arch Linux (still some kinks to work out, once that is don...
Thore Bödecker
10:46 AM Bug #24835: osd daemon spontaneous segfault
Thanks for the response Soenke. If we haven't seen it again in a few months time I guess we can close this. Brad Hubbard
10:40 AM Bug #24835: osd daemon spontaneous segfault
We haven't seen this bug for 6 weeks now after updating to Nautilus (and changing configuration from ceph.conf to cep... Soenke Schippmann
02:42 AM Bug #24835: osd daemon spontaneous segfault
Soenke or Christian,
Are you still seeing this issue?
Brad Hubbard
08:15 AM Bug #24419: ceph-objectstore-tool unable to open mon store
I have the same problem.
yite gu
02:38 AM Bug #36250 (Can't reproduce): ceph-osd process crashing
Brad Hubbard
02:36 AM Bug #38892: /ceph/src/tools/kvstore_tool.cc:266:1: internal compiler error: Segmentation fault
Brad Hubbard

07/18/2019

10:01 PM Bug #38841: Objects degraded higher than 100%

The number of degraded objects is based on object replicas not the number of objects. So let's say every pool is h...
David Zafman
09:51 PM Bug #40825: test_osd_came_back (tasks.mgr.test_progress.TestProgress) ... FAIL
Josh Durgin
09:36 PM Bug #40825 (Duplicate): test_osd_came_back (tasks.mgr.test_progress.TestProgress) ... FAIL
the test marks osd out, verifies a progress event is there, then marks it in, and asserts that there are no progress ... Sage Weil
07:52 PM Backport #39475: mimic: segv in fgets() in collect_sys_info reading /proc/cpuinfo
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/28206
merged
Yuri Weinstein
07:49 PM Backport #40651: mimic: os/bluestore: fix >2GB writes
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/28967
merged
Yuri Weinstein
07:48 PM Bug #40720: mimic, nautilus: make bitmap allocator the default allocator for bluestore
merged https://github.com/ceph/ceph/pull/28970 Yuri Weinstein
07:47 PM Backport #38751: mimic: should report EINVAL in ErasureCode::parse() if m<=0
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/28995
merged
Yuri Weinstein
07:45 PM Backport #39693 (In Progress): nautilus: _txc_add_transaction error (39) Directory not empty not ...
https://github.com/ceph/ceph/pull/29115 Sage Weil
04:45 PM Bug #40820: standalone/scrub/osd-scrub-test.sh +3 day failed assert

The prior osdmap was issuing these messages for 7 seconds....
David Zafman
04:42 PM Bug #40820: standalone/scrub/osd-scrub-test.sh +3 day failed assert

This test gives the mon 2 seconds to propagate changes. A scrub_min_interval change to a pool probably didn't reac...
David Zafman
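As a generic sketch of the alternative to a fixed 2-second wait, a test can poll for the expected state until a deadline. This is not what osd-scrub-test.sh does (that test is a shell script); `wait_until` and its parameters are hypothetical names used only for illustration.

```python
import time
from typing import Callable


def wait_until(predicate: Callable[[], bool], timeout: float,
               interval: float = 0.1) -> bool:
    """Poll predicate until it returns True or timeout seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

The caller would pass a predicate that checks whether the changed setting has reached the OSD, and fail the test only if `wait_until` returns False.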
04:01 PM Bug #40820 (Closed): standalone/scrub/osd-scrub-test.sh +3 day failed assert
... Sage Weil
02:07 PM Bug #40755 (Resolved): _txc_add_transaction error (2) No such file or directory not handled on op...
Sage Weil
04:24 AM Bug #40777: hit assert in AuthMonitor::update_from_paxos
Greg Farnum wrote:
> That assert means there was a read error when the monitor tried to get data off of disk. Check ...
sdkfzv sdkfzv
04:03 AM Bug #40410: ceph pg query Segmentation fault in 12.2.10
I understand, thanks Han. Brad Hubbard
12:16 AM Bug #39304 (Resolved): short pg log+nautilus-p2p-stress-split: "Error: finished tid 3 when last_a...
David Zafman
12:16 AM Backport #39720 (Resolved): mimic: short pg log+nautilus-p2p-stress-split: "Error: finished tid 3...
David Zafman

07/17/2019

11:01 PM Backport #39513: mimic: osd: segv in _preboot -> heartbeat
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/28220
merged
Yuri Weinstein
10:44 PM Bug #39152: nautilus osd crash: Caught signal (Aborted) tp_osd_tp
Once this is backported and released (#39693), we should confirm this fixes the problematic OSD. Sage Weil
10:29 PM Bug #40809 (New): qa: "Failed to send signal 1: None" in rados
Run: http://pulpito.ceph.com/yuriw-2019-07-15_19:24:27-rados-wip-yuri4-testing-2019-07-15-1517-mimic-distro-basic-smi... Yuri Weinstein
10:18 PM Backport #39311: mimic: crushtool crash on Fedora 28 and newer
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/27986
merged
Yuri Weinstein
10:17 PM Backport #39720: mimic: short pg log+nautilus-p2p-stress-split: "Error: finished tid 3 when last_...
David Zafman wrote:
> https://github.com/ceph/ceph/pull/28089
merged
Yuri Weinstein
10:17 PM Bug #40791: high variance in pg size
This is Luminous, 12.2.12 by now.
Balancing on bytes (reweight-by-utilization) was unable to resolve the issue pre...
Lars Marowsky-Brée
09:16 PM Bug #40791: high variance in pg size
It sure looks like the PG count isn't a power of two, so some of them are simply half size compared to the others. (S... Greg Farnum
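A small sketch of why a non-power-of-two PG count gives uneven PG sizes, assuming placement follows the stable-mod scheme used in the Ceph source (`ceph_stable_mod`). Sequential integers stand in for object hashes here, and `pg_share` is a hypothetical helper, not a Ceph API.

```python
from collections import Counter


def ceph_stable_mod(x: int, b: int, bmask: int) -> int:
    """Map hash x onto b buckets, where bmask is the next power of two minus 1."""
    if (x & bmask) < b:
        return x & bmask
    return x & (bmask >> 1)


def pg_share(pg_num: int, samples: int = 1600) -> Counter:
    """Count how many of `samples` evenly spread hashes land in each PG."""
    bmask = (1 << (pg_num - 1).bit_length()) - 1
    return Counter(ceph_stable_mod(x, pg_num, bmask) for x in range(samples))


shares = pg_share(12)
for pg in sorted(shares):
    print(pg, shares[pg])
```

With 12 PGs, the hashes that would have gone to the not-yet-split PGs 12 to 15 fold back onto PGs 4 to 7, so those four hold twice as many objects as the other eight. With 16 PGs every PG receives the same share.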
09:15 PM Bug #40791 (Need More Info): high variance in pg size
Which ceph version are you using? Neha Ojha
10:17 PM Backport #39374: mimic: ceph tell osd.xx bench help : gives wrong help
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/28097
merged
Yuri Weinstein
10:16 PM Backport #39422: mimic: Don't mark removed osds in when running "ceph osd in any|all|*"
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/28142
merged
Yuri Weinstein
10:13 PM Backport #38341: mimic: pg stuck in backfill_wait with plenty of disk space
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/28201
merged
Yuri Weinstein
09:33 PM Bug #23879: test_mon_osdmap_prune.sh fails

Another time on mimic so I assume Nautilus needs a fix too.
http://qa-proxy.ceph.com/teuthology/yuriw-2019-07-09_1...
David Zafman
09:26 PM Bug #40765: mimic: "Command failed (workunit test rados/test.sh)" in smoke/master/mimic
... Sage Weil
09:19 PM Bug #40726: "OSD::osd_op_tp thread 0x7f6dafcf0700' had timed out after 15"
This happens occasionally on Mira nodes; but if it pops up repeatedly on the same node or test suite that may be evid... Greg Farnum
09:15 PM Bug #40774 (Resolved): mon: interval_set.h: 490: FAILED ceph_assert(p->first > start+len)
Neha Ojha
09:12 PM Bug #40777 (Closed): hit assert in AuthMonitor::update_from_paxos
That assert means there was a read error when the monitor tried to get data off of disk. Check your disk! Greg Farnum
08:56 PM Bug #38238: rados/test.sh: api_aio_pp doesn't seem to start

http://qa-proxy.ceph.com/teuthology/yuriw-2019-07-09_15:21:18-rados-wip-yuri-testing-2019-07-08-2007-mimic-distro-b...
David Zafman
08:56 PM Bug #40070 (Rejected): mon/OSDMonitor: target_size_bytes integer overflow
This is by design. The target_size is new in nautilus, so we don't encode it in the map until require_osd_release >=... Sage Weil
08:54 PM Bug #40081: mon: luminous crash attempting to decode maps after nautilus quorum has been formed
https://github.com/ceph/ceph/pull/28672 (nautilus backport PR) Sage Weil
08:43 PM Bug #40000: osds do not bound xattrs and/or aggregate xattr data in pg log
from the ML,... Sage Weil
08:42 PM Bug #40000 (Need More Info): osds do not bound xattrs and/or aggregate xattr data in pg log
The message dump is 260M (once de-hexified), but the decode of the pg_log_t in the message indicates it is 2484154195... Sage Weil
05:50 PM Bug #40483 (Fix Under Review): Pool settings aren't populated to OSD after restart.
https://github.com/ceph/ceph/pull/29093 Sage Weil
04:57 PM Bug #40755 (Fix Under Review): _txc_add_transaction error (2) No such file or directory not handl...
https://github.com/ceph/ceph/pull/29092 Sage Weil
03:45 PM Bug #40793 (Rejected): mgr mon commands pile up
This was a side-effect of #40792. A targeted mon command was queued for down mon, which forced the MonClient to keep... Sage Weil
03:38 PM Bug #40792 (Fix Under Review): monc: send_command to specific down mon breaks other mon msgs
https://github.com/ceph/ceph/pull/29090 Sage Weil
02:45 PM Bug #40804 (Fix Under Review): ceph mgr module ls -f plain crashes mon
https://github.com/ceph/ceph/pull/29089 Sage Weil
02:28 PM Bug #40804 (Resolved): ceph mgr module ls -f plain crashes mon
Sage Weil
11:05 AM Bug #40410: ceph pg query Segmentation fault in 12.2.10
Hi Brad Hubbard
We manually generated some buckets after deploying the cluster. In order to avoid id repetition, we...
qingbo han
04:50 AM Bug #40410: ceph pg query Segmentation fault in 12.2.10
Brad Hubbard
04:49 AM Bug #40410: ceph pg query Segmentation fault in 12.2.10
qingbo han wrote:
> hi Brad Hubbard:
> I think your theory is correct. I run ceph pg query correctly when I uli...
Brad Hubbard

07/16/2019

05:27 PM Bug #40793 (Rejected): mgr mon commands pile up
On the lab cluster, a mon was down for a few days. On restart,... Sage Weil
05:24 PM Bug #40792 (Resolved): monc: send_command to specific down mon breaks other mon msgs
On the lab cluster, the mgr regularly sends mgrbeacons. All is fine.
but, if one mon is down, *and* we send the smart scra...
Sage Weil
05:00 PM Bug #40791 (Closed): high variance in pg size
We're seeing a cluster that has a history of being very unbalanced in terms of OSD utilisation. The balancer in upmap... Jan Fajerski
03:08 PM Bug #40620 (Pending Backport): Explicitly requested repair of an inconsistent PG cannot be schedu...
Sage Weil
03:06 PM Bug #40635 (Fix Under Review): IndexError: list index out of range in thrash_pg_upmap
https://github.com/ceph/ceph/pull/29069 Sage Weil
03:02 PM Bug #40635: IndexError: list index out of range in thrash_pg_upmap
Looks like this triggers when there are no pools, and the pg dump pg_stats is thus empty. Sage Weil
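A minimal sketch of the failure mode described above, assuming the thrasher picks a random entry from the pg_stats list: `random.choice` raises IndexError on an empty sequence, so an empty pg dump needs an explicit guard. `pick_random_pg` is a hypothetical helper and is not taken from the actual fix.

```python
import random
from typing import Optional, Sequence


def pick_random_pg(pg_stats: Sequence[dict],
                   rng: random.Random) -> Optional[dict]:
    """Return a random PG entry, or None when there are no PGs (no pools)."""
    if not pg_stats:
        # random.choice would raise IndexError on an empty sequence.
        return None
    return rng.choice(pg_stats)
```

A caller would skip the upmap step for this iteration when the helper returns None.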
02:45 PM Bug #40635: IndexError: list index out of range in thrash_pg_upmap
/a/sage-2019-07-15_19:52:54-rados-wip-sage-testing-2019-07-15-0918-distro-basic-smithi/4121793 Sage Weil
09:18 AM Bug #40410: ceph pg query Segmentation fault in 12.2.10
hi Brad Hubbard:
I think your theory is correct. ceph pg query runs correctly when I set ulimit -s 16384. You said s...
qingbo han
04:28 AM Bug #40410: ceph pg query Segmentation fault in 12.2.10
Hello Han,
Many thanks to Radoslaw Zarzynski for the fruitful discussion we had regarding this issue last night. I...
Brad Hubbard
08:14 AM Bug #40785 (Need More Info): In case of osd full scenario 100% pgs went to unknown state, when ad...
After populating more data, OSDs became nearfull and full. When more storage was added in this situation, all pgs... servesha dudhgaonkar
02:14 AM Bug #40777: hit assert in AuthMonitor::update_from_paxos
... sdkfzv sdkfzv

07/15/2019

07:41 PM Backport #40639: mimic: osd: report omap/data/metadata usage
Josh Durgin wrote:
> https://github.com/ceph/ceph/pull/28852
merged
Yuri Weinstein
05:49 PM Bug #40774 (Fix Under Review): mon: interval_set.h: 490: FAILED ceph_assert(p->first > start+len)
https://github.com/ceph/ceph/pull/29051 Sage Weil
04:23 PM Bug #40777: hit assert in AuthMonitor::update_from_paxos
Is this reproducible? If so, can you add mon logs (ideally both for peons and leader), at 'debug mon = 10', 'debug pa... Joao Eduardo Luis
09:17 AM Bug #40777 (New): hit assert in AuthMonitor::update_from_paxos
I created the ceph cluster with Rook (https://github.com/rook/rook), and the ceph version is 12.2.7 stable.
After I reb...
sdkfzv sdkfzv
01:29 AM Bug #40410: ceph pg query Segmentation fault in 12.2.10
Still looking into this. The issue in the new core is the same as the original coredump. Brad Hubbard

07/13/2019

04:27 PM Bug #40765: mimic: "Command failed (workunit test rados/test.sh)" in smoke/master/mimic
Seems to happen on mimic as well
http://pulpito.ceph.com/teuthology-2019-07-13_06:00:03-smoke-mimic-testing-basic-smithi/
...
Yuri Weinstein

07/12/2019

11:24 PM Bug #40774: mon: interval_set.h: 490: FAILED ceph_assert(p->first > start+len)
similar failure: /ceph/teuthology-archive/pdonnell-2019-07-11_22:52:33-fs-wip-pdonnell-testing-20190711.203149-distro... Patrick Donnelly
11:23 PM Bug #40774 (Resolved): mon: interval_set.h: 490: FAILED ceph_assert(p->first > start+len)
While removing snapshots:... Patrick Donnelly
11:07 PM Bug #40772: mon: pg size change delayed 1 minute because osdmap 35 delay
Kefu can you take a look? See the attached monitor logs. David Zafman
10:48 PM Bug #40772: mon: pg size change delayed 1 minute because osdmap 35 delay

This looks to be a monitor issue. We see that osdmap 35 may be getting hung up during the critical period 00:29:49 ...
David Zafman
09:21 PM Bug #40772 (New): mon: pg size change delayed 1 minute because osdmap 35 delay

osd-recovery-prio.sh TEST_recovery_pool_priority fails intermittently due to a delay in recovery starting on a pg. ...
David Zafman
10:21 PM Bug #40725 (Resolved): osd-scrub-snaps.sh fails
Sage Weil
02:19 AM Bug #40725 (Fix Under Review): osd-scrub-snaps.sh fails
Kefu Chai
09:11 PM Bug #38357: ClsLock.TestExclusiveEphemeralStealEphemeral failed
... Sage Weil
03:24 PM Bug #40765 (Duplicate): mimic: "Command failed (workunit test rados/test.sh)" in smoke/master/mimic
Run: http://pulpito.ceph.com/teuthology-2019-07-12_06:00:03-smoke-mimic-testing-basic-smithi/
Jobs: '4113997', '4113...
Yuri Weinstein
02:08 PM Bug #40755 (Resolved): _txc_add_transaction error (2) No such file or directory not handled on op...
... Sage Weil
01:58 PM Bug #40635: IndexError: list index out of range in thrash_pg_upmap
/a/sage-2019-07-11_17:46:52-rados-wip-sage-testing-2019-07-11-1048-distro-basic-smithi/4111022 Sage Weil
12:33 PM Bug #38124 (Resolved): OSD down on snaptrim.
Nathan Cutler
12:33 PM Backport #39698 (Resolved): mimic: OSD down on snaptrim.
Nathan Cutler

07/11/2019

10:35 PM Backport #40638 (In Progress): luminous: osd: report omap/data/metadata usage
Brad Hubbard
10:34 PM Backport #40638 (Duplicate): luminous: osd: report omap/data/metadata usage
Brad Hubbard
02:00 PM Backport #40638 (In Progress): luminous: osd: report omap/data/metadata usage
Nathan Cutler
10:34 PM Feature #38550 (Duplicate): osd: Implement lazy omap usage statistics per osd
Brad Hubbard
10:28 PM Backport #40744 (Resolved): nautilus: core: lazy omap stat collection
https://github.com/ceph/ceph/pull/29188 Brad Hubbard
10:22 PM Feature #38136: core: lazy omap stat collection
Requires backport of https://github.com/ceph/ceph/pull/26614 and https://github.com/ceph/ceph/pull/28070 Brad Hubbard
10:22 PM Backport #38552 (In Progress): mimic: core: lazy omap stat collection
Requires backport of https://github.com/ceph/ceph/pull/26614 and https://github.com/ceph/ceph/pull/28070 Brad Hubbard
10:21 PM Backport #38551 (In Progress): luminous: core: lazy omap stat collection
Requires backport of https://github.com/ceph/ceph/pull/26614 and https://github.com/ceph/ceph/pull/28070 Brad Hubbard
07:24 PM Backport #40650: luminous: os/bluestore: fix >2GB writes
Neha Ojha wrote:
> https://github.com/ceph/ceph/pull/28965
merged
Yuri Weinstein
04:39 PM Bug #38724: _txc_add_transaction error (39) Directory not empty not handled on operation 21 (op 1...
Rebooted node 4; on nodes 1 and 2, 2 OSDs each crashed and will not start.
The logs are similar, seems to be the BUG ...
Edward Kalk
02:36 PM Bug #38724: _txc_add_transaction error (39) Directory not empty not handled on operation 21 (op 1...
^^this results in the Production VMs becoming unresponsive as their disks are unavailable when we have multiple OSDs ... Edward Kalk
02:33 PM Bug #38724: _txc_add_transaction error (39) Directory not empty not handled on operation 21 (op 1...
Sometimes when this happens, the OSDs repeatedly crash and the Linux system prevents them from being started. It takes 10... Edward Kalk
02:28 PM Bug #38724: _txc_add_transaction error (39) Directory not empty not handled on operation 21 (op 1...
Was Bug 38724:
```ceph-osd.9.log: -3> 2019-07-11 09:15:13.569 7fc7b8243700 -1 bluestore(/var/lib/ceph/osd/ceph-9...
Edward Kalk
02:28 PM Bug #38724: _txc_add_transaction error (39) Directory not empty not handled on operation 21 (op 1...
OSD 9, 15, 10, 13 crashed this AM.
```ceph.log:2019-07-11 09:15:15.501601 mon.synergy0 (mon.0) 4248 : cluster [IN...
Edward Kalk
04:28 PM Bug #40740 (New): "Error: finished tid 3 when last_acked_tid was 5" in upgrade:luminous-x-mimic
Run: http://pulpito.ceph.com/teuthology-2019-07-11_02:25:02-upgrade:luminous-x-mimic-distro-basic-smithi/
Job: 41101...
Yuri Weinstein
03:18 PM Backport #39693: nautilus: _txc_add_transaction error (39) Directory not empty not handled on ope...
Edward Kalk wrote:
> found a few things that seem like fixes for this on github... : https://github.com/ceph/ceph/pu...
Nathan Cutler
02:50 PM Backport #38276 (Resolved): luminous: osd_map_message_max default is too high?
Nathan Cutler
02:36 PM Backport #38751 (In Progress): mimic: should report EINVAL in ErasureCode::parse() if m<=0
Nathan Cutler
02:34 PM Backport #38750 (Resolved): luminous: should report EINVAL in ErasureCode::parse() if m<=0
Nathan Cutler
02:01 PM Backport #40639 (In Progress): mimic: osd: report omap/data/metadata usage
Nathan Cutler
01:59 PM Backport #40730 (In Progress): nautilus: mon: auth mon isn't loading full KeyServerData after res...
Nathan Cutler
01:58 PM Backport #40730 (Resolved): nautilus: mon: auth mon isn't loading full KeyServerData after restart
https://github.com/ceph/ceph/pull/28993 Nathan Cutler
01:58 PM Backport #40732 (Resolved): mimic: mon: auth mon isn't loading full KeyServerData after restart
https://github.com/ceph/ceph/pull/30181 Nathan Cutler
01:58 PM Backport #40731 (Rejected): luminous: mon: auth mon isn't loading full KeyServerData after restart
Nathan Cutler
01:13 PM Backport #39537 (In Progress): luminous: osd/ReplicatedBackend.cc: 1321: FAILED assert(get_parent...
Nathan Cutler
01:12 PM Backport #39538 (Resolved): mimic: osd/ReplicatedBackend.cc: 1321: FAILED assert(get_parent()->ge...
Nathan Cutler
01:12 PM Bug #39582 (Resolved): Binary data in OSD log from "CRC header" message
Nathan Cutler
01:12 PM Backport #39737 (Resolved): mimic: Binary data in OSD log from "CRC header" message
Nathan Cutler
01:11 PM Backport #39744 (Resolved): mimic: mon: "FAILED assert(pending_finishers.empty())" when paxos res...
Nathan Cutler
10:21 AM Bug #40726 (New): "OSD::osd_op_tp thread 0x7f6dafcf0700' had timed out after 15"
osd.7 was marked down by itself because of unhealthy heartbeat.... Kefu Chai
04:37 AM Bug #40725: osd-scrub-snaps.sh fails
David, mind taking a look? Kefu Chai
04:37 AM Bug #40725 (Resolved): osd-scrub-snaps.sh fails
... Kefu Chai
01:54 AM Feature #40420: Introduce an ceph.conf option to disable HEALTH_WARN when nodeep-scrub/scrub flag...
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2019-June/035406.html
https://pad.ceph.com/p/health-mute
Vikhyat Umrao
01:18 AM Bug #40641: OSD failure after PGInfo back to previous versions, resulting in PGLog error rollback
Neha Ojha wrote:
> How did you find out there were unrecoverable objects? Was there any indication in the logs?
T...
tao ning

07/10/2019

09:10 PM Bug #40641: OSD failure after PGInfo back to previous versions, resulting in PGLog error rollback
How did you find out there were unrecoverable objects? Was there any indication in the logs? Neha Ojha
09:06 PM Bug #40674 (Resolved): TEST_corrupt_snapset_scrub_rep fails
Neha Ojha
09:04 PM Bug #40718 (Duplicate): touch in txn on (old) nautilus osd
Josh Durgin
02:44 PM Bug #40718 (Duplicate): touch in txn on (old) nautilus osd
... Sage Weil
07:51 PM Bug #40722: "IOError: [Errno 2] No such file or directory: '/tmp/pip-build-o9ggCd/unknown/setup.p...
@Alfredo can you pls take a look? Yuri Weinstein
07:00 PM Bug #40722 (New): "IOError: [Errno 2] No such file or directory: '/tmp/pip-build-o9ggCd/unknown/s...
Run: http://pulpito.ceph.com/teuthology-2019-07-10_05:10:03-ceph-disk-mimic-distro-basic-mira/
Jobs: '4108064', '410...
Yuri Weinstein
05:39 PM Bug #40721: backfill caught in loop from block
original blocked request is... Sage Weil
04:42 PM Bug #40721: backfill caught in loop from block
actually, this retry is triggered on every osdmap. Sage Weil
04:42 PM Bug #40721 (Can't reproduce): backfill caught in loop from block
... Sage Weil
05:08 PM Bug #40720 (Fix Under Review): mimic, nautilus: make bitmap allocator the default allocator for b...
Neha Ojha
04:29 PM Bug #40720 (Resolved): mimic, nautilus: make bitmap allocator the default allocator for bluestore
The default for nautilus is already bitmap allocator.
We might just need to cherry-pick 231b7dd9c5dc1d22e93a8f81d07e...
Neha Ojha
03:49 PM Backport #40650 (In Progress): luminous: os/bluestore: fix >2GB writes
https://github.com/ceph/ceph/pull/28965 Neha Ojha
03:49 PM Backport #40651 (In Progress): mimic: os/bluestore: fix >2GB writes
https://github.com/ceph/ceph/pull/28967 Neha Ojha
03:48 PM Backport #40652 (In Progress): nautilus: os/bluestore: fix >2GB writes
https://github.com/ceph/ceph/pull/28966 Neha Ojha
03:11 PM Bug #40712: ceph-mon crash with assert(err == 0) after rocksdb->get
I also opened an issue in rocksdb: https://github.com/facebook/rocksdb/issues/5558, and I attached the db file in thi... Yang Dongsheng
12:18 PM Bug #40712 (New): ceph-mon crash with assert(err == 0) after rocksdb->get
(1)I found a very strange problem in our environment that the ceph-mon crashed with below error in log:... Yang Dongsheng
03:02 PM Backport #39693: nautilus: _txc_add_transaction error (39) Directory not empty not handled on ope...
found a few things that seem like fixes for this on github... : https://github.com/ceph/ceph/pull/27929/commits Edward Kalk
02:06 PM Backport #39693: nautilus: _txc_add_transaction error (39) Directory not empty not handled on ope...
Will this fix be included in : https://tracker.ceph.com/projects/ceph/roadmap#v14.2.2 ? Edward Kalk
03:01 PM Bug #38724: _txc_add_transaction error (39) Directory not empty not handled on operation 21 (op 1...
found a few things that seem like fixes for this on github... : https://github.com/ceph/ceph/pull/27929/commits Edward Kalk
02:49 PM Bug #38724: _txc_add_transaction error (39) Directory not empty not handled on operation 21 (op 1...
We hit this bug again : "2019-07-10 09:16:27.728 7f73b844c700 -1 bluestore(/var/lib/ceph/osd/ceph-5) _txc_add_transac... Edward Kalk
02:06 PM Bug #38724: _txc_add_transaction error (39) Directory not empty not handled on operation 21 (op 1...
will this be included in : https://tracker.ceph.com/projects/ceph/roadmap#v14.2.2 . ? Edward Kalk
11:56 AM Bug #39555 (In Progress): backfill_toofull while OSDs are not full (Unneccessary HEALTH_ERR)
oops, reverting - I had not seen Joao's question Nathan Cutler
11:54 AM Bug #39555 (Pending Backport): backfill_toofull while OSDs are not full (Unneccessary HEALTH_ERR)
Nathan Cutler
06:11 AM Bug #40410: ceph pg query Segmentation fault in 12.2.10
hi Brad Hubbard
I failed to reproduce the segfault in Python several times. I have uploaded a coredump to ceph, the id ...
qingbo han

07/09/2019

08:53 PM Backport #39693: nautilus: _txc_add_transaction error (39) Directory not empty not handled on ope...
I dropped notes in "http://tracker.ceph.com/issues/38724". Not sure I understand the status. "pending backport" says ... Edward Kalk
04:30 PM Backport #38276: luminous: osd_map_message_max default is too high?
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/28640
merged
Yuri Weinstein
04:26 PM Bug #38724: _txc_add_transaction error (39) Directory not empty not handled on operation 21 (op 1...
I am confused by the "Copied to RADOS - Backport #39693: nautilus" status. "Pending Backport 07/03/2019"
Does this m...
Edward Kalk
04:15 PM Bug #38724: _txc_add_transaction error (39) Directory not empty not handled on operation 21 (op 1...
We have hit this as well, it was triggered when I rebooted a node. A few OSD on other hosts crashed. Here's some log:... Edward Kalk
04:04 PM Backport #38750: luminous: should report EINVAL in ErasureCode::parse() if m<=0
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/28111
merged
Yuri Weinstein
02:21 PM Bug #40634 (Pending Backport): mon: auth mon isn't loading full KeyServerData after restart
Kefu Chai
10:50 AM Bug #39555: backfill_toofull while OSDs are not full (Unneccessary HEALTH_ERR)
The pull request provided with the fix has been merged (https://github.com/ceph/ceph/pull/28204). Does anyone still s... Joao Eduardo Luis
01:58 AM Bug #40410: ceph pg query Segmentation fault in 12.2.10
Hello Han,
I don't see any glaring differences in the binaries so far but I did notice this in the dmesg output.
...
Brad Hubbard
 
