Activity

From 09/03/2017 to 10/02/2017

10/02/2017

06:29 PM Backport #21626 (In Progress): jewel: ceph_volume_client: sets invalid caps for existing IDs with...
Patrick Donnelly
06:20 PM Backport #21626 (Resolved): jewel: ceph_volume_client: sets invalid caps for existing IDs with no...
https://github.com/ceph/ceph/pull/18084 Patrick Donnelly
06:29 PM Backport #21627 (In Progress): luminous: ceph_volume_client: sets invalid caps for existing IDs w...
Patrick Donnelly
06:25 PM Backport #21627 (Resolved): luminous: ceph_volume_client: sets invalid caps for existing IDs with...
https://github.com/ceph/ceph/pull/18085
-https://github.com/ceph/ceph/pull/18447-
Patrick Donnelly
12:41 PM Bug #21568: MDSMonitor commands crashing on cluster upgraded from Hammer (nonexistent pool?)
Patrick Donnelly
12:40 PM Bug #21568: MDSMonitor commands crashing on cluster upgraded from Hammer (nonexistent pool?)
User confirmed the MDSMap referred to data pools that no longer exist. The fix should check for non-existent pools an... Patrick Donnelly

10/01/2017

06:01 AM Bug #21304: mds v12.2.0 crashing
The following crash still persists with v12.2.1:
2017-10-01 06:07:34.673356 7f1066040700 0 -- 194.249.156.134:680...
Andrej Filipcic
12:46 AM Bug #19593 (Pending Backport): purge queue and standby replay mds
Patrick Donnelly
12:45 AM Bug #21501 (Pending Backport): ceph_volume_client: sets invalid caps for existing IDs with no caps
Patrick Donnelly

09/29/2017

03:12 PM Bug #21604: mds: may recall all client caps (?) because dirty/pinned metadata is not flushed
This sounds like something controlled by config options. The MDS wants to batch up data when flushing to avoid gratui... Greg Farnum
03:02 PM Bug #21604 (New): mds: may recall all client caps (?) because dirty/pinned metadata is not flushed
Testing tasks.cephfs.test_client_limits.TestClientLimits.test_client_pin with this patch:... Patrick Donnelly
10:06 AM Backport #21602 (Fix Under Review): luminous: ceph_volume_client: add get, put, and delete object...
https://github.com/ceph/ceph/pull/18037 Ramana Raja
09:12 AM Backport #21602 (Resolved): luminous: ceph_volume_client: add get, put, and delete object interfaces
Wrap low-level rados APIs to allow ceph_volume_client to get, put, and
delete objects. The interfaces would allow Op...
Ramana Raja
09:07 AM Feature #21601: ceph_volume_client: add get, put, and delete object interfaces
https://github.com/ceph/ceph/pull/17697 Ramana Raja
09:06 AM Feature #21601 (Resolved): ceph_volume_client: add get, put, and delete object interfaces
Wrap low-level rados APIs to allow ceph_volume_client to get, put, and
delete objects. The interfaces would allow Op...
Ramana Raja
12:14 AM Backport #21600 (In Progress): luminous: mds: client caps can go below hard-coded default (100)
Patrick Donnelly
12:10 AM Backport #21600 (Resolved): luminous: mds: client caps can go below hard-coded default (100)
https://github.com/ceph/ceph/pull/18030 Patrick Donnelly
12:03 AM Bug #21575 (Pending Backport): mds: client caps can go below hard-coded default (100)
https://github.com/ceph/ceph/pull/16036/commits/538834171fe4524b4bb7cffdcb08c5b13fe7689f Patrick Donnelly

09/28/2017

10:29 AM Bug #4829 (Closed): client: handling part of MClientForward incorrectly?
Zheng Yan
10:20 AM Bug #6473 (Can't reproduce): multimds + ceph-fuse: fsstress gets ENOTEMPTY on final rm -r
Open a new one if we encounter this again Zheng Yan
10:17 AM Bug #12777 (Resolved): qa: leftover files in cephtest directory
Zheng Yan
10:16 AM Bug #11277 (Can't reproduce): hung fsstress run under thrash (no useful logs)
Zheng Yan
10:12 AM Bug #17212 (Resolved): Unable to remove symlink / fill_inode badness on ffff88025f049f88
Zheng Yan
10:07 AM Bug #20467 (Resolved): Ceph FS kernel client not consistency
Zheng Yan
10:06 AM Bug #21091 (Pending Backport): StrayManager::truncate is broken
https://github.com/ceph/ceph/pull/18019 Zheng Yan
10:04 AM Bug #19306 (Resolved): fs: mount NFS to cephfs, and then ls a directory containing a large number...
Zheng Yan
10:03 AM Bug #20313 (Resolved): Assertion in handle_dir_update
Zheng Yan
09:56 AM Bug #21168 (Resolved): cap import/export message ordering issue
Zheng Yan

09/27/2017

10:50 PM Bug #21584 (Fix Under Review): FAILED assert(get_version() < pv) in CDir::mark_dirty
Encountered this issue during a snapshot test. The patch is from the multimds snapshot fixes https://github.com/ceph/ceph/pu... Zheng Yan
10:40 PM Bug #21584 (Resolved): FAILED assert(get_version() < pv) in CDir::mark_dirty
... Zheng Yan
10:38 PM Bug #21551: Ceph FS not recovering space on Luminous
OK, it's likely caused by http://tracker.ceph.com/issues/19593. Please don't enable standby replay for now Zheng Yan
01:55 PM Bug #21551: Ceph FS not recovering space on Luminous
This file system was created with Ceph v12.2.0. This cluster was cleanly installed with Ceph v12.2.0 and was never upg... Eric Eastman
03:36 AM Bug #21551: Ceph FS not recovering space on Luminous
... Zheng Yan
05:35 PM Bug #21575 (Resolved): mds: client caps can go below hard-coded default (100)
This was caused by the MDS cache in mempool change:... Patrick Donnelly
02:44 PM Feature #118: kclient: clean pages when throwing out dirty metadata on session teardown
I think Sage means cleaning dirty pages at the same time as cleaning dirty metadata. This hasn't been implemented yet Zheng Yan
02:39 PM Bug #17370 (Can't reproduce): knfs ffsb hang on master
Zheng Yan
02:27 PM Bug #21337 (Fix Under Review): luminous: MDS is not getting past up:replay on Luminous cluster
https://github.com/ceph/ceph/pull/17994
The patch is only for luminous. This bug has been fixed in master in another w...
Zheng Yan
01:39 PM Feature #21571 (Duplicate): mds: limit number of snapshots (global and subtree)
When there are hundreds of snapshots, metadata operations become slow. Besides, lots of snapshots can cause kernel pa... Zheng Yan
01:11 PM Bug #19593 (Fix Under Review): purge queue and standby replay mds
https://github.com/ceph/ceph/pull/17990 Zheng Yan
09:45 AM Bug #19593: purge queue and standby replay mds
Ouch, it seems this one hasn't been fixed. This one can explain #21551
Zheng Yan
11:28 AM Bug #21568 (Resolved): MDSMonitor commands crashing on cluster upgraded from Hammer (nonexistent ...
Opened from mailing list thread "[ceph-users] "ceph fs" commands hang forever and kill monitors"... John Spray
09:32 AM Bug #20129 (Resolved): Client syncfs is slow (waits for next MDS tick)
Zheng Yan

09/26/2017

01:43 PM Bug #21551: Ceph FS not recovering space on Luminous
I uploaded the new mds run with
debug_mds=5
debug_journaler=10
to:
ftp://ftp.keepertech.com/outgoing/eric...
Eric Eastman
08:19 AM Bug #21551: Ceph FS not recovering space on Luminous
There are lots of "mds.0.purge_queue _consume: not readable right now" in the log. Looks like the purge queue stayed in n... Zheng Yan
04:07 AM Bug #21551: Ceph FS not recovering space on Luminous
The command 'ceph daemon mds.ede-c1-mon01 dump cache /tmp/cachedump' did not give any output so I ran
ceph daemon md...
Eric Eastman
02:08 AM Bug #21551: Ceph FS not recovering space on Luminous
Could you please run 'ceph daemon mds.ede-c1-mon01 dump cache /tmp/cachedump' and upload cachedump. Besides, please s... Zheng Yan
01:35 PM Bug #21433 (Closed): mds: failed to decode message of type 43 v7: buffer::end_of_buffer
great Zheng Yan

09/25/2017

10:09 PM Bug #21551: Ceph FS not recovering space on Luminous
Snapshots are not considered stable (especially with multiple active metadata servers). There are proposed fixes in t... Patrick Donnelly
10:02 PM Bug #21551 (New): Ceph FS not recovering space on Luminous
I was running a test on a Ceph file system where I was creating and deleting about 45,000 files in a loop, and every ... Eric Eastman
09:33 PM Bug #21362 (Need More Info): cephfs ec data pool + windows fio,ceph cluster degraed several hours...
> cephfs: meta pool (ssd 1*3 replica 2), data pool (hdd 20*3 ec 2+1).
Using replica 2 is strongly advised against....
Patrick Donnelly
09:30 PM Bug #21363 (Duplicate): ceph-fuse crashing while mounting cephfs
Patrick Donnelly
08:40 PM Backport #21540 (Resolved): luminous: whitelist additions
Nathan Cutler
03:41 PM Backport #21540 (In Progress): luminous: whitelist additions
https://github.com/ceph/ceph/pull/17945 Patrick Donnelly
03:29 PM Backport #21540 (Resolved): luminous: whitelist additions
https://github.com/ceph/ceph/pull/17945 Patrick Donnelly
08:33 PM Bug #21463 (Resolved): qa: ignorable "MDS cache too large" warning
Nathan Cutler
08:32 PM Backport #21472 (Resolved): luminous: qa: ignorable "MDS cache too large" warning
Nathan Cutler
05:12 PM Bug #21507: mds: debug logs near respawn are not flushed
I'd take a hard look at the log rotation, which I got working and turned on for most all the fs suites a couple years... Greg Farnum
02:51 PM Bug #21539: man: missing man page for mount.fuse.ceph
Note the Debian package maintainers have written https://anonscm.debian.org/cgit/pkg-ceph/ceph.git/tree/debian/man/mo... Ken Dreyer
02:50 PM Bug #21539 (Resolved): man: missing man page for mount.fuse.ceph
The /usr/sbin/mount.fuse.ceph utility has no man page, and it needs one. Ken Dreyer
02:31 PM Bug #21433: mds: failed to decode message of type 43 v7: buffer::end_of_buffer
After Greg pointed us to the right direction, we recovered the FS by upgrading the cluster to luminous, now profiting... Christian Salzmann-Jäckel
01:49 PM Bug #21433: mds: failed to decode message of type 43 v7: buffer::end_of_buffer
This is presumably the same root cause as http://tracker.ceph.com/issues/16010 Greg Farnum
01:46 PM Bug #21433 (Need More Info): mds: failed to decode message of type 43 v7: buffer::end_of_buffer
Patrick Donnelly
06:49 AM Bug #21433: mds: failed to decode message of type 43 v7: buffer::end_of_buffer
Sorry for the delay. Have you recovered the FS? If not, please set debug_ms=1 on both the mds and osd.049 and send the logs to us. Zheng Yan
01:52 PM Bug #21510 (Resolved): qa: kcephfs: client-limits: whitelist "MDS cache too large"
Nathan Cutler
02:55 AM Bug #21510 (Pending Backport): qa: kcephfs: client-limits: whitelist "MDS cache too large"
Patrick Donnelly
01:52 PM Backport #21517 (Resolved): luminous: qa: kcephfs: client-limits: whitelist "MDS cache too large"
Nathan Cutler
01:51 PM Bug #21509 (Resolved): qa: kcephfs: ignore warning on expected mds failover
Nathan Cutler
02:54 AM Bug #21509 (Pending Backport): qa: kcephfs: ignore warning on expected mds failover
Patrick Donnelly
01:51 PM Backport #21516 (Resolved): luminous: qa: kcephfs: ignore warning on expected mds failover
Nathan Cutler
01:51 PM Bug #21508 (Resolved): qa: kcephfs: missing whitelist for evicted client
Nathan Cutler
02:54 AM Bug #21508 (Pending Backport): qa: kcephfs: missing whitelist for evicted client
Patrick Donnelly
01:46 PM Bug #21412: cephfs: too many cephfs snapshots chokes the system
Thanks. I'm hesitant to trigger the issue again; last time it threw my cluster into major chaos that took several days... Wyllys Ingersoll
01:41 PM Bug #21412: cephfs: too many cephfs snapshots chokes the system
ceph daemon mds.<name> dump_ops_in_flight
ceph daemon mds.<name> perf dump
Greg Farnum
01:43 PM Backport #21515 (Resolved): luminous: qa: kcephfs: missing whitelist for evicted client
Nathan Cutler
08:24 AM Bug #21530 (Closed): inconsistent rstat on inode
already fixed by... Zheng Yan
01:13 AM Bug #21530 (Closed): inconsistent rstat on inode
http://qa-proxy.ceph.com/teuthology/yuriw-2017-09-23_23:29:32-kcephfs-wip-yuri-testing-2017-09-23-2100-testing-basic-... Zheng Yan

09/24/2017

04:30 PM Bug #21406 (In Progress): ceph.in: tell mds does not understand --cluster
To use `tell mds` with a cluster that has a non-default name, you can get around the issue for now by passing the corresp... Ramana Raja
04:12 PM Bug #21406: ceph.in: tell mds does not understand --cluster
Patrick Donnelly wrote:
> Ramana, was removing the BZ in the description an accident?
I haven't seen Red Hat down...
Ramana Raja
03:06 PM Bug #21501 (Fix Under Review): ceph_volume_client: sets invalid caps for existing IDs with no caps
https://github.com/ceph/ceph/pull/17935 Ramana Raja

09/23/2017

11:48 AM Bug #21512: qa: libcephfs_interface_tests: shutdown race failures
Nope, can't be done all on stack. This code is pretty reliant on keeping pointers around to some static stuff. What w... Jeff Layton

09/22/2017

09:25 PM Backport #21519: jewel: qa: test_client_pin times out waiting for dentry release from kernel
Let's delay this for the next minor release. Downstream can cherry-pick the fix and users can change their configs ea... Patrick Donnelly
09:15 PM Backport #21519: jewel: qa: test_client_pin times out waiting for dentry release from kernel
I'll be honest and say I don't like late merges that touch real code (not just tests) because they can introduce regr... Nathan Cutler
08:43 PM Backport #21519: jewel: qa: test_client_pin times out waiting for dentry release from kernel
@Nathan, can we still add it to 10.2.10? Yuri Weinstein
08:33 PM Backport #21519 (Fix Under Review): jewel: qa: test_client_pin times out waiting for dentry relea...
Patrick Donnelly
08:28 PM Backport #21519 (Resolved): jewel: qa: test_client_pin times out waiting for dentry release from ...
https://github.com/ceph/ceph/pull/17925 Patrick Donnelly
08:37 PM Bug #20988: client: dual client segfault with racing ceph_shutdown
To be clear, I think we may want to leave off the patch that adds the new testcase from this series as it's uncoverin... Jeff Layton
08:34 PM Backport #21526 (Closed): jewel: client: dual client segfault with racing ceph_shutdown
https://github.com/ceph/ceph/pull/21153 Nathan Cutler
08:34 PM Backport #21525 (Resolved): luminous: client: dual client segfault with racing ceph_shutdown
https://github.com/ceph/ceph/pull/20082 Nathan Cutler
08:31 PM Bug #20337 (Resolved): test_rebuild_simple_altpool triggers MDS assertion
Nathan Cutler
08:31 PM Backport #21490 (Resolved): luminous: test_rebuild_simple_altpool triggers MDS assertion
Nathan Cutler
08:31 PM Bug #21466 (Resolved): qa: fs.get_config on stopped MDS
Nathan Cutler
08:30 PM Backport #21484 (Resolved): luminous: qa: fs.get_config on stopped MDS
Nathan Cutler
07:34 PM Bug #21252 (Resolved): mds: asok command error merged with partial Formatter output
Nathan Cutler
07:34 PM Backport #21321 (Resolved): luminous: mds: asok command error merged with partial Formatter output
Nathan Cutler
07:33 PM Bug #21414 (Resolved): client: Variable "onsafe" going out of scope leaks the storage it points to
Nathan Cutler
07:33 PM Backport #21436 (Resolved): luminous: client: Variable "onsafe" going out of scope leaks the stor...
Nathan Cutler
07:31 PM Bug #21381 (Resolved): test_filtered_df: assert 0.9 < ratio < 1.1
Nathan Cutler
07:31 PM Backport #21437 (Resolved): luminous: test_filtered_df: assert 0.9 < ratio < 1.1
Nathan Cutler
07:30 PM Bug #21275 (Resolved): test hang after mds evicts kclient
Nathan Cutler
07:30 PM Backport #21473 (Resolved): luminous: test hang after mds evicts kclient
Nathan Cutler
07:29 PM Bug #21462 (Resolved): qa: ignorable MDS_READ_ONLY warning
Nathan Cutler
07:29 PM Backport #21464 (Resolved): luminous: qa: ignorable MDS_READ_ONLY warning
Nathan Cutler
07:28 PM Bug #21071 (Resolved): qa: test_misc creates metadata pool with dummy object resulting in WRN: PO...
Nathan Cutler
07:28 PM Backport #21449 (Resolved): luminous: qa: test_misc creates metadata pool with dummy object resul...
Nathan Cutler
07:27 PM Backport #21486 (Resolved): luminous: qa: test_client_pin times out waiting for dentry release fr...
Nathan Cutler
07:27 PM Bug #21421 (Resolved): MDS rank add/remove log messages say wrong number of ranks
Nathan Cutler
07:26 PM Backport #21487 (Resolved): luminous: MDS rank add/remove log messages say wrong number of ranks
Nathan Cutler
07:26 PM Backport #21488 (Resolved): luminous: qa: failures from pjd fstest
Nathan Cutler
07:15 PM Bug #21423: qa: test_client_pin times out waiting for dentry release from kernel
This is also a problem for RHCS 2.0; we need to backport to Jewel. Patrick Donnelly
06:54 PM Backport #21517 (Fix Under Review): luminous: qa: kcephfs: client-limits: whitelist "MDS cache to...
https://github.com/ceph/ceph/pull/17922 Patrick Donnelly
06:49 PM Backport #21517 (Resolved): luminous: qa: kcephfs: client-limits: whitelist "MDS cache too large"
https://github.com/ceph/ceph/pull/17922 Patrick Donnelly
06:54 PM Backport #21516 (Fix Under Review): luminous: qa: kcephfs: ignore warning on expected mds failover
https://github.com/ceph/ceph/pull/17922 Patrick Donnelly
06:49 PM Backport #21516 (Resolved): luminous: qa: kcephfs: ignore warning on expected mds failover
https://github.com/ceph/ceph/pull/17922 Patrick Donnelly
06:54 PM Backport #21515 (Fix Under Review): luminous: qa: kcephfs: missing whitelist for evicted client
https://github.com/ceph/ceph/pull/17922 Patrick Donnelly
06:49 PM Backport #21515 (Resolved): luminous: qa: kcephfs: missing whitelist for evicted client
https://github.com/ceph/ceph/pull/17922 Patrick Donnelly
06:49 PM Bug #21512: qa: libcephfs_interface_tests: shutdown race failures
Ouch! Looks like env_to_vec is also not threadsafe:... Jeff Layton
06:30 PM Bug #21512 (Resolved): qa: libcephfs_interface_tests: shutdown race failures
... Patrick Donnelly
06:47 PM Backport #21514 (Fix Under Review): luminous: ceph_volume_client: snapshot dir name hardcoded
https://github.com/ceph/ceph/pull/17921 Patrick Donnelly
06:45 PM Backport #21514 (Resolved): luminous: ceph_volume_client: snapshot dir name hardcoded
Patrick Donnelly
06:38 PM Bug #21467 (Resolved): mds: src/mds/MDLog.cc: 276: FAILED assert(!capped)
Patrick Donnelly
06:37 PM Backport #21513 (Resolved): luminous: mds: src/mds/MDLog.cc: 276: FAILED assert(!capped)
Patrick Donnelly
06:37 PM Backport #21513 (Resolved): luminous: mds: src/mds/MDLog.cc: 276: FAILED assert(!capped)
https://github.com/ceph/ceph/pull/17852 Patrick Donnelly
06:34 PM Bug #21476 (Pending Backport): ceph_volume_client: snapshot dir name hardcoded
Patrick Donnelly
04:45 PM Bug #21510 (Fix Under Review): qa: kcephfs: client-limits: whitelist "MDS cache too large"
https://github.com/ceph/ceph/pull/17919 Patrick Donnelly
04:25 PM Bug #21510 (Resolved): qa: kcephfs: client-limits: whitelist "MDS cache too large"
... Patrick Donnelly
04:43 PM Bug #21509 (Fix Under Review): qa: kcephfs: ignore warning on expected mds failover
https://github.com/ceph/ceph/pull/17918 Patrick Donnelly
04:20 PM Bug #21509 (Resolved): qa: kcephfs: ignore warning on expected mds failover
... Patrick Donnelly
04:40 PM Bug #21508 (Fix Under Review): qa: kcephfs: missing whitelist for evicted client
https://github.com/ceph/ceph/pull/17917 Patrick Donnelly
04:06 PM Bug #21508 (Resolved): qa: kcephfs: missing whitelist for evicted client
... Patrick Donnelly
04:01 PM Bug #21507 (New): mds: debug logs near respawn are not flushed
... Patrick Donnelly
01:36 PM Bug #21501 (Resolved): ceph_volume_client: sets invalid caps for existing IDs with no caps
Create a ceph auth ID with no caps,
$ sudo ceph auth get-or-create client.test2
Allow ceph_volume_client to autho...
Ramana Raja
11:38 AM Bug #21483: qa: test_snapshot_remove (kcephfs): RuntimeError: Bad data at offset 0
https://github.com/ceph/ceph-client/commit/627c3763020604960d9c20b246b303478f34a6ec Zheng Yan

09/21/2017

10:12 PM Bug #21476: ceph_volume_client: snapshot dir name hardcoded
This needs a BZ for downstream backport. Patrick Donnelly
07:32 PM Bug #21406: ceph.in: tell mds does not understand --cluster
Ramana, was removing the BZ in the description an accident? Patrick Donnelly
07:14 PM Bug #20337: test_rebuild_simple_altpool triggers MDS assertion
Nathan Cutler wrote:
> @Patrick: So both https://github.com/ceph/ceph/pull/16305 and https://github.com/ceph/ceph/pu...
Patrick Donnelly
03:32 AM Bug #20337: test_rebuild_simple_altpool triggers MDS assertion
@Patrick: So both https://github.com/ceph/ceph/pull/16305 and https://github.com/ceph/ceph/pull/17849 need to be back... Nathan Cutler
02:10 PM Bug #21483: qa: test_snapshot_remove (kcephfs): RuntimeError: Bad data at offset 0
It's a kernel client bug in the testing branch, likely caused by... Zheng Yan
01:40 AM Bug #21483 (Resolved): qa: test_snapshot_remove (kcephfs): RuntimeError: Bad data at offset 0
... Patrick Donnelly
01:44 PM Backport #21488 (In Progress): luminous: qa: failures from pjd fstest
Nathan Cutler
03:34 AM Backport #21488 (Resolved): luminous: qa: failures from pjd fstest
https://github.com/ceph/ceph/pull/17888 Nathan Cutler
01:42 PM Backport #21487 (In Progress): luminous: MDS rank add/remove log messages say wrong number of ranks
Nathan Cutler
03:34 AM Backport #21487 (Resolved): luminous: MDS rank add/remove log messages say wrong number of ranks
https://github.com/ceph/ceph/pull/17887 Nathan Cutler
01:40 PM Backport #21486 (In Progress): luminous: qa: test_client_pin times out waiting for dentry release...
Nathan Cutler
03:34 AM Backport #21486 (Resolved): luminous: qa: test_client_pin times out waiting for dentry release fr...
https://github.com/ceph/ceph/pull/17886 Nathan Cutler
08:38 AM Backport #21449 (In Progress): luminous: qa: test_misc creates metadata pool with dummy object re...
Nathan Cutler
08:37 AM Backport #21437 (In Progress): luminous: test_filtered_df: assert 0.9 < ratio < 1.1
Nathan Cutler
08:35 AM Backport #21436 (In Progress): luminous: client: Variable "onsafe" going out of scope leaks the s...
Nathan Cutler
08:29 AM Backport #21359 (In Progress): luminous: racy is_mounted() checks in libcephfs
Nathan Cutler
04:20 AM Feature #20752 (Fix Under Review): cap message flag which indicates if client still has pending c...
*master PR*: https://github.com/ceph/ceph/pull/16778 Nathan Cutler
04:18 AM Backport #21321 (In Progress): luminous: mds: asok command error merged with partial Formatter ou...
Nathan Cutler
03:44 AM Bug #21168: cap import/export message ordering issue
https://github.com/ceph/ceph/pull/17854 Zheng Yan
03:43 AM Backport #21484 (In Progress): luminous: qa: fs.get_config on stopped MDS
Nathan Cutler
03:34 AM Backport #21484 (Resolved): luminous: qa: fs.get_config on stopped MDS
https://github.com/ceph/ceph/pull/17855 Nathan Cutler
03:42 AM Bug #19706: Laggy mon daemons causing MDS failover (symptom: failed to set counters on mds daemon...
Still happens:
http://qa-proxy.ceph.com/teuthology/teuthology-2017-09-16_03:15:02-fs-master-distro-basic-smithi/1639...
Zheng Yan
03:41 AM Backport #21490 (In Progress): luminous: test_rebuild_simple_altpool triggers MDS assertion
Nathan Cutler
03:35 AM Backport #21490 (Resolved): luminous: test_rebuild_simple_altpool triggers MDS assertion
https://github.com/ceph/ceph/pull/17855 Nathan Cutler
03:34 AM Backport #21489 (Resolved): jewel: qa: failures from pjd fstest
https://github.com/ceph/ceph/pull/21152 Nathan Cutler
03:32 AM Bug #21467 (Fix Under Review): mds: src/mds/MDLog.cc: 276: FAILED assert(!capped)
https://github.com/ceph/ceph/pull/17852
https://github.com/ceph/ceph/pull/17853
Zheng Yan
01:28 AM Bug #21467: mds: src/mds/MDLog.cc: 276: FAILED assert(!capped)
Another: http://pulpito.ceph.com/pdonnell-2017-09-20_23:49:47-fs:basic_functional-master-testing-basic-smithi/1652996 Patrick Donnelly
01:34 AM Bug #21466 (Pending Backport): qa: fs.get_config on stopped MDS
Patrick Donnelly

09/20/2017

11:00 PM Bug #21466 (Fix Under Review): qa: fs.get_config on stopped MDS
https://github.com/ceph/ceph/pull/17849 Patrick Donnelly
10:23 PM Bug #21466 (In Progress): qa: fs.get_config on stopped MDS
So my analysis is wrong; the actual problem is that the test is killing two unneeded MDSs and then trying to do fs.get_... Patrick Donnelly
10:57 PM Bug #20337 (Pending Backport): test_rebuild_simple_altpool triggers MDS assertion
This should resolve this failure in 12.2.1 testing:
http://tracker.ceph.com/issues/21466#note-1
Patrick Donnelly
07:59 PM Feature #18490: client: implement delegation support in userland cephfs
Jeff Layton wrote:
> Can you list them? Is it just we want Ganesha to return NFS4ERR_DELAY?
Yes, that's the big o...
Patrick Donnelly
06:41 PM Feature #18490: client: implement delegation support in userland cephfs
Patrick, Greg and others have been kind enough to give me some good review so far, so I've been working to address th... Jeff Layton
07:42 PM Bug #21423 (Pending Backport): qa: test_client_pin times out waiting for dentry release from kernel
Patrick Donnelly
07:41 PM Bug #21421 (Pending Backport): MDS rank add/remove log messages say wrong number of ranks
Patrick Donnelly
07:40 PM Bug #21383 (Pending Backport): qa: failures from pjd fstest
Patrick Donnelly
05:51 PM Backport #21481 (Fix Under Review): jewel: "FileStore.cc: 2930: FAILED assert(0 == "unexpected er...
https://github.com/ceph/ceph/pull/17847 Patrick Donnelly
04:59 PM Backport #21481: jewel: "FileStore.cc: 2930: FAILED assert(0 == "unexpected error")" in fs
Patrick Donnelly wrote:
> Yuri, I think you meant jewel not 12.2.1?
>
> We're not testing btrfs in luminous/maste...
Yuri Weinstein
04:57 PM Backport #21481 (In Progress): jewel: "FileStore.cc: 2930: FAILED assert(0 == "unexpected error")...
Yuri, I think you meant jewel not 12.2.1?
We're not testing btrfs in luminous/master so we could just remove that ...
Patrick Donnelly
04:49 PM Backport #21481 (Rejected): jewel: "FileStore.cc: 2930: FAILED assert(0 == "unexpected error")" i...
This is a btrfs bug where it reports ENOSPC:... Josh Durgin
04:32 PM Backport #21481: jewel: "FileStore.cc: 2930: FAILED assert(0 == "unexpected error")" in fs
Also verified in http://pulpito.ceph.com/yuriw-2017-09-20_15:31:18-fs-jewel-distro-basic-smithi/1652593 Yuri Weinstein
04:31 PM Backport #21481 (Resolved): jewel: "FileStore.cc: 2930: FAILED assert(0 == "unexpected error")" i...
This is jewel v10.2.10
Run: http://pulpito.ceph.com/yuriw-2017-09-19_20:52:59-fs-jewel-distro-basic-smithi/
Job: 1...
Yuri Weinstein
03:17 PM Bug #21476 (Fix Under Review): ceph_volume_client: snapshot dir name hardcoded
https://github.com/ceph/ceph/pull/17843/ Ramana Raja
12:13 PM Bug #21476 (Resolved): ceph_volume_client: snapshot dir name hardcoded
The ceph_volume_client always creates snapshots in the '.snap' folder.
https://github.com/ceph/ceph/commit/aebce4b643755...
Ramana Raja
11:02 AM Bug #20988 (Pending Backport): client: dual client segfault with racing ceph_shutdown
I think we do want to backport this to any currently maintained releases. It really shouldn't change any behavior and... Jeff Layton
03:21 AM Bug #20988: client: dual client segfault with racing ceph_shutdown
The PR has been merged.
But I am not sure if we should backport it. I am pinging Jeff for his insights on it at ht...
Kefu Chai
03:19 AM Bug #20988 (Fix Under Review): client: dual client segfault with racing ceph_shutdown
https://github.com/ceph/ceph/pull/17738 Kefu Chai
08:19 AM Bug #21149: SubsystemMap.h: 62: FAILED assert(sub < m_subsys.size())
[hadoop@ceph149 hadoop]$ cat /etc/ceph/ceph.conf
[global]
fsid = 99bf903b-b1e9-49de-afd6-2d7897bfd3c5
mon_initial...
shangzhong zhu
03:07 AM Backport #21473 (In Progress): luminous: test hang after mds evicts kclient
Nathan Cutler
02:43 AM Backport #21473 (Resolved): luminous: test hang after mds evicts kclient
https://github.com/ceph/ceph/pull/17822 Nathan Cutler
02:47 AM Backport #21472 (In Progress): luminous: qa: ignorable "MDS cache too large" warning
Nathan Cutler
02:43 AM Backport #21472 (Resolved): luminous: qa: ignorable "MDS cache too large" warning
https://github.com/ceph/ceph/pull/17821 Nathan Cutler
02:43 AM Bug #21463 (Pending Backport): qa: ignorable "MDS cache too large" warning
Looks like it needs a backport of 71f0066f6ec Nathan Cutler

09/19/2017

11:24 PM Bug #21275 (Pending Backport): test hang after mds evicts kclient
Patrick Donnelly
11:23 PM Bug #21468 (Duplicate): kcephfs: hang during umount
Patrick Donnelly
11:15 PM Bug #21468: kcephfs: hang during umount
dup of http://tracker.ceph.com/issues/21275 Zheng Yan
09:10 PM Bug #21468 (Duplicate): kcephfs: hang during umount
After completing tasks.cephfs.test_client_recovery.TestClientRecovery.test_filelock_eviction:... Patrick Donnelly
09:02 PM Bug #21467 (Resolved): mds: src/mds/MDLog.cc: 276: FAILED assert(!capped)
... Patrick Donnelly
08:57 PM Bug #21466: qa: fs.get_config on stopped MDS
Similar failure here:... Patrick Donnelly
08:42 PM Bug #21466 (Resolved): qa: fs.get_config on stopped MDS
... Patrick Donnelly
08:26 PM Backport #21464 (In Progress): luminous: qa: ignorable MDS_READ_ONLY warning
Nathan Cutler
08:25 PM Backport #21464 (Resolved): luminous: qa: ignorable MDS_READ_ONLY warning
https://github.com/ceph/ceph/pull/17817 Nathan Cutler
07:59 PM Bug #21463 (Resolved): qa: ignorable "MDS cache too large" warning
... Patrick Donnelly
07:56 PM Bug #21462 (Resolved): qa: ignorable MDS_READ_ONLY warning
Creating this issue for backport.
https://github.com/ceph/ceph/pull/17466
Patrick Donnelly
01:44 PM Bug #21419: client: is ceph_caps_for_mode correct for r/o opens?
Well, these are just the sets of caps the client *requests* when it's opening a file, right?
So I'm pretty sure it w...
Greg Farnum
12:52 PM Bug #21419: client: is ceph_caps_for_mode correct for r/o opens?
Zheng Yan wrote:
> Which sideway? Client::open() calls may_open() too.
Yes, I think you're right. We'll call m...
Jeff Layton
12:42 PM Bug #21419: client: is ceph_caps_for_mode correct for r/o opens?
With some git archaeology, I was able to dig up this:... Jeff Layton
07:07 AM Bug #21419: client: is ceph_caps_for_mode correct for r/o opens?
Which sideway? Client::open() calls may_open() too. Zheng Yan
11:37 AM Backport #21450 (Closed): jewel: MDS: MDS is laggy or crashed When deleting a large number of files
Nathan Cutler
11:37 AM Backport #21449 (Resolved): luminous: qa: test_misc creates metadata pool with dummy object resul...
https://github.com/ceph/ceph/pull/17879 Nathan Cutler
11:36 AM Backport #21437 (Resolved): luminous: test_filtered_df: assert 0.9 < ratio < 1.1
https://github.com/ceph/ceph/pull/17878 Nathan Cutler
11:36 AM Backport #21436 (Resolved): luminous: client: Variable "onsafe" going out of scope leaks the stor...
https://github.com/ceph/ceph/pull/17877 Nathan Cutler
10:43 AM Bug #21433 (Closed): mds: failed to decode message of type 43 v7: buffer::end_of_buffer
Hi,
we run cephfs (10.2.9 on Debian jessie; 108 OSDs on 9 nodes) as a scratch filesystem for a slurm cluster using...
Christian Salzmann-Jäckel
08:01 AM Bug #21426: qa/workunits/fs/snaps/untar_snap_rm.sh: timeout during up:rejoin (thrashing)
The mds was busy opening inodes for reconnected client caps. There are two reasons opening inodes took so long.
...
Zheng Yan
06:59 AM Bug #21423 (Fix Under Review): qa: test_client_pin times out waiting for dentry release from kernel
https://github.com/ceph/ceph/pull/17791 Zheng Yan
04:07 AM Bug #21423 (In Progress): qa: test_client_pin times out waiting for dentry release from kernel
The reason is that, when the kernel version is < 3.18, ceph-fuse uses the "dentry invalidate" upcall to trim the dcache. But the patch... Zheng Yan

09/18/2017

09:33 PM Bug #21421 (Fix Under Review): MDS rank add/remove log messages say wrong number of ranks
https://github.com/ceph/ceph/pull/17783 John Spray
05:17 PM Bug #21421 (Resolved): MDS rank add/remove log messages say wrong number of ranks
e.g. when increasing max mds to 2, I get:... John Spray
08:56 PM Bug #21426 (New): qa/workunits/fs/snaps/untar_snap_rm.sh: timeout during up:rejoin (thrashing)
Log: http://magna002.ceph.redhat.com/vasu-2017-09-16_00:44:06-fs-luminous---basic-multi/274217/teuthology.log
MDS ...
Patrick Donnelly
07:23 PM Bug #21423 (Resolved): qa: test_client_pin times out waiting for dentry release from kernel
http://magna002.ceph.redhat.com/vasu-2017-09-16_00:44:06-fs-luminous---basic-multi/274192/teuthology.log
http://ma...
Patrick Donnelly
02:04 PM Bug #21393: MDSMonitor: inconsistent role/who usage in command help
Role is certainly the more general/accurate term for places where we're talking about a filesystem+rank.
I want a...
John Spray
01:50 PM Bug #21419: client: is ceph_caps_for_mode correct for r/o opens?
Okay, so I *think* this is okay in many instances because Client::open() uses path_walk(), and that correctly invokes... Greg Farnum
01:03 PM Bug #21419 (Rejected): client: is ceph_caps_for_mode correct for r/o opens?
Greg was reviewing my delegation patches and noticed that we really aren't getting enough caps for read delegations. ... Jeff Layton
12:50 PM Bug #20988: client: dual client segfault with racing ceph_shutdown
Greg Farnum wrote:
> What resources are actually duplicated across a CephContext, which we don't want duplicated? Wh...
Jeff Layton
09:53 AM Bug #19406: MDS server crashes due to inconsistent metadata.
Christoffer Lilja wrote:
> The master install-deps.sh installed a flood of other stuff the earlier didn't. But still...
demiao wu

09/17/2017

12:33 PM Bug #21304 (Can't reproduce): mds v12.2.0 crashing
Zheng Yan
12:04 PM Bug #21412: cephfs: too many cephfs snapshots chokes the system
ceph-mds.mds01.log.gz does not include useful information. The log was generated while the mds was replaying its log. Maybe the hang... Zheng Yan
09:22 AM Bug #21383 (Fix Under Review): qa: failures from pjd fstest
https://github.com/ceph/ceph/pull/17768 Zheng Yan

09/16/2017

05:26 PM Bug #21414 (Pending Backport): client: Variable "onsafe" going out of scope leaks the storage it ...
Patrick Donnelly
03:38 AM Bug #21414 (Resolved): client: Variable "onsafe" going out of scope leaks the storage it points to
Variable "onsafe" going out of scope leaks the storage it points to. This fixes the Coverity
Scan CID 1417473.
PR...
Jos Collin
12:16 PM Bug #21412: cephfs: too many cephfs snapshots chokes the system
Greg Farnum wrote:
> Can you dump the ops in flight on both the MDS and the client issuing the snap rmdir when this ...
Wyllys Ingersoll

09/15/2017

10:13 PM Bug #20988: client: dual client segfault with racing ceph_shutdown
What resources are actually duplicated across a CephContext, which we don't want duplicated? When I think of duplicat... Greg Farnum
10:07 PM Bug #21412: cephfs: too many cephfs snapshots chokes the system
Can you dump the ops in flight on both the MDS and the client issuing the snap rmdir when this happens? And the perfc... Greg Farnum
09:53 PM Bug #21412 (Closed): cephfs: too many cephfs snapshots chokes the system
We have a cluster with /cephfs/.snap directory with over 4800 entries. Trying to delete older snapshots (some are ove... Wyllys Ingersoll
09:19 PM Bug #21071 (Pending Backport): qa: test_misc creates metadata pool with dummy object resulting in...
Patrick Donnelly
09:17 PM Bug #21275 (Resolved): test hang after mds evicts kclient
Patrick Donnelly
09:17 PM Bug #21381 (Pending Backport): test_filtered_df: assert 0.9 < ratio < 1.1
Patrick Donnelly
08:48 PM Bug #20594 (Resolved): mds: cache limits should be expressed in memory usage, not inode count
Patrick Donnelly
08:48 PM Bug #21378 (Resolved): mds: up:stopping MDS cannot export directories
Patrick Donnelly
08:48 PM Backport #21385 (Resolved): luminous: mds: up:stopping MDS cannot export directories
Patrick Donnelly
08:47 PM Bug #21222 (Resolved): MDS: standby-replay mds should avoid initiating subtree export
Patrick Donnelly
08:46 PM Bug #21221 (Resolved): MDCache::try_subtree_merge() may print N^2 lines of debug message
Patrick Donnelly
08:45 PM Backport #21384 (Resolved): luminous: mds: cache limits should be expressed in memory usage, not ...
Patrick Donnelly
08:45 PM Backport #21322 (Resolved): luminous: MDS: standby-replay mds should avoid initiating subtree export
Patrick Donnelly
08:45 PM Backport #21357 (Resolved): luminous: mds: segfault during `rm -rf` of large directory
Patrick Donnelly
08:44 PM Backport #21323 (Resolved): luminous: MDCache::try_subtree_merge() may print N^2 lines of debug m...
Patrick Donnelly
07:15 PM Bug #21406 (Resolved): ceph.in: tell mds does not understand --cluster
... Patrick Donnelly
06:57 PM Bug #21405 (Resolved): qa: add EC data pool to testing
This would end up being another sub-suite in fs/ which adds support for testing an erasure data pool with overwrite... Patrick Donnelly
03:57 PM Bug #21304: mds v12.2.0 crashing
It works fine with that. To be precise I built from the luminous branch from today. No crashes for 8 hours under heav... Andrej Filipcic
01:48 PM Bug #21402 (Resolved): mds: move remaining containers in CDentry/CDir/CInode to mempool
This commit:
https://github.com/ceph/ceph/commit/e035b64fcb0482c3318656e1680d683814f494fe
does only part of the...
Patrick Donnelly
10:27 AM Bug #21383 (In Progress): qa: failures from pjd fstest
Zheng Yan

09/14/2017

09:27 PM Bug #21393 (Resolved): MDSMonitor: inconsistent role/who usage in command help
`ceph rmfailed` refers to its argument as "who" and `ceph repaired` refers to its argument as "rank". We should make ... Patrick Donnelly
07:38 PM Bug #20988: client: dual client segfault with racing ceph_shutdown
Note that I think we do need to convert over programs like ganesha and samba to only keep a single CephContext and sh... Jeff Layton
07:37 PM Bug #20988: client: dual client segfault with racing ceph_shutdown
Ok, I finally settled on just keeping things more or less limping along like they are now with lockdep, and just ensu... Jeff Layton
04:06 PM Bug #20535: mds segmentation fault ceph_lock_state_t::get_overlapping_locks
> Probably related to: http://tracker.ceph.com/issues/19706
I'll keep an eye on it. I'm suspecting out of sync clock...
Webert Lima
01:34 PM Bug #21304: mds v12.2.0 crashing
I'm running the luminous (head commit is ba746cd14d) ceph-mds for a while and haven't reproduced the issue. Could you try the ... Zheng Yan
11:01 AM Backport #21324 (In Progress): luminous: ceph: tell mds.* results in warning
Nathan Cutler
10:28 AM Bug #20892 (Resolved): qa: FS_DEGRADED spurious health warnings in some sub-suites
Nathan Cutler
10:28 AM Backport #21114 (Resolved): luminous: qa: FS_DEGRADED spurious health warnings in some sub-suites
Nathan Cutler
10:24 AM Bug #21004 (Resolved): fs: client/mds has wrong check to clear S_ISGID on chown
Nathan Cutler
10:24 AM Backport #21107 (Resolved): luminous: fs: client/mds has wrong check to clear S_ISGID on chown
Nathan Cutler
08:23 AM Bug #21274 (Resolved): Client: if request gets aborted, its reference leaks
the bug was introduced by ... Zheng Yan
01:45 AM Bug #21274 (Pending Backport): Client: if request gets aborted, its reference leaks
Patrick Donnelly
04:32 AM Backport #21357 (In Progress): luminous: mds: segfault during `rm -rf` of large directory
Patrick Donnelly
04:32 AM Backport #21323 (In Progress): luminous: MDCache::try_subtree_merge() may print N^2 lines of debu...
Patrick Donnelly
03:37 AM Backport #21322 (In Progress): luminous: MDS: standby-replay mds should avoid initiating subtree ...
https://github.com/ceph/ceph/pull/17714 Patrick Donnelly
01:47 AM Backport #21322: luminous: MDS: standby-replay mds should avoid initiating subtree export
Regression link in my last comment is wrong, see: http://tracker.ceph.com/issues/21378 Patrick Donnelly
01:46 AM Backport #21322: luminous: MDS: standby-replay mds should avoid initiating subtree export
Fix for regression can be merged with this backport: https://github.com/ceph/ceph/pull/17689 Patrick Donnelly
03:37 AM Backport #21385 (In Progress): luminous: mds: up:stopping MDS cannot export directories
Patrick Donnelly
03:37 AM Backport #21385 (Resolved): luminous: mds: up:stopping MDS cannot export directories
https://github.com/ceph/ceph/pull/17714 Patrick Donnelly
03:27 AM Backport #21278 (Resolved): luminous: the standbys are not updated via "ceph tell mds.* command"
Patrick Donnelly
03:27 AM Backport #21267 (Resolved): luminous: Incorrect grammar in FS message "1 filesystem is have a fai...
Patrick Donnelly
03:20 AM Backport #21384 (In Progress): luminous: mds: cache limits should be expressed in memory usage, n...
Patrick Donnelly
03:19 AM Backport #21384 (Resolved): luminous: mds: cache limits should be expressed in memory usage, not ...
https://github.com/ceph/ceph/pull/17711 Patrick Donnelly
03:10 AM Bug #20594 (Pending Backport): mds: cache limits should be expressed in memory usage, not inode c...
Patrick Donnelly
02:01 AM Bug #21383 (Resolved): qa: failures from pjd fstest
... Patrick Donnelly
01:46 AM Bug #21378 (Pending Backport): mds: up:stopping MDS cannot export directories
Patrick Donnelly

09/13/2017

06:53 PM Bug #20988: client: dual client segfault with racing ceph_shutdown
I started a discussion on ceph-devel and I think the consensus is that we can't make CephContext a singleton.
I we...
Jeff Layton
06:17 PM Bug #21381: test_filtered_df: assert 0.9 < ratio < 1.1
I think this was caused by:
commit 365558571c59dd42cf0934e6c31c7b4bf2c65026 365558571c (upstream/pull/17513/head)
...
Patrick Donnelly
06:04 PM Bug #21381 (Fix Under Review): test_filtered_df: assert 0.9 < ratio < 1.1
https://github.com/ceph/ceph/pull/17701 Douglas Fuller
04:26 PM Bug #21381 (Resolved): test_filtered_df: assert 0.9 < ratio < 1.1
... Patrick Donnelly
03:28 PM Bug #21337: luminous: MDS is not getting past up:replay on Luminous cluster
The entire log using bzip2 compressed down to 4.6G. You can download it from:
ftp://ftp.keepertech.com/outgoing/eri...
Eric Eastman
10:12 AM Bug #21337: luminous: MDS is not getting past up:replay on Luminous cluster
Eric Eastman wrote:
> The log file with *debug_mds=10* from MDS startup to reaching the assert is 110GB. I am attac...
Zheng Yan
02:40 AM Bug #21337: luminous: MDS is not getting past up:replay on Luminous cluster
One active, one standby-replay, one standby as shown:
mds: cephfs-1/1/1 up {0=ede-c2-mon02=up:active}, 1 up:stand...
Eric Eastman
02:32 AM Bug #21337: luminous: MDS is not getting past up:replay on Luminous cluster
Eric Eastman wrote:
> Replacing the 'assert(in)' with 'continue' did get the Ceph file system working again. Lookin...
Zheng Yan
01:44 PM Bug #21380 (Closed): mds: src/mds/MDSCacheObject.h: 171: FAILED assert(ref_map[by] > 0)
Thanks Zheng, I merged your commits. Patrick Donnelly
08:36 AM Bug #21380: mds: src/mds/MDSCacheObject.h: 171: FAILED assert(ref_map[by] > 0)
It's caused by a bug in https://github.com/ceph/ceph/pull/17657. Fixed by https://github.com/ukernel/ceph/commits/batri... Zheng Yan
04:27 AM Bug #21380 (Closed): mds: src/mds/MDSCacheObject.h: 171: FAILED assert(ref_map[by] > 0)
... Patrick Donnelly
08:51 AM Bug #21275: test hang after mds evicts kclient
With the kernel fixes, the test case still hangs at umount. http://qa-proxy.ceph.com/teuthology/zyan-2017-09-12_01:10:12-k... Zheng Yan
08:35 AM Bug #21379 (Duplicate): TestJournalRepair.test_reset: src/mds/CDir.cc: 930: FAILED assert(get_num...
dup of #21380 Zheng Yan
04:16 AM Bug #21379 (Duplicate): TestJournalRepair.test_reset: src/mds/CDir.cc: 930: FAILED assert(get_num...
... Patrick Donnelly
05:59 AM Bug #21363 (Closed): ceph-fuse crashing while mounting cephfs
Shinobu Kinjo
05:56 AM Bug #21363: ceph-fuse crashing while mounting cephfs
Issue #20972 has already been fixed in PR 16963.
[1] http://tracker.ceph.com/issues/20972
[2] https://github.c...
Prashant D
03:39 AM Bug #21363: ceph-fuse crashing while mounting cephfs
3242b2b should fix the issue. Shinobu Kinjo
03:53 AM Bug #21378: mds: up:stopping MDS cannot export directories
Looks good! Jianyu Li
03:48 AM Bug #21378 (Fix Under Review): mds: up:stopping MDS cannot export directories
https://github.com/ceph/ceph/pull/17689 Zheng Yan
03:30 AM Bug #21378: mds: up:stopping MDS cannot export directories
Seems the check in export_dir is too strict for up:stopping state:
if (!mds->is_active()) {
dout(7) << "i'...
Jianyu Li
02:41 AM Bug #21378 (Resolved): mds: up:stopping MDS cannot export directories
... Patrick Donnelly
03:20 AM Bug #21222: MDS: standby-replay mds should avoid initiating subtree export
Causes regression http://tracker.ceph.com/issues/21378 Zheng Yan
02:42 AM Bug #21222: MDS: standby-replay mds should avoid initiating subtree export
Fix causes: http://tracker.ceph.com/issues/21222 Patrick Donnelly
03:08 AM Backport #21357: luminous: mds: segfault during `rm -rf` of large directory
h3. description... Nathan Cutler
02:13 AM Backport #21357 (Fix Under Review): luminous: mds: segfault during `rm -rf` of large directory
PR for luminous https://github.com/ceph/ceph/pull/17686 Zheng Yan
02:05 AM Backport #21357: luminous: mds: segfault during `rm -rf` of large directory
It's a dup of http://tracker.ceph.com/issues/21070. The fix is already in the master branch Zheng Yan
03:07 AM Bug #21070 (Pending Backport): MDS: MDS is laggy or crashed When deleting a large number of files
Nathan Cutler
02:42 AM Backport #21322: luminous: MDS: standby-replay mds should avoid initiating subtree export
Nathan, fix causes regression: http://tracker.ceph.com/issues/21222 Patrick Donnelly

09/12/2017

10:49 PM Bug #21311 (Rejected): ceph perf dump should report standby MDSes
OK, so I'm going to take the opinionated position that this is a WONTFIX as we have an existing interface that provid... John Spray
05:31 PM Bug #21311: ceph perf dump should report standby MDSes
John, if you have strong opinions about ripping out perf counters, I'll send this one over to you. Feel free to send ... Douglas Fuller
03:25 PM Bug #21311: ceph perf dump should report standby MDSes
So on closer inspection I see that as you say, for the existing stuff it is indeed using perf counters, but it doesn'... John Spray
03:04 PM Bug #21311: ceph perf dump should report standby MDSes
John Spray wrote:
> This is a collectd thing, which isn't to say that we shouldn't care, but... I'm not sure bugs ag...
David Galloway
07:34 PM Backport #21357: luminous: mds: segfault during `rm -rf` of large directory
I'm running into the same problem on luminous 12.2.0 - while removing a directory with lots of files, the MDS crashes... Andras Pataki
05:39 PM Bug #21337: luminous: MDS is not getting past up:replay on Luminous cluster
Replacing the 'assert(in)' with 'continue' did get the Ceph file system working again. Looking at the log, there wer... Eric Eastman
05:26 PM Bug #21071 (Fix Under Review): qa: test_misc creates metadata pool with dummy object resulting in...
https://github.com/ceph/ceph/pull/17676 Douglas Fuller
05:19 PM Bug #21071: qa: test_misc creates metadata pool with dummy object resulting in WRN: POOL_APP_NOT_...
I think we should just whitelist this, then. It's an intentionally pathological case, and this error should not be tr... Douglas Fuller
12:19 PM Bug #20988: client: dual client segfault with racing ceph_shutdown
We could also fix this less invasively -- try to make g_lockdep_ceph_ctx a refcounted object pointer, and then fi... Jeff Layton
02:19 AM Bug #21363 (Duplicate): ceph-fuse crashing while mounting cephfs
When trying to mount cephfs using ceph-fuse on Ubuntu 16.04.3, the parent or child ceph-fuse process gets a SIGABRT sign... Prashant D
01:52 AM Bug #21362 (Need More Info): cephfs ec data pool + windows fio,ceph cluster degraed several hours...
1. configure
version: 12.2.0, ceph professional rpms install, newly installed env.
cephfs: meta pool (ssd 1*3 replica...
Yong Wang
12:28 AM Bug #20594 (Fix Under Review): mds: cache limits should be expressed in memory usage, not inode c...
https://github.com/ceph/ceph/pull/17657 Patrick Donnelly

09/11/2017

10:16 PM Bug #21311: ceph perf dump should report standby MDSes
This is a collectd thing, which isn't to say that we shouldn't care, but... I'm not sure bugs against collectd really... John Spray
05:33 PM Bug #21311: ceph perf dump should report standby MDSes
Doug, please take this one. Patrick Donnelly
08:45 PM Bug #20945 (Resolved): get_quota_root sends lookupname op for every buffered write
Nathan Cutler
08:44 PM Backport #21112 (Resolved): luminous: get_quota_root sends lookupname op for every buffered write
Nathan Cutler
08:02 PM Backport #21359 (Resolved): luminous: racy is_mounted() checks in libcephfs
https://github.com/ceph/ceph/pull/17875 Nathan Cutler
07:33 PM Bug #21337: luminous: MDS is not getting past up:replay on Luminous cluster
The log file with *debug_mds=10* from MDS startup to reaching the assert is 110GB. I am attaching the last 50K lines... Eric Eastman
08:48 AM Bug #21337: luminous: MDS is not getting past up:replay on Luminous cluster
Please set debug_mds=10, restart the mds, and upload the full log. To recover the situation, just replace the 'assert(in)'... Zheng Yan
06:43 AM Bug #21337 (Resolved): luminous: MDS is not getting past up:replay on Luminous cluster
On my Luminous 12.2.0 test cluster, after I have run for the last few days, the MDS process is not getting past up:re... Eric Eastman
05:46 PM Backport #21357: luminous: mds: segfault during `rm -rf` of large directory
Zheng, please take a look. Patrick Donnelly
05:45 PM Backport #21357 (Resolved): luminous: mds: segfault during `rm -rf` of large directory
https://github.com/ceph/ceph/pull/17686 Patrick Donnelly
05:28 PM Bug #20988: client: dual client segfault with racing ceph_shutdown
I spent a couple of hours today crawling over the code in ganesha and ceph that handles the CephContext. We have rout... Jeff Layton
05:22 PM Bug #21025 (Pending Backport): racy is_mounted() checks in libcephfs
Patrick Donnelly
05:07 PM Bug #21025 (Resolved): racy is_mounted() checks in libcephfs
PR is merged. Jeff Layton
01:58 PM Bug #21275 (Fix Under Review): test hang after mds evicts kclient
Patch is on ceph-devel. Patrick Donnelly

09/08/2017

08:20 PM Backport #21324 (Resolved): luminous: ceph: tell mds.* results in warning
https://github.com/ceph/ceph/pull/17729 Nathan Cutler
08:20 PM Backport #21323 (Resolved): luminous: MDCache::try_subtree_merge() may print N^2 lines of debug m...
https://github.com/ceph/ceph/pull/17712 Nathan Cutler
08:20 PM Backport #21322 (Resolved): luminous: MDS: standby-replay mds should avoid initiating subtree export
https://github.com/ceph/ceph/pull/17714 Nathan Cutler
08:20 PM Backport #21321 (Resolved): luminous: mds: asok command error merged with partial Formatter output
https://github.com/ceph/ceph/pull/17870 Nathan Cutler
06:26 PM Bug #21191 (Pending Backport): ceph: tell mds.* results in warning
Patrick Donnelly
06:26 PM Bug #21222 (Pending Backport): MDS: standby-replay mds should avoid initiating subtree export
Patrick Donnelly
06:26 PM Bug #21221 (Pending Backport): MDCache::try_subtree_merge() may print N^2 lines of debug message
Patrick Donnelly
06:25 PM Bug #21252 (Pending Backport): mds: asok command error merged with partial Formatter output
Patrick Donnelly
05:14 PM Cleanup #21069 (Resolved): client: missing space in some client debug log messages
Nathan Cutler
05:14 PM Backport #21103 (Resolved): luminous: client: missing space in some client debug log messages
Nathan Cutler
03:51 PM Backport #21103: luminous: client: missing space in some client debug log messages
https://github.com/ceph/ceph/pull/17469 merged Yuri Weinstein
03:10 PM Bug #21311 (Rejected): ceph perf dump should report standby MDSes
This was discovered when observing the cephmetrics dashboard monitoring the Sepia cluster.... David Galloway
01:52 PM Bug #21275: test hang after mds evicts kclient
Got it. I think we've hit problems like that in NFS, and what we had to do is save copies of the fields from utsname(... Jeff Layton
01:43 PM Bug #21275: test hang after mds evicts kclient
... Zheng Yan
06:14 AM Bug #21304 (Can't reproduce): mds v12.2.0 crashing

The luminous mds crashes a few times a day. Large activity (e.g. untarring a kernel tarball) causes it to crash in a few minutes...
Andrej Filipcic

09/07/2017

01:52 PM Bug #20988: client: dual client segfault with racing ceph_shutdown
cc'ing Matt on this bug, as it may have implications for the new code that can fetch config info out of RADOS:
Bas...
Jeff Layton
01:10 PM Backport #21267 (In Progress): luminous: Incorrect grammar in FS message "1 filesystem is have a ...
Abhishek Lekshmanan
01:09 PM Backport #21278 (In Progress): luminous: the standbys are not updated via "ceph tell mds.* command"
Abhishek Lekshmanan
07:35 AM Backport #21278 (Resolved): luminous: the standbys are not updated via "ceph tell mds.* command"
https://github.com/ceph/ceph/pull/17565 Nathan Cutler
08:27 AM Bug #21274 (Fix Under Review): Client: if request gets aborted, its reference leaks
https://github.com/ceph/ceph/pull/17545 Zheng Yan
01:48 AM Bug #21274 (Resolved): Client: if request gets aborted, its reference leaks
/a/pdonnell-2017-09-06_15:30:20-fs-wip-pdonnell-testing-20170906-distro-basic-smithi/1601384/teuthology.log
log of...
Zheng Yan
07:47 AM Backport #21113 (Resolved): jewel: get_quota_root sends lookupname op for every buffered write
Nathan Cutler
07:43 AM Bug #18157 (Resolved): ceph-fuse segfaults on daemonize
Nathan Cutler
07:43 AM Backport #20972 (Resolved): jewel ceph-fuse segfaults at mount time, assert in ceph::log::Log::stop
Nathan Cutler
05:44 AM Bug #21275 (Resolved): test hang after mds evicts kclient
http://pulpito.ceph.com/zyan-2017-09-07_03:18:23-kcephfs-master-testing-basic-mira/
http://qa-proxy.ceph.com/teuth...
Zheng Yan
01:22 AM Bug #21230 (Pending Backport): the standbys are not updated via "ceph tell mds.* command"
Kefu Chai

09/06/2017

07:39 PM Backport #21267 (Resolved): luminous: Incorrect grammar in FS message "1 filesystem is have a fai...
https://github.com/ceph/ceph/pull/17566 Nathan Cutler
09:05 AM Bug #21252: mds: asok command error merged with partial Formatter output
Sorry, the bug was introduced by my commit:... Zheng Yan
03:52 AM Bug #21153 (Pending Backport): Incorrect grammar in FS message "1 filesystem is have a failed mds...
Patrick Donnelly
03:51 AM Bug #20337 (Resolved): test_rebuild_simple_altpool triggers MDS assertion
Patrick Donnelly

09/05/2017

09:48 PM Bug #21252: mds: asok command error merged with partial Formatter output
I should note: the error itself is very concerning because the only way for dump_cache to fail is if it's operating o... Patrick Donnelly
09:46 PM Bug #21252 (Fix Under Review): mds: asok command error merged with partial Formatter output
https://github.com/ceph/ceph/pull/17506 Patrick Donnelly
08:20 PM Bug #21252 (Resolved): mds: asok command error merged with partial Formatter output
... Patrick Donnelly
09:35 PM Bug #21222 (Fix Under Review): MDS: standby-replay mds should avoid initiating subtree export
Patrick Donnelly
03:39 PM Bug #16709 (Resolved): No output for "ceph mds rmfailed 0 --yes-i-really-mean-it" command
Nathan Cutler
03:38 PM Bug #18660 (Resolved): fragment space check can cause replayed request fail
Nathan Cutler
03:38 PM Bug #18661 (Resolved): Test failure: test_open_inode
Nathan Cutler
03:38 PM Bug #18877 (Resolved): mds/StrayManager: avoid reusing deleted inode in StrayManager::_purge_stra...
Nathan Cutler
03:38 PM Bug #18941 (Resolved): buffer overflow in test LibCephFS.DirLs
Nathan Cutler
03:38 PM Bug #19118 (Resolved): MDS heartbeat timeout during rejoin, when working with large amount of cap...
Nathan Cutler
03:37 PM Bug #19406 (Resolved): MDS server crashes due to inconsistent metadata.
Nathan Cutler
03:32 PM Bug #19955 (Resolved): Too many stat ops when MDS trying to probe a large file
Nathan Cutler
03:32 PM Backport #20149 (Rejected): kraken: Too many stat ops when MDS trying to probe a large file
Kraken is EOL. Nathan Cutler
03:32 PM Bug #20055 (Resolved): Journaler may execute on_safe contexts prematurely
Nathan Cutler
03:31 PM Backport #20141 (Rejected): kraken: Journaler may execute on_safe contexts prematurely
Kraken is EOL. Nathan Cutler
11:07 AM Bug #20988: client: dual client segfault with racing ceph_shutdown
Hmm. I'm not sure that really helps. Here's the doc comment over ceph_create_with_context:... Jeff Layton
09:46 AM Bug #20988: client: dual client segfault with racing ceph_shutdown
I found a workaround. We can create a single CephContext for multiple ceph_mount.... Zheng Yan
09:35 AM Backport #21114 (In Progress): luminous: qa: FS_DEGRADED spurious health warnings in some sub-suites
Nathan Cutler
09:33 AM Backport #21112 (In Progress): luminous: get_quota_root sends lookupname op for every buffered write
Nathan Cutler
09:28 AM Backport #21107 (In Progress): luminous: fs: client/mds has wrong check to clear S_ISGID on chown
Nathan Cutler
09:22 AM Backport #21103 (In Progress): luminous: client: missing space in some client debug log messages
Nathan Cutler
08:45 AM Bug #21193 (Duplicate): ceph.in: `ceph tell mds.* injectargs` does not update standbys
http://tracker.ceph.com/issues/21230 Chang Liu
08:40 AM Bug #21230 (Fix Under Review): the standbys are not updated via "ceph tell mds.* command"
https://github.com/ceph/ceph/pull/17463 Kefu Chai
07:59 AM Bug #21230 (Resolved): the standbys are not updated via "ceph tell mds.* command"
Chang Liu

09/04/2017

07:11 PM Bug #20178 (Resolved): df reports negative disk "used" value when quota exceed
Nathan Cutler
07:11 PM Backport #20350 (Rejected): kraken: df reports negative disk "used" value when quota exceed
Kraken is EOL. Nathan Cutler
07:11 PM Backport #20349 (Resolved): jewel: df reports negative disk "used" value when quota exceed
Nathan Cutler
07:10 PM Bug #20340 (Resolved): cephfs permission denied until second client accesses file
Nathan Cutler
07:10 PM Backport #20404 (Rejected): kraken: cephfs permission denied until second client accesses file
Kraken is EOL. Nathan Cutler
07:10 PM Backport #20403 (Resolved): jewel: cephfs permission denied until second client accesses file
Nathan Cutler
06:43 PM Bug #21221 (Fix Under Review): MDCache::try_subtree_merge() may print N^2 lines of debug message
https://github.com/ceph/ceph/pull/17456 Patrick Donnelly
09:11 AM Bug #21221 (Resolved): MDCache::try_subtree_merge() may print N^2 lines of debug message
MDCache::try_subtree_merge(dirfrag) calls MDCache::try_subtree_merge_at() for each subtree in the dirfrag. try_subtre... Zheng Yan
11:14 AM Bug #21222: MDS: standby-replay mds should avoid initiating subtree export
Here is a merge request for this bug fix: https://github.com/ceph/ceph/pull/17452; could you review it? @Patrick Jianyu Li
11:11 AM Bug #21222: MDS: standby-replay mds should avoid initiating subtree export
Although for the latest code in master branch, this issue could be avoided by the destination check in export_dir:
...
Jianyu Li
10:24 AM Bug #21222 (Resolved): MDS: standby-replay mds should avoid initiating subtree export
For jewel-10.2.7 version, use two active mds and two related standby-replay mds.
When standby-replay replays the m...
Jianyu Li
 
