Activity

From 09/21/2017 to 10/20/2017

10/20/2017

03:21 PM Bug #21512: qa: libcephfs_interface_tests: shutdown race failures
PR to master was https://github.com/ceph/ceph/pull/18139 Ken Dreyer
10:08 AM Feature #21877 (Resolved): quota and snaprealm integration
https://github.com/ceph/ceph/pull/18424/commits/4477f8b93d183eb461798b5b67550d3d5b22c16c Zheng Yan
09:30 AM Backport #21874 (Resolved): luminous: qa: libcephfs_interface_tests: shutdown race failures
https://github.com/ceph/ceph/pull/20082 Nathan Cutler
09:29 AM Backport #21870 (Resolved): luminous: Assertion in EImportStart::replay should be a damaged()
https://github.com/ceph/ceph/pull/18930 Nathan Cutler
07:10 AM Bug #21861 (New): osdc: truncate Object and remove the bh which have someone wait for read on it ...
ceph version: jewel 10.2.2
When one OSD is written past the full_ratio (default is 0.95), it will lead the cluster t...
Ivan Guan

10/19/2017

11:22 PM Bug #21777 (Need More Info): src/mds/MDCache.cc: 4332: FAILED assert(mds->is_rejoin())
This is NMI because we weren't able to reproduce the actual problem. We'll have to wait for QE to reproduce again wit... Patrick Donnelly
06:35 PM Bug #21853 (Resolved): mds: mdsload debug too high
... Patrick Donnelly
01:44 PM Feature #19578: mds: optimize CDir::_omap_commit() and CDir::_committed() for large directory
this should help large directory performance Zheng Yan
07:52 AM Bug #21848: client: re-expand admin_socket metavariables in child process
https://github.com/ceph/ceph/pull/18393 Zhi Zhang
07:51 AM Bug #21848 (Resolved): client: re-expand admin_socket metavariables in child process
The default value of admin_socket is $run_dir/$cluster-$name.asok. If mounting multiple ceph-fuse instances on the sa... Zhi Zhang
07:44 AM Bug #21483 (Resolved): qa: test_snapshot_remove (kcephfs): RuntimeError: Bad data at offset 0
Zheng Yan
02:20 AM Bug #21749 (Duplicate): PurgeQueue corruption in 12.2.1
dup of #19593 Zheng Yan
02:17 AM Backport #21658 (Fix Under Review): luminous: purge queue and standby replay mds
https://github.com/ceph/ceph/pull/18385 Zheng Yan
01:53 AM Bug #21843 (Fix Under Review): mds: preserve order of requests during recovery of multimds cluster
https://github.com/ceph/ceph/pull/18384 Zheng Yan
01:50 AM Bug #21843 (Resolved): mds: preserve order of requests during recovery of multimds cluster
There are several cases in which requests get processed in the wrong order:
1)
touch a/b/f (handled by mds.1, early ...
Zheng Yan

10/17/2017

10:09 PM Bug #21821 (Fix Under Review): MDSMonitor: mons should reject misconfigured mds_blacklist_interval
https://github.com/ceph/ceph/pull/18366 John Spray
09:55 PM Bug #21821: MDSMonitor: mons should reject misconfigured mds_blacklist_interval
A good opportunity to use the new min/max fields on the config option itself.
I suppose if we accept the idea that...
John Spray
05:36 PM Bug #21821 (Resolved): MDSMonitor: mons should reject misconfigured mds_blacklist_interval
There should be a minimum acceptable value otherwise we see potential behavior where a blacklisted MDS is still writi... Patrick Donnelly
02:01 AM Bug #21812 (Closed): standby replay mds may submit log
I misinterpreted the log. Zheng Yan

10/16/2017

01:19 PM Bug #21812 (Closed): standby replay mds may submit log
magna002://home/smohan/LOGS/cfs-mds.magna116.trunc.log.gz
mds submitted log entry while it's in standby replay sta...
Zheng Yan
12:07 AM Bug #21807 (Pending Backport): mds: trims all unpinned dentries when memory limit is reached
Patrick Donnelly
12:03 AM Backport #21810 (Resolved): luminous: mds: trims all unpinned dentries when memory limit is reached
https://github.com/ceph/ceph/pull/18316 Patrick Donnelly

10/14/2017

08:49 PM Bug #21807 (Fix Under Review): mds: trims all unpinned dentries when memory limit is reached
https://github.com/ceph/ceph/pull/18309 Patrick Donnelly
08:46 PM Bug #21807 (Resolved): mds: trims all unpinned dentries when memory limit is reached
Generally dentries are pinned by the client cache so this was easy to miss in testing. Bug is here:
https://github...
Patrick Donnelly
08:17 AM Backport #21805 (In Progress): luminous: client_metadata can be missing
Nathan Cutler
12:32 AM Backport #21805 (Resolved): luminous: client_metadata can be missing
https://github.com/ceph/ceph/pull/18299 Patrick Donnelly
08:16 AM Backport #21804 (In Progress): luminous: limit internal memory usage of object cacher.
Nathan Cutler
12:23 AM Backport #21804 (Resolved): luminous: limit internal memory usage of object cacher.
https://github.com/ceph/ceph/pull/18298 Patrick Donnelly
12:39 AM Backport #21806 (In Progress): luminous: FAILED assert(in->is_dir()) in MDBalancer::handle_export...
Patrick Donnelly
12:37 AM Backport #21806 (Resolved): luminous: FAILED assert(in->is_dir()) in MDBalancer::handle_export_pi...
https://github.com/ceph/ceph/pull/18300 Patrick Donnelly
12:16 AM Bug #21512 (Pending Backport): qa: libcephfs_interface_tests: shutdown race failures
Patrick Donnelly
12:14 AM Bug #21726 (Pending Backport): limit internal memory usage of object cacher.
Patrick Donnelly
12:14 AM Bug #21746 (Pending Backport): client_metadata can be missing
Patrick Donnelly
12:13 AM Bug #21759 (Pending Backport): Assertion in EImportStart::replay should be a damaged()
Patrick Donnelly
12:12 AM Bug #21768 (Pending Backport): FAILED assert(in->is_dir()) in MDBalancer::handle_export_pins()
Patrick Donnelly

10/13/2017

07:18 PM Feature #21601 (Resolved): ceph_volume_client: add get, put, and delete object interfaces
Patrick Donnelly
07:18 PM Backport #21602 (Resolved): luminous: ceph_volume_client: add get, put, and delete object interfaces
Patrick Donnelly
12:37 AM Bug #21777 (Fix Under Review): src/mds/MDCache.cc: 4332: FAILED assert(mds->is_rejoin())
https://github.com/ceph/ceph/pull/18278 Patrick Donnelly

10/12/2017

10:47 PM Bug #21777 (Need More Info): src/mds/MDCache.cc: 4332: FAILED assert(mds->is_rejoin())
MDS may send a MMDSCacheRejoin(MMDSCacheRejoin::OP_WEAK) message to an MDS which is not rejoin/active/stopping. Once ... Patrick Donnelly
04:04 AM Bug #21768 (Fix Under Review): FAILED assert(in->is_dir()) in MDBalancer::handle_export_pins()
https://github.com/ceph/ceph/pull/18261 Zheng Yan
03:58 AM Bug #21768 (Resolved): FAILED assert(in->is_dir()) in MDBalancer::handle_export_pins()
... Zheng Yan

10/11/2017

09:38 PM Bug #21765: auth|doc: fs authorize error for existing credentials confusing/unclear
Doug, please take this one. Patrick Donnelly
09:37 PM Bug #21765 (Resolved): auth|doc: fs authorize error for existing credentials confusing/unclear
If you attempt to use `fs authorize` on a key that already exists you get an error like:
https://github.com/ceph/c...
Patrick Donnelly
03:47 PM Bug #21764 (Resolved): common/options.cc: Update descriptions and visibility levels for MDS/clien...
Go through the options in common/options.cc and figure out which should be LEVEL_DEV (hidden in the UI). BASIC/ADVANC... Patrick Donnelly
12:02 PM Bug #21748: client assertions tripped during some workloads
Huh. That is an interesting theory. I don't see how ganesha would do that, but maybe. Unfortunately, the original pro... Jeff Layton
08:27 AM Bug #21748: client assertions tripped during some workloads
This shouldn't happen even for a traceless reply. I suspect the 'in' passed to ceph_ll_setattr doesn't belong to the 'cmo... Zheng Yan
11:03 AM Bug #21759 (Fix Under Review): Assertion in EImportStart::replay should be a damaged()
https://github.com/ceph/ceph/pull/18244 John Spray
10:35 AM Bug #21759 (Resolved): Assertion in EImportStart::replay should be a damaged()

This is one of a number of assertions that still linger in journal.cc, but since it's been seen in the wild ("[ceph...
John Spray
09:11 AM Bug #21754: mds: src/osdc/Journaler.cc: 402: FAILED assert(!r)
... Zheng Yan
06:31 AM Bug #21749: PurgeQueue corruption in 12.2.1
Hi Yan,
yes, we had 3 MDS running in standby-replay mode (I switched them to standby now).
Thanks for the offer...
Daniel Baumann
02:58 AM Bug #21749: PurgeQueue corruption in 12.2.1
likely caused by http://tracker.ceph.com/issues/19593.
ping 'yanzheng' at ceph@OFTC, I will help you to recover th...
Zheng Yan
02:40 AM Bug #21745: mds: MDBalancer using total (all time) request count in load statistics
Although it is simple to add last_timestamp and last_reqcount so that we can get an average TPS, TPS may fluctuat... Xiaoxi Chen
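The last_timestamp / last_reqcount idea mentioned for #21745 can be sketched as follows. This is only an illustration in Python, not Ceph code: the class name RequestRateTracker and the method sample() are hypothetical, and the only assumption taken from the entry is that the request counter is an all-time total that needs to be turned into a per-second rate.

```python
from dataclasses import dataclass


@dataclass
class RequestRateTracker:
    """Hypothetical helper: converts an all-time request counter into a rate."""

    last_timestamp: float  # time of the previous sample, in seconds
    last_reqcount: int     # counter value at the previous sample

    def sample(self, now: float, reqcount: int) -> float:
        """Return requests per second since the previous sample."""
        elapsed = now - self.last_timestamp
        if elapsed <= 0:
            # No time has passed; avoid dividing by zero.
            return 0.0
        rate = (reqcount - self.last_reqcount) / elapsed
        # Remember this sample so the next call measures a fresh interval.
        self.last_timestamp = now
        self.last_reqcount = reqcount
        return rate


tracker = RequestRateTracker(last_timestamp=0.0, last_reqcount=0)
first = tracker.sample(now=10.0, reqcount=1000)   # 1000 requests over 10 s
second = tracker.sample(now=20.0, reqcount=1500)  # 500 requests over 10 s
```

A rate computed this way reflects only the most recent interval, so it can swing between samples, which is the fluctuation concern raised in the comment.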

10/10/2017

10:10 PM Bug #21754 (Rejected): mds: src/osdc/Journaler.cc: 402: FAILED assert(!r)
... Patrick Donnelly
04:09 PM Bug #21749: PurgeQueue corruption in 12.2.1
I saved all information/logs/objects, feel free to ask for any of it and further things.
Regards,
Daniel
Daniel Baumann
12:05 PM Bug #21749 (Duplicate): PurgeQueue corruption in 12.2.1
From "[ceph-users] how to debug (in order to repair) damaged MDS (rank)?"
Log snippet during MDS startup:...
John Spray
02:18 PM Bug #21748: client assertions tripped during some workloads
Actually this is wrong (as Zheng pointed out). The call is made with a zero-length path that starts from the inode on... Jeff Layton
10:44 AM Bug #21748: client assertions tripped during some workloads
The right fix is probably to just remove that assertion. I don't think it's really valid anyway. cephfs turns the ino... Jeff Layton
10:42 AM Bug #21748 (Can't reproduce): client assertions tripped during some workloads
We had a report of some crashes in ganesha here:
https://github.com/nfs-ganesha/nfs-ganesha/issues/215
Dan and ...
Jeff Layton
12:51 PM Bug #21412: cephfs: too many cephfs snapshots chokes the system
The trimsnap states. The rmdir actually completes quickly, but the resulting operations throw the entire cluster int... Wyllys Ingersoll
02:51 AM Bug #21412: cephfs: too many cephfs snapshots chokes the system
What do you mean by "it takes almost 24 hours to delete a single snapshot"? 'rmdir .snap/xxx' took 24 hours or pgs on ... Zheng Yan
09:55 AM Bug #21746 (Fix Under Review): client_metadata can be missing
https://github.com/ceph/ceph/pull/18215 Zheng Yan
09:52 AM Bug #21746 (Resolved): client_metadata can be missing
session opened by Server::prepare_force_open_sessions() has no client metadata. Zheng Yan
09:47 AM Bug #21745 (Resolved): mds: MDBalancer using total (all time) request count in load statistics
This was pointed out by Xiaoxi Chen
The get_req_rate() function is returning the value of l_mds_request, which is ...
John Spray

10/09/2017

06:25 PM Bug #21405 (Fix Under Review): qa: add EC data pool to testing
https://github.com/ceph/ceph/pull/18192 Sage Weil
05:50 PM Bug #21734 (Duplicate): mount client shows total capacity of cluster but not of a pool
SERVER:
ceph df
GLOBAL:
SIZE AVAIL RAW USED %RAW USED
44637G...
Petr Malkov
01:56 PM Bug #21412: cephfs: too many cephfs snapshots chokes the system
Note, the bug says "10.2.7" but we have since upgraded to 10.2.9 and the same problem exists. Wyllys Ingersoll
01:55 PM Bug #21412: cephfs: too many cephfs snapshots chokes the system
Here is a dump of the cephfs 'dentry_lru' table, in case it is interesting. Wyllys Ingersoll
01:53 PM Bug #21412: cephfs: too many cephfs snapshots chokes the system
Here is data collected from a recent attempt to delete a very old and very large snapshot:
The snapshot extended a...
Wyllys Ingersoll
10:30 AM Bug #21726 (Fix Under Review): limit internal memory usage of object cacher.
https://github.com/ceph/ceph/pull/18183 Zheng Yan
10:22 AM Bug #21726 (Resolved): limit internal memory usage of object cacher.
https://bugzilla.redhat.com/show_bug.cgi?id=1490814
Zheng Yan
07:07 AM Bug #21722: mds: no assertion on inode being purging in find_ino_peers()
https://github.com/ceph/ceph/pull/18174 Zhi Zhang
07:06 AM Bug #21722 (Resolved): mds: no assertion on inode being purging in find_ino_peers()
Recently we hit an assertion on the MDS, only a few times, when the MDS was very busy.... Zhi Zhang
03:43 AM Bug #20938: CephFS: concurrent access to file from multiple nodes blocks for seconds
kernel of RHEL7 supports FUSE_AUTO_INVAL_DATA. But FUSE_CAP_DONT_MASK was added in libfuse 3.0. Currently no major li... Zheng Yan

10/06/2017

05:46 PM Feature #15066: multifs: Allow filesystems to be assigned RADOS namespace as well as pool for met...
we should default to using a namespace named after the filesystem unless otherwise specified. Douglas Fuller
05:45 PM Feature #21709 (New): ceph fs authorize should detect the correct data namespace
when per-FS data namespaces are enabled, ceph fs authorize should be updated to issue caps for them Douglas Fuller
02:21 PM Bug #21512: qa: libcephfs_interface_tests: shutdown race failures
I have a PR up that seems to fix this, but it may not be what we need. env_to_vec seems like it ought to be reworked ... Jeff Layton

10/05/2017

12:04 PM Bug #21512: qa: libcephfs_interface_tests: shutdown race failures
Looking now to see if we can somehow just fix up lockdep for this. Most of the problems I have seen are fal... Jeff Layton

10/04/2017

06:42 PM Bug #21512: qa: libcephfs_interface_tests: shutdown race failures
I'm now looking for ways to selectively disable lockdep for just this test. So far, I've been unable to do so:
<pr...
Jeff Layton
04:22 PM Bug #21512: qa: libcephfs_interface_tests: shutdown race failures
Patch to make the ShutdownRace test even more thrashy. This has each thread do the setup and teardown in a tight loop... Jeff Layton
02:32 AM Bug #21568 (Fix Under Review): MDSMonitor commands crashing on cluster upgraded from Hammer (none...
https://github.com/ceph/ceph/pull/18109 Patrick Donnelly

10/03/2017

07:41 PM Bug #20938: CephFS: concurrent access to file from multiple nodes blocks for seconds
Ok, I think you're probably right there. Do we need a cmake test to ensure the fuse library defines FUSE_AUTO_INVAL_D... Jeff Layton
02:59 AM Backport #21658 (Resolved): luminous: purge queue and standby replay mds
https://github.com/ceph/ceph/pull/18385 Nathan Cutler
02:58 AM Backport #21657 (Resolved): luminous: StrayManager::truncate is broken
https://github.com/ceph/ceph/pull/18019 Nathan Cutler

10/02/2017

06:29 PM Backport #21626 (In Progress): jewel: ceph_volume_client: sets invalid caps for existing IDs with...
Patrick Donnelly
06:20 PM Backport #21626 (Resolved): jewel: ceph_volume_client: sets invalid caps for existing IDs with no...
https://github.com/ceph/ceph/pull/18084 Patrick Donnelly
06:29 PM Backport #21627 (In Progress): luminous: ceph_volume_client: sets invalid caps for existing IDs w...
Patrick Donnelly
06:25 PM Backport #21627 (Resolved): luminous: ceph_volume_client: sets invalid caps for existing IDs with...
https://github.com/ceph/ceph/pull/18085
-https://github.com/ceph/ceph/pull/18447-
Patrick Donnelly
12:41 PM Bug #21568: MDSMonitor commands crashing on cluster upgraded from Hammer (nonexistent pool?)
Patrick Donnelly
12:40 PM Bug #21568: MDSMonitor commands crashing on cluster upgraded from Hammer (nonexistent pool?)
User confirmed the MDSMap referred to data pools that no longer exist. The fix should check for non-existent pools an... Patrick Donnelly

10/01/2017

06:01 AM Bug #21304: mds v12.2.0 crashing
The following crash still persists with v12.2.1:
2017-10-01 06:07:34.673356 7f1066040700 0 -- 194.249.156.134:680...
Andrej Filipcic
12:46 AM Bug #19593 (Pending Backport): purge queue and standby replay mds
Patrick Donnelly
12:45 AM Bug #21501 (Pending Backport): ceph_volume_client: sets invalid caps for existing IDs with no caps
Patrick Donnelly

09/29/2017

03:12 PM Bug #21604: mds: may recall all client caps (?) because dirty/pinned metadata is not flushed
This sounds like something controlled by config options. The MDS wants to batch up data when flushing to avoid gratui... Greg Farnum
03:02 PM Bug #21604 (New): mds: may recall all client caps (?) because dirty/pinned metadata is not flushed
Testing tasks.cephfs.test_client_limits.TestClientLimits.test_client_pin with this patch:... Patrick Donnelly
10:06 AM Backport #21602 (Fix Under Review): luminous: ceph_volume_client: add get, put, and delete object...
https://github.com/ceph/ceph/pull/18037 Ramana Raja
09:12 AM Backport #21602 (Resolved): luminous: ceph_volume_client: add get, put, and delete object interfaces
Wrap low-level rados APIs to allow ceph_volume_client to get, put, and
delete objects. The interfaces would allow Op...
Ramana Raja
09:07 AM Feature #21601: ceph_volume_client: add get, put, and delete object interfaces
https://github.com/ceph/ceph/pull/17697 Ramana Raja
09:06 AM Feature #21601 (Resolved): ceph_volume_client: add get, put, and delete object interfaces
Wrap low-level rados APIs to allow ceph_volume_client to get, put, and
delete objects. The interfaces would allow Op...
Ramana Raja
12:14 AM Backport #21600 (In Progress): luminous: mds: client caps can go below hard-coded default (100)
Patrick Donnelly
12:10 AM Backport #21600 (Resolved): luminous: mds: client caps can go below hard-coded default (100)
https://github.com/ceph/ceph/pull/18030 Patrick Donnelly
12:03 AM Bug #21575 (Pending Backport): mds: client caps can go below hard-coded default (100)
https://github.com/ceph/ceph/pull/16036/commits/538834171fe4524b4bb7cffdcb08c5b13fe7689f Patrick Donnelly

09/28/2017

10:29 AM Bug #4829 (Closed): client: handling part of MClientForward incorrectly?
Zheng Yan
10:20 AM Bug #6473 (Can't reproduce): multimds + ceph-fuse: fsstress gets ENOTEMPTY on final rm -r
Open a new one if we encounter this again. Zheng Yan
10:17 AM Bug #12777 (Resolved): qa: leftover files in cephtest directory
Zheng Yan
10:16 AM Bug #11277 (Can't reproduce): hung fsstress run under thrash (no useful logs)
Zheng Yan
10:12 AM Bug #17212 (Resolved): Unable to remove symlink / fill_inode badness on ffff88025f049f88
Zheng Yan
10:07 AM Bug #20467 (Resolved): Ceph FS kernel client not consistency
Zheng Yan
10:06 AM Bug #21091 (Pending Backport): StrayManager::truncate is broken
https://github.com/ceph/ceph/pull/18019 Zheng Yan
10:04 AM Bug #19306 (Resolved): fs: mount NFS to cephfs, and then ls a directory containing a large number...
Zheng Yan
10:03 AM Bug #20313 (Resolved): Assertion in handle_dir_update
Zheng Yan
09:56 AM Bug #21168 (Resolved): cap import/export message ordering issue
Zheng Yan

09/27/2017

10:50 PM Bug #21584 (Fix Under Review): FAILED assert(get_version() < pv) in CDir::mark_dirty
Encountered this issue during snapshot test. The patch is from multimds snapshot fixes https://github.com/ceph/ceph/pu... Zheng Yan
10:40 PM Bug #21584 (Resolved): FAILED assert(get_version() < pv) in CDir::mark_dirty
... Zheng Yan
10:38 PM Bug #21551: Ceph FS not recovering space on Luminous
OK, it's likely caused by http://tracker.ceph.com/issues/19593. Please don't enable standby replay for now. Zheng Yan
01:55 PM Bug #21551: Ceph FS not recovering space on Luminous
This file system was created with Ceph v12.2.0. This cluster was cleanly installed with Ceph v12.2.0 and was never upg... Eric Eastman
03:36 AM Bug #21551: Ceph FS not recovering space on Luminous
... Zheng Yan
05:35 PM Bug #21575 (Resolved): mds: client caps can go below hard-coded default (100)
This was caused by the MDS cache in mempool change:... Patrick Donnelly
02:44 PM Feature #118: kclient: clean pages when throwing out dirty metadata on session teardown
I think Sage means cleaning dirty pages at the same time as cleaning dirty metadata. This hasn't been implemented yet. Zheng Yan
02:39 PM Bug #17370 (Can't reproduce): knfs ffsb hang on master
Zheng Yan
02:27 PM Bug #21337 (Fix Under Review): luminous: MDS is not getting past up:replay on Luminous cluster
https://github.com/ceph/ceph/pull/17994
patch is only for luminous. this bug has been fixed in master in another w...
Zheng Yan
01:39 PM Feature #21571 (Duplicate): mds: limit number of snapshots (global and subtree)
when there are hundreds of snapshots, metadata operations become slow. Besides, lots of snapshots can cause kernel pa... Zheng Yan
01:11 PM Bug #19593 (Fix Under Review): purge queue and standby replay mds
https://github.com/ceph/ceph/pull/17990 Zheng Yan
09:45 AM Bug #19593: purge queue and standby replay mds
Ouch, it seems this one hasn't been fixed. This one can explain #21551
Zheng Yan
11:28 AM Bug #21568 (Resolved): MDSMonitor commands crashing on cluster upgraded from Hammer (nonexistent ...
Opened from mailing list thread "[ceph-users] "ceph fs" commands hang forever and kill monitors"... John Spray
09:32 AM Bug #20129 (Resolved): Client syncfs is slow (waits for next MDS tick)
Zheng Yan

09/26/2017

01:43 PM Bug #21551: Ceph FS not recovering space on Luminous
I uploaded the new mds run with
debug_mds=5
debug_journaler=10
to:
ftp://ftp.keepertech.com/outgoing/eric...
Eric Eastman
08:19 AM Bug #21551: Ceph FS not recovering space on Luminous
There are lots of "mds.0.purge_queue _consume: not readable right now" in the log. Looks like the purge queue stayed in n... Zheng Yan
04:07 AM Bug #21551: Ceph FS not recovering space on Luminous
The command 'ceph daemon mds.ede-c1-mon01 dump cache /tmp/cachedump' did not give any output so I ran
ceph daemon md...
Eric Eastman
02:08 AM Bug #21551: Ceph FS not recovering space on Luminous
Could you please run 'ceph daemon mds.ede-c1-mon01 dump cache /tmp/cachedump' and upload cachedump. Besides, please s... Zheng Yan
01:35 PM Bug #21433 (Closed): mds: failed to decode message of type 43 v7: buffer::end_of_buffer
great Zheng Yan

09/25/2017

10:09 PM Bug #21551: Ceph FS not recovering space on Luminous
Snapshots are not considered stable (especially with multiple active metadata servers). There are proposed fixes in t... Patrick Donnelly
10:02 PM Bug #21551 (New): Ceph FS not recovering space on Luminous
I was running a test on a Ceph file system where I was creating and deleting about 45,000 files in a loop, and every ... Eric Eastman
09:33 PM Bug #21362 (Need More Info): cephfs ec data pool + windows fio,ceph cluster degraed several hours...
> cephfs: meta pool (ssd 1*3 replica 2), data pool (hdd 20*3 ec 2+1).
Using replica 2 is strongly advised against....
Patrick Donnelly
09:30 PM Bug #21363 (Duplicate): ceph-fuse crashing while mounting cephfs
Patrick Donnelly
08:40 PM Backport #21540 (Resolved): luminous: whitelist additions
Nathan Cutler
03:41 PM Backport #21540 (In Progress): luminous: whitelist additions
https://github.com/ceph/ceph/pull/17945 Patrick Donnelly
03:29 PM Backport #21540 (Resolved): luminous: whitelist additions
https://github.com/ceph/ceph/pull/17945 Patrick Donnelly
08:33 PM Bug #21463 (Resolved): qa: ignorable "MDS cache too large" warning
Nathan Cutler
08:32 PM Backport #21472 (Resolved): luminous: qa: ignorable "MDS cache too large" warning
Nathan Cutler
05:12 PM Bug #21507: mds: debug logs near respawn are not flushed
I'd take a hard look at the log rotation, which I got working and turned on for most all the fs suites a couple years... Greg Farnum
02:51 PM Bug #21539: man: missing man page for mount.fuse.ceph
Note the Debian package maintainers have written https://anonscm.debian.org/cgit/pkg-ceph/ceph.git/tree/debian/man/mo... Ken Dreyer
02:50 PM Bug #21539 (Resolved): man: missing man page for mount.fuse.ceph
The /usr/sbin/mount.fuse.ceph utility has no man page, and it needs one. Ken Dreyer
02:31 PM Bug #21433: mds: failed to decode message of type 43 v7: buffer::end_of_buffer
After Greg pointed us in the right direction, we recovered the FS by upgrading the cluster to luminous, now profiting... Christian Salzmann-Jäckel
01:49 PM Bug #21433: mds: failed to decode message of type 43 v7: buffer::end_of_buffer
This is presumably the same root cause as http://tracker.ceph.com/issues/16010 Greg Farnum
01:46 PM Bug #21433 (Need More Info): mds: failed to decode message of type 43 v7: buffer::end_of_buffer
Patrick Donnelly
06:49 AM Bug #21433: mds: failed to decode message of type 43 v7: buffer::end_of_buffer
Sorry for the delay. Have you recovered the FS? If not, please set debug_ms=1 on both mds and osd.049, send logs to us. Zheng Yan
01:52 PM Bug #21510 (Resolved): qa: kcephfs: client-limits: whitelist "MDS cache too large"
Nathan Cutler
02:55 AM Bug #21510 (Pending Backport): qa: kcephfs: client-limits: whitelist "MDS cache too large"
Patrick Donnelly
01:52 PM Backport #21517 (Resolved): luminous: qa: kcephfs: client-limits: whitelist "MDS cache too large"
Nathan Cutler
01:51 PM Bug #21509 (Resolved): qa: kcephfs: ignore warning on expected mds failover
Nathan Cutler
02:54 AM Bug #21509 (Pending Backport): qa: kcephfs: ignore warning on expected mds failover
Patrick Donnelly
01:51 PM Backport #21516 (Resolved): luminous: qa: kcephfs: ignore warning on expected mds failover
Nathan Cutler
01:51 PM Bug #21508 (Resolved): qa: kcephfs: missing whitelist for evicted client
Nathan Cutler
02:54 AM Bug #21508 (Pending Backport): qa: kcephfs: missing whitelist for evicted client
Patrick Donnelly
01:46 PM Bug #21412: cephfs: too many cephfs snapshots chokes the system
Thanks. I'm hesitant to trigger the issue again; last time it threw my cluster into major chaos that took several days... Wyllys Ingersoll
01:41 PM Bug #21412: cephfs: too many cephfs snapshots chokes the system
ceph daemon mds.<name> dump_ops_in_flight
ceph daemon mds.<name> perf dump
Greg Farnum
01:43 PM Backport #21515 (Resolved): luminous: qa: kcephfs: missing whitelist for evicted client
Nathan Cutler
08:24 AM Bug #21530 (Closed): inconsistent rstat on inode
already fixed by... Zheng Yan
01:13 AM Bug #21530 (Closed): inconsistent rstat on inode
http://qa-proxy.ceph.com/teuthology/yuriw-2017-09-23_23:29:32-kcephfs-wip-yuri-testing-2017-09-23-2100-testing-basic-... Zheng Yan

09/24/2017

04:30 PM Bug #21406 (In Progress): ceph.in: tell mds does not understand --cluster
To use `tell mds` with cluster having a non-default name, you can get around the issue for now by passing the corresp... Ramana Raja
04:12 PM Bug #21406: ceph.in: tell mds does not understand --cluster
Patrick Donnelly wrote:
> Ramana, was removing the BZ in the description an accident?
I haven't seen Red Hat down...
Ramana Raja
03:06 PM Bug #21501 (Fix Under Review): ceph_volume_client: sets invalid caps for existing IDs with no caps
https://github.com/ceph/ceph/pull/17935 Ramana Raja

09/23/2017

11:48 AM Bug #21512: qa: libcephfs_interface_tests: shutdown race failures
Nope, can't be done all on stack. This code is pretty reliant on keeping pointers around to some static stuff. What w... Jeff Layton

09/22/2017

09:25 PM Backport #21519: jewel: qa: test_client_pin times out waiting for dentry release from kernel
Let's delay this for the next minor release. Downstream can cherry-pick the fix and users can change their configs ea... Patrick Donnelly
09:15 PM Backport #21519: jewel: qa: test_client_pin times out waiting for dentry release from kernel
I'll be honest and say I don't like late merges that touch real code (not just tests) because they can introduce regr... Nathan Cutler
08:43 PM Backport #21519: jewel: qa: test_client_pin times out waiting for dentry release from kernel
@Nathan, can we still add it to 10.2.10? Yuri Weinstein
08:33 PM Backport #21519 (Fix Under Review): jewel: qa: test_client_pin times out waiting for dentry relea...
Patrick Donnelly
08:28 PM Backport #21519 (Resolved): jewel: qa: test_client_pin times out waiting for dentry release from ...
https://github.com/ceph/ceph/pull/17925 Patrick Donnelly
08:37 PM Bug #20988: client: dual client segfault with racing ceph_shutdown
To be clear, I think we may want to leave off the patch that adds the new testcase from this series as it's uncoverin... Jeff Layton
08:34 PM Backport #21526 (Closed): jewel: client: dual client segfault with racing ceph_shutdown
https://github.com/ceph/ceph/pull/21153 Nathan Cutler
08:34 PM Backport #21525 (Resolved): luminous: client: dual client segfault with racing ceph_shutdown
https://github.com/ceph/ceph/pull/20082 Nathan Cutler
08:31 PM Bug #20337 (Resolved): test_rebuild_simple_altpool triggers MDS assertion
Nathan Cutler
08:31 PM Backport #21490 (Resolved): luminous: test_rebuild_simple_altpool triggers MDS assertion
Nathan Cutler
08:31 PM Bug #21466 (Resolved): qa: fs.get_config on stopped MDS
Nathan Cutler
08:30 PM Backport #21484 (Resolved): luminous: qa: fs.get_config on stopped MDS
Nathan Cutler
07:34 PM Bug #21252 (Resolved): mds: asok command error merged with partial Formatter output
Nathan Cutler
07:34 PM Backport #21321 (Resolved): luminous: mds: asok command error merged with partial Formatter output
Nathan Cutler
07:33 PM Bug #21414 (Resolved): client: Variable "onsafe" going out of scope leaks the storage it points to
Nathan Cutler
07:33 PM Backport #21436 (Resolved): luminous: client: Variable "onsafe" going out of scope leaks the stor...
Nathan Cutler
07:31 PM Bug #21381 (Resolved): test_filtered_df: assert 0.9 < ratio < 1.1
Nathan Cutler
07:31 PM Backport #21437 (Resolved): luminous: test_filtered_df: assert 0.9 < ratio < 1.1
Nathan Cutler
07:30 PM Bug #21275 (Resolved): test hang after mds evicts kclient
Nathan Cutler
07:30 PM Backport #21473 (Resolved): luminous: test hang after mds evicts kclient
Nathan Cutler
07:29 PM Bug #21462 (Resolved): qa: ignorable MDS_READ_ONLY warning
Nathan Cutler
07:29 PM Backport #21464 (Resolved): luminous: qa: ignorable MDS_READ_ONLY warning
Nathan Cutler
07:28 PM Bug #21071 (Resolved): qa: test_misc creates metadata pool with dummy object resulting in WRN: PO...
Nathan Cutler
07:28 PM Backport #21449 (Resolved): luminous: qa: test_misc creates metadata pool with dummy object resul...
Nathan Cutler
07:27 PM Backport #21486 (Resolved): luminous: qa: test_client_pin times out waiting for dentry release fr...
Nathan Cutler
07:27 PM Bug #21421 (Resolved): MDS rank add/remove log messages say wrong number of ranks
Nathan Cutler
07:26 PM Backport #21487 (Resolved): luminous: MDS rank add/remove log messages say wrong number of ranks
Nathan Cutler
07:26 PM Backport #21488 (Resolved): luminous: qa: failures from pjd fstest
Nathan Cutler
07:15 PM Bug #21423: qa: test_client_pin times out waiting for dentry release from kernel
This is also a problem for RHCS 2.0, we need to backport to Jewel. Patrick Donnelly
06:54 PM Backport #21517 (Fix Under Review): luminous: qa: kcephfs: client-limits: whitelist "MDS cache to...
https://github.com/ceph/ceph/pull/17922 Patrick Donnelly
06:49 PM Backport #21517 (Resolved): luminous: qa: kcephfs: client-limits: whitelist "MDS cache too large"
https://github.com/ceph/ceph/pull/17922 Patrick Donnelly
06:54 PM Backport #21516 (Fix Under Review): luminous: qa: kcephfs: ignore warning on expected mds failover
https://github.com/ceph/ceph/pull/17922 Patrick Donnelly
06:49 PM Backport #21516 (Resolved): luminous: qa: kcephfs: ignore warning on expected mds failover
https://github.com/ceph/ceph/pull/17922 Patrick Donnelly
06:54 PM Backport #21515 (Fix Under Review): luminous: qa: kcephfs: missing whitelist for evicted client
https://github.com/ceph/ceph/pull/17922 Patrick Donnelly
06:49 PM Backport #21515 (Resolved): luminous: qa: kcephfs: missing whitelist for evicted client
https://github.com/ceph/ceph/pull/17922 Patrick Donnelly
06:49 PM Bug #21512: qa: libcephfs_interface_tests: shutdown race failures
Ouch! Looks like env_to_vec is also not threadsafe:... Jeff Layton
06:30 PM Bug #21512 (Resolved): qa: libcephfs_interface_tests: shutdown race failures
... Patrick Donnelly
06:47 PM Backport #21514 (Fix Under Review): luminous: ceph_volume_client: snapshot dir name hardcoded
https://github.com/ceph/ceph/pull/17921 Patrick Donnelly
06:45 PM Backport #21514 (Resolved): luminous: ceph_volume_client: snapshot dir name hardcoded
Patrick Donnelly
06:38 PM Bug #21467 (Resolved): mds: src/mds/MDLog.cc: 276: FAILED assert(!capped)
Patrick Donnelly
06:37 PM Backport #21513 (Resolved): luminous: mds: src/mds/MDLog.cc: 276: FAILED assert(!capped)
Patrick Donnelly
06:37 PM Backport #21513 (Resolved): luminous: mds: src/mds/MDLog.cc: 276: FAILED assert(!capped)
https://github.com/ceph/ceph/pull/17852 Patrick Donnelly
06:34 PM Bug #21476 (Pending Backport): ceph_volume_client: snapshot dir name hardcoded
Patrick Donnelly
04:45 PM Bug #21510 (Fix Under Review): qa: kcephfs: client-limits: whitelist "MDS cache too large"
https://github.com/ceph/ceph/pull/17919 Patrick Donnelly
04:25 PM Bug #21510 (Resolved): qa: kcephfs: client-limits: whitelist "MDS cache too large"
... Patrick Donnelly
04:43 PM Bug #21509 (Fix Under Review): qa: kcephfs: ignore warning on expected mds failover
https://github.com/ceph/ceph/pull/17918 Patrick Donnelly
04:20 PM Bug #21509 (Resolved): qa: kcephfs: ignore warning on expected mds failover
... Patrick Donnelly
04:40 PM Bug #21508 (Fix Under Review): qa: kcephfs: missing whitelist for evicted client
https://github.com/ceph/ceph/pull/17917 Patrick Donnelly
04:06 PM Bug #21508 (Resolved): qa: kcephfs: missing whitelist for evicted client
... Patrick Donnelly
04:01 PM Bug #21507 (New): mds: debug logs near respawn are not flushed
... Patrick Donnelly
01:36 PM Bug #21501 (Resolved): ceph_volume_client: sets invalid caps for existing IDs with no caps
Create a ceph auth ID with no caps,
$ sudo ceph auth get-or-create client.test2
Allow ceph_volume_client to autho...
Ramana Raja
11:38 AM Bug #21483: qa: test_snapshot_remove (kcephfs): RuntimeError: Bad data at offset 0
https://github.com/ceph/ceph-client/commit/627c3763020604960d9c20b246b303478f34a6ec Zheng Yan

09/21/2017

10:12 PM Bug #21476: ceph_volume_client: snapshot dir name hardcoded
This needs a BZ for downstream backport. Patrick Donnelly
07:32 PM Bug #21406: ceph.in: tell mds does not understand --cluster
Ramana, was removing the BZ in the description an accident? Patrick Donnelly
07:14 PM Bug #20337: test_rebuild_simple_altpool triggers MDS assertion
Nathan Cutler wrote:
> @Patrick: So both https://github.com/ceph/ceph/pull/16305 and https://github.com/ceph/ceph/pu...
Patrick Donnelly
03:32 AM Bug #20337: test_rebuild_simple_altpool triggers MDS assertion
@Patrick: So both https://github.com/ceph/ceph/pull/16305 and https://github.com/ceph/ceph/pull/17849 need to be back... Nathan Cutler
02:10 PM Bug #21483: qa: test_snapshot_remove (kcephfs): RuntimeError: Bad data at offset 0
It's a kernel client bug in the testing branch, likely caused by... Zheng Yan
01:40 AM Bug #21483 (Resolved): qa: test_snapshot_remove (kcephfs): RuntimeError: Bad data at offset 0
... Patrick Donnelly
01:44 PM Backport #21488 (In Progress): luminous: qa: failures from pjd fstest
Nathan Cutler
03:34 AM Backport #21488 (Resolved): luminous: qa: failures from pjd fstest
https://github.com/ceph/ceph/pull/17888 Nathan Cutler
01:42 PM Backport #21487 (In Progress): luminous: MDS rank add/remove log messages say wrong number of ranks
Nathan Cutler
03:34 AM Backport #21487 (Resolved): luminous: MDS rank add/remove log messages say wrong number of ranks
https://github.com/ceph/ceph/pull/17887 Nathan Cutler
01:40 PM Backport #21486 (In Progress): luminous: qa: test_client_pin times out waiting for dentry release...
Nathan Cutler
03:34 AM Backport #21486 (Resolved): luminous: qa: test_client_pin times out waiting for dentry release fr...
https://github.com/ceph/ceph/pull/17886 Nathan Cutler
08:38 AM Backport #21449 (In Progress): luminous: qa: test_misc creates metadata pool with dummy object re...
Nathan Cutler
08:37 AM Backport #21437 (In Progress): luminous: test_filtered_df: assert 0.9 < ratio < 1.1
Nathan Cutler
08:35 AM Backport #21436 (In Progress): luminous: client: Variable "onsafe" going out of scope leaks the s...
Nathan Cutler
08:29 AM Backport #21359 (In Progress): luminous: racy is_mounted() checks in libcephfs
Nathan Cutler
04:20 AM Feature #20752 (Fix Under Review): cap message flag which indicates if client still has pending c...
*master PR*: https://github.com/ceph/ceph/pull/16778 Nathan Cutler
04:18 AM Backport #21321 (In Progress): luminous: mds: asok command error merged with partial Formatter ou...
Nathan Cutler
03:44 AM Bug #21168: cap import/export message ordering issue
https://github.com/ceph/ceph/pull/17854 Zheng Yan
03:43 AM Backport #21484 (In Progress): luminous: qa: fs.get_config on stopped MDS
Nathan Cutler
03:34 AM Backport #21484 (Resolved): luminous: qa: fs.get_config on stopped MDS
https://github.com/ceph/ceph/pull/17855 Nathan Cutler
03:42 AM Bug #19706: Laggy mon daemons causing MDS failover (symptom: failed to set counters on mds daemon...
This still happens:
http://qa-proxy.ceph.com/teuthology/teuthology-2017-09-16_03:15:02-fs-master-distro-basic-smithi/1639...
Zheng Yan
03:41 AM Backport #21490 (In Progress): luminous: test_rebuild_simple_altpool triggers MDS assertion
Nathan Cutler
03:35 AM Backport #21490 (Resolved): luminous: test_rebuild_simple_altpool triggers MDS assertion
https://github.com/ceph/ceph/pull/17855 Nathan Cutler
03:34 AM Backport #21489 (Resolved): jewel: qa: failures from pjd fstest
https://github.com/ceph/ceph/pull/21152 Nathan Cutler
03:32 AM Bug #21467 (Fix Under Review): mds: src/mds/MDLog.cc: 276: FAILED assert(!capped)
https://github.com/ceph/ceph/pull/17852
https://github.com/ceph/ceph/pull/17853
Zheng Yan
01:28 AM Bug #21467: mds: src/mds/MDLog.cc: 276: FAILED assert(!capped)
Another: http://pulpito.ceph.com/pdonnell-2017-09-20_23:49:47-fs:basic_functional-master-testing-basic-smithi/1652996 Patrick Donnelly
01:34 AM Bug #21466 (Pending Backport): qa: fs.get_config on stopped MDS
Patrick Donnelly
 
