Activity
From 06/15/2017 to 07/14/2017
07/14/2017
- 06:30 PM Bug #20535: mds segmentation fault ceph_lock_state_t::get_overlapping_locks
- OK, I'll download them just in case.
- 06:01 PM Bug #20535: mds segmentation fault ceph_lock_state_t::get_overlapping_locks
- It will expire in a week or two.
- 04:01 PM Bug #20535: mds segmentation fault ceph_lock_state_t::get_overlapping_locks
- Patrick Donnelly wrote:
> https://shaman.ceph.com/builds/ceph/i20535-backport-v10.2.9/
Thanks, I'll be upgrading ...
- 11:19 AM Feature #20606: mds: improve usability of cluster rank manipulation and setting cluster up/down
- My thoughts on this:
* maybe we should preface this class of command (manipulating the MDS ranks) with "cluster", ...
- 04:11 AM Bug #20622: mds: takeover mds stuck in up:replay after thrashing rank 0
- 3 of 6 OSDs failed, but there is no clue why they failed....
- 12:19 AM Feature #19109: Use data pool's 'df' for statfs instead of global stats, if there is only one dat...
- You'd give the available space for that pool (i.e. how many bytes can they write before it becomes full). Same as in...
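A minimal sketch of the semantics proposed here (assumed struct and field names, not the actual Ceph statfs code): with a single data pool, report that pool's availability rather than the global figure.

    // Hypothetical illustration only; PoolStats/FsStatfs are assumptions.
    #include <cstdint>

    struct PoolStats { uint64_t bytes_used; uint64_t max_avail; };
    struct FsStatfs  { uint64_t total_bytes; uint64_t avail_bytes; };

    // With exactly one data pool, answer statfs from that pool's stats:
    // avail is how many bytes can be written before the pool becomes full.
    FsStatfs statfs_single_pool(const PoolStats &p) {
      return FsStatfs{p.bytes_used + p.max_avail, p.max_avail};
    }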
07/13/2017
- 11:04 PM Bug #20622: mds: takeover mds stuck in up:replay after thrashing rank 0
- Zheng, please take a look.
- 11:04 PM Bug #20622 (Closed): mds: takeover mds stuck in up:replay after thrashing rank 0
- ...
- 10:03 PM Bug #20535: mds segmentation fault ceph_lock_state_t::get_overlapping_locks
- https://shaman.ceph.com/builds/ceph/i20535-backport-v10.2.9/
- 12:35 PM Bug #20535: mds segmentation fault ceph_lock_state_t::get_overlapping_locks
- Patrick Donnelly wrote:
> It does not include the fix. Do not use that branch. I'll make a note to update it...
T...
- 04:08 PM Feature #19109: Use data pool's 'df' for statfs instead of global stats, if there is only one dat...
- What value for free space should we give in this case? If it's the global free space, it might be misleading to speci...
- 02:49 PM Bug #20337: test_rebuild_simple_altpool triggers MDS assertion
- https://github.com/ceph/ceph/pull/16305
- 02:27 PM Feature #20609: MDSMonitor: add new command `ceph fs set <fs_name> down` to bring the cluster down
- So really this would just set max_mds to 0. I do think this should trigger HEALTH_ERR unless and until the user delet...
- 02:23 PM Feature #20610: MDSMonitor: add new command to shrink the cluster in an automated way
- I think we should still call this max_mds to avoid confusion between the filesystem itself (the data) and its MDSs. I...
- 02:11 PM Feature #20607: MDSMonitor: change "mds deactivate" to clearer "mds rejoin"
- I think "reset" would work well. I do think we should change it, though.
- 08:05 AM Feature #20607: MDSMonitor: change "mds deactivate" to clearer "mds rejoin"
- I like "flush". Also, "reactivate" occurred to me.
- 04:35 AM Feature #20607: MDSMonitor: change "mds deactivate" to clearer "mds rejoin"
- Agreed on "rejoin".
Okay so after a jog and some light thought, I would suggest "release" (i.e. the rank is releas...
- 03:47 AM Feature #20607: MDSMonitor: change "mds deactivate" to clearer "mds rejoin"
- I agree with John. "rejoin" is confusing for non-native English speakers.
- 02:08 PM Feature #20611: MDSMonitor: do not show cluster health warnings for file system intentionally mar...
- Taking an MDS down for hardware maintenance, etc, should trigger a health warning because such actions do, even if in...
- 11:12 AM Bug #20614: [WRN] MDS daemon 'a-s' is not responding, replacing it as rank 0 with standby 'a'"...
- Dupe of http://tracker.ceph.com/issues/19706 ?
- 08:58 AM Bug #20614 (Duplicate): [WRN] MDS daemon 'a-s' is not responding, replacing it as rank 0 with ...
- http://pulpito.ceph.com/teuthology-2017-07-08_03:15:04-fs-master-distro-basic-smithi/
- 08:31 AM Backport #20599 (Resolved): jewel: cephfs: Damaged MDS with 10.2.8
07/12/2017
- 11:36 PM Feature #20607: MDSMonitor: change "mds deactivate" to clearer "mds rejoin"
- The trouble with "rejoin" is that it's also the name of one of the states the daemon passes through during startup.
- 11:02 PM Feature #20607 (Rejected): MDSMonitor: change "mds deactivate" to clearer "mds rejoin"
- Rename `ceph mds deactivate` to `ceph mds rejoin`. This makes it clear an MDS is leaving and then rejoining the metad...
- 11:11 PM Feature #20611 (Resolved): MDSMonitor: do not show cluster health warnings for file system intent...
- Here's what you see currently:...
- 11:09 PM Feature #20610 (Resolved): MDSMonitor: add new command to shrink the cluster in an automated way
- Deprecate `ceph fs set <fs_name> max_mds <max_mds>`. New `ceph fs set <fs_name> ranks <max_mds>` sets max_mds and beg...
- 11:07 PM Feature #20609 (Resolved): MDSMonitor: add new command `ceph fs set <fs_name> down` to bring the ...
- This will cause the MDSMonitor to start stopping ranks (what deactivate does) beginning at the highest rank. Only one...
- 11:03 PM Feature #20608 (Resolved): MDSMonitor: rename `ceph fs set <fs_name> cluster_down` to `ceph fs se...
- This indicates whether the file system can be joined by the mds as a new rank. The behavior stays the same.
- 10:51 PM Feature #20606 (Resolved): mds: improve usability of cluster rank manipulation and setting cluste...
- Right now the procedure for bringing down a cluster is:...
- 09:55 PM Bug #20582 (Fix Under Review): common: config showing ints as floats
- https://github.com/ceph/ceph/pull/16288#event-1161205437
- 05:53 PM Bug #20582 (In Progress): common: config showing ints as floats
- 05:53 PM Bug #20582: common: config showing ints as floats
- Reassigning to John. I agree `config show` needs to be fixed, not the tests.
- 12:22 PM Bug #20582 (Fix Under Review): common: config showing ints as floats
- https://github.com/ceph/ceph/pull/16288
- 12:00 PM Bug #20582 (In Progress): common: config showing ints as floats
- It was this change:...
- 10:02 AM Bug #20582: common: config showing ints as floats
- For a float/double config option, the output of 'ceph daemon mds.x config get xxxx' looks like...
- 08:31 PM Bug #20535: mds segmentation fault ceph_lock_state_t::get_overlapping_locks
- Webert Lima wrote:
> Does this built package include the fix for the MDS regression that was found in 10.2.8? I read...
- 05:44 PM Bug #20535: mds segmentation fault ceph_lock_state_t::get_overlapping_locks
- Does this built package include the fix for the MDS regression that was found in 10.2.8? I read about it in the maili...
- 06:02 PM Fix #20246 (Fix Under Review): Make clog message on scrub errors friendlier.
- This snowballed into me pondering about what clog messages really should look like (https://github.com/ceph/ceph/pull...
- 12:42 PM Bug #20592 (In Progress): client::mkdirs not handle well when two clients send mkdir request for ...
- 07:09 AM Bug #20592 (Fix Under Review): client::mkdirs not handle well when two clients send mkdir request...
- 04:00 AM Bug #20592: client::mkdirs not handle well when two clients send mkdir request for a same dir
- I have opened a pull request for this issue:
https://github.com/ceph/ceph/pull/16280
- 02:55 AM Bug #20592 (Resolved): client::mkdirs not handle well when two clients send mkdir request for a s...
- Suppose we have two clients trying to make a two-level directory:
client1 mkdirs: a/b
client2 mkdirs: a/c
in func...
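A hedged sketch of the race this entry describes (plain POSIX mkdir, not the actual Client::mkdirs code): both clients try to create "a" first, so the loser gets EEXIST and must treat that as success before descending.

    // Illustrative only; assumes POSIX semantics rather than Ceph's client code.
    #include <cerrno>
    #include <string>
    #include <sys/stat.h>
    #include <sys/types.h>

    int mkdirs(const std::string &path, mode_t mode) {
      // create each ancestor in turn; another client may create it first
      for (size_t pos = path.find('/'); pos != std::string::npos;
           pos = path.find('/', pos + 1)) {
        if (pos == 0) continue;  // skip a leading '/'
        if (mkdir(path.substr(0, pos).c_str(), mode) < 0 && errno != EEXIST)
          return -errno;         // EEXIST means the other client won the race
      }
      if (mkdir(path.c_str(), mode) < 0 && errno != EEXIST)
        return -errno;
      return 0;
    }

    int main() { return mkdirs("a/b", 0755); }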
- 10:27 AM Backport #19466 (In Progress): jewel: mds: log rotation doesn't work if mds has respawned
- 08:16 AM Backport #20140: jewel: Journaler may execute on_safe contexts prematurely
- Reverted by https://github.com/ceph/ceph/pull/16282
- 07:07 AM Backport #20140 (Rejected): jewel: Journaler may execute on_safe contexts prematurely
- 07:07 AM Backport #20536 (Rejected): jewel: Journaler may execute on_safe contexts prematurely (part 2)
- 06:56 AM Backport #20599: jewel: cephfs: Damaged MDS with 10.2.8
- h3. description
A bad backport was included in 10.2.8, potentially causing MDS breakage. See https://github.com/ce...
- 06:41 AM Backport #20599 (Resolved): jewel: cephfs: Damaged MDS with 10.2.8
- https://github.com/ceph/ceph/pull/16282
- 04:25 AM Feature #19109: Use data pool's 'df' for statfs instead of global stats, if there is only one dat...
- Doug, please take this one.
- 04:19 AM Feature #20598 (Resolved): mds: revisit LAZY_IO
- We do not test this and enabling it is no longer possible. However, we still repeatedly get requests for byte-range l...
- 04:00 AM Bug #20597 (New): mds: tree exports should be reported at a higher debug level
- Exporting a subtree is a major operation that should be visible to an operator monitoring the debug log at a low sett...
- 03:58 AM Bug #20596 (Resolved): MDSMonitor: obsolete `mds dump` and other deprecated mds commands
- Dan at CERN was still using this and wondered how to see standbys. We should obsolete this command post-Luminous and ...
- 03:45 AM Bug #20595 (Resolved): mds: export_pin should be included in `get subtrees` output
- Currently it is only available by looking at the dumped cache. An indicator that a subtree is unsplittable and/or pin...
- 03:43 AM Bug #20594 (Resolved): mds: cache limits should be expressed in memory usage, not inode count
- Cached inode count (mds_cache_size) is an imperfect limit for what we really want. We frequently have to tell users/c...
- 03:34 AM Bug #19706: Laggy mon daemons causing MDS failover (symptom: failed to set counters on mds daemon...
- PR for debugging: https://github.com/ceph/ceph/pull/16278
- 03:10 AM Bug #20593 (Resolved): mds: the number of inodes shown by "mds perf dump" not correct after trimm...
- Currently, the mds only updates the inode count for "mds perf dump" in the function MDSRank::_dispatch;
this means it will ...
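A minimal sketch of the staleness pattern this entry describes (names are assumptions, not the actual MDSRank code): the gauge is refreshed only on the dispatch path, so trimming the cache elsewhere leaves the reported value stale until the next message is dispatched.

    #include <cstdio>

    struct Gauge { long value = 0; };
    struct Cache { long num_inodes = 100000; };

    struct RankSketch {
      Cache cache;
      Gauge inodes_gauge;
      void dispatch() { inodes_gauge.value = cache.num_inodes; }  // only update site
      void trim(long n) { cache.num_inodes -= n; }                // gauge untouched
    };

    int main() {
      RankSketch r;
      r.dispatch();   // gauge = 100000
      r.trim(50000);  // cache = 50000, gauge still 100000 (stale)
      std::printf("gauge=%ld cache=%ld\n", r.inodes_gauge.value, r.cache.num_inodes);
    }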
07/11/2017
- 07:32 PM Bug #20583 (Fix Under Review): mds: improve wording when mds respawns due to mdsmap removal
- https://github.com/ceph/ceph/pull/16270
- 04:36 PM Bug #20583 (Resolved): mds: improve wording when mds respawns due to mdsmap removal
- > 2017-07-11 07:07:55.397645 7ffb7a1d7700 1 mds.b handle_mds_map i (10.0.1.2:6822/28190) dne in the mdsmap, respawni...
- 05:18 PM Bug #20535: mds segmentation fault ceph_lock_state_t::get_overlapping_locks
- Ok. I'll do it on two small clusters tomorrow and I'll update these troublesome clusters next week.
Thanks a lot for ...
- 12:57 PM Bug #20535: mds segmentation fault ceph_lock_state_t::get_overlapping_locks
- Webert Lima wrote:
> Couple questions:
> - any recommendations choosing over notcmalloc or default flavors?
Gen...
- 12:41 PM Bug #20535: mds segmentation fault ceph_lock_state_t::get_overlapping_locks
- Patrick Donnelly wrote:
> New run: https://shaman.ceph.com/builds/ceph/i20535-backport/387e184970bc2949e16139db0cbda...
- 04:33 PM Bug #20582 (Resolved): common: config showing ints as floats
- Something broke in common/config.cc around printing values, the ints are coming out as floats, and it's breaking test...
- 03:42 PM Backport #20412: test_remote_update_write (tasks.cephfs.test_quota.TestQuota) fails in Jewel 10.2...
- h3. description
This test fails reproducibly on the wip-jewel-backports branch: ...
- 09:34 AM Bug #20569: mds: don't mark dirty rstat on non-auth inode
- -https://github.com/ceph/ceph/pull/16253-
With this fix, inodes can be trimmed successfully and no such warning ...
- 09:32 AM Bug #20569 (Resolved): mds: don't mark dirty rstat on non-auth inode
- Currently using multi-MDS on Luminous, we found ceph status reported such warning all the time if writing large amoun...
- 03:57 AM Bug #20566 (Resolved): "MDS health message (mds.0): Behind on trimming" in powercycle tests
- see http://pulpito.ceph.com/kchai-2017-07-11_02:07:09-powercycle-master-distro-basic-mira/
07/10/2017
- 09:01 PM Backport #20564 (In Progress): jewel: mds segmentation fault ceph_lock_state_t::get_overlapping_l...
- 08:45 PM Backport #20564 (Resolved): jewel: mds segmentation fault ceph_lock_state_t::get_overlapping_locks
- https://github.com/ceph/ceph/pull/16248
- 08:05 PM Bug #20535: mds segmentation fault ceph_lock_state_t::get_overlapping_locks
- New run: https://shaman.ceph.com/builds/ceph/i20535-backport/387e184970bc2949e16139db0cbda6acfa3f7b3a/
- 07:59 PM Bug #20535: mds segmentation fault ceph_lock_state_t::get_overlapping_locks
- Awesome! Thanks!
- 07:57 PM Bug #20535: mds segmentation fault ceph_lock_state_t::get_overlapping_locks
- I meant to link: https://shaman.ceph.com/builds/ceph/i20535-backport/2223e478c4b770e75cb7db196f5cd9d985929ac9/
Loo...
- 07:49 PM Bug #20535: mds segmentation fault ceph_lock_state_t::get_overlapping_locks
- Patrick Donnelly wrote:
> I went ahead and did it since our CI will build the repos for you: https://shaman.ceph.com...
- 07:46 PM Bug #20535 (Pending Backport): mds segmentation fault ceph_lock_state_t::get_overlapping_locks
- Backport PR: https://github.com/ceph/ceph/pull/16248
- 07:46 PM Bug #20535: mds segmentation fault ceph_lock_state_t::get_overlapping_locks
- > I'm not really comfortable in doing an upgrade like that, as the service and data availability is very critical here...
- 07:33 PM Bug #20535: mds segmentation fault ceph_lock_state_t::get_overlapping_locks
- Patrick Donnelly wrote:
> Zheng, PR#15440 indicates it's a multimds fix but Webert's setup is single MDS. Any issues...
- 07:21 PM Bug #20535: mds segmentation fault ceph_lock_state_t::get_overlapping_locks
- > it should be backported to jewel/kraken if feasible
Consider backport to jewel only, since kraken goes EOL the m...
- 05:56 PM Bug #20535 (In Progress): mds segmentation fault ceph_lock_state_t::get_overlapping_locks
- Zheng, PR#15440 indicates it's a multimds fix but Webert's setup is single MDS. Any issues you see backporting the fi...
- 01:51 PM Bug #20535: mds segmentation fault ceph_lock_state_t::get_overlapping_locks
- Patrick Donnelly wrote:
> Okay, it appears the deadlock is fixed.
I'm sorry. Do you refer to that commit? If so, ...
- 01:41 PM Bug #20535 (Closed): mds segmentation fault ceph_lock_state_t::get_overlapping_locks
- Okay, it appears the deadlock is fixed. Please open a new ticket if you're still seeing issues with rejoin taking unr...
- 01:02 PM Bug #20535: mds segmentation fault ceph_lock_state_t::get_overlapping_locks
- and this is the current session ls from the active MDS on each of the two clusters:
root@bhs1-mail01-ds02:~# ceph daemo...
- 12:46 PM Bug #20535: mds segmentation fault ceph_lock_state_t::get_overlapping_locks
- Zheng Yan wrote:
> how about other machines
Sorry I didn't check at the time. I'll post everything as it looks ...
- 03:38 AM Bug #20535: mds segmentation fault ceph_lock_state_t::get_overlapping_locks
- Webert Lima wrote:
> root@bhs1-mail02-ds05:~# cat /proc/sys/fs/file-nr
> 3360 0 3273932
> root@bhs1-mail0...
- 05:50 PM Bug #20537: mds: MDLog.cc: 276: FAILED assert(!capped)
- Zheng, please break out these fixes into separate PRs.
- 11:46 AM Bug #20537 (Fix Under Review): mds: MDLog.cc: 276: FAILED assert(!capped)
- https://github.com/ceph/ceph/pull/16068/commits/2b98f4701e9a12e50f8d017c93e5101eb02f7992
- 02:02 PM Bug #2494: mds: Cannot remove directory despite it being empty.
- Zheng Yan wrote:
> David Galloway wrote:
> > I've moved these dirs to ...
- 03:31 AM Bug #2494 (Resolved): mds: Cannot remove directory despite it being empty.
- 08:53 AM Bug #20549: cephfs-journal-tool: segfault during journal reset
- The reason is that Resetter::reset() creates an on-stack Journaler. The Journaler gets destroyed before receiving all pre...
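A hedged illustration of that lifetime bug (assumed names, not the actual Resetter/Journaler code): an on-stack object issues asynchronous reads, and its completion callbacks can fire after the object has gone out of scope.

    // Sketch only; `pending` stands in for whatever queue holds in-flight ops.
    #include <functional>
    #include <vector>

    std::vector<std::function<void()>> pending;

    struct JournalerSketch {
      int prefetched = 0;
      void issue_read() {
        pending.push_back([this] { prefetched++; });  // completion captures `this`
      }
    };

    void reset() {
      JournalerSketch j;  // on-stack, like the report above
      j.issue_read();
    }                      // j destroyed while its read is still pending

    int main() {
      reset();
      // Invoking the queued completion now would touch the destroyed object:
      // for (auto &cb : pending) cb();   // use-after-scope
      return 0;
    }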
07/08/2017
- 12:29 PM Bug #2494: mds: Cannot remove directory despite it being empty.
- David Galloway wrote:
> I've moved these dirs to ...
07/07/2017
- 10:18 PM Bug #2494: mds: Cannot remove directory despite it being empty.
- I've moved these dirs to ...
- 10:14 PM Bug #20072 (Fix Under Review): TestStrays.test_snapshot_remove doesn't handle head whiteout in pg...
- https://github.com/ceph/ceph/pull/16226
- 04:52 PM Bug #20072 (In Progress): TestStrays.test_snapshot_remove doesn't handle head whiteout in pgls re...
- 09:14 PM Bug #20549 (Resolved): cephfs-journal-tool: segfault during journal reset
- ...
- 12:47 PM Bug #20494: cephfs_data_scan: try_remove_dentries_for_stray assertion failure
- Zheng Yan wrote:
> If inode numbers were removed from inotable when replaying the journal, how did "assert(inode_map.cou...
- 10:53 AM Bug #20494: cephfs_data_scan: try_remove_dentries_for_stray assertion failure
- If inode numbers were removed from inotable when replaying the journal, how did "assert(inode_map.count(in->vino()) == 0)...
- 09:38 AM Bug #20494: cephfs_data_scan: try_remove_dentries_for_stray assertion failure
- Zheng Yan wrote:
> you can use cephfs-table-tool manually remove used inodes from inotable. (use "rados -p data ls",...
- 09:13 AM Bug #20494: cephfs_data_scan: try_remove_dentries_for_stray assertion failure
- you can use cephfs-table-tool to manually remove used inodes from the inotable. (use "rados -p data ls", "rados -p metadata ...
- 07:59 AM Bug #20494: cephfs_data_scan: try_remove_dentries_for_stray assertion failure
- Zheng Yan wrote:
> I fixed a cephfs-data-scan bug. https://github.com/ceph/ceph/pull/16202. It can explain this inco...
- 07:47 AM Bug #20494: cephfs_data_scan: try_remove_dentries_for_stray assertion failure
- Thank you, I tested running "ceph daemon mds.a scrub_path / force recursive repair" after scan_inodes. As you s...
- 06:24 AM Bug #20494: cephfs_data_scan: try_remove_dentries_for_stray assertion failure
- I fixed a cephfs-data-scan bug. https://github.com/ceph/ceph/pull/16202. It can explain this inconsistent state.
- 06:23 AM Bug #20494: cephfs_data_scan: try_remove_dentries_for_stray assertion failure
- Ivan Guan wrote:
> Ivan Guan wrote:
> > After careful consideration, I tried to fix this bug using the first solution.
...
- 03:43 AM Bug #20494: cephfs_data_scan: try_remove_dentries_for_stray assertion failure
- Ivan Guan wrote:
> After careful consideration, I tried to fix this bug using the first solution.
> function inject_with...
- 12:41 PM Bug #20535: mds segmentation fault ceph_lock_state_t::get_overlapping_locks
root@bhs1-mail02-ds05:~# cat /proc/sys/fs/file-nr
3360 0 3273932
root@bhs1-mail02-ds05:~# cat /proc/sys/...
- 12:34 PM Bug #20535: mds segmentation fault ceph_lock_state_t::get_overlapping_locks
- {
"id": 1029565,
"num_leases": 0,
*"num_caps": 2588906,*
"state": "open",
"replay_requests": 0,
"completed_requ... - 12:20 PM Bug #20535: mds segmentation fault ceph_lock_state_t::get_overlapping_locks
- There is something interesting here. The host "bhs1-mail02-ds05.m9.network" has the highest number of caps (over 2.5M...
- 11:45 AM Bug #20535: mds segmentation fault ceph_lock_state_t::get_overlapping_locks
- Running it today on the same cluster that crashed (client usage is about the same at this time of day):
~# ceph daemon mds...
- 03:19 AM Bug #20535: mds segmentation fault ceph_lock_state_t::get_overlapping_locks
- I recently found a bug. It can explain the crash.
https://github.com/ceph/ceph/pull/15440...
- 10:15 AM Bug #17435: Crash in ceph-fuse in ObjectCacher::trim while adding an OSD
- The ceph-fuse client crashes when the top application is reading; the version is 10.2.3.
bh_lru_rest contains a wrong state...
- 06:10 AM Bug #20440: mds: mds/journal.cc: 1559: FAILED assert(inotablev == mds->inotable->get_version())
The recover_dentries command of the journal tool only injects links, but never deletes old links. This causes duplicated p...
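A hedged sketch of the duplication this comment describes (assumed data shapes, not the actual recover_dentries code): injecting a recovered link without deleting the old one leaves two dentries referring to the same inode.

    // Illustrative only; a directory is modeled as name -> inode number.
    #include <cassert>
    #include <cstdint>
    #include <map>
    #include <string>

    using Dir = std::map<std::string, uint64_t>;

    // Buggy injection: adds the recovered link, never removes the stale one.
    void inject_link(Dir &dir, const std::string &name, uint64_t ino) {
      dir[name] = ino;
    }

    int main() {
      Dir dir = {{"old_name", 42}};     // stale link from before recovery
      inject_link(dir, "new_name", 42); // recovered link for the same inode
      int links = 0;
      for (const auto &d : dir)
        if (d.second == 42) links++;
      assert(links == 2);  // inode 42 now has two primary dentries
      return 0;
    }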
- 02:06 AM Bug #20440: mds: mds/journal.cc: 1559: FAILED assert(inotablev == mds->inotable->get_version())
- Log snippet from: /ceph/teuthology-archive/pdonnell-2017-07-01_01:07:39-fs-wip-pdonnell-20170630-distro-basic-smithi/...
- 02:06 AM Bug #20440: mds: mds/journal.cc: 1559: FAILED assert(inotablev == mds->inotable->get_version())
- Zheng, adding that patch lets the test make progress but there still appears to be a problem:...
- 05:31 AM Feature #10792: qa: enable thrasher for MDS cluster size (vary max_mds)
- https://github.com/ceph/ceph/pull/16200
- 05:29 AM Feature #19230: Limit MDS deactivation to one at a time
- thrasher now deactivates one at a time: https://github.com/ceph/ceph/pull/15950
- 05:19 AM Bug #20212 (Resolved): test_fs_new failure on race between pool creation and appearance in `df`
- 04:47 AM Backport #20412 (Resolved): test_remote_update_write (tasks.cephfs.test_quota.TestQuota) fails in...
- 04:42 AM Bug #20376 (Resolved): last_epoch_(over|under) in MDBalancer should be updated if mds0 has failed
- 04:41 AM Bug #20254 (Resolved): mds: coverity error in Server::_rename_prepare
- 04:40 AM Bug #20318 (Resolved): Race in TestExports.test_export_pin
- 04:35 AM Bug #16914 (Resolved): multimds: pathologically slow deletions in some tests
07/06/2017
- 09:30 PM Bug #20537: mds: MDLog.cc: 276: FAILED assert(!capped)
- Zheng, please look at this one.
- 09:30 PM Bug #20537 (Resolved): mds: MDLog.cc: 276: FAILED assert(!capped)
- ...
- 08:48 PM Backport #20140: jewel: Journaler may execute on_safe contexts prematurely
- Original backport (the one that went into 10.2.8) was incomplete. See #20536 for the completion.
- 08:46 PM Backport #20536 (In Progress): jewel: Journaler may execute on_safe contexts prematurely (part 2)
- 08:45 PM Backport #20536 (Rejected): jewel: Journaler may execute on_safe contexts prematurely (part 2)
- https://github.com/ceph/ceph/pull/16192
- 08:20 PM Backport #20028 (In Progress): kraken: Deadlock on two ceph-fuse clients accessing the same file
- 08:19 PM Bug #20535 (Resolved): mds segmentation fault ceph_lock_state_t::get_overlapping_locks
- The Active MDS crashes, all clients freeze and the standby (or standby-replay) daemon takes hours to recover.
cep...
- 08:17 PM Backport #20026 (In Progress): kraken: cephfs: MDS became unresponsive when truncating a very lar...
- 01:46 PM Bug #20494: cephfs_data_scan: try_remove_dentries_for_stray assertion failure
- This inconsistent state is created by cephfs-data-scan? I think we can avoid injecting dentries into the stray directory (inj...
- 01:32 PM Bug #20494: cephfs_data_scan: try_remove_dentries_for_stray assertion failure
- After careful consideration, I tried to fix this bug using the first solution.
function inject_with_backtrace will traver...
- 01:29 PM Bug #20467: Ceph FS kernel client not consistent
- Yunzhi Cheng wrote:
> Zheng Yan wrote:
> > Please try
> >
> > https://github.com/ceph/ceph-client/commit/6c51866...
- 10:04 AM Bug #20467: Ceph FS kernel client not consistent
- Zheng Yan wrote:
> Please try
>
> https://github.com/ceph/ceph-client/commit/6c51866e67dda544b9be524185a526cf4ace...
- 04:20 AM Bug #20467: Ceph FS kernel client not consistent
- Please try
https://github.com/ceph/ceph-client/commit/6c51866e67dda544b9be524185a526cf4ace865a
- 02:58 AM Backport #20349 (In Progress): jewel: df reports negative disk "used" value when quota exceeded
- 02:53 AM Backport #20403 (In Progress): jewel: cephfs permission denied until second client accesses file
07/05/2017
- 02:21 PM Feature #16523 (Resolved): Assert directory fragmentation is occurring during stress tests
- ...
- 01:33 PM Bug #17731 (Can't reproduce): MDS stuck in stopping with other rank's strays
- This code has all changed a lot since then.
- 09:02 AM Bug #20467: Ceph FS kernel client not consistent
- Yunzhi Cheng wrote:
>
> Seems it's an error; should I do echo module ceph +p > /sys/kernel/debug/dynamic_debug ?
>
...
- 08:30 AM Bug #20467: Ceph FS kernel client not consistent
- Zheng Yan wrote:
> ceph-mds version and configuration; dirfrag or multimds enabled?
>
> please describe the workload you ...
- 07:22 AM Bug #20467: Ceph FS kernel client not consistent
- ceph-mds version and configuration; dirfrag or multimds enabled?
please describe the workload you put on cephfs, how many ...
- 03:08 AM Bug #20467: Ceph FS kernel client not consistent
- Zheng Yan wrote:
> I checked the Ubuntu kernel. It does not contain the following commit. I suspect it's the cause. Could yo...
07/04/2017
- 09:13 PM Backport #20500 (In Progress): kraken: src/test/pybind/test_cephfs.py fails
- 09:12 PM Backport #20500 (Resolved): kraken: src/test/pybind/test_cephfs.py fails
- https://github.com/ceph/ceph/pull/16114
- 09:10 PM Bug #19890 (Pending Backport): src/test/pybind/test_cephfs.py fails
- 11:30 AM Backport #19763 (In Progress): kraken: non-local cephfs quota changes not visible until some IO i...
- 11:26 AM Backport #19710 (In Progress): kraken: Enable MDS to start when session ino info is corrupt
- 11:17 AM Backport #19680 (In Progress): kraken: MDS: damage reporting by ino number is useless
- 11:12 AM Backport #19678 (In Progress): kraken: Jewel ceph-fuse does not recover after lost connection to MDS
- 11:08 AM Backport #19676 (In Progress): kraken: cephfs: Test failure: test_data_isolated (tasks.cephfs.tes...
- 11:06 AM Backport #19674 (In Progress): kraken: cephfs: mds crashes after I set about 400 64KB xattr k...
- 11:04 AM Backport #19672 (In Progress): kraken: MDS assert failed when shutting down
- 10:58 AM Backport #19669 (In Progress): kraken: MDS goes readonly writing backtrace for a file whose data ...
- 10:36 AM Backport #19667 (In Progress): kraken: fs: The mount point breaks off when an mds switch happens.
- 10:28 AM Bug #20494 (Closed): cephfs_data_scan: try_remove_dentries_for_stray assertion failure
- Running data-scan.yaml via teuthology; when the test_data_scan.py:test_parallel_execution test case completed, I want to ...
- 10:07 AM Backport #19664 (In Progress): kraken: C_MDSInternalNoop::complete doesn't free itself
07/03/2017
- 04:12 PM Bug #20424 (Resolved): doc: improve description of `mds deactivate` to better contrast with `mds ...
- 01:48 PM Bug #20424 (Fix Under Review): doc: improve description of `mds deactivate` to better contrast wi...
- In the absence of inspiration for better names:
https://github.com/ceph/ceph/pull/16080
It would be really nice...
- 01:38 PM Bug #20424: doc: improve description of `mds deactivate` to better contrast with `mds fail`
- It may be useful to change the names of these commands to be more intuitive.
- 03:45 AM Bug #20469: Ceph Client can't access file and show '???'
- Yunzhi Cheng wrote:
> Zheng Yan wrote:
> > this one and http://tracker.ceph.com/issues/20467 may be caused by the s...
07/01/2017
- 02:41 PM Bug #20469: Ceph Client can't access file and show '???'
- Zheng Yan wrote:
> this one and http://tracker.ceph.com/issues/20467 may be caused by the same bug. please try the n...
06/30/2017
- 07:43 PM Bug #19706: Laggy mon daemons causing MDS failover (symptom: failed to set counters on mds daemon...
- Here's another: /a/pdonnell-2017-06-27_19:50:40-fs-wip-pdonnell-20170627---basic-smithi/1333703
- 02:08 PM Bug #20469: Ceph Client can't access file and show '???'
- this one and http://tracker.ceph.com/issues/20467 may be caused by the same bug. please try the newest upstream kernel
- 07:19 AM Bug #20469 (Need More Info): Ceph Client can't access file and show '???'
- Very strange behavior...
- 10:53 AM Bug #20467: Ceph FS kernel client not consistent
- I checked the Ubuntu kernel. It does not contain the following commit. I suspect it's the cause. Could you please try the newest ...
- 03:01 AM Bug #20467: Ceph FS kernel client not consistent
- ...
- 02:55 AM Bug #20467: Ceph FS kernel client not consistent
- kernel version is 4.4.0-46
- 02:43 AM Bug #20467: Ceph FS kernel client not consistent
- kernel version ?
- 02:40 AM Bug #20467 (Resolved): Ceph FS kernel client not consistent
- I use 'ls' to list files in one directory; two clients have different results...
06/29/2017
- 09:45 AM Bug #20440 (Fix Under Review): mds: mds/journal.cc: 1559: FAILED assert(inotablev == mds->inotabl...
- Caused by https://github.com/ceph/ceph/pull/15844.
Needs incremental patch https://github.com/ukernel/ceph/commit/47...
06/28/2017
- 07:37 PM Bug #20337: test_rebuild_simple_altpool triggers MDS assertion
- 05:33 PM Bug #20452: Adding pool with id smaller than existing data pool ids breaks MDSMap::is_data_pool
- https://github.com/ceph/ceph/pull/15982
- 05:09 PM Bug #20452 (Fix Under Review): Adding pool with id smaller than existing data pool ids breaks MDS...
- 04:14 PM Bug #20452: Adding pool with id smaller than existing data pool ids breaks MDSMap::is_data_pool
- Seems like the implementation of MDSMap::is_data_pool makes a wrong assumption by using binary_search.
data_pools is...
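A minimal, self-contained illustration of the pitfall this comment points at (the container type is an assumption, not the actual MDSMap code): std::binary_search requires a sorted range, so on an unsorted data_pools vector it can miss a pool id that is present.

    #include <algorithm>
    #include <cassert>
    #include <cstdint>
    #include <vector>

    int main() {
      // pool 1 was added after pool 2, so the vector is not sorted
      std::vector<int64_t> data_pools = {2, 1};

      // binary_search assumes a sorted range; here its answer is unreliable
      bool by_bsearch =
          std::binary_search(data_pools.begin(), data_pools.end(), 1);

      // a linear find is correct regardless of ordering
      bool by_find =
          std::find(data_pools.begin(), data_pools.end(), 1) != data_pools.end();

      assert(by_find);
      (void)by_bsearch;  // with common implementations this is false here
      return 0;
    }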
- 04:09 PM Bug #20452: Adding pool with id smaller than existing data pool ids breaks MDSMap::is_data_pool
- Removing pool 1 fixes the issue:
ceph fs rm_data_pool fs 1
Creating files now works again.
- 04:06 PM Bug #20452 (Resolved): Adding pool with id smaller than existing data pool ids breaks MDSMap::is_...
- To reproduce:
Set up a ceph cluster with an mds but don't create the fs yet (nor the necessary pools). Equivalent to the resul...
- 03:29 PM Bug #20441 (Fix Under Review): mds: failure during data scan
- https://github.com/ceph/ceph/pull/15979
- 03:20 PM Bug #20441: mds: failure during data scan
- That error is whitelisted; the relevant error is:
failure_reason: '"2017-06-27 21:09:57.713761 mds.a mds.0 172.21....
- 01:02 AM Bug #20441: mds: failure during data scan
- Doug, please take a look at this one.
- 01:01 AM Bug #20441 (Resolved): mds: failure during data scan
- ...
- 12:34 AM Bug #20440: mds: mds/journal.cc: 1559: FAILED assert(inotablev == mds->inotable->get_version())
- Zheng, please take a look at this one.
- 12:34 AM Bug #20440 (Resolved): mds: mds/journal.cc: 1559: FAILED assert(inotablev == mds->inotable->get_v...
- From:
http://qa-proxy.ceph.com/teuthology/pdonnell-2017-06-27_19:50:40-fs-wip-pdonnell-20170627---basic-smithi/133...
06/27/2017
- 04:11 PM Bug #20072: TestStrays.test_snapshot_remove doesn't handle head whiteout in pgls results
- FWIW that is basically what we did with the rados api test cleanup failures (loop waiting for snaptrimmer to do its th...
- 04:59 AM Bug #20424 (Resolved): doc: improve description of `mds deactivate` to better contrast with `mds ...
- Currently the help output is not very useful for `ceph mds deactivate`:...
- 02:21 AM Backport #20412: test_remote_update_write (tasks.cephfs.test_quota.TestQuota) fails in Jewel 10.2...
- Also: https://github.com/ceph/ceph/pull/15937
- 02:07 AM Backport #20412 (Fix Under Review): test_remote_update_write (tasks.cephfs.test_quota.TestQuota) ...
- John, that looks like the problem. Here's a PR:
https://github.com/ceph/ceph/pull/15936
06/26/2017
- 08:25 PM Backport #20027 (Resolved): jewel: Deadlock on two ceph-fuse clients accessing the same file
- 08:24 PM Backport #19846 (Resolved): jewel: write to cephfs mount hangs, ceph-fuse and kernel
- 08:21 PM Backport #20412: test_remote_update_write (tasks.cephfs.test_quota.TestQuota) fails in Jewel 10.2...
- Aargh, I think this might just be failing because this is a new test that was written for luminous, where client_quot...
- 07:49 PM Backport #20412 (In Progress): test_remote_update_write (tasks.cephfs.test_quota.TestQuota) fails...
- I dug into the logs. It looks like the MDS is not sending a quota update to the client. From a brief look at the code...
- 07:30 PM Bug #20337 (New): test_rebuild_simple_altpool triggers MDS assertion
- 06:54 PM Bug #20337: test_rebuild_simple_altpool triggers MDS assertion
- I'm not seeing how Filesystem.are_daemons_healthy is waiting for daemons outside the filesystem: it's inspecting daem...
- 06:42 PM Bug #20337 (Need More Info): test_rebuild_simple_altpool triggers MDS assertion
- wait_for_daemons should wait for every daemon regardless of filesystem. Is there a failure log I can look at?
- 05:26 AM Backport #20140 (Resolved): jewel: Journaler may execute on_safe contexts prematurely
06/25/2017
- 07:56 AM Backport #20412 (Resolved): test_remote_update_write (tasks.cephfs.test_quota.TestQuota) fails in...
- https://github.com/ceph/ceph/pull/15936
06/23/2017
- 08:14 PM Backport #20404 (Rejected): kraken: cephfs permission denied until second client accesses file
- 08:14 PM Backport #20403 (Resolved): jewel: cephfs permission denied until second client accesses file
- https://github.com/ceph/ceph/pull/16150
- 12:04 PM Backport #20148 (Resolved): jewel: Too many stat ops when MDS trying to probe a large file
- 08:05 AM Bug #20334: I/O becomes slow with multi-MDS when the subtree root has a replica
- try uploading it somewhere else or send it to my email zyan@redhat.com
- 07:51 AM Bug #20334: I/O becomes slow with multi-MDS when the subtree root has a replica
- yanmei ding wrote:
> yanmei ding wrote:
> > Zheng Yan wrote:
> > > please upload a detailed log for the slow case.
...
- 07:47 AM Bug #20334: I/O becomes slow with multi-MDS when the subtree root has a replica
- yanmei ding wrote:
> Zheng Yan wrote:
> > please upload a detailed log for the slow case.
>
> This is a detailed l...
- 07:46 AM Bug #20334: I/O becomes slow with multi-MDS when the subtree root has a replica
- Zheng Yan wrote:
> please upload a detailed log for the slow case.
This is a detailed log.
Thank you!
- 06:43 AM Bug #20334: I/O becomes slow with multi-MDS when the subtree root has a replica
- please upload a detailed log for the slow case.
- 02:05 AM Bug #20334: I/O becomes slow with multi-MDS when the subtree root has a replica
- John Spray wrote:
> In that case I suggest you wait for 12.1.0 to see if the issue is fixed there.
John Spray: I ...
- 03:54 AM Bug #20376 (Fix Under Review): last_epoch_(over|under) in MDBalancer should be updated if mds0 ha...
- 12:17 AM Bug #20376: last_epoch_(over|under) in MDBalancer should be updated if mds0 has failed
- There is a merge request for this bug fix: https://github.com/ceph/ceph/pull/15825, could you review it? @Patrick
06/22/2017
- 10:51 PM Bug #20376: last_epoch_(over|under) in MDBalancer should be updated if mds0 has failed
- 02:20 AM Bug #20376 (Resolved): last_epoch_(over|under) in MDBalancer should be updated if mds0 has failed
- When mds0 has failed and started up again, it will reset beat_epoch to zero. In this case, other MDSes should update ...
- 05:28 PM Bug #20122: Ceph MDS crash with assert failure
- Are you able to reliably reproduce this? Do you have any MDS logs during the failure?
- 11:08 AM Bug #20340 (Pending Backport): cephfs permission denied until second client accesses file
- 11:07 AM Bug #20338 (Resolved): mem leak in Journaler::_issue_read() in ceph-mds
- 11:06 AM Bug #20165 (Resolved): Deadlock during shutdown in PurgeQueue::_consume
- 10:57 AM Bug #19706: Laggy mon daemons causing MDS failover (symptom: failed to set counters on mds daemon...
- The fix is not working in at least some cases. Here's a smoking gun failure:
http://pulpito.ceph.com/jspray-2017-...
- 10:47 AM Feature #20196: mds: early reintegration of strays on hardlink deletion
- Zheng's patch for the special case (both links in cache at time of primary unlink) is merged for luminous -- hopefull...
06/21/2017
- 10:06 PM Bug #20212 (Fix Under Review): test_fs_new failure on race between pool creation and appearance i...
- https://github.com/ceph/ceph/pull/15822
- 08:36 PM Bug #20212 (In Progress): test_fs_new failure on race between pool creation and appearance in `df`
- 09:12 PM Bug #20254 (Fix Under Review): mds: coverity error in Server::_rename_prepare
- https://github.com/ceph/ceph/pull/15818
- 08:34 PM Bug #20318 (Fix Under Review): Race in TestExports.test_export_pin
- https://github.com/ceph/ceph/pull/15817
- 02:58 PM Bug #20334: I/O becomes slow with multi-MDS when the subtree root has a replica
- In that case I suggest you wait for 12.1.0 to see if the issue is fixed there.
- 12:52 AM Bug #20334: I/O becomes slow with multi-MDS when the subtree root has a replica
- John Spray wrote:
> yanmei ding: there have been fixes on master since 12.0.3, please could you retest with the tip ...
- 10:22 AM Bug #20340: cephfs permission denied until second client accesses file
- Thanks for this patch. It seems to fix the problem for our users.
- 07:57 AM Bug #20340 (Fix Under Review): cephfs permission denied until second client accesses file
- https://github.com/ceph/ceph/pull/15800
06/20/2017
- 06:25 PM Bug #20170 (Resolved): filelock_interrupt.py fails on multimds
- 06:21 PM Documentation #13311 (Resolved): explain user permission syntax, details
- 06:21 PM Documentation #13311: explain user permission syntax, details
- Not sure why this got made a FS ticket, but fortunately I wrote the docs for cephfs client auth caps a while ago so t...
- 06:20 PM Feature #8786 (Resolved): ceph kernel module for el7
- The CephFS kernel module has been in RHEL since 7.4. The kmod-* packages are discontinued as per the note at https://github.co...
- 06:15 PM Bug #20060 (Resolved): segmentation fault in _do_cap_update
- 06:14 PM Bug #20131 (Resolved): mds/MDBalancer: update MDSRank export_targets according to current balance...
- 06:14 PM Bug #20335 (Resolved): test_migration_on_shutdown, test_grow_shrink failing
- 01:24 PM Bug #20335 (Fix Under Review): test_migration_on_shutdown, test_grow_shrink failing
- https://github.com/ceph/ceph/pull/15768
- 06:09 PM Bug #16914 (Fix Under Review): multimds: pathologically slow deletions in some tests
- It looks like this case is now working properly with the latest code, so flipping this ticket to need review and remo...
- 02:03 PM Bug #18641 (Can't reproduce): mds: stalled clients apparently due to stale sessions
- 02:02 PM Bug #17069 (Closed): multimds: slave rmdir assertion failure
- Closing because currently we know that snapshots+multimds is broken.
- 02:00 PM Bug #16925 (Can't reproduce): multimds: cfuse (?) hang on fsx.sh workunit
- 01:54 PM Bug #20334: I/O becomes slow with multi-MDS when the subtree root has a replica
- yanmei ding: there have been fixes on master since 12.0.3, please could you retest with the tip of master?
- 01:08 PM Bug #20334: I/O becomes slow with multi-MDS when the subtree root has a replica
- Zheng Yan wrote:
> yanmei ding wrote:
> > John Spray wrote:
> > > Yanmei Ding: can you be more specific about how ...
- 08:03 AM Bug #20334: I/O becomes slow with multi-MDS when the subtree root has a replica
- yanmei ding wrote:
> John Spray wrote:
> > Yanmei Ding: can you be more specific about how to reproduce this or wha...
- 01:20 AM Bug #20334: I/O becomes slow with multi-MDS when the subtree root has a replica
- John Spray wrote:
> Yanmei Ding: can you be more specific about how to reproduce this or what is going wrong interna...
- 01:49 PM Fix #20246 (In Progress): Make clog message on scrub errors friendlier.
- 01:47 PM Bug #20282 (Closed): qa: missing even trivial tests for many commands
- 01:44 PM Bug #20282: qa: missing even trivial tests for many commands
- The script just greps for anything that looks like a COMMAND and then greps for their existence in qa/ and src/tests...
- 01:44 PM Bug #20329 (Resolved): Ceph file system hang on Jewel
- Resolving; the patch will show up in a stable release in due course.
- 01:42 AM Bug #20329: Ceph file system hang on Jewel
- Eric Eastman wrote:
> The number in the first column changes. Here is the output running the command in a while loop... - 11:05 AM Bug #20338 (Fix Under Review): mem leak in Journaler::_issue_read() in ceph-mds
- https://github.com/ceph/ceph/pull/15776
- 09:49 AM Bug #20340: cephfs permission denied until second client accesses file
- Yup, in this case the diri is_stray (it looks like this...
- 09:28 AM Bug #20340: cephfs permission denied until second client accesses file
- Ahh so it *is* related to path-restricted cap.
I tried as above with client B having the same client caps -- didn't ...
- 09:18 AM Bug #20340: cephfs permission denied until second client accesses file
- I've confirmed that none of these help resolve these EPERM files:
* restart the ceph-fuse on client A
* mount...
- 09:29 AM Bug #17858: Cannot create deep directories when caps contain "path=/somepath"
- Just pinging this to say that there remain some issues with path-restricted caps, as shown in #20340.
06/19/2017
- 08:52 PM Backport #20350 (Rejected): kraken: df reports negative disk "used" value when quota exceeded
- 08:52 PM Backport #20349 (Resolved): jewel: df reports negative disk "used" value when quota exceeded
- https://github.com/ceph/ceph/pull/16151
- 03:54 PM Bug #20338: mem leak in Journaler::_issue_read() in ceph-mds
- Was there a teuthology run where this was happening?
- 10:57 AM Bug #20338 (Resolved): mem leak in Journaler::_issue_read() in ceph-mds
- ...
- 01:47 PM Bug #20341 (Duplicate): test_migration_on_shutdown fails on master
- 01:29 PM Bug #20341 (Duplicate): test_migration_on_shutdown fails on master
- http://qa-proxy.ceph.com/teuthology/teuthology-2017-06-19_03:15:05-fs-master-distro-basic-smithi/1300512/teuthology.l...
- 01:30 PM Bug #20340: cephfs permission denied until second client accesses file
- I should mention that while client A was a user that has a path-restricted mds cap, the client B that "fixes" the EPE...
- 01:25 PM Bug #20340 (Resolved): cephfs permission denied until second client accesses file
- Here is a file on which client A gets permission denied during stat:...
- 12:49 PM Bug #20329: Ceph file system hang on Jewel
- Eric: that's a conversation to have with whoever is providing your kernel -- the kernel bits of Ceph are not part of ...
- 12:38 PM Bug #20329: Ceph file system hang on Jewel
- The number in the first column changes. Here is the output running the command in a while loop, once a second. Every ...
- 09:19 AM Bug #20329: Ceph file system hang on Jewel
- ...
- 12:41 PM Bug #20178 (Pending Backport): df reports negative disk "used" value when quota exceeded
- 10:53 AM Bug #20282: qa: missing even trivial tests for many commands
- Greg: can you say which script you're looking to cover these commands in? Things like session kill would be pretty a...
- 10:50 AM Bug #20272 (Rejected): Ceph OSD & MDS Failure
- I don't think there's anything to be done with this right now -- feel free to reopen if there's some other evidence t...
- 10:47 AM Bug #20334: I/O becomes slow with multi-MDS when the subtree root has a replica
- Yanmei Ding: can you be more specific about how to reproduce this or what is going wrong internally?
- 10:34 AM Bug #20337 (Resolved): test_rebuild_simple_altpool triggers MDS assertion
- Two things are going wrong here, I think:
* The test code is doing a self.fs.wait_for_daemons() (test_data_scan.py:4...
- 10:18 AM Bug #20335 (Resolved): test_migration_on_shutdown, test_grow_shrink failing
- Seems to be happening repeatedly since June 10. Latest fs-master failures:
http://pulpito.ceph.com/teuthology-201...
- 09:35 AM Bug #20313 (Fix Under Review): Assertion in handle_dir_update
- https://github.com/ceph/ceph/pull/15510/commits/1a5fd47880229d69a6ea484e662e8b8280ff5158
06/18/2017
- 05:47 PM Bug #20328 (Duplicate): Test failure: test_export_pin (tasks.cephfs.test_exports.TestExports)
- 02:24 PM Bug #20334 (Resolved): I/O becomes slow with multi-MDS when the subtree root has a replica
06/16/2017
- 04:22 PM Bug #20329 (Resolved): Ceph file system hang on Jewel
- We are running Ceph 10.2.7 and after adding a new multi-threaded writer application we are seeing hangs accessing met...
- 01:24 PM Bug #20328 (Duplicate): Test failure: test_export_pin (tasks.cephfs.test_exports.TestExports)
- http://qa-proxy.ceph.com/teuthology/jspray-2017-06-15_02:50:24-multimds-wip-jcsp-testing-20170614-testing-basic-smith...
06/15/2017
- 02:51 PM Bug #19706 (Resolved): Laggy mon daemons causing MDS failover (symptom: failed to set counters on...
- 02:28 PM Bug #20318 (Resolved): Race in TestExports.test_export_pin
- Seen failure here:
http://pulpito.ceph.com/jspray-2017-06-15_02:50:24-multimds-wip-jcsp-testing-20170614-testing-bas...
- 02:13 PM Bug #20313 (Resolved): Assertion in handle_dir_update
- Seen in test branch that had the following PRs in it:
[15125] mds: miscellaneous multimds fixes part2
[15510] mds...