Activity

From 06/25/2017 to 07/24/2017

07/24/2017

08:01 PM Bug #20761 (Resolved): fs status: KeyError in handle_fs_status
... Sage Weil
06:14 PM Feature #20760 (Resolved): mds: add perf counters for all mds-to-mds messages
The idea here is to get a better picture of what the MDSs are doing through external tools (graphs). Continuation of #19362. Patrick Donnelly

07/23/2017

08:38 AM Feature #20752 (Resolved): cap message flag which indicates if client still has pending capsnap
The current MDS code uses "(cap->issued() & CEPH_CAP_ANY_FILE_WR) == 0" to infer that the client has no pending capsnap. Ther... Zheng Yan
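A minimal sketch of the two checks (only the issued-caps expression is quoted from this ticket; the constants and helper names below are hypothetical, invented for illustration):
<pre>
#include <cstdint>

constexpr uint32_t CEPH_CAP_ANY_FILE_WR = 0x0ff0;      // placeholder value
constexpr uint32_t CLIENT_CAPS_PENDING_CAPSNAP = 0x1;  // hypothetical flag

// Today: infer "no pending capsnap" from the issued caps. A client
// that dropped its file write caps but still holds a capsnap is
// misclassified.
bool no_capsnap_inferred(uint32_t issued) {
  return (issued & CEPH_CAP_ANY_FILE_WR) == 0;
}

// Proposed: the client reports the state explicitly via a flag in the
// cap message, so the MDS no longer has to guess.
bool no_capsnap_explicit(uint32_t cap_msg_flags) {
  return (cap_msg_flags & CLIENT_CAPS_PENDING_CAPSNAP) == 0;
}
</pre>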

07/21/2017

10:30 PM Bug #20594 (In Progress): mds: cache limits should be expressed in memory usage, not inode count
Patrick Donnelly
08:31 PM Bug #20682 (Resolved): qa: test_client_pin looking for wrong health warning string
Patrick Donnelly
08:31 PM Bug #20569 (Resolved): mds: don't mark dirty rstat on non-auth inode
Patrick Donnelly
08:29 PM Bug #20592 (Pending Backport): client::mkdirs does not handle well when two clients send mkdir requests...
Patrick Donnelly
08:28 PM Bug #20583 (Resolved): mds: improve wording when mds respawns due to mdsmap removal
Patrick Donnelly
08:27 PM Bug #20072 (Resolved): TestStrays.test_snapshot_remove doesn't handle head whiteout in pgls results
Patrick Donnelly
04:00 PM Bug #20535: mds segmentation fault ceph_lock_state_t::get_overlapping_locks
Patrick Donnelly wrote:
> See this announcement: http://ceph.com/geen-categorie/v10-2-4-jewel-released/
Thank you...
Webert Lima
03:30 PM Bug #20535: mds segmentation fault ceph_lock_state_t::get_overlapping_locks
Webert Lima wrote:
> Patrick Donnelly wrote:
> > Any update?
> Hey Patrick, I have upgraded one test cluster first,...
Patrick Donnelly
11:52 AM Bug #20535: mds segmentation fault ceph_lock_state_t::get_overlapping_locks
Patrick Donnelly wrote:
> Any update?
Hey Patrick, I have upgraded one test cluster first, but it remains HEALTH_WA...
Webert Lima
05:56 AM Feature #20607: MDSMonitor: change "mds deactivate" to clearer "mds rejoin"
Proposed doc fix: https://github.com/ceph/ceph/pull/16471 Jan Fajerski
04:15 AM Bug #20735 (New): mds: stderr:gzip: /var/log/ceph/ceph-mds.f.log: file size changed while zipping
Some logs are still being written to after the MDS is terminated. This only happens with valgrind and tasks/cfuse_wor... Patrick Donnelly

07/20/2017

11:49 PM Bug #16881: RuntimeError: Files in flight high water is unexpectedly low (0 / 6)
Zheng, please take a look. Patrick Donnelly
11:47 PM Bug #20118 (Duplicate): Test failure: test_ops_throttle (tasks.cephfs.test_strays.TestStrays)
Patrick Donnelly
11:15 PM Bug #20118: Test failure: test_ops_throttle (tasks.cephfs.test_strays.TestStrays)
Patrick Donnelly
11:44 PM Bug #16920: mds.inodes* perf counters sound like the number of inodes but they aren't
Patrick Donnelly
11:43 PM Bug #17837 (Resolved): ceph-mon crashed after upgrade from hammer 0.94.7 to jewel 10.2.3
Patrick Donnelly
11:40 PM Bug #20535: mds segmentation fault ceph_lock_state_t::get_overlapping_locks
Any update? Patrick Donnelly
11:34 PM Bug #8807 (Closed): multimds: kernel_untar_build.sh is failing to remove all files
Haven't seen this with the latest multimds fixes (probably thanks to Zheng) and the test files are no longer availabl... Patrick Donnelly
11:33 PM Bug #10542 (Resolved): ceph-fuse cap trimming fails with: mount: only root can use "--options" op...
This is caught during startup now and causes ceph-fuse to fail unless --client-die-on-failed-remount=false is set. Ma... Patrick Donnelly
11:28 PM Bug #11314 (In Progress): qa: MDS crashed and the runs hung without ever timing out
Patrick Donnelly
11:28 PM Bug #11314: qa: MDS crashed and the runs hung without ever timing out
I added a DaemonWatchdog in the mds_thrash.py code that catches this kind of thing. We should pull it out into its ow... Patrick Donnelly
11:25 PM Bug #11986 (Closed): logs changing during tarball generation at end of job
Haven't seen this one recently. Closing. Patrick Donnelly
11:17 PM Bug #19255 (Need More Info): qa: test_full_fclose failure
John, do you have a test failure to point to? Patrick Donnelly
11:16 PM Bug #19712: some kcephfs tests become very slow
Any update on this one Zheng? Patrick Donnelly
11:15 PM Bug #19812: client: not swapping directory caps efficiently leads to very slow create chains
Patrick Donnelly
09:51 PM Feature #20609: MDSMonitor: add new command `ceph fs set <fs_name> down` to bring the cluster down
Patrick Donnelly wrote:
> Douglas Fuller wrote:
> > (edited to add: this may be more automagic than we want since...
Douglas Fuller
09:17 PM Feature #20609: MDSMonitor: add new command `ceph fs set <fs_name> down` to bring the cluster down
Douglas Fuller wrote:
> Patrick Donnelly wrote:
> > Douglas Fuller wrote:
> > > Sure, but I don't want the user to...
Patrick Donnelly
09:01 PM Feature #20609: MDSMonitor: add new command `ceph fs set <fs_name> down` to bring the cluster down
Patrick Donnelly wrote:
> Douglas Fuller wrote:
> > Sure, but I don't want the user to have to understand that dist...
Douglas Fuller
08:28 PM Feature #20609: MDSMonitor: add new command `ceph fs set <fs_name> down` to bring the cluster down
Douglas Fuller wrote:
> Patrick Donnelly wrote:
> > > I'd like to work "mds" into this command somehow to make it c...
Patrick Donnelly
07:42 PM Feature #20609: MDSMonitor: add new command `ceph fs set <fs_name> down` to bring the cluster down
Patrick Donnelly wrote:
> > I'd like to work "mds" into this command somehow to make it clear to the user that they ...
Douglas Fuller
07:30 PM Feature #20609: MDSMonitor: add new command `ceph fs set <fs_name> down` to bring the cluster down
Douglas Fuller wrote:
> So really this would just set max_mds to 0. I do think this should trigger HEALTH_ERR unless...
Patrick Donnelly
09:23 PM Bug #20731 (Resolved): "[ERR] : Health check failed: 1 mds daemon down (MDS_FAILED)" in upgrade:j...
Run: http://pulpito.ceph.com/teuthology-2017-07-19_04:23:05-upgrade:jewel-x-luminous-distro-basic-smithi/
Jobs: 38
...
Yuri Weinstein
08:27 PM Backport #20714 (Rejected): jewel: Adding pool with id smaller then existing data pool ids breaks...
Nathan Cutler
07:57 PM Feature #20606: mds: improve usability of cluster rank manipulation and setting cluster up/down
John Spray wrote:
> The last point about cluster down: looking at http://tracker.ceph.com/issues/20609, I'm not sure...
Patrick Donnelly
07:47 PM Feature #20611: MDSMonitor: do not show cluster health warnings for file system intentionally mar...
Douglas Fuller wrote:
> Taking an MDS down for hardware maintenance, etc, should trigger a health warning because su...
Patrick Donnelly
07:44 PM Feature #20610: MDSMonitor: add new command to shrink the cluster in an automated way
Dropping the old behavior for decreasing max_mds (i.e. do nothing but set it) is okay with me. BTW, there should be a... Patrick Donnelly
07:12 PM Documentation #6771 (Closed): add mds configuration
Patrick Donnelly
12:39 PM Documentation #6771: add mds configuration
This can probably be closed? Jan Fajerski
07:03 PM Bug #20334 (Resolved): I/O becomes slow with multi-mds when the subtree root has a replica
Thanks for letting us know! Patrick Donnelly
01:07 AM Bug #20334: I/O becomes slow with multi-mds when the subtree root has a replica
v12.1.0 has solved the problem; please help close this issue. Thank you! yanmei ding
12:13 AM Bug #20334 (Need More Info): I/O becomes slow with multi-mds when the subtree root has a replica
Patrick Donnelly
12:55 PM Feature #20607: MDSMonitor: change "mds deactivate" to clearer "mds rejoin"
I don't find deactivate so bad since this command primarily deals with ranks (that just happened to be backed by a c... Jan Fajerski
12:39 AM Bug #20122 (Need More Info): Ceph MDS crash with assert failure
A debug log from the MDS is necessary to diagnose this, I think. See: http://docs.ceph.com/docs/giant/rados/troubleshoo... Patrick Donnelly
12:12 AM Bug #20469 (Need More Info): Ceph Client can't access file and show '???'
Patrick Donnelly
12:12 AM Bug #20566 (Fix Under Review): "MDS health message (mds.0): Behind on trimming" in powercycle tests
PR: https://github.com/ceph/ceph/pull/16435 Patrick Donnelly
12:09 AM Bug #20566 (In Progress): "MDS health message (mds.0): Behind on trimming" in powercycle tests
Patrick Donnelly
12:00 AM Bug #20569: mds: don't mark dirty rstat on non-auth inode
Patrick Donnelly

07/19/2017

11:56 PM Bug #20595 (In Progress): mds: export_pin should be included in `get subtrees` output
Patrick Donnelly
11:56 PM Bug #20614 (Duplicate): [WRN] MDS daemon 'a-s' is not responding, replacing it as rank 0 with ...
Patrick Donnelly
09:10 PM Bug #19890 (Resolved): src/test/pybind/test_cephfs.py fails
Nathan Cutler
09:09 PM Backport #20500 (Resolved): kraken: src/test/pybind/test_cephfs.py fails
Nathan Cutler
09:03 PM Bug #17939 (Resolved): non-local cephfs quota changes not visible until some IO is done
Nathan Cutler
09:03 PM Backport #19763 (Resolved): kraken: non-local cephfs quota changes not visible until some IO is done
Nathan Cutler
09:02 PM Fix #19708 (Resolved): Enable MDS to start when session ino info is corrupt
Nathan Cutler
09:02 PM Backport #19710 (Resolved): kraken: Enable MDS to start when session ino info is corrupt
Nathan Cutler
09:01 PM Backport #19680 (Resolved): kraken: MDS: damage reporting by ino number is useless
Nathan Cutler
09:00 PM Bug #18757 (Resolved): Jewel ceph-fuse does not recover after lost connection to MDS
Nathan Cutler
09:00 PM Backport #19678 (Resolved): kraken: Jewel ceph-fuse does not recover after lost connection to MDS
Nathan Cutler
08:59 PM Bug #18914 (Resolved): cephfs: Test failure: test_data_isolated (tasks.cephfs.test_volume_client....
Nathan Cutler
08:59 PM Backport #19676 (Resolved): kraken: cephfs: Test failure: test_data_isolated (tasks.cephfs.test_v...
Nathan Cutler
08:56 PM Bug #19033 (Resolved): cephfs: mds crashed after I set about 400 64KB xattr kv pairs to a file
Nathan Cutler
08:56 PM Backport #19674 (Resolved): kraken: cephfs: mds crashed after I set about 400 64KB xattr kv p...
Nathan Cutler
08:55 PM Bug #19204 (Resolved): MDS assert failed when shutting down
Nathan Cutler
08:55 PM Backport #19672 (Resolved): kraken: MDS assert failed when shutting down
Nathan Cutler
08:54 PM Bug #19401 (Resolved): MDS goes readonly writing backtrace for a file whose data pool has been re...
Nathan Cutler
08:54 PM Backport #19669 (Resolved): kraken: MDS goes readonly writing backtrace for a file whose data poo...
Nathan Cutler
08:53 PM Bug #19437 (Resolved): fs: The mount point breaks off when an mds switch happened.
Nathan Cutler
08:53 PM Backport #19667 (Resolved): kraken: fs: The mount point breaks off when an mds switch happened.
Nathan Cutler
08:05 PM Bug #19501 (Resolved): C_MDSInternalNoop::complete doesn't free itself
Nathan Cutler
08:05 PM Backport #19664 (Resolved): kraken: C_MDSInternalNoop::complete doesn't free itself
Nathan Cutler
08:04 PM Bug #18872 (Resolved): write to cephfs mount hangs, ceph-fuse and kernel
Nathan Cutler
08:04 PM Backport #19845 (Resolved): kraken: write to cephfs mount hangs, ceph-fuse and kernel
Nathan Cutler
02:08 PM Bug #20122: Ceph MDS crash with assert failure
Thank you for your help. We've had several occurrences of this same issue since. This isn't something easily replicat... James Poole
01:43 PM Bug #19635 (Resolved): Deadlock on two ceph-fuse clients accessing the same file
Nathan Cutler
01:43 PM Backport #20028 (Resolved): kraken: Deadlock on two ceph-fuse clients accessing the same file
Nathan Cutler
07:40 AM Bug #20681 (Need More Info): kclient: umount target is busy
'sudo umount /home/ubuntu/cephtest/mnt.0' failed, but 'sudo umount /home/ubuntu/cephtest/mnt.0 -f' succeeded.
It's...
Zheng Yan
06:38 AM Bug #20677 (Fix Under Review): mds: abrt during migration
https://github.com/ceph/ceph/pull/16410/commits/58623d781da1189d2e88cf4875294353db78cea9 Zheng Yan
04:07 AM Bug #20677 (In Progress): mds: abrt during migration
Zheng Yan
03:47 AM Bug #20622 (Closed): mds: takeover mds stuck in up:replay after thrashing rank 0
Zheng Yan
02:34 AM Bug #20569: mds: don't mark dirty rstat on non-auth inode
New PR here: https://github.com/ceph/ceph/pull/16337 Zhi Zhang

07/18/2017

11:06 PM Bug #20682 (Resolved): qa: test_client_pin looking for wrong health warning string
... Patrick Donnelly
10:46 PM Bug #20681 (Closed): kclient: umount target is busy
... Patrick Donnelly
09:30 PM Bug #20677: mds: abrt during migration
Zheng: note this happened when the thrasher deactivated a rank. Patrick Donnelly
09:28 PM Bug #20677: mds: abrt during migration
Zheng, please take a look. Patrick Donnelly
09:28 PM Bug #20677 (Resolved): mds: abrt during migration
... Patrick Donnelly
09:24 PM Bug #20452: Adding pool with id smaller then existing data pool ids breaks MDSMap::is_data_pool
Kraken is soon-to-be-EOL, yes. Nathan Cutler
08:32 PM Bug #20452: Adding pool with id smaller then existing data pool ids breaks MDSMap::is_data_pool
Kraken is EOL right? Just jewel I think. Patrick Donnelly
08:26 PM Bug #20452: Adding pool with id smaller then existing data pool ids breaks MDSMap::is_data_pool
@Patrick - backport to which stable versions? Nathan Cutler
04:29 PM Bug #20452 (Pending Backport): Adding pool with id smaller then existing data pool ids breaks MDS...
Patrick Donnelly
04:25 PM Bug #20452 (Resolved): Adding pool with id smaller then existing data pool ids breaks MDSMap::is_...
Patrick Donnelly
08:34 PM Bug #20055: Journaler may execute on_safe contexts prematurely
Originally slated for jewel backport, but this was reconsidered. The jewel backport tracker was http://tracker.ceph.c... Nathan Cutler
04:33 PM Bug #20582 (Resolved): common: config showing ints as floats
Patrick Donnelly
04:32 PM Bug #20537 (Resolved): mds: MDLog.cc: 276: FAILED assert(!capped)
Patrick Donnelly
04:32 PM Bug #20440 (Resolved): mds: mds/journal.cc: 1559: FAILED assert(inotablev == mds->inotable->get_v...
Patrick Donnelly
04:24 PM Bug #20441 (Resolved): mds: failure during data scan
Patrick Donnelly
02:22 PM Bug #20441 (Closed): mds: failure during data scan
Douglas Fuller
04:23 PM Feature #10792 (Resolved): qa: enable thrasher for MDS cluster size (vary max_mds)
Patrick Donnelly
01:29 PM Bug #20659: MDSMonitor: assertion failure if two mds report same health warning
No it isn't more recent. Thanks! Patrick Donnelly
10:56 AM Bug #20659 (Resolved): MDSMonitor: assertion failure if two mds report same health warning
Unless this test run was more recent than the fix, I think this is https://github.com/ceph/ceph/pull/16302 John Spray
05:22 AM Bug #20659 (Resolved): MDSMonitor: assertion failure if two mds report same health warning
... Patrick Donnelly
11:20 AM Feature #20606: mds: improve usability of cluster rank manipulation and setting cluster up/down
The last point about cluster down: looking at http://tracker.ceph.com/issues/20609, I'm not sure what the higher leve... John Spray

07/17/2017

10:24 PM Feature #20606: mds: improve usability of cluster rank manipulation and setting cluster up/down
John Spray wrote:
> My thoughts on this:
>
> * maybe we should preface this class of command (manipulating the MD...
Patrick Donnelly
09:31 PM Feature #19109 (Fix Under Review): Use data pool's 'df' for statfs instead of global stats, if th...
https://github.com/ceph/ceph/pull/16378 Douglas Fuller
02:11 PM Bug #20566: "MDS health message (mds.0): Behind on trimming" in powercycle tests
It's a transient warning caused by backfill. I think we should add this warning to the whitelist. Zheng Yan
01:52 PM Bug #20594: mds: cache limits should be expressed in memory usage, not inode count
See #4504 and associated MemoryModel tickets. Greg Farnum
01:48 PM Bug #20593 (Fix Under Review): mds: the number of inode showed by "mds perf dump" not correct aft...
Patrick Donnelly

07/14/2017

06:30 PM Bug #20535: mds segmentation fault ceph_lock_state_t::get_overlapping_locks
OK, I'll download them just in case. Webert Lima
06:01 PM Bug #20535: mds segmentation fault ceph_lock_state_t::get_overlapping_locks
It will expire in a week or two. Patrick Donnelly
04:01 PM Bug #20535: mds segmentation fault ceph_lock_state_t::get_overlapping_locks
Patrick Donnelly wrote:
> https://shaman.ceph.com/builds/ceph/i20535-backport-v10.2.9/
Thanks, I'll be upgrading ...
Webert Lima
11:19 AM Feature #20606: mds: improve usability of cluster rank manipulation and setting cluster up/down
My thoughts on this:
* maybe we should preface this class of command (manipulating the MDS ranks) with "cluster", ...
John Spray
04:11 AM Bug #20622: mds: takeover mds stuck in up:replay after thrashing rank 0
3 of 6 OSDs failed, but there is no clue why they failed... Zheng Yan
12:19 AM Feature #19109: Use data pool's 'df' for statfs instead of global stats, if there is only one dat...
You'd give the available space for that pool (i.e. how many bytes can they write before it becomes full). Same as in... John Spray

07/13/2017

11:04 PM Bug #20622: mds: takeover mds stuck in up:replay after thrashing rank 0
Zheng, please take a look. Patrick Donnelly
11:04 PM Bug #20622 (Closed): mds: takeover mds stuck in up:replay after thrashing rank 0
... Patrick Donnelly
10:03 PM Bug #20535: mds segmentation fault ceph_lock_state_t::get_overlapping_locks
https://shaman.ceph.com/builds/ceph/i20535-backport-v10.2.9/ Patrick Donnelly
12:35 PM Bug #20535: mds segmentation fault ceph_lock_state_t::get_overlapping_locks
Patrick Donnelly wrote:
> It does not include the fix. Do not use that branch. I'll make a note to update it...
T...
Webert Lima
04:08 PM Feature #19109: Use data pool's 'df' for statfs instead of global stats, if there is only one dat...
What value for free space should we give in this case? If it's the global free space, it might be misleading to speci... Douglas Fuller
02:49 PM Bug #20337: test_rebuild_simple_altpool triggers MDS assertion
https://github.com/ceph/ceph/pull/16305 Douglas Fuller
02:27 PM Feature #20609: MDSMonitor: add new command `ceph fs set <fs_name> down` to bring the cluster down
So really this would just set max_mds to 0. I do think this should trigger HEALTH_ERR unless and until the user delet... Douglas Fuller
02:23 PM Feature #20610: MDSMonitor: add new command to shrink the cluster in an automated way
I think we should still call this max_mds to avoid confusion between the filesystem itself (the data) and its MDSs. I... Douglas Fuller
02:11 PM Feature #20607: MDSMonitor: change "mds deactivate" to clearer "mds rejoin"
I think "reset" would work well. I do think we should change it, though. Douglas Fuller
08:05 AM Feature #20607: MDSMonitor: change "mds deactivate" to clearer "mds rejoin"
I like "flush". Also, "reactivate" occurred to me. Nathan Cutler
04:35 AM Feature #20607: MDSMonitor: change "mds deactivate" to clearer "mds rejoin"
Agreed on "rejoin".
Okay so after a jog and some light thought, I would suggest "release" (i.e. the rank is releas...
Patrick Donnelly
03:47 AM Feature #20607: MDSMonitor: change "mds deactivate" to clearer "mds rejoin"
I agree with John. "rejoin" is confusing for non-native English speakers. Zheng Yan
02:08 PM Feature #20611: MDSMonitor: do not show cluster health warnings for file system intentionally mar...
Taking an MDS down for hardware maintenance, etc, should trigger a health warning because such actions do, even if in... Douglas Fuller
11:12 AM Bug #20614: [WRN] MDS daemon 'a-s' is not responding, replacing it as rank 0 with standby 'a'"...
Dupe of http://tracker.ceph.com/issues/19706 ? John Spray
08:58 AM Bug #20614 (Duplicate): [WRN] MDS daemon 'a-s' is not responding, replacing it as rank 0 with ...
http://pulpito.ceph.com/teuthology-2017-07-08_03:15:04-fs-master-distro-basic-smithi/ Zheng Yan
08:31 AM Backport #20599 (Resolved): jewel: cephfs: Damaged MDS with 10.2.8
Nathan Cutler

07/12/2017

11:36 PM Feature #20607: MDSMonitor: change "mds deactivate" to clearer "mds rejoin"
The trouble with "rejoin" is that it's also the name of one of the states the daemon passes through during startup. John Spray
11:02 PM Feature #20607 (Rejected): MDSMonitor: change "mds deactivate" to clearer "mds rejoin"
Rename `ceph mds deactivate` to `ceph mds rejoin`. This makes it clear an MDS is leaving and then rejoining the metad... Patrick Donnelly
11:11 PM Feature #20611 (Resolved): MDSMonitor: do not show cluster health warnings for file system intent...
Here's what you see currently:... Patrick Donnelly
11:09 PM Feature #20610 (Resolved): MDSMonitor: add new command to shrink the cluster in an automated way
Deprecate `ceph fs set <fs_name> max_mds <max_mds>`. New `ceph fs set <fs_name> ranks <max_mds>` sets max_mds and beg... Patrick Donnelly
11:07 PM Feature #20609 (Resolved): MDSMonitor: add new command `ceph fs set <fs_name> down` to bring the ...
This will cause the MDSMonitor to start stopping ranks (what deactivate does) beginning at the highest rank. Only one... Patrick Donnelly
11:03 PM Feature #20608 (Resolved): MDSMonitor: rename `ceph fs set <fs_name> cluster_down` to `ceph fs se...
This indicates whether the file system can be joined by the mds as a new rank. The behavior stays the same. Patrick Donnelly
10:51 PM Feature #20606 (Resolved): mds: improve usability of cluster rank manipulation and setting cluste...
Right now the procedure for bringing down a cluster is:... Patrick Donnelly
09:55 PM Bug #20582 (Fix Under Review): common: config showing ints as floats
https://github.com/ceph/ceph/pull/16288#event-1161205437 John Spray
05:53 PM Bug #20582 (In Progress): common: config showing ints as floats
Patrick Donnelly
05:53 PM Bug #20582: common: config showing ints as floats
Reassigning to John. I agree `config show` needs to be fixed, not the tests. Patrick Donnelly
12:22 PM Bug #20582 (Fix Under Review): common: config showing ints as floats
https://github.com/ceph/ceph/pull/16288 Zheng Yan
12:00 PM Bug #20582 (In Progress): common: config showing ints as floats
It was this change:... John Spray
10:02 AM Bug #20582: common: config showing ints as floats
For a float/double config option, the output of 'ceph daemon mds.x config get xxxx' looks like... Zheng Yan
08:31 PM Bug #20535: mds segmentation fault ceph_lock_state_t::get_overlapping_locks
Webert Lima wrote:
> Does this built package include the fix for the MDS regression that was found in 10.2.8? I read...
Patrick Donnelly
05:44 PM Bug #20535: mds segmentation fault ceph_lock_state_t::get_overlapping_locks
Does this built package include the fix for the MDS regression that was found in 10.2.8? I read about it in the maili... Webert Lima
06:02 PM Fix #20246 (Fix Under Review): Make clog message on scrub errors friendlier.
This snowballed into me pondering about what clog messages really should look like (https://github.com/ceph/ceph/pull... John Spray
12:42 PM Bug #20592 (In Progress): client::mkdirs does not handle well when two clients send mkdir requests for ...
Jos Collin
07:09 AM Bug #20592 (Fix Under Review): client::mkdirs does not handle well when two clients send mkdir requests...
Nathan Cutler
04:00 AM Bug #20592: client::mkdirs does not handle well when two clients send mkdir requests for the same dir
I have opened a pull request for this issue:
https://github.com/ceph/ceph/pull/16280
dongdong tao
02:55 AM Bug #20592 (Resolved): client::mkdirs does not handle well when two clients send mkdir requests for the s...
Suppose we have two clients trying to make a two-level directory:
client1 mkdirs: a/b
client2 mkdirs: a/c
In func...
dongdong tao
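A simplified sketch of the race, assuming a per-component mkdir call (this is not the actual Client::mkdirs code; `mkdir_one` is a hypothetical stand-in for the real per-directory request):
<pre>
#include <cerrno>
#include <string>
#include <vector>

// Hypothetical stand-in for the per-component mkdir request; always
// succeeds here so the sketch is self-contained.
static int mkdir_one(const std::string &, unsigned) { return 0; }

int mkdirs_sketch(const std::vector<std::string> &components, unsigned mode) {
  std::string cur;
  for (const auto &c : components) {
    cur += "/" + c;
    int r = mkdir_one(cur, mode);
    // Without tolerating EEXIST here, client1 (mkdirs a/b) fails as
    // soon as client2 (mkdirs a/c) wins the race to create "a".
    if (r < 0 && r != -EEXIST)
      return r;
  }
  return 0;
}
</pre>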
10:27 AM Backport #19466 (In Progress): jewel: mds: log rotation doesn't work if mds has respawned
Nathan Cutler
08:16 AM Backport #20140: jewel: Journaler may execute on_safe contexts prematurely
Reverted by https://github.com/ceph/ceph/pull/16282 Nathan Cutler
07:07 AM Backport #20140 (Rejected): jewel: Journaler may execute on_safe contexts prematurely
Zheng Yan
07:07 AM Backport #20536 (Rejected): jewel: Journaler may execute on_safe contexts prematurely (part 2)
Zheng Yan
06:56 AM Backport #20599: jewel: cephfs: Damaged MDS with 10.2.8
h3. description
A bad backport was included in 10.2.8, potentially causing MDS breakage. See https://github.com/ce...
Nathan Cutler
06:41 AM Backport #20599 (Resolved): jewel: cephfs: Damaged MDS with 10.2.8
https://github.com/ceph/ceph/pull/16282 Nathan Cutler
04:25 AM Feature #19109: Use data pool's 'df' for statfs instead of global stats, if there is only one dat...
Doug, please take this one. Patrick Donnelly
04:19 AM Feature #20598 (Resolved): mds: revisit LAZY_IO
We do not test this and enabling it is no longer possible. However, we still repeatedly get requests for byte-range l... Patrick Donnelly
04:00 AM Bug #20597 (New): mds: tree exports should be reported at a higher debug level
Exporting a subtree is a major operation that should be visible to an operator monitoring the debug log at a low sett... Patrick Donnelly
03:58 AM Bug #20596 (Resolved): MDSMonitor: obsolete `mds dump` and other deprecated mds commands
Dan at CERN was still using this and wondered how to see standbys. We should obsolete this command post-Luminous and ... Patrick Donnelly
03:45 AM Bug #20595 (Resolved): mds: export_pin should be included in `get subtrees` output
Currently it is only available by looking at the dumped cache. An indicator that a subtree is unsplittable and/or pin... Patrick Donnelly
03:43 AM Bug #20594 (Resolved): mds: cache limits should be expressed in memory usage, not inode count
Cached inode count (mds_cache_size) is an imperfect limit for what we really want. We frequently have to tell users/c... Patrick Donnelly
03:34 AM Bug #19706: Laggy mon daemons causing MDS failover (symptom: failed to set counters on mds daemon...
PR for debugging: https://github.com/ceph/ceph/pull/16278 Patrick Donnelly
03:10 AM Bug #20593 (Resolved): mds: the number of inode showed by "mds perf dump" not correct after trimm...
Currently, the MDS only updates the inode number for "mds perf dump" in the function "MDSRank::_dispatch";
this means it will ...
dongdong tao
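A self-contained sketch of the staleness problem (`PerfGauge` and `CacheSketch` are stand-ins invented for illustration; the real code paths are MDSRank::_dispatch and the cache-trimming path named above):
<pre>
#include <cstddef>

// Stand-ins for the MDS perf counter and cache; illustrative only.
struct PerfGauge { std::size_t inodes = 0; };

struct CacheSketch {
  std::size_t num_inodes = 0;
  PerfGauge gauge;

  void dispatch_message() {
    // Per the ticket: the gauge is refreshed only on message dispatch...
    gauge.inodes = num_inodes;
  }

  void trim(std::size_t n) {
    num_inodes -= n;
    // ...so a trim with no subsequent message leaves "mds perf dump"
    // reporting the pre-trim count. Refreshing here as well would keep
    // the counter accurate between messages.
    gauge.inodes = num_inodes;
  }
};
</pre>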

07/11/2017

07:32 PM Bug #20583 (Fix Under Review): mds: improve wording when mds respawns due to mdsmap removal
https://github.com/ceph/ceph/pull/16270 Patrick Donnelly
04:36 PM Bug #20583 (Resolved): mds: improve wording when mds respawns due to mdsmap removal
> 2017-07-11 07:07:55.397645 7ffb7a1d7700 1 mds.b handle_mds_map i (10.0.1.2:6822/28190) dne in the mdsmap, respawni... Patrick Donnelly
05:18 PM Bug #20535: mds segmentation fault ceph_lock_state_t::get_overlapping_locks
OK. I'll do it in two small clusters tomorrow and I'll update these troublesome clusters next week.
Thanks a lot for ...
Webert Lima
12:57 PM Bug #20535: mds segmentation fault ceph_lock_state_t::get_overlapping_locks
Webert Lima wrote:
> Couple questions:
> - any recommendations for choosing between the notcmalloc or default flavors?
Gen...
Patrick Donnelly
12:41 PM Bug #20535: mds segmentation fault ceph_lock_state_t::get_overlapping_locks
Patrick Donnelly wrote:
> New run: https://shaman.ceph.com/builds/ceph/i20535-backport/387e184970bc2949e16139db0cbda...
Webert Lima
04:33 PM Bug #20582 (Resolved): common: config showing ints as floats
Something broke in common/config.cc around printing values, the ints are coming out as floats, and it's breaking test... John Spray
03:42 PM Backport #20412: test_remote_update_write (tasks.cephfs.test_quota.TestQuota) fails in Jewel 10.2...
h3. description
This test fails reproducibly on the wip-jewel-backports branch: ...
Nathan Cutler
09:34 AM Bug #20569: mds: don't mark dirty rstat on non-auth inode
-https://github.com/ceph/ceph/pull/16253-
With this fix, inodes can be trimmed successfully and no such warning ...
Zhi Zhang
09:32 AM Bug #20569 (Resolved): mds: don't mark dirty rstat on non-auth inode
Currently using multi-MDS on Luminous, we found that ceph status reported such a warning all the time when writing large amoun... Zhi Zhang
03:57 AM Bug #20566 (Resolved): "MDS health message (mds.0): Behind on trimming" in powercycle tests
see http://pulpito.ceph.com/kchai-2017-07-11_02:07:09-powercycle-master-distro-basic-mira/ Kefu Chai

07/10/2017

09:01 PM Backport #20564 (In Progress): jewel: mds segmentation fault ceph_lock_state_t::get_overlapping_l...
Nathan Cutler
08:45 PM Backport #20564 (Resolved): jewel: mds segmentation fault ceph_lock_state_t::get_overlapping_locks
https://github.com/ceph/ceph/pull/16248 Nathan Cutler
08:05 PM Bug #20535: mds segmentation fault ceph_lock_state_t::get_overlapping_locks
New run: https://shaman.ceph.com/builds/ceph/i20535-backport/387e184970bc2949e16139db0cbda6acfa3f7b3a/ Patrick Donnelly
07:59 PM Bug #20535: mds segmentation fault ceph_lock_state_t::get_overlapping_locks
Awesome! Thanks! Webert Lima
07:57 PM Bug #20535: mds segmentation fault ceph_lock_state_t::get_overlapping_locks
I meant to link: https://shaman.ceph.com/builds/ceph/i20535-backport/2223e478c4b770e75cb7db196f5cd9d985929ac9/
Loo...
Patrick Donnelly
07:49 PM Bug #20535: mds segmentation fault ceph_lock_state_t::get_overlapping_locks
Patrick Donnelly wrote:
> I went ahead and did it since our CI will build the repos for you: https://shaman.ceph.com...
Webert Lima
07:46 PM Bug #20535 (Pending Backport): mds segmentation fault ceph_lock_state_t::get_overlapping_locks
Backport PR: https://github.com/ceph/ceph/pull/16248 Patrick Donnelly
07:46 PM Bug #20535: mds segmentation fault ceph_lock_state_t::get_overlapping_locks
> I'm not really comfortable doing an upgrade like that, as the service and data availability is very critical here... Patrick Donnelly
07:33 PM Bug #20535: mds segmentation fault ceph_lock_state_t::get_overlapping_locks
Patrick Donnelly wrote:
> Zheng, PR#15440 indicates it's a multimds fix but Webert's setup is single MDS. Any issues...
Webert Lima
07:21 PM Bug #20535: mds segmentation fault ceph_lock_state_t::get_overlapping_locks
> it should be backported to jewel/kraken if feasible
Consider backport to jewel only, since kraken goes EOL the m...
Nathan Cutler
05:56 PM Bug #20535 (In Progress): mds segmentation fault ceph_lock_state_t::get_overlapping_locks
Zheng, PR#15440 indicates it's a multimds fix but Webert's setup is single MDS. Any issues you see backporting the fi... Patrick Donnelly
01:51 PM Bug #20535: mds segmentation fault ceph_lock_state_t::get_overlapping_locks
Patrick Donnelly wrote:
> Okay, it appears the deadlock is fixed.
I'm sorry. Do you refer to that commit? If so, ...
Webert Lima
01:41 PM Bug #20535 (Closed): mds segmentation fault ceph_lock_state_t::get_overlapping_locks
Okay, it appears the deadlock is fixed. Please open a new ticket if you're still seeing issues with rejoin taking unr... Patrick Donnelly
01:02 PM Bug #20535: mds segmentation fault ceph_lock_state_t::get_overlapping_locks
And this is the current session ls from the Active MDS on each of the two clusters:
root@bhs1-mail01-ds02:~# ceph daemo...
Webert Lima
12:46 PM Bug #20535: mds segmentation fault ceph_lock_state_t::get_overlapping_locks
Zheng Yan wrote:
> how about other machines
Sorry I didn't check at the time. I'll post everything as it looks ...
Webert Lima
03:38 AM Bug #20535: mds segmentation fault ceph_lock_state_t::get_overlapping_locks
Webert Lima wrote:
> root@bhs1-mail02-ds05:~# cat /proc/sys/fs/file-nr
> 3360 0 3273932
> root@bhs1-mail0...
Zheng Yan
05:50 PM Bug #20537: mds: MDLog.cc: 276: FAILED assert(!capped)
Zheng, please break out these fixes into separate PRs. Patrick Donnelly
11:46 AM Bug #20537 (Fix Under Review): mds: MDLog.cc: 276: FAILED assert(!capped)
https://github.com/ceph/ceph/pull/16068/commits/2b98f4701e9a12e50f8d017c93e5101eb02f7992 Zheng Yan
02:02 PM Bug #2494: mds: Cannot remove directory despite it being empty.
Zheng Yan wrote:
> David Galloway wrote:
> > I've moved these dirs to ...
David Galloway
03:31 AM Bug #2494 (Resolved): mds: Cannot remove directory despite it being empty.
Zheng Yan
08:53 AM Bug #20549: cephfs-journal-tool: segfault during journal reset
The reason is that Resetter::reset() creates an on-stack journaler. The journaler gets destroyed before receiving all pre... Zheng Yan
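The general hazard, sketched with invented names (`AsyncService` stands in for the asynchronous I/O layer; this is not the actual Journaler/Resetter API): a stack-allocated object registers completion callbacks, and the function returns before they all fire.
<pre>
#include <functional>
#include <vector>

// Illustrative stand-in for the async I/O layer, not the real API.
struct AsyncService {
  std::vector<std::function<void()>> pending;
  void submit(std::function<void()> cb) { pending.push_back(std::move(cb)); }
  void complete_all() { for (auto &cb : pending) cb(); pending.clear(); }
};

void reset_sketch(AsyncService &svc) {
  struct Journaler { int state = 1; } j;  // on stack, as in Resetter::reset()
  svc.submit([&j] { j.state = 0; });      // callback captures the stack object
}  // j is destroyed here, before the pending callback has run

int main() {
  AsyncService svc;
  reset_sketch(svc);
  svc.complete_all();  // use-after-scope: the callback touches the dead object
}
</pre>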

07/08/2017

12:29 PM Bug #2494: mds: Cannot remove directory despite it being empty.
David Galloway wrote:
> I've moved these dirs to ...
Zheng Yan

07/07/2017

10:18 PM Bug #2494: mds: Cannot remove directory despite it being empty.
I've moved these dirs to ... David Galloway
10:14 PM Bug #20072 (Fix Under Review): TestStrays.test_snapshot_remove doesn't handle head whiteout in pg...
https://github.com/ceph/ceph/pull/16226 Patrick Donnelly
04:52 PM Bug #20072 (In Progress): TestStrays.test_snapshot_remove doesn't handle head whiteout in pgls re...
Patrick Donnelly
09:14 PM Bug #20549 (Resolved): cephfs-journal-tool: segfault during journal reset
... Patrick Donnelly
12:47 PM Bug #20494: cephfs_data_scan: try_remove_dentries_for_stray assertion failure
Zheng Yan wrote:
> If inode numbers were removed from the inotable when replaying the journal, how did "assert(inode_map.cou...
Ivan Guan
10:53 AM Bug #20494: cephfs_data_scan: try_remove_dentries_for_stray assertion failure
If inode numbers were removed from the inotable when replaying the journal, how did "assert(inode_map.count(in->vino()) == 0)... Zheng Yan
09:38 AM Bug #20494: cephfs_data_scan: try_remove_dentries_for_stray assertion failure
Zheng Yan wrote:
> you can use cephfs-table-tool to manually remove used inodes from the inotable. (use "rados -p data ls",...
Ivan Guan
09:13 AM Bug #20494: cephfs_data_scan: try_remove_dentries_for_stray assertion failure
you can use cephfs-table-tool to manually remove used inodes from the inotable. (use "rados -p data ls", "rados -p metadata ... Zheng Yan
07:59 AM Bug #20494: cephfs_data_scan: try_remove_dentries_for_stray assertion failure
Zheng Yan wrote:
> I fixed a cephfs-data-scan bug. https://github.com/ceph/ceph/pull/16202. It can explain this inco...
Ivan Guan
07:47 AM Bug #20494: cephfs_data_scan: try_remove_dentries_for_stray assertion failure
Thank you, I made a test of running "ceph daemon mds.a scrub_path / force recursive repair" after scan_inodes. As you s... Ivan Guan
06:24 AM Bug #20494: cephfs_data_scan: try_remove_dentries_for_stray assertion failure
I fixed a cephfs-data-scan bug. https://github.com/ceph/ceph/pull/16202. It can explain this inconsistent state. Zheng Yan
06:23 AM Bug #20494: cephfs_data_scan: try_remove_dentries_for_stray assertion failure
Ivan Guan wrote:
> Ivan Guan wrote:
> > After careful consideration, I tried to fix this bug using the first solution.
...
Zheng Yan
03:43 AM Bug #20494: cephfs_data_scan: try_remove_dentries_for_stray assertion failure
Ivan Guan wrote:
> After careful consideration, I tried to fix this bug using the first solution.
> function inject_with...
Ivan Guan
12:41 PM Bug #20535: mds segmentation fault ceph_lock_state_t::get_overlapping_locks

root@bhs1-mail02-ds05:~# cat /proc/sys/fs/file-nr
3360 0 3273932
root@bhs1-mail02-ds05:~# cat /proc/sys/...
Webert Lima
12:34 PM Bug #20535: mds segmentation fault ceph_lock_state_t::get_overlapping_locks
{
    "id": 1029565,
    "num_leases": 0,
    *"num_caps": 2588906,*
    "state": "open",
    "replay_requests": 0,
    "completed_requ...
Zheng Yan
12:20 PM Bug #20535: mds segmentation fault ceph_lock_state_t::get_overlapping_locks
There is something interesting here. The host "bhs1-mail02-ds05.m9.network" has the highest number of caps (over 2.5M... Webert Lima
11:45 AM Bug #20535: mds segmentation fault ceph_lock_state_t::get_overlapping_locks
Running it today on the same cluster that crashed (client usage is about the same at this time):
~# ceph daemon mds...
Webert Lima
03:19 AM Bug #20535: mds segmentation fault ceph_lock_state_t::get_overlapping_locks
I recently found a bug. It can explain the crash.
https://github.com/ceph/ceph/pull/15440...
Zheng Yan
10:15 AM Bug #17435: Crash in ceph-fuse in ObjectCacher::trim while adding an OSD
The ceph-fuse client crashes when the top application is reading; the version is 10.2.3.
bh_lru_rest contains a wrong state...
Jay Lee
06:10 AM Bug #20440: mds: mds/journal.cc: 1559: FAILED assert(inotablev == mds->inotable->get_version())

The recover_dentries command of the journal tool only injects links, but never deletes old links. This causes duplicated p...
Zheng Yan
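A toy illustration of the consequence (a plain map standing in for the metadata; this is not the journal-tool code): injecting a recovered link without deleting the stale one leaves two dentries referencing the same inode.
<pre>
#include <map>
#include <string>

int main() {
  std::map<std::string, long> dentries;  // path -> inode number (toy model)
  dentries["/a/file"] = 100;             // stale link from before a rename

  // Recovery that only injects the link recorded in the journal...
  dentries["/b/file"] = 100;

  // ...must also delete the old one, or inode 100 ends up with two
  // primary dentries:
  dentries.erase("/a/file");
}
</pre>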
02:06 AM Bug #20440: mds: mds/journal.cc: 1559: FAILED assert(inotablev == mds->inotable->get_version())
Log snippet from: /ceph/teuthology-archive/pdonnell-2017-07-01_01:07:39-fs-wip-pdonnell-20170630-distro-basic-smithi/... Patrick Donnelly
02:06 AM Bug #20440: mds: mds/journal.cc: 1559: FAILED assert(inotablev == mds->inotable->get_version())
Zheng, adding that patch lets the test make progress but there still appears to be a problem:... Patrick Donnelly
05:31 AM Feature #10792: qa: enable thrasher for MDS cluster size (vary max_mds)
https://github.com/ceph/ceph/pull/16200 Patrick Donnelly
05:29 AM Feature #19230: Limit MDS deactivation to one at a time
thrasher now deactivates one at a time: https://github.com/ceph/ceph/pull/15950 Patrick Donnelly
05:19 AM Bug #20212 (Resolved): test_fs_new failure on race between pool creation and appearance in `df`
Patrick Donnelly
04:47 AM Backport #20412 (Resolved): test_remote_update_write (tasks.cephfs.test_quota.TestQuota) fails in...
Patrick Donnelly
04:42 AM Bug #20376 (Resolved): last_epoch_(over|under) in MDBalancer should be updated if mds0 has failed
Patrick Donnelly
04:41 AM Bug #20254 (Resolved): mds: coverity error in Server::_rename_prepare
Patrick Donnelly
04:40 AM Bug #20318 (Resolved): Race in TestExports.test_export_pin
Patrick Donnelly
04:35 AM Bug #16914 (Resolved): multimds: pathologically slow deletions in some tests
Patrick Donnelly

07/06/2017

09:30 PM Bug #20537: mds: MDLog.cc: 276: FAILED assert(!capped)
Zheng, please look at this one. Patrick Donnelly
09:30 PM Bug #20537 (Resolved): mds: MDLog.cc: 276: FAILED assert(!capped)
... Patrick Donnelly
08:48 PM Backport #20140: jewel: Journaler may execute on_safe contexts prematurely
Original backport (the one that went into 10.2.8) was incomplete. See #20536 for the completion. Nathan Cutler
08:46 PM Backport #20536 (In Progress): jewel: Journaler may execute on_safe contexts prematurely (part 2)
Nathan Cutler
08:45 PM Backport #20536 (Rejected): jewel: Journaler may execute on_safe contexts prematurely (part 2)
https://github.com/ceph/ceph/pull/16192 Nathan Cutler
08:20 PM Backport #20028 (In Progress): kraken: Deadlock on two ceph-fuse clients accessing the same file
Nathan Cutler
08:19 PM Bug #20535 (Resolved): mds segmentation fault ceph_lock_state_t::get_overlapping_locks
The Active MDS crashes, all clients freeze and the standby (or standby-replay) daemon takes hours to recover.
cep...
Webert Lima
08:17 PM Backport #20026 (In Progress): kraken: cephfs: MDS became unresponsive when truncating a very lar...
Nathan Cutler
01:46 PM Bug #20494: cephfs_data_scan: try_remove_dentries_for_stray assertion failure
This inconsistent state is created by cephfs-data-scan? I think we can avoid injecting a dentry into the stray directory (inj... Zheng Yan
01:32 PM Bug #20494: cephfs_data_scan: try_remove_dentries_for_stray assertion failure
After careful consideration, I tried to fix this bug using the first solution.
function inject_with_backtrace will traver...
Ivan Guan
01:29 PM Bug #20467: Ceph FS kernel client not consistency
Yunzhi Cheng wrote:
> Zheng Yan wrote:
> > Please try
> >
> > https://github.com/ceph/ceph-client/commit/6c51866...
Zheng Yan
10:04 AM Bug #20467: Ceph FS kernel client not consistency
Zheng Yan wrote:
> Please try
>
> https://github.com/ceph/ceph-client/commit/6c51866e67dda544b9be524185a526cf4ace...
Yunzhi Cheng
04:20 AM Bug #20467: Ceph FS kernel client not consistency
Please try
https://github.com/ceph/ceph-client/commit/6c51866e67dda544b9be524185a526cf4ace865a
Zheng Yan
02:58 AM Backport #20349 (In Progress): jewel: df reports negative disk "used" value when quota exceed
Wei-Chung Cheng
02:53 AM Backport #20403 (In Progress): jewel: cephfs permission denied until second client accesses file
Wei-Chung Cheng

07/05/2017

02:21 PM Feature #16523 (Resolved): Assert directory fragmentation is occuring during stress tests
... John Spray
01:33 PM Bug #17731 (Can't reproduce): MDS stuck in stopping with other rank's strays
This code has all changed a lot since. John Spray
09:02 AM Bug #20467: Ceph FS kernel client not consistency
Yunzhi Cheng wrote:
>
> Seems it's an error; should I do echo module ceph +p > /sys/kernel/debug/dynamic_debug ?
>
...
Zheng Yan
08:30 AM Bug #20467: Ceph FS kernel client not consistency
Zheng Yan wrote:
> ceph-mds version and configuration? Are dirfrag or multimds enabled?
>
> Please describe the workload you put on cephfs, how many ...
Yunzhi Cheng
07:22 AM Bug #20467: Ceph FS kernel client not consistency
ceph-mds version and configuration? Are dirfrag or multimds enabled?
Please describe the workload you put on cephfs, how many ...
Zheng Yan
03:08 AM Bug #20467: Ceph FS kernel client not consistency
Zheng Yan wrote:
> I checked the Ubuntu kernel. It does not contain the following commit. I suspect it's the cause. Could yo...
Yunzhi Cheng

07/04/2017

09:13 PM Backport #20500 (In Progress): kraken: src/test/pybind/test_cephfs.py fails
Nathan Cutler
09:12 PM Backport #20500 (Resolved): kraken: src/test/pybind/test_cephfs.py fails
https://github.com/ceph/ceph/pull/16114 Nathan Cutler
09:10 PM Bug #19890 (Pending Backport): src/test/pybind/test_cephfs.py fails
Nathan Cutler
11:30 AM Backport #19763 (In Progress): kraken: non-local cephfs quota changes not visible until some IO i...
Nathan Cutler
11:26 AM Backport #19710 (In Progress): kraken: Enable MDS to start when session ino info is corrupt
Nathan Cutler
11:17 AM Backport #19680 (In Progress): kraken: MDS: damage reporting by ino number is useless
Nathan Cutler
11:12 AM Backport #19678 (In Progress): kraken: Jewel ceph-fuse does not recover after lost connection to MDS
Nathan Cutler
11:08 AM Backport #19676 (In Progress): kraken: cephfs: Test failure: test_data_isolated (tasks.cephfs.tes...
Nathan Cutler
11:06 AM Backport #19674 (In Progress): kraken: cephfs: mds is crushed, after I set about 400 64KB xattr k...
Nathan Cutler
11:04 AM Backport #19672 (In Progress): kraken: MDS assert failed when shutting down
Nathan Cutler
10:58 AM Backport #19669 (In Progress): kraken: MDS goes readonly writing backtrace for a file whose data ...
Nathan Cutler
10:36 AM Backport #19667 (In Progress): kraken: fs:The mount point break off when mds switch hanppened.
Nathan Cutler
10:28 AM Bug #20494 (Closed): cephfs_data_scan: try_remove_dentries_for_stray assertion failure
Using teuthology to run data-scan.yaml, and when the test_data_scan.py:test_parallel_execution test case completed, I wanted to ... Ivan Guan
10:07 AM Backport #19664 (In Progress): kraken: C_MDSInternalNoop::complete doesn't free itself
Nathan Cutler

07/03/2017

04:12 PM Bug #20424 (Resolved): doc: improve description of `mds deactivate` to better contrast with `mds ...
Patrick Donnelly
01:48 PM Bug #20424 (Fix Under Review): doc: improve description of `mds deactivate` to better contrast wi...
In the absence of inspiration for better names:
https://github.com/ceph/ceph/pull/16080
It would be really nice...
John Spray
01:38 PM Bug #20424: doc: improve description of `mds deactivate` to better contrast with `mds fail`
It may be useful to change the names of these commands to be more intuitive. Patrick Donnelly
03:45 AM Bug #20469: Ceph Client can't access file and show '???'
Yunzhi Cheng wrote:
> Zheng Yan wrote:
> > this one and http://tracker.ceph.com/issues/20467 may be caused by the s...
Zheng Yan

07/01/2017

02:41 PM Bug #20469: Ceph Client can't access file and show '???'
Zheng Yan wrote:
> This one and http://tracker.ceph.com/issues/20467 may be caused by the same bug. Please try the n...
Yunzhi Cheng

06/30/2017

07:43 PM Bug #19706: Laggy mon daemons causing MDS failover (symptom: failed to set counters on mds daemon...
Here's another: /a/pdonnell-2017-06-27_19:50:40-fs-wip-pdonnell-20170627---basic-smithi/1333703 Patrick Donnelly
02:08 PM Bug #20469: Ceph Client can't access file and show '???'
This one and http://tracker.ceph.com/issues/20467 may be caused by the same bug. Please try the newest upstream kernel. Zheng Yan
07:19 AM Bug #20469 (Need More Info): Ceph Client can't access file and show '???'
Very strange behavior... Yunzhi Cheng
10:53 AM Bug #20467: Ceph FS kernel client not consistency
I checked the Ubuntu kernel. It does not contain the following commit. I suspect it's the cause. Could you please try the newest ... Zheng Yan
03:01 AM Bug #20467: Ceph FS kernel client not consistency
... Yunzhi Cheng
02:55 AM Bug #20467: Ceph FS kernel client not consistency
kernel version is 4.4.0-46 Yunzhi Cheng
02:43 AM Bug #20467: Ceph FS kernel client not consistency
kernel version ? Zheng Yan
02:40 AM Bug #20467 (Resolved): Ceph FS kernel client not consistency
I used 'ls' to list files in one directory; two clients have different results... Yunzhi Cheng

06/29/2017

09:45 AM Bug #20440 (Fix Under Review): mds: mds/journal.cc: 1559: FAILED assert(inotablev == mds->inotabl...
Caused by https://github.com/ceph/ceph/pull/15844.
Needs incremental patch https://github.com/ukernel/ceph/commit/47...
Zheng Yan

06/28/2017

07:37 PM Bug #20337: test_rebuild_simple_altpool triggers MDS assertion
Douglas Fuller
05:33 PM Bug #20452: Adding pool with id smaller then existing data pool ids breaks MDSMap::is_data_pool
https://github.com/ceph/ceph/pull/15982 Nathan Cutler
05:09 PM Bug #20452 (Fix Under Review): Adding pool with id smaller then existing data pool ids breaks MDS...
Jan Fajerski
04:14 PM Bug #20452: Adding pool with id smaller then existing data pool ids breaks MDSMap::is_data_pool
Seems like the implementation of MDSMap::is_data_pool makes a wrong assumption by using binary_search.
data_pools is...
Jan Fajerski
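A self-contained sketch of the precondition violation (the vector contents are invented to match the scenario described here; this is not the actual MDSMap code):
<pre>
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

int main() {
  // data_pools kept in creation order: pool 2 existed first, then a
  // pool with the smaller id 1 was added.
  std::vector<int64_t> data_pools = {2, 1};

  // std::binary_search requires a sorted range; on {2, 1} it may
  // wrongly report that pool 1 is absent.
  bool via_bsearch = std::binary_search(data_pools.begin(),
                                        data_pools.end(), int64_t(1));

  // A linear find has no ordering precondition and is cheap for the
  // few data pools a filesystem has.
  bool via_find = std::find(data_pools.begin(), data_pools.end(),
                            int64_t(1)) != data_pools.end();
  assert(via_find);
  (void)via_bsearch;
}
</pre>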
04:09 PM Bug #20452: Adding pool with id smaller then existing data pool ids breaks MDSMap::is_data_pool
Removing pool 1 fixes the issue:
ceph fs rm_data_pool fs 1
Creating files now works again.
Jan Fajerski
04:06 PM Bug #20452 (Resolved): Adding pool with id smaller then existing data pool ids breaks MDSMap::is_...
To reproduce:
Set up a ceph cluster with an mds but don't create the fs yet (nor the necessary pools). Equivalent to the resul...
Jan Fajerski
03:29 PM Bug #20441 (Fix Under Review): mds: failure during data scan
https://github.com/ceph/ceph/pull/15979 Douglas Fuller
03:20 PM Bug #20441: mds: failure during data scan
That error is whitelisted; the relevant error is:
failure_reason: '"2017-06-27 21:09:57.713761 mds.a mds.0 172.21....
Douglas Fuller
01:02 AM Bug #20441: mds: failure during data scan
Doug, please take a look at this one. Patrick Donnelly
01:01 AM Bug #20441 (Resolved): mds: failure during data scan
... Patrick Donnelly
12:34 AM Bug #20440: mds: mds/journal.cc: 1559: FAILED assert(inotablev == mds->inotable->get_version())
Zheng, please take a look at this one. Patrick Donnelly
12:34 AM Bug #20440 (Resolved): mds: mds/journal.cc: 1559: FAILED assert(inotablev == mds->inotable->get_v...
From:
http://qa-proxy.ceph.com/teuthology/pdonnell-2017-06-27_19:50:40-fs-wip-pdonnell-20170627---basic-smithi/133...
Patrick Donnelly

06/27/2017

04:11 PM Bug #20072: TestStrays.test_snapshot_remove doesn't handle head whiteout in pgls results
FWIW that is basically what we did with the rados api test cleanup failures (loop waiting for snaptrimmer to do its th... Sage Weil
04:59 AM Bug #20424 (Resolved): doc: improve description of `mds deactivate` to better contrast with `mds ...
Currently the help output is not very useful for `ceph mds deactivate`:... Patrick Donnelly
02:21 AM Backport #20412: test_remote_update_write (tasks.cephfs.test_quota.TestQuota) fails in Jewel 10.2...
Also: https://github.com/ceph/ceph/pull/15937 Patrick Donnelly
02:07 AM Backport #20412 (Fix Under Review): test_remote_update_write (tasks.cephfs.test_quota.TestQuota) ...
John, that looks like the problem. Here's a PR:
https://github.com/ceph/ceph/pull/15936
Patrick Donnelly

06/26/2017

08:25 PM Backport #20027 (Resolved): jewel: Deadlock on two ceph-fuse clients accessing the same file
John Spray
08:24 PM Backport #19846 (Resolved): jewel: write to cephfs mount hangs, ceph-fuse and kernel
John Spray
08:21 PM Backport #20412: test_remote_update_write (tasks.cephfs.test_quota.TestQuota) fails in Jewel 10.2...
Aargh, I think this might just be failing because this is a new test that was written for luminous, where client_quot... John Spray
07:49 PM Backport #20412 (In Progress): test_remote_update_write (tasks.cephfs.test_quota.TestQuota) fails...
I dug into the logs. It looks like the MDS is not sending a quota update to the client. From a brief look at the code... Patrick Donnelly
07:30 PM Bug #20337 (New): test_rebuild_simple_altpool triggers MDS assertion
John Spray
06:54 PM Bug #20337: test_rebuild_simple_altpool triggers MDS assertion
I'm not seeing how Filesystem.are_daemons_healthy is waiting for daemons outside the filesystem: it's inspecting daem... John Spray
06:42 PM Bug #20337 (Need More Info): test_rebuild_simple_altpool triggers MDS assertion
wait_for_daemons should wait for every daemon regardless of filesystem. Is there a failure log I can look at? Douglas Fuller
05:26 AM Backport #20140 (Resolved): jewel: Journaler may execute on_safe contexts prematurely
Nathan Cutler

06/25/2017

07:56 AM Backport #20412 (Resolved): test_remote_update_write (tasks.cephfs.test_quota.TestQuota) fails in...
https://github.com/ceph/ceph/pull/15936 Nathan Cutler
 
