Activity
From 02/06/2018 to 03/07/2018
03/07/2018
- 11:26 PM Bug #23270 (New): failed mutex assert in PipeConnection::try_get_pipe() (via OSD::do_command())
- ...
- 09:29 PM Bug #23269 (New): Early use of clog in OSD startup crashes OSD
This crash occurred because log_weirdness() called osd->clog->error() probably out of init() -> load_pgs() -> read_...
- 05:13 PM Bug #23128: invalid values in ceph.conf do not issue visible warnings
- Josh Durgin wrote:
> This is being improved with a centralized configuration stored in the monitors in mimic.
I...
- 05:01 PM Bug #23267 (Resolved): scrub errors not cleared on replicas can cause inconsistent pg state when ...
The PG_STATE_INCONSISTENT flag is set based on num_scrub_errors. A pg query can show after scrub inconsistencies r...
- 12:39 PM Bug #22092 (Resolved): ceph-kvstore-tool's store-crc command does not save result to the file as ...
- 11:48 AM Bug #23258: OSDs keep crashing.
- Additional info: We were running on Kraken until last week, then upgraded to 12.2.3, where the problems started and u...
- 11:44 AM Bug #23258 (New): OSDs keep crashing.
- At least two OSDs (#11 and #20) on two different hosts in our cluster keep crashing, which prevents our cluster from get...
- 10:33 AM Backport #23256 (In Progress): luminous: bluestore: should recalc_allocated when decoding bluefs_...
- https://github.com/ceph/ceph/pull/20771
- 10:32 AM Backport #23256 (Resolved): luminous: bluestore: should recalc_allocated when decoding bluefs_fno...
- https://github.com/ceph/ceph/pull/20771
- 10:30 AM Bug #23212 (Pending Backport): bluestore: should recalc_allocated when decoding bluefs_fnode_t
- 05:01 AM Feature #23242: ceph-objectstore-tool command to trim the pg log
- The assert(num_unsent <= log_queue.size()) probably doesn't relate directly to this feature. The log_weirdness() f...
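For context, a sketch of how such an offline trim could be invoked; the op name, pgid, and paths here are assumptions, not the ticket's final syntax:
    systemctl stop ceph-osd@0                      # the OSD must be offline first
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 \
        --pgid 1.0 --op trim-pg-log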
- 01:02 AM Feature #23242: ceph-objectstore-tool command to trim the pg log
From PG::log_weirdness():
2018-03-06 16:18:57.413 7f0a593b9dc0 -1 log_channel(cluster) log [ERR] : 1.0 log bound...
- 12:26 AM Feature #23242 (In Progress): ceph-objectstore-tool command to trim the pg log
When testing the log trimming code on master the OSD crashes like this....
- 02:34 AM Feature #23236 (Rejected): should allow osd to dump slow ops
03/06/2018
- 07:25 PM Feature #23236: should allow osd to dump slow ops
- Oh yep, that'll do it. So I'm a bit confused what this ticket is supposed to mean.
- 09:55 AM Feature #23236: should allow osd to dump slow ops
- Isn't this what dump_blocked_ops is for? See also https://tracker.ceph.com/issues/23205
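A minimal sketch of the existing queries being contrasted here (the osd id is a placeholder):
    ceph daemon osd.0 dump_blocked_ops      # only ops blocked past the warning threshold
    ceph daemon osd.0 dump_ops_in_flight    # every in-flight op, slow ones included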
- 04:23 AM Feature #23236: should allow osd to dump slow ops
- I guess this is saying we don’t have a slow-only output command? dump_ops_in_flight et al certainly will print them o...
- 04:11 AM Feature #23236 (Rejected): should allow osd to dump slow ops
- After f4b74125e44fe78154fb377fa06fc08b3325859d, we have no way to print out the slow ops of OSDs. Only a summary is o...
- 03:01 PM Bug #23145 (Need More Info): OSD crashes during recovery of EC pg
- 02:29 PM Bug #23145: OSD crashes during recovery of EC pg
- @Peter:
Is there a chance to get a log with both OSD and BlueStore debug levels turned up to 20? At the moment I can't...
- 01:51 PM Feature #23242 (Resolved): ceph-objectstore-tool command to trim the pg log
- ceph-objectstore-tool command to trim the pg log
The motivation for this feature is to have a command to trim the pg log with...
- 12:54 PM Bug #23200 (Fix Under Review): invalid JSON returned when querying pool parameters
- https://github.com/ceph/ceph/pull/20745
- 06:07 AM Bug #23233 (Duplicate): The randomness of the hash function causes the object to be inhomogeneous...
- 02:41 AM Bug #23233: The randomness of the hash function causes the object to be inhomogeneous to the PG.T...
- Sorry, this report is incomplete. Please ignore it.
- 02:34 AM Bug #23233 (Duplicate): The randomness of the hash function causes the object to be inhomogeneous...
- 05:30 AM Backport #23077 (New): luminous: mon: ops get stuck in "resend forwarded message to leader"
- These are both done and the backport can proceed. :)
- 03:03 AM Bug #23235 (New): The randomness of the hash function causes the object to be inhomogeneous to th...
- The randomness of the ceph_str_hash_rjenkins hash function causes objects to be distributed unevenly across PGs. The result...
- 12:26 AM Bug #20924: osd: leaked Session on osd.7
- /a/kchai-2018-03-05_17:31:09-rados-wip-kefu-testing-2018-03-05-2238-distro-basic-smithi/2252897
03/05/2018
- 07:12 PM Bug #23228 (Closed): scrub mismatch on objects
- ...
- 06:27 PM Bug #20086: LibRadosLockECPP.LockSharedDurPP gets EEXIST
- saw this again,...
- 04:45 PM Bug #23215: config.cc: ~/.ceph/$cluster.conf is passed unexpanded to fopen()
- (I think this is rbd, right?)
- 06:01 AM Bug #23215 (Resolved): config.cc: ~/.ceph/$cluster.conf is passed unexpanded to fopen()
- parse_file() in "src/dmclock/sim/src/ConfUtils.cc" receives a filename without the tilde being expanded to correspond...
- 09:34 AM Backport #23174 (In Progress): luminous: SRV resolution fails to lookup AAAA records
- https://github.com/ceph/ceph/pull/20710
- 03:37 AM Bug #23212: bluestore: should recalc_allocated when decoding bluefs_fnode_t
- https://github.com/ceph/ceph/pull/20701
- 03:35 AM Bug #23212 (Resolved): bluestore: should recalc_allocated when decoding bluefs_fnode_t
- ...
- 03:20 AM Bug #23119: MD5-checksum of the snapshot for rbd image in Ceph(as OpenStack-Glance backend Storag...
- 宏伟 唐 wrote:
> 宏伟 唐 wrote:
> > Mykola Golub wrote:
> > > > There are no logs indicating osd crash and the outputs o...
- 02:36 AM Bug #23119: MD5-checksum of the snapshot for rbd image in Ceph(as OpenStack-Glance backend Storag...
- 宏伟 唐 wrote:
> Mykola Golub wrote:
> > > There are no logs indicating osd crash and the outputs of 'ceph daemon osd....
03/04/2018
- 02:33 AM Bug #23119: MD5-checksum of the snapshot for rbd image in Ceph(as OpenStack-Glance backend Storag...
- Mykola Golub wrote:
> > There are no logs indicating osd crash and the outputs of 'ceph daemon osd.x log dump' are a...
03/02/2018
- 10:40 PM Bug #18165 (Resolved): OSD crash with osd/ReplicatedPG.cc: 8485: FAILED assert(is_backfill_target...
- 10:23 PM Bug #23204 (Duplicate): missing primary copy of object in mixed luminous<->master cluster with bl...
- The dead jobs here failed due to this:
http://pulpito.ceph.com/yuriw-2018-03-01_22:45:38-upgrade:luminous-x-wip-yu...
- 09:21 PM Bug #22050: ERROR type entries of pglog do not update min_last_complete_ondisk, potentially ballo...
- This seems to be biting rgw's usage pools when rgw-admin usage trim occurs in pgs with little other activity.
- 04:18 PM Bug #23200 (Resolved): invalid JSON returned when querying pool parameters
- When requesting JSON formatted results for querying for pool
parameters, the list that comes back is not valid JSON....
- 01:56 PM Bug #23119: MD5-checksum of the snapshot for rbd image in Ceph(as OpenStack-Glance backend Storag...
- Both the image HEAD and snapshot "snap" show a size of 10GB, so if your exported sizes are different, the export must...
- 09:49 AM Bug #23119: MD5-checksum of the snapshot for rbd image in Ceph(as OpenStack-Glance backend Storag...
- > There are no logs indicating osd crash and the outputs of 'ceph daemon osd.x log dump' are all empty ({}).
The m...
- 08:26 AM Bug #23119: MD5-checksum of the snapshot for rbd image in Ceph(as OpenStack-Glance backend Storag...
- Jason Dillaman wrote:
> Can you please run "rados -p <pool name> listomapvals rbd_header.<image id>" and provide the...
- 12:06 PM Bug #23194 (Rejected): librados client is sending bad omap value just before program exits
- Thanks Jason. You were absolutely right -- the omap get/put at exit is being driven by ganesha. I had missed that bef...
- 04:37 AM Bug #23130: No error is shown when "osd_mon_report_interval_min" value is greater than "osd_mon_...
- Jewel is scheduled to reach End of Life when Mimic is released (around June 2018). It's possible this issue will not ...
03/01/2018
- 11:46 PM Bug #23195 (Resolved): Read operations segfaulting multiple OSDs
- I'm seeing some OSDs crashing at the same time with (mostly) the same error message related to reading an erasure c...
- 11:14 PM Bug #23194: librados client is sending bad omap value just before program exits
- ... there was an "omap get" right before the store, and the values stored were the (truncated) values that were just r...
- 10:38 PM Bug #23194: librados client is sending bad omap value just before program exits
- rados_kv_get does look hinky, but I don't think we're calling into it here. We're basically doing a rados_kv_put into...
- 09:53 PM Bug #23194: librados client is sending bad omap value just before program exits
- I don't know what nfs-ganesha code to look at, but this [1] looks very suspect to me since you are returning a pointe...
- 09:43 PM Bug #23194: librados client is sending bad omap value just before program exits
- Frame 201:
Object: rec-00000000:0000000000000017
Key: 6528071705456279553
Value: ::ffff:192.168.1.243-(37:Linux NF...
- 09:16 PM Bug #23194: librados client is sending bad omap value just before program exits
- I do have the ability to collect client logs within the container, and can turn up debugging in there if it'll help.
- 08:56 PM Bug #23194: librados client is sending bad omap value just before program exits
- Ahh, the object name is 29 bytes in this case, so maybe there is some confusion about lengths down in the code that i...
- 08:49 PM Bug #23194 (Rejected): librados client is sending bad omap value just before program exits
- I've been tracking down a problem in nfs-ganesha where an omap value in an object ends up truncated. It doesn't alway...
- 11:47 AM Backport #23160 (Need More Info): luminous: Multiple asserts caused by DNE pgs left behind after ...
- Waiting for code review of backport PR: https://github.com/ceph/ceph/pull/20668
- 11:16 AM Bug #20798: LibRadosLockECPP.LockExclusiveDurPP gets EEXIST
- ...
- 09:55 AM Bug #21388: inconsistent pg but repair does nothing reporting head data_digest != data_digest fro...
- Yoann Moulin wrote:
> David Zafman wrote:
> > Yoann Moulin wrote:
> > > is that normal all files in 11.5f_head hav...
- 09:26 AM Bug #21388: inconsistent pg but repair does nothing reporting head data_digest != data_digest fro...
- Attached is the result of the dump for each OSD with the correct args
- 09:08 AM Bug #21388: inconsistent pg but repair does nothing reporting head data_digest != data_digest fro...
- David Zafman wrote:
> Yoann Moulin wrote:
> > is that normal all files in 11.5f_head have size=0 on each replicate ...
- 09:06 AM Bug #21388: inconsistent pg but repair does nothing reporting head data_digest != data_digest fro...
- Attached is the result of the dump for each OSD,
and the extended attributes for the files on disk:...
- 01:55 AM Bug #21388: inconsistent pg but repair does nothing reporting head data_digest != data_digest fro...
- Yoann Moulin wrote:
> is that normal all files in 11.5f_head have size=0 on each replicate of the PG ?
>
> [...]
...
- 01:11 AM Bug #21388: inconsistent pg but repair does nothing reporting head data_digest != data_digest fro...
- Can you dump the object with something like the following....
- 09:33 AM Backport #23186 (In Progress): luminous: ceph tell mds.* <command> prints only one matching usage
- https://github.com/ceph/ceph/pull/20664
- 09:26 AM Backport #23186 (Resolved): luminous: ceph tell mds.* <command> prints only one matching usage
- https://github.com/ceph/ceph/pull/20664
- 09:25 AM Bug #23125 (Duplicate): Bad help text when 'ceph osd pool' is run
- 02:38 AM Bug #23125: Bad help text when 'ceph osd pool' is run
- I am working on this issue. Thanks.
- 08:36 AM Feature #23045 (Fix Under Review): mon: warn on slow ops in OpTracker
- 07:56 AM Feature #23045: mon: warn on slow ops in OpTracker
- https://github.com/ceph/ceph/pull/20660
- 03:30 AM Bug #23124: Status of OSDs are not showing properly after disabling ceph.target and ceph-osd.target
- As OSDs are brought up by the udev rules, regardless of the enabled status of "ceph.target" and "ceph-osd.target" hen...
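A sketch of the distinction, assuming systemd-managed OSDs: disabling a target only affects what is pulled in at boot, while masking a unit is what actually blocks activation:
    systemctl disable ceph-osd.target      # does not stop udev-triggered OSD starts
    systemctl mask ceph-osd@11.service     # this does prevent osd.11 from being started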
- 03:14 AM Bug #23119: MD5-checksum of the snapshot for rbd image in Ceph(as OpenStack-Glance backend Storag...
- Can you please run "rados -p <pool name> listomapvals rbd_header.<image id>" and provide the output? You can determin...
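A hedged sketch of the requested check (pool and image names are placeholders):
    rbd info glance/<image> | grep block_name_prefix   # yields rbd_data.<image id>
    rados -p glance listomapvals rbd_header.<image id>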
- 01:57 AM Bug #23119: MD5-checksum of the snapshot for rbd image in Ceph(as OpenStack-Glance backend Storag...
- Jason Dillaman wrote:
> Yes, snapshots are read-only so the only thing I can think of is some sort of data corruptio... - 12:21 AM Bug #23119: MD5-checksum of the snapshot for rbd image in Ceph(as OpenStack-Glance backend Storag...
- Mykola Golub wrote:
> It looks like your log entries are from an in-memory log dump. Did your osd crash (could be seen i...
02/28/2018
- 10:41 PM Bug #23132 (Triaged): some config values should be unsigned, to disallow negative values
- 10:37 PM Bug #23130 (Triaged): No error is shown when "osd_mon_report_interval_min" value is greater than...
- This only affects jewel since the osd_mon_report_interval_max option is no longer used in luminous and later.
- 10:35 PM Bug #23128: invalid values in ceph.conf do not issue visible warnings
- This is reverting to the default value since 1.1 is not a valid value for the option.
This is being improved with...
- 10:34 PM Bug #23128 (Triaged): invalid values in ceph.conf do not issue visible warnings
- 10:31 PM Bug #23125 (Triaged): Bad help text when 'ceph osd pool' is run
- 10:30 PM Bug #23124 (Won't Fix): Status of OSDs are not showing properly after disabling ceph.target and c...
- As Nathan explained, this isn't how the targets are meant to work.
- 10:27 PM Bug #23145: OSD crashes during recovery of EC pg
- Sage, is this a bluestore issue, or did we lose the rollback info somewhere?
It looks like it's getting enoent for...
- 11:23 AM Backport #23181 (In Progress): jewel: Can't repair corrupt object info due to bad oid on all repl...
- 11:22 AM Backport #23181 (Resolved): jewel: Can't repair corrupt object info due to bad oid on all replicas
- https://github.com/ceph/ceph/pull/20622
- 11:20 AM Backport #23174 (Resolved): luminous: SRV resolution fails to lookup AAAA records
- https://github.com/ceph/ceph/pull/20710
- 11:19 AM Bug #20471 (Pending Backport): Can't repair corrupt object info due to bad oid on all replicas
- 10:33 AM Bug #21388: inconsistent pg but repair does nothing reporting head data_digest != data_digest fro...
- is that normal all files in 11.5f_head have size=0 on each replicate of the PG ?...
- 08:17 AM Bug #21388: inconsistent pg but repair does nothing reporting head data_digest != data_digest fro...
- Here is the result of the 3 commands for each replica of the PG; osd.78 on iccluster020 is the one with the error:
...
- 06:57 AM Bug #23119: MD5-checksum of the snapshot for rbd image in Ceph(as OpenStack-Glance backend Storag...
- It looks like your log entries are from an in-memory log dump. Did your osd crash (could be seen in the log) or did you u...
- 12:30 AM Bug #23119: MD5-checksum of the snapshot for rbd image in Ceph(as OpenStack-Glance backend Storag...
- Jason Dillaman wrote:
> Yes, snapshots are read-only so the only thing I can think of is some sort of data corruptio...
- 01:26 AM Bug #23078 (Pending Backport): SRV resolution fails to lookup AAAA records
- 01:19 AM Bug #22462 (Resolved): mon: unknown message type 1537 in luminous->mimic upgrade tests
- 01:13 AM Bug #22656: scrub mismatch on bytes (cache pools)
- http://pulpito.ceph.com/kchai-2018-02-27_10:33:49-rados-wip-kefu-testing-2018-02-27-1348-distro-basic-mira/2232486/
...
02/27/2018
- 11:13 PM Bug #22902: src/osd/PG.cc: 6455: FAILED assert(0 == "we got a bad state machine event")
- This one looks like a similar failure: http://pulpito.ceph.com/nojha-2018-02-23_18:13:41-rados-wip-async-recovery-201...
- 06:49 PM Bug #18746: monitors crashing ./include/interval_set.h: 355: FAILED assert(0) (jewel+kraken)
- To summarize what I've figured out to reproduce this:
* both rbd client and mon are running 12.2.4, happened with ...
- 05:46 PM Bug #18746: monitors crashing ./include/interval_set.h: 355: FAILED assert(0) (jewel+kraken)
- Still happening on 12.2.4...
- 04:23 PM Bug #23124: Status of OSDs are not showing properly after disabling ceph.target and ceph-osd.target
- The ceph.target and ceph-osd.target cannot be used this way. Assuming ceph-disk is being used, the OSDs are brought u...
- 04:08 PM Feature #22974 (Resolved): documentation - pg state table missing "activating" state
- 04:08 PM Backport #23113 (Resolved): luminous: documentation - pg state table missing "activating" state
- 12:55 PM Backport #23160 (Resolved): luminous: Multiple asserts caused by DNE pgs left behind after lots o...
- https://github.com/ceph/ceph/pull/20668
- 11:48 AM Bug #21716 (Resolved): ObjectStore/StoreTest.FiemapHoles/3 fails with kstore
- 11:47 AM Backport #21871 (Rejected): luminous: ObjectStore/StoreTest.FiemapHoles/3 fails with kstore
- @smithfarm I am sorry, but it turns out that this backport is not needed, because of http://tracker.ceph.com/issues/2...
- 08:58 AM Bug #23145 (Duplicate): OSD crashes during recovery of EC pg
- I've got a cluster (running released debs of ceph 12.2.3) that started crashing on OSD startup a little bit ago. I di...
- 06:12 AM Backport #23075 (In Progress): luminous: osd: objecter sends out of sync with pg epochs for proxi...
- https://github.com/ceph/ceph/pull/20609
02/26/2018
- 09:18 PM Bug #23119: MD5-checksum of the snapshot for rbd image in Ceph(as OpenStack-Glance backend Storag...
- ... possible, but it actually does say it's replicated cache tiers in front of EC backends which should rule-out data...
- 09:01 PM Bug #23119: MD5-checksum of the snapshot for rbd image in Ceph(as OpenStack-Glance backend Storag...
- Couldn't this be related to #21639 (snapshots was not created/deleted against data pool)? The reported version here i...
- 07:16 PM Bug #23119 (Need More Info): MD5-checksum of the snapshot for rbd image in Ceph(as OpenStack-Glan...
- Yes, snapshots are read-only so the only thing I can think of is some sort of data corruption on the OSDs. Have you r...
- 08:23 PM Bug #22996 (Resolved): Snapset inconsistency is no longer detected
- 08:20 PM Backport #23054 (Resolved): luminous: Snapset inconsistency is no longer detected
- https://github.com/ceph/ceph/pull/20501
- 08:04 PM Backport #23093 (Resolved): luminous: last-stat-seq returns 0 because osd stats are cleared
- 08:03 PM Bug #21833 (Pending Backport): Multiple asserts caused by DNE pgs left behind after lots of OSD r...
- 07:32 PM Feature #23087 (Duplicate): Add OSD metrics to keep track of per-client IO
- We've discussed "rbd top" before (http://tracker.ceph.com/projects/ceph/wiki/CDM_07-DEC-2016, http://tracker.ceph.com...
- 05:07 AM Bug #23132 (Triaged): some config values should be unsigned, to disallow negative values
- Execution Steps:
-------------------
1. Set negative value for parameter "osd_heartbeat_interval" in ceph.conf
2....
- 04:56 AM Backport #23114 (In Progress): luminous: can't delete object from pool when Ceph out of space
- https://github.com/ceph/ceph/pull/20585
- 04:54 AM Bug #23130 (Triaged): No error is shown when "osd_mon_report_interval_min" value is greater than...
- Execution Steps:
------------------
1. Set the "osd_mon_report_interval_min" value using CLI
# ceph daemon osd...
- 04:37 AM Feature #23129: After creating a snapshot of a rados pool when we try to rollback the pool it all...
- rados -p testpool rollback myobject1 testpool-snap
[Note: only the mentioned object is rolled back from the snapshot]
- 04:35 AM Feature #23129 (New): After creating a snapshot of a rados pool when we try to rollback the pool ...
- Execution Steps:
------------------
1. Creating a pool
# ceph osd pool create testpool 16 16
2. Add ... - 04:29 AM Bug #23128 (Triaged): invalid values in ceph.conf do not issue visible warnings
- Execution Steps
-----------------
1. Change the setting of "mon osd down out interval" in ceph.conf as per below
...
- 04:24 AM Bug #21388: inconsistent pg but repair does nothing reporting head data_digest != data_digest fro...
- Yes, size 0 object is expected since all copies report '"size": 0'.
The discrepancy appears to be in the omap data...
- 04:10 AM Bug #23125 (Duplicate): Bad help text when 'ceph osd pool' is run
- Execution Steps :
-----------------
1. While executing the CLI for creating a snapshot of a pool
#ceph osd pool ...
- 04:04 AM Bug #23124 (Won't Fix): Status of OSDs are not showing properly after disabling ceph.target and c...
- Execution Steps:
----------------
1. # ceph osd tree [ceph is in running state]
2. # systemctl disab...
- 03:49 AM Feature #23123 (New): use pwrite to emulate posix_fallocate
- less IO when using a plain file as the store for testing bluestore if posix_fallocate() is not available.
see ht...
- 03:24 AM Backport #23113 (In Progress): luminous: documentation - pg state table missing "activating" state
- https://github.com/ceph/ceph/pull/20584
02/25/2018
- 08:31 AM Bug #23119 (Need More Info): MD5-checksum of the snapshot for rbd image in Ceph(as OpenStack-Glan...
- Ceph Version: 12.2.2 Luminous Stable
Problem description:
We use ceph as the backend storage for OpenStack Glance...
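One way to run the comparison the report describes (a sketch; pool, image, and snapshot names are assumptions):
    rbd export glance/<image> - | md5sum        # checksum of the image HEAD
    rbd export glance/<image>@snap - | md5sum   # checksum of the snapshot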
02/24/2018
- 07:14 PM Bug #23117 (Fix Under Review): PGs stuck in "activating" after osd_max_pg_per_osd_hard_ratio has ...
- In the following setup:
* 6 OSD hosts
* Each host with 32 disks = 32 OSDs
* Pool with 2048 PGs, EC, k=4, m=2, crus...
- 05:54 PM Bug #21833: Multiple asserts caused by DNE pgs left behind after lots of OSD restarts
- https://github.com/ceph/ceph/pull/20571
- 05:54 PM Bug #21833: Multiple asserts caused by DNE pgs left behind after lots of OSD restarts
- Not sure if this needs a Jewel backport
- 11:22 AM Backport #23114 (Resolved): luminous: can't delete object from pool when Ceph out of space
- https://github.com/ceph/ceph/pull/20585
- 11:21 AM Backport #23113 (Resolved): luminous: documentation - pg state table missing "activating" state
- https://github.com/ceph/ceph/pull/20584
- 04:39 AM Feature #22974 (Pending Backport): documentation - pg state table missing "activating" state
- https://github.com/ceph/ceph/pull/20504
- 04:35 AM Bug #23078: SRV resolution fails to lookup AAAA records
- 04:32 AM Bug #22952 (Duplicate): Monitor stopped responding after awhile
- Great! I am marking this ticket as a "duplicate". Please reopen it if you think otherwise.
Happy Chinese New Year ...
- 04:20 AM Bug #22413 (Pending Backport): can't delete object from pool when Ceph out of space
02/23/2018
- 10:24 PM Feature #23096: mon: don't remove auth caps without a flag
- We could throw an error instead, yeah. That is probably a wise forcing function. I think we still want the flag thoug...
- 11:37 AM Feature #23096: mon: don't remove auth caps without a flag
- Bit torn on this one: there is a security downside to changing this behaviour in-place -- any existing scripts that e...
- 01:08 AM Feature #23096 (New): mon: don't remove auth caps without a flag
- With current syntax, something like...
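A hypothetical illustration of the behavior in question: 'ceph auth caps' replaces an entity's caps wholesale, so any section left out is silently dropped:
    # suppose client.foo currently has both mon and osd caps;
    # this call replaces them all, silently dropping the osd caps
    ceph auth caps client.foo mon 'allow r'
    ceph auth get client.foo    # the osd caps are now gone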
- 08:02 PM Bug #21833 (In Progress): Multiple asserts caused by DNE pgs left behind after lots of OSD restarts
- 02:03 AM Bug #21833: Multiple asserts caused by DNE pgs left behind after lots of OSD restarts
- I was working on this last week, but got distracted by other issues. I'm going to force this scenario and see about f...
- 02:01 PM Backport #23103 (In Progress): luminous: v12.2.2 unable to create bluestore osd using ceph-disk
- 01:50 PM Backport #23103 (Resolved): luminous: v12.2.2 unable to create bluestore osd using ceph-disk
- https://github.com/ceph/ceph/pull/20563
- 11:54 AM Bug #18165 (Pending Backport): OSD crash with osd/ReplicatedPG.cc: 8485: FAILED assert(is_backfil...
- This should not have been marked Resolved when one of the backports was still open.
- 08:33 AM Bug #21388: inconsistent pg but repair does nothing reporting head data_digest != data_digest fro...
- Hello Brad,
Sorry, I was too fast;
the rados get with the correct pool returns a file with size=0...
- Does "rados -p disks ls" list the object? Can you find the actual storage for this object on the disks used for these...
02/22/2018
- 11:56 PM Backport #23093 (In Progress): luminous: last-stat-seq returns 0 because osd stats are cleared
- 11:43 PM Backport #23093: luminous: last-stat-seq returns 0 because osd stats are cleared
- https://github.com/ceph/ceph/pull/20548
- 05:52 PM Backport #23093 (Resolved): luminous: last-stat-seq returns 0 because osd stats are cleared
I added an assert which crashes ceph-mgr because PGMap::apply_incremental() processes an osd_stat_t that is all zero...
- 11:40 PM Bug #22882 (Fix Under Review): Objecter deadlocked on op budget while holding rwlock in ms_handle...
- https://github.com/ceph/ceph/pull/20519
- 09:40 PM Bug #22952: Monitor stopped responding after awhile
- Thanks, with the 12.2.3 + this patch, the cluster is now back to HEALTH_OK state
- 06:37 PM Bug #22952: Monitor stopped responding after awhile
- Kefu Chai wrote:
> Frank, sorry for the latency. i am just back from the holiday. i pushed 12.2.3 + https://github.c... - 10:07 AM Bug #22952: Monitor stopped responding after awhile
- Frank, sorry for the latency. I am just back from the holiday. I pushed 12.2.3 + https://github.com/ceph/ceph/pull/20...
- 06:06 PM Bug #22662 (Resolved): ceph osd df json output validation reported invalid numbers (-nan) (jewel)
- 06:05 PM Backport #22866 (Resolved): jewel: ceph osd df json output validation reported invalid numbers (-...
- 04:03 PM Bug #21121 (Resolved): test_health_warnings.sh can fail
- 04:03 PM Backport #21239 (Resolved): jewel: test_health_warnings.sh can fail
- 02:09 PM Backport #23077 (Need More Info): luminous: mon: ops get stuck in "resend forwarded message to le...
- This backport has two master PRs:
* https://github.com/ceph/ceph/pull/20467
* https://github.com/ceph/ceph/pull/2...
- 01:14 PM Bug #21388: inconsistent pg but repair does nothing reporting head data_digest != data_digest fro...
- Hello,
I'm also having this issue... - 12:54 PM Feature #23087 (Duplicate): Add OSD metrics to keep track of per-client IO
- In our online clusters, there are times when some RBD images' size increase rapidly, which could fill up the whole cl...
- 11:10 AM Bug #22413 (Fix Under Review): can't delete object from pool when Ceph out of space
- https://github.com/ceph/ceph/pull/20534
- 09:58 AM Bug #22354 (Pending Backport): v12.2.2 unable to create bluestore osd using ceph-disk
- 08:17 AM Bug #23078 (Fix Under Review): SRV resolution fails to lookup AAAA records
- 08:09 AM Bug #23078: SRV resolution fails to lookup AAAA records
- In the meantime btw, a Round Robin IPv6 DNS record works just fine, something like:...
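A hypothetical zone snippet of the shape being described, with invented addresses:
    ; one round-robin name resolving to all three mons
    mon.example.com.  IN  AAAA  2001:db8::11
    mon.example.com.  IN  AAAA  2001:db8::12
    mon.example.com.  IN  AAAA  2001:db8::13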
- 07:35 AM Bug #23078: SRV resolution fails to lookup AAAA records
- Simon Leinen wrote:
> WANG Guoqin actually noted the lack of IPv6 support in "a comment on issue #14527":http://trac...
- 06:29 AM Bug #22462 (Fix Under Review): mon: unknown message type 1537 in luminous->mimic upgrade tests
- https://github.com/ceph/ceph/pull/20528
- 05:41 AM Bug #22462: mon: unknown message type 1537 in luminous->mimic upgrade tests
- MMonHealth (MSG_MON_HEALTH=0x601 (1537)) was removed in https://github.com/ceph/ceph/commit/7b4a741fbda4dc817a003c694...
02/21/2018
- 10:46 PM Feature #14527: Lookup monitors through DNS
- WANG Guoqin wrote:
> The recent code doesn't support IPv6, apparently. Maybe we can choose among ns_t_a and ns_t_aaa...
- 10:44 PM Bug #23078: SRV resolution fails to lookup AAAA records
- WANG Guoqin actually noted the lack of IPv6 support in "a comment on issue #14527":http://tracker.ceph.com/issues/145...
- 10:26 PM Bug #23078 (Resolved): SRV resolution fails to lookup AAAA records
- We have some IPv6 Rados clusters. So far we have been specifying the addresses of each cluster's three mons using li...
- 09:56 PM Support #23005: Implement rados for Python library with some problem
- Does this work without pyinstaller on your system?
- 09:54 PM Bug #23029: osd does not handle eio on meta objects (e.g., osdmap)
- We could at least fail more politely here even if we can't recover from it in the short term.
- 09:50 PM Bug #23049: ceph Status shows only WARN when traffic to cluster fails
- Can reproduce easily - thanks for the report.
2 bugs here - 1) the monitor is still enforcing the mon_osd_min_up_r...
- 09:46 PM Support #23050 (Closed): PG doesn't move to down state in replica pool
- 'stale' means there haven't been any reports from the primary in a while. Since there's no osd to report the status o...
- 09:40 PM Bug #23051: PGs stuck in down state
- Can you post the results of 'ceph pg $PGID query' for some of the down pgs?
- 09:34 PM Bug #22994: rados bench doesn't use --max-objects
- rados tool options are pretty confusing - help text should make more clear what the options are for bench vs load-gen...
- 09:27 PM Backport #23076 (In Progress): jewel: osd: objecter sends out of sync with pg epochs for proxied ops
- 09:26 PM Backport #23076 (Resolved): jewel: osd: objecter sends out of sync with pg epochs for proxied ops
- https://github.com/ceph/ceph/pull/20518
- 09:26 PM Backport #23077 (Resolved): luminous: mon: ops get stuck in "resend forwarded message to leader"
- https://github.com/ceph/ceph/pull/21016
- 09:26 PM Backport #23075 (Resolved): luminous: osd: objecter sends out of sync with pg epochs for proxied ops
- https://github.com/ceph/ceph/pull/20609
- 07:48 PM Bug #22114: mon: ops get stuck in "resend forwarded message to leader"
- Oh, second PR for the OSD beacons and PG create messages: https://github.com/ceph/ceph/pull/20517
- 04:35 PM Bug #22114 (Pending Backport): mon: ops get stuck in "resend forwarded message to leader"
- 04:34 PM Bug #22123 (Pending Backport): osd: objecter sends out of sync with pg epochs for proxied ops
- 01:06 PM Bug #19737: EAGAIN encountered during pg scrub (jewel)
- @Josh - thanks
https://github.com/ceph/ceph/pull/20508
- 12:49 AM Bug #23031: FAILED assert(!parent->get_log().get_missing().is_missing(soid))
osd.0 was the primary before it crashed, came back up, and crashed again as originally indicated in this bug. This is ...
02/20/2018
- 04:03 PM Backport #23054 (Resolved): luminous: Snapset inconsistency is no longer detected
The fix for #20243 required additional handling of snapset inconsistency. The Object info and snapset aren't part ...
- 12:26 PM Bug #23051 (New): PGs stuck in down state
- Hello,
We see PGs stuck in down state even when the respective osds are started and recovered from the failure sc... - 10:38 AM Bug #21833: Multiple asserts caused by DNE pgs left behind after lots of OSD restarts
- I can confirm this on 12.2.2. It makes data unavailable.
My output:... - 10:14 AM Support #23050: PG doesn't move to down state in replica pool
- Please let me know of the required logs/info to be added if any.
- 10:13 AM Support #23050 (Closed): PG doesn't move to down state in replica pool
- Hello,
Environment used - 3 node cluster
Replication - 3
#ceph osd pool ls detail
pool 16 'cdvr_ec' replica... - 09:45 AM Backport #17445 (Resolved): jewel: list-snap cache tier missing promotion logic (was: rbd cli seg...
- 09:43 AM Feature #15835 (Resolved): filestore: randomize split threshold
- 09:42 AM Backport #22658 (Resolved): filestore: randomize split threshold
- 09:35 AM Backport #22794 (Resolved): jewel: heartbeat peers need to be updated when a new OSD added into a...
- 09:33 AM Bug #20705 (Resolved): repair_test fails due to race with osd start
- 09:33 AM Backport #22818 (Resolved): jewel: repair_test fails due to race with osd start
- 09:04 AM Backport #23024 (In Progress): luminous: thrash-eio + bluestore (hangs with unfound objects or re...
- https://github.com/ceph/ceph/pull/20495
- 06:16 AM Backport #23024: luminous: thrash-eio + bluestore (hangs with unfound objects or read_log_and_mis...
- I'm on it.
- 08:55 AM Bug #23049: ceph Status shows only WARN when traffic to cluster fails
- Please let me know of the required logs/info to be added if any.
- 08:54 AM Bug #23049 (New): ceph Status shows only WARN when traffic to cluster fails
- Hello,
While using Kraken, I have seen the status change to ERR, but in luminous we do not see the status of ceph ...
- https://github.com/ceph/ceph/pull/20450
- 05:30 AM Bug #19737: EAGAIN encountered during pg scrub (jewel)
- Looked at the logs from http://pulpito.front.sepia.ceph.com/smithfarm-2018-02-06_21:07:15-rados-wip-jewel-backports-d...
02/19/2018
- 10:59 PM Bug #18178 (Won't Fix): Unfound objects lost after OSD daemons restarted
Reasons this is being closed:
1. PG repair is moving to user mode so on-the-fly object repair probably won't use r...
- 09:58 PM Feature #23045: mon: warn on slow ops in OpTracker
- I've assigned this to myself but I don't know when I can get to it, so if you want to work on this feel free to take it!
- 09:56 PM Feature #23045 (Resolved): mon: warn on slow ops in OpTracker
- The monitor has an OpTracker now, but it doesn't warn on slow ops the way the MDS or OSD do. We should enable that to...
- 09:52 PM Bug #23030: osd: crash during recovery with assert(p != recovery_info.ss.clone_snap)and assert(re...
- This snapshot assert looks like "Ceph Luminous - pg is down due to src/osd/SnapMapper.cc: 246: FAILED assert(r == -2)...
- 09:02 PM Feature #23044 (New): osd: use madvise with MADV_DONTDUMP to prevent cached data from being core ...
- Idea here is to reduce the size of the coredumps but also to prevent sensitive data from being leaked.
- 02:55 PM Bug #22123 (Fix Under Review): osd: objecter sends out of sync with pg epochs for proxied ops
- https://github.com/ceph/ceph/pull/20484
I opted for the marginally more complex solution of cancelling multiple o...
02/17/2018
- 02:20 AM Bug #23031 (New): FAILED assert(!parent->get_log().get_missing().is_missing(soid))
- Using vstart to start 3 OSDs with -o filestore debug inject read err=1
Manually injectdataerr on all replicas of o...
- 12:37 AM Bug #23030 (Fix Under Review): osd: crash during recovery with assert(p != recovery_info.ss.clone...
- I've got some OSDs in a 5/3 EC pool crashing during recovery. The crash happens simultaneously on 5 to 10 OSDs, some...
- 12:36 AM Bug #22743: "RadosModel.h: 854: FAILED assert(0)" in upgrade:hammer-x-jewel-distro-basic-smithi
- I looked at it briefly and the output doesn't make any sense to me, but I don't have a lot of context around what the...
02/16/2018
- 11:49 PM Bug #22114 (Fix Under Review): mon: ops get stuck in "resend forwarded message to leader"
- https://github.com/ceph/ceph/pull/20467
- 02:14 PM Bug #22114: mon: ops get stuck in "resend forwarded message to leader"
- Greg Farnum wrote:
> Ummm, yep, that looks right to me at a quick glance! Can you submit a PR with that change? :)
...
- 02:04 PM Bug #22114: mon: ops get stuck in "resend forwarded message to leader"
- Maybe not. You should check the code on github.com.
- 01:22 PM Bug #22114: mon: ops get stuck in "resend forwarded message to leader"
- hongpeng lu wrote:
> The messages cannot be forwarded appropriately; you must change the code like this.
> [...]
...
- 01:17 PM Bug #22114: mon: ops get stuck in "resend forwarded message to leader"
- The messages cannot be forwarded appropriately; you must change the code like this....
- 12:52 PM Bug #22114: mon: ops get stuck in "resend forwarded message to leader"
- We have the same problem on all our Luminous clusters. Any news regarding a fix?
Most stuck messages in our case are o...
- 10:35 PM Bug #22743: "RadosModel.h: 854: FAILED assert(0)" in upgrade:hammer-x-jewel-distro-basic-smithi
- @nathan This doesn't have cache tier, so it would be a different issue. Maybe related to upgrade?
- 07:58 PM Bug #22743: "RadosModel.h: 854: FAILED assert(0)" in upgrade:hammer-x-jewel-distro-basic-smithi
- @David I guess this is a duplicate, too?
- 04:27 PM Bug #22743: "RadosModel.h: 854: FAILED assert(0)" in upgrade:hammer-x-jewel-distro-basic-smithi
- seems reproducible, see
http://pulpito.ceph.com/teuthology-2018-02-16_01:15:03-upgrade:hammer-x-jewel-distro-basic-...
- 10:01 PM Bug #23029 (New): osd does not handle eio on meta objects (e.g., osdmap)
- ...
- 05:00 PM Bug #22063 (Duplicate): "RadosModel.h: 1703: FAILED assert(!version || comp->get_version64() == v...
- 04:59 PM Bug #22064 (Duplicate): "RadosModel.h: 865: FAILED assert(0)" in rados-jewel-distro-basic-smithi
- 11:03 AM Backport #23024 (Resolved): luminous: thrash-eio + bluestore (hangs with unfound objects or read_...
- https://github.com/ceph/ceph/pull/20495
- 12:07 AM Bug #21218 (Pending Backport): thrash-eio + bluestore (hangs with unfound objects or read_log_and...
02/15/2018
- 06:53 PM Bug #22952: Monitor stopped responding after awhile
- Frank Li wrote:
> either 12.2.2 + the patch or 12.2.3 RC + the patch would be good, whichever is more convenient to ... - 05:09 PM Bug #22996 (Fix Under Review): Snapset inconsistency is no longer detected
- 04:13 PM Bug #18746: monitors crashing ./include/interval_set.h: 355: FAILED assert(0) (jewel+kraken)
- I've got a cluster here where this issue is 100% reproducible when trying to delete snapshots. Let me know if we can ...
- 04:07 PM Bug #21833: Multiple asserts caused by DNE pgs left behind after lots of OSD restarts
- I'm also seeing this on 12.2.2. The crashing OSD has some bad PG which crashes it on startup. I first assumed the dis...
- 03:47 PM Backport #21871 (In Progress): luminous: ObjectStore/StoreTest.FiemapHoles/3 fails with kstore
- 03:45 PM Backport #21871 (Need More Info): luminous: ObjectStore/StoreTest.FiemapHoles/3 fails with kstore
- somewhat non-trivial, @Kefu could you take a look?
- 03:40 PM Backport #21871 (In Progress): luminous: ObjectStore/StoreTest.FiemapHoles/3 fails with kstore
- 03:39 PM Backport #21872 (Resolved): jewel: ObjectStore/StoreTest.FiemapHoles/3 fails with kstore
- 07:34 AM Support #23005 (New): Implement rados for Python library with some problem
- Hi all,
This is my first time here.
I use the ceph rados library to implement custom Python code, and ...
02/14/2018
- 10:02 PM Bug #18746: monitors crashing ./include/interval_set.h: 355: FAILED assert(0) (jewel+kraken)
- I'm seeing this on Luminous. Some kRBD clients are sending requests of death killing the active monitor.
No special ... - 08:30 PM Bug #22462: mon: unknown message type 1537 in luminous->mimic upgrade tests
- @Kefu could you please take a look?
- 05:48 PM Bug #22847: ceph osd force-create-pg cause all ceph-mon to crash and unable to come up again
- ok, I'll wait for 12.2.4 or a 12.2.3 + the patch then.
- 09:10 AM Bug #22847: ceph osd force-create-pg cause all ceph-mon to crash and unable to come up again
- Frank Li wrote:
> just curious, I saw this patch got merged to the master branch and has the target version of 12.2.... - 06:51 AM Bug #22847: ceph osd force-create-pg cause all ceph-mon to crash and unable to come up again
- just curious, I saw this patch got merged to the master branch and has the target version of 12.2.3, does that mean i...
- 06:50 AM Bug #22952: Monitor stopped responding after awhile
- either 12.2.2 + the patch or 12.2.3 RC + the patch would be good, whichever is more convenient to build.
- 06:05 AM Bug #22996: Snapset inconsistency is no longer detected
- We also need this fix to include tests that happen in the QA suite to prevent a future regression! :)
(Presumably th...
- 03:39 AM Bug #22996: Snapset inconsistency is no longer detected
- 03:37 AM Bug #22996 (Resolved): Snapset inconsistency is no longer detected
The fix for #20243 required additional handling of snapset inconsistency. The Object info and snapset aren't part ...
02/13/2018
- 07:53 PM Bug #22994 (New): rados bench doesn't use --max-objects
- It would be useful for testing OSD caching behavior if rados bench would respect --max-objects parameter. It seems t...
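The invocation in question would look like this (a sketch; per the report the flag is accepted on the command line but not honored by bench):
    rados bench -p testpool 60 write --max-objects 100 --no-cleanup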
- 07:30 PM Bug #22992: mon: add RAM usage (including avail) to HealthMonitor::check_member_health?
- Turned out it was just the monitor being thrashed (didn't realize we were doing that in kcephfs!): #22993
Still, m...
- 06:43 PM Bug #22992 (New): mon: add RAM usage (including avail) to HealthMonitor::check_member_health?
- I'm looking into several MON_DOWN failures from
http://pulpito.ceph.com/pdonnell-2018-02-13_17:49:41-kcephfs-wip-p... - 06:12 PM Bug #21218: thrash-eio + bluestore (hangs with unfound objects or read_log_and_missing assert)
- https://github.com/ceph/ceph/pull/20410
- 04:04 AM Bug #21218 (Fix Under Review): thrash-eio + bluestore (hangs with unfound objects or read_log_and...
- 12:27 PM Bug #22063: "RadosModel.h: 1703: FAILED assert(!version || comp->get_version64() == version)" inr...
- Another jewel run with this bug:
* http://qa-proxy.ceph.com/teuthology/smithfarm-2018-02-06_21:07:15-rados-wip-jew...
- 06:52 AM Bug #22952: Monitor stopped responding after awhile
- Kefu Chai wrote:
> > I reproduced the issue in a separate cluster
>
> Could you share the steps to reproduce this...
02/12/2018
- 10:35 PM Bug #21218: thrash-eio + bluestore (hangs with unfound objects or read_log_and_missing assert)
This assert can only happen in the following two cases:
osd debug verify missing on start = true. Used in t...
- 10:07 PM Bug #21218: thrash-eio + bluestore (hangs with unfound objects or read_log_and_missing assert)
- For kefu's run above,...
- 03:07 AM Bug #21218: thrash-eio + bluestore (hangs with unfound objects or read_log_and_missing assert)
- thrash-eio + bluestore
/a/kchai-2018-02-11_04:16:47-rados-wip-kefu-testing-2018-02-11-0959-distro-basic-mira/2181825...
- 10:05 AM Bug #22354 (Fix Under Review): v12.2.2 unable to create bluestore osd using ceph-disk
- https://github.com/ceph/ceph/pull/20400
- 09:52 AM Bug #22445: ceph osd metadata reports wrong "back_iface"
- John Spray wrote:
> Hmm, this could well be the first time anyone's really tested the IPv6 path here.
https://git...
- 09:27 AM Backport #22942 (In Progress): luminous: ceph osd force-create-pg cause all ceph-mon to crash and...
- 08:57 AM Bug #22952: Monitor stopped responding after awhile
- > I reproduced the issue in a separate cluster
Could you share the steps to reproduce this issue, so I can try it ...
- 05:58 AM Bug #22949 (Rejected): ceph_test_admin_socket_output --all times out
- 05:57 AM Bug #22949: ceph_test_admin_socket_output --all times out
- Thanks Brad. My bad, I thought the bug was in master also. Closing this ticket, as the related PR is not yet merged.
02/10/2018
- 08:50 AM Bug #22949: ceph_test_admin_socket_output --all times out
- 08:39 AM Bug #22949: ceph_test_admin_socket_output --all times out
- This is not a problem with the test (although it highlights a deficiency with error reporting which I'll submit a PR ...
- 02:32 AM Bug #22882 (In Progress): Objecter deadlocked on op budget while holding rwlock in ms_handle_reset()
- I finally realized that the op throttler *does* drop the global rwlock while waiting for throttle, so it at least doe...
02/09/2018
- 10:08 PM Bug #22847: ceph osd force-create-pg cause all ceph-mon to crash and unable to come up again
- Just FYI, using this new patch the leader ceph-mon will hang once it is up and any kind of OSD command is run, like...
- 10:06 PM Bug #22952: Monitor stopped responding after awhile
- Frank Li wrote:
> Frank Li wrote:
> > I reproduced the issue in a separate cluster, it seems that whichever ceph-mo...
- 08:40 PM Bug #22952: Monitor stopped responding after awhile
- Frank Li wrote:
> I reproduced the issue in a separate cluster, it seems that whichever ceph-mon became the leader w...
- 08:35 PM Bug #22952: Monitor stopped responding after awhile
- I reproduced the issue in a separate cluster; it seems that whichever ceph-mon became the leader will be stuck, as I ...
- 07:50 PM Feature #22973 (Duplicate): log lines when hitting "pg overdose protection"
- You're right that it's bad! This will be fixed in the next luminous release after a belated backport finally happened...
- 02:15 PM Feature #22973 (Duplicate): log lines when hitting "pg overdose protection"
- After upgrading to Luminous we ran into a situation where 10% of our pgs remained unavailable, stuck in "activating" st...
- 04:24 PM Bug #22300 (Rejected): ceph osd reweightn command seems to change weight value
- The parameter of reweightn is an array of fixed-point integers, each computed as int(weight * 0x10000), where weig...
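A worked instance of that conversion (the osd id and weight are chosen for illustration):
    # weight 0.5 -> int(0.5 * 0x10000) = 0x8000 = 32768
    ceph osd reweightn '{"0": 32768}'   # sets osd.0's reweight to 0.5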
- 02:20 PM Feature #22974 (Resolved): documentation - pg state table missing "activating" state
- "activating" is not listed in the pg state table:
http://docs.ceph.com/docs/master/rados/operations/pg-states/
...
- 06:41 AM Bug #22949: ceph_test_admin_socket_output --all times out
- Sure mate, added a patch to get better debugging and will test as soon as it's built.
- 12:24 AM Bug #22882: Objecter deadlocked on op budget while holding rwlock in ms_handle_reset()
- Oh, and I had the LingerOp and Op conflated in my head a bit when looking at that before, but they are different.
...
- 12:03 AM Bug #22882: Objecter deadlocked on op budget while holding rwlock in ms_handle_reset()
- Jason, how did you establish the number of in-flight ops? I wonder if maybe it *did* have them but they weren't able ...
- 12:02 AM Bug #22882: Objecter deadlocked on op budget while holding rwlock in ms_handle_reset()
- Okay, so presumably on resend you shouldn't need to grab op budget again, since it's already budgeted, right?
And ...
02/08/2018
- 02:37 PM Bug #22949: ceph_test_admin_socket_output --all times out
- Brad, I am not able to reproduce this issue. Could you help take a look?
- 02:25 AM Bug #20086 (Resolved): LibRadosLockECPP.LockSharedDurPP gets EEXIST
- 02:24 AM Bug #22440 (Resolved): New pgs per osd hard limit can cause peering issues on existing clusters
- @Nick, if you think this issue deserves a different fix, please feel free to reopen this ticket
- 12:51 AM Bug #22848: Pull the cable, 5 mins later, put the cable back; pg stuck a long time until resta...
- Hi Josh Durgin,
1. They are both fibre-optic cables in our network card.
2. Log files can't be found yet, due to at...
02/07/2018
- 11:09 PM Bug #22220: osd/ReplicatedPG.h:1667:14: internal compiler error: in force_type_die, at dwarf2out....
- Fixed by gcc-7.3.1-2.fc26 gcc-7.3.1-2.fc27 in fc27
- 10:49 PM Bug #22440: New pgs per osd hard limit can cause peering issues on existing clusters
- Kefu Chai wrote:
> https://github.com/ceph/ceph/pull/20204
merged
- 09:44 PM Bug #22848: Pull the cable, 5 mins later, put the cable back; pg stuck a long time until resta...
- Which cable are you pulling? Do you have logs from the monitors and osds? The default failure detection timeouts can ...
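For reference, the failure-detection knobs alluded to, with what I believe are the default values (a sketch; verify against your release):
    [osd]
    osd heartbeat interval = 6   # seconds between peer heartbeats (default)
    osd heartbeat grace = 20     # seconds without a reply before a peer is reported down (default)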
- 09:40 PM Bug #22916 (Duplicate): OSD crashing in peering
- 09:40 PM Bug #21287 (Duplicate): 1 PG down, OSD fails with "FAILED assert(i->prior_version == last || i->i...
- 03:52 AM Bug #21287: 1 PG down, OSD fails with "FAILED assert(i->prior_version == last || i->is_error())"
- see https://github.com/ceph/ceph/pull/16675
- 02:37 AM Bug #21287: 1 PG down, OSD fails with "FAILED assert(i->prior_version == last || i->is_error())"
- We hit this bug too in an EC 2+1 pool; I found that one peer did not receive one piece of an op message sent from the primary osd, ...
- 06:12 PM Bug #22952: Monitor stopped responding after awhile
- here is where the first mon server is stuck; running mon_status hangs:
[root@dl1-kaf101 frli]# ceph --admin-daemon /v...
- 06:06 PM Bug #22952 (Duplicate): Monitor stopped responding after awhile
- After a crash of ceph-mon in 12.2.2 and using a private build provided by ceph developers, the ceph-mon would come up...
- 06:06 PM Bug #22847: ceph osd force-create-pg cause all ceph-mon to crash and unable to come up again
- https://tracker.ceph.com/issues/22952
ticket opened for the ceph-mon not responding issue.
- 06:02 PM Bug #22847: ceph osd force-create-pg cause all ceph-mon to crash and unable to come up again
- I'll open a separate ticket to track the monitor not responding issue. The fix for the force-create-pg issue is good.
- 06:01 PM Bug #22847: ceph osd force-create-pg cause all ceph-mon to crash and unable to come up again
- Kefu Chai wrote:
> [...]
>
>
> the cluster formed a quorum of [0,1,2,3,4] since 18:02:21. and it was not in pro...
- 05:58 PM Bug #22847: ceph osd force-create-pg cause all ceph-mon to crash and unable to come up again
- Kefu Chai wrote:
> [...]
>
> was any osd up when you were testing?
Yes, but they were in Booting State, all of... - 06:56 AM Bug #22847: ceph osd force-create-pg cause all ceph-mon to crash and unable to come up again
- ...
- 06:12 AM Bug #22847: ceph osd force-create-pg cause all ceph-mon to crash and unable to come up again
- ...
- 04:05 PM Bug #22746 (Resolved): osd/common: ceph-osd process is terminated by the logratote task
- 03:33 PM Bug #22949 (Rejected): ceph_test_admin_socket_output --all times out
- http://pulpito.ceph.com/kchai-2018-02-07_01:22:25-rados-wip-kefu-testing-2018-02-06-1514-distro-basic-mira/2161301/
- 05:50 AM Backport #22942 (Resolved): luminous: ceph osd force-create-pg cause all ceph-mon to crash and un...
- https://github.com/ceph/ceph/pull/20399
- 05:01 AM Backport #22934 (Resolved): luminous: filestore journal replay does not guard omap operations
- https://github.com/ceph/ceph/pull/21547
- 12:54 AM Backport #22866 (In Progress): jewel: ceph osd df json output validation reported invalid numbers...
- https://github.com/ceph/ceph/pull/20344
02/06/2018
- 08:01 PM Bug #22350 (Resolved): nearfull OSD count in 'ceph -w'
- 07:49 PM Bug #22847: ceph osd force-create-pg cause all ceph-mon to crash and unable to come up again
- So is there anything I can do to help recover the cluster?
- 06:50 AM Bug #22847 (Pending Backport): ceph osd force-create-pg cause all ceph-mon to crash and unable to...
- 01:23 AM Bug #22847: ceph osd force-create-pg cause all ceph-mon to crash and unable to come up again
- Please see the attached logs for when the monitor was started and then later got into the stuck mode.
I just replaced t...
- 04:54 PM Bug #22920: filestore journal replay does not guard omap operations
- lowering the priority since in practice we don't clone objects with omap on them.
- 04:53 PM Bug #22920 (Pending Backport): filestore journal replay does not guard omap operations
- 04:07 PM Bug #22656: scrub mismatch on bytes (cache pools)
- aah, just popped up on luminous: http://pulpito.ceph.com/yuriw-2018-02-05_23:07:16-rados-wip-yuri-testing-2018-02-05-...
- 02:24 PM Bug #20924: osd: leaked Session on osd.7
- /a/yuriw-2018-02-02_20:31:37-rados-wip_yuri_master_2.2.18-distro-basic-smithi/2143177