Activity
From 09/04/2019 to 10/03/2019
10/03/2019
- 11:45 PM Backport #38277: mimic: osd_map_message_max default is too high?
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/29242
merged
- 11:39 PM Backport #38852: mimic: .mgrstat failed to decode mgrstat state; luminous dev version?
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/29249
merged
- 11:37 PM Backport #38437: mimic: crc cache should be invalidated when posting preallocated rx buffers
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/29247
merged
- 11:37 PM Backport #40884: mimic: ceph mgr module ls -f plain crashes mon
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/29593
merged
- 11:36 PM Backport #40949: mimic: Better default value for osd_snap_trim_sleep
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/29732
merged
- 11:35 PM Backport #38450: mimic: src/osd/OSDMap.h: 1065: FAILED assert(__null != pool)
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/29976
merged
- 11:34 PM Backport #41595: mimic: ceph-objectstore-tool can't remove head with bad snapset
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/30081
merged
- 11:34 PM Backport #40083: mimic: osd: Better error message when OSD count is less than osd_pool_default_size
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/30180
merged
- 11:32 PM Backport #40732: mimic: mon: auth mon isn't loading full KeyServerData after restart
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/30181
merged
- 11:32 PM Backport #41291: mimic: filestore pre-split may not split enough directories
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/30182
merged
- 11:31 PM Backport #41351: mimic: hidden corei7 requirement in binary packages
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/30183
merged
- 11:31 PM Backport #41490: mimic: OSDCap.PoolClassRNS test aborts
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/30214
merged
- 11:30 PM Backport #41502: mimic: Warning about past_interval bounds on deleting pg
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/30222
merged
- 11:27 PM Backport #40464: mimic: osd beacon sometimes has empty pg list
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/29253
merged
- 11:26 PM Backport #38351: mimic: Limit loops waiting for force-backfill/force-recovery to happen
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/29245
merged
- 11:24 PM Backport #38856: mimic: should set EPOLLET flag on del_event()
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/29250
merged
- 11:23 PM Backport #40179: mimic: qa/standalone/scrub/osd-scrub-snaps.sh sometimes fails
- David Zafman wrote:
> https://github.com/ceph/ceph/pull/29251
merged
- 08:57 PM Bug #42102: use-after-free in Objecter timer handing
- I will note that the test has to run for several minutes before the ASAN warning pops. ASAN does slow things down, bu...
- 08:39 PM Bug #42114 (Fix Under Review): mon: /var/lib/ceph/mon/* data (esp rocksdb) is not 0600
- 07:52 PM Backport #41534: nautilus: valgrind: UninitCondition in ceph::crypto::onwire::AES128GCM_OnWireRxH...
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/29928
merged
- 07:51 PM Backport #41703: nautilus: oi(object_info_t).size does not match on disk size
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/30278
merged
- 07:50 PM Backport #41963: nautilus: Segmentation fault in rados ls when using --pgid and --pool/-p togethe...
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/30605
merged
- 07:49 PM Backport #41960: nautilus: tools/rados: add --pgid in help
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/30607
merged
- 06:12 PM Bug #42178 (Duplicate): scrub errors due to missing objects
- ...
- 06:07 PM Bug #42176 (Duplicate): FAILED ceph_assert(obc) in PrimaryLogPG::recover_backfill()
- 05:40 PM Bug #42176 (Duplicate): FAILED ceph_assert(obc) in PrimaryLogPG::recover_backfill()
- ...
- 06:01 PM Bug #42177 (Fix Under Review): osd/PrimaryLogPG.cc: 13068: FAILED ceph_assert(obc)
- 05:58 PM Bug #42177 (Resolved): osd/PrimaryLogPG.cc: 13068: FAILED ceph_assert(obc)
- First the object is deleted,...
- 05:34 PM Bug #42175 (Can't reproduce): _txc_add_transaction error (2) No such file or directory not handl...
- ...
- 05:12 PM Bug #38219: rebuild-mondb hangs
- Just a quick note as this might be relevant for the decision whether or not to integrate this PR:
Running mimic 13...
- 04:36 PM Support #42174 (Closed): Ceph Nautilus OSD isn't able to add to cluster
- Ceph cluster on Debian, Nautilus version; the issue is that any time I try creating the data store, the OSDs don't get added ...
- 04:18 PM Bug #36631: potential deadlock in PG::_scan_snaps when repairing snap mapper
- jewel is EOL - @Mykola, does any of this apply to luminous?
- 04:01 PM Bug #42173 (Closed): _pinned_map closest pinned map ver 252615 not available! error: (2) No such ...
- -4> 2019-10-03 17:58:44.023 7fde2e2f9700 5 mon.km-fsn-1-dc4-m1-797678@0(leader).paxos(paxos active c 4545611..45463...
- 01:09 PM Backport #42168 (In Progress): nautilus: readable.sh test fails
- 11:18 AM Backport #42168 (Resolved): nautilus: readable.sh test fails
- https://github.com/ceph/ceph/pull/30704
- 11:19 AM Bug #41424 (Pending Backport): readable.sh test fails
- 07:13 AM Feature #41905: Add ability to change fsid of cluster
- Splitting the cluster meant no data copy from A to B. Minimal downtime for the RGW application and no downtime for th...
10/02/2019
- 10:02 PM Feature #41905: Add ability to change fsid of cluster
- This sounds to me like the kind of thing we don't want to support directly. What's the use case for splitting a clust...
- 09:09 PM Bug #42060 (Need More Info): Slow ops seen when one ceph private interface is shut down
- What workload are you running; does it have its own metrics? Is there evidence that Nautilus is slower or behaving wo...
- 09:04 PM Bug #42113: ceph -h usage should indicate CephChoices --name= is sometime required
- No failures so this is normal priority?
- 04:43 PM Bug #41754: Use dump_stream() instead of dump_float() for floats where max precision isn't helpful
From json.org:
JSON (JavaScript Object Notation) is a lightweight data-interchange format. *It is easy for human...
- 01:20 PM Bug #20924 (Resolved): osd: leaked Session on osd.7
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 01:19 PM Feature #37935 (Resolved): Add clear-data-digest command to objectstore tool
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 01:13 PM Documentation #41004 (Resolved): doc: pg_num should always be a power of two
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 01:12 PM Documentation #41403 (Resolved): doc: mon_health_to_clog_* values flipped
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 01:11 PM Backport #42154 (Resolved): mimic: Removed OSDs with outstanding peer failure reports crash the m...
- https://github.com/ceph/ceph/pull/30903
- 01:11 PM Backport #42153 (Resolved): luminous: Removed OSDs with outstanding peer failure reports crash th...
- https://github.com/ceph/ceph/pull/30905
- 01:11 PM Backport #42152 (Resolved): nautilus: Removed OSDs with outstanding peer failure reports crash th...
- https://github.com/ceph/ceph/pull/30904
- 01:09 PM Backport #42141 (Resolved): nautilus: asynchronous recovery can not function under certain circum...
- https://github.com/ceph/ceph/pull/31077
- 01:09 PM Backport #42138 (Resolved): luminous: Remove unused full and nearful output from OSDMap summary
- https://github.com/ceph/ceph/pull/30902
- 01:09 PM Backport #42137 (Resolved): mimic: Remove unused full and nearful output from OSDMap summary
- https://github.com/ceph/ceph/pull/30901
- 01:09 PM Backport #42136 (Resolved): nautilus: Remove unused full and nearful output from OSDMap summary
- https://github.com/ceph/ceph/pull/30900
- 01:08 PM Backport #42128 (Resolved): mimic: mgr/balancer FAILED ceph_assert(osd_weight.count(i.first))
- https://github.com/ceph/ceph/pull/30898
- 01:08 PM Backport #42127 (Resolved): luminous: mgr/balancer FAILED ceph_assert(osd_weight.count(i.first))
- https://github.com/ceph/ceph/pull/30926
- 01:07 PM Backport #42126 (Resolved): nautilus: mgr/balancer FAILED ceph_assert(osd_weight.count(i.first))
- https://github.com/ceph/ceph/pull/30899
- 01:07 PM Backport #42125 (Resolved): nautilus: weird daemon key seen in health alert
- https://github.com/ceph/ceph/pull/31039
- 12:13 PM Backport #24360 (Resolved): luminous: osd: leaked Session on osd.7
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/29859
m...
- 12:08 AM Backport #24360: luminous: osd: leaked Session on osd.7
- Samuel Just wrote:
> https://github.com/ceph/ceph/pull/29859
merged - 12:12 PM Backport #38436 (Resolved): luminous: crc cache should be invalidated when posting preallocated r...
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/29248
m...
- 12:10 PM Backport #41568 (Resolved): nautilus: doc: pg_num should always be a power of two
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/30004
m...
- 12:09 PM Backport #41529 (Resolved): nautilus: doc: mon_health_to_clog_* values flipped
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/30003
m...
- 11:19 AM Backport #42120 (In Progress): nautilus: pg_autoscaler should show a warning if pg_num isn't a po...
- 11:09 AM Backport #42120 (Resolved): nautilus: pg_autoscaler should show a warning if pg_num isn't a power...
- https://github.com/ceph/ceph/pull/30689
- 10:54 AM Bug #42102: use-after-free in Objecter timer handing
- Su Yue wrote:
> Jeff Layton wrote:
> > While hunting a crash in tracker #42026, I ran across this bug when testing ...
- 06:24 AM Bug #42102: use-after-free in Objecter timer handing
- Su Yue wrote:
> Jeff Layton wrote:
> > While hunting a crash in tracker #42026, I ran across this bug when testing ...
- 03:47 AM Bug #42102: use-after-free in Objecter timer handing
- Jeff Layton wrote:
> While hunting a crash in tracker #42026, I ran across this bug when testing with ASAN:
>
> [...
- 10:18 AM Backport #41921: nautilus: OSDMonitor: missing `pool_id` field in `osd pool ls` command
- duplicate PR https://github.com/ceph/ceph/pull/30568 was closed
- 03:32 AM Feature #41359 (In Progress): Adding Placement Group id in Large omap log message
- 02:37 AM Bug #42115 (Resolved): Turn off repair pg state when leaving recovery
We set the repair pg state during recovery initiated by repair. To handle all cases we need to clear it when trans...
10/01/2019
- 10:58 PM Backport #38436: luminous: crc cache should be invalidated when posting preallocated rx buffers
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/29248
merged - 10:19 PM Bug #42114 (Resolved): mon: /var/lib/ceph/mon/* data (esp rocksdb) is not 0600
- By default, I see...
- 09:25 PM Bug #42113 (Fix Under Review): ceph -h usage should indicate CephChoices --name= is sometime requ...
- ...
- 07:26 PM Bug #42111 (Resolved): max_size from crushmap ignored when increasing size on pool
- Hello,
when the crushmap-rule has "max_size=2" for example, and you set size=3 on the pool, all I/O stops withou...
- 02:05 PM Backport #41922: mimic: OSDMonitor: missing `pool_id` field in `osd pool ls` command
- https://github.com/ceph/ceph/pull/30547 was closed because identical PR https://github.com/ceph/ceph/pull/30485 was o...
- 11:28 AM Bug #42102: use-after-free in Objecter timer handing
- Found by running LibRadosMisc.ShutdownRace test built with -DWITH_ASAN=ON. I had to set:...
- 10:31 AM Bug #42102 (Can't reproduce): use-after-free in Objecter timer handing
- While hunting a crash in tracker #42026, I ran across this bug when testing with ASAN:...
- 01:08 AM Feature #40419 (Resolved): [RFE] Estimated remaining time on recovery?
09/30/2019
- 12:28 PM Backport #42095 (In Progress): nautilus: global osd crash in DynamicPerfStats::add_to_reports
- 12:28 PM Backport #42095 (Resolved): nautilus: global osd crash in DynamicPerfStats::add_to_reports
- https://github.com/ceph/ceph/pull/30648
- 05:46 AM Backport #41958 (In Progress): nautilus: scrub errors after quick split/merge cycle
- https://github.com/ceph/ceph/pull/30643
09/29/2019
- 09:58 PM Bug #42082 (Duplicate): pybind/rados: set_omap() crash on py3
- 10:17 AM Bug #42079 (Pending Backport): weird daemon key seen in health alert
- 09:40 AM Bug #42079: weird daemon key seen in health alert
- an alternative fix: https://github.com/ceph/ceph/pull/30635
09/28/2019
- 02:25 PM Bug #41748: log [ERR] : 7.19 caller_ops.size 62 > log size 61
- I suggest putting a call to log_weirdness() in the Reset state entry point, so we can tell if the problem came from t...
- 08:01 AM Bug #41891 (Pending Backport): global osd crash in DynamicPerfStats::add_to_reports
09/27/2019
- 05:08 PM Feature #41647 (Pending Backport): pg_autoscaler should show a warning if pg_num isn't a power of...
- 03:55 PM Bug #42015 (Pending Backport): Remove unused full and nearful output from OSDMap summary
- 07:02 AM Bug #42015 (Resolved): Remove unused full and nearful output from OSDMap summary
- 03:27 PM Bug #42084 (New): df output difference if 8 OSD cluster has 5+3 shared EC pool vs larger cluster
I created an 8 OSD cluster with 1 EC pool 5+3 and this ceph df detail output....
- 02:42 PM Bug #42082 (Resolved): pybind/rados: set_omap() crash on py3
- Details see https://github.com/ceph/ceph/pull/30483#issuecomment-535873920
- 12:47 PM Bug #36572 (Closed): ceph-in: --connect-timeout doesn't work while pinging mon
- 12:30 PM Bug #42079 (Resolved): weird daemon key seen in health alert
- e.g.:
19 slow ops, oldest one blocked for 34 sec, daemons [osd,2,osd,4] have slow ops.
- 05:19 AM Bug #41680 (Pending Backport): Removed OSDs with outstanding peer failure reports crash the monitor
- 02:40 AM Bug #42052 (Pending Backport): mgr/balancer FAILED ceph_assert(osd_weight.count(i.first))
- 02:39 AM Bug #41924 (Pending Backport): asynchronous recovery can not function under certain circumstances
- 01:36 AM Bug #42058 (In Progress): OSD reconnected across map epochs, inconsistent pg logs created
- 12:27 AM Backport #41845 (In Progress): luminous: tools/rados: allow list objects in a specific pg in a pool
- 12:26 AM Backport #41959 (In Progress): luminous: tools/rados: add --pgid in help
- 12:26 AM Backport #41962 (In Progress): luminous: Segmentation fault in rados ls when using --pgid and --p...
- https://github.com/ceph/ceph/pull/30608
09/26/2019
- 11:04 PM Backport #41960 (In Progress): nautilus: tools/rados: add --pgid in help
- 10:53 PM Backport #41963 (In Progress): nautilus: Segmentation fault in rados ls when using --pgid and --p...
- 10:48 AM Bug #42060: Slow ops seen when one ceph private interface is shut down
- Hi,
When I mention the private network, I am referring to the cluster_network.
- Environment -
5 node Nautilus cluster
67 OSDs per node - 4TB HDD per OSD
We are trying a use case where we shut...
- 08:53 AM Bug #42058 (Duplicate): OSD reconnected across map epochs, inconsistent pg logs created
- Get the lossless cluster connection between osd.2 and osd.47 for example.
When osd.47 is restarted and at the same...
- 08:37 AM Bug #40035: smoke.sh failing in jenkins "make check" test randomly
- In addition to what Laura reported, it must be said that this failure is seen in jenkins job only
when running the j...
- 08:26 AM Bug #40035: smoke.sh failing in jenkins "make check" test randomly
- Kefu Chai wrote:
> [...]
>
> see https://jenkins.ceph.com/job/ceph-pull-requests/817/console
>
> i tried to re...
- 03:21 AM Bug #41743: Long heartbeat ping times on front interface seen, longest is 2237.999 msec (OSD_SLOW...
This shows the send on osd.0 and receive at osd.6. ...
- 02:52 AM Bug #41743: Long heartbeat ping times on front interface seen, longest is 2237.999 msec (OSD_SLOW...
- This shows the front and back interface. I don't know which is which, but it already sent the second interface maybe...
- 02:32 AM Bug #41743: Long heartbeat ping times on front interface seen, longest is 2237.999 msec (OSD_SLOW...
I confused the front and back interface with a retransmit. The ports are the 2 interfaces.
-At the ping receivi...
09/25/2019
- 11:41 PM Bug #41924 (Fix Under Review): asynchronous recovery can not function under certain circumstances
- 09:27 PM Bug #41924: asynchronous recovery can not function under certain circumstances
- 09:46 PM Bug #41874 (Resolved): mon-osdmap-prune.sh fails
- 09:45 PM Bug #41873 (Resolved): test-erasure-code.sh fails
- 09:28 PM Bug #41939 (Need More Info): Scaling with unfound options might leave PGs in state "unknown"
- 09:28 PM Bug #41939: Scaling with unfound options might leave PGs in state "unknown"
- How are we ending up in this state? What were the previous states of those PGs?
- 09:24 PM Bug #41943 (Need More Info): ceph-mgr fails to report OSD status correctly
- Do you have any other information from that OSD while this happened?
- 09:22 PM Bug #41943: ceph-mgr fails to report OSD status correctly
- Sounds like this OSD was somehow up enough that it responded to peer heartbeats, but was not processing any client re...
- 09:03 PM Bug #41908 (Resolved): TMAPUP operation results in OSD assertion failure
- 12:11 PM Bug #42052 (Resolved): mgr/balancer FAILED ceph_assert(osd_weight.count(i.first))
- > OSDMap.cc: 4603: FAILED ceph_assert(osd_weight.count(i.first))
>
> ceph version v15.0.0-5429-gac828d7 (ac828d732...
- 10:50 AM Bug #41866 (Fix Under Review): OSD cannot report slow operation warnings in time.
- 10:49 AM Bug #41866: OSD cannot report slow operation warnings in time.
- 08:26 AM Backport #41921 (In Progress): nautilus: OSDMonitor: missing `pool_id` field in `osd pool ls` com...
- https://github.com/ceph/ceph/pull/30568
09/24/2019
- 09:50 PM Bug #38724: _txc_add_transaction error (39) Directory not empty not handled on operation 21 (op 1...
- Bumping priority based on community feedback.
- 07:53 PM Backport #42037 (Resolved): luminous: Enable auto-scaler and get src/osd/PeeringState.cc:3671: fa...
- https://github.com/ceph/ceph/pull/30896
- 07:52 PM Backport #42036 (Resolved): mimic: Enable auto-scaler and get src/osd/PeeringState.cc:3671: faile...
- https://github.com/ceph/ceph/pull/30895
- 04:11 PM Bug #41946: cbt perf test fails due to leftover in /home/ubuntu/cephtest
- the log files were created by cosbench. see https://github.com/intel-cloud/cosbench/blob/ca68b333e85c51829ea68f203877...
- 12:19 PM Backport #41922 (In Progress): mimic: OSDMonitor: missing `pool_id` field in `osd pool ls` command
- https://github.com/ceph/ceph/pull/30547
- 12:15 PM Backport #41917 (In Progress): nautilus: osd: failure result of do_osd_ops not logged in prepare_...
- https://github.com/ceph/ceph/pull/30546
09/23/2019
- 09:33 PM Bug #42015 (In Progress): Remove unused full and nearful output from OSDMap summary
- 09:27 PM Bug #42015 (Resolved): Remove unused full and nearful output from OSDMap summary
in OSDMap::print_oneline_summary() and OSDMap::print_summary() (CEPH_OSDMAP_FULL and CEPH_OSDMAP_NEARFULL checks)
- 08:41 PM Backport #42014 (In Progress): nautilus: Enable auto-scaler and get src/osd/PeeringState.cc:3671:...
- 08:35 PM Backport #42014 (Resolved): nautilus: Enable auto-scaler and get src/osd/PeeringState.cc:3671: fa...
- https://github.com/ceph/ceph/pull/30528
- 07:42 PM Feature #41647 (Fix Under Review): pg_autoscaler should show a warning if pg_num isn't a power of...
- 07:20 PM Bug #42012: mon osd_snap keys grow unbounded
- This is (mostly) fixed in master by https://github.com/ceph/ceph/pull/30518. There is still one set of per-epoch key...
- 03:41 PM Bug #42012: mon osd_snap keys grow unbounded
- Link to the full "dump-keys | grep osd_snap"
https://wustl.box.com/s/3r7bgv32hs5hw4jmgmywbo9qvqrqsmwn
- 03:26 PM Bug #42012 (Resolved): mon osd_snap keys grow unbounded
- ...
- 07:19 PM Bug #41680: Removed OSDs with outstanding peer failure reports crash the monitor
- 05:09 PM Bug #41944: inconsistent pool count in ceph -s output
- Is this after pools are deleted? In that case, it's #40011
- 04:27 PM Backport #41864 (In Progress): luminous: Mimic MONs have slow/long running ops
- 02:27 PM Bug #37875: osdmaps aren't being cleaned up automatically on healthy cluster
- Still ongoing here, with mimic too. On one 13.2.6 cluster we have this, for example:...
- 02:12 PM Bug #41816 (Pending Backport): Enable auto-scaler and get src/osd/PeeringState.cc:3671: failed as...
- 09:02 AM Backport #41964 (Resolved): mimic: Segmentation fault in rados ls when using --pgid and --pool/-p...
- https://github.com/ceph/ceph/pull/30893
- 09:02 AM Backport #41963 (Resolved): nautilus: Segmentation fault in rados ls when using --pgid and --pool...
- https://github.com/ceph/ceph/pull/30605
- 09:02 AM Backport #41962 (Resolved): luminous: Segmentation fault in rados ls when using --pgid and --pool...
- 09:02 AM Backport #41961 (Resolved): mimic: tools/rados: add --pgid in help
- https://github.com/ceph/ceph/pull/30893
- 09:02 AM Backport #41960 (Resolved): nautilus: tools/rados: add --pgid in help
- https://github.com/ceph/ceph/pull/30607
- 09:02 AM Backport #41959 (Resolved): luminous: tools/rados: add --pgid in help
- https://github.com/ceph/ceph/pull/30608
- 09:02 AM Backport #41958 (Resolved): nautilus: scrub errors after quick split/merge cycle
- https://github.com/ceph/ceph/pull/30643
09/22/2019
- 10:12 PM Cleanup #41876 (Pending Backport): tools/rados: add --pgid in help
- 11:55 AM Bug #41950 (Can't reproduce): crimson compile
- Can I ask which version of the Seastar code crimson-old uses in the ceph-15 version?
When compiling, it outputs the following option:
<...
- 04:12 AM Bug #41936 (Pending Backport): scrub errors after quick split/merge cycle
- 03:45 AM Bug #41946: cbt perf test fails due to leftover in /home/ubuntu/cephtest
- ...
- 02:09 AM Bug #41946 (Duplicate): cbt perf test fails due to leftover in /home/ubuntu/cephtest
- ...
- 03:42 AM Bug #41875 (Pending Backport): Segmentation fault in rados ls when using --pgid and --pool/-p tog...
09/20/2019
- 09:01 PM Bug #41156 (Rejected): dump_float() poor output
- 08:47 PM Bug #41817 (Closed): qa/standalone/scrub/osd-recovery-scrub.sh timed out waiting for scrub
- 07:17 PM Bug #41913 (Fix Under Review): With auto scaler operating stopping an OSD can lead to COT crashin...
- The real bug here is that the pgid split, so the pgid specified to COT is wrong. The attached PR adds a check in COT ...
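A minimal sketch (hypothetical helper, not the actual COT code) of why a pgid recorded before a split goes stale, assuming the common case where pg_num doubles: PG pool.s splits into pool.s and pool.(s + old_pg_num), so objects may now live in a child PG other than the one named on the command line.
<pre>
# Hypothetical illustration of PG splitting in the doubling case.
def children_after_doubling(pool: int, seed: int, old_pg_num: int):
    """PGs that <pool>.<seed> splits into when pg_num doubles."""
    return [f"{pool}.{seed:x}", f"{pool}.{seed + old_pg_num:x}"]

# e.g. with pg_num going 8 -> 16, PG 1.3 splits into 1.3 and 1.b,
# so a COT invocation still naming 1.3 may miss half the objects.
print(children_after_doubling(1, 0x3, 8))  # ['1.3', '1.b']
</pre>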
- 06:22 PM Bug #41944 (Resolved): inconsistent pool count in ceph -s output
- ...
- 06:08 PM Bug #41816 (Fix Under Review): Enable auto-scaler and get src/osd/PeeringState.cc:3671: failed as...
- 05:36 PM Bug #41816: Enable auto-scaler and get src/osd/PeeringState.cc:3671: failed assert info.last_comp...
- The complete_to pointer is already at log end before recover_got() is called. I think it's because during split() we ...
- 04:35 PM Bug #41943 (Closed): ceph-mgr fails to report OSD status correctly
- After an inexplicable cluster event that resulted in around 10% of our OSDs falsely reported down (and shortly after ...
- 12:47 PM Bug #41834: qa: EC Pool configuration and slow op warnings for OSDs caused by recent master changes
- Might as well add some RBD failures while piling on:
http://pulpito.ceph.com/trociny-2019-09-19_12:41:57-rbd-wip-m...
- 02:13 AM Bug #41939 (Need More Info): Scaling with unfound options might leave PGs in state "unknown"
With osd_pool_default_pg_autoscale_mode="on"
../qa/run-standalone.sh TEST_rep_recovery_unfound
The test failu...
- 01:59 AM Backport #41863 (In Progress): mimic: Mimic MONs have slow/long running ops
- https://github.com/ceph/ceph/pull/30481
- 01:57 AM Backport #41862 (In Progress): nautilus: Mimic MONs have slow/long running ops
- https://github.com/ceph/ceph/pull/30480
09/19/2019
- 11:12 PM Bug #41817: qa/standalone/scrub/osd-recovery-scrub.sh timed out waiting for scrub
- The fix for this particular issue is to just disable the auto scaler, because it just causes a hang in the test but no cr...
- 10:59 PM Bug #41923: 3 different ceph-osd asserts caused by enabling auto-scaler
I think this stack better reflects the thread that hit the suicide timeout. However, every time I've seen this thre...
- 09:41 PM Bug #41923: 3 different ceph-osd asserts caused by enabling auto-scaler
Look at the assert(op.hinfo); it is caused by the corruption injected by the test. I'll verify that the asserts are...
- 12:05 AM Bug #41923 (Can't reproduce): 3 different ceph-osd asserts caused by enabling auto-scaler
Change config osd_pool_default_pg_autoscale_mode to "on"
Saw these 4 core dumps on 3 different sub-tests.
../...- 04:51 PM Bug #41936 (Fix Under Review): scrub errors after quick split/merge cycle
- 04:51 PM Bug #41936 (Resolved): scrub errors after quick split/merge cycle
- PGs split and then merge soon after. There is a pg stat scrub mismatch.
- 04:48 PM Bug #41834: qa: EC Pool configuration and slow op warnings for OSDs caused by recent master changes
- This shows up in rgw's ec pool tests also. In osd logs, I see slow ops on MOSDECSubOpRead/Reply messages, and they al...
- 09:32 AM Feature #41647: pg_autoscaler should show a warning if pg_num isn't a power of two
- Note: contrary to what the bug description says, pg_autoscaler will (apparently) *not* be automatically turned on wit...
- 01:56 AM Bug #41924 (Resolved): asynchronous recovery can not function under certain circumstances
- guoracle reports that:
> In the asynchronous recovery feature,
> the asynchronous recovery target OSD is selected ... - 01:39 AM Bug #41866: OSD cannot report slow operation warnings in time.
- *report_callback* thread is also blocked on PG::lock with MGRClient::lock locked while getting the pg stats. This in ...
- 12:54 AM Bug #41816: Enable auto-scaler and get src/osd/PeeringState.cc:3671: failed assert info.last_comp...
This can be reproduced by setting config osd_pool_default_pg_autoscale_mode="on" and executing this test:
../qa/...
- 12:29 AM Bug #41754: Use dump_stream() instead of dump_float() for floats where max precision isn't helpful
I was suspicious that the trailing 0999999994 in the elapsed time is noise. Could this be caused by a float being...
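That suspicion is easy to check with a minimal Python sketch (not Ceph code): maximum-precision formatting of a binary double produces exactly this kind of trailing noise, while rounded/stream-style output does not.
<pre>
import json

# Decimal values like 34.1 or 0.95 are not exactly representable as
# binary doubles, so printing them at maximum precision exposes noise.
elapsed = 34.1
full_ratio = 0.95

print(repr(elapsed))             # 34.1 (shortest round-trip form)
print(f"{elapsed:.17g}")         # 34.100000000000001
print(f"{full_ratio:.17g}")      # 0.94999999999999996

# dump_stream()-style rounded output avoids the noise entirely:
print(json.dumps({"elapsed": round(elapsed, 6)}))   # {"elapsed": 34.1}
</pre>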
09/18/2019
- 06:33 PM Backport #41922 (Resolved): mimic: OSDMonitor: missing `pool_id` field in `osd pool ls` command
- https://github.com/ceph/ceph/pull/30485
- 06:33 PM Backport #41921 (Resolved): nautilus: OSDMonitor: missing `pool_id` field in `osd pool ls` command
- https://github.com/ceph/ceph/pull/30486
- 06:31 PM Backport #41920 (Resolved): nautilus: osd: scrub error on big objects; make bluestore refuse to s...
- https://github.com/ceph/ceph/pull/30783
- 06:31 PM Backport #41919 (Resolved): luminous: osd: scrub error on big objects; make bluestore refuse to s...
- https://github.com/ceph/ceph/pull/30785
- 06:31 PM Backport #41918 (Resolved): mimic: osd: scrub error on big objects; make bluestore refuse to star...
- https://github.com/ceph/ceph/pull/30784
- 06:31 PM Backport #41917 (Resolved): nautilus: osd: failure result of do_osd_ops not logged in prepare_tra...
- https://github.com/ceph/ceph/pull/30546
- 04:25 PM Bug #41900 (Resolved): auto-scaler breaks many standalone tests
- 03:38 PM Bug #41913 (Resolved): With auto scaler operating stopping an OSD can lead to COT crashing instea...
- ...
- 03:03 PM Bug #41891: global osd crash in DynamicPerfStats::add_to_reports
- Answering myself - seems that rbd_support cannot be disabled anyway
# ceph mgr module disable rbd_support
Error E...
- 10:59 AM Bug #41891: global osd crash in DynamicPerfStats::add_to_reports
- I don't believe this command was running at that time, however "rbd_support" mgr module was active. Could this be the...
- 10:53 AM Bug #41891: global osd crash in DynamicPerfStats::add_to_reports
- Marcin, I believe I know the cause and I am now discussing the fix [1]. A workaround could be not to use "rbd perf im...
- 10:13 AM Bug #41891 (Fix Under Review): global osd crash in DynamicPerfStats::add_to_reports
- 06:24 AM Bug #41891 (In Progress): global osd crash in DynamicPerfStats::add_to_reports
- 01:55 PM Bug #41908 (Fix Under Review): TMAPUP operation results in OSD assertion failure
- 01:47 PM Bug #41908 (Resolved): TMAPUP operation results in OSD assertion failure
- In 'do_tmapup', the object is READ into a 'newop' structure and then when it is re-written, the same 'newop' structur...
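As an illustration only (invented names, not the actual OSD code path), a minimal sketch of the hazard described above: reusing one op structure for both the read and the rewrite leaves stale read state attached to the write.
<pre>
class Op:
    def __init__(self):
        self.outdata = b""   # filled in by the read step
        self.indata = b""    # consumed by the write step

def read_obj(op, obj):
    op.outdata = obj["data"]

def write_obj(op, obj):
    obj["data"] = op.indata

obj = {"data": b"old tmap"}
op = Op()                    # one structure reused for both steps
read_obj(op, obj)
op.indata = b"new tmap"
write_obj(op, obj)
print(op.outdata)            # b'old tmap': stale read state never cleared
</pre>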
- 10:52 AM Bug #41677: Cephmon:fix mon crash
- @shuguang what is the exact version of ceph-mon? I cannot match the backtrace with the source code of master HEAD.
- 09:46 AM Feature #41905 (New): Add ability to change fsid of cluster
- There is a case where you want to change the fsid of a cluster: when you have split a cluster into two different c...
09/17/2019
- 09:50 PM Bug #41900 (Resolved): auto-scaler breaks many standalone tests
Caused by https://github.com/ceph/ceph/pull/30112
In some cases I had to kill processes to get past hung tests. ...
- 08:46 PM Bug #41816: Enable auto-scaler and get src/osd/PeeringState.cc:3671: failed assert info.last_comp...
- This crash didn't reproduce for me using run-standalone.sh with the auto scaler turned off.
- 08:35 PM Bug #40287 (Pending Backport): OSDMonitor: missing `pool_id` field in `osd pool ls` command
- 08:30 PM Bug #41191 (Pending Backport): osd: scrub error on big objects; make bluestore refuse to start on...
- 08:29 PM Bug #41210 (Pending Backport): osd: failure result of do_osd_ops not logged in prepare_transactio...
- @shuguang wang did you want this to be backported to a release older than nautilus?
- 06:59 PM Bug #41336: All OSD Faild after Reboot.
- Hi,
two questions:
- How to find out if a pool is affected?
"ceph osd erasure-code-profile get" does not list... - 05:04 PM Bug #41891: global osd crash in DynamicPerfStats::add_to_reports
- Yes, I use "rbd perf image iotop/iostat" (one of the reasons for upgrade:-) ). Not exporting per image data with prom...
- 03:51 PM Bug #41891: global osd crash in DynamicPerfStats::add_to_reports
- Marcin, are you using `rbd perf image iotop|iostat` commands? Or may be prometheus mgr module with rbd per image stat...
- 01:49 PM Bug #41891: global osd crash in DynamicPerfStats::add_to_reports
- As the crash seems to be related to stats reporting - I don't know if it is related, but it was soon after eliminating "Leg...
- 10:30 AM Bug #41891 (Resolved): global osd crash in DynamicPerfStats::add_to_reports
- Hi,
during routine host maintenance, I've encountered a massive osd crash across the entire cluster. The sequence of event...
- 01:19 PM Feature #40420 (Need More Info): Introduce an ceph.conf option to disable HEALTH_WARN when nodeep...
- https://github.com/ceph/ceph/pull/29422 has been merged, but not yet backported
- 08:05 AM Bug #41754: Use dump_stream() instead of dump_float() for floats where max precision isn't helpful
- Regarding elapsed time, it might be important (for `compact` it is not, but for benchmarking it is). Another important thi...
- 06:15 AM Backport #41238 (In Progress): nautilus: Implement mon_memory_target
09/16/2019
- 10:10 PM Cleanup #41876 (Fix Under Review): tools/rados: add --pgid in help
- 10:09 PM Cleanup #41876 (Resolved): tools/rados: add --pgid in help
- 09:39 PM Bug #41817 (In Progress): qa/standalone/scrub/osd-recovery-scrub.sh timed out waiting for scrub
- This is likely caused by enabling the auto scaler.
- 03:27 PM Bug #41817: qa/standalone/scrub/osd-recovery-scrub.sh timed out waiting for scrub
- /a/kchai-2019-09-15_15:37:26-rados-wip-kefu-testing-2019-09-15-1533-distro-basic-mira/4311115/
/a/pdonnell-2019-09-1...
- 08:05 PM Bug #41875 (Fix Under Review): Segmentation fault in rados ls when using --pgid and --pool/-p tog...
- 07:55 PM Bug #41875 (Resolved): Segmentation fault in rados ls when using --pgid and --pool/-p together as...
- - Works fine with only --pgid...
- 07:57 PM Bug #41816: Enable auto-scaler and get src/osd/PeeringState.cc:3671: failed assert info.last_comp...
- Reproduced with logs: /a/nojha-2019-09-13_21:45:51-rados:standalone-master-distro-basic-smithi/4304313/remote/smithi1...
- 03:25 PM Bug #40522: on_local_recover doesn't touch?
- /a/pdonnell-2019-09-14_22:40:03-rados-master-distro-basic-smithi/4307679/
/a/kchai-2019-09-15_15:37:26-rados-wip-kef...
- 03:23 PM Bug #41874 (Resolved): mon-osdmap-prune.sh fails
- ...
- 03:19 PM Bug #41873 (Resolved): test-erasure-code.sh fails
- ...
- 01:46 PM Backport #41238: nautilus: Implement mon_memory_target
- The old PR is unlinked from the tracker as more commits need to be pulled in for this backport. I will update this tr...
- 01:04 PM Backport #41238 (Need More Info): nautilus: Implement mon_memory_target
- first attempted backport https://github.com/ceph/ceph/pull/29652 was closed - apparently, the backport is not trivial...
- 01:23 PM Backport #40993: mimic: Ceph status in some cases does not report slow ops
- just for completeness - the mimic fix is (I think): https://github.com/ceph/ceph/pull/30391
- 10:39 AM Bug #41866: OSD cannot report slow operation warnings in time.
- It is assumed that bluestore is used.
- 10:23 AM Bug #41866 (Fix Under Review): OSD cannot report slow operation warnings in time.
- If an underlying device is blocked due to H/W issues, a thread that checks slow ops can’t report slow op warning in t...
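A self-contained sketch of this failure mode (plain Python threading, not the OSD code): the reporting thread needs a lock that an I/O thread holds while blocked on the device, so the slow-op warning itself is delayed until the stall clears.
<pre>
import threading, time

pg_lock = threading.Lock()

def io_thread():
    with pg_lock:            # lock held across a "device" operation
        time.sleep(5)        # stand-in for a hung disk

def watchdog():
    time.sleep(0.1)
    t0 = time.time()
    with pg_lock:            # must wait for the hung I/O to finish
        pass
    print(f"slow-op report delayed {time.time() - t0:.1f}s")

threading.Thread(target=io_thread).start()
threading.Thread(target=watchdog).start()
</pre>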
- 07:21 AM Backport #41864 (Resolved): luminous: Mimic MONs have slow/long running ops
- https://github.com/ceph/ceph/pull/30519
- 07:21 AM Backport #41863 (Resolved): mimic: Mimic MONs have slow/long running ops
- https://github.com/ceph/ceph/pull/30481
- 07:21 AM Backport #41862 (Resolved): nautilus: Mimic MONs have slow/long running ops
- https://github.com/ceph/ceph/pull/30480
- 07:14 AM Backport #41845 (Resolved): luminous: tools/rados: allow list objects in a specific pg in a pool
- https://github.com/ceph/ceph/pull/30608
- 07:14 AM Backport #41844 (Resolved): mimic: tools/rados: allow list objects in a specific pg in a pool
- https://github.com/ceph/ceph/pull/30893
09/15/2019
- 01:59 PM Bug #41716 (Resolved): LibRadosTwoPoolsPP.ManifestUnset fails
- 01:51 PM Bug #41716: LibRadosTwoPoolsPP.ManifestUnset fails
- This issue is fixed by https://github.com/ceph/ceph/pull/29985
When the error occurs, the following ops are executed...
- 03:05 AM Bug #41834 (Resolved): qa: EC Pool configuration and slow op warnings for OSDs caused by recent m...
- See: http://pulpito.ceph.com/pdonnell-2019-09-14_22:39:31-fs-master-distro-basic-smithi/
Recent run of fs suite on...
09/13/2019
- 10:29 PM Feature #41831 (Resolved): tools/rados: allow list objects in a specific pg in a pool
- This one is already present in nautilus.
- 04:41 PM Bug #41817: qa/standalone/scrub/osd-recovery-scrub.sh timed out waiting for scrub
- David, can you please take a look at this whenever you get a chance.
- 01:31 PM Bug #41817 (Closed): qa/standalone/scrub/osd-recovery-scrub.sh timed out waiting for scrub
- ...
- 04:40 PM Bug #41816: Enable auto-scaler and get src/osd/PeeringState.cc:3671: failed assert info.last_comp...
- I'll try to see if I can reproduce this.
- 01:30 PM Bug #41816 (Resolved): Enable auto-scaler and get src/osd/PeeringState.cc:3671: failed assert inf...
- ...
- 04:37 PM Bug #41735 (Resolved): pg_autoscaler throws HEALTH_WARN with auto_scale on for all pools
- See https://tracker.ceph.com/issues/41735#note-3 and https://github.com/rook/rook/pull/3847/commits/11d3831d742639148...
- 04:29 PM Bug #24531 (Pending Backport): Mimic MONs have slow/long running ops
- 09:09 AM Backport #40993 (Rejected): mimic: Ceph status in some cases does not report slow ops
- backports will be pursued in https://tracker.ceph.com/issues/41741
- 07:54 AM Bug #41758 (Duplicate): Ceph status in some cases does not report slow ops
- 05:13 AM Feature #40420: Introduce an ceph.conf option to disable HEALTH_WARN when nodeep-scrub/scrub flag...
- What are the backport targets for this? I don't see a health mute tracker referenced by any of the commits, but this...
- 01:55 AM Backport #41712 (In Progress): nautilus: FAILED ceph_assert(p != pg_slots.end()) in OSDShard::reg...
- https://github.com/ceph/ceph/pull/30371
09/12/2019
- 10:38 PM Backport #40993: mimic: Ceph status in some cases does not report slow ops
- Nathan Cutler wrote:
> backport ticket opened prematurely - setting "Need More Info" pending:
>
> 1. opening of P...
- 08:19 PM Backport #40993 (Need More Info): mimic: Ceph status in some cases does not report slow ops
- backport ticket opened prematurely - setting "Need More Info" pending:
1. opening of PR fixing the issue in master... - 08:18 PM Backport #40993 (New): mimic: Ceph status in some cases does not report slow ops
- 11:58 AM Backport #40993: mimic: Ceph status in some cases does not report slow ops
- Converting this to track backport from master where the fix is under review.
- 02:03 PM Bug #36289: Converting Filestore OSD from leveldb to rocksdb backend on CentOS
- We had to scrap the idea of changing the backend and went for upgrading the OSDs to Bluestore. Our backfilling issue ...
- 01:58 PM Bug #36289: Converting Filestore OSD from leveldb to rocksdb backend on CentOS
- David:
Did you run into a solution for this? We're seeing similar issues but the only possible alternative seems ...
- 08:32 AM Backport #41785 (Resolved): nautilus: Make dumping of reservation info congruent between scrub an...
- https://github.com/ceph/ceph/pull/31444
- 05:41 AM Backport #41764 (In Progress): nautilus: TestClsRbd.sparsify fails when using filestore
- https://github.com/ceph/ceph/pull/30354
- 02:24 AM Bug #23647 (In Progress): thrash-eio test can prevent recovery
- http://pulpito.ceph.com/nojha-2019-09-06_14:33:54-rados:singleton-wip-41385-3-distro-basic-smithi/ - this is where I ...
- 01:22 AM Bug #41743: Long heartbeat ping times on front interface seen, longest is 2237.999 msec (OSD_SLOW...
Reproduced several times with debug_ms = 20
http://pulpito.ceph.com/dzafman-2019-09-11_15:28:37-rados-wip-zafman...
- 01:21 AM Bug #41735: pg_autoscaler throws HEALTH_WARN with auto_scale on for all pools
- sorry I missed that...
09/11/2019
- 10:28 PM Bug #41735 (Fix Under Review): pg_autoscaler throws HEALTH_WARN with auto_scale on for all pools
- Rook should probably set this option explicitly, since it is working with nautilus and we won't backport this (or the...
- 09:29 PM Bug #41735 (Need More Info): pg_autoscaler throws HEALTH_WARN with auto_scale on for all pools
- can you attach the 'ceph health detail' output so i can see which warning it's throwing?
- 09:33 PM Bug #41669 (Pending Backport): Make dumping of reservation info congruent between scrub and recovery
- 09:11 PM Bug #41680 (Won't Fix): Removed OSDs with outstanding peer failure reports crash the monitor
- OSD failure reports will die out on their own eventually and there's no general reason to expect a removed OSD was in...
- 09:11 PM Bug #41639 (Rejected): mon/MgrMonitor: enable pg_autoscaler by default for nautilus
- 09:10 PM Bug #41693 (Need More Info): a accidental problems with osd detection algorithm in monitor
- Can you explain in more detail exactly what happened here?
It sounds like you have three hosts with colocated OSDs...
- 09:08 PM Bug #41718 (Fix Under Review): ceph osd stat JSON output incomplete
- 03:28 PM Bug #41758 (Fix Under Review): Ceph status in some cases does not report slow ops
- 01:13 PM Bug #41758: Ceph status in some cases does not report slow ops
- After applying the fix, health warning pertaining to slow ops show up as shown below,...
- 12:57 PM Bug #41758: Ceph status in some cases does not report slow ops
- PR https://github.com/ceph/ceph/pull/30337 addresses this issue.
- 09:29 AM Bug #41758 (Duplicate): Ceph status in some cases does not report slow ops
- In cases when only osds report slow ops, it is observed that ceph summary status doesn't report the same. This issue ...
- 01:28 PM Backport #41764 (Resolved): nautilus: TestClsRbd.sparsify fails when using filestore
- https://github.com/ceph/ceph/pull/30354
- 09:14 AM Backport #40993: mimic: Ceph status in some cases does not report slow ops
- Further to my findings earlier, I confirmed that the "reported" flag is being reset in case ONLY an osd daemon report...
- 04:08 AM Bug #41754 (New): Use dump_stream() instead of dump_float() for floats where max precision isn't ...
Some examples from osd dump are below. The full_ratio is .95, backfill_ratio .90 and nearfull_ratio .85.
<pre...
- 01:25 AM Bug #41661 (Resolved): radosbench_omap_write cleanup slow/stuck
- 12:25 AM Bug #41743: Long heartbeat ping times on front interface seen, longest is 2237.999 msec (OSD_SLOW...
- 12:24 AM Bug #41743 (In Progress): Long heartbeat ping times on front interface seen, longest is 2237.999 ...
09/10/2019
- 10:42 PM Bug #41743: Long heartbeat ping times on front interface seen, longest is 2237.999 msec (OSD_SLOW...
- The only OSDs involved are osd.6 and osd.0.
Slow heartbeat ping on front interface from osd.6 to osd.0 2237.999 ms...
- 12:12 PM Bug #41743 (Resolved): Long heartbeat ping times on front interface seen, longest is 2237.999 mse...
- "2019-09-09T22:25:11.794749+0000 mon.b (mon.0) 389 : cluster [WRN] Health check failed: Long heartbeat ping times on ...
- 08:21 PM Bug #41661 (Fix Under Review): radosbench_omap_write cleanup slow/stuck
- 07:54 PM Bug #41661: radosbench_omap_write cleanup slow/stuck
- Clearly, filestore-xfs.yaml is the one failing consistently.
See http://pulpito.ceph.com/nojha-2019-09-09_23:22:30...
- 05:03 PM Backport #40082 (In Progress): luminous: osd: Better error message when OSD count is less than os...
- 02:59 PM Bug #41748 (Can't reproduce): log [ERR] : 7.19 caller_ops.size 62 > log size 61
- ...
- 08:27 AM Bug #41721 (Pending Backport): TestClsRbd.sparsify fails when using filestore
- 06:45 AM Backport #41640 (In Progress): nautilus: FAILED ceph_assert(info.history.same_interval_since != 0...
- 06:36 AM Backport #41530 (Resolved): mimic: doc: mon_health_to_clog_* values flipped
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/30227
m...
- 06:34 AM Backport #41532 (Resolved): luminous: Move bluefs alloc size initialization log message to log le...
- 06:32 AM Backport #38551: luminous: core: lazy omap stat collection
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/29190
m...
- 05:42 AM Backport #41703 (In Progress): nautilus: oi(object_info_t).size does not match on disk size
- https://github.com/ceph/ceph/pull/30278
- 03:55 AM Backport #41704 (In Progress): mimic: oi(object_info_t).size does not match on disk size
- https://github.com/ceph/ceph/pull/30275
- 01:01 AM Bug #41735 (Resolved): pg_autoscaler throws HEALTH_WARN with auto_scale on for all pools
- Old pools have auto_scale on and ceph health still shows HEALTH_WARN (20 < 30)...
09/09/2019
- 11:38 PM Bug #41661: radosbench_omap_write cleanup slow/stuck
- The current timeout (config.get('time', 360) * 30 + 300 = 300*30 + 300) of 9300 seconds is not enough to clean up the...
- 10:25 PM Feature #38136 (Resolved): core: lazy omap stat collection
- 10:25 PM Backport #38551 (Resolved): luminous: core: lazy omap stat collection
- 09:45 PM Bug #41601: oi(object_info_t).size does not match on disk size
- Greg Farnum wrote:
> Hmm I was going to move this into the RADOS project tracker but now I'm leaving it because I'm ...
- 08:20 PM Bug #41601: oi(object_info_t).size does not match on disk size
- Hmm I was going to move this into the RADOS project tracker but now I'm leaving it because I'm not sure if that will ...
- 09:35 PM Backport #41731 (Need More Info): nautilus: osd/ReplicatedBackend.cc: 1349: FAILED ceph_assert(pe...
- note that the backport of https://github.com/ceph/ceph/pull/30059 should happen after https://github.com/ceph/ceph/pu...
- 07:39 PM Backport #41731 (Rejected): nautilus: osd/ReplicatedBackend.cc: 1349: FAILED ceph_assert(peer_mis...
- 09:34 PM Backport #41732 (Need More Info): mimic: osd/ReplicatedBackend.cc: 1349: FAILED ceph_assert(peer_...
- 09:33 PM Backport #41732: mimic: osd/ReplicatedBackend.cc: 1349: FAILED ceph_assert(peer_missing.count(fro...
- note that the backport of https://github.com/ceph/ceph/pull/30059 should happen after https://github.com/ceph/ceph/pu...
- 07:39 PM Backport #41732 (Rejected): mimic: osd/ReplicatedBackend.cc: 1349: FAILED ceph_assert(peer_missin...
- 09:33 PM Backport #41730 (Need More Info): luminous: osd/ReplicatedBackend.cc: 1349: FAILED ceph_assert(pe...
- note that the backport of https://github.com/ceph/ceph/pull/30059 should happen after https://github.com/ceph/ceph/pu...
- 07:39 PM Backport #41730 (Resolved): luminous: osd/ReplicatedBackend.cc: 1349: FAILED ceph_assert(peer_mis...
- https://github.com/ceph/ceph/pull/31855
- 09:03 PM Bug #41385: osd/ReplicatedBackend.cc: 1349: FAILED ceph_assert(peer_missing.count(fromshard))
- Nathan Cutler wrote:
> @Neha - backport all three PRs?
Yes, note that the backport of https://github.com/cep...
- 07:41 PM Bug #41385: osd/ReplicatedBackend.cc: 1349: FAILED ceph_assert(peer_missing.co...
- @Neha - backport all three PRs?
- 04:53 PM Bug #41385 (Pending Backport): osd/ReplicatedBackend.cc: 1349: FAILED ceph_assert(peer_missing.co...
- 08:51 PM Bug #41065 (Closed): new osd added to cluster upgraded from 13 to 14 will down after some days
- It's not clear from these snippets what issue you're actually experiencing. The "bad authorizer" suggests either a cl...
- 08:37 PM Bug #41406: common: SafeTimer reinit doesn't fix up "stopping" bool, used in MonClient bootstrap
- That's a weird one; perhaps the MonClient should behave differently instead.
(Note that this is a problem only on ... - 04:20 PM Bug #41689 (Resolved): Network ping test fails in TEST_network_ping_test2
- This is a follow-on fix for the feature https://tracker.ceph.com/issues/40640. The backport is included as part of t...
- 10:50 AM Bug #41721 (Fix Under Review): TestClsRbd.sparsify fails when using filestore
- 10:24 AM Bug #41721 (Resolved): TestClsRbd.sparsify fails when using filestore
- it's a regression introduced by https://github.com/ceph/ceph/pull/30061
see http://pulpito.ceph.com/kchai-2019-09-...
09/08/2019
- 06:16 PM Bug #41718 (Resolved): ceph osd stat JSON output incomplete
- ...
- 09:22 AM Bug #40583 (Resolved): Lower the default value of osd_deep_scrub_large_omap_object_key_threshold
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 09:20 AM Backport #40653 (Resolved): luminous: Lower the default value of osd_deep_scrub_large_omap_object...
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/29175
m...
09/07/2019
- 08:07 PM Bug #41716 (Resolved): LibRadosTwoPoolsPP.ManifestUnset fails
- ...
- 09:29 AM Backport #41712 (Resolved): nautilus: FAILED ceph_assert(p != pg_slots.end()) in OSDShard::regist...
- https://github.com/ceph/ceph/pull/30371
- 09:23 AM Backport #41705 (Resolved): nautilus: Incorrect logical operator in Monitor::handle_auth_request()
- https://github.com/ceph/ceph/pull/31038
- 09:23 AM Backport #41704 (Resolved): mimic: oi(object_info_t).size does not match on disk size
- https://github.com/ceph/ceph/pull/30275
- 09:23 AM Backport #41703 (Resolved): nautilus: oi(object_info_t).size does not match on disk size
- https://github.com/ceph/ceph/pull/30278
- 09:23 AM Backport #41702 (Rejected): luminous: oi(object_info_t).size does not match on disk size
- 07:45 AM Backport #41697 (In Progress): luminous: Network ping monitoring
- 07:31 AM Backport #41697 (Resolved): luminous: Network ping monitoring
- https://github.com/ceph/ceph/pull/30230
- 07:43 AM Backport #41696 (In Progress): mimic: Network ping monitoring
- 07:31 AM Backport #41696 (Resolved): mimic: Network ping monitoring
- https://github.com/ceph/ceph/pull/30225
- 07:34 AM Backport #41695 (In Progress): nautilus: Network ping monitoring
- 07:31 AM Backport #41695 (Resolved): nautilus: Network ping monitoring
- https://github.com/ceph/ceph/pull/30195
- 02:35 AM Bug #41693 (Need More Info): a accidental problems with osd detection algorithm in monitor
- There is an intermittent problem with the OSD detection algorithm in the monitor. In a three-host cluster environment, HostA/HostB/Ho...
09/06/2019
- 11:49 PM Backport #41531 (In Progress): nautilus: Move bluefs alloc size initialization log message to log...
- 10:15 PM Backport #41531 (Need More Info): nautilus: Move bluefs alloc size initialization log message to ...
- non-trivial backport - needs https://github.com/ceph/ceph/pull/29537 at least
- 11:38 PM Bug #41385 (Fix Under Review): osd/ReplicatedBackend.cc: 1349: FAILED ceph_assert(peer_missing.co...
- https://github.com/ceph/ceph/pull/30119 (merged September 4, 2019)
https://github.com/ceph/ceph/pull/30059 (merged S...
- 10:21 PM Backport #41533 (In Progress): mimic: Move bluefs alloc size initialization log message to log le...
- 10:14 PM Backport #41533 (Need More Info): mimic: Move bluefs alloc size initialization log message to log...
- non-trivial backport - needs https://github.com/ceph/ceph/pull/29537 at least
- 10:03 PM Backport #41530 (In Progress): mimic: doc: mon_health_to_clog_* values flipped
- 08:01 PM Backport #41499 (Need More Info): mimic: backfill_toofull while OSDs are not full (Unneccessary H...
- The backport needs 3b8f86c8b09b9143d3e25ab34b51057581b48114 to be cherry-picked, first, for it to make sense, but tha...
- 03:34 PM Backport #41499 (In Progress): mimic: backfill_toofull while OSDs are not full (Unneccessary HEAL...
- 07:42 PM Backport #41502 (In Progress): mimic: Warning about past_interval bounds on deleting pg
- 07:03 PM Bug #41689: Network ping test fails in TEST_network_ping_test2
- ...
- 06:37 PM Bug #41689 (Fix Under Review): Network ping test fails in TEST_network_ping_test2
- 06:18 PM Bug #41689 (Resolved): Network ping test fails in TEST_network_ping_test2
- http://pulpito.ceph.com/kchai-2019-09-06_15:05:18-rados-wip-kefu-testing-2019-09-06-1807-distro-basic-smithi/4283774/...
- 05:27 PM Bug #41429 (Pending Backport): Incorrect logical operator in Monitor::handle_auth_request()
- 05:08 PM Bug #38513: luminous: "AsyncReserver.h: 190: FAILED assert(!queue_pointers.count(item) && !in_pro...
- /a/nojha-2019-09-05_23:53:20-rados-wip-40769-luminous-distro-basic-smithi/4279855/
- 03:29 PM Backport #41490 (In Progress): mimic: OSDCap.PoolClassRNS test aborts
- 03:28 PM Backport #41449 (In Progress): mimic: mon: C_AckMarkedDown has not handled the Callback Arguments
- 01:29 PM Backport #40993 (In Progress): mimic: Ceph status in some cases does not report slow ops
- The logs relating to this tracker didn't indicate anything obvious upon analysis. The issue was reproduced locally on...
- 10:04 AM Bug #41680 (Resolved): Removed OSDs with outstanding peer failure reports crash the monitor
- The OSD had been removed, but had previously reported failure information for a partner OSD. However, reporters of failure...
- 09:50 AM Bug #41677: Cephmon:fix mon crash
- shuguang wang wrote:
> Reduction num of osd in primary mon of three node cluster, the primary mon crash of occasiona...
- 08:45 AM Bug #41677 (Fix Under Review): Cephmon:fix mon crash
- 08:43 AM Bug #41677: Cephmon:fix mon crash
- shuguang wang wrote:
> Reduction num of osd in primary mon of three node cluster, the primary mon crash of occasiona...
- 08:42 AM Bug #41677: Cephmon:fix mon crash
- The OSD had been removed, but had previously reported failure information for a partner OSD. However, failure_info of this...
- 05:34 AM Bug #41677 (Resolved): Cephmon:fix mon crash
- Reducing the number of OSDs on the primary mon of a three-node cluster occasionally crashes the primary mon.
- 05:53 AM Bug #41427 (Resolved): set-chunk raced with deep-scrub
- 05:52 AM Bug #41514 (Resolved): in-flight manifest ops not properly cancelled on interval changing
- 03:17 AM Bug #41601 (Pending Backport): oi(object_info_t).size does not match on disk size
09/05/2019
- 10:30 PM Bug #41657 (Rejected): osd/PeeringState.cc: 2540: FAILED ceph_assert(cct->_conf->osd_find_best_in...
- this is caused by a bug in my test branch
- 09:19 PM Bug #41669: Make dumping of reservation info congruent between scrub and recovery
- 05:47 PM Bug #41669 (Resolved): Make dumping of reservation info congruent between scrub and recovery
Rename dump_reservations to dump_recovery_reservations
Add dump_scrub_reservations
- 06:59 PM Feature #40640 (Pending Backport): Network ping monitoring
- 01:50 PM Backport #41447 (In Progress): mimic: osd/PrimaryLogPG: Access destroyed references in finish_deg...
- 01:01 PM Backport #41351 (In Progress): mimic: hidden corei7 requirement in binary packages
- 12:49 PM Backport #41291 (In Progress): mimic: filestore pre-split may not split enough directories
- 12:48 PM Backport #40732 (In Progress): mimic: mon: auth mon isn't loading full KeyServerData after restart
- 12:36 PM Backport #40083 (In Progress): mimic: osd: Better error message when OSD count is less than osd_p...
- 07:25 AM Feature #41666 (Resolved): Issue a HEALTH_WARN when a Pool is configured with [min_]size == 1
- To prevent the user from experiencing data loss, Ceph should issue a health warning if any Pool is configured with a ...
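Until such a warning exists, a hedged external sketch of the check (the field names are assumed from `ceph osd pool ls detail -f json`; verify them against your release):
<pre>
import json, subprocess

# Flag any pool whose size or min_size is 1 (assumed JSON field names).
out = subprocess.check_output(
    ["ceph", "osd", "pool", "ls", "detail", "-f", "json"])
for pool in json.loads(out):
    if pool["size"] == 1 or pool["min_size"] == 1:
        print(f"WARN: pool '{pool['pool_name']}' has "
              f"size={pool['size']} min_size={pool['min_size']}")
</pre>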
- 12:00 AM Bug #41661 (Resolved): radosbench_omap_write cleanup slow/stuck
- ...
09/04/2019
- 09:34 PM Feature #38458: Ceph does not have command to show current osd primary-affinity
- "ceph osd dump", perhaps with a detail or json formatting, includes that information.
I don't think we have any qu...
- 05:49 AM Feature #38458: Ceph does not have command to show current osd primary-affinity
- Greg, what is the exact command?
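For reference, a hedged sketch of pulling this out of the OSDMap dump (assuming the "osds" array of `ceph osd dump -f json` carries a "primary_affinity" field; verify on your release):
<pre>
import json, subprocess

dump = json.loads(subprocess.check_output(
    ["ceph", "osd", "dump", "-f", "json"]))
for osd in dump["osds"]:     # per-OSD entries (assumed field names)
    print(f"osd.{osd['osd']}: primary-affinity {osd['primary_affinity']}")
</pre>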
- 05:58 PM Bug #41657 (Fix Under Review): osd/PeeringState.cc: 2540: FAILED ceph_assert(cct->_conf->osd_find...
- 05:53 PM Bug #41657: osd/PeeringState.cc: 2540: FAILED ceph_assert(cct->_conf->osd_find_best_info_ignore_h...
- The find_best_info process excludes getting a master log from an osd with an old(er) last_epoch_started. However, th...
- 05:51 PM Bug #41657 (Rejected): osd/PeeringState.cc: 2540: FAILED ceph_assert(cct->_conf->osd_find_best_in...
- ...
- 01:27 PM Feature #41650 (New): Convert between EC profiles online
- Users have repeatedly voiced the need to convert/modify an EC profile while the cluster was running, in response to c...
- 09:06 AM Feature #41647 (Resolved): pg_autoscaler should show a warning if pg_num isn't a power of two
- As the pg_autoscaler will be automatically turned on with the 14.2.4 release and future releases, I would like to enha...
- 03:36 AM Bug #40646 (Resolved): FTBFS with devtoolset-8-gcc-c++-8.3.1-3.el7.x86_64 and devtoolset-8-libstd...
- 01:23 AM Bug #40646 (Fix Under Review): FTBFS with devtoolset-8-gcc-c++-8.3.1-3.el7.x86_64 and devtoolset-...
- 12:43 AM Bug #40646 (Resolved): FTBFS with devtoolset-8-gcc-c++-8.3.1-3.el7.x86_64 and devtoolset-8-libstd...
- 12:32 AM Bug #38483 (Pending Backport): FAILED ceph_assert(p != pg_slots.end()) in OSDShard::register_and_...