Activity
From 03/20/2018 to 04/18/2018
04/18/2018
- 10:01 PM Bug #23653: tcmalloc Attempt to free invalid pointer 0x55de11f2a540 in rocksdb::LRUCache::~LRUCac...
- /a/sage-2018-04-18_19:08:00-rados-wip-sage-testing-2018-04-18-1210-distro-basic-smithi/2413082
- 08:20 AM Bug #22464: Bluestore: many checksum errors, always 0x6706be76 (which matches a zero block)
- We also see the problem on two clusters with linux 4.13, but not on a cluster with linux 4.10. Configuration and ceph...
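For context on the recurring value: the ticket title already notes that 0x6706be76 matches a zero block. A quick way to check that constant, assuming BlueStore seeds crc32c with -1 and stores the raw register without a final xor, and a 4 KiB checksum chunk (both assumptions, not stated in the ticket), using the third-party Python crc32c package:

    # The crc32c package (pip install crc32c) computes standard CRC-32C,
    # which init/final-xors the register with 0xffffffff. Undoing the
    # final xor recovers the raw register value, i.e. what a seed of -1
    # and no final xor would produce.
    import crc32c

    chunk = b"\x00" * 4096                         # one zeroed 4 KiB csum chunk
    ceph_csum = crc32c.crc32c(chunk) ^ 0xFFFFFFFF  # raw register value
    print(hex(ceph_csum))                          # expected: 0x6706be76

If every bad chunk carries this checksum, the data read back was all zeros, which points at the read path or the device rather than at random corruption.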
04/17/2018
- 12:30 PM Bug #22464: Bluestore: many checksum errors, always 0x6706be76 (which matches a zero block)
- Our Ceph cluster has the same problems. I am just migrating OSDs from filestore with XFS to bluestore and get these E...
04/16/2018
- 04:37 PM Bug #22510 (Resolved): osd: BlueStore.cc: BlueStore::_balance_bluefs_freespace: assert(0 == "allo...
- 04:36 PM Backport #23063 (Resolved): luminous: osd: BlueStore.cc: BlueStore::_balance_bluefs_freespace: as...
- 04:21 PM Backport #23063: luminous: osd: BlueStore.cc: BlueStore::_balance_bluefs_freespace: assert(0 == "...
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/21394
merged
- 05:38 AM Bug #22102: BlueStore crashed on rocksdb checksum mismatch
- Sage Weil wrote:
> Artemy, is it possible the machine where you saw this was swapping?
Yes, the system was in a swap ...
04/13/2018
- 09:29 PM Bug #22510 (Pending Backport): osd: BlueStore.cc: BlueStore::_balance_bluefs_freespace: assert(0 ...
- The luminous PR is still open.
- 02:38 AM Bug #22510 (Resolved): osd: BlueStore.cc: BlueStore::_balance_bluefs_freespace: assert(0 == "allo...
- 09:29 PM Backport #23063 (In Progress): luminous: osd: BlueStore.cc: BlueStore::_balance_bluefs_freespace:...
- The luminous PR is still open, so reopening the backport issue.
- 02:37 AM Backport #23063 (Resolved): luminous: osd: BlueStore.cc: BlueStore::_balance_bluefs_freespace: as...
- 06:46 PM Bug #23653: tcmalloc Attempt to free invalid pointer 0x55de11f2a540 in rocksdb::LRUCache::~LRUCac...
- except the job runs on rhel 7.5,...
- 06:45 PM Bug #23653: tcmalloc Attempt to free invalid pointer 0x55de11f2a540 in rocksdb::LRUCache::~LRUCac...
- on lab centos deploy,...
- 06:38 PM Bug #23653: tcmalloc Attempt to free invalid pointer 0x55de11f2a540 in rocksdb::LRUCache::~LRUCac...
- This looks to me like a build issue with tcmalloc... specifically, building on CentOS and running on RHEL. Running o...
- 08:06 AM Backport #23700 (In Progress): luminous: osd: KernelDevice.cc: 539: FAILED assert(r == 0)
- https://github.com/ceph/ceph/pull/21407
- 07:59 AM Backport #23700 (Resolved): luminous: osd: KernelDevice.cc: 539: FAILED assert(r == 0)
- https://github.com/ceph/ceph/pull/21407
- 07:58 AM Bug #23246 (Pending Backport): [OSD bug] KernelDevice.cc: 539: FAILED assert(r == 0)
- 06:49 AM Backport #23672 (In Progress): luminous: bluestore: ENODATA on aio
- https://github.com/ceph/ceph/pull/21405
04/12/2018
- 10:28 PM Bug #22102: BlueStore crashed on rocksdb checksum mismatch
- Sorry for interrupting (I provided related info in comments on issue #22678), but I must add that no error has appe...
- 10:07 PM Bug #22102: BlueStore crashed on rocksdb checksum mismatch
- Artemy, is it possible the machine where you saw this was swapping?
- 10:05 PM Bug #22102: BlueStore crashed on rocksdb checksum mismatch
- Update:
- The bad data appears to be in the buffer immediately after the pread syscall
- pread is returning the f...
- 06:47 PM Backport #23063 (In Progress): luminous: osd: BlueStore.cc: BlueStore::_balance_bluefs_freespace:...
- https://github.com/ceph/ceph/pull/21394
- 11:36 AM Bug #22044: rocksdb log replay - corruption: missing start of fragmented record
- Excuse the late response. On the advice of the ceph-users ML I had wiped and recreated the OSD. Of course this made t...
- 01:39 AM Backport #23672 (Resolved): luminous: bluestore: ENODATA on aio
- https://github.com/ceph/ceph/pull/21405
04/11/2018
- 03:30 PM Bug #23653 (Resolved): tcmalloc Attempt to free invalid pointer 0x55de11f2a540 in rocksdb::LRUCac...
- This is on rhel7.5 qst run
Run: http://pulpito.ceph.com/teuthology-2018-04-10_20:02:32-smoke-master-testing-basic-...
- 03:07 PM Bug #23333 (Pending Backport): bluestore: ENODATA on aio
- I believe this change should be backported.
- 02:36 PM Bug #22464: Bluestore: many checksum errors, always 0x6706be76 (which matches a zero block)
- Could this be caused by re-written data while the first write is still in flight?
This write pattern has been obse...
- 12:51 PM Bug #22464: Bluestore: many checksum errors, always 0x6706be76 (which matches a zero block)
- We're seeing the same behavior:...
04/09/2018
- 10:53 PM Bug #23577: Inconsistent PG refusing to deep-scrub or repair
- I attempted to upload a log file with debug_osd = 20/20 for this with upload tag e6d4f641-3006-4ee9-86eb-359f569de6ed...
- 05:47 PM Bug #23577: Inconsistent PG refusing to deep-scrub or repair
- I have a second PG in the same cluster doing this exact same thing. One of its 11 copies is on Bluestore, the rest ...
- 06:54 PM Bug #23333: bluestore: ENODATA on aio
- 06:54 PM Bug #23333 (Fix Under Review): bluestore: ENODATA on aio
- PR: https://github.com/ceph/ceph/pull/21306.
- 06:51 PM Support #23433: Ceph cluster doesn't start - ERROR: error creating empty object store in /data/ce...
- What is the filesystem underneath _/data/ceph/build/dev/osd0_?
- 01:19 PM Bug #23540: FAILED assert(0 == "can't mark unloaded shard dirty") with compression enabled
- Hello,
We have been running for a few days without problems (with compression disabled); to get a debug I need to enable ...
04/06/2018
- 08:39 PM Bug #22616 (Resolved): bluestore_cache_data uses too much memory
- 08:38 PM Backport #23226 (Resolved): luminous: bluestore_cache_data uses too much memory
- 07:27 PM Backport #23226: luminous: bluestore_cache_data uses too much memory
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/21059
merged
- 08:29 PM Bug #22678: block checksum mismatch from rocksdb
- debug_bluefs=20
Uploaded with id 5f2ee681-a9d7-4923-899e-5852b3fe18cb
- 02:46 PM Bug #22678: block checksum mismatch from rocksdb
- Sergey, can you generate a similar log, but also with 'debug bluefs = 20' enabled? You can even turn down 'debug blu...
- 04:28 PM Bug #23577 (Can't reproduce): Inconsistent PG refusing to deep-scrub or repair
- This is an issue brought over from the ceph-users Mailing List for a thread titled "Have an inconsistent PG, repair n...
- 02:49 PM Bug #21259: bluestore: segv in BlueStore::TwoQCache::_trim
- Sage, to be honest I'm still uncertain if that patch is related to this issue.
I haven't managed to reproduce this ...
- 02:41 PM Bug #21259 (Resolved): bluestore: segv in BlueStore::TwoQCache::_trim
- Igor deduced this was missing backports in luminous. They're merged now, will be in 12.2.5:
https://github.com/ce...
- 02:17 PM Bug #21259: bluestore: segv in BlueStore::TwoQCache::_trim
- The same issue here - segfaults on 12.2.4 when recovery:...
04/05/2018
- 06:12 PM Bug #22678: block checksum mismatch from rocksdb
- After the crash the OSD restarts and keeps running for a few hours before it crashes again.
I have uploaded a log file wit...
- 12:08 PM Bug #22678: block checksum mismatch from rocksdb
- Is the error permanent in the sense that an affected OSD doesn't start and must be recreated?
Could you please pro...
- 02:49 PM Bug #22102: BlueStore crashed on rocksdb checksum mismatch
- Hrm, this run got a crc *near* EOF, but not past it....
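The debugging approach described in the 04/12 update (checksum the buffer immediately after the pread syscall, before any other code touches it) looks roughly like this sketch; it is not BlueStore code, and the expected-checksum plumbing is hypothetical:

    # Verify a read right at the syscall boundary, so bad data can be
    # attributed to pread(2)/the device rather than later buffer handling.
    import os
    import zlib

    def verified_pread(fd, length, offset, expected_crc):
        buf = os.pread(fd, length, offset)   # single unbuffered syscall
        if len(buf) != length:
            raise IOError("short read: %d of %d at %d" % (len(buf), length, offset))
        crc = zlib.crc32(buf)                # computed before buf is used anywhere
        if crc != expected_crc:
            raise IOError("crc mismatch at %d: got %#x, want %#x"
                          % (offset, crc, expected_crc))
        return buf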
04/04/2018
- 07:49 PM Bug #22102: BlueStore crashed on rocksdb checksum mismatch
- Scratch that, Mark didn't hit the assert for num_readers == 0, and the core indicates the file isn't deleted.
_read_ra...
- 03:40 PM Bug #22678: block checksum mismatch from rocksdb
- I have similar issue with two OSDs (12.2.4) running on the same host. Recreating OSDs did not have any effect, I get ...
- 03:12 PM Bug #23540: FAILED assert(0 == "can't mark unloaded shard dirty") with compression enabled
- Francisco,
thanks for the update, much appreciated.
Curious if you can collect a log for the crashing OSD, with d...
- 02:53 PM Bug #23540: FAILED assert(0 == "can't mark unloaded shard dirty") with compression enabled
- I disabled compression for a while and no OSDs got the error; after enabling it again they went back to getting the proble...
04/03/2018
- 09:36 PM Bug #22102: BlueStore crashed on rocksdb checksum mismatch
- Current theory: bluefs is not protecting against a file open for read that is deleted. Mark observes that he sees th...
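A schematic of the guard this theory says is missing: a file unlinked while still open for read has to keep its extents until the last reader drops its reference, otherwise the extents can be reused under the reader. This illustrates the race only; it is not BlueFS code and all names are hypothetical:

    import threading

    class File:
        def __init__(self, name):
            self.name = name
            self.num_readers = 0    # readers currently using this file
            self.deleted = False

    class SimpleFS:
        def __init__(self):
            self.lock = threading.Lock()
            self.files = {}

        def _release_extents(self, f):
            print("extents of %s released" % f.name)  # safe: no readers left

        def open_for_read(self, name):
            with self.lock:
                f = self.files[name]
                f.num_readers += 1  # pin: unlink must defer extent release
                return f

        def close_read(self, f):
            with self.lock:
                f.num_readers -= 1
                if f.deleted and f.num_readers == 0:
                    self._release_extents(f)  # deferred until last reader is gone

        def unlink(self, name):
            with self.lock:
                f = self.files.pop(name)
                f.deleted = True
                if f.num_readers == 0:  # without this check: reader sees reused extents
                    self._release_extents(f)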
04/02/2018
- 11:50 PM Bug #23540: FAILED assert(0 == "can't mark unloaded shard dirty") with compression enabled
- Yeah. The whole cluster has compression enabled
- 09:56 PM Bug #23540: FAILED assert(0 == "can't mark unloaded shard dirty") with compression enabled
- Hi Francisco,
wondering if you have compression enabled for any of your pools or the whole bluestore?
- 08:22 PM Bug #23540 (Resolved): FAILED assert(0 == "can't mark unloaded shard dirty") with compression ena...
- We are using the latest ceph luminous version (12.2.4), and we have a SATA pool tiered by an SSD pool. All using blue...
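For reference, the distinction behind the question above: in Luminous, bluestore compression can be switched on globally or per pool. The pool name and algorithm below are examples only:

    # Global, in ceph.conf (applies to every bluestore OSD):
    [osd]
    bluestore compression mode = aggressive
    bluestore compression algorithm = snappy

    # Or per pool, from the CLI ("sata-pool" is a placeholder):
    ceph osd pool set sata-pool compression_mode aggressive
    ceph osd pool set sata-pool compression_algorithm snappy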
- 08:25 AM Bug #21259: bluestore: segv in BlueStore::TwoQCache::_trim
- We have also had this fault a number of times.
This was during a migration to bluestore - so we were backfilling for...
03/29/2018
- 06:38 PM Bug #23040 (Resolved): bluestore: statfs available can go negative
- 06:38 PM Backport #23074 (Resolved): luminous: bluestore: statfs available can go negative
- 01:19 PM Backport #23074: luminous: bluestore: statfs available can go negative
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/20554
merged
- 12:43 PM Bug #21259: bluestore: segv in BlueStore::TwoQCache::_trim
- 08:17 AM Bug #23141 (Resolved): BlueFS reports rotational journals if BDEV_WAL is not set
- 08:17 AM Backport #23173 (Resolved): luminous: BlueFS reports rotational journals if BDEV_WAL is not set
03/28/2018
- 10:27 PM Backport #23173: luminous: BlueFS reports rotational journals if BDEV_WAL is not set
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/20651
merged
- 04:14 AM Backport #23226 (In Progress): luminous: bluestore_cache_data uses too much memory
- https://github.com/ceph/ceph/pull/21059
03/26/2018
- 03:29 PM Bug #23463 (Can't reproduce): src/os/bluestore/StupidAllocator.cc: 336: FAILED assert(rm.empty())
- The ceph-volume nightly tests have seen this failure on one run so far (March 25th) with 2 out of 6 OSDs deployed. We...
- 03:58 AM Bug #23459: BlueStore kv_sync_thread() crash
- crash dump attached
- 03:55 AM Bug #23459 (Can't reproduce): BlueStore kv_sync_thread() crash
- 2018-03-25 06:49:02.894926 7ff4fdc97700 -1 *** Caught signal (Aborted) **
in thread 7ff4fdc97700 thread_name:bstore...
03/22/2018
- 08:21 PM Documentation #23443 (Resolved): doc: object -> file -> disk is wrong for bluestore
- http://docs.ceph.com/docs/master/architecture/#storing-data
object -> file -> disk
is wrong now (for bluesto...
- 10:54 AM Bug #23372: osd: segfault
- Nokia ceph-users wrote:
> We have a 5-node cluster with 5 mons and 120 OSDs.
>
> One of the OSDs (osd.7) crash...
03/21/2018
- 08:45 PM Bug #23246 (Fix Under Review): [OSD bug] KernelDevice.cc: 539: FAILED assert(r == 0)
- Pull request: https://github.com/ceph/ceph/pull/20996.
- 04:33 PM Support #23433 (New): Ceph cluster doesn't start - ERROR: error creating empty object store in /d...
- After running make vstart, when I try to start a ceph cluster with
@MON=3 OSD=1 MDS=1 MGR=1 RGW=1 ../src/vstart.sh ...
03/20/2018
- 11:28 PM Bug #23426: aio thread got No space left on device
- Yeah, "the assertion came from _aio_t::get_return_value_":https://github.com/ceph/ceph/blob/820dac980e9416fe05998d50c...
- 10:33 PM Bug #23426: aio thread got No space left on device
- might be a dupe of #23333
- 10:32 PM Bug #23426 (Won't Fix): aio thread got No space left on device
- Seems reproducible on all distros
Runs:
http://pulpito.ceph.com/teuthology-2018-03-20_05:02:01-smoke-master-tes...
- 10:08 PM Bug #22464: Bluestore: many checksum errors, always 0x6706be76 (which matches a zero block)
- I'm seeing the same problem here.
When I get the notification about the deep scrub error, I don't need to do "repa...
- 12:57 PM Bug #23333 (In Progress): bluestore: ENODATA on aio
- > Mar 13 15:55:45 ceph02 kernel: [362540.919407] print_req_error: critical medium error, dev sde, sector 5245986552
...