Fix #5388

open

osd: localized reads (from replicas) reordered wrt writes

Added by Mike Bryant almost 11 years ago. Updated over 4 years ago.

Status: New
Priority: High
Assignee: -
Category: OSD
Target version: -
% Done: 0%
Source: Community (user)

Description

I'm using HBase, with the hadoop-cephfs bindings, on top of a Ceph 0.61 cluster.
I'm seeing instances where reading part of a file returns all nulls instead of the valid data.
This shows up at the application level, where HBase tries to validate that a magic number is present, fails, and crashes.

I've been debugging the hadoop bindings, and they appear to be doing the right thing; they are just getting back nulls from the libcephfs layer.
The file it's reading incorrect data from is: ceph://null/hbase/tsdb/e50e991f7082c47e9172c5421af6e5f3/t/3847239684462394768

This is the interesting bit from the regionserver log:
2013-06-17 19:01:46,984 INFO org.apache.hadoop.fs.ceph.CephFileSystem: selectDataPool path=ceph://null/hbase/tsdb/e50e991f7082c47e9172c5421af6e5f3/.tmp/3566091379609591353 pool:repl=data:4 wanted=3
2013-06-17 19:01:47,172 INFO org.apache.hadoop.hbase.regionserver.StoreFile: Bloom added to HFile (ceph://null/hbase/tsdb/e50e991f7082c47e9172c5421af6e5f3/.tmp/3566091379609591353): 4.5k, 3687/3841 (96%)
2013-06-17 19:01:47,233 INFO org.apache.hadoop.hbase.regionserver.Store: Renaming flushed file at ceph://null/hbase/tsdb/e50e991f7082c47e9172c5421af6e5f3/.tmp/3566091379609591353 to ceph://null/hbase/tsdb/e50e991f7082c47e9172c5421af6e5f3/t/3847239684462394768
2013-06-17 19:01:47,841 DEBUG org.apache.hadoop.fs.ceph.CephInputStream: CephInputStream constructor: initializing stream with fh 2651 and file length 2597375
2013-06-17 19:01:47,841 TRACE org.apache.hadoop.fs.ceph.CephInputStream: CephInputStream.seek: Seeking to position 2597315 on fd 2651
2013-06-17 19:01:47,841 TRACE org.apache.hadoop.fs.ceph.CephInputStream: CephInputStream.read: Reading 8 bytes from fd 2651
2013-06-17 19:01:47,841 TRACE org.apache.hadoop.fs.ceph.CephInputStream: CephInputStream.read: Reading 8 bytes from fd 2651: succeeded in reading 8 bytes
2013-06-17 19:01:47,841 TRACE org.apache.hadoop.fs.ceph.CephInputStream: CephInputStream.read: 8 byte output: 0000000000000000
(This is where it reads the trailer; the 8 bytes returned should begin with the ASCII string TRABLK.)
2013-06-17 19:01:47,842 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server serverName=ec02sv14.ocado.com,60020,1371491732490, load=(requests=264597, regions=132, usedHeap=2709, maxHeap=4062): Replay of HLog required. Forcing server shutdown
org.apache.hadoop.hbase.DroppedSnapshotException: region: tsdb,\x00\x00fQ\xB1\xE7`\x00\x00\x01\x00\x0E\xC2\x00\x00\x03\x00\x00\xC1,1371473213321.e50e991f7082c47e9172c5421af6e5f3.
at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1070)
at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:967)
at org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:915)
at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:394)
at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushOneForGlobalPressure(MemStoreFlusher.java:202)
at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.run(MemStoreFlusher.java:222)
Caused by: java.io.IOException: Trailer 'header' is wrong; does the trailer size match content?
at org.apache.hadoop.hbase.io.hfile.HFile$FixedFileTrailer.deserialize(HFile.java:1527)
at org.apache.hadoop.hbase.io.hfile.HFile$Reader.readTrailer(HFile.java:885)
at org.apache.hadoop.hbase.io.hfile.HFile$Reader.loadFileInfo(HFile.java:819)
at org.apache.hadoop.hbase.regionserver.StoreFile$Reader.loadFileInfo(StoreFile.java:1003)
at org.apache.hadoop.hbase.regionserver.StoreFile.open(StoreFile.java:382)
at org.apache.hadoop.hbase.regionserver.StoreFile.createReader(StoreFile.java:438)
at org.apache.hadoop.hbase.regionserver.Store.internalFlushCache(Store.java:557)
at org.apache.hadoop.hbase.regionserver.Store.flushCache(Store.java:496)
at org.apache.hadoop.hbase.regionserver.Store.access$100(Store.java:83)
at org.apache.hadoop.hbase.regionserver.Store$StoreFlusherImpl.flushCache(Store.java:1576)
at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1046)
... 5 more

For comparison this is an instance where it was fine:
2013-06-17 19:01:45,872 TRACE org.apache.hadoop.fs.ceph.CephInputStream: CephInputStream.read: Reading 8 bytes from fd 2644: succeeded in reading 8 bytes
2013-06-17 19:01:45,872 TRACE org.apache.hadoop.fs.ceph.CephInputStream: CephInputStream.read: 8 byte output: 545241424c4b2224
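To make the two reads easier to compare, here is a minimal Python sketch that decodes the hex dumps from the log lines above and works out where the failing read sits relative to the end of the file. All numbers are copied from the logs; the helper name `describe` is just for illustration and is not part of HBase or the Ceph bindings.

```python
# Values copied verbatim from the regionserver log lines above.
FILE_LENGTH = 2597375      # "file length 2597375"
SEEK_POSITION = 2597315    # "Seeking to position 2597315"
BAD_READ_HEX = "0000000000000000"
GOOD_READ_HEX = "545241424c4b2224"


def describe(hex_string):
    """Return the raw bytes and whether every byte is zero (hypothetical helper)."""
    raw = bytes.fromhex(hex_string)
    return raw, all(b == 0 for b in raw)


# How far from the end of the file the failing read started.
trailer_offset_from_end = FILE_LENGTH - SEEK_POSITION

good_bytes, good_is_null = describe(GOOD_READ_HEX)
bad_bytes, bad_is_null = describe(BAD_READ_HEX)

print("read starts", trailer_offset_from_end, "bytes before EOF")
print("good read:", good_bytes, "all null:", good_is_null)
print("bad read: ", bad_bytes, "all null:", bad_is_null)
```

The good read decodes to an 8-byte ASCII value starting with TRABLK, while the failing read, taken 60 bytes before the end of the file, is eight zero bytes.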

When I look at the same file through a fuse mountpoint, I can see the correct sequence of bytes.
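A check like this can be scripted against any mounted path. The following is only a sketch: the 60-byte trailer offset is an assumption taken from the log above (file length minus seek position), the magic value is the one seen in the good read, and the function names are hypothetical.

```python
import os


def read_range(path, offset, length):
    """Read `length` bytes starting at `offset` from a regular file."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)


def looks_nulled(data):
    """True if the data is non-empty and consists only of zero bytes."""
    return len(data) > 0 and all(b == 0 for b in data)


def check_trailer(path, trailer_size=60, magic=b'TRABLK"$'):
    """Inspect the first bytes of the trailer region.

    trailer_size=60 and the magic value are assumptions derived from the
    log excerpt in this report, not from the HFile format documentation.
    """
    size = os.path.getsize(path)
    head = read_range(path, size - trailer_size, len(magic))
    return {
        "size": size,
        "bytes": head.hex(),
        "nulled": looks_nulled(head),
        "magic_ok": head == magic,
    }
```

Calling `check_trailer` on the affected file under the fuse mountpoint (the mount path depends on the local setup) reports whether that view of the file has the magic or the nulls.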
