Project

General

Profile

Bug #23390

Identifying NVMe via PCI serial isn't sufficient (Bluestore/SPDK)

Added by Andreas Merk over 4 years ago. Updated almost 4 years ago.

Status:
Resolved
Priority:
Normal
Assignee:
-
Target version:
% Done:

0%

Source:
other
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Hi,
the manual requires a serial to put in to be able to use the NVMe with spdk.
http://docs.ceph.com/docs/master/rados/configuration/bluestore-config-ref/

But I have 2 NVME types which either does not report a serial or the serial is zeroed out.

:~# nvme list
Node SN Model Namespace Usage Format FW Rev
---------------- -------------------- ---------------------------------------- --------- -------------------------- ---------------- --------
/dev/nvme0n1 BTPY72750DXU512F INTEL SSDPEKKW512G7 1 512.11 GB / 512.11 GB 512 B + 0 B PSF121C
/dev/nvme1n1 P02802105785 PLEXTOR PX-1TM9PeGN 1 1.02 TB / 1.02 TB 512 B + 0 B 1.01
/dev/nvme2n1 P02802105784 PLEXTOR PX-1TM9PeGN 1 1.02 TB / 1.02 TB 512 B + 0 B 1.01
/dev/nvme3n1 P02802105786 PLEXTOR PX-1TM9PeGN 1 1.02 TB / 1.02 TB 512 B + 0 B 1.01

lspci vvvv (for the first Intel 600p NVMe - no serial shown)
02:00.0 System peripheral: Intel Corporation Xeon Processor D Family QuickData Technology Register DMA Channel 0
Subsystem: Super Micro Computer Inc Xeon Processor D Family QuickData Technology Register DMA Channel 0
Physical Slot: 0
Control: I/O
Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Interrupt: pin A routed to IRQ 16
Region 0: Memory at fb906000 (64-bit, non-prefetchable) [size=8K]
Capabilities: [40] Express (v2) Endpoint, MSI 00
DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset+
DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+ Unsupported+
RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop- FLReset-
MaxPayload 256 bytes, MaxReadReq 256 bytes
DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr- TransPend-
LnkCap: Port #0, Speed unknown, Width x0, ASPM not supported, Exit Latency L0s <64ns, L1 <1us
ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk-
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed unknown, Width x0, TrErr- Train- SlotClk- DLActive- BWMgmt- ABWMgmt-
DevCap2: Completion Timeout: Not Supported, TimeoutDis+, LTR-, OBFF Not Supported
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-
Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
Compliance De-emphasis: 6dB
LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete
, EqualizationPhase1-
EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
Capabilities: [80] Power Management version 3
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME
Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [90] MSI: Enable- Count=1/1 Maskable+ 64bit-
Address: 00000000 Data: 0000
Masking: 00000000 Pending: 00000000
Capabilities: [ac] MSI-X: Enable+ Count=1 Masked-
Vector table: BAR=0 offset=00001000
PBA: BAR=0 offset=00001800
Capabilities: [100 v1] Advanced Error Reporting
UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UESvrt: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP+ ECRC- UnsupReq- ACSViol-
CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
AERCap: First Error Pointer: 14, GenCap- CGenEn- ChkCap- ChkEn-
Capabilities: [150 v1] #12
Kernel driver in use: ioatdma
Kernel modules: ioatdma

lspci vvvv (for a plextor NVMe - zeroed out serial)
06:00.0 Non-Volatile memory controller: Lite-On Technology Corporation Device 23f1 (rev 01) (prog-if 02 [NVM Express])
Subsystem: Marvell Technology Group Ltd. Device 1093
Physical Slot: 0-2
Control: I/O
Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0, Cache Line Size: 32 bytes
Interrupt: pin A routed to IRQ 16
Region 0: Memory at fb800000 (64-bit, non-prefetchable) [size=16K]
Capabilities: [40] Power Management version 3
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME
Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
Address: 0000000000000000 Data: 0000
Masking: 00000000 Pending: 00000000
Capabilities: [70] Express (v2) Endpoint, MSI 00
DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s unlimited, L1 unlimited
ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset+
DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+ Unsupported+
RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop- FLReset-
MaxPayload 128 bytes, MaxReadReq 4096 bytes
DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr- TransPend-
LnkCap: Port #0, Speed 8GT/s, Width x4, ASPM L1, Exit Latency L0s <512ns, L1 <64us
ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
LnkCtl: ASPM L1 Enabled; RCB 64 bytes Disabled- CommClk+
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 8GT/s, Width x4, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
DevCap2: Completion Timeout: Not Supported, TimeoutDis+, LTR+, OBFF Via message
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
Compliance De-emphasis: 6dB
LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete+, EqualizationPhase1+
EqualizationPhase2+, EqualizationPhase3+, LinkEqualizationRequest

Capabilities: [b0] MSI-X: Enable+ Count=19 Masked-
Vector table: BAR=0 offset=00002000
PBA: BAR=0 offset=00003000
Capabilities: [100 v2] Advanced Error Reporting
UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
AERCap: First Error Pointer: 00, GenCap+ CGenEn+ ChkCap+ ChkEn+
Capabilities: [148 v1] Device Serial Number 00-00-00-00-00-00-00-00
Capabilities: [158 v1] Power Budgeting
Capabilities: [168 v1] Alternative Routing-ID Interpretation (ARI)
ARICap: MFVC- ACS-, Next Function: 0
ARICtl: MFVC- ACS-, Function Group: 0
Capabilities: [178 v1] #19
Capabilities: [2b8 v1] Latency Tolerance Reporting
Max snoop latency: 0ns
Max no snoop latency: 0ns
Capabilities: [2c0 v1] L1 PM Substates
L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+
PortCommonModeRestoreTime=10us PortTPowerOnTime=10us
Kernel driver in use: nvme
Kernel modules: nvme

Another way to identify NVMes needs to be added.

History

#1 Updated by Greg Farnum over 4 years ago

  • Project changed from Ceph to bluestore
  • Category deleted (OSD)
  • Affected Versions deleted (v0.20, v0.21, v0.21.1, v0.21.2, v0.21.3, v0.21.4, v0.22, v0.22.1, v0.22.2, v0.22.3, v0.23, v0.23.1, v0.23.2, v0.24, v0.24.1, v0.24.2, v0.24.3, v0.25, v0.25.1, v0.25.2, v0.25.3, v0.26, v0.26.1, v0.27, v0.27.1, v0.28, v0.29, v0.30, v0.31, v0.32, v0.33, v0.34, v0.35, v0.36, v0.37, v0.38, v0.39, v0.40, v0.41, v0.42, v0.43, v0.44, v0.45, v0.46, v0.47, v0.48, v0.49, v0.50, v0.51, v0.52a, v0.53a, v0.53b, v0.53c, v0.54a, v0.54b, v0.55a, v0.55b, v0.55c, v0.55d, v0.56, v0.57a, v0.57b, v0.57c, v0.58, v0.59, v0.60, v0.61 - Cuttlefish, v0.62a, v0.62b, v0.63, v0.64, v0.65, v0.66, v0.67 - Dumpling, v0.67rc, v0.67rc - continued, v0.68, v0.68 - continued, v0.69, v0.70, v0.71, v0.72 Emperor, v0.73, v0.74, v0.75, v0.76a, v0.76b, v0.77, 0.78, 0.79, 0.80rc, 0.80, v0.81, 0.82, 0.83, 0.83 cont., 0.84, 0.84 cont., 0.85, 0.85 cont., 0.86, 0.88, 0.89, 0.90, v.91, v.actually90, v.actually91, v0.92, v0.93 - Last Hammer Sprint, v0.94, v0.95, v9.0.2, v9.0.3, v9.0.4, v9.0.5, v9.0.6, v9.0.7, v9.0.8, v10.0.4, v0.80.10, v0.80.11, v0.80.12, v0.94.10, v0.94.11, v0.94.2, v0.94.3, v0.94.4, v0.94.5, v0.94.6, v0.94.7, v0.94.8, v0.94.9, v10.0.0, v10.1.1, v10.2.0, v10.2.1, v10.2.10, v10.2.11, v10.2.2, v10.2.3, v10.2.4, v10.2.5, v10.2.6, v10.2.7, v10.2.8, v10.2.9, v11.1.0, v11.2.0, v11.2.1, v11.2.2, v12.0.0, v12.1.0, v12.2.0, v12.2.1, v12.2.2, v12.2.3, v12.2.4, v12.2.5, v13.0.0, v9.1.1, v9.2.1, v9.2.2)

#2 Updated by Kefu Chai almost 4 years ago

  • Status changed from New to Resolved

fixed in master. now the spdk backend is using PCI device's selector instead. see https://github.com/ceph/ceph/pull/24144

Also available in: Atom PDF