LAYER 2 - RELIABLE DELIVERY

Data Link Layer

Complete Technical Deep-Dive: DLCMSM, ACK/NAK Protocol, LCRC, Retry Mechanism, DLLPs, Flow Control Initialization, and Flit Mode Operation

1. Data Link Layer Overview

What is the Data Link Layer?

The Data Link Layer (DLL) provides reliable, point-to-point delivery of Transaction Layer Packets (TLPs) across a single PCIe link. It sits between the Transaction Layer and Physical Layer, adding error detection (LCRC), sequencing, and a retry mechanism to ensure TLPs are delivered correctly despite physical layer errors.

Position in Protocol Stack

// PCIe Protocol Stack
┌─────────────────────────────────────────────────────────┐
│                     SOFTWARE LAYER                      │
│            Device Drivers, OS, Applications             │
├─────────────────────────────────────────────────────────┤
│                    TRANSACTION LAYER                    │
│         TLP Generation, Flow Control, Ordering          │
│                        ↓ TLPs ↓                         │
├─────────────────────────────────────────────────────────┤
│             DATA LINK LAYER  (YOU ARE HERE)             │
│      Seq Num, LCRC, ACK/NAK, Retry, FC Init, DLLPs      │
│                    ↓ TLPs + DLLPs ↓                     │
├─────────────────────────────────────────────────────────┤
│                     PHYSICAL LAYER                      │
│       Encoding, Scrambling, Serialization, LTSSM        │
├─────────────────────────────────────────────────────────┤
│                     ELECTRICAL/PHY                      │
│            Signaling, Equalization, Clocking            │
└─────────────────────────────────────────────────────────┘

Core Responsibilities

TLP Integrity
  • Add LCRC (32-bit)
  • Add sequence number
  • Verify LCRC on receive
  • Detect corruption
Reliable Delivery
  • ACK/NAK protocol
  • Replay buffer
  • Retransmission
  • Duplicate detection
Link Management
  • Flow Control Init
  • Power Management DLLPs
  • Link state tracking
  • Data Link Feature Exchange

2. Data Link Layer Architecture

// Data Link Layer Internal Architecture
┌─────────────────────────────────────────────────────────────────────────┐
│                             DATA LINK LAYER                             │
│                                                                         │
│   ┌─────────────────────────────┐    ┌─────────────────────────────┐    │
│   │        TRANSMIT SIDE        │    │        RECEIVE SIDE         │    │
│   │                             │    │                             │    │
│   │  TLP from Transaction Layer │    │  Packet from PHY            │    │
│   │           ↓                 │    │           ↓                 │    │
│   │  ┌─────────────────┐        │    │  ┌─────────────────┐        │    │
│   │  │ Seq Num Assign  │        │    │  │ LCRC Check      │        │    │
│   │  └────────┬────────┘        │    │  └────────┬────────┘        │    │
│   │           ↓                 │    │           ↓                 │    │
│   │  ┌─────────────────┐        │    │  ┌─────────────────┐        │    │
│   │  │ LCRC Calculate  │        │    │  │ Seq Num Check   │        │    │
│   │  └────────┬────────┘        │    │  └────────┬────────┘        │    │
│   │           ↓                 │    │           ↓                 │    │
│   │  ┌─────────────────┐        │    │  ┌─────────────────┐        │    │
│   │  │ Replay Buffer   │◄───────┼────┼──│ ACK/NAK Logic   │        │    │
│   │  │ (Store TLP)     │        │    │  │                 │        │    │
│   │  └────────┬────────┘        │    │  └────────┬────────┘        │    │
│   │           ↓                 │    │           ↓                 │    │
│   │  To PHY Layer               │    │  TLP to Transaction Layer   │    │
│   │                             │    │                             │    │
│   │  ┌─────────────────┐        │    │  ┌─────────────────┐        │    │
│   │  │ DLLP Generator  │◄───────┼────┼──│ DLLP Receiver   │        │    │
│   │  │ (ACK/NAK/FC)    │        │    │  │ (ACK/NAK/FC)    │        │    │
│   │  └─────────────────┘        │    │  └─────────────────┘        │    │
│   └─────────────────────────────┘    └─────────────────────────────┘    │
│                                                                         │
│   ┌─────────────────────────────────────────────────────────────────┐   │
│   │                      DLCMSM State Machine                       │   │
│   │              DL_Inactive ──► DL_Init ──► DL_Active              │   │
│   └─────────────────────────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────────────────────────┘

Transmit Side Processing

TLP Transmission Sequence (Step-by-Step)

  1. Receive TLP from Transaction Layer (already includes ECRC if enabled)
  2. Assign Sequence Number: Use NEXT_TRANSMIT_SEQ, then increment
  3. Prepend Sequence Number: 2-byte header added to TLP
  4. Calculate LCRC: 32-bit CRC over [Seq Num + TLP]
  5. Append LCRC: 4-byte trailer added
  6. Store in Replay Buffer: Keep copy for potential retransmission
  7. Pass to Physical Layer: For encoding and transmission
  8. Start/Reset REPLAY_TIMER: Track ACK timeout

// TLP Packet Structure at Data Link Layer

Before DLL processing:
┌─────────────────────┬──────────────┬────────────┐
│ TLP Header (3-4 DW) │ Data Payload │ ECRC (opt) │
└─────────────────────┴──────────────┴────────────┘

After DLL processing (transmitted on link):
┌───────────┬─────────────────────┬──────────────┬────────────┬───────────┐
│ Seq Num   │ TLP Header (3-4 DW) │ Data Payload │ ECRC (opt) │ LCRC      │
│ (2 bytes) │                    Original TLP                 │ (4 bytes) │
└───────────┴─────────────────────┴──────────────┴────────────┴───────────┘
│◄─────────────────── LCRC covers this range ────────────────►│
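
The transmit steps above can be sketched in a few lines of Python. This is a minimal illustration rather than a spec-exact implementation: `frame_tlp` and `check_frame` are hypothetical helper names, and `zlib.crc32` stands in for the LCRC because it shares the CRC-32 polynomial, seed, and final inversion, while the byte order chosen for the sequence field and the CRC is an assumption.

```python
import struct
import zlib
from typing import Tuple

SEQ_MASK = 0xFFF  # sequence numbers are 12 bits wide


def frame_tlp(seq_num: int, tlp: bytes) -> bytes:
    """Prepend the 2-byte sequence field and append a 4-byte CRC."""
    header = struct.pack(">H", seq_num & SEQ_MASK)  # upper 4 bits left at 0
    body = header + tlp
    # Stand-in for the LCRC (assumption): CRC-32 over [Seq Num + TLP].
    lcrc = struct.pack("<I", zlib.crc32(body))
    return body + lcrc


def check_frame(frame: bytes) -> Tuple[int, bytes]:
    """Verify the trailing CRC and return (seq_num, tlp); raise on mismatch."""
    body, lcrc = frame[:-4], frame[-4:]
    if struct.pack("<I", zlib.crc32(body)) != lcrc:
        raise ValueError("LCRC mismatch")
    (seq_field,) = struct.unpack(">H", body[:2])
    return seq_field & SEQ_MASK, body[2:]
```

A frame built by `frame_tlp` is always 6 bytes longer than the TLP it carries, which matches the 2-byte sequence field and 4-byte LCRC in the diagram.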

Receive Side Processing

TLP Reception Sequence (Step-by-Step)

  1. Receive Packet from Physical Layer
  2. Verify LCRC: Calculate CRC, compare with received LCRC
    • If LCRC invalid → Schedule NAK, discard packet
  3. Extract Sequence Number: From 2-byte header
  4. Check Sequence Number:
    • If Seq == NEXT_RCV_SEQ → Accept TLP, increment NEXT_RCV_SEQ
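    • If Seq < NEXT_RCV_SEQ (using modulo arithmetic) → Duplicate: discard the TLP and schedule an ACK for NEXT_RCV_SEQ - 1 so the transmitter can purge it
    • If Seq > NEXT_RCV_SEQ → Out-of-order (a TLP was lost): discard the TLP and schedule a NAK carrying NEXT_RCV_SEQ - 1, the last sequence number received correctly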
  5. Forward TLP to Transaction Layer (strip Seq Num and LCRC)
  6. Schedule ACK: Update ACK sequence number to send
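
The receive-side decisions can be modeled as a small class. This is a sketch under stated assumptions: the `Receiver` class and its return tuple are hypothetical, the duplicate window uses the same [1, 2047] modulo range this document gives for sequence comparisons, and both the duplicate ACK and the NAK carry NEXT_RCV_SEQ - 1 (the last good sequence number), as in the PCIe base specification.

```python
SEQ_MOD = 4096  # 12-bit sequence space


class Receiver:
    """Hypothetical model of the receive-side sequence check."""

    def __init__(self):
        self.next_rcv_seq = 0
        self.nak_scheduled = False

    def on_tlp(self, seq, lcrc_ok):
        """Return (dllp, seq_in_dllp, delivered) for one incoming TLP."""
        last_good = (self.next_rcv_seq - 1) % SEQ_MOD
        if lcrc_ok and seq == self.next_rcv_seq:
            # Expected TLP: accept it and acknowledge it.
            self.next_rcv_seq = (seq + 1) % SEQ_MOD
            self.nak_scheduled = False
            return ("Ack", seq, True)
        if lcrc_ok and 1 <= (self.next_rcv_seq - seq) % SEQ_MOD <= 2047:
            # Duplicate of a TLP already accepted: drop it, re-acknowledge.
            return ("Ack", last_good, False)
        # Bad LCRC or a gap in the sequence: drop it and NAK only once.
        if self.nak_scheduled:
            return (None, None, False)
        self.nak_scheduled = True
        return ("Nak", last_good, False)
```

The `nak_scheduled` flag keeps the model from sending a second NAK while the first one is still outstanding.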

3. DLCMSM (Data Link Control and Management State Machine)

The DLCMSM controls the overall operational state of the Data Link Layer. It determines when TLPs can be exchanged and manages the transition between inactive, initialization, and active states.

DLCMSM State Diagram

States:
  • DL_Inactive: PHY not in L0, no TLPs allowed (reports DL_Down status)
  • DL_Init: FC initialization, exchange InitFC DLLPs (reports DL_Down status)
  • DL_Active: TLP exchange OK, normal operation (reports DL_Up status)

Transitions:
  • DL_Inactive → DL_Init: PHY LinkUp (LTSSM → L0)
  • DL_Init → DL_Active: FC Init complete (VC0 ready)
  • DL_Init → DL_Inactive: PHY leaves L0
  • DL_Active → DL_Inactive: LinkDown / LTSSM leaves L0

DL_Inactive State

DL_Inactive Conditions

  • Physical Layer LTSSM is NOT in L0 state
  • Link is training, in low-power state, or disabled
  • No TLPs or DLLPs can be transmitted/received
  • Replay buffer is cleared
  • Sequence counters are reset: NEXT_TRANSMIT_SEQ and NEXT_RCV_SEQ to 000h, ACKD_SEQ to FFFh (so that no TLPs count as outstanding)

Exit Condition: Physical Layer signals LinkUp (LTSSM enters L0)

DL_Init State

DL_Init Activities

  • Flow Control Initialization: Exchange InitFC1 and InitFC2 DLLPs
  • Data Link Feature Exchange: Negotiate DL features (PCIe 4.0+)
  • TLPs are NOT transmitted (only DLLPs)
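  • Both ports must complete FC initialization for VC0, the default Virtual Channel; other enabled VCs are initialized later, while the link is already in DL_Active

Exit Condition: FC initialization completes for VC0 while the Physical Layer still reports LinkUp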

DL_Active State

DL_Active - Normal Operation

  • TLPs can be transmitted and received
  • ACK/NAK protocol active
  • Flow Control Updates (UpdateFC DLLPs) exchanged
  • Replay mechanism operational
  • DL_Up Status reported to Transaction Layer

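Exit Condition: Physical Layer reports LinkDown. A REPLAY_NUM rollover does not leave DL_Active by itself: it triggers link retraining, and the state is exited only if retraining fails and the link goes down.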
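
The three states and their transitions can be captured in a compact transition function. This is a minimal sketch: `DLState`, `next_state`, and `dl_status` are hypothetical names, and the Physical Layer and Flow Control logic are reduced to two boolean inputs.

```python
from enum import Enum


class DLState(Enum):
    DL_INACTIVE = "DL_Inactive"
    DL_INIT = "DL_Init"
    DL_ACTIVE = "DL_Active"


def next_state(state, link_up, fc_init_done):
    """Return the next DLCMSM state for the given inputs."""
    if not link_up:
        # Losing the physical link always returns to DL_Inactive.
        return DLState.DL_INACTIVE
    if state is DLState.DL_INACTIVE:
        return DLState.DL_INIT
    if state is DLState.DL_INIT and fc_init_done:
        return DLState.DL_ACTIVE
    return state


def dl_status(state):
    """Status reported to the Transaction Layer."""
    return "DL_Up" if state is DLState.DL_ACTIVE else "DL_Down"
```

Calling `next_state` once per event walks the link from DL_Inactive through DL_Init to DL_Active, and `dl_status` reports DL_Up only in the last of these.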

4. Data Link Layer Packets (DLLPs) - Complete Specification

DLLPs are packets generated and consumed entirely within the Data Link Layer. They are NOT subject to flow control and can always be transmitted when the Physical Layer is ready.

DLLP Format

// DLLP Structure (6 bytes total)
┌──────────┬──────────┬──────────┬──────────┬──────────┬──────────┐
│  Byte 0  │  Byte 1  │  Byte 2  │  Byte 3  │  Byte 4  │  Byte 5  │
│   Type   │ Field 1  │ Field 2  │ Field 3  │  CRC-16  │  CRC-16  │
│          │          │          │          │  (Low)   │  (High)  │
└──────────┴──────────┴──────────┴──────────┴──────────┴──────────┘
│◄────── CRC-16 covers these 4 bytes ──────►│

DLLP CRC-16
  Polynomial:    x^16 + x^12 + x^3 + x + 1 (100Bh; note that this is not the CRC-CCITT polynomial 1021h)
  Initial Value: 0xFFFF
  The computed remainder is complemented before it is placed in the CRC field.

Complete DLLP Type Encoding

Type Code   DLLP Name                   Category    Description
0000 0000   Ack                         ACK/NAK     Acknowledge TLPs up to sequence number
0001 0000   Nak                         ACK/NAK     Request replay of TLPs after sequence number
0010 0000   PM_Enter_L1                 Power Mgmt  Request L1 entry (software-directed, PCI-PM)
0010 0001   PM_Enter_L23                Power Mgmt  Request L2/L3 Ready entry
0010 0011   PM_Active_State_Request_L1  Power Mgmt  ASPM L1 request (sent by Upstream Ports only)
0010 0100   PM_Request_Ack              Power Mgmt  Acknowledge PM request
0011 0000   Vendor Specific             Vendor      Vendor-defined DLLP
0100 0xxx   InitFC1-P                   FC Init     FC Init Phase 1 - Posted (VC in xxx)
0101 0xxx   InitFC1-NP                  FC Init     FC Init Phase 1 - Non-Posted (VC in xxx)
0110 0xxx   InitFC1-Cpl                 FC Init     FC Init Phase 1 - Completion (VC in xxx)
1000 0xxx   UpdateFC-P                  FC Update   Update Posted credits (VC in xxx)
1001 0xxx   UpdateFC-NP                 FC Update   Update Non-Posted credits (VC in xxx)
1010 0xxx   UpdateFC-Cpl                FC Update   Update Completion credits (VC in xxx)
1100 0xxx   InitFC2-P                   FC Init     FC Init Phase 2 - Posted (VC in xxx)
1101 0xxx   InitFC2-NP                  FC Init     FC Init Phase 2 - Non-Posted (VC in xxx)
1110 0xxx   InitFC2-Cpl                 FC Init     FC Init Phase 2 - Completion (VC in xxx)

ACK/NAK DLLP Format

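Byte     Bits    Field                  Value
Byte 0   [7:0]   Type                   00h = Ack, 10h = Nak
Byte 1   [7:0]   Reserved               00h
Byte 2   [7:4]   Reserved               0000b
         [3:0]   AckNak_Seq_Num[11:8]   Upper 4 bits of the 12-bit sequence number
Byte 3   [7:0]   AckNak_Seq_Num[7:0]    Lower 8 bits of the 12-bit sequence number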
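
Packing and unpacking the four payload bytes of an Ack or Nak DLLP is a short exercise in bit manipulation. The sketch below assumes the layout in which byte 1 is reserved and the 12-bit AckNak_Seq_Num occupies the low nibble of byte 2 plus all of byte 3; `pack_acknak` and `unpack_acknak` are hypothetical helper names, and the trailing CRC-16 is left out.

```python
ACK_TYPE = 0x00
NAK_TYPE = 0x10


def pack_acknak(is_nak, seq_num):
    """Build the 4 payload bytes of an Ack or Nak DLLP (CRC-16 not included)."""
    if not 0 <= seq_num <= 0xFFF:
        raise ValueError("AckNak_Seq_Num must fit in 12 bits")
    dllp_type = NAK_TYPE if is_nak else ACK_TYPE
    return bytes([dllp_type, 0x00, (seq_num >> 8) & 0x0F, seq_num & 0xFF])


def unpack_acknak(body):
    """Return (is_nak, seq_num) from the 4 payload bytes."""
    dllp_type, _reserved, high, low = body
    if dllp_type not in (ACK_TYPE, NAK_TYPE):
        raise ValueError("not an Ack/Nak DLLP")
    return dllp_type == NAK_TYPE, ((high & 0x0F) << 8) | low
```

For example, `pack_acknak(False, 0xABC)` yields the bytes 00h, 00h, 0Ah, BCh.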

Flow Control DLLP Format

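Byte     Bits    Field
Byte 0   [7:4]   FC Type (InitFC1, InitFC2, or UpdateFC; P, NP, or Cpl)
         [3]     0
         [2:0]   VC ID (0-7)
Byte 1   [7:6]   Reserved
         [5:0]   HdrFC[7:2]
Byte 2   [7:6]   HdrFC[1:0]
         [5:4]   Reserved
         [3:0]   DataFC[11:8]
Byte 3   [7:0]   DataFC[7:0]

HdrFC = Header Credits (8-bit), DataFC = Data Credits (12-bit)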
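
Because the header and data credit fields straddle byte boundaries, a short encoder and decoder make the layout easier to follow. This is a sketch assuming the field positions listed above (HdrFC split across bytes 1 and 2, DataFC across bytes 2 and 3); `pack_fc` and `unpack_fc` are hypothetical names, and reserved bits are written as zero.

```python
def pack_fc(type_nibble, vc, hdr_fc, data_fc):
    """Build the 4 payload bytes of a Flow Control DLLP (CRC-16 not included)."""
    if not (0 <= type_nibble <= 0xF and 0 <= vc <= 7):
        raise ValueError("bad type or VC")
    if not (0 <= hdr_fc <= 0xFF and 0 <= data_fc <= 0xFFF):
        raise ValueError("credit value out of range")
    b0 = (type_nibble << 4) | vc          # bit 3 stays 0
    b1 = (hdr_fc >> 2) & 0x3F             # HdrFC[7:2]
    b2 = ((hdr_fc & 0x3) << 6) | ((data_fc >> 8) & 0xF)  # HdrFC[1:0], DataFC[11:8]
    b3 = data_fc & 0xFF                   # DataFC[7:0]
    return bytes([b0, b1, b2, b3])


def unpack_fc(body):
    """Decode the 4 payload bytes back into their fields."""
    b0, b1, b2, b3 = body
    return {
        "type": b0 >> 4,
        "vc": b0 & 0x7,
        "hdr_fc": ((b1 & 0x3F) << 2) | (b2 >> 6),
        "data_fc": ((b2 & 0x0F) << 8) | b3,
    }
```

With a type nibble of 4h (InitFC1-P), VC 0, 32 header credits, and 256 data credits, `pack_fc` produces 40h, 08h, 01h, 00h.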

5. ACK/NAK Protocol - Complete Technical Specification

The ACK/NAK protocol provides reliable delivery by acknowledging successfully received TLPs and requesting retransmission of missing or corrupted TLPs.

Sequence Number Management

// Sequence Number Counters (12-bit, range 0-4095)

TRANSMIT SIDE:
  NEXT_TRANSMIT_SEQ: Sequence number for next TLP to transmit
  ACKD_SEQ:          Highest sequence number acknowledged by receiver

RECEIVE SIDE:
  NEXT_RCV_SEQ:      Expected sequence number of next incoming TLP

// Modulo-4096 Arithmetic
All sequence number comparisons use modulo-4096 arithmetic:
  - After 4095 comes 0
  - "A < B" means: (B - A) mod 4096 is in range [1, 2047]
  - "A > B" means: (A - B) mod 4096 is in range [1, 2047]

// Example: If ACKD_SEQ = 4090 and we receive ACK(5):
  Outstanding = (NEXT_TRANSMIT_SEQ - 5 - 1) mod 4096
  This correctly handles wrap-around

Outstanding TLPs = (NEXT_TRANSMIT_SEQ - ACKD_SEQ - 1) mod 4096
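
The modulo comparisons and the outstanding-TLP formula translate directly into code. The helper names `seq_lt` and `outstanding` below are hypothetical; the arithmetic is the one stated above.

```python
SEQ_MOD = 4096  # 12-bit sequence space


def seq_lt(a, b):
    """True if sequence number a is 'before' b in modulo-4096 terms."""
    return 1 <= (b - a) % SEQ_MOD <= 2047


def outstanding(next_transmit_seq, ackd_seq):
    """Number of transmitted TLPs that have not been acknowledged yet."""
    return (next_transmit_seq - ackd_seq - 1) % SEQ_MOD
```

With ACKD_SEQ = 5 and NEXT_TRANSMIT_SEQ = 10, `outstanding(10, 5)` is 4 (sequence numbers 6 through 9). Across the wrap, `seq_lt(4090, 5)` is true because 5 lies 11 steps after 4090.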

ACK Rules

ACK DLLP Semantics and Rules

  • Ack(N) means: "I have successfully received all TLPs with sequence numbers ≤ N"
  • ACK is cumulative - acknowledges ALL TLPs up to and including N
  • Receiver MUST schedule an ACK before its AckNak_LATENCY_TIMER expires after receiving a TLP
  • ACK may acknowledge multiple TLPs at once
  • Transmitter purges all TLPs with sequence ≤ N from Replay Buffer upon receiving ACK(N)
  • Receiving ACK for already-ACKed sequence is silently ignored

NAK Rules

NAK DLLP Semantics and Rules

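  • Nak(N) means: "I have received all TLPs up to and including N correctly, but a later TLP was lost or corrupted; please retransmit from N + 1 onwards" (the receiver sets N = NEXT_RCV_SEQ - 1)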
  • NAK is sent when:
    • LCRC check fails on received packet
    • Received sequence number > expected (gap detected)
    • 8b/10b decode error (at data rates using 8b/10b)
  • Only ONE NAK can be outstanding at a time (NAK_SCHEDULED flag)
  • After sending NAK, receiver enters NAK_OUTSTANDING state
  • Exit NAK_OUTSTANDING when expected TLP is received correctly
  • Duplicate NAKs for same sequence are suppressed

// ACK/NAK Protocol Flow Example
Transmitter                              Receiver
     │                                        │
     │───TLP(Seq=10)─────────────────────────►│  Receive OK, NEXT_RCV_SEQ = 11
     │───TLP(Seq=11)─────────────────────────►│  Receive OK, NEXT_RCV_SEQ = 12
     │───TLP(Seq=12)───────X (corrupted)      │  LCRC fail
     │                                        │
     │◄──────────────────────────Nak(11)──────│  Last good Seq = 11, request retransmit
     │                                        │
     │  Purge Seq 10-11, replay from Seq=12   │
     │───TLP(Seq=12)─────────────────────────►│  Receive OK
     │───TLP(Seq=13)─────────────────────────►│  Receive OK
     │───TLP(Seq=14)─────────────────────────►│  Receive OK
     │                                        │
     │◄──────────────────────────Ack(14)──────│  All received OK
     │                                        │
     │  Purge Seq 12-14 from buffer           │

6. LCRC (Link CRC) Calculation

LCRC Specification

Polynomial:           0x04C11DB7 (CRC-32)
Polynomial Expansion: x³² + x²⁶ + x²³ + x²² + x¹⁶ + x¹² + x¹¹ + x¹⁰ + x⁸ + x⁷ + x⁵ + x⁴ + x² + x + 1
Initial Value:        0xFFFFFFFF
Final XOR:            0xFFFFFFFF (invert result)
Input/Output:         Bit-reversed (LSB first)
Residue:              0x1CDF4421 (when CRC verified over entire packet including LCRC)

// LCRC Coverage
┌──────────┬─────────────────────────────────────────────┬──────────┐
│ Seq Num  │                     TLP                     │   LCRC   │
│ 2 bytes  │      Header + Data + ECRC (if present)      │ 4 bytes  │
└──────────┴─────────────────────────────────────────────┴──────────┘
│◄────────── LCRC calculated over this range ───────────►│

// Note: LCRC does NOT cover framing symbols (STP, END, etc.)
// LCRC is recalculated at each link hop (switches regenerate LCRC)
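
To make the CRC parameters concrete, the sketch below computes a CRC-32 bit by bit with the polynomial, seed, final inversion, and LSB-first processing listed in the table. It is a generic CRC-32 illustration, not a model of how PCIe maps CRC bits onto the wire; `reflect32` and `crc32_bitwise` are hypothetical names.

```python
POLY = 0x04C11DB7


def reflect32(value):
    """Reverse the bit order of a 32-bit value."""
    return int(f"{value:032b}"[::-1], 2)


# LSB-first processing uses the bit-reversed polynomial.
POLY_REFLECTED = reflect32(POLY)


def crc32_bitwise(data):
    """CRC-32 with seed 0xFFFFFFFF, LSB-first input, and inverted result."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ POLY_REFLECTED if crc & 1 else crc >> 1
    return crc ^ 0xFFFFFFFF
```

With these parameters the function should agree with Python's `zlib.crc32`, which gives a convenient cross-check for the bitwise loop.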

LCRC vs ECRC Comparison

Aspect        LCRC                          ECRC
Scope         Single link (point-to-point)  End-to-end (source to destination)
Regenerated?  Yes, at every switch/bridge   No, preserved through switches
Coverage      Seq Num + TLP + ECRC          TLP Header (some fields) + Data
Detects       Link transmission errors      Switch/path corruption
Required?     Always                        Optional

7. Retry Mechanism - Complete Technical Analysis

Replay Buffer

Replay Buffer Architecture

The transmitter maintains a Replay Buffer containing copies of all transmitted but unacknowledged TLPs. This buffer enables retransmission when NAK is received or REPLAY_TIMER expires.

  • Store Operation: TLP copied to buffer before transmission
  • Purge Operation: Remove TLPs with sequence ≤ ACK'd sequence
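  • Replay Operation: After purging any TLPs that the NAK acknowledges, retransmit all TLPs still in the buffer, oldest first

Minimum Replay Buffer Size

Buffer Size (bytes) = Max_Payload_Size × Max_Outstanding_TLPs

This is a first-order estimate: it counts payload bytes only and ignores per-TLP overhead such as the TLP header.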

Example: 512 bytes MPS × 64 outstanding = 32 KB buffer
Example: 4096 bytes MPS × 32 outstanding = 128 KB buffer
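
The sizing formula is simple enough to check in code. In this sketch `replay_buffer_bytes` is a hypothetical helper, and the optional `per_tlp_overhead` parameter is an added assumption for anyone who wants to include header and CRC bytes in the estimate.

```python
def replay_buffer_bytes(max_payload_size, max_outstanding_tlps, per_tlp_overhead=0):
    """Estimate the replay buffer size in bytes.

    per_tlp_overhead is an optional allowance (assumption) for the bytes
    stored with each TLP besides its payload.
    """
    return (max_payload_size + per_tlp_overhead) * max_outstanding_tlps
```

The two examples above correspond to `replay_buffer_bytes(512, 64)` and `replay_buffer_bytes(4096, 32)`, which return 32768 and 131072 bytes respectively.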

Replay Timer and Counters

// Retry-Related Timers and Counters

REPLAY_TIMER
  Purpose: Detect lost ACKs and trigger retransmission
  Started: When TLP transmitted and buffer non-empty
  Reset:   When ACK received or replay initiated
  Timeout: Implementation-specific (based on link latency)

REPLAY_TIMER Timeout Value Calculation:
  REPLAY_TIMER ≥ 3 × (Max_Link_Latency)
  Where Max_Link_Latency includes:
    - Time to transmit largest TLP
    - Round-trip propagation delay
    - Receiver processing time
    - ACK transmission time
    - Transmitter processing time

REPLAY_NUM Counter
  Purpose:     Count consecutive replay attempts
  Incremented: On each replay initiation (NAK or timeout)
  Reset:       When ACK advances ACKD_SEQ
  Threshold:   REPLAY_NUM_ROLLOVER. REPLAY_NUM is a 2-bit counter, so it rolls over (11b → 00b) on the fourth consecutive replay

When REPLAY_NUM rolls over:
  → Enter Recovery state
  → Attempt link retraining
  → If retraining succeeds, the replay resumes
  → If retraining fails, link considered failed

Replay Protocol Rules

Normative Replay Rules

  1. Replay Initiation (NAK): Upon receiving Nak(N), transmitter MUST purge TLPs with sequence ≤ N and retransmit all remaining TLPs, starting from sequence N + 1
  2. Replay Initiation (Timeout): Upon REPLAY_TIMER expiration, transmitter MUST retransmit from oldest unACK'd TLP
  3. Replay Order: Replayed TLPs MUST be retransmitted in original sequence order
  4. New TLPs During Replay: New TLPs from the Transaction Layer MUST wait until the replay completes; they are not interleaved with replayed TLPs
  5. Sequence Numbers: Replayed TLPs retain their original sequence numbers
  6. LCRC: LCRC is NOT recalculated for replay (original LCRC used)
  7. REPLAY_NUM Limit: When REPLAY_NUM rolls over (the fourth consecutive replay without ACK advancement), initiate Recovery

// Replay State Machine
┌─────────────────┐
│    NORMAL TX    │◄──────────────────────────────┐
│                 │                               │
│  Transmit TLPs  │                               │
│  Start TIMER    │                               │
└────────┬────────┘                               │
         │                                        │
         ▼                                        │
   ┌────────────┐                                 │
   │ ACK Rcvd?  │──Yes──► Purge Buffer ──────────►│
   └─────┬──────┘         Reset REPLAY_NUM        │
         │No                                      │
         ▼                                        │
   ┌────────────┐                                 │
   │ NAK Rcvd?  │──Yes────────────┐               │
   └─────┬──────┘                 │               │
         │No                      ▼               │
         ▼                 ┌──────────────┐       │
   ┌────────────┐          │ REPLAY_NUM++ │       │
   │ Timer Exp? │──Yes────►│ Replay       │       │
   └─────┬──────┘          │ unACK'd TLPs ├───────┘
         │No               └──────────────┘
         │ (keep waiting)

┌──────────────────────────────────────────────────┐
│ If REPLAY_NUM rolls over (REPLAY_NUM_ROLLOVER):  │
│   → Enter Recovery State                         │
│   → Link Retraining                              │
└──────────────────────────────────────────────────┘
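
The transmit-side bookkeeping (replay buffer, ACK purge, NAK or timeout replay, and the REPLAY_NUM counter) can be modeled in a few dozen lines. This is a sketch under stated assumptions: the `Transmitter` class is hypothetical, REPLAY_NUM is modeled as a 2-bit counter that requests retraining when it wraps, a NAK is treated as acknowledging everything up to the sequence number it carries, and the timer itself is reduced to an explicit `on_replay_timeout` call.

```python
from collections import OrderedDict

SEQ_MOD = 4096  # 12-bit sequence space


class Transmitter:
    """Hypothetical model of the replay buffer and REPLAY_NUM counter."""

    def __init__(self):
        self.next_transmit_seq = 0
        self.ackd_seq = SEQ_MOD - 1      # nothing outstanding at start
        self.replay_num = 0              # modeled as a 2-bit counter
        self.retrain_requested = False
        self.buffer = OrderedDict()      # seq -> TLP, oldest first

    def send(self, tlp):
        """Assign a sequence number and keep a copy for possible replay."""
        seq = self.next_transmit_seq
        self.buffer[seq] = tlp
        self.next_transmit_seq = (seq + 1) % SEQ_MOD
        return seq

    def _purge(self, acked_seq):
        """Drop every buffered TLP at or before acked_seq (modulo 4096)."""
        purged = [s for s in self.buffer if (acked_seq - s) % SEQ_MOD < 2048]
        for seq in purged:
            del self.buffer[seq]
        if purged:                       # forward progress was made
            self.ackd_seq = acked_seq
            self.replay_num = 0

    def _replay(self):
        """Return the TLPs to resend, or request retraining on rollover."""
        self.replay_num = (self.replay_num + 1) & 0x3
        if self.replay_num == 0:
            self.retrain_requested = True
            return []
        return list(self.buffer.items())

    def on_ack(self, seq):
        self._purge(seq)

    def on_nak(self, seq):
        self._purge(seq)
        return self._replay()

    def on_replay_timeout(self):
        return self._replay()
```

Replayed TLPs keep their original sequence numbers because the model resends the stored `(seq, tlp)` pairs unchanged.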

8. Flow Control Initialization Protocol

Before TLPs can be exchanged, both ends of the link must complete Flow Control Initialization to advertise their buffer capacities.

// FC Initialization State Machine (runs once per Virtual Channel)
┌──────────────┐                              ┌──────────────┐
│   FC_INIT1   │                              │   FC_INIT1   │
│              │                              │              │
│ Send InitFC1 │─────────InitFC1-P───────────►│ Receive      │
│ for the VC   │─────────InitFC1-NP──────────►│ InitFC1      │
│              │─────────InitFC1-Cpl─────────►│              │
│              │                              │              │
│              │◄────────InitFC1-P────────────│ Send InitFC1 │
│ Receive      │◄────────InitFC1-NP───────────│ for the VC   │
│ InitFC1      │◄────────InitFC1-Cpl──────────│              │
└──────┬───────┘                              └──────┬───────┘
       │                                             │
       │       All InitFC1 received for the VC       │
       ▼                                             ▼
┌──────────────┐                              ┌──────────────┐
│   FC_INIT2   │                              │   FC_INIT2   │
│              │                              │              │
│ Send InitFC2 │─────────InitFC2-P───────────►│ Receive      │
│ for the VC   │─────────InitFC2-NP──────────►│ InitFC2      │
│              │─────────InitFC2-Cpl─────────►│              │
│              │                              │              │
│              │◄────────InitFC2-P────────────│ Send InitFC2 │
│ Receive      │◄────────InitFC2-NP───────────│ for the VC   │
│ InitFC2      │◄────────InitFC2-Cpl──────────│              │
└──────┬───────┘                              └──────┬───────┘
       │                                             │
       │       All InitFC2 received for the VC       │
       ▼                                             ▼
┌────────────────────────────────────────────────────────────┐
│                      FC_INIT COMPLETE                      │
│              For VC0: Transition to DL_Active              │
└────────────────────────────────────────────────────────────┘

FC Initialization Rules

  • InitFC1 DLLPs advertise initial credit values for P, NP, and Cpl
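  • InitFC2 DLLPs repeat the advertised values and confirm that the sender has recorded its partner's credits; the receiver does not use them to change the credits it already recorded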
  • Both phases required for each enabled Virtual Channel
  • VC0 is always enabled; additional VCs are optional
  • Credit values: Header (8-bit) and Data (12-bit)
  • Value 0 in InitFC = infinite credits
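
The two-phase handshake for a single VC can be sketched as a tiny state tracker. The `FcInit` class and `decode_credits` helper are hypothetical, and the exit condition for FC_INIT2 follows the diagram above (wait for InitFC2 of all three types), which is a conservative reading; an implementation may be allowed to leave FC_INIT2 earlier.

```python
INFINITE = float("inf")
FC_TYPES = ("P", "NP", "Cpl")


def decode_credits(hdr_fc, data_fc):
    """Map raw InitFC fields to credit counts; 0 advertises infinite credits."""
    return (INFINITE if hdr_fc == 0 else hdr_fc,
            INFINITE if data_fc == 0 else data_fc)


class FcInit:
    """Hypothetical per-VC tracker for the two-phase FC handshake."""

    def __init__(self):
        self.state = "FC_INIT1"
        self.credits = {}            # partner's advertised limits by FC type
        self.initfc2_seen = set()

    def on_initfc(self, phase, fc_type, hdr_fc, data_fc):
        """Process one received InitFC DLLP and return the resulting state."""
        if self.state == "FC_INIT1":
            # Record the partner's advertisement for this credit type.
            self.credits[fc_type] = decode_credits(hdr_fc, data_fc)
            if all(t in self.credits for t in FC_TYPES):
                self.state = "FC_INIT2"
        elif self.state == "FC_INIT2" and phase == 2:
            # Credits are already recorded; InitFC2 only confirms progress.
            self.initfc2_seen.add(fc_type)
            if all(t in self.initfc2_seen for t in FC_TYPES):
                self.state = "COMPLETE"
        return self.state
```

A completion advertisement of 0 header and 0 data credits decodes to infinite credits, which is the common case for endpoints.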

9. Flit Mode Data Link Layer (PCIe 6.0+)

PCIe 6.0 and later operating at 64 GT/s+ use Flit Mode, which fundamentally changes Data Link Layer operation.

Non-Flit Mode (Legacy)
  • Separate DLLPs for ACK/NAK
  • Per-TLP LCRC (32-bit)
  • 12-bit sequence numbers
  • Variable packet sizes
  • Replay buffer by TLP
  • 8b/10b or 128b/130b encoding
Flit Mode (PCIe 6.0+)
  • DLP embedded in 256-byte Flit
  • FEC + CRC at Flit level
  • 10-bit Flit sequence numbers
  • Fixed 256-byte Flits
  • Replay buffer by Flit
  • 1b/1b PAM4 encoding

// Flit Structure (256 bytes)
┌───────────────────────────────┬───────────┬───────────┬───────────┐
│ TLP Bytes (236 bytes)         │ DLP       │ CRC       │ FEC       │
│ (Multiple TLPs or TLP parts)  │ (6 bytes) │ (8 bytes) │ (6 bytes) │
└───────────────────────────────┴───────────┴───────────┴───────────┘

DLP - Data Link Payload (6 bytes):
  - Flit Type (Payload, NOP, Retransmit Request)
  - Flit Sequence Number
  - Explicit ACK sequence number
  - Explicit NAK sequence number (if any)
  - Flow Control credits

CRC (8 bytes):
  - Protects the TLP bytes and the DLP

FEC (6 bytes):
  - Forward Error Correction code
  - Enables error correction without retransmission

Key Differences in Flit Mode:
  - No separate DLLPs - all DLL info in Flit DLP section
  - FEC can correct some errors without triggering replay
  - Implicit ACK: Receiving correct Flits implies ACK of previous
  - NOP Flits sent when no TLP data available
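
Because every Flit has the same fixed size, splitting one into its regions is a matter of slicing at constant offsets. The sketch below assumes the 236/6/8/6 byte split (TLP bytes, DLP, CRC, FEC) used by the PCIe 6.0 base specification; `FLIT_FIELDS` and `split_flit` are hypothetical names, and no FEC decoding or CRC checking is attempted.

```python
FLIT_SIZE = 256

# (name, size in bytes), in the order the regions appear in the Flit.
# The sizes are an assumption taken from the PCIe 6.0 layout.
FLIT_FIELDS = (("tlp", 236), ("dlp", 6), ("crc", 8), ("fec", 6))


def split_flit(flit):
    """Slice a 256-byte Flit into its four regions."""
    if len(flit) != FLIT_SIZE:
        raise ValueError("a Flit is exactly 256 bytes")
    regions = {}
    offset = 0
    for name, size in FLIT_FIELDS:
        regions[name] = flit[offset:offset + size]
        offset += size
    return regions
```

The region sizes add up to exactly 256 bytes, so a fixed-size Flit never needs a length field or end-of-packet framing.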

10. Data Link Layer Error Handling

Error Condition       Detection                       Action                        Recovery
LCRC Error            CRC mismatch on receive         Discard TLP, send NAK         Transmitter replays unACK'd TLPs
Sequence Error        Unexpected sequence number      Discard TLP, send NAK         Transmitter replays missing TLPs
DLLP CRC Error        CRC-16 mismatch on DLLP         Discard DLLP silently         Wait for retransmission/timeout
REPLAY_TIMER Timeout  No ACK within timeout           Increment REPLAY_NUM, replay  Retransmit unACK'd TLPs
REPLAY_NUM Exceeded   Too many consecutive replays    Enter Recovery state          Link retraining via LTSSM
Receiver Overflow     No FC credits but TLP received  Protocol error                Link error, may require reset

Correctable vs Uncorrectable DLL Errors

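  • Correctable: LCRC error (Bad TLP), DLLP CRC error (Bad DLLP), sequence error, replay timeout, and REPLAY_NUM rollover (recovered via retry or link retraining)
  • Uncorrectable: receiver overflow, Data Link Protocol Error, and unexpected loss of the link (Surprise Down)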

DLL errors are reported via AER (Advanced Error Reporting) capability. Repeated correctable errors may indicate marginal link quality.