Chapter 2: Transaction Layer

Advanced Deep Dive - TLPs, Flow Control, Ordering, and Protocol Details

Transaction Layer Overview

The Transaction Layer is the uppermost layer of the PCIe protocol stack, responsible for assembling and disassembling Transaction Layer Packets (TLPs). It provides the interface between the device's core logic and the PCIe link, implementing the semantics required for various transaction types.

Transaction Layer Responsibilities

  • TLP Assembly/Disassembly: Creating TLPs from device requests and parsing incoming TLPs
  • Flow Control: Managing credit-based flow control to prevent buffer overflow
  • Transaction Ordering: Enforcing producer-consumer ordering model and relaxed ordering rules
  • Quality of Service: Traffic Class assignment and Virtual Channel management
  • Error Detection: ECRC generation and checking for end-to-end data integrity

TLP Structure - Complete Analysis

Header Format Details

Every TLP consists of a header (3 or 4 DWORDs), an optional data payload, and an optional ECRC. The header format varies based on the TLP type and addressing mode.

[Figure: TLP Structure (Non-Flit Mode). Header: 3 or 4 DW (12-16 bytes), with DW 0 = Fmt/Type (bytes 0-3), DW 1 = Requester ID/Tag (bytes 4-7), DW 2 = Address (bytes 8-11), DW 3 = 64-bit addressing only (bytes 12-15). Data Payload (optional): 0 to 1024 DW (0-4096 bytes). ECRC (optional): 1 DW. The LCRC is added by the Data Link Layer and is not part of the TLP.]

Type and Format Encoding

The first byte of every TLP header contains the Format (Fmt) and Type fields that determine the TLP's purpose and structure.

// TLP Header DW0 - Bit Layout (Non-Flit Mode)
Byte 0:
  [7:5] Fmt     - Format (3 bits)
                  000 = 3 DW header, no data
                  001 = 4 DW header, no data
                  010 = 3 DW header, with data
                  011 = 4 DW header, with data
                  100 = TLP Prefix
  [4:0] Type    - Transaction Type (5 bits)
Byte 1:
  [7]   T9      - Tag[9] (10-bit tag extension)
  [6:4] TC      - Traffic Class (3 bits, 0-7)
  [3]   T8      - Tag[8] (extended tag)
  [2]   Attr[2] - IDO (ID-based Ordering)
  [1]   LN      - Lightweight Notification (Reserved in most)
  [0]   TH      - TLP Processing Hints
Byte 2:
  [7]   TD      - TLP Digest (ECRC present)
  [6]   EP      - Error/Poisoned
  [5:4] Attr[1:0] - Attributes (Relaxed Ordering, No Snoop)
  [3:2] AT      - Address Type (for ATS)
  [1:0] Length[9:8] - Upper Length bits
Byte 3:
  [7:0] Length[7:0] - Lower Length bits (10 bits total: 1-1023 DW, with 0 encoding 1024 DW)

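Fmt[2:0] | Type[4:0] | TLP Type | Posted? | Description
000 | 00000 | MRd | No | Memory Read Request (32-bit address)
001 | 00000 | MRd | No | Memory Read Request (64-bit address)
000 | 00001 | MRdLk | No | Memory Read Lock (legacy, deprecated)
010 | 00000 | MWr | Yes | Memory Write (32-bit address)
011 | 00000 | MWr | Yes | Memory Write (64-bit address)
000 | 00010 | IORd | No | I/O Read Request
010 | 00010 | IOWr | No | I/O Write Request
000 | 00100 | CfgRd0 | No | Configuration Read Type 0
010 | 00100 | CfgWr0 | No | Configuration Write Type 0
000 | 00101 | CfgRd1 | No | Configuration Read Type 1
010 | 00101 | CfgWr1 | No | Configuration Write Type 1
000 | 01010 | Cpl | - | Completion without Data
010 | 01010 | CplD | - | Completion with Data
0x1 | 10xxx | Msg | Yes | Message Request (x in Fmt selects Msg or MsgD; Type[2:0] selects routing)
011 | 01100 | FetchAdd | No | Fetch and Add AtomicOp (64-bit address; Fmt 010 for 32-bit address)
011 | 01101 | Swap | No | Unconditional Swap AtomicOp (64-bit address; Fmt 010 for 32-bit address)
011 | 01110 | CAS | No | Compare and Swap AtomicOp (64-bit address; Fmt 010 for 32-bit address)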
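
The DW0 layout and the Fmt/Type encodings can be exercised with a small decoder. The following is a minimal sketch, not a reference implementation: `decode_dw0` and `TLP_NAMES` are hypothetical names chosen for illustration, only a handful of TLP types are listed, and the handling of a zero Length value (encoding 1024 DW) follows the PCIe Base Specification.

```python
# Hypothetical helper: decode the first DWORD of a Non-Flit Mode TLP header.
# Only a subset of (Fmt, Type) pairs is named; everything else is "other".
TLP_NAMES = {
    (0b000, 0b00000): "MRd (32-bit)",
    (0b001, 0b00000): "MRd (64-bit)",
    (0b010, 0b00000): "MWr (32-bit)",
    (0b011, 0b00000): "MWr (64-bit)",
    (0b000, 0b00010): "IORd",
    (0b010, 0b00010): "IOWr",
    (0b000, 0b00100): "CfgRd0",
    (0b010, 0b00100): "CfgWr0",
    (0b000, 0b01010): "Cpl",
    (0b010, 0b01010): "CplD",
}


def decode_dw0(dw0_bytes: bytes) -> dict:
    """Decode header bytes 0-3 (byte 0 first, as transmitted)."""
    b0, b1, b2, b3 = dw0_bytes[:4]
    fmt = (b0 >> 5) & 0x7
    tlp_type = b0 & 0x1F
    if fmt & 0b100:
        # Fmt 100 marks a TLP Prefix, which has a different layout.
        raise ValueError("TLP Prefix, not a request/completion header")
    length = ((b2 & 0x3) << 8) | b3
    return {
        "fmt": fmt,
        "type": tlp_type,
        "name": TLP_NAMES.get((fmt, tlp_type), "other"),
        "header_dw": 4 if fmt & 0b001 else 3,   # Fmt[0]: 4 DW header
        "has_data": bool(fmt & 0b010),          # Fmt[1]: payload present
        "tc": (b1 >> 4) & 0x7,
        "td": bool(b2 & 0x80),                  # ECRC present
        "ep": bool(b2 & 0x40),                  # poisoned
        # Attr[2] (IDO) lives in byte 1, Attr[1:0] (RO, NS) in byte 2.
        "attr": (((b1 >> 2) & 0x1) << 2) | ((b2 >> 4) & 0x3),
        "length_dw": length if length else 1024,  # 0 encodes 1024 DW
    }


if __name__ == "__main__":
    # 64-bit Memory Write, TC3, 64 DW (256-byte) payload.
    print(decode_dw0(bytes([0x60, 0x30, 0x00, 0x40])))
```

Keeping the header size and payload presence as simple tests on Fmt[0] and Fmt[1] mirrors how the Fmt field is defined, so the decoder does not need a per-type table to size the header.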

Transaction Descriptor - Deep Analysis

The Transaction Descriptor uniquely identifies each transaction and contains critical routing and handling information.

Transaction Descriptor Components

Transaction ID = Requester ID (16 bits) + Tag (up to 14 bits in Flit Mode, PCIe 6.0+)

  • Requester ID: Bus[7:0] + Device[4:0] + Function[2:0] = 16 bits
  • Tag: Uniquely identifies outstanding transactions from same Requester
    • 5-bit Tags: 32 outstanding transactions (legacy)
    • 8-bit Tags: 256 outstanding transactions (Extended Tag)
    • 10-bit Tags: 1024 outstanding transactions (PCIe 4.0+)
    • 14-bit Tags: 16384 outstanding transactions (PCIe 6.0+ Flit Mode)
  • Attributes: Relaxed Ordering, No Snoop, ID-Based Ordering
  • Traffic Class (TC): 0-7, maps to Virtual Channels
Maximum Outstanding NPRs = Tag_Count × (1 + Phantom_Functions_Count)
Example: 10-bit Tags + 3 Phantom Functions = 1024 × 4 = 4096 outstanding requests
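
As a quick check of the arithmetic above, the sketch below packs a Requester ID from its Bus/Device/Function fields and evaluates the outstanding-request formula. The function names `pack_requester_id` and `max_outstanding_nprs` are hypothetical and exist only for this illustration.

```python
def pack_requester_id(bus: int, device: int, function: int) -> int:
    """Pack Bus[7:0], Device[4:0], Function[2:0] into a 16-bit Requester ID."""
    if not (0 <= bus < 256 and 0 <= device < 32 and 0 <= function < 8):
        raise ValueError("bus/device/function out of range")
    return (bus << 8) | (device << 3) | function


def max_outstanding_nprs(tag_bits: int, phantom_functions: int = 0) -> int:
    """Tag_Count x (1 + Phantom_Functions_Count), as in the formula above."""
    return (1 << tag_bits) * (1 + phantom_functions)


if __name__ == "__main__":
    print(hex(pack_requester_id(0x3A, 0x1F, 7)))  # bus 0x3A, device 31, function 7
    print(max_outstanding_nprs(10, 3))            # 10-bit Tags + 3 Phantom Functions
```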

TLP Types - Complete Deep Dive

Memory Requests

Memory Read (MRd)

Memory Read requests are Non-Posted transactions that require a Completion with data. The Requester must track outstanding requests and associate returning Completions using the Transaction ID.

// Memory Read Request - 64-bit Addressing (4 DW Header)
DW0: Fmt=001 Type=00000 | TC | T8 | Attr | TH | TD | EP | AT | Length
DW1: Requester ID [15:0] | Tag [7:0] | Last DW BE [3:0] | First DW BE [3:0]
DW2: Address [63:32]             // Upper 32 bits
DW3: Address [31:2] | PH [1:0]   // Lower 30 bits + Processing Hints

// Byte Enable Rules for MRd:
- Length = 1 DW: First DW BE = valid pattern, Last DW BE must be 0000
- Length = 2+ DW: First DW BE and Last DW BE must both be non-zero
- First DW BE = 0000 is valid only when Length = 1 DW and Last DW BE = 0000 (zero-length read)

Critical Memory Request Rules

  • 4KB Boundary: A single request MUST NOT cross a 4KB address boundary
  • Max Read Request Size: Limited by Max_Read_Request_Size in Device Control register (128, 256, 512, 1024, 2048, or 4096 bytes)
  • Address Alignment: First DW must be naturally aligned (bits [1:0] of address are 00)
  • Completion Timeout: Requester must implement timeout for Non-Posted requests
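
The 4KB boundary and Max_Read_Request_Size rules together determine how a large read has to be broken into individual MRd requests. The sketch below shows one way to do that splitting. It assumes DW-aligned addresses and whole-DW lengths (byte enables are ignored), and `split_read` is a hypothetical helper name rather than anything defined by the specification.

```python
def split_read(address: int, length_bytes: int, mrrs: int = 512):
    """Split a read into (address, size) requests.

    Each request is at most `mrrs` bytes (Max_Read_Request_Size) and
    never crosses a 4 KB address boundary.
    """
    if address % 4 or length_bytes % 4 or length_bytes <= 0:
        raise ValueError("sketch handles DW-aligned, whole-DW reads only")
    requests = []
    while length_bytes > 0:
        to_boundary = 4096 - (address % 4096)   # bytes left before next 4 KB line
        size = min(length_bytes, mrrs, to_boundary)
        requests.append((address, size))
        address += size
        length_bytes -= size
    return requests


if __name__ == "__main__":
    # A 1 KB read starting 256 bytes below a 4 KB boundary.
    for addr, size in split_read(0x0F00, 1024, mrrs=512):
        print(f"MRd addr=0x{addr:04X} length={size // 4} DW")
```

In this example the first request is shortened to 256 bytes so that it ends exactly at address 0x1000, and the remainder is issued in MRRS-sized pieces.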

Memory Write (MWr)

Memory Write requests are Posted transactions - they complete at the Requester immediately after transmission. No Completion is returned, so errors must be reported via error Messages.

Posted vs Non-Posted: Performance vs Reliability Trade-off

Posted (MWr): Higher performance, fire-and-forget, errors detected late via ERR_* Messages

Non-Posted (MRd, IORd/Wr, CfgRd/Wr): Guaranteed delivery confirmation, lower throughput due to Completion latency

Deferrable Memory Write (DMWr - PCIe 6.0+): Hybrid approach - Non-Posted write that can be declined by Completer

Atomic Operations (AtomicOps)

AtomicOps perform read-modify-write operations atomically at the target location. Three types are defined:

FetchAdd

Reads value, adds operand, writes result, returns original value.

Operand sizes: 32-bit, 64-bit

Use case: Counters, sequence numbers

Swap

Reads value, writes new value, returns original value.

Operand sizes: 32-bit, 64-bit

Use case: Lock acquisition

Compare and Swap (CAS)

Compares value with operand1, if equal writes operand2, returns original.

Operand sizes: 32-bit, 64-bit, 128-bit

Use case: Lock-free algorithms
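
The semantics of the three AtomicOps can be modeled in a few lines. This is a behavioral sketch only: it uses a Python dict as a stand-in for Completer memory, assumes FetchAdd wraps around at the operand width, and the function names are hypothetical. On real hardware the Completer guarantees atomicity; here each function is simply executed without interruption.

```python
def fetch_add(mem: dict, addr: int, operand: int, bits: int = 64) -> int:
    """Add operand to the target (wrapping at `bits`), return the original."""
    mask = (1 << bits) - 1
    original = mem.get(addr, 0)
    mem[addr] = (original + operand) & mask
    return original


def swap(mem: dict, addr: int, new_value: int, bits: int = 64) -> int:
    """Unconditionally write new_value, return the original."""
    mask = (1 << bits) - 1
    original = mem.get(addr, 0)
    mem[addr] = new_value & mask
    return original


def compare_and_swap(mem: dict, addr: int, compare: int,
                     new_value: int, bits: int = 64) -> int:
    """Write new_value only if the target equals compare; return the original."""
    mask = (1 << bits) - 1
    original = mem.get(addr, 0)
    if original == (compare & mask):
        mem[addr] = new_value & mask
    return original


if __name__ == "__main__":
    memory = {0x100: 41}
    print(fetch_add(memory, 0x100, 1), memory[0x100])
    print(compare_and_swap(memory, 0x100, 42, 0), memory[0x100])
```

In every case the Completion returns the original value, which is what lets a Requester tell whether a CAS succeeded: the operation took effect only if the returned value equals the compare operand.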

Completions - Detailed Analysis

Completions are responses to Non-Posted Requests. They may or may not contain data, and carry status information about the transaction outcome.

// Completion Header (3 DW)
DW0: Fmt=0x0 Type=01010 | TC | Attr | TD | EP | AT | Length
DW1: Completer ID [15:0] | Cpl Status [2:0] | BCM | Byte Count [11:0]
DW2: Requester ID [15:0] | Tag [7:0] | R | Lower Address [6:0]

// Completion Status Values:
000 = Successful Completion (SC)
001 = Unsupported Request (UR)
010 = Configuration Request Retry Status (CRS)
100 = Completer Abort (CA)

Status | Code | Meaning | Action Required
Successful Completion | 000 | Request completed normally | Process returned data (if any)
Unsupported Request | 001 | Completer doesn't support this request | Report UR error, may retry with different params
Config Retry Status | 010 | Device not ready, retry later | Wait and retry the Configuration request
Completer Abort | 100 | Unrecoverable error at Completer | Report CA error, do not retry

Split Completions

A single Read Request may result in multiple Completions (split completions) due to:

  • Read Completion Boundary (RCB): When a read is split, every Completion except the last must end on a naturally aligned RCB boundary (64 or 128 bytes)
  • Max Payload Size: Completion data limited by MPS setting
  • Completer design: May split for internal reasons

Byte Count field indicates remaining bytes including current Completion. Requester uses this to track completion progress and detect missing Completions.
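
To make the Byte Count bookkeeping concrete, the sketch below splits one read into Completions under the assumption that the Completer sends the largest payload allowed by MPS while ending every non-final Completion on an RCB boundary. Real Completers may split more finely. The helper name `split_completions` and the dictionary keys are hypothetical, and the 12-bit Byte Count encoding (where 4096 is encoded as 0) is not modeled.

```python
def split_completions(address: int, length_bytes: int,
                      rcb: int = 64, mps: int = 256):
    """Return one dict per Completion for a read of `length_bytes` at `address`."""
    if mps < rcb or length_bytes <= 0:
        raise ValueError("expects mps >= rcb and a positive length")
    end = address + length_bytes
    completions = []
    while address < end:
        limit = min(end, address + mps)
        if limit < end:
            # Not the last Completion: end on an RCB-aligned address.
            limit -= limit % rcb
        completions.append({
            "lower_address": address & 0x7F,   # Lower Address [6:0]
            "byte_count": end - address,       # remaining bytes, this one included
            "payload": limit - address,
        })
        address = limit
    return completions


if __name__ == "__main__":
    for cpl in split_completions(0x1010, 256, rcb=64, mps=128):
        print(cpl)
```

A Requester can detect a missing Completion with these fields: the Byte Count of each Completion should equal the previous Byte Count minus the previous payload size.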

Flow Control - Complete Mechanism

PCIe uses a credit-based flow control mechanism to prevent receiver buffer overflow. Each receiver advertises available buffer space as "credits," and transmitters must have sufficient credits before sending.

Credit System Architecture

Flow Control Credit Types

Credits are tracked separately for three transaction types across header and data:

Type | Header Credits (HdrFC) | Data Credits (DataFC) | Unit (header / data)
Posted (P) | PH | PD | 1 header / 16 bytes (4 DW)
Non-Posted (NP) | NPH | NPD | 1 header / 16 bytes (4 DW)
Completion (Cpl) | CplH | CplD | 1 header / 16 bytes (4 DW)
Credits_Required = Header_Credits + ceil(Payload_Size / 16)
Example: 256-byte MWr = 1 PH + ceil(256/16) = 1 PH + 16 PD
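
The credit formula is easy to express in code. The sketch below returns the header and data credits separately, since they are drawn from different pools. `credits_required` is a hypothetical name used only for this example.

```python
def credits_required(payload_bytes: int):
    """Return (header_credits, data_credits) for one TLP.

    One header credit per TLP, plus one data credit per 16 bytes (4 DW)
    of payload, rounded up.
    """
    if payload_bytes < 0:
        raise ValueError("payload size cannot be negative")
    data_credits = -(-payload_bytes // 16)   # integer ceil(payload / 16)
    return 1, data_credits


if __name__ == "__main__":
    hdr, data = credits_required(256)        # the 256-byte MWr from the text
    print(f"{hdr} PH + {data} PD")
```

A read request carries no payload, so it consumes one NPH credit and no data credits, while a 20-byte write still needs two PD credits because credits are allocated in whole 16-byte units.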

Flow Control Initialization

After link training completes, the Data Link Layers on both sides of the link exchange Flow Control initialization DLLPs to advertise their receive buffer capacities:

// FC Initialization Sequence
Phase 1 - InitFC1:
  - Transmit InitFC1 DLLPs advertising credit values for each VC
  - Receive InitFC1 from link partner
  - InitFC1 contains: PH, PD, NPH, NPD, CplH, CplD for each active VC
Phase 2 - InitFC2:
  - After receiving all InitFC1, transmit InitFC2 (confirms receipt)
  - Receive InitFC2 from link partner
  - Link enters FC_INIT_COMPLETE state

// Credit Limits
- Infinite Credits: Indicated by advertising 0 for a credit type
- Endpoints and Root Complexes typically advertise infinite CplH/CplD (a Requester reserves buffer space for the Completions of every request it issues)
- Minimum Credits: Must support at least max TLP size allowed by MPS

Credit Updates and Consumption

Credit Update Algorithm
  1. Transmitter: Before sending TLP, check: CREDITS_CONSUMED + Required ≤ CREDIT_LIMIT
  2. Transmitter: After sending, increment CREDITS_CONSUMED
  3. Receiver: After processing TLP (buffer freed), update CREDITS_ALLOCATED
  4. Receiver: Periodically send UpdateFC DLLP carrying the current CREDITS_ALLOCATED value, which becomes the Transmitter's new CREDIT_LIMIT
  5. Transmitter: On receiving UpdateFC, update local CREDIT_LIMIT
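
The transmitter side of this algorithm can be sketched as a small counter class. Because the counters are fixed-width and wrap around, the comparison in step 1 is done with modular arithmetic. The sketch assumes 8-bit header and 12-bit data credit fields (the sizes used without Scaled Flow Control), and `CreditCounter` with its methods is a hypothetical name, not an API from any library.

```python
class CreditCounter:
    """Transmitter-side gate for one credit type (for example PD) on one VC."""

    def __init__(self, field_bits: int, initial_limit: int):
        self.mod = 1 << field_bits           # counters wrap at 2^field_bits
        self.limit = initial_limit % self.mod  # CREDIT_LIMIT from InitFC
        self.consumed = 0                    # CREDITS_CONSUMED

    def can_send(self, required: int) -> bool:
        # Step 1: modular form of CREDITS_CONSUMED + Required <= CREDIT_LIMIT.
        cumulative = (self.consumed + required) % self.mod
        return (self.limit - cumulative) % self.mod <= self.mod // 2

    def send(self, required: int) -> None:
        # Step 2: account for the credits a transmitted TLP uses.
        if not self.can_send(required):
            raise RuntimeError("insufficient credits, TLP must wait")
        self.consumed = (self.consumed + required) % self.mod

    def update_fc(self, new_limit: int) -> None:
        # Step 5: UpdateFC DLLP received from the link partner.
        self.limit = new_limit % self.mod


if __name__ == "__main__":
    pd = CreditCounter(field_bits=12, initial_limit=32)
    pd.send(16)
    pd.send(16)
    print(pd.can_send(1))   # all 32 advertised credits are in use
    pd.update_fc(48)        # receiver freed 16 credits
    print(pd.can_send(16))
```

The half-range comparison is what keeps the check correct after CREDIT_LIMIT wraps past zero, as long as the receiver never advertises more than half the counter range as outstanding credit.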

Flow Control Deadlock Prevention

PCIe avoids deadlock through careful rules:

  • Posted transactions cannot be blocked waiting for Non-Posted credits
  • Completions use separate credit pool - never blocked by requests
  • Switches must guarantee forward progress for Completions
  • UpdateFC DLLPs are always transmitted (no flow control on DLLPs)

Transaction Ordering Rules - Complete Reference

PCIe implements a producer-consumer ordering model derived from PCI. Understanding ordering rules is critical for correct system operation and performance optimization.

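Can Row pass Column? | Posted Request | Non-Posted Request | Completion
Posted Request | No (Y/N if RO or IDO applies) | Yes | Y/N
Non-Posted Request | No (Y/N if IDO applies) | Y/N | Y/N
Completion | No (Y/N if RO or IDO applies) | Yes | Y/N (different Transaction ID); No (same Transaction ID)

Legend: Yes = the row transaction must be able to pass the column transaction (required to avoid deadlock). No = it must not pass. Y/N = it may pass, but is not required to.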

Ordering Attributes Explained

  • Relaxed Ordering (RO) - Attr[1]: When set, this transaction is permitted to pass previously issued Posted Writes, regardless of address. Trades strict ordering for performance.
  • ID-Based Ordering (IDO) - Attr[2]: Transactions with different Requester IDs can pass each other. Enables independent streams.
  • No Snoop (NS) - Attr[0]: Hints that data does not require cache coherency checks. Performance optimization.

Virtual Channels (VCs)

Virtual Channels provide independent flow paths through the PCIe fabric, enabling QoS differentiation and traffic isolation.

VC0 (Default)

Mandatory for all devices. Used for general traffic and TC0.

Arbitration: Round-robin or weighted

VC1-VC7 (Optional)

Additional VCs for differentiated service. Mapped to specific TCs.

Use case: Isochronous, low-latency traffic

TC to VC Mapping

Traffic Classes (TC0-TC7) are mapped to Virtual Channels at each port. By default, all TCs map to VC0, and TC0 always maps to VC0. Software configures the mapping through the VC Capability registers. The mapping must be identical on both ports of a link.
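
A TC-to-VC map is small enough to sanity-check in software before it is programmed. The sketch below represents the map as a dict from TC number to VC number, so each TC maps to at most one VC by construction. The rule that TC0 stays on VC0 comes from the PCIe Base Specification; `validate_tc_vc_map` and `DEFAULT_TC_VC_MAP` are hypothetical names for this illustration, and no register access is modeled.

```python
# Default state: every Traffic Class is carried on VC0.
DEFAULT_TC_VC_MAP = {tc: 0 for tc in range(8)}


def validate_tc_vc_map(tc_to_vc: dict, enabled_vcs: set) -> list:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    for tc in sorted(tc_to_vc):
        vc = tc_to_vc[tc]
        if not 0 <= tc <= 7:
            problems.append(f"TC{tc} is not a valid Traffic Class")
        elif vc not in enabled_vcs:
            problems.append(f"TC{tc} maps to VC{vc}, which is not enabled")
    if tc_to_vc.get(0) != 0:
        problems.append("TC0 must map to VC0")
    return problems


if __name__ == "__main__":
    # Move TC7 (for example isochronous traffic) onto VC1.
    qos_map = {**DEFAULT_TC_VC_MAP, 7: 1}
    print(validate_tc_vc_map(qos_map, enabled_vcs={0}))
    print(validate_tc_vc_map(qos_map, enabled_vcs={0, 1}))
```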

Tag Management - PCIe 7.0

Tag Size | Max Outstanding | Enable Bit | PCIe Version
5-bit | 32 | Default (Extended Tag = 0) | PCIe 1.0+
8-bit | 256 | Extended Tag Field Enable | PCIe 1.0+
10-bit | 1024 | 10-Bit Tag Requester Enable | PCIe 4.0+
14-bit | 16384 | 14-Bit Tag Requester Enable (Flit Mode only) | PCIe 6.0+

Flit Mode TLP Differences

PCIe 6.0 introduces Flit Mode, which changes the TLP header format significantly to suit fixed-size 256-byte Flits.

Non-Flit Mode (Legacy)
  • Variable TLP size (up to 4KB payload)
  • Per-TLP LCRC at Data Link Layer
  • 8b/10b or 128b/130b encoding
  • 10-bit Tags maximum
  • Explicit Byte Enables
Flit Mode (PCIe 6.0+)
  • Fixed 256-byte Flits
  • FEC + CRC at Physical Layer
  • 1b/1b PAM4 encoding
  • 14-bit Tags supported
  • Orthogonal Header Content (OHC)
  • Embedded DLP in Flit structure