PCIe 7.0 - Transaction Layer (Advanced Deep Dive)

Transaction Layer Overview

The Transaction Layer is the uppermost layer of the PCIe protocol stack, responsible for assembling and disassembling Transaction Layer Packets (TLPs). It provides the interface between the device's core logic and the PCIe link, implementing the semantics required for various transaction types.

Transaction Layer Responsibilities

TLP Assembly/Disassembly: Creating TLPs from device requests and parsing incoming TLPs
Flow Control: Managing credit-based flow control to prevent buffer overflow
Transaction Ordering: Enforcing producer-consumer ordering model and relaxed ordering rules
Quality of Service: Traffic Class assignment and Virtual Channel management
Error Detection: ECRC generation and checking for end-to-end data integrity

TLP Structure - Complete Analysis

Header Format Details

Every TLP consists of a header (3 or 4 DWORDs), an optional data payload, and an optional ECRC. The header format varies based on the TLP type and addressing mode.

Type and Format Encoding

The first byte of every TLP header contains the Format (Fmt) and Type fields that determine the TLP's purpose and structure.

// TLP Header DW0 - Bit Layout (Non-Flit Mode)
Byte 0:
  [7:5] Fmt - Format (3 bits)
        000 = 3 DW header, no data
        001 = 4 DW header, no data
        010 = 3 DW header, with data
        011 = 4 DW header, with data
        100 = TLP Prefix
  [4:0] Type - Transaction Type (5 bits)
Byte 1:
  [7]   T9 - Tag[9] (10-bit tag extension)
  [6:4] TC - Traffic Class (3 bits, 0-7)
  [3]   T8 - Tag[8] (extended tag)
  [2]   Attr[2] - IDO (ID-based Ordering)
  [1]   LN - Lightweight Notification (reserved in most TLPs)
  [0]   TH - TLP Processing Hints
Byte 2:
  [7]   TD - TLP Digest (ECRC present)
  [6]   EP - Error/Poisoned
  [5:4] Attr[1:0] - Attributes (Relaxed Ordering, No Snoop)
  [3:2] AT - Address Type (for ATS)
  [1:0] Length[9:8] - Upper Length bits
Byte 3:
  [7:0] Length[7:0] - Lower Length bits (10 bits total: 1-1023 DW; the all-zeros encoding means 1024 DW)
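The bit layout above can be exercised with a small decoder. This is a minimal sketch, not a reference implementation: the function name `decode_dw0` and the convention of packing header byte 0 into bits 31:24 of an integer are assumptions made for illustration. It also treats a Length field of zero as 1024 DW.

```python
def decode_dw0(dw0):
    """Decode a non-Flit-Mode TLP header DW0.

    Assumption: dw0 is a 32-bit integer with header byte 0 in bits 31:24.
    """
    b0 = (dw0 >> 24) & 0xFF
    b1 = (dw0 >> 16) & 0xFF
    b2 = (dw0 >> 8) & 0xFF
    b3 = dw0 & 0xFF

    fmt = b0 >> 5
    length = ((b2 & 0x03) << 8) | b3
    is_prefix = fmt == 0b100

    return {
        "fmt": fmt,
        "type": b0 & 0x1F,
        # Fmt bit 0 selects a 4 DW header, Fmt bit 1 signals a data payload.
        "header_dw": None if is_prefix else (4 if fmt & 0b001 else 3),
        "has_data": None if is_prefix else bool(fmt & 0b010),
        "t9": b1 >> 7,
        "tc": (b1 >> 4) & 0x7,
        "t8": (b1 >> 3) & 0x1,
        "ido": (b1 >> 2) & 0x1,
        "ln": (b1 >> 1) & 0x1,
        "th": b1 & 0x1,
        "td": b2 >> 7,
        "ep": (b2 >> 6) & 0x1,
        "ro": (b2 >> 5) & 0x1,   # Attr[1]
        "ns": (b2 >> 4) & 0x1,   # Attr[0]
        "at": (b2 >> 2) & 0x3,
        # A Length field of 0 encodes 1024 DW.
        "length_dw": length if length else 1024,
    }


# 64-bit Memory Write (Fmt=011, Type=00000) carrying 16 DW of payload.
fields = decode_dw0(0x60000010)
print(fields["header_dw"], fields["has_data"], fields["length_dw"])
```

Returning a plain dictionary keeps the sketch short; a real decoder would more likely use a typed structure and reject reserved encodings.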

Fmt[2:0]	Type[4:0]	TLP Type	Posted?	Description
000	00000	MRd	No	Memory Read Request (32-bit address)
001	00000	MRd	No	Memory Read Request (64-bit address)
000	00001	MRdLk	No	Memory Read Lock (legacy, deprecated)
010	00000	MWr	Yes	Memory Write (32-bit address)
011	00000	MWr	Yes	Memory Write (64-bit address)
000	00010	IORd	No	I/O Read Request
010	00010	IOWr	No	I/O Write Request
000	00100	CfgRd0	No	Configuration Read Type 0
010	00100	CfgWr0	No	Configuration Write Type 0
000	00101	CfgRd1	No	Configuration Read Type 1
010	00101	CfgWr1	No	Configuration Write Type 1
000	01010	Cpl	-	Completion without Data
010	01010	CplD	-	Completion with Data
0x1	10xxx	Msg/MsgD	Yes	Message Request (Fmt 001 = Msg, Fmt 011 = MsgD with data)
01x	01100	FetchAdd	No	Fetch and Add AtomicOp (Fmt 010 = 32-bit address, Fmt 011 = 64-bit address)
01x	01101	Swap	No	Unconditional Swap AtomicOp
01x	01110	CAS	No	Compare and Swap AtomicOp

Transaction Descriptor - Deep Analysis

The Transaction Descriptor uniquely identifies each transaction and contains critical routing and handling information.

Transaction Descriptor Components

Transaction ID = Requester ID (16 bits) + Tag (up to 14 bits in PCIe 7.0)

Requester ID: Bus[7:0] + Device[4:0] + Function[2:0] = 16 bits
Tag: Uniquely identifies outstanding transactions from same Requester
- 5-bit Tags: 32 outstanding transactions (legacy)
- 8-bit Tags: 256 outstanding transactions (Extended Tag)
- 10-bit Tags: 1024 outstanding transactions (PCIe 4.0+)
- 14-bit Tags: 16384 outstanding transactions (PCIe 6.0+ Flit Mode)
Attributes: Relaxed Ordering, No Snoop, ID-Based Ordering
Traffic Class (TC): 0-7, maps to Virtual Channels

Maximum Outstanding NPRs = Tag_Count × (1 + Phantom_Functions_Count)
Example: 10-bit Tags + 3 Phantom Functions = 1024 × 4 = 4096 outstanding requests
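The Requester ID packing and the outstanding-request formula above can be checked with a few lines of arithmetic. This is a sketch under stated assumptions: the helper names `pack_requester_id` and `max_outstanding` are hypothetical, and the result is best read as an upper bound, since a given implementation or specification revision may reserve some Tag encodings.

```python
def pack_requester_id(bus, device, function):
    """Pack Bus[7:0], Device[4:0], Function[2:0] into a 16-bit Requester ID."""
    if not (0 <= bus <= 0xFF and 0 <= device <= 0x1F and 0 <= function <= 0x7):
        raise ValueError("bus/device/function out of range")
    return (bus << 8) | (device << 3) | function


def max_outstanding(tag_bits, phantom_functions=0):
    """Upper bound from the formula above: Tag_Count x (1 + Phantom_Functions_Count)."""
    return (1 << tag_bits) * (1 + phantom_functions)


print(hex(pack_requester_id(0x3A, 0x1F, 7)))
print(max_outstanding(10, phantom_functions=3))
```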

TLP Types - Complete Deep Dive

Memory Requests

Memory Read (MRd)

Memory Read requests are Non-Posted transactions that require a Completion with data. The Requester must track outstanding requests and associate returning Completions using the Transaction ID.

// Memory Read Request - 64-bit Addressing (4 DW Header)
DW0: Fmt=001 Type=00000 | TC | T8 | Attr | TH | TD | EP | AT | Length
DW1: Requester ID [15:0] | Tag [7:0] | Last DW BE [3:0] | First DW BE [3:0]
DW2: Address [63:32]            // Upper 32 bits
DW3: Address [31:2] | PH [1:0]  // Lower 30 bits + Processing Hints

// Byte Enable Rules for MRd:
- Length = 1 DW: First DW BE = valid pattern, Last DW BE = 0000
- Length = 2+ DW: First DW BE and Last DW BE must both be non-zero
- First DW BE = 0000 is valid only with Length = 1 DW and Last DW BE = 0000 (zero-length read)
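The Byte Enable rules above reduce to a short predicate. The sketch below covers only the basic rules listed here (it does not model contiguity requirements), and the function name `mrd_byte_enables_valid` is a hypothetical helper, not part of any library.

```python
def mrd_byte_enables_valid(length_dw, first_be, last_be):
    """Check the basic First/Last DW Byte Enable rules for a Memory Read.

    length_dw: request length in DW (1..1024)
    first_be, last_be: 4-bit Byte Enable fields
    """
    if not (1 <= length_dw <= 1024):
        return False
    if not (0 <= first_be <= 0xF and 0 <= last_be <= 0xF):
        return False
    if length_dw == 1:
        # Single-DW request: Last DW BE must be 0000.
        # First DW BE = 0000 here is the zero-length read.
        return last_be == 0
    # Multi-DW request: both fields must enable at least one byte.
    return first_be != 0 and last_be != 0


print(mrd_byte_enables_valid(1, 0b0000, 0b0000))  # zero-length read
print(mrd_byte_enables_valid(4, 0b0000, 0b1111))  # invalid: First DW BE is 0000
```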

Critical Memory Request Rules

4KB Boundary: A single request MUST NOT cross a 4KB address boundary
Max Read Request Size: Limited by Max_Read_Request_Size in Device Control register (128, 256, 512, 1024, 2048, or 4096 bytes)
Address Alignment: First DW must be naturally aligned (bits [1:0] of address are 00)
Completion Timeout: Requester must implement timeout for Non-Posted requests
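The 4KB boundary and Max_Read_Request_Size rules together determine how a large read must be broken into individual requests. The following is an illustrative sketch, assuming DW-aligned addresses and sizes; `split_read` is a hypothetical helper name and the default of 512 bytes is arbitrary.

```python
VALID_MRRS = (128, 256, 512, 1024, 2048, 4096)


def split_read(addr, nbytes, mrrs=512):
    """Split a read into (address, size) requests that respect MRRS
    and never cross a 4 KB address boundary.

    Assumption: addr and nbytes are multiples of 4 (DW aligned).
    """
    if mrrs not in VALID_MRRS:
        raise ValueError("unsupported Max_Read_Request_Size")
    if addr % 4 or nbytes % 4:
        raise ValueError("sketch only handles DW-aligned reads")

    requests = []
    while nbytes > 0:
        to_boundary = 4096 - (addr % 4096)  # bytes left before the next 4 KB boundary
        size = min(nbytes, mrrs, to_boundary)
        requests.append((addr, size))
        addr += size
        nbytes -= size
    return requests


# A 256-byte read starting 128 bytes below a 4 KB boundary needs two requests.
for a, s in split_read(0xF80, 256):
    print(hex(a), s)
```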

Memory Write (MWr)

Memory Write requests are Posted transactions - they complete at the Requester immediately after transmission. No Completion is returned, so errors must be reported via error Messages.

Posted vs Non-Posted: Performance vs Reliability Trade-off

Posted (MWr): Higher performance, fire-and-forget, errors detected late via ERR_* Messages

Non-Posted (MRd, IORd/Wr, CfgRd/Wr): Guaranteed delivery confirmation, lower throughput due to Completion latency

Deferrable Memory Write (DMWr - PCIe 6.0+): Hybrid approach - Non-Posted write that can be declined by Completer

Atomic Operations (AtomicOps)

AtomicOps perform read-modify-write operations atomically at the target location. Three types are defined:

FetchAdd

Reads value, adds operand, writes result, returns original value.

Operand sizes: 32-bit, 64-bit

Use case: Counters, sequence numbers

Swap

Reads value, writes new value, returns original value.

Operand sizes: 32-bit, 64-bit

Use case: Lock acquisition

Compare and Swap (CAS)

Compares value with operand1, if equal writes operand2, returns original.

Operand sizes: 32-bit, 64-bit, 128-bit

Use case: Lock-free algorithms
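The semantics of the three AtomicOps can be modeled in software. This is a behavioral sketch only: memory is represented by a plain dictionary, the function names are hypothetical, and the atomicity that a real Completer provides is simply assumed because the model is single-threaded.

```python
def fetch_add(mem, addr, operand, bits=64):
    """Return the original value and store (original + operand), wrapping at the operand size."""
    mask = (1 << bits) - 1
    original = mem[addr]
    mem[addr] = (original + operand) & mask
    return original


def swap(mem, addr, new_value):
    """Return the original value and store new_value unconditionally."""
    original = mem[addr]
    mem[addr] = new_value
    return original


def compare_and_swap(mem, addr, compare, new_value):
    """Store new_value only if the current value equals compare; always return the original."""
    original = mem[addr]
    if original == compare:
        mem[addr] = new_value
    return original


# Lock acquisition with CAS: 0 = free, 1 = held.
memory = {0x1000: 0}
got_lock = compare_and_swap(memory, 0x1000, compare=0, new_value=1) == 0
print(got_lock, memory[0x1000])
```

In each case the returned original value is what the Completion carries back to the Requester.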

Completions - Detailed Analysis

Completions are responses to Non-Posted Requests. They may or may not contain data, and carry status information about the transaction outcome.

// Completion Header (3 DW)
DW0: Fmt=000 (Cpl) or 010 (CplD) Type=01010 | TC | Attr | TD | EP | AT | Length
DW1: Completer ID [15:0] | Cpl Status [2:0] | BCM | Byte Count [11:0]
DW2: Requester ID [15:0] | Tag [7:0] | R | Lower Address [6:0]

// Completion Status Values:
000 = Successful Completion (SC)
001 = Unsupported Request (UR)
010 = Configuration Request Retry Status (CRS)
100 = Completer Abort (CA)

Status	Code	Meaning	Action Required
Successful Completion	000	Request completed normally	Process returned data (if any)
Unsupported Request	001	Completer doesn't support this request	Report UR error; do not retry the identical request
Config Retry Status	010	Device not ready, retry later	Wait and retry the Configuration request
Completer Abort	100	Unrecoverable error at Completer	Report CA error, do not retry
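A Requester has to pull the status and Byte Count out of Completion DW1 before deciding what to do. The sketch below assumes DW1 is held as a 32-bit integer with the Completer ID in bits 31:16, and it treats a Byte Count field of zero as 4096 bytes. `CplStatus` and `decode_cpl_dw1` are hypothetical names used for illustration.

```python
from enum import IntEnum


class CplStatus(IntEnum):
    SC = 0b000   # Successful Completion
    UR = 0b001   # Unsupported Request
    CRS = 0b010  # Configuration Request Retry Status
    CA = 0b100   # Completer Abort


def decode_cpl_dw1(dw1):
    """Decode Completion header DW1.

    Assumed packing: Completer ID [31:16], Cpl Status [15:13], BCM [12], Byte Count [11:0].
    """
    raw_status = (dw1 >> 13) & 0x7
    try:
        status = CplStatus(raw_status)
    except ValueError:
        status = None  # reserved encoding, left for the caller to handle
    byte_count = dw1 & 0xFFF
    return {
        "completer_id": (dw1 >> 16) & 0xFFFF,
        "status": status,
        "bcm": (dw1 >> 12) & 0x1,
        # A Byte Count field of 0 encodes 4096 bytes.
        "byte_count": byte_count if byte_count else 4096,
    }


cpl = decode_cpl_dw1((0x0100 << 16) | (0b010 << 13) | 4)
print(cpl["status"].name, cpl["byte_count"])
```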

Split Completions

A single Read Request may result in multiple Completions (split completions) due to:

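Read Completion Boundary (RCB): When a read is split, every Completion except the last must end on a naturally aligned RCB boundary (64 or 128 bytes); a single Completion may span several RCBs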
Max Payload Size: Completion data limited by MPS setting
Completer design: May split for internal reasons

Byte Count field indicates remaining bytes including current Completion. Requester uses this to track completion progress and detect missing Completions.
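To make the Byte Count bookkeeping concrete, the sketch below models one possible Completer policy: return as much data as Max Payload Size allows per Completion while ending every Completion except the last on an RCB boundary. Other splits are legal, the name `split_completions` is hypothetical, and all bytes are assumed enabled.

```python
def split_completions(addr, nbytes, rcb=64, mps=256):
    """Model the Completions returned for one read request.

    Each entry records the Lower Address, the Byte Count (bytes remaining,
    including this Completion), and the payload size of that Completion.
    """
    if rcb not in (64, 128) or mps < 128:
        raise ValueError("unexpected RCB or MPS value")

    completions = []
    remaining = nbytes
    while remaining > 0:
        if remaining <= mps:
            end = addr + remaining               # final Completion: return the rest
        else:
            end = ((addr + mps) // rcb) * rcb    # stop at the last RCB boundary within MPS
        payload = end - addr
        completions.append({
            "lower_address": addr & 0x7F,
            "byte_count": remaining,
            "payload": payload,
        })
        addr = end
        remaining -= payload
    return completions


for c in split_completions(0x10, 512):
    print(c)
```

On the Requester side, each Completion's Byte Count should equal the previous Byte Count minus the previous payload; a mismatch suggests a missing or duplicated Completion.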

Flow Control - Complete Mechanism

PCIe uses a credit-based flow control mechanism to prevent receiver buffer overflow. Each receiver advertises available buffer space as "credits," and transmitters must have sufficient credits before sending.

Credit System Architecture

Flow Control Credit Types

Credits are tracked separately for three transaction types across header and data:

Type	Header Credits (HdrFC)	Data Credits (DataFC)	Unit
Posted (P)	PH	PD	1 header / 16 bytes (4 DW)
Non-Posted (NP)	NPH	NPD	1 header / 16 bytes (4 DW)
Completion (Cpl)	CplH	CplD	1 header / 16 bytes (4 DW)

Credits_Required = Header_Credits + ceil(Payload_Size / 16)
Example: 256-byte MWr = 1 PH + ceil(256/16) = 1 PH + 16 PD
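The credit formula above is simple enough to verify directly. This sketch uses integer arithmetic for the ceiling; `credits_required` is a hypothetical helper name.

```python
def credits_required(payload_bytes):
    """Return (header_credits, data_credits) for one TLP.

    One header credit per TLP; one data credit per 16 bytes (4 DW) of payload,
    rounded up.
    """
    if payload_bytes < 0:
        raise ValueError("payload size cannot be negative")
    data_credits = (payload_bytes + 15) // 16
    return 1, data_credits


print(credits_required(256))  # 256-byte MWr: 1 PH + 16 PD
print(credits_required(0))    # MRd carries no payload: header credit only
```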

Flow Control Initialization

After link training completes and the Physical Layer reports LinkUp, the Data Link Layer exchanges Flow Control initialization DLLPs with its link partner to advertise receive buffer capacities:

// FC Initialization Sequence
Phase 1 - InitFC1:
- Transmit InitFC1 DLLPs advertising credit values for each VC
- Receive InitFC1 from link partner
- InitFC1 contains: PH, PD, NPH, NPD, CplH, CplD for each active VC
Phase 2 - InitFC2:
- After receiving all InitFC1, transmit InitFC2 (confirms receipt)
- Receive InitFC2 from link partner
- Flow control initialization completes; for VC0 the Data Link Layer then enters DL_Active

// Credit Limits
- Infinite Credits: Indicated by advertising 0 for a credit type during initialization
- Endpoints and Root Ports typically advertise infinite CplH/CplD, because a Requester must reserve buffer space for the Completions of every request it issues
- Minimum Credits: Must support at least max TLP size allowed by MPS

Credit Updates and Consumption

Credit Update Algorithm

Transmitter: Before sending TLP, check: CREDITS_CONSUMED + Required ≤ CREDIT_LIMIT
Transmitter: After sending, increment CREDITS_CONSUMED
Receiver: After processing TLP (buffer freed), update CREDITS_ALLOCATED
Receiver: Periodically send UpdateFC DLLP carrying the current CREDITS_ALLOCATED value
Transmitter: On receiving UpdateFC, set local CREDIT_LIMIT to the received value
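The transmitter side of this algorithm can be sketched as a small counter class. Because the counters are fixed-width and wrap, the comparison is done with modular arithmetic rather than a plain less-than-or-equal test. The class name `CreditCounter` is hypothetical, and the field widths used in the example (8 bits for header credits, 12 bits for data credits) are assumptions for the non-scaled case.

```python
class CreditCounter:
    """Transmitter-side gating for one credit type (for example PH or PD)."""

    def __init__(self, initial_limit, field_bits):
        self.mod = 1 << field_bits
        # An initial advertisement of 0 means infinite credits.
        self.infinite = initial_limit == 0
        self.limit = initial_limit % self.mod   # CREDIT_LIMIT
        self.consumed = 0                       # CREDITS_CONSUMED

    def can_send(self, needed):
        if self.infinite:
            return True
        # Modular comparison so the check stays correct after counters wrap.
        return (self.limit - (self.consumed + needed)) % self.mod <= self.mod // 2

    def send(self, needed):
        if not self.can_send(needed):
            raise RuntimeError("insufficient flow control credits")
        self.consumed = (self.consumed + needed) % self.mod

    def update_fc(self, credits_allocated):
        """Apply the value carried by a received UpdateFC DLLP."""
        self.limit = credits_allocated % self.mod


pd = CreditCounter(initial_limit=16, field_bits=12)
pd.send(16)                 # a 256-byte write consumes all 16 data credits
print(pd.can_send(1))       # blocked until the receiver frees buffer space
pd.update_fc(20)            # receiver has now allocated 20 credits in total
print(pd.can_send(4))
```

Keeping one counter object per credit type and per VC mirrors how the credits are tracked separately in the text above.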

Flow Control Deadlock Prevention

PCIe avoids deadlock through careful rules:

Posted transactions cannot be blocked waiting for Non-Posted credits
Completions use separate credit pool - never blocked by requests
Switches must guarantee forward progress for Completions
UpdateFC DLLPs are always transmitted (no flow control on DLLPs)

Transaction Ordering Rules - Complete Reference

PCIe implements a producer-consumer ordering model derived from PCI. Understanding ordering rules is critical for correct system operation and performance optimization.

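Can Row pass Column?	Posted Request	Non-Posted Request	Completion
Posted Request	No (unless RO or IDO permits)	Yes	Yes
Non-Posted Request	No (unless IDO, or RO on a request with data, permits)	Permitted	Permitted
Completion	No (unless RO or IDO permits)	Yes	Permitted (Completions for the same request stay in order)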

Ordering Attributes Explained

Relaxed Ordering (RO) - Attr[1]: When set, permits this transaction to pass previously issued Posted Requests (Writes), regardless of address. Relaxes strict ordering for performance.
ID-Based Ordering (IDO) - Attr[2]: Transactions with different Requester IDs can pass each other. Enables independent streams.
No Snoop (NS) - Attr[0]: Hints that data does not require cache coherency checks. Performance optimization.
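The three attribute bits live in two different header bytes, which is easy to get wrong when decoding. The sketch below extracts them and applies one ordering decision, whether a later Posted Request may pass an earlier one. It is a simplified illustration: the function names are hypothetical, and only the RO and IDO conditions described above are modeled.

```python
def decode_attributes(byte1, byte2):
    """Extract IDO, RO, and NS from header bytes 1 and 2 of DW0."""
    return {
        "ido": bool(byte1 & 0x04),  # Attr[2], byte 1 bit 2
        "ro": bool(byte2 & 0x20),   # Attr[1], byte 2 bit 5
        "ns": bool(byte2 & 0x10),   # Attr[0], byte 2 bit 4
    }


def posted_may_pass_posted(later_attr, later_requester_id, earlier_requester_id):
    """Return True if a later Posted Request is permitted to pass an earlier one."""
    if later_attr["ro"]:
        return True
    if later_attr["ido"] and later_requester_id != earlier_requester_id:
        return True
    return False  # strict producer-consumer ordering applies


attr = decode_attributes(byte1=0x04, byte2=0x00)  # IDO set, RO clear
print(posted_may_pass_posted(attr, 0x0100, 0x0200))  # different Requester IDs
print(posted_may_pass_posted(attr, 0x0100, 0x0100))  # same Requester ID
```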

Virtual Channels (VCs)

Virtual Channels provide independent flow paths through the PCIe fabric, enabling QoS differentiation and traffic isolation.

VC0 (Default)

Mandatory for all devices. Used for general traffic and TC0.

Arbitration: Round-robin or weighted

VC1-VC7 (Optional)

Additional VCs for differentiated service. Mapped to specific TCs.

Use case: Isochronous, low-latency traffic

TC to VC Mapping

Traffic Classes (TC0-TC7) are mapped to Virtual Channels at each port. Default: All TCs map to VC0. Software configures mapping via VC Capability registers. The TC-to-VC mapping must match on both ports of a link, while the TC label itself is carried unchanged end to end.
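A TC-to-VC mapping can be represented as one 8-bit mask per VC, where bit n set means TCn is carried on that VC. The sketch below builds the lookup table and checks two basic consistency rules: TC0 maps to VC0, and no TC maps to more than one VC. The dictionary input format and the name `build_tc_to_vc` are assumptions for illustration, not a register access API.

```python
def build_tc_to_vc(tc_vc_maps):
    """Build a TC -> VC table from {vc_id: 8-bit TC mask}.

    Unmapped TCs are left as None; traffic using them would not be valid on this port.
    """
    tc_to_vc = [None] * 8
    for vc_id, mask in sorted(tc_vc_maps.items()):
        for tc in range(8):
            if mask & (1 << tc):
                if tc_to_vc[tc] is not None:
                    raise ValueError(
                        f"TC{tc} mapped to both VC{tc_to_vc[tc]} and VC{vc_id}"
                    )
                tc_to_vc[tc] = vc_id
    if tc_to_vc[0] != 0:
        raise ValueError("TC0 must map to VC0")
    return tc_to_vc


print(build_tc_to_vc({0: 0xFF}))            # default: every TC on VC0
print(build_tc_to_vc({0: 0x0F, 1: 0xF0}))   # TC4-TC7 moved to VC1
```

Running the same check with the masks read from both ports of a link is one way to confirm the two ends agree.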

Tag Management - PCIe 7.0

Tag Size	Max Outstanding	Enable Bit	PCIe Version
5-bit	32	Default (Extended Tag = 0)	PCIe 1.0+
8-bit	256	Extended Tag Field Enable	PCIe 1.0+
10-bit	1024	10-Bit Tag Requester Enable	PCIe 4.0+
14-bit	16384	14-Bit Tag Requester Enable (Flit Mode only)	PCIe 6.0+

Flit Mode TLP Differences

PCIe 6.0+ introduces Flit Mode with significant TLP header format changes optimized for 256-byte fixed Flits.

Non-Flit Mode (Legacy)

Variable TLP size (up to 4KB payload)
Per-TLP LCRC at Data Link Layer
8b/10b or 128b/130b encoding
10-bit Tags maximum
Explicit Byte Enables

Flit Mode (PCIe 6.0+)

Fixed 256-byte Flits
FEC + CRC at Physical Layer
1b/1b PAM4 encoding
14-bit Tags supported
Orthogonal Header Content (OHC)
Embedded DLP in Flit structure

                
