Locked Transactions

Deep-Dive: Legacy Lock Protocol for Atomic Read-Modify-Write Operations

1. Locked Transactions Overview

Legacy Feature Notice

Locked transactions are a legacy mechanism from PCI, retained in PCIe primarily for compatibility with PCI-to-PCIe bridges. Modern designs should use Atomic Operations (AtomicOps) instead, which provide better performance and don't require locking the entire bus hierarchy.

Locked transactions provide a mechanism for performing atomic read-modify-write operations across a PCI Express hierarchy. When a lock is held, other transactions targeting the locked resource are blocked until the lock is released.

When Locked Transactions Are Used

2. Lock Protocol

2.1 Lock Acquisition Sequence

Locked Transaction Sequence:

Requester             Switches/Bridges              Completer
    │                         │                         │
    │  MRdLk (Memory Read     │                         │
    │  Lock request)          │                         │
    │────────────────────────►│                         │
    │                         │  Forward MRdLk          │
    │                         │────────────────────────►│
    │                         │                         │
    │                         │  Locked Completion      │
    │                         │◄────────────────────────│
    │  Locked Completion      │                         │
    │◄────────────────────────│                         │
    │                         │                         │
    │  [LOCK ACQUIRED]        │  [LOCK STATE SET]       │
    │                         │                         │
    │  Additional MRdLk/MWr   │                         │
    │  (within locked region) │                         │
    │────────────────────────►│────────────────────────►│
    │                         │                         │
    │◄────────────────────────│◄────────────────────────│
    │                         │                         │
    │  Unlock Message         │                         │
    │  [LOCK RELEASE]         │                         │
    │────────────────────────►│────────────────────────►│
    │                         │                         │
    │                         │  [LOCK STATE CLEARED]   │
    │                         │                         │

Lock Acquisition Rules:

  1. Lock is acquired with a Memory Read Lock (MRdLk) request. MRdLk is a distinct TLP type, not a regular read with a "lock bit" set.
  2. A successful locked completion (CplDLk) indicates the lock was granted.
  3. Lock is held until the Root Complex sends an Unlock Message. An ordinary Memory Write does not release the lock.
  4. Lock can span multiple transactions between acquire and release.

2.2 TLP Format for Locked Transactions

Memory Read Lock TLP Header (Type 00001b = MRdLk):

  Byte 0-3 (DW0):
  ┌────┬──────┬────┬────┬──────┬─────────────────────┐
  │Fmt │ Type │ R  │TC  │ Attr │       Length        │
  │ 00 │ 00001│    │    │      │                     │
  └────┴──────┴────┴────┴──────┴─────────────────────┘
          └──────── MRdLk = Type 00001b

Type Field Encoding for Locked Requests:

  00000b = Memory Read (MRd) when Fmt indicates no data; Memory Write (MWr) when Fmt indicates data
  00001b = Memory Read Lock (MRdLk)
  01010b = Completion (Cpl / CplD)
  01011b = Completion for Locked Memory Read (CplLk / CplDLk)

Completion with Lock: The completion for a MRdLk request uses the locked completion types (CplLk without data, CplDLk with data) rather than the regular Cpl/CplD types. The switch/bridge marks its internal state as "locked" upon forwarding a successful CplDLk upstream.

Lock Release: The lock is released by an Unlock Message, which the Root Complex broadcasts downstream. A Memory Write does not release the lock; writes issued inside the locked sequence are ordinary MWr requests.
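As a rough illustration of how the Fmt and Type fields distinguish locked TLPs, the sketch below packs and unpacks the first header doubleword. The Fmt/Type values follow the PCI Express Base Specification encoding table (3-bit Fmt, 5-bit Type, 3DW headers only); the names `Tlp`, `encode_dw0`, and `decode_kind` are made up for this example, and every other DW0 field is left at zero.

```python
from enum import Enum


class Tlp(Enum):
    """(Fmt, Type) pairs for 3DW-header TLPs; names are illustrative."""
    MRD = (0b000, 0b00000)     # Memory Read, no data
    MRDLK = (0b000, 0b00001)   # Memory Read Lock, no data
    MWR = (0b010, 0b00000)     # Memory Write, with data
    CPLLK = (0b000, 0b01011)   # Locked completion without data
    CPLDLK = (0b010, 0b01011)  # Locked completion with data


def encode_dw0(kind: Tlp, length_dw: int, tc: int = 0) -> int:
    """Pack Fmt, Type, TC and Length into DW0; other fields stay zero."""
    if not 1 <= length_dw <= 1024:
        raise ValueError("length must be 1..1024 doublewords")
    if not 0 <= tc <= 7:
        raise ValueError("traffic class must be 0..7")
    fmt, typ = kind.value
    # A length of 1024 DW is encoded as 0 in the 10-bit Length field.
    return (fmt << 29) | (typ << 24) | (tc << 20) | (length_dw & 0x3FF)


def decode_kind(dw0: int) -> Tlp:
    """Recover the TLP kind from the Fmt and Type fields of DW0."""
    fmt = (dw0 >> 29) & 0x7
    typ = (dw0 >> 24) & 0x1F
    return Tlp((fmt, typ))


if __name__ == "__main__":
    dw0 = encode_dw0(Tlp.MRDLK, length_dw=1)
    print(f"MRdLk DW0 = 0x{dw0:08X}, decoded as {decode_kind(dw0).name}")
```

Note that MRd and MWr share the same Type value and differ only in Fmt, which is why the decoder has to look at both fields.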

2.3 Lock State Machine

Bridge/Switch Lock State Machine:

                  ┌─────────────────────┐
                  │      UNLOCKED       │
                  │   (Initial State)   │
                  └──────────┬──────────┘
                             │
                             │ Receive MRdLk from upstream
                             │ (forward downstream, wait for completion)
                             │
                             ▼
                  ┌─────────────────────┐
                  │    LOCK_PENDING     │
                  │  (Waiting for Cpl)  │
                  └──────────┬──────────┘
                             │
          ┌──────────────────┼──────────────────┐
          │                  │                  │
          │ Cpl with         │ Cpl with         │ Timeout/Error
          │ Successful       │ UR/CA status     │
          │ status           │                  │
          ▼                  ▼                  ▼
  ┌───────────────┐  ┌───────────────┐  ┌───────────────┐
  │    LOCKED     │  │   UNLOCKED    │  │   UNLOCKED    │
  │ (Lock Active) │  │(Lock rejected)│  │ (Lock failed) │
  └───────┬───────┘  └───────────────┘  └───────────────┘
          │
          │ Receive Unlock Message
          │ (sent by the Root Complex)
          │
          ▼
┌───────────────────┐
│     UNLOCKED      │
│  (Lock Released)  │
└───────────────────┘

While LOCKED:

  - Block all other requests to locked address region
  - Allow requests from lock owner to pass
  - Queue blocked requests until unlock
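The state machine above can be sketched as a small transition table. This is a minimal model, not an implementation of any particular switch: the names `LockState`, `Event`, and `step` are assumptions made for the example, and the release trigger is modelled as a single abstract `RELEASE` event (in the PCI Express Base Specification this corresponds to the Unlock Message).

```python
from enum import Enum, auto


class LockState(Enum):
    UNLOCKED = auto()
    LOCK_PENDING = auto()
    LOCKED = auto()


class Event(Enum):
    MRDLK = auto()        # MRdLk received from upstream
    CPL_SUCCESS = auto()  # completion with Successful status
    CPL_ERROR = auto()    # completion with UR/CA status
    TIMEOUT = auto()      # no completion arrived in time
    RELEASE = auto()      # lock release seen from the lock owner


# (current state, event) -> next state; anything not listed is ignored.
TRANSITIONS = {
    (LockState.UNLOCKED, Event.MRDLK): LockState.LOCK_PENDING,
    (LockState.LOCK_PENDING, Event.CPL_SUCCESS): LockState.LOCKED,
    (LockState.LOCK_PENDING, Event.CPL_ERROR): LockState.UNLOCKED,
    (LockState.LOCK_PENDING, Event.TIMEOUT): LockState.UNLOCKED,
    (LockState.LOCKED, Event.RELEASE): LockState.UNLOCKED,
}


def step(state: LockState, event: Event) -> LockState:
    """Return the next state, staying put on events that do not apply."""
    return TRANSITIONS.get((state, event), state)


if __name__ == "__main__":
    state = LockState.UNLOCKED
    for event in (Event.MRDLK, Event.CPL_SUCCESS, Event.RELEASE):
        state = step(state, event)
        print(event.name, "->", state.name)
```

Keeping the transitions in a table makes it easy to see that only a successful completion leads to LOCKED, and that every failure path returns to UNLOCKED.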

3. Switch and Bridge Handling

3.1 Switch Lock Behavior

Switch Lock Handling Rules:

┌─────────────────────────────────────────────────────────────────────────┐
│                                 Switch                                  │
│                                                                         │
│    Upstream Port                                                        │
│          │                                                              │
│          │◄──── MRdLk arrives                                           │
│          │                                                              │
│          ├────► Route to appropriate downstream port                    │
│          │      Mark egress port as "lock pending"                      │
│          │                                                              │
│  ────────┼───────────────────────────────────────────────               │
│          │                    │                    │                    │
│  Downstream Port 0    Downstream Port 1    Downstream Port 2            │
│          │                    │                    │                    │
│          │ LOCKED             │ BLOCKED            │ BLOCKED            │
│          │ (target)           │ (other traffic)    │ (other traffic)    │
│          ▼                    ▼                    ▼                    │
│                                                                         │
│  Rule: While any downstream port holds a lock, transactions             │
│        from other requesters to ANY port MAY be blocked                 │
│        (implementation dependent - some switches only block             │
│        transactions to the locked port)                                 │
│                                                                         │
└─────────────────────────────────────────────────────────────────────────┘

Lock Scope Options (Implementation Defined):

  1. Port-Scoped Lock:
     - Only blocks access to the specific downstream port
     - Better concurrency, more complex implementation

  2. Switch-Wide Lock:
     - Blocks all downstream traffic while lock held
     - Simpler implementation, worse performance
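The difference between the two lock scopes can be expressed as a single blocking predicate. The sketch below is only a model of the two options listed above; `LockScope`, `is_blocked`, and the way requesters and ports are identified (plain strings and integers) are assumptions made for the example.

```python
from enum import Enum
from typing import Optional


class LockScope(Enum):
    PORT = "port-scoped"
    SWITCH = "switch-wide"


def is_blocked(scope: LockScope,
               locked_port: Optional[int],
               lock_owner: Optional[str],
               requester: str,
               target_port: int) -> bool:
    """Decide whether a request must wait while a lock may be active.

    locked_port is None when no lock is held. Requests from the lock
    owner always pass; everyone else is blocked according to the scope.
    """
    if locked_port is None:
        return False          # no lock held, nothing to block
    if requester == lock_owner:
        return False          # the lock owner is never blocked
    if scope is LockScope.SWITCH:
        return True           # switch-wide: all other traffic waits
    return target_port == locked_port  # port-scoped: only the locked port


if __name__ == "__main__":
    for scope in LockScope:
        blocked = is_blocked(scope, locked_port=0, lock_owner="cpu0",
                             requester="dma1", target_port=2)
        print(f"{scope.value}: request to port 2 blocked = {blocked}")
```

With a lock held on port 0, a request from another requester to port 2 passes under the port-scoped option and waits under the switch-wide option, which is the concurrency trade-off described above.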

3.2 Root Complex Lock Handling

Root Complex Lock Behavior:

The Root Complex is the only component that may initiate locked transactions on PCI Express, typically on behalf of a CPU executing LOCK-prefixed instructions.

┌─────────────────────────────────────────────────────────────────────────┐
│                              Root Complex                               │
│                                                                         │
│        CPU Core                                                         │
│            │                                                            │
│            │ LOCK CMPXCHG [legacy_device], value                        │
│            │                                                            │
│            ▼                                                            │
│   ┌──────────────────┐                                                  │
│   │ Memory Controller│                                                  │
│   │ / Transaction    │                                                  │
│   │ Generator        │                                                  │
│   └────────┬─────────┘                                                  │
│            │                                                            │
│            │ Address maps to PCIe device                                │
│            │                                                            │
│            ▼                                                            │
│   ┌──────────────────┐                                                  │
│   │ Root Port        │                                                  │
│   │                  │                                                  │
│   │ 1. Issue MRdLk   │                                                  │
│   │ 2. Wait for Cpl  │                                                  │
│   │ 3. Hold lock     │                                                  │
│   │ 4. Send Unlock   │                                                  │
│   │ (releases lock)  │                                                  │
│   └──────────────────┘                                                  │
│                                                                         │
└─────────────────────────────────────────────────────────────────────────┘

Note: the x86 LOCK prefix is not valid with MOV, so the diagram uses CMPXCHG as a representative locked read-modify-write instruction.

Root Complex Rules:

  - MUST NOT generate MRdLk to PCI Express endpoints (they don't support lock)
  - MAY generate MRdLk to PCI Express-to-PCI bridges
  - MUST handle lock timeout gracefully
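The Root Port steps and rules above can be sketched as one guarded read-modify-write routine. Everything here is hypothetical: `FakeRootPort` is an in-memory stand-in rather than a real driver or hardware interface, a missing completion is modelled as `read_lock` returning `None`, and the target kinds are plain strings chosen for the example.

```python
class LockTimeout(Exception):
    """Raised when no locked completion arrives in time."""


class FakeRootPort:
    """In-memory stand-in for a Root Port (illustrative only)."""

    def __init__(self, memory, responds=True):
        self.memory = memory      # dict: address -> value
        self.responds = responds  # False simulates a completion timeout
        self.locked = False

    def read_lock(self, addr):
        """Model MRdLk: return the data, or None if nothing came back."""
        if not self.responds:
            return None
        self.locked = True
        return self.memory[addr]

    def write(self, addr, value):
        """Model a write issued inside the locked sequence."""
        self.memory[addr] = value

    def release_lock(self):
        """Model the end of the locked sequence."""
        self.locked = False


def locked_rmw(port, target_kind, addr, modify):
    """Run one locked read-modify-write and return the old value."""
    if target_kind == "pcie_endpoint":
        # Rule: never generate MRdLk to a PCI Express Endpoint.
        raise ValueError("PCI Express Endpoints do not support lock")
    old = port.read_lock(addr)
    if old is None:
        # Rule: handle lock timeout gracefully; no lock was established.
        raise LockTimeout(f"no locked completion for address {addr:#x}")
    try:
        port.write(addr, modify(old))
    finally:
        port.release_lock()       # always release, even if modify() fails
    return old


if __name__ == "__main__":
    port = FakeRootPort({0x10: 5})
    old = locked_rmw(port, "pcie_to_pci_bridge", 0x10, lambda v: v + 1)
    print(old, port.memory[0x10], port.locked)
```

Putting the release in a `finally` clause is one way to make sure a failure between acquire and release cannot leave the hierarchy locked.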

4. Legacy Compatibility

4.1 PCIe-to-PCI Bridge Lock Handling

PCIe-to-PCI Bridge Lock Translation:

   PCIe Side                           PCI Side
              │                          │
  MRdLk ─────►│                          │
              │ Translate to PCI         │
              │ Locked Cycle:            │
              │                          │
              │ FRAME# ─────────────────►│
              │ LOCK# ──────────────────►│ (LOCK# asserted)
              │ C/BE[3:0]# (Read) ──────►│
              │                          │
              │ ◄──────────────── TRDY#  │
              │ ◄──────────────── Data   │
              │                          │
              │ Completion ◄─────────────│
  CplDLk ◄────│ (with lock held)         │
              │                          │
              │ [Bridge holds LOCK#]     │
              │                          │
  MRdLk/MWr ─►│                          │
              │ Forward with LOCK# held  │
              │ ────────────────────────►│
              │                          │
  Unlock ────►│                          │
              │ Release LOCK#            │
              │ ────────────────────────►│ (LOCK# deasserted)
              │                          │

PCI LOCK# Signal:

  - Active-low signal
  - Asserted during locked transaction sequence
  - Prevents other masters from accessing locked resource
  - Released when the locked sequence ends (on the PCIe side, when the bridge receives the Unlock Message)

4.2 Endpoint Support

Endpoint Type               Lock Support             Description
--------------------------  -----------------------  -----------------------------------
PCI Express Endpoint        NOT supported            Must not accept MRdLk, return UR
Legacy Endpoint             Optional                 May support for PCI compatibility
PCIe-to-PCI Bridge          Required                 Must translate to PCI locked cycles
Root Complex Integrated EP  Implementation specific  Usually not supported
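The support table can be turned into a lookup that predicts how a completer answers an MRdLk. This is a sketch under stated assumptions: `LOCK_SUPPORT` and `mrdlk_completion_status` are names invented for the example, the `implements_lock` flag stands in for whatever a given optional or implementation-specific device actually does, and "SC" and "UR" abbreviate the Successful Completion and Unsupported Request statuses.

```python
# Lock support per device kind, copied from the table above.
LOCK_SUPPORT = {
    "PCI Express Endpoint": "not supported",
    "Legacy Endpoint": "optional",
    "PCIe-to-PCI Bridge": "required",
    "Root Complex Integrated EP": "implementation specific",
}


def mrdlk_completion_status(kind: str, implements_lock: bool = False) -> str:
    """Return 'SC' or 'UR' for an MRdLk sent to a device of this kind.

    implements_lock only matters where the table leaves the choice to
    the device (optional or implementation specific).
    """
    support = LOCK_SUPPORT[kind]   # raises KeyError for unknown kinds
    if support == "required":
        return "SC"
    if support == "not supported":
        return "UR"
    return "SC" if implements_lock else "UR"


if __name__ == "__main__":
    for kind in LOCK_SUPPORT:
        print(f"{kind}: {mrdlk_completion_status(kind)}")
```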

5. Modern Alternative: AtomicOps

Use AtomicOps Instead of Locked Transactions

For new designs, Atomic Operations (AtomicOps) provide superior functionality:

  • No bus-wide locking required
  • Better performance and concurrency
  • Three operations: FetchAdd, Swap, CAS (Compare-And-Swap)
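  • Optional capability introduced in PCIe 2.1: completer support is advertised in the Device Capabilities 2 register, and switches on the path must support AtomicOp routing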

See Atomic Operations Deep-Dive for details.

6. Normative Rules

Locked Transaction Rules

  1. R1: PCI Express Endpoints MUST NOT accept MRdLk requests (return Unsupported Request).
  2. R2: Legacy Endpoints MAY optionally support locked transactions.
  3. R3: Switches MUST forward MRdLk to the appropriate downstream port.
  4. R4: Switches MUST track lock state per-port or globally.
  5. R5: Root Complexes SHOULD only generate MRdLk when targeting legacy devices behind a PCI bridge.
  6. R6: PCIe-to-PCI Bridges MUST translate MRdLk to PCI locked cycles.
  7. R7: Lock MUST be released by an Unlock Message from the Root Complex; an ordinary Memory Write does not release it.
  8. R8: Lock timeout SHOULD be implemented to prevent deadlocks.
  9. R9: Completions for MRdLk MUST be returned before any subsequent requests are processed.
  10. R10: Multiple concurrent locks on the same hierarchy are NOT permitted.
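Rules R8 and R10 together describe a single lock per hierarchy that cannot be held forever. One way to picture that is a small arbiter, sketched below. The class name `HierarchyLock`, its methods, and the timeout value are all hypothetical; time is passed in explicitly so the example stays deterministic.

```python
class HierarchyLock:
    """At most one lock per hierarchy (R10), with a timeout (R8)."""

    def __init__(self, timeout: float):
        self.timeout = timeout    # illustrative value, not from any spec
        self.owner = None
        self.acquired_at = None

    def expire(self, now: float) -> bool:
        """Drop a lock that has been held for the timeout or longer."""
        if self.owner is not None and now - self.acquired_at >= self.timeout:
            self.owner = None
            self.acquired_at = None
            return True
        return False

    def acquire(self, requester: str, now: float) -> bool:
        """Grant the lock only if nobody holds an unexpired lock."""
        self.expire(now)
        if self.owner is not None:
            return False          # R10: no concurrent locks
        self.owner = requester
        self.acquired_at = now
        return True

    def release(self, requester: str) -> bool:
        """Only the current owner may release the lock."""
        if self.owner is None or requester != self.owner:
            return False
        self.owner = None
        self.acquired_at = None
        return True


if __name__ == "__main__":
    lock = HierarchyLock(timeout=1.0)
    print(lock.acquire("cpu0", now=0.0))   # first requester gets the lock
    print(lock.acquire("cpu1", now=0.5))   # refused while cpu0 holds it
    print(lock.acquire("cpu1", now=1.5))   # granted after cpu0 times out
```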