PCIe Enumeration: Complete Technical Deep-Dive

Device discovery, bus assignment, BAR configuration, resource allocation, and topology building

1. What is PCIe Enumeration?

Definition

Enumeration is the software process of discovering all PCIe devices in a system, assigning bus numbers, sizing and programming Base Address Registers (BARs), allocating resources (memory and I/O ranges), and building the device tree. It is performed by the firmware (BIOS/UEFI), the operating system, or both.

Important: Enumeration is NOT Physical Layer

Enumeration uses Configuration Read/Write TLPs through the Transaction Layer. The Physical Layer only provides the trained link; it has no involvement in device discovery or configuration.

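When Does Enumeration Occur?

  • At boot: firmware (BIOS/UEFI) performs the initial enumeration, and the operating system then scans the hierarchy again
  • On hot-plug insertion, once the Data Link Layer of the new device is active
  • On a software-requested rescan of the bus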

Enumeration Steps Overview

PCIe Enumeration Process Overview
═══════════════════════════════════════════════════════════════════════════════

  ┌─────────────────────────────────────────────────────────────────────────┐
  │ Step 1: DEVICE DISCOVERY                                                 │
  │   • Scan all possible Bus/Device/Function combinations                  │
  │   • Read Vendor ID to detect device presence                            │
  │   • Build device tree structure                                         │
  └─────────────────────────────────────────────────────────────────────────┘
                                    │
                                    ▼
  ┌─────────────────────────────────────────────────────────────────────────┐
  │ Step 2: BUS NUMBER ASSIGNMENT                                           │
  │   • Assign Primary/Secondary/Subordinate bus numbers to bridges         │
  │   • Depth-first traversal of topology                                   │
  │   • Update subordinate numbers on backtrack                             │
  └─────────────────────────────────────────────────────────────────────────┘
                                    │
                                    ▼
  ┌─────────────────────────────────────────────────────────────────────────┐
  │ Step 3: BAR SIZING                                                       │
  │   • Write 0xFFFFFFFF to each BAR                                        │
  │   • Read back to determine size and type                                │
  │   • Record resource requirements                                        │
  └─────────────────────────────────────────────────────────────────────────┘
                                    │
                                    ▼
  ┌─────────────────────────────────────────────────────────────────────────┐
  │ Step 4: RESOURCE ALLOCATION                                              │
  │   • Allocate memory ranges (MMIO, prefetchable)                         │
  │   • Allocate I/O port ranges                                            │
  │   • Configure bridge windows                                            │
  │   • Write final addresses to BARs                                       │
  └─────────────────────────────────────────────────────────────────────────┘
                                    │
                                    ▼
  ┌─────────────────────────────────────────────────────────────────────────┐
  │ Step 5: ENABLE DEVICES                                                   │
  │   • Set Memory Space Enable, I/O Space Enable, Bus Master Enable        │
  │   • Configure interrupts (MSI/MSI-X)                                    │
  │   • Enable device-specific features                                     │
  └─────────────────────────────────────────────────────────────────────────┘

2. Device Discovery Algorithm

Scanning Process

Device Discovery Pseudocode
═══════════════════════════════════════════════════════════════════════════════

function enumerate_bus(bus):
    // Scan all 32 possible devices on this bus
    for device in 0..31:
        // Check function 0 first
        vendor_id = config_read(bus, device, 0, VENDOR_ID)
        
        if vendor_id == 0xFFFF:
            continue  // No device present
        
        // Device found - configure it
        configure_function(bus, device, 0)
        
        // Check if multi-function device
        header_type = config_read(bus, device, 0, HEADER_TYPE)
        
        if (header_type & 0x80):  // Multi-function bit set
            for function in 1..7:
                vendor_id = config_read(bus, device, function, VENDOR_ID)
                if vendor_id != 0xFFFF:
                    configure_function(bus, device, function)

function configure_function(bus, device, function):
    // Read device identification
    vendor_id = config_read(bus, device, function, VENDOR_ID)
    device_id = config_read(bus, device, function, DEVICE_ID)
    class_code = config_read(bus, device, function, CLASS_CODE)
    header_type = config_read(bus, device, function, HEADER_TYPE) & 0x7F
    
    // Check if this is a bridge
    if header_type == 0x01:  // Type 1 header = Bridge
        configure_bridge(bus, device, function)
    else:  // Type 0 header = Endpoint
        configure_endpoint(bus, device, function)
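
The scan loop above can be exercised without hardware. The following is a minimal sketch, assuming a hypothetical in-memory table (FAKE_DEVICES) in place of real configuration space and register names given as plain strings; it only shows the order of the checks, not a real config access mechanism.

```python
VENDOR_ID_ABSENT = 0xFFFF
MULTI_FUNCTION_BIT = 0x80

# Hypothetical stand-in for configuration space: (bus, device, function) -> registers.
FAKE_DEVICES = {
    (0, 0, 0): {"vendor_id": 0x8086, "header_type": 0x00},
    (0, 3, 0): {"vendor_id": 0x8086, "header_type": 0x80},  # multi-function bit set
    (0, 3, 1): {"vendor_id": 0x8086, "header_type": 0x80},
}


def config_read(bus, device, function, register):
    """Return a register value, or all 1s when no function responds."""
    regs = FAKE_DEVICES.get((bus, device, function))
    if regs is None:
        return 0xFFFF if register == "vendor_id" else 0xFF
    return regs[register]


def scan_bus(bus):
    """Return the (bus, device, function) tuples that respond on one bus."""
    found = []
    for device in range(32):
        # Function 0 must exist; if it is absent, skip the whole device.
        if config_read(bus, device, 0, "vendor_id") == VENDOR_ID_ABSENT:
            continue
        found.append((bus, device, 0))
        header_type = config_read(bus, device, 0, "header_type")
        if header_type & MULTI_FUNCTION_BIT:
            # Only probe functions 1-7 when function 0 advertises multi-function.
            for function in range(1, 8):
                if config_read(bus, device, function, "vendor_id") != VENDOR_ID_ABSENT:
                    found.append((bus, device, function))
    return found
```

Calling scan_bus(0) on the table above yields function 0 of device 0 and functions 0 and 1 of device 3.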

Configuration Read Response

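  Vendor ID Value │ Meaning                          │ Action
  ────────────────┼──────────────────────────────────┼────────────────────────────────────────
  0xFFFF          │ No device present                │ Skip to next device/function
  0x0001          │ CRS (Configuration Retry Status) │ Retry after delay (device initializing)
  Valid ID        │ Device present and ready         │ Continue enumeration

  Note: 0x0001 is reported to software only when CRS Software Visibility is enabled in the Root Complex; otherwise the Root Complex reissues the request itself.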
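
The three outcomes in the table map naturally onto a polling loop. This is a hedged sketch: the callback read_vendor_id, the 10 ms delay, and the 1 second budget are illustrative assumptions, not values taken from any particular firmware or kernel.

```python
import time

VENDOR_ID_ABSENT = 0xFFFF
VENDOR_ID_CRS = 0x0001


def wait_for_vendor_id(read_vendor_id, timeout_s=1.0, delay_s=0.01, sleep=time.sleep):
    """Poll until a real Vendor ID appears.

    read_vendor_id is a caller-supplied function (hypothetical here) that
    performs one configuration read of the Vendor ID register.
    Returns the Vendor ID, or None if the device is absent or never becomes ready.
    """
    waited = 0.0
    while True:
        vendor_id = read_vendor_id()
        if vendor_id == VENDOR_ID_ABSENT:
            return None          # nothing there: skip this function
        if vendor_id != VENDOR_ID_CRS:
            return vendor_id     # device present and ready
        if waited >= timeout_s:
            return None          # still CRS after the retry budget
        sleep(delay_s)
        waited += delay_s
```

The sleep parameter exists only so the loop can be tested without real delays.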

3. Bus Number Assignment

Bridge Bus Number Registers

Type 1 Header Bus Number Fields (Offsets 18h-1Bh)
═══════════════════════════════════════════════════════════════════════════════

  Offset 18h: Primary Bus Number
    └── Bus number of the bus on the upstream (primary) side of the bridge
  
  Offset 19h: Secondary Bus Number
    └── Bus number of the bus immediately downstream of the bridge
  
  Offset 1Ah: Subordinate Bus Number
    └── Highest numbered bus downstream of this bridge
  
  Offset 1Bh: Secondary Latency Timer (legacy, typically 0)

Example:

                    Root Complex
                    (Bus 0)
                        │
                  ┌─────┴─────┐
                  │  Bridge   │  Primary=0, Secondary=1, Subordinate=4
                  │  00:01.0  │
                  └─────┬─────┘
                        │ Bus 1
          ┌─────────────┼─────────────┐
          │             │             │
    ┌─────┴─────┐ ┌─────┴─────┐ ┌─────┴─────┐
    │ Endpoint  │ │  Bridge   │ │  Bridge   │
    │  01:00.0  │ │  01:01.0  │ │  01:02.0  │
    └───────────┘ │ P=1,S=2,  │ │ P=1,S=3,  │
                  │ Sub=2     │ │ Sub=4     │
                  └─────┬─────┘ └─────┬─────┘
                        │             │
                  ┌─────┴─────┐ ┌─────┴─────┐
                  │ Endpoint  │ │  Bridge   │
                  │  02:00.0  │ │  03:00.0  │
                  └───────────┘ │ P=3,S=4,  │
                                │ Sub=4     │
                                └─────┬─────┘
                                      │
                                ┌─────┴─────┐
                                │ Endpoint  │
                                │  04:00.0  │
                                └───────────┘

Bus Assignment Algorithm

Depth-First Bus Number Assignment
═══════════════════════════════════════════════════════════════════════════════

function configure_bridge(bus, device, function):
    // Get next available bus number
    secondary_bus = next_bus_number++
    
    // Set Primary = current bus
    config_write(bus, device, function, PRIMARY_BUS, bus)
    
    // Set Secondary = new bus
    config_write(bus, device, function, SECONDARY_BUS, secondary_bus)
    
    // Temporarily set Subordinate to max (255) to allow access
    config_write(bus, device, function, SUBORDINATE_BUS, 255)
    
    // Recursively enumerate downstream bus
    enumerate_bus(secondary_bus)
    
    // Now we know the highest bus number used downstream
    subordinate_bus = next_bus_number - 1
    
    // Update subordinate to actual value
    config_write(bus, device, function, SUBORDINATE_BUS, subordinate_bus)

Important Rules:
  • Bus 0 is always the Root Complex
  • Bus numbers must be contiguous within a hierarchy
  • Subordinate ≥ Secondary
  • Maximum 256 buses (0-255)
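
To check the depth-first numbering against the example topology, the algorithm can be run on a small tree of bridge objects. This is a sketch under stated assumptions: the Bridge class is hypothetical, endpoints are omitted because they consume no bus numbers, and overflow past bus 255 is not handled.

```python
class Bridge:
    """Hypothetical model of a Type 1 function and the bridges below it."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)  # downstream bridges only
        self.primary = self.secondary = self.subordinate = None


def assign_bus_numbers(bridge, primary_bus, next_bus):
    """Assign bus numbers depth-first and return the next unused bus number."""
    bridge.primary = primary_bus
    bridge.secondary = next_bus
    bridge.subordinate = 0xFF  # temporarily open so downstream buses are reachable
    next_bus += 1
    for child in bridge.children:
        next_bus = assign_bus_numbers(child, bridge.secondary, next_bus)
    bridge.subordinate = next_bus - 1  # highest bus actually used downstream
    return next_bus


# Topology from the example diagram (bridges only).
b_03_00 = Bridge("03:00.0")
b_01_02 = Bridge("01:02.0", [b_03_00])
b_01_01 = Bridge("01:01.0")
root_port = Bridge("00:01.0", [b_01_01, b_01_02])
assign_bus_numbers(root_port, primary_bus=0, next_bus=1)

for b in (root_port, b_01_01, b_01_02, b_03_00):
    print(f"{b.name}: P={b.primary} S={b.secondary} Sub={b.subordinate}")
```

The printed values match the Primary/Secondary/Subordinate numbers shown in the diagram.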

4. BAR (Base Address Register) Configuration

BAR Types

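  Bit 0 │ Bits 2:1 (Memory) │ Type              │ Description
  ──────┼───────────────────┼───────────────────┼────────────────────────────
    0   │        00         │ Memory, 32-bit    │ 32-bit address space
    0   │        10         │ Memory, 64-bit    │ 64-bit, uses two BAR slots
    0   │        01         │ Memory (reserved) │ Legacy 1MB addressing
    1   │        -          │ I/O               │ I/O port address space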
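
The encoding can be decoded with a few bit tests. The sketch below assumes a raw 32-bit BAR value as input and also reports bit 3, the prefetchable flag covered under BAR Memory Types; the function name and the returned dictionary layout are illustrative choices.

```python
def decode_bar_type(bar_value):
    """Classify a raw BAR value by its low type bits."""
    if bar_value & 0x1:
        return {"space": "io"}
    mem_type = (bar_value >> 1) & 0x3
    # 00 = 32-bit, 10 = 64-bit; other encodings are reserved, reported as None.
    width = {0b00: 32, 0b10: 64}.get(mem_type)
    return {
        "space": "memory",
        "width": width,
        "prefetchable": bool(bar_value & 0x8),
    }
```

For example, a value ending in 0xC (binary 1100) decodes as a 64-bit prefetchable memory BAR.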

BAR Sizing Algorithm

BAR Size Determination Process
═══════════════════════════════════════════════════════════════════════════════

Step 1: Save original BAR value
    original = config_read(bus, dev, func, BAR_N)

Step 2: Write all 1s to BAR
    config_write(bus, dev, func, BAR_N, 0xFFFFFFFF)

Step 3: Read back the value
    sizing_value = config_read(bus, dev, func, BAR_N)

Step 4: Decode size and type
    if sizing_value == 0:
        BAR is not implemented
    else:
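        // Clear the low type bits (3:0 for a memory BAR, 1:0 for an I/O BAR),
        // invert, add 1, and keep only the low 32 bits
        mask = 0xF if memory BAR else 0x3
        size = (~(sizing_value & ~mask) + 1) & 0xFFFFFFFF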

Step 5: Restore original (or leave for later assignment)
    config_write(bus, dev, func, BAR_N, original)

Example - Memory BAR Sizing:

  Write 0xFFFFFFFF to BAR
  Read back: 0xFFF00000
  
  Mask type bits:     0xFFF00000 & ~0xF = 0xFFF00000
  Invert:             ~0xFFF00000       = 0x000FFFFF
  Add 1:              0x000FFFFF + 1    = 0x00100000 = 1 MB
  
  Bit 3 (Prefetchable): (original >> 3) & 1
  Bits 2:1 (Type):      (original >> 1) & 3  (00=32bit, 10=64bit)

64-bit BAR Example:

  BAR[N]   sizing = 0xFFC00004 (bits 2:1 = 10, so this is a 64-bit BAR)
  BAR[N+1] sizing = 0xFFFFFFFF (upper 32 bits)
  
  Combined (type bits masked): 0xFFFFFFFF_FFC00000
  Size = ~0xFFFFFFFF_FFC00000 + 1 = 0x00400000 = 4 MB
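
The mask, invert, add-one arithmetic is easy to get wrong in languages with unbounded integers, so here is a small sketch that reproduces both worked examples. The function name is hypothetical, and the handling of I/O BARs whose upper 16 bits read as zero is an added assumption about how such devices are commonly treated.

```python
def bar_size(sizing_low, sizing_high=None):
    """Return the BAR size in bytes from the value(s) read back after writing all 1s.

    sizing_low  : read-back of BAR[N]
    sizing_high : read-back of BAR[N+1] for a 64-bit BAR, else None
    """
    if sizing_low == 0:
        return 0  # BAR not implemented
    if sizing_low & 0x1:
        # I/O BAR: only bits 1:0 are non-address bits. Some devices decode
        # just 16 address bits, so the upper half may read back as zero.
        limit = 0xFFFF if (sizing_low >> 16) == 0 else 0xFFFFFFFF
        return (~(sizing_low & ~0x3) + 1) & limit
    masked = sizing_low & ~0xF & 0xFFFFFFFF  # memory BAR: drop type bits 3:0
    if sizing_high is None:
        return (~masked + 1) & 0xFFFFFFFF
    combined = (sizing_high << 32) | masked
    return (~combined + 1) & 0xFFFFFFFFFFFFFFFF


print(hex(bar_size(0xFFF00000)))               # 32-bit example: 1 MB
print(hex(bar_size(0xFFC00004, 0xFFFFFFFF)))   # 64-bit example: 4 MB
```

The explicit width masks stand in for the fixed-width registers that C code would get implicitly from uint32_t or uint64_t.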

BAR Memory Types

Non-Prefetchable Memory
  • Bit 3 = 0
  • Read side effects possible
  • No speculative reads allowed
  • No write combining
  • Examples: Control registers, FIFOs
Prefetchable Memory
  • Bit 3 = 1
  • No read side effects
  • Speculative reads OK
  • Write combining allowed
  • Examples: Frame buffers, DMA buffers

5. Resource Allocation

Bridge Window Configuration

Bridge Resource Windows (Type 1 Header)
═══════════════════════════════════════════════════════════════════════════════

I/O Window (Offsets 1Ch-1Dh, 30h-33h):
  ┌──────────────────────────────────────────────────────────────────────────┐
  │ I/O Base (1Ch):     Upper 4 bits of 16-bit I/O base address             │
  │ I/O Limit (1Dh):    Upper 4 bits of 16-bit I/O limit address            │
  │ I/O Base Upper (30h-31h):  Upper 16 bits (for 32-bit I/O)               │
  │ I/O Limit Upper (32h-33h): Upper 16 bits (for 32-bit I/O)               │
  │ Granularity: 4KB (12-bit aligned)                                       │
  └──────────────────────────────────────────────────────────────────────────┘

Memory Window (Offsets 20h-23h):
  ┌──────────────────────────────────────────────────────────────────────────┐
  │ Memory Base (20h-21h):  Upper 12 bits of 32-bit base                    │
  │ Memory Limit (22h-23h): Upper 12 bits of 32-bit limit                   │
  │ Granularity: 1MB (20-bit aligned)                                       │
  │ Used for: Non-prefetchable MMIO                                         │
  └──────────────────────────────────────────────────────────────────────────┘

Prefetchable Memory Window (Offsets 24h-2Fh):
  ┌──────────────────────────────────────────────────────────────────────────┐
  │ Prefetch Base (24h-25h):       Upper 12 bits of base                    │
  │ Prefetch Limit (26h-27h):      Upper 12 bits of limit                   │
  │ Prefetch Base Upper (28h-2Bh): Upper 32 bits (64-bit capable)           │
  │ Prefetch Limit Upper (2Ch-2Fh): Upper 32 bits                           │
  │ Granularity: 1MB                                                        │
  │ Used for: Prefetchable MMIO (GPU VRAM, etc.)                            │
  └──────────────────────────────────────────────────────────────────────────┘
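
Because the Memory Base and Memory Limit registers hold only address bits 31:20, a window is programmed by shifting the addresses into register bits 15:4. The following sketch assumes the non-prefetchable 32-bit window and a hypothetical helper name; it rejects ranges that do not respect the 1MB granularity.

```python
MB = 1 << 20


def encode_memory_window(base, limit):
    """Return (Memory Base, Memory Limit) 16-bit register values for [base, limit]."""
    if not 0 <= base <= limit <= 0xFFFFFFFF:
        raise ValueError("window must lie within the 32-bit address space")
    if base % MB != 0 or (limit + 1) % MB != 0:
        raise ValueError("window must start and end on 1MB boundaries")
    # Address bits 31:20 land in register bits 15:4; the low 4 bits stay zero.
    base_reg = (base >> 16) & 0xFFF0
    limit_reg = (limit >> 16) & 0xFFF0
    return base_reg, limit_reg
```

For a 16MB window at 0xD0000000 through 0xD0FFFFFF this gives a base register of 0xD000 and a limit register of 0xD0F0; the hardware treats the low 20 bits of the limit as all 1s.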

Resource Allocation Algorithm

Bottom-Up Resource Allocation
═══════════════════════════════════════════════════════════════════════════════

Phase 1: Calculate Requirements (Bottom-Up)
  
  For each device/bridge (leaf to root):
    • Sum all BAR requirements
    • For bridges: sum all downstream requirements
    • Align to bridge window granularity (1MB for memory)
    • Track prefetchable vs non-prefetchable separately

Phase 2: Assign Addresses (Top-Down)

  Starting from Root Complex with available ranges:
    • Allocate largest requests first (reduce fragmentation)
    • Assign base address to each resource
    • Configure bridge windows to span downstream devices
    • Write addresses to BARs

Example Resource Map:

  System Memory:        0x00000000 - 0xBFFFFFFF (3GB)
  MMIO Region:          0xC0000000 - 0xFFFFFFFF (1GB)
  
  Bridge 1 Window:      0xC0000000 - 0xCFFFFFFF (256MB)
    ├── GPU BAR0:       0xC0000000 - 0xC7FFFFFF (128MB prefetch)
    ├── GPU BAR2:       0xC8000000 - 0xC800FFFF (64KB non-prefetch)
    └── NIC BAR0:       0xC8010000 - 0xC801FFFF (64KB non-prefetch)
  
  Bridge 2 Window:      0xD0000000 - 0xD0FFFFFF (16MB)
    └── NVMe BAR0:      0xD0000000 - 0xD0003FFF (16KB)
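
The "largest first, naturally aligned" placement in Phase 2 can be sketched in a few lines. This is a simplified illustration, assuming power-of-two BAR sizes and a single window; it ignores the separate prefetchable and non-prefetchable windows that a real allocator must track, and the function names are hypothetical.

```python
def align_up(value, alignment):
    """Round value up to a multiple of alignment (alignment must be a power of two)."""
    return (value + alignment - 1) & ~(alignment - 1)


def allocate_bars(requests, window_base, window_limit):
    """Place (name, size) requests largest first, each naturally aligned.

    Returns {name: (base, end)} with inclusive end addresses.
    """
    assigned = {}
    cursor = window_base
    # sorted() is stable, so equal-sized requests keep their input order.
    for name, size in sorted(requests, key=lambda r: r[1], reverse=True):
        base = align_up(cursor, size)
        end = base + size - 1
        if end > window_limit:
            raise ValueError(f"window exhausted while placing {name}")
        assigned[name] = (base, end)
        cursor = end + 1
    return assigned


layout = allocate_bars(
    [("GPU BAR2", 64 << 10), ("GPU BAR0", 128 << 20), ("NIC BAR0", 64 << 10)],
    window_base=0xC0000000,
    window_limit=0xCFFFFFFF,
)
for name, (base, end) in layout.items():
    print(f"{name}: {base:#010x} - {end:#010x}")
```

With the Bridge 1 devices from the example map as input, the resulting addresses match the map.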

6. Command Register Configuration

Command Register (Offset 04h) - Enable Device
═══════════════════════════════════════════════════════════════════════════════

  Bit │ Name                    │ Description
  ────┼─────────────────────────┼───────────────────────────────────────────
   0  │ I/O Space Enable        │ Respond to I/O BAR accesses
   1  │ Memory Space Enable     │ Respond to Memory BAR accesses
   2  │ Bus Master Enable       │ Allow device to initiate transactions
   3  │ Special Cycles          │ (Legacy, usually 0)
   4  │ Memory Write & Inval    │ (Legacy, usually 0)
   5  │ VGA Palette Snoop       │ (Legacy, usually 0)
   6  │ Parity Error Response   │ Enable parity error reporting
   7  │ Reserved                │
   8  │ SERR# Enable            │ Enable system error reporting
   9  │ Fast B2B Enable         │ (Legacy, usually 0)
  10  │ INTx Disable            │ Disable legacy interrupts (use MSI)
  11+ │ Reserved                │

Typical Enumeration Sequence:

  1. Initial state: Command = 0x0000 (all disabled)
  
  2. Configure BARs: (still disabled, safe to program)
     config_write(BAR0, allocated_address)
  
  3. Enable device:
     config_write(COMMAND, 0x0006)  // Memory + Bus Master
     
     Or for device with I/O BARs:
     config_write(COMMAND, 0x0007)  // I/O + Memory + Bus Master
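
The magic numbers 0x0006 and 0x0007 are just combinations of the bits in the table. A small sketch, with hypothetical names, that builds the value from named flags:

```python
from enum import IntFlag


class Command(IntFlag):
    """Command register bits (offset 04h) used during enumeration."""
    IO_SPACE = 1 << 0
    MEMORY_SPACE = 1 << 1
    BUS_MASTER = 1 << 2
    PARITY_ERROR_RESPONSE = 1 << 6
    SERR_ENABLE = 1 << 8
    INTX_DISABLE = 1 << 10


def enable_value(has_io_bars=False, disable_intx=False):
    """Return the Command register value to write once BARs are programmed."""
    value = Command.MEMORY_SPACE | Command.BUS_MASTER
    if has_io_bars:
        value |= Command.IO_SPACE
    if disable_intx:
        value |= Command.INTX_DISABLE  # typically set when MSI/MSI-X is in use
    return int(value)
```

Real code would normally read the register, OR in the new bits, and write it back, rather than overwrite the whole register.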

7. Multi-Function Device Handling

Multi-Function Device Detection
═══════════════════════════════════════════════════════════════════════════════

Header Type Register (Offset 0Eh):

  Bit 7:   Multi-Function Device flag
  Bits 6:0: Header Type (0 = Endpoint, 1 = Bridge, 2 = CardBus)

Enumeration Logic:

  header_type = config_read(bus, dev, 0, 0x0E)
  
  if (header_type & 0x80):
      // Multi-function: scan functions 0-7
      for func in 0..7:
          if config_read(bus, dev, func, VENDOR_ID) != 0xFFFF:
              configure_function(bus, dev, func)
  else:
      // Single function: only function 0 exists
      configure_function(bus, dev, 0)

Example: Multi-Function Network Adapter

  Bus 3, Device 0:
    Function 0: Ethernet Port 1  (vendor_id = 0x8086)
    Function 1: Ethernet Port 2  (vendor_id = 0x8086)
    Function 2: 0xFFFF (not present)
    Function 3: 0xFFFF (not present)
    ...
    Function 7: 0xFFFF (not present)

ARI (Alternative Routing-ID Interpretation):

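  ARI (commonly used by SR-IOV devices) merges the Device and Function fields into one 8-bit Function number:
    Standard: 5-bit Device + 3-bit Function = 8 functions max per device
    ARI:      8-bit Function = 256 functions max (Device number is implicitly 0)
    
  Requires:
    • ARI Extended Capability in the device
    • ARI Forwarding enabled in the Downstream Port directly above the device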
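
The difference between the two interpretations is only in how the low 8 bits of the 16-bit Routing ID are split. A minimal sketch, with a hypothetical function name:

```python
def decode_rid(rid, ari=False):
    """Split a 16-bit Routing ID into (bus, device, function)."""
    bus = (rid >> 8) & 0xFF
    if ari:
        # ARI: the whole low byte is the Function number; Device is implicitly 0.
        return bus, 0, rid & 0xFF
    return bus, (rid >> 3) & 0x1F, rid & 0x7


print(decode_rid(0x0319))            # standard interpretation
print(decode_rid(0x0319, ari=True))  # ARI interpretation
```

The same Routing ID 0x0319 is device 3, function 1 on bus 3 in the standard view, and function 25 of device 0 under ARI.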

8. Hot-Plug Enumeration

Hot-Plug Enumeration Sequence
═══════════════════════════════════════════════════════════════════════════════

Device Insertion:

  1. Physical Connection
     │ User inserts card into slot
     │ MRL sensor detects (if present)
     │ Attention Button pressed (if required)
     ▼
  2. Power-Up Sequence
     │ Slot Controller powers up slot
     │ Power Indicator set to ON
     │ Device performs internal initialization
     ▼
  3. Link Training
     │ Physical Layer LTSSM: Detect → Polling → Config → L0
     │ Data Link Layer: FC Init → DL_Active
     │ Hot-Plug Controller receives Data Link Layer State Changed
     ▼
  4. OS Notification
     │ Hot-Plug interrupt generated (MSI or INTx)
     │ OS reads Slot Status register
     │ Presence Detect Changed / DLL State Changed bits set
     ▼
  5. Enumeration
     │ OS performs configuration access to new device
     │ May return CRS initially (device not ready)
     │ Eventually returns valid Vendor ID
     │ OS allocates resources (may require rebalancing)
     │ Device driver loaded
     ▼
  6. Device Ready
     │ Device fully operational

Device Removal:

  1. Attention Button pressed (orderly) or card pulled (surprise)
  2. Data Link Layer goes down (DLL State Changed)
  3. OS notified via interrupt
  4. OS quiesces driver, releases resources
  5. Slot power turned off (orderly removal)
  6. Resources freed for reallocation
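
Both sequences start with the OS reading the Slot Status register after a hot-plug interrupt. The sketch below shows one possible way to turn those bits into a decision. The bit positions follow the Slot Status layout in the PCI Express Capability; the decision labels and the link_active parameter (which a real driver would take from the Link Status register) are illustrative assumptions.

```python
from enum import IntFlag


class SlotStatus(IntFlag):
    ATTENTION_BUTTON_PRESSED = 1 << 0
    POWER_FAULT_DETECTED = 1 << 1
    MRL_SENSOR_CHANGED = 1 << 2
    PRESENCE_DETECT_CHANGED = 1 << 3
    COMMAND_COMPLETED = 1 << 4
    MRL_SENSOR_STATE = 1 << 5
    PRESENCE_DETECT_STATE = 1 << 6
    DLL_STATE_CHANGED = 1 << 8


def classify_hotplug_event(slot_status, link_active):
    """Map a Slot Status snapshot to a coarse action (labels are hypothetical)."""
    present = bool(slot_status & SlotStatus.PRESENCE_DETECT_STATE)
    if slot_status & SlotStatus.DLL_STATE_CHANGED:
        # Link came up: safe to enumerate. Link went down: treat as removal.
        return "enumerate" if link_active else "remove"
    if slot_status & SlotStatus.PRESENCE_DETECT_CHANGED:
        # Card seen but link not reported yet: wait for DLL Active first.
        return "wait_for_link" if present else "remove"
    if slot_status & SlotStatus.ATTENTION_BUTTON_PRESSED:
        return "attention"
    return "ignore"
```

A real handler would also clear the changed bits (they are write-1-to-clear) and drive the power and indicator controls, which this sketch leaves out.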

9. SR-IOV Virtual Function Enumeration

SR-IOV VF Enumeration
═══════════════════════════════════════════════════════════════════════════════

PF (Physical Function) Enumeration:
  • Standard enumeration discovers PF as normal endpoint
  • SR-IOV Extended Capability indicates VF support
  • PF BARs configured normally

VF Creation Process:

  1. Read SR-IOV Capability
     ├── TotalVFs: Maximum VFs supported
     ├── InitialVFs: Number of VFs initially associated with the PF
     ├── VF Offset: Routing ID offset from the PF to its first VF
     └── VF Stride: Routing ID increment between consecutive VFs
  
  2. Configure VF BARs (in SR-IOV capability)
     ├── VF BAR0-5: Size determined like standard BARs
     └── Each VF gets same-sized slice of VF BAR space
  
  3. Set NumVFs (number of VFs to create)
  
  4. Set VF Enable = 1
     └── VFs appear at calculated Routing IDs

VF Routing ID Calculation:

  VF_RID[n] = PF_RID + VF_Offset + (n × VF_Stride)
  
  Example:
    PF at Bus 5, Device 0, Function 0 (RID = 0x500)
    VF Offset = 0x100
    VF Stride = 0x001
    
    VF0: 0x500 + 0x100 + 0×1 = 0x600 → Bus 6, Dev 0, Func 0
    VF1: 0x500 + 0x100 + 1×1 = 0x601 → Bus 6, Dev 0, Func 1
    VF2: 0x500 + 0x100 + 2×1 = 0x602 → Bus 6, Dev 0, Func 2
    ...
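
The formula above, with n counted from 0, translates directly into code. A minimal sketch with hypothetical function and parameter names:

```python
def vf_routing_ids(pf_rid, vf_offset, vf_stride, num_vfs):
    """Return the Routing IDs of VF0..VF(num_vfs-1) for one PF."""
    return [(pf_rid + vf_offset + n * vf_stride) & 0xFFFF for n in range(num_vfs)]


rids = vf_routing_ids(pf_rid=0x0500, vf_offset=0x100, vf_stride=0x001, num_vfs=3)
print([hex(r) for r in rids])        # Routing IDs of VF0, VF1, VF2
print(sorted({r >> 8 for r in rids}))  # bus numbers the VFs occupy
```

With the example values the VFs land on bus 6, so the bridge above the PF needs a subordinate bus number of at least 6 for them to be reachable.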

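VF Configuration Space:
  • VFs implement a reduced configuration space
  • VF BAR registers live in the PF's SR-IOV capability; each VF's address range is calculated from them
  • VFs have their own capability list (PCI Express capability, typically MSI-X), but many fields are hardwired or controlled through the PF
  • Vendor ID and Device ID read 0xFFFF in a VF; the VF Device ID is in the PF's SR-IOV capability and the Vendor ID matches the PF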

10. Linux Enumeration Example

Linux PCIe Enumeration Flow
═══════════════════════════════════════════════════════════════════════════════

Boot-time Flow:

  BIOS/UEFI performs initial enumeration
        │
        ▼
  Linux kernel starts
        │
        ▼
  pci_subsys_init()
        │
        ├── pcibios_init()           // Architecture-specific init
        │
        ├── pci_acpi_init()          // Parse MCFG table for ECAM
        │
        └── pci_scan_root_bus()      // Scan from root complex
              │
              ├── pci_scan_child_bus()
              │     │
              │     └── pci_scan_slot()
              │           │
              │           └── pci_scan_single_device()
              │                 │
              │                 └── pci_device_add()
              │
              └── pci_assign_unassigned_resources()
                    │
                    └── pci_bus_assign_resources()

Useful Commands:

  # List all PCI devices
  lspci -v
  
  # Show tree view with bus numbers
  lspci -tv
  
  # Show device configuration space
  lspci -xxx -s 00:1f.0
  
  # Show resource allocation
  cat /proc/iomem
  cat /proc/ioports
  
  # Rescan PCI bus
  echo 1 > /sys/bus/pci/rescan
  
  # Remove device
  echo 1 > /sys/bus/pci/devices/0000:03:00.0/remove
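
The addresses used by lspci and sysfs follow the pattern [domain:]bus:device.function, all in hexadecimal. When scripting around these commands it helps to parse that form; the sketch below is one way to do it, with a hypothetical function name.

```python
import re

# Optional 4-digit domain, then bus:device.function, all hexadecimal.
_BDF = re.compile(r"^(?:([0-9a-fA-F]{4}):)?([0-9a-fA-F]{2}):([0-9a-fA-F]{2})\.([0-7])$")


def parse_bdf(address):
    """Parse '0000:03:00.0' or '03:00.0' into (domain, bus, device, function)."""
    match = _BDF.match(address)
    if match is None:
        raise ValueError(f"not a PCI address: {address!r}")
    domain, bus, device, function = match.groups()
    parsed = (int(domain or "0", 16), int(bus, 16), int(device, 16), int(function, 16))
    if parsed[2] > 0x1F:
        raise ValueError("device number must be 0-31")
    return parsed
```

The short form without a domain, as in the lspci -s example above, is treated as domain 0.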

11. Enumeration Rules and Best Practices

Normative Enumeration Rules

  1. CRS Handling: Software must retry configuration access when CRS is returned (Vendor ID = 0x0001)
  2. Bus Number Ordering: Secondary Bus Number must be greater than Primary Bus Number
  3. Subordinate Accuracy: Subordinate Bus Number must accurately reflect highest downstream bus
  4. BAR Alignment: Allocated addresses must be naturally aligned to BAR size
  5. Bridge Windows: Bridge windows must encompass all downstream resources
  6. 64-bit BARs: Must use consecutive BAR pairs; second BAR holds upper 32 bits
  7. Enable Ordering: Configure all BARs before setting Memory/IO Space Enable
  8. Bus Master: Must set Bus Master Enable before device can initiate transactions
  9. Hot-Plug: Must wait for DLL Active before accessing hot-plugged device
  10. Power-On: Device must be ready within 1 second of power stable (or return CRS)
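
Several of these rules are mechanical enough to check in software. The following sketch covers rules 2 to 5 only; the function names and message strings are hypothetical, and it is meant as a debugging aid rather than a complete validator.

```python
def check_bridge_bus_numbers(primary, secondary, subordinate):
    """Return a list of rule violations for one bridge (empty if none)."""
    problems = []
    if not all(0 <= n <= 255 for n in (primary, secondary, subordinate)):
        problems.append("bus numbers must be in 0-255")
    if secondary <= primary:
        problems.append("secondary must be greater than primary")
    if subordinate < secondary:
        problems.append("subordinate must be >= secondary")
    return problems


def is_naturally_aligned(address, size):
    """True if size is a power of two and address is a multiple of it."""
    return size > 0 and size & (size - 1) == 0 and address % size == 0


def window_covers(window, resources):
    """True if every (base, end) resource lies inside the (base, end) window."""
    w_base, w_end = window
    return all(w_base <= base and end <= w_end for base, end in resources)
```

For instance, the bridge at 01:02.0 in the bus numbering example (Primary 1, Secondary 3, Subordinate 4) passes the first check with no violations.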