Comments for CXL.cache transaction layer specification--clarifications needed (2)

The version we are using: CXL 3.0 spec

Relevant sections in spec: 3.2.5.2, 3.2.5.5

Suggestion 1

Section 3.2.5.5 mandates that a device may have at most one snoop pending for a given address. What it does not disallow is two snoops to the same address being pending at two different devices.

3.2.5.5 Multiple Snoops to the Same Address

The host is only allowed to have one snoop pending at a time per cacheline address per device. The host must wait until it has received both the snoop response and all IWB data (if any) before dispatching the next snoop to that address.
This seems to imply that CXL permits two simultaneous snoops to the same cacheline, as long as they target different devices. This could lead to either deadlock or incoherence, depending on which interpretation one chooses for rule 3.2.5.2.

3.2.5.2 Device/Host Snoop-GO-Data Assumptions

... When the host is sending a snoop to the device, the requirement is that no GO response will be sent to any requests with that address in the device until after the Host has received a response for the snoop and all implicit writeback (IWB) data (dirty data forwarded in response to a snoop) has been received. ...
Rule 3.2.5.2 says that no GO may be sent before the snoop's result has been received. If "receiving" a snoop response means processing the response and sending the corresponding H2D Response (GO) to the requester, then neither snoop response 1 nor snoop response 2 can be processed: processing either one would send a GO while the other snoop is still outstanding. This is a deadlock.
Two snoops leading to deadlock
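The circular dependency can be sketched as a small wait-for graph. This is only an illustration of the argument above, not anything taken from the spec: the node names are made up, and it assumes that snoop 1 serves request 1 and targets the device that issued request 2, and vice versa.

```python
# Toy wait-for graph for the "atomic receive" reading of rule 3.2.5.2.
# An edge X -> Y means "X cannot complete until Y has completed".
# All node names are hypothetical labels for the events in the text.
waits_for = {
    # Atomic reading: receiving response 1 includes sending GO 1.
    "receive_rsp_1": ["send_GO_1"],
    # GO 1 goes to the device that still has snoop 2 pending.
    "send_GO_1": ["receive_rsp_2"],
    # Symmetric for the other request.
    "receive_rsp_2": ["send_GO_2"],
    "send_GO_2": ["receive_rsp_1"],
}


def find_cycle(graph, start):
    """Return a list of nodes forming a cycle reachable from start, or None."""
    stack = []
    on_stack = set()
    visited = set()

    def visit(node):
        visited.add(node)
        stack.append(node)
        on_stack.add(node)
        for nxt in graph.get(node, []):
            if nxt in on_stack:
                # Found a back edge: slice out the cycle and close it.
                return stack[stack.index(nxt):] + [nxt]
            if nxt not in visited:
                cycle = visit(nxt)
                if cycle:
                    return cycle
        stack.pop()
        on_stack.discard(node)
        return None

    return visit(start)


cycle = find_cycle(waits_for, "receive_rsp_1")
print(" -> ".join(cycle))
```

Every event in the graph sits on the cycle, so under this reading neither response can ever be processed.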
If we instead allow receiving a snoop response and sending its corresponding GO message to be non-atomic, then a snoop response can be buffered somewhere in the host and still be considered "received". The following sequence then becomes possible:
Two snoops leading to deadlock
In this sequence, one snoop response (for example, RspIHitI (2) in the picture) is considered received first, which allows the other response to be consumed and to generate a GO first (GO-M (1) in the picture). This in turn allows the snoop response being held at the host to be processed, generating a second GO (GO-S (2)). The same cacheline has now been granted in M state to one requester and in S state to another, which is clearly a coherence violation. Either way, there is an inconsistency in the specification that needs to be fixed.
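The sequence can be replayed in a few lines to show that rule 3.2.5.2 is never violated along the way, yet the end state is incoherent. This is a rough sketch under assumptions: the device names A and B, the helper names, and the mapping of requests to devices (A issued request 1 and is the target of snoop 2; B issued request 2 and is the target of snoop 1) are not from the spec, and the host's own directory logic is not modelled.

```python
def violates_swmr(state):
    """True if the single-writer/multiple-reader invariant is broken."""
    writers = sum(s in ("M", "E") for s in state.values())
    sharers = sum(s == "S" for s in state.values())
    return writers > 1 or (writers == 1 and sharers > 0)


def replay_non_atomic_sequence():
    # Both devices start without the line; each has one snoop outstanding.
    state = {"A": "I", "B": "I"}
    snoop_pending_at = {"A": True, "B": True}

    def mark_received(device):
        # Non-atomic reading: buffering the response already counts
        # as "received", even if no GO has been produced from it yet.
        snoop_pending_at[device] = False

    def send_go(device, new_state):
        # Rule 3.2.5.2 as modelled here: no GO to a device whose snoop
        # response has not been received yet.
        if snoop_pending_at[device]:
            raise RuntimeError(f"GO to {device} would break rule 3.2.5.2")
        state[device] = new_state

    mark_received("A")   # RspIHitI (2) is buffered in the host
    mark_received("B")   # response to snoop 1 is consumed ...
    send_go("A", "M")    # ... and produces GO-M (1)
    send_go("B", "S")    # buffered response (2) is processed: GO-S (2)
    return state


final_state = replay_non_atomic_sequence()
print(final_state, "SWMR violated:", violates_swmr(final_state))
```

No `RuntimeError` is raised, so every GO was legal under the non-atomic reading, but one device ends in M while the other holds the line in S.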

Proposed Change

Strengthen rule 3.2.5.5 so that it also forbids simultaneous snoops to the same address at different devices, by dropping the "per device" qualifier:

3.2.5.5 Multiple Snoops to the Same Address

The host is only allowed to have one snoop pending at a time per cacheline address. The host must wait until it has received both the snoop response and all IWB data (if any) before dispatching the next snoop to that address.
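To make the difference concrete, here is a minimal sketch of host-side bookkeeping for pending snoops, keyed either per (address, device) as in the current wording or per address only as proposed. `SnoopTracker` and its methods are hypothetical names for illustration; nothing here describes how a real host is implemented.

```python
class SnoopTracker:
    """Hypothetical tracker for snoops the host has dispatched."""

    def __init__(self, per_device):
        # per_device=True  models the current wording of 3.2.5.5.
        # per_device=False models the proposed, stricter wording.
        self.per_device = per_device
        self.pending = set()

    def _key(self, addr, device):
        return (addr, device) if self.per_device else addr

    def try_dispatch(self, addr, device):
        """Record the snoop and return True if the rule allows it."""
        key = self._key(addr, device)
        if key in self.pending:
            return False
        self.pending.add(key)
        return True

    def complete(self, addr, device):
        """Call once the snoop response and all IWB data have arrived."""
        self.pending.discard(self._key(addr, device))


ADDR = 0x1000

current = SnoopTracker(per_device=True)
current_results = [current.try_dispatch(ADDR, "A"),
                   current.try_dispatch(ADDR, "B")]

proposed = SnoopTracker(per_device=False)
proposed_results = [proposed.try_dispatch(ADDR, "A"),
                    proposed.try_dispatch(ADDR, "B")]
proposed.complete(ADDR, "A")
after_completion = proposed.try_dispatch(ADDR, "B")

print(current_results, proposed_results, after_completion)
```

With the current wording both snoops are dispatched at once, which is the situation analysed above. With the proposed wording the second snoop has to wait until the first has fully completed, so the two problematic interleavings cannot arise.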