Store-exclusive instruction conflict resolution

A data processing system includes a plurality of transaction masters (4, 6, 8, 10) each with an associated local cache memory (12, 14, 16, 18) and coupled to coherent interconnect circuitry (20). Monitoring circuitry (24) within the coherent interconnect circuitry (20) maintains a state variable (flag) in respect of each of the transaction masters to monitor whether an exclusive store access state is pending for that transaction master. When a transaction master is to execute a store-exclusive instruction, a current value of the subject state variable for that transaction master is compared with a previous value of that variable stored when the exclusive store access was set up. If there is a match, then the store-exclusive instruction is allowed to proceed and the state variables of all other transaction masters for which there is a pending exclusive store access state are changed. If there is not a match, then the execution of the store-exclusive instruction is marked as failing.

TECHNICAL FIELD

This invention relates to the field of data processing systems. More particularly, this invention relates to data processing systems supporting store-exclusive program instructions.

BACKGROUND

It is known to provide data processing systems that support store-exclusive program instructions (these are sometimes referred to as Load-linked/Store-conditional or Load Exclusive/Store Exclusive instructions). Such store-exclusive instructions are normally used in combination with a load-exclusive instruction within multiprocessing systems so as to control exclusive store access to a data value (to the exclusion of other processors) for a period of time; typically, a short period of time. Examples of such load-exclusive program instructions and store-exclusive program instructions are the LDREX and STREX instructions in some of the processors designed by ARM Limited of Cambridge, England. A description of these instructions and their functionality may be found in the ARM Architecture Reference Manual.

It is known to connect different processors, each having their own local cache memory, within a multiprocessor system using interconnect circuitry that provides support for maintaining data coherency within the system. The individual processors may be arranged to access a shared memory system via the interconnect circuitry and the interconnect circuitry may monitor the content of the local cache memories of each of the processors and pass messages between these cache memories so as to maintain coherency, e.g. invalidate a copy of data held in one cache memory when that data is updated in a different cache memory or in the shared memory system.

If more than one processor seeks to use the load-exclusive and store-exclusive program instruction mechanisms to provide guaranteed exclusive access to a data value for a period of time, then the interconnect circuitry is used to communicate signals and perform processing operations to police this behaviour. When more than one processor seeks to establish exclusive access to the same data, or the same range of data, then the interconnect circuitry may be configured to arbitrate between the processors such that one of the processors is successful in performing its load-exclusive and store-exclusive operations while the other processor fails in at least its attempt to perform one of the operations.

Viewed from one aspect the present invention provides a method of managing data coherency within a data processing apparatus having a plurality of transaction masters, including a subject transaction master, said method comprising performing in respect of each of said plurality of transaction masters serving as a subject transaction master the steps of:

  • setting a subject state variable and a subject control value to match so as to indicate an exclusive store access state to subject data within a subject cache memory coupled to said subject transaction master; and
  • in response to a store-exclusive instruction for execution by said subject transaction master:
    • comparing a store address of a store data value associated with said store-exclusive instruction with addresses of data values stored within said subject cache memory to determine if said store data value is currently stored within said subject cache memory and is valid;
    • if said stored data value is not marked as valid within said subject cache memory, then marking as failed execution of said store-exclusive instruction; and if said stored data value is valid within said subject cache memory, then:
      • (i) comparing a current value of said subject state variable with said subject control value;
      • (ii) if said current value does not match said subject control value, then marking as failed execution of said store-exclusive instruction; and
      • (iii) if said current value does match said subject control value, then permitting execution of said store-exclusive instruction to pass and changing, for each other transaction master of said plurality of transaction masters using a current value of a state variable to track an exclusive store access state of said other transaction master and corresponding to said store address, one of said current value and a state variable associated with said other transaction master such that a subsequent store-exclusive instruction for execution by said other transaction master and corresponding to said exclusive store access state will not be executed with success by said other transaction master.

The present techniques recognise that the finite delays imposed by the operation of the interconnect circuitry in arbitrating conflicts between different processors, each seeking to establish its own load-exclusive and store-exclusive access to a shared data value, may expose race conditions in these arbitration mechanisms that should be addressed. In particular, it is possible for live-lock situations to arise in which LDREX-STREX sequences being executed on more than one processor conflict with each other, causing each processor's LDREX-STREX sequence to fail and retry repeatedly. For example, each of the processors is given permission by the interconnect circuitry to perform its store operation, but before this store operation can be performed, the relevant data is invalidated by a store-exclusive operation being performed on another processor, which in turn has its own store-exclusive operation invalidated by the first processor before it completes. LDREX-STREX sequences are used to enforce short duration exclusive access to data values. Live-locks in such environments arise due to combinations of software and hardware conditions. The present technique addresses these problems by effectively providing a point of serialisation associated with subject data to be accessed and managed using a subject state variable, the subject state variable and subject control value being set equal when the exclusive store access is set up. When a processor wishes to perform its store-exclusive operation it checks whether or not the subject state variable and the subject control value are still equal. If they are equal, then the store-exclusive operation is allowed to proceed and the current values of the state variables associated with any other transaction masters which are tracking an exclusive store access are changed such that when those other transaction masters subsequently check the value of their own state variable, the change will be noted and will indicate that a different transaction master has reached the point of serialisation ahead of them and that their own store-exclusive operation should fail. This avoids a live lock arising.

The step of setting the subject state variable to a subject control value could be performed in a variety of different ways. For example, it could be set when an instruction is fetched from a memory address associated with a previously encountered load-exclusive instruction/store-exclusive instruction sequence, when a counter value forming the state variable has not been sampled for greater than a predetermined number of processing cycles, or in other ways. One effective way to control the step of setting is to perform it in response to a load-exclusive instruction executed by the subject transaction master, in which the load-exclusive instruction loads a data value to the subject cache memory coupled to the subject transaction master if the data value is not already present within that cache memory.

The marking of a store-exclusive instruction as either failed or permitted to pass may be achieved by recording a fail status or a pass status. These may be recorded, for example, in a result status register associated with the instructions.

The store-exclusive instruction may have a variety of forms. It may, for example, perform other operations, such as a compare, in addition to a simple store operation. However, the store-exclusive instruction in at least some embodiments performs a standard store operation if the data value is marked as valid within the cache and the current value matches the subject control value.

In order to assist in the management of coherency, if the current value does match the subject control value, then the system may mark as invalid any data value stored in the other transaction masters which corresponds to the store address of the data in respect of which the store-exclusive access has been permitted. Conversely, if the current value does not match the subject control value, then the system does not perform any such invalidate operations, so as to avoid a potential cause of live locks.

The state variable which is used to track pending store exclusive access state can take a variety of different forms. In some embodiments the state variable may have the form of a separate state variable provided for each of a plurality of transaction masters and tracking pending exclusive store states within those transaction masters. This set of separate state variables may be stored within coherency control circuitry which is shared between the plurality of transaction masters. In this context, the steps of comparing and changing are performed by the coherency control circuitry.

The subject control value may be a simple binary flag having a predetermined set state indicative of an exclusive store access state and with which the step of changing sets the current value of the state variable for each of the other transaction masters, to a predetermined reset state. Thus, for each transaction master the state variable is placed into a set state when the transaction master operates to set up an exclusive store access state and then before a store-exclusive instruction is allowed to proceed, a check is made as to whether or not the state variable still has the set state so as to check that it has not been reset by another transaction master which has prevailed in an arbitration between the present transaction master and that other transaction master.
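
By way of illustration only, the set and reset behaviour described above may be sketched in software as follows. The class name `FlagMonitor` and its method names are hypothetical and are introduced purely for this sketch; the example models only the flags, not the cache memories or the interconnect.

```python
class FlagMonitor:
    """Toy model of per-master exclusive-store flags (hypothetical names)."""

    def __init__(self, num_masters):
        # One binary state variable per transaction master; False = reset state.
        self.flags = [False] * num_masters

    def begin_exclusive(self, master):
        # Set state: an exclusive store access is now pending for this master.
        self.flags[master] = True

    def try_store_exclusive(self, master):
        # Fail if another master has already reset this master's flag.
        if not self.flags[master]:
            return False
        # Pass: reset every flag so pending exclusive stores elsewhere will fail.
        self.flags = [False] * len(self.flags)
        return True


monitor = FlagMonitor(num_masters=2)
monitor.begin_exclusive(0)
monitor.begin_exclusive(1)
first = monitor.try_store_exclusive(1)   # reaches the monitor first
second = monitor.try_store_exclusive(0)  # finds its flag already reset
```

In this sketch the transaction master that reaches the monitor first passes and the other fails, so the two store-exclusive attempts cannot both invalidate each other.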

It is possible that in some embodiments it will be sufficient to provide a single state variable for each transaction master indicating that an exclusive store access state is pending for that transaction master. However, more fine-grained control which helps to reduce the likelihood of store-exclusive instructions being unnecessarily failed may be achieved when a plurality of separate state variables are provided for each of the transaction masters, with each of these separate state variables being associated with different address ranges of the data. Thus, two store-exclusive instructions being performed by different transaction masters will not interfere with each other if they are accessing two data values within different address ranges as different state variables may be used to separately track the potential conflicts between exclusive store access operations within those different address ranges.

In some embodiments the different address ranges may be fixed address ranges whereas in other embodiments it may be desirable to provide programmable (under hardware or software control) address ranges.

As an alternative to the set and reset form of state variables, other embodiments may use counter values, with the subject control value taking the form of the counter value and the subject state variable being a sample of the counter value to be associated with an exclusive store access state. The counter value is shared by the plurality of transaction masters and is stored within the coherency control circuitry to provide a point of serialisation using which store-exclusive instructions from different transaction masters may be arbitrated. When a transaction master is successful in executing a store-exclusive instruction, the counter value is changed such that it will no longer match previously stored samples of the count value associated with pending exclusive store access states of other transaction masters.

The subject control value associated with the subject transaction master may be stored in different locations. It is convenient if this is stored within the subject cache memory of the subject transaction master, as exclusive store operations and exclusive load operations will typically be routed through this subject cache memory.

The counter value may be provided to the subject cache memory in a variety of different ways, including as a sideband signal, as an out-of-band signal and as a data payload within an in-band signal.

The setting of the subject state variable may be triggered in a variety of different ways as previously discussed. These include the fetching by the subject transaction master of an instruction from an address associated with a previously encountered load-exclusive instruction or a store-exclusive instruction. Another alternative is the decoding by the subject transaction master of one of a load-exclusive instruction or a store-exclusive instruction. A further alternative is that the counter value has not yet been sampled for greater than a predetermined number of processing cycles.

In a similar way to that in which a plurality of set and reset state variables may be associated with different address ranges, it is also possible to provide a plurality of counters, with each of the plurality of counters being associated with a different address range. The address ranges may again be fixed address ranges or programmable (by software or hardware mechanisms) address ranges.

The present techniques are also applicable to systems utilising hierarchies of transaction masters. Exclusive store access states and store exclusive instructions may be arbitrated within a cluster of transaction masters, with that cluster forming part of a system containing one or more further transaction masters. In this case, if a store-exclusive instruction is permitted within the cluster, then a further arbitration against potentially any overlapping exclusive store access state and store-exclusive instruction of the one or more further transaction masters may be performed using the same steps as are performed within the cluster, and as previously discussed.

The additional steps of checking the current value of the subject state variable may be bypassed if the stored data value is marked as valid and is uniquely stored within the subject cache memory when the store-exclusive instruction is executed, as in this case there is no risk of the problems of overlapping and competing store-exclusive instructions as previously discussed.

Viewed from another aspect the present invention provides an apparatus for managing data coherency within a data processing apparatus having a plurality of transaction masters, including a subject transaction master, said apparatus comprising in respect of each of said plurality of transaction masters serving as a subject transaction master:

  • state setting circuitry configured to set a subject state variable and a subject control value to match so as to indicate an exclusive store access state to subject data within a subject cache memory coupled to said subject transaction master;
  • monitor circuitry configured to respond to a store-exclusive instruction for execution by said subject transaction master by:
    • comparing a store address of a store data value associated with said store-exclusive instruction with addresses of data values stored within said subject cache memory to determine if said store data value is currently stored within said subject cache memory and is valid;
    • if said stored data value is not marked as valid within said subject cache memory, then marking as failed execution of said store-exclusive instruction; and if said stored data value is valid within said subject cache memory, then:
      • (i) comparing a current value of said subject state variable with said subject control value;
      • (ii) if said current value does not match said subject control value, then marking as failed execution of said store-exclusive instruction; and
      • (iii) if said current value does match said subject control value, then permitting execution of said store-exclusive instruction and changing, for each other transaction master of said plurality of transaction masters using a current value of a state variable to track an exclusive store access state of said other transaction master and corresponding to said store address, one of said current value and a state variable associated with said other transaction master such that a subsequent store-exclusive instruction for execution by said other transaction master and corresponding to said exclusive store access state will not be executed with success by said other transaction master.

Viewed from a further aspect the present invention provides an apparatus for managing data coherency within a data processing apparatus having a plurality of transaction masters, including a subject transaction master, said apparatus comprising in respect of each of said plurality of transaction masters serving as a subject transaction master:

  • state setting means for setting a subject state variable and a subject control value to match so as to indicate an exclusive store access state to subject data within a subject cache memory coupled to said subject transaction master;
  • monitor means for responding to a store-exclusive instruction for execution by said subject transaction master by:
    • comparing a store address of a store data value associated with said store-exclusive instruction with addresses of data values stored within said subject cache memory to determine if said store data value is currently stored within said subject cache memory and is valid;
    • if said stored data value is not marked as valid within said subject cache memory, then marking as failed execution of said store-exclusive instruction; and
    • if said stored data value is valid within said subject cache memory, then:
      • (i) comparing a current value of said subject state variable with said subject control value;
      • (ii) if said current value does not match said subject control value, then marking as failed execution of said store-exclusive instruction; and
      • (iii) if said current value does match said subject control value, then permitting execution of said store-exclusive instruction and changing, for each other transaction master of said plurality of transaction masters using a current value of a state variable to track an exclusive store access state of said other transaction master and corresponding to said store address, one of said current value and a state variable associated with said other transaction master such that a subsequent store-exclusive instruction for execution by said other transaction master and corresponding to said exclusive store access state will not be executed with success by said other transaction master.

DETAILED DESCRIPTION

FIG. 1 schematically illustrates a data processing system 2 including a plurality of transaction masters 4, 6, 8, 10 each having an associated local cache memory 12, 14, 16, 18. Coherent interconnect circuitry 20 is provided to manage coherence between the data stored within the local cache memories 12, 14, 16, 18 and to communicate with a memory system 22 (e.g. subsequent levels of cache memory, a volatile main memory and non-volatile storage). The transaction masters 4, 6, 8, 10 may take the form of general purpose processor cores, such as the processor cores designed by ARM Limited of Cambridge, England, or other forms of processing device such as DSP devices, graphics processing units and the like.

In this example embodiment, each of the transaction masters 4, 6, 8, 10 has an associated local cache memory 12, 14, 16, 18 into which data values stored within the memory 22 may be cached for high speed local access. The loading and storing of data values from the memory 22 is conducted via the coherent interconnect circuitry 20. The coherent interconnect circuitry 20 manages data coherence between the local cache memories 12, 14, 16, 18. As will be appreciated by those in this technical field, multiple copies of data held within the memory 22 may be separately cached within the individual local cache memories 12, 14, 16, 18. If one of the transaction masters 4, 6, 8, 10 updates its local copy of the data held within its respective local cache memory 12, 14, 16, 18, then coherence operations are required, such as invalidating the data stored within the other cache memories or updating the data stored within the other cache memories.

The transaction masters 4, 6, 8, 10 include provision for executing store-exclusive instructions and load-exclusive instructions. This type of instruction is described in the ARM Architecture Reference Manual produced by ARM Limited of Cambridge, England. The load-exclusive instruction is an LDREX instruction and the store-exclusive instruction is an STREX instruction. The definition of these instructions, the architectural behaviour of these instructions and example pseudo-code for using these instructions are described in the ARM Architecture Reference Manual, the content of which is incorporated herein by reference.

Also illustrated in FIG. 1 is monitoring circuitry 24 including a register 26 storing flag values which serve as state variables for monitoring an exclusive access state of associated respective transaction masters 4, 6, 8, 10. This monitoring circuitry 24 is useful in providing a point of serialisation for store-exclusive instructions and thereby helping to avoid live-lock situations as will be described further below, i.e. by ensuring that store-exclusive instructions have a well defined serial order and that a store-exclusive instruction is not prevented from successfully executing by another store-exclusive instruction later in the serial order.

FIG. 2 is a flow diagram schematically illustrating the setting of a subject state variable in response to an LDREX instruction. At step 28 the process waits until an LDREX instruction is to be executed in one of the transaction masters 4, 6, 8, 10. When such an LDREX instruction is decoded, then step 30 issues signals indicating this to the coherent interconnect circuitry 20. At step 32 the monitoring circuitry 24 within the coherent interconnect circuitry 20 responds to notification that an LDREX instruction has been decoded within one of the transaction masters 4, 6, 8, 10 by setting the flag associated with that transaction master to a value of "1". This flag serves as a subject state variable for the associated transaction master and tracks the exclusive store access state of that subject transaction master 4, 6, 8, 10. Step 34 determines whether or not the data subject to the LDREX instruction is already loaded within the local cache of the transaction master 4, 6, 8, 10 that is executing that LDREX instruction. If the data is already loaded within the local cache 12, 14, 16, 18 of the transaction master 4, 6, 8, 10 that is executing that LDREX instruction, then step 36 returns this data to the transaction master 4, 6, 8, 10 from the local cache 12, 14, 16, 18 concerned. If the data is not already stored within the local cache 12, 14, 16, 18 of the transaction master 4, 6, 8, 10 that is executing the LDREX instruction, then step 38 serves to fetch the data from the memory 22 and store the data into the local cache 12, 14, 16, 18 concerned as well as returning the data to the transaction master 4, 6, 8, 10 that is executing the LDREX instruction.
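
Purely as an illustrative sketch, the flow of FIG. 2 may be modelled as follows. The function name `load_exclusive` and the use of Python dictionaries to stand in for the local caches and the memory 22 are assumptions made for this example only.

```python
def load_exclusive(master, address, flags, caches, memory):
    """Hypothetical software model of the FIG. 2 flow for one LDREX."""
    # Step 32: the monitoring circuitry sets the flag for the issuing master.
    flags[master] = True
    cache = caches[master]
    # Step 34: is the data already held in the local cache?
    if address in cache:
        # Step 36: return the locally cached copy.
        return cache[address]
    # Step 38: fetch from memory, fill the local cache and return the data.
    cache[address] = memory[address]
    return cache[address]


memory = {0x100: 7}
caches = [{}, {0x100: 7}]   # master 0 misses, master 1 hits
flags = [False, False]
value_miss = load_exclusive(0, 0x100, flags, caches, memory)
value_hit = load_exclusive(1, 0x100, flags, caches, memory)
```

Both calls return the same data value and leave the flag of the calling transaction master set; the miss case additionally fills the local cache.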

In other embodiments a different flow could be followed in which the local cache is first checked to see if the data is present and in response to this one of two transactions sent to the coherent interconnect: (1) cache hit→send "set flag" message and return no data; (2) cache miss→send "read and set flag" message returning required data.

It will be appreciated that in addition to the operations illustrated in FIG. 2, further steps will be taken to monitor the status of the data loaded within the local cache 12, 14, 16, 18 and the coherence between local caches 12, 14, 16, 18. These techniques may include ones tracking the validity of the data within the local caches 12, 14, 16, 18, the shared or exclusive status of data within the local caches 12, 14, 16, 18, the modified ("dirty") status of the data within the local caches 12, 14, 16, 18 and the like.

FIG. 3 is a flow diagram schematically illustrating the testing and resetting of a state variable tracking exclusive access status. At step 40 processing waits until an STREX instruction is to be executed by one of the transaction masters 4, 6, 8, 10. Step 42 then determines whether or not the data having an address within the memory 22 corresponding to the STREX is present and valid within the local cache 12, 14, 16, 18 of the transaction master 4, 6, 8, 10 in which that STREX instruction has been decoded. If the data concerned is not present and valid within that local cache 12, 14, 16, 18, then processing proceeds to step 44 where the STREX instruction is marked as failing by returning a result value indicative of this fail status within a result register (e.g. a general purpose register within a processor core) associated with the STREX instruction.

If the test at step 42 indicates that the data is present and valid within the local cache 12, 14, 16, 18 of the transaction master 4, 6, 8, 10, then processing proceeds to step 46 where a determination is made as to whether or not that data is marked as being unique, i.e. only stored within that local cache 12, 14, 16, 18. If the data is marked as unique, then there is no coherency issue to be managed and processing can proceed to step 48 where the data is stored into the local cache 12, 14, 16, 18 of the transaction master 4, 6, 8, 10 overwriting whatever value was previously stored for that data. In some embodiments a message may also be sent to the coherent interconnect to clear any flag associated with this transaction master and tracking an exclusive store access state. Processing then proceeds to step 50 where the STREX instruction is marked as passing by returning a value indicative of this pass in the result register associated with the STREX instruction as discussed in connection with step 44.

If the determination at step 46 is that the data is not marked as unique, then processing proceeds to step 52 where a signal indicative of the decoding of the STREX instruction at step 40 is passed to the coherent interconnect circuitry 20, and more particularly to the monitoring circuitry 24. Step 54 then determines whether or not the flag within the register 26 corresponding to the transaction master 4, 6, 8, 10 in which the STREX instruction is to be executed is set, i.e. has a value of "1". If this flag is still set, then it indicates that another transaction master 4, 6, 8, 10 has not reset this flag due to that other transaction master 4, 6, 8, 10 at least partially executing its own STREX instruction. If the flag is not set, then the STREX instruction is later than another STREX instruction, which has prevailed in any arbitration (e.g. managed to reset the flags of the other transaction masters 4, 6, 8, 10 first) and accordingly processing proceeds to step 44 where the present STREX instruction is failed.

If the determination at step 54 is that the flag for the subject transaction master 4, 6, 8, 10 is still set (i.e. a match), then step 56 serves to reset this flag as well as resetting the flags of all of the other transaction masters 4, 6, 8, 10. Resetting of the flags of all the other transaction masters 4, 6, 8, 10 will prevent those other transaction masters 4, 6, 8, 10 from successfully executing an STREX instruction if a store exclusive operation is currently pending within them. At step 58 an indication is returned to the subject transaction master 4, 6, 8, 10 that the flag for that subject transaction master is still set. Step 60 then stores the data which is subject to the STREX instruction into the local cache 12, 14, 16, 18 of the subject transaction master 4, 6, 8, 10. Step 62 triggers an invalidation operation of any old copies of the data which has just been stored into the local cache 12, 14, 16, 18 to take place within the other local cache memories 12, 14, 16, 18 of other transaction masters 4, 6, 8, 10 that may be storing corresponding copies. Step 50 then marks the STREX instruction as passing as previously discussed. If the determination at step 54 was that the flag was not set (i.e. no match), then processing proceeds via step 44 to the end and no invalidation of data in other caches 12, 14, 16, 18 is performed.
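
The decision sequence of FIG. 3 may be sketched, by way of example only, as follows. The `Line` record, the `store_exclusive` function and the dictionary-based caches are hypothetical constructs for this sketch; a hardware implementation would realise these steps in the cache and monitoring circuitry.

```python
from dataclasses import dataclass


@dataclass
class Line:
    """Hypothetical cache line with validity and uniqueness markers."""
    value: int
    valid: bool = True
    unique: bool = False


def store_exclusive(master, address, value, flags, caches):
    """Return True (pass) or False (fail), following steps 42 to 62."""
    line = caches[master].get(address)
    # Steps 42/44: fail if the data is absent or invalid in the local cache.
    if line is None or not line.valid:
        return False
    # Steps 46/48: a unique line needs no arbitration; store locally.
    if line.unique:
        line.value = value
        flags[master] = False  # optional flag clear mentioned in the text
        return True
    # Step 54: fail, without invalidating anything, if the flag was reset.
    if not flags[master]:
        return False
    # Step 56: reset this flag and the flags of all other masters.
    for m in range(len(flags)):
        flags[m] = False
    # Step 60: perform the store into the local cache.
    line.value = value
    # Step 62: invalidate copies held in the other local caches.
    for m, cache in enumerate(caches):
        if m != master and address in cache:
            cache[address].valid = False
    return True


caches = [{0x40: Line(1)}, {0x40: Line(1)}]
flags = [True, True]  # both masters have set up an exclusive store access
result_a = store_exclusive(0, 0x40, 5, flags, caches)
result_b = store_exclusive(1, 0x40, 9, flags, caches)
```

In this sketch the first store-exclusive passes and invalidates the other copy, while the second fails without performing any invalidation of its own.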

FIG. 4 schematically illustrates a variation of the monitoring circuitry 24, but in this case storing multiple state variables for each of the transaction masters 4, 6, 8, 10. In this example the monitoring circuitry 24 stores four flags 64, 66, 68, 70, one for each transaction master 4, 6, 8, 10. Each of these flags 64, 66, 68, 70 has an associated register 72, 74, 76, 78 storing data defining an associated range of address values for which the flag concerned monitors exclusive store access. There may also be no-address flags indicating that the flags 64, 66, 68, 70 are associated with the full memory address range. As illustrated in FIG. 4, when an STREX instruction is decoded, address matching circuitry 80 serves to determine which of the flags 64, 66, 68, 70 is associated with the address range within which the address of the STREX instruction falls. When this flag has been identified, then comparison circuitry 82 determines whether or not that flag value is still set and initiates a pass/fail response. If there is a pass response, then reset circuitry 84 resets all of the corresponding flags for other transaction masters which at least partially overlap with the address range for which the flag has been tested. This will have the result that when an STREX is later attempted for those other transaction masters, this will not execute with success, i.e. at least a result value indicating an execution fail will be returned in the result register for that STREX instruction.
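
A minimal sketch of address-range based flags is given below for illustration. It assumes, for simplicity, a single set of non-overlapping address ranges shared by all transaction masters; the class `RangeMonitor` and its methods are hypothetical names and do not correspond to any particular circuit element.

```python
class RangeMonitor:
    """Hypothetical per-master flags, each tied to an address range."""

    def __init__(self, num_masters, ranges):
        # ranges: list of (start, end) pairs, end exclusive, shared by all masters.
        self.ranges = ranges
        self.flags = [[False] * len(ranges) for _ in range(num_masters)]

    def _range_index(self, address):
        # Address matching: find the range containing the address.
        for i, (start, end) in enumerate(self.ranges):
            if start <= address < end:
                return i
        raise ValueError("address not covered by any range")

    def begin_exclusive(self, master, address):
        self.flags[master][self._range_index(address)] = True

    def try_store_exclusive(self, master, address):
        i = self._range_index(address)
        # Comparison: is this master's flag for the range still set?
        if not self.flags[master][i]:
            return False
        # Reset: clear the flag for this range in every master.
        for per_master in self.flags:
            per_master[i] = False
        return True


monitor = RangeMonitor(3, [(0x0000, 0x1000), (0x1000, 0x2000)])
monitor.begin_exclusive(0, 0x0010)
monitor.begin_exclusive(1, 0x1010)
monitor.begin_exclusive(2, 0x0020)
pass_0 = monitor.try_store_exclusive(0, 0x0010)
pass_1 = monitor.try_store_exclusive(1, 0x1010)  # different range, unaffected
pass_2 = monitor.try_store_exclusive(2, 0x0020)  # same range as master 0
```

Here the store-exclusive of master 1 is unaffected by that of master 0 because the two addresses fall within different ranges, whereas master 2 fails because it shares a range with master 0.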

It will be appreciated that the address ranges used by each of the transaction masters could be the same or could be different. In the example illustrated the address ranges are shown as separately defined for each transaction master, but in practice one set of programmable address ranges may be suitable for use by all of the transaction masters, and this would simplify implementation and operation.

FIG. 5 schematically illustrates a second example embodiment. In this example embodiment a plurality of transaction masters 86, 88, 90, 92 are again provided with local cache memories 94, 96, 98, 100. Coherency interconnect circuitry 102 containing monitoring circuitry 104 manages coherence among the local cache memories 94, 96, 98, 100. The coherent interconnect circuitry 102 also manages access to a main memory 106.

Compared to the embodiment of FIG. 1, in this example embodiment the monitoring circuitry 104 includes a counter 108 storing a count value which is incremented when a trigger event occurs. This trigger event may be the success of a STREX-related transaction at the coherent interconnect circuitry 102. A further example of a trigger event is that the counter value has not been sampled for greater than a predetermined number of processing cycles, i.e. the counter value is periodically sampled. The counter is described above as changing by incrementing, but it will be appreciated that the counter could equally change by decrementing or by changing its value in some other way.

A counter store 110, 112, 114, 116 is associated with each of the transaction masters 86, 88, 90, 92. This counter store 110, 112, 114, 116 serves to retrieve a copy of the current value of the counter 108 from the monitoring circuitry 104 whenever it is desired to set the subject state variable of the transaction master 86, 88, 90, 92 concerned so as to monitor an exclusive store state for that transaction master 86, 88, 90, 92. These counter values can be transmitted in a number of ways, such as a sideband signal, an out-of-band signal upon the normal communication channel or a data payload within an in-band signal on the normal communication channel. Other ways of communicating this counter value are also possible.

A comparator 118 is provided within the monitoring circuitry 104 and serves to compare a count value (subject state variable) stored within one of the counter stores 110, 112, 114, 116 of a transaction master 86, 88, 90, 92 attempting to execute an STREX instruction with a current value of the counter 108 (the current value of the subject control value). This provides a point of serialisation control on parallel store-exclusive states within different transaction masters 86, 88, 90, 92 as described below.

The flow of operation may be as follows:

  • STREX being executed at TMx
  • Check in cache to see if data valid → if not, fail
  • Check to see if data unique → if unique, complete internally
  • If not unique, send transaction to coherent interconnect including locally held copy of previously sampled counter value
  • At monitoring circuitry compare the received count value with the current count value
  • If equal, pass STREX, invalidate other copies held in other caches and increment counter 108
  • If not equal, fail STREX as another STREX has already passed, do not invalidate other copies and do not increment counter 108.
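The comparison step at the monitoring circuitry in the flow above can be sketched in Python as follows. This is a minimal behavioural model under stated assumptions: the class name CounterMonitor and its methods are hypothetical, the local cache checks (valid, unique) are left out, and invalidation is modelled simply as returning the list of other masters whose copies would be invalidated.

```python
class CounterMonitor:
    """Hypothetical model of monitoring circuitry holding a single counter."""

    def __init__(self, masters):
        self.masters = list(masters)  # ids of the attached transaction masters
        self.counter = 0              # models the counter in the monitor

    def sample(self) -> int:
        # A master copies the current count into its local counter store.
        return self.counter

    def store_exclusive(self, master, sampled: int):
        """Return (passed, masters_to_invalidate) for one STREX attempt."""
        if sampled != self.counter:
            # Another STREX has already passed since the sample was taken:
            # fail, do not invalidate, do not change the counter.
            return False, []
        # Counts match: pass, advance the counter, invalidate other copies.
        self.counter += 1
        return True, [m for m in self.masters if m != master]
```

Two masters that sampled the same count cannot both pass: the first successful store-exclusive changes the counter, so the second comparison fails until that master samples the counter again.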

FIG. 6 is a flow diagram schematically illustrating the loading of a counter value into one of the counter stores 110, 112, 114, 116. Step 126 waits until a counter load trigger event occurs. This trigger event may be the fetching by one of the transaction masters 86, 88, 90, 92 of an instruction from an address that was previously identified as containing one of a load-exclusive instruction or a store-exclusive instruction. Another type of trigger event may be the execution by one of the transaction masters 86, 88, 90, 92 of a load-exclusive instruction. When such a trigger event occurs, step 130 loads the counter value into the counter store 110, 112, 114, 116 of the associated transaction master 86, 88, 90, 92. If the count value is incremented on a regular basis, then the incremented counter value may be loaded into all of the counter stores 110, 112, 114, 116. The locally stored counter value serves as the subject control value for each of the transaction masters 86, 88, 90, 92. The counter store 110, 112, 114, 116 may form part of the local cache memory 94, 96, 98, 100, as such transactions will be routed through these cache memories.

FIG. 7 illustrates the behaviour of the embodiment of FIG. 5 when an STREX instruction is decoded within one of the transaction masters 86, 88, 90, 92. At step 132 processing waits until an STREX instruction is decoded. Step 134 determines whether or not the data concerned is present and valid within the local cache memory 94, 96, 98, 100 of the transaction master 86, 88, 90, 92 which decoded the STREX instruction. If the data is not present and valid, then processing proceeds to step 136 where the STREX instruction is marked as failing as previously described. If the data is present and valid, then processing proceeds to step 138 where a determination is made as to whether or not that data is marked as unique within the local cache memory 94, 96, 98, 100 concerned. If the data is marked as unique, then step 140 serves to store the data of the STREX instruction into the local cache memory 94, 96, 98, 100 and processing proceeds to step 142 where the STREX instruction is marked as passing.

If the determination at step 138 is that the data concerned is not marked as unique, then step 144 serves to issue signals indicating the decoding of the STREX instruction to the coherent interconnect circuitry 102 together with a copy of the previously stored count value associated with the transaction master TMy. Step 148 within the monitoring circuitry 104 determines whether or not the count value received from the transaction master TMy which decoded the STREX instruction matches the current count value of the counter 108. If there is not a match, then processing proceeds to step 136 and the STREX instruction is marked as failing. The non-matching of the count values indicates that another transaction master has previously succeeded in executing its own STREX instruction and has incremented the counter value so that it no longer matches the counter value which is locally stored by the transaction master TMy.

If the received count value equals the current count value as determined at step 148, then processing proceeds to step 150 where the data value is stored into the local cache memory 94, 96, 98, 100. Step 152 then triggers invalidation of any old copies of that data stored in other local caches of other transaction masters 86, 88, 90, 92 as well as incrementing the count value stored within the counter 108 of the monitoring circuitry 104. Step 142 marks the STREX instruction as passing. Steps 150 and 152 may be reversed in order in some embodiments.

The embodiment of FIGS. 5, 6 and 7 uses a counter 108 that will have a finite maximum count value and may then wrap back to zero. This causes a potential problem: a transaction master may still be holding a subject state variable with a value of zero even though other STREX instructions have succeeded and the subject control value has changed through its maximum range since that sample of the counter was taken. This may cause the transaction master holding the old sampled value to succeed due to the counter wrap when it should fail. (The old value and the new post-wrap value need not be zero.) This could cause a live-lock or other erroneous operation with, for example, the transaction master holding the old sample overlapping with a further transaction that caused the wrap and issuing invalidates to the master that issued the further transaction. This effect could continue, with the transaction master that issued the further transaction serving to incorrectly invalidate another transaction, so that a cycle of erroneous invalidates causes a live-lock. One way of addressing this problem would be the provision of a mechanism to send a message to all transaction masters to indicate that any locally held copy of the counter should be invalidated or resampled. This could be triggered on a counter wrap. Another possibility is to provide the counter with strictly more states than the number of transaction masters connecting to the point of serialisation (the monitoring circuitry); the "domino effect" will then reach an end and a cycle of erroneous invalidates will be avoided. After the domino chain has finished, the next STREX to succeed will be using a correctly sampled value of the current counter value, and thus forward progress will be made and perpetual live-lock avoided.
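The wrap hazard can be made concrete with a small Python sketch. It is illustrative only: WrappingCounter and stale_sample_passes are hypothetical names, and the counter is assumed to advance by one modulo its number of states on each successful store-exclusive.

```python
class WrappingCounter:
    """Hypothetical counter with a fixed number of states that wraps to zero."""

    def __init__(self, states: int):
        self.states = states
        self.value = 0

    def sample(self) -> int:
        return self.value

    def try_store(self, sampled: int) -> bool:
        # Pass only when the sampled count equals the current count.
        if sampled != self.value:
            return False
        self.value = (self.value + 1) % self.states
        return True


def stale_sample_passes(states: int, intervening: int) -> bool:
    """Report whether a sample still matches after `intervening`
    successful store-exclusives by other masters."""
    counter = WrappingCounter(states)
    stale = counter.sample()
    for _ in range(intervening):
        # Each of these uses a fresh sample, so each one passes.
        counter.try_store(counter.sample())
    return counter.try_store(stale)
```

With a four-state counter, a sample that has been overtaken by one, two or three successful stores is correctly rejected, but after exactly four the counter has wrapped back to the sampled value and the stale store-exclusive passes when it should fail. This is the case that a resample message on counter wrap is intended to remove.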

FIG. 8 schematically illustrates a further example embodiment. In this embodiment a cluster of transaction masters 154 has coherent interconnect circuitry 156 including monitoring circuitry 158. Arbitration and serialisation between STREX instructions of the cluster 154 is performed within that cluster using the monitoring circuitry 158 of the coherent interconnect circuitry 156 and the techniques previously described. If an STREX instruction is passed within the cluster 154, then further arbitration is performed against further transaction masters 160, 162 which are connected to further coherent interconnect circuitry 164 containing further monitoring circuitry 166. Thus, there is a hierarchy of arbitration, and arbitration can be performed both within a cluster and between clusters at higher levels. The monitoring circuitry 158 and the further monitoring circuitry 166 can both utilise either the flags or the counter mechanisms previously described.

The embodiment of FIG. 5 is shown with a single counter value. It would also be possible to provide multiple counter values, each associated with a different range of addresses; these ranges of addresses could be fixed or programmable. It is also possible to provide flags with no address range associated therewith. Such flags could be used as a default if all address-capable resources are already in use.

SRC=https://www.google.com.hk/patents/US20140052921

Original article: https://www.cnblogs.com/coryxie/p/3892012.html