Intel Software Developer's Manual

…or page tables) in which two or more processors may try simultaneously to modify the same field or flag.

The processor uses three interdependent mechanisms for carrying out locked atomic operations:

• Guaranteed atomic operations.
• Bus locking, using the LOCK# signal and the LOCK instruction prefix.
• Cache coherency protocols that ensure that atomic operations can be carried out on cached data structures (cache lock). This mechanism is present in the P6 family processors.

These mechanisms are interdependent in the following ways. Certain basic memory transactions (such as reading or writing a byte in system memory) are always guaranteed to be handled atomically. That is, once started, the processor guarantees that the operation will be completed before another processor or bus agent is allowed access to the memory location. The processor also supports bus locking for performing selected memory operations (such as a read-modify-write operation in a shared area of memory) that typically need to be handled atomically, but are not automatically handled this way. Because frequently used memory locations are often cached in a processor's L1 or L2 caches, atomic operations can often be carried out inside a processor's caches without asserting the bus lock. Here the processor's cache coherency protocols ensure that other processors caching the same memory locations are managed properly while atomic operations are performed on cached memory locations.

Note that the mechanisms for handling locked atomic operations have evolved along with the complexity of Intel Architecture processors. As such, more recent Intel Architecture processors (such as the P6 family processors) provide a more refined locking mechanism than earlier Intel Architecture processors, as described in the following sections.

7.1.1. Guaranteed Atomic Operations

The Intel386™, Intel486™, Pentium®, and P6 family processors guarantee that the following basic memory operations will always be carried out atomically:

• Reading or writing a byte.
• Reading or writing a word aligned on a 16-bit boundary.
• Reading or writing a doubleword aligned on a 32-bit boundary.

The P6 family processors guarantee that the following additional memory operations will always be carried out atomically:

• Reading or writing a quadword aligned on a 64-bit boundary. (This operation is also guaranteed on the Pentium® processor.)
• 16-bit accesses to uncached memory locations that fit within a 32-bit data bus.
• 16-, 32-, and 64-bit accesses to cached memory that fit within a 32-byte cache line.

Accesses to cacheable memory that are split across bus widths, cache lines, and page boundaries are not guaranteed to be atomic by the Intel486™, Pentium®, or P6 family processors. The P6 family processors provide bus control signals that permit external memory subsystems to make split accesses atomic; however, nonaligned data accesses will seriously impact the performance of the processor and should be avoided where possible.

7.1.2. Bus Locking

Intel Architecture processors provide a LOCK# signal that is asserted automatically during certain critical memory operations to lock the system bus. While this output signal is asserted, requests from other processors or bus agents for control of the bus are blocked. Software can specify other occasions when the LOCK semantics are to be followed by prepending the LOCK prefix to an instruction.
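As a concrete illustration of the LOCK prefix (a minimal sketch, not taken from the manual; the helper name atomic_increment and the GCC-style inline assembly are assumptions for this example), a read-modify-write instruction such as INC can be made atomic across processors by prepending LOCK:

    #include <stdint.h>

    /* Hypothetical helper: atomically increments a shared counter.
     * With the LOCK prefix the processor either asserts LOCK# or, on
     * P6-family parts when the target line is cached, performs a cache
     * lock, so the read-modify-write of *counter cannot be interleaved
     * with another processor's access. */
    static inline void atomic_increment(volatile uint32_t *counter)
    {
        __asm__ __volatile__(
            "lock incl %0"        /* LOCK prefix + INC = atomic RMW        */
            : "+m" (*counter)     /* the counter is both read and written  */
            :                     /* no other inputs                       */
            : "memory", "cc");    /* compiler barrier; flags are clobbered */
    }

Compilers expose the same behavior through built-ins such as GCC's __sync_fetch_and_add, which on x86 is implemented with a LOCK-prefixed instruction.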
In the case of the Intel386™, Intel486™, and Pentium® processors, explicitly locked instructions will result in the assertion of the LOCK# signal. It is the responsibility of the hardware designer to make the LOCK# signal available in system hardware to control memory accesses among processors.

For the P6 family processors, if the memory area being accessed is cached internally in the processor, the LOCK# signal is generally not asserted; instead, locking is applied only to the processor's caches (refer to Section 7.1.4., "Effects of a LOCK Operation on Internal Processor Caches").

7.1.2.1. AUTOMATIC LOCKING

The operations on which the processor automatically follows the LOCK semantics are as follows:

• When executing an XCHG instruction that references memory (see the sketch after this list).
• When setting the B (busy) flag of a TSS descriptor. The processor tests and sets the busy flag in the type field of the TSS descriptor when switching to a task. To ensure that two processors do not switch to the same task simultaneously, the processor follows the LOCK semantics while testing and setting this flag.
• When updating segment descriptors. When loading a segment descriptor, the processor will set the accessed flag in the segment descriptor if the flag is clear. During this operation, the processor follows the LOCK semantics so that the descriptor will not be modified by another processor while it is being updated. For this action to be effective, operating-system procedures that update descriptors should...
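Because XCHG with a memory operand follows the LOCK semantics without an explicit LOCK prefix, it is the classic building block for a test-and-set spinlock. The sketch below is illustrative only (the names spinlock_t, spin_lock, and spin_unlock are not from the manual) and assumes GCC-style inline assembly on x86:

    #include <stdint.h>

    /* Hypothetical test-and-set spinlock built on XCHG. */
    typedef volatile uint32_t spinlock_t;   /* 0 = free, 1 = held */

    static inline void spin_lock(spinlock_t *lock)
    {
        uint32_t old;
        do {
            old = 1;
            /* Atomically swap 'old' with *lock; the implicit bus or
             * cache lock keeps another processor from interleaving its
             * own read-modify-write of the flag. */
            __asm__ __volatile__(
                "xchgl %0, %1"
                : "+r" (old), "+m" (*lock)
                :
                : "memory");
        } while (old != 0);   /* keep spinning until we observed it free */
    }

    static inline void spin_unlock(spinlock_t *lock)
    {
        *lock = 0;   /* an aligned 32-bit store is itself guaranteed atomic */
    }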