ports MCE
  THEN
    IF CPU supports MCA
      THEN
        call errorlogging routine; (* returns restartability *)
    FI;
  ELSE (* Pentium(R) processor compatible *)
    READ P5_MC_ADDR
    READ P5_MC_TYPE;
    report RESTARTABILITY to console;
FI;
IF error is not restartable
  THEN
    report RESTARTABILITY to console;
    abort system;
FI;
CLEAR MCIP flag in MCG_STATUS;

13.7.2. Pentium® Processor Machine-Check Exception Handling
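As a rough illustration of the portable handler this section describes, the dispatch logic of Example 13-2 might be sketched in Python as follows. This is a simulation under stated assumptions, not handler code: the MSR file is a plain dictionary, the function and exception names are hypothetical, the CPUID feature-bit positions and MSR addresses are taken from the architectural definitions rather than from this excerpt, and the Pentium® path is assumed to be non-restartable because the text says the handler normally reports the register values and then aborts.

```python
# Hypothetical simulation of the Example 13-2 dispatch; all names are illustrative.
CPUID_EDX_MCE = 1 << 7     # CPUID feature flag: machine-check exception (assumed bit position)
CPUID_EDX_MCA = 1 << 14    # CPUID feature flag: machine-check architecture (assumed bit position)
P5_MC_ADDR = 0x0           # Pentium machine-check address MSR (assumed address)
P5_MC_TYPE = 0x1           # Pentium machine-check type MSR (assumed address)
MCG_STATUS = 0x17A         # global machine-check status MSR (assumed address)
MCG_STATUS_MCIP = 1 << 2   # machine check in progress


class SystemAbort(Exception):
    """Stand-in for the 'abort system' step of the pseudocode."""


def handle_machine_check(cpuid_edx, msrs, log_errors, report):
    """Dispatch on processor capability, then abort or clear MCIP.

    msrs       -- dict mapping MSR address to value (stand-in for RDMSR/WRMSR)
    log_errors -- callable(msrs) returning True when execution is restartable
    report     -- callable(str) used in place of the system console
    """
    if not cpuid_edx & CPUID_EDX_MCE:
        return  # no machine-check support; nothing to handle
    if cpuid_edx & CPUID_EDX_MCA:
        restartable = log_errors(msrs)
    else:
        # Pentium-compatible path: read and report the two P5 registers.
        addr = msrs[P5_MC_ADDR]
        mc_type = msrs[P5_MC_TYPE]
        report(f"P5_MC_ADDR={addr:#x} P5_MC_TYPE={mc_type:#x}")
        restartable = False  # assumption: this path always ends in an abort
    if not restartable:
        report("machine check: execution is not restartable")
        raise SystemAbort()
    msrs[MCG_STATUS] &= ~MCG_STATUS_MCIP  # CLEAR MCIP flag in MCG_STATUS
```

Passing the logging routine and the console as parameters keeps the dispatch testable without privileged register access; a real handler would run in the kernel and use RDMSR and WRMSR directly.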
To make the machine-check exception handler portable across the Pentium® and P6 family processors, the handler can use the CPUID instruction to determine the processor type and then handle the exception in the manner specific to that processor.

When machine-check exceptions are enabled for the Pentium® processor (the MCE flag is set in control register CR4), the machine-check exception handler uses the RDMSR instruction to read the error type from the P5_MC_TYPE register and the machine-check address from the P5_MC_ADDR register. The handler then normally reports these register values to the system console before aborting execution (refer to Example 13-2).

13.7.3. Logging Correctable Machine-Check Errors
If a machine-check error is correctable, the processor does not generate a machine-check exception for it. To detect correctable machine-check errors, a utility program must be written that reads each of the machine-check error-reporting register banks and logs the results in an accounting file or data structure. This utility can be implemented in either of the following ways:

• A system daemon that polls the register banks on an infrequent basis, such as hourly or daily.
• A user-initiated application that polls the register banks and records the exceptions. Here, the actual polling service is provided by an operating-system driver or through the system call interface.

Example 13-3 gives pseudocode for an error logging utility.
Example 13-3. Machine-Check Error Logging Pseudocode

Assume that execution is restartable;
IF the processor supports MCA
  THEN
    FOR each bank of machine-check registers
      DO
        READ MCi_STATUS;
        IF VAL flag in MCi_STATUS = 1
          THEN
            IF ADDRV flag in MCi_STATUS = 1
              THEN READ MCi_ADDR;
            FI;
            IF MISCV flag in MCi_STATUS = 1
              THEN READ MCi_MISC;
            FI;
            IF MCIP flag in MCG_STATUS = 1
              (* Machine-check exception is in progress *)
              AND PCC flag in MCi_STATUS = 1
              AND RIPV flag in MCG_STATUS = 0
              (* execution is not restartable *)
              THEN
                RESTARTABILITY = FALSE;
                return RESTARTABILITY to calling procedure;
            FI;
            Save time-stamp counter and processor ID;
            Set MCi_STATUS to all 0s;
            Execute serializing instruction (e.g., CPUID);
        FI;
      OD;
FI;

If the processor supports the machine-check architecture, the utility reads through the banks of error-reporting registers looking for valid register entries, and then saves the values of the MCi_STATUS, MCi_ADDR, MCi_MISC and MCG_STATUS registers for each bank that is valid. The routine minimizes processing time by recording the raw data into a system data structure or file, reducing the overhead associated with polling. User utilities analyze the collected data in an off-line environment.

When the MCIP flag is set in the MCG_STATUS register, a machine-check exception is in progress and the machine-check exception handler has called the exception logging routine. Once the logging process has been completed, the exception-handling routine must determine whether execution can be restarted. Restarting is usually possible when damage has not occurred (the PCC flag is clear in the MCi_STATUS register) and when the processor can guarantee that execution is restartable (the RIPV flag is set in the MCG_STATUS register). If execution cannot be restarted, the system is not recoverable, and the exception-handling routine should signal the console appropriately before returning the error status to the operating-system kernel for subsequent shutdown.
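The logging routine of Example 13-3 can be sketched in Python against a simulated register file. Everything here beyond the pseudocode is an assumption made for illustration: the MSR addresses and flag bit positions follow the architectural layout and are not given in this excerpt, the function name and record fields are hypothetical, a monotonic clock stands in for the time-stamp counter, and the raw record is appended before the restartability test so that a fatal error is also captured.

```python
import time

# Assumed architectural MSR addresses; bank i uses STATUS, ADDR, MISC at consecutive addresses.
MCG_CAP, MCG_STATUS = 0x179, 0x17A
MC0_STATUS = 0x401
BANK_STRIDE = 4

# Assumed flag positions.
RIPV, MCIP = 1 << 0, 1 << 2                                  # in MCG_STATUS
VAL, MISCV, ADDRV, PCC = 1 << 63, 1 << 59, 1 << 58, 1 << 57  # in MCi_STATUS


def log_machine_check_banks(msrs, cpu_id, records, now=time.monotonic_ns):
    """Scan every bank, append raw records, and return restartability.

    msrs    -- dict mapping MSR address to value (stand-in for RDMSR/WRMSR)
    cpu_id  -- identifier of the processor being polled (e.g. its APIC ID)
    records -- list that receives one dict per valid bank
    now     -- clock used in place of the time-stamp counter
    """
    mcg_status = msrs[MCG_STATUS]
    bank_count = msrs[MCG_CAP] & 0xFF  # low byte of MCG_CAP holds the bank count
    for bank in range(bank_count):
        status_addr = MC0_STATUS + BANK_STRIDE * bank
        status = msrs[status_addr]
        if not status & VAL:
            continue  # nothing logged in this bank
        records.append({
            "bank": bank,
            "cpu": cpu_id,
            "time": now(),
            "mcg_status": mcg_status,
            "status": status,
            "addr": msrs[status_addr + 1] if status & ADDRV else None,
            "misc": msrs[status_addr + 2] if status & MISCV else None,
        })
        if (mcg_status & MCIP) and (status & PCC) and not (mcg_status & RIPV):
            return False  # not restartable: return without clearing the bank
        msrs[status_addr] = 0  # Set MCi_STATUS to all 0s
        # A real routine would execute a serializing instruction here.
    return True
```

The routine only stores raw values, which matches the text's point that analysis is deferred to off-line utilities so that polling stays cheap.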
The machine-check architecture allows buffering of exceptions from a given error-reporting bank, although the P6 family processors do not implement this feature. The error logging routine should provide compatibility with future processors by reading each hardware error-reporting bank’s MCi_STATUS register and then writing 0s to clear the OVER and VAL flags in this register. The error logging utility should then re-read the MCi_STATUS register for the bank to ensure that the valid bit is clear. If another error is buffered, the processor writes it into the register bank and sets the VAL flag.

Additional information that should be stored by the exception-logging routine includes the processor’s time-stamp counter value, which provides a mechanism to indicate the frequency of exceptions. A multiprocessing operating system stores the identity of the processor node incurring the exception using a unique identifier, such as the processor’s APIC ID (refer to Section 7.5.9., “Interrupt Destination and APIC ID”).

The basic algorithm given in Example 13-3 can be modified to provide more robust recovery techniques. For example, software has the flexibility to attempt recovery usin...
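The clear-and-re-read step described above can be sketched as a small drain loop. This is one possible reading of the text, not a documented procedure: the function name, the round limit, and the `BufferedBank` class (which simulates a processor that buffers errors) are all hypothetical, and the VAL and OVER bit positions are assumed from the architectural layout.

```python
VAL, OVER = 1 << 63, 1 << 62  # assumed MCi_STATUS flag positions


def drain_bank(read_msr, write_msr, status_addr, max_rounds=4):
    """Log one bank's status, write 0s, and re-read until VAL is clear.

    read_msr/write_msr are stand-ins for RDMSR/WRMSR. max_rounds bounds the
    loop in case errors arrive faster than they can be drained.
    """
    entries = []
    for _ in range(max_rounds):
        status = read_msr(status_addr)
        if not status & VAL:
            break  # valid bit is clear: the bank is empty
        entries.append({"status": status, "overflow": bool(status & OVER)})
        write_msr(status_addr, 0)  # clears the OVER and VAL flags
    return entries


class BufferedBank:
    """Simulated bank that presents the next buffered error once cleared."""

    def __init__(self, pending):
        self.pending = list(pending)
        self.current = self.pending.pop(0) if self.pending else 0

    def read(self, addr):
        return self.current

    def write(self, addr, value):
        # Writing 0s lets the simulated processor load the next buffered error.
        if value == 0 and self.pending:
            self.current = self.pending.pop(0)
        else:
            self.current = value


bank = BufferedBank([VAL | 0x1, VAL | OVER | 0x2])
logged = drain_bank(bank.read, bank.write, 0x401)
```

On a processor without buffering, such as the P6 family according to the text, the loop simply ends on the first re-read.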
This note was uploaded on 06/07/2013 for the course ECE 1234 taught by Professor Kwhon during the Spring '10 term at Berkeley.