Other Information

... it is asserted, the ECC syndrome field will not be overwritten.

Bit 46: The correctable ECC error bit is asserted in the MCi_STATUS register for corrected ECC errors.

Bits 47-54: The ECC syndrome field in the MCi_STATUS register contains the 8-bit ECC syndrome only if the error was a correctable/uncorrectable ECC error, and there wasn’t a previous valid ECC error syndrome logged in the MCi_STATUS register. A previous valid ECC error in MCi_STATUS is indicated by MCi_STATUS.bit45 (uncorrectable error occurred) being asserted. After processing an ECC error, machine-check handling software should clear MCi_STATUS.bit45 so that future ECC error syndromes can be logged.

Bits 55-56: Reserved.

13.7. GUIDELINES FOR WRITING MACHINE-CHECK SOFTWARE
The machine-check architecture and error logging can be used in two different ways:

• To detect machine errors during normal instruction execution, using the machine-check exception (#MC).
• To periodically check and log machine errors.

To use the machine-check exception, the operating system or executive software must provide a machine-check exception handler. This handler can be designed specifically for P6 family processors or be a portable handler that also handles Pentium® processor machine-check errors. A special program or utility is required to log machine errors. Guidelines for writing a machine-check exception handler or a machine-error logging utility are given in the following sections.

13.7.1. Machine-Check Exception Handler
The machine-check exception (#MC) corresponds to vector 18. To service machine-check exceptions, a trap gate must be added to the IDT, and the pointer in the trap gate must point to a machine-check exception handler. Two approaches can be taken to designing the exception handler:

• The handler can merely log all the machine status and error information, then call a debugger or shut down the system.
• The handler can analyze the reported error information and, in some cases, attempt to correct the error and restart the processor.

Virtually none of the machine-check conditions detected by the P6 family processors can be recovered from (they result in abort-type exceptions). The logging of status and error information is therefore a baseline implementation. Refer to Section 13.7., “Guidelines for Writing Machine-Check Software” for more information on logging errors.

For future P6 family processor implementations, where recovery may be possible, the following things should be considered when writing a machine-check exception handler:

• To determine the nature of the error, the handler must read each of the error-reporting register banks. The count field in the MCG_CAP register gives the number of register banks. The first register of register bank 0 is at address 400H.
• The VAL (valid) flag in each MCi_STATUS register indicates whether the error information in the register is valid. If this flag is clear, the registers in that bank do not contain valid error information and do not need to be checked.
• To write a portable exception handler, only the MCA error code field in the MCi_STATUS register should be checked. Refer to Section 13.6., “Interpreting the MCA Error Codes” for information that can be used to write an algorithm to interpret this field.
• The PCC and OVER flags in each MCi_STATUS register indicate whether recovery from the error is possible. If either of these flags is set, recovery is not possible. The OVER flag indicates that two or more machine-check errors occurred. When recovery is not possible, the handler typically records the error information and signals an abort to the operating system.
• Corrected errors will have been corrected automatically by the processor. The UC flag in each MCi_STATUS register indicates whether the processor automatically corrected the error: if UC is set, the error was not corrected.
• The RIPV flag in the MCG_STATUS register indicates whether the program can be restarted at the instruction pointed to by the instruction pointer pushed on the stack when the exception was generated. If this flag is clear, the processor may still be able to be restarted (for debugging purposes), but not without loss of program continuity.
• For unrecoverable errors, the EIPV flag in the MCG_STATUS register indicates whether the instruction pointed to by the instruction pointer pushed on the stack when the exception was generated is related to the error. If this flag is clear, the pushed instruction may not be related to the error.
• The MCIP flag in the MCG_STATUS register indicates whether a machine-check exception was generated. Before returning from the machine-check exception handler, software should clear this flag so that it can be used reliably by an error logging utility. The MCIP flag is also used to detect recursion, which the machine-check architecture does not support. When the processor detects machine-check recursion, it enters the shutdown state.

Example 13-2 gives typical steps carried out by a machine-check exception handler:
Example 13-2. Machine-Check Exception Handler Pseudocode

IF CPU sup...
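
The register checks listed in Section 13.7.1 can also be illustrated with a short decoding sketch. The Python below is not the manual's Example 13-2; it is a minimal illustration under stated assumptions. The function names (decode_bank_status, scan_banks) and the read_msr callable are hypothetical. Only the 400H base address comes from the text above; the MCG_CAP and MCG_STATUS addresses, the four-registers-per-bank layout, and the flag bit positions are assumptions taken from the general machine-check register description and should be verified against the register definitions for the target processor, since real implementations may deviate from a uniform bank layout.

```python
"""Sketch: decode machine-check registers as a handler or logging utility might."""

# MSR addresses. 0x400 is stated in the text; the rest are assumptions
# based on the general machine-check register description.
MCG_CAP = 0x179
MCG_STATUS = 0x17A
MC0_BASE = 0x400      # first register of register bank 0
REGS_PER_BANK = 4     # assumed layout per bank: CTL, STATUS, ADDR, MISC
STATUS_OFFSET = 1     # assumed: MCi_STATUS is the second register of a bank

# MCG_STATUS flags (assumed bit positions).
RIPV = 1 << 0         # restart at the pushed instruction pointer is valid
EIPV = 1 << 1         # pushed instruction pointer is related to the error
MCIP = 1 << 2         # machine check in progress

# MCi_STATUS flags (assumed bit positions).
VAL = 1 << 63         # bank holds valid error information
OVER = 1 << 62        # two or more errors occurred
UC = 1 << 61          # error was not corrected by the processor
PCC = 1 << 57         # processor context corrupt
MCA_CODE_MASK = 0xFFFF


def decode_bank_status(status):
    """Return the decoded fields of one MCi_STATUS value, or None if VAL is clear."""
    if not status & VAL:
        return None
    return {
        "mca_error_code": status & MCA_CODE_MASK,
        "uncorrected": bool(status & UC),
        "overflow": bool(status & OVER),
        "context_corrupt": bool(status & PCC),
    }


def scan_banks(read_msr):
    """Read every bank through read_msr(address) and summarize what was found.

    Returns (errors, can_resume): errors maps bank number to decoded fields;
    can_resume is True only if RIPV is set and no valid bank reports PCC or OVER.
    """
    bank_count = read_msr(MCG_CAP) & 0xFF   # count field of MCG_CAP
    mcg_status = read_msr(MCG_STATUS)
    errors = {}
    can_resume = bool(mcg_status & RIPV)
    for bank in range(bank_count):
        address = MC0_BASE + REGS_PER_BANK * bank + STATUS_OFFSET
        info = decode_bank_status(read_msr(address))
        if info is None:
            continue                         # VAL clear: nothing to check here
        errors[bank] = info
        if info["overflow"] or info["context_corrupt"]:
            can_resume = False
    return errors, can_resume


if __name__ == "__main__":
    # Made-up register contents for illustration; not captured from hardware.
    fake_msrs = {MCG_CAP: 2, MCG_STATUS: RIPV | MCIP,
                 0x401: 0, 0x405: VAL | UC | 0x0115}
    errors, can_resume = scan_banks(fake_msrs.__getitem__)
    for bank, info in errors.items():
        print(f"bank {bank}: {info}")
    print("can resume:", can_resume)
```

Passing the MSR reader in as a callable keeps the decoding logic testable without privileged access. A real handler would obtain these values with RDMSR and, as the text requires, would clear the MCIP flag in MCG_STATUS before returning.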