Intel Software Developer's Manual


... performance-monitoring counter events.

7BH  BUS_HITM_DRV  Unit Mask: 00H (Self)
  Description: Number of bus clock cycles during which this processor is driving the HITM# pin.
  Comments: Includes cycles due to snoop stalls. The event counts correctly, but the BPMi pins function as follows, based on the setting of the PC bits (bit 19 in the PerfEvtSel0 and PerfEvtSel1 registers): If the core-clock-to-bus-clock ratio is 2:1 or 3:1 and a PC bit is set, the BPMi pins will be asserted for a single clock when the counters overflow. If the PC bit is clear, the processor toggles the BPMi pins when the counter overflows. If the clock ratio is not 2:1 or 3:1, the BPMi pins will not function for these performance-monitoring counter events.

7EH  BUS_SNOOP_STALL  Unit Mask: 00H (Self)
  Description: Number of clock cycles during which the bus is snoop stalled.

Table A-1. Events That Can Be Counted with the P6 Family Performance-Monitoring Counters (Contd.)

Unit: Floating-Point Unit

C1H  FLOPS  Unit Mask: 00H
  Description: Number of computational floating-point operations retired. Excludes floating-point computational operations that cause traps or assists. Includes floating-point computational operations executed by the assist handler. Includes internal sub-operations for complex floating-point instructions like transcendentals. Excludes floating-point loads and stores.
  Comments: Counter 0 only.

10H  FP_COMP_OPS_EXE  Unit Mask: 00H
  Description: Number of computational floating-point operations executed. The number of FADD, FSUB, FCOM, FMULs, integer MULs and IMULs, FDIVs, FPREMs, FSQRTs, integer DIVs, and IDIVs. Note that this is the number of operations, not the number of cycles. This event does not distinguish an FADD used in the middle of a transcendental flow from a separate FADD instruction.
  Comments: Counter 0 only.

11H  FP_ASSIST  Unit Mask: 00H
  Description: Number of floating-point exception cases handled by microcode.
  Comments: Counter 1 only. This event includes counts due to speculative execution.

12H  MUL  Unit Mask: 00H
  Description: Number of multiplies. Note: Includes integer as well as FP multiplies and is speculative.
  Comments: Counter 1 only.

13H  DIV  Unit Mask: 00H
  Description: Number of divides. Note: Includes integer as well as FP divides and is speculative.
  Comments: Counter 1 only.

14H  CYCLES_DIV_BUSY  Unit Mask: 00H
  Description: Number of cycles during which the divider is busy and cannot accept new divides. Note: Includes integer and FP divides, FPREM, FPSQRT, etc., and is speculative.
  Comments: Counter 0 only.

Unit: Memory Ordering

03H  LD_BLOCKS  Unit Mask: 00H
  Description: Number of store buffer blocks. Includes counts caused by preceding stores whose addresses are unknown, preceding stores whose addresses are known but whose data is unknown, and preceding stores that conflict with the load but incompletely overlap it.

04H  SB_DRAINS  Unit Mask: 00H
  Description: Number of store buffer drain cycles. Incremented every cycle the store buffer is draining. Draining is caused by serializing operations like CPUID, synchronizing operations like XCHG, interrupt acknowledgment, and other conditions (such as cache flushing).

05H  MISALIGN_MEM_REF  Unit Mask: 00H
  Description: Number of misaligned data memory references. Incremented by 1 every cycle during which either the processor's load or store pipeline dispatches a misaligned uop. Counting is performed if it is the first or second half, or if it is blocked, squashed, or missed. Note: In this context, misaligned means crossing a 64-bit boundary.
  Comments: MISALIGN_MEM_REF is only an approximation of the true number of misaligned memory references. The value returned is roughly proportional to the number of misaligned memory accesses, i.e., the size of the problem.
07H  EMON_KNI_PREF_DISPATCHED  Unit Mask: 00H, 01H, 02H, 03H
  Description: Number of Streaming SIMD Extensions prefetch/weakly-ordered instructions dispatched (speculative prefetches are included in counting).
    0: prefetch NTA
    1: prefetch T1
    2: prefetch T2
    3: weakly ordered stores
  Comments: Counters 0 and 1. Pentium® III processor only.

4BH  EMON_KNI_PREF_MISS  Unit Mask: 00H, 01H, 02H, 03H
  Description: Number of prefetch/weakly-ordered instructions that miss all caches.
    0: prefetch NTA
    1: prefetch T1
    2: prefetch T2
    3: weakly ordered stores
  Comments: Counters 0 and 1. Pentium® III processor only.

Unit: Instruction Decoding and Retirement

C0H  INST_RETIRED  Unit Mask: 00H
  Description: Number of instructions retired.
  Comments: A hardware interrupt received during/after the last iteration of the REP STOS flow causes the counter to undercount by 1 instruction.

C2H  UOPS_RETIRED  Unit Mask: 00H
  Description: Number of UOPs retired.

D0H  INST_DECODED  Unit Mask: 00H
  Description: Number of instructions decoded.

D8H  EMON_KNI_INST_RETIRED
  Description: Number of Streaming SIMD Extensions retired.
    0: packed and scalar
    1: scalar

(event number and mnemonic not shown)
  Description: Number of Streaming SIMD Extensions computation instructions retired.
    0: packed and scalar
    1: scalar

(event number and mnemonic not shown)
  Description: Number of hardware interrupts received.

(event number and mnemonic not shown)
  Description: Number of processor cycles for whic...
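
As a rough illustration of how the event numbers, unit masks, and counter restrictions in Table A-1 fit together, the sketch below packs an event into a value for PerfEvtSel0 or PerfEvtSel1 and rejects an event whose Comments field restricts it to the other counter. Only the PC bit position (bit 19) is stated in this excerpt; the event-select, unit-mask, USR, OS, and EN bit positions are assumed from the P6 PerfEvtSel layout documented elsewhere in the manual, and the names EVENTS and perfevtsel_value are hypothetical. The code only computes the value; actually writing it to the register is out of scope.

```python
# Sketch only: compute a PerfEvtSel0/PerfEvtSel1 value for a Table A-1 event.
# Assumed layout (not stated in this excerpt, except PC = bit 19):
#   bits 0-7  event select      bits 8-15 unit mask
#   bit 16    USR               bit 17    OS
#   bit 19    PC                bit 22    EN
USR_BIT = 1 << 16
OS_BIT = 1 << 17
PC_BIT = 1 << 19
EN_BIT = 1 << 22

# name -> (event number, unit mask, only permitted counter or None)
# A small subset of the table; None means no restriction is listed here.
EVENTS = {
    "FLOPS": (0xC1, 0x00, 0),
    "FP_COMP_OPS_EXE": (0x10, 0x00, 0),
    "FP_ASSIST": (0x11, 0x00, 1),
    "DIV": (0x13, 0x00, 1),
    "CYCLES_DIV_BUSY": (0x14, 0x00, 0),
    "LD_BLOCKS": (0x03, 0x00, None),
    "INST_RETIRED": (0xC0, 0x00, None),
}


def perfevtsel_value(name, counter, usr=True, os=True, pc=False, enable=True):
    """Return the register value that selects event `name` on `counter`."""
    if counter not in (0, 1):
        raise ValueError("the P6 family has counters 0 and 1 only")
    event, unit_mask, only_counter = EVENTS[name]
    if only_counter is not None and counter != only_counter:
        raise ValueError(f"{name} is 'Counter {only_counter} only'")
    value = event | (unit_mask << 8)
    if usr:
        value |= USR_BIT   # count at user privilege levels
    if os:
        value |= OS_BIT    # count at privilege level 0
    if pc:
        value |= PC_BIT    # pin control, see the BPMi comments above
    if enable:
        value |= EN_BIT
    return value


if __name__ == "__main__":
    print(hex(perfevtsel_value("FLOPS", 0)))  # 0x4300c1
```

Keeping the counter restriction next to the event number in one table means a request such as DIV on counter 0 fails early with a clear message instead of silently counting nothing.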