Those structures are fully synthesizable designs instead of the traditional customized designs. Thus, the estimated timing, such as the BTB's latency, would be longer than that in a real chip.

TABLE IV
TIMING PARAMETERS OF BRANCH COMPONENTS UNDER TSMC DOPHIN LIBRARY 65NM (0.9V, 125C)

                             Timing (ns)          Estimated Max Latency
Components     SRAM Cell     TSetup   TClktoQ     w/o TAP   w/ TAP   Sub-pred in TAP
BTB Data       128x148       0.208    0.878       1.112     1.112    -
BTB Tag        128x120       0.212    0.862
Hybrid         128x32        0.233    0.772       0.902     0.902    0.802
Perceptrons    128x64        0.223    0.790       1.470     1.470    1.400
O-GEHL         128x16        0.238    0.763       1.433     1.433    1.163

B. MPKI Impacts and Performance Improvement

Figure 7 shows the indirect-branch MPKI of the baseline and the TAP schemes. All TAP schemes for the three direction predictors reduce the indirect-branch MPKI significantly. For example, the TAP-Perceptrons scheme reduces the average MPKI from 3.98 to 1.03. Table V shows the conditional-branch MPKI impacts. Since TAP Prediction reuses existing branch components, adverse impacts on conditional branches are inevitable. However, those impacts are relatively small compared to TAP's significant indirect-branch MPKI improvement. The results show that the impacts on conditional branch prediction vary across the three TAP schemes. This is because the mechanisms in Perceptrons and O-GEHL, which are inherently used to avoid aliasing problems and contention for conditional branches, are also effective at avoiding interference between predicting branch directions and predicting target address pointers.

The MPKI reduction leads to an attractive IPC improvement, as shown in Figure 8. Because TAP Prediction does not increase the latency of the branch prediction, the IPC evaluation generally reflects the performance impact. The IPC improvements of TAP-Hybrid, TAP-Perceptrons, and TAP-OGEHL are 18.19%, 21.52%, and 20.59%, respectively. We have also experimented with a perfect indirect-branch predictor for each baseline, which predicts all the indirect-branch targets with 100% accuracy.
The TAP schemes achieve 76.84%, 85.09%, and 82.35%, respectively, of the maximum performance improvements provided by the ideal indirect-branch predictors.

The effect of the updating mechanism in the TAP-Perceptrons scheme is listed in Table VI. The results show that each indirect-branch update takes only 1.80 cycles on average. Moreover, even if every update could be finished in one cycle (w/o additional cycles), performance would improve by only 0.19% on average. Therefore, although traversing the target entries makes the updating a little more complex, it would not
TABLE V
CONDITIONAL-BRANCH MPKI COMPARISONS

Configuration    crafty   eon    perlbmk   gap     sjeng   perlbench   gcc06   povray   richards   Avg.
Hybrid Predictor
  baseline       7.67     5.98   5.79      0.94    12.90   3.22        5.80    1.89     4.51       5.41
  TAP            8.46     6.64   7.37      1.50    14.38   3.72        6.56    2.00     4.63       6.14
Perceptrons Predictor
  baseline       4.99     6.33   8.40      2.84    9.21    1.25        4.74    3.49     3.14       4.93
  TAP            5.02     6.36   8.44      2.86    9.33    1.25        4.85    3.49     3.14       4.97
  VPC            5.04     6.37   12.70     12.91   9.40    1.25        4.77    7.47     3.15       7.01
OGEHL Predictor
  baseline       5.27     6.25   8.26      2.19    9.85    0.78        4.00    3.14     3.10       4.76
  TAP            5.84     6.36   8.97      2.76    11.23   1.27        4.70    3.14     3.11       5.26

[Figure 7: bar chart of MPKI (y-axis, 0 to 9) for crafty, eon, perlbmk, gap, sjeng, perlbench, gcc06, povray, richards, and Average; series: Baseline, TAP-Hybrid, TAP-Perceptrons, TAP-OGEHL]
Fig. 7. Indirect-branch MPKI impacts of TAP predictors

TABLE VI
UPDATE PENALTIES IN TAP-PERCEPTRONS SCHEME: INDIRECT BRANCH (IB), IPC IMPROVEMENT WITHOUT ADDITIONAL CYCLES (IPC W/ IDEAL UPDATE)
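
The claim that the conditional-branch penalty is small relative to the indirect-branch gain can be checked with a little arithmetic on the reported averages. The following is a minimal sketch, not part of the original evaluation: it uses the Avg. column of Table V and the TAP-Perceptrons indirect-branch figures quoted in the text (3.98 and 1.03). The function names are hypothetical, and the net figure assumes that both MPKI values are counted per 1000 instructions of the same runs, so that they can simply be added.

```python
# Average conditional-branch MPKI, copied from the Avg. column of Table V.
CONDITIONAL_AVG_MPKI = {
    "Hybrid": {"baseline": 5.41, "TAP": 6.14},
    "Perceptrons": {"baseline": 4.93, "TAP": 4.97},
    "OGEHL": {"baseline": 4.76, "TAP": 5.26},
}

# Average indirect-branch MPKI; the text quotes these only for TAP-Perceptrons.
INDIRECT_AVG_MPKI = {
    "Perceptrons": {"baseline": 3.98, "TAP": 1.03},
}


def conditional_delta(predictor):
    """Increase in average conditional-branch MPKI caused by TAP."""
    row = CONDITIONAL_AVG_MPKI[predictor]
    return round(row["TAP"] - row["baseline"], 2)


def net_mpki_change(predictor):
    """Conditional increase plus indirect change (negative means fewer misses).

    Assumption: both metrics share the same instruction count, so they add.
    """
    indirect = INDIRECT_AVG_MPKI[predictor]
    indirect_delta = indirect["TAP"] - indirect["baseline"]
    return round(conditional_delta(predictor) + indirect_delta, 2)


if __name__ == "__main__":
    for name in CONDITIONAL_AVG_MPKI:
        print(f"{name}: conditional MPKI change = {conditional_delta(name):+.2f}")
    print(f"Perceptrons: net MPKI change = {net_mpki_change('Perceptrons'):+.2f}")
```

Under these assumptions, TAP-Perceptrons adds 0.04 conditional-branch MPKI on average while removing 2.95 indirect-branch MPKI, which is the trade-off the text describes. The Hybrid and OGEHL rows show larger conditional-branch increases (0.73 and 0.50), consistent with the observation that the impact varies by direction predictor.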
