Abstract

Advances in IC processing allow for more microprocessor design options. The increasing gate density and cost of wires in advanced integrated circuit technologies require that we look for new ways to use their capabilities effectively. This paper shows that in advanced technologies it is possible to implement a single-chip multiprocessor in the same area as a wide-issue superscalar processor. We find that for applications with little parallelism the performance of the two microarchitectures is comparable. For applications with large amounts of parallelism at both the fine- and coarse-grained levels, the multiprocessor microarchitecture outperforms the superscalar architecture by a significant margin. Single-chip multiprocessor architectures have the advantage that they offer localized implementation of a high-clock-rate processor for inherently sequential applications and low-latency interprocessor communication for parallel applications.

1 Introduction

Advances in integrated circuit technology have fueled microprocessor performance growth for the last fifteen years. Each increase in integration density allows for higher clock rates and offers new opportunities for microarchitectural innovation. Both of these are required to maintain microprocessor performance growth. Microarchitectural innovations employed by recent microprocessors include multiple instruction issue, dynamic scheduling, speculative execution and non-blocking caches. In the future, the trend seems to be towards CPUs with wider instruction issue and support for larger amounts of speculative execution. In this paper, we argue against this trend. We show that, due to fundamental circuit limitations and limited amounts of instruction-level parallelism, the superscalar execution model will provide diminishing returns in performance for increasing issue width. Faced with this situation, building a complex wide-issue superscalar CPU is not the most efficient use of silicon resources. We present the case that a better use of silicon area is a multiprocessor microarchitecture constructed from simpler processors.

To understand the performance trade-offs between wide-issue processors and multiprocessors in a more quantitative way, we compare the performance of a six-issue dynamically scheduled superscalar processor with a multiprocessor built from four two-issue processors. Our comparison has a number of unique features. First, we accurately account for and justify the latencies, especially the cache hit time, associated with the two microarchitectures. Second, we develop floorplans and carefully allocate resources to the two microarchitectures so that they require an equal amount of die area. Third, we evaluate these architectures with a variety of integer, floating point and multiprogramming applications running in a realistic operating system environment....
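The trade-off argued above can be illustrated with a deliberately crude throughput model. The sketch below is not the paper's methodology: it assumes that a core sustains at most min(issue width, available instruction-level parallelism) instructions per cycle, and that a multiprocessor scales by Amdahl's law over a hypothetical parallel fraction. The function names and all numeric inputs are illustrative assumptions, and the model ignores clock rate, cache latency and communication cost.

```python
def superscalar_ipc(issue_width: int, ilp: float) -> float:
    """Sustained instructions per cycle for one core (toy assumption):
    bounded by the issue width and by the ILP the program exposes."""
    return min(float(issue_width), ilp)


def multiprocessor_ipc(cores: int, issue_width: int, ilp: float,
                       parallel_fraction: float) -> float:
    """Aggregate instructions per cycle for `cores` identical cores (toy assumption).

    Each core is limited as in superscalar_ipc; the speedup from using
    several cores follows Amdahl's law for the given parallel fraction.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be in [0, 1]")
    per_core = superscalar_ipc(issue_width, ilp)
    speedup = 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / cores)
    return per_core * speedup


if __name__ == "__main__":
    # Hypothetical workloads: (available ILP, parallel fraction).
    workloads = {
        "sequential, low ILP": (2.0, 0.0),
        "parallel, low ILP": (2.0, 0.9),
    }
    for name, (ilp, frac) in workloads.items():
        wide = superscalar_ipc(6, ilp)             # one six-issue core
        multi = multiprocessor_ipc(4, 2, ilp, frac)  # four two-issue cores
        print(f"{name}: six-issue={wide:.2f}, four x two-issue={multi:.2f}")
```

Under these assumptions, a program exposing only about two independent instructions per cycle gains nothing from the extra issue slots of the six-issue core, while the four simpler cores pull ahead as soon as a large share of the work can run in parallel. A sequential program with more available ILP would favor the wide core in this model, which is why a quantitative comparison is needed.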