ENGN8537 Lecture 3_3.pdf - Research School of Engineering...

Research School of Engineering
ENGN8537: Embedded Systems and Real Time Digital Signal Processing

You know I haven't eaten since 6 o'clock this morning, and that was half a cream cheese bagel. And it wasn't even real cream cheese, it was light cream cheese! Now you want me to run off and do

Alternative Processor Architecture
Pipelining

Recall from last week’s lecture: a single instruction on a RISC machine goes through a number of stages as it is executed. Each stage takes one clock cycle and uses a different piece of hardware, so the instruction is shifted through the CPU much like a part on a factory assembly line.

Because each stage uses different hardware, multiple instructions may be overlapped to improve throughput. This is called pipelining. Each instruction still takes the same amount of time to run, but more instructions are in flight at any given time, so throughput improves.
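The overlap described above can be sketched with a small model. This is an illustrative sketch only, assuming an ideal pipeline with no stalls or hazards and the classic five stage names (IF, ID, EX, MEM, WB), which are not given in this section; the function names are hypothetical.

```python
# Assumed stage names for a classic five-stage RISC pipeline (not from the slides).
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def pipeline_schedule(num_instructions, stages=STAGES):
    """Map each cycle to the (instruction index, stage) pairs active in it.

    Ideal model: instruction i enters stage s in cycle i + s, one new
    instruction is started every cycle, and nothing ever stalls.
    """
    schedule = {}
    for i in range(num_instructions):
        for s, name in enumerate(stages):
            schedule.setdefault(i + s, []).append((i, name))
    return schedule


def total_cycles_sequential(n, k=len(STAGES)):
    """Cycles needed if each instruction finishes before the next starts."""
    return n * k


def total_cycles_pipelined(n, k=len(STAGES)):
    """Cycles needed when instructions overlap: fill the pipe, then one per cycle."""
    return k + n - 1 if n > 0 else 0


if __name__ == "__main__":
    n = 4
    for cycle, active in sorted(pipeline_schedule(n).items()):
        print(cycle, active)
    print("sequential:", total_cycles_sequential(n))
    print("pipelined: ", total_cycles_pipelined(n))
```

In this model every instruction still occupies five cycles from fetch to write-back, but four instructions complete in 8 cycles rather than 20, which is the throughput gain the text describes.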
Superscalar

We further noted that different instructions have different requirements at the execute phase. For example, the instruction set may prevent any single instruction from both using the ALU and accessing memory (as on many RISC machines). By doubling the “width” of the other stages (i.e. allowing them to operate on two instructions at once), an ALU instruction and a memory instruction may be executed completely in parallel with minimal duplication of hardware. This is an example of superscalar design.
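A rough sketch of the dual-issue idea follows. It assumes a simplified in-order machine with one ALU and one memory unit, classifies each instruction only as "alu" or "mem", and ignores data dependencies between paired instructions; the function name and the pairing rule are illustrative assumptions, not the design of any particular processor.

```python
def issue_cycles(program):
    """Group an in-order instruction stream into per-cycle issue slots.

    program: list of strings, each "alu" or "mem" (the execute resource needed).
    Two adjacent instructions share a cycle only if they need different
    execute hardware. Data dependencies are deliberately ignored here.
    """
    cycles = []
    i = 0
    while i < len(program):
        can_pair = i + 1 < len(program) and program[i] != program[i + 1]
        if can_pair:
            cycles.append((program[i], program[i + 1]))
            i += 2
        else:
            cycles.append((program[i],))
            i += 1
    return cycles


if __name__ == "__main__":
    stream = ["alu", "mem", "alu", "alu", "mem"]
    for cycle, slot in enumerate(issue_cycles(stream)):
        print(cycle, slot)
```

With this stream, five instructions issue in three cycles: the two adjacent ALU instructions cannot share a cycle because there is only one ALU, while each ALU and memory pair can.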
ARM Cortex A8

The ARM Cortex A8 integer pipeline is an example of a superscalar processor. It still logically performs the five canonical RISC operations; however, for the sake of speed, some of those operations are further split into smaller pieces. It also contains other logic blocks that are outside the scope of this course:

A Bus Interface Unit (BIU) is responsible for memory accesses, including level 2 and 3 cache (level 1 is built into the pipeline directly).
The NEON core is a logically separate processor that is responsible for SIMD and stream processing of graphics and media. We will visit stream processors later in the lecture.
The Trace Port is a debugging aid.
Fetch takes three clock cycles (three pipeline stages). The target address is first calculated in the Address Generator Unit (AGU). Next, the memory access is performed (RAM). The remaining acronyms in the F1 stage are specific pieces of logic that accelerate particular types of access. These include Branch Prediction (the