01_SoC_Overview_and_ARM_Integrator

01_SoC_Overview_and_ARM_Integrator - SOC Design Lab

Info iconThis preview shows page 1. Sign up to view the full content.

View Full Document Right Arrow Icon
This is the end of the preview. Sign up to access the rest of the document.

Unformatted text preview: SOC Design Lab http://twins.ee.nctu.edu.tw/courses/soclab_04/index.html Adopted from National Chiao-Tung University IP Core Design SOC Consortium Course Material Outline Introduction to SoC ARM-based SoC and Development Tools SoC Labs Available Lab modules in this course Summary SOC Consortium Course Material 1 SoC: System on Chip System A collection of all kinds of components and/or subsystems that are appropriately interconnected to perform the specified functions for end users. A SoC design is a “product creation process” which – Starts at identifying the end-user needs – Ends at delivering a product with enough functional satisfaction to overcome the payment from the end-user SOC Consortium Course Material 2 SoC Definition Complex IC that integrates the major functional elements of a complete end-product into a single chip or chipset The SoC design typically incorporates – – – – – Programmable processor On-chip memory HW accelerating function units (DSP) Peripheral interfaces (GPIO and AMS blocks) Embedded software Source: “Surviving the SoC revolution – A Guide to Platform-based Design,” Henry Chang et al, Kluwer Academic Publishers, 1999 SOC Consortium Course Material 3 SoC Architecture SOC Consortium Course Material 4 SoC Example SOC Consortium Course Material 5 SoC Application Communication – Digital cellular phone – Networking Computer – PC/Workstation – Chipsets Consumer – Game box – Digital camera SOC Consortium Course Material 6 Benefits of Using SoC Reduce overall system cost Increase performance Lower power consumption Reduce size SOC Consortium Course Material 7 Evolution of Silicon Design Source: “Surviving the SoC revolution – A Guide to Platform-based Design,” Henry Chang et al, Kluwer Academic Publishers, 1999 SOC Consortium Course Material 8 Challenges in SoC Era Time-to-market – Process roadmap acceleration – Consumerization of electronic devices Complex systems – µCs, DSPs, HW/SW, SW protocol stacks, RTOS’s, digital/analog IPs, On-chips buses Deep submicron effects – Crosstalk, electronmigration, wire delays, mask costs SOC Consortium Course Material 9 How to Conquer the Complexity Use a known real entity – A pre-designed component (IP reuse) – A platform (architecture reuse) Partition – Based on functionality – Hardware and software Modeling – At different level – Consistent and accurate SOC Consortium Course Material 10 What is IP? Intellectual Property (IP) – Intellectual Property means products, technology, software, etc. that have been protected through patents, copyrights, or trade secrets. Virtual Component (VC) – A block that meets the Virtual Socket Interface Specification and is used as a component in the Virtual Socket design environment. Virtual Components can be of three forms — Soft, Firm, or Hard. (VSIA) Also named mega function, macro block, reusable component SOC Consortium Course Material 11 Reusable Component A design object This refers to the type of components for which a physical implementation exists that can be reused. For example, ALU chips or macrocells that can be embedded in larger chips, etc. These designs are largely implemented in specific technologies. Limited parameterization may be possible. These design objects typically exist in technology libraries from one or more suppliers. ---- EDA Industry Standard Roadmap 1996 SOC Consortium Course Material 12 Types of IP Design Flow Representation Soft System Design - Very flexible RTL Design Behavioral RTL - Not predicable Firm - Flexible - Predicable Hard Libraries Technology Portability N/A Technology Unlimited Independent Reference Library Synthesis RTL & Constraints - Footprint Floorplanning Netlist - Timing model Placement - Wiring model Routing - Not flexible Verification - Very predicable Polygon Data Technology Generic Library Mapping Specific Library Technology - Characterized cells Fixed - Fixed process rules Process Porting SOC Consortium Course Material 13 IP Value Foundation IP – Cell, MegaCell Star IP – ARM ( low power ) Niche IP – JPEG, MPEGII, TV, Filter Standard IP – USB, IEEE1394, ADC, DAC …….. SOC Consortium Course Material 14 IP Sources Legacy IP - from previous IC New IP - specifically designed for reuse Licensed IP - from IP vendors SOC Consortium Course Material 15 Why IP Don’t know how to do it Cannot wait for new in-house development Standard/Compatibility calls for it - PCI, USB, IEEE1394, Bluetooth - software compatibility SOC Consortium Course Material 16 Differences in Design Between IC and IP Limitation of IC design – Number of I/O pin – Design and Implement all the functionality in the silicon Soft IP – No limitation on number of I/O pin – Parameterized IP Design: design all the functionality in HDL code but implement desired parts in the silicon – IP compiler/Generator: select what you want !! – More high level auxiliary tools to verify design – More difficult in chip-level verification Hard IP – No limitation on number of I/O pin – Provide multiple level abstract model – Design and Implement all the functionality in the layout SOC Consortium Course Material Reusability 17 IP-Based SoCs An Evolutionary Path – Early days • IP/Cores not really designed for reuse (no standard deliverables) • Multiple Interfaces, difficult to integrate – IPs evolved: parameterization, deliverables, verification, synthesis – On-Chip bus standards began to appear (e.g, IBM CoreConnect, ARM AMBA, OCP-IP) Reusable IP + Common on-chip bus architectures – 1998: max number of cores > 30 cores core content between 50% and 95% SOC Consortium Course Material 18 Design Reuse Evolution • IP reuse • Architecture reuse (Platform design) Today’s platform is a virtual component for tomorrow’s platform -VSIA PBD SG Source: EEDesign, http://www.eedesign.com/story/OEG20020301S0104 SOC Consortium Course Material 19 History and Status of IP Reuse 1999 Reuse methodology manual Biggest change – Reuse is no longer a proposal – A solution practiced today for state-of-the art SOC design New business model emerges – IP provider and IP integrator – Like TSMC break IDM Top three IP providers (2002) – ARM, Rambus, MIPS Industry forum – worldwide: Virtual socket interface alliance (VSIA) – Taiwan: Taiwan SOC consortium SOC Consortium Course Material 20 Design for Use First To be reusable, useable first Good design techniques – – – – – Good documentation, good code, thorough commenting, well designed verification environments, robust scripts SOC Consortium Course Material 21 Then Design for Reuse Designed to solve a general problem – Configuration and parameters Designed for use in multiple technologies Designed with standards-based interfaces Designed with complete verification process – Robust and verified Design verified to a high level of confidence – Physical prototype, silicon proven, demo system Design with fully documented in terms of appropriate applications and restrictions SOC Consortium Course Material 22 Some Questions for Design Reuse Why should I design for reuse ? Should I plan all my design for reuse ? What is design reuse ? How to design reuse ? Who shall do design reuse ? When shall I do design reuse ? SOC Consortium Course Material 23 Why “Reuse” are not used ? I don’t have time to learn these “reuse” stuff, e.g. coding guidelines, documents, and various verification. I have to get my design done. It lengthen the design cycle. It’s true that – Reuse cost is very expensive • Explicit design reuse requires a dedicated team • 10x or an order of magnitude higher But it rewards – Benefit • 2-3X benefit for implicit reuse • 10X-100X productivity gain in successive design for explicit reuse – Don’t you have any coding style ? It’s a matter of habit. – Don’t you have to write any form of documents, too ? – Don’t you have to verify your design, too ? • Shortcut working style saves time now, but it pays more latter Whether to adopt reuse is a managerial and cultural SOC Consortium Course Material issue. 24 Challenge for Designers Not whether to adopt reuse But how to employ it effectively – Easy to integrate (interface) – Robust to integrate (function) Trends – Buy whatever IP they can and developing reusable macros only when necessary • problems: IP with functional bugs and poor documentation – Standard-based interfaces – Converging rather than diverging in their basic structure, with differentiation by software or specialized computational units SOC Consortium Course Material 25 Paradigm Shift Conventional flow – System/design house for RTL design, synthesis – ASIC vendor for physical implementation But SOC design is too complicated to be handled in-house solely New flow – System/design house provide H/W spec. only and focus on valued-added S/W and application – ASIC vendor offer IPs and integration service SOC Consortium Course Material 26 Survive for New Climate For system/design house – Improve S/W and system architecture skill to differentiate the products For ASIC house – Traditionally, ASIC vendors do not have RTL-based design expertise – Learn to develop, integrate and manage reusable IPs SOC Consortium Course Material 27 Reuse :The Key to SoC Design ss ardne IP H Socketized Functions for Plug & Play Integration Core Reuse e daptiv A Soft / Predictable, Preverified Core Functions Source Reuse Personal Reuse Designer-Specific Reuse Practices Adequate testbench: Executable Spec. Qualification : RTL Linting, Coverage Analyzer Block Design : Spec and Constraint Functional Starting Points for Block Design Documented Spec. Feedback for Update Who's original designer ? ADD ific / Spec Virtual Component Hard Reuse TDD Decision-Making: Qualified IP ? Standard Compliant ? (VSI, etc.,) Core Protocol (OCP, Washbone, etc.,) Interface-Based TestBench PBD BBD Opportunistic IP Reuse Planned IP Reuse SOC Consortium Course Material 28 Outline Introduction to SoC ARM-based SoC and Development Tools SoC Labs Available Lab modules in this course Summary SOC Consortium Course Material 29 ARM-based System Development Processor cores ARM On-Chip Bus: AMBA Platform: PrimeXsys System building blocks: PrimeCell Development tools – – – – – Software development Debug tools Development kits EDA models Development boards SOC Consortium Course Material 30 ARM Architecture Version Core Architecture ARM1 v1 ARM2, ARM2as, ARM3 v2 ARM6, ARM60, ARM610, ARM7, ARM710, ARM7D, ARM7DI v3 ARM7TDMI, ARM710T, ARM720T, ARM740T StrongARM, ARM8, ARM810 ARM9TDMI, ARM920T, ARM940T v4T v4 v4T ARM9E-S, ARM10TDMI, ARM1020E v5TE ARM7EJ-S, ARM926EJ-S, ARM1026EJ-S v5TEJ ARM11 v6 SOC Consortium Course Material 31 ARM Coprocessors Application specific coprocessors – e.g., for specific arithmetic extensions – Developed a new decoupled coprocessor interface – Coprocessor no longer required to carefully track processor pipeline. ARM Core Co-processor SOC Consortium Course Material 32 ARM On-Chip Bus On-Chip RAM ARM Core Memory Interface UART AHB/ASB PIO APB Bridge DMA Master Timer Keypad A typical AMBA system AHB: Advanced High-performance Bus ASB: Advanced System Bus APB: Advanced Peripheral Bus SOC Consortium Course Material 33 ARM PrimeXsys Wireless Platform ETM MOVE Vectored Interrupt Control Static Memory Interface SDRAM Controller ARM926 CPU Color LCD DMA M M S ARM I AHB ARM D AHB LCD AHB Application Specific IP DMA 1 AHB (Periph) DMA 2 AHB (Memory) EXPANSION AHB 1 EXPANSION AHB 2 DMA APB Core APB WATCHDOG TIMERS AHB/ APB AHB/ APB SSP DMA APB Extensions GPIO (x4) GPIO x4 RTC SYSTEM CONTROL UART SIM Card SOC Consortium Course Material 34 PrimeXsys It is no longer the case that a single Intellectual Property (IP) or silicon vendor will be able to supply all of the IP that goes into a device. With the PrimeXsys range, ARM is going one step further in providing a known framework in which the IP has been integrated and proven to work. Each of the PrimeXsys platform definitions will be application focused – there is no ‘one-size-fits-all’ solution. ARM will create different platform solutions to meet the specific needs of different markets and applications. SOC Consortium Course Material 35 PrimeCell (1/2) ARM PrimeCell Peripherals are re-usable soft IP macrocells developed to enable the rapid assembly of SoC designs. Fully verified and compliant with the AMBA on-chip bus standard, the ARM PrimeCell range is designed to provide integrated right-first-time functionality and high system performance. Using the ARM PrimeCell Peripherals, designers save considerable development time and cost by concentrating their resources on developing the system design rather than the peripherals. SOC Consortium Course Material 36 PrimeCell (2/2) A typical AMBA SoC design using PrimeCell Peripherals. Ancillary or general-purpose peripherals are connected to the Advanced Peripherals Bus (APB), while main highperformance system components use the Advanced High-performance Bus (AHB). SOC Consortium Course Material 37 ARM’s Point of View of SoCs Integrating hardware IP Supplying software with the hardware ARM has identified the minimum set of building blocks that is required to develop a platform with the basic set of requirements to: – Provide the non-differentiating functionality, pre-integrated and pre-validated; – Run an OS; – Run application software; – Allow partners to focus on differentiating the final solution where it actually makes a difference. SOC Consortium Course Material 38 ARM-based System Development Processor cores ARM On-Chip Bus: AMBA Platform: PrimeXsys System building blocks: PrimeCell Development tools – – – – – Software development Debug tools Development kits EDA models Development boards SOC Consortium Course Material 39 Main Components in ADS (1/2) ANSI C compilers – armcc and tcc ISO/Embedded C++ compilers – armcpp and tcpp ARM/Thumb assembler - armasm Linker - armlink Project management tool for windows CodeWarrior Instruction set simulator - ARMulator Debuggers - AXD, ADW, ADU and armsd Format converter - fromelf Librarian – armar ADS: ARM Developer Suite ARM profiler - armprof SOC Consortium Course Material 40 Main Components in ADS (2/2) C and C++ libraries ROM-based debug tools (ARM Firmware Suite, AFS) Real time debug and trace support Support for all ARM cores and processors including ARM9E, ARM10, Jazelle, StrongARM and Intel Xscale SOC Consortium Course Material 41 The Structure of ARM Tools C/C++ source asm source C libraries C compiler assembler .o ELF object file With DWARF2 debug tables linker Librarian .axf debug ELF/DWARF2 image System models ARMulator object libraries ARMsd development board DWARF: Debug With Arbitrary Record Format ELF: Executable and linking format SOC Consortium Course Material 42 View in CodeWarrier The CodeWarrior IDE provides a simple, versatile, graphical user interface for managing your software development projects. Develop C, C++, and ARM assembly language code Targeted at ARM and Thumb processors. It speeds up your build cycle by providing: – comprehensive project management capabilities – code navigation routines to help you locate routines quickly. SOC Consortium Course Material 43 ARM Emulator: ARMulator (1/2) A suite of programs that models the behavior of various ARM processor cores and system architecture in software on a host system Can be operated at various levels of accuracy – Instruction accurate – Cycle accurate – Timing accurate SOC Consortium Course Material 44 ARM Emulator: ARMulator (2/2) Benchmarking before hardware is available – Instruction count or number of cycles can be measured for a program – Performance analysis Run software on ARMulator – Through ARMsd or ARM GUI debuggers, e.g., AXD – The processor core model incorporates the remote debug interface, so the processor and the system state are visible from the ARM symbolic debugger – Supports a C library to allow complete C programs to run on the simulated system SOC Consortium Course Material 45 ARM µHAL API µHAL is a Hardware Abstraction Layer that is designed to conceal hardware difference between different systems ARM µHAL provides a standard layer of boarddependent functions to manage I/O, RAM, boot flash, and application flash. – – – – – – – System initialization software Serial port Generic timer Generic LEDs Interrupt control Memory management PCI interface SOC Consortium Course Material 46 µHAL Examples µHAL API provides simple & extended functions that are linkable and code reusable to control the system hardware. General User application AFS utilities C and C++ libraries AFS board-specific µHAL routines AFS support routines Development board Specific AFSF: ARM Firmware Suit SOC Consortium Course Material 47 ARM Symbolic Debugger (ARMsd) (1/2) ARMsd: ARM and Thumb symbolic debugger – can single-step through C or assembly language sources, set breakpoints and watchpoints, and examine program variables or memory It is a front-end interface to debug program running either – under emulation (on the ARMulator) or remotely on a ARM development board (via a serial line or through JTAG test interface) SOC Consortium Course Material 48 ARM Symbolic Debugger (ARMsd) (2/2) It allows the setting of – breakpoints, addresses in the code – watchpoints, memory address if accessed as data address cause exception to halt so that the processor state can be examined SOC Consortium Course Material 49 Debugger-Target Interface To debug your application you must choose: – a debugging system, that can be either hardware-based on an ARM core or software that simulates an ARM core. – a debugger, such as AXD, ADW, ADU, or armsd. SOC Consortium Course Material 50 Debugger A debugger is software that enables you to make use of a debug agent in order to examine and control the execution of software running on a debug target. Examples: AXD, ADU, ADW, armsd – – – – armsd (ARM Symbolic Debugger) ADU (ARM Debugger for UNIX) ADW (ARM Debugger for Windows) AXD (both Windows and UNIX versions) • AXD is the recommended debugger. It provides functionality that is not available in the other debuggers. ADW and ADU will not be supplied in future versions of ADS. • The main improvements in AXD, compared to the earlier ARM debuggers, are: - a completely redesigned graphical user interface offering multiple views - a new command-line interface AXD: the ARM eXtended Debugger SOC Consortium Course Material 51 Debug Agent A debug agent performs the actions requested by the debugger, for example: – setting breakpoints – reading from memory – writing to memory. The debug agent is not the program being debugged, or the debugger itself Examples: ARMulator, Angel, Multi-ICE SOC Consortium Course Material 52 Debug Target Different forms of the debug target – early stage of product development, software – prototype, on a PCB including one or more processors – final product The form of the target is immaterial to the debugger as long as the target obeys these instructions in exactly the same way as the final product. The debugger issues instructions that can: – – – – load software into memory on the target start and stop execution of that software display the contents of memory, registers, and variables allow you to change stored values SOC Consortium Course Material 53 Views in AXD Various views allow you to examine and control the processes you are debugging. In the main menu bar, two menus contain items that display views: – The items in the Processor Views menu display views that apply to the current processor only – The items in the System Views menu display views that apply to the entire, possibly multiprocessor, target system AXD: the ARM eXtended Debugger SOC Consortium Course Material 54 AXD Desktop Menu Toolbar Control System view Source processor view Variable processor view Disassembly processor view Watch processor view Watch system view Console processor view Control System view SOC Consortium Course Material Status bar 55 ARM Debug Architecture (1/2) Two basic approaches to debug – from the outside, use a logic analyzer – from the inside, tools supporting single stepping, breakpoint setting Breakpoint: replacing an instruction with a call to the debugger Watchpoint: a memory address which halts execution if it is accessed as a data transfer address Debug request: through ICEBreaker programming or by DBGRQ pin asynchronously SOC Consortium Course Material 56 ARM Debug Architecture (2/2) In debug state, the core’s internal state and the system’s external state may be examined. Once examination is completed, the core and system state may be restored and program execution is resumed. The internal state is examined via a JTAG-style serial interface, which allows instructions to be serially inserted into the core’s pipeline without using the external data bus. When in debug state, a store-multiple (STM) could be inserted into the instruction pipeline and this would dump the contents of ARM’s registers. SOC Consortium Course Material 57 In Circuit Emulator (ICE) The processor in the target system is removed and replaced by a connection to an emulator The emulator may be based around the same processor chip, or a variant with more pins, but it will also incorporate buffers to copy the bus activity to a “trace buffer” and various hardware resources which can watch for particular events, such as execution passing through a breakpoint SOC Consortium Course Material 58 Multi-ICE and Embedded ICE Multi-ICE and embedded ICE are JTAG-based debugging systems for ARM processors They provide the interface between a debugger and an ARM core embedded within an ASIC – real time address-dependent and data-dependent breakpoints – single stepping – full access to, and control of the ARM core – full access to the ASIC system – full memory access (read and write) – full I/O system access (read and write) SOC Consortium Course Material 59 Basic Debug Requirements Control of program execution – set watchpoints on interesting data accesses – set breakpoints on interesting instructions – single step through code Examine and change processor state – read and write register values Examine and change system state – access to system memory • download initial code SOC Consortium Course Material 60 Debugging with Multi-ICE The system being debugged may be the final system SOC Consortium Course Material 61 ICEBreaker (EmbeddedICE Macrocell) ICEBreaker is programmed in a serial fashion using the TAP controller It consists of 2 real-time watchpoint units, together with a control and status register Either watchpoint unit can be configured to be a watchpoint or a breakpoint DBGRQI EXTERN1 A[31:0] EXTERN0 D[31:0] nOPC nRW TBIT Processor SOC Consortium Course Material RANGEOUT0 RANGEOUT1 MAS[1:0] ICEBreaker nTRANS DBGACKI DBGACK BREAKPT DBGRQ BREAKPTI IFEN DBGEN ECLK nMREQ SDIN SDOUT TCK nTRST TMS TAP TDI TDO 62 Integrate All The Modules in The Integrator Core Module (CM) Logic Module (LM) Integrator ASIC Development Platform Integrator Analyzer Module Integrator IM-PD1 Integrator/IM-AD1 Integrator/PP1 & PP2 Firmware Suite ATX motherboard SOC Consortium Course Material 63 ARM Integrator within an ATX PC Case Provided by NCTU SOC Consortium Course Material 64 Inside the Case Provided by NCTU SOC Consortium Course Material 65 Logic Module Config PLD Multi-ICE LEDs Switchs OSCs Trace Push B LA C CSR IntCntl Prototyping grid (16x17) Flash Xchecker/ Download ZBT SSRAM AHB IP AHB SSRM controller APB IP AHB/APB bridge EXPIM connector FPGA EXPA/EXPB connector LM Provided by NCTU SOC Consortium Course Material 66 Extension with Prototyping Grid You can use the prototyping grid to: − wire to off-board circuitry − mount connectors − mount small components Provided by NCTU SOC Consortium Course Material 67 ARM Integrator – One Configuration Reset controller SDRAM controller Clock generator Memory bus CSR CM ARM 7TDMI HDRA/HDRB connector FPGA System bus bridge Config PLD Multi-ICE SSRAM SSRAM controller LEDs Switchs OSCs Trace Push B LA C Multi-ICE Flash CSR Xchecker/ Download ZBT SSRAM AHB IP AHB SSRM controller APB IP AHB/APB bridge IntCntl Prototyping grid (16x17) EXPIM connector FPGA EXPA/EXPB connector LM 256MB SDRAM 3 PCI slots PCI PCI bridge PCI bridge controller External system Bus interface EBI CompatPCI Boot ROM SMC Clock PLL RTC osc. Interrupt controller (32MB) Flash CSR (512BK) SSRAM reset RTC 3 x timer/ counter Reset control Bridge Arbiter System bus GPIO Keyboard Mouse Serial 2 2xUART LEDs Peripheral bus FPGA AP SOC Consortium Course Material 68 System Memory Map 0xF000_0000 0xE000_0000 0xD000_0000 0xC000_0000 LM LM 4GB CM alias memory 0x8000_0000 0x2C00_0000 0xB000_0000 0x2800_0000 256MB SDRAM (CM 2) LM LM 256MB SDRAM (CM 3) 0x2400_0000 0xA000_0000 3GB 0x9000_0000 0x8000_0000 0x2000_0000 256MB SDRAM (CM 1) 256MB SDRAM (CM 0) 0x4000_0000 768MB EBI 0x2000_0000 ROM / RAM and peripherals 0x1000_0000 Peripheral regs 128MB 64MB Boot ROM 0x0000_0000 Reserved 1GB Flash 192MB Spare 1GB 0x3000_0000 SSRAM 256MB 0x0FFF_FFFF 2GB PCI CS 3 (EXPM) 512MB 256MB CM 0, 1, 2, 3 SOC Consortium Course Material GPIO LED/Switch Mouse Keyboard UART 1 UART 0 RTC Int control Counter/Timer EBI regs Sys control CM regs 69 Outline Introduction to SoC ARM-based SoC and Development Tools SoC Labs Available Lab modules in this course Summary SOC Consortium Course Material 70 SoC Labs Code Development Debugging and Evaluation Core Peripherals and Standard I/O OCB/VCI Virtual Prototyping Memory Controller ASIC logic Real-Time OS (RTOS) Case Study SOC Consortium Course Material 71 Code Development General/Machine-dependent guideline – Compiler optimization: • Space or speed (e.g, -Ospace or -Otime) • Debug or release version (e.g., -O0 ,-O1 or -O2) • Instruction scheduling – Coding style • • • • Parameter passing Loop termination Division operation and modulo arithmetic Variable type and size SOC Consortium Course Material 72 Data Layout Default Optimized char a; short b; char c; int d; char a; char c; short b; int d; a c pad b a pad c b d d occupies 12 bytes, with 4 bytes of padding occupies 8 bytes, without any padding Group variables of the same type together. This is the best way to ensure that as little padding data as possible is added by the compiler. SOC Consortium Course Material 73 Stack Usage C/C++ code uses the stack intensively. The stack is used to hold: – Return addresses for subroutines – Local arrays & structures To minimize stack usage: – Keep functions small (few variables, less spills) minimize the number of ‘live’ variables (i.e., those which contain useful data at each point in the function) – Avoid using large local structures or arrays (use malloc/free instead) – Avoid recursion SOC Consortium Course Material 74 Software Quality Measurement Memory requirement – Data type: volatile (RAM), non-volatile (ROM) – Memory performance: access speed, data width, size, and range Performance benchmarking – Harvard core • D-cycles, ID-cycles, I-cycles – Von Newman cores • N-cycles, S-cycles, I-cycles, C-cycles – Clock rate • Processor, external bus – Cache efficiency • Average memory access time = hit time + miss rate x miss penalty • Cache efficiency = core-cycles / total bus cycles SOC Consortium Course Material 75 Global Data Issues When declaring global variables in source code to be compiled with ARM software, three things are affected by the way you structure your code: – How much space the variables occupy at run time. This determines the size of RAM required for a program to run. The ARM compilers may insert padding bytes between variables, to ensure that they are properly aligned. – How much space the variables occupy in the image. This is one of the factors determining the size of ROM needed to hold a program. Some global variables which are not explicitly initialized in your program may nevertheless have their initial value (of zero, as defined by the C standard) stored in the image. – The size of the code needed to access the variables. Some data organizations require more code to access the data. As an extreme example, the smallest data size would be achieved if all variables were stored in suitably sized bit fields, but the code required to access them would be much larger. SOC Consortium Course Material 76 Debugger Functionality – Execution trace – Exam/modify program states • Memory • Registers (including PC) – Control of program execution • Run/Halt/Continue/Goto/Stepin • Break point: conditional, repeat count Issue: debug optimized code in source SOC Consortium Course Material 77 Concept of the Bus A group of lines shared for interconnection of the functional modules by a standard interface – E.g., ARM AMBA and IBM CoreConnect Interconnection structure – Point-to-point – On-chip bus – On-chip network • Network on Silicon • Network on Chip SOC Consortium Course Material 78 Bus Hierarchy The structure of multiple buses within a system, organized by bandwidth. Local processor bus – High processor-specific – Processor, cache, MMU, coprocessor System bus (backbone) – RISC processor, DSP, DMA (masters) – Memory, high resolution LCD peripheral Peripheral bus – Components with other design considerations (power, gate count, etc.) – Bridge is the only bus master SOC Consortium Course Material 79 Core Peripherals: Interrupt Schemes Polled Interrupt Vectored Interrupt SOC Consortium Course Material 80 Real Time OS A RTOS is an abstraction from hardware and software programming – Shorter development time – Less porting efforts – Better reusability Choosing a RTOS is important – High efforts when porting to a different OS – The chosen OS may have a high impact on the amount of resources needed SOC Consortium Course Material 81 RTOS: Functionalities Interrupt service Process (task) management – Scheduler – Synchronization mechanism • Inter-process communication (IPC) • Semaphores Memory management Service routine Device driver Protection SOC Consortium Course Material 82 Characteristics of a RTOS Multitasking – Non-preemptive vs. preemptive – Priority scheduling Real-time – Soft and hard real time requirements Speedy interrupt response Frequent Interrupts SOC Consortium Course Material 83 Memory Controller The International Technology Roadmap for Semiconductors (ITRS) shows memory already accounting for over 50% of a typical SoC, growing to 94% by the year 2014. Memory design – Size, ports, device number, and memory hierarchy – Application-specific behavior Memory power management SOC Consortium Course Material 84 Outline Introduction to SoC ARM-based SoC and Development Tools SoC Labs Available Lab modules in this course Summary SOC Consortium Course Material 85 Lab 1: Code Development Goal Steps – How to create an application using ARM Developer Suite (ADS) – How to change between ARM state and Thumb state when writing code for different instruction sets Principles – Processor’s organization – ARM/Thumb Procedure Call Standard (ATPCS) Guidance – Basic software development (tool chain) flow – ARM/Thumb Interworking Requirements and Exercises – See next slide Discussion – The advantages and disadvantages of ARM and Thumb instruction sets. – Flow diagram of this Lab – Preconfigured project stationery files SOC Consortium Course Material 86 Lab 1: Code Development (cont’) ARM/Thumb Interworking – Exercise 1: C/C++ for “Hello” program • Caller: Thumb • Callee: ARM – Exercise 2: Assembly for “SWAP” program, w/wo veneers • Caller: Thumb • Callee: ARM – Exercise 3: Mixed language for “SWAP” program, ATPCS for parameters passing • Caller: Thumb in Assembly • Callee: ARM in C/C++ SOC Consortium Course Material 87 Lab 2: Debugging and Evaluation Goal – A variety of debugging tasks and software quality evaluation • Debugging skills – Set breakpoints and watchpoints – Locate, examine and change the contents of variables, registers and memory • Skills to evaluate software quality – Memory requirement of the program – Profiling: Build up a picture of the percentage of time spent in each procedure. – Evaluate software performance prior to implement on hardware – Thought in this Lab the debugger target is ARMulator, but the skills can be applied to Multi-ICE/Angel with the ARM development board(s). SOC Consortium Course Material 88 Lab 2: Debugging and Evaluation (cont’) Principles – The Dhrystone Benchmark – CPU’s organization Guidance – Steps only Steps – Debugging skills – Memory requirement and Profiling – Virtual prototyping – Efficient C programming Requirements and Exercises – Optimize 8x8 inverse discrete cosine transform (IDCT) [1] according to ARM’s architecture. – Deliverables Discussion – Explain the approaches you apply to minimize the code size and enhance the performance of the lotto program according to ARM’s architecture. – Select or modify the algorithms of the code segments used in your program to fit to ARM's architecture. SOC Consortium Course Material 89 Lab 3: Core Peripherals and Std. I/O Goal Steps – Understand the HW/SW coordination • Memory-mapped device • Operation mechanism of polling and Timer/Interrupt • HAL – Understand available resource of ARM Integrator • semihosting – The same to that of code development Requirements and Exercises – Use timer to count the total data transfer time of several data references to SSRAM and SDRAM. Discussion Principles – Semihosting – Interrupt handler – Architecture of Timer and Interrupter controller – Compare the performance between using SSRAM and SDRAM. Guidance – Introduction to Important functions used in interrupt handler SOC Consortium Course Material 90 Lab 4: On-Chip Bus and VCI Requirements and Exercises Goal – To introduce the interface design conceptually. Study the communication between FPGA on logic module and ARM processor on core module. We will introduce the ARMB in detail. Principle – Overview of the AMBA specification – Introducing the AMBA AHB – AMBA AHB signal list – The Arm-based system overview Guide – We use a simple program to lead student understanding the ARMB. – To trace the hardware code and software code, indicate that software how to communicate with hardware using the ARMB interface. Discussion – If we want to design an accumulator (1,2,3…) , how could you do to implement it using the scratch code? – If we want to design a hardware using FPGA, how could you do to add your code to the scratch code and debugger it ? – To study the ARMB bus standard, try to design a simple ARMB interface. SOC Consortium Course Material 91 Lab 5: Virtual Prototyping Requirements and Exercises Goal – To introduce the prototyping using high-level hardware model. Study the communication and synchronization between hardware and software using ARM’s ARMulator. Principle – ARMulator hardware model – Synchronization between hardware and software – Memory-mapped hardware control interface – To trace the C/C++ code of the hardware model and the software code. – Understand the synchronization and communication scheme. – Being able to write simple drviers. Discussion – Compare the advantages/disadvantages of the synchronization schemes. Guide – We use a simple example to help student understand how to virtual prototype usingARMulator. SOC Consortium Course Material 92 Lab 6: Memory Controller Goal – Realize the principle of memory map and internal and external memory Principles – System memory map – Core Module Control Register – Core Module Memory Map Guidance Requirements and Exercises – Modify the memory usage example. Use timer to count the total access time of several data accessing the SSRAM and SDRAM. Compare the performance between using SSRAM and SDRAM. Discussion – We use a simple program to lead student understanding the memory. – Discuss the following items about Flash, RAM, and ROM. • Speed • Capacity • Internal /External SOC Consortium Course Material 93 Lab 7: ASIC Logic Requirements and Exercises Goal – HW/SW Co-verification using Rapid Prototyping Principles – RGB-to-YUV converting hardware module Discussion – Basics and work flow for prototyping with ARM Integrator – Target platform: AMBA AHB subsystem Guidance – Overview of examples used in the Steps Steps – Understand the files for the example designs and FPGA tool – Steps for synthesis with Xilinx Foundation 3.1i – In example 1, explain the differences between the Flash version and the FPGA one. – In example 1, explain how to move data from DRAM to registers in MYIP and how program access these registers. – In example2, draw the interconnect among the functional units and explain the relationships of those interconnect and functional units in AHB sub-system – Compare the differences of polling and interrupt mechanism SOC Consortium Course Material 94 Lab 8: Real-Time OS Goal Steps – A guide to use RTOS and port programs to it Principles – Basic concepts and capabilities of RTOS • Task, task scheduling & context switch • Resource management using Semaphore • Inter-process communication using Mailbox • Memory management – Coding guideline for a program running on the embedded RTOS – Setting up the ARMulator – Building µC/OS-II – Building Program with µC/OS-II – Porting Program to µC/OS-II Requirements and Exercises – Write an embedded software for ID checking engine and a front– end interface Discussion – What are the advantages and disadvantages of using RTOS in SoC design? Guidance SOC Consortium Course Material 95 Lab 9: Case design Goal Steps – Study how to use the ARMbased platform to implement JPEG system. In this chapter, we will describe the JPEG algorithm in detail. Principle – Detail of design method and corresponding algorithm Guidance – In this section, we will introduce the JPEG software file (.cpp) in detail. We will introduce the hardware module. – We divide our program into two parts: • Hardware • Software Requirements and Exercises – Try to understand the communication between the software part and hardware part. To check the computing result is correct. You can easily check that the output value from the FPGA on LM SOC Consortium Course Material 96 Lab 9: Case design (cont’) Discuss – We describe the decoder part algorithm on reference 3, try to implement it on ARMbased platform. You can divide to two parts: software & hardware. SOC Consortium Course Material 97 Summary To build SoC labs – Software tools • Code development\debug\evaluation (e.g. ARM Developer Suite) • Cell-based design EDA tools – Development boards, e.g., ARM Integrator • • • • Core Module: 7TDMI, 720T, 920T, etc Logic Module (Xilin XCV2000E, Altera LM-EP20K1000E) ASIC Development Platform (Integrator/AP AHB ) Multi-ICE Interface – Advanced labs: RTOS (e.g., µC/OS-II) Verification/Validation SOC Consortium Course Material 98 Summary The ARM has played a leading role in the opening of this era since its very small core size leaves more silicon resources available for the rest of the system functions. SoC labs are challenges to universities – Various expertise – Tight schedule for (M.S.) students SOC Consortium Course Material 99 Reference [1] http://twins.ee.nctu.edu.tw/courses/ip_core_02/index.html [2] ARM System-on-Chip Architecture by S.Furber, Addison Wesley Longman: ISBN 0-201-67519-6. [3] http://www.arm.com SOC Consortium Course Material 100 ...
View Full Document

Ask a homework question - tutors are online