La présentation est en train de télécharger. S'il vous plaît, attendez

La présentation est en train de télécharger. S'il vous plaît, attendez

Processeurs de Signaux et Images - 2 nde partie Cours n°2-3 Processeurs à usage universel, Processeurs d é di é s à un domaine d ’ applications (ASIPs)

Présentations similaires


Présentation au sujet: "Processeurs de Signaux et Images - 2 nde partie Cours n°2-3 Processeurs à usage universel, Processeurs d é di é s à un domaine d ’ applications (ASIPs)"— Transcription de la présentation:

1 Processeurs de Signaux et Images - 2 nde partie Cours n°2-3 Processeurs à usage universel, Processeurs d é di é s à un domaine d ’ applications (ASIPs) Amer Baghdadi Amer.Baghdadi@telecom-bretagne.eu T é l é com Bretagne – D é partement Electronique Master2 Recherche - STIP Signal, Télécoms, Image, Parole Novembre 2009

2 Processeurs de Signaux et Images - 2 nde partie page 2 Département Électronique - A. BAGHDADI Course outline  C2-C3 : General purpose processors  Architecture of General Purpose Processors  Datapath unit  Control unit  Execution cycles of an instruction  Architecture considerations  Advanced architecture concepts (Pipeline, superscalar and VLIW architectures, Caches, Princeton and Harvard)  Programming model  Application Specific Instruction-set Processors (ASIPs)  Introducing the lab work: design of an elementary processor  Reference book  EMBEDDED SYSTEM DESIGN: A Unified Hardware/Software Introduction, by F. Vahid (UCR) and T. Givargis (UCI).

3 Processeurs de Signaux et Images - 2 nde partie page 3 Département Électronique - A. BAGHDADI Introduction General-Purpose Processor  Processor designed for a variety of computation tasks  Low unit cost, in part because manufacturer spreads NRE over large numbers of units Motorola sold half a billion 68HC05 microcontrollers in 1996 alone  Carefully designed since higher NRE is acceptable Can yield good performance, size and power  Low NRE cost, short time-to-market/prototype, high flexibility User just writes software; no processor design  “microprocessor” – “micro” used when they were implemented on one or a few chips rather than entire rooms

4 Processeurs de Signaux et Images - 2 nde partie page 4 Département Électronique - A. BAGHDADI Basic Architecture Control unit and datapath  Note similarity to single- purpose processor Key differences  Datapath is general  Control unit doesn’t store the algorithm – the algorithm is “programmed” into the memory Processeur Unité de contrôleUnité opérative Mémoire E/S Commande IRPC Contrôleur UAL Registres État

5 Processeurs de Signaux et Images - 2 nde partie page 5 Département Électronique - A. BAGHDADI Datapath Operations Load  Read memory location into register Execute  Input certain registers through ALU, store back in register Store  Write register to memory location Processeur Unité opérative UAL Registres Mémoire E/S Commande Unité de contrôle IRPC Contrôleur État 10 … … +1 10 11 10 11

6 Processeurs de Signaux et Images - 2 nde partie page 6 Département Électronique - A. BAGHDADI Control Unit Control unit: configures the datapath operations  Sequence of desired operations (“instructions”) stored in memory – “program” Instruction cycle – broken into several sub-operations, each one clock cycle, e.g.:  Fetch: Get next instruction into IR  Decode: Determine what the instruction means  Fetch operands: Move data from memory to datapath register  Execute: Move data through the ALU  Store results: Write data from register to memory Processeur Unité de contrôle Unité opérative UAL Registres IRPC Contrôleur Mémoire E/S Commande État 10 … … 500 501 100 101 load R0, M[500] Inc R1, R0 store M[501], R1 102 R0R1

7 Processeurs de Signaux et Images - 2 nde partie page 7 Département Électronique - A. BAGHDADI Control Unit Sub-Operations Fetch Instruction  Get next instruction into IR  PC : program counter, always points to next instruction  IR : holds the fetched instruction Processeur Unité de contrôleUnité opérative UAL Registres IR PC Contrôleur Mémoire E/S Commande État 10 … … 500 501 100 101 load R0, M[500] Inc R1, R0 store M[501], R1 102 R0R1 100 load R0, M[500] Adresse

8 Processeurs de Signaux et Images - 2 nde partie page 8 Département Électronique - A. BAGHDADI Control Unit Sub-Operations Decode  Determine the instruction type Processeur Unité de contrôleUnité opérative UAL Registres IR PC Contrôleur Mémoire E/S Commande État 10 … … 500 501 load R0, M[500] Inc R1, R0 store M[501], R1 R0R1 100 load R0, M[500] 100 101 102

9 Processeurs de Signaux et Images - 2 nde partie page 9 Département Électronique - A. BAGHDADI Control Unit Sub-Operations Fetch operands  Move data from memory to datapath register Processeur Unité de contrôleUnité opérative UAL Registres IR PC Contrôleur Mémoire E/S Commande État 10 … … 500 501 100 101 load R0, M[500] Inc R1, R0 store M[501], R1 102 R0R1 100 load R0, M[500] 10

10 Processeurs de Signaux et Images - 2 nde partie page 10 Département Électronique - A. BAGHDADI Control Unit Sub-Operations Execute  Move data through the ALU  This particular instruction ( load R0, M[500] ) does nothing during this sub-operation Processeur Unité de contrôleUnité opérative UAL Registres IR PC Contrôleur Mémoire E/S Commande État 10 … … 500 501 100 101 load R0, M[500] Inc R1, R0 store M[501], R1 102 R0R1 100 load R0, M[500] 10

11 Processeurs de Signaux et Images - 2 nde partie page 11 Département Électronique - A. BAGHDADI Control Unit Sub-Operations Store  Write data from register to memory  This particular instruction ( load R0, M[500] ) does nothing during this sub-operation Processeur Unité de contrôleUnité opérative UAL Registres IR PC Contrôleur Mémoire E/S Commande État 10 … … 500 501 100 101 load R0, M[500] Inc R1, R0 store M[501], R1 102 R0R1 100 load R0, M[500] 10

12 Processeurs de Signaux et Images - 2 nde partie page 12 Département Électronique - A. BAGHDADI Instruction Cycles PC=100 Fetch oprends Exec. Store results clk Fetch inst. Decode Processeur Unité de contrôleUnité opérative UAL Registres IR PC Mémoire E/S Commande État 10 … … 500 501 100 101 load R0, M[500] Inc R1, R0 store M[501], R1 102 R0R1 100 load R0, M[500] Contrôleur 10

13 Processeurs de Signaux et Images - 2 nde partie page 13 Département Électronique - A. BAGHDADI Instruction Cycles Processeur Unité de contrôleUnité opérative UAL Registres IR PC Mémoire E/S Commande État 10 … … 500 501 100 101 load R0, M[500] Inc R1, R0 store M[501], R1 102 R0R1 101 Inc R1, R0 10 PC=101 Fetch oprends Exec. Store results clk Decode PC=100 Fetch oprends Exec. Store results clk Decode Fetch inst. Contrôleur +1 11 10

14 Processeurs de Signaux et Images - 2 nde partie page 14 Département Électronique - A. BAGHDADI Instruction Cycles PC=100 Fetch oprends Exec. Store results clk Fetch inst. Decode Processeur Unité de contrôleUnité opérative UAL Registres IR PC Mémoire E/S Commande État 10 … … 500 501 100 101 load R0, M[500] Inc R1, R0 store M[501], R1 102 R0R1 102 store M[501], R1 10 PC=102 Fetch oprends Exec. Store results clk Decode 11 PC=101 Fetch oprends Exec. Store results clk Decode Fetch inst. Contrôleur 11

15 Processeurs de Signaux et Images - 2 nde partie page 15 Département Électronique - A. BAGHDADI Architectural Considerations N-bit processor  N-bit ALU, registers, buses, memory data interface  Embedded: 8-bit, 16-bit, 32-bit common  Desktop/servers: 32-bit, even 64 PC size determines address space Processor Control unitDatapath unit UAL Registers IRPC Controller Memory E/S Commande Status

16 Processeurs de Signaux et Images - 2 nde partie page 16 Département Électronique - A. BAGHDADI Architectural Considerations Clock frequency  Inverse of clock period  Must be longer than longest register to register delay in entire processor  Memory access is often the longest Processor Control unitDatapath unit UAL Registers IRPC Controller Memory E/S Commande Status

17 Processeurs de Signaux et Images - 2 nde partie page 17 Département Électronique - A. BAGHDADI Architectural Considerations Performance can be improved through:  A faster clock (technology limitations!)  Pipeline: dividing the instruction execution in several stages and superposing them  Multiple ALUs to execute multiple instruction flows (concurrently) Superscalar and V LIW architectures

18 Processeurs de Signaux et Images - 2 nde partie page 18 Département Électronique - A. BAGHDADI 12345678 Fetch-inst. Decode Fetch ops. Execute Store res. Wash Dry Pipelined time Pipelined pipelined instruction execution pipelined dish cleaning 123 12 4 3 5 4 6 5 7 6 8 7 8 12345678 12345678 12345678 12345678 12345678 1 st instruction Non-pipelined non-pipelined dish cleaning 12345678 time Two resources available Pipelining: Increasing Instruction Throughput

19 Processeurs de Signaux et Images - 2 nde partie page 19 Département Électronique - A. BAGHDADI Superscalar architecture Superscalar  Scalar operation: being carried out on one or two numbers (contrary to a vector or matrix operation)  Fetches instructions in packets Static scheduling (at compilation time) or dynamic (at execution time) In case of dynamic scheduling : requires extensive hardware to detect independent instructions multiple instructions Registers Cache/ memory Fetch FU Multiple functional units (FU) Decode& Ordre Sequential flow of instructions

20 Processeurs de Signaux et Images - 2 nde partie page 20 Département Électronique - A. BAGHDADI VLIW architecture VLIW (Very Long Instruction Word) : long instruction (128-1024 bits) composed of several independents operations (rather than one)  Equivalent to a superscalar architecture with a static scheduling  Currently growing in popularity one multi-operation instruction Registers Cache/ memory Fetch FU Plusieurs unités fonctionnelles (UF)

21 Processeurs de Signaux et Images - 2 nde partie page 21 Département Électronique - A. BAGHDADI Two memory architectures Processor Program memory Data memory Harvard Processor Memory (program and data) Princeton (Von Neumann) Princeton ( Von Neumann )  Fewer memory wires  Simple implementation Harvard  Simultaneous program and data memory access (Fetch inst. and Fetch oprd)

22 Processeurs de Signaux et Images - 2 nde partie page 22 Département Électronique - A. BAGHDADI Cache Memory Memory access may be slow Cache is small but fast memory close to processor  Holds copy of part of memory  Hits and misses Processor Memory Cache Fast/expensive technology, usually on the same chip Slower/cheaper technology, usually on a different chip

23 Processeurs de Signaux et Images - 2 nde partie page 23 Département Électronique - A. BAGHDADI Programmer’s View Programmer doesn’t need detailed understanding of architecture  Instead, needs to know what instructions can be executed Two levels of instructions:  Assembly level  Structured languages (C, C++, Java, etc.) Most development today done using structured languages  But, some assembly level programming may still be necessary  Drivers: portion of program that communicates with and/or controls (drives) another device Often have detailed timing considerations, extensive bit manipulation Assembly level may be best for these

24 Processeurs de Signaux et Images - 2 nde partie page 24 Département Électronique - A. BAGHDADI Assembly-Level Instructions op.codeoperand1operand2 op.codeoperand1operand2 op.codeoperand1operand2 op.codeoperand1operand2... Instruction 1 Instruction 2 Instruction 3 Instruction 4 Instruction Set  Defines the legal set of instructions for that processor Data transfer: memory/register, register/register, I/O, etc. Arithmetic/logical: move register through ALU and back Branches: determine next PC value when not just PC+1

25 Processeurs de Signaux et Images - 2 nde partie page 25 Département Électronique - A. BAGHDADI ADD Rn, Rm 0100RmRn Rn = Rn + Rm MOV Rn, #immed. 0011Rnimmédiat Rn = immediate MOV Rn, direct Assembly instruct. 0000Rn First byte direct Second byte Rn = M(direct) Operation JZ Rn, relative 0110Rnrelatif PC = PC + relative ( if Rn = 0) SUB Rn, Rm 0101Rm Rn = Rn - Rm Rn MOV direct, Rn 0001Rndirect M(direct) = Rn MOV @Rn, Rm 0010Rn Rm M(Rn) = Rm op.code operands A Simple Instruction Set

26 Processeurs de Signaux et Images - 2 nde partie page 26 Département Électronique - A. BAGHDADI Addressing Modes Operand field Register-direct Register-indirect Direct Indirect Immediate data Register Address Memory Address Addressing Modes Register-file contents Memory contents data Memory Addressdata Memory Address

27 Processeurs de Signaux et Images - 2 nde partie page 27 Département Électronique - A. BAGHDADI Sample Program int total = 0; for (int i=10; i!=0; i -- ) total += i; // next instructions... C Program JZ R1, Next; // Branch if i=0Loop: Next: // next instructions... Equivalent assembly program MOV R0, #0; // total = 00 MOV R1, #10; // i = 101 MOV R2, #1; // constant 12 MOV R3, #0; // constant 03 ADD R0, R1; // total += i5 SUB R1, R2; // i--6 JZ R3, Loop; // Branch7

28 Processeurs de Signaux et Images - 2 nde partie page 28 Département Électronique - A. BAGHDADI Programmer Considerations Program and data memory space  Embedded processors often very limited e.g., 64 Kbytes program, 256 bytes of RAM (expandable) Registers: How many are there?  Only a direct concern for assembly-level programmers I/O  How communicate with external signals? Interrupts

29 Processeurs de Signaux et Images - 2 nde partie page 29 Département Électronique - A. BAGHDADI General-purpose processors  Sometimes too general to be effective in demanding application e.g., video processing – requires huge video buffers and operations on large arrays of data, inefficient on a GPP Single-purpose processor  High NRE, not programmable ASIPs – targeted to a particular domain  Contain architectural features specific to that domain e.g., embedded control, digital signal processing, video processing, network processing, telecommunications, etc.  Still programmable Application-Specific Instruction-set Processors (ASIPs)

30 Processeurs de Signaux et Images - 2 nde partie page 30 Département Électronique - A. BAGHDADI A Common ASIP: the DSP (Digital Signal Processor) For signal processing applications  Large amounts of digitized data, often streaming  Data transformations must be applied fast  e.g., cell-phone voice filter, digital TV, music synthesizer DSP features  Several instruction execution units  Multiple-accumulate single-cycle instruction, other instrs.  Efficient vector operations – e.g., add two arrays Vector ALUs, loop buffers, etc.

31 Processeurs de Signaux et Images - 2 nde partie page 31 Département Électronique - A. BAGHDADI Example: TMS320C67x core

32 Processeurs de Signaux et Images - 2 nde partie page 32 Département Électronique - A. BAGHDADI Another Common ASIP: Microcontroller For embedded control applications  Reading sensors, setting actuators  Mostly dealing with events (bits): data is present, but not in huge amounts  e.g., VCR, disk drive, digital camera (assuming SPP for image compression), washing machine, microwave oven Microcontroller features  On-chip peripherals Timers, analog-digital converters, serial communication, etc. Tightly integrated for programmer, typically part of register space  On-chip program and data memory  Direct programmer access to many of the chip’s pins  Specialized instructions for bit-manipulation and other low-level operations

33 Processeurs de Signaux et Images - 2 nde partie page 33 Département Électronique - A. BAGHDADI Example: Sharp LH77790B microcontroller

34 Processeurs de Signaux et Images - 2 nde partie page 34 Département Électronique - A. BAGHDADI In the past, microprocessors were acquired as chips Today, we increasingly acquire a processor as Intellectual Property (IP)  e.g., synthesizable VHDL model Opportunity to add a custom datapath hardware and a few custom instructions, or delete a few instructions  Can have significant performance, power and size impacts  Problem: need compiler/debugger for customized ASIP Remember, most development uses structured languages One solution: automatic compiler/debugger generation  e.g., www.tensillica.comwww.tensillica.com Another solution: retargettable compilers  e.g., www.improvsys.com (customized VLIW architectures)www.improvsys.com Trend: Even More Customized ASIPs

35 Processeurs de Signaux et Images - 2 nde partie page 35 Département Électronique - A. BAGHDADI Selecting a Microprocessor Criteria  Technical: speed, power, size, cost  Other: development environment, prior expertise, licensing, etc. Speed: how evaluate a processor’s speed?  Clock speed – but instructions per cycle may differ  Instructions per second – but work per instr. may differ  Dhrystone: Synthetic benchmark, developed in 1984. Dhrystones/sec. MIPS: 1 MIPS = 1757 Dhrystones per second. D-MIPS commonly used today.  So, 750 MIPS = 750*1757 = 1,317,750 Dhrystones per second  SPEC: set of more realistic benchmarks, but oriented to desktops  EEMBC – EDN Embedded Benchmark Consortium, www.eembc.org Suites of benchmarks: automotive, consumer electronics, networking, office automation, telecommunications

36 Processeurs de Signaux et Images - 2 nde partie page 36 Département Électronique - A. BAGHDADI Design of an elementary (yet general purpose!) processor

37 Processeurs de Signaux et Images - 2 nde partie page 37 Département Électronique - A. BAGHDADI 8-bits general purpose processor Based on accumulator register named ACCU (8 bits) Four instructions Each instruction is coded on 8 bits. Two bits to code the type of the operation (op.code) and 6 bits to code the operand or the address of the operand in the memory according to the type of the instruction. MnemonicInstruction codingDescription NOR00AAAAAA ACCU = ACCU NOR Mem[AAAAAA] ADD01AAAAAA ACCU = ACCU + Mem[AAAAAA], update the Carry STA10AAAAAA Mem[AAAAAA] = ACCU JCC11DDDDDD If Carry = 0  PC = DDDDDD Else clear Carry (Carry=0) [Instruction-set source : http://www.tuhh.de/~setb0209/cpu/ by T. Böscke] Presentation of the elementary processor

38 Processeurs de Signaux et Images - 2 nde partie page 38 Département Électronique - A. BAGHDADI 000000 : 00001000 (0x08) NOR 0b001000 ; ACCU = ACCU NOR M[001000] 000001 : 01000111 (0x47) ADD 0b000111 ; ACCU = ACCU + M[000111] (Carry) 000010 : 10000110 (0x86) STA 0b000110 ; M[000110] = ACCU 000011 : 11000100 (0xC4) JCC 0b000100 ; if Carry = 0 then PC = 000100 else clear Carry 000100 : 11000100 (0xC4) JCC 0b000100 ; PC = 000100 (Carry has been cleared) 000101 : 00000000 (0x00) 000110 : 00000000 (0x00) 000111 : 01111110 (0x7E) 001000 : 11111111 (0xFF) 001001 : 00000000 (0x00) Mem Adr Binary (hexa) Assembler Comments content instructions … … Data… … … 111111 : 00000000 (0x00) Example of a test program

39 Processeurs de Signaux et Images - 2 nde partie page 39 Département Électronique - A. BAGHDADI Processor design (1/3) Consider the basic architecture model (slide n° 4) Consider the instruction set, the number of registers, and the (eventual) imposed architectural specifications/constraints Using the design methodology presented previously (course n° 1)

40 Processeurs de Signaux et Images - 2 nde partie page 40 Département Électronique - A. BAGHDADI Processor design (2/3) Algorithm – FSMD Clear PC & IR & Carry & Registers; while (1) { Fetch Inst (read one instruction); Decode the instruction; if ( CodeOp=00 or CodeOp=01 ) { Fetch Operand (read the operand to R1); if CodeOp=00 Execute NOR (ACCU = ACCU NOR M[AAAAAA]); else Execute ADD (ACCU = ACCU + M[AAAAAA] Update the Carry); } else if CodeOp=10 { Execute STA (Mem[AAAAAA] = ACCU); } else { Execute JCC (if Carry=0 PC=DDDDDD else Carry=0); }

41 Processeurs de Signaux et Images - 2 nde partie page 41 Département Électronique - A. BAGHDADI Processor design (3/3) selUAL UAL NOR ou ADD ACCU selADR incPC [7:0] ldC PC IR [7:0] clrPC ldPC Contrôle (FSM) [7:0] [7:6] [5:0] Mux [5:0] ldIR [7:0] DataIn DataOut Adr ldACCU 01 CodeOp Rst R1 ldR1 C clrC C enMweM DatapathController Memory

42 Processeurs de Signaux et Images - 2 nde partie page 42 Département Électronique - A. BAGHDADI It is time to practice CAD tools (Computer-aided design) will be used to implement the architecture which you will propose for this processor


Télécharger ppt "Processeurs de Signaux et Images - 2 nde partie Cours n°2-3 Processeurs à usage universel, Processeurs d é di é s à un domaine d ’ applications (ASIPs)"

Présentations similaires


Annonces Google