Télécharger la présentation
La présentation est en train de télécharger. S'il vous plaît, attendez
Publié parAlizée Beauchamp Modifié depuis plus de 7 années
1
Processeurs de Signaux et Images - 2 nde partie Cours n°2-3 Processeurs à usage universel, Processeurs d é di é s à un domaine d ’ applications (ASIPs) Amer Baghdadi Amer.Baghdadi@telecom-bretagne.eu T é l é com Bretagne – D é partement Electronique Master2 Recherche - STIP Signal, Télécoms, Image, Parole Novembre 2009
2
Processeurs de Signaux et Images - 2 nde partie page 2 Département Électronique - A. BAGHDADI Course outline C2-C3 : General purpose processors Architecture of General Purpose Processors Datapath unit Control unit Execution cycles of an instruction Architecture considerations Advanced architecture concepts (Pipeline, superscalar and VLIW architectures, Caches, Princeton and Harvard) Programming model Application Specific Instruction-set Processors (ASIPs) Introducing the lab work: design of an elementary processor Reference book EMBEDDED SYSTEM DESIGN: A Unified Hardware/Software Introduction, by F. Vahid (UCR) and T. Givargis (UCI).
3
Processeurs de Signaux et Images - 2 nde partie page 3 Département Électronique - A. BAGHDADI Introduction General-Purpose Processor Processor designed for a variety of computation tasks Low unit cost, in part because manufacturer spreads NRE over large numbers of units Motorola sold half a billion 68HC05 microcontrollers in 1996 alone Carefully designed since higher NRE is acceptable Can yield good performance, size and power Low NRE cost, short time-to-market/prototype, high flexibility User just writes software; no processor design “microprocessor” – “micro” used when they were implemented on one or a few chips rather than entire rooms
4
Processeurs de Signaux et Images - 2 nde partie page 4 Département Électronique - A. BAGHDADI Basic Architecture Control unit and datapath Note similarity to single- purpose processor Key differences Datapath is general Control unit doesn’t store the algorithm – the algorithm is “programmed” into the memory Processeur Unité de contrôleUnité opérative Mémoire E/S Commande IRPC Contrôleur UAL Registres État
5
Processeurs de Signaux et Images - 2 nde partie page 5 Département Électronique - A. BAGHDADI Datapath Operations Load Read memory location into register Execute Input certain registers through ALU, store back in register Store Write register to memory location Processeur Unité opérative UAL Registres Mémoire E/S Commande Unité de contrôle IRPC Contrôleur État 10 … … +1 10 11 10 11
6
Processeurs de Signaux et Images - 2 nde partie page 6 Département Électronique - A. BAGHDADI Control Unit Control unit: configures the datapath operations Sequence of desired operations (“instructions”) stored in memory – “program” Instruction cycle – broken into several sub-operations, each one clock cycle, e.g.: Fetch: Get next instruction into IR Decode: Determine what the instruction means Fetch operands: Move data from memory to datapath register Execute: Move data through the ALU Store results: Write data from register to memory Processeur Unité de contrôle Unité opérative UAL Registres IRPC Contrôleur Mémoire E/S Commande État 10 … … 500 501 100 101 load R0, M[500] Inc R1, R0 store M[501], R1 102 R0R1
7
Processeurs de Signaux et Images - 2 nde partie page 7 Département Électronique - A. BAGHDADI Control Unit Sub-Operations Fetch Instruction Get next instruction into IR PC : program counter, always points to next instruction IR : holds the fetched instruction Processeur Unité de contrôleUnité opérative UAL Registres IR PC Contrôleur Mémoire E/S Commande État 10 … … 500 501 100 101 load R0, M[500] Inc R1, R0 store M[501], R1 102 R0R1 100 load R0, M[500] Adresse
8
Processeurs de Signaux et Images - 2 nde partie page 8 Département Électronique - A. BAGHDADI Control Unit Sub-Operations Decode Determine the instruction type Processeur Unité de contrôleUnité opérative UAL Registres IR PC Contrôleur Mémoire E/S Commande État 10 … … 500 501 load R0, M[500] Inc R1, R0 store M[501], R1 R0R1 100 load R0, M[500] 100 101 102
9
Processeurs de Signaux et Images - 2 nde partie page 9 Département Électronique - A. BAGHDADI Control Unit Sub-Operations Fetch operands Move data from memory to datapath register Processeur Unité de contrôleUnité opérative UAL Registres IR PC Contrôleur Mémoire E/S Commande État 10 … … 500 501 100 101 load R0, M[500] Inc R1, R0 store M[501], R1 102 R0R1 100 load R0, M[500] 10
10
Processeurs de Signaux et Images - 2 nde partie page 10 Département Électronique - A. BAGHDADI Control Unit Sub-Operations Execute Move data through the ALU This particular instruction ( load R0, M[500] ) does nothing during this sub-operation Processeur Unité de contrôleUnité opérative UAL Registres IR PC Contrôleur Mémoire E/S Commande État 10 … … 500 501 100 101 load R0, M[500] Inc R1, R0 store M[501], R1 102 R0R1 100 load R0, M[500] 10
11
Processeurs de Signaux et Images - 2 nde partie page 11 Département Électronique - A. BAGHDADI Control Unit Sub-Operations Store Write data from register to memory This particular instruction ( load R0, M[500] ) does nothing during this sub-operation Processeur Unité de contrôleUnité opérative UAL Registres IR PC Contrôleur Mémoire E/S Commande État 10 … … 500 501 100 101 load R0, M[500] Inc R1, R0 store M[501], R1 102 R0R1 100 load R0, M[500] 10
12
Processeurs de Signaux et Images - 2 nde partie page 12 Département Électronique - A. BAGHDADI Instruction Cycles PC=100 Fetch oprends Exec. Store results clk Fetch inst. Decode Processeur Unité de contrôleUnité opérative UAL Registres IR PC Mémoire E/S Commande État 10 … … 500 501 100 101 load R0, M[500] Inc R1, R0 store M[501], R1 102 R0R1 100 load R0, M[500] Contrôleur 10
13
Processeurs de Signaux et Images - 2 nde partie page 13 Département Électronique - A. BAGHDADI Instruction Cycles Processeur Unité de contrôleUnité opérative UAL Registres IR PC Mémoire E/S Commande État 10 … … 500 501 100 101 load R0, M[500] Inc R1, R0 store M[501], R1 102 R0R1 101 Inc R1, R0 10 PC=101 Fetch oprends Exec. Store results clk Decode PC=100 Fetch oprends Exec. Store results clk Decode Fetch inst. Contrôleur +1 11 10
14
Processeurs de Signaux et Images - 2 nde partie page 14 Département Électronique - A. BAGHDADI Instruction Cycles PC=100 Fetch oprends Exec. Store results clk Fetch inst. Decode Processeur Unité de contrôleUnité opérative UAL Registres IR PC Mémoire E/S Commande État 10 … … 500 501 100 101 load R0, M[500] Inc R1, R0 store M[501], R1 102 R0R1 102 store M[501], R1 10 PC=102 Fetch oprends Exec. Store results clk Decode 11 PC=101 Fetch oprends Exec. Store results clk Decode Fetch inst. Contrôleur 11
15
Processeurs de Signaux et Images - 2 nde partie page 15 Département Électronique - A. BAGHDADI Architectural Considerations N-bit processor N-bit ALU, registers, buses, memory data interface Embedded: 8-bit, 16-bit, 32-bit common Desktop/servers: 32-bit, even 64 PC size determines address space Processor Control unitDatapath unit UAL Registers IRPC Controller Memory E/S Commande Status
16
Processeurs de Signaux et Images - 2 nde partie page 16 Département Électronique - A. BAGHDADI Architectural Considerations Clock frequency Inverse of clock period Must be longer than longest register to register delay in entire processor Memory access is often the longest Processor Control unitDatapath unit UAL Registers IRPC Controller Memory E/S Commande Status
17
Processeurs de Signaux et Images - 2 nde partie page 17 Département Électronique - A. BAGHDADI Architectural Considerations Performance can be improved through: A faster clock (technology limitations!) Pipeline: dividing the instruction execution in several stages and superposing them Multiple ALUs to execute multiple instruction flows (concurrently) Superscalar and V LIW architectures
18
Processeurs de Signaux et Images - 2 nde partie page 18 Département Électronique - A. BAGHDADI 12345678 Fetch-inst. Decode Fetch ops. Execute Store res. Wash Dry Pipelined time Pipelined pipelined instruction execution pipelined dish cleaning 123 12 4 3 5 4 6 5 7 6 8 7 8 12345678 12345678 12345678 12345678 12345678 1 st instruction Non-pipelined non-pipelined dish cleaning 12345678 time Two resources available Pipelining: Increasing Instruction Throughput
19
Processeurs de Signaux et Images - 2 nde partie page 19 Département Électronique - A. BAGHDADI Superscalar architecture Superscalar Scalar operation: being carried out on one or two numbers (contrary to a vector or matrix operation) Fetches instructions in packets Static scheduling (at compilation time) or dynamic (at execution time) In case of dynamic scheduling : requires extensive hardware to detect independent instructions multiple instructions Registers Cache/ memory Fetch FU Multiple functional units (FU) Decode& Ordre Sequential flow of instructions
20
Processeurs de Signaux et Images - 2 nde partie page 20 Département Électronique - A. BAGHDADI VLIW architecture VLIW (Very Long Instruction Word) : long instruction (128-1024 bits) composed of several independents operations (rather than one) Equivalent to a superscalar architecture with a static scheduling Currently growing in popularity one multi-operation instruction Registers Cache/ memory Fetch FU Plusieurs unités fonctionnelles (UF)
21
Processeurs de Signaux et Images - 2 nde partie page 21 Département Électronique - A. BAGHDADI Two memory architectures Processor Program memory Data memory Harvard Processor Memory (program and data) Princeton (Von Neumann) Princeton ( Von Neumann ) Fewer memory wires Simple implementation Harvard Simultaneous program and data memory access (Fetch inst. and Fetch oprd)
22
Processeurs de Signaux et Images - 2 nde partie page 22 Département Électronique - A. BAGHDADI Cache Memory Memory access may be slow Cache is small but fast memory close to processor Holds copy of part of memory Hits and misses Processor Memory Cache Fast/expensive technology, usually on the same chip Slower/cheaper technology, usually on a different chip
23
Processeurs de Signaux et Images - 2 nde partie page 23 Département Électronique - A. BAGHDADI Programmer’s View Programmer doesn’t need detailed understanding of architecture Instead, needs to know what instructions can be executed Two levels of instructions: Assembly level Structured languages (C, C++, Java, etc.) Most development today done using structured languages But, some assembly level programming may still be necessary Drivers: portion of program that communicates with and/or controls (drives) another device Often have detailed timing considerations, extensive bit manipulation Assembly level may be best for these
24
Processeurs de Signaux et Images - 2 nde partie page 24 Département Électronique - A. BAGHDADI Assembly-Level Instructions op.codeoperand1operand2 op.codeoperand1operand2 op.codeoperand1operand2 op.codeoperand1operand2... Instruction 1 Instruction 2 Instruction 3 Instruction 4 Instruction Set Defines the legal set of instructions for that processor Data transfer: memory/register, register/register, I/O, etc. Arithmetic/logical: move register through ALU and back Branches: determine next PC value when not just PC+1
25
Processeurs de Signaux et Images - 2 nde partie page 25 Département Électronique - A. BAGHDADI ADD Rn, Rm 0100RmRn Rn = Rn + Rm MOV Rn, #immed. 0011Rnimmédiat Rn = immediate MOV Rn, direct Assembly instruct. 0000Rn First byte direct Second byte Rn = M(direct) Operation JZ Rn, relative 0110Rnrelatif PC = PC + relative ( if Rn = 0) SUB Rn, Rm 0101Rm Rn = Rn - Rm Rn MOV direct, Rn 0001Rndirect M(direct) = Rn MOV @Rn, Rm 0010Rn Rm M(Rn) = Rm op.code operands A Simple Instruction Set
26
Processeurs de Signaux et Images - 2 nde partie page 26 Département Électronique - A. BAGHDADI Addressing Modes Operand field Register-direct Register-indirect Direct Indirect Immediate data Register Address Memory Address Addressing Modes Register-file contents Memory contents data Memory Addressdata Memory Address
27
Processeurs de Signaux et Images - 2 nde partie page 27 Département Électronique - A. BAGHDADI Sample Program int total = 0; for (int i=10; i!=0; i -- ) total += i; // next instructions... C Program JZ R1, Next; // Branch if i=0Loop: Next: // next instructions... Equivalent assembly program MOV R0, #0; // total = 00 MOV R1, #10; // i = 101 MOV R2, #1; // constant 12 MOV R3, #0; // constant 03 ADD R0, R1; // total += i5 SUB R1, R2; // i--6 JZ R3, Loop; // Branch7
28
Processeurs de Signaux et Images - 2 nde partie page 28 Département Électronique - A. BAGHDADI Programmer Considerations Program and data memory space Embedded processors often very limited e.g., 64 Kbytes program, 256 bytes of RAM (expandable) Registers: How many are there? Only a direct concern for assembly-level programmers I/O How communicate with external signals? Interrupts
29
Processeurs de Signaux et Images - 2 nde partie page 29 Département Électronique - A. BAGHDADI General-purpose processors Sometimes too general to be effective in demanding application e.g., video processing – requires huge video buffers and operations on large arrays of data, inefficient on a GPP Single-purpose processor High NRE, not programmable ASIPs – targeted to a particular domain Contain architectural features specific to that domain e.g., embedded control, digital signal processing, video processing, network processing, telecommunications, etc. Still programmable Application-Specific Instruction-set Processors (ASIPs)
30
Processeurs de Signaux et Images - 2 nde partie page 30 Département Électronique - A. BAGHDADI A Common ASIP: the DSP (Digital Signal Processor) For signal processing applications Large amounts of digitized data, often streaming Data transformations must be applied fast e.g., cell-phone voice filter, digital TV, music synthesizer DSP features Several instruction execution units Multiple-accumulate single-cycle instruction, other instrs. Efficient vector operations – e.g., add two arrays Vector ALUs, loop buffers, etc.
31
Processeurs de Signaux et Images - 2 nde partie page 31 Département Électronique - A. BAGHDADI Example: TMS320C67x core
32
Processeurs de Signaux et Images - 2 nde partie page 32 Département Électronique - A. BAGHDADI Another Common ASIP: Microcontroller For embedded control applications Reading sensors, setting actuators Mostly dealing with events (bits): data is present, but not in huge amounts e.g., VCR, disk drive, digital camera (assuming SPP for image compression), washing machine, microwave oven Microcontroller features On-chip peripherals Timers, analog-digital converters, serial communication, etc. Tightly integrated for programmer, typically part of register space On-chip program and data memory Direct programmer access to many of the chip’s pins Specialized instructions for bit-manipulation and other low-level operations
33
Processeurs de Signaux et Images - 2 nde partie page 33 Département Électronique - A. BAGHDADI Example: Sharp LH77790B microcontroller
34
Processeurs de Signaux et Images - 2 nde partie page 34 Département Électronique - A. BAGHDADI In the past, microprocessors were acquired as chips Today, we increasingly acquire a processor as Intellectual Property (IP) e.g., synthesizable VHDL model Opportunity to add a custom datapath hardware and a few custom instructions, or delete a few instructions Can have significant performance, power and size impacts Problem: need compiler/debugger for customized ASIP Remember, most development uses structured languages One solution: automatic compiler/debugger generation e.g., www.tensillica.comwww.tensillica.com Another solution: retargettable compilers e.g., www.improvsys.com (customized VLIW architectures)www.improvsys.com Trend: Even More Customized ASIPs
35
Processeurs de Signaux et Images - 2 nde partie page 35 Département Électronique - A. BAGHDADI Selecting a Microprocessor Criteria Technical: speed, power, size, cost Other: development environment, prior expertise, licensing, etc. Speed: how evaluate a processor’s speed? Clock speed – but instructions per cycle may differ Instructions per second – but work per instr. may differ Dhrystone: Synthetic benchmark, developed in 1984. Dhrystones/sec. MIPS: 1 MIPS = 1757 Dhrystones per second. D-MIPS commonly used today. So, 750 MIPS = 750*1757 = 1,317,750 Dhrystones per second SPEC: set of more realistic benchmarks, but oriented to desktops EEMBC – EDN Embedded Benchmark Consortium, www.eembc.org Suites of benchmarks: automotive, consumer electronics, networking, office automation, telecommunications
36
Processeurs de Signaux et Images - 2 nde partie page 36 Département Électronique - A. BAGHDADI Design of an elementary (yet general purpose!) processor
37
Processeurs de Signaux et Images - 2 nde partie page 37 Département Électronique - A. BAGHDADI 8-bits general purpose processor Based on accumulator register named ACCU (8 bits) Four instructions Each instruction is coded on 8 bits. Two bits to code the type of the operation (op.code) and 6 bits to code the operand or the address of the operand in the memory according to the type of the instruction. MnemonicInstruction codingDescription NOR00AAAAAA ACCU = ACCU NOR Mem[AAAAAA] ADD01AAAAAA ACCU = ACCU + Mem[AAAAAA], update the Carry STA10AAAAAA Mem[AAAAAA] = ACCU JCC11DDDDDD If Carry = 0 PC = DDDDDD Else clear Carry (Carry=0) [Instruction-set source : http://www.tuhh.de/~setb0209/cpu/ by T. Böscke] Presentation of the elementary processor
38
Processeurs de Signaux et Images - 2 nde partie page 38 Département Électronique - A. BAGHDADI 000000 : 00001000 (0x08) NOR 0b001000 ; ACCU = ACCU NOR M[001000] 000001 : 01000111 (0x47) ADD 0b000111 ; ACCU = ACCU + M[000111] (Carry) 000010 : 10000110 (0x86) STA 0b000110 ; M[000110] = ACCU 000011 : 11000100 (0xC4) JCC 0b000100 ; if Carry = 0 then PC = 000100 else clear Carry 000100 : 11000100 (0xC4) JCC 0b000100 ; PC = 000100 (Carry has been cleared) 000101 : 00000000 (0x00) 000110 : 00000000 (0x00) 000111 : 01111110 (0x7E) 001000 : 11111111 (0xFF) 001001 : 00000000 (0x00) Mem Adr Binary (hexa) Assembler Comments content instructions … … Data… … … 111111 : 00000000 (0x00) Example of a test program
39
Processeurs de Signaux et Images - 2 nde partie page 39 Département Électronique - A. BAGHDADI Processor design (1/3) Consider the basic architecture model (slide n° 4) Consider the instruction set, the number of registers, and the (eventual) imposed architectural specifications/constraints Using the design methodology presented previously (course n° 1)
40
Processeurs de Signaux et Images - 2 nde partie page 40 Département Électronique - A. BAGHDADI Processor design (2/3) Algorithm – FSMD Clear PC & IR & Carry & Registers; while (1) { Fetch Inst (read one instruction); Decode the instruction; if ( CodeOp=00 or CodeOp=01 ) { Fetch Operand (read the operand to R1); if CodeOp=00 Execute NOR (ACCU = ACCU NOR M[AAAAAA]); else Execute ADD (ACCU = ACCU + M[AAAAAA] Update the Carry); } else if CodeOp=10 { Execute STA (Mem[AAAAAA] = ACCU); } else { Execute JCC (if Carry=0 PC=DDDDDD else Carry=0); }
41
Processeurs de Signaux et Images - 2 nde partie page 41 Département Électronique - A. BAGHDADI Processor design (3/3) selUAL UAL NOR ou ADD ACCU selADR incPC [7:0] ldC PC IR [7:0] clrPC ldPC Contrôle (FSM) [7:0] [7:6] [5:0] Mux [5:0] ldIR [7:0] DataIn DataOut Adr ldACCU 01 CodeOp Rst R1 ldR1 C clrC C enMweM DatapathController Memory
42
Processeurs de Signaux et Images - 2 nde partie page 42 Département Électronique - A. BAGHDADI It is time to practice CAD tools (Computer-aided design) will be used to implement the architecture which you will propose for this processor
Présentations similaires
© 2024 SlidePlayer.fr Inc.
All rights reserved.