PIPELINING TECHNIQUES
Dr. Bill Yi, Santa Clara University
(Based on text: David A. Patterson & John L. Hennessy, Computer Organization and Design: The Hardware/Software Interface, 3rd Ed., Morgan Kaufmann, 2007)
(Also based on presentation: Dr. Nam Ling, COEN210 Lecture Notes)
COURSE CONTENTS
Introduction
Instructions
Computer Arithmetic
Performance
Processor: Datapath
Processor: Control
----> Pipelining Techniques
Memory
Input/Output Devices
PIPELINING & ADVANCED TECHNIQUES
Pipeline Overview & Hazards
Pipelined Datapath
Pipelined Control
Data Hazards, Forwarding, Stalls
Control Hazards & Exceptions
Pipelining: Basic Idea
Pipelining: multiple instructions overlapped in execution
Improves performance by increasing instruction throughput
Ideal speedup is the number of stages in the pipeline. Do we achieve this?
[Figure: program execution order (in instructions) vs. time (ns) for lw $1, 100($0); lw $2, 200($0); lw $3, 300($0), each passing through Instruction fetch, Reg, ALU, Data access, Reg. Top panel (non-pipelined): a new instruction starts every 8 ns. Bottom panel (pipelined): each stage takes 2 ns and a new instruction starts every 2 ns.]
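The timing in the figure can be reproduced with a short sketch. It assumes five stages of 2 ns each in the pipelined case and 8 ns per instruction in the non-pipelined case, as labeled in the figure; the function names are hypothetical and chosen only for illustration.

```python
# Assumption: five stages, matching the labels in the figure.
STAGES = ["Instruction fetch", "Reg", "ALU", "Data access", "Reg"]


def nonpipelined_schedule(n_instr, instr_time_ns):
    """(start, finish) times in ns when each instruction finishes before the next starts."""
    return [(i * instr_time_ns, (i + 1) * instr_time_ns) for i in range(n_instr)]


def pipelined_schedule(n_instr, stage_time_ns, n_stages=len(STAGES)):
    """(start, finish) times in ns when a new instruction starts every stage time."""
    return [(i * stage_time_ns, (i + n_stages) * stage_time_ns) for i in range(n_instr)]


program = ["lw $1, 100($0)", "lw $2, 200($0)", "lw $3, 300($0)"]
seq = nonpipelined_schedule(len(program), 8)   # 8 ns per instruction
pipe = pipelined_schedule(len(program), 2)     # 2 ns per stage

for instr, (s0, f0), (s1, f1) in zip(program, seq, pipe):
    print(f"{instr:16s} non-pipelined {s0:2d}-{f0:2d} ns   pipelined {s1:2d}-{f1:2d} ns")
```

Under these assumptions the three loads finish at 24 ns without pipelining and at 14 ns with it, while each individual load takes 10 ns in the pipeline instead of 8 ns: throughput improves, latency does not.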
Pipelining: Basic Idea
Improve instruction throughput rather than individual instruction execution time (latency)
Ideal speedup:
$$\frac{\text{Time between instructions (non-pipelined)}}{\text{Time between instructions (pipelined)}} = \text{number of pipe stages}$$
Pipeline stages: the ideal speedup requires stages of balanced (equal) length, and the number of pipe stages is limited
All instructions in the pipeline take the same number of clock cycles, so some instructions perform no operation in some stages
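A minimal sketch of the speedup ratio, showing why balanced stages matter. The stage latencies are assumptions: the balanced case is hypothetical, and the unbalanced case is one split chosen so that the stages sum to an 8 ns lw with a 2 ns pipelined clock.

```python
def time_between_instructions(stage_times_ns, pipelined):
    """Non-pipelined: sum of all stage times. Pipelined: the slowest stage sets the clock."""
    return max(stage_times_ns) if pipelined else sum(stage_times_ns)


def speedup(stage_times_ns):
    """Ratio of time between instructions, non-pipelined over pipelined."""
    return (time_between_instructions(stage_times_ns, pipelined=False)
            / time_between_instructions(stage_times_ns, pipelined=True))


balanced = [2, 2, 2, 2, 2]    # hypothetical: five equal 2 ns stages
unbalanced = [2, 1, 2, 2, 1]  # assumed split of an 8 ns lw across five stages

print(speedup(balanced))    # 5.0, equal to the number of stages
print(speedup(unbalanced))  # 4.0, below the number of stages
```

With unequal stages the clock period is set by the slowest stage, so the speedup falls short of the stage count.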
Pipelining
What makes it easy?
All instructions are the same length (e.g. MIPS)
Just a few instruction formats (e.g. MIPS)
Memory operands appear only in loads and stores (e.g. MIPS)
Operands aligned in memory (e.g. MIPS)
What makes it hard?
Structural hazards
Control hazards
Data hazards
Exception handling
Trying to improve performance with out-of-order execution, etc.
Three Pipeline Hazards
Hazards: situations in pipelining when the next instruction cannot execute in the following clock cycle
Structural hazards -- caused by hardware resource conflicts; the hardware cannot support the combination of instructions we want to execute in the same clock cycle
  Ex. With only one memory for both instructions and data, we have a structural hazard
  ----> need two separate memories, one for instructions and one for data
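The single-memory example can be sketched as a small check. It assumes a five-stage pipeline with stages numbered IF=0, ID=1, EX=2, MEM=3, WB=4, one instruction issued per cycle with no stalls, and cycles counted from 0; the function name is hypothetical.

```python
MEM_STAGE = 3  # assumption: stages are IF=0, ID=1, EX=2, MEM=3, WB=4


def memory_conflicts(program):
    """Find cycles where a single shared memory would be needed twice.

    Instruction i is fetched in cycle i and reaches the MEM stage in cycle
    i + MEM_STAGE, which is also the cycle in which instruction i + MEM_STAGE
    is fetched. Returns (cycle, fetched_instr, data_access_instr) triples.
    """
    conflicts = []
    for i, instr in enumerate(program):
        uses_data_memory = instr.split()[0] in ("lw", "sw")
        j = i + MEM_STAGE
        if uses_data_memory and j < len(program):
            conflicts.append((j, program[j], instr))
    return conflicts


program = [
    "lw $1, 100($0)",
    "lw $2, 200($0)",
    "lw $3, 300($0)",
    "add $4, $5, $6",
]
for cycle, fetched, accessing in memory_conflicts(program):
    print(f"cycle {cycle}: fetch of '{fetched}' conflicts with data access of '{accessing}'")
```

With separate instruction and data memories, the fetch and the data access in that cycle use different hardware and the conflict disappears.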
Control hazards -- caused by the need to make a decision based on the results of one instruction while others are executing
  Ex. branch / jump instructions
  ----> may cause a pipeline stall (bubble): the next instruction is stalled for extra clock cycle(s) before starting
  ----> branch prediction (may be complicated): execute the predicted instruction after the branch without delay; when the guess is wrong, ensure that the instructions following the wrongly guessed branch have no effect and restart the pipeline from the proper branch address
  ----> delayed decision (delayed branch): place an instruction that is not affected by the branch in the delayed branch slot
     Original order:           With delayed branch:
     add $4, $5, $6            beq $1, $2, 40
     beq $1, $2, 40            add $4, $5, $6   (in delayed branch slot)
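The three options can be compared with a rough cycle-count model. This is only a sketch: the instruction count, branch count, one-cycle penalty, and misprediction rate below are hypothetical numbers, not values from the lecture.

```python
def total_cycles(n_instr, n_stages, n_branches, penalty, pay_rate=1.0):
    """Cycles to run n_instr instructions on an n_stages pipeline.

    Base cost: fill the pipeline once, then complete one instruction per cycle.
    Each branch adds `penalty` bubble cycles with probability `pay_rate`
    (1.0 for always stalling, the misprediction rate for prediction,
    0.0 for a delayed branch whose slot always holds useful work).
    """
    base = n_stages + (n_instr - 1)
    return base + n_branches * penalty * pay_rate


# Hypothetical workload: 100 instructions, 20 of them branches, 1-cycle penalty.
stall = total_cycles(100, 5, 20, penalty=1)                    # always stall
predict = total_cycles(100, 5, 20, penalty=1, pay_rate=0.25)   # 25% mispredicted
delayed = total_cycles(100, 5, 20, penalty=1, pay_rate=0.0)    # slot always filled

print(stall, predict, delayed)  # 124.0 109.0 104.0
```

The model ignores data hazards and assumes every delayed branch slot can be filled, which a real compiler cannot always do.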
Three Pipeline Hazards
Data hazards -- caused by data dependence; an instruction depends on the results of a previous instruction still in the pipeline
Read after write dependence (RAW)
I1: R2