ECE232: Hardware Organization and Design

Part 2: Datapath Design – Binary Numbers and Adders

http://www.ecs.umass.edu/ece/ece232/

Adapted from Computer Organization and Design, Patterson & Hennessy, UCB

Computer Organization 5 classic components of any computer Computer Processor (CPU) Control

Datapath

Memory

Devices Input

Output

Today we will look at datapaths (adder, multiplier, …) ECE232: Adders 2

Adapted from Computer Organization and Design, Patterson & Hennessy, UCB and Kundu, UMass

Koren

Unsigned Binary Integers Given an n-bit number xn-1

x0

x = x n−1 2n−1 + x n−2 2n−2 + L + x1 21 + x 0 20 Range: 0 to +2n – 1 Example • 0000 0000 0000 0000 0000 0000 0000 10112 = 0 + … + 1×23 + 0×22 +1×21 +1×20 = 0 + … + 8 + 0 + 2 + 1 = 1110 Using 32 bits • 0 to +4,294,967,295

Adapted from Computer Organization and Design, Patterson & Hennessy, UCB and Kundu, UMass

ECE232: Adders 3

Koren

2’s-Complement Signed Integers

Given an n-bit number xn-1

x0 n −1

x = − x n−1 2

n−2

+ x n−2 2

+ L + x1 2 + x 0 2 1

0

Bit n-1 is sign bit • 1/0 for negative/non-negative numbers Range: –2n – 1 to +2n – 1 – 1 Example 1111 1111 1111 1111 1111 1111 1111 11002 = –1×231 + 1×230 + … + 1×22 +0×21 +0×20 = –2,147,483,648 + 2,147,483,644 = –410 Using 32 bits • –2,147,483,648 to +2,147,483,647 • Most-negative: 1000 0000 … 0000 • Most-positive: 0111 1111 … 1111 ECE232: Adders 4

2’s comp. binary

decimal

1000

-8

1001

-7

1010

-6

1011

-5

1100

-4

1101

-3

1110

-2

1111

-1

0000

0

0001

1

0010

2

0011

3

0100

4

0101

5

0110

6

0111

7

Adapted from Computer Organization and Design, Patterson & Hennessy, UCB and Kundu, UMass

Koren

Signed Negation To get –X complement X and add 1 • Complement means 1 → 0, x+x 0→1 Example: negate +2 • +2 = 0000 0000 … 00102 • –2 = 1111 1111 … 11012 + 1 = 1111 1111 … 11102 Subtraction: y – x = y + ( x +1)

Sign Extension

ECE232: Adders 5

= 1111...1112 = −1

x + 1 = −x

Representing a number using more bits • Preserve the numeric value Replicate the sign bit to the left Examples: 8-bit to 16-bit • +5: 0000 0101 => 0000 0000 0000 0101 • –5: 1111 1011 => 1111 1111 1111 1011

Adapted from Computer Organization and Design, Patterson & Hennessy, UCB and Kundu, UMass

Koren

Disadvantage of Signed-Magnitude Method Operation may depend on the signs of the operands Example - adding a positive number X and a negative number -Y : X+(-Y) If Y>X, final result is -(Y-X) Calculation • switch order of operands • perform subtraction rather than addition • attach the minus sign A sequence of decisions must be made, costing excess control logic and execution time This is avoided in the 2’s complement method

ECE232: Adders 6

Adapted from Computer Organization and Design, Patterson & Hennessy, UCB and Kundu, UMass

Koren

Overflow in 2’s Comp Add/Subtract (1) Example 01001 9 11001 -7 1 00010 2 Carry-out discarded - does not indicate overflow In general, if X and Y have opposite signs - no overflow can occur regardless of whether there is a carry-out or not Examples -

ECE232: Adders 7

Adapted from Computer Organization and Design, Patterson & Hennessy, UCB and Kundu, UMass

Koren

Overflow in 2’s Comp Add/Subtract (2) If X and Y have the same sign and result has different sign overflow occurs Examples 10111 -9 10111 -9 1 01110 14 = -18 mod 32 • Carry-out and overflow 01001 9 00111 7 0 10000 -16 = 16 mod 32 • No carry-out but overflow

ECE232: Adders 8

Adapted from Computer Organization and Design, Patterson & Hennessy, UCB and Kundu, UMass

Koren

Ripple Carry Adder Addition • most frequent operation • used also for multiplication and division • fast two-operand adder essential

Simple parallel adder • for adding xn-1,xn-2,...,x0 and yn-1,yn-2,…,y0 • using n full adders

Full adder • combinational digital circuit with input bits xi,yi and incoming carry bit ci, producing output sum bit si and outgoing carry bit ci+1 • incoming carry for next FA with input bits

xi+1,yi+1

• si = xi ⊕ yi ⊕ ci • ci+1 = xi ⋅ yi + ci ⋅ (xi + yi) Adapted from Computer Organization and Design, Patterson & Hennessy, UCB and Kundu, UMass

ECE232: Adders 9

Koren

Full-Adder (FA)

Examine the Full Adder table

Cin

x y Cin Cout S

x 0 0 0 0 1 1 1 1

0 0 1 1 0 0 1 1

0 1 0 1 0 1 0 1

0 0 0 1 0 1 1 1

0 1 1 0 1 0 0 1

Cout = x • y + Cin • (x + y) S = x’y’c + x’yc’ + xy’c’ + xyc =x⊕y⊕c

y

Cout

Sum

In general, for bit i: ci+1 = xi yi + ci (xi+yi) where ci+1 = Cout, ci= Cin

Half adder has 2 inputs. In principle HA is same as FA, with Cin set to 0. ECE232: Adders 10

Adapted from Computer Organization and Design, Patterson & Hennessy, UCB and Kundu, UMass

Koren

Parallel Adder: Ripple Carry

In a parallel arithmetic unit • All 2n input bits available at the same time • Carry propagates from the FA to the right to FA to the left • Carries ripple through all n FAs before we can claim that the sum outputs are correct and may be used in further calculations

Each FA has a finite delay ECE232: Adders 11

Adapted from Computer Organization and Design, Patterson & Hennessy, UCB and Kundu, UMass

Koren

Example x3,x2,x1,x0=1111 y3,y2,y1,y0=0001 ∆FA - operation time - delay Assuming equal delays for sum and carry-out • Longest carry propagation chain when adding two 4-bit numbers In synchronous arithmetic units time allowed for adder's operation is worst-case delay - n∆FA • • • •

ECE232: Adders 12

Adapted from Computer Organization and Design, Patterson & Hennessy, UCB and Kundu, UMass

Koren

Subtraction using Ripple Carry Adder

Suppose you are performing X-Y operation • Complement Y bits • Force C0 to 1 • add

Example: X = 0101, Y = 0010; Compute X – Y First step: Complement Y • 1101 Second step: add 0101 + 1101 + 1 = 0011

ECE232: Adders 13

Adapted from Computer Organization and Design, Patterson & Hennessy, UCB and Kundu, UMass

Koren

Carry Look Ahead Adder

Problem with Ripple Carry Adder • Slow • How much is the delay for a 64 bit adder? Solution • Shorten carry propagation delay • Wouldn’t it be great to generate all carry signals in parallel? • How do you do that? Observation 1. If Xi = Yi = 1, a carry will be generated, Cin does not matter 2. If Xi = Yi = 0, no carry will be generated by the FA, Cin does not matter 3. When does Cin matter in Cout generation? • Xi Yi = 01 • Xi Yi = 10 Carry Look Ahead Adder uses the above observation to generate carry signals in parallel

ECE232: Adders 14

Adapted from Computer Organization and Design, Patterson & Hennessy, UCB and Kundu, UMass

Koren

Carry Look Ahead Adder X2 Y2

X3 Y3

p3

g3

X0 Y0

C2 S1

C3 S2

S3

X1 Y1

p2

C1 S0

p1

g2

C0

g1

p0

g0

CLL (carry look-ahead logic)

C4

Gi =Xi. Yi : generated carry ; Pi=Xi + Yi : propagated carry

ECE232: Adders 15

Adapted from Computer Organization and Design, Patterson & Hennessy, UCB and Kundu, UMass

Koren

Plumbing analogy c0

g0 p0 c1 = g0 + c0 p0 c1

c0

g0 g1 c2 = g1 + g0 p1 + c0 p0 p1

p0

c2

g1 g2

c4 = g3 + g2 p3 + g1 p2p3 + g0 p1 p2p3 + c0 p0 p1 p2 p3

g3

c0

g0

p1

p0 p1

p2 p3

c4 ECE232: Adders 16

Adapted from Computer Organization and Design, Patterson & Hennessy, UCB and Kundu, UMass

Koren

Delay of Carry Look Ahead Adders

Let

τ

be the delay of a gate X2 Y2 X3 Y3

p3

p2

g3

X1 Y1

g2

p1

X0 Y0

g1

p0

g0

If inputs are available at time t=0, when are p and g signals available? C3

p3

C1

C2

p2

g3

g2

p1

g1

p0

g0 C0

CLL (carry look-ahead logic) C4

ECE232: Adders 17

Adapted from Computer Organization and Design, Patterson & Hennessy, UCB and Kundu, UMass

Koren

Delay of Carry Look Ahead Adders C3

p3

g3

C1

C2

p2

g2

p1

g1

p0

g0 C0

CLL (carry look-ahead logic) C4

Which signal will be generated last? • How long will it take?

ECE232: Adders 18

Adapted from Computer Organization and Design, Patterson & Hennessy, UCB and Kundu, UMass

Koren

Gates are limited to two inputs

C4=g3 + p3g2 + p3p2g1 + p3p2p1g0 +

p 3 p 2 p 1 p 0 c0

τ 2τ

τ 3τ What What What What

if if if if

there there there there

were were were were

6 7 8 9

inputs? inputs? inputs? inputs?

Adapted from Computer Organization and Design, Patterson & Hennessy, UCB and Kundu, UMass

ECE232: Adders 19

Koren

Total Delay X2 Y2

X3 Y3

C2 S1

C3 S2

S3

p3

p2

g3 CLL

C4

τ+3τ+ τ

+2τ

= 7τ

What is the delay of

a 5 bit CLA? 6 bit CLA? 7 bit CLA? 8 bit CLA?

ECE232: Adders 20

X1 Y1

g2

X0 Y0 C1 S0

p1

g1

C0 p0

g0

(carry look-ahead logic)

G*

P*

Adapted from Computer Organization and Design, Patterson & Hennessy, UCB and Kundu, UMass

Koren

2-level Carry Look Ahead (16-bit)

CLL

n=16 - 4 groups, 4-bit each

ECE232: Adders 21

Adapted from Computer Organization and Design, Patterson & Hennessy, UCB and Kundu, UMass

Koren

Plumbing Analogy

g0

p0 p1

g1

p1

p2 p3

g2

p2

P*0

g3

p3

G*0

ECE232: Adders 22

Adapted from Computer Organization and Design, Patterson & Hennessy, UCB and Kundu, UMass

Koren

Carry Select Adder

Principle: speculative Carry propagate delay CP(2n) = 2*CP(n)

n-bit adder

CP(2n) = CP(n) + CP(mux)

n-bit adder

1

n-bit adder

Compute both, select one

n-bit adder

0

n-bit adder

MUX

Carry-select adder

Cout ECE232: Adders 23

Adapted from Computer Organization and Design, Patterson & Hennessy, UCB and Kundu, UMass

Koren

Summary

Throw hardware for performance Ripple Carry: least hardware, slowest CLA: faster, more hardware Carry Select: even faster, even more hardware Other techniques available, e.g., Carry skip adder • See http://www.ecs.umass.edu/ece/koren/arith/simulator/ Combination of these techniques – hybrid adders

Reading: Chapter 2 - Section 2.4

ECE232: Adders 24

Adapted from Computer Organization and Design, Patterson & Hennessy, UCB and Kundu, UMass

Koren

Different circuit implementation of a CLL MCC - Manchester Carry module

ECE232: Adders 25

Adapted from Computer Organization and Design, Patterson & Hennessy, UCB and Kundu, UMass

Koren

Adapted from Computer Organization and Design, Patterson & Hennessy, UCB and Kundu, UMass

Koren

64-bit Hybrid Adder

ECE232: Adders 26

Part 2: Datapath Design – Binary Numbers and Adders

http://www.ecs.umass.edu/ece/ece232/

Adapted from Computer Organization and Design, Patterson & Hennessy, UCB

Computer Organization 5 classic components of any computer Computer Processor (CPU) Control

Datapath

Memory

Devices Input

Output

Today we will look at datapaths (adder, multiplier, …) ECE232: Adders 2

Adapted from Computer Organization and Design, Patterson & Hennessy, UCB and Kundu, UMass

Koren

Unsigned Binary Integers Given an n-bit number xn-1

x0

x = x n−1 2n−1 + x n−2 2n−2 + L + x1 21 + x 0 20 Range: 0 to +2n – 1 Example • 0000 0000 0000 0000 0000 0000 0000 10112 = 0 + … + 1×23 + 0×22 +1×21 +1×20 = 0 + … + 8 + 0 + 2 + 1 = 1110 Using 32 bits • 0 to +4,294,967,295

Adapted from Computer Organization and Design, Patterson & Hennessy, UCB and Kundu, UMass

ECE232: Adders 3

Koren

2’s-Complement Signed Integers

Given an n-bit number xn-1

x0 n −1

x = − x n−1 2

n−2

+ x n−2 2

+ L + x1 2 + x 0 2 1

0

Bit n-1 is sign bit • 1/0 for negative/non-negative numbers Range: –2n – 1 to +2n – 1 – 1 Example 1111 1111 1111 1111 1111 1111 1111 11002 = –1×231 + 1×230 + … + 1×22 +0×21 +0×20 = –2,147,483,648 + 2,147,483,644 = –410 Using 32 bits • –2,147,483,648 to +2,147,483,647 • Most-negative: 1000 0000 … 0000 • Most-positive: 0111 1111 … 1111 ECE232: Adders 4

2’s comp. binary

decimal

1000

-8

1001

-7

1010

-6

1011

-5

1100

-4

1101

-3

1110

-2

1111

-1

0000

0

0001

1

0010

2

0011

3

0100

4

0101

5

0110

6

0111

7

Adapted from Computer Organization and Design, Patterson & Hennessy, UCB and Kundu, UMass

Koren

Signed Negation To get –X complement X and add 1 • Complement means 1 → 0, x+x 0→1 Example: negate +2 • +2 = 0000 0000 … 00102 • –2 = 1111 1111 … 11012 + 1 = 1111 1111 … 11102 Subtraction: y – x = y + ( x +1)

Sign Extension

ECE232: Adders 5

= 1111...1112 = −1

x + 1 = −x

Representing a number using more bits • Preserve the numeric value Replicate the sign bit to the left Examples: 8-bit to 16-bit • +5: 0000 0101 => 0000 0000 0000 0101 • –5: 1111 1011 => 1111 1111 1111 1011

Adapted from Computer Organization and Design, Patterson & Hennessy, UCB and Kundu, UMass

Koren

Disadvantage of Signed-Magnitude Method Operation may depend on the signs of the operands Example - adding a positive number X and a negative number -Y : X+(-Y) If Y>X, final result is -(Y-X) Calculation • switch order of operands • perform subtraction rather than addition • attach the minus sign A sequence of decisions must be made, costing excess control logic and execution time This is avoided in the 2’s complement method

ECE232: Adders 6

Adapted from Computer Organization and Design, Patterson & Hennessy, UCB and Kundu, UMass

Koren

Overflow in 2’s Comp Add/Subtract (1) Example 01001 9 11001 -7 1 00010 2 Carry-out discarded - does not indicate overflow In general, if X and Y have opposite signs - no overflow can occur regardless of whether there is a carry-out or not Examples -

ECE232: Adders 7

Adapted from Computer Organization and Design, Patterson & Hennessy, UCB and Kundu, UMass

Koren

Overflow in 2’s Comp Add/Subtract (2) If X and Y have the same sign and result has different sign overflow occurs Examples 10111 -9 10111 -9 1 01110 14 = -18 mod 32 • Carry-out and overflow 01001 9 00111 7 0 10000 -16 = 16 mod 32 • No carry-out but overflow

ECE232: Adders 8

Adapted from Computer Organization and Design, Patterson & Hennessy, UCB and Kundu, UMass

Koren

Ripple Carry Adder Addition • most frequent operation • used also for multiplication and division • fast two-operand adder essential

Simple parallel adder • for adding xn-1,xn-2,...,x0 and yn-1,yn-2,…,y0 • using n full adders

Full adder • combinational digital circuit with input bits xi,yi and incoming carry bit ci, producing output sum bit si and outgoing carry bit ci+1 • incoming carry for next FA with input bits

xi+1,yi+1

• si = xi ⊕ yi ⊕ ci • ci+1 = xi ⋅ yi + ci ⋅ (xi + yi) Adapted from Computer Organization and Design, Patterson & Hennessy, UCB and Kundu, UMass

ECE232: Adders 9

Koren

Full-Adder (FA)

Examine the Full Adder table

Cin

x y Cin Cout S

x 0 0 0 0 1 1 1 1

0 0 1 1 0 0 1 1

0 1 0 1 0 1 0 1

0 0 0 1 0 1 1 1

0 1 1 0 1 0 0 1

Cout = x • y + Cin • (x + y) S = x’y’c + x’yc’ + xy’c’ + xyc =x⊕y⊕c

y

Cout

Sum

In general, for bit i: ci+1 = xi yi + ci (xi+yi) where ci+1 = Cout, ci= Cin

Half adder has 2 inputs. In principle HA is same as FA, with Cin set to 0. ECE232: Adders 10

Adapted from Computer Organization and Design, Patterson & Hennessy, UCB and Kundu, UMass

Koren

Parallel Adder: Ripple Carry

In a parallel arithmetic unit • All 2n input bits available at the same time • Carry propagates from the FA to the right to FA to the left • Carries ripple through all n FAs before we can claim that the sum outputs are correct and may be used in further calculations

Each FA has a finite delay ECE232: Adders 11

Adapted from Computer Organization and Design, Patterson & Hennessy, UCB and Kundu, UMass

Koren

Example x3,x2,x1,x0=1111 y3,y2,y1,y0=0001 ∆FA - operation time - delay Assuming equal delays for sum and carry-out • Longest carry propagation chain when adding two 4-bit numbers In synchronous arithmetic units time allowed for adder's operation is worst-case delay - n∆FA • • • •

ECE232: Adders 12

Adapted from Computer Organization and Design, Patterson & Hennessy, UCB and Kundu, UMass

Koren

Subtraction using Ripple Carry Adder

Suppose you are performing X-Y operation • Complement Y bits • Force C0 to 1 • add

Example: X = 0101, Y = 0010; Compute X – Y First step: Complement Y • 1101 Second step: add 0101 + 1101 + 1 = 0011

ECE232: Adders 13

Adapted from Computer Organization and Design, Patterson & Hennessy, UCB and Kundu, UMass

Koren

Carry Look Ahead Adder

Problem with Ripple Carry Adder • Slow • How much is the delay for a 64 bit adder? Solution • Shorten carry propagation delay • Wouldn’t it be great to generate all carry signals in parallel? • How do you do that? Observation 1. If Xi = Yi = 1, a carry will be generated, Cin does not matter 2. If Xi = Yi = 0, no carry will be generated by the FA, Cin does not matter 3. When does Cin matter in Cout generation? • Xi Yi = 01 • Xi Yi = 10 Carry Look Ahead Adder uses the above observation to generate carry signals in parallel

ECE232: Adders 14

Adapted from Computer Organization and Design, Patterson & Hennessy, UCB and Kundu, UMass

Koren

Carry Look Ahead Adder X2 Y2

X3 Y3

p3

g3

X0 Y0

C2 S1

C3 S2

S3

X1 Y1

p2

C1 S0

p1

g2

C0

g1

p0

g0

CLL (carry look-ahead logic)

C4

Gi =Xi. Yi : generated carry ; Pi=Xi + Yi : propagated carry

ECE232: Adders 15

Adapted from Computer Organization and Design, Patterson & Hennessy, UCB and Kundu, UMass

Koren

Plumbing analogy c0

g0 p0 c1 = g0 + c0 p0 c1

c0

g0 g1 c2 = g1 + g0 p1 + c0 p0 p1

p0

c2

g1 g2

c4 = g3 + g2 p3 + g1 p2p3 + g0 p1 p2p3 + c0 p0 p1 p2 p3

g3

c0

g0

p1

p0 p1

p2 p3

c4 ECE232: Adders 16

Adapted from Computer Organization and Design, Patterson & Hennessy, UCB and Kundu, UMass

Koren

Delay of Carry Look Ahead Adders

Let

τ

be the delay of a gate X2 Y2 X3 Y3

p3

p2

g3

X1 Y1

g2

p1

X0 Y0

g1

p0

g0

If inputs are available at time t=0, when are p and g signals available? C3

p3

C1

C2

p2

g3

g2

p1

g1

p0

g0 C0

CLL (carry look-ahead logic) C4

ECE232: Adders 17

Adapted from Computer Organization and Design, Patterson & Hennessy, UCB and Kundu, UMass

Koren

Delay of Carry Look Ahead Adders C3

p3

g3

C1

C2

p2

g2

p1

g1

p0

g0 C0

CLL (carry look-ahead logic) C4

Which signal will be generated last? • How long will it take?

ECE232: Adders 18

Adapted from Computer Organization and Design, Patterson & Hennessy, UCB and Kundu, UMass

Koren

Gates are limited to two inputs

C4=g3 + p3g2 + p3p2g1 + p3p2p1g0 +

p 3 p 2 p 1 p 0 c0

τ 2τ

τ 3τ What What What What

if if if if

there there there there

were were were were

6 7 8 9

inputs? inputs? inputs? inputs?

Adapted from Computer Organization and Design, Patterson & Hennessy, UCB and Kundu, UMass

ECE232: Adders 19

Koren

Total Delay X2 Y2

X3 Y3

C2 S1

C3 S2

S3

p3

p2

g3 CLL

C4

τ+3τ+ τ

+2τ

= 7τ

What is the delay of

a 5 bit CLA? 6 bit CLA? 7 bit CLA? 8 bit CLA?

ECE232: Adders 20

X1 Y1

g2

X0 Y0 C1 S0

p1

g1

C0 p0

g0

(carry look-ahead logic)

G*

P*

Adapted from Computer Organization and Design, Patterson & Hennessy, UCB and Kundu, UMass

Koren

2-level Carry Look Ahead (16-bit)

CLL

n=16 - 4 groups, 4-bit each

ECE232: Adders 21

Adapted from Computer Organization and Design, Patterson & Hennessy, UCB and Kundu, UMass

Koren

Plumbing Analogy

g0

p0 p1

g1

p1

p2 p3

g2

p2

P*0

g3

p3

G*0

ECE232: Adders 22

Adapted from Computer Organization and Design, Patterson & Hennessy, UCB and Kundu, UMass

Koren

Carry Select Adder

Principle: speculative Carry propagate delay CP(2n) = 2*CP(n)

n-bit adder

CP(2n) = CP(n) + CP(mux)

n-bit adder

1

n-bit adder

Compute both, select one

n-bit adder

0

n-bit adder

MUX

Carry-select adder

Cout ECE232: Adders 23

Adapted from Computer Organization and Design, Patterson & Hennessy, UCB and Kundu, UMass

Koren

Summary

Throw hardware for performance Ripple Carry: least hardware, slowest CLA: faster, more hardware Carry Select: even faster, even more hardware Other techniques available, e.g., Carry skip adder • See http://www.ecs.umass.edu/ece/koren/arith/simulator/ Combination of these techniques – hybrid adders

Reading: Chapter 2 - Section 2.4

ECE232: Adders 24

Adapted from Computer Organization and Design, Patterson & Hennessy, UCB and Kundu, UMass

Koren

Different circuit implementation of a CLL MCC - Manchester Carry module

ECE232: Adders 25

Adapted from Computer Organization and Design, Patterson & Hennessy, UCB and Kundu, UMass

Koren

Adapted from Computer Organization and Design, Patterson & Hennessy, UCB and Kundu, UMass

Koren

64-bit Hybrid Adder

ECE232: Adders 26