Chapter Four
Arithmetic for Computers
Mario Côrtes - MO401 - IC/Unicamp- 2004s2
1998 Morgan Kaufmann Publishers
Arithmetic
• Where we've been:
  – Performance (seconds, cycles, instructions)
  – Abstractions:
      Instruction Set Architecture
      Assembly Language and Machine Language
• What's up ahead:
  – Implementing the Architecture
[Figure: an ALU with two 32-bit inputs, a and b, an operation control input, and a 32-bit result]
Numbers
• Bits are just bits (no inherent meaning)
  – conventions define the relationship between bits and numbers
• Binary numbers (base 2)
    0000 0001 0010 0011 0100 0101 0110 0111 1000 1001...
    decimal: 0 ... 2^n - 1
• Of course it gets more complicated:
  – numbers are finite (overflow)
  – fractions and real numbers
  – negative numbers
    (e.g., there is no MIPS subi instruction; addi can add a negative number)
• How do we represent negative numbers?
    i.e., which bit patterns will represent which numbers?
Possible Representations
    Bits   Sign Magnitude   One's Complement   Two's Complement
    000         +0                +0                 +0
    001         +1                +1                 +1
    010         +2                +2                 +2
    011         +3                +3                 +3
    100         -0                -3                 -4
    101         -1                -2                 -3
    110         -2                -1                 -2
    111         -3                -0                 -1
• Issues: balance, number of zeros, ease of operations
• Which one is best? Why?
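The three conventions listed for 3-bit patterns can be checked with a short Python sketch. The function names are illustrative and not part of the slides; note that Python integers have no negative zero, so the patterns for -0 print as 0.

```python
def sign_magnitude(bits: int, n: int) -> int:
    """Top bit is the sign; the remaining n-1 bits are the magnitude."""
    magnitude = bits & ((1 << (n - 1)) - 1)
    return -magnitude if bits >> (n - 1) else magnitude

def ones_complement(bits: int, n: int) -> int:
    """A negative value is the bitwise inverse of its magnitude."""
    if bits >> (n - 1):
        return -((~bits) & ((1 << n) - 1))
    return bits

def twos_complement(bits: int, n: int) -> int:
    """The top bit carries weight -2^(n-1)."""
    return bits - (1 << n) if bits >> (n - 1) else bits

# Print every 3-bit pattern under the three conventions.
for pattern in range(8):
    print(f"{pattern:03b}",
          sign_magnitude(pattern, 3),
          ones_complement(pattern, 3),
          twos_complement(pattern, 3))
```

Only two's complement uses all eight patterns for distinct values, which is one of the "number of zeros" issues the slide raises.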
MIPS
• 32 bit signed numbers:
    0000 0000 0000 0000 0000 0000 0000 0000two =             0ten
    0000 0000 0000 0000 0000 0000 0000 0001two = + 1ten
    0000 0000 0000 0000 0000 0000 0000 0010two = + 2ten
    ...
    0111 1111 1111 1111 1111 1111 1111 1110two = + 2,147,483,646ten
    0111 1111 1111 1111 1111 1111 1111 1111two = + 2,147,483,647ten    (maxint)
    1000 0000 0000 0000 0000 0000 0000 0000two = – 2,147,483,648ten    (minint)
    1000 0000 0000 0000 0000 0000 0000 0001two = – 2,147,483,647ten
    1000 0000 0000 0000 0000 0000 0000 0010two = – 2,147,483,646ten
    ...
    1111 1111 1111 1111 1111 1111 1111 1101two = – 3ten
    1111 1111 1111 1111 1111 1111 1111 1110two = – 2ten
    1111 1111 1111 1111 1111 1111 1111 1111two = – 1ten
Two's Complement Operations
• Negating a two's complement number: invert all bits and add 1
  – remember: “negate” and “invert” are quite different!
• Converting n-bit numbers into numbers with more than n bits:
  – MIPS 16-bit immediate gets converted to 32 bits for arithmetic
  – copy the most significant bit (the sign bit) into the other bits
      0010 -> 0000 0010
      1010 -> 1111 1010
  – "sign extension" (lb sign-extends the loaded byte; lbu zero-extends it)
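As a minimal sketch of the two operations on this slide, the Python below negates an n-bit two's complement pattern and sign-extends a narrow pattern to a wider one. The helper names `negate` and `sign_extend` are assumptions made for illustration.

```python
def negate(x: int, n: int) -> int:
    """Two's complement negation of an n-bit pattern: invert all bits, add 1."""
    mask = (1 << n) - 1
    return ((~x & mask) + 1) & mask

def sign_extend(x: int, from_bits: int, to_bits: int) -> int:
    """Copy the sign bit of a from_bits-wide pattern into the new upper bits."""
    sign = (x >> (from_bits - 1)) & 1
    if sign:
        # Set every bit between from_bits and to_bits.
        x |= ((1 << to_bits) - 1) & ~((1 << from_bits) - 1)
    return x

print(f"{negate(0b0010, 4):04b}")            # 1110
print(f"{sign_extend(0b0010, 4, 8):08b}")    # 00000010
print(f"{sign_extend(0b1010, 4, 8):08b}")    # 11111010
```

The last two calls reproduce the 4-bit to 8-bit examples from the slide.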
New instructions
• “unsigned” instructions (example application: memory address calculation)
• sltu $t1, $t2, $t3     # the comparison is unsigned
• slti and sltiu         # immediate operand, signed or unsigned
• Example (p. 215): suppose $s0 = FF FF FF FF and $s1 = 00 00 00 01
    slt  $t0, $s0, $s1
      signed: $s0 < 0 and $s1 > 0 → $s0 < $s1 → $t0 = 1
    sltu $t0, $s0, $s1
      unsigned: $s0 and $s1 have no sign → $s0 > $s1 → $t0 = 0
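The example from p. 215 can be reproduced with a small Python model of the two comparisons. This is only a sketch: the functions take raw 32-bit patterns, and `to_signed` is a helper introduced here for illustration.

```python
def to_signed(x: int, n: int = 32) -> int:
    """Interpret an n-bit pattern as a two's complement value."""
    return x - (1 << n) if x >> (n - 1) else x

def slt(rs: int, rt: int) -> int:
    """Set on less than, signed comparison."""
    return 1 if to_signed(rs) < to_signed(rt) else 0

def sltu(rs: int, rt: int) -> int:
    """Set on less than, unsigned comparison of the raw bit patterns."""
    return 1 if rs < rt else 0

s0, s1 = 0xFFFFFFFF, 0x00000001
print(slt(s0, s1), sltu(s0, s1))   # 1 0
```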
Caveats when extending 16 bits
• beq $s0, $s1, nnn     # branches to PC + nnn if the test succeeds
• nnn has 16 bits and the PC has 32 bits
  – extend from 16 to 32 bits before the arithmetic operation
• if nnn ≥ 0
  – fill with zeros on the left
• if nnn < 0 (CAREFUL)
  – fill with 1s on the left
  – verify
• for this reason the operation is called
  – SIGN EXTENSION
Addition & Subtraction
• Just like in grade school (carry/borrow 1s)
      0111        0111        0110
    + 0110      - 0110      - 0101
• Two's complement operations easy
  – subtraction using addition of negative numbers
      0111
    + 1010
• Overflow (result too large for finite computer word):
  – e.g., adding two n-bit numbers does not yield an n-bit number
      0111
    + 0001
      1000
    note that the term overflow is somewhat misleading:
    it does not mean a carry “overflowed”
Detecting Overflow
• No overflow when adding a positive and a negative number
• No overflow when signs are the same for subtraction
• OVERFLOW CONDITIONS
    op       A    B    result
    A + B    +    +    -
    A + B    -    -    +
    A - B    +    -    -
    A - B    -    +    +
• In hardware, compare the carry into the sign bit with the carry out of the sign bit: overflow occurs when they differ
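The hardware rule, comparing the carry into the sign bit with the carry out of it, can be sketched in Python for n-bit patterns. The function name and the 4-bit default width are assumptions chosen to match the slide examples.

```python
def add_with_overflow(a: int, b: int, n: int = 4):
    """Add two n-bit two's complement patterns.

    Returns (n-bit result, overflow flag). Overflow is detected the way the
    hardware does it: carry into the sign bit != carry out of the sign bit.
    """
    mask = (1 << n) - 1
    low_mask = mask >> 1                       # all bits below the sign bit
    carry_in = ((a & low_mask) + (b & low_mask)) >> (n - 1)
    total = (a & mask) + (b & mask)
    carry_out = total >> n
    return total & mask, carry_in != carry_out

print(add_with_overflow(0b0111, 0b0001))   # (8, True): 7 + 1 does not fit in 4 bits
print(add_with_overflow(0b0111, 0b1010))   # (1, False): 7 + (-6) = 1
```

The second call produces a carry out of the top bit without overflow, which illustrates why a carry alone does not signal overflow.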
Effects of Overflow
• An exception (interrupt) occurs
  – Control jumps to a predefined address for the exception
  – Interrupted address is saved for possible resumption
    (in the EPC, the EXCEPTION PROGRAM COUNTER)
  – mfc0 (move from system control): copies the address in the EPC into any register
• Don't always want to detect overflow
  – new MIPS instructions: addu, addiu, subu
    note: addiu still sign-extends!
    note: sltu, sltiu for unsigned comparisons
Instructions (Fig. 4.52, p. 309)

    Instruction                        Name    Format    Instruction                        Name    Format
    add                                add     R         multiply                           mult    R
    add immediate                      addi    I         multiply unsigned                  multu   R
    add unsigned                       addu    R         divide                             div     R
    add immediate unsigned             addiu   I         divide unsigned                    divu    R
    subtract                           sub     R         move from Hi                       mfhi    R
    subtract unsigned                  subu    R         move from Lo                       mflo    R
    and                                and     R         move from system control (EPC)     mfc0    R
    and immediate                      andi    I         fp add single                      add.s   R
    or                                 or      R         fp add double                      add.d   R
    or immediate                       ori     I         fp subtract single                 sub.s   R
    shift left logical                 sll     R         fp subtract double                 sub.d   R
    shift right logical                srl     R         fp multiply single                 mul.s   R
    load upper immediate               lui     I         fp multiply double                 mul.d   R
    load word                          lw      I         fp divide single                   div.s   R
    store word                         sw      I         fp divide double                   div.d   R
    load byte unsigned                 lbu     I         load word to fp single             lwc1    I
    store byte                         sb      I         store word to fp single            swc1    I
    branch on equal                    beq     I         branch on fp true                  bc1t    I
    branch on not equal                bne     I         branch on fp false                 bc1f    I
    jump                               j       J         fp compare single                  c.x.s   R
    jump and link                      jal     J           (x = eq, neq, lt, le, gt, ge)
    jump register                      jr      R         fp compare double                  c.x.d   R
    set less than                      slt     R           (x = eq, neq, lt, le, gt, ge)
    set less than immediate            slti    I
    set less than unsigned             sltu    R
    set less than immediate unsigned   sltiu   I
Review: Boolean Algebra & Gates
• Problem: Consider a logic function with three inputs: A, B, and C.
    Output D is true if at least one input is true
    Output E is true if exactly two inputs are true
    Output F is true only if all three inputs are true
• Show the truth table for these three functions.
• Show the Boolean equations for these three functions.
• Show an implementation consisting of inverters, AND, and OR gates.
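One way to check an answer to this exercise is to enumerate the truth table in Python. The sum-of-products expression for E below is one possible solution written with AND, OR, and inversion only; it is a sketch, not the slide's official answer.

```python
from itertools import product

def outputs(a: int, b: int, c: int):
    """Return (D, E, F) for 1-bit inputs a, b, c."""
    na, nb, nc = 1 - a, 1 - b, 1 - c           # inverters
    d = a | b | c                              # at least one input true
    e = (a & b & nc) | (a & nb & c) | (na & b & c)   # exactly two inputs true
    f = a & b & c                              # all three inputs true
    return d, e, f

print("A B C | D E F")
for a, b, c in product((0, 1), repeat=3):
    d, e, f = outputs(a, b, c)
    print(a, b, c, "|", d, e, f)
```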
An ALU (arithmetic logic unit)
• Let's build an ALU to support the andi and ori instructions
  – we'll just build a 1 bit ALU, and use 32 of them
  [Figure: 1-bit ALU with inputs a and b, an operation control input, and a result output, next to a table with columns op, a, b, res]
• Possible Implementation (sum-of-products):
Review: The Multiplexor
• Selects one of the inputs to be the output, based on a control input
  [Figure: multiplexor with data inputs A (selected when S = 0) and B (selected when S = 1), control input S, and output C]
  note: we call this a 2-input mux even though it has 3 inputs!
• Let's build our ALU using a MUX:
Different Implementations
• Not easy to decide the “best” way to build something
  – Don't want too many inputs to a single gate
  – Don't want to have to go through too many gates
  – for our purposes, ease of comprehension is important
• Let's look at a 1-bit ALU for addition:
  [Figure: 1-bit adder with inputs a, b, and CarryIn and outputs Sum and CarryOut]
    cout = a b + a cin + b cin
    sum = a xor b xor cin
• How could we build a 1-bit ALU for add, and, and or?
• How could we build a 32-bit ALU?
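The two equations for the 1-bit adder translate directly into code, and chaining the carry through n copies gives one possible answer to the 32-bit question. This is a software sketch of the idea; `full_adder` and `ripple_add` are names chosen here.

```python
def full_adder(a: int, b: int, cin: int):
    """1-bit adder: returns (sum, carry out) from the slide's equations."""
    cout = (a & b) | (a & cin) | (b & cin)
    s = a ^ b ^ cin
    return s, cout

def ripple_add(a: int, b: int, n: int = 32):
    """Chain n full adders; each CarryOut feeds the next CarryIn."""
    carry, result = 0, 0
    for i in range(n):
        s, carry = full_adder((a >> i) & 1, (b >> i) & 1, carry)
        result |= s << i
    return result, carry

print(ripple_add(0b0111, 0b0110, 4))   # (13, 0)
```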
Building a 32 bit ALU
[Figure: 32-bit ALU built from 32 one-bit ALUs (ALU0 to ALU31). Each ALUi takes ai, bi, and CarryIn and produces Resulti and CarryOut; the CarryOut of each stage feeds the CarryIn of the next, and the Operation lines are shared by all stages]
What about subtraction (a – b) ?
• Two's complement approach: just negate b and add.
    a - b = a + (- b)
• How do we negate?
    (- a) = comp2(a) = comp1(a) + 1
• A very clever solution:
  [Figure: 1-bit ALU in which a mux controlled by Binvert selects b or its complement as the adder input; with Binvert = 1 and the CarryIn of the least significant bit set to 1, the adder computes a + comp1(b) + 1 = a - b]
Subtractor
[Figure: a mux that selects B or ~B under control of Binv is shown as equivalent to an XOR gate combining B with Binv; the XOR outputs feed the adder (+) stages]
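A minimal Python sketch of the subtractor idea: XOR every bit of b with Binvert and reuse Binvert as the carry into bit 0. The function name `add_sub` and its parameters are assumptions made for illustration.

```python
def add_sub(a: int, b: int, binvert: int, n: int = 32) -> int:
    """Compute a + b (binvert = 0) or a - b (binvert = 1) on n-bit patterns."""
    mask = (1 << n) - 1
    b_in = b ^ (mask if binvert else 0)   # XOR each bit of b with Binvert
    carry_in = binvert                    # the "+1" of two's complement negation
    return (a + b_in + carry_in) & mask

print(add_sub(0b0111, 0b0110, 1, 4))   # 1
print(add_sub(0b0110, 0b0111, 1, 4))   # 15, i.e. 1111two = -1
```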
Tailoring the ALU to the MIPS
• Need to support the set-on-less-than instruction (slt)
  – remember: slt is an arithmetic instruction
  – produces a 1 if rs < rt and 0 otherwise
  – use subtraction: (a-b) < 0 implies a < b
• Need to support test for equality (beq $t5, $t6, $t7)
  – use subtraction: (a-b) = 0 implies a = b
Supporting slt
• Can we figure out the idea?
[Figure a: 1-bit ALU with inputs a, b, Less, Binvert, CarryIn, and Operation; a 4-input mux (inputs 0 to 3) selects Result, with Less on input 3]
[Figure b: 1-bit ALU for the most significant bit, which also outputs Set (the sign bit of the subtraction) and Overflow through an overflow detection block]
[Figure: 32-bit ALU built from ALU0 to ALU31 with shared Binvert, CarryIn, and Operation lines; the Less inputs of ALU1 to ALU31 are 0, and Set from ALU31 feeds the Less input of ALU0. Annotations: Rs and Rt are the operands, Rd receives the result, and Set is the sign bit of the subtraction]
Test for equality
• Notice control lines:
    000 = and
    001 = or
    010 = add
    110 = subtract
    111 = slt
• Note: zero is a 1 when the result is zero!
[Figure: 32-bit ALU (ALU0 to ALU31) with Bnegate and Operation control lines; Result0 to Result31 are combined to produce Zero, and ALU31 also outputs Set and Overflow]
ALU
[Figure: ALU symbol with inputs A, B, and ALUop and outputs Zero, Result, and Overflow]
• 32 bits: A, B, result
• 1 bit: Zero, Overflow
• 3 bits: ALUop

    ALUop (Binv-OP)    Instruction
    0 00               and
    0 01               or
    0 10               add
    1 10               sub
    1 11               slt
    1 10               beq
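The ALUop table can be exercised with a behavioral Python model. This is a sketch under assumptions: operands are raw 32-bit patterns, slt uses the sign bit of the subtraction as on the slides (so it ignores the overflow case), and the Overflow output is only reported for add and subtract.

```python
MASK = 0xFFFFFFFF

def alu(aluop: int, a: int, b: int):
    """Behavioral 32-bit ALU. aluop is Binvert followed by the 2-bit Operation.

    Returns (result, zero, overflow).
    """
    binvert, op = aluop >> 2, aluop & 0b11
    b_in = (b ^ MASK) if binvert else b
    sum32 = (a + b_in + binvert) & MASK        # Binvert also drives CarryIn of bit 0
    overflow = 0
    if op == 0b00:
        result = a & b_in
    elif op == 0b01:
        result = a | b_in
    elif op == 0b10:
        result = sum32
        # Adder inputs share a sign that differs from the sign of the sum.
        overflow = (((a ^ sum32) & (b_in ^ sum32)) >> 31) & 1
    else:
        result = sum32 >> 31                   # slt: sign bit of a - b
    zero = 1 if result == 0 else 0
    return result, zero, overflow

print(alu(0b110, 5, 5))   # (0, 1, 0): beq sees Zero = 1
print(alu(0b111, 3, 5))   # (1, 0, 0): 3 < 5
```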
Conclusion
• We can build an ALU to support the MIPS instruction set
  – key idea: use multiplexor to select the output we want
  – we can efficiently perform subtraction using two’s complement
  – we can replicate a 1-bit ALU to produce a 32-bit ALU
• Important points about hardware
  – all of the gates are always working
  – the speed of a gate is affected by the number of inputs to the gate
  – the speed of a circuit is affected by the number of gates in series
    (on the “critical path” or the “deepest level of logic”)
• Our primary focus is comprehension; however,
  – Clever changes to organization can improve performance
    (similar to using better algorithms in software)
  – we’ll look at two examples for addition and multiplication
Problem: ripple carry adder is slow
• Is a 32-bit ALU as fast as a 1-bit ALU?
    delay per stage (inputs → sum or carry) = 2G, with G one gate delay
    n stages → 2nG
• Is there more than one way to do addition?
  – two extremes:
      ripple carry (2nG)
      sum-of-products (2G)
[Figure: ripple-carry ALU, ALU0 to ALU31; the CarryOut of each stage feeds the CarryIn of the next]
• Can you see the ripple? How could you get rid of it?
    c1 = b0c0 + a0c0 + a0b0
    c2 = b1c1 + a1c1 + a1b1        c2 =
    c3 = b2c2 + a2c2 + a2b2        c3 =
    c4 = b3c3 + a3c3 + a3b3        c4 =
• Not feasible! Why?
Carry-lookahead adder
• An approach in-between our two extremes
• Motivation:
  – If we didn't know the value of carry-in, what could we do?
  – When would we always generate a carry?      gi = ai bi
  – When would we propagate the carry?          pi = ai + bi
• Did we get rid of the ripple?
    c1 = g0 + p0c0
    c2 = g1 + p1c1        c2 =
    c3 = g2 + p2c2        c3 =
    c4 = g3 + p3c3        c4 =
• Feasible! Why?
    delay: inputs → gi, pi (1G)
           gi, pi → carry (2G)
           carry → outputs (2G)
           total: 5G, independent of n
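To make the "no ripple" claim concrete, the sketch below computes every carry of a 4-bit adder directly from the gi, pi, and c0 signals, using the fully expanded form (for example c2 = g1 + p1g0 + p1p0c0) instead of the previous carry. The function name is an assumption; in hardware each carry would be a two-level AND-OR circuit, while the inner loop here only builds that expression.

```python
def cla_carries(a: int, b: int, c0: int, n: int = 4):
    """Return [c0, c1, ..., cn] computed from generate/propagate signals only."""
    g = [(a >> i) & (b >> i) & 1 for i in range(n)]      # gi = ai bi
    p = [((a >> i) | (b >> i)) & 1 for i in range(n)]    # pi = ai + bi
    carries = [c0]
    for i in range(n):
        # c(i+1) = gi + pi g(i-1) + pi p(i-1) g(i-2) + ... + pi ... p0 c0
        c_next = 0
        prop = 1                       # running product pi p(i-1) ... p(j+1)
        for j in range(i, -1, -1):
            c_next |= g[j] & prop
            prop &= p[j]
        c_next |= prop & c0
        carries.append(c_next)
    return carries

print(cla_carries(0b0111, 0b0001, 0))   # [0, 1, 1, 1, 0]
```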
Use principle to build bigger adders
[Figure: 16-bit adder built from four 4-bit adders (ALU0 to ALU3, producing Result0--3, Result4--7, Result8--11, and Result12--15). Each block sends its P and G signals (P0/G0 to P3/G3) to a carry-lookahead unit, which returns the carries C1 to C4; C4 is the CarryOut]
• Can’t build a 16 bit adder this way... (too big)
• Could use ripple carry of 4-bit CLA adders
• Better: use the CLA principle again!
  – super propagate (see p. 243)
  – super generate (see p. 245)
  – see exercises 4.44, 4.45, and 4.46 (will not be assessed)
Multiplication
• More complicated than addition
  – accomplished via shifting and addition
• More time and more area
• Let's look at 3 versions based on gradeschool algorithm
         1000      multiplicand (8ten)
       x 1001      multiplier (9ten)
         ----
         1000      partial products
        0000
       0000
      1000
      -------
      1001000      product (72ten)
    max = (2^4 – 1) * (2^4 – 1) = 225
    225 > 128 → 8 bits
    32 * 32 bits → 64 bits
• Negative numbers: convert and multiply
  – there are better techniques, we won’t look at them
Multiplication: Implementation
Start
1. Test Multiplier0
   – Multiplier0 = 1: 1a. Add multiplicand to product and place the result in Product register
   – Multiplier0 = 0: go directly to step 2
2. Shift the Multiplicand register left 1 bit
3. Shift the Multiplier register right 1 bit
32nd repetition?
   – No: < 32 repetitions, go back to step 1
   – Yes: 32 repetitions, Done
[Figure: datapath with a 64-bit Multiplicand register (shift left), a 32-bit Multiplier register (shift right), a 64-bit ALU, a 64-bit Product register (write), and a control test unit]
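The first version of the algorithm can be followed step by step in a short Python sketch. It assumes unsigned operands; the function name and the `n` parameter (register width of the multiplier) are illustrative.

```python
def multiply_v1(multiplicand: int, multiplier: int, n: int = 32) -> int:
    """Shift-and-add multiplication of two unsigned n-bit numbers."""
    wide_mask = (1 << (2 * n)) - 1      # Multiplicand and Product are 2n bits wide
    product = 0
    for _ in range(n):
        if multiplier & 1:                               # 1. test Multiplier0
            product = (product + multiplicand) & wide_mask   # 1a. add to Product
        multiplicand = (multiplicand << 1) & wide_mask   # 2. shift Multiplicand left
        multiplier >>= 1                                 # 3. shift Multiplier right
    return product

print(multiply_v1(0b1000, 0b1001, 4))   # 72
```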
Second Version
Start
1. Test Multiplier0
   – Multiplier0 = 1: 1a. Add multiplicand to the left half of the product and place the result in the left half of the Product register
   – Multiplier0 = 0: go directly to step 2
2. Shift the Product register right 1 bit
3. Shift the Multiplier register right 1 bit
32nd repetition?
   – No: < 32 repetitions, go back to step 1
   – Yes: 32 repetitions, Done
[Figure: datapath with a 32-bit Multiplicand register, a 32-bit Multiplier register (shift right), a 32-bit ALU, a 64-bit Product register (shift right, write), and a control test unit]
Final Version
Start
1. Test Product0
   – Product0 = 1: 1a. Add multiplicand to the left half of the product and place the result in the left half of the Product register
   – Product0 = 0: go directly to step 2
2. Shift the Product register right 1 bit
32nd repetition?
   – No: < 32 repetitions, go back to step 1
   – Yes: 32 repetitions, Done
[Figure: datapath with a 32-bit Multiplicand register, a 32-bit ALU, a 64-bit Product register (shift right, write), and a control test unit]
• In MIPS:
  • two new dedicated registers for multiplication: Hi and Lo (32 bits each)
  • mult $t1, $t2     # Hi Lo ← $t1 * $t2
  • mfhi $t1          # $t1 ← Hi
  • mflo $t1          # $t1 ← Lo
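A sketch of the final version in Python, assuming unsigned operands: the multiplier starts in the right half of the Product register, and the result is returned split into the two halves that MIPS exposes as Hi and Lo. The function name is illustrative, and Python's unbounded integers stand in for the carry bit that the hardware keeps when the left-half addition overflows.

```python
def multiply_final(multiplicand: int, multiplier: int, n: int = 32):
    """Final-version multiplier; returns (hi, lo) halves of the 2n-bit product."""
    product = multiplier                      # right half initially holds the multiplier
    for _ in range(n):
        if product & 1:                       # 1. test Product0
            product += multiplicand << n      # 1a. add multiplicand to the left half
        product >>= 1                         # 2. shift Product right 1 bit
    return product >> n, product & ((1 << n) - 1)

hi, lo = multiply_final(0xFFFFFFFF, 0xFFFFFFFF)
print(hex(hi), hex(lo))   # 0xfffffffe 0x1
```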
Booth's algorithm (overview)
• Idea: “speed up” multiplication when the multiplier contains a run of 1s:
      0 1 1 1 0 * (multiplicand) =
    + 1 0 0 0 0 * (multiplicand)
    - 0 0 0 1 0 * (multiplicand)
• Looking at the multiplier bits two at a time
  – 00 nothing
  – 01 add (end of the run)
  – 10 subtract (start of the run)
  – 11 nothing (middle of the run of 1s)
• Also works for negative numbers
• For this course: only the basic concepts
• Extended Booth's algorithm
  – scans the multiplier bits 2 at a time
• Advantages:
  – (it used to be thought that a shift is faster than an add)
  – generates half the partial products: half the cycles
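The bit-pair rule can be sketched in Python as follows. The sketch reads the pair as (current bit, bit to its right), with an implicit 0 to the right of bit 0, and treats the multiplier as an n-bit two's complement value; the function name is an assumption.

```python
def booth_multiply(multiplicand: int, multiplier: int, n: int = 32) -> int:
    """Booth multiplication; multiplier is an n-bit two's complement value."""
    m = multiplier & ((1 << n) - 1)       # n-bit pattern of the multiplier
    product = 0
    prev = 0                              # implicit bit to the right of bit 0
    for i in range(n):
        cur = (m >> i) & 1
        if (cur, prev) == (1, 0):         # 10: start of a run of 1s, subtract
            product -= multiplicand << i
        elif (cur, prev) == (0, 1):       # 01: end of a run of 1s, add
            product += multiplicand << i
        prev = cur                        # 00 and 11: nothing to do
    return product

print(booth_multiply(5, 0b01110, 5))   # 70, from one subtract and one add
```

For the multiplier 01110 the loop performs exactly the two operations shown on the slide: subtract 00010 times the multiplicand and add 10000 times the multiplicand.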
Fast generation of partial products
[Figure: array of AND gates that forms all partial products XiYj of the 3-bit operands X2 X1 X0 and Y2 Y1 Y0 in parallel:
    row Y0: X2Y0  X1Y0  X0Y0
    row Y1: X2Y1  X1Y1  X0Y1
    row Y2: X2Y2  X1Y2  X0Y2]
Carry Save Adders (summing partial products)
[Figure: two ways to add four 4-bit operands A, B, E, and F into S (s5 ... s0). Top: a chain of three traditional adders. Bottom: two carry save adders, producing the vectors C' and S', followed by a single traditional adder]
Division
29 ÷ 3:  29 = 3 * Q + R = 3 * 9 + 2
29ten = 011101two, 3ten = 11two

      011101 | 11
       11    +------
       -----   01001      Q = 9
       00101
          11
       -----
          10              R = 2

How can this be implemented in hardware?
Alternative 1: restoring division
• the hardware does not know in advance whether the divisor “will fit or not”
• a register holds the partial remainder
• the sign of the partial remainder is checked
• if negative → restore

    i    operation                partial remainder    quotient bit
    4     29 – 3 * 2^4          = -19                  q4 = 0
         -19 + 3 * 2^4          =  29                  (restoration)
    3     29 – 3 * 2^3          =   5                  q3 = 1
    2      5 – 3 * 2^2          =  -7                  q2 = 0
          -7 + 3 * 2^2          =   5                  (restoration)
    1      5 – 3 * 2^1          =  -1                  q1 = 0
          -1 + 3 * 2^1          =   5                  (restoration)
    0      5 – 3 * 2^0          =   2                  q0 = 1

    R = 2
    q4q3q2q1q0 = 01001 = 9
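The 29 ÷ 3 trace can be reproduced with a short Python sketch of restoring division. It assumes a non-negative dividend, a positive divisor, and a quotient that fits in n bits; the function name is illustrative.

```python
def restoring_divide(dividend: int, divisor: int, n: int = 5):
    """Restoring division; returns (quotient, remainder)."""
    remainder = dividend
    quotient = 0
    for i in range(n - 1, -1, -1):
        remainder -= divisor << i          # trial subtraction of divisor * 2^i
        if remainder < 0:
            remainder += divisor << i      # did not fit: restore
            bit = 0
        else:
            bit = 1
        quotient = (quotient << 1) | bit
    return quotient, remainder

print(restoring_divide(29, 3))   # (9, 2)
```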
Alternative 2: non-restoring division
Rules
    if the partial remainder is > 0     next operation: subtraction
    if the partial remainder is < 0     next operation: addition
    if the current operation is -       qi = 1
    if the current operation is +       qi = -1

    i    operation                partial remainder    next    quotient digit
    4     29 – 3 * 2^4          = -19 < 0              ADD     q4 = 1
    3    -19 + 3 * 2^3          =   5 > 0              SUB     q3 = -1
    2      5 – 3 * 2^2          =  -7 < 0              ADD     q2 = 1
    1     -7 + 3 * 2^1          =  -1 < 0              ADD     q1 = -1
    0     -1 + 3 * 2^0          =   2                          q0 = -1

    goal: R → 0
    Remainder = 2
    Quotient = 1 -1 1 -1 -1 ??
Alternative 2: converting the result
    1 -1 1 -1 -1 → 2^4 - 2^3 + 2^2 - 2^1 - 2^0 = 16 - 8 + 4 - 2 - 1 = 9
    Rule: a digit pair 1 -1 at positions n and n-1 is worth
          2^n - 2^(n-1) = (2 - 1) * 2^(n-1) = 2^(n-1),  so  1 -1 → 0 1
    1 -1 1 -1 -1 → 0 1 0 1 -1 → 0 1 0 0 1 = 9
• Number of additions: 3
• Number of subtractions: 2
• Total: 5
• NOTE: if the remainder is < 0, a correction by one divisor is needed so that the remainder is > 0
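A Python sketch of non-restoring division with signed quotient digits, including the conversion to an ordinary integer and the final correction step. It assumes 0 ≤ dividend < divisor * 2^n; the function name is illustrative, and a partial remainder of exactly 0 is treated as non-negative here, a case the rules on the slide leave open.

```python
def nonrestoring_divide(dividend: int, divisor: int, n: int = 5):
    """Non-restoring division; returns (signed digits, quotient, remainder)."""
    remainder = dividend
    digits = []                                # most significant digit first
    for i in range(n - 1, -1, -1):
        if remainder >= 0:
            remainder -= divisor << i          # subtraction → digit 1
            digits.append(1)
        else:
            remainder += divisor << i          # addition → digit -1
            digits.append(-1)
    # Convert the signed digits to an integer: sum of digit * 2^position.
    quotient = sum(d * (1 << (n - 1 - k)) for k, d in enumerate(digits))
    if remainder < 0:                          # final correction by one divisor
        remainder += divisor
        quotient -= 1
    return digits, quotient, remainder

print(nonrestoring_divide(29, 3))   # ([1, -1, 1, -1, -1], 9, 2)
```

The loop performs one addition or subtraction per quotient digit, five in total for this example, with no restoration steps.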
Comparison of the alternatives
[Figure: partial remainder at each step of the 29 ÷ 3 example. With restoration: 29, -19, 29, 5, -7, 5, -1, 5, 2. Without restoration: 29, -19, 5, -7, -1, 2]
Division hardware: third alternative
Start
1. Shift the Remainder register left 1 bit
2. Subtract the Divisor register from the left half of the Remainder register and place the result in the left half of the Remainder register
Test Remainder
   – Remainder ≥ 0: 3a. Shift the Remainder register to the left, setting the new rightmost bit to 1
   – Remainder < 0: 3b. Restore the original value by adding the Divisor register to the left half of the Remainder register and place the sum in the left half of the Remainder register. Also shift the Remainder register to the left, setting the new rightmost bit to 0
32nd repetition?
   – No: < 32 repetitions, go back to step 2
   – Yes: 32 repetitions. Done. Shift left half of Remainder right 1 bit
[Figure: datapath with a 32-bit Divisor register, a 32-bit ALU, a 64-bit Remainder register (shift left, shift right, write), and a control test unit]
Instructions
• In MIPS:
  • two new dedicated registers for multiplication: Hi and Lo (32 bits each)
  • mult $t1, $t2     # Hi Lo ← $t1 * $t2
  • mfhi $t1          # $t1 ← Hi
  • mflo $t1          # $t1 ← Lo
• For division:
  • div $s2, $s3      # Lo ← $s2 / $s3, Hi ← $s2 mod $s3
  • divu $s2, $s3     # same, for “unsigned” operands
Floating Point
• Goals:
  – represent non-integer numbers
  – extend the representable range (larger or smaller magnitudes)
• Standardized format
    1.XXXXXXXXX ..... * 2^yyy      (in the general case, B^yyy)
• In MIPS:
    | S | exp (8 bits) | mantissa or significand (23 bits) |
  – more exponent bits → wider range
  – more mantissa bits → higher precision
  – sign-magnitude: (-1)^S * F * 2^E
Floating Point and the IEEE 754 standard
• exponent ∈ [-128, 127]
• taking 2^10 ≈ 10^3:
    128 = 8 + 10 * 12
    2^128 = 2^(8 + 10 * 12) = 2^8 * 2^(10 * 12) ≈ 2 * 10^38
• overflow → number > 10^38
• underflow → number < 10^-38
• IEEE 754 STANDARD
    1.XXXXXXXXXXX      (the leading 1 is implicit)
    | S | exp (8 bits) | mantissa or significand (23 bits) |
• mantissa: single precision: 23 bits (+1)
            double precision: 52 bits (+1)
IEEE 754 standard: bias
N = (-1)^S * (1 + Mantissa) * 2^E
To simplify ordering (sorting): BIAS
[Figure: exponent field axis running from 0 to 255, with the bias at 127]
In the standard: bias = 2^(nE - 1) - 1 = 127, with nE the number of exponent bits
EXP = EXPFIELD - BIAS
Example: represent -0.75ten = -(1/2 + 1/4)
    -0.75ten = -0.11two = -1.1two * 2^-1
    mantissa = 1000000 ...... (23 bits)
    exponent field: -1 + 127 = 126ten = 0111 1110two
    | 1 | 0111 1110 | 1000 0000 0000 0000 0000 000 |
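The -0.75 example can be checked against a real single-precision encoding using Python's standard `struct` module. The helper name `float_fields` is an assumption; the field widths (1, 8, and 23 bits) are those given on the slides.

```python
import struct

def float_fields(x: float):
    """Split the IEEE 754 single-precision encoding of x into its three fields."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))   # raw 32-bit pattern
    sign = bits >> 31
    exp_field = (bits >> 23) & 0xFF        # biased exponent
    mantissa = bits & 0x7FFFFF             # 23 stored bits, implicit 1 not included
    return sign, exp_field, mantissa

sign, exp_field, mantissa = float_fields(-0.75)
print(sign, f"{exp_field:08b}", f"{mantissa:023b}")
# 1 01111110 10000000000000000000000
print("EXP =", exp_field - 127)   # EXP = -1
```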
Table of IEEE 754 representation ranges

    Single precision          Double precision          Meaning
    Exponent    Mantissa      Exponent    Mantissa
    0           0             0           0             0
    0           ≠ 0           0           ≠ 0           denormalized number
    1 – 254     any           1 – 2046    any           normalized number
    255         0             2047        0             infinity
    255         ≠ 0           2047        ≠ 0           NaN (not a number)

    Field widths: 8 bits and 23 (+1) bits for single precision; 11 bits and 52 (+1) bits for double precision
Floating-point addition
Start
1. Compare the exponents of the two numbers. Shift the smaller number to the right until its exponent would match the larger exponent
2. Add the significands
3. Normalize the sum, either shifting right and incrementing the exponent or shifting left and decrementing the exponent
Overflow or underflow?
   – Yes: Exception
   – No: continue with step 4
4. Round the significand to the appropriate number of bits
Still normalized?
   – No: go back to step 3
   – Yes: Done
ALU for floating-point addition
[Figure: floating-point adder datapath. The sign, exponent, and significand of both operands enter at the top; a small ALU computes the exponent difference; muxes driven by the control unit select the larger exponent and route the significand of the smaller number through a right shifter; the big ALU adds the significands; the sum is normalized (shift left or right, with the exponent incremented or decremented) and passes through the rounding hardware to give the final sign, exponent, and significand]
Floating-point multiplication
Start
1. Add the biased exponents of the two numbers, subtracting the bias from the sum to get the new biased exponent
2. Multiply the significands
3. Normalize the product if necessary, shifting it right and incrementing the exponent
Overflow or underflow?
   – Yes: Exception
   – No: continue with step 4
4. Round the significand to the appropriate number of bits
Still normalized?
   – No: go back to step 3
   – Yes: continue with step 5
5. Set the sign of the product to positive if the signs of the original operands are the same; if they differ make the sign negative
Done
MIPS floating-point instruction set
Fig. 4.47, p. 291