A Comparative Study of Modulo Scheduling Techniques
Josep Llosa
Computer Architecture Department, Universitat Politècnica de Catalunya-Barcelona
Computer Architecture Department, Universitat Politècnica de Catalunya-Barcelona
[email protected]
[email protected]
[email protected]
ABSTRACT
Categories and Subject Descriptors
C.1.1 [Processor Architectures]: Pipeline processors, RISC/CISC, VLIW architectures
D.3.4 [Programming Languages]: Code generation, Compilers
General Terms
0 R W X Y
Keywords
Antonio González
Josep M. Codina
Computer Architecture Department, Universitat Politècnica de Catalunya-Barcelona
1. INTRODUCTION
"Z.
X
[\Y R ] \ ^ _ ` R a \ Y ^
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. ICS’02, June 22-26, 2002, New York, New York, USA. Copyright 2002 ACM 1-58113-483-5/02/0006 ...$5.00.
X
• Move-then-scheduling [11, 13, 18, 12]
• Schedule-then-move
Schedule-then-move:
• Unroll-based scheduling [1, 2, 22, 3, 24, 4, 8]
• Modulo Scheduling [25]
k
/
l
\ Y R "fK !m !L. 0
"m. X g
n R o
p
p q
• Iterative Modulo Scheduling (IMS) [27].
• Swing Modulo Scheduling (SMS) [20].
• Slack Modulo Scheduling (Slack) [17].
• Integrated Register-Sensitive Iterative Software Pipelining (IRIS) [10].
/
[Figure: Basic scheme for modulo scheduling techniques. Flowchart: find the MII and set II = MII; look for a schedule; found it? If yes, finish; if no, increase the II and look for a schedule again.]
• Stage Scheduling [15].
X
0
X
X \
X o
\ \_ \ g
f
K L
j
2. MODULO SCHEDULING TECHNIQUES
p
R 0
\
2.1 A Basic Modulo Scheduling Scheme
! ^
g
&* ' - +P M (,'
/
% O - , +P M , * ' +P +O - O P +, +N , +% P P,' $ N \ \ [\ \]
\ \ [ \ \] \ \
\ \
X \ \ "fu.
r
o p
g
\ / \ \ g \ \ "fw. "f u !u !w. ^
iterative techniques
X
X X \ g
g
BudgetRatio [27].
X [ \ \]
P &$' N (+P M
, * ' \ \ \
\ \
g n \ \
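The basic scheme of this section (find the MII, set II = MII, look for a schedule, and increase the II when none is found) can be sketched as a short driver loop. This is a minimal illustration rather than any of the compared schedulers: `search_ii`, `try_schedule`, `max_ii` and the toy scheduler are hypothetical names introduced here, and the real techniques differ precisely in what happens inside the "look for a schedule" step.

```python
from typing import Callable, Dict, Optional, Tuple

# Hypothetical representation of a schedule: operation name -> issue cycle.
Schedule = Dict[str, int]


def search_ii(
    mii: int,
    try_schedule: Callable[[int], Optional[Schedule]],
    max_ii: int,
) -> Optional[Tuple[int, Schedule]]:
    """Return the first (II, schedule) found, starting the search at II = MII."""
    ii = mii                          # "Find MII and set II = MII"
    while ii <= max_ii:               # max_ii is an assumed safety bound
        schedule = try_schedule(ii)   # "Look for a schedule"
        if schedule is not None:      # "Found it?" -> yes
            return ii, schedule
        ii += 1                       # "Found it?" -> no: "Increase the II"
    return None


def toy_try_schedule(ii: int) -> Optional[Schedule]:
    """Toy stand-in for a real scheduler: it simply fails for any II below 3."""
    ops = ["a", "b", "c"]
    if ii < len(ops):
        return None
    # Place each operation in its own cycle.
    return {op: cycle for cycle, op in enumerate(ops)}


result = search_ii(mii=2, try_schedule=toy_try_schedule, max_ii=8)
```

Passing the scheduler as a callback keeps the outer loop identical for every technique, which mirrors the way the flowchart separates the II search from the scheduling step itself.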
2.2 Most Relevant Techniques
\
"f !. "j. g
l X L
2.2.1 Iterative Modulo Scheduling
\ [\ ] _ "f u.
\
p g
\ o \
2.2.2 Slack Modulo Scheduling
[ ] p "!u.
0
\
\ ;
\
g
l
2.2.3 Integrated Register-Sensitive Iterative Software Pipelining
\ _
\ [\_ \ ] _ l "!w.
\
[ f f j ]
l
2.2.4 Swing Modulo Scheduling
[ ] Y 0 "fw. \
\
_ \ \ [ ]
_ \ \
g \ \
^
2.2.5 Stage Scheduling
W "!j.
\ \ \
` R _
3. PERFORMANCE STATISTICS
\
3.1 Benchmarks and Compiling Platform
"u. Y W 0 "!Z.
R R W Z j X Y \ "j. 0 !Z Km / !fje R m ue R W Z j ue Z j X
3.2 Architectures
\
/ X X X X
Q % % O ' +, ) $ &* +,' &,- $' X
k
g
' +- O % O ' +, ) $ &* +,' &,- $' X
k
' $N , +% P O 'O +P, &% O +P, +P, O % ( $, &% O
O % ( $, N '
Q % &% O ' +, f ! f m K m !e
Q N ,' P & % , * ' % ' $N , +% P (
X
g
% O ' ) $ &* +,' &, - $' X
X [
]
k
e L !
3.3 Performance Metrics
/ r W
! n
g \ \ \ f R
K
0
n g X
\ \
L
\
r R
X \Y R \
% $
' +- O
&% O ' +, f ! f m K m !e
% O '
$% &' ((% $
N $ &* +,' &, - $ ' (
, * ' ,* $' '
K K L e j e fw
! \ \ \ \
\ \
f \ \
X
K 0 [\ \ \ \ ]
\ \
L n \ \ \\
r _
0 l
\ \
\
\
"fm. \ \ "fe. ^ g
!
k
f
\ \
k
\ \ K n K f
K f
L n mL
mL
j X
m X Y 0 X Y
[
] 0
\ \
X Y 0
"!u. r
X !
k "f u. k f
\ \ \ \
\ \ K X
\ \
X \
g
n
g
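The Sum II / Sum MII ratio used as a metric in the evaluation can be computed as follows. This is a small sketch under the assumption that each loop contributes one achieved II and one MII; the function name and the sample values are hypothetical and are not data from the paper.

```python
from typing import Iterable, Tuple


def sum_ii_over_sum_mii(loops: Iterable[Tuple[int, int]]) -> float:
    """Aggregate ratio over a set of loops given as (II, MII) pairs.

    A value of 1.0 means every loop was scheduled at its MII.
    """
    pairs = list(loops)
    total_ii = sum(ii for ii, _ in pairs)
    total_mii = sum(mii for _, mii in pairs)
    if total_mii == 0:
        raise ValueError("at least one loop with a positive MII is required")
    return total_ii / total_mii


# Illustrative values only: three loops, two of them scheduled above their MII.
ratio = sum_ii_over_sum_mii([(4, 4), (6, 5), (10, 9)])
```

Dividing the sums, rather than averaging per-loop ratios, gives more weight to loops with a large II.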
3.4 Study of the BudgetRatio
\ \_ \
^ _ g
k ^ _ X ^ _ / ! f j j !w \ ^ _
' &,+ ' P ' (( g
\ \
\
# ' $ % $ O N P &' \ \
^ _ [ \ \ ] l k
^ _ \ \ \ \
% (, \ ^ _ \ \ p ^ _ g \ \
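To make the role of the BudgetRatio concrete, the sketch below assumes the usual reading of this parameter in iterative modulo scheduling [27]: the number of operation-scheduling steps allowed at a given II, divided by the number of operations in the loop. The function names, the rounding choice and the loop size are assumptions made for illustration.

```python
import math


def scheduling_budget(num_ops: int, budget_ratio: float) -> int:
    """Scheduling steps allowed at one II before giving up (assumed rounding: floor)."""
    return math.floor(budget_ratio * num_ops)


def budget_exhausted(steps_used: int, num_ops: int, budget_ratio: float) -> bool:
    """True once the scheduler should abandon the current II and try a larger one."""
    return steps_used >= scheduling_budget(num_ops, budget_ratio)


# The four BudgetRatio values studied here, applied to a hypothetical 8-operation loop.
budgets = {ratio: scheduling_budget(8, ratio) for ratio in (1, 2.5, 5, 10)}
```

Under this reading, a BudgetRatio of 1 gives each operation a single scheduling attempt on average, while larger values allow more unscheduling and rescheduling at the cost of longer scheduling time.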
4. COMPARISON OF THE TECHNIQUES
\
g ^ _
K K
4.1 Low Complexity Architecture
f ^ _ X
^ \ \_ \ g ^ _ o ^ _ \ ^ _ !w
l p ^ _ f j f \ \ \ \ 0 ^ _
^ _ f j
\ \ \ \ f ^ _ \ \_ \ ^ _ o \_ \ \ ^ _ X X \ q ^ _ ^ _ ! g fKe \ \ ^ _ f j
g ^ _
j X ^ _ !w
X ^ _ g
^ _ !w
f
\
[Figure: Influence of the BudgetRatio on the iterative techniques (IMS, IRIS, Slack) for the simple architecture: (a) percentage of non-scheduled operations, (b) Sum II / Sum MII, and (c) total scheduling time. Horizontal axis of each panel: BudgetRatio (1, 2.5, 5, 10).]
#
$ # # ' '
! !% "( &)* "( &/*
! !% ( + , &,* .. , &/*
! !% " "&,* " "&-*
! !% ( - "&,* . - 0 &)*
! !% - - &.* -! &( *
1 # # '
2
3
2 3 2 3 8 7 9 : 7 ; < # = 3 # = 5 3
> # 3 # 3 8 # = ' # ? 4 ' 6
! !% ( /) &-* . ) " &/*
" ! &" "% ""0 &)* .0 &"*
" ! &""% (( / &/* ./ / &+*
" "0 -+
2 # 3 4 5 6 7
N ' ' &, + ' P ' (( &% (, N $ &* +,' &, - $'
N $N ' +( O
"+! ", "&! ! + , (.
" "0/ " &!! ! + 0
"+! -! " &!! . -)
"+! ( " " &!! + " ."
(( )! (! ".! (0. ". ) ") ) " &, ,!(
( ") ..,( -. (., ""( ") ) " &/ + "+
. /, // . +, 0( ./! -) "0 . " &./ .(
./( ". .+( / / . -+ -/ "0. "&.( - ,
( .+ "! . ))( + (,/ ".! +! ! " &/0( )
( ! (( ! . / .. , ( +! "! +!! "&- -- .
., + ". .. +!! .," /0 "0 0 "&. 0! .
. // 0, . +,( + ./ . /( "0 0 " &./( .
"! + "( 0 ,+ + +)( . "
"! + "( 0 ,+ + +)( . "
"! . /! 0 ) ,! + )+ +,
"! .( ( 0) -( +) ++ /
"!! +0 0 - )+ )- -)
"! ! +( 0- ) . + ) -- )
"! . )0 0! 0 +) / . "
"! . )0 0! 0 + )/ . "
$' M +(,' $
$ ' ((- $'
g W
[ ] \ \_ \ !Z Km o l !
w ! A n
\ \ \_ \
g g
\
f X
\ /
&% ' (+ @ ' N P ' ' & - , +% P ,+O '
% $ %
&% O ' +,
\ \ \ \ \ \ \ \ \ \ \ \
\ \ \ \ \ \ \ Z !Z Km g \ \ !Z f j J \ \
! A [ \ \ ] n k \ \ f
\ /
\ \ K f
mL
X
X Y 0
[Figure: Influence of the BudgetRatio on the iterative techniques (IMS, IRIS, Slack) for the medium complexity architecture: (a) percentage of non-scheduled operations, (b) Sum II / Sum MII, and (c) total scheduling time. Horizontal axis of each panel: BudgetRatio (1, 2.5, 5, 10).]
\ \
\ \ X Y 0 K f mL
\_ \
\ [
]
\ \_ \ n \ \_ \ f
X \ \ X [ ] \_ \
n
X
X
\ \ X
^
0
l g
4.2 Medium Complexity Architecture
K ^ _ X \ \_ \ g ^ _ ^ _ \ ^ _ !w ^ _ j K \ \ \ \ 0
^ _ p
^ _ f j K
^ _ \ \_ \
[\ ] ^ ^ _
^ _ ^ _ ! j o ^ _ j
X \
K \ \_ \ !Z K m o Z w j A \ K
0 g
[ um Z fwm ] g \
\_ \
\ n
K
X
\ \ \ \
! !e \
\ \ \_ \ K
X
\ \_ \ g
p
n
[ K ]
# # + # , , - ./ 0 1
- , - , - 2 1 34 1 56 7 - 7 / 8 - - 2 7 # 9 . #0
N '
' &,+ ' P ' (( &% (, &% O ' +, N $ &* +,' &, - $'
$% &' !($ &' &* ' $"! )'
%! $' !!$ ' %$ )' $!& *'
%!**) % *& %&" $($* $))(* &!"&" &$)( $"% $%% "% !$ %&* %%" % *$%( % (%!( *(! *(! "&!" "&!" $%*) $%*$
%!*!$ % % %%* $%!%( $%%)% &")" &)&) &%! &( $$ & %% %% % !!* % !$( *(($ *("" "&*% "&($ $*"&* $*"&*
N $N ' +( O
$' M +(,' $
$' ((- $'
&% ' (+ @ ' N P ' ' &- , +% P ,+O '
,* ' O ' +- O
% $
!" !" ( "' )( !' &" $' )%) *' ") !' ! (' (" $' $( %' %!""( %!($ %)$( % % %$& %"& %!* $)"%! $!% $%() $%$&" &$"&) &&"& &%$( &*&* $$( &*$ &$( &%* !) $! $& & %% %% %&( %&( % (&&( % "%** % !($( % !! *!*! *!*) **"& **"& "%($ "%($ "$)" "$)" $$!$ $$!$ !%&&$ !%&&$
[Figure: Influence of the BudgetRatio on the iterative techniques (IMS, IRIS, Slack) for the complex architecture: (a) percentage of non-scheduled operations, (b) Sum II / Sum MII, and (c) total scheduling time. Horizontal axis of each panel: BudgetRatio (1, 2.5, 5, 10).]
K \_ \
o
X [ ] g
\ \
X \ \
[
] \
0 g
p g l
4.3 Complex Architecture
L ^ _ 0 \ \_ \ g ^ _ 0 X ^ _
'$
[ ! ^ _ !w ] 0 ^ _ f j \ \ \ [ L ]
^ ^ _ f j \ \ _ [ L ] \_ \
\ X ^ _ [ew w
^ _ !w ] ^ _ f j
\
L \ \_ \ !Z Km o fw ! we A \
# # + # , , - ./ 0 1
- , - , - 2 1 34 1 56 7 - 7 / 8 - - 2 7 # 9 . #0
:
N ' ' &, + ' P ' (( N $ &* +,' &, - $'
&% (,
N $ N ' +( O
$!%& '$(!$& %!*& $()! &
)$!"& '* ! & "! & $%*!'& "' % *'% ( (* !$' !' !' ) " (' ))) ( )$( $* ' $*'' $'') $%($( $)%" )"$ )$* $%* $% )$'% (* () $*% '% ') $* $$( " ' * " ( $' ' *" *" $ *) !%*' !''( !$' !$() !' "% !)) * *%"( *%% *$ ' *$ * '(' '(' '" '%$ ' %$ ' %$ $"**) $"**" ($ * ($ ' $*"' $*"* $' M +(,' $
$' ((- $'
g L \
\ \
\ g \ \ \ \
fe !
\ \
\_ \ l K A \ L
\ \_ \ g L \_ \
\
X [ ] g p g
[ ] o
\ \ \ J
J \_ \
5. CONCLUSIONS
)'!'& (* !(& )!& (*!&
X
X
q
\
/ \ [\ ] [ ] [ ] \ _
\
[\_ \ ] \
!" !" $")!(& )%(!& !'& $ !$& $*"" $* ( % $ ( !$)" *%'" '$)' $*)'$
&% ' (+ @ ' N P ' ' &- , +% P , +O '
"'' !)'( ($" $%* $" % ') ( !$() *%'* '$)' $*)'$ % $
, * ' &% O '
l R ^ R W Z j
[ ] \ \ X \
X \ \ \ p \
[ ] 0
[\ \_ \ ]
o
6. ACKNOWLEDGMENTS
W R _ \ l p 0 o W ` [W R fL ZL f ]
W ` [ W W _ ] \ fw w !w ZZ j w fw ! _ fww ! \ w w mmL ` R 0 R \n 0
7. REFERENCES
"!. 0 0 0 n o Y R \ 2 89: V< ?> 2 C @9 7y V 9 7 2 89T 8EU U 37 T C E 7 T I ET 5 5; 3T 7 E 7 S > U4 65U 5 7 H E H3 9 7 K we K w !u !Z ee "f. 0 0 0 n 2 58y 5: H 2 34 56 37 37 T 5v C 994 2 E8E6 656 3 E H3 9 7 5:F 7 3 I 5 K ww C 5: HI 85 9 H 5; 37 @9U4 I H 58 ? : 3 5 7 :5 f f ! fK j !Ze e "K. 0 0 0 n 0 _
_ R 0 Sd E 7 :5; 37 C E 7 T I ET 5; E 7 S @9U4 3 658; y 98 2 E8E6 656 2 89:5;; 37 T fuL fZw !Z Z ! "L. 0 ` _ R R \ 2 89: V< 9y H F 5 H F 7 7 I E6 > 7 H 58 7 E H3 9 7 E6 ? U4 9; 3I U 9 7 c 3 :89E8:F 3H 5: HI 85V n !Z Z j "j.
0
\ 2 89: V HF 7 7 I E6 ? U4 9; 3I U 9 7 2 8 37 : 34 65; 9y 2 89T 8EU U 37 T C E 7 T I ET 5; !Ze K "m. W _ 0 _ o W \ > 7 H V 9 I 8 7 E6 9y 2 E8E6 656 2 89T 8EU U 37 T fm [K ] K !K KLL !Ze m "u. W 0 ^ 0 Y Y R _ _ \ / 0 _ \Y R \ > 7 H V @9 7y 585 7 :5 9 7 ? I4 58:9U4 I H37 T !Z Zm "e. ^ _ W q \ ` 0 \ 2 89: V< 9y H F 5 H F 7 7 I E6 > 7 H 58 7 E H3 9 7 E6 ? U4 9; 3I U 9 7 c 3 :89E8:F 3H 5: HI 85V n !Z Z j
"Z. 0 0 0 g 0 R
/ 0 0 R !fw ^ R !mL @9U4 I H 58 !L [Z ] /!e f u !Ze ! "!w. 0 _ _ l _
R \ 2 89: V 9y H F 5 c 58T 5S H F > 7 H 58 7 E H3 9 7 E6 2 E8E6 656 2 89:5;; 37 T ? U4 9; 3I U E 7 S H F > 7 H 58 7 E H3 9 7 E6 ? U4 9; 3I U 9 7 2 E8E6 656 E 7 S 3 ; H 8 3 i I H 5S ? ; H 5U ; !ZZ e "!!. W 0 R Y \ 2 89: V< 9y H F 5 H F c 3 :894 89T 8EU U 37 T D98x ;F 94 sc > @= G t m Z uZ !Z e u "!f. W 0 W q _ Y \ R
\ 2 89: V< H F 7 7 I E6 > 7 H 58 7 E H3 9 7 E6 ? U4 9; 3I U 9 7 c 3 :89E8:F 3H 5: HI 85 !ZZ f
"!j. 0 W W W / 0 _ _
_ \ H F 7 7 I E6 > A @ c > 7 H 58 7 E H3 9 7 E6 ? U4 9; 3I U 9 7 c 3 :89E8:F 3H 5: HI 85 sc 3 :89 G t !ZZ j "!m. _ l W 0 _
_ _ R \ 2 89: V 9y H F 5 H F 7 7 I E6 > 7 H 58 7 E H V ? U4 V 9 7 c 3 :89E8:F 3H 5: HI 85 sc > @= G t e j ZL n !Z ZL
"!u. _ 0 p Y \ 2 89:; V 9y H F 5 @ c ?> 2 C @9 7y V 9 7 2 89T 8EU U 37 T C E 7 T I ET 5; 5; 3T 7 E 7 S > U4 65U 5 7 H E H3 9 7 !Z ZK
"!e. / 0 n R R \ 2 89: V 9y H F 5 @ c ?> 2 C @9 7y 585 7 :5 9 7 2 89T 8EU U 37 T C E 7 T I ET 5 5; 3T 7 E 7 S > U4 65U 5 7 H E H3 9 7 f !Z f fe !Z Z ! "!Z. n F 5 C 2 6E Hy 98U 9y @9U i 37 E H 98 3 E6 E 7 S 59U 5 H 8 3 : @9U4 I H37 T ` !Z ZZ "fw.
"f !. Sd E 7 :5S @9U4 3 658 5; 3T 7 E 7 S > U4 65U 5 7 H E H3 9 7 !ZZ u "f f. 0 n ` R
W X o R \ 2 89: V< 9y H F 5 > 7 H V @9 7y V 9 7 2 E8E6 656 2 89:5;; 37 T 0 !Z e j "fK.
"!K.
W
n 0 n R Y ` ^ Y \ 0 \ 2 89: V< 9y ? 5:9 7 S D98x ;F 94 9 7 C E 7 T I ET 5; E 7 S @9U4 3 658; y 98 2 E8E6 656 @9U4 I H37 T f !K ffZ !Z eZ
"!L. 0 W W 0 o _
_ \ 2 89: V< > 7 H 58 7 E H V @9 7y V 7 ? I4 58:9U4 I H37 T K ! L w !Z Z j
W 0 / 0 Y 0 \ 2 89: V 9y H F 5 > 7 H 58 7 E H3 9 7 E6 @9 7y 585 7 :5 9 7 2 E8E6 656 8:F 3H 5: HI 85; E 7 S @9U4 3 6E H3 9 7 5:F 7 3 I 5; !Z Zm
Y 0
n _ 0 n _
0 R \ @9 7y V = 5: V 9y H F 5 v 5 7 H3 5 H F 7 7 V @ c ?> 2 C G ?> @
? U4 V 9 7 2 8 37 : 34 65; 9y 2 89T 8EU U 37 T C E 7 T I ET 5; fZ L f !ZZ K
"fL. _ l 0 W q R
Y \ 2 89: V< 9y H F 5 HF 7 7 I E6 > 7 H 58 7 E H3 9 7 E6 ? U4 9; 3I U 9 7 c 3 :89E8:F 3H 5: HI 85V f ! ! !Z Z K "f j. ^ _ W p 0 p R g \ 2 89: V 9y H F 5 H F 7 7 I E6 c 3 :894 89T 8EU U 37 T D98x ;F 94 !e K !Z u o !Z e ! "fm. ^ _ Y R R _
0 R Y \ 2 89: V 9y H F 5 @ c ?> 2 C @9 7y 585 7 :5 9 7 2 89T 8EU U 37 T C E 7 T I ET 5 5; 3T 7 E 7 S > U4 65U 5 7 H E H3 9 7 feK fZ Z !Z Z f "f u. ^ _ _ \ / 0 0 R Y \ H F 7 7 I E6 > A @c > 7 H 58 7 E H3 9 7 E6 ? U4 9; 3I U 9 7 c 3 :89E8:F 3H 5: HI 85 sc 3 :89 G t !ZZL "fe.
Y W 0 \ R Y \ 2 89:; V 9y H F 5 2 89T 8EU U 37 T C E 7 T I ET 5; 5; 3T 7 E 7 S > U4 65U 5 7 H E H3 9 7 s2 C > t fww w