
In 5th International Conference on ``Modeling Techniques and Tools for Computer Performance Evaluation'', Turin 1991

Grammar-based Workload Modeling of Communication Systems 1

Winfried Dulz Stefan Hofmann

2

Institute for Mathematical Machines and Data Processing
Friedrich-Alexander-University Erlangen-Nürnberg
Chair for Computer Architecture and Performance Evaluation

Abstract

Profound results in the performance evaluation of communication systems can only be achieved if an abstract model of the system as well as of the system's workload is given. Most techniques for workload characterization of communication systems either use stochastic descriptions based on characteristic performance parameters or represent the workload demands by deterministic sequences of service primitives. In this paper we present a new approach to describing workload models by means of attributed grammars: productions specify the syntax of protocol data units, whereas attribute functions, as part of semantic rules, formalize the influence of characteristic workload parameters, such as packet length, interarrival times or timeouts. Proceeding from this basic model we built a grammatical inference system for the automated construction of proper attributed grammars and a syntax-driven workload generator which can easily be adjusted to any performance modeling environment running under the UNIX operating system. We also show how attributed grammars can be used to emulate other workload characterization techniques, such as compound Poisson arrivals and train models.

1 Introduction and Overview

Hierarchical structuring of communication systems based on the ISO/OSI reference model for open systems interconnection is a generally accepted method to construct complex protocol architectures and distributed systems. For the purpose of performance evaluation, a hierarchically structured communication system must be transformed into an analytic or simulation model, where each layer corresponds to a submodel.

1 This work is supported in part by the Deutsche Forschungsgemeinschaft (German Research Council) in the scope of the Sonderforschungsbereich 182 "Multiprocessor and Network Configurations", connected to the project B3 "Design, Implementation and Performance Evaluation of Communication Services".
2 Philips Kommunikations Industrie AG, Nürnberg

Most performance evaluation tools, such as RESQ [SM84], QNAP [VP84] or QUARTZ [DC88], allow hierarchical structuring by means of submodels, but it is very cumbersome to link workload and system models for each layer. A more general modeling approach has influenced the HIT environment described in [BMW88]. Using a hierarchy of cooperating abstract machines it is possible to request services offered on the common interface between adjacent layers. By means of connection elements all workload demands can be linked to specific machine services. The advantages of an explicit separation of workload and system models are discussed in [Her89], referring to parallel systems modeling. Most of the arguments also apply to the performance study of hierarchical protocol architectures, such as

Model libraries for workload and system components may be constructed independently layer by layer;

Variations of different workload/system configurations and performance comparison studies are much easier to perform;

Appropriate techniques for workload as well as for system modeling may be applied.

With reference to general workload characterization methods, the most popular approaches, i.e. instruction mixes, kernel programs, benchmarks, and synthetic jobs, are discussed in the monograph [Fer78]. Each technique, however, has a number of shortcomings and limitations with respect to representativeness, reproducibility, flexibility, system independence, compactness, and, last but not least, construction costs. Particular aspects of the design and construction of executable workload models may also be found in the description of TEL, a versatile tool for emulating system workload [CS88].

Recently, special workload models that describe arrival patterns for communication systems have become of particular interest [JR86] [Rol87] [Gih87]. Interactive network traffic may be modeled by compound Poisson arrivals that consist of exponentially distributed bursts, each of which has a random batch size. This model treats packets as black boxes and does not distinguish between packets coming from different sources or going to different destinations. Therefore, it is not suited to describing interdependencies between successive packets, as is necessary for modeling file transfer transactions. In this case the so-called train model may be used, which consists of a sequence of packets travelling between a given pair of network stations. Here, the interarrival times of individual packets belonging to the same train are much shorter than the intertrain intervals. Nevertheless, it is not possible to describe a concrete sequence of packets for opening, transferring, and closing connection-oriented data, because within the scope of the train model only stochastic descriptions can be applied. For that reason a hierarchical workload concept for simulation and performance evaluation of a high-speed LAN controller was presented in [KR89]. Besides specifying a concrete application and its traffic characteristics, e.g. terminal sessions or file transfer transactions, by using the train model, it is also possible to describe deterministic sequences of different packet types, such as data, acknowledgement, opening and closing of connections. On the other hand, error handling or stochastic effects due to overload situations must be neglected.

In this paper we present a new approach to describing workload models for communication systems by means of attributed grammars. This technique is also suited to characterizing workload in other application areas. The basic idea is that productions specify the syntax of single protocol data units and their interdependencies in the form of character strings, whereas attribute functions, as part of semantic rules, formalize the influence of characteristic workload parameters, such as packet length, interarrival times or timeouts. Therefore, both aspects that are necessary to study the performance of communication systems can be treated, namely interdependencies between successive protocol data units with respect to special applications as well as stochastic effects that influence the protocol performance at a given layer.

In the next section our basic model and the concept of attributed grammars are introduced. After that we explain our general approach and the tool box used to obtain a syntax-driven workload generator starting from measured event traces. In section four we discuss the grammatical inference of proper attributed grammars and also show how attributed grammars allow the emulation of compound Poisson arrivals. We then present the technique of grammar-based workload generation using a weighted push-down automaton, and a simple file transfer example demonstrates how a train model can be extended with respect to error recovery. Finally, some implementation aspects are discussed, and a summary of our present and future activities concludes the paper.

2 Basic Model

Any distributed system may be described by the interactions of concurrent processes which perform asynchronous actions. By means of local or distributed measurement tools it is possible to record significant event traces and to study the system behavior as well as its performance properties. Local event traces contain entries such as process names, user identifications, processing times, memory requirements, or I/O activities. Distributed event traces, which are recorded by a network monitor, consist of PDUs (Protocol Data Units) which possess characteristic attributes, such as interarrival times, packet length, protocol type, source and destination addresses.

The key assumption is that distributed event traces are composed of basic workload primitives of distributed applications which belong to the terminal alphabets of concurrent attributed grammars. Each grammar $G_i$ ($1 \le i \le N$) generates event strings $\omega_{ik}$ belonging to the language $L(G_i)$ as well as attribute values for each event, as indicated in the distributed event trace of Fig. 1.

The class of attributed grammars was first introduced in [Knu71] to formalize semantic aspects of context-free programming languages. This approach allows the "meaning" of character strings to be calculated using attributes of nonterminal symbols while parsing a given string according to the grammar rules. In this paper we associate the term "meaning" with the performance properties that are necessary to evaluate the performance of a system under study.
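The key assumption above, that a monitor sees the interleaving of event strings emitted by several sources, can be illustrated with a small sketch. The host names, event names and timestamps below are hypothetical illustration data, and `merge_traces` and `project` are helper names introduced here, not part of the tool box described in this paper.

```python
import heapq

def merge_traces(per_host):
    """Merge per-host event lists (each sorted by timestamp) into one
    distributed trace of (timestamp, host, event) records."""
    streams = [
        [(t, host, name) for t, name in events]
        for host, events in per_host.items()
    ]
    # heapq.merge keeps the global timestamp order of the sorted inputs.
    return list(heapq.merge(*streams))

def project(trace, host):
    """Recover the event string of a single host from the shuffled trace."""
    return [name for _, h, name in trace if h == host]

# Hypothetical event strings of two hosts with made-up timestamps.
hosts = {
    "host1": [(0, "open"), (255, "data"), (1419, "data")],
    "host2": [(151, "ack"), (1270, "ack")],
}
trace = merge_traces(hosts)
```

Projecting the merged trace back onto one host yields that host's event string again, which is the separation a grammar-per-source description relies on.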

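[Figure 1 (diagram): Host 1 to Host N, each described by an attributed grammar $G_1, \ldots, G_N$. Grammar $G_1$ emits event strings $\omega_{1i}$ of the language $L(G_1)$, grammar $G_N$ emits event strings $\omega_{Nj}$ of the language $L(G_N)$. The shuffled event strings $\omega_{ik}$, as seen by a monitoring system, form the distributed event trace.]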

Figure 1: Basic model for grammar-based workload description

Our approach to characterizing workload for communication systems by means of attributed grammars has several advantages over other modeling techniques:

1. It is possible to describe sequences of PDUs depending on each other, e.g. data followed by acknowledgements, as well as stochastic aspects due to error and overload situations.

2. Attribute functions formalize workload characteristics, such as interarrival times, packet length, communication delays, or timeouts, which are needed to evaluate the performance of different protocols.

3. Nonterminal symbols represent properties of higher layers, e.g. transport, session, or application protocols. Therefore, the influence of particular performance parameters, such as the window size, can be studied for each single layer.

4. Grammatical inference techniques, which are discussed in [Fu82], enable the construction of proper attributed grammars by analyzing measured event traces of distributed applications.

5. Attributed grammars allow a compact and generalized representation of measured event traces and the construction of a syntax-driven workload generator which supplies simulation experiments with input stimuli.

In the next section we discuss our general approach, referring to special tools which are useful in the course of grammar-based workload modeling and generation.

3 A Tool Box for Grammar-based Performance Studies

Performance studies in the scope of the research project "Design, Implementation and Performance Evaluation of Communication Services" have shown that workload modeling and generation must be supported by sophisticated tools to obtain proper results. First experiences in designing a workload generator for evaluating hierarchical models of communication systems by means of QNAP2 are described in [Nik88] and [DN89]. A more general approach, which is based on attributed grammars and which aims at supporting all stages of workload characterization, is shown in Fig. 2. In detail, the following tasks must be performed:

[Figure 2 (diagram): a chain of stages consisting of monitoring of event traces; preprocessing and reducing of event traces; grammatical inference of proper grammars and inference of attribute functions; attributed parser construction; grammar-based workload generation; performance evaluation. The stages are grouped under the labels WORKLOAD MODELING and WORKLOAD GENERATION.]

Figure 2: Stages of grammar-based workload modeling and generation

1. By means of the ZM4 monitoring system [HKL+ 88] or similar measurement techniques it is possible to record local and distributed event traces of a given distributed application. In order to evaluate event traces recorded at different network stations in a global context, ZM4 provides a global monitor timebase with a resolution of 400 ns.

2. To access and preprocess event traces of arbitrary structure, format, and representation, TDL/POET is used [Moh89]. The tool consists of two components: POET (Problem Oriented Event Trace Interface) is a monitor-independent function library which gives access to single event records by using a key file; TDL (Trace Description Language) enables the description of arbitrary event traces, and the key file is produced by means of a TDL compiler. The preprocessed and reduced event trace, indicated in Table 1, is piped into two succeeding inference stages.

Event type   Arrival time   Sender#   Receiver#   Packet size
open                    0        12          31            64
ack                   151        31          12            64
data                  255        12          31          1042
ack                  1270        31          12            64
data                 1419        12          31           253
nak                  3543        31          12            64
data                 3694        12          31           253
ack                  5816        31          12            64
close                6014        12          31            64
ack                  6166        31          12            64

Table 1: Example of a reduced event trace

3. The sequence of all event names found in the first column of the reduced event trace shown in Table 1 may be interpreted as a terminal character string produced by an unknown grammar. Grammatical inference techniques, which were developed to reconstruct grammars from structurally complete samples, will be discussed in the next section. On the other hand, the attribute columns describe semantic aspects and are used to infer attribute functions by means of standard statistical techniques, e.g. histograms, empirical distribution functions or statistical goodness-of-fit tests.

4. Resulting from the separate evaluation of the syntactic and semantic aspects of the reduced event trace, an attributed parser represents a generalized grammar-based workload model. While recognizing or rejecting a given event sequence, the parser calculates attribute values using the inferred attribute functions. We will demonstrate the calculation of a sequence's length and of the mean event interarrival time in the next section.

5. Based on the parser and its grammar, which may be seen as a kind of knowledge base, a weighted push-down automaton is constructed to generate sequences of basic workload components; weights and predicates can be defined to control the generation process. We will discuss some aspects of weighted push-down automata in section five, and an instructive example explains how to extend the train model with respect to error recovery. On the other hand, it is also possible to construct the automaton with the aid of a special specification language, without using automated inference mechanisms. A tool named SLOGEN (Syntactical LOad GENerator Compiler) transforms an abstract workload description based on attributed grammars, including the definition of weights and predicates, into an executable workload generator [Hof90].

6. In the last step, the push-down automaton must be adapted to the performance evaluation environment. In our case, we use IGEN (Interface GENerator Compiler) to combine HIT models [BMW88] with one or more workload generator components. A complete presentation of this procedure may also be found in [Hof90].

In the following section the concept of grammatical inference is explained, which was originally introduced to describe complex physical patterns and data structures.
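As a rough illustration of step 3, the attribute columns of Table 1 can be reduced to simple statistics. The sketch below uses only the values printed in Table 1; the function names are introduced here for illustration, and the time unit of the arrival column is not stated in the table, so none is assumed.

```python
from statistics import mean

# Columns copied from Table 1.
events = ["open", "ack", "data", "ack", "data",
          "nak", "data", "ack", "close", "ack"]
arrival_times = [0, 151, 255, 1270, 1419, 3543, 3694, 5816, 6014, 6166]
packet_sizes = [64, 64, 1042, 64, 253, 64, 253, 64, 64, 64]

def interarrival_times(arrivals):
    """Differences between successive arrival times."""
    return [b - a for a, b in zip(arrivals, arrivals[1:])]

def empirical_cdf(values):
    """Return F with F(x) = fraction of observations that are <= x."""
    ordered = sorted(values)
    n = len(ordered)
    return lambda x: sum(v <= x for v in ordered) / n

gaps = interarrival_times(arrival_times)
mean_gap = mean(gaps)            # estimate of the mean interarrival time
arrival_rate = 1 / mean_gap      # estimate of the mean arrival rate
cdf = empirical_cdf(gaps)        # empirical distribution function
```

With only ten records the estimates are of course illustrative; a real inference stage would work on much longer traces and would add a goodness-of-fit test before committing to a distribution.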

4 Grammatical Inference of Attributed Grammars

In order to model a class of event traces under study more realistically, the grammar should be inferred directly from a set of sample event sequences. For attributed grammars we must distinguish between the inference of rewriting rules, which covers syntactical aspects, and the inference of attribute functions, which describe characteristic performance parameters. Informally speaking, the problem of grammatical inference is concerned with techniques that can be used to construct an unknown source grammar $G_s$ based on a finite set $S_n$ of event traces $\omega_1, \ldots, \omega_n$ from $L(G_s)$ as shown in Fig. 1.

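[Figure 3 (diagram): an unknown source grammar $G_s$ produces the sample $S_n = \{\omega_1, \omega_2, \ldots, \omega_n\}$, which is fed into the inference algorithm; the result is the inferred grammar $G_n$.]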

Figure 3: Basic structure of a grammatical inference system

The inferred grammar is a set of rewriting rules for describing the sample $S_n$ as well as other strings which, in some sense, are of the same nature as the given sample set. In addition, a measure of goodness of the inferred grammar $G_n$ can be defined, such as the complexity of the inferred production set or the deviation of $L(G_n)$ from the sample $S_n$.

4.1 K-Tail Inference of Regular Grammars

When constructing workload models for communication systems it is natural to assume that the resulting grammar belongs to the class of regular grammars, because most protocols are designed by means of cooperating finite-state automata. We therefore decided to develop an inference module based on the k-tail technique for regular languages described in [Fu82]. Starting from the formal derivative $D_a S_n$ of a sample $S_n$ with respect to an event $a$ occurring in $S_n$, defined as

\[ D_a S_n := \{ \beta \mid a\beta \in S_n \} \]

which can easily be extended to a string $\alpha = a_1 \ldots a_k$ according to

\[ D_\alpha S_n := D_{a_k}(D_{a_{k-1}}( \ldots (D_{a_2}(D_{a_1} S_n)) \ldots )) \]

it is possible to construct the canonical derivative finite-state grammar $G_{CD} = (N, T, S, P)$ associated with the sample $S_n = \{\omega_1, \ldots, \omega_n\}$ in the following way:

1. Let $U = \{U_1, \ldots, U_m\}$ be the distinct derivatives of $S_n$ not equal to the empty word $\varepsilon$ or the empty set $\emptyset$. Let $U_1 = D_\varepsilon S_n = S_n$.

2. The set of all nonterminal symbols equals $N = U$.

3. The set of all distinct event symbols found in the traces of $S_n$ forms the set $T$.

4. The production set $P$ is defined as follows: let $U_i, U_j \in N$, then
   $U_i \to a U_j$ if and only if $D_a U_i = U_j$,
   $U_i \to a$ if and only if $\varepsilon \in D_a U_i$.

5. $S = U_1$ is the starting symbol of $G_{CD}$.

The distinct derivatives $U_i$ of the resulting canonical derivative grammar $G_{CD}$ define an equivalence relation that merges event traces with equal prefixes into the same equivalence classes. Moreover, $G_{CD}$ generates exactly the given sample, i.e. $L(G_{CD}) = S_n$.

Another way in which an admissible class of grammars can be defined is to use the idea of k-tails, which define equivalence classes on the states of the canonical derivative finite-state grammar starting from a given sample. Given $S_n$, $T$, $D_\alpha S_n$, $U$, $U_i$, $U_j$ from above,

\[ t(\alpha, S_n, k) := \{ \beta \mid \beta \in D_\alpha S_n \wedge 0 \le |\beta| \le k \} \]

denotes the k-tail of $S_n$ with respect to $\alpha$, where $|\beta|$ is the length of $\beta$. Let $U_i$ and $U_j$ be two distinct states of $G_{CD}$ which are associated with the derivatives $D_{\alpha_i} S_n$ and $D_{\alpha_j} S_n$, respectively, where $\alpha_i$ and $\alpha_j$ are words belonging to $T^*$, the free monoid over $T$. $U_i$ and $U_j$ are said to be k-tail-equivalent if and only if

\[ t(\alpha_i, S_n, k) = t(\alpha_j, S_n, k) \]

Further,

\[ [D_\alpha S_n]_k := \{ D_\beta S_n \mid \beta \in T^* \wedge t(\beta, S_n, k) = t(\alpha, S_n, k) \} \]

is called the k-tail equivalence class with respect to $D_\alpha S_n$, and

\[ U_k := \{ U_{i_k} \mid U_{i_k} \in [U_i]_k \wedge U_i \in U \text{ for } 0 < i \le m \} \]

denotes the k-tail quotient set of $U$. By means of the definitions above it is possible to construct the k-tail grammar $G_{CD,k} = (N_k, T_k, S_k, P_k)$ associated with the sample $S_n = \{\omega_1, \ldots, \omega_n\}$ in the following way:

1. Let $G_{CD} = (N, T, S, P)$ be the canonical derivative grammar and $\mathcal{N}_k$ be the k-tail quotient set of $N$.

2. The set of all nonterminal symbols equals $N_k = \mathcal{N}_k$.

3. $T_k = T$.

4. The production set $P_k$ is defined as follows:
   $[U_i]_k \to a [U_j]_k \in P_k$ if and only if $U_i \to a U_j \in P$,
   $[U_i]_k \to a \in P_k$ if and only if $U_i \to a \in P$.

5. $S_k = [S]_k$ is the starting symbol of $G_{CD,k}$.

This method of obtaining a grammar that roughly approximates $S_n$ is easy to program, as shown in [Dau90], and allows the derivation of grammars which are compatible with the sample $S_n$, i.e.

\[ L(G_{CD,k}) \supseteq S_n \]

Furthermore, it is interesting to note that

\[ L(G_{CD,k}) \supseteq L(G_{CD,k+1}) \quad \text{for } k \ge 0 \]
\[ L(G_{CD,K}) = S_n \quad \text{for } K \ge \max_i |\omega_i| \]
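The two constructions above can be sketched compactly. The following is a minimal illustration, not the inference module of [Dau90]: event traces are assumed to be tuples of event names, a k-tail equivalence class is represented by the k-tail itself, and the function names are introduced here for illustration only.

```python
def derivative(sample, prefix):
    """D_prefix S: all suffixes beta such that prefix + beta is in the sample."""
    n = len(prefix)
    return frozenset(w[n:] for w in sample if w[:n] == prefix)

def k_tail(deriv, k):
    """Members of a derivative whose length is at most k."""
    return frozenset(b for b in deriv if len(b) <= k)

def infer_k_tail_grammar(sample, k):
    """Return (start, productions) of the k-tail grammar.
    A production is (class, event, class), or (class, event, None)
    for a terminating rule U -> a."""
    sample = frozenset(tuple(w) for w in sample)
    # Only prefixes of sample words yield non-empty derivatives.
    prefixes = {w[:i] for w in sample for i in range(len(w) + 1)}
    derivs = {derivative(sample, p) for p in prefixes}
    # States: distinct derivatives other than the empty set and {epsilon}.
    states = {d for d in derivs if d and d != frozenset({()})}
    cls = {d: k_tail(d, k) for d in states}   # equal k-tails merge states
    productions = set()
    for d in states:
        for a in {b[0] for b in d if b}:
            nxt = frozenset(b[1:] for b in d if b and b[0] == a)  # D_a U_i
            if nxt in states:
                productions.add((cls[d], a, cls[nxt]))
            if () in nxt:
                productions.add((cls[d], a, None))
    return cls[derivative(sample, ())], productions

def accepts(start, productions, word):
    """Check membership by simulating all reachable classes in parallel."""
    current = {start}
    for i, a in enumerate(word):
        if i == len(word) - 1 and any(
            s in current and x == a and t is None for s, x, t in productions
        ):
            return True
        current = {t for s, x, t in productions
                   if s in current and x == a and t is not None}
    return False

sample = [("a", "b"), ("a", "b", "a", "b")]
start1, rules1 = infer_k_tail_grammar(sample, 1)   # generalizing grammar
start4, rules4 = infer_k_tail_grammar(sample, 4)   # k >= longest trace
```

For this sample, k = 1 merges the states reached after `a` and after `aba`, so the inferred grammar also accepts longer repetitions of `ab`, while k = 4 reproduces the sample exactly, in line with the inclusion chain stated above.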

We have demonstrated how to construct simple regular k-tail grammars which generalize given event traces and are able to generate event sequences for evaluation purposes. We now discuss techniques which allow the derivation of attribute values by means of attribute functions, in order to describe the influence of characteristic performance parameters.

4.2 Attributed Grammars for Workload Characterization

When analyzing a reduced event trace as shown in Table 1 we must clarify which semantic rules can be obtained from the attribute columns and how these rules are combined with the inferred grammar. Given a vector of interarrival times $t_A = (t_{A_1}, t_{A_2}, \ldots, t_{A_n})$ corresponding to the entries of the event column, it is possible to derive an estimate for the expected interarrival time $\bar{t}_A$ or the mean arrival rate $\lambda$ by means of

\[ \bar{t}_A := \frac{1}{n} \sum_{i=1}^{n} t_{A_i} \]

and

\[ \lambda := \frac{1}{\bar{t}_A} = \frac{n}{\sum_{i=1}^{n} t_{A_i}} \]

respectively. On the other hand, using a vector $t_B = (t_{B_1}, t_{B_2}, \ldots, t_{B_n})$ of service times, the mean values for the service time and the service rate can also be calculated by

\[ \bar{t}_B := \frac{1}{n} \sum_{i=1}^{n} t_{B_i} \]

and

\[ \mu := \frac{1}{\bar{t}_B} = \frac{n}{\sum_{i=1}^{n} t_{B_i}} \]

If the reduced event trace contains both attribute columns, the traffic intensity

\[ \rho := \frac{\lambda}{\mu} \]

can also be derived. The question is now how to calculate a certain attribute value, assuming that we can specify attribute values for each event of the trace. The answer is attributed grammars. A 4-tuple $G_A = (G, A, R, D)$ with $G = (N, T, S, P)$ a regular grammar as defined above, $A$ a set of attributes, $R$ a set of rules, and $D$ an interpretation is called an attributed grammar. Each semantic rule consists of elements of the form

\[ r_{p,a,i}: \; a_i = f_{p,a,i}(b_{j_1}, b_{j_2}, \ldots, b_{j_k}) \]

where $(p: \nu_0 \to \nu_1 \ldots \nu_{l_p}) \in P$, $\nu_n \in N \cup T$ for $1 \le n \le l_p$, $i, j_1, \ldots, j_k \in \{0, \ldots, l_p\}$, $a_i \in ATT(\nu_i)$, and $b_{j_n} \in ATT(\nu_{j_n})$ for $0 < n \le k$. The function $ATT$ attaches to each symbol $\nu_i$ its attributes $ATT(\nu_i) \in A$. The meaning of the rules belonging to $R$ is defined as follows: considering a production $p \in P$, the attribute $a$ of the symbol at position $i$ is defined by the function $f_{p,a,i}$. This function uses the values of the attributes at positions $j_1$ to $j_k$ as arguments. Here, position means numbering all symbols of a given production from left to right, starting with the value 0. The interpretation $D$ consists of

a domain $D_a$ for each attribute $a \in ATT$ and

a function

\[ f^D_{p,a,i}: (D_{b_{j_1}} \times \ldots \times D_{b_{j_k}}) \to D_{a_i} \]

for each function symbol $f_{p,a,i} \in R$.

Attribute name           Mark            Classification
Interarrival time        $t_A$           Terminal attribute
Mean interarrival time   $\bar{t}_A$     Nonterminal attribute
Mean arrival rate        $\lambda$       Nonterminal attribute
Service time             $t_B$           Terminal attribute
Mean service time        $\bar{t}_B$     Nonterminal attribute
Mean service rate        $\mu$           Nonterminal attribute
Traffic intensity        $\rho$          Nonterminal attribute

Table 2: Classification of attributes for workload characterization

Using so-called synthesized attributes it is possible to derive the "meaning" of an event string while parsing it bottom-up according to the grammar rules. Note that only the interarrival time and the service time are basic attributes relating to terminal symbols, while all other attributes must be synthesized and therefore can only belong to nonterminal symbols, as shown in Table 2.

By means of the sample attributed grammar shown in Table 3 we will demonstrate the calculation of the length and the mean interarrival time of a given event string by example. Let $l$ be a recursive attribute function

\[ l(xy) = \begin{cases} l(z) & \text{for } x = \varepsilon,\; y \in N,\; (y \to z) \in P \\ 1 & \text{for } x \in T,\; y = \varepsilon \\ 1 + l(y) & \text{for } x \in T,\; y \in N \end{cases} \]

to calculate the length of the total event string and let

\[ \bar{t}_A(xy) = \begin{cases} \bar{t}_A(z) & \text{for } x = \varepsilon,\; y \in N,\; (y \to z) \in P \\ t_A(x) & \text{for } x \in T,\; y = \varepsilon \\ \frac{1}{1 + l(y)} \left( t_A(x) + l(y)\, \bar{t}_A(y) \right) & \text{for } x \in T,\; y \in N \end{cases} \]

calculate the mean interarrival time using the equation

\[ \frac{1}{n+1} \left( n \cdot \frac{1}{n} \sum_{i=1}^{n} x_i + x_{n+1} \right) = \frac{1}{n+1} \sum_{i=1}^{n+1} x_i \]

By means of the attributed grammar shown in Table 3 it is possible to generate event strings of the form $(ab)^n$ for $n \ge 1$, which model a sequence of client/server interactions assuming $a$ to be a REQUEST event and $b$ to be the corresponding RESPONSE event. Starting from $S$, the parsing process of $w = ab$ will yield

\[ l(S) = l(aX) = 1 + l(X) = 1 + l(b) = 1 + 1 = 2 \]

and

\[ \bar{t}_A(S) = \bar{t}_A(aX) = \frac{1}{1+1} (a_0 + b_0) = \frac{a_0 + b_0}{2} \]

using the values $EXP(\lambda, a) = a_0$ and $EXP(\lambda, b) = b_0$ of an exponentially distributed interarrival process.
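The recursive attribute functions $l$ and $\bar{t}_A$ can be mirrored directly in code. The sketch below is an illustration under simplifying assumptions: an event string is a list of (event, interarrival time) pairs, the grammar structure is taken for granted rather than parsed, and `evaluate` and `sample_trace` are names introduced here. `random.expovariate` stands in for the attribute function EXP.

```python
import random

def evaluate(events):
    """Return (l, mean_tA) for a non-empty list of (event, t_A) pairs,
    following the recursive definitions of l(xy) and the mean of t_A(xy)."""
    (_, t_x), rest = events[0], events[1:]
    if not rest:                      # case x in T, y = epsilon
        return 1, t_x
    l_y, mean_y = evaluate(rest)      # attributes of the remaining string
    l_xy = 1 + l_y
    return l_xy, (t_x + l_y * mean_y) / l_xy

def sample_trace(n, rate, rng):
    """Draw an event string (ab)^n with exponentially distributed
    interarrival times of the given rate."""
    return [(name, rng.expovariate(rate))
            for _ in range(n) for name in ("a", "b")]

# The worked example w = ab with a_0 = 2.0 and b_0 = 4.0 (made-up values).
length, mean_t = evaluate([("a", 2.0), ("b", 4.0)])
```

The synthesized value agrees with the plain arithmetic mean of the interarrival times, which is what the incremental-mean identity above guarantees.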


Nr.   Production      Semantic rule
p1    $S \to aX$      $t_A(a) = EXP(\lambda, a)$
                      $l(S) = l(aX) = 1 + l(X)$
                      $\bar{t}_A(S) = \bar{t}_A(aX) = \frac{1}{1 + l(X)} \left( t_A(a) + l(X)\, \bar{t}_A(X) \right)$
p2    $X \to bS$      $t_A(b) = EXP(\lambda, b)$
                      $l(X) = l(bS) = 1 + l(S)$
                      $\bar{t}_A(X) = \bar{t}_A(bS) = \frac{1}{1 + l(S)} \left( t_A(b) + l(S)\, \bar{t}_A(S) \right)$
p3    $X \to b$       $t_A(b) = EXP(\lambda, b)$
                      $l(X) = l(b) = 1$
                      $\bar{t}_A(X) = \bar{t}_A(b) = t_A(b)$

Table 3: A sample attributed grammar for workload modeling

5 Grammar-based Workload Generation

The method of grammatical inference presented in the previous section allows the derivation of a formalized workload model starting from measured data. Since in our approach attributed grammars are used to characterize workload demands in a global way, we need an abstract specification for that type of workload model. In addition, it is necessary to construct automata which are able to generate the representative workload sequences needed in the course of the evaluation process.

5.1 Grammars for Workload Specification

To specify a workload model, an attributed grammar must fulfill two requirements. First, whenever a nonterminal symbol has to be replaced by a string of terminal symbols, which directly correspond to basic workload components, a mechanism must be found to select one of the alternative production rules. Second, it is necessary to prevent the application of a production rule under certain circumstances. If we want to generate sequences of three to eight data packets, for instance, we must ensure that the sequence is not terminated before the third packet. On the other hand, the generation process must not be continued after the eighth packet.

Therefore we introduce a new class of grammars which is an extension of the probabilistic and weighted grammars described in [Sal69] and of the concept of attributed grammars, already discussed in the previous section and first presented in [Knu71]. Let $G = (T, N, A, P, S)$ be an ordinary attributed grammar. The predicative weighted grammar $G_c = (T_c, N_c, A_c, P_c, S_c)$ with $T_c = T$, $N_c = N$, $S_c = S$ is derived from $G$ by associating a weight $w$ and a predicate $c = c(X)$ over the attributes of the nonterminal symbol $X$ with each production rule $X \xrightarrow{w,c} \gamma$, $\gamma \in V_c^* = (T_c \cup N_c)^*$. Thus, if $(X \to \gamma) \in P$ then $(X \xrightarrow{w,c} \gamma) \in P_c$. The predicates define for each rule whether that rule may be applied during the substitution of the left-hand side nonterminal. The rule's weight is a measure of the selection probability of that rule.

How do the predicate and the weight of a production rule interact? Let $X \in N_c$ be a nonterminal symbol of $G_c$ and $X \xrightarrow{w_1,c_1} \gamma_1$, $X \xrightarrow{w_2,c_2} \gamma_2$, ..., $X \xrightarrow{w_k,c_k} \gamma_k$ with $\gamma_i \in V_c^*$, $1 \le i \le k$, be $k$ production rules that can substitute the nonterminal $X$. The predicative weight $w_i^c$ of production $X \xrightarrow{w_i,c_i} \gamma_i$ is defined as

\[ w_i^c = \begin{cases} w_i & \text{if } c_i(X) \\ 0 & \text{if } \neg c_i(X) \end{cases} \]

In other words: the condition specified by the predicate $c_i(X)$ is reflected by the weight $w_i^c$. Therefore, the selection probability $p_i$ for production $X \xrightarrow{w_i,c_i} \gamma_i$ is defined by

\[ p_i = \frac{w_i^c}{\sum_{j=1}^{k} w_j^c} \]

To summarize, the predicates specify which production rules are valid to substitute the nonterminal $X$. From this set of valid production rules one production has to be selected according to the selection probabilities $p_i$. We have, of course, to ensure that at least one of the predicates is fulfilled when $X$ is to be substituted.
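A minimal sketch of the predicative weights and selection probabilities follows. The two rules, their weights 3 and 1, and the attribute name `n` are hypothetical values chosen to mimic the "three to eight data packets" requirement mentioned above; they are not taken from the paper's tools.

```python
def selection_probabilities(rules, attrs):
    """rules: list of (weight, predicate) pairs for one nonterminal.
    Returns the selection probability p_i of every rule."""
    # Predicative weight: the weight itself if the predicate holds, else 0.
    predicative = [w if c(attrs) else 0 for w, c in rules]
    total = sum(predicative)
    if total == 0:
        # The text requires at least one fulfilled predicate.
        raise ValueError("no predicate is fulfilled")
    return [w / total for w in predicative]

# Hypothetical rules: continue the sequence (weight 3) or terminate it
# (weight 1), where n counts the packets generated so far.
rules = [
    (3, lambda a: a["n"] < 8),    # continue only before the eighth packet
    (1, lambda a: a["n"] >= 3),   # terminate only after the third packet
]
```

Calling `selection_probabilities(rules, {"n": 5})` gives both rules a non-zero probability, whereas for `n` below 3 or equal to 8 exactly one rule remains selectable.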

5.2 Push-down Automata for Workload Generation

As known from formal language theory, each class of grammars corresponds to a certain class of automata and vice versa. Therefore, we now introduce the class of automata which directly corresponds to the predicative weighted grammars. In contrast to automata that are used for recognition purposes, automata of this class are intended to generate sequences of basic workload components.

A weighted push-down automaton is an 8-tuple

\[ \mathcal{A}_c = (Q_c, \Sigma_c, \Gamma_c, A_c, \delta_c, \lambda_c, q_{c0}, Z_{c0}) \]

with $Q_c$ a finite set of states, $\Sigma_c$ a set of output symbols, $\Gamma_c$ a set of stack symbols with $\Sigma_c \subseteq \Gamma_c$, $A_c$ a set of attribute functions for the symbols of $\Gamma_c$, $\delta_c$ a mapping from $Q_c \times \Gamma_c$ into finite subsets of $Q_c \times \Gamma_c^*$ which represents the state transitions of the automaton, $\lambda_c$ a mapping from $Q_c \times \Gamma_c$ into $\Sigma_c^*$ which describes the output behaviour of the automaton, $q_{c0} \in Q_c$ the initial state, and $Z_{c0} \in \Gamma_c^*$ the initial stack word.

The automaton may perform two different types of transitions. Subsequently, let $a \in \Sigma_c$, $\omega \in \Sigma_c^*$, $V \in \Gamma_c - \Sigma_c$, $Z \in \Gamma_c$ and $\gamma, \gamma_i, \gamma' \in \Gamma_c^*$ with $i$ being an arbitrary index. Furthermore, let the symbol on top of the stack be denoted by the leftmost symbol in the derivation. Let $q \in Q_c$ be the current automaton state and $q_i, q' \in Q_c$ the successors of $q$. The selection operation

\[ \delta_c(q, V) = \{ (q_1, \gamma_1), (q_2, \gamma_2), \ldots, (q_k, \gamma_k) \}, \qquad \lambda_c(q, V) = \varepsilon \]

replaces the symbol $V$ on the top of the stack by one of the symbol strings $\gamma_i$, $1 \le i \le k$. The predicative weight $w^c_{q,V}(q_i, \gamma_i)$ for the corresponding state transition is defined as

\[ w^c_{q,V}(q_i, \gamma_i) = \begin{cases} w_{q,V}(q_i, \gamma_i) & \text{if } c_{q,V}(q_i, \gamma_i) \\ 0 & \text{if } \neg c_{q,V}(q_i, \gamma_i) \end{cases} \]

where $w_{q,V}(q_i, \gamma_i)$ is the weight assigned to the corresponding state transition, irrespective of whether the predicate $c_{q,V}(q_i, \gamma_i)$ yields true. In analogy to the predicative weighted grammar, the selection probability is given according to the weights and predicates for that transition by

\[ p_{q,V}(q_i, \gamma_i) = \frac{w^c_{q,V}(q_i, \gamma_i)}{\sum_{j=1}^{k} w^c_{q,V}(q_j, \gamma_j)} \]

During the output operation

\[ \delta_c(q, a) = \{ (q', \varepsilon) \}, \qquad \lambda_c(q, a) = a \]

the output symbol is removed from the top of the stack and written to the output tape. Obviously, the selection operation is non-deterministic. The automaton halts in an undefined state if the condition $\delta_c(q, Z) = \emptyset$ holds true, and it stops on empty stack.

A more convenient notation to describe transition moves of an automaton uses configurations. A configuration of the automaton $\mathcal{A}_c$ is a triple $(q, \gamma, \omega) \in Q_c \times \Gamma_c^* \times \Sigma_c^*$ where $q$ denotes the current automaton state, $\gamma$ the contents of the stack and $\omega$ the contents of the output tape. The operations described above may be written as follows:

selection: $(q, V\gamma, \omega) \vdash (q', \gamma'\gamma, \omega) \iff (q', \gamma') \in \delta_c(q, V) \wedge \lambda_c(q, V) = \varepsilon$

output: $(q, a\gamma, \omega) \vdash (q', \gamma, \omega a) \iff (q', \varepsilon) \in \delta_c(q, a) \wedge \lambda_c(q, a) = a$

As usual,

\[ L(\mathcal{A}_c) = \{ \omega \in \Sigma_c^* \mid (q_{c0}, Z_{c0}, \varepsilon) \vdash^* (q, \varepsilon, \omega) \} \]

defines the language $L(\mathcal{A}_c)$ generated by the automaton $\mathcal{A}_c$, where $\vdash^*$ denotes the reflexive and transitive closure and $(q_{c0}, Z_{c0}, \varepsilon)$ is the initial configuration of the automaton.

Which mechanisms are required to transform the production rules of a given grammar $G_c$ into sequences of operations that a push-down automaton $\mathcal{A}_c$ has to perform? Let $G_c = (T_c, N_c, A_c, P_c, S_c)$ be a predicative weighted grammar. The automaton $\mathcal{A}_c$ which generates $L(G_c)$ is given by

\[ \mathcal{A}_c = (\{q\}, T_c, T_c \cup N_c, A_c, \delta_c, \lambda_c, q, S_c) \]

where the state transitions have to be defined as

\[ \delta_c(q, X) = \{ (q, \gamma_1), (q, \gamma_2), \ldots, (q, \gamma_k) \}, \qquad \lambda_c(q, X) = \varepsilon \]

with $w^c_{q,X}(q, \gamma_i) = w_i^c$ for all production rules $(X \xrightarrow{w_i,c_i} \gamma_i) \in P_c$, $1 \le i \le k$, and

\[ \delta_c(q, a) = \{ (q, \varepsilon) \}, \qquad \lambda_c(q, a) = a \]

for all $a \in T_c$.
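The single-state construction above can be sketched as a small generator. This is an illustration under stated assumptions, not the SLOGEN implementation: attributes are kept in one shared dictionary, each rule carries an optional update action in place of full attribute functions, and the example grammar with its weights, predicates and the attribute `seqno` is hypothetical.

```python
import random

def generate(grammar, start, attrs, rng):
    """Run the single-state weighted push-down automaton.
    grammar maps a nonterminal to a list of rules
    (weight, predicate, right-hand side, action)."""
    stack = [start]                  # top of the stack is the list's end
    output = []
    while stack:                     # the automaton stops on empty stack
        top = stack.pop()
        if top not in grammar:       # output operation for terminals
            output.append(top)
            continue
        rules = grammar[top]         # selection operation for nonterminals
        weights = [w if pred(attrs) else 0 for w, pred, _rhs, _act in rules]
        if sum(weights) == 0:
            raise RuntimeError(f"no applicable rule for {top!r}")
        _w, _pred, rhs, action = rng.choices(rules, weights=weights)[0]
        action(attrs)
        stack.extend(reversed(rhs))  # leftmost symbol ends up on top
    return output

def count_packet(attrs):
    attrs["seqno"] += 1

def no_action(attrs):
    pass

# Hypothetical grammar: three to eight data packets, then close.
TRAIN = {
    "SEND": [
        (3, lambda a: a["seqno"] < 8, ["data", "SEND"], count_packet),
        (1, lambda a: a["seqno"] >= 3, ["close"], no_action),
    ],
}

def generate_train(seed):
    return generate(TRAIN, "SEND", {"seqno": 0}, random.Random(seed))
```

Because rules whose predicate fails receive weight 0, `rng.choices` can never select them, which matches the definition of the predicative weight; the seed makes individual runs reproducible.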

5.3 A Train Model with Error-Recovery

Since real application traces are too complex for instructive examples, we use a simple connection-oriented, sender-initiated, stop-and-wait file transfer protocol. Accordingly, we distinguish between five different PDU types:

open    to build up the connection
close   to break down the connection
data    to transmit some portion of the file data
ack     to indicate the receipt of a correct packet
nak     to signal to the sender the occurrence of transmission failures

For the sake of simplicity it is assumed that no packet will be lost. The contents of a packet, however, may be corrupted during the transmission. Furthermore, ack and nak packets are assumed to be always transmitted correctly. First, let

    open ack data ack data ack close ack

be the trace of an error-free file transfer, which may be generated by the grammar

    OPEN   $\to$   open ack SEND
    SEND   $\to$   data ack SEND
           $\mid$  CLOSE
    CLOSE  $\to$   close ack

Taking into account that data packets may be disturbed by transmission errors and that error recovery is done by retransmissions, the grammar is extended in the following way:

    SEND     $\to$   data ack SEND
             $\mid$  data nak RECOVER
             $\mid$  CLOSE
    RECOVER  $\to$   data ack SEND
             $\mid$  data nak RECOVER

As shown in the previous section, a set of individual attributes is associated with each terminal and nonterminal symbol. By means of attribute functions, values are assigned to these attributes while generating workload sequences.

Now let us consider the nonterminals SEND and RECOVER of the grammar above. As mentioned earlier, a weight and a predicate over the nonterminal's attributes are associated with each rule. The predicate specifies whether the application of the corresponding rule is valid, and the weights of all valid rules reflect the selection probabilities of those rules. Thus, if we want to model traces which consist of three to eight data packets and allow an error probability of 1% during transmission, we have to extend the production rules for the nonterminal SEND as follows:

$$\begin{array}{lcl}
\text{SEND} & \xrightarrow{w_{11};\,c_{11}} & \text{data ack SEND} \\
\text{SEND} & \xrightarrow{w_{12};\,c_{12}} & \text{data nak RECOVER} \\
\text{SEND} & \xrightarrow{w_{13};\,c_{13}} & \text{CLOSE}
\end{array}$$
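The selection step described here, in which predicates filter the rules and the weights of the remaining rules act as relative probabilities, can be sketched as follows. The rule encoding, the attribute `may_close`, the function names, and the weight of the CLOSE rule are all hypothetical; only the 99:1 ratio is chosen to match the 1% error probability mentioned in the text.

```python
import random

# Hypothetical encoding of the three SEND rules. The attribute "may_close"
# stands in for whatever predicate decides when the train may end.
SEND_RULES = [
    {"rhs": "data ack SEND", "weight": 99,
     "predicate": lambda attrs: not attrs["may_close"]},
    {"rhs": "data nak RECOVER", "weight": 1,
     "predicate": lambda attrs: not attrs["may_close"]},
    {"rhs": "CLOSE", "weight": 1,   # assumed weight, not given in the text
     "predicate": lambda attrs: attrs["may_close"]},
]


def valid_rules(rules, attrs):
    """Keep only the rules whose predicate holds for the current attributes."""
    return [rule for rule in rules if rule["predicate"](attrs)]


def selection_probabilities(rules, attrs):
    """Probability of each valid rule: its weight over the sum of valid weights."""
    valid = valid_rules(rules, attrs)
    total = sum(rule["weight"] for rule in valid)
    if total == 0:
        return {}
    return {rule["rhs"]: rule["weight"] / total for rule in valid}


def select_rule(rules, attrs, rng):
    """Draw one valid rule at random, proportionally to the weights."""
    valid = valid_rules(rules, attrs)
    if not valid:
        raise ValueError("no rule is applicable")
    weights = [rule["weight"] for rule in valid]
    return rng.choices(valid, weights=weights, k=1)[0]
```

With `may_close` false, the two data rules are valid and are chosen with probabilities 99/100 and 1/100; with `may_close` true, CLOSE is the only valid rule, so its weight no longer matters.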

Thus, the weights $w_{11} = 99$ and $w_{12} = 1$ govern the error rate, and $w_{13}$ influences the average train length. To adjust the train length to between three and eight data packets, we have to "activate" each of the production rules at the right time by providing an appropriate predicate. Let SEND.seqno denote the attribute seqno of SEND that records the number of successfully transmitted data packets. The predicates $c_{11}$, $c_{12}$, and $c_{13}$ must then be defined as follows:

c11 c12 c13

0 SEND.size = 0

SEND.size

SEND.size

will control the generation of valid train sequences. Note that weight $w_{13}$ has no impact on the train length, since $c_{11}$ and $c_{12}$ do not intersect with $c_{13}$. Therefore, the third production rule will be the only selectable one. To force the generation process to be cancelled after three unrecoverable errors have been encountered, the predicates $c_{21}$ and $c_{22}$ of the rules

$$\begin{array}{lcl}
\text{RECOVER} & \xrightarrow{w_{21};\,c_{21}} & \text{data ack SEND} \\
\text{RECOVER} & \xrightarrow{w_{22};\,c_{22}} & \text{data nak RECOVER}
\end{array}$$

must not be true after the third consecutive error has occurred. Thus, if RECOVER.errno denotes the number of unrecovered errors, the predicates

c21 c22

RECOVER.errno RECOVER.errno

RECOVER.errno

3 3

will do the requested task. Of course, the value of RECOVER.errno must be set to zero when error recovery starts, and it must be incremented after each consecutive error.

As mentioned previously, the attributes of the terminal symbols represent the characteristic workload parameters. Each time a basic workload component is generated, the values of the corresponding terminal's attributes are computed and assigned by applying their related attribute functions.
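To see how the attributes and predicates interact, the following sketch generates one train under explicitly assumed predicates. The assumptions are: a hypothetical countdown `remaining`, drawn uniformly from 3 to 8, decides when CLOSE becomes the only selectable SEND rule; the RECOVER rules use the same weights as the SEND rules; and both RECOVER rules stop being valid once `errno` reaches `max_errors`. None of these details are taken from the paper's own predicate definitions.

```python
import random


def generate_train(rng, w_ack=99, w_nak=1, max_errors=3):
    """Generate one train; returns (trace, completed)."""
    trace = ["open", "ack"]            # OPEN -> open ack SEND
    remaining = rng.randint(3, 8)      # assumed: data packets still to send
    state = "SEND"
    errno = 0                          # consecutive errors during recovery
    while True:
        if state == "SEND" and remaining == 0:
            # Only the CLOSE rule is valid: finish the train.
            trace += ["close", "ack"]
            return trace, True
        if state == "RECOVER" and errno >= max_errors:
            # Neither RECOVER rule is valid: generation is cancelled.
            return trace, False
        # Weighted choice between the "ack" rule and the "nak" rule.
        ok = rng.choices([True, False], weights=[w_ack, w_nak], k=1)[0]
        if ok:
            trace += ["data", "ack"]
            remaining -= 1
            state = "SEND"
        else:
            trace += ["data", "nak"]
            if state == "SEND":
                state, errno = "RECOVER", 0   # error recovery starts
            else:
                errno += 1                    # another consecutive error
```

Under this counting convention a cancelled train ends with four consecutive nak responses: the one that starts recovery plus three failed retransmissions. A different convention for initializing or testing `errno` would shift that number by one.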

In the train model each train is characterized by an intertrain arrival time, whereas the packets inside the train are described by individual distribution functions that, in general, differ from each other. Suppose that $f(m)$ is a function which yields a stream of attribute values corresponding to a given distribution function with mean value $m$, which can be the value of $t_A$ from section four, for instance. The assignment open.time $= f(m_{\text{OPEN}})$ then formalizes the arrival time of an entire train. Inside the train, the arrival of each data packet can be described by data.time $= f(m_{\text{SEND}})$, data.time $= f(m_{\text{RECOVER}})$, or close.time $= f(m_{\text{CLOSE}})$, respectively. Arrivals of acknowledgement packets are characterized accordingly.
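A minimal sketch of such attribute functions is given below. It assumes an exponential distribution for $f(m)$, which the text does not prescribe, and it keys the mean values by the pair (generating nonterminal, terminal). The numbers in `MEANS` and the names `make_f` and `assign_times` are hypothetical.

```python
import random

# Hypothetical mean values m, keyed by (nonterminal, terminal).
MEANS = {
    ("OPEN", "open"): 50.0,      # intertrain arrival time
    ("OPEN", "ack"): 1.0,
    ("SEND", "data"): 2.0,
    ("SEND", "ack"): 1.0,
    ("RECOVER", "data"): 4.0,
    ("RECOVER", "ack"): 1.0,
    ("CLOSE", "close"): 2.0,
    ("CLOSE", "ack"): 1.0,
}


def make_f(mean, rng):
    """Return f(m): a source of exponentially distributed values with mean m."""
    def f():
        return rng.expovariate(1.0 / mean)
    return f


def assign_times(events, means, rng):
    """Attach a time attribute to each (nonterminal, terminal) event."""
    functions = {key: make_f(mean, rng) for key, mean in means.items()}
    return [(nonterminal, terminal, functions[(nonterminal, terminal)]())
            for nonterminal, terminal in events]
```

Each generated terminal thus receives its `time` value from the function associated with the rule context that produced it, so a data packet sent during recovery can follow a different distribution from one sent in the normal SEND phase.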

6 Concluding Remarks

In the previous sections we introduced a new technique for workload modeling. We also described a tool box which was implemented under UNIX System V.2 and SUN-OS 4.0, so our modeling system should be portable to most UNIX environments. Because we designed a specification language for distributed event traces which uses the C and YACC notations developed for compiler construction, workload generators can easily be built with SLOGEN, presented in section three. If the SLOGEN specification contains executable statements or function calls in the attribute section of a given event, it is even possible to construct executable workload models for benchmark studies.

At the moment, several interesting problems related to performance evaluation studies using grammar-based workload models are under discussion, such as the description of X-Window applications and the resulting LAN traffic by means of appropriate attributed grammars. Furthermore, new inference techniques will be investigated which are especially suited to constructing a set of coupled grammars, introduced in [Dul90], which allow shuffled event traces consisting of several independent and simultaneous network connections to be modeled.

References

[BMW88] H. Beilner, J. Mäter, and N. Weissenberg. Towards a Performance Modelling Environment: News on HIT. In R. Puigjaner and D. Potier, editors, Modeling Techniques and Tools for Computer Performance Evaluation. Plenum Press, 1988.

[CS88] M. Calzarossa and G. Serazzi. TEL: A versatile Tool for Emulating system Load. In R. Puigjaner and D. Potier, editors, Modeling Techniques and Tools for Computer Performance Evaluation. Plenum Press, 1988.

[Dau90] P. Dauphin. Inferenz von Grammatiken zur Lastmodellierung verteilter Systeme, 1990.

[DC88] B. Delosme and L. Coyette. QUARTZ: A Tool for Performance Evaluation of Data Networks. In R. Puigjaner and D. Potier, editors, Modeling Techniques and Tools for Computer Performance Evaluation. Plenum Press, 1988.

[DN89] W. Dulz and U. Nikol. A Hierarchical Workload Generator for QNAP2. Technical Report 3, Institut für Mathematische Maschinen und Datenverarbeitung, Erlangen, 1989.

[Dul90] W. Dulz. The Language of Loosely Coupled Context-Free Devices. Technical Report 5, Institut für Mathematische Maschinen und Datenverarbeitung, Erlangen, 1990.

[Fer78] D. Ferrari. Computer Systems Performance Evaluation. Prentice-Hall, 1978.

[Fu82] King Sun Fu. Syntactic Pattern Recognition. John Wiley and Sons, 1982.

[Gih87] O. Gihr. Vergleich der Kanalzugriffsverfahren CSMA/CD, Token-Bus, Token-Ring und Slotted-Ring für Poisson- und unterbrochene Poisson-Ankunftsprozesse. In U. Herzog and M. Paterok, editors, IFB 154: Messung, Modellierung und Bewertung von Rechensystemen. Springer-Verlag, 1987.

[Her89] U. Herzog. Leistungsbewertung und Modellbildung für Parallelrechner. Informationstechnik - it, 31:31-38, 1989.

[HKL+88] R. Hofmann, R. Klar, N. Luttenberger, B. Mohr, and G. Werner. An Approach to Monitoring and Modeling of Multiprocessor and Multicomputer Systems. In Int. Seminar on Performance of Distributed and Parallel Systems, Kyoto, 1988.

[Hof90] S. Hofmann. Syntaxgesteuerte Lastgenerierung. Master's thesis, Institut für Mathematische Maschinen und Datenverarbeitung, Erlangen, 1990.

[JR86] R. Jain and S. A. Routhier. Packet Trains - Measurements and a New Model for Computer Network Traffic. IEEE Journal on Selected Areas in Communications, 6:986-995, 1986.

[Knu71] Donald E. Knuth. Semantics of Context-free Languages. Math. Systems Theory, 2/5:127-145/95-96, 1968/1971.

[KR89] W. Kremer and M. Rupprecht. Hierarchisches Lastmodellkonzept zur Simulation und Bewertung von HSLAN-Controllern. In IFB 205: Kommunikation in Verteilten Systemen. Springer-Verlag, 1989.

[Moh89] B. Mohr. TDL/POET - Version 5.1. Technical Report 7, Institut für Mathematische Maschinen und Datenverarbeitung, Erlangen, 1989.

[Nik88] U. Nikol. Ein hierarchischer Ansatz zur Lastmodellierung verteilter Systeme. Master's thesis, Institut für Mathematische Maschinen und Datenverarbeitung, Erlangen, 1988.

[Rol87] P. Rolin. Benchmarking Local Area Networks. In AFCET, editor, International Workshop on Modeling Techniques and Performance Evaluation, 1987.

[Sal69] A. Salomaa. Probabilistic and Weighted Grammars. Inf. Control, 15:529-544, 1969.

[SM84] C. Sauer and E. McNair. The Evolution of the Research Queueing Package RESQ. In Modeling Techniques and Tools, 1984.

[VP84] M. Veran and D. Potier. QNAP2: A Portable Environment for Queueing Network Modeling. In Modeling Techniques and Tools, 1984.