
Two Versus One Index Register and Modifiable Versus Non-modifiable Programs *

by K. Mehlhorn and W. J. Paul, FB 10 - Informatik, Universität des Saarlandes, D-Saarbrücken, West Germany

Abstract: We compare Random Access Machines with one or two index registers and modifiable or non-modifiable programs and show for a simple problem of data transfer that the more powerful versions are provably more efficient. Our argument uses Kolmogorov complexity.

I. Introduction

Complexity theory has addressed a number of architectural questions in the past: the power of an additional tape or head in Turing machines, the power of multidimensional versus one-dimensional tapes, the power of two-way versus one-way input, queues and stacks versus tapes, ... [1,2,3,4,6,7,8,9,10,11,12]. Many of these results use Kolmogorov complexity; cf. [5] for a recent survey. In this note we show that the techniques developed can also be applied to machine models more realistic than Turing machines. More specifically, we consider Random Access Machines with either one or two index registers and either modifiable or non-modifiable programs. Machines with modifiable programs are frequently called von Neumann machines, and machines with non-modifiable programs are frequently called Harvard machines. We exhibit a simple data transfer problem and show that such a problem of size $N$ can be solved in time $O(1/\varepsilon) + (2 + \varepsilon)N$ on either a von Neumann machine with one index register or a Harvard machine with two index registers for every $\varepsilon > 0$, but that a Harvard machine with only one index register

* Work supported by DFG, SFB 124, projects B2 and D4

We now give the precise statements of the results. The proofs are contained in section II, and section III offers some open problems and conclusions.

We consider a typical RAM consisting of a central processing unit and a memory. The CPU contains $r$ registers, $k$ of which can be used as index registers. Each register and memory cell can hold $n$ bits. We assume that the instructions of Table 1 are available to load data from and store data into the memory. Besides these load and store instructions we allow an arbitrary number of instructions affecting only the processing unit, i.e., data and index registers and program counter.

Instruction         Effect
LOAD&INC  i,j,c     Ri ← D[Rj + c];  Rj ← Rj + 1;  PC ← PC + 1;
LOAD      i,j,c     Ri ← D[Rj + c];  PC ← PC + 1;
STORE&INC i,j,c     D[Rj + c] ← Ri;  Rj ← Rj + 1;  PC ← PC + 1;
STORE     i,j,c     D[Rj + c] ← Ri;  PC ← PC + 1;

Table 1: LOAD and STORE instructions, $1 \le j \le k$, $1 \le i \le r$. Ri is the i-th register, D is the memory and PC is the program counter; c is an integer.

In a Harvard machine the program is fixed throughout execution, i.e., program and data are stored in different memories, and loads and stores only affect the data memory. In von Neumann machines there is no such restriction.

We consider the following simple block transfer problem. Given integers $N$, $a$ and $b$ in D[0], D[1] and D[2] respectively, move the content of D[a+i] into D[b+i] for $0 \le i < N$.
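The instruction semantics of Table 1 and the block transfer problem can be sketched as a small simulation. This is a minimal illustration, not the paper's construction: the class `RAM`, the function `block_transfer_two_index`, and the `steps` counter are hypothetical names, only LOAD and STORE instructions are counted, CPU-only instructions (register moves, loop control) are done directly in Python, and the source and target blocks are assumed not to overlap.

```python
class RAM:
    """Toy model: registers R[1..r], data memory D, program counter PC."""

    def __init__(self, D, r=3):
        self.D = list(D)
        self.R = [0] * (r + 1)  # R[0] is unused so that R[i] matches Ri
        self.PC = 0
        self.steps = 0          # counts LOAD/STORE instructions only (assumption)

    def load(self, i, j, c, inc=False):
        # LOAD i,j,c (or LOAD&INC when inc=True): Ri <- D[Rj + c]
        self.R[i] = self.D[self.R[j] + c]
        self._advance(j, inc)

    def store(self, i, j, c, inc=False):
        # STORE i,j,c (or STORE&INC when inc=True): D[Rj + c] <- Ri
        self.D[self.R[j] + c] = self.R[i]
        self._advance(j, inc)

    def _advance(self, j, inc):
        if inc:
            self.R[j] += 1
        self.PC += 1
        self.steps += 1


def block_transfer_two_index(m):
    """Move D[a+i] to D[b+i] for 0 <= i < N using index registers R1 and R2."""
    m.R[1] = 0
    m.load(3, 1, 0); n = m.R[3]   # N is stored in D[0]
    m.load(3, 1, 1); a = m.R[3]   # a is stored in D[1]
    m.load(3, 1, 2); b = m.R[3]   # b is stored in D[2]
    m.R[1], m.R[2] = a, b         # CPU-only register moves, not counted
    for _ in range(n):            # loop control is CPU-only, not counted
        m.load(3, 1, 0, inc=True)    # R3 <- D[R1]; R1 <- R1 + 1
        m.store(3, 2, 0, inc=True)   # D[R2] <- R3; R2 <- R2 + 1
```

With two index registers, one register walks the source block and the other the target block, so each item costs exactly one load and one store.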

Theorem:

a) For any $\varepsilon > 0$, a Harvard machine can solve the block transfer problem in time $O(1/\varepsilon) + (2 + \varepsilon) \cdot N$ if two index registers are available, and in time $O(1/\varepsilon) + (2 + 2/(r-1) + \varepsilon) \cdot N$ if one index register is available.

b) For every $\varepsilon > 0$, a von Neumann machine with one index register can solve the block transfer problem in time $O(1/\varepsilon) + (2 + \varepsilon)N$.

c) Any machine requires time at least $2N$.

d) Let $\alpha = 2/(2r+1)$. Then for any Harvard machine with one index register the following holds: $\forall \beta < \alpha \;\, \forall d > 0 \;\, \exists n_0 \;\, \exists N_0 \;\, \forall n \ge n_0 \;\, \forall N \, (N_0 \le N < n^d)$: the time needed for the block transfer problem is at least $(2 + \beta)N$ in the worst case.
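To get a feeling for the gap between the upper bound of part a) and the lower bound of part d) for a Harvard machine with one index register, the coefficients of $N$ can be evaluated for a concrete register count. The value $r = 4$ is an arbitrary choice made for illustration and does not appear in the paper; the $\varepsilon$ and $O(1/\varepsilon)$ terms are ignored.

```latex
\[
  \text{upper bound coefficient: } 2 + \frac{2}{r-1} = 2 + \frac{2}{3} = \frac{8}{3} \approx 2.67,
\]
\[
  \alpha = \frac{2}{2r+1} = \frac{2}{9}, \qquad
  \text{lower bound coefficient: } 2 + \beta \text{ for any } \beta < \frac{2}{9},
  \text{ i.e. approaching } \frac{20}{9} \approx 2.22 .
\]
```

Both coefficients are strictly larger than the value $2$ achievable (up to $\varepsilon$) with two index registers or with a modifiable program, and both tend to $2$ as $r$ grows.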

II. Proofs

Proof of part a):

The $3N + O(1)$ and $(2r+1)N/(r-1) + O(1)$ upper bounds are obvious. Unrolling the loop and testing the termination condition only every $(1/\varepsilon)$-th iteration gives the improvement.
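The effect of unrolling on the two-index-register bound can be checked with a short instruction count. This is a sketch under stated assumptions, not the paper's accounting: loop control is assumed to cost one instruction per test, leftover items are assumed to be handled by the plain three-instruction loop, and the names `unrolled_cost` and `unroll_factor` are hypothetical.

```python
import math


def unroll_factor(eps):
    """Unroll factor u = ceil(1/eps), so that 1/u <= eps."""
    return math.ceil(1 / eps)


def unrolled_cost(N, u):
    """Instruction count for moving N items with the loop body unrolled u times.

    Each unrolled body performs u load/store pairs (2u instructions) followed
    by one loop-control instruction (assumption). The N mod u leftover items
    use the plain loop at 3 instructions per item.
    """
    full, rem = divmod(N, u)
    return full * (2 * u + 1) + rem * 3
```

Under this model the cost equals $2N + \lfloor N/u \rfloor + (N \bmod u) \le (2 + 1/u)N + (u - 1)$, which for $u = \lceil 1/\varepsilon \rceil$ has the form $(2 + \varepsilon)N + O(1/\varepsilon)$. For example, `unrolled_cost(1000, 1)` gives 3000 and `unrolled_cost(1000, 10)` gives 2100.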


Proof of part b):

Let $c = b - a$. Clearly, if R1 contains a value between $a$ and $a + N - 1$ then the following $2r - 2$ instructions transfer $r - 1$ data items from the input to the output.

    LOAD       2,1,0
    STORE&INC  2,1,c
    ...
    LOAD       r,1,0
    STORE&INC  r,1,c

Note that one can use program modification to store the constant $c$ in the STORE&INC instructions. Adding two more instructions for the loop control yields a $2rN/(r-1) + O(1)$ solution. As in part a) this can be improved to the bound stated in part b).
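The straight-line block from the proof of part b) can be traced with a short simulation. This is only a sketch: the function `run_block` is a hypothetical name, the constant `c` is passed as a Python parameter where the von Neumann machine would patch it into the STORE&INC instructions by program modification, and the source and target blocks are assumed not to overlap.

```python
def run_block(D, R1, c, r):
    """Execute LOAD i,1,0 ; STORE&INC i,1,c for i = 2..r on data memory D.

    Returns the new value of index register R1 and the number of
    instructions executed.
    """
    R = [0] * (r + 1)   # R[0] is unused so that R[i] matches Ri
    R[1] = R1
    executed = 0
    for i in range(2, r + 1):
        R[i] = D[R[1] + 0]   # LOAD i,1,0:      Ri <- D[R1 + 0]
        executed += 1
        D[R[1] + c] = R[i]   # STORE&INC i,1,c: D[R1 + c] <- Ri; R1 <- R1 + 1
        R[1] += 1
        executed += 1
    return R[1], executed
```

A single index register suffices here because the offset $c = b - a$ is carried by the instruction itself: the same register R1 addresses the source cell with offset 0 and the target cell with offset $c$.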

Proof of part c): Clearly, there must be one load for every location $a + i$ and one store for every location $b + i$, $0 \le i < N$.

Proof of part d): The proof uses Kolmogorov complexity. We briefly review the relevant definitions. For an extensive discussion we refer the reader to the recent survey by Li and Vitányi. Let us fix a universal Turing machine $U$. For a string $v \in \{0,1\}^*$ define the Kolmogorov complexity $K(v)$ as the length $|x|$ of the shortest string $x$ such that $U$ on input $x$ halts with output $v$. We need the following fact:

Fact: a) $K(v)$