MIT - Defense Technical Information Center

3 downloads 0 Views 3MB Size Report
M.S. Thesis. Alfred P. Sloan School of Man an-mnt and Dept. of E1Pertrir.a. Fn enr; ag ... AlsoF, Dnrek Henderson, Norm Kohn, and Claude Hans. The author's ...
tINCI ASS IF IF Security Classification DOCUMENT CONTROL DATA - R&D (Security claeelllcation of title, body of abettact avd Indeing annotatlon must be entered when the overall report is classified) ORIGINATING

I.

28.

ACTIVITY (Corporate author)

REPORT

CLASSIFICATION

SECURITY

UNCLASSIFIED

Massachusetts Institute of Technology Project MAC

None

GROUP

REPORT TITLE

S.

Design Strategies for File Systems 4. DESCRIPTIVE

M.S. M

NOTES (ype

Thesis.

o repart and Inclueive datee)

Alfred P. Sloan School of Man

S. AUTHOR(S) (Last name, first nagn,

an-mnt

and

Dept.

of E1Pertrir.a

Fn

ag

enr;

Initial)

Madnick, Stuart E. REPORT DATE

6.

NO. OF REFS

TOTAL NO. OF PAGES

9s.

ORIGINATOR'S REPORT NUMSER(S)

9b.

OTHER REPORT

33

114

October 1970 CONTRACT OR GRANT NO.

I.

17b,

7a.

Nonr-4102(Ol)

MAC TR-78 (THE(S)

b. PROJECT NO. C.

NO(S) (Any ;-'

- numbers that may be

alined this report) d.

-

AVAILABILITY/LIMITATION NOTICES.

10.

taw vubuI' muhi

Distrlbutioi of this document is unlimited.

SUPPLEMENTARY

II.

12.

NOTES

SPONSORING MILITARY

.

ACTIVITY

Advanced Research Projects Agency 3D-200 Pentagon Washington. D.C. 20301

None ABSTRACT

IS.

This thesis describes a methodology for the analysis and synthesis of modern general purpose file systems. The two basic concepts developed are (1) establishment of a uniform representation of a file's structure in the form of virtual memory or segmentation and (2) determination of a hierarchy of logical transformations within a file system. These concepts are used tcgether to form a strictly hierarchical organization (after Dijkstra) such that each transformation can be described as a function of its lower neighboring transformation. In a sense, the complex file system is built up by the composition of simple functional transformations. To illustrate the specifics of the design process, a file system is synthesized for an environment including a multicomputer network, structured file directories, and removable volumes.

C, 14.

KEY WORDSW

File Systems

Operating Systems

Data Management

Programming

D D,, FORM 1473

(M.I.T.)

Modularity

Virtual Memory

UNCLASSIFIED Security Classification

PE5tGN STRATFfl$FS FOR FILE SYSTEMS";

Absrtract

iidescribes a methodoiejgy for the analysts and synLhesls of modern general purpose file systems. The two basic concepts developed are (1) establishment of a uniform representation of a file's structure in the form of virtual nwfiry 01 scgmentatLon and (2) determination of a hierarchy of logical transofrmations withir a file system. These concepts are used together to form a strictly hierarchical organization (after Oijkstra) such that each transformation can be described as a function of its lower neighboring transformation. In a sense, the complex file system Is built up by the compusitlor cf simple functional transformations. To illustrate the .epcifics of the design process, a file system is synthesized for an environment including a multi-computer network, structured file directories, and removable vol ume-,.

I'I :' This report rcprouces a thnsis of the same title submitted to the Alfred P. Sloan School of Management and the Department of Electrical Engineering, Massachusetts lnbtitute of Technology, in partial fulfillment of the requirements for the degree of Master of Science, June 1969.

DESIGN STRATEGIES FOR FILE SYSTEMI tuart E. l$4dnMck October 1970

PROJECT MAC MAS'ACHUSETTS INSTITUTE OF TECHNOLOGY

Cambridge

Massachusetts 02139

AC KNOVLEDGEMENTS

author acknovlediet

-hi

of the basic ideas

which :any

from

Allen Moulton,

Mr.

colleague,

with his

discussicn.s

laug dud ofteL heated

the vany

system design

for this file

were molded.

particular, Ness,

navi

criticism

an4

energy,

Zilles,

due to Prof.

thanks are

and Prof. Fobert M.

Den Ashton,

Hoo-ain

Dnrek Henderson,

AlsoF,

a!;

the

IBM

Canbridge

Michael Mark,

JcseFh

Toong,

and Claude Hans.

with MIT Project MAC

Scientific

was composed

the

the ideas formed in this

and edited,

on-live in

the

with tho aid Cf the

Fzccessing system.

Work reported herein was supported in part by Project 1AAC, an M.I.T. research project sponsorcd by the Advanced Research Projects Agency, Depart-

ment of Defense, under Office of Naval Research Contract Nonr-4102(01).

as well

provided

Center

CP-67./CMS Tive Sharing computer system, SCRIPT fanusctipt

J. Donovan, Prof.

John

Stephen

environment ani influenced many of This report

In

Graham, as well as,

Norm Kohn,

The author's association

zecrt.

this

produce

to help

time,

their

contributed

generously

colleagues

Many

CONT ENTS

to

rNTIRODUCTION

I

EvolUtion of 1ila Sys'tea Scope and Purpose

I 10

It*

MOTIVATION BEHIND FIlE SYSTEM DESIGN Uniform Representation of File Structure Hierarchy of Logical Transformations

15 15 27

III.

FILE SYSTEM DESIGN MODEL Basic Concepts Used in file System Design Overview of File System Design Model Access Methods Logical Pile System Basic Pile System File Organization Strategy Modules Allocation Strategy Modules Device Strategy Modules Input/Output Control System

31 31 39 48 53 57 59 64 66 67

IV.

MOLTI-COMPU'IER NETUORK ENVIRONMENT Background Problems &rising Design of File System Logical Pile System Design

69 69 74 80 80

Rasic File System Design

87

File Organization Strategy Module Design Allocation Strategy Module Design

91 94

Design Strategy Module resign nther Considerations

S7 98

V.

CONCLUDING COMMENTS

REPEE KCS

101

10 3

ILLU STP AT IONS

. Phyaical Computer Configuration File Systems 2, "Farly Access Mthods File Systems 2.1 Un fors File Reprosentation 2.4

2.

Logical computer Configuration

12.6

Logira2 Transformations

3.1 3,,;

Vierarchical Levels "Real" Memccy and "Virtual" File hemory

3.1 3.U

Hierarchical Pile Syste? Parameters and Data Bases Used by file System

in file System

Pappinq Virtual Memory into Physical Records '.C 3.6 Fixed-Length 1Secord Access Methods . 7Variable-Length Record Access Methods 3.8 Hierarchical File Directory Fiample U.1 U.2 4.1 .4 4.'r

Example of Multi-Computer File System Netvork vxample of File Directory Structure (to LIS) Procedure to Perform Logical File System Search Fxample of File Directory Structure (to arS) Example of File Organization Strategy

17 19 21 23

26 28 33 36

41 41 45 5 51 55 71 81 86 89 92

EMOL!)TIO,

OF FILE SYT!15

Introduction

The evolution of general purpose file systems Tazallela very closely the evolution of operating systems. This is nct surprising since the conceFt of file systems grew cut cf the embryonic

input-output

control (IOC)

and now

operatiny systems

functions

represents the

of

early

most significant

component of most modern operatinq systems. There has been very to the specific 1967,

Saul

be

a

problem of analyzing operating

Posen collected

"Programming

selection of

unpublisbed

repcrts

programming

languages operating

systems. fcr a

together material

Systems and Languages",

the

and

discussing

system

concepts.

which was to

most many He

of was

the forced

"The paper on Operating Systems was prepared for at the University of Michigar presentation

Conference, June

and

isjcrtant

conclude:

Enqineering Summer

bcck,

published

previously

In

18-29, 1962.

it has had fairly wide rirculation as Rand Qepcrt vital P-2eS4. The material covered has been of

scst to

[I F .

r. Z T $€icD OC?1ot

the "clamsc&l" th: develcpent of importance in o~erating system, it is difficult find and an ar-q uA t~o treatmentyet outside of very to long u ually dry Systoi matials. Ueorge mealy was one of the few working experts in the field Who took the tise to writa down some of the basic principles of operating systoms and also Gf assombly systems."

Mr.

observations

Rosents

attention

has been

expended

in

functions of operating sy.tems.

imply

that

very

little

the attempt to generalize the systems

?ile

bave alsc been

spvirely neglected. In the

early years

ptogatmsers approachinq

slowly a

wovcd

periods of time, in

away

bare machine

pencils, fiqhting with

or

of computing

defeat

and

from

with card

the

Fractice

decks and

of

sharpened

the console for more or less extended

leaving triumphantly with

with

Operating systems

(rough.y 1952-1962),

a

ream

have

of

final results

dumarBosin

machine

evolved, not so such

69>.

as a blessing,

but as a practical necessity. As computers became faster aed sore

cemplex,

it

va! no

longer possible for

an

individual

proqramer to be an expert in every phase of the programming and

machine usage;

and systcm These

he

programmers

now must

to provide

cperating systems

usually specialized around trul)

successful

System) for the

rely on

the necessities

were

of

IBM 709/7090/7094 As a result

life.

often ill-designed

a single goal. One

operating systems

its specialization.

the operating staff

was PHS family.

of

and

the first

(FORTRAN Monitor Its

name

implies

a large number of cFerating

EVOLUTIOI or rILE SYSTEMS

3

systems appeared, each with its ip cialization. These tied

to

a

own operating procedure and

sy tevs were

programming

typically very

language

(e.g.

clccely

FORTR&N,

COBOL,

Systems (1OCS) emerged as

a part

Assembler). Input Output Control of the Operating

System based on the simple observation that

perform some

all programs

amount of

input and/or

Therefore, rather than requiring each new set of input/output routines and

sufficiently

supplied with especially extended

flexible

the Operating

critical to

include

as

output.

programmer to write a

for each program, a ccamon

collection

of

System. This

computer high-speed,

I/O

routines

were

situation became capabilities

buffered,

channels which required couples program logic

were

asynchronous to efficiently

perform input-output. Prom

the

crude

followed a logical,

beginnings

of

file

though often slow, evolution.

physical input/output functions were many generalizations

ICCS,

systems Once all

localized in the IOCS,

became possible. Usually, there

is no

important difference among the many tape drives available at an installation, sc that any arbitrary tape unit say be used

for input or output to a program. Furthermore, later runs at the same

or different installations

need not use

the same

unit as long as unique correspondences can be maintained. At first it was the choice of

considered that the best input-output units by the

practice in handling object program was

I. INTICIDUCTION

to include unit red in

assignments

followed,

ihis

which

practice

was

seldom.

Vith

the

advent

more foolproof

object prograss

IOCS. The

when

well

it

was

of

the

and flexible

establish the correspondences as

manner of operating was to of the

worked

of IOCS, a

near-univer'sal use

parameter cr to

data and initialite the program

unit a-signeents as

appropriately.

part

as an assembly

dealt strictly

in

symbolic unit assignments. Since the object programs no longer interacted 4irectly even aware ot unit assignments,

with tte I/O units nor were degrees

additional

of

freedom

operating system, Froviding a environment. For assignments

system could

and

(unbuffered,

the

to

determine unit based

upon

availability and performance

(e.g.

automatically

dynamically,

The actual technique of

I/O interference, buffering, etc.). I/0

available

more efficient and convenient

example, the

complex criteria such as

became

single-buffered,

etc.)

doutle-buffered,

could be removed from programmer concern. Tbe

of

proliferation

1/O

device

types,

such

as

low-speed, mediut-sreed, high-speed and hyper-tapes, as well as drums and disks of all expansion

of IOCS

to

shapes and sizes,

include

called data management or file notion exploited is

that

capabilities that

are

the now

system facilities. The basic

just as the

programmer had little

concern as to what tapes were to be used, care what device is used nor

resulted in

be really does not

what method of I/O is evylcyed

ZVCLUTICK O

within

IL E SYSES

broad

programmer column

logical

5

wishes logically

cards,

the

For

constraints.

file

to treat

system

example,

his 1/0

could

if

data as

physically

a 80

utilize

unit-record equipment, tapes, disks, drums, data cells, or a host

of

other

devices in

various

manners

logically

to

simulate the effect of input-output using 30 column cards. This

trend

became

systems, since the

multi-tasking operating devices was an

irreversible with

the

advent

availability of

continuously and dynamically changing.

Environment,

imp idsible to

it

becomes

impractical

designate specific

of

I/0 units

In such

and

probably

statically and

arbitrarily in the program. The syst

importance

of

these

data

cannot be overly emnhasized.

that

programs perform

appears that

management

file

Just as the assumsticn

input-output was

the number and

and

a

flexibility of

basic fact,

it

I/O facilities

demanded by programs are continuously increasing. A major factor the

in the rapid growth of

introduction of

low cost,

direct access devices such as A description of

one as

high capacity,

with tape-like devices.

high-speed,

disks, drums, and data cells.

direct access devices would

fact tMt they have tvo degrees

file systems is

emphasize the

of freedom rather than cnly Since these devices

can be

used for both sequential and direct access applications, the total

amount

of

usage increases.

degrees of freedom necessitate

Of

course,

the

extra

more complex I/O routines and

reliance on

tighten the

furtber

INTIOrUCTIOf

~I,

6

file

Ferfors

systems tc

these functioni . Direct access

devices are

usually as

flexible as

or

more flexible than tape devices, Card-image or printer-iage flied

data

record

types

or structured

variable-length

capabilities could be

vast

second

was

have been

subsumed

as

functions system. factor

time it

was due

data rose nor

convenient

and sophistication

It

correspondingly.

usually

"infcruation

to the

information in the fcras of

of use increased, the amount of programs and

rising

operating systems,

uses,

As the number of users,

explosion".

to the

contributing

systems, as irt early

This

necessity.

these

forms. Altbougb

data

as

the

major

importance of file

VQ11

prcgcau,

by-products of the file The

as

performed by the object

these

majority of

bandled

be

can

physically possible

vas ac to

cnger

haul

the

required boxes of ;roqrams and data to and from the machine. This

information was but

compact

magnetic tape

directly or disk

converted and

maintained form,

machine processible pack. Not

only were

in a such

more as

the individual

programs and data collections large, but the total number of distinct

and

unique

files

(i.e.

programs

and

data

collections) was very large. It is not unccmmon for a single proqrammer to have

and a

to use from 10 to

roughly equivalent number

100 separate Frcgrans

of data

situation became especially acute with

collections.

Ihis

the increased use of

EZCLUTIOP 0? FILE SYSTEMS

online systems. not be

A

7

user at a remote

expected to re-type and

data from the terminal. They and

stored

at

acceessable

the

Quite obviously, store each Robert

enter all his

it

central

computer

would be

unique file

facility,

althcugh

terminal

control.

remot,

uneconomic and unmanageable

on a

Rosin highlights

programs and

must te permanently maintained

alterable under

and

teletype terminal could

separate tape

or disk

these developments

to

sack.

in his

recent

survey of supervisor and monitor systems: "A file system is especially necessary in any system which purports to provide realistic time-sharing. However, the advantages of this facility cannot be overlooked in a cice conventional envirouent". Thus, I/0

people were

devices

addition

to

to

"scratch"

faced with the problem

store

the

thousandc;

of

traditional use

storage.

Direct

capability of storing hundreds or and accessing then

in

permanent

for

access

of using the

input,

devices

files

in

output

and

provide

the

thousands of unique files

any order conveniently.

This type cf

direct access device usage results in many side effects. The first

problem,

organization and

a

involves

a

facility to locate "emEty"

files. such

mechanism

Many

other

as a security

access to restricted hardware

course,

directory-like

individual required,

of

files,

to

complex

stcrage

space on the device keep

track

facilities

are

of

the

usually

system to prevent unauthorized and procedures to

or software failures.

Of course,

each

reccver from installation

1, INTIOCUC'TION

or

group

develops

capakilities

to meet

additional special

elegant

file

system

requirements

or to

provide

extensive flexibility* For the same reasons that

programmers utilize and rely

on the file system, the operating system uses the facilities of the f ilp

system. For example, user

passwords,

account numbers,

etc.),

identification

accounting and

(e.g.

charge

information as well as system

self-measurement data must be

maintained

the

system. space

dynamically using

The on

previously

direct

facilities

menticned

access

of the

directories

devices

and

the

of

file

"empty"

symbclic

file

directory and access control information are usually handled as system files.

The operating system uses

capabilities to store the

FORTRAN,

COBOL,

the file system

various processing programs

Assemblers,

etc.)

as

well

as

(e.g.

many

infrequently used supervisor routines. Furthermore, advanced operating systems perform paging in conjunction to

realize

that

the

important component of manpower required to

"spooling", roll-in/roll-out, and

with the file system. It file

system

is

is not hard

usually

an operating system in

the

most

terms of the

develop and implement, and

the ascunt

of instructions and space used by the file system. Wbereas the

early operating

systems along

with their

rudimentary file systems revolved around the need tc suFIort

MiscEllaneous

1/0

functicns

modern file systems are at the

for

programming

languages,

very center of the operating

I

!YEOLUTION f 3FTLF

STM5

fsys!tem. The

.3-

47

suzpervisor,

programming

systems,

and

programs are totally 4eoendent on the file system.

9

cbject

1P

ban Suffered fro$

Thp development of file systems of

tbe same

prcbless

that of

iss

Frograssieg

Protably the single scat isportant protles

in

most

factors, such receiving

be put into

attention

"best"

the

function

of

this

programming language

depended therefore,

was

up

paper to

of on

FoRIBAN the

had not

utmost

deeply

to

its

the

acceptance

efficiency

language in

hardware capabilities of a specific

in

if the original

that

attention

involved

to illustrate

example,

felt

defined the

macre

times

programmer. It is not

get

For

had not

of

from recent

15

to

controversies, but

trends and changing attitudes. designers

The question

where it has been fcund

than the least proficient

"efficient" the

Frogrameer

other

are finally

proper perspective

studies of real programming groups, that

important,

situations

au productivity and flexibility,

their lonq-deserve4

efficiency can

Vas the ezcesuive

programming

current-day

lany

languages.

course efficiency is

concern with efficiency. Of but

KI3 ODVVI110N

X,4

machine,

terms cf IBM

and, the

C4, it is

possiblf that the evolution of languages such as ?OBIB&N-IV, COBOL, ALGOL, miqht

and PL/I and generalized

have proceeded

in

a

compiler techniques

more organi2ed fashion. The entire

field of generalized approaches tc programming languages and compiler techniques fActor in

has only

recently emerged

the computing profession.

as a

major

SCCPI AIO PUIJCS!

11

?ile systems the

have followed

a similar

development, In

same of "efficiency", each new file system was specially

tailored

to

intended

use

4%perience

the and

or

demands on a facilities

original seeds very

techniqaes given

file

-ould

benefit

of preceeding

of from

the

As

the

systems.

system increased,

its

new featutes and Each of theme

systems drove us further and further flce an

piece-seal file

generalized file

Best literature in forms.

environment

often with a "crovbar".

were added,

organized.

seldom

and

The typical

techniques used

system structure.

this area

manuals

system

to implement

current systems

or in

characteristics for future file user facilities,

but

file

of new file

one of two "clevern

system,

for comparisons

the design

othex type of reference deals

describe the

a specific

litt]- assistance

provide very

has appeared in

but

vith cthez systems.

The

with discussions of desirable systems,

usually emphasizing

adds little insight into

the problems

of designing and implementing such a system. To a certain extent, to

evolve in

systems

"time-sharing" systems.

will

since time is it

is

the

systems

be

called

approaches have begun in

conversational

this paper

such

resource-sharing,

only one of many resources that are sbared and

cnversational that

batcb-oriented These

generalized

is

or

most

interactive easily

nature cf

these

distinguished

from

operating systems.

generalized

file

systems

for

conversational

FI 2

1. ZIM lCCT I~ONJ ol

rvsource-alring crerating systems developed an

,ocvtiity. In order to provide

b; y user prooraex

and the

onsertial.

Purthersore,

4-9~viromonk

and

be impossible

all the fedtures required

supervisor, a

flesible design

owing

complexity

to the

its dynatical1y changing aspects#

to devise an "optimally

The implemnter% were thus forced makc tbe

both by design

of it

sa

tho would

efficient" strategy*

to abandon any atteaFt to

syntol ecre efficic n. and

were free to

develcr a

fles iblt system with a clear conscience, The qoals

of flexibility

contradictory. In most

and efficiency

any multi-tasking system,

Podern, non-conversational,

systems as

well as

processor

time can

vhilf I/C

is in Frogress,

efficiency ceases individual

user

result

in

other

tasks,

be utilized

by channels, bud the central ky

executing other

to be of paramount to

overloading

as

optimlze

excessive

the channels.

tasks

file system

concern. ?urtherscre, performance

unnecessary inefficiencies due to such

operating

I/0 O eLdtiCnfl

In this environment

attempts

be

which includes

batch-oriented

conversational systems,

can be performed asynchronously

need not

I/0

The file

could

conflicts vith

interference

system,

aware of

from the

total requirements, could provide a strategy that results in .k

tir

more harmonious arrangement, noro than individual FvPn in

*'y:;ttM;,

user optimization

single-task cr

there

is

increasing systom thicogh~ut could.

applicatlon-criented ojerating

definite

value

to

art

organized,

sI QPt AND PIPOS .

13

9enor1I' liv4d

file

programs as

well as compilers

system.

For

large,

complex

and asseablers,

user

the Frogram

system requir mects carnct be

action including precise file

is

deteralfed since it

ttiLlcaly

most

input data supplied. Therefor%

a dynagic tuncticn cf the a dynamically flexible file

syntem could often outperform a specialized, but inflexible,

file system. Xt is

the purpose of this

flexible but

precise

model

probably need to be modified and specific

implementation.

Robert Rappaport

a general

extremely important to start with

file system design. It is a

paper to present

This

in his thesis

Primitives in a MultiFlexed

although this

design

will

made tore detailed fcr any issue

was

"eplementing

hbiqlightcd

by

tMulti-Piccess

Computer System" which

describes the development of the

Traffic Controller for the

MrIT Project NAC Multics System: "After having found acceptable solutions for the problems at hand, one asks oneself why it tcc% so long to arrive at these solutions and was there any way to have done it sore quickly? One sight further ask if the arriied-at solutions are in any sense optimum? After being involved in designing a large system involvinq the work of many people, one gets the feeling that such Frobleas as were encountered herA are bound to crop up. The develorumet cf any large system can only retain manageable if distinct parts of the system remain modular and independent. Vithout a theory of computing systems tc fall back on, designing such complex sy-items becomes an art, rather than a science, in which it is impossible tc prcve the degree to which working solutions to problems are in any sense oFtixuv solutions. In much the same way as authors write

____

I

books, large costuter eytems go through several dratts before they begin to take shape. In the abseuce of s theory one con only cope with the complexity of the situation by proceeding in an orderly fasbion to first pto4uce an initial vorkinq model of tho dosired system. This part of the york represents the maor effort of the design and implementation ptoject. Once having arrived at this beacbmark, Uauy of the prib14a5 sy thon be seen in a clearer light ad revisions to the working model are implemented much more quickly than were the original modules. is to the development of a theory, one gets the impression that it will be a long time in coming." Thereforo. computer science,

while

we

atoait

the file system

paper will hopefully serve the

THU

general

theory

model presented

of

in this

need for an "initial working

aodol" from which Wproblems say be seen in a cleaet light".

t'N??CR P 3!P!?I'I?!T|TOW OP PILl SIPUCTUR!t

CHAPIE11 TWO

MotiVation Behind Pile System Design

Tber

are two

systos design.

It

is necessary to

representation of a hierarchy of system.

(1)

establish a uniform

file's structure and

(2)

logical transformations that

V. H.

by the file

basic goals to be satisfied

Henry's recent

management systaes discusses

data

similar notices

cf

separating logical

and physical

file control,

but differs

significantly from

the approaches presented in

this report

in

many fundamental ways.

a reader interested in

It

should be a useful reference to

other current research

RJiorm Renpre 9n Aj 2I 21 his M A typical computer Such

a configuration

secondary storage.

storage Programs

usually has

and data

in

limitless and very

true

a

upon,

Figure 2.1.

varied asscrtment

addition

must be

order to be executed or operated

this area.

_1..

system is portrayed by

devices

It is generally

in

in

to

primary

the

of

Frisary

stcrage

in

respectively.

that J1 primary

stnrage si2e was

inexpensive, there would be

no need for

16

I.

secondary

storage

MOT!YAION

(possible

DEPIND FILE S

exceptions

requirements and tranafor of data). report,

the file

mechanism that hand1in from

be

backup

In the framework of this

be 4.fined

extends the capacity

as the

of primary

software mtcraqj by

and coordinating the trafafer of informatict tc and

the secondary

somewhat hich

system will

may

IEN VESIGV

storage

devices.

more restrictive than

include

as Fart

of

the

This definitica

other common interpretations file system

the

devices or the programs that operate upon the data, interpretation the

is

file system merel, ztores

physical In this

and transfers

information but does not operate upon it.

i

UNIFORM REPRESENTAUION OF FILE S4 RUCTUBE

17

+--->

-

---

I/o DEVICE 1

-

PRrMARY j CENTRAL I STORAGE I IPROCESSOEI 4,, --

-

1/---0--->I I/O DEVICE I I 2 I

1

I

-

I I

0

1 I

*--->I

I

Figure 2. 1 Physical Computer Configuration

1/0-, I/O

DEVICE

n

I I

I

I

18

It.

Early with

file Systems

specific

potentially

MOTIVATION BEItND FILE SYSTEM DESIGN

were usually Vroqrams.

application

a

very

designed tc

large numbet

Since

of

crerate

there

different

ate

secondary

be used in more than one

storage devices, many of which can

way (e.q. sequential or random access. blocked or unblocked, etc.) each file system limited devices

and organizations

intended appication.

itself specifically to those

that

were

Figure 2.2

appropriate fcr

its

depicts the relaticnshits

betwten the applications, the devices, and the file systems. of development produced

This tipe It is without

somewhat analogous any

to assembly

established

standard

language programming calling

impossible,

to use

arbitrary programs

particular,

it was

quite common

payro.l

sequences

or

it difficult, if not

communication conventions, which makes

produced by the

chaotic situations.

as subroutines.

to find

that data

programs, using their

In

files

private file

system, could not be accessed by the file system used by the

personnel progtaes, anO. vice versa. much duplication of effort and and use of these early file

As a result, there was

confusion in the development

systems.

0V4?cRp

PEMSNTION

OF

PYLI S2HUCTUDE

APPLICATIONS

LOIAI- PHYSICALI

DEVICES AND PHYSICAL DATA ORGANIZATION

Figure 2.2 Early File Systems (Analogous to Assembly Language Programming)

19

20

It.

MOTIVATION BEHIND

More recently, the computer system

designers

Amall set

realized

of coamon logical

methoade

that can

programs.

satisfy

Purthermore,

designed in different

a

flexible

devices

the user

vith

the

system.

file

a

the

access

on a

this

Languages,

with

such as

file

systems

device

independent,

to resort

bypass

a

restriction,

inter-mix

access

to assembly and

methods

be

variety

of

This

;rcvided

interface

with

the emergence

for

cf

business

applications.

suffered the (1)

could

structure.

COBOL

FORTRAt for scientific

a

application

methods

to operate

compared

select

(or access

of most

needs

2.3 illustrates

can be

necessary

?ORTRA4

possible to

device-independent

as the programming languages: really

is

tE3IU0

and operating

file orgaoizations

manner

Pigure

applications and

not

it

these

logically

Oriented

access methods

manufacturers

and device organizations.

This apprcach

Problem

that

It? SYST?!

The

sane shortcomings

despite claims, they vere (2)

occasionally

language

(3) it

vas

(analogy would

it

to overcove not be

possible to

was cr to

intermix

and COBOL subrcutines).

| Iii | it

UNIFCPP rEPRESINTATION 0

P1L!

STRUCTURE

APPLICAT IONS

III

LOGICALI

A PHYSICAL

DEVICES AND PHYSICAL DAT!A OBGANIZATION

Figure 2.3 Access Methods File Systes

(Analogous to Early Prograsming Languages, Sueh as POPTRAN and COBOL)

21

22

It.

NOT!I ATOt BUIiD FILI 5IST2I

CISIGI

In order to overcome the weakness in the access methods approach, it

reprtsentation that and

(2) be

design a gj11

is mecessary to can ()

deviece

be

used for

inudao-4entv

analoqous

to the

language,

for which

for

This

search

PL/I is

unifcre file

every application idealistic goa1

*?he" universal probably

is

programming

the most

ambitious

attempt to date. It

is

reasonable

representation vill be it

will

be desirable tc

access Since

methods fcr the

access

representation,

to

expect

that

such

so atomic or primitive

a

unifccm

in fcce

that

construct mote powerful specialized

the convenience methods

are

of

built

the typical1 apon

the

user. uniform

it is much easier to modify or implement new

access methods or, if necessary, operate at the atomic level to

bypass the

approach

pushes

restrictions

of

the access

the logical/physical

methods.

sejaraticn

of

This file

system structure much further as indicated in Figure 2,4.

23

uyrropn REPRESENTATION 0?' FILE STRUCT1Ufh

APP LIC A

0TONS

LO GICA L

ACCESS HETkIODS

EETTO UUIOMR PHYSICAL V DEVICES AND PHYSICAL DATA CIGANIZATION

Uniform

Figure 2.4 File Representation

(Analogous to Universal Programhing Language, P1./I ?)

XI. NOTIV&TIOI

24

?ho

rationale behind

uniform represontation

the

is not

BEHIND TILE ST

melection

tN D3101

of a

pazticular

exav~le,

trivial. For

the e

are three broai classes of c.spon uoniferm representaticni: 1.

Stream -

every file

sequential stream

is treated

as a

continuous

of information. It

to access only the current

is possible

position in the stream

or reposition to the beginning of the stream. this representation can be

imFlemented conveniently on

almost all secondary storage

2.

devices, although it

does not

provide the user

with very

efficient

features for many

applications.

-

Direct-Access

every

file

is

roweful o

treated

as

an

ordered collection of items. Each item is directly accessable

by

means

corresponding

to its

of

organization,

Stream,

but is

intrinsically

very serial

oaique

position

This representation, which storage

a

identifier

in the

ordering.

corresponds to primary

is

more

powerful

difficult

than

to isplement as

devices, such

on

magnetic

tape. 3.

Associative unordered

-

every

collection

file of

is

tre&ted

items,

each

as

an

item

is

directly accessible by means of an identifier that has been very except

WassociatedO

flexible for

a

with the

item.

representation. small

class

This

is a

Unfortunately, of

sophisticated

UNIFORN REPHIESENTA1IO

secondary

0f FILZ S RUCTU3R

storage devices

25

the Ilplesentaticn

in

very complex and inefficient. Irrega r4Iess

chosmn, vieve8 as

of

the spocific

the important

concept

being identical in

particular physical

is that

This generalization is depicted in

t.I

all

representatic:

files can

structure independent

device on which

be compared with figure 2.1.

uniform

the file

be

of the

is recorded.

Figure 2.5, which should

IL.

BEIND FILE SYSTER

#OITY&ION

*---)1

t, PCIH

i.I

ESSZG

I

IFILE -- 4~e

I ARAY I 5qTCPAGE

4

-4

i< -%:>1 I

CENTRlAL

i1 tNIFOSH I I FILE 2

1PPOCESSORI 4

-,

I

4

-.

I 1

4

+--->I UNIFORM I I F aLE 1 5

Figure 2.5 Lcgical Computez Coufiguration

27

1411RARCHT OF LOGZC&L TIUSPSFIafTAIz0S

JJ

llietrbZ 2f12 Although a

ranmf rmationm

later sections,

presented until

not te

of

characterI at 11,

'geiteraI

Rost

user specifieu his

particular, a

system will

a file

precise description~ of

several

there are

In

ayateac.

file

read or

request, such as

write, by designating a file and an element within the file. Rost advanced tile systems allow considerable flexibility in the

is

tile, it

a

specify

used to

mechanism

typically

Furthermore,

described by means of a symbolic file name.

the

element within the file is specified in terms of the logical of elements

representation say

which

specification cf hcv

to a

correspond

not

or may

systfy

particular file

in the

physicil

precise

and where the element

is stored.

PCL

example, a typical request eight he of the form: "Read item 23 from file AtPHA into location 1564.N Realizing that devices

in

sequence of

somewhat

information must obscure

final form

secondary storage device. as

a

single

store'i nP be

must

thE

*

U-'L'~S

that physically operates cr Quite often the transfor-iatirt i:, step

oversimplification that hides the use*

there

transformations required to ccavert

request into its

viewed

ways,

usually be

but

that

fundamental

In Figure 2.6 the conversion process is

is

d

sechartZ.-- irix illu~ft.r.~o!-i n

terms of a discrete sequence of logical transforsuation-> Since the specifics of these *ransformat ionE a~,

rt,:c '-

2P

Ir. MOTIVAION BEHIND FILE SYSTYB UESIGN

?FAr ITEM 23 PROM

11

SEND LETTER TO

FILE

If

JOHN DOE'S

It

HOME ADDRESS.

ALPHA INTO

LOCATION 1564*. SY4BOLI(

ItI

?ILE N tE

JOHN DOE

II

I

II

tFS V NUMERIC

I V 030-34-1234

F1 L*: f ' FIEI BF5

I

if

V

IV

FIIE DESCBIPTOR

FOSM

I

I

Birth date Office Address Home address

I

II

etc.

I

II

I

I V

LOGICAL I/C COMMANDS

nsP

I

i I

I it 11

I

V

EXTRACT HOME ADDRESS

I

IIIj

V PHYSICAL

it

V SEND TO

I/0 COMMANDS

I

POST OFFICE

iI 10 DEVICE

I 11

POSTMAN

Figure 2.6 Logical TransformatioDs in

DELIVERY

File System

I

HIERARCHY OF LOGICAL

obvious until analogy is the

TRANSFORMATIONS

the note

detailed sections

presented in Figure

file

system

29

2.6 that

ttansformations.

intended to provide

later, a

sitFle

loosely parallels

The

analocay

some insight into the

is

only

rationale behind

each stage of the transformation. The process item 23 from step is

starts from

the user*s

request to

the file ALPHA into location

to convert

the symbolic

"read

1564". The first

file name

into a

unique

numeric file identifier. In the analogy, this ccrresonds to looking up John Dce's identifier

which is a social security

in this illustration. The purpose for using an identifier is basically

the

same

in

both cases.

convenient to store information, by means

of a unique numeric

name which

may, under

unique (i.e.

is

usually

more

manually or automatically,

"key"

rather than

certain circumstances,

there may be more

case other factors

It

a symbolic not even

than one John Doe

oust be considered in

be

in which

order to uniquely

identify the person under consideration). The file access

all

identifier can then the

informaticn

known

information collectively is known In

the analogy,

this would

be used

to ccnveniently

about

a

file,

this

as the file's descriptor.

correspond

to requesting

all

information in the social security records of 030-34-1234.

Now nPcernary

that everythi.g to

consider

is known the

about the

specific

performed. Using the file descriptor,

file, it

operaticn

tc

is be

a sequence ci lcgical

30

11. INOTIVITION BEIIND FILE SYSTE

I/0 commands can

commands

because

be produced. These are

they

do

not

called logical I/0

consider

characteristics of the secondary storage

DESIGN

the

physical

device to be used,

This is analogous to putting an address on a letter which is usually done

without considering

the physical

destination

the transformation,

the logical

nor the route to be taken. In order

to complete

1/0 commands must be converted into the appropriate sequence of physical I/O commands. This comple% depending upon I/O interfaces to

conversion may be trivial or

the peculiarities of the

the devices. In the

device and

analogy this process

is performed at the post office where the address is used to determine the physical

routing needed

to get

the letter to

its destination. The final step in the of

information.

This

process is the physical transfer

is usually

softvare/hardware interactions

performed

to activate

by

means

of

the aFrcpriate

device and confirm the successful completion of the request. Of course, in

the analogy this transfer

the postman ("neither

rain nor snow nor

is acccmjlisbed by

dark cf night...")

assisted by trucks, planes, trains and other automaticn.

31

BASIC 1COWCEPTS

CHAPTER THREE

Pile System Design Model

2111

9211M.U 212-d 12. 11-2 System .A912 Two concepts are basic to the general file system model

to be introduced. These concepts

terms "hierarchical

have been described by the

memory".

modularity" and %virtual

They

will be discussed briefly below.

Vierarchical Modularity The term

"modularity" means

many different

different people.

In the context of

concerned

organization similar to that

with an

Dijkstra

and

things to

this paper we

will be

prc;csed by

Randell.

The

important aspect of this organization is that all activities are

divided

into

structure of these or ring

sequential

processes.

A

hierarchical

sequential prccessEs results in a level

organization wherein

each level

only communicates

with its immediately superior and inferior levels.

The notions of "levels of abstraction" or "hierarchical modularity" can Consider an

best be

presented briefly

aeronautical engineer using a

by an

example.

matrix inversion

12

I1.

package

to solve

abstraction, the

space flight computer is

FILl SVST28 DESIGN MCDEL

problems. At vieved as

his level

a matrix

cf

inverter

that accepts the matrix and control information as input, and provties

the

programmer who wrote have

had any

levels

matrix

inTerted

as

output. The

application

the matrix inversion package

knowledge

of

of abstraction).

its intended

fe tight

usage

view the

need not (superior

computer

as

a

"PORTRAN machine", for example, at his level of abstraction He

need

not

operation

have any

of

abstraction), vith it. at

a

the

specific knowledge

FORTRAN

but cnly

Finally,

different

the

of

system

static

internal level

cf

he can interact

FORTRAN compiler implementer operates

(lover) level

since

(inferior

the way in which

of abstraction. In

example the interaction between is

of the

after

the

completed, the engineer need

the

matrix not

the above

3 levels of abstraction inversion

program

interact, even

is

indirectly,

with the applications programmer or compiler implementer. In the form

of hierarchical modularity

design mcdel,

used in the file system

the multi-level interaction is

ccntinual and

basic to the file system operation. There

are

orjani2ation.

completeness designers and

several

advantages

Possibly the

of each

cften a

such

most important

level. It

is easier

implementers to understand the

interactions of each level and is

to

very

an

modular

is the

logical

for the

system

functions and

thus the entire system. This

difficult Froblem

in

very coallet

file

3

BASIC CONCEPTS

V -

--

--

----

ILEVEL K+1

-------

V

Figure 3.1 Hierarchical Levels

I

la

TIT. FILE STST

systems with tons ani

or hundreds of thousands

DESIGN iNODIL

of instructions

hundreds of inter-dependent routines. Another

by-product of

this

structure is

"debugging"

assistance. For *xasple,

when an error occurs it eta

be localized at a level

and

verification usually

freliability checkout)

an impossible

using all possible in each finite thp

identified easily. Tbh

task since

it

of relevant

internal

Therefore,

an

tests,

it

structure

of the

important

goal

a

file system

small

that

construct a

is necessary

to ccnsider

mechanism is

they can

each

level

and verified,

being

abstractions

more

introduced,

to

to design

all

remains within

then

be

be the

tested. internal

test cases tried

level

1, level

but

because

powerful, the

tests

In order to

number

of

is

without

overlookinq an iurcrtant situation. In theory, level te checkel-out

is

requests occuring

structure so that at each level, the number of sufficiently

completo

would require

system

data inFut and

potential "system state". set

of

usually

0 would 2, utc., of

"special

the

cases"

bounds.

Virtual Memory There are objectives: of

four very important and difficult file system

(1) a flexible and versatile format,

the mechanism

o"irep of

machine

,aril automatic

as

possible

should

te

invisible,

and device independence, and

allocation of

(2) as much

secondary storage.

(3)

a

(4) dynamic There have

35

BASIC CCNC9FTS

been

several

file

generalized

"segmentation" II

I ...

! I

I

III

I

I

i

.I A Ile

I I

til

I-< I iead from Pile

DESIGN NlODIL

I I-....

LI_

I

I

I

Main Storaqe

File 8

tigure 3.2 "Real"

Memory and "Virtual"

File Memory

IASIC CONCIPTS

37

60 chdarcters at a time

routine to access locations

80, i60,

0,

...

starting at byte othev

Almost all

.

possible

formats can be realized by similar procedures.

1Ecept

system

for the

mechanism,

modules,

formatting including

file

the entire

allocations,

bufferijg.

4ad

physical location, is completely hidden and invisible tc the user.

This

relates

indeFendence. In which device

closely to

many file

the

objective

systems the

should be used,

its record

of

device

user must size (if

specify it

is

a

hardware formatable device) , blocking and buffering factors, and

sometimes even

parameters and

the

algorithms chosen might,

optimal, many changes required

to

addresses,

physical

run

environment. This

in some

might be necessary if with

a

different

strategy does not

Although sense,

the be

the program is

configuration

prevent the

CT

user frem

providing additional information, such as how often the file will be

that this information is is

manner,

used and in what

Dot

The important

factor is

necessary and its aignificance

determined by the file system rather than the user. There are

by this

very serious questions of

file system

strategy. Most of

efficiency raised these fears

eased by the

following considerations. First, if

to

very

be

used

seldom

I.s

efficiency is not of paramount hand,

in

a file is

development),

importance; if, on the cther

it is for leug-ters use (as in

program),

program

can be

a commercial prcduction

the device-independence and flexibility for change

111. FILE SYSTER

38

and

upkeep will be very

proqaem*r

of the coplexities of

is

he

allocations,

constructively and rslstiq than

to

of current

coordinate the specify critical

files

utilize

hiu

overcome

to

the

than

Farameterm.

the average

te.

shortcomings

Third, in

system will

energy

problem rather

direct-access devices,

file

devices, ea4

clever algoritbs

of his

trvcturinq

file system.

of the

that. the

to

by relieving the

the formats,

creatively to develop

*tricksN

rculiarities

possible

able

the IcqiAl

clever

complexity

important. Second,

DISIGM AODIL

Le

view of it

better

is

or the

quite able

user attempting

to to

OVEIRVIW OP PILI STSTBM DESIGN MODEL

pap r can specitic

be viewed as

a hierarchy

implementation

sub-divided

or combined

several modern

separate

certain as

file systems,

report, attempts

in this

to be presented

system design modpl

?be file

39

of seven levels.

levels

be

A recent

reuired. vhich

way

In a

further study

will be published

to analyze

the

systems

of in 4

in

the

framework of this basic model. In general all of the systems stuied fit into

model

are

the model, although certain

occasicnally

reduced to

trivial

levels in the

form

or

are

incorpormted into other parts of the operating system. The seven hierarchical levels are: 1. Input/Output Control System 41OCS) 2. Device Strategy Modules (DSM) 3. allocation Strategy Modules

JASM)

4. Pile Organization Strategy Modules (POSM) 5. 2asic file System (BPS) 6. Logical File Systes (LFS) 7. Access Methods and User Interface The hierarchical organization can be described from the "top" down

or from the "bottom'

ordtinarily be implemented

up. The file

by starting at the

the Input/Output Control System, and more

mian nqf ul,

however,

to

system wculd lowest level,

working up. It arears

present

the

file

system

organization starting at the most akstract level, the access

40

1I1.

routines, ani

removinq

FZL.' '-.STU DZIGS AODZL

the abstractions

as the

levels are

Op"-leiwdy", In the

Eolloving presentction

the terms

"file

vame ,

"file identifier", and "file descriptor" viii be introduced. Detailt

e

lan.tions

sections,

the following analogy py be used

cannee

assistance. A perscn's haphazard process or manageable

provi4 ed

(file name),

of assignment,

is not

Social Security

Access

unique identifier

identifier can

consists

a format

be routines

record A'iles,

of

on the to

the

file. In

ffile descriptor)

of

access methods or

qcneral there

fornz-

ubviously,

a

will

files, and

axample. Many

routines, also called

, n be user may

access methods to augment this level.

that

fixed-length

record

record files, fo:

data manayesent,

file system.

,:,utines

simulate scquentia3

more elaborate and specialized

the

set

sequential variable-length

direct-access fixed-length

of

then be

(AM)

level

superimpose probably

person, such

that person.

Methods

This

to the somewhat

usually assigned to each number. This

later

necessarily unique

used to locate efficiently the information known atout

until

fot the reader's

due

for computer processing. A

(file identifier) is as a

name

be

supplied as part write his

own

OVuiEIuw

O

41

PILE SISTER DESIGN MODEL

User Proq:La

I Level 7: Access Methods (AN) User Interfaces

I

! I I An I LL

Level 15:I Basic File System

In t LJ2

I BFSI

1

I

IPOSFI

._LAI.

V I l IFOSHI L.._IaL

I I I __j I I I ASMI I L_LIL

Level 3.: : Allocation Strategy Modules (ASK)

I

V 1i IFOSII ._J.LI

I

I

3 V 1 1 1 I I ASKI I 11211

I V

I 1 I I I V

1 I DSMI i_11_1I

Level 2: Device Strategy Modules (DSH)

I

I

V Level : File Organi2ation Strategy Modules (FOSM)

I I I AM I LJi. V I I LFSII

Level 6: Logical 'Pile System

'

I

I DSHI I_12ii

SI V I I IIOCSI

revel 1" input/Output Control System (IOCS) I

I

I

I

LDe vices Figure 3.3 Hierarchical File System

I

Z52

Ills

FILE SYSTEM OESIGN

9I3 USER ON 3 I Accss I I METHODS I VILE ADDRESS I COPE ADDPESS I LFNGTH I._tF1CTTON

I I I

ATA §

I

3 I V

-

I L F S *

----

I ?TIP IDENTIFIER

I

I

I fTLF ADDRESS

I

I

I CaFE ADDPESS

1

1

I LENGTH

I

I

_1

V

I_FU KCTON

&CDIL

I

NAME-->IDEWPIFIEU

j

IDENTIFIER--> DESCEIPTOP

--

4.-

I a F S .IFIIF DESCPIPTCP I ILE ADDRrSS l CORE ADTIrESS I LENGTH FIJFCTION

-

1

I

I I

I

I V

I 4

.-

,

I F 0 S H j

ADDRESS-->LOGIC&L

----------

I/o

I DFVICE IDENTIFIER I I LOGICAL I/O LISI COFF hDDFESS LIST I I_PTINCTION _

1 I V 4----------4.

I D S M

j LOGICAL-->PHYSICkL /0 I/O

-----------+

ItFVICE 11_/0

ADDP.SS,

COMMAND

LYT3

I __1

V *------------

I 1 0C S I *----------v 4.----------4

I EIVICE

I

+------------

Figure 3.4

Parameters and

Data Eases Used by File Systes

43

OVERV1!V OF FILE SISTER C!SIGN HOCIL

Logical File System (LYS)

Routines above symbolic

System to

File

a file.

name vith

corresfonding

unique

is the function of the Lcgical

It

the symbolic

use

"file

assoclatv a

of abstraction

this level

name

file

Below

identifier".

to find this

the

level the

symbolic file nave abstraction is eliminated.

Basic

file

System

(BPS)

The Basic File into

a file

descriptor

descriptor.

In

provides all

an

The file

rights interlocks, File

and System

associated based

upon

set

performs

the file

file

is

File

Organization

etc.),

many

of or

descriptor,

the

*closing"

access

check

read/vrite

bases.

The

functions a

the

Basic

ordinarily

file.

finally,

the appropriate FOSH for the

selected.

Strategy

Direct-access virtual

file

physically

to verify

system-vide data

with "opening"

sense, the

needed to

also used

vrite-only, up

identifier

Olength" and "location" of

descriptor is

(read-only,

the file

abstract

information

locate the file, such as the file.

convert

System oust

memory.

physical associated

A

records. with

Modules

devices

physically do

file must Each

it. The

(FOSM)

be split

record

has

not

resemble

into many a

Pile Organization

unique

a

-eparate address

Strategy Module

maps a logical virtual memory address into the ccrrespcnding

44

111. PIL2 STSTE! DISICS UODEL

physical record ad4rcss and offset within the record. To real or for

the POSM

memory area

write a portion of a file,

to translate the logically into the correct

or pcrtion thereof. by tke ASH.

The

contiguous virtual

collection of physical

If necessary,

list of

it is necessary

new

records to

reccrds

records are allocated

prccessed

be physically

is passed on to the appropriate DSM. Althouqh not allocate

"hidden"

reduniant or

the FOSH is

necessary, file

buffers

unnecessary

I/0. If

virtual memory is contained in the

data can

be

transferred

the file may be buffered. buffer areas

data in

write

and

to the

to

minimize portion of reccrd,

designated user conversely

main

output to

a sutficiently large number of file, it is

requests can

out of

"closed", the buffers

order

a currently buffered

are allocated to a

read and

moving

If

designed to

the requested

without intervening I/0.

storage area

all

in

often

be

possible that

performed by

When

the buffers.

are emptied by updating

merely

a file

is

the thysical

records on the secondary storage device and released fcr use by other files. actively

in use

Buffers are only allocated to files that are (i.e. "open").

Allocation Strategy Modules The available allocating

Allocation

(ASM)

Strategy

records on records for

a device. a

Modules keep They

file that

track

cf

the

are responsible

for

is

being created

or

I

OVERVIEw OF FILE STS?!M DESIGN MOD2L

I

I

I

I

ILlfll,).IId2.1_f1/I) I--->tLII&I/I /III/II/I I .-.-4--------l I I////////I II I I I 4---) FILE7 4--------------' I ----------------I I I-I 1. I IIII .------------I

4--------IL...LJS_./I

III//III 4----------

I

*----*

1

'-> It12.L .

..I>>>>>> VOL2(l) I ---------m ------VOL 2() I >>>>>>>.+

6

4-----------

> I //

/../

-4-

-------

2I/ , z3.nL +->I I -------------D) I.VCL3.(2) /_33 I VCL3(2) I I I I ID LES3(D 4----. I I -----------I VCL2(2) I FIE5 ----- ----

-

4

---------------

+->ILOL24i./I

--------VPDD for "VO1.2"

II--I/I---1

*----------------

-------------------

I

-----------

-

I +-->I/VqL"I_.U/I +----------- " I I VOL3(1)1 >>>>>>> 1 -+ 1

---

vL2/ >-------------------1 DR41(D) I VOL2(3)

I..

I DI83(D)

I>ZgL3.j/-LI I----->>>> --------*------I//I// ---------VOL'(U) I >>>>>>> I ------1-- + -i -3(>>>>>>> VO I ---------I+• ---------foar"VO;.3" 1-vp'DD "YOU" fo VOL35

I FILE8

VOL3(3)I

I

I

->I/VL../I I--iiii---I +---------

Figure 4

Example of File Directory

01L3(5)

I

I

-------------------

+------4

.-----------

I

------------------------

I

------------

>>>>>>

i

I VCLl(2) !

IILE1 .

*---------> IL------/I I I//////// 4 ---------*

0

I

L

I

VFDD for "VOLI"

VOL3(2)I

--

. .. . ]--i1-2 . . I .VOLI(Q) . -------

..... VOL 1(( ) >>>>>>> I VOL (5)I)>>>>>>> +---------VOLt(6) I >>>>>> j--4 I +---w -------

I

4

--

.

..-

....

. -.. - .-,I, . I. -VCLI(3) 34 - >>>>>>>

VOL24)1 >>>>>>>

I V-C-tL ./I -------------I VOL3(4) I FILZ3

1 DIR2(D) I VOL2(3) ---

.

4. - - - - - -

. . .

1 J.

.

-- * I VOL.1 (1)' >>>>w>> #- - ..... -I..-

I

*-)

4

---

*-> I FILE6 +

I

I VOL3(4)

Il

I 4

---------4---

(

--------------------

I VOI1(5) FILE6 --------.---------.

I

I VOL3(3)

I

I VOL3(4) I PILEI ----------------------------------

I

I FILE2

Structure (to BPS)

q 'i

IV.

File

The

MULTI-CCPUTIR NT2TMOl

Strategy

Organization

requests that specify

A sample call

or file identifier. to

processes

a file

descriptor save

rather than a file

from the Basic File System

orga4niation Strategy

File

the

bod ule

a file in terms of

(the entry extracted from the VFDD)

E*TIROXMUI

Nodule,

PL/1-like

in

notation. is: OSM_

CAll where

;

EAD(CESCBIPTOP,COREAriBFILEADDR,COUN) FILE ADDR,

COREADDR,

and

have

COUNT

same

the

interpretation as discussed above. The

of the Basic File

primary function

System

reduces

single request:

to the

CALL POSM_READ(VFDDDESCIPICR,DESCBIPTOR,* is

VFDDDESCPIFTOR

where

descriptcr

the

(INCEX-1),M); the

of

VIDD

associated

with the volume name supplied by the Logical File

System as

part cf

is the

Basic File

System performs

as

protection

validation

core-resident

File

Active

files that are in

simple

the

System,

sufficiently

length

small level in

use

domain

of

and narrow

the hierarchy.

maiatenance

that

cf a

the 'hat

of

the

descriptor for

out, as in the Logical Basic

it

tasks,

enables efficient

identifier and "open").

(i.e.

several other

and

Directory

a file's

association between

File

standard

the

and DESCRIPTOR is the desired file descriptor.

The such

ftcu

INDEI is

identifier,

identifier, I

specified file VPDD entry,

the file

File

System

is

remains a ccnceftually

PIL2 OSGANIZATIOw STIATEGY NODULE DESIGN

91

The Logical Pile System and Basic Pile System &te, great extent,

application and device independent.

Organization Strategy Modules are

to a

The Pile

usually the most critical

area of the file system in terms of overall perfcrmance, for this reason it

is

be

large

used

in a

discussed papers

expected

in this

listed

that vore than

system.

section,

in

the

Only

one strategy may

one strategy

the reader

say

eeferences for other possible alternatives. The physical

POSH must

map

the logical

record address

cr hidden

file

address cntc

buffer

supplied file descriptor information.

based upcn

a the

In the simplest case,

the sapping could be performed by including a tvo-part table in the tile

descriptor. The first part of

indicate a contiguous

range of virtual file

second part of each entry physical record address. all

file

is a

potentially quite include

the

It has

descriptor.

One

a specific

of

mapping the most file

length, whereas

the file's

large. Therefore, it

strategies utilizes an arrangement.

been assumed, however, that

function of

entire

addresses, the

would designate the corresponding

descriptors have

mapping table

each entry would

table as

is not part

powerful

maps, Pigure

length and

file ,.5

the is

feasible to of

the

file

organizaticn

illustrates such

92

MULTI-COPUTIt

IV.

I 1

NfTVOI% !IVZICNIENT

I)--i----I54--------4 I

-

I

-

I



I

Descriptor

9991 4---------

0 I >> > - - I >>>>>>> I

S+

Descriptor

I I

I 499,000

• • •

I

- - >I --.-.....-...........

I

4

-

I

• •

I

l I I I 4->I ---------. I * -- - -- 0-- - -- I >>>>>>> '+ * I 1 9991 1

499,9991 *

5

I

4

I -

- - - - - --

--

,

I>>>>I --->1 >>>>>>> I ------------------ >1 >>>>>>> I- - 4

I

I

II

.

I

4

-

I I I

4

>I >>>)>>> I

4------

Descriptor

I

II

I>>>>>>>

I-

-

-

-

--

1 I

--

*

II I I

• • *

>>>>>>> I I

I -

I>>>>>>>

,

-

-

-

--------

I

4-

1 249.999.000

1 0

•I->I•---------

I I 249,999, 999 I 4-----------

I • •

I I

4

----------

*->j

I I

• •

I 9991 4------------

Pigure 4.5

Example of File Organization Strategy

I

I

93

M!SIGN

FILE OSGANIZATION STRATEGT RODULT

In this example it is assumed that each file is divided

several states

If the

physical record, bytes lotig, the file sap file

the

contains

descriFtor

tirtual

the

of

address

to

(assused

bytes)

for files

file

4S9,999

address cf a of the

designates

the

the file (blocks are ordered 1000-1999,

0-999.

file addresses:

etc.), Furthermore,

2

require

and

Each entry

secondary storage.

the

coarerFcding

ketween 1000

file is

physical addrest of a block of by

bytes, the

to 999

file desczritor specifies the

locatcd on

sap

range 1

in the

length is

file's

length. If

its current

depending upon

cne of

can be in

A file

byte physical records.

into 1000

greater than

but less than 250.000,000 bytes, there

2000-2999,

500,000 bytes.

are 2 levels of file

saps as illustrated. This strategy

has several advantages. Onder

the vcrst

conditions of random access file processing only frcu one to three

I/o operations

several hidden file maps,

need to

buffers for

the nuaber of

be

performed. By

blocks of the

file as

1/0 operations required

accesses can be drastically reduced.

utilizing well as for file

I

1ULTI-CONPUTER NETWOPm

IV.

The function

of blocks

of allocation and deallocation

several separate

Involvea

LiiVU OUlENT

otfore d.scribinq

4ctofI.

the

vise

to review the

allowed to grow in size,

the POSH will

mechanisms,

iuslimentation of the

it

is

d4sir J charactoistics: A file is

1.

request

from the

add.tional blocks

data portions

of a file

ASH for

the

tables, as

or its index

needei. contain from 8000 to

Common direct access devices

2.

32C00 separately not feasible in 3.

thus it

allocatable blocks,

to store all

is

allocatica iefcrtaticn

main stcrage.

Since twc

x&tdepeadent processors

at the same time, it

now files on the same volume is necessary to provide

interlocks such that they

do not accidently allocate the than one

file, yet not

writing

may be

same block to sore

require one

frccesscr to

wait until the other processor finishes. These

problems can be solved by use of a sFecial Volume

Allocation Table volu~p must For

(VAT) on

be subdivided

direct access

each volume.

In this

into arbitrary

devices with

contiguous areas.

movable read/write as a

heads,

"cylinder") covers

Pach discret.

position (kncwn

area of

UC tc 160 blocks. A cylinder

about

schete, a

an

is a reasonable

ALLOCATION STIATEGY NODUL? DZS'I

unit of sub4ivision. is

entry in

a correspondinq

*bit map"

for each

9,

cylinder on the vclume, tho VAT.

that indicates which

not teen allocated.

Pro

already been allocated.

cylinder consists of

the cor.@spondin q VAT entry mould bit is a

the first

has not been allocated;

entry contains a

blocks on that cylinder have

example, if a

40 blocks, the bit map in be 40 bits long. If

Each

tbere

00". the first block

if the bit is a "1", Likevise for

tbc block has

the second,

third, aad

remaining bits. Uban the POSE first

requests

allocation of a blcck cn a

volume, the ASM selects a cylinder and requests that the DSM read

the corresponding

VAT

entry

into main

storaqe.

An

availatle block, indicated by a 00" bit, is located and then marked as allocated. the VAT entry

will be kept in main storage

remains in use, and blocks viii

that cylinder. Vhen all the

be allocated on cylinder

As long as the volume

have been

allocated,

the

blocks on that

updated VA?

entry

is

written out and a now cylinder selected. With this technique the

amount

of

information is

main

kert

storage

to a minimum

(about

for

40 to

allocaticn 160 bits per

at the same time the numter of extra I/0 clerations

volume),

is minimized

(abcut one per 40 to 160 blocks of allocaticn).

The problem ef interlocking still

required

remains. As

blocks on

the

different cylinders

they may both

i"

long as

the independent processors processors are using separate

allocating VAT entries,

proceed uninterrupted. This condition

can be

ZYV.

MULTI-COMPUTER N1TVCPV INVZICNNEW?1

accoaplitheA1 by utilizing a hardware feature kncv cecords" available Syatfa/36.

Fach

on several of tho

computers including

TAT entries

consistiisq of a rhysical key area .rea

contains

The key area

the allocaticm

as "keyed

is

a

the I5

separatc record

and a data area.

information described

The data above.

is divided itto two

Farts: the identification

number nf the processor curzently

allocating blocks on that

cylinAer an4 have been

an indication if

all blocks on

allocated. A VAT entry

that cylinder

with a key of

woul4 identify a cylinder that was

all zerces

not currently in use and

had blccks available for allocation. There are 1/O instructions that can

be used by the DSH

that will automatically search for a record with a specified key,

such as

zerc. Since

the device

controller will

not

switch processors in the midst of a continuous stream of 1/O operations from it is

a processor (i.e. "chained

possible to generate

T/O commands searching the

that will

an uninterruptible

(1) find

VAT for a

I/C coamands"),

an available

entry with a

sequence of cylinder by

key of zero

and (2)

change the ke.y to indicate the cylinder is in use. This thus solves thp multi-processor

allocation interlock Ezcblem.

DPYVCR STRATEGY

strategy

Device

The

"logical

convert

Nodules

IO

command

the Input/Output

Control

allocation

Strategy Modules into actual computer

sequences

that are

forwarded to

1/O

Modules and

Orqani:ation Strataqy

the l'ile

r*qumstm* from

97

.51GN

FOCULI

System for execution.

for example)

bytes

(!1.1000

is

amount

buffers.

will, therefore,

It

transfer for

reaA

block 211

into

I/O

if each

12930,

into locatior. etc."

location 13930.

consider the physical characteristics of rotational delay and

to request

generate logical I/O

fore*: *read block 227

requests of the

hidden

in

about 10 blocks

The ?OSM will

long).

are

blocks

be necessary

several blocks (e.g.

10P.0 bytes

block

unlikely that a

is

it

issued,

the needed

cf

siguificant

a file

large portion of

request to transfer a

Vhen a

The DSH

must

the device such as

*seek" position for movable

heads. It

then decides upon an optimal sequence to read the b1ccks and qenera te

the

including

necessary

actually

retry,

and ether

issues

housekeeping

as

of course, very device dependent

command

sequence

Input/Output

the physical 1/O

detailed strategy for choosing the

here.

1/0

The

commands.

positicning

System

physical

Ccntrol

request,

discussed earlier.

error The

optimal I/C sequence is. and will not be elaborated

The preceeding sections

have

biqblighted the framewcrk

of a file system. There are, of course, many other itportant decizionr to ani

be sade in such

orgarization

measurement

and

of

a system, such as

tables,

accounting

error

the format

conditions,

of

the

subtle points will be discussed in this section. The Basic

File System is

representerd

by

pr.sented,

the

unique identifiers. identifier

very efficient

mechanism

avoided

is

In the

much

designated

as

the

turle,

resulted file's

for accessing a

of the

with files

specific system

VFDD>. This representation