M.S. Thesis. Alfred P. Sloan School of Man an-mnt and Dept. of E1Pertrir.a. Fn enr; ag ... AlsoF, Dnrek Henderson, Norm Kohn, and Claude Hans. The author's ...
tINCI ASS IF IF Security Classification DOCUMENT CONTROL DATA - R&D (Security claeelllcation of title, body of abettact avd Indeing annotatlon must be entered when the overall report is classified) ORIGINATING
I.
28.
ACTIVITY (Corporate author)
REPORT
CLASSIFICATION
SECURITY
UNCLASSIFIED
Massachusetts Institute of Technology Project MAC
None
GROUP
REPORT TITLE
S.
Design Strategies for File Systems 4. DESCRIPTIVE
M.S. M
NOTES (ype
Thesis.
o repart and Inclueive datee)
Alfred P. Sloan School of Man
S. AUTHOR(S) (Last name, first nagn,
an-mnt
and
Dept.
of E1Pertrir.a
Fn
ag
enr;
Initial)
Madnick, Stuart E. REPORT DATE
6.
NO. OF REFS
TOTAL NO. OF PAGES
9s.
ORIGINATOR'S REPORT NUMSER(S)
9b.
OTHER REPORT
33
114
October 1970 CONTRACT OR GRANT NO.
I.
17b,
7a.
Nonr-4102(Ol)
MAC TR-78 (THE(S)
b. PROJECT NO. C.
NO(S) (Any ;-'
- numbers that may be
alined this report) d.
-
AVAILABILITY/LIMITATION NOTICES.
10.
taw vubuI' muhi
Distrlbutioi of this document is unlimited.
SUPPLEMENTARY
II.
12.
NOTES
SPONSORING MILITARY
.
ACTIVITY
Advanced Research Projects Agency 3D-200 Pentagon Washington. D.C. 20301
None ABSTRACT
IS.
This thesis describes a methodology for the analysis and synthesis of modern general purpose file systems. The two basic concepts developed are (1) establishment of a uniform representation of a file's structure in the form of virtual memory or segmentation and (2) determination of a hierarchy of logical transformations within a file system. These concepts are used tcgether to form a strictly hierarchical organization (after Dijkstra) such that each transformation can be described as a function of its lower neighboring transformation. In a sense, the complex file system is built up by the composition of simple functional transformations. To illustrate the specifics of the design process, a file system is synthesized for an environment including a multicomputer network, structured file directories, and removable volumes.
C, 14.
KEY WORDSW
File Systems
Operating Systems
Data Management
Programming
D D,, FORM 1473
(M.I.T.)
Modularity
Virtual Memory
UNCLASSIFIED Security Classification
PE5tGN STRATFfl$FS FOR FILE SYSTEMS";
Absrtract
iidescribes a methodoiejgy for the analysts and synLhesls of modern general purpose file systems. The two basic concepts developed are (1) establishment of a uniform representation of a file's structure in the form of virtual nwfiry 01 scgmentatLon and (2) determination of a hierarchy of logical transofrmations withir a file system. These concepts are used together to form a strictly hierarchical organization (after Oijkstra) such that each transformation can be described as a function of its lower neighboring transformation. In a sense, the complex file system Is built up by the compusitlor cf simple functional transformations. To illustrate the .epcifics of the design process, a file system is synthesized for an environment including a multi-computer network, structured file directories, and removable vol ume-,.
I'I :' This report rcprouces a thnsis of the same title submitted to the Alfred P. Sloan School of Management and the Department of Electrical Engineering, Massachusetts lnbtitute of Technology, in partial fulfillment of the requirements for the degree of Master of Science, June 1969.
DESIGN STRATEGIES FOR FILE SYSTEMI tuart E. l$4dnMck October 1970
PROJECT MAC MAS'ACHUSETTS INSTITUTE OF TECHNOLOGY
Cambridge
Massachusetts 02139
AC KNOVLEDGEMENTS
author acknovlediet
-hi
of the basic ideas
which :any
from
Allen Moulton,
Mr.
colleague,
with his
discussicn.s
laug dud ofteL heated
the vany
system design
for this file
were molded.
particular, Ness,
navi
criticism
an4
energy,
Zilles,
due to Prof.
thanks are
and Prof. Fobert M.
Den Ashton,
Hoo-ain
Dnrek Henderson,
AlsoF,
a!;
the
IBM
Canbridge
Michael Mark,
JcseFh
Toong,
and Claude Hans.
with MIT Project MAC
Scientific
was composed
the
the ideas formed in this
and edited,
on-live in
the
with tho aid Cf the
Fzccessing system.
Work reported herein was supported in part by Project 1AAC, an M.I.T. research project sponsorcd by the Advanced Research Projects Agency, Depart-
ment of Defense, under Office of Naval Research Contract Nonr-4102(01).
as well
provided
Center
CP-67./CMS Tive Sharing computer system, SCRIPT fanusctipt
J. Donovan, Prof.
John
Stephen
environment ani influenced many of This report
In
Graham, as well as,
Norm Kohn,
The author's association
zecrt.
this
produce
to help
time,
their
contributed
generously
colleagues
Many
CONT ENTS
to
rNTIRODUCTION
I
EvolUtion of 1ila Sys'tea Scope and Purpose
I 10
It*
MOTIVATION BEHIND FIlE SYSTEM DESIGN Uniform Representation of File Structure Hierarchy of Logical Transformations
15 15 27
III.
FILE SYSTEM DESIGN MODEL Basic Concepts Used in file System Design Overview of File System Design Model Access Methods Logical Pile System Basic Pile System File Organization Strategy Modules Allocation Strategy Modules Device Strategy Modules Input/Output Control System
31 31 39 48 53 57 59 64 66 67
IV.
MOLTI-COMPU'IER NETUORK ENVIRONMENT Background Problems &rising Design of File System Logical Pile System Design
69 69 74 80 80
Rasic File System Design
87
File Organization Strategy Module Design Allocation Strategy Module Design
91 94
Design Strategy Module resign nther Considerations
S7 98
V.
CONCLUDING COMMENTS
REPEE KCS
101
10 3
ILLU STP AT IONS
. Phyaical Computer Configuration File Systems 2, "Farly Access Mthods File Systems 2.1 Un fors File Reprosentation 2.4
2.
Logical computer Configuration
12.6
Logira2 Transformations
3.1 3,,;
Vierarchical Levels "Real" Memccy and "Virtual" File hemory
3.1 3.U
Hierarchical Pile Syste? Parameters and Data Bases Used by file System
in file System
Pappinq Virtual Memory into Physical Records '.C 3.6 Fixed-Length 1Secord Access Methods . 7Variable-Length Record Access Methods 3.8 Hierarchical File Directory Fiample U.1 U.2 4.1 .4 4.'r
Example of Multi-Computer File System Netvork vxample of File Directory Structure (to LIS) Procedure to Perform Logical File System Search Fxample of File Directory Structure (to arS) Example of File Organization Strategy
17 19 21 23
26 28 33 36
41 41 45 5 51 55 71 81 86 89 92
EMOL!)TIO,
OF FILE SYT!15
Introduction
The evolution of general purpose file systems Tazallela very closely the evolution of operating systems. This is nct surprising since the conceFt of file systems grew cut cf the embryonic
input-output
control (IOC)
and now
operatiny systems
functions
represents the
of
early
most significant
component of most modern operatinq systems. There has been very to the specific 1967,
Saul
be
a
problem of analyzing operating
Posen collected
"Programming
selection of
unpublisbed
repcrts
programming
languages operating
systems. fcr a
together material
Systems and Languages",
the
and
discussing
system
concepts.
which was to
most many He
of was
the forced
"The paper on Operating Systems was prepared for at the University of Michigar presentation
Conference, June
and
isjcrtant
conclude:
Enqineering Summer
bcck,
published
previously
In
18-29, 1962.
it has had fairly wide rirculation as Rand Qepcrt vital P-2eS4. The material covered has been of
scst to
[I F .
r. Z T $€icD OC?1ot
the "clamsc&l" th: develcpent of importance in o~erating system, it is difficult find and an ar-q uA t~o treatmentyet outside of very to long u ually dry Systoi matials. Ueorge mealy was one of the few working experts in the field Who took the tise to writa down some of the basic principles of operating systoms and also Gf assombly systems."
Mr.
observations
Rosents
attention
has been
expended
in
functions of operating sy.tems.
imply
that
very
little
the attempt to generalize the systems
?ile
bave alsc been
spvirely neglected. In the
early years
ptogatmsers approachinq
slowly a
wovcd
periods of time, in
away
bare machine
pencils, fiqhting with
or
of computing
defeat
and
from
with card
the
Fractice
decks and
of
sharpened
the console for more or less extended
leaving triumphantly with
with
Operating systems
(rough.y 1952-1962),
a
ream
have
of
final results
dumarBosin
machine
evolved, not so such
69>.
as a blessing,
but as a practical necessity. As computers became faster aed sore
cemplex,
it
va! no
longer possible for
an
individual
proqramer to be an expert in every phase of the programming and
machine usage;
and systcm These
he
programmers
now must
to provide
cperating systems
usually specialized around trul)
successful
System) for the
rely on
the necessities
were
of
IBM 709/7090/7094 As a result
life.
often ill-designed
a single goal. One
operating systems
its specialization.
the operating staff
was PHS family.
of
and
the first
(FORTRAN Monitor Its
name
implies
a large number of cFerating
EVOLUTIOI or rILE SYSTEMS
3
systems appeared, each with its ip cialization. These tied
to
a
own operating procedure and
sy tevs were
programming
typically very
language
(e.g.
clccely
FORTR&N,
COBOL,
Systems (1OCS) emerged as
a part
Assembler). Input Output Control of the Operating
System based on the simple observation that
perform some
all programs
amount of
input and/or
Therefore, rather than requiring each new set of input/output routines and
sufficiently
supplied with especially extended
flexible
the Operating
critical to
include
as
output.
programmer to write a
for each program, a ccamon
collection
of
System. This
computer high-speed,
I/O
routines
were
situation became capabilities
buffered,
channels which required couples program logic
were
asynchronous to efficiently
perform input-output. Prom
the
crude
followed a logical,
beginnings
of
file
though often slow, evolution.
physical input/output functions were many generalizations
ICCS,
systems Once all
localized in the IOCS,
became possible. Usually, there
is no
important difference among the many tape drives available at an installation, sc that any arbitrary tape unit say be used
for input or output to a program. Furthermore, later runs at the same
or different installations
need not use
the same
unit as long as unique correspondences can be maintained. At first it was the choice of
considered that the best input-output units by the
practice in handling object program was
I. INTICIDUCTION
to include unit red in
assignments
followed,
ihis
which
practice
was
seldom.
Vith
the
advent
more foolproof
object prograss
IOCS. The
when
well
it
was
of
the
and flexible
establish the correspondences as
manner of operating was to of the
worked
of IOCS, a
near-univer'sal use
parameter cr to
data and initialite the program
unit a-signeents as
appropriately.
part
as an assembly
dealt strictly
in
symbolic unit assignments. Since the object programs no longer interacted 4irectly even aware ot unit assignments,
with tte I/O units nor were degrees
additional
of
freedom
operating system, Froviding a environment. For assignments
system could
and
(unbuffered,
the
to
determine unit based
upon
availability and performance
(e.g.
automatically
dynamically,
The actual technique of
I/O interference, buffering, etc.). I/0
available
more efficient and convenient
example, the
complex criteria such as
became
single-buffered,
etc.)
doutle-buffered,
could be removed from programmer concern. Tbe
of
proliferation
1/O
device
types,
such
as
low-speed, mediut-sreed, high-speed and hyper-tapes, as well as drums and disks of all expansion
of IOCS
to
shapes and sizes,
include
called data management or file notion exploited is
that
capabilities that
are
the now
system facilities. The basic
just as the
programmer had little
concern as to what tapes were to be used, care what device is used nor
resulted in
be really does not
what method of I/O is evylcyed
ZVCLUTICK O
within
IL E SYSES
broad
programmer column
logical
5
wishes logically
cards,
the
For
constraints.
file
to treat
system
example,
his 1/0
could
if
data as
physically
a 80
utilize
unit-record equipment, tapes, disks, drums, data cells, or a host
of
other
devices in
various
manners
logically
to
simulate the effect of input-output using 30 column cards. This
trend
became
systems, since the
multi-tasking operating devices was an
irreversible with
the
advent
availability of
continuously and dynamically changing.
Environment,
imp idsible to
it
becomes
impractical
designate specific
of
I/0 units
In such
and
probably
statically and
arbitrarily in the program. The syst
importance
of
these
data
cannot be overly emnhasized.
that
programs perform
appears that
management
file
Just as the assumsticn
input-output was
the number and
and
a
flexibility of
basic fact,
it
I/O facilities
demanded by programs are continuously increasing. A major factor the
in the rapid growth of
introduction of
low cost,
direct access devices such as A description of
one as
high capacity,
with tape-like devices.
high-speed,
disks, drums, and data cells.
direct access devices would
fact tMt they have tvo degrees
file systems is
emphasize the
of freedom rather than cnly Since these devices
can be
used for both sequential and direct access applications, the total
amount
of
usage increases.
degrees of freedom necessitate
Of
course,
the
extra
more complex I/O routines and
reliance on
tighten the
furtber
INTIOrUCTIOf
~I,
6
file
Ferfors
systems tc
these functioni . Direct access
devices are
usually as
flexible as
or
more flexible than tape devices, Card-image or printer-iage flied
data
record
types
or structured
variable-length
capabilities could be
vast
second
was
have been
subsumed
as
functions system. factor
time it
was due
data rose nor
convenient
and sophistication
It
correspondingly.
usually
"infcruation
to the
information in the fcras of
of use increased, the amount of programs and
rising
operating systems,
uses,
As the number of users,
explosion".
to the
contributing
systems, as irt early
This
necessity.
these
forms. Altbougb
data
as
the
major
importance of file
VQ11
prcgcau,
by-products of the file The
as
performed by the object
these
majority of
bandled
be
can
physically possible
vas ac to
cnger
haul
the
required boxes of ;roqrams and data to and from the machine. This
information was but
compact
magnetic tape
directly or disk
converted and
maintained form,
machine processible pack. Not
only were
in a such
more as
the individual
programs and data collections large, but the total number of distinct
and
unique
files
(i.e.
programs
and
data
collections) was very large. It is not unccmmon for a single proqrammer to have
and a
to use from 10 to
roughly equivalent number
100 separate Frcgrans
of data
situation became especially acute with
collections.
Ihis
the increased use of
EZCLUTIOP 0? FILE SYSTEMS
online systems. not be
A
7
user at a remote
expected to re-type and
data from the terminal. They and
stored
at
acceessable
the
Quite obviously, store each Robert
enter all his
it
central
computer
would be
unique file
facility,
althcugh
terminal
control.
remot,
uneconomic and unmanageable
on a
Rosin highlights
programs and
must te permanently maintained
alterable under
and
teletype terminal could
separate tape
or disk
these developments
to
sack.
in his
recent
survey of supervisor and monitor systems: "A file system is especially necessary in any system which purports to provide realistic time-sharing. However, the advantages of this facility cannot be overlooked in a cice conventional envirouent". Thus, I/0
people were
devices
addition
to
to
"scratch"
faced with the problem
store
the
thousandc;
of
traditional use
storage.
Direct
capability of storing hundreds or and accessing then
in
permanent
for
access
of using the
input,
devices
files
in
output
and
provide
the
thousands of unique files
any order conveniently.
This type cf
direct access device usage results in many side effects. The first
problem,
organization and
a
involves
a
facility to locate "emEty"
files. such
mechanism
Many
other
as a security
access to restricted hardware
course,
directory-like
individual required,
of
files,
to
complex
stcrage
space on the device keep
track
facilities
are
of
the
usually
system to prevent unauthorized and procedures to
or software failures.
Of course,
each
reccver from installation
1, INTIOCUC'TION
or
group
develops
capakilities
to meet
additional special
elegant
file
system
requirements
or to
provide
extensive flexibility* For the same reasons that
programmers utilize and rely
on the file system, the operating system uses the facilities of the f ilp
system. For example, user
passwords,
account numbers,
etc.),
identification
accounting and
(e.g.
charge
information as well as system
self-measurement data must be
maintained
the
system. space
dynamically using
The on
previously
direct
facilities
menticned
access
of the
directories
devices
and
the
of
file
"empty"
symbclic
file
directory and access control information are usually handled as system files.
The operating system uses
capabilities to store the
FORTRAN,
COBOL,
the file system
various processing programs
Assemblers,
etc.)
as
well
as
(e.g.
many
infrequently used supervisor routines. Furthermore, advanced operating systems perform paging in conjunction to
realize
that
the
important component of manpower required to
"spooling", roll-in/roll-out, and
with the file system. It file
system
is
is not hard
usually
an operating system in
the
most
terms of the
develop and implement, and
the ascunt
of instructions and space used by the file system. Wbereas the
early operating
systems along
with their
rudimentary file systems revolved around the need tc suFIort
MiscEllaneous
1/0
functicns
modern file systems are at the
for
programming
languages,
very center of the operating
I
!YEOLUTION f 3FTLF
STM5
fsys!tem. The
.3-
47
suzpervisor,
programming
systems,
and
programs are totally 4eoendent on the file system.
9
cbject
1P
ban Suffered fro$
Thp development of file systems of
tbe same
prcbless
that of
iss
Frograssieg
Protably the single scat isportant protles
in
most
factors, such receiving
be put into
attention
"best"
the
function
of
this
programming language
depended therefore,
was
up
paper to
of on
FoRIBAN the
had not
utmost
deeply
to
its
the
acceptance
efficiency
language in
hardware capabilities of a specific
in
if the original
that
attention
involved
to illustrate
example,
felt
defined the
macre
times
programmer. It is not
get
For
had not
of
from recent
15
to
controversies, but
trends and changing attitudes. designers
The question
where it has been fcund
than the least proficient
"efficient" the
Frogrameer
other
are finally
proper perspective
studies of real programming groups, that
important,
situations
au productivity and flexibility,
their lonq-deserve4
efficiency can
Vas the ezcesuive
programming
current-day
lany
languages.
course efficiency is
concern with efficiency. Of but
KI3 ODVVI110N
X,4
machine,
terms cf IBM
and, the
C4, it is
possiblf that the evolution of languages such as ?OBIB&N-IV, COBOL, ALGOL, miqht
and PL/I and generalized
have proceeded
in
a
compiler techniques
more organi2ed fashion. The entire
field of generalized approaches tc programming languages and compiler techniques fActor in
has only
recently emerged
the computing profession.
as a
major
SCCPI AIO PUIJCS!
11
?ile systems the
have followed
a similar
development, In
same of "efficiency", each new file system was specially
tailored
to
intended
use
4%perience
the and
or
demands on a facilities
original seeds very
techniqaes given
file
-ould
benefit
of preceeding
of from
the
As
the
systems.
system increased,
its
new featutes and Each of theme
systems drove us further and further flce an
piece-seal file
generalized file
Best literature in forms.
environment
often with a "crovbar".
were added,
organized.
seldom
and
The typical
techniques used
system structure.
this area
manuals
system
to implement
current systems
or in
characteristics for future file user facilities,
but
file
of new file
one of two "clevern
system,
for comparisons
the design
othex type of reference deals
describe the
a specific
litt]- assistance
provide very
has appeared in
but
vith cthez systems.
The
with discussions of desirable systems,
usually emphasizing
adds little insight into
the problems
of designing and implementing such a system. To a certain extent, to
evolve in
systems
"time-sharing" systems.
will
since time is it
is
the
systems
be
called
approaches have begun in
conversational
this paper
such
resource-sharing,
only one of many resources that are sbared and
cnversational that
batcb-oriented These
generalized
is
or
most
interactive easily
nature cf
these
distinguished
from
operating systems.
generalized
file
systems
for
conversational
FI 2
1. ZIM lCCT I~ONJ ol
rvsource-alring crerating systems developed an
,ocvtiity. In order to provide
b; y user prooraex
and the
onsertial.
Purthersore,
4-9~viromonk
and
be impossible
all the fedtures required
supervisor, a
flesible design
owing
complexity
to the
its dynatical1y changing aspects#
to devise an "optimally
The implemnter% were thus forced makc tbe
both by design
of it
sa
tho would
efficient" strategy*
to abandon any atteaFt to
syntol ecre efficic n. and
were free to
develcr a
fles iblt system with a clear conscience, The qoals
of flexibility
contradictory. In most
and efficiency
any multi-tasking system,
Podern, non-conversational,
systems as
well as
processor
time can
vhilf I/C
is in Frogress,
efficiency ceases individual
user
result
in
other
tasks,
be utilized
by channels, bud the central ky
executing other
to be of paramount to
overloading
as
optimlze
excessive
the channels.
tasks
file system
concern. ?urtherscre, performance
unnecessary inefficiencies due to such
operating
I/0 O eLdtiCnfl
In this environment
attempts
be
which includes
batch-oriented
conversational systems,
can be performed asynchronously
need not
I/0
The file
could
conflicts vith
interference
system,
aware of
from the
total requirements, could provide a strategy that results in .k
tir
more harmonious arrangement, noro than individual FvPn in
*'y:;ttM;,
user optimization
single-task cr
there
is
increasing systom thicogh~ut could.
applicatlon-criented ojerating
definite
value
to
art
organized,
sI QPt AND PIPOS .
13
9enor1I' liv4d
file
programs as
well as compilers
system.
For
large,
complex
and asseablers,
user
the Frogram
system requir mects carnct be
action including precise file
is
deteralfed since it
ttiLlcaly
most
input data supplied. Therefor%
a dynagic tuncticn cf the a dynamically flexible file
syntem could often outperform a specialized, but inflexible,
file system. Xt is
the purpose of this
flexible but
precise
model
probably need to be modified and specific
implementation.
Robert Rappaport
a general
extremely important to start with
file system design. It is a
paper to present
This
in his thesis
Primitives in a MultiFlexed
although this
design
will
made tore detailed fcr any issue
was
"eplementing
hbiqlightcd
by
tMulti-Piccess
Computer System" which
describes the development of the
Traffic Controller for the
MrIT Project NAC Multics System: "After having found acceptable solutions for the problems at hand, one asks oneself why it tcc% so long to arrive at these solutions and was there any way to have done it sore quickly? One sight further ask if the arriied-at solutions are in any sense optimum? After being involved in designing a large system involvinq the work of many people, one gets the feeling that such Frobleas as were encountered herA are bound to crop up. The develorumet cf any large system can only retain manageable if distinct parts of the system remain modular and independent. Vithout a theory of computing systems tc fall back on, designing such complex sy-items becomes an art, rather than a science, in which it is impossible tc prcve the degree to which working solutions to problems are in any sense oFtixuv solutions. In much the same way as authors write
____
I
books, large costuter eytems go through several dratts before they begin to take shape. In the abseuce of s theory one con only cope with the complexity of the situation by proceeding in an orderly fasbion to first pto4uce an initial vorkinq model of tho dosired system. This part of the york represents the maor effort of the design and implementation ptoject. Once having arrived at this beacbmark, Uauy of the prib14a5 sy thon be seen in a clearer light ad revisions to the working model are implemented much more quickly than were the original modules. is to the development of a theory, one gets the impression that it will be a long time in coming." Thereforo. computer science,
while
we
atoait
the file system
paper will hopefully serve the
THU
general
theory
model presented
of
in this
need for an "initial working
aodol" from which Wproblems say be seen in a cleaet light".
t'N??CR P 3!P!?I'I?!T|TOW OP PILl SIPUCTUR!t
CHAPIE11 TWO
MotiVation Behind Pile System Design
Tber
are two
systos design.
It
is necessary to
representation of a hierarchy of system.
(1)
establish a uniform
file's structure and
(2)
logical transformations that
V. H.
by the file
basic goals to be satisfied
Henry's recent
management systaes discusses
data
similar notices
cf
separating logical
and physical
file control,
but differs
significantly from
the approaches presented in
this report
in
many fundamental ways.
a reader interested in
It
should be a useful reference to
other current research
RJiorm Renpre 9n Aj 2I 21 his M A typical computer Such
a configuration
secondary storage.
storage Programs
usually has
and data
in
limitless and very
true
a
upon,
Figure 2.1.
varied asscrtment
addition
must be
order to be executed or operated
this area.
_1..
system is portrayed by
devices
It is generally
in
in
to
primary
the
of
Frisary
stcrage
in
respectively.
that J1 primary
stnrage si2e was
inexpensive, there would be
no need for
16
I.
secondary
storage
MOT!YAION
(possible
DEPIND FILE S
exceptions
requirements and tranafor of data). report,
the file
mechanism that hand1in from
be
backup
In the framework of this
be 4.fined
extends the capacity
as the
of primary
software mtcraqj by
and coordinating the trafafer of informatict tc and
the secondary
somewhat hich
system will
may
IEN VESIGV
storage
devices.
more restrictive than
include
as Fart
of
the
This definitica
other common interpretations file system
the
devices or the programs that operate upon the data, interpretation the
is
file system merel, ztores
physical In this
and transfers
information but does not operate upon it.
i
UNIFORM REPRESENTAUION OF FILE S4 RUCTUBE
17
+--->
-
---
I/o DEVICE 1
-
PRrMARY j CENTRAL I STORAGE I IPROCESSOEI 4,, --
-
1/---0--->I I/O DEVICE I I 2 I
1
I
-
I I
0
1 I
*--->I
I
Figure 2. 1 Physical Computer Configuration
1/0-, I/O
DEVICE
n
I I
I
I
18
It.
Early with
file Systems
specific
potentially
MOTIVATION BEItND FILE SYSTEM DESIGN
were usually Vroqrams.
application
a
very
designed tc
large numbet
Since
of
crerate
there
different
ate
secondary
be used in more than one
storage devices, many of which can
way (e.q. sequential or random access. blocked or unblocked, etc.) each file system limited devices
and organizations
intended appication.
itself specifically to those
that
were
Figure 2.2
appropriate fcr
its
depicts the relaticnshits
betwten the applications, the devices, and the file systems. of development produced
This tipe It is without
somewhat analogous any
to assembly
established
standard
language programming calling
impossible,
to use
arbitrary programs
particular,
it was
quite common
payro.l
sequences
or
it difficult, if not
communication conventions, which makes
produced by the
chaotic situations.
as subroutines.
to find
that data
programs, using their
In
files
private file
system, could not be accessed by the file system used by the
personnel progtaes, anO. vice versa. much duplication of effort and and use of these early file
As a result, there was
confusion in the development
systems.
0V4?cRp
PEMSNTION
OF
PYLI S2HUCTUDE
APPLICATIONS
LOIAI- PHYSICALI
DEVICES AND PHYSICAL DATA ORGANIZATION
Figure 2.2 Early File Systems (Analogous to Assembly Language Programming)
19
20
It.
MOTIVATION BEHIND
More recently, the computer system
designers
Amall set
realized
of coamon logical
methoade
that can
programs.
satisfy
Purthermore,
designed in different
a
flexible
devices
the user
vith
the
system.
file
a
the
access
on a
this
Languages,
with
such as
file
systems
device
independent,
to resort
bypass
a
restriction,
inter-mix
access
to assembly and
methods
be
variety
of
This
;rcvided
interface
with
the emergence
for
cf
business
applications.
suffered the (1)
could
structure.
COBOL
FORTRAt for scientific
a
application
methods
to operate
compared
select
(or access
of most
needs
2.3 illustrates
can be
necessary
?ORTRA4
possible to
device-independent
as the programming languages: really
is
tE3IU0
and operating
file orgaoizations
manner
Pigure
applications and
not
it
these
logically
Oriented
access methods
manufacturers
and device organizations.
This apprcach
Problem
that
It? SYST?!
The
sane shortcomings
despite claims, they vere (2)
occasionally
language
(3) it
vas
(analogy would
it
to overcove not be
possible to
was cr to
intermix
and COBOL subrcutines).
| Iii | it
UNIFCPP rEPRESINTATION 0
P1L!
STRUCTURE
APPLICAT IONS
III
LOGICALI
A PHYSICAL
DEVICES AND PHYSICAL DAT!A OBGANIZATION
Figure 2.3 Access Methods File Systes
(Analogous to Early Prograsming Languages, Sueh as POPTRAN and COBOL)
21
22
It.
NOT!I ATOt BUIiD FILI 5IST2I
CISIGI
In order to overcome the weakness in the access methods approach, it
reprtsentation that and
(2) be
design a gj11
is mecessary to can ()
deviece
be
used for
inudao-4entv
analoqous
to the
language,
for which
for
This
search
PL/I is
unifcre file
every application idealistic goa1
*?he" universal probably
is
programming
the most
ambitious
attempt to date. It
is
reasonable
representation vill be it
will
be desirable tc
access Since
methods fcr the
access
representation,
to
expect
that
such
so atomic or primitive
a
unifccm
in fcce
that
construct mote powerful specialized
the convenience methods
are
of
built
the typical1 apon
the
user. uniform
it is much easier to modify or implement new
access methods or, if necessary, operate at the atomic level to
bypass the
approach
pushes
restrictions
of
the access
the logical/physical
methods.
sejaraticn
of
This file
system structure much further as indicated in Figure 2,4.
23
uyrropn REPRESENTATION 0?' FILE STRUCT1Ufh
APP LIC A
0TONS
LO GICA L
ACCESS HETkIODS
EETTO UUIOMR PHYSICAL V DEVICES AND PHYSICAL DATA CIGANIZATION
Uniform
Figure 2.4 File Representation
(Analogous to Universal Programhing Language, P1./I ?)
XI. NOTIV&TIOI
24
?ho
rationale behind
uniform represontation
the
is not
BEHIND TILE ST
melection
tN D3101
of a
pazticular
exav~le,
trivial. For
the e
are three broai classes of c.spon uoniferm representaticni: 1.
Stream -
every file
sequential stream
is treated
as a
continuous
of information. It
to access only the current
is possible
position in the stream
or reposition to the beginning of the stream. this representation can be
imFlemented conveniently on
almost all secondary storage
2.
devices, although it
does not
provide the user
with very
efficient
features for many
applications.
-
Direct-Access
every
file
is
roweful o
treated
as
an
ordered collection of items. Each item is directly accessable
by
means
corresponding
to its
of
organization,
Stream,
but is
intrinsically
very serial
oaique
position
This representation, which storage
a
identifier
in the
ordering.
corresponds to primary
is
more
powerful
difficult
than
to isplement as
devices, such
on
magnetic
tape. 3.
Associative unordered
-
every
collection
file of
is
tre&ted
items,
each
as
an
item
is
directly accessible by means of an identifier that has been very except
WassociatedO
flexible for
a
with the
item.
representation. small
class
This
is a
Unfortunately, of
sophisticated
UNIFORN REPHIESENTA1IO
secondary
0f FILZ S RUCTU3R
storage devices
25
the Ilplesentaticn
in
very complex and inefficient. Irrega r4Iess
chosmn, vieve8 as
of
the spocific
the important
concept
being identical in
particular physical
is that
This generalization is depicted in
t.I
all
representatic:
files can
structure independent
device on which
be compared with figure 2.1.
uniform
the file
be
of the
is recorded.
Figure 2.5, which should
IL.
BEIND FILE SYSTER
#OITY&ION
*---)1
t, PCIH
i.I
ESSZG
I
IFILE -- 4~e
I ARAY I 5qTCPAGE
4
-4
i< -%:>1 I
CENTRlAL
i1 tNIFOSH I I FILE 2
1PPOCESSORI 4
-,
I
4
-.
I 1
4
+--->I UNIFORM I I F aLE 1 5
Figure 2.5 Lcgical Computez Coufiguration
27
1411RARCHT OF LOGZC&L TIUSPSFIafTAIz0S
JJ
llietrbZ 2f12 Although a
ranmf rmationm
later sections,
presented until
not te
of
characterI at 11,
'geiteraI
Rost
user specifieu his
particular, a
system will
a file
precise description~ of
several
there are
In
ayateac.
file
read or
request, such as
write, by designating a file and an element within the file. Rost advanced tile systems allow considerable flexibility in the
is
tile, it
a
specify
used to
mechanism
typically
Furthermore,
described by means of a symbolic file name.
the
element within the file is specified in terms of the logical of elements
representation say
which
specification cf hcv
to a
correspond
not
or may
systfy
particular file
in the
physicil
precise
and where the element
is stored.
PCL
example, a typical request eight he of the form: "Read item 23 from file AtPHA into location 1564.N Realizing that devices
in
sequence of
somewhat
information must obscure
final form
secondary storage device. as
a
single
store'i nP be
must
thE
*
U-'L'~S
that physically operates cr Quite often the transfor-iatirt i:, step
oversimplification that hides the use*
there
transformations required to ccavert
request into its
viewed
ways,
usually be
but
that
fundamental
In Figure 2.6 the conversion process is
is
d
sechartZ.-- irix illu~ft.r.~o!-i n
terms of a discrete sequence of logical transforsuation-> Since the specifics of these *ransformat ionE a~,
rt,:c '-
2P
Ir. MOTIVAION BEHIND FILE SYSTYB UESIGN
?FAr ITEM 23 PROM
11
SEND LETTER TO
FILE
If
JOHN DOE'S
It
HOME ADDRESS.
ALPHA INTO
LOCATION 1564*. SY4BOLI(
ItI
?ILE N tE
JOHN DOE
II
I
II
tFS V NUMERIC
I V 030-34-1234
F1 L*: f ' FIEI BF5
I
if
V
IV
FIIE DESCBIPTOR
FOSM
I
I
Birth date Office Address Home address
I
II
etc.
I
II
I
I V
LOGICAL I/C COMMANDS
nsP
I
i I
I it 11
I
V
EXTRACT HOME ADDRESS
I
IIIj
V PHYSICAL
it
V SEND TO
I/0 COMMANDS
I
POST OFFICE
iI 10 DEVICE
I 11
POSTMAN
Figure 2.6 Logical TransformatioDs in
DELIVERY
File System
I
HIERARCHY OF LOGICAL
obvious until analogy is the
TRANSFORMATIONS
the note
detailed sections
presented in Figure
file
system
29
2.6 that
ttansformations.
intended to provide
later, a
sitFle
loosely parallels
The
analocay
some insight into the
is
only
rationale behind
each stage of the transformation. The process item 23 from step is
starts from
the user*s
request to
the file ALPHA into location
to convert
the symbolic
"read
1564". The first
file name
into a
unique
numeric file identifier. In the analogy, this ccrresonds to looking up John Dce's identifier
which is a social security
in this illustration. The purpose for using an identifier is basically
the
same
in
both cases.
convenient to store information, by means
of a unique numeric
name which
may, under
unique (i.e.
is
usually
more
manually or automatically,
"key"
rather than
certain circumstances,
there may be more
case other factors
It
a symbolic not even
than one John Doe
oust be considered in
be
in which
order to uniquely
identify the person under consideration). The file access
all
identifier can then the
informaticn
known
information collectively is known In
the analogy,
this would
be used
to ccnveniently
about
a
file,
this
as the file's descriptor.
correspond
to requesting
all
information in the social security records of 030-34-1234.
Now nPcernary
that everythi.g to
consider
is known the
about the
specific
performed. Using the file descriptor,
file, it
operaticn
tc
is be
a sequence ci lcgical
30
11. INOTIVITION BEIIND FILE SYSTE
I/0 commands can
commands
because
be produced. These are
they
do
not
called logical I/0
consider
characteristics of the secondary storage
DESIGN
the
physical
device to be used,
This is analogous to putting an address on a letter which is usually done
without considering
the physical
destination
the transformation,
the logical
nor the route to be taken. In order
to complete
1/0 commands must be converted into the appropriate sequence of physical I/O commands. This comple% depending upon I/O interfaces to
conversion may be trivial or
the peculiarities of the
the devices. In the
device and
analogy this process
is performed at the post office where the address is used to determine the physical
routing needed
to get
the letter to
its destination. The final step in the of
information.
This
process is the physical transfer
is usually
softvare/hardware interactions
performed
to activate
by
means
of
the aFrcpriate
device and confirm the successful completion of the request. Of course, in
the analogy this transfer
the postman ("neither
rain nor snow nor
is acccmjlisbed by
dark cf night...")
assisted by trucks, planes, trains and other automaticn.
31
BASIC 1COWCEPTS
CHAPTER THREE
Pile System Design Model
2111
9211M.U 212-d 12. 11-2 System .A912 Two concepts are basic to the general file system model
to be introduced. These concepts
terms "hierarchical
have been described by the
memory".
modularity" and %virtual
They
will be discussed briefly below.
Vierarchical Modularity The term
"modularity" means
many different
different people.
In the context of
concerned
organization similar to that
with an
Dijkstra
and
things to
this paper we
will be
prc;csed by
Randell.
The
important aspect of this organization is that all activities are
divided
into
structure of these or ring
sequential
processes.
A
hierarchical
sequential prccessEs results in a level
organization wherein
each level
only communicates
with its immediately superior and inferior levels.
The notions of "levels of abstraction" or "hierarchical modularity" can Consider an
best be
presented briefly
aeronautical engineer using a
by an
example.
matrix inversion
12
I1.
package
to solve
abstraction, the
space flight computer is
FILl SVST28 DESIGN MCDEL
problems. At vieved as
his level
a matrix
cf
inverter
that accepts the matrix and control information as input, and provties
the
programmer who wrote have
had any
levels
matrix
inTerted
as
output. The
application
the matrix inversion package
knowledge
of
of abstraction).
its intended
fe tight
usage
view the
need not (superior
computer
as
a
"PORTRAN machine", for example, at his level of abstraction He
need
not
operation
have any
of
abstraction), vith it. at
a
the
specific knowledge
FORTRAN
but cnly
Finally,
different
the
of
system
static
internal level
cf
he can interact
FORTRAN compiler implementer operates
(lover) level
since
(inferior
the way in which
of abstraction. In
example the interaction between is
of the
after
the
completed, the engineer need
the
matrix not
the above
3 levels of abstraction inversion
program
interact, even
is
indirectly,
with the applications programmer or compiler implementer. In the form
of hierarchical modularity
design mcdel,
used in the file system
the multi-level interaction is
ccntinual and
basic to the file system operation. There
are
orjani2ation.
completeness designers and
several
advantages
Possibly the
of each
cften a
such
most important
level. It
is easier
implementers to understand the
interactions of each level and is
to
very
an
modular
is the
logical
for the
system
functions and
thus the entire system. This
difficult Froblem
in
very coallet
file
3
BASIC CONCEPTS
V -
--
--
----
ILEVEL K+1
-------
V
Figure 3.1 Hierarchical Levels
I
la
TIT. FILE STST
systems with tons ani
or hundreds of thousands
DESIGN iNODIL
of instructions
hundreds of inter-dependent routines. Another
by-product of
this
structure is
"debugging"
assistance. For *xasple,
when an error occurs it eta
be localized at a level
and
verification usually
freliability checkout)
an impossible
using all possible in each finite thp
identified easily. Tbh
task since
it
of relevant
internal
Therefore,
an
tests,
it
structure
of the
important
goal
a
file system
small
that
construct a
is necessary
to ccnsider
mechanism is
they can
each
level
and verified,
being
abstractions
more
introduced,
to
to design
all
remains within
then
be
be the
tested. internal
test cases tried
level
1, level
but
because
powerful, the
tests
In order to
number
of
is
without
overlookinq an iurcrtant situation. In theory, level te checkel-out
is
requests occuring
structure so that at each level, the number of sufficiently
completo
would require
system
data inFut and
potential "system state". set
of
usually
0 would 2, utc., of
"special
the
cases"
bounds.
Virtual Memory There are objectives: of
four very important and difficult file system
(1) a flexible and versatile format,
the mechanism
o"irep of
machine
,aril automatic
as
possible
should
te
invisible,
and device independence, and
allocation of
(2) as much
secondary storage.
(3)
a
(4) dynamic There have
35
BASIC CCNC9FTS
been
several
file
generalized
"segmentation" II
I ...
! I
I
III
I
I
i
.I A Ile
I I
til
I-< I iead from Pile
DESIGN NlODIL
I I-....
LI_
I
I
I
Main Storaqe
File 8
tigure 3.2 "Real"
Memory and "Virtual"
File Memory
IASIC CONCIPTS
37
60 chdarcters at a time
routine to access locations
80, i60,
0,
...
starting at byte othev
Almost all
.
possible
formats can be realized by similar procedures.
1Ecept
system
for the
mechanism,
modules,
formatting including
file
the entire
allocations,
bufferijg.
4ad
physical location, is completely hidden and invisible tc the user.
This
relates
indeFendence. In which device
closely to
many file
the
objective
systems the
should be used,
its record
of
device
user must size (if
specify it
is
a
hardware formatable device) , blocking and buffering factors, and
sometimes even
parameters and
the
algorithms chosen might,
optimal, many changes required
to
addresses,
physical
run
environment. This
in some
might be necessary if with
a
different
strategy does not
Although sense,
the be
the program is
configuration
prevent the
CT
user frem
providing additional information, such as how often the file will be
that this information is is
manner,
used and in what
Dot
The important
factor is
necessary and its aignificance
determined by the file system rather than the user. There are
by this
very serious questions of
file system
strategy. Most of
efficiency raised these fears
eased by the
following considerations. First, if
to
very
be
used
seldom
I.s
efficiency is not of paramount hand,
in
a file is
development),
importance; if, on the cther
it is for leug-ters use (as in
program),
program
can be
a commercial prcduction
the device-independence and flexibility for change
111. FILE SYSTER
38
and
upkeep will be very
proqaem*r
of the coplexities of
is
he
allocations,
constructively and rslstiq than
to
of current
coordinate the specify critical
files
utilize
hiu
overcome
to
the
than
Farameterm.
the average
te.
shortcomings
Third, in
system will
energy
problem rather
direct-access devices,
file
devices, ea4
clever algoritbs
of his
trvcturinq
file system.
of the
that. the
to
by relieving the
the formats,
creatively to develop
*tricksN
rculiarities
possible
able
the IcqiAl
clever
complexity
important. Second,
DISIGM AODIL
Le
view of it
better
is
or the
quite able
user attempting
to to
OVEIRVIW OP PILI STSTBM DESIGN MODEL
pap r can specitic
be viewed as
a hierarchy
implementation
sub-divided
or combined
several modern
separate
certain as
file systems,
report, attempts
in this
to be presented
system design modpl
?be file
39
of seven levels.
levels
be
A recent
reuired. vhich
way
In a
further study
will be published
to analyze
the
systems
of in 4
in
the
framework of this basic model. In general all of the systems stuied fit into
model
are
the model, although certain
occasicnally
reduced to
trivial
levels in the
form
or
are
incorpormted into other parts of the operating system. The seven hierarchical levels are: 1. Input/Output Control System 41OCS) 2. Device Strategy Modules (DSM) 3. allocation Strategy Modules
JASM)
4. Pile Organization Strategy Modules (POSM) 5. 2asic file System (BPS) 6. Logical File Systes (LFS) 7. Access Methods and User Interface The hierarchical organization can be described from the "top" down
or from the "bottom'
ordtinarily be implemented
up. The file
by starting at the
the Input/Output Control System, and more
mian nqf ul,
however,
to
system wculd lowest level,
working up. It arears
present
the
file
system
organization starting at the most akstract level, the access
40
1I1.
routines, ani
removinq
FZL.' '-.STU DZIGS AODZL
the abstractions
as the
levels are
Op"-leiwdy", In the
Eolloving presentction
the terms
"file
vame ,
"file identifier", and "file descriptor" viii be introduced. Detailt
e
lan.tions
sections,
the following analogy py be used
cannee
assistance. A perscn's haphazard process or manageable
provi4 ed
(file name),
of assignment,
is not
Social Security
Access
unique identifier
identifier can
consists
a format
be routines
record A'iles,
of
on the to
the
file. In
ffile descriptor)
of
access methods or
qcneral there
fornz-
ubviously,
a
will
files, and
axample. Many
routines, also called
, n be user may
access methods to augment this level.
that
fixed-length
record
record files, fo:
data manayesent,
file system.
,:,utines
simulate scquentia3
more elaborate and specialized
the
set
sequential variable-length
direct-access fixed-length
of
then be
(AM)
level
superimpose probably
person, such
that person.
Methods
This
to the somewhat
usually assigned to each number. This
later
necessarily unique
used to locate efficiently the information known atout
until
fot the reader's
due
for computer processing. A
(file identifier) is as a
name
be
supplied as part write his
own
OVuiEIuw
O
41
PILE SISTER DESIGN MODEL
User Proq:La
I Level 7: Access Methods (AN) User Interfaces
I
! I I An I LL
Level 15:I Basic File System
In t LJ2
I BFSI
1
I
IPOSFI
._LAI.
V I l IFOSHI L.._IaL
I I I __j I I I ASMI I L_LIL
Level 3.: : Allocation Strategy Modules (ASK)
I
V 1i IFOSII ._J.LI
I
I
3 V 1 1 1 I I ASKI I 11211
I V
I 1 I I I V
1 I DSMI i_11_1I
Level 2: Device Strategy Modules (DSH)
I
I
V Level : File Organi2ation Strategy Modules (FOSM)
I I I AM I LJi. V I I LFSII
Level 6: Logical 'Pile System
'
I
I DSHI I_12ii
SI V I I IIOCSI
revel 1" input/Output Control System (IOCS) I
I
I
I
LDe vices Figure 3.3 Hierarchical File System
I
Z52
Ills
FILE SYSTEM OESIGN
9I3 USER ON 3 I Accss I I METHODS I VILE ADDRESS I COPE ADDPESS I LFNGTH I._tF1CTTON
I I I
ATA §
I
3 I V
-
I L F S *
----
I ?TIP IDENTIFIER
I
I
I fTLF ADDRESS
I
I
I CaFE ADDPESS
1
1
I LENGTH
I
I
_1
V
I_FU KCTON
&CDIL
I
NAME-->IDEWPIFIEU
j
IDENTIFIER--> DESCEIPTOP
--
4.-
I a F S .IFIIF DESCPIPTCP I ILE ADDRrSS l CORE ADTIrESS I LENGTH FIJFCTION
-
1
I
I I
I
I V
I 4
.-
,
I F 0 S H j
ADDRESS-->LOGIC&L
----------
I/o
I DFVICE IDENTIFIER I I LOGICAL I/O LISI COFF hDDFESS LIST I I_PTINCTION _
1 I V 4----------4.
I D S M
j LOGICAL-->PHYSICkL /0 I/O
-----------+
ItFVICE 11_/0
ADDP.SS,
COMMAND
LYT3
I __1
V *------------
I 1 0C S I *----------v 4.----------4
I EIVICE
I
+------------
Figure 3.4
Parameters and
Data Eases Used by File Systes
43
OVERV1!V OF FILE SISTER C!SIGN HOCIL
Logical File System (LYS)
Routines above symbolic
System to
File
a file.
name vith
corresfonding
unique
is the function of the Lcgical
It
the symbolic
use
"file
assoclatv a
of abstraction
this level
name
file
Below
identifier".
to find this
the
level the
symbolic file nave abstraction is eliminated.
Basic
file
System
(BPS)
The Basic File into
a file
descriptor
descriptor.
In
provides all
an
The file
rights interlocks, File
and System
associated based
upon
set
performs
the file
file
is
File
Organization
etc.),
many
of or
descriptor,
the
*closing"
access
check
read/vrite
bases.
The
functions a
the
Basic
ordinarily
file.
finally,
the appropriate FOSH for the
selected.
Strategy
Direct-access virtual
file
physically
to verify
system-vide data
with "opening"
sense, the
needed to
also used
vrite-only, up
identifier
Olength" and "location" of
descriptor is
(read-only,
the file
abstract
information
locate the file, such as the file.
convert
System oust
memory.
physical associated
A
records. with
Modules
devices
physically do
file must Each
it. The
(FOSM)
be split
record
has
not
resemble
into many a
Pile Organization
unique
a
-eparate address
Strategy Module
maps a logical virtual memory address into the ccrrespcnding
44
111. PIL2 STSTE! DISICS UODEL
physical record ad4rcss and offset within the record. To real or for
the POSM
memory area
write a portion of a file,
to translate the logically into the correct
or pcrtion thereof. by tke ASH.
The
contiguous virtual
collection of physical
If necessary,
list of
it is necessary
new
records to
reccrds
records are allocated
prccessed
be physically
is passed on to the appropriate DSM. Althouqh not allocate
"hidden"
reduniant or
the FOSH is
necessary, file
buffers
unnecessary
I/0. If
virtual memory is contained in the
data can
be
transferred
the file may be buffered. buffer areas
data in
write
and
to the
to
minimize portion of reccrd,
designated user conversely
main
output to
a sutficiently large number of file, it is
requests can
out of
"closed", the buffers
order
a currently buffered
are allocated to a
read and
moving
If
designed to
the requested
without intervening I/0.
storage area
all
in
often
be
possible that
performed by
When
the buffers.
are emptied by updating
merely
a file
is
the thysical
records on the secondary storage device and released fcr use by other files. actively
in use
Buffers are only allocated to files that are (i.e. "open").
Allocation Strategy Modules The available allocating
Allocation
(ASM)
Strategy
records on records for
a device. a
Modules keep They
file that
track
cf
the
are responsible
for
is
being created
or
I
OVERVIEw OF FILE STS?!M DESIGN MOD2L
I
I
I
I
ILlfll,).IId2.1_f1/I) I--->tLII&I/I /III/II/I I .-.-4--------l I I////////I II I I I 4---) FILE7 4--------------' I ----------------I I I-I 1. I IIII .------------I
4--------IL...LJS_./I
III//III 4----------
I
*----*
1
'-> It12.L .
..I>>>>>> VOL2(l) I ---------m ------VOL 2() I >>>>>>>.+
6
4-----------
> I //
/../
-4-
-------
2I/ , z3.nL +->I I -------------D) I.VCL3.(2) /_33 I VCL3(2) I I I I ID LES3(D 4----. I I -----------I VCL2(2) I FIE5 ----- ----
-
4
---------------
+->ILOL24i./I
--------VPDD for "VO1.2"
II--I/I---1
*----------------
-------------------
I
-----------
-
I +-->I/VqL"I_.U/I +----------- " I I VOL3(1)1 >>>>>>> 1 -+ 1
---
vL2/ >-------------------1 DR41(D) I VOL2(3)
I..
I DI83(D)
I>ZgL3.j/-LI I----->>>> --------*------I//I// ---------VOL'(U) I >>>>>>> I ------1-- + -i -3(>>>>>>> VO I ---------I+• ---------foar"VO;.3" 1-vp'DD "YOU" fo VOL35
I FILE8
VOL3(3)I
I
I
->I/VL../I I--iiii---I +---------
Figure 4
Example of File Directory
01L3(5)
I
I
-------------------
+------4
.-----------
I
------------------------
I
------------
>>>>>>
i
I VCLl(2) !
IILE1 .
*---------> IL------/I I I//////// 4 ---------*
0
I
L
I
VFDD for "VOLI"
VOL3(2)I
--
. .. . ]--i1-2 . . I .VOLI(Q) . -------
..... VOL 1(( ) >>>>>>> I VOL (5)I)>>>>>>> +---------VOLt(6) I >>>>>> j--4 I +---w -------
I
4
--
.
..-
....
. -.. - .-,I, . I. -VCLI(3) 34 - >>>>>>>
VOL24)1 >>>>>>>
I V-C-tL ./I -------------I VOL3(4) I FILZ3
1 DIR2(D) I VOL2(3) ---
.
4. - - - - - -
. . .
1 J.
.
-- * I VOL.1 (1)' >>>>w>> #- - ..... -I..-
I
*-)
4
---
*-> I FILE6 +
I
I VOL3(4)
Il
I 4
---------4---
(
--------------------
I VOI1(5) FILE6 --------.---------.
I
I VOL3(3)
I
I VOL3(4) I PILEI ----------------------------------
I
I FILE2
Structure (to BPS)
q 'i
IV.
File
The
MULTI-CCPUTIR NT2TMOl
Strategy
Organization
requests that specify
A sample call
or file identifier. to
processes
a file
descriptor save
rather than a file
from the Basic File System
orga4niation Strategy
File
the
bod ule
a file in terms of
(the entry extracted from the VFDD)
E*TIROXMUI
Nodule,
PL/1-like
in
notation. is: OSM_
CAll where
;
EAD(CESCBIPTOP,COREAriBFILEADDR,COUN) FILE ADDR,
COREADDR,
and
have
COUNT
same
the
interpretation as discussed above. The
of the Basic File
primary function
System
reduces
single request:
to the
CALL POSM_READ(VFDDDESCIPICR,DESCBIPTOR,* is
VFDDDESCPIFTOR
where
descriptcr
the
(INCEX-1),M); the
of
VIDD
associated
with the volume name supplied by the Logical File
System as
part cf
is the
Basic File
System performs
as
protection
validation
core-resident
File
Active
files that are in
simple
the
System,
sufficiently
length
small level in
use
domain
of
and narrow
the hierarchy.
maiatenance
that
cf a
the 'hat
of
the
descriptor for
out, as in the Logical Basic
it
tasks,
enables efficient
identifier and "open").
(i.e.
several other
and
Directory
a file's
association between
File
standard
the
and DESCRIPTOR is the desired file descriptor.
The such
ftcu
INDEI is
identifier,
identifier, I
specified file VPDD entry,
the file
File
System
is
remains a ccnceftually
PIL2 OSGANIZATIOw STIATEGY NODULE DESIGN
91
The Logical Pile System and Basic Pile System &te, great extent,
application and device independent.
Organization Strategy Modules are
to a
The Pile
usually the most critical
area of the file system in terms of overall perfcrmance, for this reason it
is
be
large
used
in a
discussed papers
expected
in this
listed
that vore than
system.
section,
in
the
Only
one strategy may
one strategy
the reader
say
eeferences for other possible alternatives. The physical
POSH must
map
the logical
record address
cr hidden
file
address cntc
buffer
supplied file descriptor information.
based upcn
a the
In the simplest case,
the sapping could be performed by including a tvo-part table in the tile
descriptor. The first part of
indicate a contiguous
range of virtual file
second part of each entry physical record address. all
file
is a
potentially quite include
the
It has
descriptor.
One
a specific
of
mapping the most file
length, whereas
the file's
large. Therefore, it
strategies utilizes an arrangement.
been assumed, however, that
function of
entire
addresses, the
would designate the corresponding
descriptors have
mapping table
each entry would
table as
is not part
powerful
maps, Pigure
length and
file ,.5
the is
feasible to of
the
file
organizaticn
illustrates such
92
MULTI-COPUTIt
IV.
I 1
NfTVOI% !IVZICNIENT
I)--i----I54--------4 I
-
I
-
I
•
I
Descriptor
9991 4---------
0 I >> > - - I >>>>>>> I
S+
Descriptor
I I
I 499,000
• • •
I
- - >I --.-.....-...........
I
4
-
I
• •
I
l I I I 4->I ---------. I * -- - -- 0-- - -- I >>>>>>> '+ * I 1 9991 1
499,9991 *
5
I
4
I -
- - - - - --
--
,
I>>>>I --->1 >>>>>>> I ------------------ >1 >>>>>>> I- - 4
I
I
II
.
I
4
-
I I I
4
>I >>>)>>> I
4------
Descriptor
I
II
I>>>>>>>
I-
-
-
-
--
1 I
--
*
II I I
• • *
>>>>>>> I I
I -
I>>>>>>>
,
-
-
-
--------
I
4-
1 249.999.000
1 0
•I->I•---------
I I 249,999, 999 I 4-----------
I • •
I I
4
----------
*->j
I I
• •
I 9991 4------------
Pigure 4.5
Example of File Organization Strategy
I
I
93
M!SIGN
FILE OSGANIZATION STRATEGT RODULT
In this example it is assumed that each file is divided
several states
If the
physical record, bytes lotig, the file sap file
the
contains
descriFtor
tirtual
the
of
address
to
(assused
bytes)
for files
file
4S9,999
address cf a of the
designates
the
the file (blocks are ordered 1000-1999,
0-999.
file addresses:
etc.), Furthermore,
2
require
and
Each entry
secondary storage.
the
coarerFcding
ketween 1000
file is
physical addrest of a block of by
bytes, the
to 999
file desczritor specifies the
locatcd on
sap
range 1
in the
length is
file's
length. If
its current
depending upon
cne of
can be in
A file
byte physical records.
into 1000
greater than
but less than 250.000,000 bytes, there
2000-2999,
500,000 bytes.
are 2 levels of file
saps as illustrated. This strategy
has several advantages. Onder
the vcrst
conditions of random access file processing only frcu one to three
I/o operations
several hidden file maps,
need to
buffers for
the nuaber of
be
performed. By
blocks of the
file as
1/0 operations required
accesses can be drastically reduced.
utilizing well as for file
I
1ULTI-CONPUTER NETWOPm
IV.
The function
of blocks
of allocation and deallocation
several separate
Involvea
LiiVU OUlENT
otfore d.scribinq
4ctofI.
the
vise
to review the
allowed to grow in size,
the POSH will
mechanisms,
iuslimentation of the
it
is
d4sir J charactoistics: A file is
1.
request
from the
add.tional blocks
data portions
of a file
ASH for
the
tables, as
or its index
needei. contain from 8000 to
Common direct access devices
2.
32C00 separately not feasible in 3.
thus it
allocatable blocks,
to store all
is
allocatica iefcrtaticn
main stcrage.
Since twc
x&tdepeadent processors
at the same time, it
now files on the same volume is necessary to provide
interlocks such that they
do not accidently allocate the than one
file, yet not
writing
may be
same block to sore
require one
frccesscr to
wait until the other processor finishes. These
problems can be solved by use of a sFecial Volume
Allocation Table volu~p must For
(VAT) on
be subdivided
direct access
each volume.
In this
into arbitrary
devices with
contiguous areas.
movable read/write as a
heads,
"cylinder") covers
Pach discret.
position (kncwn
area of
UC tc 160 blocks. A cylinder
about
schete, a
an
is a reasonable
ALLOCATION STIATEGY NODUL? DZS'I
unit of sub4ivision. is
entry in
a correspondinq
*bit map"
for each
9,
cylinder on the vclume, tho VAT.
that indicates which
not teen allocated.
Pro
already been allocated.
cylinder consists of
the cor.@spondin q VAT entry mould bit is a
the first
has not been allocated;
entry contains a
blocks on that cylinder have
example, if a
40 blocks, the bit map in be 40 bits long. If
Each
tbere
00". the first block
if the bit is a "1", Likevise for
tbc block has
the second,
third, aad
remaining bits. Uban the POSE first
requests
allocation of a blcck cn a
volume, the ASM selects a cylinder and requests that the DSM read
the corresponding
VAT
entry
into main
storaqe.
An
availatle block, indicated by a 00" bit, is located and then marked as allocated. the VAT entry
will be kept in main storage
remains in use, and blocks viii
that cylinder. Vhen all the
be allocated on cylinder
As long as the volume
have been
allocated,
the
blocks on that
updated VA?
entry
is
written out and a now cylinder selected. With this technique the
amount
of
information is
main
kert
storage
to a minimum
(about
for
40 to
allocaticn 160 bits per
at the same time the numter of extra I/0 clerations
volume),
is minimized
(abcut one per 40 to 160 blocks of allocaticn).
The problem ef interlocking still
required
remains. As
blocks on
the
different cylinders
they may both
i"
long as
the independent processors processors are using separate
allocating VAT entries,
proceed uninterrupted. This condition
can be
ZYV.
MULTI-COMPUTER N1TVCPV INVZICNNEW?1
accoaplitheA1 by utilizing a hardware feature kncv cecords" available Syatfa/36.
Fach
on several of tho
computers including
TAT entries
consistiisq of a rhysical key area .rea
contains
The key area
the allocaticm
as "keyed
is
a
the I5
separatc record
and a data area.
information described
The data above.
is divided itto two
Farts: the identification
number nf the processor curzently
allocating blocks on that
cylinAer an4 have been
an indication if
all blocks on
allocated. A VAT entry
that cylinder
with a key of
woul4 identify a cylinder that was
all zerces
not currently in use and
had blccks available for allocation. There are 1/O instructions that can
be used by the DSH
that will automatically search for a record with a specified key,
such as
zerc. Since
the device
controller will
not
switch processors in the midst of a continuous stream of 1/O operations from it is
a processor (i.e. "chained
possible to generate
T/O commands searching the
that will
an uninterruptible
(1) find
VAT for a
I/C coamands"),
an available
entry with a
sequence of cylinder by
key of zero
and (2)
change the ke.y to indicate the cylinder is in use. This thus solves thp multi-processor
allocation interlock Ezcblem.
DPYVCR STRATEGY
strategy
Device
The
"logical
convert
Nodules
IO
command
the Input/Output
Control
allocation
Strategy Modules into actual computer
sequences
that are
forwarded to
1/O
Modules and
Orqani:ation Strataqy
the l'ile
r*qumstm* from
97
.51GN
FOCULI
System for execution.
for example)
bytes
(!1.1000
is
amount
buffers.
will, therefore,
It
transfer for
reaA
block 211
into
I/O
if each
12930,
into locatior. etc."
location 13930.
consider the physical characteristics of rotational delay and
to request
generate logical I/O
fore*: *read block 227
requests of the
hidden
in
about 10 blocks
The ?OSM will
long).
are
blocks
be necessary
several blocks (e.g.
10P.0 bytes
block
unlikely that a
is
it
issued,
the needed
cf
siguificant
a file
large portion of
request to transfer a
Vhen a
The DSH
must
the device such as
*seek" position for movable
heads. It
then decides upon an optimal sequence to read the b1ccks and qenera te
the
including
necessary
actually
retry,
and ether
issues
housekeeping
as
of course, very device dependent
command
sequence
Input/Output
the physical 1/O
detailed strategy for choosing the
here.
1/0
The
commands.
positicning
System
physical
Ccntrol
request,
discussed earlier.
error The
optimal I/C sequence is. and will not be elaborated
The preceeding sections
have
biqblighted the framewcrk
of a file system. There are, of course, many other itportant decizionr to ani
be sade in such
orgarization
measurement
and
of
a system, such as
tables,
accounting
error
the format
conditions,
of
the
subtle points will be discussed in this section. The Basic
File System is
representerd
by
pr.sented,
the
unique identifiers. identifier
very efficient
mechanism
avoided
is
In the
much
designated
as
the
turle,
resulted file's
for accessing a
of the
with files
specific system
VFDD>. This representation