
On Reducing the Complexity of Management and Control of Future Broadband Networks

Aurel A. Lazar and Rolf Stadler
Center for Telecommunications Research, Columbia University, New York, NY 10027-6699

Abstract

In order to reduce the architectural complexity of broadband networks, the sharing of information among network management and control mechanisms is investigated. We introduce a concept for sharing dynamic information in real-time that identifies the current changes in the network state as a global pool of network events called the event base. Network mechanisms share events by asynchronously interacting with the event base. They filter the continuous stream of data produced by the event base and, thereby, monitor the network state according to their needs. The approach enables the integration of management and control tasks by allowing network mechanisms to share updates to their views of the network. The scheme is efficient in the sense that, first, only changes of the network state are distributed, and, second, this data can be shared (in principle) by any number of mechanisms. This is illustrated by an example of virtual path and virtual circuit connection management and control. The event base is embedded into an architectural framework, the Integrated Reference Model, that incorporates the subsystems of a broadband network, i.e., information transport, real-time control, and network management.

Authors’ Addresses: Aurel A. Lazar, Center for Telecommunications Research, 801, Schapiro Research Building, Columbia University, New York, NY 10027-6699, Tel: 212-854-1747, Fax: 212-316-9068, E-Mail: [email protected]. Rolf Stadler, Center for Telecommunications Research, 801, Schapiro Research Building, Columbia University, New York, NY 10027-6699, Tel: 212-854-2481, Fax: 212-316-9068, E-Mail: [email protected].

1. Introduction

Designers of future broadband networks are confronted with different aspects of complexity which, in this combination, have not occurred in system design before. Like today’s telephone network, a future broadband network is expected to provide world-wide connectivity to billions of subscribers, which leads to the problem of large numbers, e.g., in terms of network nodes. However, compared with the telephone network, the complexity of broadband networks will be even higher because of the ever-growing number of services with different Quality of Service (QOS) requirements that these networks are expected to integrate. In addition to the architectural aspects of complexity, the task of designing a software system for management and control of such a network is by no means understood - and most of the functionality of a broadband network will be implemented in software.

For convenience, we use the term broadband network instead of Broadband ISDN (Integrated Services Digital Network), the future all-digital network which is expected to handle a variety of traffic types, such as voice, video, and data, covering all types of applications, from digital voice and electronic mail to multi-party video conferences.

Compared to traditional networks, broadband networks require complex, distributed real-time monitoring, control and management capabilities in order to efficiently guarantee the QOS for the communication services they integrate - this is the price to pay for efficient integration. A wealth of mechanisms has been identified to provide these capabilities (see, e.g., [1]). These mechanisms run on various time-scales that extend over 10 orders of magnitude. (By the term [network] mechanism, we understand a module or an application which takes part in the task of traffic control or management of a network.) Historically, the tasks of network management and traffic control have been designed to achieve different objectives.
They thus exhibit different characteristics, which makes their integration hard. Network management attempts global control of the network, a task that evolves on a slow time scale (seconds and above), and research in this field is focussed on the questions of how to structure network information and how to exploit this structure for network management. Traffic control, on the other hand, is understood as an inherently distributed task that is performed on a very fast time scale (nanoseconds to 100 milliseconds), and work in this field deals primarily with aspects of dynamics. This difference explains the fact that, so far, research problems in broadband networks have been addressed by different communities: network management is seen primarily from a computer science perspective, whereas traffic control problems are addressed by traffic engineers.

The development of a network architecture that allows the integration of the main tasks of a broadband network - information transport, traffic control, and network management - is an important step towards reducing the overall complexity of the system. We think that such an architecture must primarily provide concepts for the integration of traffic control and network management, since both tasks share functionality as well as network data [2]. Both are based on the same paradigm of monitoring and controlling the network state, and differ primarily in the time scale on which they execute. Our approach to integrating management and control is data-centered in the sense that these tasks interact with each other by reading from and writing to a network-wide database, which we call the network telebase. While we think it necessary to model the information used by various mechanisms in a uniform way, we suggest that, in a further step, this information also be distributed in a uniform way and thus made accessible to all network mechanisms. The difficulties in managing the telebase are the inherent distribution of the mechanisms interacting with it and the network dynamics.

Other activities to integrate network management and traffic control within telecommunication networks are currently being pursued, mainly through joint efforts of consortia. They include the TINA initiative [3,4] and the ROSA project [5], which is carried out within the European RACE program. Both aim to develop a common framework for the TMN (Telecommunication Management Network) and IN (Intelligent Network) architectures, and their approaches to integration center around a common distributed processing environment and a common object-oriented framework, respectively. Note that the architectural concepts that we present in this paper are clearly not an alternative to the activities mentioned above, which are much broader in scope (and in volume).
We focus on specific problems that we think to be critical in order to understand and eventually build large-scale broadband networks. This paper deals with architectural aspects of the network telebase, which is embedded in a network architecture, the Integrated Reference Model [6]. We address the question of how information for management and control of a broadband network can be efficiently shared among network mechanisms. We specifically propose a concept for sharing dynamic information, e.g., data about the status of connections and the availability of network resources, among mechanisms in a distributed network environment. As part of this approach, we introduce a new paradigm for making distributed information available to a network mechanism. Our approach contains time as an architectural concept, for two reasons. First, network delays, given by the laws of physics, restrict in a fundamental manner the way a network can be controlled and hence managed.

Second, since network mechanisms run on completely different time scales (according to which they change their internal state), there must be a generic concept for how they can interact with each other.

Why is this approach important? The intent of the Integrated Reference Model in general, and the telebase in particular, is not only to facilitate the management of complexity but also to reduce it. First, by achieving the integration of different mechanisms via sharing of dynamic data, complexity is reduced. Second, by distributing only changes in the system state, less information is transported and, furthermore, only the changes that are needed are evaluated by the various mechanisms. Therefore, for every state change of the system, only a subset of components is updated and, thus, only a subset of mechanisms is triggered for execution. This can markedly reduce the complexity of the system.

Although the content of this paper is high-level (as its subject implies), we give a specific example and show some implications for the system design and implementation levels, for we think that an architectural model not only must provide a consistent and concise framework of the system in mind, but also should facilitate its realization - a further attempt to reduce the overall complexity.

The rest of the paper is organized as follows. In section 2 we introduce the network telebase as a data repository that conceptualizes the network. In section 3 the telebase is incorporated into an architectural framework for broadband networks, the Integrated Reference Model, which is based upon the concept of information sharing. We also discuss some fundamental problems related to data sharing in a broadband environment. Section 4 gives an example that illustrates our approach to sharing dynamic information. In section 5, this approach is developed into an architectural concept, the event base, and some of its implications are investigated.
Finally, in section 6, we briefly summarize the results of the paper, and discuss questions for further research.

2. The Telebase: Sharing Information in Broadband Networks

The subsystems of a broadband network - the information transport, traffic control, and network management systems - must share information about the network in order to perform their tasks. Consider, for example, a routing table in an ATM network, which is set by traffic control mechanisms and is used to control the way cells are switched within a node in the information transport system. Further, data representing the state of a physical network link may be used simultaneously by a call admission controller, a call routing function, and a performance management application.

Following this observation, we model the functionality of a broadband network by a set of operators that act upon a shared information base, which we call the network telebase, or telebase, for short (figure 1). This conceptual, high-level view of a network will be refined in the next section. The tasks in a broadband network operate on two main time-scales: a fast time-scale with reaction times to events ranging from nanoseconds to milliseconds, and a slow time-scale where actions in response to events take place within seconds and above. In our model, the control and the communication operators run on the fast time-scale, whereas the management operators run on the slow time-scale. The telebase incorporates all time-scales, since it serves as a binding among the other architectures - enabling integration within our framework.

Figure 1: Operators Acting on the Network Telebase. (Figure labels: Communication Operators, Management Operators, Control Operators, and Abstraction Operators acting on the Network Telebase, which holds signalling and resource data, ATM, MIB, MIB abstractions, and statistics.)

Some parts of the telebase data can be derived or abstracted from its other data items. For example, a virtual network (e.g., a private network with reserved communication resources running on a public network) can be represented as an abstraction of the data representing the underlying

network. The process of abstraction is supported by abstraction operators, which are part of the functionality of the telebase. The set of data obtained by abstraction is similar to the concept of the intensional database in knowledge-based systems [7] or the notion of derived objects in object-oriented modeling [8]. In contrast to other domains, abstractions in broadband networks often involve time. A performance parameter, for instance, can be seen as an abstraction of a state variable over a time horizon and at specific points in time.

Concerning management and information transport, the nature and the structure of the data in the telebase are - to a large extent - worked out, understood, and standardized today. For control, however, little has been done so far. From the point of view of network management, the network abstraction is known as the MIB (Management Information Base). In the framework that has been worked out by the OSI and CCITT standards bodies, the MIB is designed following an object-oriented approach, and is organized as a tree of so-called managed objects [9]. In addition to the standardization work, consortia such as the Network Management Forum [10] have developed guidelines for MIB definitions and defined MIBs for special application areas like TMN. At the same time, specific MIB definitions for industrial communication standards, such as SMDS from Bellcore, were published.

A common information framework for information transport, traffic control and network management - although necessary for the realization of data sharing in broadband networks - still needs to be developed, and is beyond the scope of this paper. Nevertheless, let us briefly mention that data in the telebase is structured according to different time scales (figure 2). Data that describes the structure of the system, such as its configuration (on various levels of abstraction), is represented in the configuration database.
This database contains knowledge about the network entities and their relationships. The knowledge about the sensors and their relationships with the network objects is stored in the sensor database. The dynamic database represents the dataspace generated by the state and control variables of the network. There, we can distinguish between state variables that represent aspects of the network state, control variables that are parameters of the network operations and hence determine the dynamics of the system, and event variables that represent either external stimuli to the network, or reflect something that occurs at some point in time within the network. Finally, the statistical database represents the statistical information about the network. It is derived from the dynamic database through a set of performance evaluation operators that are associated with the corresponding state, event and control variables, and it represents the dataspace generated by the performance parameters of the objects of interest.
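As an illustration only, the following Python sketch shows how a performance evaluation operator might derive a performance parameter (an arrival rate) from the history of a state variable over a time horizon. The class `StateHistory`, the function `arrival_rate`, the sample values, and the choice of a sliding window are assumptions made for this sketch; the paper does not prescribe a data structure or an averaging method.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class StateHistory:
    """Timestamped samples of one state variable (hypothetical structure)."""
    horizon: float  # length of the time horizon in seconds
    samples: deque = field(default_factory=deque)  # (timestamp, value) pairs

    def record(self, timestamp: float, value: float) -> None:
        """Store a new sample and drop samples older than the horizon."""
        self.samples.append((timestamp, value))
        while self.samples and self.samples[0][0] < timestamp - self.horizon:
            self.samples.popleft()


def arrival_rate(history: StateHistory) -> float:
    """Performance evaluation operator: packets per second over the horizon.

    Each sample value is taken to be the number of packets that entered
    the buffer at that instant (an assumption of this sketch).
    """
    total = sum(value for _, value in history.samples)
    return total / history.horizon


# Dynamic database side: the state variable changes on a fast time-scale.
packets_into_buffer = StateHistory(horizon=2.0)
for t, n in [(0.0, 4), (0.5, 6), (1.0, 2), (2.5, 8)]:
    packets_into_buffer.record(t, n)

# Statistical database side: the derived performance parameter.
rate = arrival_rate(packets_into_buffer)
```

The state variable and the performance parameter live on different time-scales here: samples arrive continuously, while the abstraction is evaluated only when a consumer asks for it.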

Figure 2: The Organization of the Telebase around Time Scales. (Figure labels: Statistical Database with Perf_Parameter, Throughput, Time_Delay, Packet_Loss, Call_Blocked, Packet_Gap, and Statistical_History; Abstraction of State and Event Variables; Dynamic Database with State_Variable, Event_Variable, Control_Variable, State_History, and Event_History; Sensor Setup and Data Collection; Sensor Database with Status_Sensor, Event_Sensor, Sensor_Station_Info, and Sensor_Traffic_Buffer; Configuration Database with Network, Node, Link, Network_Station, Traffic_Source, Traffic_Buffer, and Traffic_Server; Static Database.)

Figure 3 gives an example of how various data abstractions relate to each other. A network buffer is presented in the form of a semantic network that shows the relationship among data of different kinds (namely, configuration, state, and abstraction). (See [11] for further details.) Recall that network data changes according to various time-scales. The state variable in the buffer in figure 3 may change within nanoseconds, whereas the characteristics of a virtual path connection that is represented as a managed object in the MIB change, say, within minutes.

Figure 3: Semantic Network of Performance Parameters and State Variables of BUFFER and its Subclasses. (Figure nodes: NETWORK_OBJECT, BUFFER, LINK_BUFFER, LINK_BUFFER_I, LINK_BUFFER_0_1_I; STATE_VARIABLE, PACKET_INTO_BUFFER, PACKET_INTO_LINK_BUFFER, PACKET_INTO_LINK_BUFFER_I, PACKET_INTO_LINK_BUFFER_0_1_I; PERF_PARAMETER, ARRIVAL_RATE, LINK_BUFFER_ARRIVAL_RATE, LINK_BUFFER_ARRIVAL_RATE_I, LINK_BUFFER_ARRIVAL_RATE_0_1_I. Edge types: IS-A, INSTANCE-OF, HAS-STATE-VARIABLE, HAS-PERF_PARAMETER, STATE-VAR-OF-GENERICOBJECT, PERF-OF-GENERICOBJECT, PERF-OF-GENERICVARIABLE.)

The focus of this paper lies on a scheme for sharing dynamic information, i.e., data that changes during the operation of the network. Having such a scheme for a broadband network is essential, because network mechanisms execute control actions based on their view of the network state, which generally includes dynamic data produced by other mechanisms.

3. The Integrated Reference Model: An Architecture based on Information Sharing

To overcome the complexity problems in emerging broadband networks - caused by the variety of communication services to be provided, the complexity of the control software, the large number of network nodes, etc. - there is an urgent need for integrating management and real-time control tasks into a consistent framework. To this end, we have developed an overall network architecture, the Integrated Reference Model (IRM) [6]. In this model, the key role for integration is played by the network telebase, a distributed data repository, which is shared among network mechanisms. The IRM incorporates monitoring, control, communication, and abstraction primitives that are organized into the Traffic Control Architecture, the Management Architecture, the Information Transport Architecture and the Telebase Architecture, respectively. (Note that this division follows the network abstraction explained in the last section.) The subdivision of the IRM into the Management and the Traffic Control Architectures on the one hand, and the Information Transport Architecture on the other, is based on the principle of separation between communications and controls. The separation between the Management and the Traffic Control Architecture is primarily due to the different time-scales on which these architectures operate. The Integrated Reference Model can be organized into five planes that model the above architectures (figure 4). The Management Architecture resides in the network management or N-plane, which covers the functional areas of network management, namely, configuration, performance, fault, accounting and security management. Managers and agents, its basic functional components, interact with each other according to the client-server paradigm. The Traffic Control Architecture consists of the resource control, or M-, and the connection management and control, or C-, planes.
The M-plane comprises the entities and mechanisms responsible for resource control, such as cell scheduling, call admission, and call routing; the C-plane comprises those for connection management and control, and is based on a signalling network. The Information Transport Architecture is located in the user transport or U-plane, which models the protocols, services, and entities for the transport of user information. Both the U- and C-planes are horizontally layered, following the ISO Reference Model for Open Systems Interconnection. Finally, the Telebase Architecture resides within the Data Abstraction and Management or D-plane, and implements the principles of data sharing for network monitoring and control primitives, the functional building blocks of the N-, M-, and C-plane mechanisms. The mechanisms within both the management and control architectures are based on the classical monitoring/control paradigm. This means that a mechanism performs an operation by monitoring the network state and, based on that knowledge, executing a control operation that leads to a change of the network state. Therefore, sharing information among network mechanisms means, first of all, sharing state data.

Figure 4: The Integrated Reference Model, organized into planes. (Figure labels: N-plane, Network Management, with Manager, Agents, and Management Protocol; M-plane, Resource Control; D-plane, Data Abstraction and Management, with the Telebase; C-plane, Connection Management and Control, with SCP and Signaling Network; U-plane, User Information Transport, with Nodes, Protocols, and User Access Protocols.)

Up to now, the telebase in the D-plane has been described as being a global concept. It deals with the structure of network information, the time scales according to which this data changes, the relationships among data items, and data abstraction. But - so far - it contains no concept of distribution, which would determine how information is to be stored and made accessible in a distributed environment. Having such a concept is especially important for the dynamic information in the telebase, i.e., the data that changes during the operation of the network. Sharing dynamic information is hard in broadband networks, due to the delays given by the laws of physics, and due to the fact that network mechanisms run on time-scales that extend over 10 orders of magnitude. A link scheduler, for example, performs an operation within a few nanoseconds; the setup of a virtual connection may take in the range of 100 milliseconds in a large network; and, finally, a network management operation such as setting a control parameter on a switch is usually executed in the range of seconds.

In order to make the telebase globally accessible, we thus face the following problem. How can dynamic data be shared among network mechanisms which run at different locations of the network, which operate on different views of this network, and which run on completely different time scales? The approach we propose in this paper follows the idea that mechanisms share changes in the state, in the form of shared events that can be observed throughout the network. In the following section, we give an example that illustrates this concept. Later, we will present an architectural model, the event base, that reflects our approach and allows us to think of the telebase as a distributed subsystem of a broadband network.

4. Integrating Virtual Path and Virtual Circuit Management and Control in a Broadband Network

Virtual Path and Virtual Circuit management and control in a broadband network is an inherently distributed task, which involves management and real-time control activities that are performed by several classes of mechanisms running on various time-scales. In this section, we outline a scheme that allows us to demonstrate how network mechanisms can share dynamic information in order to accomplish this task. In the scheme we use, a Virtual Path (VP) as well as a Virtual Circuit (VC) provides a unidirectional communication path between two users of the network. (In general, a VP is not necessarily an end-to-end concept [12].) A VC can be set up “inside” a VP, in which case cells belonging to the VC are sent along the same route as those belonging to the VP. On the other hand, a VC can be routed differently from a VP that has the same source and destination addresses, i.e., it is independent of a VP. Four mechanisms take part in the task of VP and VC management and control. A VP connection manager (1) sets up VPs, changes their routes, and modifies their assigned communication resources. A VC connection manager (2) creates and deletes VCs, changes their routes, and renegotiates the resources previously assigned to them. Two types of control mechanisms, the link admission controller (4) and the VP admission controller (3), regulate access to communication

resources, namely, to a physical network link and to the resources that are bound to a VP.

                                Link state   VP state   VC setup,         VP setup,
                                change       change     release, change   release, change
(1) VP Connection Manager       read         read       -                 create and read
(2) VC Connection Manager       read         read       create            read
(3) VP Admission Controller     -            create     -                 -
(4) Link Admission Controller   create       -          -                 -

Table 1: Sharing of events among network mechanisms.

According to our paradigm, mechanisms share dynamic information by sharing events, i.e., they share knowledge about changes of the network state. A mechanism, after having performed a control operation, such as setting up a VC or allocating a resource, communicates the change in the system state immediately by creating an event that reflects this change. Other mechanisms observe this event and update their views of the network, i.e., their (internal) states, accordingly. Table 1 lists the types of events that are needed in our scheme for virtual path and virtual circuit management, and shows which mechanisms either create or read these events. Below, we informally describe the functionality of these mechanisms, the knowledge they need in order to decide how to execute network control operations, and the way in which they update this knowledge by sharing events.

(1) The VP connection manager sets up, releases, and changes VPs in a broadband network. Its commands are initiated, e.g., by a human network operator. A VP manager has knowledge about the physical topology of the network, the VPs that have been set up, their routes, and the state of the communication resources, i.e., the load of the network. This knowledge is part of its internal state. It constantly monitors events that represent changes in the state of the VPs and of the network links, in order to update its internal state. Also, it creates events that reflect the result of its control operations.

(2) A VC connection manager is associated with a network access point. It sets up and releases VCs that originate from this point. Its knowledge includes the current state of the VPs that go through the network access point, needed for creating VCs inside these VPs, and the current state of the physical links, needed for computing routes for VCs that are set up outside a VP.
A VC connection manager creates events related to the VCs that it handles.


(3) A VP admission controller manages the communication resources of a specific VP. Every change in the state of this VP triggers the creation of an event that is observed by, e.g., a VP connection manager.

(4) Likewise, a link admission controller grants access to the communication resources of a physical network link. In our scheme, there is one link admission controller per link. A change in the state of the link results in an event that is observed by connection managers, as table 1 shows.

In terms of the Integrated Reference Model (figure 4), the VP connection manager clearly is an N-plane mechanism, whereas a VC connection manager resides in the C-plane, and the VP and link admission controllers operate in the M-plane. The events that are shared among these mechanisms are D-plane objects. They can be interpreted as defining updates on the telebase. (A framework for sharing events will be presented in the following section.)

To illustrate the cooperation of the mechanisms described above, suppose a request for creating a VC arrives at a VC connection manager. Assume further that the mechanism decides to route the VC outside any VP that has already been set up in the network. In a first step, the connection manager computes a route for the new VC based on the information it has on link state. It then performs the control operations that are involved in creating a VC, such as updating routing tables, reserving bandwidth with link admission controllers, etc. After committing the control operations, it creates an event that describes the new VC. As a further result of these control operations, every link admission controller that controls a link along the route of the new VC produces an event that reflects the change in the state of this particular link. These events are observed, e.g., by all connection managers of the network, as table 1 shows. The above scheme can be extended by introducing further mechanisms or additional types of events.
Take a connection monitor as an example of a mechanism that can be added to the network. The connection monitor processes information about VPs and VCs that currently exist in the network and updates its state by reading events produced by mechanisms (1) - (4). Therefore, the following row can be added to table 1.

                           Link state   VP state   VC setup,         VP setup,
                           change       change     release, change   release, change
(5) Connection monitor     -            read       read              read
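The sharing relation of table 1 can also be written down as data, which makes it easy to check that every event type that is read is also created by some mechanism. The Python sketch below is an illustration only: the dictionary layout and the helper names `creators_of`, `readers_of`, and `unproduced_reads` are hypothetical, and the entries follow the reading of table 1 that agrees with the prose descriptions of mechanisms (1) - (4).

```python
LINK = "Link state change"
VP_STATE = "VP state change"
VC = "VC setup, release, change"
VP = "VP setup, release, change"

# One entry per mechanism: the event types it creates and the ones it reads.
table_1 = {
    "VP connection manager": {"create": {VP}, "read": {LINK, VP_STATE, VP}},
    "VC connection manager": {"create": {VC}, "read": {LINK, VP_STATE, VP}},
    "VP admission controller": {"create": {VP_STATE}, "read": set()},
    "Link admission controller": {"create": {LINK}, "read": set()},
}


def creators_of(table, event_type):
    """Mechanisms that create events of the given type."""
    return sorted(m for m, row in table.items() if event_type in row["create"])


def readers_of(table, event_type):
    """Mechanisms that read events of the given type."""
    return sorted(m for m, row in table.items() if event_type in row["read"])


def unproduced_reads(table):
    """Event types that some mechanism reads but no mechanism creates."""
    created = set().union(*(row["create"] for row in table.values()))
    read = set().union(*(row["read"] for row in table.values()))
    return read - created


readers_before = readers_of(table_1, VC)

# Extending the scheme: the connection monitor only reads events.
table_1["Connection monitor"] = {"create": set(), "read": {VP_STATE, VC, VP}}

readers_after = readers_of(table_1, VC)
```

Adding the connection monitor changes no existing entry, which reflects the point made in the text: a new mechanism can join the scheme by reading events that are already being produced.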

5. The Event Base: Sharing Dynamic Information

Traditionally, sharing data among several mechanisms is based on sharing state information. This approach implies that all mechanisms share a common, distributed, global system state, which can be represented in some data model. This model conceptualizes the network in the form of a database. The OSI MIB [9] or the model given in [13] provide such a representation for the purpose of network management. In a broadband network, however, it is very difficult (although theoretically possible) to define and compute a system state common to all network mechanisms. This is because network mechanisms widely differ in the views (of the network) on which they operate, and the timescales on which they run. In order to avoid computing a common global state, we propose that mechanisms share dynamic information in the form of state changes - we call them events - that are used to update the internal states of these mechanisms. Our model for management and control includes a set of distributed processes representing these mechanisms, which run asynchronously, on different time-scales, and communicate among themselves by shared events. Let us briefly clarify the terms (internal) state and time-scale of a network mechanism. A mechanism, such as a VC connection manager, typically performs an operation by first consulting its internal state, then deciding on an execution plan, e.g., by computing a route, and, finally committing this operation, by changing certain routing tables, for instance. The internal state of this mechanism includes its view of the network, which can be represented by an instance of some data structure. The latter has to be continuously monitored and periodically refreshed, if it includes non-local dynamic information. In the case of the VC connection manager, the topology of the network and the current link states are part of its state. 
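The three steps of an operation (consult the internal state, decide on an execution plan, commit) can be sketched as follows. This is a minimal Python illustration under stated assumptions: the paper does not specify a routing algorithm, so a fewest-hop breadth-first search is swapped in; the topology, the capacity figures, and the names `compute_route` and `handle_vc_request` are hypothetical; and the commit step is only recorded rather than performed on real routing tables.

```python
from collections import deque
from typing import Dict, List, Optional, Tuple

Link = Tuple[str, str]


def compute_route(topology: Dict[str, List[str]],
                  free_capacity: Dict[Link, int],
                  source: str, dest: str, demand: int) -> Optional[List[str]]:
    """Fewest-hop route using only links whose free capacity covers the demand."""
    queue = deque([[source]])
    visited = {source}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == dest:
            return path
        for nxt in topology.get(node, []):
            if nxt not in visited and free_capacity.get((node, nxt), 0) >= demand:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None


def handle_vc_request(state: dict, source: str, dest: str,
                      demand: int) -> Optional[dict]:
    """Consult the internal state, plan a route, commit, and describe the change."""
    # 1. Consult the internal state (the mechanism's view of the network).
    route = compute_route(state["topology"], state["free_capacity"],
                          source, dest, demand)
    if route is None:
        return None  # no admissible route in the current view
    # 2. Commit. Here the route is only recorded; a real mechanism would update
    #    routing tables and reserve bandwidth with link admission controllers.
    state["committed_routes"].append(route)
    # 3. Describe the state change as an event for other mechanisms.
    return {"Event_Type": "VC_Setup", "Path": route}


# Hypothetical internal state of a VC connection manager.
vc_manager_state = {
    "topology": {"SW4": ["SW5", "SW8"], "SW5": ["SW7"], "SW8": ["SW7"], "SW7": []},
    "free_capacity": {("SW4", "SW5"): 1, ("SW5", "SW7"): 10,
                      ("SW4", "SW8"): 10, ("SW8", "SW7"): 10},
    "committed_routes": [],
}

event = handle_vc_request(vc_manager_state, "SW4", "SW7", demand=5)
```

Note that the sketch does not change `free_capacity` itself. In the scheme described in this paper, the view of the link states would be refreshed by the events that the link admission controllers create after the reservation.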
The time-scale of a mechanism refers to the refresh cycle of its internal state, i.e., the interval within which the state is updated. The time-scale usually is of the same order of magnitude as the average execution time of an operation. Recalling the example in the previous section, we expect the VC connection manager to run on a time-scale in the order of 100 milliseconds in a large network. The VP manager, which will typically be implemented as an OSI management application, may change its state every 10 minutes, whereas a VP admission controller, since it acts only upon local information, may do so every 10 milliseconds.

An event is generally understood as something that happens at some point in time. Events can be grouped into classes in order to indicate a common semantic domain and a common structure. In our approach, we specifically use events to communicate what is happening in the network, e.g., the fact that a virtual connection has been set up, or that a link load has changed. Instances of events, related to the example in the previous section, are:

    VC_Setup (
        VC_Name:            {12378, 10:28:33am};
        Source Address:     7182;
        Dest Address:       5428;
        Path:               [SW4, SW8, SW7];
        Resource Consumed:  Class_II;
    )

    Link_State (
        Resource Name:  {Sw7, Port 2};
        New State:      Class_I: 17, Class_II: 245, Class_III: 43;
    ).
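As a rough illustration, the two event instances above could be encoded as plain records. The Python sketch below is an assumption about representation only: the class and field names (VCSetup, LinkState, resource_consumed, and so on) merely mirror the labels in the listing and are not prescribed by the model.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class VCSetup:
    """Event class: a virtual circuit has been set up (hypothetical encoding)."""
    vc_name: Tuple[int, str]      # (identifier, time stamp) as in the listing
    source_address: int
    dest_address: int
    path: List[str]               # switches traversed, in order
    resource_consumed: str        # traffic class


@dataclass(frozen=True)
class LinkState:
    """Event class: the state of a link has changed (hypothetical encoding)."""
    resource_name: Tuple[str, str]   # (switch, port)
    new_state: Dict[str, int]        # one value per traffic class


vc_event = VCSetup(
    vc_name=(12378, "10:28:33am"),
    source_address=7182,
    dest_address=5428,
    path=["SW4", "SW8", "SW7"],
    resource_consumed="Class_II",
)

link_event = LinkState(
    resource_name=("Sw7", "Port 2"),
    new_state={"Class_I": 17, "Class_II": 245, "Class_III": 43},
)
```

Grouping events into classes with a fixed structure, as done here, is what later allows a view to be expressed as a predicate over event types and field values.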

We conceptualize the set of “current” events in the network as the event base (we will explain later what we understand by “current”). The event base, therefore, represents the current dynamics of the network; it represents what is happening “now” in the network. Network mechanisms communicate by reading from and writing to the event base. They read to update their state, and write to publish changes resulting from network control operations. A mechanism generally does not read the whole event base, as table 1 shows. In other words, mechanisms operate only on limited views of the event base (see figure 5), which may or may not overlap. Further, a mechanism may want to dynamically change its view of the event base (which can be interpreted as “navigation through a dynamic data space”). Note that although a mechanism-specific view covers only a part of the event base, it can still be used to update a global network view. For example, the VP connection manager, by maintaining the knowledge about all current VPs in the network, acts upon a global view.

Figure 5: Views of the Event Base (labels: view of mechanism 1, view of mechanism 2, view of mechanism 3, Event Base).

We specify the view of a mechanism by a first-order predicate, the domain of which is given by the structure and the values of events. The view of the VC connection management module, e.g., can be defined by the predicate

    Event_Type=Link_State or Event_Type=VP_State or Event_Type=VP_Setup or Event_Type=VP_Release.

To give another example, a predicate for observing the call-level dynamics of a certain traffic class, say, the class carrying video traffic, is: (Event_Type=VC_Setup and Event_Instance.Resource=Class_II) or (Event_Type=VC_Release and Event_Instance.Resource=Class_II).
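The two view predicates given above can be read as boolean functions over events. The following Python sketch assumes, purely for illustration, that an event is a dictionary with an "Event_Type" key and an optional "Resource" field; the function names are hypothetical.

```python
from typing import Any, Callable, Dict, Iterable, List

Event = Dict[str, Any]
Predicate = Callable[[Event], bool]


def vc_manager_view(event: Event) -> bool:
    """View of the VC connection management module."""
    return event["Event_Type"] in {"Link_State", "VP_State", "VP_Setup", "VP_Release"}


def class_ii_call_dynamics(event: Event) -> bool:
    """Call-level dynamics of traffic class II: its set-ups and releases."""
    return (event["Event_Type"] in {"VC_Setup", "VC_Release"}
            and event.get("Resource") == "Class_II")


def apply_view(predicate: Predicate, events: Iterable[Event]) -> List[Event]:
    """Return the sub-stream of events that satisfies the predicate."""
    return [e for e in events if predicate(e)]


# A small, made-up stream of events.
stream = [
    {"Event_Type": "VC_Setup", "Resource": "Class_II"},
    {"Event_Type": "Link_State", "Resource_Name": ("Sw7", "Port 2")},
    {"Event_Type": "VC_Release", "Resource": "Class_I"},
    {"Event_Type": "VP_Setup"},
]

vc_view = apply_view(vc_manager_view, stream)
video_view = apply_view(class_ii_call_dynamics, stream)
```

In this example the two views do not overlap; a predicate that also selected Link_State events for the second mechanism would make them overlap.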

In the case of the VC connection manager, every event related to a change in a link state may trigger a change in the internal state of this mechanism. With a management application, however, events related to link states may be abstracted to an average value over a certain period of time. Therefore, both the view of the event base and the interpretation of its events differ among network mechanisms.
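To make the difference in interpretation concrete, the sketch below feeds the same link-state events to two hypothetical consumers: one that overwrites its state on every event, as a VC connection manager might, and one that keeps an average over a time window, as a management application might. The class names, the scalar load value, and the 10-minute window are assumptions chosen for illustration.

```python
from collections import deque
from typing import Deque, Dict, Tuple


class VCManagerLinkView:
    """Keeps the most recent load per link; every event changes the state."""

    def __init__(self) -> None:
        self.load: Dict[str, int] = {}

    def on_link_state(self, link: str, load: int, time: float) -> None:
        self.load[link] = load


class ManagementLinkView:
    """Abstracts link-state events to a mean over the last `window` seconds."""

    def __init__(self, window: float) -> None:
        self.window = window
        self.samples: Dict[str, Deque[Tuple[float, int]]] = {}

    def on_link_state(self, link: str, load: int, time: float) -> None:
        samples = self.samples.setdefault(link, deque())
        samples.append((time, load))
        # Drop samples that have fallen out of the averaging window.
        while samples and samples[0][0] < time - self.window:
            samples.popleft()

    def average(self, link: str) -> float:
        # Unweighted mean of the samples currently inside the window.
        samples = self.samples[link]
        return sum(load for _, load in samples) / len(samples)


control = VCManagerLinkView()
management = ManagementLinkView(window=600.0)   # 10 minutes, an assumed value

for t, load in [(0.0, 100), (200.0, 200), (400.0, 300)]:
    control.on_link_state("Sw7/Port2", load, t)
    management.on_link_state("Sw7/Port2", load, t)
```

After the three events, the control-side view holds only the latest value, while the management-side view holds a smoothed one.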

Figure 6: The Event Base System (labels: Management Mechanisms, Real-Time Control Mechanisms, Access Point, Event Base System).

The system that realizes the event base in a distributed environment is called the event base system (figure 6). The interface between a network mechanism and the event base system is provided by an access point. On the one hand, an access point hides the architecture and complexity of the event base system; on the other hand, it guarantees certain properties. The most important of them is the maximum delay for distributing an event within the event base system, which is expressed by a time interval ∆t. This interval defines what is meant by “the event base represents the current events of the network”, or, more precisely, by the term “current”.

Figure 7 shows the architecture of a network mechanism interacting with the event base system, in the form of a data flow diagram. The event base system can be seen as producing a stream of events that is filtered according to a mechanism-specific predicate. The resulting substream corresponds to the desired view of the event base. It is interpreted by the network mechanism and employed for updating its internal state. Conversely, the mechanism, by performing its task, may produce events that are submitted to the event base system for distribution.

Figure 7: Architecture of a Network Mechanism interacting with the Event Base (labels: operation interface, internal state, perform operation, map between events and abstractions, events wanted, filter predicate, filter events, Event Base System).

The event base system is used for distributing changes in the network. It thus enables sharing of dynamic information among network mechanisms. In terms of the Integrated Reference Model (figure 4), the event base system is a D-plane mechanism that makes the telebase accessible in a broadband environment.

In order to further explain the role of the event base, we split the functionality of a network mechanism into a monitoring part and a control part. The former can be seen as a continuous process that updates the internal state of the mechanism periodically. The latter performs the network control operations that can be described as transactions upon the states of network mechanisms. The event base supports the monitoring-related activities of a mechanism. A control operation, on the other hand, is performed on a communication infrastructure that is different from the event base, such as a signalling network. (The outcome of such an operation, i.e., the resulting state changes, is nevertheless written to the event base, as discussed in section 4.)

Two reasons justify this specific use of the event base. First, a control operation normally is based on a one-to-one communication pattern (one sender, one recipient). Monitoring, on the other hand, is often based on a many-to-many pattern, which favors the use of a shared medium, such as the event base. Second - and this is a system design argument - monitoring and control activities have different requirements for communication support, e.g., concerning the reliability of the transport service.

The use of the event base for monitoring implies a specific paradigm for accessing dynamic information in a distributed environment. Information is seen as a stream of data, which continuously flows towards a receiver and is filtered according to its needs. Note that no protocol is used by the receiver for accessing remote information. It does not know what data sources send the information, let alone their location in the network. We further recall that our model provides access to information not only for one mechanism, but simultaneously supports a large number of them.

A similar paradigm, based on data access by filtering a stream of information, has been developed for the Datacycle architecture, an approach to high-performance transaction processing in a network environment [14]. The chief differences between the Datacycle approach and our approach are as follows. In the former, a fixed single node sends data and all other nodes receive them, whereas in the event base system every node can act as a sender and as a receiver at the same time. Furthermore, the single sender in the Datacycle architecture broadcasts the complete state of a system (in this case the current state of a relational database) in regular time intervals, whereas in our approach only state changes, i.e., events, are transmitted - immediately after they occur.

6. Discussion

In order to reduce the architectural complexity of future broadband networks, their subsystems - performing information transport, real-time control, and network management - must be integrated into a consistent framework. The approach discussed here for achieving this integration is data-centered, in the sense that network mechanisms cooperate by sharing data. They act upon a shared information repository, which consists of data items that differ in the time-scales on which their values change, and in their level of abstraction. We presented the Integrated Reference Model as an architectural framework that supports the concept of integration by sharing information.

This paper centered around a model for the integration of the management and real-time control tasks. From an architectural point of view, the chief differences between these tasks are the time-scale on which they evolve and the level of abstraction of the data on which they operate. The problem of integration thus can be reduced to the following question: How can information be shared among network mechanisms that run at different locations of the network, operate on different views of this network, and run on completely different time-scales?

The hard part of sharing information in a large-scale broadband environment is to make dynamic information globally available. We introduced the event base as a concept for sharing dynamic information. The subsystem that performs the tasks of management and real-time control in a broadband network is a high-performance distributed system, based on high-speed links. It runs network mechanisms that exchange dynamic information in the form of events, by asynchronously interacting with the event base. This interaction enables each mechanism to maintain its own view of the network while running on its own time-scale. For each mechanism, a predicate defines, and a filter provides, the desired fragment of the event base.

The event base conceptualizes the dynamics of the network; it is the base of the current network events. This concept can be interpreted as a new paradigm for accessing data in a distributed real-time environment. Instead of sending requests to various data sources, as is usually done, the receiver configures a filter on a dynamic data pool, as conceptualized by the event base. The event base can be seen as creating a flow of data that is filtered by a receiver according to its needs. Obtaining information in a distributed environment thus no longer centers on a distributed operation, but on a local one, namely filtering. By analogy, we call this the disc head model.

How to realize an event base efficiently in a broadband environment remains a challenging problem. To attack it, we must determine how events should be distributed and how they should be filtered. A straightforward approach to distributing events is to link all network nodes by a spanning tree that supports multicasting - a method that is used to broadcast updates for the topological database within the plaNET network [15,16]. As demonstrated in the Datacycle project [14], powerful hardware filters can be built today that perform filter operations, similar to what we envision for the event base, at very high speed.
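The filter-based access paradigm summarized in this section can be sketched in a few lines of Python. This is a toy, single-process stand-in written under strong assumptions: the names EventBase, AccessPoint, attach, and publish are hypothetical, and distribution across nodes as well as the delay bound ∆t are not modeled at all.

```python
from typing import Any, Callable, Dict, List

Event = Dict[str, Any]
Predicate = Callable[[Event], bool]
Handler = Callable[[Event], None]


class AccessPoint:
    """Interface of one mechanism to the event base: a filter plus a handler."""

    def __init__(self, predicate: Predicate, handler: Handler) -> None:
        self.predicate = predicate
        self.handler = handler

    def deliver(self, event: Event) -> None:
        # Filtering is a local operation at the receiver's access point.
        if self.predicate(event):
            self.handler(event)


class EventBase:
    """Toy stand-in for the event base system (no distribution, no delay bound)."""

    def __init__(self) -> None:
        self.access_points: List[AccessPoint] = []

    def attach(self, predicate: Predicate, handler: Handler) -> AccessPoint:
        point = AccessPoint(predicate, handler)
        self.access_points.append(point)
        return point

    def publish(self, event: Event) -> None:
        # Every event is offered to every access point; the sender does not
        # know which mechanisms, if any, are interested in it.
        for point in self.access_points:
            point.deliver(event)


base = EventBase()
link_loads: Dict[str, int] = {}      # internal state of one mechanism
vc_count = {"Class_II": 0}           # internal state of another mechanism


def update_link_load(event: Event) -> None:
    link_loads[event["Link"]] = event["Load"]


def count_class_ii(event: Event) -> None:
    vc_count["Class_II"] += 1 if event["Event_Type"] == "VC_Setup" else -1


base.attach(lambda e: e["Event_Type"] == "Link_State", update_link_load)
base.attach(
    lambda e: e["Event_Type"] in {"VC_Setup", "VC_Release"}
    and e.get("Resource") == "Class_II",
    count_class_ii,
)

base.publish({"Event_Type": "Link_State", "Link": "Sw7/Port2", "Load": 245})
base.publish({"Event_Type": "VC_Setup", "Resource": "Class_II"})
base.publish({"Event_Type": "VC_Setup", "Resource": "Class_I"})
base.publish({"Event_Type": "VC_Release", "Resource": "Class_II"})
base.publish({"Event_Type": "VC_Setup", "Resource": "Class_II"})
```

Neither receiver issues a request to a data source; each only configures a predicate and updates its own state from the events that pass it.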
Further problems that need to be investigated include aspects of congestion control and scalability. The event base supports network monitoring in the sense that it can be used to update the state of the network as seen by a mechanism. Since only state changes are distributed, the problem arises of how a mechanism can be initialized, i.e., how its initial state can be determined in an efficient way. This situation occurs whenever a new mechanism is added to the network, or whenever a mechanism is initialized after a fault condition has been detected. The same problem applies to mechanisms that dynamically change the scope of network information on which they operate, i.e., their view of the network.

We believe that an architectural framework, as presented in this paper, not only must support a high-level view of the system in mind and emphasize its fundamental concepts, but also should provide guidelines for a system design that follows these concepts. Our current work thus focuses primarily on system design aspects. We are modeling a subsystem for virtual path and virtual circuit connection management. The resulting system components will run on both an emulation environment and a real gigabit testbed, in order to validate and revise architectural design decisions. We believe that, in the current state of research, experiments will provide essential contributions to the understanding of future broadband networks with manageable complexity.

Acknowledgments

The work presented here was supported in part by the National Science Foundation under Grant # CDA-90-24735, and by the Swiss National Science Foundation.

References

[1] Henry Gilbert, Osama Aboul-Magd, Van Phung: Developing a Cohesive Traffic Management Strategy for ATM Networks, IEEE Communications Magazine, October 1991.

[2] Mitsuru Tsuchida, Aurel A. Lazar, Nikos G. Aneroussis: Structural Representation of Management and Control Information in Broadband Networks, Proceedings of the International Conference on Communications, Chicago, IL, June 1992, pp. 1019-1024.

[3] Menso Appeldorn, Roberto Kung, Roberto Saracco: TMN + IN = TINA, IEEE Communications Magazine, March 1993.

[4] William J. Barr, Trevor Boyd, Yuji Inoue: The TINA Initiative, IEEE Communications Magazine, March 1993.

[5] Jane Hall, Thomas Magedanz: Uniform Modelling of Management and Telecommunication Services in Future Telecommunication Environments based on the ROSA Approach, in: Integrated Network Management, III, H.-G. Hegering and Y. Yemini (Editors), Elsevier Science Publishers, 1993.

[6] Aurel A. Lazar: A Real-Time Management, Control, and Information Transport Architecture for Broadband Networks, Proceedings of the 1992 Zurich Seminar in Digital Communications, Zurich, Switzerland, March 16-19, 1992.

[7] Jeffrey D. Ullman: Principles of Database and Knowledge-Base Systems, Volumes I&II, Computer Science Press, 1988.

[8] James Rumbaugh, Michael Blaha, William Premerlani, Frederick Eddy, William Lorensen: Object-Oriented Modeling and Design, Prentice-Hall, 1991.

[9] ISO: ISO/IEC IS 10040: Information Processing Systems - Open Systems Interconnection - Systems Management Overview, 1991.

[10] Bruce Murill: OMNIPoint: An Implementation Guide to Integrated Networked Information Systems Management, in: Integrated Network Management, III, H.-G. Hegering and Y. Yemini (Editors), Elsevier Science Publishers, 1993.

[11] Aurel A. Lazar: The Integration of Real-Time Control with Management in Broadband Networks, Proceedings of the Workshop on Broadband Communications, Estoril, Portugal, January 20-22, 1992.

[12] Jean-Yves Le Boudec: The Asynchronous Transfer Mode: A Tutorial, Computer Networks and ISDN Systems, No. 24, 1992.

[13] Ouri Wolfson, Soumitra Sengupta, Yechiam Yemini: Managing Communication Networks by Monitoring Databases, IEEE Transactions on Software Engineering, Vol. 17, No. 9, September 1991.

[14] T. F. Bowen, G. Gopal, G. Herman, T. Hickey, K.C. Lee, W.H. Mansfield, J. Raitz, A. Weinrib: The Datacycle Architecture, Communications of the ACM, Vol. 35, No. 12, December 1992.

[15] Inder Gopal, Roch Guerin, Jim Janniello, Vasilios Theoharakis: ATM Support in a Transparent Network, Proceedings Globecom ‘92.

[16] Israel Cidon, Inder Gopal, Marc Kaplan, Shay Kutten: Distributed Control for PARIS, Proceedings of the 9th Annual ACM Symposium on Principles of Distributed Computing, Quebec, Canada, 1990.

[17] S. Kheradpir, W. Stinson, J. Vucetic, A. Gersht: Real-Time Management of Telephone Operating Company Networks: Issues and Approaches, IEEE Journal on Selected Areas in Communications, Vol. 11, No. 7, December 1993.


1. Introduction

Designers of future broadband networks are confronted with different aspects of complexity, which, in this combination, have not occurred in system design before. Like today’s telephone network, a future broadband network is expected to provide world-wide connectivity to billions of subscribers, which leads to the problem of large numbers, e.g., in terms of network nodes. However, when compared with the telephone network, the complexity of broadband networks will be even higher because of the ever growing number of services with different Quality of Service (QOS) requirements these networks are expected to integrate. In addition to the architectural aspects of complexity, the task of designing a software system for management and control of such a network is by no means understood - and most of the functionality of a broadband network will be implemented in software.

For convenience, we use the term broadband network instead of Broadband ISDN (Integrated Services Digital Network), the future all-digital network which is expected to handle a variety of traffic types, such as voice, video, and data, covering all types of applications, from digital voice and electronic mail to multi-party video conferences.

Compared to traditional networks, broadband networks require complex, distributed real-time monitoring, control and management capabilities in order to efficiently guarantee the QOS for the communication services they integrate - this is the price to pay for efficient integration. A wealth of mechanisms has been identified to provide these capabilities (see, e.g., [1]). These mechanisms run on various time-scales that extend over 10 orders of magnitude. (By the term [network] mechanism, we understand a module or an application which takes part in the task of traffic control or management of a network.)

Historically, the tasks of network management and traffic control have been designed to achieve different objectives.
They thus exhibit different characteristics, which makes their integration hard. Network management attempts global control of the network, a task that evolves on a slow time scale (seconds and above), and research in this field is focussed on the questions of how to structure network information and how to exploit this structure for network management. Traffic control, on the other side, is understood as an inherently distributed task that is performed on a very fast time scale (nanoseconds to 100 milliseconds), and work in this field deals with aspects of dynamics in the first place. This difference explains the fact that, so far, research problems in broadband networks are addressed by different communities: network management is seen primarily from a computer science perspective, whereas traffic control problems are addressed by traffic engineers.

The development of a network architecture that allows the integration of the main tasks of a broadband network - information transport, traffic control, and network management - is an important step towards reducing the overall complexity of the system. We think that such an architecture must primarily provide concepts for the integration of traffic control and network management, since both tasks share functionality as well as network data [2]. They both are based on the same paradigm of monitoring and controlling the network state, and primarily differ in the time scale on which they execute.

Our approach for integrating management and control is data-centered in the sense that these tasks interact with each other by reading from and writing to a network-wide database, which we call the network telebase. While we think it necessary to model information used by various mechanisms in a uniform way, we suggest that, in a further step, this information be also distributed in a uniform way and thus made accessible to all network mechanisms. The difficulties in managing the telebase are the inherent distribution of the mechanisms interacting with it and the network dynamics.

Other activities to integrate network management and traffic control within telecommunication networks are currently pursued, mainly by joint efforts of consortia. They include the TINA initiative [3,4] and the ROSA project [5], which is carried out within the European RACE program. They both have the goal to come up with a common framework for the TMN (Telecommunication Management Network) and IN (Intelligent Network) architectures, and their approaches for integration center around a common distributed processing environment, and a common object-oriented framework, respectively. Note that the architectural concepts that we present in this paper are clearly not an alternative to the activities mentioned above, which are much broader in scope (and in volume).
We focus on specific problems that we think to be critical in order to understand and eventually build large-scale broadband networks. This paper deals with architectural aspects of the network telebase, which is embedded in a network architecture, the Integrated Reference Model [6]. We address the question of how information for management and control of a broadband network can be efficiently shared among network mechanisms. We specifically propose a concept for sharing dynamic information, e.g., data about the status of connections and the availability of network resources, among mechanisms in a distributed network environment. As part of this approach, we introduce a new paradigm for making distributed information available to a network mechanism. Our approach contains time as an architectural concept, for two reasons. First, network delays, given by the laws of physics, restrict in a fundamental manner the way a network can be controlled and hence managed.

Second, since network mechanisms run on completely different time scales (according to which they change their internal state), there must be a generic concept of how they can interact with each other.

Why is this approach important? The intent of the Integrated Reference Model in general, and the telebase in particular, is not only to facilitate the management of complexity but also to reduce it. First, by achieving the integration of different mechanisms, via sharing of dynamic data, complexity is reduced. Second, by distributing only changes in the system state, less information is transported, and furthermore, only needed changes are evaluated by various mechanisms. Therefore, for every state change of the system, only a subset of components is updated and, thus, only a subset of mechanisms is triggered for execution. This can markedly reduce the complexity of the system.

Although the content of this paper is high-level (as its subject implies), we give a specific example and show some implications for the system design and implementation levels, for we think that an architectural model not only must provide a consistent and concise framework of the system in mind, but also should facilitate its realization - a further attempt to reduce the overall complexity.

The rest of the paper is organized as follows. In section 2 we introduce the network telebase as a data repository that conceptualizes the network. In section 3 the telebase is incorporated into an architectural framework for broadband networks, the Integrated Reference Model, which is based upon the concept of information sharing. We also discuss some fundamental problems related to data sharing in a broadband environment. Section 4 gives an example that illustrates our approach to sharing dynamic information. In section 5, this approach is developed into an architectural concept, the event base, and some of its implications are investigated.
Finally, in section 6, we briefly summarize the results of the paper, and discuss questions for further research.

2. The Telebase: Sharing Information in Broadband Networks

The subsystems of a broadband network - the information transport, traffic control, and network management systems - must share information about the network in order to perform their tasks. Consider, for example, a routing table in an ATM network, which is set by traffic control mechanisms, and is used to control the way cells are switched within a node in the information transport system. Further, data representing the state of a physical network link may be used simultaneously by a call admission controller, a call routing function, and a performance management application.

Following this observation, we model the functionality of a broadband network by a set of operators that act upon a shared information base, which we call the network telebase, or telebase, for short (figure 1). This conceptual, high-level view of a network will be refined in the next section. The tasks in a broadband network operate on two main time-scales: a fast time-scale with reaction times to events ranging from nanoseconds to milliseconds, and a slow time-scale where actions in response to events take place within seconds and above. In our model, the control and the communication operators run on the fast time-scale, whereas the management operators run on the slow time-scale. The telebase incorporates all time-scales, since it serves as a binding among the other architectures - enabling integration within our framework.

Figure 1: Operators Acting on the Network Telebase (labels: Communication Operators, Management Operators, Control Operators, Abstraction Operators, Network Telebase).

Some parts of the telebase data can be derived or abstracted from its other data items. For example, a virtual network (e.g., a private network with reserved communication resources running on a public network) can be represented as an abstraction of the data representing the underlying network. The process of abstraction is supported by abstraction operators, which are part of the functionality of the telebase. The set of data obtained by abstraction is similar to the concept of the intensional database in knowledge-based systems [7] or the notion of derived objects in object-oriented modeling [8]. In contrast to other domains, abstractions in broadband networks often involve time. A performance parameter, for instance, can be seen as an abstraction of a state variable over a time horizon and at specific points in time.

Concerning management and information transport, the nature and the structure of the data in the telebase are - to a large extent - worked out, understood, and standardized today. For control, however, little has been done so far. From the point of view of network management, the network abstraction is known as the MIB (Management Information Base). In the framework that has been worked out by the OSI and CCITT standards bodies, the MIB is designed following an object-oriented approach, and is organized as a tree of so-called managed objects [9]. In addition to the standardization work, consortia such as the Network Management Forum [10] have developed guidelines for MIB definitions and defined MIBs for special application areas like TMN. At the same time, specific MIB definitions for industrial communication standards, such as SMDS from Bellcore, were published.

A common information framework for information transport, traffic control and network management - although necessary for the realization of data sharing in broadband networks - still needs to be developed, and is beyond the scope of this paper. Nevertheless, let us briefly mention that data in the telebase is structured according to different time scales (figure 2). Data that describes the structure of the system, such as its configuration (on various levels of abstraction), is represented in the configuration database.
This database contains knowledge about the network entities and their relationships. The knowledge about the sensors and their relationships with the network objects is stored in the sensor database. The dynamic database represents the dataspace generated by the state and control variables of the network. There, we can distinguish between state variables that represent aspects of the network state, control variables that are parameters of the network operations and hence determine the dynamics of the system, and event variables that represent either external stimuli to the network, or reflect something that occurs at some point in time within the network. Finally, the statistical database represents the statistical information about the network. It is derived from the dynamic database through a set of performance evaluation operators that are associated with the corresponding state, event and control variables, and it represents the dataspace generated by the performance parameters of the objects of interest.
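As a hedged illustration of a performance evaluation operator, the Python sketch below derives one statistical value, a time-weighted mean, from the recorded history of a state variable. The function name, the representation of a state history as (time, value) pairs, and the assumption that a value holds until the next record are all choices made for this example and are not taken from the telebase definition.

```python
from typing import List, Tuple

# A state history: (time stamp in seconds, value) pairs, sorted by time.
StateHistory = List[Tuple[float, float]]


def time_average(history: StateHistory, start: float, end: float) -> float:
    """Time-weighted mean of a piecewise-constant state variable on [start, end].

    Each recorded value is assumed to hold until the next record.
    The history must contain a record at or before `start`.
    """
    if end <= start:
        raise ValueError("end must be greater than start")
    if not history or history[0][0] > start:
        raise ValueError("history does not cover the start of the horizon")

    total = 0.0
    for i, (t, value) in enumerate(history):
        next_t = history[i + 1][0] if i + 1 < len(history) else end
        # Clip the interval [t, next_t) to the horizon [start, end].
        lo, hi = max(t, start), min(next_t, end)
        if hi > lo:
            total += value * (hi - lo)
    return total / (end - start)


# Buffer occupancy (in cells), recorded whenever it changes (made-up values).
occupancy = [(0.0, 10.0), (2.0, 30.0), (6.0, 20.0)]
mean_occupancy = time_average(occupancy, start=0.0, end=10.0)
```

The same history can be abstracted over different time horizons, which is one way to read the statement that abstractions in broadband networks often involve time.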

Figure 2: The Organization of the Telebase around Time Scales (panels: Statistical Database, Dynamic Database, Configuration Database, Sensor Database, Static Database).

Figure 3 gives an example of how various data abstractions relate to each other. A network buffer is presented in the form of a semantic network that shows the relationship among data of different kinds (namely, configuration, state, and abstraction). (See [11] for further details.) Recall that network data changes according to various time-scales. The state variable in the buffer in figure 3 may change within nanoseconds, whereas the characteristics of a virtual path connection that is represented as a managed object in the MIB change, say, within minutes.

Figure 3: Semantic Network of Performance Parameters and State Variables of BUFFER and its Subclasses (nodes include NETWORK_OBJECT, BUFFER, LINK_BUFFER, STATE_VARIABLE, PERF_PARAMETER, PACKET_INTO_BUFFER, and ARRIVAL_RATE with their link-buffer subclasses and instances; edge labels include IS-A, INSTANCE-OF, HAS-STATE-VARIABLE, and HAS-PERF_PARAMETER).

The focus of this paper lies on a scheme for sharing dynamic information, i.e., data that changes during the operation of the network. Having such a scheme for a broadband network is essential, because network mechanisms execute control actions based on their view of the network state, which generally includes dynamic data produced by other mechanisms.

3. The Integrated Reference Model: An Architecture based on Information Sharing

To overcome the complexity problems in emerging broadband networks - caused by the variety of communication services to be provided, the complexity of the control software, the large number of network nodes, etc. - there is an urgent need for integrating management and real-time control tasks into a consistent framework. To this end, we have developed an overall network architecture, the Integrated Reference Model (IRM) [6]. In this model, the key role for integration is played by the network telebase, a distributed data repository, which is shared among network mechanisms.

The IRM incorporates monitoring, control, communication, and abstraction primitives that are organized into the Traffic Control Architecture, the Management Architecture, the Information Transport Architecture and the Telebase Architecture, respectively. (Note that this division follows the network abstraction explained in the last section.) The subdivision of the IRM into the Management and the Traffic Control Architectures on the one hand, and the Information Transport Architecture on the other, is based on the principle of separation between communications and controls. The separation between the Management and the Traffic Control Architecture is primarily due to the different time-scales on which these architectures operate.

The Integrated Reference Model can be organized into five planes that model the above architectures (figure 4). The Management Architecture resides in the network management or N-plane, which covers the functional areas of network management, namely, configuration, performance, fault, accounting and security management. Managers and agents, its basic functional components, interact with each other according to the client-server paradigm. The Traffic Control Architecture consists of the resource control, or M-, and the connection management and control, or C-, planes.
The M-plane comprises the entities and mechanisms responsible for resource control, such as cell scheduling, call admission, and call routing; the C-plane comprises those for connection management and control, and is based on a signalling network. The Information Transport Architecture is located in the user transport or U-plane, which models the protocols, services, and entities for the transport of user information. Both the U- and C-planes are horizontally layered, following the ISO Reference Model for Open System Interconnection. Finally, the Telebase Architecture resides within the Data Abstraction and Management or D-plane, and implements the principles of data sharing for network monitoring and control primitives, the functional building blocks of the N-, M-, and C-plane mechanisms.

Figure 4: The Integrated Reference Model, organized into planes. (Figure labels: N-plane, Network Management, with Manager, Agents, and Management Protocol; M-plane, Resource Control; D-plane, Data Abstraction and Management, with Telebase; C-plane, Connection Management and Control, with SCP and Signaling Network; U-plane, User Information Transport, with Nodes, Protocols, and User Access Protocols.)

The mechanisms within both the management and control architectures are based on the classical monitoring/control paradigm. This means that a mechanism performs an operation by monitoring the network state, and, based on that knowledge, executes a control operation, leading to a change of the network state. Therefore, sharing information among network mechanisms includes sharing state data in the first place.

Up to now, the telebase in the D-plane has been described as being a global concept. It deals with the structure of network information, the time scales according to which this data changes, the relationships among data items, and data abstraction. But - so far - it contains no concept of distribution, which would determine how information is to be stored and made accessible in a distributed environment. Having such a concept is especially important for the dynamic information in the telebase, i.e., the data that changes during the operation of the network. Sharing dynamic information is hard in broadband networks, due to the delays given by the laws of
physics, and due to the fact that network mechanisms run on time-scales that extend over 10 orders of magnitude. A link scheduler, for example, performs an operation within a few nanoseconds, the setup of a virtual connection may take on the order of 100 milliseconds in a large network, and, finally, a network management operation such as setting a control parameter on a switch is usually executed in the range of seconds.

In order to make the telebase globally accessible, we thus face the following problem. How can dynamic data be shared among network mechanisms, which run at different locations of the network, which operate on different views of this network, and which run on completely different time scales? The approach we propose in this paper follows the idea that mechanisms share changes in the state, in the form of shared events that can be observed throughout the network. In the following section, we give an example that illustrates this concept. Later, we will present an architectural model, the event base, that reflects our approach, and allows us to think of the telebase as a distributed subsystem of a broadband network.

4. Integrating Virtual Path and Virtual Circuit Management and Control in a Broadband Network

Virtual Path and Virtual Circuit management and control in a broadband network is an inherently distributed task, which involves management and real-time control activities that are performed by several classes of mechanisms running on various time-scales. In this section, we outline a scheme that allows us to demonstrate how network mechanisms can share dynamic information in order to accomplish this task.

In the scheme we use, a Virtual Path (VP) as well as a Virtual Circuit (VC) provides a unidirectional communication path between two users of the network. (In general, a VP is not necessarily an end-to-end concept. [12]) A VC can be set up “inside” a VP, in which case cells belonging to the VC are sent along the same route as those belonging to the VP. On the other hand, a VC can be routed differently from a VP that has the same source and destination addresses, i.e., it is independent of a VP.

Four mechanisms take part in the task of VP and VC management and control. A VP connection manager (1) sets up VPs, changes their routes, and modifies their assigned communication resources. A VC connection manager (2) creates and deletes VCs, changes their routes, and renegotiates the resources previously assigned to them. Two types of control mechanisms, the link admission controller (4) and the VP admission controller (3), regulate access to communication
resources, namely, to a physical network link and to the resources that are bound to a VP.

Mechanism                       Link state   VP state   VC setup,         VP setup,
                                change       change     release, change   release, change
(1) VP Connection Manager       read         read       -                 create and read
(2) VC Connection Manager       read         read       create            read
(3) VP Admission Controller     -            create     -                 -
(4) Link Admission Controller   create       -          -                 -

Table 1: Sharing of events among network mechanisms.

According to our paradigm, mechanisms share dynamic information by sharing events, i.e., they share knowledge about changes of the network state. A mechanism, after having performed a control operation, such as setting up a VC or allocating a resource, communicates the change in the system state immediately by creating an event that reflects this change. Other mechanisms observe this event and update their views of the network, i.e., their (internal) states, accordingly.

Table 1 lists the types of events that are needed in our scheme for virtual path and virtual circuit management, and shows which mechanisms either create or read these events. Below, we informally describe the functionality of these mechanisms, the knowledge they need in order to decide how to execute network control operations, and the way they update this knowledge by sharing events.

(1) The VP connection manager sets up, releases, and changes VPs in a broadband network. Its commands are initiated, e.g., by a human network operator. A VP manager has knowledge of the physical topology of the network, the VPs that have been set up, their routes, and the state of the communication resources, i.e., the load of the network. This knowledge is part of its internal state. It constantly monitors events that represent changes in the state of the VPs and of the network links, in order to update its internal state. Also, it creates events that reflect the result of its control operations.

(2) A VC connection manager is associated with a network access point. It sets up and releases VCs that originate from this point. Its knowledge includes the current state of the VPs that go through the network access point, needed for creating VCs inside these VPs, and the current state of the physical links, needed for computing routes for VCs that are set up outside a VP. A VC connection manager creates events related to the VCs that it handles.


(3) A VP admission controller manages the communication resources of a specific VP. Every change in the state of this VP triggers the creation of an event that is observed by, e.g., a VP connection manager.

(4) Likewise, a link admission controller grants access to the communication resources of a physical network link. In our scheme, there is one link admission controller per link. A change in the state of the link results in an event that is observed by connection managers, as table 1 shows.

In terms of the Integrated Reference Model (figure 4), the VP connection manager clearly is an N-plane mechanism, whereas a VC connection manager resides in the C-plane, and the VP and link admission controllers operate in the M-plane. The events that are shared among these mechanisms are D-plane objects. They can be interpreted as defining updates on the telebase. (A framework for sharing events will be presented in the following section.)

To illustrate the cooperation of the mechanisms described above, suppose a request for creating a VC arrives at a VC connection manager. Assume further that the mechanism decides to route the VC outside a VP that has already been set up in the network. In a first step, the connection manager computes a route for the new VC based on the information it has on link state. It then performs the control operations that are involved in creating a VC, such as updating routing tables, reserving bandwidth with link admission controllers, etc. After committing the control operations, it creates an event that describes the new VC. As a further result of these control operations, every link admission controller that controls a link along the route of the new VC produces an event that reflects the change in the state of this particular link. These events are observed, e.g., by all connection managers of the network, as table 1 shows.

The above scheme can be extended by introducing further mechanisms or additional types of events. Take a connection monitor as an example of a mechanism that can be added to the network. The connection monitor processes information about VPs and VCs that currently exist in the network and updates its state by reading events produced by mechanisms (1) - (4). Therefore, the following row can be added to table 1.

Mechanism                       Link state   VP state   VC setup,         VP setup,
                                change       change     release, change   release, change
(5) Connection monitor          -            read       read              read
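As an illustration only, the create/read relationships of table 1 (including the added row for the connection monitor) can be written down as a small lookup structure. The sketch below is not part of the scheme itself; the identifier names (`TABLE_1`, `readers`, `creators`, and the event type strings) are hypothetical and chosen for readability.

```python
# Sketch of table 1 as a lookup structure. All names are illustrative
# assumptions; only the create/read entries come from the table.
TABLE_1 = {
    "vp_connection_manager": {
        "link_state_change": {"read"},
        "vp_state_change": {"read"},
        "vp_setup_release_change": {"create", "read"},
    },
    "vc_connection_manager": {
        "link_state_change": {"read"},
        "vp_state_change": {"read"},
        "vc_setup_release_change": {"create"},
        "vp_setup_release_change": {"read"},
    },
    "vp_admission_controller": {"vp_state_change": {"create"}},
    "link_admission_controller": {"link_state_change": {"create"}},
    # Row (5), the extension discussed in the text.
    "connection_monitor": {
        "vp_state_change": {"read"},
        "vc_setup_release_change": {"read"},
        "vp_setup_release_change": {"read"},
    },
}


def mechanisms_with(right, event_type):
    """Return the sorted names of mechanisms holding `right` on `event_type`."""
    return sorted(
        name
        for name, rights in TABLE_1.items()
        if right in rights.get(event_type, set())
    )


def readers(event_type):
    """Mechanisms that observe events of the given type."""
    return mechanisms_with("read", event_type)


def creators(event_type):
    """Mechanisms that produce events of the given type."""
    return mechanisms_with("create", event_type)


if __name__ == "__main__":
    print(readers("link_state_change"))
    print(creators("vp_state_change"))
```

Such a structure makes the extension step explicit: adding a mechanism amounts to adding one entry, without touching the entries of the existing mechanisms.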

5. The Event Base: Sharing Dynamic Information

Traditionally, sharing data among several mechanisms is based on sharing state information. This approach implies that all mechanisms share a common, distributed, global system state, which can be represented in some data model. This model conceptualizes the network in the form of a database. The OSI MIB [9] or the model given in [13] provide such a representation for the purpose of network management. In a broadband network, however, it is very difficult (although theoretically possible) to define and compute a system state common to all network mechanisms. This is because network mechanisms widely differ in the views (of the network) on which they operate, and the timescales on which they run. In order to avoid computing a common global state, we propose that mechanisms share dynamic information in the form of state changes - we call them events - that are used to update the internal states of these mechanisms. Our model for management and control includes a set of distributed processes representing these mechanisms, which run asynchronously, on different time-scales, and communicate among themselves by shared events. Let us briefly clarify the terms (internal) state and time-scale of a network mechanism. A mechanism, such as a VC connection manager, typically performs an operation by first consulting its internal state, then deciding on an execution plan, e.g., by computing a route, and, finally committing this operation, by changing certain routing tables, for instance. The internal state of this mechanism includes its view of the network, which can be represented by an instance of some data structure. The latter has to be continuously monitored and periodically refreshed, if it includes non-local dynamic information. In the case of the VC connection manager, the topology of the network and the current link states are part of its state. 
The time-scale of a mechanism refers to the refresh cycle of its internal state, i.e., the duration within which the state is updated. The time-scale usually is of the same order of magnitude as the average execution time of an operation. Recalling the example in the previous section, we expect the VC connection manager to run on a time-scale on the order of 100 milliseconds in a large network. The VP manager, which will typically be implemented as an OSI management application, may change its state every 10 minutes; a VP admission controller, since it acts only upon local information, every 10 milliseconds.

An event is generally understood as something that happens at some point in time. Events can be grouped into classes in order to indicate a common semantic domain and a common structure. In our approach, we specifically use events to communicate what is happening in the network, e.g., the fact that a virtual connection has been set up, or that a link load has changed. Instances of events, related to the example in the previous section, are:

    VC_Setup (
        VC_Name:            {12378, 10:28:33am};
        Source Address:     7182;
        Dest Address:       5428;
        Path:               [SW4, SW8, SW7];
        Resource Consumed:  Class_II;
    )

    Link_State (
        Resource Name:      {Sw7, Port 2};
        New State:          Class_I: 17, Class_II: 245, Class_III: 43;
    ).
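For readers who prefer a programmatic notation, the two event instances above could be represented as typed records. The following is a minimal sketch; the class and field names are our own transliteration of the listing and are not prescribed by the model.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class VCSetup:
    """Event class for the setup of a virtual circuit (hypothetical layout)."""
    vc_name: Tuple[int, str]        # identifier and time stamp, as in the listing
    source_address: int
    dest_address: int
    path: Tuple[str, ...]           # switches traversed, in order
    resource_consumed: str          # traffic class of the VC


@dataclass(frozen=True)
class LinkState:
    """Event class for a change in the state of a link (hypothetical layout)."""
    resource_name: Tuple[str, str]  # switch and port
    new_state: Dict[str, int]       # value per traffic class after the change


# The two instances given in the text.
vc_event = VCSetup(
    vc_name=(12378, "10:28:33am"),
    source_address=7182,
    dest_address=5428,
    path=("SW4", "SW8", "SW7"),
    resource_consumed="Class_II",
)
link_event = LinkState(
    resource_name=("Sw7", "Port 2"),
    new_state={"Class_I": 17, "Class_II": 245, "Class_III": 43},
)

if __name__ == "__main__":
    print(vc_event)
    print(link_event)
```

Declaring the records as frozen reflects the idea that an event describes something that has already happened and is therefore not modified after its creation.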

We conceptualize the set of “current” events (we will explain later what we understand by “current”) in the network as the event base. The event base, therefore, represents the current dynamics of the network; it represents what is happening “now” in the network. Network mechanisms communicate by reading from and writing to the event base. They read to update their state, and write to publish changes resulting from network control operations.

A mechanism generally does not read the whole event base, as table 1 shows. In other words, mechanisms operate only on limited views of the event base (see figure 5), which may or may not overlap. Further, a mechanism may want to dynamically change its view of the event base (which can be interpreted as “navigation through a dynamic data space”). Note that although a mechanism-specific view covers only a part of the event base, it can still be used to update a global network view. For example, the VP connection manager, by maintaining the knowledge about all current VPs in the network, acts upon a global view.

Figure 5: Views of the Event Base. (Figure labels: view of mechanism 1, view of mechanism 2, view of mechanism 3, Event Base.)

We specify the view of a mechanism by a first-order predicate, the domain of which is given by the structure and the values of events. The view of the VC connection management module, e.g., can be defined by the predicate

    Event_Type=Link_State or Event_Type=VP_State or Event_Type=VP_Setup or Event_Type=VP_Release.

To give another example, a predicate for observing the call-level dynamics of a certain traffic class, say, the class carrying video traffic, is:

    (Event_Type=VC_Setup and Event_Instance.Resource=Class_II) or
    (Event_Type=VC_Release and Event_Instance.Resource=Class_II).
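The two predicates above can be read as boolean functions applied to each event of a stream. The sketch below shows one possible encoding; representing events as dictionaries with a `type` key and a `resource` key is an assumption made for brevity, and the function names are hypothetical.

```python
from typing import Callable, Dict, Iterable, Iterator

Event = Dict[str, object]
Predicate = Callable[[Event], bool]


def vc_manager_view(event: Event) -> bool:
    """View of the VC connection management module, as given in the text."""
    return event["type"] in {"Link_State", "VP_State", "VP_Setup", "VP_Release"}


def class_ii_call_dynamics(event: Event) -> bool:
    """Call-level dynamics of the traffic class Class_II, as given in the text."""
    return (
        event["type"] in {"VC_Setup", "VC_Release"}
        and event.get("resource") == "Class_II"
    )


def filter_events(stream: Iterable[Event], predicate: Predicate) -> Iterator[Event]:
    """Yield only those events of the stream that satisfy the predicate."""
    for event in stream:
        if predicate(event):
            yield event


# A short, made-up stream of events for demonstration purposes.
stream = [
    {"type": "VC_Setup", "resource": "Class_II"},
    {"type": "Link_State", "resource_name": ("Sw7", "Port 2")},
    {"type": "VC_Setup", "resource": "Class_I"},
    {"type": "VP_Release"},
    {"type": "VC_Release", "resource": "Class_II"},
]

vc_view = list(filter_events(stream, vc_manager_view))
video_view = list(filter_events(stream, class_ii_call_dynamics))

if __name__ == "__main__":
    print([e["type"] for e in vc_view])
    print([e["type"] for e in video_view])
```

Because a view is just a predicate, a mechanism that wants to change its view at run time only has to exchange the function it passes to the filter.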

In the case of the VC connection manager, every event related to a change in a link state may trigger a change in the internal state of this mechanism. With a management application, however, events related to link states may be abstracted to an average value over a certain period of time. Therefore, both the view of the event base and the interpretation of its events differ among network mechanisms.
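The difference in interpretation can be made concrete with a small sketch. Both functions below consume the same sequence of link state events, here simplified to hypothetical (time, load) pairs; the first keeps the most recent value, as a control mechanism might, while the second abstracts the events of a period into an average, as a management application might. The plain arithmetic mean of the reported values is an assumption; a time-weighted average would be an equally plausible abstraction.

```python
from typing import List, Optional, Tuple

# A link state event reduced to (time in seconds, reported load);
# this representation and the values are made up for illustration.
LinkEvent = Tuple[float, int]


def latest_load(events: List[LinkEvent]) -> Optional[int]:
    """Control-style interpretation: every event overwrites the state."""
    state = None
    for _, load in events:
        state = load
    return state


def average_load(events: List[LinkEvent], start: float, end: float) -> Optional[float]:
    """Management-style interpretation: mean of the loads reported in [start, end)."""
    loads = [load for time, load in events if start <= time < end]
    if not loads:
        return None
    return sum(loads) / len(loads)


link_events = [(0.0, 200), (1.5, 245), (3.0, 260), (12.0, 100)]

if __name__ == "__main__":
    print(latest_load(link_events))              # most recent reported load
    print(average_load(link_events, 0.0, 10.0))  # mean over the first 10 seconds
```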

Figure 6: The Event Base System. (Figure labels: Management Mechanisms, Real-Time Control Mechanisms, Access Point, Event Base System.)

The system that realizes the event base in a distributed environment is called the event base system (figure 6). The interface between a network mechanism and the event base system is provided by an access point. On the one hand, an access point hides the architecture and complexity of the event base system; on the other hand, it guarantees certain properties. The most important of them is the maximum delay for distributing an event within the event base system, which is expressed by a time interval ∆t. This interval defines what is meant by “the event base represents the current events of the network”, or more precisely the term “current”.

Figure 7 shows the architecture of a network mechanism interacting with the event base system, in the form of a data flow diagram. The event base system can be seen as producing a stream of events that is filtered according to a mechanism-specific predicate. The resulting substream corresponds to the desired view of the event base. It is interpreted by the network mechanism and employed for updating its internal state. Conversely, the mechanism, by performing its task, may produce events that are submitted to the event base system for distribution.

Figure 7: Architecture of a Network Mechanism interacting with the Event Base. (Figure labels: operation interface, internal state, perform operation, map between events and abstractions, events wanted, filter predicate, filter events, Event Base System.)

The event base system is used for distributing changes in the network. It thus enables sharing of dynamic information among network mechanisms. In terms of the Integrated Reference Model (figure 4), the event base system is a D-plane mechanism that makes the telebase accessible in a broadband environment.

In order to further explain the role of the event base, we split the functionality of a network mechanism into a monitoring and a control part. The former can be seen as a continuous process that updates the internal state of the mechanism periodically. The latter performs the network control operations that can be described as transactions upon the states of network mechanisms. The event base supports the monitoring-related activities of a mechanism. A control operation, on the other hand, is performed on a communication infrastructure that is different from the event base, such as a signalling network. (The outcome of such an operation, i.e., the resulting state changes, is nevertheless written to the event base, as discussed in section 4.)

Two reasons justify this specific use of the event base. First, a control operation normally is based on a one-to-one communication pattern (one sender, one recipient). Monitoring, on the other hand, is often based on a many-to-many pattern, which favors the use of a shared medium, such as the event base. Second - and this is a system design argument - monitoring and control activities have different requirements for communication support, e.g., concerning the reliability of the
transport service.

The use of the event base for monitoring implies a specific paradigm for accessing dynamic information in a distributed environment. Information is seen as a stream of data, which continuously flows towards a receiver and is filtered according to its needs. Note that no protocol is used by the receiver for accessing remote information. It does not know what data sources send the information, let alone their location in the network. We further recall that our model provides access to information not only for one mechanism, but simultaneously supports a large number of them.

A similar paradigm, based on data access by filtering a stream of information, has been developed for the Datacycle architecture, an approach to high-performance transaction processing in a network environment [14]. The chief differences between the Datacycle approach and our approach are as follows. In the former, a fixed single node sends data and all other nodes receive it, as opposed to the event base system, where every node can act as a sender and as a receiver at the same time. Furthermore, the single sender in the Datacycle architecture broadcasts the complete state of a system (in this case the current state of a relational database) in regular time intervals, whereas in our approach only state changes, i.e., events, are transmitted - immediately after they occur.

6. Discussion

In order to reduce the architectural complexity of future broadband networks, their subsystems - performing information transport, real-time control, and network management - must be integrated into a consistent framework. The approach discussed here for achieving this integration is data-centered, in the sense that network mechanisms cooperate by sharing data. They act upon a shared information repository, which consists of data items that differ in the time-scales on which their values change, and in their level of abstraction. We presented the Integrated Reference Model as an architectural framework that supports the concept of integration by sharing information.

This paper centered around a model for the integration of the management and real-time control tasks. From an architectural point of view, the chief differences between these tasks are the time-scale on which they evolve and the level of abstraction of the data on which they operate. The problem of integration thus can be reduced to the following question: How can information be shared among network mechanisms that run at different locations of the network, operate on different views of this network, and run on completely different time scales? The hard part of sharing information in a large-scale broadband environment is to make dynamic
information globally available.

We introduced the event base as a concept for sharing dynamic information. The subsystem that performs the tasks of management and real-time control in a broadband network is a high performance distributed system, based on high speed links. It runs network mechanisms that exchange dynamic information in the form of events, by asynchronously interacting with the event base. This interaction enables each mechanism to maintain its own view of the network while running on its own time-scale. For each mechanism, a predicate defines, and a filter provides, the desired fragment of the event base.

The event base conceptualizes the dynamics of the network; it is the base of the current network events. This concept can be interpreted as a new paradigm for accessing data in a distributed real-time environment. Instead of sending requests to various data sources, as is usually done, the receiver configures a filter on a dynamic data pool, as conceptualized by the event base. The event base can be seen as creating a flow of data that is filtered by a receiver according to its need. Obtaining information in a distributed environment no longer focuses on a distributed, but on a local operation, namely filtering. By analogy, we call this the disc head model.

How to realize an event base efficiently in a broadband environment remains a challenging problem. To attack it, we must determine how events should be distributed, and how they should be filtered. A straightforward approach to distributing events is to link all network nodes by a spanning tree that supports multicasting - a method that is used to broadcast updates for the topological database within the plaNET network [15,16]. As demonstrated in the Datacycle project [14], powerful hardware filters can be built today that perform filter operations, similar to what we envision for the event base, at very high speed.
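To make the spanning tree idea tangible, the sketch below builds a spanning tree over a small example topology by breadth-first search and then forwards an event from an arbitrary node along the tree edges. It is a generic illustration under our own assumptions, not the mechanism used in plaNET; the topology and all function names are hypothetical.

```python
from collections import deque
from typing import Dict, List, Set, Tuple

Graph = Dict[str, Set[str]]


def spanning_tree(adjacency: Graph, root: str) -> Graph:
    """Build a spanning tree of a connected, undirected graph by breadth-first search."""
    tree: Graph = {node: set() for node in adjacency}
    visited = {root}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for neighbour in sorted(adjacency[node]):
            if neighbour not in visited:
                visited.add(neighbour)
                tree[node].add(neighbour)
                tree[neighbour].add(node)
                queue.append(neighbour)
    return tree


def multicast(tree: Graph, origin: str) -> List[Tuple[str, str]]:
    """Forward an event from `origin` along tree edges; return the (sender, receiver) hops."""
    hops: List[Tuple[str, str]] = []
    visited = {origin}
    queue = deque([origin])
    while queue:
        node = queue.popleft()
        for neighbour in sorted(tree[node]):
            if neighbour not in visited:
                visited.add(neighbour)
                hops.append((node, neighbour))
                queue.append(neighbour)
    return hops


# A made-up four-node topology with redundant links.
network: Graph = {
    "A": {"B", "C"},
    "B": {"A", "C", "D"},
    "C": {"A", "B", "D"},
    "D": {"B", "C"},
}

tree = spanning_tree(network, "A")
hops = multicast(tree, "D")

if __name__ == "__main__":
    print(hops)
```

Since a tree over n nodes has n - 1 edges, an event created at any node reaches every other node exactly once, which is the property that makes this form of distribution attractive for events that many mechanisms read.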
Further problems that need to be investigated include aspects of congestion control and scalability.

The event base supports network monitoring in the sense that it can be used to update the state of the network as seen by a mechanism. Since only state changes are distributed, the problem arises of how a mechanism can be initialized, i.e., how its initial state can be determined efficiently. This situation occurs whenever a new mechanism is added to the network, or whenever a mechanism is initialized after a fault condition has been detected. The same problem applies to mechanisms that dynamically change the scope of network information on which they operate, i.e., their view of the network.

We believe that an architectural framework, as presented in this paper, not only must support a
high-level view of the system in mind and emphasize its fundamental concepts, but also should provide guidelines for a system design that follows these concepts. Our current work thus focuses primarily on system design aspects. We are modeling a subsystem for virtual path and virtual circuit connection management. The resulting system components will run on both an emulation environment and a real gigabit testbed, in order to validate and revise architectural design decisions. We believe that, in the current state of research, experiments will provide essential contributions to the understanding of future broadband networks with manageable complexity.

Acknowledgments

The work presented here was supported in part by the National Science Foundation under Grant # CDA-90-24735, and by the Swiss National Science Foundation.

References

[1] Henry Gilbert, Osama Aboul-Magd, Van Phung: Developing a Cohesive Traffic Management Strategy for ATM Networks, IEEE Communications Magazine, October 1991.

[2] Mitsuru Tsuchida, Aurel A. Lazar, Nikos G. Aneroussis: Structural Representation of Management and Control Information in Broadband Networks, Proceedings of the International Conference on Communications, Chicago, IL, June 1992, pp. 1019-1024.

[3] Menso Appeldorn, Roberto Kung, Roberto Saracco: TMN + IN = TINA, IEEE Communications Magazine, March 1993.

[4] William J. Barr, Trevor Boyd, Yuji Inoue: The TINA Initiative, IEEE Communications Magazine, March 1993.

[5] Jane Hall, Thomas Magedanz: Uniform Modelling of Management and Telecommunication Services in Future Telecommunication Environments based on the ROSA Approach, in: Integrated Network Management, III, H.-G. Hegering and Y. Yemini (Editors), Elsevier Science Publishers, 1993.

[6] Aurel A. Lazar: A Real-Time Management, Control, and Information Transport Architecture for Broadband Networks, Proceedings of the 1992 Zurich Seminar in Digital Communications, Zurich, Switzerland, March 16-19, 1992.

[7] Jeffrey D. Ullman: Principles of Database and Knowledge-Base Systems, Volumes I&II, Computer Science Press, 1988.

[8] James Rumbaugh, Michael Blaha, William Premerlani, Frederick Eddy, William Lorensen: Object-Oriented Modeling and Design, Prentice-Hall, 1991.

[9] ISO: ISO/IEC IS 10040: Information Processing Systems - Open System Interconnection Systems Management Overview, 1991.

[10] Bruce Murill: OMNIPoint: An Implementation Guide to Integrated Networked Information Systems Management, in: Integrated Network Management, III, H.-G. Hegering and Y. Yemini (Editors), Elsevier Science Publishers, 1993.

[11] Aurel A. Lazar: The Integration of Real-Time Control with Management in Broadband Networks, Proceedings of the Workshop on Broadband Communications, Estoril, Portugal, January 20-22, 1992.

[12] Jean-Yves Le Boudec: The Asynchronous Transfer Mode: A Tutorial, Computer Networks and ISDN Systems, No. 24, 1992.

[13] Ouri Wolfson, Soumitra Sengupta, Yechiam Yemini: Managing Communication Networks by Monitoring Databases, IEEE Transactions on SE, Vol. 17, No. 9, September 1991.

[14] T. F. Bowen, G. Gopal, G. Herman, T. Hickey, K.C. Lee, W.H. Mansfield, J. Raitz, A. Weinrib: The Datacycle Architecture, Communications of the ACM, Vol. 35, No. 12, December 1992.

[15] Inder Gopal, Roch Guerin, Jim Janniello, Vasilios Theoharakis: ATM Support in a Transparent Network, Proceedings Globecom ‘92.

[16] Israel Cidon, Inder Gopal, Marc Kaplan, Shay Kutten: Distributed Control for PARIS, Proceedings of the 9th Annual ACM Symposium on Principles of Distributed Computing, Quebec, Canada, 1990.

[17] S. Kheradpir, W. Stinson, J. Vucetic, A. Gersht: Real-Time Management of Telephone Operating Company Networks: Issues and Approaches, IEEE Journal on Selected Areas in Communications, Vol. 11, No. 7, December 1993.