MedTransact - University of Maryland

MedTransact: Transaction Support for Mediation with Remote Service Providers

Paulo F. Pires 1, 2

Louiqa Raschid 3

Abstract: Many service providers offer services on the Internet via Web-accessible servers. To develop applications effectively in this environment, it is important to provide a way to integrate these services so that they can be used to build distributed applications. A system that integrates these services must deal with the dissimilar transaction capabilities of remote servers. In this paper, we present MedTransact, a mediator system that coordinates dissimilar service capabilities of remote service providers. MedTransact supports distributed transaction semantics across the mediator and the remote service providers. We describe the MedTransact architecture and discuss how distributed transactions are supported.

Keywords: Distributed transaction management, mediator systems, transaction semantics.

1. Introduction

Wrapper mediator systems ([11],[15],[22],[29]) have been successfully developed to mediate the query capability of remote (database or non-database) sources and to provide information integration at the mediator level. Typically these projects support transactions within the mediator, but the transaction semantics are not extended to the remote services. Supporting a transaction service that extends to the remote servers is critical if we wish to implement E-business services based on wrapper mediator systems.

There has been extensive research on transaction support in distributed computing systems ([6],[7],[21],[25],[26],[27]) and in workflow management systems ([10],[14],[19],[23],[28]). While these projects address the support of distributed transactions, they do not consider mediating the service capabilities of autonomous remote service providers. This is a significant difference, since remote service providers will support dissimilar capabilities with respect to commit and abort semantics. Thus, implementing transaction semantics across the remote servers becomes much more difficult than in a scenario where all distributed components support identical transaction behavior. We note that current commercial E-business products ([13],[18]) are not able to mediate service providers with different transaction capabilities. MedTransact is an attempt to mediate dissimilar service capabilities of remote service providers, while supporting distributed transaction semantics across the mediator and the remote service providers.

In this paper, we describe a motivating example of application semantics at the mediator level. It involves a travel agent negotiating with flight, hotel and car reservation services. This scenario is typical of current online E-business services. The example illustrates the need to specify application-level semantics and to implement distributed transaction semantics across the remote service providers.
We then describe the MedTransact approach and its three main contributions to supporting distributed transactions. The first is a language to specify remote service capability and to map remote service capability to a mediator service. The second is a means to specify application semantics over the mediator services. The third is a platform to implement distributed transactions over the remote service providers. MedTransact uses the application semantics and the remote server capability to produce safe execution plans that implement transactions at the mediator level. A (limited) prototype implementation of MedTransact is being built and will be available in March 2001.

1 CNPq – Brazil grant holder.
2 Computer Science Department – COPPE, Federal University of Rio de Janeiro, Brazil. E-mail: [email protected]
3 Maryland Business School and UMIACS, University of Maryland. E-mail: [email protected]

1.1 Related work

There is much work on supporting distributed transactions across autonomous remote servers. Heterogeneous database systems and workflow systems are the main research areas that address this issue.

Work on heterogeneous distributed database systems has focused on supporting transactions in the face of incompatible concurrency control sub-systems and/or uncooperative transaction managers. A good survey of the relevant work can be found in ([5],[25]). Transaction management in these systems is performed at two levels: at a local level by the preexisting transaction managers of the local databases (LTMs), and at a global level by a global transaction manager (GTM) superimposed on them. The split of control between the GTM and the LTMs creates the need to manage both the autonomy of individual remote servers and the consistency requirements of the global transactions. Most of the work on transaction management in heterogeneous distributed databases addresses the issue of data consistency in global transactions, while preserving different aspects of remote server autonomy.

Approaches in the literature dealing with workflows [12] can be described as those based on multidatabase and extended transactions ([1],[7],[8],[30]), active database and rule-based approaches ([3],[4]), combinations of the above two [9], and office and process automation ([16],[17],[19]). Workflow systems support the specification of intra- and inter-transaction state dependencies, and correctness dependencies such as serialization, visibility, cooperation and temporal dependencies. While these works address the support of distributed transactions, they do not consider aspects of transaction support when mediating the service capabilities of autonomous remote service providers.

MedTransact exploits the transaction capabilities of remote servers and the semantic information about the application execution behavior to provide distributed transactions. This is a significant difference, since remote service providers can be integrated in the system despite their dissimilar capabilities with respect to commit and abort semantics.

2. Motivating Example

We consider a travel agency application since its semantics are well understood. A user X wants to travel from Miami to Washington D.C. to attend a conference. X wants to leave Miami on May 10 and return on May 15. X also wants to rent a car in Washington D.C. If no car is available, X wants to stay in the conference hotel. The application involves making flight, hotel, and car rental reservations. If no flight or hotel is available, the whole trip is cancelled. If a car cannot be rented, the trip can still proceed. Therefore, the application will successfully terminate if a flight and a hotel reservation occur, irrespective of the result of the car reservation.

One plan to arrange this trip is to first make the flight reservation and, if it succeeds, to make the car reservation and then the hotel reservation. This plan may take longer to return results due to its sequential execution. To improve the response time, an alternate plan is to execute the flight and car reservations in parallel, provided we can undo the car reservation, should the need arise.

The above example, while simple, must be properly specified in order to be correctly executed in an environment of remote service providers. First, we need to select the appropriate remote service providers. In this example, we need to select remote sources that provide flight, hotel and car reservation services. Next, we need to query those sources to determine the available flights, hotels and cars in accordance with some constraints. For example, we need to find flights from Miami to D.C. on May 10, and cars to rent in D.C. that are available from May 10 to May 15. There are 3 metro D.C. airports, so we must check that the car can be picked up at the particular D.C. airport where the flight arrives.

The constraints related to the flight reservation are based only on static information, i.e., information that is known when the application starts. On the other hand, the acceptable car reservations are those made at the arrival airport, whose identity is not known when the application starts. Thus, the constraints for the car reservation include both static information and dynamic information that is only known during execution. Therefore, there is an implied dependency between the car reservation and the flight reservation.

One way to implement this dependency is to first make the flight reservation and, if it commits, to make a car reservation. Suppose instead that we make these reservations in parallel. Then, when a flight reservation commits, the system must verify whether there is a car reservation that matches the flight reservation. If so, the application can successfully terminate. Otherwise, the system must continue trying to make a car reservation until an acceptable car reservation is made or there are no more car reservations. The application will successfully terminate if the flight and appropriate hotel reservations commit, and it will fail if either the flight or the hotel reservation fails. Car reservations that do not match the successful flight reservation may need to be undone. A simplified version of this example will be used in the paper.
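The matching step of the parallel plan can be illustrated with a minimal Python sketch. It is not part of MedTransact: the names `CarReservation` and `match_car_to_flight`, and the sample airport codes used with it, are hypothetical. It assumes car reservations arrive one at a time and that every reservation seen before an acceptable one must be undone.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class CarReservation:
    """Hypothetical record of a car reservation made in parallel with the flight."""
    code: str
    pickup_place: str


def match_car_to_flight(
    arrival_airport: str, cars: Iterable[CarReservation]
) -> Tuple[Optional[CarReservation], List[str]]:
    """Keep trying car reservations until one matches the arrival airport.

    Returns the accepted reservation (or None if none matched) and the codes
    of the non-matching reservations that would have to be undone.
    """
    to_undo: List[str] = []
    for car in cars:
        if car.pickup_place == arrival_airport:
            return car, to_undo      # acceptable car found: stop trying
        to_undo.append(car.code)     # does not match the committed flight
    return None, to_undo             # no more car reservations to try
```

With a flight that commits into "IAD", a reservation at "DCA" followed by one at "IAD" yields the second reservation as accepted and the first as a candidate for undoing. Whether that undo is possible depends on the abort capability of the remote car service, which is what the rest of the paper addresses.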

3. Architecture

The MedTransact system shown in Figure 1 is designed to support distributed transactions based on existing autonomous services of remote service providers. To achieve this goal the architecture must provide the following:

1) A mechanism to specify application semantics (execution behavior) at the mediator level. Applications are specified at the mediator level without knowledge of remote service providers. This feature allows the addition of new remote service providers to existing applications.
2) A mechanism to specify the services and the transaction semantics of the remote service providers, and to define the mappings from the remote service to the mediator service.
3) A mechanism to define and implement coordination among system components to support distributed transactions with autonomous service providers with dissimilar transaction semantics. Semantic information from the application (mediator level) and the transaction semantics of the remote service providers must be used to provide distributed transaction semantics at the mediator level.

Figure 1 - Mediator architecture.

3.1 Specifying Remote Services and Mediator Services

Remote Service Provider (RSP) objects specify the service description of a set of related services of a remote server. An RSP object describes each service interface and its commit and abort actions. Each RSP object is defined by an identifier rid and a set of supported services svc. Each svc is a 6-tuple (sid, input, output, req, cd, tsd), as follows:

a) sid: service name.
b) input: a set of input attributes that correspond to the bindings accepted by the service.
c) output: a set of output attributes returned by sid. The input and output describe the interface of the service sid.
d) req: required attributes, i.e., the subset of input that must be supplied to the service.
e) cd: content description, a set of tuples (att, dom), where att is an attribute of input or output, and dom is a subset of the domain of this attribute. For every attribute in input and output, there must be a tuple (att, dom) in cd.
f) tsd: transaction semantics description, a pair (abs, aba), such that:
   i) abs: a Boolean value. A value yes means that the service supports aborts.
   ii) aba: a service name representing the action that must be taken to abort the service, when abs is yes. This is a value from sid, a service name of a service provided by a known RSP. If the abort action is directly supported by the RSP, then the value of aba is NULL.

Figure 2 shows example RSP objects, and Table 1 shows their transaction semantics. The RSP object Cars1 publishes two services: reservation and cancelReservation. The transaction semantics description (tsd) of reservation indicates that it supports the abort action, and that the abort action must execute another service, cancelReservation.

Table 1 – Transaction Semantics for Flights1, Hotels1 and Cars1 RSP objects.

The RSP object Flights1 publishes two services: reservation and purchase. The transaction semantics description (tsd) of purchase indicates that there is no transaction support for abort, while the tsd of reservation indicates that there is transaction support for abort, and no additional action is needed.

The RSP object specification could be implemented using the RDF model [2]. A specification of an RSP object in RDF is given in [24].

// Remote service providers

// The remote service provider that makes car reservations.
( // RSP Name:
  Cars1;
  // Supported services:
  { (reservation, {pickupPlace, date}, {code}, {}, {}, (yes, cancelReservation(reservation.code)));
    (cancelReservation, {code}, {msg}, {}, {}, (no, null)) } )
…
( // RSP Name:
  CarsN;
  // Supported services:
  { … } )

// The remote service provider that makes flight reservations.
( // RSP Name:
  Flights1;
  // Supported services:
  { (reservation, {originAirport, destinationAirport, date}, {code, price}, {}, {}, (yes, none));
    (purchase, {code}, {msg}, {}, {}, (no, null)) } );

// The remote service provider that makes hotel reservations.
( // RSP Name:
  Hotels1;
  // Supported services:
  { (reservation, {city, category, date}, {code}, {}, {}, (no, null)) } );

Figure 2 – Specification of RSP objects.
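As an illustration only, the RSP definition and the objects of Figure 2 could be modeled in Python as follows. This is a sketch under stated assumptions, not the paper's implementation: the class names `Tsd`, `Service` and `Rsp`, the field name `abs_supported` (standing for abs) and the helper `abort_strategy` are hypothetical, and the empty req and cd values simply mirror Figure 2.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Tuple


@dataclass(frozen=True)
class Tsd:
    """Transaction semantics description (abs, aba)."""
    abs_supported: bool   # abs: yes/no
    aba: Optional[str]    # compensating service name, or None (NULL)


@dataclass(frozen=True)
class Service:
    """The 6-tuple (sid, input, output, req, cd, tsd)."""
    sid: str
    input: FrozenSet[str]
    output: FrozenSet[str]
    req: FrozenSet[str]
    cd: Tuple[Tuple[str, FrozenSet[str]], ...]
    tsd: Tsd


@dataclass(frozen=True)
class Rsp:
    rid: str
    svc: Tuple[Service, ...]

    def service(self, sid: str) -> Service:
        for s in self.svc:
            if s.sid == sid:
                return s
        raise KeyError(f"{self.rid} has no service {sid!r}")


def abort_strategy(rsp: Rsp, sid: str) -> str:
    """Hypothetical helper: summarize how a service can be aborted."""
    tsd = rsp.service(sid).tsd
    if not tsd.abs_supported:
        return "unsupported"
    return "direct" if tsd.aba is None else f"compensate:{tsd.aba}"


def fs(*names: str) -> FrozenSet[str]:
    return frozenset(names)


# The RSP objects of Figure 2 (req and cd are empty there as well).
CARS1 = Rsp("Cars1", (
    Service("reservation", fs("pickupPlace", "date"), fs("code"), fs(), (),
            Tsd(True, "cancelReservation")),
    Service("cancelReservation", fs("code"), fs("msg"), fs(), (), Tsd(False, None)),
))
FLIGHTS1 = Rsp("Flights1", (
    Service("reservation", fs("originAirport", "destinationAirport", "date"),
            fs("code", "price"), fs(), (), Tsd(True, None)),
    Service("purchase", fs("code"), fs("msg"), fs(), (), Tsd(False, None)),
))
HOTELS1 = Rsp("Hotels1", (
    Service("reservation", fs("city", "category", "date"), fs("code"), fs(), (),
            Tsd(False, None)),
))
```

Under this encoding, `abort_strategy(CARS1, "reservation")` reports a compensating service, `abort_strategy(FLIGHTS1, "reservation")` reports a directly supported abort, and `abort_strategy(HOTELS1, "reservation")` reports that no abort is available.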

Mediator Service Provider (MSP) objects specify a set of related mediator services. A mediator service does not provide a service implementation; instead, each mediator service is mapped to one or more RSP services. For simplicity, we assume that an MSP uses the same attributes and content descriptions as the RSP, so the mapping is trivial. An MSP object is defined by a tuple (mid, svc), where mid is the name of the MSP object and svc is a set of services provided by mid. Each service in svc is described by a quadruple (sid, input, output, req), where input, output and req are defined as for an RSP.

// The mediator service provider that makes flight reservations.
( // Mediator Service Name:
  MFlights;
  // Supported services:
  { (reservation, {}, {}, {});
    (purchase, {}, {}, {}) } )

Figure 3 – Specification of MSP objects.

3.2 Specifying Application Semantics

A MedTransact Application object is a program specification that is built upon services provided by Mediator Service Provider (MSP) objects. It describes interactions among its component services and the expected application transaction behavior. An Application object is described by a 7-tuple (id, smid, input, ssr, cons, prg, ms):

a) id: the name of the Application object.
b) smid: a nonempty set of MSP objects participating in the application.
c) input: a set of input attributes accepted by the application.
d) ssr: a set of services supported by the Application object. This set is a subset of the services provided by the MSP objects appearing in smid. Each service is specified by a tuple (mid, srv) such that:
   i) mid is the identification of the MSP object supporting the service.
   ii) srv is a set of services provided by the MSP appearing in mid and supported by the Application object.
e) cons: a set of constraints on service attributes. These constraints may be domain value restrictions, domain range restrictions, or a matching condition between two attribute values (join condition). Each constraint has an associated context representing when the constraint must be valid.
f) prg: a program that implements the execution flow of MSP services. We borrow ideas from [20] to describe execution flow. Suppose P and Q are two services or subprograms. Then, we can describe a program using the following three operators applied to P and Q:

P | Q          Sequential execution of P and Q
P || Q         Parallel execution of P and Q
cond 〈P, Q〉    Selection between P and Q depending on the value of the condition cond

The sequential execution operator indicates that P must precede Q. The parallel execution operator specifies that P and Q may execute in parallel. A selection of P or Q selects exactly one depending on the value of the Boolean expression cond. The scope of all constraints in cons must be specified in prg. The expression c[P] indicates that the constraint c is verified during the execution of the service or subprogram P.

g) ms: a set of mandatory services defining the termination state of the application. The application reaches a successful termination state if and only if all services contained in the set of mandatory services commit.

Figure 4 shows an Application object specification that implements the travel planning application. The TravelPlanning object utilizes three MSP objects, MFlights, MCars, and MHotels.

// Mediator: Description of the travel planning application.
TravelPlanning (
  // MSP objects:
  (MFlights F, MCars R, MHotels H);
  // Input attributes:
  {String conferenceHotel};
  // Component services:
  (F, {reservation}), (R, {reservation}), (H, {reservation});
  // Constraints:
  {c1: R.pickupPlace = F.reservation.destinationAirport,
   c2: R.reservation.code IS NULL,
   c3: H.hotelName = conferenceHotel}
  // Application execution flow:
  (c1[R.reservation] || F.reservation) | (c2 〈c3[H.reservation], H.reservation〉);
  // Mandatory services:
  (F.reservation, H.reservation)
)

Figure 4 – Specification of the travel planning application object.
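To make the three operators and the scoped constraint c[P] concrete, the execution flow of Figure 4 can be encoded as a small expression tree. The following Python sketch is illustrative only: the class names `Svc`, `Seq`, `Par` and `Sel`, and the functions `resolve` and `terminates_successfully`, are hypothetical. It assumes that cond 〈P, Q〉 selects P when cond is true, and it attaches constraints only to single services, although the paper also allows them on subprograms.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set, Tuple, Union


@dataclass(frozen=True)
class Svc:
    name: str
    constraint: Optional[str] = None   # c[P]: constraint verified while P runs


@dataclass(frozen=True)
class Seq:                             # P | Q
    first: "Prog"
    second: "Prog"


@dataclass(frozen=True)
class Par:                             # P || Q
    left: "Prog"
    right: "Prog"


@dataclass(frozen=True)
class Sel:                             # cond <P, Q>
    cond: str
    if_true: "Prog"
    if_false: "Prog"


Prog = Union[Svc, Seq, Par, Sel]


def resolve(prog: Prog, conds: Dict[str, bool]) -> List[Tuple[str, Optional[str]]]:
    """List the (service, constraint) pairs that run once selections are decided."""
    if isinstance(prog, Svc):
        return [(prog.name, prog.constraint)]
    if isinstance(prog, Seq):
        return resolve(prog.first, conds) + resolve(prog.second, conds)
    if isinstance(prog, Par):
        return resolve(prog.left, conds) + resolve(prog.right, conds)
    branch = prog.if_true if conds[prog.cond] else prog.if_false
    return resolve(branch, conds)


def terminates_successfully(committed: Set[str], ms: Set[str]) -> bool:
    """Success if and only if every mandatory service committed."""
    return ms <= committed


# (c1[R.reservation] || F.reservation) | (c2 <c3[H.reservation], H.reservation>)
TRAVEL_PLANNING = Seq(
    Par(Svc("R.reservation", "c1"), Svc("F.reservation")),
    Sel("c2", Svc("H.reservation", "c3"), Svc("H.reservation")),
)
MANDATORY = {"F.reservation", "H.reservation"}
```

When c2 holds (no car reservation code was returned), `resolve(TRAVEL_PLANNING, {"c2": True})` ends with the hotel reservation scoped by c3; when c2 does not hold, the hotel reservation runs unconstrained.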

Constraints describe dependencies among MSP objects. For example, the constraint c1 indicates a match of the attribute pickupPlace of cars and the attribute destinationAirport of flights.

The execution flow of TravelPlanning is as follows. The application starts with the parallel execution of the car and flight services. The constraint c1 must be verified after the execution of these two services, i.e., the car must be picked up at the destination airport. This is followed by the selection operator. If the car reservation fails, i.e., R.reservation.code is NULL (constraint c2), then the application executes the hotel reservation with constraint c3. Constraint c3 verifies that the hotel reservation is made only at the conference hotel. Otherwise, if the car reservation succeeds, the hotel reservation is executed without any constraint.

The mandatory services indicate that the application successfully terminates if the flight and hotel services commit, irrespective of the success of the car service.

4. MedTransact Application Execution

The execution flow of the MedTransact application object could be implemented in several ways. Thus a MedTransact application object can have multiple execution plans. The number of possible plans increases further because each MSP object could map to multiple RSP objects and services. The goal of MedTransact application planning is to generate safe plans for executing an application, and then to choose an efficient plan. A safe plan must respect the semantic knowledge of the application object and of the RSP objects.

The application object specifies execution flow, constraints and mandatory services. Each mediator object and service maps to an RSP object and service, each of which has some particular transaction capabilities. We use this knowledge to generate transaction constraints on the mediator service. These transaction constraints on mediator services are used in generating safe plans. We use an example to illustrate how a safe plan is obtained. Consider the following application execution flow:

Application execution flow: (P || Q);
Mandatory services: (Q);

The application execution flow defines the parallel execution of services P and Q, while the mandatory services specification indicates that the commit of service Q is required to successfully terminate the application. We consider the following cases:

• Suppose both P and Q were successful, or only Q was successful. Then the application is successful.
• Suppose P was successful but Q did not commit. Since Q is a mandatory service, the application is unsuccessful. Thus, P must be aborted; consequently we have a mandatory abort constraint on service P.

We therefore have the transaction constraint that service P must support mandatory abort, if P and Q are to execute in parallel. There is no transaction constraint on Q. On the other hand, if we replace the parallel execution of P and Q with the sequential execution (Q | P), then we will not have a transaction constraint on P. Suppose that the application execution behavior was (P || Q) and both P and Q were mandatory services. In this case, there is a mandatory abort transaction constraint on both P and Q.

We have only described how a safe plan is generated for a simple program. In general, a program will be decomposed into subprograms, and safe execution plans are generated for each subprogram.
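The case analysis above can be reproduced with a short Python sketch. The rule it uses is an assumption that generalizes the three cases discussed in the text and is not necessarily the planner used by MedTransact: a service needs mandatory abort support if some other mandatory service is not guaranteed to have finished before it starts. The names `Seq`, `Par`, `abort_constraints` and `is_safe` are hypothetical, and the selection operator is omitted.

```python
from dataclasses import dataclass
from typing import Dict, Set, Tuple, Union


@dataclass(frozen=True)
class Seq:                 # P | Q
    first: "Prog"
    second: "Prog"


@dataclass(frozen=True)
class Par:                 # P || Q
    left: "Prog"
    right: "Prog"


Prog = Union[str, Seq, Par]    # a leaf is a service name


def services(p: Prog) -> Set[str]:
    if isinstance(p, str):
        return {p}
    if isinstance(p, Seq):
        return services(p.first) | services(p.second)
    return services(p.left) | services(p.right)


def precedes(p: Prog) -> Set[Tuple[str, str]]:
    """Pairs (a, b) such that a always finishes before b starts."""
    if isinstance(p, str):
        return set()
    if isinstance(p, Seq):
        cross = {(a, b) for a in services(p.first) for b in services(p.second)}
        return precedes(p.first) | precedes(p.second) | cross
    return precedes(p.left) | precedes(p.right)


def abort_constraints(p: Prog, mandatory: Set[str]) -> Set[str]:
    """Services that must support abort (assumed rule, see lead-in)."""
    before = precedes(p)
    return {
        s for s in services(p)
        if any(m != s and (m, s) not in before for m in mandatory)
    }


def is_safe(p: Prog, mandatory: Set[str], supports_abort: Dict[str, bool]) -> bool:
    """A plan is treated as safe if every constrained service can be aborted."""
    return all(supports_abort.get(s, False) for s in abort_constraints(p, mandatory))
```

For (P || Q) with mandatory service Q, `abort_constraints` returns only P; for (Q | P) it returns no constraint; and for (P || Q) with both services mandatory it returns both, matching the three cases in the text. The abort capability passed to `is_safe` would come from the abs value in the tsd of the RSP service that the mediator service maps to.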

5. Conclusions

We have described MedTransact, a system designed to mediate dissimilar service capabilities of remote service providers, while supporting distributed transaction semantics across the mediator and the remote service providers. MedTransact makes use of the application transaction semantics and the description of remote service transaction capabilities to enforce distributed transactions. Future work involves remote service discovery and mapping from discovered remote services to mediator services. A prototype of MedTransact will be available in March 2001. It will be capable of generating execution plans for a (limited) application object, i.e., for limited program execution flow.

References

[1] Ansari M., et al., “Using Flexible Transactions to Support Multi-System Telecommunication Applications”, Proc. of the 18th VLDB Conference, August (1992).
[2] Brickley, D., and Guha, R.V., “Resource Description Framework (RDF) schema specification”, Technical Report http://www.w3.org/TR/PR-rdf-schema, WWW-Consortium (1999).
[3] Dayal U., Hsu M., and Ladin R., “A Transactional Model for Long-Running Activities”, Proc. of the 17th VLDB Conference, September (1991).
[4] Dayal U., Hsu M., and Ladin R., “Organizing Long-Running Activities with Triggers and Transactions”, Proc. of ACM SIGMOD Conf. on Management of Data (1990).
[5] Elmagarmid A. and Pu C., eds., “Special Issue on Heterogeneous Databases”, ACM Comp. Surveys, vol. 22, no. 3 (1990).
[6] Elmagarmid A., ed., “Transaction Models for Advanced Database Applications”, Morgan-Kaufmann, February (1992).
[7] Elmagarmid A.K., et al., “A Multidatabase Transaction Model for InterBase”, Proc. of the 16th VLDB Conference (1990).
[8] Garcia-Molina H., et al., “Coordinating Multi-transaction Activities”, Technical Report CS-TR-247-90, Princeton University, February (1990).
[9] Georgakopoulos D., et al., “Specification and Management of Extended Transactions in a Programmable Transaction Environment”, Proc. of the Intl. Conf. on Data Engineering, February (1994).
[10] Georgakopoulos, D., Hornick, M., and Sheth, A., “An overview of workflow management: from process modeling to workflow automation infrastructure”, Intl. Journal on Distributed and Parallel Databases, vol. 3, no. 2, 119-153 (1995).
[11] Haas, L., et al., “Optimizing Queries across Diverse Data Sources”, Proceedings of the VLDB Conference (1997).
[12] Hsu M., ed., “Special Issue on Workflow and Extended Transaction Systems”, vol. 16, no. 2, June (1993).
[13] IBM White Paper, “The IBM WebSphere software platform and patterns for e-business - invaluable tools for IT architects of the new economy”, http://www4.ibm.com/software/info/websphere/docs/wswhitepaper.pdf (2000).
[14] Lawrence, P., ed., “WfMC Workflow Handbook”, John Wiley & Sons Ltd (1997).
[15] Levy, A.Y., et al., “Querying Heterogeneous Information Sources Using Source Descriptions”, Proceedings of the VLDB Conference (1996).
[16] McCarthy D. and Sarin S., “Workflow and Transaction in InConcert”, In [12].
[17] Medina-Mora R., Wong H., and Flores P., “ActionWorkflow TM as the Enterprise Integration Technology”, In [12].
[18] Microsoft White Paper, “A Blueprint for Building Web Sites Using the Microsoft Windows DNA Platform”, http://www.microsoft.com/commerceserver/techres/whitepapers.asp (2000).
[19] Miller, J.A., et al., “WebWork: METEOR2’s Web-Based Workflow Management System”, Journal of Intelligent Information Systems, vol. 10, no. 2 (1998).
[20] Milner, R., “Communication and Concurrency”, International Series in Computer Science, Prentice-Hall, Englewood Cliffs, NJ (1989).
[21] Özsu, M.T., Dayal, U., and Valduriez, P., eds., “Distributed Object Management”, Morgan-Kaufmann, San Mateo, CA (1994).
[22] Papakonstantinou, Y., et al., “Describing and Using Query Capabilities of Heterogeneous Sources”, Proceedings of the VLDB Conference (1997).
[23] Paul, S., Park, E., and Chaar, J., “RainMan: a Workflow System for the Internet”, Proc. of USENIX Symp. on Internet Technologies and Systems (1997).
[24] Pires, P.F., Raschid, L., “MedTransact: Transaction Support for Mediation with Remote Service Providers”, In preparation, Technical Report, UMIACS, University of Maryland (2000).
[25] Pitoura, E., Bukhres, O.A., Elmagarmid, A.K., “Object Orientation in Multidatabase Systems”, ACM Computing Surveys, vol. 27, no. 2, 141-195 (1995).
[26] Ramamritham, K., Chrysanthis, P.K., eds., “Advances in Concurrency Control and Transaction Processing”, IEEE Computer Society Press, CA (1997).
[27] Ranno, F., Shrivastava, S.K., and Wheater, S.M., “A Language for Specifying the Composition of Reliable Distributed Applications”, The 18th International Conference on Distributed Computing Systems (ICDCS '98), Amsterdam, The Netherlands (1998).
[28] Rusinkiewicz, M., and Sheth, A., “Specification and execution of transactional workflows”, in: Modern Database Systems, ed. W. Kim, ACM Press, 592-620 (1995).
[29] Tomasic, A., et al., “Scaling access to distributed heterogeneous data sources with Disco”, IEEE Transactions on Knowledge and Data Engineering (1998).
[30] Wachter H. and Reuter A., “The ConTract Model”, Chapter 7, In [6] (1992).