Preface

Ulrike Golas
Konrad-Zuse-Zentrum für Informationstechnik Berlin, Germany
[email protected]

Thomas Soboll
Universität Salzburg, Austria
[email protected]
Category Theory is a well-known, powerful mathematical modeling language with a wide area of applications in mathematics and computer science, including especially the semantical foundations of topics in software science and development. Categorical methods are already well established for the semantical foundation of type theory (cartesian closed categories), data type specification frameworks (institutions), and graph transformation (adhesive high-level replacement categories). It is the intention of the ACCAT Workshops on Applied and Computational Category Theory to bring together leading researchers in these areas with those in software science and development, in order to transfer categorical concepts and theories in both directions. The workshop aims to be a forum for researchers and practitioners who are interested in an exchange of ideas, notions, and techniques for different applications of category theory. The originators of the ACCAT workshops were Hartmut Ehrig (Berlin) and Jochen Pfalzgraf (Salzburg); they started with the 1st ACCAT workshop as a satellite event of ETAPS 2006, the European Joint Conferences on Theory and Practice of Software. Since then, the workshop has been a yearly meeting at the ETAPS conferences. The seventh ACCAT Workshop on Applied and Computational Category Theory was held in Tallinn, Estonia, on the 1st of April 2012 as a satellite event of ETAPS 2012. Topics relevant to the scope of the workshop included (but were not restricted to):

• General Modeling Aspects
• Categorical, Algebraic, Geometric, Topological Modeling Aspects
• Petri Nets, Process Algebras, Activity Networks
• Logical Modeling, Unifying Frameworks
• Coalgebraic Methods in Systems Theory
• Generalized Automata, Categorical Methods

In contrast to previous editions, this workshop consisted of three invited talks as well as four peer-reviewed presentations.
This issue contains the full version of one of the invited talks as well as the submitted papers, which cover a wide range of applications of category theory, from model-driven engineering through transition systems in stochastic processes to transformations in M-adhesive categories. We are grateful to the external referees for their excellent work in reviewing and selecting the submitted papers. Moreover, we would like to thank the Organizing Committee of the ETAPS conference, and in particular the general chair, Tarmo Uustalu, and the workshop chair, Keiko Nakata, for the successful organization.
24th of July, 2012
Submitted to: ACCAT 2012
© U. Golas & T. Soboll
This work is licensed under the Creative Commons Attribution License.
Category Theory and Model-Driven Engineering: From Formal Semantics to Design Patterns and Beyond

Zinovy Diskin¹,²  Tom Maibaum¹
¹ Network for Engineering of Complex Software-Intensive Systems for Automotive Systems (NECSIS), McMaster University, Canada
² Generative Software Development Lab, University of Waterloo, Canada
[email protected]  [email protected]
There is a hidden intrigue in the title. CT is one of the most abstract mathematical disciplines, sometimes nicknamed "abstract nonsense". MDE is a recent trend in software development, industrially supported by standards, tools, and the status of a new "silver bullet". Surprisingly, categorical patterns turn out to be directly applicable to mathematical modeling of structures appearing in everyday MDE practice. Model merging, transformation, synchronization, and other important model management scenarios can be seen as executions of categorical specifications. Moreover, the paper aims to elucidate a claim that relationships between CT and MDE are more complex and richer than is normally assumed for "applied mathematics". CT provides a toolbox of design patterns and structural principles of real practical value for MDE. We will present examples of how an elementary categorical arrangement of a model management scenario reveals deficiencies in the architecture of modern tools automating the scenario.
Keywords: Model-driven engineering, mathematical modeling, category theory
1 Introduction
There are several well-established applications of category theory (CT) in theoretical computer science; typical examples are programming language semantics and concurrency. Modern software engineering (SE) seems to be an essentially different domain, not obviously suitable for theoretical foundations based on abstract algebra. Too much in this domain appears to be ad hoc and empirical, and the rapid progress of open source and collaborative software development, service-oriented programming, and cloud computing far outpaces their theoretical support. Model-driven (software) engineering (MDE) conforms to this description as well: the diversity of modeling languages and techniques successfully resists all attempts to classify them in a precise mathematical way, and model transformations and operations — MDE's heart and soul — are an area of diverse experimental activity based on surprisingly weak (if any) semantic foundations. In this paper we claim that a theoretical underpinning of modern SE could, quite naturally, be based on CT. The chasm between SE and CT can be bridged, and MDE appears as a "golden cut", in which an abstract view of SE realities and concrete interpretations of categorical abstractions merge together: SE → MDE ← CT. The left leg of the cospan is extensively discussed in the MDE literature (see [47] and references therein); prerequisites and challenges for building the right leg are discussed in the present paper. Moreover, we aim to elucidate a claim that relationships between CT and MDE are more complex and richer than is normally assumed for "applied mathematics": CT provides a toolbox of design patterns and principles, whose added value goes beyond such typical applications of mathematics to SE as formal semantics for a language, or formal analysis and model checking.

U. Golas, T. Soboll (Eds.): Proceedings of ACCAT 2012, EPTCS 93, 2012, pp. 1–21, doi:10.4204/EPTCS.93.1
© Z. Diskin & T. Maibaum
This work is licensed under the Creative Commons Attribution License.
Two aspects of the CT-MDE "marriage" are discussed in the paper. The first is a standard argument about the applicability of a particular mathematical theory to a particular engineering discipline. To wit, there is a mathematical framework called CT, there is an engineering domain called MDE, and we will try to justify the claim that they make a good match, in the sense that concepts developed in the former are applicable for mathematical modeling of constructs developed in the latter. What makes this standard argument exciting is that the mathematical framework in question is known to be notoriously abstract, while the engineering domain is very agile and seemingly not suitable for abstract treatment. Nevertheless, the argument lies within the boundaries of yet another instance of the evergreen story of applying mathematics to engineering problems. Below we will refer to this perspective on the issue as Aspect A. The second perspective (Aspect B) is less standard and even more interesting. It is essentially based on specific properties of categorical mathematics and on the observation that software engineering is a special kind of engineering. To wit, CT is much more than a collection of mathematical notions and techniques: CT has changed the very way we build mathematical models and reason about them; it can be seen as a toolbox of structural design patterns and the guiding principles of their application. This view of CT is sometimes called arrow thinking. On the other hand, SE in general, and MDE in particular, essentially depend on proper structuring of the universe of discourse into subuniverses, which in their turn are further structured, and so on, which finally results in tool architectures and code modularization.
Our experience and attempts to understand complex structures used in MDE have convinced us that the general ideas of arrow thinking, and general patterns and intuitions of what a healthy structure should be, turn out to be useful and beneficial for such practical concerns as tool architecture and software design. The paper is structured as follows. In Section 2 we present two very general A-type arguments that CT provides the "right" mathematical framework for SE. The second argument also gives strong prerequisites for the B-side of our story. Section 3.1 gives a brief outline of MDE, and Section 3.2 reveals the truly categorical nature of the cornerstone notions of multimodeling and intermodeling (another A-argument). In Section 4 we present two examples of categorical arrangements of model management scenarios: model merge and bidirectional update propagation. This choice is motivated by our research interests and the possibility to demonstrate the B-side of our story. In Section 5 we discuss and exemplify three ways of applying CT to MDE: understanding, design patterns for specific problems, and general design guidance at the level of tool architecture.
2 Two very general perspectives on SE and Mathematics

2.1 The plane of Software × Mathematics
The upper half of Fig. 1 presents the evolution of software engineering in a schematic way, following Mary Shaw [48] and José Fiadeiro [25]. Programming-in-the-head refers to the period when a software product could be completely designed, at least in principle, "inside the head" of one (super-intelligent) programmer, who worked like a researcher rather than as an engineer. The increasing complexity of problems addressed by software solutions (larger programs, more complex algorithms and data structures) engendered more industrially oriented/engineering views and methods (e.g., structured programming). Nevertheless, for Programming-in-the-small, the software module remained the primary goal and challenge of software development, with module interactions being simple and straightforward (e.g., procedure calls). In contrast, Programming-in-the-large marks a shift to the stage when module composition becomes the main issue, with the number of modules and the complexity of module interaction enormously increased. This tendency continued to grow and widened in scope as time went on, and
today manifests itself as Programming-in-the-world. The latter is characterized by a large, and growing, heterogeneity of modules to be composed and of methods for their composition, and by such modern technologies as service orientation, open source and collaborative software development, and cloud computing.

[Figure 1: Evolution of software (M refers to Many/Multitude). A timeline from the 1960s to the 2000s: Programming-in-the-head, Programming-in-the-small, Programming-in-the-large, Programming-in-the-world (and MDE); below it, a path from 1 through M to M^n.]

The lower part of Fig. 1 presents this picture in a very schematic way as a path from 1 to M to M^n, with M referring to multiplicity in different forms, and the degree n indicating the modern tendencies of growth in heterogeneity and complexity. MDE can be seen as a reaction to this development, a way of taming the growth of n in a systematic way. Indeed, until recently, software engineers may have felt that they could live without mathematical models: just build the software by whatever means available, check and debug it, and keep doing this throughout the software's life. (Note that the situation in classical (mechanical and electrical) engineering is essentially different: debugging, say, a bridge, would be a costly procedure, and classical engineers abandoned this approach a long time ago.) But this gift of easily built systems afforded to software engineers is rapidly degrading, as the costs of this process and the liability from getting it wrong are both growing at an enormous rate. Slightly rephrasing Dijkstra, we may say that precise modeling and specification become a matter of life and death rather than a luxury.

These considerations give us the vertical axis in Fig. 2, skipping the intermediate point. The horizontal axis represents the evolution of mathematics in a similarly simplified way. Point 1 corresponds to the modern mathematics of mathematical structures in the sense of Bourbaki: what matters is operations and relations over mathematical objects rather than their internal structure. The skipped point M corresponds to basic category theory: the internal structure of the entire mathematical structure is encapsulated, and mathematical studies focus on operations and relations over structures considered as holistic entities. The multitude of higher degree, M^∞, refers to categorical facilities for reflection (enrichment, internalization, higher dimensions), which can be applied ad infinitum, hence the ∞-degree.

[Figure 2: Software engineering and mathematics. The Math axis runs from 1 (Bourbaki: automata, modal logic, FOL) through M to M^∞ (category theory: fibrations, monads, sheaves, hyperdoctrines); the SE axis runs from Programming-in-the-small (1) to Programming-in-the-world and MDE (M^n: heterogeneous multimodeling, model management, model sync). Marked interaction points include compilers, model checking, RelDB and semantics for UML, as well as FP, concurrency and quantum computing.]

This (oversimplified) schema gives us four points of Math×SE interaction. Interaction (1,1) turned

out to be quite successful, as evidenced by such theory-based practical achievements as compilers, model checking, and relational DB theory. As for the point (1, M^n), examining the literature shows that attempts at building theoretical foundations for MDE based on classical 1-mathematics were not successful. A major reason seems clear: 1-mathematics does not provide adequate machinery for specifying and reasoning about inter-structural relationships and operations, which are at the very heart of modern software development. This point may also explain the general skepticism that a modern software engineer, and an academic teaching software engineering, feel about the practicality of using mathematics for modern software design: unfortunately, the only mathematics they know is the classical mathematics of Bourbaki and Tarski. On the other hand, we view several recent applications of categorical methods to MDE problems [2, 5, 17, 37, 21, 45, 44, 22, 19, 43, 46] as promising theoretical attempts, with great potential for practical application. This provides a firm plus for the (M^∞, M^n)-point in the plane. Moreover, as emphasized by Lawvere, the strength of CT-based modeling goes beyond modeling multi-structural aspects of the mathematical universe, and a categorical view of a single mathematical structure can be quite beneficial too. This makes the point (M^∞, 1) in the plane potentially interesting, and indeed, several successful applications at this point are listed in the figure.
2.2 Mathematical modeling of engineering artifacts: Round-tripping abstraction vs. waterfall-based abstraction
Fig. 3(a) shows a typical way of building mathematical models for mechanical and electrical engineering domains. Metamathematics (the discipline of modeling mathematical models) is not practically needed for engineering as such. The situation dramatically changes for software engineering. Indeed, category theory (CT) could be defined as a discipline for studying mathematical structures: how to specify, relate and manipulate them, and how to reason about them. In this definition, one can safely remove the adjective "mathematical" and consider CT as a mathematical theory of structures in a very broad sense. Then CT becomes directly applicable to SE, as shown in Fig. 3(b).

[Figure 3: Modeling chains. (a) The normal (waterfall) modeling chain: Physical Engineering, Physics, Math, Metamath (Logic). (b) The circular modeling chain linking Software Engineering, Computer Science, Math, and Categorical Metamath, with CT in the middle.]

Moreover, CT has actually changed the way of building mathematical structures and thinking about them, and it has found extensive and deep applications in theoretical computer science. Hence, CT can be considered as a common theoretical framework for all modeling stages in the chain (and be placed at its center). In this way, CT provides a remarkable unification for modeling activities in SE. The circular, non-linear nature of the figure also illustrates an important point about the role of CT in SE. Because software artifacts are conceptual rather than physical entities, there is potential for feedback between SE and mathematics in a way that is not possible in traditional scientific and engineering disciplines. Design patterns employed in SE can be, and have been, influenced by mathematical models of software and the way we develop them.
3 MDE and CT: an overall sketch
We begin with a rough general schema of the MDE approach to building software (Section 3.1), and then arrange this schema in categorical terms (Section 3.2).
3.1 MDE in a nutshell
The upper-left corner of Fig. 4 shows the general goal of software design: building software that correctly interacts with different subsystems of the world (shown by figures of different shapes). For example, software embedded in a car interacts with its mechanical, electrical and electronic subsystems, with the driver and passengers, and, in future car designs, with other cars on the road. These components interact between themselves, which is schematically shown by overlaps of the respective shapes. The lower-right corner of Fig. 4 shows software modularized in parallel to the physical world it should interact with.

[Figure 4: MDE, schematically. The World (upper left) is abstracted into the Modelware via Modeling; Code Generation leads from the Modelware to the Software (lower right).]

The passage from the left to the right is highly nontrivial, and this is what makes SE larger and more challenging than mere programming. An effective means to facilitate the transition is to use models — a system of syntactic objects (as a rule, diagrammatic) that serve as abstractions of the "world entities", as shown in the figure (note the links from pieces of the World to the respective parts of the Modelware). These abstractions are gradually developed and refined until finally transformed into code. The modelware universe actually consists of a series of "modelwares" — systems of requirements, analysis, and design models, with each consecutive member in the list refining the previous one and, in its own turn, encompassing several internal refinement chains. Modelware development consumes intelligence and time, but is still easier and more natural for a human than writing code; the latter is generated automatically. The main idea of MDE is that human intelligence should be used for building models rather than code. Of course, models were used for building software long before the MDE vision appeared in the market.
At that time, however, after the first version of a software product had been released, its maintenance and further evolution were conducted mainly through code, so that models quickly became outdated, degraded, and finally useless. In contrast, MDE assumes that maintenance and evolution should also go through models. No doubt some changes in the real world are much easier to incorporate immediately in the code rather than via models, but then MDE prescribes updating the models to keep them in sync with the code. In fact, code becomes just a specific model, whose only essential distinction from other models in the modelware universe is its final position in the refinement chain. Thus, the Modelware boundary in Fig. 4 should be extended to encompass the Software region too.
3.2 Modelware categorically
Consider a modelware snapshot in Fig. 4. Notice that the models as such are separate, whereas their referents overlap, that is, interact between themselves. This interaction is a fundamental feature of the real world, and to make the model universe adequate to the world, intermodel correspondences/relations must be precisely specified. (For example, the figure shows three binary relations and one ternary relation, visualized as a ternary span with a diamond head.) With reasonable modeling techniques, intermodel relations should be compatible with model structures. The modelware universe then appears as a collection of structured objects and structure-compatible mappings between them, that is, as a categorical phenomenon. In more detail, a rough categorical arrangement could be as follows.

The base universe. Models are multisorted structures whose theories are called metamodels. The latter can be seen as generalized sketches [39, 20], that is, pairs M = (G_M, C_M) with G_M a graph (or, more generally, an object of an a priori fixed presheaf topos G), and C_M a set of constraints (i.e., diagram predicates) declared over G_M. An instance of metamodel M is a pair A = (G_A, t_A) with G_A another graph (an object in G) and t_A : G_A → G_M a mapping (arrow in G), to be thought of as typing, which satisfies the constraints: A ⊨ C_M (see [20] for details). An instance mapping A → B is a graph mapping f : G_A → G_B commuting with typing: f ; t_B = t_A. This defines a category Mod(M) ⊂ G/G_M of M-instances. To deal with the heterogeneous situation of models over different metamodels, we first introduce metamodel morphisms m : M → N as sketch morphisms, i.e., graph mappings m : G_M → G_N compatible with constraints. This gives us a category of metamodels, MMod.
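To make these definitions concrete, here is a minimal executable sketch (our illustration, not the paper's formalization; graphs are finite, constraints are omitted, and the type names Person, Car, owns are hypothetical) of instances as typed graphs and of the commutativity condition f ; t_B = t_A:

```python
# A graph is a set of nodes plus edges mapping an edge id to (source, target).
# A metamodel M is a graph G_M; an instance A is a graph G_A together with
# a typing map t_A : G_A -> G_M on nodes and edges.

def is_graph_morphism(g, h, node_map, edge_map):
    """Check that (node_map, edge_map) is a graph morphism g -> h,
    i.e., that it preserves edge sources and targets."""
    for e, (s, t) in g["edges"].items():
        hs, ht = h["edges"][edge_map[e]]
        if node_map[s] != hs or node_map[t] != ht:
            return False
    return True

def is_instance_morphism(tA, tB, node_map, edge_map):
    """An instance morphism f : A -> B must commute with typing:
    f ; t_B = t_A, both on nodes and on edges."""
    ok_nodes = all(tB["nodes"][node_map[n]] == tn for n, tn in tA["nodes"].items())
    ok_edges = all(tB["edges"][edge_map[e]] == te for e, te in tA["edges"].items())
    return ok_nodes and ok_edges

# Metamodel: classes Person and Car, and an association 'owns'.
M = {"nodes": {"Person", "Car"}, "edges": {"owns": ("Person", "Car")}}

# Instance A: one person p1 owning one car c1.
GA = {"nodes": {"p1", "c1"}, "edges": {"o1": ("p1", "c1")}}
tA = {"nodes": {"p1": "Person", "c1": "Car"}, "edges": {"o1": "owns"}}

# Instance B: two persons and one car.
GB = {"nodes": {"p1", "p2", "c1"}, "edges": {"o1": ("p1", "c1")}}
tB = {"nodes": {"p1": "Person", "p2": "Person", "c1": "Car"}, "edges": {"o1": "owns"}}

# The inclusion A -> B is a graph morphism that commutes with typing.
f_nodes, f_edges = {"p1": "p1", "c1": "c1"}, {"o1": "o1"}
assert is_graph_morphism(GA, GB, f_nodes, f_edges)
assert is_instance_morphism(tA, tB, f_nodes, f_edges)
```

Constraint satisfaction A ⊨ C_M would be an extra check on top of this; the sketch only captures the typing discipline.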
Now we can merge all categories Mod(M) into one category Mod, whose objects are instances (= G-arrows) t_A : G_A → G_{M(A)}, t_B : G_B → G_{M(B)}, etc., each having its own metamodel, and whose morphisms f : A → B are pairs (f_data : G_A → G_B, f_meta : M(A) → M(B)) such that f_data ; t_B = t_A ; f_meta, i.e., commutative squares in G. Thus, Mod is a subcategory of the arrow category G^{·→·}. It can be shown that pulling back a legal instance t_B : G_B → G_N of metamodel N along a sketch morphism m : M → N results in a legal instance of M [20]. We thus have a fibration p : Mod → MMod, whose cartesian lifting is given by pullbacks.

Intermodel relations and queries. A typical intermodeling situation is when an element of one model corresponds to an element that is not immediately present in another model, but can be derived from other elements of that model by a suitable operation (a query, in database jargon) [19]. Query facilities can be modeled by a pair of monads (Q_def, Q) over the categories MMod and Mod, respectively. The first monad describes the syntax (query definitions), and the second one provides the semantics (query execution). A fundamental property of queries is that the original data are not affected: queries compute new data but do not change the original. Mathematical modeling of this property results in a number of equations, which can be summarized by saying that the monad Q is p-cartesian, i.e., the cartesian and the monad structure work in sync. It can be shown [19] that a query language (Q, Q_def) gives rise to a fibration p_Q : Mod_Q → MMod_{Q_def} between the corresponding Kleisli categories. These Kleisli categories have immediate practical interpretations. Morphisms in MMod_{Q_def} are nothing but view definitions: they map elements of the source metamodel to queries against the target one. Correspondingly, morphisms in Mod_Q are view executions, composed from query execution followed by retyping.
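The cartesian lifting by pullback can also be sketched concretely. The following toy fragment (ours, not from [20]; nodes only, edges are handled analogously, and the type names Owner, Person, Car are hypothetical) pulls back a typing t_B : G_B → G_N along a metamodel morphism m : G_M → G_N:

```python
def pullback_retype(tB, m):
    """Pull back an N-typed instance along a metamodel morphism m : M -> N.
    tB maps instance elements to N-types; m maps M-types to N-types.
    The pullback object consists of pairs (M-type, instance element) that
    agree in N; projecting to the first component retypes the result over M."""
    return {(x, b): x for x, nx in m.items() for b, nb in tB.items() if nx == nb}

# Metamodel morphism m : M -> N, mapping the M-type Owner to the N-type Person.
m = {"Owner": "Person"}

# An N-typed instance: two persons and a car.
tB = {"p1": "Person", "p2": "Person", "c1": "Car"}

# The pullback keeps exactly the elements typed in m's image, retyped over M.
tB_along_m = pullback_retype(tB, m)
assert tB_along_m == {("Owner", "p1"): "Owner", ("Owner", "p2"): "Owner"}
```

For an injective m, as here, the pullback simply restricts the instance to the part visible through M and renames its types; in the Kleisli setting, query execution would run first and this retyping second.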
The fact that the projection p_Q is a fibration implies that the view execution mechanism is compositional: execution of a composed view equals the composition of the executions. Now a correspondence between models A, B over metamodels M, N can be specified by the data shown in Fig. 5; these data consist of three components. (1) A span (m : M ⇐ MN, n : MN ⇒ N) (whose legs are Kleisli mappings) specifies a common view MN between the two metamodels. (2) Trapezoids (arrows
in Mod_Q) are produced by p_Q-cartesian "lifting", i.e., by executing the views m and n for models A and B, respectively, which results in models A_m and B_n (here and below we use the following notation: computed nodes are not framed, and computed arrows are dashed). (3) A span (p : A_m ← AB, q : AB → B_n) specifies a correspondence between the views. Note that this span is an independent modelware component and cannot be derived from models A and B.

Spans like the one in Fig. 5 integrate a collection of models into a holistic system, which we will refer to as a multimodel. Examples, details, and a precise definition of a multimodel's consistency can be found in [21].

[Figure 5: Correspondences between heterogeneous models. A Kleisli span of metamodels m : M ⇐ MN, n : MN ⇒ N, carried by graph mappings G_M ⇐ G_MN ⇒ G_N; typings t_A, t_AB, t_B; the computed views A_m and B_n; and the correspondence span p : A_m ← AB, q : AB → B_n.]

It is tempting to encapsulate spans like those in Fig. 5 as composable arrows and work with the corresponding (bi)categories of metamodels and models. Unfortunately, this would not work out because, in general, Kleisli categories are not closed under pullbacks, and it is not clear how to compose Kleisli spans. It is an important problem to overcome this obstacle and find a workable approach to Kleisli spans. Until this problem is solved, our working universe is the Kleisli category of heterogeneous models fibred over the Kleisli category of metamodels. This universe is a carrier of different operations and predicates over models, and a stage on which different modeling scenarios are played. Classification and specification of these operations and predicates, and their understanding in conventional mathematical terms, is a major task in building mathematical foundations for MDE. Algebraic patterns appear here quite naturally, and then model management scenarios can be seen as algebraic terms composed from diagram-algebra operations over models and model mappings.¹ The next section provides examples of such algebraic arrangements.
4 Model management (MMt) and algebra: Two examples
We will consider two examples of algebraic modeling of MMt scenarios: a simple one, model merging, and a more complex and challenging one, bidirectional update propagation (BX).
4.1 Model merge via colimit
Merging several interrelated models without data redundancy and loss is an important MDE scenario. Models are merged (virtually rather than physically) to check their consistency, or to extract integrated information about the system. A general schema is shown in Fig. 6. Consider first the case of several homogeneous models A, B, C, ... to be merged. The first step is to specify correspondences/relations between the models via Kleisli spans R1, R2, ..., or perhaps direct mappings like r3. The intuition of merging without data loss and redundancy (duplication of corresponding data) is precisely captured by the universal property of colimits; that is, it is reasonable to define the merge as the colimit of a diagram of models and model mappings specifying the intermodel correspondences.

¹ Note, however, that a proper categorical treatment of these operations in terms of universal constructions may not be straightforward.
If models are heterogeneous, their relations are specified as in Fig. 5. To merge, we first merge the metamodels modulo the metamodel spans. Then we can consider all models and the heads of the correspondence spans as instances of the merged metamodel, and merge the models by taking the colimit of the entire diagram in the category of instances of the merged metamodel.

[Figure 6: Model merge. Models A, B, C, ... related by correspondence spans R1, R2, ... and direct mappings like r3 are merged into M.]

An important feature of viewing model merge as described above is a clear separation of two stages of the merge process: (i) discovering and specifying intermodel correspondences (often called model matching), and (ii) merging the models modulo these correspondences. The first stage is inherently heuristic and context-dependent. It can be assisted by tools based on AI technologies, but in general user input is required for the final adjustment of the match (and, of course, to define the heuristics used by the tool). The second stage is pure algebra (colimit) and can be performed automatically. The first step may heavily depend on the domain and the application, while the second one is domain- and application-independent. However, a majority of model merge tools combine the two stages into a monolithic merge algorithm, which first somehow relates models based on a specification of conflicts between them, and then proceeds accordingly to merging. Such an approach complicates merge algorithms and makes a taxonomy of conflicts their crucial component; typical examples are [49, 42]. The cause of this deficiency is that tool builders rely on a very simple notion of model matching, which amounts to linking the-same-semantics elements in the models to be matched. However, as discussed above in Section 3.2, for an element e in model A, the the-same-semantics B-element e′ may only be indirectly present in B, i.e., e′ can be derived from other elements of B by a suitable operation (query) over B rather than being an immediate element of B.
With complex (Kleisli) matching, which allows one to link basic elements in one model with derived elements in another model, the algebraic nature of merge as such (via the colimit operation) can be restored. Indeed, it is shown in [9] that all conflicts considered in [42] can be managed via complex matching, that is, described via Kleisli spans with a suitable choice of queries, after which the merge is computed via colimit.
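For models presented as plain sets of elements, the colimit of a two-model diagram reduces to a pushout: the disjoint union with matched elements glued together. The following sketch (our illustration, with hypothetical element names; a union-find structure computes the gluing) shows merge without loss or duplication:

```python
def merge_via_pushout(A, B, corr):
    """Merge two element sets along a correspondence span: corr is a list of
    (a, b) pairs identifying a in A with b in B. The pushout is the disjoint
    union of A and B with matched elements glued -- no loss, no duplication."""
    parent = {}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    def union(x, y):
        parent[find(x)] = find(y)

    # Disjoint union: tag each element with its model of origin.
    elems = [("A", a) for a in sorted(A)] + [("B", b) for b in sorted(B)]
    for e in elems:
        parent[e] = e
    # Glue matched elements.
    for a, b in corr:
        union(("A", a), ("B", b))
    # Each equivalence class is one element of the merged model.
    classes = {}
    for e in elems:
        classes.setdefault(find(e), []).append(e)
    return list(classes.values())

A = {"Person", "Car"}
B = {"Human", "Car", "Wheel"}
# Matching: Person ~ Human, Car ~ Car.
merged = merge_via_pushout(A, B, [("Person", "Human"), ("Car", "Car")])
assert len(merged) == 3  # Person/Human glued, Car/Car glued, Wheel kept separate
```

With complex (Kleisli) matching, the correspondence pairs would link elements of A to *derived* elements of B; the gluing step itself stays exactly the same.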
4.2 Bidirectional update propagation (BX)
Keeping a system of models mutually consistent (model synchronization) is vital for model-driven engineering. In a typical scenario, given a pair of interrelated models, changes in either of them are to be propagated to the other to restore consistency. This setting is often referred to as bidirectional model transformation (BX) [6].

4.2.1 BX via tile algebra
A simple BX scenario is presented in Fig. 7(a). Two models, A and B, are interrelated by some correspondence specification r (think of a span in a suitable category, or an object in a suitable comma category; see [21] for examples). We will often refer to such correspondences as horizontal deltas between models. In addition, there is a notion of delta consistency (extensionally, a class of consistent deltas), and if r is consistent, we call models A and B synchronized. Now suppose that (the state of) model B has changed: the updated (state of the) model is B′, and the arrow b denotes the correspondence between B and B′ (a vertical delta). The reader may think of a span whose head consists of the unchanged elements and whose legs are injections, so that B's elements beyond the
range of the upper leg are deleted, and B′'s elements beyond the range of the lower leg are inserted. Although update spans are denoted by bidirectional arrows, the upper node is always the source, and the lower one is the target. Suppose that we can realign models A and B′ and compute a new horizontal delta r ∗ b (think of a span composition). If this new delta is not consistent, we need to update model A so that the updated model A′ is in sync with B′. More accurately, we are looking for an update a : A ↔ A′ such that the triple (A′, r′, B′) is consistent. Of course, we want to find a minimal update a (with the biggest head) that does the job.

[Figure 7: A BX scenario specified in (a) a delta-based and (b) a state-based way. Squares over models A, B, A′, B′, A″, B″ with horizontal deltas r, r′, r″, vertical updates a, b, a′, b′, and the operations 1:bPpg and 2:fPpg.]

Unfortunately, in a majority of practically interesting situations, the minimality condition is not strong enough to provide uniqueness of a. To achieve uniqueness, some update propagation policy is to be chosen, and then we have an algebraic operation bPpg ('b' stands for 'backward'), which, from a given pair of arrows (b, r) connected as shown in the figure, computes another pair (a, r′) connected with (b, r) as shown in the figure. Thus, a propagation policy is algebraically modeled by a diagram operation whose arity is specified by the upper square in Fig. 7(a): shaded elements denote the input data, whereas blank ones are the output. Analogously, choosing a forward update propagation policy (from the A-side to the B-side) provides a forward operation fPpg, as shown by the lower square. The entire scenario is a composition of two operations: part of the input for operation application 2:fPpg is provided by the output of 1:bPpg. In general, composition of diagram operations, i.e., operations acting upon configurations of arrows (diagrams), amounts to their tiling, as shown in the figure; complex synchronization scenarios then become tiled structures.
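As a toy illustration of one possible backward propagation policy (our own simplification, not the paper's formal definition), take models to be dictionaries, horizontal deltas to be key correspondences, and vertical deltas to be explicit change records:

```python
def bPpg(A, r, b):
    """A toy backward propagation policy. Models are dicts; the horizontal
    delta r links A-keys to B-keys; the vertical delta b records B's update
    as {'deleted': set_of_B_keys, 'changed': {B_key: new_value}}.
    Returns the propagated update a, the updated model A', and the new
    correspondence r'."""
    # Translate B's changes through the correspondence r.
    deleted = {ak for ak, bk in r.items() if bk in b["deleted"]}
    changed = {ak: b["changed"][bk] for ak, bk in r.items() if bk in b["changed"]}
    a = {"deleted": deleted, "changed": changed}
    # Apply the propagated update; A-private data stays untouched (minimality).
    A2 = {k: changed.get(k, v) for k, v in A.items() if k not in deleted}
    r2 = {ak: bk for ak, bk in r.items() if ak not in deleted}
    return a, A2, r2

A = {"name": "Ann", "dept": "R&D"}
r = {"name": "fullName"}          # only 'name' is shared with the B-side
b = {"deleted": set(), "changed": {"fullName": "Anna"}}

a, A2, r2 = bPpg(A, r, b)
assert A2 == {"name": "Anna", "dept": "R&D"}  # shared data updated, private data kept
```

Note that the vertical delta b is an explicit input, so no re-alignment is needed inside the operation; this is precisely the point of the delta-based signature, and the output (a, r′) can in turn feed a subsequent fPpg application, as in the tiling above.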
Details, precise definitions and examples can be found in [15].
Different diagram operations involved in model synchronization are not independent, and their interaction must satisfy certain conditions. These conditions capture the semantics of synchronization procedures, and their understanding is important for the user of synchronization tools: it helps to avoid surprises when automatic synchronization steps in. Fortunately, the principal conditions (synchronization laws) can be formulated as universally valid equations between diagrammatic terms — a tile algebra counterpart of universal algebraic identities. In this way BX becomes based on an algebraic theory: a signature of diagram operations and a number of equational laws they must satisfy. The appendix presents one such theory — the notion of a symmetric delta lens, which is currently an area of active research from both a practical and a theoretical perspective.

4.2.2 BX: delta-based vs. state-based
As mentioned above, understanding the semantics of model synchronization procedures is important, both theoretically and practically. Synchronization tools are normally built on some underlying algebraic theory [28, 53, 40, 4, 1, 41, 30], and many such tools (the first five amongst those cited above) use algebraic theories based on state-based rather than delta-based operations. The state-based version of the propagation scenario in Fig. 7(a) is described in Fig. 7(b). The backward propagation operation takes models A, B, B′, computes the necessary relations between them (r and b on the adjacent diagram), and then
Category Theory and Model Driven Engineering
computes an updated model A′. The two-chevron symbol reminds us that the operation actually consists of two stages: model alignment (computing r and b) and update propagation as such. The state-based frameworks, although they may look simpler, actually hide several serious deficiencies. Model alignment is a difficult task that requires contextual information about models. It can be facilitated by intelligent AI-based tools, or even be automated, but the user should have the option to step in and administer corrections. In this sense, model alignment is similar to model matching preceding model merge.² Weaving alignment (delta discovery) into update (delta) propagation essentially complicates the semantics of the latter, and correspondingly complicates the algebraic theory. In addition, the user does not have access to alignment results and cannot correct them.
Two other serious problems of the state-based frameworks and architectures are related to operation composition. The scenario described in Fig. 7(a) assumes that the model correspondence (delta) used for update propagation 2:fPpg is the delta computed by operation 1:bPpg; this is explicitly specified in the tile algebra specification of the scenario. In contrast, the state-based framework cannot capture this requirement. A similar problem appears when we sequentially compose a BX program synchronizing models A and B and another program synchronizing models B and C: composition amounts to horizontal composition of propagation operations as shown in Fig. 8, and again continuity, b1 = b2, cannot be specified in the state-based framework. A detailed discussion of delta- vs. state-based synchronization can be found in [22, 10].

[Figure 8: State-based BX: erroneous horizontal composition]
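The architectural difference can be sketched in code, again under toy assumptions of ours (set-based models, a naive name-based matcher): the delta-based operation takes r and b as explicit, correctable inputs, whereas the state-based "two-chevron" operation computes them internally and hides them from the user.

```python
def align_h(A, B):
    # Heuristic horizontal alignment: naive name-based matching (an
    # illustrative stand-in for a real, context-aware matcher).
    return {(x, x) for x in A & B}

def align_v(B, B_new):
    # Vertical alignment: recover the update as (deleted, inserted) sets.
    return (B - B_new, B_new - B)

def bPpg_delta(A, r, b):
    # Delta-based core: deltas are explicit inputs, so the user could have
    # inspected and corrected r and b before propagation.
    deleted_B, _inserted_B = b
    deleted_A = {x for (x, y) in r if y in deleted_B}
    return A - deleted_A

def bPpg_state(A, B, B_new):
    # State-based operation: alignment is woven inside; its results (r, b)
    # never become visible, so they cannot be corrected by the user.
    r = align_h(A, B)
    b = align_v(B, B_new)
    return bPpg_delta(A, r, b)
```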
4.2.3 Assembling model transformations
Suppose M, N are two metamodels, and we need to transform M-instances (models) into N-ones. Such a transformation makes sense if the metamodels are somehow related, and we suppose that their relationship is specified by a span (m : M ⇐ MN, n : MN ⇒ N) (Fig. 9), whose legs are Kleisli mappings of the respective query monad. Now the N-translation of an M-model A can be done in two steps. First, view m is executed (via its Cartesian lifting, actually going down in the figure), and we obtain a Kleisli arrow mA : A ⇐ R (with R = Am). Next we need to find an N-model B such that its view along n, Bn, is equal to R. In other words, given a view, we are looking for a source providing this view. There are many such sources, and to achieve uniqueness, we need to choose some policy. Afterwards, we compute model B related to A by the span (mA, nB). If model A is updated to A′, it is reasonable to compute a corresponding update b : B ↔ B′ rather than recompute B′ from scratch (recall that models can contain thousands of elements).

[Figure 9: Model transformation via Get-Put decomposition]

² A difference is that model matching usually refers to relating independently developed models, while models to be aligned are often connected by a given transformation.

Computing b again consists of two steps shown in the figure. Operations Getm and Putn are similar to fPpg and bPpg considered above, but work in the asymmetric situation where mappings m and n are total (Kleisli) functions and hence the view R contains nothing new w.r.t. M and N. Because of the asymmetry, the operations Get ('get' the view update) and Put ('put' it back to
Z. Diskin & T. Maibaum
the source) are different. Getm is uniquely determined by the view definition m; Putn needs, in addition to n, some update propagation policy. After the latter is chosen, we can realize the transformation from M to N incrementally by the composition fPpg = Getm ; Putn — this is an imprecise linear notation for the tiling (composition of diagram operations) specified in Fig. 9. Note that the initial transformation from M to N, sending, first, an M-instance A to its view R = Am, and then finding an N-instance B ∈ N such that Bn = R, can also be captured by Get and Put. For this, we need to postulate initial objects ΩM and ΩN in the categories of M- and N-instances, so that for any A over M and B over N there are unique updates 0A : ΩM → A and 0B : ΩN → B. Moreover, there is a unique span (mΩ : ΩM ⇐ ΩMN , nΩ : ΩMN ⇒ ΩN) relating these initial objects. Now, given a model A, model B can be computed as B′ in Fig. 9 with the upper span being (mΩ, nΩ) and the update a being 0A : ΩM → A. The backward transformation is defined similarly by swapping the roles of m and n: bPpg = Getn ; Putm. The schema described above can be seen as a general pattern for defining model transformations declaratively, with all the benefits (and all the pains) of having a precise specification before the implementation is approached (and which the implementation must obey). Moreover, this schema can provide some semantic guarantees in the following way. Within the tile algebra framework, laws for the operations Get and Put, and their interaction (invertibility), can be precisely specified [22] (see also the discussion in Section 5.1); algebras of this theory are called delta lenses. Then we can deduce the laws for the composed operations fPpg and bPpg from the delta lens laws. Also, the operations Getm, Putm can themselves be composed from smaller blocks, if the view m is composed: m = m1 ; m2 ; ...; mk, via sequential lens composition.
In this way, a complex model transformation is assembled from elementary transformation blocks, and its important semantic properties are guaranteed. More examples and details can be found in [15].
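A minimal sketch of this assembly, under toy assumptions of ours (models as dicts, both legs projecting a shared vocabulary, and a Put policy that keeps the target's private part; all names are illustrative):

```python
SHARED = {"id", "label"}  # the shared vocabulary captured by the middle MN

def get_m(A):
    # Execute the view m: project the M-model onto the shared vocabulary.
    return {k: v for k, v in A.items() if k in SHARED}

def put_n(R, B):
    # Complete the view R back into an N-model; the policy (an assumption
    # of this sketch) reuses B's private, non-view part unchanged.
    private = {k: v for k, v in B.items() if k not in SHARED}
    return {**private, **R}

def fPpg(A_new, B_old):
    # The incremental forward transformation assembled from the two blocks:
    # fPpg = get_m ; put_n.
    return put_n(get_m(A_new), B_old)
```

For instance, propagating A_new = {"id": 1, "label": "x", "m_only": 0} into B_old = {"id": 0, "label": "old", "n_only": 9} updates B's shared part while keeping its private key n_only.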
5
Applying CT to MDE: Examples and Discussion.
We will try to exemplify and discuss three ways in which CT can be applied in MDE. The first one — gaining a deeper understanding of an engineering problem — is standard, and appears as a particular instantiation of the general case of CT's employment in applied domains. The other two are specific to SE: structural patterns provided by categorical models of the software system to be built can directly influence the design. We will use models of BX as our main benchmark; other examples will also be used when appropriate. 5.1 Deeper understanding. As mentioned in Sect. 4.2, stating the algebraic laws that BX procedures must obey is practically important, as it provides semantic guarantees for synchronization procedures. Moreover, the formulation of these laws should be semantically transparent and concise, as the user of synchronization tools needs a clear understanding of propagation semantics. The original state-based theory of asymmetric BX [28] considered two groups of laws: invertibility (or round-tripping) laws, GetPut and PutGet, and history ignorance, PutPut. The former two laws say that the two propagation operations, Get and Put, are mutually inverse. The PutPut law says that if a complex update is decomposed into consecutive pieces, it can be propagated incrementally, one piece after the other. A two-sorted algebra comprising two operations, Get and Put, satisfying the laws is called a well-behaved lens. Even an immediate arrow-based generalization of lenses to delta lenses (treated in elementary terms via tile algebra [15, 22]) revealed that the GetPut law is a simple law of identity propagation, IdPut, rather than of round-tripping. The benefits of renaming GetPut as IdPut are not exhausted by this clarification of semantics: as soon as we understand that the original GetPut is about identity propagation, we at
once ask what the real round-tripping law GetPut should be, and at once see that operation Put is not the inverse of Get. We only have the weaker 1.5-round-tripping GetPutGet law (or weak invertibility; see the Appendix, where the laws in question are named IdPpg, fbfPpg and bfbPpg). It is interesting (and remarkable) that papers [14, 31], in which symmetric lenses are studied in the state-based setting, mistakenly consider identity propagation laws as round-tripping laws, and correspondingly analyze a rather poor BX structure without real round-tripping laws at all. The tile algebra formulation of the PutPut law clarified its meaning as a composition preservation law [15, 22], but did not solve the enigmatic PutPut problem. The point is that PutPut does not hold in numerous practically interesting situations, but its entire removal from the list of BX laws is also not satisfactory, as it leaves propagation procedures without any constraints on their compositionality. The problem was solved, or at least essentially advanced, by a truly categorical analysis performed by Michael Johnson et al. [35, 34]. They have shown that an asymmetric well-behaved lens is an algebra for some KZ monad, and PutPut is nothing but the basic associativity condition for this algebra. Hence, as Johnson and Rosebrugh write in [34], the status of PutPut changes from being (a) "some law that may have arisen from some special applications and should be discarded immediately if it seems not to apply in a new application" to (b) a basic requirement of an otherwise adequate and general mathematical model. And indeed, Johnson and Rosebrugh have found a weaker — monotonic — version of PutPut (see Fig. 13 in the Appendix), which holds in a majority of practical applications, including those where the original (non-monotonic or mixed) PutPut fails. Hopefully, this categorical analysis can be generalized to the symmetric lens case, thus establishing solid mathematical foundations for BX.
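The laws can be made concrete with a toy lens (state-based for brevity; dict source, key-projection view, and a constant-complement Put policy). The example and all names are ours, not from the cited papers.

```python
VIEW_KEYS = {"name", "salary"}  # the part of the source exposed by the view

def get(s):
    # Get: project the source dict onto the view keys.
    return {k: v for k, v in s.items() if k in VIEW_KEYS}

def put(v, s):
    # Put: keep the hidden complement of s, overwrite the visible part by v
    # (the "constant complement" policy assumed by this sketch).
    hidden = {k: x for k, x in s.items() if k not in VIEW_KEYS}
    return {**hidden, **v}

s = {"name": "Ann", "salary": 10, "dept": "R&D"}
v = {"name": "Ann", "salary": 12}
v1 = {"name": "Ann", "salary": 11}

assert put(get(s), s) == s              # GetPut: identity propagation (IdPut)
assert get(put(v, s)) == v              # PutGet: the written view "sticks"
assert put(v, put(v1, s)) == put(v, s)  # PutPut: incremental == one-shot
```

For this particular lens all three laws hold; the text above explains why PutPut, unlike the other two, fails for many practically interesting policies.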
5.2 Design patterns for specific problems. Recalling Figure 3, Figure 10 presents a rough illustration of how mathematical models can reshape our view of a domain or construct X. Building a well-structured mathematical model M of X, and then reinterpreting it back to X, can change our view of the latter, as schematically shown in the figure with the reshaped construct X′. Note the discrepancy between the reshaped X′ and model M: the upper-left block is missing from X′. If X is a piece of reality (think of mathematical modeling of physical phenomena), this discrepancy means, most probably, that the model is not adequate (or, perhaps, some piece of X is not observable). If X is a piece of software, the discrepancy may point to a deficiency of the design, which can be fixed by redesigning the software. It is even better to base software design on a well-structured model from the very beginning. Then we say that model M provides a design pattern for X.

[Figure 10: From mathematical models to design patterns]

We have found several such cases in our work with categorical modeling of MDE constructs. For example, the notion of a jointly-monic n-ary arrow span turns out to be crucial for modeling associations between object classes, and for their correct implementation as well [17]. It is interesting to observe how a simple arrow arrangement allows one to clean the UML metamodel and essentially simplify notation [12, 8]. Another example is modeling inter-model mappings by Kleisli morphisms, which provide a universal pattern for model matching (a.k.a. alignment) and greatly simplify model merge, as discussed in Sect. 4.1. In addition, the Kleisli view of model mappings provides a design pattern for mapping composition — a problem considered to be difficult in the model management literature [3]. Sequential
composition of symmetric delta lenses is also not evident; considering such lenses as algebras whose carriers are profunctors (see the Appendix) suggests a precise pattern to be checked (this work is now in progress). Decomposition of a model transformation into Cartesian lifting (view execution) followed by the inverse operation of Cartesian lifting completion (view updating), as described in Section 4.2.3, provides useful guidance for model transformation design, known to be laborious and error-prone. In particular, it immediately provides bidirectionality. The graph transformation community has also developed several general patterns applicable to MDE (with models considered as typed attributed graphs, see [23] for details). In particular, an industrial standard for model transformation, QVT [41], was essentially influenced by triple graph grammars (TGGs). Some applications of TGGs to model synchronization (and further references) can be found in [30].
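The Kleisli-composition pattern behind these mappings can be sketched with a stand-in monad. We use the powerset (nondeterminism) monad as a toy substitute for a query monad: a Kleisli mapping sends an element not to a single element but to a derived set of them. All names here are illustrative assumptions, not the paper's formal machinery.

```python
def unit(x):
    # Monad unit: wrap a plain element as a trivial "derived" set.
    return {x}

def bind(xs, f):
    # Monad bind: apply the Kleisli mapping f elementwise and flatten.
    return {y for x in xs for y in f(x)}

def kleisli(f, g):
    # Kleisli composition f ;; g : first f, then g "under the monad".
    # This is the composition pattern for mappings whose targets are derived.
    return lambda x: bind(f(x), g)
```

For example, with f = lambda n: {n, n + 1} and g = lambda n: {2 * n}, kleisli(f, g)(1) yields {2, 4}.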
5.3 Diagrammatic modeling culture and tool architecture. The design patterns mentioned above are based on the respective categorical machinery (monads, fibrations, profunctors). A software engineer not familiar with these patterns would hardly recognize them in the arrays of implementation details. Even less probable is that he will abstract away his implementation concerns and reinvent such patterns from scratch; distillation of these structures by the CT community took a good amount of time. In contrast, simple arrow diagrams, like those in Fig. 7(a) (see also the Appendix), do not actually need any knowledge of CT: all that is required is making inter-model relations explicit, and denoting them by arcs (directed or undirected) connecting the respective objects. To a lesser extent, this also holds for the model transformation decomposition in Fig. 9 and the model merge pattern in Fig. 6. We say to a lesser extent because the former pattern still needs familiarity with the relations-are-spans idea, and the latter needs an understanding of what a colimit is (but, seemingly, it should be enough to understand it roughly as some algebraic procedure for "merging things"). The importance of mappings between models/software artifacts is now well recognized in many communities within SE, and graphical notations have been employed in SE for a long time. Nevertheless, a majority of model management tools neglect the primary status of model mappings: in their architecture, model matching and alignment are hidden inside (implementations of) algebraic routines, thus complicating both the semantics and the implementation of the latter; concerns are intricately mixed rather than separated. As all SE textbooks and authorities claim separation of concerns to be a fundamental principle of software design, this evident violation of the principle in the cases mentioned above is an empirical fact that puzzles us.
It is not clear why a BX-tool designer working on tool architecture does not consider simple arrow diagrams like the one in Fig. 7(a), and prefers discrete diagrams (b). The latter are, of course, simpler, but their simplicity is deceiving in an almost evident way. The only explanation we have found is that understanding the deceiving simplicity of discrete diagrams (b), and, simultaneously, the manageability of arrow diagrams (a), needs a special diagrammatic modeling culture that a software engineer normally does not possess. This is the culture of elementary arrow thinking, which covers the most basic aspects of manipulating and using arrow diagrams. It appears that even elementary arrow thinking habits are not cultivated in the current SE curriculum, the corresponding high-level specification patterns are missing from the software designer's toolkit, and software is often structured and modularized according to implementation rather than specification concerns.
6
Related work
First applications of CT in computer science, and the general claim of CT's extreme usefulness for computer applications, should of course be attributed to Joseph Goguen [29]. The shift from modeling the semantics of computation (behavior) to modeling the structures of software programs is emphasized by José Fiadeiro in the introduction to his book [36], where he refers to a common "social" nature of both domains. The ideas put forward by Fiadeiro were directly derived from joint work with Tom Maibaum on what has become known as component-based design and software architecture [26, 27, 24]. A clear visualization of these ideas by Fig. 2 (with M standing for Fiadeiro's "social") seems to be new. The idea of the round-tripping modeling chain of Fig. 3 appears to be novel, although its origin can be traced to [11]. Don Batory makes an explicit call for using CT in MDE in his invited lecture for MoDELS'2008 [2], but he employs only the most basic categorical means, in fact arrow composition alone. In our paper we refer to much more advanced categorical means: sketches, fibrations, Cartesian monads, Kleisli categories. Generalized sketches (graphs with diagram predicates) as a universal syntactical machinery for formalizing different kinds of models were proposed by Diskin et al. [18]. Their application to special MDE problems can be found in [17, 38] and in the work of Rutle et al., see [46, 43] and references therein. A specific kind of sketches, ER-sketches, is employed for a number of problems in the database context by Johnson et al. [32]. Considering models as typed attributed graphs with applications to MDE has been extensively put forward by the graph transformation (GT) community [23]; their work is much more operationally oriented than our concerns in the present paper. On the other hand, in contrast to the generalized sketches framework, constraints do not seem to be first-class citizens in the GT world.
The shift from functorial to fibrational semantics for sketches, to capture the metamodeling foundations of MDE, was proposed in [13] and formalized in [20]. This semantics is heavily used in [15], and in the work of Rutle et al. mentioned above. A comparison of the two semantic approaches, functorial and fibrational, and the challenges of proving their equivalence, are discussed in [52]. The idea of modeling query languages by monads, and metamodel (or data schema) mappings by Kleisli mappings, within the functorial semantics approach, was proposed in [16], and independently by Johnson and Rosebrugh in their work on ER-sketches [32]. A reformulation of the idea for fibrational semantics was developed and used for specifying important MDE constructs in [15, 21]. An accurate formalization via Cartesian monads can be found in [19]. Algebraic foundations for BX are now an area of active research. Basics of the state-based algebraic framework (lenses) were developed by Pierce with co-authors [28]; their application to MDE is due to Stevens [50]. Delta lenses [22, 10] are a step towards categorical foundations, but they have been described in elementary terms using tile algebra [15]. A categorical approach to the view update problem has been developed by Johnson and Rosebrugh [33], and extended to categorical foundations for lenses based on KZ-monads in [35, 34]. The notion of symmetric delta lens in the Appendix is new; it results from incorporating the monotonic PutPut-law idea of Johnson and Rosebrugh into the earlier notion of symmetric delta lens [10]. Assembling synchronization procedures from elementary blocks is discussed in [15].
7
Conclusion
The paper claims that category theory is a good choice for building mathematical foundations for MDE. We first discuss two very general prerequisites that concepts and structures developed in category theory must satisfy to be well applicable to the mathematical modeling of MDE constructs. We then exemplify the arguments by sketching several categorical models, which range from general definitions of multimodeling and intermodeling to important model management scenarios of model merge and bidirectional update propagation. We briefly explain (and refer to other work for the relevant details) that these categorical models provide useful design patterns and guidance for several problems considered to be difficult. Moreover, even an elementary arrow arrangement of model merge and BX scenarios makes explicit a deficiency of the modern tools automating these scenarios. To wit: these tools' architecture weaves rather than separates such different concerns as (i) model matching and alignment, based on heuristics and contextual information, and (ii) the relatively simple algebraic routines of merging and update propagation. This weaving complicates both the semantics and the implementation of the algebraic procedures, does not allow the user to correct alignment if necessary, and makes tools much less flexible. It appears that even simple arrow patterns, and the corresponding structural decisions, may not be evident to a modern software engineer. Introduction of CT courses into the SE curriculum, especially in the MDE context, would be the most natural approach to this problem: even elementary CT studies should cultivate arrow thinking, develop habits of diagrammatic reasoning, and build a specific intuition of what a healthy vs. an ill-formed structure is. We believe that such intuition, and the structural lessons one can learn from CT, are of direct relevance for many practical problems in MDE.

Acknowledgement. Ideas presented in the paper have been discussed at NECSIS seminars at McMaster and the University of Waterloo; we are grateful to all their participants, and especially to Michał Antkiewicz and Krzysztof Czarnecki, for useful feedback and stimulating criticism.
We have also benefited greatly from discussions with Don Batory, Steve Easterbrook, José Fiadeiro, Michael Healy, Michael Johnson, Ralf Lämmel, Bran Selic and Uwe Wolter. Thanks also go to the anonymous referees for their comments. Financial support was provided through the NECSIS project, funded by the Automotive Partnership Canada.
References

[1] Michał Antkiewicz, Krzysztof Czarnecki & Matthew Stephan (2009): Engineering of Framework-Specific Modeling Languages. IEEE Trans. Software Eng. 35(6), pp. 795–824. Available at http://doi.ieeecomputersociety.org/10.1109/TSE.2009.30.
[2] Don S. Batory, Maider Azanza & João Saraiva (2008): The Objects and Arrows of Computational Design. In Czarnecki et al. [7], pp. 1–20. Available at http://dx.doi.org/10.1007/978-3-540-87875-9_1.
[3] Philip A. Bernstein (2003): Applying Model Management to Classical Meta Data Problems. In: CIDR. Available at http://www-db.cs.wisc.edu/cidr/cidr2003/program/p19.pdf.
[4] Aaron Bohannon, J. Nathan Foster, Benjamin C. Pierce, Alexandre Pilkiewicz & Alan Schmitt (2008): Boomerang: resourceful lenses for string data. In: Proceedings of the 35th annual ACM SIGPLAN-SIGACT symposium on Principles of programming languages, POPL '08, ACM, New York, NY, USA, pp. 407–419. Available at http://dx.doi.org/10.1145/1328438.1328487.
[5] Artur Boronat, Alexander Knapp, José Meseguer & Martin Wirsing (2008): What Is a Multimodeling Language? In Andrea Corradini & Ugo Montanari, editors: WADT, Lecture Notes in Computer Science 5486, Springer, pp. 71–87. Available at http://dx.doi.org/10.1007/978-3-642-03429-9_6.
[6] Krzysztof Czarnecki, J. Nathan Foster, Zhenjiang Hu, Ralf Lämmel, Andy Schürr & James F. Terwilliger (2009): Bidirectional Transformations: A Cross-Discipline Perspective. In Richard F. Paige, editor: ICMT, Lecture Notes in Computer Science 5563, Springer, pp. 260–283. Available at http://dx.doi.org/10.1007/978-3-642-02408-5_19.
[7] Krzysztof Czarnecki, Ileana Ober, Jean-Michel Bruel, Axel Uhl & Markus Völter, editors (2008): Model Driven Engineering Languages and Systems, 11th International Conference, MoDELS 2008, Toulouse, France, September 28 – October 3, 2008. Proceedings. Lecture Notes in Computer Science 5301, Springer. Available at http://dx.doi.org/10.1007/978-3-540-87875-9.
[8] Z. Diskin (2007): Mappings, maps, atlases and tables: A formal semantics for associations in UML2. Technical Report CSRG-566, University of Toronto. http://ftp.cs.toronto.edu/pub/reports/csrg/566/TR566umlAssons.pdf.
[9] Z. Diskin, S. Easterbrook & R. Miller (2008): Integrating schema integration frameworks, algebraically. Technical Report CSRG-583, University of Toronto. http://ftp.cs.toronto.edu/pub/reports/csrg/583/TR583schemaIntegr.pdf.
[10] Z. Diskin, Y. Xiong, K. Czarnecki, H. Ehrig, F. Hermann & F. Orejas (2011): From State- to Delta-Based Bidirectional Model Transformations: The Symmetric Case. In Whittle et al. [51], pp. 304–318. Available at http://dx.doi.org/10.1007/978-3-642-24485-8_22.
[11] Zinovy Diskin (2001): On Modeling, Mathematics, Category Theory and RM-ODP. In José A. Moinhos Cordeiro & Haim Kilov, editors: WOODPECKER, ICEIS Press, pp. 38–54.
[12] Zinovy Diskin (2002): Visualization vs. Specification in Diagrammatic Notations: A Case Study with the UML. In Mary Hegarty, Bernd Meyer & N. Hari Narayanan, editors: Diagrams, Lecture Notes in Computer Science 2317, Springer, pp. 112–115. Available at http://dx.doi.org/10.1007/3-540-46037-3_15.
[13] Zinovy Diskin (2005): Mathematics of Generic Specifications for Model Management. In Laura C. Rivero, Jorge Horacio Doorn & Viviana E. Ferraggine, editors: Encyclopedia of Database Technologies and Applications, Idea Group, pp. 351–366.
[14] Zinovy Diskin (2008): Algebraic Models for Bidirectional Model Synchronization. In Czarnecki et al. [7], pp. 21–36. Available at http://dx.doi.org/10.1007/978-3-540-87875-9_2.
[15] Zinovy Diskin (2009): Model Synchronization: Mappings, Tiles, and Categories. In João M. Fernandes, Ralf Lämmel, Joost Visser & João Saraiva, editors: GTTSE, Lecture Notes in Computer Science 6491, Springer, pp. 92–165. Available at http://dx.doi.org/10.1007/978-3-642-18023-1_3.
[16] Zinovy Diskin & Boris Cadish (1997): A Graphical Yet Formalized Framework for Specifying View Systems. In: ADBIS, Nevsky Dialect, pp. 123–132. Available at http://www.bcs.org/upload/pdf/ewic_ad97_paper17.pdf.
[17] Zinovy Diskin, Steve M. Easterbrook & Jürgen Dingel (2008): Engineering Associations: From Models to Code and Back through Semantics. In Richard F. Paige & Bertrand Meyer, editors: TOOLS (46), Lecture Notes in Business Information Processing 11, Springer, pp. 336–355. Available at http://dx.doi.org/10.1007/978-3-540-69824-1_19.
[18] Zinovy Diskin, Boris Kadish, Frank Piessens & Michael Johnson (2000): Universal Arrow Foundations for Visual Modeling. In Michael Anderson, Peter Cheng & Volker Haarslev, editors: Diagrams, Lecture Notes in Computer Science 1889, Springer, pp. 345–360. Available at http://link.springer.de/link/service/series/0558/bibs/1889/18890345.htm.
[19] Zinovy Diskin, Tom Maibaum & Krzysztof Czarnecki (2012): Intermodeling, Queries, and Kleisli Categories. In Juan de Lara & Andrea Zisman, editors: FASE, Lecture Notes in Computer Science 7212, Springer, pp. 163–177. Available at http://dx.doi.org/10.1007/978-3-642-28872-2_12.
[20] Zinovy Diskin & Uwe Wolter (2008): A Diagrammatic Logic for Object-Oriented Visual Modeling. Electr. Notes Theor. Comput. Sci. 203(6), pp. 19–41. Available at http://dx.doi.org/10.1016/j.entcs.2008.10.041.
[21] Zinovy Diskin, Yingfei Xiong & Krzysztof Czarnecki (2010): Specifying Overlaps of Heterogeneous Models for Global Consistency Checking. In: MoDELS Workshops, Lecture Notes in Computer Science 6627, Springer, pp. 165–179. Available at http://dx.doi.org/10.1007/978-3-642-21210-9_16.
[22] Zinovy Diskin, Yingfei Xiong & Krzysztof Czarnecki (2011): From State- to Delta-Based Bidirectional Model Transformations: the Asymmetric Case. Journal of Object Technology 10, pp. 6:1–25. Available at http://dx.doi.org/10.5381/jot.2011.10.1.a6.
[23] H. Ehrig, K. Ehrig, U. Prange & G. Taentzer (2006): Fundamentals of Algebraic Graph Transformation. Springer.
[24] J. L. Fiadeiro & T. S. E. Maibaum (1995): A Mathematical Toolbox for the Software Architect. In J. Kramer & A. Wolf, editors: 8th Int. Workshop on Software Specification and Design, IEEE CS Press, pp. 46–55.
[25] José Luiz Fiadeiro (2004): Software Services: Scientific Challenge or Industrial Hype? In Zhiming Liu & Keijiro Araki, editors: ICTAC, Lecture Notes in Computer Science 3407, Springer, pp. 1–13. Available at http://dx.doi.org/10.1007/978-3-540-31862-0_1.
[26] José Luiz Fiadeiro & T. S. E. Maibaum (1992): Temporal Theories as Modularisation Units for Concurrent System Specification. Formal Asp. Comput. 4(3), pp. 239–272. Available at http://dx.doi.org/10.1007/BF01212304.
[27] José Luiz Fiadeiro & T. S. E. Maibaum (1995): Interconnecting Formalisms: Supporting Modularity, Reuse and Incrementality. In: SIGSOFT FSE, pp. 72–80. Available at http://doi.acm.org/10.1145/222124.222141.
[28] J. N. Foster, M. Greenwald, J. Moore, B. Pierce & A. Schmitt (2007): Combinators for bidirectional tree transformations: A linguistic approach to the view-update problem. ACM Trans. Program. Lang. Syst. 29(3), doi:10.1145/1232420.1232424.
[29] Joseph A. Goguen (1991): A Categorical Manifesto. Mathematical Structures in Computer Science 1(1), pp. 49–67. Available at http://dx.doi.org/10.1017/S0960129500000050.
[30] Frank Hermann, Hartmut Ehrig, Fernando Orejas, Krzysztof Czarnecki, Zinovy Diskin & Yingfei Xiong (2011): Correctness of Model Synchronization Based on Triple Graph Grammars. In Whittle et al. [51], pp. 668–682. Available at http://dx.doi.org/10.1007/978-3-642-24485-8_49.
[31] M. Hofmann, B. Pierce & D. Wagner (2011): Symmetric Lenses. In: POPL. Available at http://doi.acm.org/10.1145/1328438.1328487.
[32] M. Johnson, R. Rosebrugh & R. Wood (2002): Entity-relationship-attribute designs and sketches. Theory and Applications of Categories 10(3), pp. 94–112.
[33] Michael Johnson & Robert D. Rosebrugh (2007): Fibrations and universal view updatability. Theor. Comput. Sci. 388(1-3), pp. 109–129. Available at http://dx.doi.org/10.1016/j.tcs.2007.06.004.
[34] Michael Johnson & Robert D. Rosebrugh (2012): Lens put-put laws: monotonic and mixed. To appear. http://www.easst.org/eceasst.
[35] Michael Johnson, Robert D. Rosebrugh & Richard J. Wood (2010): Algebras and Update Strategies. J. UCS 16(5), pp. 729–748. Available at http://dx.doi.org/10.3217/jucs-016-05-0729.
[36] José Fiadeiro (2004): Categories for Software Engineering. Springer.
[37] Stefan Jurack & Gabriele Taentzer (2009): Towards Composite Model Transformations Using Distributed Graph Transformation Concepts. In Andy Schürr & Bran Selic, editors: MoDELS, Lecture Notes in Computer Science 5795, Springer, pp. 226–240. Available at http://dx.doi.org/10.1007/978-3-642-04425-0_17.
[38] Hongzhi Liang, Zinovy Diskin, Jürgen Dingel & Ernesto Posse (2008): A General Approach for Scenario Integration. In Czarnecki et al. [7], pp. 204–218. Available at http://dx.doi.org/10.1007/978-3-540-87875-9_15.
[39] M. Makkai (1997): Generalized Sketches as a Framework for Completeness Theorems. Journal of Pure and Applied Algebra 115, pp. 49–79, 179–212, 214–274.
[40] Kazutaka Matsuda, Zhenjiang Hu, Keisuke Nakano, Makoto Hamana & Masato Takeichi (2007): Bidirectionalization transformation based on automatic derivation of view complement functions. In Ralf Hinze & Norman Ramsey, editors: ICFP, ACM, pp. 47–58. Available at http://dx.doi.org/10.1145/1291151.1291162.
Category Theory and Model Driven Engineering
[41] Object Management Group (2008): MOF Query / Views / Transformations Specification 1.0. http://www.omg.org/docs/formal/08-04-03.pdf.
[42] Rachel Pottinger & Philip A. Bernstein (2003): Merging Models Based on Given Correspondences. In: VLDB, pp. 826–873. Available at http://www.vldb.org/conf/2003/papers/S26P01.pdf.
[43] Alessandro Rossini, Juan de Lara, Esther Guerra, Adrian Rutle & Yngve Lamo (2012): A Graph Transformation-based Semantics for Deep Metamodelling. In: AGTIVE 2012. To appear.
[44] Alessandro Rossini, Adrian Rutle, Yngve Lamo & Uwe Wolter (2010): A formalisation of the copy-modify-merge approach to version control in MDE. J. Log. Algebr. Program. 79(7), pp. 636–658. Available at http://dx.doi.org/10.1016/j.jlap.2009.10.003.
[45] Adrian Rutle, Alessandro Rossini, Yngve Lamo & Uwe Wolter (2010): A Formalisation of Constraint-Aware Model Transformations. In David S. Rosenblum & Gabriele Taentzer, editors: FASE, Lecture Notes in Computer Science 6013, Springer, pp. 13–28. Available at http://dx.doi.org/10.1007/978-3-642-12029-9_2.
[46] Adrian Rutle, Alessandro Rossini, Yngve Lamo & Uwe Wolter (2012): A formal approach to the specification and transformation of constraints in MDE. J. Log. Algebr. Program. 81(4), pp. 422–457. Available at http://dx.doi.org/10.1016/j.jlap.2012.03.006.
[47] Bran Selic (2008): Personal reflections on automation, programming culture, and model-based software engineering. Autom. Softw. Eng. 15(3-4), pp. 379–391. Available at http://dx.doi.org/10.1007/s10515-008-0035-7.
[48] M. Shaw (1996): Three patterns that help explain the development of software engineering (position paper). In: Dagstuhl Workshop on Software Architecture.
[49] Stefano Spaccapietra & Christine Parent (1994): View Integration: A Step Forward in Solving Structural Conflicts. IEEE Trans. Knowl. Data Eng. 6(2), pp. 258–274. Available at http://doi.ieeecomputersociety.org/10.1109/69.277770.
[50] Perdita Stevens (2010): Bidirectional model transformations in QVT: semantic issues and open questions. Software and System Modeling 9(1), pp. 7–20. Available at http://dx.doi.org/10.1007/s10270-008-0109-9.
[51] Jon Whittle, Tony Clark & Thomas Kühne, editors (2011): Model Driven Engineering Languages and Systems, 14th International Conference, MODELS 2011, Wellington, New Zealand, October 16-21, 2011. Proceedings. Lecture Notes in Computer Science 6981, Springer. Available at http://dx.doi.org/10.1007/978-3-642-24485-8.
[52] Uwe Wolter & Zinovy Diskin (2007): From Indexed to Fibred Semantics: The Generalized Sketch File. Technical Report 361, Department of Informatics, University of Bergen, Norway. http://www.ii.uib.no/publikasjoner/texrap/pdf/2007-361.pdf.
[53] Y. Xiong, D. Liu, Z. Hu, H. Zhao, M. Takeichi & H. Mei (2007): Towards automatic model synchronization from model transformations. In: ASE, pp. 164–173. Available at http://doi.acm.org/10.1145/1321631.1321657.
Appendix A. Algebra of bidirectional update propagation
In Section 4.2, we considered operations of update propagation, but did not specify any laws they must satisfy. Such laws are crucial for capturing semantics, and the present section aims to specify algebraic laws for BX. We will do this in an elementary way using tile algebra (a categorical treatment is nontrivial and left for future work). We begin with the notion of an alignment framework to formalize delta composition (∗ in Section 4.2), and then proceed to algebraic structures modeling BX: symmetric delta lenses. (Note that the lenses we introduce here are different from those defined in [10].)
Z. Diskin & T. Maibaum
[Figure 11 appears here: tile diagrams showing (a) the arities of the realignment operations fAln and bAln, and (b) the (fAln-bAln) law.]
Figure 11: Realignment operations and their laws

Definition 1 (Alignment framework). An alignment framework is given by the following data.

(i) Two categories with pullbacks, A and B, called model spaces. We consider spans in these categories up to their equivalence via a head isomorphism commuting with legs. That is, we work with equivalence classes of spans, and the term 'span' refers to an equivalence class of spans. Then span composition (via pullbacks) is strictly associative, and we have categories (rather than bicategories) of spans, Span(A) and Span(B). Their subcategories consisting of spans with injective legs will be denoted by A• and B•, resp. Such spans are to be thought of as (model) updates. They are depicted by vertical bidirectional arrows, for example, a and b in the diagrams of Fig. 11(a). We assume that the upper node of such an arrow is its formal source, and the lower one is the target; the source is the original (state of the) model, and the target is the updated model. Thus, model evolution is directed down. A span whose upper leg is an identity (nothing deleted) is an insert update; it is denoted by a unidirectional arrow going down. Dually, a span with an identity lower leg is a delete update; it is denoted by a unidirectional arrow going up (but the formal source of such an arrow is still the upper node).

(ii) For any two objects, A ∈ A0 and B ∈ B0, there is a set R(A, B) of correspondences (or corrs, for short) from A to B. Elements of R(A, B) are depicted by bidirectional horizontal arrows, whose formal source is A and target is B. Updates and corrs will also be called vertical and horizontal deltas, resp.

(iii) Two diagram operations over corrs and updates, called forward and backward (re)alignment. Their arities are shown in Fig. 11(a) (output arrows are dashed). We will also write a ∗ r for fAln(a, r) and r ∗ b for bAln(b, r). We will often skip the prefix 're' and say 'alignment' to ease terminology.
There are three laws regulating alignment. Identity updates do not actually need realignment:

(IdAln)  idA ∗ r = r = r ∗ idB

for any corr r : A ↔ B. The result of applying a sequence of interleaving forward and backward alignments does not depend on the order of application, as shown in Fig. 11(b):

(fAln-bAln)  (a ∗ r) ∗ b = a ∗ (r ∗ b)
for any corr r and any updates a, b. We call diagrams like those shown in Fig. 11(a,b) commutative if the arrow at the respective operation's output is indeed equal to the one computed by the operation. For example, diagram (b) is commutative if r′ = a ∗ r ∗ b.
[Figure 12 appears here: tile diagrams showing (a) the arities of the propagation operations fPpg and bPpg, (b) the (IdPpg) laws, and (c) the monotonicity laws.]
Figure 12: Operations of update propagation

Finally, alignment is compositional: for any consecutive updates a : A → A′, a′ : A′ → A″, b : B → B′, b′ : B′ → B″, the following holds:

(AlnAln)  a′ ∗ (a ∗ r) = (a; a′) ∗ r  and  (r ∗ b) ∗ b′ = r ∗ (b; b′)
where ; denotes sequential span composition. It is easy to see that having an alignment framework amounts to having a functor α : A• × B• → Set.

Definition 2 (Symmetric delta lens). A symmetric delta lens (briefly, an sdlens) is a triple (α, fPpg, bPpg) with α : A• × B• → Set an alignment framework, and fPpg, bPpg two diagram operations over corrs and updates (called forward and backward update propagation, resp.). The arities are specified in Fig. 12(a), with output arrows dashed and output nodes not framed. Sometimes we will use a linear notation and write b = a.fPpg(r) and a = b.bPpg(r) for the cases specified in the diagrams. Each operation must satisfy the following laws.

Stability or IdPpg law: if nothing changes on one side, nothing happens on the other side either; that is, identity updates are propagated into identity updates, as shown by the diagrams in Fig. 12(b).

Monotonicity: insert updates are propagated into inserts, and delete updates are propagated into deletes, as specified in Fig. 12(c).

Monotonic Compositionality or PpgPpg law: the composition of two consecutive inserts is propagated into the composition of their propagations, as shown by the left diagram in Fig. 13 (to be read as follows: if the two squares are fPpg, then the outer rectangle is fPpg as well). The right diagram specifies compositionality for deletes. The same laws are formulated for bPpg. Note that we do not require compositionality for propagation of general span updates. The point is that interleaved inserts and deletes can annihilate, and lost information cannot be restored: see [28, 22, 10] for examples.

Commutativity: the diagrams in Fig. 12(a) must be commutative in the sense that a ∗ r ∗ b = r′.

Finally, forward and backward propagation must be coordinated with each other by some invertibility law. Given a corr r : A ↔ B, an update a : A → A′ is propagated into an update b = a.fPpg(r), which can be propagated back to an update a′ = b.bPpg(r). For an ideal situation of strong invertibility, we should require a′ = a.
Unfortunately, this does not hold in general, because the A•-specific part of the information is lost in passing from a to b and cannot be restored [10]. However, it makes sense to require the weak invertibility specified in Fig. 14, which does hold in a majority of practically interesting situations, e.g., for BX determined by TGG rules [30]. The law fbfPpg says that although a1 = a.fPpg(r).bPpg(r) ≠ a, the update a1 is equivalent to a in the sense that a1.fPpg(r) = a.fPpg(r). Similarly for the bfbPpg law.

[Figure 13 appears here: tile diagrams showing the monotonic (PpgPpg) laws.]

[Figure 14 appears here: tile diagrams showing the roundtripping laws, (a) the fbfPpg law and (b) the bfbPpg law. The scenario in diagram (b) 'runs' from the right to the left.]

The notion of sdlens is specified above in elementary terms using tile algebra. Its categorical underpinning is not evident, and we only present several brief remarks.

1) An alignment framework α : A• × B• → Set can be seen as a profunctor, if A-arrows are considered directed up (i.e., the formal source of update a in diagram Fig. 11(a) is A′, and the target is A). Then alignment amounts to a functor α : A•op × B• → Set, that is, a profunctor from B• to A•. Note that reversing arrows in A• actually changes the arity of the operation fAln: now its input is a pair (a, r) with a an update and r a corr from the target of a, and the output is a corr r′ from the source of a; that is, realignment goes back in time.

2) Recall that the operations fPpg and bPpg are functorial wrt. injective arrows in A, B, not wrt. arrows in A•, B•. However, if we try to resort to A, B entirely and define alignment wrt. arrows in A, B, then we will need two fAln operations with different arities for inserts and deletes, and likewise two bAln operations with different arities for inserts and deletes. We will then have four functors αi : A × B → Set with i ranging over the four-element set {insert, delete} × {A, B}.

3) The weak invertibility laws suggest that a Galois connection/adjunction is somehow hidden in sdlenses.
4) Working with chosen spans and pullbacks rather than with their equivalence classes provides a more constructive setting (given that we assume the axiom of choice), but then associativity of span composition only holds up to chosen natural isomorphisms, and A• and B• have to be considered bicategories rather than categories. All in all, we hope that the categorical analysis of asymmetric delta lenses developed by Johnson et al. [35, 34] could be extended to capture the symmetric case too.
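As a closing illustration of the laws of Definition 2, the following deliberately simplified sketch shows a view-style BX where strong invertibility fails but the weak fbfPpg roundtrip holds. This is our own state-based toy (the sdlenses above are delta-based, and the name matching plays the role of the corr); the data layout and the constant DEFAULT_SALARY are illustrative assumptions.

```python
# A-models: name -> (age, salary); B-models: name -> age (a view that
# forgets salaries).  Updates are collapsed to before/after states.

DEFAULT_SALARY = 0  # assumption: people inserted on the B side get this

def fPpg(A_old, A_new):
    # forward propagation: the view simply projects ages
    return {n: age for n, (age, _) in A_new.items()}

def bPpg(A_old, B_new):
    # backward propagation: restore salaries for surviving names
    return {n: (age, A_old[n][1] if n in A_old else DEFAULT_SALARY)
            for n, age in B_new.items()}

A = {'ann': (30, 5000)}
A_changed = {'ann': (30, 6000)}    # a salary raise, invisible in the view

B1 = fPpg(A, A_changed)            # forward: ages unchanged
A1 = bPpg(A, B1)                   # backward: the raise is lost
assert A1 != A_changed                      # strong invertibility fails...
assert fPpg(A, A1) == fPpg(A, A_changed)    # ...but the fbfPpg law holds

# Stability (IdPpg): an identity update propagates to an identity update.
assert fPpg(A, A) == {'ann': 30}
```

The A•-specific information (the salary) is exactly what is lost in passing from a to b, matching the discussion of remark [10] above.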
Bisimulation of Labeled State-to-Function Transition Systems of Stochastic Process Languages

D. Latella & M. Massink
CNR – Istituto di Scienza e Tecnologie dell'Informazione 'A. Faedo'
E.P. de Vink∗
Technische Universiteit Eindhoven and Centrum Wiskunde & Informatica
Abstract

Labeled state-to-function transition systems, FuTS for short, admit multiple transition schemes from states to functions of finite support over general semirings. As such they constitute a convenient modeling instrument to deal with stochastic process languages. In this paper, the notion of bisimulation induced by a FuTS is addressed from a coalgebraic point of view. A correspondence result is proven, stating that FuTS-bisimulation coincides with the behavioral equivalence of the associated functor. As generic examples, the concrete existing equivalences for the cores of the stochastic process algebras PEPA and IML are related to the bisimulation of specific FuTS, providing via the correspondence result a coalgebraic justification of the equivalences of these calculi.
1
Introduction
Process description languages equipped with formal operational semantics are successful formalisms for modeling concurrent systems and analyzing their behavior. Typically, the operational semantics is defined by means of a labeled transition system (LTS) following the SOS approach. The states of the transition system are just process terms, while the labels of the transitions between states represent the possible actions and interactions. Process description languages often come equipped with process equivalences, so that system models can be compared according to specific behavioral relations. In the last couple of decades, process languages have been enriched with quantitative information. Among these quantitative extensions, those allowing a stochastic representation of time, usually referred to as stochastic process algebras, have received particular attention. The main aim has been the integration of qualitative descriptions and quantitative analysis in a single mathematical framework by building on the combination of labeled transition systems and continuous-time Markov chains, the latter being one of the most successful approaches to modeling and analyzing the performance of computer systems and networks. An overview of stochastic process algebras, equivalences and related analysis techniques can be found in [14, 1, 3], for example. A common feature of many stochastic process algebras is that actions are enriched with the rates of exponentially distributed random variables that characterize their duration. Although exploiting the same class of distributions, the models and the techniques underlying the definitions of the calculi turn out to be significantly different in many respects. A prominent difference concerns the modeling of the race condition by means of the choice operator, and its relationship to the issue of transition multiplicity.
In the quantitative setting, multiplicities can make a crucial distinction between processes that are qualitatively equivalent. Several significantly different approaches have been proposed for handling transition multiplicity. The proposals range from multi-relations [17, 13], to proved transition systems [23], to LTS with numbered transitions [14], to unique rate names [9], just to mention a few.

∗ Corresponding author, email: [email protected]
U. Golas, T. Soboll (Eds.): Proceedings of ACCAT 2012 EPTCS 93, 2012, pp. 23–43, doi:10.4204/EPTCS.93.2
Bisimulation of FuTS
In [7], Latella, Massink et al. have proposed a variant of LTS, called Rate Transition Systems (RTS). In an LTS, a transition is a triple (P, α, P′), where P and α are the source state and the label of the transition, respectively, while P′ is the target state reached from P via the transition. In an RTS, a transition is a triple of the form (P, α, 𝒫). The first and second components are the source state and the label of the transition, as in an LTS, while the third component 𝒫 is a continuation function which associates a non-negative real value with each state P′. A non-zero value for the state P′ represents the rate of the exponential distribution characterizing the time for the execution of the action represented by α, necessary to reach P′ from P via the transition. If 𝒫 maps P′ to 0, then state P′ is not reachable from P via the transition. The use of continuation functions provides a clean and simple solution to the transition multiplicity problem and makes RTS particularly suited for stochastic process algebra semantics. In order to provide a uniform account of the many stochastic process algebras proposed in the literature, in previous joint work of the first two authors [8], Labeled State-to-Function Transition Systems (FuTS) have been introduced as a natural generalization of RTS. In FuTS, the codomains of the continuation functions are arbitrary semirings, rather than just the non-negative reals. This provides increased flexibility while preserving basic properties of primitive operations like sum and multiplication. In this paper we present a coalgebraic treatment of FuTS that allows multiple state-to-function transition relations involving arbitrary semirings. Given label sets Li and semirings Ri, a FuTS takes the general format S = (S, ⟨→i⟩ni=1) with transition relations →i ⊆ S × Li × FS(S, Ri).
Here, FS(S, Ri) is the set of functions from S to Ri of finite support, a subcollection of functions also occurring in other work combining coalgebra and quantitative modeling. We will associate to S the product of the functors FS(·, Ri)^Li. For this to work, we need the transition relations →i to be total and deterministic, so as to allow the coalgebraic modeling as a function. Maybe surprisingly, this isn't a severe restriction at all in the presence of continuation functions: the zero continuation λs′.0 expresses that no LTS-transition exists from the state s to any state s′; if s allows a transition to some state s1 as well as to a state s2, the continuation function will simply yield a non-zero value for s1 and for s2. The notion of S-bisimulation that arises from a FuTS S is reinterpreted coalgebraically as the behavioral equivalence of a functor that is induced by S, along the lines sketched above. Behavioral equivalence rather than coalgebraic bisimulation is targeted since, depending on the semirings involved, weak pullbacks may not be preserved, and the construction of a mediating morphism for a coalgebraic bisimulation from a concrete one may fail for degenerate denominators. However, following a familiar argument, we show that the functor associated with a FuTS does possess a final coalgebra and therefore has an associated notion of behavioral equivalence indeed. It is noted that, in the presence of a final coalgebra for FuTS, a more general definition of behavioral equivalence based on cospans coincides [20]. A correspondence result is proven in this paper that shows that the concrete bisimulation of a FuTS coincides with behavioral equivalence of its functor. Pivotal for its proof is the absence of multiplicities in the FuTS treatment of quantities. Using the bridge established by the correspondence result, we continue by showing for two well-known stochastic process algebras, viz.
Hillston's PEPA [17] and Hermanns's IML [13], that the respective standard notions of strong equivalence and strong bisimulation coincide with behavioral equivalence of the associated FuTS. This constitutes the main contribution of the paper. PEPA stands out as one of the prominent Markovian process algebras, while IML specifically provides separate prefix constructions for actions and for delays. The equivalences of PEPA and of IML are compared with the bisimulations of the respective FuTS as given by alternative operational semantics involving the state-to-function scheme. In passing, the multiplicities have to be dealt with. Appropriate lemmas are provided relating the relation-based cumulative treatment of FuTS to the multi-relation-based explicit treatment of PEPA and IML.
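The role of continuation functions in resolving transition multiplicity, as described above for RTS, can be sketched in a few lines (an illustrative toy of our own, not code from [7]; process names and rates are made up):

```python
# In an RTS, the third component of a transition is a continuation function
# assigning a rate to every target state; multiple ways of reaching the
# same target simply accumulate in the rate.

def rate_continuation(branches):
    # branches: list of (target_state, rate) pairs, possibly with repeats
    cont = {}
    for target, rate in branches:
        cont[target] = cont.get(target, 0.0) + rate
    return cont

# P = (a, 2.0).Q + (a, 2.0).Q : two syntactically distinct a-branches with
# rate 2.0 to the same target Q.  An LTS would need a multi-relation to
# keep both copies; the continuation function records cumulative rate 4.0.
cont = rate_continuation([('Q', 2.0), ('Q', 2.0)])
assert cont == {'Q': 4.0}

# Unreachable states are implicitly mapped to 0.
assert cont.get('R', 0.0) == 0.0
```

Here the race between the two identical branches is captured by a single quantitative datum, which is exactly the "cumulative" treatment the correspondence lemmas for PEPA and IML relate to the multi-relation view.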
Related work on coalgebra includes [28, 19, 26], papers that also cover measures and congruence formats, a topic not touched upon here. For the discrete parts, regarding the correspondence of bisimulations, our work aligns with the approach of the papers mentioned. In this paper, the bialgebraic perspective on SOS and bisimulation [27] is left implicit. An interesting direction of research combining coalgebra and quantities studies various types of weighted automata, including linear weighted automata, and associated notions of bisimulation and languages, as well as algorithms for these notions [6, 25, 5]. In particular, building on a result on bounded functors [11], it is shown in [5], for a functor involving functions of finite support over a field, that the final coalgebra exists. Below, we follow the scheme of [5] to obtain such a result for a functor induced by a FuTS. The notions of equivalence addressed in this paper, as often in coalgebraic treatments of process relations, are all strong bisimilarities. The present paper is organized as follows: Section 2 briefly discusses some material on semirings and coalgebras. Labeled state-to-function transition systems and FuTS, as well as the associated notion of bisimulation, are introduced in Section 3. The coalgebraic counterparts of FuTS and FuTS-bisimulation are defined in Section 4, where we also establish the correspondence with behavioral equivalence of the final coalgebra. In Section 5 the standard equivalence of PEPA is identified with the bisimulation of a FuTS and, hence, with behavioral equivalence. In Section 6 the same is done for the language of IMC, where actions and delays are present on an equal footing. Section 7 wraps up and discusses directions of future research. An appendix provides the proofs of a number of lemmas.
2
Preliminaries
A tuple R = (R, +, 0, ∗, 1) is called a semiring if (R, +, 0) is a commutative monoid with neutral element 0, (R, ∗, 1) is a monoid with neutral element 1, ∗ distributes over +, and 0 ∗ r = r ∗ 0 = 0 for all r ∈ R. As examples of semirings we will use the booleans B = { false, true }, with disjunction as sum and conjunction as multiplication, and the non-negative reals R≥0 with the standard operations. We will consider, for a semiring R and a function ϕ : X → R, countable sums ∑x∈X′ ϕ(x) in R, for X′ ⊆ X. For such a sum to exist we require ϕ to be of finite support, i.e. the support set spt(ϕ) = { x ∈ X | ϕ(x) ≠ 0 } is finite. Here, 0 is the neutral element of R with respect to +. We use the notation FS(X, R) for the collection of all functions of finite support from the set X to the semiring R. A construct [ x1 ↦ r1, . . . , xn ↦ rn ], with xi ∈ X, i = 1 . . . n, all distinct, and ri ∈ R, i = 1 . . . n, denotes the mapping that assigns ri to xi, i = 1 . . . n, and assigns 0 to all x ∈ X different from all xi. In particular [], or more precisely []R, is the constant function x ↦ 0, and 1x = [ x ↦ 1 ] is the characteristic function on R for x ∈ X. For ϕ ∈ FS(X, R), we write ⊕ϕ for the value ∑x∈X ϕ(x) in R. For ϕ, ψ ∈ FS(X, R), the function ϕ + ψ is the pointwise sum of ϕ and ψ, i.e. (ϕ + ψ)(x) = ϕ(x) + ψ(x) ∈ R. Clearly, ϕ + ψ is of finite support as ϕ and ψ are. Given an injective operation ⊙ : X × X → X, we define ϕ ⊙ ψ : X → R by (ϕ ⊙ ψ)(x) = ϕ(x1) ∗ ψ(x2) if x = x1 ⊙ x2 for some x1, x2 ∈ X, and (ϕ ⊙ ψ)(x) = 0 otherwise. Again, ϕ ⊙ ψ is of finite support as ϕ and ψ are. This is used in the setting of syntactic processes that may have the form P1 ∥A P2 for two processes P1 and P2 and a syntactic operator ∥A.

Lemma 1. Let X be a set, R a semiring, and ⊙ an injective binary operation on X. For ϕ, ψ ∈ FS(X, R) it holds that ⊕(ϕ + ψ) = ⊕ϕ + ⊕ψ and ⊕(ϕ ⊙ ψ) = (⊕ϕ) ∗ (⊕ψ).

We recall some basic definitions from coalgebra. See e.g.
[24] for more details. For a functor F : Set → Set on the category Set of sets and functions, a coalgebra of F is a set X together with a mapping α : X → F(X). A homomorphism between two F-coalgebras (X, α) and (Y, β) is a function f : X → Y such that F(f) ∘ α = β ∘ f. An F-coalgebra (Ω, ω) is called final if there exists, for every F-coalgebra
(X, α), a unique homomorphism [[·]]^F_X : (X, α) → (Ω, ω). Two elements x1, x2 of a coalgebra (X, α) are called behaviorally equivalent with respect to F if [[x1]]^F_X = [[x2]]^F_X, notation x1 ≈F x2. Using a characterization of [12], a functor F on Set is bounded if there exist sets A and B and a surjective natural transformation η : A × (·)^B ⇒ F. Here, A × (·) is the functor that maps a set X to the Cartesian product A × X and maps a function f : X → Y to the mapping A × f : A × X → A × Y with (A × f)(a, x) = (a, f(x)), while (·)^B denotes the functor that maps a set X to the function space X^B of all functions from B to X and maps a function f : X → Y to the mapping f^B : X^B → Y^B with f^B(ϕ)(b) = f(ϕ(b)). For bounded functors we have the following result; see [11] for a proof.

Theorem 2. If a functor F : Set → Set is bounded, then its final coalgebra exists.
A number of proofs of results on process languages in this paper rely on so-called guarded recursion [2]. Typically, constants X are a syntactic ingredient in these languages. As usual, if X := P, i.e. the constant X is declared to have the process P as its body, we require P to be prefix-guarded. Thus, any occurrence of a constant in the body P is in the scope of a prefix construct of the language. Guarded recursion assumes the existence of a function c : P → N such that c(P1 • P2) > max{ c(P1), c(P2) } for all syntactic operations • of the language, and moreover c(X) > c(P) if X := P.
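The semiring and finite-support-function operations of this section can be sketched executably as follows (a toy encoding of our own: dicts omit zero values, a semiring is a tuple of its data, and the names osum, ptsum, odot stand in for ⊕, pointwise +, and the injective-operation product ⊙):

```python
# A semiring is encoded as (zero, one, add, mul).
BOOL = (False, True, lambda a, b: a or b, lambda a, b: a and b)
REALS = (0.0, 1.0, lambda a, b: a + b, lambda a, b: a * b)

def osum(phi, R):
    # the operation ⊕phi: sum of phi over its (finite) support
    zero, _, add, _ = R
    total = zero
    for v in phi.values():
        total = add(total, v)
    return total

def ptsum(phi, psi, R):
    # pointwise sum phi + psi
    zero, _, add, _ = R
    return {x: add(phi.get(x, zero), psi.get(x, zero))
            for x in set(phi) | set(psi)}

def odot(phi, psi, R, op):
    # phi ⊙ psi for an injective binary operation op on X
    _, _, _, mul = R
    return {op(x1, x2): mul(v1, v2)
            for x1, v1 in phi.items() for x2, v2 in psi.items()}

phi = {'p': 2.0}
psi = {'q': 3.0, 'r': 1.0}
par = lambda x, y: (x, '|', y)   # an injective "parallel composition"

# Lemma 1 on this instance:
assert osum(ptsum(phi, psi, REALS), REALS) == osum(phi, REALS) + osum(psi, REALS)
assert osum(odot(phi, psi, REALS, par), REALS) == osum(phi, REALS) * osum(psi, REALS)
```

The injectivity of the operation passed to odot matters: it guarantees that distinct pairs (x1, x2) land on distinct keys, so no values are silently merged and the product law of Lemma 1 holds.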
3
Labeled State-to-Function Transition Systems
The definition of a labeled state-to-function transition system, FuTS for short, involves a set of states S and one or more relations of states and functions from states into a semiring. For sums over arbitrary subsets of states to exist, the functions are assumed to be of finite support.

Definition 1. A FuTS S, in full a labeled state-to-function transition system, over a number of label sets Li and semirings Ri, i = 1 . . . n, is a tuple S = (S, ⟨→i⟩ni=1) such that →i ⊆ S × Li × FS(S, Ri), for i = 1 . . . n. •

As usual, we write s →ℓi v for (s, ℓ, v) ∈ →i. For a FuTS S = (S, ⟨→i⟩ni=1), the set S is called the set of states. We refer to each →i as a state-to-function transition relation of S, or just as a transition relation of it. If for S we have that n = 1, i.e. there is only one state-to-function transition relation →, then S is called simple. A FuTS S is called total and deterministic if for each transition relation →i ⊆ S × Li × FS(S, Ri) involved and for all s ∈ S, ℓ ∈ Li, we have s →ℓi v for exactly one v ∈ FS(S, Ri). In such a situation, the zero function []Ri plays a special role. A state-to-function transition s →ℓi []Ri reflects the absence of a non-trivial transition s →ℓi v for v ≠ []Ri. In the context of LTS one says that s has no ℓ-transition. For the remainder of the paper, all FuTS we consider are assumed to be total and deterministic.¹

Examples. For the modeling of CCS processes, we choose a set of actions A as label set and the booleans B as semiring. Consider the two CCS processes P = a.b.0 + a.c.0 and Q = a.(b.0 + c.0); their representation as a FuTS is depicted in Figure 1. For process P we have P →a [b.0 ↦ true, c.0 ↦ true], while for the process Q we have Q →a [b.0 + c.0 ↦ true]. So, FuTS are able to represent branching. As another example of a simple FuTS, Figure 1 displays a FuTS over the action set A and the semiring R≥0 of the non-negative real numbers. The functions v0 to v4 used in the example have the property that ⊕vi = 1, for i = 0 . . . 4. Usually, such a FuTS over R≥0 is called a (reactive) probabilistic transition system.

¹Definition 1 slightly differs in formulation from the one in [8].
Figure 1: FuTS for two CCS processes and a probabilistic process.

In Section 6 we will provide semantics for the process language IML of interactive Markov chains [13, 16] using FuTS. Unlike many other stochastic process algebras, a single IML process can in general both perform action-based transitions and time delays governed by exponential distributions. It will be notationally convenient to consider a (total and deterministic) FuTS as a tuple (S, ⟨θi⟩ni=1) with transition functions θi : S → Li → FS(S, Ri), i = 1 . . . n, rather than using the form (S, ⟨→i⟩ni=1) that occurs more frequently for concrete examples in the literature. Alternatively, using disjoint unions, one could see a FuTS represented by a function θ′ : S → ⊎ni=1 Li → ⊎ni=1 FS(S, Ri) satisfying the additional property that θ′(s)(ℓ) ∈ FS(S, Ri) if ℓ ∈ Li. As this fits less smoothly with the category-theoretical approach of Section 4, we stick to the former format. Note, an interpretation of a FuTS as a function S → ⊎ni=1 (Li → FS(S, Ri)) does not suit our purposes, as the IML example above illustrates. We will use the notation with transition functions θi : S → Li → FS(S, Ri) to introduce the notion of bisimilarity for a FuTS.

Definition 2. Let S = (S, ⟨θi⟩ni=1) be a FuTS over the label sets Li and semirings Ri, i = 1 . . . n. An equivalence relation R ⊆ S × S is called an S-bisimulation if R(s1, s2) implies

(1)  ∑t′∈[t]R θi(s1)(ℓ)(t′) = ∑t′∈[t]R θi(s2)(ℓ)(t′)  for all t ∈ S, i = 1 . . . n and ℓ ∈ Li.

Two elements s1, s2 ∈ S are called S-bisimilar if R(s1, s2) for some S-bisimulation R for S. Notation: s1 ∼S s2. •

We use the notation [t]R to denote the equivalence class of t ∈ S with respect to R. Note that the sums in equation (1) exist since the functions θi(sj)(ℓ) ∈ FS(S, Ri), i = 1 . . . n, j = 1, 2, are of finite support. Hence, θi(sj)(ℓ)(t′) = 0 ∈ Ri for all but finitely many t′ ∈ [t]R ⊆ S.
For the combined FuTS of the two CCS processes of Figure 1, the obvious equivalence relation relating b.0 and b.0 + c.0 is not a FuTS-bisimulation. Although ∑t′∈[0]R θ(b.0)(b)(t′) = θ(b.0)(b)(0) = true and ∑t′∈[0]R θ(b.0 + c.0)(b)(t′) = θ(b.0 + c.0)(b)(0) = true, we have ∑t′∈[0]R θ(b.0)(c)(t′) = false, while ∑t′∈[0]R θ(b.0 + c.0)(c)(t′) = true, taking sums, i.e. disjunctions, in B.
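The calculation above can be replayed concretely (a toy sketch of our own: process terms are encoded as strings, and finite-support functions into the boolean semiring as dicts omitting false):

```python
# theta maps a (state, label) pair to a finite-support function into B.
theta = {
    ('P',       'a'): {'b.0': True, 'c.0': True},
    ('Q',       'a'): {'b.0+c.0': True},
    ('b.0',     'b'): {'0': True},
    ('c.0',     'c'): {'0': True},
    ('b.0+c.0', 'b'): {'0': True},
    ('b.0+c.0', 'c'): {'0': True},
}

def class_sum(state, label, cls):
    # the sum (disjunction, in B) of theta(state)(label) over a class of states
    v = theta.get((state, label), {})
    return any(v.get(t, False) for t in cls)

# Candidate relation relating b.0 and b.0+c.0: condition (1) holds for
# label b over the class {0} but fails for label c, as computed in the text.
assert class_sum('b.0', 'b', {'0'}) == class_sum('b.0+c.0', 'b', {'0'})
assert class_sum('b.0', 'c', {'0'}) != class_sum('b.0+c.0', 'c', {'0'})
```

The missing entry for ('b.0', 'c') plays the role of the zero function []B: state b.0 has no c-transition, so the c-sum over the class {0} is false.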
4

FuTS coalgebraically

In this section we will cast FuTS in the framework of coalgebras and prove a correspondence result relating FuTS-bisimulation and behavioral equivalence for a suitable functor on Set.
Definition 3. Let L be a set of labels and let R be a semiring. The functor V^L_R : Set → Set assigns to a set X the function space FS(X, R)^L of all functions ϕ : L → FS(X, R), and assigns to a function f : X → Y the mapping V^L_R(f) : FS(X, R)^L → FS(Y, R)^L where

V^L_R(f)(ϕ)(ℓ)(y) = ∑x′∈f⁻¹(y) ϕ(ℓ)(x′)

for all ϕ ∈ FS(X, R)^L, ℓ ∈ L and y ∈ Y. •

Again we rely on ϕ(ℓ) ∈ FS(X, R) having finite support for the sum to exist and for V^L_R to be well-defined. In fact, we have spt(V^L_R(f)(ϕ)(ℓ)) = { f(x) | x ∈ spt(ϕ(ℓ)) }. As we aim to compare our notion of bisimulation for FuTS with behavioral equivalence for the functor V^L_R, given a set of labels L and a semiring R, we need to check that V^L_R possesses a final coalgebra. We follow the approach of [5].

Lemma 3. Let L be a set of labels and R a semiring. Then the functor V^L_R on Set is bounded.
Working with total and deterministic FuTS, we can interpret a FuTS S = (S, ⟨θ_i⟩_{i=1}^n) over the label sets L_i and semirings R_i, i = 1...n, as a product θ_1 × ··· × θ_n : S → ∏_{i=1}^n (L_i → FS(S, R_i)) of functions θ_i : S → L_i → FS(S, R_i). To push this idea a bit further, we want to consider the FuTS S = (S, ⟨θ_i⟩_{i=1}^n) as a coalgebra of a suitable product functor on Set.

Definition 4. Let S = (S, ⟨θ_i⟩_{i=1}^n) be a FuTS over the label sets L_i and semirings R_i, i = 1...n. The functor V_S on Set is defined by V_S = ∏_{i=1}^n V_{R_i}^{L_i} = ∏_{i=1}^n FS(·, R_i)^{L_i}.

The point is that, under conditions that are generally met, coalgebras come equipped with a natural notion of behavioral equivalence that can act as a reference for strong equivalences, in particular for bisimulation of FuTS. Below, see Theorem 5, we prove that S-bisimilarity as given by Definition 2 coincides with behavioral equivalence for the functor V_S as given by Definition 4, providing justification for the notion of equivalence defined on FuTS. For the notion of behavioral equivalence for the functor V_S obtained from S to be defined, we establish that it possesses a final coalgebra.

Theorem 4. The functor V_S has a final coalgebra.

Proof. By Lemma 3 we have that each factor V_{R_i}^{L_i} of V_S is bounded, and hence possesses a final coalgebra Ω_{V_{R_i}^{L_i}} by Theorem 2. It follows that V_S has a final coalgebra Ω_S as well. Writing [[·]]_S^X for the final morphism of a V_S-coalgebra X into Ω_S, we have

Ω_S = Ω_{V_{R_1}^{L_1}} × ··· × Ω_{V_{R_n}^{L_n}}   and   [[·]]_S^X = [[·]]_X^{V_{R_1}^{L_1}} × ··· × [[·]]_X^{V_{R_n}^{L_n}},

as can be straightforwardly shown.
Since the functor V_S of a FuTS has a final coalgebra, we can speak of the behavioral equivalence ≈_S induced by V_S. Next we establish, for a FuTS S, the correspondence of S-bisimulation ∼_S as given by Definition 2 and behavioral equivalence ≈_S.

Theorem 5. Let S = (S, ⟨θ_i⟩_{i=1}^n) be a FuTS over the label sets L_i and semirings R_i, i = 1...n. Then s_1 ∼_S s_2 ⇔ s_1 ≈_S s_2, for all s_1, s_2 ∈ S.
Latella, Massink & De Vink
Proof. Let s_1, s_2 ∈ S. We first prove s_1 ∼_S s_2 ⇒ s_1 ≈_S s_2. So, assume s_1 ∼_S s_2. Let R ⊆ S × S be an S-bisimulation with R(s_1, s_2). Put θ = θ_1 × ··· × θ_n. Note that (S, θ) is a V_S-coalgebra. We turn the collection of equivalence classes S/R into a V_S-coalgebra (S/R, θ_R) by putting

θ_R^i([s]_R)(ℓ)([t]_R) = Σ_{t' ∈ [t]_R} θ_i(s)(ℓ)(t')   and   θ_R = θ_R^1 × ··· × θ_R^n
for s, t ∈ S, ℓ ∈ L_i, i = 1...n. This is well-defined since R is an S-bisimulation: if R(s, s') then we have Σ_{t' ∈ [t]_R} θ_i(s)(ℓ)(t') = Σ_{t' ∈ [t]_R} θ_i(s')(ℓ)(t'). The canonical mapping ε_R : S → S/R is a V_S-homomorphism: for i = 1...n, ℓ ∈ L_i and t ∈ S, we have both

FS(ε_R, R_i)^{L_i}(θ_i(s))(ℓ)([t]_R) = Σ_{t' ∈ [t]_R} θ_i(s)(ℓ)(t')   and   θ_R^i([s]_R)(ℓ)([t]_R) = Σ_{t' ∈ [t]_R} θ_i(s)(ℓ)(t')
Thus, FS(ε_R, R_i)^{L_i} ∘ θ_i = θ_R^i ∘ ε_R. Since V_S(ε_R) = ∏_{i=1}^n FS(ε_R, R_i)^{L_i}, it follows that ε_R is a V_S-homomorphism. Therefore, by uniqueness of a final morphism, we have [[·]]_S^S = [[·]]_S^{S/R} ∘ ε_R. In particular, [[s_1]]_S^S = [[s_2]]_S^S since ε_R(s_1) = ε_R(s_2). Thus, s_1 ≈_S s_2.

For the reverse, i.e. s_1 ≈_S s_2 ⇒ s_1 ∼_S s_2, assume s_1 ≈_S s_2, i.e. [[s_1]]_S^S = [[s_2]]_S^S. Since the map [[·]]_S^S : (S, θ) → (Ω_S, ω_S) is a V_S-homomorphism, the relation R_S with R_S(s', s'') ⇔ [[s']]_S^S = [[s'']]_S^S is an S-bisimulation: Suppose R_S(s', s''), i.e. s' ≈_S s'', for some s', s'' ∈ S. Write θ_{Ω_S} = θ_{Ω_S}^1 × ··· × θ_{Ω_S}^n. Pick 1 ≤ i ≤ n, ℓ ∈ L_i, t ∈ S. Put [[t]]_S^S = w ∈ Ω_S. Let [t]_S denote the equivalence class of t in R_S. Then
Σ_{t' ∈ [t]_S} θ_i(s')(ℓ)(t')
= Σ_{t' ∈ ([[·]]_S^S)^{-1}(w)} θ_i(s')(ℓ)(t')    (by definition of R_S and w)
= θ_{Ω_S}^i([[s']]_S^S)(ℓ)(w)                    ([[·]]_S^S is a V_S-homomorphism)
= θ_{Ω_S}^i([[s'']]_S^S)(ℓ)(w)                   (s' ≈_S s'' by assumption)
= Σ_{t' ∈ ([[·]]_S^S)^{-1}(w)} θ_i(s'')(ℓ)(t')   ([[·]]_S^S is a V_S-homomorphism)
= Σ_{t' ∈ [t]_S} θ_i(s'')(ℓ)(t')                 (by definition of R_S and w)
Thus, if R_S(s', s'') then Σ_{t' ∈ [t]_S} θ_i(s')(ℓ)(t') = Σ_{t' ∈ [t]_S} θ_i(s'')(ℓ)(t') for all i = 1...n, t ∈ S, ℓ ∈ L_i, and R_S is an S-bisimulation. Since [[s_1]]_S^S = [[s_2]]_S^S, it follows that R_S(s_1, s_2). Thus R_S is an S-bisimulation relating s_1 and s_2. In conclusion, it holds that s_1 ∼_S s_2.
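The class-sum condition used in the proof can be made concrete with a small Python sketch (not from the paper; the encoding and names are ours). Each transition-to-function map θ_i is a dictionary from states to labeled weight functions, and an equivalence relation, given as a partition, is an S-bisimulation when related states assign the same total weight to every equivalence class, for every relation and every label:

```python
# Sketch (hypothetical encoding): theta maps a state to {label: {state: weight}}.

def class_sum(cont, block):
    """Sum the continuation function cont over an equivalence class `block`."""
    return sum(cont.get(t, 0) for t in block)

def is_bisimulation(thetas, partition):
    for block in partition:                      # states related by R
        states = list(block)
        for s, s2 in zip(states, states[1:]):    # pairwise within a class
            for theta in thetas:
                labels = set(theta[s]) | set(theta[s2])
                for l in labels:
                    for target_block in partition:
                        if class_sum(theta[s].get(l, {}), target_block) != \
                           class_sum(theta[s2].get(l, {}), target_block):
                            return False
    return True

theta = {'p': {'a': {'q': 1.0, 'r': 1.0}},
         'p2': {'a': {'q': 2.0}},
         'q': {}, 'r': {}}
# q and r identified: p and p2 both reach the class {q, r} with total weight 2.
assert is_bisimulation([theta], [{'p', 'p2'}, {'q', 'r'}])
```

Refining the partition so that q and r fall into distinct classes breaks the condition, since p spreads its weight over both while p2 concentrates it on q.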
5 FuTS Semantics of PEPA

Next we consider a significant fragment of the process algebra PEPA [17], including the parallel operator implementing the scheme of so-called minimal apparent rates, and provide a FuTS semantics for it. We show that PEPA's notion of equivalence =_pepa, called strong equivalence in [17], fits with the bisimilarity ∼_pepa as arising from the FuTS semantics.

Definition 5. The set P_PEPA of PEPA processes is given by the BNF

P ::= nil | (a, λ).P | P + P | P ∥_A P | X

where a ranges over the set of actions A, λ over R_{>0}, A over the set of finite subsets of A, and X over the set of constants X. •

PEPA, like many other stochastic process algebras (e.g. [15, 4]), couples actions and rates. The prefix (a, λ) of the process (a, λ).P expresses that the duration of the execution of the action a ∈ A is sampled from an exponential distribution of rate λ. The parallel composition P ∥_A Q of a process P and a process Q
Bisimulation of FuTS
(NIL)    nil ↝_{δ_a} []_{R≥0}

(RAPF1)  (a, λ).P ↝_{δ_a} [P ↦ λ]

(RAPF2)  (a, λ).P ↝_{δ_b} []_{R≥0}   (b ≠ a)

(CHO)    from P ↝_{δ_a} 𝒫 and Q ↝_{δ_a} 𝒬, infer P + Q ↝_{δ_a} 𝒫 + 𝒬

(CNS)    from P ↝_{δ_a} 𝒫 and X := P, infer X ↝_{δ_a} 𝒫

(PAR1)   from P ↝_{δ_a} 𝒫 and Q ↝_{δ_a} 𝒬, with a ∉ A, infer P ∥_A Q ↝_{δ_a} (𝒫 ∥_A X_Q) + (X_P ∥_A 𝒬)

(PAR2)   from P ↝_{δ_a} 𝒫 and Q ↝_{δ_a} 𝒬, with a ∈ A, infer P ∥_A Q ↝_{δ_a} arf(𝒫, 𝒬) · (𝒫 ∥_A 𝒬)
Figure 2: FuTS semantics for PEPA.

for a set of actions A ⊆ A allows for the independent, asynchronous execution of actions of P and Q not occurring in the subset A, on the one hand, and requires the simultaneous, synchronized execution of P and Q for the actions occurring in A, on the other hand. The FuTS semantics of the fragment of PEPA that we consider here is given by the SOS of Figure 2, on which we comment below. Characteristic for the PEPA language is the choice to model parallel composition, or cooperation in the terminology of PEPA, scaled by the minimum of the so-called apparent rates. By doing so, PEPA's strong equivalence becomes a congruence [17]. Intuitively, the apparent rate r_a(P) of an action a for a process P is the sum of the rates of all possible a-executions for P. When considering the CSP-style parallel composition P ∥_A Q, with cooperation set A, an action a occurring in A has to be performed by both P and Q. The rate of such an execution is governed by the slowest, on average, of the two processes in this respect.² Thus r_a(P ∥_A Q) for a ∈ A is the minimum min{ r_a(P), r_a(Q) }. Now, if P schedules an execution of a with rate r_1 and Q schedules a transition of a with rate r_2, in the minimal apparent rate scheme the combined execution yields the action a with rate r_1 · r_2 · arf(P, Q). Here, the 'syntactic' scaling factor arf(P, Q), the apparent rate factor, is defined by

arf(P, Q) = min{ r_a(P), r_a(Q) } / ( r_a(P) · r_a(Q) )
assuming r_a(P), r_a(Q) > 0, and arf(P, Q) = 0 otherwise. Thus, for P ∥_A Q the minimum min{ r_a(P), r_a(Q) } of the apparent rates is adjusted by the relative probabilities r_1/r_a(P) and r_2/r_a(Q) for executing a by P and Q, respectively. See [17, Definition 3.3.1] (or the appendix) for an explicit definition of the apparent rate r_a of a PEPA process. The FuTS we consider for the semantics of PEPA in Figure 2 involves a set of labels ∆ defined by ∆ = { δ_a | a ∈ A }. The symbol δ_a denotes the execution of the action a, with a duration that is still to be established. The underlying semiring for the simple FuTS for PEPA is the semiring R_{≥0} of nonnegative reals.

Definition 6. The FuTS S_pepa = (P_PEPA, ↝) over ∆ and R_{≥0} has its transition relation given by the rules of Figure 2. •
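The minimal-apparent-rate computation can be sketched in Python (not from the paper; function names are ours), assuming the apparent rates of the two operands are already known:

```python
# Sketch: PEPA's apparent rate factor and the rate of one synchronized a-step,
# given r1, r2 (the scheduled rates) and ra_p = r_a(P), ra_q = r_a(Q).

def arf(ra_p, ra_q):
    """Apparent rate factor; 0 when either apparent rate vanishes."""
    if ra_p > 0 and ra_q > 0:
        return min(ra_p, ra_q) / (ra_p * ra_q)
    return 0.0

def coop_rate(r1, r2, ra_p, ra_q):
    """Rate of the synchronized a-step scheduled with rates r1 by P, r2 by Q."""
    return r1 * r2 * arf(ra_p, ra_q)

# P offers a at rates 1 and 3 (apparent rate 4), Q offers a at rate 2.
# The combined rates over all pairings sum to min{4, 2} = 2.
total = coop_rate(1, 2, 4, 2) + coop_rate(3, 2, 4, 2)
```

Summing `coop_rate` over all pairs of a-moves of P and Q recovers the minimum of the two apparent rates, which is exactly the intuition behind the scaling.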
We discuss the rules of Figure 2. The FuTS semantics provides nil ↝_{δ_a} []_{R≥0}, for every action a, with []_{R≥0} the 0-function λP.0 of R_{≥0}. However, the latter expresses θ_pepa(nil)(δ_a)(P') = 0 for every a ∈ A

² One cannot take the slowest process per sample, because such an operation cannot be expressed as an exponential distribution in general.
and P' ∈ P_PEPA, or, in standard terminology, that nil has no transition. For the rated action prefix (a, λ) we distinguish two cases: (i) execution of the prefix in rule (RAPF1); (ii) no execution of the prefix in rule (RAPF2). In the case of rule (RAPF1) the label δ_a signifies that the transition involves the execution of the action a. The continuation [P ↦ λ] is the function that assigns the rate λ to the process P. All other processes are assigned 0, i.e. the zero element of the semiring R_{≥0}. In the second case, rule (RAPF2), for labels δ_b with b ≠ a, we do have a transition, but it is a degenerate one. The two rules for the prefix, in particular having the 'null-continuation' rule (RAPF2), support the unified treatment of the choice operator in rule (CHO) and the parallel operator in rules (PAR1) and (PAR2). Note the semantic sum of functions 𝒫 + 𝒬 replacing the syntactic sum in P + Q. The treatment of constants is as usual. Regarding the parallel operator ∥_A, with respect to some subset of actions A ⊆ A, the so-called cooperation set, there are again two rules. Now the distinction is between interleaving and synchronization. In the case of a label δ_a involving an action a not in the subset A, either the P-operand or the Q-operand of P ∥_A Q makes progress. For example, the effect of the pattern 𝒫 ∥_A X_Q is that the value 𝒫(P') · 1 is assigned to a process P' ∥_A Q, the value 𝒫(P') · 0 = 0 to a process P' ∥_A Q' for some Q' ≠ Q, and the value 0 to a process not of the form P' ∥_A Q'. Here, as in all other rules, the right-hand sides of the transitions only involve functions in FS(P_PEPA, R_{≥0}) and operators on them.
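The effect of ∥_A lifted to continuation functions can be sketched in Python (a hypothetical encoding of ours, not the paper's): a process P' ∥_A Q is a tuple, the lifted operator obeys (𝒫 ∥_A 𝒬)(P' ∥_A Q') = 𝒫(P') · 𝒬(Q'), and the Dirac function X_Q is the dictionary {Q: 1}:

```python
# Sketch: continuation functions as dicts; processes absent from a dict
# implicitly carry weight 0, matching finite support.

def par(A, F, G):
    """Lift ||_A to functions: weight F(p) * G(q) on the pair p ||_A q."""
    return {('par', A, p, q): F[p] * G[q] for p in F for q in G}

def dirac(P):
    return {P: 1}

F = {'P1': 0.5, 'P2': 1.5}        # continuation of P for some action not in A
A = frozenset()
lifted = par(A, F, dirac('Q'))
# Exactly the processes P' ||_A Q receive the weight F(P'); all others get 0.
assert lifted == {('par', A, 'P1', 'Q'): 0.5, ('par', A, 'P2', 'Q'): 1.5}
```

The Dirac factor is what freezes the idle operand: pairing with `dirac('Q')` leaves all weight on states whose right component is Q.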
For the synchronization case of the parallel construct, assuming P ↝_{δ_a} 𝒫 and Q ↝_{δ_a} 𝒬, the 'semantic' scaling factor arf(𝒫, 𝒬) is applied to 𝒫 ∥_A 𝒬 (with ∥_A on FS(P_PEPA, R_{≥0}) induced by ∥_A on P_PEPA). This scaling factor, defined for functions in FS(P_PEPA, R_{≥0}), is given by

arf(𝒫, 𝒬) = min{ ⊕𝒫, ⊕𝒬 } / ( ⊕𝒫 · ⊕𝒬 )

provided ⊕𝒫, ⊕𝒬 > 0, and arf(𝒫, 𝒬) = 0 otherwise. For a process R = R_1 ∥_A R_2, this yields for arf(𝒫, 𝒬) · (𝒫 ∥_A 𝒬) the value arf(𝒫, 𝒬) · (𝒫 ∥_A 𝒬)(R_1 ∥_A R_2) = arf(𝒫, 𝒬) · 𝒫(R_1) · 𝒬(R_2). The following lemma establishes the relationship between the 'syntactic' and 'semantic' apparent rate factors defined on processes and on continuation functions, respectively.
Lemma 6. Let P ∈ P_PEPA and a ∈ A. Suppose P ↝_{δ_a} 𝒫. Then r_a(P) = ⊕𝒫.
The proof of the lemma is straightforward. It is also easy to prove, by guarded induction, that the FuTS S_pepa given by Definition 6 is total and deterministic. So, it is justified to write S_pepa = (P_PEPA, θ_pepa). We use ∼_pepa to denote the bisimilarity induced by S_pepa.

Lemma 7. The FuTS S_pepa is total and deterministic.
Example. To illustrate the ease of dealing with multiplicities in the FuTS semantics, consider the PEPA processes P_1 = (a, λ).P and P_2 = (a, λ).P + (a, λ).P for some P ∈ P_PEPA. We have P_1 ↝_{δ_a} [P ↦ λ] by rule (RAPF1), but P_2 ↝_{δ_a} [P ↦ 2λ] by rule (RAPF1) and rule (CHO). For the latter we compute [P ↦ λ] + [P ↦ λ], which equals [P ↦ 2λ]. Thus, in particular, we have P_1 ≁_pepa P_2. Intuitively it is clear that, in general, we cannot have P + P ∼ P for any reasonable quantitative process equivalence ∼ in the Markovian setting. Having twice as many a-labelled transitions, the average number of executions of the action a per time unit for (a, λ).P + (a, λ).P is double that for (a, λ).P. The standard operational semantics of PEPA [17, 18] is given in Figure 3. The transition relation → ⊆ P_PEPA × (A × R_{>0}) × P_PEPA is the least relation satisfying the rules. For a proper treatment of the rates, the transition relation is considered as a multi-transition system, where also the number of possible
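The pointwise sum of continuation functions used in the example can be sketched in Python (names ours):

```python
# Sketch: [P -> l] + [P -> l] = [P -> 2l], so multiplicities are summed
# rather than collapsed, and P1, P2 above get distinct continuations.

def fadd(F, G):
    """Pointwise sum of two finitely supported functions (dicts)."""
    out = dict(F)
    for k, v in G.items():
        out[k] = out.get(k, 0) + v
    return out

cont_P1 = {'P': 0.5}                    # (a, l).P  with l = 0.5
cont_P2 = fadd({'P': 0.5}, {'P': 0.5})  # (a, l).P + (a, l).P
assert cont_P2 == {'P': 1.0}
assert cont_P1 != cont_P2               # hence P1 and P2 are told apart
```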
[Figure 3: Standard operational semantics of PEPA — rules include (RAPF), (CHO1), (PAR1a) and (PAR1b), among others.]
[Figure 4: FuTS semantics for IML — rules include (APF3), (RPF1), (CHO), (PAR) and (CON).]

In view of our general correspondence result Theorem 5, the above theorem shows that PEPA's strong equivalence =_pepa is a behavioral equivalence, viz. behavioral equivalence ≈_pepa with respect to the functor of S_pepa, and that its standard, FuTS and coalgebraic semantics coincide.
6 FuTS Semantics of IML
In this section we provide a FuTS semantics for a relevant part of the language of IMC [13]. IMC, Interactive Markov Chains, are automata that combine two types of transitions: interactive transitions that involve the execution of actions, and Markovian transitions that represent the progress of time governed by an exponential distribution. As a consequence, IMC embody both nondeterministic and stochastic behaviour. System analysis using IMC proves to be a powerful approach because of the orthogonality of qualitative and quantitative dynamics, their logical underpinning and tool support. A number of equivalences, both strong and weak, are available for IMC [10]. In our treatment here, dealing with a fragment we call IML, we do not deal with internal τ-steps and focus on strong bisimulation.

Definition 8. The set P_IML of IML processes is given by the BNF

P ::= nil | a.P | λ.P | P + P | P ∥_A P | X

where a ranges over the set of actions A, λ over R_{>0}, A over the set of finite subsets of A, and X over the set of constants X. •

In IML there are separate prefix constructions for actions a.P and for time delays λ.P. No restriction is imposed on the alternative and parallel composition of processes. For example, we have the process P = a.λ.nil + µ.b.nil in IML. It should be noted that for IMC actions are considered to take no time.

Definition 9. The formal semantics of P_IML is given by the FuTS S_iml = (P_IML, ↝_1, ↝_2) over the label sets A and ∆ = { δ } and the semirings B and R_{≥0}, with transition relations ↝_1 ⊆ P_IML × A × FS(P_IML, B) and ↝_2 ⊆ P_IML × ∆ × FS(P_IML, R_{≥0}) defined as the least relations satisfying the rules of Figure 4. •

To accommodate action-based and delay-related transitions, the FuTS S_iml is non-simple, having the two transition-to-function relations ↝_1 and ↝_2. Actions a ∈ A decorate ↝_1; the special symbol δ decorates ↝_2.
Note rules (APF3) and (RPF1), which involve the null functions of R_{≥0} and of B, respectively, to express that a process a.P does not trigger a delay and a process λ.P does not execute an action. For the
[Figure: Standard operational semantics of IML — action transitions P →_a P' and delay transitions P ⇢_λ P'; rules include (APF), (CHO1), (CHO4), (PAR1a), (PAR1b), (PAR1c), (PAR2) and (CON2).]
Below we will use the functions T and R based on → and ⇢, cf. [16]. We have T : P_IML × A × 2^{P_IML} → B given by T(P, a, C) = true iff the set { P' ∈ C | P →_a P' } is nonempty, for all P ∈ P_IML, a ∈ A and any subset C ⊆ P_IML. For R : P_IML × P_IML → R_{≥0} we put R(P, P') = Σ{| λ | P ⇢_λ P' |}. Here, as is common for probabilistic and stochastic process algebras, the comprehension is over the multiset of transitions leading from P to P' with label λ. We extend R to P_IML × 2^{P_IML} by R(P, C) = Σ_{P' ∈ C} Σ{| λ | P ⇢_λ P' |}. For IML we have the following notion of strong bisimulation [13, 16] that we will compare with the notion of bisimulation associated with the FuTS S_iml.
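The functions T and R admit a direct Python sketch (a hypothetical encoding of ours), with the two transition relations given as lists so that the multiset comprehension is an honest sum over duplicates:

```python
# Sketch: act_trans is a list of (P, a, P') action transitions;
# delay_trans is a multiset, as a list, of (P, lambda, P') delay transitions.

def T(act_trans, P, a, C):
    """True iff P has an a-transition into the set C."""
    return any(p == P and act == a and q in C for (p, act, q) in act_trans)

def R(delay_trans, P, C):
    """Total delay rate from P into the set C (multiset sum)."""
    return sum(lam for (p, lam, q) in delay_trans if p == P and q in C)

act = [('P', 'a', 'Q')]
delay = [('P', 2.0, 'Q'), ('P', 3.0, 'Q'), ('P', 1.0, 'Q2')]
assert T(act, 'P', 'a', {'Q'}) and not T(act, 'P', 'b', {'Q'})
assert R(delay, 'P', {'Q', 'Q2'}) == 6.0   # 2.0 + 3.0 + 1.0
```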
Definition 10. An equivalence relation R ⊆ P_IML × P_IML is called a strong bisimulation for IML if, for all P_1, P_2 ∈ P_IML such that R(P_1, P_2), it holds that

• for all a ∈ A and Q ∈ P_IML: T(P_1, a, [Q]_R) ⟺ T(P_2, a, [Q]_R), and
• for all Q ∈ P_IML: R(P_1, [Q]_R) = R(P_2, [Q]_R).

Two processes P_1, P_2 ∈ P_IML are called strongly bisimilar if R(P_1, P_2) for a strong bisimulation R for IML, notation P_1 =_iml P_2. •

To establish the correspondence of FuTS bisimilarity ∼_iml for S_iml of Definition 9 and strong bisimilarity =_iml for IML, we need to connect the state-to-function relation ↝_1 and the transition relation →, as well as the state-to-function relation ↝_2 and the transition relation ⇢.

Lemma 11.
(a) Let P ∈ P_IML and a ∈ A. If P ↝_1^a 𝒫, then P →_a P' ⟺ 𝒫(P') = true, for all P' ∈ P_IML.
(b) Let P ∈ P_IML. If P ↝_2^δ 𝒫, then Σ{| λ | P ⇢_λ P' |} = 𝒫(P'), for all P' ∈ P_IML.
We are now in a position to relate FuTS bisimulation and standard strong bisimulation for IML.

Theorem 12. For any two processes P_1, P_2 ∈ P_IML it holds that P_1 ∼_iml P_2 iff P_1 =_iml P_2.

Proof. Let R be an equivalence relation on P_IML. Pick P ∈ P_IML, a ∈ A and choose any Q ∈ P_IML. Suppose P ↝_1^a 𝒫 and P ↝_2^δ 𝒬; thus θ_1(P)(a) = 𝒫 and θ_2(P)(δ) = 𝒬. Then we have

T(P, a, [Q]_R)
⟺ ∃Q' ∈ [Q]_R : P →_a Q'                (by definition of T)
⟺ ∃Q' ∈ [Q]_R : 𝒫(Q') = true            (by Lemma 11a)
⟺ Σ_{Q' ∈ [Q]_R} θ_1(P)(a)(Q') = true   (by definition of θ_1)

Note that summation in B is disjunction. Likewise, on the quantitative side, we have

R(P, [Q]_R)
= Σ_{Q' ∈ [Q]_R} Σ{| λ | P ⇢_λ Q' |}    (by definition of R)
= Σ_{Q' ∈ [Q]_R} 𝒬(Q')                  (by Lemma 11b)
= Σ_{Q' ∈ [Q]_R} θ_2(P)(δ)(Q')          (by definition of θ_2)

Combining the equations, we conclude that a strong bisimulation for IML is also a bisimulation for the FuTS S_iml, and vice versa. From this the theorem follows.

Again, as a corollary of the theorem above, we have for IML that its notion of strong bisimulation is coalgebraically underpinned, as it coincides, appealing to Theorem 5 once more, with behavioral equivalence of the functor V_iml induced by the FuTS S_iml. As a consequence, the standard, FuTS and coalgebraic semantics for IML are all equal.
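The remark that summation in B is disjunction is what lets one class-sum condition cover both clauses of Definition 10; a Python sketch (names ours) parametrizes the sum by the semiring:

```python
# Sketch: class sums over the Boolean semiring B (sum = or) and over the
# nonnegative reals (sum = +), using the same folding scheme.

def class_sum(cont, block, zero, plus):
    total = zero
    for t in block:
        total = plus(total, cont.get(t, zero))
    return total

bool_cont = {'Q1': True, 'Q2': False}   # theta_1(P)(a) over B
rate_cont = {'Q1': 2.0, 'Q2': 3.0}      # theta_2(P)(delta) over R>=0
block = {'Q1', 'Q2'}
assert class_sum(bool_cont, block, False, lambda x, y: x or y) is True
assert class_sum(rate_cont, block, 0.0, lambda x, y: x + y) == 5.0
```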
7 Concluding remarks
Total and deterministic labeled state-to-function transition systems, FuTS, are a convenient instrument to express the operational semantics of both qualitative and quantitative process languages. In this paper
we have introduced the notion of bisimulation that arises from a FuTS, possibly involving multiple transition relations. A correspondence result, Theorem 5, relates the bisimulation of a FuTS S to behavioral equivalence of the functor V_S that arises from the FuTS S as well. For two prototypical stochastic process languages, based on PEPA and on IMC, we have shown that the notion of stochastic bisimulation associated with these calculi coincides with the notion of bisimulation of the corresponding FuTS. Using these FuTS as a stepping stone, the correspondence result bridges between the concrete notions of bisimulation for PEPA and IMC and the coalgebraic notion of behavioral equivalence. Hence, from this perspective, the concrete notions are seen as the natural strong equivalences to consider. It is shown in [5], in the context of weighted automata, that in general functors of the type FS(·, R) may not preserve weak pullbacks and that, therefore, the notions of coalgebraic bisimulation and of behavioral equivalence may not coincide. Essential for the construction in their setting is the fact that a sum of nonzero weights may add up to weight 0. The same phenomenon prevents a general proof, along the lines of [28], that coalgebraic bisimulation and FuTS bisimulation coincide. In the construction of a mediating morphism, going from FuTS bisimulation to coalgebraic bisimulation, a denominator may be zero, hence a division undefined, in case the sum over an equivalence class cancels out. In the concrete case of [19], although no detailed proof is provided there, this cannot happen with R_{≥0} as underlying semiring. We expect that for semirings enjoying the property that a sum x = x_1 + ··· + x_n satisfies x = 0 iff x_i = 0 for all i = 1...n, we will be able to prove that pullbacks are weakly preserved, and hence that coalgebraic bisimulation and behavioral equivalence are the same.
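The obstruction described above is elementary; a two-line Python check (ours, not from the paper) makes it concrete:

```python
# Sketch: in a semiring with additive inverses (here the integers) nonzero
# weights can cancel, so a class sum can be 0 while its summands are not;
# in R>=0 only an all-zero family sums to 0, which the argument exploits.

weights_int = {'t1': 1, 't2': -1}
weights_pos = {'t1': 1.0, 't2': 1.0}
assert sum(weights_int.values()) == 0   # cancellation: division by the total fails
assert sum(weights_pos.values()) != 0   # no cancellation in R>=0
```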
Obviously, Milner-type strong bisimulation [21, 22] and bisimulation for FuTS over B coincide. Also, strong bisimulation of [17], involving, apart from the usual transfer conditions, the comparison of state information, viz. the apparent rates, can be treated with FuTS. Again the two notions of equivalence coincide. We expect to be able to deal with discrete time and so-called Markov automata as well. For dense time and general measures one may speculate that the use of functions of compact support with respect to a suitable topology may be fruitful. Future research needs to reveal under what algebraic conditions on the semirings, or similar structures, or under what coalgebraic conditions on the format of the functors involved, standard bisimulation, FuTS bisimulation, coalgebraic bisimulation and behavioral equivalence will amount to similar identifications.

Acknowledgments. The authors are grateful to Rocco De Nicola, Michele Loreti and Jan Rutten for fruitful discussions and useful suggestions. DL and MM acknowledge support by EU Project n. 257414 Autonomic Service-Components Ensembles (ASCENS) and by the CNR/RSTL Project XXL. This research has been conducted while EV was spending a sabbatical leave at CNR/ISTI. EV gratefully acknowledges the hospitality and support during his stay in Pisa.
References

[1] C. Baier, B.R. Haverkort, H. Hermanns, J.-P. Katoen & M. Siegle, editors (2004): Validation of Stochastic Systems – A Guide to Current Research. LNCS 2925, doi:10.1007/b98484.
[2] J.W. de Bakker & E.P. de Vink (1996): Control Flow Semantics. The MIT Press.
[3] M. Bernardo (2007): A Survey of Markovian Behavioral Equivalences. In M. Bernardo & J. Hillston, editors: SFM 2007 Advanced Lectures, LNCS 4486, pp. 180–219, doi:10.1007/978-3-540-72522-0_5.
[4] M. Bernardo & R. Gorrieri (1998): A tutorial on EMPA: a theory of concurrent processes with nondeterminism, priorities, probabilities and time. Theoretical Computer Science 202(1–2), pp. 1–54, doi:10.1016/S0304-3975(97)00127-8.
[5] F. Bonchi, M. Bonsangue, M. Boreale, J. Rutten & A. Silva (2011): A coalgebraic perspective on linear weighted automata. Technical Report SEN-1104, CWI. 31pp.
[6] M. Boreale (2009): Weighted Bisimulation in Linear Algebraic Form. In M. Bravetti & G. Zavattaro, editors: Proc. CONCUR 2009, LNCS 5710, pp. 163–177, doi:10.1007/978-3-642-04081-8_12.
[7] R. De Nicola, D. Latella, M. Loreti & M. Massink (2009): Rate-based Transition Systems for Stochastic Process Calculi. In S. Albers et al., editors: Proc. ICALP 2009, Part II, LNCS 5556, pp. 435–446, doi:10.1007/978-3-642-02930-1_36.
[8] R. De Nicola, D. Latella, M. Loreti & M. Massink (2011): State to function labelled transition systems: a uniform framework for defining stochastic process calculi. Technical Report ISTI-2011-TR-012, CNR/ISTI.
[9] R. De Nicola, D. Latella & M. Massink (2005): Formal modeling and quantitative analysis of Klaim-based mobile systems. In H. Haddad et al., editors: Proc. SAC 2005, ACM, pp. 428–435, doi:10.1145/1066677.1066777.
[10] C. Eisentraut, H. Hermanns & L. Zhang (2010): Concurrency and Composition in a Stochastic World. In P. Gastin & F. Laroussinie, editors: Proc. CONCUR 2010, LNCS 6269, pp. 21–39, doi:10.1007/978-3-642-15375-4_3.
[11] H.P. Gumm & T. Schröder (2001): Products of coalgebras. Algebra Universalis 46, pp. 163–185.
[12] H.P. Gumm & T. Schröder (2002): Coalgebras of bounded type. Mathematical Structures in Computer Science 12, pp. 565–578, doi:10.1017/S0960129501003590.
[13] H. Hermanns (2002): Interactive Markov Chains. LNCS 2428, doi:10.1007/3-540-45804-2.
[14] H. Hermanns, U. Herzog & J.-P. Katoen (2002): Process algebra for performance evaluation. Theoretical Computer Science 274(1–2), pp. 43–87, doi:10.1016/S0304-3975(00)00305-4.
[15] H. Hermanns, U. Herzog & V. Mertsiotakis (1998): Stochastic process algebras – between LOTOS and Markov chains. Computer Networks and ISDN Systems 30, pp. 901–924, doi:10.1016/S0169-7552(97)00133-5.
[16] H. Hermanns & J.-P. Katoen (2010): The How and Why of Interactive Markov Chains. In F.S. de Boer, M.M. Bonsangue, S. Hallerstede & M. Leuschel, editors: Proc. FMCO 2009, LNCS 6286, pp. 311–337, doi:10.1007/978-3-642-17071-3_16.
[17] J. Hillston (1996): A Compositional Approach to Performance Modelling. Distinguished Dissertations in Computer Science 12, Cambridge University Press.
[18] J. Hillston (2005): Process Algebras for Quantitative Analysis. In: Proc. LICS 2005, Chicago, IEEE, pp. 239–248, doi:10.1109/LICS.2005.35.
[19] B. Klin & V. Sassone (2008): Structural Operational Semantics for Stochastic Process Calculi. In R.M. Amadio, editor: Proc. FoSSaCS 2008, LNCS 4962, pp. 428–442, doi:10.1007/978-3-540-78499-9_30.
[20] A. Kurz (2000): Logics for coalgebras and applications to computer science. Ph.D. thesis, LMU München.
[21] R. Milner (1980): A Calculus of Communicating Systems. LNCS 92, doi:10.1007/3-540-10235-3.
[22] D. Park (1981): Concurrency and Automata on Infinite Sequences. In: Proc. GI-Conference 1981, Karlsruhe, LNCS 104, pp. 167–183.
[23] C. Priami (1995): Stochastic π-calculus. The Computer Journal 38, pp. 578–589, doi:10.1093/comjnl/38.7.578.
[24] J.J.M.M. Rutten (2000): Universal coalgebra: a theory of systems. Theoretical Computer Science 249, pp. 3–80, doi:10.1016/S0304-3975(00)00056-6.
[25] A. Silva, F. Bonchi, M. Bonsangue & J. Rutten (2011): Quantitative Kleene coalgebras. Information and Computation 209(5), pp. 822–846, doi:10.1016/j.ic.2010.09.007.
[26] A. Sokolova (2011): Probabilistic systems coalgebraically: a survey. Theoretical Computer Science 412(38), pp. 5095–5110, doi:10.1016/j.tcs.2011.05.008.
[27] D. Turi & G.D. Plotkin (1997): Towards a Mathematical Operational Semantics. In: Proc. LICS 1997, Warsaw, IEEE, pp. 280–291, doi:10.1109/LICS.1999.782615.
[28] E.P. de Vink & J.J.M.M. Rutten (1999): Bisimulation for probabilistic transition systems: a coalgebraic approach. Theoretical Computer Science 221, pp. 271–293, doi:10.1016/S0304-3975(99)00035-3.
A Additional proofs
Additional proof for Section 2

Lemma 1. Let X be a set, R a semiring and ⊙ an injective binary operation on X. For ϕ, ψ ∈ FS(X, R) it holds that ⊕(ϕ + ψ) = ⊕ϕ + ⊕ψ and ⊕(ϕ ⊙ ψ) = (⊕ϕ) ∗ (⊕ψ).

Proof. We verify ⊕(ϕ ⊙ ψ) = (⊕ϕ) ∗ (⊕ψ) for arbitrary ϕ, ψ ∈ FS(X, R).

(⊕ϕ) ∗ (⊕ψ)
= Σ_{x_1 ∈ X} ϕ(x_1) ∗ Σ_{x_2 ∈ X} ψ(x_2)    (by definition of ⊕)
= Σ_{x_1, x_2 ∈ X} ϕ(x_1) ∗ ψ(x_2)           (by distributivity of ∗ over +)
= Σ_{x ∈ X, x = x_1 ⊙ x_2} ϕ(x_1) ∗ ψ(x_2)   (by injectivity of ⊙)
= Σ_{x ∈ X, x = x_1 ⊙ x_2} (ϕ ⊙ ψ)(x)        (by definition of ϕ ⊙ ψ)
= Σ_{x ∈ X} (ϕ ⊙ ψ)(x)                       (since (ϕ ⊙ ψ)(x) = 0 if x is not in the image of ⊙)
= ⊕(ϕ ⊙ ψ)

The fact that ⊕(ϕ + ψ) = ⊕ϕ + ⊕ψ follows directly from the definitions and commutativity of +.
Additional proof for Section 4

Lemma 3. Let L be a set of labels and R a semiring. Then the functor V_R^L on Set is bounded.

Proof. Consider the elements ν ∈ FS(N, R)^L as parametrized 'valuation' functions, and the elements σ ∈ X^N as 'selection' functions. The functor FS(N, R)^L × (·)^N : Set → Set is the product of the constant functor with value FS(N, R)^L, denoted id_{FS(N,R)^L}, and of the functor (·)^N. Define the mapping η : id_{FS(N,R)^L} × (·)^N → V_R^L by putting

η_X(ν, σ)(ℓ)(x) = Σ_{n ∈ σ^{-1}(x)} ν(ℓ)(n)

for ν ∈ FS(N, R)^L, σ ∈ X^N and ℓ ∈ L. For ℓ ∈ L and x ∈ X, the right-hand sum defining η_X(ν, σ)(ℓ)(x) exists, since ν(ℓ) : N → R is of finite support. Note that η_X(ν, σ)(ℓ) is of finite support too: if η_X(ν, σ)(ℓ)(x) ≠ 0, then by definition Σ{ ν(ℓ)(n) | n ∈ σ^{-1}(x) } ≠ 0, so ν(ℓ)(n) ≠ 0 for some n ∈ σ^{-1}(x). Thus, σ(n) = x for some n ∈ spt(ν(ℓ)). So spt( η_X(ν, σ)(ℓ) ) ⊆ { σ(n) | n ∈ spt(ν(ℓ)) } and spt( η_X(ν, σ)(ℓ) ) is finite. Next we verify that η : id_{FS(N,R)^L} × (·)^N → V_R^L is a natural transformation, i.e. we check that for f : X → Y it holds that FS(f, R)^L ∘ η_X = η_Y ∘ ( id_{FS(N,R)^L} × f^N ). For ν ∈ FS(N, R)^L and σ ∈ X^N we have, for ℓ ∈ L and y ∈ Y,

( FS(f, R)^L ∘ η_X )(ν, σ)(ℓ)(y)
= Σ_{x ∈ f^{-1}(y)} η_X(ν, σ)(ℓ)(x)
= Σ_{x ∈ f^{-1}(y)} Σ_{n ∈ σ^{-1}(x)} ν(ℓ)(n)
= Σ_{n ∈ (f ∘ σ)^{-1}(y)} ν(ℓ)(n)
= η_Y(ν, f ∘ σ)(ℓ)(y)
= η_Y( ( id_{FS(N,R)^L} × f^N )(ν, σ) )(ℓ)(y)
= ( η_Y ∘ ( id_{FS(N,R)^L} × f^N ))(ν, σ)(ℓ)(y)
Thus, FS(f, R)^L ∘ η_X = η_Y ∘ ( id_{FS(N,R)^L} × f^N ) and η : id_{FS(N,R)^L} × (·)^N → V_R^L is a natural transformation. Finally, we check that η_X : FS(N, R)^L × X^N → V_R^L(X) is surjective. Choose a set X and a mapping ϕ : L → FS(X, R). Say, spt(ϕ(ℓ)) = { x_1^ℓ, ..., x_{n(ℓ)}^ℓ }. Without loss of generality we assume X ≠ spt(ϕ(ℓ)) and pick x_0^ℓ ∈ X \ spt(ϕ(ℓ)). Define ν ∈ FS(N, R)^L by ν(ℓ)(n) = ϕ(ℓ)(x_n^ℓ) for n = 1...n(ℓ) and ν(ℓ)(n) = 0 otherwise. Define σ : N → X by σ(n) = x_n^ℓ for 1 ≤ n ≤ n(ℓ) and σ(n) = x_0^ℓ otherwise. Then we have

η_X⟨ν, σ⟩(ℓ)(x_i^ℓ) = Σ_{m ∈ σ^{-1}(x_i^ℓ)} ν(ℓ)(m) = ν(ℓ)(i) = ϕ(ℓ)(x_i^ℓ)
η_X⟨ν, σ⟩(ℓ)(x) = Σ_{m ∈ σ^{-1}(x)} ν(ℓ)(m) = Σ_{n ∈ N \ { 1, ..., n(ℓ) }} ν(ℓ)(n) = 0

for i = 1...n(ℓ) and x ∉ spt(ϕ(ℓ)). Thus η_X⟨ν, σ⟩(ℓ)(x) = ϕ(ℓ)(x) for all ℓ ∈ L and x ∈ X, so η_X⟨ν, σ⟩ = ϕ and η_X : FS(N, R)^L × X^N → V_R^L(X) is surjective.
Additional proofs for Section 5

Definition. [17, Definition 3.3.1] We put

r_a(nil) = 0
r_a((a, λ).P) = λ
r_a((b, λ).P) = 0   for b ≠ a
r_a(X) = r_a(P)   if X := P
r_a(P + Q) = r_a(P) + r_a(Q)
r_a(P ∥_A Q) = r_a(P) + r_a(Q)   if a ∉ A
r_a(P ∥_A Q) = min{ r_a(P), r_a(Q) }   if a ∈ A
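The clauses above translate directly into a recursive Python sketch (a hypothetical term encoding of ours, not the paper's):

```python
# Sketch: PEPA terms as tuples ('nil',), ('prefix', b, lam, P),
# ('choice', P, Q), ('par', A, P, Q), ('const', X); env holds X := P.

def ra(a, P, env=None):
    """Apparent rate of action a for process P, per [17, Definition 3.3.1]."""
    env = env or {}
    tag = P[0]
    if tag == 'nil':
        return 0.0
    if tag == 'prefix':
        _, b, lam, _ = P
        return lam if b == a else 0.0
    if tag == 'choice':
        return ra(a, P[1], env) + ra(a, P[2], env)
    if tag == 'par':
        _, A, Q1, Q2 = P
        r1, r2 = ra(a, Q1, env), ra(a, Q2, env)
        return min(r1, r2) if a in A else r1 + r2
    if tag == 'const':
        return ra(a, env[P[1]], env)

P = ('choice', ('prefix', 'a', 1.0, ('nil',)), ('prefix', 'a', 3.0, ('nil',)))
Q = ('prefix', 'a', 2.0, ('nil',))
assert ra('a', ('par', {'a'}, P, Q)) == 2.0   # min{1+3, 2}
```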
Lemma 6. Let P ∈ P_PEPA and a ∈ A. Suppose P ↝_{δ_a} 𝒫. Then ⊕𝒫 = r_a(P).

Proof. Guarded recursion. We treat the two cases for the parallel construct.

Case P = P_1 ∥_A P_2, a ∉ A. Suppose P_1 ↝_{δ_a} 𝒫_1, P_2 ↝_{δ_a} 𝒫_2. Then 𝒫 = (𝒫_1 ∥_A X_{P_2}) + (X_{P_1} ∥_A 𝒫_2). Therefore we have

⊕𝒫 = ⊕( (𝒫_1 ∥_A X_{P_2}) + (X_{P_1} ∥_A 𝒫_2) )
= ⊕(𝒫_1 ∥_A X_{P_2}) + ⊕(X_{P_1} ∥_A 𝒫_2)   (by Lemma 1)
= (⊕𝒫_1 · ⊕X_{P_2}) + (⊕X_{P_1} · ⊕𝒫_2)     (by Lemma 1)
= ⊕𝒫_1 + ⊕𝒫_2                               (since ⊕X_{P_1} = ⊕X_{P_2} = 1)
= r_a(P_1) + r_a(P_2)                        (by the induction hypothesis)
= r_a(P_1 ∥_A P_2)                           (by definition of r_a)

Case P = P_1 ∥_A P_2, a ∈ A. Suppose P_1 ↝_{δ_a} 𝒫_1, P_2 ↝_{δ_a} 𝒫_2. Then 𝒫 = arf(𝒫_1, 𝒫_2) · (𝒫_1 ∥_A 𝒫_2). If ⊕𝒫_1, ⊕𝒫_2 > 0 we have

⊕𝒫 = ⊕( arf(𝒫_1, 𝒫_2) · (𝒫_1 ∥_A 𝒫_2) )
= arf(𝒫_1, 𝒫_2) · ⊕𝒫_1 · ⊕𝒫_2                              (by Lemma 1)
= ( min{ ⊕𝒫_1, ⊕𝒫_2 } / (⊕𝒫_1 · ⊕𝒫_2) ) · ⊕𝒫_1 · ⊕𝒫_2     (by definition of arf)
= min{ ⊕𝒫_1, ⊕𝒫_2 }
= min{ r_a(P_1), r_a(P_2) }                                 (by the induction hypothesis)
= r_a(P_1 ∥_A P_2)                                          (by definition of r_a)
If ⊕𝒫_1 = 0 or ⊕𝒫_2 = 0, then arf(𝒫_1, 𝒫_2) = 0 by definition, and r_a(P_1) = 0 or r_a(P_2) = 0 by the induction hypothesis. Therefore we have ⊕𝒫 = arf(𝒫_1, 𝒫_2) · ⊕𝒫_1 · ⊕𝒫_2 = 0 as well as r_a(P_1 ∥_A P_2) = min{ r_a(P_1), r_a(P_2) } = 0. So, also now, ⊕𝒫 = r_a(P_1 ∥_A P_2). The other cases are straightforward, in the case of P_1 + P_2 also relying on Lemma 1.

Corollary. If P ↝_{δ_a} 𝒫 and Q ↝_{δ_a} 𝒬, then arf(P, Q) = arf(𝒫, 𝒬).

Proof. Direct from the definitions.

Lemma 8. Let P ∈ P_PEPA and a ∈ A. Suppose P ↝_{δ_a} 𝒫. Then it holds that 𝒫(P') = Σ{| λ | P →_{a,λ} P' |} for all P' ∈ P_PEPA.
Proof. Guarded induction on P. We only treat the cases for the parallel composition. Note that the operation ∥_A : P_PEPA × P_PEPA → P_PEPA with ∥_A(P_1, P_2) = P_1 ∥_A P_2 is injective. Recall, for 𝒫_1, 𝒫_2 ∈ FS(P_PEPA, R_{≥0}), we have (𝒫_1 ∥_A 𝒫_2)(P_1 ∥_A P_2) = 𝒫_1(P_1) · 𝒫_2(P_2).

Suppose a ∉ A. Assume P_1 ↝_{δ_a} 𝒫_1, P_2 ↝_{δ_a} 𝒫_2, P_1 ∥_A P_2 ↝_{δ_a} 𝒫. We distinguish three cases.

Case (I), P' = P_1' ∥_A P_2 with P_1' ≠ P_1. Then we have

Σ{| λ | P_1 ∥_A P_2 →_{a,λ} P' |}
= Σ{| λ | P_1 →_{a,λ} P_1' |}    (by rule (PAR1a))
= 𝒫_1(P_1')                      (by the induction hypothesis)
= 𝒫_1(P_1') · X_{P_2}(P_2)       (as X_{P_2}(P_2) = 1)
= (𝒫_1 ∥_A X_{P_2})(P_1' ∥_A P_2) + (X_{P_1} ∥_A 𝒫_2)(P_1' ∥_A P_2)   (definition of ∥_A on FS(P_PEPA, R_{≥0}); X_{P_1}(P_1') = 0)
= 𝒫(P')                          (by rule (PAR1))

Case (II), P' = P_1 ∥_A P_2' with P_2' ≠ P_2: similar. Case (III), P' = P_1 ∥_A P_2. Then we have

Σ{| λ | P_1 ∥_A P_2 →_{a,λ} P' |}
= Σ{| λ | P_1 →_{a,λ} P_1 |} + Σ{| λ | P_2 →_{a,λ} P_2 |}   (by rules (PAR1a) and (PAR1b))
= 𝒫_1(P_1) + 𝒫_2(P_2)            (by the induction hypothesis)
= (𝒫_1 ∥_A X_{P_2})(P_1 ∥_A P_2) + (X_{P_1} ∥_A 𝒫_2)(P_1 ∥_A P_2)   (definition of ∥_A on FS(P_PEPA, R_{≥0}); X_{P_1}(P_1) = X_{P_2}(P_2) = 1)
= 𝒫(P')                          (again by rule (PAR1))

Suppose a ∈ A. Assume P_1 ↝_{δ_a} 𝒫_1, P_2 ↝_{δ_a} 𝒫_2, P_1 ∥_A P_2 ↝_{δ_a} 𝒫. Without loss of generality, P' = P_1' ∥_A P_2' for suitable P_1', P_2' ∈ P_PEPA.

Σ{| λ | P_1 ∥_A P_2 →_{a,λ} P' |}
= Σ{| arf(P_1, P_2) · λ_1 · λ_2 | P_1 →_{a,λ_1} P_1', P_2 →_{a,λ_2} P_2' |}              (by rule (PAR2))
= arf(P_1, P_2) · Σ{| λ_1 | P_1 →_{a,λ_1} P_1' |} · Σ{| λ_2 | P_2 →_{a,λ_2} P_2' |}      (by distributivity)
= arf(P_1, P_2) · 𝒫_1(P_1') · 𝒫_2(P_2')           (by the induction hypothesis)
= arf(𝒫_1, 𝒫_2) · 𝒫_1(P_1') · 𝒫_2(P_2')           (by the corollary above)
= arf(𝒫_1, 𝒫_2) · (𝒫_1 ∥_A 𝒫_2)(P_1' ∥_A P_2')    (definition of ∥_A on FS(P_PEPA, R_{≥0}))
= 𝒫(P')                                           (by rule (PAR2))
The other cases are simpler and omitted here.
Additional proofs for Section 6

Lemma 11.
(a) Let P ∈ P_IML and a ∈ A. If P ↝_1^a 𝒫, then P →_a P' ⟺ 𝒫(P') = true, for all P' ∈ P_IML.
(b) Let P ∈ P_IML. If P ↝_2^δ 𝒫, then Σ{| λ | P ⇢_λ P' |} = 𝒫(P'), for all P' ∈ P_IML.

Proof. (a) Guarded induction. Let a ∈ A. We treat the typical cases λ.P and P_1 ∥_A P_2 for a ∉ A.

Case λ.P. Suppose λ.P ↝_1^a 𝒫. Then we have 𝒫 = []_B. Thus, both λ.P →_a P' holds for no P' ∈ P_IML, as no such transition is provided by →, and 𝒫(P') = false by definition of []_B, for all P' ∈ P_IML.

Case P_1 ∥_A P_2, a ∉ A. Suppose P_1 ↝_1^a 𝒫_1, P_2 ↝_1^a 𝒫_2 and P_1 ∥_A P_2 ↝_1^a 𝒫. Then it holds that 𝒫 = (𝒫_1 ∥_A X_{P_2}) + (X_{P_1} ∥_A 𝒫_2). Recall, for Q ∈ P_IML and X_Q ∈ FS(P_IML, B), X_Q(Q') = true iff Q' = Q, for Q' ∈ P_IML. We have

P_1 ∥_A P_2 →_a P'
⟺ ( P_1 →_a P_1' ∧ P' = P_1' ∥_A P_2 ) ∨ ( P_2 →_a P_2' ∧ P' = P_1 ∥_A P_2' )   (by analysis of →)
⟺ ( 𝒫_1(P_1') = true ∧ P' = P_1' ∥_A P_2 ) ∨ ( 𝒫_2(P_2') = true ∧ P' = P_1 ∥_A P_2' )   (by the induction hypothesis)
⟺ ( 𝒫_1(P_1') · X_{P_2}(P_2) = true ∧ P' = P_1' ∥_A P_2 ) ∨ ( X_{P_1}(P_1) · 𝒫_2(P_2') = true ∧ P' = P_1 ∥_A P_2' )   (by definition of X_{P_1} and X_{P_2})
⟺ ( (𝒫_1 ∥_A X_{P_2})(P_1' ∥_A P_2) = true ∧ P' = P_1' ∥_A P_2 ) ∨ ( (X_{P_1} ∥_A 𝒫_2)(P_1 ∥_A P_2') = true ∧ P' = P_1 ∥_A P_2' )   (by definition of ∥_A)
⟺ (𝒫_1 ∥_A X_{P_2})(P') = true ∨ (X_{P_1} ∥_A 𝒫_2)(P') = true   (by definition of ∥_A, X_{P_1} and X_{P_2})
⟺ ( (𝒫_1 ∥_A X_{P_2}) + (X_{P_1} ∥_A 𝒫_2) )(P') = true   (by definition of + on FS(P_IML, B))
⟺ 𝒫(P') = true
Latella, Massink & De Vink
The other cases are standard or similar and easier.

(b) Guarded induction. We treat the cases for µ.P and P1 ∥A P2.

Case µ.P. Assume P ⇝δ,2 𝒫. Suppose P = µ.P′. Then it holds that P admits a single delay transition, viz. P −µ→_d P′. Thus we have Σ{ λ | P −λ→_d P′ } = µ = [P′ ↦ µ](P′) = 𝒫(P′). Suppose P = µ.P″ for some P″ ≠ P′. Then we have Σ{ λ | P −λ→_d P′ } = 0 = [P″ ↦ µ](P′) = 𝒫(P′).
Case P1 ∥A P2. Assume P1 ⇝δ,2 𝒫1, P2 ⇝δ,2 𝒫2 and P1 ∥A P2 ⇝δ,2 𝒫. It holds that 𝒫 = (𝒫1 ∥A X_{P2}) + (X_{P1} ∥A 𝒫2). We calculate

Σ{ λ | P1 ∥A P2 −λ→_d P′ }
  = Σ{ λ | P1 −λ→_d P′1, P′ = P′1 ∥A P2 } + Σ{ λ | P2 −λ→_d P′2, P′ = P1 ∥A P′2 }    (by analysis of −→_d)
  = ( if P′ = P′1 ∥A P2 then Σ{ λ | P1 −λ→_d P′1 } else 0 end ) + ( if P′ = P1 ∥A P′2 then Σ{ λ | P2 −λ→_d P′2 } else 0 end )
  = ( if P′ = P′1 ∥A P2 then 𝒫1(P′1) else 0 end ) + ( if P′ = P1 ∥A P′2 then 𝒫2(P′2) else 0 end )    (by the induction hypothesis for P1 and P2)
  = (𝒫1 ∥A X_{P2})(P′) + (X_{P1} ∥A 𝒫2)(P′)    (by definition of ∥A, X_{P1}, X_{P2} and + on FS(P_IML, R≥0))
  = 𝒫(P′)
The remaining cases are left to the reader.
Decorated proofs for computational effects: States

Jean-Guillaume Dumas∗   Dominique Duval†   Laurent Fousse   Jean-Claude Reynaud

LJK, Université de Grenoble, France.
{Jean-Guillaume.Dumas, Dominique.Duval, Laurent.Fousse, Jean-Claude.Reynaud}@imag.fr
Abstract. The syntax of an imperative language does not mention explicitly the state, while its denotational semantics has to mention it. In this paper we show that the equational proofs about an imperative language may hide the state, in the same way as the syntax does.
Introduction

The evolution of the state of the memory in an imperative program is a computational effect: the state is never mentioned as an argument or a result of a command, whereas in general it is used and modified during the execution of commands. Thus, the syntax of an imperative language does not mention the state explicitly, while its denotational semantics has to mention it. This means that the state is encapsulated: its interface, which is made of the functions for looking up and updating the values of the locations, is separated from its implementation; the state cannot be accessed in any way other than through its interface. In this paper we show that equational proofs in an imperative language may also encapsulate the state: proofs can be performed without any knowledge of the implementation of the state. We will see that a naive approach (called "apparent") cannot deal with the updating of states, while this becomes possible with a slightly more sophisticated approach (called "decorated"). This is expressed in an algebraic framework relying on category theory. To our knowledge, the first categorical treatment of computational effects, using monads, is due to Moggi [Moggi 1991]. The examples proposed by Moggi include the side-effects monad T(A) = (A × St)^St, where St is the set of states. Later on, Plotkin and Power used Lawvere theories for dealing with the operations and equations related to computational effects. The Lawvere theory for the side-effects monad involves seven equations [Plotkin & Power 2002]. In Section 1 we describe the intended denotational semantics of states. Then in Section 2 we introduce three variants of the equational logic for formalizing the computational effects due to states: the apparent, decorated and explicit logics. This approach is illustrated in Section 3 by proving some of the equations from [Plotkin & Power 2002], using rules which do not mention any type of states.
1 Motivations
This section is made of three independent parts. Section 1.1 is devoted to the semantics of states, an example is presented in Section 1.2, and our logical framework is described in Section 1.3.

∗ This work is partly funded by the project HPAC of the French Agence Nationale de la Recherche (ANR 11 BS02 013).
† This work is partly funded by the project CLIMT of the French Agence Nationale de la Recherche (ANR 11 BS02 016).
U. Golas, T. Soboll (Eds.): Proceedings of ACCAT 2012. EPTCS 93, 2012, pp. 45–59, doi:10.4204/EPTCS.93.3

© J.-G. Dumas, D. Duval, L. Fousse & J.-C. Reynaud. This work is licensed under the Creative Commons Attribution-No Derivative Works License.
1.1 Semantics of states

This section deals with the denotational semantics of states, by providing a set-valued interpretation of the lookup and update operations. Let St denote the set of states and Loc the set of locations (also called variables or identifiers). For each location i, let Val_i denote the set of possible values for i.

For each location i there is a lookup function for reading the value of location i in the given state, without modifying this state: this corresponds to a function lookup_{i,1} : St → Val_i, or equivalently to a function lookup_i : St → Val_i × St such that lookup_i(s) = ⟨lookup_{i,1}(s), s⟩ for each state s. In addition, for each location i there is an update function update_i : Val_i × St → St for setting the value of location i to the given value, without modifying the values of the other locations in the given state. This is summarized as follows, for each i ∈ Loc: a set Val_i, two functions lookup_{i,1} : St → Val_i and update_i : Val_i × St → St, and Equations (1):

(1.1) ∀ a ∈ Val_i, ∀ s ∈ St, lookup_{i,1}(update_i(a, s)) = a,
(1.2) ∀ a ∈ Val_i, ∀ s ∈ St, lookup_{j,1}(update_i(a, s)) = lookup_{j,1}(s) for every j ∈ Loc, j ≠ i.

The state can be observed thanks to the lookup functions. We may consider the tuple ⟨lookup_{i,1}⟩_{i∈Loc} : St → ∏_{i∈Loc} Val_i. If this function is an isomorphism, then Equations (1) provide a definition of the update functions. In [Plotkin & Power 2002] an equational presentation of states is given, with seven equations: in Remark 1.1 these equations are expressed according to [Melliès 2010] and translated into our framework. We use the notations l_i = lookup_i : St → Val_i × St, l_{i,1} = lookup_{i,1} : St → Val_i and u_i = update_i : Val_i × St → St; in addition id_i : Val_i → Val_i and q_i : Val_i × St → St respectively denote the identity of Val_i and the projection, while perm_{i,j} : Val_j × Val_i × St → Val_i × Val_j × St permutes its first and second arguments.

Remark 1.1.
The equations in [Plotkin & Power 2002] can be expressed as the following Equations (2):

(2.1) Annihilation lookup-update. Reading the value of a location i and then updating the location i with the obtained value is just like doing nothing.
∀ i ∈ Loc, ∀ s ∈ St, u_i(l_i(s)) = s ∈ St

(2.2) Interaction lookup-lookup. Reading the same location i twice is the same as reading it once.
∀ i ∈ Loc, ∀ s ∈ St, l_i(q_i(l_i(s))) = l_i(s) ∈ Val_i × St

(2.3) Interaction update-update. Storing a value a and then a value a′ at the same location i is just like storing the value a′ in the location.
∀ i ∈ Loc, ∀ s ∈ St, ∀ a, a′ ∈ Val_i, u_i(a′, u_i(a, s)) = u_i(a′, s) ∈ St

(2.4) Interaction update-lookup. When one stores a value a in a location i and then reads the location i, one gets the value a.
∀ i ∈ Loc, ∀ s ∈ St, ∀ a ∈ Val_i, l_{i,1}(u_i(a, s)) = a ∈ Val_i

(2.5) Commutation lookup-lookup. The order of reading two different locations i and j does not matter.
∀ i ≠ j ∈ Loc, ∀ s ∈ St, (id_i × l_j)(l_i(s)) = perm_{i,j}((id_j × l_i)(l_j(s))) ∈ Val_i × Val_j × St

(2.6) Commutation update-update. The order of storing in two different locations i and j does not matter.
∀ i ≠ j ∈ Loc, ∀ s ∈ St, ∀ a ∈ Val_i, ∀ b ∈ Val_j, u_j(b, u_i(a, s)) = u_i(a, u_j(b, s)) ∈ St

(2.7) Commutation update-lookup. The order of storing in a location i and reading another location j does not matter.
∀ i ≠ j ∈ Loc, ∀ s ∈ St, ∀ a ∈ Val_i, l_j(u_i(a, s)) = (id_j × u_i)(perm_{j,i}(a, l_j(s))) ∈ Val_j × St

Proposition 1.2. Let us assume that ⟨l_{i,1}⟩_{i∈Loc} : St → ∏_{i∈Loc} Val_i is invertible. Then Equations (1) are equivalent to Equations (2).
Proof. It may be observed that (2.4) is exactly (1.1). In addition, (2.7) is equivalent to (1.2): indeed, (2.7) is equivalent to the conjunction of its projection on Val_j and its projection on St; the first one is l_{j,1}(u_i(a, s)) = l_{j,1}(s), which is (1.2), and the second one is u_i(a, s) = u_i(a, s). Equations (2.2) and (2.5) follow from q_i(l_i(s)) = s. For the remaining equations (2.1), (2.3) and (2.6), which return states, it is easy to check that, for each location k, by applying l_k to both members and using Equation (1.1) or (1.2) according to k, we get the same value in Val_k on both sides. Then Equations (2.1), (2.3) and (2.6) follow from the fact that ⟨l_{i,1}⟩_{i∈Loc} : St → ∏_{i∈Loc} Val_i is invertible.

Proposition 1.2 will be revisited in Section 3, where it will be proved that Equations (1) imply Equations (2) without ever mentioning the state explicitly in the proof.
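To make the lookup/update semantics and these checks concrete, here is a small set-valued sketch in C++ (our own illustration, not part of the paper's formal development): every location is int-valued, St is a finite map, and the names St, lookup1 and update are ours.

```cpp
#include <map>
#include <string>

// A state assigns a value to each location; all locations are int-valued
// here for simplicity (the paper allows a separate set Val_i per location i).
using Loc = std::string;
using St  = std::map<Loc, int>;

// lookup_{i,1} : St -> Val_i : read location i without modifying the state.
int lookup1(const St& s, const Loc& i) { return s.at(i); }

// update_i : Val_i x St -> St : set location i, leaving other locations unchanged.
St update(const Loc& i, int a, St s) { s[i] = a; return s; }
```

On this model Equations (1) hold by construction, and instances of Equations (2) such as (2.3) and (2.6) can be checked by comparing whole states; comparing states componentwise is legitimate here precisely because the state is the tuple of its lookup values, i.e., the invertibility hypothesis of Proposition 1.2 holds.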
1.2 Computational effects: an example

In an informal way, we consider that a computational effect occurs when there is an apparent mismatch, i.e., some lack of soundness, between the syntax and the denotational semantics of a language. For instance, in an object-oriented language, the state of an object does not appear explicitly as an argument or a result of any of its methods. In this section, as a toy example, we build a class BankAccount for managing (very simple!) bank accounts. We use the types int and void, and we assume that int is interpreted by the set of integers Z and void by a singleton {⋆}. In the class BankAccount, there is a method balance() which returns the current balance of the account and a method deposit(x) for the deposit of x Euros on the account. The deposit method is a modifier, which means that it can use and modify the state of the current account. The balance method is an inspector, or an accessor, which means that it can use the state of the current account but is not allowed to modify this state. In the object-oriented language C++, a method is called a member function; by default a member function is a modifier; when it is an accessor it is called a constant member function and the keyword const is used. So, the C++ syntax for declaring the member functions of the class BankAccount looks like:

int balance ( ) const ;
void deposit (int) ;

• Forgetting the keyword const, this piece of C++ syntax can be translated as a signature Bank_app, which we call the apparent signature (we use the word "apparent" in the sense of "seeming", i.e., "appearing as such but not necessarily so"):

Bank_app :  balance : void → int
            deposit : int → void

In a model (or algebra) of the signature Bank_app, the operations would be interpreted as functions:

[[balance]] : {⋆} → Z
[[deposit]] : Z → {⋆}

which clearly is not the intended interpretation.
• In order to get the right semantics, we may use another signature Bank_expl, which we call the explicit signature, with a new symbol state for the "type of states":

Bank_expl :  balance : state → int
             deposit : int × state → state
The intended interpretation is a model of the explicit signature Bank_expl, with St denoting the set of states of a bank account:

[[balance]] : St → Z
[[deposit]] : Z × St → St

So far, in this example, we have considered two different signatures. On the one hand, the apparent signature Bank_app is simple and quite close to the C++ code, but the intended semantics is not a model of Bank_app. On the other hand, the semantics is a model of the explicit signature Bank_expl, but Bank_expl is far from the C++ syntax: actually, the very nature of the object-oriented language is lost by introducing a "type of states". Let us now define a decorated signature Bank_deco, which is still closer to the C++ code than the apparent signature and which has a model corresponding to the intended semantics. The decorated signature is not exactly a signature in the classical sense, because there is a classification of its operations. This classification is provided by superscripts called decorations: the decorations (1) and (2) correspond respectively to the object-oriented notions of accessor and modifier.

Bank_deco :  balance^(1) : void → int
             deposit^(2) : int → void

The decorated signature is similar to the C++ code, with the decoration (1) corresponding to the keyword const. The apparent signature Bank_app may be recovered from Bank_deco by dropping the decorations. In addition, we claim that the intended semantics can be seen as a decorated model of this decorated signature: this will become clear in Section 2.3. In order to add to the signature constants of type int like 0, 1, 2, ... and the usual operations on integers, a third decoration is used: the decoration (0) for pure functions, that is, functions which neither inspect nor modify the state of the bank account. So, we add to the apparent and explicit signatures the constants 0, 1, ... : void → int and the operations +, −, ∗ : int × int → int, and we add to the decorated signature the pure constants 0^(0), 1^(0), .
. . : void → int and the pure operations +^(0), −^(0), ∗^(0) : int × int → int. For instance the C++ expressions deposit(7); balance() and 7 + balance() can be seen as the decorated terms:

balance^(1) ∘ deposit^(2) ∘ 7^(0)   and   +^(0) ∘ ⟨7^(0), balance^(1)⟩

which may be illustrated as:

void −7^(0)→ int −deposit^(2)→ void −balance^(1)→ int

and

void −⟨7^(0), balance^(1)⟩→ int × int −+^(0)→ int

These two decorated terms have different effects: the first one does modify the state while the second one is an accessor; however, both return the same integer. Let us introduce the symbol ∼ for the relation "same result, maybe distinct effects". Then:

balance^(1) ∘ deposit^(2) ∘ 7^(0)  ∼  +^(0) ∘ ⟨7^(0), balance^(1)⟩
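The BankAccount example might look as follows in actual C++ (a sketch of ours, not taken from the paper): the two decorated terms above correspond to the hypothetical functions run1 and run2, which return the same integer but leave the account in different states — precisely the weak equation ∼.

```cpp
class BankAccount {
    int state = 0;  // the hidden state: current balance in Euros
public:
    int  balance() const { return state; }  // accessor: decoration (1), keyword const
    void deposit(int x)  { state += x; }    // modifier: decoration (2)
};

// balance^(1) . deposit^(2) . 7^(0) : modifies the account, then reads it.
int run1(BankAccount& a) { a.deposit(7); return a.balance(); }

// +^(0) . <7^(0), balance^(1)> : reads the account, leaves it unchanged.
int run2(BankAccount& a) { return 7 + a.balance(); }
```

On a fresh account both functions return 7, but after run1 the stored balance is 7 while after run2 it is still 0: same result, distinct effects.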
1.3 Diagrammatic logics

In this paper, in order to deal with a relevant notion of morphisms between logics, we define a logic as a diagrammatic logic, in the sense of [Domínguez & Duval 2010]. For the purpose of this paper let us simply say that a logic L determines a category of theories T which is cocomplete, and that a morphism
of logics is a left adjoint functor, so that it preserves colimits. The objects of T are called the theories of the logic L. Quite often, T is a category of structured categories. The inference rules of the logic L describe the structure of its theories. When a theory Φ is generated by some presentation or specification Σ, a model of Σ with values in a theory Θ is a morphism M : Φ → Θ in T.
The monadic equational logic. For instance, and for future use in the paper, here is the way we describe the monadic equational logic L_meqn. In order to focus on the syntactic aspect of the theories, we use a congruence symbol "≡" rather than the equality symbol "=". Roughly speaking, a monadic equational theory is a sort of category where the axioms hold only up to congruence (in fact, it is a 2-category). Precisely, a monadic equational theory is a directed graph (its vertices are called objects or types and its edges are called morphisms or terms) with an identity term id_X : X → X for each type X and a composed term g ∘ f : X → Z for each pair of consecutive terms (f : X → Y, g : Y → Z); in addition it is endowed with equations f ≡ g : X → Y which form a congruence, that is, an equivalence relation on parallel terms compatible with composition; this compatibility can be split into two parts: substitution and replacement. In addition, the associativity and identity axioms hold up to congruence. These properties of the monadic equational theories can be described by a set of inference rules, as in Figure 1.
(id)        X  ⟹  id_X : X → X
(comp)      f : X → Y,  g : Y → Z  ⟹  g ∘ f : X → Z
(id-src)    f : X → Y  ⟹  f ∘ id_X ≡ f
(id-tgt)    f : X → Y  ⟹  id_Y ∘ f ≡ f
(assoc)     f : X → Y,  g : Y → Z,  h : Z → W  ⟹  h ∘ (g ∘ f) ≡ (h ∘ g) ∘ f
(≡-refl)    f  ⟹  f ≡ f
(≡-sym)     f ≡ g  ⟹  g ≡ f
(≡-trans)   f ≡ g,  g ≡ h  ⟹  f ≡ h
(≡-subs)    f : X → Y,  g1 ≡ g2 : Y → Z  ⟹  g1 ∘ f ≡ g2 ∘ f : X → Z
(≡-repl)    f1 ≡ f2 : X → Y,  g : Y → Z  ⟹  g ∘ f1 ≡ g ∘ f2 : X → Z
Figure 1: Rules of the monadic equational logic
Adding products to the monadic equational logic. In contrast with equational theories, the existence of products is not required in a monadic equational theory. However some specific products may exist. A product in a monadic equational theory T is "up to congruence", in the following sense. Let (Y_i)_{i∈I} be a family of objects in T, indexed by some set I. A product with base (Y_i)_{i∈I} is a cone (q_i : Y → Y_i)_{i∈I} such that for every cone (f_i : X → Y_i)_{i∈I} on the same base there is a term f = ⟨f_i⟩_{i∈I} : X → Y such that q_i ∘ f ≡ f_i for each i; in addition this term is unique up to congruence, in the sense that if g : X → Y is such that q_i ∘ g ≡ f_i for each i then g ≡ f. When I is empty, we get a terminal object 1, such that for every X there is an arrow ⟨⟩_X : X → 1 which is unique up to congruence. The corresponding inference rules are given in Figure 2. The quantification "∀i", or "∀i ∈ I", is a kind of "syntactic sugar": when occurring in the premises of a rule, it stands for a conjunction of premises.
When (q_i : Y → Y_i)_{i∈I} is a product:

(tuple)          (f_i : X → Y_i)_i  ⟹  ⟨f_i⟩_i : X → Y
(tuple-proj-i)   (f_i : X → Y_i)_i  ⟹  q_i ∘ ⟨f_j⟩_j ≡ f_i
(tuple-unique)   g : X → Y,  q_i ∘ g ≡ f_i for all i  ⟹  g ≡ ⟨f_j⟩_j

When 1 is a terminal type ("empty product"):

(final)          X  ⟹  ⟨⟩_X : X → 1
(final-unique)   g : X → 1  ⟹  g ≡ ⟨⟩_X
Figure 2: Rules for products
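As a concrete set-valued illustration of these product rules for I = {1, 2} (our own sketch, with hypothetical helper names): a cone f1 : X → Y1, f2 : X → Y2 of functions induces the tuple ⟨f1, f2⟩ into the product set Y1 × Y2, and composing with the projections recovers f1 and f2.

```cpp
#include <utility>

// A binary product of sets: Y = Y1 x Y2 with projections q1, q2.
using X  = int;
using Y1 = int;
using Y2 = char;

Y1 q1(const std::pair<Y1, Y2>& y) { return y.first; }
Y2 q2(const std::pair<Y1, Y2>& y) { return y.second; }

// The tuple <f1, f2> : X -> Y1 x Y2 induced by a cone f1 : X -> Y1, f2 : X -> Y2.
template <typename F1, typename F2>
auto tuple(F1 f1, F2 f2) {
    return [f1, f2](X x) { return std::make_pair(f1(x), f2(x)); };
}
```

The rule (tuple-proj-i) then holds by construction: q_i applied to the tuple gives back f_i, and any g with this property agrees with the tuple pointwise (tuple-unique).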
2 Three logics for states
In this section we introduce three logics for dealing with states as computational effects. This generalizes the example of the bank account in Section 1.2. We present first the explicit logic (close to the semantics), then the apparent logic (close to the syntax), and finally the decorated logic and the morphisms from the decorated logic to the apparent and the explicit ones. In the syntax of an imperative language there is no type of states (the state is "hidden"), while the interpretation of this language involves a set of states St. More precisely, if the types X and Y are interpreted as the sets [[X]] and [[Y]], then each term f : X → Y is interpreted as a function [[f]] : [[X]] × St → [[Y]] × St. In Moggi's paper introducing monads for effects [Moggi 1991] such a term f : X → Y is called a computation, and whenever the function [[f]] is [[f]]_0 × id_St for some [[f]]_0 : [[X]] → [[Y]] then f is called a value. We keep this distinction, using modifier and pure term instead of computation and value, respectively. In addition, an accessor (or inspector) is a term f : X → Y that is interpreted by a function [[f]] = ⟨[[f]]_1, q_X⟩ for some [[f]]_1 : [[X]] × St → [[Y]], where q_X : [[X]] × St → St is the projection. It follows that every pure term is an accessor and every accessor is a modifier. We will respectively use the decorations (0), (1) and (2), written as superscripts, for pure terms, accessors and modifiers. Moreover, we distinguish two kinds of equations: when f, g : X → Y are parallel terms, a strong equation f ≡ g is interpreted as the equality [[f]] = [[g]] : [[X]] × St → [[Y]] × St, while a weak equation f ∼ g is interpreted as the equality p_Y ∘ [[f]] = p_Y ∘ [[g]] : [[X]] × St → [[Y]], where p_Y : [[Y]] × St → [[Y]] is the projection. Clearly, strong and weak equations coincide on accessors and on pure terms, while they differ on modifiers.
As in Section 1.1, we consider some given set of locations Loc and for each location i a set Vali of possible values for i. The set of states is defined as St = ∏i∈Loc Vali , and the projections are denoted by lookupi,1 : St → Vali . For each location i, let updatei : Vali × St → St be defined by Equations (1) as in Section 1.1. In order to focus on the fundamental properties of states as effects, the three logics for states are based on the “poor” monadic equational logic (as described in Section 1.3).
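Before formalizing the three logics, the interpretations above can be sketched in C++ (a toy model of ours, with a single int-valued location, so St = int): a modifier is any function [[X]] × St → [[Y]] × St, an accessor has the shape ⟨[[f]]_1, q_X⟩, a pure term the shape [[f]]_0 × id_St, and weak equality compares only the result component while strong equality compares result and state. All names below are our own.

```cpp
#include <utility>

using St = int;  // toy state: a single int-valued location
using X  = int;  // [[X]]
using Y  = int;  // [[Y]]

// The interpretation of a term f : X -> Y is a function [[X]] x St -> [[Y]] x St.
using Interp = std::pair<Y, St> (*)(X, St);

// Pure term: [[f]] = [[f]]_0 x id_St -- neither reads nor writes the state.
std::pair<Y, St> twice(X x, St s)    { return {2 * x, s}; }

// Accessor: [[f]] = <[[f]]_1, q_X> -- may read the state but returns it unchanged.
std::pair<Y, St> addState(X x, St s) { return {x + s, s}; }

// Modifier: may also change the state component.
std::pair<Y, St> store(X x, St s)    { return {x + s, x}; }

// Weak equation f ~ g at (x, s): equal results, state components may differ.
bool weakEq(Interp f, Interp g, X x, St s)   { return f(x, s).first == g(x, s).first; }

// Strong equation f == g at (x, s): equal results and equal final states.
bool strongEq(Interp f, Interp g, X x, St s) { return f(x, s) == g(x, s); }
```

Here addState ∼ store holds at every input while addState ≡ store fails whenever s ≠ x, matching the fact that weak and strong equations differ only on modifiers.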
2.1 The explicit logic for states

The explicit logic for states L_expl is a kind of "pointed" monadic equational logic: a theory Θ_expl for L_expl is a monadic equational theory with a distinguished object S, called the type of states, and with a product-with-S functor X × S. As in Section 1.2, the explicit logic provides the relevant semantics, but it is far from the syntax. The explicit theory for states State_expl is generated by a type V_i and an operation l_{i,1} : S → V_i for each location i, which form a product (l_{i,1} : S → V_i)_{i∈Loc}. Thus, for each location i there
is an operation u_i : V_i × S → S, unique up to congruence, which satisfies the equations below (where p_i : V_i × S → V_i and q_i : V_i × S → S are the projections):

State_expl :  operations  l_{i,1} : S → V_i ,  u_i : V_i × S → S
              product     (l_{i,1} : S → V_i)_{i∈Loc}
              equations   l_{i,1} ∘ u_i ≡ p_i : V_i × S → V_i ,
                          l_{j,1} ∘ u_i ≡ l_{j,1} ∘ q_i : V_i × S → V_j  for each j ≠ i
Let us define the explicit theory Set_expl as the category of sets with the equality as congruence and with the set of states St = ∏_{j∈Loc} Val_j as its distinguished set. The semantics of states, as described in Section 1.1, is the model M_expl : State_expl → Set_expl which maps the type V_i to the set Val_i for each i ∈ Loc, the type S to the set St, and the operations l_{i,1} and u_i to the functions lookup_{i,1} and update_i, respectively.
2.2 The apparent logic for states

The apparent logic for states L_app is the monadic equational logic (Section 1.3). As in Section 1.2, the apparent logic is close to the syntax but it does not provide the relevant semantics. The apparent theory for states State_app can be obtained from the explicit theory State_expl by identifying the type of states S with the unit type 1. So, there is in State_app a terminal type 1 and, for each location i, a type V_i for the possible values of i and an operation l_i : 1 → V_i for observing the value of i. A set-valued model for this part of State_app, with the constraint that for each i the interpretation of V_i is the given set Val_i, is made of an element a_i ∈ Val_i for each i (the image of the interpretation of l_i). Thus, such a model corresponds to a state, made of a value for each location; this is known as the states-as-models or states-as-algebras point of view [Gaudel et al. 1996]. In addition, it is assumed that in State_app the operations l_i form a product (l_i : 1 → V_i)_{i∈Loc}. This assumption implies that each l_i is an isomorphism, so that each V_i must be interpreted as a singleton: this does not fit with the semantics of states. However, we will see in Section 2.3 that this assumption becomes meaningful when decorations are added, in a similar way as in the bank example in Section 1.2. Formally, the assumption that (l_i : 1 → V_i)_{i∈Loc} is a product provides, for each location i, an operation u_i : V_i → 1, unique up to congruence, which satisfies the equations below (where id_i : V_i → V_i is the identity and ⟨⟩_i = ⟨⟩_{V_i} : V_i → 1):

State_app :  operations  l_i : 1 → V_i ,  u_i : V_i → 1
             product     (l_i : 1 → V_i)_{i∈Loc} with terminal type 1
             equations   l_i ∘ u_i ≡ id_i : V_i → V_i ,  l_j ∘ u_i ≡ l_j ∘ ⟨⟩_i : V_i → V_j  for each j ≠ i

At first view, these equations mean that after u_i(a) is executed, the value of i is put to a and the value of j (for j ≠ i) is unchanged.
However, as noted above, this intuition is not supported by the semantics in the apparent logic. Nevertheless, the apparent logic can be used for checking the validity of a decorated proof, as explained in Section 2.4.
2.3 The decorated logic for states

Now, as in Section 1.2, we introduce a third logic for states, which is close to the syntax and which provides the relevant semantics. It is defined by adding "decorations" to the apparent logic. A theory Θ_deco for the decorated logic for states L_deco is made of:
• A monadic equational theory Θ^(2). The terms in Θ^(2) may be called the modifiers and the equations f ≡ g may be called the strong equations.
Decorated proofs for computational effects: States
52
• Two additional monadic equational theories Θ^(0) and Θ^(1), with the same types as Θ^(2), such that Θ^(0) ⊆ Θ^(1) ⊆ Θ^(2) and the congruence on Θ^(0) and on Θ^(1) is the restriction of the congruence on Θ^(2). The terms in Θ^(1) may be called the accessors, and those in Θ^(0) the pure terms.
• A second equivalence relation ∼ between parallel terms in Θ^(2), which is only "weakly" compatible with composition: the relation ∼ satisfies the substitution property but only a weak version of the replacement property, called pure replacement: if f1 ∼ f2 : X → Y and g : Y → Z then in general g ∘ f1 ≁ g ∘ f2, except when g is pure. The relations f ∼ g are called the weak equations. It is assumed that every strong equation is a weak equation and that every weak equation between accessors is a strong equation, so that the relations ≡ and ∼ coincide on Θ^(0) and on Θ^(1).

We use the following notations, called decorations: a pure term f is denoted f^(0), an accessor f is denoted f^(1), and a modifier f is denoted f^(2); this last decoration is unnecessary, since every term is a modifier, but it may be used for emphasis. Figure 3 provides the decorated rules, which describe the properties of the decorated theories. For readability, decoration properties may be grouped with other properties: for instance, "f^(1) ∼ g^(1)" means "f^(1) and g^(1) and f ∼ g".
Rules of the monadic equational logic, and:

(0-id)       X  ⟹  id_X^(0) : X → X
(0-comp)     f^(0), g^(0)  ⟹  (g ∘ f)^(0)
(1-comp)     f^(1), g^(1)  ⟹  (g ∘ f)^(1)
(0-to-1)     f^(0)  ⟹  f^(1)
(≡-to-∼)     f ≡ g  ⟹  f ∼ g
(1-∼-to-≡)   f^(1) ∼ g^(1)  ⟹  f ≡ g
(∼-refl)     f  ⟹  f ∼ f
(∼-sym)      f ∼ g  ⟹  g ∼ f
(∼-trans)    f ∼ g,  g ∼ h  ⟹  f ∼ h
(∼-subs)     f : X → Y,  g1 ∼ g2 : Y → Z  ⟹  g1 ∘ f ∼ g2 ∘ f : X → Z
(0-∼-repl)   f1 ∼ f2 : X → Y,  g^(0) : Y → Z  ⟹  g ∘ f1 ∼ g ∘ f2 : X → Z
Figure 3: Rules of the decorated logic for states

Some specific kinds of products may be used in a decorated theory, for instance:
• A distinguished type 1 with the following decorated terminality property: for each type X there is a pure term ⟨⟩_X : X → 1 such that every modifier g : X → 1 satisfies g ∼ ⟨⟩_X. It follows from the properties of weak equations that 1 is a terminal type in Θ^(0) and in Θ^(1).
• An observational product with base (Y_i)_{i∈I} is a cone of accessors (q_i : Y → Y_i)_{i∈I} such that for every cone of accessors (f_i : X → Y_i)_{i∈I} on the same base there is a modifier f = ⟨f_i⟩_{i∈I} : X → Y such that q_i ∘ f ∼ f_i for each i; in addition this modifier is unique up to strong equations, in the sense that if g : X → Y is a modifier such that q_i ∘ g ∼ f_i for each i then g ≡ f. An observational product allows one to prove strong equations from weak ones: by looking at the results of some observations, thanks to the properties of the observational product, we get information on the state.
When 1 is a decorated terminal type:

(0-final)          X  ⟹  ⟨⟩_X^(0) : X → 1
(∼-final-unique)   g : X → 1  ⟹  g ∼ ⟨⟩_X

When (q_i^(1) : Y → Y_i)_i is an observational product:

(obs-tuple)          (f_i^(1) : X → Y_i)_i  ⟹  ⟨f_i⟩_i^(2) : X → Y
(obs-tuple-proj-i)   (f_i^(1) : X → Y_i)_i  ⟹  q_i ∘ ⟨f_j⟩_j ∼ f_i
(obs-tuple-unique)   g^(2) : X → Y,  q_i ∘ g ∼ f_i for all i  ⟹  g ≡ ⟨f_j⟩_j

Figure 4: Rules for some decorated products for states
The decorated theory of states State_deco is generated by a type V_i and an accessor l_i^(1) : 1 → V_i for each i ∈ Loc, which form an observational product (l_i^(1) : 1 → V_i)_{i∈Loc}. The modifiers u_i are defined (up to strong equations), using the property of the observational product, by the weak equations below:

State_deco :  operations  l_i^(1) : 1 → V_i ,  u_i^(2) : V_i → 1
              observational product  (l_i^(1) : 1 → V_i)_{i∈Loc} with decorated terminal type 1
              equations  l_i ∘ u_i ∼ id_i : V_i → V_i ,  l_j ∘ u_i ∼ l_j ∘ ⟨⟩_i : V_i → V_j  for each j ≠ i
The decorated theory of sets Set_deco is built from the category of sets, as follows. There is in Set_deco a type for each set, a modifier f^(2) : X → Y for each function f : X × St → Y × St, an accessor f^(1) : X → Y for each function f : X × St → Y, and a pure term f^(0) : X → Y for each function f : X → Y, with the straightforward conversions. Let f^(2), g^(2) : X → Y correspond to f, g : X × St → Y × St. A strong equation f ≡ g is an equality f = g : X × St → Y × St, while a weak equation f ∼ g is an equality p ∘ f = p ∘ g : X × St → Y, where p : Y × St → Y is the projection. For each location i the projection lookup_{i,1} : St → Val_i corresponds to an accessor lookup_i^(1) : 1 → Val_i in Set_deco, so that the family (lookup_i^(1))_{i∈Loc} forms an observational product in Set_deco. We get a model M_deco of State_deco with values in Set_deco by mapping the type V_i to the set Val_i and the accessor l_i^(1) to the accessor lookup_i^(1), for each i ∈ Loc. Then for each i the modifier u_i^(2) is mapped to the modifier update_i^(2).
2.4 From decorated to apparent

Every decorated theory Θ_deco gives rise to an apparent theory Θ_app by dropping the decorations, which means that the apparent theory Θ_app is made of a type X for each type X in Θ_deco, a term f : X → Y for each modifier f : X → Y in Θ_deco (which includes the accessors and the pure terms), and an equation f ≡ g for each weak equation f ∼ g in Θ_deco (which includes the strong equations). Thus, the distinction between modifiers, accessors and pure terms disappears, as well as the distinction between weak and strong equations. Equivalently, the apparent theory Θ_app can be defined as the theory Θ^(2) together with an equation f ≡ g for each weak equation f ∼ g in Θ_deco which is not associated to a strong equation in Θ_deco (otherwise, it is already in Θ^(2)). Thus, a decorated terminal type in Θ_deco becomes a terminal type in Θ_app and an observational product (q_i^(1) : Y → Y_i)_i in Θ_deco becomes a product (q_i : Y → Y_i)_i in Θ_app. In the same way, each rule of the decorated logic is mapped to a rule of the apparent logic by dropping the decorations. This property can be used for checking a decorated proof in two steps, by checking on one side the undecorated proof and on the other side the decorations. This construction of Θ_app from Θ_deco, by dropping the decorations, is a morphism from L_deco to L_app, denoted F_app.
2.5 From decorated to explicit

Every decorated theory Θ_deco gives rise to an explicit theory Θ_expl by expanding the decorations, which means that the explicit theory Θ_expl is made of:
• A type X for each type X in Θ_deco; projections are denoted by p_X : X × S → X and q_X : X × S → S.
• A term f : X × S → Y × S for each modifier f : X → Y in Θ_deco, such that:
  – if f is an accessor then there is a term f_1 : X × S → Y in Θ_expl such that f = ⟨f_1, q_X⟩,
  – if moreover f is a pure term then there is a term f_0 : X → Y in Θ_expl such that f_1 = f_0 ∘ p_X : X × S → Y, hence f = ⟨f_0 ∘ p_X, q_X⟩ = f_0 × id_S in Θ_expl.
• An equation f ≡ g : X × S → Y × S for each strong equation f ≡ g : X → Y in Θ_deco.
• An equation p_Y ∘ f ≡ p_Y ∘ g : X × S → Y for each weak equation f ∼ g : X → Y in Θ_deco.
• A product (q_{i,1} : Y × S → Y_i)_i for each observational product (q_i^(1) : Y → Y_i)_i in Θ_deco.

This construction of Θ_expl from Θ_deco is a morphism from L_deco to L_expl, denoted F_expl and called the expansion. The expansion morphism makes explicit the meaning of the decorations, by introducing a "type of states" S. Thus, each modifier f^(2) gives rise to a term f which may use and modify the state, while whenever f^(1) is an accessor then f may use the state but is not allowed to modify it, and when moreover f^(0) is a pure term then f may neither use nor modify the state. When f^(2) ≡ g^(2) then f and g must return the same result and the same state; when f^(2) ∼ g^(2) then f and g must return the same result but maybe not the same state. We have seen that the semantics of states cannot be described in the apparent logic, but can be described both in the decorated logic and in the explicit logic. Recall that every morphism of logics is a left adjoint functor. This is the case for the expansion morphism F_expl : L_deco → L_expl: it is a left adjoint functor F_expl : T_deco → T_expl, and its right adjoint is denoted G_expl. In fact, it is easy to check that Set_deco = G_expl(Set_expl), and since State_expl = F_expl(State_deco) it follows that the decorated model M_deco : State_deco → Set_deco and the explicit model M_expl : State_expl → Set_expl are related by the adjunction F_expl ⊣ G_expl. This means that the models M_deco and M_expl are two different ways to formalize the semantics of states from Section 1.1. To conclude Section 2, the morphisms of logics F_app and F_expl are summarized in Figure 5.
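The expansion F_expl can be mimicked in code (our own sketch, with S = int and hypothetical helper names): an accessor given by f_1 : X × S → Y expands to ⟨f_1, q_X⟩ : X × S → Y × S, and a pure term given by f_0 : X → Y expands to f_0 × id_S.

```cpp
#include <functional>
#include <utility>

using S = int;  // the explicit type of states, a single int-valued location here

// Expand an accessor f_1 : X x S -> Y into f = <f_1, q_X> : X x S -> Y x S.
template <typename Xt, typename Yt>
std::function<std::pair<Yt, S>(Xt, S)> expandAccessor(std::function<Yt(Xt, S)> f1) {
    return [f1](Xt x, S s) { return std::make_pair(f1(x, s), s); };
}

// Expand a pure term f_0 : X -> Y into f = f_0 x id_S : X x S -> Y x S.
template <typename Xt, typename Yt>
std::function<std::pair<Yt, S>(Xt, S)> expandPure(std::function<Yt(Xt)> f0) {
    return [f0](Xt x, S s) { return std::make_pair(f0(x), s); };
}
```

In both cases the state component is passed through unchanged, which is exactly what the decorations (1) and (0) promise; a modifier has no such canonical expansion, since it may produce an arbitrary new state.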
3 Decorated proofs
The inference rules of the decorated logic Ldeco are now used for proving some of the Equations (2) (in Remark 1.1). All proofs in this section are performed in the decorated logic; for readability the identity and associativity rules (id-src), (id-tgt) and (assoc) are omitted. Some derived rules are proved in Section 3.1, then Equation (2.1) is proved in Section 3.2. In order to deal with the equations with two values as argument or as result, we use the semi-pure products introduced in [Dumas et al. 2011]; the rules for semi-pure products are recalled in Section 3.3, then all seven Equations (2) are expressed in the decorated logic and Equation (2.6) is proved in Section 3.4. Proving the other equations would be similar. We use as axioms the fact that li is an accessor and the weak equations in Statedeco (Section 2.3).
J.G. Dumas, D. Duval, L. Fousse & J.C. Reynaud
Θapp ←Fapp— Θdeco —Fexpl→ Θexpl

Θapp              | Θdeco                             | Θexpl
f : X → Y         | f : X → Y        (modifier)       | f : X × S → Y × S
f : X → Y         | f^(1) : X → Y    (accessor)       | f1 : X × S → Y
f : X → Y         | f^(0) : X → Y    (pure term)      | f0 : X → Y
f ≡ g : X → Y     | f ≡ g : X → Y    (strong equation)| f ≡ g : X × S → Y × S
f ≡ g : X → Y     | f ∼ g : X → Y    (weak equation)  | pY ∘ f ≡ pY ∘ g : X × S → Y

Figure 5: A span of logics for states
3.1 Some derived rules

Let us now derive some rules from the rules of the decorated logic (Figures 3 and 4).

(E1^(1))  f^(1) : X → 1  ⊢  f ≡ ⟨⟩X
(E2^(1))  f^(1) : X → 1,  g^(1) : X → 1  ⊢  f ≡ g
(E3^(1))  f^(1) : X → Y,  g^(1) : Y → 1,  h^(1) : X → 1  ⊢  g ∘ f ≡ h
(E4^(1))  f^(1) : 1 → X  ⊢  ⟨⟩X ∘ f ≡ id1
(E1^(0))  f^(0) : X → 1  ⊢  f ≡ ⟨⟩X
(E2^(0))  f^(0) : X → 1,  g^(0) : X → 1  ⊢  f ≡ g
(E3^(0))  f^(0) : X → Y,  g^(0) : Y → 1,  h^(0) : X → 1  ⊢  g ∘ f ≡ h
(E4^(0))  f^(0) : 1 → X  ⊢  ⟨⟩X ∘ f ≡ id1

Figure 6: Some derived rules in the decorated logic for states

Proof. The derived rules in the left part of Figure 6 can be proved as follows; the proofs of the rules in the right part are left to the reader.

(E1^(1)): By (0-final), ⟨⟩X^(0) : X → 1, hence ⟨⟩X^(1) by (0-to-1). By (∼-final-unique), f ∼ ⟨⟩X, and since both f and ⟨⟩X are accessors, (1-∼-to-≡) yields f ≡ ⟨⟩X.

(E2^(1)): By (E1^(1)), f ≡ ⟨⟩X and g ≡ ⟨⟩X; then (≡-sym) and (≡-trans) yield f ≡ g.

(E3^(1)): By (1-comp), (g ∘ f)^(1) : X → 1; then (E2^(1)) applied to g ∘ f and h yields g ∘ f ≡ h.

(E4^(1)): By (0-final) and (0-to-1), ⟨⟩X^(1) : X → 1, so (1-comp) gives (⟨⟩X ∘ f)^(1) : 1 → 1; by (0-id) and (0-to-1), id1^(1) : 1 → 1; then (E2^(1)) yields ⟨⟩X ∘ f ≡ id1.
Decorated proofs for computational effects: States
3.2 Annihilation lookup-update

It is easy to check that the decorated equation ui^(2) ∘ li^(1) ≡ id1^(0) gets expanded as an equation between endomaps of S which is interpreted as Equation (2.1) in Remark 1.1. Let us prove this decorated equation, using the axioms (for each location i) from Statedeco in Section 2.3:

(A0)  li^(1)
(A1)  li ∘ ui ∼ idi
(A2)  lj ∘ ui ∼ lj ∘ ⟨⟩i  for each j ≠ i

Proposition 3.1. For each location i, reading the value of location i and then updating the location i with the obtained value is just like doing nothing:

ui^(2) ∘ li^(1) ≡ id1^(0) : 1 → 1.

Proof. Let i be a location. Using the unicity property of the observational product, we have to prove that lk ∘ ui ∘ li ∼ lk : 1 → Vk for each location k.
• When k = i, the substitution rule for ∼ applied to (A1) li ∘ ui ∼ idi yields li ∘ ui ∘ li ∼ li.
• When k ≠ i, using the substitution rule for ∼ and the replacement rule for ≡ we get: from (A0) li^(1), rule (E4^(1)) gives ⟨⟩i ∘ li ≡ id1, hence (≡-repl) gives lk ∘ ⟨⟩i ∘ li ≡ lk, and (≡-to-∼) gives lk ∘ ⟨⟩i ∘ li ∼ lk. From (A2) lk ∘ ui ∼ lk ∘ ⟨⟩i, rule (∼-subs) gives lk ∘ ui ∘ li ∼ lk ∘ ⟨⟩i ∘ li. Finally (∼-trans) yields lk ∘ ui ∘ li ∼ lk.

Remark 3.2. At the top of the right branch in the proof above, the decoration (1) for li could not be replaced by (2). Indeed, from li^(2) we can derive the weak equation ⟨⟩i ∘ li ∼ id1, but this is not sufficient for deriving lk ∘ ⟨⟩i ∘ li ∼ lk by replacement, since lk is not pure.
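The content of the proposition can also be checked pointwise in SET. The sketch below is ours: it interprets li and ui on a concrete store (a Python dict, our choice of representation) and verifies the interpretations of Equation (2.1) and of the axioms (A1) and (A2).

```python
# Our concrete check of annihilation lookup-update (2.1) and of the
# axioms (A1), (A2) on a toy store with two locations.

def lookup(s, i):            # interpretation of l_i
    return s[i]

def update(s, i, v):         # interpretation of u_i
    return {**s, i: v}

store = {"i": 3, "j": 9}

# (2.1): u_i o l_i == id_S (strong: the whole state agrees)
assert update(store, "i", lookup(store, "i")) == store

# (A1): l_i o u_i ~ id_i (weak: the read-back value agrees)
assert lookup(update(store, "i", 5), "i") == 5

# (A2): l_j o u_i ~ l_j o <>_i (writing i does not change what j reads)
assert lookup(update(store, "i", 5), "j") == lookup(store, "j")
```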
3.3 Semi-pure products

Let Θdeco be a theory with respect to the decorated logic for states and let Θ^(0) be its pure part, so that Θ^(0) is a monadic equational theory. The product of two types X1 and X2 in Θdeco is defined as their product in Θ^(0) (it is a product up to strong equations, as in Section 1.1). The projections from X1 × X2 to X1 and X2 are respectively denoted by π1^(0) and π2^(0) when the types X1 and X2 are clear from the context. The product of two pure morphisms f1^(0) : X1 → Y1 and f2^(0) : X2 → Y2 is a pure morphism (f1 × f2)^(0) : X1 × X2 → Y1 × Y2 subject to the rules in Figure 7, which are the usual rules for products up to strong equations. Moreover, when X1 or X2 is 1 it can be proved in the usual way that the projections π1^(0) : X1 × 1 → X1 and π2^(0) : 1 × X2 → X2 are isomorphisms. The permutation perm^(0)_{X1,X2} : X1 × X2 → X2 × X1 is defined as usual by π1 ∘ perm_{X1,X2} ≡ π2 and π2 ∘ perm_{X1,X2} ≡ π1. The rules in Figure 7, which are symmetric in f1 and f2, cannot be applied to modifiers: indeed, the effect of building a pair of modifiers depends on the evaluation strategy. However, following
(0-prod)        f1^(0) : X1 → Y1,  f2^(0) : X2 → Y2  ⊢  (f1 × f2)^(0) : X1 × X2 → Y1 × Y2
(0-proj1)       f1^(0) : X1 → Y1,  f2^(0) : X2 → Y2  ⊢  π1 ∘ (f1 × f2) ≡ f1 ∘ π1
(0-proj2)       f1^(0) : X1 → Y1,  f2^(0) : X2 → Y2  ⊢  π2 ∘ (f1 × f2) ≡ f2 ∘ π2
(0-prod-unique) g^(0) : X1 × X2 → Y1 × Y2,  π1 ∘ g ≡ f1 ∘ π1,  π2 ∘ g ≡ f2 ∘ π2  ⊢  g ≡ f1 × f2

Figure 7: Rules for products of pure morphisms

[Dumas et al. 2011], we define the left semi-pure product of an identity idX and a modifier f : X2 → Y2 as a modifier idX ⋉ f : X × X2 → X × Y2 subject to the rules in Figure 8, which form a decorated version of the rules for products. Symmetrically, the right semi-pure product of a modifier f : X1 → Y1 and an identity idX is a modifier f ⋊ idX : X1 × X → Y1 × X subject to the rules symmetric to those in Figure 8.
(left-prod)        f^(2) : X2 → Y2  ⊢  (idX ⋉ f)^(2) : X × X2 → X × Y2
(left-proj1)       f^(2) : X2 → Y2  ⊢  π1 ∘ (idX ⋉ f) ∼ π1
(left-proj2)       f^(2) : X2 → Y2  ⊢  π2 ∘ (idX ⋉ f) ≡ f ∘ π2
(left-prod-unique) g^(2) : X × X2 → X × Y2,  π1 ∘ g ∼ π1,  π2 ∘ g ≡ f ∘ π2  ⊢  g ≡ idX ⋉ f

Figure 8: Rules for left semi-pure products

Let us add the rules for semi-pure products to the decorated logic for states. In the decorated theory of states Statedeco, let us assume that there are products Vi × Vj and Vi × 1 and 1 × Vj for all locations i and j. Then it is easy to check that the expansion of the decorated Equations (2)d below gets interpreted as Equations (2) in Remark 1.1. We use the simplified notations idi = id_{Vi} and ⟨⟩i = ⟨⟩_{Vi} and perm_{i,j} = perm_{Vi,Vj}. Equation (2.1)d has been proved in Section 3.2 and Equation (2.6)d will be proved in Section 3.4. The other equations can be proved in a similar way.

(2.1)d Annihilation lookup-update:  ∀ i ∈ Loc,  ui ∘ li ≡ id1 : 1 → 1
(2.2)d Interaction lookup-lookup:   ∀ i ∈ Loc,  li ∘ ⟨⟩i ∘ li ≡ li : 1 → Vi
(2.3)d Interaction update-update:   ∀ i ∈ Loc,  ui ∘ π2 ∘ (ui ⋊ idi) ≡ ui ∘ π2 : Vi × Vi → 1
(2.4)d Interaction update-lookup:   ∀ i ∈ Loc,  li ∘ ui ∼ idi : Vi → Vi
(2.5)d Commutation lookup-lookup:   ∀ i ≠ j ∈ Loc,  lj ∘ ⟨⟩i ∘ li ≡ perm_{j,i} ∘ li ∘ ⟨⟩j ∘ lj : 1 → Vi × Vj
(2.6)d Commutation update-update:   ∀ i ≠ j ∈ Loc,  uj ∘ π2 ∘ (ui ⋊ idj) ≡ ui ∘ π1 ∘ (idi ⋉ uj) : Vi × Vj → 1
(2.7)d Commutation update-lookup:   ∀ i ≠ j ∈ Loc,  lj ∘ ui ≡ π2 ∘ (idi ⋉ lj) ∘ (ui ⋊ idj) ∘ π1^(−1) : Vi → Vj
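In the explicit, set-theoretic reading, the left semi-pure product can be illustrated as follows (our sketch, with our function names): idX ⋉ f carries the first component along unchanged while the modifier f acts on the second component and on the state. The first projection then commutes only weakly, since f may have changed the state.

```python
# Our sketch of the left semi-pure product id_X |x f in SET:
# a modifier f : X2 x S -> Y2 x S is extended to
# id_X |x f : (X x X2) x S -> (X x Y2) x S.

def left_semipure(f):
    def idf(pair, s):
        x, x2 = pair
        y2, s2 = f(x2, s)
        return (x, y2), s2
    return idf

def update(i):               # u_i : V_i x S -> 1 x S, a modifier
    return lambda v, s: ((), {**s, i: v})

idf = left_semipure(update("j"))
res, s2 = idf((10, 20), {"i": 0, "j": 0})
# (left-proj2), a strong equation: pi_2 o (id |x f) == f o pi_2
assert (res[1], s2) == update("j")(20, {"i": 0, "j": 0})
# (left-proj1) holds only weakly: the pi_1 value agrees, the state does not
assert res[0] == 10
assert s2 != {"i": 0, "j": 0}
```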
3.4 Commutation update-update

Proposition 3.3. For all locations i ≠ j, the order of storing in the locations i and j does not matter:

uj^(2) ∘ π2^(0) ∘ (ui ⋊ idj)^(2) ≡ ui^(2) ∘ π1^(0) ∘ (idi ⋉ uj)^(2) : Vi × Vj → 1.

Proof. In order to avoid ambiguity, in this proof the projections from Vi × 1 are denoted π1,i and π2,i and the projections from 1 × Vj are denoted π1,j and π2,j, while the projections from Vi × Vj are denoted π1,i,j and π2,i,j. It follows from Section 3.3 that π1,i and π2,j are isomorphisms, while the derived rule (E1^(0)) implies that π2,i ≡ ⟨⟩i and π1,j ≡ ⟨⟩j. Using the unicity property of the observational product, we have to prove that lk ∘ uj ∘ π2,j ∘ (ui ⋊ idj) ∼ lk ∘ ui ∘ π1,i ∘ (idi ⋉ uj) for each location k.

• When k ≠ i, j, let us prove independently four weak equations (W1) to (W4):
(W1) From (A2) lk ∘ uj ∼ lk ∘ ⟨⟩j, rule (∼-subs) gives lk ∘ uj ∘ π2,j ∘ (ui ⋊ idj) ∼ lk ∘ ⟨⟩j ∘ π2,j ∘ (ui ⋊ idj).
(W2) By (E3^(0)), ⟨⟩j ∘ π2,j ≡ π1,j; by (≡-subs), ⟨⟩j ∘ π2,j ∘ (ui ⋊ idj) ≡ π1,j ∘ (ui ⋊ idj); by (right-proj1), π1,j ∘ (ui ⋊ idj) ≡ ui ∘ π1,i,j; then (≡-trans), (≡-repl) with lk, and (≡-to-∼) give lk ∘ ⟨⟩j ∘ π2,j ∘ (ui ⋊ idj) ∼ lk ∘ ui ∘ π1,i,j.
(W3) From (A2) lk ∘ ui ∼ lk ∘ ⟨⟩i, rule (∼-subs) gives lk ∘ ui ∘ π1,i,j ∼ lk ∘ ⟨⟩i ∘ π1,i,j.
(W4) By (E3^(0)), ⟨⟩i ∘ π1,i,j ≡ ⟨⟩_{Vi×Vj}; by (≡-subs) and (≡-to-∼), lk ∘ ⟨⟩i ∘ π1,i,j ∼ lk ∘ ⟨⟩_{Vi×Vj}.
Equations (W1) to (W4) together with the transitivity rule for ∼ give rise to the weak equation lk ∘ uj ∘ π2,j ∘ (ui ⋊ idj) ∼ lk ∘ ⟨⟩_{Vi×Vj}. A symmetric proof shows that lk ∘ ui ∘ π1,i ∘ (idi ⋉ uj) ∼ lk ∘ ⟨⟩_{Vi×Vj}. With the symmetry and transitivity rules for ∼, this concludes the proof when k ≠ i, j.

• When k = i, it is easy to prove that li ∘ ui ∘ π1,i ∘ (idi ⋉ uj) ∼ π1,i,j, as follows. From (A1) li ∘ ui ∼ idi, rule (∼-subs) gives li ∘ ui ∘ π1,i ∘ (idi ⋉ uj) ∼ π1,i ∘ (idi ⋉ uj); by (left-proj1), π1,i ∘ (idi ⋉ uj) ∼ π1,i,j; then (∼-trans) concludes.
Now let us prove that li ∘ uj ∘ π2,j ∘ (ui ⋊ idj) ∼ π1,i,j, as follows.
(W1′) By (E3^(0)), ⟨⟩j ∘ π2,j ≡ ⟨⟩_{1×Vj}, hence (≡-repl) gives li ∘ ⟨⟩j ∘ π2,j ≡ li ∘ ⟨⟩_{1×Vj}. From (A2) li ∘ uj ∼ li ∘ ⟨⟩j, rule (∼-subs) gives li ∘ uj ∘ π2,j ∼ li ∘ ⟨⟩j ∘ π2,j; then (≡-to-∼) and (∼-trans) give li ∘ uj ∘ π2,j ∼ li ∘ ⟨⟩_{1×Vj}, and (∼-subs) gives li ∘ uj ∘ π2,j ∘ (ui ⋊ idj) ∼ li ∘ ⟨⟩_{1×Vj} ∘ (ui ⋊ idj).
(W2′) By (E2^(0)), ⟨⟩_{1×Vj} ≡ π1,j; by (≡-subs), ⟨⟩_{1×Vj} ∘ (ui ⋊ idj) ≡ π1,j ∘ (ui ⋊ idj); by (right-proj1) and (≡-trans), ⟨⟩_{1×Vj} ∘ (ui ⋊ idj) ≡ ui ∘ π1,i,j; then (≡-repl) with li and (≡-to-∼) give li ∘ ⟨⟩_{1×Vj} ∘ (ui ⋊ idj) ∼ li ∘ ui ∘ π1,i,j.
(W3′) From (A1) li ∘ ui ∼ idi, rule (∼-subs) gives li ∘ ui ∘ π1,i,j ∼ π1,i,j.
Equations (W1′) to (W3′) and the transitivity rule for ∼ give rise to li ∘ uj ∘ π2,j ∘ (ui ⋊ idj) ∼ π1,i,j. With the symmetry and transitivity rules for ∼, this concludes the proof when k = i.

• The proof when k = j is symmetric to the proof when k = i.
Conclusion

In this paper, decorated proofs are used for proving properties of states. To our knowledge, such proofs are new. They can be expanded in order to get the usual proofs; however, decorated proofs are more concise and closer to the syntax, while in the expanded proofs the notion of effect is lost. This approach can be applied to other computational effects, like exceptions [Dumas et al. 2012a, Dumas et al. 2012b].
References

[Domínguez & Duval 2010] César Domínguez, Dominique Duval. Diagrammatic logic applied to a parameterization process. Mathematical Structures in Computer Science 20, p. 639-654 (2010). doi:10.1017/S0960129510000150.
[Dumas et al. 2011] Jean-Guillaume Dumas, Dominique Duval, Jean-Claude Reynaud. Cartesian effect categories are Freyd-categories. Journal of Symbolic Computation 46, p. 272-293 (2011). doi:10.1016/j.jsc.2010.09.008.
[Dumas et al. 2012a] Jean-Guillaume Dumas, Dominique Duval, Laurent Fousse, Jean-Claude Reynaud. A duality between exceptions and states. Mathematical Structures in Computer Science 22, p. 719-722 (2012). doi:10.1017/S0960129511000752.
[Dumas et al. 2012b] Jean-Guillaume Dumas, Dominique Duval, Laurent Fousse, Jean-Claude Reynaud. Adjunctions for exceptions. Submitted for publication (2012). arXiv:1207.1255.
[Gaudel et al. 1996] Marie-Claude Gaudel, Pierre Dauchy, Carole Khoury. A Formal Specification of the Steam-Boiler Control Problem by Algebraic Specifications with Implicit State. Formal Methods for Industrial Applications 1995. Springer-Verlag Lecture Notes in Computer Science 1165, p. 233-264 (1996). doi:10.1007/BFb0027240.
[Melliès 2010] Paul-André Melliès. Segal condition meets computational effects. LICS 2010. IEEE Computer Society, p. 150-159 (2010). doi:10.1109/LICS.2010.46.
[Moggi 1991] Eugenio Moggi. Notions of Computation and Monads. Information and Computation 93(1), p. 55-92 (1991). doi:10.1016/0890-5401(91)90052-4.
[Plotkin & Power 2002] Gordon D. Plotkin, John Power. Notions of Computation Determine Monads. FoSSaCS 2002. Springer-Verlag Lecture Notes in Computer Science 2303, p. 342-356 (2002). doi:10.1007/3-540-45931-6_24.
Characterizing Van Kampen Squares via Descent Data

Harald König
University of Applied Sciences FHDW Hannover, Germany
[email protected]

Uwe Wolter
Department of Informatics, University of Bergen, Norway
[email protected]

Michael Löwe
University of Applied Sciences FHDW Hannover, Germany
[email protected]
Categories in which cocones satisfy certain exactness conditions w.r.t. pullbacks are the subject of current research activities in theoretical computer science. Usually, exactness is expressed in terms of properties of the pullback functor associated with the cocone. Even in the case of non-exactness, researchers in model semantics and rewriting theory seek an elementary characterization of the image of this functor. In this paper we investigate this question in the special case where the cocone is a cospan, i.e. part of a Van Kampen square. The use of descent data as the dominant categorical tool yields two main results: a simple condition which characterizes the reachable part of the above-mentioned functor in terms of liftings of the involved equivalence relations and, as a consequence, a necessary and sufficient condition for a pushout to be a Van Kampen square, formulated in a purely algebraic manner.
1 Introduction
There is a considerable amount of theoretical work in software engineering and category theory that has frequently encountered the question whether the interplay of pushouts and pullbacks satisfies certain exactness conditions. There is ongoing research in classifying and characterizing categories in which colimits and pullbacks are reasonably related. A prominent example are adhesive categories [13], in which pushouts along monomorphisms are Van Kampen squares. However, this property can be formulated for any commutative square in the bottom of Figure 1 as follows: the functor PB, which takes σ ∈ C↓S and maps it to a rear pullback span by pulling back along a ∘ r = r ∘ a, has to be an equivalence of categories.
Figure 1: Van Kampen square

If the bottom square is a pushout, the following property is equivalent to this definition: in every commutative cube as in Figure 1 with two pullbacks as rear faces, the following equivalence holds: the top face is a pushout if and only if the front and right faces are pullbacks [18]. The category SET of sets and mappings between them as well as the category GRAPH of graphs(1) and graph morphisms are adhesive. In many applications (e.g. [4, 18]) it is sufficient to infer exactness
(1) i.e. directed graphs (V, E; s, t : E → V)
U. Golas, T. Soboll (Eds.): Proceedings of ACCAT 2012, EPTCS 93, 2012, pp. 61-81, doi:10.4204/EPTCS.93.4
© H. König, U. Wolter, M. Löwe
This work is licensed under the Creative Commons Attribution License.
for pushouts along monomorphisms only. However, it is well-known that already in SET there are many more Van Kampen squares than the ones where one participating morphism is monic. Additionally, several research topics have evolved where the implications of the Van Kampen property are needed in the case of non-monic a and r. An important example are diagrammatic specifications(2) in model driven engineering [3]: In Figure 2, there are specifications A, L, R, and S, each of which contains (data) types and directed relations between them, i.e. they are small graphs. Since the specifications require compositionality [6], it is important to investigate amalgamation, a simple and natural construction which provides the basis for compositionality. It is a method to uniquely and correctly compose interpretations of parts of an already composed specification. Formulated in indexed semantics this takes the form shown in the left diagram of Figure 2: there is a pushout of specification morphisms as top face together with interpretations τ and β with common part γ, i.e. β ∘ r = γ = τ ∘ a. Here the large graph SET serves as a "semantic universe". All arrows in Figure 2 are graph homomorphisms. A unique and correct amalgamation of interpretations for the indexed case comes quite naturally by constructing the unique mediating arrow from S to SET.
Figure 2: Indexed vs. fibred amalgamation of τ and β with common part γ

There is also a global view on the amalgamation procedure in the indexed setting: Let us denote the category of interpretations of a specification X by Alg(X)(3) and let V denote the usual forgetful functor along a specification morphism (e.g. the functor Vr : Alg(R) → Alg(L) is defined by Vr(β) = β ∘ r), cf. Figure 3. The Amalgamation Lemma [5] states that (2) is a pullback in the category CAT of categories if (1) is a pushout of specifications.
Figure 3: Amalgamation Lemma (indexed setting)

Note that the "philosophy" of semantic universes implies two important facts: On the one hand, elements of a set (objects) can be multiply interpreted (typed) (if e.g. γ(t1) ∩ γ(t2) ≠ ∅ for two different
(2) E.g. UML class diagrams or ER diagrams.
(3) We use this abbreviation, because the term "interpretation" is often substituted by the term "algebra".
nodes t1, t2 in L). On the other hand, one can determine all objects that are t-typed by considering γ(t). But reliable semantics for model-driven structures has to omit the "philosophy" of semantic universes, because in software environments each object possesses exactly one type and it should not be possible to determine the set of t-typed objects.(4) This mismatch requires the shift from indexed to fibred semantics [3]. In the fibred setting, interpretations are called instances and are formalized by objects of the slice categories C↓A, C↓L, C↓R, and C↓S. Forgetful functors are now "pulling-back" functors (e.g. the functor r∗ which constructs the pullback of (r, β), see the right part of Figure 2). This raises the question whether the amalgamation procedure smoothly carries over to the fibred setting, i.e. given two instances τ ∈ C↓A and β ∈ C↓R with common part γ, i.e. r∗β = γ = a∗τ, one wants to prove that the syntactical composition (pushout of a and r) is reflected on the instance level by a unique construction. The counterpart for correctness is the requirement to obtain an S-instance of C↓S such that its pullbacks along a and r yield β and τ, respectively, cf. Figure 2.
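In SET, the "pulling-back" forgetful functor r∗ can be computed explicitly. The following sketch is ours; the specification morphism r (x, y ↦ yx; z, w ↦ zw) is our assumed reading of the figures in Section 2, and the instance names are invented.

```python
# Our sketch of the pullback functor r* on instances in SET.
# An instance of a specification X is a typing map inst : Objects -> X.

def pullback_instance(r, beta):
    """r : L -> R (dict), beta : J -> R (dict).  Returns the instance
    r*(beta) : {(l, o) | r(l) = beta(o)} -> L, as a dict."""
    return {(l, o): l for l in r for o in beta if r[l] == beta[o]}

r = {"x": "yx", "y": "yx", "z": "zw", "w": "zw"}      # spec morphism L -> R
beta = {"o1": "yx", "o2": "zw"}                       # an R-instance
gamma = pullback_instance(r, beta)                    # an L-instance

# every R-object of type t is copied once for each l with r(l) = t
assert sorted(gamma) == [("w", "o2"), ("x", "o1"), ("y", "o1"), ("z", "o2")]
assert gamma[("x", "o1")] == "x"
```

The copying visible in the assertion is exactly the "production of copies" discussed for Example 1 below.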
2 The Reachability Problem
In contrast to indexed amalgamation, there are intrinsic difficulties in the fibred setting, because the given rear pullback span need not be in the image of PB. In other words, a reasonable construction on the instance level fails if and only if the pullback span is not reachable by PB. This is demonstrated in Example 1.

Example 1 In Figure 4, objects are denoted i:t; instances map objects to their types. a and r map according to the letters. i:t, j:s ∈ I are connected via dashed lines if r′(i:t) = r′(j:s). Dotted lines depict the kernel of a′. It can easily be computed that the two rear squares establish a pullback span in SET. However, the span is not reachable: On the one hand, pullback complements for the right and the front face with sets over S containing two elements will always yield a non-commutative top face. On the other hand, the pushout on the top face creates a C↓S-object (the mediator out of the pushout) whose domain is a singleton set. But pulling back this instance along r and a, respectively, does not yield τ and β, respectively.
Figure 4: Unreachable pullback span

These effects cannot occur in the indexed setting because multiple typing was allowed. To get rid of multiple typing, the transition from indexed to fibred semantics entails the production of copies. E.g. in the indexed setting it would be sufficient to let γ map each element of L to the set {1, 2}, whereas the fibred view requires to produce 4 copies of this 2-element set (yielding the set I in Figure 4). It is
(4) Consider conformance relations in standards of software engineering (e.g. UML object diagrams or MOF) [17].
well-known that indexed categories are related to fibrations via the Grothendieck construction [1, 19]. However, since the image of this construction is the category of split fibrations, all produced copies behave in a uniform way, as in the next example.

Example 2 In this example fibres are lifted in a uniform way. The pullback span is now reachable: it is isomorphic to PB(σ) where σ : {1:xyzw, 2:xyzw} → S.
Figure 5: Reachable pullback span

But if pullback spans are not results of the Grothendieck construction, we suffer from the enlarged degree of freedom for defining the relationship between fibres, i.e. the equivalence relations of a′ and r′ may chaotically be intertwined as in Example 1. Of course, fibred amalgamation is successful if the bottom square in the cube of Figure 2 is a Van Kampen square. In this case, one simply has to construct the pushout on top of the cube and can automatically deduce that this produces two pullbacks in front, as desired. Then the question arises how to detect whether a square is a Van Kampen square from properties of a and r only.

Example 3 In the pushout in Figure 6 neither a nor r is monic. Hence we cannot infer the Van Kampen property from the fact that SET is an adhesive category.
Figure 6: Van Kampen square?
Question 1 Can we find a feasible condition which characterizes reachability in terms of the rear pullback span only (even in the case that the bottom square is not a Van Kampen square)?

Question 2 Can we find a necessary and sufficient condition for a pushout to be a Van Kampen square in terms of the span (a, r) only?
It became evident that a comprehensive investigation has to be performed in a more abstract categorical environment, see also [15]. A good generalization are topoi [8], i.e. categories which have finite limits, are cartesian closed, and where the subobject functor is representable.(5) SET and GRAPH are topoi. Topoi are adhesive [14], i.e. pushouts along monomorphisms are Van Kampen squares. Thus the above questions are relevant only for the case where both a and r in Figure 2 have non-trivial kernel relations. It has turned out that Descent Theory [9] is a good tool for quantifying the interrelation of kernel pairs on a common domain. In Section 3 we describe descent data and point out its two main facets: on the one hand it describes algebraic structures, on the other hand it codes lifted equivalence relations in pullback squares. In Section 4 we introduce precise notions of reachability of pullback spans and of coherence of a pair of algebraic structures. Algebraic structures are coherent if they are reducts of a uniquely determined larger algebraic structure. Thus coherence is a local property in the sense that it can be checked by investigating the basic material only, whereas reachability is a global property which is hardly checkable. In the main contribution of this paper (Proposition 17) we prove reachability to be equivalent to coherence if the specification square is a pushout. This provides an answer to Question 1. This answer is formulated in a practical way in Theorem 19. It also yields an answer to Question 2 in Theorem 20, which is a surprising analogue of the Amalgamation Lemma in Figure 3. Unfortunately, Theorem 20 is still impractical in that we still have to investigate all pullback spans in order to decide the Van Kampen property. But we can show that a practical answer to Question 2 can be achieved if matters are restricted to sets and graphs (Section 5, Theorem 24).
As related work we want to mention that [14] prove topoi to be adhesive with similar methods (i.e. they use descent theory and some similar auxiliary results). Furthermore, [10] show that being a Van Kampen square in a category C is equivalent to saying that its embedding into a certain span category over C is a pushout. In contrast to this generalization to higher level structures, we aim at an elementary characterization which can be checked within C.
3 Descent Theory
In this section, we work in a general topos C. We will use the following notations: ObC and MorC denote objects and arrows of any category C, respectively; ↠ denotes epimorphisms; x ∈ C means x ∈ ObC. The application of a functor F to an object or an arrow x will be denoted without parentheses: Fx. For an arrow p of C we sometimes want pullbacks along p to be uniquely determined; thus we work with chosen pullbacks. The notation for the pullback functor p^∗ is given by the chosen pullback square with vertices E ×B A, A, E, B, with projections π1^α := p^∗α : E ×B A → E and π2^α : E ×B A → A, and with p : E → B and α : A → B, where (π1^α, π2^α) is the chosen pullback of (α, p) (emphasized by decorating projections with α). In an adjoint situation ⊣, η is the unit and ε the counit. If p : E → B is any arrow in a category with pullbacks and p_∗ : C↓E → C↓B is the post-composition functor, we have p_∗ ⊣ p^∗. The monad arising from this adjunction is (T^p, η^p, µ^p), i.e., T^p := p^∗ ∘ p_∗ : C↓E → C↓E, η^p := η, and µ^p := p^∗ ε p_∗.
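In SET, a chosen pullback can be computed explicitly. The following sketch is ours and fixes the usual choice E ×B A = {(e, a) | p(e) = α(a)} together with the two projections; all names are our own.

```python
# Our illustration of the chosen pullback E x_B A of p : E -> B and
# alpha : A -> B in SET, with projections pi1 (to E) and pi2 (to A).

def pullback(p, alpha):
    square = [(e, a) for e in p for a in alpha if p[e] == alpha[a]]
    pi1 = {ea: ea[0] for ea in square}
    pi2 = {ea: ea[1] for ea in square}
    return square, pi1, pi2

p = {"e1": "b1", "e2": "b1", "e3": "b2"}
alpha = {"a1": "b1", "a2": "b2"}
square, pi1, pi2 = pullback(p, alpha)
assert square == [("e1", "a1"), ("e2", "a1"), ("e3", "a2")]
# the square commutes: p o pi1 == alpha o pi2
assert all(p[pi1[x]] == alpha[pi2[x]] for x in pi1)
```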
(5) In the sequel, we assume the reader to possess a basic understanding of the notion of topos.
We intend to describe the categories des(p) of descent data, where p : E → B is an arrow in C. Grothendieck invented this theory in order to reason about structures in C↓B (which may be difficult) by reasoning about monadic algebraic structures over C↓E, thus in a sense "descending" along p [9]. We analyse the relationship between these algebraic structures and the category pb(p) of all pullbacks along p (to be defined precisely later on), so that it will facilitate our characterization of reachability in terms of descent data.
Definition 4 (Descent Data) Let C —γ→ E —p→ B be given and let (T^p, η^p, µ^p) be the monad on C↓E arising from the adjunction p_∗ ⊣ p^∗. Descent data for γ relative to p is an arrow
ξ : π1^{p∘γ} = T^p γ → γ of C↓E with

ξ ∘ η_γ^p = id_C  and  ξ ∘ T^p ξ = ξ ∘ µ_γ^p.
(1)
The situation is as in Figure 7.
C
ξ
z
ηγp
γ
u
/ E ×B C o T pγ
T pξ µγp
Co
E ×B (E ×B C)
p◦γ
E
ξ p◦γ
E ×B C
π2
(T p )2 γ
}
z
p◦T p γ
/B
p
Figure 7: Monadic Descent Data p◦γ
Besides the C ↓ Barrow π2 , the righthand side shows objects and the arrow ξ after applying the p◦γ leftadjoint p∗ only (p∗ is the identity on arrows of C ↓ E). Note that π2 establishes the counit of the adjunction p∗ a p∗ . Thus p◦γ p◦γ π2 ◦ ηγp = id and µγp = p∗ π2 . (2) Note that, for some γ and p, an arrow ξ as in Definition 4 must not exist and must not be unique. For p◦γ future reference, we note that the E ×B Cendomorphism ξ := hγ ◦ π2 , ξ i can reconstruct ξ via p◦γ
ξ = π2
◦ξ.
(3)
[11] gives a detailed investigation on that topic. It is also shown that ξ ◦ ξ = idE×BC .
(4)
Definition 5 (Category of Descent Data) The category des(p) has objects (γ, ξ ) with the properties of Definition 4 and arrows h : (γ, ξ ) → (γ 0 , ξ 0 ) the morphisms h : γ → γ 0 of C↓E with ξ 0 ◦ T p h = h ◦ ξ . p
/ B ∈ MorC let pb(p) denote the category with Definition 6 (Category of Pullbacks) For any E 6 objects commutative diagrams of arbitrary pullbacks along p together with morphism pairs (m1 , m2 ) ∈ MorC↓E × MorC↓B such that the rear square in Figure 8 commutes. Note that by the decomposition property of pullbacks the rear square is a pullback, too. 6
... not only chosen pullbacks ...
H. K¨onig, U. Wolter, M.L¨owe
67 q
C
/A
m1
γ
E
m2
C0
γ0 p
α
/B
α0
q0
/ A0
Figure 8: The category pb(p) The monoidal conditions (1) (neutrality and associativity) imply that des(p) is the EilenbergMoore Category associated with the monad T p . Thus, there is the comparison functor Φ p : C↓B → des(p) [1]. Obviously pb(p) is equivalent to C↓B via chosen pullbacks, such that we obtain a functor7 Φ p : pb(p) → des(p). In order to compute this functor, let us consider an arbitrary pullback (q, γ) of a cospan (p, α) in C, cf. p◦γ Fig. 9. Computing T p γ using the chosen pullback (p∗ p∗ γ = T p γ, π2 := π2 ) of (p ◦ γ, p) yields a unique ξ α : T p γ → γ such that q ◦ ξ α = q ◦ π2 . (5) From (2), (5), and the uniqueness of mediating morphisms for the original pullback one easily deduces ξ α ◦ ηγp = idC . p◦T p γ
Let π 2 := π2 , then T p ξ α : (T p )2 γ → T p γ is unique with ξ α ◦ π 2 = π2 ◦ T p ξ α , such that a similar argumentation together with the second equation in (2) and (5) yields ξ α ◦ T p ξ α = ξ α ◦ µγp . Hence ξ α fulfills (1). Thus the original pullback is mapped to (γ, ξ α ), an object of des(p). An investigation of the general construction of Φ p [1] shows that our mapping reflects this construction where Φ p (m1 , m2 ) = (m1 , T p m1 )
(6)
on arrows. In the sequel, (γ, ξ α ) (or just ξ α if γ is fixed) will be called canonical descent data for the pullback of α along p. p◦γ From Fig. 7, we obtain p ◦ γ ◦ π2 = p ◦ γ ◦ ξ for each (γ, ξ ) ∈ Obdes(p) . Hence there is a functor Ψ p : des(p) → C ↓ B which maps (γ, ξ ) to the unique arrow α, which mediates p ◦ γ and a chosen p◦γ coequalizer c of π2 and ξ (cf. also Fig. 10). [11] shows that i) Ψ p is leftadjoint to the comparison functor Φ p : C↓B → des(p) with monic counit and ii) if p is an epimorphism, Φ p becomes an equivalence of categories with pseudoinvers Ψ p . We use these facts to state the main result of this section. For this, we need some auxiliary considerations. The following statement is Lemma 20 in [14]: /· /· · Lemma 7 Let C be a topos and a commutative diagram be given with (1) (2) an epimorphism as indicated. If (1) + (2) and (1) are pullbacks, then //· /· · (2) is a pullback, too. 7
To simplify matters, we still use the name Φ p for this functor.
68
Characterizing Van Kampen Squares / E ×B C
π2
E ×B (E ×B C) T pξ α
µγp
ξα
E× C j B
π2
/C q
π2 ηγp
q
ξα q
.C
T pγ
(T p )2 γ
γ
/A
p◦γ
p◦T p γ
α
+%
p
E
/ B u
Figure 9: Canonical Descent Data Definition 8 (Equivalence Relation) An equivalence relation on A ∈ ObC is a pair of arrows a, b : U → A, such that U /
ha,bi
/ A × A is a monomorphism, and which is
1. reflexive: ∃r : A → U : a ◦ r = b ◦ r = id, 2. symmetric: ∃s : U → U : a ◦ s = b, b ◦ s = a, and 3. transitive: If (p : P → U, q : P → U) is the pullback of (a, b) (especially b ◦ p = a ◦ q), there is t : P → U, such that a ◦ t = a ◦ p and b ◦ t = b ◦ q. p◦γ
Lemma 9 ⟨ξ, π2^{p∘γ}⟩ : E ×B C → C × C establishes an equivalence relation.
Proof: Because ξ : π1^{p∘γ} = T^p γ → γ, it is not difficult to see that ⟨ξ, π2^{p∘γ}⟩ is monic. For reflexivity, let r := η_γ^p and use (1) and (2). Symmetry follows with s := ξ̄, (3), and (4). Transitivity can be established via t := µ_γ^p (using the commuting top square in Fig. 9 and (1)). □ Note that this implies that ⟨ξ, π2^{p∘γ}⟩ is the kernel pair of its coequalizer, because in topoi equivalence relations are effective (see [12], A 2.4.1.). Consider now the above introduced coequalizer construction for Ψ^p.
ξ
/C
p◦γ
π2
C
c
γ
c
//H
/E
α
p
/B
Figure 10: Coequalizer construction

Lemma 10 The right square in Figure 10 is a pullback. Hence, Ψ^p : des(p) → pb(p).(8)
Proof: By Lemma 9 and because equivalence relations are the kernel pair of their coequalizer, the left square in Figure 10 is a pullback. Because γ ∘ ξ = T^p γ and α ∘ c = p ∘ γ by definition of α, the outer rectangle in Figure 10 is the pullback of p ∘ γ and p as indicated in Figure 7. Since c is epic, the result follows from Lemma 7. □
⁸ Again not changing the name for this new functor.
H. König, U. Wolter, M. Löwe
Proposition 11 (Correspondence of Pullbacks and Descent Data)
a) For each choice of coequalizer in the construction of Ψ_p the unit of the adjunction Ψ_p ⊣ Φ_p : pb(p) → des(p) is the identity.
b) If p is an epimorphism, Φ_p : pb(p) → des(p) becomes an equivalence of categories. Moreover, the coequalizer in the construction of Ψ_p can be chosen such that the counit is identical.

Proof: We use the facts (i) and (ii) on page 67. For a) we use Lemma 10: If (γ, ξ) ∈ Ob_{des(p)}, then Φ_p Ψ_p(γ, ξ) is unique with (5) (with q replaced by c) by the above considerations on Φ_p. But the coequalizer construction also yields c ∘ ξ = c ∘ π₂^{p◦γ}, such that (γ, ξ) = Φ_p Ψ_p(γ, ξ); hence the unit is the identity. To prove b) consider an arbitrary pullback square sqr as in Figure 9. Pullbacks in topoi preserve epimorphisms ([8], 5.3); thus, using the isomorphic counit, it is easy to show that the diagram q ∘ ξ^α = q ∘ π₂^{p◦γ} in the upper right corner of Figure 9 establishes a coequalizer situation. Hence for this choice of coequalizer, Ψ_p Φ_p sqr = sqr, yielding an identical counit. □

For future reference, we want to illustrate these facts in the category SET. In the following proposition, the first part reformulates neutrality and associativity, whereas the nature of descent data as an equivalence relation (on C) becomes evident from the second part. For a detailed explanation of this proposition, the reader is referred to the Appendix.

Proposition 12 (Descent Data in SET) Let C = SET.
1. There is a bijective correspondence between objects (γ, ξ) of des(p) and families (ξ_{e,e′})_{(e,e′)∈ker(p)} of bijections ξ_{e,e′} : γ⁻¹e → γ⁻¹e′ which satisfy

ξ_{e,e} = id_{γ⁻¹e}   and   ξ_{e,e″} = ξ_{e′,e″} ∘ ξ_{e,e′}

for all (e, e′), (e′, e″) ∈ ker(p).
2. Let c be the coequalizer of ξ and π₂^{p◦γ}. Then ker(c) = {(x, ξ_{γ(x),γ(y)}(x)) | x, y ∈ C, (γ(x), γ(y)) ∈ ker(p)}.
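Proposition 12 can be checked mechanically on small instances. The sketch below builds a hypothetical example (two elements of E glued by p, two-element fibres; all names and data are illustrative assumptions, not taken from the paper's examples) and verifies neutrality, associativity, and the description of ker(c).

```python
# A small, hypothetical SET instance of Proposition 12: descent data for
# gamma relative to p as a ker(p)-indexed family of fibre bijections.

E = {'e1', 'e2'}
p = {'e1': 'b', 'e2': 'b'}                       # p identifies e1 and e2
C = {('1', 'e1'), ('2', 'e1'), ('1', 'e2'), ('2', 'e2')}
gamma = {x: x[1] for x in C}                     # second projection

ker_p = {(e, e2) for e in E for e2 in E if p[e] == p[e2]}
fibre = {e: {x for x in C if gamma[x] == e} for e in E}

def xi(e, e2, x):
    """The bijection xi_{e,e'} : fibre(e) -> fibre(e'), keeping the tag."""
    assert gamma[x] == e and (e, e2) in ker_p
    return (x[0], e2)

# neutrality and associativity (Proposition 12.1)
for e in E:
    for x in fibre[e]:
        assert xi(e, e, x) == x
for (e, e2) in ker_p:
    for (e2b, e3) in ker_p:
        if e2 == e2b:
            for x in fibre[e]:
                assert xi(e2, e3, xi(e, e2, x)) == xi(e, e3, x)

# Proposition 12.2: kernel of the coequalizer c of xi and the projection
ker_c = {(x, xi(gamma[x], gamma[y], x)) for x in C for y in C
         if (gamma[x], gamma[y]) in ker_p}
assert (('1', 'e1'), ('1', 'e2')) in ker_c
```

Here ker(c) glues each tagged element to its images in all fibres over the equivalence class of its base point, exactly as in part 2.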
4 Coherence and Van Kampen Squares
In this section we study the interplay of reachability of pullback spans and coherent coexistence of descent data in a general topos C. After having defined these two concepts precisely, we state a local criterion for reachability and a global characterization of Van Kampen squares in terms of coherent algebraic structures. Let a commuting square as in the bottom of Figure 11 be given.
[Figure 11 (diagram): a cube with top objects I, J, K, H over the bottom square L, A, R, S; bottom arrows a : L → A, r : L → R, r̄ : A → S, ā : R → S with s := ā ∘ r; instances γ over L, τ over A, β over R; top arrows r′, a′, s′ and the arrow σ, those marked "?" to be constructed; caption below.]
Figure 11: Reachability
Reachability: Because s = r̄ ∘ a, we can decompose any diagonal pullback in pb(s) into a left and a right part by calculating the right part via the chosen r̄*σ. This calculation of the left part of the pullback of pb(s) extends to a functor ∆_{r̄} : pb(s) → pb(a). From s = ā ∘ r, we obtain ∆_{ā} : pb(s) → pb(r) in the same way. Then we define

PB := ⟨∆_{r̄}, ∆_{ā}⟩ : pb(s) → pb(a) ×_{C↓L} pb(r),

where pb(a) ×_{C↓L} pb(r) is the category of all pullback spans over (a, r) together with morphism triples similar to the definition in Figure 8. The name clash of this functor with the functor PB in the introduction is deliberate: both functors are equal up to an equivalence of categories, because C↓S ≅ pb(s).

Definition 13 (Reachability) A pullback span in pb(a) ×_{C↓L} pb(r) is said to be reachable if it is in the image of PB up to a pb(a) ×_{C↓L} pb(r)-isomorphism.

Coherence: To investigate the counterpart of reachability on the instance level, we apply the methodology of Section 3 to the situation in Figure 11 in which the two back faces are pullbacks. Let f : L → B, g : B → S be any two arrows in C and let h := g ∘ f. We consider the pullbacks f*(f ∘ γ) and h*(h ∘ γ) as in Figure 7 (with C := I, E := L, and p : E → B replaced by f : L → B and h : L → S, resp.). Let π₂^{f◦γ}, π₂^{h◦γ} be the "second projections" in these pullbacks, resp.

For any γ ∈ C↓L we have h ∘ γ ∘ π₂^{f◦γ} = h ∘ T^f γ, thus there is a unique u^g_γ : L ×_B I → L ×_S I with

π₂^{h◦γ} ∘ u^g_γ = π₂^{f◦γ}   and   T^h γ ∘ u^g_γ = T^f γ,   (7)
cf. Figure 12.

[Figure 12 (diagram): the unique u^g_γ : L ×_B I ↪ L ×_S I with π₂^{f◦γ} and π₂^{h◦γ} into I, T^f γ and T^h γ into L, and h : L → S, h ∘ γ : I → S; caption below.]
Figure 12: Construction of Embedding

Note that in SET, T^f γ and T^h γ are first projections, which actually makes u^g invariant under projections: indeed

L ×_B I = {(l, i) | f(γ(i)) = f(l)} ⊆ {(l, i) | h(γ(i)) = h(l)} = L ×_S I,

where the embedding is u^g. This justifies the use of the hooked arrow in Figure 12. In this way, we obtain 5 embeddings for the original pushout situation: u^{r̄}_γ : L ×_A I → L ×_S I and u^{ā}_γ : L ×_R I → L ×_S I (using s = r̄ ∘ a and s = ā ∘ r instead of h = g ∘ f) as well as u^r_γ : L ×_L I → L ×_R I, u^a_γ : L ×_L I → L ×_A I, u^s_γ : L ×_L I → L ×_S I (using r = r ∘ id_L, a = a ∘ id_L, and s = s ∘ id_L) with corresponding projection compatibility and uniqueness as in (7). The uniqueness property easily yields compositionality:

∀ γ ∈ C↓L : u^s_γ = u^{ā}_γ ∘ u^r_γ = u^{r̄}_γ ∘ u^a_γ.   (8)
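With canonical pullbacks in SET, the embedding u^g of Figure 12 is literally a set inclusion. The following sketch checks this on a hypothetical finite instance (all sets and maps invented for illustration).

```python
# Hypothetical finite instance of L x_B I <= L x_S I from Figure 12:
# with canonical pullbacks in SET, u^g is the inclusion of subsets.

L = {'l1', 'l2', 'l3'}
f = {'l1': 'b1', 'l2': 'b1', 'l3': 'b2'}         # f : L -> B
g = {'b1': 's', 'b2': 's'}                       # g : B -> S
h = {l: g[f[l]] for l in L}                      # h = g o f

I = {'i1', 'i2', 'i3'}
gamma = {'i1': 'l1', 'i2': 'l2', 'i3': 'l3'}     # gamma : I -> L

LxB_I = {(l, i) for l in L for i in I if f[gamma[i]] == f[l]}
LxS_I = {(l, i) for l in L for i in I if h[gamma[i]] == h[l]}

assert LxB_I <= LxS_I                            # u^g is the inclusion
# the inclusion is proper here: h glues more than f does
assert ('l1', 'i3') in LxS_I and ('l1', 'i3') not in LxB_I
```

Since g merges b1 and b2, the second pullback contains pairs (such as ('l1', 'i3')) that the first one does not, which is exactly the hooked-arrow situation.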
It can easily be shown that the u^g_γ are monomorphisms, but we can do better (see the Appendix for a proof):
Lemma 14 Let f : L → B and g : B → S be given with h := g ∘ f. Then u^g : T^f ⇒ T^h is a monad monomorphism.
Lemma 15 Let f, g, h be as in Lemma 14. There is a full and faithful functor U^g : des(h) → des(f) for which U^g(γ, ξ) = (γ, ξ ∘ u^g_γ).

Proof: Since des(p) is the category of Eilenberg–Moore algebras associated with T^p, the result follows from Lemma 14 and the proof of a theorem of Barr and Wells ([2], Theorem 6.3 in Chapter 3). □

Let us fix the rear pullback span PBS in Figure 11. Since γ is fixed, considered objects of des(r), des(a), and des(s) will always have codomain γ; hence ξ^β and ξ^τ are appropriate abbreviations for the two canonical descent data (cf. Section 3) arising from the two pullbacks.

Definition 16 (Coherence) ξ^τ and ξ^β are called coherent if there is (γ, ξ) ∈ des(s) such that

⟨U^{r̄}, U^{ā}⟩(γ, ξ) = (ξ^τ, ξ^β).   (9)
We call any (γ, ξ) ∈ des(s) with this property a coherence witness (for ξ^τ and ξ^β). Thus two algebraic structures are coherent if there is an algebraic structure over γ relative to s which effectively approximates them. We are ready to state the main technical result of this section:

Proposition 17 (Reachability vs. Coherence) Let in a topos C a diagram be given as in Figure 11 where the bottom square is commutative and the rear faces form a pullback span. Let ξ^τ and ξ^β be the above introduced canonical descent data.
a) If the span is reachable, ξ^τ and ξ^β are coherent.
b) If the bottom square is a pushout and ξ^τ and ξ^β are coherent, then the span is reachable.
c) Under the prerequisites of b), the coherence witness is unique.

Proof: To simplify matters we write u instead of u_γ. To show a), let sqr_diag ∈ pb(s) (the pullback (γ, s′) of (σ, s)) with PB(sqr_diag) being the rear pullback span (this is Figure 11 without question marks)⁹. We show coherence with (γ, ξ) := Φ_s sqr_diag. By (5), ξ^τ is unique with a′ ∘ ξ^τ = a′ ∘ π₂^{a◦γ}, such that for the first projection in (9) it suffices to show validity of this equation with ξ^τ replaced by ξ ∘ u^{r̄}. The argumentation for the second projection is then similar. We have

τ ∘ a′ ∘ ξ ∘ u^{r̄} = a ∘ γ ∘ ξ ∘ u^{r̄}   (left rear pullback in Figure 11)
= a ∘ T^s γ ∘ u^{r̄}   (since ξ : T^s γ → γ)
= a ∘ T^a γ   (by (7))
= a ∘ γ ∘ ξ^τ   (since ξ^τ : T^a γ → γ)
= τ ∘ a′ ∘ π₂^{a◦γ}   (left rear pullback and (5) for ξ^τ)

and also r′ ∘ a′ ∘ ξ ∘ u^{r̄} = s′ ∘ π₂^{s◦γ} ∘ u^{r̄} = s′ ∘ π₂^{a◦γ} = r′ ∘ a′ ∘ π₂^{a◦γ} (by (5) for ξ and (7)). This implies the desired result, because in the front face pullback τ and r′ are jointly monic.

To show b), assume we already knew the result in the case that a and r are both epimorphisms. We can then use epi-mono-factorizations r = r_m ∘ r_e and a = a_m ∘ a_e (which exist in topoi) to decompose both back-
⁹ If PB(sqr_diag) yields the rear pullback span not exactly but only up to isomorphism, we can exchange the instances over A and R by their compositions with the isomorphisms, such that there is a complete cube with 4 pullbacks having the original pullback span as rear faces. It is no problem that front and right pullbacks are no longer chosen.
face pullbacks into two pullbacks, resp. It can then be verified that the bottom face can be decomposed into 4 pushouts along these epi-mono-factorizations. Because r_e and a_e are both epic and ξ^τ and ξ^β are also canonical descent data of r_e and a_e (this follows from fact (i) on page 67), the inner pullback span is reachable. Since topoi are adhesive [14], this reachability can be continued along the other pairs of bottom arrows (of which either one or both are now monic) by constructing top face pushouts. Thus it suffices to assume that r and a are epic.

Then r′ in Figure 11 is the appropriate coequalizer of ξ^β and π₂^{r◦γ} by Proposition 11 b). Let s′ : I → K be the coequalizer of π₂^{s◦γ} and the coherence witness ξ, with sqr_diag := Ψ_s(γ, ξ) the resulting diagonal pullback by Lemma 10. By coherence and (7)

s′ ∘ ξ^β = s′ ∘ π₂^{r◦γ},

yielding a unique mediator a′ : H → K for the coequalizer r′, i.e.

a′ ∘ r′ = s′.   (10)

Let σ ∈ C↓S be part of sqr_diag as indicated in Figure 11. Then by construction and (10), ā ∘ β ∘ r′ = ā ∘ r ∘ γ = s ∘ γ = σ ∘ s′ = σ ∘ a′ ∘ r′; hence we obtain a commutative square as right face of the cube in Figure 11 (the coequalizer r′ is an epimorphism) which is also a pullback by Lemma 7, i.e. ā*σ ≅ β. Analogously one shows r̄*σ ≅ τ.

To show c) assume that there are two coherence witnesses (γ₁, ξ₁), (γ₂, ξ₂). Clearly γ := γ₁ = γ₂ by Lemma 15, such that it remains to show ξ₁ = ξ₂. By b), ξ₁ and ξ₂ yield two cubes, each of which possesses 4 pullbacks as side faces. They possess the same arrows except ā′, r̄′, and σ. But the two variants of the arrows ā′, r̄′ both form a top pushout of a′, r′ because, in topoi, pullbacks preserve colimits. Hence there is an isomorphism i which can be shown to mediate between the two variants of σ. Consequently, we have two diagonal pullbacks sqr¹_diag = Ψ_s(γ, ξ₁) and sqr²_diag = Ψ_s(γ, ξ₂) (see part b)) for which by (6) and Proposition 11 a)

Φ_s((id, i) : sqr¹_diag → sqr²_diag) = (id, T^s id) : (γ, ξ₁) → (γ, ξ₂),

which yields ξ₁ = ξ₂. □

By the remark after Lemma 9, any descent data (γ, ξ) ∈ des(p) yields the kernel pair ker(q) := (ξ, π₂) of the top arrow q of Ψ_p(γ, ξ); see Figure 9. In the category Eq(C) of equivalence relations on C ∈ Ob_C (i.e. the full subcategory of C↓(C × C) of arrows with the properties of Definition 8), we call an object e an upper bound of e₁ and e₂ if there are Eq(C)-arrows (necessarily monos) v₁ and v₂ with

e ∘ v₁ = e₁ and e ∘ v₂ = e₂.   (11)

It is well-known [2] that the least upper bound (lub) of two equivalence relations ker(a′) : X → C × C, ker(r′) : Y → C × C can be constructed by extracting the mono part m of [ker(a′), ker(r′)] : X + Y → C × C followed by constructing the kernel pair of the coequalizer of m. Let π₂′ be the second projection in the pullback associated with the monad T^s.

Lemma 18 Let the bottom square in Figure 11 be a pushout. ξ^τ and ξ^β are coherent if and only if there is (γ, ξ) ∈ des(s) with

lub(ker(a′), ker(r′)) ≅ (ξ, π₂′).

Proof: If ξ^τ and ξ^β are coherent, then the coherence witness ξ from Proposition 17 b) and c) was used to complete the pullback cube. Since, in topoi, the top face becomes a pushout, and (ξ, π₂′) is the
kernel pair of the top diagonal, using the universal property of pushouts, it can easily be shown that (ξ, π₂′) ≅ lub(ker(a′), ker(r′)). The opposite direction follows directly from the uniqueness properties (7) of the monad morphisms u^{r̄} and u^{ā}: any mediating monomorphisms v₁, v₂ as in (11) in the least upper bound constellation must coincide with u^{r̄}, u^{ā}, resp. □

Theorem 19 (Answer to Question 1) Let C be a topos and a pullback span be given as in the rear of Figure 11 with the bottom square a pushout. The span is reachable if and only if

lub(ker(a′), ker(r′)) ≅ (ξ, π₂′)

for some (γ, ξ) ∈ des(s).

Proof: This follows from Proposition 17 and Lemma 18. □

Thus there is an algorithm to check reachability. Given a rear pullback span PBS with top arrows a′, r′:
1. Compute e := lub(ker(a′), ker(r′)).
2. Check the monadicity requirements (1) of e relative to s by interpreting it as a pair (ξ, π₂′).
3. PBS is reachable if and only if e meets the requirements.

In the next section we will recall the introductory examples from Section 2 such that these theoretical results become more evident. We conclude this section with a global statement on Van Kampen squares in the spirit of Figure 3. We still assume a square as the bottom in Figure 11 to be given. In the following, the category des(id_L) is integrated. It represents the "common part" of the forgetful functors U^a and U^r, namely the carrier γ represented by certain isomorphisms from the "graph" L ×_L I of γ to I.

Theorem 20 (Fibred Version of Amalgamation Lemma) Let C be a topos. In Figure 13, the pushout (1) is a Van Kampen square if and only if (2) is a pullback in CAT.
[Figure 13 (diagrams): (1) the pushout square with r : L → R, a : L → A, ā : R → S, r̄ : A → S; (2) the square of forgetful functors U^r : des(r) → des(id_L), U^a : des(a) → des(id_L), U^{r̄} : des(s) → des(a), U^{ā} : des(s) → des(r); caption below.]
Figure 13: Amalgamation Lemma (Fibred setting)

Proof: "⇒": By (8) and Lemma 15, (2) commutes. By assumption, PB is an equivalence of categories, i.e. each rear pullback span is reachable. By Proposition 17 a) each pair (ξ^τ, ξ^β) is coherent and by Proposition 17 c) the coherence witness is unique. Standard arguments together with the fact that the U^g are full and faithful functors (Lemma 15) yield the pullback property.
"⇐": The pullback property immediately yields coherence for each pair ((γ, ξ^τ), (γ, ξ^β)) ∈ des(a) × des(r). Because (1) is a pushout, Proposition 17 b) implies reachability of each rear pullback span; thus PB is essentially surjective, which is sufficient for (1) to have the Van Kampen property [18]. □
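The three-step reachability check after Theorem 19 can be executed directly for finite sets. The sketch below is entirely hypothetical data (a constant s on a four-element L, two-element fibres, one twisted fibre identification, in the spirit of the intertwined kernels discussed later): it computes e := lub(ker(a′), ker(r′)) as the equivalence closure of the union, and then tests the monadicity requirement, which in SET amounts to e being the graph of a ker(s)-indexed family of bijections between the fibres of γ (Proposition 12).

```python
# Sketch of the reachability check after Theorem 19 for finite sets.
# All data (L, I, gamma, the two kernels) are illustrative assumptions.

L = ['x', 'z', 'w', 'y']
I = [(b, l) for b in (0, 1) for l in L]
gamma = {i: i[1] for i in I}
ker_s = {(u, v) for u in L for v in L}           # s is constant on L

def closure(pairs, carrier):
    """Equivalence closure: lub of equivalence relations in SET."""
    rel = {(c, c) for c in carrier} | set(pairs)
    rel |= {(q, p) for (p, q) in rel}
    changed = True
    while changed:
        changed = False
        for (p, q) in list(rel):
            for (q2, t) in list(rel):
                if q == q2 and (p, t) not in rel:
                    rel.add((p, t)); changed = True
    return rel

# kernels of the two top arrows: r' merges fibres without twisting,
# a' twists the fibre between x and z
ker_r1 = closure({((b, 'z'), (b, 'w')) for b in (0, 1)} |
                 {((b, 'y'), (b, 'x')) for b in (0, 1)}, I)
ker_a1 = closure({((b, 'x'), (1 - b, 'z')) for b in (0, 1)} |
                 {((b, 'w'), (b, 'y')) for b in (0, 1)}, I)

e = closure(ker_r1 | ker_a1, I)                  # lub(ker(a'), ker(r'))

def is_descent_data(e, gamma, ker_s, I):
    """e must be, fibrewise, the graph of a ker(s)-indexed family of
    bijections -- the concrete monadicity requirement in SET."""
    if any((gamma[x], gamma[y]) not in ker_s for (x, y) in e):
        return False
    for x in I:
        for l2 in {v for (u, v) in ker_s if u == gamma[x]}:
            targets = [y for y in I if gamma[y] == l2 and (x, y) in e]
            if len(targets) != 1:
                return False
    return True

assert not is_descent_data(e, gamma, ker_s, I)   # the span is not reachable
```

Following the twist around the cycle x, z, w, y identifies (0, x) with (1, x), so e relates two elements of the same fibre and cannot be a family of bijections; by Theorem 19 this span is not reachable.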
5 Coherence and Van Kampen Squares in SET and GRAPH
This section illustrates the use of Theorem 19 and develops a simply checkable characterization of Van Kampen squares in SET and GRAPH based on Theorem 20. As mentioned before, in SET the u's are natural embeddings. Hence coherence (cf. Definition 16) yields the existence of descent data ξ relative to s = ā ∘ r = r̄ ∘ a with

∀ (x, x′) ∈ ker(r) : ξ_{x,x′} = ξ^β_{x,x′}   and   ∀ (y, y′) ∈ ker(a) : ξ_{y,y′} = ξ^τ_{y,y′},   (12)

where all mappings are understood as the components of the families of bijections from Proposition 12.

We can now observe Theorem 19 at work: Recall the situation in Figure 4. By Proposition 12, the canonical descent data ξ^β and ξ^τ map along the dashed and dotted lines, resp.; e.g. ξ^β_{x,y}(1:x) = 2:y, ξ^β_{x,y}(2:x) = 1:y. Reachability means that the least upper bound of the kernels of a′ and r′ yields a monadic structure ξ relative to s. By (12) and hypothetical associativity (cf. Proposition 12) of ξ, the bijection ξ_{x,y} must be equal to ξ^τ_{w,y} ∘ ξ^β_{z,w} ∘ ξ^τ_{x,z} on the fibre over x. But this must then coincide with ξ^β_{x,y}, which is not the case in Figure 4. Obviously, the kernels of r and a are intertwined through the cycle (x, z), (z, w), (w, y), (y, x) ∈ ker(s) and are thus not separated enough. The following definition makes this more precise:

Definition 21 (Separated Kernels) Let C = SET and a and r be given as in Figure 11. A sequence (x_i)_{i∈{0,1,...,2k+1}} of elements in L is called a domain cycle (of a and r) if k ∈ ℕ and the following conditions hold:
1. ∀ j ∈ {0, 1, . . . , 2k + 1} : x_j ≠ x_{j+1}
2. ∀ i ∈ {0, . . . , k} : (x_{2i}, x_{2i+1}) ∈ ker(a)
3. ∀ i ∈ {0, . . . , k} : (x_{2i+1}, x_{2i+2}) ∈ ker(r)
where the sums are understood modulo 2k + 2 (i.e. x_{2k+2} = x_0). We call 2k + 2 the length of the domain cycle. Moreover, a domain cycle is proper if we have x_i ≠ x_j for all i, j ∈ {0, 1, . . . , 2k + 1} with i ≠ j. The pair a and r is said to have separated kernels if it has no domain cycle.

Remark 1: It is easy to see that each domain cycle c possesses a proper subcycle, i.e. a proper cycle with length smaller than or equal to the length of c and whose elements are a subset of the elements of c.

Remark 2: "Having separated kernels" is only sufficient but not necessary for "being jointly monic": not being jointly monic induces a domain cycle of length 2, but longer domain cycles occur even for jointly monic a and r (see Figure 4).
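Since Definition 21 only involves the kernels of two maps on a finite set, "having separated kernels" is directly decidable: a domain cycle is exactly a cycle in a directed graph whose states remember which kernel the next step must use. The following checker is our own illustration of this idea (the state-graph encoding is an assumption, not taken from [16]); the example data mimic the intertwined cycle discussed above.

```python
# Hypothetical checker for Definition 21: a and r have separated kernels
# iff the alternating state graph built below is acyclic.

def has_domain_cycle(L, a, r):
    ker = {'a': {(x, y) for x in L for y in L if x != y and a[x] == a[y]},
           'r': {(x, y) for x in L for y in L if x != y and r[x] == r[y]}}
    # state (x, t): currently at x, the next edge must belong to ker(t)
    succ = lambda x, t: [(y, 'r' if t == 'a' else 'a')
                         for (x2, y) in ker[t] if x2 == x]
    states = [(x, t) for x in L for t in ('a', 'r')]
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {s: WHITE for s in states}

    def dfs(s):                       # standard cycle detection by DFS
        colour[s] = GREY
        for s2 in succ(*s):
            if colour[s2] == GREY or (colour[s2] == WHITE and dfs(s2)):
                return True
        colour[s] = BLACK
        return False

    return any(colour[s] == WHITE and dfs(s) for s in states)

# intertwined kernels: (x,z),(w,y) in ker(a) and (z,w),(y,x) in ker(r)
L = ['x', 'z', 'w', 'y']
a = {'x': 0, 'z': 0, 'w': 1, 'y': 1}
r = {'x': 2, 'z': 3, 'w': 3, 'y': 2}
assert has_domain_cycle(L, a, r)          # a domain cycle of length 4

r2 = {'x': 0, 'z': 1, 'w': 2, 'y': 3}     # r2 injective: ker(r2) trivial
assert not has_domain_cycle(L, a, r2)     # separated kernels
```

By Theorem 24 below, the first pushout situation would not be Van Kampen, while the second one (with r2 injective) would be.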
Domain cycles are connected to coherence as follows:

Proposition 22 Let C = SET, let a commutative square be given like the bottom square in Figure 11, and let the two rear faces be pullbacks with canonical descent data ξ^τ and ξ^β, resp. ξ^τ and ξ^β are coherent iff for all domain cycles (x_i)_{i∈{0,1,...,2k+1}} of a and r we have

ξ^β_{x_{2k+1},x_0} ∘ ξ^τ_{x_{2k},x_{2k+1}} ∘ · · · ∘ ξ^τ_{x_2,x_3} ∘ ξ^β_{x_1,x_2} ∘ ξ^τ_{x_0,x_1} = id_{γ⁻¹x_0}.   (13)

The statement is illustrated in Example 2, where coherence is now achieved by harmonizing the equivalences of a′ and r′ in the two copies of L that make up the domain of γ. Alternatively, we can use Theorem 19 to check reachability: the least upper bound yields descent data for γ relative to s, because it is evident that neutrality and associativity are not destroyed. In order not to interrupt the flow of arguments, the proof of Proposition 22 is contained in the Appendix. The next proposition illustrates how domain cycles are connected to reachability. This time we include the proof, because it demonstrates the use of descent data.
Proposition 23 Let C = SET and a commutative square be given like the bottom square in Figure 11. If all pullback spans in the rear are reachable, a and r have separated kernels.

Proof: Assume to the contrary that a and r possess a domain cycle (x_i)_{i∈{0,1,...,2k+1}} for some k ∈ ℕ. By the first remark after Definition 21, we can assume that this cycle is proper. Let Ω = {0, 1} and γ := π₂ : Ω × L → L be the ordinary second projection. We construct descent data ξ^a for γ relative to a and ξ^r for γ relative to r. Because the fibre of γ over x is {(0, x), (1, x)}, we can define ξ^r_{x,x′}(b, x) := (b, x′) for all (x, x′) ∈ ker(r) and b ∈ {0, 1}. It is obvious that this yields neutrality and associativity as in Proposition 12. Consider now the equivalence class E₀ = {x ∈ L | a(x) = a(x₀)} of ker(a), where x₀ is the beginning of the cycle. The domain cycle has at least length 2, hence we have x₁ ≠ x₀, x₁ ∈ E₀ in the cycle. For any x ∈ E₀ we define a bijection ξ^a_{x₀,x} : {0, 1} × {x₀} → {0, 1} × {x} by

ξ^a_{x₀,x}(b, x₀) := (b, x) if x ≠ x₁, and ξ^a_{x₀,x}(b, x₀) := (1 − b, x) if x = x₁.

Further we set

ξ^a_{x,x′} := ξ^a_{x₀,x′} ∘ (ξ^a_{x₀,x})⁻¹ for all x, x′ ∈ E₀, x ≠ x′.

Neutrality and associativity are straightforwardly ensured by these definitions. For (x, x′) ∈ ker(a) − E₀² we define ξ^a_{x,x′} in the same way as ξ^r.

By Proposition 11 a), ξ^β := ξ^r and ξ^τ := ξ^a are canonical descent data of the pullbacks Ψ_r(γ, ξ^r) and Ψ_a(γ, ξ^a), resp., such that for the resulting pullback span we obtain:

(ξ^β_{x_{2k+1},x_0} ∘ ξ^τ_{x_{2k},x_{2k+1}} ∘ · · · ∘ ξ^τ_{x_2,x_3} ∘ ξ^β_{x_1,x_2} ∘ ξ^τ_{x_0,x_1})(0, x_0) = (1, x_0)

because, in this chain, ξ^β always preserves the first component and ξ^τ_{x,x′} interchanges it only if x = x₀ and x′ = x₁, since the cycle is proper. Thus, by Proposition 22, ξ^τ and ξ^β are not coherent; hence, by Proposition 17, the pullback span is not reachable, contradicting the assumption. □

The following theorem is the main result of this section (cf. [16]):

Theorem 24 Let C = SET or C = GRAPH. A pushout diagram as the bottom square in Figure 11 is a Van Kampen square if and only if a and r have separated kernels.

Proof: "⇒" follows from Theorem 20 and Propositions 17 and 23; "⇐" follows from Proposition 22 and Theorem 20. It is shown in [16] that the argumentation easily carries over to graphs once the result has been proven for SET. □

Recall Example 3, where we can now easily derive from Theorem 24 that each pullback span is reachable, i.e. each amalgamation of instances is successful.
6 Outlook
The paper presents first outcomes of a more comprehensive collaborative project based on [3, 16, 19] and addressing "compositional fibred semantics in topoi". There are several topics for future research: First we have to address persistency requirements and extension lemmas for fibred semantics. Moreover, we are looking for a categorical generalization of Proposition 22 which would give rise, due to Proposition 17, to a kind of "conditional compositionality". An interesting open question, in this context, is how to characterize domain cycles on a purely categorical level. This should yield an elementary characterization of Van Kampen squares in more general categories in the spirit of Theorem 24.
7 Appendix
Descent Data in SET: Here we give details about the different view on descent data from Proposition 12 in SET. First, we recall that pullbacks, in general, can be described as products in slice categories. For the situation in Definition 4 this means that the diagonal p ∘ T^p γ = p ∘ γ ∘ π₂^{p◦γ} : E ×_B C → B forms the product p × (p ∘ γ) in C↓B with projections T^p γ : p × (p ∘ γ) → p and π₂^{p◦γ} : p × (p ∘ γ) → p ∘ γ. Second, any ξ : E ×_B C → C in C which is an arrow ξ : T^p γ → γ in C↓E establishes also an arrow ξ : p × (p ∘ γ) → p ∘ γ in C↓B. C↓B, however, is also a topos by the fundamental theorem of Freyd [7] and thus, especially, cartesian closed. In C = SET, finally, any ξ ∈ Hom(p × (p ∘ γ), p ∘ γ) ≅ Hom(p, (p ∘ γ)^{(p◦γ)}) can be interpreted as a map that assigns to any element e ∈ E an endomap ξ(e, ·) of the fibre of p ∘ γ over p(e) (cf. [8], Chapter 4).

For our purposes, an appropriate representation of these maps for descent data will be in terms of the kernel of p: The fibre of p ∘ γ over p(e) is the preimage of the equivalence class [e]_{ker(p)} w.r.t. γ. Let ξ_{e,e′} be the restriction of the map ξ(e′, ·) to γ⁻¹e whenever (e, e′) ∈ ker(p). If c ∈ γ⁻¹e we obtain γ(ξ(e′, c)) = e′ from Definition 4, hence the codomain of ξ_{e,e′} is γ⁻¹e′ and ξ represents a family

(ξ_{e,e′} : γ⁻¹e → γ⁻¹e′)_{(e,e′)∈ker(p)}   (14)

which fulfills

ξ(e′, c) = ξ_{e,e′}(c) for γ(c) = e.   (15)

Let us now investigate the influence of neutrality and associativity (1) on this family. A canonical choice of pullbacks in SET yields E ×_B C = {(e, c) ∈ E × C | p(e) = p(γ(c))}, E ×_B (E ×_B C) = {(e, (e′, c)) ∈ E × (E × C) | p(e) = p(e′) = p(γ(c))}, and

η^p_γ(c) = (γ(c), c),  μ^p_γ(e″, (e′, c)) = (e″, c),  T^p ξ(e″, (e′, c)) = (e″, ξ(e′, c)).   (16)

Thus for all (e, e′), (e′, e″) ∈ ker(p) and c ∈ γ⁻¹e, (1) and the first equation in (16) yield

ξ_{e,e}(c) = c,

whereas the second equation in (1) (applied to a triple (e″, (e′, c))) and the second and third equation of (16) imply

ξ_{e′,e″}(ξ_{e,e′}(c)) = ξ_{e,e″}(c).

By choosing e″ = e, these two equations force each ξ_{e,e′} to be bijective. By reversing the whole argumentation, we can also show that any family as in (14) which satisfies these two equations yields descent data by defining ξ as in (15) for e := γ(c). Altogether we obtain the statement in Proposition 12, part 1, which subsumes the monoidal nature of descent data in SET. Moreover, part 2 follows from effectiveness of equivalence relations and (15).

Proof of Lemma 14: For simplicity we write u instead of u^g. There are several statements to prove:
1. Each u_γ is a monomorphism.
2. u : T^f ⇒ T^h is a natural transformation.
3. u is compatible with units, i.e. u ∘ η^f = η^h.
4. u is compatible with counits, i.e. μ^h ∘ u² = u ∘ μ^f, where u² is the horizontal composition of u with itself.

1. To show that u_γ is monic for each γ, let x, y : X → L ×_B I with u_γ ∘ x = u_γ ∘ y be given. By (7), one computes T^f γ ∘ x = T^f γ ∘ y and π₂^{f◦γ} ∘ x = π₂^{f◦γ} ∘ y. Because T^f γ and π₂^{f◦γ} are jointly monic (being a limit cone in a pullback square), we obtain x = y. In the sequel, the property of a pullback cone to be jointly monic will be used several times. We will do this without further reference.

2. Let γ, γ̂ ∈ C↓L and φ : γ → γ̂ be a C↓L-morphism. As before, π₂ and π₂′ denote the second projections in the pullbacks involving γ and the monads T^f and T^h, resp.; π̂₂ and π̂₂′ denote the second projections involving γ̂. Pulling back φ (as an arrow in C↓B and as an arrow in C↓S) yields

φ ∘ π₂ = π̂₂ ∘ T^f φ   (17)

and

φ ∘ π₂′ = π̂₂′ ∘ T^h φ.   (18)

Let now d₁ = T^h φ ∘ u_γ and d₂ = u_γ̂ ∘ T^f φ, which are both arrows from T^f γ to T^h γ̂. d₁ = d₂ (and thus the desired result) follows from
T^h γ̂ ∘ d₁ = T^h(γ̂ ∘ φ) ∘ u_γ   (definition of d₁)
= T^h γ ∘ u_γ   (because φ : γ → γ̂)
= T^f γ   (by (7))
= T^f γ̂ ∘ T^f φ   (see two lines above)
= T^h γ̂ ∘ u_γ̂ ∘ T^f φ   (by (7))
= T^h γ̂ ∘ d₂   (definition of d₂)

and

π̂₂′ ∘ d₁ = π̂₂′ ∘ T^h φ ∘ u_γ   (definition of d₁)
= φ ∘ π₂′ ∘ u_γ   (by (18))
= φ ∘ π₂   (by (7))
= π̂₂ ∘ T^f φ   (by (17))
= π̂₂′ ∘ u_γ̂ ∘ T^f φ   (by (7))
= π̂₂′ ∘ d₂   (definition of d₂).
In the sequel we denote projections with π₂, π̄₂ in pullbacks along f and with π₂′, π̄₂′ in pullbacks along h.

3. Compatibility with the units follows from π₂′ ∘ u_γ ∘ η^f_γ = π₂ ∘ η^f_γ = id = π₂′ ∘ η^h_γ (apply (7) and (2) twice) and T^h γ ∘ u_γ ∘ η^f_γ = T^f γ ∘ η^f_γ = γ = T^h γ ∘ η^h_γ (again using (7) and the fact that η^p_γ : γ → T^p γ for p ∈ {f, h}).

4. Let u² := u ∗ u be the horizontal composition. By the definition of u² we have for each γ ∈ C↓L:

u²_γ = u_{T^h γ} ∘ T^f u_γ = T^h u_γ ∘ u_{T^f γ}.   (19)
From Fig. 9, we get

π₂ ∘ π̄₂ = π₂ ∘ μ^p_γ,   (20)

where μ^p_γ = p* π₂. In the sequel, we use this for p := f and p := h. The diagrams
[Figure 14 (diagram): T^h u_γ : L ×_S (L ×_B I) → L ×_S (L ×_S I), with μ^h_γ = h* π₂′, the arrow h* π₂, and the projections π̃₂, π₂′, π₂ into L ×_B I, L ×_S I, and I; caption below.]
Figure 14: Compatibility with counit, part 1

and
[Figure 15 (diagram): L ×_B (L ×_B I) with π̄₂ and u_{T^f γ} into L ×_S (L ×_B I), the arrows (T^f)² γ and T^h T^f γ into L, π̃₂ into L ×_B I, and h : L → S, h ∘ T^f γ; caption below.]
Figure 15: Compatibility with counit, part 2

commute: In the first diagram, the triangle commutes by applying h* to (7), interpreted as a diagram in C↓S. The square is just the pullback which arises from pulling back π₂ : h ∘ T^f γ → h ∘ γ along h. We denote with π̃₂ the second projection in this case. The second diagram is just Figure 12 taken at T^f γ instead of γ, where the same π̃₂ occurs again. Thus

π₂′ ∘ μ^h_γ ∘ u²_γ = π₂′ ∘ μ^h_γ ∘ T^h u_γ ∘ u_{T^f γ}   (by (19))
= π₂ ∘ π̃₂ ∘ u_{T^f γ}   (Figure 14)
= π₂ ∘ π̄₂   (Figure 15)
= π₂ ∘ μ^f_γ   (by (20))
= π₂′ ∘ u_γ ∘ μ^f_γ   (by (7)).

On the other hand, by (7) and the fact that μ^f and μ^h are γ-indexed families of arrows from (T^f)² γ to T^f γ and from (T^h)² γ to T^h γ, resp., we obtain T^h γ ∘ u_γ ∘ μ^f_γ = T^f γ ∘ μ^f_γ = (T^f)² γ. Since u² is a γ-indexed family of arrows from (T^f)² γ to (T^h)² γ, we also have T^h γ ∘ μ^h_γ ∘ u²_γ = (T^h)² γ ∘ u²_γ = (T^f)² γ. Because T^h γ = π₁′ and π₂′ are jointly monic, the proof is complete. □
Proof of Proposition 22: "⇒" follows immediately from (12) and Proposition 12 applied to the coherence witness ξ.
"⇐": We call a sequence (y_i)_{i∈{0,1,...,m}} of elements in L an alternating sequence (of a and r) if m ∈ ℕ and the following conditions hold:
a) for all even i ∈ {0, . . . , m − 1} : (y_i, y_{i+1}) ∈ ker(p)
b) for all odd i ∈ {0, . . . , m − 1} : (y_i, y_{i+1}) ∈ ker(−p)
where p ∈ {a, r} and −a = r and −r = a. m + 1 is called the length of the sequence. A sequence is called proper if y_i ≠ y_j for all i ∈ {0, 1, · · · , m} and j ∈ {0, 1, · · · , m − 1} with i ≠ j.¹⁰

For the rear pullback span with canonical descent data ξ^β, ξ^τ, we define for any alternating sequence σ = (y_i)_{i∈{0,1,...,m}} a bijection ξ_σ : γ⁻¹y₀ → γ⁻¹y_m as follows: For m = 0, ξ_σ := id_{γ⁻¹y₀}, and for m ≥ 1

ξ_σ := ξ_{y_{m−1},y_m} ∘ · · · ∘ ξ^{(−p)!}_{y_1,y_2} ∘ ξ^{p!}_{y_0,y_1},   where a! = τ and r! = β   (21)

and the superscripts alternate between p! and (−p)! along the sequence. Obviously, for a domain cycle c = (x_i)_{i∈{0,1,...,2k+1}}, σ_c = (x_0, x_1, . . . , x_{2k+1}, x_0) is an alternating sequence; thus we can reformulate condition (13) as ξ_{σ_c} = id_{γ⁻¹x₀} for all domain cycles c. We claim that the following conditions are equivalent:
1. ξ_{σ_c} = id_{γ⁻¹x₀} for all domain cycles c = (x_i)_{i∈{0,1,...,2k+1}}.
2. ξ_σ = ξ_{σ′} for all alternating sequences σ = (y_i)_{i∈{0,1,...,m}} and σ′ = (z_i)_{i∈{0,1,...,n}} with y₀ = z₀ and y_m = z_n (independence of representative on paths from y₀ to y_m).

Assume for the moment that this is true; then 2. holds because 1. is the assumption of the proposition. We can then use this independence of representative to uniquely construct a coherence witness, i.e. a family (ξ_{e,e′})_{(e,e′)∈ker(s)} (where s = ā ∘ r = r̄ ∘ a) of bijections which satisfies neutrality and associativity from Proposition 12 and for which (12) is valid: Clearly, (x, x′) ∈ ker(s) iff there exists an alternating sequence σ = (y_i)_{i∈{0,1,...,m}} with x = y₀ and x′ = y_m, such that ξ_{x,x′} := ξ_σ does not depend on the choice of σ. Neutrality follows from (21) for sequences of length 0; (12) is ensured by sequences of length 1. To show associativity we define the composition of two alternating sequences by
• σ′ ∘ σ := (y₀, . . . , y_m = z₀, . . . , z_n) if mn = 0, or m, n ≥ 1 and (y_{m−1}, y_m) ∈ ker(p), (z₀, z₁) ∈ ker(−p)
• σ′ ∘ σ := (y₀, . . . , y_{m−1}, z₁, . . . , z_n) if m, n ≥ 1 and (y_{m−1}, y_m) ∈ ker(p), (z₀, z₁) ∈ ker(p)
Again by the independence of representative we obtain for each pair (x, x′), (x′, x″) ∈ ker(s) (with representing alternating sequences σ, σ′): ξ_{σ′} ∘ ξ_σ = ξ_{σ′∘σ}, hence associativity.

It remains to prove the equivalence "1. ⇐⇒ 2.". It is easy to show that one can restrict oneself to proper alternating sequences. Then "2. ⇒ 1." because 1. is a special case of 2. with m = 2k + 2 for k ∈ ℕ, p = a, and n = 0. Thus, it remains to show "1. ⇒ 2.". Note first that the equation ξ_{e,e′} = (ξ_{e′,e})⁻¹ (cf. Proposition 12) carries over to alternating sequences: If σ = (y₀, . . . , y_m), then σ⁻ := (y_m, . . . , y₀) is an alternating sequence with

ξ_σ = (ξ_{σ⁻})⁻¹.   (22)

For the proof of "1. ⇒ 2." we use induction over n. For n = 0 we have y₀ = z₀ = y_m and ξ_{σ′} = id_{γ⁻¹z₀}. For m = 0 and m = 1 we have ξ_σ = id_{γ⁻¹y₀} = id_{γ⁻¹y_m} = id_{γ⁻¹z₀} = ξ_{σ′}. Thus there remain two major cases:

¹⁰ Thus, y₀ = y_m is allowed.
1. m = 2k + 2 for some k ∈ ℕ: Then either σ represents a domain cycle (if p = a) or the reverse cycle σ⁻ is a domain cycle (if p = r). Both situations yield ξ_σ = id_{γ⁻¹y₀} = ξ_{σ′} (cf. (22)).
2. m = 2k + 3 for some k ∈ ℕ: If p = a we know that y_{m−1} ≠ y₁, since σ is proper; thus the alternating sequence σ̄ = (y_{m−1}, y₁, . . . , y_{m−1}) represents a domain cycle which is connected to σ via ξ_σ = ξ^τ_{y_{m−1},y_m} ∘ ξ_{σ̄} ∘ ξ^τ_{y₀,y_{m−1}} (using associativity for (y₀ = y_m, y_{m−1}), (y_{m−1}, y₁) ∈ ker(a)). By assumption, ξ_σ = ξ^τ_{y_{m−1},y_m} ∘ ξ^τ_{y₀,y_{m−1}} = ξ^τ_{y₀,y_m} = id_{γ⁻¹y₀} = ξ_{σ′} because y₀ = y_m. If p = r the same argument can be carried out with σ̄ = (y₁, . . . , y_{m−1}, y₁).
Now we show the induction step to n ≥ 1 under the hypothesis that the assertion is true for all pairs (m, n′) with n′ < n. Again there are several cases with possible subcases:
1. z₁ = y₀: This means z₁ = y₀ = z₀ and thus ξ_{σ′} = ξ_{σ′₁} for the sequence σ′₁ = (z₁, . . . , z_n). ξ_σ = ξ_{σ′₁}, however, holds by induction hypothesis.
2. z₁ = y_k for some 1 ≤ k ≤ m: By induction hypothesis we have ξ_{σ₁} = ξ_{σ′₁} and ξ_{σ₂} = ξ_{σ′₂} for the subsequences σ₁ = (y₀, . . . , y_k), σ₂ = (y_k, . . . , y_m), σ′₁ = (z₀, z₁), σ′₂ = (z₁, . . . , z_n); thus we also obtain ξ_σ = ξ_{σ₂} ∘ ξ_{σ₁} = ξ_{σ′₂} ∘ ξ_{σ′₁} = ξ_{σ′}.
3. z₁ ≠ y_k for all 0 ≤ k ≤ m:
(a) (y₀, y₁) ∈ ker(p), (z₀, z₁) ∈ ker(p): Then σ₁ = (z₁, y₁, . . . , y_m) is a proper alternating sequence. By induction hypothesis we have ξ_{σ₁} = ξ_{σ′₁} for the alternating sequence σ′₁ = (z₁, . . . , z_n); thus ξ_σ = ξ_{σ₁} ∘ ξ^{p!}_{y₀,z₁} = ξ_{σ′₁} ∘ ξ^{p!}_{z₀,z₁} = ξ_{σ′}.
(b) (y₀, y₁) ∈ ker(p), (z₀, z₁) ∈ ker(−p): Then σ₁ = (z₁, z₀ = y₀, y₁, . . . , y_m) is a proper alternating sequence. By induction hypothesis we have ξ_{σ₁} = ξ_{σ′₁} for the alternating sequence σ′₁ = (z₁, . . . , z_n); thus we obtain, finally, ξ_σ = ξ_{σ₁} ∘ ξ^{(−p)!}_{z₀,z₁} = ξ_{σ′₁} ∘ ξ^{(−p)!}_{z₀,z₁} = ξ_{σ′}. □
References

[1] M. Barr & C. Wells (1990): Category Theory for Computing Science. Prentice Hall International Series.
[2] M. Barr & C. Wells (2005): Toposes, Triples and Theories. Reprints in Theory and Applications of Categories 12, pp. 1–287. Available at http://www.case.edu/artsci/math/wells/pub/pdf/ttt.pdf.
[3] Z. Diskin & U. Wolter (2008): A Diagrammatic Logic for Object-Oriented Visual Modeling. Electr. Notes Theor. Comput. Sci. 203(6), pp. 19–41, doi:10.1016/j.entcs.2008.10.041.
[4] H. Ehrig, K. Ehrig, U. Prange & G. Taentzer (2006): Fundamentals of Algebraic Graph Transformation. Springer.
[5] H. Ehrig & B. Mahr (1985): Fundamentals of Algebraic Specification 1: Equations and Initial Semantics. Springer-Verlag, Berlin, Heidelberg.
[6] Hartmut Ehrig, M. Grosse-Rhode & U. Wolter (1998): Applications of Category Theory to the Area of Algebraic Specification in Computer Science. Applied Categorical Structures 6, pp. 1–35, doi:10.1023/A:1008688122154.
[7] Peter Freyd (1972): Aspects of Topoi. Bull. Austral. Math. Soc. 7, pp. 1–76, doi:10.1017/S0004972700044828.
[8] Robert Goldblatt (1984): Topoi: The Categorial Analysis of Logic. Dover Publications.
[9] A. Grothendieck (1959): Techniques de descente et théorèmes d'existence en géométrie algébrique, I. Généralités. Séminaire Bourbaki 190.
H. K¨onig, U. Wolter, M.L¨owe
81
[10] T. Heindel & P. Sobocinski (2009): Van Kampen Colimits as Bicolimits in Span. In A. Kurz, M. Lenisa & A. Tarlecki, editors: Algebra and Coalgebra in Computer Science, Lecture Notes in Computer Science 5728, Springer Berlin / Heidelberg, pp. 335–349, doi:10.1007/9783642037412 23. [11] G Janelidze & W. Tholen (1994): Facets of Descent I. Applied Categorical Structures 2, pp. 245–281, doi:10.1007/BF00878100. [12] Peter Johnstone (2002): Sketches of an Elephant A Topos Theory Compendium, Volume 1. Oxford Science Publication. [13] S. Lack & P. Soboci´nski (2004): Adhesive Categories. In: Foundations of Software Science and Computation Structures (FoSSaCS ’04), 2987, Springer, pp. 273–288, doi:10.1007/9783540247272 20. [14] S. Lack & P. Soboci´nski (2006): Toposes are Adhesive. LNCS 4178, pp. 184–198, doi:10.1007/11841883 14. [15] Michael L¨owe (2010): Graph Rewriting in Span Categories. In: Proceedings of the ICGT, Lecture Notes in Computer Science 6372, Springer, pp. 218–233, doi:10.1007/9783642159282 15. [16] Michael L¨owe (2010): VanKampen Pushouts for Sets and Graphs. Technical Report, University of Applied Sciences, FHDW Hannover. [17] Adrian Rutle, Uwe Wolter & Yngve Lamo (2008): A Diagrammatic Approach to Model Transformations. In: Proceedings of the 2008 Euro American Conference on Telematics and Information Systems (EATIS 2008), ACM, pp. 1–8, doi:10.1145/1621087.1621105. [18] P. Soboczi´nsky (2004): Deriving Process Congruences from Reaction Rules. Technical Report DS046, BRICS Dissertation Series. [19] U. Wolter & Z. Diskin (2007): From Indexed to Fibred Semantics – The Generalized Sketch File –. Reports in Informatics 361, Dep. of Informatics, University of Bergen.
Satisfaction, Restriction and Amalgamation of Constraints in the Framework of M-Adhesive Categories

Hanna Schölzel(1), Hartmut Ehrig(1), Maria Maximova(1), Karsten Gabriel(1), and Frank Hermann(1,2)
1) Institut für Softwaretechnik und Theoretische Informatik, Technische Universität Berlin, Germany
2) University of Luxembourg, Interdisciplinary Centre for Security, Reliability and Trust
[hannas, ehrig, mascham, kgabriel, frank]@cs.tu-berlin.de
Application conditions for rules and constraints for graphs are well known in the theory of graph transformation and have already been extended to M-adhesive transformation systems. Following the literature, we distinguish two kinds of satisfaction for constraints, called general and initial satisfaction of constraints, where initial satisfaction is defined for constraints over an initial object of the base category. Unfortunately, the standard definition of general satisfaction is not compatible with negation, in contrast to initial satisfaction. Based on the well-known restriction of objects along type morphisms, we study in this paper the restriction and amalgamation of application conditions and constraints together with their solutions. In our main result, we show the compatibility of initial satisfaction for positive constraints with restriction and amalgamation, while general satisfaction fails in general. Our main result is based on the compatibility of composition via pushouts with restriction, which is ensured by the horizontal van Kampen property in addition to the vertical one that is generally satisfied in M-adhesive categories.
1 Introduction
The framework of M-adhesive categories has been introduced recently [8, 3] as a generalization of different kinds of high-level replacement systems based on the double-pushout (DPO) approach [6]. Prominent examples that fit into the framework of M-adhesive categories are (typed attributed) graphs [6, 19] and (high-level) Petri nets [2, 10]. In the context of domain-specific languages and model transformations based on graph transformation, graph conditions (constraints) are already used extensively for the specification of model constraints and of application conditions of transformation rules. Graph conditions can be nested, may contain Boolean expressions [13, 14] and are expressively equivalent to first-order formulas on graphs [4], as shown in [14, 20]. We generally use the term "nested condition" whenever we refer to the most general case.

Restriction is a general concept for the definition of views of domain languages and is used for reducing the complexity of a model and for sharpening the focus to relevant model element types. A major research challenge in this field is to provide general results that allow for reasoning about properties of the full model (system) by analyzing only the restricted properties on the views (restrictions) of the model. Technically, a restriction of a model is given as a pullback along type morphisms. While this construction can be extended directly to restrictions of nested conditions, the satisfaction of the restricted nested conditions is not generally guaranteed for the restricted models, but, as we show in this paper, can be ensured under certain sufficient conditions.

According to the literature [14, 6], we distinguish two kinds of satisfaction for nested conditions, called general and initial satisfaction, where initial satisfaction is defined for nested conditions over an initial object of the base category. Intuitively, general satisfaction requires that a property holds for all occurrences of a premise pattern, while initial satisfaction requires this property for at least one occurrence. Unfortunately, the standard definition of general satisfaction is not compatible with the Boolean operators for negation and disjunction, whereas initial satisfaction is compatible with all Boolean operators (see App. A in [21]). In order to show, in addition, compatibility of initial satisfaction with restriction, we introduce the concept of amalgamation for typed objects, where objects can be amalgamated along their overlappings according to the given type restrictions. As the main technical result, we show that solutions for nested conditions can be composed and decomposed along an amalgamation of them (Thm. 4.10) if the nested conditions are positive, i.e., they contain neither a negation nor a "for all" expression (universal quantification). Based on this property, we show in our main result (Thm. 5.1) that initial satisfaction of positive nested conditions is compatible with amalgamation based on restrictions that agree on their overlappings. Note in particular that this result does not hold for general satisfaction, which we illustrate by a concrete counterexample.

The structure of the paper is as follows. Section 2 reviews the general framework of M-adhesive categories and the main concepts for nested conditions and their satisfaction. Thereafter, Sec. 3 presents the restriction of objects and nested conditions along type morphisms. Section 4 contains the constructions and results concerning the amalgamation of objects and nested conditions, and in Sec. 5 we present our main result, showing the compatibility of initial satisfaction with amalgamation and restriction. Related work is discussed in Sec. 6. Section 7 concludes the paper and discusses aspects of future work.

U. Golas, T. Soboll (Eds.): Proceedings of ACCAT 2012, EPTCS 93, 2012, pp. 83–104, doi:10.4204/EPTCS.93.5. © H. Schölzel, H. Ehrig, M. Maximova, K. Gabriel & F. Hermann. This work is licensed under the Creative Commons Attribution License.
Appendix A contains the proofs that are not contained in the main part. Additionally, App. A in [21] provides formal details concerning the transformation between both satisfaction relations and, moreover, their compatibility or incompatibility, respectively, with the Boolean operators.
2 General Framework and Concepts
In this section we recall some basic well-known concepts and notions, and introduce some new notions that we use in our approach. Our considerations are based on the framework of M-adhesive categories. An M-adhesive category [8] consists of a category C together with a class M of monomorphisms as defined in Def. 2.1 below. The concept of M-adhesive categories generalizes that of adhesive [17], adhesive HLR [9], and weak adhesive HLR categories [6].

Definition 2.1 (M-Adhesive Category). An M-adhesive category (C, M) is a category C together with a class M of monomorphisms satisfying:
• the class M is closed under isomorphisms, composition and decomposition,
• C has pushouts and pullbacks along M-morphisms,
• M-morphisms are closed under pushouts and pullbacks, and
• the vertical van Kampen (short VK) property holds. This means that pushouts along M-morphisms are M-VK squares, i.e., pushout (1) with m ∈ M is an M-VK square if, for all commutative cubes (2) with (1) in the bottom, all vertical morphisms a, b, c, d ∈ M and pullbacks in the back faces, the top face is a pushout if and only if the front faces are pullbacks.
[Diagrams omitted: pushout square (1) with m : A → B in M, and commutative cube (2) with square (1) in its bottom face and vertical morphisms a, b, c, d.]
Remark 2.2. In Sec. 3, Sec. 4 and Sec. 5 we will also need the horizontal VK property, where the VK property is only required for commutative cubes with all horizontal morphisms in M (see [8]), to show the compatibility of object composition and the corresponding restrictions. Note, moreover, that an M-adhesive category which also satisfies the horizontal VK property is a weak adhesive HLR category [6]. A set of transformation rules over an M-adhesive category according to the DPO approach constitutes an M-adhesive transformation system [8]. For various examples (graphs, Petri nets, etc.) see [6]. In Sec. 3, Sec. 4 and Sec. 5 we consider M-adhesive categories with effective pushouts. According to [18], the formal definition is as follows.

Definition 2.3 (Effective Pushout). Given M-morphisms a : B → X and b : C → X in an M-adhesive category (C, M), let (A, p1, p2) be the pullback of a and b. Then the pushout (1) of p1 and p2 is called effective if the unique morphism u : D → X induced by pushout (1) is an M-morphism.
[Diagram omitted: pullback span B ← A → C of a and b, pushout (1) with pushout object D, and the induced morphism u : D → X.]
Nested conditions in this paper are defined as application conditions for rules in [13]. Depending on the context in which a nested condition occurs, we use the terms application condition [13] and constraint [6], respectively. Furthermore, we define positive nested conditions, to be used in Sec. 3, Sec. 4, and Sec. 5 for our main results.

Definition 2.4 (Nested Condition). A nested condition acP over an object P is inductively defined as follows:
• true is a nested condition over P.
• For every morphism a : P → C and nested condition acC over C, ∃(a, acC) is a nested condition over P.
• A nested condition can also be a Boolean formula over nested conditions. This means that also ¬acP, ∧i∈I acP,i, and ∨i∈I acP,i are nested conditions over P for nested conditions acP, acP,i (i ∈ I) over P for some index set I.

Furthermore, we distinguish the following concepts:
• A nested condition is called application condition in the context of rules and match morphisms.
• A nested condition is called constraint in the context of properties of objects.
• A positive nested condition is built up only from nested conditions of the form true, ∃(a, ac), ∧i∈I acP,i and ∨i∈I acP,i, where I ≠ ∅.

An example for a nested condition and its meaning is given below.
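The inductive structure of Def. 2.4 can be rendered directly as an algebraic datatype. The following is a minimal sketch (all class and function names are ours, not the paper's); morphisms are left opaque, and positivity is checked syntactically as in the last item above.

```python
from dataclasses import dataclass
from typing import List

class Cond:
    """Base class for nested conditions over an object P."""

@dataclass
class TrueCond(Cond):          # the condition true
    pass

@dataclass
class Exists(Cond):            # ∃(a, ac_C) for a morphism a : P -> C
    a: object                  # the morphism a (opaque in this sketch)
    sub: Cond                  # the nested condition ac_C over C

@dataclass
class Not(Cond):               # ¬ac
    sub: Cond

@dataclass
class And(Cond):               # ∧_{i ∈ I} ac_i
    subs: List[Cond]

@dataclass
class Or(Cond):                # ∨_{i ∈ I} ac_i
    subs: List[Cond]

def is_positive(ac: Cond) -> bool:
    """Positive conditions use only true, ∃, ∧ and ∨ with I ≠ ∅."""
    if isinstance(ac, TrueCond):
        return True
    if isinstance(ac, Exists):
        return is_positive(ac.sub)
    if isinstance(ac, (And, Or)):
        return len(ac.subs) > 0 and all(is_positive(s) for s in ac.subs)
    return False               # Not (or anything else) is not positive
```

Note that the requirement I ≠ ∅ shows up as the `len(ac.subs) > 0` check.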
Example 2.5 (Nested Condition). Consider the nested condition acP from Fig. 2, where all morphisms are inclusions. Condition acP means that the source of every b-edge has a b-self-loop and must be followed by some c-edge such that, subsequently, there is a path in the reverse direction visiting the source and target of the first b-edge with precisely one c-edge and one b-edge in an arbitrary order. We denote this nested condition by acP = ∃(a1, true) ∧ ∃(a2, ∃(a3, true) ∨ ∃(a4, true)).

We now define inductively whether a morphism satisfies a nested condition (see [6]).

Definition 2.6 (Satisfaction of Nested Condition). Given a nested condition acP over P, a morphism p : P → G satisfies acP (see Fig. 1(a)), written p ⊨ acP, if:
• acP = true, or
• acP = ∃(a, acC) with a : P → C and there exists a morphism q : C → G ∈ M such that q ∘ a = p and q ⊨ acC, or
• acP = ¬ac′P and p ⊭ ac′P, or
• acP = ∧i∈I acP,i and for all i ∈ I it holds that p ⊨ acP,i, or
• acP = ∨i∈I acP,i and for some i ∈ I it holds that p ⊨ acP,i.
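The clauses of Def. 2.6 can be executed in the simplest M-adhesive setting, the category of finite sets with M the class of injective functions. The sketch below is ours (objects are Python sets, morphisms are dicts, conditions are tuples); the only categorical ingredient is the enumeration of M-morphisms q with q ∘ a = p.

```python
from itertools import permutations

# Toy model: objects are finite sets, morphisms are injective dicts.
# Conditions are encoded as tuples (our encoding, not the paper's):
#   ("true",), ("exists", a, C, sub), ("not", sub),
#   ("and", [subs]), ("or", [subs]).

def extensions(a, C, p, G):
    """Enumerate injective q : C -> G with q ∘ a = p."""
    fixed = {a[x]: p[x] for x in p}            # forced on the image of a
    rest = [c for c in C if c not in fixed]
    targets = [g for g in G if g not in fixed.values()]
    for img in permutations(targets, len(rest)):
        yield {**fixed, **dict(zip(rest, img))}

def satisfies(p, G, ac):
    """p ⊨ ac, following the inductive clauses of Def. 2.6."""
    tag = ac[0]
    if tag == "true":
        return True
    if tag == "exists":
        _, a, C, sub = ac
        return any(satisfies(q, G, sub) for q in extensions(a, C, p, G))
    if tag == "not":
        return not satisfies(p, G, ac[1])
    if tag == "and":
        return all(satisfies(p, G, s) for s in ac[1])
    if tag == "or":
        return any(satisfies(p, G, s) for s in ac[1])
    raise ValueError(tag)
```

For instance, with P = {1}, C = {1, 2}, the inclusion a = {1: 1} and a match p = {1: 10}, the condition ∃(a, true) holds over G = {10, 20} but not over G = {10}, since no injective extension of p to C exists in the latter case.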
In the following we distinguish two kinds of satisfaction relations for constraints: general [6] and initial satisfaction [14]. Initial satisfaction is defined for constraints over an initial object of the base category, while general satisfaction is considered for constraints over arbitrary objects. Intuitively, while general satisfaction requires that a constraint acP is satisfied by every M-morphism p : P → G, initial satisfaction requires just the existence of an M-morphism p : P → G which satisfies acP.

[Figure 1 omitted: (a) satisfaction of acP by a morphism p : P → G via q : C → G with q ∘ a = p and q ⊨ acC; (b) initial satisfaction of acI via the initial morphisms iP : I → P and iG : I → G. Caption: Satisfaction of nested conditions.]

Definition 2.7 (General Satisfaction of Constraints). Given a constraint acP over P, an object G generally satisfies acP, written G ⊨ acP, if p ⊨ acP for all p : P → G ∈ M (see Fig. 1(a)).

Definition 2.8 (Initial Satisfaction of Constraints). Given a constraint acI over an initial object I, an object G initially satisfies acI, written G ⊨^I acI, if iG ⊨ acI for the initial morphism iG : I → G. Note that for acI = ∃(iP, acP) we have G ⊨^I acI ⇔ ∃ p : P → G ∈ M. p ⊨ acP (see Fig. 1(b)).

This means that general satisfaction corresponds to a universal satisfaction of constraints, while initial satisfaction corresponds to an existential satisfaction. For positive nested conditions, we define solutions for the satisfaction problem. A solution Q (a tree of morphisms) determines which morphisms are used to fulfill the satisfaction condition.

Definition 2.9 (Solution for Satisfaction of Positive Nested Conditions). Given a positive nested condition acP over P and a morphism p : P → G, then Q is a solution for p ⊨ acP if:
• acP = true and Q = ∅, or
• acP = ∃(a, acC) with a : P → C and Q = (q, QC) with an M-morphism q : C → G such that q ∘ a = p and QC is a solution for q ⊨ acC (see Fig. 1(a)), or
• acP = ∧i∈I acP,i and Q = (Qi)i∈I such that Qi is a solution for p ⊨ acP,i for all i ∈ I, or
• acP = ∨i∈I acP,i and Q = (Qi)i∈I such that there is a j ∈ I with solution Qj for p ⊨ acP,j, and for all k ∈ I with k ≠ j it holds that Qk = ∅.
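In the same toy setting of finite sets with injective functions as M (our set-based model, not the paper's), a solution in the sense of Def. 2.9 can be computed recursively: it is the tree of witnessing morphisms, with ∅ represented as the empty tuple.

```python
from itertools import permutations

# Conditions use the tuple encoding ("true",), ("exists", a, C, sub),
# ("and", [subs]), ("or", [subs]); only positive conditions are handled,
# exactly as in Def. 2.9. solve returns a solution tree or None.

def extensions(a, C, p, G):
    """Enumerate injective q : C -> G with q ∘ a = p."""
    fixed = {a[x]: p[x] for x in p}
    rest = [c for c in C if c not in fixed]
    targets = [g for g in G if g not in fixed.values()]
    for img in permutations(targets, len(rest)):
        yield {**fixed, **dict(zip(rest, img))}

def solve(p, G, ac):
    tag = ac[0]
    if tag == "true":
        return ()                                  # Q = ∅
    if tag == "exists":                            # Q = (q, Q_C)
        _, a, C, sub = ac
        for q in extensions(a, C, p, G):
            qc = solve(q, G, sub)
            if qc is not None:
                return (q, qc)
        return None
    if tag == "and":                               # Q = (Q_i)_{i ∈ I}
        subs = [solve(p, G, s) for s in ac[1]]
        return tuple(subs) if all(s is not None for s in subs) else None
    if tag == "or":                                # one Q_j, all others ∅
        for j, s in enumerate(ac[1]):
            qj = solve(p, G, s)
            if qj is not None:
                return tuple(qj if i == j else () for i in range(len(ac[1])))
        return None
    raise ValueError("not a positive nested condition: %r" % (tag,))
```

The disjunction clause mirrors the definition: the chosen branch j carries its solution Qj, while every other component is ∅.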
The following example demonstrates the general and initial satisfaction of constraints and gives the corresponding solutions.

Example 2.10 (Satisfaction and Solution of Constraints).
1. General Satisfaction: Consider the graph GA from Fig. 2 below and the constraint acP from Ex. 2.5. There are two possible M-morphisms p1, p2 : P → GA, where p1 is an inclusion and p2 maps b1 to b2 with the corresponding node mapping. For both matches p1 and p2, there is a b-self-loop on the image of node 1, a c-edge outgoing from the image of node 2, as well as the corresponding images for the edges b2 and c2 in C3. Thus, GA generally satisfies acP. A corresponding solution for p1 ⊨ acP is given by Qgen = (Qi)i∈{1,2} with Q1 = (q1, ∅) and Q2 = (q2, (Qj)j∈{3,4}), where Q3 = (q3, ∅), Q4 = ∅ and qi : Ci → GA for i = 1, 2, 3 are inclusions.
[Figure 2 omitted: the graphs P, C1, C2, C3, C4 of the nested condition acP with morphisms a1, a2, a3, a4, the initial morphisms iP and iG, and the graph GA with matches p1, p2 and solution morphisms q1, q2, q3.]
Figure 2: General and initial satisfaction of constraints

2. Initial Satisfaction: Let acI = ∃(iP, acP) with iP as depicted in Fig. 2 and acP from Ex. 2.5. The graph GA initially satisfies acI since there is p1 : P → GA ∈ M satisfying acP, as mentioned before. A corresponding solution for iG ⊨ acI is given by Qinit = (p1, Qgen) with Qgen from the example for general satisfaction.

Remark 2.11. A nested condition is called typed over a given type object if all nested conditions in every of its nesting levels are also typed over the same type object. Furthermore, matches and corresponding solutions are required to be compatible with this type object as well.
3 Restriction Along Type Morphisms

In this section, we present the restriction of objects, morphisms, positive nested conditions and their solutions along type morphisms, which is the basis for the amalgamation of nested conditions in Sec. 4.
General Assumption. In this and the following sections, we consider an M-adhesive category (C, M) that satisfies the horizontal VK property (see Rem. 2.2) and has effective pushouts (see Def. 2.3).

Definition 3.1 (Restriction along Type Morphism). Given an object GA typed over TGA by tGA : GA → TGA and t : TGB → TGA ∈ M, then TGB is called a restriction of TGA, GB is a restriction of GA, and tGB is a restriction of tGA if (1) is a pullback. Given a : G′A → GA, then b is a restriction of a along the type morphism t, written b = Restr_t(a), if (2) is a pullback.

[Diagram omitted: pullback square (1) with tGB : GB → TGB, t : TGB → TGA and tGA : GA → TGA, and pullback square (2) with b : G′B → G′A the restriction of a : G′A → GA.]
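In the elementary case of finite sets with inclusions of type sets (a toy instance of Def. 3.1; the set-based model and names are ours), the pullback defining a restriction is just a preimage: the restricted object keeps exactly the elements whose type lies in the smaller type set.

```python
# A typed object is a typing map tG : G -> TG, represented as a dict
# element -> type; a type morphism t : TGB ⊆ TGA is an inclusion of
# type sets. The pullback squares (1) and (2) then become preimages.

def restrict_object(tGA, TGB):
    """Square (1): pullback of tGA : GA -> TGA along TGB ⊆ TGA.
    Returns the restricted typing map tGB : GB -> TGB."""
    return {x: ty for x, ty in tGA.items() if ty in TGB}

def restrict_morphism(a, tGA1, TGB):
    """Square (2): restriction Restr_t(a) of a : G'A -> GA to those
    elements of G'A whose type (given by tGA1) lies in TGB."""
    return {x: a[x] for x, ty in tGA1.items() if ty in TGB}
```

For a typed graph with edge types b and c, restricting to the type set {b, node} keeps the nodes and b-edges and discards the c-edges, matching the intuition of views described above.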
For positive nested conditions, we can define the restriction recursively as the restriction of their components.

Definition 3.2 (Restriction of Positive Nested Conditions). Given a positive nested condition acPA typed over TGA, let TGB be a restriction of TGA with t : TGB → TGA ∈ M. Then we define the restriction acPB = Restr_t(acPA) over the restriction PB of PA as follows:
• The restriction of true is true,
• the restriction of ∃(a, acCA) is given by the restrictions of a and acCA, i.e., acPB = ∃(Restr_t(a), Restr_t(acCA)), and
• the restriction of a Boolean formula is given by the restrictions of its components, i.e., Restr_t(¬ac′PA) = ¬Restr_t(ac′PA), Restr_t(∧i∈I acPA,i) = ∧i∈I Restr_t(acPA,i), and Restr_t(∨i∈I acPA,i) = ∨i∈I Restr_t(acPA,i).

[Diagram omitted: pullbacks along t : TGB → TGA restricting PA and CA with a : PA → CA to PB and CB with b : PB → PA and c : CB → CA.]
Now we extend the restriction construction to solutions of positive nested conditions and show in Fact 3.4 that a restriction of a solution is also a solution for the corresponding restricted constraint.

Definition 3.3 (Restriction of Solutions for Positive Nested Conditions). Given a positive nested condition acPA typed over TGA together with a restriction acPB along t : TGB → TGA, a morphism pA : PA → G, and a solution QA for pA ⊨ acPA, the restriction QB of QA along t, written QB = Restr_t(QA), is defined inductively as follows:
• If QA is empty, then QB is also empty,
• if acPA = ∃(a : PA → CA, acCA) and QA = (qA, QCA), then QB = (qB, QCB) such that qB and QCB are the restrictions of qA and QCA, respectively, and
• if acPA = ∧i∈I acPA,i or acPA = ∨i∈I acPA,i, and QA = (QA,i)i∈I, then QB = (QB,i)i∈I such that QB,i is a restriction of QA,i for all i ∈ I.

Fact 3.4 (Restriction of Solutions for Positive Nested Conditions). Given a positive nested condition acPA and a match pA : PA → GA over TGA with restrictions acPB = Restr_t(acPA) and pB = Restr_t(pA) along t : TGB → TGA, then for every solution QA of pA ⊨ acPA, there is a solution QB = Restr_t(QA) for pB ⊨ acPB.
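Def. 3.3 descends a solution tree level-wise. In the toy inclusion-based setting (our encoding: () for ∅, (q, sub) for an existential witness with q a dict, and tuples of subtrees for conjunctions and disjunctions), the restriction simply filters every witnessing morphism by type:

```python
# types maps each domain element occurring in the morphisms of Q to its
# type; TGB is the subtype set t : TGB ⊆ TGA. All names are ours.

def restrict_solution(Q, types, TGB):
    """Restr_t(Q): restrict a solution tree level-wise (Def. 3.3)."""
    if Q == ():
        return ()                       # empty solution stays empty
    if isinstance(Q, tuple) and len(Q) == 2 and isinstance(Q[0], dict):
        q, sub = Q                      # existential case: Q = (q, Q_C)
        qB = {x: y for x, y in q.items() if types.get(x) in TGB}
        return (qB, restrict_solution(sub, types, TGB))
    # and/or case: a tuple of component solutions
    return tuple(restrict_solution(s, types, TGB) for s in Q)
```

This is the computational content of Fact 3.4 in the toy model: the filtered tree is again a well-formed solution for the restricted condition.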
4 Amalgamation
The amalgamation of typed objects allows us to combine objects of different types, provided that they agree on a common subtype. This concept is already known in the context of different kinds of Petri net processes, such as open net processes [1] and algebraic high-level processes [7], which can be seen as special kinds of typed objects. In this section, we introduce a general definition for the amalgamation of typed objects. Moreover, we extend the concept to the amalgamation of positive nested conditions and their solutions. As required for amalgamation, we discuss under which conditions morphisms can be composed via a span of restriction morphisms. Two morphisms gB and gC "agree" in a morphism gD if gD can be constructed as a common restriction and can be used as a composition interface for gB and gC, as in Def. 4.1.

Definition 4.1 (Agreement and Amalgamation of Typed Objects). Given a span TGB ← TGD → TGC with tgDB : TGD → TGB, tgDC : TGD → TGC ∈ M and typed objects gB : GB → TGB, gC : GC → TGC and gD : GD → TGD, we say that gB and gC agree in gD if gD is a restriction of gB and gC, i.e., Restr_tgDB(gB) = gD = Restr_tgDC(gC). Given pushout (1) below with all morphisms in M and typed objects gB, gC agreeing in gD, a morphism gA : GA → TGA is called the amalgamation of gB and gC over gD, written gA = gB +gD gC, if the outer square is a pushout and gB, gC are restrictions of gA.

[Diagram omitted: pushout (1) of the type objects TGB ← TGD → TGC with pushout object TGA, together with the outer pushout GB ← GD → GC with pushout object GA and the typing morphisms gA, gB, gC, gD.]
Fact 4.2 is essentially based on the horizontal VK property.

Fact 4.2 (Amalgamation of Typed Objects). Given pushout (1) with all morphisms in M as in Def. 4.1:
Composition. Given gB, gC agreeing in gD, there exists a unique amalgamation gA = gB +gD gC.
Decomposition. Vice versa, given gA : GA → TGA, there are unique restrictions gB, gC, and gD of gA such that gA = gB +gD gC.
Here and in the following, uniqueness means uniqueness up to isomorphism.

Proof. Given gB, gC agreeing in gD, we have that the upper two trapezoids are pullbacks. Now we construct GA as the pushout of GB and GC via GD, such that the outer diamond is a pushout. This leads to a unique induced morphism gA : GA → TGA such that the diagram commutes, and via the horizontal VK property we obtain that the lower two trapezoids are pullbacks, and therefore gA = gB +gD gC. Vice versa, given gA : GA → TGA, where TGA, TGB, TGC, TGD are given such that (1) is a pushout with M-morphisms only, we can construct GB, GC, GD as restrictions such that the trapezoids become pullbacks. Then the horizontal VK property implies that the outer diamond is a pushout; gA is unique because of the universal property, and gA = gB +gD gC. The uniqueness (up to isomorphism) of the amalgamated composition and decomposition constructions follows from the uniqueness of pushouts and pullbacks up to isomorphism. □
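The composition direction of Fact 4.2 can be illustrated in the toy setting where all morphisms are inclusions of finite sets, so a pushout along inclusions is a union glued over the interface. The set-based model and all names below are ours:

```python
# A typed object is a typing map (dict) element -> type. Restriction to a
# type set is the preimage; two typed objects agree in gD if both restrict
# to the same gD over the interface type set TGD. The amalgamation
# gA = gB +gD gC is then the union of the typing maps.

def restrict_to(g, TG):
    """Preimage restriction of a typing map g to the type set TG."""
    return {x: t for x, t in g.items() if t in TG}

def amalgamate(gB, gC, TGD):
    """gA = gB +gD gC over the interface type set TGD, provided that
    gB and gC agree in the common restriction gD."""
    gD = restrict_to(gB, TGD)
    assert gD == restrict_to(gC, TGD), "gB and gC do not agree in gD"
    # shared elements must carry the same type (they lie in the interface)
    assert all(gB[x] == gC[x] for x in set(gB) & set(gC))
    return {**gB, **gC}
```

The decomposition direction is visible in the same model: restricting the amalgamated object back to the component type sets recovers gB and gC, mirroring the uniqueness statement of Fact 4.2.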
Example 4.3 (Amalgamation of Typed Objects). Figure 3 shows a pushout of the type graphs TGA, TGB, TGC and TGD.
Composition. Consider the typed graphs GB, GC and GD typed over TGB, TGC and TGD, respectively. The graph GD, containing the same nodes as GB and GC and no edges, is the common restriction of GB and GC. So the type morphisms gB and gC agree in gD, which by Fact 4.2 means that there is an amalgamation gA = gB +gD gC. It can be obtained by computing the pushout of GB and GC over GD, leading to the graph GA that contains the b-edges of GB as well as the c-edges of GC. The type morphism gA is induced by the universal property of pushouts, mapping all edges in the same way as gB and gC.
Decomposition. Vice versa, consider the graph GA typed over TGA. We can restrict GA to the type graphs TGB and TGC, leading to the typed graphs GB and GC, which contain only the b-edges respectively c-edges of GA. Restricting the graphs GB and GC to the type graph TGD, we obtain in both cases the graph GD that contains no edges, and we have that gA = gB +gD gC.
[Figure 3 omitted: the typed graphs GA, GB, GC, GD with typing morphisms gA, gB, gC, gD into the pushout of the type graphs TGA, TGB, TGC, TGD.]
Figure 3: Amalgamation of typed graphs

We already defined the restriction of positive nested conditions (Def. 3.2) and of their solutions (Def. 3.3). Now we consider the case that we have two conditions which have a common restriction and can be amalgamated.

Definition 4.4 (Agreement and Amalgamation of Positive Nested Conditions). Given pushout (1) below with all morphisms in M, two positive nested conditions acPB typed over TGB and acPC typed over TGC agree in acPD typed over TGD if acPD is a restriction of acPB and acPC. Given acPB and acPC agreeing in acPD, a positive nested condition acPA typed over TGA is called the amalgamation of acPB and acPC over acPD, written acPA = acPB +acPD acPC, if acPB and acPC are restrictions of acPA and tPA = tPB +tPD tPC. In particular, we have trueA = trueB +trueD trueC, short true = true +true true.

[Diagram omitted: pushout (1) of the type objects TGA, TGB, TGC, TGD with the typed conditions acPA, acPB, acPC, acPD over PA, PB, PC, PD and typing morphisms tPA, tPB, tPC, tPD.]
In the following Fact 4.5, we give a construction for the amalgamation of positive nested conditions, and in Thm. 4.10 for the corresponding solutions.

Fact 4.5 (Amalgamation of Positive Nested Conditions). Given pushout (1) as in Def. 4.4 with all morphisms in M:
Composition. If there are positive nested conditions acPB and acPC typed over TGB and TGC, respectively, agreeing in acPD typed over TGD, then there exists a unique positive nested condition acPA typed over TGA such that acPA = acPB +acPD acPC.
Decomposition. Vice versa, given a positive nested condition acPA typed over TGA, there are unique restrictions acPB, acPC and acPD of acPA such that acPA = acPB +acPD acPC.
The amalgamated composition and decomposition constructions are unique up to isomorphism.

Remark 4.6. Given an amalgamation acPA = acPB +acPD acPC of positive nested conditions, we can conclude from the proof of Fact 4.5 (see App. A) that we also have corresponding amalgamations in each level of nesting.
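Remark 4.6 says that amalgamated conditions are glued level-wise. In the toy inclusion-based setting this becomes a structural recursion over two condition trees of the same shape (the agreement situation); the tuple encoding, the union-based gluing and all names are ours:

```python
# Conditions use the tuple encoding ("true",) / ("exists", a, C, sub) /
# ("and", [subs]) / ("or", [subs]); morphisms a are dicts and codomains C
# are typing maps, glued by union as for inclusions of finite sets.

def amalgamate_cond(acB, acC):
    """Level-wise amalgamation of two agreeing positive conditions,
    sketching the composition part of Fact 4.5."""
    tag = acB[0]
    assert tag == acC[0], "agreeing conditions have the same structure"
    if tag == "true":
        return ("true",)                # true = true +true true
    if tag == "exists":
        _, aB, CB, subB = acB
        _, aC, CC, subC = acC
        # glue the morphisms and codomains of the two restrictions
        return ("exists", {**aB, **aC}, {**CB, **CC},
                amalgamate_cond(subB, subC))
    if tag in ("and", "or"):
        return (tag, [amalgamate_cond(b, c)
                      for b, c in zip(acB[1], acC[1])])
    raise ValueError(tag)
```

For example, gluing two existential conditions whose codomains share only a node but contribute a b-edge and a c-edge, respectively, yields an existential condition over the combined codomain, in the spirit of Ex. 4.7 below.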
[Figure 4 omitted: a pushout of the type graphs TGA, TGB, TGC, TGD together with the positive nested conditions acPA, acPB, acPC, acPD over PA, PB, PC, PD, their components C1 and C2 with morphisms a1, a2, and the connecting restriction and amalgamation morphisms.]
Figure 4: Amalgamation of positive nested conditions

Example 4.7 (Amalgamation of Positive Nested Conditions). Figure 4 shows a pushout of the typed graphs TGA, TGB, TGC and TGD, and four positive nested conditions acPA, acPB, acPC and acPD typed over TGA, TGB, TGC and TGD, respectively. For simplicity, the figure contains only the type morphisms of the Ps, but there are also corresponding type morphisms for the Cs, mapping all b-edges to b and all c-edges to c. We have acPA = ∨i∈{1,2} acCi,A with acCi,A = ∃(ai,A, true) for i = 1, 2, and acPB, acPC and acPD have a similar structure.
Composition. We have that tPD is a common restriction of tPB and tPC, and also that ai,D is a common restriction of ai,B and ai,C for i = 1, 2. Thus, acPD is a common restriction of acPB and acPC, which means that acPB and acPC agree in acPD. So by Fact 4.5 there exists an amalgamation acPA = acPB +acPD acPC, and according to Rem. 4.6 it can be obtained as the amalgamation of its components.
This means that we have an amalgamation tPA = tPB +tPD tPC with the pushout of the Ps as shown in Fig. 4, as well as amalgamations of the corresponding type morphisms of the Cs, leading to the pushouts depicted in Fig. 4 by dotted arrows for the C1s and by dashed arrows for the C2s. The morphisms a1,A and a2,A are obtained by the universal property of pushouts.
Decomposition. The other way around, considering the condition acPA, we can construct the restrictions acPB and acPC by deleting the c-edges respectively b-edges. Then, restricting acPB and acPC to TGD by deleting all remaining edges, we obtain the same condition acPD such that acPA = acPB +acPD acPC.

In order to answer the question under which conditions such amalgamated positive nested conditions are satisfied, we need to define an amalgamation of their solutions. Afterwards, we show in the proof of Thm. 4.10 that a composition of two solutions via an interface leads to a unique amalgamated solution, and that a given solution for an amalgamated positive nested condition is the amalgamation of its unique restrictions.

Definition 4.8 (Agreement and Amalgamation of Solutions for Positive Nested Conditions). Given pushout (1) below with all morphisms in M, an amalgamation of typed objects gA = gB +gD gC, and an amalgamation of positive nested conditions acPA = acPB +acPD acPC with corresponding matches pA = pB +pD pC:
1. Two solutions QB for pB ⊨ acPB and QC for pC ⊨ acPC agree in a solution QD for pD ⊨ acPD if QD is a restriction of QB and QC.
2. Given solutions QB for pB ⊨ acPB and QC for pC ⊨ acPC agreeing in a solution QD for pD ⊨ acPD, a solution QA for pA ⊨ acPA is called the amalgamation of QB and QC over QD, written QA = QB +QD QC, if QB and QC are restrictions of QA.
[Diagram omitted: the pushout (1) of the type objects TGA, TGB, TGC, TGD with the typed objects GA, GB, GC, GD, the conditions acPA, acPB, acPC, acPD over PA, PB, PC, PD, and the matches pA, pB, pC, pD.]
Remark 4.9. Note that by the assumption gA = gB +gD gC in the definition above we already have a pushout over the Gs, and by acPA = acPB +acPD acPC we also have a pushout over the Ps.

Theorem 4.10 (Amalgamation of Solutions for Positive Nested Conditions). Given pushout (1) as in Def. 4.8 with all morphisms in M, an amalgamation of typed objects gA = gB +gD gC, and an amalgamation of positive nested conditions acPA = acPB +acPD acPC with corresponding matches pA = pB +pD pC:
Composition. Given solutions QB for pB ⊨ acPB and QC for pC ⊨ acPC agreeing in a solution QD for pD ⊨ acPD, there is a solution QA for pA ⊨ acPA constructed as the amalgamation QA = QB +QD QC.
Decomposition. Given a solution QA for pA ⊨ acPA, there are solutions QB, QC and QD for pB ⊨ acPB, pC ⊨ acPC and pD ⊨ acPD, respectively, which are constructed as the restrictions QB, QC and QD of QA such that QA = QB +QD QC.
The amalgamated composition and decomposition constructions are unique up to isomorphism.

Remark 4.11. From the proof of Thm. 4.10 (see App. A) we can conclude that for a given amalgamation of solutions QA = QB +QD QC, we also have corresponding amalgamations of its components.
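The composition direction of Thm. 4.10 glues two agreeing solution trees level-wise. In the toy inclusion-based setting (our encoding, as before: () for ∅, (q, sub) for an existential witness with q a dict, and tuples of subtrees for conjunctions and disjunctions), this becomes:

```python
# A sketch of the composition part of Thm. 4.10: two solution trees of the
# same shape are amalgamated by taking the union of the witnessing
# morphisms at each existential level. All names and the set-based
# gluing-by-union model are ours, not the paper's.

def amalgamate_solution(QB, QC):
    if QB == () or QC == ():
        # an empty component solution only agrees with an empty one
        assert QB == QC == ()
        return ()
    if isinstance(QB, tuple) and len(QB) == 2 and isinstance(QB[0], dict):
        qB, subB = QB                   # existential case: Q = (q, Q_C)
        qC, subC = QC
        return ({**qB, **qC}, amalgamate_solution(subB, subC))
    # and/or case: tuples of component solutions, glued pointwise
    return tuple(amalgamate_solution(b, c) for b, c in zip(QB, QC))
```

The decomposition direction corresponds, in this model, to the type-based filtering of a solution tree sketched after Def. 3.3.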
5 Compatibility of Initial Satisfaction with Restriction and Amalgamation
In this section we present our main result, showing the compatibility of initial satisfaction with amalgamation (Thm. 5.1) and restriction (Cor. 5.2), which is based on the amalgamation of solutions for positive nested conditions (Thm. 4.10). This main result allows us to conclude the satisfaction of a constraint for a composed object from the satisfaction of the corresponding restricted constraints for the component objects. It is valid for initial satisfaction, but not for general satisfaction.

Theorem 5.1 (Compatibility of Initial Satisfaction with Amalgamation). Given pushout (1) below with all morphisms in M, an amalgamation of typed objects gA = gB +gD gC, and an amalgamation of positive constraints acA = acB +acD acC, we have:
Decomposition. Given a solution QA for GA ⊨^I acA, there are solutions QB for GB ⊨^I acB, QC for GC ⊨^I acC and QD for GD ⊨^I acD such that QA = QB +QD QC.
Composition. Vice versa, given solutions QB for GB ⊨^I acB and QC for GC ⊨^I acC agreeing in a solution QD for GD ⊨^I acD, there exists a solution QA for GA ⊨^I acA such that QA = QB +QD QC.
[Diagram omitted: the pushout (1) of the type objects TGA, TGB, TGC, TGD with the typed objects GA, GB, GC, GD, the initial object I with initial morphisms iGA, iGB, iGC, iGD, and the constraints acA, acB, acC, acD.]
Proof.
Decomposition. By Def. 2.8, a solution QA for GA ⊨^I acA is also a solution for iGA ⊨ acA, where iGA is the unique morphism iGA : I → GA. Moreover, due to the amalgamation gA = gB +gD gC, the inner trapezoids in the diagram above are pullbacks. So by the closure of M under pullbacks we have that gBA, gCA, gDB, gDC ∈ M, which means that they are monomorphisms. Therefore, the outer trapezoids become pullbacks by standard category theory, which means that iGB : I → GB is a restriction of iGA, iGC : I → GC is a restriction of iGA, and iGD : I → GD is a restriction of iGB as well as of iGC. Furthermore, the outer square in the diagram is a pushout, implying that we have an amalgamation iGA = iGB +iGD iGC. Thus, using Thm. 4.10, we obtain solutions QB for iGB ⊨ acB, QC for iGC ⊨ acC and QD for iGD ⊨ acD such that QA = QB +QD QC, and by Def. 2.8, QB, QC and QD are solutions for GB ⊨^I acB, GC ⊨^I acC and GD ⊨^I acD, respectively.
Composition. Now, given solutions QB, QC and QD for GB ⊨^I acB, GC ⊨^I acC and GD ⊨^I acD, respectively, by Def. 2.8 we have that QB, QC and QD are solutions for iGB ⊨ acB, iGC ⊨ acC and iGD ⊨ acD, respectively. As shown in item 1, there is iGA = iGB +iGD iGC and therefore, since QB and QC agree
94
Satisfaction, Restriction and Amalgamation of Constraints in QD , by Thm. 4.10 we obtain a solution QA for iGA acA such that QA = QB +QD QC . Finally, I
Def. 2.8 implies that QA is a solution for GA acA . Corollary 5.2 (Compatibility of Initial Satisfaction with Restriction). Given type restriction t : T GB → T GA ∈ M, object GA typed over T GA with restriction GB , and a positive constraint acA I
I
over initial object I typed over T GA with restriction acB . Then GA acA implies GB acB . Moreover, if I
I
QA is a solution for GA acA then QB = Restrt (QA ) is a solution for GB acB . Proof. Consider the diagram in Thm. 5.1 with GC = GA , GD = GB , acC = acA and acD = acB . Then by standard category theory we have that all rectangles in the diagram are pushouts and the trapezoids are pullbacks. Thus, we have gA = gB +gB gA and, analogously, acA = acB +acB acA with corresponding I
matches iGA = iGB +iGB iGA . So, given a solution QA for GA acA , by item 1 of Thm. 5.1 there is a solution I
QB for GB acB with QA = QB +QB QA such that by Def. 4.4 QB is a restriction of QA . Example 5.3 (Compatibility of Initial Satisfaction with Amalgamation). Figure 5 shows the amalgamation of typed graphs gA = gB +gD gC from Ex. 4.3 and an amalgamation of positive nested conditions acA = acB +acD acC . Note that we have acA = ∃ (iPA , acPA ) and acB , acC and acD with similar structure, where the amalgamation acPA = acPB +acPD acPC is presented in Ex. 4.7. I
Composition. For GB acB we have the solution QB = (qB , (Q1,B , Q2,B )) with Q1,B = (q1,B , 0) / and Q2,B = 0, / where qB and q1,B are inclusions. Moreover, we have similar solutions QC for I
I
GC acC and QD for GD acD . According to Rem. 4.11, the amalgamation QA = QB +QD QC can be constructed by amalgamation of the components. First, we explain in detail the amalgamation q1,A = q1,B +q1,D q1,C . Note that the graphs GA , GB , GC and GD can be considered as type graphs such that e. g. C1,D is typed over GD by q1,D . So, since q1,D is a common restriction of q1,B and q1,C , we have that q1,B and q1,C agree in q1,D . This means that there is an amalgamation of typed objects q1,A = q1,B +q1,D q1,C , where the inclusion q1,A maps all nodes and edges in the same way as q1,B and q1,C . Moreover, for the empty solutions we have an empty solution as amalgamation, and thus, we have amalgamations of solutions Q1,A = Q1,B +Q1,D Q1,C = (q1,A , 0) / and Q2,A = Q2,B +Q2,D Q2,C = 0. / The amalgamation qA = qB +qD qC can be obtained analogously as described for q1,A , and hence, we I
have QA = QB +QD QC = (qA , (Q1,A , Q2,A )), which is a solution for GA acA . I
Decomposition. For GA acA we have a solution QA = (qA , (Q1,A , Q2,A )) with Q1,A = (q1,A , 0) / and Q2,A = 0/ where qA and q1,A are inclusions. The restrictions QB , QC and QD of QA are given by restrictions of the components. By computing the restrictions q1,B , q1,C and q1,D of q1,A , and I
similar the restrictions of qA and 0, / we get as result again the solutions QB for GB acB , QC for I
I
GC acC , and QD for GD acD as described in the composition case above. From Cor. 5.2, we know that initial satisfaction is compatible with restriction of typed objects and constraints. In contrast, general satisfaction and restriction are not compatible in general. As the following example illustrates, it is possible that a typed object generally satisfies a constraint while the same does not hold for their restrictions.
H. Schölzel, H. Ehrig, M. Maximova, K. Gabriel & F. Hermann
Figure 5: Amalgamation of solutions for initial satisfaction

Example 5.4 (Restriction of General Satisfaction Fails in General). Figure 6 shows a restriction GB of the typed graph GA and a restriction acPB of the constraint acPA. There are two possible matches p1,A, p2,A : PA → GA ∈ M, where p1,A is an inclusion and p2,A maps b1 to b2 and c1 to c2. Since for each of the matches the graph GA contains the required edges in the inverse direction, both of the matches satisfy acPA. For pi,A we have qi,A with qi,A ∘ aA = pi,A for i = 1, 2. Thus, we have that GA ⊨ acPA. For the constraint acPB there is a match pB : PB → GB ∈ M mapping the edge b1 identically and node 3 to node 4. We have that pB ⊭ acPB because there is no edge from node 4 to node 2 in GB, which means that GB ⊭ acPB. This is due to the fact that there is no match pA : PA → GA ∈ M such that pB is the restriction of pA.
Figure 6: Counterexample for restriction of general satisfaction
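The phenomenon behind Figure 6 can be replayed mechanically. The sketch below does not encode the exact graphs of the figure; it uses a simplified variant of the same situation, assuming a constraint ∃(a : P → C) that requires a reverse edge for every edge. GA satisfies it generally, but the restriction GB, which has lost the reverse edge, does not. All encodings and names are illustrative assumptions.

```python
from itertools import permutations

# A graph is a node set plus a set of directed edges (pairs of nodes).
def matches(P_nodes, P_edges, G_nodes, G_edges):
    """All injective maps of pattern nodes into G that preserve edges."""
    out = []
    for image in permutations(sorted(G_nodes), len(P_nodes)):
        m = dict(zip(P_nodes, image))
        if all((m[u], m[v]) in G_edges for (u, v) in P_edges):
            out.append(m)
    return out

def satisfies(G_nodes, G_edges, P_nodes, P_edges, C_nodes, C_edges):
    """General satisfaction of ∃(a: P → C): every match of P extends to C."""
    for mP in matches(P_nodes, P_edges, G_nodes, G_edges):
        if not any(all(mC[x] == mP[x] for x in P_nodes)
                   for mC in matches(C_nodes, C_edges, G_nodes, G_edges)):
            return False
    return True

# Constraint: every matched edge (x, y) must have a reverse edge (y, x).
P = (("x", "y"), {("x", "y")})
C = (("x", "y"), {("x", "y"), ("y", "x")})

GA = ({1, 2}, {(1, 2), (2, 1)})   # every edge has a reverse: GA ⊨ ∃(a)
GB = ({1, 2}, {(1, 2)})           # restriction lost the reverse edge
print(satisfies(*GA, *P, *C))     # True
print(satisfies(*GB, *P, *C))     # False
```

The match of P in GB has no extension to C, mirroring the argument that there is no match pA of which pB is the restriction.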
6 Related Work

The framework of M-adhesive categories [8] generalizes various kinds of categories for high-level replacement systems, e.g. adhesive [17], quasi-adhesive [18], partial VK square adhesive [15], and weak
adhesive categories [6]. Therefore, the results of this paper are applicable to all of them, where the category of typed attributed graphs is a prominent example. The concepts of nested graph conditions [13] and first-order graph formulas [4] are shown to be expressively equivalent in [14], using the translation between first-order logic and predicates on edge-labeled graphs without parallel edges [20]. Multi-view modelling is an important concept in software engineering. Several approaches have been studied and used, e.g. focussing on aspect-oriented techniques [12]. In this line, graph transformation (GT) approaches have been extended to support view concepts based on the integration of type graphs. For this purpose, the concept of restriction along type morphisms has been studied and used intensively [11, 5], including GT systems using the concept of inheritance and views [5, 16]. Instead of the restriction of constraints considered in this paper, only more restrictive forward translations of view constraints have been studied in [5], for the case of atomic constraints with general satisfaction, leading to a result similar to Thm. 5.1. The notions of initial and general satisfaction for nested conditions can be transformed into each other [14], but this transformation uses the Boolean operator negation, which is not present in positive constraints, for which, however, our main result on the compatibility of restriction and initial satisfaction holds. Moreover, we have shown by counterexample that general satisfaction is not compatible with restriction in general, even if only positive constraints are considered.
7 Conclusion

Nested application conditions for rules and constraints for graphs and more general models have already been studied in the framework of M-adhesive transformation systems [6, 9]. The new contribution of this paper is to study the compatibility of satisfaction with restriction and amalgamation. This is important for large typed systems and objects, which can be decomposed by restriction and composed by amalgamation. The main result in this paper shows that initial satisfaction of positive constraints is compatible with restriction and amalgamation (Thm. 5.1 and Cor. 5.2). The amalgamation construction is based on the horizontal van Kampen (VK) property, which is required in addition to the vertical VK property of M-adhesive categories. To the best of our knowledge, this is the most interesting result for M-adhesive transformation systems that is based on the horizontal VK property. Note that the main result is valid neither for general satisfaction of positive constraints nor for initial satisfaction of general constraints. For future work, it is important to obtain weaker versions of the main result which are valid for general satisfaction and general constraints, respectively.
References
[1] P. Baldan, A. Corradini, H. Ehrig & R. Heckel (2001): Compositional Modeling of Reactive Systems Using Open Nets. In K. G. Larsen & M. Nielsen, editors: Proc. of CONCUR 2001, LNCS 2154, Springer, pp. 502–518, doi:10.1007/3-540-44685-0_34.
[2] E. Biermann, H. Ehrig, C. Ermel, K. Hoffmann & T. Modica (2009): Modeling Multicasting in Dynamic Communication-based Systems by Reconfigurable High-level Petri Nets. In: Proc. of IEEE Symposium on Visual Languages and Human-Centric Computing (VL/HCC 2009), IEEE, pp. 47–50, doi:10.1109/VLHCC.2009.5295303.
[3] B. Braatz, H. Ehrig, K. Gabriel & U. Golas (2010): Finitary M-Adhesive Categories. In H. Ehrig, A. Rensink, G. Rozenberg & A. Schürr, editors: Proc. ICGT 2010, LNCS 6372, Springer, pp. 234–249, doi:10.1007/978-3-642-15928-2_16.
[4] B. Courcelle (1997): The Expression of Graph Properties and Graph Transformations in Monadic Second-Order Logic. In Grzegorz Rozenberg, editor: Handbook of Graph Grammars, World Scientific, pp. 313–400.
[5] H. Ehrig, K. Ehrig, C. Ermel & U. Prange (2010): Consistent Integration of Models based on Views of Meta Models. Formal Aspects of Computing 22(3), pp. 327–345, doi:10.1007/s00165-009-0127-6.
[6] H. Ehrig, K. Ehrig, U. Prange & G. Taentzer (2006): Fundamentals of Algebraic Graph Transformation. EATCS Monographs in Theor. Comp. Science, Springer.
[7] H. Ehrig & K. Gabriel (2011): Transformation of Algebraic High-Level Nets and Amalgamation of Processes with Applications to Communication Platforms. Festschrift in Honour of Manfred Broy's 60th Birthday. International Journal of Software and Informatics 5(1-2, Part 1).
[8] H. Ehrig, U. Golas & F. Hermann (2010): Categorical Frameworks for Graph Transformation and HLR Systems based on the DPO Approach. Bulletin of the EATCS 102, pp. 111–121.
[9] H. Ehrig, A. Habel, J. Padberg & U. Prange (2006): Adhesive High-Level Replacement Systems: A New Categorical Framework for Graph Transformation. Fundamenta Informaticae 74(1), pp. 1–29.
[10] H. Ehrig, K. Hoffmann, J. Padberg, C. Ermel, U. Prange, E. Biermann & T. Modica (2008): Petri Net Transformations. In: Petri Net Theory and Applications, I-Tech Education and Publication, pp. 1–16, doi:10.5772/5310.
[11] G. Engels, R. Heckel, G. Taentzer & H. Ehrig (1997): A Combined Reference Model- and View-Based Approach to System Specification. International Journal of Software Engineering and Knowledge Engineering 7(4), pp. 457–477.
[12] R. B. France, I. Ray, G. Georg & S. Ghosh (2004): Aspect-oriented approach to early design modelling. IEE Proceedings - Software 151(4), pp. 173–186, doi:10.1049/ip-sen:20040920.
[13] A. Habel & K.-H. Pennemann (2005): Nested constraints and application conditions for high-level structures. In H.-J. Kreowski, U. Montanari, F. Orejas, G. Rozenberg & G. Taentzer, editors: Formal Methods in Software and Systems Modeling, LNCS 3393, Springer, pp. 294–308, doi:10.1007/978-3-540-31847-7_17.
[14] A. Habel & K.-H. Pennemann (2009): Correctness of high-level transformation systems relative to nested conditions. Mathematical Structures in Computer Science 19, pp. 1–52, doi:10.1017/S0960129508007202.
[15] T. Heindel (2010): Hereditary Pushouts Reconsidered. In H. Ehrig, A. Rensink, G. Rozenberg & A. Schürr, editors: Proc. ICGT 2010, LNCS 6372, Springer, pp. 250–265, doi:10.1007/978-3-642-15928-2_17.
[16] S. Jurack & G. Taentzer (2010): A Component Concept for Typed Graphs with Inheritance and Containment Structures. In H. Ehrig, A. Rensink, G. Rozenberg & A. Schürr, editors: Proc. ICGT 2010, LNCS 6372, Springer, pp. 187–202, doi:10.1007/978-3-642-15928-2_13.
[17] S. Lack & P. Sobociński (2004): Adhesive Categories. In Igor Walukiewicz, editor: Proc. FOSSACS 2004, LNCS 2987, Springer, pp. 273–288, doi:10.1007/978-3-540-24727-2_20.
[18] S. Lack & P. Sobociński (2005): Adhesive and quasiadhesive categories. ITA 39(3), pp. 511–545, doi:10.1051/ita:2005028.
[19] J. de Lara, R. Bardohl, H. Ehrig, K. Ehrig, U. Prange & G. Taentzer (2007): Attributed Graph Transformation with Node Type Inheritance. TCS 376(3), pp. 139–163, doi:10.1016/j.tcs.2007.02.001.
[20] A. Rensink (2004): Representing First-Order Logic Using Graphs. In H. Ehrig, G. Engels, F. Parisi-Presicce & G. Rozenberg, editors: Proc. ICGT 2004, LNCS 3256, Springer, pp. 319–335, doi:10.1007/978-3-540-30203-2_23.
[21] H. Schölzel, H. Ehrig, M. Maximova, K. Gabriel & F. Hermann (2012): Satisfaction, Restriction and Amalgamation of Constraints in the Framework of M-Adhesive Categories: Extended Version. Technical Report 2012/1, TU Berlin, Fak. IV. Available at http://www.eecs.tu-berlin.de/menue/forschung/forschungsberichte/2012.
A Remaining Proofs
In this appendix, we give the proofs for Fact 3.4, Fact 4.5 and Thm. 4.10.

Fact 3.4 (Restriction of Solutions for Positive Nested Conditions). Given a positive nested condition acPA and a match pA : PA → GA over TGA with restrictions acPB = Restr_t(acPA), pB = Restr_t(pA) along t : TGB → TGA. Then for a solution QA of pA ⊨ acPA there is a solution QB = Restr_t(QA) for pB ⊨ acPB.

Proof.
• For acPA = true the implication is trivial, because QA is empty, which means that also QB is empty and thus a solution for pB ⊨ acPB, since acPB is also true.
• For acPA = ∃(a, acCA) we have that QA = (qA, QCA) such that qA : CA → GA ∈ M with qA ∘ a = pA and QCA is a solution for qA ⊨ acCA. Then by qB = Restr_t(qA) : CB → GB, we have qB ∈ M and we also have tG : GB → GA ∈ M, because t ∈ M (see Fig. 7). So for acPB = ∃(b, acCB) we have tG ∘ qB ∘ b = qA ∘ tC ∘ b = qA ∘ a ∘ tP = pA ∘ tP = tG ∘ pB, which by monomorphism tG implies qB ∘ b = pB. Moreover, the fact that QCA is a solution for qA ⊨ acCA implies that QCB = Restr_t(QCA) is a solution for qB ⊨ acCB by induction hypothesis, and hence the restriction QB = (qB, QCB) of QA is a solution for pB ⊨ acPB.
• Now, for acPA = ⋀i∈I acPA,i we have acPB = ⋀i∈I Restr_t(acPA,i). By the fact that QA is a solution for pA ⊨ acPA, we have that QA = (QA,i)i∈I such that QA,i is a solution for pA ⊨ acPA,i for all i ∈ I. Thus, by induction hypothesis, we have restrictions QB,i = Restr_t(QA,i) that are solutions for pB ⊨ Restr_t(acPA,i) for all i ∈ I. Hence, the restriction QB = (QB,i)i∈I of QA is a solution for pB ⊨ acPB.
• Finally, for acPA = ⋁i∈I acPA,i we have acPB = ⋁i∈I Restr_t(acPA,i). By the fact that QA is a solution for pA ⊨ acPA we have that QA = (QA,i)i∈I such that for one j ∈ I there is a solution QA,j for pA ⊨ acPA,j and for all k ≠ j we have that QA,k = ∅. Thus, by induction hypothesis, the restriction QB,j of QA,j is a solution for pB ⊨ Restr_t(acPA,j). Hence, we also have that the restriction QB = (QB,i)i∈I is a solution for pB ⊨ acPB with QB,k = ∅ for k ≠ j.
Fact 4.5 (Amalgamation of Positive Nested Conditions). Given a pushout (1) as in Def. 4.4 with all morphisms in M.

(Figure 7: Restriction of the solution qA for pA ⊨ ∃(a, acCA) — diagram omitted.)
Composition. If there are positive nested conditions acPB and acPC typed over TGB and TGC, respectively, agreeing in acPD typed over TGD, then there exists a unique positive nested condition acPA typed over TGA such that acPA = acPB +acPD acPC.
Decomposition. Vice versa, given a positive nested condition acPA typed over TGA, there are unique restrictions acPB, acPC and acPD of acPA such that acPA = acPB +acPD acPC.
The amalgamated composition and decomposition constructions are unique up to isomorphism.

Proof. Composition. We perform an induction over the structure of acPD:
• acPD = true. Then we also have acPB = true and acPC = true, and the amalgamation acPA is trivially given by acPA = true.
• acPD = ∃(d, acCD) with d : PD → CD. The assumption that acPB and acPC agree in acPD means that acPD is a restriction of acPB and acPC and thus, by Def. 3.2, we have that acPB = ∃(b, acCB) with b : PB → CB, acPC = ∃(c, acCC) with c : PC → CC, d is a restriction of b and c, and acCD is a restriction of acCB and acCC. This in turn means that acCB and acCC agree in acCD according to Def. 4.4. So, by induction hypothesis, we obtain an amalgamation acCA = acCB +acCD acCC, which implies that tCA = tCB +tCD tCC, i.e., diagrams (2)–(5) below are pullbacks. By closure of M under pullbacks, we obtain from tgBA, tgCA ∈ M that also cBA, cCA ∈ M. Moreover, the fact that d is a restriction of b and c means that (6)+(2) and (7)+(3) are pullbacks, which by pullback decomposition implies that (6) and (7) are pullbacks. Note that b, c and d can be considered as typed over CB, CC and CD, respectively. So, according to Def. 4.1, we obtain that b and c agree in d with respect to the pushout of the C's, leading to an amalgamation a = b +d c : PA → CA with pullbacks (8) and (9) by Fact 4.2. Hence, acPA = ∃(a, acCA) is the required amalgamation.

(Diagram with pullbacks (2)–(9) relating the P's, C's and the type graphs over pushout (1) — omitted.)

• acPD = ⋀i∈I acPD,i. Since acPD is a restriction of acPB and acPC, they must be of the form acPB = ⋀i∈I acPB,i and acPC = ⋀i∈I acPC,i. Moreover, since acPB and acPC agree in acPD, we obtain that also acPB,i and acPC,i agree in acPD,i for all i ∈ I. So, by induction hypothesis, there are amalgamations acPA,i = acPB,i +acPD,i acPC,i such that acPB,i and acPC,i are restrictions of acPA,i for all i ∈ I. Hence, acPA = ⋀i∈I acPA,i is the required amalgamation.
• The remaining case for disjunction works analogously to the case for conjunction.
The uniqueness of the amalgamation follows from the fact that we have an amalgamation in each level of nesting and the amalgamation of typed objects is unique by Fact 4.2.
Decomposition. We do an induction over the structure of acPA:
• acPA = true. This case is trivial because true = true +true true.
• acPA = ∃(a, acCA) with a : PA → CA. Then by induction hypothesis, there exist restrictions acCB, acCC and acCD of acCA such that acCA = acCB +acCD acCC. Moreover, by Fact 4.2, there are unique restrictions b, c and d of a such that a = b +d c. Hence, we have restrictions acPB = ∃(b, acCB), acPC = ∃(c, acCC) and acPD = ∃(d, acCD) of acPA, and, as shown for the case of composition before, the fact that acCA = acCB +acCD acCC and a = b +d c implies that acPA = acPB +acPD acPC.
• acPA = ⋀i∈I acPA,i. Then by induction hypothesis, there exist restrictions acPB,i, acPC,i and acPD,i of acPA,i such that acPA,i = acPB,i +acPD,i acPC,i for all i ∈ I. Hence, acPB = ⋀i∈I acPB,i, acPC = ⋀i∈I acPC,i and acPD = ⋀i∈I acPD,i are restrictions of acPA such that acPA = acPB +acPD acPC.
• Again, the remaining case for disjunction works analogously to the case for conjunction.
The uniqueness of the decomposition follows from the uniqueness of restrictions by pullback construction.
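The inductions in Fact 3.4 and Fact 4.5 all follow the structure of positive nested conditions: true, ∃(a, ac), conjunction and disjunction. As an illustration, the sketch below encodes this syntax as tagged tuples and performs restriction by the corresponding structural recursion. The table restr stands in for the pullback-based restriction of morphisms from Def. 3.2, so this is a schematic sketch, not the categorical definition; all names are hypothetical.

```python
# Positive nested conditions as tagged tuples:
#   ("true",), ("exists", a, ac), ("and", [ac, ...]), ("or", [ac, ...]).
# Restriction replaces each morphism by its restriction along t (looked up
# in a user-supplied table) and recurses into the subconditions.

def restrict(ac, restr):
    """Structural recursion mirroring the cases of Fact 3.4 / Fact 4.5."""
    tag = ac[0]
    if tag == "true":
        return ("true",)
    if tag == "exists":
        _, a, sub = ac
        return ("exists", restr[a], restrict(sub, restr))
    if tag in ("and", "or"):
        return (tag, [restrict(s, restr) for s in ac[1]])
    raise ValueError(f"not a positive condition: {tag}")

acPA = ("and", [("exists", "a1", ("true",)),
                ("or", [("exists", "a2", ("true",)), ("true",)])])
restr = {"a1": "b1", "a2": "b2"}  # hypothetical morphism restrictions
print(restrict(acPA, restr))
```

Negation is deliberately absent from the datatype, which is exactly why the main result applies to positive constraints only.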
Theorem 4.10 (Amalgamation of Solutions for Positive Nested Conditions). Given pushout (1) as in Def. 4.8 with all morphisms in M, an amalgamation of typed objects gA = gB +gD gC, and an amalgamation of positive nested conditions acPA = acPB +acPD acPC with corresponding matches pA = pB +pD pC.
Composition. Given solutions QB for pB ⊨ acPB and QC for pC ⊨ acPC agreeing in a solution QD for pD ⊨ acPD, then there is a solution QA for pA ⊨ acPA constructed as amalgamation QA = QB +QD QC.
Decomposition. Given a solution QA for pA ⊨ acPA, then there are solutions QB, QC and QD for pB ⊨ acPB, pC ⊨ acPC and pD ⊨ acPD, respectively, which are constructed as restrictions QB, QC and QD of QA such that QA = QB +QD QC.
The amalgamated composition and decomposition constructions are unique up to isomorphism.

Proof. Composition. We perform an induction over the structure of acPA.
• acPA = true. Then also acPB, acPC, acPD are true and we have empty solutions QA, QB, QC and QD. Since the restriction of an empty solution is empty, we have that QB and QC are restrictions of QA.
• acPA = ∃(a, acCA) with a : PA → CA. By Fact 4.5 (Composition), we have the following diagram, where all rectangles are pushouts, all trapezoids are pullbacks, and all horizontal and vertical morphisms are in M.
(Diagram with pullbacks (2)–(5) and (2′)–(5′) relating the P's, C's and type graphs over pushout (1) — omitted.)
Now, we consider solutions QB = (qB, QCB), QC = (qC, QCC) and QD = (qD, QCD) for pB ⊨ acPB, pC ⊨ acPC and pD ⊨ acPD, respectively, such that QD is a restriction of QB and QC. Then we also have that qD is a restriction of qB and qC, and thus gBA ∘ qB ∘ cDB = gBA ∘ gDB ∘ qD = gCA ∘ gDC ∘ qD = gCA ∘ qC ∘ cDC. Together with the pushout over the C's, this implies a unique morphism qA : CA → GA with qA ∘ cBA = gBA ∘ qB and qA ∘ cCA = gCA ∘ qC.

(Diagram combining the pushouts of the P's, C's, G's and type graphs with the induced morphism qA — omitted.)
Moreover, we have qA ∘ a ∘ pBA = qA ∘ cBA ∘ b = gBA ∘ qB ∘ b = gBA ∘ pB = pA ∘ pBA and analogously qA ∘ a ∘ pCA = pA ∘ pCA. Since pBA and pCA are jointly epimorphic, this implies that qA ∘ a = pA. In order to show that qA ∈ M, we consider the following diagram on the left.

(Diagram: on the left, the composed pullbacks (6)–(9) over the G's and C's; on the right, the induced comparison with qA : CA → GA — omitted.)
We have that (6) is a pushout with all morphisms in M and thus also a pullback. Diagrams (7) and (8) are pullbacks by restriction, and (9) is a pullback because qD ∈ M is a monomorphism. Hence, by composition of pullbacks, we obtain that the complete diagram is a pullback along the M-morphisms gBA ∘ qB and gCA ∘ qC, which means that the pushout of the C's is effective (see Def. 2.3), implying that qA ∈ M.
It remains to show that qB and qC are restrictions of qA. In the following diagram, we have that (10) and (11) are pullbacks by restrictions, the C's and the G's form pushouts (see Rem. 4.9) and all morphisms in (10)–(13) are in M. So, the horizontal as well as the vertical VK property implies that also (12) and (13) are pullbacks, which means that qB and qC are restrictions of qA.

(Diagram with pullbacks (10)–(13) over the pushouts of the C's, G's and type graphs — omitted.)
Finally, QD being a restriction of QB and QC means that QCD is a restriction of QCB and QCC. By induction hypothesis, this implies a solution QCA of qA ⊨ acCA such that QCB and QCC are restrictions of QCA. Hence, QA = (qA, QCA) is a solution for pA ⊨ acPA such that QB and QC are restrictions of QA.
• acPA = ⋀i∈I acPA,i. We have acPB = ⋀i∈I acPB,i, acPC = ⋀i∈I acPC,i and acPD = ⋀i∈I acPD,i such that for all i ∈ I, acPD,i is a restriction of acPB,i and acPC,i. Moreover, given solutions QB, QC and QD of pB ⊨ acPB, pC ⊨ acPC and pD ⊨ acPD, respectively, we have QB = (QB,i)i∈I, QC = (QC,i)i∈I and QD = (QD,i)i∈I such that for all i ∈ I we have that QB,i, QC,i and QD,i are solutions for pB ⊨ acPB,i, pC ⊨ acPC,i and pD ⊨ acPD,i, respectively, and QD,i is a restriction of QB,i and QC,i. Then, by induction hypothesis, there are solutions QA,i for pA ⊨ acPA,i for all i ∈ I such that QB,i and QC,i are restrictions of QA,i. Hence, QA = (QA,i)i∈I is the required solution for pA ⊨ acPA.
• acPA = ⋁i∈I acPA,i. We have acPB = ⋁i∈I acPB,i, acPC = ⋁i∈I acPC,i and acPD = ⋁i∈I acPD,i such that for all i ∈ I, acPD,i is a restriction of acPB,i and acPC,i. Moreover, given solutions QB, QC and QD of pB ⊨ acPB, pC ⊨ acPC and pD ⊨ acPD, respectively, we have QB = (QB,i)i∈I, QC = (QC,i)i∈I and QD = (QD,i)i∈I such that for some jB, jC, jD ∈ I we have that QB,jB, QC,jC and QD,jD are solutions for pB ⊨ acPB,jB, pC ⊨ acPC,jC and pD ⊨ acPD,jD, respectively, and for all kB, kC, kD ∈ I with kB ≠ jB, kC ≠ jC and kD ≠ jD we have that QB,kB, QC,kC and QD,kD are empty. Furthermore, QD,i is a restriction of QB,i and QC,i for all i ∈ I.
Case 1. QD,jD = ∅. Then we have QD,j = ∅ for all j ∈ I. According to Def. 3.3, only the restriction of an empty solution is empty, implying that we also have QB,j = QC,j = ∅ for all j ∈ I. Moreover, since QD,jD is a solution for pD ⊨ acPD,jD, we can conclude that acPD,jD = true, and by the fact that acPD,jD is a restriction of acPA,jD, acPB,jD and acPC,jD it follows that also acPA,jD = true, acPB,jD = true and acPC,jD = true. So, as shown above, there is a solution QA,jD = ∅ for pA ⊨ acPA,jD. Hence, QA = (QA,i)i∈I with QA,i = ∅ for all i ∈ I is a solution for pA ⊨ acPA such that QB and QC are restrictions of QA.
Case 2. QD,jD ≠ ∅. Then according to Def. 3.3, there are also QB,jD ≠ ∅ and QC,jD ≠ ∅, which means that jB = jC = jD. So, by induction hypothesis, there is a solution QA,jD for pA ⊨ acPA,jD such that QB,jD and QC,jD are restrictions of QA,jD. Hence, QA = (QA,i)i∈I with QA,k = ∅ for all k ∈ I with k ≠ jD is a solution for pA ⊨ acPA, and we have that QB and QC are restrictions of QA.
In the first case (acPA = true), the uniqueness of the amalgamation follows from the fact that an empty solution can only be the restriction of another empty solution. In the second case (acPA = ∃(a, acCA)), the uniqueness of QA = (qA, QCA) follows from the uniqueness of qA by the universal pushout property, and from the uniqueness of QCA by induction hypothesis. Finally, in the cases of conjunction and disjunction, the uniqueness of the solution follows from the uniqueness of its components by induction hypothesis.
Decomposition. Again, we perform an induction over the structure of acPA.
• acPA = true. Then we also have that acPB, acPC and acPD are true. Moreover, we have that QA is empty, leading to empty restrictions QB, QC and QD that are solutions for pB ⊨ acPB, pC ⊨ acPC and pD ⊨ acPD, respectively.
• acPA = ∃(a, acCA) with a : PA → CA. Then we have acPB = ∃(b, acCB), acPC = ∃(c, acCC) and acPD = ∃(d, acCD). By the amalgamation gA = gB +gD gC, we have pullbacks (2)–(5) below.
Moreover, by the restrictions acPB, acPC and acPD of acPA, we have restrictions b, c and d of a, implying pullbacks (6)–(9) below. According to Rem. 4.6, we have an amalgamation of positive nested conditions acCA = acCB +acCD acCC, which implies an amalgamation of typed objects tCA = tCB +tCD tCC by Def. 4.4.

(Diagram with pullbacks (2)–(13) relating the P's, C's, G's and type graphs over pushout (1) — omitted.)
Now, given a solution QA = (qA, QCA) for pA ⊨ acPA, there is qA : CA → GA ∈ M with qA ∘ a = pA. Furthermore, we have gA ∘ qA ∘ cBA = tCA ∘ cBA = tgBA ∘ tCB, which implies a unique morphism qB : CB → GB by pullback (2) such that gB ∘ qB = tCB and gBA ∘ qB = qA ∘ cBA. Due to the amalgamation tCA = tCB +tCD tCC, we have that tCB is a restriction of tCA and thus (10)+(2) is a pullback. So, together with pullback (2), we obtain that also (10) is a pullback by pullback decomposition and, thus, qB is a restriction of qA. Moreover, by qA, tgBA ∈ M and closure of M under pullbacks, we know that qB, gBA ∈ M. Hence, by gBA ∘ pB = pA ∘ pBA = qA ∘ a ∘ pBA = qA ∘ cBA ∘ b = gBA ∘ qB ∘ b, we obtain pB = qB ∘ b because gBA ∈ M is a monomorphism. Analogously, due to pullback (3) and the restriction tCC of tCA, there is a unique restriction qC : CC → GC ∈ M of qA with pullback (11) such that pC = qC ∘ c, and due to pullback (4) and the restriction tCD of tCB, there is a unique restriction qD : CD → GD ∈ M of qB with pullback (12) such that pD = qD ∘ d. Then, since tCD is a restriction of tCC, (5)+(13) is a pullback, which implies that also (13) is a pullback by pullback decomposition and pullback (5). Thus, qD is also a restriction of qC, which means that we have qA = qB +qD qC. So, by induction hypothesis, there are solutions QCB for qB ⊨ acCB, QCC for qC ⊨ acCC, and QCD for qD ⊨ acCD such that QCA = QCB +QCD QCC. Hence, for QB = (qB, QCB), QC = (qC, QCC) and QD = (qD, QCD) we obtain that QA = QB +QD QC.
• acPA = ⋀i∈I acPA,i. Then we also have acPB = ⋀i∈I acPB,i, acPC = ⋀i∈I acPC,i, and acPD = ⋀i∈I acPD,i. Now, given a solution QA = (QA,i)i∈I for pA ⊨ acPA, then QA,i is a solution for pA ⊨ acPA,i for all i ∈ I. Thus, by induction hypothesis, for all i ∈ I there are solutions QB,i for pB ⊨ acPB,i, QC,i for pC ⊨ acPC,i, and QD,i for pD ⊨ acPD,i such that QA,i = QB,i +QD,i QC,i.
This in turn means that for all i ∈ I, QB,i and QC,i are restrictions of QA,i, and QD,i is a restriction of QB,i and QC,i. Hence, for QB = (QB,i)i∈I, QC = (QC,i)i∈I and QD = (QD,i)i∈I we have that QB and QC are restrictions of QA, and QD is a restriction of QB and QC, implying QA = QB +QD QC.
• acPA = ⋁i∈I acPA,i. Then we also have acPB = ⋁i∈I acPB,i, acPC = ⋁i∈I acPC,i, and acPD = ⋁i∈I acPD,i. Given a solution QA = (QA,i)i∈I for pA ⊨ acPA, then there is j ∈ I such that QA,j is a solution for pA ⊨ acPA,j, and for all k ∈ I with k ≠ j there is QA,k = ∅. By induction hypothesis, there are solutions QB,j for pB ⊨ acPB,j, QC,j for pC ⊨ acPC,j, and QD,j for pD ⊨ acPD,j such that QA,j = QB,j +QD,j QC,j. Hence, for QB = (QB,i)i∈I, QC = (QC,i)i∈I and QD = (QD,i)i∈I, where for all k ∈ I with k ≠ j there is QB,k = QC,k = QD,k = ∅, we have that QA = QB +QD QC.
The uniqueness of the solutions follows from the uniqueness of restrictions by pullback constructions.
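At the heart of the composition direction lies the repeated step of merging two component morphisms that agree on the shared part into a unique amalgamated morphism. For mappings on finite sets this is simply the union of two compatible dictionaries, as in the following illustrative sketch (assuming CD is the intersection of the domains; all names are hypothetical):

```python
def amalgamate_maps(qB, qC, shared):
    """Union of two mappings agreeing on the shared domain: qA = qB +_qD qC."""
    # The agreement condition is exactly "QB and QC agree in QD".
    assert all(qB[x] == qC[x] for x in shared), "qB and qC must agree in qD"
    qA = dict(qB)
    qA.update(qC)
    return qA

qB = {1: "u", 2: "v"}          # solution component over CB
qC = {2: "v", 3: "w"}          # solution component over CC
qD_domain = {2}                # CD = CB ∩ CC, the shared part
qA = amalgamate_maps(qB, qC, qD_domain)
print(qA)  # {1: 'u', 2: 'v', 3: 'w'}
```

Uniqueness in this toy setting corresponds to the universal property of the pushout used in the proof: any mapping extending both qB and qC on the union of their domains must coincide with qA.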