Towards Understanding Content Markup Language Design

L. Kohout, Florida State University
A. Strotmann, Universität zu Köln

Every sentence I utter must be understood not as an affirmation but as a question. (N. Bohr)

11th OpenMath Workshop, Florida State University, November 1998 L.Kohout,A.Strotmann: Towards Understanding Content Markup Language Design, 1 of 19


1. Introduction
  • Content Markup Languages
  • Towards a Fuller Perspective
  • A Practical Approach
2. The Compositionality Principle
  • Compositionality: a “linguistic” design principle
  • Compositionality in Content Markup
  • Consequences of Compositionality
3. Compositionality in Mathematical Systems
  • Compositional Treatment of Integral Expressions
  • Compositionality in Existing Systems
  • A Taste of Things to Come...
4. Distributed Computations in Knowledge Networks
  • Extending OpenMath with Extra-Mathematical Concepts
  • Compositionality Revisited
  • Effective Computing with Distributed Knowledge
  • A Manufacturing Application
  • Categorial Type Logics and BK-Relational Products

Based on the paper L.J. Kohout, A. Strotmann: “Understanding and Improving Content Markup for the Web: from the Perspectives of Formal Linguistics, Algebraic Logics, and Cognitive Science.” presented at: ISIC/CIRA/ISAS ‘98 Joint Conference on the Science and Technology of Intelligent Systems (IS’98), Gaithersburg, MD, 1998.


Content Markup Languages
• content (“meaning”) rather than form (presentation)
• many projects
  • CA systems (Maple, Mathematica, and many more)
  • Computational Logic, Knowledge Representation...
  • interchange languages (MathML, OpenMath, CML, Camino Real, ASAP, MultiProtocol, KIF/KQML, ...)
• some with narrow focus...
  • syntax only (“meaning in the eye of the beholder”: CR, ASAP)
  • formulas restricted to first order logic (KIF)
  • mathematical formulas only (“Four Colors Suffice” is not one)
• ... or with blurry focus
  • “mathematics... whatever that means”


Towards a Fuller Perspective
The Goal: Designing Good Content Markup Languages.
Sub-Goal: Understanding Content Markup Languages.
• a thorough problem analysis is overdue
• consider the full communications model: morphology, syntax, semantics, pragmatics and beyond
• take an external and multi-disciplinary perspective: mathematical logic, formal linguistics (syntax, semantics, and beyond), semiotics, cognitive science

“Caution! Work in Progress! Hardhats Required!”


A Practical Approach
• concentrate on design of Content Markup languages for mathematics
  • basis for all other scientific fields
  • non-trivial sub-task likely to unearth all major pitfalls
  • yet simpler: meaning of “meaning” fairly well explored
  • empirical studies: symbolic computation systems exist
• explore the parallels to a linguistic communications model
  • “steal,” analyse, implement, and test design principles discovered in the study of the only high-quality system for exchanging “meaning” known to exist: human language
  • crucial if OpenMath is to become a basis for knowledge exchange between intelligent agents beyond mathematics
Abbott, v. Leeuwen, Strotmann: OpenMath: Communicating Mathematical Information in a Knowledge Network. In: J. of Intelligent Systems, 1998. (formerly known as “OpenMath Objectives”)


The Compositionality Principle: a “linguistic” design principle
A formal semantics for a language is said to be compositional iff “the meaning of a compound expression is a function of the meaning of its parts and the syntactic rule by which the parts are combined.”
B. Partee et al., quoted in Janssen: “Compositionality,” in: v. Benthem, ter Meulen (eds.): “Handbook of Logic and Language.” (1997)
• powerful principle in linguistics: look for compositional formal semantics of natural language, and study ramifications
• in CS and engineering, compositionality valued for improved
  • scalability
  • extensibility
  • correctness


Compositionality in Content Markup
• evaluation of “meaning” requires semantic interpretation procedure in every application
• compositionality means specifying a complete “skeleton” semantic interpretation
  • definition of “part” of a compound syntactic expression
  • small list of “syntactic rules” available for forming compounds
  • interpretation of combinations given interpretation of “parts”
• no interpretation rules for specific lexical entries
  • strict separation of “syntax” and “lexicon”
  • “skeleton semantics” corresponds to “OM expression layer”
  • firmly grounded scaffolding for lexical extension mechanism
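A minimal Python sketch of what such a “skeleton” interpretation could look like. This is an illustration under stated assumptions, not the OpenMath expression layer itself: the class names Sym, Var, App and the function interpret are hypothetical, and function application is the only syntactic rule modelled. The interpreter contains no rule for any specific lexical entry; all symbol meanings come from a separate lexicon.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Sym:
    """A lexical entry; its meaning lives in the lexicon, not in the interpreter."""
    name: str


@dataclass(frozen=True)
class Var:
    """A variable; its value comes from the environment."""
    name: str


@dataclass(frozen=True)
class App:
    """The one syntactic rule in this sketch: apply head to args."""
    head: object
    args: Tuple[object, ...]


def interpret(expr, lexicon, env):
    """Meaning of a compound = function of the meanings of its parts and the rule used."""
    if isinstance(expr, (int, float)):
        return expr
    if isinstance(expr, Sym):
        return lexicon[expr.name]
    if isinstance(expr, Var):
        return env[expr.name]
    if isinstance(expr, App):
        function = interpret(expr.head, lexicon, env)
        arguments = [interpret(arg, lexicon, env) for arg in expr.args]
        return function(*arguments)
    raise TypeError(f"not an expression: {expr!r}")


# Hypothetical lexicon: extending it never requires touching interpret().
LEXICON = {"plus": lambda a, b: a + b, "times": lambda a, b: a * b}

expression = App(Sym("plus"), (2, App(Sym("times"), (3, Var("y")))))
value = interpret(expression, LEXICON, {"y": 4})

# Lexical extension: a new symbol is added without changing the skeleton.
extended = dict(LEXICON, minus=lambda a, b: a - b)
difference = interpret(App(Sym("minus"), (value, 4)), extended, {})
```

The design point is the one made on the slide: the interpreter only knows about the kinds of parts and the rule for combining them, so new symbols are purely a matter of the lexicon.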


Consequences
• any construct requiring “special” treatment in semantic interpretations should be treated as “special syntax”
  • examples: variables, binding, co-reference, empty categories
• if a whole class of lexical items appears to require special treatment, provide a common “special syntax”
  • examples: generalized quantifiers and operators: binding
• “parts” of a compound expression that are crucial for interpretation should be syntactically “part” of the expression
  • counter-example: root_of(_z^2 - 1) with _z deeply embedded
• to add superficially uncompositional syntactic sugar, define a (separate!) macro facility to map it to a compositional syntax
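The macro idea in the last bullet can be sketched in Python as a separate rewriting pass. Everything here is an assumption made for illustration: expressions are nested tuples of the form (head, arg, ...), names starting with an underscore count as placeholders, and the tag "lambda" stands for an explicit binding construct. The pass rewrites the surface form root_of(_z^2 - 1) so that the bound variable becomes a syntactic part of the expression.

```python
def free_placeholders(expr):
    """Collect placeholder names (strings starting with '_') not already bound."""
    if isinstance(expr, str):
        return {expr} if expr.startswith("_") else set()
    if isinstance(expr, tuple):
        if expr[0] == "lambda":
            # ("lambda", name, body): name is bound inside body
            _, name, body = expr
            return free_placeholders(body) - {name}
        found = set()
        for part in expr[1:]:  # expr[0] is the operator name
            found |= free_placeholders(part)
        return found
    return set()


def desugar(expr):
    """Map the sugared root_of form to an explicit binding; leave the rest alone."""
    if not isinstance(expr, tuple):
        return expr
    head, *args = expr
    args = [desugar(arg) for arg in args]
    if head == "root_of" and len(args) == 1:
        names = sorted(free_placeholders(args[0]))
        if len(names) == 1:  # sketch: only the single-placeholder case is handled
            return ("root_of", ("lambda", names[0], args[0]))
    return (head, *args)


surface = ("root_of", ("minus", ("power", "_z", 2), 1))
core = desugar(surface)
```

After the pass, core has the shape root_of(lambda _z. _z^2 - 1), and applying desugar again leaves it unchanged, so the sugar stays outside the compositional core.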


Compositional Treatment of Integral Expressions
Consider $\int_0^x \sin x \, dx$
• “parts”: integral sign, bounds, sin x, dx
• groupings (OpenMath)
  • bounds form an interval [0,x]
  • the x in dx denotes a bound variable, and sin x dx the function ( x → sin x )
  • integral operator takes integration domain and unary function as arguments
  • sin and x form a group joined by function application
• special compositions: application, binding (abstraction)
• symbol categories: constant, variable, function/operator
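The grouping above can be tried out in a small Python sketch. It is an illustration only: the tuple tags "sym", "var", "app", "bind" and the symbol names defint, interval and sin are hypothetical labels, and the integral is evaluated numerically with a composite Simpson rule chosen for convenience. The sketch shows the two special compositions, application and binding, and that the x in the upper bound and the x bound by dx are different variables with different scopes.

```python
import math


def simpson(f, a, b, n=1000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3


# Hypothetical lexicon: the integral operator takes a domain and a unary function.
LEXICON = {
    "sin": math.sin,
    "interval": lambda lo, hi: (lo, hi),
    "defint": lambda domain, f: simpson(f, domain[0], domain[1]),
}


def interpret(expr, env):
    if isinstance(expr, (int, float)):
        return expr
    tag = expr[0]
    if tag == "sym":   # constant or function/operator symbol
        return LEXICON[expr[1]]
    if tag == "var":   # variable, looked up in the current scope
        return env[expr[1]]
    if tag == "app":   # application: head applied to interpreted arguments
        function = interpret(expr[1], env)
        return function(*(interpret(arg, env) for arg in expr[2:]))
    if tag == "bind":  # binding (abstraction): yields a unary function
        _, name, body = expr
        return lambda v: interpret(body, {**env, name: v})
    raise ValueError(f"unknown construct: {tag!r}")


integral = (
    "app", ("sym", "defint"),
    ("app", ("sym", "interval"), 0, ("var", "x")),            # bounds [0, x]: free x
    ("bind", "x", ("app", ("sym", "sin"), ("var", "x"))),     # x -> sin x: bound x
)

value = interpret(integral, {"x": math.pi})  # integral of sin from 0 to pi
```

With the outer x set to pi, the result is close to 2, the exact value of the integral of sin over [0, pi].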


Advanced Compositional Treatment of Integration
• Consider the equation x = ∫ dx
• Note how the variable x bound in the integral is “exported” in apparent violation of the notion of binding
• solution: “x” is the implicit topic of this equation
  • “we’re moving in a space of functions in (the variable) x”
  • the name of the variable is imported from the context into the integral expression
  • the two identical names of variables denote the common notion of “functions in a common space” while being different variables with different scopes in a semantic interpretation
• “topicalization” is an important concept in linguistics


Advanced Compositional Treatment of Integration II

∫ x  x dx =  ( x, ) → ( x → x )



• mathematical notation is incredibly concise
  • information is often left unspecified (implicit)
  • mathematicians hate writing down what can be “easily inferred”
• linguistic notions of co-reference and traces may help solve this riddle: mathematical notation is made by people for communication with people


Compositionality in Existing Systems
• traditional Computer Algebra systems’ user languages tend to fail to treat variable binding in a compositional manner
• OpenMath improves constantly wrt. compositionality
  • 1.0 factors out lambda binding, but treats lambda as an ordinary symbol
  • new version introduces OMBIND syntactic construct and employs it consistently; some CDs still need to catch up
• MathML has also improved with each new version, but needs much more work
  • no clear distinction between syntactic and lexical constructs
  • BVAR element used for bound variables but unfortunately not employed consistently


A Taste of Things to Come...
A few more suggestions for improving Content Markup (and perhaps Presentation Markup) that are expected to come out of a linguistics perspective on the problem:
• morphosyntactic component (notprsubset and subset are closely related concepts; better: not-proper-subset-relation)
• structural labeling/referencing syntax and usage restrictions
• very general “invisible” presentation markup (for marking up “Four Colors Suffice”) and “default value” content markup
• positional required arguments and named optional arguments
• syntax for distinguishing semantic (“what I mean”) and pragmatic (“what I want”) information
• categorial type semantics (i.e. expression layer) as a scaffolding for a concrete type system
• syntax for topic and focus and for facilitating category shifts


Extending OpenMath with Extra-Mathematical Concepts
• practical, real-world application problems require
  • a mathematical core as an integral component
  • non-mathematical kinds of information and knowledge
• therefore, we propose
  • to develop a scheme for knowledge networking
  • based on OpenMath structures
  • and extending them by incorporating linguistic and non-mathematical symbolic representations and communication
• unification of proposed extensions within a sound mathematical core via Activity Structures and Relational Calculi (Bandler-Kohout-relational products)


Compositionality Revisited
• current distributed knowledge networking systems (e.g. KIF)
  • assume a completely sharable knowledge base
  • crucially depend on first-order semantics
• a truly distributed knowledge network, however, consists of
  • a conglomerate of only partially sharable knowledge bases
  • used in a highly parallel fashion
  • requiring extensions to first-order semantics
• compositionality and distributed knowledge networks
  • Tarski semantics of First-Order Logic require compositionality, which requires parallelism, invariance, determinacy (Hintikka)
  • first-order semantics are crucial locally within a network
  • this makes compositionality extremely important in practice


Effective Computing with Distributed Knowledge
• managing knowledge effectively is a major practical concern
• locating, retrieving, transforming knowledge facilitated by
  • structuring knowledge in an organized manner
  • clustering and “chunking” information
  • linking contexts in linguistic representations
  • granularity, decomposition, and composition of information
• locality for “crisp” and “fuzzy” relations
  • adequate definition of locality by Bandler, Kohout
  • effective computational testing of locality available
  • effective comparison of partial relational structures
  • these methods can handle locality of symbolic structures


A Manufacturing Application
• need to make business decisions on products yet to be designed or manufactured
  • typical high-technology industry problem
  • scarcity of information, lack of historical precedent
• need affordability models applying to such problems
  • take into account human factors (technological concepts, psychological and linguistic constructs)
  • capture management, financial, and organizational activities
• fuzzy relational techniques provide a framework
  • handle incomplete and conflicting information
  • integrate linguistic information with uncertainty
  • unify quantitative and qualitative (symbolic) knowledge processing plus communication and interaction of agents


Categorial Type Logics and BK-Relational Products
• categorial type logics (especially Lambek calculus)
  • promise unifying framework for mathematical and linguistic knowledge descriptions (content markup)
  • important mutual links exist between Lambek’s \ and / syntactic operators and BK-relational product semantics
• BK-Relational Products are promising as
  • both a theoretical and a computational framework
  • for both a semantic and a pragmatic interpretation
  • of content markup languages like OpenMath
  • extended with semiotic descriptors
L.J. Kohout: A Perspective on Intelligent Systems: A Framework for Analysis and Design. Chapman & Hall, London, 1990
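For readers unfamiliar with BK-relational products, here is a minimal Python sketch for crisp (non-fuzzy) relations. It assumes the plain set-inclusion formulation: with xR the set of elements related to x by R, and Sz the set of elements related by S to z, the circle product holds when xR and Sz intersect, the subproduct when xR ⊆ Sz, the superproduct when xR ⊇ Sz, and the square product when xR = Sz. Some formulations add a non-emptiness condition on xR, which is not modelled here. The function names and the sample data are hypothetical.

```python
def afterset(relation, x):
    """xR: all y with (x, y) in the relation."""
    return {y for (a, y) in relation if a == x}


def foreset(relation, z):
    """Sz: all y with (y, z) in the relation."""
    return {y for (y, c) in relation if c == z}


def bk_products(R, S, X, Z):
    """Return the circle, sub-, super- and square products of crisp R and S."""
    circle, sub, sup, square = set(), set(), set(), set()
    for x in X:
        for z in Z:
            xr, sz = afterset(R, x), foreset(S, z)
            if xr & sz:
                circle.add((x, z))   # some intermediate element links x and z
            if xr <= sz:
                sub.add((x, z))      # everything x relates to is related to z
            if xr >= sz:
                sup.add((x, z))      # everything related to z is reached from x
            if xr == sz:
                square.add((x, z))   # both inclusions hold
    return circle, sub, sup, square


# Hypothetical data: products p1, p2; features a, b, c; markets m1, m2.
R = {("p1", "a"), ("p1", "b"), ("p2", "c")}
S = {("a", "m1"), ("b", "m1"), ("c", "m1"), ("a", "m2")}
circle, sub, sup, square = bk_products(R, S, {"p1", "p2"}, {"m1", "m2"})
```

In the fuzzy setting the inclusions are replaced by a many-valued implication aggregated over the intermediate elements; that generalisation is left out of this sketch.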
