
L1 (Linear Assertion Notation)


During the development of GwTk we noticed that the process of constructing a topic map graph from markup (e.g. XTM) can be split into two phases: transforming the markup into a sequence of assertions between subjects and building the graph from that sequence.

The advantage of this approach is that it decouples the markup processor from the graph implementation. It also emphasizes that the interpretation of a certain type of markup (the processing model) is entirely independent of the concept of the topic map graph and its validity requirements.

In GwTk the connection between processor and topic map graph was made by way of callback functions: a portion of glue code registered callbacks with the processor and passed the received assertions on to the graph-building module. Goose takes this a step further and introduces an intermediate representation of the topic map information. L1 is the notation for this intermediate representation.

L1 is a linear notation: each line of an L1 representation contains exactly one assertion, as described in the ISO Draft Reference Model for Topic Maps.

So, what is an assertion?

An assertion can be pictured in ASCII-art like this:

                      P
                      |
                      |
              R1      |      R2
              |       |      |
              |       |      |
      x1 ---- C1 ---- A ---- C2 ---- x2

The assertion itself (the subject that represents the relationship) is expressed as the A-node. The P-node represents the assertion pattern (the subject that expresses the type of relationship). For each membership there is an RCx subgraph that expresses that a certain player (x) plays a certain role (R) in the assertion. The 'fact that x plays R in A' is itself a subject that can be talked about, hence it is represented as a node, too, called the casting node (C-node).

A line in L1 notation is the linear representation of such an assertion and the above example would look like this:

    P A ( R1 C1 x1 )( R2 C2 x2 ) 

To identify which subjects the nodes refer to, the subject indicators of the nodes are used. In most cases, the markup contains so-called node demanders (elements that demand the existence of a node in the resulting graph) for the various portions of the assertion, and the addresses of those elements become the subject indicators in the L1 notation. Sometimes no node demanders are present for A and C nodes, and they may therefore be omitted from the L1 line. Thus

    P ( R1 x1 )( R2 x2 )
is also a valid L1 version of an assertion.
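
As an illustration of the two forms above, here is a minimal Python sketch that writes an assertion out as one L1 line and leaves out the A and C nodes when they are absent. The names `Assertion`, `Member` and `to_l1` are hypothetical and not part of Goose or GwTk; role players are passed in as ready-made strings.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Member:
    role: str                       # R: address identifying the role
    player: str                     # x: the role player, already in L1 form
    casting: Optional[str] = None   # C: casting node, None if no node demander


@dataclass
class Assertion:
    pattern: str                    # P: the assertion pattern
    members: List[Member] = field(default_factory=list)
    anode: Optional[str] = None     # A: assertion node, None if no node demander


def to_l1(assertion: Assertion) -> str:
    """Serialize one assertion as a single L1 line."""
    head = [assertion.pattern]
    if assertion.anode is not None:
        head.append(assertion.anode)
    groups = []
    for m in assertion.members:
        parts = [m.role]
        if m.casting is not None:
            parts.append(m.casting)
        parts.append(m.player)
        groups.append("( " + " ".join(parts) + " )")
    # Member groups follow each other without separating whitespace.
    return " ".join(head) + " " + "".join(groups)


full = Assertion("P", [Member("R1", "x1", "C1"), Member("R2", "x2", "C2")], anode="A")
short = Assertion("P", [Member("R1", "x1"), Member("R2", "x2")])
print(to_l1(full))
print(to_l1(short))
```

Keeping the A and C nodes as optional fields mirrors the notation: an omitted node simply leaves no token in the line.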

P, A and R subjects may never be addressable subjects, so they can be unambiguously identified by the plain address (e.g. URI) of any of their subject indicators. This is not so for the role players, which may be either non-addressable subjects (indicated by a resource) or addressable subjects (constituted by a resource). To distinguish between the two, the addresses are surrounded by angle brackets in the former case and square brackets in the latter:

    < >
refers to the subject indicated by the resource (presumably the W3C) and
    [ ]
refers to the particular resource itself (the particular document).
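
The bracket convention can be sketched as a small helper. This is only an illustration: the function name `format_player` and the address `http://example.org/doc` are made up for the example.

```python
def format_player(address: str, addressable: bool) -> str:
    """Wrap a resource address in the L1 brackets for a role player.

    addressable=False: the subject is indicated by the resource -> < address >
    addressable=True:  the subject is the resource itself       -> [ address ]
    """
    if addressable:
        return "[ " + address + " ]"
    return "< " + address + " >"


# Hypothetical address, used only for illustration.
print(format_player("http://example.org/doc", addressable=False))
print(format_player("http://example.org/doc", addressable=True))
```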

BNF for L1

Here is the BNF for L1:

  assertion     ::= patternLoc ws (anodeLoc ws)? member*
  member        ::= '(' ws roleLoc ws (cnodeLoc ws)?  (sir | scr) ws ')' 
  sir           ::= '<' ws resource ws '>'
  scr           ::= '[' ws resource ws ']'
  resource      ::= loc (ws '"' data '"')?
  patternLoc    ::= loc
  anodeLoc      ::= loc
  roleLoc       ::= loc
  cnodeLoc      ::= loc
  loc           ::= character+ /* except ws */
  data          ::= character* /* " must be escaped as \" */
  ws            ::= space 
  space         ::= #x20 /* US-ASCII space - decimal 32 */
  character     ::= [#x20-#x7E] /* US-ASCII space to tilde - decimal 32 to 126 */
  tab           ::= #x9 /* US-ASCII horizontal tab - decimal 9 */
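
To show how the grammar might be applied, here is a rough Python sketch of a parser for a single L1 line. It is not the Goose implementation, and all names (`tokenize`, `parse_l1`, the dictionary keys) are hypothetical. It follows the BNF in requiring brackets around every role player, but is more lenient than the BNF in that it accepts any run of whitespace between tokens.

```python
import re

# A quoted data string with \" escapes, or a run of non-space characters.
_QUOTED = re.compile(r'"(?:\\.|[^"\\])*"')
_TOKEN = re.compile(_QUOTED.pattern + r'|\S+')
_CLOSE = {"<": ">", "[": "]"}


def tokenize(line):
    tokens = []
    for tok in _TOKEN.findall(line):
        # Members may follow each other without whitespace: ")(".
        tokens.extend([")", "("] if tok == ")(" else [tok])
    return tokens


def parse_l1(line):
    """Parse one L1 line into a dict; raise ValueError on malformed input."""
    toks = tokenize(line)
    pos = 0

    def peek():
        return toks[pos] if pos < len(toks) else None

    def take(expected=None):
        nonlocal pos
        if pos >= len(toks):
            raise ValueError("unexpected end of line")
        tok = toks[pos]
        if expected is not None and tok != expected:
            raise ValueError(f"expected {expected!r}, got {tok!r}")
        pos += 1
        return tok

    result = {"pattern": take(), "anode": None, "members": []}
    if peek() not in (None, "("):          # optional A-node
        result["anode"] = take()

    while peek() is not None:
        take("(")
        member = {"role": take(), "casting": None, "data": None}
        if peek() not in _CLOSE:           # optional C-node before the player
            member["casting"] = take()
        opener = take()
        if opener not in _CLOSE:
            raise ValueError(f"expected '<' or '[', got {opener!r}")
        member["addressable"] = opener == "["
        member["address"] = take()
        nxt = peek()
        if nxt is not None and _QUOTED.fullmatch(nxt):   # optional data
            member["data"] = take()[1:-1].replace('\\"', '"')
        take(_CLOSE[opener])
        take(")")
        result["members"].append(member)
    return result


print(parse_l1("P A ( R1 C1 < x1 > )( R2 C2 [ x2 ] )"))
```

Because a `loc` may contain any non-space character, the sketch decides between an optional node and the following token only by looking at whether that token is `(`, `<` or `[`.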

Copyright © 2001, 2002, 2003 by Jan Algermissen and eTopicality, Inc.