STMQL

This is the file included in the Goose distribution:

The design of STMQL is based on an association-centric view of the topic map
paradigm. This view brings topic maps very close to the entity-relationship
model, so it is no coincidence that STMQL has an SQL-like look and feel.

To understand STMQL, you basically have to understand how assertion patterns
in the topic map world correspond to tables in the relational world.

A short note on the terminology I use in this text:

The 'Draft Reference Model for ISO Topic Maps' (RM) [1] uses the term
'assertion' to express strongly typed relationships between subjects and the
term 'assertion pattern' for the subjects that express these 'strong types'.
This terminology avoids the confusion that sometimes arises with the use of
the term 'association' (the relationship or the association element) and with
the use of the term 'association type' (where it is not clear if the nature of
a particular relationship between some subjects is meant or merely a class
that the association is an instance of [2]).

The subjects that participate in the assertions are represented as nodes in
a graph model by the RM and letters will be used to represent subjects
in the examples. x1,x2,x3,.. will be used to represent arbitrary subjects,
R1,R2,R3,... will be used for subjects that are roles and P for those
that are patterns. The letter C will be used for the so-called 'casting nodes'
that represent 'the fact that a particular subject plays a particular role
in a particular assertion' and -finally- the letter A will be used to
represent the assertions themselves.

Let's not worry here about how subjects are related to locators (e.g. URIs) or
data content; this is handled further down.

Here is what an assertion with two members looks like in ASCII art:

                      P1
                       |
           R1          |         R2
            |          |          |
            |          |          |
x1----------C----------A----------C----------x2


Read as: "x1 plays R1 in A" and "x2 plays R2 in A" and "A is patterned by P1".

In a topic map graph there will be many assertions that have the same pattern,
for example the 'class-instance' pattern. If we group all these assertions
together and line up the players of corresponding roles in columns, we get
a relational table representation of all the assertions, with each assertion
represented as a single row:

assertion pattern: P1
roles: R1, R2

+-----+-----+-----+
|     |  R1 |  R2 |
+-----+-----+-----+
|  A  |  x1 |  x2 |
+-----+-----+-----+
 ...
+-----+-----+-----+
| An  | xn-1|  xn |
+-----+-----+-----+

Or, as an example:

assertion pattern: ap-class-instance

+-----+---------------+------------+
|     | role-instance | role-class |
+-----+---------------+------------+
| A1  | Jim           | Person     |
+-----+---------------+------------+
| A2  | Betty         | Cat        |
+-----+---------------+------------+
| A3  | Wanda         | Fish       |
+-----+---------------+------------+
etc..
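
The grouping described above can be sketched in a few lines of Python. This is
only an illustration of the idea, not how Goose stores its data: the tuple
layout (assertion id, pattern, role-to-player mapping) and the names
ASSERTIONS, tables_by_pattern and TABLES are assumptions made for this sketch.

```python
from collections import defaultdict

# Hypothetical in-memory form of assertions:
# (assertion id, pattern, {role: player}).
ASSERTIONS = [
    ("A1", "ap-class-instance", {"role-instance": "Jim", "role-class": "Person"}),
    ("A2", "ap-class-instance", {"role-instance": "Betty", "role-class": "Cat"}),
    ("A3", "ap-class-instance", {"role-instance": "Wanda", "role-class": "Fish"}),
]


def tables_by_pattern(assertions):
    """Group assertions by pattern; each assertion becomes one row keyed by role."""
    tables = defaultdict(list)
    for assertion_id, pattern, players in assertions:
        row = {"assertion": assertion_id, **players}
        tables[pattern].append(row)
    return dict(tables)


TABLES = tables_by_pattern(ASSERTIONS)
for row in TABLES["ap-class-instance"]:
    print(row["assertion"], row["role-instance"], row["role-class"])
```

Each pattern ends up with its own list of rows, and the role names act as the
column headings of the table.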

Now, what STMQL basically does is to select certain parts of each assertion in
such a table that matches a given filter expression (WHERE clause).

If one is interested in all the classes that Jim is an instance of, the query
would be something like: "Give me the PLAYER OF role-class in all assertions
FROM pattern ap-class-instance WHERE the PLAYER OF role-instance is Jim."

In general:

SELECT PLAYER OF R1
  FROM P
 WHERE PLAYER OF R2 IS x2

The result of an STMQL query is a set of tuples of subjects, where the
order of the elements in each tuple corresponds to the selection expression
just following SELECT.

In the class-instance example, the result would be a set of one tuple with
one element (Person).
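
As a rough illustration of these semantics (not the Goose implementation), the
following Python sketch evaluates the general query form against the
class-instance table shown earlier. The function select_players and the
dictionary layout are hypothetical and exist only for this example. Because
the query selects the player of role-class, the tuple it yields contains the
class.

```python
# Rows of the ap-class-instance example table, keyed by role (assumed layout).
TABLE = {
    "ap-class-instance": [
        {"role-instance": "Jim", "role-class": "Person"},
        {"role-instance": "Betty", "role-class": "Cat"},
        {"role-instance": "Wanda", "role-class": "Fish"},
    ]
}


def select_players(tables, select_roles, pattern, where_role, where_player):
    """SELECT PLAYER OF select_roles FROM pattern
    WHERE PLAYER OF where_role IS where_player."""
    result = set()
    for row in tables.get(pattern, []):
        if row.get(where_role) == where_player:
            # One tuple per matching assertion, ordered like the SELECT list.
            result.add(tuple(row[role] for role in select_roles))
    return result


RESULT = select_players(
    TABLE, ["role-class"], "ap-class-instance", "role-instance", "Jim"
)
print(RESULT)  # {('Person',)}
```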


Identifying Subjects
====================

In order to express WHICH subjects we refer to in STMQL queries, several
referencing mechanisms are needed. STMQL provides the following:

- by locator of a subject indicating resource:

      http://www.topicmaps.org/xtm/1.0/core.xtm#class

  refers to the subject that represents the role-class in our example

- by locator of a subject constituting resource:

      [ http://www.w3.org/index.html ]

  represents the homepage of the W3C (the particular document).

- by macro:

  STMQL includes a macro mechanism to simplify its use. Macros start and end
  with a period. In the future it will be possible to define new macros at
  runtime by issuing a CREATE MACRO command. The list of built-in macros is
  in the file goose/macros.tab. A macro maps a name to a locator.

  The predefined macros are:
  ap-topic-subjectIndicator    DRM#ap-topic-subjectIndicator
  aptsi                        DRM#ap-topic-subjectIndicator
  topic                        DRM#role-topic
  subjectIndicator             DRM#role-subjectIndicator
  si                           DRM#role-subjectIndicator
  class                        SAM#role-class
  instance                     SAM#role-instance
  class-instance               SAM#ap-class-instance
  topic-occurrence             SAM#ap-topic-occurrence
  occurrence                   SAM#role-occurrence
  superclass-subclass          SAM#ap-superclass-subclass
  superclass                   SAM#role-superclass
  subclass                     SAM#role-subclass
  topic-basename               SAM#ap-topic-basename
  basename                     SAM#role-basename
  pattern-role-rpc             SAM#ap-pattern-role-rpc
  pattern                      SAM#role-pattern
  role                         SAM#role-role
  basename-variantname         SAM#ap-basename-variantname
  variantname                  SAM#role-variantname
  assertion-scopecomponentset  SAM#ap-assertion-scopecomponentset
  set-setmember                SAM#ap-set-setmember
  assertion                    SAM#role-assertion
  scopecomponentset            SAM#role-scopecomponentset
  setmember                    SAM#role-set-setmember
  set                          SAM#role-set

  DRM and SAM refer to the Draft Reference Model and Standard Application
  Model base URIs that are not yet officially defined.

- by node ID:

  You can retrieve plain node IDs from queries and also feed them back into
  queries. This is useful in situations where you know that certain query
  results are only used to build subsequent queries. The use of node IDs
  avoids the lookup (e.g. locator -> node) in the server.

  Beware that node IDs are not stable references to the subjects; they
  might change if a map is re-imported.

There are also expressions related to the data content of resources; they
are discussed further down.
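
To make the macro mechanism concrete, here is a minimal Python sketch of macro
expansion. It is an illustration only: the real macro table lives in
goose/macros.tab and its file format is not shown in this text, so the
dictionary MACROS, the token pattern MACRO_TOKEN and the function
expand_macros are assumptions. The DRM and SAM prefixes are kept symbolic
because their base URIs are not yet defined.

```python
import re

# A few entries copied from the predefined macro list.
MACROS = {
    "class": "SAM#role-class",
    "instance": "SAM#role-instance",
    "class-instance": "SAM#ap-class-instance",
    "topic": "DRM#role-topic",
}

# Assumed token shape: a period, a name made of letters and hyphens, a period.
# The lookarounds keep periods inside URLs from being read as macro delimiters.
MACRO_TOKEN = re.compile(r"(?<![\w/])\.([A-Za-z][A-Za-z-]*)\.(?![\w/])")


def expand_macros(query, macros=MACROS):
    """Replace each known .name. token with its locator; leave others alone."""
    def substitute(match):
        return macros.get(match.group(1), match.group(0))
    return MACRO_TOKEN.sub(substitute, query)


EXPANDED = expand_macros("SELECT PLAYER OF .class. FROM .class-instance.")
print(EXPANDED)  # SELECT PLAYER OF SAM#role-class FROM SAM#ap-class-instance
```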


Specifying the Subject Representation to be Returned
====================================================

Within the selection statement of the SELECT clause you need to specify
what representation of the retrieved subjects should be returned.

Here is the general form of an STMQL query again:

SELECT PLAYER OF role1, PLAYER OF role2,...
  FROM assertion-pattern
 WHERE PLAYER OF role1 IS x

And an (incomplete) example that selects all classes from a topic map:

SELECT PLAYER OF .class.            ( note the use of macros in this
  FROM .class-instance.               example )

The example is incomplete because we need to say what representation of the
classes is to be returned. Suppose you just want to display one of the
basenames of the classes; you'd then use the following query:

SELECT PLAYER OF .class. AS BASENAME
  FROM .class-instance.

This query will return a list of tuples with one field each. This field
will contain one of the basenames of each returned class or the empty
string if no basename has been found.

Note that you can use the DISTINCT keyword to avoid duplicates (which
are likely to occur especially in this kind of query):

SELECT DISTINCT PLAYER OF .class. AS BASENAME
  FROM .class-instance.
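
The effect of AS BASENAME and DISTINCT can be sketched as follows. This is a
hedged illustration, not Goose code: the node IDs, the BASENAMES mapping and
both function names are made up, and picking the first base name is just one
possible reading of "one of the basenames".

```python
# Hypothetical data: players of the class role, given as node IDs, and the
# base names known for each node. Node 103 has no base name.
CLASS_PLAYERS = [101, 102, 101, 103]
BASENAMES = {101: ["Person", "Human"], 102: ["Cat"]}


def as_basename(node, basenames=BASENAMES):
    """Return one base name of the node, or the empty string if there is none."""
    names = basenames.get(node, [])
    return names[0] if names else ""


def select_basenames(players, distinct=False):
    """Build one single-field tuple per player, optionally removing duplicates."""
    rows = [(as_basename(node),) for node in players]
    if distinct:
        # dict.fromkeys drops duplicates while keeping first-seen order.
        rows = list(dict.fromkeys(rows))
    return rows


print(select_basenames(CLASS_PLAYERS))
print(select_basenames(CLASS_PLAYERS, distinct=True))
```

Without DISTINCT the class played by two instances appears twice; with
DISTINCT each returned tuple appears once.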

Here is a list of all supported return 'type' specifiers:

DATA                return the data content of the subject constituting
                    resource of a node
LOCATOR             return one of the addresses of the subject constituting
                    resource of a node
INDICATOR           return one of the addresses of any of the subject
                    indicating resources of a node
INDICATORDATA       return the data content of any of the subject
                    indicating resources of a node
BASENAME            return any of the base names that are associated with 
                    a node



Expressing Sets of Subjects
===========================

Within the WHERE clause, single subjects can be used together with the
IS keyword (WHERE .topic. IS http://www.some.org/map.xtm#t1 ) or sets
can be used together with the IN keyword. The simple form of a set is
a comma-separated list of subjects enclosed in curly braces:

    WHERE .topic. IN { 665, 764, 887 } 

The other set-denoting expressions currently supported are regular
expressions:

[ /regex/ ]     refers to the set of subjects that have subject constituting
                resources whose data content matches regex
/regex/         refers to the set of subjects that have subject indicating
                resources whose data content matches regex
-/regex/-       refers to the set of subjects that have basenames that
                match regex

Note: regex matching is currently limited to simple containment of
      the regex string in the matching data, so that /forest/ would
      match "forest" and "deforestation". All strings are converted to
      lower case before comparison; wildcards etc. are not supported.
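
The matching rule in the note amounts to a case-insensitive substring test.
A small Python sketch, with hypothetical function names and sample data:

```python
def stmql_match(pattern, data):
    """Case-insensitive containment test; the pattern is a literal string."""
    return pattern.lower() in data.lower()


def matching_subjects(pattern, data_by_subject):
    """Return the set of subjects whose data contains the pattern."""
    return {
        subject
        for subject, data in data_by_subject.items()
        if stmql_match(pattern, data)
    }


# Hypothetical base names keyed by subject.
NAMES = {"t1": "Forest", "t2": "Deforestation", "t3": "Desert"}
print(sorted(matching_subjects("forest", NAMES)))  # ['t1', 't2']
```

Because the pattern is taken literally, characters such as "." or "*" have no
special meaning in this sketch.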





Current Limitations
===================

- no nested selects
- no support for scope yet, possibly this will be done with
  an additional WITHIN  clause.
- no SELECT COUNT yet
- no GROUP BY and ORDER BY yet
- there is only one condition possible in the WHERE clause, this 
  will be extended to SQL-like WHERE clauses such as
  WHERE PLAYER OF role1 IS x1 AND PLAYER OF role2 IS x2 ...
- no cursor operations yet
- there should be a traversal-oriented syntax included in STMQL that
  supports the idea of starting at a particular node or set of nodes and
  returning all nodes that can be reached by traversing only those arcs
  (and nodes) that match a certain path pattern. Such a traversal syntax
  will, for example, enable the retrieval of all assertions where a node is
  a player (which is currently unsupported)
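
To show what the last item is asking for, here is a Python sketch of the
"all assertions where a node is a player" lookup over a hypothetical
in-memory list of assertions. It says nothing about how a future traversal
syntax would look or how Goose would implement it; the data layout and the
function name are assumptions.

```python
# Hypothetical assertions: (assertion id, pattern, {role: player}).
ASSERTIONS = [
    ("A1", "ap-class-instance", {"role-instance": "Jim", "role-class": "Person"}),
    ("A2", "ap-topic-basename", {"role-topic": "Jim", "role-basename": "n1"}),
    ("A3", "ap-class-instance", {"role-instance": "Betty", "role-class": "Cat"}),
]


def assertions_with_player(assertions, node):
    """Return (assertion id, pattern, role) for every role the node plays."""
    hits = []
    for assertion_id, pattern, players in assertions:
        for role, player in players.items():
            if player == node:
                hits.append((assertion_id, pattern, role))
    return hits


HITS = assertions_with_player(ASSERTIONS, "Jim")
for hit in HITS:
    print(hit)
```

The point of the sketch is that the lookup cuts across patterns, which is why
a single FROM pattern query cannot express it.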


The remainder of this text attempts to show that all 'known'
requirements for a topic map query language can be met by STMQL.


TMQL Requirements [4]
=====================

3.6.1. Queries returning topics
-------------------------------

1.Find all topics with specific names whose scopes match a specific scope. 

  SELECT .topic. AS NODE, .basename. AS INDICATORDATA 
    FROM .topic-basename. 
   WHERE .basename. IN /fragment/
  WITHIN {  }

2.Find all topics with specific resources as occurrences whose scopes match a
  specific scope. 
  
  SELECT .topic. AS NODE, .occurrence. AS LOCATOR, .occurrence. AS DATA 
    FROM .topic-occurrence. 
   WHERE .occurrence. IN { [ http://resource1 ], [ http://resource2 ],...}
  WITHIN {  }

  (note that this query will return both occurrences coming from 
   elements as well as those coming from  elements. You can
   distinguish the two by checking for emptiness of the DATA column.)

3.Find all topics playing one of a set of roles in an association of one of a
  set of types whose scopes match a specific scope. 

  [TBD]

  
4.Find all topics playing one of a set of roles in an association of one of a
  set of types, where one of a set of topics plays one of a set of roles,
  whose scopes match a specific scope. 

  [TBD] 

5.Find the topic that has a specific resource as one of its subject indicators. 

  SELECT .topic. AS BASENAME 
    FROM DRM#ap-topic-subjectIndicator
   WHERE PLAYER OF DRM#role-subjectIndicator IS [ resource-uri ]


6.Find the topic that has a specific resource as its subject address. 

  This is obsolete because we can address the topic if we know the
  subject constituting resource. This is done with [ yourLocator ]


7.Find all topics that play one of a set of roles in instances of one of a set
  of association types. 

  SELECT PLAYER OF ????
    FROM ???
  

 
3.6.2. Queries returning associations
-------------------------------------

1.Find all associations whose scopes match a specific scope. 

  SELECT PLAYER OF SAM#role-assertion AS NODE
    FROM SAM#ap-assertion-scope
   WHERE PLAYER OF SAM#role-scope IS  

  (note: if the scope is not given as a subject but as a set of themes, you
   first need another query to retrieve the scope)

2.Find all associations that are instances of a specific type. 

  I assume type here means pattern in the Reference Model world so this
  query comes down to:

  SELECT THIS AS NODE
    FROM 


3.Find all associations where one of a set of topics play any role, and whose
  scopes match a specific scope. 

  [TBD]

4.Find all associations where one of a set of topics play one of a set of
  roles, and whose scopes match a specific scope. 

  [TBD] 

3.6.3. Queries returning any topic map object
----------------------------------------------

1.Find the object that has a specific resource as its source locator. 

  [TBD]

2.Find all objects that are direct instances of a specific type. 

  SELECT .instance. AS NODE
    FROM .class-instance.
   WHERE .class. IS subject 

3.Find all objects that are instances of a specific type or any of its
  subtypes. 

  [TBD]

3.6.4. Queries returning various types of objects
-------------------------------------------------

1.Find all the names of the topics in a particular set of topics, whose scopes
  match a particular scope. 

  SELECT .basename. AS INDICATORDATA
    FROM .topic-basename.
   WHERE .topic. IN { subject, subject, subject }
   ORDER BY .topic.    (note that ORDER BY is still unsupported)

2.Find all the occurrences of the topics in a particular set of topics, whose
  scopes match a particular scope. 
  
  SELECT .occurrence. AS LOCATOR, .occurrence. AS DATA
    FROM .topic-occurrence.
   WHERE .topic. IN { subject, subject, subject }

3.Find all the occurrences of any of a particular set of types of the topics
  in a particular set of topics, whose scopes match a particular scope. 

  [TBD]

4.Find all the resources that are subject indicators of the topics in a
  particular set of topics. 

  SELECT .subjectIndicator.
    FROM .ap-topic-subjectIndicator.
   WHERE .topic. IN { subject, subject, ... }

5.Find the resources that are the addressable subjects of the topics in a
  particular set of topics. 

  [TBD]

*** END TMQL REQUIREMENTS ***



 *****************************************************************************
  
   THE FOLLOWING QUERIES ARE EARLY EXPERIMENTS. THEY ARE NOT SUPPORTED BY
   THE CURRENT STMQL SYNTAX.

 *****************************************************************************
    



'A draft statement of requirements for a comprehensive topic
map query language.' [3]
=============================================================

  Proposers: Michel Biezunski and Steven R. Newcomb.

   Note:  As  detailed  below  within  square brackets [], the results of
   certain  query  types  can  become  parameters  to  certain subsequent
   queries.

   Note:  "Scope  matching  expressions"  are discussed at the end of the
   list.

Three Kinds of Queries

  (1) Queries that return "hit lists" where each "hit" is a topic:

     * (a) What topics have "[list of names or query returning a hit list
       of names]" as a name within "[scope matching expression]"?

       SELECT PLAYERS OF ROLE {'#role-topic'}
        WHERE PLAYERS OF ROLE {'#role-basename'} SCRDATAMATCH {'name','name'}
          AND TEMPLATE = '#at-topic-basename'
          AND SCOPE MATCHES 

       SELECT PLAYERS OF ROLE {'#role-topic'}
        WHERE PLAYERS OF ROLE {'#role-basename'} SCRMATCH
              SELECT PLAYERS OF ROLE '#role-basename' AS STRINGS
               WHERE TEMPLATE = '#at-topic-basename'
          AND TEMPLATE = '#at-topic-basename'
          AND SCOPE MATCHES 

     * (b)  What  topics  have "[list of occurrences or query returning a
       hit list of occurrences]" as an occurrence within "[scope matching
       expression]"?

       SELECT PLAYERS OF ROLE {'#role-topic'}
        WHERE TEMPLATE = '#at-topic-occurrence'
          AND PLAYERS OF ROLE '#role-occurrence' SCRMATCH 
          AND SCOPE MATCHES 

     * (c) What topics play the role "[list of association role topics or
       query  returning a hit list of association role topics]" in "[list
       of  association  type  topics  or  query  returning  a hit list of
       association  type  topics]"  associations  within "[scope matching
       expression]"?

       SELECT PLAYERS OF ROLES {'uri','uri'...}
        WHERE TEMPLATE IN {'uri','uri'...}
          AND SCOPE MATCHES 

     * (d) What topics play the role "[list of association role topics or
       query  returning a hit list of association role topics]" in "[list
       of  association  type  topics  or  query  returning  a hit list of
       association type topics]" associations wherein "[list of topics or
       query returning a hit list of topics]" plays "[list of association
       role  topics  or  query  returning  a hit list of association role
       topics]"   within   "[scope  matching  expression]"?  (This  is  a
       refinement of (c), above.)

       SELECT PLAYERS OF ROLES {'uri','uri'...}
        WHERE TEMPLATE IN {'uri','uri'...}
          AND PLAYER OF ROLE  IN 
          AND SCOPE MATCHES 
       
     * (e)  What topics are members of the set of topics that constitutes
       the  scope  within which "[list of topics or query returning a hit
       list of topics]" has the name "[list of names or query returning a
       hit  list  of  names]"? (Returns a hit list of topics which is the
       union of the sets of topics that are the selected scopes.)

       SELECT SCOPE
        WHERE PLAYER OF ROLE '#role-topic' = uri
          AND PLAYER OF ROLE '#role-basename' = 
          AND TEMPLATE = '#at-topic-basename'

       SELECT SCOPE COMPONENTS
        WHERE PLAYER OF ROLE '#role-topic' = uri
          AND PLAYER OF ROLE '#role-basename' = 
          AND TEMPLATE = '#at-topic-basename'

       NOTE: what is the meaning of SCOPE or SCOPES or SCOPE(S) COMPONENTS ?

     * (f)  What topics are members of the set of topics that constitutes
       the  scope  within which "[list of topics or query returning a hit
       list of topics]" has the occurrence "[list of occurrences or query
       returning  a  hit  list  of  occurrences]"? (Returns a hit list of
       topics  which  is  the  union  of  the sets of topics that are the
       selected scopes.)

       See above.

     * (g)  What topics are members of the set of topics that constitutes
       the  scope  within which "[list of topics or query returning a hit
       list  of topics]" plays the role "[list of association role topics
       or  query  returning  a  hit  list of association role topics]" in
       "[list of association type topics or query returning a hit list of
       association  type  topics]"  associations  within "[scope matching
       expression]"?  (Returns a hit list of topics which is the union of
       the sets of topics that are the selected scopes.)
     * (h)  What topics are members of the set of topics that constitutes
       the  scope  within which "[list of topics or query returning a hit
       list  of topics]" plays the role "[list of association role topics
       or  query  returning  a  hit  list of association role topics]" in
       "[list of association type topics or query returning a hit list of
       association type topics]" associations wherein "[list of topics or
       query  returning  a  hit  list of topics]" plays (other) "[list of
       association   role  topics  or  query  returning  a  hit  list  of
       association  role  topics]"  within "[scope matching expression]"?
       (Returns  a  hit  list of topics which is the union of the sets of
       topics that are the selected scopes.)
     * (i)   What  topic(s)  has/have  "[list  of  occurrences  or  query
       returning  a  hit  list  of  occurrences]"  as  its/their  subject
       indicators?

       This is lookup by SIR

     * (j)   What  topic(s)  has/have  "[list  of  occurrences  or  query
       returning  a  hit  list  of  occurrences]"  as  its/their  subject
       constituters?

       This is lookup by SCR

     * (k) What topics are association type topics?

       (What is the meaning here: templates, or types, e.g. occurrence types?)

       Templates:
       1) SELECT PLAYERS OF ROLE '#role-template'
           WHERE TEMPLATE = '#at-template-role-rpc'
       
       2) SELECT TEMPLATE
           WHERE *

       Types:
       SELECT PLAYERS OF ROLE '#role-class'
        WHERE TEMPLATE = '#at-class-instance'
          AND


     * (l)   What   topics   are  the  association  types  of  "[list  of
       associations or query returning a hit list of associations]"?
     * (m) What topics are association role types?
     * (n)   What   topics  are  association  role  types  in  "[list  of
       association   type  topics  or  query  returning  a  hit  list  of
       association type topics]"?
     * (o)  What topics play the role "[list of association role topics]"
       in  association  "[list  of  associations or query returning a hit
       list of associations]"?
     * (p)  What topics play the role "[list of association role topics]"
       in  association  type  "[list  of association type topics or query
       returning a hit list of association type topics]"?

  (2) Miscellaneous queries. These return "hit lists" where each "hit" is a
  topic name, a topic occurrence, or a topic's "subject indicator" or "subject
  constituter":

     * (a)  What  are  the names of "[list of topics or query returning a
       hit   list  of  topics]"  within  "[scope  matching  expression]".
       (Returns a hit list wherein each hit is a name of a topic, and the
       topic of which it is a name.)
     * (b)  What  are  the  occurrences  of  "[list  of  topics  or query
       returning   a   hit  list  of  topics]"  within  "[scope  matching
       expression]".   (Returns  a  hit  list  wherein  each  hit  is  an
       occurrence, and the topic of which it is an occurrence.)
     * (c)  What are the subject indicators of "[topic or query returning
       a  hit list containing exactly one topic]"? (Returns a hit list in
       which each hit is a subject indicator.)
     * (d)  What is the subject constituter of "[topic or query returning
       a  hit  list  containing  exactly one topic]"? (Returns a hit list
       which  is  either  empty or contains exactly one hit, which is the
       subject constituter.)

  (3) Queries that return "hit lists" where each "hit" is an association
  between topics:

     * (a) What associations exist within "[scope matching expression]".

       SELECT ASSOCIATION
        WHERE SCOPE MATCHES 

     * (b) In which associations does "[list of topics or query returning
       a  hit  list  of  topics]"  play  a  role  within "[scope matching
       expression]".  (Returns  a  hit list of associations and the roles
       played in each.)

       SELECT ASSOCIATION, ROLE  ????
        WHERE PLAYERS OF ANYROLE IN 

     * (c) In which associations does "[list of topics or query returning
       a  hit  list  of topics]" play the role "[list of association role
       topics  or  query  returning  a  list of association role topics]"
       within  "[scope  matching  expression]"?  (Returns  a  hit list of
       associations.)

       SELECT ASSOCIATION
        WHERE PLAYERS OF ROLES  IN 

     * (d)  Which  associations  are  of  type "[list of association type
       topics or query returning a hit list of association type topics]"?
       (Returns a hit list of associations.)

       SELECT ASSOCIATION
        WHERE TEMPLATE IN 

Scope Matching Expressions

   The  "scope  matching  expression"  feature of many of the above query
   types  is  an  extremely  important  aspect  of  topic map queries. As
   explained  above,  a  scope  is  a set of topics used to establish the
   valid  context(s) within which a topic name, a topic occurrence, or an
   association   with  one  or  more  other  topics,  has  such  a  name,
   occurrence,  or  role in an association. (Remember: topics do not have
   scope;  rather,  their  characteristics -- names, occurrences, and the
   roles they play in associations -- have scope.) Scopes are simply sets
   of  topics,  but  using  them  in  powerful  queries  may involve some
   complexity.  Like  any  other  term  in  a  query  expression, a scope
   matching  expression  is  used  to  suppress the reporting of unwanted
   "hits". Scope matching expressions may consist of any of the following
   kinds  of  selections,  arbitrarily grouped (parenthesized) to control
   the  order of operations, joined by logical AND and OR connectors, and
   optionally negated:
     * (1) Match any scope.
           
           ANYSCOPE

     * (2)  Match  any  scope in which any "[integer]" or more of "[topic
       list or query returning a hit list of topics]" appear.
     * (3)  Match  any scope in which any "[integer]" or fewer of "[topic
       list or query returning a hit list of topics]" appear.
     * (4)  Match  any scope in which exactly "[integer]" of "[topic list
       or query returning a hit list of topics]" appear.
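
The four selection kinds above reduce to counting how many of the listed
topics appear in a scope. The following Python sketch expresses them as
predicates over a scope given as a set of topics. It is an illustration
only: the function names and the sample themes are assumptions, and nothing
here is part of the STMQL syntax.

```python
def any_scope(scope):
    """(1) Match any scope."""
    return True


def at_least(n, topics):
    """(2) Match scopes in which n or more of the given topics appear."""
    wanted = frozenset(topics)
    return lambda scope: len(wanted & set(scope)) >= n


def at_most(n, topics):
    """(3) Match scopes in which n or fewer of the given topics appear."""
    wanted = frozenset(topics)
    return lambda scope: len(wanted & set(scope)) <= n


def exactly(n, topics):
    """(4) Match scopes in which exactly n of the given topics appear."""
    wanted = frozenset(topics)
    return lambda scope: len(wanted & set(scope)) == n


# Hypothetical scope and topic list.
SCOPE = {"english", "biology", "informal"}
THEMES = ["english", "german", "biology"]

print(at_least(2, THEMES)(SCOPE))  # True
print(at_most(1, THEMES)(SCOPE))   # False
print(exactly(2, THEMES)(SCOPE))   # True
```

Since each selection is an ordinary predicate, grouping, AND, OR and negation
can be built from Python's own boolean operators, for example
`at_least(1, THEMES)(SCOPE) and not exactly(3, THEMES)(SCOPE)`.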







REFERENCES AND NOTES
====================
 
[1] 'Draft Reference Model for ISO 13250 Topic Maps', Michel Biezunski and
    Steven R. Newcomb.
    URL: http://www.y12.doe.gov/sgml/sc34/document/0298.htm
 
[2] Think about occurrences: an occurrence is an assertion between a topic
    and a particular resource. The nature (pattern) of the assertion is the
    'topic-occurrence relationship', but occurrences can also be instances of
    a certain class.

[3] 'A draft statement of requirements for a comprehensive topic map query
    language.', Michel Biezunski and Steven R. Newcomb
    URL: http://www.topicmaps.net/query-rq.htm

[4] 'TMQL requirements (1.0.0)', The TMQL Working Group
    URL: http://groups.yahoo.com/group/tmql-wg/files/official-docs/tmqlreqs.html

Copyright © 2001, 2002, 2003 by Jan Algermissen and eTopicality, Inc.