STMQL
This is the file included in the Goose distribution:
The design of STMQL is based on an association-centric view of the topic maps
paradigm, which in fact brings topic maps very close to the entity-relationship
model; it is no coincidence that STMQL has an SQL-like look and feel.
To understand STMQL, you basically have to understand how assertion
patterns in the topic map world correspond to tables in the relational world.
A short note on the terminology I use in this text:
The 'Draft Reference Model for ISO Topic Maps' (RM) [1] uses the term
'assertion' to express strongly typed relationships between subjects and the
term 'assertion pattern' for the subjects that express these 'strong types'.
This terminology avoids the confusion that sometimes arises with the use of
the term 'association' (the relationship or the association element) and with
the use of the term 'association type' (where it is not clear if the nature of
a particular relationship between some subjects is meant or merely a class
that the association is an instance of [2]).
The subjects that participate in the assertions are represented as nodes in
a graph model by the RM and letters will be used to represent subjects
in the examples. x1,x2,x3,.. will be used to represent arbitrary subjects,
R1,R2,R3,... will be used for subjects that are roles and P for those
that are patterns. The letter C will be used for the so-called 'casting nodes'
that represent 'the fact that a particular subject plays a particular role
in a particular assertion' and -finally- the letter A will be used to
represent the assertions themselves.
Let's not worry here about how subjects are related to locators (e.g. URIs) or
data content; this will be handled further down.
Here is what an assertion with two members looks like in ASCII art:
                       P1
                       |
            R1         |          R2
            |          |          |
            |          |          |
x1----------C----------A----------C----------x2
Read as: "x1 plays R1 in A" and "x2 plays R2 in A" and "A is patterned by P1".
In a topic map graph there will be many assertions that have the same pattern,
for example the 'class-instance' pattern. If we group all these assertions
together and line up the players of corresponding roles in columns, we get
a relational table representation of all the assertions. Each assertion is
represented as a single row:
assertion pattern: P1
roles: R1, R2
+-----+-----+-----+
| | R1 | R2 |
+-----+-----+-----+
| A | x1 | x2 |
+-----+-----+-----+
...
+-----+-----+-----+
| An | xn-1| xn |
+-----+-----+-----+
Or, as an example:
assertion pattern: ap-class-instance
+-----+---------------+------------+
| | role-instance | role-class |
+-----+---------------+------------+
| A1 | Jim | Person |
+-----+---------------+------------+
| A2 | Betty | Cat |
+-----+---------------+------------+
| A3 | Wanda | Fish |
+-----+---------------+------------+
etc..
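The grouping step described above can be sketched in Python. This is only an
illustration: the in-memory form of the assertions (a list of tuples) and the
helper name tables_by_pattern are assumptions made for this example, not part
of Goose or STMQL.

```python
from collections import defaultdict

# Hypothetical in-memory form: (assertion id, pattern, {role: player}).
assertions = [
    ("A1", "ap-class-instance", {"role-instance": "Jim", "role-class": "Person"}),
    ("A2", "ap-class-instance", {"role-instance": "Betty", "role-class": "Cat"}),
    ("A3", "ap-class-instance", {"role-instance": "Wanda", "role-class": "Fish"}),
]


def tables_by_pattern(assertions):
    """Group assertions by pattern; each assertion becomes one row keyed by role."""
    tables = defaultdict(list)
    for assertion_id, pattern, players in assertions:
        row = {"assertion": assertion_id, **players}
        tables[pattern].append(row)
    return dict(tables)


tables = tables_by_pattern(assertions)
for row in tables["ap-class-instance"]:
    print(row["assertion"], row["role-instance"], row["role-class"])
```

Each key of the resulting dictionary plays the part of a table name, and the
role names play the part of column names.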
Now, what STMQL basically does is select certain parts of each assertion in
such a table that matches a given filter expression (WHERE clause).
If one is interested in all the classes that Jim is an instance of, the query
would be something like: "Give me the PLAYER OF role-class in all assertions
FROM pattern ap-class-instance WHERE the PLAYER OF role-instance is Jim."
In general:
SELECT PLAYER OF R1
FROM P
WHERE PLAYER OF R2 IS x2
The result of an STMQL query is a set of tuples of subjects, where the
order of the elements within each tuple corresponds to the selection
expression just following SELECT.
In the Jim example, the result would be a set of one tuple with one element
(Person).
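The evaluation model of the general query form can be sketched in Python. This
is a minimal illustration under assumptions: the table is a hypothetical list
of dicts keyed by role name, and select() is a made-up helper that mimics a
single-condition WHERE clause; it is not how the Goose server is implemented.

```python
def select(table, select_roles, where_role=None, where_player=None):
    """Return the set of tuples of players of select_roles (in SELECT order)
    for every row in which where_role is played by where_player."""
    result = set()
    for row in table:
        if where_role is None or row.get(where_role) == where_player:
            result.add(tuple(row[role] for role in select_roles))
    return result


# Rows of the ap-class-instance table from the example above.
class_instance = [
    {"role-instance": "Jim", "role-class": "Person"},
    {"role-instance": "Betty", "role-class": "Cat"},
    {"role-instance": "Wanda", "role-class": "Fish"},
]

# SELECT PLAYER OF role-class FROM ap-class-instance
# WHERE PLAYER OF role-instance IS Jim
classes_of_jim = select(class_instance, ["role-class"], "role-instance", "Jim")
print(classes_of_jim)
```

Using a set for the result mirrors the statement that a query returns a set of
tuples; each tuple has one element per selection expression.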
Identifying Subjects
====================
In order to express WHAT the subjects are that we refer to in STMQL queries,
several referencing mechanisms are needed. STMQL provides the following:
- by locator of a subject indicating resource:
http://www.topicmaps.org/xtm/1.0/core.xtm#class
refers to the subject that represents the role-class in our example
- by locator of a subject constituting resource:
[ http://www.w3.org/index.html ]
represents the homepage of the W3C (the particular document).
- by macro:
STMQL includes a macro mechanism to simplify its use. Macros start and end
with a period. It will be possible in the future to define new macros at
runtime by issuing a CREATE MACRO command. The list of built-in macros is
in the file goose/macros.tab . A macro maps a name to a locator.
The predefined macros are:
ap-topic-subjectIndicator     DRM#ap-topic-subjectIndicator
aptsi                         DRM#ap-topic-subjectIndicator
topic                         DRM#role-topic
subjectIndicator              DRM#role-subjectIndicator
si                            DRM#role-subjectIndicator
class                         SAM#role-class
instance                      SAM#role-instance
class-instance                SAM#ap-class-instance
topic-occurrence              SAM#ap-topic-occurrence
occurrence                    SAM#role-occurrence
superclass-subclass           SAM#ap-superclass-subclass
superclass                    SAM#role-superclass
subclass                      SAM#role-subclass
topic-basename                SAM#ap-topic-basename
basename                      SAM#role-basename
pattern-role-rpc              SAM#ap-pattern-role-rpc
pattern                       SAM#role-pattern
role                          SAM#role-role
basename-variantname          SAM#ap-basename-variantname
variantname                   SAM#role-variantname
assertion-scopecomponentset   SAM#ap-assertion-scopecomponentset
set-setmember                 SAM#ap-set-setmember
assertion                     SAM#role-assertion
scopecomponentset             SAM#role-scopecomponentset
setmember                     SAM#role-set-setmember
set                           SAM#role-set
DRM and SAM refer to the Draft Reference Model and Standard Application
Model base URIs that are not yet officially defined.
- by node ID
You can retrieve plain node IDs from queries and also feed them back into
queries. This is useful in situations where you know that certain query
results are only used to build subsequent queries. The use of node IDs
avoids the lookup (e.g. locator -> node) in the server.
Beware that node IDs are not stable references to the subjects; they
might change if a map is re-imported.
There are also expressions related to the data contents of resources, they
are discussed further down.
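To illustrate the macro mechanism, here is a small Python sketch of how a
client could expand macros before inspecting a query. It is an assumption-laden
illustration: the MACROS dictionary copies only a few entries from the list
above, the regular expression for recognising macros is a guess based on the
"start and end with a period" rule, and expand_macros() is a hypothetical
helper, not a Goose function.

```python
import re

# A few of the predefined macros; DRM and SAM stay symbolic because their
# base URIs are not yet officially defined.
MACROS = {
    "class": "SAM#role-class",
    "instance": "SAM#role-instance",
    "class-instance": "SAM#ap-class-instance",
    "si": "DRM#role-subjectIndicator",
}

# Assumed macro shape: a whitespace-delimited name that starts and ends
# with a period, e.g. ".class-instance."
MACRO_RE = re.compile(r"(?<!\S)\.([A-Za-z][A-Za-z-]*)\.(?!\S)")


def expand_macros(query, macros=MACROS):
    """Replace every .name. token with the locator it maps to."""
    def replace(match):
        name = match.group(1)
        if name not in macros:
            raise KeyError(f"unknown macro: {name}")
        return macros[name]
    return MACRO_RE.sub(replace, query)


expanded = expand_macros("SELECT PLAYER OF .class. FROM .class-instance.")
print(expanded)
```

Because the pattern requires whitespace (or the string boundary) on both sides,
periods inside locators such as http://www.some.org/map.xtm#t1 are left alone.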
Specifying the Subject Representation to be Returned
====================================================
Within the selection statement of the SELECT clause you need to specify
what representation of the retrieved subjects should be returned.
Here is the general form of an STMQL query again:
SELECT PLAYER OF role1, PLAYER OF role2,...
FROM assertion-pattern
WHERE PLAYER OF role1 IS x
And an (incomplete) example that selects all classes from a topic map:
SELECT PLAYER OF .class.        (note the use of macros in this
FROM .class-instance.            example)
The example is incomplete because we need to say what representation of the
classes is to be returned. Suppose you just want to display one of the
basenames of the classes; you'd then use the following query:
SELECT PLAYER OF .class. AS BASENAME
FROM .class-instance.
This query will return a list of tuples with one field each. This field
will contain one of the basenames of each returned class or the empty
string if no basename has been found.
Note that you can use the DISTINCT keyword to avoid duplicates (which
are likely to occur especially in this kind of query):
SELECT DISTINCT PLAYER OF .class. AS BASENAME
FROM .class-instance.
Here is a list of all supported return 'type' specifiers:
DATA            return the data content of the subject constituting
                resource of a node
LOCATOR         return one of the addresses of the subject constituting
                resource of a node
INDICATOR       return one of the addresses of any of the subject
                indicating resources of a node
INDICATORDATA   return the data content of any of the subject
                indicating resources of a node
BASENAME        return any of the base names that are associated with
                a node
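The effect of the return type specifiers and of DISTINCT can be sketched in
Python. Everything about the node record here is hypothetical (the keys
scr_data, scr_locator, indicators and basenames are invented for this
example). Returning the empty string for a missing value is documented above
only for BASENAME; applying it to the other specifiers is an assumption.

```python
def render(node, spec):
    """Return one representation of a node for a return type specifier."""
    if spec == "DATA":
        return node.get("scr_data", "")
    if spec == "LOCATOR":
        return node.get("scr_locator", "")
    if spec == "INDICATOR":
        indicators = node.get("indicators", [])  # list of (address, data)
        return indicators[0][0] if indicators else ""
    if spec == "INDICATORDATA":
        indicators = node.get("indicators", [])
        return indicators[0][1] if indicators else ""
    if spec == "BASENAME":
        names = node.get("basenames", [])
        return names[0] if names else ""
    raise ValueError(f"unsupported specifier: {spec}")


person = {"basenames": ["Person"]}
unnamed = {"basenames": []}

# The same class usually plays role-class in many assertions, so the
# undeduplicated result repeats it once per assertion.
players = [person, person, unnamed]

rows = [(render(node, "BASENAME"),) for node in players]
distinct_rows = set(rows)  # what SELECT DISTINCT would leave
print(rows)
print(distinct_rows)
```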
Expressing Sets of Subjects
===========================
Within the WHERE clause, single subjects can be used together with the
IS keyword (WHERE .topic. IS http://www.some.org/map.xtm#t1 ) or sets
can be used together with the IN keyword. The simplest form of a set is
a comma-separated list of subjects enclosed in curly braces:
WHERE .topic. IN { 665, 764, 887 }
The other set-denoting expressions that are currently supported are
regular expressions:
[ /regex/ ]   refers to the set of subjects that have subject constituting
              resources whose data content matches regex
/regex/       refers to the set of subjects that have subject indicating
              resources whose data content matches regex
-/regex/-     refers to the set of subjects that have basenames that
              match regex.
Note: regex matching is currently limited to simple containment of
the regex string in the matching data, so that /forest/ would
match "forest" and "deforestation". All strings are converted to
lower case before comparison; no wildcards etc. are supported.
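The matching rule in the note, together with the three regex forms, can be
sketched in Python as follows. The function names and the target labels
("constituting-data", "indicating-data", "basename") are invented for this
illustration; only the containment and lower-casing behaviour is taken from
the note above.

```python
def matches(pattern, data):
    """Containment test standing in for regex matching: both sides are
    lower-cased and the pattern is treated as a literal substring."""
    return pattern.lower() in data.lower()


def classify(expr):
    """Map a regex set expression to (what is matched, pattern text)."""
    expr = expr.strip()
    if expr.startswith("[") and expr.endswith("]"):
        inner = expr[1:-1].strip()
        if len(inner) >= 2 and inner.startswith("/") and inner.endswith("/"):
            return ("constituting-data", inner[1:-1])
    if len(expr) >= 4 and expr.startswith("-/") and expr.endswith("/-"):
        return ("basename", expr[2:-2])
    if len(expr) >= 2 and expr.startswith("/") and expr.endswith("/"):
        return ("indicating-data", expr[1:-1])
    raise ValueError(f"not a regex set expression: {expr}")


target, pattern = classify("-/forest/-")
print(target, matches(pattern, "Deforestation"))
```

Note that a pattern such as for.* only matches data that literally contains
the characters "for.*", since no wildcards are interpreted.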
Current Limitations
===================
- no nested selects
- no support for scope yet, possibly this will be done with
an additional WITHIN clause.
- no SELECT COUNT yet
- no GROUP BY and ORDER BY yet
- there is only one condition possible in the WHERE clause, this
will be extended to SQL-like WHERE clauses such as
WHERE PLAYER OF role1 IS x1 AND PLAYER OF role2 IS x2 ...
- no cursor operations yet
- there should be a traversal-oriented syntax included in STMQL that
supports the idea of starting at a particular node or set of nodes and
returning all nodes that can be reached by traversing only those arcs
(and nodes) that match a certain path pattern. Such a traversal syntax
will, for example, enable the retrieval of all assertions where a node is
a player (which is currently unsupported)
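The traversal idea in the last item can be illustrated with a small Python
sketch. STMQL has no traversal syntax, so this is purely a thought experiment:
the arc labels ("player", "assertion", "role", "pattern"), the triple
representation of the graph and the reachable() helper are all assumptions,
and an arc-label filter is used as the simplest possible stand-in for a path
pattern.

```python
from collections import deque


def reachable(arcs, start_nodes, arc_allowed):
    """Breadth-first traversal from start_nodes, following only arcs whose
    label satisfies arc_allowed. Arcs are (source, label, target) triples
    and are treated as undirected here."""
    neighbours = {}
    for source, label, target in arcs:
        if arc_allowed(label):
            neighbours.setdefault(source, set()).add(target)
            neighbours.setdefault(target, set()).add(source)
    seen = set(start_nodes)
    queue = deque(start_nodes)
    while queue:
        node = queue.popleft()
        for nxt in neighbours.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


# The two-member assertion from the ASCII art, with hypothetical arc labels.
arcs = [
    ("x1", "player", "C1"), ("C1", "assertion", "A"),
    ("x2", "player", "C2"), ("C2", "assertion", "A"),
    ("C1", "role", "R1"), ("C2", "role", "R2"),
    ("A", "pattern", "P1"),
]

nodes = reachable(arcs, ["x1"], lambda label: label in {"player", "assertion"})
print(sorted(nodes))
```

Starting at x1 and ignoring role and pattern arcs, the traversal reaches the
casting nodes, the assertion A and the other player x2, but not R1, R2 or P1.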
The remainder of this text is an attempt to show that all 'known'
requirements for a topic map query language can be met by STMQL.
TMQL Requirements [4]
=====================
3.6.1. Queries returning topics
-------------------------------
1.Find all topics with specific names whose scopes match a specific scope.
SELECT .topic. AS NODE, .basename. AS INDICATORDATA
FROM .topic-basename.
WHERE .basename. IN /fragment/
WITHIN { }
2.Find all topics with specific resources as occurrences whose scopes match a
specific scope.
SELECT .topic. AS NODE, .occurrence. AS LOCATOR, .occurrence. AS DATA
FROM .topic-occurrence.
WHERE .occurrence. IN { [ http://resource1 ], [ http://resource2 ],...}
WITHIN { }
(note that this query will return both occurrences coming from
<resourceRef> elements and those coming from <resourceData> elements. You
can distinguish the two by checking for emptiness of the DATA column.)
3.Find all topics playing one of a set of roles in an association of one of a
set of types whose scopes match a specific scope.
[TBD]
4.Find all topics playing one of a set of roles in an association of one of a
set of types, where one of a set of topics plays one of a set of roles,
whose scopes match a specific scope.
[TBD]
5.Find the topic that has a specific resource as one of its subject indicators.
SELECT .topic. AS BASENAME
FROM DRM#ap-topic-subjectIndicator
WHERE PLAYER OF DRM#role-subjectIndicator IS [ resource-uri ]
6.Find the topic that has a specific resource as its subject address.
This is obsolete because we can address the topic if we know the
subject constituting resource. This is done with [ yourLocator ]
7.Find all topics that play one of a set of roles in instances of one of a set
of association types.
SELECT PLAYER OF ????
FROM ???
3.6.2. Queries returning associations
-------------------------------------
1.Find all associations whose scopes match a specific scope.
SELECT PLAYER OF SAM#role-assertion AS NODE
FROM SAM#ap-assertion-scope
WHERE PLAYER OF SAM#role-scope IS
(note: if scope is not given as a subject but as a set of themes you
need another query before to retrieve the scope)
2.Find all associations that are instances of a specific type.
I assume type here means pattern in the Reference Model world so this
query comes down to:
SELECT THIS AS NODE
FROM
3.Find all associations where one of a set of topics play any role, and whose
scopes match a specific scope.
[TBD]
4.Find all associations where one of a set of topics play one of a set of
roles, and whose scopes match a specific scope.
[TBD]
3.6.3. Queries returning any topic map object
----------------------------------------------
1.Find the object that has a specific resource as its source locator.
[TBD]
2.Find all objects that are direct instances of a specific type.
SELECT .instance. AS NODE
FROM .class-instance.
WHERE .class. IS subject
3.Find all objects that are instances of a specific type or any of its
subtypes.
[TBD]
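Item 3 is still open. Since nested selects are not supported, one conceivable
workaround is to fetch the rows of the .superclass-subclass. and
.class-instance. patterns with plain queries and combine them on the client.
The Python sketch below shows that combination step only; the pair
representation of the rows, the helper names and the sample class "Animal"
are assumptions made for this illustration.

```python
def all_subclasses(superclass_subclass, root):
    """Return root plus every direct or indirect subclass of root."""
    found = {root}
    frontier = [root]
    while frontier:
        current = frontier.pop()
        for superclass, subclass in superclass_subclass:
            if superclass == current and subclass not in found:
                found.add(subclass)
                frontier.append(subclass)
    return found


def instances_of(class_instance, superclass_subclass, root):
    """Instances of root or of any of its subclasses."""
    classes = all_subclasses(superclass_subclass, root)
    return {instance for instance, cls in class_instance if cls in classes}


# (superclass, subclass) rows; hypothetical sample data.
superclass_subclass = [("Animal", "Cat"), ("Animal", "Fish")]
# (instance, class) rows, as in the ap-class-instance example.
class_instance = [("Jim", "Person"), ("Betty", "Cat"), ("Wanda", "Fish")]

animals = instances_of(class_instance, superclass_subclass, "Animal")
print(sorted(animals))
```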
3.6.4. Queries returning various types of objects
-------------------------------------------------
1.Find all the names of the topics in a particular set of topics, whose scopes
match a particular scope.
SELECT .basename. AS INDICATORDATA
FROM .topic-basename.
WHERE .topic. IN { subject, subject, subject }
ORDER BY .topic. (note that ORDER BY is still unsupported)
2.Find all the occurrences of the topics in a particular set of topics, whose
scopes match a particular scope.
SELECT .occurrence. AS LOCATOR, .occurrence. AS DATA
FROM .topic-occurrence.
WHERE .topic. IN { subject, subject, subject }
3.Find all the occurrences of any of a particular set of types of the topics
in a particular set of topics, whose scopes match a particular scope.
[TBD]
4.Find all the resources that are subject indicators of the topics in a
particular set of topics.
SELECT .subjectIndicator.
FROM .ap-topic-subjectIndicator.
WHERE .topic. IN { subject, subject, ... }
5.Find the resources that are the addressable subjects of the topics in a
particular set of topics.
[TBD]
*** END TMQL REQUIREMENTS ***
*****************************************************************************
THE FOLLOWING QUERIES ARE EARLY EXPERIMENTS; THEY ARE NOT SUPPORTED BY
THE CURRENT STMQL SYNTAX
*****************************************************************************
'A draft statement of requirements for a comprehensive topic
map query language.' [3]
=============================================================
Proposers: Michel Biezunski and Steven R. Newcomb.
Note: As detailed below within square brackets [], the results of
certain query types can become parameters to certain subsequent
queries.
Note: "Scope matching expressions" are discussed at the end of the
list.
Three Kinds of Queries
(1) Queries that return "hit lists" where each "hit" is a topic:
* (a) What topics have "[list of names or query returning a hit list
of names]" as a name within "[scope matching expression]"?
SELECT PLAYERS OF ROLE {'#role-topic'}
WHERE PLAYERS OF ROLE {'#role-basename'} SCRDATAMATCH {'name','name'}
AND TEMPLATE = '#at-topic-basename'
AND SCOPE MATCHES
SELECT PLAYERS OF ROLE {'#role-topic'}
WHERE PLAYERS OF ROLE {'#role-basename'} SCRMATCH
SELECT PLAYERS OF ROLE '#role-basename' AS STRINGS
WHERE TEMPLATE = '#at-topic-basename'
AND TEMPLATE = '#at-topic-basename'
AND SCOPE MATCHES
* (b) What topics have "[list of occurrences or query returning a
hit list of occurrences]" as an occurrence within "[scope matching
expression]"?
SELECT PLAYERS OF ROLE {'#role-topic'}
WHERE TEMPLATE = '#at-topic-occurrence'
AND PLAYERS OF ROLE '#role-occurrence' SCRMATCH
AND SCOPE MATCHES
* (c) What topics play the role "[list of association role topics or
query returning a hit list of association role topics]" in "[list
of association type topics or query returning a hit list of
association type topics]" associations within "[scope matching
expression]"?
SELECT PLAYERS OF ROLES {'uri','uri'...}
WHERE TEMPLATE IN {'uri','uri'...}
AND SCOPE MATCHES
* (d) What topics play the role "[list of association role topics or
query returning a hit list of association role topics]" in "[list
of association type topics or query returning a hit list of
association type topics]" associations wherein "[list of topics or
query returning a hit list of topics]" plays "[list of association
role topics or query returning a hit list of association role
topics]" within "[scope matching expression]"? (This is a
refinement of (c), above.)
SELECT PLAYERS OF ROLES {'uri','uri'...}
WHERE TEMPLATE IN {'uri','uri'...}
AND PLAYER OF ROLE IN
AND SCOPE MATCHES
* (e) What topics are members of the set of topics that constitutes
the scope within which "[list of topics or query returning a hit
list of topics]" has the name "[list of names or query returning a
hit list of names]"? (Returns a hit list of topics which is the
union of the sets of topics that are the selected scopes.)
SELECT SCOPE
WHERE PLAYER OF ROLE '#role-topic' = uri
AND PLAYER OF ROLE '#role-basename' =
AND TEMPLATE = '#at-topic-basename'
SELECT SCOPE COMPONENTS
WHERE PLAYER OF ROLE '#role-topic' = uri
AND PLAYER OF ROLE '#role-basename' =
AND TEMPLATE = '#at-topic-basename'
NOTE: what is the meaning of SCOPE or SCOPES or SCOPE(S) COMPONENTS ?
* (f) What topics are members of the set of topics that constitutes
the scope within which "[list of topics or query returning a hit
list of topics]" has the occurrence "[list of occurrences or query
returning a hit list of occurrences]"? (Returns a hit list of
topics which is the union of the sets of topics that are the
selected scopes.)
See above.
* (g) What topics are members of the set of topics that constitutes
the scope within which "[list of topics or query returning a hit
list of topics]" plays the role "[list of association role topics
or query returning a hit list of association role topics]" in
"[list of association type topics or query returning a hit list of
association type topics]" associations within "[scope matching
expression]"? (Returns a hit list of topics which is the union of
the sets of topics that are the selected scopes.)
* (h) What topics are members of the set of topics that constitutes
the scope within which "[list of topics or query returning a hit
list of topics]" plays the role "[list of association role topics
or query returning a hit list of association role topics]" in
"[list of association type topics or query returning a hit list of
association type topics]" associations wherein "[list of topics or
query returning a hit list of topics]" plays (other) "[list of
association role topics or query returning a hit list of
association role topics]" within "[scope matching expression]"?
(Returns a hit list of topics which is the union of the sets of
topics that are the selected scopes.)
* (i) What topic(s) has/have "[list of occurrences or query
returning a hit list of occurrences]" as its/their subject
indicators?
This is lookup by SIR
* (j) What topic(s) has/have "[list of occurrences or query
returning a hit list of occurrences]" as its/their subject
constituters?
This is lookup by SCR
* (k) What topics are association type topics?
(what's the meaning here ? templates or types e.g. occurrence types)
Templates:
1) SELECT PLAYERS OF ROLE '#role-template'
WHERE TEMPLATE = '#at-template-role-rpc'
2) SELECT TEMPLATE
WHERE *
Types:
SELECT PLAYERS OF ROLE '#role-class'
WHERE TEMPLATE = '#at-class-instance'
AND
* (l) What topics are the association types of "[list of
associations or query returning a hit list of associations]"?
* (m) What topics are association role types?
* (n) What topics are association role types in "[list of
association type topics or query returning a hit list of
association type topics]"?
* (o) What topics play the role "[list of association role topics]"
in association "[list of associations or query returning a hit
list of associations]"?
* (p) What topics play the role "[list of association role topics]"
in association type "[list of association type topics or query
returning a hit list of association type topics]"?
(2) Miscellaneous queries. These return "hit lists" where each "hit" is a
topic name, a topic occurrence, or a topic's "subject indicator" or "subject
constituter":
* (a) What are the names of "[list of topics or query returning a
hit list of topics]" within "[scope matching expression]".
(Returns a hit list wherein each hit is a name of a topic, and the
topic of which it is a name.)
* (b) What are the occurrences of "[list of topics or query
returning a hit list of topics]" within "[scope matching
expression]". (Returns a hit list wherein each hit is an
occurrence, and the topic of which it is an occurrence.)
* (c) What are the subject indicators of "[topic or query returning
a hit list containing exactly one topic]"? (Returns a hit list in
which each hit is a subject indicator.)
* (d) What is the subject constituter of "[topic or query returning
a hit list containing exactly one topic]"? (Returns a hit list
which is either empty or contains exactly one hit, which is the
subject constituter.)
(3) Queries that return "hit lists" where each "hit" is an association
between topics:
* (a) What associations exist within "[scope matching expression]".
SELECT ASSOCIATION
WHERE SCOPE MATCHES
* (b) In which associations does "[list of topics or query returning
a hit list of topics]" play a role within "[scope matching
expression]". (Returns a hit list of associations and the roles
played in each.)
SELECT ASSOCIATION, ROLE ????
WHERE PLAYERS OF ANYROLE IN
* (c) In which associations does "[list of topics or query returning
a hit list of topics]" play the role "[list of association role
topics or query returning a list of association role topics]"
within "[scope matching expression]"? (Returns a hit list of
associations.)
SELECT ASSOCIATION
WHERE PLAYERS OF ROLES IN
* (d) Which associations are of type "[list of association type
topics or query returning a hit list of association type topics]"?
(Returns a hit list of associations.)
SELECT ASSOCIATION
WHERE TEMPLATE IN
Scope Matching Expressions
The "scope matching expression" feature of many of the above query
types is an extremely important aspect of topic map queries. As
explained above, a scope is a set of topics used to establish the
valid context(s) within which a topic name, a topic occurrence, or an
association with one or more other topics, has such a name,
occurrence, or role in an association. (Remember: topics do not have
scope; rather, their characteristics -- names, occurrences, and the
roles they play in associations -- have scope.) Scopes are simply sets
of topics, but using them in powerful queries may involve some
complexity. Like any other term in a query expression, a scope
matching expression is used to suppress the reporting of unwanted
"hits". Scope matching expressions may consist of any of the following
kinds of selections, arbitrarily grouped (parenthesized) to control
the order of operations, joined by logical AND and OR connectors, and
optionally negated:
* (1) Match any scope.
ANYSCOPE
* (2) Match any scope in which any "[integer]" or more of "[topic
list or query returning a hit list of topics]" appear.
* (3) Match any scope in which any "[integer]" or fewer of "[topic
list or query returning a hit list of topics]" appear.
* (4) Match any scope in which exactly "[integer]" of "[topic list
or query returning a hit list of topics]" appear.
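The four kinds of scope selections, and their combination with AND, OR and
negation, can be sketched as composable predicates in Python. This is an
illustration only: scopes are modelled as plain sets of topic names, and all
function names are invented for this example. None of this is STMQL syntax,
which does not support scope yet.

```python
def shared(scope, topics):
    """Number of the given topics that appear in the scope."""
    return len(set(scope) & set(topics))


def any_scope():                      # selection (1)
    return lambda scope: True


def at_least(n, topics):              # selection (2)
    return lambda scope: shared(scope, topics) >= n


def at_most(n, topics):               # selection (3)
    return lambda scope: shared(scope, topics) <= n


def exactly(n, topics):               # selection (4)
    return lambda scope: shared(scope, topics) == n


def all_of(*matchers):                # logical AND
    return lambda scope: all(m(scope) for m in matchers)


def any_of(*matchers):                # logical OR
    return lambda scope: any(m(scope) for m in matchers)


def negate(matcher):                  # optional negation
    return lambda scope: not matcher(scope)


scope = {"english", "formal"}
matcher = all_of(at_least(1, ["english", "french"]),
                 negate(exactly(1, ["slang"])))
print(matcher(scope))
```

Grouping with parentheses in a scope matching expression corresponds to
nesting all_of() and any_of() calls.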
REFERENCES AND NOTES
====================
[1] 'Draft Reference Model for ISO 13250 Topic Maps', Michel Biezunski and
Steven R. Newcomb.
URL: http://www.y12.doe.gov/sgml/sc34/document/0298.htm
[2] Think about occurrences: an occurrence is an assertion between a topic
and a particular resource, the nature (pattern) of the assertion is the
'topic-occurrence relationship' but occurrences can also be instances of
a certain class.
[3] 'A draft statement of requirements for a comprehensive topic map query
language.', Michel Biezunski and Steven R. Newcomb
URL:http://www.topicmaps.net/query-rq.htm
[4] 'TMQL requirements (1.0.0)', The TMQL Working Group
URL: http://groups.yahoo.com/group/tmql-wg/files/official-docs/tmqlreqs.html