Practical RDF Inference: a Case Study of OpenTechnology.org
Uche Ogbuji
Knowledge Technologies 2001
March 5, 2001
Introduction
- What is OpenTechnology.org?
- Why develop OpenTechnology.org?
- What is the status of OpenTechnology.org?
- Why is OpenTechnology.org relevant to knowledge technologies?
The OpenTechnology.org user experience
- OpenTalk: a threaded discussion system with some unique features
- Minder: a console for metadata search, query, filtering and navigation
- The core: basic user and session management
- The future: Prism, custom metadata collections, ...
Enabling technologies
- Pretty much everything is stored and processed as XML
- XSLT is used to render almost every page
- XLink (moving to XInclude) is used to proxy objects in hub documents
- Schematron is used for validation of added objects
- And of course RDF, which is used pretty much everywhere as we shall see
- RDF Inference Language (RIL) is used for filtering based on user preferences
The role of RDF
- Everything has a classification: a document definition which connects the XML schema with the RDF schema
- Library-like metadata (Dublin Core) marks common metadata such as author, title and date created. These properties are often specialized.
- RIL's filtering based on user preference operates on the entire RDF model of OpenTechnology.org
- "Object" relationships: for instance, generic container relationship properties are specialized to, say, tie a message to a topic.
- Search and navigation are provided by matching patterns in the RDF model, then expanding and narrowing the results until the user makes an entry into the object map
- Policies and grouping are implemented using RDF statements, for instance, the fact that a forum is moderated or the different views that make up a skin
- Structured import and export of data is performed by special mappings from, say, MIME fields to the RDF schema for messages
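
To make the import mapping concrete, here is a minimal Python sketch of turning MIME header fields into RDF statements. It is an illustration only, not the OpenTechnology.org code: the mapping table, the function name `mime_to_statements`, and the subject URI `/messages/1` are assumptions, and statements are modeled as plain (subject, predicate, object) tuples with prefixed property names.

```python
from email import message_from_string

# Hypothetical mapping from MIME header names to RDF properties.
# The real OpenTechnology.org mapping is not shown in the slides.
HEADER_TO_PROPERTY = {
    "From": "dc:creator",
    "Subject": "dc:title",
    "Date": "dc:date",
}


def mime_to_statements(subject_uri, raw_message):
    """Return (subject, predicate, object) triples for the mapped headers."""
    msg = message_from_string(raw_message)
    statements = []
    for header, prop in HEADER_TO_PROPERTY.items():
        value = msg.get(header)
        if value is not None:  # skip headers the message does not carry
            statements.append((subject_uri, prop, value))
    return statements


RAW = "From: uche\nSubject: RIL question\n\nHow do I write a filter?\n"
triples = mime_to_statements("/messages/1", RAW)
for triple in triples:
    print(triple)
```

Export would run the same table in the other direction, reading statements about a message and emitting header fields.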
OpenTechnology.org architecture
OpenTechnology.org architecture notes
- What is 4Suite Server (4SS)?
- Other supporting packages include Apache and mod_snake
- Web requests flow from Apache to mod_snake to the OpenTechnology.org layer to 4SS
- The OpenTechnology.org layer is quite thin, representing the high-level rules of operation which are actuated within 4SS
- 4SS metaservers build on base servers to integrate RDF-based metadata management
- The system uses pretty traditional distributed transactions and session management
RDF powers OpenTalk
A ten-minute live demo of an OpenTalk session, demonstrating how RDF enables the display format and navigation, filtering and basic object relationships.
A RIL-based filter
An example RIL filter
<ot:filter
  xmlns:ot='https://www.namespaces.opentechnology.org/'
  xmlns:dc='https://www.purl.org/dc/elements/1.1'
  xmlns:ril='https://www.namespaces.rdfinference.org/ril'
  type='https://www.namespaces.opentechnology.org/types#AdvancedFilter'
>
  <dc:Creator>supertest</dc:Creator>
  <dc:Date>2003-11-19 14:06:04-05:00</dc:Date>
  <dc:Title>/root/Users/rilguy.filter</dc:Title>
  <ril:expression>
    <ril:variable-set name='spammers'>
      <ril:query>
        <dc:creator>
          <ril:variable name='X'/>
          <ril:string>spammer</ril:string>
        </dc:creator>
      </ril:query>
    </ril:variable-set>
    <ril:variable-ref name='spammers'/>
  </ril:expression>
</ot:filter>
RIL notes
- RIL treats an RDF model as an expert-system knowledgebase
- Implements basic inference, forward chaining in particular
- Operates in a "sandbox", a separate, temporary RDF model where assertions can be made; there are special actions to update the real model if necessary
- The future: functors of more than two parameters, certainty factors, a full complement of aggregate functions, abbreviations, ...
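
The sandbox and forward-chaining ideas above can be sketched in a few lines of Python. This is not RIL and not its implementation: the rule functions, the `forward_chain` helper, and the terms `ot:Spammer`, `ot:Spam` and `ot:hidden` are all hypothetical, chosen only to show rules firing repeatedly against a temporary copy of the model until nothing new can be inferred.

```python
def forward_chain(model, rules):
    """Apply rules until no new statements appear; return the sandbox."""
    sandbox = set(model)  # temporary copy; the real model is never modified
    changed = True
    while changed:
        changed = False
        for rule in rules:
            # Materialize the rule's conclusions before asserting them,
            # so the sandbox is not modified while it is being scanned.
            for stmt in list(rule(sandbox)):
                if stmt not in sandbox:
                    sandbox.add(stmt)
                    changed = True
    return sandbox


def spam_rule(statements):
    """If X was created by a known spammer, infer that X is spam."""
    spammers = {s for (s, p, o) in statements
                if p == "rdf:type" and o == "ot:Spammer"}
    for (s, p, o) in statements:
        if p == "dc:creator" and o in spammers:
            yield (s, "rdf:type", "ot:Spam")


def hide_rule(statements):
    """If X is spam, infer that X should be hidden."""
    for (s, p, o) in statements:
        if p == "rdf:type" and o == "ot:Spam":
            yield (s, "ot:hidden", "true")


MODEL = frozenset({
    ("rilguy", "rdf:type", "ot:Spammer"),
    ("/messages/7", "dc:creator", "rilguy"),
    ("/messages/8", "dc:creator", "uche"),
})

sandbox = forward_chain(MODEL, [hide_rule, spam_rule])
inferred = sandbox - MODEL
print(sorted(inferred))
```

The second rule only fires after the first has asserted its conclusion, which is the chaining step. Committing `inferred` back to the real model would be a separate, explicit action, in the spirit of the special update actions mentioned above.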
Skins: RDF connects XSLT machines
- Views and view transforms: each skin is a collection of views, each of which is an XSLT transform that is related to the skin object
- Candidate objects are gathered for each view using RDF properties
- Candidate objects are passed to a RIL filter as we've discussed
- The future: prioritization, favorites, ...
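
As a rough illustration of how a skin could be assembled from RDF statements, the Python sketch below gathers the views attached to a skin, collects each view's candidate objects, and passes them through a filter. The property names `ot:view` and `ot:candidate`, the URIs, and the `accept` callback standing in for a RIL filter are all assumptions; the XSLT transform step itself is left out.

```python
def objects_of(statements, subject, predicate):
    """Return the sorted objects of all statements with this subject and predicate."""
    return sorted(o for (s, p, o) in statements
                  if s == subject and p == predicate)


def assemble_skin(statements, skin, accept):
    """Map each view of the skin to the candidate objects that pass the filter."""
    plan = {}
    for view in objects_of(statements, skin, "ot:view"):
        candidates = objects_of(statements, view, "ot:candidate")
        plan[view] = [c for c in candidates if accept(c)]
    return plan


MODEL = [
    ("/skins/default", "ot:view", "/views/message-list.xslt"),
    ("/skins/default", "ot:view", "/views/topic-list.xslt"),
    ("/views/message-list.xslt", "ot:candidate", "/messages/7"),
    ("/views/message-list.xslt", "ot:candidate", "/messages/8"),
    ("/views/topic-list.xslt", "ot:candidate", "/topics/1"),
]

# Stand-in for the result of a user's filter.
HIDDEN = {"/messages/7"}

plan = assemble_skin(MODEL, "/skins/default", lambda uri: uri not in HIDDEN)
for view, objects in plan.items():
    print(view, objects)
```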
RDF powers Minder
A ten-minute live demo of a Minder session, demonstrating how RDF enables search, query and resource navigation.
Minder notes
- The basic entry is a pattern-based query: "I want all objects of type message"
- Search result expansion, narrowing and refinement using additional patterns: "I want only those messages whose creator is 'Uche'"
- The object node view: displays the object's raw XML and presents in-bound and out-bound RDF statements to allow navigating the relationship map
- The future: custom object views, full-text searching, RDF graph visualization and Natural Language Processing
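
The query, narrowing and node-view steps above can be sketched with a simple pattern matcher over (subject, predicate, object) tuples. The helper names `match` and `subjects`, the sample model, and the `ot:` terms are assumptions made for illustration, not Minder's actual interface.

```python
def match(statements, subject=None, predicate=None, obj=None):
    """Return the statements matching the pattern; None acts as a wildcard."""
    return [
        (s, p, o) for (s, p, o) in statements
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    ]


def subjects(statements, predicate, obj):
    """Return the set of subjects that have the given property value."""
    return {s for (s, _, _) in match(statements, predicate=predicate, obj=obj)}


MODEL = [
    ("/messages/1", "rdf:type", "ot:Message"),
    ("/messages/1", "dc:creator", "Uche"),
    ("/messages/2", "rdf:type", "ot:Message"),
    ("/messages/2", "dc:creator", "rilguy"),
    ("/topics/1", "rdf:type", "ot:Topic"),
    ("/topics/1", "ot:contains", "/messages/1"),
]

# Entry: "I want all objects of type message"
messages = subjects(MODEL, "rdf:type", "ot:Message")

# Narrowing: "I want only those messages whose creator is 'Uche'"
by_uche = messages & subjects(MODEL, "dc:creator", "Uche")

# Node view: out-bound and in-bound statements for one object
outbound = match(MODEL, subject="/messages/1")
inbound = match(MODEL, obj="/messages/1")

print(sorted(messages), sorted(by_uche))
print(outbound, inbound)
```

Narrowing is a set intersection here; expansion would be a union with the result of another pattern.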
Why use RDF?
- OpenTechnology is a combination forum, trove, learning environment and playground for open, XML-based technologies of which RDF is an important example
- Relationship graphs are a natural model for semistructured metadata
- The node-centric approach is a natural fit for aggregated object relationships
- RDF integrates well with XML-based data
- The tools are implemented, efficient and readily available
- There are already useful vocabularies available for RDF such as Dublin Core and DMOZ
RDF difficulties
- RDF relies heavily on URIs, which can be quite chaotic
- The scalability of RDF processors to widely distributed systems is unproven
- The syntax is messy and the model is a tad uncooked
- RDF schemas are rather wimpy
- RDF is likely to change to improve its design and to adapt to other changes in the XML spectrum
- Important paradigms such as N-ary relationships, quantification and attribution require reification, which is not completely specified and may be inefficient
Closed system counters
- In a closed system, URIs can be strictly controlled and evolved
- The required scalability is along the lines of traditional DBMS capacity
- Careful normalization of the model and syntax can be imposed in areas where the RDF spec is thin or contradictory
- RDF schema can be dummied out given strongly normalized mechanisms for generating statements (note that this changes once custom schemas are allowed)
- Migration to future RDF versions can be carefully controlled
- Reification can be implemented in an efficient and normalized manner
The core OpenTechnology.org design pattern
- Numerous small packets of XML data tied together with relationships encoded in RDF
- Many XML processing tasks scale poorly with document size; small XML documents help
- The connection between the XML schemata and the RDF schemata is a natural one
- Richer dimensionality of object relationships is manageable
- Addressing is simplified
- Synchronicity and referential integrity between the XML and RDF data is of crucial importance
- What shall we call it? meta-flyweight? markup nanites?
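
A minimal Python sketch of the pattern, under stated assumptions: a dictionary stands in for the store of small XML documents, a list of tuples stands in for the RDF model, and `ot:contains` is a hypothetical linking property. The check shows one way referential integrity between the two could be verified; it is not the mechanism used by 4SS.

```python
import xml.etree.ElementTree as ET

# Small XML packets, addressed by URI (hypothetical sample data).
DOCUMENTS = {
    "/messages/1": "<message><body>How do I write a filter?</body></message>",
    "/topics/1": "<topic><name>RIL</name></topic>",
}

# Relationships between the packets, kept separately as RDF-style statements.
STATEMENTS = [
    ("/topics/1", "ot:contains", "/messages/1"),
    ("/topics/1", "ot:contains", "/messages/2"),  # deliberately dangling
]


def dangling_references(documents, statements, link_properties=("ot:contains",)):
    """Return URIs that statements mention but that have no stored document."""
    missing = set()
    for s, p, o in statements:
        if s not in documents:
            missing.add(s)
        if p in link_properties and o not in documents:
            missing.add(o)
    return sorted(missing)


# Each packet is small, so parsing one on demand stays cheap.
roots = {uri: ET.fromstring(text).tag for uri, text in DOCUMENTS.items()}
dangling = dangling_references(DOCUMENTS, STATEMENTS)
print(roots)
print(dangling)
```

A check like this would typically run inside the same transaction that adds or removes a document, so the XML and RDF sides cannot drift apart.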
Resources