We often hear from customers who want to use Pellet without it accessing the network. Sometimes they want to avoid network problems by caching locally; sometimes they’re conforming to local security policy; often, people just like hacking on local copies before publishing their ontologies on the Web. Regardless of motivation, they need to avoid the network access used to fetch the contents of an ontology’s imports closure. In this post I outline how a user can set up a local ontology repository that will be used by Pellet’s Jena loader.
The most common use case is a user hand-editing a collection of local
ontologies that use HTTP URLs. Until the ontologies are ready to be
published, there is no content (or, even worse, outdated content) at
those URLs. The problem is that it’s cumbersome to change all the
URLs to file: URLs, only to change them back when publishing.
Consider two simple ontologies. First,
@prefix owl: <http://www.w3.org/2002/07/owl#> .
<http://example.org/PeopleOntology> a owl:Ontology .
<http://example.org/PeopleOntology#Person> a owl:Class .
And, second,
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
<http://example.org/FriendOntology> a owl:Ontology ;
    owl:imports <http://example.org/PeopleOntology> .
<http://example.org/FriendOntology#Friend> a owl:Class ;
    rdfs:subClassOf <http://example.org/PeopleOntology#Person> .
We want to use Pellet to iteratively check the inferred class hierarchy as we develop the ontologies. To do this with the command line tools, we would normally issue the following command:
:; pellet classify http://example.org/FriendOntology
But if we try this as-is, we’ll get an error, because nothing has been
published at these URLs yet. We need Pellet to recognize that, while
these ontologies are destined to be published on the Web, for now they
are in local files named people.ttl and friend.ttl. To do this, we use
a Jena LocationMapper configuration file. We can set up the file, named
location-mapping.ttl, with the following Turtle content:
@prefix lm: <http://jena.hpl.hp.com/2004/08/location-mapping#> .
[] lm:mapping
    [ lm:name "http://example.org/PeopleOntology" ; lm:altName "file:people.ttl" ] ,
    [ lm:name "http://example.org/FriendOntology" ; lm:altName "file:friend.ttl" ] .
The only other change we need is to explicitly tell Pellet to use the Jena loader; it uses the OWLAPI loader by default. The command line looks like this:
:; pellet classify --loader Jena http://example.org/FriendOntology
With the location mapping configuration file in place, the fetch no longer fails and we see the class hierarchy we expect, based on the content of the local files.
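A mapping entry that points at a file which isn’t there is an easy mistake to make while ontologies are being renamed and moved around. The following is a minimal shell sketch for catching it, not part of Pellet or Jena: the function name check_mapping_files is made up for illustration, and it assumes the mapping uses relative file: URLs written on the same line as lm:altName, as in the example above.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Pellet or Jena): report every local
# file named by an lm:altName "file:..." entry that does not exist.
# Assumes relative file: URLs, resolved against the current directory.
check_mapping_files() {
    # Refuse to report success for a mapping file we cannot read.
    [ -r "$1" ] || return 2
    grep -o 'lm:altName "file:[^"]*"' "$1" |
        sed 's/^lm:altName "file://; s/"$//' |
        (
            missing=0
            while IFS= read -r path; do
                if [ ! -f "$path" ]; then
                    echo "missing: $path" >&2
                    missing=1
                fi
            done
            # Exit status of this subshell becomes the function's status.
            exit "$missing"
        )
}
```

Run from the directory that holds the ontologies, `check_mapping_files location-mapping.ttl` returns a non-zero status and names each missing file on standard error.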
The second common use case is a user working with an ontology they’ve found on the Web that has an arbitrarily large imports closure. This user wants to avoid fetching ontologies over the network. There are three steps to addressing this: first, we need to identify all of the ontology URLs in the imports closure; then, we need to store them in our local repository; finally, we need to create an adequate mapping file.
To illustrate this use case, we’ll use the LKIF-Core ontology. This
ontology is interesting because it has a moderate number of ontologies
in its imports closure. We could use a tool like Protégé 4 to identify
the ontologies in the imports closure, but we’re going to assume that
Pellet is the only ontology tool available. To find all the network
resources fetched, we can take advantage of some debug logging
available in Jena. Jena uses log4j, so we need to create a log4j
configuration file, called lm-log4j.properties, to echo the
interesting content to standard error:
log4j.rootLogger=WARN, stderr
log4j.appender.stderr=org.apache.log4j.ConsoleAppender
log4j.appender.stderr.target=System.err
log4j.appender.stderr.layout=org.apache.log4j.SimpleLayout
log4j.logger.com.hp.hpl.jena.util.FileManager=DEBUG
Once the file is created, we set the system property log4j.configuration to reference it. If you’re using the shell script included with Pellet 2.0 RC5 or newer, you can do this with an environment variable as follows:
:; export pellet_java_args="-Dlog4j.configuration=file:lm-log4j.properties"
Then proceed as before:
:; pellet consistency --loader Jena http://www.estrellaproject.org/lkif-core/lkif-core.owl
There will be a lot of DEBUG messages, but it’s easy to
narrow them down to the useful details with a simple grep command, such as:
:; pellet consistency --loader Jena http://www.estrellaproject.org/lkif-core/lkif-core.owl 2>&1 | grep 'Not mapped'
The output should look something like the following, listing every URL that is being retrieved:
DEBUG - Not mapped: http://www.estrellaproject.org/lkif-core/lkif-core.owl
DEBUG - Not mapped: http://www.estrellaproject.org/lkif-core/norm.owl
DEBUG - Not mapped: http://www.estrellaproject.org/lkif-core/legal-role.owl
DEBUG - Not mapped: http://www.estrellaproject.org/lkif-core/legal-action.owl
DEBUG - Not mapped: http://www.estrellaproject.org/lkif-core/role.owl
DEBUG - Not mapped: http://www.estrellaproject.org/lkif-core/expression.owl
DEBUG - Not mapped: http://www.estrellaproject.org/lkif-core/action.owl
DEBUG - Not mapped: http://www.estrellaproject.org/lkif-core/process.owl
DEBUG - Not mapped: http://www.estrellaproject.org/lkif-core/relative-places.owl
DEBUG - Not mapped: http://www.estrellaproject.org/lkif-core/time.owl
DEBUG - Not mapped: http://www.estrellaproject.org/lkif-core/mereology.owl
DEBUG - Not mapped: http://www.estrellaproject.org/lkif-core/lkif-top.owl
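The grep above can be extended into a small shell sketch that strips the log prefix and leaves only the URLs, ready to hand to a downloader. This assumes the log lines look exactly like the sample output (a `DEBUG - Not mapped: ` prefix followed by the URL); the function name unmapped_urls and the file name urls.txt are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read Jena FileManager debug output on stdin and
# print each "Not mapped" URL once, in order of first appearance.
unmapped_urls() {
    sed -n 's/^DEBUG - Not mapped: //p' | awk '!seen[$0]++'
}

# Example usage (assumed file names; the download needs network access once):
#   pellet consistency --loader Jena http://www.estrellaproject.org/lkif-core/lkif-core.owl 2>&1 \
#       | unmapped_urls > urls.txt
#   while read -r url; do curl -fsSLO "$url"; done < urls.txt
```

The deduplication step is there in case the same URL is logged more than once; the sample output does not show that, so it is purely defensive.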
After downloading each of these files and saving them locally, we can
create a location mapping file as above, with one map entry per
ontology. That said, the location mapping configuration file supports
more sophisticated mappings, and this is a great time to take advantage
of prefix-based mapping. The following content in
location-mapping.ttl should be sufficient:
@prefix lm: <http://jena.hpl.hp.com/2004/08/location-mapping#> .
[] lm:mapping
    [ lm:prefix "http://www.estrellaproject.org/lkif-core/" ; lm:altPrefix "file:./" ] .
With this in place and all the files in the working directory, rerunning the previous command produces no matches from grep. To disable the debug output:
:; unset pellet_java_args
Then proceed as before:
:; pellet consistency --loader Jena http://www.estrellaproject.org/lkif-core/lkif-core.owl
We’ve used the location mapping configuration to completely avoid network access.
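When a single prefix rule doesn’t fit, for example because some files will be renamed locally, the one-entry-per-ontology form can be generated from a list of URLs and then edited by hand. The sketch below is one way to do that in shell; the function name make_location_mapping is hypothetical, and it assumes each URL ends in a file name and that the file is saved under that same name in the working directory.

```shell
#!/bin/sh
# Hypothetical helper: read ontology URLs on stdin (one per line) and print a
# location mapping with one lm:name / lm:altName entry per URL. Each URL is
# mapped to a file with the same base name in the working directory.
make_location_mapping() {
    awk '
        NF {
            # The last path segment of the URL is taken as the local file name.
            n = split($1, parts, "/")
            entry[++count] = sprintf("    [ lm:name \"%s\" ; lm:altName \"file:%s\" ]", $1, parts[n])
        }
        END {
            # With no URLs there is nothing valid to write.
            if (count == 0) exit 1
            print "@prefix lm: <http://jena.hpl.hp.com/2004/08/location-mapping#> ."
            print "[] lm:mapping"
            for (i = 1; i <= count; i++)
                print entry[i] (i < count ? " ," : " .")
        }
    '
}

# Example usage (urls.txt is an assumed file holding one URL per line):
#   make_location_mapping < urls.txt > location-mapping.ttl
```

For the LKIF-Core case the prefix rule is shorter and does the same job; the generated form is mainly useful as a starting point when individual entries need different local names.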
A few additional details are worth noting. First, Jena does some
searching for the location mapping configuration file, but the easiest
approach is to keep it in the working directory. Alternatively, it
can be explicitly named using the LocationMap system
property. This approach can be attractive if you work on multiple
ontology projects and would like them to share a single local
repository. For example, you might use:
:; export pellet_java_args="-DLocationMap=file:///etc/my-repository.ttl"
Second, in Pellet 2.0 RC5 this functionality is only available if Pellet’s Jena loader is used. We’ve got a ticket open to duplicate the functionality in the OWLAPI loader and hope to have it in place before the final Pellet 2.0 release.
Feel free to comment on this functionality or any other aspect of Pellet’s behavior on the pellet-users mailing list. See you there.
Update: There has been some public discussion, such as this thread on public-owl-wg@w3.org, about tools using XML Catalogs to provide a standardized map description format similar to the one provided by the location mapper configuration file used here. We think that any mechanism that is sane and supported by OWL tools in an interoperable way is a good thing. Translation between the Jena format and XML Catalogs looks straightforward, so you needn’t worry about backwards compatibility issues if Pellet supports XML Catalogs in the future.

