Empire is an implementation of the Java Persistence API (JPA) for RDF and the Semantic Web. Instead of being yet another implementation for relational databases, Empire implements JPA for RDF and SPARQL, allowing developers who are familiar with JPA, but not with Semantic Web technologies like RDF, to make an easy transition into this brave new world. JPA is a specification for managing Java objects, most commonly with an RDBMS; it is the industry standard for Java ORMs.
What Itch Does Empire Scratch?
We started Empire, which is available under the terms of the Apache 2.0 open source license, to bridge the gap between an RDBMS-backed web application and the Semantic Web. We had built a web application for a customer using JPA and Hibernate, but we also wanted to provide a SPARQL endpoint so that we could use Pelorus, a faceted browser for RDF and SPARQL. Ideally, we wanted a JPA implementation that would operate against an RDF database in support of these requirements. The objective of this article is to walk through some basic uses of Empire to illustrate how it can be used in your application. For the purposes of the article, we’ll present some examples from an application that uses metadata about various O'Reilly books.
Persistence With Plain RDF
O'Reilly has recently started publishing its catalog pages with RDFa markup. For example, if you check out the page for “Switching to the Mac” you’d find this RDF embedded in the page:
If you were to create this data by hand using the Sesame API, it would look something like this:
You might have factory classes or constants to represent common concepts such as terms from the FOAF or DC ontologies, but for the most part, creating RDF data is going to look quite similar to this. While this is a perfectly functional example, you might find a few issues with it. First, this code does not look “natural”; that is, it does not represent what is actually going on in an easily discernible way. It doesn’t really look like we’re creating some data about a book in the O'Reilly catalog. Second, it has locked us into a particular RDF API; this is Sesame code, and it’s a non-trivial task to transition this code to another API. Third, the code is only going to make sense to someone who is familiar with RDF; it exposes a lot of RDF minutiae to the developer, which steepens the learning curve for new developers.
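To make these complaints concrete without depending on any particular RDF library, here is a minimal, library-neutral sketch of statement-at-a-time construction. The `Triple` class, the `describeBook` helper, the `example.org` URIs, and the `BOOK_CLASS` constant are all hypothetical stand-ins for illustration; none of them is part of Sesame or Empire.

```java
import java.util.ArrayList;
import java.util.List;

public class StatementAtATime {

    /** Hypothetical minimal triple; stands in for an RDF library's statement type. */
    public static final class Triple {
        public final String subject;
        public final String predicate;
        public final String object;

        public Triple(String subject, String predicate, String object) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }
    }

    // Well-known vocabulary terms, spelled out as full URIs.
    static final String RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type";
    static final String DC_TITLE = "http://purl.org/dc/terms/title";
    // Placeholder class URI, assumed for illustration only.
    static final String BOOK_CLASS = "http://example.org/vocab#Book";

    /** Builds the description of one book, one statement at a time. */
    public static List<Triple> describeBook(String bookUri, String title) {
        List<Triple> graph = new ArrayList<>();
        graph.add(new Triple(bookUri, RDF_TYPE, BOOK_CLASS));
        graph.add(new Triple(bookUri, DC_TITLE, title));
        return graph;
    }

    public static void main(String[] args) {
        // The book URI is a placeholder, not a real catalog address.
        List<Triple> graph = describeBook(
                "http://example.org/books/switching-to-the-mac",
                "Switching to the Mac");
        for (Triple t : graph) {
            System.out.println(t.subject + " " + t.predicate + " " + t.object);
        }
    }
}
```

Even in this stripped-down form, the caller has to think in subjects, predicates, and vocabulary URIs rather than in books and titles.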
What we want are simple Java beans to represent concepts in our application; such application code is easier to create and maintain, does not leak RDF specifics into the codebase, and does not tie us to any particular RDF API.
Consider the following example:
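A minimal sketch of such bean-based code, assuming a hypothetical `Book` bean with `title` and `publisher` properties (the class, its fields, and the `createBook` helper are illustrative, not taken from Empire):

```java
public class BookBeanExample {

    /** Hypothetical plain Java bean for a catalog book; no RDF types appear anywhere. */
    public static class Book {
        private String title;
        private String publisher;

        public String getTitle() { return title; }
        public void setTitle(String title) { this.title = title; }

        public String getPublisher() { return publisher; }
        public void setPublisher(String publisher) { this.publisher = publisher; }
    }

    /** Creates the book using nothing but ordinary setters. */
    public static Book createBook() {
        Book book = new Book();
        book.setTitle("Switching to the Mac");
        book.setPublisher("O'Reilly");
        return book;
    }

    public static void main(String[] args) {
        Book book = createBook();
        System.out.println(book.getTitle() + " (" + book.getPublisher() + ")");
    }
}
```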
This code is much easier to work with: it is clearer about what it's trying to accomplish, it succinctly represents our domain, it does not tie us to any API other than our own, and it exposes no RDF details to the programmer. Nearly any developer, Java or otherwise, could look at this code and immediately understand what's going on. Obviously using Plain Old Java Objects (POJOs) is ideal, but that is only half of the challenge. We still need to save, remove and search for our data, and we want it represented as RDF. This is where Empire comes in.
If you’ve used a JPA implementation before, a lot of the following code should be very familiar to you. Mappings between a Java bean and an RDBMS are often controlled through the common annotations provided by JPA. You typically begin by declaring that your bean is a JPA entity:
Empire simply extends this approach by adding an additional annotation to the class to specify its type:
We’ve now mapped instances of the Book class to individuals of the frb:Expression class. You’ll notice an additional, optional annotation, @Namespaces, on the class, where we specify the namespaces that we’ll use throughout our markup; this allows us to use qnames instead of full URIs. We need to make one last change before we can start mapping the properties of the class to RDF: we need to assert that this book can have an RDF identifier:
In Empire it’s easier to work with named individuals than anonymous ones, but Empire supports both and provides built-in handlers for keeping them straight. You never have to worry about setting or creating ids. Now we need to map the properties of our Java bean to the properties of the Book instances in our database. Typically, using Hibernate, TopLink or another JPA implementation, standard properties are very easy; you just declare them:
These three fields will get persisted in the database when you save your Book object. If you have a collection of items, you’ll just need to specify some basics of the mapping:
Empire only requires a little bit more information; namely, it needs to know what property each field in your bean corresponds to:
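Putting the class-level and field-level annotations together might look like the sketch below. To keep it compilable with the JDK alone, it declares local stand-in annotations: `@Namespaces` is named in the text above, while `@RdfsClass` and `@RdfProperty` are assumed names, and the namespace URIs, the `dc:title` and `dc:publisher` property choices, and the `propertyMapping` helper are illustrative as well. The standard JPA `@Entity` annotation and the RDF identifier step are left out for the same reason; a real project would import the JPA and Empire annotations instead.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.Map;
import java.util.TreeMap;

public class AnnotatedBookSketch {

    // Local stand-ins so the sketch compiles on its own (assumed shapes).
    @Retention(RetentionPolicy.RUNTIME) @Target(ElementType.TYPE)
    public @interface Namespaces { String[] value(); }

    @Retention(RetentionPolicy.RUNTIME) @Target(ElementType.TYPE)
    public @interface RdfsClass { String value(); }

    @Retention(RetentionPolicy.RUNTIME) @Target(ElementType.FIELD)
    public @interface RdfProperty { String value(); }

    // Prefix/URI pairs; the URIs are assumptions for illustration.
    @Namespaces({"frb", "http://vocab.org/frbr/core#",
                 "dc", "http://purl.org/dc/terms/"})
    @RdfsClass("frb:Expression")
    public static class Book {
        @RdfProperty("dc:title")
        private String title;

        @RdfProperty("dc:publisher")
        private String publisher;

        public String getTitle() { return title; }
        public void setTitle(String title) { this.title = title; }
        public String getPublisher() { return publisher; }
        public void setPublisher(String publisher) { this.publisher = publisher; }
    }

    /** Reads the field-to-property mapping back via reflection, sorted by field name. */
    public static Map<String, String> propertyMapping(Class<?> type) {
        Map<String, String> mapping = new TreeMap<>();
        for (Field field : type.getDeclaredFields()) {
            RdfProperty property = field.getAnnotation(RdfProperty.class);
            if (property != null) {
                mapping.put(field.getName(), property.value());
            }
        }
        return mapping;
    }

    public static void main(String[] args) {
        System.out.println(Book.class.getAnnotation(RdfsClass.class).value());
        System.out.println(propertyMapping(Book.class));
    }
}
```

The bean itself stays an ordinary POJO; everything RDF-specific lives in annotation values, which a persistence layer can discover reflectively in the way `propertyMapping` does here.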
With these simple additional annotations, the Java bean can now be used with Empire.
Initializing Empire is trivial; you simply need to declare which API bindings you’d like to load. The following example shows how to load the support for Sesame, which allows Empire to connect to Sesame repositories. You can load multiple bindings at once and have different persistence contexts connected to databases of different types, while still maintaining the same public API:
Here we use the standard JPA framework to grab an instance of our persistence context named ‘oreilly’. The resulting EntityManager will be connected to the Sesame repository specified in our configuration:
The following shows how to retrieve a specific item, in this case a book, from the database and print some of its data:
Here we show that finding the same object in the database yields an instance which is .equals() to our original copy:
Additionally, we then make some edits to our original and save them back into the database. Our copy remains unchanged and is a snapshot of the state of the book at the time we retrieved it. This also shows how attributes on the JPA annotations can control the persistence behavior; in this case, how persistence is cascaded between objects:
We can always refresh a “stale” object with the latest data from the database.
Here is an example of removing an object from the database; it again demonstrates how persistence operations are controlled through the JPA annotations:
A final example demonstrates how standard JPA parameterized queries can be used with normal SPARQL to query the database:
Features and Support
Empire implements as much of JPA as possible while attempting to retain the expected behavior based on the JPA spec. There are features and portions of JPA that Empire does not yet support, such as @SqlResultSetMapping; and some others that have no correlation to an RDF-based system, such as
Configuration of Empire is controlled through simple properties or XML format files loaded at startup. There is no tricky XML mapping language to learn; all mappings are controlled through the standard JPA annotations. The configuration files simply define the connection parameters for your database and allow global properties to be shared by all databases.
Empire uses dependency injection via Google Guice to manage its plugin architecture, and Javassist for bytecode manipulation: generating instances from interfaces or abstract classes at runtime, and lazily loading resources from the database using method interceptors. This allows Empire to provide an API-agnostic mechanism for working with RDF databases, thus avoiding API and/or database lock-in. Empire provides out-of-the-box support for Sesame, Jena, and 4Store; support for BigData, Oracle 11g, and Virtuoso is coming soon.
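The idea of generating an instance from an interface and loading its data lazily through a method interceptor can be sketched with the JDK's own `java.lang.reflect.Proxy`. This is a deliberately simplified stand-in, not Javassist and not Empire's implementation; the `Book` interface, the `lazyBook` factory, and the `Supplier` that plays the role of the database are all hypothetical.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.function.Supplier;

public class LazyProxySketch {

    /** Hypothetical entity interface; no concrete class is written by hand. */
    public interface Book {
        String getTitle();
    }

    /**
     * Returns a Book whose title is fetched from titleLoader on the first
     * call to getTitle() and cached afterwards.
     */
    public static Book lazyBook(Supplier<String> titleLoader) {
        InvocationHandler handler = new InvocationHandler() {
            private String title;
            private boolean loaded;

            @Override
            public Object invoke(Object proxy, Method method, Object[] args) {
                // Methods inherited from Object get simple identity-based behavior.
                if (method.getDeclaringClass() == Object.class) {
                    switch (method.getName()) {
                        case "hashCode": return System.identityHashCode(proxy);
                        case "equals":   return proxy == args[0];
                        default:         return "LazyBook@" + System.identityHashCode(proxy);
                    }
                }
                // Intercept the getter: load on first access only.
                if (method.getName().equals("getTitle")) {
                    if (!loaded) {
                        title = titleLoader.get();
                        loaded = true;
                    }
                    return title;
                }
                throw new UnsupportedOperationException(method.getName());
            }
        };
        return (Book) Proxy.newProxyInstance(
                Book.class.getClassLoader(), new Class<?>[] {Book.class}, handler);
    }

    public static void main(String[] args) {
        Book book = lazyBook(() -> {
            System.out.println("loading title...");
            return "Switching to the Mac";
        });
        System.out.println(book.getTitle()); // triggers the load
        System.out.println(book.getTitle()); // served from the cached value
    }
}
```

JDK proxies only work for interfaces; a bytecode library such as Javassist is what makes the same trick possible for abstract classes, which is one plausible reason to reach for it.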
Empire provides a standard, widely-known Java persistence framework for use in Semantic Web projects where data is stored in RDF. By providing an implementation of JPA and using it to abstract the minutiae of RDF, it lowers the learning curve for new developers, and helps provide a straightforward path for migrating or enhancing existing traditional web applications to use semantic technologies.