Index
General
- Why do I get a java.lang.NoClassDefFoundError when I run a Jena application?
- What does the error 'java.lang.NoSuchFieldError: actualValueType' mean?
- What versions of library jars does Jena require?
- How do I get the most up-to-date version of Jena?
- How do I reduce the space needed to deploy Jena?
RDF model API
- Why does the localname part of my URI look wrong?
- How do I change the URI or localName of a Resource?
- Why do I see the warning message 'Detected a NaN anomaly believed to be due to use of JDK 1.4.1'?
Reasoner and inference models
- I want to develop my own rules, how do I get started?
- Why are there two different arrows ( -> and <- ) in the rule syntax?
- The domain and range inferences look wrong, is that a bug?
- Why do I get a warning: Creating OWL rule reasoner working over another OWL rule reasoner?
- Why do I get out of memory errors when working with the wine ontology?
- What causes the error "java.lang.UnsupportedOperationException: this is not a URI node" in the DIG reasoner?
- I want to use my own custom rules to extend an existing RDFS or OWL Schema, what do I do?
Ontology API
- Why doesn't listClasses() (or listProperties()/listIndividuals(), etc) work?
- Why doesn't listProperties() return any results when listObjectProperties() (or listDatatypeProperties()) does?
- Why are my transitive properties (or symmetric properties or inverse functional properties) missing when I call listObjectProperties()?
- Why does .as( OntProperty.class ) fail with ConversionException on SymmetricProperty (or other property types)?
- Why doesn't the ontology API handle sub-class (or sub-property, domain, range, etc) relationships in a DAML model?
- I don't understand very clearly the difference between the various OntModel model profiles.
Database and persistence
- Why do I get an exception when trying to create a new persistent model?
- Why do I run out of memory when trying to list statements in a persistent model?
- Has Jena2 persistence been ported to other database engines and platforms besides those officially supported?
- Is there a limit on the number of models in a database?
- Why am I getting an exception on failure to lock or unlock the database?
- How do I access the Jena database tables?
XML serialisation (reading and writing)
- Why does my output use <rdf:Description ...> when I want output like <owl:Class ...>?
- Why does my XML output contain strange prefixes j.0, j.1, etc?
SPARQL and query processing
See the ARQ documentation and ARQ FAQ. See also the ARQ web site for new versions of ARQ.
Answers
General
Q: Why do I get a java.lang.NoClassDefFoundError when I run a Jena application?
A: This means that one or more of the libraries that Jena depends on are not
on your classpath. Typically, all of the libraries (.jar files) in $JENA/lib,
where $JENA refers to the directory in which you installed Jena, should be on
your classpath. Consult the documentation for your JDK for details on setting
the classpath for your system. There are also a number of on-line tutorials on setting
the Java classpath. Consult Google or see here.
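As a quick diagnostic, a small stand-alone program can print the classpath the JVM is actually using and report whether a given class can be found on it. The following is only a sketch: the class name ClasspathCheck and its isLoadable helper are made up for illustration, and the default class probed is assumed from the com.hp.hpl.jena package layout used by Jena 2.

```java
import java.io.File;

/** Prints the effective classpath and checks whether given classes can be found on it. */
final class ClasspathCheck {

    /** Returns true if the named class can be located by this program's class loader. */
    static boolean isLoadable(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initialisers
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // NoClassDefFoundError is a LinkageError, so it is covered here too
            return false;
        }
    }

    public static void main(String[] args) {
        // One entry per line makes a missing jar easier to spot
        String classpath = System.getProperty("java.class.path");
        for (String entry : classpath.split(File.pathSeparator)) {
            System.out.println("classpath entry: " + entry);
        }
        // Class names to probe; the default is an assumption based on the Jena 2 package layout
        String[] probes = args.length > 0
                ? args
                : new String[] { "com.hp.hpl.jena.rdf.model.Model" };
        for (String name : probes) {
            System.out.println(name + (isLoadable(name) ? " found" : " NOT found"));
        }
    }
}
```

Run it with the same classpath settings as the failing application, passing the class named in the NoClassDefFoundError message as an argument.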
Q: What does the error 'java.lang.NoSuchFieldError: actualValueType' mean?
A: This is almost always due to using the wrong version of the Xerces
library. Jena makes use of XML Schema support that changed at Xerces 2.6.0 and is not
compatible with earlier versions. At the time of writing Jena ships with Xerces 2.6.1.
In some situations your runtime environment may be picking up an earlier version of Xerces from
an "endorsed" directory and you will need to either disable use of that endorsed library or replace
it with a more up-to-date version of Xerces. This occurs with Tomcat 5.* and certain configurations of JDK 1.4.1.
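To see which copy of a library the runtime has actually picked up, you can ask the JVM where a class was loaded from. This is a minimal sketch, not part of Jena: WhichJar and locate are hypothetical names, and org.apache.xerces.impl.Version is used as the default probe on the assumption that it is present in the Xerces jar in use.

```java
import java.security.CodeSource;

/** Reports the jar or directory from which a class was loaded. */
final class WhichJar {

    /** Describes where the named class came from, or why that cannot be determined. */
    static String locate(String className) {
        try {
            Class<?> c = Class.forName(className, false, WhichJar.class.getClassLoader());
            CodeSource source = c.getProtectionDomain().getCodeSource();
            // Classes supplied by the JDK itself usually report no code source
            if (source == null || source.getLocation() == null) {
                return "(JDK / bootstrap class path)";
            }
            return source.getLocation().toString();
        } catch (ClassNotFoundException | LinkageError e) {
            return "(not found: " + e + ")";
        }
    }

    public static void main(String[] args) {
        // Default probe is an assumption; pass another class name as the first argument
        String name = args.length > 0 ? args[0] : "org.apache.xerces.impl.Version";
        System.out.println(name + " -> " + locate(name));
    }
}
```

If the location printed is not the Xerces jar shipped in $JENA/lib, an older copy is probably being picked up ahead of it.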
Q: What versions of library jars does Jena require?
A: Jena makes use of several third party Java libraries. A copy of
each of these is included in the $JENA/lib area of the distribution and we recommend
including all of these jars in your classpath. In some circumstances applications
already make use of specific versions of these libraries (e.g. Xerces) and
need to check whether the version they are using is compatible with those shipped
with Jena. The current library versions used by Jena are given here.
Q: How do I get the most up-to-date version of Jena?
A: Released versions of Jena are available from the downloads page on SourceForge.
However, there may be changes and bug fixes
that have been added to the Jena codebase that are not yet available as
part of a release. To get the most up-to-date version of Jena, download
the source code from CVS (instructions for this are available here).
To compile the source code and generate a new jena.jar file,
Ant must be installed and on the path. The
command to build Jena is the default Ant target, so it is only necessary to
cd to the Jena root directory, which should contain the file build.xml,
and issue the command ant.
To test that the newly compiled version of Jena is working correctly, run
the script test.bat (Windows) or test.sh (Linux or
Cygwin) to run the full suite of Jena regression tests. There should
be no test failures. Note, however, that non-release
versions of Jena may not be as fully tested and stable as the formal
releases.
Q: How do I reduce the space needed to deploy Jena (give
Jena a smaller footprint)?
A: Jena attempts to follow the various relevant specifications
very closely, which means that both Jena itself, and the libraries
(.jar files) it depends on, are quite large. For example, the icu4j
library assists with the correct interpretation of URIs encoded in international
character sets. For many applications the size of the Jena deployment
is not a problem. There are some
circumstances, however, where reducing the size of the installed libraries is
important - for example, when installing a semantic web application on a mobile
device. To reduce the storage space required for Jena itself you can build a
different version of jena.jar, using the command:
ant jar-optimised
This builds a version of Jena with no symbols or other debugging information (see above for instructions on using ant). Note that this will make error stacktraces less informative.
The library jenatest.jar contains Jena's unit test suite.
This does not have to be included when deploying
a Jena application. Other libraries from the Jena lib directory may be left out,
with caution, if the functionality of that library is not required. For example,
if the application is only going to handle ASCII text, it should not be
necessary to install the icu4j library. However, users should be
aware that the only supported configuration of Jena is to deploy with
all of the .jar files from the lib/ directory, except for
jenatest.jar, or with an alternative optimised jena.jar.
Other configurations may well work, but are at the user's
own risk.
RDF model API
Q. Why does the localname part of my URI look wrong?
A: In Jena it is possible to retrieve the localname part of a Resource or Property
URI. Sometimes developers create Resources with a full URI reference but
find that the result of a getLocalName call is not quite what they expected.
This is usually because the URI is ill-formed or cannot be correctly split
in the way you expected. The only reason for separating namespace and local
name is to support the XML serialization in which qnames are used for properties
and classes. Thus the main requirement of the split is that the localname
component must be a legal XML NCName. This means it must start with a letter
or _ character and can only contain limited punctuation. In particular,
it can't contain spaces, but then spaces are not legal in URI references
anyway. In general, it is best not to use the localname split to encode
any information; you should only be concerned with it if you are coding
a parser or writer.
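To make the NCName constraint concrete, here is a small stand-alone sketch of such a split. It is not Jena's implementation: the class LocalNameSplit and its methods are hypothetical, and the character tests only approximate the NCName rules (letters, digits, '_', '-' and '.', starting with a letter or '_').

```java
/** Illustrative namespace/localname split based on simplified NCName rules. */
final class LocalNameSplit {

    /** Characters allowed to start an NCName (simplified). */
    static boolean isNameStart(char c) {
        return Character.isLetter(c) || c == '_';
    }

    /** Characters allowed anywhere inside an NCName (simplified). */
    static boolean isNameChar(char c) {
        return Character.isLetterOrDigit(c) || c == '_' || c == '-' || c == '.';
    }

    /** Index at which the local name starts; uri.length() if there is no legal local name. */
    static int splitPoint(String uri) {
        int i = uri.length();
        // Walk backwards over characters that may appear inside an NCName
        while (i > 0 && isNameChar(uri.charAt(i - 1))) {
            i--;
        }
        // Then move forwards to the first character that may start an NCName
        while (i < uri.length() && !isNameStart(uri.charAt(i))) {
            i++;
        }
        return i;
    }

    static String localName(String uri) {
        return uri.substring(splitPoint(uri));
    }

    public static void main(String[] args) {
        String[] uris = {
            "http://example.com/test#SomeClass",
            "http://example.com/item/123abc",
            "http://example.com/id/42"
        };
        for (String uri : uris) {
            System.out.println(uri + " -> localname \"" + localName(uri) + "\"");
        }
    }
}
```

Under these rules the second URI yields the localname "abc" rather than "123abc", because a name may not start with a digit, and the third yields an empty localname. This is the kind of result that surprises developers who expect the split to fall at the last '/' or '#'.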
Q. How do I change the URI or localName of a Resource?
A: In Jena, the URI of a resource is invariant.
So there is no setLocalName() or setURI() method,
and there will never be one.
The only way to "rename" a resource R is to remove all of the statements that
mention R, and add new statements with R replaced by R'.
A utility for doing this is provided:
com.hp.hpl.jena.util.ResourceUtils.renameResource()
If you are working with inference or ontology models, you need to be
careful to do this on the base model, not the entailment (aka inference) model.
Q. Why do I see the warning message 'Detected a NaN anomaly believed to be due to use of JDK 1.4.1'?
A: You're using an obsolete version of Jena and JDK 1.4.1.
As a side effect of some changes post-Jena 2.1 we started seeing
random error messages of the form 'Illegal load factor: NaN' when creating a HashMap.
This appears to be a JDK bug in that the call is perfectly legal and the error
message is seen frequently under JDK 1.4.1 but
has never been seen under 1.3.1 or 1.4.2. We provided a workaround which simply tries
again to create the HashMap, and this log message indicates that the workaround has
been triggered. Later versions of Jena ceased using that code.
To prevent it occurring, switch to JDK 1.4.2 or later and upgrade your Jena. If you continue to see the message please let us know.
Reasoner and inference models
Q. I want to develop my own rules, how do I get started?
A: The GenericRuleReasoner is the place to start.
You can create instances of this reasoner by
supplying either an explicit set of Rule objects
or a configuration description (as a Jena Model)
that points to a local rule file.
See the inference documentation for more details:
inference/index.html#rules
Q. Why are there two different arrows ( -> and <- ) in the rule syntax?
A: As explained in the documentation there are two rule systems available - a
forward chainer and a backward chainer. You can choose to use either or
use the two together in a hybrid mode.
So if we use Ti as short hand for triple patterns like (?x rdf:type ?C),
and if we ignore functors and procedural call out for now, then the syntax:
T1, T2, ... TN -> T0 .
means that if the triple patterns T1 to TN match in the data set
then the triple T0 can be deduced as a consequence. Similarly
T0 <- T1, T2, ... TN .
means the same thing - the consequence is always on the "pointy" end of
the arrow.
Now if you are just using pure forward or backward rules then you could
choose to use either syntax interchangeably. This allows you to write a
rule set and use it in either mode. In practice, though, "->" is the more
conventional direction in forward systems and "<-" is the more
conventional one in backward systems.
The hybrid configuration allows you to create new backward rules as a
result of forward rules firing so that the syntax:
T1, T2 -> [T0 <- T3, T4] .
is saying that if both T1 and T2 match in the dataset then add the
backward rule "[T0 <- T3, T4]" after instantiating any bound variables.
Q. The domain and range inferences look wrong, is that a bug?
A: The way rdfs range and domain declarations work is completely alien to
anyone who thinks of RDFS and OWL as being a bit like a type system for
a programming language, especially an object oriented language. Whilst there
may be bugs in the inference rule sets the most common explanation for surprising
results, when listing inferred domains and ranges, is this mismatch in expectations.
Suppose we have three classes: eg:Man is an rdfs:subClassOf eg:Person,
which is an rdfs:subClassOf eg:Animal.
Suppose we have a property eg:personalName which is declared to
have rdfs:domain eg:Person. Now the question is
what other values can be inferred for the rdfs:domain of eg:personalName?
In pure RDFS no additional conclusions can be made. The definition of
domain and range is intensional not extensional. It only works
forward. Declaring <eg:personalName rdfs:domain eg:Person>
means that anything to which eg:personalName is applied can
be concluded to be of type eg:Person. It does not work backward
- if you somehow knew that all things to which eg:personalName
applied were also Foo's you cannot conclude that
<eg:personalName rdfs:domain Foo>.
However, RDFS permits systems to strengthen the meaning of domain and range
to be extensional, so that valid domain and range deductions can be made.
OWL makes use of this option. So in OWL, in our example, we can also
deduce that <eg:personalName rdfs:domain eg:Animal>.
If you are used to object oriented programming this may look wrong. It is
tempting, but incorrect, to think of rdfs:domain as meaning "this is the
class of objects to which this property can be applied". With that mindset
you might expect to find that <eg:personalName rdfs:domain eg:Man>;
after all, every eg:Man is an eg:Person so it is
always "legal" to apply eg:personalName to an eg:Man.
That is true, it is legal, any eg:Man is allowed to have
an eg:personalName, but rdfs:domain does not describe
what is legal. The statement <P rdfs:domain C> just means
all things to which P is applied can be inferred to have class C.
You can see that if we tried to infer <eg:personalName rdfs:domain eg:Man>
then we would start concluding that anything with a name
was a man, which is not right - every Man can have a name but non-Man Persons
are also allowed to have names in this example.
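The forward direction of the inference can be illustrated with a few lines of plain Java. This is a toy model rather than Jena code: the class DomainInference, the method inferredTypes and the two maps are invented for the illustration, and each class is assumed to have at most one direct superclass.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Toy illustration of how rdfs:domain is used to infer types (not Jena code). */
final class DomainInference {

    /**
     * Returns the classes that can be inferred for any subject that has a value
     * for the given property: the declared domain plus all of its superclasses.
     */
    static Set<String> inferredTypes(String property,
                                     Map<String, String> domainOf,
                                     Map<String, String> superClassOf) {
        Set<String> types = new LinkedHashSet<>();
        String cls = domainOf.get(property);
        // Walk up the subclass chain; add() returning false guards against cycles
        while (cls != null && types.add(cls)) {
            cls = superClassOf.get(cls);
        }
        return types;
    }

    public static void main(String[] args) {
        Map<String, String> domainOf = new HashMap<>();
        domainOf.put("eg:personalName", "eg:Person");

        Map<String, String> superClassOf = new HashMap<>();
        superClassOf.put("eg:Man", "eg:Person");
        superClassOf.put("eg:Person", "eg:Animal");

        // Prints [eg:Person, eg:Animal]; eg:Man is never inferred
        System.out.println(inferredTypes("eg:personalName", domainOf, superClassOf));
    }
}
```

Anything with an eg:personalName is concluded to be an eg:Person and therefore an eg:Animal, but nothing lets us conclude it is an eg:Man. That matches the example above: the inference moves up the class hierarchy from the declared domain, never down.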
Q: Why do I get a warning: Creating OWL rule reasoner working over another OWL rule reasoner?
A: If you create an inference graph explicitly from an OWL reasoner or implicitly
(by using OntModelSpec.OWL_*_RULE) then it is best if the argument models
(data and schema) are plain models. It is easy to accidentally misuse the API and
create an inference model working over the results of another inference model.
This is a redundancy which significantly affects performance to no useful effect.
To help detect this situation we have added a warning message. The best way
to stop the message is to change your model construction code so that only the
final InfModel/OntModel is specified to use OWL inference. If this is not
appropriate for some reason you can disable the check and warning messages
using the global flag com.hp.hpl.jena.shared.impl.JenaParameters.enableOWLRuleOverOWLRuleWarnings.
Q: Why do I get out of memory errors when working with the wine ontology?
A: The wine/food ontology is specifically designed to exercise all OWL/DL constructs. The
Jena rule-based reasoner only supports the OWL/Lite subset of OWL/Full and has scaling
problems with some of the constructs used in the wine ontology. If you need full
reasoning support for the wine (or similar) ontologies then use a full DL reasoner
such as Pellet, which can be accessed via the DIG interface or directly using the
Pellet-provided OntModelSpec. If you only need to do things like traverse the class
hierarchy, and inference over RDFS plus OWL property relations is enough for you, then the
OWL micro reasoner may be an option.
Q: What causes the error "java.lang.UnsupportedOperationException: this is not a URI node"
in the DIG reasoner?
A: This is a known problem with the Jena 2.1 release. Please get the latest version of Jena (a later
release, if there is one, or get the sources from CVS and
build a new copy of jena.jar).
Q: I want to use my own custom rules to extend an existing RDFS or OWL
Schema, what do I do?
A: The easiest way to do this is to define your rule set. You can use the
@include directive at the top of your rules to include the RDFS (or OWL) rules
first. Then create a GenericRuleReasoner
which you can use to build an InfModel such as an
OntModel (by attaching the reasoner to your OntModelSpec).
See GenericRuleReasoner configuration for an example of how to parse custom rules.
Some important guidelines:
- if you're using OWL rules: setOWLTranslation(true) on the reasoner
- if you're using RDFS or OWLMicro: setTransitiveClosureCaching(true)
- make your own rules backwards unless you know what you are doing
You may only use backward rules in this configuration because the RDFS and OWL rules use a mix of forward and backward chaining and the rule system architecture is a pure dataflow - the forward rules don't call the backward rules. Thus any forward rules will only see those parts of the RDFS/OWL inferences which are computed forwards. Rather than having to be familiar with those details, it is easiest to simply write your own rules as backward ones.
An alternative is to use a layered architecture - build your generic rule InfModel on top of a separate RDFS/OWL InfModel. That has higher overhead but then your own rules are unrestricted.
Ontology API
Q: Why doesn't listClasses() (or listProperties()/listIndividuals(), etc) work?
A: It does work. Extensive unit tests are used to check the correctness of Jena,
and are included in the downloaded source code for your reference. If listClasses(),
or a similar method, is not producing the answers you expect, or no answers
at all, you should first check that your model is correctly defined. Print a
copy of your model as a debug step, to see if the URIs match up (e.g., if you
are expecting resource x to be an individual of class Y, check that the rdf:type
of x is the same as the URI of the class declaration for Y). A common problem
is that relative URIs change depending on where you read the model from. Try adding
an xml:base declaration to the document to ensure that URIs are
correctly specified.
Q: Why doesn't listProperties() return any results when listObjectProperties()
(or listDatatypeProperties()) does?
A: The method OntModel.listProperties() returns those resources from
the OntModel with rdf:type rdf:Property. Under the OWL semantic theory, this is true
of owl:ObjectProperty since ObjectProperty is a sub-class of Property. However, unless
an OWL reasoner is used with an OWL model (or a DAML reasoner with a DAML model, etc), this inferred
rdf:type statement is not visible. Therefore, with no reasoner, the OntModel
cannot tell that an ObjectProperty is a Property. The solution is to construct the OntModel
with an appropriate reasoner. If, for some application reason, using a reasoner is not possible
then users should be prepared to list the various property types separately. Note also the
next question.
Q: Why are my transitive properties (or symmetric properties or inverse functional
properties) missing when I call listObjectProperties()?
A: This is essentially the same problem as the
previous FAQ. Without an OWL reasoner, the model cannot tell that an owl:TransitiveProperty
is also an owl:ObjectProperty and an RDF property. The same solution advice
applies as with the previous question.
Q: Why does .as( OntProperty.class ) fail with ConversionException
on SymmetricProperty (or other property types)?
A: This is a slightly tricky issue. Internally, .as() calls the supports check,
which tests whether the node that is being converted is a common flavour of property.
Strictly, the only necessary test should be 'has rdf:type rdf:Property',
because that is entailed by all of the other property types. However, that requires
the user to use a model with a reasoner, and some don't want to (for good reasons, e.g. building an editor).
The other position is to test for all the possible variants of property: object property,
datatype property, annotation, ontology, transitive, functional, inverse functional, etc etc.
The problem with this is that it duplicates the work of the reasoner, and my expectation was that
most people would be running with a reasoner. Thus my code would be duplicating the functionality
of the reasoner, which is bad design. The compromise solution was to make the supports check test
for the common (top level) property types. Users who aren't using the reasoner
can either test explicitly for the other property types they expect to encounter (e.g. SymmetricProperty),
or turn off the supports check by setting
strict mode to false on the model.
Q: Why doesn't the ontology API handle sub-class (or sub-property, domain, range, etc)
relationships in a DAML model?
A: These relationships are handled correctly, but the results you see are dependent on the
model configuration. The DAML specification includes a number of aliases for RDFS constructs to
copy them into the DAML+OIL namespace. This means that, for a DAML processor, daml:subClassOf
and rdfs:subClassOf are equivalent. This is declared by means of a
daml:samePropertyAs in the daml+oil.daml specification document. Without a reasoner
attached to the model, the ontology API will not recognise the equivalence with rdfs: properties.
Thus, if you are not seeing the expected results when processing a DAML ontology,
it is likely that your ontology file contains, for example,
<daml:Class rdf:ID="A"> <rdfs:subClassOf rdf:resource="B" /> ...
To fix this, either ensure that the ontology consistently uses daml: relationships,
or declare the ontology model with the DAML micro rule-reasoner:
OntModel m = ModelFactory.createOntologyModel( OntModelSpec.DAML_MEM_RULE_INF, null );
Q: I don't understand very clearly the difference between
the various OntModel model profiles.
A: OK, here's how it works. The ontology API is designed to provide a single
set of convenient programming abstractions for a Jena model that contains
an ontology in either RDFS, DAML or (the various flavours of) OWL.
These languages are structurally similar, but differ in detail.
So, a class is declared variously as owl:Class, rdfs:Class and daml:Class.
Hence one role of the ont model profiles (i.e. OntModelSpec objects) is to specify the
details of which syntax is being used.
Second, ontology models can be composed of many sub-models when an ontology
imports another ontology. These sub-models have to be stored
somewhere, perhaps in memory or in a database. The profile contains a
ModelMaker, which provides the OntModel with new sub-models on demand,
to contain the imported ontology documents.
Third, ontologies can be made richer by including the entailments of the ontology assertions, given the semantics of the language. To do this, you need a reasoner. Since Jena provides an open, extensible architecture for adding reasoners, and some built-in pre-defined reasoners, the model profile specifies which reasoner, if any, that model will use.
These are the main components of an OntModelSpec. You can construct each
of these elements independently, programmatically or with RDF,
but we have anticipated some common choices.
So we provide some
built-in standard profiles. These have names like OWL_MEM or RDFS_MEM_RDFS_INF.
The first component of the name is the syntax (OWL, RDFS etc). The second
component is the model-maker strategy (MEM means in-memory models). The
third component, which may be absent, specifies the reasoner.
OWL_MEM has no reasoner; RDFS_MEM_TRANS_INF uses a
simple reasoner that computes transitive closure on the class and property hierarchies, but nothing else.
Database and persistence
Q: Why do I get an exception when trying to create a new persistent model?
A: If the exception has to do with the database lock, see the question on
locking. Otherwise, assuming that your program uses correct methods to create the model (see
examples in the database How To Create Persistent Models), it may be that
your database files are corrupted. Jena2 does not do a good job in
checking the validity of the database. It makes a cursory check that some
required tables exist but does not check that the tables contain valid data. If
you suspect your database has been corrupted, you may invoke
cleanDB() on a DBConnection object
prior to creating your model. This removes all Jena2 tables from a database.
Warning: this removes any other existing Jena2 models from the database so make sure
that this is what you want to do.
Q: Why do I run out of memory when trying to list statements in a persistent
model?
A: Jena2 uses the JDBC interface for accessing databases. The JDBC
specification has no cursors. Consequently, when a query is processed by JDBC,
the entire result set is returned from the database at once and the application program then iterates
over the in-memory result set. If the result set is large, as is often the case
when listing all statements of a large model, it may exceed the heap size of
the Java virtual machine. If you suspect this is happening, you might try to
increase the heap size of the Java virtual machine (-vmargs -Xmx500M
for a 500 MB heap size). If this does not help, there is no
other work-around and the program should be recoded.
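When experimenting with heap settings, it can help to confirm that the setting actually reached the JVM running your program. The sketch below uses only the standard Runtime API; the class name HeapInfo and the method maxHeapMegabytes are hypothetical names chosen for this illustration.

```java
/** Prints the maximum heap size the running JVM is willing to use. */
final class HeapInfo {

    /** Maximum heap in megabytes, as reported by the JVM. */
    static long maxHeapMegabytes() {
        // maxMemory() reflects the effective -Xmx limit (or the JVM default)
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Maximum heap: " + maxHeapMegabytes() + " MB");
    }
}
```

Run it with the same JVM options as your application. The reported value may differ slightly from the requested figure, since the JVM can round the limit or exclude some internal regions.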
Q: Has Jena2 persistence been ported to other database engines and platforms
besides those officially supported?
A: The Jena team supports Jena2 persistence on the databases and operating
systems listed in the Database documentation. These
include MySQL, HSQLDB, PostgreSQL, Oracle, SQL Server.
Other users have had success porting Jena2 to other databases and platforms.
Jena2 has been ported to IBM's DB2 database. Contact
Liang-Jie Zhang for details.
Q: Is there a limit on the number of models in a database?
A: The limit depends on the Jena database (schema) configuration and the
database engine (MySQL, PostgreSQL, Oracle, etc). Recall that a Jena model may
either be stored separately in its own database tables (the default) or,
alternatively, in tables that are shared with other models (see
StoreWithModel in the options for
persistent models). Also, a Jena model is identified internally by a 32 bit
integer. Consequently the maximum number of models is limited either by the
maximum number of tables allowed in a database (which depends on the database
engine) or by the maximum value of a 32 bit integer, i.e., 2G.
Q: Why am I getting an exception on failure to lock or unlock the database?
A: The Jena2 storage subsystem uses a lock internally to
implement a critical section for operations that modify the database structure
(create/delete tables). The lock is implemented as a database table, i.e., if
the table exists in the database, the lock is held. Normally, this lock
should be transparent to applications. But if an application has an exception
while in a critical section, the database may remain locked for subsequent
applications. In this case, a user must manually unlock the database either by
calling DriverRDB.unlockDB() or by deleting the table (Jena_Mutex) from the database.
Q: How do I access the Jena database tables?
A: The Jena2 database tables are not intended for direct access
by Jena users or applications. The database tables are created, deleted and manipulated
through the Jena API methods. For example, creating a database model may cause tables
to be added to the database. So, the user need not directly view or access the Jena
database. Also, Jena encodes RDF statements, resources and literals in a way that
makes them difficult to view or query using conventional (SQL) database tools. Users who
are interested in the Jena2 database structure and value encoding can find details
in the layout documentation.
XML serialisation (reading and writing)
Q: Why does my output use <rdf:Description ...> when I want output like <owl:Class ...>?
A: This is the raw form of the RDF serialisation into XML. In terms of RDF's
information model, it expresses the same semantics as the compressed form. So the following
fragments are equivalent (in RDF terms):
<rdf:Description rdf:about="#foo">
  <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#Class" />
</rdf:Description>

<owl:Class rdf:about="#foo" />
However, the second is considerably easier for human readers to read, and tends to
be the form most people come across when reading OWL or DAML ontologies, for example. The default
output format is RDF/XML; the abbreviated format is RDF/XML-ABBREV.
To change from the default output style, pass the required output format to the
Model.write method:
yourModel.write( yourOutputStream, "RDF/XML-ABBREV" );
Note that the abbreviated form requires the writer to do much more work (multiple passes are needed over the RDF model, to see which abbreviation rules can apply). Hence it may be inappropriate for large models. In particular, the abbreviated form is not recommended for serialising large models from a persistent database to RDF XML. More details on controlling the precise behaviour of the writer, including turning on and off abbreviation rules, are in the I/O howto.
Q: Why does my XML output contain strange prefixes j.0, j.1, etc?
A: XML's namespace mechanism
is used in serialised RDF/XML to make legal XML element names from URIs. XML element names are not
permitted to contain certain characters, many of which are required when making URIs. For example,
http://example.com/test#SomeClass is not a legal element name. We can make the
name XML-legal by ensuring all of the non-ncname characters (ncname denotes
characters that can form legal XML element names) appear in the namespace URI bound to an XML namespace prefix. So,
<http://example.com/test#TestClass> is not legal, but
<ns:TestClass xmlns:ns="http://example.com/test#"> is legal.
Jena's XML writer will add xmlns prefixes as necessary to make your XML output
conform to the rules of correct XML. This may mean creating new prefix names. Jena's
convention is to name these new prefixes j.0, j.1, etc.
If you want these prefixes to have more meaningful names, before you write the model
call setNsPrefix on your model to assign your preferred prefix to a given URI.
SPARQL and query processing
See the SPARQL Tutorial
Q: How do I test substrings of literals?
A: SPARQL provides REGEX.
See the ARQ FAQ for details.