So you've got Eyeball installed and you've run it on one of your files, and Eyeball doesn't like it. You're not sure why, or what to do about it. Here's what's going on.
Eyeball inspects your model against a set of schemas. The default set of schemas includes RDF, RDFS, the XSD datatypes, and any models your model imports: you can add additional schemas from the command line or configuration file. Eyeball uses those schemas to work out what URIs count as "declared" in advance. It also checks URIs and literals for syntactic correctness and name space prefixes for being "sensible". Let's look at some of the messages you can get.
You'll probably find several messages like this:
predicate not declared in any schema: somePredicateURIEyeball treats the imported models, and (independently) the specified schemas, as single OntModels, and extracts those OntModels' properties. It includes the RDF and RDFS schemas. Anything used as a predicate that isn't one of those properties is reported.
If you're using OWL, you can silence the "undeclared property" messages about OWL properties by adding to your Eyeball command line the option:
-assume owl
Eyeball will read the OWL schema (it has a copy stashed away in the mirror directory) and add the declared properties to its known list. This works for any filename or URL you like, so long as there's RDF there and it has a suitable file suffix - .n3 for N3 or .rdf or .owl for RDF/XML - and for the built-in names dc (basic Dublin Core), dcterms (Dublin Core terms) and dc-all (both). So you can construct your own schemas, which declare your own domain-specific property declarations, and invoke Eyeball with
-assume owl mySchemaFile.n3 otherSchemaFile.rdf
You can give short names (like dc and rdfs) to your own schemas, or collections of schemas, using an Eyeball config file, but you'll have to see the manual to find out how.
You may see messages like this:
class not declared in any schema: someClassURI
Having read the previous section, you can probably work out what's going on: Eyeball reads the schemas (and imports) and extracts the declared OntClasses. Then anything used as a class that isn't one of those declared classes is reported..
And that's exactly it. "Used as a class" means appearing as C or D in any statement of the form:
It may be that you're not interested in the "unknown predicate" or "unknown class" reports until you've sorted out the URIs. Or maybe you don't care about them. In that case, you can switch them off.
Eyeball's different checks are carried out by inspector
classes. These classes are given short names by entries in
Eyeball config files (which are RDF files written using N3; you
can see the default config file by looking in Eyeball's
etc
directory for eyeball2-config.n3
).
By adding eg:
-exclude property class
to the Eyeball command line, you can exclude the inspectors with those short names from the check. property is the short name for the "unknown property" inspector, and class is the short name for the "unknown class" inspector.
bad namespace URI: "file:some-filename" on prefix: "pqr" for reason: file URI inappropriate for namespaceA "bad namespace URI" means that Eyeball doesn't like the URI for a namespace in the model. The "on prefix" part of the report says what the namespace prefix is, and the "for reason" part gives the reason. In this case, we (the designer of Eyeball) feel that it is unwise to use file URIs - which tend to depend on internal details of your directory structure - for global concepts.
A more usual reason is that the URI is syntactically illegal. Here are some possibilities:
URI contains spaces | literal spaces are not legal in URIs. This usually arises from file URIs when the file has a space in its name. Spaces in URIs have to be encoded. |
URI has no scheme | The URI has no scheme at all. This usually happens when some relative URI hasn't been resolved properly, eg there's no xml base in an RDF/XML document. |
URI has an unrecognised scheme | The scheme part of the URI - the bit before the first colon - isn't
recognised. Eyeball knows, by default, four schemes:
http, ftp, file, and urn. This
usually arises when a QName has "escaped" from somewhere, or from
a typo.
You can tell Eyeball about other schemes, if you need them. |
scheme should be lower-case | The scheme part of the URI contains uppercase letters. While this is not actually wrong, it is unconventional and pointless. |
URI doesn't fit pattern | Eyeball has some (weak) checks on the syntax of URIs in different schemes, expressed as patterns in its config files. If a URI doesn't match the pattern, Eyeball reports this problem. At the moment, you'll only get this report for a urn URI like urn:x-hp:23487682347 where the URN id (the bit between the first and second colons, here x-hp) is illegal. |
URI syntax error | A catch-all error: Java couldn't make any sense of this URI at all. |
Eyeball checks literals (using the literal inspector, whose short name is literal if you want to switch it off), but the checking is quite weak because it doesn't understand types at the moment. You can get two different classes of error.
bad language: someLanguageCode on literal: theLiteralInQuestion
Literals with language codes (things like en-UK or de) are checked to make sure that the language code conforms to the general syntax for language codes: alphanumeric words separated by hyphens, with the first containing no digits.
(Later versions of Eyeball will likely allow you to specify which language codes you want to permit in your models. But we haven't got there yet.)
bad datatype URI: someURI on literal: theLiteralInQuestion for reason: theReason
Similarly, literals with datatypes are checked to make sure that the type URI is legal. That's it for the moment: Eyeball doesn't try to find out if the URI really is a type URI, or if the spelling of the literal is OK for that type. But it spots the bad URIs. (The messages are the same as those that appear in the URI checking - above - for the very good reason that it's the same code doing the checking.)
Both RDF/XML and N3 allow (and RDF/XML requires) namespaces to be abbreviated by prefixes. Eyeball checks prefixes for two possible problems. The first:
non-standard namespace for prefix
This arises when a "standard" prefix has been bound to a namespace
URI which isn't its usual one. The "standard" prefixes are taken from
Jena's PrefixMapping.Extended
and are currently:
rdf, rdfs, daml, owl, xsd, rss, vcard
And the second:
Jena generated prefix found
This arises when the model contains prefixes of the form j.N
,
where N is a number. These are generated by Jena when writing RDF/XML
for URIs that must have a prefix (because they are used as types or
predicates) but haven't been given one.
If you're not bothered about inventing short prefixes for your
namespaces, you can -exclude jena-prefix
to suppress this inspection.
The reports described so far are part of Eyeball's default set of inspections. There are some other checks that it can do that are switched off by default, because they are expensive, initially overwhelming, or downright obscure. If you need to add these checks to your eyeballing, this is how to do it.
Some applications (or a general notion of cleanliness) require
that every individual in an RDF model has an explicit
rdf:type
. The Eyeball check for this isn't
enabled by default, because lots of casual RDF use doesn't
need it, and more sophisticated use has models with enough
inference power to infer types.
You can add the all-typed inspector to the inspectors that Eyeball will run by adding to the command line:
-inspectors defaultInspectors all-typed
The all-typed inspector will generate a message
resource has no rdf:type
for each resource in the model which is not the subject of
an rdf:type
statement.
One easy mistake to make in RDF is to make an assertion - we'll call it S P O - about some subject S which is "of the wrong type", that is, not of whatever type P's domain is. This isn't, in principle, an error, since RDF resources can have multiple types, and this just makes S have a type which is a subtype of both P's domain and whatever type it was supposed to have.
To spot this, and related problems, Eyeball has the consistent-type inspector. You can add it to the inspections in the same way as the all-typed inspector:
-inspectors defaultInspectors consistent-type
It checks that every resource which has been given at least one type has a type which is a subtype of all its types, under an additional assumption:
Types in the type graph (the network of rdfs:subClassOf statements) are disjoint (share no instances) unless the type graph says they're not.
For example, suppose that both A and B are subclasses of Top, ande that there are no other subclass relationships. Then consistent-types assumes that there are (supposed to be) no resources which have both A and B as types. If it finds a resource X which does have both types, it generates a message like this:
no consistent type for: X has associated type: A has associated type: B has associated type: Top
It's up to you to disentangle the types and work out what went wrong.
Note: this test requires that Eyeball do a significant amount of inference, to complete the type hierarchy and check the domains and ranges of proeprties. It's quite slow, which is one reason it isn't switched on by default.
You want to make sure that your data has the right properties
for things of a certain type: say, that a book has at least one
author (or editor), an album has at least one track, nobody in
your organisation has more than ten managers, a Jena contrib
has at least a dc:creator
, a dc:name
,
and a dc:description
. You write some OWL
cardinality constraints:
my:Type rdfs:subClassOf [owl:onProperty my:track; owl:minCardinality 1]Then you discover that, for wildly technical reasons, the OWL validation code in Jena doesn't think it's an error for some album to have no tracks (maybe there's a namespace error).
You can enable Eyeball's cardinality inspector by adding
-inspectors cardinalityto the command line. You'll now get a report item for every resource that has
rdf:type
your restricted type (my:Type
above) but doesn't have the right (at least one) value for the property.
It will look something like:
cardinality failure for: my:Instance on type: my:Type on property: my:track cardinality range: [min: 1] number of values: 0 values: {}If there are some values for the property - say you've supplied an
owl:maxCardinality
restriction and then gone
over the top - they get listed inside the values
curly braces.