Tag Service

Table of Contents

18.1. Introduction
18.2. Features
18.3. Architecture

18.1. Introduction

The tag service provides the backbone of the tagging feature. Tags are keywords applied as metadata to documents, reflecting a user's opinion of that document. Tags either categorize the content of the document (labels such as "document management", "ECM", or "complex Web application" can be thought of as tags for Nuxeo) or reflect the user's impression ("great", "user friendly", "versatile", etc.).

18.2. Features

The tag service lets you:

  • create tags

  • retrieve tags

  • apply a tag to a document

  • list tags applied to a document

  • list documents tagged with a label

  • remove the link between a tag and a document

  • retrieve the popular clouds

  • retrieve the vote clouds

The service is also available as a remote service: its EJB interface allows it to be acquired over the network.

18.3. Architecture

A few types of objects are defined for working with the tag service.

"Tag" is a new document type that follows the standard Nuxeo document approach. It uses the usual schemas (dublincore, common) plus a specific one containing the label and the private flag. Tags can be stored anywhere, or in a dedicated root tag folder. Tags are folderish, so they can be nested one under another, which makes it possible to create categories of tags.

"Tagging" is an entity stored in a new table called TAGGING. This is essentially a link table that stores the id of the tag document, the id of the target document (the document to which the tag was applied), the owner of the tagging (the user who established the link), and the private flag.

The owner of the tagging makes it possible to select the taggings created by a specific user, so deletion of a tagging can be restricted to the user who actually owns it. In that case a user cannot delete a tagging owned by someone else (administrators, of course, still can). This is a higher-level application decision; the tag service only makes such an approach possible.
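The ownership rule described above can be illustrated with a small in-memory model. This is only a sketch: the class TaggingOwnershipSketch, its field names, and the canRemove and removeTagging helpers are hypothetical and are not part of the tag service API. The in-memory list stands in for the TAGGING table.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative model only; these names are assumptions, not the Nuxeo tag service API.
final class TaggingOwnershipSketch {

    // Mirrors the columns described for the TAGGING link table.
    static final class Tagging {
        final String tagId;      // id of the tag document
        final String targetId;   // id of the document the tag was applied to
        final String owner;      // user who established the link
        final boolean isPrivate; // private flag

        Tagging(String tagId, String targetId, String owner, boolean isPrivate) {
            this.tagId = tagId;
            this.targetId = targetId;
            this.owner = owner;
            this.isPrivate = isPrivate;
        }
    }

    // Application-level policy: only the owner or an administrator may remove a tagging.
    static boolean canRemove(Tagging tagging, String user, boolean isAdministrator) {
        return isAdministrator || tagging.owner.equals(user);
    }

    // Removes the link from the table if the policy allows it; returns true on removal.
    static boolean removeTagging(List<Tagging> table, Tagging tagging,
                                 String user, boolean isAdministrator) {
        if (!canRemove(tagging, user, isAdministrator)) {
            return false;
        }
        return table.remove(tagging);
    }

    public static void main(String[] args) {
        List<Tagging> table = new ArrayList<>();
        Tagging tagging = new Tagging("tag-1", "doc-1", "alice", false);
        table.add(tagging);
        System.out.println(removeTagging(table, tagging, "bob", false));   // false
        System.out.println(removeTagging(table, tagging, "alice", false)); // true
    }
}
```

Keeping the policy check in a separate method reflects the point made above: the stored owner only makes the rule possible, and the application decides whether to enforce it.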

The API exposes tags as a DTO containing the label and id of the tag. WeightedTag extends Tag to carry the weight of the tag in the requested cloud. Clouds are provided as simple lists of WeightedTags. The service computes two types of clouds: the vote cloud and the popular cloud.
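The shape of these DTOs might look roughly like the sketch below. Only the names Tag and WeightedTag and the notions of id, label, and weight come from the description above. The field names, constructors, the int type of the weight, and the byDescendingWeight helper are assumptions made for illustration, and the real classes may differ.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Illustrative sketch; field names, constructors and the helper are assumptions.
final class TagDtoSketch {

    // DTO carrying the id and label of a tag.
    static class Tag {
        final String id;
        final String label;

        Tag(String id, String label) {
            this.id = id;
            this.label = label;
        }
    }

    // Adds the weight of the tag within a requested cloud.
    static class WeightedTag extends Tag {
        final int weight;

        WeightedTag(String id, String label, int weight) {
            super(id, label);
            this.weight = weight;
        }
    }

    // Returns a copy of the cloud ordered from the heaviest to the lightest tag.
    static List<WeightedTag> byDescendingWeight(List<WeightedTag> cloud) {
        List<WeightedTag> sorted = new ArrayList<>(cloud);
        sorted.sort(Comparator.comparingInt((WeightedTag t) -> t.weight).reversed());
        return sorted;
    }
}
```

Because a cloud is just a list, a client can sort or truncate it before rendering, for example to show only the heaviest tags.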

A cloud is the visual representation of the distribution of tags across a domain. A domain can be anything, from a single document to a workspace or even an entire repository. Usually two types of clouds are defined: "vote" and "popularity". The first counts how many times a tag was applied to a document by different users (votes), while the second counts how many documents in a particular domain were tagged with a particular tag, and so measures the popularity of the tag in that domain.

Consider an example: the domain WorkspaceA contains two documents, Doc1 and Doc2. The tag tagX is applied by 3 different users to Doc1, tagY is applied by 5 different users to Doc2, and tagZ is applied once to Doc1 and once to Doc2. In addition, tagX is applied twice to WorkspaceA itself. The tag clouds would be:

  • "vote" on Doc1: tagX - 3, tagZ - 1

  • "popularity" on Doc1: tagX - 1, tagZ - 1

  • "vote" on Doc2: tagY - 5, tagZ - 1

  • "popularity" on Doc2: tagY - 1, tagZ - 1

  • "vote" on WorkspaceA: tagX - 2

  • "popularity" on WorkspaceA: tagX - 2, tagZ - 2, tagY - 1
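The figures in this example can be reproduced with a small in-memory computation. The sketch below is not the service's implementation, which works against the database. The class TagCloudSketch, its methods, and the user names are hypothetical. It assumes, as the numbers above suggest, that a vote cloud counts only the taggings applied directly to the document, and that a popularity cloud counts the distinct documents in the domain, including the domain root itself, that carry each tag.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Illustrative model only; none of these names come from the Nuxeo tag service API.
final class TagCloudSketch {

    // One link row: which user applied which tag to which document.
    static final class Tagging {
        final String tag;
        final String docId;
        final String user;

        Tagging(String tag, String docId, String user) {
            this.tag = tag;
            this.docId = docId;
            this.user = user;
        }
    }

    // "Vote" cloud: number of taggings applied directly to the document, per tag.
    static Map<String, Integer> voteCloud(List<Tagging> taggings, String docId) {
        Map<String, Integer> cloud = new TreeMap<>();
        for (Tagging t : taggings) {
            if (t.docId.equals(docId)) {
                cloud.merge(t.tag, 1, Integer::sum);
            }
        }
        return cloud;
    }

    // "Popularity" cloud: number of distinct documents in the domain carrying each tag.
    static Map<String, Integer> popularityCloud(List<Tagging> taggings, Set<String> domainDocIds) {
        Map<String, Set<String>> docsPerTag = new TreeMap<>();
        for (Tagging t : taggings) {
            if (domainDocIds.contains(t.docId)) {
                docsPerTag.computeIfAbsent(t.tag, k -> new HashSet<>()).add(t.docId);
            }
        }
        Map<String, Integer> cloud = new TreeMap<>();
        docsPerTag.forEach((tag, docs) -> cloud.put(tag, docs.size()));
        return cloud;
    }

    // Taggings from the WorkspaceA example; the user names are made up.
    static List<Tagging> exampleTaggings() {
        List<Tagging> rows = new ArrayList<>();
        for (int i = 1; i <= 3; i++) rows.add(new Tagging("tagX", "Doc1", "user" + i));
        for (int i = 1; i <= 5; i++) rows.add(new Tagging("tagY", "Doc2", "user" + i));
        rows.add(new Tagging("tagZ", "Doc1", "user4"));
        rows.add(new Tagging("tagZ", "Doc2", "user6"));
        rows.add(new Tagging("tagX", "WorkspaceA", "user1"));
        rows.add(new Tagging("tagX", "WorkspaceA", "user2"));
        return rows;
    }

    public static void main(String[] args) {
        List<Tagging> rows = exampleTaggings();
        Set<String> workspaceA = new HashSet<>(Arrays.asList("WorkspaceA", "Doc1", "Doc2"));
        System.out.println(voteCloud(rows, "Doc1"));            // {tagX=3, tagZ=1}
        System.out.println(voteCloud(rows, "WorkspaceA"));      // {tagX=2}
        System.out.println(popularityCloud(rows, workspaceA));  // {tagX=2, tagY=1, tagZ=2}
    }
}
```

Under these assumptions the "popularity" count of tagX on WorkspaceA is 2 because the tag appears on two distinct items in the domain (Doc1 and WorkspaceA itself), regardless of how many users applied it.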

There is a third, less used kind of tag cloud: the number of times the tag appears in the content of an item. This would be harder to implement (the content needs to be interpreted) and is apparently less used. Indeed, applying a tag such as "interesting" or "misleading" does not require these terms to appear in the article.

The underlying DB operations are performed through JPA, directly accessing the default Nuxeo DB repository. This approach was selected for performance and usability. The configuration of the DB connector has to be supplied through the property file tagservice-db.properties.

Note

The properties must match the default Nuxeo repository configuration exactly.