Compass's various settings have been logically grouped in the following section, with a short description of each setting. Note: the only mandatory setting is the index file location specified in compass.engine.connection.
Note that configuring Compass is simpler when using a schema based configuration file. At its core, however, all of Compass configuration is driven by the following settings. You can configure Compass using settings alone (either programmatically or using the DTD based Compass configuration).
Sets the search engine index connection string.
Table A.1.
Connection | Description |
---|---|
file:// prefix or no prefix | The path to the file system based index, using default file handling. This is a JVM level setting for all the file based prefixes. |
mmap:// prefix | Uses the Java 1.4 NIO MMap class. Considered slower than the general file system one, but might have memory benefits (according to the Lucene documentation). This is a JVM level setting for all the file based prefixes. |
ram:// prefix | Creates a memory based index that follows the Compass life-cycle. Created when the Compass instance is created, and disposed of when Compass is closed. |
jdbc:// prefix | Holds the Jdbc url or Jndi name (based on the DataSourceProvider configured). Allows storing the index within a database. This setting requires additional mandatory settings; please refer to the Search Engine Jdbc section. It is very IMPORTANT to read the Search Engine Jdbc section, especially in terms of performance considerations. |
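As an illustration of the prefixes in Table A.1, the following minimal sketch classifies a connection string by its prefix and extracts the value that follows it. The class ConnectionPrefixSketch and its methods are hypothetical helpers and not part of Compass; only the setting name compass.engine.connection and the prefixes come from the table. The sample paths and url used with it are placeholders.

```java
import java.util.Properties;

// Hypothetical helper for illustration only; not part of the Compass API.
final class ConnectionPrefixSketch {

    private static final String[] PREFIXES = {"file://", "mmap://", "ram://", "jdbc://"};

    // Returns "file", "mmap", "ram" or "jdbc" depending on the connection prefix.
    // A connection string without a known prefix is treated as a file system index.
    static String storeKind(String connection) {
        for (String prefix : PREFIXES) {
            if (connection.startsWith(prefix)) {
                return prefix.substring(0, prefix.length() - "://".length());
            }
        }
        return "file";
    }

    // Returns the value after the prefix: a path, or a Jdbc url / Jndi name.
    static String location(String connection) {
        for (String prefix : PREFIXES) {
            if (connection.startsWith(prefix)) {
                return connection.substring(prefix.length());
            }
        }
        return connection;
    }

    // Wraps the connection string in the one mandatory setting.
    static Properties settingsFor(String connection) {
        Properties settings = new Properties();
        settings.setProperty("compass.engine.connection", connection);
        return settings;
    }
}
```

For example, storeKind("target/index") yields "file", while location("jdbc://jdbc:example:index") yields the part after the prefix, "jdbc:example:index".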
Controls Compass registration through JNDI, using Compass JNDI lookups.
Table A.2.
Setting | Description |
---|---|
compass.name | The name that Compass will be registered under. Note that you can also specify it in the XML configuration file with a name attribute on the compass element. |
compass.jndi.enable | Enables JNDI registration of Compass under the given name. Defaults to false. |
compass.jndi.class | JNDI initial context class, Context.INITIAL_CONTEXT_FACTORY. |
compass.jndi.url | JNDI provider URL, Context.PROVIDER_URL |
compass.jndi.* | prefix for arbitrary JNDI InitialContext properties |
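The mapping in Table A.2 between the two named JNDI settings and the standard javax.naming.Context constants can be sketched as follows. This is a minimal illustration, not Compass code: the class JndiSettingsSketch is hypothetical, it handles only compass.jndi.class and compass.jndi.url, and it makes no assumption about how Compass passes on the remaining compass.jndi.* properties.

```java
import java.util.Hashtable;
import java.util.Properties;
import javax.naming.Context;

// Hypothetical helper for illustration only; not part of the Compass API.
final class JndiSettingsSketch {

    // Builds an environment table, usable with javax.naming.InitialContext,
    // from the two JNDI settings that Table A.2 names explicitly.
    static Hashtable<String, String> toJndiEnvironment(Properties settings) {
        Hashtable<String, String> env = new Hashtable<>();
        String factory = settings.getProperty("compass.jndi.class");
        if (factory != null) {
            env.put(Context.INITIAL_CONTEXT_FACTORY, factory);
        }
        String url = settings.getProperty("compass.jndi.url");
        if (url != null) {
            env.put(Context.PROVIDER_URL, url);
        }
        return env;
    }
}
```

The resulting table could be passed to new InitialContext(env); the factory class and provider url values are whatever your JNDI provider requires.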
Controls Compass automatic properties, and property names.
Table A.3.
Setting | Description |
---|---|
compass.property.alias | The name of the "alias" property that Compass will use (a property that holds the alias property value of a resource). Defaults to alias (set it only if one of your mapped meta data is called alias). |
compass.property.extendedAlias | The name of the property under which the extended aliases (if any exist) of a given Resource will be stored. This allows for poly alias queries, where one can query on a "base" alias and get all the aliases that extend it. Defaults to extendedAlias (set it only if one of your mapped meta data is called extendedAlias). |
compass.property.all | The name of the "all" property that Compass will use (a property that accumulates all the properties/meta-data). Defaults to all (set it only if one of your mapped meta data is called all). Note that it can be overridden in the mapping files. |
compass.property.all.termVector (defaults to no) | The default setting for the term vector of the all property. Can be one of no, yes, positions, offsets, or positions_offsets. |
Compass supports several transaction isolation levels. More information about them can be found in the Search Engine chapter.
Table A.4.
Transaction Level | Description |
---|---|
none | Not supported, upgraded to read_committed. |
read_uncommitted | Not supported, upgraded to read_committed. |
read_committed | The same as read committed in database systems. As fast for read-only transactions. |
repeatable_read | Not supported, upgraded to serializable. |
serializable | The same as serializable in database systems. A performance killer, since it basically results in transactions being executed sequentially. |
lucene (batch_insert) | A special transaction level, lucene (previously known as batch_insert) isolation level is similar to the read_committed isolation level except dirty operations done during a transaction are not visible to get/load/find operations that occur within the same transaction. This isolation level is very handy for long running batch dirty operations and can be faster than read_committed. Most usage patterns of Compass (such as integration with ORM tools) can work perfectly well with the lucene isolation level. |
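The upgrade rules in Table A.4 can be summarized in a small lookup. The following is a minimal sketch with a hypothetical class name; it only restates the table (unsupported levels are upgraded, and batch_insert is the earlier name of lucene) and is not how Compass itself resolves the setting.

```java
// Hypothetical helper for illustration only; not part of the Compass API.
final class IsolationLevelSketch {

    // Maps a configured isolation level to the level that is effectively used,
    // following the "upgraded to" notes in Table A.4.
    static String effectiveLevel(String configured) {
        switch (configured) {
            case "none":
            case "read_uncommitted":
            case "read_committed":
                return "read_committed";
            case "repeatable_read":
            case "serializable":
                return "serializable";
            case "batch_insert": // previous name of the lucene level
            case "lucene":
                return "lucene";
            default:
                throw new IllegalArgumentException("Unknown isolation level: " + configured);
        }
    }
}
```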
Please read more about how Compass::Core implements its transaction management in the Search Engine section.
When using the Compass::Core transaction API, you must specify a factory class for the CompassTransaction instances. This is done by setting the property compass.transaction.factory. The CompassTransaction API hides the underlying transaction mechanism, allowing Compass::Core code to run in both managed and non-managed environments. The standard strategies are:
Table A.5.
Transaction Strategy | Description |
---|---|
org.compass.core.transaction.LocalTransactionFactory | Manages a local transaction which does not interact with other transaction mechanisms. |
org.compass.core.transaction.JTASyncTransactionFactory | Uses the JTA synchronization support to synchronize with the JTA transaction (not the same as two phase commit, meaning that if the transaction fails, the other resources that participate in the transaction will not roll back). If there is no existing JTA transaction, a new one will be started. |
org.compass.core.transaction.XATransactionFactory | Uses the JTA Transaction to enlist a Compass implemented XAResource. This allows Compass to participate in a two phase commit operation. Note that the JTA implementation should automatically delist the resource when the transaction commits or rolls back. If there is no existing JTA transaction, a new one will be started. |
An important configuration setting is compass.transaction.commitBeforeCompletion. It applies to transaction factories that use synchronization (like JTA and Spring). If set to true, the transaction will be committed in the beforeCompletion stage. It is very important to set it to true when using a jdbc based index storage, and to false otherwise. Defaults to false.
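The rule above (true for a jdbc based index, false otherwise) can be sketched as a small settings builder. The class TransactionSettingsSketch is a hypothetical helper, not part of Compass; the setting names and the factory class name are taken from this section.

```java
import java.util.Properties;

// Hypothetical helper for illustration only; not part of the Compass API.
final class TransactionSettingsSketch {

    // Builds transaction related settings and derives commitBeforeCompletion
    // from the index storage: true for jdbc:// connections, false otherwise.
    static Properties transactionSettings(String connection, String factoryClass) {
        Properties settings = new Properties();
        settings.setProperty("compass.engine.connection", connection);
        settings.setProperty("compass.transaction.factory", factoryClass);
        boolean jdbcStore = connection.startsWith("jdbc://");
        settings.setProperty("compass.transaction.commitBeforeCompletion", String.valueOf(jdbcStore));
        return settings;
    }
}
```

For example, transactionSettings("jdbc://jdbc:example:index", "org.compass.core.transaction.JTASyncTransactionFactory") sets commitBeforeCompletion to true, while a file system connection leaves it at false. The Jdbc url here is a placeholder.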
The J2EE specification does not provide a standard way to reference a JTA TransactionManager in order to register with a transaction synchronization service, so Compass provides several lookups, which can be set with the compass.transaction.managerLookup setting (thanks Hibernate!). The setting is not required, since Compass will try to auto-detect the JTA environment.
The following table lists them all:
Table A.6.
Transaction Manager Lookup | Application Server |
---|---|
org.compass.core.transaction.manager.JBoss | JBoss |
org.compass.core.transaction.manager.Weblogic | Weblogic |
org.compass.core.transaction.manager.WebSphere | WebSphere |
org.compass.core.transaction.manager.Orion | Orion |
org.compass.core.transaction.manager.JOTM | JOTM |
org.compass.core.transaction.manager.JOnAS | JOnAS |
org.compass.core.transaction.manager.JRun4 | JRun4 |
org.compass.core.transaction.manager.BEST | Borland ES |
The JTA transaction mechanism uses the JNDI configuration to look up the JTA UserTransaction. The transaction manager lookup provides the JNDI name, but if you wish to set it yourself, you can use the compass.transaction.userTransactionName setting. The UserTransaction is cached by default (fetched from JNDI on Compass startup); the caching can be controlled with compass.transaction.cacheUserTransaction.
Property accessors are used for reading and writing Class properties. Compass comes with two implementations, field and property. field is used for directly accessing the Class property, and property is used for accessing the class property using the property getters/setters. Compass allows for registration of custom PropertyAccessor implementations under a lookup name, as well as changing the default property accessor used (which is property).
The configuration uses Compass support for group properties, with the compass.propertyAccessor group prefix. The name the property accessor will be registered under is the group name. In order to set the default property accessor, the default group name should be used.
Custom implementations of PropertyAccessor can optionally implement the CompassConfigurable interface, which allows for additional settings to be injected into the implementations.
Table A.7. Property Accessor Settings
Setting | Description |
---|---|
compass.propertyAccessor.[property accessor name].type | The fully qualified class name of the property accessor. |
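To make the group settings convention concrete, the following minimal sketch splits keys of the form [group prefix].[group name].[setting] into one map per group name. GroupSettingsSketch is a hypothetical helper written for this illustration and says nothing about how Compass parses its settings internally; the accessor class name used with it is a placeholder.

```java
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

// Hypothetical helper for illustration only; not part of the Compass API.
final class GroupSettingsSketch {

    // Splits "prefix.[group name].[setting]" keys into group name -> (setting -> value).
    // The setting name keeps any further dots, e.g. "format.locale".
    static Map<String, Map<String, String>> groups(Properties settings, String prefix) {
        Map<String, Map<String, String>> result = new TreeMap<>();
        String dotted = prefix + ".";
        for (String key : settings.stringPropertyNames()) {
            if (!key.startsWith(dotted)) {
                continue; // belongs to another group prefix
            }
            String rest = key.substring(dotted.length());
            int dot = rest.indexOf('.');
            if (dot <= 0 || dot == rest.length() - 1) {
                continue; // missing group name or missing setting name
            }
            String group = rest.substring(0, dot);
            String setting = rest.substring(dot + 1);
            result.computeIfAbsent(group, g -> new TreeMap<>()).put(setting, settings.getProperty(key));
        }
        return result;
    }
}
```

With compass.propertyAccessor.default.type and compass.propertyAccessor.myAccessor.type both set, groups(settings, "compass.propertyAccessor") returns two groups, default and myAccessor, each holding a single type entry.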
Compass uses converters to convert all the different OSEM mappings into Resources. Compass comes with a set of default converters that should be sufficient for most applications, but it also provides the extensibility to define custom converters for all aspects related to marshaling Objects and Mappings (Compass internal mapping definitions) into a search engine.
Compass uses a registry of Converters. All Converters are registered under a registry name (converter lookup name). Compass registers all its default Converters under lookup names (which allows for changing the default converter settings), and allows for registration of custom Converters.
The following lists all the default Converters that come with Compass. The lookup name is the name the Converter will be registered under, the Converter class is the Compass implementation of the Converter, and the Converter Type acts as a shorthand string for the Converter implementation (it can be used with compass.converter.[converter name].type instead of the fully qualified class name). The default mapping converters are responsible for converting the meta-data mapping definitions.
Table A.8. Default Compass Converters
Java type | Lookup Name | Converter Class | Converter Type | Notes |
---|---|---|---|---|
java.lang.Boolean, boolean | boolean | org.compass.core.converter.simple.BooleanConverter | boolean | |
java.lang.Byte, byte | byte | org.compass.core.converter.simple.ByteConverter | byte | |
java.lang.Character, char | char | org.compass.core.converter.simple.CharConverter | char | |
java.lang.Float, float | float | org.compass.core.converter.simple.FloatConverter | float | Formattable converter |
java.lang.Double, double | double | org.compass.core.converter.simple.DoubleConverter | double | Formattable converter |
java.lang.Short, short | short | org.compass.core.converter.simple.ShortConverter | short | Formattable converter |
java.lang.Integer, int | int | org.compass.core.converter.simple.IntConverter | int | Formattable converter |
java.lang.Long, long | long | org.compass.core.converter.simple.LongConverter | long | Formattable converter |
java.util.Date | date | org.compass.core.converter.simple.DateConverter | date | Formattable converter, defaults to yyyy-MM-dd-HH-mm-ss-S-a |
java.util.Calendar | calendar | org.compass.core.converter.simple.CalendarConverter | calendar | Formattable converter, defaults to yyyy-MM-dd-HH-mm-ss-S-a |
java.lang.String | string | org.compass.core.converter.simple.StringConverter | string | |
java.lang.StringBuffer | stringbuffer | org.compass.core.converter.simple.StringBufferConverter | stringbuffer | |
java.math.BigDecimal | bigdecimal | org.compass.core.converter.simple.BigDecimalConverter | bigdecimal | |
java.math.BigInteger | biginteger | org.compass.core.converter.simple.BigIntegerConverter | biginteger | |
java.net.URL | url | org.compass.core.converter.simple.URLConverter | url | Uses the URL#toString |
java.io.File | file | org.compass.core.converter.extended.FileConverter | file | Uses the file name |
java.io.InputStream | inputstream | org.compass.core.converter.extended.InputStreamConverter | inputstream | Stores the content of the InputStream without performing any other search related operations. |
java.io.Reader | reader | org.compass.core.converter.extended.ReaderConverter | reader | |
java.util.Locale | locale | org.compass.core.converter.extended.LocaleConverter | locale | |
java.sql.Date | sqldate | org.compass.core.converter.extended.SqlDateConverter | sqldate | |
java.sql.Time | sqltime | org.compass.core.converter.extended.SqlTimeConverter | sqltime | |
java.sql.Timestamp | sqltimestamp | org.compass.core.converter.extended.SqlTimestampConverter | sqltimestamp | |
byte[] | primitivebytearray | org.compass.core.converter.extended.PrimitiveByteArrayConverter | primitivebytearray | Stores the content of the byte array without performing any other search related operations. |
Byte[] | objectbytearray | org.compass.core.converter.extended.ObjectByteArrayConverter | objectbytearray | Stores the content of the byte array without performing any other search related operations. |
Table A.9. Compass Mapping Converters
Mapping type | Lookup Name | Converter Class | Notes |
---|---|---|---|
org.compass.core.mapping.osem.ClassMapping | classMapping | org.compass.core.converter.mapping.ClassMappingConverter | |
org.compass.core.mapping.osem.ClassIdPropertyMapping | classIdPropertyMapping | org.compass.core.converter.mapping.ClassPropertyMappingConverter | |
org.compass.core.mapping.osem.ClassPropertyMapping | classPropertyMapping | org.compass.core.converter.mapping.ClassPropertyMappingConverter | |
org.compass.core.mapping.osem.ComponentMapping | componentMapping | org.compass.core.converter.mapping.ComponentMappingConverter | |
org.compass.core.mapping.osem.ReferenceMapping | referenceMapping | org.compass.core.converter.mapping.ReferenceMappingConverter | |
org.compass.core.mapping.osem.CollectionMapping | collectionMapping | org.compass.core.converter.mapping.CollectionMappingConverter | |
org.compass.core.mapping.osem.ArrayMapping | arrayMapping | org.compass.core.converter.mapping.ArrayMappingConverter | |
org.compass.core.mapping.osem.ConstantMapping | constantMapping | org.compass.core.converter.mapping.ConstantMappingConverter | |
org.compass.core.mapping.osem.ParentMapping | parentMapping | org.compass.core.converter.mapping.ParentMappingConverter | |
Defining a new converter can be done using Compass support for group settings. compass.converter is the prefix for the group. In order to define a new converter that will be registered under the "converter name" lookup, the compass.converter.[converter name] setting prefix should be used. The following lists all the settings that can apply to a converter definition.
Table A.10. Converter Settings
Setting | Description |
---|---|
compass.converter.[converter name].type | The type of the org.compass.core.converter.Converter implementation. Should be either the fully qualified class name or the Converter Type (the shorthand for the Compass internal converter classes, as listed in the Default Compass Converters table). |
compass.converter.[converter name].format | Applies to format-able converters. The format that will be used to format the data converted (see Java java.text.DecimalFormat and java.text.SimpleDateFormat). |
compass.converter.[converter name].format.locale | The Locale to be used when formatting. |
compass.converter.[converter name].format.minPoolSize | Compass pools the formatters for greater performance. The value of the minimum pool size. Defaults to 4. |
compass.converter.[converter name].format.maxPoolSize | Compass pools the formatters for greater performance. The value of the maximum pool size. Defaults to 20. |
Note that any other setting can be defined after the compass.converter.[converter name] prefix. If the Converter implements org.compass.core.config.CompassConfigurable, it will be injected with the settings for the converter. The converter will get all the settings, with the setting names stripped of the compass.converter.[converter name] prefix.
For example, defining a new Date converter with a specific format can be done with two settings: compass.converter.mydate.type=date (same as compass.converter.mydate.type=org.compass.core.converter.basic.DateConverter), and compass.converter.mydate.format=yyyy-MM-dd. The converter will be registered under the "mydate" converter lookup name. It can then be used as a lookup name in the OSEM definitions.
In order to change the default converters, simply define a setting with the [converter name] of the default converter that comes with Compass. For example, in order to override the format of all the dates in the system to "yyyy-MM-dd", simply set: compass.converter.date.format=yyyy-MM-dd.
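Since the format setting is a java.text.SimpleDateFormat pattern, it can help to preview what a pattern produces before configuring it. The following minimal sketch builds the two mydate settings and formats a date with plain JDK classes. DateConverterSettingsSketch is a hypothetical helper; the year-month-day pattern yyyy-MM-dd is an illustrative choice, and the fixed UTC time zone and English locale are assumptions made only so the output is reproducible, not statements about what Compass uses.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.Properties;
import java.util.TimeZone;

// Hypothetical helper for illustration only; not part of the Compass API.
final class DateConverterSettingsSketch {

    // Settings that register a date converter under the "mydate" lookup name.
    static Properties myDateConverter(String pattern) {
        Properties settings = new Properties();
        settings.setProperty("compass.converter.mydate.type", "date");
        settings.setProperty("compass.converter.mydate.format", pattern);
        return settings;
    }

    // Previews how java.text.SimpleDateFormat renders a date with the configured pattern.
    static String preview(Properties settings, Date date) {
        String pattern = settings.getProperty("compass.converter.mydate.format");
        SimpleDateFormat format = new SimpleDateFormat(pattern, Locale.ENGLISH);
        format.setTimeZone(TimeZone.getTimeZone("UTC")); // assumption: fixed zone for a stable result
        return format.format(date);
    }
}
```

For example, preview(myDateConverter("yyyy-MM-dd"), new Date(0L)) renders the epoch as 1970-01-01.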
Controls the different settings for the search engine.
Table A.11. Search Engine Settings
Setting | Description |
---|---|
compass.engine.connection | The index engine file system location. |
compass.engine.defaultsearch | When searching using a query string, the default property/meta-data that compass will use for non prefixed strings. Defaults to compass.property.all value. |
compass.engine.all.analyzer | The name of the analyzer to use for the all property (see the next section about Search Engine Analyzers). |
compass.transaction.lockDir | The directory where the search engine will maintain its locking file mechanism for intra- and inter-process transaction synchronization. Defaults to the java.io.tmpdir Java system property. This is a JVM level property. |
compass.transaction.lockTimeout | The amount of time a transaction will wait in order to obtain its specific lock (in seconds). Defaults to 10 seconds. |
compass.transaction.lockPollInterval | The interval at which the transaction will check whether it can obtain the lock (in milliseconds). Defaults to 100 milliseconds. This is a JVM level property. |
compass.engine.optimizer.type | The fully qualified class name of the search engine optimizer that will be used. Defaults to org.compass.core.lucene.engine.optimizer.AdaptiveOptimizer. Please see the following section for a list of optimizers. |
compass.engine.optimizer.schedule | Determines whether the optimizer will be scheduled (true or false). Defaults to true. If it is scheduled, it will run periodically, check whether the index needs optimization, and optimize it if so. |
compass.engine.optimizer.schedule.period | The period at which the optimizer checks whether the index needs optimization and, if it does, optimizes it (in seconds, can be a float number). Defaults to 10 seconds. The setting applies only if the optimizer is scheduled. |
compass.engine.optimizer.schedule.fixedRate | Determines whether the schedule runs at a fixed rate. If set to false, each execution is scheduled relative to the actual execution time of the previous execution. If set to true, each execution is scheduled relative to the execution time of the initial execution. |
compass.engine.optimizer.adaptive.mergeFactor | For the adaptive optimizer, determines how often the optimizer will optimize the index. Smaller values give faster searches, but the index is optimized more often. Larger values give slower searches and fewer optimizations. |
compass.engine.optimizer.aggressive.mergeFactor | For the aggressive optimizer, determines how often the optimizer will optimize the index. Smaller values give faster searches, but the index is optimized more often. Larger values give slower searches and fewer optimizations. |
compass.engine.mergeFactor | With smaller values, less RAM is used, but indexing is slower. With larger values, more RAM is used, and the indexing speed is faster. Defaults to 10. |
compass.engine.maxBufferedDocs | Determines the minimal number of documents required before the buffered in-memory documents are flushed as a new Segment. Large values generally give faster indexing. When this is set, the writer will flush every maxBufferedDocs added documents. Pass in -1 to prevent triggering a flush due to the number of buffered documents. Note that if flushing by RAM usage is also enabled, then the flush will be triggered by whichever comes first. Disabled by default (the writer flushes by RAM usage). |
compass.engine.maxBufferedDeletedTerms | Determines the minimal number of delete terms required before the buffered in-memory delete terms are applied and flushed. If there are documents buffered in memory at the time, they are merged and a new segment is created. Disabled by default (writer flushes by RAM usage). |
compass.engine.ramBufferSize | Determines the amount of RAM that may be used for buffering added documents before they are flushed as a new Segment. Generally for faster indexing performance it's best to flush by RAM usage instead of document count and use as large a RAM buffer as you can. When this is set, the writer will flush whenever buffered documents use this much RAM. Pass in -1 to prevent triggering a flush due to RAM usage. Note that if flushing by document count is also enabled, then the flush will be triggered by whichever comes first. The default value is 16 (M). |
compass.engine.termIndexInterval | Expert: Set the interval between indexed terms. Large values cause less memory to be used by IndexReader, but slow random-access to terms. Small values cause more memory to be used by an IndexReader, and speed random-access to terms. This parameter determines the amount of computation required per query term, regardless of the number of documents that contain that term. In particular, it is the maximum number of other terms that must be scanned before a term is located and its frequency and position information may be processed. In a large index with user-entered query terms, query processing time is likely to be dominated not by term lookup but rather by the processing of frequency and positional data. In a small index or when many uncommon query terms are generated (e.g., by wildcard queries) term lookup may become a dominant cost. In particular, numUniqueTerms/interval terms are read into memory by an IndexReader, and, on average, interval/2 terms must be scanned for each random term access. |
compass.engine.maxFieldLength | The number of terms that will be indexed for a single property in a resource. This limits the amount of memory required for indexing, so that collections with very large resources will not crash the indexing process by running out of memory. Note, that this effectively truncates large resources, excluding from the index terms that occur further in the resource. Defaults to 10,000 terms. |
compass.engine.useCompoundFile | Turns the use of compound files on (true) or off (false). Using compound files lowers the number of open files, at the cost of a very small performance overhead. Defaults to true. Note that when Compass starts up, it will validate that the current index structure matches the configured setting, and if it does not, it will automatically try to convert it to the correct structure. |
compass.engine.cacheIntervalInvalidation | Sets how often (in milliseconds) the index manager will check if the index cache needs to be invalidated. Defaults to 5000 milliseconds. Setting it to 0 means that the cache will check if it needs to be invalidated all the time. Setting it to -1 means that the cache will not check the index for invalidation. This is perfectly fine if a single instance is working with the index, since the cache is automatically invalidated upon a dirty operation. |
compass.engine.indexManagerScheduleInterval | The index manager schedule interval (in seconds) where different actions related to index manager will happen (such as global cache interval invalidation checks - see SearchEngineIndexManager#notifyAllToClearCache and SearchEngineIndexManager#checkAndClearIfNotifiedAllToClearCache). Defaults to 60 seconds. |
compass.engine.waitForCacheInvalidationOnIndexOperation | Defaults to false. If set to true, will cause index manager operations (including replace index) to wait for all other Compass instances to invalidate their cache. The time to wait is the value of the compass.engine.indexManagerScheduleInterval setting. |
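Two of the descriptions in the table above reduce to simple arithmetic: the termIndexInterval trade-off (numUniqueTerms/interval terms held in memory, interval/2 terms scanned on average) and the "whichever comes first" flush rule shared by maxBufferedDocs and ramBufferSize. The following minimal sketch works through both. EngineTuningSketch is a hypothetical helper that only restates those descriptions; it does not call Compass or Lucene, and the figures used with it are made up.

```java
// Hypothetical helper for illustration only; not part of the Compass or Lucene API.
final class EngineTuningSketch {

    // Approximate number of terms an IndexReader reads into memory: numUniqueTerms / interval.
    static long termsInMemory(long numUniqueTerms, int interval) {
        return numUniqueTerms / interval;
    }

    // Average number of terms scanned per random term access: interval / 2.
    static double averageTermsScanned(int interval) {
        return interval / 2.0;
    }

    // A flush is triggered by whichever limit is reached first.
    // A limit of -1 disables that trigger, as described for both settings.
    static boolean shouldFlush(int bufferedDocs, double ramUsedMb, int maxBufferedDocs, double ramBufferSizeMb) {
        boolean byDocCount = maxBufferedDocs != -1 && bufferedDocs >= maxBufferedDocs;
        boolean byRamUsage = ramBufferSizeMb != -1 && ramUsedMb >= ramBufferSizeMb;
        return byDocCount || byRamUsage;
    }
}
```

With 10,000,000 unique terms and an interval of 128, termsInMemory gives 78125 and averageTermsScanned gives 64.0; doubling the interval halves the memory figure and doubles the scan figure.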
The following section lists the different optimizers that are available with Compass::Core. Note that all the optimizers can be scheduled or not.
Table A.12.
Optimizer | Description |
---|---|
org.compass.core.lucene.engine.optimizer.AdaptiveOptimizer | When the number of segments exceeds the specified mergeFactor, the segments will be merged starting from the last segment, until a segment with a higher resource count is encountered. |
org.compass.core.lucene.engine.optimizer.AggressiveOptimizer | When the number of segments exceeds the specified mergeFactor, all the segments are merged into a single segment. |
org.compass.core.lucene.engine.optimizer.NullOptimizer | Does no optimization, starts no threads. |
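The difference between the adaptive and aggressive strategies can be sketched on a plain list of per-segment resource counts. This is only one possible reading of the table, not the Compass implementation: in particular, the adaptive variant ASSUMES that "a segment with a higher resource count" means higher than the combined count of the segments gathered so far. OptimizerSketch and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model for illustration only; not the Compass optimizer code.
final class OptimizerSketch {

    // Aggressive: once there are more segments than mergeFactor, merge all into one.
    static List<Integer> aggressive(List<Integer> segmentCounts, int mergeFactor) {
        if (segmentCounts.size() <= mergeFactor) {
            return new ArrayList<>(segmentCounts); // nothing to do yet
        }
        int total = 0;
        for (int count : segmentCounts) {
            total += count;
        }
        List<Integer> merged = new ArrayList<>();
        merged.add(total);
        return merged;
    }

    // Adaptive (ASSUMED reading): walk back from the last segment, absorbing segments,
    // and stop at the first segment whose count is higher than the total absorbed so far.
    static List<Integer> adaptive(List<Integer> segmentCounts, int mergeFactor) {
        if (segmentCounts.size() <= mergeFactor) {
            return new ArrayList<>(segmentCounts); // nothing to do yet
        }
        int index = segmentCounts.size() - 1;
        int absorbed = segmentCounts.get(index); // always start with the last segment
        while (index > 0 && segmentCounts.get(index - 1) <= absorbed) {
            index--;
            absorbed += segmentCounts.get(index);
        }
        List<Integer> result = new ArrayList<>(segmentCounts.subList(0, index));
        result.add(absorbed);
        return result;
    }
}
```

Under these assumptions, segments with counts 1000, 5, 3, 4, 10 and a mergeFactor of 3 become 1000, 22 with the adaptive sketch (the large first segment is left alone) and a single segment of 1022 with the aggressive one.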
Compass allows storing the index in a database using Jdbc. When using Jdbc storage, additional settings are mandatory besides the connection setting. The value after the jdbc:// prefix in the compass.engine.connection setting can be the Jdbc url connection or the Jndi name of the DataSource, depending on the configured DataSourceProvider.
It is important also to read the Jdbc Directory Appendix. Two sections that should be read are the supported dialects, and the performance considerations (especially the compound structure).
The following is a list of all the Jdbc settings:
Table A.13. Search Engine Jdbc Settings
Setting | Description |
---|---|
compass.engine.store.jdbc.dialect | Optional. The fully qualified class name of the dialect (the database type) that the index will be stored in. Please refer to the Lucene Jdbc Directory appendix for a list of the currently supported dialects. If not set, Compass will try to auto-detect it based on the database meta-data. |
compass.engine.store.jdbc.disableSchemaOperations | Optional. If set to true, no database schema level operations will be performed (drop and create tables). When deleting the data in the index, the content will be deleted, but the table will not be dropped. Defaults to false. |
compass.engine.store.jdbc.managed | Optional (defaults to false). Whether the connection is managed or not. If set to false, Compass will commit and roll back the transaction. If set to true, Compass will not. Should be set to true when using external transaction managers (like JTA or Spring PlatformTransactionManager), and false when using the Compass LocalTransactionFactory. Note as well that when using external transaction managers, compass.transaction.commitBeforeCompletion should be set to true. If the connection is not managed (set to false), the created DataSource will be wrapped with the Compass Jdbc directory TransactionAwareDataSourceProxy. Please refer to the Lucene Jdbc Directory appendix for more information. |
compass.engine.store.jdbc.connection.provider.class | The fully qualified name of the DataSourceProvider. The DataSourceProvider is responsible for getting/creating the Jdbc DataSource that will be used. Defaults to org.compass.core.lucene.engine.store.jdbc.DriverManagerDataSourceProvider (poor performance). Please refer to the next section for a list of the available providers. |
compass.engine.store.jdbc.useCommitLocks | Optional (defaults to false). Determines if the index will use Lucene commit locks. Setting it to true makes sense only if the system will work in autoCommit mode (which is not recommended anyhow). |
compass.engine.store.jdbc.deleteMarkDeletedDelta | Optional (defaults to an hour). Some of the entries in the database are marked as deleted rather than actually being deleted from the database. The setting controls the delta time after which they should be deleted. They will be deleted if they were marked for deletion "delta" time ago (based on database time, if the dialect makes it possible). |
compass.engine.store.jdbc.lockType | Optional (defaults to PhantomReadLock). The fully qualified name of the Lock implementation that will be used. |
compass.engine.store.jdbc.ddl.name.name | Optional (defaults to name_). The name of the name column. |
compass.engine.store.jdbc.ddl.name.size | Optional (defaults to 50). The size (in characters) of the name column. |
compass.engine.store.jdbc.ddl.value.name | Optional (defaults to value_). The name of the value column. |
compass.engine.store.jdbc.ddl.value.size | Optional (defaults to 500 * 1000 K). The size (in K) of the value column. Only applies to databases that require it. |
compass.engine.store.jdbc.ddl.size.name | Optional (defaults to size_). The name of the size column. |
compass.engine.store.jdbc.ddl.lastModified.name | Optional (defaults to lf_). The name of the last modified column. |
compass.engine.store.jdbc.ddl.deleted.name | Optional (defaults to deleted_). The name of the deleted column. |
Compass comes with several built-in DataSourceProviders. They are all located in the org.compass.core.lucene.engine.store.jdbc package. The following table lists them:
Table A.14. Search Engine Data Source Providers
Data Source Provider Class | Description |
---|---|
DriverManagerDataSourceProvider | The default data source provider. Creates a simple DataSource that returns a new Connection for each request. Performs very poorly, and should not be used. |
DbcpDataSourceProvider | Uses the Jakarta Commons DBCP connection pool. Compass provides several additional configuration settings to configure DBCP; please refer to the LuceneEnvironment#DataSourceProvider#Dbcp javadoc. |
C3P0DataSourceProvider | Uses the C3P0 connection pool. Configuring additional properties for the C3P0 connection pool relies on C3P0's internal support for a c3p0.properties file, which should reside as a top-level resource in the same CLASSPATH / classloader that loads c3p0's jar file. |
JndiDataSourceProvider | Gets a DataSource from JNDI. The JNDI name is the value after the jdbc:// prefix in the Compass connection setting. |
ExternalDataSourceProvider | A data source provider that can use an externally configured data source. In order to set the external DataSource to be used, the ExternalDataSourceProvider#setDataSource(DataSource) static method needs to be called before the Compass instance is created. |
The DriverManagerDataSourceProvider, DbcpDataSourceProvider, and C3P0DataSourceProvider use the value after the jdbc:// prefix in Compass connection setting as the Jdbc connection url. They also require the following settings to be set:
Table A.15. Internal Data Source Providers Settings
Setting | Description |
---|---|
compass.engine.store.jdbc.connection.driverClass | The Jdbc driver class. |
compass.engine.store.jdbc.connection.username | The Jdbc connection user name. |
compass.engine.store.jdbc.connection.password | The Jdbc connection password. |
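Putting the Jdbc settings together, the following minimal sketch assembles the settings for a pooled, non-managed Jdbc index store. JdbcStoreSettingsSketch is a hypothetical helper; the setting names and the provider package come from this section, while the choice of DbcpDataSourceProvider and every url, driver, user and password value passed in are placeholders you would replace with your own.

```java
import java.util.Properties;

// Hypothetical helper for illustration only; not part of the Compass API.
final class JdbcStoreSettingsSketch {

    static final String PROVIDER_PACKAGE = "org.compass.core.lucene.engine.store.jdbc";

    // Builds the settings for a Jdbc based index using the DBCP data source provider.
    static Properties jdbcSettings(String jdbcUrl, String driverClass, String user, String password) {
        Properties settings = new Properties();
        // The value after the jdbc:// prefix is the Jdbc url for this provider.
        settings.setProperty("compass.engine.connection", "jdbc://" + jdbcUrl);
        settings.setProperty("compass.engine.store.jdbc.connection.provider.class",
                PROVIDER_PACKAGE + ".DbcpDataSourceProvider");
        settings.setProperty("compass.engine.store.jdbc.connection.driverClass", driverClass);
        settings.setProperty("compass.engine.store.jdbc.connection.username", user);
        settings.setProperty("compass.engine.store.jdbc.connection.password", password);
        // Not managed: Compass commits and rolls back itself (no external transaction manager).
        settings.setProperty("compass.engine.store.jdbc.managed", "false");
        return settings;
    }
}
```

When an external transaction manager is used instead, the managed setting would be true and compass.transaction.commitBeforeCompletion would also be set to true, as described in the Jdbc settings table.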
Configuring the Jdbc store with Compass also allows defining FileEntryHandler settings for different file entries in the database. FileEntryHandlers are explained in the Lucene Jdbc Directory appendix (and require some Lucene knowledge). The Lucene Jdbc Directory implementation already comes with sensible defaults, but they can be changed using Compass configuration.
One of the things that comes for free with Compass is that it automatically uses the more performant FetchPerTransactionJdbcIndexInput if possible (based on the dialect). Special care needs to be taken when using this index input, and Compass takes care of it automatically.
Setting file entry handlers is done using the following setting prefix: compass.engine.store.jdbc.fe.[name]. The name can be __default__, which is used for all unmapped files, the full name of the stored file, or the suffix of the file (the last 3 characters). Some of the currently supported settings are:
Table A.16. File Entry Handler Settings
Setting | Description |
---|---|
compass.engine.store.jdbc.fe.[name].type | The fully qualified class name of the file entry handler. |
compass.engine.store.jdbc.fe.[name].indexInput.type | The fully qualified class name of the IndexInput implementation. |
compass.engine.store.jdbc.fe.[name].indexOutput.type | The fully qualified class name of the IndexOutput implementation. |
compass.engine.store.jdbc.fe.[name].indexInput.bufferSize | The RAM buffer size of the index input. Note, it applies only to some of the IndexInput implementations. |
compass.engine.store.jdbc.fe.[name].indexOutput.bufferSize | The RAM buffer size of the index output. Note, it applies only to some of the IndexOutput implementations. |
compass.engine.store.jdbc.fe.[name].indexOutput.threshold | The threshold value (in bytes) after which data will be temporarily written to a file (and then dumped into the database). Applies when using RAMAndFileJdbcIndexOutput (which is the default one). Defaults to 16 * 1024 bytes. |
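The way the [name] part combines with the settings in Table A.16 can be sketched as follows, again using only `java.util.Properties`. The helper `key`, the `cfs` suffix, and the numeric values are assumptions chosen for illustration, not recommended values.

```java
import java.util.Properties;

class FileEntryHandlerSettingsSketch {

    static final String PREFIX = "compass.engine.store.jdbc.fe.";

    // Builds a full setting name from a file entry name and a setting suffix.
    static String key(String name, String setting) {
        return PREFIX + name + "." + setting;
    }

    static Properties fileEntrySettings() {
        Properties settings = new Properties();
        // __default__ applies to all files that have no more specific entry.
        // Illustrative value: raise the threshold from 16 * 1024 to 32 * 1024 bytes.
        settings.setProperty(key("__default__", "indexOutput.threshold"), String.valueOf(32 * 1024));
        // A 3 character file suffix as the name (assumption: "cfs" files exist in the index).
        settings.setProperty(key("cfs", "indexInput.bufferSize"), "4096");
        return settings;
    }

    public static void main(String[] args) {
        Properties settings = fileEntrySettings();
        for (String name : settings.stringPropertyNames()) {
            System.out.println(name + "=" + settings.getProperty(name));
        }
    }
}
```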
With Compass, multiple Analyzers can be defined (each under a different analyzer name) and then referenced in the configuration and mapping definitions. Compass defines two internal analyzer names: default and search. The default analyzer is the one used when no other analyzer can be found; it defaults to the standard analyzer with English stop words. The search analyzer is used to analyze search queries and, if not set, falls back to the default analyzer (note that the search analyzer can also be set using the CompassQuery API). Changing the settings for the default analyzer can be done using the compass.engine.analyzer.default.* settings (as explained in the next table). Setting the search analyzer (so it will differ from the default analyzer) can be done using the compass.engine.analyzer.search.* settings. You can also set a list of filters to be applied to the given analyzers; please see the next section on how to configure analyzer filters, especially the synonym one.
Table A.17. Search Engine Analyzer Settings
Setting | Description |
---|---|
compass.engine.analyzer.[analyzer name].type | The type of the search engine analyzer, please see the available analyzers types later in the section. |
compass.engine.analyzer.[analyzer name].filters | A comma separated list of LuceneAnalyzerTokenFilterProviders registered under compass, to be applied for the given analyzer. For example, adding a synonym analyzer, you should register a synonym LuceneAnalyzerTokenFilterProvider under your own choice for filter name, and add it to the list of filters here. |
compass.engine.analyzer.[analyzer name].stopwords | A comma separated list of stop words to use with the chosen analyzer. If the string starts with +, the list of stop words will be added to the default set of stop words defined for the analyzer. Note that not all analyzer types support this feature. |
compass.engine.analyzer.[analyzer name].factory | If the compass.engine.analyzer.[analyzer name].type setting is not enough to configure your analyzer, use this setting to define the fully qualified class name of your analyzer factory, which must implement LuceneAnalyzerFactory. |
Compass comes with core analyzers (which are part of the lucene-core jar). They are: standard, simple, whitespace, and stop. See the Analyzers Section.
Compass also allows simple configuration of the snowball analyzer type (which comes with the lucene-snowball jar). An additional setting that must be set when using the snowball analyzer is the compass.engine.analyzer.[analyzer name].name setting. The setting can have the following values: Danish, Dutch, English, Finnish, French, German, German2, Italian, Kp, Lovins, Norwegian, Porter, Portuguese, Russian, Spanish, and Swedish.
Another set of analyzer types comes with the lucene-analyzers jar. They are: brazilian, cjk, chinese, czech, german, greek, french, dutch, and russian.
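The analyzer settings described above could be combined as in the following sketch. It only collects the setting names and values in a `java.util.Properties` object. The extra stop words are hypothetical, and using a snowball analyzer for search is just one possible choice.

```java
import java.util.Properties;

class AnalyzerSettingsSketch {

    static Properties analyzerSettings() {
        Properties settings = new Properties();
        // Keep the standard analyzer as the default, and add two (hypothetical)
        // stop words to its default set by starting the list with "+".
        settings.setProperty("compass.engine.analyzer.default.type", "standard");
        settings.setProperty("compass.engine.analyzer.default.stopwords", "+foo,bar");
        // Use a different analyzer for search queries. The snowball type
        // also requires the name setting.
        settings.setProperty("compass.engine.analyzer.search.type", "snowball");
        settings.setProperty("compass.engine.analyzer.search.name", "English");
        return settings;
    }

    public static void main(String[] args) {
        Properties settings = analyzerSettings();
        for (String name : settings.stringPropertyNames()) {
            System.out.println(name + "=" + settings.getProperty(name));
        }
    }
}
```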
You can specify a set of analyzer filters that can then be applied to any of the configured analyzers. Analyzer filters use group settings: each setting is prefixed with compass.engine.analyzerfilter, followed by the analyzer filter name, and then the setting for that analyzer filter.
Filters provide simpler support for additional filtering (or enrichment) of analyzed streams, without the hassle of creating your own analyzer. Filters can also be shared across different analyzers, potentially of different analyzer types.
Table A.18.
Setting | Description |
---|---|
compass.engine.analyzerfilter.[analyzer filter name].type | The type of the search engine analyzer filter provider, must implement the org.compass.core.lucene.engine.analyzer.LuceneAnalyzerTokenFilterProvider interface. Can also be the value synonym, which will automatically map to the org.compass.core.lucene.engine.analyzer.synonym.SynonymAnalyzerTokenFilterProvider class. |
compass.engine.analyzerfilter.[analyzer filter name].lookup | Only applies for synonym filters. The class that implements the org.compass.core.lucene.engine.analyzer.synonym.SynonymLookupProvider for providing synonyms for a given term. |
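To illustrate how a filter is registered and then referenced from an analyzer, here is a minimal sketch using `java.util.Properties`. The filter name `mySynonyms` and the class `com.example.MySynonymLookupProvider` are hypothetical. A real lookup class would have to implement SynonymLookupProvider.

```java
import java.util.Properties;

class AnalyzerFilterSettingsSketch {

    static Properties synonymFilterSettings() {
        Properties settings = new Properties();
        // Register a synonym filter under a filter name of our own choice (hypothetical name).
        settings.setProperty("compass.engine.analyzerfilter.mySynonyms.type", "synonym");
        // Hypothetical class that would provide synonyms for a given term.
        settings.setProperty("compass.engine.analyzerfilter.mySynonyms.lookup",
                "com.example.MySynonymLookupProvider");
        // Apply the registered filter to the default analyzer by listing its name.
        settings.setProperty("compass.engine.analyzer.default.filters", "mySynonyms");
        return settings;
    }

    public static void main(String[] args) {
        Properties settings = synonymFilterSettings();
        for (String name : settings.stringPropertyNames()) {
            System.out.println(name + "=" + settings.getProperty(name));
        }
    }
}
```

Because the filter is registered under its own name, the same name can be listed in the filters setting of several analyzers.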
With Compass, multiple Highlighters can be defined (each under a different highlighter name) and then referenced when using CompassHighlighter. Within Compass, an internal default highlighter is defined, and it can be configured by using default as the highlighter name.
Table A.19.
Setting | Description |
---|---|
compass.engine.highlighter.[highlighter name].factory | Low level. Optional (defaults to DefaultLuceneHighlighterFactory). The fully qualified name of the class that creates highlighters settings. Must implement the LuceneHighlighterFactory interface. |
compass.engine.highlighter.[highlighter name].textTokenizer | Optional (default to auto). Defines how a text will be tokenized to be highlighted. Can be analyzer (use an analyzer to tokenize the text), term_vector (use the term vector info stored in the index), or auto (will first try term_vector, and if no info is stored, will try to use analyzer). |
compass.engine.highlighter.[highlighter name].rewriteQuery | Low level. Optional (defaults to true). If the query used to highlight the text will be rewritten or not. |
compass.engine.highlighter.[highlighter name].computeIdf | Low level. Optional (set according to the formatter used). |
compass.engine.highlighter.[highlighter name].maxNumFragments | Optional (default to 3). Sets the maximum number of fragments that will be returned. |
compass.engine.highlighter.[highlighter name].separator | Optional (defaults to ...). Sets the separator string between fragments if using the combined fragments highlight option. |
compass.engine.highlighter.[highlighter name].maxBytesToAnalyze | Optional (defaults to 50*1024). Sets the maximum bytes of text to analyze. |
compass.engine.highlighter.[highlighter name].fragmenter.type | Optional (default to simple). The type of the fragmenter that will be used, can be simple, null, or the fully qualified class name of the fragmenter (implements the org.apache.lucene.search.highlight.Fragmenter). |
compass.engine.highlighter.[highlighter name].fragmenter.simple.size | Optional (defaults to 100). Sets the size (in bytes) of the fragments for the simple fragmenter. |
compass.engine.highlighter.[highlighter name].encoder.type | Optional (defaults to default). The type of the encoder that will be used to encode fragmented text. Can be default (does nothing), html (escapes html tags), or the fully qualified class name of the encoder (implements org.apache.lucene.search.highlight.Encoder). |
compass.engine.highlighter.[highlighter name].formatter.type | Optional (default to simple). The type of the formatter that will be used to highlight the text. Can be simple (simply wraps the highlighted text with pre and post strings), htmlSpanGradient (wraps the highlighted text with an html span tag with an optional background and foreground gradient colors), or the fully qualified class name of the formatter (implements org.apache.lucene.search.highlight.Formatter). |
compass.engine.highlighter.[highlighter name].formatter.simple.pre | Optional (defaults to <b>). In case the highlighter uses the simple formatter, controls the text that is added before the highlighted text. |
compass.engine.highlighter.[highlighter name].formatter.simple.post | Optional (defaults to </b>). In case the highlighter uses the simple formatter, controls the text that is appended after the highlighted text. |
compass.engine.highlighter.[highlighter name].formatter.htmlSpanGradient.maxScore | In case the highlighter uses the htmlSpanGradient formatter, the score above which text is displayed in the maximum color. |
compass.engine.highlighter.[highlighter name].formatter.htmlSpanGradient.minForegroundColor | Optional (if not set, foreground will not be set on the span tag). In case the highlighter uses the htmlSpanGradient formatter, the hex color used for representing IDF scores of zero, e.g. #FFFFFF (white). |
compass.engine.highlighter.[highlighter name].formatter.htmlSpanGradient.maxForegroundColor | Optional (if not set, foreground will not be set on the span tag). In case the highlighter uses the htmlSpanGradient formatter, the largest hex color used for representing IDF scores, e.g. #000000 (black). |
compass.engine.highlighter.[highlighter name].formatter.htmlSpanGradient.minBackgroundColor | Optional (if not set, background will not be set on the span tag). In case the highlighter uses the htmlSpanGradient formatter, the hex color used for representing IDF scores of zero, e.g. #FFFFFF (white). |
compass.engine.highlighter.[highlighter name].formatter.htmlSpanGradient.maxBackgroundColor | Optional (if not set, background will not be set on the span tag). In case the highlighter uses the htmlSpanGradient formatter, the largest hex color used for representing IDF scores, e.g. #000000 (black). |
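As a rough sketch of how the highlighter settings group under a highlighter name, the following example overrides a few settings of the internal default highlighter and defines a second highlighter. The name `gradient`, the `<em>` markup, and the numeric and color values are assumptions for illustration only.

```java
import java.util.Properties;

class HighlighterSettingsSketch {

    static final String PREFIX = "compass.engine.highlighter.";

    static Properties highlighterSettings() {
        Properties settings = new Properties();
        // Override a few settings of the internal "default" highlighter.
        settings.setProperty(PREFIX + "default.maxNumFragments", "5");
        settings.setProperty(PREFIX + "default.encoder.type", "html");
        settings.setProperty(PREFIX + "default.formatter.type", "simple");
        settings.setProperty(PREFIX + "default.formatter.simple.pre", "<em>");
        settings.setProperty(PREFIX + "default.formatter.simple.post", "</em>");
        // A second highlighter under a hypothetical name, using the gradient formatter.
        settings.setProperty(PREFIX + "gradient.formatter.type", "htmlSpanGradient");
        settings.setProperty(PREFIX + "gradient.formatter.htmlSpanGradient.minBackgroundColor", "#FFFFFF");
        settings.setProperty(PREFIX + "gradient.formatter.htmlSpanGradient.maxBackgroundColor", "#FFFF00");
        return settings;
    }

    public static void main(String[] args) {
        Properties settings = highlighterSettings();
        for (String name : settings.stringPropertyNames()) {
            System.out.println(name + "=" + settings.getProperty(name));
        }
    }
}
```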
Several other settings that control compass.
Table A.20.
Setting | Description |
---|---|
compass.osem.managedId.index | Can be either un_tokenized or no (defaults to no). It is the index setting that will be used when creating an internal managed id for a class property mapping. If the property is an id property, the index setting will always be un_tokenized. |
compass.osem.supportUnmarshall | Sets the default for un-marshalling support within the class mappings (used unless it is explicitly set in the class mapping). Defaults to true. It controls whether the searchable class will support un-marshalling from the search engine, or whether using Resource is enough. Un-marshalling is the process of converting a raw Resource into the actual domain object. If un-marshalling support is enabled, extra information will be stored within the search engine, and extra memory will be consumed. |