Elasticsearch

Elasticsearch is a search server. You store documents (sets of key-value pairs) in it and send it configurable queries; Elasticsearch scores the documents against each query and returns the most relevant hits.
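As a rough illustration of the store, score, and return cycle, here is a toy sketch in plain Python. It is not how Elasticsearch computes relevance (that is handled by Lucene); the scoring rule, function names, and sample documents are all assumptions made for the example.

```python
from collections import Counter


def score(document, query):
    """Toy relevance: count occurrences of the query terms in the document's values."""
    words = Counter()
    for value in document.values():
        words.update(str(value).lower().split())
    return sum(words[term] for term in query.lower().split())


def search(documents, query, size=10):
    """Return the matching documents, most relevant first."""
    scored = [(score(doc, query), doc) for doc in documents]
    hits = [(s, doc) for s, doc in scored if s > 0]
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in hits[:size]]


# Hypothetical documents, each a set of key-value pairs.
documents = [
    {"name": "Weather Radar", "description": "Live weather maps"},
    {"name": "Todo List", "description": "Keep track of tasks"},
    {"name": "Weather Widget", "description": "A small forecast widget"},
]
results = search(documents, "weather")
```

Here "Weather Radar" ranks first because the term appears in both of its fields, and "Todo List" is dropped because it does not match at all.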

Installation

You can download the Elasticsearch archive and run Elasticsearch directly from the extracted folder. This makes it easy to upgrade or test new versions as needed. Alternatively, you can install Elasticsearch using your preferred system package manager.

We are currently using Elasticsearch version 1.6.2. To install it, run the following:

curl -O https://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-1.6.2.tar.gz
tar -xvzf elasticsearch-1.6.2.tar.gz
cd elasticsearch-1.6.2

To run Marketplace, you must also install the ICU Analysis Plugin:

./bin/plugin -install elasticsearch/elasticsearch-analysis-icu/2.6.0

For more about the ICU plugin, see the ICU GitHub page.

Settings

cluster.name: wooyeah

# Don't try to cluster with other machines during local development.
# Remove the following 3 lines to enable default clustering.
network.host: localhost
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["localhost"]

script.disable_dynamic: false

path:
  logs: /usr/local/var/log
  data: /usr/local/var/data

We use a custom analyzer for indexing add-on names since they’re a little different from normal text.

To get the same results as our servers, configure Elasticsearch by copying the elasticsearch.yml file (found in the scripts/elasticsearch/ folder of your install) to your system.

For example, copy it to the local directory so it’s nearby when you launch Elasticsearch:

cp /path/to/zamboni/scripts/elasticsearch/elasticsearch.yml .

If you don’t do this, your results will be slightly different, but you probably won’t notice.

Launching and Setting Up

Launch the Elasticsearch service:

./bin/elasticsearch -Des.config=elasticsearch.yml

Zamboni has commands that set up mappings and indexes for you. Setting up the mappings is analogous to defining the structure of a table; indexing is analogous to storing rows.
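To make the table analogy concrete, the sketch below shows a mapping next to a document as plain Python dicts. The field names and the unmapped_fields helper are hypothetical and are not taken from Zamboni's real mappings; only the overall shape (a "properties" dict of typed fields) follows Elasticsearch 1.x conventions.

```python
# Hypothetical mapping: declares each field and its type, like a table schema.
mapping = {
    "properties": {
        "name": {"type": "string"},
        "is_public": {"type": "boolean"},
        "weekly_downloads": {"type": "long"},
    }
}

# A document to index: one "row" whose keys match the mapped fields.
document = {"name": "Weather Radar", "is_public": True, "weekly_downloads": 1200}


def unmapped_fields(mapping, document):
    """Return the document keys that the mapping does not declare."""
    return sorted(set(document) - set(mapping["properties"]))
```

Calling unmapped_fields(mapping, document) returns an empty list here, since every key in the document is declared in the mapping.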

Note that the index is maintained incrementally through post_save and post_delete hooks.

To create the apps index and index the apps, run:
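The idea of incremental maintenance can be sketched without Django or Elasticsearch. In the toy example below, a dict stands in for the index and two functions stand in for the signal handlers; the handler names and the dict-based "model instance" are assumptions for illustration, and in a real project such handlers would be connected to Django's post_save and post_delete signals rather than called directly.

```python
index = {}  # stand-in for the Elasticsearch index: id -> document


def update_index(sender, instance, **kwargs):
    """post_save-style handler: index (or re-index) the saved object."""
    index[instance["id"]] = {"name": instance["name"]}


def remove_from_index(sender, instance, **kwargs):
    """post_delete-style handler: drop the deleted object from the index."""
    index.pop(instance["id"], None)


# Simulate saving, editing, and deleting a hypothetical app.
app = {"id": 1, "name": "Weather Radar"}
update_index(sender=None, instance=app)
app["name"] = "Weather Radar Pro"
update_index(sender=None, instance=app)
snapshot_after_save = dict(index)
remove_from_index(sender=None, instance=app)
```

Each save touches only the one affected document, which is why a full reindex is only needed for setup or after mapping changes.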

./manage.py reindex --index=apps

Alternatively, use the Makefile target, which uses the settings_local.py file:

make reindex

If you need to use another settings file and pass extra arguments:

make SETTINGS=settings_other ARGS='--force' reindex

Querying Elasticsearch in Django

We use Elasticsearch DSL, a Python library that gives us a search API to Elasticsearch.

On Marketplace, apps use mkt/webapps/indexers:WebappIndexer as their interface to Elasticsearch:

query_results = WebappIndexer.search().query(...).filter(...).execute()
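It can help to see the kind of request body such a chained call ends up describing. The sketch below builds a "filtered" query by hand, which is how Elasticsearch 1.x combines a scored query with a non-scoring filter. The build_search_body function and the name and status fields are hypothetical, and the exact body the library generates may differ.

```python
import json


def build_search_body(text, status):
    """Build an Elasticsearch 1.x "filtered" query body as a plain dict."""
    return {
        "query": {
            "filtered": {
                # The query part contributes to the relevance score.
                "query": {"match": {"name": text}},
                # The filter part only includes or excludes documents.
                "filter": {"term": {"status": status}},
            }
        }
    }


body = build_search_body("weather", "public")
print(json.dumps(body, indent=2))
```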

Testing with Elasticsearch

All test cases using Elasticsearch should inherit from mkt.site.tests.ESTestCase. The test runner skips all such tests unless the following setting is enabled:

RUN_ES_TESTS = True

This is a performance optimization: it keeps the run time of the test suite down when the Elasticsearch tests are not needed.
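The skipping behavior can be approximated with the standard unittest module. This is a minimal sketch and not the actual ESTestCase implementation: the ESTestCaseSketch class is hypothetical, and RUN_ES_TESTS is defined inline here, whereas in practice it would come from your settings file.

```python
import unittest

RUN_ES_TESTS = False  # assumption: normally read from your settings module


class ESTestCaseSketch(unittest.TestCase):
    """Hypothetical base class: skips every test unless RUN_ES_TESTS is set."""

    def setUp(self):
        if not RUN_ES_TESTS:
            raise unittest.SkipTest("RUN_ES_TESTS is not enabled")


class SearchTest(ESTestCaseSketch):
    def test_search(self):
        # A real test would index documents and query Elasticsearch here.
        self.assertTrue(True)


# Run the test case and collect the outcome without exiting the interpreter.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(SearchTest)
result = unittest.TestResult()
suite.run(result)
```

With RUN_ES_TESTS left at False, the test is reported as skipped rather than failed, so the rest of the suite still passes.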

Troubleshooting

I got a CircularReference error on .search() - check that you are passing just a field’s value into the filters, not a whole object.

I indexed something into Elasticsearch, but my query returns nothing - check whether the query contains upper-case letters or hyphens. If so, try lowercasing your query filter. For hyphens, set the field’s mapping to not be analyzed:

'my_field': {'type': 'string', 'index': 'not_analyzed'}
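The reason behind this advice can be sketched in plain Python. The analyze function below is only a rough stand-in for what a default analyzer does to a string field at index time (lowercasing and splitting on punctuation); it is an assumption for illustration, not Elasticsearch's actual tokenizer. An exact-match filter compares your value against the stored tokens as-is, so a mixed-case or hyphenated value no longer matches once the field has been analyzed.

```python
import re


def analyze(text):
    """Rough stand-in for a default analyzer: lowercase, split on non-alphanumerics."""
    return [token for token in re.split(r"[^0-9a-z]+", text.lower()) if token]


def term_matches(indexed_tokens, term):
    """An exact-match filter compares the term, unanalyzed, against indexed tokens."""
    return term in indexed_tokens


value = "My-Addon"
analyzed_tokens = analyze(value)  # ['my', 'addon']
not_analyzed_tokens = [value]     # kept as the single token 'My-Addon'

print(term_matches(analyzed_tokens, "My-Addon"))      # False
print(term_matches(analyzed_tokens, "addon"))         # True
print(term_matches(not_analyzed_tokens, "My-Addon"))  # True
```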