The elastic_recheck.results Module

Elasticsearch wrapper to make handling results easier.

class elastic_recheck.results.FacetSet

Bases: dict

A dictionary-like collection for creating faceted ResultSets.

Elasticsearch doesn’t support nested facets, which are incredibly useful for things like faceting by build_status and then by build_uuid. This is a client-side implementation that processes a ResultSet with an ordered list of facets and turns it into a data structure of the form FacetSet -> FacetSet ... -> ResultSet (arbitrary nesting of FacetSets with ResultSets as the leaves).

Treat this basically like a dictionary (which it inherits from).
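The nesting described above can be illustrated with a minimal sketch. This is not the actual detect_facets implementation: plain dicts and lists stand in for FacetSet and ResultSet, and the helper name nest_by_facets and the sample hits are hypothetical.

```python
from collections import defaultdict


def nest_by_facets(hits, facets):
    """Group hits by each facet in order (hypothetical helper).

    Returns nested dicts keyed by facet value, with plain lists of hits
    as the leaves. Dicts stand in for FacetSet, lists for ResultSet.
    """
    if not facets:
        return list(hits)
    first, rest = facets[0], facets[1:]
    groups = defaultdict(list)
    for hit in hits:
        groups[hit.get(first)].append(hit)
    # Recurse so each group is split again by the remaining facets.
    return {key: nest_by_facets(members, rest)
            for key, members in groups.items()}


# Made-up sample data, for illustration only.
hits = [
    {"build_status": "FAILURE", "build_uuid": "aaa"},
    {"build_status": "FAILURE", "build_uuid": "aaa"},
    {"build_status": "FAILURE", "build_uuid": "bbb"},
    {"build_status": "SUCCESS", "build_uuid": "ccc"},
]
nested = nest_by_facets(hits, ["build_status", "build_uuid"])
```

Here nested["FAILURE"]["aaa"] holds the two hits that share that status and uuid, mirroring the FacetSet -> FacetSet -> ResultSet shape.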

detect_facets(results, facets, res=3600)

class elastic_recheck.results.Hit(hit)

Bases: object

index()

class elastic_recheck.results.ResultSet(results={})

Bases: list

An easy iterator object for handling elasticsearch results.

pyelasticsearch returns very complex result structures, and manipulating them directly is both ugly and error-prone. The point of this wrapper class is to give us a container that makes working with pyes results more natural.

For instance:

results = se.search(...)
for hit in results:
    print(hit.build_status)

This greatly simplifies code that interacts with search results, and lets us absorb some schema instability in Elasticsearch by adapting our __getattr__ methods.

Design goals for ResultSet are that it is an iterator, and that all the data that we want to work with is mapped to a flat attribute namespace (pyes goes way overboard with nesting, which is fine in the general case, but in the elastic_recheck case is just added complexity).
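As a rough sketch of the flat attribute namespace idea, the class below maps attribute access onto a nested hit through __getattr__. It is not the real Hit class: the name FlatHit and the "_source" / "@fields" layout of the raw hit are assumptions made for illustration.

```python
class FlatHit(object):
    """Hypothetical stand-in for Hit: exposes nested fields as attributes."""

    def __init__(self, raw):
        self._raw = raw

    def __getattr__(self, name):
        # __getattr__ only runs when normal lookup fails. Refuse private
        # names so a missing _raw can never cause infinite recursion.
        if name.startswith("_"):
            raise AttributeError(name)
        source = self._raw.get("_source", {})
        fields = source.get("@fields", {})  # assumed nesting, may differ
        # Check each nesting level in turn; a schema change only needs
        # another container added here, not changes in calling code.
        for container in (source, fields):
            if name in container:
                return container[name]
        raise AttributeError(name)


# Made-up raw hit, for illustration only.
hit = FlatHit({"_source": {"message": "boom",
                           "@fields": {"build_status": "FAILURE"}}})
status = hit.build_status
message = hit.message
```

Callers read hit.build_status without knowing where the field sits in the raw structure, which is the property the design goals above ask for.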

class elastic_recheck.results.SearchEngine(url, indexfmt='logstash-%Y.%m.%d')

Bases: object

Wrapper for pyelasticsearch so that it returns result sets.

search(query, size=1000, recent=False, days=0)

Search an elasticsearch server.

query is the complicated query structure that pyelasticsearch uses. See the pyelasticsearch documentation for details.

size is the maximum number of results to return from the search engine. We default it to 1000 to ensure we don’t lose things. For certain classes of queries (like faceted ones), this can actually be set very low, as it won’t impact the facet counts.

recent searches only the most recent index(es), on the assumption that this is essentially a real-time query where you only care about the last hour. Using recent dramatically reduces the load on the ES cluster.

days searches only the given number of most recent days.

The returned result is a ResultSet.
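To show how a date-based indexfmt such as 'logstash-%Y.%m.%d' can limit a search to recent indexes, here is a minimal sketch. How search() actually translates recent and days into index names is not documented on this page, so the helper index_names and its behavior are assumptions.

```python
from datetime import datetime, timedelta, timezone


def index_names(indexfmt, days, now=None):
    """Return index names for today and the preceding days (hypothetical).

    indexfmt is a strftime pattern, as in the SearchEngine signature;
    days is how many daily indexes to cover, newest first.
    """
    now = now or datetime.now(timezone.utc)
    return [(now - timedelta(days=offset)).strftime(indexfmt)
            for offset in range(days)]


# Fixed date so the result is reproducible.
names = index_names("logstash-%Y.%m.%d", 3, now=datetime(2014, 1, 2))
# names == ['logstash-2014.01.02', 'logstash-2014.01.01', 'logstash-2013.12.31']
```

Querying only a handful of named indexes rather than every index is one plausible reason a restricted search puts less load on the cluster.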
