Search DSL¶

The `Search` object¶

The Search object represents the entire search request:

queries

filters

aggregations

sort

pagination

additional parameters

associated client

The API is designed to be chainable. With the exception of the aggregations functionality this means that the Search object is immutable - all changes to the object will result in a copy being created which contains the changes. This means you can safely pass the Search object to foreign code without fear of it modifying your objects.
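The copy-on-change behaviour described above can be illustrated with a minimal sketch. `MiniSearch` is a hypothetical stand-in written only for this example, not the library's implementation; it shows why a chained call leaves the original object untouched.

```python
import copy


class MiniSearch:
    """Hypothetical stand-in for Search; illustrates copy-on-change only."""

    def __init__(self):
        self._queries = []

    def query(self, name, **params):
        # Work on a deep copy so the receiver is never modified.
        clone = copy.deepcopy(self)
        clone._queries.append({name: params})
        return clone

    def to_dict(self):
        return {"queries": list(self._queries)}


base = MiniSearch()
narrowed = base.query("match", title="python")

print(base.to_dict())      # the original carries no query
print(narrowed.to_dict())  # only the copy carries the new query
```

Because every call hands back a new object, code that receives `base` cannot change what the caller sees, at the cost of one copy per chained call.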

You can pass an instance of the low-level elasticsearch client when instantiating the Search object:

from elasticsearch import Elasticsearch
from elasticsearch_dsl import Search

client = Elasticsearch()

s = Search(client)

You can also define the client at a later time (for more options see the Connections chapter):

s = s.using(client)

Note

With the exception of the aggregations functionality, all methods return a copy of the object, making it safe to pass to outside code.

The API is chainable, allowing you to combine multiple method calls in one statement:

s = Search().using(client).query("match", title="python")

Note

In some cases this approach is not possible due to Python’s restrictions on identifiers, for example if your field is called @timestamp. In that case you have to fall back to unpacking a dictionary: s.query('range', **{'@timestamp': {'lt': 'now'}})
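The dictionary-unpacking workaround relies on plain Python behaviour and can be tried without a cluster. In the sketch below, `build_query` is a hypothetical helper (not part of the library) that simply wraps its keyword arguments under the query name.

```python
def build_query(name, **params):
    """Hypothetical helper: wrap keyword arguments under the query name."""
    return {name: params}


# Writing build_query("range", @timestamp={"lt": "now"}) would be a SyntaxError,
# because "@timestamp" is not a valid Python identifier.
body = build_query("range", **{"@timestamp": {"lt": "now"}})
print(body)
```

Unpacking with `**` accepts any string key, so field names containing characters such as `@` or `.` can still be passed as keyword arguments.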

To send the request to Elasticsearch:

response = s.execute()

If you just want to iterate over the hits returned by your search you can iterate over the Search object:

for hit in s:
    print(hit.title)

Search results will be cached. Subsequent calls to execute, or attempts to iterate over an already executed Search object, will not send additional requests to Elasticsearch. To force a request, specify ignore_cache=True when calling execute.
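The caching rule can be sketched as follows. `CachedSearch` and `fake_backend` are hypothetical names invented for this illustration; the real class may be implemented differently, and only the `ignore_cache` flag is taken from the text above.

```python
class CachedSearch:
    """Hypothetical sketch of result caching; not the library's code."""

    def __init__(self, send_request):
        self._send_request = send_request  # callable that performs the request
        self._response = None

    def execute(self, ignore_cache=False):
        # Contact the backend only when nothing is cached or a refresh is forced.
        if ignore_cache or self._response is None:
            self._response = self._send_request()
        return self._response

    def __iter__(self):
        # Iteration reuses execute(), so it also benefits from the cache.
        return iter(self.execute())


calls = []


def fake_backend():
    calls.append(1)  # record every simulated request
    return ["hit-a", "hit-b"]


search = CachedSearch(fake_backend)
search.execute()                   # first request
hits = list(search)                # served from the cache
search.execute(ignore_cache=True)  # forces a second request
print(len(calls))
```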

For debugging purposes you can serialize the Search object to a dict explicitly:

print(s.to_dict())

Queries¶

The library provides classes for all Elasticsearch query types. Pass all the parameters as keyword arguments:

from elasticsearch_dsl.query import MultiMatch

# {"multi_match": {"query": "python django", "fields": ["title", "body"]}}
MultiMatch(query='python django', fields=['title', 'body'])

You can use the Q shortcut to construct the instance using a name with parameters or the raw dict:

from elasticsearch_dsl import Q

Q("multi_match", query='python django', fields=['title', 'body'])
Q({"multi_match": {"query": "python django", "fields": ["title", "body"]}})

To add the query to the Search object, use the .query() method:

q = Q("multi_match", query='python django', fields=['title', 'body'])
s = s.query(q)

The method also accepts the same parameters as the Q shortcut:

s = s.query("multi_match", query='python django', fields=['title', 'body'])

If you already have a query object, or a dict representing one, you can just override the query used in the Search object:

s.query = Q('bool', must=[Q('match', title='python'), Q('match', body='best')])

Query combination¶

Query objects can be combined using logical operators:

Q("match", title='python') | Q("match", title='django')
# {"bool": {"should": [...]}}

Q("match", title='python') & Q("match", title='django')
# {"bool": {"must": [...]}}

~Q("match", title="python")
# {"bool": {"must_not": [...]}}
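How the operators can map onto bool clauses is sketched below with a hypothetical `MiniQ` class. It is an illustration of Python operator overloading under the mapping shown in the comments above, not the library's own query classes.

```python
class MiniQ:
    """Hypothetical query wrapper showing how operators can map to bool clauses."""

    def __init__(self, body):
        self.body = body

    def __or__(self, other):
        # a | b  ->  either clause may match
        return MiniQ({"bool": {"should": [self.body, other.body]}})

    def __and__(self, other):
        # a & b  ->  both clauses must match
        return MiniQ({"bool": {"must": [self.body, other.body]}})

    def __invert__(self):
        # ~a  ->  the clause must not match
        return MiniQ({"bool": {"must_not": [self.body]}})


python = MiniQ({"match": {"title": "python"}})
django = MiniQ({"match": {"title": "django"}})

either = (python | django).body
both = (python & django).body
negated = (~python).body
print(either, both, negated, sep="\n")
```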

You can also use the + operator:

Q("match", title='python') + Q("match", title='django')
# {"bool": {"must": [...]}}

When using the + operator with Bool queries, it will merge them into a single Bool query:

Q("bool") + Q("bool")
# {"bool": {"..."}}

When you call the .query() method multiple times, the + operator will be used internally:

s = s.query().query()
print(s.to_dict())
# {"query": {"bool": {...}}}

If you want to have precise control over the query form, use the Q shortcut to directly construct the combined query:

q = Q('bool',
    must=[Q('match', title='python')],
    should=[Q(...), Q(...)],
    minimum_should_match=1
)
s = Search().query(q)

Filters¶

Filters behave similarly to queries - just use the F shortcut and .filter() method. When you use the .filter() method, the query will be automatically wrapped in a filtered query.

If you want to use the post_filter element for faceted navigation, use the .post_filter() method.

Aggregations¶

To define an aggregation, you can use the A shortcut:

A('terms', field='tags')
# {"terms": {"field": "tags"}}

To nest aggregations, you can use the .bucket() and .metric() methods:

a = A('terms', field='category')
# {'terms': {'field': 'category'}}

a.metric('clicks_per_category', 'sum', field='clicks')\
    .bucket('tags_per_category', 'terms', field='tags')
# {
#   'terms': {'field': 'category'},
#   'aggs': {
#     'clicks_per_category': {'sum': {'field': 'clicks'}},
#     'tags_per_category': {'terms': {'field': 'tags'}}
#   }
# }

To add aggregations to the Search object, use the .aggs property, which acts as a top-level aggregation:

s = Search()
a = A('terms', field='category')
s.aggs.bucket('category_terms', a)
# {
#   'aggs': {
#     'category_terms': {
#       'terms': {
#         'field': 'category'
#       }
#     }
#   }
# }

or

s = Search()
s.aggs.bucket('per_category', 'terms', field='category')\
    .metric('clicks_per_category', 'sum', field='clicks')\
    .bucket('tags_per_category', 'terms', field='tags')

s.to_dict()
# {
#   'aggs': {
#     'per_category': {
#       'terms': {'field': 'category'},
#       'aggs': {
#         'clicks_per_category': {'sum': {'field': 'clicks'}},
#         'tags_per_category': {'terms': {'field': 'tags'}}
#       }
#     }
#   }
# }

You can access an existing bucket by its name:

s = Search()

s.aggs.bucket('per_category', 'terms', field='category')
s.aggs['per_category'].metric('clicks_per_category', 'sum', field='clicks')
s.aggs['per_category'].bucket('tags_per_category', 'terms', field='tags')

Note

When chaining multiple aggregations, there is a difference between what .bucket() and .metric() methods return - .bucket() returns the newly defined bucket while .metric() returns its parent bucket to allow further chaining.

As opposed to other methods on the Search objects, defining aggregations is done in-place (does not return a copy).
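The difference in return values is easiest to see in a small model. `MiniAgg` below is a hypothetical class that mirrors only the chaining behaviour described in this note; the `max_lines` metric in the second example is likewise made up for illustration.

```python
class MiniAgg:
    """Hypothetical aggregation node; mirrors only the chaining behaviour."""

    def __init__(self, agg_type, **params):
        self.agg_type = agg_type
        self.params = params
        self.children = {}

    def bucket(self, name, agg_type, **params):
        # Returns the NEW child, so later calls nest inside it.
        child = MiniAgg(agg_type, **params)
        self.children[name] = child
        return child

    def metric(self, name, agg_type, **params):
        # Returns the PARENT (self), so later calls become siblings.
        self.children[name] = MiniAgg(agg_type, **params)
        return self

    def to_dict(self):
        d = {self.agg_type: self.params}
        if self.children:
            d["aggs"] = {n: c.to_dict() for n, c in self.children.items()}
        return d


# metric first: both aggregations end up side by side under the parent
a = MiniAgg("terms", field="category")
a.metric("clicks_per_category", "sum", field="clicks") \
    .bucket("tags_per_category", "terms", field="tags")

# bucket first: the metric is nested inside the new bucket
b = MiniAgg("terms", field="category")
b.bucket("tags_per_category", "terms", field="tags") \
    .metric("max_lines", "max", field="lines")

print(a.to_dict())
print(b.to_dict())
```

Note that this model, like the aggregation API it imitates, modifies `a` and `b` in place rather than returning copies.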

Sorting¶

To specify sorting order, use the .sort() method:

s = Search().sort(
    'category',
    '-title',
    {"lines" : {"order" : "asc", "mode" : "avg"}}
)

It accepts positional arguments, each of which can be either a string or a dictionary. A string value is a field name, optionally prefixed by the - sign to specify descending order.
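The handling of the - prefix can be pictured as a small normalization step. `normalize_sort` is a hypothetical helper written for this illustration; it assumes that a descending field is expressed as an explicit `{"order": "desc"}` clause in the request body.

```python
def normalize_sort(*keys):
    """Hypothetical helper: expand '-field' strings into explicit sort clauses."""
    clauses = []
    for key in keys:
        if isinstance(key, str) and key.startswith("-"):
            # A leading minus sign requests descending order.
            clauses.append({key[1:]: {"order": "desc"}})
        else:
            # Plain field names and dictionaries are passed through unchanged.
            clauses.append(key)
    return clauses


sort = normalize_sort(
    "category",
    "-title",
    {"lines": {"order": "asc", "mode": "avg"}},
)
print(sort)
```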

To reset the sorting, just call the method with no arguments:

s = s.sort()

Pagination¶

To specify the from/size parameters, use the Python slicing API:

s = s[10:20]
# {"from": 10, "size": 10}
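The translation from a slice to from/size is simple arithmetic: `from` is the slice start and `size` is stop minus start. The hypothetical `MiniPage` class below sketches this using Python's `__getitem__` protocol; it is not the library's code and supports only slices with an explicit stop.

```python
class MiniPage:
    """Hypothetical object that turns slicing into from/size parameters."""

    def __init__(self, start=0, size=10):
        self.start = start
        self.size = size

    def __getitem__(self, key):
        if not isinstance(key, slice):
            raise TypeError("this sketch supports slices only")
        if key.stop is None:
            raise ValueError("this sketch needs an explicit stop")
        start = key.start or 0
        # size is the number of items between start and stop
        return MiniPage(start, key.stop - start)

    def to_dict(self):
        return {"from": self.start, "size": self.size}


page = MiniPage()[10:20]
print(page.to_dict())
```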

If you want to access all the documents matched by your query, you can use the scan method, which uses the scan/scroll Elasticsearch API:

for hit in s.scan():
    print(hit.title)

Note that in this case the results won’t be sorted.

Highlighting¶

To set common attributes for highlighting use the highlight_options method:

s = s.highlight_options(order='score')

Enabling highlighting for individual fields is done using the highlight method:

s = s.highlight('title')
# or, including parameters:
s = s.highlight('title', fragment_size=50)
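For orientation, the two calls above correspond to a `highlight` section in the request body, with shared options at the top level and per-field settings under `fields`. The helper `highlight_section` is hypothetical, and the sketch assumes the library serializes to the standard Elasticsearch request layout.

```python
def highlight_section(options, fields):
    """Hypothetical helper: merge shared options with per-field settings."""
    section = dict(options)           # shared options, e.g. {"order": "score"}
    section["fields"] = dict(fields)  # per-field settings keyed by field name
    return {"highlight": section}


body = highlight_section(
    {"order": "score"},
    {"title": {"fragment_size": 50}},
)
print(body)
```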

The fragments in the response will then be available on each Result object as .meta.highlight.FIELD, which will contain the list of fragments:

response = s.execute()
for hit in response:
    for fragment in hit.meta.highlight.title:
        print(fragment)

Suggestions¶

To specify a suggest request on your Search object use the suggest method:

s = s.suggest('my_suggestion', 'pyhton', term={'field': 'title'})

The first argument is the name of the suggestion (the name under which it will be returned), the second is the text you want the suggester to work on, and the keyword arguments are added to the suggest’s JSON as-is.
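The way the three pieces combine can be sketched with a hypothetical helper, `suggest_section`, which is not part of the library. It assumes the standard Elasticsearch layout, in which the text and the suggester definition sit side by side under the suggestion name.

```python
def suggest_section(name, text, **kwargs):
    """Hypothetical helper: build the suggest part of a request body."""
    entry = {"text": text}
    entry.update(kwargs)  # keyword arguments are copied in unchanged
    return {"suggest": {name: entry}}


body = suggest_section("my_suggestion", "pyhton", term={"field": "title"})
print(body)
```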

If you only wish to run the suggestion part of the search (via the _suggest endpoint) you can do so via execute_suggest:

s = s.suggest('my_suggestion', 'pyhton', term={'field': 'title'})
suggestions = s.execute_suggest()

print(suggestions.my_suggestion)

Extra properties and parameters¶

To set extra properties of the search request, use the .extra() method:

s = s.extra(explain=True)

To set query parameters, use the .params() method:

s = s.params(search_type="count")

If you need to limit the fields being returned by Elasticsearch, use the fields() method:

# only return the selected fields
s = s.fields(['title', 'body'])
# reset the field selection
s = s.fields()
# don't return any fields, just the metadata
s = s.fields([])

Serialization and Deserialization¶

The search object can be serialized into a dictionary by using the .to_dict() method.

You can also create a Search object from a dict:

s = Search.from_dict({"query": {"match": {"title": "python"}}})

Response¶

You can execute your search by calling the .execute() method, which returns a Response object. The Response object gives you access to any key from the response dictionary via attribute access. It also provides some convenient helpers:

response = s.execute()

print(response.success())
# True

print(response.took)
# 12

print(response.hits.total)

print(response.suggest.my_suggestion)

If you want to inspect the contents of the response object, use its to_dict method to get access to the raw data for pretty printing.
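Attribute access over a response dictionary can be modelled with Python's `__getattr__` hook. `AttrView` below is a hypothetical wrapper written for this illustration and the sample data is made up; the library's Response class offers more than this sketch.

```python
class AttrView:
    """Hypothetical read-only wrapper giving attribute access to a dict."""

    def __init__(self, data):
        self._data = data

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails.
        try:
            value = self._data[name]
        except KeyError:
            raise AttributeError(name)
        # Wrap nested dictionaries so access can be chained.
        return AttrView(value) if isinstance(value, dict) else value

    def __getitem__(self, key):
        # Item access still works for keys that clash with attribute names.
        return self._data[key]

    def to_dict(self):
        return self._data


raw = {"took": 12, "hits": {"total": 2, "hits": []}}
response = AttrView(raw)
print(response.took, response.hits.total)
```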

Hits¶

To access the hits returned by the search, use the hits property or just iterate over the Response object:

response = s.execute()
print('Total %d hits found.' % response.hits.total)
for h in response:
    print(h.title, h.body)

Result¶

Each individual hit is wrapped in a convenience class that allows attribute access to the keys in the returned dictionary. All the metadata for the result is accessible via meta (without the leading _):

response = s.execute()
h = response.hits[0]
print('/%s/%s/%s returned with score %f' % (
    h.meta.index, h.meta.doc_type, h.meta.id, h.meta.score))

Note

If your document has a field called meta you have to access it using the get item syntax: hit['meta'].

Aggregations¶

Aggregations are available through the aggregations property:

for tag in response.aggregations.per_tag.buckets:
    print(tag.key, tag.max_lines.value)

`MultiSearch`¶

If you need to execute multiple searches at the same time you can use the MultiSearch class which will use the _msearch API:

from elasticsearch_dsl import MultiSearch, Search

ms = MultiSearch(index='blogs')

ms = ms.add(Search().filter('term', tags='python'))
ms = ms.add(Search().filter('term', tags='elasticsearch'))

responses = ms.execute()

for response in responses:
    print("Results for query %r." % response.search.filter)
    for hit in response:
        print(hit.title)
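The _msearch API takes newline-delimited JSON in which each search contributes a header line followed by a body line. The sketch below builds such a payload by hand; `build_msearch_body` is a hypothetical helper, and the plain term queries are a simplification of what the .filter() calls above would produce.

```python
import json


def build_msearch_body(index, searches):
    """Hypothetical helper: serialize searches as header/body line pairs."""
    lines = []
    for body in searches:
        lines.append(json.dumps({"index": index}))  # header line
        lines.append(json.dumps(body))              # search body line
    # The payload is newline-delimited and ends with a trailing newline.
    return "\n".join(lines) + "\n"


payload = build_msearch_body("blogs", [
    {"query": {"term": {"tags": "python"}}},
    {"query": {"term": {"tags": "elasticsearch"}}},
])
print(payload)
```

Sending all searches in one payload costs a single round trip, and the responses come back in the same order as the searches were added.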
