Elasticsearch, a powerful distributed search and analytics engine, allows developers to store, search, and analyze large volumes of data in real time. One of its most versatile features is the ability to craft compound queries. These queries allow users to combine multiple query types, providing fine-grained control over search logic. In this article, we will dive deep into compound queries in Elasticsearch, their types, use cases, and best practices.
What Are Compound Queries?
In Elasticsearch, queries are divided into two main categories:
- Leaf queries – These operate on individual fields (e.g., match, term, range queries).
- Compound queries – These allow you to combine or modify the results of other queries.
Compound queries are essential when you need to implement complex search conditions. They let you:
- Combine multiple queries with logical operators.
- Adjust the relevance score of results.
- Filter or exclude documents without affecting their scores.
Types of Compound Queries in Elasticsearch
Elasticsearch offers several types of compound queries, each serving a specific purpose. Here's an overview:
1. bool Query
The bool query is the most widely used compound query. It allows you to combine multiple leaf or compound queries using Boolean logic.
Structure
{
  "query": {
    "bool": {
      "must": [ { "match": { "field": "value" } } ],
      "filter": [ { "term": { "status": "active" } } ],
      "should": [ { "range": { "age": { "gte": 30 } } } ],
      "must_not": [ { "term": { "banned": true } } ]
    }
  }
}
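Explanation:
- must: Every clause here must match, and each one contributes to the relevance score.
- filter: Every clause here must match, but filter clauses don't affect scoring.
- should: Optional clauses that raise the score of documents matching them. If the query has no must or filter clause, at least one should clause has to match; otherwise none is required unless you set minimum_should_match.
- must_not: Excludes documents matching these conditions. Like filter, these clauses don't affect scoring.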
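The structure above can also be assembled programmatically. The following is a minimal sketch in plain Python that builds the request body as a dictionary; the helper name build_bool_query is hypothetical and not part of any Elasticsearch client. It also sets minimum_should_match, a bool query parameter that makes the number of required should clauses explicit.

```python
import json


def build_bool_query(must=None, filter_=None, should=None, must_not=None,
                     minimum_should_match=None):
    """Assemble a bool query body, leaving out clause types that are empty.

    Hypothetical helper for illustration; it only builds a dictionary.
    """
    bool_body = {}
    for name, clauses in (("must", must), ("filter", filter_),
                          ("should", should), ("must_not", must_not)):
        if clauses:
            bool_body[name] = list(clauses)
    if minimum_should_match is not None:
        bool_body["minimum_should_match"] = minimum_should_match
    return {"query": {"bool": bool_body}}


# Same clauses as the structure above, but requiring one should clause.
body = build_bool_query(
    must=[{"match": {"field": "value"}}],
    filter_=[{"term": {"status": "active"}}],
    should=[{"range": {"age": {"gte": 30}}}],
    must_not=[{"term": {"banned": True}}],
    minimum_should_match=1,
)
print(json.dumps(body, indent=2))
```

The resulting dictionary can be sent as the body of a search request by whichever client you use.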
2. constant_score Query
The constant_score query wraps a filter and ignores relevance scoring for it: every matching document receives the same score, equal to the boost value (1.0 by default). This is useful when you only care whether a document matches, not how well.
Structure
{
  "query": {
    "constant_score": {
      "filter": {
        "term": { "status": "active" }
      },
      "boost": 1.5
    }
  }
}
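To make the effect concrete, here is a small sketch in plain Python. The first helper builds the request body shown above; the second is a toy in-memory stand-in for the term filter, not Elasticsearch code, and only illustrates that every hit ends up with the same score. Both helper names and the sample documents are assumptions made for this example.

```python
def constant_score_query(filter_clause, boost=1.0):
    """Wrap a filter clause so that every hit receives the boost as its score."""
    return {"query": {"constant_score": {"filter": filter_clause, "boost": boost}}}


def toy_constant_scores(docs, field, value, boost=1.0):
    """Toy model of a term filter: each matching document scores exactly `boost`."""
    return {doc["id"]: boost for doc in docs if doc.get(field) == value}


# Hypothetical sample documents.
docs = [
    {"id": 1, "status": "active"},
    {"id": 2, "status": "inactive"},
    {"id": 3, "status": "active"},
]

body = constant_score_query({"term": {"status": "active"}}, boost=1.5)
scores = toy_constant_scores(docs, "status", "active", boost=1.5)
print(scores)  # documents 1 and 3 share the score 1.5; document 2 is absent
```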
3. dis_max Query
The dis_max (disjunction max) query returns documents that match at least one of the provided queries. Each document is scored by its best-matching query rather than by the sum of all matching queries.
Structure
{
  "query": {
    "dis_max": {
      "queries": [
        { "match": { "title": "Elasticsearch" } },
        { "match": { "description": "Elasticsearch" } }
      ],
      "tie_breaker": 0.3
    }
  }
}
Explanation:
- tie_breaker: A value between 0 and 1 that controls how much the weaker matches add to the final score. The default, 0, counts only the best match.
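The scoring rule can be written out as arithmetic: the final score is the highest subquery score plus tie_breaker times the scores of the other matching subqueries. The sketch below models that rule in plain Python; the function name and the sample scores are assumptions for illustration, and None stands for a subquery that did not match the document.

```python
def dis_max_score(subquery_scores, tie_breaker=0.0):
    """Combine per-subquery scores the way dis_max does.

    subquery_scores: one score per subquery, or None where it did not match.
    Returns None when no subquery matched (the document is not a hit).
    """
    matching = [s for s in subquery_scores if s is not None]
    if not matching:
        return None
    best = max(matching)
    # Every matching subquery other than the best one is scaled by tie_breaker.
    return best + tie_breaker * (sum(matching) - best)


# Hypothetical scores: title matched with 2.0, description with 1.0.
print(dis_max_score([2.0, 1.0], tie_breaker=0.3))  # 2.0 + 0.3 * 1.0
print(dis_max_score([2.0, 1.0]))                   # best match only
```

With tie_breaker set to 1, the result is the plain sum of all matching subquery scores.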
4. function_score Query
The function_score query lets you modify the relevance score of matching documents using custom functions.
Structure
{
  "query": {
    "function_score": {
      "query": { "match": { "title": "Elasticsearch" } },
      "functions": [
        {
          "weight": 2,
          "filter": { "term": { "category": "tech" } }
        },
        {
          "gauss": {
            "created_at": {
              "origin": "now",
              "scale": "10d",
              "decay": 0.5
            }
          }
        }
      ],
      "boost_mode": "multiply"
    }
  }
}
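To see what this example does to a score, the sketch below models it in plain Python. It is a simplification under stated assumptions: document age is given directly in days instead of being derived from created_at, score_mode is left at its default of multiply (so the function results are multiplied together), and the function names are hypothetical. The Gaussian decay is defined so that a document exactly one scale away from the origin is multiplied by decay.

```python
import math


def gauss_decay(distance, scale, decay=0.5, offset=0.0):
    """Gaussian decay: 1.0 at the origin, exactly `decay` at `scale` distance."""
    sigma_sq = -scale ** 2 / (2.0 * math.log(decay))
    d = max(0.0, abs(distance) - offset)
    return math.exp(-(d ** 2) / (2.0 * sigma_sq))


def modeled_function_score(query_score, in_tech_category, age_days):
    """Model of the example: weight 2 for tech docs, times a 10-day gauss decay.

    The functions are multiplied together (default score_mode), then
    multiplied with the query score (boost_mode "multiply").
    """
    factor = gauss_decay(age_days, scale=10.0, decay=0.5)
    if in_tech_category:
        factor *= 2.0  # the weight function applies only when its filter matches
    return query_score * factor


# Hypothetical documents that all scored 3.0 on the match query.
for tech, age in [(True, 0.0), (True, 10.0), (False, 10.0), (False, 30.0)]:
    print(tech, age, round(modeled_function_score(3.0, tech, age), 4))
```

In this model a ten-day-old tech document keeps its original score, because the weight of 2 and the decay of 0.5 cancel out.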
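5. rescore
Strictly speaking, rescore is not a query type but a top-level parameter of the search request, alongside query. It refines the results of a primary query by running a secondary, usually more expensive, query on only the top N results (window_size) from each shard.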
Structure
{
  "query": {
    "match": { "content": "Elasticsearch" }
  },
  "rescore": {
    "window_size": 50,
    "query": {
      "rescore_query": {
        "match_phrase": { "content": "Elasticsearch compound queries" }
      },
      "query_weight": 0.7,
      "rescore_query_weight": 2
    }
  }
}
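The two weights determine how the scores are combined for documents inside the window. The sketch below models that step in plain Python, assuming the default score_mode of total (a weighted sum); the function names and sample hits are hypothetical, and None stands for a document that does not match the rescore query. It deliberately covers only the documents inside the window.

```python
def combined_score(original, rescore, query_weight=0.7, rescore_query_weight=2.0):
    """Weighted sum of the original and rescore scores (score_mode "total")."""
    weighted_original = original * query_weight
    if rescore is None:
        # No match on the rescore query: only the weighted original score remains.
        return weighted_original
    return weighted_original + rescore * rescore_query_weight


def rerank_window(window_hits):
    """Re-sort the hits inside the rescore window by their combined score.

    window_hits: list of (doc_id, original_score, rescore_score_or_None).
    """
    rescored = [(doc_id, combined_score(orig, resc))
                for doc_id, orig, resc in window_hits]
    return sorted(rescored, key=lambda hit: hit[1], reverse=True)


# Hypothetical window: "a" ranks first originally but lacks the exact phrase.
window = [("a", 4.0, None), ("b", 3.0, 1.5)]
reranked = rerank_window(window)
print(reranked)  # "b" moves ahead of "a" after rescoring
```

Keeping window_size small limits the cost of the secondary query, since only those top hits on each shard are rescored.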
Conclusion
Compound queries in Elasticsearch are incredibly powerful tools for crafting sophisticated search logic. Whether you're combining filters with bool, fine-tuning relevance with function_score, or prioritizing results with rescore, understanding these queries can significantly enhance your Elasticsearch capabilities.
By applying the strategies and examples outlined in this guide, you’ll be able to create efficient, scalable, and maintainable queries that meet your application's unique requirements.