Elasticsearch is well-known for its full-text search capabilities, but it also excels at handling structured data, such as numerical values, exact terms, and IDs. When working with data that doesn't require analysis or tokenization, term-level queries are the go-to choice for achieving precise matching.
In this blog post, we’ll dive into what term-level queries are, their key use cases, and how to effectively utilize them in Elasticsearch.
What Are Term-Level Queries?
Term-level queries are Elasticsearch’s method for finding exact matches in a dataset. Unlike full-text queries, which rely on text analysis and tokenization, term-level queries directly search for the provided value as-is.
These queries are best suited for:
- Structured fields like keywords, dates, numbers, and boolean values.
- Fields where exact matching is critical (e.g., IDs or tags).
- Scenarios where data is already normalized or analyzed before indexing.
For instance, if you want to find a document with the exact tag "elasticsearch-tutorial," a term-level query ensures no partial matches like "elasticsearch" or "tutorial" are included.
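To make the difference concrete, here is a minimal Python sketch of why an exact lookup behaves differently on analyzed and unanalyzed values. The simple_analyze function is a hypothetical stand-in that only lowercases and splits on non-alphanumeric characters. Elasticsearch's real analyzers do more, so treat this as an illustration rather than the actual implementation.

```python
import re


def simple_analyze(text):
    """Hypothetical stand-in for a text analyzer: lowercase, split on non-alphanumerics."""
    return [token for token in re.split(r"[^0-9a-z]+", text.lower()) if token]


def keyword_terms(value):
    """A keyword-style field keeps the whole value as a single term."""
    return [value]


def term_lookup(indexed_terms, value):
    """Exact comparison: the search value itself is not analyzed."""
    return value in indexed_terms


tag = "elasticsearch-tutorial"

# Stored as a single term: only the full tag matches.
print(term_lookup(keyword_terms(tag), "elasticsearch-tutorial"))   # True
print(term_lookup(keyword_terms(tag), "elasticsearch"))            # False

# Stored as analyzed tokens: the full tag no longer exists as one term.
print(simple_analyze(tag))                                         # ['elasticsearch', 'tutorial']
print(term_lookup(simple_analyze(tag), "elasticsearch-tutorial"))  # False
```

The last line shows the usual pitfall: an exact lookup against an analyzed field can miss a value that looks identical to what was indexed.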
Common Use Cases for Term-Level Queries
- Searching for Exact Terms: Ideal for fields like user IDs, product SKUs, or categories.
- Matching Numeric Data: Efficient for ranges or exact values in price, age, or other numerical fields.
- Boolean Searches: Quickly filter documents based on true/false flags.
- Aggregations: Precise grouping and counting of data based on terms or IDs.
Types of Term-Level Queries
Elasticsearch offers several term-level query types, each designed for specific matching needs. Let’s explore the most commonly used ones:
Term Query
The simplest term-level query searches for documents with an exact match to the provided value.
Example:
{
  "query": {
    "term": {
      "status": {
        "value": "active"
      }
    }
  }
}
This query retrieves documents where the status field has the exact value "active".
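If you assemble queries in application code, a small helper can keep the request shape consistent. The sketch below uses a hypothetical term_query function that only builds the request body as a Python dictionary. Sending it is left to whichever client you use.

```python
import json


def term_query(field, value):
    """Build a search body containing one term query in its expanded form."""
    return {"query": {"term": {field: {"value": value}}}}


body = term_query("status", "active")
print(json.dumps(body, indent=2))
```

The printed JSON has the same structure as the example above, so it can be pasted into a search request as-is.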
Terms Query
The terms query allows you to search for multiple exact values in a field.
Example:
{
  "query": {
    "terms": {
      "category": ["electronics", "books", "furniture"]
    }
  }
}
This query matches documents with category values of "electronics", "books", or "furniture".
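The matching rule can be sketched in plain Python: a document qualifies if at least one of its values for the field equals one of the listed values. The matches_terms function and the sample documents are hypothetical, and the sketch assumes a keyword field with default, case-sensitive matching.

```python
def matches_terms(doc, field, wanted):
    """Return True if any value of doc[field] equals one of the wanted values."""
    values = doc.get(field, [])
    if not isinstance(values, list):
        values = [values]  # treat a single value like a one-element array
    return any(value in wanted for value in values)


docs = [
    {"id": 1, "category": "books"},
    {"id": 2, "category": ["toys", "electronics"]},
    {"id": 3, "category": "Books"},  # different case, so not an exact match
    {"id": 4},                       # field missing
]

wanted = ["electronics", "books", "furniture"]
print([doc["id"] for doc in docs if matches_terms(doc, "category", wanted)])  # [1, 2]
```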
Range Query
When working with numerical data, dates, or other range-based fields, the range query lets you define inclusive bounds (gte, lte) or exclusive bounds (gt, lt).
Example:
{
  "query": {
    "range": {
      "price": {
        "gte": 100,
        "lte": 500
      }
    }
  }
}
This query retrieves documents with price values between 100 and 500, inclusive.
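Inclusive and exclusive bounds can be mixed in one query. The following sketch uses a hypothetical range_query helper that accepts only the four bound operators (gte, gt, lte, lt) and rejects anything else. The range query supports further options that this helper deliberately leaves out.

```python
import json


def range_query(field, **bounds):
    """Build a range query; bounds may use gte, gt, lte, or lt."""
    allowed = {"gte", "gt", "lte", "lt"}
    unknown = set(bounds) - allowed
    if unknown:
        raise ValueError(f"unsupported bound(s): {sorted(unknown)}")
    return {"query": {"range": {field: bounds}}}


# 100 <= price < 500: lower bound inclusive, upper bound exclusive.
print(json.dumps(range_query("price", gte=100, lt=500), indent=2))
```

Rejecting unknown keys early catches typos such as "gtee" before the request is sent.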
Exists Query
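The exists query checks whether a document contains an indexed value for a field. Any non-null value counts, including an empty string, but a field set to null or to an empty array is treated as missing.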
Example:
{
  "query": {
    "exists": {
      "field": "description"
    }
  }
}
This query returns documents where the description field exists.
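What counts as "existing" is easy to get wrong, so here is a rough Python approximation of the check. The has_indexed_value function is hypothetical and covers only the document values themselves: it ignores mapping settings (for example, fields that are not indexed) and assumes no null_value is configured for the field.

```python
def has_indexed_value(doc, field):
    """Approximate the exists check: null and [] count as missing."""
    if field not in doc:
        return False
    value = doc[field]
    values = value if isinstance(value, list) else [value]
    # Nulls carry no indexed value; anything else (even "") does.
    return any(item is not None for item in values)


docs = {
    "a": {"description": "A short text"},
    "b": {"description": ""},              # empty string still counts
    "c": {"description": None},            # null: treated as missing
    "d": {"description": []},              # empty array: treated as missing
    "e": {"description": [None, "text"]},  # one real value is enough
    "f": {},                               # field absent
}

print([name for name, doc in docs.items() if has_indexed_value(doc, "description")])
# ['a', 'b', 'e']
```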
Prefix Query
The prefix query matches terms starting with a specified prefix, making it useful for autocompletion features.
Example:
{
  "query": {
    "prefix": {
      "title": {
        "value": "ela"
      }
    }
  }
}
This query matches documents with title values like "elastic", "elasticsearch", or "elaborate".
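As a rough mental model, a prefix match compares the start of each indexed term with the given prefix. The sketch below uses a hypothetical prefix_matches function over a made-up list of terms and assumes default, case-sensitive matching.

```python
def prefix_matches(terms, prefix):
    """Return the indexed terms that start with the given prefix (case-sensitive)."""
    return [term for term in terms if term.startswith(prefix)]


titles = ["elastic", "elasticsearch", "elaborate", "Elastic", "relate"]
print(prefix_matches(titles, "ela"))  # ['elastic', 'elasticsearch', 'elaborate']
```

Note that "Elastic" is skipped because of its capital letter, and "relate" is skipped because "ela" appears in the middle rather than at the start.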
Wildcard Query
The wildcard query enables pattern matching using * (zero or more characters) and ? (any single character).
Example:
{
  "query": {
    "wildcard": {
      "username": {
        "value": "user*"
      }
    }
  }
}
This query matches username values like "user1", "user123", or "user_name".
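One way to reason about a wildcard pattern is to translate it into a regular expression and test it against sample terms before running it on a cluster. The helpers below (wildcard_to_regex and wildcard_matches) are hypothetical and only mimic the two wildcard characters. They are not how Elasticsearch evaluates the query internally.

```python
import re


def wildcard_to_regex(pattern):
    """Translate * and ? into a regular expression; all other characters are literal."""
    parts = []
    for char in pattern:
        if char == "*":
            parts.append(".*")  # zero or more characters
        elif char == "?":
            parts.append(".")   # exactly one character
        else:
            parts.append(re.escape(char))
    return re.compile("".join(parts), re.DOTALL)


def wildcard_matches(pattern, term):
    """The pattern must cover the whole term, not just part of it."""
    return wildcard_to_regex(pattern).fullmatch(term) is not None


for name in ["user1", "user123", "user_name", "user", "superuser"]:
    print(name, wildcard_matches("user*", name))
```

Here "user" matches because * also accepts zero characters, while "superuser" does not because the pattern has no leading wildcard.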
Regexp Query
For more complex pattern matching, the regexp query uses regular expressions. The pattern must match the entire term, so no explicit start or end anchors are needed.
Example:
{
  "query": {
    "regexp": {
      "sku": "prod[0-9]{3}"
    }
  }
}
This query matches sku values like "prod001" or "prod123".
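Because the regexp query matches against the whole term, a value with extra characters before or after the pattern does not match. The sketch below imitates that behavior with Python's re.fullmatch. Python's regular expression syntax differs from Lucene's in places, so this is only a stand-in for simple patterns like this one, and sku_matches is a hypothetical helper.

```python
import re

# Python's re syntax stands in for Lucene's regular expression syntax here.
SKU_PATTERN = re.compile(r"prod[0-9]{3}")


def sku_matches(term):
    """Return True only if the pattern covers the entire term."""
    return SKU_PATTERN.fullmatch(term) is not None


for sku in ["prod001", "prod123", "prod0012", "xprod001", "prod12"]:
    print(sku, sku_matches(sku))
```

Only the first two values pass: the others have one digit too many, a leading character, or one digit too few.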
Combining Term-Level Queries
Term-level queries can be combined with other queries using Elasticsearch’s bool query for complex search scenarios.
Example: Combining term and range Queries
{
  "query": {
    "bool": {
      "must": [
        { "term": { "status": "active" } },
        { "range": { "price": { "gte": 100 } } }
      ]
    }
  }
}
This query retrieves documents where status is "active" and price is greater than or equal to 100.
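When the conditions only need to include or exclude documents and should not influence ranking, the same clauses can go under the filter key of the bool query instead of must. A minimal sketch, using a hypothetical filtered_search helper that only builds the request body:

```python
import json


def filtered_search(*clauses):
    """Wrap term-level clauses in bool.filter so they select documents without scoring."""
    return {"query": {"bool": {"filter": list(clauses)}}}


body = filtered_search(
    {"term": {"status": "active"}},
    {"range": {"price": {"gte": 100}}},
)
print(json.dumps(body, indent=2))
```

The matching documents are the same as with must. The difference is that filter clauses do not contribute to the relevance score.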
Optimizing Term-Level Queries
- Use Keyword Fields: Map fields that need exact matching as keyword so their values are indexed without tokenization.
- Minimize Wildcard Usage: Wildcard and regexp queries are resource-intensive. Use them sparingly, and avoid patterns that begin with * or ?, since these force Elasticsearch to check many more terms.
- Leverage Filters: Use term-level queries in filter context for better performance, since filter clauses skip relevance scoring and their results can be cached.
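For the keyword recommendation, a common arrangement is a text field with a keyword sub-field, so the same value supports both full-text and exact matching. The mapping below is a sketch with hypothetical field names (title, status), written as Python dictionaries that serialize to the JSON bodies Elasticsearch expects.

```python
import json

# Index mapping: "title" is analyzed for full-text search, while the
# "title.keyword" sub-field keeps the original value as a single term.
mapping = {
    "mappings": {
        "properties": {
            "title": {
                "type": "text",
                "fields": {"keyword": {"type": "keyword"}},
            },
            "status": {"type": "keyword"},
        }
    }
}

# Term-level queries then target the keyword sub-field, not the text field.
query = {"query": {"term": {"title.keyword": {"value": "Elasticsearch Basics"}}}}

print(json.dumps(mapping, indent=2))
print(json.dumps(query, indent=2))
```

With this layout, full-text queries can use title while term-level queries use title.keyword or status.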
When to Avoid Term-Level Queries
While term-level queries are highly efficient for exact matching, they are not suitable for:
- Analyzed text fields, where tokenization or stemming is required.
- Scenarios requiring relevance scoring (use full-text queries instead).
Conclusion
Term-level queries are an essential tool in Elasticsearch for handling structured data and exact matches. Whether you're filtering products by category, searching for specific IDs, or performing range queries on numerical fields, term-level queries provide the precision and performance needed for these tasks.
By understanding and leveraging these queries, you can build robust search functionalities tailored to your application's structured data requirements. Experiment with term-level queries in your Elasticsearch projects and unlock their full potential.