How to Query Specific Fields from an Elasticsearch Index in Spring Boot

Combining the power of Spring Boot and Elasticsearch offers robust capabilities for building real-time, search-driven applications. Yet many developers, especially those starting with Elasticsearch, encounter a common question early on: How can I retrieve only the fields I need from a specific index?

This blog post provides a comprehensive walkthrough on how to query only selected fields—such as @timestamp, message, and agent.id—from an Elasticsearch index using a Spring Boot application. You’ll learn not only the technical implementation using _source filtering but also best practices for improving performance, minimizing response size, and structuring your Java code with precision.


Why Selecting Fields in Elasticsearch Matters

By default, Elasticsearch returns the full document in response to a query. However, in real-world applications, it’s rarely necessary to retrieve every single field. For instance, when displaying log entries in a UI, fields like @timestamp, message, and agent.id are often sufficient. Selecting only the required fields offers several advantages:

  • Reduced Network Overhead: Smaller payloads mean faster transfers and less congestion.
  • Improved Response Times: Less data equals quicker Elasticsearch processing and client-side rendering.
  • Lower Memory Footprint: Applications don’t need to deserialize or process unneeded fields.

Elasticsearch supports this selective retrieval via the _source meta field. This field stores the original JSON document and allows clients to filter which parts they want to retrieve during search responses.

In the following section, we’ll configure Spring Boot to connect with Elasticsearch, laying the groundwork for querying specific fields efficiently.


Setting Up Spring Boot with Elasticsearch

To interact with Elasticsearch from a Spring Boot application, you need to configure a few essential components. Spring Data Elasticsearch offers a convenient abstraction over low-level Elasticsearch client APIs. This section walks you through dependency setup, connection configuration, and validation.

1. Add Dependencies

Include the following dependency in your pom.xml file if you’re using Maven:


<dependency>
  <groupId>org.springframework.boot</groupId>
  <artifactId>spring-boot-starter-data-elasticsearch</artifactId>
</dependency>

Make sure the versions of Spring Boot and Spring Data Elasticsearch are compatible with the version of your Elasticsearch cluster. Version mismatches are one of the most common setup pitfalls.

2. Define Connection Settings

Configure Elasticsearch connectivity in your application.yml file:

spring:
  elasticsearch:
    # Property name for Spring Boot 2.6+ and 3.x; earlier 2.x releases use spring.elasticsearch.rest.uris
    uris: http://localhost:9200

This sets up a connection to a locally running Elasticsearch instance. If you are using authentication, SSL, or a cloud-hosted solution (e.g., AWS OpenSearch, Elastic Cloud), adjust the configuration accordingly.

3. Validate the Connection

After configuring the connection, it’s wise to verify whether your application is properly communicating with Elasticsearch. Use the following snippet in a service or @PostConstruct method to check index existence:

@Autowired
private ElasticsearchOperations elasticsearchOperations;

@PostConstruct
public void validateConnection() {
    boolean exists = elasticsearchOperations.indexOps(IndexCoordinates.of("log-index")).exists();
    System.out.println("Index 'log-index' exists: " + exists);
}
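If the check fails and you want to rule out Spring configuration as the cause, the same test can be made against the REST API directly: a HEAD request to the index is answered with 200 when it exists and 404 when it does not. The sketch below uses only the JDK's java.net.http client (Java 11+). The class name IndexExistsCheck, the base URL, and the absence of authentication are assumptions made for illustration.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical helper, not part of Spring or the Elasticsearch client.
final class IndexExistsCheck {

    // Builds "HEAD {baseUrl}/{index}", which Elasticsearch answers with 200 or 404.
    static HttpRequest buildRequest(String baseUrl, String index) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/" + index))
            .method("HEAD", HttpRequest.BodyPublishers.noBody())
            .build();
    }

    // 200 means the index exists, 404 means it does not; anything else is unexpected.
    static boolean interpret(int statusCode) {
        if (statusCode == 200) return true;
        if (statusCode == 404) return false;
        throw new IllegalStateException("Unexpected status: " + statusCode);
    }

    // Sends the request; assumes an unsecured cluster reachable at baseUrl.
    static boolean exists(HttpClient client, String baseUrl, String index)
            throws IOException, InterruptedException {
        HttpResponse<Void> response =
            client.send(buildRequest(baseUrl, index), HttpResponse.BodyHandlers.discarding());
        return interpret(response.statusCode());
    }
}
```

Calling IndexExistsCheck.exists(HttpClient.newHttpClient(), "http://localhost:9200", "log-index") should agree with the result of the @PostConstruct check when both point at the same cluster.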

Now that the groundwork is complete, you’re ready to explore how to query specific fields in Elasticsearch using the _source filter in native queries.


Understanding _source Field Filtering

Elasticsearch stores the original JSON representation of each document in a special meta field called _source. This allows clients to retrieve exactly the fields they need, reducing the size of responses and improving performance. In this section, you’ll learn how _source filtering works and how to use it effectively.

1. What is the _source Field?

The _source field contains the raw document that was originally indexed. When a query is executed, Elasticsearch returns this entire document unless otherwise specified. To avoid unnecessary data transfer, you can limit the returned fields by specifying a subset using the _source parameter.

Here’s a simple example of retrieving only @timestamp, message, and agent.id from an index called log-index:

GET /log-index/_search
{
  "_source": ["@timestamp", "message", "agent.id"],
  "query": {
    "match_all": {}
  }
}

This instructs Elasticsearch to include only the specified fields in the response, discarding all others.
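When you call the REST API from Java without Spring, the field list usually arrives as a collection rather than a hand-written JSON literal. As a minimal sketch, the hypothetical helper below renders the same request body from a list of field names. It assumes the names contain no characters that need JSON escaping; real code would more likely use a JSON library.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper for illustration only.
final class SourceFilterBody {

    // Renders {"_source":[...],"query":{"match_all":{}}} for the given field names.
    // Assumption: names need no JSON escaping (no quotes or backslashes).
    static String matchAllWithSource(List<String> fields) {
        String quoted = fields.stream()
            .map(f -> "\"" + f + "\"")
            .collect(Collectors.joining(","));
        return "{\"_source\":[" + quoted + "],\"query\":{\"match_all\":{}}}";
    }

    public static void main(String[] args) {
        System.out.println(matchAllWithSource(List.of("@timestamp", "message", "agent.id")));
    }
}
```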

2. Using includes and excludes

For more granular control, you can use includes and excludes arrays to define which fields to return or omit:

GET /log-index/_search
{
  "_source": {
    "includes": ["@timestamp", "message", "agent.id"],
    "excludes": ["host.name"]
  },
  "query": {
    "match_all": {}
  }
}

This approach is especially useful when working with nested documents or when you want to exclude large, rarely used fields such as full-text logs, metadata, or debug traces.
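To build intuition for how includes and excludes interact, it can help to model them in plain Java. The sketch below is a simplified model, not Elasticsearch's implementation: it treats a document as nested maps, keeps a leaf when an include pattern names it or one of its parent objects, and then drops it if an exclude pattern does the same. Wildcards are deliberately left out, and the class name SourceFilterModel is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified model of includes/excludes semantics; no wildcard support.
final class SourceFilterModel {

    // A pattern matches a leaf path if it names the leaf itself or a parent object.
    static boolean matches(String pattern, String path) {
        return path.equals(pattern) || path.startsWith(pattern + ".");
    }

    // Empty includes means "everything"; excludes always win over includes.
    static boolean keep(String path, List<String> includes, List<String> excludes) {
        boolean included = includes.isEmpty()
            || includes.stream().anyMatch(p -> matches(p, path));
        boolean excluded = excludes.stream().anyMatch(p -> matches(p, path));
        return included && !excluded;
    }

    // Returns a copy of doc that contains only the leaves passing the filter.
    static Map<String, Object> filter(Map<String, Object> doc, String prefix,
                                      List<String> includes, List<String> excludes) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : doc.entrySet()) {
            String path = prefix.isEmpty() ? e.getKey() : prefix + "." + e.getKey();
            if (e.getValue() instanceof Map) {
                @SuppressWarnings("unchecked")
                Map<String, Object> child =
                    filter((Map<String, Object>) e.getValue(), path, includes, excludes);
                if (!child.isEmpty()) out.put(e.getKey(), child);
            } else if (keep(path, includes, excludes)) {
                out.put(e.getKey(), e.getValue());
            }
        }
        return out;
    }
}
```

Calling filter(doc, "", includes, excludes) with the lists from the request above leaves @timestamp, message, and an agent object holding only id. Passing "agent" as the only include keeps the whole agent object.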

3. Dealing with Nested Structures

Elasticsearch documents can contain deeply nested structures. Fields like agent.id often reside within nested or object fields. In such cases, ensure you specify the full path when filtering:

{
  "@timestamp": "2025-04-01T12:00:00Z",
  "message": "System reboot detected",
  "agent": {
    "id": "def456",
    "name": "metricbeat"
  }
}

To retrieve agent.id, specify the full path "agent.id" in the _source filter. A misspelled path matches nothing, so the field is silently absent from the response, while a partial path such as "agent" returns the whole agent object rather than just the ID.
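A dotted path describes a walk through nested JSON objects, one key per segment. The sketch below models that walk over a document held as nested maps, which is also how you would read a value from a raw source map on the client side. SourcePath and resolve are hypothetical names used only for this illustration.

```java
import java.util.Map;

// Hypothetical helper for illustration only.
final class SourcePath {

    // Follows a dotted path such as "agent.id" through nested maps.
    // Returns null as soon as a segment is missing or a non-object is reached.
    static Object resolve(Map<String, Object> source, String path) {
        Object current = source;
        for (String part : path.split("\\.")) {
            if (!(current instanceof Map)) return null;
            current = ((Map<?, ?>) current).get(part);
        }
        return current;
    }
}
```

With the sample document above, resolve(source, "agent.id") yields the ID, resolve(source, "agent") yields the whole agent object, and a misspelled path such as "agentId" yields null.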

With this foundational understanding of _source filtering, you’re ready to see how Spring Data Elasticsearch supports this functionality through Java APIs and native query builders.


Writing Field-Selective Queries with Spring Data Elasticsearch

Now that you understand how _source filtering works in raw Elasticsearch queries, let’s explore how to implement the same in a Spring Boot application using Spring Data Elasticsearch. This section covers native query construction, executing the query, and mapping the response to a custom DTO.

1. Building a NativeSearchQuery with Field Filters

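To specify which fields to include in the search response, use NativeSearchQueryBuilder along with FetchSourceFilter. NativeSearchQueryBuilder belongs to the Spring Data Elasticsearch 4.x API; later major versions replace it with NativeQuery, so adapt the class names to the release you are on. Here’s how you can build a query to retrieve only @timestamp, message, and agent.id: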

NativeSearchQuery searchQuery = new NativeSearchQueryBuilder()
    .withQuery(QueryBuilders.matchAllQuery())
    .withSourceFilter(new FetchSourceFilter(
        new String[] { "@timestamp", "message", "agent.id" },
        null
    ))
    .build();

This query structure is internally translated into the JSON format shown earlier, allowing Elasticsearch to return only the fields you explicitly requested.

2. Executing the Query with ElasticsearchOperations

Once the query is built, you can execute it using ElasticsearchOperations or ElasticsearchRestTemplate. Here’s an example using ElasticsearchOperations:

@Autowired
private ElasticsearchOperations elasticsearchOperations;

SearchHits<LogDocument> hits = elasticsearchOperations.search(
    searchQuery,
    LogDocument.class,
    IndexCoordinates.of("log-index")
);

List<LogDocument> results = hits.stream()
    .map(SearchHit::getContent)
    .collect(Collectors.toList());

This will return a list of LogDocument objects, with only the fields specified in the _source filter populated.

3. Designing a DTO to Receive Selected Fields

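It’s good practice to define a dedicated DTO that includes only the fields you expect from the query. This keeps deserialization cheap and avoids carrying unused properties in memory. Note that Spring Data Elasticsearch 4.0 and later read search hits with their own mapping converter rather than Jackson, so @JsonProperty has no effect here. Use @Field(name = ...) for keys such as @timestamp that are not valid Java identifiers, and model agent as a nested object so that agent.id follows the structure of the JSON:

@Data
public class LogDocument {

    // "@timestamp" is not a valid Java identifier, so map it explicitly
    @Field(name = "@timestamp")
    private String timestamp;

    private String message;

    // Mirrors the "agent" object in _source; agent.id becomes getAgent().getId()
    private Agent agent;

    @Data
    public static class Agent {
        private String id;
    }
}

Modeling nested objects as nested classes in this way also keeps the DTO readable as the JSON structure gets deeper.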

In the next section, we’ll tie all these pieces together in a full end-to-end example to demonstrate how it all works in a real Spring Boot application.


Code Example: Retrieving @timestamp, message, agent.id

Let’s bring everything together into a complete, working code example. We’ll define the necessary DTO, write a service class to execute the query, and expose an API endpoint via a REST controller.

1. Define the DTO Class

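The DTO represents only the fields we want to retrieve from Elasticsearch. Because Spring Data Elasticsearch 4.0 and later do not use Jackson to read search hits, the @timestamp key is mapped with @Field(name = ...) and agent.id is modeled as a nested object.

@Data
@NoArgsConstructor
@AllArgsConstructor
public class LogDocument {

    // "@timestamp" is not a valid Java identifier, so map it explicitly
    @Field(name = "@timestamp")
    private String timestamp;

    private String message;

    // Mirrors the "agent" object in _source; agent.id becomes getAgent().getId()
    private Agent agent;

    @Data
    @NoArgsConstructor
    @AllArgsConstructor
    public static class Agent {
        private String id;
    }
}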

2. Implement the Search Service

The service layer constructs the query and interacts with Elasticsearch. Here’s how to build and execute a filtered query using ElasticsearchOperations:

@Service
public class LogSearchService {

    @Autowired
    private ElasticsearchOperations elasticsearchOperations;

    public List<LogDocument> fetchSelectedLogFields() {
        NativeSearchQuery query = new NativeSearchQueryBuilder()
            .withQuery(QueryBuilders.matchAllQuery())
            .withSourceFilter(new FetchSourceFilter(
                new String[] { "@timestamp", "message", "agent.id" },
                null
            ))
            .build();

        SearchHits<LogDocument> searchHits = elasticsearchOperations.search(
            query,
            LogDocument.class,
            IndexCoordinates.of("log-index")
        );

        return searchHits.stream()
            .map(SearchHit::getContent)
            .collect(Collectors.toList());
    }
}

3. Create a REST Controller

Now expose the search functionality through a RESTful endpoint. This allows external systems or frontend applications to fetch only the required log fields.

@RestController
@RequestMapping("/logs")
public class LogController {

    @Autowired
    private LogSearchService logSearchService;

    @GetMapping("/fields")
    public ResponseEntity<List<LogDocument>> getFilteredLogs() {
        List<LogDocument> logs = logSearchService.fetchSelectedLogFields();
        return ResponseEntity.ok(logs);
    }
}

With this implementation, a GET request to /logs/fields will return only the selected fields for each log document from the log-index. This setup is ideal for dashboards, monitoring tools, or APIs where minimal payload is essential.

Next, we’ll examine how these techniques contribute to system-wide performance improvements and offer practical optimization tips.


Performance Tips: Minimize Payload, Maximize Speed

Optimizing field selection in Elasticsearch queries isn’t just a matter of best practice—it’s a powerful way to improve overall system performance. This section outlines how smart querying can reduce response time, lower bandwidth consumption, and enhance application scalability.

1. Smaller Responses = Faster Applications

When you retrieve fewer fields, Elasticsearch returns a smaller payload, which directly translates to faster transmission over the network. Consider this comparison between full document retrieval and filtered field selection:

Request Type    | Payload Size | Explanation
Full document   | ~150KB       | Includes all fields, many of which may be unused
Filtered fields | ~12KB        | Returns only necessary fields like @timestamp, message

Reducing payload size leads to faster user experiences, especially in bandwidth-constrained environments such as mobile networks or server-to-server API calls.
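To put the two sizes in the table above into proportion, the arithmetic can be written out as a tiny helper. The byte counts are the approximate figures from the table, not measurements, and PayloadSavings is a hypothetical name.

```java
// Hypothetical helper for illustration only.
final class PayloadSavings {

    // Percentage by which the filtered payload is smaller than the full one.
    static double reductionPercent(long fullBytes, long filteredBytes) {
        if (fullBytes <= 0) {
            throw new IllegalArgumentException("fullBytes must be positive");
        }
        return 100.0 * (fullBytes - filteredBytes) / fullBytes;
    }

    public static void main(String[] args) {
        // Approximate figures from the table: ~150 KB full vs ~12 KB filtered.
        System.out.printf("%.1f%% smaller%n", reductionPercent(150_000, 12_000));
    }
}
```

For those figures the reduction works out to 92%. Measuring the real response sizes of your own index, with and without the filter, is the only way to know what you actually save.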

2. Use DTOs to Optimize Serialization

When you define a Data Transfer Object (DTO) that only contains the fields you need, you prevent unnecessary serialization and deserialization. This saves CPU cycles and avoids memory bloat caused by unused fields.

Moreover, avoiding direct use of JPA entities (especially those loaded with annotations and relationships) in Elasticsearch responses can reduce processing overhead and unintended coupling between layers of your application.

3. Be Mindful of Field Types

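Different field types have varying performance characteristics in Elasticsearch:

  • Keyword fields: Fast for sorting, exact-match filtering, and aggregations.
  • Text fields: Analyzed at index time for full-text search; not suited to sorting or aggregations by default.
  • Date fields: Efficient for range queries and aggregations.

These characteristics affect how you query, sort, and aggregate, not what _source returns. _source filtering always hands back the original JSON value, and multi-fields such as message.keyword are not part of _source. Request message for display, and use message.keyword only where you need exact matching, sorting, or aggregations.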

4. Avoid Fetching Large Fields (Unless Needed)

Fields like stack traces, binary blobs, or verbose logs can be substantial in size. Always exclude them from your _source when they’re not essential to the current view.

In short, field filtering is not just a technical detail—it’s a strategic tool for performance tuning. Next, we’ll look at some common issues developers encounter and how to troubleshoot them effectively.


Common Issues and How to Fix Them

While querying specific fields from Elasticsearch is conceptually simple, implementation in Spring Boot can introduce unexpected challenges. Let’s go through several common problems you might encounter—and how to solve them efficiently.

1. “No mapping found for field” Error

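This error typically occurs when a field referenced by the query itself, most often in a sort clause, does not exist in the index mapping. A wrong name in a _source filter does not raise an error; the field is simply missing from the response. The causes might include: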

  • Typos or incorrect case sensitivity in field names
  • Dot-separated fields not properly referenced in nested structures

Fix:

  • Use the GET /your-index/_mapping API to inspect actual field names and structure
  • Verify if the field is part of a nested type and adjust your query accordingly
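Once the mapping response has been parsed into nested maps (for example with a JSON library), listing every full field path makes typos easy to spot. The sketch below walks the "properties" object of a mapping and collects dotted paths. MappingPaths is a hypothetical helper, and it assumes the caller has already extracted the "properties" map from the response.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only.
final class MappingPaths {

    // Walks a parsed "properties" object and collects the dotted paths of leaf fields.
    // A field that has its own "properties" entry is treated as a parent object.
    @SuppressWarnings("unchecked")
    static List<String> fieldPaths(Map<String, Object> properties, String prefix) {
        List<String> paths = new ArrayList<>();
        for (Map.Entry<String, Object> e : properties.entrySet()) {
            String path = prefix.isEmpty() ? e.getKey() : prefix + "." + e.getKey();
            Map<String, Object> field = (Map<String, Object>) e.getValue();
            Object children = field.get("properties");
            if (children instanceof Map) {
                paths.addAll(fieldPaths((Map<String, Object>) children, path));
            } else {
                paths.add(path);
            }
        }
        return paths;
    }
}
```

For a mapping with @timestamp, message, and an agent object containing id and name, the result contains @timestamp, message, agent.id, and agent.name, which are exactly the strings to use in a _source filter.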

2. Included Fields Not Reflected in Response

Sometimes, even when using FetchSourceFilter or _source filters, the full document is still returned. This may be caused by:

  • Incorrect usage of ElasticsearchRepository (which doesn’t support fine-grained field filtering)
  • Library or version mismatch between Spring Data and Elasticsearch client

Fix:

  • Switch to ElasticsearchOperations or RestHighLevelClient
  • Ensure library versions are compatible with your Elasticsearch cluster

3. Dot Notation Fields Cause Mapping Failures

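Fields like agent.id use dot notation, which Java does not allow in variable names. Without special handling, the value is not mapped onto the DTO and the property stays null.

Fix:

  • Model the parent object as its own class so that the DTO mirrors the JSON structure, as in the example below
  • Map keys that are not valid Java identifiers, such as @timestamp, with @Field(name = "@timestamp"); Spring Data Elasticsearch 4.0 and later ignore @JsonProperty when reading search hits
@Data
public class Agent {
    private String id;
}

@Data
public class LogDocument {
    // "@timestamp" is not a valid Java identifier, so map it explicitly
    @Field(name = "@timestamp")
    private String timestamp;
    private String message;
    // Mirrors the "agent" object in _source
    private Agent agent;
}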

4. Nested Fields Not Filtered Correctly

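When working with fields inside objects, _source filtering may not behave as expected if the path in the filter does not match the structure of the stored JSON.

Fix:

  • Remember that _source filtering operates on the original JSON, so it works the same way whether the parent is mapped as object or nested; changing the mapping type will not bring back a missing field
  • Use precise field paths in the _source array, e.g., "agent.id", and compare them against a sample document or the output of GET /your-index/_mapping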

With these fixes in place, your application will reliably retrieve only the required fields, streamlining both performance and maintainability. In the final section, we’ll summarize the takeaways and offer guidance on extending your field filtering strategy further.


Conclusion: Precise Queries Shape Scalable Systems

Querying specific fields from an Elasticsearch index may seem like a minor optimization—but in scalable, real-world systems, it’s a critical performance strategy. By reducing payload size, lowering response latency, and eliminating unnecessary serialization, selective field querying directly improves both user experience and backend efficiency.

In this guide, you learned how to:

  • Understand and apply _source filtering in Elasticsearch queries
  • Use Spring Boot and Spring Data Elasticsearch to construct and execute field-specific searches
  • Handle nested fields, dot notation, and DTO mapping for cleaner and safer deserialization
  • Avoid common pitfalls and troubleshoot issues with practical, real-world solutions

“A well-targeted Elasticsearch query doesn’t just fetch data; it defines the boundaries of performance, precision, and purpose in your application.”

Now that you’ve mastered field selection, consider exploring advanced Elasticsearch features like:

  • Pagination: Efficiently manage large result sets with from and size parameters
  • Sorting: Combine field filtering with sorting on keyword or numeric fields
  • Conditional queries: Use term, match, or bool queries alongside field selection

These techniques, when combined, form the backbone of scalable search architectures in microservices, monitoring tools, and data platforms.
