Last updated on February 27th, 2025 at 07:16 am

Modern applications need to be fast, but constant trips to backend databases or APIs can grind performance to a halt. In this step-by-step guide to configuring caching with Istio & Envoy, you’ll learn how to store frequently accessed data closer to users, cutting delays and keeping your app responsive. By leveraging Istio’s service mesh and Envoy’s proxy capabilities, you’ll turn repetitive backend requests into cached responses, keeping your users happy and your systems efficient.

The good news? You don’t need to rewrite your code or overhaul your Kubernetes setup. With Istio and Envoy, you can bake caching directly into your service mesh, acting like a smart “traffic cop” that decides what to store, when to refresh it, and how to slash latency without breaking a sweat.

In this guide, I’ll walk you through setting up caching in Kubernetes using Istio and Envoy—step by step. You’ll learn how to:

  • Activate Envoy’s built-in caching features (spoiler: it’s easier than you think).
  • Define rules for what to cache and how long to keep it.
  • Test and tweak your setup like a pro, even if you’re new to service meshes. 

Prerequisites

Before diving in, make sure you have:

  • A running Kubernetes cluster (Minikube, EKS, GKE, etc.).
  • Istio installed (if not, run istioctl install --set profile=demo -y). The versions I am running:
    Client version: 1.23.3
    Control plane version: 1.24.2
    Data plane version: 1.24.2
  • A destination application (Service and pods) that is already running or that you can deploy during the setup.

Application Traffic Management

Istio simplifies traffic management in Kubernetes by letting you control how requests flow between services. Two key components for this are Gateway and VirtualService, which work together to manage external access and internal routing.

Istio Gateway: The Entry Point

Gateway acts as a “door” for traffic entering or leaving your service mesh. It defines rules for what external traffic is allowed in, similar to a security guard checking IDs at the entrance.

Key Configurations

  • Ports/Protocols: Specifies which ports (e.g., 80, 443) and protocols (HTTP, HTTPS, TLS) to listen on.
  • Hosts: The domains it handles (e.g., example.com).
  • TLS Settings: Configures SSL/TLS certificates for HTTPS traffic (terminates SSL here).

Example YAML

apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: my-webtutorials-gateway
  namespace: my-own-ns
spec:
  selector:
    istio: ingressgateway
  servers:
  - port:
      number: 80
      name: http
      protocol: HTTP
    hosts:
    - "webtutorials.dev"

Istio VirtualService: The Traffic Director

What it Does
VirtualService defines where incoming requests go once they’re inside the mesh. Think of it as a GPS for traffic, routing requests to specific services based on paths, headers, or other rules.

Key Features

  • Route Requests: Send traffic to different service versions (e.g., v1 vs. v2).
  • Advanced Routing: Split traffic, retry failed requests, inject faults (for testing).
  • Match Conditions: Route based on URI paths, headers, or query parameters.

Example YAML

apiVersion: networking.istio.io/v1
kind: VirtualService
metadata:
  name: my-webtutorials-virtual-service  # Name of the Istio VirtualService resource
  namespace: my-own-ns
spec:
  hosts:
  - "webtutorials.dev"  # Host for which this VirtualService configuration applies
  gateways:
  - "my-webtutorials-gateway"  # Gateway to associate with this VirtualService
  http:
  - route:
    - destination:
        host: "first-app-v1-webtutorials"  # Destination service for this route
        subset: "v1"  # Subset of the destination service
    headers:
      response:
        add:
          my-header-from: "virtualservice"

In the above example I am adding a custom response header with the key my-header-from and the value virtualservice. It also forwards the request to a destination service; in this case my service name is first-app-v1-webtutorials.

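The VirtualService routes to the subset v1, so we should also add a DestinationRule that defines this subset. Pods must carry the label version: v1 to be part of it.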

apiVersion: networking.istio.io/v1
kind: DestinationRule
metadata:
  name: reviews-webtutorials-destination
  namespace: my-own-ns
spec:
  host: first-app-v1-webtutorials
  subsets:
  - name: v1
    labels:
      version: v1

My Service YAML looks like the one below. All it does is connect to pods with the label app=first-app-webtutorials, which run nginx/PHP. To test the 120-second cache behavior (the section below covers max-age in more depth), I built a basic Nginx-PHP application. The homepage uses JavaScript to show the client’s local time and PHP to display the server time, which makes it easy to tell when cached vs. fresh content is served. Pod container/image configuration is outside the scope of this tutorial.

apiVersion: v1
kind: Service
metadata:
  name: first-app-v1-webtutorials
  namespace: my-own-ns
spec:
  ports:
    - name: http
      port: 80
  selector:
    app: first-app-webtutorials

With this setup, accessing the website via the hostname webtutorials.dev should load the homepage correctly.
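
If webtutorials.dev does not resolve to your cluster yet, one option is to call the ingress gateway directly and pass the host name in a Host header. Here is a minimal sketch, assuming the default istio-ingressgateway Service in istio-system exposes a LoadBalancer IP; ingress_ip and site_status are helper names I made up for this example.

```shell
# Print the external IP of the ingress gateway Service.
# Assumes the default Service name and namespace created by istioctl install.
ingress_ip() {
  kubectl get svc istio-ingressgateway -n istio-system \
    -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
}

# Request the homepage through the gateway and print only the HTTP status code.
# $1 = gateway address, $2 = host name configured in the Gateway/VirtualService.
site_status() {
  curl -s -o /dev/null -w '%{http_code}' -H "Host: $2" "http://$1/"
}
```

Running site_status "$(ingress_ip)" webtutorials.dev should print 200 once the Gateway, VirtualService, DestinationRule and Service are in place. On providers that hand out a load balancer hostname instead of an IP (EKS, for example), read .hostname instead of .ip in the jsonpath.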

Now, if you would like to add caching, follow the EnvoyFilter configuration below.

Enable Envoy’s Caching Filter

Envoy does the heavy lifting here. You’ll configure its cache filter via Istio’s EnvoyFilter resource. More details on EnvoyFilter can be found in the Istio documentation.

Create a file named caching-envoy-filter.yaml:

apiVersion: networking.istio.io/v1alpha3  
kind: EnvoyFilter  
metadata:  
  name: cache-filter  
  namespace: istio-system  
spec:  
  workloadSelector:  
    labels:  
      istio: ingressgateway  
  configPatches:  
    - applyTo: HTTP_FILTER  
      match:  
        context: GATEWAY  
        listener:  
          filterChain:  
            filter:  
              name: "envoy.filters.network.http_connection_manager"  
      patch:  
        operation: INSERT_FIRST  
        value:  
          name: envoy.filters.http.lua  
          typed_config:  
            "@type": "type.googleapis.com/envoy.extensions.filters.http.lua.v3.Lua"  
            inlineCode: |  
              function envoy_on_request(request_handle)
                -- Domains this filter is meant for (placeholder, not used yet)
                local domains = {
                  "webtutorials.dev"
                }
                -- Add logic here (e.g., header manipulation)
              end
    - applyTo: HTTP_FILTER  
      match:  
        context: GATEWAY  
        listener:  
          filterChain:  
            filter:  
              name: "envoy.filters.network.http_connection_manager"  
              subFilter:  
                name: "envoy.filters.http.router"  
      patch:  
        operation: INSERT_BEFORE  
        value:  
          name: "envoy.filters.http.cache"  
          typed_config:  
            "@type": "type.googleapis.com/envoy.extensions.filters.http.cache.v3.CacheConfig"  
            typed_config:  
              "@type": "type.googleapis.com/envoy.extensions.http.cache.file_system_http_cache.v3.FileSystemHttpCacheConfig"  
              cache_path: /var/lib/istio/data  
              manager_config:  
                thread_pool:  
                  thread_count: 1  

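This EnvoyFilter applies two patches to the ingress gateway:

  • The first inserts a tiny Lua filter at the front of the HTTP filter chain. As written it is only a placeholder: it lists the domain webtutorials.dev and leaves room for request logic such as header manipulation.
  • The second inserts Envoy’s cache filter, described in the next section.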
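
The file has no effect until it is applied to the cluster. A minimal sketch, assuming kubectl points at the right cluster and the file sits in the current directory (apply_cache_filter is just an illustrative wrapper name):

```shell
# Apply the EnvoyFilter, then list it to confirm the API server accepted it.
apply_cache_filter() {
  kubectl apply -f caching-envoy-filter.yaml &&
    kubectl get envoyfilter cache-filter -n istio-system
}
```

Call apply_cache_filter once; the second command only runs when the apply succeeds.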

Server-Side Caching (Istio Ingress)

  • What it does:
    Enables server-side caching at the Istio ingress gateway.
    • Cached responses are stored on disk (in /var/lib/istio/data).
    • Uses a simple file-based cache (not in-memory).
  • How it works:
    • Inserts a cache filter before the final routing step (envoy.filters.http.router).
    • Only 1 worker thread manages the cache (low resource usage, but slower for high traffic).

Why Would You Use This?

  • Faster page loads: Repeat visitors get cached content from their browser, and other visitors are served from the ingress cache instead of the backend.
  • Reduce backend load: Common requests (e.g., product listings) are cached at the ingress, avoiding repeated processing.

Config (thread_count: 1)

Simple & Lightweight:

  • Good for low-traffic scenarios or testing.
  • Minimal CPU/memory usage.

Bottleneck Risk:

  • If multiple requests try to access the cache simultaneously, they’ll queue up and wait for the single thread.
  • This can slow down responses under heavy load.

thread_count controls how many “workers” handle caching tasks. My config uses 1 worker (simple but limited). Adjust it based on traffic and performance needs.

Add custom headers from Sidecar

This EnvoyFilter modifies outgoing HTTP responses from my service (first-app-webtutorials). Think of it as a “response editor” that tweaks headers before sending data back to clients.

apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  name: response-filter
  namespace: my-own-ns # Please change this namespace to the one where the pods are deployed
spec:
  workloadSelector:
    labels:
      app: first-app-webtutorials # Please change this label to deployment of your choice
  configPatches:
    - applyTo: HTTP_FILTER
      match:
        context: SIDECAR_INBOUND
        listener:
          filterChain:
            filter:
              name: "envoy.filters.network.http_connection_manager"
      patch:
        operation: INSERT_BEFORE
        value:
          name: envoy.filters.http.lua
          typed_config:
            "@type": "type.googleapis.com/envoy.extensions.filters.http.lua.v3.Lua"
            inlineCode: |
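              function envoy_on_response(response_handle)
                local response_headers = response_handle:headers()
                -- Let browsers and the ingress cache reuse the response for 120 seconds
                response_headers:replace("Cache-Control", "max-age=120")
                -- HOST_IP comes from the istio-proxy container; skip the header if it is unset
                local ip = os.getenv("HOST_IP")
                if ip then
                  response_headers:replace("host-ip", ip)
                end
                -- Drop Vary so only one cached copy is stored per URL
                response_headers:remove("vary")
              end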

Here’s what it does:

  • Applies to: Inbound traffic to your service (SIDECAR_INBOUND context).
  • Uses Lua scripting: Runs a tiny script to manipulate headers when responses are sent.
  • Trigger: Runs envoy_on_response after my service generates a response but before it’s sent to the client.
  • Adds/modifies headers in the response:
    • Cache-Control: max-age=120: Tells browsers/clients to cache the response for 120 seconds (2 minutes).
    • host-ip: <HOST_IP>: Adds the IP of the node running the pod (from the HOST_IP environment variable of the istio-proxy container) as a custom header.
    • Removes the Vary header: Simplifies caching by deleting this header (may affect caching proxies/CDNs).
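
To make the max-age rule concrete: a cache may reuse a stored response as long as its Age is lower than max-age. The sketch below applies that check to a set of response headers. header_value and is_fresh are hypothetical helper names, and the check deliberately ignores other Cache-Control directives such as no-store or s-maxage.

```shell
# Print the value of one header from a block of response headers on stdin.
# $1 = header name (matched case-insensitively).
header_value() {
  tr -d '\r' | grep -i "^$1:" | head -n 1 | sed 's/^[^:]*:[[:space:]]*//'
}

# Succeed (exit 0) when the headers on stdin describe a response that a cache
# may still reuse: Age, if present, must be lower than Cache-Control max-age.
is_fresh() (
  headers=$(cat)
  max_age=$(printf '%s\n' "$headers" | header_value cache-control |
    sed -n 's/.*max-age=\([0-9][0-9]*\).*/\1/p')
  age=$(printf '%s\n' "$headers" | header_value age)
  # A missing Age header is treated as age 0 (response just came from the backend).
  [ -n "$max_age" ] && [ "${age:-0}" -lt "$max_age" ]
)
```

For example, curl -sI http://webtutorials.dev | is_fresh && echo "still fresh" prints the message while the response is younger than 120 seconds.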

What This Means for Browsers?

  • When we remove the Vary header, caches (browsers, CDNs, etc.) assume the response is identical for all clients, regardless of headers like User-Agent (Chrome vs. Firefox) or Accept-Encoding (gzip vs. brotli).
  • Result: Only one cached copy is stored per URL.

Example: If a request from Chrome fills a shared cache (e.g., your Istio ingress or a CDN), then Firefox, Safari, and every other client will receive that same cached response, even if their User-Agent headers differ.
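
The effect of Vary on the cache key can be sketched as a small shell function. This is a simplified model of how HTTP caches pick a stored response, not Envoy’s actual implementation, and cache_key is a name invented for the example.

```shell
# Build the key a cache would use to look up a stored response.
# $1 = URL, $2 = value of the response's Vary header (may be empty),
# $3 = request headers, one "Name: value" per line.
cache_key() (
  set -f    # keep a "Vary: *" value from being expanded as a file glob
  key=$1
  for name in $(printf '%s' "$2" | tr ',' ' ' | tr '[:upper:]' '[:lower:]'); do
    # Append the request's value for every header named in Vary.
    value=$(printf '%s\n' "$3" | grep -i "^$name:" | head -n 1 |
      sed 's/^[^:]*:[[:space:]]*//')
    key="$key|$name=$value"
  done
  printf '%s\n' "$key"
)
```

With Vary: User-Agent, cache_key /index.php 'User-Agent' 'User-Agent: Chrome' prints /index.php|user-agent=Chrome, so Firefox would get a different key. With an empty Vary value the key is just /index.php for every browser, which is the single cached copy per URL described above.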

Testing Istio Setup

After applying the EnvoyFilter and VirtualService configurations, here’s how to verify everything works as expected:

Verify Response Headers

Check if the Cache-Control, host-ip, and Vary headers are modified correctly.

Example Command (using curl):

curl -I http://webtutorials.dev

Expected Output:

HTTP/1.1 200 OK
cache-control: max-age=120
host-ip: 10.1.2.3  # Your server's actual IP
# "vary" header should be missing

If cache-control and host-ip exist and vary is removed, things are working as expected.

To figure out HOST_IP, you can run the following, replacing <YOUR_APP> with the pod name and <YOUR_NS> with its namespace:

$ kubectl exec -it <YOUR_APP> -n <YOUR_NS> -c istio-proxy -- printenv HOST_IP
172.31.17.238

Check Response Headers

Look for headers like Age (seconds since the response was cached) in responses. Some other caches also add X-Cache: HIT, but the sample output further down only shows Age.

curl -I http://webtutorials.dev

Example Output:

age: 10  # Indicates the response was cached 10 seconds ago
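
A quick way to tell the two cases apart is to classify a response by its Age header. A minimal sketch, assuming the backend never sets Age itself, so the header can only come from the ingress cache; cache_status is a hypothetical helper name.

```shell
# Classify a response from its headers (read on stdin).
# Prints "HIT age=<seconds>" when an Age header is present, otherwise "MISS".
cache_status() (
  age=$(tr -d '\r' | grep -i '^age:' | head -n 1 | sed 's/^[^:]*:[[:space:]]*//')
  if [ -n "$age" ]; then
    printf 'HIT age=%s\n' "$age"
  else
    printf 'MISS\n'
  fi
)
```

Run curl -sI http://webtutorials.dev | cache_status twice within 120 seconds: I would expect MISS on the first call and HIT with a growing age afterwards.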

Troubleshooting

  • Headers not applied?
    • Check if the EnvoyFilter is applied to the correct workload (app: first-app-webtutorials).
    • Verify the Lua script for typos (e.g., envoy_on_response vs. envoy_on_request).
  • No server-side caching?
    • Ensure the /var/lib/istio/data directory exists on the ingress pod and is writable by the proxy.
    • Check Envoy logs for cache-related errors.
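
The server-side checks can be scripted. The sketch below assumes the default istio-ingressgateway Deployment in istio-system and a proxy image that ships ls; the three function names are made up for this example.

```shell
# Succeed when a listener config dump (JSON on stdin) contains the cache filter.
has_cache_filter() {
  grep -qF '"envoy.filters.http.cache"'
}

# List the cache directory inside the ingress gateway pod.
cache_dir_listing() {
  kubectl exec deploy/istio-ingressgateway -n istio-system -- \
    ls -la /var/lib/istio/data
}

# Print recent ingress gateway log lines that mention the cache.
cache_log_lines() {
  kubectl logs deploy/istio-ingressgateway -n istio-system --tail=500 |
    grep -i 'cache'
}
```

To check the filter, pipe the gateway's listener config into the first helper: istioctl proxy-config listener <INGRESS_POD> -n istio-system -o json | has_cache_filter && echo "cache filter present", where <INGRESS_POD> is the name of your ingress gateway pod.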

Sample CURL output (truncated output)

# curl -vI http://webtutorials.dev
< HTTP/1.1 200 OK
< date: Tue, 25 Feb 2025 20:57:30 GMT
< server: istio-envoy
< x-powered-by: PHP/5.6.14
< content-type: text/html; charset=UTF-8
< x-envoy-upstream-service-time: 6
< cache-control: max-age=120
< host-ip: 172.31.17.238
< my-header-from: virtualservice
< transfer-encoding: chunked
< age: 101

Sample /var/lib/istio/data directory details: once you connect to the application using curl or a browser, you should see files being populated in this directory.

[Screenshot: verifying that the Istio disk cache is working]

Testing confirms the custom headers my-header-from and host-ip behave as expected.