Mastering Kubernetes v1.36: 7 Things You Need to Know About Server-Side Sharded List and Watch

Kubernetes clusters are growing bigger every day, with some deployments reaching tens of thousands of nodes. At that scale, controllers that monitor high-cardinality resources like Pods hit a performance ceiling. Every replica of a horizontally scaled controller receives the entire event stream from the API server, wasting CPU, memory, and network bandwidth on objects it doesn’t manage. The traditional client-side sharding approach doesn’t fix this; it actually multiplies the cost with each additional replica. Kubernetes v1.36 introduces a game-changing alpha feature: server-side sharded list and watch (KEP-5866). This moves filtering upstream to the API server, so each replica only gets the events it owns. Here are seven crucial things you need to understand about this new capability.

1. The Scalability Wall for Large Clusters

As your cluster scales to thousands of nodes, controllers that watch resources like Pods face a massive data deluge. Every replica of those controllers receives the full stream of events from the API server. This means each replica deserializes every single event, consuming CPU and memory, only to discard the majority of objects that fall outside its responsibility. The cost per replica doesn’t decrease when you add more replicas; instead, the total cost multiplies. This becomes a serious bottleneck for large-scale Kubernetes operations, making it harder to maintain performance and reliability.

2. Why Client-Side Sharding Falls Short

Some controllers, like kube-state-metrics, already implement client-side sharding by assigning each replica a portion of the keyspace. While this distributes responsibility logically, it doesn’t reduce the data flowing from the API server. Every replica still receives and processes the entire event stream, then throws away what it doesn’t need. Total network bandwidth therefore scales with the number of replicas rather than with shard size, and the CPU cycles each replica spends deserializing the discarded fraction are pure waste. This approach works functionally but fails to address the core scalability issue.
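
To see why the waste is structural, consider where the client-side check runs: by the time it executes, the event has already crossed the network and been deserialized. Here is a simplified Go sketch of the pattern, assuming a hash-mod-replicas scheme (an illustration, not kube-state-metrics’ exact implementation):

import (
    "hash/fnv"
)

// ownedByThisReplica is the classic client-side sharding test. It can
// only run after the full event has been received and decoded, so
// objects that fail the check have already cost network and CPU.
func ownedByThisReplica(uid string, replica, totalReplicas uint64) bool {
    h := fnv.New64a()
    h.Write([]byte(uid)) // hash.Hash writes never return an error
    return h.Sum64()%totalReplicas == replica
}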

3. Server-Side Sharding: A Shift in Filtering

Kubernetes v1.36 flips the model by moving event filtering from the client to the API server. With server-side sharded list and watch (alpha feature), each controller replica tells the API server which hash range it owns. The API server then sends only the events that match that range. This eliminates unnecessary network traffic and wasted computation on the client side. It’s a fundamental change that reduces the load on both the API server and the controller replicas, enabling horizontal scaling without the multiplicative overhead.
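
For a concrete sense of the request shape, here is a hedged sketch of a single sharded list call, assuming the alpha ShardSelector field described in the next section is available in your client-go version:

import (
    "context"

    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/client-go/kubernetes"
)

// listOwnedPods asks the API server for only this replica's half of
// the hash space; the filtering happens before anything is sent over
// the wire. ShardSelector is the alpha field from KEP-5866.
func listOwnedPods(ctx context.Context, client kubernetes.Interface) (int, error) {
    pods, err := client.CoreV1().Pods("").List(ctx, metav1.ListOptions{
        ShardSelector: "shardRange(object.metadata.uid, '0x0000000000000000', '0x8000000000000000')",
    })
    if err != nil {
        return 0, err
    }
    return len(pods.Items), nil
}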

4. How the Hash-Based Sharding Works

The feature introduces a new shardSelector field in ListOptions. Controllers specify a hash range using the shardRange() function, for example: shardRange(object.metadata.uid, '0x0000000000000000', '0x8000000000000000'). The API server computes a deterministic 64-bit FNV-1a hash of the specified field (currently object.metadata.uid or object.metadata.namespace) and returns only objects whose hash falls within the range [start, end). This applies to both initial list responses and subsequent watch event streams. The hash function is consistent across all API server replicas, ensuring reliability in multi-instance deployments.
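
The filtering logic itself is easy to reproduce. The sketch below mirrors the described server-side check using Go's standard hash/fnv package; it illustrates the behavior, and is not the actual apiserver code:

package main

import (
    "fmt"
    "hash/fnv"
)

// inShard hashes a field value with 64-bit FNV-1a and tests membership
// in the half-open range [start, end), mirroring the server-side check
// described above.
func inShard(value string, start, end uint64) bool {
    h := fnv.New64a()
    h.Write([]byte(value))
    sum := h.Sum64()
    return sum >= start && sum < end
}

func main() {
    uid := "4f5e2a1c-9a7b-4c3d-8e2f-1a2b3c4d5e6f" // example Pod UID
    fmt.Println(inShard(uid, 0x0000000000000000, 0x8000000000000000))
}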

5. Configuring Shard Selectors in Controllers

To use server-side sharding, controllers typically rely on informers. You can inject the shardSelector into the ListOptions used by your informers via the WithTweakListOptions function. Here’s a Go code snippet that demonstrates the setup:

import (
    "time"

    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/client-go/informers"
    "k8s.io/client-go/kubernetes"
)

// newShardedFactory returns an informer factory whose list and watch
// requests carry this replica's shard selector, so the API server
// sends only the objects in its hash range.
func newShardedFactory(client kubernetes.Interface, resyncPeriod time.Duration) informers.SharedInformerFactory {
    // This replica owns the lower half of the 64-bit hash space.
    shardSelector := "shardRange(object.metadata.uid, '0x0000000000000000', '0x8000000000000000')"

    return informers.NewSharedInformerFactoryWithOptions(client, resyncPeriod,
        informers.WithTweakListOptions(func(opts *metav1.ListOptions) {
            opts.ShardSelector = shardSelector // alpha field from KEP-5866
        }),
    )
}

This way, each replica only lists and watches the subset of resources it cares about, drastically reducing resource consumption.

6. Practical Example: Splitting the Hash Space

For a two-replica deployment, you split the 64-bit hash space in half. Replica 0 might take the range from 0x0000000000000000 to 0x8000000000000000, while replica 1 takes the other half from 0x8000000000000000 to 0xFFFFFFFFFFFFFFFF. Because ranges are half-open, adjacent shards can share a boundary value without overlapping, and because the hash is uniformly distributed, each replica handles roughly half the objects. The shard boundaries are arbitrary hex values; you can divide the space into any number of equal or unequal ranges. The key point is that the API server does the filtering, so each replica receives only relevant events, eliminating the wasted overhead of client-side discarding.
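
Computing boundaries by hand gets error-prone beyond two replicas, so a small helper is worth having. Below is a hypothetical sketch: the shardRange() syntax follows the alpha API above, but the helper itself is not part of Kubernetes or client-go:

package main

import (
    "fmt"
    "math"
    "math/bits"
)

// shardSelectorFor splits the 64-bit hash space into `total` roughly
// equal half-open ranges and returns the shardRange() selector for
// replica `index` (0-based).
func shardSelectorFor(index, total uint64) string {
    // boundary(i) = floor(i * 2^64 / total), computed with a 128-bit
    // intermediate via bits.Div64 to avoid overflow (valid for i < total).
    boundary := func(i uint64) uint64 {
        q, _ := bits.Div64(i, 0, total)
        return q
    }
    start := boundary(index)
    end := uint64(math.MaxUint64) // last shard runs to the top, as in the example above
    if index+1 < total {
        end = boundary(index + 1)
    }
    return fmt.Sprintf(
        "shardRange(object.metadata.uid, '0x%016X', '0x%016X')",
        start, end)
}

func main() {
    for i := uint64(0); i < 2; i++ {
        fmt.Println(shardSelectorFor(i, 2))
    }
    // Output:
    // shardRange(object.metadata.uid, '0x0000000000000000', '0x8000000000000000')
    // shardRange(object.metadata.uid, '0x8000000000000000', '0xFFFFFFFFFFFFFFFF')
}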

7. Benefits and Future Potential

Server-side sharding brings major improvements: reduced network bandwidth (no more sending full event streams to every replica), lower CPU usage on controllers (no wasted deserialization), and better horizontal scaling (adding replicas doesn’t multiply API server load). This feature is alpha in v1.36, so expect further enhancements, such as support for additional hash fields and even more efficient resource usage. It’s a significant step forward for large-scale Kubernetes operations, making it easier to run resource-intensive controllers without hitting performance ceilings.

Server-side sharded list and watch is a subtle but powerful change that addresses a long-standing scalability pain point. By shifting filtering to the API server, Kubernetes v1.36 enables controllers to scale horizontally without the multiplicative cost of client-side processing. As this feature matures, it will likely become a standard pattern for managing high-cardinality resources in large clusters. Stay tuned for future updates and start experimenting with the alpha flag today.