Enhancing Memory Management in Kubernetes 1.36: Tiered Protection and Opt-In Reservation
Introduction
Kubernetes continues to refine how it manages container memory with the latest updates to the Memory QoS feature, which has been in alpha since v1.22 and receives significant enhancements in v1.36. Developed by SIG Node, the feature leverages the cgroup v2 memory controller to give the kernel better guidance on how to treat container memory. In v1.36, the key improvements include opt-in memory reservation, tiered protection by Quality of Service (QoS) class, new observability metrics, and a kernel-version warning for memory.high. These changes give cluster operators finer control over memory protection while reducing the risk of OOM kills.
What's New in v1.36
Opt-In Memory Reservation with memoryReservationPolicy
In previous versions, enabling the MemoryQoS feature gate automatically applied hard memory reservations to all containers with a memory request. This could lead to excessive locking of node memory, leaving little headroom for the kernel, system daemons, or BestEffort workloads. v1.36 separates throttling from reservation: the feature gate still enables memory.high throttling (the kubelet sets memory.high based on memoryThrottlingFactor, default 0.9), but memory reservation is now controlled by a separate kubelet configuration field: memoryReservationPolicy.
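To see what the throttling side does in practice, here is a back-of-the-envelope calculation assuming the memory.high formula described in KEP-2570 (requests plus the throttling factor times the distance to the limit; the rounding to page size is omitted here):

```bash
# Illustrative only: a container with a 512 MiB request and a 1 GiB limit at
# the default memoryThrottlingFactor of 0.9, assuming the KEP-2570 formula
#   memory.high = requests + factor * (limits - requests)
# 512 MiB + 0.9 * (1024 MiB - 512 MiB) = 972.8 MiB
echo "536870912 + 0.9 * (1073741824 - 536870912)" | bc   # 1020054732.8 bytes
```

Allocations beyond that point are throttled and reclaimed by the kernel rather than OOM-killed outright.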
This field offers two options:
- None (default): No memory.min or memory.low is written to cgroups. Throttling via memory.high still works.
- TieredReservation: The kubelet writes tiered memory protection based on the Pod's QoS class, as described in the next section.
With this approach, operators can first enable throttling alone, observe workload behavior, and then opt into reservation when the node has sufficient headroom. This flexibility reduces the risk of inadvertently starving system processes.
Tiered Protection by QoS Class
When memoryReservationPolicy is set to TieredReservation, the kubelet applies different memory protection levels depending on the Pod's QoS class:
- Guaranteed Pods receive hard protection via memory.min. For example, a Guaranteed Pod requesting 512 MiB of memory results in:

  ```
  $ cat /sys/fs/cgroup/kubepods.slice/.../memory.min
  536870912
  ```

  The kernel will not reclaim this memory under any circumstances. If it cannot honor the guarantee, it invokes the OOM killer on other processes to free pages.

- Burstable Pods receive soft protection via memory.low. For the same 512 MiB request on a Burstable Pod:

  ```
  $ cat /sys/fs/cgroup/kubepods-burstable.slice/.../memory.low
  536870912
  ```

  The kernel avoids reclaiming this memory under normal pressure, but may reclaim it if the alternative is a system-wide OOM.

- BestEffort Pods receive neither memory.min nor memory.low. Their memory remains fully reclaimable.
This tiered approach ensures that critical workloads get the strictest guarantees, while less critical ones can yield memory under extreme pressure.
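If you are unsure which tier applies to a given workload, the Pod's QoS class is recorded in its status:

```bash
# Print the QoS class (Guaranteed, Burstable, or BestEffort) assigned to a Pod
kubectl get pod <pod-name> -o jsonpath='{.status.qosClass}'
```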
Comparison with Earlier Behavior
To understand the improvement, consider a node with 8 GiB of RAM where Burstable Pod requests total 7 GiB. In earlier versions (v1.22–v1.27), all 7 GiB would be locked as memory.min, leaving little headroom for the kernel, system daemons, or BestEffort workloads. This increased the risk of OOM kills when the node encountered memory pressure.
With v1.36's tiered reservation, those Burstable requests map to memory.low instead of memory.min. Under normal pressure, the kernel still protects that memory, but under extreme pressure it can reclaim part of it to avoid a system-wide OOM. Only Guaranteed Pods use memory.min, keeping the total hard reservation lower and improving overall system resilience.
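A rough way to verify this on a node is to total the protection values the kubelet has written. The sketch below assumes a systemd cgroup driver with the standard kubepods.slice layout; any non-numeric values (such as "max") are skipped:

```bash
# Sum hard (memory.min) and soft (memory.low) protection across pod cgroups
for f in memory.min memory.low; do
  find /sys/fs/cgroup/kubepods.slice -name "$f" -exec cat {} + \
    | awk -v f="$f" '$1 ~ /^[0-9]+$/ {sum += $1} END {printf "%s total: %d bytes\n", f, sum}'
done
```

On a v1.36 node running TieredReservation, the memory.min total should track Guaranteed requests only, while the memory.low total tracks Burstable requests.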
Observability Metrics
v1.36 introduces two alpha-stability metrics exposed on the kubelet /metrics endpoint:
| Metric | Description |
|---|---|
| kubelet_memory_qos_node_memory_min_bytes | Total memory reserved via memory.min across all cgroups on the node. |
| kubelet_memory_qos_node_memory_low_bytes | Total memory protected via memory.low across all cgroups on the node. |
These metrics let cluster operators monitor how much hard and soft memory protection is in use, which helps with tuning memoryReservationPolicy and with capacity planning.
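To check them on a live node, you can proxy the kubelet's metrics endpoint through the API server (metric names as listed above; they are alpha and may change):

```bash
# Fetch the kubelet /metrics endpoint for one node and filter the new gauges
kubectl get --raw "/api/v1/nodes/<node-name>/proxy/metrics" \
  | grep '^kubelet_memory_qos_node_memory'
```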
Kernel Version Warning for memory.high
The v1.36 release also adds a kernel-version warning related to the memory.high cgroup file. The kubelet now checks the node's kernel version and warns when it is below the required threshold, so operators know whether memory.high will behave as expected instead of discovering a silent misconfiguration later.
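Since Memory QoS is a cgroup v2 feature, it is worth confirming both prerequisites before chasing the warning:

```bash
# Confirm the node runs cgroup v2 and check the kernel version the kubelet inspects
stat -fc %T /sys/fs/cgroup/   # prints "cgroup2fs" on a cgroup v2 host
uname -r
```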
Getting Started
To try the updated Memory QoS feature in Kubernetes v1.36, enable the MemoryQoS feature gate and set the memoryReservationPolicy in the kubelet configuration. For example:
```
--feature-gates=MemoryQoS=true
--memory-reservation-policy=TieredReservation
```

After enabling, monitor the new metrics to observe how memory protection affects your workloads. Begin with the None policy to test throttling alone, then gradually move to TieredReservation once you have confidence in the node's headroom. As always, test changes in non-production environments first.
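Flags work for a quick experiment, but most nodes configure the kubelet through a KubeletConfiguration file. A minimal sketch, assuming the field name from this article carries over verbatim into the configuration API and using an illustrative drop-in path:

```bash
# Write a KubeletConfiguration drop-in (path and field placement are illustrative)
cat <<'EOF' >/etc/kubernetes/kubelet.conf.d/memory-qos.yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
featureGates:
  MemoryQoS: true
memoryThrottlingFactor: 0.9        # default; controls memory.high headroom
memoryReservationPolicy: TieredReservation
EOF
```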
For more details, refer to the official Kubernetes documentation on Memory QoS (link placeholder).