<h1>How to Enable Tiered Memory Protection with Memory QoS in Kubernetes v1.36</h1>

<h2>Introduction</h2> <p>Kubernetes v1.36 introduces a smarter way to manage container memory with the updated <strong>Memory QoS</strong> feature. This guide walks you through enabling and configuring tiered memory protection based on Pod QoS classes. You'll learn how to use the new <code>memoryReservationPolicy</code> field to control whether memory is hard-reserved (<code>memory.min</code>) or soft-protected (<code>memory.low</code>), giving you better control over node memory pressure and reducing OOM risks. Whether you're a cluster administrator or a developer, this step-by-step approach will help you optimize memory allocation for your workloads.</p> <h2>What You Need</h2> <ul> <li><strong>Kubernetes v1.36</strong> (or later) cluster with <em>cgroup v2</em> enabled on all nodes.</li> <li><strong>Feature gate</strong> <code>MemoryQoS</code> must be enabled (alpha in v1.36).</li> <li><strong>kubelet</strong> configuration access (e.g., via <code>KubeletConfiguration</code> file or flags).</li> <li><strong>Understanding</strong> of Pod QoS classes: Guaranteed, Burstable, BestEffort.</li> <li><strong>Optional</strong>: Metrics endpoint access (<code>/metrics</code>) to observe memory reservation.</li> </ul> <h2>Step-by-Step Guide</h2> <h3 id="step1">Step 1: Enable the Memory QoS Feature Gate</h3> <p>The feature is alpha and requires explicit activation. 
Edit the kubelet configuration on each node (or use a centralized <code>KubeletConfiguration</code> resource):</p> <ol> <li>Set <code>featureGates.MemoryQoS: true</code> in the kubelet config file (e.g., <code>/var/lib/kubelet/config.yaml</code>).</li> <li>Restart the kubelet service: <code>systemctl restart kubelet</code>.</li> <li>Verify the gate is active: check kubelet logs for <code>MemoryQoS feature gate enabled</code>, or inspect <code>/sys/fs/cgroup/kubepods.slice/</code> – if <code>memory.high</code> is set on the cgroups there, throttling is active.</li> </ol> <h3 id="step2">Step 2: Configure memoryReservationPolicy</h3> <p>By default, enabling the gate only turns on <code>memory.high</code> throttling (a soft limit). To add memory protection, set the <code>memoryReservationPolicy</code> field in the kubelet configuration:</p> <ul> <li><strong>None</strong> (default): No reservation is written (neither <code>memory.min</code> nor <code>memory.low</code>). Only throttling via <code>memory.high</code> applies.</li> <li><strong>TieredReservation</strong>: The kubelet writes per-Pod cgroup values for <code>memory.min</code> or <code>memory.low</code> based on QoS class.</li> </ul> <p>To enable tiered protection, add to the kubelet config:</p> <pre><code>memoryReservationPolicy: TieredReservation</code></pre> <p>Restart the kubelet for the change to take effect.</p> <h3 id="step3">Step 3: Understand Tiered Protection by QoS Class</h3> <p>Once <code>TieredReservation</code> is active, the kubelet assigns cgroup memory settings as follows:</p> <table> <tr><th>Pod QoS Class</th><th>Reservation Type</th><th>Example (request 512 MiB)</th></tr> <tr><td>Guaranteed</td><td><code>memory.min</code> (hard)</td><td><code>536870912</code> – the kernel never reclaims this memory; under contention, other cgroups are reclaimed or OOM-killed instead.</td></tr> <tr><td>Burstable</td><td><code>memory.low</code> (soft)</td><td><code>536870912</code> – the kernel protects this memory under normal pressure but reclaims it under extreme pressure to avoid system-wide OOM.</td></tr> 
<tr><td>BestEffort</td><td>None</td><td>No reservation; memory fully reclaimable at any time.</td></tr> </table> <p>This is a key improvement over v1.27, where <em>every</em> container with a memory request got <code>memory.min</code>, potentially locking too much memory and causing OOMs. Now only Guaranteed Pods get hard protection.</p> <h3 id="step4">Step 4: Monitor Memory Reservation</h3> <p>Kubernetes v1.36 adds two alpha metrics on the kubelet’s <code>/metrics</code> endpoint. They are exposed whenever the <code>MemoryQoS</code> feature gate is active:</p> <ul> <li><code>kubelet_memory_qos_node_memory_min_bytes</code>: Total <code>memory.min</code> across all Pods on the node.</li> <li><code>kubelet_memory_qos_cgroup_memory_low_bytes</code>: Total <code>memory.low</code> across all Pods.</li> </ul> <p>Scrape the endpoint with Prometheus, or query it directly: <code>curl -k https://localhost:10250/metrics | grep memory_qos</code> (the kubelet serves metrics over HTTPS on port 10250; depending on your cluster’s settings, the request may need to be authenticated). These metrics help you verify that only Guaranteed Pods use hard reservation and that Burstable Pods use soft protection.</p> <h3 id="step5">Step 5: Adjust memoryThrottlingFactor (Optional)</h3> <p>The <code>memory.high</code> value is derived from <code>memoryThrottlingFactor</code> (default 0.9). The kubelet computes <code>memory.high = requests.memory + memoryThrottlingFactor * (limits.memory - requests.memory)</code>, substituting node allocatable memory for the limit when no limit is set. You can change this value in the kubelet configuration to any float between 0 and 1; lower values cause earlier throttling. Example:</p> <pre><code>memoryThrottlingFactor: 0.8</code></pre> <p>This is independent of the reservation policy – throttling works regardless of <code>memoryReservationPolicy</code>.</p> <h3 id="step6">Step 6: Handle Kernel Version Warning</h3> <p>Some older kernels have issues with <code>memory.high</code>. The kubelet logs a warning if the kernel version is below 5.4 (or certain patched versions). To avoid instability, ensure all nodes run a kernel that properly supports cgroup v2 memory accounting. 
Check with <code>uname -r</code>.</p> <h2>Tips and Best Practices</h2> <ul> <li><strong>Start with throttling only</strong>: Before enabling <code>TieredReservation</code>, observe workload behavior under <code>memory.high</code> throttling. This helps you tune <code>memoryThrottlingFactor</code> and confirm workloads tolerate cgroup-level pressure.</li> <li><strong>Reserve headroom</strong>: If node memory is tight, avoid enabling <code>TieredReservation</code> until you have enough cushion. For example, with Burstable Pods requesting 7 GiB on an 8 GiB node, soft protection (<code>memory.low</code>) still preserves memory under normal load but allows reclamation under pressure – a safer default than hard reservation.</li> <li><strong>Test with Guaranteed Pods</strong>: Only add hard protection (<code>memory.min</code>) to critical, predictable workloads. Over-reserving with <code>memory.min</code> can starve other processes and trigger OOM kills.</li> <li><strong>Monitor OOM events</strong>: Use <code>kubectl top pods</code> and node-level tools like <code>dmesg</code> to watch for out-of-memory kills. Tiered protection reduces risk but does not eliminate it.</li> <li><strong>Upgrade carefully</strong>: If migrating from v1.27 where all requests became <code>memory.min</code>, tiered reservation may free up memory for BestEffort or system daemons. Verify node headroom after upgrade.</li> <li><strong>Use metrics for validation</strong>: Regularly check the memory reservation metrics to confirm you are not accidentally oversubscribing hard reservations.</li> </ul> <p>By following these steps, you can leverage Kubernetes v1.36’s updated Memory QoS to protect workloads proportionally, reduce system-wide OOM risks, and make more efficient use of your cluster’s memory.</p>
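<p>Putting Steps 1, 2, and 5 together, here is a minimal <code>KubeletConfiguration</code> sketch combining the settings discussed above. The field names match those described in this guide; the file path and factor value are illustrative and should be adapted to your environment:</p> <pre><code># /var/lib/kubelet/config.yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
featureGates:
  MemoryQoS: true                             # alpha in v1.36; required for everything below
memoryReservationPolicy: TieredReservation    # memory.min for Guaranteed, memory.low for Burstable
memoryThrottlingFactor: 0.9                   # memory.high threshold (default shown)
</code></pre> <p>Restart the kubelet after saving the file (<code>systemctl restart kubelet</code>) so the settings take effect.</p>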
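<p>For reference when reading the table in Step 3, QoS classes are assigned by the standard Kubernetes rules (unchanged by this feature): a Pod is <em>Guaranteed</em> when every container’s requests equal its limits for both CPU and memory, <em>Burstable</em> when at least one container sets a request or limit without meeting the Guaranteed criteria, and <em>BestEffort</em> when no container sets any. A sketch of a Pod that would receive a hard <code>memory.min</code> reservation (the name and image are hypothetical):</p> <pre><code>apiVersion: v1
kind: Pod
metadata:
  name: guaranteed-demo                  # hypothetical name
spec:
  containers:
  - name: app
    image: registry.example.com/app:1.0  # hypothetical image
    resources:
      requests:
        memory: "512Mi"   # with TieredReservation: memory.min = 536870912
        cpu: "500m"
      limits:
        memory: "512Mi"   # requests == limits for all resources -> Guaranteed
        cpu: "500m"
</code></pre> <p>Dropping the <code>limits</code> block would make this Pod Burstable, demoting it to soft <code>memory.low</code> protection per the table above.</p>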