How to Implement Tiered Memory Protection with Memory QoS in Kubernetes v1.36

Introduction

Kubernetes v1.36 introduces significant updates to the Memory QoS feature, first introduced in v1.22 and refined in v1.27. This guide walks you through setting up and leveraging the new tiered memory protection mechanism, which uses the cgroup v2 memory controller to give the kernel better guidance on how to treat container memory. By the end, you'll know how to enable the feature, configure memory reservation policies, and monitor memory protection across Pod QoS classes.

What You Need

  • A Kubernetes cluster running v1.36 or later (nodes must support cgroup v2).
  • kubectl configured to access your cluster.
  • SSH access to at least one node for verifying cgroup files (optional but helpful).
  • Basic understanding of Kubernetes Pod QoS classes (Guaranteed, Burstable, BestEffort).
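Before going further, it can help to confirm that a node really is on cgroup v2. The following is a minimal sketch to run on the node itself; it relies on the fact that a cgroup v2 unified hierarchy exposes a cgroup.controllers file at its root. The function names are illustrative and the mount point /sys/fs/cgroup is assumed.

```python
from pathlib import Path


def is_cgroup_v2(root: str = "/sys/fs/cgroup") -> bool:
    """Return True if `root` looks like a cgroup v2 unified hierarchy.

    A cgroup v2 root contains a `cgroup.controllers` file; a cgroup v1
    mount point does not.
    """
    return (Path(root) / "cgroup.controllers").is_file()


def memory_controller_available(root: str = "/sys/fs/cgroup") -> bool:
    """Return True if the memory controller is listed at the hierarchy root."""
    controllers_file = Path(root) / "cgroup.controllers"
    if not controllers_file.is_file():
        return False
    # The file holds a space-separated list such as "cpuset cpu io memory pids".
    return "memory" in controllers_file.read_text().split()


if __name__ == "__main__":
    print("cgroup v2:", is_cgroup_v2())
    print("memory controller:", memory_controller_available())
```

If either check prints False, the node is not ready for Memory QoS and the remaining steps will have no effect there.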

Step-by-Step Guide

Step 1: Enable the MemoryQoS Feature Gate

The MemoryQoS feature is alpha in v1.36, so you must enable it explicitly. Edit the kubelet configuration file (commonly /var/lib/kubelet/config.yaml, or whichever path is passed to the kubelet with --config) and add MemoryQoS: true under featureGates. Then restart the kubelet process.

# Example kubelet config snippet
featureGates:
  MemoryQoS: true

After restart, verify that the feature is active by checking the kubelet logs or the /metrics endpoint (see Step 5).
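Another way to check the running configuration is the kubelet's configz endpoint, which you can fetch through the API server with kubectl get --raw "/api/v1/nodes/<node-name>/proxy/configz". The sketch below only parses that JSON; the sample document is a trimmed, hypothetical response, and the helper name is illustrative.

```python
import json

# Hypothetical, heavily trimmed configz response used only for illustration.
SAMPLE_CONFIGZ = (
    '{"kubeletconfig": {"featureGates": {"MemoryQoS": true}, '
    '"memoryThrottlingFactor": 0.9}}'
)


def memory_qos_enabled(configz_json: str) -> bool:
    """Return True if the MemoryQoS feature gate is set to true.

    Expects the JSON body returned by the kubelet's configz endpoint,
    whose top-level key is "kubeletconfig".
    """
    document = json.loads(configz_json)
    kubelet_config = document.get("kubeletconfig", {})
    feature_gates = kubelet_config.get("featureGates") or {}
    return feature_gates.get("MemoryQoS") is True


if __name__ == "__main__":
    print("MemoryQoS enabled:", memory_qos_enabled(SAMPLE_CONFIGZ))
```

A gate that is missing from featureGates is reported as disabled here, which matches the default for an alpha feature.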

Step 2: Configure the memoryReservationPolicy

By default, enabling MemoryQoS only activates throttling via memory.high (based on memoryThrottlingFactor, default 0.9). To enable tiered memory reservation, you need to set memoryReservationPolicy in the kubelet configuration. Two options exist:

  • None (default): No memory.min or memory.low values are written. Throttling still works.
  • TieredReservation: The kubelet writes tiered protection based on the Pod's QoS class (see Step 3).

To use tiered protection, add memoryReservationPolicy: TieredReservation to the kubelet configuration:

memoryReservationPolicy: TieredReservation

Restart the kubelet after making this change.
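To get a feel for the throttling threshold mentioned at the start of this step, the sketch below applies the memory.high formula published with the v1.27 refinement of Memory QoS: the request plus memoryThrottlingFactor times the gap between the limit and the request, rounded down to a page boundary. Whether v1.36 keeps exactly this formula is an assumption, as is the 4096-byte page size; the function name is illustrative.

```python
import math

MIB = 1024 * 1024
PAGE_SIZE = 4096  # assumed page size; check `getconf PAGESIZE` on the node


def memory_high_bytes(
    request_bytes: int,
    limit_bytes: int,
    throttling_factor: float = 0.9,
    page_size: int = PAGE_SIZE,
) -> int:
    """Estimate memory.high for a container.

    `limit_bytes` is the container's memory limit; for a container without
    a limit, the v1.27 formula uses node allocatable memory instead.
    """
    raw = request_bytes + throttling_factor * (limit_bytes - request_bytes)
    # Round down to a whole number of pages.
    return math.floor(raw / page_size) * page_size


if __name__ == "__main__":
    # 512 MiB request, 1 GiB limit, default factor 0.9
    print(memory_high_bytes(512 * MIB, 1024 * MIB))  # prints 1020051456
```

With a 512 MiB request and a 1 GiB limit, throttling would start at roughly 973 MiB, a little below the limit.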

Step 3: Understand Tiered Protection Behavior

Once TieredReservation is enabled, the kubelet automatically applies memory protection based on the Pod's QoS class:

  • Guaranteed Pods: Receive hard protection via memory.min. The kernel will never reclaim this memory; if it cannot honor the guarantee, it invokes the OOM killer on other processes. For example, a Guaranteed Pod requesting 512 MiB of memory will have its cgroup's memory.min set to 536870912 bytes (512 MiB).
  • Burstable Pods: Receive soft protection via memory.low. The kernel avoids reclaiming this memory under normal pressure, but may reclaim it to prevent a system-wide OOM. The same 512 MiB request on a Burstable Pod results in memory.low set to 536870912 bytes.
  • BestEffort Pods: Get neither memory.min nor memory.low. Their memory remains fully reclaimable.
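The three tiers above can be summarized as a small lookup. This is a sketch of the mapping as described in this guide, not the kubelet's actual code; the function name is hypothetical, and treating a zero request as "no protection" is an assumption.

```python
def memory_protection(qos_class: str, request_bytes: int) -> dict[str, int]:
    """Return the cgroup v2 protection file and value for a Pod.

    Mirrors the tiers described above: memory.min for Guaranteed,
    memory.low for Burstable, nothing for BestEffort.
    """
    if qos_class not in ("Guaranteed", "Burstable", "BestEffort"):
        raise ValueError(f"unknown QoS class: {qos_class!r}")
    if qos_class == "BestEffort" or request_bytes <= 0:
        # Assumption: without a memory request there is nothing to protect.
        return {}
    if qos_class == "Guaranteed":
        return {"memory.min": request_bytes}
    return {"memory.low": request_bytes}


if __name__ == "__main__":
    request = 512 * 1024 * 1024
    for qos in ("Guaranteed", "Burstable", "BestEffort"):
        print(qos, memory_protection(qos, request))
```

For the 512 MiB example, the Guaranteed and Burstable cases both yield 536870912; only the file that receives the value differs.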

Step 4: Verify the Cgroup Values

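After deploying a Pod, you can verify the protection settings directly on a node. SSH into the node where the Pod is running and navigate to the Pod's cgroup directory under /sys/fs/cgroup/kubepods.slice/. The paths below assume the systemd cgroup driver, and the Pod UIDs are examples; substitute the UID of your own Pod with its dashes replaced by underscores. For example: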

# Check memory.min for a Guaranteed Pod
$ cat /sys/fs/cgroup/kubepods.slice/kubepods-pod6a4f2e3b_1c9d_4a5e_8f7b_2d3e4f5a6b7c.slice/memory.min
536870912

# Check memory.low for a Burstable Pod
$ cat /sys/fs/cgroup/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod8b3c7d2e_4f5a_6b7c_9d1e_3f4a5b6c7d8e.slice/memory.low
536870912

Ensure the values match the Pod's memory request.
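If you check many Pods, the lookup can be scripted. The sketch below rebuilds the directory names shown in the examples above from a Pod UID and compares the file content with the expected request. It assumes the systemd cgroup driver layout from those examples; both function names are illustrative, and reading the files generally requires running on the node.

```python
from pathlib import Path


def pod_cgroup_dir(pod_uid: str, qos_class: str, root: str = "/sys/fs/cgroup") -> Path:
    """Build the Pod-level cgroup directory for the systemd cgroup driver."""
    uid = pod_uid.replace("-", "_")  # systemd slice names use underscores
    base = Path(root) / "kubepods.slice"
    if qos_class == "Guaranteed":
        # Guaranteed Pods sit directly under kubepods.slice.
        return base / f"kubepods-pod{uid}.slice"
    tier = qos_class.lower()  # "burstable" or "besteffort"
    return base / f"kubepods-{tier}.slice" / f"kubepods-{tier}-pod{uid}.slice"


def matches_request(cgroup_dir: Path, filename: str, expected_bytes: int) -> bool:
    """Return True if e.g. memory.min or memory.low holds the expected value."""
    content = (Path(cgroup_dir) / filename).read_text().strip()
    return content == str(expected_bytes)


if __name__ == "__main__":
    directory = pod_cgroup_dir("6a4f2e3b-1c9d-4a5e-8f7b-2d3e4f5a6b7c", "Guaranteed")
    print(directory)
```

On a node, matches_request(directory, "memory.min", 536870912) would then confirm the Guaranteed example from this step.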

Step 5: Monitor Memory QoS Metrics

Kubernetes v1.36 exposes two alpha observability metrics on the kubelet /metrics endpoint:

| Metric | Description |
| --- | --- |
| kubelet_memory_qos_node_memory_min_bytes | Total memory reserved via memory.min on the node (from Guaranteed Pods). |
| kubelet_memory_qos_node_memory_low_bytes | Total memory reserved via memory.low on the node (from Burstable Pods). |

These metrics help you understand how much memory is protected and make informed decisions about headroom. You can scrape them with Prometheus or any Prometheus-compatible monitoring tool.
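For a quick look without a full Prometheus setup, you can fetch the kubelet metrics with kubectl get --raw "/api/v1/nodes/<node-name>/proxy/metrics" and pick out the two series. The sketch below parses only the text format; the sample values are made up, and it assumes these two series carry no labels containing spaces.

```python
WANTED = (
    "kubelet_memory_qos_node_memory_min_bytes",
    "kubelet_memory_qos_node_memory_low_bytes",
)

# Made-up sample in Prometheus text exposition format, for illustration only.
SAMPLE_METRICS = """\
# illustrative sample, not real kubelet output
kubelet_memory_qos_node_memory_min_bytes 5.36870912e+08
kubelet_memory_qos_node_memory_low_bytes 1.073741824e+09
kubelet_node_name{node="example"} 1
"""


def parse_memory_qos_metrics(metrics_text: str) -> dict[str, float]:
    """Extract the two Memory QoS gauges from Prometheus text output."""
    result: dict[str, float] = {}
    for line in metrics_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE/comment lines
        parts = line.split()
        name = parts[0].split("{", 1)[0]  # drop any label set
        if name in WANTED and len(parts) >= 2:
            result[name] = float(parts[1])
    return result


if __name__ == "__main__":
    for name, value in parse_memory_qos_metrics(SAMPLE_METRICS).items():
        print(f"{name}: {value / 1024**2:.0f} MiB")
```

Comparing the memory.min total with the node's allocatable memory gives a rough view of how much memory is hard-reserved.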

Step 6: Compare with Previous Behavior

In earlier versions (v1.22–v1.27), enabling MemoryQoS set memory.min for every container with a memory request, regardless of QoS class. This could lock up a large portion of memory, leaving little for system daemons or BestEffort workloads. For instance, on an 8 GiB node with Burstable Pods requesting 7 GiB total, all 7 GiB would be hard-reserved, increasing OOM risk.

With TieredReservation in v1.36, Burstable Pods use memory.low instead, allowing the kernel to reclaim that memory under extreme pressure. Only Guaranteed Pods use memory.min, which keeps the hard reservation lower and leaves more headroom for the node.
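The 8 GiB example can be worked through in a few lines. This is a simplified model of the two behaviors described in this step, with hypothetical names; it only sums requests per Pod and ignores everything else the kubelet accounts for.

```python
GIB = 1024 ** 3


def reservation_totals(pods: list[tuple[str, int]], tiered: bool) -> tuple[int, int]:
    """Return (hard_bytes, soft_bytes) reserved for a list of Pods.

    Each Pod is a (qos_class, memory_request_bytes) pair. With tiered=False
    every memory request is counted as a hard memory.min reservation, as in
    the earlier behavior; with tiered=True Burstable requests move to
    memory.low.
    """
    hard = soft = 0
    for qos_class, request_bytes in pods:
        if qos_class == "BestEffort" or request_bytes <= 0:
            continue  # nothing is reserved without a memory request
        if tiered and qos_class == "Burstable":
            soft += request_bytes
        else:
            hard += request_bytes
    return hard, soft


if __name__ == "__main__":
    pods = [("Burstable", 7 * GIB)]  # the 8 GiB node example from above
    print("earlier behavior:", reservation_totals(pods, tiered=False))
    print("TieredReservation:", reservation_totals(pods, tiered=True))
```

In the earlier model all 7 GiB end up as hard reservation; with TieredReservation the hard total drops to zero and the 7 GiB become soft protection that the kernel can still reclaim as a last resort.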

Tips and Best Practices

  • Monitor Before Enabling TieredReservation: Start with memoryReservationPolicy: None (throttling only) and observe your workload's memory behavior under pressure. This gives you a baseline before committing to hard or soft reservations.
  • Start with TieredReservation on Nodes with Ample Headroom: Ensure your node has enough free memory to accommodate the tiered reservations, especially for Guaranteed Pods. Use the kubelet_memory_qos_node_memory_min_bytes metric to track total hard reservation.
  • Combine with Resource Quotas: Enforce memory quotas at the namespace level so that a single misbehaving workload cannot reserve an excessive share of node memory.
  • Test with Synthetic Load: Generate memory pressure using tools like stress inside a test namespace to verify that reclaim and OOM killer behavior align with your expectations (e.g., memory of BestEffort Pods is reclaimed first).
  • Be Aware of Kernel Version Warnings: The feature may emit warnings for kernels that don't fully support memory.high or other cgroup v2 features. Keep your nodes on a recent Linux kernel; the Kubernetes cgroup v2 documentation lists 5.8 or later as the requirement.
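For the synthetic-load tip above, a dedicated tool such as stress is the usual choice. If you only need a quick stand-in, the sketch below allocates memory in 1 MiB chunks and writes to every page so the memory is actually resident. It is not a replacement for stress, the function name is hypothetical, and the 4096-byte page size is assumed.

```python
import time


def allocate_and_touch(mib: int, page_size: int = 4096) -> list[bytearray]:
    """Allocate `mib` MiB and write one byte per page so pages become resident."""
    chunks: list[bytearray] = []
    for _ in range(mib):
        chunk = bytearray(1024 * 1024)
        for offset in range(0, len(chunk), page_size):
            chunk[offset] = 1  # touching the page forces the kernel to back it
        chunks.append(chunk)
    return chunks


if __name__ == "__main__":
    held = allocate_and_touch(256)  # hold 256 MiB; adjust for your test
    print(f"holding {len(held)} MiB, press Ctrl+C to release")
    try:
        while True:
            time.sleep(60)
    except KeyboardInterrupt:
        pass
```

Run it in Pods of different QoS classes in a test namespace and watch which ones are throttled, reclaimed, or OOM-killed first.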