
Expose pod volumes / volumeClaimTemplates for persistent or SSD-backed storage #43

@jensens


Problem

spec.storage[].type=file emits -s <name>=file,<path>,<size> to varnishd, but there is currently no way to back <path> with anything other than the operator's built-in varnish-workdir EmptyDir at /var/lib/varnish (internal/controller/statefulset.go#L97-L100).

That's fine as a first step: on nodes whose kubelet ephemeral storage sits on NVMe/SSD, the file lands on fast media anyway. But it has real limits:

  • No persistence once the pod is recreated (the EmptyDir is discarded with the pod). For a pure HTTP cache that's usually acceptable; for larger working sets the warm-up cost is non-trivial.
  • No emptyDir.sizeLimit, no control over the EmptyDir medium, no way to cap disk use below the node's ephemeral-storage capacity.
  • No choice of StorageClass — on clusters with a dedicated SSD StorageClass (e.g. Hetzner hcloud-volumes, csi-driver-provisioned NVMe) there's no way to say "use that, 100 Gi, for the cache spill file".
  • No separation between OS disk and cache disk — heavy file-backed caches compete with node ephemeral use (logs, image layers).

Proposal

Add pod-volume / volume-claim support to the VinylCache CRD. Two layered options, smallest useful shape first:

Option A (minimum): spec.pod.volumes + spec.pod.volumeMounts

Arbitrary corev1.Volume[] + corev1.VolumeMount[] passthrough. Users bring their own PVC / hostPath / emptyDir-with-sizeLimit and reference the mount path from spec.storage[].path. Simple to implement, fully generic. Downside: users manage the PVC lifecycle externally.
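
To make the shape concrete, here is a rough sketch of what the Option A fields could look like on the Go API types; the PodOverrides name, JSON field names, and kubebuilder markers are assumptions for illustration, only the corev1 types are real Kubernetes API:

// Hypothetical API types for Option A (names are placeholders).
package v1alpha1

import (
	corev1 "k8s.io/api/core/v1"
)

// PodOverrides is the proposed spec.pod passthrough block.
type PodOverrides struct {
	// Volumes is appended verbatim to the generated pod spec.
	// +optional
	Volumes []corev1.Volume `json:"volumes,omitempty"`

	// VolumeMounts is appended to the varnishd container, so
	// spec.storage[].path can point below one of these mounts.
	// +optional
	VolumeMounts []corev1.VolumeMount `json:"volumeMounts,omitempty"`
}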

Option B (ideal): spec.volumeClaimTemplates

StatefulSet-native volumeClaimTemplates[] passthrough. The operator copies these into the StatefulSet, which creates per-pod PVCs (named <template>-<statefulset>-<ordinal>) that stay bound to their ordinal across pod restarts. Users additionally declare volumeMounts referencing the claim name. This is the right model for cache-per-replica on persistent SSD.

Option B subsumes A for the SSD-persistent case but is more work — Option A alone unlocks SSD-backed file storage via user-provisioned PVCs.
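
Continuing the same sketch (same package and imports as above), Option B would add a claim-template list on the spec; the VinylCacheSpec name and field placement are likewise assumptions:

// Hypothetical addition to the spec type for Option B.
type VinylCacheSpec struct {
	// ...existing fields (Replicas, Storage, the proposed Pod block, ...)

	// VolumeClaimTemplates is copied into the generated StatefulSet's
	// spec.volumeClaimTemplates, yielding one PVC per replica.
	// +optional
	VolumeClaimTemplates []corev1.PersistentVolumeClaim `json:"volumeClaimTemplates,omitempty"`
}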

Example end-state (Option B)

apiVersion: vinyl.bluedynamics.eu/v1alpha1
kind: VinylCache
spec:
  replicas: 2
  storage:
    - name: mem
      type: malloc
      size: 1500M
    - name: disk
      type: file
      path: /var/lib/varnish-cache/disk.bin
      size: 80Gi
  volumeClaimTemplates:
    - metadata:
        name: cache-ssd
      spec:
        accessModes: [ReadWriteOnce]
        storageClassName: hcloud-volumes
        resources:
          requests:
            storage: 100Gi
  pod:
    volumeMounts:
      - name: cache-ssd
        mountPath: /var/lib/varnish-cache
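
For reference, a minimal sketch of how the StatefulSet builder could forward these fields; the helper name and call site are assumptions (compare the existing builder in internal/controller/statefulset.go), only the appsv1/corev1 types and append semantics are standard Kubernetes:

// Hypothetical helper called from the existing StatefulSet builder.
package controller

import (
	appsv1 "k8s.io/api/apps/v1"
	corev1 "k8s.io/api/core/v1"
)

// applyUserVolumes forwards user-supplied claim templates, volumes and
// mounts into the generated StatefulSet.
func applyUserVolumes(sts *appsv1.StatefulSet,
	claims []corev1.PersistentVolumeClaim,
	vols []corev1.Volume,
	mounts []corev1.VolumeMount) {

	// Claim templates pass through verbatim; the StatefulSet controller
	// creates one PVC per replica from each template.
	sts.Spec.VolumeClaimTemplates = append(sts.Spec.VolumeClaimTemplates, claims...)

	// User volumes and mounts are appended after the operator-owned ones
	// (e.g. the varnish-workdir EmptyDir).
	sts.Spec.Template.Spec.Volumes = append(sts.Spec.Template.Spec.Volumes, vols...)
	varnishd := &sts.Spec.Template.Spec.Containers[0]
	varnishd.VolumeMounts = append(varnishd.VolumeMounts, mounts...)
}

With the example above and replicas: 2, standard StatefulSet behaviour would yield two PVCs from the cache-ssd template, one per pod, named cache-ssd-<statefulset-name>-0 and cache-ssd-<statefulset-name>-1, both on the hcloud-volumes class.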

Validation / webhook considerations

  • Reject spec.storage[].path values that resolve into the reserved mounts (/var/lib/varnish, /tmp, /run/vinyl, /etc/varnish/*); those are operator-owned. A path-check sketch follows after this list.
  • If both volumeClaimTemplates and pod.volumes name the same volume, webhook error.
  • Warn if storage[].type=file is declared without any user volume covering its path (i.e. file will land in EmptyDir) — fine, but worth surfacing.
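
A minimal sketch of the reserved-path check; validateStoragePath and the exact prefix list are illustrative, not existing code:

// Hypothetical validation helper for the admission webhook.
package webhook

import (
	"fmt"
	"path/filepath"
	"strings"
)

// reservedPrefixes are the operator-owned mount points from the list above.
var reservedPrefixes = []string{"/var/lib/varnish", "/tmp", "/run/vinyl", "/etc/varnish"}

// validateStoragePath rejects storage[].path values that resolve into a reserved mount.
func validateStoragePath(p string) error {
	clean := filepath.Clean(p)
	for _, prefix := range reservedPrefixes {
		if clean == prefix || strings.HasPrefix(clean, prefix+"/") {
			return fmt.Errorf("storage path %q falls inside reserved mount %q", p, prefix)
		}
	}
	return nil
}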

Out of scope for this issue

  • Multi-tier eviction policies, cache warming, snapshotting. Just plumbing the volume surface through.

Context

Came up while extending @bluedynamics/cdk8s-plone's PloneVinylCache construct to expose spec.storage (bluedynamics/cdk8s-plone#148). For a staging environment on Hetzner kup6s we'd like to trial an SSD-backed spill file on a dedicated hcloud-volumes PVC rather than sharing node ephemeral storage.
