
Deploying with OpenTofu on LINSTOR DRBD: the beginning of problems

9 min read
BENE Maël
System Administrator

After choosing LINSTOR DRBD as the distributed storage solution for my Proxmox HA cluster, it was time to move on to deployment automation with OpenTofu (open-source fork of Terraform). Spoiler: it didn't go as planned.

Context Recap

In my previous article about choosing a distributed storage technology, I opted for LINSTOR DRBD over Ceph for several reasons:

  • Superior performance: LINSTOR DRBD uses synchronous block-level replication via DRBD, offering better performance than Ceph on a 1 Gbps network
  • Simpler architecture: No need for monitors, managers, and OSDs like with Ceph
  • Resource consumption: Lighter on RAM and CPU

My infrastructure consists of:

  • 3 Proxmox nodes: acemagician (192.168.100.10), elitedesk (192.168.100.20), thinkpad (192.168.100.30)
  • LINSTOR configuration: thinkpad = controller, acemagician/elitedesk = satellites
  • Storage pools: linstor_storage (DRBD-replicated), local-lvm (local storage)
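
This topology can be double-checked from the controller with the stock LINSTOR client; the commands below are standard linstor subcommands (output omitted):

# Run on the controller (thinkpad)
linstor node list            # lists the controller and the two satellites
linstor storage-pool list    # shows linstor_storage on each node that carries it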

The Problem: LINSTOR Cannot Dynamically Provision VMs

When I attempted to deploy my K3s VMs with OpenTofu using the Proxmox provider, I encountered a systematic error:

Error: error creating VM: error cloning vm: 500 Internal Server Error:
unable to create VM 1000 - lvcreate 'linstor_storage/vm-1000-disk-0' error:
Exited with error code 1: Resource definition 'vm-1000-disk-0' not found.

My OpenTofu configuration was standard:

resource "proxmox_vm_qemu" "k3s_server_1" {
name = "k3s-server-1"
target_node = "acemagician"
clone = "ubuntu-2404-cloudinit"
vmid = 1000

disk {
storage = "linstor_storage"
size = "100G"
}
}

Why Does It Fail?

LINSTOR uses a strict hierarchical object model:

  1. Resource Definition: Template for a storage resource
  2. Volume Definition: Volume size and properties
  3. Resource: Actual storage instance on a node

The normal LINSTOR workflow requires manually creating these objects:

linstor resource-definition create mydata
linstor volume-definition create mydata 100G
linstor resource create node1 mydata --storage-pool linstor_storage
linstor resource create node2 mydata --storage-pool linstor_storage

The problem: When Proxmox attempts to clone a VM to LINSTOR:

  1. Proxmox calls lvcreate to create the new disk
  2. The LINSTOR plugin searches for the resource definition vm-XXX-disk-0
  3. The resource definition doesn't exist (Proxmox assumes dynamic creation)
  4. The operation fails

LINSTOR cannot dynamically create resource definitions during Proxmox clone operations. This is a fundamental architectural limitation.
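
To see this from the LINSTOR side, you can list the existing objects on the controller. These are standard linstor client commands; the point is simply that no vm-1000-disk-0 definition exists before the clone:

# On the controller: nothing named vm-1000-disk-0 exists yet
linstor resource-definition list
linstor volume-definition list
linstor resource list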

Workaround Attempts

I tried several approaches:

1. Explicit Full Clone

resource "proxmox_vm_qemu" "k3s_server_2" {
name = "k3s-server-2"
target_node = "elitedesk"
clone = "ubuntu-2404-cloudinit"
vmid = 1001
full_clone = true # Explicit full clone

disk {
storage = "linstor_storage"
size = "100G"
}
}

Result: Same error - "Resource definition not found"

2. Linked Clone

full_clone = false  # Attempt a linked clone

Result:

Error: 400 Bad Request:
Parameter verification failed. (400)
storage: linked clone feature is not supported for 'linstor_storage'

This makes sense: DRBD doesn't support snapshots, which are required for linked clones.

Considered Solutions

Faced with this problem, I'm torn between several options:

Option 1: Pre-creation Script for Resource Definitions

Create a bash or Python script that runs before OpenTofu to pre-create the resource definitions:

#!/usr/bin/env python3
import subprocess


def create_linstor_resource(vm_id, size_gb, nodes):
    """Creates a LINSTOR resource definition for a VM"""
    resource_name = f"vm-{vm_id}-disk-0"

    # Create the resource definition
    subprocess.run([
        "linstor", "resource-definition", "create", resource_name
    ], check=True)

    # Create the volume definition
    subprocess.run([
        "linstor", "volume-definition", "create",
        resource_name, f"{size_gb}G"
    ], check=True)

    # Create the resource on each node
    for node in nodes:
        subprocess.run([
            "linstor", "resource", "create",
            node, resource_name,
            "--storage-pool", "linstor_storage"
        ], check=True)


# Create resources for K3s VMs
vms = [
    {"id": 1000, "size": 100, "nodes": ["acemagician", "elitedesk"]},
    {"id": 1001, "size": 100, "nodes": ["elitedesk", "thinkpad"]},
    {"id": 1002, "size": 20, "nodes": ["thinkpad", "acemagician"]},
]

for vm in vms:
    create_linstor_resource(vm["id"], vm["size"], vm["nodes"])
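
Wiring this into the pipeline is a single extra step before the OpenTofu run. A minimal sketch, assuming the script is saved under a hypothetical path scripts/create_linstor_resources.py:

# Pre-create the LINSTOR objects, then deploy
python3 scripts/create_linstor_resources.py
tofu plan -out=tfplan
tofu apply tfplan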

Advantages:

  • Keeps LINSTOR DRBD usage
  • Allows automation via CI/CD
  • Maintains DRBD's superior performance
  • Preserves distributed storage for VMs

Disadvantages:

  • Additional complexity in the pipeline
  • Risk of desynchronization between script and OpenTofu configuration
  • Requires rigorous management of VMIDs and their allocation

Option 2: Manual Creation of Resource Definitions

Manually create LINSTOR resources before each deployment:

# For k3s-server-1 (VMID 1000)
linstor resource-definition create vm-1000-disk-0
linstor volume-definition create vm-1000-disk-0 100G
linstor resource create acemagician vm-1000-disk-0 --storage-pool linstor_storage
linstor resource create elitedesk vm-1000-disk-0 --storage-pool linstor_storage

Advantages:

  • Simple and straightforward solution
  • Full control over LINSTOR resources

Disadvantages:

  • Loss of Infrastructure as Code: Configuration drift guaranteed
  • Loss of GitOps: No traceability in git
  • Not automatable: Manual intervention for each deployment
  • Not scalable: Impossible for frequent deployments

This option completely contradicts my automation goals, so it's ruled out.

Option 3: Partition NVMe Drives (Local Storage + LINSTOR)

Partition the NVMe drives on each node into two parts:

  • One partition for local LVM storage (local-lvm)
  • One partition for the LINSTOR DRBD pool (linstor_storage)

Then use local-lvm for VM disks (simple provisioning) and linstor_storage for other needs requiring replication.

Important note for my Kubernetes use case: Using local-lvm (without Proxmox-level replication) is viable for a Kubernetes cluster because high availability is handled by Kubernetes, not by Proxmox. With etcd distributed across 3 nodes and a replicated control plane, the loss of a VM doesn't impact the cluster: Kubernetes keeps running on the remaining nodes. VMs become "cattle" (replaceable via Infrastructure as Code), while the precious "pets" data would live in application-level storage solutions.

Advantages:

  • Simple and fast VM provisioning on local-lvm
  • Preservation of LINSTOR DRBD for distributed storage needs
  • Optimal use of available hardware
  • Maximum performance for VMs (direct local access)
  • HA ensured at the right level: Kubernetes, not Proxmox

Disadvantages:

  • Setup complexity: Disk repartitioning required
  • Risk of data loss: Invasive operation on existing disks
  • Capacity planning: Need to determine partition size in advance
  • Less flexibility: Fixed partition sizes, difficult to modify
  • No HA at Proxmox level: VMs no longer benefit from replication (acceptable if HA at Kubernetes level)

Option 4: Migrate to Ceph with Network Upgrade

Abandon LINSTOR DRBD and migrate to Ceph, upgrading the network to 5 Gbps (or 10 Gbps if budget allows):

Advantages:

  • Native support for dynamic provisioning in Proxmox
  • Perfect integration with OpenTofu/Terraform
  • Mature and well-documented ecosystem
  • Snapshots and clones natively supported
  • Acceptable performance with a 5 Gbps NIC

Disadvantages:

  • Hardware cost: Purchase of 5 Gbps (or 10 Gbps) network cards for the 3 nodes
  • Increased complexity: Monitors, Managers, OSDs to manage
  • Resource consumption: More demanding on RAM and CPU than LINSTOR
  • Complete migration: Reconstruction of existing storage
  • Still inferior performance: Even with 5 Gbps, greater overhead than DRBD

My Current Thinking

I'm currently torn between these options:

Option 1 (Script) appeals to me because it preserves LINSTOR and automates everything. With fixed VMIDs (1000, 1001, 1002), the script would be relatively simple to maintain. I just need to ensure it runs before OpenTofu in the CI/CD pipeline.

Option 3 (Partitioning) is technically interesting but very invasive. Repartitioning NVMe drives in production is risky, and I lose high availability at the Proxmox level for the VMs themselves. However, in my Kubernetes context, this isn't necessarily a problem since HA is managed at the K3s cluster level, not at the individual VM level. If a VM goes down, Kubernetes continues to function with the other nodes.

Option 4 (Ceph + network upgrade) solves all technical problems but involves a hardware investment. A 5 Gbps switch + 3 network cards represents a significant budget for a homelab. On the other hand, it opens the door to other future possibilities.

Key Takeaways

LINSTOR ≠ General-purpose Storage for Proxmox

LINSTOR excels for certain use cases, but dynamic VM provisioning via Proxmox cloning is not one of them. LINSTOR documentation is heavily focused on resource-group and application storage, not Proxmox integration.

The Limitation Is Architectural, Not a Bug

This isn't a configuration problem or my mistake: LINSTOR is designed with an explicit resource management model. On-the-fly dynamic provisioning simply isn't in its philosophy.

HA Can Be Delegated to a Higher Layer

For a Kubernetes cluster, losing HA at the Proxmox level (VMs on local storage) isn't necessarily problematic. Kubernetes is designed to handle node failures - that's actually its main role. With distributed etcd and a replicated control plane, the cluster survives the loss of one or more nodes.

Each Solution Has Its Cost

  • Script → Software complexity
  • Partitioning → Operational complexity and loss of HA at Proxmox level
  • Ceph → System complexity and hardware cost

There's no silver bullet. I must choose which type of complexity I'm willing to accept.

Next Steps

I'll probably test Option 1 (pre-creation script) first, as it allows me to:

  1. Keep LINSTOR DRBD and its performance
  2. Fully automate deployment
  3. Avoid immediate hardware investment
  4. Learn to better manage LINSTOR programmatically

If this approach proves too complex or fragile, I'll reconsider either Option 3 (partitioning, acceptable in a Kubernetes context), or Option 4 (Ceph + network upgrade), which is the most "standard" and documented solution in the Proxmox ecosystem.

I'll document my final decision and its implementation in a future article.


Update: Final Solution Adopted

After testing Option 1 with a Python script for LINSTOR resource management, I found that this approach, while functional, added too much complexity and synchronization risks for production use.

The final decision: Partition the NVMe drives on each Proxmox node according to the following strategy:

  • 300 GB allocated to LINSTOR DRBD (linstor_storage) for:
    • VMs and LXC containers requiring high availability at the Proxmox level
    • The LXC container hosting the NFS server (see the zfs-sync-nfs-ha project)
    • Any distributed storage managed by Proxmox HA
  • 200 GB allocated to local LVM storage (local-lvm) for:
    • K3s cluster VMs that don't need HA at the Proxmox level
    • High availability ensured by the Kubernetes cluster itself
    • Simple and fast provisioning via OpenTofu

This architecture allows using the right tool for the right purpose: LINSTOR DRBD for what truly requires synchronous replication at the infrastructure level, and performant local storage for workloads where HA is managed by the application layer (Kubernetes).
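
For reference, here is a rough sketch of the per-node commands this layout implies, assuming a single NVMe device and hypothetical partition/VG names (the detailed, tested procedure will be in the follow-up article):

# Hypothetical device and VG names; sizes follow the 300/200 GB split above
sgdisk --new=1:0:+300G --typecode=1:8e00 /dev/nvme0n1   # partition backing the LINSTOR pool
sgdisk --new=2:0:+200G --typecode=2:8e00 /dev/nvme0n1   # partition backing local LVM storage

vgcreate vg_linstor /dev/nvme0n1p1
vgcreate vg_local /dev/nvme0n1p2
lvcreate --type thin-pool -l 90%FREE -n data vg_local   # headroom for thin-pool metadata

# Register the local thin pool with Proxmox (storage name and content types to taste)
pvesm add lvmthin local-lvm --vgname vg_local --thinpool data --content images,rootdir

# Register the LINSTOR storage pool (run from the controller, repeat per node)
linstor storage-pool create lvm acemagician linstor_storage vg_linstor
linstor storage-pool create lvm elitedesk linstor_storage vg_linstor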

A detailed article on this implementation and the HA NFS container will follow soon.

