Proxmox Management

This repo contains scripts used to manage a multi-tenant Proxmox environment for the Reudnetz w.V. These scripts were created because the Ansible modules lacked the necessary features.


This hacky solution will be replaced by a clean Terraform abstraction!


Overview

We use users, groups, resource pools, and roles to build a multi-tenant Proxmox instance.

Proxmox is by design not a "cloud native" hypervisor platform. Although it is possible to separate users and resources from one another, there is no way to give users/organisations a quota of CPU, memory, and disk from which they can assign virtual machines/containers to themselves.

To overcome this limitation, an admin creates those VMs/containers and then attaches them read-only to the specific user/organisation.
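Under the hood, "attaching" boils down to resource-pool membership (see Proxmox Implementation below). As a sketch, noting that the exact API path may differ between PVE versions:

# add an existing vm to the organisation's resource pool (sketch)
pvesh set /pools/${orga} --vms ${vmid}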

Moving Parts

For our hypervisor setup we need to administer two components:

  • the hypervisor platform
  • routing/switching

Currently there is no idempotent automation for the hypervisor platform. State changes can be done via the scripts or the CLI directly (or via the web GUI, of course).

The routing/switching side is fully automated. State changes are configured by modifying the vlan_vms.yml file.

Terminology

Our abstraction is based on these objects:

  • users
  • organisations
  • vms

Users can be part of multiple organisations. Every vm is bound to a single organisation.

Proxmox Implementation

In proxmox we use a few more components to implement our abstraction:

  • users
  • groups (used as organisations)
  • resource pools
    • used to map permissions for groups
  • ACLs

Under the (Proxmox) hood, our users (as defined in our abstraction) are normal users. Organisations are abstracted via groups.

VMs and file-based storage are assigned to so-called resource pools.

We now map permissions for a specific organisation (group) to a specific resource pool. This separates the resources and creates a multi-tenant platform.
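A rough sketch of what this looks like on the CLI (roughly what ./create_organisation automates; the role name PVEVMUser is an assumption here, the actual role depends on the desired permission set):

# create the group (organisation) and a matching resource pool
pveum group add ${orga}
pvesh create /pools --poolid ${orga}
# grant the group a user-level role on the pool (role name is an assumption)
pveum acl modify /pool/${orga} --groups ${orga} --roles PVEVMUser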

(WIP)

Requirements

How to use these scripts in a PVE environment: TBD.

Create an Organisation

  1. create Proxmox-related objects
./create_organisation ${name}
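A quick way to verify that the objects were created (optional):

pveum group list | grep ${name}
pvesh get /pools | grep ${name}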

Create a VM

prerequisites:

  • there should be an organisation created for that VM
  • you should have a list of authorized_keys for that organisation saved as a file on that node (see the sketch below)
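If that file does not exist yet, it can be assembled from the members' public keys. The file names and path below are hypothetical:

# hypothetical paths - collect the organisation's public keys into one file
cat alice.pub bob.pub > /root/authorized_keys_${orga}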
  1. search for the next usable vlan/vm-id in vlan_vms.yml
  • either reuse an old but freed id (its block should be commented out)
  • or use the next free one (see the vmid sketch after this list)
  2. check that we have the newest images
grep debian download_images
ls /root/images
# execute ./download_images if the script's URL does not match the images directory
  3. create the VM
./create_vm ${vmid} ${orga}-${vmname}-${vmid} ${orga}
./get_linklocal_for_vm ${vmid}
  4. add the vlan configuration to vlan_vms.yml and deploy via Ansible
vim vlan_vms.yml
[...] copy a block or create a new one - change the prefixes accordingly
[...] enter the link-local address for the v6 prefix
# deploy routers/switches
ansible-playbook -CD cores.yml -l cores,core_switches
ansible-playbook -D cores.yml -l cores,core_switches
# deploy hyperjump
ansible-playbook -CD vms.yml -l hyperjump
ansible-playbook -D vms.yml -l hyperjump
  5. configure cloud-init settings for the VM
qm set ${vmid} --ipconfig0 "ip=${v4-cidr},gw=${v4-gw},ip6=${v6-prefix},gw6=fe80::1"
qm set ${vmid} --sshkeys ${path_to_authorized_keys}
  6. change the VM specifications to match the order
# increase memory / cpu
qm set ${vmid} --memory 8192 --cores 4

# add more storage
#  use customer-disks on hyper01 and hyper03
#  use customer-disks-slow on hyper02 (dedicated sata ssd pool with more space but less bandwidth)
qm set ${vmid} --virtio1 customer-disks-slow:250,iothread=1,discard=on
  7. start the VM
qm start ${vmid}
  8. check reachability of the VM
ping6 ${v6prefix}
nc ${v6prefix} 22
  9. notify the customer
vmid: ${vmid}
vmname: ${vmname}
user: debian
authentication: key based auth only

v4: ${v4-cidr}
v4gw: ${v4-gw}
v6: ${v6-prefix}
v6gw: fe80::1

v6 reachable from everywhere - v4 only inside the Reudnetz
ssh reverse proxy (for v4 access only) available via: ssh -p 2${vmid} <user>@hyperjump.reudnetz.org

best effort service - expect downtimes - back up your stuff!
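As a worked example of the naming scheme (all values hypothetical): for vmid 2042 of the organisation acme, step 3 would be ./create_vm 2042 acme-web01-2042 acme, and the v4 ssh reverse proxy would listen on port 22042 (a 2 prefixed to the vmid). To find a vmid that is free cluster-wide, the resource list can help; note it does not know about freed-but-still-reserved ids, so check vlan_vms.yml as well:

# list all vmids currently in use across the cluster, sorted
pvesh get /cluster/resources --type vm --output-format json | grep -o '"vmid":[0-9]*' | sort -t: -k2 -n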

Destroy a VM

Commands need to be executed on the node where the VM resides.

  1. schedule removal of replicated storage (if configured)
pvesr list | grep ${vmid}
pvesr delete ${vmid}-${replication_id}
# wait till removal is complete - check with pvesr list
  2. remove the VM
qm shutdown ${vmid}
qm destroy ${vmid} --purge 1
  3. remove the VM from vlan_vms.yml and deploy
vim vlan_vms.yml
[...]
# remove router and switch config
ansible-playbook -CD cores.yml -l cores,core_switches
ansible-playbook -D cores.yml -l cores,core_switches
# remove ssh reverse proxy config
ansible-playbook -CD vms.yml -l hyperjump
ansible-playbook -D vms.yml -l hyperjump
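A quick sanity check that the VM is really gone (optional):

# should return nothing once the vm has been destroyed
pvesh get /cluster/resources --type vm | grep ${vmid}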

Destroy a User

./delete_user ${username}@pve

Destroy an Organisation

  1. remove the orga and associated objects from the Proxmox cluster
./delete_organisation ${orga}
  2. destroy remaining zfs volumes on the other nodes
# execute on all pve nodes
zfs destroy -r rpool/customer/${orga}-images
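For example, as a loop (a sketch, assuming root ssh access between the nodes; node names taken from the storage notes above):

for node in hyper01 hyper02 hyper03; do
    ssh ${node} zfs destroy -r rpool/customer/${orga}-images
done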

Add/Remove a User from an Organisation

  1. check which groups the user currently belongs to
pveum group list | grep ${user}@pve
  2. change group memberships
# add to group
pveum user modify ${user}@pve --groups ${currentgroups},${new_group}
# remove from group
pveum user modify ${user}@pve --groups ${groups-without-other-group}
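Note that --groups replaces the full list, so the current memberships have to be included every time. A sketch for reading them programmatically (assuming jq is installed and that the access API returns groups as an array):

# print the user's current groups as a comma-separated list (sketch)
pvesh get /access/users/${user}@pve --output-format json | jq -r '.groups | join(",")'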