For some reason, the denizens of the Internet assume that all difficulties when using PVE stem from screwing up your cluster. This is kind of odd when you consider that, sometimes, PVE servers run on their own...
== Some resources: ==

[https://engineerworkshop.com/blog/how-to-unlock-a-proxmox-vm/ Proxmox Locked VM Errors]
== qm commands fail hard ==
Example:
<syntaxhighlight>
root@proxmox-pve:~# qm list
ipcc_send_rec[1] failed: Connection refused
ipcc_send_rec[2] failed: Connection refused
ipcc_send_rec[3] failed: Connection refused
Unable to load access control list: Connection refused
</syntaxhighlight>
Nearly every Google hit is a discussion about how to get your cluster working again... :|
The actual problem, OTOH, appears to be that if your hostname doesn't match what's in <code>/etc/hosts</code>, <code>qm</code> gets lost...
Take a look at <code>/etc/hostname</code>. In our example, it'll look like:

<syntaxhighlight>
proxmox-pve
</syntaxhighlight>
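A quick way to check for the mismatch without opening any files (a sketch using standard tools; the <code>proxmox-pve</code> name is just the example above):

<syntaxhighlight lang="bash">
# Print the current hostname (comes from /etc/hostname)
hostname

# Look the hostname up through NSS, roughly the same lookup PVE does.
# If this prints nothing or fails, qm's ACL lookup will fail too.
getent hosts "$(hostname)"
</syntaxhighlight>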
Now, if <code>/etc/hosts</code> contains:
<syntaxhighlight>
127.0.0.1 localhost.localdomain localhost
192.168.1.2 pve.nerdmage.ca pve

# The following lines are desirable for IPv6 capable hosts
::1     ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts
</syntaxhighlight>
poor PVE is gonna be confused.
That second line should be:
<syntaxhighlight>
192.168.1.2 proxmox-pve.nerdmage.ca proxmox-pve
</syntaxhighlight>
Once that's fixed, restarting the machine should get things back to a working state.
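If you'd rather verify the fix first (and possibly skip the reboot), something along these lines should work. <code>pve-cluster</code> is the stock service behind <code>qm</code>'s ACL lookups, but treat the restart as a sketch, not gospel:

<syntaxhighlight lang="bash">
# The hostname should now resolve to the LAN address, not fail
getent hosts "$(hostname)"

# Restart the cluster filesystem so qm can talk to it again
systemctl restart pve-cluster

# This should now list VMs instead of "Connection refused"
qm list
</syntaxhighlight>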
== "EXT4-fs (dm-xx): write access unavailable, skipping orphan cleanup" ==

Do Not Panic!

Despite showing up repeatedly on the console, this is not actually an error; it's some sort of random silliness.
[https://forum.proxmox.com/threads/ext4-fs-dm-10-write-access-unavailable-skipping-orphan-cleanup.46785/ Forum reference]

As far as I've managed to determine, it's something related to LXCs being locked during backup.
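If you want to convince yourself the messages line up with backup windows, comparing kernel log timestamps against the PVE task index is a reasonable sanity check (paths are the stock PVE ones; adjust if yours differ):

<syntaxhighlight lang="bash">
# When did the kernel last print the message?
journalctl -k --no-pager | grep 'orphan cleanup' | tail -n 5

# When did vzdump backup tasks run? PVE keeps a task index here.
grep vzdump /var/log/pve/tasks/index | tail -n 5
</syntaxhighlight>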
"usb 2-1-port2: disabled by hub (EMI?), re-enabling..."
Far too much research indicates that pretty much no-one has a clue what this means... Though I '''did''' find some reference to radio interference... hhhmmm...
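While nobody seems to know the root cause, it's at least easy to see which device is hanging off the complaining port (standard tools; <code>lsusb</code> comes from the <code>usbutils</code> package, which may need installing):

<syntaxhighlight lang="bash">
# Show the kernel messages around the disable/re-enable events
dmesg | grep -B1 -A2 'disabled by hub'

# Map the USB topology so "2-1-port2" can be matched to a device
lsusb -t
</syntaxhighlight>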
== Post Power Failure Boot Problems ==
Recently, one of my servers has begun failing to start its guests upon bootup after a power failure event.
'''Activation of logical volume pve/vm-XXXX-disk-X is prohibited while logical volume pve/data_tmeta is active.'''

'''WARNING: Device /dev/sdi3 not initialized in udev database even after waiting 10000000 microseconds.'''

'''TASK ERROR: activating LV 'pve/data' failed: Activation of logical volume pve/data is prohibited while logical volume pve/data_tdata is active.'''

'''[ TIME ] Timed out waiting for device /dev/disk/by-uuid????????-????-????-????-????????????.'''
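Before reaching for the repair below, it's worth seeing what state LVM thinks the thin pool is in (plain LVM/device-mapper tools, nothing PVE-specific):

<syntaxhighlight lang="bash">
# Which LVs are active? An 'a' in the attr column means active.
lvs -a -o lv_name,lv_attr,devices pve

# Volume group health at a glance
vgs pve

# Device-mapper's view of the thin pool stack
dmsetup ls --tree
</syntaxhighlight>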
[https://forum.proxmox.com/threads/task-error-activating-lv-pve-data-failed-activation-of-logical-volume-pve-data-is-prohibited-while-logical-volume-pve-data_tdata-is-active.106225/ This thread] has some hints...
Specifically, the following ''sometimes'' gets it running again (a scripted version with a sanity check follows the list):

* <code>lvchange -an pve/data</code>
* <code>lvconvert --repair pve/data</code>
* <code>lvchange -ay pve/data</code>
* <code>reboot</code>
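The same sequence as a script, with a check before the reboot (the <code>lvs</code> check is my addition, not from the thread):

<syntaxhighlight lang="bash">
# Deactivate the thin pool so lvconvert can work on it
lvchange -an pve/data

# Repair the thin pool metadata (swaps in a fresh _tmeta)
lvconvert --repair pve/data

# Reactivate the pool
lvchange -ay pve/data

# Before rebooting: data, data_tdata and data_tmeta should show as active
lvs -a pve

reboot
</syntaxhighlight>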
[https://unix.stackexchange.com/questions/690552/boot-taking-forever-with-a-106-jbod-attached-warning-device-dev-xxx-not-ini This thread] has a suggested config change, but this doesn't seem to have worked.
Then, there's [https://forum.proxmox.com/threads/local-lvm-not-available-after-kernel-update-on-pve-7.97406/page-2#post-430860 this thread] where people seem to believe it's a bug in Debian...