Hi everyone,
I've run into what may be an uncommon issue when replacing nodes in Qluman. Say I have a node all configured, then I completely remove that machine and replace it with a new one (may or may not be different specs). I have noticed that Qluman allows one to make the new node in the configs, but it never boots Qlustar.
I also found the fix: the file /var/lib/misc/dnsmasq.leases doesn't appear to get updated, but *only when deleting a node.* Adding new nodes works perfectly, as does changing MAC addresses. As a workaround, I've had to manually delete the corresponding entries from the file on the headnode, after which Qluman works perfectly when adding a New Host with the old IP address.
Can anyone else reproduce this issue? If so, could the scripts be extended in the next round of updates to delete the line matching the hostname or MAC address in /var/lib/misc/dnsmasq.leases? It makes sense that leases should have an infinite lifetime in a cluster (nodes should sit tight for years), but in this rare situation that requires a bit of extra manual tweaking.
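For illustration, the manual cleanup described above could be sketched roughly as follows. This is only a sketch and not anything Qluman does: it assumes each line of the leases file has the whitespace-separated fields expiry, MAC, IP, hostname, client-id, and the function name `drop_leases` and the sample lines are hypothetical.

```python
def drop_leases(lines, hostname=None, mac=None):
    """Return the lease lines that match neither the given hostname nor MAC."""
    mac = mac.lower() if mac else None
    kept = []
    for line in lines:
        fields = line.split()
        # Assumed field order: expiry, MAC, IP, hostname, client-id.
        if len(fields) >= 4:
            if mac and fields[1].lower() == mac:
                continue  # lease held by the MAC being removed
            if hostname and fields[3] == hostname:
                continue  # lease recorded under the hostname being removed
        kept.append(line)
    return kept


# Hypothetical sample entries, not taken from a real leases file.
sample = [
    "0 02:01:99:99:a8:c9 172.16.168.201 node-01 *",
    "0 02:01:99:99:a8:ca 172.16.168.202 node-02 *",
]
print(drop_leases(sample, mac="02:01:99:99:A8:C9"))
```

Matching on either the hostname or the MAC covers both the case where the replacement machine keeps its name and the case where only the old hardware address is known.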
Thanks, -Mike
hereiam@mit.edu wrote:
Hi everyone,
I've run into what may be an uncommon issue when replacing nodes in Qluman. Say I have a node all configured, then I completely remove that machine and replace it with a new one (may or may not be different specs). I have noticed that Qluman allows one to make the new node in the configs, but it never boots Qlustar.
I can confirm this problem. Looking in /var/log/syslog I see:
Jun 25 09:36:17 ql-head-dev-g dnsmasq-dhcp[2073318]: not using configured address 172.16.168.201 because it is leased to 02:01:99:99:a8:c9
And the lease never expires so dnsmasq will never boot the new node. Unfortunately there is no provision in qluman to delete the lease when a node is deleted or the MAC changed. Just editing the leases file isn't enough either; dnsmasq needs to be restarted as well.
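As a rough sketch of that manual workaround (edit the leases file, then restart dnsmasq), something like the following might do. It is not part of qluman. The assumptions here are that dnsmasq runs as a systemd unit named `dnsmasq`, that stopping it before the edit is the safest order, and that the MAC is the second whitespace-separated field of a lease line; the function name `purge_lease` is hypothetical.

```python
import subprocess
from pathlib import Path

LEASES = Path("/var/lib/misc/dnsmasq.leases")


def purge_lease(mac, leases=LEASES, run=subprocess.run, service="dnsmasq"):
    """Stop dnsmasq, drop every lease line for `mac`, then start it again.

    Returns the number of lease lines removed.
    """
    mac = mac.lower()
    # Assumption: the service is managed by systemd under this unit name.
    run(["systemctl", "stop", service], check=True)
    try:
        lines = leases.read_text().splitlines(keepends=True)
        # Keep a line unless one of its fields is exactly the MAC.
        kept = [l for l in lines if mac not in l.lower().split()]
        leases.write_text("".join(kept))
    finally:
        # Bring dnsmasq back up even if the edit failed.
        run(["systemctl", "start", service], check=True)
    return len(lines) - len(kept)


# Example (needs root on the headnode):
# purge_lease("02:01:99:99:a8:c9")
```

The `run` parameter only exists so the function can be tried against a copy of the file without touching the real service.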
I tried adding the file for deletion when the dnsmasq config is written, but then it will always show up as a diff, meaning qluman-qt will always show that the dnsmasq config needs to be written, which is less than ideal.
One of our upcoming changes is that DHCP is only used by the bios/uefi to boot the node and the OS then uses purely static network config generated by qlumand. This speeds up the boot and simplifies some corner cases. With that change I think it will be OK to give leases a limited lifetime to solve this issue without having to mess with dnsmasq internals or negative effects.
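To illustrate what a limited lease lifetime could look like on the dnsmasq side: its `dhcp-host` option accepts an optional lease time as a trailing field. The sketch below only formats such a line. How qluman will actually generate its dnsmasq config is not shown here, and the helper name, the hostname and the `1h` value are hypothetical.

```python
def dhcp_host_line(mac, ip, hostname, lease_time="1h"):
    """Format a dnsmasq dhcp-host entry with an explicit lease time."""
    # dnsmasq accepts values such as "45m", "12h" or "infinite" here.
    return f"dhcp-host={mac},{ip},{hostname},{lease_time}"


# Hypothetical host entry; the hostname is made up.
print(dhcp_host_line("02:01:99:99:a8:c9", "172.16.168.201", "node-01"))
```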
Thanks for the report, Goswin von Brederlow
On Fri, Jun 25, 2021 at 07:53:35AM -0000, Goswin von Brederlow wrote:
hereiam@mit.edu wrote:
Hi everyone,
I've run into what may be an uncommon issue when replacing nodes in Qluman. Say I have a node all configured, then I completely remove that machine and replace it with a new one
We've come across that problem as well when replacing defective main boards (with on-board network).
Jun 25 09:36:17 ql-head-dev-g dnsmasq-dhcp[2073318]: not using configured address 172.16.168.201 because it is leased to 02:01:99:99:a8:c9
And the lease never expires so dnsmasq will never boot the new node.
Interestingly, there seems to be a difference in behaviour between dnsmasq and ISC dhcpd: the latter does not record statically configured addresses in its leases file, so there would be no problem when a MAC changes.
One of our upcoming changes is that DHCP is only used by the bios/uefi to boot the node and the OS then uses purely static network config generated by qlumand. This speeds up the boot and simplifies some corner cases. With that change I think it will be OK to give leases a limited lifetime to solve this issue without having to mess with dnsmasq internals or negative effects.
That's good news. Any idea as to when the lease time will be reduced?
Thanks,
A.
"A" == Ansgar Esztermann-Kirchner aeszter@mpibpc.mpg.de writes:
Hi Ansgar,
>> One of our upcoming changes is that DHCP is only used by the
>> bios/uefi to boot the node and the OS then uses purely static
>> network config generated by qlumand. This speeds up the boot and
>> simplifies some corner cases. With that change I think it will be
>> OK to give leases a limited lifetime to solve this issue without
>> having to mess with dnsmasq internals or negative effects.

A> That's good news. Any idea as to when the lease time will be
A> reduced?
It will be part of one of the next releases (watch the changelogs).
Best,
Roland
On Mon, Nov 08, 2021 at 12:01:02PM +0100, Roland Fehrenbacher wrote:
"A" == Ansgar Esztermann-Kirchner aeszter@mpibpc.mpg.de writes:
Hi Ansgar,
A> That's good news. Any idea as to when the lease time will be
A> reduced?
will be part of one of the next releases (watch the changelogs).
OK, thanks.
A.
Excellent, thank you very much for being responsive as always! I've got my own system patched for now.