Hi,
First off, many thanks Roland for your earlier help with the Qlustar 12.0 installer. We've been up and running for a few months now and have almost everything working as desired.
I'm writing because I have a few questions, both about some issues we have not been able to resolve and about some that appeared after the latest security updates (QSA-0903211 and QSA-0903212).
Latest issues first:
1. Since installing the latest security updates a few days ago I have noticed that Slurm has stopped working and the qluman-qt 'Components' option is grayed out (see https://imgur.com/a/W1Ejsvn). I'm not totally sure what the issue might be, but I noticed that the slurmdbd and slurmctld services are not running on the head node, and in fact are masked, which I understand will stop them from starting, right? You can see the output from 'service --status-all' and 'systemctl list-unit-files --state=masked' below. Any suggestions about how to re-enable Slurm?
service --status-all:
0 root@geo-hpcc ~ # service --status-all
 [ - ]  apache-htcacheclean
 [ + ]  apache2
 [ - ]  console-setup.sh
 [ + ]  cpufrequtils
 [ + ]  cron
 [ + ]  dbus
 [ + ]  dnsmasq
 [ ? ]  ganglia-monitor
 [ ? ]  gmetad
 [ - ]  grub-common
 [ - ]  hwclock.sh
 [ - ]  ipmievd
 [ - ]  jobmonarch-jobmond
 [ - ]  keyboard-setup.sh
 [ + ]  kmod
 [ + ]  loadcpufreq
 [ - ]  lvm2
 [ - ]  lvm2-lvmpolld
 [ + ]  munge
 [ + ]  mysql
 [ + ]  nagios4
 [ - ]  networking
 [ - ]  nfs-common
 [ + ]  nfs-kernel-server
 [ + ]  ntp
 [ + ]  postfix
 [ + ]  procps
 [ + ]  qluman-dhcpscanner
 [ + ]  qluman-execd
 [ + ]  qluman-router
 [ + ]  qluman-server
 [ - ]  qluman-slurmd
 [ + ]  resolvconf
 [ - ]  rng-tools
 [ + ]  rpcbind
 [ - ]  rsync
 [ + ]  rsyslog
 [ + ]  saslauthd
 [ + ]  schroot
 [ - ]  scsitools-pre.sh
 [ - ]  scsitools.sh
 [ + ]  slapd
 [ - ]  slurmctld
 [ - ]  slurmdbd
 [ + ]  ssh
 [ + ]  sssd
 [ + ]  sysstat
 [ + ]  udev
 [ - ]  x11-common
systemctl list-unit-files --state=masked:
0 root@geo-hpcc ~ # systemctl list-unit-files --state=masked
UNIT FILE                    STATE   VENDOR PRESET
cryptdisks-early.service     masked  enabled
cryptdisks.service           masked  enabled
hwclock.service              masked  enabled
jobmonarch-jobmond.service   masked  enabled
lvm2.service                 masked  enabled
nfs-common.service           masked  enabled
qluman-slurmd.service        masked  enabled
rc.service                   masked  enabled
rcS.service                  masked  enabled
scsitools-pre.service        masked  enabled
scsitools.service            masked  enabled
slurmctld.service            masked  enabled
slurmdbd.service             masked  enabled
sudo.service                 masked  enabled
x11-common.service           masked  enabled

15 unit files listed.
2. In addition to the Slurm issue, I have also not been able to create new user accounts because the 'Components' option in qluman-qt is grayed out. I have previously migrated our old NIS user accounts to LDAP, but I lack experience with LDAP and am unsure how to investigate this issue.
And finally, one older issue we have not resolved:
3. How do you change user passwords for LDAP users in Qlustar 12? I have been able to generate new passwords in qluman, but have not been able to figure out what users should do to update their passwords themselves. The autogenerated passwords seem to have a limited lifetime, so I have manually reset passwords when users needed them. Obviously, this is not sustainable in the long run. Do you have a suggestion?
And sorry for the compilation of questions above. Hopefully the fixes are obvious to you and not too much trouble.
Best, Dave
Hi,
Just wanted to see whether there was any advice about how to resolve the issues in my previous email. I have not made much of an attempt to start tinkering myself because I fear I might break things further :).
Thanks! Dave
"D" == david whipp david.whipp@helsinki.fi writes:
Hi Dave,
please note that this is a community forum and not a one-to-one support chat handled by Q-Leap. Q-Leap staff read the forum messages, and if there is an indication of a bug in Qlustar software, we will react. If this is not so, as in your case, most likely there will be no reaction from our side and help has to be provided by community members. I assume that you checked out our web sites and know that Q-Leap provides professional support. But also note that we do so only for clusters that have been set up by ourselves within the context of a clearly defined customer project. Paid services/support is the only source of financing for Qlustar development ...
One comment concerning your question 3: Users should be able to change their passwords simply by using the 'passwd' command.
Best,
Roland
D> Hi, Just wanted to see whether there was any advice about how to
D> resolve the issues in my previous email. I have not made much of
D> an attempt to start tinkering myself because I fear I might break
D> things further :).
D> Thanks! Dave
Dear Roland,
Thanks for the reply.
> please note that this is a community forum and not a one-to-one support chat handled by Q-Leap. Q-Leap staff reads the forum messages and in case there is an indication of a bug in Qlustar software, we will react. If this is not so, like in your case, most likely there will be no reaction from our side and help has to be provided by community members.
I do appreciate that this is a community forum, and I hope I’ve not misused the email list with my earlier messages.
I suppose my feeling was that our cluster was working properly, I installed two security updates, and then things that had been functioning properly were no longer working. I did not mask any services myself, so I was thinking that something in the security updates may have done so. Given this, and the fact that the cluster did not return to operating as it had, my view was that perhaps this was unexpected behaviour and/or possibly a bug. Both points 1 and 2 in my original message, as far as I can tell, are issues that appeared after rebooting the head node following the security update installations. I followed the instructions for installing the security updates at https://docs.qlustar.com/Qlustar/12.0/ClusterOS/administration-manual/admini....
> I assume that you checked out our web sites and know that Q-Leap provides professional support. But also note that we do so only for clusters that have been set up by ourselves within the context of a clearly defined customer project. Paid services/support is the only source of financing for Qlustar development ...
I have checked the websites for things like how to perform the security updates, yes. I also understand that direct support from Q-Leap for issues such as the configuration of our cluster would fall under the professional support umbrella. No problem there. I’m just not sure about points 1 and 2, which appear to be related to the installation of the security updates, as I had not made any recent configuration changes prior to updating.
> One comment concerning your question 3: Users should be able to change their passwords simply by using the 'passwd' command.
OK, thanks. I had been looking in the Qlustar docs for this, as we had previously used yppasswd to change passwords. I’ll inform our users about this.
Best, Dave
"D" == Whipp, David M david.whipp@helsinki.fi writes:
Hi Dave,
D> I suppose my feeling was that our cluster was working properly, I
D> installed two security updates, and then things that had been
D> functioning properly previously were no longer working. I did not
D> mask any services myself, so I was thinking that something in the
D> security updates may have done so. Given this, and the fact the
D> cluster did not return to operating as it had, my view was that
D> perhaps this was unexpected behaviour and/or possibly a bug. Both
D> points 1 and 2 in my original message, as far as I can tell, are
D> issues that appeared after rebooting the head node following the
D> security update installations.
I understand that the simplest explanation for your problems is the security updates. However, we have done the updates on our test clusters and many other systems already, where no such effect occurred. Besides, slurm packages were not part of the update. So I can only assume that some other changes you have made on the cluster caused this and that their effect only became visible after the reboot. The only option left for you is to debug the issue.
Best,
Roland
Hi,
> I understand that the simplest explanation for your problems is the security updates. However, we have done the updates on our test clusters and many other systems already, where no such effect occurred. Besides, slurm packages were not part of the update. So I can only assume that some other changes you have made on the cluster caused this and that their effect only became visible after the reboot. The only option left for you is to debug the issue.
OK, good to know. I had thought it was odd that Slurm would have been affected, so we will now focus on any changes we had made prior to installing the security update.
Thanks!
Dave
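
A minimal sketch of where the debugging discussed in this thread could start; it is not a confirmed fix. It relies on two general facts: systemd represents a masked unit as a symlink to /dev/null, and dpkg records package installs, upgrades and removals in its log. The directories and the log path are the usual Debian/Ubuntu defaults and are assumptions about this head node. The function names list_masked_links and package_changes are made up for this illustration.

```shell
#!/bin/sh
# Sketch only: paths below are the usual Debian/Ubuntu defaults and are
# assumptions about this particular head node.

# Print every entry in directory $1 that is a symlink to /dev/null,
# which is how systemd represents a masked unit.
list_masked_links() {
    for f in "$1"/*; do
        [ -L "$f" ] || continue
        [ "$(readlink "$f")" = "/dev/null" ] && basename "$f"
    done
    return 0
}

# Print the install/upgrade/remove lines in dpkg log $2 whose package
# name contains the string $1. Prints nothing if the log is unreadable.
package_changes() {
    [ -r "$2" ] || return 0
    awk -v pat="$1" \
        '($3 == "install" || $3 == "upgrade" || $3 == "remove") && index($4, pat)' "$2"
}

# Masks under /etc are typically created by an administrator or a tool;
# masks under /lib are typically shipped by packages.
for dir in /etc/systemd/system /lib/systemd/system; do
    echo "masked via $dir:"
    list_masked_links "$dir"
done

echo "slurm package changes recorded by dpkg:"
package_changes slurm /var/log/dpkg.log
```

If slurmctld.service or slurmdbd.service shows up under /etc/systemd/system, 'ls -l' on the link shows when it was created, which can be compared with the date of the updates. 'systemctl unmask slurmctld.service' would remove such a link, but whether that is the right step on a Qlustar head node is not established in this thread.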