Hi,
I'm currently using the Qlustar management interface, but the GPU wizard changed GresTypes=gpu to an empty GresTypes=, which caused the following error:
### beosrv-c ###
writing /etc/qlustar/common/slurm-llnl/slurm.conf
writing /etc/qlustar/common/slurm-llnl/gres.conf
writing /etc/qlustar/common/slurm-llnl/cgroup.conf
postcmd: service slurmctld restart; echo ' * Waiting for slurmd to be ready...'; sleep 5; scontrol reconfigure
Job for slurmctld.service failed because the control process exited with error code.
See "systemctl status slurmctld.service" and "journalctl -xeu slurmctld.service" for details.
 * Waiting for slurmd to be ready...
scontrol: error: Parse error in file /etc/qlustar/common/slurm-llnl/slurm.conf line 169: "GresTypes="
scontrol: fatal: Unable to process configuration file
1
How can I fix this, and how do I add GresTypes? Can I set GresTypes=gpu,mps?
Regards, Morihisa
Hi Morihisa,
You will have to add a gpu GRES group and assign it to some nodes. Only then will GresTypes='gpu' be set automatically. GresTypes='...' should not be set manually in the Slurm config header. Also be aware that MPS lets you partition a GPU for use by only a single user. From the Nvidia docs: 'only one user on a system may have an active MPS server'.
Best,
Roland
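
The parse error above comes from a GresTypes= line with no value. As a rough sketch (not part of Qlustar or Slurm), a few lines of Python could flag such lines in the generated config before slurmctld is restarted. The function name find_empty_gres_types and the sample config are hypothetical, and matching the key case-insensitively is an assumption.

```python
import re

# Matches a GresTypes line and captures whatever follows the "=" sign.
# Case-insensitive matching is an assumption made for this sketch.
_GRES_TYPES = re.compile(r"^\s*GresTypes\s*=\s*(.*?)\s*$", re.IGNORECASE)


def find_empty_gres_types(conf_text):
    """Return the 1-based line numbers where GresTypes has no value."""
    empty = []
    for lineno, line in enumerate(conf_text.splitlines(), start=1):
        # Ignore anything after a "#" so commented-out lines are skipped.
        stripped = line.split("#", 1)[0]
        match = _GRES_TYPES.match(stripped)
        if match and match.group(1) == "":
            empty.append(lineno)
    return empty


# Hypothetical config text for illustration only.
sample = """\
ClusterName=demo
GresTypes=
NodeName=node01 Gres=gpu:2
"""
print(find_empty_gres_types(sample))  # → [2]
```

To check the real file, the text of /etc/qlustar/common/slurm-llnl/slurm.conf could be read and passed to the same function. This only detects the symptom; the fix is still to assign a gpu GRES group to nodes as described above.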
On 11/27/24 12:30, ulgs_mrq via Qlustar General wrote:
[...]