Hi,
Finally I got it working with a local gres.conf file on those nodes, containing:
Name=gpu Type=tesla-v100-sxm2-32gb File=/dev/nvidia0
Name=gpu Type=tesla-v100-sxm2-32gb File=/dev/nvidia1
Name=gpu Type=tesla-v100-sxm2-32gb File=/dev/nvidia2
Name=gpu Type=tesla-v100-sxm2-32gb File=/dev/nvidia3
and by adding the Gres=gpu:4 parameter to the specific Node Groups; the resulting slurm.conf entry is:
NodeName=node-[40-41] CoresPerSocket=16 Gres=gpu:4 MemSpecLimit=1024 RealMemory=1036547 Sockets=2 ThreadsPerCore=4
On those nodes I'm automounting /etc/qlustar/common and reusing most of the Slurm configuration via symlinks.
I can confirm with nvidia-smi that access to the NVIDIA cards works (limited by Slurm).
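For reference, a minimal job script to check that Slurm hands out the GPUs correctly might look like this (a sketch only; the GRES type string matches the gres.conf above, but partition names and your site's defaults are assumptions to adjust):

```shell
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --gres=gpu:tesla-v100-sxm2-32gb:2   # request 2 of the node's 4 V100s by GRES type

# Slurm exports the devices it allocated to the job
echo "CUDA_VISIBLE_DEVICES=$CUDA_VISIBLE_DEVICES"

# with cgroup device constraints, this lists only the allocated GPUs
nvidia-smi -L
```

If ConstrainDevices=yes is set in cgroup.conf, nvidia-smi inside the job sees only the allocated devices, which is presumably the "limited by slurm" behaviour described above.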
Thanks,
Rolandas
On 28/05/2021 13:36, Roland Fehrenbacher wrote:
"R" == rolnas rolnas@gmail.com writes:
Hi Rolandas
R> On 25/05/2021 11:18, Ansgar Esztermann-Kirchner wrote:
>> Hi Rolandas,
>>
>>> In my cluster I have two exotic IBM Power servers with NVIDIA
>>> cards. I succeeded in installing Ubuntu 18.04 and getting Slurm
>>> working with the Qlustar 11 server, but I cannot find any way to
>>> add the NVIDIA cards manually as resources to the global Slurm
>>> config (gres.conf is managed by Qlustar, and there is no way to
>>> add anything manually).
>>
>> Is there any particular reason you wish to configure this
>> manually? We have a variety of NVidia cards here, and GRES
>> configuration via Qlustar works quite well.

R> On x86_64 nodes managed by Qlustar they work as well, but those
R> IBM Power nodes are not supported by Qlustar because of the
R> different CPU architecture.
QluMan doesn't care that your nodes are IBM Power machines that cannot boot Qlustar images. Just register them as you would x86 servers and do the Slurm config for them. You cannot use the GPU wizard, but other than that everything will work. The Slurm config is written out as flat files on /etc/qlustar/common (NFS), so if you mount that on your Power nodes, you should have the correct config.
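To mount that config tree on the Power nodes, an /etc/fstab entry along these lines should work (a sketch; `headnode` is a placeholder for the actual head-node hostname exporting the share):

```
# NFS-mount the Qlustar-managed config tree read-only on the Power nodes
headnode:/etc/qlustar/common  /etc/qlustar/common  nfs  ro,defaults  0  0
```

followed by symlinking e.g. /etc/slurm/slurm.conf to the mounted copy, as Rolandas describes above.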
Concerning Ansgar's question: Slurm management can be disabled either completely, by uninstalling the Slurm packages on the head-nodes, or manually, by not defining any Slurm configs in QluMan (and never writing out the Slurm files). Currently, these are the only options.
Best,
Roland

_______________________________________________
Qlustar-General mailing list -- qlustar-general@qlustar.org
To unsubscribe send an email to qlustar-general-leave@qlustar.org