Hi Nicolai,

Because systemd now also uses persistent interface names for IB adapters, the installer often doesn't pick up the new name correctly. You should definitely use ibo49d1 in /etc/network/interfaces and additionally add a line like the one below to the ibo49d1 stanza there:
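
As a rough sketch, assuming a static IPv4 setup, the complete stanza might then look something like this. The address and netmask are placeholders, not values from this cluster, and any other options already present in the existing ib0 stanza should be carried over:

```
# Sketch only: address and netmask below are placeholder assumptions.
# The pre-up line loads the IPoIB module before the interface is configured.
auto ibo49d1
iface ibo49d1 inet static
    address 192.0.2.10
    netmask 255.255.255.0
    pre-up /sbin/modprobe ib_ipoib
```

The relevant addition is the pre-up line: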

pre-up /sbin/modprobe ib_ipoib

Hope this helps,

Roland

On 1/30/24 09:50, wo.nicolai@vuykrotterdam.com wrote:
Hi all,

I've started with a new cluster using Qlustar 13.1 (fresh install). 
The hardware setup currently consists of 3 HP DL380 machines. All hardware boxes have an InfiniBand interface and are connected to the same InfiniBand switch: 1 headnode (with openSM during installation), FE-login as VM, 2 compute nodes.

I noticed that on the head node the InfiniBand interface is down.
Upon investigating I discovered the following:
In the Qlustar management interface, the network config for the headnode shows ib0.
On the command line of the headnode, "ip address" does not show any ib0 interface. I do see ibo49d1 (letter o, not zero):
"9: ibo49d1: <BROADCAST,MULTICAST> mtu 2044 qdisc fq_codel state DOWN group default qlen 256
    link/infiniband 80:00:02:25:fe:80:00:00:00:00:00:00:04:09:73:ff:ff:e4:f4:22 brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff
    altname ibp4s0d1"
    
In /etc/network/interfaces, interface ib0 is mentioned.

When I update the network config for the headnode in the Qlustar management interface from ib0 to ibo49d1 and reboot afterwards, the interface remains down without any IP address. I also noticed that for such a change in the Qlustar management interface, I do not need to write files?
In /etc/network/interfaces, ib0 is still mentioned, while I expected it to be updated to ibo49d1. Manually changing ib0 to ibo49d1 in /etc/network/interfaces did not work either.



The infiniband port is active:
ibv_devinfo -d mlx4_0
hca_id: mlx4_0
        transport:                      InfiniBand (0)
        fw_ver:                         2.42.5700
        node_guid:                      0409:73ff:ffe4:f420
        sys_image_guid:                 0409:73ff:ffe4:f423
        vendor_id:                      0x02c9
        vendor_part_id:                 4103
        hw_ver:                         0x0
        board_id:                       HP_1380110017
        phys_port_cnt:                  2
                port:   1
                        state:                  PORT_DOWN (1)
                        max_mtu:                4096 (5)
                        active_mtu:             1024 (3)
                        sm_lid:                 0
                        port_lid:               0
                        port_lmc:               0x00
                        link_layer:             Ethernet

                port:   2
                        state:                  PORT_ACTIVE (4)
                        max_mtu:                4096 (5)
                        active_mtu:             4096 (5)
                        sm_lid:                 1
                        port_lid:               1
                        port_lmc:               0x00
                        link_layer:             InfiniBand

Using ibnetdiscover on the headnode, both compute nodes and the switch are listed.

Obviously I'm missing something. How do I get IPoIB up and running on the headnode?

The above is for the head node only. The compute nodes do have an interface ib0 and an IP address for this interface.