Hi,
Recently I found that our login nodes are being abused by users running
heavy Node.js sessions (from Visual Studio Code). I set up some memory
limits for users and the OOM killer is triggered, but it cannot kill the
users' processes, because all of them have oom_score_adj=-1000. The value
is inherited from the parent sshd server, which also has
oom_score_adj=-1000.
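
To see which processes are affected, something like the following could be used. It is only a minimal sketch: the function name `find_oom_exempt` and its parameters are made up for illustration, and it assumes the usual Linux `/proc/<pid>/oom_score_adj` and `/proc/<pid>/comm` files are readable.

```python
from pathlib import Path


def find_oom_exempt(proc_root="/proc", threshold=-1000):
    """Return sorted (pid, comm, oom_score_adj) tuples for every process
    whose oom_score_adj is at or below `threshold`.

    `proc_root` is a parameter only so the function can be tried against
    a fake directory tree; on a real node it stays at /proc.
    """
    results = []
    for entry in Path(proc_root).iterdir():
        if not entry.name.isdigit():
            continue  # skip non-process entries such as "self" or "sys"
        try:
            adj = int((entry / "oom_score_adj").read_text().strip())
            comm = (entry / "comm").read_text().strip()
        except (OSError, ValueError):
            continue  # process exited meanwhile, or file not readable
        if adj <= threshold:
            results.append((int(entry.name), comm, adj))
    return sorted(results)


if __name__ == "__main__":
    for pid, comm, adj in find_oom_exempt():
        print(f"{pid:>8} {adj:>6} {comm}")
```

On an affected login node, the expectation would be that sshd and the user processes started under it show up in this list.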
As a workaround, I stop the traditional ssh server and start
ssh.socket at boot (via an rc.boot script). Now oom_score_adj=0, as it
should be. This also matters for compute nodes with ssh
connections enabled.
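
Sessions that were started before the switch keep the value they inherited, so they may need to be fixed separately. The sketch below is not part of the workaround above, just one possible way to do it, and it would need to run as root. The function name `reset_oom_score_adj` is hypothetical, and using the owner of the `/proc/<pid>` directory as the process owner is an approximation.

```python
from pathlib import Path


def reset_oom_score_adj(uid, proc_root="/proc", new_value=0):
    """Raise oom_score_adj to `new_value` for processes owned by `uid`
    whose current value is lower. Returns the sorted list of changed PIDs.

    Assumption: the owner of the /proc/<pid> directory is taken as the
    process owner, which is only an approximation of the real credentials.
    """
    changed = []
    for entry in Path(proc_root).iterdir():
        if not entry.name.isdigit():
            continue  # not a process directory
        adj_file = entry / "oom_score_adj"
        try:
            if entry.stat().st_uid != uid:
                continue  # belongs to another user
            if int(adj_file.read_text().strip()) >= new_value:
                continue  # already killable enough, leave it alone
            adj_file.write_text(f"{new_value}\n")
        except (OSError, ValueError):
            continue  # process vanished or permission denied
        changed.append(int(entry.name))
    return sorted(changed)
```

For example, `reset_oom_score_adj(1234)` would touch only the processes of the (made-up) UID 1234 and leave sshd itself and other system daemons unchanged.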
Regards
Rolandas
P.S. We are still on Qlustar 13, so I don't know the situation on Qlustar
14, but it could be different because ssh connections in Ubuntu 24.04 LTS
are already activated via the ssh socket.