We are happy to announce a few new Qlustar features that we completed
recently and published together with the latest security updates.
* QluMan is now capable of managing BeeGFS and Lustre mounts [1]. If you use
QluMan singularity images, make sure you download the latest version [2].
* The tool qlustar-image-edit now allows for simple static OS image
customization [3].
* OpenMPI 3.1.4 packages were added to Qlustar 11.0 in addition to the 4.0.x
packages already available. It turned out that a number of MPI codes
failed to compile with the 4.0.x version.
In the future, we plan to introduce such non-disruptive features at
irregular intervals, decoupled from the usual Qlustar releases.
[1] https://docs.qlustar.com/en-US/Qlustar_Cluster_OS/11.0/html-single/QluMan_G…
[2] https://qlustar.com/download
[3] https://docs.qlustar.com/en-US/Qlustar_Cluster_OS/11.0/html-single/Administ…
The Qlustar releases 11.0.0.1-b514f1258 and 10.1.1.5-b509f1256 are ready,
including a number of security and bug fixes as well as a couple of
enhancements. Please check the following web pages for details about
security fixes:
https://qlustar.com/news/qsa-0624191-linux-kernel-vulnerabilities
https://qlustar.com/news/qsa-0624192-security-update-bundle
The following non-security related enhancements/bug fixes are included:
* Fix permissions of certain image base directories. Previously, any
user could fill a couple of tmpfs filesystems, thereby blocking some
of the main memory of compute nodes.
* The nvidia image module is now based on driver version 430.26 supporting the
newest GPUs (Qlustar 11.0 only).
* QluMan fixes:
- Fix rights management for QluMan users
- Prevent the main head-node from being deleted (Qlustar 11.0 only).
- Fix exceptions of the slurm job window that occurred in certain situations.
The Qlustar team is pleased to announce the immediate availability of
Qlustar 11.0 for download at https://qlustar.com/download.
It updates Qlustar's core platform to the current Ubuntu 18.04 LTS. The
CentOS [https://www.centos.org/] edge platform is now based on 7.6 with
full integration of the just-released OpenHPC 1.3.8
[https://openhpc.community/].
As a result of our continuous platform optimization/simplification
process, we moved to dnsmasq [http://dnsmasq.org/] as a replacement for
the previously used ISC DHCP and atftp TFTP servers. dnsmasq also
provides cluster-internal name services (DNS), replacing the NIS hosts
map, and acts as a DNS proxy.
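To illustrate how a single dnsmasq instance can cover DHCP, TFTP and DNS at once, here is a minimal sketch that renders a small dnsmasq configuration. It is not the configuration QluMan generates. The function name, interface, domain, address range, paths and boot file name are all hypothetical; only the dnsmasq option names themselves (interface, domain, expand-hosts, dhcp-range, enable-tftp, tftp-root, dhcp-boot, dhcp-host) are standard.

```python
from typing import Iterable, Tuple


def render_dnsmasq_conf(
    interface: str,
    domain: str,
    dhcp_range: Tuple[str, str],
    tftp_root: str,
    boot_file: str,
    hosts: Iterable[Tuple[str, str, str]],
) -> str:
    """Return dnsmasq config text covering DHCP, TFTP and DNS.

    hosts is an iterable of (mac, hostname, ip) triples. All values are
    illustrative assumptions, not Qlustar defaults.
    """
    start, end = dhcp_range
    lines = [
        f"interface={interface}",            # serve only the cluster network
        f"domain={domain}",                  # DNS domain handed out via DHCP
        "expand-hosts",                      # append the domain to plain host names
        f"dhcp-range={start},{end},12h",     # dynamic pool with a 12 hour lease
        "enable-tftp",                       # built-in TFTP server
        f"tftp-root={tftp_root}",
        f"dhcp-boot={boot_file}",            # file offered to netbooting nodes
    ]
    # One static entry per node: MAC address, host name and fixed IP.
    for mac, hostname, ip in hosts:
        lines.append(f"dhcp-host={mac.lower()},{hostname},{ip}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_dnsmasq_conf(
        "eth0", "cluster.example", ("10.0.0.100", "10.0.0.200"),
        "/srv/tftp", "pxelinux.0",
        [("AA:BB:CC:00:11:22", "node01", "10.0.0.11")],
    ))
```

On an actual Qlustar head-node the dnsmasq configuration is managed through QluMan, so a hand-written file like this would only be relevant for experiments outside the cluster.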
In addition to the dnsmasq management interface, the second major
new feature of QluMan is the ability to manage network filesystem
resources. Initially, this supports NFS mounts, including RDMA
connections and a mechanism to automatically choose the optimal network
path to the NFS server. Mount resources are implemented as systemd
automount units. This new interface replaces the previously used
automount daemon, which is now disabled by default.
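For readers unfamiliar with systemd automount units, the following sketch renders a .mount/.automount pair for one NFS share. It is an illustration under assumptions, not QluMan's actual output: the function names, server, export and mount point are hypothetical, the unit-name escaping is a simplified version of what systemd-escape --path does, and the automatic network path selection mentioned above is not modeled. The RDMA variant uses the common NFS mount options proto=rdma,port=20049.

```python
from typing import Dict


def unit_name_from_path(path: str) -> str:
    """Simplified systemd path escaping: '/' becomes '-', other special
    characters become \\xNN (duplicate slashes are not collapsed here)."""
    trimmed = path.strip("/")
    if not trimmed:
        return "-"  # the root directory maps to the unit name "-"
    out = []
    for i, ch in enumerate(trimmed):
        if ch == "/":
            out.append("-")
        elif (ch.isascii() and ch.isalnum()) or ch in ":_" or (ch == "." and i > 0):
            out.append(ch)
        else:
            out.append("".join(f"\\x{b:02x}" for b in ch.encode("utf-8")))
    return "".join(out)


def render_nfs_automount(server: str, export: str, mountpoint: str,
                         rdma: bool = False) -> Dict[str, str]:
    """Return {unit file name: unit file content} for one NFS share."""
    name = unit_name_from_path(mountpoint)
    # Assumption: RDMA transport on the conventional NFS/RDMA port.
    options = "proto=rdma,port=20049" if rdma else "proto=tcp"
    mount_unit = (
        "[Unit]\n"
        f"Description=NFS mount of {server}:{export}\n"
        "\n"
        "[Mount]\n"
        f"What={server}:{export}\n"
        f"Where={mountpoint}\n"
        "Type=nfs\n"
        f"Options={options}\n"
    )
    automount_unit = (
        "[Unit]\n"
        f"Description=Automount for {mountpoint}\n"
        "\n"
        "[Automount]\n"
        f"Where={mountpoint}\n"
        "TimeoutIdleSec=600\n"  # unmount again after 10 idle minutes
        "\n"
        "[Install]\n"
        "WantedBy=multi-user.target\n"
    )
    return {f"{name}.mount": mount_unit, f"{name}.automount": automount_unit}


if __name__ == "__main__":
    for fname, text in render_nfs_automount(
            "nfs1", "/export/home", "/data/home", rdma=True).items():
        print(f"# {fname}\n{text}")
```

With such a pair in place, only the .automount unit is enabled; systemd then mounts the share on first access to the mount point.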
Highlights among the various major component updates include Kernel
4.19.x, Slurm 18.08.x, CUDA 10.1, OpenMPI 4.0.1 and BeeGFS 7.1.3. Please
read the release notes for more details
[https://docs.qlustar.com/en-US/Qlustar_Cluster_OS/11.0/html-single/Release_…].
We are organizing a Birds of a Feather session [1] at this year's ISC
(Wednesday, June 19th, 2:45pm - 3:45pm, Room Kontrast). There will be two
presentations: one by Qlustar founder Roland Fehrenbacher and a second
one by Ansgar Esztermann from the Max-Planck Institute in Göttingen. The
goal of the BoF is to bring together developers, HPC cluster admins and
hardware vendors to identify the most pressing issues to further enhance
Qlustar's suitability as an open-source full stack HPC management
solution. So if you're in Frankfurt next week, take the chance to join
us.
[1] https://2019.isc-program.com/?post_type=page&p=11&id=bof125&sess=sess237
The Qlustar release 10.1.1.4-b509f1240 is ready, including a
number of security fixes and a few bug fixes. Please
check the following web pages for details about security fixes:
https://qlustar.com/news/qsa-0510191-linux-kernel-vulnerabilities
https://qlustar.com/news/qsa-0510192-security-update-bundle
The following non-security related bug fixes are included:
* A kernel bug in the fuse driver could lead to the spontaneous unmounting of
filesystems under rare conditions due to incorrect handling of error
conditions. This was fixed by the Q-Leap kernel team.
* Support for the Intel/QLogic Infinipath adapters has been re-added after it
was dropped by mistake in a previous release.
The Qlustar release 10.1.1.3-b509f1235 is ready, including a
number of security and bug fixes as well as a couple of enhancements. Please
check the following web pages for details about security fixes:
https://qlustar.com/news/qsa-0211191-linux-kernel-vulnerabilities
https://qlustar.com/news/qsa-0211192-security-update-bundle
The following non-security related enhancements/bug fixes are included:
* The Qlustar netboot process has been further improved: We now ship a dedicated
C++ based multicast-enabled image pusher 'ql-mcastd' to run on the
head-node. This fault-tolerant daemon delivers the assigned OS image
of a node to the corresponding client processes running in the further
stripped down initial RAM disk (now a mere 25MB).
There is a new config file /etc/qlustar/ql-mcastd.conf for this
daemon. It is generated automatically whenever the DHCP config
is written.
* The nvidia image module is now based on driver version 410.104 supporting the
newest GPUs.
* Qlustar installer 10.1.1-2 was updated to run on the same kernel version as
the installed system.
* QluMan can now mass import MAC addresses for new hosts from a file. This
makes it possible to quickly set up large numbers of compute nodes.
* QluMan fixes:
- Fix exception when opening Preferences
- Don't list initramfs as possible module
- HostTemplatesWidget: Don't allow assigning global sets to a host
template
- EnclosureView: catch KeyError when deleting from Props2Hosts
- EnclosureView: Include global template when checking host hardware
status
- CommandEditor: fix exception when moving command to group
- GPU Wizard: Fix open slurm config button
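Before mass importing MAC addresses as described above, it can help to sanity-check the input file. The sketch below does this for a purely hypothetical file layout (one MAC address per line, with blank lines and # comments ignored); the layout QluMan actually expects is not specified here, and the function name is made up for illustration.

```python
import re
from typing import Iterable, List

# Six hex octets joined by one consistent separator (':' or '-').
_MAC_RE = re.compile(
    r"^[0-9a-f]{2}([:-])[0-9a-f]{2}(\1[0-9a-f]{2}){4}$", re.IGNORECASE
)


def parse_mac_list(lines: Iterable[str]) -> List[str]:
    """Validate and normalize MAC addresses, one per line.

    Returns lower-case, colon-separated addresses in input order and
    raises ValueError on malformed or duplicate entries.
    """
    macs: List[str] = []
    seen = set()
    for lineno, raw in enumerate(lines, start=1):
        entry = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not entry:
            continue
        if not _MAC_RE.match(entry):
            raise ValueError(f"line {lineno}: not a MAC address: {entry!r}")
        mac = entry.lower().replace("-", ":")
        if mac in seen:
            raise ValueError(f"line {lineno}: duplicate MAC address {mac}")
        seen.add(mac)
        macs.append(mac)
    return macs


if __name__ == "__main__":
    sample = ["# rack 1", "AA:BB:CC:00:11:22", "", "aa-bb-cc-00-11-23  # node02"]
    print(parse_mac_list(sample))
```

Rejecting duplicates up front avoids assigning two hosts the same hardware address, which would otherwise only surface at boot time.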
The Qlustar release 10.1.1.2-b505f1215 is ready, including a
number of security and bug fixes as well as minor enhancements. Please
check the following web pages for details about security fixes:
https://qlustar.com/news/qsa-0114191-linux-kernel-vulnerabilities
https://qlustar.com/news/qsa-0114192-security-update-bundle
The following non-security related changes/bug fixes are included:
* Change mechanism to disable SMT (HyperThreading): Now the only supported
method is to assign the kernel parameter 'nosmt=force' in boot
configs. The previous method proved to be unstable in certain
situations due to kernel bugs.
* Update Lustre client to version 2.12.0.
* Added lustre-2.12 image module.
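After assigning 'nosmt=force' in a boot config as described above, a node can be checked from user space. The sketch below is a generic Linux check, not a Qlustar tool: it looks for the parameter on the kernel command line and, on kernels that expose it, reads the SMT state from /sys/devices/system/cpu/smt/control. The helper names are hypothetical.

```python
from pathlib import Path


def has_nosmt_force(cmdline: str) -> bool:
    """True if 'nosmt=force' appears as a separate kernel parameter."""
    return "nosmt=force" in cmdline.split()


def smt_is_off(control_value: str) -> bool:
    """Interpret the content of the sysfs SMT control file.

    'off' and 'forceoff' mean the kernel has disabled SMT; any other
    value ('on', 'notsupported', ...) is not treated as disabled here.
    """
    return control_value.strip() in {"off", "forceoff"}


if __name__ == "__main__":
    cmdline = Path("/proc/cmdline").read_text()
    print("nosmt=force on command line:", has_nosmt_force(cmdline))

    control = Path("/sys/devices/system/cpu/smt/control")
    if control.exists():
        print("SMT disabled by kernel:", smt_is_off(control.read_text()))
    else:
        print("This kernel does not expose the SMT control file.")
```

Checking both places is deliberate: the command line shows what was requested in the boot config, while the sysfs file shows what the running kernel actually did.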
The Qlustar release 10.1.1.1-b505f1206 is ready, including a
number of security and bug fixes. Please check the following web pages
for details:
https://qlustar.com/news/qsa-1212181-linux-kernel-vulnerabilities
https://qlustar.com/news/qsa-1212182-security-update-bundle
The following non-security related changes/bug fixes are included:
* Speed up boot process of cluster nodes
* Allow SMT (HyperThreading) to be disabled safely
* Update Lustre client to version 2.12 rc2. Lustre 2.11 could cause a silent
crash of Lustre client machines under heavy load (Ubuntu only).
The Qlustar team is pleased to announce the immediate availability of
Qlustar 10.1.1 for download at https://qlustar.com/download. This is a
minor release within the 10.1 series. It integrates the recently
published OpenHPC [https://openhpc.community] 1.3.6 release into the
Qlustar CentOS platform. OpenHPC 1.3.6 ships with a gcc 8 toolchain
among other goodies.
A second major improvement is the further tuning of the Qlustar node
netboot process, which cuts the size of the initial RAM disk delivered
via tftp to a mere 35MB. This not only speeds up the boot process in
standard clusters, but also makes it feasible to netboot over slow
and less reliable networks, e.g. a connection to a spaceship.
Apart from that, this release comes with a number of bug and security
fixes which were already addressed in separate security advisories as
usual at
https://www.qlustar.com/news/qsa-1129181-linux-kernel-vulnerabilities
https://www.qlustar.com/news/qsa-1129182-security-update-bundle
The Qlustar team is pleased to announce the immediate availability of
Qlustar 10.1 for download at https://qlustar.com/download.
Adding support for CentOS 7.5 [https://www.centos.org/] as a new fully
supported edge platform, combined with the integration of OpenHPC
[https://openhpc.community], this release marks an important milestone
in Qlustar's history. Qlustar users now have the option to run the most
popular HPC platform with the same ease-of-use they're familiar with
from our Debian/Ubuntu based platform.
The second big new feature of 10.1 is a totally revamped boot
process of Qlustar nodes: QluMan 10.1 now supports downloading the OS
image via a fast and fault-tolerant multicast mechanism. This can
reduce boot time dramatically and allows for simultaneous and reliable
booting of a virtually unlimited number of cluster nodes without
increasing overall boot time.
On top of that, Qlustar OS images are now created using squashfs with
compression. This reduces the memory footprint of the OS by roughly 66%
so that a standard compute node image with slurm and IB support consumes
a mere 160MB of RAM.
Qlustar 10.1 is a feature release with its core still based on
Ubuntu/Xenial (16.04). Highlights among the various major component
updates include Kernel 4.14.x, Slurm 17.11.9.2, CUDA 9.1 and Lustre
2.11. Please read the release notes at
https://docs.qlustar.com/en-US/Qlustar_Cluster_OS/10.1/html-single/Release_…
for more details.