Hi,
I am testing Singularity and ran into the problem that /var/lib is not synchronized with the chroot environment on the node. I am now looking into the unionfs-fuse setup in the init.qlustar script, but I cannot find any problem there. Is there a configuration file somewhere for the chroot environment?
Best regards,
Kwinten
"K" == Kwinten Nelissen kwinten.nelissen@gmail.com writes:
Hi Kwinten,
K> Hi, I am testing singularity and I encountered the problem that
K> /var/lib is not synchronized with the chroot environment at the
K> node. I am looking now into the unionfs -fuse of the
K> init.qlustar script but I can not find any problem here. Is
K> there a configuration file somewhere for the chroot environment?
there is no synchronization, but there are layered mounts. The full unionfs chroot starting from / is mounted underneath the image so it includes /var/lib as well. The problem must be something else.
Please describe your problem in detail otherwise we can't help.
Best,
Roland
Hi Roland,
The problem seems to be that SYS/var is mounted on /var after unionfs is initialized.

# mount
tmpfs on /run type tmpfs (rw,relatime)
tmpfs on /union/rw type tmpfs (rw,relatime,size=1048576k)
devtmpfs on /dev type devtmpfs (rw,relatime,size=32794936k,nr_inodes=8198734,mode=755)
unionfs on / type fuse.unionfs (rw,nodev,relatime,user_id=0,group_id=0,default_permissions,allow_other)
/dev/ram0 on /union/image type squashfs (ro,relatime)
/dev/ram0 on /union/rw/union/image type squashfs (ro,relatime)
SYS/var on /var type zfs (rw,relatime,xattr,noacl)

I think the problem lies in a misconfiguration of unionfs when the following configuration file is loaded:
cat /etc/qlustar/disk-config
# ZFS config for 2 disks (/dev/sda - /dev/sdb) as stripe (RAID 0):
# Zpool name: SYS
# 16GB zvol for swap (not activated)
# Filesystems: /var (max 2GB) + /scratch - both compressed

[BASE]
ZPOOLS = SYS
ZFS = var, scratch
ARC_LIMIT = 1024
#ZVOLS = swap

[SYS]
vdevs = V-SYS

[V-SYS]
devs = /dev/sd[ab]
type =

[swap]
zpool = SYS
size = 16G

[var]
zpool = SYS
quota = 2G
reservation = 2G
compress = lz4

[scratch]
zpool = SYS
compress = lz4
So it seems that the unionfs /var is overwritten by SYS/var.
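A quick way to confirm which filesystem is actually visible at /var is to take the last entry for that mountpoint in the mount output, since the most recent mount is the one on top. The helper below is only a sketch: the name top_mount is made up for illustration, and it assumes the usual "SRC on DIR type FSTYPE (OPTS)" line format with no spaces in the source or mountpoint.

```shell
#!/bin/sh
# top_mount MOUNTPOINT (hypothetical helper, not part of Qlustar)
# Reads `mount`-style lines ("SRC on DIR type FSTYPE (OPTS)") on stdin and
# prints "SRC FSTYPE" for the last entry whose DIR equals MOUNTPOINT.
# Exits non-zero if nothing is mounted exactly at MOUNTPOINT.
top_mount() {
    awk -v mp="$1" '
        $2 == "on" && $3 == mp && $4 == "type" { hit = $1 " " $5 }
        END { if (hit != "") print hit; else exit 1 }
    '
}
```

Run as `mount | top_mount /var`. With the mount table above it reports SYS/var with type zfs rather than the unionfs root, which matches what I am seeing.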
Best,
Kwinten
"K" == Kwinten Nelissen kwinten.nelissen@gmail.com writes:
Hi Kwinten,
we need a precise description of how to reproduce the problem. Below you only describe the configuration of the unionfs layers, which is correct.
Thanks,
Roland
Hi Roland,
There are no errors. To reproduce the problem, one just has to add a compute node with a local disk to the cluster. If I boot the compute node diskless, there is no problem.
Best,
Kwinten
"K" == Kwinten Nelissen kwinten.nelissen@gmail.com writes:
K> Hi Roland, There are no errors. To reproduce the problem one has
K> just to add a compute node with a local disk to the cluster. If
K> I boot the compute node diskless there is no problem.
I need to know what you are trying to do with singularity (please describe commands in detail) and why you think there is a problem with /var/lib when you have integrated a local disk.
Hi Roland,
I installed Singularity in the chroot-xenial environment. During the installation, a directory named singularity is created in /var/lib, which is required to run any image as a user.

I started to debug the problem, and it turned out that the /var directory from the chroot environment is not merged with /var from the image file. However, when I boot a compute node diskless, everything works fine.

That's why I came to the conclusion that something must be going wrong in /lib/qlustar/disk-auto-setup:
init_var() {
  local var_device=$1 fstype_opt=""

  if ! $diskless; then
    grep -q $var_device /etc/fstab && \
      fstype_opt="-t $(awk '$2 == "/var" {print $3}' /etc/fstab)"
    if ! mount $fstype_opt $var_device /mnt ; then
      echo "Unable to mount /mnt on disk partition $var_device" >&2
      remove_start_links
      return 1
    fi

    # Copy ramdisk var directory on new var filesystem
    if $force_var_copy ; then
      cp -p -R /var/* /mnt
    else
      # Make sure that certain files of /var in the image
      # are always copied to the /var on disk
      # var device is now mounted on /mnt
      contents_file=/etc/qlustar/contents
      handle_contents ${contents_file} /var /mnt
      [ -d /var/yp ] && cp -a /var/yp /mnt
    fi
My guess is that only the ramdisk /var is copied here, which is normal. However, I don't see unionfs being initialized again for /var afterwards.

As a result, I get the following when I try to run singularity:

$ singularity shell container.img
ERROR  : Failed to resolve path to /var/lib/singularity/mnt/container: No such file or directory
ABORT  : Retval = 255
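To narrow down where the directory gets lost, one can check which layer roots still contain it. This is a rough sketch only: layers_with is a made-up helper name, and the layer roots are whatever you pass in. /union/image and /union/rw come from my mount output; I do not know where the chroot layer is mounted on the node, so that path would have to be supplied by whoever knows it.

```shell
#!/bin/sh
# layers_with PATH LAYER_ROOT... (hypothetical helper, not part of Qlustar)
# Prints every LAYER_ROOT under which PATH exists, one per line.
# PATH is given as an absolute path and is looked up relative to each root.
layers_with() {
    rel=${1#/}          # strip the leading slash so it can be appended to a root
    shift
    for root in "$@"; do
        [ -e "$root/$rel" ] && printf '%s\n' "$root"
    done
    return 0
}
```

For example, `layers_with /var/lib/singularity / /union/image /union/rw` lists the roots where the directory is present. If the merged root / is missing from that list while some other layer shows it, the directory exists in a layer but is hidden at /var.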
"K" == Kwinten Nelissen kwinten.nelissen@gmail.com writes:
Hi Kwinten,
we got it. Already fixed here. A disk's var FS was indeed mounted at the wrong position in the unionfs layer. New images will be available asap (I'll post here). Once available, running
$ apt update
$ apt dist-upgrade
on the head-node and rebooting the nodes in question should fix the problem.
Best,
Roland