2019-09-11 TSD advanced user meeting
Minutes, with current status as of 10 Oct 2019.

1. Gard informed on status: the contract for the new storage (IBM Spectrum Scale, 5.2 PiB) will be signed one of these days. More information will come once it is public.
2. New 10 GbE export machines are racked and will be set up as soon as possible for export from /cluster to backup and from /cluster to VMs. This should remove the bottleneck when moving data from /cluster to /data/durable.
Status: DONE!
3. Linux machines should be replaced with RHEL 7 as soon as possible. p33 has two RHEL 6 machines with lots of memory that are unused due to missing /cluster access and software incompatibility. In this regard, p33, p22 and p19 are eager to test RHEL 7 + Horizon View.
Status: p33 in the making, the rest will follow.
4. Singularity v3 is needed on all VMs, not only the submit hosts. Singularity v3 may have issues on RHEL 6, so we need to move on with RHEL 7 as the standard Linux platform.
Azab to fix; if needed, new VMs first. Make the new VMs, then delete the old ones; this is OK with users.
Status: Believed to be in order now? Comments?
5. GPU nodes for p23: is it possible to make them available in the general p23 queue without "giving it all away to non-GPU users"?
NB: There are GPUs (V100) for general use in Colossus as well; please try them, and if DeepVariant is not installed, tell us.
See emails from BHM.
6. Dragen: issue with missing R packages for the CNV calling, and an issue on $SCRATCH. Azab to fix the packages and to investigate $SCRATCH vs. IB $SCRATCH.
Status: Unknown, checking. Last check was waiting for information from the vendor.
7. Issue of large incoming files arriving over a slow (but steady) network from the US (UK Biobank); p33 and Leon must look at this.
Status: Fixed.
8. There is a need for a simpler system towards Sigma2.
Status: Working on it.
9. A plan is needed for the disk-upgrade procedure.
Status: Working on it.
10. We will look into moving data for p22 from durable2 to durable3 and adjusting backup accordingly. Please give us the exact path of what is to be moved and where it is supposed to end up.
Status: Cancelled.
General recommendation :
For long-running jobs on VMs it is strongly recommended to run a terminal multiplexer such as screen or tmux, and to run it on the same system where the job is running, so that the job is not vulnerable to network issues or instability.
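A minimal sketch of this recommendation, assuming tmux is installed on the VM. The session name "longjob" is a placeholder, and a short sleep stands in for a real job:

```shell
# Keep a long-running job alive across SSH disconnects using tmux.
if command -v tmux >/dev/null 2>&1; then
    # Start the job inside a detached tmux session on the VM itself
    tmux new-session -d -s longjob 'sleep 300'
    tmux has-session -t longjob && echo "longjob is running"
    # If the SSH connection drops, the session (and the job) survives;
    # reattach later from a new login with:  tmux attach -t longjob
    tmux kill-session -t longjob   # cleanup, for this demo only
else
    echo "tmux not installed"
fi
```

screen works the same way in principle: `screen -S longjob` to start, Ctrl-a d to detach, `screen -r longjob` to reattach.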
Published 10 Oct 2019 11:37 - Last modified 10 Oct 2019 11:37