From 129520396d521185ce0d26de3b76ee8f83ac6331 Mon Sep 17 00:00:00 2001
From: Branislav Jansik <branislav.jansik@vsb.cz>
Date: Fri, 3 May 2024 15:02:01 +0200
Subject: [PATCH] Update shell-and-data-access.md

---
 docs.it4i/general/shell-and-data-access.md | 20 ++++++++++++++++++--
 1 file changed, 18 insertions(+), 2 deletions(-)

diff --git a/docs.it4i/general/shell-and-data-access.md b/docs.it4i/general/shell-and-data-access.md
index aa7531dad..6687ba5ed 100644
--- a/docs.it4i/general/shell-and-data-access.md
+++ b/docs.it4i/general/shell-and-data-access.md
@@ -172,8 +172,10 @@ local $ rsync -r my-local-dir username@cluster-name.it4i.cz:directory
 ### Parallel Transfer
 
 !!! note
-    The data transfer speed is limited by the single-core ssh encryption speed to about **250 MB/s** (750 MB/s in case of aes256-gcm@openssh.com cipher)
-    Run **multiple** instances for unlimited transfers
+    The data transfer speed is limited by the single TCP stream and single-core ssh encryption speed to about **250 MB/s** (750 MB/s in case of aes256-gcm@openssh.com cipher)
+    Run **multiple** streams for unlimited transfers
+
+#### Many Files
 
 Parallel execution of multiple rsync processes utilizes multiple cores to accelerate encryption and multiple tcp streams for enhanced bandwidth.
 First, set up ssh-agent single sign on:
@@ -194,6 +196,20 @@ local $ ls | xargs -n 2 -P 4 /bin/bash -c 'rsync "$@" username@cluster-name.it4i
 The **-n** argument detemines the number of files to transfer in one rsync call. Set according to file size and count (large for many small files).
 The **-P** argument determines number of parallel rsync processes. Set to number of cores on your local machine.
 
+Alternatively, use HyperQueue. First get HyperQueue binary, then run:
+
+```console
+local $ hq server start &
+local $ hq worker start &
+local $ find my-local-dir -type f | xargs -n 2 > jobfile
+local $ hq submit --log=/dev/null --progress --array --each-line jobfile \
+    bash -c 'rsync -R $HQ_ENTRY username@cluster-name.it4i.cz:mydir'
+```
+
+#### Single Very Large File
+
+### Data Transfer From Windows Clients
+
 On Windows, use the [WinSCP client][c] to transfer data. The [win-sshfs client][d] provides a way to mount the cluster filesystems directly as an external disc.
 
 ## Connection Restrictions
-- 
GitLab