diff --git a/docs.it4i/general/shell-and-data-access.md b/docs.it4i/general/shell-and-data-access.md
index 6687ba5edc8dd33d55688e7c8367c0206e70b373..69db41f6f5595d193e98a8295c5b9fe0a829f4c6 100644
--- a/docs.it4i/general/shell-and-data-access.md
+++ b/docs.it4i/general/shell-and-data-access.md
@@ -206,8 +206,64 @@ local $ hq submit --log=/dev/null --progress --array --each-line jobfile \
     bash -c 'rsync -R $HQ_ENTRY username@cluster-name.it4i.cz:mydir'
 ```
 
+Again, the **-n** argument determines the number of files to transfer in one rsync call. Set it according to file size and count (use a large value for many small files).
+
 #### Single Very Large File
 
+To transfer a single very large file efficiently, transfer many blocks of the file in parallel, using multiple cores to accelerate SSH encryption and multiple TCP streams to increase bandwidth.
+
+First, set up ssh-agent single sign-on as [described above][10].
+Second, start the HyperQueue server and HyperQueue worker:
+
+```console
+local $ hq server start &
+local $ hq worker start &
+```
+
+Once set up, run the hqtransfer script listed below:
+
+```console
+local $ ./hqtransfer mybigfile username@cluster-name.it4i.cz outputpath/outputfile
+```
+
+The hqtransfer script is listed below:
+
+```console
+#!/bin/bash
+#Read input
+if [ -z "$1" ]; then echo Usage: $0 'input_file ssh_destination [output_path/output_file]'; exit; fi
+INFILE=$1
+
+if [ -z "$2" ]; then echo Usage: $0 'input_file ssh_destination [output_path/output_file]'; exit; fi
+DEST=$2
+
+OUTFILE=$INFILE
+if [ ! -z "$3" ]; then OUTFILE=$3; fi
+
+#Calculate transfer blocks
+SIZE=$(($(stat --printf %s "$INFILE")/1024/1024/1024))
+echo Transferring $(($SIZE+1)) x 1GB blocks
+
+#Execute
+SECONDS=0
+hq submit --log=/dev/null --progress --array 0-$SIZE /bin/bash -c \
+    "dd if=$INFILE bs=1G count=1 skip=\$HQ_TASK_ID | \
+    ssh -c aes256-gcm@openssh.com $DEST \
+    dd of=$OUTFILE bs=1G conv=notrunc seek=\$HQ_TASK_ID"
+
+#Stats
+echo "Transferred: $(($SIZE+1))GB in $SECONDS s"
+echo "Transfer speed: $((($SIZE+1)/($SECONDS>0?$SECONDS:1))) GB/s"
+
+exit
+```
+
+Copy the script into a file named `hqtransfer` and make it executable:
+
+```console
+local $ chmod u+x hqtransfer
+```
+
 ### Data Transfer From Windows Clients
 
 On Windows, use the [WinSCP client][c] to transfer data. The [win-sshfs client][d] provides a way to mount the cluster filesystems directly as an external disc.
@@ -300,6 +356,7 @@ Now, configure the applications proxy settings to `localhost:6000`. Use port for
 [7]: ../general/accessing-the-clusters/graphical-user-interface/vnc.md
 [8]: ../general/accessing-the-clusters/vpn-access.md
 [9]: #port-forwarding-from-compute-nodes
+[10]:
 
 [b]: http://linux.die.net/man/1/sshfs
 [c]: http://winscp.net/eng/download.php
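
Note on the block-wise copy used by hqtransfer: each task reads one block with `dd ... skip=N` and writes it at the same offset with `dd ... conv=notrunc seek=N`, so blocks may arrive in any order. The sketch below is only a local illustration of that idea, not part of the patch. It assumes GNU coreutils, uses a hypothetical 3.5 MiB random test file with 1 MiB blocks, and leaves out ssh and HyperQueue entirely.

```shell
#!/bin/bash
# Local sketch of the block-wise copy: no ssh, no HyperQueue, 1 MiB blocks.
set -euo pipefail

WORKDIR=$(mktemp -d)
trap 'rm -rf "$WORKDIR"' EXIT
INFILE="$WORKDIR/in.bin"    # hypothetical test input
OUTFILE="$WORKDIR/out.bin"  # reassembled copy

# 3.5 MiB of random data, so the last block is only partially filled.
head -c 3670016 /dev/urandom > "$INFILE"

# Highest block index, computed as in hqtransfer (size / block size, rounded down).
BS=$((1024*1024))
LAST=$(( $(stat --printf %s "$INFILE") / BS ))

# Copy the blocks in reverse order to show that arrival order does not matter.
for ID in $(seq "$LAST" -1 0); do
    dd if="$INFILE" bs="$BS" count=1 skip="$ID" 2>/dev/null |
        dd of="$OUTFILE" bs="$BS" conv=notrunc seek="$ID" 2>/dev/null
done

# cmp is silent and returns 0 only if both files are byte-identical.
cmp "$INFILE" "$OUTFILE" && echo "blocks reassembled correctly"
```

Without `conv=notrunc`, each writing `dd` would truncate the output file at its seek offset and discard blocks already written beyond it, which is why the flag matters when blocks are written out of order.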