    # Accessing the Clusters
    
    ## Shell Access
    
    All IT4Innovations clusters are accessed by the SSH protocol via login nodes at the address **cluster-name.it4i.cz**. A specific login node may be addressed by prepending its name (`loginX`) to the address, for example `login1.karolina.it4i.cz`.
    
    
    !!! note "Workgroups Access Limitation"
        Projects from the **EUROHPC** workgroup can only access the **Karolina** cluster.
    
    !!! important "Supported keys"
        We accept only RSA or ED25519 keys for logging into our systems.

    ### Karolina Cluster
    
    | Login address                   | Port | Protocol | Login node                                |
    | ------------------------------- | ---- | -------- | ----------------------------------------- |
    | karolina.it4i.cz                | 22   | SSH      | round-robin DNS record for login{1,2,3,4} |
    | login{1,2,3,4}.karolina.it4i.cz | 22   | SSH      | login{1,2,3,4}                            |
    
    ### Barbora Cluster
    
    
    | Login address                 | Port | Protocol | Login node                            |
    | ----------------------------- | ---- | -------- | ------------------------------------- |
    | barbora.it4i.cz               | 22   | SSH      | round-robin DNS record for login{1,2} |
    | login{1,2}.barbora.it4i.cz    | 22   | SSH      | login{1,2}                            |
    
    
    ## Authentication
    
    Authentication is available by [private key][1] only. Verify SSH fingerprints during the first logon:
    
    
    ### Karolina
    
    **Fingerprints**
    
    
    Fingerprints are identical for all login nodes.
    
    
    
    ```console
    
    # login{1,2,3,4}:22 SSH-2.0-OpenSSH_7.4
    
    2048 MD5:41:3a:40:32:da:08:77:51:79:04:af:53:e4:57:d0:7c (RSA)
    2048 SHA256:Ip37d/bE6XwtWf3KnWA+sqA+zRGSFlf5vXai0v3MBmo (RSA)
    
    256 MD5:e9:b6:8e:7d:f8:c6:8f:42:34:10:71:02:14:a6:7c:22 (ED25519)
    256 SHA256:zKEtQMi2KRsxzzgo/sHcog+NFZqQ9tIyvJ7BVxOfzgI (ED25519)
    
    ```
    
    
    **Public Keys \ Known Hosts**
    
    Public Keys \ Known Hosts are identical for all login nodes.
    
    
    ```console
    
    login1,login1.karolina.it4i.cz,login1.karolina,karolina.it4i.cz ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC9Cp8/a3F7eOPQvH4+HjC778XvYgRXWmCEOQnE3clPcKw15iIat3bvKc8ckYLudAzomipWy4VYdDI2OnEXay5ba8HqdREJO31qNBtW1AXgydCfPnkeuUZS4WVlAWM+HDlK6caB8KlvHoarCnNj2jvuYsMbARgGEq3vrk3xW4uiGpS6Y/uGVBBwMFWFaINbmXUrU1ysv/ZD1VpH4eHykkD9+8xivhhZtcz5Z2T7ZnIib4/m9zZZvjKs4ejOo58cKXGYVl27kLkfyOzU3cirYNQOrGqllN/52fATfrXKMcQor9onsbTkNNjMgPFZkddufxTrUaS7EM6xYsj8xrPJ2RaN
    login1,login1.karolina.it4i.cz,login1.karolina,karolina.it4i.cz ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIDkIdDODkUYRgMy1h6g/UtH34RnDCQkwwiJZFB0eEu1c
    login2,login2.karolina.it4i.cz,login2.karolina,karolina.it4i.cz ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC9Cp8/a3F7eOPQvH4+HjC778XvYgRXWmCEOQnE3clPcKw15iIat3bvKc8ckYLudAzomipWy4VYdDI2OnEXay5ba8HqdREJO31qNBtW1AXgydCfPnkeuUZS4WVlAWM+HDlK6caB8KlvHoarCnNj2jvuYsMbARgGEq3vrk3xW4uiGpS6Y/uGVBBwMFWFaINbmXUrU1ysv/ZD1VpH4eHykkD9+8xivhhZtcz5Z2T7ZnIib4/m9zZZvjKs4ejOo58cKXGYVl27kLkfyOzU3cirYNQOrGqllN/52fATfrXKMcQor9onsbTkNNjMgPFZkddufxTrUaS7EM6xYsj8xrPJ2RaN
    login2,login2.karolina.it4i.cz,login2.karolina,karolina.it4i.cz ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIDkIdDODkUYRgMy1h6g/UtH34RnDCQkwwiJZFB0eEu1c
    login3,login3.karolina.it4i.cz,login3.karolina,karolina.it4i.cz ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC9Cp8/a3F7eOPQvH4+HjC778XvYgRXWmCEOQnE3clPcKw15iIat3bvKc8ckYLudAzomipWy4VYdDI2OnEXay5ba8HqdREJO31qNBtW1AXgydCfPnkeuUZS4WVlAWM+HDlK6caB8KlvHoarCnNj2jvuYsMbARgGEq3vrk3xW4uiGpS6Y/uGVBBwMFWFaINbmXUrU1ysv/ZD1VpH4eHykkD9+8xivhhZtcz5Z2T7ZnIib4/m9zZZvjKs4ejOo58cKXGYVl27kLkfyOzU3cirYNQOrGqllN/52fATfrXKMcQor9onsbTkNNjMgPFZkddufxTrUaS7EM6xYsj8xrPJ2RaN
    login3,login3.karolina.it4i.cz,login3.karolina,karolina.it4i.cz ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIDkIdDODkUYRgMy1h6g/UtH34RnDCQkwwiJZFB0eEu1c
    login4,login4.karolina.it4i.cz,login4.karolina,karolina.it4i.cz ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC9Cp8/a3F7eOPQvH4+HjC778XvYgRXWmCEOQnE3clPcKw15iIat3bvKc8ckYLudAzomipWy4VYdDI2OnEXay5ba8HqdREJO31qNBtW1AXgydCfPnkeuUZS4WVlAWM+HDlK6caB8KlvHoarCnNj2jvuYsMbARgGEq3vrk3xW4uiGpS6Y/uGVBBwMFWFaINbmXUrU1ysv/ZD1VpH4eHykkD9+8xivhhZtcz5Z2T7ZnIib4/m9zZZvjKs4ejOo58cKXGYVl27kLkfyOzU3cirYNQOrGqllN/52fATfrXKMcQor9onsbTkNNjMgPFZkddufxTrUaS7EM6xYsj8xrPJ2RaN
    login4,login4.karolina.it4i.cz,login4.karolina,karolina.it4i.cz ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIDkIdDODkUYRgMy1h6g/UtH34RnDCQkwwiJZFB0eEu1c
    
    
    ```

    ### Barbora
    
    
    
    **Fingerprints**
    
    
    ```console
    
    md5:
    39:55:e2:b9:2a:a2:c4:9e:b1:8e:f0:f7:b1:66:a8:73 (RSA)
    40:67:03:26:d3:6c:a0:7f:0a:df:0e:e7:a0:52:cc:4e (ED25519)
    
    
    
    sha256:
    TO5szOJf0bG7TWVLO3WABUpGKkP7nBm/RLyHmpoNpro (RSA)
    ZQzFTJVDdZa3I0ics9ME2qz4v5a3QzXugvyVioaH6tI (ED25519)
    ```
    
    **Public Keys \ Known Hosts**
    
    ```console
    
    barbora.it4i.cz, ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABgQDHUHvIrv7VUcGIcfsrcBjYfHpFBst8uhtJqfiYckfbeMRIdaodfjTO0pIXvd5wx+61a0C14zy1pdhvx6ykT5lwYkkn8l2tf+LRd6qN0alq/s+NGDJKpWGvdAGD3mM9AO1RmUPt+Vfg4VePQUZMu2PXZQu2C4TFFbaH2yiyCFlKz/Md9q+7NM+9U86uf3uLFbBu8mzkk2z3jyDGR6pjmpYTAiV/goUGpHgsW8Qx4GUdCreObQ6GUfPVOPvYaTlfXfteD9HluB7gwCWaUi5hevHhc+kK4xj61v64mGBOPmCobnAlr2RYQv6cDn7PHgI2mE7ZwRsZkNyMXqGr1S2JK2M64K53ZfF70aGrW/muHlFrYVFaJg6s1f7K/Xqu21wjwwvnJ8CcP7lUjASqhfSn9OBzEI38KMMo5Qon9p108wvqSKP2QnEdrdv1QOsBPtOZMNRMfEVpw6xVvyPka0X6gxzGfEc9nn3nOok35Fbvoo3G0P8RmOeDJLqDjUOggOs0Gwk=
    
    barbora.it4i.cz, ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIOmUm4btn7OC0QLIT3xekKTTdg5ziby8WdxccEczEeE1
    
    ```
    
    !!! note
        Barbora has identical SSH fingerprints on all login nodes.
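
The SHA256 fingerprints listed above can be cross-checked against the known-hosts entries, because an OpenSSH SHA256 fingerprint is the unpadded base64 encoding of the SHA-256 digest of the decoded key blob. The following is a minimal sketch, assuming a Bash shell with GNU coreutils (`base64`, `sha256sum`) and `sed`; the function name `fingerprint_sha256` is illustrative only and not part of OpenSSH.

```bash
#!/bin/bash
# Print the OpenSSH-style SHA256 fingerprint of a base64-encoded public key blob.
# fingerprint_sha256 is a hypothetical helper name, not an OpenSSH command.
fingerprint_sha256() {
    local hex escaped
    # Decode the key blob and hash the raw bytes.
    hex=$(printf '%s' "$1" | base64 -d | sha256sum | cut -d' ' -f1)
    # Rewrite the hex digest as \xNN escapes so printf can emit the raw digest bytes.
    escaped=$(printf '%s' "$hex" | sed 's/../\\x&/g')
    # Base64-encode the digest and drop the '=' padding, as OpenSSH does.
    printf 'SHA256:%s\n' "$(printf "$escaped" | base64 | tr -d '=')"
}

# Barbora ED25519 key blob, copied from the known-hosts listing above.
fingerprint_sha256 'AAAAC3NzaC1lZDI1NTE5AAAAIOmUm4btn7OC0QLIT3xekKTTdg5ziby8WdxccEczEeE1'
```

Compare the printed value with the ED25519 entry in the `sha256:` list. Where OpenSSH is installed, `ssh-keygen -lf <known_hosts file>` reports fingerprints in the same format.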
    
    
    
    ### Private Key Authentication
    
    
    On **Linux** or **Mac**, use:
    
    ```console
    
    local $ ssh -i /path/to/id_rsa username@cluster-name.it4i.cz
    
    ```
    
    If you see the warning message **UNPROTECTED PRIVATE KEY FILE!**, use this command to restrict the permissions of the private key file:
    
    ```console
    
    local $ chmod 600 /path/to/id_rsa
    
    ```
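
The permission check can also be done before connecting. The sketch below assumes GNU `stat` (the `-c %a` format differs on macOS), and the helper name `key_is_private` is hypothetical. A temporary file stands in for the private key so that the example is safe to run.

```bash
#!/bin/bash
# Succeed only if the key file grants no access to group or others.
# key_is_private is a hypothetical helper; `stat -c %a` assumes GNU coreutils.
key_is_private() {
    local mode
    mode=$(stat -c %a "$1") || return 2
    # The last two octal digits are the group and other permission bits.
    [ "${mode: -2}" = "00" ]
}

# Demonstration on a throwaway file standing in for a private key.
demo_key=$(mktemp)
chmod 644 "$demo_key"
key_is_private "$demo_key" || chmod 600 "$demo_key"
key_is_private "$demo_key" && echo "permissions OK"
rm -f "$demo_key"
```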
    
    On **Windows**, use the [PuTTY SSH client][2].
    
    
    After logging in, you will see the command prompt with the name of the cluster and the message of the day.
    
    
    !!! note
        The environment is **not** shared between login nodes, except for shared filesystems.
    
    
    ## Data Transfer
    
    
    ### Serial Transfer
    
    
    Data in and out of the system may be transferred by SCP and SFTP protocols.
    
    
    | Cluster  | Port | Protocol  |
    | -------- | ---- | --------- |
    | Karolina | 22   | SCP, SFTP |
    | Barbora  | 22   | SCP       |
    
    
    Authentication is by [private key][1] only.
    
    
    On Linux or Mac, use an SCP or SFTP client to transfer data to the cluster:
    
    
    ```console
    
    local $ scp -i /path/to/id_rsa my-local-file username@cluster-name.it4i.cz:directory/file
    
    ```
    
    ```console
    
    local $ scp -i /path/to/id_rsa -r my-local-dir username@cluster-name.it4i.cz:directory
    
    ```
    
    or
    
    ```console
    
    local $ sftp -o IdentityFile=/path/to/id_rsa username@cluster-name.it4i.cz
    ```

    You may request the **aes256-gcm@openssh.com cipher** for more efficient SSH-based transfers:
    
    ```console
    local $ scp -c aes256-gcm@openssh.com -i /path/to/id_rsa -r my-local-dir username@cluster-name.it4i.cz:directory
    ```
    
    The `-c` argument may be used with ssh, scp, and sftp. The same cipher can also be requested for sshfs and rsync (described below) by passing it to the underlying ssh command.
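
For rsync and sshfs the cipher cannot be given as a plain `-c` flag: in rsync, `-c` means "checksum", so the cipher is set on the ssh command passed via `-e`, and sshfs forwards `ssh_config` keywords such as `Ciphers` through `-o`. The sketch below only assembles and prints the two commands as a dry run; the key path, user name, directories, and mount point are the placeholders used elsewhere in this section.

```bash
#!/bin/bash
# Placeholders taken from the examples in this section; replace with your own values.
CIPHER='aes256-gcm@openssh.com'
KEY='/path/to/id_rsa'
TARGET='username@cluster-name.it4i.cz'

# rsync: the cipher is set on the ssh command supplied through -e.
rsync_cmd=(rsync -r -e "ssh -c $CIPHER -i $KEY" my-local-dir "$TARGET:directory")

# sshfs: -o passes ssh_config keywords such as Ciphers through to ssh.
sshfs_cmd=(sshfs -o "Ciphers=$CIPHER" -o "IdentityFile=$KEY" "$TARGET:." mountpoint)

# Dry run: print shell-quoted commands for review instead of executing them.
printf '%q ' "${rsync_cmd[@]}"; echo
printf '%q ' "${sshfs_cmd[@]}"; echo
```

To execute a command rather than print it, expand the array directly, for example `"${rsync_cmd[@]}"`.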
    
    
    A very convenient way to transfer files in and out of the cluster is via the FUSE filesystem [SSHFS][b].
    
    
    ```console
    
    local $ sshfs -o IdentityFile=/path/to/id_rsa username@cluster-name.it4i.cz:. mountpoint
    
    ```

    Using SSHFS, your home directory on the cluster will be mounted on your local computer, just like an external disk.
    
    
    Learn more about SSH, SCP, and SSHFS by reading the manpages:
    
    ```console
    
    local $ man ssh
    local $ man scp
    local $ man sshfs
    ```

    The rsync client uses SSH to establish the connection.

    ```console
    local $ rsync my-local-file username@cluster-name.it4i.cz:directory/file
    ```
    
    ```console
    local $ rsync -r my-local-dir username@cluster-name.it4i.cz:directory
    ```
    
    
    ### Parallel Transfer
    
    
    !!! note
        The data transfer speed is limited by the single TCP stream and single-core SSH encryption speed to about **250 MB/s** (750 MB/s with the aes256-gcm@openssh.com cipher).
        Run **multiple** streams to exceed this limit.
    
    #### Many Files
    
    Parallel execution of multiple rsync processes utilizes multiple cores to accelerate encryption and multiple TCP streams for enhanced bandwidth.
    
    First, set up ssh-agent single sign on:
    
    
    ```console
    local $ eval `ssh-agent`
    local $ ssh-add
    Enter passphrase for /home/user/.ssh/id_rsa:
    ```
    
    Then run multiple rsync instances in parallel, for example:

    ```console
    local $ cd my-local-dir
    local $ ls | xargs -n 2 -P 4 /bin/bash -c 'rsync "$@" username@cluster-name.it4i.cz:mydir' sh
    ```
    
    The **-n** argument determines the number of files to transfer in one rsync call. Set it according to file size and count (use a larger value for many small files).

    The **-P** argument determines the number of parallel rsync processes. Set it to the number of cores on your local machine.
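
To see how `-n` groups files into rsync calls before starting a real transfer, the batching can be previewed with `echo` in place of rsync. This is a sketch under stated assumptions: `preview_batches` is a hypothetical helper, `user@host:mydir` is a placeholder destination, and `-P 1` is used only to keep the preview output in order. File names are passed NUL-delimited (`-0`), which also copes with names containing whitespace.

```bash
#!/bin/bash
# Dry run: echo stands in for the real rsync call, so nothing is transferred.
# preview_batches N reads NUL-delimited file names and prints one line per rsync call.
preview_batches() {
    xargs -0 -n "$1" -P 1 /bin/bash -c 'echo rsync "$@" user@host:mydir' sh
}

# Five placeholder file names, two per rsync call.
printf '%s\0' a.dat b.dat c.dat d.dat e.dat | preview_batches 2
# rsync a.dat b.dat user@host:mydir
# rsync c.dat d.dat user@host:mydir
# rsync e.dat user@host:mydir
```

With real files, `find . -maxdepth 1 -type f -print0` can feed the same pipeline instead of `printf`.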
    
    Alternatively, use [HyperQueue][11]. First, get the [HyperQueue binary][e], then run:
    
    
    ```console
    local $ hq server start &
    local $ hq worker start &
    local $ find my-local-dir -type f | xargs -n 2 > jobfile
    local $ hq submit --log=/dev/null --progress --each-line jobfile \
            bash -c 'rsync -R $HQ_ENTRY username@cluster-name.it4i.cz:mydir'
    ```
    
    
    Again, the **-n** argument determines the number of files to transfer in one rsync call. Set it according to file size and count (use a larger value for many small files).
    
    
    #### Single Very Large File
    
    
    To transfer a single very large file efficiently, we need to transfer many blocks of the file in parallel, utilizing multiple cores to accelerate SSH encryption and multiple TCP streams for enhanced bandwidth.
    
    First, set up ssh-agent single sign on as [described above][10].
    
    Second, start the [HyperQueue server and HyperQueue worker][f]:
    
    
    ```console
    local $ hq server start &
    local $ hq worker start &
    ```
    
    Once set up, run the hqtransfer script listed below:
    
    ```console
    local $ ./hqtransfer mybigfile username@cluster-name.it4i.cz outputpath/outputfile
    ```
    
    
    The hqtransfer script:
    
    
    ```console
    #!/bin/bash
    # Usage: hqtransfer input_file ssh_destination [output_path/output_file]

    # Read input
    if [ -z "$1" ] || [ -z "$2" ]; then
        echo "Usage: $0 input_file ssh_destination [output_path/output_file]"
        exit 1
    fi
    INFILE=$1
    DEST=$2
    # The output file name defaults to the input file name
    OUTFILE=${3:-$INFILE}

    # Calculate transfer blocks: task IDs 0..SIZE, one 1 GB block each
    SIZE=$(($(stat --printf %s "$INFILE")/1024/1024/1024))
    echo "Transferring $((SIZE+1)) x 1GB blocks"

    # Execute: each task reads block HQ_TASK_ID locally and writes it
    # at the same offset of the remote file (conv=notrunc keeps other blocks)
    hq submit --log=/dev/null --progress --array 0-$SIZE /bin/bash -c \
            "dd if=$INFILE bs=1G count=1 skip=\$HQ_TASK_ID | \
             ssh -c aes256-gcm@openssh.com $DEST \
             dd of=$OUTFILE bs=1G conv=notrunc seek=\$HQ_TASK_ID"
    ```
    
    Copy the script into a file named `hqtransfer` and make it executable:
    
    ```console
    local $ chmod u+x hqtransfer
    ```
    
    
    The `hqtransfer` script is ready for use.
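
The block arithmetic used by the script can be tried out locally without HyperQueue or a network connection. In the sketch below, small blocks and a plain pipe stand in for the 1 GB blocks and the ssh connection; `copy_in_blocks` is a hypothetical helper name, and the blocks are deliberately written in reverse order to show that each one lands at its own offset independently of the others.

```bash
#!/bin/bash
# Local illustration of the block-wise copy: dd skip= selects the source block,
# dd seek= with conv=notrunc writes it at the same offset in the target.
# copy_in_blocks SRC DST BLOCK_SIZE is a hypothetical helper name.
copy_in_blocks() {
    local src=$1 dst=$2 bs=$3 last id
    # Highest block index, computed the same way as SIZE in the script above.
    last=$(( $(wc -c < "$src") / bs ))
    # Reverse order: the result does not depend on which block arrives first.
    for (( id = last; id >= 0; id-- )); do
        dd if="$src" bs="$bs" count=1 skip="$id" 2>/dev/null |
            dd of="$dst" bs="$bs" conv=notrunc seek="$id" 2>/dev/null
    done
}

src=$(mktemp); dst=$(mktemp)
head -c 2500 /dev/urandom > "$src"    # 2500 bytes: blocks 0, 1, 2 of up to 1024 bytes
copy_in_blocks "$src" "$dst" 1024
cmp -s "$src" "$dst" && echo "files are identical"
rm -f "$src" "$dst"
```

When the file size is an exact multiple of the block size, the last block is empty, which is harmless: the final task simply writes nothing.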
    
    
    ### Data Transfer From Windows Clients
    
    
    On Windows, use the [WinSCP client][c] to transfer data. The [win-sshfs client][d] provides a way to mount the cluster filesystems directly as an external disk.
    
    ## Connection Restrictions
    
    Outgoing connections from cluster login nodes to the outside world are restricted to the following ports:
    
    | Port | Protocol |
    | ---- | -------- |
    | 22   | SSH      |
    | 80   | HTTP     |
    | 443  | HTTPS    |
    | 873  | Rsync    |
    
    !!! note
        Use **SSH port forwarding** and proxy servers to connect from the cluster to all other remote ports.
    
    Outgoing connections from cluster compute nodes are restricted to the internal network. Direct connections from compute nodes to the outside world are blocked.
    
    
    | Service          | IP/Port            |
    | ---------------- | ------------------ |
    | TCP/22, TCP      | port 1024-65535    |
    | e-INFRA CZ Cloud | 195.113.243.0/24   |
    | IT4I Cloud       | 195.113.175.128/26 |
    
    
    ## Port Forwarding
    
    ### Port Forwarding From Login Nodes
    
    !!! note
        Port forwarding allows an application running on the cluster to connect to arbitrary remote hosts and ports.

    It works by tunneling the connection from the cluster back to the user's workstation and forwarding it from the workstation to the remote host.
    
    Select an unused port on the cluster login node (for example 6000) and establish the port forwarding:
    
    ```console
    $ ssh -R 6000:remote.host.com:1234 cluster-name.it4i.cz
    ```
    
    In this example, we establish port forwarding between port 6000 on the cluster and port 1234 on `remote.host.com`. By accessing `localhost:6000` on the cluster, an application will see the response of `remote.host.com:1234`. The traffic will run via the user's local workstation.
    
    Port forwarding may be done **using PuTTY** as well. On the PuTTY Configuration screen, load your cluster configuration first. Then go to _Connection > SSH > Tunnels_ to set up the port forwarding. Click the _Remote_ radio button. Insert 6000 into the _Source port_ textbox. Insert `remote.host.com:1234` into the _Destination_ textbox. Click _Add_, then _Open_.
    
    Port forwarding may also be established directly to the remote host. However, this requires that the user has SSH access to `remote.host.com`.
    
    ```console
    $ ssh -L 6000:localhost:1234 remote.host.com
    ```
    
    !!! note
        Port number 6000 is chosen as an example only. Pick any free port.
    
    ### Port Forwarding From Compute Nodes
    
    Remote port forwarding from compute nodes allows applications running on the compute nodes to access hosts outside the cluster.
    
    First, establish the remote port forwarding from the login node, as [described above][5].
    
    Second, invoke port forwarding from the compute node to the login node. Insert the following line into your jobscript or interactive shell:
    
    ```console
    $ ssh -TN -f -L 6000:localhost:6000 login1
    ```
    
    In this example, we assume that port forwarding from `login1:6000` to `remote.host.com:1234` has been established beforehand. By accessing `localhost:6000`, an application running on a compute node will see the response of `remote.host.com:1234`.
    
    ### Using Proxy Servers
    
    Port forwarding is static; each single port is mapped to a particular port on a remote host. Connection to another remote host requires a new forward.
    
    !!! note
        Applications with built-in proxy support can reach arbitrary remote hosts via a single proxy server.
    
    
    To establish a local proxy server on your workstation, install and run SOCKS proxy server software. On Linux, the sshd daemon provides this functionality. To start a SOCKS proxy server listening on port 1080, run:
    
    
    ```console
    
    local $ ssh -D 1080 localhost
    
    ```
    
    On Windows, install and run the free, open source Sock Puppet server.
    
    Once the proxy server is running, establish the SSH port forwarding from the cluster to the proxy server, port 1080, exactly as [described above][5]:
    
    ```console
    
    local $ ssh -R 6000:localhost:1080 cluster-name.it4i.cz
    ```

    Now, configure the application's proxy settings to `localhost:6000`. Use port forwarding to access the [proxy server from compute nodes][9], as well.
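
How the proxy is configured depends on the application. As one possible sketch, many command-line tools (curl, for instance) read the proxy address from the `ALL_PROXY` environment variable; whether a given application honors it is an assumption to verify in that application's documentation. The helper name `set_socks_proxy` is hypothetical. The `socks5h` scheme asks the proxy to resolve host names, which matters on nodes without outside DNS access.

```bash
#!/bin/bash
# Point proxy-aware tools at the forwarded SOCKS proxy.
# set_socks_proxy HOST PORT is a hypothetical helper name.
set_socks_proxy() {
    local url="socks5h://$1:$2"
    # Both spellings are exported because tools differ in which one they read.
    export ALL_PROXY="$url" all_proxy="$url"
}

# Port 6000 is the forwarded port from the example above.
set_socks_proxy localhost 6000
echo "ALL_PROXY=$ALL_PROXY"
```

With curl, the same proxy can also be given per command with `curl --socks5-hostname localhost:6000 <URL>`.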
    
    
    
    [1]: ../general/accessing-the-clusters/shell-access-and-data-transfer/ssh-key-management.md
    
    [2]: ../general/accessing-the-clusters/shell-access-and-data-transfer/putty.md
    [5]: #port-forwarding-from-login-nodes
    [6]: ../general/accessing-the-clusters/graphical-user-interface/x-window-system.md
    [7]: ../general/accessing-the-clusters/graphical-user-interface/vnc.md
    [8]: ../general/accessing-the-clusters/vpn-access.md
    
    [9]: #port-forwarding-from-compute-nodes
    
    [10]: #many-files
    [11]: ../general/hyperqueue.md
    
    
    [b]: http://linux.die.net/man/1/sshfs
    [c]: http://winscp.net/eng/download.php
    [d]: http://code.google.com/p/win-sshfs/
    
    [e]: https://github.com/It4innovations/hyperqueue/releases/latest
    [f]: https://it4innovations.github.io/hyperqueue/stable/cheatsheet/