# Accessing the Clusters

## Shell Access

All IT4Innovations clusters are accessed by the SSH protocol via login nodes at the address **cluster-name.it4i.cz**. The login nodes may be addressed specifically, by prepending the loginX node name to the address.

!!! note "Workgroups Access Limitation"
    Projects from the **EUROHPC** workgroup can only access the **Karolina** cluster.

!!! important "Supported keys"
    We accept only RSA or ED25519 keys for logging into our systems.

### Karolina Cluster

| Login address                   | Port | Protocol | Login node                                |
| ------------------------------- | ---- | -------- | ----------------------------------------- |
| karolina.it4i.cz                | 22   | SSH      | round-robin DNS record for login{1,2,3,4} |
| login{1,2,3,4}.karolina.it4i.cz | 22   | SSH      | login{1,2,3,4}                            |

### Barbora Cluster

| Login address                 | Port | Protocol | Login node                            |
| ----------------------------- | ---- | -------- | ------------------------------------- |
| barbora.it4i.cz               | 22   | SSH      | round-robin DNS record for login{1,2} |
| login{1,2}.barbora.it4i.cz    | 22   | SSH      | login{1,2}                            |

## Authentication

Authentication is available by [private key][1] only. Verify SSH fingerprints during the first logon:

### Karolina

**Fingerprints**

Fingerprints are identical for all login nodes.

```console
# login{1,2,3,4}:22 SSH-2.0-OpenSSH_7.4
2048 MD5:41:3a:40:32:da:08:77:51:79:04:af:53:e4:57:d0:7c (RSA)
2048 SHA256:Ip37d/bE6XwtWf3KnWA+sqA+zRGSFlf5vXai0v3MBmo (RSA)
256 MD5:e9:b6:8e:7d:f8:c6:8f:42:34:10:71:02:14:a6:7c:22 (ED25519)
256 SHA256:zKEtQMi2KRsxzzgo/sHcog+NFZqQ9tIyvJ7BVxOfzgI (ED25519)
```

**Public Keys \ Known Hosts**

Public Keys \ Known Hosts are identical for all login nodes.

```console
login1,login1.karolina.it4i.cz,login1.karolina,karolina.it4i.cz ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC9Cp8/a3F7eOPQvH4+HjC778XvYgRXWmCEOQnE3clPcKw15iIat3bvKc8ckYLudAzomipWy4VYdDI2OnEXay5ba8HqdREJO31qNBtW1AXgydCfPnkeuUZS4WVlAWM+HDlK6caB8KlvHoarCnNj2jvuYsMbARgGEq3vrk3xW4uiGpS6Y/uGVBBwMFWFaINbmXUrU1ysv/ZD1VpH4eHykkD9+8xivhhZtcz5Z2T7ZnIib4/m9zZZvjKs4ejOo58cKXGYVl27kLkfyOzU3cirYNQOrGqllN/52fATfrXKMcQor9onsbTkNNjMgPFZkddufxTrUaS7EM6xYsj8xrPJ2RaN
login1,login1.karolina.it4i.cz,login1.karolina,karolina.it4i.cz ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIDkIdDODkUYRgMy1h6g/UtH34RnDCQkwwiJZFB0eEu1c
login2,login2.karolina.it4i.cz,login2.karolina,karolina.it4i.cz ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC9Cp8/a3F7eOPQvH4+HjC778XvYgRXWmCEOQnE3clPcKw15iIat3bvKc8ckYLudAzomipWy4VYdDI2OnEXay5ba8HqdREJO31qNBtW1AXgydCfPnkeuUZS4WVlAWM+HDlK6caB8KlvHoarCnNj2jvuYsMbARgGEq3vrk3xW4uiGpS6Y/uGVBBwMFWFaINbmXUrU1ysv/ZD1VpH4eHykkD9+8xivhhZtcz5Z2T7ZnIib4/m9zZZvjKs4ejOo58cKXGYVl27kLkfyOzU3cirYNQOrGqllN/52fATfrXKMcQor9onsbTkNNjMgPFZkddufxTrUaS7EM6xYsj8xrPJ2RaN
login2,login2.karolina.it4i.cz,login2.karolina,karolina.it4i.cz ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIDkIdDODkUYRgMy1h6g/UtH34RnDCQkwwiJZFB0eEu1c
login3,login3.karolina.it4i.cz,login3.karolina,karolina.it4i.cz ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC9Cp8/a3F7eOPQvH4+HjC778XvYgRXWmCEOQnE3clPcKw15iIat3bvKc8ckYLudAzomipWy4VYdDI2OnEXay5ba8HqdREJO31qNBtW1AXgydCfPnkeuUZS4WVlAWM+HDlK6caB8KlvHoarCnNj2jvuYsMbARgGEq3vrk3xW4uiGpS6Y/uGVBBwMFWFaINbmXUrU1ysv/ZD1VpH4eHykkD9+8xivhhZtcz5Z2T7ZnIib4/m9zZZvjKs4ejOo58cKXGYVl27kLkfyOzU3cirYNQOrGqllN/52fATfrXKMcQor9onsbTkNNjMgPFZkddufxTrUaS7EM6xYsj8xrPJ2RaN
login3,login3.karolina.it4i.cz,login3.karolina,karolina.it4i.cz ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIDkIdDODkUYRgMy1h6g/UtH34RnDCQkwwiJZFB0eEu1c
login4,login4.karolina.it4i.cz,login4.karolina,karolina.it4i.cz ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC9Cp8/a3F7eOPQvH4+HjC778XvYgRXWmCEOQnE3clPcKw15iIat3bvKc8ckYLudAzomipWy4VYdDI2OnEXay5ba8HqdREJO31qNBtW1AXgydCfPnkeuUZS4WVlAWM+HDlK6caB8KlvHoarCnNj2jvuYsMbARgGEq3vrk3xW4uiGpS6Y/uGVBBwMFWFaINbmXUrU1ysv/ZD1VpH4eHykkD9+8xivhhZtcz5Z2T7ZnIib4/m9zZZvjKs4ejOo58cKXGYVl27kLkfyOzU3cirYNQOrGqllN/52fATfrXKMcQor9onsbTkNNjMgPFZkddufxTrUaS7EM6xYsj8xrPJ2RaN
login4,login4.karolina.it4i.cz,login4.karolina,karolina.it4i.cz ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIDkIdDODkUYRgMy1h6g/UtH34RnDCQkwwiJZFB0eEu1c
```

### Barbora

**Fingerprints**

```console
md5:
39:55:e2:b9:2a:a2:c4:9e:b1:8e:f0:f7:b1:66:a8:73 (RSA)
40:67:03:26:d3:6c:a0:7f:0a:df:0e:e7:a0:52:cc:4e (ED25519)

sha256:
TO5szOJf0bG7TWVLO3WABUpGKkP7nBm/RLyHmpoNpro (RSA)
ZQzFTJVDdZa3I0ics9ME2qz4v5a3QzXugvyVioaH6tI (ED25519)
```

**Public Keys \ Known Hosts**

```console
barbora.it4i.cz, ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABgQDHUHvIrv7VUcGIcfsrcBjYfHpFBst8uhtJqfiYckfbeMRIdaodfjTO0pIXvd5wx+61a0C14zy1pdhvx6ykT5lwYkkn8l2tf+LRd6qN0alq/s+NGDJKpWGvdAGD3mM9AO1RmUPt+Vfg4VePQUZMu2PXZQu2C4TFFbaH2yiyCFlKz/Md9q+7NM+9U86uf3uLFbBu8mzkk2z3jyDGR6pjmpYTAiV/goUGpHgsW8Qx4GUdCreObQ6GUfPVOPvYaTlfXfteD9HluB7gwCWaUi5hevHhc+kK4xj61v64mGBOPmCobnAlr2RYQv6cDn7PHgI2mE7ZwRsZkNyMXqGr1S2JK2M64K53ZfF70aGrW/muHlFrYVFaJg6s1f7K/Xqu21wjwwvnJ8CcP7lUjASqhfSn9OBzEI38KMMo5Qon9p108wvqSKP2QnEdrdv1QOsBPtOZMNRMfEVpw6xVvyPka0X6gxzGfEc9nn3nOok35Fbvoo3G0P8RmOeDJLqDjUOggOs0Gwk=
barbora.it4i.cz, ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIOmUm4btn7OC0QLIT3xekKTTdg5ziby8WdxccEczEeE1
```

!!! note
    Barbora has identical SSH fingerprints on all login nodes.
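
On the first logon, OpenSSH prints the host key fingerprint and asks for confirmation. The following is a minimal sketch of checking that printed value against the SHA256 fingerprints listed above. The function name `is_documented_fingerprint` is hypothetical, and the values are simply copied from this page, so they need updating whenever the listings change.

```shell
# Check an observed SHA256 host key fingerprint against the values documented above.
# The function name is illustrative; the fingerprints are copied from this page.
is_documented_fingerprint() {
    # Accept the value with or without the leading "SHA256:" prefix.
    case "${1#SHA256:}" in
        Ip37d/bE6XwtWf3KnWA+sqA+zRGSFlf5vXai0v3MBmo) echo "Karolina RSA" ;;
        zKEtQMi2KRsxzzgo/sHcog+NFZqQ9tIyvJ7BVxOfzgI) echo "Karolina ED25519" ;;
        TO5szOJf0bG7TWVLO3WABUpGKkP7nBm/RLyHmpoNpro) echo "Barbora RSA" ;;
        ZQzFTJVDdZa3I0ics9ME2qz4v5a3QzXugvyVioaH6tI) echo "Barbora ED25519" ;;
        *) echo "UNKNOWN fingerprint, do not connect" >&2; return 1 ;;
    esac
}

# Example: paste the value shown in the "key fingerprint is SHA256:..." prompt.
is_documented_fingerprint "SHA256:zKEtQMi2KRsxzzgo/sHcog+NFZqQ9tIyvJ7BVxOfzgI"
```

The fingerprint can also be obtained before connecting, for example with `ssh-keyscan -t ed25519 karolina.it4i.cz | ssh-keygen -lf -`, and then passed to the same check.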

### Private Key Authentication

On **Linux** or **Mac**, use:

```console
local $ ssh -i /path/to/id_rsa username@cluster-name.it4i.cz
```

If you see the warning message **UNPROTECTED PRIVATE KEY FILE!**, restrict the permissions of the private key file with this command:

```console
local $ chmod 600 /path/to/id_rsa
```

On **Windows**, use the [PuTTY SSH client][2].

After logging in, you will see the command prompt with the name of the cluster and the message of the day.

!!! note
    The environment is **not** shared between login nodes, except for shared filesystems.
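
To avoid repeating the `-i` option and the full address on every login, the connection details can be stored in the SSH client configuration. The sketch below assumes an OpenSSH client; the helper name `add_cluster_host`, the alias `karolina`, and all argument values are placeholders, not settings prescribed by IT4Innovations.

```shell
# Append a Host entry for one cluster to an SSH client configuration file.
# Arguments: alias, login address, user name, private key path, config file.
add_cluster_host() {
    cat >> "$5" <<EOF
Host $1
    HostName $2
    User $3
    IdentityFile $4
    IdentitiesOnly yes
EOF
    # Keep the configuration file private to the current user.
    chmod 600 "$5"
}

# Example (all values are placeholders):
# add_cluster_host karolina karolina.it4i.cz username /path/to/id_rsa ~/.ssh/config
# ssh karolina
```

With such an entry in place, `scp`, `sftp`, `sshfs`, and `rsync` accept the alias in place of `username@cluster-name.it4i.cz` as well.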

## Data Transfer

### Serial Transfer

Data in and out of the system may be transferred by SCP and SFTP protocols.

| Cluster  | Port | Protocol  |
| -------- | ---- | --------- |
| Karolina | 22   | SCP, SFTP |
| Barbora  | 22   | SCP       |

Authentication is by [private key][1] only.

On Linux or Mac, use an SCP or SFTP client to transfer data to the cluster:

```console
local $ scp -i /path/to/id_rsa my-local-file username@cluster-name.it4i.cz:directory/file
```

```console
local $ scp -i /path/to/id_rsa -r my-local-dir username@cluster-name.it4i.cz:directory
```

or

```console
local $ sftp -o IdentityFile=/path/to/id_rsa username@cluster-name.it4i.cz
```

You may request the **aes256-gcm@openssh.com cipher** for more efficient SSH-based transfer:

```console
local $ scp -c aes256-gcm@openssh.com -i /path/to/id_rsa -r my-local-dir username@cluster-name.it4i.cz:directory
```

The -c argument may be used with ssh, scp and sftp, and is also applicable to sshfs and rsync below.

A convenient way to transfer files in and out of the cluster is via the FUSE filesystem [SSHFS][b].

```console
local $ sshfs -o IdentityFile=/path/to/id_rsa username@cluster-name.it4i.cz:. mountpoint
```

Using SSHFS, the user's home directory will be mounted on your local computer, just like an external disk.

Learn more about SSH, SCP, and SSHFS by reading the manpages:

```console
local $ man ssh
local $ man scp
local $ man sshfs
```

The rsync client uses SSH to establish the connection.

```console
local $ rsync my-local-file username@cluster-name.it4i.cz:directory/file
```

```console
local $ rsync -r my-local-dir username@cluster-name.it4i.cz:directory
```

### Parallel Transfer

!!! note
    The data transfer speed is limited by the single TCP stream and single-core SSH encryption speed to about **250 MB/s** (750 MB/s with the aes256-gcm@openssh.com cipher).
    Run **multiple** streams to exceed this limit.

#### Many Files

Parallel execution of multiple rsync processes utilizes multiple cores to accelerate encryption and multiple TCP streams for enhanced bandwidth.

First, set up ssh-agent single sign-on:

```console
local $ eval `ssh-agent`
local $ ssh-add
Enter passphrase for /home/user/.ssh/id_rsa:
```

Then run multiple rsync instances in parallel, for example:

```console
local $ cd my-local-dir
local $ ls | xargs -n 2 -P 4 /bin/bash -c 'rsync "$@" username@cluster-name.it4i.cz:mydir' sh
```

The **-n** argument determines the number of files to transfer in one rsync call. Set it according to file size and count (use a large value for many small files).

The **-P** argument determines the number of parallel rsync processes. Set it to the number of cores on your local machine.

Alternatively, use [HyperQueue][11]. First get the [HyperQueue binary][e], then run:

```console
local $ hq server start &
local $ hq worker start &
local $ find my-local-dir -type f | xargs -n 2 > jobfile
local $ hq submit --log=/dev/null --progress --each-line jobfile \
        bash -c 'rsync -R $HQ_ENTRY username@cluster-name.it4i.cz:mydir'
```

Again, the **-n** argument determines the number of files to transfer in one rsync call. Set it according to file size and count (use a large value for many small files).
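
Before moving any data, it can help to preview how `xargs -n` groups file names into rsync calls. The sketch below substitutes `echo` for the real transfer, so nothing is copied; the helper name `preview_batches` and the file names are illustrative only.

```shell
# Preview how xargs batches file names: echo stands in for the real rsync call.
# Arguments: files per rsync call (the -n value), followed by the file names.
preview_batches() {
    files_per_call=$1
    shift
    printf '%s\n' "$@" | xargs -n "$files_per_call" echo rsync
}

# Five files with -n 2 give three rsync calls; the last call gets the leftover file.
preview_batches 2 a.dat b.dat c.dat d.dat e.dat
```

Each printed line corresponds to one rsync invocation, so the number of lines shows how much work is available to spread across the `-P` parallel processes.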

#### Single Very Large File

To transfer a single very large file efficiently, we need to transfer many blocks of the file in parallel, utilizing multiple cores to accelerate SSH encryption and multiple TCP streams for enhanced bandwidth.

First, set up ssh-agent single sign on as [described above][10].
Second, start the [HyperQueue server and HyperQueue worker][f]:

```console
local $ hq server start &
local $ hq worker start &
```

Once set up, run the hqtransfer script listed below:

```console
local $ ./hqtransfer mybigfile username@cluster-name.it4i.cz outputpath/outputfile
```

The hqtransfer script:

```console
#!/bin/bash
# Read input: input file, SSH destination, and optional remote output path
USAGE="Usage: $0 input_file ssh_destination [output_path/output_file]"
if [ -z "$1" ]; then echo "$USAGE"; exit 1; fi
INFILE=$1

if [ -z "$2" ]; then echo "$USAGE"; exit 1; fi
DEST=$2

# The remote file name defaults to the local one
OUTFILE=$INFILE
if [ -n "$3" ]; then OUTFILE=$3; fi

# Calculate the index of the last 1GB block (integer division rounds down)
SIZE=$(($(stat --printf %s "$INFILE")/1024/1024/1024))
echo "Transferring $((SIZE+1)) x 1GB blocks"

# Execute: each HyperQueue task copies the block selected by HQ_TASK_ID
hq submit --log=/dev/null --progress --array 0-$SIZE /bin/bash -c \
        "dd if=$INFILE bs=1G count=1 skip=\$HQ_TASK_ID | \
         ssh -c aes256-gcm@openssh.com $DEST \
         dd of=$OUTFILE bs=1G conv=notrunc seek=\$HQ_TASK_ID"

exit
```

Copy the script into a file named `hqtransfer` and make it executable:

```console
local $ chmod u+x hqtransfer
```

The `hqtransfer` script is ready for use.
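
The block scheme can be tried locally before running a real transfer. The following is a minimal sketch, assuming GNU `dd`: it uses 1 MiB blocks instead of 1 GiB, a plain loop instead of HyperQueue tasks, and no SSH in between. The helper name `copy_in_blocks` and the temporary file names are hypothetical.

```shell
# Copy SRC to DST block by block, last block first, using the same dd options
# as hqtransfer (1 MiB blocks here instead of 1 GiB, and no SSH in between).
copy_in_blocks() {
    src=$1
    dst=$2
    # Index of the last block; integer division rounds down, as in the script.
    last=$(( $(wc -c < "$src") / 1024 / 1024 ))
    block=$last
    while [ "$block" -ge 0 ]; do
        # skip selects the block to read, seek places it at the same offset,
        # and conv=notrunc keeps the blocks that were already written.
        dd if="$src" bs=1M count=1 skip="$block" 2>/dev/null \
            | dd of="$dst" bs=1M conv=notrunc seek="$block" 2>/dev/null
        block=$((block - 1))
    done
}

# Demonstration on a 2.5 MiB file of random data in a temporary directory.
workdir=$(mktemp -d)
dd if=/dev/urandom of="$workdir/source.bin" bs=1024 count=2560 2>/dev/null
copy_in_blocks "$workdir/source.bin" "$workdir/copy.bin"
cmp "$workdir/source.bin" "$workdir/copy.bin" && echo "copy matches the source"
```

Writing the blocks in reverse order shows that each block is independent of the others, which is what allows the HyperQueue tasks to run in any order and in parallel.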

### Data Transfer From Windows Clients

On Windows, use the [WinSCP client][c] to transfer data. The [win-sshfs client][d] provides a way to mount the cluster filesystems directly as an external disk.

## Connection Restrictions

Outgoing connections from cluster login nodes to the outside world are restricted to the following ports:

| Port | Protocol |
| ---- | -------- |
| 22   | SSH      |
| 80   | HTTP     |
| 443  | HTTPS    |
| 873  | Rsync    |

!!! note
    Use **SSH port forwarding** and proxy servers to connect from the cluster to all other remote ports.

Outgoing connections from cluster compute nodes are restricted to the internal network. Direct connections from compute nodes to the outside world are blocked.

| Service          | IP/Port            |
| ---------------- | ------------------ |
| TCP/22, TCP      | port 1024-65535    |
| e-INFRA CZ Cloud | 195.113.243.0/24   |
| IT4I Cloud       | 195.113.175.128/26 |

## Port Forwarding

### Port Forwarding From Login Nodes

!!! note
    Port forwarding allows an application running on the cluster to connect to arbitrary remote hosts and ports.

It works by tunneling the connection from the cluster back to the user's workstation and forwarding from the workstation to the remote host.

Select an unused port on the cluster login node (for example 6000) and establish the port forwarding:

```console
$ ssh -R 6000:remote.host.com:1234 cluster-name.it4i.cz
```

In this example, we establish port forwarding between port 6000 on the cluster and port 1234 on the `remote.host.com`. By accessing `localhost:6000` on the cluster, an application will see the response of `remote.host.com:1234`. The traffic will run via the user's local workstation.

Port forwarding may be done **using PuTTY** as well. On the PuTTY Configuration screen, load your cluster configuration first. Then go to *Connection > SSH > Tunnels* to set up the port forwarding. Click the _Remote_ radio button. Enter 6000 in the _Source port_ textbox and `remote.host.com:1234` in the _Destination_ textbox. Click _Add_, then _Open_.

Port forwarding may be established directly to the remote host. However, this requires that the user has SSH access to `remote.host.com`.

```console
$ ssh -L 6000:localhost:1234 remote.host.com
```

!!! note
    Port number 6000 is chosen as an example only. Pick any free port.

### Port Forwarding From Compute Nodes

Remote port forwarding from compute nodes allows applications running on the compute nodes to access hosts outside the cluster.

First, establish the remote port forwarding from the login node, as [described above][5].

Second, invoke port forwarding from the compute node to the login node. Insert the following line into your jobscript or interactive shell:

```console
$ ssh -TN -f -L 6000:localhost:6000 login1
```

In this example, we assume that port forwarding from `login1:6000` to `remote.host.com:1234` has been established beforehand. By accessing `localhost:6000`, an application running on a compute node will see the response of `remote.host.com:1234`.

### Using Proxy Servers

Port forwarding is static; each single port is mapped to a particular port on a remote host. Connection to another remote host requires a new forward.

!!! note
    Applications with built-in proxy support can reach arbitrary remote hosts via a single proxy server.

To establish a local proxy server on your workstation, install and run the SOCKS proxy server software. On Linux, the SSH daemon (sshd) provides the functionality. To establish the SOCKS proxy server listening on port 1080, run:

```console
local $ ssh -D 1080 localhost
```

On Windows, install and run the free, open source Sock Puppet server.

Once the proxy server is running, establish the SSH port forwarding from the cluster to the proxy server, port 1080, exactly as [described above][5]:

```console
local $ ssh -R 6000:localhost:1080 cluster-name.it4i.cz
```

Now, configure the application's proxy settings to `localhost:6000`. Use port forwarding to access the [proxy server from compute nodes][9], as well.
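
For command-line tools, the proxy setting is often an environment variable. The following is a minimal sketch for `curl`, which reads the `ALL_PROXY` variable; whether other applications honor it is an assumption to verify per application, and `example.org` is a placeholder host.

```shell
# Run on the cluster once the tunnel to the SOCKS proxy is up.
# Port 6000 matches the forward above; the socks5h scheme asks curl to let
# the proxy resolve host names as well.
export ALL_PROXY="socks5h://localhost:6000"

# curl reads ALL_PROXY, so a later request needs no extra options, for example:
# curl -sS https://example.org/
echo "proxy for this shell: $ALL_PROXY"
```

The variable only affects the current shell and the processes started from it, so it can be set inside a jobscript without changing the behavior of other sessions.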

[1]: ../general/accessing-the-clusters/shell-access-and-data-transfer/ssh-key-management.md
[2]: ../general/accessing-the-clusters/shell-access-and-data-transfer/putty.md
[5]: #port-forwarding-from-login-nodes
[6]: ../general/accessing-the-clusters/graphical-user-interface/x-window-system.md
[7]: ../general/accessing-the-clusters/graphical-user-interface/vnc.md
[8]: ../general/accessing-the-clusters/vpn-access.md
[9]: #port-forwarding-from-compute-nodes
[10]: #many-files
[11]: ../general/hyperqueue.md

[b]: http://linux.die.net/man/1/sshfs
[c]: http://winscp.net/eng/download.php
[d]: http://code.google.com/p/win-sshfs/
[e]: https://github.com/It4innovations/hyperqueue/releases/latest
[f]: https://it4innovations.github.io/hyperqueue/stable/cheatsheet/