diff --git a/docs.it4i/anselm-salomon-shutdown.md b/docs.it4i/anselm-salomon-shutdown.md index 182cc8ea60032a3a648abcec4c28f0d817e7b29a..794c4f9934a85976e7e5832ac277f3be577f70bc 100644 --- a/docs.it4i/anselm-salomon-shutdown.md +++ b/docs.it4i/anselm-salomon-shutdown.md @@ -1,25 +1,25 @@ # Salomon Supercomputer Withdrawal From Service !!! note - Content updated on 01.12.2021 + Content updated on 12.01.2022 ## Salomon Withdrawal From Service ### Salomon Access and Job Scheduling -- The long jobs (qlong queue) will be no longer scheduled after Monday 6.12.2021 9:00 -- All jobs will be scheduled to finish on **Monday 13.12.2021, 9:00**. -- Access to Salomon login nodes and data will be preserved to **03.01.2022, 9:00**. +- Scheduling of long jobs (`qlong` queue) ended on **Monday 6.12.2021, 9:00**. +- All jobs were scheduled to finish on **Monday 13.12.2021, 9:00**. +- Access to Salomon login nodes and data ended on **03.01.2022, 9:00**. ### Salomon Data !!! note - The **PROJECT** storage is available to hold the /home and /scratch data. + The **PROJECT** storage is available to hold the `/home` and `/scratch` data. -- Users should synchronize any remaining data on /home and /scratch to the PROJECT storage themselves. -- The data on the Salomon /home and /scratch will be read-only from **Monday 13.12.2021, 9:00**. -- The data on the Salomon /home and /scratch storage will become permanently inaccessible starting **03.01.2022, 9:00**. -- No backup or data transfer is scheduled for /home or other storages by IT4I, data will be **permanently lost on 03.01.2022, 9:00**. +- Users should synchronize any remaining data on `/home` and `/scratch` to the PROJECT storage themselves. +- The data on the Salomon `/home` and `/scratch` storage were set to read-only on **Monday 13.12.2021, 9:00**. +- The data on the Salomon `/home` and `/scratch` storage became permanently inaccessible on **03.01.2022, 9:00**. +- After 03.01.2022, 9:00, all data from Salomon `/home` and `/scratch` storage will be **permanently lost**. - Make sure that **you [save all the relevant data][2]** to external resources or to PROJECT storage before **03.01.2022, 9:00**. ### Salomon Future diff --git a/docs.it4i/einfracz-migration.md b/docs.it4i/einfracz-migration.md index d06bac4c8f24e5da91341c56be386350b2da96f2..7d395caddc882f5eb3ba483aea6e08b93c7311a2 100644 --- a/docs.it4i/einfracz-migration.md +++ b/docs.it4i/einfracz-migration.md @@ -1,4 +1,4 @@ -# Migration of IT4I Accounts to e-INFRA CZ +# Migration to e-INFRA CZ ## Introduction @@ -39,8 +39,9 @@ After the migration process is completed, you will receive a confirmation email ## Steps After Migration +After the migration, you must use your **e-INFRA CZ credentials** to access all IT4I services as well as [e-INFRA CZ services][5]. + Successfully migrated accounts tied to e-INFRA CZ can be self-managed at [e-INFRA CZ User profile][4]. -With this account, you will have access to all IT4I services as well as to [e-INFRA CZ services][5]. !!! tip "Recommendation" After migration, we recommend [verifying your SSH keys][6] for cluster access.
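For the Salomon data synchronization described above, a minimal sketch run from a Salomon login node could look as follows, assuming `rsync` is available there and using `/mnt/proj3/open-XX-XX` purely as an illustrative PROJECT path (query your actual one with `it4i-get-project-dir`):

```console
$ # copy any remaining /home and /scratch data into the PROJECT storage (paths are illustrative)
$ rsync -avP ~/ /mnt/proj3/open-XX-XX/salomon-home/
$ rsync -avP /scratch/work/user/$USER/ /mnt/proj3/open-XX-XX/salomon-scratch/
```

The `-avP` flags preserve permissions and timestamps and allow an interrupted transfer to be resumed, which matters for large scratch directories.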
diff --git a/docs.it4i/general/obtaining-login-credentials/obtaining-login-credentials.md b/docs.it4i/general/obtaining-login-credentials/obtaining-login-credentials.md index 2077f72c755be758903cf18bf694230a83ee979e..e329313a61f266690be7e691a463b9d401d0293d 100644 --- a/docs.it4i/general/obtaining-login-credentials/obtaining-login-credentials.md +++ b/docs.it4i/general/obtaining-login-credentials/obtaining-login-credentials.md @@ -36,7 +36,6 @@ John Smith You will receive your personal login credentials by encrypted email. The login credentials include: 1. username -1. SSH private key and private key passphrase 1. system password The clusters are accessed by the [private key][5] and username. Username and password are used for login to the [information systems][d]. diff --git a/docs.it4i/general/resource_allocation_and_job_execution.md b/docs.it4i/general/resource_allocation_and_job_execution.md index e8e936817fa86e6b7c172af2256580132213df7d..3733e0ba82839d307e7b7a58512724ee66e58ba6 100644 --- a/docs.it4i/general/resource_allocation_and_job_execution.md +++ b/docs.it4i/general/resource_allocation_and_job_execution.md @@ -45,4 +45,4 @@ Read more on [Capacity Computing][6] page. [5]: job-submission-and-execution.md [6]: capacity-computing.md -[a]: https://extranet.it4i.cz/rsweb/salomon/queues +[a]: https://extranet.it4i.cz/rsweb/ diff --git a/docs.it4i/general/resources-allocation-policy.md b/docs.it4i/general/resources-allocation-policy.md index 918055eb296201aab80b692355845e3533099ff0..af07b10a788a4c14ee0ce81e78273ec806b58f2e 100644 --- a/docs.it4i/general/resources-allocation-policy.md +++ b/docs.it4i/general/resources-allocation-policy.md @@ -50,38 +50,13 @@ Resources are allocated to jobs in a fair-share fashion, subject to constraints * **qnvidia**, **qfat**, Dedicated queues: The queue qnvidia is dedicated to accessing the Nvidia accelerated nodes and qfat the Fat nodes. It is required that an active project with nonzero remaining resources is specified to enter these queues. Included are 8 NVIDIA (4 NVIDIA cards per node) and 1 fat nodes. Full nodes, 24 cores per node, are allocated. The queues run with very high priority. The PI needs to explicitly ask [support][a] for authorization to enter the dedicated queues for all users associated with their project. * **qfree**, Free resource queue: The queue qfree is intended for utilization of free resources, after a project has exhausted all of its allocated computational resources (Does not apply to DD projects by default; DD projects have to request permission to use qfree after exhaustion of computational resources). It is required that active project is specified to enter the queue. Consumed resources will be accounted to the Project. Access to the qfree queue is automatically removed if consumed resources exceed 120% of the resources allocated to the Project. Only 189 nodes without accelerators may be accessed from this queue. Full nodes, 16 cores per node, are allocated. The queue runs with very low priority and no special authorization is required to use it. The maximum runtime in qfree is 12 hours. 
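As an illustration of how these queue and project rules translate into a submission, here is a minimal sketch using the PBS commands referenced in these pages (`qsub`, `qalter`); the project ID, job ID, node count, core count, and walltime are placeholders, not recommendations:

```console
$ # submit to the production queue and charge the active project OPEN-XX-XX (all values are placeholders)
$ qsub -A OPEN-XX-XX -q qprod -l select=2:ncpus=36,walltime=24:00:00 ./myjob.sh
$ # the walltime of a job still waiting in the queue (state Q) can later be adjusted, see the queue notes below
$ qalter -l walltime=12:00:00 123456
```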
-### Salomon - -| queue | active project | project resources | nodes | min ncpus | priority | authorization | walltime | -| --------- | -------------- | -------------------- | ------------------------------------------------------------- | --------- | -------- | ------------- | --------- | -| **qexp** | no | none required | 32 nodes, max 8 per user | 24 | 150 | no | 1 / 1h | -| **qprod** | yes | > 0 | 1006 nodes, max 86 per job | 24 | 0 | no | 24 / 48h | -| **qlong** | yes | > 0 | 256 nodes, max 40 per job, only non-accelerated nodes allowed | 24 | 0 | no | 72 / 144h | -| **qmpp** | yes | > 0 | 1006 nodes | 24 | 0 | yes | 2 / 4h | -| **qfat** | yes | > 0 | 1 (uv1) | 8 | 200 | yes | 24 / 48h | -| **qfree** | yes | < 120% of allocation | 987 nodes, max 86 per job | 24 | -1024 | no | 12 / 12h | -| **qviz** | yes | none required | 2 (with NVIDIA Quadro K5000) | 4 | 150 | no | 1 / 8h | -| **qmic** | yes | > 0 | 864 Intel Xeon Phi cards, max 8 mic per job | 0 | 0 | no | 24 / 48h | - -* **qexp**, Express queue: This queue is dedicated for testing and running very small jobs. It is not required to specify a project to enter the qexp. There are 2 nodes always reserved for this queue (w/o accelerators), a maximum 8 nodes are available via the qexp for a particular user. The nodes may be allocated on a per core basis. No special authorization is required to use the queue. The maximum runtime in qexp is 1 hour. -* **qprod**, Production queue: This queue is intended for normal production runs. It is required that active project with nonzero remaining resources is specified to enter the qprod. All nodes may be accessed via the qprod queue, however only 86 per job. Full nodes, 24 cores per node are allocated. The queue runs with medium priority and no special authorization is required to use it. The maximum runtime in qprod is 48 hours. -* **qlong**, Long queue: This queue is intended for long production runs. It is required that active project with nonzero remaining resources is specified to enter the qlong. Only 336 nodes without acceleration may be accessed via the qlong queue. Full nodes, 24 cores per node are allocated. The queue runs with medium priority and no special authorization is required to use it. The maximum runtime in qlong is 144 hours (three times of the standard qprod time - 3 \* 48 h) -* **qmpp**, massively parallel queue. This queue is intended for massively parallel runs. It is required that active project with nonzero remaining resources is specified to enter the qmpp. All nodes may be accessed via the qmpp queue. Full nodes, 24 cores per node are allocated. The queue runs with medium priority and no special authorization is required to use it. The maximum runtime in qmpp is 4 hours. An PI needs explicitly ask support for authorization to enter the queue for all users associated to their Project. -* **qfat**, UV2000 queue. This queue is dedicated to access the fat SGI UV2000 SMP machine. The machine (uv1) has 112 Intel IvyBridge cores at 3.3GHz and 3.25TB RAM (8 cores and 128GB RAM are dedicated for system). The PI needs to explicitly ask support for authorization to enter the queue for all users associated to their Project. -* **qfree**, Free resource queue: The queue qfree is intended for utilization of free resources, after a Project exhausted all its allocated computational resources (Does not apply to DD projects by default. DD projects have to request for permission on qfree after exhaustion of computational resources.). 
It is required that active project is specified to enter the queue. Consumed resources will be accounted to the Project. Access to the qfree queue is automatically removed if consumed resources exceed 120% of the resources allocated to the Project. Only 987 nodes without accelerator may be accessed from this queue. Full nodes, 24 cores per node are allocated. The queue runs with very low priority and no special authorization is required to use it. The maximum runtime in qfree is 12 hours. -* **qviz**, Visualization queue: Intended for pre-/post-processing using OpenGL accelerated graphics. Currently when accessing the node, each user gets 4 cores of a CPU allocated, thus approximately 73 GB of RAM and 1/7 of the GPU capacity (default "chunk"). If more GPU power or RAM is required, it is recommended to allocate more chunks (with 4 cores each) up to one whole node per user, so that all 28 cores, 512 GB RAM and whole GPU is exclusive. This is currently also the maximum allowed allocation per one user. One hour of work is allocated by default, the user may ask for 2 hours maximum. -* **qmic**, This queue is used to access MIC nodes. It is required that active project with nonzero remaining resources is specified to enter the qmic. All 864 MICs are included. - -!!! note - To access a node with Xeon Phi co-processor, you need to specify it in a [job submission select statement][3]. - ## Queue Notes The job wall clock time defaults to **half the maximum time**, see the table above. Longer wall time limits can be [set manually, see examples][3]. Jobs that exceed the reserved wall clock time (Req'd Time) get killed automatically. The wall clock time limit can be changed for queuing jobs (state Q) using the `qalter` command, however it cannot be changed for a running job (state R). -You can check the current queue configuration on rsweb: [Barbora][b] or [Salomon][d]. +You can check the current queue configuration on rsweb: [Barbora][b]. ## Queue Status @@ -210,4 +185,4 @@ Options: [a]: https://support.it4i.cz/rt/ [b]: https://extranet.it4i.cz/rsweb/barbora/queues [c]: https://extranet.it4i.cz/rsweb -[d]: https://extranet.it4i.cz/rsweb/salomon/queues +[d]: https://extranet.it4i.cz/rsweb diff --git a/docs.it4i/general/shell-and-data-access.md b/docs.it4i/general/shell-and-data-access.md index 9b01791f08c1b78032f8678af79900dc9b670753..21408102ed2c96722f37a75d19b800bfd7d95a6b 100644 --- a/docs.it4i/general/shell-and-data-access.md +++ b/docs.it4i/general/shell-and-data-access.md @@ -5,7 +5,7 @@ All IT4Innovations clusters are accessed by the SSH protocol via login nodes at the address **cluster-name.it4i.cz**. The login nodes may be addressed specifically, by prepending the loginX node name to the address. !!! note "Workgroups Access Limitation" - Projects from the **PRACE** workgroup can only access the **Barbora** and **Salomon** clusters.<br>Projects from the **EUROHPC** workgroup can only access the **Karolina** cluster. + Projects from the **PRACE** workgroup can only access the **Barbora** cluster.<br>Projects from the **EUROHPC** workgroup can only access the **Karolina** cluster. !!! 
important "Karolina and Barbora updated security requirements" Due to updated security requirements on Karolina and Barbora, @@ -31,16 +31,6 @@ All IT4Innovations clusters are accessed by the SSH protocol via login nodes at | login1.barbora.it4i.cz | 22 | SSH | login1 | | login2.barbora.it4i.cz | 22 | SSH | login2 | -### Salomon Cluster - -| Login address | Port | Protocol | Login node | -| ---------------------- | ---- | -------- | ------------------------------------- | -| salomon.it4i.cz | 22 | SSH | round-robin DNS record for login[1-4] | -| login1.salomon.it4i.cz | 22 | SSH | login1 | -| login2.salomon.it4i.cz | 22 | SSH | login2 | -| login3.salomon.it4i.cz | 22 | SSH | login3 | -| login4.salomon.it4i.cz | 22 | SSH | login4 | - ## Authentication Authentication is available by [private key][1] only. Verify SSH fingerprints during the first logon: @@ -101,24 +91,8 @@ barbora.it4i.cz, ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDZb1HGGREAV2ybYJgzeWuhy5o barbora.it4i.cz, ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIOmUm4btn7OC0QLIT3xekKTTdg5ziby8WdxccEczEeE1 ``` -### Salomon - -```console - md5: - f6:28:98:e4:f9:b2:a6:8f:f2:f4:2d:0a:09:67:69:80 (DSA) - 70:01:c9:9a:5d:88:91:c7:1b:c0:84:d1:fa:4e:83:5c (RSA) - 66:32:0a:ef:50:01:77:a7:52:3f:d9:f8:23:7c:2c:3a (ECDSA) - ab:3d:5e:ff:82:68:c7:72:da:4a:2d:e3:ca:85:0d:df (ED25519) - - sha256: - epkqEU2eFzXnMeMMkpX02CykyWjGyLwFj528Vumpzn4 (DSA) - WNIrR7oeQDYpBYy4N2d5A6cJ2p0837S7gzzTpaDBZrc (RSA) - cYO4UdtUBYlS46GEFUB75BkgxkI6YFQvjVuFxOlRG3g (ECDSA) - bFm3stNM8ETmj8Xd7iPXNtu5X5dC2apLNXGiH3VSTuw (ED25519) -``` - !!! note - Barbora and Salomon have identical SSH fingerprints on all login nodes. + Barbora has identical SSH fingerprints on all login nodes. ### Private Key Authentication: @@ -139,7 +113,7 @@ On **Windows**, use the [PuTTY SSH client][2]. After logging in, you will see the command prompt with the name of the cluster and the message of the day. !!! note - The environment is **not** shared between login nodes, except for [shared filesystems][3]. + The environment is **not** shared between login nodes, except for shared filesystems. ## Data Transfer @@ -149,7 +123,6 @@ Data in and out of the system may be transferred by SCP and SFTP protocols. | -------- | ---- | --------- | | Karolina | 22 | SCP, SFTP | | Barbora | 22 | SCP | -| Salomon | 22 | SCP, SFTP | Authentication is by [private key][1] only. @@ -187,8 +160,6 @@ $ man sshfs On Windows, use the [WinSCP client][c] to transfer data. The [win-sshfs client][d] provides a way to mount the cluster filesystems directly as an external disc. -More information about the shared file systems is available [here][4]. - ## Connection Restrictions Outgoing connections from cluster login nodes to the outside world are restricted to the following ports: @@ -272,8 +243,6 @@ Now, configure the applications proxy settings to `localhost:6000`. 
Use port for [1]: ../general/accessing-the-clusters/shell-access-and-data-transfer/ssh-keys.md [2]: ../general/accessing-the-clusters/shell-access-and-data-transfer/putty.md -[3]: ../anselm/storage.md#shared-filesystems -[4]: ../anselm/storage.md [5]: #port-forwarding-from-login-nodes [6]: ../general/accessing-the-clusters/graphical-user-interface/x-window-system.md [7]: ../general/accessing-the-clusters/graphical-user-interface/vnc.md diff --git a/docs.it4i/index.md b/docs.it4i/index.md index d5af25e0336b4168172a40d187d2c698a52e2636..73f902260593f1ebdd0d9d6fd1a23faea900a87d 100644 --- a/docs.it4i/index.md +++ b/docs.it4i/index.md @@ -1,6 +1,6 @@ # Documentation -Welcome to the IT4Innovations documentation. The IT4Innovations National Supercomputing Center operates the [Karolina][2], [Barbora][3], and [Salomon][1] supercomputers. The supercomputers are [available][4] to the academic community within the Czech Republic and Europe, and the industrial community worldwide. The purpose of these pages is to provide comprehensive documentation of the hardware, software, and usage of the computers. +Welcome to the IT4Innovations documentation. The IT4Innovations National Supercomputing Center operates the [Karolina][2] and [Barbora][3] supercomputers. The supercomputers are [available][4] to the academic community within the Czech Republic and Europe, and the industrial community worldwide. The purpose of these pages is to provide comprehensive documentation of the hardware, software, and usage of the computers. ## How to Read the Documentation diff --git a/docs.it4i/storage/project-storage.md b/docs.it4i/storage/project-storage.md index 95b5c7d85aec97f70c1f3bb9e7f6ddaf7813f2a9..e86a8a48935e9031e3a03e4f57348ea04353b57b 100644 --- a/docs.it4i/storage/project-storage.md +++ b/docs.it4i/storage/project-storage.md @@ -40,7 +40,6 @@ The PROJECT storage can be accessed via the following nodes: | ------------- | ----------------------------- | | Karolina | Login, Compute, Visualization | | Barbora | Login, Compute, Visualization | -| Salomon | Login | To show the path to your project's directory on the PROJECT storage, use the `it4i-get-project-dir` command: @@ -67,12 +66,8 @@ Quota Type Cluster / PID File System Space used Space limit Entr ------------- --------------- ------------- ------------ ------------- -------------- --------------- ------------------- User barbora /home 11.1 MB 25.0 GB 122 500,000 2021-08-24 07:50:09 User karolina /home 354.6 MB 25.0 GB 3,194 500,000 2021-08-24 08:20:08 -User salomon /home 407.0 MB 250.0 GB 5,522 500,000 2021-08-24 08:20:08 User barbora /scratch 256.5 GB 10.0 TB 169 10,000,000 2021-08-24 07:50:19 User karolina /scratch 52.5 GB 100.0 TB 967 20,000,000 2021-08-24 08:20:18 -User salomon /scratch 3.7 TB 100.0 TB 212,252 10,000,000 2021-08-24 08:20:41 -User salomon /scratch/temp 3.1 TB N/A 50.328 2021-08-24 08:20:54 -User salomon /scratch/work 2.8 TB N/A 207,594 2021-08-24 08:20:47 Project open-XX-XX proj1 3.9 TB 20.0 TB 212,377 5,000,000 2021-08-24 08:20:02 Project open-YY-YY proj3 9.5 MB 20.0 TB 182 5,000,000 2021-08-24 08:20:02 Project open-ZZ-ZZ proj2 844.4 GB 20.0 TB 797 5,000,000 2021-08-24 08:20:02 @@ -114,7 +109,7 @@ Snapshots are read-only. Snapshots' names have the `YYYY-MM-DD-hhmmss` format. ```console -[vop999@login1.salomon ~]# ls -al /mnt/proj3/open-XX-XX/.snapshots +[vop999@login1.karolina ~]# ls -al /mnt/proj3/open-XX-XX/.snapshots total 4 dr-xr-xr-x. 2 root root 4096 led 14 12:14 . drwxrws---. 16 vop999 open-XX-XX 4096 led 20 16:36 .. 
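Since snapshots are read-only, recovering an accidentally deleted or overwritten file is simply a copy back into the live project tree; a minimal sketch, with the snapshot name and paths purely illustrative:

```console
$ # list the read-only snapshots of the project directory, then copy the needed file back
$ ls /mnt/proj3/open-XX-XX/.snapshots
$ cp /mnt/proj3/open-XX-XX/.snapshots/2021-08-23-020000/data/results.csv /mnt/proj3/open-XX-XX/data/
```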
diff --git a/mkdocs.yml b/mkdocs.yml index c03add14a7aed9124be1a97cabba83042889f8ab..6d879f8d57f4fc094526c6b3c489830aafcddcde 100644 --- a/mkdocs.yml +++ b/mkdocs.yml @@ -85,7 +85,7 @@ nav: - Satisfaction and Feedback: general/feedback.md - PRACE: prace.md - Support: general/support.md - - e-INFRA CZ Migration: einfracz-migration.md + - Migration to e-INFRA CZ: einfracz-migration.md - Withdrawal from service: anselm-salomon-shutdown.md - PROJECT Storage Availability: project-storage-availability.md - Storage: @@ -114,16 +114,6 @@ nav: - Accessing the DGX-2: dgx2/accessing.md - Resource Allocation and Job Execution: dgx2/job_execution.md - Software deployment: dgx2/software.md - - Salomon: - - Introduction: salomon/introduction.md - - Hardware Overview: salomon/hardware-overview.md - - Compute Nodes: salomon/compute-nodes.md - - Network: - - InfiniBand Network: salomon/network.md - - IB Single-Plane Topology: salomon/ib-single-plane-topology.md - - 7D Enhanced Hypercube: salomon/7d-enhanced-hypercube.md - - Storage: salomon/storage.md - - Visualization Servers: salomon/visualization.md - Archive: - Introduction: archive/archive-intro.md - Anselm: @@ -132,6 +122,16 @@ nav: - Compute Nodes: anselm/compute-nodes.md - Storage: anselm/storage.md - Network: anselm/network.md + - Salomon: + - Introduction: salomon/introduction.md + - Hardware Overview: salomon/hardware-overview.md + - Compute Nodes: salomon/compute-nodes.md + - Network: + - InfiniBand Network: salomon/network.md + - IB Single-Plane Topology: salomon/ib-single-plane-topology.md + - 7D Enhanced Hypercube: salomon/7d-enhanced-hypercube.md + - Storage: salomon/storage.md + - Visualization Servers: salomon/visualization.md - Software: - Environment and Modules: environment-and-modules.md - Modules:
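A navigation change like the one above can be sanity-checked locally before merging; a minimal sketch, assuming a Python environment with the project's MkDocs dependencies installed:

```console
$ # fail the build on warnings such as nav entries pointing to missing pages, then preview locally
$ mkdocs build --strict
$ mkdocs serve
```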