# PROJECT Data Storage
The PROJECT data storage is the central storage for project and user data at IT4Innovations.
It is accessible from all IT4Innovations clusters and allows data to be shared among them.
The storage is intended to be used throughout the whole project's lifecycle.
## Technical Overview
The PROJECT storage consists of three equal file storages (blocks) called PROJ1, PROJ2, and PROJ3.
Each file storage runs a GPFS file system exported via the NFS protocol by three NFS servers.
The file storages provide high availability and redundancy.

| Specification | Total | Per Block |
| ----------------- | -------------------|-------------------- |
| Protocol | NFS over GPFS | |
| Total capacity | 15PB | 5PB |
| Throughput | 39GB/s | 13GB/s |
| IO Performance | 57kIOPS | 19kIOPS |
## Accessing PROJECT
All aspects of allocating, provisioning, accessing, and using the PROJECT storage are driven by the project paradigm.
Storage allocation and access to the storage are based on projects (i.e. computing resource allocations) and project membership.
A project directory (actually implemented as an independent fileset) is created for every active project.
Default limits (quotas), default file permissions, and ACLs are set.
The project directory life cycle strictly follows the project's life cycle.
The project directory is removed after the project's data expiration.
### POSIX File Access
!!!note "Mountpoints"
    PROJECT file storages are accessible at mountpoints `/mnt/proj1`, `/mnt/proj2`, and `/mnt/proj3`.

The PROJECT storage can be accessed via the following nodes:
| Cluster | Node(s) |
| ------------- | ----------------------------- |
To show the path to your project's directory on the PROJECT storage, use the `it4i-get-project-dir` command:
```console
$ it4i-get-project-dir OPEN-XX-XX
/mnt/proj3/open-XX-XX
```
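Because a project directory may live on any of the three blocks, scripts are better off resolving the path at run time than hard-coding a mountpoint. The following is a minimal sketch under that assumption; `proj_block` is a hypothetical helper name, not an IT4Innovations tool, and it relies only on the `/mnt/projN/...` path layout described above.

```shell
# Hypothetical helper: print the PROJECT block (proj1, proj2, or proj3)
# that a project path lives on, or fail if the path is not on PROJECT.
proj_block() {
    local path=$1
    case "$path" in
        /mnt/proj[123]/*)
            path=${path#/mnt/}            # drop the leading /mnt/
            printf '%s\n' "${path%%/*}"   # keep only the first path component
            ;;
        *)
            return 1
            ;;
    esac
}
```

For example, `proj_block "$(it4i-get-project-dir OPEN-XX-XX)"` would print `proj3` for the path shown above.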
### Project Quotas
The PROJECT storage enforces quotas on projects' usage (used capacity and allocated inodes).
Default quotas for capacity and number of inodes per project are set by IT4Innovations.
| Project default quota | |
| --------------------- | ------ |
| Space quota | 20TB |
| Inodes quota | 5 mil. |
You can check the current usage of the PROJECT storage (e.g. location of the project directory, used capacity, allocated inodes) by running the `it4ifsusage` command on a login node. The command lists all projects associated with the user.
```console
[vop999@login1.barbora ~]$ it4ifsusage
Quota Type      Cluster / PID     File System       Space used     Space limit     Entries used     Entries limit   Last update
-------------   ---------------   -------------   ------------   -------------   --------------   ---------------   -------------------
User            barbora           /home                11.1 MB         25.0 GB              122           500,000   2021-08-24 07:50:09
User            karolina          /home               354.6 MB         25.0 GB            3,194           500,000   2021-08-24 08:20:08
User            salomon           /home               407.0 MB        250.0 GB            5,522           500,000   2021-08-24 08:20:08
User            barbora           /scratch            256.5 GB         10.0 TB              169        10,000,000   2021-08-24 07:50:19
User            karolina          /scratch             52.5 GB        100.0 TB              967        20,000,000   2021-08-24 08:20:18
User            salomon           /scratch              3.7 TB        100.0 TB          212,252        10,000,000   2021-08-24 08:20:41
User            salomon           /scratch/temp         3.1 TB             N/A           50.328                     2021-08-24 08:20:54
User            salomon           /scratch/work         2.8 TB             N/A          207,594                     2021-08-24 08:20:47
Project         open-XX-XX        proj1                 3.9 TB         20.0 TB          212,377         5,000,000   2021-08-24 08:20:02
Project         open-YY-YY        proj3                 9.5 MB         20.0 TB              182         5,000,000   2021-08-24 08:20:02
Project         open-ZZ-ZZ        proj2               844.4 GB         20.0 TB              797         5,000,000   2021-08-24 08:20:02
```
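To keep an eye on the inode quota, the `Project` rows of this output can be post-processed. The sketch below is only an illustration: `project_inode_usage` is a hypothetical helper, and it assumes the column layout of the sample output above (whitespace-separated fields, thousands separated by commas), which may differ between versions of `it4ifsusage`.

```shell
# Hypothetical helper: read it4ifsusage output on stdin and print, for each
# Project row, the project ID and the percentage of the inode quota in use.
project_inode_usage() {
    # LC_ALL=C keeps "." as the decimal separator regardless of the user's locale.
    LC_ALL=C awk '
        $1 == "Project" && NF >= 9 {
            used = $8; limit = $9      # "Entries used" and "Entries limit"
            gsub(/,/, "", used)        # strip thousands separators
            gsub(/,/, "", limit)
            if (limit + 0 > 0)
                printf "%s %.1f%%\n", $2, 100 * used / limit
        }'
}
```

Run as `it4ifsusage | project_inode_usage`. For the `open-XX-XX` row in the sample output, this prints `open-XX-XX 4.2%`.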
The information can also be found in IT4Innovations' [SCS information system][b].
!!!note
    At this time, only PIs can see the quotas of their respective projects in IT4Innovations' SCS information system.
    We are working on making this information available to all users assigned to their projects.
Preferably, request additional storage space in advance in your application for computational resources.
Alternatively, if the project is already active, contact [IT4I support][a].
### ACL and File Permissions
Access to a project directory and the files it contains is restricted by Unix file permissions and file access control lists (ACLs).
Default file permissions and ACLs are set by IT4Innovations during project directory provisioning.
## Backup and Safety
!!!important "Data Backup"
    Data on the PROJECT storage is **not** backed up.

The PROJECT storage utilizes a fully redundant design, redundant devices, highly available services, data redundancy, and snapshots. For increased data protection, disks in each disk array are configured as Distributed RAID6 with two hot-spare disks, meaning the disk array can recover full redundancy after two simultaneous disk failures.
However, the storage does not provide data backup, so we strongly recommend using the [CESNET storage][1] for making independent copies of your data.
### Snapshots
The PROJECT storage provides snapshot functionality. A snapshot represents the state of a filesystem at a particular point in time. Snapshots are created for all projects at the fileset (i.e. project directory) level.
Snapshots are created every day; snapshots older than seven days are deleted.
Users can access files in snapshots directly in a special subdirectory of each project directory named `.snapshots`.
Snapshots are read-only.
Snapshots' names have the `YYYY-MM-DD-hhmmss` format.
```console
[vop999@login1.salomon ~]$ ls -al /mnt/proj3/open-XX-XX/.snapshots
total 4
dr-xr-xr-x. 2 root root 4096 Jan 14 12:14 .
drwxrws---. 16 vop999 open-XX-XX 4096 Jan 20 16:36 ..
drwxrws---. 16 vop999 open-XX-XX 4096 Jan 20 16:36 2021-03-01-022441
drwxrws---. 16 vop999 open-XX-XX 4096 Jan 20 16:36 2021-03-02-022544
drwxrws---. 16 vop999 open-XX-XX 4096 Jan 20 16:36 2021-03-03-022949
drwxrws---. 16 vop999 open-XX-XX 4096 Jan 20 16:36 2021-03-04-023454
drwxrws---. 16 vop999 open-XX-XX 4096 Jan 20 16:36 2021-03-05-024152
drwxrws---. 16 vop999 open-XX-XX 4096 Jan 20 16:36 2021-03-06-020412
drwxrws---. 16 vop999 open-XX-XX 4096 Jan 20 16:36 2021-03-07-021446
```
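Since snapshots are read-only, restoring a file means copying it out of a snapshot back into the project directory. The following is a minimal sketch of that idea; `latest_snapshot` and `restore_from_snapshot` are hypothetical helper names, and the only assumption is the `.snapshots` layout and `YYYY-MM-DD-hhmmss` naming described above, which makes lexical order match chronological order.

```shell
# Hypothetical helper: print the newest snapshot directory of a project directory.
# Glob results are sorted, so the last match is the most recent snapshot.
latest_snapshot() {
    local project_dir=$1 newest="" snap
    for snap in "$project_dir"/.snapshots/*/; do
        [ -d "$snap" ] || continue    # skip the unexpanded pattern when nothing matches
        newest=${snap%/}
    done
    [ -n "$newest" ] && printf '%s\n' "$newest"
}

# Hypothetical helper: copy one file (path relative to the project directory)
# from the newest snapshot to a destination of your choice.
restore_from_snapshot() {
    local project_dir=$1 rel_path=$2 dest=$3 snap
    snap=$(latest_snapshot "$project_dir") || return 1
    cp -p -- "$snap/$rel_path" "$dest"
}
```

A call such as `restore_from_snapshot /mnt/proj3/open-XX-XX results/data.csv ./data.csv.restored` would copy the most recent snapshot's version of the file; the file name here is only a placeholder. Copying to a new name first avoids overwriting the current file by accident.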
<!--- (HA data replication?) -->
<!--- (balancing in case of overload (data migration?)) -->
## Computing on PROJECT
!!!important "I/O Intensive Jobs"
    Stage files for intensive I/O calculations onto the SCRATCH storage.

The PROJECT storage is not primarily intended for computing, and we strongly recommend against using it directly for computing in the majority of cases.
However, the PROJECT storage is accessible from compute nodes and can be used for computing jobs with low I/O demands when copying data to another storage for computing is not feasible or efficient.
Take care not to overload the storage, as this degrades performance for other users of the PROJECT storage and may make it unavailable.
For maximum performance, always copy the files of I/O-intensive jobs onto the SCRATCH storage.
Copy the files to SCRATCH from the login nodes before submitting the job.
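The staging step can be wrapped in a small function run on a login node before job submission. This is only a sketch: `stage_to_scratch` is a hypothetical helper name, and the destination on SCRATCH is whatever directory you use there; no particular SCRATCH path is assumed.

```shell
# Hypothetical helper: copy the contents of a job's input directory on PROJECT
# into a directory on SCRATCH, preserving timestamps and permissions.
stage_to_scratch() {
    local src=$1 dest=$2
    if [ ! -d "$src" ]; then
        echo "source directory not found: $src" >&2
        return 1
    fi
    # "$src"/. copies the directory contents, including hidden files.
    mkdir -p -- "$dest" && cp -R -p -- "$src"/. "$dest"/
}
```

For instance, `stage_to_scratch "$(it4i-get-project-dir OPEN-XX-XX)/input" "$SCRATCH_DIR/input"`, where `SCRATCH_DIR` is a placeholder variable you set to your own directory on SCRATCH.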
<!--- See also: data storage policy on filesystems (link?) -->
<!--- ## Technical Specification -->
<!--- For a detailed technical specification, see the Technical Specification section. -->
## Summary
| PROJECT Storage | |
| -------------------- | ------------------- |
| Mountpoint | /mnt/proj{1,2,3} |
| Capacity | 15PB |
| Throughput | 39GB/s |
| IO Performance | 57kIOPS |
| Default project space quota | 20TB |
| Default project inodes quota | 5 mil. |
[1]: ../storage/cesnet-storage.md
[a]: mailto:support@it4i.cz
[b]: https://scs.it4i.cz/projects