diff --git a/docs.it4i/storage/proj4-storage.md b/docs.it4i/storage/proj4-storage.md
index cf7502e4c869b142e1c0ad1b5df60c2e21fcf2ca..1f08e9c821bbae5b5460db56d6bdfa2e1236b1a5 100644
--- a/docs.it4i/storage/proj4-storage.md
+++ b/docs.it4i/storage/proj4-storage.md
@@ -18,12 +18,12 @@ that is well-suited for a wide range of applications and use cases.

 ## Accessing Proj4

-The Proj4 object storage is accessible from all IT4Innovations clusters
+The Proj4 object storage is accessible from all IT4Innovations clusters' login nodes
 as well as from the outside. Additionally, it allows to share data across clusters, etc.

-User has to be part of project, which is allowed to use S3 storage.
-After that you will obtain role and credentials for using s3 storage.
+You have to be a member of a project that is allowed to use the S3 storage. If you have not received your S3 credentials (access key and secret key) after your project was created, please send a request to support@it4i.cz asking for "S3 PROJECT ACCESS", stating your IT4I login and the name of your project (where the project name is in the format OPEN-XX-YY or similar).
+After that, an active role on the S3 storage will be created and you will receive the credentials for using the S3 storage via email.

 ## How to Configure S3 Client

@@ -40,7 +40,7 @@ $ s3cmd --configure
 Default Region: US
 S3 Endpoint: 195.113.250.1:8080
 DNS-style bucket+hostname:port template for accessing a bucket: 195.113.250.1:8080
-Encryption password: random
+Encryption password: RANDOM
 Path to GPG program: /usr/bin/gpg
 Use HTTPS protocol: False
 HTTP Proxy server name:
@@ -49,17 +49,18 @@ $ s3cmd --configure
 .
 .
-Configuration saved to '/home/dvo0012/.s3cfg'
+Configuration saved to '/home/IT4USER/.s3cfg'
 .
 .
 ```
+Please note that the Encryption password should be defined by you instead of using the value "RANDOM".

-now you have to make some bucket for you data with **your_policy** (for example ABC - if ABC is your project).
-If you make a bucket without policy, we will not able to manage your data expiration of project - so please use the policy.
+Now you have to create a bucket for your data with **your_policy** referencing your project (e.g. OPEN-XX-YY, if OPEN-XX-YY is your active and eligible project).
+If you make a bucket without a policy, we will not be able to manage your project's data expiration and you might lose the data before the end of your project - so please use the policy.

 ```console
-~ s3cmd --add-header=X-Storage-Policy:ABC mb s3://test-bucket
+~ s3cmd --add-header=X-Storage-Policy:OPEN-XX-YY mb s3://test-bucket

 ~ $ s3cmd put test.sh s3://test-bucket/
 upload: 'test.sh' -> 's3://test-bucket/test.sh' [1 of 1]
@@ -108,10 +109,42 @@ Permission can be set only by the owner of the bucket.
 URL: [http://195.113.250.1:8080/test-bucket/test1.log](http://195.113.250.1:8080/test-bucket/test1.log)
 x-amz-meta-s3cmd-attrs: atime:1696588450/ctime:1696588452/gid:1001/gname:******/md5:******/mode:33204/mtime:1696588452/uid:******/uname:******
 ```
+## Access to Multiple Projects
+If you need to access data of multiple projects, repeat the above step and ask IT4I support for new credentials for each additional project. If you did not receive the credentials together with the project activation, please send a request to support@it4i.cz.
+
+As the first step, rename your current S3 configuration so that it uniquely identifies your current project, or organize it on your local storage accordingly.
+
+```console
+$ mv /home/IT4USER/.s3cfg /home/IT4USER/.s3cfg-OPEN-XX-YY
+```
+Then create a new S3 configuration for the additional project (e.g. OPEN-AA-BB).
+
+```console
+$ s3cmd --configure
+```
+Rename or organize your newly created configuration as well.
+```console
+$ mv /home/IT4USER/.s3cfg /home/IT4USER/.s3cfg-OPEN-AA-BB
+```
+When accessing the data of a different project, specify the corresponding configuration file in your S3 commands.
+
+```console
+~ s3cmd -c /home/IT4USER/.s3cfg-OPEN-AA-BB --add-header=X-Storage-Policy:OPEN-AA-BB mb s3://test-bucket
+
+~ $ s3cmd -c /home/IT4USER/.s3cfg-OPEN-AA-BB put test.sh s3://test-bucket/
+upload: 'test.sh' -> 's3://test-bucket/test.sh' [1 of 1]
+1239 of 1239 100% in 0s 19.59 kB/s done
+
+~ $ s3cmd -c /home/IT4USER/.s3cfg-OPEN-AA-BB ls
+2023-10-17 13:00 s3://test-bucket
+
+~ $ s3cmd -c /home/IT4USER/.s3cfg-OPEN-AA-BB ls s3://test-bucket
+2023-10-17 13:09 1239 s3://test-bucket/test.sh
+```

 ## Bugs & Features

-By default, the S3cmd client uses the so-called "multipart upload",
+By default, the s3cmd client uses the so-called "multipart upload",
 which means that it splits the uploaded file into "chunks" with a default size of 15 MB.
 However, this upload method has major implications for the data capacity of the filesystem/fileset when overwriting existing files.
 When overwriting an existing file in a "multipart" mode, the capacity is duplicated
@@ -125,4 +158,4 @@ upload: '/install/test1.log' -> 's3://test-bucket1/test1.log' [1 of 1]
 1024000000 of 1024000000 100% in 9s 99.90 MB/s done
 ```

-this method is not recommended for large files, because it is not as fast and reliable as multipart upload, but it is the only way how to overwrite files without duplicating capacity.
+This method is not recommended for large files, because it is not as fast and reliable as multipart upload, but it is the only way to overwrite files without duplicating capacity.
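The per-project configuration pattern added in the "Access to Multiple Projects" section can be wrapped in a small helper. The sketch below is illustrative only: it builds and prints the s3cmd command line for a given project instead of executing it, so the flags can be inspected first. The `s3_for` name and the `~/.s3cfg-<PROJECT>` layout are assumptions from this guide; only `-c`/`--config` is a real s3cmd option.

```shell
#!/bin/sh
# Illustrative helper (not part of s3cmd): build and PRINT an s3cmd
# command that uses the per-project config file ~/.s3cfg-<PROJECT>.
# Only the -c/--config flag is a real s3cmd option; the rest is a sketch.
CONFIG_DIR="$HOME"

s3_for() {
    project="$1"; shift
    # Dry run: echo the command instead of executing it.
    echo s3cmd -c "$CONFIG_DIR/.s3cfg-$project" "$@"
}

# Dry-run examples for two projects:
s3_for OPEN-AA-BB ls s3://test-bucket
s3_for OPEN-XX-YY put test.sh s3://test-bucket/
```

Dropping the `echo` in front of `s3cmd` would execute the command for real.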
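To make the multipart capacity discussion concrete: with the default 15 MB chunk size stated above, the 1024000000-byte file from the example is split into 66 parts, each of which is uploaded again when the file is overwritten. The chunk count follows from plain ceiling division; the script below is just a sketch of that arithmetic (the 15 MB default and the file size come from the text, the script itself is illustrative):

```shell
#!/bin/sh
# Number of chunks a multipart upload creates for the example file:
# ceil(file_bytes / chunk_bytes), with the 15 MB default chunk size.
chunk_bytes=$((15 * 1024 * 1024))   # 15728640 bytes
file_bytes=1024000000               # file size from the example above
chunks=$(( (file_bytes + chunk_bytes - 1) / chunk_bytes ))
echo "$chunks chunks"               # prints "66 chunks"
```

s3cmd also documents a `--multipart-chunk-size-mb` option for changing the chunk size and `--disable-multipart` for turning multipart off entirely, which is the overwrite-friendly method this section describes.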