[Tutorial] Access HPC files from an S3 gateway

This isn’t strictly related to HPC, but it may be helpful anyway.

If you want to access your files on the clusters from an application, it may be convenient to reach them through an Amazon S3 compatible gateway.

This quick tutorial explains how to do so with minio. It is a commercial solution, but there is a community version as well, and it is distributed as a Docker image.

On the cluster we don’t provide Docker but an alternative named Singularity. The good news is that it is easy to convert images from Docker to Singularity.

When you launch the minio server, an associated web console is launched as well. To access it, you need credentials, which you should define as environment variables before launching the server (the password must be at least 8 characters long):

[sagon@login2 ~] $ export MINIO_ROOT_USER=myroot
[sagon@login2 ~] $ export MINIO_ROOT_PASSWORD=veryhardtoguess
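
Since the password has a minimum length, it can be handy to check it before starting the server. The following is only a minimal sketch: the function name check_minio_password is made up for this example and is not part of minio.

```shell
# Hypothetical helper (not part of minio): succeeds when the given
# password has at least 8 characters, as required for MINIO_ROOT_PASSWORD.
check_minio_password() {
    # ${#1} expands to the number of characters in the first argument
    if [ "${#1}" -ge 8 ]; then
        return 0
    fi
    echo "MINIO_ROOT_PASSWORD must be at least 8 characters long" >&2
    return 1
}
```

You could then call it as check_minio_password "$MINIO_ROOT_PASSWORD" right after the two export lines and only launch the server if it succeeds.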

Then load the Singularity module:

[sagon@login2 ~] $ ml GCC/9.3.0 Singularity/3.7.3-Go-1.14

When launching minio, you need to specify the data location from which the files will be served and the listening ports. The S3 port defaults to 9000, so only the console port has to be given explicitly.

Launch the minio server, listening on port 9000 for S3 access, and port 9001 for web console access, serving the directory /home/sagon/minio:

[sagon@login2 ~] $ singularity run docker://minio/minio server /home/sagon/minio/ --console-address ":9001"
INFO:    Using cached SIF image
RootUser: myroot
RootPass: veryhardtoguess 

RootUser: myroot
RootPass: veryhardtoguess 

Command-line: https://docs.min.io/docs/minio-client-quickstart-guide
   $ mc alias set myminio myroot veryhardtoguess 

Documentation: https://docs.min.io

Even though the public IP of login2 is listed as the API entrypoint, it isn’t possible to access the server through it, because we are using a firewall, as announced during the previous maintenance.

Once the server is launched, open a web browser (using x2go or an ssh tunnel, for example) and connect to the web interface with the credentials you defined previously. See this post for a tunnel or socks example.
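
As an illustration of the tunnel option, here is a small sketch that prints an ssh command forwarding both minio ports to your local machine. The function name minio_tunnel_cmd and the host name login2.example.org are placeholders invented for this example, so replace the host with the real address of the login node. It also assumes the server is reachable on localhost on the login node where you started it.

```shell
# Hypothetical helper: print the ssh command that forwards the minio ports.
# $1: user@host of the login node running minio (placeholder in the example)
# $2: S3 port (default 9000), $3: web console port (default 9001)
minio_tunnel_cmd() {
    # -N: do not run a remote command, only forward ports
    # -L local_port:localhost:remote_port: forward a local port to the login node
    printf 'ssh -N -L %s:localhost:%s -L %s:localhost:%s %s\n' \
        "${2:-9000}" "${2:-9000}" "${3:-9001}" "${3:-9001}" "$1"
}
```

For example, minio_tunnel_cmd sagon@login2.example.org prints a command that you can run on your own machine. While it is running, the web console should be reachable at http://localhost:9001 and the S3 endpoint at http://localhost:9000.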

On this interface, you should create a bucket (this is like a namespace or a top-level directory that will hold the files).

You must also create a new user that will be able to access the files using S3.
Remember the access key ID and the secret access key. You can specify one or more policies; in this case I’m giving read/write rights to the user.

Once done, you may try to access your files through an S3 browser such as this one.

In the S3 client, you need to create an account using the access key ID and secret access key of the user you created previously.

You can now access your files with the S3 client (again, from inside the cluster or through a tunnel or proxy).
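
If you prefer the command line to a graphical S3 browser, the mc client mentioned in the server output can be used as well. The wrappers below are only a sketch under a few assumptions: mc is installed on the machine you run them from, the S3 endpoint is reachable at http://localhost:9000 (on the login node itself or through a tunnel), and the function names and the bucket name mybucket are made up for this example.

```shell
# Hypothetical wrappers around the mc client; "myminio" is the alias name
# suggested in the minio server output.

# Register the endpoint: $1 = access key ID, $2 = secret access key
minio_register() {
    mc alias set myminio http://localhost:9000 "$1" "$2"
}

# List the content of a bucket: $1 = bucket name
minio_list() {
    mc ls "myminio/$1"
}

# Upload a local file: $1 = local path, $2 = bucket name
minio_upload() {
    mc cp "$1" "myminio/$2/"
}
```

A typical session would be minio_register followed by the keys of the user you created, then minio_list mybucket to check that the files are visible.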

Thanks for reading, feedback welcome!