[tutorial] ssh tunneling and socks proxy

From time to time, it may be useful to be able to connect to a compute node from outside the cluster.

Example use case: you are running a Jupyter Notebook on a compute node and you need to reach this instance from your desktop's web browser.

In this case, the solutions are:

  1. create an ssh tunnel using the login node as “gateway”
  2. create a socks proxy and use it from your desktop’s browser
  3. other options not described here

ssh tunnel

Example configuration to be added to your .ssh/config (on your laptop/desktop, not on the cluster) to connect directly to a compute node from outside the cluster:

Host baobab
  HostName login2.baobab.hpc.unige.ch
  User your_user_name

Host node001
  HostName node001
  User your_user_name
  ProxyJump baobab
  LocalForward 1234 localhost:1234

You can now connect to the host node001 using ssh.
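Before connecting for real, you can sanity-check such a configuration without opening any connection: `ssh -G` prints the options the client would apply for a given host. A small sketch, using a copy of the config above written to a temporary file:

```shell
# Write the example config to a temporary file (hostnames/usernames as above).
cat > /tmp/demo_ssh_config <<'EOF'
Host baobab
  HostName login2.baobab.hpc.unige.ch
  User your_user_name

Host node001
  HostName node001
  User your_user_name
  ProxyJump baobab
  LocalForward 1234 localhost:1234
EOF

# `ssh -G` resolves the configuration for a host and prints it without
# connecting, so you can verify the jump host and the forwarded port.
ssh -G -F /tmp/demo_ssh_config node001 | grep -E 'proxyjump|localforward'
```

If the ProxyJump and LocalForward lines show up as expected, the real connection should go through the login node and set up the tunnel.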

ysagon@lyann-dell-portable:~$ ssh node001
Last login: Wed Nov 17 15:08:22 2021 from login2.cluster
Installed: Fri Sep 10 09:05:36 CEST 2021

This will connect to the node through the Baobab login node thanks to the ProxyJump option. The LocalForward option forwards connections to port 1234 on your PC (where you initiated the ssh connection) to port 1234 on the compute node.
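Conceptually, what LocalForward does can be sketched as a tiny TCP relay: listen on a local port and copy bytes to a destination. The toy Python version below (single client, no encryption, purely illustrative, not something you would use in place of ssh) shows the mechanism:

```python
# Toy illustration of "LocalForward 1234 localhost:1234": accept a connection
# on a local port and relay the bytes to a destination address. ssh does this
# over an encrypted channel; this sketch is plain TCP and handles one client.
import socket
import threading

def relay(src: socket.socket, dst: socket.socket) -> None:
    """Copy bytes from src to dst until src stops sending."""
    try:
        while True:
            data = src.recv(4096)
            if not data:
                break
            dst.sendall(data)
    except OSError:
        pass
    finally:
        # Half-close the write side so the peer sees end-of-stream.
        try:
            dst.shutdown(socket.SHUT_WR)
        except OSError:
            pass

def forward_once(listen_port: int, dest_host: str, dest_port: int) -> None:
    """Accept one client on listen_port and relay it to (dest_host, dest_port)."""
    with socket.socket() as server:
        server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        server.bind(("127.0.0.1", listen_port))
        server.listen(1)
        client, _ = server.accept()
        with client, socket.create_connection((dest_host, dest_port)) as upstream:
            # Relay in both directions until both sides are done.
            t = threading.Thread(target=relay, args=(upstream, client))
            t.start()
            relay(client, upstream)
            t.join()
```

The real tunnel additionally encrypts the traffic and multiplexes it over the existing ssh session, but the listen-accept-relay pattern is the same.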

From your browser, you can then open a URL like this one: http://localhost:1234.

This method works fine and is quite secure, but it is not very flexible, as you'll have to update your ssh tunnel configuration according to the compute node where your Jupyter Notebook instance is running.

socks proxy

Another option is to connect to the login node using ssh and use the DynamicForward option.

Host baobab
  HostName login2.baobab.hpc.unige.ch
  User your_user_name
  DynamicForward 5000

When you connect to this host using ssh, a SOCKS proxy is created on local port 5000.

You can then configure your browser to use this proxy. Example for Firefox:
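If the screenshot does not load, the corresponding settings live under Firefox's Settings → Network Settings → Manual proxy configuration (menu names may vary slightly between versions):

```text
SOCKS Host: localhost    Port: 5000
SOCKS v5:   selected
[x] Proxy DNS when using SOCKS v5
```

From a terminal, you can test the same proxy with curl, e.g. `curl --socks5-hostname localhost:5000 http://node001:1234` (substitute your own node and port).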

You can then enter the final destination directly in the URL bar: http://node001:1234. The advantage is that you don't need to update the ssh configuration if you connect to another node.

ref: How to chain port forwarding to cluster node?

edit: correct ssh config.


Hello Yann.

So the steps to run a Jupyter notebook on a partition on Yggdrasil should be as follows (just correct me if I do something wrong or unnecessary):

Write in your .ssh/config:

Host ygg
  HostName login1.yggdrasil.hpc.unige.ch
  User ferrigno

Host cpu004
  HostName cpu004
  User ferrigno
  ProxyJump ygg
  LocalForward 1234 localhost:1234

In one terminal window
ssh ygg
salloc -n1 -c2 --partition=public-longrun-cpu --time=48:00:00
Verify that node cpu004 is indeed the one in use, and if the job landed on another node, add a corresponding entry to the ssh config.

In another terminal window
ssh cpu004
and
jupyter notebook --no-browser --port=1234

In your browser
http://127.0.0.1:1234/?token=11

I have not managed to use the socks5 proxy, though.

Hi,

.ssh/config, yes, but not on the cluster: on YOUR desktop/laptop.

The easiest way is to proceed as explained in this post to launch Jupyter: [tutorial] Jupyter notebook
There is no need to first use salloc and connect to the node using ssh.

Check the sbatch script line 19 in the Jupyter notebook tutorial.
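For reference, a minimal job script in the spirit of that tutorial might look like the sketch below; the partition, module name, and port are placeholders to adapt to your own setup:

```shell
#!/bin/sh
#SBATCH --job-name=jupyter
#SBATCH --partition=public-cpu   # placeholder partition
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=2
#SBATCH --time=08:00:00

# Load whatever environment provides jupyter (placeholder module name).
module load Anaconda3

# Bind to the node's hostname so the notebook is reachable through the tunnel.
jupyter notebook --no-browser --port=1234 --ip="$(hostname)"
```

Submit it with sbatch and read the URL (including the token) from the Slurm output file.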

Thanks for sharing! Interesting. I’d been taught a different method. So far I’ve been using Jupyter notebooks on my past University’s cluster and Yggdrasil this way:

1 -ssh connect to Yggdrasil and open tmux session
2 - salloc on public-longrun-cpu/cpu004 (1 task + 2 cpus-per-task)
3 - Launch jupyter with: jupyter notebook --no-browser --port=8888 --ip=0.0.0.0
4 - Open second terminal tab and ssh connect to Yggdrasil with option -L for port forwarding: ssh username@login1.yggdrasil.hpc.unige.ch -L8888:cpu004:8888
5 - when I’m done with my work, exit the salloc job in the tmux session.

With the methods you described above, how many of the node's resources, especially memory, are made accessible to the Jupyter Notebook?

Hi, thanks for the feedback.

The method you are describing is basically the same as the one I'm using in the first part of my tutorial: [tutorial] ssh tunneling and socks proxy. Why do you need a tmux session?

The method I'm showing is only for creating an ssh tunnel or accessing a web service from outside the cluster. The resources you'll get depend on what you launch and how you launch it (salloc, srun, sbatch).

Ah, you're right, it's the same in the end; I didn't read carefully, sorry.
The tmux session is because when I use the FortiClient VPN, the connection can be lost from time to time, disconnecting the ssh sessions. If that happens and I don't have the salloc + jupyter running in a tmux session, I won't be able to access the standard output of the running notebook / salloc job again.

Hi,

I don’t know if you are aware that there is no need to use the VPN to access the cluster.

Instead of using salloc, I suggest launching your Jupyter instance using sbatch as in my other tutorial [tutorial] Jupyter notebook; doing so prevents your job from being killed if your connection closes.

Hi Yann,

thanks for the heads up!
For those curious this works perfectly within VS Code for remote development and debugging on a worker node.
I know this has come up as a question from other users (adding @Debajyoti.Sengupta).

Cheers,
Johnny

Hello Yann,

I’m trying to use this method to connect directly into an interactive node in yggdrasil (cpu003), but I’m getting permission denied:

stefx@stefx-dell:~$ ssh cpu003 
Password: 
franchel@cpu003's password: 
Permission denied, please try again.

Following the instructions, I’ve modified locally my .ssh/config file in this way:

Host yggdrasil
     HostName login1.yggdrasil.hpc.unige.ch
     User franchel

Host cpu003
     HostName cpu003 
     User franchel
     ProxyJump yggdrasil
     LocalForward 1234 localhost:1234

I’ve also tried substituting HostName cpu003 with HostName cpu003.yggdrasil, but neither works.

Am I missing something?
Is ssh tunnel supposed to work also for yggdrasil?

Hi,

you need to have a running job on cpu003 in order to connect to it. Is that the case?
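If you do not have a job there yet, an interactive allocation can be requested with salloc; Slurm's -w/--nodelist flag pins it to a specific node, assuming you have access to that node (partition and time below are placeholders):

```shell
salloc --partition=public-cpu --time=01:00:00 -w cpu003
```

Once the allocation is granted, the ssh connection to cpu003 should be accepted.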

Indeed I wasn’t running any job… Thanks!

Still, if I launch a Jupyter notebook job, I haven't managed to connect with the ssh tunnel, only by using the x2go client. But I guess it is fine to run the notebook there, right?

You mean, you are running jupyter notebook on a compute node or on the login node? On the login node, it isn’t fine. But you are allowed to launch firefox on the login node inside x2go to connect to jupyter notebook.

Sorry, I used bad phrasing. I meant running the notebook job on a compute node and using it from a login node inside x2go. Thanks again for the clarification!

So yes, this is fine:)

A post was split to a new topic: Unable to connect to a node

Hello,

I have been trying to get this to work for JupyterLab on baobab because x2go still doesn’t work for me there.

Here are the steps:

on Baobab:

(torch) (baobab)-[lastufka@login2 ~]$ sbatch launchJupyterLab.sh
Submitted batch job 6045638
(torch) (baobab)-[lastufka@login2 ~]$ squeue --job 6045638
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
6045638 public-cp launchJu lastufka R 0:37 1 cpu239
(torch) (baobab)-[lastufka@login2 ~]$ grep -A1 'running at' slurm-6045638.out
[I 2023-12-01 09:18:10.863 ServerApp] Jupyter Server 1.21.0 is running at:
[I 2023-12-01 09:18:10.863 ServerApp] http://cpu239:1234/lab?token=xxxx

On local machine:
(base):~$ more .ssh/config
Host baobab
HostName login2.baobab.hpc.unige.ch
User lastufka

Host cpu239
HostName cpu239
User lastufka
ProxyJump baobab
LocalForward 1234 localhost:1234

Attempt to ssh to cpu239:
(base) glados@DESKTOP-39KVP62:~$ ssh cpu239
(lastufka@login2.baobab.hpc.unige.ch) Password:
lastufka@cpu239’s password:
Permission denied, please try again.

I am surprised to be asked for a password for cpu239, as I thought entering it for the login node would be enough. Also I don’t have a password for the node, and my ISIS password is denied.

Is there something I am missing in the setup? It seems to not be a problem for other users.

Also, is there a possibility to request the same cpu in launchJupyterLab so that I don’t have to update .ssh/config every time?

Thanks,
~ Erica

Hi @Erica.Lastufka

Are you connecting with ssh keys and ssh agent?

An important thing is to remove in $HOME/.ssh/ any private key and public key.

See here for more details:

This is not the way to go. Instead modify your .ssh/config file as indicated in the previous link.

You may be interested in using JupyterLab from our new openondemand server? No need for a tunnel at all.

An important thing is to remove in $HOME/.ssh/ any private key and public key.

Unfortunately I can’t do this since I have to access CSCS now and then… I’ll try openondemand, it sounds perfect!

In this case, just change the name of your private key, create an ssh config file at .ssh/config, and specify the new filename for your ssh private key with the IdentityFile parameter; this should do the trick.
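A sketch of such a config (host name and key filename are hypothetical):

```text
Host cscs
  HostName ela.cscs.ch
  User your_user_name
  IdentityFile ~/.ssh/id_ed25519_cscs
  IdentitiesOnly yes
```

With IdentitiesOnly set, ssh offers only the key named in IdentityFile for that host, so the renamed key stays out of connections to Baobab.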