SSH – query Redshift over multiple SSH hops

socks ssh ssh-tunnel

I'd like to query an AWS Redshift cluster from an IDE running on my laptop (IntelliJ). This cluster is only directly accessible from an EC2 instance in our VPC.

That EC2 instance isn't directly accessible from my laptop either: I have to ssh to another server in our datacenter, and from there ssh to the EC2 instance in our VPC, in order to run queries on the Redshift cluster via the psql client. Other users and cron'd shell scripts also execute commands on this cluster from the same EC2 instance.

I'd like to set up an SSH tunnel from my laptop, through the intermediate server in our datacenter, to the EC2 instance, and ultimately to the Redshift cluster, without impacting other users/scripts.

I tried this:

ssh -fNL 5439:localhost:22 host_in_datacenter ssh -fNL 22:localhost:22 ec2_instance ssh -fNL 22:localhost:5439 reshift_host

When I then try to connect via a client, it fails with:

[08P01] Protocol error.  Session setup failed.

Most of the SSH chaining examples to access remote databases I've seen have just one hop to the host running the database, and the final hop ends up on the database host itself.

Is it possible to create an SSH tunnel, without impacting other users, from my laptop to the Redshift cluster? If so, can you see what I'm doing wrong?

Best Answer

It seems as if you may not understand the SSH forwarding syntax. Please don't take this personally, but this is a prime example of why copy/paste software development and systems administration is so dangerous. People find examples online of how to do things without bothering to do a bit of research into what is actually happening.

Let's break down a part of your command:

ssh -fNL 5439:localhost:22 host_in_datacenter

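This means: listen on port 5439 on your laptop and forward whatever arrives there to port 22 on the server you're ssh'ing to (the "localhost" in the middle is resolved on that server, not on your laptop). This doesn't make any sense, and won't ever work, because it's an SSH server listening on port 22, and it won't know what to do with the non-SSH traffic your database client sends.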
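
To make the three parts of a -L argument concrete, here is a small sketch that just splits a forward spec and prints where the traffic would go. The helper name describe_forward is made up for illustration; it opens no connection.

```shell
#!/bin/sh
# Hypothetical helper: explain an `ssh -L` spec of the form
# LISTEN_PORT:TARGET_HOST:TARGET_PORT when used with a given SSH server.
describe_forward() {
    spec=$1                      # e.g. 5439:localhost:22
    server=$2                    # the host you ssh to
    listen_port=${spec%%:*}      # text before the first colon
    rest=${spec#*:}              # text after the first colon
    target_host=${rest%%:*}
    target_port=${rest#*:}
    # The target is looked up and contacted by $server, not by the laptop.
    echo "laptop:${listen_port} -> (via ${server}) -> ${target_host}:${target_port}"
}

describe_forward 5439:localhost:22 host_in_datacenter
```

Run against the first hop of the command in the question, this prints laptop:5439 -> (via host_in_datacenter) -> localhost:22, which shows the database traffic landing on the datacenter host's own SSH port.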

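What you want to do is pick an arbitrary unused high port on the intermediate host and use that for passing traffic through your chain, with the last forward pointing at the port Redshift actually listens on (5439 here). Note that -N belongs only on the inner ssh: on the outer one it tells ssh not to run a remote command, so the second ssh would never be started. This also assumes the datacenter host can log in to the EC2 instance without prompting. Something like this:

ssh -f -L 5439:localhost:12345 host_in_datacenter ssh -N -L 12345:redshift_host:5439 ec2_instance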
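
If the OpenSSH client on your laptop is 7.3 or newer (an assumption; check with ssh -V), the -J (ProxyJump) option may be a tidier way to build the same path. The sketch below only prints the command so you can review it first. The hostnames are the placeholders from the question, the function name tunnel_cmd is made up, and it assumes both servers permit TCP forwarding.

```shell
#!/bin/sh
# Placeholder hostnames from the question; adjust to your environment.
JUMP_HOST=host_in_datacenter
EC2_HOST=ec2_instance
REDSHIFT_HOST=redshift_host
LOCAL_PORT=5439        # port your IDE will connect to on localhost
REDSHIFT_PORT=5439     # port the cluster listens on (Redshift's default)

# Hypothetical helper: print the tunnel command instead of running it.
# -J hops through the datacenter host to reach the EC2 instance;
# -L then has the EC2 instance open the connection to Redshift.
tunnel_cmd() {
    echo "ssh -f -N -L ${LOCAL_PORT}:${REDSHIFT_HOST}:${REDSHIFT_PORT} -J ${JUMP_HOST} ${EC2_HOST}"
}

tunnel_cmd
```

Once the printed command is running, pointing the client at localhost:5439 should reach the cluster. With this form the only listening port is on your laptop, so nothing extra is opened on the shared hosts that other users and cron jobs rely on.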

Honestly, though, this is pretty kludgy. If I were you, I'd either collapse your network so you only need to go through a single host to get to Redshift or set up a VPN in AWS so you can connect directly from your workstation.