Linux – Managing cluster of linux computers behind firewalls

Tags: ansible, apt, devops, linux, linux-networking

My company's product is essentially a Linux box (Ubuntu) sitting in somebody else's network running our software. Up to now we have had fewer than 25 boxes in the wild and used TeamViewer to manage them.

We're now about to ship 1000 of these boxes and TeamViewer is no longer an option. My job is to figure out a way of accessing these boxes and updating the software on them. This solution should be able to punch through firewalls and what have you.

I've considered:

1. Home grown solution (e.g. a Linux service) that establishes an SSH reverse tunnel to a server in the cloud, and another service in the cloud that keeps track of those & lets you connect to them.

This is obviously labour intensive and frankly speaking feels like reinventing the wheel since so many other companies must have already run across this problem. Even so, I'm not sure we'll do a great job at it.

2. Tools such as Puppet, Chef or OpenVPN

I tried to read as much as possible but I can't seem to penetrate enough through the marketing speak to understand the obvious choice to go with.

No one else except us needs to connect to these boxes. Is there anyone with relevant experience that can give me some pointers?

Best Answer

2022 June - Update

If all you need is remote access to the machine, two newer approaches (if you're comfortable with AWS) would be to use one of:

  • AWS SSM
  • AWS VPN

That said, I would still opt for a pull mechanism for ensuring updates are deployed. You ideally want to use these direct shells only in case of an emergency. Otherwise, you will (inevitably) end up with a Frankenstein fleet of servers, each with their own funny configuration tweaks that were done manually by someone in a pinch, without documentation.

Pull updates, don't push

As you scale, pushing updates to all of your products becomes impractical.

  • You'll have to track every single customer, who might each have a different firewall configuration.
  • You'll have to create incoming connections through the customer's firewall, which would require port-forwarding or some similar mechanism. This is a security risk to your customers.

Instead, have your products 'pull' their updates periodically, and then you can add extra capacity server-side as you grow.
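
As a rough illustration of the pull model, here is a minimal sketch of the two decisions each box makes on its own: whether the advertised version is newer than the installed one, and how long to wait before asking again. The function names, the dotted version format, and the one-hour default are assumptions made for this example, not anything from a particular tool. The random jitter is there so that 1000 boxes don't all hit the update server in the same second.

```python
import random


def parse_version(text):
    """Turn a dotted version string such as '1.4.2' into a comparable tuple."""
    return tuple(int(part) for part in text.strip().split("."))


def needs_update(installed, available):
    """Return True when the server advertises a newer version than the installed one."""
    return parse_version(available) > parse_version(installed)


def next_check_delay(base_seconds=3600, jitter_fraction=0.25, rng=random):
    """Pick the delay before the next poll, spread randomly around base_seconds.

    The jitter keeps a large fleet from polling the update server in lockstep.
    """
    spread = base_seconds * jitter_fraction
    return base_seconds + rng.uniform(-spread, spread)
```

A service on the box would loop: fetch the advertised version, call `needs_update`, install if needed, then sleep for `next_check_delay()` seconds. Only outgoing connections are involved, so the customer's firewall rarely needs any changes.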

How?

This problem has already been solved, as you suggested. Here are several approaches I can think of.

  • using apt: Use the built-in apt system with a custom PPA and sources list. (See "How do I set up a PPA?")

    • Con: Unless you use a public hosting service like Launchpad, setting up your own apt PPA + packaging system is not for the faint of heart.
  • using ssh: Generate an SSH key pair for each product, and add that device's public key to your update servers. Then just have your software rsync / scp the required files.

    • Con: You have to track (and back up!) the keys for every product you send out.
    • Pro: More secure than a raw download, since only devices whose public key is registered on the update servers can access the updates.
  • raw download + signature check:

    • Post a signed update file somewhere (Amazon S3, FTP server, etc.).
    • Your product periodically checks whether the update file has changed, then downloads it and verifies the signature.
    • Con: Depending on how you deploy this, the files may be publicly accessible (which may make your product easier to reverse engineer and hack).
  • ansible: Ansible is a great tool for managing system configurations. It's in the realm of Puppet / Chef, but is agentless (it connects over SSH and only needs Python on the managed machine) and designed to be idempotent. If deploying your software would require a complicated bash script, I'd use a tool like this to make your updates less complicated to perform.
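
For the raw download option, one common pattern is to publish a small manifest listing the SHA-256 digest of the update file, and have the device compare the downloaded file against it before doing anything else. The sketch below assumes that setup; the function names and the idea of a hex digest in a manifest are illustrative, not part of any specific product. Note that a matching digest only proves the download is intact. It says nothing about who produced it unless the manifest itself is signed.

```python
import hashlib
import hmac


def sha256_of_file(path, chunk_size=65536):
    """Hash a file in chunks so large update archives don't need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_manifest(path, expected_hex):
    """Return True when the file's SHA-256 equals the digest from the manifest."""
    actual = sha256_of_file(path)
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(actual, expected_hex.strip().lower())
```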

Of course, there are other ways to do this, but that brings me to an important point.

Sign / validate your updates!

No matter what you do, it's imperative that you have a mechanism to ensure that your update hasn't been tampered with. A malicious user could impersonate your update server in any of the above configurations. If you don't validate your update, your box is much easier to hack and get into.

A good way to do this is to sign your update files. You'll have to maintain a signing key or certificate (or pay someone to do so), but you can install the matching public key or certificate fingerprint on each device before you ship it, so that devices can reject updates that have been tampered with.
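
A minimal sketch of the device-side check, assuming the update ships with a detached signature file and the device already holds your public key in PEM format. It shells out to the `openssl dgst` command rather than implementing any cryptography itself. The function name and file layout are made up for this example, and it assumes the `openssl` binary is installed on the box.

```python
import subprocess


def signature_is_valid(update_path, signature_path, public_key_path, timeout=30):
    """Return True only if openssl confirms the detached signature over the update file."""
    command = [
        "openssl", "dgst", "-sha256",
        "-verify", public_key_path,
        "-signature", signature_path,
        update_path,
    ]
    try:
        result = subprocess.run(command, capture_output=True, timeout=timeout)
    except (OSError, subprocess.TimeoutExpired):
        # Fail closed: a missing openssl binary or a hung process means "do not install".
        return False
    # openssl exits with 0 only when the signature verifies.
    return result.returncode == 0
```

On the build side, the matching signature could be produced with something like `openssl dgst -sha256 -sign private.pem -out update.sig update.tar.gz`, with the private key kept off the devices entirely. The important design choice is that every failure path returns False, so a box never installs an update it could not verify.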

Physical Security

Of course, if someone has physical access to the customer's deployment, they could easily take over the server. But at least they can't attack the other deployments! Physical security is likely the responsibility of your customer.

Imagine for a moment what would happen if you used one large OpenVPN network for updates: an attacker could then use the compromised server to attack every instance on the VPN.

Security

Whatever you do, security needs to be built in from the beginning. Don't cut corners here; you'll regret it in the end if you do.

Fully securing this update system is out of scope of this post, and I strongly recommend hiring a consultant if you or someone on your team isn't knowledgeable in this area. It's worth every penny.