Security and latency

Just a Minute

In a high-performance environment, you want speed as well as security. Kurt looks at some approaches to security that won't slow things down.

High-performance computing (HPC) used to be all about buying the fastest system you could afford. It was simple: You had a problem, and you spent as much money as you could to buy a single fast system. Then, some clever people (Thomas Sterling and Donald Becker at NASA) came along and pointed out that computers had gotten quite a bit faster, and if they worked together, they could solve certain types of problems just as quickly for much less money. So, these folks took some commodity hardware and Ethernet gear, wrote some software to tie it all together, and created the Beowulf cluster [1].

Since then, the TOP500 supercomputer list [2] has changed from a list of systems running specialized software from Cray and friends – with a handful of distributed systems running Linux at the bottom – to a list of almost completely Linux-based systems (95.2% of systems and 97.4% of performance as of June 2013), while silicon speeds have basically topped out at 5GHz. Sure, you can go faster, but it's more expensive; going horizontal is much more cost effective.

All of these developments have made HPC vastly more affordable. The technologies and software designed and built to allow commodity hardware to handle large loads also work very well at a smaller scale. This means you can build a solution at a small scale (e.g., using Hadoop, memcached, MongoDB, CouchDB, MapReduce, etc.) and simply add nodes to scale out as necessary. You don't need to replace the system or make significant changes to the software; you can just add more systems running the exact same software.

Discuss Amongst Yourselves

Such systems are built using commodity hardware, and although Ethernet has gotten a lot faster in terms of throughput (Gigabit is standard, and 10Gb is becoming affordable), it hasn't improved much when it comes to latency. Typical Gigabit Ethernet latency is slightly less than a millisecond (less than one thousandth of a second), which is not bad.

But TCP/IP requires round trips, and the initial handshake takes a few packets before you can start sending data. So, when you want to get data from a remote system and need to establish a TCP/IP connection, you can eat up several milliseconds on network traffic alone. This is nothing, however, compared with the delays that can be caused by authentication (especially if you use a central authentication server). So, you want security, but you also want speed. The question is: What trade-off do you make?
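
To get a feel for where that time goes, you can measure the individual phases of a connection with curl; the hostname here is just a placeholder for one of your own services:

$ curl -o /dev/null -s -w 'DNS: %{time_namelookup}s  connect: %{time_connect}s  total: %{time_total}s\n' http://server.example.org/

On a quiet gigabit LAN, the connect time will typically be well under a millisecond; add DNS lookups, TLS, or authentication and the numbers climb quickly.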

No Authentication

The fastest authentication is, of course, no authentication: no code to process, no server to contact, no credentials to pass around and verify. This approach does, however, open up some significant risks. If you decide to use no authentication on a server, you will, at a minimum, need to firewall the service to trusted clients. Assuming these client systems do not have local users or run user-supplied code, you will be relatively safe from attackers – until, of course, they do get in. At that point, they will be able to spread through your infrastructure very quickly. The reality is that most systems have strong perimeter security and very little, if any, internal security, so this might be an acceptable trade-off.
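
As a minimal sketch, assuming an unauthenticated memcached instance on its default port (11211) and a trusted client network of 192.168.10.0/24 (both placeholders), the firewall rules could look like this:

# iptables -A INPUT -p tcp --dport 11211 -s 192.168.10.0/24 -j ACCEPT
# iptables -A INPUT -p tcp --dport 11211 -j DROP

Everything else stays fast: clients connect with no credentials at all, and the only thing standing between the service and an attacker is that filter.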

Authenticate the Network/Traffic

One strategy is to move authentication to the network layer. If you have systems with no user accounts, or a trusted set of users (e.g., a computer cluster), or a series of specialized applications that don't talk to untrusted users or data sources, then it might make sense to authenticate systems at the system level. Using 802.1x, you can force systems to authenticate to the network switches before the network port is enabled; however, I can't imagine many situations in which you have a trusted set of systems and also have to worry about people plugging in random machines.
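
For what it's worth, the client side of wired 802.1x is short; a wpa_supplicant configuration (the identity and certificate paths below are only examples) looks roughly like this:

ap_scan=0
network={
    key_mgmt=IEEE8021X
    eap=TLS
    identity="host01.example.org"
    ca_cert="/etc/ssl/certs/radius-ca.pem"
    client_cert="/etc/ssl/certs/host01.pem"
    private_key="/etc/ssl/private/host01.key"
    eapol_flags=0
}

# wpa_supplicant -B -i eth0 -D wired -c /etc/wpa_supplicant/wired.conf

The operational complexity lives on the other side: the switches have to be pointed at a RADIUS server, and someone has to keep that infrastructure running.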

IPsec or other VPN software, such as OpenVPN, is another option, but again, if you have a highly trusted set of systems, they are likely behind a firewall, so you don't need to worry about this. If you choose this approach, make sure that external access is heavily limited, and I don't mean just network ports – I mean any external access – data that gets passed in, DNS lookups, everything.
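
If you do go the VPN route, OpenVPN's static-key point-to-point mode is about as simple as it gets; the key path and tunnel addresses below are placeholders:

# openvpn --genkey --secret /etc/openvpn/static.key

dev tun
ifconfig 10.8.0.1 10.8.0.2
secret /etc/openvpn/static.key

The other end uses the mirrored ifconfig line plus a remote directive pointing back at this host, and the shared key has to be copied over a secure channel.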

SSH

An issue with the above solutions using 802.1x, IPsec, and so forth is that they'll generally need root access, and 802.1x can be tricky to manage remotely. SSH is a great way to connect disparate systems (everything supports it by default), especially at the user level. For example, simply using public/private key pairs, you can easily configure a program on one system to connect to another system and run a program, connect to a socket, or use some other form of communication channel. The downside of SSH is that key management and rotation can be tricky – unless, of course, you use OpenSSH certificates [3].
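
A minimal sketch of that pattern, with made-up key names, hostnames, and a forced command: first create a passphrase-less key on the client,

$ ssh-keygen -t rsa -f ~/.ssh/batch_key -N ""

then, on the target system, restrict what the key may do in ~/.ssh/authorized_keys,

command="/usr/local/bin/run-job",no-port-forwarding,no-pty ssh-rsa AAAA... batch@host01

and the client simply runs:

$ ssh -i ~/.ssh/batch_key worker01.example.org

The forced command means that even if the private key leaks, all it buys an attacker is the ability to run that one job.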

OpenSSH Certificates

Traditional SSH relies on simple public/private key pairs. When you connect to a server, the public key is sent to you and you are asked whether you want to trust it. If this is the first time you're connecting to that server and you don't have some way to verify the key fingerprint, you really have no idea whether it is legitimate or not. In OpenSSH 5.4, support for a simple homegrown certificate protocol was added (for some reason, they wanted to avoid the complexity of X.509 certificates, and I tend to agree with them). Deployment and configuration are simple: You create a trusted signing key and then use it to sign host keys and user keys:

# ssh-keygen -f ssh_signing_key
# ssh-keygen -h -s ssh_signing_key -I someid -V +365d \
    -n hostname,hostname.example.org /etc/ssh/ssh_host_dsa_key.pub

You'll want the key signature to expire at some point (a year is good), and you'll need to specify the hostname and any aliases allowed for that host. The preceding command will drop the signed key into /etc/ssh/ssh_host_dsa_key-cert.pub. To handle remote systems, you'll need to copy the key to your signing server, sign it, and then upload the signed copy to the original server and configure sshd_config to use it:

HostCertificate /etc/ssh/ssh_host_dsa_key-cert.pub

Users will also need a copy of the public signing key so they can verify signatures made using the private signing key. The key itself needs to be in a text file with a @cert-authority statement so that the client knows it is the public part of a certificate-signing key pair:

@cert-authority * ssh-rsa [public key here]

You can then distribute that file manually and tell users to point at it with the -o UserKnownHostsFile=<file> command-line option; or, you can deploy it on all your systems in /etc/ssh and use a GlobalKnownHostsFile directive to point at the file so that all clients on that system trust the signing key by default. In the future, new connections to any system with a signed host key will just work, so there will be no need to distribute host keys manually, with all that entails.
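
Signing user keys works much the same way; the username and certificate identity below are examples, and the servers need a copy of the public signing key referenced by a TrustedUserCAKeys directive in sshd_config:

# ssh-keygen -s ssh_signing_key -I kurt_2013 -n kurt -V +365d ~kurt/.ssh/id_rsa.pub

TrustedUserCAKeys /etc/ssh/ssh_signing_key.pub

With that in place, any key signed by your CA and carrying a principal matching the login name is accepted, and you can stop shipping individual public keys to every authorized_keys file.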

Kerberos and SSL Certificates

If you're not afraid of complexity, Kerberos and traditional SSL certificates can also work; however, you'll need to kerberize your services and clients or set everything up to handle SSL connections. Kerberos is especially well suited to large-scale deployments because it authenticates not only the client to the server but also the server to the client. Additionally, Kerberos uses several levels of tickets so that, ultimately, "service tickets" are used to connect to a service – such a ticket offers an attacker only minimal access at best.
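
As a quick illustration with a kerberized SSH setup (the realm, user, and host names are placeholders), the ticket flow is visible with the standard client tools:

$ kinit kurt@EXAMPLE.ORG
$ ssh -o GSSAPIAuthentication=yes worker01.example.org
$ klist

klist will list the ticket-granting ticket (krbtgt/EXAMPLE.ORG@EXAMPLE.ORG) plus a service ticket such as host/worker01.example.org@EXAMPLE.ORG; stealing the latter gets an attacker into that one service and nothing else.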

Reducing Latency

The downside of things like OpenSSH certificates, Kerberos, and SSL is that connection setup takes longer; so, in most cases, you'll want to establish a connection before you actually need it and hold it open. That way you aren't constantly paying the cost of setup and teardown. Although this approach is suitable for things like connecting to remote message queues and file servers, it can pose a problem if you need a large number of clients to connect to a large number of servers. In that case, you might want to investigate hardening the perimeter and removing authentication entirely to speed things up.
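
For SSH in particular, connection multiplexing gives you that "connect once, reuse many times" behavior with a few lines in ~/.ssh/config (the host pattern is an example):

Host *.example.org
    ControlMaster auto
    ControlPath ~/.ssh/cm-%r@%h:%p
    ControlPersist 10m

The first connection pays the full authentication cost; subsequent connections ride the existing master connection and set up in a fraction of the time.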


Kurt Seifried

Kurt Seifried is an Information Security Consultant specializing in Linux and networks since 1996. He often wonders how it is that technology works on a large scale but often fails on a small scale.