Implementation Notes

After many years of not bothering, I have a blog again. Not that I plan to update it with any regularity, but I’m tired of resorting to pastebin services when I want to throw a document online. Rather than seeding with the usual content-free “Hello World” post, here are some notes and security considerations regarding the implementation of this site.

Indistinguishable from Random is hosted on a Linode running Apache HTTPD on Debian Wheezy. Site content is generated using Hakyll.

TLS configuration

Indistinguishable from Random doesn’t have much use for TLS encryption. It’s a static site that serves only public content and doesn’t use cookies, so the only things exchanged with the site which a user could possibly want to keep secret are his/her identity (IP address, etc.) and which pages he/she is requesting. TLS doesn’t do anything to protect the former (that’s what Tor is for), and contrary to popular belief, it won’t do a good job of protecting the latter either. TLS doesn’t do much of anything to mitigate traffic analysis side-channels: it doesn’t conceal how much data you’re sending or when you’re sending it. Since the combination of the size of your HTTP request with the size of my server’s response constitutes a likely-unique fingerprint for any piece of content on this blog, you can’t actually count on that green lock icon beside your address bar to prevent people from knowing that you’re reading this page.

However, TLS isn’t only for encryption: it also provides authentication. This blog’s use of TLS should make it somewhat harder for an attacker to put words in my mouth by altering the content that my server sends back to you. An attacker’s best bet would probably be to obtain a forged certificate for my domain. Unfortunately, with the multitude of certificate authorities trusted by most browsers and the willingness of many of them to issue certificates with domain-only validation (such as the one I’m serving), this isn’t a particularly high bar. To mitigate this, I may eventually implement public-key pinning once both the standard and my server operations have matured a bit. I won’t, however, be upgrading to a better-validated certificate: it would cost me more money, wouldn’t improve your user experience one bit, and wouldn’t prevent an attacker from obtaining and using a forged DV certificate!

My OpenSSL cipher string is ECDHE-RSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-SHA:AES128-SHA. The most appealing characteristic of this string is that it’s short: it gives me only a few permutations to test and keeps the number of buggy OpenSSL codepaths to which I’m exposed to a minimum. It earns top marks from SSL Labs, but this is mostly valuable for keeping up appearances, since as I just explained, all that strong encryption isn’t really accomplishing anything. Nonetheless, all the ciphers are very fast on modern hardware supporting AES-NI, and the inclusion of the mandatory AES128-SHA ciphersuite at the end means it shouldn’t cause any compatibility issues.
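A minimal sketch of how a cipher string like this is applied through mod_ssl. The SSLHonorCipherOrder line is an assumption added for illustration, not a record of the live configuration; without it, the client's preference order would normally win.

```apache
# Inside the real (SNI-named) virtual host.
SSLEngine on
# Assumption: prefer the server's ordering so the GCM suite is chosen when offered.
SSLHonorCipherOrder on
SSLCipherSuite ECDHE-RSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-SHA:AES128-SHA
```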

Following the lead of my Akamai colleague Mark Nottingham, I am deliberately requiring that clients support SNI. This will leave some users of old browsers out in the cold, but I feel safe in my expectation that most users of Internet Explorer on Windows XP don’t follow security blogs. On the positive side, it may also keep away some undesirables like spam crawlers and sslsqueeze.

Like Mark, I was unhappy with the idea of replying to non-SNI clients with a 403 page. What I want is to fail the TLS handshake as soon as I receive a ClientHello with no SNI extension in it. Apache’s configuration language isn’t designed to facilitate this. Setting an empty cipher string or failing to give a certificate to the default vhost doesn’t work: that’ll cause Apache to error out on startup. Eventually though, I figured out a hack that makes it do what I want: give the default vhost a self-signed RSA certificate combined with a cipher string that doesn’t contain any RSA ciphersuites.
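A sketch of what such a default vhost might look like. The ServerName, file paths, and the specific ECDSA-only cipher string are all placeholders chosen for illustration; any string with no RSA-authenticated suites should serve the same purpose.

```apache
# Listed first so that it becomes the default vhost for handshakes
# that carry no SNI name.
<VirtualHost *:443>
    # Hypothetical placeholder name.
    ServerName default.invalid
    SSLEngine on
    # Self-signed RSA certificate and key; paths are hypothetical.
    SSLCertificateFile    /etc/ssl/dummy/selfsigned-rsa.crt
    SSLCertificateKeyFile /etc/ssl/dummy/selfsigned-rsa.key
    # Only an ECDSA-authenticated suite is allowed. The RSA key above cannot
    # satisfy it, so no usable suite is shared and the handshake fails.
    SSLCipherSuite ECDHE-ECDSA-AES128-GCM-SHA256
</VirtualHost>
```

The named vhosts that follow carry the real certificate and the real cipher string, so clients that do send SNI never end up negotiating against this one.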

Speed hacks

Indistinguishable from Random is a low-traffic site consisting entirely of static content pre-generated using Hakyll, so right off the bat I don’t anticipate any performance issues. But since I can’t help obsessing over these things, I did some additional tuning.

First, the obvious stuff: use mpm_event, and turn on Cache-Control headers. Images usually won’t ever change once posted, so those have a cache TTL of a day. HTML files need to be regenerated when sidebar content changes, which due to the “Recent Posts” box means every time I make a blog post, so those get a 15-minute TTL. I’ll likely often need to hack on my stylesheet to get a new post to look right, so that just gets 15 minutes as well.
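One way to express those TTLs is with mod_expires, which emits both Expires and Cache-Control: max-age headers. This is a sketch under the assumption that mod_expires is the mechanism in use; the MIME types listed are illustrative guesses at what the site serves.

```apache
# Requires mod_expires to be enabled.
ExpiresActive On
# Images rarely change once posted: one day.
ExpiresByType image/png  "access plus 1 day"
ExpiresByType image/jpeg "access plus 1 day"
# HTML and the stylesheet change with each new post: fifteen minutes.
ExpiresByType text/html  "access plus 15 minutes"
ExpiresByType text/css   "access plus 15 minutes"
```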

Next, I’ve enabled serving of compressed content. Even with small files and clients with fast internet connections, this can make a big improvement to page load times because of TCP’s slow start. Ensuring that content fits within one TCP window can save a round trip.
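As a rough worked figure, assume the server uses the 10-segment initial congestion window permitted by RFC 6928 and a typical Ethernet MSS of 1460 bytes (both are assumptions, not measurements of this server):

```latex
\[
\text{initial window} \approx 10 \times \text{MSS}
  = 10 \times 1460\ \text{bytes}
  = 14600\ \text{bytes} \approx 14.3\ \text{KiB}
\]
```

Response headers and TLS record overhead eat into that budget, so the compressed page needs to come in somewhat under this figure to arrive in the first flight.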

Since all my content is static, there’s no reason to waste CPU by compressing it on the fly; I can keep precompressed copies of everything on disk. Figure 1 shows my Apache recipe for this. Warning: there’s no RewriteCond here to verify that the compressed version exists. My build scripts are responsible for ensuring that it always does.

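# If the client accepts gzip, rewrite the request to the precompressed file
# and remember the original path in the REQUEST_URI environment variable.
RewriteCond %{HTTP:Accept-Encoding} gzip
RewriteRule ^(.*)\.(html|js|txt|css|svg)$ $1\.$2\.gz [E=REQUEST_URI:$1\.$2]
# Compressible types vary on Accept-Encoding whether or not gzip was served.
SetEnvIf REQUEST_URI "\.(html|js|txt|css|svg)$" MAY_GZIP
Header merge Vary Accept-Encoding env=MAY_GZIP
# When the rewrite fired, advertise the compressed variant's location.
Header set Content-Location %{REQUEST_URI}e.gz env=REQUEST_URI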
Figure 1: serving pre-compressed content from Apache
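If a build script can't guarantee that every compressed file exists, a more defensive variant of the rewrite could check the filesystem first. This is a sketch, not the configuration in use here: it assumes the rules sit in server or virtual-host context and that files are served directly from DocumentRoot with no Alias involved.

```apache
# Needed once per context if rewriting is not already enabled.
RewriteEngine On
RewriteCond %{HTTP:Accept-Encoding} gzip
# Only rewrite when the precompressed file is actually on disk.
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI}.gz -f
RewriteRule ^(.*)\.(html|js|txt|css|svg)$ $1\.$2\.gz [E=REQUEST_URI:$1\.$2]
```

The cost is an extra stat call per request, which is the overhead the unguarded version avoids.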

Indistinguishable from Random’s library dependencies — namely font files and MathJax — are served from CDNs, which hopefully ensures that you’ll receive these files from a server geographically close to you. I haven’t — yet — put my own domain behind a CDN. Doing so will become a clearer win once my site traffic increases, and with it, the probability of a cache hit at the CDN edge server.

Essential CSS is served inline, and inessential bits such as fonts are loaded asynchronously. This means that you see a useful page after one fewer round trip to the server. The inlined CSS bloats the compressed page size by a little over two kilobytes, but that’s still a big win on virtually any internet link since the end of dial-up.

Accessibility

I’ve put considerable effort into making Indistinguishable from Random accessible. I’ve carefully followed the W3C Web Content Accessibility Guidelines and believe I am in compliance with Level AA, and with much but not all of Level AAA. I’ve tested all of my content using WAVE and ChromeVox, and added aural stylesheets and ARIA attributes to smooth over some rough patches of narration.

I’ve chosen my fonts and colors to be simple, readable, and high-contrast. If you use a Gecko-based browser such as Firefox and you’ve customized your color scheme, those customizations are respected. On other browsers, main text areas conform to the usual white background, black text, and blue and purple hyperlinks.

On code listings, I’ve changed the color scheme for syntax highlighting in order to improve contrast. However, the code listings still need some work accessibility-wise. Right now, if you play them in a screen reader, the reader will read out all the line numbers first — “one two three four five” — and then move on to the actual code. That’s not very useful. I’ll probably have to contribute some patches to Kate if I want to fix this. Also, of course, screen readers butcher the pronunciation of code, but this is an open problem that I’m not likely to be able to address.

If you have any sort of visual or cognitive impairment or you rely on any sort of assistive technology for browsing the web, I would greatly appreciate your feedback on anything that could be improved.

Comment system

Indistinguishable from Random doesn’t support comments yet. I’ve yet to find any off-the-shelf comment system which inspires sufficient confidence that I’d be willing to install it on my own server. I plan to look into adding a hosted solution such as Disqus, or, failing that, just link to a discussion thread on a relevant forum. Update: Disqus-based comments are now live.