Post Category: Security

I love cryptography, computer security, and physical security. So I write about it.

ECB Isn't a Mode of Operation

  • By Brad Conte, October 30, 2013
  • Post Categories: Security

ECB isn't a block cipher mode of operation. At least, not for a developer.

In fact, I would suggest that calling ECB an encryption mode is similar to calling "see if it opens in Word after you decrypt it" a MAC. Or even akin to calling "textbook" a type of RSA padding and listing it next to OAEP in a crypto API. From a development perspective, ECB mode, "open it in Word", and "textbook RSA padding" are all examples of a horrible ad-hoc scheme that attempts to address a cryptography problem without using a real cryptographic scheme.

I understand why we formally categorize ECB as a mode of operation, and that makes perfect sense. Conceptually, ECB is the trivial/zero case of operation modes; you might even call it the degenerate case. In a formal setting, labeling the degenerate case under the same term as the general case is perfectly fine.

But ECB isn't a mode of operation for developers of real world cryptosystems in that it doesn't satisfy the requirements we have for all the other modes of operation. Modes of operation were designed to turn the block cipher primitive into a more general encryption scheme. By themselves, block ciphers are just pieces of the overall puzzle. They have security models that are great, but have nothing to do with end-to-end cryptosystems. Block ciphers are built to provide a piece of cryptographic functionality, a piece that needs to be leveraged by a bigger scheme to produce a real world cryptosystem. ECB is a "mode" that doesn't do that.

Calling ECB a "mode of operation" in a software interface implicitly confers a status upon ECB that it doesn't deserve. It implies to all who see it, "ECB is an official mode of operation that fulfills some of the needs of a mode of operation," when the reality is that it does not. Inappropriate use of ECB mode (which is pretty much always "using ECB mode") is one of the most classic cryptography mistakes in software development and allows simple vulnerabilities that a middle-schooler could be trained to exploit. ECB is the zero case, and giving the zero case a name that doesn't sound like the zero case has hurt the community of developers who use it.

Consider a typical crypto API. The user specifies a parameter to the library choosing a mode of operation for the block cipher. What kind of options do they have? If they're using OpenSSL, Crypto++, .NET, Java, or PyCrypto, then it is a list analogous to this:

enum BLOCK_CIPHER_MODE {
    ECB,
    CBC,
    CTR,
    // [...]
}

The options might be classes, maybe the function names use them as suffixes, etc, but the options are grouped together thematically like that. In a case like this, the label "ECB" is a lie because ECB isn't just another one of the choices. Instead, this would be a more honest list:

enum BLOCK_CIPHER_MODE {
    NONE,
    CBC,
    CTR,
    // [...]
}

This clearly labels which one is the "0"/"off"/"none" option. "You aren't using a mode of operation, so implicitly you're missing the benefit of using one," it says.

Obviously, developers should research what they're using. And to be fair, some APIs (including some listed above) have the courtesy to point out ECB's security pitfalls in the documentation. The documentation is generally an incomplete warning, saying something like "warning, ECB mode leaks information about plaintext". But how does the developer know if that problem applies to their use case? Just how bad is that? A warning in documentation requires the developer read the documentation (reasonable), assess the severity of the problem presented (somewhat reasonable), derive or research all the other problems ECB brings (semi-unreasonable), then realize that ECB is nothing like what they're looking for. Relabeling "ECB" to "NONE" hints at a lot of that up-front.
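To make "leaks information about plaintext" concrete, here is a minimal sketch. It assumes a toy Feistel construction built on SHA-256 as a stand-in for a real block cipher (the names toy_block_encrypt, ecb_encrypt, and ctr_encrypt are mine, and none of this is secure or taken from any library), and it assumes the plaintext is already a multiple of the block size. The point is only to compare what the "no mode" option and a real mode do with repeated plaintext blocks.

```python
import hashlib

BLOCK = 16          # block size in bytes
HALF = BLOCK // 2


def toy_block_encrypt(key: bytes, block: bytes) -> bytes:
    """Toy 4-round Feistel permutation; a stand-in for a real block cipher."""
    left, right = block[:HALF], block[HALF:]
    for rnd in range(4):
        mask = hashlib.sha256(key + bytes([rnd]) + right).digest()[:HALF]
        left, right = right, bytes(a ^ b for a, b in zip(left, mask))
    return left + right


def split_blocks(data: bytes) -> list:
    return [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]


def ecb_encrypt(key: bytes, plaintext: bytes) -> bytes:
    # "No mode": every block is encrypted independently of the others.
    return b"".join(toy_block_encrypt(key, b) for b in split_blocks(plaintext))


def ctr_encrypt(key: bytes, nonce: bytes, plaintext: bytes) -> bytes:
    # Counter mode: encrypt nonce||counter and XOR the result into each block.
    out = []
    for counter, block in enumerate(split_blocks(plaintext)):
        keystream = toy_block_encrypt(key, nonce + counter.to_bytes(HALF, "big"))
        out.append(bytes(a ^ b for a, b in zip(block, keystream)))
    return b"".join(out)


key = b"0123456789abcdef"
nonce = b"\x00" * HALF                 # fixed only for the demo; must be unique in real use
plaintext = b"ATTACK AT DAWN!!" * 3    # three identical 16-byte blocks

ecb_blocks = split_blocks(ecb_encrypt(key, plaintext))
ctr_blocks = split_blocks(ctr_encrypt(key, nonce, plaintext))
print("distinct ECB ciphertext blocks:", len(set(ecb_blocks)))
print("distinct CTR ciphertext blocks:", len(set(ctr_blocks)))
```

Under ECB the three identical plaintext blocks produce three identical ciphertext blocks, so an eavesdropper learns the repetition without knowing the key. Under the counter-mode sketch each block gets a different keystream, so the repetition disappears.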

Imagine if we did the same thing for RSA. "Textbook" RSA (aka, encode the message as an integer and exponentiate it) is also a cryptographic primitive and it shouldn't be used in real cryptosystems either. Proper use of RSA requires that a padding scheme be applied to the plaintext first, but imagine if RSA functionality was exposed in APIs with a variety of padding options like "OAEP", "PKCS1.5", and "textbook", all presented together like they're the same thing. "Textbook" padding is just writing the plaintext as an integer with leading 0s. It's the zero/base case, like ECB for block ciphers, and has no business being listed next to the other padding options. Choosing "textbook" RSA padding is really choosing no padding, since the padding scheme isn't addressing any of the issues the other padding schemes exist to address. (Thankfully, I've never seen an API offer "textbook" as an RSA padding scheme.)
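As a rough illustration of why "textbook" is the zero case, the following sketch uses the tiny, insecure parameters p = 61 and q = 53 (my choice, picked only so the numbers are readable) and two hypothetical helper functions. It shows two of the issues real padding schemes exist to address: the output is deterministic, and ciphertexts can be multiplied together to forge the encryption of a related message.

```python
# Tiny, insecure parameters chosen only so the numbers are easy to follow.
p, q = 61, 53
n = p * q        # modulus
e = 17           # public exponent
d = 2753         # private exponent: (e * d) % ((p - 1) * (q - 1)) == 1


def textbook_encrypt(m: int) -> int:
    # No padding at all: the message is just an integer raised to e.
    return pow(m, e, n)


def textbook_decrypt(c: int) -> int:
    return pow(c, d, n)


m1, m2 = 42, 7
c1, c2 = textbook_encrypt(m1), textbook_encrypt(m2)

# Deterministic: the same message always yields the same ciphertext.
deterministic = textbook_encrypt(m1) == c1

# Malleable: the product of two ciphertexts decrypts to the product of the messages.
forged = (c1 * c2) % n
malleable = textbook_decrypt(forged) == (m1 * m2) % n

print("deterministic:", deterministic)
print("malleable:", malleable)
```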

Even academia notes that ECB is out-of-place when modes of operation are considered as a whole. Consider "Evaluation of Some Blockcipher Modes of Operation" by Rogaway. When it summarizes ECB (pg. 5) it points out that ECB is the black sheep of the operational modes family, not meeting the same pattern of practical usefulness that the others do:

[...] ECB should not be regarded as a "general-purpose" confidentiality mode.

Nothing in this article is new. Yes, my complaint is largely semantics, and obviously the perils of ECB have been preached for a very long time. As well, developers are responsible for the code they write. But misleading interface choices do bear a portion of the blame when the masses use (and abuse) them, and I believe that using the label "ECB" under "modes of operation" in developer-oriented APIs and documentation is misleading. "ECB" is a term best relegated to formal cryptographic speak. It should not be used anywhere someone is building a real-world cryptosystem.

Unfortunately, this is just one problem among many, as cryptography libraries have a long history of making it easy for developers to misuse them. ECB is just one example of presenting bad choices to developers. Part of the bigger picture solution is to have good cryptography interfaces that hide lots of details, allowing as few potentially fatal decisions as possible. But I still find the ECB issue particularly annoying, primarily because it's rooted in semantics. The fact that real-world cryptography problems start with vocabulary choices is sad.

Certificate Verification Using load_verify_locations()

  • By Brad Conte, September 13, 2013
  • Post Categories: Security

I was writing a small Python script recently to use a web API over HTTPS. To use HTTPS properly, I naturally wanted to validate the server's certificate before trusting the HTTPS connection. Sadly, doing something so simple required more digging than I would have thought.

I was using Python 3's http.client module for HTTPS, so I needed to use the ssl.SSLContext.load_verify_locations method of my SSL context to specify the path of the trusted CA certificate store for the cert verification. I wanted to use the pre-existing CA store on my computer, which I thought should be straight-forward.

Like many crypto APIs in higher-level libraries, the Python function turned out to be pretty much a pass-through wrapper to the OpenSSL API of the same name. I hadn't used that API before, so I read the brief Python documentation, then skimmed the OpenSSL documentation from which I gleaned:

int SSL_CTX_load_verify_locations(SSL_CTX *ctx, const char *CAfile,
                                   const char *CApath);

If CAfile is not NULL, it points to a file of CA certificates in PEM format. The file can contain several CA certificates identified by

 -----BEGIN CERTIFICATE-----
 ... (CA certificate in base64 encoding) ...
 -----END CERTIFICATE-----

sequences. [...]

If CApath is not NULL, it points to a directory containing CA certificates in PEM format. [...]

That seemed relatively straight-forward. I looked at the "ca-certificates" package on my Arch Linux system to see where the system's default CA certificates were installed. (The ca-certificates package is a bundle of default (and often pre-installed) CA certificates deemed worthy of shipping out-of-the-box by organizations like Mozilla. Other Linux distros may use a different name for the package of certs.) I tried load_verify_locations with the CApath argument pointed to the ca-certificates store, which in my case was /usr/share/ca-certificates/mozilla. Even though the PEM certificates were there (the ".crt" files are PEMs; since PEM allows for a concatenation of ".crt" file contents the ".crt" files are PEM as a trivial case), the Python module didn't find the certificates and the HTTPS connection returned an SSL: CERTIFICATE_VERIFY_FAILED error. I tried pointing the CAfile argument at the X509 cert files, but that failed too.

I searched for example usage of load_verify_locations in Python and C. I was very disappointed to find no useful examples among the first 20 or so. I found a lot of code that ignored certificate validation (which was disturbing), code that only used arguments pulled from external settings, and code that used paths to app-specific certificates. But no literal examples of using the environment's existing store.

I went back to the OpenSSL documentation. Whoops, I hadn't read it fully:

If CApath is not NULL, it points to a directory containing CA certificates in PEM format. The files each contain one CA certificate. The files are looked up by the CA subject name hash value, which must hence be available.

[...]

Prepare the directory /some/where/certs containing several CA certificates for use as CApath:

OpenSSL needs the cert directory to be set up first. The cert filenames within the prepared directory are going to be the CA subject name hash. OK, where was that set up on my system?

After more searching I stumbled across a very recent helpful comment on a Python bug report. (Less than surprising, the bug was advocating that Python should make the system CA cert store easier to access, although the suggested method itself was not a good idea.) The comment provided a list of example paths for CApath and CAfile on different common *nix systems:

cert_paths = [
    # Debian, Ubuntu, Arch, SuSE
    # NetBSD (security/mozilla-rootcerts)
    "/etc/ssl/certs/",
    # Debian, Ubuntu, Arch: maintained by update-ca-certificates
    "/etc/ssl/certs/ca-certificates.crt",
    # Red Hat 5+, Fedora, Centos
    "/etc/pki/tls/certs/ca-bundle.crt",
    # Red Hat 4
    "/usr/share/ssl/certs/ca-bundle.crt",
    # FreeBSD (security/ca-root-nss package)
    "/usr/local/share/certs/ca-root-nss.crt",
    # FreeBSD (deprecated security/ca-root package, removed 2008)
    "/usr/local/share/certs/ca-root.crt",
    # FreeBSD (optional symlink)
    # OpenBSD
    "/etc/ssl/cert.pem",
    # Mac OS X
    "/System/Library/OpenSSL/certs/cert.pem",
    ]

These are locations set up for OpenSSL's load_verify_locations use. On my system, the majority of the files were symlinks to certs in the store, where the symlink name was the CA subject name hash. The tools c_rehash (a part of OpenSSL) and update-ca-certificates (a part of ca-certificates) set up this path for OpenSSL.
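To see that layout on a given machine, a small sketch like the following can help. The helper name hashed_entries is my own, and the pattern assumes the usual OpenSSL naming of eight hex digits, a dot, and a sequence number (for example "9d04f354.0"); it only lists what is there and does not verify that the hashes are correct.

```python
import os
import re

# Hash-style names as produced by c_rehash: eight hex digits, a dot, a counter.
HASHED_NAME = re.compile(r"^[0-9a-f]{8}\.\d+$")


def hashed_entries(capath: str) -> dict:
    """Map each hash-style name in capath to the file it resolves to."""
    entries = {}
    for name in sorted(os.listdir(capath)):
        if HASHED_NAME.match(name):
            # realpath follows the symlink back to the actual certificate file.
            entries[name] = os.path.realpath(os.path.join(capath, name))
    return entries


if __name__ == "__main__":
    store = "/etc/ssl/certs/"   # one of the candidate paths listed above
    if os.path.isdir(store):
        for name, target in hashed_entries(store).items():
            print(name, "->", target)
```

If the directory contains ".crt" files but no hash-style names, it has not been prepared for use as CApath, which matches the failure described earlier with /usr/share/ca-certificates/mozilla.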

It was a little disappointing that there was no de-facto path, environment variable, or something of the like to find the prepared OpenSSL-ready CA store. But while you can't rely on the CA store being set up in the same place across platforms, if this list is true then /etc/ssl/certs/ seems to be a very popular choice on Linux.

So in my Python script, I used:

import ssl

ssl_ctx = ssl.SSLContext(ssl.PROTOCOL_TLSv1)
ssl_ctx.verify_mode = ssl.CERT_REQUIRED
# First argument is cafile (unused), second is capath (the hashed directory).
ssl_ctx.load_verify_locations(None, "/etc/ssl/certs/")

and got SSL certificate verification to work using my pre-installed certificate store.
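Since the store location varies by platform, one way to avoid hard-coding a single path is to probe a few candidates. This is only a sketch under my own assumptions: the candidate list is a subset of the paths quoted above, the helper names find_ca_store and make_verifying_context are hypothetical, and it uses ssl.PROTOCOL_TLS_CLIENT, which is available in newer Python 3 releases and requires certificate verification by default.

```python
import os
import ssl

# A few of the candidate locations from the list above; extend as needed.
CANDIDATE_CERT_PATHS = [
    "/etc/ssl/certs/",
    "/etc/ssl/certs/ca-certificates.crt",
    "/etc/pki/tls/certs/ca-bundle.crt",
    "/etc/ssl/cert.pem",
]


def find_ca_store(candidates=CANDIDATE_CERT_PATHS):
    """Return (cafile, capath) for the first candidate that exists."""
    for path in candidates:
        if os.path.isdir(path):
            return None, path      # a directory is passed as capath
        if os.path.isfile(path):
            return path, None      # a bundle file is passed as cafile
    return None, None


def make_verifying_context(candidates=CANDIDATE_CERT_PATHS):
    """Build an SSLContext that verifies against the first store found."""
    cafile, capath = find_ca_store(candidates)
    if cafile is None and capath is None:
        raise RuntimeError("no CA certificate store found")
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)  # verification on by default
    ctx.load_verify_locations(cafile=cafile, capath=capath)
    return ctx
```

The resulting context can be handed to http.client.HTTPSConnection through its context argument. Note that this only checks that a path exists, not that a directory has actually been prepared with hashed names.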

Indeed, the solution was fairly straight-forward. I was just surprised by how long it took to find it. I had originally thought there would be lots of examples since, well, scripts aren't carrying their own CA list around and they are performing certificate validation. Right?

On that disturbing note: Please, devs, do verify certificates! HTTPS is just HTTP with obfuscation if you don't do cert validation. Anyone can MITM the connection with a self-signed cert. Even if the application is doing something mundane, like some seemingly boring API queries, you probably still want to protect your API key. If using the pre-installed cert store was confusing before, I hope this article helped address that issue.

FoxyProxy, Firefox 3.5, and DNS Leaking

[Update: Jan. 24, 2010] The DNS leaking problem described in this article applied to FoxyProxy v2.14. On Jan. 12, FoxyProxy v2.17 fixed the problem.


FoxyProxy is a popular Firefox extension that enables users to set up, easily manage, and quickly switch between multiple proxy configurations. One of the most common uses of a proxy server is for security/privacy. By establishing an encrypted connection (usually via SSH) with a proxy server on a trusted network, you can have your web traffic go through an encrypted "pipe" to that server and have that server send and receive web requests on your behalf, relaying data to and from you only through that pipe. By doing this you eliminate the risk that someone on your current network could see your HTTP traffic. Maybe you don't trust other clients on the network, maybe you don't trust the gateway, it doesn't matter -- your HTTP(S) traffic is encrypted and shielded from prying eyes. (Readers unfamiliar with the concept of using HTTP proxies through SSH tunnels are encouraged to research the matter; there are many well-written tutorials available.)

There are many other popular uses of proxy servers, but the application of encrypted web traffic is of concern in this case for the following reason: A key problem that arises when using web proxy servers is the issue of handling DNS requests. DNS does not go through the HTTP protocol, so even if HTTP is being forwarded to a proxy the DNS requests may not be. If DNS requests are sent out over the network like normal then eavesdroppers can still read them. So although the actual traffic may be encrypted, the DNS queries behave normally and may cause the same problems that using an encrypted tunnel was designed to avoid in the first place. A situation in which HTTP traffic is encrypted but DNS is not is referred to as "DNS leaking". When using a proxy for the benefit of security or privacy, DNS leaking may be just as bad as non-encrypted traffic.

Solving the DNS leaking problem is simple. One type of proxy, SOCKS5, can forward DNS requests as well as HTTP(S) data. A user simply needs to tell their browser to use the SOCKS5 proxy for both HTTP(S) and DNS and then both will be routed through the encrypted stream, allowing the user to surf the web with greatly strengthened security and privacy.
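The difference comes down to what the client puts in the SOCKS5 CONNECT request: either a hostname for the proxy to resolve, or an IP address the client has already resolved itself. The following is a minimal sketch of that request layout as I understand it from the SOCKS5 specification (RFC 1928); the function name socks5_connect_request and the remote_dns flag are my own, and the sketch skips the initial method negotiation and any error handling.

```python
import socket
import struct


def socks5_connect_request(host: str, port: int, remote_dns: bool) -> bytes:
    """Build the bytes of a SOCKS5 CONNECT request for host:port."""
    header = b"\x05\x01\x00"          # VER=5, CMD=CONNECT, RSV=0
    port_bytes = struct.pack("!H", port)
    if remote_dns:
        # ATYP=3: send the hostname itself, so the proxy does the DNS lookup.
        name = host.encode("ascii")
        return header + b"\x03" + bytes([len(name)]) + name + port_bytes
    # ATYP=1: resolve locally first (this lookup is what leaks), then send the IPv4 address.
    addr = socket.inet_aton(socket.gethostbyname(host))
    return header + b"\x01" + addr + port_bytes


# With remote DNS the hostname travels inside the tunnel and no local lookup happens.
print(socks5_connect_request("example.com", 443, remote_dns=True))
```

When the second branch is taken, the DNS query for the hostname goes out over the normal network before the proxy is ever contacted, which is exactly the leak described above.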

However, Firefox users who use FoxyProxy at the moment will encounter a problem when using DNS forwarding to a SOCKS5 proxy. When using FoxyProxy, DNS leaking occurs even when it is configured not to, which has made many users very upset. Initially many people thought the problem was with Firefox 3.5, but others confirmed it was only present with FoxyProxy installed. Unfortunately, however, not everyone is convinced that this is FoxyProxy-related behavior and I have not found anyone who has presented a solution yet. I plan to do both.

This is the basic setup for my tests:

  • I set up an SSH server.
  • I established an SSH connection and used the built-in SOCKS5 functionality of the SSH server daemon:
    $ ssh username@myserver -D localhost:8080
    

    (For the non-SSH inclined: This command forwarded all traffic my client sends to itself on port 8080 through the SSH connection to the SSH server, which then acts as a SOCKS5 proxy and sends the data on to the destination.)

  • I used Wireshark to monitor all packets, specifically DNS requests, sent or received on my network interface. Note that DNS requests tunneled over the SSH connection to the SOCKS5 proxy are not visible to the packet sniffer.
  • I monitored my Firefox configuration in about:config. (All network proxy-related settings are under the filter network.proxy.)
  • I used Firefox v3.5.5 and FoxyProxy v2.14.

Using this I was able to monitor all DNS requests while I experimented with Firefox and FoxyProxy using a SOCKS5 proxy. I did a base test with no proxy configuration, a test using Firefox's included proxy management, and a test using FoxyProxy for proxy management.

Using no proxy

Starting with a default configuration (SSH SOCKS connection established but no proxy settings configured to use it) I visited several websites such as google.com, yahoo.com, and schneier.com. This was the simple base test.

I checked showmyip.com to get my IP address.

The relevant about:config settings:

network.proxy.socks               ""
network.proxy.socks_port          0
network.proxy.socks_remote_dns    false
network.proxy.type                0

Via Wireshark I watched as the websites generated normal DNS requests over the standard network.

Using Firefox to configure proxy settings

I restarted Firefox to avoid any cached DNS entries. Then, without FoxyProxy installed, I set up my SOCKS5 proxy. (Note that FoxyProxy replaces the standard Firefox proxy editor, so it is impossible to not use FoxyProxy when it is installed.)

Under Firefox's Preferences/Tools (depending on your operating system) I went to the "Advanced" tab, "Network" sub-tab, and opened "Settings". I chose "Manual proxy configuration" and entered "localhost" for the SOCKS Host and "8080" for the port.

Unfortunately, Firefox v3.5 does not support a GUI method of enabling DNS SOCKS5 proxying, so I had to manually go to about:config and enable it by setting network.proxy.socks_remote_dns to "true".

I checked showmyip.com to ensure that my IP address displayed as coming from the server and not my client. It did show as coming from the server, so Firefox was using the proxy.

The final about:config settings were:

network.proxy.socks               localhost
network.proxy.socks_port          8080
network.proxy.socks_remote_dns    true
network.proxy.type                1

I visited the same websites. Via Wireshark, I did not see any DNS requests sent over the standard network. Firefox channeled both the HTTP and DNS data through the SSH tunnel perfectly.

Using FoxyProxy to configure proxy settings

I reset all the about:config settings back to their defaults. Then I installed FoxyProxy Standard v2.14. I went to FoxyProxy's options and, under the "Proxies" tab, created a new proxy entry which I named "SSH SOCKS5". I set it to connect to "localhost" on port 8080. As well, I check-marked the "SOCKS proxy?" box and selected "SOCKS v5". I went to the "Global Options" tab and checked the box "Use SOCKS proxy for DNS lookups". To let this take effect, I had to restart Firefox.

When Firefox had restarted, I went to the Tools > FoxyProxy menu and selected to "Use proxy SSH SOCKS5 for all URLS". I checked showmyip.com to ensure that my IP address displayed as coming from the server and not my client. It did show as coming from the server, so Firefox was using the proxy.

I checked about:config:

network.proxy.socks               ""
network.proxy.socks_port          0
network.proxy.socks_remote_dns    false
network.proxy.type                0

The configuration was the same as the default, so apparently FoxyProxy does not adjust about:config to do its work.

Watching the DNS requests via Wireshark, I watched as all the website visits generated DNS requests over the normal network. Complete and thorough DNS leaking. And I would like to emphasize that I had selected "Use SOCKS proxy for DNS lookups", which is FoxyProxy's option to address the DNS leaking issue.

Fixing DNS leaking

There was no question about it, FoxyProxy caused the DNS leaking in my test. I wanted to solve the problem so I fiddled with about:config.

In about:config I manually set network.proxy.type to 1. I verified my IP address was from the server via showmyip.com.

The new about:config:

network.proxy.socks               ""
network.proxy.socks_port          0
network.proxy.socks_remote_dns    false
network.proxy.type                1

I watched for DNS requests again via Wireshark. I saw none. It seemed that just manually setting network.proxy.type to 1 fixed the FoxyProxy DNS leaking problem.

I also tried other about:config settings, such as manually changing network.proxy.socks_remote_dns to "true", but that didn't work. The above was the only change in about:config that I found that fixed the problem.

Summary

I repeated the results above three times in different orders on different computers on both Linux and Windows to ensure I made no configuration mistakes and to verify that the behavior was consistent and cross-platform. All the tests yielded the same results. Here is the final summary:

  • Firefox v3.5 does not suffer from DNS leaking by itself.
  • DNS leaking occurs when FoxyProxy is managing the proxies.
  • FoxyProxy does not suffer from DNS leaking when network.proxy.type is manually set to 1.

It is obvious that FoxyProxy does not adjust about:config in order to configure proxy settings, but I do not know why. Many Firefox extensions adjust about:config in order to accomplish their goals and I know of no reason they should not. It's possible that FoxyProxy has not had a need to do so before, but in light of this serious problem that may need to change. The quickest/simplest solution for FoxyProxy may be to set network.proxy.type to 1 if the currently enabled proxy is SOCKS5 and if the global options for FoxyProxy (or the about:config for Firefox) are set to enable DNS forwarding.

However, although this seems to indicate that FoxyProxy has made a mistake, I don't know that FoxyProxy is the party at fault. Clearly FoxyProxy does not have to alter about:config in order to change the other proxy settings, so why must network.proxy.type be set in about:config in order for DNS forwarding to work? Note that network.proxy.type isn't related to DNS forwarding, it just specifies which type of proxy is enabled. For all I know someone implemented a hack in Firefox that checks about:config when it shouldn't. Of course, I don't know that and I don't know if this is expected behavior from Firefox or not. It could be that FoxyProxy isn't setting whatever hidden configuration for DNS forwarding that exists on the same plane as the other invisible proxy settings it uses. Or maybe FoxyProxy is relying on an unreliable hack in order to avoid changing about:config. I don't know about any of that. What I do know is that Firefox by itself does not have this DNS leaking problem, FoxyProxy does, and a simple solution exists.

Again, I am certainly not the first person to note this problem, but a) I have seen many people blame Firefox for this bug, and b) I have not yet seen anyone else mention the solution that I noted above.

I leave it to someone with more time and knowledge about these software projects to determine which project should have which bug report filed. This needs to be fixed permanently.

The Practicality of Port Knocking

  • By Brad Conte, May 29, 2008
  • Post Categories: Security

Port knocking is one of those server security topics that seems to come up every now and then, and when it does it always sparks a bit of debate over matters of practicality.

The idea behind port knocking is simple: The administrator of a server sets up a server with an Internet-accessible service. The administrator then closes all the ports on the server, including the ports that the service uses, and starts a daemon that monitors all incoming packets to the server. The daemon then opens the port corresponding to the service when and only when the daemon receives a series of packets on a specific set of ports in the correct order. The packets can be any type -- TCP SYNs, TCP ACKs, UDPs, whatever -- but they must be sent to the server on the correct ports and in the correct order to cause the daemon to open a service's port. Thus a port remains closed until someone "knocks" on the correct ports to cause the daemon to temporarily open it.

(Example: If the required sequence is TCP ACK packets on ports 333, 4444, and 55555, then the sequence of ACKs "27, 333, 4444, 55555, 42" would open the port, whereas the sequence "333, 27, 4444, 55555, 42" would not. I would tend to use TCP ACK packets, because they look the most boring to someone sniffing a network -- more below.)
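The matching logic in that example can be sketched as a small state machine. This is only an illustration under my own assumptions: the class name KnockTracker is hypothetical, it tracks progress per source address, it only looks at port numbers (not packet types), and it assumes the ports in the sequence are distinct. A real daemon would also need timeouts and actual packet capture.

```python
class KnockTracker:
    """Track per-source progress through a required knock sequence."""

    def __init__(self, sequence):
        self.sequence = tuple(sequence)
        self.progress = {}   # source IP -> number of correct knocks so far

    def observe(self, src_ip: str, port: int) -> bool:
        """Record one knock; return True when src_ip completes the sequence."""
        matched = self.progress.get(src_ip, 0)
        if port == self.sequence[matched]:
            matched += 1                 # next expected port: advance
        elif port == self.sequence[0]:
            matched = 1                  # wrong port, but it restarts the sequence
        else:
            matched = 0                  # wrong port: start over
        if matched == len(self.sequence):
            self.progress[src_ip] = 0
            return True                  # caller would now open the service port
        self.progress[src_ip] = matched
        return False


def opens_port(knocks, sequence=(333, 4444, 55555), src_ip="203.0.113.5"):
    tracker = KnockTracker(sequence)
    return any(tracker.observe(src_ip, port) for port in knocks)


print(opens_port([27, 333, 4444, 55555, 42]))   # the first sequence from the example
print(opens_port([333, 27, 4444, 55555, 42]))   # the second sequence from the example
```

Because progress is kept per source address, a daemon built this way also knows which address completed the knock and could open the port for that address alone.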

Thus port knocking adds the security equivalent of another "password" to a service, because a client has to successfully knock on a server's ports in the correct order in order to open a port for business. Given 2^16 possible ports, about 4 possible protocols, and (usually) about 4 necessary correct port guesses, standard port knocking comes out to (2^16 * 4)^4 = 2^72 possible knock sequences, the equivalent of a 72-bit key. This isn't too shabby a number, in terms of key size, especially if port knocking is just an extra layer of security for a service.
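The arithmetic is easy to check. The variable names below are mine; the figures (2^16 ports, about 4 protocols, about 4 knocks) are the rough estimates from the paragraph above.

```python
import math

ports = 2 ** 16      # possible port numbers
protocols = 4        # rough count of usable packet types
knocks = 4           # typical length of a knock sequence

# Each knock is one (port, protocol) choice; the sequence is `knocks` such choices.
sequences = (ports * protocols) ** knocks
key_bits = math.log2(sequences)

print("possible knock sequences:", sequences)
print("equivalent key size in bits:", key_bits)
```

Each additional knock in the sequence adds another 18 bits under these assumptions, since every knock is one choice out of 2^18 port and protocol combinations.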

Thus begins the debate on the security practicality of using port knocking.

The main advantage of port knocking is that it conceals the very existence of a service until the port knock sequence is complete. An attacker cannot attack a service he cannot find, and until he finds out how to properly knock, every port on the host will appear closed to him. Thus port knocking is very helpful in situations where a server administrator wishes to conceal the very existence of a service. Service concealment was the primary motivation behind the development of port knocking.

Another advantage of port knocking is that it allows a port, when it is opened, to only be open for connection from a specific IP address. The port knocking daemon monitors all incoming packets and when it detects the correct "knock" it can not only open up a port but can open a port that accepts packets from the knocking IP only. Thus a WAN-side service can be told to accept connections only from certain IPs, but those IPs can be decided in real time.

Unfortunately, however, port knocking is a very fragile security policy. Since the knocking packets must always be sent in the clear, it has the equivalent security of a password sent in cleartext. Anyone sniffing your network on either the client or server end (or anyone who can trick the client into sending the knock sequence to a spoofed server, e.g. via DNS poisoning) can tell what "knock" is used and replay the packets, effectively negating the security of port knocking. The good news is that unless an attacker is actively looking for a port knock sequence, the knock will look like normal boring network traffic. But if a sniffer is on the lookout for a knock sequence -- especially if they know which server it is destined to -- it's impossible to slip the knock past him.

Also, because port knocking presents the equivalent of a symmetric key problem, it does not scale well to services that must handle many individual connections. The port knocking "key" must be distributed securely to everyone who needs access to the server's service. Any cryptographer and/or security auditor will tell you that the symmetric key distribution infrastructure for this sort of thing is both annoying and brittle -- the larger the infrastructure to be maintained, the larger the hassle and the larger the potential for single point failure.

Obviously, scalability does not matter to the individual wishing to SSH into his home computer, but it is of major concern to anyone operating a server with more than four or five users, especially if users must be added, revoked, and have their "knocking keys" rotated. This is a classic (annoying) symmetric key distribution problem.

Port knocking is also impractical for popular/busy servers because the popularity of a server contradicts the very goal of port knocking: to conceal the very existence of a service on a server. If an attacker is aware of the nature of his target and knows (or at least has an idea of) what services the server has, he will not be satisfied if his port scan turns up empty. If he expects certain services to exist on the server, and if he is in any way persistent, he will investigate the server until he finds the services he knows exist, thus he will investigate the potential use of port knocking sooner or later. This does not result in instantaneous defeat of course, but the port knock is only as secure as a password sent in cleartext, which is a bad security measure to have to fall back on. The exact situation will dictate how easy the knock will be to sniff.

This quick assessment should make it obvious that the only practical use for port knocking is on small servers. Realistically, the service itself should be secure enough in both configuration and implementation to not require the additional security of port knocking -- it is an Internet service, after all -- but port knocking does add yet another level of plausible security, and the paranoid never underestimate defense in depth.

All things considered, port knocking is not too useful, despite being a fun idea. The only place I've actually observed it in use was by people at DefCon looking for any way to add any level of security to their home computers. But, obviously, those are the "I will because I can" types. Plus, when you're at DefCon you use anything to secure your home server that you can get. But outside of the experimental world, port knocking is only an interesting notion; it never sees wide-scale usage for a reason.

Note that port knocking is not the same thing as Single Packet Authentication (SPA). Port knocking was the initial attempt to gain security by service invisibility. SPA is the more secure successor to port knocking, developed to address key problems in port knocking.

Cryptography Links

  • By Brad Conte, October 27, 2007
  • Post Categories: Security

This is a list of resources on cryptography knowledge that I've compiled. The goal of this list is to cover the fundamental spectrum of cryptography and to touch on the higher mathematical end.

Index of Links

Pre-Cryptography Concepts

Fundamental Cryptography Concepts

Simple Encryption Algorithms

Encryption Algorithms

One-Way Hashes / Checksums

Cryptanalysis

Online Resources

Encryption Programs

  • TrueCrypt -- Symmetric key, disk/virtual disk encryption.
  • GPG -- Public key, multiple encryption options.
  • PGP -- Public key, multiple encryption options.
  • AxCrypt -- Symmetric key, individual file encryption.
  • DriveCrypt -- Symmetric key, whole disk encryption.
  • dsCrypt -- Symmetric key, individual file encryption (stand-alone EXE).
  • Snake Oil Encryption Software -- This isn't an encryption program, but it's a good article on how to evaluate encryption software.

Encryption Libraries

  • Crypto++ -- A C++ library under a custom, permissive license.
  • PolarSSL -- A C library under the GNU GPL license.
  • OpenSSL -- A C library under an Apache-style license.
  • Brian Gladman -- C source code for AES, SHA, and HMAC.

Cryptographer Resources

Historical Background

Cryptography Conferences:


Page info:

  • List Started: October 27, 2007
  • Last Content Update: November 15, 2009 - Removed stale links, added more links, reorganized some existing links.

Becoming a "Hacker"

  • By Brad Conte, August 24, 2007
  • Post Categories: Security

Introduction

That word "hacker" carries with it a lot of baggage. The word intrigues teenagers, scares politicians, and causes computer geeks to endlessly debate its exact definition.

Computers have created a deep, complex, intertwined world that few people are truly familiar with. A world in which most of its users can't even begin to comprehend its complexity or functionality. Even people who would be labeled as relatively computer literate, in all likelihood, don't really know much about how their computer works. Average Joe hasn't really a clue what his computer actually does or how it actually works beyond the point-and-click GUI he sees. And that's fine, because thousands of man-hours have gone into the design of everything computers do and it's unreasonable to expect Average Joe to understand a large portion of it.

But hackers make a point of knowing what happens inside a computer. What's more, they try to manipulate what happens. The combination of their knowledge and skill sets scares a lot of people (and excites others). Hackers have a knowledge base that many others do not. And they have a skill set that many others do not.

Odds are you're reading this because the concept of "hacking" interests you and you want to learn more about it. If you are an aspiring "hacker" (we'll see if you truly are in a bit), this is for you. For others, this will give some insight into the mind of the hacker. I assume that the "aspiring hacker" is somewhat familiar with computers but hasn't really touched on hacking and computer security in any depth. My goal here is to provide a first introduction to what you'll need to do to pursue "hacking". The most daunting part of any task is usually just finding a place to start, and giving you that start is my goal.

However, my goal is not to teach you how to hack; it is just to tell you where to start. First I'm going to explain what the concept of hacking entails, so that you know what you're getting into. Next I'm going to lay out the learning mentality and effort you're going to need, namely, what you're going to need to do to become a hacker. Third, I'm going to show you what your goal should be, namely, what it is you should eventually arrive at. Last, I'm going to give you a starting point so that you can go off and actually begin your studies, because that's what you probably want anyway.

I'm going to spend a lot more time elaborating on this topic than need be. A lot of good advice for the aspiring hacker could be condensed into just a couple of sentences. But all of that has been said and done before; my goal is to give the exhaustive explanation that covers every relevant conceptual topic. And since every hacker, more or less, follows a similar path of development, this can also give the non-hacker an insight into the mind of a hacker.

Remember, this is only an article about where and how to start learning, it's not my goal to actually give you your first Hacking 101 lesson.

What You're Getting Into

Hacking requires a lot of knowledge and understanding. If you just want to find one computer program you can download so that you can click a couple buttons and impress your friends by breaking some stupid thing on their computer then you aren't aspiring to be a hacker, you're aspiring to be a script kiddie. And script kiddies are to hackers what construction workers are to structural engineers. Script kiddies are out to, usually, amuse themselves. They act without purpose and rarely have any ethical standards.

First you'll need to understand what it is that hackers do. In the beginning, rules to govern the everyday aspects of computing were created. These rules are for networking, web design, programming, graphics design, etc. They govern everything that computers do and dictate how they do it. Most developers live by these rules.

Hackers, however, have what might be likened to a second layer of rules, a layer of rules built on the first layer. At the risk of using a cheesy analogy, the normal rules of computing can be described by the line from The Matrix where Morpheus tells Neo (in the sparring chamber): "Like any rules, they can be bent, others can be broken." That's fairly true of computing. Some of the normal rules of computing can be manipulated, others can be completely bypassed. The realm of hacking lies on the second layer of rules, rules that are based on the first layer and can manipulate the functionality of the first layer.

Note the lack of the phrase "breaking into computers" in my definition of "hacking". Hacking is about learning, changing, and manipulating. Whether you use those skills to break into someone else's computer is a separate, unrelated issue. The mainstream media uses the buzzword "hacker" too narrowly, the roots of the term are far more broad than what modern mainstream usage implies.

For example, take the ARP networking protocol. It doesn't matter if you know what ARP is or how it works, suffice it to say that it's a networking protocol. ARP is a "first level" set of rules that govern how computers communicate on a network. Every network administrator must be familiar with how it works. However, a hacker knows how ARP works and knows how to use those rules to perform a different task than the goal of the original rules. Using the ARP protocol is standard computer networking. Manipulating ARP to do something you want it to is hacking. ARP is a normal rule, but it is a rule you can build from and it is a rule you can manipulate.

Thus in computer security/hacking it should be obvious that it pays to know the normal operating rules. Everything hackers do is based on the normal rules and unless you understand those rules you won't be a decent hacker. You're going to have to study how things work, why they work, learn how they work successfully under normal conditions, learn how they fail under abnormal conditions, and then figure out how to use them abnormally and make them achieve different goals.

So to begin your hacking endeavors you're going to have to do a lot of reading and a lot of question asking. If you already assumed this then you're in the right frame of mind. If you thought you just needed to find those one or two magic programs that would let you crash Windows boxes by the second day, rethink your plans.

How to Learn

In brief, you're going to need to read. A lot. Everyone's learning style is different and there are a lot of perfectly valid learning methods, but all of them will include a lot of reading.

Let me offer this bit of advice: Do not start by reading RFCs or any other sort of extremely technical specification. (RFCs are technical papers describing many common computing standards.) Some people will give you that advice and, frankly, I don't think it helps very much when you're very new to hacking. Instead, start by reading articles/essays/tutorials that are more than a listing of facts and specifications. Read what one person has to say on a topic, then go read technical documentation if you want to know very specific details about it. RFCs and other technical documentation can be complex, and sometimes downright unhelpful to a beginner. They make great resources to consult later, however.

To start your reading, get a browser that has tabs and makes both searching the Internet and searching a rendered web page easy (such as Firefox). Then find an online article about some security topic that interests you and seems roughly at your level of understanding.

There are two things I would like to emphasize from just that last sentence. First, I would like to stress that you find something roughly on your level. You will kill your ambition if you try to understand and dissect concepts too far over your head. Some stuff is very complicated, don't discourage yourself by trying to take it on before you're ready. You'll get frustrated trying to bench press a 500lb weight on your third day at the gym, it may be out of your league at the moment and just isn't worth your time trying to lift. This isn't to say you should forget about any complex topic you come across, you should just make a note to come back to that topic later when you know more and can assimilate information on that complex topic better. Go spend some serious time researching and learning what you need to in order to understand that topic and come back to it when you're ready. If it takes a day before you're ready, great. If it takes a month, that's fine too. The important thing is that you're learning. You don't get any sort of points for reading the articles you originally decided to, you get points for what you learn.

My second point may seem a bit obvious but I do encourage that you start your research online, as opposed to going out and buying a hacker book or magazine. The newer you are to hacking the less you'll know, and the less you know the more likely you are to find heaps of material about what you're looking for online for free anyway, so it may not be worth buying hard copy material. Plus the less you know the more you'll need to look up as you read. It's easier to look stuff up on the fly if you're on the Internet. Every new golfer likes to go out and buy their first new set of clubs, but that mentality doesn't work as well here. You will likely not need to spend a dime for anything (other than a computer and/or basic accessories) for a long time. Most stuff is available, in some form, for free.

Anyway, find an article on something that looks like material you want to learn. Read it and make note of all the words or terms you don't recognize. As you find them, highlight them, right-click them with your mouse, and search Google for them (one of the nice features of Firefox). All of them. If you have to open 20 new tabs from one article then that's fine, remember, you only get points for what you learn, not for whether you finish the initial article you started to read. Sometimes you'll come across a tutorial, paper, or audio lecture that does nothing but provide you with a long list of topics to go research. You may not be able to make much sense of it for months, but that's OK.

Read on all of those topics you just searched Google for. You don't have to read just one article per topic, if the article you read seemed short and didn't satisfy your curiosity then read another article to make sure you're not missing anything important. If those articles themselves have terms you don't know, look those up too. Make it a habit to instinctively always look up terms or concepts you don't know. The number of open tabs you have can balloon quickly but that's OK, you're out to learn.

A word of sympathy to those who find themselves with dozens of open tabs and dozens of topics that need to be researched: It can get very big very fast and you will sometimes have to call it quits on a topic. Don't be a wimp, though; call it quits only when you've truly hit a wall or are getting into a subject that truly bores you. Even if you don't learn the details about a subject, just learn the vague idea behind it. For example, even if you don't fully understand how ARP poisoning works and how to do it yourself, understand what it allows you to do, so that you know what it means the next time you hear the term. You can learn how to perform an ARP poison attack some other time, and when you do learn, it will be easier if you already understand what it is. Be flexible with what you learn in depth. If something frustrates or bores you, move on to something else -- there are plenty of interesting topics to study. (If everything bores you, you might not be cut out for it after all.)

Which topics should you be pursuing more than others? This is up to you. There are dozens of topics you can research and you'll have to determine for yourself which ones you spend the bulk of your time pursuing. Almost everyone will recommend that you have at least minimal well-rounded knowledge in most fields but that you find some topic that specifically interests you and you pursue it.

As you read more and more articles, you should come across fewer and fewer terms you don't understand. You will always run across unfamiliar terms -- no one knows everything -- but the quantity of those terms should diminish with time. Even if you don't necessarily remember every acronym you read (there are a lot of them) a quick Google search for one you've forgotten and a quick glance down the results page should spark the "Duh, now I remember that," light bulb.

Remember, never hesitate to use Google (or your search engine of choice), no matter what your question is about. If you're new to hacking and seriously trying to learn it would not be unreasonable for you to be executing 15 Google queries an hour during any given period of research. (This wouldn't be unreasonable for a veteran to do either if they're delving into new territory.) Google isn't a sign of weakness. And it's free. Use it. Often.

There will be some questions you have that Google doesn't answer, or doesn't answer to your satisfaction. When this happens, ask someone who might know. The best way to do that is to post your question to a hacker / computer security forum.

At some point you're going to need to start doing some actual hands-on work. All knowledge and no experience makes Jack a condescending pseudo-guru. If you want to be a hacker you're going to have to actually do something sooner or later. Feel free to experiment on your own computers on your own network. Port scan your desktop various ways from your laptop, try an ARP attack against your laptop from your desktop, etc.

And remember, for the sake of all that is good in this world, don't attack computers you don't own. It's tempting, but don't do anything against other computers; you'd be surprised how easily people can get upset. And I've seen enough hackers more competent than you will likely be who got into legal trouble to last me a lifetime. It seems tempting to try, but it's not worth it.

The obvious question is, "When do I actually start doing hands-on work?" This is up to you. Some people prefer to research to their heart's content before touching any tools, some prefer to experiment with tools as they learn everything. Do whatever helps you learn the most, but always research a concept at least a little bit before you try it out -- I'm not a big fan of "try it then learn it". Otherwise you won't know what you're doing, you might get confused, and you'll waste your time. And, most importantly, if you're really unlucky you'll break something by accident. If you don't know what your tools are doing and how they work when you use them, you're more of a script kiddie.

When you do start doing hands-on work is when you might want to start reading technical manuals. I assumed that you started very new to hacking, but by the time you're attempting to try things you should be able to read and understand RFCs and similar documentation on the subjects that you're experimenting with. Because by then you're more concerned with the more nitty-gritty details of how stuff works, and then the mumbo jumbo in RFCs and other technical documentation should make sense.

I would also advise that you focus more on technical understanding than on technical memorization. It's important to know how you can use ARP for a man-in-the-middle attack. It's much less important to be able to construct an ARP packet header from memory. If you ever need to do that, you can look it up. If you understand how something works, you can get the precise details when you need them.
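To illustrate how little there is to memorize, here is a minimal Python sketch of what "looking it up" gives you: the field layout of an ARP request for IPv4 over Ethernet, packed with the standard struct module. The function name and the MAC and IP addresses are made up for illustration, and the code only builds the bytes; it does not send anything on a network.

```python
import socket
import struct

def build_arp_request(sender_mac, sender_ip, target_ip):
    """Pack a 28-byte ARP request for IPv4 over Ethernet (field order per RFC 826)."""
    return struct.pack(
        "!HHBBH6s4s6s4s",                             # network byte order, no padding
        1,                                            # hardware type: Ethernet
        0x0800,                                       # protocol type: IPv4
        6,                                            # hardware (MAC) address length
        4,                                            # protocol (IPv4) address length
        1,                                            # opcode: 1 = request, 2 = reply
        bytes.fromhex(sender_mac.replace(":", "")),   # sender MAC
        socket.inet_aton(sender_ip),                  # sender IP
        b"\x00" * 6,                                  # target MAC, unknown in a request
        socket.inet_aton(target_ip),                  # target IP
    )

# Hypothetical addresses, chosen only for the example.
packet = build_arp_request("02:00:00:00:00:01", "192.168.0.10", "192.168.0.1")
print(len(packet), packet.hex())
```

The point is not the code itself but that the whole layout is nine fields you can read off a reference when you need them. Understanding what a forged sender address in those fields would do to a network is the part worth learning.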

Where You're Headed

In your endeavors to learn about hacking, eventually you should hit a point where your tools stop dictating what you do and you start dictating what your tools should do. You should be continuously learning and feeling more in control of what you do, and eventually you should hit a point where you know that "I need a program that explicitly allows me to perform this specific function," and you go to Google and type in a precise seven word query looking for such a program. It may not exist, so you may have to hack something together using other tools. Or, better yet, you may find yourself writing the tool yourself.

In other words, eventually the worker is going to have to start buying tools that will build what he wants and stop building what his tools will allow him to. He needs to design a building that he likes, then he needs to go find tools that will allow him to do that.

Some people don't understand this concept and live in an infinite loop of never doing anything that their four favorite tools can't do. Don't let this be you. Govern what your tools do, don't let your tools govern what you do. This isn't to say you can't find some nice, multi-functional tools out there, but don't just download a couple programs with pretty interfaces and stick with just those, you'll hold yourself back.

Usually hackers will eventually learn at least one or two programming languages, so that they can write at least small programs themselves. I would advise you to learn at least some programming. Even if you don't do much with it, learn at least one or two languages semi-fluently. Most security-oriented hacking requires knowledge on the level of C, and learning a scripting language is helpful for automating tasks.

A Starting Point

Enough advice. You need to start doing something. Where do you start? My first bit of advice involved finding those first articles of interest to read and branch out from, but how do you find those articles?

Start with the following concepts and vocabulary words. What I provide is a list to get you started researching computer security/hacking. Remember, don't just read one article on each of these topics and call it quits, use these topics to start your Google research from. Read on these topics and everything related to them, then everything related to those topics, and related to those topics, and so on. This list is just to help you figure out what your first Google queries should be. The items on this list were not chosen to give you a comprehensive grasp of hacking but to rather give you a starting point in the most important fields. If you want to hack, you're going to need to branch out into all the different fields from these starting points.

  • Man in the middle
  • SQL injections
  • Packet sniffing
  • ARP poisoning
  • Buffer overflows
  • SSH
  • Public-key and symmetric-key cryptography (RSA, AES, DES, Blowfish)
  • Cryptographic Hashes (MD5, SHA1, SHA2)
  • Proxies
  • DOS attacks
  • Reverse engineering
  • Worm / virus / trojan
  • Router / hub / switch
  • Networking OSI (TCP, IP, UDP, ICMP, ARP protocols)
  • Port scanning (stateful filters, half-open scans, open/closed/filtered ports)

Hackerthreads.org has a "start here" thread with a collection of links to actual articles on topics such as these for newbies. If you're looking for material to read on a topic, or material to read in general, you can start there.

When you want to start actually doing stuff, you're going to need tools. The following are interesting/handy tools of the trade. Remember, don't think that you have to stop learning when you start using tools. You can learn a lot from the tools you use. Visit a tool's homepage and read up on what it does. More importantly, if the tool offers a complex function/feature, read articles on how it does what it does. Most tools come with their own tutorials or manuals that explain what the tool does and provide some descriptive information about why what it does works. Read them.

  • wireshark
  • nmap
  • nessus
  • ettercap
  • nemesis
  • hping
  • dsniff
  • Cain & Abel

For a larger list of useful tools, see the list of tools included in Backtrack 2.0 (a Linux-based security-auditing oriented OS) and the tools included in Arudius (another security-oriented Linux live CD).

Unfortunately, once you start using tools reality will kick in and you may have to decide which operating system you're going to use. Up to now I've said nothing specific to any particular operating system or software, but unfortunately not all tools work on all operating systems. Most of them can be run on both Linux and Windows, and a lot of Linux programs can be run on Windows with Cygwin (which requires Linux knowledge to use effectively), but the deeper you get into actually doing stuff the more OS-specific some things are going to get. I'm not going to officially endorse one operating system over another because an operating system is itself just another tool, but I encourage you to do hands-on research and to select your favorite.

A list of recommended operating systems to try would include one or two basic systems from each major family: Ubuntu/Debian, Fedora, or Arch for Linux; FreeBSD for the BSDs; and the popular Windows and Mac operating systems. Many more Linux and BSD distros exist, but if you're new to hacking then odds are that you're not looking for the more complex/powerful ones; that was just a beginner's list for those who don't know what to try. (If you know which distro you want to use, you don't need a recommended starting point. Use what you already like.) Feel free to hop around operating systems for a time. When you find one you like, and you know why you like it, stick with it. Practically, your operating system will limit what you can do, so I advise that if you decide to stick with Windows, you at least give Linux a serious try. There's a lot to be learned from it, and like all Unix-like systems it will inherently be more hacker-friendly. But the choice is yours. Use what works. Better yet, keep more than one OS around and use multiple OSs that work.

Good luck in your studies. Have fun with them. Be responsible with your knowledge. And when you can, contribute back to the global hacking community that's provided you with all the information, articles, and tools you've been able to utilize.

2Wire's Weakened WEP

  • By Brad Conte, July 25, 2007
  • Post Categories: Security

It's a well established fact by now that the security that 64-bit WEP encryption offers a WiFi network is small, in the same sense that the Pacific Ocean is big. This is especially true of late, as recent months have unveiled multiple attacks against WEP that, figuratively, kicked it while it was down, making the popular personal network encryption scheme even more trivially broken by hackers. Various network configurations, network protocol flaws, and mathematical breakthroughs have chipped away at the effectiveness of WEP encryption to the point where it can reasonably be broken within 20 minutes by a skilled attacker.

However, while everyone is out looking for ways to use replay attacks and better mathematical algorithms to break WEP faster, I happened to note that router manufacturer 2WIRE is making its own odd effort at rendering WEP less effective.

It's becoming a more standard practice these days for ISPs to enable WEP by default when they install a wireless router for a new Internet customer. And what with the popularity of wardriving and P2P file-sharing lawsuits you can't blame them. WEP isn't the most secure wireless encryption solution but it's the easiest one and it makes the ISPs look like they're doing something. But for as insecure as WEP is by nature, 2WIRE does something that compromises the effectiveness of WEP on their wireless routers in a new way.

When the SBC Yahoo! ISP hooks up a customer's Internet service the customer is provided with a 2WIRE wireless router with 64-bit WEP turned on by default and an SSID that starts with "2WIRE". On the outside of the physical router there's a white label with the router's default WEP key printed on it for the convenience of the customer (and tech support). However, 2WIRE makes a crucial mistake with both the default WEP keys and all WEP keys generated by the router. If you're used to dealing with WEP keys, one quick look at just a couple of default WEP keys for 2WIRE routers should tell you there's something wrong.

The keys have no letters. Just numbers. All 2WIRE WEP keys are composed purely of numbers.

This means that every character in the WEP key uses just 10 of its potential 16 values. If you aren't convinced of how significant that is, then let's see how large the actual keyspace (namely, the effective strength) of a 64-bit WEP key is if you exclude the "letter range" from the hexadecimal key. We're only going to examine the brute-force aspects of this.

(Note: The following will assume familiarity with binary and hexadecimal.)

Start with the full 64 bits of the WEP key (as 2WIRE uses 64-bit, rather than 128-bit, WEP by default). The first 24 bits of the key are merely the IV, which is not an effective part of the key. This was originally done in accordance with the US government's cryptography export regulations which at the time prohibited the export of encryption technology stronger than 40 bits, so this isn't 2WIRE's fault. Thus, by definition, 64-bit WEP only has 40 bits of actual effective keyspace. (Hence the reason 64-bit WEP is often referred to as 40-bit WEP.)

If the entire 40 bit range of keyspace were used, the key would still be relatively small. 40 bits is nothing by today's standards -- no one uses anything less than 128 bits if they're serious about the security of their encrypted data. But 2WIRE chips away at even those 40 bits.

The remaining 40 bits are the equivalent of 5 bytes. Each byte is represented by two hexadecimal characters (each hexadecimal character representing four bits of the byte). These 10 hexadecimal characters will compose the final human-readable WEP key. Each hexadecimal character has a range of 16 values (because it is 4 bits in length). However, by restricting a WEP key to be composed of only numbers, each character only has a range of 10 possible values. It only takes log_2 (10) = 3.322 bits to have 10 possible values in binary, thus we only have 3.322 effective bits of key per character.

Now instead of having 10 * 4 = 40 bits of keyspace, we are left with 10 * 3.322 = 33.22 bits of keyspace. This 17% reduction in effective key length may not seem that critical, but remember that each bit of keyspace doubles the strength of the key. This is because the strength of the key is expressed as the number of combinations that would have to be tried to guess the key, given its length and value restrictions.

With forty bits of keyspace, we have 2^40 = 1,099,511,627,776 possible combinations. This isn't large in the world of cryptography, but it's annoying enough that likely no one would want to spend a couple of weeks breaking it on their computer -- not if the payoff was simply access to a personal network. However, contrast that number of combinations with the number of combinations we get from just 33.22 bits of keyspace, namely, 2^33.22 = 10,004,985,324 possible combinations.
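To double-check the arithmetic, here is a minimal Python sketch of the keyspace calculation. The constant and function names are my own, chosen for illustration; the only inputs are figures from the text (a key of 10 hexadecimal characters, 16 possible values per character normally, 10 when only digits are used).

```python
import math

KEY_CHARS = 10       # a 40-bit key is written as 10 hexadecimal characters
FULL_ALPHABET = 16   # 0-9 and a-f
DIGITS_ONLY = 10     # 0-9 only, as on the 2WIRE keys described above

def effective_bits(alphabet_size, length=KEY_CHARS):
    """Bits of keyspace for `length` characters drawn from `alphabet_size` symbols."""
    return length * math.log2(alphabet_size)

def keyspace(alphabet_size, length=KEY_CHARS):
    """Exact number of distinct keys."""
    return alphabet_size ** length

full_bits = effective_bits(FULL_ALPHABET)
numeric_bits = effective_bits(DIGITS_ONLY)

print(f"full hex key:    {full_bits:.2f} bits, {keyspace(FULL_ALPHABET):,} keys")
print(f"digits-only key: {numeric_bits:.2f} bits, {keyspace(DIGITS_ONLY):,} keys")
print(f"bits lost:       {full_bits - numeric_bits:.2f} "
      f"({(full_bits - numeric_bits) / full_bits:.0%} of the key length)")
print(f"keyspace ratio:  {keyspace(FULL_ALPHABET) / keyspace(DIGITS_ONLY):.0f}x")
```

Computed exactly, the digits-only keyspace is 10^10 = 10,000,000,000 keys. The slightly larger figure quoted above comes from rounding the exponent to 33.22 before raising 2 to that power; the difference doesn't change the conclusion.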

Yes, those are both big numbers, but count the commas in them and note that the first is over 100 times larger than the second. Now, assume that a hypothetical attacker wants to launch a brute force attack against a key of 40 bits and a key of 33.22 bits. Further assume that his computer can make 500,000 attempts per second, which is not unreasonable for a home computer. (Remember, even if the attacker has a weak laptop on site he may still have a powerful desktop he can use remotely to do his hard work.) So the hypothetical attacker captures a few encrypted packets from your network then goes to work brute forcing them.

With his assumed computing speed, it would take 25.5 days to brute force the 40 bit key, but only 5.6 hours to brute force the 33.22 bit key. And those are the worst-case scenarios; note that we are assuming that the correct key is the very last one the attacker guesses. Statistically the attacker has a 50% chance of having guessed correctly by the halfway point.
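The time estimates follow directly from dividing the number of keys by the guessing rate. A short Python sketch, using the assumed 500,000 guesses per second from above (an assumption of this article, not a benchmark) and helper names of my own:

```python
GUESSES_PER_SECOND = 500_000   # the article's assumed rate, not a measurement

def worst_case_seconds(num_keys, rate=GUESSES_PER_SECOND):
    """Time to try every key, i.e. the correct key is the last one guessed."""
    return num_keys / rate

def describe(seconds):
    """Format a duration in days if it is long, otherwise in hours."""
    hours = seconds / 3600
    return f"{hours / 24:.1f} days" if hours >= 48 else f"{hours:.1f} hours"

for label, num_keys in (("40-bit key", 2 ** 40), ("digits-only key", 10 ** 10)):
    worst = worst_case_seconds(num_keys)
    # On average the correct key turns up halfway through the search.
    print(f"{label}: worst case {describe(worst)}, average {describe(worst / 2)}")
```

Changing `GUESSES_PER_SECOND` scales both results by the same factor, so the roughly 110-fold gap between the two keys stays the same no matter how fast the attacker's hardware is.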

Now is that 40 - 33.22 = 6.78 bits of difference in keyspace looking more important? An attacker started needing to devote nearly a full uninterrupted month of computer processing time to the attack and has downgraded to just needing to leave his computer working while he goes out to a movie.

In summary:

  • 2WIRE made a decision to only use numerals in their customers' default 64-bit WEP network setup.
  • In doing so, the necessary time for launching a brute force attack against such a network is decreased from about 3.6 weeks to about 5.6 hours.
  • These networks are easily identifiable via a default SSID that starts with "2WIRE".
  • Despite the advantages this compromise of keyspace gives the attacker, there are still many faster (but more complicated) ways to break a WEP network.

In conclusion, allow me to compare this new way we now have of attacking a 2WIRE WEP network with the traditional way that works against all WEP networks.

The traditional way:

  • is faster (break times of 20 minutes are possible)
  • requires active attack (attacker must collect packets for a long time)
  • requires more specific hardware/firmware (requires injection mode)
  • is more complicated / less reliable

The 2WIRE-specific brute force way:

  • is slower (hours rather than minutes)
  • does not require active attacks, a few packets can be saved and attacked later
  • requires less specific hardware/firmware (only monitor mode required, this is very standard)

There's no question about it, 2WIRE WEP is significantly worse than standard WEP.

However, I would speculate that it's unlikely that this weakness will turn out to be of much, if any, consequence. The thing is, for as bad as 2WIRE WEP is, WEP's inherent weaknesses are worse. Serious attackers will have better ways to get into encrypted networks, so they're unlikely to care about this. The people who would benefit most from this attack would be the less-serious attackers, but there exist no automated tools (yet) for launching this specific attack -- and odds are probably decent that none ever will. This flaw is too specific and overshadowed by too many greater flaws to receive much attention. Writing the tool, though, would be trivial -- the aircrack-ng suite already contains all the necessary functionality.

This kind of security bungle would make any security engineer cringe. Such a poor configuration would never fly (I hope) at the U.S. DoD or the NSA, but I don't think it impacts Joe Schmoe that much.

SBC Yahoo!/2WIRE got off easy with this poor decision because the serious weakness they introduced was no worse than weaknesses that already existed. They definitely dodged the bullet on this one. However, had WEP not been so critically broken before 2WIRE's mistake came on the scene, I guarantee that much more attention would've been focused on their flaw.

Finally, 2WIRE's decision to only use numbers in WEP keys is itself somewhat puzzling. I don't know if all 2WIRE routers are this way or if SBC Yahoo! made a deal with 2WIRE for this functionality in order to ease up on tech support calls. Regardless, my guess is that it was a tech support problem having to do with letters being in the keys. Average consumers probably naturally associated security codes with numbers and were getting confused to find letters amidst their WEP keys.

And in the end, that's the heart of security: tradeoffs. Some are good, some are bad. Depending on the resources SBC Yahoo!/2WIRE saves because of the decision, it may even be worth it. I just know they got lucky.

Another interesting factoid for the archives of 802.11 wireless security.

Using TrueCrypt for the First Time

  • By Brad Conte, January 14, 2006
  • Post Categories: Security

Note that this tutorial was written for TrueCrypt v4.1. Since then, many things have changed in TrueCrypt but the principal features discussed in this article remain the same. Also note that I do not attempt to explain the concept of cryptography. Read up on the fundamental concepts of cryptography before you proceed if it is unfamiliar to you. The tutorial will assume that you understand the basic concepts and importance of cryptography.

Introduction

From bank account information to intimate personal letters, from company secrets to diaries, and from incriminating documents to MP3 collections -- we all have things that we'd prefer be kept private to us, and to us alone. To protect such things in the physical world, we use strong safe boxes that open only when presented with the correct key. Protecting such data in the digital world is a very similar concept.

In this tutorial I will examine how to encrypt files using a program called TrueCrypt. TrueCrypt is a free, open-source encryption program that has won the respect of the general cryptographic-savvy public, and recognition by cryptographic experts such as Bruce Schneier. It is currently maintained by a group of anonymous programmers who have shown themselves to be quite crypto-savvy over the time they have managed the TrueCrypt project.

This tutorial has been split into several main sections.

Hopefully I will address everything you need to know in order to use TrueCrypt comfortably and with confidence. Contact me if I am too vague or unclear at any point in this tutorial, so that I can fix it if necessary.

General Overview

Unlike the physical world, in the digital world it is impossible to create a literal physical safe box to store data in. Instead, we create what you might call a "virtual" safe box. In the physical world, objects are placed in a single container, such as a safe. This safe acts as a singular entity and binds all the objects in it together, separating the objects from the rest of the world by strong walls, walls that will open only when presented with the correct key. Likewise, in the digital world, data placed in a "virtual box" will be placed into a single container which acts like a safe. This "virtual safe" acts as a singular entity and binds all the data in it together. It separates the data from the rest of the world by using an algorithm to scramble the original data values so that they are not recognizable unless you have the right key to unscramble it.

This safe box concept is the idea behind encrypted volumes, which is the method TrueCrypt uses for encryption. There are programs that exist for encrypting a single file individually, but managing encrypted files individually is not always reasonable. Oftentimes many, many files must be encrypted, and they must be viewed, edited, added to, and subtracted from, frequently. Each file could be manually managed, but it would take a lot of time (and could even cause technical difficulties) to do so. And when you have, say, an entire hard drive full of files that need to be encrypted, it's not even humanly possible to attempt to manage them individually. Thus, the solution is to mass-manage them together in one encrypted safe box, a.k.a. an encrypted volume.

There are two types of encrypted volumes: files and partitions. With a file, the encrypted volume will be nothing but an ordinary computer file containing the encrypted data placed in it. This file can be copied across drives, downloaded, anything that can be done with a normal computer file. (You could think of it as being basically just like a ZIP or RAR archive, as the concepts of encrypted volumes and compressed archives are very similar.) With a partition, the encrypted volume will be a literal partition on your hard drive, and it will behave just like one.

Don't be intimidated by the fact that you'll be using volumes -- dealing with encrypted volumes is very simple.

First, you choose the volume you wish to encrypt, whether it be a file or a partition. Then you specify some of the details you want to use (more on those later). Most importantly, you specify a password key that will be used to encrypt the volume. This key will not be stored in any way in the volume, so it is unrecoverable. TrueCrypt then creates the specified volume with the details you provided, encrypts said volume, then writes some encrypted data to the header section of the volume. Of specific interest, in the header there exists something called the "master key". This master key is what is actually used to encrypt the contents of the volume. The key you entered is used to decrypt the master key, and the master key is used to decrypt the volume. (This means that if you change your key, the entire volume does not have to be decrypted and re-encrypted with the new key; only the small header part with the master key needs to be re-encrypted.)
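The relationship between the password key and the master key can be sketched in a few lines of Python. This is only an illustration of the idea, not TrueCrypt's header format: the function names, the dictionary layout, the salt size, the iteration count, and the XOR "wrapping" are all assumptions made for the example, and XOR stands in for a real cipher.

```python
import hashlib
import os
from typing import Dict, Tuple


def derive_header_key(password: str, salt: bytes, length: int = 32) -> bytes:
    """Stretch the user's password into a fixed-length header key."""
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, 2000, dklen=length
    )


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Toy stand-in for a real cipher: XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def create_header(password: str) -> Tuple[Dict[str, bytes], bytes]:
    """Generate a random master key and keep it only in wrapped form."""
    master_key = os.urandom(32)
    salt = os.urandom(16)
    wrapped = xor_bytes(master_key, derive_header_key(password, salt))
    return {"salt": salt, "wrapped_master_key": wrapped}, master_key


def unlock_header(header: Dict[str, bytes], password: str) -> bytes:
    """Recover the master key from the header using the password."""
    header_key = derive_header_key(password, header["salt"])
    return xor_bytes(header["wrapped_master_key"], header_key)


def change_password(
    header: Dict[str, bytes], old_password: str, new_password: str
) -> Dict[str, bytes]:
    """Re-wrap the same master key; the volume's contents are never touched."""
    master_key = unlock_header(header, old_password)
    salt = os.urandom(16)
    wrapped = xor_bytes(master_key, derive_header_key(new_password, salt))
    return {"salt": salt, "wrapped_master_key": wrapped}
```

Note that in this toy version a wrong password silently yields a wrong master key. A real header would also carry some way to verify that the password was correct.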

After the volume has been created and encrypted, you can easily use it by mounting it with TrueCrypt. Mounting a volume is essentially telling the operating system to treat that volume as an actual disk partition, allowing you to access and manage it just like a normal partition. To mount a volume, all you have to do is select the volume and provide your original key. Once the volume has been mounted, it will appear as a normal drive on your operating system and you can treat it just like one in all regards. You can copy files to it, delete files from it, edit files in it, run programs from it, etc. As far as your operating system is concerned, this drive is just like any other drive it manages.

The mounted drive is nothing more than an interface to the encrypted volume, be it a file or an actual device. Thus, all data in the mounted drive actually resides in the encrypted volume. If your encrypted volume is a file, for example, whenever you move data to the mounted drive it is actually being moved into that file.

Encrypting and decrypting data on a volume is as simple as moving the data to and from the mounted drive, just like you would with any normal drive. When a volume is mounted, TrueCrypt acts as a middleman between the operating system and the mounted volume, similar to how virtual disk drive emulators, such as Daemon Tools, work. When data is saved to the drive, TrueCrypt encrypts it before saving it to the volume. When data is requested from the drive, TrueCrypt decrypts it before giving it to the operating system to give to you. It's drag-n-drop simple.
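The "middleman" idea can be modeled with a toy class: callers read and write plaintext, while the backing storage only ever holds scrambled bytes. The class name, the 32-byte sector size, and the SHA-256 keystream are all assumptions for illustration. This is not how TrueCrypt encrypts sectors, and reusing a keystream like this would not be secure in practice.

```python
import hashlib


class ToyEncryptedVolume:
    """Toy model of a mounted volume: plaintext in and out, ciphertext at rest."""

    SECTOR_SIZE = 32  # bytes; chosen to match one SHA-256 digest

    def __init__(self, master_key: bytes, sector_count: int):
        self._key = master_key
        # The "volume" itself: this is what would sit on disk.
        self.storage = bytearray(self.SECTOR_SIZE * sector_count)

    def _keystream(self, sector: int) -> bytes:
        # Hypothetical per-sector keystream derived from the key and sector number.
        return hashlib.sha256(self._key + sector.to_bytes(8, "big")).digest()

    def _xor(self, sector: int, data: bytes) -> bytes:
        return bytes(a ^ b for a, b in zip(data, self._keystream(sector)))

    def write_sector(self, sector: int, plaintext: bytes) -> None:
        """Encrypt on the way in, then store."""
        if len(plaintext) != self.SECTOR_SIZE:
            raise ValueError("plaintext must be exactly one sector long")
        start = sector * self.SECTOR_SIZE
        self.storage[start:start + self.SECTOR_SIZE] = self._xor(sector, plaintext)

    def read_sector(self, sector: int) -> bytes:
        """Load, then decrypt on the way out."""
        start = sector * self.SECTOR_SIZE
        stored = bytes(self.storage[start:start + self.SECTOR_SIZE])
        return self._xor(sector, stored)
```

A caller would write a 32-byte block with `write_sector` and get the same bytes back from `read_sector`, while `storage` never contains the plaintext.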

How To Create Encrypted Volumes

This is a walk-through for creating an encrypted volume, regardless of whether the volume is a file or partition. Windows users, note that you must have admin privileges to use TrueCrypt.

  1. (This first step is only for file volumes. If you are using a partition, skip this step.) Create a file to use as the encrypted volume -- it can be a new text document, a recycled old PDF, it doesn't matter. Just note that the file's original contents will be overwritten and lost. This file you create/choose will be the file you will use as your encrypted volume.
  2. Open TrueCrypt and select the "Create Volume" option. (Click "Next".)
  3. Select the "Create a standard TrueCrypt volume" option. (It is possible to create a "hidden" volume only if you are creating it in a location where a standard volume already exists. More on hidden volumes follows later.) (Click "Next".)
    • File volume: Click "Select File" and find the file you created in the first step.
    • Partition volume: Click "Select Device" and choose the drive/partition you wish to encrypt. You can select an entire drive or just a partition. However, if you wish to encrypt an entire drive, it is recommended that you first create a normal partition on the entire drive and then encrypt that partition, rather than encrypting the drive itself directly. There is no difference security- or usability-wise, but it can avoid problems where Windows will automatically try to initialize a disk that it doesn't detect to be formatted.

      Note that when creating a volume, all data on the drive or in the file you choose will be lost. (Click "Next".)

      (A more comprehensive comparison between the advantages/disadvantages of using file volumes and physical device volumes can be found later in this tutorial.)

  4. Select the encryption algorithm you wish to use. While selecting the perfect algorithm is a very complex subject, suffice it to say that there is no wrong choice here because all algorithms TrueCrypt employs have been professionally created, tested, and approved. The "official" recommendation, however, is AES, as it is currently accepted by the general security community as the most secure algorithm in today's world, with Twofish probably being the closest runner-up. You will note that some of the algorithms consist of two names separated by a dash, such as "AES-Blowfish" -- these options mean that both of the algorithms will be used in the order they are listed. While using two, or even three, layers of encryption is unnecessary, it may be a prudent precaution for anyone who either knows they have very smart and powerful adversaries, or is just plain paranoid.
  5. Below the choice of encryption algorithms, there is a choice to select the hashing algorithm to be used. Again, there is no wrong answer. These hashes won't be used to actually store or authenticate data, so don't worry about that. (The fact that SHA-1 has been somewhat compromised is not of any concern at all in this specific context.) (Click "Next".)
  6. Next you will need to determine how large the encrypted volume should be. This size will be permanent and cannot be changed, so choose a size that provides as much space as you may ever need. (Unused space on the volume will always be filled with random garbage, so if you're dealing with a file volume its size will always be the same, regardless of how much data you're actually storing in there.) (Click "Next".)
  7. Now you will create a key to use to encrypt your files, the step you no doubt have been anticipating all along. (This key is what will be used to encrypt the master key, which is generated later.) This is, obviously, the most critical step to the security of your encrypted volume, as your goal is to select a key that won't be breakable by an adversary, yet still something you can remember. When creating a key, you obviously have the text portion of the key, but TrueCrypt also allows you to mix the contents of a normal file in with your key, so that your text key combines with the file's contents to yield the final key. Thus both your text key and the file will be necessary to decrypt the volume. If you opt to use a file, check the "Use Keyfiles" box then click the "Keyfiles" button. Use the "Add File" button to add individual files to the keyfile list or use the "Add Path" button to add entire folders of files. If you want to generate a new file just to serve as a keyfile, click "Generate Random Keyfile" in the bottom-right corner, save the new file, then select it with "Add File". (Click "Next".)
  8. Finally, choose which file system and cluster size you wish the volume to use to store data. Unless you're familiar with file systems and cluster sizes, I'd recommend keeping the cluster setting at whatever TrueCrypt recommends by default. The only security difference between FAT32 and NTFS is that NTFS does not support hidden volumes, so you can't add one later.
  9. Below the file system settings, there will be a random data pool with a long hexadecimal string that keeps changing. This is some of the random data that will be used in the encryption process and the master key that will be used for encryption. The master key will be automatically managed and encrypted by your user-specified key; all you have to remember is the text and whatever keyfile(s) you used for your key. All mouse and keyboard activity you generate in the window during this time will add to the entropy (randomness) of the data, so be sure to wave the mouse around at least a few times to help generate unique entropy.
  10. Next there is a checkbox that gives you the option to perform a quick format. A quick format will only initialize the file system of the volume and will be much faster. Leaving the quick format choice off will perform a full format, in which random data will be written to every bit (literally) of the volume. Doing this ensures that, at a later time, an attacker looking at the contents of the encrypted volume will not be able to tell how much data is in the volume and where it's stored, because encrypted data looks exactly like random data. Not performing a full format means that it is very likely the unused portions of the volume will not contain random-looking data, and an attacker will be able to make decent guesses as to how much data is stored in the volume. This may not seem like a big deal, but the smallest bit of information can sometimes be much more than you want an adversary to know. For example, if they know you're only storing one file in the volume and they can figure out exactly how big it is, that may tell them everything they need to know about it. Always perform a full format unless you know that the volume is already full of random data, such as when you are re-formatting an existing volume.
  11. When you are done, click "Format" to create the new encrypted volume. When the format process has completed, click "Exit" if you do not wish to encrypt another volume, or "Next" to create another volume using these same steps.
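The full format described in step 10 can be pictured as overwriting every byte of the container with random data before anything is stored in it. The sketch below does that for a file volume of a fixed size. The function name and chunk size are hypothetical, and it only mimics the visible effect; it is not TrueCrypt's own formatting routine.

```python
import os


def create_random_filled_file(path: str, size_bytes: int,
                              chunk_size: int = 1 << 20) -> None:
    """Create (or overwrite) a file of exactly size_bytes of random data."""
    remaining = size_bytes
    with open(path, "wb") as f:
        while remaining > 0:
            # Write in chunks so large containers do not need to fit in memory.
            n = min(chunk_size, remaining)
            f.write(os.urandom(n))
            remaining -= n
```

Because every byte is random from the start, later writes of encrypted data do not stand out from the unused space around them, which is the property step 10 is after.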

NOTE: When TrueCrypt creates an encrypted volume, it encrypts the entire volume, including the file system. This means that, if the encrypted volume is a physical drive/partition, when you connect the drive to your computer, your operating system will not recognize the drive as formatted and will not be able to read from or write to it. This is how it is supposed to be: the only way to access the volume is through TrueCrypt, so don't panic. And above all, don't take Windows' suggestion to format the disk, as this will erase everything on it.

How To Use An Encrypted Volume

Once you've created an encrypted volume, you will no doubt need to actually use it. All management of the encrypted volume's contents must be done while the volume is mounted as a drive by TrueCrypt. This will provide a walk-through for how to mount a volume, and how to manage it once it's mounted.

  1. Open TrueCrypt and look at the mid-bottom of the window for a rectangular region with the TrueCrypt logo on the left. On the right side there will be two buttons: "Select File" and "Select Device". Use the first button if you wish to mount an encrypted file, use the second to mount an encrypted drive/partition.
  2. Select the encrypted file/partition/device you wish to mount.
  3. Once the volume has been selected, look at the very top of the window and notice the long list of letters. These are all the drive letters that are either empty or currently being used by TrueCrypt. Select an unused drive letter to mount the volume as. This will be the drive letter the operating system assigns to it (and no, there is no need to always mount a drive under the same letter, unless there are shortcuts that point to that specific drive).
  4. When the volume and the drive letter to mount it as have been selected, click the "Mount" button at the bottom-left of the window. You will be prompted to enter the password/keyfile you used when you created the volume originally. (If you used a keyfile, you will need to locate it on the drive where it is stored.) If you present the correct key, the drive will be mounted. If you enter the incorrect password, you will be prompted to try again. (If you enter an invalid password several consecutive times, double-check that the file you're trying to mount is actually an encrypted volume. Without a correct password, TrueCrypt has no way of knowing whether a volume is encrypted or not, and thus, if you're accidentally trying to mount a file/partition that is not encrypted, it has no way of informing you that you're on an impossible mission.)
  5. Once the drive has been mounted, you will see its basic stats listed next to its respective drive letter in the list of drive letters at the top. This list allows you to assess and access all of the encrypted volumes you're managing at a glance.
  6. To manage the contents of the volume you mounted, just use the drive like you would any other. Encrypt files by copying them to the drive, and decrypt files by reading them from the drive. You can access the drive yourself via "My Computer" and your programs can access the drive and write files to and/or read files from it. As far as Windows is concerned, it's a perfectly normal, average drive, and can be treated just like one. NOTE: Once a drive has been mounted, you do not need to leave the TrueCrypt program running in order to use the drive. Closing TrueCrypt will not dismount the drive. When you re-open TrueCrypt, it will still recognize the encrypted volume and you will be able to dismount the drive.
  7. When you're done using the volume, dismount it by hitting the "Dismount" button at the bottom. The drive will disappear into thin air and no longer be accessible. Simply shutting down the computer will dismount the volume, which will not be automatically remounted when Windows starts again.

NOTE: It is possible that, while a disk is being used, some file contents that are being used will be stored in the computer's virtual memory. Since everything being read from the volume is automatically decrypted, and because virtual memory exists on the operating system's hard drive, this means that file contents stored in virtual memory will be stored unencrypted on the hard drive. This is obviously undesirable, so users are encouraged to disable their virtual memory systems before managing mass amounts of encrypted data. (Windows users: Start > Control Panel > System > Advanced > Settings > Advanced > (Virtual Memory) Change > No Paging File > Set.)

What's Better

There are three important choices that you must make when creating an encrypted volume: You must choose between using a file or a physical device, a standard or hidden volume, and a password or a keyfile. Here I will examine the pros and cons of both options for each of these choices in depth, as I skipped over these subjects earlier. I address these issues in what I believe to be the order of their importance.

Standard vs. Hidden volumes:

There are two modes you can have an encrypted volume in: standard and hidden, also called "outer" and "inner", respectively.

A standard volume is just a normal encrypted volume that TrueCrypt creates. All your data is simply encrypted and stored to it. Since the advantage of using standard volumes is dependent on hidden volumes, I will address it in the context of a hidden volume.

The disadvantage of using standard volumes is that any adversary analyzing a disk where an encrypted volume is stored would be able to detect the presence of encrypted data of some sort, because all of the data in that location will be conspicuously very random. If they know you have a copy of TrueCrypt, they would probably assume that you have a TrueCrypt encrypted volume in that "random" space. An adversary may then force you (by legal or physical means) to reveal your encryption key for the volume. If you comply (having your fingernails ripped off via pliers can be very motivating) and all your important data is in this volume, then all is lost.

This is why hidden volumes exist. Hidden volumes are encrypted volumes within encrypted volumes -- but they are impossible to detect. Thus, you can place your most important secrets in there and even if your standard volume is breached, the secrets in the hidden volume remain intact. This concept is called "plausible deniability".

It is possible to all but prove the actual existence of a standard volume, but it is impossible to prove the existence of a hidden volume. Thus, an adversary could potentially force you to reveal the key that decrypts the outer volume, but they would have no way of forcing you to reveal the key for the inner volume, because they do not even know that an inner volume exists. If they are familiar with TrueCrypt, they will know that the potential for an inner volume exists, but they have no proof that you have utilized this function.

Thus, by storing some semi-serious data in the outer volume and the serious data in the inner volume, you can protect your most critical data even if the outer volume is compromised. Hopefully the assailants will assume that they have found everything you have to offer and not press beyond that, as there is no way they can prove you have anything more to offer.
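The earlier point that an encrypted region looks "conspicuously very random" can be made concrete with a byte-entropy estimate. The sketch below computes the Shannon entropy of a block of bytes. The function name is hypothetical, and a high score only suggests random-looking data (compressed files also score high), so it is a hint rather than proof of encryption.

```python
import math
import os
from collections import Counter


def byte_entropy(data: bytes) -> float:
    """Estimate Shannon entropy in bits per byte (0.0 up to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # how often each byte value occurs
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


if __name__ == "__main__":
    # Ordinary text uses few distinct byte values; random data uses them all.
    print("text:  ", byte_entropy(b"hello world " * 100))
    print("random:", byte_entropy(os.urandom(65536)))
```

Random bytes score close to 8 bits per byte, which is what an adversary scanning a disk would notice. It says nothing about whether a hidden volume exists inside the outer one, since both look equally random.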

Keyfile vs. No Keyfile:

When creating a key for an encrypted volume, TrueCrypt offers the option to add to the text key (the key entered via keyboard into the prompt) by using keyfiles. Keyfiles are just normal computer files whose contents TrueCrypt adds to the normal text key. Together the keyfile and the text key are used to generate the master encryption key that would otherwise be generated from only the text key. There is no limit to the number of keyfiles that can be used. However, only the first 1024 bytes of each file are actually used (which is from there compressed down to the maximum key length of 64 bytes), so data beyond those bytes is irrelevant.
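A rough sketch of how keyfile contents might be folded into a text key follows. It uses the figures given above (first 1024 bytes of each file, 64-byte maximum key length), but the mixing itself is an assumption: SHA-512 is used here purely as a convenient way to compress data to 64 bytes, and the function name is hypothetical. TrueCrypt's real keyfile processing is described in its manual and is not reproduced here.

```python
import hashlib
from typing import Iterable

KEYFILE_BYTES_USED = 1024  # per the text above: only the first 1024 bytes count
MAX_KEY_LENGTH = 64        # per the text above: maximum key length in bytes


def mix_keyfiles(text_key: bytes, keyfile_contents: Iterable[bytes]) -> bytes:
    """Combine a text key with keyfile data into one 64-byte key (illustrative)."""
    # Pad or truncate the text key to the maximum key length.
    combined = bytearray(text_key[:MAX_KEY_LENGTH].ljust(MAX_KEY_LENGTH, b"\x00"))
    for content in keyfile_contents:
        # Stand-in compression step: 1024 bytes down to a 64-byte digest.
        digest = hashlib.sha512(content[:KEYFILE_BYTES_USED]).digest()
        for i, value in enumerate(digest):
            # Add each digest byte into the key, wrapping at 256.
            combined[i] = (combined[i] + value) % 256
    return bytes(combined)
```

Even this toy version shows the properties discussed in this section: bytes past the first 1024 have no effect, changing any one of the first 1024 bytes changes the resulting key, and every keyfile must be present to reproduce it.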

The main disadvantage of using a keyfile is that it causes inconvenience. Since the keyfile is a part of the key, it must always be present when you wish to mount the volume. Thus, if you move the volume from one computer to another, you must find a way to transport the keyfile along with it. In addition, the keyfile must be kept secret, which introduces a range of security problems regarding how you can keep the keyfile itself physically safe. (This is for you to sort out on your own, as physical security is a totally different topic. All I'll say is that it might be wise to make use of floppies and to keep a heavy magnet close by. It might also be worth looking into a program called SecureTrayUtil.) Another disadvantage of using a keyfile is that if any of the file's first 1024 bytes are changed (due to any cause, including file corruption), it is impossible to mount the encrypted volume.

One major advantage of keyfiles is that they allow for an encryption key to be split up over more than just one user. If two people wish to encrypt a volume such that it is impossible for just one of them to decrypt it alone, they could both contribute a keyfile when creating the encryption key, and use both keyfiles when creating the volume's key. Then, the volume cannot be decrypted without the keyfiles of both people. Another advantage is that keyfiles protect against keylogging, because the keylogger will only log the part of the key entered via the keyboard, it will not detect the part of the key that is contributed by the keyfile.

The biggest advantage of using keyfiles is that they allow for the user to use a longer password with a more diverse byte value range. Creating a long, good password can be difficult to do ("long" here being at least 20 characters), especially if it's something that you have to mentally remember. Invoking keyfiles provides a way to easily use a long sequence of random values without having to remember them. Plus, the text password from the keyboard is limited to ASCII values, meaning that it is impossible to take advantage of a byte's full 256 value range. Using a binary keyfile allows you to inject more diverse byte values into the key. Thus, keyfiles allow for an easy way to use long, diverse keys.
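To put rough numbers on the "diverse byte values" point, the arithmetic below compares the best-case entropy of a typed password with that of raw random bytes. It assumes every symbol is chosen uniformly at random, which human-chosen passwords are not, and it assumes 95 printable ASCII characters. The function name is made up for the example.

```python
import math


def max_entropy_bits(alphabet_size: int, length: int) -> float:
    """Upper bound on entropy for `length` symbols drawn uniformly at random."""
    return length * math.log2(alphabet_size)


if __name__ == "__main__":
    # 20 typed characters from the 95 printable ASCII symbols (best case).
    print("20 printable ASCII chars:", max_entropy_bits(95, 20))
    # 20 bytes from a random keyfile, using the full 256-value range.
    print("20 random bytes:         ", max_entropy_bits(256, 20))
    # A full 64-byte key's worth of random keyfile data.
    print("64 random bytes:         ", max_entropy_bits(256, 64))
```

Under these assumptions a 20-character typed password tops out a little above 131 bits, while 20 random bytes give 160 bits and 64 random bytes give 512. Real passwords fall well below their bound.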

In the end, the decision to use keyfiles or not is up to you. Before making your decision, consider what your text key will be (and how strong it is), how you will securely store/hide the keyfile, and how you might be able to securely transport the keyfile if needed. In the end, it's probably worth throwing a keyfile into the mix if you cannot think of a reason not to.

Do not rely on your keyfile, however, for good security. It is still highly recommended that you make your text key as good as possible. And yes, it is possible to use just a keyfile, with no text password, as the key, but this is strongly not recommended. If you need help creating a good password, I've written somewhat extensively on the subject.

File vs. Device Volumes:

The main advantage of using files is that they're more flexible. You can copy them, delete them, and move them at will. This allows you to create backups, easily give copies to colleagues, and such. Another very important advantage is that it allows you to "hide" volumes as other files. Because encrypted volumes stand out as being suspiciously random to anyone analyzing the drive on which the volume is located, you can put a DLL extension on the end of the volume's file name, put it in the Windows\system32 directory, and it hopefully will never be questioned as being valid. This trick has limitations, though, and obviously doesn't work for 900MB file volumes on a 1GB USB flash drive, and similar scenarios.

The disadvantage of using files is that whenever you move or copy the file, if you do not wish for the presence of the file to be detected, you have to securely delete the original file, otherwise it might be recoverable by someone else. The file will still be encrypted and unreadable to them (unless they have the password), but sometimes you don't even want another party to know that you have the encrypted file in the first place. And if you create a file volume larger than 4GB, it obviously won't be able to exist on a FAT32 file system, which may or may not inconvenience you. Also, if your volume becomes heavily fragmented, file volumes will run a bit slower (defragging can easily solve this, though).

Dealing with physical drives/partitions is slightly different. It prevents copies from easily being made, which can be a good thing if you have reason to want the data to remain in one and only one location.

It's not really a big deal which type of volume you choose. The only real issues are the ease of copying a volume, and the convenience of using it. Before creating the volume, you should consider whether or not you (or anyone) will want to copy it in the future, and how convenient it would be to manage a physical device instead of a file.

Tips

As it is with any program, there are details about TrueCrypt that the newer user might not notice. Hopefully this list of tips will help enhance your experience with TrueCrypt, although this is by no means comprehensive of all the goodies TrueCrypt offers. I won't explain how to do everything; you can figure that out on your own (remember, TrueCrypt has an official manual). Rather, I'll simply let you know what options exist that I recommend you research.

  • TrueCrypt does not have to be installed on a machine in order to function. -- Because it can be annoying to install programs on every computer you need to use them on, and also because merely the presence of TrueCrypt residue on your system might give away more information to your opponents than you would like, TrueCrypt does not have to be installed in order to work. It can operate as a stand-alone set of executables. Thus, it is possible to run TrueCrypt on-the-fly from a USB flash drive (or even a floppy) without the bother of installation.
  • Back up your encrypted volume's header. -- TrueCrypt offers a wonderful option to back up the critical, encryption-related data for a volume. Having a volume's header backed up is extremely handy if some accident (or malicious attack) occurs and changes part of the volume's encryption-related data. If the volume header were to be damaged in such an accident, you would be unable to mount the volume and retrieve whatever non-damaged portions still exist. But if you had a backup, you could simply use TrueCrypt to restore the header data from the backup data and you would be back in business. The option for this can be found under Tools -> Backup Volume Header.
  • Keep a list of "favorite" volumes. -- If you have several (or even just one) encrypted volumes that you consistently need to mount, you can create a "favorites" list of all the volume locations and the drive to mount them as. Then you can simply select the option to mount all favorite volumes, and then just enter the key for each one, without having to manually select each volume and its drive letter.
  • Do not upgrade the moment a new version comes out. -- You must fight your inner-geek tendencies to upgrade TrueCrypt the instant the newest version is released. This is because, tragically, TrueCrypt does not have a great track record of producing stable new releases. If you rush off to get the newest version, you may find yourself upgrading once or twice in the next couple weeks -- especially if the upgrade was to a whole new major version number. Give every new TrueCrypt release at least a week to be examined for flaws by the rest of the public before bothering to upgrade. I have nothing against TrueCrypt; it's just that there have been unstable releases in its past, though these errors are always corrected quickly.

FAQ

This is a quick list of the most common, natural questions new users have about TrueCrypt.

  • Q: How secure is TrueCrypt? Is it good enough to protect my very sensitive data?

    A: TrueCrypt is recognized as one of the best encryption programs available to the public. It has been written, scrutinized, and heartily endorsed by many security experts. The fact that it is open source means any expert with the ambition has the ability to analyze TrueCrypt from the inside out - which is a big plus in matters of security. The creators of TrueCrypt have done an excellent job thus far analyzing and improving upon potential weaknesses, and the program has a very loyal, intelligent following of users. All in all, TrueCrypt comes highly recommended.

  • Q: Is it possible to analyze a hard disk and determine for sure whether or not there is an encrypted file/partition?

    A: No. Encrypted data looks just like random data, and the entire volume is encrypted, so no plaintext flags or marks exist that identify it as being encrypted. However, it is very unusual for existing files, and even drives, to be filled with perfectly random data. So an adversary would probably (and rightfully) assume that such a region contains encrypted data. I personally would recommend renaming an encrypted file volume to have a file extension that is known for containing basically random data, such as DLL. An adversary can always check to see if the file is indeed valid for the extension it is listed as, but if you put it in a non-conspicuous place (such as a system or a video game folder), hopefully it will be overlooked as nothing abnormal. Drives/partitions have no easy method of disguise.

  • Q: I have a normal partition that I would like to encrypt, is it possible to convert it to an encrypted partition volume without losing my data?

    A: No, the partition will have to be completely reformatted in order to be used as an encrypted volume. To convert a non-encrypted partition to an encrypted partition, you will need to copy all of your files to another location, encrypt the original partition, and then copy all the files back. Programs like SyncBack can aid you with this. You might even want to consider creating another encrypted volume to temporarily house the files you're copying while you format the original partition, so that the sensitive data is not left exposed. After you're completely done, depending on where you copied the files to, it might be smart to shred them using a program like Eraser.

  • Q: Can I change my encryption key after I create an encrypted volume?

    A: Yes, but only if you know the original key. There are no backdoors built into TrueCrypt, so your key is the only thing that can unlock the volume contents.

Conclusion

I'd go on longer on this subject, but I've covered enough of the basics. Hopefully I didn't overcomplicate the process, as my only intention was to simplify it. In the end the concept is simple: Create a safe, create a key, open/close the safe adding/removing valuables as necessary.

I didn't come close to covering everything about TrueCrypt, but then I didn't try to. Remember that TrueCrypt has an official user's manual and an extensive FAQ of its own, and it contains technical details that I didn't mention here. Consult these if you plan to make good use of this fine program.

Any questions you have should be directed at the fine members of the official TrueCrypt forums; I even hang out there a bit myself.

And although I didn't touch on this subject during the tutorial, as it is long enough already, I must state, in closing, that the concept of encryption is very important. Information is power, and the ability to control information is power as well. The more you control what information you do/don't disseminate to others, the more power you reserve for yourself and deny to potential adversaries.

Encryption is the main (but not only) tool to limit information dissemination. Use it, value it, protect it, know it -- both your information and encryption techniques. Sometimes you don't fully appreciate how valuable your information is until it falls into the wrong hands. Don't let that happen to you.

Software Liability

  • By Brad Conte, July 28, 2005
  • Post Categories: Security

From multi-billion-dollar government agencies to home burglar alarms, different forms of security and protection abound. One of the most interesting forms of day-to-day security in the lives of average people is Internet security.

Every year, millions of dollars are spent on Internet-related security. From international businesses hiring world-class consultants to normal home computer users purchasing antivirus programs, people realize that Internet security is critical and many do their best to take appropriate security measures. This is not without good reason: Currently, there are over 50,000 viruses on the Internet, having wreaked a reported 55 billion dollars in damages.

The basic idea of "security," as defined by Bruce Schneier in his latest book Beyond Fear, "is about preventing adverse consequences from the intentional and unwarranted actions of others". This basically means that security is about protecting innocent people from people who have malicious intentions -- a definition that should be easily agreed upon. Security exists to protect the innocent from harmful people or situations that are not their fault.

Security with specific regard to the Internet has two forms. The most familiar one is software used for the express purpose of protecting a computer from the Internet, such as antivirus software. The other form of Internet security lies in software that exists to perform some other task, but that must also ensure a certain level of security in order to perform it, so that malicious attackers cannot exploit the program to their advantage. This includes a much wider range of software, such as Internet browsers. Regardless of which type of security a software product offers, it must do so flawlessly, because once a flaw is found, a security system that is 99% secure is just as worthless as a system with no security at all.

Unfortunately, despite the fact that users spend so much time and money on highly reputed software, attempting to secure themselves and their data from attackers on the Internet, the attackers continue to be wildly successful in their hacking endeavors. Until recently, the attackers' high rate of success was simply attributed to user incompetence. It was commonly assumed that the level of security a computer had was determined entirely by the user. If the computer's security was compromised then it was the user's fault for not securing the computer well enough.

However, security experts, such as Schneier and Blake Ross, have recently stated that users can no longer bear so much of the blame for having insecure computers. Instead, they state that, while many security issues are indeed caused by the end-user's stupidity, the fundamental security problem that allows Internet attacks to be so successful lies with the software that the users rely on for security in the first place. These experts argue that users cannot be fully blamed for having insecure computers that fall prey to countless Internet attacks because they, the users, have no way of truly securing their systems to begin with -- the very programs they rely on to provide the necessary security are actually causing problems themselves.

In a recent lecture here in Sacramento on this very subject of software companies' role in Internet security, Schneier repeatedly bashed software designers in general for their stupid and thoughtless design practices. He stated that, as the world's most recognized security guru, he was ashamed that he could not offer his own mother a solution to surf the Internet safely. Just this month, Ross also wrote on the subject in his blog. He stated that, being someone who is regularly asked to comment on what he believes the future of computers holds, he foresees a bleak future for the computing industry in general if software designers don't start getting serious about their design practices. "I’m disgusted by what the average person has to deal with on a day-to-day basis," he states.

Users deserve a higher general plateau of security in the software they use than they currently have. Software manufacturers have become lazy, and are rapidly producing products that are bursting at the seams with simple security holes, holes that are forever being exploited by bored seventeen-year-olds, often at great cost to the victim. Users have no way of protecting themselves from such security holes, because they are relying on those programs for their security in the first place. Thus, software manufacturers must be forced to stop making such simple security mistakes in their software, because it is completely unfair to the end-user to purchase (relatively) expensive software that falls victim to some of the oldest attacks in the book. Software manufacturers will not change their production practices easily, however, and if they are to be forced into producing better software, legal action will be required.

First, it is critical to understand why companies are reluctant to bother designing secure software. They are not producing poor security out of sheer spite for their users (despite the fact that it may sometimes feel that way), but rather for four simple, main reasons.

One of these reasons is that designing solid security is just plain difficult. Designing software that has to have a certain level of security is perhaps the most difficult software task that can be tackled. Not only does the product have the problems and difficulties of a normal software program, but it has to be able to absorb and deal with excessive intentional abuse, and to identify security threats properly and weed them out without affecting the normal flow of legitimate user-generated activity. There are thousands of aspects that have to be analyzed and properly dealt with, yet not one mistake can be made. To top this off, security designers are faced with the dark, brutal reality of their situation, which is, as Schneier so eloquently put it: "As computer scientists, we have no clue how to write secure code. [...] We don't even know how to make a program end." This is true, because there is no Bible on designing software security that designers can consult for absolute answers. Everyone is forced to simply dream up ideas, test them, and hope they work -- they have no way of knowing exactly how attackers are going to try to manipulate the program. Writing secure programs is very, very difficult, and if a program is to be secure, it must have a lot of time and hard work invested in it. Secure software cannot be designed overnight.

Another reason companies produce poor security in their software is that the security is often designed by the wrong people. The people who actually enjoy designing proper security are expensive, and few and far between. Thus, most security is not designed by experts in the field, but rather by "forced" experts. These are not people who excel at security design specifically, but rather people who are reasonably bright and are assigned the task of analyzing security. They do it because they're good thinkers, not because they're ingenious security analysts. Because of this they are not going to attack security problems with the same level of passion and expertise that a security expert would. Thus, when their designs are then analyzed by attackers who do care passionately about security design and do have a high level of expertise, flaws are found.

Designing good security also consumes time and resources, something corporate managers are reluctant to spend. It takes a lot of time to analyze and test a security system, and most companies work on deadlines that don't allow for extended analysis and testing. Also, since true security experts are rare, companies usually have to hire a consultant if they wish to have someone with the necessary expertise review their designs. Unfortunately, many managers couldn't tell the difference between a world-renowned security consultant and a dead duck, so they are often reluctant to hire the necessary experts because they don't appreciate all that those experts offer.

However, the most critical reason companies design poor security is that the results of good security are relatively invisible and do little to nothing to boost product sales. Most of the minor security measures and fixes will never be used, resulting in nothing but lost resources as far as managers are concerned. Plus, the security measures that actually do get used will often do their work unnoticed. One way or another, the user will likely never personally notice the minor security features. If they don't notice them, then they won't factor them into their purchasing decisions.

Thus, from the manager's point of view, the company is wasting time, effort, and money on a feature that won't boost sales. Needless to say, this is rarely an attractive option. The result of this is that most security is sloppy, rushed, and has not been analyzed with a fine-toothed comb. Then when attackers spend more time analyzing the program than the actual designers themselves did, they, the attackers, find flaws and write viruses to exploit them. When these exploits are released publicly, the company then throws together a quick patch that fixes the problem and offers it to their users as an "upgrade".

It is so common for insecure software to be exploited and broken on a consistent basis that the public actually accepts it as a normal part of their lives. People actually expect to have their computers infected with viruses every so often, and they expect to have to update their software.

There is no reason that the public should accept this, however, because companies have the ability to write more secure programs than they do. In situations where designers do not have enough time to analyze their programs before they are released, deadlines can be extended -- albeit at a small loss of profit to the company. In situations where there are not enough knowledgeable experts available to analyze and/or implement the security designs, consultants can be hired. There are more than enough capable consultants for hire; they are just not so plentiful that they can be hired and disposed of as easily as the average programmer.

Since software companies owe it to consumers to provide the best security they can, and they have proven that they are unwilling to do so of their own good will, they must be forced to do so under some sort of penalty. Preferably, the consumer market would rise up as a whole and demand reform, refusing to purchase products from companies that did not conform to a certain level of security standards. This option is unrealistic, however, as the large majority of the consumer market is unaware that it is being ripped off by the software companies in the first place, let alone what sort of reform to demand. The only other way to impose standards on software companies is to do so legally.

Legal action against software companies would take one of two (if not both) forms. First, software companies would be forced to face inspection and auditing of their software, and would be subjected to penalties if their software failed to pass a certain set of basic standards. If the imposed fines and penalties were stiff enough, it would be more cost-effective for companies to produce better security, rather than worse.

The second, and probably most important, form of legal action would be that users would be granted the legal grounds to sue software companies if the company's software was exploited due to poor security, damaging the user in some way. This would not only have the financial trade-off advantage of the first method, but would also have the benefit of "public shaming". What software company in its right mind would want its product publicly taken to court and sued for poor security practices? If such a thing were to happen, the company would not only stand to lose money from the lawsuit, but would stand to lose the respect of its consumers as well.

The most common argument raised against this legal proposal is that it is impractical to demand that software companies write near-flawless programs. It is pointed out that even leading experts, like Schneier himself, acknowledge that it is impossible to confirm that any given program is completely secure. There are simply too many variables to account for, and too many unknown attack strategies exist to be taken into full, flawless consideration.

While this is most certainly a valid argument, it addresses a scenario that is a long way off from where the world of software security is today. Demanding that software companies produce near-flawless security is a long, long way from where the situation stands right now. Currently, the security they produce is riddled with trivial bugs and juvenile mistakes. Think about it: many viruses, such as Sasser, are written by kids in their late teens. Software companies have a long, long way to go before their security products could even be considered somewhat close to perfect. It would be more than reasonable to at least hold software companies responsible for their basic, ongoing security flaws. The world will never see a time in which everyone is perfectly secure from everything, but hopefully it will see a time in which world-class software programs are not repeatedly reduced to shreds by simple teenagers.

If legal action were taken against software companies that produced sloppy security programs, there would be a sharp decline in the number of careless mistakes that allowed these programs to be exploited. Corporations would save millions of dollars and home users could use their computers with greater confidence. If Schneier, the world's leading expert on security analysis, cannot protect his own mother from the dangers of the Internet, something is obviously very, very wrong with the state of modern computer security.


Note: This article was originally written as a research paper for a college English class, titled Computer Insecurity. A few minor grammatical errors have been corrected since it was originally completed. Sources were inlined as hyperlinks and the sources for trivial ideas were dropped (the original audience was non-technical).

The Solitaire Cipher

  • By Brad Conte, July 3, 2005
  • Post Categories: Security

About The Solitaire Cipher

The Solitaire Cipher was designed by Bruce Schneier specifically for Neal Stephenson's book, Cryptonomicon. The algorithm was developed so that people would have a way of encrypting messages without the aid of computers. All that is necessary is a full deck of standard playing cards (or something equivalent).

How it's used

The Solitaire algorithm itself does not encrypt anything. Rather, it generates a string of pseudo-random values that can be used to do the actual encrypting. This string of values is generated from a key that the user provides. The "random" output will always be the same for a given key, and since this output is used to encrypt the message, the key for the random generator is, essentially, the password for the encryption/decryption process. Without it, you have no hope of generating the same values again to decrypt the message.

When the key is provided to the algorithm, it is used to initialize the data pool. Once that is done, any number of random characters can be generated.

The values the algorithm spits out range from 1 to 26, corresponding to A-Z (it can also be implemented to spit out values from 1-52, corresponding to A-Z and a-z). To encrypt/decrypt a message (using 26 values), translate all the characters in the message to the same case and then generate one random value for each of them. If you are encrypting the message, add each random value to the corresponding value in the message (wrapping around by subtracting 26 should any result exceed 26). To decrypt, subtract the random values from the values in the message (wrapping around by adding 26 should any result go below 1).
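The addition and subtraction described above can be sketched in a few lines of Python. This is only an illustration under some assumptions of mine: the helper names (`to_values`, `to_text`, `wrap`, `encrypt`, `decrypt`) are made up for this example, the keystream is passed in as a plain list of values from 1 to 26, and anything in the message that is not a letter is simply dropped.

```python
def to_values(text):
    """Upper-case the text and map A..Z to 1..26, dropping everything else."""
    return [ord(c) - ord("A") + 1 for c in text.upper() if "A" <= c <= "Z"]


def to_text(values):
    """Map values 1..26 back to the letters A..Z."""
    return "".join(chr(v - 1 + ord("A")) for v in values)


def wrap(value):
    """Bring any integer back into the range 1..26."""
    return (value - 1) % 26 + 1


def encrypt(message, keystream):
    """Add one keystream value to each letter of the message."""
    values = to_values(message)
    if len(keystream) < len(values):
        raise ValueError("keystream is shorter than the message")
    return to_text(wrap(m + k) for m, k in zip(values, keystream))


def decrypt(ciphertext, keystream):
    """Subtract one keystream value from each letter of the ciphertext."""
    values = to_values(ciphertext)
    if len(keystream) < len(values):
        raise ValueError("keystream is shorter than the ciphertext")
    return to_text(wrap(c - k) for c, k in zip(values, keystream))
```

For instance, with the made-up keystream `[11, 26, 3]`, `encrypt("HEY", [11, 26, 3])` gives `"SEB"` (H is 8, and 8 + 11 = 19 is S; E is 5, and 5 + 26 = 31 wraps to 5; Y is 25, and 25 + 3 = 28 wraps to 2, which is B), and `decrypt("SEB", [11, 26, 3])` gives back `"HEY"`.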

Algorithm

This algorithm assumes that the user has two or four suits of cards and two jokers. For simplicity's sake, only two suits will be used in this example. The two suits will be combined into one deck and each card will be assigned a numerical value. The first suit will be numbered from 1 to 13 (Ace through King) and the second suit will be numbered 14 through 26 in the same manner. The jokers will be assigned the values of 27 and 28. Thus, a 5 from the first suit would have the value 5 in our combined deck, and an Ace from the second suit would have the value 14.

The deck will be assumed to be a circular array, meaning that should a card ever need to advance below the bottom card in the deck, it will simply rotate back to the top (in other words, the first card follows the last card).

  1. Arrange the deck of cards according to a specific key. This is the most important part of the algorithm, as anyone who knows the deck's starting order can easily generate the same values from it. How the deck is initialized is equivalent to the encryption key and is left up to the correspondents. Shuffling the deck perfectly randomly is preferable, although there are many other methods. For this example, the deck will simply start at 1 and count up by 3's, modulo 28. Thus the starting deck will look like this:

    1 4 7 10 13 16 19 22 25 28 3 6 9 12 15 18 21 24 27 2 5 8 11 14 17 20 23 26
    
  2. Locate the first joker (value 27) and move it down the deck by one place, basically just exchanging with the card below it. The deck now looks like this:

    1 4 7 10 13 16 19 22 25 28 3 6 9 12 15 18 21 24 2 27 5 8 11 14 17 20 23 26
    
  3. Locate the second joker (value 28) and move it down the deck by two places.

    1 4 7 10 13 16 19 22 25 3 6 28 9 12 15 18 21 24 2 27 5 8 11 14 17 20 23 26
    
  4. Perform a triple-cut on the deck. Everything above the top joker (which, after several repetitions, may not necessarily be the first joker) and everything below the bottom joker will be exchanged. The jokers themselves, and the cards between them, are left untouched.

    5 8 11 14 17 20 23 26 28 9 12 15 18 21 24 2 27 1 4 7 10 13 16 19 22 25 3 6
    
  5. Observe the value of the card at the bottom of the deck. If the card is either joker, let the value be 27. Take that number of cards from the top of the deck and insert them back at the bottom of the deck, just above the last card.

    23 26 28 9 12 15 18 21 24 2 27 1 4 7 10 13 16 19 22 25 3 5 8 11 14 17 20 6
    
  6. Note the value of the top card. Count this many places below that card and take the value of the card there. This value is the next value in the keystream. In this example the top card is 23, and the card 23 places below it is 11, so the keystream value is 11. (Note that no cards change places in this step; it simply determines the value.)
  7. Repeat steps 2 through 6 for as many keystream values as required.
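The steps above can be sketched in Python, using the same 28-card example deck. Treat this as a rough illustration rather than a reference implementation: all of the function names are made up for this sketch, and a few details that the steps above leave open are filled in from Schneier's published description of Solitaire and marked as assumptions in the comments (a joker on the bottom of the deck wraps around to just below the top card, a joker on top counts as 27 in step 6, and a joker that turns up as the output card is skipped).

```python
JOKER_A, JOKER_B = 27, 28  # first and second joker


def example_deck():
    """Step 1 for this example: start at 1 and count up by 3's, modulo 28."""
    return [(3 * i) % 28 + 1 for i in range(28)]


def move_down(deck, card, places):
    """Steps 2 and 3: move a card down the deck one place at a time.

    Assumption (from Schneier's description): a card on the bottom
    wraps around to just below the top card, not on top of it.
    """
    for _ in range(places):
        i = deck.index(card)
        if i == len(deck) - 1:
            deck.insert(1, deck.pop())
        else:
            deck[i], deck[i + 1] = deck[i + 1], deck[i]


def triple_cut(deck):
    """Step 4: swap the cards above the top joker with those below the bottom joker."""
    top, bottom = sorted((deck.index(JOKER_A), deck.index(JOKER_B)))
    return deck[bottom + 1:] + deck[top:bottom + 1] + deck[:top]


def count_cut(deck):
    """Step 5: move as many top cards as the bottom card's value to just above it."""
    n = min(deck[-1], 27)  # either joker counts as 27
    return deck[n:-1] + deck[:n] + deck[-1:]


def output_value(deck):
    """Step 6: read the card that many places below the top card."""
    n = min(deck[0], 27)  # assumption: a joker on top also counts as 27
    return deck[n]


def keystream(deck, count):
    """Step 7: repeat steps 2 through 6 until `count` values are produced."""
    deck = list(deck)  # work on a copy so the caller's deck is untouched
    values = []
    while len(values) < count:
        move_down(deck, JOKER_A, 1)
        move_down(deck, JOKER_B, 2)
        deck = triple_cut(deck)
        deck = count_cut(deck)
        value = output_value(deck)
        if value < JOKER_A:  # assumption: a joker as output is skipped
            values.append(value)
    return values
```

Calling `keystream(example_deck(), 1)` reproduces the walkthrough above and returns `[11]`. Asking for more values simply keeps cycling through steps 2 to 6 on the same deck.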

My implementation of the Solitaire Cipher in the C language can be found in my code project.

Security

This algorithm has been shown to have a bias toward certain values in its output and is not as random as modern cryptographic standards would demand. The weaknesses of the algorithm are outlined here. This is not to say, however, that Solitaire is worthless. It is still an easily remembered, hand-executable algorithm that does generate "random enough" values.


Note: I used this article to create the first draft of the Wikipedia article on the Solitaire Cipher. Because of this, the text in this article is under Wikipedia's License.