Post Category: General Tech

These are articles on various geeky and computing-related topics.

What Exactly Does "Rooting" Mean?

There is a lot of interest in "rooting" Android and iOS devices so that the owner can do more fun things with them. But most rooting guides are simple "how-to" lists without much explanation of how the process works and what is being done. This can make the process feel arbitrary and complicated, particularly for beginners. But it's not complicated, and if you're going to root your device, you're better off knowing as much as you can about it.

This is a conceptual and technical overview of what rooting a modern mobile device entails. We'll cover the whole picture of rooting in simple, non-technical language. (If you've rooted a device before then you've probably inferred a fair amount of the information here.) The concepts should apply to both Android and iOS (although examples will favor Android since I have more experience with it).

The section headings below serve as an outline of the topics; we'll work backwards from the end goal.

What is "Root"?

Root is a term that refers to the de facto ultimate administrative account on a device. The term originated with Unix systems a long time ago. On a Unix-like system (Android, iOS, Linux, and OS X are all Unix-like in concept), there is an account literally named "root" that exists by default and has the administrative privileges to do anything.

Rooting a device is simply a process to obtain access to that root user account. This requires effort because operating systems like Android and iOS run the normal user environment under a non-root user account that has privilege restrictions. To do something that the default user account can't do usually means the user must get root access to the device and use the root user account privileges to accomplish their goal.

There are perfectly good reasons why the phone doesn't run with root access by default. Root privileges allow you to do just about anything, which means you can screw up just about anything. The principle of least privilege is a fantastic security and stability guideline; root privileges are best left to those who explicitly want or need them.

How Do You Get Root?

There is a program called su (also with old origins in Unix) that allows a user to open a new session under a different user account. This program allows users to effectively switch to a different user, including the root user, and perform tasks as that other user. Naturally, you must have access to the target user account to su to it.

Rooting a device is basically just the act of placing an su program onto it. Apps that want root access run this program to get a session under the root user and then they perform their root-requiring tasks with that session.
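
For illustration, here's roughly what an app (or you, from a terminal) does once an su binary is in place. The path and the abbreviated output are illustrative and vary by device:

$ /system/xbin/su -c 'id'
uid=0(root) gid=0(root)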

How Does "su" Grant Root Access?

A typical program runs with the permissions of the user who executed it. You can't run a program as the root user without first effectively being the root user. This creates a chicken-egg problem for getting root access because the user starts with a non-root account.

But this is hardly a new problem, and there's a simple solution. Unix-style systems have long had a file permission called the SUID (Set User ID) permission. When a file has the SUID permission set, it means "regardless of who runs this program, run it with the permissions of the owner of this file". When you install the su program, it will be installed as owned by the root user with the SUID flag set, so when a non-root app runs su later, su will run with root permissions, giving a root user session to the app that called it.
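
A rough sketch of what that setup looks like, assuming a common install path; the exact path and permission bits vary by device and su implementation:

$ chown root:root /system/xbin/su    # the file must be owned by root
$ chmod 4755 /system/xbin/su         # the leading 4 sets the SUID bit
$ ls -l /system/xbin/su
-rwsr-xr-x root     root        91980 2012-11-03 03:06 su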

But having su hand out root access to anyone or anything that executes it is a really bad idea, because there would be no protection for the account. So a mobile-device version of su usually has a user prompt that requires the user to authorize each attempt to get root access through su.

How Do You Install "su"?

To install su you must place it in a findable location, have it owned by the root user, and have the SUID permission set. Installing normal apps is easy but doesn't work for su: normal apps can't be owned by root, can't install to key system locations where other apps can find them, and will belong to a user account with limited permissions. We need a way to take an su file that is not owned by root and make it owned by root.

Now we have another chicken-egg problem: For security reasons (such as this very process) you can't change a file's owner to be another user without that other user's permission. So we can't change a file to be owned by root without root access.

The solution is to get temporary root access in order to install su and give it the proper ownership and permissions, after which it will serve as a permanent anchor for root access.

How Do You Get Temporary Root Access?

This is the hard part of the rooting process, and one of the main reasons why there are so many different rooting methods. They all involve side-stepping the normal operating system security checks in some way. Some devices make it easy, some don't. Most of the solutions fall into one of these categories:

  • Boot to a different operating system and install "su".

    This is probably the "cleanest" method. In a different operating system we can do whatever we want to the device's main operating system, because it won't be running and can't enforce its permissions. Booting into a different OS usually entails unlocking the boot loader, after which you may boot to something like a custom recovery image, which is essentially a mini operating system itself. This is one reason why unlockable boot loaders are such a big deal to modders in the community: if the boot loader is unlockable, then rooting the device is straightforward.

  • Find a security exploit that gives admin access.

    Sometimes security bugs in the operating system allow exploit code to be executed in a privileged environment. For example, years ago iOS 4.1 could be exploited by a maliciously crafted PDF that tricked the OS into granting root access, and that temporary root access could be used to install the su program for permanent access. Many security holes get patched eventually, so the community has to find new ones. (Security holes you can exploit can also be exploited by others, so it makes sense that they get fixed.) Finding security exploits isn't always easy, which is why you sometimes have to wait a while after a device is released for a rooting method to be made publicly available. (Although, there are rumors that some distributors occasionally include relatively easy vulnerabilities just as a nod to modders.)

  • Attach through a debugger.

    Sometimes administrative tasks can be performed through developer debugging support. Debugging support is aimed at developers and turned off by default. After it is enabled, it can often perform a variety of admin functions.
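
    On Android, for example, the adb debugging bridge can sometimes hand out an elevated shell directly, though typically only on development/engineering builds rather than stock consumer firmware (a hedged illustration, not a universal recipe):

    $ adb root       # restart adbd with root privileges, if the build allows it
    $ adb shell id
    uid=0(root) gid=0(root)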

You only need to get root access once. Any subsequent task, including switching the su program for another one or updating it, can be channeled through an existing root session or a new root session from su.

What Is Necessary After "su" Is Installed?

After the su program is installed, a normal app to manage it should be installed. For example, on Android the popular ones currently are SuperUser and SuperSU. Such an app manages the su program -- upgrading it, letting the user configure settings for it, etc. -- since su is usually fairly minimal by itself. Not all su programs are identical, so make sure that the one you install matches the su-management app you install later.

You may also need to worry about preserving root access through OS upgrades, such as OTA updates. When the OS is updated sometimes directories get cleaned out and re-installed. If the su program gets removed, root access is lost. This will be addressed more below.

How Do You Detect if Root Is Available?

Any app that needs root access has to be able to find the su program. It is usually placed in a well-known, typical folder of common system programs. Apps that need root access then check these common locations to see if they can find anything named "su".

Some apps refuse to run on rooted devices. They can detect a rooted device in the same way a normal app does: by finding a program named "su". Any program named "su" is obviously suspicious, but detection can go beyond that and check for any file owned by the root user with the setuid bit set. If the detection is very aggressive, in theory it may even pick apart an executable program to see if it makes any operating system calls that change the user account.

How Do You Preserve Root?

Given the above two sections, some users will face a threat to their root access at some point, be it an OTA update or an app that refuses to work on rooted devices. Sometimes the user needs to preserve root access even if they concede to losing it temporarily.

Keeping the "unrooting" temporary is simple: open a root session, use it to back up the su program, remove the original, and then restore the backup later. This is possible because a root session can be opened and preserved even after the su program is gone. This can save su from being blown away by an OTA update, and it can be used to hide su from apps that complain if the device is rooted. To evade more active detection, users can hide su in even more creative ways, such as by removing the setuid bit and changing the file owner -- the attributes that look suspicious in the first place.
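
A rough sketch of that sequence from a shell, assuming su lives at a common path; real unroot/re-root tools automate this and handle extra details (such as remounting /system read-write):

$ su                                        # open a root session while su still exists
# cp /system/xbin/su /data/local/su.bak     # back up the binary
# rm /system/xbin/su                        # the device now looks unrooted
# ...let the OTA update or root-detecting app run...
# cp /data/local/su.bak /system/xbin/su     # restore using the still-open root session
# chown root:root /system/xbin/su
# chmod 4755 /system/xbin/su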

However, temporary unrooting is risky because the only hook into root access during the process is a temporary root session. If it is lost, then root is lost. This is why unrooting guides always warn against doing anything risky with your device during the process. The obvious big problem is rebooting: if you reboot, you'll lose your active root sessions, and if su wasn't properly restored you'll be permanently unrooted.

Evading root detection is an open-ended problem, and the evaders (a.k.a. users) often have an advantage. But it doesn't seem like most anti-root apps are overly aggressive about their detection; often they only do enough to be able to claim that they're avoiding rooted devices. (Think about legal media consumption apps that just want to convince the content providers that the app will check for rooted devices before downloading copyrighted content.)

Risks of Having Root

Having root access is not without risk. Any app with root access has unlimited access to the device, including every other app's data. Obviously some apps store things like login credentials, but others might store even more sensitive information, like the Google Wallet app on Android, which has to store credit-card-related information locally. I'm sure Wallet has a lot of protection around the information it stores, but the bottom line is that if Wallet can unlock that information, another app with root might be able to as well. Hence the warning the Wallet app displays when it detects that the device is rooted.

Even if you trust the apps you grant root access to and are sure they won't intentionally abuse it, what if yet another app finds a vulnerability in one of them, uses it to get root permissions, and then uses those permissions for evil? You must realize that your security trust perimeter expands, in a grey-ish way, to any app you install, and fully extends to any app with root permissions, including su itself. You are trusting that they won't do stupid or insecure things with their root privileges.

Checking "su"

If you have rooted a device, you can confirm whether it uses the su method described in this article.

  1. First get a shell/terminal/command prompt on the device. (Use a terminal app, install and connect to an SSH server, or use developer debugging tools, etc.)
  2. Find the su file. Popular locations on Android devices are /system/xbin/su and /system/bin/su. If your device has the find command, you can find su using the command:
    find / -name su
    
  3. If su exists, check its permissions using:
    ls -l /path/to/su
    

    An example output:

    -rwsr-sr-x root     root        91980 2012-11-03 03:06 su
    

    This indicates that the file is owned by user "root" and group "root" and the two "s" flags mean that the SUID (run with owner's user permissions) and SGID (run with owner group permissions as well) flags are set.

You can manually use su to open a root shell on your device at any time just by opening a normal command prompt and entering su as a command. (At least, I can with the ones I've tried, YMMV.) You should get the standard confirmation prompt followed by a new root shell.

Summary

A typical rooting of your device requires the user to:

  1. Temporarily obtain admin access.
  2. Install an su program that is owned by the root user and has a special "inherit root user's permissions" file-level permission set.

Just about any method that gets that done will suffice.

As a side note, the term "root" as a verb isn't new. It's long been used by hackers to describe gaining full administrative access to a compromised machine. It's pretty much the same thing.

Hopefully that provides some context and perspective on the rooting process and how it works.

Reducing the Size of Device Images

While device images can be helpful to keep, they can be a pain to store. A raw image literally contains every byte of the original device, making it about as big as the device itself. However, storing the unused disk space is pointless. By zeroing out unused filesystem space and then using fast compression on the resulting image, the compressed image will be about the same size as the used data on the filesystem. If a significant portion of the filesystem is unused, this can save a lot of space in the final image.

Making a device image is not difficult. Unix-based systems have long had the "dd" utility. For Windows, Vista introduced a built-in utility for creating image backups. There are also many third-party applications, like Partition Magic, that provide this sort of functionality.
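
For example, on a Unix-like system a raw image of an entire disk can be taken with something like the following (the device name and output path are illustrative; double-check them, since swapping if= and of= is destructive):

$ dd if=/dev/sdb of=/backups/device.img bs=16M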

The biggest disadvantage of a raw device image is that space unused by the filesystem is still saved. If a 20 GB filesystem only has 4 GB of data on it, all 20 GB will still be saved in the image. If there are many images that need to be stored (such as images from multiple devices, or images from multiple points in time from the same device) this isn't very space efficient. Users with space concerns often compress the images, but the unused space often contains normal data, since it was likely used at least once in the past as files were created, deleted, and moved. So, unfortunately, the unused portion of the filesystem usually compresses only a little better than the used portion. Since the contents of the unused portion of the filesystem are often arbitrary, it is undesirable to have such high overhead for storing them.

(Note that filesystem-based images do not have this problem. But they are more complicated and do not capture the full disk, which is necessary to preserve the MBR, partition boundaries, etc.)

However, if the unused space of the filesystem is filled with zeros before the image is compressed, the unused space will compress down to almost nothing. Compressing 4 GB of data is practically the same as compressing 4 GB of data and 16 GB of zeros.

Implementation

Zeroing out unused filesystem space is simple. First mount the file system to be imaged, then create a temporary file on the file system and fill this file with zeros until either a) the file system runs out of space, or b) the file system's maximum file size limit is reached. In the case of the latter, continue creating and filling temporary files until the file system is full. Delete all the temporary files once you are finished. At this point practically all unused space has been allocated for one of the zero-filled files that was created, and thus has had physical zeros written to it.

On a Unix/Linux system, the dd utility makes this easy. The following command:

$ dd if=/dev/zero of=/my/file/zero.tmp bs=16M

reads from the virtual device /dev/zero, which supplies an unlimited quantity of zeros, and writes the zeros to an output file, automatically terminating when the file cannot grow anymore. The argument bs=16M is included to speed up the operation, since by default dd will read and write in chunks of 512 bytes and the constant switching between read and write operations is very inefficient and can make the process take tens of times longer.
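
Once the file can grow no more, flush the caches and delete it; the explicit sync helps ensure the cached zeros actually reach the disk (more on caching below):

$ sync
$ rm /my/file/zero.tmp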

I've written a quick platform-independent C++ program that will create files full of zeros until the files are as large as they can grow and no more files can be created. While "dd" is certainly more convenient, this should work on Windows systems and on filesystems that don't support sufficiently large files. Execute the program with one argument pointing to a path on the partition you want to zero-ize; with no argument it defaults to the current working path.

Obviously, it may not be a good idea to perform this zero-ization operation on a filesystem that is in active use. After the filesystem is filled, but before the temporary file(s) are deleted, there will be almost no room to write data to disk. While this will be a very small window of time, any applications (including the operating system) that need to write to disk may be denied the ability to do so, and since it is rare for applications to be denied write access to open file handles, their behavior may be unpredictable. In the majority of my own tests I have not encountered a problem, but a couple of times the system froze or slowed down noticeably until I deleted the temporary files. Just be careful; filling a live filesystem to the brim is not standard good practice.

There are at least a couple of technical issues that prevent the above "files with zeros" approach from completely zeroing out unused space on the filesystem.

  • Partially-used sectors will not have the unused portion zeroed out. But these sectors will represent a negligible percentage of the total disk. They typically occur as the last sector of a file whose size isn't an integral multiple of the sector size. From quick empirical testing, a typical Windows 7 install will probably have less than 500,000 files, and 500,000 half-used 512-byte sectors amounts to only about 125 MB on average.

  • Writing to and then deleting a file does not guarantee the data will be written to disk, due to both the OS/filesystem and disk-level caches. Cached data that never gets written to disk will be abandoned when the file is deleted, and the space on disk it was supposed to zero will be left untouched. But only a small portion of the data to be written will be cache-able by the OS and the disk. Home disks rarely have larger than 32 MB caches, and the OS/filesystem will likely cache at most a gigabyte or so. This has the potential to be of non-negligible size, but even an aggressive cache would have a small total impact. Since so much is being written to disk, the caches will overflow quickly and be forced to write most of it to disk.

  • While it should be obvious, this is just a "good enough" approach. We probably don't care about an extra couple hundred MB of space in the image (and we're definitely not relying on this for security).

    A fast compression scheme is probably better than a good compression scheme, unless time is not important. It's likely that the majority of used space will be binary data (executables, already-compressed media formats, etc) and will yield very low compression no matter how hard you try. Any simple compression method, like GZIP, will be able to make efficient use of the sections of 0s and not waste too much time compressing the rest of the image.
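
    For example, a single fast gzip pass over the zeroed image (file names are illustrative) captures most of the benefit, and the imaging and compression steps can even be chained:

    $ gzip -1 device.img
    $ dd if=/dev/sdb bs=16M | gzip -1 > device.img.gz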

Concluding Notes

The difference in compressed image size between the original device contents and the zero-ized device contents will depend on the filesystem(s) involved, how full they are, how much data has been added and deleted, how long it's been since the last time the device(s) were wiped, and similar factors. However, since this is a fairly easy procedure, it wouldn't hurt to try it if saving device image backup space is helpful. In my personal experience, I've seen the size of the compressed image as much as halved. On a device where not much data is copied, this may only need to be applied once or twice in the lifetime of the device to keep the majority of the unused space zeroed.

Somewhat obviously, this technique should not be used on a device that requires forensic analysis, as sectors unclaimed by the filesystem may still have contents that need to be examined.

A Letter to My Congressmen Regarding SOPA and PIPA

  • By Brad Conte, January 18, 2012
  • Post Categories: General Tech

I wrote my three congressmen today to voice my opposition to a well-known pair of bills under consideration by the United States House of Representatives and Senate (respectively), namely SOPA and PIPA. These bills were drafted with strong support from the multimedia industry; they bring a very heavy hand into the legal realm of copyright enforcement and are very unpopular with Internet-based companies and most Internet users in general. As I write this, many websites are in the middle of a day-long self-imposed blackout in protest.

I strongly oppose these bills. I do sympathize with the fact that multimedia companies have legal and moral rights to exercise over their content and that these rights are violated by mass-piracy, but these bills take far too drastic action to protect said rights. I won't re-iterate all the reasons why these bills are bad, even in spite of recent tamer modifications.

So I wrote my three congressmen today to voice my opposition. I'd like to share the letter publicly, as a wider proclamation of my position on this issue and hopefully as an aid for anyone writing their congressmen on a political issue. It's not a masterpiece, but I thought it might be helpful. I'll explain some of the reasoning and structure below. Here it is:

To the Honorable [insert congressman's full name],

I would like to add my voice of support to the millions of people who oppose [SOPA/PIPA].

I am sympathetic to the motivation of the bill. I understand that multimedia companies have proper motivation to protect their intellectual property. I support their moral and legal rights to own their content.

However, I do not believe that their efforts to protect their property should be at the expense of humanity's convenience and technological development. We the people do not exist to listen to music, we listen to music while we live our lives. Similarly, technology does not exist to play music, it exists to enable us to do what we want, some of which is to listen to music. Legislation like [SOPA/PIPA] takes a multimedia-centered viewpoint of the universe, assuming that we should hinder productivity and change the dynamics of an entire industry just to protect the convenience of one sector of that industry.

The multimedia industry has a history of resisting technological development due to their reluctance to change business models. They tried to legally combat cassette tapes, video tapes, and CDs under the pretense of fighting piracy that would hurt them. Yet those very mediums enabled them to distribute their content in better, more widely-reaching ways than before. They resist change, yet change and progress are what technology, and humanity itself, are about. Every sector of an industry faces critical changes over time, and while it can be difficult for them to adjust they should not seek legal aid to pass their difficulties onto us, the common citizen. This is a capitalist society; the multimedia sector needs to adjust, not be pampered.

My request, congressman, is that your position be to protect the progress of the general public. I support the multimedia sector's desire to protect its content - once again, I sympathize with their motivation - but not at the expense of all our development. Their history and the current [SOPA/PIPA] legislation show that they are not seeking to provide us with multimedia enjoyment while we live our lives, they prefer to limit our lives to fit their existing business models.

Thank you for your time. For what it's worth, I voted for you. To protect me.

--Brad Conte

Some thoughts on this letter, in no particular order:

  • I used SOPA or PIPA where appropriate for the recipient. One bill is in one house, the other bill in the other house. Writing a blanket statement like "SOPA and PIPA", or worse yet just the popular one "SOPA", sounds more like a form letter. Who wants to get a letter mentioning a bill that they can't even vote on?
  • I wanted to paint a big-picture perspective. 1) I gave a very quick history of similar actions in the industry and their outcomes. 2) I noted that the multimedia industry is just a sub-sector of the larger industry that this bill affects. 3) I kept international concerns in mind; the U.S. is the country most impacted immediately, but this likely has implications for the whole world, hence I slipped in the word "humanity" a couple times. The goal wasn't to be dramatic, but to always remind them of how wide-reaching the implications would be. The overall point was that this was far too invasive an action to protect one specific sector.
  • As a general rule for opinions and debates, you should always separate general intent from specific implementation, and you should separate analysis of the consequences from debate over what the consequences are. If you want to get your point across to someone, decide which perspective you're arguing and make it clear; too many conversations are completely wasted by a failure to reach a meeting of the minds on those simple topics and by jumping confusingly between perspectives. In my letter, I opposed this implementation of copyright enforcement and laid out the consequences as I see them. If the congressman doesn't agree with the assessments I asserted, that's the subject for a separate, much longer e-mail. I would suggest that it's best to go for lots of detail or very little, because few things are as weak as an argument that uses just a few details strewn about.
  • It's easy to read - about 370 words and about 2/3 of a printed page, so easily readable in a few minutes. There are 5 paragraphs (the closing line isn't really a paragraph), but only 3 have the bulk of the content; it looks very digestible at a quick glance. That's all you need to give a summary of a position. This makes it easily skim-able: a) while each paragraph is unique, you could omit any one and get the majority of the argument, b) you could read the first sentence of each paragraph and still get the message (that's actually a good rule in general), and c) you could read the first and last paragraphs and understand it. Given the popularity of this topic, it's likely that if the congressman is reading my letter, it is just one of hundreds on the same topic.
  • I emphasized what impact my position would have on the opponents (the multimedia industry). Actually, I somewhat trivialized their position as simply being for "convenience". It isn't a matter of preserving the multimedia companies, it's simply about them finding and adjusting to a new business model, just like they've done several times in the past. It's important to know what's at stake in a situation like this, and I wanted to contrast humanity's technological stifling against their convenient business model. (In retrospect, trivializing it as "convenience" was a pretty strong statement to make without supporting evidence, maybe it should've been a little tamer.)
  • Similar to the above, the secondary focus of the letter was on priorities. "We the people" are not, in general, benefited by this bill. I said and implied that a couple of times.
  • The tone is unemotional, yet a little grave. It's a serious subject but it won't kill my grandmother, so there's no need to sound like it would.
  • The closing line ("...I voted for you. To protect me.") might come off a bit snarky, but it conveys a valid point. Congressmen are put in place by "the people" and those people are who will lose (and lose very badly) should the legislation pass. Obviously there are times when a congressman needs to put aside the individual or short-term benefits of each person for the big picture, but I wanted to remind them that my hope is that their voice will do its best to echo ours. The fact that I'm strongly opposed to the bill should count for something. I didn't make any threats (i.e., "I won't vote for you if you don't oppose this"), I just reminded them of established fact. And, for what it's worth, I did actually vote for the congressmen I wrote to. (I shouldn't have to point that out, but it's an easy statement to lie about and I've seen it done elsewhere.)

Hopefully the letters will do some good. (Update: For what it's worth, I only received back form replies from a couple of the congressmen.)

Redirect Input and Output For an Existing Process On Linux

Redirecting input and output of an executable is a standard and trivial practice in Linux operations. When launching a process, the user can trivially redirect the output of the process away from stdout via the > operator and can redirect input away from stdin using the < operator.

But that only works if stdin or stdout is switched before the process is launched. What if we have a pre-existing process whose file handles we would like to change, but we would like to avoid restarting the process? For example, say you have a cron script execute sudo my_command from an environment where you can't provide input (perhaps you meant to use gksudo instead). You might be able to kill the sudo process, but perhaps when sudo exits the script will proceed with undesirable results. You could kill the script too, but assume that you very badly don't want to abort the script in a semi-completed state. (Obviously a well-written script shouldn't have this sort of behavior, but the assumption is that we are in an unusual situation.) The ideal solution would be to redirect input into the hanging sudo process, allowing it to succeed and your script to continue.

Thankfully, we can perform redirection on existing processes by explicitly manipulating the existing file descriptors. The method for doing so is fairly straight forward:

  1. Determine which file descriptor you need to change.
  2. Attach to the target process via gdb.
  3. Create a file (to redirect output) or named pipe (to redirect input).
  4. Use gdb to point the desired file descriptor to the file or pipe.
  5. If applicable, send the necessary content through the pipe.

In a terminal, find the PID of the process; call it TARGET_PID. First, list the target's existing file descriptors:

$ ls -l /proc/TARGET_PID/fd/
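
The output will look something like the following (illustrative only; the exact descriptors and their targets depend on the process):

lrwx------ 1 root root 64 Jan 18 10:00 0 -> /dev/pts/2
lrwx------ 1 root root 64 Jan 18 10:00 1 -> /dev/pts/2
lrwx------ 1 root root 64 Jan 18 10:00 2 -> /dev/pts/2
lrwx------ 1 root root 64 Jan 18 10:00 4 -> /dev/tty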

When we are done we will double check this list to ensure we made the changes we wanted to.

Now you need to determine which file descriptor (hereon "FD") you want to change. Remember, we can only manipulate existing FDs, not add new ones. (For those who don't know: FD 0 is stdin (standard input), FD 1 is stdout (standard output), FD 2 is stderr (standard error output). These are the base FDs that every process will have; your target process may or may not have more.) Examples:

  • To change the input for a typical terminal program you likely need to change stdin.
  • To change output file X to a different file Y, you need to find which FD on the list is linked to X.
  • For sudo, to change the input that accepts the user password you actually need to change FD 4, which should point to /dev/tty or something similar.

We'll call the FD number that you want to change TARGET_FD.

For using a named pipe: First create the pipe using

$ mkfifo /my/file/name

We'll call this path TARGET_FILE. Then provide input to the pipe, or else gdb will not be able to open it in a later step. Provide the content by, for example, echo MyContent > TARGET_FILE from a separate terminal or as a backgrounded process. MyContent should be the content you want to send the process.

For using a normal file: Create an output file called TARGET_FILE.

Attach gdb to the process:

$ gdb -p TARGET_PID

Within gdb, close the file descriptor using the close() system call:

(gdb) p close(TARGET_FD)
$1 = 0

The right-hand side of the output is the return value of the call we just executed. A value of 0 means that all went well.

Now create an FD using the open() system call. This must be done after "close()", because file descriptors are issued sequentially from the lowest unused non-negative integer; once we close TARGET_FD it becomes the lowest unused file descriptor, so the next one created will use the same number.

(gdb) p open("TARGET_FILE",0600)
$2 = TARGET_FD

If the right-hand side number is equal to TARGET_FD, that means we just successfully created an FD and it received the same number as the one we just closed, which is perfect. Remember, if you are using a named pipe, this step may (will?) hang if nothing is being written into the named pipe.

Now quit gdb:

(gdb) q

At this point, you should be done. If you are redirecting output, the redirection should be under way. If you are redirecting input, the first input should be consumed from the pipe and you can continue to provide input as necessary by sending it into the pipe; when you are done simply delete the pipe using rm.

We can verify that we were successful by checking the target process's FDs. Run ls -l /proc/TARGET_PID/fd/ again and compare the output against the output from the first time. If all went well then TARGET_FD should now point at TARGET_FILE.
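
Putting it all together for the hanging-sudo example from the introduction (the PID, FD number, and paths are illustrative; verify the real FD via /proc/TARGET_PID/fd first):

$ mkfifo /tmp/sudo_in
$ echo 'MyPassword' > /tmp/sudo_in &
$ gdb -p 12345
(gdb) p close(4)
(gdb) p open("/tmp/sudo_in", 0600)
(gdb) q
$ rm /tmp/sudo_in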

FoxyProxy, Firefox 3.5, and DNS Leaking

[Update: Jan. 24, 2010] The DNS leaking problem described in this article applied to FoxyProxy v2.14. On Jan. 12, FoxyProxy v2.17 fixed the problem.


FoxyProxy is a popular Firefox extension that enables users to set up, easily manage, and quickly switch between multiple proxy configurations. One of the most common uses of a proxy server is for security/privacy. By establishing an encrypted connection (usually via SSH) with a proxy server on a trusted network, you can have your web traffic go through an encrypted "pipe" to that server and have that server send and receive web requests on your behalf, relaying data to and from you only through that pipe. By doing this you eliminate the risk that someone on your current network could see your HTTP traffic. Maybe you don't trust other clients on the network, maybe you don't trust the gateway, it doesn't matter -- your HTTP(S) traffic is encrypted and shielded from prying eyes. (Readers unfamiliar with the concept of using HTTP proxies through SSH tunnels are encouraged to research the matter; there are many well-written tutorials available.)

There are many other popular uses of proxy servers, but the application of encrypted web traffic is of concern in this case for the following reason: A key problem that arises when using web proxy servers is the issue of handling DNS requests. DNS does not go through the HTTP protocol, so even if HTTP is being forwarded to a proxy the DNS requests may not be. If DNS requests are sent out over the network like normal then eavesdroppers can still read them. So although the actual traffic may be encrypted, the DNS queries behave normally and may cause the same problems that using an encrypted tunnel was designed to avoid in the first place. A situation in which HTTP traffic is encrypted but DNS is not is referred to as "DNS leaking". When using a proxy for the benefit of security or privacy, DNS leaking may be just as bad as non-encrypted traffic.

Solving the DNS leaking problem is simple. One type of proxy, SOCKS5, can forward DNS requests as well as HTTP(S) data. A user simply needs to tell their browser to use the SOCKS5 proxy for both HTTP(S) and DNS and then both will be routed through the encrypted stream, allowing the user to surf the web with greatly strengthened security and privacy.

However, Firefox users who use FoxyProxy will currently encounter a problem when using DNS forwarding to a SOCKS5 proxy: DNS leaking occurs even when FoxyProxy is configured not to leak, which has made many users very upset. Initially many people thought the problem was with Firefox 3.5, but others confirmed it was only present with FoxyProxy installed. Unfortunately, not everyone is convinced that this is FoxyProxy-related behavior, and I have not found anyone who has presented a solution yet. I plan to do both: show where the problem lies and present a fix.

This is the basic setup for my tests:

  • I set up an SSH server.
  • I established an SSH connection and used the built-in SOCKS5 functionality of the SSH server daemon:
    $ ssh username@myserver -D localhost:8080
    

    (For the non-SSH inclined: This command forwarded all traffic my client sends to itself on port 8080 through the SSH connection to the SSH server, which then acts as a SOCKS5 proxy and sends the data on to the destination.)

  • I used Wireshark to monitor all packets, specifically DNS requests, sent or received on my network interface. Note that DNS requests tunneled over the SSH connection to the SOCKS5 proxy are not visible to the packet sniffer.
  • I monitored my Firefox configuration in about:config. (All network proxy-related settings are under the filter network.proxy.)
  • I used Firefox v3.5.5 and FoxyProxy v2.14.

Using this I was able to monitor all DNS requests while I experimented with Firefox and FoxyProxy using a SOCKS5 proxy. I did a base test with no proxy configuration, a test using Firefox's included proxy management, and a test using Foxyproxy for proxy management.

Using no proxy

Starting with a default configuration (SSH SOCKS connection established but no proxy settings configured to use it) I visited several websites such as google.com, yahoo.com, and schneier.com. This was the simple base test.

I checked showmyip.com to get my IP address.

The relevant about:config settings:

network.proxy.socks               ""
network.proxy.socks_port          0
network.proxy.socks_remote_dns    false
network.proxy.type                0

Via Wireshark I watched as the websites generated normal DNS requests over the standard network.

Using Firefox to configure proxy settings

I restarted Firefox to avoid any cached DNS entries. Then, without FoxyProxy installed, I set up my SOCKS5 proxy. (Note that FoxyProxy replaces the standard Firefox proxy editor, so it is impossible to not use FoxyProxy when it is installed.)

Under Firefox's Preferences/Tools (depending on your operating system) I went to the "Advanced" tab, "Network" sub-tab, and opened "Settings". I chose "Manual proxy configuration" and entered "localhost" for the SOCKS Host and "8080" for the port.

Unfortunately, Firefox v3.5 does not support a GUI method of enabling DNS SOCKS5 proxying, so I had to manually go to about:config and enable it by setting network.proxy.socks_remote_dns to "true".

I checked showmyip.com to ensure that my IP address displayed as coming from the server and not my client. It did show as coming from the server, so Firefox was using the proxy.

The final about:config settings were:

network.proxy.socks               localhost
network.proxy.socks_port          8080
network.proxy.socks_remote_dns    true
network.proxy.type                1

I visited the same websites. Via Wireshark, I did not see any DNS requests sent over the standard network. Firefox channeled both the HTTP and DNS data through the SSH tunnel perfectly.

Using FoxyProxy to configure proxy settings

I reset all the about:config settings back to their defaults. Then I installed FoxyProxy Standard v2.14. I went to FoxyProxy's options and, under the "Proxies" tab, created a new proxy entry which I named "SSH SOCKS5". I set it to connect to "localhost" on port 8080. I also check-marked the "SOCKS proxy?" box and selected "SOCKS v5". I went to the "Global Options" tab and checked the box "Use SOCKS proxy for DNS lookups". To let this take effect, I had to restart Firefox.

When Firefox had restarted, I went to the Tools > FoxyProxy menu and selected to "Use proxy SSH SOCKS5 for all URLS". I checked showmyip.com to ensure that my IP address displayed as coming from the server and not my client. It did show as coming from the server, so Firefox was using the proxy.

I checked about:config:

network.proxy.socks               ""
network.proxy.socks_port          0
network.proxy.socks_remote_dns    false
network.proxy.type                0

The configuration was the same as the default, so apparently FoxyProxy does not adjust about:config to do its work.

Watching via Wireshark, I saw all the website visits generate DNS requests over the normal network. Complete and thorough DNS leaking. And I would like to emphasize that I had selected "Use SOCKS proxy for DNS lookups", which is FoxyProxy's option to address the DNS leaking issue.

Fixing DNS leaking

There was no question about it, FoxyProxy caused the DNS leaking in my test. I wanted to solve the problem so I fiddled with about:config.

In about:config I manually set network.proxy.type to 1. I verified my IP address was from the server via showmyip.com.

The new about:config:

network.proxy.socks               
network.proxy.socks_port          0
network.proxy.socks_remote_dns    false
network.proxy.type                1

I watched for DNS requests again via Wireshark. I saw none. It seemed that just manually setting network.proxy.type to 1 fixed the FoxyProxy DNS leaking problem.

I also tried other about:config settings, such as manually changing network.proxy.socks_remote_dns to "true", but that didn't work. The above was the only change in about:config that I found that fixed the problem.

Summary

I repeated the results above three times in different orders on different computers on both Linux and Windows to ensure I made no configuration mistakes and to verify that the behavior was consistent and cross-platform. All the tests yielded the same results. Here is the final summary:

  • Firefox v3.5 does not suffer from DNS leaking by itself.
  • DNS leaking occurs when FoxyProxy is managing the proxies.
  • FoxyProxy does not suffer from DNS leaking when network.proxy.type is manually set to 1.

It is obvious that FoxyProxy does not adjust about:config in order to configure proxy settings, but I do not know why. Many Firefox extensions adjust about:config in order to accomplish their goals and I know of no reason they should not. It's possible that FoxyProxy has not had a need to do so before, but in light of this serious problem that may need to change. The quickest/simplest solution for FoxyProxy may be to set network.proxy.type to 1 if the currently enabled proxy is SOCKS5 and if the global options for FoxyProxy (or the about:config for Firefox) are set to enable DNS forwarding.

However, although this seems to indicate that FoxyProxy has made a mistake, I don't know that FoxyProxy is the party at fault. Clearly FoxyProxy does not have to alter about:config in order to change the other proxy settings, so why must network.proxy.type be set in about:config in order for DNS forwarding to work? Note that network.proxy.type isn't related to DNS forwarding, it just specifies which type of proxy is enabled. For all I know someone implemented a hack in Firefox that checks about:config when it shouldn't. Of course, I don't know that and I don't know if this is expected behavior from Firefox or not. It could be that FoxyProxy isn't setting whatever hidden configuration for DNS forwarding that exists on the same plane as the other invisible proxy settings it uses. Or maybe FoxyProxy is relying on an unreliable hack in order to avoid changing about:config. I don't know about any of that. What I do know is that Firefox by itself does not have this DNS leaking problem, FoxyProxy does, and a simple solution exists.

Again, I am certainly not the first person to note this problem, but a) I have seen many people blame Firefox for this bug, and b) I have not yet seen anyone else mention the solution that I noted above.

I leave it to someone with more time and knowledge about these software projects to determine which project should have which bug report filed. This needs to be fixed permanently.

Three Tips for the Linux Terminal

  • By Brad Conte, January 11, 2009
  • Post Categories: General Tech

The power of Linux lies in the tools it uses, and the shell is an essential tool. If you spend a lot of time in a terminal, you likely value anything that makes the experience smoother. Here are a few tips to help make the terminal experience as smooth as possible.

Interact with the X clipboard

Before I discovered xclip, one of the most annoying things about being in a terminal was my lack of access to the X clipboard. Some terminal/shell combinations work well with a standard desktop environment, but "highlight-and-middle-click" a) isn't always feasible, and b) doesn't always work. Thankfully, xclip makes it easier.

xclip can write standard input to the clipboard and output the clipboard's contents to standard output. The "-i" and "-o" arguments tell xclip whether you are inputting to or outputting from the clipboard, respectively.

Example:

$ pwd
/some/long/path/you/dont/want/to/retype
$ pwd | xclip -i

You may now paste that path wherever you choose. (Exactly which paste method works depends on which X clipboard xclip targets; see the note on selections below.) Another example:

wget `xclip -o`

This will download the file from the URL that is in the clipboard. I have found that "pasting" the contents of xclip into the shell using backticks (aka, `) is a very convenient workflow. The result of `xclip -o` ends up right where you would have pasted it in the shell.

You can use xclip to read from and write to the different X clipboards, which allows you to interact with the clipboards for pasting via middle-click or Control-V. The "middle-click" clipboard is selected with the arguments -selection primary and the "Control-V" clipboard is selected with the arguments -selection clipboard.

I might suggest aliasing "xclip" to xclip -selection X, where X is your preferred clipboard to operate on.
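
For example, to make plain xclip always operate on the "Control-V" clipboard:

$ alias xclip='xclip -selection clipboard'
$ pwd | xclip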

Use "head" and "tail" more powerfully

You probably know how to use "head" to extract the first N lines from a file and "tail" to extract the last N lines of a file. While this is useful, it's often just as useful to extract the complement of those selections, namely, everything except the first N lines or everything except the last N lines.

The head and tail utilities are powerful enough to accommodate those needs. head allows you to extract either a finite quantity of text from the top or everything but a finite quantity of text from the bottom, and tail allows you to extract a finite quantity of text from the bottom or everything but a finite quantity of text from the top.

  • head -n N - by default outputs the first N lines (equivalent to using +N). Using -N outputs all but the last N lines.
  • tail -n N - by default outputs the last N lines (equivalent to using -N). Using +N outputs all lines starting at the Nth.

Example:

/tmp $ cat example
1
2
3
4
5
/tmp $ head -n 2 example
1
2
/tmp $ tail -n 2 example
4
5
/tmp $ head -n -2 example
1
2
3
/tmp $ tail -n +3 example
3
4
5

Note that tail's complement mode requires you to specify the first line number to include in the output, so if you want all but the top N lines you actually specify the argument N+1.

Change directories with the directory stack

The Bash shell (as well as others, like Zsh) has a built-in "back" feature for navigating directories. The built-in pushd command will put your current working path at the top of a shell-maintained stack and allow you to change to another directory. To go back to the saved path you can use popd. This one is more commonly known than the first two tips, but it's worth including because it's so incredibly convenient.

Example:

$ pwd
/some/long/path/you/dont/want/to/retype
$ pushd /some/other/path/
/some/other/path /some/long/path/you/dont/want/to/retype
$ pwd
/some/other/path/
[...]
$ popd
/some/long/path/you/dont/want/to/retype
$ pwd
/some/long/path/you/dont/want/to/retype

Since "pushd" stores the directories in a stack, you can push multiple directories onto the stack and later pop them off in the reverse order you pushed them. It's basically the standard "cd" only with a "back" feature. Speaking of which, the command cd - will always take you back to the previous directory you were in.
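
For example, you can push several directories, inspect the stack with dirs, and pop them back off in reverse order (paths are illustrative):

$ pwd
/original/path
$ pushd /etc
/etc /original/path
$ pushd /var/log
/var/log /etc /original/path
$ popd
/etc /original/path
$ popd
/original/path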

Bash Path Hashing

  • By Brad Conte, September 5, 2008
  • Post Categories: General Tech

Bash is a Unix shell of many features. One of those features, which is both useful and potentially annoying, is the path hash table.

The environment variable we are probably most familiar with is PATH -- it contains all the paths that should be searched when we specify an executable to run. Bash starts with the first path specified in PATH and looks to see if the executable is present; if it is not, Bash checks the second directory, and so on, until it either finds an executable with the specified name or runs out of paths to search.

The path of an executable practically never changes. "ls" is always in the same place, as is "cat", and every other executable you might have. Thus, to save time, the first time Bash runs an executable it saves the path where it found it; this way Bash can avoid searching for the executable again. The first time you execute "ls", Bash has to search for it. The second through bajillionth times you execute it, Bash will just look up the full path from the table -- much more efficient. It is important to note that every instance of Bash maintains its own path hash table -- thus if you have multiple terminals open they will not share path tables.

However, if the actual desired path of an executable changes, problems can arise. By default, Bash will not look for an executable again. If an executable changes paths, Bash will not find it. If an executable is duplicated to a new path, Bash will execute the original executable even if the new one is in a path of higher priority (namely, listed first in PATH). As can be imagined, this can cause quite a headache in these rare scenarios if one is unfamiliar with Bash path hashing. If you create a script, execute it, then edit it and copy it somewhere else (say from your personal "bin" to a global "bin"), the old version will still execute. If you simply move an executable between paths, bash will not find the new executable despite the fact that it seemingly should. Observe the following example, in which I create "myscript" in my home bin, run it, copy it to a global bin, run "myscript" again, and execute the one in my home bin again.

[Image: Example of Bash storing the full path of a script and not recognizing a new one. This image can serve as a stand-alone explanation -- feel free to redistribute it.]
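
Here is a rough transcript of the same scenario (paths and output are illustrative; it assumes /usr/local/bin precedes ~/bin in PATH):

$ cat ~/bin/myscript
#!/bin/sh
echo "home bin version"
$ myscript
version from home bin
$ sed 's/home/global/' ~/bin/myscript > /usr/local/bin/myscript
$ chmod +x /usr/local/bin/myscript
$ myscript                  # /usr/local/bin comes first in PATH, but...
version from home bin       # ...Bash still runs the hashed ~/bin copy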

Thankfully the hash table is easily manipulated -- by the built-in shell command "hash" no less. There are four switches that are likely of the most interest to you:

  • -d [name] will delete a specific entry from the table
  • -r will remove all entries from the table
  • -t [name] will list the save path for the given executable name
  • -l will list all existing executable -> path entries in the table
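
For example, to inspect and then clear a stale entry (the command name here is illustrative):

$ hash -t myscript        # show where Bash thinks myscript lives
/home/brad/bin/myscript
$ hash -d myscript        # forget just that entry
$ hash -r                 # or forget everything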

Using this tool you can view existing hashes and remove stale ones as you see fit. To solve an out-of-date path hash problem, simply removing the stale entry will suffice; the next time the command is run, the path will be looked up and hashed again. For more details, I refer you to the following key excerpts from the Bash man page:

       If the name is neither a shell function nor a builtin, and contains  no
       slashes,  bash  searches  each element of the PATH for a directory con-
       taining an executable file by that name.  Bash uses  a  hash  table  to
       remember  the  full pathnames of executable files (see hash under SHELL
       BUILTIN COMMANDS below).  A full search of the directories in  PATH  is
       performed  only  if the command is not found in the hash table.  If the
       search is unsuccessful, the shell prints an error message  and  returns
       an exit status of 127.

[...]

       hash [-lr] [-p filename] [-dt] [name]
              For  each  name, the full file name of the command is determined
              by searching the directories in $PATH and remembered.  If the -p
              option is supplied, no path search is performed, and filename is
              used as the full file name of the command.  The -r option causes
              the  shell  to  forget  all remembered locations.  The -d option
              causes the shell to forget the remembered location of each name.
              If  the  -t  option is supplied, the full pathname to which each
              name corresponds is printed.  If  multiple  name  arguments  are
              supplied  with  -t,  the  name is printed before the hashed full
              pathname.  The -l option causes output to be displayed in a for-
              mat  that may be reused as input.  If no arguments are given, or
              if only -l is supplied, information about remembered commands is
              printed.   The  return status is true unless a name is not found
              or an invalid option is supplied.

How to Review a Linux Distribution

The GNU/Linux project has begotten the most versatile operating system environment ever. It's free, open source, and designed to be modified and built upon by its users. The heart of the Linux philosophy is the concept of choice. Linux is about providing its users with the freedom to choose what they want to do and how they want to do it. Because both Linux and the software developed for it are so versatile, there exist hundreds of Linux distributions, each one being nothing more than a different collection of software and configurations running on top of the Linux kernel.

Because of the number of Linux distributions and the degree to which their styles/configurations can vary, a given user usually has the ability to choose a distribution closely suited to their specific needs. To aid the ever-prevalent distro-seeker, many Linux users have sought to provide written reviews of various Linux distros. Unfortunately, many reviews fall short. With the increasing population of Linux users, and the increasing number of Linux distributions, there has been a notable increase in unhelpful reviews.

Which leads to the point of this article, namely, how to review a Linux distro. The idea of reviewing a Linux distro is to inform the reader as to how the specific distro in question is different from the other Linux distros. To do this, a reviewer must provide an analysis of the software and configurations that set the distro in question apart from other distros, putting solid emphasis on the big differences, less emphasis on the minor differences, and no attention on the things that are inherent in all Linux distributions. (If you are writing to inform the reader about Linux in general, you should do that instead, rather than pretend you are reviewing any specific distro.) This basic principle holds for almost any review of any product. For example, movie reviews tell you why one movie, a four-star action flick, is different from another movie, a two-star romantic comedy. The movie reviewer doesn't spend time explaining the concepts of cut-scenes and character development; the review merely explains the aspects of a movie that make it unique.

This article is aimed at the potential Linux distro review writer. It is split into two main sections: things one should review, and things one should not review. Most of the points have sample topics; some of the points and sample topics are more advanced/technical, some are less so. No review needs to cover every point and sample topic in this article; this is just a broad list of good ideas. Suit your information to the knowledge level of your target audience.

Good Things to Review

Here's a list of things that make for a good review. This is not an exhaustive list, but it covers a lot of ground.

  • Philosophy

    Every Linux distro has a philosophy, namely, a statement of purpose and a set of guidelines for development that support that purpose. "Linux for human beings," "KISS," "Free, robust, stable," "Design for control," are some philosophies that describe various distros. Anyone using a distro with one of those philosophies would instantly recognize it.

    This topic is not optional: you must state the distro's philosophy and exemplify it throughout your review. Consequently, it is probably the best place to start your distro review, as what you write thereafter should exemplify the philosophy. Remember, your goal is to show how the distro is unique, and it is the philosophy that dictates why the distro is unique. All criticisms or negative points against the distro should be held in light of the distro's philosophy. Is the distro complicated? That might be worth mentioning, but if the distro strives to put user control first and simplicity is of no concern then complexity is not a short-coming of the distro but rather a side-effect of its philosophy. Rather than criticizing a distro for being too complex, instead note that it is designed for experts. Many reviews fail to evaluate a distro for what it is and criticize it for lacking things it does not strive to have.

    As boring as it may seem, a quick history of the foundations of the distro might prove very informative. Understanding why a distro was created in the first place can say a lot about what it's like today and where it might be tomorrow.

    Sample topics:

    • Who is the distro's intended audience? The newbie, the guru, the average home user, the developer, the server administrator?
    • Does the distro tend to value stability or bleeding edge software? (See later points.)
    • Is the distro designed for users who like to control everything?
    • What is the history of the development of the distro? Why was it created?
    • How robust is a base level install of the distro? (More on this later.)

  • The parent distro

    If the distro in question is itself based on another distro (many are), you should mention the parent distro and how this spin-off/child distro differs from its parent. Some parent-child relationships are distant, some are close. Some spin-off distros maintain their own package repositories and make their own configuration decisions; some are nothing more than the parent distro with a different default window manager and a few extra pre-installed packages. Telling the reader what the parent distro is and how closely the child is related to it is extremely informative for readers who are familiar with the parent or who will seek to become familiar with it.

  • The package manager

    The package manager is the heart of a Linux distro. It is how the user most intimately adds to and subtracts from their system. The package manager is arguably the most important and unique component of a distro. Even though most package managers offer the same rudimentary functions, you should review the package manager. And "review" does not mean confirming that there are indeed commands for installing and removing packages. You need to provide enough information that it's clear how the package manager feels to the user. Explain what sets the package manager apart from other package managers. Different distributions make it easier/harder to eliminate orphaned packages or check dependency information, and many have different ideas of what security precautions to take. (A small illustration of this kind of difference follows the sample topics below.)

    Sample topics:

    • Does the package manager have a GUI?
    • How easy is it to upgrade all the packages in the system?
    • Does it provide clear output informing the user what it's doing?
    • What level of security is offered? Package authenticity verification? SSL downloads?
    • Does it ever require that the user perform manual tasks to complete a package installation/integration into the system? (E.g., removing certain configuration files or backups.)
    • How is the package database stored? How easily human-readable is it? How easily backup-able is it?
    • How many tools are a part of the package manager? How many are necessary for day-to-day use?
    • How easy is it to perform a system upgrade?
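
    As a small illustration of the kind of difference worth describing (a hedged sketch; exact commands and flags vary with the package manager version), here is how cleaning up orphaned packages looks under two common package managers:

        # apt-get autoremove            (Debian/Ubuntu: one command removes dependencies that are no longer needed)
        # pacman -Qdt                   (Arch Linux: list orphaned packages)
        # pacman -Rns $(pacman -Qdtq)   (Arch Linux: remove the orphans along with their configuration files)

    The one-command cleanup versus the list-then-remove idiom is exactly the sort of "feel" difference a reviewer should convey.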

  • Support/Documentation

    The level of support and/or documentation available for a distro will likely be one of the most significant influences in a user's experience with a distro.

    The interactive support offered by a forum (and the archived support offered by its search function) can be priceless as a user gets started with a distro and has questions about Linux phenomena they are not familiar with. A wiki can help the user through the install and/or basic system setup, help them set up common software, configure it the way they want, and fix common bugs. The absence of a wiki can mean many hours of Googling for decentralized information and exasperation, leading to unfixed problems and a poorer user experience. A documentation tree (somewhat rare outside of behemoth distros) can help a user dig deeply into the distro and modify/fix it if they are so inclined.

    Sample topics:

    • Does the distro have an official forum? How large is it?
    • Does the distro have a wiki? How expansive is it? Does it cover the configuration of sudo? What are some of the more obscure articles it has?
    • Does the distro have developer-written/contributed documentation?

  • The Packages

    While the package manager controls the addition/removal of packages to/from the system, it would be silly to overlook reviewing the packages themselves. The packages will be composed of compressed archives, arranged with a file hierarchy and package metadata that handles dependency checking, integrity verification, and the like.

    The layout of the filesystem arguably belongs under this topic. Many distros have their own ideas about how the filesystem should be laid out. If the distro has any peculiarities or quirks in how it uses the filesystem tree, it's usually related to the compilation destination of packages.

    Sample topics:

    • What type of packaging standard does the system use? How easy is it for a user to create their own packages? (Some users like to assemble their own software and use the package manager to manage their installation. See the point on compiling from source below.)
    • Is package downgrading supported?
    • How quickly are packages updated when their official upstream versions are updated? (Don't be too picky here: Are we talking about weeks or months?)
    • Continuing from the Philosophy point and the above sample topic, does the distro lean towards being bleeding edge or being stable? (This has been mentioned twice. Hint.)
    • Do the developers do extensive modification of the original packages (security patches, extra functionality), or are they provided vanilla as-is? (For example, Debian developers tend to modify packages as they see fit, Arch Linux developers only apply very common patches occasionally.)
    • How are the packages compiled by default? (For example, with or without debug information?)
    • How is Gnome/KDE split up? How many packages compose a full Gnome/KDE suite? (For example: Is KolourPaint an individual package, or only available as part of the larger KDE Graphics package?)
    • Are there any filesystem layout quirks, additions, or reservations? (For example, is /usr/local ever used by packages or is it reserved for the user? Is Firefox installed into /var?)

  • Support for compiling from source

    The package manager's job is (usually) only to manage pre-compiled packages. But some users like to manually compile some packages from source, such as to apply custom patches to a program or to gain speed by optimizing the program for their specific machine. Many users don't compile much (if anything) from source, but the Linux philosophy is deeply rooted in the distribution and manual compilation of source code, and there are many who like/need to do so. How a distro lets the user compile programs and manage what they have compiled is an important part of how the operating system works. Those who rely on it will find their experience very dependent on how manual compilation is handled.

    If there exist any distro-specific methods for handling the compilation of both supported (packages in the repositories) and/or un-supported (packages not in the repositories) software, they should definitely be included in a review. If the distro has any form of a ports-like system, a review without mention of it is by default incomplete. (A sketch of what such a system looks like follows the sample topics below.)

    And do not ever, ever waste space explaining that you can do "./configure && make && make install", that you can use "checkinstall", or that you can use the "install" program itself -- every distro has those, they aren't worth mentioning in the slightest.

    Sample topics:

    • Do there exist any specific tools for the system that aid the user in compiling programs from source?
    • How are users supposed to recompile the kernel? Is initramfs or initrd used?
    • If a user does manually compile a program/package, how easily can it be managed by the package manager? (This has been mentioned twice. Hint.)
    • How is the presence of different kernel versions handled?
    • Is there risk of having the manually-compiled program conflict with the packages managed by the package manager, or is there a way to keep manually installed non-package programs separate?
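
    To make the "ports-like system" idea concrete, here is a rough sketch of what such a workflow can look like on one distro that has it (Arch Linux's makepkg is used here; the package, version, and checksum handling are purely illustrative):

        # A minimal PKGBUILD: the build recipe that the ports-like system consumes.
        pkgname=hello
        pkgver=2.12
        pkgrel=1
        arch=('x86_64')
        source=("https://ftp.gnu.org/gnu/hello/hello-$pkgver.tar.gz")
        sha256sums=('SKIP')

        build() {
          cd "hello-$pkgver"
          ./configure --prefix=/usr
          make
        }

        package() {
          cd "hello-$pkgver"
          make DESTDIR="$pkgdir" install
        }

    Running "makepkg -si" in the directory containing that file builds a normal package and installs it through pacman, so the manually-compiled program is tracked by the package manager just like anything from the repositories. If the distro you are reviewing has an analogous mechanism, describe it; if it doesn't, that absence is worth noting too.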

  • Release system

    Distros have different styles of updating the end-user's installed distro to the most current version. Some distros have a new, fully-updated version released every 6 months. Some just release all package upgrades on-the-fly and have no concept of different versions. The way the distro handles version releases is significant to the end-user, as full version upgrades usually mean that the developers reserve the right to move stuff around and/or reconfigure things in a noticeable way if they want to, and push those changes through in the next major release. It also means that developers reserve the right to stop providing package upgrades for a given version, resulting in obligatory periodic upgrades.

    Sample topics:

    • What factors decide the version releases (if there are any)?
    • How big are the changes between releases?
    • How often are the releases?
    • How long is there support for old releases?
    • Does evidence seem to suggest that performing a version upgrade is smooth, or is the user better off performing a clean install of the new version?

  • Startup style

    Summarizing how a distro manages startup scripts/daemons is easy: All you have to do is say whether the distro uses System V or BSD-style init scripts. (Users unfamiliar with those can look them up.)

    How the operating system uses run-levels is a more complex subject, yet still an interesting one. Fewer users will care about this level of detail, but more advanced users might be able to intuit more about the nature of a distro from it.

  • Robustness of a default install

    Distros vary in what software they provide for a from-disk installation. Some have more, some have less. Some have roughly equal quantities but the theme of the offered software is completely different.

    After you install the distro, how much work must be done to get the system up and running with a graphical desktop and a decent collection of tools? What kind of packages are installed, and what aren't?

    Sample topics:

    • Is a default display manager installed or offered? (Which one?)
    • Does the distro rely heavily on sources from the servers, or can you obtain multiple CD/DVDs for a more robust from-disk install?
    • Does the default install include tools for compiling? (GCC, make, etc.)
    • Does the default install include or offer Apache?
    • What are some of the most specific/obscure programs installed/offered by the installer? (For example, if a distro provides Compiz and OpenOffice.org straight from disk, that implies the sort of other packages that are installed.)

The above topics cover a lot of ground. Obviously not every detail need be included in a good review, but many of these points are very critical to how the distro works and failure to address a significant number of them will result in an incomplete review of the distro. If there seem to be too many points to address, remember that you're writing a technical review, not a grand work of fiction. Wordiness can usually be eliminated. If you know what you're talking about and put some thought into what you write, you can cram a lot of information into three sentences.

One key to remember is that not all of the information may seem important to you, the author of the review, but it is what separates one distribution from another. Obviously the information has to be appropriate for the audience you are writing for; that is a decision faced by every writer.

If you feel unable to address many of these topics in your review, then you are probably not as familiar with the distro as you should be in order to do a good review of it. Many people want to pop in a CD, install the distro, play with it for 30 minutes, and then write a review about it. Unless you did your research beforehand and you made very good use of those 30 minutes, you're probably not qualified to offer a review. Spend some time learning more about the distro.

Bad Things to Review

Now a list of things you should not review.

  • Plug-n-play drivers

    Do not ever review a distro by saying "I plugged in my external wireless card X and it did(n't) work". This is the lamest review you can provide, and is my pet peeve to boot. First of all, almost all distros use the same drivers. Support for hardware is pretty much universal: the same drivers that manage your hardware in one distro probably manage it in most other distros, and this only becomes more true with time. Comparing hardware support between distros is basically pointless. It's even more pointless when your analysis consists of just one generic external wireless card. If it didn't work, odds are you didn't set something up correctly.

    Note that it is obviously acceptable to review the quantity of drivers provided by a distro if they seem to have a noticeable shortage/abundance of them, but most distros will have about the same drivers. Don't mention driver quantity unless it's very noticeable.

  • Five minute quirks

    At various times, some distros have what you might call "five minute quirks". These are one or two little problems that need to be ironed out of a fresh install of the system and are easily solvable by a user with the skills that the distro is oriented towards. Sometimes an environment variable needs to be set, sometimes a file needs to be configured, etc. These bugs are not unique to anyone's specific install of the distro, but rather affect a very large portion (if not the entirety) of the user base. One of the developers mis-configured something before release, or some default setting doesn't work with the majority of hardware, or something like that.

    Do not write about these little five minute hiccups in your review; they don't count. They only affect you for five minutes and will likely be fixed in one of the upcoming releases. (See the next point.) Usually the distro's forum and/or wiki has an easily accessible guide to tell you how to fix the quirks, and after a day of usage the user will not remember them. At absolute most you should say that "minor quirks exist" and say where/how the site addresses these issues. This could be part of your review of the distro's documentation. Any quirks you fix within five minutes of attempting to do so should not be mentioned. Their fix in the next release will make your review out of date.

    What's more, do not write about the stability of a distro based on its five-minute hiccups. Nothing says "I didn't do any actual research about this distro" like a stability evaluation based on the initial five-minute quirks the distro had as of some arbitrary date.

  • Unrelated/temporary bugs

    Do not write about temporary bugs. If the bug is the type that will be fixed within four months, don't use it as a basis for your evaluation of the distro's stability. This point bears repeating: Whatever you do, do not base your assessment of a distro's stability on little quirks that will be gone in a couple of months. If your readers wanted to read a bug listing, they'd browse bugzilla.org.

    On that note, do not assess the distro, in any way, based on bugs that are not specific to the distro itself. If the developers of a software package X make a mistake, don't blame the developers of the Linux distro in question; it's outside of their control, not their fault, and not their responsibility. It's irrelevant to your review, so ignore it. If distro X has a bug but distro Y does not, it is possible that distro Y simply has not pushed the new version of the software and thus does not have the buggy version -- you can use this to exemplify how "bleeding edge" the distro is. Or it's possible that distro Y manually patched the software -- you can use this to exemplify how the developers do (or don't) manually patch packages.

    You can obviously cite the presence of bugs in your analysis of the operating system's stability, but don't focus on petty bugs. Instead, focus on the release/testing cycle of packages, the speed of security fixes, and the frequency of distro-specific bugs. If this means that you actually have to spend some time using a distro and learning about it before you evaluate its stability, cry me a river. If you aren't willing to do that, don't comment on its stability, because you don't know anything useful about it.

  • Artwork

    No one cares what the default desktop color theme of the distro is. Or the default desktop wallpaper. Or whatever. This is my other distro-review pet peeve. As shocking as this may sound, every desktop environment and window manager in the entire universe, throughout every civilization, all of time, and every dimension, can have its color theme changed. If a user doesn't like the default wallpaper, they will change it. If they don't like the default Gnome color theme, they will change it. Defaults are irrelevant.

    Don't mention the quantity of community artwork, in the sense of backgrounds and color themes. This is directly proportional to the number of users the distro has and thus redundant information. And, realistically, nobody makes a distro decision based on it. And many users routinely become attached to a certain color theme and will take it with them from distro to distro.

  • Installation Process

    This is not a terribly important point, but I think that reviewing the installation process of a distro is almost pointless. Most installers, even text-based ones, guide the user through the installation very clearly. Sure, a text interface may look a bit daunting to some, but if one reads the instructions, one can usually just click through (or tab-enter through) the installation with relative ease.

    This item might warrant a quick mention, perhaps tying into the distro's philosophy (i.e., in analyzing whether the distro is oriented towards casual or power users), but unless the installation process is exceptionally complicated or buggy, don't devote much space to it. It's a one-time experience for the user, and a determined user won't change their opinion on whether to install it based on the installation graphics. Mention if it's GUI or text based, and move on.

Remember, your goal as a reviewer is simple: Review things from the perspective of the average user you are writing for. Don't go into painstaking detail unless you're aiming to provide an in-depth guide; minor details/quirks are often best left for their own special article and are of interest to few readers. Don't waste time on things that were (likely) very unique to your experience, because almost no one else cares.

Even though this article was somewhat long, your distro review doesn't have to be. A lot of useful information can be packed into one paragraph.

A Simple Partitioning Tip

Deciding how to partition a hard drive is not necessarily a simple task. Depending on what you want to do with it, your partitioning scheme may vary greatly. For those with specific needs I put forward a practical tip: Make all your operating system partitions the exact same size. I've learned the value of this several times over in situations that called for experimenting with, or backing up, an operating system.

I will use the *nix tools fdisk (the Windows version of fdisk will not suffice) and dd, and consequently assume that the reader has minimal *nix experience and access to some form of a *nix-like system. Linux live CDs are adequate, as is (I believe) OSX. The *nix system will be used to perform the delicate initial partitioning and the partition clones thereafter.

I always dual-boot Windows and Linux, keep a third partition around for experimenting with other operating systems, and have a fourth partition blank. All four of these partitions are the exact same size, which offers me flexibility in moving them around. The biggest advantage is that it allows me to perform a byte-for-byte clone of any one partition onto a second partition, perform whatever perverse thing I'm experimenting with on the first partition, and then restore it if something goes wrong. Or I can just boot straight into the new copy of the partition I cloned. Somewhat time consuming, but simple and error-proof.

I know this concept may strike some as a waste of time and space, but it's not as inefficient as it may sound. Cloning a 20GB partition can be done over lunch and, like many others, I have a hard disk exclusively devoted to operating systems, no one of which needs more than 25GB.

There are several reasons why you might want to back up an entire partition. The need to do so for a *nix system is less pronounced, because the flexibility of *nix systems lets you do a file-level copy between partitions without adverse effects. But a byte-for-byte direct copy doesn't take much longer than a file-based copy, and it removes any of the file-based copy complications. Windows systems, on the other hand, are attached to their partitions: if they are copied on the file level to a new filesystem they will not work, so a byte-by-byte image of a Windows system is almost a must for backup purposes.
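
For the *nix file-level case, here is a minimal sketch, assuming the source partition is mounted at /mnt/src and a freshly formatted destination at /mnt/dst (you would still need to adjust the bootloader and /etc/fstab before booting the copy):

    # rsync -aHAXx /mnt/src/ /mnt/dst/

The flags preserve permissions, hard links, ACLs, and extended attributes while staying on one filesystem.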

The problem, though, is that if you rely on a partition manager like the default Windows manager or GParted (which is the default partition manager in the Ubuntu installation process) to create your partitions for you, as odd as it may seem, you aren't guaranteed to get partitions of the exact size you specified. I don't know the details, but creating two 20GB partitions does not guarantee that you get two partitions of the same size. Thus when you go back to try to clone one partition onto another, you may discover that the destination partition is, say, 8MB smaller than the source partition, and there will be hell to pay.

Creating Precisely-Sized Partitions

  • Open the disk to partition with fdisk:

    # fdisk /dev/sda

    Use the "p" command to get a printout of my current table. If the disk is new, the printout should be empty.

  • Determine how large you wish the partitions to be. Create a new partition and choose the starting first sector (use the default if you are creating them sequentially). Then select the last sector by using the +XG syntax to indicate that the last sector should be X gigabytes' worth of sectors after the first. This should work with both primary and extended partitions, with one exception noted below. (A sample fdisk dialogue follows this list.)
  • Exception: There is a special partition that requires a non-default starting sector. The first logical partition within an extended partition will start in the same cylinder as the extended partition marker, and thus share some space with said marker and not be the exact size specified. To correct this, create the first "inner" logical partition one sector after the beginning of the "outer" extended partition. This will result in a few bytes of wasted space, but if both partitions start on the same sector, then the size of the final partition will be slightly too small.
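
As a sketch of what the fdisk dialogue looks like when creating a 20GB primary partition (the prompts are abridged here and their exact wording varies between fdisk versions):

    # fdisk /dev/sda
    Command (m for help): n
    Partition type (p primary, e extended): p
    Partition number (1-4, default 1): 1
    First sector (default 2048):            (press Enter to accept the default)
    Last sector, +sectors or +size{K,M,G,T,P}: +20G
    Command (m for help): p                 (print the table and check that same-sized partitions have identical sector counts)
    Command (m for help): w                 (write the table to disk and exit)

Repeat the "n" step for each partition, always specifying the size with the same +XG value.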

Copying Partitions

  • Decide which partition will be the source and which will be the destination (a quick size check is sketched below). Use dd to copy from the source to the destination:
    # dd if=/dev/sda2 of=/dev/sda4 bs=8M

    A key parameter here is the "bs" option, which tells dd how many bytes to read and write at a time. Read/write operations to disks are relatively slow and have significant overhead, so you want to make each read/write operation worth your computer's time. I've found 8MB to be about the most efficient block size to use for each read and write. If you fail to specify a value then the default of 512 bytes will be used, which can make the process take on the order of 100 times longer to finish.
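
    Before running dd, it's worth confirming that the destination really is at least as large as the source. A quick check (the two byte counts should match if the partitions were created as described above):

    # blockdev --getsize64 /dev/sda2
    # blockdev --getsize64 /dev/sda4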

Obviously, copying partitions also applies to copying entire disks if the disks are the exact same size. This can be used to make manual backups of entire drives.
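
For example, assuming /dev/sdb is a second disk of exactly the same size as /dev/sda (double-check the device names first; dd will silently overwrite whatever you point it at):

    # dd if=/dev/sda of=/dev/sdb bs=8M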